Prediction of non-muscle invasive bladder cancer outcomes assessed by innovative multimarker prognostic models


Authors: E. López de Maturana ¹; A. Picornell ¹; A. Masson-Lecomte ¹; M. Kogevinas ²,¹⁰; M. Márquez ¹; A. Carrato ³; A. Tardón ⁴,¹⁰; J. Lloreta ⁵; M. García-Closas ⁶; D. Silverman ⁷; N. Rothman ⁷; S. Chanock ⁷; F. X. Real ⁸; M. E. Goddard ⁹; N. Malats ¹*; and on behalf of the SBC/EPICURO Study investigators
Authors‘ workplace: Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernández, Almagro , 80 , Madrid, Spain ¹; Centre for Research in Environmental Epidemiology (CREAL), Parc de Salut Mar, Barcelona, Spain ²; Servicio de Oncología, Hospital Universitario Ramon y Cajal, Madrid, and Servicio de Oncología, Hospital Universitario de Elche, Elche, Spain ³; Department of Preventive Medicine Universidad de Oviedo, Oviedo, Spain ⁴; Parc de Salut Mar and Departament of Pathology, Hospital del Mar - IMAS, Barcelona, Spain ⁵; Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK ⁶; Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland, USA ⁷; Epithelial Carcinogenesis Group, Spanish National Cancer Research Centre (CNIO), Madrid, and Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain ⁸; Biosciences Research Division, Department of Environment and Primary Industries, Agribio, and Department of Food and Agricultural Systems, University of Melbourne, Melbourne, Australia ⁹; CIBERESP, Madrid, Spain. ¹⁰
Published in: BMC Cancer 2016, 16:351
Category: Research Article
doi: https://doi.org/10.1186/s12885-016-2361-7

© 2016 de Maturana et al.

Open access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
The electronic version of this article is the complete one and can be found online at: http://bmccancer.biomedcentral.com/articles/10.1186/s12885-016-2361-7

Overview

Background:
We adapted Bayesian statistical learning strategies to the prognosis field to investigate whether genome-wide common SNPs improve the predictive ability of clinico-pathological prognosticators, and applied them to non-muscle invasive bladder cancer (NMIBC) patients.

Methods:
Adapted Bayesian sequential threshold models in combination with LASSO were applied to account for the time-to-event and censored nature of the data. We studied 822 NMIBC patients followed up for >10 years. The study outcomes were time-to-first-recurrence and time-to-progression. The predictive ability of the models including up to 171,304 SNPs and/or 6 clinico-pathological prognosticators was evaluated using the AUC-ROC and the determination coefficient.

Results:
Clinico-pathological prognosticators explained a larger proportion of the time-to-first-recurrence (3.1 %) and time-to-progression (5.4 %) phenotypic variances than SNPs (1 and 0.01 %, respectively). Adding SNPs to the clinico-pathological-parameters model slightly improved the prediction of time-to-first-recurrence (up to 4 %). The prediction of time-to-progression did not improve when both clinico-pathological prognosticators and SNPs were used. Heritability (ĥ²) of both outcomes was <1 % in NMIBC.

Conclusions:
We adapted a Bayesian statistical learning method to deal with a large number of parameters in prognostic studies. Common SNPs showed a limited role in predicting NMIBC outcomes yielding a very low heritability for both outcomes. We report for the first time a heritability estimate for a disease outcome. Our method can be extended to other disease models.

Keywords:
Multimarker models; Bayesian statistical learning method; Bayesian regression; Bayesian LASSO; AUC-ROC; Determination coefficient; Heritability; Bladder cancer outcome; Prognosis; Recurrence; Progression; Genome-wide common SNP; Illumina Infinium HumanHap 1 M array; Predictive ability

Background

Urothelial bladder cancer (UBC) is among the most common malignant tumors of the urological system and one of the most prevalent cancers due to its chronic nature [1]. As a consequence, it poses an enormous burden on health care systems [2].

UBC also represents a paradigm of heterogeneous diseases with respect to its phenotype and prognosis. Approximately 75 % of newly diagnosed UBCs do not invade the muscle (non-muscle invasive bladder cancer, NMIBC) at the time of diagnosis. Most of these cancers remain stable over time after a transurethral resection (TUR); a high proportion relapse without invading the muscle (recurrence), while a lower proportion progress to muscle invasive bladder cancer (MIBC). Based on tumor characteristics, mainly stage and grade, NMIBC are subsequently classified as “low risk” (LR) or “high risk” (HiR) of progression [3].

Current prognostic tools for NMIBC are based on well-known clinico-pathological prognosticators such as pathological grade and stage, number and size of tumours, and presence of carcinoma in situ [3, 4]. However, these factors do not have enough discriminative ability to predict, at the patient level, the risk of recurrence and progression [5]. An accurate estimation of the outcome risk in the individual patient would help identify the most appropriate therapy to avoid tumor progression and, hopefully, reduce the number of follow-up cystoscopies in patients at low risk [6].

There is growing evidence for a role of germline genetic polymorphisms in cancer risk and prognosis, UBC being a paradigm [7, 8]. However, the individual effect of the genetic variants is expected to be small and they may not be medically actionable. Multimarker analyses have been shown to capture a much higher percentage of the genetic variance than the individual markers that passed the significance threshold in GWAS [9, 10, 11].

Our objective was to investigate whether genome-wide common SNP profiles are able to predict the risk of recurrence and progression in NMIBC patients and to estimate how much they contribute to these predictions when combined with clinico-pathological prognosticators. To this end, we adapted Bayesian statistical learning strategies to be applied to the human prognosis field for the first time.

Methods

Study population

This study was performed in patients with primary UBC included in the Spanish Bladder cancer (SBC)/EPICURO Study. Cases were recruited in 18 hospitals and followed up >10 years after diagnosis. A total of 1,105 patients had their diagnosis confirmed through a pathological review conducted by a panel of experts. Trained monitors collected detailed data on clinico-pathological prognosticators from clinical charts and followed the patients up prospectively through the participating hospitals and direct telephone interviews.

In this study, we focused on patients with a primary diagnosis of NMIBC (N = 995). Two endpoints were of interest: 1) time-to-first-recurrence (TFR), defined as the reappearance of a NMIBC tumor following a previous negative follow-up cystoscopy, and 2) time-to-progression (TP), defined as the development of a muscle invasive tumor or metastatic disease, or death because of UBC, after a previous diagnosis of NMIBC. Patients who did not present any event by the end of the study, those lost to follow-up and those who died from other causes were considered as censored either at the last medical visit or at death.

Patients who underwent a cystectomy were not considered in the analyses of TFR. A final number of 810 and 822 cases with NMIBC were available for the analyses of TFR and TP, respectively: 284 were HiR tumors (Ta high grade, T1 high grade, carcinoma in situ (CIS) and T1 low grade tumors) and 538 were LR tumors (those presenting papillary UBC of low malignant potential or Ta low-grade papillary UBC according to the 2004 WHO classification).

Genotyping and quality control

Genotyping was performed as described in [12] and provided calls for 1,072,820 SNP genotypes. We excluded SNPs on sex chromosomes, those with a low genotyping rate (<95 %) and those with MAF < 0.02 in NMIBC.

Stringent LD pruning (r² < 0.2) was applied to reduce the number of markers, prioritizing those with less missing data. In addition, SNPs found significant in a previous prognostic study were considered here [11]. The final numbers of assessed SNPs for TFR and TP were 171,295 and 171,304, respectively, providing good coverage of the genome. Missing genotypes were imputed using the package randomForest in R [12].
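The filtering steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the toy genotype data, the greedy pairwise pruning and the use of the squared Pearson correlation of 0/1/2 genotype codes as the LD measure are all assumptions made for illustration. Only the thresholds (genotyping rate of at least 95 %, MAF of at least 0.02, r² < 0.2) and the priority given to SNPs with less missing data come from the text.

```python
from statistics import mean

def call_rate(genotypes):
    """Fraction of non-missing calls; genotypes are coded 0/1/2, None if missing."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    """Frequency of the less common allele among the called genotypes."""
    observed = [g for g in genotypes if g is not None]
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)

def r_squared(a, b):
    """Squared Pearson correlation over patients with both genotypes called."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy ** 2 / (sxx * syy)

def quality_control(snps, min_call_rate=0.95, min_maf=0.02, max_r2=0.2):
    """Return the names of SNPs that pass the filters and greedy LD pruning."""
    passed = {name: g for name, g in snps.items()
              if call_rate(g) >= min_call_rate
              and minor_allele_frequency(g) >= min_maf}
    # SNPs with less missing data are considered first, so they are kept
    # in preference to correlated SNPs with more missing calls.
    ordered = sorted(passed, key=lambda name: -call_rate(passed[name]))
    kept = []
    for name in ordered:
        if all(r_squared(passed[name], passed[k]) < max_r2 for k in kept):
            kept.append(name)
    return kept

# Toy data for 27 hypothetical patients (not study data).
base = [0, 0, 0, 1, 1, 1, 2, 2, 2] * 3
example = {
    "snp_a": base,
    "snp_b": [None] + base[1:],        # in complete LD with snp_a, one missing call
    "snp_c": [0, 1, 2] * 9,            # uncorrelated with snp_a
    "snp_d": [1] + [0] * 26,           # MAF below 0.02
    "snp_e": [None] * 3 + base[3:],    # genotyping rate below 95 %
}
kept = quality_control(example)
print(kept)
```

A genome-wide analysis would normally prune within sliding windows with dedicated software rather than compare all pairs, since the pairwise loop above scales quadratically with the number of SNPs.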

Statistical model

We used a sequential threshold model [13] to analyze time-to-event data. This approach was previously applied in quantitative genetics [13, 14, 15], although to date it has not been applied in a human genomic study. This model assumes that for an observation of a patient to be present at a given period of time, he/she must have survived through all previous time periods. Thus, the probability of not presenting the event of interest until interval k, conditional on the event that the k-th interval has been reached, is given by:

where γ corresponds to the unordered cutoff points for each time interval, and X corresponds to the incidence matrix of effects (β) affecting the liability to survive to the next interval given that the present interval has been reached. Residual variance (σ_e²) was fixed to 1 to ensure identifiability of the parameters [16].
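As a sketch of the conditional probability described above, one common probit formulation of a sequential threshold model can be written as follows. The sign convention, the index notation (patient i, interval k, interval of the event t_i), the liability symbol ℓ and the standard normal distribution function Φ are assumptions made here for illustration, and the expression may differ from the one used by the authors.

```latex
% Liability of patient i to survive interval k, with unit residual variance
\ell_{ik} = \mathbf{x}_i^{\top}\boldsymbol{\beta} + e_{ik},
\qquad e_{ik} \sim N(0, \sigma_e^2), \quad \sigma_e^2 = 1

% Patient i passes interval k event-free when the liability exceeds the cutoff
P(t_i > k \mid t_i \ge k) = P(\ell_{ik} > \gamma_k)
  = \Phi\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta} - \gamma_k\right)
```

Under this convention a larger value of x_i'β raises the probability of reaching the next interval, and each interval has its own cutoff γ_k with no ordering constraint between cutoffs.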

Patients were classified as censored or uncensored in each time interval considered for each event as displayed in Fig. 1. We divided the follow-up time for TFR and TP in 9 and 4 intervals, respectively, according to the survival functions for each event (see Figs. 2a and 3a). The analysis of TP was further stratified according to the tumor risk group (LR and HiR, see Fig. 3b). For these subgroup analyses the number of intervals was lower.
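The interval-wise classification described above can be sketched in Python as an expansion of each patient's follow-up into one binary record per interval reached. The function name, the cut points and the rule that an interval in which a patient is censored contributes no record are assumptions made for illustration; the exact handling used in the study is the one shown in Fig. 1. The sketch also assumes that the last cut point is at least as large as the longest follow-up.

```python
def expand_to_intervals(follow_up, had_event, cut_points):
    """Return one (interval_index, survived) record per informative interval.

    cut_points are the upper bounds of consecutive intervals (hypothetical
    values, in years). survived is 1 when the patient passed through the
    interval event-free and 0 when the event occurred in it. An interval in
    which the patient was censored contributes no record, because the
    outcome in that interval is unknown.
    """
    records = []
    for k, upper in enumerate(cut_points):
        if had_event and follow_up <= upper:
            records.append((k, 0))   # event observed in interval k
            break
        if follow_up < upper:
            break                    # censored within interval k
        records.append((k, 1))       # interval k completed event-free
    return records

cuts = [1, 2, 4]  # hypothetical interval bounds, not those used in the study
print(expand_to_intervals(1.5, True, cuts))    # event in the second interval
print(expand_to_intervals(2.5, False, cuts))   # censored in the third interval
```

Each record then enters the threshold model as a binary observation conditional on the interval having been reached, which is how censored patients still contribute information for the intervals they completed.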

Fig. 1. Data censoring in each defined interval according to the presence/absence of event when a sequential threshold model is applied

Fig. 2. Survival function (solid line) and 95 % CI (dotted lines) of the time to first recurrence (TFR) for the whole series (a) and according to the group of risk (b: HiR in red and LR in blue). Vertical lines separate the 9 time intervals considered for this outcome

Fig. 3. Survival function (solid line) and 95 % CI (dotted lines) of the time to progression (TP) for the whole series (a) and according to the group of risk (b: HiR in red and LR in blue). Vertical lines separate the 4 time intervals considered for this outcome

Three models were used in the analyses of each outcome: (1) a model including the clinico-pathological prognosticators only, (2) a model including the SNP data only, and (3) a model including both clinico-pathological prognosticators and SNP data. For the first model, a Bayesian regression was used (see Additional file 1: Table S1). Further information on the model building is given in Additional file 2: Supplementary methods. For the second model, a Bayesian LASSO [17] was applied to analyze the predictive ability of common SNPs (see Additional file 2: Supplementary methods for further details). Finally, for the full model, a Bayesian regression coupled with LASSO [18, 19] was used. Priors and fully conditional distributions for both SNPs and clinico-pathological prognosticators are described in Additional file 2: Supplementary methods.

Evaluation of the predictive ability

The predictive ability of each model in the whole cohort was evaluated through a 10-fold cross-validation (CV) [20]. When patients were stratified as HiR/LR for the TP analyses, a 2-fold CV procedure was performed instead, due to the low number of events. We measured the predictive ability of each model using two statistics: 1) the area under the ROC curve (AUC), generated with the ROCR package for R (www.r-project.org), and 2) the determination coefficient on the liability scale (R_probit²), which is the proportion of the total variance explained by predictors in the testing set on the probit liability scale [21]:
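The two statistics can be illustrated with a short Python sketch. It is not the code used in the study, where the AUC was obtained with ROCR. The sketch assumes that R_probit² takes the usual form for a probit model with unit residual variance, that is, var(ŷ) / (var(ŷ) + 1) with ŷ = Xβ̂ the predicted liability in the testing set; the function names and the use of the population variance are further assumptions.

```python
from statistics import pvariance

def auc(scores, labels):
    """Probability that a randomly chosen patient with the event (label 1)
    has a higher score than one without it (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def r2_probit(predicted_liability):
    """Share of liability-scale variance explained by the predictors,
    assuming a residual variance fixed to 1."""
    explained = pvariance(predicted_liability)
    return explained / (explained + 1.0)

# Toy predictions for four hypothetical patients in a testing set.
print(auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))   # perfect ranking
print(auc([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0]))   # no better than chance
print(r2_probit([-1.0, 1.0]))                    # explained variance 1
```

In a k-fold CV both statistics would be computed on each testing set and then averaged. Because the AUC depends only on the ranking of the predictions while R_probit² depends on their variance, the two measures need not move together.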

Results

Additional file 1: Table S2 provides the number of censored patients and events in each time interval according to the outcome of interest (TFR and TP).

Time to first recurrence

Thirty-three percent of the patients with a primary NMIBC suffered a recurrence of the primary tumor (first recurrence). Fifty percent of these patients presented the first recurrence during the first year and, in most cases (94 %), the first recurrence was diagnosed during the first 4 years of follow-up. Fifty-two percent of the NMIBC patients were censored at the end of the follow-up.

Table 1 and Additional file 1: Table S3 show the averaged AUC and R_probit² obtained after the 10-fold CV analyses with the three models. The model including clinico-pathological prognosticators had an averaged AUC of 0.62. The model including only SNPs classified slightly better than random (AUC = 0.55). The joint model did not perform better (AUC = 0.61).

Table 1. Averaged area under the ROC curve (AUC) and coefficient of determination (R_probit²), as well as standard deviations (in parentheses), obtained from the testing sets in the 10-fold cross-validation analyses of time to first recurrence (TFR) and time to progression in the whole (TP), high risk (TPHiR) and low risk (TPLR) cohorts

When the predictive ability was evaluated using R_probit², the model combining clinico-pathological prognosticators and SNPs performed best, capturing 4 % of the phenotypic variance on the liability scale. The predictive abilities of the clinico-pathological prognosticators and the SNP models were 3 and 1 %, respectively, the latter being the first heritability estimate (ĥ²) for TFR in NMIBC reported so far.

Time to first progression

Whole cohort

Nine percent of the patients with a primary NMIBC experienced tumor progression during the follow-up. Fifty percent of these patients were diagnosed with progression during the first two years and most of them (89 %) during the first 5 years (see Additional file 1: Table S2). Seventy-five percent of the patients did not show any progression at the end of the follow-up period (>10 years). Table 1 and Additional file 1: Table S4 show the AUC and R_probit² after the 10-fold CV analyses for TP. The model including clinico-pathological prognosticators had an averaged AUC of 0.76, a much higher value than the model with SNPs only (AUC = 0.58). Adding SNPs to clinico-pathological prognosticators did not increase their classification performance (AUC = 0.76). Clinico-pathological prognosticators explained 5.4 % of the phenotypic variance on the liability scale. Surprisingly, SNPs explained only 0.1 % of the variance. Adding SNPs to the clinico-pathological prognosticators worsened the R_probit² of the model (Table 1).

Patients at HiR

Among the HiR patients who progressed, the majority (~70 %) did so during the first two years of follow-up; 75 % of the HiR patients finished the follow-up without any progression (Additional file 1: Table S2). Table 1 and Additional file 1: Table S5 show the AUC and R_probit² of the three models evaluated. The model including only clinico-pathological prognosticators classified the patients according to TP similarly to the model including only SNPs (0.57 vs. 0.56, respectively). The model with the best R_probit² for progression at HiR was the one considering clinico-pathological prognosticators (R_probit² = 0.151). Including only common SNPs explained <1 % of the phenotypic variance of the cohort at HiR. Adding them to the clinico-pathological prognosticators increased their predictive ability by 2.6 % (R_probit² = 0.155).

Patients at LR

Only 24 patients showed a progression during the follow-up (<5 %). Two thirds of those patients were diagnosed during the first 2 years of follow-up. Table 1 and Additional file 1: Table S6 present the AUC and R_probit² of the three models corresponding to the 2-fold CV procedure. The model including clinico-pathological prognosticators poorly categorized LR-NMIBC patients according to their progression status (AUC = 0.45). By including age at diagnosis we obtained a better classification (AUC = 0.68). The SNP model classified the patients slightly better than random (AUC = 0.55). The best R_probit² was found for the model including only clinico-pathological prognosticators (0.0358). Adding SNPs to the latter model worsened its R_probit² (0.0267).

Discussion

Here we present a high-dimensional model that considers the time-to-event nature of the information and censored data, and that makes it possible to accommodate a large number of variables in a relatively small number of individuals. To our knowledge, this is the first time that such a model has been applied in the clinical and genetic epidemiology fields. More specifically, we have applied it to study the predictive ability of prognostic models for NMIBC patients.

The major goal in managing NMIBC patients is to prevent tumor relapse, including both the high number of recurrences and the progression to MIBC. To this end, treatment needs to be tailored according to the aggressiveness of the disease. Therefore, accurate prognostic models are crucial. Currently, there are no validated prognostic molecular biomarkers to guide the clinical management of patients [22, 23] and therapeutic decisions are still based on risk tables that include only clinico-pathological prognosticators [3]. Here we have investigated the potential clinical utility of inherited genetic markers (SNP profiles), given their robustness and precise measurement as well as their time-independent nature in comparison to serological and histological markers. To this end we have assessed the ability of genome-wide common SNP profiles to improve TFR and TP risk stratification in NMIBC patients. We have also evaluated the performance of well-known clinico-pathological prognosticators and how much the whole-genome approach improved their performance in classifying patients.

Regarding the classification performance of clinico-pathological prognosticators alone, our sequential threshold models for both TFR and TP yielded estimates similar to those we obtained previously with a Cox proportional hazards regression analysis [11]. Discrimination of patients according to the risk of TFR using clinico-pathological prognosticators was poorer than previously reported by Hernandez et al. [24] (0.62 vs. 0.75), although better than that reported by Vedder et al. [25] in a large cohort including ours. Nevertheless, it is worth noting that the definition of the outcome differs (recurrence vs. first recurrence) between our study and these studies [24, 25]. Regarding the TP outcome, our clinico-pathological prognosticators model classified the patients better than in Hernandez et al. [24] (0.76 vs. 0.54) and than in a Danish cohort using both EORTC (0.76 vs. 0.72) and CUETO (0.76 vs. 0.74) scores [25]. However, it performed worse than in a Dutch cohort using the same classifiers: EORTC (0.76 vs. 0.81 and 0.77) and CUETO scores (0.76 vs. 0.82 and 0.81) [25].

The predictive ability of clinico-pathological prognosticators depends on the outcome. They clearly perform better in predicting TP than TFR, both in terms of classification (AUC, 0.76 vs. 0.62) and proportion of the explained variance (R_probit², 5.4 % vs. 3.1 %). Their lower performance when predicting TFR could be due to the dependence of first tumour recurrence on factors other than tumour biology, such as potentially incomplete resection of the tumour during TURB and tumour cell reimplantation [23], which are difficult to assess and are therefore not accounted for in the model. When the patients were stratified according to their risk status, clinico-pathological prognosticators explained a larger proportion of the phenotypic variance (~15 %) in the HiR group than in the LR NMIBC group, probably because these factors were specifically selected to identify patients with HiR tumors with a high potential of progression. However, the overall classification performance in HiR NMIBC patients was poorer (AUC = 0.57) than in the whole cohort. While the discriminatory ability of clinico-pathological parameters for both NMIBC outcomes is valuable, there is room for improvement. More accurate discriminatory models would better select patients for aggressive treatment and would avoid unnecessary treatments, leading to better patient management. This justifies the search for further prognostic factors, among them tumour molecular alterations and inherited variation markers [3, 26, 27].

Our results showed that common genome-wide SNPs classified patients similarly, though poorly, regarding both TFR and TP in the whole series and in the HiR and LR subcohorts, with AUCs ranging from 0.55 to 0.58. Adding SNPs to the models did not improve the classification performance of clinico-pathological prognosticators, although improvements in R_probit² were achieved for TFR (from 3 to 4 %) and for TP in the HiR cohort (from 15.1 to 15.5 %). Surprisingly, adding SNPs to clinico-pathological prognosticators reduced the percentage of phenotypic variance (R_probit²) explained by the model with clinico-pathological prognosticators only by 7 and 25 % when predicting TP in the whole and the LR-NMIBC cohorts, respectively. The small improvement, or even deterioration, in terms of R_probit² could be explained by a correlation between the prediction of clinico-pathological prognosticators and that of SNPs. To confirm this, we calculated the R_probit² of a model with Xβ̂ obtained from clinico-pathological prognosticators only as the dependent variable and the SNPs as independent variables (see Table 2 and Additional file 1: Table S6). The proportion of the variance of the clinico-pathological predictions of TFR and TP explained by SNPs was larger than the proportion of the TFR and TP phenotypic variances explained by SNPs. The calculation of R_probit² allowed us to report the first ĥ² for TFR and TP in the whole series and in the HiR and LR subcohorts. The largest ĥ² corresponded to TFR (1 %) and to TP of patients at HiR (1 %), although they may be underestimated because of the sample size and the limitation on the number of SNPs included in the model [28]. All the above explains the small or nil contribution of the SNPs to the predictive ability of clinico-pathological prognosticators for the phenotypes of interest.
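The changes in R_probit² quoted above are relative to the value of the model with clinico-pathological prognosticators only, not differences in percentage points. A minimal Python check, using the R_probit² values reported in the Results for the HiR and LR cohorts (the function name is a hypothetical helper), is:

```python
def relative_change_percent(before, after):
    """Change of `after` relative to `before`, expressed in percent."""
    return 100.0 * (after - before) / before

# R_probit² without and with SNPs, as reported in the Results.
hir_change = relative_change_percent(0.151, 0.155)    # TP, HiR cohort
lr_change = relative_change_percent(0.0358, 0.0267)   # TP, LR cohort
print(round(hir_change, 1), round(lr_change, 1))
```

For the HiR cohort, the absolute gain of 0.4 percentage points (15.1 to 15.5 %) therefore corresponds to the relative gain of about 2.6 %, and the drop from 0.0358 to 0.0267 in the LR cohort corresponds to the relative loss of about 25 %.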
The poor predictive ability of common SNPs in NMIBC prognosis is in line with a previous study reporting low GWAS risk predictive values for UBC [19], as well as with those obtained in studies predicting risk for other neoplasms, such as breast cancer [29, 30]. The different results obtained with AUC and R_probit² can be explained by the different scales on which the predictions are expressed (observable for AUC and liability for R_probit²), their non-monotonic relationship, and the lower number of events, especially when the individuals were stratified.

Table 2. Estimates of the determination coefficient (R_probit²) measuring the proportion of variance of the liability to first recurrence (TFR) and progression (TP) risks in whole, high risk (TPHiR) and low risk (TPLR) cohorts of the clinico-pathological prognosticators explained by the common SNPs

While this is one of the largest and best-characterized NMIBC cohorts worldwide, the restricted sample size in the subgroup analyses is one of the limitations we face here, because the small number of events limits the prediction accuracy of the genomic profile achieved with the SNPs. This is even clearer when patients were further stratified as LR-NMIBC. Although increasing the sample size of the study would be desirable, heterogeneity across studies regarding patient recruitment, pathological classifications applied, and treatment or patient management would increase random misclassification and, therefore, would dilute estimates. While we conducted a genome-wide exploration, the models did not include all genotyped SNPs (1 million) but a subset filtered by a strict LD threshold. When we applied a less restrictive LD threshold (r² < 0.8) and considered a larger number of common SNPs, neither the classification performance nor the percentage of the phenotypic variance explained improved (results not shown). Including both rare and structural variants in the models may help to further characterize the outcomes and increase the precision of the predictive estimates. Other statistical modeling approaches could also improve the predictive power, for example by considering non-additive models that include epistatic interactions between SNPs or by adding functional information to the model. Exploring the integration of other -omics data such as microRNAs, as well as considering possible interactions between treatment and variants, could also help in this regard.

This study also presents several strengths, such as its population-based nature, detailed medical information, long follow-up, and centralized pathological review, which decreased the heterogeneity of the covariates stage and grade. The state-of-the-art methodology applied here allowed us to handle a highly dimensional problem and time-to-event data, as well as censoring. The application of such methodology allowed us to provide the first estimates of heritability for UBC outcomes.

Conclusions

Here we provide the scientific community, for the first time, with a methodology to estimate the heritability and the predictive ability of multidimensional data in the prognosis field. By applying it to the UBC setting, we observed that the role of common SNPs is very limited in the prediction of the risk of recurrence and progression in NMIBC. Future studies should explore whether the integration of other genetic variants, as well as the interactions among them and with treatment, contributes to building a more accurate predictive model, allowing the final assessment of the translational potential of inherited genetic variants in the clinic.

Abbreviations

AUC-ROC: area under the receiver operating characteristic curve; CIS: carcinoma in situ; GWAS: genome-wide association study; HiR: high risk; LASSO: least absolute shrinkage and selection operator; LD: linkage disequilibrium; LR: low risk; MIBC: muscle invasive bladder cancer; NMIBC: non-muscle invasive bladder cancer; SBC: Spanish Bladder Cancer; SNP: single nucleotide polymorphism; TFR: time-to-first-recurrence; TP: time-to-progression; TURB: transurethral resection of the bladder; UBC: urothelial bladder cancer; WHO: World Health Organization.

Acknowledgements

We acknowledge the coordinators, field and administrative workers, technicians and study participants of the Spanish Bladder Cancer/EPICURO study.

Spanish Bladder Cancer (SBC)/EPICURO Study investigators: Institut Municipal d’Investigació Mèdica, Universitat Pompeu Fabra, Barcelona – Coordinating Center (M. Kogevinas, N. Malats, F.X. Real, M. Sala, G. Castaño, M. Torà, D. Puente, C. Villanueva, C. Murta-Nascimento, J. Fortuny, E. López, S. Hernández, R. Jaramillo, G. Vellalta, L. Palencia, F. Fermández, A. Amorós, A. Alfaro, G. Carretero); Hospital del Mar, Universitat Autònoma de Barcelona, Barcelona (J. Lloreta, S. Serrano, L. Ferrer, A. Gelabert, J. Carles, O. Bielsa, K. Villadiego); Hospital Germans Trias i Pujol, Badalona, Barcelona (L. Cecchini, J.M. Saladié, L. Ibarz); Hospital de Sant Boi, Sant Boi de Llobregat, Barcelona (M. Céspedes); Consorci Hospitalari Parc Taulí, Sabadell (C. Serra, D. García, J. Pujadas, R. Hernando, A. Cabezuelo, C. Abad, A. Prera, J. Prat); Centre Hospitalari i Cardiològic, Manresa, Barcelona (M. Domènech, J. Badal, J. Malet); Hospital Universitario de Canarias, La Laguna, Tenerife (R. García-Closas, J. Rodríguez de Vera, A.I. Martín); Hospital Universitario Nuestra Señora de la Candelaria, Tenerife (J. Taño, F. Cáceres); Hospital General Universitario de Elche, Universidad Miguel Hernández, Elche, Alicante (A. Carrato, F. García-López, M. Ull, A. Teruel, E. Andrada, A. Bustos, A. Castillejo, J.L. Soto); Universidad de Oviedo, Oviedo, Asturias (A. Tardón); Hospital San Agustín, Avilés, Asturias (J.L. Guate, J.M. Lanzas, J. Velasco); Hospital Central Covadonga, Oviedo, Asturias (J.M. Fernández, J.J. Rodríguez, A. Herrero); Hospital Central General, Oviedo, Asturias (R. Abascal, C. Manzano, T. Miralles); Hospital de Cabueñes, Gijón, Asturias (M. Rivas, M. Arguelles); Hospital de Jove, Gijón, Asturias (M. Díaz, J. Sánchez, O. González); Hospital de Cruz Roja, Gijón, Asturias (A. Mateos, V. Frade); Hospital Alvarez-Buylla, Mieres, Asturias (P. Muntañola, C. Pravia); Hospital Jarrio, Coaña, Asturias (A.M. Huescar, F. Huergo); Hospital Carmen y Severo Ochoa, Cangas, Asturias (J. Mosquera).

Funding

The work was partially supported by Red Temática de Investigación Cooperativa en Cáncer (#RD12/0036/0050), Fondo de Investigaciones Sanitarias (FIS), Instituto de Salud Carlos III, (Grant numbers #PI00–0745, #PI05–1436, and #PI06–1614), and Asociación Española Contra el Cáncer (AECC), Spain; the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, USA (Contract NCI NO2-CP-11015); and EU-FP7-HEALTH-F2–2008–201663-UROMOL and EU-7FP-HEALTH-TransBioBC #601933. ELM was funded by a Sara Borrell fellowship, Instituto de Salud Carlos III, Spain; and AML by a fellowship of the European Urological Scholarship Program for Research (EUSP Scholarship S-01–2013).

Availability of data and materials

Data are available for collaborative research. Please contact the corresponding author.

Authors’ contributions

Conceived and designed the experiments: ELdM, MEG, NM. Performed the experiments: NR, MK, SJC, AT, MGC, AC, DTS, FXR, NM. Analyzed the data: ELdM, ACP, AML. Contributed reagents/materials/analysis tools: MGC, AGN, FXR, NM, MM. Wrote the paper: ELdM, MEG, NM. Led the statistical analysis: ELdM. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Written informed consent was obtained from study participants in accordance with the Institutional Review Board of the U.S. National Cancer Institute and the Ethics Committees of each participating hospital (Appendix 1).

Appendix 1. Participating centers in the study

U.S. National Cancer Institute (NCI)
Institut Municipal d’Investigació Mèdica and Hospital del Mar
Centro Nacional de Investigaciones Oncológicas (CNIO)
Hospital Germans Trias i Pujol (Badalona, Barcelona)
Hospital de Sant Boi (Sant Boi, Barcelona)
Centre Hospitalari Parc Taulí (Sabadell, Barcelona)
Centre Hospitalari i Cardiològic (Manresa, Barcelona)
Hospital Universitario (La Laguna, Tenerife)
Hospital La Candelaria (Santa Cruz, Tenerife)
Hospital General Universitario de Elche
Universidad Miguel Hernández (Elche, Alicante)
Universidad de Oviedo (Oviedo, Asturias)
Hospital San Agustín (Avilés, Asturias)
Hospital Central Covadonga (Oviedo, Asturias)
Hospital Central General (Oviedo, Asturias)
Hospital de Cabueñes (Gijón, Asturias)
Hospital de Jove (Gijón, Asturias)
Hospital de Cruz Roja (Gijón, Asturias)
Hospital Alvarez-Buylla (Mieres, Asturias)
Hospital Jarrio (Coaña, Asturias)
Hospital Carmen y Severo Ochoa (Cangas, Asturias)

Additional files

Additional file 1: Table S1. Clinico-pathological variables included in the predictive models for time to first recurrence (TFR) and time to progression (TP). Table S2. Summary of censored patients and events (%) for each event in each time interval defined for the statistical analyses. Table S3. Area under the ROC curve (AUC) and coefficient of determination (R_probit²) obtained for each testing set in the 10-fold cross-validation analyses of time to first recurrence. Table S4. Area under the ROC curve (AUC) and coefficient of determination (R_probit²) obtained for each testing set in the 10-fold cross-validation analyses of time to progression. Table S5. Area under the ROC curve (AUC) and coefficient of determination (R_probit²) obtained for each testing set in the 2-fold cross-validation analyses of time to progression in patients at high risk. Table S6. Area under the ROC curve (AUC) and coefficient of determination (R_probit²) obtained for each testing set in the 2-fold cross-validation analyses of time to progression in patients at low risk. Table S7. Coefficient of determination (R_probit²) obtained for each testing set in the 10-fold cross-validation analyses of time to first recurrence (TFR), time to progression (TP) in the whole cohort, and time to progression (TP) in the high and low risk cohorts (TPHiR and TPLR). (DOC 113 kb)

Additional file 2: Supplemental Methods. Model including non-genetic variables. (DOC 55 kb)

Received: 3 November 2015
Accepted: 12 May 2016
Published: 3 June 2016

* Correspondence:

N. Malats

Genetic and Molecular Epidemiology Group
Spanish National Cancer Research Centre (CNIO)
C/Melchor Fernández Almagro, 3
28029, Madrid, Spain

nmalats@cnio.es

Sources

1. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer. 2010; 127:2893–917.

2. Sievert KD, Amend B, Nagele U, Schilling D, Bedke J, Horstmann M, Hennenlotter J, Kruck S, Stenzl A. Economic aspects of bladder cancer: What are the benefits and costs? World J Urol. 2009;27:295–300.

3. Sylvester RJ, Van Der Meijden APM, Oosterlinck W, Witjes JA, Bouffioux C, Denis L, Newling DWW, Kurth K. Predicting recurrence and progression in individual patients with stage Ta T1 bladder cancer using EORTC risk tables: a combined analysis of 2596 patients from seven EORTC trials. Eur Urol. 2006;49:466–75.

4. Fernandez-Gomez J, Madero R, Solsona E, Unda M, Martinez-Piñeiro L, Gonzalez M, Portillo J, Ojea A, Pertusa C, Rodriguez-Molina J, Camacho JE, Rabadan M, Astobieta A, Montesinos M, Isorna S, Muntañola P, Gimeno A, Blas M, Martinez-Piñeiro JA. Predicting nonmuscle invasive bladder cancer recurrence and progression in patients treated with bacillus Calmette-Guerin: the CUETO scoring model. J Urol. 2009;182:2195–203.

5. Sylvester RJ. How well can you actually predict which non-muscle-invasive bladder cancer patients will progress? Eur Urol. 2011;60:431–3.

6. Thomas F, Rosario DJ, Rubin N, Goepel JR, Abbod MF, Catto JWF. The longterm outcome of treated high-risk nonmuscle-invasive bladder cancer: time to change treatment paradigm? Cancer. 2012;118:5525–34.

7. Grotenhuis AJ, Dudek AM, Verhaegh GW, Witjes JA, Aben KK, van der Marel SL, Vermeulen SH, Kiemeney LA. Prognostic relevance of urinary bladder cancer susceptibility loci. PLoS One. 2014;9:e89164.

8. Chen M, Hildebrandt MAT, Clague J, Kamat AM, Picornell A, Chang J, Zhang X, Izzo J, Yang H, Lin J, Gu J, Chanock S, Kogevinas M, Rothman N, Silverman DT, Garcia-Closas M, Barton Grossman H, Dinney CP, Malats N, Wu X. Genetic variations in the sonic hedgehog pathway affect clinical outcomes in nonmuscle-invasive bladder cancer. Cancer Prev Res. 2010;3:1235–45.

9. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, et al. Common {SNPs} explain a large proportion of the heritability for human height. Nat Gen. 2010;42:565–9.

10. Makowsky R, Pajewski NM, Klimentidis YC, Vazquez AI, Duarte CW, Allison DB, de los Campos G. Beyond missing heritability: prediction of complex traits. PLoS Genet. 2011;7:e1002051.

11. Picornell AC. Genomewide pronostic study in bladder cancer. 2013.

12. Liaw A, Wiener M. Package “randomForest.”. 2015.

13. Albert JH, Chib S. Sequential ordinal modeling with applications to survival data. Biometrics. 2001;57:829–36.

14. Visscher PM, Goddard ME. Genetic parameters for milk yield, survival, workability, and type traits for Australian dairy cattle. J Dairy Sci. 1995;78:205–20.

15. Gonzalez-Recio O, Alenda R. Genetic relationship of discrete-time survival with fertility and production in dairy cattle using bivariate models. Genet Evol. 2007;39(0999-193X (Print):391–404.

16. Gianola D, Sorensen D. Quantitative genetic models for describing simultaneous and recursive relationships between phenotypes. Genetics. 2004;167:1407–24.

17. Park T, Casella G. The Bayesian lasso. J Am Stat Assoc. 2008;103:681–6.

18. De Los CG, Naya H, Gianola D, Crossa J, Legarra A, Manfredi E, Weigel K, Cotes JM. Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics. 2009;182:375–85.

19. de Maturana EL, Chanok SJ, Picornell AC, Rothman N, Herranz J, Calle ML, García-Closas M, Marenne G, Brand A, Tardón A, Carrato A, Silverman DT, Kogevinas M, Gianola D, Real FX, Malats N. Whole genome prediction of bladder cancer risk with the Bayesian LASSO. Genet Epidemiol. 2014;38:467–76.

20. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. 2009.

21. Lee SH, Goddard ME, Wray NR, Visscher PM. A better coefficient of determination for genetic profile analysis. Genet Epidemiol. 2012;36:214–24.

22. Di Martino E, Tomlinson DC, Knowles MA. A decade of FGF receptor research in bladder cancer: past, present, and future challenges. Adv Urol. 2012;2012:429213.

23. Karaoglu I, van der Heijden AG, Witjes JA. The role of urine markers, white light cystoscopy and fluorescence cystoscopy in recurrence, progression and follow-up of non-muscle invasive bladder cancer. World J Urol. 2014;32:651–9.

24. Hernández V, De La Peña E, Martin MD, Blázquez C, Diaz FJ, Llorente C. External validation and applicability of the EORTC risk tables for nonmuscle-invasive bladder cancer. World J Urol. 2011;29:409–14.

25. Vedder MM, Márquez M, de Bekker-Grob EW, Calle ML, Dyrskjøt L, Kogevinas M, Segersten U, Malmström P-U, Algaba F, Beukers W, Ørntoft TF, Zwarthoff E, Real FX, Malats N, Steyerberg EW. Risk prediction scores for recurrence and progression of non-muscle invasive bladder cancer: an international validation in primary tumours. PLoS One. 2014;9:e96849.

26. Stenzl A, Cowan NC, De Santis M, Kuczyk MA, Merseburger AS, Ribal MJ, Sherif A, Witjes JA. Treatment of muscle-invasive and metastatic bladder cancer: update of the EAU guidelines. Eur Urol. 2011;59:1009–18.

27. Babjuk M, Oosterlinck W, Sylvester R, Kaasinen E, Böhle A, Palou-Redorta J, Rouprêt M. EAU guidelines on non-muscle-invasive urothelial carcinoma of the bladder, the 2011 update. Eur Urol. 2011;59:997–1008.

28. de Los CG, Sorensen D, Gianola D. Genomic heritability: what is it? PLoS Genet. 2015;11:e1005048.

29. Van Zitteren M, Van Der Net JB, Kundu S, Freedman AN, Van Duijn CM, Janssens ACJW. Genome-based prediction of breast cancer risk in the general population: a modeling study based on metaanalyses of genetic associations. Cancer Epidemiol Biomarkers Prev. 2011;20:9–22.

30. Wacholder S, Hartge P, Prentice R, Garcia-Closas M, Feigelson HS, Diver WR, Thun MJ, Cox DG, Hankinson SE, Kraft P, Rosner B, Berg CD, Brinton L a, Lissowska J, Sherman ME, Chlebowski R, Kooperberg C, Jackson RD, Buckman DW, Hui P, Pfeiffer R, Jacobs KB, Thomas GD, Hoover RN, Gail MH, Chanock SJ, Hunter DJ. Performance of common genetic variants in breastcancer risk models. N Engl J Med. 2010;362:986–93.
