Abstract
To identify genetic variants associated with outcome from chronic lymphocytic leukemia (CLL), we genotyped 977 nonsynonymous single nucleotide polymorphisms (nsSNPs) in 755 genes with relevance to cancer biology in 425 patients participating in a phase 3 trial comparing the efficacy of fludarabine, chlorambucil, and fludarabine with cyclophosphamide as first-line treatment. Selection of nsSNPs was biased toward those likely to be functionally deleterious. SNP genotypes were linked to individual patient outcome data and response to chemotherapy. The effect of genotype on progression-free survival (PFS) and overall survival (OS) was assessed by Cox regression analysis adjusting for treatment and clinico-pathologic variables. A total of 78 SNPs (51 dominantly acting and a further 27 recessively acting) were associated with PFS (9 also affecting OS) at the 5% level. These included SNPs mapping to the immune-regulation genes IL16 P434S (P = .03), IL19 S213F (P = .001), LILRA4 P27L (P = .004), KLRC4 S29I (P = .007), and CD5 V471A (P = .002); and DNA response genes POLB P242R (P = .04) and TOPBP1 S730L (P = .02), which were all independently prognostic of immunoglobulin heavy-chain variable region (IgVH) mutational status. The variants identified warrant further evaluation as promising prognostic markers of patient outcome. To facilitate the identification of prognostic markers through pooled analyses, we have made all data from our analysis publicly available.
Introduction
B-cell chronic lymphocytic leukemia (B-CLL) accounts for approximately 25% of all leukemia and is the most common lymphoid malignancy in Western countries.1 Although staging systems (Binet, Rai) are useful for predicting prognosis and treatment requirements,2 there is variability in clinical outcome for patients with apparently the same stage of disease. Hence, accurate assessment of prognosis would be beneficial in the choice of specific therapeutic options, while assessment of the likelihood of response and adverse reactions to chemotherapeutic treatments would allow patient-tailored decisions on drug selection and consequent improvements in survival.
Molecular markers such as immunoglobulin heavy-chain variable region (IgVH) mutational status have been shown to greatly assist in defining patient prognosis.3-5 In addition to somatic differences, it is probable that germ-line variation also plays a role in defining individual patient outcome. Studies that have investigated this possibility have to date only evaluated a restricted number of genetic variants in a restricted number of candidate genes: polymorphisms in the coding sequence of P2X7,6 GNAS,7 SDF-1,8 MTHFR,9 p53,10 and polymorphisms in the putative promoter sequences of TNFα,11 IL-1 and IL-6,12 BCL-2,13 and BAX.14,15 However, many of these purported associations, such as those in P2X7, BAX, and p53, have not been confirmed by independent studies.16-19
To date, most searches for polymorphic markers of prognosis have been formulated on an “educated guess”; however, without a full understanding of the biology of CLL, definition of additional “suitable” candidate genes is problematic. This makes searches on a “genome-wide” basis an attractive proposition. A further issue with many of the previously reported studies is that they have been based on small numbers of patients ascertained outside the context of a clinical trial and, in addition to having limited power, are prone to bias as a consequence of survivorship and other confounders, especially when based on retrospectively ascertained patients who have received nonuniform treatment.
We have recently conducted an association study based on an analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) in genes with a priori relevance to cancer biology to identify susceptibility alleles for CLL.20 As nsSNPs alter the encoded amino acid sequence, they have the potential to directly affect the function of expressed proteins. A total of 425 patients with CLL genotyped in this study were participants in a large phase 3 clinical trial. Linking genetic data to clinical outcome information for these patients has allowed us to test, in an unbiased fashion, the hypothesis that sequence variation defined by these nsSNPs influences the clinical behavior of CLL.
Methods
Patients
The study is based on patients entered in the UK Leukemia Research Fund (LRF) CLL-4 trial. Full details about the design and conduct of the trial are described in previously published material.21 Briefly, CLL-4 was a randomized phase 3 trial established to compare the efficacy of fludarabine, chlorambucil, and the combination of fludarabine plus cyclophosphamide as a first-line treatment for Binet stages B, C, and A-progressive CLL. The trial as a whole recruited from a total of 136 centers, primarily from the United Kingdom. Of the 777 patients entered into the trial, the current analysis is based on a random subset of 425 white patients, drawn from 110 of the participating centers, who had blood samples taken for clinical diagnostic purposes and cell marker studies. Ethical approval for the study was obtained from the Multi-Center Research Ethics Committee in accordance with the Declaration of Helsinki.
Selection of candidate genes, SNP selection and genotyping
Candidate gene selection and SNP genotyping have been described previously.22 Briefly, nsSNPs mapping to candidate cancer genes were identified by interrogation of the Predicted Impact of Coding SNPs (PICS) database23 of potential functional SNPs and published work on the resequencing of DNA repair genes. To increase the likelihood of identifying susceptibility alleles, we analyzed nsSNPs in 871 genes with relevance to tumor biology, biasing selection of SNPs toward those likely to have direct functional consequences. nsSNPs were genotyped across patient DNAs extracted by standard salt-lysis protocol using customized Illumina Sentrix Bead Arrays according to the manufacturer's protocols (Illumina, San Diego, CA).
To predict the impact of missense variants on protein function, we applied 2 in silico algorithms, polymorphism phenotyping (PolyPhen)24 and sifting intolerant from tolerant (SIFT).25 PolyPhen predicts the functional impact of amino acid changes by considering evolutionary conservation, physicochemical differences, and the proximity of the substitution to predicted functional domains and/or structural features. SIFT predicts the functional importance of amino acid substitutions based on the alignment of orthologous and/or paralogous protein sequences. SIFT and PolyPhen scores were classified as intolerant, potentially intolerant, borderline, or tolerant and probably damaging, possibly damaging, potentially damaging, borderline, or benign, respectively, according to the classifications proposed by Xi et al26 and Ng and Henikoff.25 Where it was not possible to derive SIFT or PolyPhen metrics, we categorized codon replacements on the basis of Grantham matrix27 according to the classification of Li et al.28
Statistical methods
To examine for systematic biases, we assessed the deviation of the genotype frequencies in the patients from that expected under Hardy-Weinberg Equilibrium (HWE) using a quantile-quantile (Q-Q) plot and estimated an overdispersion factor λ29 by calculating the ratio of the mean of the smallest 90% of observed test statistics to the mean of the corresponding expected values.
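The two quantities described above can be illustrated with a minimal sketch. It assumes a 1-df Pearson chi-square statistic for the HWE test and plotting positions of (i - 0.5)/n for the expected quantiles; neither choice is stated in the text, and the function names are hypothetical. The genotype counts are those reported in the Results for p53 Arg72Pro.

```python
from statistics import NormalDist, mean


def hwe_chi_square(n_common_hom, n_het, n_minor_hom):
    """Pearson chi-square statistic (1 df) for departure from HWE."""
    n = n_common_hom + n_het + n_minor_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # common-allele frequency
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_common_hom, n_het, n_minor_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_chi_square_quantiles(n_tests):
    """Expected ordered 1-df chi-square values under the null.

    Assumption: plotting positions (i - 0.5) / n. A 1-df chi-square
    quantile is the square of a standard normal quantile.
    """
    z = NormalDist()
    return [z.inv_cdf((1 + (i - 0.5) / n_tests) / 2) ** 2
            for i in range(1, n_tests + 1)]


def overdispersion_lambda(observed_stats, fraction=0.9):
    """Mean of the smallest 90% of observed statistics divided by the
    mean of the corresponding expected values."""
    obs = sorted(observed_stats)
    exp = expected_chi_square_quantiles(len(obs))
    k = max(1, int(len(obs) * fraction))
    return mean(obs[:k]) / mean(exp[:k])


# Genotype counts reported in the Results for p53 Arg72Pro.
chi2 = hwe_chi_square(239, 154, 28)
print(f"HWE chi-square (1 df): {chi2:.3f}")
```

A value of λ close to 1 indicates that the observed statistics follow the null distribution; values well above 1 would suggest systematic inflation, for example from genotyping error.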
Progression-free survival (PFS) and overall survival (OS) of patients were the end points of the analysis. The response recorded for each patient was the best achieved at any time due to first-line treatment. PFS was defined as the survival time from the date of randomization to relapse needing further therapy, progression, or death from any cause. For nonresponders and patients with progressive disease, the date of progression was the date on which no response or progressive disease was recorded. OS was calculated from randomization to death from any cause. Data on age at diagnosis, stage, sex, and treatment were available for all patients, and IgVH mutation status data were available for a subset of patients, enabling us to examine the significance of each SNP conditional on these covariates. Cox-regression analysis was used to estimate genotype-specific hazard ratios (HRs) and 95% confidence intervals (CIs). For each SNP genotype, HRs were generated using common allele homozygotes as the reference group. P values presented correspond to the significance of a test of difference among all 3 genotype groups (common allele homozygote, heterozygote, and rare allele homozygote). For SNPs where 5 or fewer minor allele homozygotes were observed, minor allele homozygote genotypes were combined with heterozygotes. If this combined count was still 5 or fewer, then the SNP was removed from the analysis. HRs, 95% CIs, and associated P values under both dominant and recessive models were also generated. To assess the distribution of test statistics, we generated Q-Q plots. To investigate possible epistatic interactions, the relationship between PFS and each pair of SNPs that showed significant allelic association at the 5% level was evaluated. PFS curves were plotted using Kaplan-Meier estimates. All statistical analyses were undertaken using SAS Version 9.1 (SAS Institute, Cary, NC) and R Version 2.5.0 (The R Foundation for Statistical Computing, www.r-project.org/foundation/main.html).
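The genotype-grouping rule and the dominant and recessive codings described above can be sketched as follows. This is an illustration only: the function names and return labels are hypothetical, and the codings assume the conventional definitions (dominant: any copy of the minor allele; recessive: 2 copies), which the text does not spell out.

```python
MIN_COUNT = 5  # the "5 or fewer" threshold stated in the text


def analysis_plan(n_het, n_minor_hom, min_count=MIN_COUNT):
    """Decide how a SNP enters the genotype analysis.

    - more than `min_count` minor homozygotes: test all 3 genotype groups
    - otherwise pool minor homozygotes with heterozygotes
    - if the pooled group is still `min_count` or fewer: drop the SNP
    """
    if n_minor_hom > min_count:
        return "three-group"
    if n_het + n_minor_hom > min_count:
        return "combined-carriers"
    return "excluded"


def dominant_code(minor_allele_count):
    """1 for heterozygotes and minor homozygotes, 0 for common homozygotes."""
    return 1 if minor_allele_count >= 1 else 0


def recessive_code(minor_allele_count):
    """1 for minor homozygotes only, 0 otherwise."""
    return 1 if minor_allele_count == 2 else 0


# Genotypes expressed as the number of minor alleles carried (0, 1, or 2).
genotypes = [0, 1, 2]
print([dominant_code(g) for g in genotypes])
print([recessive_code(g) for g in genotypes])
print(analysis_plan(n_het=154, n_minor_hom=28))
```

The resulting 0/1 indicator would then be entered as a covariate in the Cox model alongside the adjustment variables.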
Results
Descriptive statistics
The clinicopathologic characteristics of patients studied are detailed in Table 1. Of the patients genotyped, approximately a third had been diagnosed with CLL before age 60 years, and approximately two-thirds had presented with stage B or C disease. The median follow-up time for patients was 32 months, with follow-up to October 31, 2005. During follow-up, 27% of the cohort died. Sex, stage at presentation, and drug therapy all significantly influenced PFS. Specifically, being female, having early-stage disease, and the combination of fludarabine and cyclophosphamide as drug therapy were all significantly associated with a more favorable prognosis (Table 2). Only age and stage at presentation significantly affected OS. For 354 (83%) of the 425 patients, we had information on IgVH mutation status. As in the parent trial, this covariate was prognostically important, being predictive of both PFS and OS (P < .001 and .002, respectively).
Characteristic | CLL-4 patients genotyped (N = 425), no. (%) | CLL-4 patients excluded (N = 352), no. (%) | P* |
---|---|---|---|
Sex | |||
Male | 312 (73.4) | 261 (74.1) | — |
Female | 113 (26.6) | 91 (25.9) | .8 |
Age | |||
Less than 60 y | 127 (29.9) | 127 (36.1) | — |
60 to 69 y | 168 (39.5) | 121 (34.4) | — |
Over 70 y | 130 (30.6) | 104 (29.5) | .2 |
Stage | |||
A | 104 (24.5) | 87 (24.7) | — |
B | 186 (43.8) | 166 (47.2) | — |
C | 135 (31.8) | 99 (28.1) | .5 |
Died | |||
Yes | 114 (26.8) | 88 (25.0) | — |
No | 311 (73.2) | 264 (75.0) | .6 |
Treatment | |||
Chlorambucil | 206 (48.5) | 181 (51.4) | — |
Fludarabine | 112 (26.4) | 82 (23.3) | — |
Fludarabine and cyclophosphamide | 107 (25.2) | 89 (25.3) | .6 |
—indicates not applicable.
*Chi-square test.
Variable | N | OS: no. deaths | OS: O/E ratio | OS: P* | PFS: progression or death | PFS: O/E ratio | PFS: P* |
---|---|---|---|---|---|---|---|
Treatment | |||||||
Chlorambucil | 206 | 48 | 0.87 | .3 | 155 | 1.36 | < .001 |
Fludarabine | 112 | 33 | 1.16 | — | 75 | 1.12 | — |
Fludarabine and cyclophosphamide | 107 | 33 | 1.08 | — | 52 | 0.51 | — |
Stage | |||||||
A | 104 | 20 | 0.67 | < .001 | 66 | 0.94 | .03 |
B | 186 | 45 | 0.88 | — | 116 | 0.87 | — |
C | 135 | 49 | 1.5 | — | 100 | 1.28 | — |
Age | |||||||
Less than 60 y | 127 | 21 | 0.51 | < .001 | 88 | 1.01 | .8 |
60 to 69 y | 168 | 42 | 0.94 | — | 109 | 0.96 | — |
Over 70 y | 130 | 51 | 1.81 | — | 85 | 1.05 | — |
Sex | |||||||
Male | 312 | 88 | 1.05 | .4 | 218 | 1.08 | .02 |
Female | 113 | 26 | 0.87 | — | 64 | 0.79 | — |
O/E ratio indicates observed versus expected ratio.
*Log-rank test for trend.
There was no difference in the demographics, treatment, and follow-up characteristics of the 425 patients genotyped herein compared with the patients entered into CLL-4 but who were excluded from this analysis (Table 1). The observation that conventional staging and other prognosis markers were equally predictive of survival in the complete dataset and the subgroup analyzed provides further evidence that subselection of patients genotyped is unlikely to bias overall study findings.
Relationship between SNP genotype and PFS
Within the parent study, SNP call rates per sample were greater than 99% in patients. Of the 1467 SNPs submitted for analysis, 1218 (83%) were satisfactorily genotyped, with mean individual SNP call rates of 99.7%. Of the 1218 SNP loci that were satisfactorily genotyped, 241 were either fixed in all patient samples or were observed at subpolymorphic frequencies (ie, having a minor allele frequency [MAF] of less than 1%), and hence were excluded from all analyses. The distribution of SNP genotypes was not significantly different from that expected under HWE, with λ equal to 1.02 (Figure S1, available on the Blood website; see the Supplemental Materials link at the top of the online article). Of the 977 remaining SNPs mapping to the 755 genes to be analyzed, minor allele genotype counts were too low for 57 SNPs to be included in the genotype and dominant analyses and for 438 to be included in the recessive model analyses. The SNPs for which genotype-survival associations were examined are detailed in Table S1 together with their associated MAFs.
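The filtering steps above reduce to simple arithmetic, sketched here for illustration. The function names are hypothetical; the SNP counts are those given in the text, and the example genotype counts are the p53 Arg72Pro counts reported in the following paragraph.

```python
def minor_allele_frequency(n_common_hom, n_het, n_minor_hom):
    """Frequency of the less common allele from genotype counts."""
    n_alleles = 2 * (n_common_hom + n_het + n_minor_hom)
    freq = (n_het + 2 * n_minor_hom) / n_alleles
    return min(freq, 1 - freq)


def passes_maf_filter(maf, threshold=0.01):
    """SNPs with MAF below 1% were excluded from all analyses."""
    return maf >= threshold


# SNP counts as reported in the text.
submitted, genotyped, low_maf = 1467, 1218, 241
analyzed = genotyped - low_maf          # SNPs carried forward
genotype_model = analyzed - 57          # genotype and dominant analyses
recessive_model = analyzed - 438        # recessive analyses

maf = minor_allele_frequency(239, 154, 28)
print(f"MAF = {maf:.3f}, retained = {passes_maf_filter(maf)}")
print(analyzed, genotype_model, recessive_model)
```

Under these figures, 920 SNPs enter the genotype and dominant analyses and 539 enter the recessive analyses.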
As 10% to 15% of patients with CLL with progressive disease are thought to display deletions involving 17p, we examined this particular locus in detail. Genotypes for the p53 Arg72Pro polymorphism (rs1042522)30 were obtained for 421 (99.1%) of the 425 patients, and the distribution of genotypes (239, 154, 28) was not significantly different from that we have previously documented in the United Kingdom population31 (1471, 1022, 197; P = .64).
After adjustment for sex, age, stage of disease at presentation, and drug regimen, statistically significant associations between genotype and PFS were identified for 42 SNPs at the 5% level. A Q-Q plot of test statistics showed no large deviation from the expected distribution (Figure S2). Results of all SNPs included in the genotype-survival analysis are detailed in Table S2. On the basis of dominant and recessive models, 78 different nsSNPs were significant, 51 under a dominant model and 27 under a recessive model; none, however, remained significant after adjustment for multiple testing using the conservative Bonferroni correction.
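To make the multiple-testing point concrete, the sketch below computes a Bonferroni threshold and the number of nominal associations expected by chance alone. It assumes that the number of tests equals the number of SNPs entering the genotype analysis (977 minus the 57 excluded for low counts); the text does not state the denominator actually used, so this figure is an assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that controls the family-wise error."""
    return alpha / n_tests


def expected_chance_hits(alpha, n_tests):
    """Expected number of nominally significant tests if every null holds."""
    return alpha * n_tests


n_tests = 977 - 57  # assumption: one test per SNP in the genotype analysis
threshold = bonferroni_threshold(0.05, n_tests)
chance_hits = expected_chance_hits(0.05, n_tests)

smallest_reported_p = 0.001  # smallest nominal P value listed in Table 3
survives = smallest_reported_p <= threshold

print(f"Bonferroni threshold: {threshold:.2e}")
print(f"Expected chance hits at the 5% level: {chance_hits:.0f}")
print(f"Smallest reported P survives correction: {survives}")
```

Under this assumption, the number of nominal associations expected by chance is of the same order as the 42 observed, which is in keeping with the Q-Q plot showing no large deviation from the expected distribution.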
With such caveats in mind, several potentially interesting findings were observed. Of the SNPs identified to be prognostic (on the basis of nominal P ≤ .05), 2 have been established to be functional, and a further 28 SNPs are predicted to be functional (Table 3). Collectively, the SNPs predictive of PFS mapped to a restricted number of gene ontologies, and for purposes of clarity, we have restricted our commentary to the genes within pathways for which polymorphic variation in at least one SNP was associated with PFS, specifically genes involved in DNA damage-response/repair and immune regulation.
Gene | Symbol | SNP | dbSNP | Reference allele | N* | HR | 95% CI | P |
---|---|---|---|---|---|---|---|---|
Dominantly acting alleles | ||||||||
Sec23 homolog B | SEC23B | H489Q | — | C | 93 | 0.60 | 0.45-0.81 | .001† |
Leukocyte immunoglobulin-like receptor, subfamily A member 4 | LILRA4 | P27L‡ | rs2241384 | C | 285 | 1.44 | 1.12-1.85 | .004† |
Chondroitin sulfate galnact-2 | GALNACT-2 | P479S‡ | rs2435381 | C | 237 | 0.70 | 0.55-0.89 | .004† |
Cubilin (intrinsic factor-cobalamin receptor) | CUBN | S253F | rs1801222 | C | 151 | 1.42 | 1.11-1.83 | .006† |
Killer cell lectin-like receptor subfamily C, member 4 | KLRC4 | S29I‡ | rs1841958 | G | 195 | 0.72 | 0.57-0.91 | .007† |
Phospholysine phosphohistidine inorganic pyrophosphate phosphatase | LHPP | R94Q | rs6597801 | G | 344 | 1.49 | 1.11-1.99 | .008 |
Melanoma inhibitory activity 2 | MIA2 | D547H | rs10134365 | G | 269 | 1.39 | 1.09-1.78 | .008 |
ARP10 protein | ARP10 | K121N | rs139299 | G | 114 | 1.47 | 1.11-1.95 | .008† |
Dystonin | DST | Q1634R | rs4712138 | A | 153 | 1.39 | 1.08-1.79 | .011† |
Myosin binding protein C, cardiac | MYBPC3 | V158M‡ | rs3729986 | G | 350 | 1.47 | 1.09-1.99 | .012† |
Sema domain, immunoglobulin domain, short basic domain, secreted | SEMA3C | V337M‡ | rs1527482 | G | 419 | 3.15 | 1.28-7.75 | .013 |
Zinc finger protein 169 | ZNF169 | Q583H‡ | rs12350212 | G | 344 | 1.46 | 1.08-1.97 | .013† |
Serologically defined colon cancer antigen 3-like | SDCCAG3L | I437T | rs2000746 | T | 218 | 0.74 | 0.58-0.94 | .013 |
Phospholipase A2, group VII | PLA2G7 | A379V | rs1051931 | C | 281 | 0.72 | 0.56-0.94 | .014† |
Apolipoprotein E | APOE | R176C‡§ | rs7412 | C | 352 | 1.46 | 1.08-1.98 | .015† |
BRCA1-associated RING domain 1 | BARD1 | S378R‡ | rs2229571 | C | 154 | 0.75 | 0.59-0.95 | .019† |
General transcription factor IIE, polypeptide 1, alpha 56 kDa | GTF2E1 | P366S | rs3732401 | C | 387 | 0.56 | 0.34-0.91 | .02† |
X-ray repair complementing defective repair in Chinese hamster cells 2 | XRCC2 | R188H | rs3218536 | G | 354 | 1.44 | 1.06-1.95 | .02 |
Excision repair cross-complementing rodent repair deficiency, complementation group | ERCC6 | G399D | rs2228528 | G | 299 | 1.35 | 1.05-1.73 | .021 |
Low-density lipoprotein receptor–related protein 4 | LRP4 | G1554S‡ | rs2306029 | G | 133 | 1.37 | 1.05-1.78 | .021† |
Erbb2 interacting protein | ERBB2IP | S274L‡ | rs3213837 | C | 291 | 1.34 | 1.04-1.72 | .022 |
Low-density lipoprotein–related protein 2 | LRP2 | L4210I‡ | rs4667591 | C | 262 | 1.32 | 1.04-1.68 | .024 |
Cancer susceptibility candidate 5 | CASC5 | M598T | rs11858113 | T | 139 | 0.75 | 0.59-0.96 | .025† |
Tyrosinase | TYR | S192Y‡ | rs1042602 | C | 159 | 0.76 | 0.60-0.97 | .027 |
Thrombospondin 1 | THBS1 | N700S‡§ | rs17632786 | A | 325 | 0.72 | 0.54-0.96 | .027† |
Excision repair cross-complementing rodent repair deficiency, complementation group | ERCC5 | H1104D | rs17655 | C | 256 | 1.32 | 1.03-1.69 | .028 |
Transmembrane protease, serine 4 | TMPRSS4 | G206V | rs1941635 | G | 395 | 1.57 | 1.05-2.35 | .028† |
E2F transcription factor 2 | E2F2 | G205R | rs3218170 | G | 412 | 1.98 | 1.08-3.62 | .028 |
Laminin, alpha 2 | LAMA2 | V1138M | rs2306942 | G | 378 | 1.51 | 1.04-2.18 | .028 |
Interleukin-16 (lymphocyte chemoattractant factor) | IL16 | P434S | rs4072111 | C | 334 | 0.72 | 0.53-0.97 | .031† |
Carboxypeptidase A4 | CPA4 | G303C‡ | rs2171492 | G | 153 | 0.76 | 0.6-0.98 | .032 |
RAD18 homolog | RAD18 | Q302R | rs373572 | A | 223 | 1.29 | 1.02-1.64 | .034 |
Cytochrome P450, family 39, subfamily A, polypeptide 1 | CYP39A1 | P23R | rs12192544 | C | 258 | 1.3 | 1.02-1.65 | .034† |
Keratin 1 | KRT1 | A454S‡ | rs17678945 | G | 395 | 1.59 | 1.03-2.44 | .035 |
Polymerase (DNA-directed), beta | POLB | P242R‡ | rs3136797 | C | 409 | 1.87 | 1.04-3.36 | .036† |
Breast cancer 2, early onset | BRCA2 | T1915M | rs4987117 | C | 393 | 1.59 | 1.03-2.44 | .036 |
PMS2 postmeiotic segregation increased 2 | PMS2 | M622I | rs1805324 | G | 411 | 0.39 | 0.16-0.95 | .037 |
Ubiquitin specific protease 47 | USP47 | V75G‡ | rs11022079 | T | 205 | 1.28 | 1.01-1.63 | .039 |
Zinc finger protein 311 | ZNF311 | K511Q | rs6456880 | A | 168 | 1.29 | 1.01-1.65 | .039 |
Zinc finger CCCH-type containing 3 | ZC3H3 | G452S | rs4874147 | G | 245 | 1.30 | 1.01-1.66 | .040 |
Protein tyrosine phosphatase, nonreceptor type 13 | PTPN13 | Y2062D | rs989902 | T | 146 | 0.78 | 0.61-0.99 | .041† |
Homeodomain interacting protein kinase 4 | HIPK4 | R302Q‡ | rs11670988 | G | 347 | 1.37 | 1.01-1.87 | .042† |
RAD1 homolog (Saccharomyces pombe) | RAD1 | E281G | rs1805327 | A | 357 | 0.71 | 0.50-0.99 | .042 |
Cytochrome P450, family 2, subfamily C, polypeptide 9 | CYP2C9 | I359L‡ | rs1057910 | A | 370 | 1.42 | 1.01-2.00 | .042† |
Forkhead box N1 | FOXN1 | R69C‡ | rs2071587 | C | 350 | 1.37 | 1.01-1.85 | .042 |
Ectonucleotide pyrophosphatase/phosphodiesterase 5 | ENPP5 | V171I | rs6926570 | G | 117 | 0.76 | 0.58-0.99 | .043 |
FRAS1-related extracellular matrix protein 2 | FREM2 | F1070S | rs2496425 | T | 231 | 1.28 | 1.01-1.62 | .044 |
Serine/threonine kinase 6 | STK6 | I31F | rs2273535 | A | 261 | 0.78 | 0.61-0.99 | .045† |
O-6-methylguanine-DNA methyltransferase | MGMT | L84F | rs12917 | C | 327 | 0.75 | 0.56-0.99 | .045 |
Carbonyl reductase 3 | CBR3 | V244M | rs1056892 | G | 186 | 1.28 | 1.00-1.63 | .046 |
Zinc finger protein 527 | ZNF527 | H181R‡ | rs4452075 | A | 295 | 0.76 | 0.59-1.00 | .047 |
Recessively acting alleles | ||||||||
Interleukin-19 | IL19 | S213F | rs2243191 | C | 16 | 2.53 | 1.46-4.38 | .001† |
Cadherin 11, type 2, OB-cadherin | CDH11 | T255M | rs35195 | C | 47 | 1.73 | 1.22-2.45 | .002† |
Cathepsin B | CTSB | L26V | rs12338 | C | 71 | 0.57 | 0.40-0.82 | .002† |
CD5 antigen (p56–62) | CD5 | V471A‡ | rs13328900 | T | 104 | 1.50 | 1.16-1.96 | .002† |
Extra spindle poles–like 1 (S cerevisiae) | ESPL1 | R614S‡ | rs1318648 | A | 58 | 1.60 | 1.15-2.22 | .006 |
Ectonucleotide pyrophosphatase/phosphodiesterase 5 | ENPP5 | V171I | rs6926570 | G | 86 | 0.64 | 0.46-0.88 | .006† |
Glypican 1 | GPC1 | G500S | rs2228331 | G | 32 | 0.48 | 0.28-0.81 | .006† |
Matrix metalloproteinase 10 (stromelysin 2) | MMP10 | G65R‡ | rs17293607 | G | 8 | 2.92 | 1.36-6.27 | .006† |
Dual specificity phosphatase 13 | DUSP13 | C206Y‡ | rs3088142 | G | 77 | 0.63 | 0.45-0.88 | .007 |
Sucrase-isomaltase (alpha-glucosidase) | SI | A231T | rs9283633 | G | 71 | 1.51 | 1.12-2.05 | .008 |
Neural precursor cell–expressed, developmentally down-regulated 4 | NEDD4 | Q260R | rs2303580 | A | 15 | 0.35 | 0.16-0.79 | .012† |
Reticulon 3 | RTN3 | A6E | rs11551944 | C | 9 | 2.44 | 1.19-5.00 | .015† |
Carnitine palmitoyltransferase II | CPT2 | M647V‡ | rs1799822 | A | 27 | 1.74 | 1.11-2.73 | .016† |
Monoacylglycerol O-acyltransferase 1 | MOGAT1 | S162P | rs1868024 | T | 21 | 1.91 | 1.13-3.23 | .016 |
Extra spindle poles–like 1 (S cerevisiae) | ESPL1 | D25A | rs6580942 | A | 44 | 1.58 | 1.09-2.29 | .017 |
Topoisomerase (DNA) II binding protein 1 | TOPBP1 | S730L | rs17301766 | C | 18 | 1.89 | 1.12-3.19 | .018† |
Zinc finger protein 643 | ZNF643 | C13Y | rs2272994 | G | 16 | 0.45 | 0.23-0.87 | .019 |
SET domain, bifurcated 2 | SETDB2 | G117E | rs7998427 | G | 64 | 1.45 | 1.05-2.00 | .023† |
Zinc finger protein 573 | ZNF573 | A166G‡ | rs3752365 | C | 24 | 1.71 | 1.08-2.70 | .024 |
A kinase (PRKA) anchor protein (yotiao) 9 | AKAP9 | M463I | rs6964587 | G | 65 | 1.43 | 1.05-1.97 | .025 |
BCL2-like 13 (apoptosis facilitator) | BCL2L13 | P360S | rs9306198 | C | 6 | 2.68 | 1.09-6.60 | .032† |
Dynein, axonemal, heavy polypeptide 9 | DNAH9 | N2195S | rs3744581 | A | 23 | 1.68 | 1.03-2.72 | .037 |
A kinase (PRKA) anchor protein 10 | AKAP10 | R249H | rs2108978 | G | 63 | 1.41 | 1.02-1.94 | .038† |
DEAD (Asp-Glu-Ala-Asp) box polypeptide 27 | DDX27 | G206V | rs11908296 | G | 22 | 1.70 | 1.03-2.80 | .038 |
Inhibin, beta C | INHBC | R322Q‡ | rs2229357 | G | 22 | 1.75 | 1.02-2.98 | .041 |
Sterile alpha motif and leucine zipper containing kinase AZK | ZAK | S531L | rs3769148 | C | 95 | 0.75 | 0.56-1.00 | .047 |
Eosinophil peroxidase | EPX | Q122H | rs11652709 | G | 35 | 0.64 | 0.41-0.99 | .047 |
Four and a half LIM domains 5 | FHL5 | S243R‡ | rs9373985 | C | 43 | 0.65 | 0.42-1.00 | .049 |
HRs adjusted for age, sex, stage of disease, and treatment.
db SNP indicates database of single nucleotide polymorphisms.
Number of patients with at-risk genotype.
†Significant after adjustment for IgVH mutation status.
‡Predicted to be intolerant/potentially intolerant, damaging/probably damaging, or radical/moderately radical based on SIFT/POLYPHEN/Grantham scores.
Established to be functional.
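As a reading aid for the table above, the hazard ratio, its 95% CI, and the P value in each row can be checked against one another. The sketch below is not part of the study's analysis; it assumes the intervals are Wald intervals on the log hazard ratio scale, and the function name `p_from_hr_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def p_from_hr_ci(hr, lower, upper, level=0.95):
    """Approximate two-sided P value implied by an HR and its confidence interval.

    Assumes a Wald interval on the log scale: log(HR) +/- z * SE.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of log(HR)
    z = math.log(hr) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Two rows taken from the table: IL19 S213F and MGMT L84F.
p_il19 = p_from_hr_ci(2.53, 1.46, 4.38)
p_mgmt = p_from_hr_ci(0.75, 0.56, 0.99)
print(f"IL19 S213F: P ~ {p_il19:.4f}")
print(f"MGMT L84F:  P ~ {p_mgmt:.4f}")
```

Because the tabulated values are rounded, the recovered P values only approximate the reported ones (.001 and .045); a large discrepancy would point to a transcription error in a row.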
DNA damage-response genes
DNA polymerase beta (POLB; MIM 174760)32 performs base excision repair required for DNA maintenance, replication, recombination, and drug resistance. Under the Cox proportional hazards model, carrier status for POLB P242R variant genotype was significantly associated with a poorer PFS (HR = 1.87; 95% CI: 1.04-3.36; P = .04). PFS (3-year) for carriers of the variant genotype was 22.2% (95% CI: 7.4%-66.5%) compared with 32.7% (95% CI: 27.9%-38.4%) for those with the wild-type genotype (Figure 1).
Genes participating in homologous recombination include breast cancer 2 (BRCA2; MIM 600185) and x-ray repair complementing defective repair in Chinese hamster cells (XRCC2; MIM 600375), which encodes a member of the RecA/Rad51-related protein family. Carrier status for BRCA2 T1915M and XRCC2 R188H variant genotypes was associated with poorer PFS (HR = 1.59; 95% CI: 1.03-2.44; P = .04 and HR = 1.44; 95% CI: 1.06-1.95; P = .02, respectively). Patients with the BRCA2 T1915M wild-type genotype had a 3-year PFS of 33.6% (95% CI: 28.6%-39.4%), whereas the 3-year PFS was 17.8% (95% CI: 6.9%-45.9%) in carriers of the variant genotype (Figure 2).
Nucleotide excision repair is the major cellular system that repairs bulky DNA adducts caused by mutagens and chemotherapy. This form of DNA repair is in part performed by the proteins encoded by excision repair complementing defective in Chinese hamster groups 5 and 6 (ERCC5; MIM 133530 and ERCC6; MIM 609413). Carrier status for ERCC5 H1104D and ERCC6 G399D variant genotypes was in both cases associated with a less favorable PFS (HR = 1.32, 95% CI: 1.03-1.69; P = .03 and HR = 1.35, 95% CI: 1.05-1.73; P = .02, respectively). For ERCC5 H1104D, patients with the variant genotype had a 3-year PFS of 13.8% (95% CI: 5.1%-37.7%) compared with 33.9% (95% CI: 29.0%-39.7%) for noncarriers (Figure 3).
Other DNA damage-response/repair gene SNPs displaying an association with PFS included those mapping to postmeiotic segregation increased 2 (PMS2; MIM 600259), which participates in mismatch repair; O-6-methylguanine-DNA methyltransferase (MGMT; MIM 156569); and topoisomerase (DNA) II binding protein 1 (TOPBP1; MIM 607760). Carrier status for PMS2 M622I and MGMT L84F was associated with a better prognosis than the wild-type genotype (HR = 0.39, 95% CI: 0.16-0.95; P = .04 and HR = 0.75, 95% CI: 0.56-0.99; P = .04, respectively). Conversely, homozygosity for TOPBP1 S730L variant genotype was associated with a poorer outlook (HR = 1.89, 95% CI: 1.12-3.19; P = .02).
Immune regulation genes
Preeminent among immune regulation pathway SNPs were those mapping to the IL-19 (IL19; MIM 605687) and IL-16 (IL16; MIM 603035) genes. Although based on relatively small numbers, homozygosity for IL19 S213F variant genotype was associated with a worse PFS (HR = 2.53, 95% CI: 1.46-4.38; P = .001). Conversely, carrier status for IL16 P434S was associated with a more favorable prognosis (HR = 0.72, 95% CI: 0.53-0.97; P = .03); 3-year PFS was 47.3% (95% CI: 37.1%-60.3%) compared with 28.4% (95% CI: 23.4%-34.6%; Figure 4). Under the recessive model, homozygosity for lymphocyte antigen CD5 (MIM 153340) V471A variant genotype was associated with a poorer prognosis (HR = 1.50; 95% CI: 1.16-1.96; P = .002). Carrier status for leukocyte immunoglobulin-like receptor, subfamily A member 4 (LILRA4; MIM 607517) P27L variant genotype was also associated with poorer PFS (HR = 1.44, 95% CI: 1.12-1.85; P = .004); 3-year PFS was 23.3% (95% CI: 16.2%-33.6%) compared with 36.4% (95% CI: 30.6%-43.3%; Figure 5). The same was true of carrier status for sema domain, immunoglobulin domain (Ig), short basic domain, secreted, 3C (SEMA3C; MIM 602645) V337M genotype (HR = 3.15, 95% CI: 1.28-7.75; P = .01). In contrast, carrier status for killer cell lectin-like receptor subfamily C, member 4 (KLRC4; MIM 602893) S29I variant genotype was associated with a better PFS (HR = 0.72, 95% CI: 0.57-0.91; P = .007); 3-year PFS was 38.2% (95% CI: 31.4%-46.5%) compared with 26.1% (95% CI: 20.0%-34.0%; Figure 6).
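The distinction drawn above between carrier status (dominant model) and homozygosity (recessive model) amounts to two different binary codings of the same genotype. The following is a minimal sketch of that coding, assuming genotypes are stored as counts of variant alleles (0, 1, or 2); the function name `code_genotype` is hypothetical and the study's actual data handling is not described here.

```python
def code_genotype(n_variant_alleles, model):
    """Return 1 if the genotype counts as 'at risk' under the given model, else 0.

    dominant:  carriers (heterozygotes and variant homozygotes) vs wild-type
    recessive: variant homozygotes vs all other genotypes
    """
    if n_variant_alleles not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2 variant alleles")
    if model == "dominant":
        return int(n_variant_alleles >= 1)
    if model == "recessive":
        return int(n_variant_alleles == 2)
    raise ValueError(f"unknown model: {model!r}")


# Wild-type, heterozygote, variant homozygote under each model.
for model in ("dominant", "recessive"):
    print(model, [code_genotype(g, model) for g in (0, 1, 2)])
```

The resulting 0/1 indicator is the kind of covariate that would enter a Cox regression alongside age, sex, stage, and treatment.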
Relationship between SNP genotype and OS
A total of 9 of the 78 SNPs associated with PFS also showed an association with OS: SEC23B H489Q, GALNACT-2 P479S, ENPP5 V171I, SEMA3C V337M, GTF2E1 P366S, INHBC R322Q, LAMA2 V1138M, ZNF527 H181R, and BRCA2 T1915M.
Impact of SNP genotype adjusting for IgVH mutational status
For 354 (83%) of the 425 patients, we had information on IgVH mutation status, allowing us to evaluate the impact of polymorphic variation adjusting for this significant covariate. In a multivariate analysis incorporating information on IgVH mutation status, 38 of the 78 SNPs were independently prognostic (Table 3). Among these 38 SNPs were 5 mapping to immune regulation genes IL16 P434S, IL19 S213F, LILRA4 P27L, KLRC4 S29I, and CD5 V471A and 2 within DNA response genes POLB P242R and TOPBP1 S730L.
Relationship between SNP genotypes and chemotherapy
In view of the differences in pharmacology of chlorambucil, fludarabine, and cyclophosphamide, we examined potential interactive effects between SNPs and PFS for the different drugs separately. Table S3 details the relationship between SNPs significant at the 5% level and PFS. A total of 5 SNPs associated with PFS were common to patients treated with chlorambucil or fludarabine (DST L22S, LILRA4 P27L, SEC23B H489Q, XRCC2 R188H, and ZAK S531L), 3 SNPs were common to patients treated with either chlorambucil or fludarabine with cyclophosphamide (APBB3 C236R, ENPP5 V171I, and C21orf57 S2L), and 4 SNPs were common to patients treated with either fludarabine or fludarabine with cyclophosphamide (DDX27 G206V, DPYD S534N, WNT16 G72R, and DHX16 D566G).
Discussion
Using SNP genotype data generated on 425 patients within the CLL-4 trial, we have sought to identify a number of promising genetic variants with prognostic potential. A major strength of our study is its large size and the fact that we have analyzed patients entered into a phase 3 randomized trial, thereby minimizing bias. While not all of the patients entered into CLL-4 trial were analyzed, there were no salient differences in the demography of patients genotyped and those excluded from the analysis; hence, it is unlikely that any spurious biases will have influenced study findings. Furthermore, our analysis is unlikely to be confounded by population stratification. We have been able to adjust for the major molecular covariate, IgVH mutation status, and identify a number of SNPs independently predictive of PFS, including variants of IL16, IL19, CD5, and POLB. We did not, however, find evidence that MTHFR A222V or p53 Arg72Pro genotype influences prognosis as previously purported.
Despite the strong biologic plausibility for several individual associations as discussed herein, and the fact that our study is large compared with contemporaneous ones, these types of analyses are prone to false discovery as a consequence of multiple comparisons. Hence, although the signal generated from our gene level analyses suggests that we observed an approximately 8% greater signal over that expected on the basis of the dominant model, our nsSNPs findings are consistent with chance after adjusting for the number of tests used.
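The scale of the multiple-comparisons problem described above can be illustrated with simple arithmetic. The sketch below is not the authors' calculation; it assumes all 977 genotyped nsSNPs were tested under a given model and treats the tests as independent, and the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of nominally significant results if every null is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# 977 nsSNPs were genotyped; the number actually analyzed may be smaller (assumption).
n_snps = 977
chance_hits = expected_false_positives(n_snps)
threshold = bonferroni_threshold(n_snps)

print(f"Expected chance findings per model at the 5% level: {chance_hits:.1f}")
print(f"Bonferroni-corrected per-SNP threshold: {threshold:.2e}")
# Even a nominal P value of .001 (e.g. IL19 S213F) does not meet this threshold.
print("IL19 S213F passes Bonferroni:", 0.001 < threshold)
```

Under these assumptions the number of associations expected by chance alone is of the same order as the number observed, which is why the individual findings are treated as hypothesis-generating.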
Although our observations must be considered preliminary and hypothesis-generating, it is intriguing that a large proportion of the SNPs we have identified as having prognostic relevance resides in genes mediating the immune response, extracellular matrix interactions, or DNA damage-response/repair. Especially relevant to CLL was the observation that variation in CD5 defined by the V471A polymorphism was a determinant of PFS. Although in part speculative, through interrogation of the Pathway Architecture program (Stratagene, La Jolla, CA), 30 (38%) of the SNPs associated with prognosis were found within interrelated genes encoding pivotal components of the DNA damage-response, immune regulation, and cell-signaling pathways (Figure 7). Biologically, it is eminently plausible that variation in these genes will affect prognosis in the case of CD5 and the interleukin genes, as levels of interleukins have been documented to influence clinical behavior.34,35 Furthermore, defects in the TP53 tumor suppressor gene pathway are known to be important in CLL, as p53 inactivation is associated with aggressive disease.36,37 It is intriguing that we identified an association with APOE, as the expressed protein has cytostatic and cytotoxic activity; moreover, apolipoprotein levels and activity of apolipoprotein receptors have been linked to prognosis in B-CLL.38-40 The prior probability of identifying a significant association with prognosis for a series of SNPs mapping to a single DNA repair pathway is intuitively small, thereby providing a measure of robustness to our observations. However, ultimately, replication in independent studies is required to either validate or refute study findings.
Our observation that polymorphic variation in a number of the DNA repair genes (ERCC5, BRCA2, and POLB) influences survival of patients with cancer is not without precedent. Outside the context of CLL, variation in ERCC5 D1104H has been reported to influence esophageal41 and lung42 cancer, and POLB P242R has been reported to influence lung cancer prognosis42 in the same manner, suggesting generic effects on outcome from malignancy.
While we have evaluated only nsSNPs with a higher probability of being directly causal, there is currently direct evidence for functionality only for a minority, such as APOE R176C and THBS1 N700S.43,44 Evidence for others is provided by in silico predictions regarding the consequences of amino acid changes. While such predictions have been shown to successfully categorize a high proportion of amino acid substitutions in benchmarking studies,26,45,46 it is acknowledged that assigning functionality to SNPs on the basis of metrics is in essence speculative. Moreover, it is probable that some associations are a consequence of linkage disequilibrium with causal variants.
As the patients analyzed in our study were part of a large clinical trial, this has provided a unique opportunity to examine differential effects of SNP on PFS according to specific chemotherapy. On the basis that chlorambucil and cyclophosphamide are both alkylating agents, we have looked for SNPs common to both patient groups. Although variants such as XRCC2 R188H are attractive candidates for conferring differential response to treatment, there are currently no data for the role for variation in this or the other genes identified as a determinant of drug response. Interestingly, the S534N variant of dihydropyrimidine dehydrogenase (DPYD), shown to influence response to fludarabine, was related to PFS (P = .03 and .002 for PFS in fludarabine-treated patients and fludarabine plus cyclophosphamide-treated patients, respectively). DPYD is the initial and rate-limiting factor in the pathway of uracil and thymidine catabolism, with genetic deficiency being well recognized to result in an error in pyrimidine metabolism.47 In addition, cyclophosphamide has been shown to augment the antitumor effect of pyrimidine agents such as 5-fluorouracil through inhibition of DPYD.48 In this respect, although fludarabine acts as a purine analog, it is interesting that a comparable effect was observed.
Here, we have restricted our search for SNPs associated with disease outcome to those with the potential to affect the behavior of the expressed protein through a change in the amino acid sequence. Epigenetic silencing of DAPK1 by promoter methylation has recently been shown to occur in most patients with CLL49; hence, this gene appears to be a key player in CLL. Intriguingly, a rare variant in the promoter of DAPK1 was demonstrated to affect DAPK1 expression.49 It is therefore highly probable that in addition to nsSNPs, noncoding variants influencing expression of key genes such as DAPK1 will also be determinants of patient outcome and worthy of evaluation.
Our study provides evidence that inherited variation influences the clinical outcome from CLL and potentially provides additional insight into the biological determinants of prognosis. Furthermore, it also serves to highlight the statistical problem of searching for genetic associations when the impact of any variant is likely to be at best modest. Hence, stipulating significance levels of less than 10⁻⁵ for an analysis of clinical trial data is unrealistic, because to have 80% power to demonstrate a 5% difference in survival, which is clinically relevant, requires at least 4800 patient samples to be analyzed, even if the frequency of the at-risk genotype is 50%. Moreover, imposing very stringent statistical thresholds can create the serious issue of type II errors. To facilitate the identification of prognostic markers through pooled analyses, we have therefore made data from our study publicly accessible (Tables S1–S3, Figures S1,S2).
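The dependence of sample size on the significance threshold noted above can be sketched with Schoenfeld's approximation for the number of events needed in a two-group survival comparison. This is an illustration under stated assumptions, not a reconstruction of the figure of 4800 patients: the function name `schoenfeld_events` is hypothetical, the hazard ratio of 1.2 is an arbitrary illustrative value, and converting events to patients would further require the expected event rate.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha, power, at_risk_fraction):
    """Approximate number of events needed for a two-sided log-rank/Cox test.

    Schoenfeld's formula: d = (z_{1-alpha/2} + z_{power})^2 / (p (1 - p) ln(HR)^2),
    where p is the fraction of patients carrying the at-risk genotype.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p = at_risk_fraction
    d = (z_alpha + z_beta) ** 2 / (p * (1 - p) * math.log(hazard_ratio) ** 2)
    return math.ceil(d)


# Illustrative modest effect (HR = 1.2 is an assumption), 80% power,
# at-risk genotype frequency of 50%.
events_nominal = schoenfeld_events(1.2, alpha=0.05, power=0.80, at_risk_fraction=0.5)
events_strict = schoenfeld_events(1.2, alpha=1e-5, power=0.80, at_risk_fraction=0.5)
print("Events needed at alpha = 0.05:", events_nominal)
print("Events needed at alpha = 1e-5:", events_strict)
```

Tightening the threshold from 5% to 10⁻⁵ multiplies the required number of events several-fold for the same effect size, and the requirement grows further as the at-risk genotype frequency moves away from 50%.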
Finally, although germ-line variants are unlikely to replace staging schemes and conventional markers such as IgVH mutation status in the short term, they have potential to assist in distinguishing different outcome patterns among patients with the same stage of disease where 10% differences are clinically relevant, thereby opening up the possibility of a rational, targeted approach to treatment based on a combination of genotype and tumor characteristics of a patient.
The online version of this article contains a data supplement.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
We are grateful to all patients and clinicians participating in the LRF CLL4 trial. We thank Emily Webb for additional statistical analyses and Peter Broderick for his expertise in use of the Pathway Assist Software.
This work was undertaken primarily with grant support from Leukemia Research. Additional funding was provided by the Arbib Foundation and Cancer Research UK.
Authorship
Contribution: G.S.S. designed research, performed genetic analyses, and analyzed and interpreted data; R.W. and S.R. performed statistical analyses; D.G.O. performed hematologic assays; D.C. designed and conducted the clinical trial; and R.H. designed research, performed research, analyzed and interpreted data, and drafted the manuscript.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Richard S. Houlston, Section of Cancer Genetics, Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey SM2 5NG, UK; e-mail: richard.houlston@icr.ac.uk.