Abstract
Chromosomal translocations, insertions, and deletions are common early events in non-Hodgkin lymphoma (NHL) carcinogenesis, and implicated in their formation are endogenous processes involved in antigen-receptor diversification, such as V(D)J recombination. DNA repair genes respond to the double- and single-strand breaks induced by these processes and may influence NHL etiology. We examined 34 genetic variants in 19 genes within or related to 5 DNA repair pathways among 1172 cases and 982 matched controls who participated in a population-based NHL study in Los Angeles, Seattle, Detroit, and Iowa from 1998 to 2000. Cases were more likely than controls to have the RAG1 820 Arg/Arg (odds ratio [OR] = 2.7; 95% confidence interval [CI] = 1.4 to 5.0) than the Lys/Lys genotype, with evidence of a gene dosage effect (P trend < .001), and less likely to have the LIG4 (DNA ligase IV) 9 Ile/Ile (OR = 0.5; 95% CI = 0.3 to 0.9) than the Thr/Thr genotype (P trend = .03) in the nonhomologous end joining (NHEJ)/V(D)J pathway. These NHEJ/V(D)J-related gene variants represent promising candidates for further studies of NHL etiology and require replication in other studies.
Introduction
Chromosomal translocations are a hallmark of non-Hodgkin lymphoma (NHL) and can arise as a consequence of misrepair of DNA double-strand breaks. Major translocations identified in NHL include those fusing BCL2 with immunoglobulin (Ig) H in approximately 80% of follicular lymphomas1; BCL2, BCL6, or MYC with IgH in approximately 50% of diffuse large B-cell lymphomas (DLBCLs)2; and MYC with one of several Ig loci in 80% or more of Burkitt lymphomas.1 These and other genomic rearrangements (eg, insertions, deletions) are thought to occur early in malignant transformation in most NHLs. Accordingly, unrepaired or misrepaired DNA strand breaks could be critical events in lymphomagenesis.
While some translocations are generated by aberrant repair of double-strand breaks induced by ionizing radiation or other external exposures, alterations in endogenous processes such as V(D)J recombination could also contribute to such rearrangements. V(D)J recombination involves the deliberate introduction of double-strand breaks that reshuffle dozens of Ig building blocks, the V, D, and J segments. This process produces a highly diverse repertoire of antibodies, which are induced by a wide spectrum of antigenic challenges. Errors by the DNA repair genes responsible for ligating the V, D, and J segments, the nonhomologous end joining (NHEJ) genes, are implicated at the sites of rearrangements characteristic of NHL.3,4 In addition, 2 steps that follow V(D)J in B-cell maturation, class-switch recombination and somatic hypermutation, also introduce DNA strand breaks.5 The observation of NHL-associated translocations6 or aberrant hypermutation7 preferentially involving those regions suggests that misrepair of DNA breaks during these events could also contribute to lymphomagenesis. The fidelity of repair of such breaks and other types of DNA damage implicated in lymphoma development may directly or indirectly involve any of 5 overlapping DNA repair pathways: (1) NHEJ, (2) homologous recombination (HR) repair, (3) nucleotide excision repair (NER), (4) base excision repair (BER), and (5) direct damage reversal.
NHEJ, 1 of 2 major double-strand break repair pathways, is considered “error prone” in part because it does not employ a homologous strand as a template to repair DNA breaks but instead allows ligation by introducing small nucleotide insertions or deletions into DNA. NHEJ has been shown in vitro to produce translocations.8 Mice deficient in any of several NHEJ genes on a p53–/– background are predisposed to develop immunodeficiency and pro–B-cell lymphomas,9,10 and the lymphomas demonstrate myc-IgH translocations reminiscent of those in human Burkitt lymphoma. Mutations in the NHEJ genes RAG or DCLRE1C (known as Artemis) in humans lead to severe combined immunodeficiency (SCID) syndrome, which often involves almost complete abrogation of B and T cells.11 However, 2 of 4 carriers of a DCLRE1C mutation that allowed partial B- and T-cell expression developed lymphomas.12 Thus, accumulating evidence suggests that NHEJ/V(D)J genes may participate in a vital way in lymphomagenesis.
HR repair, the second major double-strand break repair pathway, has been described as “error free” but can also result in translocations.13 Inherited mutations in HR genes have been recognized in familial cancer syndromes that involve an elevated lymphoma risk, including Bloom syndrome,14 Fanconi anemia (FA),15 and Nijmegen breakage syndrome (NBS).16 Development of NHL, while possibly facilitated by NHEJ or HR genes, may also be influenced indirectly by several genes active in BER,17,18 a DNA single-strand break repair pathway, or NER, a pathway involved in repair of DNA damage induced by ultraviolet radiation19,20 or bulky adducts. Finally, MGMT, a gene that participates in “direct reversal” of DNA damage by removal of O6-methylguanine adducts, is frequently hypermethylated in NHL tumors21,22 and is also hypothesized to contribute to lymphoma development.
Because several lines of evidence support the involvement of DNA repair and related genes in the etiology of NHL, particularly in the double-strand break pathway, we selected 19 genes that play an important role in or are related to 5 DNA repair pathways for analysis in a large, population-based case-control study of NHL in the United States. Here, we report results for a total of 34 genetic variants that were selected based on theoretical or experimental evidence of functionality and previous evidence of association.
Patients, materials, and methods
Study population
A detailed description of the study methods for the National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) case-control study of NHL has previously been published.23 Briefly, individuals aged 20 to 74 years diagnosed with incident NHL from July 1, 1998, to June 30, 2000, were identified in 4 US SEER population-based cancer registries: Iowa and the metropolitan areas of Detroit, Los Angeles, and Seattle. Eligible controls were selected from the general population in the 4 registry areas using random-digit dialing (for ages 20 to 64 years) or Medicare eligibility files (for ages 65 to 74 years) and were frequency matched to NHL cases by age (5-year intervals), sex, race (White, African American, Asian/other), and SEER study site. Eligible cases or controls who were identified by themselves or a physician as HIV infected were excluded.
Of the 2248 selected eligible NHL cases, 320 (14.2%) died before interview, 57 (2.5%) were not interviewed because of physician refusal, and 143 (6.4%) were unable to be located. The remaining 1728 were contacted, of whom 274 (15.9%) refused to participate and 133 (7.7%) were not interviewed because of nonresponse, illness, impairment, or other reasons. Thus, 1321 eligible cases were interviewed, for a 76% participation rate and a 59% response rate. Of 2409 eligible controls, 311 (13%) could not be located, 28 (1%) died before interview, and 24 (1%) moved out of the area. The remaining 2046 were contacted and, of these, 839 (41%) declined to participate and 150 (6%) were not interviewed for other reasons, yielding 1057 eligible controls and a 52% participation and a 44% response rate. Participants signed an informed consent form, received a computer-assisted personal interview regarding known or suspected NHL risk factors, and donated a blood (773 cases, 668 controls) or buccal-cell (399 cases, 314 controls) sample. Of the 1172 case and 982 control participants, the study population included in the genetic analyses consists of the 1150 cases and 956 controls for whom DNA could be extracted and subsequently genotyped for polymorphisms in DNA repair and related genes. The study was reviewed and approved by institutional review boards at the NCI and at each of the SEER study sites.
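The participation and response rates quoted above can be reproduced from the reported counts. The following is a minimal sketch in Python, assuming (as the reported percentages imply) that the participation rate is interviewed subjects divided by contacted subjects and the response rate is interviewed subjects divided by all selected eligible subjects. The function name and argument layout are illustrative and are not part of the study protocol.

```python
def rates(eligible, not_contacted, not_interviewed):
    """Return (contacted, interviewed, participation rate, response rate).

    Assumed definitions (not spelled out in the text):
      participation = interviewed / contacted
      response      = interviewed / eligible
    """
    contacted = eligible - sum(not_contacted)
    interviewed = contacted - sum(not_interviewed)
    return contacted, interviewed, interviewed / contacted, interviewed / eligible


# Cases: 320 died, 57 physician refusals, 143 not located;
# 274 refused, 133 not interviewed for other reasons.
cases = rates(2248, [320, 57, 143], [274, 133])

# Controls: 311 not located, 28 died, 24 moved away;
# 839 declined, 150 not interviewed for other reasons.
controls = rates(2409, [311, 28, 24], [839, 150])

for label, (contacted, interviewed, participation, response) in (
    ("cases", cases),
    ("controls", controls),
):
    print(f"{label}: contacted={contacted}, interviewed={interviewed}, "
          f"participation={participation:.0%}, response={response:.0%}")
```

Under these assumed definitions the counts give 1728 contacted and 1321 interviewed cases, and 2046 contacted and 1057 interviewed controls, matching the figures in the text.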
Laboratory methods
DNA extraction. DNA was extracted from blood clots or buffy coats at BBI Biotech Research Laboratories repository (Gaithersburg, MD) using Puregene Autopure DNA extraction kits (Gentra Systems, Minneapolis, MN). Phenol chloroform extraction methods were used to obtain DNA from buccal-cell samples collected via mouthwash. DNA was stored at 4°C until genotyping.
Genotyping. All genotyping was conducted at the NCI Core Genotyping Facility (Advanced Technology, Gaithersburg, MD) using either the Taqman (Applied Biosystems, Foster City, CA), Sequenom (San Diego, CA), or MGB Eclipse (Epoch Biosciences/Nanogen, Bothell, WA) sequencing platforms. Assays used to examine gene variation were developed and validated using previously published procedures.24 Details regarding platforms, primers, and assay conditions can be obtained from the Cancer Genome Anatomy Project SNP500Cancer Database.25 All laboratory personnel were blinded as to the case or control status of samples. The frequency of “undetermined” genotype or “no PCR [polymerase chain reaction] amplification observed” (generally 2% to 4%) for any particular genotype did not differ among cases and controls. After DNA derived from blood samples had been genotyped, a preliminary analysis was conducted, and only those variants that demonstrated a relationship with NHL risk were genotyped using the buccal-cell DNA (due to limited DNA yield from those specimens). Less than 1% of the 140 included blinded samples (40 blood donor replicates, 100 duplicates from study subjects) were discordant for each genotype. In particular, no homozygous wild-type genotype for any included gene was classified as homozygous variant or vice versa. For each genotype, 4 samples of known homozygous wild-type, heterozygote, or homozygous variant genotypes also were included for quality control in each plate of 386 samples as well as 4 DNA negative controls.
Statistical methods
We examined whether the distribution of genotypes in controls was consistent with Hardy-Weinberg equilibrium (HWE) using the χ2 test. All control genotype frequencies were in accordance with HWE in each race/ethnic group. We conducted unconditional logistic regression to calculate odds ratios (OR) and 95% confidence intervals (95% CI) for the relationship between genotype and NHL risk, adjusting for the matching factors of age (younger than 55 years, 55 to 64 years, 65 years or older), race (White, African American, Asian/other), sex, and study site. OR estimates were determined using the common homozygote genotype as the referent group. We also examined whether the relationship between gene variants and NHL risk differed by race (White, African American), sex, age (younger than 60 years, 60 years or older), and Revised European-American Lymphoma/World Health Organization (REAL/WHO) tumor pathology group (follicular, DLBCL, T-cell, other, unknown). Risk estimate heterogeneity between tumor groups was tested by designating one tumor pathology group as “cases” and another as “controls.” Statistical interaction on a multiplicative scale between genotypes, or between a particular genotype and race, sex, or age, was assessed by including main effects terms for each variable in the logistic regression model and adding a product term (gene1 * gene2 or gene1 * sex). All analyses were conducted using SAS software, version 8 (SAS, Cary, NC).
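The HWE check described above can be illustrated with a short sketch. This is not the SAS code used in the study; it is a plain Python version of the standard 1-degree-of-freedom χ2 goodness-of-fit test for a biallelic variant, and the genotype counts are hypothetical values chosen only for illustration.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Takes observed counts of the two homozygotes and the heterozygote.
    Assumes both alleles are present, so no expected count is zero.
    Returns (chi2 statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control counts that sit exactly at HWE proportions.
in_equilibrium = hwe_chi2(640, 320, 40)

# Hypothetical counts with a heterozygote deficit.
out_of_equilibrium = hwe_chi2(700, 200, 100)

print(in_equilibrium, out_of_equilibrium)
```

A small p-value in controls would flag possible genotyping error or population stratification, which is why the test is applied within each race/ethnic group.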
There is a lack of consensus regarding the optimal approach to address the false-positive probability of single nucleotide polymorphism (SNP) associations. We therefore evaluated the robustness of our results using 2 complementary methods: the false discovery rate (FDR)26 and the false-positive report probability (FPRP).27 FDR is the expected ratio of erroneous rejections of the null hypothesis to the total number of rejected hypotheses among the SNPs analyzed in this report. We applied the FDR method to the P value for trend because this allows for the fewest number of comparisons and thus degrees of freedom and also assessment of the additive model. We applied the FPRP method, which controls the probability that a single SNP association is a false-positive report, for a range of prior probabilities (ie, .001 to .1) that the given SNP is truly associated with risk of NHL. The same prior range was used in a previous large pooled report of cytokine polymorphisms and NHL28 and reflects the extent to which a candidate SNP is likely to be functional and located in a gene that plays a role in the pathogenesis of NHL.27,29 We used an FPRP criterion of 0.20 (recommended in the original presentation of the method)27 to identify which, if any, findings should be considered noteworthy.
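As an illustration of the FDR step, the sketch below computes FDR-adjusted P values with the Benjamini-Hochberg step-up procedure. This is an assumption: the text cites the FDR method by reference only and does not name the procedure. The P values shown are hypothetical and are not the study's trend P values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each adjusted value is the smallest FDR level at which that
    hypothesis would be rejected.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical trend p-values for five SNPs.
trend_p = [0.001, 0.03, 0.04, 0.20, 0.65]
fdr = benjamini_hochberg(trend_p)
print([round(value, 4) for value in fdr])
```

Applying the adjustment to one trend test per SNP, rather than to each genotype contrast, keeps the number of comparisons equal to the number of SNPs, which is the rationale given above.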
Results
The median age of NHL cases and controls was 58 and 61 years, respectively (Table 1). As expected due to matching, cases and controls were broadly similar in sex and race, with the exception of a higher proportion of African American controls (14%) than cases (8%). Thirty-four genetic variations among 19 genes in or related to 5 DNA repair pathways were examined in the present study (Table 2).
NHEJ and V(D)J recombination genes
The LIG4 (DNA ligase IV) 9 I variant allele was less common among NHL cases than controls overall (T/I, OR = 0.9, 95% CI = 0.7 to 1.1; I/I, OR = 0.5, 95% CI = 0.3 to 0.9; P trend = .03) (Table 3). The reduced risk was also apparent for follicular lymphoma (T/I or I/I, OR = 0.7, 95% CI = 0.5 to 1.0) and DLBCL (T/I or I/I, OR = 0.8, 95% CI = 0.6 to 1.0) (Table 4). NHL cases were more likely than controls to have the RAG1 820 R variant allele (K/R, OR = 1.3, 95% CI = 1.0 to 1.6; R/R, OR = 2.7, 95% CI = 1.4 to 5.0; P trend = .001) (Table 3). When examined among NHL pathology groups, follicular lymphoma cases were also more likely than controls to have inherited this variant (K/R, OR = 1.3, 95% CI = 0.9 to 1.8; R/R, OR = 5.1, 95% CI = 2.3 to 11.7; P trend < .001) (Table 4). The relationship with NHL risk was not strong among other subtypes; however, the difference between those subtypes and follicular was not significant. Other NHEJ gene variants were not associated with altered NHL risk (Tables S1 and S2, available on the Blood website; see the Supplemental Tables link at the top of the online article). Among those with both a RAG1 820 R and LIG4 9 T (increased risk) allele, risk of NHL was not elevated beyond that expected from the joint multiplicative effects of the 2 risk factors (data not shown). Risk among individuals with one or more RAG1 or LIG4 variants also did not vary by race, sex, or age (younger than 60, 60 years or older) (data not shown).
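The ORs and confidence intervals reported here are linked on the log scale, which the following sketch makes explicit. It assumes Wald-type 95% intervals (the text does not state the interval type) and back-calculates the standard error, z statistic, and approximate two-sided P value for a single genotype contrast. Because the published values are rounded, the output is approximate, and it is a genotype-specific P value, not the P for trend.

```python
import math


def wald_from_ci(or_estimate, lower, upper, z_crit=1.96):
    """Recover (SE of log OR, z, two-sided p) from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale:
      log(upper) - log(lower) = 2 * z_crit * SE
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(or_estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p


# RAG1 820 R/R versus K/K, all NHL, as reported in the text.
se, z, p = wald_from_ci(2.7, 1.4, 5.0)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

The same relationship explains why an interval whose lower bound sits at 1.0, such as the K/R estimate, corresponds to a borderline result at the .05 level.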
HR repair genes
NHL risk was examined in relation to 14 variants in 6 HR genes: BRCA2, NBS1, TP53, WRN, XRCC2, and XRCC3. Overall, NHL cases were 1.5-fold more likely than controls to be homozygous for the BRCA2 372 H/H genotype (95% CI = 1.0 to 2.1) (Table 3). Although risks of follicular and DLBC lymphoma were similarly elevated only among homozygotes, risk of T-cell lymphoma increased with an increasing number of BRCA2 372 H alleles (1.8-fold among BRCA2 Asp/His heterozygotes and 3.0-fold among H/H homozygotes; P trend = .003) (Table 4). The WRN V114I variant was less common among cases than controls overall, and NHL risk decreased with an increasing number of WRN 114 I alleles (P trend = .04) (Table 3). The reduced risk was not confined to a particular NHL pathology group (Table 4). The altered NHL risks among individuals with BRCA2 or WRN variants were equally apparent among participant subgroups defined by race, sex, or age (data not shown). Individuals with other HR variants did not have an altered NHL risk (Tables S1-S2).
NER genes
Seven variants in 6 NER genes (ERCC1, ERCC2, ERCC4, ERCC5, XPC, and RAD23B) were examined in relation to NHL risk. Overall, NHL cases were no more likely than controls to have any NER variant, and apparent differences in participant subgroups defined by race, sex, or age were ascribable to chance (Tables S1-S2 and data not shown).
Base excision repair (BER) genes
Risk of NHL was examined in relation to 5 variants in 3 BER genes (PARP, APEX1, and XRCC1). The XRCC1 194 W allele was associated with a moderately increased risk (1.4-fold) of NHL overall (Table 3) but did not differ significantly by NHL pathology group (Table 4), race, sex, or age (data not shown). The presence of other BER variants was not related to an altered NHL risk (Tables S1-S2).
Direct reversal of damage
Three variants in MGMT, a gene active in direct reversal of DNA damage, were examined, and none were associated with NHL risk overall (Tables S1-S2).
Discussion
Our results suggest that variants in several DNA repair or V(D)J pathway genes may be related to an altered risk of NHL or its subtypes. In particular, homozygotes for the RAG1 820 R missense substitution had a 2.7-fold increased risk of NHL, with evidence of a gene dosage effect. While RAG1 is not considered a DNA repair gene, it participates in V(D)J recombination with genes active in NHEJ repair. Functional studies of the RAG1 820 Arg variant are lacking, although assessment of functional effects using the Sorting Intolerant From Tolerant (SIFT)30 program indicated that the polymorphism was likely to be “not tolerated” (probability < .01). Individuals with highly penetrant, disruptive RAG1 mutations are immunodeficient, have partial (Omenn syndrome) or virtually absent (SCID) V(D)J activity,31 and experience severe B- and T-cell defects.11 The RAG1 and RAG2 core protein complex has been shown in vitro to cleave DNA at specific sites and insert the cleaved segment at target sites unrelated to V(D)J, creating a translocation.8 The biologic plausibility of RAG1 involvement in events that initiate translocations, the deleterious nature of the substitution, and the dose response found in this study support the possibility that individuals with the RAG1 K820R variant may have an altered risk of lymphoma.
Functional and epidemiologic studies also support our finding that the LIG4 T9I polymorphism may be related to altered lymphoma risk. When evaluated for functional effects using SIFT, the T9I substitution was predicted to be “not tolerated” (probability = .01). In addition, the LIG4 T9I variant construct, when expressed in cell culture with the A3V variant in linkage disequilibrium, demonstrated 2- to 3-fold lower DNA double-strand break ligation activity and 2-fold lower adenylation activity than wild-type LIG4.32 Our observation that individuals with the T9I variant have a decreased lymphoma risk is consistent with the 3-fold reduced risk of lymphoma and the 5-fold decreased risk of multiple myeloma (n = 7 and 4 cases, respectively) among LIG4 9 I/I homozygotes in a previous case-control study.33 Although an XRCC4–DNA ligase IV complex ligates DNA ends during double-strand break repair, no interaction on a multiplicative scale was observed between polymorphisms in the 2 genes. Also, lymphoma risk among those with one or more NHEJ gene variants was not modified by TP53 genotype (data not shown), although TP53 status appears to alter NHEJ gene response in mice.9 Homozygotes for the RAG1 or DNA ligase IV variants had the strongest alterations in NHL risk in this study, although such individuals are rare (1.6% and 3.0% of controls, respectively). However, if heterozygotes for these variants have an altered NHL risk comparable to that suggested by our data, their higher prevalence (20.0% and 29.5%) implies that a greater proportion of NHL risk in the population would be attributable to heterozygosity.
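The attributable-risk argument for RAG1 can be made concrete with a rough calculation. The sketch below applies the standard attributable-fraction formula for an exposure with several levels, under two assumptions that are not tested in the source: the OR approximates the relative risk, and genotype prevalence in controls approximates prevalence in the population. The inputs are the RAG1 values given in the text (20.0% heterozygotes with OR = 1.3, 1.6% homozygotes with OR = 2.7). The function name is illustrative.

```python
def attributable_fractions(levels):
    """Attributable fraction for each level of a multi-level exposure.

    levels maps a genotype label to (prevalence, odds ratio), with the
    common homozygote as the implicit reference (OR = 1).
    AF_i = p_i * (OR_i - 1) / (1 + sum_j p_j * (OR_j - 1))
    """
    excess = {name: prev * (odds_ratio - 1.0)
              for name, (prev, odds_ratio) in levels.items()}
    denominator = 1.0 + sum(excess.values())
    return {name: value / denominator for name, value in excess.items()}


# RAG1 820 genotypes: control prevalence and OR as reported in the text.
rag1 = attributable_fractions({
    "K/R": (0.200, 1.3),
    "R/R": (0.016, 2.7),
})

for genotype, fraction in rag1.items():
    print(f"{genotype}: {fraction:.1%} of NHL attributable")
```

With these inputs heterozygotes account for roughly twice the attributable fraction of homozygotes despite their much smaller OR, which is the point made in the paragraph above. The figures are illustrative only, since the heterozygote OR has a confidence interval that reaches 1.0.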
Although NHL risk was most strongly related to variants in the V(D)J/NHEJ pathway, polymorphisms in 2 genes (BRCA2, WRN) involved in double-strand break resolution via HR (a pathway also known to induce translocations13) were also related to an altered NHL risk. In some34-36 but not all37,38 previous studies, BRCA2 372 His/His homozygotes have had an approximately 1.4-fold increased risk of breast or ovarian cancer, but their NHL risk has not previously been evaluated. In our study, homozygotes also had a 1.4-fold elevated risk of all lymphoma and of follicular and DLBC lymphoma, while T-cell lymphoma risk was elevated among individuals with the His variant, with evidence of an effect of gene dosage. In a previous study, risk of lymphoma was nonsignificantly increased (OR = 1.8) in relatives of BRCA2 mutation carriers.39 We also found that individuals with a WRN 114 I allele had a dose-dependent reduced risk of NHL that was not confined to any tumor type, sex, or age-specific subgroup. However, prior studies have not evaluated this allelic change, and the V114I substitution is predicted to be tolerated by the SIFT program. Individuals carrying mutations in WRN, which predispose to Werner syndrome, have an increased risk of sarcomas, melanomas, and thyroid cancer; one leukemia but no lymphomas have been reported among 124 such individuals.40 Although PARP interacts with WRN in DNA repair processes,17 PARP genotype status did not alter the decreased NHL risk among those with at least one WRN 114I variant allele (data not shown).
The involvement of XRCC1 and other BER genes in the processing of Ig rearrangement intermediates during somatic hypermutation and class-switch recombination18 argues that BER genes could participate in early events in lymphomagenesis. In the overall analysis, individuals with XRCC1 R194W variant alleles had a moderately increased risk of NHL, and this finding was not limited to any specific tumor type, sex, or age group. Although this variant has been related to risk of tumors at other sites,41,42 in a recent study individuals who inherited one or more R194W variant alleles did not have an altered risk of follicular lymphoma.43
Our results should be considered in the context of the strengths and limitations of the study as well as the possibility that some findings are false positives, given the number of relationships examined. Strengths of this study include the population-based design and the large sample size. In addition, excellent laboratory quality control measures, including a concordance of 99% or greater for replicate genotypes in blinded samples, testify to the reliability of the data. However, while similar to those in many recent case-control studies, the response rates for this study were lower than desirable for both cases and controls. Participation among cases and controls was unlikely to be differential by genotype, and variant prevalence in white non-Hispanic study controls corresponded closely to that observed in random samples of similar individuals.24,25 However, if any included polymorphism is related to early mortality from NHL, the prevalence in participating cases could be altered, introducing bias in OR estimates. In addition, although the sample size provided sufficient power to evaluate the main effects of low-frequency variants, the study did not have ample power to evaluate most gene-gene interactions or to determine whether there were statistically significant differences between NHL histologies. Because survival and genetic alterations in NHL tumors differ by histology, we believe inclusion of histology information is helpful to examine subgroup heterogeneity; however, we did not observe major heterogeneity between subgroups. We presented results for all study subjects, adjusting for race, but our key findings, including the LIG4 T9I and RAG1 K820R associations, remained statistically significant and were essentially identical in magnitude when analyses were restricted to white non-Hispanics (data not shown).
An assessment of the probability that a statistically significant result at P less than .05 is a false-positive finding can aid in the interpretation of study findings. We evaluated our results using the FDR26 and FPRP27 approaches, as described in “Patients, materials, and methods.” The FDR value of the RAG1 K820R variant was 0.02, taking into account all SNPs tested for association with risk of NHL overall in this report, and the FPRP value was below our criterion of 0.2 (for a prior probability of association of .01 or higher, expected OR of 1.3 or higher, and observed odds ratio from the additive model for all NHL: OR = 1.37, 95% CI = 1.14 to 1.65, P trend < .001); that is, both methods indicate that the association may be particularly robust and suggest that there is only a small chance that the RAG1 K820R finding is a false positive. Although no other findings were deemed noteworthy after carrying out FDR and FPRP calculations, exploration of these associations in larger studies with greater power may be of value, particularly using tagged SNPs to obtain full genomic coverage of the most promising candidate genes.
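The FPRP calculation can be sketched as follows, using the standard formula FPRP = α(1 − π) / [α(1 − π) + (1 − β)π], where α is the observed P value, 1 − β is the power to detect the expected OR, and π is the prior probability of a true association. The P value and power below are hypothetical placeholders and are not the study's actual inputs; only the prior range (.001 to .1) and the 0.2 criterion come from the text.

```python
def fprp(p_value, power, prior):
    """False-positive report probability for a single association."""
    false_positive_mass = p_value * (1.0 - prior)
    true_positive_mass = power * prior
    return false_positive_mass / (false_positive_mass + true_positive_mass)


P_VALUE = 0.001    # hypothetical observed p-value
POWER = 0.5        # hypothetical power to detect the expected OR
CRITERION = 0.2    # noteworthiness threshold stated in the text

# Evaluate across the prior range used in the text.
results = {prior: fprp(P_VALUE, POWER, prior) for prior in (0.001, 0.01, 0.1)}
noteworthy = {prior: value < CRITERION for prior, value in results.items()}

for prior, value in results.items():
    print(f"prior = {prior}: FPRP = {value:.3f}, noteworthy = {noteworthy[prior]}")
```

The example shows why the conclusion is stated conditionally on the prior: the same P value and power can fall below the 0.2 criterion at a prior of .01 yet exceed it at .001.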
In summary, our results suggest that inherited variants in NHEJ or V(D)J genes may alter risk of NHL, but our findings require replication by other studies, with an eventual goal of pooling across multiple investigations to evaluate the robustness of the findings. Investigation of the phenotypic relevance of the identified genetic variation, the contribution of other genes in the pathway, and potential interactions with other risk factors may ultimately yield new insights into the poorly understood process of lymphomagenesis.
Prepublished online as Blood First Edition Paper, July 20, 2006; DOI 10.1182/blood-2005-01-026690.
Supported by Public Health Service contracts N01-PC-65064, N01-PC-67008, N01-PC-67009, N01-PC-67010, and N02-PC-71105.
This project was conceived and led by the NCI–Surveillance, Epidemiology, and End Results (SEER) NHL group (P.H., N.R., S.W., S.J.C., R.K.S., S.D., W.C., J.R.C.). Genotyping, quality control specimens, and bioinformatics support were provided by M.Y. and S.J.C. D.A.H. selected the genes and variants to be assayed, with the assistance of S.S.W., N.R., and S.J.C. The statistical analysis was conducted by D.A.H. with input from S.S.W., N.R., P.H., and, on methods for controlling the chance that a reported finding of association is a false positive, from S.W. The manuscript was drafted by D.A.H. and was revised with contributions from all coauthors. All authors reviewed and approved the manuscript.
The online version of this article contains a data supplement.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.
We gratefully acknowledge all staff and scientists at each of the SEER registry sites for the conduct of the study's field effort and collection of biologic specimens. We also are grateful to Robert Welch and Sunita Yadavalli at the NCI Core Genotyping Facility for their careful handling of specimens and meticulous analysis of genotyping data.