Key Points
N-glycosylation sites are acquired early in disease and persist during tumor progression, despite therapy.
Scarcity of N-glycosylation site-negative subclones and their loss during progression suggest that positive clones expand preferentially.
Abstract
Follicular lymphoma B cells undergo continuous somatic hypermutation (SHM) of their immunoglobulin variable region genes, generating a heterogeneous tumor population. SHM introduces DNA sequences encoding N-glycosylation sites asparagine-X-serine/threonine (N-gly sites) within the V-region that are rarely found in normal B-cell counterparts. Unique attached oligomannoses activate B-cell receptor signaling pathways after engagement with calcium-dependent lectins expressed by tissue macrophages. This novel interaction appears critical for tumor growth and survival. To elucidate the significance of N-gly site presence and loss during ongoing SHM, we tracked site behavior during tumor evolution and progression in a diverse group of patients through next-generation sequencing. A hierarchy of subclones was visualized through lineage trees based on SHM semblance between subclones and their discordance from the germline sequence. We observed conservation of N-gly sites in more than 96% of subclone populations within and across diagnostic, progression, and transformation events. Rare N-gly-negative subclones were lost or negligible from successive events, in contrast to N-gly-positive subclones, which could additionally migrate between anatomical sites. Ongoing SHM of the N-gly sites resulted in subclones with different amino acid compositions across disease events, yet the vast majority of resulting DNA sequences still encoded for an N-gly site. The selection and expansion of only N-gly-positive subclones is evidence of the tumor cells’ dependence on sites, despite the changing genomic complexity as the disease progresses. N-gly sites were gained in the earliest identified lymphoma cells, indicating they are an early and stable event of pathogenesis. Targeting the inferred mannose-lectin interaction holds therapeutic promise.
Introduction
Follicular lymphoma (FL) is a biologically and clinically heterogeneous disease that remains incurable. Although the majority of patients follow an indolent course, high-risk groups are prone to early progression or transformation to aggressive lymphoma associated with a dismal prognosis. For these patients, current therapies are suboptimal, and uncovering changes occurring early during disease development is essential to improving prognosis.
Despite the loss of 1 immunoglobulin allele through the t(14;18) translocation1 and ongoing somatic hypermutation (SHM) of the immunoglobulin heavy-chain variable region gene (IGHV) that can introduce crippling IGV mutations, all detectable tumor subclones retain functional expression of the surface immunoglobulin throughout disease, resulting in thousands of tumor subclones displaying distinct but clonally related IGHV sequences. This retention suggests a tumor dependence on signaling through the B-cell receptor. Through SHM, replacement mutations introduce amino acid sequence motifs consisting of asparagine (N)-X-serine/threonine, where X can be any amino acid except proline.2-4 These sequences are known as N-glycosylation (N-gly) sites and are found in more than 90% of FL cases.5,6 N-gly sites are rarely found in normal B cells,7 indicative of a pathogenic function. Unusual glycans terminating with high mannose attach to these sites and activate B-cell receptor signaling pathways after engagement with lectins.8-12 This novel interaction represents a critical mechanism by which tumor cells survive in the germinal center, accumulating mutations of epigenetic modifiers early during FL pathogenesis.13,14
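The N-X-S/T motif definition above can be illustrated with a minimal sketch. This is a plain sequence scan written for illustration only; it is not the prediction method used in this study, and the function name `find_ngly_sites` and the toy fragment are hypothetical.

```python
import re

# N-X-S/T sequon: asparagine, any residue except proline, then serine or threonine.
# A lookahead is used so that overlapping motifs (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_ngly_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, motif) for every N-X-S/T sequon in a protein string."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy fragment, not a patient sequence.
# "NFS" qualifies as a site; "NPT" does not, because X is proline.
print(find_ngly_sites("CAASGFNFSSYAMNPTW"))
```

A motif scan of this kind only identifies candidate sites from the sequence; whether a site is actually glycosylated is a separate question.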
The behavior of N-gly sites during disease evolution and progression has been investigated using IGHV cloning techniques in a number of FL cases5 and in 1 case of contiguous FL and in situ follicular neoplasia (ISFN).15 These studies have indicated conservation of acquired N-gly sites within identified clones. However, clone numbers were limited in these studies, underrepresenting the extent of intraclonal diversity. Furthermore, as analysis has been restricted to a single disease event, the behavior of N-gly sites over time has not been addressed, although it would be critical in determining their role in disease initiation and progression. Addressing this requires comprehensive IGV analysis of the clonal repertoire taken from subsequent (temporal) biopsies, ranging from a relatively early point in disease manifestation (eg, diagnosis) to a point at which the disease has become genetically and clinically distinct (eg, relapse and transformation). As SHM continues during disease progression and transformation, the stepwise process can be visualized through lineage trees rooted to a putative nonmalignant germline IGV sequence, making them an important tool in B-cell evolutionary studies.
Our goal was to investigate the behavior of N-gly sites during the disease course. We analyzed the incidence and maintenance of sites within the tumor clones of 6 patients taken at different points of disease. This included analysis of events from different anatomical sites. This is the first study that has analyzed the relationship between FL progression and N-gly sites in patients who have undergone different lines of therapy and presented with different clinical courses, reflecting the heterogeneous nature of the disease.
We found that N-gly sites are acquired within early FL clones and are retained in the intraclonal population, despite ongoing SHM. A striking observation is that sites are a universal determinant of both cell expansion and cell fate, as evidenced by the low frequency of N-gly site-negative subclones in and across diagnostic, progression, and transformation events and their disappearance in subsequent events.
Methods
Three individuals with FL were selected on the availability of genomic DNA derived from sequential tumor lymph node biopsies that had previously undergone somatic variant profiling to reveal a "sparse" or "rich" disease evolution pattern based on degree of genetic semblance.13 Patients 1 and 3 were categorized as sparse, and patient 2 was categorized as rich. Samples were selected on the basis of detection of a clonal IGHV (IGHVDJ) rearrangement through Sanger sequencing (supplemental Methods, available on the Blood Web site). In total, 8 samples were selected, all carrying an IgH-VH3 rearranged major tumor clone (major clone). All samples were obtained after written informed consent in accordance with the Declaration of Helsinki and the London Research Ethics Committee. Approximately 50 ng IGHV genomic DNA amplicons prepared using JH consensus and VH3-FR1 primers were sent for 2 × 250 bp paired-end sequencing, using the Illumina MiSeq platform (Genewiz, NJ). As primers bind within the FR1 and JH regions, a portion of these regions was absent from the sequencing data. The sequential steps involved in the analysis of Illumina reads and identification of tumor-related subclones are detailed in the supplemental Methods. Additional tumor-related reads covering the IGHV gene for 2 patients over different disease events were available from our collaborator (patients 4 and 5) and were produced using Roche 454 Life Sciences Genome Sequencer FLX.16 Additional raw IGHV data files produced from the MiSeq platform were obtained from the NCBI database (BioProject PRJNA240336) for patient 6.17 Clones were analyzed for acquired N-gly sites, using the NetNGlyc 1.0 server. Lineage trees based on the SHM profiles of clones were generated using IgTree,18 detailed in the supplemental Methods.
Statistical analyses
Two-way ANOVA was performed using GraphPad Prism (GraphPad Software, La Jolla, CA).
Results
High-throughput sequencing analysis of tumor-related subclones
Sequencing metrics for patients 1 to 3 are found in supplemental Table 1. We generated 0.81 to 1.17 million paired-end reads/sample (average, 1.09 million; supplemental Table 2). In total, we identified 0.12 to 0.46 million (average, 0.29 million) VDJ junctions per sample. The major clone was identified as being the dominant VDJ rearrangement in the sample (Table 1), and tumor-related reads were identified as described in the supplemental Methods. The number of unique subclones encoded by these reads is detailed in Table 1. To ensure detection of all tumor subclones in the different samples, we used VH3 family oligonucleotides in a single sequencing run, rather than tumor-specific primers. Therefore, contaminating sequences from normal B cells were observed for each sample, and the number of unique VDJ rearrangements/sample is stated in Table 1. Sequencing data from patients 4 to 6 in Table 1 were extracted from the original articles.16,17 The relatively greater number of unique VDJ rearrangements detected for patient 6 is a result of the sequencing approach amplifying all VH families.17 There is a heterogeneous level of contaminating B cells, as indicated by the percentage of merged reads expressing the dominant tumor rearrangement, ranging from 53.3% to 99.03% (Table 1; supplemental Table 2).
Site conservation is a universal feature of the tumor clonal population
We sequenced the IGHV gene in samples obtained at sequential time points of FL in 6 patients and interrogated the derived tumor sequence for the acquisition of N-gly sites. Details regarding patient samples can be found in supplemental Table 3. All major clones identified across samples contained 1 or more N-gly sites (Table 2). With the exception of patient 5, N-gly sites were conserved across disease events, despite patients undergoing several lines of therapy between biopsies (supplemental Table 3). For patient 1 and the fourth N-gly site of patient 2, sites were conserved in transformation events through nonsilent mutations that affected the amino acid sequence (eg, NFS>NVS). For the remaining sites in patient 2 and sites in patients 3 and 4, the amino acid sequences were conserved across disease events. Conservation of sites is also supported in our extension cohort of serial FL and transformed samples from patients A to E who underwent IGHV Sanger sequencing (supplemental Table 4). The sequential samples of patients 5 and 6 were derived from different anatomical sites (supplemental Table 3). For patient 5, the 2 disease events have distinct N-gly sites: NFS in the CDR1 region (first relapse event) and NLT in the FR3 region (third relapse event). For patient 6, all events contain the same N-gly site and amino acid motif (NGS; Table 2). To elucidate whether N-gly site acquisition is a clonal event, we interrogated the subclone population by next-generation sequencing. For patients containing a single N-gly site in their major clone, at least 97% of the subclone population within and across disease events maintained the site (Table 2). For patients 2 and 4, who had multiple N-gly sites, no subclone with the complete absence of N-gly sites was detected. For patient 2, the first, second, and third sites were conserved in more than 96% of subclones across events. The fourth site was conserved in 97.2% of clones in the first relapse sample and in 82% and 85% of subclones in the third relapse and transformation samples, respectively. Interestingly, for all patients, no further N-gly sites that were absent from the major clone accumulated within or across events. This implies that site acquisition is a conserved event.
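The per-site conservation percentages reported here amount to counting the unique subclones that still carry an N-X-S/T motif at the position of the major clone's site. The following is a minimal sketch of that arithmetic, assuming aligned amino acid fragments and counting each unique subclone once; the function name `site_conservation` and the example fragments are hypothetical and are not taken from the study pipeline.

```python
import re

# N-X-S/T motif, where X is any residue except proline.
SEQUON_AT = re.compile(r"N[^P][ST]")


def site_conservation(subclones: list[str], start: int) -> float:
    """Percentage of subclones with an N-X-S/T motif at the 0-based offset `start`."""
    if not subclones:
        raise ValueError("no subclones supplied")
    kept = sum(1 for s in subclones if SEQUON_AT.fullmatch(s[start:start + 3]))
    return 100.0 * kept / len(subclones)


# Hypothetical aligned fragments; the major clone carries NFS at offset 2.
# NVS and NFT still encode a site, DFS does not, so 3 of 4 subclones retain it.
subclones = ["GFNFSSY", "GFNVSSY", "GFNFTSY", "GFDFSSY"]
print(site_conservation(subclones, 2))
```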
SHM diversity within the N-glycosylation site indicates a selective retention of site-positive subclones
Table 3 highlights the number of unique subclones that have a different sequence in the N-gly site region compared with the major clone of the disease event. As the N-gly site is encoded by 9 nucleotides, these subclones differ from the major clone by at least 1 nucleotide within the N-gly site region. There is wide variation in percentage of affected subclones between patients, ranging from 0% to 58.41% of the total subclone population, indicating the (largely) random targeting of SHM within the variable region. For patients 1, 2 (sites 1-3), 3, 4 (site 1), and 5, the majority of affected subclones across disease events maintain the N-gly site, indicating a positive selection (Figure 1a). Analysis of the codon sequences of these positive subclones across patients and events revealed that N-gly sites are retained through either synonymous mutations or nonsynonymous mutations. The profiling of these subclones from patient 3 is used to highlight these 2 means of N-gly site retention in Figure 1b.
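The 2 means of site retention can be made concrete by translating the 9-nucleotide region of a subclone and comparing it with that of the major clone. The sketch below uses the standard genetic code; the function names and the example nucleotide sequences are hypothetical and chosen only to reproduce an NFS>NVS-type change, not taken from patient data.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate an in-frame nucleotide string (length divisible by 3)."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def is_sequon(aa: str) -> bool:
    """True if a 3-residue string is N-X-S/T with X not proline."""
    return len(aa) == 3 and aa[0] == "N" and aa[1] != "P" and aa[2] in "ST"


def classify(major_nt: str, subclone_nt: str) -> str:
    """Classify a subclone's 9-nt site region relative to the major clone."""
    if subclone_nt == major_nt:
        return "identical"
    major_aa, sub_aa = translate(major_nt), translate(subclone_nt)
    if not is_sequon(sub_aa):
        return "site lost"
    if sub_aa == major_aa:
        return "retained (synonymous)"
    return "retained (nonsynonymous)"


major = "AACTTCAGC"  # hypothetical region encoding NFS
for sub in ("AACTTTAGC", "AACGTCAGC", "GACTTCAGC"):
    print(sub, translate(sub), classify(major, sub))
```

In this toy example the first subclone keeps NFS through a silent change, the second becomes NVS and still encodes a site, and the third becomes DFS and loses it.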
For patient 2, the fourth N-gly site was absent in the majority of affected subclones in the third relapse and transformation events, yet the remaining N-gly sites may be supporting their survival and expansion, as indicated by their high percentage in the subclone population. Although in patient 6 the affected subclones in the first 2 events were mostly N-gly site positive, the affected subclones of the relapsed tFL event were predominantly site negative. As this is a relatively late disease event, compared with the other patients, the N-gly site may have become redundant in promoting tumor survival. However, it is important to point out that these negative subclones make up 2.7% of the total subclone population (Table 3). Furthermore, site-negative subclones make up only 1% or less of the total count number in samples expressing only a single N-gly site (supplemental Table 5), indicating they are a minor component of the tumor bulk.
N-glycosylation sites in distinct anatomical sites
The serial samples of patients 5 and 6 were taken from distinct anatomical sites, making them important in studies regarding the genealogy of N-gly sites. For both patients, serial disease events were derived from the same precursor B cell, as evidenced by a shared VDJ rearrangement and t(14;18) translocation. For patient 5, the 2 events have distinct N-gly sites (Table 2), whereas for patient 6, all events contain the same N-gly site and amino acid motif (NGS). When comparing the subclones of patient 5, we observe a clear discordance in the SHM pattern of the 2 temporal populations and how this translates into a highly distinct amino acid sequence (supplemental Figure 1). This suggests that for patient 5, there was early divergence of the precursor tumor cell before N-gly site acquirement, whereas for patient 6, the precursor cell diverged after acquiring the N-gly site (Figure 2). Patient 5 demonstrates that N-gly acquirement may not be an event of an early divergence evolution model, instead occurring in anatomical site-specific ancestral cells that have undergone unique SHM processes. However, acquirement occurs early in these site-specific cells, as illustrated by the presence of N-gly sites in 97.14% and 99.19% of unique subclones in the first and third relapse events, respectively (Table 2).
N-glycosylation site-positive subclones are important in disease progression and migration between anatomical sites
As described here and with the exception of patient 5, N-gly sites are conserved in the clonal population across disease events. When we compared subclone populations, the majority of subclones for each disease event were unique, highlighting the intertumor heterogeneity generated through SHM (Figure 3). Shared subclones make up 0.03% to 27.5% of the total tumor subclones identified across disease events. These subclones survive for years, as indicated by the intervals between temporal biopsy acquirements (supplemental Table 3).
Analysis of shared subclones revealed that they are all N-gly site positive, with the exception of patient 3, in whom 1.5% of the shared clones were negative (n = 20). Interestingly, most of these negative clones make up a higher percentage of the total tumor count in the successive disease event, suggesting that they confer an advantage (supplemental Table 6). The lack of shared N-gly site-negative subclones in all other patients indicates that progression subclones are dependent on N-gly sites for their long-term survival. Analysis of shared subclones in patients 1 and 2, in which the amino acid composition of N-gly sites changes between disease events in both the major clones and subclone populations (Table 2), reveals that subclones giving rise to transformation tumors were already preexistent as minor subclones in earlier events, gaining clonal dominance after therapy to generate the transformation tumors. This subclone plasticity relies on the conservation of N-gly sites, indicating the important role sites provide subclones involved in disease progression.
Subclones were also shared between biopsy sites, making up 0.4% and 0.3% of the overall subclone population across all disease events for patients 5 and 6, respectively. Similar to the other patients, shared subclones of both patients were all N-gly site positive, with patient 5 subclones containing the site of the first event (NFS motif in the CDR2 region). This indicates that migratory subclones require site presence and could represent a tumor cell feature that is critical for establishing disease in new locations; however, this requires investigation in a larger cohort. The lack of shared subclones containing the N-gly site of the second disease event in patient 5 indicates that this disease event did not arise as a result of a preexisting minor subclone in the first event gaining clonal dominance and repopulating the tumor at another site. This is in contrast to the subclone plasticity we observe in patients 1 and 2, described here.
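Identifying shared subclones reduces to intersecting the sets of unique subclone sequences from 2 disease events. A minimal sketch follows, under the assumption that the percentage is taken over all unique subclones seen in either event; the exact denominator used in this study is described in its supplemental material and may differ, and the function name and toy sequences are hypothetical.

```python
def shared_subclones(event_a: set[str], event_b: set[str]) -> tuple[set[str], float]:
    """Return the subclones present in both events and their percentage
    of all unique subclones observed across the 2 events (an assumed denominator)."""
    shared = event_a & event_b
    total = event_a | event_b
    pct = 100.0 * len(shared) / len(total) if total else 0.0
    return shared, pct


# Hypothetical toy sequences standing in for unique subclone reads.
first_event = {"AACTTCAGC", "AACTTTAGC", "GACTTCAGC"}
later_event = {"AACTTTAGC", "AACGTCAGC", "AACGTTAGC"}
shared, pct = shared_subclones(first_event, later_event)
print(shared, pct)
```

The shared set can then be screened for N-gly site status in the same way as any other group of subclones.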
N-glycosylation sites are acquired early in disease evolution
We can gain insight into tumor evolution by analyzing the degree of SHM in each subclone. The range of SHM for each patient is indicated in Table 4, in which the least and most mutated subclones for each disease event (compared with their germline sequence) were identified by the IMGT High V-QUEST program. The percentage difference in homology between the least and most mutated subclones ranged from 2.0% to 21.7%.
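As a rough illustration of how a homology percentage relates to the number of substitutions, the sketch below computes the percent identity between 2 equal-length, gap-free nucleotide strings. This is a simplification under stated assumptions: it is not the IMGT High V-QUEST calculation, the 250-nucleotide comparison length is hypothetical, and the function name is illustrative.

```python
def percent_identity(germline: str, subclone: str) -> float:
    """Percent identity between 2 aligned, gap-free sequences of equal length."""
    if len(germline) != len(subclone) or not germline:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(g == s for g, s in zip(germline, subclone))
    return 100.0 * matches / len(germline)


# Hypothetical 250-nt region in which the subclone carries 5 substitutions.
germline = "A" * 250
subclone = "G" * 5 + "A" * 245
print(percent_identity(germline, subclone))
```

Because the percentage depends on the length of the compared region, the same number of substitutions yields different homology values for amplicons of different lengths.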
For patients 1, 4, and 6, N-gly sites were acquired within their least mutated subclones in all their disease events. For patient 6, acquirement of the CDR3-located site was observed after only 4 nucleotide substitutions (97.8% sequence homology to germline V gene; Table 4). However, for the least mutated subclone of the transformed event for patient 3, the N-gly site is not acquired, even though the subclone harbors a relatively greater number of point mutations. Notwithstanding this heterogeneity, N-gly sites are conserved once acquired despite ongoing SHM, as evidenced in the most mutated subclones of all patients. Therefore, subclone selection is based on conserving N-gly sites in spite of active mechanisms that have the potential to disrupt them.
Lineage trees specifically based on VDJ sequences were used to visualize the evolutionary intraclonal hierarchy.18 For patients 4 and 5, the complete hierarchy can be visualized in lineage trees (Figures 4 and 5; supplemental Figure 2).
For patient 4, the earliest experimentally derived subclones (identified as filled nodes closest to the germline Ig sequence at the top of the tree) are N-gly site-positive. Although patient 4 does not have any truly negative subclones, several subclones lose at least a single N-gly site. Some of these subclones are observed to undergo further SHM and reacquire the lost site (Figure 4), giving rise to several further clones. This is in contrast to patient 5, in whom the loss of the single N-gly site results in the subclone not undergoing further diversification or expansion (Figure 5). One N-gly site-negative subclone in the first relapse event is placed high in the tree and only differs from the germline sequence by 5 bases. This subclone corresponds to the least mutated subclone (98% homology) highlighted in Table 4, indicating this clone never acquired the N-gly site. The other site-negative subclones were descendants of site-positive clones because of ongoing SHM. As these clones are lost from progression samples, we can infer their elimination.
N-glycosylation site-negative clones arise from further SHM of site-positive clones
With greater numbers of N-gly site-negative subclones as a result of an increase in overall subclones, the lineage trees of patients 1, 3, and 6 give a more comprehensive insight into the behavior of site-negative clones in the tumor hierarchy (Figure 6). As patient 2 did not have any truly negative subclones, the analysis was omitted for this patient. Nevertheless, negative subclones represent a minority within the heterogeneous population. For patient 1, negative clones represented 1.7% and 1.8% of the subclone population in diagnosis and transformation events, respectively. For patient 3, negative clones found in second relapse, third relapse, and transformation represented 2.5%, 2.1%, and 1.8% of the population, respectively. For patient 6, negative clones represented 1.6%, 2.1%, and 2.7% of the subclone population in FL diagnosis, tFL diagnosis, and tFL relapse events, respectively. N-gly site-negative clones were found to arise from either a positive or a negative clone, through a single nucleotide variant. Several negative clones can arise from a shared positive ancestor, as depicted through the wide branching. Further SHM in these negative clones does not result in site reacquirement or gain of new sites.
Discussion
The high propensity of relapse in patients with FL suggests that current therapies are not successfully targeting the early aberrations needed to propagate disease, leading to acquirement of further mutations that reduce effective treatment options. Therefore, uncovering and targeting features of FL ancestral cells may offer durable outcomes for patients.
We report for the first time the behavior of N-gly sites during disease progression by analyzing the clonal repertoire of temporal FL samples based on IGHV sequencing. Samples ranged from diagnosis to transformation and included a mixed patient cohort with variable clinical disease courses, reflecting the heterogeneous nature of the disease (supplemental Table 3). All patients harbored at least 1 acquired site in their earliest disease event that was conserved in both the heterogeneous subclonal population and the overall tumor mass. N-gly sites were also retained in sequential relapse and transformation samples, although for patients 1 and 2, sites were conserved through nonsynonymous mutations (Table 2). Analysis of the 9-base pair region encoding the N-gly site for each patient revealed a group of subclones harboring a different nucleotide sequence in the site from that of the major clone, as a result of ongoing SHM. However, the majority of these affected subclones maintained the N-gly site for patients 1, 2 (sites 1-3), 3, 4 (site 1), and 5 across disease events through synonymous and nonsynonymous mutations (Figure 1a). Although the acquirement of additional "driver" mutations through natural or therapy-related selection pressures may dampen the tumor’s microenvironment dependency at later stages of disease, the conservation of N-gly sites suggests they retain an important functional significance.
The presence of negative subclones is an expected occurrence, as SHM does not differentiate between seemingly favorable and nonfavorable mutations. Lineage trees have revealed how negative clones are derived from positive clones, suggesting acquirement of sites is an early event. As these negative clones represent only a small percentage of the tumor population, they are likely to be outcompeted by N-gly site-positive clones, perhaps because of loss of the microenvironmental interaction provided via the added mannoses. Negative clones can still undergo SHM, but do not reacquire sites in their progeny and, with the exception of patient 3, are lost from subsequent samples, indicating that they are not selected to undergo expansion or long-term survival. Sanger sequencing of the light chain variable region for patient 3 did not reveal additional N-gly sites in the major clone. However, we cannot exclude that sites in the light chain are acquired subclonally and are therefore present in the negative subclones of patient 3, because determining the light chain N-gly site status of IGHV-based subclones is currently not possible.
Patient 5 provided an interesting case for 2 reasons: the different anatomical sites for the 2 events and the discordant SHM within the IGHV between the 2 clonal populations. This discordance suggests an early divergence, in which an ancestral cell with limited SHM migrated from 1 site to another, where selection pressures drove the outward growth of subclones with a specific SHM pattern. However, despite IGHV sequence heterogeneity, the acquirement of N-gly sites within each population at different locations illustrates that sites are an essential feature of FL. The sharing of 2 subclones with the N-gly site of the diagnostic sample highlights the trafficking ability of site-positive subclones between anatomical sites, which is also observed in patient 6 (Figure 3). However, for patient 6, SHM patterns between events were highly similar, and the CDR3 N-gly site was conserved throughout, suggesting a late divergence between events from a shared ancestral cell. Therefore, N-gly sites are required in both early and late divergence models of evolution.
The conservation of N-gly sites within and across disease events and the lack of accumulation during ongoing SHM suggest they are an early and stable event in FL pathogenesis. Early events are usually determined through their conservation within temporal samples, and for patients 1 to 3, whole-genome sequencing/whole-exome sequencing (WGS/WES) had previously identified key genetic aberrations within a putative ancestral cell, known as the common progenitor cell (CPC).19-21 Patients 1 and 3 had a sparse CPC because of the lack of shared genetic aberrations across temporal samples, suggesting an early divergence with episodes arising from more genetically independent pathways. However, despite this mutational heterogeneity between events, N-gly sites are conserved, identifying an important feature of the CPC. This is a significant finding, as the CPC is believed to be the reservoir pool from which successive disease events arise, accounting for the high relapse rates experienced by the majority of patients. The latency between biopsy sampling (supplemental Table 3) suggests an N-gly site-positive CPC that is able to remain dormant for many years before a mutational event leads to a new disease episode. The mannose-lectin interaction enables tumor retention and survival of the CPC, permitting the accumulation of genetic events that lead to overt disease, suggesting a critical priming event in FL manifestation. Although epigenetic deregulation is considered a CPC event, as evidenced in the previous genetic profiling of samples from patients 1 to 3, our data imply that it is not solely sufficient for "driving" the disease. Instead, it seems that the N-gly site profile determines which clones are able to expand and survive during disease progression, irrespective of the genetic profile of the subclones. Analyzing the genetic profile of N-gly site-negative subclones will determine the validity of this hypothesis.
Although the t(14;18) translocation can be found in healthy circulating B cells that do not go on to become malignant,22-24 N-gly sites in the variable region are restricted to germinal center-derived lymphomas,3 indicating an attractive and tumor-specific therapeutic target that may lead to the loss of a critical CPC-microenvironmental interaction and reduce the frequency of relapse. The presence of N-gly sites in the presumed FL precursor, ISFN,15 supports the theory of N-gly sites occurring at an early stage of pathogenesis, being acquired even before disease manifestation. Figure 7 summarizes how N-gly sites affect the evolution of disease.
For original data, please contact s.krysov@qmul.ac.uk.
Acknowledgments
The authors thank Genewiz (New Jersey, US) for performing next generation sequencing. The authors acknowledge the Tissue Bank at Barts Cancer Institute (UK) for providing patient samples and corresponding clinical information.
This work was supported by grants from the Pathological Society of Great Britain and Ireland, Leukaemia UK Charity, Barts Cancer Charity and The Greg Wolf Fund.
Authorship
Contribution: M.O., E.C., S.A., J.O., F.S., M.C., and S.K. performed research and analyzed data; M.O., J.G.G., F.F., F.K.S., M.C., and S.K. designed the research and analyzed the data; S.A. and J.O. provided patient samples and analyzed clinical data; M.O. and S.K. wrote the initial draft of the manuscript; and all authors contributed to the modification of the draft and approved the final submission.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Sergey Krysov, Centre for Haemato-Oncology, Barts Cancer Institute, John Vane Science Centre, Queen Mary University of London, London EC1M 6BQ, United Kingdom; e-mail: s.krysov@qmul.ac.uk.
Author notes
M.C. and S.K. contributed equally to this study.