TO THE EDITOR:
A focus on defining genomic mutations at the whole population level in cancer may underestimate clonal plasticity in the tumor genome and how it generates genetic variation between individual single cells (SCs). This may have an impact on understanding cancer evolution and biological behavior. Differences between SCs are known to underpin intraclonal variation (ICV), which is considered a major driver in cancer progression in which clonal competitor cells evolve by Darwinian mechanisms.1 However, knowledge of the entire spectrum of genome-wide mutations in SCs requires an unbiased and unselected approach. In this requirement, SC whole-exome sequencing (WES) seems to be essential for understanding functional gene variation in the tumor clone. To probe this, we examined SC WES in an index case of amp1q21 multiple myeloma (MM).
MM is a lethal plasma cell malignancy located in the bone marrow. Significant efforts have now determined the nature of genome-wide somatic mutations in MM, but this work has largely focused at the whole tumor population level. At this level, of the top 20 mutated genes, the most frequently mutated are NRAS (19%) and KRAS (16%)2 ; no universally shared gene mutations have emerged in MM, which reveals a marked disease heterogeneity. The extent of ICV in these genomic data in individual tumors has been inferred by mapping subclonal mutations using variant allele frequencies (VAFs), defined as the percentage of total reads derived from the variant allele obtained from genome sequencing of a single pool of thousands to millions of purified tumor cells.3 However, VAFs in bulk tumor population genomic sequencing data do not allow identification of SC mutations. The first comprehensive profile of mutations in MM SCs used a preselected targeted gene sequencing approach in which 35 variant loci of 13 commonly mutated genes in MM bulk tumor populations were assessed in SC-amplified DNA by using massively parallel genome sequencing.4 That notable study reproducibly captured the 13 gene mutations with a detection sensitivity of ∼93% in MM cell line–derived SCs, but the study was essentially aimed at evaluating the utility of targeted genome sequencing of individual circulating tumor cells to assess disease status.4 Here we report unbiased WES in SCs from an index case of amp1q21 MM at disease presentation.
We performed WES in an MM bulk tumor population (1000 cells) and SC (n = 20) levels, respectively, comparing data with autologous germ line T-cell genomes (bulk 1000 cells and 5 SCs). Full details are provided in supplemental Data, available on the Blood Web site. Bulk 1000 cells and SCs were isolated from a CD138+CD38++ or CD3+ fluorescence-activated cell sorted tumor population and T cells separately by precision micropipetting. Isolated cell DNA was subjected to whole-genome amplification, library preparation, and Illumina sequencing. Genome sequencing data underwent concise bioinformatics analyses, and data were determined to be high quality (Table 1; supplemental Figure 1).
Sequencing was performed to a mean depth of 57.7×, achieving a mean of 82.1% coverage of target exome at ≥5×, mean allele dropout rate of 31.27%, and a mean heterozygous/homozygous ratio of 0.72 (supplemental Data; supplemental Table 1). These data surpassed the SC WES mean read depths of ∼30× to 33× and ∼70% to 80% exome coverage at ≥5× reported in early seminal SC WES studies.5,6
We next assessed chromosomal amplifications and deletions from WES data by low-resolution copy number variant analysis (supplemental Figure 2). There was no apparent evidence of subclonal variation in the karyotype of tumor cells, which suggests that genomic aberrations are early events in tumorigenesis in MM. Loss of heterozygosity variants were not called as they were indistinguishable from allele dropout. These observations indicate that karyotype determined by copy number variant does not segregate ICV at the SC level from WES data.
Somatic variants were then identified in WES data by pairwise comparison of bulk tumor population with bulk germ line T cells, excluding those present in databases of human germ line variations. Stringent quality control measures were used to reduce the number of false-positive calls (supplemental Data), and we identified 48 high-confidence variants in the bulk tumor population that were considered to be acquired somatic variants (Table 1; supplemental Table 1). Of these 48 variants in the bulk tumor exome, only 34 (71%) were identified in varying penetrations as mutations in SCs, commensurate with VAFs that indicated ICVs (Figure 1A-B). We also determined variations between SCs in principal component analysis plots (data not shown), providing further evidence of intraclonal heterogeneity (supplemental Table 2). Overall, however, 29% of bulk population somatic variants were not seen in SCs. A panel of mutated genes (labeled P1 in Figure 1B, spanning genes RNF112 to CBX6) present in the bulk population were notably absent in several SCs despite VAFs >40% (supplemental Table 1) and draw caution in inferences from gene VAFs in bulk tumor population genomic data.
We also identified an additional 21 somatic variants that were called in only 2 or more SC exomes but not in the bulk tumor population despite high coverage at each of 17 sites in bulk exome data (mean sequence depth, 105×; range 36 to 330×). The vast majority of the 21 variants thereby provided robust evidence of somatic mutations occurring in differing SCs that were not detectable at the whole tumor population level and revealed a greater sensitivity in identifying mutations when using SC genome-wide methodologies. The somatic variants identified in SC exomes confirmed that mapping subclonal mutations across an MM tumor population requires an unbiased approach.
To confirm variant calls and occurrence of mutations in our study, 15 randomly selected nonsynonymous single-nucleotide variants were validated by Sanger sequencing in the bulk tumor population and 4 randomly chosen SCs. Fifteen of 15 variants were confirmed in the bulk tumor–amplified DNA and in 55 of 55 variants amplifiable in SC DNA to obtain 100% concordance (supplemental Table 3). An additional 10 nonsynonymous single-nucleotide variants that were observed only in SCs were also re-sequenced, and 7 of 10 variants were amplified in at least 1 SC; in each, the variant nucleotide was confirmed by sequence. By using Sanger sequencing, we confirmed somatic variants with VAFs ranging from 19% to 100%, of which 7 of 15 had a VAF <40%.
We identified a total of 69 unique mutations present with high confidence in the bulk and/or SC tumor exomes, and 48 of these were predicted to be potentially deleterious (nonsynonymous, stopgain, frameshift indel), and 37 of 48 were predicted to be of functional relevance to disease origins (supplemental Table 2). All but 2 deleterious variants occurred in genes also reported in Catalogue of Somatic Mutations in Cancer (COSMIC).1 Driver gene analysis using IntOGen and KEGG pathway analysis identified 5 potential driver genes (ANK3, AXIN1, BRCA2, MAP4K3, TRIP10), but these seemed to be subclonal (Figure 1B).
Strikingly, these SC WES data indicate that the mutational status of the MM genome is markedly underestimated by bulk tumor population analysis and overlooked by targeted gene resequencing approaches. The 21 somatic variants identified in 2 or more SC exomes revealed replicates and an apparent ∼44% increase in mutational complexity not suspected from 48 bulk population mutations. Importantly, our data in a lymphoid malignancy are comparable to findings from 16 SC nucleus-based exomes in a triple-negative breast cancer case in which in 2 or more SC exomes, 145 nonsynonymous mutations were identified that were not detected at the bulk population level (374 variants), an apparent ∼40% increase.7
The following are implications of our data in MM. We compared 20 SCs with bulk tumor (1000 cells) evaluating only 20 of 1000 cells (0.02%), but the magnitude of how mutational load is underestimated emerged. The scale of the problem and the need to derive a truly representational map of SC exomes was revealed by considering estimates of tumor mass in MM. It has been shown using 125I-loaded synthesis of immunoglobulin G and mathematical modeling that MM tumor mass approaches 1012 cells.8,9 Our patient had immunoglobulin G MM with 80% to 90% bone marrow disease infiltrate at presentation and a potentially comparable tumor mass. It becomes a major challenge then to map the functional SC genome in a tumor mass of this scale. This is an essential requirement for fully understanding how cell-to-cell variation determines tumor biology and behavior and for developing individualized medicine. Any nonsynonymous somatic mutation in an SC has considerable capacity to alter cellular function intrinsically and extrinsically by altering molecular cross-talk with microenvironment cells. The sum of variation in SC behavior will determine population level growth and progression. Singular findings from SC exomes also challenge the concept of the tumor clone, with a key question being how many SCs are identical and how this will relate to defining a clone.7 We performed a study of an index case of 1q21 MM and have thus begun to unravel the true genomic complexity of this lethal disease at the SC level.
The online version of this article contains a data supplement.
Exome sequencing data reported in this article have been deposited in the NCBI BioProject database (accession ID PRJNA474073).
Acknowledgments
The authors acknowledge use of the IRIDIS High Performance Computing Facility and associated support services at the University of Southampton in the completion of this work.
This work was supported by Bloodwise (United Kingdom).
Authorship
Contribution: D.B. performed research, collected, analyzed, and interpreted data, performed bioinformatics and statistical analysis, and helped write the manuscript; W.T. and A.R.C. performed bioinformatics and statistical analysis and analyzed data; N.J.W.-B., A.B., and N.Z. prepared samples and analyzed and interpreted data; N.Z. arranged patient consent and ethics approval in Vienna, Austria, and contributed to design of study; L.S. and S.X. coordinated whole-exome sequencing and data preparation by BGI; S.S.S. designed research, analyzed and interpreted data, and wrote the manuscript; and all authors approved the final submission of the manuscript.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Surinder Singh Sahota, Somers Cancer Research Building, MP891, University Hospital Southampton, Tremona Rd, Southampton SO16 6YD, United Kingdom; e-mail: s.s.sahota@soton.ac.uk.