Abstract
Leukemia evolution is driven by a complex interplay between somatic mutations and epigenetic changes. However, our understanding of these processes is limited by a lack of scalable and cost-effective methods to co-assay genetic and epigenetic changes in single cells. Here, we present scGEM-seq (single-cell Genotyping and Methylome sequencing), a flexible multi-omics platform that permits simultaneous characterization of genetic variants and DNA methylation profiles from single cells. scGEM-seq costs ~3% as much ($0.03 per cell) as competing technologies (>$1 per cell for snmC-seq3 and SciMETv2) and combines three core innovations: (1) single-cell capture with semi-permeable capsules (Atrandi Biosciences), enabling multi-step enzymatic reactions to be performed efficiently with minimal loss of DNA; (2) a novel split-pool strategy (HexaDecimal barcoding) that efficiently generates ~1 million unique cell barcodes compatible with bisulfite-like enzymatic conversions; and (3) flexible library preparation that can be adapted for genome-wide or targeted sequencing.
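As a rough illustration of the barcode arithmetic behind innovation (2), the sketch below assumes, purely for illustration, that HexaDecimal barcoding draws from 16 barcodes per round over 5 split-pool rounds (16^5 = 1,048,576, which is consistent with the ~1 million barcodes stated above). The actual round structure is not given here, and the function names and the uniform-assignment collision model are likewise assumptions rather than the authors' method.

```python
def barcode_space(barcodes_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after split-pool labeling."""
    return barcodes_per_round ** rounds


def expected_collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Expected fraction of cells that share a barcode combination with at
    least one other cell, assuming each cell draws a combination uniformly
    at random (a simplifying assumption)."""
    if n_cells < 2:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# ASSUMPTION: 16 barcodes per round and 5 rounds; neither is stated in the text.
space = barcode_space(16, 5)
print(f"barcode combinations: {space}")
for n_cells in (1_000, 10_000, 100_000):
    frac = expected_collision_fraction(n_cells, space)
    print(f"{n_cells} cells: {100 * frac:.2f}% expected barcode collisions")
```

Under these assumptions, barcode collisions would stay just under 1% when about 10,000 cells are processed together. These are barcode collisions only and are separate from physical doublets formed at the capture step.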
In cell-mixing studies, we show that scGEM-seq accurately resolves cell populations based on DNA methylation profiles and genetics. At modest sequencing depth (0.001X per cell for whole-genome DNA libraries, 15734.4±12123.1 CpGs per cell for methylation libraries), scGEM-seq resolved mouse and human cell lines at expected ratios (1:1:1 ratio of GM12878, K562 and YAC1 cells) with low doublet rates (~2.4% using species genome alignment). scGEM-seq also resolved human cancer cell lines of distinct lineages (1:1:1 ratio of SW480, A375 and OCIAML3 cells) according to methylation and genetics (cell-specific SNVs and copy-number changes) with modest sequencing depth (0.002X per cell for whole-genome DNA libraries, 20242.6±12808.2 CpGs per cell for methylation libraries) and low doublet rates (2.88% using cell-specific SNVs).
To enhance genotyping efficiency, we also combined scGEM-seq with two different target enrichment strategies: hybrid capture (Pan-cancer hybridization panel, IDT; n=127 genes) and amplicon sequencing. In mixed human cancer cell lines, scGEM-seq achieved consistent single-cell genotyping for known cancer genes and hotspot mutations, genotyping a mean of 13.23±5.30% of cells with hybrid capture and >98% of cells with amplicon sequencing. To determine the sensitivity of scGEM-seq for genotyping of rare cells, we prepared a cell mixture containing varying proportions of human cell lines (GM12878: 92.85%, SW480: 5%, A375: 1.5%, OCIAML3: 0.5%, K562: 0.15%). scGEM-seq coupled with amplicon sequencing for known cell-specific variants recovered the expected cell proportions down to ~0.1-0.2%, representing fewer than 2 malignant cells per 1000 (SW480: 302/6294 cells, 4.80%; A375: 88/6294 cells, 1.40%; OCIAML3: 30/6294 cells, 0.48%; K562: 11/6294 cells, 0.17%).
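The comparison between expected and recovered proportions can be checked with a few lines of arithmetic. The sketch below uses only the counts and mixing fractions reported above; the Wilson 95% interval is an added illustration (not an analysis described in this abstract), and all names are hypothetical.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


TOTAL_CELLS = 6294
# cell line -> (expected fraction from the mixing design, genotyped cell count)
MIXTURE = {
    "SW480": (0.05, 302),
    "A375": (0.015, 88),
    "OCIAML3": (0.005, 30),
    "K562": (0.0015, 11),
}


def summarize(mixture: dict, total: int) -> dict:
    """Return observed fraction, interval, and whether the expected
    fraction lies inside the interval, for each cell line."""
    out = {}
    for name, (expected, count) in mixture.items():
        low, high = wilson_interval(count, total)
        out[name] = {
            "observed": count / total,
            "interval": (low, high),
            "expected_in_interval": low <= expected <= high,
        }
    return out


for name, row in summarize(MIXTURE, TOTAL_CELLS).items():
    low, high = row["interval"]
    print(f"{name}: observed {100 * row['observed']:.2f}% "
          f"(95% interval {100 * low:.2f}-{100 * high:.2f}%), "
          f"expected inside interval: {row['expected_in_interval']}")
```

With the reported counts, each designed mixing fraction falls inside the corresponding interval, including the 0.15% K562 population.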
Finally, we applied scGEM-seq to primary bone marrow samples from four patients with myeloid neoplasms (MDS and AML) representing a range of blast counts (47%, 52%, 10% and 16%). Here, scGEM-seq revealed distinct populations of normal, premalignant, and malignant cells, including subclonal leukemia cells with KIT D816V (47/777 cells, 6.05%) and T417_D419delinsL (72/700 cells, 10.29%) mutations in a patient with inv(16) AML, as well as clonal (premalignant) hematopoietic cells harboring a DNMT3A S770L mutation (251/558 cells, 44.98%) and subclonal malignant blasts with distinct KRAS G12C (92/385 cells, 23.90%) and G12D (4/381 cells, 1.05%) mutations.
In conclusion, scGEM-seq provides a flexible, cost-effective, and high-resolution method for elucidating the complex interplay between epigenetic and genetic cell states in cancer cells, including within rare cell populations. We envision multiple applications for scGEM-seq in the study of leukemia and other blood cancers, including characterizing (epi)clonal heterogeneity and dynamics, identifying minimal residual disease, and capturing rare stem/progenitor populations or drug-resistant clones that emerge after therapy.