Key Points
Various types of leukemia share a hematopoietic stem and progenitor population (HSPC-like) associated with chemoresistance and poor outcome.
HSPC-like blasts are targetable with FLT3, BCL2, and PI3K pathway inhibitors, providing common mechanisms to treat high-risk leukemias.
Visual Abstract
The critical role of leukemia-initiating cells as a therapy-resistant population in myeloid leukemia is well established. However, the molecular signatures of such cells in acute lymphoblastic leukemia remain underexplored. Moreover, their role in therapy response and patient prognosis is yet to be systematically investigated across various types of acute leukemia. We used single-cell multiomics to analyze diagnostic specimens from 96 pediatric patients with acute lymphoblastic, myeloid, and lineage-ambiguous leukemias. Through the integration of single-cell multiomics with extensive bulk RNA sequencing and clinical data sets, we uncovered a prevalent, chemotherapy-resistant subpopulation that resembles hematopoietic stem and progenitor cells (HSPC-like) and is associated with poor clinical outcomes across all subtypes investigated. We identified a core transcriptional regulatory network (TRN) in HSPC-like blasts that is combinatorially controlled by HOXA/AP1/CEBPA. This TRN signature can predict chemotherapy response and long-term clinical outcomes. We identified shared potential therapeutic targets against HSPC-like blasts, including FLT3, BCL2, and the PI3K pathway. Our study provides a framework for linking intratumoral heterogeneity with therapy response, patient outcomes, and the discovery of new therapeutic targets for pediatric acute leukemias.
Introduction
Acute leukemia is the most common pediatric cancer. Although outcomes have improved for many types of acute leukemia, considerable variability in patient response to treatment remains. Understanding the molecular mechanisms driving these variations is crucial for developing personalized treatments. Identifying common, targetable mechanisms of treatment failure across different leukemia types would represent a major advance. Previous studies suggest that leukemia-initiating cells (LICs) arise from primitive hematopoietic stem cells (HSCs), multipotent progenitor cells, and lineage-restricted progenitor cells that harbor oncogenic drivers.1-7 The oncogenic lesions can endow multipotent or lineage-restricted progenitor cells with self-renewal capabilities similar to those of endogenous stem cells. The hematopoietic stem and multipotent progenitor cell (HSPC)-like LICs share many characteristics with normal HSPCs and are thought to play a critical role in resistance and relapse.8,9
Normal HSPCs display significant phenotypic heterogeneity and comprise a small fraction of the bone marrow.10-12 Likewise, HSPC-like LICs have been found to have distinct intertumor phenotypes and varied frequency in primary tumors. Most studies investigating LICs have relied on mouse models, with limited data available derived from primary patient samples. Therefore, the interpatient and intrapatient heterogeneity and plasticity of LIC phenotypes represent crucial aspects that require investigation to comprehend their molecular diversity and functional and clinical implications. To address this knowledge gap, we conducted a comprehensive single-cell multiomic analysis of HSPC-like populations in samples from 96 patients representing 4 major types of high-risk pediatric acute leukemias, including acute myeloid leukemia (AML), B-cell acute lymphoblastic leukemia (B-ALL), T-cell acute lymphoblastic leukemia (T-ALL), and mixed-phenotype acute leukemia (MPAL). We defined the developmental arrest states of blasts and common and leukemia subtype–specific factors that maintain the diverse leukemic blast states. We identified HSPC-like transcriptomic and epigenomic signatures that are predictive of therapy response and patient outcome. Our data validate multiple potential therapeutic targets in HSPC-like blasts using ex vivo models. Our study establishes, to our knowledge, the first panleukemia HSPC-like signatures that can be used for risk stratification and targeted therapy of stem cell–like pediatric leukemias.
Methods
Peripheral blood or bone marrow samples were collected from patients with (1) AML enrolled in the Children’s Hospital of Philadelphia Institutional Review Board Protocol 10-007767; (2) B-ALL enrolled in the Children’s Oncology Group clinical trial AALL15P1 (NCT02828358); and (3) T-ALL enrolled in the Children’s Oncology Group clinical trials AALL0434 (NCT00408005) and AALL1231 (NCT02112916). All samples were obtained with informed consent according to the Declaration of Helsinki with institutional review board approval from the participating centers. Single-cell RNA sequencing (scRNA-seq) and single-cell ATAC (assay for transposase-accessible chromatin) sequencing (scATAC-seq) were conducted as previously described.13,14 The supplemental Methods (available on the Blood website) include additional descriptions of methods and materials.
Results
The landscape of developmental arrest states of leukemic blasts
We generated scRNA-seq and scATAC-seq data for 96 diagnostic bone marrow or peripheral blood samples from patients with 4 subtypes of pediatric leukemia: AML (n = 10), B-ALL (n = 35), T-ALL (n = 40), and MPAL (n = 11; Figure 1A; supplemental Tables 1 and 2). The B-ALL cohort includes infant KMT2A-rearranged (KMT2A-R) B-ALL and 4 pediatric subtypes: BCR::ABL1, Philadelphia chromosome–like, intrachromosomal amplification of chromosome 21 (iAMP21), and KMT2A-R, which account for over 70% of all high-risk B-ALL cases.15 Nine genomic subtypes, including the newly defined “early T-cell precursor (ETP)-like” and TAL1 αβ-like subtypes, are represented in the T-ALL cohort, which consists of ETP, near-ETP, and non-ETP T-ALL cases with a range of clinical responses to treatment in the AALL0434 trial (Figure 1A; supplemental Table 2). We compared these T/B-ALL cases with 10 AML cases comprising distinct genetic drivers (KMT2A, NUP98, RUNX1, CBFB) and 11 MPAL cases harboring varied response to conventional therapy. We sequenced a total of 649 599 and 767 122 high-quality cells using scRNA-seq and scATAC-seq, respectively (supplemental Figure 1A-B). We first annotated cell types within each leukemia subtype, identifying malignant blasts, healthy T cells and natural killer cells, healthy myeloid cells, and healthy B cells in all subtypes (Figure 1B-D; supplemental Figures 1C-D and 2A-I).
The landscape of developmental arrest states across acute leukemia subtypes. (A) Patient cohort across leukemia subtypes. (B-C) Overall UMAPs of all scRNA-seq cells (B) and scATAC-seq cells (C) from 96 leukemia samples, colored by major cell types. (D) Bubble plot of manually curated marker genes for leukemic blasts and healthy cells. Color represents scRNA-seq z score. Bubble size represents the percentage of cells expressing a marker in a given population. (E) UMAPs of healthy human hematopoiesis based on scRNA-seq (left) and scATAC-seq (right) data from healthy donor (HD) samples of pediatric bone marrow13 and thymus.14 Cell type annotation for scATAC-seq data was label-transferred from scRNA-seq data. (F) Fractions of leukemic blasts projected onto stem/progenitor, T-lineage, B-lineage, and myeloid-lineage trajectories using healthy reference data. Clinical information of each patient is shown at the bottom. B-M, B-myeloid; CLP, common lymphoid progenitor; DP, double positive T cells; GMP, granulocyte monocyte progenitor; iB-ALL, infant B-ALL; LMPP, lympho-myeloid primed progenitor; MEP, megakaryocyte erythroid progenitor; MPP, multipotent progenitor; pB-ALL, pediatric B-ALL; pDC, plasmacytoid dendritic cells; T-M, T-myeloid; T/NK, T/natural killer cells; UMAP, Uniform Manifold Approximation and Projection.
The landscape of developmental arrest states across acute leukemia subtypes. (A) Patient cohort across leukemia subtypes. (B-C) Overall UMAPs of all scRNA-seq cells (B) and scATAC-seq cells (C) from 96 leukemia samples, colored by major cell types. (D) Bubble plot of manually curated marker genes for leukemic blasts and healthy cells. Color represents scRNA-seq z score. Bubble size represents the percentage of cells expressing a marker in a given population. (E) UMAPs of healthy human hematopoiesis based on scRNA-seq (left) and scATAC-seq (right) data from healthy donor (HD) samples of pediatric bone marrow13 and thymus.14 Cell type annotation for scATAC-seq data was label-transferred from scRNA-seq data. (F) Fractions of leukemic blasts projected onto stem/progenitor, T-lineage, B-lineage, and myeloid-lineage trajectories using healthy reference data. Clinical information of each patient is shown at the bottom. B-M, B-myeloid; CLP, common lymphoid progenitor; DP, double positive T cells; GMP, granulocyte monocyte progenitor; iB-ALL, infant B-ALL; LMPP, lympho-myeloid primed progenitor; MEP, megakaryocyte erythroid progenitor; MPP, multipotent progenitor; pB-ALL, pediatric B-ALL; pDC, plasmacytoid dendritic cells; T-M, T-myeloid; T/NK, T/natural killer cells; UMAP, Uniform Manifold Approximation and Projection.
To further characterize intratumoral cell states, we projected leukemic blasts onto scRNA-seq and scATAC-seq reference maps of pediatric hematopoiesis assembled via single-cell profiling of healthy pediatric bone marrow13 and thymus (Figure 1E; supplemental Table 2). Our reference atlas includes CD38–Lin– HSPCs from multiple donors and their major developmental trajectories toward differentiated T, B, and myeloid cells, encompassing the full spectrum of nonmalignant counterparts for lymphoid and myeloid leukemias (Figure 1E; supplemental Figure 3A-B). Projection of leukemic blasts onto our healthy reference (Figure 1F; supplemental Figure 3C) identified a wide spectrum of developmental arrest in each leukemic subtype. We verified the accuracy of our computational projection using canonical marker genes, noting high concordance with genes corresponding to their respective developmental lineages (supplemental Figure 3D-G). Most leukemic blasts projected to the trajectories consistent with their clinical diagnosis. However, across both lymphoid and myeloid leukemias, we found evidence of lineage plasticity. For instance, myeloid-progenitor projected blasts were found in patients with B-ALL and T-ALL (Figure 1F). These blasts had strong upregulation of canonical myeloid genes including CD33, LYZ, and CEBPA (supplemental Figure 3E-F) and were enriched infant KMT2A-R samples (supplemental Figure 3H), potentially explaining the known risk of lineage switch more commonly described in infant KMT2A-R than in other high-risk pediatric B-ALL after immunotherapies.13,16,17
HSPC-like leukemic blasts represent a shared subpopulation across acute leukemias
Our results revealed several common cell states shared across different subtypes of leukemia. Among these, multipotent stem and progenitor states (including HSC and multipotent progenitor and lymphoid-primed multipotent progenitor cells) were observed at the highest shared frequency in patients with myeloid, lymphoid, and mixed-phenotype leukemia (Figure 1E; supplemental Figure 3C). Given that multipotency is retained in these cell states, we term this blast subpopulation “HSPC-like” blasts, in contrast to blasts expressing lineage markers, which we termed “lineage-like” blasts. Although the frequency of HSPC-like blasts varied by subtype (Figure 2A), they universally exhibited stem cell marker gene expression (supplemental Figure 3D-G) and chromatin accessibility profiles (Figure 2B). We next used our single-cell derived molecular signatures to deconvolute public bulk RNA sequencing (RNA-seq) data sets of pediatric acute leukemias, including the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) cohort,18 the AALL0434 T-ALL cohort,19 and the pan-ALL cohort20 (supplemental Table 1). We found that 68% of AML, 65% of MPAL, 26% of T-ALL, and 5.6% of B-ALL cases contained at least 5% HSPC-like blasts (supplemental Figure 4A). Within B-ALL, we also observed that the frequency of HSPC-like blasts was related to previously defined molecular subtype. For example, in the pan-ALL cohort, we found that the KMT2A-R subtype has the highest frequency of HSPC-like blasts, followed by the ZNF384-rearranged (ZNF384-R) and Ph+ subtypes (supplemental Figure 4B). Previous studies have demonstrated that ZNF384-R leukemia often exhibits various immunophenotypes,21,22 high lineage ambiguity, and potential for myeloid lineage switch.23,24 The high frequency of HSPC-like blasts in ZNF384-R leukemia may explain its high lineage plasticity (supplemental Figure 4C). Within T-ALL samples, patients with ETP acute lymphoblastic leukemia (ALL) exhibited the highest estimated HSPC-like frequency, followed by patients with near-ETP and non-ETP subtypes, in line with patients analyzed using single-cell sequencing (supplemental Figure 4D). The similarity of HSPC-like blasts to nonmalignant HSPCs residing in the bone marrow represents a potential confounder in our single-cell and bulk RNA-seq analyses. We thus directly amplified transcripts of mutated oncogenes detected via whole-exome/genome sequencing in single-cell RNA-seq libraries using the Genotyping of Transcriptomes assay.25 We found that HSPC-like and lineage-like blasts have significantly greater proportions of mutant cells than nonmalignant cells, corroborating the molecular differences between HSPC-like blasts and nonmalignant HSPC (P = 1.6 × 10–4; P = 5.9 × 10–4; Figure 2C; supplemental Table 4). To investigate shared and unique HSPC-like phenotypes across different subtypes of leukemia, we performed computational integration of all HSPC-like blasts, identifying 11 cell clusters that differed in transcriptional profile but were all composed of patients with all subtypes of leukemia (supplemental Figure 4E-G). Our results suggest that heterogeneity in HSPC-like blasts is shared across disease states.
HSPC-like leukemic blasts are commonly observed across high-risk acute leukemias. (A) Fraction of leukemic blasts projected as stem/progenitor-like cells based on scRNA-seq in 4 leukemia subtypes. (B) Genomic tracks of scATAC-seq signal at CD34 (left) and KIT (right) loci in stem/progenitor-like and lineage-like leukemic blasts. Cis-regulatory elements are highlighted in yellow. (C) Fractions of mutant cells detected in HSPC-like, lineage-like blasts, and healthy cells based on 9 recurrent mutations (supplemental Table 4). Mutations in single cells were identified using the Genotyping of Transcriptomes assay. Patient IDs are added in front of the mutation name. (D) Percentage of BM replacement by leukemia blasts in 3 PDX models: T-myeloid MPAL (HSPC-like PDXs n = 4; lineage-like PDXs n = 2; left), ETP-ALL (HSPC-like PDXs n = 2; lineage-like PDXs n = 2; middle), and B-ALL (HSPC-like PDXs n = 8; lineage-like PDXs n = 8; right). Sorted cells (HSPC-like or lineage-like; supplemental Figure 4D-F) were used for the transplant experiments. (E,G) UMAPs show the projections of PDX cells derived from sorted HSPC-like (left) and lineage-like (right) primary leukemic blasts from B-ALL (E) and ETP-ALL (G). Gray dots represent cells from HDs, blue dots cells from HSPC-like derived PDX, and red dots cells from lineage-like derived PDX. (F,H) Fractions of PDX cells projected to particular populations for B-ALL (HSPC-like-derived PDXs n = 8, lineage-like-derived PDXs n = 8) (F) and ETP-ALL (HSPC-like-derived PDXs n = 2; lineage-like-derived PDXs n = 2) (H). Left, HSPC-like; middle, CLP-like; right, lineage-like. P values were computed using the Student t test. (I) Stemness gene signature scores for stem/progenitor-like and lineage-like leukemic blasts in all subtypes. Scores were calculated using AUCell. P values were computed using the Student t test. ∗∗P < .001; ∗∗∗P < .0001. (J-K) Box plots of Shannon entropy scores computed on the basis of scRNA-seq (J) and scATAC-seq (K) projected populations in each patient. Samples with high HSPC-like percentages represented the top 50% of HSPC-like fractions in each disease group, whereas samples with low HSPC-like percentages represented the bottom 50% of HSPC-like fractions in each disease group. P values were computed using the Student t test. ∗P < .05; ∗∗∗∗P < .0001. BM, bone marrow; B-M, B-myeloid; CLP, common lymphoid progenitor; n.s., not significant; T-M, T-myeloid; UMAP, Uniform Manifold Approximation and Projection.
HSPC-like leukemic blasts are commonly observed across high-risk acute leukemias. (A) Fraction of leukemic blasts projected as stem/progenitor-like cells based on scRNA-seq in 4 leukemia subtypes. (B) Genomic tracks of scATAC-seq signal at CD34 (left) and KIT (right) loci in stem/progenitor-like and lineage-like leukemic blasts. Cis-regulatory elements are highlighted in yellow. (C) Fractions of mutant cells detected in HSPC-like, lineage-like blasts, and healthy cells based on 9 recurrent mutations (supplemental Table 4). Mutations in single cells were identified using the Genotyping of Transcriptomes assay. Patient IDs are added in front of the mutation name. (D) Percentage of BM replacement by leukemia blasts in 3 PDX models: T-myeloid MPAL (HSPC-like PDXs n = 4; lineage-like PDXs n = 2; left), ETP-ALL (HSPC-like PDXs n = 2; lineage-like PDXs n = 2; middle), and B-ALL (HSPC-like PDXs n = 8; lineage-like PDXs n = 8; right). Sorted cells (HSPC-like or lineage-like; supplemental Figure 4D-F) were used for the transplant experiments. (E,G) UMAPs show the projections of PDX cells derived from sorted HSPC-like (left) and lineage-like (right) primary leukemic blasts from B-ALL (E) and ETP-ALL (G). Gray dots represent cells from HDs, blue dots cells from HSPC-like derived PDX, and red dots cells from lineage-like derived PDX. (F,H) Fractions of PDX cells projected to particular populations for B-ALL (HSPC-like-derived PDXs n = 8, lineage-like-derived PDXs n = 8) (F) and ETP-ALL (HSPC-like-derived PDXs n = 2; lineage-like-derived PDXs n = 2) (H). Left, HSPC-like; middle, CLP-like; right, lineage-like. P values were computed using the Student t test. (I) Stemness gene signature scores for stem/progenitor-like and lineage-like leukemic blasts in all subtypes. Scores were calculated using AUCell. P values were computed using the Student t test. ∗∗P < .001; ∗∗∗P < .0001. (J-K) Box plots of Shannon entropy scores computed on the basis of scRNA-seq (J) and scATAC-seq (K) projected populations in each patient. Samples with high HSPC-like percentages represented the top 50% of HSPC-like fractions in each disease group, whereas samples with low HSPC-like percentages represented the bottom 50% of HSPC-like fractions in each disease group. P values were computed using the Student t test. ∗P < .05; ∗∗∗∗P < .0001. BM, bone marrow; B-M, B-myeloid; CLP, common lymphoid progenitor; n.s., not significant; T-M, T-myeloid; UMAP, Uniform Manifold Approximation and Projection.
We next investigated whether HSPC-like blasts differed from lineage-like blasts in leukemogenesis potential in xenograft models. On the basis of immunophenotypic markers identified through scRNA-seq (supplemental Figure 4H-J), we sorted HSPC-like blasts (CD34+CD63+CD19– in B-ALL, CD34+CD44+ in T-ALL/MPAL) and lineage-like blasts (CD19+ in B-ALL, CD34– in T-ALL and MPAL) for transplantation into NOD scid gamma/NOD scid gamma-SGM3 mice (supplemental Figure 4K-M). We did not observe engraftment differences between HSPC-like and lineage-like blasts, with both populations giving rise to leukemia with >90% bone marrow replacement in all 3 leukemias (Figure 2D). However, scRNA-seq of the engrafted leukemias revealed large transcriptional differences, with HSPC-like-derived leukemia clustering separately and having elevated frequency of immature blasts (supplemental Figure 5A-F). Patient-derived xenograft (PDX) expansion of HSPC-like blasts derived from primary B-ALL arrested at earlier stages of B-cell development (HSC/multipotent progenitor, lymphoid-primed multipotent progenitor, common lymphoid progenitor) compared with lineage-like blasts (Figure 2E-F; supplemental Figure 5B). HSPC-like blasts maintained their CD19–CD34hiCD63hi phenotype after engraftment (supplemental Figure 5C) and harbored higher expression of stem-cell markers and lower expression of B-cell development markers, including IGLL1, VPREB1, VPREB3, and CD24. Among all PDX models, most early progenitor-like cells were derived from the expansion of HSPC-like blasts (supplemental Figure 5C). We observed similar trends when testing the engraftment potential of HSPC-like blasts derived from patients with T-ALL (Figure 2G-H; supplemental Figure 5E). Taken together, HSPC-like blasts and lineage-like blasts derived from lymphoid leukemias both harbor high engraftment potential but differ in cellular arrest state after engraftment.
Prior studies have indicated that stem cell–like blasts in ALL and AML differ in their functional capacity to generate leukemic hierarchies and have isolated signatures specific to myeloid stem cells. To better contextualize transcriptional similarities in HSPC-like blasts across leukemias, we applied 6 AML leukemia stem cell (LSC) gene signatures26-31 to HSPC-like blasts identified in our single-cell data, revealing a strong upregulation in all subtypes of leukemia (Figure 2I). Notably, although HSPC-like blasts in AML have a more pronounced adult LSC17 signature, those in ALL show a more pronounced pediatric LSC6 signature (Figure 2I), underscoring the need to discover a panleukemia HSPC-like signature. We next investigated whether the frequency of HSPC-like blasts was associated with the developmental arrest spectrum observed within each leukemia. We computed Shannon entropy–based scores using projected cell states as a measure of heterogeneity. We found a strong correlation between entropy score and percentage of HSPC-like cells in scRNA-seq and scATAC-seq data, across all subtypes except for MPAL (Figure 2J-K). Taken together, our results suggest that an elevated HSPC-like blasts frequency is associated with higher lineage plasticity, supporting a leukemia-initiating role for HSPC-like blasts in ALL. scRNA-seq on 14 PDX models and matched primary samples (supplemental Table 2) indicated close transcriptional similarity of paired primary and PDX samples and maintenance of HSPC-like frequency in matched PDX models (supplemental Figure 5G-H). Further gene expression analysis confirmed maintained expression of stem-cell markers, such as CD34, and lower expression of lineage markers (CD38, CD33, CD7, CD19) in leukemias derived from HSPC-like blasts (supplemental Figure 5I).
In summary, our analysis revealed that HSPC-like blasts are shared across major subtypes of acute leukemia, including AML, ALL, and MPAL. These blasts harbor recurrent mutations and are leukemogenic. They share transcriptomic and epigenomic features with normal HSPCs and are associated with high heterogeneity of developmental arrest states in large patient cohorts.
HSPC-like blasts are resistant to conventional chemotherapy
Since conventional HSPCs and myeloid stem cells are known to display high degrees of self-renewal and resistance to genotoxic stress, we next investigated whether HSPC-like blasts were associated with resistance to chemotherapy across different leukemic subtypes. We first assessed the correlation between the proportion of HSPC-like blasts in single-cell sequenced samples and key clinical biomarkers for risk classification,32-37 including minimal residual disease (MRD) at end of induction therapy (EOI; Figure 3A-C) and genetically determined risk groups (Figure 3B). Our analysis identified a strong positive correlation between HSPC-like blast frequency and MRD level at the EOI in patients with AML (Figure 3A) and T-ALL (Figure 3C). We also observed a correlation between genetically determined risk groups and HSPC-like blast frequency, finding that high-risk infant KMT2A-R B-ALL exhibited the highest frequency of HSPC-like blasts compared with other high-risk pediatric B-ALL subtypes and low-risk ETV6::RUNX138 subtypes (supplemental Table 1; Figure 3B,D; supplemental Figure 6A-B). In addition, we found that HSPC-like fraction was associated with the presence of MRD in patients with B-ALL (supplemental Figure 6C).
HSPC-like leukemic blasts are resistant to chemotherapy in acute leukemias. (A-C) Frequencies of developmental arrest stages of leukemia blasts across leukemia subtypes. Each developmental pseudotime trajectory from HSC/MPP to terminally differentiated population is ordered in 20 bins. Heat maps show the frequencies of each bin from individual patients: AML (n = 10) (A), B-ALL (n = 35) (B), and T-ALL (n = 40) (C). Dashed vertical lines delineate the division between multipotent stem/progenitor-like population and lineage-like populations for each lineage. Bottom line plots show the frequency of each developmental stage along the pseudotime trajectories using HD data: myeloid lineage (A), B-cell lineage (B), and T-cell lineage (C). Legends on the right side show the percentage of MRD for AML and T-ALL or genetic subtypes for B-ALL. (D) Box plots showing the frequencies of HSPC-like blasts in MRD+ vs MRD– patients (AML and T-ALL) or in high-risk vs low-risk subtype (B-ALL) patients in the single-cell data of this study. P values were computed using the Student t test. (E) Association between frequency of HSPC-like/lineage-like blasts and EOI MRD values after induction chemotherapy in AML, B-ALL, and T-ALL based on multiple linear regression (supplemental Methods). Error bars show the 95% confidence interval of regression coefficient. The frequencies of HSPC-like and lineage-like blasts were computed using CIBERSORT and bulk RNA-seq data from the National Cancer Institute’s TARGET (AML, B-ALL) project, and the Children’s Oncology Group AALL0434 trial (T-ALL). (F) Association between frequency of HSPC-like/lineage-like blasts and LC50 of 9 conventional chemotherapy drugs. The color of the dots represents regression coefficients and the size of the dots represents the –log10(adjusted P value of regression coefficient). The frequencies of HSPC-like and lineage-like blasts were computed using CIBERSORT. LC50 data and corresponding bulk RNA-seq data sets were downloaded from Lee et al.39 P values were computed using the Student t test and adjusted for multiple testing using the Benjamini-Hochberg method. Significant associations are indicated with a circle with thicker borders. (G) Drug response curves showing the different response to conventional chemotherapy drugs between HSPC-like derived (n = 2) and lineage-like derived PDX cells (n = 2). (H) Frequencies of HSPC-like blasts from PDX treatment groups based on PDX treatment scRNA-seq data. Sample numbers: dexamethasone (Dex) n = 5; vincristine (VCR) n = 2; untreated controls n = 7. P values were computed using the Student t test. (I) Volcano plot showing the DEGs comparing dexamethasone-treated PDXs and control PDXs based on scRNA-seq data. Red dots represent the genes with adjusted P value <.05 and abs(log2FC) > 0.25; purple dots represent the genes with adjusted P value < .05 only; black dots represent the nonsignificant genes. CLP, common lymphoid progenitor; Ctrl, control; DP, double positive T cells; GMP, granulocyte monocyte progenitor; LC50, lethal concentration 50%; LMPP, lympho-myeloid primed progenitor; MPP, multipotent progenitor; NK, natural killer; TNF, tumor necrosis factor.
HSPC-like leukemic blasts are resistant to chemotherapy in acute leukemias. (A-C) Frequencies of developmental arrest stages of leukemia blasts across leukemia subtypes. Each developmental pseudotime trajectory from HSC/MPP to terminally differentiated population is ordered in 20 bins. Heat maps show the frequencies of each bin from individual patients: AML (n = 10) (A), B-ALL (n = 35) (B), and T-ALL (n = 40) (C). Dashed vertical lines delineate the division between multipotent stem/progenitor-like population and lineage-like populations for each lineage. Bottom line plots show the frequency of each developmental stage along the pseudotime trajectories using HD data: myeloid lineage (A), B-cell lineage (B), and T-cell lineage (C). Legends on the right side show the percentage of MRD for AML and T-ALL or genetic subtypes for B-ALL. (D) Box plots showing the frequencies of HSPC-like blasts in MRD+ vs MRD– patients (AML and T-ALL) or in high-risk vs low-risk subtype (B-ALL) patients in the single-cell data of this study. P values were computed using the Student t test. (E) Association between frequency of HSPC-like/lineage-like blasts and EOI MRD values after induction chemotherapy in AML, B-ALL, and T-ALL based on multiple linear regression (supplemental Methods). Error bars show the 95% confidence interval of regression coefficient. The frequencies of HSPC-like and lineage-like blasts were computed using CIBERSORT and bulk RNA-seq data from the National Cancer Institute’s TARGET (AML, B-ALL) project, and the Children’s Oncology Group AALL0434 trial (T-ALL). (F) Association between frequency of HSPC-like/lineage-like blasts and LC50 of 9 conventional chemotherapy drugs. The color of the dots represents regression coefficients and the size of the dots represents the –log10(adjusted P value of regression coefficient). The frequencies of HSPC-like and lineage-like blasts were computed using CIBERSORT. LC50 data and corresponding bulk RNA-seq data sets were downloaded from Lee et al.39 P values were computed using the Student t test and adjusted for multiple testing using the Benjamini-Hochberg method. Significant associations are indicated with a circle with thicker borders. (G) Drug response curves showing the different response to conventional chemotherapy drugs between HSPC-like derived (n = 2) and lineage-like derived PDX cells (n = 2). (H) Frequencies of HSPC-like blasts from PDX treatment groups based on PDX treatment scRNA-seq data. Sample numbers: dexamethasone (Dex) n = 5; vincristine (VCR) n = 2; untreated controls n = 7. P values were computed using the Student t test. (I) Volcano plot showing the DEGs comparing dexamethasone-treated PDXs and control PDXs based on scRNA-seq data. Red dots represent the genes with adjusted P value <.05 and abs(log2FC) > 0.25; purple dots represent the genes with adjusted P value < .05 only; black dots represent the nonsignificant genes. CLP, common lymphoid progenitor; Ctrl, control; DP, double positive T cells; GMP, granulocyte monocyte progenitor; LC50, lethal concentration 50%; LMPP, lympho-myeloid primed progenitor; MPP, multipotent progenitor; NK, natural killer; TNF, tumor necrosis factor.
To examine the utility of HSPC-like blasts as a biomarker for predicting residual disease, we performed multivariate linear regression using MRD levels and estimated HSPC-like fractions from deconvoluted bulk RNA-seq data, while accounting for key clinical covariates, including white blood cell count, age, and genetic lesions. Our analysis identified large positive correlation between MRD at EOI and HSPC-like abundance, but not lineage-like abundance, in treatment-naive AML, B-ALL, and T-ALL patient cohorts, with statistically significant associations in patients with AML and T-ALL (Figure 3E). To examine other clinical variables interacting with HSPC-like frequency in B-ALL, we assessed the influence of HSPC-like blasts within different B-ALL genetic subtypes in the Total XVI data set.40,41 Notably, subtype analysis indicated varying correlations: KMT2A-R, DUX4, and Ph+ subtypes harbored positive correlations between HSPC-like fraction and MRD levels (supplemental Figure 6D), whereas the ZNF384 subtype, despite its high HSPC-like frequency, did not show a statistically significant association (supplemental Figure 6D). Taken together, our results nominate HSPC-like blasts as a promising biomarker for predicting resistance to conventional chemotherapy.
We next sought to identify specific compounds to which HSPC-like blasts display therapy resistance. Using previously published pharmacogenomic data, we performed linear regression between estimated HSPC-like abundance and sensitivity to 17 clinically used drugs.39 We found that higher HSPC-like abundance was associated with greater resistance to multiple chemotherapy drugs, including cytarabine, prednisolone, vincristine, and thioguanine, whereas lineage-like blasts displayed the opposite trend (Figure 3F). We next performed in vitro compound screening in PDX-expanded ETP-ALL HSPC-like and lineage-like blasts using a library of 6 conventional chemotherapies and 34 leukemia specific targeted therapies. HSPC-like blasts consistently exhibited therapeutic resistance to conventional chemotherapy in comparison with lineage-like blasts (Figure 3G; supplemental Figure 6E). To validate the impact of chemotherapy on HSPC-like fraction in vivo, we treated PDX with dexamethasone and vincristine for 5 weeks and conducted scRNA-seq on the posttreatment human CD45+ leukemic populations. In line with the preferential targeting of cycling cells, vincristine/dexamethasone-treated samples had higher frequency of noncycling cells compared with control-treated PDX (supplemental Figure 6G). Dexamethasone-treated and vincristine-treated PDX also displayed significant enrichment of HSPC-like blasts (Figure 3H; supplemental Figure 6F), and higher expression of stem-cell markers and transcription factors (TFs), including CD63,13,42,SOX4,43 and ZBTB1644 (Figure 3I; supplemental Figure 6H). Our in silico, in vitro, and in vivo results support the selection of resistant HSPC-like blasts following conventional therapy. Moreover, we envision our experimental design to serve as a platform for future in vivo studies, which should be performed on a wide array of PDX models to validate common resistance mechanisms among HSPC-like blasts.
HSPC-like transcriptomic signatures predict therapy response and outcome
Having linked HSPC-like blasts with chemotherapy resistance, we next sought to identify molecular signatures of HSPC-like blasts that were linked with patient outcomes. We first used scRNA-seq data to analyze differential expression between HSPC-like blasts and lineage-like blasts in each subtype, identifying 784 upregulated and 1459 downregulated genes in HSPC-like blasts in at least 1 comparison (supplemental Figure 7A-C; supplemental Table 5). We performed k-means clustering based on fold-change in each subtype, grouping HSPC-like upregulated genes into 9 clusters (Figure 4A, left) and HSPC-like downregulated genes into 5 clusters (Figure 4A, right). In line with their primary clinical diagnosis, we observed that most HSPC-like differentially expressed genes (DEGs) were subtype-specific, with 65.8% of upregulated and 91.1% of downregulated genes observed in only 1 leukemia subtype. However, our analysis also identified 183 genes (clusters 1-2) that were upregulated in HSPC-like blasts in all 3 leukemia subtypes. Among the 19 genes (cluster 1) with the strongest upregulation in HSPC-like blasts were CD34, FAM30A, and SPINK2 (supplemental Figure 7D-E), which have been reported in AML as part of the adult LSC1726 and pediatric LSC627 signatures, as well as pathways involved in ribosome biogenesis, translation and cell adhesion (Figure 4B). Among the moderately upregulated HSPC-like genes (n = 164, cluster 2) were several cell surface genes (FLT3, PROM1, CD9) and HOXA family TFs.
HSPC-like blast transcriptomic signatures predict clinical outcome. (A) Heat maps of DEGs between HSPC-like and lineage-like blasts across leukemia subtypes: upregulated genes in HSPC-like blasts in at least 1 subtype (left); downregulated genes in HSPC-like blasts in at least 1 subtype (right). DEGs were determined by abs(log2[FC]) > 0.25 and adjusted P value < .05. DEGs were pooled and clustered by k-means clustering (k = 9 for left panel, k = 5 for right panel) on the basis of their log2FC. The full list of DEGs is provided in supplemental Table 5. (B) Heat map showing enriched gene ontology terms of each DEG cluster in panel A. Shades of blue represent the –log10(enrichment P value). Selected terms are labeled. (C) Single-cell RNA-seq–derived HSPC-like signatures can stratify MRD levels in bulk RNA-seq cohort. Rows, leukemia subtypes; columns, single-cell derived signatures. The AML cohort was divided into induction failure (MRD > 5%), MRD5 (5% ≥ MRD > 1%), MRD1 (1% ≥ MRD > 0.1%), and MRD⁻ (MRD ≤ 0.1%); B-ALL and T-ALL cohorts were divided into induction failure (MRD > 5%), MRD5 (5% ≥ MRD > 1%), MRD1 (1% ≥ MRD > 0.01%), and MRD- (MRD ≤ 0.01%) on the basis of their day 29 MRD level at the EOI. Dashed boxes highlight the comparison that related to each signature. Sample numbers for each group were indicated. P values were computed using the 2-sided Student t test. The source of AML and B-ALL cohorts: TARGET; T-ALL cohort: AALL0434. (D-F) Pan-HSPC-like signature can stratify patient survival across leukemia subtypes. The source of patient cohort data and case numbers are indicated at the top of each plot. The P values were calculated using the log-likelihood statistic from the Cox proportional hazards test with central nervous system status and white blood cell count included as covariates. BCR, B-cell receptor; OS, overall survival; TCR, T-cell receptor.
HSPC-like blast transcriptomic signatures predict clinical outcome. (A) Heat maps of DEGs between HSPC-like and lineage-like blasts across leukemia subtypes: upregulated genes in HSPC-like blasts in at least 1 subtype (left); downregulated genes in HSPC-like blasts in at least 1 subtype (right). DEGs were determined by abs(log2[FC]) > 0.25 and adjusted P value < .05. DEGs were pooled and clustered by k-means clustering (k = 9 for left panel, k = 5 for right panel) on the basis of their log2FC. The full list of DEGs is provided in supplemental Table 5. (B) Heat map showing enriched gene ontology terms of each DEG cluster in panel A. Shades of blue represent the –log10(enrichment P value). Selected terms are labeled. (C) Single-cell RNA-seq–derived HSPC-like signatures can stratify MRD levels in bulk RNA-seq cohort. Rows, leukemia subtypes; columns, single-cell derived signatures. The AML cohort was divided into induction failure (MRD > 5%), MRD5 (5% ≥ MRD > 1%), MRD1 (1% ≥ MRD > 0.1%), and MRD⁻ (MRD ≤ 0.1%); B-ALL and T-ALL cohorts were divided into induction failure (MRD > 5%), MRD5 (5% ≥ MRD > 1%), MRD1 (1% ≥ MRD > 0.01%), and MRD- (MRD ≤ 0.01%) on the basis of their day 29 MRD level at the EOI. Dashed boxes highlight the comparison that related to each signature. Sample numbers for each group were indicated. P values were computed using the 2-sided Student t test. The source of AML and B-ALL cohorts: TARGET; T-ALL cohort: AALL0434. (D-F) Pan-HSPC-like signature can stratify patient survival across leukemia subtypes. The source of patient cohort data and case numbers are indicated at the top of each plot. The P values were calculated using the log-likelihood statistic from the Cox proportional hazards test with central nervous system status and white blood cell count included as covariates. BCR, B-cell receptor; OS, overall survival; TCR, T-cell receptor.
To assess the prognostic utility of HSPC-like transcriptomic signatures, we applied 4 scRNA-seq–derived HSPC-like signatures to stratify bulk RNA-sequenced clinical cohorts. We used universal HSPC-like signatures (upregulated cluster 1 and downregulated cluster 1) and subtype-specific HSPC-like signatures (AML: upregulated clusters 3-6 and downregulated clusters 2 and 3; B-ALL: upregulated clusters 7 and 8 and downregulated cluster 4; T-ALL: upregulated cluster 9 and downregulated cluster 5) to risk-stratify patients with AML, B-ALL, and T-ALL. We found that universal HSPC-like signatures scores robustly predict MRD status across all subtypes, whereas subtype-specific scores predict the MRD status only in their corresponding leukemia subtypes (supplemental Figure 7F). This observation is further emphasized when treating MRD as a continuum and stratifying according to MRD level (Figure 4C). Additionally, both panleukemia and AML-specific HSPC-like scores effectively stratify EOI cycle 2 MRD levels in the AML TARGET cohort, a critical determinant of AML disease progression (supplemental Figure 7G). We next applied our universal HSPC-like signature to survival analysis in multiple public bulk RNA-seq cohorts (supplemental Table 1), finding that our pan-HSPC-like signature robustly stratifies overall survival in AML, B-ALL, and T-ALL patient cohorts (Figure 4D-F). In line with the utility of subtype-specific HSPC-like signatures within each leukemia, we found that subtype-specific HSPC signatures also stratify long-term patient outcomes in the corresponding leukemia subtypes (supplemental Figure 7H-I). Collectively, our results identify, to our knowledge, novel panleukemia and subtype-specific transcriptomic signatures that robustly associate with clinical outcomes.
Core transcriptional regulatory circuitries of HSPC-like blasts
Because aberrant activation of TFs represents a unifying genetic event in acute leukemias, we hypothesized that HSPC-like blasts could co-opt hematopoietic TFs to maintain their stem cell–like phenotype. We thus sought to identify the transcriptional regulatory program in HSPC-like blasts by constructing transcriptional regulatory networks (TRNs). Through integration of scRNA-seq and scATAC-seq data, we identified a total of 125 TFs with increased activity in HSPC-like blasts compared with lineage-like blasts in at least 1 leukemia subtype (supplemental Figure 8A-B; supplemental Table 6).
We then aimed to define a common core TRN that operates in HSPC-like blasts across subtypes. We first identified 6 commonly upregulated TFs across leukemia subtypes, which included HOXA3/5/9, AP-1, and CEBPA factors, which were predicted to regulate hematopoiesis and cell proliferation (Figure 5A). We observed that only a small fraction of TF targets was shared across 3 leukemia subtypes (Figure 5B-C). We therefore defined a universal HSPC-like TRN as the 6 TFs and 45 targets that are shared across all leukemia subtypes (Figure 5D; supplemental Table 7). Among the 45 target genes of the core TRN were the most highly expressed stem-cell genes identified previously, including CD34, CD44, SPINK2, BAALC, and FAM30A (Figure 5D). Overall, the core TRN is enriched for functions in ribosome biogenesis, stem-cell differentiation, and tumor necrosis factor/NF-κB signaling (Figure 5E). Interestingly, within identical target genes we found evidence of subtype-specific enhancer accessibility, suggesting that perturbations to these core TFs may also lead to subtype-specific transcriptional changes (supplemental Figures 8C and 9A). To further explore this hypothesis, we performed in silico perturbation analysis using SCENIC+45 to computationally knock out each of the 6 TFs in HSPC-like blasts. After in silico knockout, we observed a significant reduction in the TRN signature across all 3 leukemia types (Figure 5F; supplemental Figure 9C-D). To refine these findings, we intersected the DEGs from the perturbation analysis with the original regulons for each TF. The analysis revealed that overlapping target genes were significantly differentially expressed in HSPC-like blasts compared with lineage-like blasts in all 6 TFs across various leukemias (supplemental Figure 9E-G).
Core transcriptional regulatory circuitries of HSPC-like blasts. (A) Heat map showing enriched gene ontology terms of each regulon in the shared TRN (see “Methods”). Shades of blue represent –log10(enrichment P value). Selected terms are labeled. (B) Bar plots showing the numbers of shared and leukemia subtype–specific targets of each core TF. (C) Box plots showing log2 expression fold-change of shared and subtype-specific targets. Red dashed lines indicate log2FC = 0. (D) Core HSPC-like TRN shared across leukemia subtypes. All TFs and targets are shared across leukemia subtypes. Node color represents average log2FC between HSPC-like and lineage-like blasts across leukemia subtypes. (E) Enriched gene ontology terms of shared targets. (F) Core TRN signature in baseline and in silico TF KO matrices in AML HSPC-like blasts using scRNA-seq in this study (B-ALL results in supplemental Figure 9C; T-ALL results in supplemental Figure 9D). The colored boxes represent baseline signatures; gray boxes represent in silico TF KO signatures. P values were computed using the Student t test. (G) Core TRN signature in paired pediatric T-ALL bulk RNA-seq data set from X01.19 P values were computed using a paired t test. (H) Core TRN signature for the PDX treatment scRNA-seq data set in this study. P values were computed using the 2-sided Student t test. (I) Core TRN signature stratifies MRD level in the bulk RNA-seq cohort. Each cohort was classified as induction failure (≥5%), MRD5 (≥1% and <5%), MRD1 (≥0.01%), and MRD⁻ on the basis of day 29 MRD level at the EOI. The source of AML and B-ALL cohorts: TARGET; the source of 2 T-ALL cohorts: X01 and TARGET. P values were computed using the Student t test and indicated on the top of each box plot. Red dashed lines indicate a TRN score of 0 on the y-axis. (J) TRN signature stratifies patient survival across leukemia subtypes. The source of patient cohort data is indicated at the top of each plot. The P values were calculated using the log-likelihood statistic from the Cox proportional hazards test with central nervous system status and white blood cell count as covariates. Dex, dexamethasone; IF, induction failure; KO, knockout; OS, overall survival; TNF, tumor necrosis factor; TRN, transcriptional regulatory network; Vinc, vincristine.
Core transcriptional regulatory circuitries of HSPC-like blasts. (A) Heat map showing enriched gene ontology terms of each regulon in the shared TRN (see “Methods”). Shades of blue represent –log10(enrichment P value). Selected terms are labeled. (B) Bar plots showing the numbers of shared and leukemia subtype–specific targets of each core TF. (C) Box plots showing log2 expression fold-change of shared and subtype-specific targets. Red dashed lines indicate log2FC = 0. (D) Core HSPC-like TRN shared across leukemia subtypes. All TFs and targets are shared across leukemia subtypes. Node color represents average log2FC between HSPC-like and lineage-like blasts across leukemia subtypes. (E) Enriched gene ontology terms of shared targets. (F) Core TRN signature in baseline and in silico TF KO matrices in AML HSPC-like blasts using scRNA-seq in this study (B-ALL results in supplemental Figure 9C; T-ALL results in supplemental Figure 9D). The colored boxes represent baseline signatures; gray boxes represent in silico TF KO signatures. P values were computed using the Student t test. (G) Core TRN signature in paired pediatric T-ALL bulk RNA-seq data set from X01.19 P values were computed using a paired t test. (H) Core TRN signature for the PDX treatment scRNA-seq data set in this study. P values were computed using the 2-sided Student t test. (I) Core TRN signature stratifies MRD level in the bulk RNA-seq cohort. Each cohort was classified as induction failure (≥5%), MRD5 (≥1% and <5%), MRD1 (≥0.01%), and MRD⁻ on the basis of day 29 MRD level at the EOI. The source of AML and B-ALL cohorts: TARGET; the source of 2 T-ALL cohorts: X01 and TARGET. P values were computed using the Student t test and indicated on the top of each box plot. Red dashed lines indicate a TRN score of 0 on the y-axis. (J) TRN signature stratifies patient survival across leukemia subtypes. The source of patient cohort data is indicated at the top of each plot. The P values were calculated using the log-likelihood statistic from the Cox proportional hazards test with central nervous system status and white blood cell count as covariates. Dex, dexamethasone; IF, induction failure; KO, knockout; OS, overall survival; TNF, tumor necrosis factor; TRN, transcriptional regulatory network; Vinc, vincristine.
Next, we asked whether the core HSPC TRN signature was upregulated after therapy and after relapse. Within patient samples, we observed a significant increase in the core TRN signature score in AALL0434 patients who relapsed (Figure 5G). In treated PDX models, we also observed enrichment of the HSPC-like TRN following vincristine and dexamethasone, but not control treatment (Figure 5H). In line with these analyses, we found that the core HSPC-like signature predicts MRD across all leukemia subtypes (Figure 5I) and is associated with overall survival across leukemia subtypes (Figure 5J). A potential confounder in our analyses is the enrichment of patients with KMT2A-R B-ALL in our single-cell cohort, which imparts upregulation of the HOXA gene family. We analyzed the expression levels and TF activities of HOXA3/5/9 and found that expression and TF activity of all 3 HOXA factors were robustly upregulated compared with lineage-like blasts in both KMT2A-R and non-KMT2A-R patients (supplemental Figure 9H-I). More importantly, the prognostic potential of our core TRN signature remains robust even after excluding KMT2A-R patients from the TARGET data set (supplemental Figure 9J). Taken together, our analyses highlight a core TRN involved in the regulation of target genes in HSPC-like blasts across leukemic subtypes that predict early clinical response and long-term outcome following therapy.
In silico screening of drug targets against HSPC-like blasts
Having identified HSPC-like blasts as a putative therapeutic resistance mechanism, we next sought to identify novel drug targets that could be leveraged for targeted subpopulation-directed therapy. We devised an in silico screening strategy that intersects HSPC-like DEGs with public drug target and functional genomics databases (Figure 6A).46-49 We first identified genes specifically upregulated in HSPC-like blasts compared with lineage-like blasts in each subtype (Figure 4A). To avoid targeting normal HSPCs in patients, we removed genes that are upregulated in the HSPC-like blasts compared with normal HSPCs (Figure 6A; supplemental Figure 10A-C), resulting in 672 candidate target genes among the 3 subtypes (Figure 6B-E; supplemental Figure 10D; supplemental Table 8). This gene set consists of some well-known drug targets in cancer, such as CD4750 and CD9951 in the AML DEG set (Figure 6B); TGFB152 and PROM153 in the B-ALL DEG set (Figure 6C); LGALS154-56 and ITGA457 in the T-ALL DEG set (Figure 6D); and CD4458 in all 3 DEG sets. We then ranked each DEG according to its magnitude of expression change, targetability, and gene essentiality (supplemental Table 8). Our top 30 candidate targets included cell surface proteins (FLT3, CD44, KIT, CD47, CD63, and CD34), homeostatic enzymes (S100A4, CASP3, and BCL2), and signal transduction kinases (CDK6 and PIK3R1) (Figure 6D; supplemental Table 8). FLT3 and BCL2 were nominated as the most highly differentially expressed targets in HSPC-like blasts (Figure 6F). Interestingly, the PI3K inhibitor was predicted as one of the most effective HSPC-like-targeting drugs across leukemia subtypes by the Library of Integrated Network-Based Cellular Signatures database (supplemental Figure 10E). In line with BCL2 being a highly ranked target gene, we found HSPC-like frequency to be significantly associated with venetoclax sensitivity in previously pharmacogenomically profiled T-ALL samples (Figure 6G).
Identification of drug targets against HSPC-like blasts. (A) Schema for in silico drug screening. DEG 1: DEGs from comparison between HSPC-like and lineage-like blasts; DEG 2: DEGs from comparison between HSPC-like blasts and normal HSPCs. (B-D) Scatterplots showing the log2FC of DEGs based on scRNA-seq data in this study for AML (B), B-ALL (C), and T-ALL (D). (E) Overall scores of top 30 genes with known drugs based on the in silico screening. The type of supporting evidence is color-coded. (F) DepMap essentiality scores for BCL2, FLT3, and PIK3R1 in leukemia cell lines and nonleukemia cell lines, respectively. P values were computed using the Student t test. (G) Dot plot of association between subpopulation frequencies and IC50 values. The color of the dots represents regression coefficients and the size of the dots represents the –log10(adjusted P value of regression coefficient). The frequencies of HSPC-like and lineage-like blasts were computed using CIBERSORT. LC50 data and corresponding bulk RNA-seq data sets were downloaded from Lee et al.39 P values were computed using the Student t test and adjusted for multiple testing using the Benjamini-Hochberg method. Significant associations are highlighted with circles with thicker borders. (H) Drug response curves showing the different response to FLT3i (sorafenib) and PI3Ki (buparlisib) between HSPC-like derived (n = 2) and lineage-like-derived (n = 2) PDX cells. DE, differentially expressed genes; IC50, 50% inhibitory concentration;LC50, lethal concentration 50%; n.s., not significant; TTD, Therapeutic Target Database. The logos at the bottom of panel A are reused under Creative Commons licenses.
Identification of drug targets against HSPC-like blasts. (A) Schema for in silico drug screening. DEG 1: DEGs from comparison between HSPC-like and lineage-like blasts; DEG 2: DEGs from comparison between HSPC-like blasts and normal HSPCs. (B-D) Scatterplots showing the log2FC of DEGs based on scRNA-seq data in this study for AML (B), B-ALL (C), and T-ALL (D). (E) Overall scores of top 30 genes with known drugs based on the in silico screening. The type of supporting evidence is color-coded. (F) DepMap essentiality scores for BCL2, FLT3, and PIK3R1 in leukemia cell lines and nonleukemia cell lines, respectively. P values were computed using the Student t test. (G) Dot plot of association between subpopulation frequencies and IC50 values. The color of the dots represents regression coefficients and the size of the dots represents the –log10(adjusted P value of regression coefficient). The frequencies of HSPC-like and lineage-like blasts were computed using CIBERSORT. LC50 data and corresponding bulk RNA-seq data sets were downloaded from Lee et al.39 P values were computed using the Student t test and adjusted for multiple testing using the Benjamini-Hochberg method. Significant associations are highlighted with circles with thicker borders. (H) Drug response curves showing the different response to FLT3i (sorafenib) and PI3Ki (buparlisib) between HSPC-like derived (n = 2) and lineage-like-derived (n = 2) PDX cells. DE, differentially expressed genes; IC50, 50% inhibitory concentration;LC50, lethal concentration 50%; n.s., not significant; TTD, Therapeutic Target Database. The logos at the bottom of panel A are reused under Creative Commons licenses.
We next performed in vitro testing of 38 drug/target pairs in HSPC-like and lineage-like blasts (supplemental Table 9) that were initially isolated from PAVYVY (refractory ETP-ALL) and expanded in PDX models. Whereas lineage-like blasts were more sensitive to cytotoxic drugs (Figure 3G), HSPC-like blasts were more sensitive to 11 targeted agents, including an FLT3 inhibitor (sorafenib), pan-PI3K inhibitor (buparlisib), PI3K/AKT inhibitor (ipatasertib), JAK inhibitor (momelotinib), IAP inhibitor (birinapant), BET inhibitor (JQ1), nuclear export inhibitor (selinexor), NF-κB/proteasome inhibitor (bortezomib), CDK4/6 inhibitor (palbociclib), and a rescue chemotherapeutic agent (nelarabine; Figure 6H; supplemental Figure 10F). Overall, 6 of the 11 drugs with higher potency against HSPC-like blasts targeted genes that overlapped with our top 30 in silico screening candidates (sorafenib, venetoclax, buparlisib, ipatasertib, bortezomib, and palbociclib). Interestingly, testing of apoptosis-inducing agents displayed a unique trend between HSPC-like and lineage-like subsets. The BCL2 inhibitor venetoclax was more effective against lineage-like blasts at low concentrations, whereas at higher concentrations, HSPC-like blasts became more sensitive (supplemental Figure 10G, left panel). S63845, an MCL1 inhibitor, was more effective in lineage-like blasts across all concentrations tested (supplemental Figure 10G, right panel). The therapeutic effects of these 2 agents could be explained by the divergent expression patterns of BCL2 and MCL1 along T-cell differentiation (supplemental Figure 10H). Taken together, our in silico, in vitro, and in vivo results provide a framework for the molecular identification of HSPC-like blasts in acute leukemias and, to our knowledge, represent the first preclinical testing of targeted compounds against this high-risk subpopulation. Given that the bulk of our in silico predictions were validated through in vitro approaches using PDX-expanded blasts, further in vivo studies should be undertaken to advance our nominated therapeutics toward clinical testing.
Discussion
Our study presents a comprehensive atlas of intratumoral developmental arrest within pediatric acute leukemia, capturing transcriptional and epigenetic profiles of ∼1.4 million cells from primary acute leukemias. We computationally matched leukemic blasts to nonmalignant counterparts, adding subpopulation-level resolution to bulk-genomic initiatives in T-ALL, AML, and B-ALL.20,59,60 We identified a significant overlap in developmental arrest spectra across leukemias arising from different hematopoietic lineages, reflecting, to our knowledge, a previously unrecognized degree of lineage plasticity. We characterized a subpopulation of HSPC-like blasts with a universal transcriptomic signature found across all leukemia subtypes and identified a robust association with initial treatment response and poor outcome. Multiple reports have also converged on similar stem cell–like phenotypes, with one recently reporting an inflammatory, SPINK2+/PRSS57+ HSPC-like population in refractory T-ALL and another isolating noncanonical ZBTB16+ blasts in MRD+ T-ALL.61 Expression of ZBTB16 is highly upregulated in megakaryocyte-erythroid progenitor–like T-ALL subpopulations discovered in our atlas, and is restricted to megakaryocyte-erythroid progenitor–lineage cells in our healthy donor reference map. Further characterization of the lineage-aberrant blasts in our pediatric leukemia atlas is warranted and has potential to reveal new mechanisms of subpopulation-specific therapy resistance.
It is interesting to speculate whether HSPC-like blasts across leukemic subtypes represent transformation of a common ancestor stem cell. Shared transcriptional features, core TFs, and therapeutic responses between myeloid and lymphoid HSPC-like cells support this hypothesis. Lineage tracing in MPAL samples suggests that initial mutations occur very early in hematopoietic hierarchy.22 Further studies should apply lineage tracing to clarify the cell-of-origin question and genetic differences between lineage-like and HSPC-like blasts.
Subsequently, our in silico drug screening identified and validated the roles of known and novel genes and candidate targets, including multiple TFs (supplemental Figure 11A-C). Although TFs are challenging to target, emerging studies indicate their potential therapeutic value.62 This is particularly noteworthy given our identification of core hematopoietic TFs (JUNB/FOSB, CEBPA, and HOXA3/5/9) that regulate stem-like genes in HSPC-like blasts. Emerging evidence indicates that HOXA cluster activation can be targeted using menin inhibitors, with a primary mechanism of causing blast differentiation into a more mature “low-risk” cell state. Further studies should explore the feasibility of targeting TF networks to reverse the deregulated transcriptional program in HSPC-like blasts. In addition, future studies should leverage the wide array of genomically characterized PDX models that our group has recently published to expand our understanding of the in vivo efficacy of our genomically identified, in vitro tested targeted therapeutics.
In conclusion, our integrative analysis reveals a prevalent HSPC-like population in acute leukemias that contributes to poor drug response and unfavorable survival outcome. By validating these findings through in silico, in vivo, and in vitro experiments, our study integrates computational predictions with functional characterization, setting the stage for future translational efforts.
Acknowledgments
The authors thank the Children’s Hospital of Philadelphia Flow Cytometry Core, High-Throughput Sequencing Core, Small Animal Imaging Facility Core, and Research Information Services for their technical support. The authors are grateful to all the patients and their families who volunteered to participate in this study.
This work was supported by the National Cancer Institute (NCI), National Institutes of Health (NIH) Human Tumor Atlas Network grant U2C CA233285 (K.T. and S.P.H.). C.C. was supported by the Chinese Academy of Medical Sciences Innovation Funds for Medical Sciences 2023-I2M-2-007 and 2024-I2M-3-001. Additional support includes grants from the Alex’s Lemonade Stand Foundation (K.T. and D.T.T.), Gabriella Miller Kids First X01HD100702 (D.T.T., C.G.M., P.P., M.L.L., S.P.H., E.A.R., B.L.W., S.B.P., and J.J.Y.), the Leukemia & Lymphoma Society (D.T.T.), Hyundai Hope On Wheels (D.T.T. and K.T.), NCI, NIH grants R03CA256550 (D.T.T. and C.G.M.), R01CA193776 (D.T.T. and B.L.W.) and R01CA264837 (D.T.T. and J.J.Y.); National Heart Lung and Blood Institute, NIH U54HL165442 (K.T.), NCI, NIH F30-CA-268782 (J.X.) and F30-CA-277965 (S.B.), the Children’s Oncology Group (U10CA180886, U10CA180899, U24CA114766, U24CA196173), and the NIH Medical Scientist Training Program (T32 GM07170). K.T. is the Richard and Sheila Sanford Endowed Chair at Children’s Hospital of Philadelphia. D.T.T. is the Garrett Brodeur, MD Endowed Chair in Oncology at the Children’s Hospital of Philadelphia. S.P.H. is the Jeffrey E. Perelman Distinguished Chair in Pediatrics at the Children’s Hospital of Philadelphia.
Authorship
Contribution: K.T. and D.T.T. conceived the study; C.C. designed and performed experiments, analyzed and interpreted data, and wrote the manuscript; J.X. performed additional data analysis and wrote the manuscript; T.V., S.Y., Y.-y.D., and C.-h.C. performed additional experiments; W.Y., J.S.T., E.Y.L., S.B., P.P., H.N., J.H.S., F.A., Y.-y.D., A.Y., X.Q., J.P., B.L.W., J.H., R.S., A.D.H., C.D., L.U., G.S., T.R., T.F., M.L.L., E.A.R., S.P.H., S.B.P., C.G.M., D.F., J.J.Y., and K.M.B. performed additional data analysis and interpretation; K.T. and D.T.T. supervised the study, designed experiments, interpreted data, and wrote the manuscript; and all authors approved the final version of the manuscript.
Conflict-of-interest disclosure: D.T.T. received research funding from Beam Therapeutics and NeoImmuneTech; serves on advisory boards for Beam Therapeutics, Janssen, Servier, Sobi, and Jazz Pharmaceuticals; and has multiple patents pending on chimeric antigen receptor T-cell therapy. S.P.H. has received consulting fees from Novartis; has received honoraria from Jazz Pharmaceuticals and Servier; and owns common stock in Amgen. The remaining authors declare no competing financial interests.
Correspondence: Kai Tan, 3501 Civic Center Blvd, CTRB 4004, Philadelphia, PA 19104; email: tank1@chop.edu; and David T. Teachey, 3501 Civic Center Blvd, CTRB 3008, Philadelphia, PA 19104; email: teacheyd@chop.edu.
References
Author notes
C.C. and J.X. contributed equally to this study
D.T.T. and K.T. jointly supervised this study.
Data from this study have been deposited at the Human Tumor Atlas Network (HTAN) data portal: https://data.humantumoratlas.org/. The linkage between HTAN IDs and sample IDs is provided in supplemental Table 2. All code used for the main analyses in this study is available at https://github.com/tanlabcode/PanLeukemia.
Any other code used to create the figures is available on reasonable request from the corresponding authors, Kai Tan (tank1@chop.edu) and David T. Teachey (teacheyd@chop.edu).
The online version of this article contains a data supplement.
There is a Blood Commentary on this article in this issue.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
This feature is available to Subscribers Only
Sign In or Create an Account Close Modal