Abstract
Objectives: Immunophenotyping of acute myeloid leukemia (AML) currently relies heavily on a limited panel of validated surface markers. However, the narrow repertoire of these markers, the technical complexity of multiparameter flow cytometry, and the inherent subjectivity of data interpretation collectively contribute to substantial biases in AML classification. Enhancing the accuracy and efficiency of AML immunophenotyping remains a significant challenge. To overcome these limitations, we performed large-scale surfaceome profiling of 124 primary AML samples accompanied by matched multi-omics datasets. Our objective was to develop a machine learning–based surfaceome classification system for AML and to validate it using comprehensive multi-omics analysis.
Methods: Membrane proteins were enriched from bone marrow mononuclear cells (BMMCs) of 124 primary AML patients and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Multi-omics profiling, including transcriptomics, proteomics, and metabolomics, was concurrently performed on BMMCs from 79 of these patients. Clinical data were incorporated, including flow cytometry results and targeted capture sequencing. Non-negative matrix factorization (NMF) was used for unsupervised clustering, and linear discriminant analysis (LDA) was used for machine learning-based classification. Multi-omics factor analysis (MOFA) was employed to integrate the diverse omics datasets and evaluate the interrelationships among them.
Results: Through AML surfaceome profiling, we identified 2,764 membrane proteins based on integrated annotations from five public databases. The mean protein intensities of well-established surface markers, including CD33 and CD123, in our surfaceome data closely aligned with flow cytometry measurements, underscoring the potential of surfaceome profiling to enhance AML immunophenotyping. Using NMF-based unsupervised clustering, we identified seven distinct surfaceome subgroups (G1-G7). To assess their discriminability, we trained an LDA-based machine learning classifier, which achieved a classification accuracy of 95.97%, indicating strong discriminatory power among the defined surfaceome subgroups.
The machine learning-based surfaceome classification showed partial overlap with the genetic classification system. G1 was mainly composed of patients with CBFβ::MYH11 fusion (11/18, 61%), G3 was enriched for RUNX1::RUNX1T1 fusion-positive cases (12/15, 80%), and G6 predominantly included patients with CEBPA bZIP in-frame mutations (16/20, 80%). Notably, some surfaceome subgroups did not align clearly with genetic classifications. NPM1-mutated patients were distributed across two distinct subgroups, G4 and G7, highlighting the heterogeneity at the surfaceome level despite shared genetic alterations.
To further dissect this heterogeneity, we conducted an integrated multi-omics analysis. Transcriptomic and proteomic analyses uncovered subtype-specific differentiation blocks: G4 was primarily arrested at a hematopoietic stem cell (HSC)-like stage, marked by elevated expression of stemness-associated markers such as MSI2 and SOX4, whereas G7 was blocked at a monocyte-like stage, with high levels of monocytic markers including CD14 and FCN1. At the metabolomic level, G4 exhibited elevated antioxidant metabolites such as glutathione, indicative of leukemic stem cell–like metabolic characteristics, whereas G7 demonstrated increased levels of acylcarnitines, reflecting enhanced fatty acid metabolic activity. These findings suggest that the heterogeneity observed in our surfaceome subgroups is driven not only by genetic mutations but also by differentiation hierarchies and metabolic patterns.
Conclusions: We established a previously unreported machine learning–based surfaceome classification system for AML. Comprehensive multi-omics analysis indicates that the heterogeneity of surfaceome-defined AML subgroups is shaped by an interplay among genetic mutations, differentiation stages, and metabolic rewiring. These results suggest that surfaceome profiling offers a more comprehensive framework for AML immunophenotyping than traditional flow cytometry. Moreover, our study underscores the potential of machine learning and artificial intelligence to advance AML classification and pave the way for more personalized cancer treatment.
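Illustrative sketch: the two analysis steps named in Methods, NMF-based unsupervised clustering followed by an LDA classifier, can be outlined as below. This is a minimal sketch under stated assumptions, not the study's pipeline. The data are synthetic, the function name `cluster_and_classify` is hypothetical, each sample is assigned to its dominant NMF factor (a simplification of whatever clustering procedure was actually used), and accuracy is estimated by 3-fold cross-validation, which is an assumption because the abstract does not state the evaluation scheme.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_predict


def cluster_and_classify(X, n_groups, seed=0):
    """Cluster samples with NMF, then test how separable the clusters are with LDA.

    X is a non-negative (samples x proteins) intensity matrix.
    Returns the cluster label per sample and the cross-validated LDA accuracy.
    """
    # Step 1: factorize X ~ W @ H and assign each sample to its dominant factor.
    nmf = NMF(n_components=n_groups, init="nndsvda", random_state=seed, max_iter=500)
    W = nmf.fit_transform(X)
    labels = W.argmax(axis=1)

    # Step 2: train LDA on the cluster labels and score held-out predictions.
    # (3-fold cross-validation is an assumption made for this sketch.)
    lda = LinearDiscriminantAnalysis()
    predicted = cross_val_predict(lda, X, labels, cv=3)
    accuracy = float((predicted == labels).mean())
    return labels, accuracy


# Synthetic stand-in for a surfaceome matrix: 60 samples x 12 proteins,
# three groups of 20 samples, each with its own block of 4 elevated proteins.
rng = np.random.default_rng(0)
X = rng.random((60, 12)) * 0.5
for g in range(3):
    X[g * 20:(g + 1) * 20, g * 4:(g + 1) * 4] += 5.0

labels, accuracy = cluster_and_classify(X, n_groups=3)
print("samples per cluster:", np.bincount(labels, minlength=3))
print("cross-validated LDA accuracy:", accuracy)
```

Because the classifier is trained on labels derived from the same data, the accuracy in a sketch like this measures how cleanly the clusters separate in feature space, not how well the model predicts an independent clinical label.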