Abstract
Background: Acute myeloid leukemia (AML) is characterized by marked molecular heterogeneity, which critically dictates risk stratification and prognostic assessment. Integrating genomic, transcriptomic, and clinical data yields a more comprehensive view of AML disease biology and enhances survival prediction. However, contemporary deep-learning prognostic models face two common obstacles: strong correlations across omics modalities inject redundant information and limit model generalizability, and variability in data quality and distribution across omics layers further complicates effective integration.
Methods: To address the challenges of intra- and inter-modal redundancy, limited feature discriminability, and the identification of survival-associated molecular signatures in AML, we propose a novel multi-omics survival prediction model, termed Acute Myeloid Leukemia Survival Prediction (AMLSP). We used a cohort of 1,160 AML patients assembled from multiple publicly available datasets, including TCGA (n = 132), TARGET (n = 228), and two OHSU studies (OHSU_2018: n = 295; OHSU_2022: n = 505), retaining only samples with matched gene expression, mutation, and clinical data. For each patient, the model integrates gene mutations and gene expression profiles to output a continuous prognostic risk score. To reduce intra- and inter-modal redundancy, we employ shared/private encoders with an orthogonal loss, together with cross-modal alignment via contrastive and alignment losses. A MOPIB module identifies survival-relevant features while suppressing noise. Gated and self-attention mechanisms enhance intra- and inter-modal fusion. A composite loss balances prediction accuracy and regularization. Model performance is assessed via C-index, log-rank P-values, and t-SNE/UMAP-based feature visualization.
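As an illustration of the main evaluation metric, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the AMLSP evaluation code: the function name `concordance_index`, its arguments, and the toy data are assumptions made for this example, and ties in survival time are simply treated as non-comparable.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score (higher = worse expected prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # follow-up time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk, shorter survival
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data, not taken from any of the AML cohorts.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.1, 0.6, 0.4]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk, which is the scale on which the C-index values in the Results are reported.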
Results: Our AMLSP model achieved strong and consistent performance across four publicly available AML datasets, yielding an average concordance index (C-index) of 0.8956 (mean ± SD: 0.893 ± 0.021, 0.881 ± 0.031, 0.881 ± 0.017, and 0.928 ± 0.021), demonstrating its robustness in survival prediction based on multi-omics data (gene expression and mutation). Compared to existing methods, our model achieved superior performance by addressing intra- and inter-modal redundancy through a unified framework integrating private/shared encoders, feature alignment, cross-modal attention, and a prototype bottleneck, resulting in significantly improved multi-omics fusion and the highest C-index (0.928) on TCGA_AML. Qualitative analyses using t-SNE and UMAP revealed that patients with different survival outcomes or risk levels formed distinct clusters in the latent feature space, indicating that the learned representations capture biologically meaningful patterns. Kaplan–Meier analyses confirmed significant separations between high- and low-risk groups (p < 0.05), supporting the clinical utility of the risk stratification. Gradient-based attribution further identified prognostic genes, including FLT3, consistent with established AML literature, while also highlighting previously unreported candidates whose expression or mutational profiles may be linked to patient prognosis and thus merit further investigation (e.g., CRYGD, LAMC1, and LAMA1). Pathway enrichment analyses showed recurrent involvement of FLT3 signaling, reinforcing its central role in AML pathogenesis and supporting the clinical application of FLT3 inhibitors (e.g., midostaurin, gilteritinib) for precision therapy.
Additionally, frequent enrichment of insulin/IGF1R signaling pathways suggests an auxiliary role of metabolic signaling in AML regulation, whereas dataset-specific pathways, such as transcription factor regulation and post-translational modification processes in TCGA_AML, reflect heterogeneity in sample sources and profiling platforms. These findings not only validate the biological relevance of AMLSP but also provide mechanistic insights and potential therapeutic targets, offering a foundation for future strategies aimed at optimizing targeted and combination therapies to enhance treatment efficacy and improve patient outcomes.
Conclusions: We propose AMLSP, a framework for AML prognosis prediction that reduces redundancy by systematically decoupling and fusing multi-omics information, thereby significantly improving both the accuracy and the biological interpretability of AML survival prediction.