Patients with acquired aplastic anemia (AA) treated with immunosuppressive therapy (IST) face up to a 20% long-term risk of developing secondary myeloid neoplasms (sMNs), including acute myeloid leukemia and myelodysplastic syndromes. Although hematopoietic stem cell transplantation (HSCT) is curative and prevents sMNs, older patients and those lacking suitable donors have historically received IST as first-line therapy. Recent improvements in HSCT outcomes have expanded transplant eligibility, highlighting the need for tools to better identify patients at high risk for sMN. Validated predictive models could help guide early HSCT consideration or tailor surveillance strategies. We developed 2 binary machine learning models to predict sMN development in patients with acquired AA at clinically relevant time points: diagnosis (model 1) and 6 months after IST response (model 2). We analyzed data from 275 adult patients with AA treated at University of Texas Southwestern, Cleveland Clinic, and the Hospital of the University of Pennsylvania between 1975 and 2023. Seventy-nine clinical variables were collected, including demographics, somatic mutations, and treatment response. Neural networks were trained with leave-1-out crossvalidation. Both models achieved strong performance (area under the curve, 0.82; sensitivity, 0.82; specificity, 0.73). Shared key predictors included DNMT3A mutation, CUX1 mutation, total mutation count, and age. TET2 mutation was specific to model 1; paroxysmal nocturnal hemoglobinuria clone presence was unique to model 2. High-risk classification was significantly associated with worse overall survival (P < .0001). These findings support the feasibility of machine learning–based sMN risk prediction in AA. With training on larger data sets and external validation, these models may support individualized decision-making around HSCT and post-IST surveillance.

Introduction

Acquired aplastic anemia (AA) is a rare and life-threatening disorder characterized by immune-mediated destruction of hematopoietic stem and progenitor cells.¹ A major long-term complication of AA is the development of secondary myeloid neoplasms (sMNs), such as acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS), which account for a substantial proportion of treatment-related mortality. Among patients receiving immunosuppressive therapy (IST), ∼15% to 20% will eventually experience malignant transformation.^2-5

The standard frontline therapies for severe aplastic anemia (SAA) include IST and hematopoietic stem cell transplantation (HSCT). IST, typically combining horse antithymocyte globulin and cyclosporine A,⁶ with eltrombopag frequently added to enhance hematologic response, remains the initial treatment of choice for most patients without a matched sibling donor, particularly those aged ≥40 years or with significant comorbidities. Although IST can lead to hematologic recovery in most patients, a subset experience relapse, require prolonged immunosuppression, or develop clonal evolution. In contrast, HSCT offers the potential for long-term hematopoietic reconstitution and is generally considered curative of sMN risk in appropriately selected patients. Historically reserved for younger patients with matched sibling donors, HSCT is increasingly being considered in a broader range of patients due to advances in donor matching, supportive care, and conditioning regimens.⁷ Current clinical guidelines recommend IST as the preferred initial treatment for older patients and those with significant comorbidities, although HSCT is increasingly considered a feasible option in high-risk individuals and in those with high-risk molecular features.^8-10

Clonal evolution in acquired AA is driven by autoimmune pressure from cytotoxic T cells, which selectively favors hematopoietic stem and progenitor cells harboring somatic mutations or cytogenetic abnormalities.¹¹ Specific gene mutations (eg, ASXL1, RUNX1, DNMT3A, TET2, and BCOR) and chromosomal abnormalities (eg, del(Y), +8, 6p CN-LOH) are frequently observed in bone marrow studies of patients with AA and have highly variable prognostic implications for progression-free survival (PFS).¹²^,¹³ Previous studies have also identified demographic and treatment-related factors, along with clonal genetic and cytogenetic alterations, that are associated with either increased risk of sMN or protective effects against malignant transformation, informing clinical decision-making.^14-17 However, somatic mutations may also be present at diagnosis or emerge over time without signifying imminent transformation, and physicians are advised to interpret them with caution.¹⁸ No validated predictive models currently exist to accurately estimate an individual patient’s risk of malignant progression.

In this study, we present 2 machine learning models trained on the clinical data of adult patients with acquired AA. Model 1 is designed to assess a patient’s sMN risk using clinical data routinely obtained during the diagnostic workup, whereas model 2 is designed to reassess sMN risk after 6 months of first-line IST. Such an approach may support future efforts to individualize treatment selection, including consideration of upfront HSCT in carefully selected cases.

Methods

Study design and patient selection

We collected a comprehensive multi-institutional data set of patients with AA for use in training machine learning models to predict the incidence of sMNs in patients with AA. The initial cohort included 350 adult patients treated for AA at the University of Texas Southwestern Medical Center, the Cleveland Clinic Foundation, and the Hospital of the University of Pennsylvania between 1975 and 2023.

Patients were excluded if they had sMNs detected at the time of AA diagnosis (n = 18). Additional exclusions were applied for those lost to follow-up, patients with <180 days of transplant-free survival (n = 19), and those who did not complete a standard course of first-line IST (n = 38). This minimized bias from incomplete data, nonstandardized treatment, short-term survival, and early transplantation. After applying these criteria, the final cohort for analysis consisted of 275 patients.

Data collection

Seventy-nine variables relevant to the diagnosis, treatment, and prognosis of AA were collected (Table 1). These variables included demographic information, clinical presentation, laboratory findings, and treatment responses. AA severity was defined using the Camitta criteria. Clinical diagnoses, including AA, AA/paroxysmal nocturnal hemoglobinuria (PNH) overlap syndrome, AML, and MDS, were confirmed through detailed pathologic analysis of bone marrow biopsies, aspirates, and peripheral blood samples.

Diagnostic modalities included cytomorphology, histomorphology, iron staining, chromosome banding, single-nucleotide polymorphism array karyotyping, fluorescence in situ hybridization, and next-generation sequencing (NGS) for myeloid mutation panels. To rule out potential inherited etiologies of AA, patients underwent comprehensive evaluations, including physical examination findings, family history reviews, chromosome breakage studies, lymphocyte telomere length measurements, and NGS panels targeting genes associated with inherited bone marrow failure syndromes. Diagnoses of AA and MDS were retrospectively reviewed and confirmed using the 2016 World Health Organization classification system for myeloid malignancies, ensuring accurate distinction between true sMN cases and nonmalignant clones. As a notable exception to this, we qualified patients harboring clones with chromosome 13q deletions as having AA, not MDS-unclassifiable, in consideration of several studies qualifying this as a favorable prognostic marker and benign karyotype abnormality in acquired aplastic anemia.^22-26 PNH clone size was determined by the percent of glycosylphosphatidylinositol-deficient granulocyte cells. A detailed description of the study design, research protocols, patient selection criteria, and variable definitions is provided in the supplemental Materials.

Genomic and cytogenetic analysis

Bone marrow aspirate samples were analyzed using multiple NGS and cytogenetic platforms, each with varying sequencing depths, gene coverage, and limits of detection. All clinical and research laboratories are Clinical Laboratory Improvement Amendments (CLIA) certified. Reported variants from myeloid NGS studies were independently reviewed using the VarSome Somatic Variant Classifier²⁷ (Varsome.com) to verify classification accuracy and incorporate any variant classification updates. Detailed descriptions of the genomic analyses, as well as a table containing the individual mutations (supplemental Table 1), are available in the supplemental Methods.

Data preprocessing

Several data preprocessing techniques were implemented, including mutation grouping, data binning, and imputation methods. Somatic mutation data obtained from several clinical and experimental hematopathology NGS panels were standardized to maximize data utility and create data sets suitable for model training. Only variants classified as pathogenic or likely pathogenic in a curated list of 41 recurrent driver genes shared across all platforms (supplemental Table 1) were included in the training data set and downstream analyses. Variants that were confirmed as pathogenic or likely pathogenic and exceeded both the test-specific limit of detection and a uniform variant allele frequency (VAF) threshold of 1% (VAF > 0.01) were assigned a value of 1. All other variants (those classified as benign, likely benign, or of uncertain significance, or with a VAF below the detection threshold or below 1%) were assigned a value of 0. BCOR and BCORL1 were merged into a single variable (BCOR/L1) to maximize data points and enhance statistical power by reducing missingness. Variables with >30% missing data were excluded from the analysis to ensure data integrity.^28-30 Details on data preprocessing are available in the supplemental Materials.
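The binarization rule above can be illustrated with a minimal sketch. The `Variant` record, its field names, and the helper functions are hypothetical and are not taken from the study code; only the rule itself (pathogenic or likely pathogenic, above the assay limit of detection, VAF > 0.01, BCOR and BCORL1 merged) comes from the text.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    # Hypothetical record layout, assumed for illustration only.
    gene: str
    classification: str   # e.g. "pathogenic", "likely pathogenic", "benign"
    vaf: float            # variant allele frequency as a fraction (0.05 = 5%)
    assay_lod: float      # test-specific limit of detection, same units as vaf

CALLED = {"pathogenic", "likely pathogenic"}
VAF_THRESHOLD = 0.01      # uniform 1% threshold described in the text

def merge_gene(gene):
    # BCOR and BCORL1 are collapsed into a single BCOR/L1 variable.
    return "BCOR/L1" if gene in {"BCOR", "BCORL1"} else gene

def is_positive(v):
    # A variant scores 1 only if it passes all three criteria.
    return (v.classification.lower() in CALLED
            and v.vaf > VAF_THRESHOLD
            and v.vaf > v.assay_lod)

def encode_patient(variants, panel_genes):
    # Start every panel gene at 0, then flip to 1 for qualifying variants.
    features = {merge_gene(g): 0 for g in panel_genes}
    for v in variants:
        gene = merge_gene(v.gene)
        if gene in features and is_positive(v):
            features[gene] = 1
    return features
```

In this sketch, a likely pathogenic TET2 variant at a VAF of 0.5% is encoded as 0 even when it exceeds the assay limit of detection, because it falls below the uniform 1% threshold.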

Model training and validation

For variables with <30% missing data, iterative imputation was used to maximize available data and improve model fit. Missing values, comprising 24% of the data set, were imputed with the multiple imputation by chained equations (MICE) algorithm.³¹ Missing values of all predictor variables were included in the imputation process. All features were then scaled to a range of −1 to 1 to support efficient neural network training. A multilayer perceptron (MLP) with 5 hidden layers (32, 16, 8, 4, and 2 neurons) was used for binary classification.³² To address class imbalance (43 sMN vs 232 non-sMN), the positive class was assigned a weight of 7.2, selected to optimize macro area under the curve (AUC) while maintaining sensitivity and specificity of >0.7. Random forest feature importance scores were iteratively computed (N = 10) to rank predictors.³³ The MLP was trained via backpropagation and selected for its capacity to model complex interactions among tabular input features.
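Two of the steps above, feature scaling to the −1 to 1 range and positive-class weighting, can be sketched in plain Python. This is an illustration under stated assumptions rather than the study implementation: the function names are hypothetical, min-max scaling is assumed as the scaling method, and the weighted binary cross-entropy shown is one common way a class weight enters the loss.

```python
import math

def scale_to_unit_range(values):
    # Min-max scaling of one feature column to [-1, 1] (assumed method).
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # constant column: no spread to scale
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def weighted_log_loss(y_true, p_pos, pos_weight=7.2):
    # Binary cross-entropy in which errors on the positive (sMN) class
    # are multiplied by pos_weight; 7.2 is the weight reported in the text.
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1.0 - eps)   # guard against log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)
```

Under this loss, an sMN case predicted at a probability of 0.1 is penalized 7.2 times as heavily as a non-sMN case predicted at 0.9, which pushes the classifier toward higher sensitivity.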

The MLP architecture comprised 5 hidden layers, each using the Leaky ReLU activation function (α = 0.1). Leaky ReLU helps mitigate the vanishing gradient problem by allowing a small gradient even when the unit’s input is <0. The output layer had 2 units and used the softmax activation function, which produced output probabilities for both classes (eg, 0.71, 0.29). Each model was trained for up to 150 epochs, with early stopping triggered after 20 epochs (s = 20) without improvement.
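The activation functions and stopping rule described above can be written out in a few lines. The following is a hedged, library-free sketch; the function names are illustrative, and the quantity monitored for early stopping (here a generic per-epoch loss) is an assumption, because the text does not specify it.

```python
import math

def leaky_relu(x, alpha=0.1):
    # Passes positive inputs unchanged; scales negative inputs by alpha,
    # so the gradient for negative inputs is alpha rather than 0.
    return x if x > 0 else alpha * x

def softmax(logits):
    # Converts the 2 output-unit values into probabilities that sum to 1.
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stopping_epoch(losses, patience=20, max_epochs=150):
    # Returns the epoch at which training would halt: either after
    # `patience` consecutive epochs without improvement or at max_epochs.
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses[:max_epochs], start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return min(len(losses), max_epochs)
```

For example, if the monitored loss never improves after the first epoch, this rule halts training at epoch 21 rather than running all 150 epochs.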

Leave-1-out (LOO) crossvalidation was used to validate the model. In LOO, 1 sample at a time is designated as the validation set, while the remaining 274 samples are used for training.³⁴ This process repeats until each of the 275 patients has served as the validation set exactly once. Notably, running LOO crossvalidation with 5 to 21 features for the 275 patients required ∼75 seconds on a standard desktop computer (3.5 GHz Intel i7, 12 gigabytes random-access memory, Scikit-learn version 1.3). The final classification assigns a patient to the high-risk category if the prediction score exceeds 0.5 and to the low-risk category otherwise.
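The LOO procedure and the 0.5 decision threshold can be sketched generically. In the sketch below, `fit` and `predict_proba` are placeholder callables standing in for any classifier, and the nearest-neighbor toy model is included only to make the example runnable; neither is the neural network used in the study.

```python
def leave_one_out_scores(X, y, fit, predict_proba):
    # Each sample is held out once; the model is refit on the rest.
    scores = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        scores.append(predict_proba(model, X[i]))
    return scores

def classify(score, threshold=0.5):
    # High risk only if the prediction score exceeds the threshold.
    return "high-risk" if score > threshold else "low-risk"

# Toy stand-in model (assumption for illustration): 1-nearest neighbor.
def fit_nearest(X_train, y_train):
    return list(zip(X_train, y_train))

def predict_nearest(model, x):
    def sq_dist(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], x))
    return float(min(model, key=sq_dist)[1])

X = [[0.0], [0.1], [0.9], [1.0]]
y = [0, 0, 1, 1]
scores = leave_one_out_scores(X, y, fit_nearest, predict_nearest)
labels = [classify(s) for s in scores]
```

Because every prediction comes from a model that never saw the held-out patient, the pooled scores can be used to build a single confusion matrix and ROC curve for the whole cohort.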

Statistical analysis and software tools

Statistical analyses and data visualizations were conducted using Python (version 3.8.5). Cumulative incidence functions were plotted and competing risk analyses were conducted using SAS Viya (version 3.8.1). Deep learning tasks were performed with TensorFlow (version 2.4.1), using Keras (version 2.4.3) for model construction and training. Scikit-learn (version 0.24.1) was used for key machine learning tasks, including data imputation, feature selection, and model evaluation. The multiple imputation by chained equations algorithm was implemented using Scikit-learn’s IterativeImputer class.³⁵ Data manipulation and numerical computations were performed using Pandas (version 1.2.1) and NumPy (version 1.19.5). Early stopping was used to prevent overfitting during model training. Data visualization was carried out using Matplotlib (version 3.3.3). Kaplan-Meier curves were generated with the lifelines package (version 0.30.0), and statistical significance for survival analyses was assessed using scikit-survival (version 0.23.1). To evaluate multicollinearity among covariates, the variance inflation factor was calculated with statsmodels (version 0.12.2). Time-dependent AUC values were computed using sksurv (version 0.16) to assess model performance over time.

All patients provided written informed consent, and the study protocol was approved by the institutional review boards of each participating institution. All research activities were conducted in accordance with the ethical principles outlined in the Declaration of Helsinki.

Results

Of 275 patients included in the study, 222 (80.7%) had SAA or very SAA (VSAA), and 53 (19.3%) had nonsevere AA. sMNs developed in 40 (18.0%) patients with SAA/VSAA and in 4 (7.6%) patients with nonsevere AA, with a higher but not statistically significant risk observed in the SAA/VSAA group (risk ratio, 2.39; odds ratio, 2.69; P = .064). The cohort included 139 females (50.5%) and had a median age of 54.0 years (range, 18.0-89.4; interquartile range [IQR], 34.4-66.3; mean ± standard deviation [SD]: 50.8 ± 18.5 years). A total of 188 patients (68.3%) achieved either a complete or partial response to therapy. Among 230 patients with available karyotype data, 8 (3.5%) had abnormal karyotypes at diagnosis (Table 2). Although most patients were enrolled at a single center, sMN incidence did not differ significantly across institutions (χ² = 1.95; P = .377), reducing concern for treatment-related bias despite differences in enrollment rates.
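The risk ratio and odds ratio quoted above follow directly from the reported counts (40 of 222 patients with SAA/VSAA and 4 of 53 patients with nonsevere AA developed sMN). A short worked sketch, with illustrative function names:

```python
def risk_ratio(events_a, n_a, events_b, n_b):
    # Ratio of the event proportions in group A vs group B.
    return (events_a / n_a) / (events_b / n_b)

def odds_ratio(events_a, n_a, events_b, n_b):
    # Ratio of the odds (events / non-events) in group A vs group B.
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b

# Counts from the text: SAA/VSAA (40 of 222) vs nonsevere AA (4 of 53).
rr = risk_ratio(40, 222, 4, 53)
orr = odds_ratio(40, 222, 4, 53)
```

Rounded to 2 decimal places, these counts reproduce the reported risk ratio of 2.39 and odds ratio of 2.69.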

PNH clones were detected in 111 patients (40.4%), with a median granulocyte clone size of 1.2% (range, 0.01-93). Of these, 69 patients had clone sizes of ≥1%, and 15 had clone sizes of ≥20%. Among patients who did not develop sMN, 98 (35%) harbored detectable PNH clones (Table 2). Baseline somatic mutation data were available for 195 patients. Among them, 28 (14.4%) had 1 mutation, and 8 (4.1%) had ≥2 mutations. The most frequently observed mutations were ASXL1 (n = 7 [3.6%]), TET2 (n = 7 [3.6%]), BCOR or BCORL1 (n = 6 [3.1%]), CUX1 (n = 5 [2.6%]), RUNX1 (n = 5 [2.6%]), and DNMT3A (n = 4 [2.1%]; supplemental Materials; Table 2).

Model 1 was trained using 23 baseline clinical and molecular features (Table 2; supplemental Figure 1). The 5 most predictive variables for sMN development were CUX1 mutation, DNMT3A mutation, TET2 mutation, total mutation count, and patient age at diagnosis (Figure 1A-B). Model 1 achieved an AUC of 0.82, with a sensitivity of 0.81, specificity of 0.74, positive predictive value (PPV) of 35.0%, and negative predictive value of 95.5% (Figure 1C). The confusion matrix (Figure 1D) revealed 36 true positives (13.0%), 171 true negatives (62.0%), 61 false positives (22.1%), and 8 false negatives (2.9%). The receiver operating characteristic (ROC) curve demonstrated strong discriminatory ability (Figure 1E).

Figure 1.

Feature impact scores, performance metrics, and validation of machine learning model 1 for predicting sMNs. (A) Bar plot of the feature impact scores for model 1, demonstrating the relative contribution of each feature to the model’s performance. (B) Multivariate regression (R) scores and P values of variables used in model 1, with asterisks (∗) denoting statistical significance (P < .05). (C) Summary of model 1 performance metrics, including sensitivity (0.81), specificity (0.74), and an AUC of 0.82. (D) Confusion matrix displaying true positives (36 [13.0%]), true negatives (171 [62.0%]), false positives (61 [22.1%]), and false negatives (8 [2.9%]). (E) AUC-ROC curve illustrating the model’s predictive performance with an AUC of 0.82, indicating strong discriminatory ability for identifying sMN risk.

Model 2, which incorporated 29 features including 6-month treatment response, identified DNMT3A mutation, age at diagnosis, PNH clone presence, CUX1 mutation, and total mutation count as the top predictors (Figures 2A-B and 3C-D). Model 2 also achieved an AUC of 0.82, with a sensitivity of 0.84, specificity of 0.73, PPV of 36.7%, and negative predictive value of 95.5% (Figure 2C). The confusion matrix included 36 true positives (13.0%), 169 true negatives (61.5%), 62 false positives (22.5%), and 8 false negatives (2.9%; Figure 2D). ROC analysis confirmed strong performance (Figure 2E).
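The summary metrics derive arithmetically from the confusion matrix. The sketch below recomputes them from the model 2 counts reported above; the function name is illustrative and not from the study code, and values recomputed from published counts need not match every reported summary figure to the last digit.

```python
def confusion_metrics(tp, tn, fp, fn):
    # Standard definitions for a binary classifier.
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Model 2 counts from the text: 36 TP, 169 TN, 62 FP, 8 FN.
model2 = confusion_metrics(tp=36, tn=169, fp=62, fn=8)
```

With these counts, the PPV is 36/98 (about 36.7%) and the negative predictive value is 169/177 (about 95.5%).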

Figure 2.

Feature impact scores, performance metrics, and validation of machine learning model 2 for predicting sMNs. (A) Bar plot of the feature impact scores for model 2, demonstrating the relative contribution of each feature to the model’s performance. (B) Multivariate regression (R) scores and P values of variables used in model 2, with asterisks denoting statistical significance (P < .05). (C) Summary of model 2 performance metrics, including sensitivity (0.84), specificity (0.73), and an AUC of 0.82. (D) Confusion matrix displaying true positives (36 [13.0%]), true negatives (169 [61.5%]), false positives (62 [22.5%]), and false negatives (8 [2.9%]). (E) AUC-ROC curve illustrating the model’s predictive performance with an AUC of 0.82, indicating strong discriminatory ability for identifying sMN risk. ATG, antithymocyte globulin.

DeepSHAP was applied to obtain both local and global explanations for the neural network models. Local contributions are depicted in beeswarm summary plots (Figure 3A,C), which map individual SHAP (Shapley additive explanations) values for each feature, colored by feature magnitude, to indicate how varying feature values influence risk predictions. Global importance is presented via mean absolute SHAP value bar plots with hierarchical clustering (Figure 3B,D), in which features are sorted by average contribution magnitude and clusters are delineated at a linkage distance cutoff of 0.50 to group highly redundant features for visualization. In model 1, mean absolute SHAP values for CUX1 mutation (0.99), age at diagnosis (0.77), and PNH clone size (0.50; binary presence of >0% at 0.46) exceeded the clustering threshold, whereas all remaining features displayed lower mean values. In model 2, mean values for age (0.78), DNMT3A mutation (0.70), U2AF1 mutation (0.67), and PNH clone size (0.54) were above the threshold, followed by treatment-response and additional genomic variables forming a secondary cluster; features outside these clusters exhibited lower mean absolute SHAP values. The clustering cutoff serves solely as a visualization aid and does not imply statistical testing. The observed difference in SHAP values across models reflects a shift in feature contribution due to the inclusion of dynamic variables, rather than a reversal of predictive direction or a fundamental mechanistic change.
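Global importance in this sense is simply the per-feature mean of the absolute local SHAP values. A minimal sketch of that aggregation follows; the SHAP values and feature names in the example are invented for illustration and are not study results, and computing the local values themselves (the DeepSHAP step) is outside the scope of the sketch.

```python
def mean_abs_shap(shap_rows, feature_names):
    # shap_rows: one list of per-feature SHAP values per patient.
    n = len(shap_rows)
    means = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    # Sort features from largest to smallest average contribution.
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical local SHAP values for 2 patients and 2 features.
example_rows = [[0.5, -0.1], [-1.5, 0.3]]
ranking = mean_abs_shap(example_rows, ["age_at_diagnosis", "pnh_clone_size"])
```

Taking the absolute value before averaging means that a feature pushing risk up in some patients and down in others still ranks as important, which is why the bar plots convey magnitude but not direction.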

Figure 3.

SHAP analysis of feature contributions in machine learning models predicting sMN development in AA. SHAP summary (beeswarm) plots for model 1 (A) and model 2 (C). Each point represents the SHAP value of a feature for an individual observation. Features are ranked by overall importance. The x-axis indicates the SHAP value (ie, the impact of that feature on the model’s output for a given patient), whereas color reflects the feature value (blue = low, red = high). Features to the right of x = 0 were associated with increased predicted risk of sMN, whereas features to the left were associated with reduced risk. Mean absolute SHAP value plots with hierarchical clustering of features for model 1 (B) and model 2 (D). These bar plots quantify the global importance of each feature based on the magnitude of its contribution to model predictions. Features with low mean SHAP values were considered to have minimal impact and were excluded from subsequent clustering. “Remaining 3 features” in panel D refers to “Abnormal Karyotype,” “Del(13q) Karyotype,” and “Other IST Treatment,” which were retained but contributed minimally. ATG, antithymocyte globulin.

Kaplan-Meier analysis demonstrated significantly reduced sMN-free survival among high-risk patients as stratified by both models (supplemental Figure 3A-B). The log-rank test statistic was 37.56 (P = 8.86 × 10⁻¹⁰) for model 1 and 43.53 (P = 4.17 × 10⁻¹¹) for model 2. Competing risks analysis showed a cumulative sMN incidence of 4.92% at 2 years, 21.74% at 5 years, 57.38% at 10 years, and ∼65% at 15 years (supplemental Figure 2A-B). Stratification by model-defined risk groups confirmed markedly increased incidence in high-risk patients. At 10 and 15 years, high-risk patients had cumulative incidences of 58.2% and 67.5%, respectively, by model 1, and 54.6% and 63.4%, respectively, by model 2, compared with 15.2% and 17.5% (model 1) and 11.7% and 13.4% (model 2), respectively, in the low-risk group (Gray test χ² = 35.50 for model 1; and χ² = 40.14 for model 2; P < .001 for both). Correlation and covariance matrices for all features used in model training are provided in supplemental Figure 4A-D.
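For readers less familiar with the Kaplan-Meier estimator underlying the sMN-free survival curves, the product-limit calculation can be sketched as follows. This is a generic illustration with invented follow-up times, not the lifelines-based analysis used in the study, and it does not cover the competing risks (Gray test) analysis, in which competing events are handled differently from censoring.

```python
def kaplan_meier(times, events):
    # times: follow-up durations; events: 1 if the event (eg, sMN) was
    # observed at that time, 0 if the patient was censored.
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []                      # (time, estimated survival) at event times
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving      # events and censored both leave the risk set
    return curve

# Invented example: 5 patients, 3 events, 2 censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
```

Applying the estimator separately to the model-defined high-risk and low-risk groups yields the two curves that the log-rank test then compares.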

A total of 44 patients (16.0%) developed sMNs: 38 developed MDS and 6 developed AML. The median time from AA diagnosis to sMN onset was 3.93 years (IQR, 2.06-6.63; mean ± SD: 4.50 ± 3.25 years; range, 0.13-14.86). Median latency to AML onset was 2.96 years (IQR, 1.70-5.04), which did not differ significantly from the latency to MDS onset of 4.44 years (IQR, 2.21-7.03; Mann-Whitney U test, P = .188).

The median follow-up for the overall cohort was 4.19 years (IQR, 7.78; mean ± SD: 6.58 ± 6.61 years). Median follow-up durations by subgroup were: 2.37 years (IQR, 1.45-7.27) for patients who remained alive, 3.93 years (IQR, 2.06-6.63) for patients who developed sMN, and 3.60 years (IQR, 1.75-6.02) for patients who underwent HSCT. Missing follow-up data were noted in 6.91% of all patients, including 5.96% of patients without sMN and 12.5% of those who developed sMN.

Mann-Whitney U testing showed that patients who died had significantly shorter follow-up compared with those who remained alive (P = .0016), developed sMN (P = .023), or underwent HSCT (P = .025). There were no significant differences in follow-up time between patients who developed sMN and underwent transplant (P = .978), developed sMN and remained alive (P = .467), or underwent transplant and remained alive (P = .490).

Discussion

In this study, we developed 2 binary classification machine learning models to identify adult patients with acquired AA at high risk for developing sMN using routinely collected clinical and molecular features.

Model 1 was trained using baseline diagnostic features and achieved an AUC of 0.82. The top predictive variables included CUX1 mutation, DNMT3A mutation, TET2 mutation, PNH clone presence, total somatic mutation count, and patient age at diagnosis, features broadly consistent with previous literature implicating clonal hematopoiesis and epigenetic dysregulation in leukemic evolution.²^,¹³^,¹⁵^,³⁶^,³⁷ Although CUX1 emerged as a highly predictive feature in our model, it is considered a less frequent mutation in comparison with more commonly reported high-risk mutations such as ASXL1, RUNX1, and SETBP1. Previous studies more suited for pathophysiologic interpretation suggest that CUX1 mutations may represent early, unstable events that disappear at transformation, often supplanted by −7/del(7q).¹² In MNs, CUX1 mutations are frequently subclonal, enriched in older patients, co-occur with adverse cytogenetics and high mutational burden, and are often underrecognized in standard panels due to low VAF.³⁸ It should be noted that our study design does not allow mechanistic inference on clonal evolution.

Model 2 incorporated treatment response data from 6 months after diagnosis to reassess sMN risk dynamically. This model achieved similar performance (AUC of 0.82) and relied on many of the same top predictors: DNMT3A, CUX1, total mutation count, age, and PNH clone presence. The positive association of driver mutations and age with sMN development, as well as the protective effect of PNH clones, is consistent with previous studies.¹¹^,¹²^,¹⁴^,³⁶ Importantly, this time point captures a clinically relevant juncture in AA management; ∼30% of patients do not respond to horse antithymocyte globulin–based IST, and among these, ∼15% progress to MDS or AML. In practice, not all nonresponders proceed directly to transplant due to factors such as donor availability, comorbidities, or institutional policy. Model 2 may help stratify these patients by sMN risk and facilitate more informed decisions regarding expedited transplant referral or intensified surveillance, even beyond response categorization alone.¹⁸

To the best of our knowledge, this effort represents the first validated machine learning–based approach for individualized risk prediction of sMN in acquired AA. Notably, Yoshizato et al used penalized variable selection and random survival forests for feature selection in a cohort of 256 National Institutes of Health patients to identify gene combinations associated with IST response, overall survival, and PFS.¹⁴ Although their analysis revealed that BCOR and BCORL1 mutations conferred favorable outcomes and DNMT3A, ASXL1, RUNX1, JAK2, and JAK3 conferred worse PFS (P < .03), it was not intended for patient-specific risk estimation. In contrast, our methodology was explicitly developed to generate individualized outcome predictions rather than infer mechanistic insight.

Although AA-associated sMN is often described as a long-term complication, recent studies reveal a persistently elevated and nonplateauing risk over time. In our cohort, 44 patients (16%) developed sMN: 36 within 5 years and 10 within 2 years. The cumulative incidence of sMN reached 21.7% at 5 years and 57.4% at 10 years. Risk was notably higher with age; previous studies have reported 10-year cumulative incidences of 20.6% for patients aged >35 years vs 6.6% for those aged 15 to 35 years at diagnosis.¹² Our cohort was consistent with these reports in terms of sMN incidence rates and median time to onset from initial diagnosis, which underscores the need for predictive tools to enable proactive intervention.¹²^,¹³^,³⁹

Currently, allogeneic HSCT remains the only curative treatment for both AA and its clonal complications. Long-term follow-up studies suggest that MDS/AML risk after transplantation is negligible.⁴⁰ Early transplantation, ideally within 6 months of AA diagnosis, has been associated with improved clinical outcomes.¹⁸^,⁴¹ In matched related donor transplants, delays beyond 6 months have been linked to significantly increased risks of graft-versus-host disease and relapse-free survival failure (hazard ratio, 4.08; 95% confidence interval, 1.41-11.83; P = .010). Transplantation after overt progression to MDS or AML introduces additional risks related to chemotherapy toxicity, relapse, and poorer survival outcomes. Five-year overall survival after transplantation for post-AA MDS or AML has been reported at ∼62%, compared with ∼23% with chemotherapy or supportive care alone (P < .01).¹³ Despite its curative potential, transplantation carries substantial risks, and graft-versus-host disease and relapse-free survival rates in adults remain limited. Thus, individualized risk prediction remains critically important for guiding decisions regarding transplant timing and patient selection.

Although our study demonstrates the feasibility of using machine learning to forecast sMN risk in acquired AA, it is constrained by data limitations. The modest sample size limits generalizability and precludes robust stratification by age, disease severity, or donor availability. Although both models achieved strong discriminative performance (AUC of ∼0.82), their PPVs were modest (32%-36%), consistent with the relatively low long-term incidence of sMN (∼15%-20%). Unlike AUC, which is invariant to outcome prevalence,^42-44 PPV is directly influenced by disease incidence and is expected to improve with model training on larger data sets. We prioritized standard performance metrics such as AUC, ROC curves, sensitivity, and specificity to assess the models’ ability to discriminate between patients who did and did not develop sMN, because these metrics provide prevalence-independent evaluations of model performance.⁴⁵^,⁴⁶ Kaplan-Meier and cumulative incidence function analyses (supplemental Figures 2A-B and 3A-B) demonstrated significant early divergence in sMN incidence between model-defined high- and low-risk groups. For both model 1 and model 2, log-rank tests showed strong statistical significance (P < .001), supporting the potential clinical utility of these models in identifying patients at elevated risk of malignant transformation early in the disease course.
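The dependence of PPV on outcome prevalence follows from Bayes' theorem. The sketch below uses the approximate operating point reported for both models (sensitivity 0.82, specificity 0.73) and a prevalence of 0.16; the function name is illustrative, and the 0.50 prevalence is a hypothetical comparison value, not a study result.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    # Bayes' theorem: P(disease | positive test).
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Approximate operating point and sMN prevalence from the text.
ppv_cohort = ppv_from_prevalence(0.82, 0.73, 0.16)

# Hypothetical higher-prevalence setting, same sensitivity and specificity.
ppv_high_prev = ppv_from_prevalence(0.82, 0.73, 0.50)
```

At a prevalence of 0.16, this operating point yields a PPV of roughly 0.37, in line with the modest PPVs observed, whereas the same sensitivity and specificity would yield a considerably higher PPV in a population in which the outcome was more common.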

Another key limitation is the limited interpretability of the models due to the “black box” nature of deep learning algorithms.⁴⁷ We addressed this by including SHAP (Figure 3), regression scores (Figures 1B and 2B), multivariate hazard ratios (Table 3), and correlation and covariance matrices (supplemental Figure 4A-B) to help assess the effect of each feature across several methods of analysis.⁴⁸^,⁴⁹

The data set used to train model 1 did not include data on immunosuppressive treatments, which may introduce bias into the model due to implicit correlations between treatment patterns and sMN risk. Model 2 is not subject to this bias, and the similar performance of the 2 models suggests that its effect on model 1 is minimal. We used LOO crossvalidation to minimize bias and fully leverage a multicenter data set,^50-52 yet this internal validation approach cannot replace testing in an independent cohort. Given the rarity of AA, many studies face similar constraints in assembling large, external data sets. Nonetheless, external validation will be a critical next step in the development of clinically robust and generalizable prediction models.

In summary, our study provides proof of concept for machine learning–based risk prediction of sMN in acquired AA using clinically accessible variables. Future studies should focus on training with larger data sets and on external validation. With refinement, such models may eventually support risk stratification, inform transplant decisions, and facilitate personalized care. However, until prospective studies, including randomized trials, demonstrate added clinical utility beyond physician judgment, we cannot recommend how such tools should be used in practice.

Acknowledgments

The authors thank the University of Texas Southwestern Medical Center, the Cleveland Clinic, and the University of Pennsylvania for their contributions to this research. The authors thank their research coordinator, Kasia Harrah, for her valuable contributions to this study.

Authorship

Contribution: A.C.T. contributed to conceptualization, methodology, coding, validation, statistical analysis, investigation, data curation, writing the manuscript, visualization, project administration, and funding acquisition; J.J.S. contributed to writing the manuscript, visualization, and statistical analysis; M.M. contributed to methodology, software, validation, formal analysis, resources, writing the manuscript, and visualization; C.I.A.S. contributed to writing the manuscript and statistical analysis; C.G. contributed to validation, investigation, resources, data curation, writing the manuscript, supervision, project administration, and funding acquisition; J.P.M. contributed to validation, investigation, resources, data curation, writing the manuscript, supervision, project administration, and funding acquisition; D.V.B. contributed to resources, investigation, and writing the manuscript; Z.B. contributed to investigation and writing the manuscript; Z.T., M.N.D., H.A., and I.I. contributed to writing the manuscript; Y.O. and L.G. contributed to visualization and writing the manuscript; T.B. contributed to conceptualization, methodology, validation, investigation, resources, writing the manuscript, visualization, supervision, project administration, and funding acquisition; A.O. contributed to data visualization; and J.A.T. contributed to writing the manuscript.

Conflict-of-interest disclosure: T.B. reports advisory committee/board membership with Alexion, Novartis, Samsung Bioepis, Omeros, and Recordati Rare Disease. J.P.M. reports honoraria from, and speakers bureau role with, Novartis; advisory committee/board membership with Alexion; consultancy with, and honoraria from, Regeneron; and consultancy with Omeros. D.B. reports consultancy with Retro Biosciences. The remaining authors declare no competing financial interests.

Correspondence: Taha Bat, Division of Hematology-Oncology, Department of Internal Medicine, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9255; email: taha.bat@utsouthwestern.edu.

References

1. Young, Maciejewski. The pathophysiology of acquired aplastic anemia. N Engl J Med. 1997;336:1365-1372.
2. Sun, Babushok. Secondary myelodysplastic syndrome and leukemia in acquired aplastic anemia and paroxysmal nocturnal hemoglobinuria. Blood. 2020;136.
3. de Planque, Bacigalupo, Würsch, et al. Long-term follow-up of severe aplastic anaemia patients treated with antithymocyte globulin. Br J Haematol. 1989:121-126.
4. Rosenfeld, Follmann, Nunez, Young. Antithymocyte globulin and cyclosporine for severe aplastic anemia: association between hematologic response and long-term outcome. JAMA. 2003;289:1130-1135.
5. Frickhofen, Heimpel, Kaltwasser, Schrezenmeier; German Aplastic Anemia Study Group. Antithymocyte globulin with or without cyclosporin A: 11-year follow-up of a randomized trial comparing treatments of aplastic anemia. Blood. 2003;101:1236-1242.
6. Bacigalupo. How I treat acquired aplastic anemia. Blood. 2017;129:1428-1436.
7. Rice, Eikema D-J, Marsh JCW, et al. Allogeneic hematopoietic cell transplantation in patients aged 50 years or older with severe aplastic anemia. Biol Blood Marrow Transplant. 2019:488-495.
8. Sureda, Bacigalupo, Boogaerts, et al. The EBMT Handbook: Hematopoietic Stem Cell Transplantation and Cellular Therapies. 7th ed. Springer; 2019.
9. Iftikhar, DeFilipp, DeZern, et al. Allogeneic hematopoietic cell transplantation for the treatment of severe aplastic anemia: evidence-based guidelines from the American society for transplantation and cellular therapy. Transplant Cell Ther. 2024:1155-1170.
10. Wirk. Acquired aplastic anemia therapies: immunosuppressive therapy versus alternative donor hematopoietic cell transplantation. J Hematol. 2024.
11. Negoro, Nagata, Clemente, et al. Origins of myelodysplastic syndromes after aplastic anemia. Blood. 2017;130:1953-1957.
12. Gurnari, Pagliuca, Prata, et al. Clinical and molecular determinants of clonal evolution in aplastic anemia and paroxysmal nocturnal hemoglobinuria. J Clin Oncol. 2023:132-142.
13. Groarke, Patel, Shalhoub, et al. Predictors of clonal evolution and myeloid neoplasia following immunosuppressive therapy in severe aplastic anemia. Leukemia. 2022:2328-2337.
14. Yoshizato, Dumitriu, Hosokawa, et al. Somatic mutations and clonal hematopoiesis in aplastic anemia. N Engl J Med. 2015;373.
15. Babushok, Perdigones, Perin, et al. Emergence of clonal hematopoiesis in the majority of patients with acquired aplastic anemia. Cancer Genet. 2015;208:115-128.
16. Kulasekararaj, Jiang, Smith, et al. Somatic mutations identify a subgroup of aplastic anemia patients who progress to myelodysplastic syndrome. Blood. 2014;124:2698-2704.
17. Zaimoku, Takamatsu, Hosomichi, et al. Identification of an HLA class I allele closely involved in the autoantigen presentation in acquired aplastic anemia. Blood. 2017;129:2908-2916.
18. Kulasekararaj, Cavenagh, Dokal, et al. Guidelines for the diagnosis and management of adult aplastic anaemia: a British Society for Haematology Guideline. Br J Haematol. 2024;204:784-804.
19. Camitta, Rappeport, Parkman, Nathan. Selection of patients for bone marrow transplantation in severe aplastic anemia. Blood. 1975:355-363.
20. Scheinberg, Nunez, Weinstein, et al. Horse versus rabbit antithymocyte globulin in acquired aplastic anemia. N Engl J Med. 2011;365:430-438.
21. Scheinberg, Young. How I treat acquired aplastic anemia. Blood. 2012;120:1185-1196.
22. Hosokawa, Katagiri, Sugimori, et al. Favorable outcome of patients who have 13q deletion: a suggestion for revision of the WHO 'MDS-U' designation. Haematologica. 2012:1845-1849.
23. Ishiyama, Karasawa, Miyawaki, et al. Aplastic anaemia with 13q-: a benign subset of bone marrow failure responsive to immunosuppressive therapy. Br J Haematol. 2002;117:747-750.
24. Maciejewski, Risitano, Sloand, Nunez, Young. Distinct clinical outcomes for cytogenetic abnormalities evolving from aplastic anemia. Blood. 2002:3129-3135.
25. Holbro, Jotterand, Passweg, Buser, Tichelli, Rovó. Comment to “favorable outcome of patients who have 13q deletion: a suggestion for revision of the WHO ‘MDS-U’ designation”. Haematologica. 2013:1845-1849.
26. Litzow, Kyle. Multiple responses of aplastic anemia to low-dose cyclosporine therapy despite development of a myelodysplastic syndrome. Am J Hematol. 1989:226-229.
27. Kopanos, Tsiolkas, Kouris, et al. VarSome: the human genomic variant search engine. Bioinformatics. 2019:1978-1980.
28. Barrabés, Perera, Novelle Moriano, Giró-I-Nieto, Mas Montserrat, Ioannidis. Advances in biomedical missing data imputation: a survey. IEEE Access. 2025:16918-16932.
29. Rahman, Davis. Machine learning-based missing value imputation method for clinical datasets. In: Yang G-C, S-l, Gelman, eds. IAENG Transactions on Engineering Technologies. Springer; 2013:245-257.
30. Phung, Kumar, Kim. A deep learning technique for imputing missing healthcare data. 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC); 2019:6513-6516.
31. van Buuren, Groothuis-Oudshoorn. mice: multivariate imputation by chained equations in R. J Stat Softw. 2011.
32. Mitchell. Machine Learning. McGraw-Hill; 1997.
33. Breiman. Random Forests. Machine Learning. 2001;45(1).
34. Berrar. Cross-Validation. Encyclopedia of Bioinformatics and Computational Biology. 2019.
35. Abraham, Pedregosa, Eickenberg, et al. Machine learning for neuroimaging with scikit-learn. Frontiers in Neuroinformatics. 2014.
36. Nagata, Makishima, Kerr, et al. Invariant patterns of clonal succession determine specific clinical features of myelodysplastic syndromes. Nat Commun. 2019:5386.
37. Babushok. A brief, but comprehensive, guide to clonal evolution in aplastic anemia. Hematology Am Soc Hematol Educ Program. 2018;2018:457-466.
38. Dermawan, Wensel, Visconte, Maciejewski, Cook, Bosler. Clinically significant CUX1 mutations are frequently subclonal and common in myeloid disorders with a high number of co-mutated genes and dysplastic features. Am J Clin Pathol. 2022;157:586-594.
39. et al. Long-term follow-up of clonal evolutions in 802 aplastic anemia patients: a single-center experience. Ann Hematol. 2011:529-537.
40. Gurnari, Pagliuca, Kewan, et al. Is nature truly healing itself? spontaneous remissions in paroxysmal nocturnal hemoglobinuria. Blood Cancer J. 2021:187.
41. Killick, Bown, Cavenagh, et al. Guidelines for the diagnosis and management of adult aplastic anaemia. Br J Haematol. 2016;172:187-207.
42. Schaefer, Lehne, Schepers, Prasser, Thun. The use of machine learning in rare diseases: a scoping review. Orphanet J Rare Dis. 2020:145.
43. Varoquaux, Colliot. Evaluating machine learning models and their diagnostic value. Neuromethods. 2023.
44. Vidyasagar. Identifying predictive features in drug response using machine learning: opportunities and challenges. Annu Rev Pharmacol Toxicol. 2015.
45. Shapiro. The interpretation of diagnostic tests. Stat Methods Med Res. 1999:113-134.
46. Monaghan, Rahman, Agudelo, et al. Foundational statistical principles in medical research: sensitivity, specificity, positive predictive value, and negative predictive value. Medicina. 2021:503.
47. Dobson. On reading and interpreting black box deep neural networks. Int J Digit Humanit. 2023:431-449.
48. Louhichi, Nesmaoui, Mbarek, Lazaar. Shapley values for explaining the black box nature of machine learning model clustering. Procedia Computer Science. 2023;220:806-811.
49. Rodríguez-Pérez, Bajorath. Interpretation of machine learning models using shapley values: application to compound potency and multi-target activity predictions. J Comput Aided Mol Des. 2020:1013-1026.
50. Stone. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society. 1974:111-133.
51. Molinaro, Simon, Pfeiffer. Prediction error estimation: a comparison of resampling methods. Bioinformatics. 2005:3301-3307.
52. Lee, Kim, Kwon, et al. Identification of a complex karyotype signature with clinical implications in AML and MDS-EB using gene expression profiling. Cancers (Basel). 2023:5289.

Author notes

∗A.C.T. and M.M. are joint first authors.

Deidentified data, study protocol, and source code are available on request from the corresponding author, Taha Bat (taha.bat@utsouthwestern.edu).

The full-text version of this article contains a data supplement.

© 2025 American Society of Hematology. Published by Elsevier Inc. Licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), permitting only noncommercial, nonderivative use with attribution. All other rights reserved.

Figure 1.

Figure 2.

Figure 3.

Table 1.

Clinical features recorded by medical record review

Data type	Specific variables recorded
Genetic/cytogenetic abnormalities∗	ASXL1, TET2, RUNX1, SETBP1, DNMT3A, U2AF1, EZH2, TP53, BCOR, BCORL1, CUX1, del(13q), −Y, +8, del 7/7q, complex karyotype, germ line testing for inherited causes of AA (see the supplemental Methods for full list)
Blood counts†	Hgb, Plt, ALC, ANC, IPF, MCV, ARC, RC
PNH clone	PNH flow cytometry date, PNH clone distribution (RBC type 2, type 3, total), monocyte, granulocyte, initial PNH clone, secondary PNH clone
Demographic information	Age at definitive AA diagnosis, age at sMN diagnosis, sex
Follow-up information	Date of last appointment, date of death, date of initial diagnosis, survival time after diagnosis, survival time after sMN
Clonal evolution event information	Date of sMN diagnosis, sMN type (AML, MDS), time elapsed between AA diagnosis and sMN evolution
Treatment information	Treatment types (antithymocyte globulin, cyclosporine, eltrombopag, HSCT), treatment start date(s), response (NR/PR/CR§), time elapsed between diagnosis and treatment type, time elapsed between treatment and sMN evolution
Miscellaneous	AST, ALT, ferritin, bone marrow cellularity (%), aplastic anemia disease severity‡

ALC, absolute leukocyte count; ALT, alanine aminotransferase; ANC, absolute neutrophil count; ARC, absolute reticulocyte count; AST, aspartate aminotransferase; CR, complete response; Hgb, hemoglobin; IPF, immature platelet fraction; MCV, mean corpuscular volume; NR, no response; Plt, platelet count; PR, partial response; RC%, reticulocyte count percentage.

∗Data obtained from the initial diagnostic workup and at sMN diagnosis. Recorded as VAF (%) of mutation or as presence/absence of mutation.

†Data obtained from the initial diagnostic workup and at the 6-month post-IST assessment.

‡As defined by the Camitta criteria.¹⁹

§As defined by the National Institutes of Health criteria.²⁰,²¹

Table 2.

Clinical characteristics of the cohort

Variable	Modalities	Total (N = 275)	No sMN (N = 231)	sMN (N = 44)
Age at diagnosis	Median ± SD (range), y	54.0 ± 18.4 (18.0-89.4)	51.2 ± 19.0 (18.0-89.4)	61.0 ± 13.2 (25.0-78.3)
Sex, n (%)	Male	136 (49.5)	112 (48.5)	24 (54.5)
	Female	139 (50.5)	119 (51.5)	20 (45.5)
Disease severity at diagnosis, n (%)	SAA or VSAA	222 (80.7)	182 (78.8)	40 (90.9)
	Nonsevere AA	53 (19.3)	49 (21.2)	4 (9.1)
Treatment response, n (%)	No response	88 (32.0)	71 (30.7)	17 (38.6)
	Partial response	96 (34.9)	78 (33.8)	18 (40.9)
	Complete response	88 (32.0)	79 (34.2)	9 (20.5)
	Unknown	3 (1.1)	3 (1.3)	0 (0.0)
PNH clone at diagnosis, n (%)	Clone present	111 (40.4)	98 (42.4)	13 (29.5)
	Clone absent	109 (39.6)	87 (37.7)	22 (50.0)
	Unknown	55 (20.0)	46 (19.9)	9 (20.5)
Event-free follow-up time	Median ± SD, y	4.2 ± 6.6	4.5 ± 7.0	3.9 ± 3.2
Somatic mutation testing at AA diagnosis, n (%)	Preleukemic mutations at baseline	36 (13.1)	29 (12.5)	7 (15.9)
	No detected preleukemic mutations at baseline	159 (57.8)	140 (60.6)	19 (43.2)
	Baseline mutation data missing	80 (29.1)	62 (26.8)	18 (40.9)
Cohort, n (%)	Cleveland Clinic	215 (78.2)	181 (78.4)	34 (77.2)
	University of Texas Southwestern	41 (14.9)	36 (15.6)	5 (11.4)
	University of Pennsylvania	19 (6.9)	14 (6.0)	5 (11.4)

For patients who developed sMNs, event-free follow-up time is defined as the duration (in years) from the diagnosis of AA to the onset of sMN. For patients who did not develop sMNs, event-free follow-up time is measured from the diagnosis of AA to the occurrence of death, transplantation, or the last recorded follow-up, whichever came first.
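The definition above can be sketched as a small helper. This is an illustrative sketch only: the function name, argument names, and the 365.25-day year are assumptions and are not taken from the study's source code.

```python
from datetime import date


def event_free_followup_years(aa_diagnosis, smn_diagnosis=None, death=None,
                              transplant=None, last_followup=None):
    """Event-free follow-up time in years, following the Table 2 footnote.

    With an sMN, time runs from AA diagnosis to sMN onset. Without an sMN,
    it runs to death, transplantation, or last follow-up, whichever is first.
    """
    if smn_diagnosis is not None:
        end = smn_diagnosis
    else:
        candidates = [d for d in (death, transplant, last_followup) if d is not None]
        if not candidates:
            raise ValueError("no censoring date available")
        end = min(candidates)
    # Assumption: convert days to years with a 365.25-day year.
    return (end - aa_diagnosis).days / 365.25


# Hypothetical patients (dates invented for illustration).
with_smn = event_free_followup_years(date(2010, 1, 1),
                                     smn_diagnosis=date(2014, 1, 1))
censored = event_free_followup_years(date(2010, 1, 1),
                                     transplant=date(2012, 1, 1),
                                     last_followup=date(2020, 1, 1))
```

For the second hypothetical patient, follow-up is censored at transplantation rather than at the later clinic visit, which matches the "whichever came first" rule.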

Table 3.

Multivariable Cox regression analysis of clinical, cytogenetic, and molecular predictors of adverse outcome in patients with AA

Variable	HR (95% CI)	P value
DNMT3A mutation	2.01 (0.04-104.74)	.73
Age at diagnosis	1.02 (1.00-1.05)	.094
PIGA mutation	0.56 (0.20-1.59)	.274
CUX1 mutation	8.18 (0.74-89.99)	.086
Total number of mutations	1.02 (0.67-1.56)	.916
U2AF1 mutation	0.58 (0.04-7.68)	.682
RUNX1 mutation	1.06 (0.06-19.70)	.969
TP53 mutation	1.60 (0.06-46.32)	.785
PNH clone size at diagnosis (%)	1.00 (0.97-1.04)	.948
TET2 mutation	0.81 (0.08-7.81)	.854
ZRSR2 mutation	0.58 (0.02-20.84)	.768
ASXL1 mutation	1.03 (0.10-10.20)	.981
BCOR/L1 mutation	0.77 (0.08-7.26)	.82
EZH2 mutation	0.93 (0.06-15.49)	.959
SETBP1 mutation	1.42 (0.06-35.24)	.832
PNH clone size of >0%	0.86 (0.38-1.96)	.727
Abnormal karyotype at AA diagnosis	0.59 (0.02-14.74)	.747
Karyotype del(13q)	0.45 (0.01-17.95)	.674
SAA or VSAA diagnosis	1.71 (0.64-4.52)	.284
Male sex	1.21 (0.54-2.70)	.637
AA-PNH overlap syndrome diagnosis	1.50 (0.29-7.92)	.63

A multivariable Cox proportional hazards model was constructed to assess the association between baseline clinical characteristics, somatic mutations, karyotypic abnormalities, and PNH clone size with the risk of adverse outcomes in patients with aplastic anemia. HRs with 95% CIs and P values are shown for each variable. Among the top-ranked features by model-derived impact scores, CUX1 and DNMT3A mutations demonstrated the highest HRs (HR of 8.18 and 2.01, respectively), although neither reached statistical significance. Age at diagnosis trended toward significance (HR, 1.02 per year; P = .094). Other mutations, including TP53, RUNX1, TET2, and ASXL1, as well as clinical features such as PNH clone size, abnormal karyotype, and AA-PNH overlap, were not significantly associated with outcome in this cohort.

CI, confidence interval; HR, hazard ratio.
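To show how the columns of Table 3 relate to one another, the sketch below recovers the standard error and a Wald P value from a reported HR and its 95% CI. It assumes the CI is symmetric on the log scale and that the P value is a 2-sided Wald test, which is common for Cox models but is not stated in the table; the function name is hypothetical.

```python
import math


def wald_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Back-calculate log(HR), its standard error, z, and a 2-sided P value.

    Assumes a 95% CI of the form exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))   # 2-sided normal tail probability
    return log_hr, se, z, p


# CUX1 row of Table 3: HR 8.18 (95% CI, 0.74-89.99), reported P = .086.
log_hr, se, z, p = wald_from_hr_ci(8.18, 0.74, 89.99)
```

For the CUX1 row this gives a P value of approximately .086, in line with the table. The wide interval reflects a large standard error on the log scale, which is why a large HR can still fall short of statistical significance.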
