Abstract
Introduction: Acute myeloid leukemia (AML) is the most frequently diagnosed acute leukemia in adults, yet its biology and outcomes are heterogeneous. These are dominated by disease-related factors, i.e. genomic aberrations. Current European LeukemiaNet (ELN) risk classifications (Döhner et al. Blood 2022) have incorporated insights provided by Hierarchical Dirichlet Mixture models (HDMM, e.g. Papaemmanuil et al., NEJM 2016) for intensively treated patients. Most recently, we developed an unsupervised classification of AML (Turki et al. EHA 2025 plenary abstract) that leverages only genetic features (86 in total: 55 gene mutations plus 31 cytogenetic features) from a large HARMONY AML cohort (n=5,244) and redesigns the post-processing of the HDMM, resulting in important refinements.
Aim: Here, we independently tested the generalizability and robustness of the assignment into the HARMONY components using two large cohorts from Italy (n=1,969) and China (n=905) and compared the results with those of the HARMONY AML derivation cohort.
Methods: Data from 2,874 patients with newly diagnosed AML (aged >16 years) from Humanitas University, Milan, Italy, and Zhejiang University, Hangzhou, China, were selected based on the availability of a panel of 55 gene mutation features and a karyotype categorized into 31 additional features. A few genetic variables missing from the validation datasets (e.g. KIT mutation locus, exon 8 or 17) were imputed as non-mutated. For each of these two geographically diverse validation cohorts, individual patients were separately mapped to the 17 previously developed HDMM components using their genetic characteristics (i.e. individual cytogenetic aberrations and mutations). Hence, the mutational landscape was characterized by the HDMM, whose components were further categorized with the multivariate Fisher's Non-Central Hypergeometric distribution (FNCH, Dall'Olio et al. PLOS Comp. Biol. 2023).
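The imputation step described in the Methods can be illustrated with a minimal sketch. It assumes each patient is stored as a mapping from feature name to a 0/1 mutation status; the three-feature panel excerpt, the feature labels (e.g. `KIT_exon8`) and the function name `align_to_panel` are hypothetical and are not taken from the study pipeline.

```python
from typing import Dict, List, Optional

# Hypothetical three-feature excerpt; the panel described above has 86 features.
PANEL: List[str] = ["NPM1", "KIT_exon8", "KIT_exon17"]


def align_to_panel(record: Dict[str, Optional[int]], panel: List[str]) -> List[int]:
    """Return a 0/1 vector in panel order.

    Features that are absent from the record, or recorded as None,
    are imputed as 0 (non-mutated). Features outside the panel are ignored.
    """
    vector: List[int] = []
    for feature in panel:
        value = record.get(feature)
        vector.append(0 if value is None else int(bool(value)))
    return vector


# Example: KIT exon 8 was not assessed and KIT exon 17 has no recorded value.
patient = {"NPM1": 1, "KIT_exon17": None}
print(align_to_panel(patient, PANEL))
```

Treating an unassessed locus as non-mutated keeps every patient on the same feature vector, at the cost of possibly under-counting mutations at loci that were not sequenced.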
Results: The 17 genomic classes previously identified in the HARMONY cohort (Turki et al. EHA 2025 plenary abstract) were independently validated in two geographically diverse cohorts. All 2,874 patients were successfully assigned. The robustness of the unsupervised classification was also demonstrated for the splitting of large components. As in the initial HARMONY derivation cohort, both the distribution of components and the differences in overall survival (OS) across the 17 classes were significant (p<0.001). In detail, the assignment for the Hangzhou cohort was overall comparable to that in the HARMONY development cohort. Differences >2% were noted and attributed to biological reasons. For instance, CEBPAbi-mutated patients were numerically more prevalent in the Chinese cohort (6% versus 5%), which is in line with previous publications (e.g. Su et al. Oncotargets 2016), while TP53-mutated AML was less frequent (5.9% vs. 8.1%). In contrast, the TP53-mutated fraction was more prevalent (10.5%) in the Italian cohort, which presented with more secondary AML cases. In the Italian cohort, the general component distribution was similar to that of the European UK-NCRI trials cohort, our first validation cohort supporting 17 genomic classes (Turki et al. EHA 2025). For instance, the RUNX1 component was enriched in both the Italian and NCRI cohorts, while the component with t(8;21) AML was less frequent. Importantly, all newly identified or subdivided AML classes of the HARMONY classification were confirmed in each validation cohort, e.g. the separation of AML patients with inv(16) plus FLT3 mutations from the main group of AML with inv(16). This split was also associated with significant differences in OS (p<0.001). Furthermore, we confirmed the three NPM1 components: FLT3 + NPM1 (both drivers), NPM1 + passengers and NPM1 + IDH2.
Conclusion: This large and geographically diverse AML genetics study provides important information on the generalizability of the HARMONY approach to an unsupervised classification of AML, which, based on multivariate analysis, has the potential to further refine the current AML risk classification.