Abstract
Background: Emerging evidence suggests that host gut microbiota-related genes (GMRGs) are critically involved in orchestrating immune responses and influencing leukemogenesis in pediatric acute lymphoblastic leukemia (ALL). However, their prognostic relevance in this context remains largely unexplored. This study sought to explore the prognostic value of host GMRGs and to construct a clinically applicable prediction model to enhance risk assessment and personalized management in pediatric ALL.
Methods: RNA-seq and clinical data of 532 pediatric ALL patients were obtained from the TARGET-ALL Phase II project (phs000218.v22.p8). After removing patients with missing overall survival (OS) information or poor-quality expression data, 466 cases were included for analysis. A curated list of 239 host GMRGs was retrieved from the GutMGene database and intersected with the expression matrix. Univariate Cox regression was used to identify OS-related genes, followed by functional enrichment analyses. A 7-gene prognostic signature was constructed using LASSO Cox regression and then validated by multivariate Cox analysis incorporating clinical variables. A nomogram model was developed to estimate 1-, 3-, and 5-year survival probabilities. Model performance was evaluated via calibration plots, Harrell's concordance index (C-index), and decision curve analysis (DCA). Kaplan–Meier and time-dependent ROC analyses were used to assess survival discrimination. All statistical analyses were conducted in R version 4.5, with p < 0.05 considered statistically significant.
Results: Of the 239 GMRGs, 110 were significantly associated with OS and were enriched in pathways related to inflammatory responses, oxidative stress, microbial recognition and response, and immune regulation. Using LASSO Cox regression, a 7-gene prognostic signature (MYD88, AURKAIP1, SERINC2, PRKCZ, AKT1, HDAC2, and LGR5) was established, and an individual risk score was calculated as the sum of the weighted gene expression values. Multivariate Cox regression confirmed that both the risk score (HR = 2.26, p < 0.001) and age (HR = 1.04, p = 0.045) were independent prognostic factors. An individualized nomogram model was then developed, with the survival probability at time t calculated by the equation $\hat{S}(t) = S_0(t)^{\exp(0.8057 \times \text{risk score} + 0.0365 \times \text{age})}$, where $S_0(t)$ is the estimated baseline survival function. The nomogram showed excellent agreement between predicted and actual survival outcomes, and DCA confirmed substantial net clinical benefit across a range of threshold probabilities. Kaplan–Meier analysis revealed significantly reduced OS in the high-risk group (p < 0.0001), and the time-dependent ROC analysis yielded AUCs of 0.78, 0.85, and 0.86 at 1, 3, and 5 years, respectively. Harrell's C-index was 0.821 for the risk score model and 0.806 for the integrated clinical model, demonstrating both predictive accuracy and clinical utility.
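As an illustration of how the nomogram equation reported in the Results could be evaluated, the following Python sketch raises a baseline survival value to the power of the exponentiated linear predictor. Only the two coefficients (0.8057 and 0.0365) come from the abstract. The function names, the baseline survival values, and the example patient are hypothetical placeholders, and the age unit is assumed to match whatever unit was used when the model was fitted. The abstract does not report the baseline survival function or the per-gene LASSO weights, so the risk score is taken here as an already computed input.

```python
import math

# Coefficients of the nomogram's linear predictor, as reported in the abstract.
BETA_RISK_SCORE = 0.8057
BETA_AGE = 0.0365


def linear_predictor(risk_score: float, age: float) -> float:
    """Return 0.8057 * risk score + 0.0365 * age."""
    return BETA_RISK_SCORE * risk_score + BETA_AGE * age


def survival_probability(baseline_survival: float, risk_score: float, age: float) -> float:
    """Evaluate S(t) = S0(t) ** exp(linear predictor) for one time point.

    baseline_survival is S0(t) and must lie in [0, 1]. The abstract does not
    report it, so the caller has to supply a value from the fitted model.
    """
    if not 0.0 <= baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be between 0 and 1")
    return baseline_survival ** math.exp(linear_predictor(risk_score, age))


# HYPOTHETICAL baseline survival at 1, 3, and 5 years (not from the study).
hypothetical_baseline = {1: 0.999, 3: 0.995, 5: 0.993}

# HYPOTHETICAL patient: risk score 1.2, age 8 (unit assumed to match the model).
for years, s0 in hypothetical_baseline.items():
    prob = survival_probability(s0, risk_score=1.2, age=8.0)
    print(f"{years}-year predicted survival: {prob:.3f}")
```

Because the exponent grows with the linear predictor and the baseline lies between 0 and 1, a higher risk score or older age always yields a lower predicted survival probability at a given time point.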
Conclusion: This study is the first to establish a practical and visualized 7-gene GMRG-based individualized prognostic model for pediatric ALL. The model demonstrates strong predictive power, robustness, and clinical applicability, offering a valuable tool for risk stratification, therapeutic decision-making, and personalized follow-up in pediatric ALL patients.
Keywords: Pediatric acute lymphoblastic leukemia; gut microbiota-related genes; prognostic model; risk score; nomogram