Abstract
Background: Non-Hodgkin lymphoma (NHL) is the most common type of lymphoma, with diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma (FL) as its most frequent subtypes. Although some plasma proteins (e.g., AFP and PSA) are used clinically for cancer risk screening, plasma biomarkers for NHL remain underexplored. This study uses a data-driven proteomics approach in a large-scale prospective cohort to identify plasma biomarkers associated with NHL and to inform early risk prediction.
Methods: Using data from the UK Biobank, we analyzed plasma proteomic profiles from 50,281 participants and identified proteins significantly associated with NHL risk using Cox regression models. Mendelian randomization was employed to assess causality, and LightGBM machine learning models were used to predict NHL risk. Temporal trajectories of plasma protein levels were analyzed over the 13-year period prior to diagnosis. In addition, transcriptomic analyses of the genes encoding these proteins and drug sensitivity analyses were conducted using relevant databases.
Results: During a median follow-up of 13.6 years, 317 new cases of NHL were recorded (0.63%). Several plasma proteins, including PDCD1, TNFRSF9, and BCL2, were strongly associated with NHL risk: higher baseline levels of each were associated with an increased risk of subsequently developing NHL. PDCD1 and TNFRSF9 demonstrated robust predictive performance, achieving area under the receiver operating characteristic curve (AUC) values of 0.81 and 0.80, respectively, for predicting the 5-year incidence of NHL. For DLBCL, PDCD1 showed a 5-year AUC of 0.83, while for FL, TNFRSF9 showed an AUC of 0.88. Although BCL2 exhibited lower predictive value, it showed significant causal effects across all NHL subtypes. A combined protein model achieved AUCs exceeding 0.82 for NHL prediction over 5, 10, and 15 years. Temporal trajectory analysis revealed that PDCD1, TNFRSF9, and BCL2 levels began to deviate from those of controls at least 10 years prior to diagnosis, with a marked increase in the 5 years preceding disease onset. Transcriptomic analysis confirmed that the genes encoding these proteins are upregulated in lymphoma relative to normal controls, and drug sensitivity profiling suggested that these proteins could serve as therapeutic targets.
Conclusions: PDCD1, TNFRSF9, and BCL2 are strong predictors of NHL risk, with detectable changes in plasma levels occurring years before clinical diagnosis. The predictive model built on these proteins holds promise for clinical application as a noninvasive, cost-effective tool for routine cancer screening and risk assessment.