Abstract
Background Inpatient mortality in chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL) approaches 6%, yet clinicians lack a transparent, rapidly computable tool to triage patients, target early escalation, and de-escalate low-risk care. We derived full and simplified bedside scores from a nationally representative dataset and quantified operating characteristics linked to concrete clinical actions.
Methods We analyzed the 2016–2022 National Inpatient Sample (NIS), identifying 117,765 weighted CLL/SLL admissions. Building on Charlson and Elixhauser constructs, we specified 59 ICD-10–coded predictors (52 chronic comorbidities, 7 acute complications) and fit a ridge-penalized multivariable logistic benchmark. Using grouped LASSO, we derived a 42-item integer score (Model B; −23 to +113); truncating negative coefficients yielded a 23-item non-negative score (Model C; 0–116). Models were derived on 2016–2020 data and temporally validated on 2021–2022. Performance was evaluated by AUROC, Brier score, calibration slope/intercept, and decision-curve net benefit (5–25% thresholds), with equity tested across 23 demographic–socioeconomic strata and risk stratification summarized by quartiles with relative risk (RR) referenced to Q1.
Results Score construction and content. Using NIS 2016–2022, grouped LASSO reduced the 59 ICD-10 candidate features to a 42-item integer score (Model B, −23 to +113); truncating all negative coefficients to zero, to avoid subtraction and standardize bedside use, yielded a 23-item non-negative bedside score (Model C, 0–116). Acute complications carried the largest weights: respiratory failure +24 (present in 67.6% of deaths vs 24.8% of survivors; prevalence 15.7%→27.7% across 2016–2022), sepsis +18 (51.8% vs 16.6%), tumor lysis +15, acute kidney injury (AKI) +13, acute myocardial infarction (AMI) +10, pneumonia +4, neutropenia +2, seizures +1. High-risk chronic conditions included severe liver dysfunction +14, other neurologic disorders +14, metastatic cancer +11, weight loss +7, cerebrovascular disease +7, coagulopathy +5, AIDS +4, heart failure +3, severe renal failure +3, with smaller contributions from solid tumor +3, mild liver dysfunction +3, paralysis +3, neoplasm in situ +2, peripheral vascular disease +1, movement disorder +1. Several common diagnoses carried negative (protective) points in Model B and were set to 0 in Model C: hypertension (complicated/uncomplicated −2/−2), COPD −2, other chronic lung disease −2, urinary-tract infection −2, history of MI −1, moderate renal failure −1, diabetes with/without complications −1/−1, hypothyroidism −1, other thyroid disorders −5, anemia −1, rheumatic disease −1, autoimmune disease −1, psychoses −4, depression −4, alcohol use −5, drug abuse −6.
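As a rough illustration of how the non-negative score could be tallied at the bedside, the sketch below sums the positive point values listed above. It assumes those points carry over unchanged into Model C; the dictionary keys and the function name are hypothetical labels chosen for this example, not identifiers from the study.

```python
# Positive point values as listed in the abstract (assumed to apply to Model C).
# Keys are hypothetical labels for illustration, not study identifiers.
MODEL_C_POINTS = {
    # Acute complications
    "respiratory_failure": 24,
    "sepsis": 18,
    "tumor_lysis": 15,
    "acute_kidney_injury": 13,
    "acute_myocardial_infarction": 10,
    "pneumonia": 4,
    "neutropenia": 2,
    "seizures": 1,
    # Chronic conditions
    "severe_liver_dysfunction": 14,
    "other_neurologic_disorders": 14,
    "metastatic_cancer": 11,
    "weight_loss": 7,
    "cerebrovascular_disease": 7,
    "coagulopathy": 5,
    "aids": 4,
    "heart_failure": 3,
    "severe_renal_failure": 3,
    "solid_tumor": 3,
    "mild_liver_dysfunction": 3,
    "paralysis": 3,
    "neoplasm_in_situ": 2,
    "peripheral_vascular_disease": 1,
    "movement_disorder": 1,
}


def model_c_score(conditions):
    """Sum the points for each distinct condition present.

    Raises KeyError on labels that are not in the table, so a typo
    cannot silently contribute zero points.
    """
    present = set(conditions)  # duplicates count once
    unknown = present.difference(MODEL_C_POINTS)
    if unknown:
        raise KeyError(f"unrecognized condition labels: {sorted(unknown)}")
    return sum(MODEL_C_POINTS[c] for c in present)


if __name__ == "__main__":
    # Respiratory failure (+24) plus sepsis (+18)
    print(model_c_score(["respiratory_failure", "sepsis"]))  # 42
```

Because every weight is non-negative, the total only grows as conditions are added, which is the property that removes subtraction from the bedside calculation.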
Performance and clinical utility. Crude mortality was 5.8% (6,781/117,765). The ridge benchmark achieved AUROC 0.8495 (95% CI 0.8423–0.8568) with Brier 0.057 and calibration slope ≈0.99. Simplified scores preserved performance (Model B AUROC 0.8456, Brier 0.062; Model C AUROC 0.8417, Brier 0.061; both slopes ≈0.97). Quartiles separated mortality risk from 0.79% in Q1 (Model B ≤4; Model C ≤9) to ~22% in Q4 (Model B ≥33; Model C ≥39); Q1 comprised 24.7% of admissions with NPV 99.2%, supporting de-escalation. A 10% risk threshold flagged 9.0% of admissions yet captured 57% of deaths, yielding ~3 fewer false alerts per 100 patients versus treat-all; decision-curve analysis showed net benefit across 5–25% thresholds. Subgroup performance was stable across 23 demographic/payer/site strata (lowest AUROC 0.755; mean inter-score ΔAUROC 0.006; only the Native American subgroup showed ΔAUROC >0.02). Sensitivity analyses shifted AUROC by ≤0.004, supporting robustness.
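The quartile cut-points reported for Model C can be read as a simple lookup from score to risk tier. The sketch below is one possible encoding, not the authors' implementation: the function name is hypothetical, and because the abstract gives only the Q1 (≤9) and Q4 (≥39) boundaries, the two middle quartiles are returned as a single combined band.

```python
def model_c_tier(score):
    """Map a Model C score to a risk tier using the reported cut-points.

    Q1 is a score of 9 or less and Q4 is a score of 39 or more. The boundary
    between Q2 and Q3 is not given in the abstract, so they are merged.
    """
    if score < 0:
        raise ValueError("Model C scores are non-negative")
    if score <= 9:
        return "Q1"
    if score >= 39:
        return "Q4"
    return "Q2-Q3"


# Actions paired with each tier, paraphrased from the abstract's conclusions.
TIER_ACTION = {
    "Q1": "consider de-escalation",
    "Q2-Q3": "intensified monitoring with predefined triggers",
    "Q4": "rapid escalation bundle",
}

if __name__ == "__main__":
    for s in (0, 9, 10, 38, 39):
        tier = model_c_tier(s)
        print(s, tier, TIER_ACTION[tier])
```

Keeping the thresholds in one small function makes it easy to swap in the Model B cut-points (≤4 and ≥33) if the full score is used instead.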
Conclusions A nationally derived 42-item score and a 23-item non-negative bedside score predict CLL/SLL inpatient mortality nearly as well as a penalized-regression benchmark; the non-negative score also eliminates subtraction errors and enables seconds-level computation. The tool operationalizes care: Q1 supports de-escalation, Q2–Q3 prompt intensified monitoring with predefined triggers, and Q4 or ≥10% predicted risk triggers rapid escalation bundles. Given preserved calibration, strong net benefit, and equitable subgroup performance, the non-negative score is ready for multicenter implementation testing to confirm improvements in escalation decisions, resource use, and outcomes.