Abstract
Measurable residual disease (MRD)-testing is widely used to evaluate therapy efficacy in chronic lymphocytic leukaemia (CLL), with undetectable MRD (uMRD) often serving as a surrogate for progression-free survival (PFS). However, its accuracy in predicting PFS in subjects receiving Bruton tyrosine kinase inhibitors (BTKis), with or without venetoclax, remains controversial.
We searched the PubMed, Web of Science, Embase, and Cochrane Library databases up to February 8, 2025 using the terms "chronic lymphocytic leukemia" OR "small lymphocytic lymphoma" AND "measurable residual disease" AND "progression-free survival". Bayesian Weibull accelerated failure time models were used for individual-level analyses, while trial-level correlations used weighted Spearman rank tests and weighted least squares regression. A total of 41 trials (9,569 subjects) were included. Therapies tested included BTKis (11 trials), venetoclax (12), the combination (5), and other therapies such as chemoimmunotherapy and monoclonal antibodies (23). MRD was assessed in blood (21 trials), bone marrow (7), or both (9).
At the individual level, subjects with detectable MRD (dMRD) had a higher risk of progression and/or death than those with uMRD (Hazard Ratio [HR] = 3.63; 95% Credible Interval [CrI] 3.30–4.00; P < 0.01). In sensitivity analysis, MRD-testing showed reduced predictive accuracy for PFS in subjects receiving BTKi-based therapy (HR = 1.52, 1.02–2.19) compared with venetoclax-based (HR = 4.46, 3.61–5.50) or other therapies (HR = 3.86, 3.45–4.34). Across subgroups, dMRD remained consistently associated with increased risk. In subjects with del(17p)/TP53 variants, dMRD predicted a greater risk (HR = 6.52; 3.47–12.79) than in those without (HR = 4.34; 3.24–5.81). By IGHV status, dMRD conferred a 3.54-fold risk in IGHV-mutated (95% CrI: 2.42–5.34) and a 2.64-fold risk in IGHV-unmutated subjects (95% CrI: 2.10–3.34). Among response categories, risk was higher in partial responders (HR = 2.39; 1.70–3.43) than in complete responders (HR = 1.64; 1.24–2.17). dMRD also predicted increased risk in subjects with relapsed/refractory (R/R) CLL (HR = 4.11, 2.54–6.84) and in untreated subjects (HR = 3.59, 3.25–3.98). Bone marrow MRD-testing was a stronger predictor of PFS than blood MRD-testing (HR = 4.27, 3.42–5.35 versus HR = 3.52, 3.17–3.92). The highest risk was seen with MRD-testing 12–32 months post-treatment (HR = 5.37, 4.56–6.40), though differences across timepoints were modest after adjusting for analytical method.
At the trial level, correlations between MRD and PFS were weak (Spearman's rho [R] = -0.35; coefficient of determination [R²] = 0.02). In sensitivity analyses, correlations remained low for BTKi-based therapy (R = -0.19 [-0.73, 0.50]; R² = 0.04) and venetoclax-based therapy (R = -0.32 [-0.85, 0.45]; R² = 0.14), but were stronger for other therapies (R = -0.80; R² = 0.58). For other therapies, MRD-testing of bone marrow samples showed higher predictive value than testing of blood (R = -0.94; R² = 0.86 versus R = -0.81; R² = 0.63). Although blood samples and testing at 9 months for other therapies met the validity threshold (R² = 0.63, P = 0.21; R² = 0.63, P = 0.06), the associated P-values preclude definitive conclusions. Additional analyses examining sample source, time points and testing method within treatment classes suggest limited accuracy in predicting PFS in heterogeneous conditions (all R < -0.85, R² < 0.60).
In subgroup analyses, trials in untreated subjects showed weak associations (R = -0.30 [-0.77, 0.13]; R² = 0.02), while there were too few R/R CLL trials for analysis. MRD-testing of blood showed lower PFS predictive accuracy than MRD-testing of bone marrow (R = -0.26 [-0.71, 0.20]; R² = 0.03 versus R = -0.37 [-1.00, 0.62]; R² = 0.03). MRD-testing at 15 months demonstrated slightly better accuracy than 9-month testing (R = -0.59 [-1.00, 0.61]; R² = 0.27 versus R = -0.26 [-0.88, 0.41]; R² = 0.04). MRD-testing by next-generation sequencing (NGS) outperformed multiparameter flow cytometry (MPFC) (R = -0.54 [-1.00, 1.00]; R² = 0.45 versus R = -0.30 [-0.76, 0.13]; R² = 0.01), though both methods remained below the pre-specified threshold for the coefficient of determination.
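The trial-level statistics reported above (a weighted Spearman rho and an R² from weighted least squares regression) can be illustrated with a minimal sketch. This is not the authors' analysis code. It assumes one common definition of the weighted Spearman coefficient (a weighted Pearson correlation of average ranks), weights proportional to trial sample size, and entirely made-up trial-level values; the function names and the choice of variables (treatment effect on uMRD versus treatment effect on PFS) are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def weighted_pearson(x, y, w):
    """Pearson correlation with observation weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    mx, my = np.average(x, weights=w), np.average(y, weights=w)
    cov = np.sum(w * (x - mx) * (y - my))
    return cov / np.sqrt(np.sum(w * (x - mx) ** 2) * np.sum(w * (y - my) ** 2))

def weighted_spearman(x, y, w):
    """Assumed definition: weighted Pearson correlation of the ranks."""
    return weighted_pearson(average_ranks(x), average_ranks(y), w)

def wls_r_squared(x, y, w):
    """Fit y = b0 + b1 * x by weighted least squares; return (beta, R^2)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns the weighted problem into ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    resid = y - X @ beta
    my = np.average(y, weights=w)
    r2 = 1.0 - np.sum(w * resid ** 2) / np.sum(w * (y - my) ** 2)
    return beta, r2

# Hypothetical trial-level data (NOT taken from the meta-analysis):
# effect on uMRD (log odds ratio), effect on PFS (log hazard ratio), sample size.
log_or_umrd = [0.4, 1.1, 1.8, 0.9, 2.3]
log_hr_pfs = [-0.2, -0.5, -0.4, -0.9, -1.1]
n_subjects = [120, 340, 210, 95, 430]

rho = weighted_spearman(log_or_umrd, log_hr_pfs, n_subjects)
beta, r2 = wls_r_squared(log_or_umrd, log_hr_pfs, n_subjects)
print(f"weighted Spearman rho = {rho:.2f}, WLS R^2 = {r2:.2f}")
```

Under these assumptions, a strongly negative rho together with an R² above the pre-specified threshold would support trial-level surrogacy; a rho near zero or a low R², as reported for BTKi-based therapy, would not.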
Our data indicate that the accuracy of MRD-testing in predicting PFS depends on whether it is assessed at the subject or trial level, on the type of therapy, and on the sample tested. Importantly, MRD-testing is not an accurate predictor of PFS for BTKis with or without venetoclax. We suggest that results of MRD-testing should not be used as a surrogate endpoint for PFS in studies of BTKis with or without venetoclax.