Abstract
Improvements in multiple myeloma therapy have led to deeper responses that are beyond the limit of detection by historical immunohistochemistry and conventional flow cytometry in bone marrow samples. In parallel, more sensitive techniques for assessing minimal residual disease (MRD) through next-generation flow cytometry and sequencing have been developed and are now routinely available. Deep responses when measured by these assays correspond with improved outcomes and survival. We review the data supporting MRD testing as well as its limitations and how it may fit in with current and future clinical practice.
Learning Objectives
Understand the role of MRD status in prognosis in multiple myeloma
Explore how to apply MRD testing in clinical practice and its limitations
CLINICAL CASE
A 57-year-old woman with multiple myeloma (MM) has completed initial therapy with lenalidomide, bortezomib, and dexamethasone, followed by high-dose melphalan and autologous stem cell transplant (auto-SCT). She initially presented with anemia with hemoglobin of 6.1 g/dL, and the disease was staged as Revised International Staging System stage II. Fluorescence in situ hybridization (FISH) showed gain of 1q. She is on maintenance therapy with lenalidomide and ixazomib. Laboratory studies showed no monoclonal protein, and the serum free light chain ratio was normal. She mentions that she went to a patient education session and heard about “minimal residual disease (MRD) testing.” She is interested in having MRD testing performed.
Introduction
Response assessment in MM has traditionally relied on measuring the monoclonal protein in the serum and urine and plasma cell involvement in the bone marrow and, more recently, serum free light chains. The past 2 decades have seen tremendous progress in the treatment of MM with the approval and adoption of effective agents such as proteasome inhibitors (bortezomib, carfilzomib, ixazomib), immunomodulatory drugs (lenalidomide, pomalidomide), and, more recently, anti-CD38 monoclonal antibodies (daratumumab and isatuximab). With the increasing use of these agents, especially in 3- and 4-drug combinations, responses have substantially deepened in newly diagnosed patients in whom complete responses (CRs) are routinely achieved, for example, from historically 10% with thalidomide and dexamethasone1 to 95% in a recently reported combination of daratumumab, carfilzomib, lenalidomide, and dexamethasone (dara-KRd) without the use of high-dose melphalan.2 Similar trends are also seen in relapsed disease, especially with the advent of anti-B-cell maturation antigen (BCMA) directed therapies. To better assess these improving responses, MRD testing using more sensitive tools has emerged. This review provides an overview of MRD assessment in MM and highlights the practical aspects of MRD testing.
Importance of depth of response
Intuitively, the depth of response with myeloma therapy correlates with long-term outcomes. The relationship between CR and progression-free survival (PFS) and overall survival (OS) has been consistently demonstrated in a meta-analysis of trials using intensive therapy combined with older3 or contemporary therapies.4 Moreover, an analysis of 344 patients treated by the Grupo Español de Mieloma (GEM) and Programa Para el Estudio de la Terapéutica en Hemopatías Malignas (PETHEMA) groups noted differences in survival between CR, near CR, and very good partial response.5 Similar observations with CR hold true in older patients who are not eligible for high-dose therapy.6 This raises the question of whether further gains may be seen with even deeper responses such as MRD-negative disease. It should be noted that this relationship between response and outcomes generally holds true only when the method of achieving this depth of response is well tolerated. The Eastern Cooperative Oncology Group (ECOG) study of high-dose dexamethasone with lenalidomide7 and the BELLINI trial of venetoclax with bortezomib and dexamethasone8 are instructive exceptions, illustrating that deeper responses are not always associated with improvements in survival.
The term minimal residual disease conventionally refers to disease in the bone marrow space. Measurement of minimal disease in the bone marrow is relevant because the marrow commonly serves as the reservoir for disease relapse. Methods currently used to detect MRD include multiparameter, next-generation flow cytometry (NGF) and next-generation sequencing (NGS) (Table 1). Flow cytometry, introduced for this purpose more than 10 years ago, was the first technique used to evaluate MRD, with a sensitivity of 10⁻⁴ initially.10,11 The sensitivity has improved to 2 × 10⁻⁶, and the EuroFlow consortium has standardized the methodology.12 NGS has also emerged in parallel for measuring MRD, in which immunoglobulin gene segments are amplified using consensus primers and sequenced.13 Currently, the sensitivity of the Adaptive Clonoseq platform (previously known as LymphoSIGHT) is 6.77 × 10⁻⁷ with 20 µg DNA from 1 mL of bone marrow aspirate.14,15
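The sensitivities quoted above are bounded by how many cells are actually analyzed. The following is a minimal sketch of that arithmetic, not a description of either assay's validated limit of detection. It assumes roughly 6.6 pg of DNA per diploid human cell (a textbook approximation, not a figure from the cited studies), and the function names and the `min_events` parameter are hypothetical.

```python
# Assumption: approximate DNA mass of one diploid human cell, in picograms.
PG_DNA_PER_CELL = 6.6


def cells_from_dna(micrograms: float) -> float:
    """Approximate number of cell equivalents in a DNA input (1 ug = 1e6 pg)."""
    return micrograms * 1e6 / PG_DNA_PER_CELL


def best_case_sensitivity(cells_analyzed: float, min_events: int = 1) -> float:
    """Smallest detectable tumor fraction if min_events tumor cells must be seen."""
    if cells_analyzed <= 0:
        raise ValueError("cells_analyzed must be positive")
    return min_events / cells_analyzed


def cells_needed(target_sensitivity: float, min_events: int = 1) -> float:
    """Cells that must be analyzed to reach a target sensitivity."""
    if target_sensitivity <= 0:
        raise ValueError("target_sensitivity must be positive")
    return min_events / target_sensitivity


cells = cells_from_dna(20)  # the 20 ug DNA input quoted in the text
print(f"cell equivalents in 20 ug DNA: {cells:.2e}")
print(f"best case with 1 event: {best_case_sensitivity(cells):.1e}")
# Illustrative only: requiring 20 events among 1e7 cells gives 2e-6.
print(f"20 events in 1e7 cells: {best_case_sensitivity(1e7, 20):.1e}")
print(f"cells needed for 1e-5: {cells_needed(1e-5):.0f}")
```

The practical point is that a sample yielding too few cells cannot support a deep MRD-negative call, whatever the nominal sensitivity of the platform.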
The concordance between NGF and NGS is high. It exceeded 80% when examined in the FORTE16 and CASSIOPEIA17 trials in newly diagnosed patients. There was similarly high concordance, 85.8%18 and 92.9%,19 when comparing NGF with NGS from a different platform, LymphoTrack. The choice of assay used for MRD is based on availability and institutional preference. NGS by the Clonoseq assay is commercially available through Adaptive, and in January 2019, Medicare announced coverage of this test. NGF is also commercially available, for example, through Mayo Clinic reference laboratory. A consideration with NGS is that it requires a baseline sample to provide a trackable sequence; NGF does not require a baseline sample. In 1 series, a trackable sequence for NGS could not be identified in 7.8% of samples.20 NGF also has the advantage of assessing for hemodilution by looking for mast cell, erythroblast, and B-cell precursor populations.12 Finally, from a research perspective, NGF may be able to evaluate the bone marrow microenvironment, which may have prognostic relevance.
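To make the notion of concordance concrete, the sketch below computes raw agreement between two assays on paired samples, together with Cohen's kappa, which corrects for agreement expected by chance. The counts are invented for illustration and are not data from FORTE, CASSIOPEIA, or any other study; kappa is shown only as a common companion statistic, not as something the cited trials necessarily reported.

```python
def concordance(pairs):
    """Fraction of paired samples on which the two assays agree.

    pairs: sequence of (ngf_positive, ngs_positive) booleans.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no paired samples")
    agree = sum(1 for ngf, ngs in pairs if ngf == ngs)
    return agree / len(pairs)


def cohen_kappa(pairs):
    """Chance-corrected agreement for two binary raters (here, two assays)."""
    pairs = list(pairs)
    n = len(pairs)
    observed = concordance(pairs)
    p_ngf = sum(1 for ngf, _ in pairs if ngf) / n
    p_ngs = sum(1 for _, ngs in pairs if ngs) / n
    expected = p_ngf * p_ngs + (1 - p_ngf) * (1 - p_ngs)
    if expected == 1:
        return 1.0  # both assays gave one identical result for every sample
    return (observed - expected) / (1 - expected)


# Hypothetical paired results: 40 both positive, 45 both negative,
# 8 positive by NGF only, 7 positive by NGS only.
samples = (
    [(True, True)] * 40
    + [(False, False)] * 45
    + [(True, False)] * 8
    + [(False, True)] * 7
)
print(f"concordance: {concordance(samples):.1%}")
print(f"kappa: {cohen_kappa(samples):.2f}")
```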
Several meta-analyses have consistently shown that depth of response beyond CR correlates with improvement in OS.21-23 Although the initial meta-analyses focused on transplant-eligible patients managed with intensive therapy, in whom MRD was mostly assessed by older, less sensitive flow cytometry (10⁻⁴), the most recent meta-analysis extends prior observations to include older, transplant-ineligible patients and patients with relapsed disease.23 Compared with MRD-positive disease, MRD-negative status showed improved PFS (hazard ratio, 0.33; 95% CI, 0.29-0.37) and OS (hazard ratio, 0.45; 95% CI, 0.39-0.51) across multiple patient populations, including in relapsed disease and high-risk disease.23 Importantly, MRD status can further stratify patients in CR: OS was 112 vs 82 months for MRD-negative vs MRD-positive patients, respectively.22
Given these findings, MRD status is increasingly used as an end point when comparing different regimens, especially now that regimens are increasingly achieving deeper responses. The International Myeloma Working Group (IMWG)9 and Bone Marrow Transplant Clinical Trials Network (BMT CTN)24 have provided guidance around definitions of MRD and performance (Table 2), with the IMWG recommending a sensitivity of at least 10⁻⁵. The use of MRD as a surrogate end point for regulatory purposes is an area of active discussion25 and is being addressed by a consortium of academic groups and pharmaceutical partners, the International Independent Team for Endpoint Approval of Myeloma MRD.26-28
The depth of MRD-negative status is also important. This was initially shown with flow cytometry with sensitivity down to 10⁻⁴, where each log reduction in MRD translated into improvement in median OS.29 In the Intergroupe Francophone du Myélome (IFM) 2009 study of upfront vs deferred auto-SCT after initial therapy with lenalidomide, bortezomib, and dexamethasone, MRD status was assessed by flow cytometry in all patients, and a subset of these patients was evaluated by more sensitive NGS.30,31 Patients who were able to achieve MRD-negative status at 10⁻⁶ by NGS, which is deeper than the recommended IMWG threshold of 10⁻⁵, had superior outcomes in PFS and OS compared with MRD-positive status (Figure 1).31 Moreover, the study showed differences in outcomes between 10⁻⁶, 10⁻⁵, and 10⁻⁴. Prior to starting maintenance therapy, patients who were MRD negative had similar PFS whether they received transplant upfront or not, although patients in the transplant arm were more likely to be MRD negative (29.8% vs 20.5%). Of note, in the IFM 2009 study, MRD was assessed after completion of initial therapy and before maintenance therapy; the effect of high-dose therapy in patients who were already MRD negative prior to high-dose therapy was not addressed. Nevertheless, as long as a deep, MRD-negative response is achieved, the method of achieving the response may not be as important. For example, the CASSIOPEIA study evaluated daratumumab, bortezomib, thalidomide, and dexamethasone (dara-VTd) vs VTd in newly diagnosed patients undergoing high-dose melphalan and auto-SCT. Patients who achieved both CR and MRD-negative status had similar PFS, irrespective of treatment arm (although higher-quality responses were more common in the dara-VTd arm).32 Similar findings were observed in the FORTE study, in which outcomes of patients with MRD-negative disease sustained for 1 year were similar, irrespective of the initial treatment (KRd with auto-SCT vs 12 cycles of KRd without auto-SCT vs carfilzomib, cyclophosphamide, and dexamethasone with auto-SCT).33
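Because MRD status depends on the threshold applied, the same result can be MRD negative at one depth and MRD positive at a deeper one. The sketch below illustrates this under simplifying assumptions: the residual disease level is expressed as a single fraction of cells, a result counts as negative at a threshold only if the assay can reach that threshold and the measured level falls below it, and the function name and threshold list are hypothetical rather than taken from any guideline or assay report.

```python
# Thresholds discussed in the text, ordered from shallowest to deepest.
THRESHOLDS = (1e-4, 1e-5, 1e-6)


def deepest_negative_threshold(measured_fraction, assay_lod):
    """Return the deepest threshold at which a result is MRD negative.

    measured_fraction: residual tumor cells as a fraction of cells analyzed
                       (0.0 if nothing was detected).
    assay_lod: the assay's limit of detection for this sample.
    Returns None if the result is MRD positive at every supported threshold.
    """
    deepest = None
    for threshold in THRESHOLDS:
        assay_reaches_threshold = assay_lod <= threshold
        if assay_reaches_threshold and measured_fraction < threshold:
            deepest = threshold
    return deepest


# 3 tumor cells per million: negative at 1e-5, positive at 1e-6.
print(deepest_negative_threshold(3e-6, assay_lod=1e-6))
# Nothing detected, but the assay only reaches 1e-5 for this sample.
print(deepest_negative_threshold(0.0, assay_lod=1e-5))
# 2 tumor cells per 10,000: positive at every threshold.
print(deepest_negative_threshold(2e-4, assay_lod=1e-6))
```

Reporting the threshold alongside the result, rather than "MRD negative" alone, keeps results from assays of different depth comparable.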
In patients with high-risk disease, achieving an MRD-negative response may be even more important. An analysis of the PETHEMA/GEM2012MENOS65 trial showed that MRD-negative responses were able to overcome poor prognostic features at diagnosis, including Revised International Staging System stage III.34,35 Similar observations were seen for patients with high-risk cytogenetics in IFM 2009 and earlier PETHEMA/GEM trials.31,36 The findings with MRD extend previous observations that achieving CR was especially important in high-risk disease defined by gene expression profiling.37
It is well established with traditional response criteria that durability of response is a powerful prognostic factor38,39 and that loss of CR is associated with inferior survival.40 Durability of MRD-negative status is similarly important. This was demonstrated recently in the POLLUX and CASTOR studies, which evaluated daratumumab with lenalidomide and dexamethasone (dara-Rd) or daratumumab with bortezomib and dexamethasone, respectively, using NGS at 10⁻⁵ sensitivity.41 Patients with sustained MRD negativity over 12 months had the best outcomes, irrespective of the treatment arm, although this was more likely to be achieved in the daratumumab-containing combination. Similar findings of improved outcomes were seen with sustained MRD negativity over 6 or 12 months in newly diagnosed, transplant-ineligible patients in the ALCYONE (daratumumab, bortezomib, melphalan, and prednisone [dara-VMP] vs VMP) and MAIA (dara-Rd vs Rd) trials.42 Reflecting these observations, the IMWG defines a separate response category of “sustained MRD negative,” in which assessments by marrow and by imaging are confirmed at least 1 year apart.9 If 1 year is better, 2 years may be even better: this was demonstrated in patients with sustained MRD negativity (by NGF at 10⁻⁵) for 2 years in a trial of patients on lenalidomide maintenance.43 Moreover, in this study, loss of MRD negativity was actually worse than sustained MRD positivity.
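The "sustained MRD negative" category described above can be sketched as a simple check over serial assessments. This is an illustration rather than an implementation of the IMWG criteria: it assumes each assessment records both a marrow MRD result and an imaging result, treats 1 year as 365 days, and requires that the most recent unbroken run of double-negative assessments span at least that interval. The `Assessment` type and function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assessment:
    when: date
    marrow_mrd_negative: bool
    imaging_negative: bool


def is_sustained_mrd_negative(assessments, min_gap_days=365):
    """True if the latest unbroken run of double-negative results spans min_gap_days."""
    ordered = sorted(assessments, key=lambda a: a.when)
    run_start = None
    for a in ordered:
        if a.marrow_mrd_negative and a.imaging_negative:
            if run_start is None:
                run_start = a.when  # first negative result of a new run
        else:
            run_start = None  # any positive result breaks the run
    if run_start is None:
        return False
    return (ordered[-1].when - run_start).days >= min_gap_days


history = [
    Assessment(date(2020, 1, 1), True, True),
    Assessment(date(2021, 1, 15), True, True),
]
print(is_sustained_mrd_negative(history))  # negatives 380 days apart
```

Under these assumptions, an intervening positive result, or a negative marrow result with positive imaging, resets the clock.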
There have been several studies examining the patterns of loss of MRD negativity and its clinical relevance.44-46 For example, MRD progression by flow cytometry with sensitivity at 10⁻⁴ or by allele-specific oligonucleotide polymerase chain reaction with sensitivity of 10⁻⁵ in a series of patients on lenalidomide maintenance anticipated biochemical relapse by 4 months and clinical relapse by 9 to 10 months.45 Similarly, in a retrospective study using NGS (10⁻⁶), molecular relapse by MRD evaluation was able to predict clinical relapse.46 Serial MRD testing was able to predict clinical relapse in 9 of 10 cases, and relapse by IMWG criteria occurred at a median of 13 months (range, 1-28 months) following molecular relapse. These findings raise the question of whether initiating treatment at the time of molecular relapse rather than waiting for biochemical or clinical relapse could alter the natural history of the disease (see Relapse from MRD Negativity as Indication for Treatment study below).
Limitations of MRD assessment
An inherent limitation in MRD assessment is its reliance on measuring disease in the bone marrow. This assessment focuses on plasma cells and does not take into account the bone marrow microenvironment,47 which may play a role in shaping prognosis. From a practical perspective, bone marrow involvement may not be uniform, such as in the case of macrofocal disease,48 and perhaps most important, extramedullary disease may also be present. For example, in the IMAgerie du JEune Myélome study of the Intergroupe Francophone du Myélome 2009 trial, 26% of patients with MRD-negative disease by flow cytometry (sensitivity 10⁻⁴) had positive positron emission tomography (PET) computed tomography (CT) findings.49 Similar findings were seen in the CASSIOPET substudy of CASSIOPEIA, in which 10.5% of patients who were negative by NGF at 10⁻⁵ had positive PET CT.50 This discrepancy is relevant, as patients who were MRD negative but PET CT positive had similar outcomes to patients who were MRD positive. The discrepancy between MRD negativity and imaging is higher in patients with relapsed disease, 50% vs 12% in newly diagnosed patients in 1 series.51 Overall, as was seen in CASSIOPET, patients who are “double negative” on MRD and imaging tended to have the best outcomes, suggesting that these 2 modalities complement each other. The Deauville scale used in lymphoma has been applied to MM to standardize “metabolic response” criteria by PET and was an independent predictor for improved PFS and OS outcomes.52
Moreover, inherent to this discussion is the heterogeneity of MM, in which the depth of response may not be as important in all patients. This was previously recognized with CR, in which patients with a history of monoclonal gammopathy of undetermined significance (MGUS)53 or with an MGUS-like gene expression profile54 had lower CR rates with a tandem transplant regimen in Total Therapy 2 but superior outcomes. The cyclin D2 molecular subtype, which characteristically includes patients with t(11;14), has the lowest and slowest cumulative incidence of response, yet its outcomes with MRD-positive disease are comparable to those of other patients with MRD-negative disease.55,56 The challenge at this time is how to prospectively identify these patients in whom an MRD-negative response is not as critical.
Perhaps the most obvious limitation for MRD assessment is the requirement for a bone marrow aspiration procedure. This has motivated investigating “liquid biopsies,” using the same tools on the peripheral blood. Indeed, analysis of peripheral blood provides a systemic assessment and avoids the pitfalls of heterogeneity in bone marrow sampling. Methods involving the peripheral blood may allow for detecting and monitoring extramedullary disease that is missed by focusing on the bone marrow. When the NGF methodology optimized for bone marrow is applied to peripheral blood, however, the sensitivity is lower.57 Forty percent of patients with bone marrow that was MRD positive were negative in the peripheral blood, whereas all patients with circulating plasma cells were MRD positive in the bone marrow. Similarly, NGS has been explored on peripheral blood.58 Among patients with positive bone marrow MRD tests, the plasma test was negative 69% of the time. This may reflect the lower circulating DNA burden in peripheral blood. Other approaches under development include analysis of circulating free tumor DNA using targeted mutation detection59 or whole-genome low-pass sequencing.60
Mass spectrometry is now being used to measure monoclonal gammopathy in the peripheral blood. There are 2 forms of mass spectrometry: matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS) and liquid chromatography quadrupole time-of-flight mass spectrometry (LC-MS).61 MALDI-TOF-MS has a sensitivity of less than 0.01 g/dL62 and has replaced conventional serum protein electrophoresis at some institutions. LC-MS is even more sensitive than MALDI-TOF-MS, down to 0.005 g/dL, but has lower throughput.63 Moreover, mass spectrometry can distinguish “false-positive” bands on protein electrophoresis caused by therapeutic monoclonal antibodies such as daratumumab from the underlying disease.64 Given the increased sensitivity of mass spectrometry of the peripheral blood, studies are comparing the performance of mass spectrometry with MRD performed on bone marrow by NGS or NGF, for example, in the Stem Cell Transplant in Myeloma Incorporating Novel Agents65 and in GEM2012MENOS65 trials.66 In one study, LC-MS was estimated to be even more sensitive than NGS at 10⁻⁵ and could be used as a screen for MRD.67
Applying MRD to clinical practice
The data establishing depth of response by MRD testing and outcomes are robust, and clinical trials now routinely incorporate MRD testing to benchmark performance. MRD testing is also being used to stratify patients in clinical trials. For example, the ECOG Effective Quadruplet Utilization after Treatment Evaluation trial (NCT04566328) randomizes patients after initial therapy with dara-Rd to either consolidation with additional dara-Rd or adding bortezomib to dara-Rd, and the study stratifies by MRD status. However, applying MRD testing to patient care is evolving. Indeed, the questions raised when this topic was initially covered in this education program 4 years ago continue to be relevant now.68 There are ongoing trials to help answer this question (Figure 2). We acknowledge that there is significant variability in MRD use in clinical practice. At this time, there are no prospective, randomized data in which the information from MRD testing can guide treatment decisions. Nevertheless, if a bone marrow biopsy is being performed to confirm a CR, sending the aspirate for MRD testing is appropriate, as it may provide prognostic information as well as establish a reference point for subsequent MRD testing that may confirm sustainability of response. To increase the sensitivity, the operator should prioritize the first pull for MRD testing, given hemodilution with subsequent pulls.69 Finally, if MRD assessment is being performed, for completeness, it may be important to also assess for extramedullary disease with imaging such as PET CT.
Timing of high-dose melphalan and SCT
This has been a core question over the years for transplant-eligible patients and continues to be an ongoing area of debate. Despite the IFM 2009 trial showing significant improvement in PFS, the lack of improvement in OS30 spurs this ongoing debate, including with the FORTE trial.33 If outcomes of patients with MRD-negative and especially sustained MRD-negative disease are comparable, does it matter if this is achieved without high-dose melphalan? Although the IFM and FORTE studies incorporated MRD testing, this was after completion of initial therapy. These studies did not evaluate MRD findings before high-dose melphalan to inform decision making.
Should therapy change to deepen response?
This is another open question in the myeloma field. Attempts at answering this question, before the availability of MRD assessments, include the Myeloma XI study of risk-adapted intensification,70 which showed that the addition of cyclophosphamide, bortezomib, and dexamethasone in patients with suboptimal responses improved PFS. However, current practice does not intensify therapy above what was previously planned in patients who have not achieved an optimal response. Because myeloma therapy is continuous, responses may improve over time. In a retrospective study of a real-world practice of patients on lenalidomide maintenance, 34.3% of patients who were MRD positive (with MRD assessment according to local practice) after induction treatment achieved MRD-negative status during maintenance therapy,71 suggesting that a change in therapy may not be obligatory. The AURIGA study (NCT03901963) is examining whether adding daratumumab to lenalidomide maintenance deepens the response.
Can we de-escalate treatment?
Current practice is to treat until progression with a combination of induction and maintenance therapy. But can patients step off this “treadmill” of continuous therapy to avoid the adverse events and burden of chronic therapy? There are several trials examining de-escalation of therapy. There is an ongoing phase 2 study in newly diagnosed, transplant-eligible patients, the Monoclonal Antibody-Based Sequential Therapy for Deep Remission in Multiple Myeloma study (NCT03224507).72 Patients undergo induction therapy with dara-KRd, followed by auto-SCT. Patients who are MRD negative by NGS (10⁻⁵) after auto-SCT discontinue treatment, whereas patients who are MRD positive continue to undergo consolidation with dara-KRd for up to 2 cycles until MRD negative. In the PERSEUS trial (NCT03710603), patients on maintenance with daratumumab and lenalidomide who are MRD negative can discontinue daratumumab and continue on lenalidomide. In the DRAMMATIC trial (SWOG 1803), patients who are MRD negative after initial therapy are randomized to continue the assigned maintenance vs stopping assigned maintenance therapy.
Should therapy change if a patient becomes MRD positive?
An ongoing question is the optimal timing of treating relapsed disease. Patients who are treated at the time of biochemical rather than clinical relapse have better outcomes, as seen in a subgroup analysis of the ENDEAVOR trial (with the caveat that this study was not designed to answer this specific question).73 Could outcomes be better still if patients were treated at MRD relapse, when the burden of disease is even lower? As noted previously, the appearance of MRD-positive disease may herald biochemical or clinical relapse several months later. The REMNANT study (NCT04513639) will help answer this question.74 Patients who are MRD negative after induction therapy are randomized to start treatment at the time of MRD relapse vs at the time of progressive disease according to IMWG criteria. However, a limitation in treating patients for relapse by MRD criteria is that current clinical trials generally require measurable disease, and MRD positivity alone does not count as measurable disease for trial eligibility. Consequently, this may limit treatment to standard-of-care options instead of the potentially more innovative therapies under investigation.
CLINICAL CASE (continued)
Our patient underwent MRD testing using Adaptive NGS and was MRD negative. She had MRD testing serially for 2 years and was negative on both occasions. However, even with the sustained MRD negativity, she preferred to continue with lenalidomide and ixazomib maintenance. Testing for MRD at 5 years after auto-SCT resulted in a positive test, albeit at a low level of 0 to 1 × 10⁻⁶. With this new finding, she had further workup including whole-body low-dose CT, which did not show any new findings. She has opted to continue the current regimen, with a tentative plan of repeating MRD testing in 6 months. This case illustrates some of the challenges with MRD testing, as neither the repeated negative results nor the new low positive result prompted a change in treatment.
Conclusion
Overall, the field is fortunate that newer treatments are leading to unprecedented depths of response that require newer methods such as NGS or NGF to measure their effect. What was once a test restricted to specialized research settings or clinical trials is now readily available for any patient. Although there are maturing data on how it adds new prognostic power, there is a lag in the data for how to effectively use the test to make treatment decisions. Ongoing trials will provide data and guidance on how to incorporate MRD testing into clinical practice to tailor therapy. Moreover, the incorporation of functional imaging and liquid biopsies will provide less invasive ways of evaluating disease burden.
Conflict-of-interest disclosure
Andrew J. Yee has consulted for Adaptive, Amgen, BMS, GSK, Janssen, Karyopharm, Oncopeptides, Sanofi, and Takeda and has received clinical trial support from Adaptive, Amgen, BMS, Janssen, and Takeda.
Noopur Raje has consulted for Amgen, BMS, Bluebird, GSK, Janssen, and Karyopharm; served on scientific advisory board for Caribou and Immuneel; and received research funding from Bluebird.
Off-label drug use
Andrew J. Yee: no off-label drug use discussed.
Noopur Raje: no off-label drug use discussed.