Key Points
The negative impact of pre-HCT flow cytometrically determined MRD is similar for AML in CR1 and CR2.
Even minute levels of MRD (≤0.1%) are associated with adverse outcome.
Abstract
Minimal residual disease (MRD) before myeloablative hematopoietic cell transplantation (HCT) is associated with adverse outcome in acute myeloid leukemia (AML) in first complete remission (CR1). To compare this association with that for patients in second complete remission (CR2) and to examine the quantitative impact of MRD, we studied 253 consecutive patients receiving myeloablative HCT for AML in CR1 (n = 183) or CR2 (n = 70) who had pre-HCT marrow aspirates analyzed by 10-color flow cytometry. Three-year estimates of overall survival were 73% (64%-79%) and 32% (17%-48%) for MRDneg and MRDpos CR1 patients, respectively, and 73% (57%-83%) and 44% (21%-65%) for MRDneg and MRDpos CR2 patients, respectively. Similar estimates of relapse were 21% (14%-28%) and 58% (41%-72%) for MRDneg and MRDpos CR1 patients, respectively, and 19% (9%-31%) and 68% (41%-85%) for MRDneg and MRDpos CR2 patients, respectively. Among the MRDpos patients, there was no statistically significant evidence that increasing levels of MRD were associated with increasing risks of relapse and death. After multivariable adjustment, risks of death and relapse were 2.61 times and 4.90 times higher for MRDpos patients (P < .001). Together, our findings indicate that the negative impact of pre-HCT MRD is similar for AML in CR1 and CR2, with even minute levels (≤0.1%) being associated with adverse outcome.
Introduction
Allogeneic hematopoietic cell transplantation (HCT) is an effective therapy for many patients with acute myeloid leukemia (AML) in first or subsequent complete remission (CR).1,2 However, even in the absence of morphologically detectable disease at the time of transplantation, relapse remains a major cause of treatment failure post-HCT,2 demonstrating that microscopy-based evaluations are incapable of detecting clinically relevant amounts of tumor cells. Over the last 2 decades, several techniques were developed that enable the sensitive quantification of minimal residual disease (MRD) amounts in patients with AML in morphological remission.3-6 The most widely exploited method in AML other than acute promyelocytic leukemia is multiparameter flow cytometry (MFC)-based because AML cells feature immunophenotypic abnormalities (“leukemia-associated immunophenotypes” [LAIP]) that can be used to distinguish them from normal hematopoietic cells in the vast majority (>90%) of cases with high sensitivity.3-6
Previous studies from our group7 and others8-13 have demonstrated that MFC-detectable MRD at the time of autologous or myeloablative allogeneic HCT is a powerful, independent predictor of subsequent relapse and shorter survival for AML patients in CR. These studies have exclusively or primarily focused on patients undergoing HCT in first CR (CR1). The relationship between MRD and outcome is much less studied for patients in second CR (CR2). Furthermore, although several studies in patients with acute lymphoblastic leukemia suggest that the association between MRD and risk of post-HCT relapse is dose-dependent,6 the quantitative impact of MRD levels on outcome in AML has not been well studied. To address these uncertainties, we investigated the quantitative significance of MRD in 253 consecutive patients who underwent allogeneic myeloablative HCT for AML in CR1 or CR2 at our institution.
Patients and methods
Study cohort
Patients of all ages, identified from our computerized database, were included in this study if they had AML in CR1 or CR2 with or without incomplete peripheral blood count recovery based on morphologic criteria14,15 (ie, regardless of the presence of MRD) at the time of HCT, underwent myeloablative conditioning, had either a matched sibling or unrelated donor, and received the first transplant. We included all consecutive patients meeting these criteria if they underwent pre-HCT workup from late April 2006 (the time a refined MFC-based MRD detection method was introduced at our institution and was used routinely during the pre-HCT work-up in all patients) until November 2011. Results on the first 99 CR1 patients have been previously reported.7 We used the 2008 World Health Organization criteria to define AML16 and the refined United Kingdom Medical Research Council criteria to assign cytogenetic risk.17 Cytogenetic analysis was performed with the G-banding method. Treatment response criteria were used as proposed by international working groups.14,15 Because many patients were referred from outside institutions, molecular testing for nucleophosmin, fms-related tyrosine kinase 3, and CCAAT/enhancer binding protein alpha mutations was not uniformly available. Chronic graft-versus-host disease (cGVHD) was diagnosed using the National Institutes of Health consensus criteria.18 Information on post-transplant outcomes was captured via the Long-Term Follow-Up Program through medical records from our outpatient clinic and local clinics that provided primary care for patients. All patients were treated based on Institutional Review Board-approved protocols and provided consent in accordance with the Declaration of Helsinki. Follow-up was current as of April 1, 2013.
MFC detection of MRD
Ten-color MFC was performed on bone marrow aspirates obtained as routine baseline assessment before HCT with a panel consisting of 3 tubes as follows7 : (1) HLA-DR-Pacific Blue, CD15-fluorescein isothiocyanate, CD33-phycoerythrin (PE), CD19-PE-Texas Red (PE-TR), CD117-PE-Cy5, CD13-PE-Cy7, CD38-Alexa 594 (A594), CD34-allophycocyanin (APC), CD71-APC-A700, and CD45-APC-H7. (2) HLA-DR-Pacific Blue, CD64-fluorescein isothiocyanate, CD123-PE, CD4-PE-TR, CD14-PE-Cy5.5, CD13-PE-Cy7, CD38-A594, CD34-APC, CD16-APC-A700, and CD45-APC-H7. (3) CD56-Alexa 488, CD7-PE, CD5-PE-Cy5, CD33-PE-Cy7, CD38-A594, CD34-APC, and CD45-APC-H7. All antibodies were obtained from Beckman-Coulter (Fullerton, CA) or Becton-Dickinson (San Jose, CA). Up to 1 million events per tube were acquired on a custom-built LSRII and data compensation and analysis were performed using software developed in our laboratory (WoodList). MRD was identified as a population showing deviation from the normal patterns of antigen expression seen on specific cell lineages at specific stages of maturation as compared with either normal or regenerating marrow. Thus, this approach was not restricted to the use of LAIP, as immunophenotypic data from initial disease presentation was only available for a subset of patients; however, if available, such LAIP abnormalities were also assessed in the pre-HCT specimens. The routine sensitivity of this assay was estimated at 0.1%, although a higher level of sensitivity was possible for a subset of leukemias featuring more frankly aberrant immunophenotypes. When identified, the abnormal population was quantified as a percentage of the total CD45+ white cell events. As done previously, and as pre-specified, any level of residual disease was considered MRD-positive (MRDpos).7
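The quantification rule described above (abnormal population as a percentage of total CD45+ events, with any detectable level counted as MRDpos) can be written as a minimal sketch. The function names and the example event counts are hypothetical and are not taken from the laboratory software (WoodList) used in the study.

```python
def mrd_percent(abnormal_events: int, cd45_positive_events: int) -> float:
    """Abnormal population as a percentage of total CD45+ white cell events."""
    if cd45_positive_events <= 0:
        raise ValueError("at least one CD45+ event is required")
    if not 0 <= abnormal_events <= cd45_positive_events:
        raise ValueError("abnormal events must lie between 0 and the CD45+ total")
    return 100.0 * abnormal_events / cd45_positive_events


def mrd_status(abnormal_events: int, cd45_positive_events: int) -> str:
    """Pre-specified rule from the text: any level of residual disease is MRDpos."""
    pct = mrd_percent(abnormal_events, cd45_positive_events)
    return "MRDpos" if pct > 0 else "MRDneg"


if __name__ == "__main__":
    # Hypothetical tube: 500 abnormal events among 1,000,000 CD45+ events,
    # i.e. a level below the routine 0.1% sensitivity but still called MRDpos.
    print(mrd_percent(500, 1_000_000), mrd_status(500, 1_000_000))
```

In this sketch the call depends only on whether an abnormal population was identified at all, not on whether it exceeds the routine 0.1% sensitivity, which mirrors the pre-specified definition.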
Statistical analysis
Unadjusted probabilities of overall survival (OS) and disease-free survival (DFS) were estimated using the Kaplan-Meier method, and probabilities of nonrelapse mortality (NRM) and relapse were summarized using cumulative incidence estimates. NRM was defined as death without prior relapse and was considered a competing risk for relapse, whereas relapse was considered a competing risk for NRM. All outcomes were treated as time-to-event end points. Outcomes between MRDpos and MRD-negative (MRDneg) groups were compared using Cox regression. Multivariate models included the following additional factors: age at the time of HCT, CR status (CR1 vs CR2), cytogenetic risk group at time of AML diagnosis (unfavorable vs favorable/intermediate), type of AML at diagnosis (secondary vs primary), number of induction chemotherapy cycles before HCT, type of consolidation chemotherapy before HCT (none vs high-dose cytarabine [HIDAC]-containing vs non-HIDAC containing), CR duration before HCT, karyotype at time of HCT (normalized vs not normalized for patients presenting with abnormal karyotypes), peripheral blood counts at the time of HCT (CR vs CR with incomplete peripheral blood count recovery [CRi]), and conditioning regimen (with vs without total body irradiation [TBI]). A graft-versus-leukemia (GVL) effect was evaluated by adding cGVHD as a time-dependent covariate in the analysis of relapse. Missing cytogenetic risk and missing karyotype were each accounted for as separate categories. Categorical patient characteristics were compared between MRDpos and MRDneg groups using Pearson’s χ-square tests, and continuous characteristics were compared with two-sample Student t tests. No adjustments were made for multiple comparisons, and all two-sided P values from the regression models were derived from the Wald test. Statistical analyses were performed using STATA (StataCorp LP, College Station, TX).
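To illustrate how relapse and NRM act as mutually competing risks, the following is a minimal sketch of a nonparametric cumulative incidence estimate (Aalen-Johansen type), in which the Kaplan-Meier estimate of remaining event-free weights each cause-specific increment. It is not the STATA code used for the analyses; the function name, the cause coding (0 = censored, 1 = relapse, 2 = NRM), and the toy data are assumptions made for illustration only.

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Cumulative incidence of one cause in the presence of competing risks.

    times  : follow-up time for each patient
    causes : 0 = censored, any other integer = cause of the first event
    Returns a list of (time, cumulative_incidence, event_free_survival)
    at each distinct time with at least one event of any cause.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0  # Kaplan-Meier estimate of having had no event so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_interest = 0
        # Collect everyone whose follow-up ends at time t (ties included).
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            n_here += 1
            if cause != 0:
                d_any += 1
                if cause == cause_of_interest:
                    d_interest += 1
            i += 1
        if d_any > 0:
            # Increment: P(event-free just before t) * cause-specific hazard at t.
            cif += event_free * d_interest / n_at_risk
            event_free *= 1.0 - d_any / n_at_risk
            curve.append((t, cif, event_free))
        n_at_risk -= n_here
    return curve


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = relapse, 2 = NRM, 0 = censored.
    t = [1, 2, 3, 4, 5]
    c = [1, 2, 0, 1, 0]
    print(cumulative_incidence(t, c, cause_of_interest=1)[-1])
    print(cumulative_incidence(t, c, cause_of_interest=2)[-1])
```

A useful check on this construction is that, at any time, the cumulative incidences of relapse and NRM plus the event-free probability sum to 1, which would not hold if each end point were estimated as 1 minus a Kaplan-Meier curve that censors the competing event.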
Results
Patient characteristics
We identified 253 patients undergoing first myeloablative HCT from a matched-related or an unrelated donor for AML in first (n = 183) or second (n = 70) remission between April 2006 and November 2011 who had pre-HCT MFC studies available for retrospective analysis. All patients had <5% bone marrow blasts and thus met the morphologic criterion for CR. Among these, 54 patients had MRD by flow cytometry (ie, were MRDpos), whereas 199 others had no evidence of flow cytometric MRD (ie, were MRDneg). The characteristics of the study population, induction and consolidation chemotherapies, donors, and transplants are summarized in Table 1 and Table 2. Notably, CR2 patients differed from CR1 patients in that they were more likely to have favorable risk disease by standard cytogenetic criteria (22.9% vs 3.3%); in contrast, they were less likely to have adverse risk (11.4% vs 26.2%) or secondary AML (8.6% vs 34.4%). The CR1 duration before relapse was relatively long for patients transplanted in CR2 (median 337 days, range 9-2000 days).
The median time between MFC study and HCT was similar between MRDpos and MRDneg patients (24 days, range 11-46 vs 25 days, range 9-68 days, respectively; P = .58). Consistent with our previous findings,7 MRDpos patients were of comparable age (P = .32), but more likely had AML with unfavorable vs favorable/intermediate cytogenetics (P = .040) and also had a higher prevalence of secondary AML (P = .004) (Table 3). Among CR1 patients, those with MRD less often received consolidation therapy than those who were MRDneg (P = .001), whereas a high proportion of CR2 patients did not receive consolidation chemotherapy before HCT, regardless of MRD status (P = .21). The median duration of remission prior to HCT was shorter for MRDpos than MRDneg patients undergoing HCT in CR1 (P = .015), whereas no such difference was noted for the subset of CR2 patients (P = .93). Similarly consistent with our previous findings,7 a higher proportion of MRDpos patients had incomplete blood count recovery and were thus classified as having CRi rather than CR relative to MRDneg patients (P = .006). Likewise, MRDpos patients were more likely to have abnormal cytogenetic studies than MRDneg patients at the time of HCT (P < .001).
Association between MRD status and post-HCT outcome
There were a total of 93 deaths, 75 relapses, and 35 NRM events contributing to the probability estimates for OS, DFS, relapse, and NRM stratified by MRD status for CR1 and CR2 patients. The median follow-up after HCT among survivors was 1,134 days (range 389-2,230 days) for CR1 patients and 1,217 days (range 376-2,428 days) for CR2 patients, respectively. Among CR1 patients, the 3-year estimates of OS were 73% (95% confidence interval [CI] 64%-79%) and 32% (95% CI 17%-48%) for MRDneg and MRDpos patients, respectively; among CR2 patients, the 3-year OS was estimated to be 73% (95% CI 57%-83%) and 44% (95% CI 21%-65%), respectively (Figure 1A). For DFS, similar estimates were 69% (95% CI 60%-76%) and 19% (95% CI 8%-34%) for MRDneg and MRDpos patients transplanted in CR1, respectively, and 69% (95% CI 54%-80%) and 21% (95% CI 6%-42%) for MRDneg and MRDpos patients transplanted in CR2, respectively (Figure 1B). Three-year estimates of relapse among CR1 patients were 21% (95% CI 14%-28%) and 59% (95% CI 41%-72%), respectively, and 19% (95% CI 9%-31%) and 68% (95% CI 41%-85%), respectively, among CR2 patients (Figure 1C). Finally, among CR1 patients, the 3-year estimates of NRM were 11% (95% CI 6%-16%) and 22% (95% CI 10%-37%) for MRDneg and MRDpos patients, respectively; among CR2 patients, 3-year NRM was estimated to be 12% (95% CI 5%-23%) and 11% (95% CI 2%-30%), respectively (Figure 1D). Among the CR1 patients who underwent HCT without evidence of MRD, post-HCT disease eradication appeared to be somewhat better if consolidation chemotherapy was given before transplantation. Specifically, in this patient subset, having received any consolidation chemotherapy before transplantation was associated with a trend toward lower risk of relapse (hazard ratio [HR] = 0.48; 95% CI 0.21-1.07; P = .07) and better DFS (HR = 0.60; 95% CI 0.30-1.17; P = .13).
Pre-HCT MRD status as independent prognostic factor
Univariate regression models for OS, DFS, relapse, and NRM were fit to assess the relevance of MRD as a prognostic factor. As summarized in Table 4, being MRDpos at the time of HCT was significantly associated with shorter OS (P < .001) and DFS (P < .001), as well as with an increased risk of relapse (P < .001) and NRM (P = .017). The association of MRD with outcome among patients in CR1 was similar to that among patients in CR2 (eg, P = .63 and P = .77 for tests of interaction for mortality and relapse, respectively). Among patients with MRD, there was no statistically significant evidence that increasing levels of MRD were associated with increasing risk of any outcome. This was true when MRD was evaluated as a continuous variable (on a log scale) or as a test for a trend across the 3 groups: ≤0.1% (n = 14), >0.1% to 1% (n = 24), and >1% (n = 16) (Figure 2).
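The two ways of coding MRD level described above (a log-scale continuous variable and 3 ordered groups) can be sketched as follows. The function names and group labels are hypothetical, and the choice of base 10 for the logarithm is an assumption, because the text specifies only "a log scale".

```python
import math


def mrd_level_group(mrd_pct: float) -> str:
    """Assign an MRD percentage to the groups used for the trend test."""
    if mrd_pct < 0:
        raise ValueError("MRD percentage cannot be negative")
    if mrd_pct == 0:
        return "MRDneg"
    if mrd_pct <= 0.1:
        return "<=0.1%"
    if mrd_pct <= 1.0:
        return ">0.1% to 1%"
    return ">1%"


def log10_mrd(mrd_pct: float) -> float:
    """Log-scale MRD level; defined only for MRDpos patients (level > 0)."""
    if mrd_pct <= 0:
        raise ValueError("log scale is defined only for detectable MRD")
    return math.log10(mrd_pct)  # base 10 is an assumption, see lead-in


if __name__ == "__main__":
    for level in (0.0, 0.05, 0.1, 0.5, 1.0, 2.5):  # hypothetical levels
        print(level, mrd_level_group(level))
```

Because the logarithm is undefined at zero, the continuous coding applies only within the MRDpos subset, which matches the text's restriction of the dose-response analysis to patients with MRD.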
Multivariate models were fit for OS, DFS, relapse, and NRM using MRD status (MRDpos vs MRDneg), age at HCT, CR status (CR1 vs CR2), cytogenetic disease risk at diagnosis (adverse vs intermediate/favorable), type of AML (secondary vs primary), number of induction chemotherapy cycles before HCT, type of consolidation chemotherapy before HCT (none vs HIDAC-containing vs non-HIDAC containing), CR duration before HCT, pre-HCT karyotype (not normalized vs normalized for patients initially presenting with abnormal karyotype), pre-HCT peripheral blood count recovery (CRi vs CR), and conditioning regimen (with vs without TBI) as covariates. After adjustment for these factors, the hazard ratios of MRDpos vs MRDneg were 2.61 (95% CI 1.62-4.20; P < .001) for overall mortality, 3.74 (95% CI 2.38-5.87; P < .001) for failure for DFS, 4.90 (95% CI 2.87-8.37; P < .001) for relapse, and 1.88 (95% CI 0.78-4.53; P = .16) for NRM, respectively (Table 5). Then we performed similar multivariate models restricting the study cohort to those 216 patients who met peripheral blood criteria for CR as proposed by international working groups.14,15 We found very similar hazard ratios of MRDpos vs MRDneg after adjustment for the same covariates: overall mortality 3.14 (95% CI 1.82-5.43; P < .001); failure for DFS 4.72 (95% CI 2.83-7.86; P < .001); relapse 6.78 (95% CI 3.70-12.40; P < .001); and NRM 1.80 (95% CI 0.65-5.02; P = .26).
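As a reading aid for the hazard ratios above, the following sketch back-calculates an approximate Wald statistic and two-sided P value from a reported HR and its 95% CI. It assumes that the CI is symmetric on the log scale and that a normal approximation applies. It is not how the study P values were produced (those came from the fitted Cox models), and rounding of the published HR and CI limits the precision of the result. The function name is hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_hr_ci(hr: float, lower: float, upper: float):
    """Approximate SE of log(HR), Wald z, and two-sided P from HR and 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


if __name__ == "__main__":
    # Adjusted estimates quoted in the text (MRDpos vs MRDneg).
    for label, hr, lo, hi in [
        ("overall mortality", 2.61, 1.62, 4.20),
        ("relapse", 4.90, 2.87, 8.37),
        ("NRM", 1.88, 0.78, 4.53),
    ]:
        se, z, p = wald_from_hr_ci(hr, lo, hi)
        print(f"{label}: z = {z:.2f}, approximate P = {p:.3g}")
```

Under these assumptions, the mortality and relapse estimates give P values below .001 and the NRM estimate gives a P value that rounds to .16, in line with the values reported in the text.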
Effect of cGVHD on relapse in MRDpos and MRDneg patients
Thus far, our analyses have focused on the relationship between pre-HCT characteristics (including conditioning regimen) and post-HCT outcome in MRDpos and MRDneg patients. However, as GVHD has been linked to anti-leukemic (GVL) effects of allogeneic HCT and relapse risk for AML patients,19,20 and it is conceivable that such an effect might be different for MRDpos and MRDneg patients, we assessed the impact of cGVHD on post-HCT relapse and its relationship to MRD status in our cohort. With cGVHD modeled as a time-dependent variable, the development of cGVHD was associated with a lower risk of relapse (HR = 0.46; 95% CI 0.24-0.88; P = .02). However, the magnitude of the GVL effect of cGVHD was similar between patients with and without MRD (for MRDpos, HR = 0.39, 95% CI 0.15-1.03 [P = .06] and for MRDneg, HR = 0.52, 95% CI 0.24-1.11 [P = .09]). The difference in effect between MRDpos and MRDneg was not statistically significant (P = .63).
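The comparison of the cGVHD effect between MRD subgroups can be approximated from the published subgroup estimates alone by treating the two log hazard ratios as independent and testing their difference. This is a rough sketch under that assumption and under a normal approximation. The reported P = .63 came from the regression model itself, so this calculation should land in the same range without reproducing it exactly. The function names are hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def log_hr_and_se(hr: float, lower: float, upper: float):
    """Log hazard ratio and its SE, back-calculated from a 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2.0 * Z_975)


def heterogeneity_p(estimate_a, estimate_b) -> float:
    """Two-sided P for a difference between two independent log hazard ratios."""
    log_a, se_a = log_hr_and_se(*estimate_a)
    log_b, se_b = log_hr_and_se(*estimate_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    mrd_pos = (0.39, 0.15, 1.03)  # cGVHD effect on relapse, MRDpos (from the text)
    mrd_neg = (0.52, 0.24, 1.11)  # cGVHD effect on relapse, MRDneg (from the text)
    print(f"approximate heterogeneity P = {heterogeneity_p(mrd_pos, mrd_neg):.2f}")
```

The wide, overlapping subgroup intervals are what drive the large P value here: with so few relapses after cGVHD onset in each stratum, only a very large difference in effect would be detectable.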
Discussion
For many patients diagnosed with AML, myeloablative allogeneic HCT is an option once a first morphologic CR is obtained with chemotherapy. Consistent with previous studies,6 the data presented in this retrospective analysis demonstrate that such patients have a very favorable long-term outcome with a 3-year OS that approximates 70% to 75%, and a 3-year cumulative incidence of relapse of approximately 20% to 25% if they have no flow cytometric evidence of MRD at the time of transplantation. Conversely, relative to MRDneg patients, MRDpos CR1 patients had a significantly worse outcome with a 3-year cumulative incidence of relapse that approximates 60%, resulting in an estimated survival of approximately 30%. The current study extends these findings to AML patients transplanted in morphologic CR2. Specifically, in our cohort, the outcomes of MRDneg patients were similar for CR1 and CR2 patients. Likewise, the outcomes of MRDpos patients were similar for CR1 and CR2 patients. At first glance, the relatively similar outcome for CR1 and CR2 patients may be surprising. However, our data suggest that, rather than the remission number, MRD status (and, therefore, the susceptibility to preceding chemotherapy) is the dominating pre-HCT factor associated with post-HCT relapse risk and outcome.
MRDneg and MRDpos patients differed in many factors that predict outcome in AML, including cytogenetic disease risk and type of AML (secondary vs primary). There were also notable differences among these patient subsets regarding several pre-HCT factors, such as pre-HCT blood count recovery or abnormal pre-HCT karyotype, previously shown by us to be associated with worse post-HCT outcome in univariate analyses for AML patients undergoing transplantation in CR1.7 However, our multivariate models indicate that pre-HCT MRD is an adverse risk factor for HCT outcome for both CR1 and CR2 patients even after adjusting for these other factors. With the caveat that we did not have full molecular characterization of all cases, these multivariable Cox regression models suggest that MRD is the decisive pre-HCT factor for post-HCT outcome and the only one independently associated with increased relapse risk and shorter OS and DFS (Table 5).
Our data confirm the previous studies showing that the development of cGVHD is associated with a reduced risk of relapse in AML patients undergoing myeloablative HCT.19,20 At least in our cohort, we were unable to discern any statistically significant difference in the magnitude of this GVL effect between MRDpos and MRDneg patients. This finding suggests that the high relapse risk after allogeneic transplant for MRDpos patients is not due to less strong GVL effects in this patient subset. This observation is somewhat reminiscent of recent data indicating that the allogeneic HCT-associated reduction of relapse and improvement of survival in patients with monosomal karyotype AML is relatively similar to that of patients with less unfavorable AML subtypes.21
The threshold below or above which patients should be considered MRDneg or MRDpos based on flow cytometric assessment of residual tumor amounts has been controversial, and several groups have proposed the use of different thresholds above the minimal detection limit as optimal cutoffs for the best segregation of patients into categories of post-HCT relapse risk rather than using the technical detection limit of the MRD assay as threshold.5,6 Our findings from the current study lead us to question the usefulness of this approach. Specifically, in our cohort, the risk of relapse among MRDpos patients with a level ≤0.1% (a level considered “negative” in recent series12,13) was significantly higher than that among patients in which we were unable to detect any MRD. On the other hand, among patients with MRD, there was no statistically significant evidence that increasing levels of MRD were associated with increasing risk of any outcome. Of course, despite the size of our study cohort, the number of MRDpos patients was still relatively modest, and such an association cannot be ruled out with certainty. In fact, our HR estimates for patients with MRD >1% were consistently (but statistically nonsignificantly) higher than those with MRD ≤0.1%, and the study of larger numbers of patients may indeed yield a statistically significant quantitative difference between these patient subsets. Nonetheless, these data suggest that MRDpos patients (regardless of the level of MRD) are more similar to each other than MRDneg patients are to MRDpos patients with the lowest detectable levels of MRD, an observation that would support the approach of using the MRD assay detection limit as a threshold to distinguish MRDneg from MRDpos, as is currently the approach at our institution. As stated, we a priori defined MRDpos as any level of residual disease. Although the number of MRDpos patients in the different residual disease categories was not sufficient to evaluate the full range of association, it is noteworthy that when assessing each possible cutoff within the ≤0.1% category, we found the 0 vs >0 split to be the most statistically significant.
In our previous study, we observed that, in addition to higher risk of relapse and inferior OS and DFS, MRDpos patients with AML transplanted in CR1 also had a higher risk of NRM relative to MRDneg patients.7 Our present analysis on the expanded CR1 and CR2 AML patient cohort confirms this initial finding, indicating an approximately twofold increased risk of NRM for MRDpos patients that did not, however, remain statistically significant after multivariate adjustment. We speculate that differences in the type and timing of pre-HCT therapy may account for this increase in NRM, but further studies will be required to better understand this relationship.
In summary, our findings suggest that the negative impact of MRD on outcome among AML patients in CR2 is similar to the negative impact seen in patients in CR1, whereas the outcomes of MRDneg AML patients are excellent after myeloablative HCT in either CR1 or CR2. An important question to address in future well-controlled studies is whether less intensive consolidation strategies (eg, chemotherapy, autologous HCT, or reduced-intensity allogeneic HCT) could provide a similar level of disease control with lower risks for treatment-related toxicities and mortality for these low-risk (ie, MRDneg) patients. On the other hand, even patients with minute amounts of MRD (≤0.1%) have significantly worse outcomes than MRDneg patients. Observations by others have suggested that the outcomes of AML patients with pre-HCT MRD may be better when they undergo allogeneic rather than autologous HCT,6 consistent with the demonstration of a GVL effect in MRDpos patients in our cohort. However, no well-controlled studies have analyzed the outcomes of MRDpos patients after different transplantation strategies and assessed the differential impact, if any, of such immunologic anti-leukemia effects. Clearly, the poor outcome of MRDpos patients provides the rationale for interventional studies investigating whether cure rates could be improved by MRD-directed therapy before, during, or after HCT.
There is an Inside Blood commentary on this article in this issue.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
The authors thank the physicians and nurses of the HCT teams, the staff in the Fred Hutchinson Cancer Research Center’s Long-Term Follow-Up Program, and the Hematopathology Laboratory at the University of Washington for assisting with research protocols.
This work was supported by a grant from the National Institutes of Health, National Cancer Institute (P30-CA015704-35S6). S.A.B. is the recipient of a Trainee Research Award from the American Society of Hematology.
Authorship
Contribution: R.B.W. and F.R.A. contributed to conception and design of the study; J.M.P., B.L.W., B.M.S., M.F., B.G., C.D., J.P.R., and F.R.A. contributed to provision of study material, patient recruitment, and acquisition of data; R.B.W., S.A.B., and J.M.P. participated in collection and assembly of data; R.B.W., S.A.B., B.E.S., E.H.E., and F.R.A. participated in data analysis and interpretation; R.B.W. and F.R.A. participated in drafting the manuscript; and R.B.W., S.A.B., J.M.P., B.L.W., B.E.S., B.M.S., M.F., B.G., C.D., J.P.R., E.H.E., and F.R.A. critically revised the manuscript and gave final approval to submit for publication.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Roland B. Walter, Clinical Research Division, Fred Hutchinson Cancer Research Center; 1100 Fairview Ave N, D2-190, Seattle, WA 98109-1024; e-mail: rwalter@fhcrc.org.