Key Points
High WBC is an independent predictor of early HD in APL.
Abstract
Acute promyelocytic leukemia (APL) is commonly complicated by a complex coagulopathy. Uncertainty remains as to which markers of bleeding risk are independent predictors. Drawing from 5 large clinical trials that included all-trans retinoic acid (ATRA) as part of induction, we assessed known determinants of bleeding at baseline and evaluated them as potential predictors of hemorrhagic death (HD) in the first 30 days of treatment. The studies included were ALLG APML3 (single arm of ATRA + idarubicin ± prednisone), ALLG APML4 (single arm of ATRA + idarubicin + arsenic trioxide + prednisone), CALGB C9710 (single arm of ATRA + cytarabine + daunorubicin), Eastern Cooperative Oncology Group-American College of Radiology Imaging Network (ECOG-ACRIN) E2491 (intergroup I0129, consisting of daunorubicin + cytarabine vs ATRA), and SWOG S0521 (single-arm induction of ATRA + cytarabine + daunorubicin). A total of 1009 patients were included in the original trials, of whom 995 had sufficient data to be included in our multivariate analysis. In this final cohort, there were 37 HD cases during the first 30 days following induction, for an estimated cumulative incidence of 3.7% (95% confidence interval [CI], 2.6% to 5.0%). Using multivariate Cox proportional hazards regression, the hazard ratio of HD in the first 30 days was 2.17 (95% CI, 0.84-5.62) for an ECOG performance status of 3-4 vs 0-2 and 5.20 (95% CI, 2.70-10.02) for a white blood cell count of ≥20 000/μL vs <20 000/μL. In this large cohort of APL patients, high white blood cell count emerged as an independent predictor of early HD.
Introduction
Acute promyelocytic leukemia (APL) is an uncommon hematological malignancy characterized by a reciprocal t(15;17) chromosomal translocation and often associated with severe coagulation defects. All-trans retinoic acid (ATRA) revolutionized the treatment of this aggressive malignancy, and its use with concomitant cytotoxic chemotherapy has resulted in long-term leukemia-free survival rates in excess of 80% for individuals who survive induction and enter a complete remission.1-5 Mortality events typically occur early, with an incidence of 5% to 10% during the first month of induction in the trial setting.6-11 Early death (ED), defined as occurring within 1 month of APL diagnosis, has emerged as the most important cause of treatment failure and obstacle to cure of all patients.12 Most of those deaths are caused by bleeding, with infection the second most common etiology. The risk of early hemorrhagic mortality during induction remains at ∼3% to 5% in large clinical trials from the ATRA era11,13,14 and was 11% in a Swedish population-wide study.15 In another, larger US population-based study, the risk of ED was 17.3% overall. The estimate was even higher in patients ≥55 years of age, reaching a value of 24.2%.16 In a French study, the risk of ED was 17% for individuals treated outside of a trial,17 compared with 21.8% for the whole cohort in a pan-Canadian registry.18 Because most patients receiving induction therapy for APL do so outside of the trial setting, from a general perspective, early hemorrhagic death (HD) remains a major challenge in the management of this neoplasm.
Markers of bleeding risk during induction for APL as reported in the literature include high white blood cell (WBC) count, high peripheral blast count, low platelet count, prolonged prothrombin time (PT), prolonged activated partial thromboplastin time (PTT), decreased fibrinogen, increased lactate dehydrogenase, increased creatinine, and decreased performance status.11,13,14,19-21 The risk factors identified as independent predictors of severe bleeding in multivariate analysis vary between published studies, and only 3 of those cohorts have >200 participants. With this in mind, uncertainty persists as to what are the best independent predictors of bleeding mortality during induction chemotherapy for APL. We set out to identify such predictors in a large cohort of patients receiving ATRA-containing induction regimens for treatment of APL as part of a clinical trial.
Methods
Data sources
The investigators contacted authorities at cancer cooperative groups, Australasian Leukaemia and Lymphoma Group (ALLG), Cancer and Leukemia Group B (CALGB), Eastern Cooperative Oncology Group-American College of Radiology Imaging Network (ECOG-ACRIN), and Southwest Oncology Group (SWOG), to inquire about the availability of datasets from clinical trials using ATRA as part of induction chemotherapy for APL. Given the expected lack of homogeneity in defining severe bleeding across studies, we selected HD at 30 days as our primary endpoint because this outcome is usually reported in modern studies of induction treatment of APL, and its assessment is open to negligible information bias. In order for any given dataset to be used meaningfully in our analysis, it also had to contain reliable information on at least some of the previously reported determinants of severe bleeding, including WBC, peripheral blast count, platelet count, PT, PTT, fibrinogen, lactate dehydrogenase, creatinine, and performance status. We chose to assess those determinants at baseline because we expected this would be easier to apply to the different trial datasets and also in order to facilitate the use of our numerical estimates in future studies or in clinical practice. For studies featuring a non-ATRA–containing induction arm, only patients randomized to a regimen including ATRA were considered for the analysis, based on the fact that this drug is now a mandatory component of first-line therapy.
The database for the present study was from APL patients enrolled in 5 large clinical trials that used ATRA for induction: ALLG APML3 (single arm of ATRA + idarubicin ± prednisone),22 ALLG APML4 (single arm of ATRA + idarubicin + arsenic trioxide + prednisone),23,24 CALGB C9710 (single arm of ATRA + cytarabine + daunorubicin),25 ECOG-ACRIN E2491 (intergroup I0129, consisting of daunorubicin + cytarabine vs ATRA),8 and SWOG S0521 (single-arm induction of ATRA + cytarabine + daunorubicin).26 The latter only included low-risk patients.
Statistical analyses
Descriptive statistics were used to compare patient and disease characteristics across the 5 trials. χ2 and Kruskal-Wallis tests were used as appropriate. The primary study endpoint, time to HD, was defined as the interval from the start of induction to the date of HD within 30 days of induction commencement. Patients alive at the 30-day postinduction time point were censored. Cumulative incidence functions were used to estimate the incidence of HD at 30 days following induction initiation, treating death unrelated to hemorrhage as a competing event.
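To make the competing-risk estimate concrete, the following is a minimal sketch of a nonparametric (Aalen-Johansen type) cumulative incidence calculation. It is an illustration only, not the SAS or R code used for this study; the function name, the event coding (0 = censored, 1 = HD, 2 = death from another cause), and the toy data are assumptions introduced here.

```python
from collections import Counter

def cumulative_incidence(times, events, cause=1, horizon=30):
    """Estimate the cumulative incidence of `cause` by day `horizon`.

    Hypothetical coding: 0 = censored, 1 = hemorrhagic death,
    2 = death from another cause (the competing event).
    """
    # Tally how many patients have each event type at each distinct time.
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    event_free = 1.0  # probability of no event of any type before time t
    cif = 0.0
    for t in sorted(by_time):
        if t > horizon:
            break
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(c for e, c in counts.items() if e != 0)
        # Increment = P(event-free just before t) * cause-specific hazard at t.
        cif += event_free * d_cause / n_at_risk
        event_free *= 1 - d_any / n_at_risk
        # Everyone observed at t (events and censored) leaves the risk set.
        n_at_risk -= sum(counts.values())
    return cif

# Toy data (not from the trials): 10 patients, 2 HD, 1 competing death.
toy_times = [3, 5, 8, 12, 30, 30, 30, 30, 30, 30]
toy_events = [1, 2, 1, 0, 0, 0, 0, 0, 0, 0]
toy_cif = cumulative_incidence(toy_times, toy_events)
print(f"30-day cumulative incidence of HD: {toy_cif:.3f}")
```

In this toy example no patient is censored before the last event, so the estimate reduces to the crude proportion of 2 HD cases among 10 patients. The two quantities diverge once censoring occurs before later events.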
Cause-specific Cox proportional hazards regression was used to evaluate the univariate and multivariate associations between the risk of day 30 HD and disease characteristics. Patients who died of causes unrelated to hemorrhage were censored at the time of death. For survival analyses, WBC and platelet counts were grouped based on an a priori literature review14,20 into low-risk and high-risk groups. WBC count was grouped as <20 000/μL (low risk) and ≥20 000/μL (high risk), and platelet count was grouped as ≥30 000/μL (low risk) and <30 000/μL (high risk) because those thresholds were found to be good predictors of severe or fatal bleeding in the referenced papers. The final multivariate model was selected based on significant univariate effects (P < .05) and after assessing the multicollinearity of potential predictors. Multicollinearity was assessed with Spearman’s rank-based correlations. The final multivariate model included a shared frailty that treated trial as a random factor. As a sensitivity analysis, the same selected predictors were included in a multivariate model that considered trial as a fixed effect.
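The data preparation steps described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the study code: the thresholds are those given in the text, whereas the function names, the status labels ("HD", "other", "alive"), and the toy values are hypothetical.

```python
from math import sqrt

WBC_CUTOFF = 20_000       # cells/uL; at or above this value is high risk
PLATELET_CUTOFF = 30_000  # cells/uL; below this value is high risk

def high_risk_wbc(wbc):
    return wbc >= WBC_CUTOFF

def high_risk_platelets(platelets):
    return platelets < PLATELET_CUTOFF

def cause_specific_event(status):
    """Return 1 for hemorrhagic death and 0 otherwise.

    In a cause-specific analysis, patients who are alive or who died of
    another cause are both treated as censored (hypothetical labels).
    """
    return 1 if status == "HD" else 0

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Toy values (not trial data): blasts rise monotonically with WBC.
toy_wbc = [1_500, 4_000, 20_000, 55_000, 9_000]
toy_blasts = [200, 1_000, 15_000, 48_000, 3_000]
toy_rho = spearman_rho(toy_wbc, toy_blasts)
print(f"Spearman rho between WBC and blasts: {toy_rho:.2f}")
```

A high rank correlation between two candidate predictors, as in this toy example, is the situation in which only one of the pair would be carried forward into a multivariate model.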
There were additional sensitivity analyses to further delineate the association between HD risk and disease characteristics. We evaluated risk factors for HD when the start time was trial registration instead of treatment initiation. This endpoint was considered after observing differences in the time from registration to the start of induction across trials. There were 2 cases of death from unknown causes within the first 30 days in the CALGB C9710 trial and 6 cases in the ECOG-ACRIN E2491 trial. We therefore performed this analysis in 2 ways: (1) counting deaths from unknown causes as HD and (2) counting deaths from unknown causes as censoring events.
Variables with P values <.05 were considered statistically significant. All analyses were performed using SAS 9.4 (The SAS Institute, Cary, NC) and CRAN R Version 3.3.0 (The R Foundation for Statistical Computing, Vienna, Austria).
Results
The final cohort retained for multivariate analysis consisted of 995 patients, after excluding 14 individuals with insufficient data on WBC count or ECOG performance status. Most exclusions (n = 8) pertained to the C9710 cohort. The ECOG-ACRIN trial was the oldest of the series, with accrual having occurred between 1992 and 1995. Much of the information on variables of interest for the latter study existed only in paper records and had to be transferred manually to electronic format in 2014 for the purpose of this project. The datasets for APML3, APML4, and ECOG-ACRIN E2491 included information about whether patients had died of bleeding or some other cause. For CALGB C9710, information about cause of death was given in text fields and had to be translated into a binary variable by the authors of this study. S.M. performed categorization, the result of which was reviewed by M.T. No disagreement in cause of death was noted between the observers for the CALGB dataset. Last, the SWOG S0521 study did not have any HD events during the first 30 days after induction. There were 2 mortality events during this period, but review of the available records was sufficient to rule out bleeding as the etiology. The baseline characteristics of patients are shown in Table 1. We saw differences in laboratory values and patient characteristics across trials for age, WBC count, PT, peripheral blast count, PTT, fibrinogen, platelet count, and creatinine clearance (P < .001-.017). No significant differences were seen between trials for sex, hemoglobin, performance status, or French-American-British (FAB) classification (P = .09-.65). The median peripheral white cell count varied from 1400 to 2400 cells/μL (P < .001), whereas the median platelet count varied from 22 000 to 36 000 cells/μL (P < .001). The proportion of patients with a good ECOG performance status (ie, 0 to 2) ranged from 90.8% to 97.1% across studies (P = .19). 
There were 37 HD cases during the first 30 days following induction, for an estimated cumulative incidence of 3.7% (95% confidence interval [CI], 2.6% to 5.0%; Figure 1). Of these, 7 occurred in APML3, 2 in APML4, 9 in E2491, 19 in C9710, and 0 in S0521.
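As a rough arithmetic cross-check of these figures, the per-trial counts can be summed and expressed as a crude proportion of the 995 patients in the final cohort. The sketch below also computes a Wilson score interval, which is an assumption chosen here for illustration; the interval reported above was derived from the cumulative incidence function, so the two need not coincide exactly.

```python
from math import sqrt

# HD counts per trial and cohort size, taken from the text.
hd_by_trial = {"APML3": 7, "APML4": 2, "E2491": 9, "C9710": 19, "S0521": 0}
n_patients = 995

def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (illustrative choice)."""
    p = events / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width

total_hd = sum(hd_by_trial.values())
crude_proportion = total_hd / n_patients
ci_low, ci_high = wilson_interval(total_hd, n_patients)
print(f"{total_hd} HD cases, crude proportion {crude_proportion:.1%}")
print(f"Wilson 95% interval: {ci_low:.1%} to {ci_high:.1%}")
```

The crude proportion ignores censoring and competing deaths, which is why it serves only as a plausibility check on the reported estimate.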
The first step of the analysis consisted of univariate Cox proportional hazards regression in order to screen for associations between disease characteristics and early HD (Table 2). The variables with a statistically significant measure of effect included peripheral blood blast count (hazard ratio [HR] = 1.12 for each increase in 10 000/μL, 95% CI, 1.05-1.19, P < .001), WBC count ≥20 000/μL vs <20 000/μL (HR = 5.49, 95% CI, 2.86-10.52, P < .001), and ECOG performance status of 3-4 vs 0-2 (HR = 2.76, 95% CI, 1.07-7.08, P = .035). There was no significant association between early HD and age, PT, PTT, fibrinogen, hemoglobin, creatinine clearance, platelet count, or FAB classification (M3 vs M3v) (P value range of .054 to .689).
As expected, a strong correlation was found between number of blasts and WBC count (ρ = 0.68, P < .001); therefore, based on clinical judgment, only WBC count was included in the multivariate model. Only WBC (P < .001) was a significant independent predictor of HD, although performance status approached significance (P = .11). The risk of early HD increased for performance status of 3-4 vs 0-2 (HR = 2.17, 95% CI, 0.84-5.62) and for those with a WBC count of ≥20 000/μL vs <20 000/μL (HR = 5.20, 95% CI, 2.70-10.02). The same predictors were included in a multivariate model, where trial was considered a fixed effect; similar results were observed (data not shown).
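The reported hazard ratios, confidence intervals, and P values can be checked for internal consistency. The sketch below assumes Wald-type 95% intervals that are symmetric on the log scale, which is an assumption made here and may not hold exactly for a shared frailty model. Under that assumption, the point estimate is the geometric mean of the interval bounds, and the standard error and an approximate two-sided P value follow from the interval width. The function name is hypothetical.

```python
from math import erfc, exp, log, sqrt

def wald_from_ci(lower, upper, z_crit=1.96):
    """Recover the HR, log-scale standard error, and two-sided Wald P value
    from a confidence interval assumed symmetric on the log scale."""
    log_hr = (log(lower) + log(upper)) / 2
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log_hr / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return exp(log_hr), se, p_value

# Interval bounds taken from the multivariate results in the text.
hr_wbc, se_wbc, p_wbc = wald_from_ci(2.70, 10.02)  # WBC >= 20 000/uL
hr_ps, se_ps, p_ps = wald_from_ci(0.84, 5.62)      # ECOG status 3-4 vs 0-2

print(f"WBC: HR {hr_wbc:.2f}, SE(log HR) {se_wbc:.3f}, P {p_wbc:.2g}")
print(f"Performance status: HR {hr_ps:.2f}, SE(log HR) {se_ps:.3f}, P {p_ps:.2f}")
```

Both back-calculated hazard ratios agree with the reported values to two decimal places, and the approximate P values are in line with P < .001 for WBC count and P = .11 for performance status.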
We conducted another sensitivity analysis examining risk factors for 30-day HD from trial registration instead of initiation of induction. There were 2 cases of death from unknown causes within the first 30 days in the CALGB C9710 trial and 6 cases in the ECOG-ACRIN E2491 trial. This analysis was performed both by counting deaths from unknown causes as HD and by counting them as censoring events. When the deaths from unknown causes were censored, the associations were similar to those of the primary analysis: WBC count was significant (HR, 5.15; 95% CI, 2.69-9.85; P < .001) and performance status was not (HR, 2.20; 95% CI, 0.85-5.67; P = .10). However, when we considered these unknown deaths as HD, performance status emerged as a significant predictor (HR, 2.41; 95% CI, 1.07-5.45; P = .034) and WBC count remained a significant predictor (HR, 5.01; 95% CI, 2.77-9.05; P < .001).
Discussion
APL at its onset is associated with a complex coagulopathy in most patients. This acquired bleeding diathesis results in substantial mortality during the first month of induction, a problem that persists to this day despite the improved cure rates brought on by the addition of ATRA to cytotoxic chemotherapy. Prompt use of ATRA and aggressive blood product repletion with cryoprecipitate and platelet transfusions in all patients have been the mainstay of treatment of the often severe coagulation defects encountered in this population; however, there is a paucity of knowledge about how to identify the individuals most at risk of lethal bleeding. In particular, there is a lack of consistency between reports in terms of which baseline patient characteristics are independent predictors of HD.
Performance status emerged as a predictor of HD of borderline statistical significance in multivariate analysis, although, to our knowledge, this variable had not been frequently reported in the past to be useful. It should probably be considered for inclusion in future models that aim to better stratify patients for the risk of HD. Less surprisingly, the total WBC count and the peripheral blast count were found to be associated with an increased risk of fatal bleeding in univariate analysis, with the WBC count alone being retained in the multivariate model due to multicollinearity. An increased WBC count is already known to be associated with a poorer prognosis for APL, and several reports with smaller patient populations have noted that a higher total white cell count or peripheral blast count is associated with an increased risk of severe bleeding during induction. Two mechanisms that could explain this effect are the triggering of the coagulopathy by APL blast granules and the interaction of abnormal leukocytes with the endothelium, which may participate in the pathogenesis of bleeding episodes, especially intracranial hemorrhages. A surprising finding was that PTT prolongation was not predictive of the risk of bleeding mortality in univariate analysis. A prolonged PT tended to be associated with a higher risk of dying from hemorrhage, but because this did not reach the preset threshold of statistical significance, we did not include this predictor in our final model.
The cohort for the analysis presented herein spans 5 trials from 4 cooperative groups and reflects care given at multiple institutions, with substantially more patients than the largest study on the topic previously published,11 although we were limited by the small number of events. For some covariates and studies, values were missing for a large proportion or even all patients. Given the limited number of events, additional data may alter the significance of some of these covariates. This model should be considered exploratory.
The work presented here is limited first and foremost by the retrospective nature of data collection. This approach is bound to suffer from a lack of standardization, even though we believe that the choice of HD as a primary endpoint obviates most of the difficulties associated with using one of the other definitions of severe bleeding. Also, an important limitation of a retrospective, multicenter study like ours is that the observed baseline clinical factors may have driven additional treatment decisions (eg, blood product transfusions), which may have been different across studies. As it is unknown across studies what additional therapies were provided, we are unable to account for these therapies in our analysis. Ideally, future projects would be part of therapeutic trials in the field and prospectively accumulate data on potential bleeding predictors, along with precise information on outcomes such as major bleeding, clinically relevant non–major bleeding, and HD. Unfortunately, such future endeavors are unlikely to include nearly as many patients as we were able to capture in our analysis. Another limitation of this cohort study is the evolving nature of treatment standards for APL, with arsenic trioxide being a very promising agent,27 and only 1 study from our dataset incorporated this modality.23,24 However, at this stage, a combination of ATRA and cytotoxic chemotherapy remains standard of care for the higher risk subset of patients, and those are the individuals who suffer most fatal hemorrhagic events. In addition, the model we derived was not validated on a separate dataset, so ideally it should be tested in further studies.
As noted above, most ED is attributable to bleeding and represents the biggest obstacle to cure for APL. The rates of ED remain substantial to this day, especially for patients treated outside of a clinical trial, so any knowledge gained about the determinants of hemorrhagic episodes during induction for APL could result in significant improvements in long-term survival for this disease. With this in mind, we believe that the associations identified in our analysis may be useful for contemporary clinical practice. They can help physicians in the community to better identify the newly diagnosed APL patients at higher risk of HD and perhaps to be more aggressive with their management in terms of blood product repletion. The latter approach remains as yet unproven, and only prospective studies will indicate whether patients should be treated differently according to their bleeding risk.
Acknowledgments
This study used clinical trial databases from the ALLG, CALGB, ECOG-ACRIN, and SWOG cooperative groups. The interpretation and reporting of these data are the sole responsibility of the authors. This research was funded in part through a grant from the National Institutes of Health, National Cancer Institute Cancer Center Support (P30 CA008748).
Authorship
Contribution: M.S.T. and S.M. conceived and designed the research; S.M. abstracted data from paper records at the ECOG-ACRIN Operations Office in Boston; J.-W.L., D.Z., M.C., S.G., K.L., and M.O. prepared the data before analysis; D.A.G. and S.M.D. analyzed the data; S.M. wrote the manuscript; J.-W.L., D.Z., M.C., D.D., H.J.I., M.R.L., E.M.S., F.R.A., R.A.L., R.S., B.L.P., S.G., K.L., J.M.R., H.E., S.C., M.O., J.H.P., and P.H.W. reviewed the manuscript and contributed to modifications in content; and all authors contributed to the revisions of the manuscript.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
The current affiliation for D.Z. is Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Melbourne, Australia.
Correspondence: Simon Mantha, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065; e-mail: manthas@mskcc.org.