Longer and more intensive postinduction intensification (PII) improved the outcome of children and adolescents with “higher risk” acute lymphoblastic leukemia (ALL) and a slow marrow response to induction therapy. In the Children's Cancer Group study (CCG-1961), we tested longer versus more intensive PII, using a 2 × 2 factorial design for children with higher risk ALL and a rapid marrow response to induction therapy. Between November 1996 and May 2002, 2078 children and adolescents with newly diagnosed ALL (1 to 9 years old with white blood count 50 000/μL or more, or 10 years of age or older with any white blood count) were enrolled. After induction, 1299 patients with marrow blasts less than or equal to 25% on day 7 of induction (rapid early responders) were randomized to standard or longer duration (n = 651 + 648) and standard or increased intensity (n = 649 + 650) PII. Stronger intensity PII improved event-free survival (81% vs 72%, P < .001) and survival (89% vs 83%, P = .003) at 5 years. Differences were most apparent after 2 years from diagnosis. Longer duration PII provided no benefit. Stronger intensity but not prolonged duration PII improved outcome for patients with higher-risk ALL. This study is registered at http://clinicaltrials.gov as NCT00002812.
Introduction
Postinduction intensification (PII) has proved a useful strategy in childhood acute lymphoblastic leukemia (ALL). The Berlin-Frankfurt-Münster Group (BFM) introduced an effective postinduction intensification element called Protocol II or Delayed Intensification (DI) in 1976.1 The Children's Cancer Group (CCG) began a 25-year investigation of DI in 1981. The CCG study 105 showed an event-free survival (EFS) advantage for PII for National Cancer Institute (NCI)/Rome standard-risk patients, not enhanced by earlier intensification in the first 2 months of therapy.2 CCG-1891 showed an EFS advantage for 4 versus 2 months of PII for standard-risk patients.3
CCG-1882 introduced the augmented “BFM” regimen, that is, longer and stronger PII for NCI/Rome higher-risk (HR) patients with a poor day 7 response to initial induction therapy (slow early responders, SER) who have had a higher failure rate.4 PII was intensified by adding vincristine (VCR) and asparaginase (LASP) during periods of myelosuppression in consolidation (months 2 and 3 of therapy) and DI (months 6 and 7 of therapy) and by replacing oral 6-mercaptopurine and methotrexate in the interim maintenance (IM) phase (months 4 and 5 of therapy) with vincristine and intravenous methotrexate (IV MTX) and LASP (Capizzi MTX). The duration of PII was increased by adding a second IM phase and a second DI phase. This regimen resulted in an advantage in both EFS and survival. The successful “augmented regimen” was not tested in HR patients with a rapid day 7 response (rapid early responders, RER), where outcomes were somewhat better than in SER.
In 1996, we initiated a 2 × 2 factorial trial of longer and stronger PII in the RER subset to determine the relative contributions of length and strength to PII. The longer and stronger PII regimen on CCG-1961 was the augmented “BFM” regimen used for randomized SER patients on CCG-1882, with the substitution of pegylated for native asparaginase and omission of prophylactic cranial irradiation. Patients received either 5 months or 8 months of standard intensity PII or 6 months or 10 months of stronger intensity PII. Results in 1299 eligible randomized patients follow.
Methods
The CCG-1961 protocol opened to patient entry in September 1996 and closed in May 2002. Eligible patients were aged 10 through 21 years, or aged 1 year or older with a presenting white blood cell (WBC) count of 50 × 10⁹/L (50 000/μL) or more. Diagnosis was based on morphologic, biochemical, and immunophenotypic features of leukemia cells, including lymphoblast morphology as determined by Wright-Giemsa staining and reactivity with monoclonal antibodies to lymphoid differentiation antigens associated with B-cell or T-cell lineage as described previously.5 In this study, central nervous system (CNS) positivity at diagnosis (CNS-3) was defined as 5 WBCs or more and blasts on cytospin preparation. The same criteria were used for defining CNS relapse. In our prior high-risk trials, the WBC criterion was more than 5 WBCs. For patients with a bloody tap, an algorithm was used. Induction therapy consisted of VCR 1.5 mg/m² per week for 4 weeks; daunorubicin 25 mg/m² per week for 4 weeks; prednisone 60 mg/m² per day for 28 days; LASP 6000 units/m² intramuscularly thrice weekly for 9 doses; and intrathecal cytarabine on day 0 and intrathecal MTX on days 7 and 28. All patients had a bone marrow aspirate performed on day 7. Bone marrow biopsies were not used in this study for assessment of response. Patients who had less than or equal to 25% blasts on day 7 were considered RER. RER patients who achieved remission were randomized to standard intensity (SPII) or increased intensity (IPII) postinduction intensification, and to 1 or 2 IM/DI phases. In increased intensity arms, patients received additional VCR and PEG LASP courses during consolidation and DI phases and VCR, IV MTX without rescue and PEG LASP during IM phases. The postinduction regimens are given in Table 1. RER patients who were not CNS-3 received intrathecal MTX without radiotherapy.
Patients randomized to 2 DI phases received dexamethasone on days 1 to 7 and 14 to 21 of each course in an effort to reduce the high incidence of osteonecrosis seen in CCG-1882.6 All patients randomized to the IPII therapy received PEG LASP after induction. Therapy lasted 2 years for girls and 3 years for boys, beginning with the first IM period. Patients who were CNS positive or Philadelphia chromosome positive were excluded from the randomization. Results for these patients will be reported separately.
This protocol was approved by the National Cancer Institute and the Institutional Review Boards of the participating institutions. Informed consent was obtained from the patients, their parents, or both as deemed appropriate according to the Department of Health and Human Services guidelines and in accordance with the Declaration of Helsinki.
Patients were assigned in a 2 × 2 factorial design to the 4 regimens described previously (Table 2). Balanced block randomization was used to ensure that approximately equal numbers of patients were randomly assigned to each regimen. The study was monitored by an independent Data and Safety Monitoring Committee and followed a monitoring plan that was based on a group sequential monitoring boundary that called for analysis of results at 4 times in the study when 25%, 50%, 75%, and 100% of the anticipated disease-related events had occurred. The original target enrollment was 1052 randomized patients, which would result in statistical power of approximately 96% at the final analysis to detect a relative hazard rate of 0.626 (ie, a 37% reduction in the EFS failure rate) for either of the main regimen comparisons in the 2 × 2 design. At the recommendation of the Data and Safety Monitoring Committee, in October 2000, the study duration was extended to attain the planned randomization accrual for the SER patients. Because response status is not known until day 7 after enrollment on the study, the RER accrual was also extended to coincide with achieving the SER accrual target. The monitoring boundary for the RER comparison of increased intensity versus standard intensity was crossed in February 2003 when the P value reached .0198 (the boundary value at that time was P < .023), and at that time the study results for the RER patients were released. Similarities between patients in the 2 groups were assessed with χ² tests for homogeneity of proportions. Outcome analyses used life table methods and associated statistics. The primary endpoints examined were EFS and overall survival from the time of randomization. The EFS events considered were relapse at any site, death during remission, or a second malignant neoplasm, whichever occurred first.
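As a rough illustration of how a power statement of this kind relates to the number of events, the Schoenfeld approximation for the log rank test can be sketched as follows. This is not the study's actual calculation: the two-sided significance level of .05 and the 1:1 allocation are assumptions, the function names are hypothetical, and the adjustment for the group sequential monitoring boundary is ignored.

```python
from math import log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def events_needed(power, hazard_ratio, alpha=0.05):
    """Approximate number of events for a two-sided log rank test.

    Schoenfeld approximation with 1:1 allocation (an assumption here):
    d = 4 * (z_{1-alpha/2} + z_{power})^2 / ln(HR)^2
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2


def logrank_power(events, hazard_ratio, alpha=0.05):
    """Approximate power of the same test given a number of events."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = sqrt(events) * abs(log(hazard_ratio)) / 2
    return _STD_NORMAL.cdf(noncentrality - z_alpha)


# Relative hazard rate and target power quoted in the text
required = events_needed(power=0.96, hazard_ratio=0.626)
achieved = logrank_power(required, hazard_ratio=0.626)
```

Under these assumptions the calculation depends on the number of events rather than directly on the number of patients, which is why the monitoring plan is expressed as fractions of the anticipated events.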
Data on patients who had not had an event at the time of analysis were censored in the analysis of event-free survival at the time of the last contact. Life table estimates were calculated by the Kaplan-Meier procedure and the SD of the life table estimate was obtained with Peto's method.7,8 The log rank test was used to compare outcome in treatment or prognostic groups, and estimates of the relative hazard rate (RHR) used observed and expected event rates from the log rank tests.8,9 Tests for interaction effects of the treatment components were performed with Cox regression methods. The Kaplan-Meier life table estimates (with the associated SD) are presented for the 5-year time point unless otherwise stated.
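The life table methods named above can be sketched in a few lines. The example below is a minimal illustration, not the study's analysis code: the function names and the toy follow-up data are hypothetical, Peto's SD is computed with the risk set counted just before each event time (conventions differ on this point), and the relative hazard is shown in one common observed/expected form, which may not match the exact estimator the authors used.

```python
from itertools import groupby
from math import sqrt


def kaplan_meier_peto(times, events):
    """Kaplan-Meier estimate with Peto's SD at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if an EFS event occurred at that time, 0 if censored
    Returns a list of (time, survival, peto_sd) tuples.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    for t, group in groupby(data, key=lambda pair: pair[0]):
        group = list(group)
        n_events = sum(e for _, e in group)
        if n_events:
            survival *= 1 - n_events / at_risk
            # Peto: SD = S(t) * sqrt((1 - S(t)) / number at risk)
            peto_sd = survival * sqrt((1 - survival) / at_risk)
            curve.append((t, survival, peto_sd))
        # Events and censored patients both leave the risk set
        at_risk -= len(group)
    return curve


def relative_hazard(obs_a, exp_a, obs_b, exp_b):
    """Ratio of observed/expected event rates from a log rank test."""
    return (obs_a / exp_a) / (obs_b / exp_b)


# Hypothetical toy data: follow-up in years, 1 = event, 0 = censored
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier_peto(toy_times, toy_events)
```

Censored patients contribute to the risk set up to their last contact and then drop out without changing the survival estimate, which is how the analysis handles patients who had not had an event at the time of analysis.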
Results
Patients
A total of 2078 patients were enrolled (Figure 1). Twenty-one patients were found to be ineligible for the study (6 because of improper consents, 2 who started chemotherapy before signing the consent, 8 who were found to have malignancies other than ALL, 2 who received steroids for longer than 48 hours before diagnosis, 2 who had been mistakenly enrolled on CCG-1961 instead of the appropriate study for standard-risk ALL, and 1 who did not have an evaluable bone marrow result). Twenty-eight patients died during induction and of these, 3 patients died before day 7. Causes of induction death included sepsis (18), central nervous system bleeds (4), fungal infections (4), aspiration (1), and congestive heart failure (1). Twenty-four patients did not achieve a remission. Of 1911 patients who successfully achieved remission and also had an evaluable day 7 marrow result, 71.4% were RER (n = 1364) and 28.6% were SER. Sixty-five RER patients were excluded from the randomization because they were CNS-3 (43) or Philadelphia chromosome positive (7), or because of parental (9) or physician (6) choice.
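The patient flow described above can be checked with simple arithmetic. The sketch below uses only counts quoted in the text; the variable names are illustrative.

```python
# Counts quoted in the text; dictionary keys are descriptive labels only
ineligible = {
    "improper consent": 6,
    "chemotherapy before consent": 2,
    "malignancy other than ALL": 8,
    "steroids > 48 hours before diagnosis": 2,
    "enrolled on wrong study": 2,
    "no evaluable marrow": 1,
}
induction_deaths = {
    "sepsis": 18,
    "CNS bleed": 4,
    "fungal infection": 4,
    "aspiration": 1,
    "congestive heart failure": 1,
}
rer_exclusions = {
    "CNS-3": 43,
    "Philadelphia chromosome positive": 7,
    "parental choice": 9,
    "physician choice": 6,
}

evaluable_in_remission = 1911
rer = 1364
ser = evaluable_in_remission - rer

total_ineligible = sum(ineligible.values())
total_induction_deaths = sum(induction_deaths.values())
randomized_rer = rer - sum(rer_exclusions.values())
rer_percent = round(100 * rer / evaluable_in_remission, 1)
ser_percent = round(100 * ser / evaluable_in_remission, 1)
```

The totals reproduce the 21 ineligible patients, 28 induction deaths, and 1299 randomized RER patients reported in the text.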
There were 1299 eligible RER patients randomized in the 2 × 2 design. This resulted in 649 and 650 patients assigned to SPII and IPII, and 651 and 648 patients assigned to standard duration or longer duration PII, respectively. There were also 8 SER patients erroneously randomized to the RER regimens; they are not included in the analyses. Approximately 21% of the patients with satisfactory immunophenotyping data had T-cell ALL. Tables 3 and 4 give the distribution and comparison of baseline patient characteristics for each of the main comparative regimen groupings in the factorial design randomization. No significant differences appear between the stronger and standard intensity groups, and only 2 factors had slight differences for the longer and standard duration groups, namely, platelets (P = .03) and ploidy groups (P = .06). Given the 32 characteristics being compared for the 2 regimen groupings, this would be approximately the number of statistical differences expected by random variation.
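The remark about chance differences can be made concrete with a short calculation. This is a simplified sketch that assumes a nominal significance level of .05 and independent comparisons, neither of which is stated in the text.

```python
n_characteristics = 32  # baseline characteristics compared (from the text)
n_groupings = 2         # intensity and duration comparisons (from the text)
alpha = 0.05            # assumed nominal significance level

n_tests = n_characteristics * n_groupings

# Expected number of nominally significant results if no true differences exist
expected_chance_findings = n_tests * alpha

# Probability of at least one such result, assuming independent tests
prob_at_least_one = 1 - (1 - alpha) ** n_tests
```

Under these assumptions, about 3 nominally significant differences would be expected by chance alone across the 64 comparisons, so the small number observed does not suggest an imbalance in the randomization.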
Outcome of treatment
The 5-year EFS and survival (S) for all patients on study are 71.3% (SD = 1.6%) and 80.4% (SD = 1.4%), respectively. For all RER patients achieving remission, the 5-year EFS and S postinduction are 75.5% (SD = 1.8%) and 84.7% (SD = 1.5%), respectively. The median follow-up for the randomized continuously disease-free RER patients who have not experienced an EFS event is 3.5 years.
The cumulative incidence of isolated and combined CNS relapse was 4.5% (SD = 1.0%) and 7.0% (SD = 1.2%) for RER patients at 5 years. CNS relapse occurred more frequently in T ALL (19 events/235 total) compared with B precursor ALL (37 events/880 patients; P = .01). Because our definition of CNS disease had changed slightly from the previous study and the handling of traumatic taps had been formalized, we examined these factors. Only 1 patient on CCG-1961 had a CNS relapse with a WBC = 5, and only 1 RER patient on CCG-1961 had a CNS relapse with a traumatic tap.
Prognostic factors
Conventional prognostic factors (eg, age, sex, race, Down syndrome, organomegaly, presence of mediastinal mass, lymphadenopathy, testicular rating, WBC, CNS status, hemoglobin, platelet count, common acute lymphoblastic leukemia antigen positivity, and immunophenotyping) had little effect on outcome for RER patients, despite the number of patients and events. However, a WBC count of 200 000/mm³ or more (n = 133, 10.2% of RER) resulted in a worse outcome (5-year EFS, 60% vs 73%, P = .008, RHR = 1.57), and the small number of patients who were 12.0 to 17.99 months of age (n = 31; 2.4% of RER) had a worse outcome (5-year EFS, 60.2% vs 76.8%, P = .047, RHR = 1.83).
Outcome according to intensity of PII
The 5-year EFS estimates for patients receiving IPII and SPII therapy are 81.2% (SD = 2.4%) and 71.7% (SD = 2.7%), and the corresponding 5-year survival estimates are 88.7% (SD = 1.9%) and 83.4% (SD = 2.2%). Log rank tests show that both EFS and S are significantly better for the IPII regimen than for the SPII regimen (P < .001 and P = .005, respectively; Figures 2 and 3). Relative to the stronger intensity regimen, the hazard of an EFS event is 1.61 times higher and the hazard of death is 1.56 times higher on the standard intensity regimen (RHR = 1.61 and 1.56, respectively). Table 5 gives the distribution of initial EFS events in the 2 intensity regimens. EFS events occurred in 170 patients in the standard intensity arms and 110 patients in the stronger intensity arms, with 12 remission deaths in each group. Isolated marrow relapse was the main cause of treatment failure for both SPII and IPII groups, occurring in 84 standard intensity patients and 50 stronger intensity patients, respectively (P = .001, RHR = 1.77). The incidence of isolated central nervous system relapses was similar (n = 32 and 29; P = .61, RHR = 1.14). Analyses showed no interaction between intensity and duration of PII (P = .59).
Among the examined subgroups, outcomes were better with stronger intensity PII compared with standard intensity (B-cell 5-year EFS of 80.4% ± 2.9% vs 70.4% ± 3.4%, P = .001; T-cell 5-year EFS of 82.9% ± 5.4% vs 72.3% ± 6.2%, P = .16; age 1-9 years 5-year EFS of 82.1% ± 4.0% vs 70.8% ± 4.2%, P = .009; and age ≥ 10 years 5-year EFS of 80.4% ± 2.9% vs 72.3% ± 3.5%, P = .003). As seen in the CCG-1882 trial, stronger intensity PII yielded an earlier EFS plateau.
Outcome according to duration
No significant difference was seen in outcome for patients receiving 1 IM/DI phase (5-year EFS of 76.0%, SD = 2.6%) or 2 IM/DI phases (76.8%, SD = 2.6%) (P = .94, RHR = 1.00) (Figure 4). Also, no outcome difference was apparent in any subgroup analyzed (1-9 years, 10 years and older, T-cell and B-cell precursor). Survival outcome was also similar for the duration groups with 86 deaths for standard duration PII and 78 deaths for longer PII (P = .58, RHR = 1.08). Duration made no difference for the subset who received stronger intensity PII (5-year EFS of 80.2% and 82.2%) or for the subset who received standard intensity (5-year EFS 71.7% and 71.6%).
Toxicity analysis
Major toxicities observed in RER patients included osteonecrosis (avascular necrosis) and infections. Osteonecrosis developed in 103 RER patients (59 IPII; 44 SPII, P = .13). The incidence of osteonecrosis for patients treated on standard duration was 10.8% (67 events) compared with 5.5% (36 events) for patients treated on the increased duration arms (P = .001). Further data regarding osteonecrosis in this group of patients will be reported separately. The prevalence of infections (including bacteremia resulting from sepsis or central venous catheter infection) was not statistically different between the combined standard versus increased intensity regimens, regardless of phase of therapy. Some differences were noted in the use of supportive care interventions. During consolidation, antifungal agents were administered to 9.5% of patients on the increased intensity regimens compared with 3.9% of those on the standard regimens (P = .001). During IM 1, a greater percentage of patients on the increased intensity regimens versus the standard regimens received antifungal agents (4.9% versus 0.8%, P < .001), total parenteral nutrition (7.3% vs 2.1%, P < .001), antibacterials (28.8% vs 13.4%, P < .001) and blood products (20.1% vs 10.1%, P < .001). Number of days hospitalized was not different between increased intensity versus standard regimens except during consolidation (33.2% versus 23.1% for > 8 days, P = .001) and IM 1 (26.3% vs 11.5% for 1-7 days and 11.4% vs 3.9% for > 8 days, P < .001 for both). The only difference between IPII and SPII during DI 1 was in blood product use 65.2% versus 59.2% (P = .03). Among patients treated on IPII arms, 54% experienced an allergic reaction to PEG LASP.
In the randomized RER patients, there were 24 deaths (12 SPII, 12 IPII) as a first event. A total of 140 deaths occurred after a relapse or other initial EFS event (eg, second malignant neoplasms). There were 4 second malignant neoplasms on SPII (nasopharyngeal carcinoma, chronic myelogenous leukemia, B-cell lymphoma, acute myelogenous leukemia) and 2 on IPII (B-cell lymphoma, myelodysplastic syndrome).
Discussion
In recent years, a dramatic improvement in outcome for children with ALL has been achieved by increasing the intensity of treatment. The striking improvement in EFS produced by longer and stronger PII therapy of NCI high-risk ALL patients showing a slow early response to induction therapy, which occurred in the previous CCG-1882 study, left many unanswered questions.4 Augmentation was achieved by increasing the intensity of individual phases, as well as increasing the number of intensified phases (ie, duration of intensification). Compared with CCG-modified BFM therapy, augmented “BFM” featured more doses of VCR and LASP during the consolidation, IM and DI phases and used intravenous MTX without leucovorin rescue during the IM phase(s). Incorporating a second IM and DI phase before maintenance further increased the duration of intensification. The relative contribution of each of these changes to the observed improvement in EFS was uncertain.
In the past, standard therapy for NCI high-risk patients with ALL showing a rapid early response was CCG-modified BFM therapy.10 In CCG-1961, the question was posed whether increasing the intensity of therapy for all high-risk patients would improve outcome. Because longer and/or stronger intensification is associated with additional risks of side effects and costs, it is essential that the relative benefit of individual components be established. Therefore, CCG-1961 assessed the relative merits of intensification approaches using a 2 × 2 factorial design. Patients were randomized to either standard intensity (consolidation, IM, and DI phases as in CCG-modified BFM) or IPII (consolidation, IM, and DI phases as in CCG-augmented BFM). In addition, patients were randomized to receive 1 or 2 courses of IM and DI. Thus, the 4 arms of the trial were SPII, SPII with a second standard intensity IM and DI phase, IPII with a second increased intensity IM and DI phase, and IPII with only a single increased intensity IM and DI phase. In addition, patients treated on stronger intensity regimens received PEG LASP during chemotherapy after induction (PEG LASP was not used in CCG-1882).
Stronger intensification produced a highly statistically significant improvement in EFS compared with the standard intensity therapy. Little difference was apparent for the first 2 years. However, with longer follow-up, an EFS difference has emerged and widened: few events occurred in the stronger PII regimens after 4 years, whereas many later relapses occurred in the standard intensity PII regimens. This follows CCG-1882, where few events were noted after 3 years for patients treated on the augmented regimen, whereas events continued for those treated with SPII.4 Both of these observations support the long-term benefit of PII therapy.
In contrast, longer PII provided no EFS benefit, and no suggestion of an interaction effect on outcome for the intensity × duration subsets is apparent. A second IM and DI phase produced no EFS benefit over a single IM and DI. This suggests that a window of opportunity exists to eradicate resistant clones early by increasing the intensity of therapy, but residual leukemic clones after one IM/DI probably represent intrinsically drug-resistant disease. In this circumstance, further intensification using the same agents would not be expected to be beneficial. Whether this remaining clone represents de novo resistant disease that existed at diagnosis or is characterized by further evolution because of somatic or epigenetic changes is another unanswered question. Specific characterization of the underlying pathways responsible for residual disease after current PII would aid in the identification of new agents with a high rate of activity in this specific setting.
Further intensification also comes with “costs” in terms of potential short- and long-term side effects as well as an increased financial burden, so it is imperative to balance improvements in EFS with these risks. The improvement in EFS seen with IPII was associated with additional side effects, but these were relatively modest and there was no difference in deaths from toxicity. However, the incidence of osteonecrosis increased, especially in older children receiving 21 days of continuous dexamethasone. Therefore, in subsequent high-risk studies, patients older than 10 years receive discontinuous dexamethasone (days 1-7 and 15-21). Because of the high incidence of allergic reactions to PEG LASP after native asparaginase in induction, all patients on the successor high-risk trial received PEG LASP in induction and all subsequent phases.
In recent CCG protocols for NCI high-risk patients, we have observed a marked decrease in the incidence of bone marrow relapse, whereas the rate of CNS relapse has remained constant or increased slightly because of the elimination of cranial radiotherapy.10 Even though the definition for CNS disease changed slightly from the previous high-risk protocol (CCG-1882), there was no change in the incidence. In addition, we found no significant difference in the incidence of CNS relapse between standard intensity and increased intensity arms. On IPII, 30% of relapses were isolated CNS relapses. In the current COG high-risk B precursor study, we are evaluating 2 interventions, dexamethasone during induction and intensification with high-dose MTX during interim maintenance, which may contribute to reducing the rate of CNS relapse. CNS relapse occurred more frequently in patients with T ALL compared with patients with B precursor ALL. In the new COG trials, patients with T-cell ALL will receive 12 Gy of cranial radiotherapy.
Comparisons across studies are always perilous as patient populations and care delivery may differ. Identification of an exact comparison group (eg, NCI higher risk with rapid day 7 marrow response) is problematic. Studies may differ as to which patients are included and which are excluded. Our strict intent-to-treat analyses included all eligible randomized patients, and no patient was excluded for failure to receive protocol therapy.
CCG-1961 provided 5-year EFS of 71% for rapid and slow response, T- and B-precursor, NCI higher-risk patients compared with 69% on the prior CCG trials (CCG-1882/1901, 1989-1995, n = 1841).12 The BFM-90 study reports a 6-year EFS of 64% for NCI HR patients (n = 724) overall.13
RER patients on CCG-1961 had a 5-year EFS of 76% versus 75% for the comparable arm on CCG-1882 that excluded patients with lymphomatous features and thereby most patients with T-cell immunophenotype (n = 319; Figure 5).10 The RER subtype excludes CNS-3 and Philadelphia chromosome positive patients. BFM-90 reports a 6-year EFS of 73% for the “prednisone good response” HR subset (n = 564), comprising 78% of the HR population.13 We obtained a 5-year EFS of 81% for our similar but somewhat “softer,” more favorable RER subset, comprising 69% of the HR population, with stronger but not longer PII.
In conclusion, stronger, not longer, PII improved EFS and survival for NCI higher-risk children and adolescents with B-precursor or T-cell ALL and a rapid response to induction therapy. In contrast, no benefit was found for longer PII. This study provides the platform for the current Children's Oncology Group studies for higher-risk B-precursor and T-cell ALL.
The online version of this article contains a data supplement.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
The authors thank Drs William Carroll and Stephen Hunger for their suggestions, insights, and support on this manuscript.
This work was supported by the National Institutes of Health (grants U10 CA98543, CA13539, and CA30969). A complete listing of grant support for research conducted by CCG and POG before initiation of the COG grant in 2003 is available online at: http://www.childrensoncologygroup.org/admin/grantinfo.htm.
Authorship
Contribution: N.L.S. designed the study as study chair, supervised the study, and wrote the manuscript; P.G.S. cochaired the study and participated in the running of the study; H.N.S. designed the study statistics, analyzed the data, and wrote the statistical section of manuscript; J.B.N. contributed to the design of the study, monitored one of the arms of therapy, and edited the manuscript; C.D. contributed to the design of the study, monitored one of the arms of the study, and monitored neurotoxicity; L.J.E. contributed to the design of the study and monitored all aspects of asparaginase use and toxicity; D.R.F. contributed to the design of the study, monitored supportive care issues and infectious complications, and edited the manuscript; L.A.M. contributed to the design of the study and monitored osteonecrosis; C.A.H. contributed to the design of the study and reviewed eligibility of patients; C.M.R. contributed to design and monitored one of the arms of the study; K.B. contributed to design of the study and monitored one of the arms of the study; J.L.F. contributed to study design and monitored patients on study; N.A.H. reviewed cytogenetic data from study and outcome; T.L.M., A.F.P., and C.E. contributed to running of the study; M.K.L. analyzed data; and P.S.G. designed the study, chaired the CCG ALL Committee, and edited the manuscript.
A complete list of participants in the Children's Oncology Group is available in the online version of this article.
Conflict-of-interest disclosure: P.S.G. has been a consultant for Genzyme; P.S.G. has participated in the Speakers Bureau for Enzon and Sanofi Aventis, and J.B.N. and N.L.S. for Enzon. The remaining authors declare no competing financial interests.
Correspondence: Nita L. Seibel, Children's National Medical Center, Department of Hematology-Oncology, 111 Michigan Ave NW, Washington, DC 20010-2970; e-mail: nseibel@cnmc.org.