Abstract
Because there are differing opinions regarding treatment of patients in the chronic phase of chronic myeloid leukemia (CML), the American Society of Hematology convened an expert panel to review and document evidence-based benefits and harms of treatment of CML with busulfan (BUS), hydroxyurea (HU), recombinant interferon-α (rIFN-α), and bone marrow transplantation (BMT). The primary measure for defining efficacy was survival. Analysis indicated a survival advantage for HU over BUS. Observational studies of rIFN-α suffer from numerous biases including sample size, variations in study populations, definitions of hematologic and cytogenetic remissions, and dose. That rIFN-α is more efficacious than chemotherapy is demonstrated by 6 prospective randomized trials. For patients with favorable clinical features in chronic phase, compared to HU and BUS, rIFN-α improves survival by a median of about 20 months. Most evidence suggests that rIFN-α is most effective when combined with other drugs and when given during the earliest stage of the chronic phase. Adding cytarabine to rIFN-α adds further survival benefit but increases toxicity. Limitations for evaluating the long-term benefits of allogeneic BMT include the retrospective nature of most studies, incomplete documentation of the clinical characteristics of the patients, paucity of the details on patient selection, lack of control groups, and limitations of survival calculations. Survival curves for BMT show that at least half of the patients transplanted remain alive 5 to 10 years after treatment, whereas similar curves for rIFN-α show a continuous relapse rate over time with the curves crossing at about 7 to 8 years. Estimates of long-term survival may be confounded by the selection biases mentioned and the analytic methods used. The magnitude of the incremental increase in benefit with BMT must be weighed against the potential serious harm and death that may accompany the procedure in the short term.
The best results with BMT have been obtained when it is performed within 1 to 2 years from diagnosis. Since each treatment option involves tradeoffs between benefit and harm, patient choice must be based on the examination of facts presented in an unbiased fashion. Newly diagnosed younger patients and older patients who are candidates for BMT should also be offered information about IFN-based regimens, the tradeoffs involved, and, if possible, share in the treatment decision. Hopefully this analysis will provide the stimulus for evaluation of other important aspects of CML.
CHRONIC MYELOID (myelogenous, myelocytic, granulocytic) leukemia (CML) is a clonal myeloproliferative disorder of a pluripotent hematopoietic stem cell with a specific cytogenetic abnormality, the Philadelphia (Ph+) chromosome. This chromosome results from a balanced translocation between the long arms of chromosomes 9 and 22, resulting in the bcr/abl chimeric gene that expresses an abnormal fusion protein with altered tyrosine kinase activity. CML accounts for 7% to 20% of all leukemias and affects an estimated 1 to 2/100,000 persons in the general population.1,2 Although the median age of presentation is the fifth decade, all age groups are at risk. CML is characterized by a chronic phase with a median duration of 3 to 5 years when treated with conventional agents and an accelerated or acute phase of approximately 3 to 6 months’ duration, inevitably terminating fatally. Initially, the chronic phase is characterized by no or few symptoms and signs. However, in the majority of cases, constitutional symptoms and abnormal physical findings including extramedullary abnormalities, such as myeloblastomas, eventually develop.3
Experts differ on the best treatment for patients in the chronic phase of CML. Options include busulfan (BUS), hydroxyurea (HU), interferon (IFN)-based regimens, or bone marrow transplantation (BMT). Until a few years ago, allogeneic BMT was the treatment of choice for all eligible patients, because it was the only treatment that appeared to change the natural course of the disease. Therefore, randomized studies comparing transplantation to chemotherapy (BUS, HU) were not feasible, and follow-up reports of observational studies were not deemed necessary. This situation has now changed because IFN-based regimens also influence the natural course of CML by prolonging survival.
In 1996, the American Society of Hematology convened an Expert Panel on Chronic Myeloid Leukemia to review and document the strength of the evidence regarding the benefits and harms of each option and to determine whether evidence-based treatment recommendations could be developed. This report summarizes the Panel’s evidence review and recommendations.
PANEL METHODS
Panel Composition
The 12-member panel included hematologists and oncologists from the United States, England, France, Germany, and Italy with research expertise in the treatment of CML, practicing hematologists from the United States, a biostatistician, and a practice guidelines methodologist. One of the panelists was also a designated representative of the American Society of Clinical Oncology.
Scope of Review
The review evaluated the long-term efficacy of chemotherapy (BUS, HU), rIFN-α–based regimens, and allogeneic BMT as initial treatments for chronic-phase CML. Busulfan was examined only in comparison to HU. Other chemotherapeutic agents, high-dose combination chemotherapy, radiation, splenectomy, and experimental therapies were not reviewed. The review of BMT focused on allogeneic transplants using matched sibling and unrelated donors. Comparisons of pretransplant preparative regimens and protocols for preventing graft-versus-host disease (GVHD) were not within the Panel’s purview. Other important aspects of CML, eg, etiology, natural history, molecular and cytogenetic testing, autologous BMT, and treatment of accelerated phase and blast crisis, were beyond the scope of this review. The target condition, treatments, and outcomes of interest were defined explicitly as follows.
Target condition.
CML was considered present only with evidence of the Ph+ chromosome and/or chimeric bcr/abl gene. Excluded were bcr-abl–negative and Ph-negative disease, juvenile CML, chronic myelomonocytic leukemia, chronic neutrophilic leukemia, chronic eosinophilic leukemia or hypereosinophilic syndrome, and Ph+ acute leukemia.
Outcomes of interest.
Life expectancy (survival rate) was the primary measure for defining treatment efficacy. Relevant intermediate outcomes included evidence of hematologic or cytogenetic remission (as defined below), but these parameters were considered less persuasive than survival. Potential adverse effects of treatment were considered for each option. Treatment costs, although a measure of great importance,4 were not analyzed because of lack of adequate data.
Relevant evidence.
Relevant evidence addressed the target condition and the efficacy of the treatments listed above in terms of survival and/or hematologic/cytogenetic remission. Admissible evidence included controlled and uncontrolled observational studies, randomized controlled trials, and letters to the editor containing primary data. Excluded studies were those with fewer than 5 patients in chronic-phase CML, those without English-language text, and those published before 1980.
Literature search.
A computerized literature search of the MEDLINE database, conducted in 1996, sought all publications in which the text words “chronic myelogenous leukemia” appeared in the title or abstract. This search term was not expanded because an initial list of 2,423 citations was retrieved, of which 960 addressed treatments of interest. Two hundred seven articles met criteria for closer inspection. The core literature assembled from the computerized search was supplemented in 1997 and 1998 with additional relevant articles identified by scanning bibliographic reference lists and by suggestions from panel members and reviewers. This included articles on chronic myeloid, myelogenous, myelocytic, and granulocytic leukemia.
Criteria for evaluating quality.
Both observational studies and randomized controlled trials were reviewed, but the latter were generally considered a stronger class of evidence. For both categories, the internal validity of studies was judged on the basis of explicit criteria: sample size and statistical power, selection bias, methods for allocation to treatment groups, attrition rate, definition of intervention and outcomes, confounding variables, data collection biases, and statistical methods. External validity was judged in terms of the patients, treatment protocol, and clinical setting examined in the study. The designs, results, and limitations of the studies were assembled systematically in evidence tables (see Tables 2 through 6).
Development of recommendations.
Recommendations were evidence-based: this means that treatments could not be recommended unless the evidence met explicit predetermined criteria shown in Table 1. When such data were lacking, the panel generally chose not to make recommendations on the basis of indirect evidence (eg, uncontrolled observational studies) or expert opinion.
CHEMOTHERAPY WITH BUS AND HU
For many years the principal options for chemotherapy for treating Ph+ CML have included BUS and HU.5,6 The superiority of HU was finally established after a randomized controlled trial compared the agents7 and showed that median survival was significantly shorter for BUS-treated patients than for those treated with HU (45 v 58 months) (P = .008). The 5-year survival rates were 32% and 44%, respectively. A recent meta-analysis of 5 other trials also supports a survival advantage for HU over BUS.8
IFN
Observational Studies of rIFN-α
The bulk of the evidence for the effectiveness of rIFN-α therapy consists of at least 30 uncontrolled observational studies initiated in the 1980s (Table 2). The largest number of patients have been followed at the M.D. Anderson Cancer Center (Houston, TX) in observational studies where the probability of complete and partial hematologic remission in CML after interferon therapy is 70% to 80% and 6% to 10%, respectively.13-15 Remission rates reported by other investigators are generally lower and vary more widely. In studies where only rIFN-α is used, the rates of complete and partial hematologic remissions range from 7% to 81% and 6% to 50%, respectively (Table 2) (the variations in study design preclude accurate calculation of means or medians). Reported rates for complete and partial cytogenetic remissions range from 0% to 38% and 0% to 16%, respectively (Table 2).
Variations in study populations and response to rIFN-α.
The wide variations in reported outcomes with rIFN-α therapy stem largely from differences among studies in patient case mix (health status and risk factors), age, stage of disease, the number of months elapsed from diagnosis, the presence or absence of symptoms or physical findings, treatment regimens, and criteria for measuring outcomes, all of which influence prognosis. IFN regimens vary considerably in preparation, dose, duration, and criteria for changing the dose based on clinical response and toxicity.
The definitions of hematologic and cytogenetic remission, although patterned after the well-established criteria of the Houston group, are not identical. (The Houston group defines a complete hematologic remission as the achievement of a normal white blood cell (WBC) and platelet count [<10,000/μL and <450,000/μL, respectively], normal differential [no immature forms], and the disappearance of all symptoms and signs of CML. A partial hematologic response is defined as a decrease in the WBC count to <50% of the pretreatment level and <20,000/μL, or the normalization of the WBC count accompanied by persistent splenomegaly or immature cells in the peripheral blood. A complete cytogenetic response is defined as the absence of Ph+ metaphases; partial cytogenetic response as 1% to 34% Ph+. Major cytogenetic remission combines the percentages of complete and partial response.) In many reports, hematologic remissions after rIFN-α occur within a median of 1 to 3 months,16-18 but generalizing from these studies is uncertain in part because many of these patients received additional therapy before or concurrent with rIFN-α. Evidence exists for a dose-response relationship for rIFN-α.19,20 Although average doses for rIFN-α have been established, the effective dose for an individual may vary considerably from the mean. Most (but not all) studies suggest that doses of 4 to 5 million units (MU)/M2/d are more likely to achieve remission (and toxicity) than are lower doses.19,20 In observational studies the median duration of hematologic remission is 52 months, with 80% of responders remaining in remission more than 12 months.16 Durable hematologic remissions are more common in young patients who are treated soon after diagnosis, who have less advanced stage disease, and who have favorable prognostic features17,20,21 (Table 3).
Cytogenetic response is also more likely in patients who at diagnosis have favorable prognostic features including low or normal baseline platelet counts, a low percentage of blasts in the blood and marrow, or a nonpalpable spleen and who achieve a favorable hematologic response.21-23 In patients with a complete hematologic remission, the median time to complete cytogenetic remissions is 9 to 18 months, but it may occur after 4 years of therapy.16,18,24 Durable cytogenetic responses, some lasting as long as 10 years, are more common in patients in whom the Ph+ chromosome cannot be demonstrated from evaluable metaphases compared with those who experience only a partial cytogenetic remission.25
Problems With Uncontrolled Observational Studies With rIFN-α
Although single-arm uncontrolled observational studies are important because they highlight potential improvements in long-term outcomes, they are of limited value in proving that rIFN-α is more effective than chemotherapy because of methodologic problems limiting their interpretation. Sample sizes are generally small (often <50 patients), which does not permit meaningful statistical conclusions. Poorly described patient characteristics make it difficult to assess risk status. Retrospective studies may allow selection bias in choosing records to be analyzed. Prospective studies specifying inclusion criteria have rarely provided sufficient documentation to assure that the records of all, rather than only some, eligible patients were examined. These uncertainties introduce the possibility that observational studies might include a disproportionately large number of patients predisposed to experience particularly favorable or unfavorable outcomes.
Other methodologic problems exist. Lengths of follow-up in many studies are short, often less than the median survival time of the disease. Some centers have published multiple reports on the same cohort, but the data are inconsistent. In many studies, cytogenetic responses were measured only in patients who had a hematologic response (because patients without the latter are considered incapable of exhibiting cytogenetic improvement). Excluding patients from the denominator who did not respond hematologically tends to overestimate cytogenetic response rates.
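The effect of the choice of denominator can be illustrated with a short calculation. The sketch below uses invented counts (100 treated patients, 70 hematologic responders, 14 cytogenetic responders) chosen only for illustration; none of these figures, nor the helper name response_rate, come from the studies reviewed.

```python
def response_rate(responders, denominator):
    """Return a response rate as a percentage of the chosen denominator."""
    return 100.0 * responders / denominator

# Hypothetical cohort; these counts are illustrative only.
treated = 100
hematologic_responders = 70
cytogenetic_responders = 14

# Rate when only hematologic responders are counted in the denominator.
rate_among_hematologic = response_rate(cytogenetic_responders, hematologic_responders)

# Rate when every treated patient is counted in the denominator.
rate_among_all_treated = response_rate(cytogenetic_responders, treated)

print(f"Among hematologic responders: {rate_among_hematologic:.0f}%")
print(f"Among all treated patients:   {rate_among_all_treated:.0f}%")
```

With the same number of cytogenetic responders, the restricted denominator yields the higher rate, which is the direction of bias described above.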
Few studies report the primary outcome measure, survival rates, instead measuring effectiveness on the basis of intermediate outcomes, ie, achieving hematologic or cytogenetic remissions. Reliance on these surrogate markers to infer that patients experience an improvement in overall survival is problematic. Although the occurrence of complete hematologic remission increases the likelihood that patients will experience longer survival,9,22,26,27 it does not guarantee it. Hematologic remissions may be short-lived; cytogenetic remissions may not necessarily confer long-term benefits. Although most studies document an association between cytogenetic remission and improved survival,15,22,26-28 some do not.9,16 Moreover, cytogenetic remission is not always durable.29,30 In addition, complete cytogenetic remission does not always indicate elimination of all cells containing the chimeric bcr/abl gene. Using the reverse transcriptase-polymerase chain reaction (RT-PCR) assay to assess minimal residual disease, chimeric bcr-abl gene transcripts have been documented in patients who have achieved complete cytogenetic remission,23,30,31; however, in one recent study of very long-term rIFN-α–treated patients in continuous cytogenetic remission, no bcr-abl transcripts were detected in 10 of 18 patients.32 Whether PCR positivity or negativity predicts decreased or increased survival respectively is as yet unresolved. The outcomes that matter most to patients, life expectancy, disease-specific mortality, and quality of life, remain the best measures of success (or failure) but are often not measured.
Randomized Controlled Trials (RCTs) of IFN
The most compelling evidence that rIFN-α is more efficacious than chemotherapy comes from 4 prospective, randomized studies (the first 4 studies in Table 4) showing a statistically significant improvement in survival rates in patients receiving rIFN-α. Five-year survival rates in these RCTs were 50% to 59% for patients receiving IFN and 29% to 44% for patients receiving BUS or HU.9,10,22,26
These prospective RCTs provide more compelling evidence than observational studies for several reasons. The prospective design, which defines inclusion criteria and outcome measures at the outset before treatment, reduces, but does not eliminate, the likelihood of selection bias. The inclusion of a comparison group allows observed effects to be attributed with greater certainty to the intervention. Random allocation of patients helps distribute confounding variables equally among groups. Finally, the RCTs of CML look beyond intermediate outcomes (hematologic or cytogenetic remission) to measure survival, the meaningful indicator of effectiveness.
Evidence from even these RCTs is far from conclusive because of imperfections in their design and conduct. Most trials of IFN suffer from a common set of methodologic difficulties: (1) selective exclusion of patients from treatment postrandomization (due to poor response, eligibility for BMT, or other factors) (not making such exclusions is also an imperfect alternative, and systematically excluding patients before randomization [eg, those without confirmed cytogenetic abnormalities] would limit the generalizability of the findings); (2) failure to completely adhere to a standardized protocol; (3) variability in treatment regimens, which are not documented and in which clinicians are given latitude to alter the dosage or add other agents based on concerns about hematologic response or toxicity; and (4) crossover: patients allocated to receive rIFN-α are sometimes given chemotherapy when clinicians consider rIFN-α ineffective or too toxic. (Postrandomization crossover from rIFN-α to chemotherapy would presumably reduce, rather than exaggerate, the efficacy ascribed to rIFN-α based on an intention-to-treat analysis.)
Survival estimates are also subject to imprecision. Projections are typically based on Kaplan-Meier survival analysis in which the individual probabilities of survival for at-risk patients alive at successive points in time after treatment are multiplied together (“product limit” method), to calculate an overall estimate for the original cohort. Due to multiplier effects, significant errors in any one of these individual rates can have a dramatic influence on the final product. (These considerations apply also to survival estimates for bone marrow transplantation.)
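The product-limit calculation described above can be sketched in a few lines. This is a minimal illustration, not the method used in any of the trials reviewed: the function name kaplan_meier and the six-patient follow-up data are hypothetical, and patients censored at a given time are assumed to be still at risk at that time.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient (eg, months)
    events -- True if death was observed at that time, False if censored
    Returns a list of (time, estimated survival) at each distinct death time.
    """
    deaths = Counter(t for t, died in zip(times, events) if died)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Patients still under observation at time t (censored-at-t included).
        at_risk = sum(1 for u in times if u >= t)
        # Multiply by the conditional probability of surviving past t.
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up (months) for 6 patients; illustrative only.
times = [6, 12, 12, 20, 30, 30]
events = [True, True, False, True, False, False]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Because each estimate is a running product, an error in the factor at any one time point carries through to every later point on the curve.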
A good example of these analytical problems is illustrated by Italian and German randomized trials. The Italian multicenter study22 randomly assigned 218 patients to receive rIFN-α and 104 patients to receive HU or BUS (the control group). After a median follow-up of 68 months, the observed 6-year survival rate was 50% for the rIFN-α–treated patients and 29% for the controls (P = .002). The median survivals were 72 and 52 months, respectively. The time for progression from chronic phase to accelerated or blast phase was lengthened from 45 months to more than 72 months (P < .001). Although there was no significant difference in the frequency of hematologic remissions, cytogenetic remissions were significantly more common in the rIFN-α group.
In a multicenter RCT in Germany,9 622 patients were randomized to receive either rIFN-α, BUS, or HU. They were followed for a median of 35 to 41 months. Although the 5-year survival rate in the rIFN-α group (59%) exceeded that of the BUS group (32%) (P = .008), it was not significantly higher than that of the HU group (44%) (P = .44).
Much of the discrepancy between the Italian and German findings can be explained by differences in case mix and treatment regimens, a point emphasized in a comparative analysis conducted by the investigators in both trials.33 The inclusion criteria for studies differed: The German trial excluded patients with asymptomatic disease and included those in whom blasts and promyelocytes accounted for as many as 30% of peripheral WBCs. The Italian study excluded extramedullary involvement and required less than 10% blasts. Of the patients included in the German trial, 16% would have been excluded from the Italian trial. These differences may have produced a healthier case-mix in the Italian trial. The worst prognostic group (with a so-called Sokal score34 more than 1.2) accounted for 38% of patients in the German study but only 24% of those in the Italian trial. (A new prognostic scoring index more appropriate for patients treated with rIFN-α has been proposed.35)
Differences in treatment intensity in the rIFN-α and control groups may have also contributed to the results in the German study. In the Italian trial, 66 patients allocated to rIFN-α therapy also received HU, whereas rIFN-α patients in the German trial received only monotherapy (no adjunctive HU or BUS). In the German study, patients in the HU group were administered a standard dose (40 mg/kg/d) with the goal of achieving a normal leukocyte count, whereas in the Italian study the dosage remained at the discretion of the treating physician.
Explicit inferences about the magnitude of benefit attributable to specific treatments in these trials are difficult because of changes in treatment assignments. Both trials excluded or reassigned patients after randomization. In the German trial, 109 of the 622 randomized patients were analyzed separately (because they lacked the Ph+ chromosome or bcr/abl fusion gene rearrangement or because they could not be analyzed by cytogenetic or molecular studies). Randomization methods in both studies introduced opportunities for nonrandom allocations. In the German trial, random assignment to rIFN-α began 3 years after assignment to the control groups. Using a “checkpoint” protocol established at the study’s onset, 65 patients assigned to rIFN-α were later “rerandomized” to BUS or HU if their disease progressed, or were given “free treatment” if rerandomization was considered “medically inappropriate.” (BUS was considered inappropriate after its inferior effectiveness, compared with HU, was shown.) Seventy-three patients assigned to HU or BUS were crossed over to the other agent for similar reasons. The rIFN-α patients who were rerandomized to HU or BUS had lower survival (median, 52 v 72 months) than those treated with rIFN-α alone.
In the Italian trial, HU was substituted for rIFN-α in 23 patients and was added to rIFN-α in 66 patients (however, these patients had lower survival rates than those treated with rIFN-α alone and would tend to dampen the observed benefit of rIFN-α). Measurements of outcome were not entirely systematic in either trial. Cytogenetic tests, for example, were performed on only a subset of treated patients (eg, those in chronic phase after 8 months of treatment), and only the best results over time were reported.
In summary, in reconciling their differences, the German and Italian investigators concluded that after adjustment for differences in admission criteria, the survival rates with rIFN-α were similar in both studies, that the combination of rIFN-α and HU may be more effective than either agent alone, and that the best results occurred in patients with early phase CML without features of accelerated disease or of blast crisis.36
The conclusions from two other RCTs of IFN are also limited by methodologic considerations. An RCT of the United Kingdom Medical Research Council randomly assigned 293 patients to receive rIFN-α and 294 patients to receive HU or BUS.26 The reported 5-year survival rate was 52% for the rIFN-α group and 34% for the control group. The difference in the proportion of patients who died in each group (128/293 v 158/294) was statistically significant (P = .001). The incomplete documentation of this study, however, raises questions about whether the results can be generalized. Key characteristics of the patient population (eg, median age, months from diagnosis) were not described. The methods for selecting the pretreatment regimen and for randomizing patients to rIFN-α versus chemotherapy were also inexplicit. The pretreatment regimen was selected by randomization or by the “physician’s choice”; randomization for treatment was “at diagnosis or at this point.” The proportion of patients who crossed over from rIFN-α to chemotherapy, and vice versa, was not reported. An unspecified number of patients allocated to rIFN-α received a cytotoxic drug during induction if the WBC count increased above 30,000/μL. The treatment regimen for the control group was variable. Hematologic outcome data are reported on only 72% of the patients in the rIFN-α group and are not reported at all for patients in the control group. The length of follow-up is not specified.
A Japanese RCT,10 comparing rIFN-α (80 patients) with BUS (79 patients), reinforced the conclusions of the German trial that BUS is inferior to rIFN-α. After a median follow-up of 50 months, the predicted 5-year survival rate was 54% for patients receiving rIFN-α and 32% for those receiving BUS (P = .029). Hematologic and cytogenetic remission rates did not differ significantly. The probability of remaining in chronic phase for 5 years appeared higher in the rIFN-α than in the BUS group (41% v 29%), but the difference was not statistically significant. Published documentation of this study was also incomplete. The age, stage of chronic phase, and extent of prior treatment of the patient population were not reported. As many as 6% of the patients were excluded from the study after randomization.
On the other hand, a trial conducted by the Benelux Group27 did not observe a survival benefit for rIFN-α compared with HU. Investigators randomized 195 patients to receive either rIFN-α (combined with HU as needed) or HU only. Although the rIFN-α group had higher rates for complete hematologic and cytogenetic responses (62% v 38%, 9% v 0, respectively), the median survival was 64 months for the rIFN-α group and 68 months for the control group, a difference not statistically significant. Study design features may have accounted for these results. The study population was older than in other trials (median age, 56 years) and the rIFN-α group received a relatively low dose, averaging 2.14 MU/M2/d. Details of the rIFN-α regimen (eg, protocol for dose increase, duration of treatment) and the dose of HU were not reported. Other questions concerning this study have been raised.36
The added value of combining rIFN-α with cytarabine was first reported in observational studies.31,37-41 A recent French multicenter RCT adds further to this evidence.42 The investigators randomly assigned 360 patients to receive rIFN-α and HU (as part of induction) combined with cytarabine (20 mg/M2/d × 10d) and 361 patients to receive only rIFN-α and HU. During a median follow-up period of 35 months, there were 47 deaths in the combined-therapy group and 68 deaths in the rIFN-α-HU group (P = .02). The reported 3-year survival rate was 86% and 79%, respectively, although the median survival for CML had not been reached when the study was first published. The combined-therapy group had a significantly higher incidence of complete hematologic and cytogenetic remissions (66% v 55%, 15% v 9%, respectively). Cytogenetic data were not included for 96 study participants who had been randomized less than 12 months before the reference date and were excluded for 128 patients who did not receive complete treatment or cytogenetic testing.
Summary of Benefits With IFN
Despite the above-mentioned limitations in the design and conduct of the clinical trials, on balance, the accumulated evidence from RCTs suggests that, compared with BUS or HU, rIFN-α improves survival in chronic-phase patients with favorable features: no or minimal prior treatment, relatively normal hemoglobin levels and platelet counts, less than 10% blasts in the blood, and beginning treatment especially within 6 months of diagnosis when rIFN-α is coupled with other agents (HU or cytarabine). During early chronic phase, the treatment advantage of rIFN-α over chemotherapy is observed with varying magnitude in patients in each Sokal score (risk) category.43 Patients who continue rIFN-α during chronic phase do better than those who discontinue therapy. This survival advantage appears to be statistically significant. Meta-analysis suggests the pooled 5-year survival rate is 57% for rIFN-α and 42% for chemotherapy (P < .0001),8 which results from a delay in the onset of blast crisis.9,22
Compared with BUS or HU, the controlled trials suggest that, on average, rIFN-α increases life expectancy by a median of about 20 months. Patients overall have a 50% to 59% probability of being alive 5 years after treatment, which represents an improvement over the 29% to 44% 5-year survival rate seen with chemotherapy. However, achievement of a major cytogenetic response is associated with prolonged survival,42,44 as indicated by landmark analysis.45 (In a landmark analysis,45 a fixed time is selected after the initiation of therapy as a landmark for conducting the analysis. Those patients still on study at the landmark time are separated into 2 response categories according to whether they have responded before that time. Patients removed from protocol before the time of landmark evaluation are excluded from the analysis. Patients are then followed forward in time to ascertain whether survival from the landmark depends on the patient’s response status at the landmark, regardless of any subsequent shifts in tumor response status. Patients are analyzed according to their response status at the landmark time. Thus, probability estimates and statistical tests are conditional on the response status of patients at the landmark time.)
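The grouping step of a landmark analysis, as described in the parenthetical above, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name landmark_groups, the record fields (followup, response_time, died), the 12-month landmark, and the four patient records are all hypothetical, and a response recorded exactly at the landmark is counted as a response.

```python
def landmark_groups(patients, landmark):
    """Split patients by response status at the landmark time.

    patients -- list of dicts with keys:
        'followup'      months on study (to death or last contact)
        'response_time' months to response, or None if no response
        'died'          True if death was observed
    Returns (responders, nonresponders); each entry is
    (time measured from the landmark, died).
    """
    responders, nonresponders = [], []
    for p in patients:
        if p["followup"] < landmark:
            continue  # off study before the landmark: excluded
        time_from_landmark = p["followup"] - landmark
        # Status is fixed at the landmark; later responses do not change it.
        responded = p["response_time"] is not None and p["response_time"] <= landmark
        group = responders if responded else nonresponders
        group.append((time_from_landmark, p["died"]))
    return responders, nonresponders

# Hypothetical records; illustrative only.
patients = [
    {"followup": 10, "response_time": 6, "died": True},     # excluded
    {"followup": 40, "response_time": 9, "died": False},    # responder
    {"followup": 30, "response_time": 18, "died": True},    # late response
    {"followup": 24, "response_time": None, "died": True},  # no response
]

responders, nonresponders = landmark_groups(patients, landmark=12)
print("responders:", responders)
print("nonresponders:", nonresponders)
```

Survival in each group would then be estimated from the landmark forward, so any comparison is conditional on having remained on study to the landmark.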
The bulk of the evidence that rIFN-α improves survival comes from trials in which it is combined with other drugs. There is no direct evidence (from RCTs) that rIFN-α has a greater impact on survival than HU for patients who are in the later stages of chronic phase or who are sicker (eg, more than 1 year from diagnosis, or more than 10% to 30% blasts in peripheral blood). The single trial in which rIFN-α was used as monotherapy did not show a survival benefit. Adding cytarabine to rIFN-α appears to add further survival benefit but also increases toxicity. These benefits must be weighed against the adverse effects of the drug before judgments can be made about whether the tradeoff is worthwhile.
Summary of Adverse Effects With rIFN-α
Evidence regarding the adverse effects of rIFN-α in CML (Table 5) comes mainly from retrospective observational studies. These are compared to those observed in general clinical experience in Table 5. Reported complication rates vary widely owing to differences in patient selection and case mix, thoroughness of investigators in measuring side effects, definition of complications (eg, whether acute, subacute or chronic, mild or severe), sample size, dose and duration of rIFN-α, and length of treatment and follow-up.
In general, however, the evidence supports the clinical observation that toxicity is more common with rIFN-α than with BUS or HU. Virtually all patients receiving rIFN-α experience some constitutional side effects (Table 5), and discontinuation of treatment due to toxicity is necessary for 4% to 18% of patients compared with 1% of those receiving HU. One observational study reported that patients received only 60% of the target dose owing to side effects.16 Acute side effects are generally mild to moderate and include flulike symptoms such as fever, chills, and malaise. A wide constellation of other more severe acute reactions and chronic complications can occur (Table 5). Overall, the mechanisms underlying the toxic effects are not well understood, but the incidence of adverse effects is usually dose and duration dependent.46
ALLOGENEIC BMT
The efficacy of allogeneic BMT in the treatment of chronic-phase CML has been evaluated in a number of uncontrolled observational studies and several prospective studies (Table 6). Projected actuarial 3-year to 5-year survival rates in these studies range from 38% to 80%, with the higher values reported by experienced centers. Most studies report values around 50% to 60% and slightly lower probabilities for disease-free survival (Table 6). Reported relapse rates within 3 to 5 years are often less than 20% (Table 6). Projected survival curves appear to plateau (or taper more slowly) after 3 to 7 years, suggesting that allogeneic BMT offers eligible patients (especially young adults with a genetically HLA-identical sibling donor) a prospect for cure.
Concerns Regarding Interpretation of BMT Trials
Retrospective studies.
Most studies are retrospective, lack complete documentation of the clinical characteristics of the patient population, provide few details on methods for patient selection, use varied definitions of relapse, and are not randomized with controls. The largest studies tend to rely on registry data, collected from as many as 80 centers. Although some transplant studies do document inclusion criteria, they provide little additional information to ensure that results for all, rather than only some, patients meeting the criteria were analyzed.
Heterogeneous study designs.
Comparing outcomes across reports is difficult because of their heterogeneity. Many include patients treated by multiple protocols. Observed outcomes derive from a mixture of highly varied regimens with which clinicians and investigators have experimented over the years, making it unclear to which intervention(s) the outcome can be attributed. Allogeneic BMT is not a specific treatment but rather a general approach encompassing many different preparative regimens, stem cell sources, prophylactic regimens against GVHD, and methods of supportive care, all of which have changed dramatically in recent years.
Statistical problems.
Estimates of long-term survival are imprecise for statistical reasons. The median duration of follow-up in most BMT studies is either undocumented or less than 3 to 5 years. The multipliers for Kaplan-Meier calculations for patients surviving after BMT (eg, 7 to 10 years) are often drawn from a relatively small sample of patients who have lived that long. As an example, in the study with the longest follow-up (median, 84 months), van Rhee et al57 reported a 54% probability of surviving 8 years. In this study, the multipliers for survival 6 years beyond BMT were taken from only 10% of the original study population (patients who had lived that long and were considered still to be at risk). Estimates based on small numbers introduce imprecision for sample size reasons alone. The 95% confidence intervals for survival rates, which in various studies range from 26% to 86% (Table 6), reveal the imprecision of such estimates.
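The effect of a shrinking number at risk on Kaplan-Meier estimates can be illustrated with a short Python sketch. It computes the product-limit estimate together with a standard error from Greenwood's formula, which is an assumption here since the cited studies do not all state how their intervals were derived. The 10-patient data set is invented for illustration and does not come from any BMT study.

```python
import math
from typing import List, Tuple


def kaplan_meier(
    times: List[float], events: List[bool]
) -> List[Tuple[float, int, float, float]]:
    """Return (time, number at risk, survival estimate, Greenwood standard error)
    for each distinct time at which at least one death occurs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    greenwood_sum = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone who dies or is censored at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths > 0:
            # The "multiplier" for this interval is (1 - deaths / number at risk).
            surv *= 1.0 - deaths / n_at_risk
            if n_at_risk > deaths:
                greenwood_sum += deaths / (n_at_risk * (n_at_risk - deaths))
            rows.append((t, n_at_risk, surv, surv * math.sqrt(greenwood_sum)))
        n_at_risk -= leaving
    return rows


# Invented follow-up times in years; True marks a death, False a censored patient.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [True, False, True, False, False, False, False, False, True, False]
rows = kaplan_meier(times, events)
for t, n, s, se in rows:
    print(f"t={t}  at risk={n}  S(t)={s:.3f}  SE={se:.3f}")
```

In this toy example the last death occurs when only 2 patients remain at risk, so a single event halves the survival estimate and the standard error is several times larger than at the first event, when 10 patients were at risk.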
Lead-time issues.
Survival estimates are also confounded by lead-time issues. Many patients enter transplant studies well after their diagnosis, often having tried and failed treatment with rIFN-α. This would presumably reduce, rather than improve, the apparent benefit of transplant. No study has examined the efficacy of BMT as an alternative to rIFN-α for initial treatment.
Concerns Regarding the Comparison of BMT With rIFN-α Therapy
Current evidence does not prove unequivocally that BMT is necessarily more effective than rIFN-α as first-line treatment for chronic-phase CML. For reasons outlined above, the observed plateau in survival curves for BMT, to which rIFN-α is compared, is somewhat conjectural without longer periods of follow-up. Moreover, even if the accuracy of such curves is accepted, there is little direct evidence of how they differ from those of comparable patients treated with rIFN-α. Survival curves for BMT show that at least half of patients remain alive 5 to 10 years after treatment, whereas similar curves for rIFN-α show a continuous relapse rate over time, with the curves crossing (yielding a survival advantage to BMT) at about 7 to 8 years. This pattern is frequently cited as evidence that BMT cures CML.
Such inferences implying the superiority of BMT are arguable, however, given that the data on which the curves are based are derived from patients with different clinical presentations, lengths of follow-up, and analytic methods. Patients in BMT studies are, on average, at least 6 years younger than subjects in rIFN-α studies, are less likely to have splenomegaly, and have a smaller percentage of blasts.58 It is unclear whether age and other clinical characteristics that may be associated with selection for BMT (eg, better health status to survive the procedure) introduce confounding variables that influence survival rates observed in BMT studies. Conversely, the tendency of BMT patients to have had their disease for a longer period than patients starting on rIFN-α initially and, in some cases, to have already failed rIFN-α or other therapies would be expected to have an opposite effect. Length of follow-up tends to be shorter in BMT studies, so that the right-hand portion of survival curves, based on a relatively small sample size, is more likely to appear stable (plateau) than survival curves for rIFN-α that are based on a larger cohort of long-term survivors for which event rates are available.
Gale et al58 attempted to control for the differences just cited by comparing the survival of 548 patients from the International Bone Marrow Transplant Registry with 196 patients who had received rIFN-α or HU in the German RCT.8 Survival curves were adjusted for different patient characteristics and duration of illness, showing that the percentage of patients surviving was less for BMT patients during the first 18 months of treatment (reflecting early transplant-related mortality), similar between groups from 18 to 56 months, but significantly better for BMT after 56 months (P < .0001). The 7-year probability of survival (and 95% confidence interval) was 58% (50% to 65%) for BMT and 32% (22% to 41%) for rIFN-α/HU, with the survival advantage first becoming statistically significant after 5.5 years. The corresponding rates for patients transplanted within 1 year of diagnosis compared to those treated with rIFN-α/HU were 67% (56% to 75%) and 30% (21% to 40%), respectively, with the survival advantage appearing earlier at 4.8 years. These data support the view that BMT produces better long-term outcomes, but concerns remain regarding the definitive evidence. The primary data on which the survival curves are based suffer from fundamental design problems already noted, such as selection biases and the reliance on uncontrolled observational data supplied by BMT registries. Previously mentioned concerns about the influence of small numbers and uncertain censorship criteria on the imprecision of Kaplan-Meier survival estimates pertain to this analysis. Reflecting the longer median follow-up period in the German RCT (6.5 years) than in the BMT registry (4.3 years), the data used by Gale et al to calculate survival probabilities 6 years after treatment are taken from 24% of the German cohort but only 12% of BMT patients. Probability estimates at the critical 8-year endpoint of the analysis are based on only 15 patients.
The appropriateness of the study’s method for adjusting the two data sets for differences in case mix and duration of illness, based on a specially framed Cox proportional hazards regression model, is open to debate. Modifications in the assumptions used in the model, which could potentially alter the resulting survival curves, do not appear to have been tested in sensitivity analyses. The statistically significant differences observed in long-term survival may have been influenced by these assumptions. Indeed, when the investigators used a different approach to adjust for case mix (stratifying survival by Sokal score), the difference in survival rates between transplant patients and low-risk patients who received IFN did not achieve statistical significance. Statistically significant differences were noted in other subgroup analyses.
Cytogenetic and Molecular Evaluation
Although, as noted, RT-PCR negativity has been reported in patients in long-term cytogenetic remission after rIFN-α therapy,32 cytogenetic and molecular remissions are substantially more common after BMT. Molecular studies, however, show that an appreciable subset of patients who appear to be in complete cytogenetic remission after BMT harbor RT-PCR evidence of the bcr-abl chimeric gene,57,59-64 which may portend an increased risk of relapse.65 Although these data are consistent with the view that BMT offers the best chance of cure, it is difficult to rely on these findings to prove a survival advantage given current uncertainties about the link between such responses and long-term survival. In this regard, the documentation of RT-PCR positivity for bcr-abl chimerism in presumably normal adult marrow is of significance.66,67 Moreover, it has been suggested that, although there is a small probability that conventional RT-PCR assays will detect “innocent” bcr-abl genes, it is conceivable that they may be the source of sporadically positive tests in leukemia patients in long-term remission.67
Unlikely Resolution of BMT Versus IFN-Based Therapy
Ideally, the best strategy for overcoming these methodologic concerns and for providing definitive evidence of benefit is to compare survival rates in a cohort of chronic-phase patients in an RCT that randomly allocates patients to receive either BMT or nontransplant therapy, a study not performed to date. The performance of such a trial, however, would be difficult in these times. Current worldwide practice indicates that clinicians would be reluctant not to offer BMT to eligible patients, and a large number of participants might be required to achieve the statistical power needed to show an effect. In the absence of such evidence, there is no firm scientific basis from this evidence-based analysis for asserting that treating CML with one modality is of proven superiority over the other, and debates about the quality of the existing evidence for comparing BMT with IFN therapy will persist.
Potential Harms of BMT
Assuming that BMT is proven to increase the chances of survival in comparison to rIFN-α, the magnitude of the incremental increase in benefit must be weighed against the potential of serious harms and even death that may accompany the procedure, especially in the short term.
Death rate.
The reported probability that the patient will die as a result of BMT (transplant-related mortality) ranges from 20% to 41% (Table 7). Studies that included patients treated in the 1980s or those who received marrow from mismatched or unrelated donors report rates as high as 53% to 68% in certain subgroups. On the other hand, one center has reported rates as low as 15% among patients treated in recent years with marrow from matched siblings and receiving modern regimens for the prevention of opportunistic infections and GVHD.68
Preparatory regimen.
GVHD.
BMT is often followed by GVHD, opportunistic infections, or other complications. Between 8% and 63% of patients experience grade II-IV acute GVHD, a possible determinant of survival72,73 and the cause of death for 2% to 13% of patients undergoing BMT (Table 7). (Some studies suggest that GVHD has an antileukemic effect and improves survival.57,73,74) The rates for chronic GVHD are 4% to 75%, with 8% to 10% mortality (Table 7). Similar findings have been reported in studies that included patients with both CML and other leukemias.75,76
Higher rates of GVHD tend to be reported by studies which included patients treated in the 1980s or those who received marrow from mismatched or unrelated donors. Among patients receiving marrow from matched siblings and modern methods for GVHD prevention, reported incidence rates for acute and chronic GVHD are 35% or lower (Table 7).
Interstitial pneumonitis, veno-occlusive disease, and secondary malignancies.
Variables Likely to Improve BMT Outcomes
The accumulated evidence, confirmed by multivariate analysis, highlights key prognostic variables that are more likely to improve the outcome of BMT and improve the tradeoff between benefits and harms. Above age 20 years, the inverse relationship between age and survival appears to be continuous. Most studies suggest that patients under age 30 years have higher overall and disease-free survival and lower transplant-related mortality than patients over age 30.80-84 At certain centers using modern methods of GVHD prophylaxis, the influence of age on outcome appears to be relatively small.85 Most data suggest that instituting BMT within 1 to 2 years of diagnosis results in higher survival rates than BMT after 2 years,57,71,86 although early studies did not support this relationship.72,87 (The observed association between duration of disease and survival noted by van Rhee et al57 lost statistical significance after multivariate analysis.) Recipients of bone marrow from an HLA-matched unrelated donor generally have lower survival and are more likely to develop GVHD than those who receive a marrow transplant from an HLA-matched sibling or other relative.88,89 At one experienced center, however, survival rates after transplantation of matched unrelated donors are approaching those of matched siblings.90 Moreover, modern methods of genomic typing of class I HLA alleles add substantially to the success of transplantation from unrelated donors.91
Conditioning and pretransplant treatment regimens.
Studies have produced conflicting results regarding the optimal conditioning regimen and protocol for reducing the risk of GVHD. Patients who receive BUS before BMT tend to have lower survival rates than those who receive HU.83,86 T-cell depletion reduces the risk of GVHD, but it increases the risk of relapse and lowers survival.72,82,87,92 Observational studies have produced conflicting results regarding the potential adverse effects of prior treatment with rIFN-α on survival after BMT beyond that associated with the negative effects of delay itself.93-96 Prior treatment with rIFN-α does not appear to affect matched-related transplants.95,96 Results from one center suggest that for patients treated with matched unrelated donors, pretransplant rIFN-α administered for more than 6 months is associated with an increased risk of acute GVHD and mortality.97
PATIENT PREFERENCES
Every available option for the treatment of chronic-phase CML involves tradeoffs between benefits and harms. Which choice is best depends on objective clinical variables that influence probabilities (eg, patient age, stage of disease, co-morbid conditions, intensity of treatment) and on subjective variables related to personal preferences. Two patients faced with the same options of chemotherapy, rIFN-α or BMT, and their likelihood of benefit from each may make different choices depending on the relative importance of prolonging survival by a period of months or years, of achieving long-term remission, of avoiding potentially severe side effects and complications, of trading short-term risks for long-term benefits, and of basing decisions on strong scientific evidence from controlled studies versus expert opinion. Improved survival may not be the only valid objective in making choices.
It follows that, as in other medical situations, no treatment option should be pressed on patients without providing information about the potential benefits and harms and the quality of the evidence on which such projections are based. Patients who wish to take an active role in decision-making should be given the opportunity to weigh the options in terms of their personal preferences. Information about the importance of shared decision-making when clinical decisions involve complex tradeoffs appears elsewhere.98
RECOMMENDATIONS
IFN
(1) Based on evidence from randomized controlled trials, patients with good prognostic factors in the early stage of chronic-phase CML should be offered rIFN-α, perhaps with added chemotherapy (eg, HU or Ara-C) to achieve the highest probability of survival. This recommendation applies to newly diagnosed patients in chronic phase who do not suffer from other serious conditions that limit life expectancy or contraindicate the use of rIFN-α.
(2) Patients considering the aforementioned option should understand the degree to which life expectancy is increased by rIFN-α in comparison to chemotherapy—a median of about 20 months on average—to determine whether the added benefit is worth the increased risk of adverse effects associated with rIFN-α and the resulting effect on quality of life (patients who achieve a major cytogenetic response, however, may have a more prolonged survival). Patients should receive complete information about the most serious potential adverse effects of rIFN-α and their frequency to make an informed choice about its preferability to chemotherapy.
(3) In terms of proven survival benefits over HU, the evidence from one randomized trial is that monotherapy with rIFN-α is ineffective. The clinical trials in which rIFN-α has been shown to be more effective than chemotherapy combined rIFN-α with other agents (HU, BUS, or cytarabine) and included fewer patients with advanced disease.
(4) In clinical trials that did produce improved survival, the starting dose for rIFN-α was 3 to 5 MU/m²/d. The doses were gradually increased after 2 to 3 weeks to as high as 9 to 12 MU/d or to the maximally tolerated dose to achieve a satisfactory hematologic response (ie, WBC count of 2,000 to 4,000/μL, platelet count approximately 50,000/μL) or until the patient developed signs of toxicity and required dose reduction.
(5) There is inadequate evidence from controlled trials to recommend an optimal duration of rIFN-α therapy. In most trials, complete cytogenetic remissions were noted from 6 to 60 months after IFN therapy was started. In each study, rIFN-α was continued until disease progression or toxicity was noted.
(6) Based on evidence from a recent randomized controlled trial, adding cytarabine (20 mg/m²/d × 10 d) to rIFN-α is an option to increase the probability of survival, but the incremental benefit of doing so should be weighed against the increased risk of toxicity associated with this combination.
(7) Prolonged survival is most likely when a major or complete cytogenetic response is obtained after rIFN-α therapy. There is conflicting evidence from controlled trials to determine how long to continue rIFN-α treatment in patients who have achieved a complete response or, alternatively, who have demonstrated unsatisfactory hematologic or cytogenetic responses. Observational studies suggest that complete cytogenetic remissions tend to require from 6 months to 4 years of therapy. Evidence regarding treatment options for patients who have failed to respond to rIFN-α was not reviewed by the panel.
(8) There is inadequate evidence to set an upper age limit for considering rIFN-α therapy for CML. In the clinical trials that instituted an age-cutoff, patients were excluded if they were over the age of 70 to 75 years.
(9) Based on proven effects on survival, there is inadequate evidence from controlled trials to recommend rIFN-α over chemotherapy for patients in advanced chronic phase, including those with symptomatic disease or physical findings (eg, unexplained fatigue, weight loss, fever, progressive organomegaly, treatment-resistant leukocytosis, thrombocytosis, >10% blasts and promyelocytes in the differential count, extramedullary manifestations).
(10) For those patients who prefer conventional chemotherapy rather than rIFN-α, evidence from one randomized controlled trial (and several observational studies) supports the use of HU rather than BUS as the agent more likely to improve survival and less likely to produce serious toxicity. HU is a reasonable treatment option for patients who understand its reduced survival benefits in comparison to rIFN-α but prefer its less severe toxicity profile.
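The per-square-meter doses quoted in recommendations (4) and (6) above scale with body surface area (BSA). The following Python sketch shows only the arithmetic. The Mosteller formula is an assumption, since this analysis does not state how BSA was calculated in the trials, and the height and weight are invented example values.

```python
import math


def bsa_mosteller(height_cm: float, weight_kg: float) -> float:
    """Body surface area in m^2 by the Mosteller formula (assumed here for illustration)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)


def daily_dose(dose_per_m2: float, bsa_m2: float) -> float:
    """Scale a per-m^2 daily dose to a total daily dose."""
    return dose_per_m2 * bsa_m2


# Hypothetical patient: 180 cm, 80 kg.
bsa = bsa_mosteller(height_cm=180.0, weight_kg=80.0)

# rIFN-alpha starting range of 3 to 5 MU/m^2/d, from recommendation (4).
ifn_low_mu = daily_dose(3.0, bsa)
ifn_high_mu = daily_dose(5.0, bsa)

# Cytarabine at 20 mg/m^2/d, from recommendation (6).
cytarabine_mg = daily_dose(20.0, bsa)

print(f"BSA = {bsa:.2f} m^2")
print(f"rIFN-alpha starting dose: {ifn_low_mu:.1f} to {ifn_high_mu:.1f} MU/d")
print(f"Cytarabine: {cytarabine_mg:.1f} mg/d")
```

For this example patient the BSA works out to 2.0 m², giving an rIFN-α starting range of 6 to 10 MU/d and a cytarabine dose of 40 mg/d.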
Allogeneic BMT
(1) If physicians and patients require evidence of benefit from BMT from randomized controlled studies to determine treatment preferences, then evidence to make such a recommendation is lacking. Randomized prospective studies with internal controls have not been conducted to show whether allogeneic BMT, either as first-line treatment or after initial treatment with chemotherapy or rIFN-α, achieves longer survival than nontransplant therapy. Uncontrolled observational studies do report higher long-term survival rates with allogeneic BMT after chemotherapy compared with those typically seen in patients treated only with nontransplant approaches, and BMT appears to offer a greater chance of long-term remission. It is uncertain to what extent these results are due to selection biases and the analytic methods used. Moreover, whether they can be generalized to normal practice conditions is uncertain. Further, BMT is associated with a high risk of immediate complications and transplant-related mortality that can offset the benefits of treatment, especially in the short term. For physicians and patients who are comfortable accepting evidence from uncontrolled observational studies which suggest that allogeneic BMT is more effective than nontransplant approaches and who are interested in considering transplantation, the following recommendations are offered:
(2) Allogeneic BMT is an option if the patient has a suitable HLA-matched donor (but see below) and an acceptable health status to tolerate the procedure.
(3) Based on information provided, a patient must fully understand the tradeoff between potential long-term benefits and the more immediate risks of transplant-related complications and mortality. Depending on personal priorities and life plans, the patient should decide whether the potential increase in life expectancy is worth this risk. The patient should understand how his or her age, duration of illness, HLA match with the donor, and the experience of the transplant center may modify standard outcome estimates. Decisions to delay the procedure or, if a related donor is unavailable, to use a matched unrelated donor, should be made with a clear understanding of how these choices may reduce the chances of success.
(4) BMT should preferably be offered to patients within 1 to 2 years of diagnosis to achieve the greatest likelihood of success (according to evidence from uncontrolled observational studies). Patients with adverse prognostic factors (reflected by a high Sokal score) should understand that their chances of success with rIFN-α are reduced and that early BMT may be a more compelling option. For patients who have had CML for more than 1 year and for those who are considering delaying BMT until more than 1 year from diagnosis, a decision is required of whether the decreased likelihood of benefit justifies the risk of transplant.
(5) Younger patients are most likely to benefit from allogeneic BMT. BMT is also more successful if the donor is an HLA-matched sibling or other relative according to evidence from uncontrolled observational studies. Results at most centers are inferior when the transplant is performed with marrow from “matched” unrelated donors, but outcomes vary depending on patient selection, transplant methodology, typing techniques, the expertise of the participating center, and the definition of accelerated- and blast-phase disease. Although the tradeoff between benefits and harms from BMT narrows with advancing patient age and although there is virtually no experience with BMT beyond age 65 years, there is inadequate evidence to determine an upper age limit beyond which BMT should not be offered.
(6) Patients receiving chemotherapy before allogeneic BMT appear less likely to benefit from transplant if they have been treated with BUS according to evidence from uncontrolled observational studies. If the patient chooses early BMT, there is little evidence to determine the possible benefit of prior cytoreduction with HU or rIFN-α. There is observational evidence that prior treatment with rIFN-α does not compromise the results of matched-related transplants, but its effect on BMT with matched unrelated donors, based on a published study, appears deleterious. It is also unclear whether the patient’s hematologic or cytogenetic response to rIFN-α can reliably predict the success of allogeneic BMT.
EPILOGUE
Choosing the best treatment option for an individual patient requires an orderly consideration of several issues. The limitations of current evidence and issues of patient variability make it inappropriate to propose an algorithm specifying the choices that should be made at each step in the process. However, the logical sequence of decisions that must be made by the physician and patient is clear:
(1) The first consideration is whether BMT is a viable option. This determination begins with an orderly assessment of the patient’s age, health status, and availability of a marrow donor, either matched related or unrelated.
(2) If a nontransplant regimen is selected, decisions are required regarding details of drug administration. For example, if rIFN-α is administered, its dose, duration, and its combination with HU or cytarabine must be decided.
(3) A systematic plan must be established for evaluating the degree and duration of cytogenetic and molecular response.
(4) Once the available treatment and diagnostic options are clarified, the trade-offs, which involve not only the potential outcomes of treatment but also the patient’s preferences and personal priorities, must be examined.
DISCLAIMER
The recommendations contained in this analysis describe a range of approaches to the management of CML. These recommendations are not intended to serve as inflexible rules, and they are not inclusive of all proper methods of care or other methods of care that may achieve similar results. Adherence to the recommendations will not ensure a successful outcome in every case. The ultimate judgement regarding the care of a particular patient should be made by the physician in light of the clinical data and circumstances presented by the patient and the treatment options available.
ACKNOWLEDGMENT
We acknowledge with thanks the administrative assistance of the members of the Optimization Committee (ASH) and especially its Chairperson, James P. George, MD, and Maurice Mayrides and Martha Liggett of the Administrative Staff of ASH, and Olga Brandenberger, administrative secretary to RTS.
Adopted by the Executive Committee of the American Society of Hematology, July 1999.
REFERENCES
Author notes
Address reprint requests to Richard T. Silver, MD, Division of Hematology-Oncology, the New York Presbyterian Hospital–Weill Medical College of Cornell University, 525 E 68th St, New York, NY 10021.