Abstract
The outcome of chronic myeloid leukemia (CML) has been profoundly changed by the introduction of tyrosine kinase inhibitors into therapy, but the prognosis of patients with CML is still evaluated using prognostic scores developed in the chemotherapy and interferon era. The present work describes a new prognostic score that is superior to the Sokal and Euro scores both in its prognostic ability and in its simplicity. The score was developed and tested on a group of patients selected from a registry of 2060 patients enrolled in studies of first-line treatment with imatinib-based regimes. The EUTOS score, which uses the percentage of basophils and spleen size, best discriminated between high-risk and low-risk groups of patients, with a positive predictive value of 34% for not reaching a complete cytogenetic response (CCgR). Five-year progression-free survival was significantly better in the low- than in the high-risk group (90% vs 82%, P = .006). These results were confirmed in the validation sample. The score can be used to identify CML patients with significantly lower probabilities of response to therapy and of survival, thus alerting physicians to those patients who require closer observation and early intervention.
Introduction
Until the introduction of allogeneic stem cell transplantation and IFNα,1,2 the course of Philadelphia-positive (Ph+) chronic myeloid leukemia (CML) from the chronic phase (CP) to the accelerated phase (AP), the blastic phase (BP), and then death was almost linear. The introduction of imatinib, the first of a series of tyrosine kinase inhibitors (TKIs) that target the oncogenic protein coded by the BCR-ABL gene,3,4 profoundly changed the treatment of CML. With TKIs, the yearly death rate is currently ∼ 2%, and > 80% of patients are projected to be alive after 8 years.5-7 However, a systematic approach to assessing the prognosis of TKI-treated patients has not yet been developed; instead, the use of 2 prognostic classifications developed for patients treated with either conventional chemotherapy (the Sokal score)8 or IFNα (the Euro score)9 has persisted, not only in imatinib-treated patients but also in patients participating in the most recent studies involving second-generation TKIs.10,11 Clearly, the baseline prognostic evaluation of CML must be revisited to specifically evaluate TKI-treated patients. For this purpose, a European registry of CML patients was established by the European LeukemiaNet (ELN). It has been maintained and implemented within the framework of a project supported by Novartis Oncology Europe, the European Treatment and Outcome Study for CML (EUTOS). In the present study, these registry data were used to develop a new prognostic risk score able to predict the probability of achieving a complete cytogenetic response (CCgR) within 18 months, which is the most solid and confirmed surrogate marker of survival.7
Methods
Patients
All studies complied with the Declaration of Helsinki and were approved by the ethics committees of all participating institutions. The ELN/EUTOS CML registry contains the individual data for adult patients enrolled in prospective, controlled, Good Clinical Practice–operated studies between 2002 and 2006. The eligibility criteria for the registry were diagnosis of Ph+/BCR-ABL+ CML in CP and any form of imatinib-based treatment within 6 months after diagnosis regardless of the duration of imatinib treatment. These criteria were fulfilled by 2060 patients from 5 national study groups: German12 (n = 699); Italian (the Italian Group for Adult Hematologic Diseases or GIMEMA)13 (n = 556); French14 (n = 546); Nordic15 (n = 140); and Dutch (HOVON; n = 119).16
Of these study groups, we analyzed all 1261 patients who progressed or died within 36 months or had a minimum follow-up of 36 months, and all 1223 patients in whom cytogenetic response status had been evaluated at 18 months (acceptable interval 15-21 months). The latter group was divided into 2 subgroups: a learning sample of 938 patients from the German, GIMEMA, and HOVON groups, and a validation sample of 285 patients from the French and Nordic groups. A patient flow diagram is provided in Figure 1. All clinical and hematologic factors were determined at baseline. Patient demographics were comparable across the national subsets. There were no relevant differences in timing of diagnosis, diagnostic procedures, or monitoring. All patients had been enrolled in prospective studies of treatment with imatinib or imatinib-based regimes.
Definitions
Statistics
For every case, time was calculated from the starting date of imatinib treatment. For the analysis of time-to-event data, Kaplan-Meier curves, log-rank tests, and, as needed, competing risk methods were applied. The new prognostic score was developed on a learning sample (German, GIMEMA, and HOVON patients) and tested on a validation sample (French and Nordic patients). Because CCgR at 18 months after the start of therapy has proven to be a solid surrogate parameter for the risk of progression, a prognostic model for this parameter was developed on the learning sample and later validated on the independent sample.
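As an illustration of the Kaplan-Meier method named above, the following is a minimal sketch in plain Python. It is not the analysis code used for the registry (the calculations were done in SAS and PASW); the function name `kaplan_meier` and the toy follow-up times are hypothetical and chosen only to show the product-limit calculation.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time of each patient (e.g. months from start of imatinib)
    events -- 1 if the event (e.g. progression or death) occurred at that time,
              0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed exactly at time t
        n_leaving = 0  # everyone (events + censored) leaving the risk set at t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy data (hypothetical, not registry data): 5 patients, 3 events, 2 censored.
toy_curve = kaplan_meier([6, 12, 12, 24, 36], [1, 0, 1, 1, 0])
for t, s in toy_curve:
    print(f"month {t}: S(t) = {s:.2f}")
```

Patients censored at the same time as an event are counted as still at risk at that time, which is the usual convention.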
In the learning sample, logistic regression and χ2 tests were used to assess the predictive relevance of candidate variables on CCgR status at 18 months. To develop the new score, influential variables were identified and combined in multiple models. The result of the linear predictor of each model was then categorized into 1 of 2 risk groups by application of the minimal P value approach.17,18 After adjustment for multiple testing, the minimal P value approach identified the cutoff point that separated the sample into 2 risk groups according to the smallest P value for CCgR at 18 months. In calculating positive predictive values (PPVs), negative predictive values (NPVs), sensitivities, and specificities, the models were compared with each other and with the established scores. PPVs were always calculated for the high-risk group, and NPVs for the remaining patients. P = .05 was considered to be significant. Calculations were done using SAS Version 9.2 software and PASW Version 17 software.
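The cutoff search in the minimal P value approach can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure as implemented by the authors: it scans every observed value of the linear predictor as a candidate cutoff and keeps the one with the largest Pearson chi-squared statistic for the resulting 2 × 2 table (for 1 degree of freedom, the largest statistic corresponds to the smallest unadjusted P value). The adjustment for multiple testing mentioned in the text is deliberately omitted, and the function names and toy data are hypothetical.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the table [[a, b], [c, d]],
    without continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def minimal_p_cutoff(scores, events):
    """Return (cutoff, chi2) for the cutoff with the largest chi2.

    scores -- linear predictor value per patient
    events -- 1 if the patient had no CCgR at 18 months, else 0
    Patients with score > cutoff form the high-risk group.
    """
    best_cutoff, best_stat = None, -1.0
    # The largest score is skipped: it would leave the high-risk group empty.
    for cutoff in sorted(set(scores))[:-1]:
        a = b = c = d = 0
        for s, e in zip(scores, events):
            if s > cutoff:
                if e:
                    a += 1  # high risk, no CCgR
                else:
                    b += 1  # high risk, CCgR
            else:
                if e:
                    c += 1  # low risk, no CCgR
                else:
                    d += 1  # low risk, CCgR
        stat = chi2_2x2(a, b, c, d)
        if stat > best_stat:
            best_cutoff, best_stat = cutoff, stat
    return best_cutoff, best_stat


# Toy data (hypothetical, not registry data).
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [0, 0, 0, 0, 1, 0, 1, 1]
print(minimal_p_cutoff(toy_scores, toy_events))
```

Because the cutoff is chosen to maximize the test statistic, the unadjusted P value is optimistic, which is why the text applies an adjustment for multiple testing before interpreting it.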
Data collection, processing, and all statistical analyses were exclusively carried out by the Central Data Center (by J.H., V.H., D.L., and M.P.) at the Department for Medical Information Processing, Biometry and Epidemiology of the Ludwig-Maximilians-University of Munich, Germany.
Results
Registry eligibility criteria were met by 2060 patients. Their median age was 52 years (range 18-88) and 60% were male. According to the Euro score, 38% of the patients were at low risk, 51% at intermediate risk, and 11% at high risk. The corresponding numbers for the Sokal score were 39%, 37%, and 24%. The standard monotherapy of 400 mg/d of imatinib was administered to 41% of the patients, 400 mg of imatinib combined with either low-dose arabinosyl cytosine (LDAC) or IFNα to 34% of the patients, and 600 or 800 mg of imatinib to 25% of the patients. The median observation time was 42 months (range 1-81 months). At 36 months, the cumulative incidence of a first CCgR was 92%. The overall survival probability at 60 months was 91%.
CCgR and risk of progression
Among the 1261 patients observed for at least 36 months and evaluated for CCgR, the proportion of patients who had achieved CCgR increased over time: from 7% after 3 months to 31% (6 months), 65% (9 months), 74% (12 months), and 82% (15 months) until reaching 85% after 18 months. There was a clear relationship between not achieving CCgR by a certain time and an increasing risk of progression within 3 years after the start of therapy (Table 1). Patients who had not achieved CCgR after 6 months had an 8% risk of subsequent progression, which increased to 14% after 12 months and 23% after 18 months. Correspondingly, the likelihood of achieving CCgR at a later date decreased from 85% at 6 months to only 31% at 18 months. These data supported the selection of CCgR status at 18 months as the dependent outcome variable for the analysis of a prognostic model.
Sokal and Euro scores for CCgR at 18 months
Of the 1223 patients in whom cytogenetic remission status at 18 months had been examined, the Euro and Sokal scores were available for 1165 and 1167 patients, respectively. The CCgR rate at 18 months was 84%. With both scores, the discrimination was significant only for high-risk patients. The PPVs for not achieving CCgR were 25% (high-risk Sokal) and 26% (high-risk Euro).
Improvement of the discriminatory power of the 2 established scores was attempted by combining low- and intermediate-risk groups and using the minimal P value approach to define a new cutoff point for the 2 prognostic classes with the greatest differences. However, the discriminatory power did not improve sufficiently. The PPVs of high-risk patients were between 25% and 28% and the relationship between sensitivity and specificity was not well balanced. Specifically, either there were many patients in the high-risk group and thus the sensitivity was high but the specificity was low, or the specificity was high but there were few patients without CCgR who were identified as high-risk.
Identification of prognostic baseline variables in the learning sample
Because neither the Euro nor the Sokal score provided a satisfactory prediction, logistic regression was applied to identify factors with a significant impact on CCgR status at 18 months. Candidate variables were the 6 laboratory parameters (Table 2), as well as age, sex, and spleen size. All analyses were restricted to the learning sample. In univariate analysis, spleen size, leukocytes, blasts, eosinophils, and basophils were found to have a statistically significant influence on the event “no CCgR at 18 months.”
New EUTOS risk score in the learning sample
Given the correlation between potential prognostic factors, various significant models were identified and tested. In the minimal P value approach, the most discriminatory cutoff point of the linear predictor of the multiple logistic regression models defined a low-risk and a high-risk group. The best explanatory model for sensitivity, specificity, and PPV included only basophils (P = .0024) and spleen size (P = .0105). The model could be shortened without a loss of accuracy to yield a simple formula for calculating the new prognostic score: EUTOS score = (7 × basophils) + (4 × spleen size), where the spleen was measured in centimeters below the costal margin and basophils as a percentage at baseline. A EUTOS score of > 87 indicates high risk and ≤ 87 low risk. The original equation was: EUTOS score = (0.0700 × basophils) + (0.0402 × spleen size), with spleen size and basophils measured in the same units. For this formula, a EUTOS score > 0.8754 indicates high risk and ≤ 0.8754 low risk. Based on the monotone transformation in the logistic regression model, it was also possible to calculate the individual estimated probability for a patient not to achieve a CCgR as follows: Probability of no CCgR = exp(−2.1007 + 0.0700 × basophils + 0.0402 × spleen size)/(1 + exp[−2.1007 + 0.0700 × basophils + 0.0402 × spleen size]). The cutoff point for this probability was located at 22.7%. The new prognostic score identified a high-risk group (risk score > 87) with a PPV of 33% (Table 3 and Figure 2A).
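The formulas in this section can be written as a short calculation. The following Python sketch uses only the coefficients and cutoffs quoted in the text; the function names and the example patients are hypothetical and serve only as illustration.

```python
import math

# Values quoted in the text.
SIMPLE_CUTOFF = 87        # simplified score: > 87 is high risk, <= 87 low risk
INTERCEPT = -2.1007       # intercept of the logistic regression model
COEF_BASOPHILS = 0.0700   # per percentage point of basophils at baseline
COEF_SPLEEN = 0.0402      # per cm of spleen below the costal margin


def eutos_score(basophils_pct, spleen_cm):
    """Simplified EUTOS score: 7 x basophils (%) + 4 x spleen size (cm)."""
    return 7 * basophils_pct + 4 * spleen_cm


def eutos_risk_group(basophils_pct, spleen_cm):
    """Return 'high' if the simplified score exceeds 87, otherwise 'low'."""
    return "high" if eutos_score(basophils_pct, spleen_cm) > SIMPLE_CUTOFF else "low"


def prob_no_ccgr(basophils_pct, spleen_cm):
    """Estimated probability of not achieving CCgR at 18 months,
    from the logistic model with the original coefficients."""
    lp = INTERCEPT + COEF_BASOPHILS * basophils_pct + COEF_SPLEEN * spleen_cm
    return math.exp(lp) / (1 + math.exp(lp))


# Hypothetical patients (not registry data).
print(eutos_score(3, 5), eutos_risk_group(3, 5))      # 41 low
print(eutos_score(10, 10), eutos_risk_group(10, 10))  # 110 high
print(round(prob_no_ccgr(3, 5), 3))
```

A linear predictor of 0.8754, the cutoff of the original equation, maps to a probability of about 22.7% under this formula, which matches the probability cutoff stated in the text.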
New EUTOS risk score in the validation sample and in the total sample
Very similar results were obtained when the new prognostic score was applied to the validation sample, comprising 271 patients (Table 3 and Figure 2B). As in the learning sample, every third patient in the high-risk prognosis group was not in CCgR at 18 months, with a PPV of 34%. Finally, the new score was applied to all 1197 patients with a known CCgR status at 18 months, for whom data on basophils and spleen size were available (Table 3 and Figure 2C). The PPV in these patients was 34%, the sensitivity 23%, and the specificity 92%.
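For readers who want to reproduce measures of this kind from a 2 × 2 table, the sketch below shows the definitions used here, with "no CCgR at 18 months" as the event and the high-risk group as the positive prediction, so that the PPV refers to high-risk patients and the NPV to the remaining patients. The function name and the counts in the example are hypothetical and are not the registry data.

```python
def predictive_values(tp, fp, fn, tn):
    """Compute PPV, NPV, sensitivity, and specificity from a 2 x 2 table.

    tp -- high-risk patients without CCgR at 18 months
    fp -- high-risk patients with CCgR at 18 months
    fn -- low-risk patients without CCgR at 18 months
    tn -- low-risk patients with CCgR at 18 months
    """
    return {
        "ppv": tp / (tp + fp),          # share of high-risk patients without CCgR
        "npv": tn / (tn + fn),          # share of low-risk patients with CCgR
        "sensitivity": tp / (tp + fn),  # share of no-CCgR patients flagged high risk
        "specificity": tn / (tn + fp),  # share of CCgR patients flagged low risk
    }


# Made-up counts for illustration only.
result = predictive_values(tp=10, fp=20, fn=30, tn=140)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

A score with a high specificity but a low sensitivity, as reported for the total sample, flags few patients as high risk and therefore misses many of those who do not reach CCgR.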
EUTOS risk score and treatment
Patients in the registry had been enrolled in prospective studies assessing different treatments: imatinib alone (400, 600, or 800 mg daily) or 400 mg of imatinib daily in combination with either LDAC or IFNα. The differences between low- and high-risk patients were significant in all treatment groups: for 400 mg of imatinib, P < .008 (Figure 2D), for 600-800 mg of imatinib, P < .0001, and for imatinib + LDAC/IFNα, P < .0001. The discriminating power of the risk score was maintained in all 3 groups (Table 4), with a PPV for high-risk patients ranging between 27% for higher-dose imatinib and 40% for the combinations of imatinib with either LDAC or IFNα.
Cumulative incidence of CCgR and progression-free survival
Achieving a CCgR within 18 months has been shown to be a solid early surrogate marker of outcome. The formula of the new EUTOS score could be applied to 1873 registry patients for whom data on spleen size, basophils, and known time to CCgR were available, and to 2010 patients for whom data on follow-up of survival were available. Figure 2C shows the cumulative incidence for CCgR. Figure 3 shows the probabilities of survival free from progression to AP or BP in both EUTOS risk groups. At 5 years, the projected figures for PFS were 82% (95% C.I. 73%-89%) for high-risk patients and 90% (95% C.I. 88%-92%) for low-risk patients (log-rank test, P = .0069). Information on basophils, spleen size, and progression status was available for 1239 patients who had been observed for at least 36 months or who had died before that time. The sensitivity of the EUTOS score for PFS was 16% and the specificity was 91%. Among patients in the high-risk group, 12% progressed compared with 7% in the low-risk group.
Discussion
We used data from 2060 patients of the ELN/EUTOS European CML registry who had been enrolled in prospective, investigator-sponsored studies of imatinib treatment to test the Sokal and Euro scores for their relationship with treatment failure. CCgR status at 18 months was selected as an end point because it is a conservative but solid and confirmed predictor of treatment success or failure.7 Whereas both scores were found to have discriminatory power, neither was well balanced in terms of sensitivity and specificity. Therefore, EUTOS, a new prognostic score based only on the percentage of basophils in the blood and on spleen size, was formulated and shown to have improved predictive power. The statistical value of the EUTOS score was validated in an independent dataset. Moreover, its simplicity allows it to be easily applied in clinical practice. However, the biologic implications of this new model remain to be determined.
A combination of spleen size and basophils was determined to be the best predictor of CCgR. Indeed, in all studies performed over the last 50 years, spleen size has consistently been identified as a significant predictor of treatment outcome irrespective of treatment,1 although a standardized method of assessing and reporting spleen size is still lacking. Therefore, measuring and reporting spleen size with more objective methods such as ultrasound scan or computed tomography, rather than the current method of manual palpation, may further improve the EUTOS score's predictive power. We speculate that spleen size provides a rough and imprecise but nonetheless strong appreciation of the extent of extramedullary hematopoiesis. In adults, normal hematopoiesis is limited to the bone marrow. Normal stem cells reside in marrow niches, where their proliferative and differentiation properties are kept under cellular and environmental control.19 Ph+ stem cells circulate in the blood to a much greater extent than normal stem cells and can home outside the marrow,20-23 including in the spleen. This provides a different niche, one that could favor genetic instability, clonal evolution, self-renewal, and defective differentiation. Although there are no recent data supporting these hypotheses, several years ago it was reported that spleen Ph+ cells were somehow different from marrow and blood Ph+ cells in terms of their kinetic properties and cytogenetic patterns.24,25 Although 3 studies failed to confirm the clinical benefits of early splenectomy,26-28 it may now be appropriate to revisit the role of the spleen, investigating the composition and the structure of the splenic microenvironment with particular attention to stem cell niches and the conditions under which Ph+ stem and progenitor cells proliferate, differentiate, and mature in the spleen.
An increase in the number of basophils has long been recognized as a signal of CML progression.29,30 In fact, a clinical definition of complete hematologic response is < 5% blood basophils, whereas > 20% defines acceleration.7 In an independent study, basophils were found to predict a molecular response.31 Therefore, basophil percentages are an established and confirmed prognostic factor, even though the reasons for the prognostic strength of this measure are not clear.
The new EUTOS score developed and validated in this study drew on a database of more than 2000 prospectively diagnosed, treated, and monitored patients with CP CML who had been enrolled in independent, investigator-sponsored studies.12-16 As has been the case with applying the established Sokal and Euro scores, the patients analyzed with the EUTOS score did not receive identical treatment. Imatinib was administered in a daily dose of 400-800 mg and in several patients was combined with LDAC or IFNα. Because different treatments may result in different responses, the EUTOS score was also tested with respect to treatment and, irrespective of treatment dose and type, was able to significantly predict patient response.
The new EUTOS score is not revolutionary in the way that the Sokal and Euro scores were when they were introduced, but it marks a significant advance because it provides better PPVs than those obtained with either of the previous scores. Moreover, it is specifically based on imatinib-treated patients, and does not prolong the use of prognostic classifications that include factors (eg, age, platelet count, blast cells, and eosinophils) that have not been found to affect the response to imatinib. The new score is also simpler and more practical in its application because it uses only 2 variables, both of which can be easily measured in routine health care practice worldwide.
In the present study, the EUTOS score predicted that 34% of high-risk patients will fail to achieve a CCgR in 18 months, and also predicted PFS. Whereas the value of the score should be further validated by other investigators, it will be difficult to improve upon because the therapeutic efficacy of imatinib is high. If it were even higher, no discrimination would be possible. We believe that only progress in assessing molecular response32-38 will support a better prognostic classification. In addition, advances in pharmacogenomics, gene-expression profiling, and whole-genome sequencing studies39-44 will no doubt contribute to identifying the molecular basis of failure, and thus will enable a more patient-tailored and better-targeted form of therapy and treatment evaluation. Until then, however, the EUTOS score is a simple and inexpensive method for determining prognosis. As demonstrated herein, 1 in 3 high-risk patients fails on imatinib. Moreover, although the new and validated EUTOS score requires only 2 variables, its predictive power is better than that of the Sokal and Euro scores. This does not mean that all treatment decisions must be based on prognosis, but that the EUTOS score can identify patients with a significantly higher risk of progression and impaired survival, thus alerting the treating physician to the need for closer patient observation and early therapeutic intervention.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
The authors thank Lara Montrucchio, Carmela Piccolo, Elke Engstroem, and Peter Schuld for logistic and organizational support.
The European Treatment and Outcome Study (EUTOS) is a project supported by Novartis Oncology Europe through a contract with European LeukemiaNet and the University of Heidelberg (Germany).
Authorship
Contribution: J.H. and M.B. designed the research; M.B., J.G., S.S., G.R., F.G., K.P., G.O., B.S., and R.H. collected the data; V.H. performed statistical analysis; D.L. processed and analyzed the data; J.H., M.B., V.H., M.P., and R.H. analyzed and interpreted the data and wrote the manuscript; and all authors revised and approved the manuscript.
Conflict-of-interest disclosure: J.H. (principal investigator) has received funding from Novartis. M.B. (principal investigator) has a consultant or advisory role with and has received honoraria from Bristol Myers Squibb and Novartis Pharma and research funding from Novartis Pharma. V.H. has received research funding from Novartis Pharma. S.S. has received research funding from Novartis. G.R. has a consultant or advisory role with and has received honoraria from Novartis and Bristol Myers Squibb. F.G. has a consultant or advisory role with and has received research funding from Novartis, honoraria from Bristol Myers Squibb, and research funding from Novartis and from the French Minister of Health. K.P. has received honoraria and research funding from BMS and Novartis. G.O. has a consultant or advisory role with Novartis and has received research funding from BMS. D.L. has received research funding from Novartis. B.S. has a consultant or advisory role with and has received honoraria and research funding from Novartis. M.P. has received research funding from Novartis. R.H. has received honoraria from Novartis and BMS and research funding from Novartis, Roche, and Essex. J.G. declares no competing financial interests.
Correspondence: Joerg Hasford, Institut für Medizinische Informationsverarbeitung, Biometrie und Epidemiologie, Ludwig-Maximilians-Universität, Marchioninistr 15, 81377 München, Germany; e-mail: has@ibe.med.uni-muenchen.de.
References
Author notes
J.H. and M.B. contributed equally to this work.