Abstract
The clinical course for patients with chronic lymphocytic leukemia is extremely heterogeneous. The Rai and Binet staging systems have been used to risk-stratify patients; most patients present with early-stage disease. We evaluated a group of previously untreated patients with chronic lymphocytic leukemia (CLL) at initial presentation to University of Texas M. D. Anderson Cancer Center to identify independent characteristics that predict for overall survival. Clinical and routine laboratory characteristics for 1674 previously untreated patients who presented for evaluation of CLL from 1981 to 2004 were included. Univariate and multivariate analyses identified several patient characteristics at presentation that predicted for overall survival in previously untreated patients with CLL. A multivariate Cox proportional hazards model was developed, including the following independent characteristics: age, β-2 microglobulin, absolute lymphocyte count, sex, Rai stage, and number of involved lymph node groups. Inclusion of patients from a single institution and the proportion of patients younger than 65 years may limit this model. A weighted prognostic model, or nomogram, predictive for overall survival was constructed using these 6 characteristics for 5- and 10-year survival probability and estimated median survival time. This prognostic model may help patients and clinicians in clinical decision making as well as in clinical research and clinical trial design.
Introduction
Chronic lymphocytic leukemia (CLL) is the most common adult leukemia in the United States.1 The clinical course is remarkably variable; some patients live out their life not needing treatment and die of unrelated causes, whereas others have rapidly progressive disease requiring treatment within months of diagnosis and succumb to their disease within 2 to 3 years. The Rai2,3 and Binet4 clinical staging systems broadly identify risk groups based on clinical and laboratory characteristics. Overall, stage correlates with survival; however, for each stage there is still heterogeneity, limiting utility in predicting survival. In addition to factors used in clinical staging, several other patient characteristics and laboratory tests have been correlated with overall survival, including age,5 sex,5 pattern of bone marrow involvement,6–8 lymphocyte doubling time,9,10 and the presence of prolymphocytes in blood or bone marrow.11–13 Other factors that can be measured in the laboratory have also been correlated with poor prognosis, including the presence of chromosome abnormalities such as 17p deletion and 11q deletion,14 elevated serum levels of β-2 microglobulin (β-2M), thymidine kinase, and soluble CD23,15,16 unmutated immunoglobulin heavy chain variable gene (IgVH),17,18 and expression of ZAP-70 (refs 19,20) and CD38 (refs 18,21) by leukemia cells. Alone, each of these prognostic factors has limited utility in predicting overall survival.
A nomogram is a graphic representation of a statistical model with scales for calculating the cumulative effect of weighted variables on the probability of a particular outcome.22 Nomograms enable continuous estimation of the probability of a particular outcome, such as death. The strength of using nomograms is that they combine multiple independent variables to predict an outcome and enable appreciation of the prognostic weight for each variable in calculating the probability of such an outcome.
Prognostic models and nomograms can be developed for a variety of clinical outcomes, including overall survival,23–25 disease-specific survival,26–29 probability of developing metastasis,30 and probability of relapse or recurrence,22,31,32 outcomes that are useful for patients, physicians, and researchers. They can facilitate discussion and counseling on the impact of disease on a patient's life, when to initiate therapy, in developing patients' expectations for outcomes, and in discussions of long-term outlook. They are useful to clinical scientists in developing expectations for clinical trials and identifying patients “at risk” who should be targeted for aggressive therapy or investigational approaches. They also may give insight into the biology of disease.
Generating predictive models and nomograms for overall survival in previously untreated patients with CLL is difficult given the chronic nature of the disease and duration of follow-up to observe enough events for a reliable model. Owing to development and maintenance of a unique and complete database of patients with CLL, we were able to perform an analysis of previously untreated patients who presented to the University of Texas M. D. Anderson Cancer Center (MDACC) for evaluation and treatment recommendations during more than 20 years. Follow-up for these patients, including for survival, has been ongoing. Using this database, we identified presenting characteristics that correlated with overall survival in univariate and multivariate analyses. A multivariate Cox proportional hazards model was developed that included the 6 significant independent covariates to predict overall survival. A nomogram and prognostic index for overall survival were developed using this model and may be a useful prognostic tool for patients, clinicians, and clinical investigators.
Patients, materials, and methods
Patients
Previously untreated patients who presented for initial evaluation to MDACC from August 1981 through August 2004 were included in this analysis (Tables 1–2). All patients provided informed consent, in accordance with MDACC IRB guidelines and the Declaration of Helsinki, and underwent initial evaluation, including history, physical examination, laboratory evaluation of blood counts and chemistries, and bone marrow examination. There were 383 patients who presented to MDACC from 1981 to 1995, 530 from 1996 to 2000, and 761 from 2001 to 2004. The following were recorded at presentation: sex; age; Rai and Binet stages; Zubrod performance status; physical examination, including number of nodal sites affected, and liver and spleen sizes; and laboratory evaluation, including complete blood count and measurement of serum albumin (ALB; normal range, 35-47 g/L [3.5-4.7 g/dL]), alkaline phosphatase (Alk phos; normal range, 38-126 IU/L), lactate dehydrogenase (LDH; normal range, 313-618 IU/L), β-2M (normal range, 51-170 nM [0.6-2.0 mg/L]), and quantitative immunoglobulin levels (normal ranges, IgG: 6.24-16.8 g/L [624-1680 mg/dL]; IgA: 0.74-3.27 g/L [74-327 mg/dL]; IgM: 0.29-2.14 g/L [29-214 mg/dL]). Percentages of cellularity and lymphocytes in bone marrow were recorded for patients who underwent bone marrow aspirate and biopsy.
Patients who did not have an NCI Working Group indication for treatment33,34 were observed. Treatment started within 30 days of presentation for 390 patients, within 30 to 60 days for 34 patients, and within 60 to 90 days for another 31. Significant heterogeneity in treatment was noted because the study period spanned more than 20 years. With the currently available follow-up, 767 patients have not been treated, 719 received treatment on an MDACC clinical trial, and 183 received treatment either off protocol at MDACC or with their referring physician. For those patients who were followed by their referring physician, MDACC follow-up consisted of scheduled return visits to MDACC or telephone contact with the referring physician's office to obtain clinical notes, laboratory values, and follow-up survival status.
Statistical methods
Descriptive statistics, including median, range, and first and third quartiles were used to summarize the patient characteristics. Overall survival probability was estimated by the method of Kaplan and Meier.35 The difference between patient subgroups for each variable was assessed using the log-rank test.36 The time interval was measured from the day of presentation to MDACC until death or last follow-up. Death from all causes was included.
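The product-limit estimate described above can be illustrated with a minimal sketch. It assumes follow-up is measured in years from presentation and that death from any cause is the event; the function names and the toy cohort are hypothetical and are not study data or the S-plus code used for the analysis.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  -- follow-up from presentation to death or last contact
    events -- 1 if the patient died (any cause), 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """First time at which S(t) is 0.5 or lower; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy cohort (illustration only).
toy_times = [2.0, 3.0, 3.0, 5.0, 8.0, 9.0]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

A median of "not reached" corresponds to `median_survival` returning `None`, which happens when the estimated curve never falls to 50%.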
Univariate and multivariable Cox proportional hazards models were fit to examine the relationship between survival time and patient characteristics.37 A final multivariable Cox model was obtained by performing a backward elimination with P value cutoff of .05, then allowing any variable previously deleted to enter the final model if its P value was less than .05.
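The selection procedure can be sketched as a loop, under stated assumptions. Here `p_values_for` is a hypothetical callable standing in for fitting a Cox model to a given set of covariates and returning each covariate's P value; the toy P values are invented for illustration and do not reproduce the study's fits.

```python
def backward_eliminate(candidates, p_values_for, cutoff=0.05):
    """Backward elimination followed by one re-entry pass.

    candidates   -- covariate names to start from
    p_values_for -- hypothetical callable: given a list of covariates,
                    fits a model and returns {name: p_value}
    cutoff       -- P value threshold for staying in (or re-entering)
    """
    kept = list(candidates)
    dropped = []
    while kept:
        p = p_values_for(kept)
        worst = max(kept, key=lambda name: p[name])
        if p[worst] < cutoff:
            break  # every remaining covariate is significant
        kept.remove(worst)
        dropped.append(worst)
    # Re-entry: each deleted covariate gets a chance to return.
    for name in dropped:
        trial = kept + [name]
        if p_values_for(trial)[name] < cutoff:
            kept = trial
    return kept


# Toy stand-in for model fitting: fixed, invented P values.
toy_p = {"age": 0.001, "sex": 0.03, "ldh": 0.40, "albumin": 0.20}


def toy_fit(names):
    return {n: toy_p[n] for n in names}


selected = backward_eliminate(list(toy_p), toy_fit)
```

In a real fit the P values change each time a covariate is removed, which is why the re-entry pass can restore a variable that was dropped earlier.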
Nomogram development began by identifying patient characteristics predictive for overall survival in the multivariate Cox model. These characteristics included age, β-2M, absolute lymphocyte count (ALC), sex, Rai stage III or IV, and number of involved lymph node groups. The nomogram was constructed as described by Kattan et al.22 Patients without values were dropped from the analysis. There were 1561 patients included in this multivariable model; all had the specified characteristics measured at initial presentation. The formula to calculate the total point score for a patient is −12.5 + [1.25 × age] + [4.32 × β-2M] + [8.62 × (ALC/100)] + [7.34 × I(sex = male)] + [11.00 × I(Rai = III or IV)] + [10.84 × I(nodes = 3)], where ALC is expressed in units of 10⁹/L and I() is the indicator function, equal to 1 if the condition in the parentheses is met and 0 if not.
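The point-score formula can be written as a short function. This is a sketch rather than the authors' implementation; it assumes age in years, β-2M in mg/L, and ALC in units of 10⁹/L, and the function and argument names are hypothetical. The example patient is invented.

```python
def nomogram_points(age_years, b2m, alc_e9_per_l, male, rai_stage, node_groups):
    """Total nomogram points from the six presenting characteristics.

    age_years    -- age at presentation (assumed to be in years)
    b2m          -- beta-2 microglobulin (assumed to be in mg/L)
    alc_e9_per_l -- absolute lymphocyte count in units of 10^9/L
    male         -- True for men, False for women
    rai_stage    -- Rai stage as an integer, 0 through 4
    node_groups  -- number of involved lymph node groups
    """
    return (
        -12.5
        + 1.25 * age_years
        + 4.32 * b2m
        + 8.62 * (alc_e9_per_l / 100.0)
        + 7.34 * (1 if male else 0)
        + 11.00 * (1 if rai_stage >= 3 else 0)
        # The formula is written with "nodes = 3"; ">= 3" is used here
        # so that a larger count is treated the same way (assumption).
        + 10.84 * (1 if node_groups >= 3 else 0)
    )


# Invented example: 60-year-old man, beta-2M 3.0, ALC 50, Rai 0, 3 node groups.
example_points = nomogram_points(60, 3.0, 50.0, True, 0, 3)  # about 97.95
```

The total is then read against the total points scale of the nomogram to obtain 5- and 10-year survival probabilities; those scales are given graphically and are not reproduced here.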
Validation of the nomogram consisted of generating the concordance index, which is the probability that, given 2 randomly drawn patients, the patient who dies first has the higher probability of death. This was calculated by bootstrapping 200 samples from the original 1561 patients used to fit the Cox model, and it served as an unbiased measure of the ability of the nomogram to discriminate among patients. Bootstrapping involves removing a small random sample of patients from the cohort while the remaining patients are analyzed as the actual results. Next, we examined the calibration of the nomogram, which also included 200 bootstrap resamples. A calibration curve, generated by plotting actuarial survival against predicted survival probabilities for patients stratified by predicted risk, was used to assess the prediction accuracy of the nomogram. All analyses were conducted with S-plus 2000 Professional software (Statistical Sciences, Seattle, WA).
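A concordance index for censored survival data can be sketched as follows. This is an illustration under assumptions, not the validation code used in the study: it uses a Harrell-type pairwise definition, and the resampling helper draws patients with replacement (the conventional bootstrap) while keeping the risk scores fixed. A full optimism correction would refit the model in each resample, which is not shown. All names and the toy data are hypothetical.

```python
import random


def concordance_index(times, events, risk):
    """Share of usable pairs in which the earlier death has the higher risk.

    A pair is usable when the patient with the shorter follow-up died.
    Ties in predicted risk count as half-concordant.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable if usable else float("nan")


def bootstrap_c(times, events, risk, n_boot=200, seed=0):
    """Concordance index in n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(times)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(
            concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risk[k] for k in idx],
            )
        )
    return values


# Hypothetical toy data: follow-up (years), death indicator, nomogram points.
toy_t = [2, 4, 6, 8, 10]
toy_e = [1, 1, 0, 1, 0]
toy_r = [90, 60, 70, 75, 40]
toy_c = concordance_index(toy_t, toy_e, toy_r)  # 6 of 8 usable pairs: 0.75
toy_boot = bootstrap_c(toy_t, toy_e, toy_r)
```

A value of 0.5 corresponds to chance ordering and 1.0 to perfect ordering of patients by predicted risk.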
Results
Patient characteristics
There were 1674 previously untreated patients with CLL at their initial MDACC presentation included in these analyses (Table 1). The median time from diagnosis to presentation to MDACC was 3.8 months. Approximately 60% were men, the majority had a performance status of 0 or 1, and patients of all Rai stages were represented. The median age was 58 years, younger than the median age of patients presenting to community practice. Routine blood counts and chemistries were performed at presentation (Table 2). The first and third quartiles are shown for appreciation of the distribution of patient characteristics in these analyses. Some patients were missing information as noted in Tables 1–2; these patients were not included in the multivariate analysis. Characteristics that had missing data for more than 300 patients were not included in the multivariate analysis.
Survival
Survival for the entire group was estimated by Kaplan-Meier analysis (Figure 1). The estimated median overall survival was 10.7 years (95% CI, 9.8-11.2 years). The median follow-up time for all patients was 4.9 years (95% CI, 4.6-5.1). Of 1674 patients, there were 443 deaths during the follow-up time. With the currently available follow-up, 258 of these patients received treatment on an MDACC clinical trial, 45 were treated off protocol, and 140 died without documented treatment.
The 443 deaths were reviewed to assess for disease-specific mortality. This was done by review of death certificates, referring physician records and notes, and MDACC records. Active CLL was the cause of death for 165 patients (Table 3). In these cases, active CLL was characterized by proliferative disease, marrow failure, or immune dysfunction. The final event for some of these patients may have been pneumonia, bacteremia, or other infection, but the dominant underlying process was active CLL. Other deaths in this category included Richter transformation and bleeding complications related to low platelet count with active, progressive CLL. Infection accounted for 72 deaths. In those cases, active CLL was not the dominant clinical feature. Autoimmune hemolytic anemia was associated with 4 deaths. Cardiac events accounted for 15 of the deaths, and another 15 died of assorted causes unrelated to CLL, such as stroke or other medical condition. Second malignancies (solid tumors or hematologic) accounted for 72 of the deaths. In 99 cases, the cause of death could not be ascertained. Death in complete remission was rare (< 2%). We also evaluated the causes of death with regard to age (Table 3). Active CLL was the cause of death in 41% of patients younger than 50 years old, 40% of patients aged 50 to 65 years, and 32% of patients older than 65 years (Table 3). The cause of death was unknown for 24%, 18%, and 26% of patients younger than 50, 50 to 65, and older than 65 years.
Univariate analysis.
Univariate analysis was performed to identify patient characteristics that correlated with survival (Table 4). Longer time from diagnosis to MDACC presentation correlated with higher risk of death (Table 4). Women had an estimated median survival of 12 years (95% CI, 10.5-15.0 years) versus 10 years (95% CI, 8.6-10.9 years) for men. The median ages at the time of presentation to MDACC for women and men were not significantly different at 59 and 58 years, respectively. Age was a significant predictor for survival; the median survival times for patients younger than 50, 50 to 65, and older than 65 years were 13.3 years (95% CI, 11.8 years to NA), 11.0 years (95% CI, 9.5-12.1 years), and 7.5 years (95% CI, 6.3-8.6 years), respectively.
The estimated median survival times by Rai stage were as follows: 11.5 years (95% CI, 10.8-13.7 years) for stage 0; 11.0 years (95% CI, 10.2-12.8 years) for stage I; 7.8 years (95% CI, 7.2-10.9 years) for stage II; 5.3 years (95% CI, 5.0-10.0 years) for stage III; and 7.0 years (95% CI, 4.6-9.3 years) for stage IV (Figure 2). These estimated survival times are similar to those previously reported with a noted exception of patients with Rai stage IV having a longer estimated survival than those with Rai stage III disease. Binet stage also correlated with survival; patients with stage A, B, and C disease had estimated median survival of 11.5 years (95% CI, 10.8-12.9 years), 8.6 years (95% CI, 7.9-10.9 years), and 7.0 years (95% CI, 5.2-7.6 years), respectively. Zubrod performance status (PS) predicted for survival; patients with PS 0 to 1 had median survival of 10.8 years (95% CI, 10.0-11.7 years) versus 6.0 years (95% CI, 2.6 years to NA) for those with PS 2 to 3.
Most laboratory parameters as well as liver and spleen measurements correlated with survival in univariate Cox proportional hazards analysis; the exception was quantitative immunoglobulin levels (Table 4). Natural log transformation was performed for several of the laboratory values to minimize the effect of skewing of data points. The number of involved lymph node sites also correlated with survival. The estimated median survival times by number of involved lymph node sites were as follows: none, 11.3 years (95% CI, 10.3-12.9 years); 1 node site, 10.9 years (95% CI, 8.8-14.5 years); 2 node sites, 11.0 years (95% CI, 10.0-15.4 years); and 3 node sites, 8.5 years (95% CI, 7.6-10.0 years).
Martingale residual analysis was used to investigate for meaningful cut points for statistically significant continuous variables in their correlation with overall survival such as age and β-2M. With this analysis, there were no variables that had distinct cut points identified for overall survival (data not shown).
Multivariate analysis.
Multivariate regression analysis was performed to examine the relationship of independent variables with overall survival in Cox proportional hazards modeling. All significant characteristics identified in the univariate analysis were used to develop the multivariable model for survival. A total of 1617 patients with pertinent available data were included in the model; there were 432 (26.7%) deaths in this group during the follow-up period. Table 5 indicates the best model after eliminating variables that were not statistically significant.
Predictive nomogram
A nomogram (Figure 3) was developed to predict for survival using the 6 independent covariates identified in the multivariate model (Table 5). The nomogram is used by totaling the points identified on the top scale for each independent covariate. This total point score is then located on the total points scale to read off the probability of 5- and 10-year survival and the estimated median survival. The contribution of each covariate to the total score can be visually appreciated and was potentially greatest for age, β-2M, and ALC, followed by sex, Rai stage (III or IV), and the presence of 3 palpable lymph node groups. The median total points for the 1617 patients used to fit the multivariable Cox model was 82.9 (range, 31.5-187.2); the first and third quartiles were 70.9 and 96.6, respectively.
The concordance index for this nomogram was 0.84 based on the fitted multivariable Cox model. The calibration curve (Figure 4) illustrates how the predictions from the nomogram compare with actual outcomes for the 1617 patients. The dashed line represents the performance of an ideal nomogram, in which predicted outcomes perfectly match with the actual outcomes. The dots were calculated from subcohorts of our dataset and represent the performance of our nomogram based on the Cox model, including age, β-2M, ALC, sex, Rai stage (0-II versus III and IV), and number of involved lymph node groups (0-2 versus ≥ 3). The dots were close to the dashed line, which implies that the prediction from our nomogram approximates the actual outcome. The X's were bootstrap-corrected predictions, which are more appropriate estimates of actual survival. Most of the X's are close to the dots, indicating that the predictions based on the use of the nomogram and modeled data (the dots) are near that expected from the use of the new data (the X's).
Prognostic index
A simplified prognostic index was developed using the 6 prognostic factors (Table 6). The index score is the sum of points assigned as follows: 1 point for each of β-2M 1 to 2 times the upper limit of normal (ULN), age younger than 50 years, ALC 20 to 50 × 10⁹/L, Rai stage III to IV, 3 or more involved nodal groups, and male sex; 2 points for each of β-2M greater than 2 times ULN, age 50 to 65 years, and ALC greater than 50 × 10⁹/L; and 3 points for age older than 65 years. Risk is assigned as follows: index score 1 to 3, low; index score 4 to 7, intermediate; and index score 8 or greater, high risk (Figure 5). The estimated median survival times by risk group were as follows: not reached for low risk; 10.3 years (95% CI, 9.5-11.0 years) for intermediate risk; and 5.4 years (95% CI, 4.7-7.4 years) for high risk. The 5- and 10-year survival probabilities are provided in Table 7.
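The index scoring can be sketched as a small function. Several details here are assumptions made for illustration: the β-2M upper limit of normal defaults to 2.0 mg/L (the top of the normal range quoted in the methods), values below the lowest listed category score 0 points, and boundary values (exactly 50 or 65 years, exactly 20 or 50 × 10⁹/L, exactly 1 or 2 times ULN) are assigned to the category whose stated range includes them. Function and argument names are hypothetical.

```python
def prognostic_index(age_years, b2m, alc_e9_per_l, male, rai_stage,
                     node_groups, b2m_uln=2.0):
    """Simplified prognostic index score (sum of points for 6 factors)."""
    score = 0

    # Age: 1 point under 50, 2 points for 50 to 65, 3 points over 65.
    if age_years < 50:
        score += 1
    elif age_years <= 65:
        score += 2
    else:
        score += 3

    # Beta-2M relative to the upper limit of normal (assumed 2.0 mg/L).
    ratio = b2m / b2m_uln
    if ratio > 2:
        score += 2
    elif ratio >= 1:
        score += 1

    # ALC in units of 10^9/L: 1 point for 20 to 50, 2 points above 50.
    if alc_e9_per_l > 50:
        score += 2
    elif alc_e9_per_l >= 20:
        score += 1

    score += 1 if male else 0
    score += 1 if rai_stage >= 3 else 0
    score += 1 if node_groups >= 3 else 0
    return score


def risk_group(score):
    """Map an index score to the low, intermediate, or high risk group."""
    if score >= 8:
        return "high"
    if score >= 4:
        return "intermediate"
    return "low"


# Invented example: 60-year-old man, beta-2M 3.0 mg/L, ALC 50, Rai 0, 3 node groups.
example_score = prognostic_index(60, 3.0, 50.0, True, 0, 3)  # 2 + 1 + 1 + 1 + 0 + 1 = 6
example_group = risk_group(example_score)  # "intermediate"
```

Because every patient receives at least 1 point for age, the score under these assumptions ranges from 1 to 10, which matches the stated lower bound of the low-risk group.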
Discussion
Prognostic models can facilitate discussion between physicians and patients, help to identify high-risk patients for whom new treatments and clinical trials can be developed, and may provide insight into the biology of disease. Nomograms have been developed to predict various clinical end points for patients with other types of malignancies.22–32
Several prognostic factors for survival have been identified for patients with CLL. Clinical stage has been used to stratify patients according to risk for progression and death from their disease. We developed a nomogram for untreated patients to estimate 5- and 10-year survival probability and estimate median survival time based on a multivariate Cox proportional hazards model that included 6 independent patient characteristics measured at presentation to MDACC. This modeling aims to account for at least some of the heterogeneity seen within clinical stages and provides as accurate a predictor as possible for survival based on a large patient population experience. The model and nomogram have been validated as reliable predictors for survival in previously untreated patients with CLL, independent of an indication for treatment.
We also developed a prognostic index based on the 6 significant factors as a simplified tool to assess risk. A prognostic index was developed for patients with follicular lymphoma, referred to as the Follicular Lymphoma International Prognostic Index (FLIPI), to stratify patients into low-, intermediate-, and poor-risk groups.38 Remarkably, the characteristics identified in the FLIPI, namely age, Ann Arbor stage, HGB level, number of affected nodal areas, and serum LDH, are similar to those identified for the CLL nomogram.
An advantage of the nomogram is that it is a weighted model, combining independent prognostic factors and enabling appreciation of the magnitude of impact of each of the factors on the probability of survival. The 2 major, heavily weighted, factors in this model were β-2M and age. Not surprisingly, these 2 prognostic factors have reproducibly been correlated with overall survival for patients enrolled on clinical trials. This was most recently appreciated with the chemoimmunotherapy regimen fludarabine, cyclophosphamide, and rituximab for previously untreated and previously treated patients with CLL.39,40
In this study, overall survival was the analysis end point. We did not distinguish disease-specific mortality from other causes of mortality. Many patients with CLL succumb to infection-related complications, which are likely related to immune suppression from underlying disease as well as treatment. The vast majority of patients have disease present at the time of death, and it is conceivable that the presence of the disease was in some way related to death. We reviewed the 443 deaths in this study to assess cause; 165 could be directly attributed to active CLL, and the others were attributed to disease-related complications such as infection, second malignancies, or unrelated causes (Table 3). A multivariate model developed for CLL-specific survival (censoring other deaths) revealed the following significant (P < .05) independent predictors for survival: age, β-2M, ALC, and number of affected lymph node groups (data not shown). There were significantly fewer events (165 deaths) in this model, and, as such, Rai stage and sex fell out of the model as significant predictors. Notably, when evaluating for CLL-related mortality, age remains a significant prognostic factor. We feel that it is more clinically relevant and useful to analyze and have a predictive tool for overall survival from all causes, rather than to distinguish disease-related mortality.
There were 441 patients older than 65 years, among whom 176 (40%) died. The median age in this subgroup was 71 years (range, 66-90 years). The median time to death was 7.5 years (95% CI, 6.3-8.6 years). A multivariate model for survival in patients older than 65 years included the following significant (P < .05) independent predictors for survival: age, β-2M, and performance status (data not shown). The concordance index for the subpopulation of patients older than 65 years using the original 6-factor nomogram was 0.82, indicating a well-fitted model that can be validly applied to patients older than 65 years of age.
This analysis included patient characteristics measured at the time of presentation to MDACC, not at the time of diagnosis. Therefore, a range of time from diagnosis to MDACC evaluation was noted with a median time to MDACC of 3.8 months. Longer time to MDACC predicted for shorter survival in univariate analysis but not in multivariate analysis. The characteristics included in the final model are all characteristics that can easily and rapidly be assessed on clinical presentation. This model may be valid for previously untreated patients, independent of time from diagnosis, and this model may be used serially for the same patient over time, prior to treatment. This will need to be validated in a prospective fashion.
Prognostic modeling has some limitations. This is a single center study of patients who presented to a referral institution. As such, they are younger (median age, 58 years) than patients who present to community practice (median age, older than 65 years). Despite this, the concordance index for patients older than 65 years with this model was very high at 0.82. This model is internally validated; however, it would be strengthened by external validation with patients evaluated and monitored at multiple institutions. Finally, the laboratory assays used to measure β-2M may differ between laboratory facilities. In addition, different laboratories potentially have different normal ranges. This may affect the portability and ability to generally apply the nomogram.
Some of the patients included in this analysis began treatment soon after their initial evaluation at MDACC. Others continued to be monitored with observation and have not received treatment. The patients enrolled on this study span more than 20 years, during which significant changes and advances in treatment were made. This prognostic model does not account for potential treatment advances and may therefore underestimate survival, assuming a positive impact of therapy on survival.
New prognostic factors have been identified that were not measured on initial presentation to MDACC in this study group. Examples include IgVH mutational status, ZAP-70 expression, and chromosome abnormalities identified by interphase fluorescence in situ hybridization (FISH) analysis. Without these test results available at initial presentation and without cryopreserved material obtained at presentation for retrospective testing, we were unable to incorporate them into this prognostic modeling. Although we have characterized IgVH mutation status in a subgroup of 401, ZAP-70 expression in 303, and chromosome abnormalities by interphase FISH in 123 of these patients, there are too few events (deaths) among the patients in these subgroups to reliably incorporate any of these factors in a multivariate analysis. This will require continued follow-up. Predicting survival may be improved by incorporating these prognostic factors in future predictive models.
Future work will focus on validating this model, both externally and in a prospective manner. We are also currently investigating incorporation of newer prognostic factors into this type of prognostic modeling, including cytogenetic analysis by FISH, IgVH mutational status, ZAP-70 expression, and CD38 expression. In addition, models for survival are being developed for patients at initial treatment, at treatment for relapsed disease, and for other end points such as time to treatment, time to progression, and time to treatment failure.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
W.G.W. is a Leukemia and Lymphoma Society Scholar in Clinical Research.
Authorship
Contribution: W.G.W. designed research, initiated analysis, summarized results, and wrote manuscript; S.O. reviewed manuscript and managed patients; X.W. did statistical analysis; S.F., A.F., J.C., D.T., G.G.-M., C.K., M.B., F.G., F.R., and H.K. managed patients; K.-A.D. supervised statistical analysis; S.L. supervised data collection; and M.K. engaged in helpful clinical discussions and managed patients.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: William G. Wierda, Department of Leukemia, 1515 Holcombe Blvd, Unit 428, Houston, TX 77030; e-mail: wwierda@mdanderson.org.