B-cell chronic lymphocytic leukemia (B-CLL) is a heterogenous disease with a highly variable clinical course. Recent studies have shown that CD38 surface expression on the malignant cell clone may serve as a prognostic marker in that CD38+ patients with B-CLL are characterized by advanced disease stage, lesser responsiveness to chemotherapy, and shorter survival than CD38− patients. To further investigate the molecular phenotype of these 2 clinical subgroups, we compared the gene expression profiles of CD38+ (n = 25) with CD38− (n = 45) B-CLL patients using oligonucleotide-based DNA chip microarrays representative of approximately 5600 genes. The results showed that B-CLLs display a common gene expression profile that is largely independent of CD38 expression. Nonetheless, the expression of 14 genes differed significantly between the 2 groups, including genes that are involved in the regulation of cell survival. Furthermore, unsupervised hierarchical cluster analysis of 76 B-CLL samples led to the separation of 2 major subgroups, comprising 20 and 56 patients. Clustering to the smaller group was due in part to the coordinate high expression of a large number of ribosomal and other translation-associated genes, including elongation factors. Importantly, we found that patients with high expression of translation factors were characterized by a more favorable clinical course with significantly longer progression-free survival and reduced chemotherapy requirements than the remaining patients (P < .05). Our data show that gene expression profiling can help identify B-CLL subtypes with different clinical characteristics. Furthermore, our results suggest a role of translation-associated genes in the pathogenesis of B-CLL.
Introduction
B-cell chronic lymphocytic leukemia (B-CLL) is a heterogenous disease with a highly variable clinical course. Staging systems devised by Rai et al1 and Binet et al2,3 are useful methods for predicting survival and treatment requirements in patients with CLL. However, these staging systems are of limited prognostic value in early stages of the disease (Binet A or Rai 0-II); this includes most patients at diagnosis. Therefore, studies have focused on identifying novel prognostic markers that may help define patient subgroups with favorable versus poor clinical outcomes in early CLL.3,4 Recently, 2 independent studies by Damle et al5 and Hamblin et al6 have demonstrated that B-CLL may arise from an immature pregerminal center B cell with unmutated immunoglobulin (Ig) variable heavy chain (VH) genes or from a more mature postgerminal center B cell with somatically mutated immunoglobulin VH genes. Moreover, Damle et al5 found a strong correlation between immunoglobulin VH gene mutation status, CD38 surface expression of the respective B-CLL clone, and clinical outcome in individual patients. B-CLL patients with mutated immunoglobulin VH genes and low numbers of CD38+ cells exhibit a favorable clinical course, whereas B-CLL patients with unmutated immunoglobulin VH genes experience poor outcome in terms of reduced survival and reduced responsiveness to chemotherapy.
Two recent gene-expression profiling studies tested the 2-disease model of CLL by correlating gene expression in CLL cells with their immunoglobulin-mutational status.7,8 Unexpectedly, both studies using unsupervised hierarchical cluster analysis found a common gene-expression profile regardless of the immunoglobulin-mutational status of the CLL patients investigated. Nonetheless, more refined statistical analysis allowed for the detection of a subset of differentially expressed genes that could predict the immunoglobulin-mutational status of CLL cells with high accuracy.8,9 Collectively, these data suggest that CLL may be viewed as a single disease with 2 common variants that differ with regard to their immunoglobulin-mutational status and clinical course.9
Given the dramatic differences in the clinical behavior of CD38+ versus CD38− CLL patients, we compared their gene expression profiles using oligonucleotide-based DNA chip microarrays representative of approximately 5600 genes. In line with the aforementioned results, our data show that B-CLLs display a gene-expression profile largely independent of CD38 expression. Nonetheless, unsupervised cluster analysis10 revealed the existence of 2 major subgroups that differed significantly with regard to known clinical prognostic factors including hemoglobin levels, chemotherapy requirements, and progression-free survival.
Patients, materials, and methods
Patients and isolation of CLL cells
Between November 2000 and February 2002, 76 patients with chronic lymphocytic leukemia were enrolled in this retrospective study and were evaluated for several biologic and clinical characteristics: age, sex, Binet stage, white blood cell count, hemoglobin level, platelet count, lactate dehydrogenase level, thymidine kinase level, treatment history, and time from diagnosis to first treatment. The study was approved by the institutional review board of University Hospital Essen; informed consent was obtained from each patient according to the Declaration of Helsinki. In each patient, morphologic diagnosis of B-CLL was confirmed by flow cytometry11,12 that revealed a typical CD19+CD20+CD5+CD23+ immunoglobulin light chain (κ or λ light chain)-restricted immunophenotype. Unique patient number (UPN) 55 was excluded from the study because the diagnosis of B-CLL could not be confirmed. Patient characteristics are shown in Table 1. Heparinized whole peripheral blood (PB) samples were usually obtained during routine follow-up visits to our institutions with informed consent, according to institutional guidelines. Indications for treatment were based on standard criteria.13 Thirty-five (46%) of 76 patients had previously received chemotherapy. First-line therapy consisted of chlorambucil in 31 (89%) of 35 or fludarabine in 4 (11%) of 35 patients. Peripheral blood mononuclear cells (PBMNCs) were isolated by Ficoll-Hypaque (Pharmacia, Erlangen, Germany) density centrifugation and washed in Iscove modified Dulbecco medium (Gibco BRL, Karlsruhe, Germany). Proportions of CD19+CD5+ B-CLL cells, CD3+ T cells, and CD14+ monocytes were 89.0% ± 6.0%, 5.3% ± 4.2%, and 1.1% ± 1.0% (mean ± SD), respectively. There were no significant differences in the cellular composition of the PBMNCs between the 2 major subgroups identified by hierarchical cluster analysis (Figure 2; P > .05).
RNA from 1 to 2 × 10⁸ PBMNCs was extracted, purified using the RNeasy midi kit (Qiagen, Hilden, Germany), and quantified spectrophotometrically. PBMNC aliquots of each patient were viably frozen in fetal calf serum (FCS; Greiner, Limburg, Germany) containing 10% dimethyl sulfoxide (DMSO; Sigma, Deisenhofen, Germany) and were stored in liquid nitrogen.
Cell surface staining and flow cytometry
For the determination of CD38 expression, fresh heparinized PB samples were prepared for flow cytometry by ammonium chloride erythrocyte lysis (Ortho-mune Lysing Reagent; Ortho Diagnostic Systems, Raritan, NJ). Immunophenotype was characterized using a standard 3-color flow cytometry approach as previously described.12 Antibodies were purchased from DAKO (Glostrup, Denmark; CD19, CD10, IgM, κ and λ light chains), Immunotech (Marseilles, France; CD5), and Becton Dickinson (Heidelberg, Germany; CD38, CD4, CD8, CD3). Negative isotype–matched controls (Becton Dickinson) were used to define the threshold line separating surface marker positive and negative cells such that less than 1% of isotype-positive cells were present to the right of the line. A CLL population was considered CD38+ when more than 20% of the gated population (CD19+CD5+) expressed it.12 The same method of sample preparation and 3-color staining was used throughout the entire study period. Samples were analyzed on a FACScan flow cytometer (Becton Dickinson) using CellQuest software (Becton Dickinson).
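The gating rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CellQuest procedure used in the study: the function names, the representation of each event as a single fluorescence intensity, and the example values are all hypothetical.

```python
def isotype_threshold(isotype_events, max_positive_fraction=0.01):
    """Cut-off chosen so that fewer than max_positive_fraction of the
    isotype-control events lie above it (hypothetical helper)."""
    values = sorted(isotype_events)
    n = len(values)
    # k = largest number of control events allowed above the line
    k = 0
    while k + 1 < max_positive_fraction * n:
        k += 1
    return values[n - 1 - k]


def percent_positive(gated_events, threshold):
    """Percentage of gated (CD19+CD5+) events brighter than the threshold."""
    above = sum(1 for value in gated_events if value > threshold)
    return 100.0 * above / len(gated_events)


def is_cd38_positive(gated_cd38_events, isotype_events, cutoff_percent=20.0):
    """Apply the 'more than 20% of the gated population' rule."""
    threshold = isotype_threshold(isotype_events)
    return percent_positive(gated_cd38_events, threshold) > cutoff_percent
```

For example, with 200 isotype-control events of intensities 1 to 200, the threshold falls at 199, and a gated population in which 30% of events exceed that value would be classified as CD38+.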
Oligonucleotide microarray analysis
For first-strand cDNA synthesis, 9 μL (13.5 μg) total RNA was mixed with 1 μL of a mixture of 3 polyadenylated control RNAs and 1 μL of 100 μM T7-oligo-d(T)24 primer [5′-GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGG-(dT24)-3′], incubated at 70°C for 10 minutes, and put on ice. Next, 4 μL of 5× first-strand buffer, 2 μL 0.1 M dithiothreitol (DTT), and 1 μL 10 mM dNTPs were added, and the reaction was preincubated at 42°C for 2 minutes. Then, 2 μL (200 units) Superscript II (Life Technologies, Karlsruhe, Germany) was added, and incubation was continued at 42°C for 1 hour.
For second-strand synthesis, 30 μL 5× second-strand buffer, 91 μL RNase-free water, 3 μL 10 mM dNTPs, 4 μL (40 U) Escherichia coli DNA polymerase I (Life Technologies), 1 μL (12 U) E coli DNA ligase (TaKaRa, Gennevilliers, France), and 1 μL (2 U) RNase H (TaKaRa) were added, and the reaction was incubated at 16°C for 2 hours. Then 2.5 μL (10 U) T4 DNA polymerase I (TaKaRa) was added, and incubation was continued at 16°C for 5 minutes. The reaction was stopped by the addition of 10 μL 0.5 M EDTA (ethylenediaminetetraacetic acid), double-stranded (ds) cDNA was extracted with phenol/chloroform, and the aqueous phase was recovered by phase-lock gel separation (Eppendorf, Hamburg, Germany). After precipitation, the cDNA was resuspended in 12 μL RNase-free water.
Five microliters ds cDNA was used to synthesize biotinylated cRNA using the BioArray High Yield RNA Transcript Labeling Kit (Enzo Diagnostics, NY). Labeled cRNA was purified using the RNeasy mini kit (Qiagen, Hilden, Germany). Fragmentation of cRNA, hybridization to HuGeneFL microarrays (Affymetrix, Santa Clara, CA), and washing, staining, and scanning of the arrays in a GeneArray scanner (Agilent, Palo Alto, CA) were performed as recommended in the Affymetrix Gene Expression Analysis Technical Manual. Signal intensities (MAS5 signal) and detection calls for statistical analysis and hierarchical clustering were determined using the Microarray Suite (MAS 5.0) software (Affymetrix, Santa Clara, CA). Scaling across all probe sets of a given array to an average intensity of 1000 U was included to compensate for variations in the amount and quality of the cRNA samples and other experimental variables.
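The global scaling step can be illustrated with a short Python sketch. The text states only that each array was scaled to an average intensity of 1000 U. The use of a plain mean, the optional trim_fraction parameter, and the function name below are assumptions made for illustration and are not taken from the MAS 5.0 software.

```python
def scale_to_target(signals, target=1000.0, trim_fraction=0.0):
    """Rescale one array so that its (optionally trimmed) mean signal
    equals the target intensity. Returns (scaled_signals, factor)."""
    ordered = sorted(signals)
    # Assumption: optionally discard the same share of values at both ends
    drop = int(trim_fraction * len(ordered))
    kept = ordered[drop:len(ordered) - drop]
    factor = target / (sum(kept) / len(kept))
    return [s * factor for s in signals], factor
```

Because every probe set on an array is multiplied by the same factor, ratios between genes within an array are unchanged; only the overall brightness is made comparable between arrays.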
Data processing and hierarchical clustering
For the hierarchical clustering shown in Figure 2, only genes recognized as present by the Affymetrix algorithm in at least one third of the profiles were selected. Gene expression data were ln transformed, normalized to have a mean of 0 and an SD of 1, and subjected to the average linkage clustering method (Array Explorer; Spotfire, Somerville, MA) using correlation (centered) as a similarity measure.
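The processing steps named above (present-call filter, ln transformation, standardization, and average-linkage clustering on a centered correlation measure) can be sketched as follows. This is a minimal standard-library illustration, not the Array Explorer implementation. It assumes that standardization was applied per gene with the sample SD, it uses 1 minus the Pearson correlation as the distance, and it clusters samples only. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def select_present(calls):
    """Indices of genes called present ('P') in at least one third of profiles."""
    return [g for g, row in enumerate(calls)
            if 3 * sum(call == "P" for call in row) >= len(row)]


def standardize(signals):
    """ln-transform one gene's signals, then shift/scale to mean 0 and SD 1."""
    logs = [math.log(s) for s in signals]
    m, sd = mean(logs), stdev(logs)  # assumes the gene is not constant
    return [(v - m) / sd for v in logs]


def pearson(a, b):
    """Centered (Pearson) correlation of two equally long vectors."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering; cluster distance = mean pairwise 1 - r."""
    n = len(profiles)
    dist = {(i, j): 1 - pearson(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        return mean([dist[(min(i, j), max(i, j))] for i in c1 for j in c2])

    while len(clusters) > n_clusters:
        pairs = [(x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


def cluster_samples(signals, calls, n_clusters=2):
    """signals/calls: one row per gene, one column per patient sample."""
    rows = [standardize(signals[g]) for g in select_present(calls)]
    profiles = [list(column) for column in zip(*rows)]
    return average_linkage(profiles, n_clusters)
```

Stopping at two clusters mirrors the 2 major subgroups discussed in the text; a full dendrogram would instead record every merge.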
Statistical analysis
Progression-free survival times were measured from the time of diagnosis, plotted by the Kaplan-Meier method, and compared using the log-rank test. Clinical and laboratory parameters were compared between patient subgroups using the Wilcoxon rank sum test for metric data and the Fisher exact test or the χ² test for categorical data. The Cox proportional hazards model was used for multivariate analysis of progression-free survival.
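As an illustration of the survival methods named above, the following Python sketch computes a Kaplan-Meier estimate and a two-group log-rank test. It is a textbook formulation written for this example, not the statistical software used in the study (which the text does not name). The function names and the coding of events (1 = progression, 0 = censored) are assumptions.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one event."""
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        failures = sum(1 for x, e in zip(times, events) if x == t and e)
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square with 1 df, P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance  # needs at least one informative event time
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Censored patients contribute to the number at risk until their last follow-up but never count as events, which is what distinguishes this estimate from a simple proportion of patients who progressed.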
Results
CD38 expression in the B-CLL study group
We evaluated the surface expression of CD38 in our B-CLL study population using a 3-color flow cytometry approach with directly conjugated monoclonal antibodies.12,14 In accordance with current convention, a given leukemic population was considered positive for CD38 when 20% or more of the B-CLL cells expressed the membrane marker.12,14 Based on this cutoff value, 25 (36%) patients were defined as CD38+ and 45 (64%) patients as CD38−, respectively. Comparison of clinical and laboratory parameters among the 2 groups is shown in Table 1. Notably, significant differences were found for thymidine kinase, leukemic bone marrow infiltration, Binet stage, lactate dehydrogenase (LDH) serum levels (P < .05), and treatment-free survival (Figure 1), confirming our own previous work12 and that of others.5,14-16
Comparison of gene expression profiles among CD38+ and CD38− B-CLL patients
RNA extracted from the PBMNCs of 76 B-CLL patients was converted to labeled cRNA and hybridized to HuGeneFL Affymetrix oligonucleotide chips representative of approximately 5600 genes. The same methods of cell purification, RNA preparation, and chip hybridization were used throughout the entire study period. Two different batches of HuGeneFL Affymetrix oligonucleotide chips were used for 2 independent patient series comprising UPNs 1-38 and UPNs 39-77, respectively. Although no statistically significant differences were observed between these 2 patient series in terms of sample purity and clinical characteristics (Table 1), preliminary data analysis revealed substantially stronger overall hybridization of the labeled cRNA probes in the second series of experiments than in patient samples UPNs 1-38, possibly because of differences between the 2 chip batches used. For this reason, expression data analysis was performed separately for patients UPNs 1-38 and UPNs 39-77.
Resultant gene expression profiles were analyzed using 2 independent approaches—unsupervised hierarchical clustering,10 which can identify distinct subgroups that have not been classified beforehand, and the Wilcoxon rank sum test, which is more suitable for comparing subtle differences in gene expression values between 2 predefined groups (eg, CD38+ and CD38− B-CLL patients).
Hierarchical cluster analysis (Figure 2A-B) using an average-linkage algorithm10 revealed the existence of 2 major subgroups comprising 8 and 30 patients in the first patient series (UPNs 1-38) and 12 and 26 patients in the second set of experiments (UPNs 39-77). However, the incidence of CD38+ patients was comparable in the 2 groups (P > .05; Figure 2, Table 2), suggesting a common gene expression profile independent of CD38 expression status for most genes represented on the chip.
For comparative analysis of CD38+ and CD38− B-CLL patients, the following statistical criteria were applied to the array data. As for hierarchical clustering, only those genes called present by the Affymetrix algorithm in at least one third of the profiles were selected (2343 genes in UPNs 1-38 and 2769 genes in UPNs 39-77). Differentially regulated genes were identified by comparing the median signal intensities of the 2 groups and defining a cut-off fold change of ± 1.5. Resultant genes were then further analyzed using the Wilcoxon rank sum test at a significance level of .05, yielding approximately 75 differentially expressed genes in each series. Fourteen of these so-called CD38 distinction genes were found to fulfill the statistical selection criteria in both series and are shown in Table 3.
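The two-step selection described above (median fold change of at least 1.5 in either direction, followed by a Wilcoxon rank sum test at the .05 level, applied to each series separately and then intersected) can be sketched in Python. This is an illustrative reconstruction under assumptions: the rank sum P value uses the normal approximation with midranks and no tie correction, and the function names and dictionary-based data layout are hypothetical.

```python
import math
from statistics import median


def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank sum P value (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(w - mu) / sigma / math.sqrt(2))


def distinction_genes(cd38_pos, cd38_neg, min_fold=1.5, alpha=0.05):
    """cd38_pos / cd38_neg: dict mapping gene name -> list of signals."""
    selected = set()
    for gene in cd38_pos:
        fold = median(cd38_pos[gene]) / median(cd38_neg[gene])
        if fold >= min_fold or fold <= 1 / min_fold:
            if rank_sum_p(cd38_pos[gene], cd38_neg[gene]) < alpha:
                selected.add(gene)
    return selected


def shared_distinction_genes(series_1, series_2):
    """Each series is a (cd38_pos, cd38_neg) pair; keep genes found in both."""
    return distinction_genes(*series_1) & distinction_genes(*series_2)
```

Requiring a gene to pass in both series independently is a conservative choice: it trades sensitivity for protection against batch-specific artifacts, which matches the rationale for analyzing the 2 chip batches separately.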
Characterization of patient subgroups based on gene expression analysis
We next tried to further characterize the previously described gene-expression subgroups identified by hierarchical cluster analysis (Figure 2A-B). Clustering within the leftmost branch of the dendrograms (Figure 2A-B) was caused, in part, by the high expression of a large number of ribosomal and translation-associated genes (ribosomal cluster; see arrows in Figure 2 and Table 4) in both patient series.
For further statistical analysis, the clinical and laboratory data of the 2 series were pooled; that is, 20 patients in the ribosomal cluster were compared with the remaining 56 patients. Treatment histories of the 2 subgroups differed significantly in that the patients with a high expression of ribosomal proteins required less chemotherapy than the remaining patients (Table 5; P = .041). Furthermore, we found highly significant differences in disease progression, as indicated by the treatment-free interval (Figure 3). The mean treatment-free interval was longer in patients with a high expression of ribosomal proteins than in the other patients (P = .0172; Figure 3). Comparisons of further clinical and laboratory parameters between the 2 groups are shown in Table 2. Finally, we investigated whether the expression of ribosomal and translation-associated genes could further refine the clinical relevance of CD38 expression status (Figure 4). Figure 4A shows that CD38+ patients with high expression of ribosomal genes are characterized by significantly longer progression-free survival than are the remaining CD38+ patients. This observation suggests that a combination of these 2 parameters may increase the prognostic power of either of the 2 factors. However, the subgroup analysis shown in Figure 4A is based on 25 patients and thus must be confirmed in a larger patient cohort.
Univariate analysis of risk factors
Univariate Cox regression analysis was used to assess associations between progression-free survival time and potential risk factors (Table 6). Hemoglobin level, platelet count, CD38 expression, β2-microglobulin serum concentration, LDH serum activity, and expression of ribosomal genes (ribosomal cluster) were identified as significant factors influencing progression-free survival.
Multivariate analysis
The following patient characteristics, found to impact significantly on treatment-free survival in univariate analysis, were included in the multivariate Cox regression model: hemoglobin concentration, platelet count, CD38 expression, β2-microglobulin, LDH serum activity, and expression of ribosomal genes (Table 7). In multivariate analysis, only hemoglobin concentration influenced progression-free survival.
Discussion
It is now well established that CD38 expression is an important prognostic marker in B-CLL.5,12,14-17 In a recently published retrospective study,12 we showed that CD38+ CLL patients were characterized by an unfavorable clinical course with advanced disease stage, poor responsiveness to chemotherapy, short time to initiation of first treatment, and shorter survival. In contrast, the CD38− group required minimal or no treatment, remained treatment free for a longer time period, and had prolonged survival. To identify molecular differences that may be responsible for the differential clinical course of these 2 subgroups, we compared gene expression profiles of CD38+ patients with those of CD38− patients using oligonucleotide-based microarray technology.
Unsupervised clustering (Figure 2) revealed a common pattern of expression of approximately 5600 genes, independent of the expression of CD38. Detailed statistical analysis identified a set of 14 genes differentially expressed in CD38+ versus CD38− subtypes. These results are consistent with 2 recent studies comparing gene expression profiles of CLL patients with hypermutated immunoglobulin variable region (immunoglobulin VH) sequences and a favorable prognosis with that of immunoglobulin VH–unmutated patients with a more aggressive course of disease.7,8 Both studies were unable to distinguish the 2 subgroups using cluster analysis and detected only a restricted number of differentially expressed genes using more refined statistical tests. Interestingly, patients with unmutated immunoglobulin VH genes have been previously found to exhibit a higher percentage of CD38 expression than mutated B-CLL clones.5,16 In this regard, it is important to note that ZAP-70, found by Rosenwald et al8,9 to be among the most differentially expressed genes, was also up-regulated in our CD38+ patient cohort (Table 3), indicating an overlap in the molecular characteristics of the 2 subgroups. ZAP-70, a critical kinase involved in T-cell antigen receptor signaling, has only recently been recognized to be expressed in B cells.8,18 Intriguingly, recent work by Deaglio et al19 has demonstrated that ZAP-70 is tyrosine phosphorylated in response to antibody-mediated CD38 stimulation in natural killer (NK) cells, raising the possibility that it may play a role in CD38 signaling in B-CLL. Further support for the coordinate expression of ZAP-70 and CD38 in B-CLL comes from a recent study by Chen et al18 showing that all CD38+ CLL samples analyzed also expressed high levels of ZAP-70 protein; however, ZAP-70 expression was not restricted to CD38+ patients but was also detected in some CD38− patients with unmutated immunoglobulin VH genes.
Another gene of potential functional importance found to be overexpressed in the CD38+ group is the α4 subunit of the VLA4-integrin (CD49d; Table 3), which plays an important role in cell adhesion to the extracellular matrix molecule fibronectin. Importantly, adhesion to fibronectin through CD49d has recently been shown to protect B-CLL cells from apoptosis induced by serum deprivation20 and fludarabine treatment in vitro.21 These observations are consistent with a flow cytometry study showing that CLL cells isolated from early-stage patients (Rai 0-II) exhibit significantly lower CD49d expression levels than CLL patients with advanced disease (Rai III-IV).22
Despite the comparatively small differences in gene expression patterns of CD38+ versus CD38− CLL patients, hierarchical clustering of the samples and gene expression levels within the samples led to the separation of 2 major subgroups composed of 20 and 56 patients, respectively (Figure 2). Clustering to the smaller group was attributed, at least in part, to the coordinate high expression of a large number of ribosomal and translation-associated genes (Table 4).
More important, we found that the clinical outcome for patients in the 2 subgroups was strikingly different. Patients with a high expression of translation-associated genes were characterized by a more favorable clinical course with significantly longer progression-free survival and fewer chemotherapy requirements than the remaining patients (Figure 3). Furthermore, the 2 patient subgroups differed with regard to a panel of known clinical prognostic factors, including peripheral blood hemoglobin levels (Table 2).
To our knowledge, this is the first report showing that unsupervised cluster analysis can identify molecular B-CLL subtypes that differ with regard to the clinical course of the disease. In particular, the ribosomal cluster described here has not been noted in prior gene array studies comparing the gene expression profiles of immunoglobulin VH mutated with immunoglobulin VH unmutated CLL patients.7,8 This discrepancy may be explained at least in part by differences in the study designs, the most important of which is probably the comparatively small number of patients included in these studies (34 and 37 patients in the series reported by Klein et al7 and Rosenwald et al,8 respectively). In addition, both studies used immunomagnetically enriched B-CLL cells with a purity exceeding 95%, whereas in our series CLL cells were isolated using density centrifugation yielding only a mean purity of 89% CD19+CD5+ cells. Thus, we cannot exclude the possibility that accessory cells might have contributed to the observed gene expression differences, although the cellular composition of the tumor samples in the 2 major subgroups, defined by differential expression of ribosomal and translation-associated genes, was similar as determined by flow cytometry analysis (see “Patients, materials, and methods”).
Overexpression of translation-associated genes in tumors has been previously noted and may reflect a higher metabolic and proliferation rate in the malignant cell population.23 However, it is now well established that translation factors also play an important role in the regulation of cell death.24,25 In this context it is important to note that high expression of ribosomal proteins S3A, S29, and elongation factor-1α (Table 4) observed here has been shown to accelerate the apoptotic rate in certain cell types.24,25 By contrast, high expression levels of elongation factor-2α (Table 4) have previously been found to positively correlate with a high proliferation rate and biologic aggressiveness in non-Hodgkin lymphomas.26 These results, together with the findings of the recent literature,27,28 suggest that diverse cellular responses to alterations in translation-associated proteins strongly depend on the expression patterns of the individual factors involved.
Interestingly, high coordinate expression of a large group of ribosomal genes was reported for ovarian tumors that were histologically well differentiated compared with the more poorly differentiated tumors28 in the same series. Using hierarchical clustering, these authors found that rapidly growing tumor cell lines and the most poorly differentiated ovarian tumors grouped together and exhibited a relative underexpression of ribosomal genes, despite their presumed high metabolic rates. These observations, in combination with our data, raise the possibility that a high expression of ribosomal proteins may be correlated with a less aggressive clinical course in some tumor entities.
In conclusion, our results suggest that oligonucleotide microarray analysis can detect molecular B-CLL subtypes that differ with regard to the clinical course of the disease. A larger study is warranted to confirm our results and to further investigate the potential role of the translational apparatus in the pathogenesis of B-CLL.
We thank numerous colleagues for generously contributing information on the clinical course and treatment histories of the study patients. We also thank Ariane Kariger, Ute Schmücker, Nadine Pieda, and Adriane Parchatka for expert technical assistance.
Prepublished online as Blood First Edition Paper, November 27, 2002; DOI 10.1182/blood-2002-09-2683.
Supported by the Förderverein des Instituts für Zellbiologie in Essen and the Ministerium für Schule, Wissenschaft und Forschung des Landes Nordrhein-Westfalen.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.
Author notes
J. Dürig, Department of Hematology, University Hospital, Essen D-45122, Germany; e-mail: duerig@t-online.de.