Figure 2.
Unsupervised hierarchical classification of samples and genes based on expression microarray data. (A) General flowchart describing the sequence of microarray analysis procedures throughout the study. (B) This bidimensional classification was performed using expression data from probe sets selected for expression variability and reliability across the samples (n = 358 probe sets), after exclusion of probe sets highly expressed in bone marrow cells; the resulting list of 255 probe sets is shown in Table S1. The top panel shows a hierarchical tree of samples (columns), sample subgroups, and sample annotations. Major subgroups corresponding to main hierarchy branches are indicated on the tree. MLL indicates cases with MLL rearrangements; TAL_R, TAL1-related cases, with 2 TAL_R subgroups (TAL_RA and TAL_RB); HOX_R, HOX-related cases; the HOXA-expressing cluster of cases is highlighted in red, and the main TLX1- and TLX3-expressing subgroup is indicated; IMMATURE, subgroup of cases characterized by strong expression of genes expressed in immature cells and frequent coexpression of myeloid genes; BM, normal human bone marrow cells. TL is the unique identification number for samples. Sample group annotations: for genomic annotations, S indicates cases with SIL-TAL1 transcripts; E, TLX1-expressing cases; L, TLX3-expressing cases (both also quoted as genomic annotations for simplicity); t, HOXA-rearranged cases; C, cases with CALM-AF10 transcripts; M, cases with MLL rearrangement, as detected by FISH and Southern blot; and N, cases with NUP214-ABL transcripts. *HOXA_R cases without identified rearrangement. For TAL1, LMO1, and LMO2 gene expression, RQ-PCR evaluations of expression are indicated; - indicates cases not annotated to avoid bias due to erythroid contamination (n = 9 cases, including the HOXA-rearranged case TL46), or not available (n = 2 cases).
TAL1 expression significantly increased relative to the normal thymus level was scored from 1 to 5 (moderate to highest levels), and significant LMO1 and LMO2 expression relative to normal thymus levels is indicated as P (positive); see Figure S1. Immuno indicates immunologic markers: i, immature; 3, cCD3 expression; g, γδ expression; a, αβ expression; and M, myeloid markers (CD13 or CD33 or both). For oncogenic groups, T indicates TAL_R cases; H, HOX_R cases; I, immature cases without TLX3 expression. For TAL_R subgroup annotations, A indicates TAL_RA; and B, TAL_RB. These labels were assigned based on sample annotations and microarray analyses (Figure S2). Group assignment by prediction models was questionable in 2 TAL_RB cases and 3 immature cases, and these cases are indicated by boxes (TL25, TL34, TL76, TL77, and TL82; Figure S2D). The “Immature group” label is provisional because oncogenic events are unidentified in this group. In the middle panel, left, hierarchical tree of genes (rows); center, for all samples (columns), relative expression levels are indicated according to the color scale shown at the bottom of the figure (from deep blue, lower expression, to deep red, higher expression); right, rows corresponding to the HOXA genes are indicated. The bottom panel shows a magnification of the cluster of genes defining the HOXA-expressing cluster. These cases, and the 5 additional samples with HOXA expression, are indicated by red boxes (HOXA-expressing samples).