Figure 3.
Identification of DLBCL consensus clusters. (A) The left panel shows consensus matrices (clusters) produced by hierarchical clustering (HC), self-organizing maps (SOMs), and probabilistic clustering (PC). Clusters were generated using the top 5% of genes with the highest reproducibility across duplicate samples and the largest variation across tumors, together with an approach that selected the most stable number of clusters for each algorithm (consensus clustering).79 The right panel shows pairwise comparisons of the cluster assignments made by the different algorithms (PC versus HC, HC versus SOM, and PC versus SOM, respectively). Clusters are denoted as C1, C2, and C3. The 3 clustering algorithms demonstrated excellent agreement, with more than 84% of DLBCLs assigned to the same clusters by any 2 algorithms. (B) Expression profiles of the 3 DLBCL comprehensive clusters. The top 50 genes associated with each cluster are shown. Each column is a sample and each row is a gene. The color scale at bottom indicates relative expression in SDs from the mean. Red indicates high-level expression; blue, low-level expression. Adapted from Monti et al79 with permission.