Fig. 4.
Gene clusters separated by PCA.
PCA allows multidimensional data (here, the 4-dimensional expression pattern of each gene) to be presented in a simple 2-dimensional graph. First, we derived the 4 principal components, each of which is a linear combination of the original variables (the expression intensities of a gene in control neutrophils and in neutrophils stimulated with 1 of 3 bacteria: E coli K12 or the KIM5 or KIM6 strain of Y pestis). We then found that the first 2 principal components capture most of the variation in the data (95.2% in our case). The data can therefore be displayed, with a minor loss of information, in a 2-dimensional graph with these 2 principal components as the x- and y-axes. The axis titles “cn1” and “cn2” represent the first 2 principal components. Each cluster label matches the corresponding entry in the “Expression pattern” row of Tables 4 and 5. As can be seen, a large fraction of the total differences among the gene expression patterns can be visualized in this 2-dimensional graph.
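
The projection described above can be illustrated with a minimal sketch in Python using NumPy. This is not the analysis pipeline used for the figure: the data are synthetic, the function name `pca_project` is hypothetical, and the choice of an eigendecomposition of the covariance matrix (without rescaling each variable) is an assumption made only for illustration. The sketch shows how 4 variables per gene can be reduced to 2 principal component scores and how the fraction of variance captured by each component can be computed.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the leading principal components.

    X is an (n_genes, n_conditions) array. Returns the component scores
    and the fraction of total variance explained by every component.
    """
    # Center each column (condition) at zero mean.
    Xc = X - X.mean(axis=0)
    # Covariance matrix of the conditions (columns are the variables).
    cov = np.cov(Xc, rowvar=False)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so that the largest-variance component comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Each score is a linear combination of the centered original variables.
    scores = Xc @ eigvecs[:, :n_components]
    return scores, explained


# Synthetic stand-in for the expression data (an assumption, not real
# measurements): 200 genes x 4 conditions driven by 2 latent factors
# plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 4))

scores, explained = pca_project(X, n_components=2)
print("score matrix shape:", scores.shape)
print("variance captured by first 2 components:", explained[:2].sum())
```

Plotting the first column of `scores` against the second would give a graph of the same kind as the one shown here, with each point representing one gene.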