Figure 2.
Gene expression in CG2 leukemia models correlates with pediatric disease. (A) (Left) Correlation of differential gene expression (log2 fold change [L2FC]) in CG2 AMKL models and patients from our institutional data set (Centre hospitalier universitaire Sainte-Justine [CHUSJ]) compared with a validation data set of pediatric CG2 AMKL (St Jude). Differentially expressed genes (DEGs), defined as |L2FC| > 1 and false discovery rate (FDR) q value < .05, common to both data sets are indicated in blue or red and define the CG2 signature (supplemental Table 7). Institutional (Inst) data set: CG2 AMKL models (n = 10) and patient-derived CG2 samples (n = 2) compared with N5A AMKL models (n = 5), patient-derived N5A samples (n = 2), and normal CB CD34+ cells (n = 4). Validation (Val) data set: CG2 AMKL (n = 12) vs other genetic subtypes of AMKL (n = 61) from pediatric patients at diagnosis.9 (Right) UpSet plots showing DEGs that are jointly overexpressed (n = 399) or underexpressed (n = 330) in CG2 leukemias, corresponding to blue and red dots in panel A. (B) Hierarchical clustering using the 729 DEGs of the CG2 gene expression signature. (C) Heat map showing protein (log2[mass spectrometry (MS) values], left panels) and messenger RNA (RNAseq; fragments per kilobase of transcript per million mapped reads [FPKM] values, right panels) expression of cell surface markers associated with CG2 leukemia, defined by high expression in mCG2-1 and mCG2-2 leukemia models (log2[MS values] ≥ 16 and RNAseq FPKM ≥ 5) and weak/no expression in normal CB CD34+ cells (log2[MS values] < 16 and RNAseq FPKM < 5). Samples were analyzed in triplicate and are represented as mean expression (biological triplicates for RNAseq and technical triplicates for proteomic data). For comparison, values for mN5A and pdxNTF are shown alongside mCG2 samples. The St Jude Val cohort RNAseq expression is presented as a separate column.
(D) Star plot presenting the adjusted P values of DEGs and differentially expressed proteins (FDR < .05) (transcriptome: CHUSJ CG2 vs N5A AMKL and normal CB CD34+; surfaceome: CG2 vs NUP98r and CB CD34+). The scores were calculated by multiplying the algebraic sign (+ or −) of the L2FC (surfaceome or transcriptome) by the corresponding log10(adjusted P value). Significantly upregulated CG2 AMKL–specific surface markers intersecting both data sets are labeled. (E) Validation by flow cytometry of PCDH19 surface expression on CG2 AMKL cells from a patient (pCG2-1), the mCG2-1 model, and the M07e cell line. PCDH19 is not expressed on lineage-depleted human CB cells (CB CD34+). Samples were costained with NCAM1 (CD56). (F) Scatterplots showing the best pairwise correlations of the 2 sets of cell surface marker genes that select for the CG2 genotype in a validation data set of pediatric AMKL (of 8 cell surface marker combinations; see supplemental Figure 13). Values, from top to bottom, represent the global and subtype-specific Kendall rank correlation coefficients. P values: ∗P < .05 and ∗∗P < .005.
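The DEG threshold in panel A and the signed score in panel D can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the `negate` parameter, and the cutoff arguments are hypothetical. The legend does not say whether the log10 term is negated, so the sketch assumes −log10(adjusted P value) by default, which gives upregulated markers a positive score; pass `negate=False` to follow the legend's wording literally.

```python
import math


def is_deg(l2fc, fdr_q, l2fc_cutoff=1.0, fdr_cutoff=0.05):
    """DEG criterion as stated in the legend: |L2FC| > 1 and FDR q value < .05."""
    return abs(l2fc) > l2fc_cutoff and fdr_q < fdr_cutoff


def signed_score(l2fc, adj_p, negate=True):
    """Sign of the L2FC multiplied by log10(adjusted P value).

    negate=True is an assumption (uses -log10) so that upregulated
    features score positive; the legend itself only says "log10".
    """
    sign = (l2fc > 0) - (l2fc < 0)  # +1, -1, or 0
    log_p = math.log10(adj_p)
    return sign * (-log_p if negate else log_p)


# Hypothetical values, for illustration only (not data from the study).
print(is_deg(1.8, 0.003))         # passes both cutoffs
print(is_deg(-0.6, 0.003))        # fails the |L2FC| cutoff
print(signed_score(1.8, 0.001))   # positive score for an upregulated feature
print(signed_score(-2.4, 0.001))  # negative score for a downregulated feature
```

Under these assumptions, a marker would appear in the labeled quadrant of the star plot when it scores positive on both the transcriptome and surfaceome axes.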