Abstract
The Multiple Myeloma Research Consortium (MMRC) has characterized over 300 patient samples using a variety of platforms as part of the Multiple Myeloma Genomics Initiative (MMGI). This large study includes a subset of 84 patients who were screened for somatic mutations using whole genome sequencing (WGS) or whole exome sequencing (WES) in combination with mRNA sequencing. This represents one of the first cohorts of myeloma patients with matched genome and transcriptome sequencing results. Given the historic value of microarray-based gene expression profiling (GEP), this cohort provides a unique opportunity to compare gene expression measurements from the two platforms, as Affymetrix U133Plus2.0-based GEP was performed on 42 of these samples.
As part of the MMGI study, the Broad Institute has completed genome sequencing, using WGS and WES, for 213 patients. A list of 9 frequently mutated genes (NRAS, KRAS, TP53, PNRC1, MAGED1, FAM46C, DIS3, CCND1 and ALOX12B) was identified initially. Given the potential for RNA-seq data to be used both to define gene expression levels and to identify mutations in expressed genes, we tested the feasibility of mutation calling on RNA-seq alone. We independently called mutations across the entire transcriptome of the 84 patients and used a filtering method to eliminate likely germline variants in the absence of a matched normal control. We then assessed point mutation concordance between the calls identified by RNA-seq alone and the variants previously reported through exome sequencing in the 9 frequently mutated genes. Out of the 66 SNVs identified by these criteria using WGS or WES, 55 (84%) were detected using RNA-seq. Of the remaining 11 loci, 7 (10%) were not detectably expressed, and in 4 (6%) cases the mutation was not detectable even though there was ample coverage. It is unclear whether these last 4 cases represent false positives in the genome calls or preferential expression of the wild-type allele.
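The three-way breakdown described above (detected, not detectably expressed, covered but not detected) can be illustrated with a minimal sketch. It assumes that, for each SNV locus called from WGS or WES, the RNA-seq read depth and the number of reads supporting the mutant allele are already available. The names `RnaPileup`, `classify_site` and `summarize`, and the thresholds `MIN_DEPTH` and `MIN_ALT_READS`, are hypothetical and are not taken from the study.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical thresholds; the abstract does not state the values used.
MIN_DEPTH = 10      # minimum RNA-seq reads to call a locus "expressed"
MIN_ALT_READS = 2   # minimum mutant-allele reads to call a mutation "detected"


@dataclass(frozen=True)
class RnaPileup:
    """RNA-seq evidence at one SNV locus previously called from DNA."""
    depth: int      # total RNA-seq reads covering the locus
    alt_reads: int  # reads supporting the mutant allele


def classify_site(pileup: RnaPileup) -> str:
    """Assign a DNA-called SNV to one of three RNA-seq outcome categories."""
    if pileup.alt_reads >= MIN_ALT_READS:
        return "detected"
    if pileup.depth < MIN_DEPTH:
        return "not_expressed"
    return "covered_not_detected"


def summarize(pileups) -> Counter:
    """Count how many loci fall into each category."""
    return Counter(classify_site(p) for p in pileups)


if __name__ == "__main__":
    # Made-up toy loci for illustration only, not data from the study.
    toy_loci = [RnaPileup(120, 40), RnaPileup(0, 0), RnaPileup(85, 0)]
    for category, count in summarize(toy_loci).items():
        print(category, count)
```

Checking mutant-allele support before depth means that a weakly expressed locus with clear mutant reads still counts as detected; a different ordering would be equally defensible depending on how conservative the comparison should be.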
To interrogate the utility of RNA-seq-based GEP in myeloma, we independently recapitulated many of the common GEP measurements. First, we used the 84 samples to define cutoffs for the implementation of the TC classification method. We then compared our assignments for the 42 samples with matched gene expression array data to their existing microarray assignments. This resulted in 40 of 42 (95%) samples being classified into identical TC classes. The two discordant samples, MMRC0312 and MMRC0387, were classified as TC class ‘none’ by expression arrays but were assigned to other classes by RNA-seq. MMRC0312 exhibited high CCND3 expression by RNA-seq and was assigned to the ‘6p21’ class. MMRC0387 exhibited elevated CCND1 expression and was classified as ‘D1’ by RNA-seq. For the expression-based indexes, we observed a strong correlation between platforms for the proliferation index (R²=0.971) and the NFKB index (R²=0.961) but only a moderate correlation for the 70-gene index (R²=0.761). The decreased correlation for the 70-gene index is clearly due to the large number of probesets in the index associated with genes that are not detectably expressed by RNA-seq.
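As an illustration of the platform comparison, the coefficient of determination between paired index values can be computed as the squared Pearson correlation. The sketch below assumes each sample has one index value from the microarray and one from RNA-seq; the function name `r_squared` is hypothetical, and the abstract does not specify how the reported values were calculated.

```python
def r_squared(array_values, rnaseq_values):
    """Squared Pearson correlation between paired per-sample index values."""
    n = len(array_values)
    if n != len(rnaseq_values) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_x = sum(array_values) / n
    mean_y = sum(rnaseq_values) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in array_values)
    syy = sum((y - mean_y) ** 2 for y in rnaseq_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(array_values, rnaseq_values))

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one platform is constant")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Made-up toy values for illustration only, not data from the study.
    array_index = [1.0, 2.0, 3.0, 4.0]
    rnaseq_index = [1.2, 1.9, 3.4, 3.8]
    print(round(r_squared(array_index, rnaseq_index), 3))
```

Because this measure is insensitive to linear rescaling, it tolerates the different dynamic ranges of the two platforms, but it is lowered when index genes have no detectable signal on one platform.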
One additional advantage of RNA-seq over microarray-based gene expression measurement is the potential to detect fusion transcripts. We applied fusion transcript detection to this cohort of patients and to 69 human myeloma cell lines, which were also screened by RNA-seq and WES as part of the MMGI study. The most common fusion transcript detected was the @IGH-MMSET fusion characteristic of t(4;14). The next most common fusion appears to be a promoter replacement event in which the highly expressed gene FCHSD2 is fused to multiple partners, including the known myeloma-related genes MMSET and MYC and two genes not previously reported in myeloma, CARNS1 and NCF2. Additional structural rearrangements involving FCHSD2 are also predicted, based on the high frequency of copy number abnormalities encompassing the 5′ region of this gene detected by comparative genomic hybridization in the MMGI study.
This study should provide the basis for migrating gene expression profiling in myeloma from microarrays to RNA sequencing-based approaches. In the future, RNA sequencing has the potential to provide novel classification schemes that leverage the multitude of measurements that can be made from this single assay.
Levy: MMRC: Employment.