Abstract
Whole-genome and whole-exome sequencing (WGS, WES) have enabled the identification of mutational signatures in Multiple Myeloma (MM) and other cancer types. In studies that assess the impact of coding mutations on protein structure and function, only reads mapping to the exome are pertinent. WES is therefore typically preferred over WGS, as it provides deeper coverage for the same total number of reads. However, exome enrichment, a necessary step in WES, limits the ability to call mutations, because coverage is restricted to the capture regions and is affected by their GC content. Furthermore, without transcriptional information, it is not possible to determine which coding mutations found by WGS or WES are expressed and, therefore, more likely to be relevant. As an alternative, RNA-seq directly targets the transcriptome: it provides deep coverage, requires no enrichment step, and intrinsically omits non-expressed mutations. Moreover, when RNA-seq data are already available for evaluating gene expression profiles, the same data can be leveraged to explore expressed mutational profiles. However, limitations in pipelines for analyzing RNA-seq data have so far restricted its applicability.
Using paired WES and RNA-seq data from MM patient samples, we have observed that the majority of recurrent mutations in MM occur within genes with very low or no detectable expression (only 27% of mutated genes are expressed). Here, we have further analyzed a large RNA-seq sample set to describe a comprehensive transcriptional mutational landscape in MM and to identify potential mutational driver genes. Specifically, we performed RNA-seq on CD138+ MM cells from 292 newly diagnosed patients and on 16 normal bone marrow plasma cell (NBM) samples. The unstranded 50 bp paired-end reads were mapped to the human genome using MapSplice, followed by a GATK-based workflow for variant analysis. The output was filtered to remove germline variants and technical artifacts, then evaluated computationally for functional impact, and finally filtered further at the gene level. Using this workflow, we were able to identify most reported recurrently mutated genes in MM, including but not limited to TP53 (14%), NRAS (14%), KRAS (11%), ACTG1 (4%), CCND1 (4%), TRAF3 (3%), FAM46C (3%), CYLD (3%) and DIS3 (2%). Importantly, we were also able to identify novel putative mutational driver genes of lower frequency, including several genes involved in the NF-κB pathway (BCR, TAOK2, NFKBIA, PIM1) and genes coding for proteins that form the mTORC2 complex (SIN1, RICTOR, MTOR). We observe that the average mutational frequency, which is a convolution of clonality and relative allelic expression, is slightly below 0.5. Yet the mutational frequencies for any given gene vary widely across samples. For instance, FAM46C shows a pattern representative of highly subclonal mutations, whereas CCND1 presents mostly bi-allelic and clonal mutations, and others, such as TRAF3, show a wide spectrum of mutational frequencies. Further methodological development will be needed to deconvolve these frequencies.
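The two summary quantities used above, the per-gene fraction of mutated samples and the per-variant mutational frequency, can be illustrated with a minimal Python sketch. It is not the workflow used in this study: the record layout, the function names, the placeholder gene labels and the toy read counts are all hypothetical. The final function encodes one simple assumed model in which the observed frequency is the product of the clonal fraction and the mutant allele's share of transcripts; the abstract itself states only that the frequency is a convolution of the two.

```python
from collections import defaultdict

# Hypothetical record layout: (sample_id, gene, ref_reads, alt_reads).
# Toy values for illustration only; these are not data from the study.
calls = [
    ("S1", "GENE_A", 60, 40),
    ("S1", "GENE_B", 90, 10),
    ("S2", "GENE_A", 50, 50),
    ("S3", "GENE_B", 5, 95),
]
n_samples = 4  # all samples analyzed, including those without any call


def allele_fraction(ref_reads, alt_reads):
    """Fraction of reads covering a site that carry the variant allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def gene_recurrence(calls, n_samples):
    """Fraction of samples with at least one retained variant per gene."""
    samples_by_gene = defaultdict(set)
    for sample, gene, _ref, _alt in calls:
        samples_by_gene[gene].add(sample)
    return {g: len(s) / n_samples for g, s in samples_by_gene.items()}


def expected_fraction(clonal_fraction, mutant_allele_share):
    """Assumed toy model: observed fraction = clonality x allelic share.

    Presumes equal expression of the gene in mutated and unmutated cells
    and no contamination by non-tumor cells.
    """
    return clonal_fraction * mutant_allele_share


recurrence = gene_recurrence(calls, n_samples)
fractions = [allele_fraction(ref, alt) for _s, _g, ref, alt in calls]
```

Under this assumed model, a fully clonal mutation with balanced allelic expression yields `expected_fraction(1.0, 0.5)`, that is 0.5, while a subclonal mutation or reduced expression of the mutant allele yields a lower value. An observed frequency alone therefore cannot separate the two factors.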
We also applied the workflow to 10 of the samples for which we reported mutations at the DNA level, and observed CCND1, TP53 and KRAS to be recurrently mutated with either WES or RNA-seq. Nevertheless, some mutations are not shared, including three WES-exclusive BRAF mutations and one CCND1 mutation seen through RNA-seq only.
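A comparison of this kind reduces to set operations on variant identifiers, as in the following minimal Python sketch. The key layout (sample, gene, position, reference allele, alternate allele), the function name and the toy entries are hypothetical and do not reproduce the calls reported here.

```python
def compare_call_sets(wes_calls, rna_calls):
    """Split two collections of variant keys into shared and exclusive sets."""
    wes, rna = set(wes_calls), set(rna_calls)
    return {
        "shared": wes & rna,    # called by both assays
        "wes_only": wes - rna,  # seen at the DNA level only
        "rna_only": rna - wes,  # seen at the RNA level only
    }


# Hypothetical keys: (sample_id, gene, position, ref, alt); toy values only.
wes_calls = [("S1", "GENE_A", 100, "C", "T"), ("S2", "GENE_B", 200, "G", "A")]
rna_calls = [("S1", "GENE_A", 100, "C", "T"), ("S3", "GENE_C", 300, "A", "G")]

result = compare_call_sets(wes_calls, rna_calls)
```

Matching on the full key treats two calls as shared only when the sample, position and alleles all agree. Matching at the gene level instead would require dropping the position and allele fields from the key.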
In conclusion, we report what is, to our knowledge, the first computational analysis to identify mutational driver genes using RNA-seq data, providing additional insight into the mutational landscape of MM. Our findings demonstrate that RNA-seq of unpaired tumor samples can suffice to capture the most salient features of cancer mutational landscapes.
Campbell: 14M Genomics: Other: Co-founder and consultant. Munshi: Celgene: Membership on an entity's Board of Directors or advisory committees; Onyx: Membership on an entity's Board of Directors or advisory committees; Millennium: Membership on an entity's Board of Directors or advisory committees; Novartis: Membership on an entity's Board of Directors or advisory committees.
Author notes
Asterisk with author names denotes non-ASH members.