Abstract
Introduction: Post-transcriptional modification of RNA, known as RNA editing, has been shown to occur in many species, including humans. A recent study using genomic data from adult solid tumors generated by The Cancer Genome Atlas (TCGA) project investigated the potential effects of RNA editing on cancer cell viability, invasion potential, cancer pathogenesis and drug sensitivity (Han, et al., Cancer Cell 2015). Historically, there have been mixed reports regarding the prevalence of RNA editing in human cells, partly due to substantial difficulties in distinguishing RNA editing events from mapping artifacts in next-generation sequencing (NGS) data. In this study, we developed a suite of computational analysis tools to enable precise mapping of RNA-Seq reads in order to carry out the first systematic investigation of RNA editing events affecting coding regions in pediatric leukemia.
Methods: We developed a knowledge-guided accurate RNA-Seq mapping pipeline named StrongArm to maximize mapping accuracy and efficiency. StrongArm performs multiple mappings with different aligners and databases, and uses a set of competition heuristics to choose an optimal mapping, thereby reducing the mapping error rate and bias introduced by any single aligner, especially for error-prone splice junction sites and paralogs. The analysis was performed on 17 leukemia samples including 10 acute myeloid leukemia subtype M7 (AMLM7) and 7 Ph-like acute lymphoblastic leukemia (Ph-like ALL), which were profiled using RNA-Seq of tumor samples and whole-genome sequencing or whole-exome sequencing of paired tumor and normal DNA samples. The single nucleotide variants (SNVs) detected in RNA-Seq, but absent in DNA samples, were considered putative editing events and were further processed to remove additional false positives that could not be corrected by the mapping pipeline alone. These false positive variants, arising from paralog mapping artifacts, genetic polymorphisms, nano exons, and sequencing errors at homopolymer loci introduced by reverse transcription, account for 96% to 99% of putative DNA-RNA coding variants in the leukemia samples.
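The DNA-RNA comparison step described above can be sketched as a set difference. This is a minimal illustration, not the StrongArm implementation: the `Variant` type, the function name `putative_editing_events`, and the example coordinates are all hypothetical, and the downstream false-positive filters (paralogs, polymorphisms, nano exons, homopolymer loci) are not modeled.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """A single nucleotide variant (hypothetical representation)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def putative_editing_events(
    rna_snvs: Iterable[Variant],
    tumor_dna_snvs: Iterable[Variant],
    normal_dna_snvs: Iterable[Variant],
) -> Set[Variant]:
    """Return SNVs seen in RNA-Seq but absent from both DNA samples.

    Assumption: a variant counts as "absent in DNA" when the exact
    (chrom, pos, ref, alt) tuple is missing from the tumor and the
    normal DNA call sets.
    """
    dna_snvs = set(tumor_dna_snvs) | set(normal_dna_snvs)
    return set(rna_snvs) - dna_snvs


# Hypothetical example: one RNA-only variant and one variant shared with DNA.
rna = {Variant("chr1", 100, "A", "G"), Variant("chr2", 200, "C", "T")}
tumor_dna = {Variant("chr2", 200, "C", "T")}
normal_dna: Set[Variant] = set()

candidates = putative_editing_events(rna, tumor_dna, normal_dna)
print(sorted(candidates))
```

Per the abstract, 96% to 99% of such candidates are still false positives, so this set is only the input to further filtering.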
Results: Using 17 leukemia samples, we identified a total of 103 RNA editing events in coding regions affecting 43 unique loci, 92% of which were canonical A-to-G or C-to-T editing; 62 (61%) and 66 (64%) of the 103 editing events match those in the RNA editing databases DARNED and RADAR, respectively. Seventy-eight (76%) of the 103 editing events resulted in missense variants, suggesting that they may affect protein function. The four most prevalent RNA editing events were each present in more than 30% of our leukemia samples: COG3 I635V (n=12), BLCAP Q5R (n=10), CDK13 Q103R (n=9) and AZIN1 S367G (n=6). Previous studies have shown that AZIN1 S367G and COG3 I635V impact cell proliferation, and that BLCAP Q5R is correlated with survival rate in renal clear cell carcinoma (Han, et al., Cancer Cell 2015), while the impact of CDK13 Q103R in leukemia is unknown. Interestingly, three of the four candidate "master" driver editing sites identified in TCGA solid tumors (AZIN1 S367G, COPA I164V, and COG3 I635V) were also present in our data set, while GRIA2 R764G was absent, as GRIA2 is not expressed in leukemia.
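The recurrence of the four most prevalent events can be expressed as a fraction of the 17 samples. The short sketch below uses only the counts reported above; the variable and function names are illustrative.

```python
TOTAL_SAMPLES = 17  # 10 AMLM7 + 7 Ph-like ALL

# Number of samples carrying each editing event, as reported in the Results.
event_counts = {
    "COG3 I635V": 12,
    "BLCAP Q5R": 10,
    "CDK13 Q103R": 9,
    "AZIN1 S367G": 6,
}


def recurrence(counts: dict, total: int) -> dict:
    """Return the fraction of samples carrying each event."""
    return {event: n / total for event, n in counts.items()}


fractions = recurrence(event_counts, TOTAL_SAMPLES)
for event, frac in fractions.items():
    print(f"{event}: {frac:.0%} of samples")
```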
Conclusions and Discussion: Leveraging an accurate mapping pipeline for RNA-Seq data, we found that pediatric leukemia samples have few RNA editing events in coding exons (3 to 14 per sample), a number comparable to that of the informative coding RNA editing events identified in adult solid tumors from the TCGA. Notably, 3 of the 4 most common RNA editing sites in our leukemia samples have been reported to have functional effects on cell survival or proliferation, or to be correlated with patient survival rate in adult solid tumors, indicating that RNA editing in coding regions may have a functional impact on leukemia tumorigenesis.
Mullighan: Incyte: Membership on an entity's Board of Directors or advisory committees; Amgen: Speakers Bureau; Loxo Oncology: Research Funding.
Author notes
Asterisk with author names denotes non-ASH members.