Abstract
Abstract 390
A conspicuous lesson that has emerged from the 1000 Genomes Project is that genetic variation in the population is greater than previously appreciated. Transcriptomics is rapidly assuming a prominent role in the understanding of basic molecular mechanisms accounting for variation within the normal population and disease states. Besides protein-coding RNAs, the importance of non-coding RNAs (ncRNAs), primarily as regulators of gene expression, is well recognized but largely unexplored. The platelet transcriptome reflects megakaryocyte RNA content at the time of proplatelet release, subsequent splicing events, selective packaging and platelet RNA stability. An accurate understanding of the platelet transcriptome has both biological (improved understanding of platelet protein translation and the mechanisms of megakaryocyte/platelet gene expression) and clinical (novel biomarkers of disease) relevance.
We carried out transcriptome sequencing of total RNA isolated from leukocyte-depleted platelet preparations from four healthy adults using an AB/LT SOLiD™ system. For each individual, we constructed 3 libraries: a) long (≥ 40 nucleotides) total RNA, b) long RNA depleted of rRNA, and c) short (< 40 nucleotides) RNA. ∼1 billion reads from the 12 datasets were mapped to each chromosome and strand of the human genome. About one-third mapped uniquely, similar to other unbiased methods like SAGE. Normalizing for transcript length and scaling to the β-actin expression level allowed expression to be scaled appropriately within a read-set and compared across read-sets. Of the known protein-coding loci, ∼9,500 were present in human platelets. Plotting the number of protein-coding genes as a function of the level of normalized expression underscored different gene estimates between total and rRNA-depleted RNA preparations, and substantial inter-individual variation in the less abundant genes. RT-PCR validated the RNA-seq estimates of transcript levels, which spanned a range of >3 orders of magnitude of normalized read counts (r=0.7757; p=0.0001). A strong correlation was measured between mRNAs identified by RNA-seq and 3 published microarray datasets for well-expressed mRNAs, although RNA-seq identified many more transcripts of lower abundance. Unexpectedly, ribosomal RNA depletion significantly and adversely affected estimates of the relative abundance of transcripts, including members of the RNA interference pathway (DGCR8, DROSHA, XPO5, DICER1, EIF2C1-4), which exhibited large differences (up to 32-fold) between the total and rRNA-depleted preparations. A rigorous and highly stringent approach identified bona fide intronic regions that gave rise to 6,992 and 1,236 currently uncharacterized long and short RNA transcripts, respectively.
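The normalization described above, correcting for transcript length and scaling to β-actin, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the version below is an assumption: read density (reads per nucleotide) for each transcript is divided by the read density of β-actin within the same read-set. The gene names other than ACTB, the read counts, and the transcript lengths are hypothetical values chosen only for illustration.

```python
def actin_scaled_expression(read_counts, transcript_lengths, reference="ACTB"):
    """Scale length-normalized read counts to a reference transcript.

    Assumed formula (not taken from the abstract):
        (reads_g / length_g) / (reads_ref / length_ref)
    """
    ref_density = read_counts[reference] / transcript_lengths[reference]
    if ref_density == 0:
        raise ValueError("reference transcript has no reads in this read-set")
    return {
        gene: (read_counts[gene] / transcript_lengths[gene]) / ref_density
        for gene in read_counts
    }


# Hypothetical transcript lengths in nucleotides.
lengths = {"ACTB": 1000, "GENE_A": 500, "GENE_B": 3000}

# Two hypothetical read-sets; the second is sequenced 10x deeper.
library_1 = {"ACTB": 2000, "GENE_A": 500, "GENE_B": 30}
library_2 = {gene: 10 * n for gene, n in library_1.items()}

expr_1 = actin_scaled_expression(library_1, lengths)
expr_2 = actin_scaled_expression(library_2, lengths)

for gene in lengths:
    print(gene, expr_1[gene], expr_2[gene])
```

Under this assumed formula the reference transcript is 1.0 by construction, and a uniform change in sequencing depth cancels out, which is what makes values comparable across read-sets.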
We discovered numerous previously unreported antisense transcripts: 1) to known protein-coding regions of the genome, 2) to 10 miRNA precursors where each locus generated 1–2 distinct antisense transcripts, presumably mature and “star” miRNAs, and 3) long and short RNAs antisense to several known repeat families. We did not observe enrichment of long-intergenic ncRNAs. We considered various possible explanations for the ∼60% of sequence reads that could not be mapped to the genome. Much more lenient parameter settings accounted for only ∼6.5% of sequenced reads. An even smaller fraction of reads was accounted for when considering all possible combinations of exon-exon junctions in the genome (12,382,819 junctions) and the highly polymorphic HLA region of chr 6, indicating these did not contribute in any substantive manner to the platelet transcriptome. Lastly, RNA-seq was highly reproducible (>97 for 1 subject studied on 4 occasions). In summary, our work reveals a richness and diversity of platelet RNA molecules, suggesting a context where platelet biology transcends protein- and mRNA-centric descriptions. We will provide a publicly available web tool of these data embedded in a local mirror of the UCSC genome browser, facilitating the elucidation of previously unappreciated molecular species and molecular interactions. This will eventually permit an improved understanding of the molecular mechanisms that regulate platelet physiology and that contribute to disorders of thrombosis, hemostasis and inflammation.
No relevant conflicts of interest to declare.