Background:

Long noncoding RNAs (lncRNAs) are regulators of cell identity, and their aberrant expression has been associated with the development of cancer. Several studies have shown that lncRNAs are required for normal hematopoiesis and can also function as oncogenic drivers in acute leukemia. However, the association of lncRNA expression with acute myeloid leukemia (AML) subtypes and its impact on prognosis are not known.

Methods:

Long intergenic noncoding RNAs (lincRNAs) are transcribed from the intergenic part of the genome, and their transcripts are typically capped, polyadenylated and often spliced. Given these mRNA-like features, their expression levels can be detected using standard expression profiling assays. Tiling arrays have been used to profile gene expression levels of large patient cohorts and are currently the largest existing resource to study leukemia genomics. We repurposed intragenic probes of the Affymetrix HG-133 Plus 2 (HG-133P2) array to estimate the expression levels of 1664 known lincRNA genes (Figure 1a).

Results:

To estimate lincRNA levels in AML, we first analyzed a dataset of 159 samples from The Cancer Genome Atlas (TCGA) that were profiled using both the HG-133P2 chip and RNA-seq. In all but one case, the expression levels obtained from the two technologies were significantly correlated (Pearson correlation r>0.6; p<0.001; Figure 1b,c), suggesting that lincRNA levels can be accurately estimated from microarray data. Expression analysis of lincRNAs was then carried out in three datasets totaling 737 patient samples for which HG-133P2 data were available. Samples came from the Netherlands (NL, n=419), USA (TCGA, n=179) and Germany (GER, n=139).
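
The cross-platform comparison above amounts to computing, for each lincRNA, a Pearson correlation between its array-based and RNA-seq-based expression across the samples profiled with both technologies. A minimal sketch follows, assuming log-scale expression values; the function name and the sample values are hypothetical and serve only as illustration, not as the authors' pipeline.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical expression of one lincRNA in five samples (illustration only)
array_log2 = [5.1, 6.3, 7.0, 8.2, 9.1]
rnaseq_log2 = [1.0, 2.1, 2.9, 4.2, 5.0]

r = pearson_r(array_log2, rnaseq_log2)
concordant = r > 0.6  # threshold quoted in the text
```

In a real analysis this would be repeated for every lincRNA, with a significance test on each coefficient.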

To evaluate whether lincRNA expression was associated with AML subgroups, hierarchical clustering was performed on the NL and TCGA sets (Figure 2a). Patients in the t(8;21) and t(15;17)/FAB M3 subgroups and those with mutations in CEBPA, NPM1 and/or FLT3-ITD were associated with distinct lincRNA profiles in both cohorts. In addition, we found associations that were unique to one dataset, namely FAB M2 and M5 in the NL set and inv(16) in the TCGA set. TP53 mutation data were only available in the TCGA set, and patients with these mutations showed a distinct lincRNA expression profile. Taken together, these data suggest that specific lincRNA expression profiles, similar to gene expression profiles, are associated with known AML subgroups.

The 1664 lincRNAs were further analyzed using non-negative matrix factorization clustering in the NL and TCGA sets. The NL cohort optimally separated into four groups (Figure 2b) that were associated (p<0.01; Fisher exact test) with either good or poor prognostic subtypes. For example, cluster one was associated with patients with poor-risk cytogenetics and those having rearrangements of chromosome 11q23, while cluster three was associated with patients with FAB M3/t(15;17), FAB M4 and NPM1 mutations. Similarly, the TCGA cohort optimally separated into five groups (Figure 2c), including cluster three, which was associated (p<0.01) with patients with complex karyotype, FAB M0 and mutations in RUNX1 and TP53, and cluster five, which was associated with patients carrying the chromosomal translocations t(8;21) and t(15;17)/FAB M3 and mutations in CEBPA. These data suggest that lincRNA profiles segregate with subgroups of overlapping characteristics that are enriched for either good or poor prognosis.
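
The cluster-to-subtype associations above rest on Fisher's exact test applied to a 2x2 table (in cluster or not, versus carrying a feature or not). A minimal sketch of the two-sided test follows, assuming the common convention of summing all tables no more probable than the observed one; the counts are hypothetical and the abstract does not state which variant of the test was used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table at most as probable as the observed one
    # (small tolerance guards against floating-point ties)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts: rows = in cluster / not in cluster,
# columns = feature present / feature absent (illustration only)
p_value = fisher_exact_two_sided(12, 8, 5, 75)
enriched = p_value < 0.01  # threshold quoted in the text
```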

The NL, TCGA and GER cohorts were also analyzed for overall survival using the Cox regression model. In total, 78 (NL), 92 (TCGA) and 60 (GER) lincRNAs were significantly associated with overall survival (p<0.05). An integrative approach including a meta-analysis of the Cox regression p-values (Stouffer's method; p<0.01) revealed a survival signature of 17 lincRNAs (linc-sig) across the three sets. The prognostic power was maintained in the NL (p<0.001), TCGA (p<0.001) and GER (p=0.1) cohorts using Kaplan–Meier statistics (Figure 3). Importantly, the linc-sig remained an independent prognostic factor after adjusting for age, sex, WBC and CEBPA.
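
Stouffer's method, used above to pool the per-cohort Cox regression p-values, converts each p-value to a z-score, sums the z-scores, and rescales the sum. A minimal sketch follows, assuming one-sided p-values and equal weights unless weights are supplied; the abstract does not say whether the cohorts were weighted (for example by sample size), and the input p-values here are hypothetical.

```python
import math
from statistics import NormalDist

def stouffer_combine(p_values, weights=None):
    """Combine one-sided p-values (each strictly between 0 and 1)."""
    std_normal = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values) or not p_values:
        raise ValueError("need one weight per p-value and at least one p-value")
    # Small p-values map to large positive z-scores
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores))
    z_combined /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - std_normal.cdf(z_combined)

# Hypothetical p-values for one lincRNA in three cohorts (illustration only)
combined_p = stouffer_combine([0.04, 0.03, 0.2])

# Optional weighting by the square root of each cohort's size (an assumption)
weighted_p = stouffer_combine([0.04, 0.03, 0.2],
                              weights=[math.sqrt(n) for n in (419, 179, 139)])
```

Consistent evidence across cohorts yields a combined p-value smaller than any single modest one, which is how a lincRNA can enter a cross-cohort signature without being strongly significant in every cohort.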

Conclusions:

We investigated the expression of 1664 lincRNAs across three AML patient cohorts. The data presented show for the first time that distinct lincRNA expression profiles are associated with recognized cytogenetic and mutational subgroups that have good or poor prognostic characteristics, and that a signature of 17 lincRNAs predicts overall survival in AML.

Disclosures

No relevant conflicts of interest to declare.

