Introduction
Transcriptional deregulation is a central event in the development of acute myeloid leukemia (AML), with most mutations occurring in genes related to transcription, chromatin regulation and DNA methylation. Furthermore, alterations involving cis-regulatory elements have been shown to play a critical role in aberrant gene expression in AML. Genetic variation in cis-regulatory regions usually involves a single allele, which results in differential expression of the two alleles. This phenomenon, termed allele-specific expression (ASE), is therefore an accurate marker for cis-regulatory variation (Pastinen, 2010). We propose that a systematic study of genes with aberrant ASE in AML may uncover genes whose aberrant expression is caused by abnormalities in cis-regulatory elements. We therefore aimed to 1) chart the landscape of ASE in AML, 2) establish a link between relevant ASE events and AML subtypes, and 3) investigate the mechanisms driving ASE.
Methods
We performed whole exome sequencing (WES) and RNA-seq on leukemic blasts from 168 de novo AML patients, representing all major subtypes of the disease. Combining both datasets, we assessed ASE in every gene with informative (non-homozygous) single nucleotide variants (SNVs).
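The abstract does not state which statistical test was used to call ASE. As a minimal sketch of one common approach (an assumption, not necessarily the authors' method), RNA-seq reads covering a heterozygous SNV identified by WES can be compared against the 50:50 ratio expected for balanced expression with a two-sided exact binomial test. The function names, the `min_depth` cutoff and the `alpha` threshold below are hypothetical choices made for illustration.

```python
from math import exp, lgamma, log


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n trials.
    """
    def pmf(i):
        # Log-space computation keeps this stable at high read depth.
        log_prob = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1 - p))
        return exp(log_prob)

    observed = pmf(k)
    # Small relative tolerance so symmetric outcomes count as ties.
    total = sum(q for q in map(pmf, range(n + 1)) if q <= observed * (1 + 1e-7))
    return min(1.0, total)


def call_ase(ref_reads, alt_reads, alpha=0.05, min_depth=10):
    """Test one heterozygous SNV for allelic imbalance in RNA-seq reads.

    Returns None when coverage is below min_depth (hypothetical cutoff),
    otherwise the alternate-allele fraction, the p-value and an ASE flag.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return None
    p_value = binom_two_sided_p(alt_reads, depth)
    return {
        "alt_fraction": alt_reads / depth,
        "p_value": p_value,
        "ase": p_value < alpha,
    }


if __name__ == "__main__":
    # Hypothetical read counts at a single heterozygous SNV.
    print(call_ase(ref_reads=3, alt_reads=47))
    print(call_ase(ref_reads=24, alt_reads=26))
```

In practice, per-SNV results would still need to be aggregated per gene and corrected for multiple testing and for reference-mapping bias; none of those steps are shown here.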
Results
Patients had a median of 37 genes with ASE, several of which were recurrently detected across multiple patients. To narrow the gene list, we focused on genes known to be involved either in cancer or in myeloid development. The gene most commonly found to show ASE (53/140 cases with SNVs) was GATA2, which encodes a transcription factor crucial for proliferation and maintenance of hematopoietic stem cells with a known involvement in AML.
Interestingly, integration with molecularly defined classification of AML revealed that all cases (n=17) with biallelic CEBPA mutations exhibited GATA2 ASE (p-value = 6.00 × 10⁻⁷, Fisher's test). Biallelic CEBPA mutations (CEBPA DM) identify an AML subtype with favorable clinical outcome and frequently co-occur with GATA2 mutations (Greif PA, 2012), pointing to a functional connection between these two genes. Indeed, 44% of the cases in our cohort exhibited a GATA2 mutation, and 27% carried a second, subclonal mutation in the same gene. Importantly, in cases where a GATA2 mutation was found, the mutant allele was always preferentially expressed. These findings were validated in the TCGA dataset, where all four CEBPA DM patients with informative SNVs in GATA2 exhibited GATA2 ASE.
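The association between CEBPA DM status and GATA2 ASE is a 2×2 contingency problem, which is what Fisher's test addresses. The sketch below implements a two-sided Fisher's exact test with the standard library only. The example table is an illustrative reading of the reported figures (17 of 17 CEBPA DM cases with ASE, 53 of 140 informative cases with ASE overall) and assumes that all 17 CEBPA DM cases are among the 140; the abstract does not give the exact table, so the result need not match the reported p-value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)  # tolerance for ties
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Illustrative table (assumed, see text):
    #                 GATA2 ASE   no GATA2 ASE
    # CEBPA DM            17            0
    # other AML           36           87
    p_value = fisher_exact_two_sided(17, 0, 36, 87)
    print(f"p = {p_value:.2e}")
```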
Although GATA2 ASE was present in other AML subtypes, none of these subtypes showed a significant association with this finding. Patients with a t(8;21) rearrangement (n=5), which represses CEBPA expression, did not exhibit GATA2 ASE, and we observed GATA2 ASE in only 4 out of 8 CEBPA-silenced leukemias (Wouters BJ, 2007). Altogether, these findings demonstrate that the one-to-one relationship between CEBPA DM and GATA2 ASE is unique to this subtype, and they exclude a causative role for inactive CEBPA protein in mediating mono-allelic expression of GATA2.
The average expression of GATA2 in CEBPA DM patients was comparable to that in other AMLs, even in cases with monoallelic GATA2 expression. This suggests that a) ASE was achieved by repression of one allele rather than by dramatically increased expression of the other, and b) the non-repressed allele compensated for the repressed one. DNA methylation analysis of the GATA2 promoter did not reveal methylation-mediated silencing of the repressed allele. The long-distance +77 kb GATA2 enhancer appears to be involved in ASE, as RNA read-through levels at the enhancer were significantly different in CEBPA DM AMLs in an allele-specific manner (p-value < 10⁻⁴, Wald test). The involvement of the enhancer was further confirmed by differences in H3K27ac levels between the two alleles.
Conclusions
An unbiased screen of 168 de novo AML cases revealed that all patients (n=17) with CEBPA biallelic mutations display GATA2 ASE. GATA2 mutations were found in 8 of the 17 cases, always in the allele that is preferentially expressed. Since GATA2 ASE is present in all CEBPA DM cases, whereas GATA2 mutations are present in only a fraction of them, we hypothesize that GATA2 ASE is acquired first and that mutations are selected only if they occur in the expressed allele. Moreover, given that other subgroups with CEBPA abnormalities do not show a similar pattern, we propose that ASE of GATA2 is not a consequence of CEBPA mutations, but rather a requirement for the development of AML in these patients.
No relevant conflicts of interest to declare.