Abstract
Single nucleotide polymorphisms (SNPs) identified through genome-wide association studies (GWAS) could provide insight into the mechanisms of human genetic diseases. Here we studied SNPs associated with six critical red blood cell traits: hemoglobin concentration (Hb), hematocrit (Hct), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC) and red blood cell count (RBC). During erythroid differentiation of human CD34+ cells, we mapped enhancers and open chromatin regions by H3K27Ac ChIPseq and ATACseq, and studied the SNPs that reside within these DNA regulatory elements. We followed genomic binding of the lineage-restricted GATA transcription factors and the BMP signal-responsive transcription factor SMAD1 in CD34+ cells during erythropoiesis. By overlapping their genomic occupancy with stage-matched RNAseq, we found that SMAD1, in association with GATA factors, marks the genes responsible for differentiation at every stage. ATACseq and H3K27Ac patterns demonstrated that GATA+SMAD1 co-occupied regions correlate with open chromatin and super enhancers at every stage, whereas GATA-only regions are associated with genes expressed at low or basal levels during differentiation. ChIPseq for other crucial signaling transcription factors, such as the cAMP-responsive factor CREB and the TGFβ-responsive factor SMAD2, demonstrated a remarkable co-occurrence of these factors at GATA+SMAD1 co-bound regions near stage-specific genes. We defined such regions as "signaling centers", where multiple signaling transcription factors converge with master transcription factors to determine optimum stage-specific gene expression in response to growth factors.
Surprisingly, we observed that while only 15% of RBC-SNPs target blood master transcription factor motifs, at least 70% of them reside in motifs of transcription factors associated with various signaling pathways, including SMADs (BMP/TGFβ signaling), RXR/ROR (nuclear receptor signaling), FOXO/FOXA (FOX signaling), CREBs (cAMP signaling) and TCF7L2 (WNT signaling). Our bioinformatic analyses demonstrated that, in contrast to GATA-only sites, SMAD1+GATA co-bound signaling centers harbor cis-acting motifs for, and display enriched binding of, cell-type specific transcription factors (e.g. PU1 and FLI1 in progenitors vs. KLF1 and NFE2 in differentiated cells). Such distinct identities of signaling centers could serve as codes that distinguish progenitor-specific genes from erythroid-specific genes and govern their stage-specific expression. We performed CRISPR-Cas9 mediated perturbations of the PU1, GATA and SMAD1 motifs separately in a representative progenitor signaling center in K562 cells. Similar to loss of the PU1 and GATA motifs, loss of the SMAD1 motif selectively inhibited expression of the associated gene. This suggests that the signaling factor SMAD1 is important within signaling centers for optimum gene expression. Moreover, the progenitor factor PU1 directs binding of SMAD1 to progenitor-specific signaling centers: when PU1 was overexpressed in K562 cells, SMAD1 occupancy increased concomitantly in the selective genomic regions where PU1 binding increased. More than 80% of the RBC-trait SNPs reside within SMAD1-bound signaling centers. Such SNPs either destroy or create signaling factor binding sites, e.g. SMAD motifs. We validated one such SNP, associated with the MCV trait, near HIST1H4A, a gene whose expression increases during differentiation. Using a gel-shift assay, we found that SMAD1 binding is compromised when the major allele T of this MCV-associated SNP changes to the minor allele A.
Remarkably, eQTL analysis using microarray gene expression profiles of peripheral blood obtained from the Framingham Heart Study revealed that expression of HIST1H4A is significantly higher in individuals carrying the T allele than in those carrying the A allele. This demonstrates that inhibition of SMAD1 binding by the SNP causes an allele-specific loss of HIST1H4A expression. Taken together, our study provides the first evidence that naturally occurring GWAS variations directly impact gene expression from signaling centers by modulating binding of signaling transcription factors. Such aberrant signaling events over time could lead to "signalopathies", ultimately resulting in phenotypic variations of RBC traits.
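The overlap analysis described above (defining GATA+SMAD1 co-bound regions and asking what fraction of trait-associated SNPs fall inside them) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function names, the half-open coordinate convention, and all coordinates below are hypothetical toy values chosen only to show the logic.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def intersect_intervals(a, b):
    """Intersect two sorted, merged interval lists from the same chromosome."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def cobound_regions(peaks_a, peaks_b):
    """Per-chromosome intersection of two peak sets (dict: chrom -> intervals)."""
    result = {}
    for chrom in peaks_a.keys() & peaks_b.keys():
        shared = intersect_intervals(
            merge_intervals(peaks_a[chrom]), merge_intervals(peaks_b[chrom])
        )
        if shared:
            result[chrom] = shared
    return result


def fraction_in_regions(snps, regions):
    """Fraction of (chrom, pos) SNPs lying inside any region on their chromosome."""
    if not snps:
        return 0.0
    hits = 0
    for chrom, pos in snps:
        intervals = regions.get(chrom, [])
        starts = [s for s, _ in intervals]
        k = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if k >= 0 and intervals[k][0] <= pos < intervals[k][1]:
            hits += 1
    return hits / len(snps)


# Hypothetical toy peaks and SNP positions (not real data).
gata_peaks = {"chr1": [(100, 200), (500, 600)]}
smad1_peaks = {"chr1": [(150, 250), (900, 1000)]}
snps = [("chr1", 160), ("chr1", 550), ("chr1", 199), ("chr2", 10)]

cobound = cobound_regions(gata_peaks, smad1_peaks)
fraction = fraction_in_regions(snps, cobound)
```

In this toy example only the chr1 interval 150-200 is bound by both factors, and two of the four SNPs fall inside it. A real analysis would typically use called ChIPseq peaks and a dedicated interval library rather than hand-written intersection code.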
Zon: Fate, Inc.: Consultancy, Equity Ownership; Marauder, Inc.: Consultancy, Equity Ownership; Scholar Rock, Inc: Consultancy, Equity Ownership; Stemgent: Consultancy.
Author notes
Asterisk with author names denotes non-ASH members.