Abstract
Adults release ~10¹¹ blood cells into circulation every day. These cells derive, by clonal expansion, from a minuscule population of hematopoietic stem cells (HSCs), which differentiate into mature cells through progressively fate-restricted progenitor cells. In order to identify the genetic loci controlling these processes, we performed a hypothesis-free genome-wide association study (GWAS) of complete blood count data measured on 170,000 subjects in the UK Biobank and INTERVAL cohorts. This identified almost 3,000 causally distinct variants (hits) modulating hematopoiesis or blood cell clearance, including several hundred low frequency variants (minor allele frequencies <5%) with very large effect sizes.
Over three quarters of the GWAS hits have no coding consequence, and analysis of the 700 platelet hits showed a striking enrichment in the DNase I hypersensitivity sites of HSC-derived megakaryocytes (MKs) compared to those of HSC-derived erythroblasts (EBs), suggesting that a substantial proportion of blood trait heritability can be explained by variants modulating cell-type-specific genetic regulators.
To understand the nature of gene regulation in the erythroid-megakaryocytic lineage, we first used long-range chromosomal interaction data (from genome-wide promoter-capture chromosome conformation assays) and found that the platelet trait hits are enriched in regulatory elements that interact with a gene promoter in MKs. We then scanned the genome for super-enhancers (SEs), regions characterized by a high density of enhancer elements and a strong H3K27ac histone modification. We identified 1067 SEs in MKs and 1294 SEs in EBs, of which 400 overlapped. Platelet GWAS hits were strongly enriched in MK SEs over MK typical enhancers (TEs) relative to red cell hits and were frequently localised in SEs specific to MKs. Finally, we found that SE (but not TE) MK-interacting genes are significantly enriched for gene-ontology (GO) terms for megakaryopoiesis, platelet function and hemostasis, supporting the notion that SEs play a major role in the determination of cell identity.
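As a rough illustration of the SE counts reported above, the following Python sketch derives the lineage-specific SE counts and a Jaccard index from the three stated numbers. It assumes the overlap of 400 is a one-to-one count of shared regions, which the abstract does not specify; the function name `se_overlap_summary` is hypothetical and not part of the study's analysis.

```python
def se_overlap_summary(n_mk: int, n_eb: int, n_shared: int) -> dict:
    """Summarize the overlap between two super-enhancer sets from counts alone.

    Assumption: each shared SE in one cell type matches exactly one SE in the
    other. Real interval overlaps need not be one-to-one.
    """
    if n_shared > min(n_mk, n_eb):
        raise ValueError("shared count cannot exceed either set size")
    union = n_mk + n_eb - n_shared
    return {
        "mk_specific": n_mk - n_shared,   # SEs found only in megakaryocytes
        "eb_specific": n_eb - n_shared,   # SEs found only in erythroblasts
        "union": union,                   # distinct SEs across both cell types
        "jaccard": n_shared / union,      # shared fraction of the union
    }


# Counts taken from the text: 1067 MK SEs, 1294 EB SEs, 400 overlapping.
summary = se_overlap_summary(1067, 1294, 400)
print(summary)
```

Under that assumption, 667 SEs would be MK-specific and 894 EB-specific, with a Jaccard index of roughly 0.20.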
To corroborate this, we focused on the second (SE2) of three SEs for the CD9-VWF gene cluster, which contained the GWAS hit rs2363877. We recalled 100 individuals by genotype at rs2363877 and found that the minor allele was associated with decreased expression of CD9 and increased expression of VWF in platelets. By flowing blood over collagen and a VWF-binding collagen-mimetic, we showed that this allele has a significant effect on the extent of ex vivo thrombus formation. Finally, using CRISPR/Cas9, we eliminated SE2 from iPSC-derived MKs and showed an increase in VWF expression.
In summary, recently generated epigenetic and long-range chromosomal interaction data suggest that a significant fraction of platelet GWAS hits perturb super-enhancer activity in MKs, with downstream effects on networks of genes. In light of this, the maxim that 'a GWAS hit acts through a single gene' cannot be sustained.
No relevant conflicts of interest to declare.