Key Points
Nfix deficiency results in a loss of long-term hematopoietic stem cells and an accumulation of megakaryocyte and myelo-erythroid progenitors.
NFIX interacts with PU.1 at super-enhancers to regulate gene targets involved in cellular respiration and hematopoietic differentiation.
Abstract
The transcription factor (TF) nuclear factor I-X (NFIX) is a positive regulator of hematopoietic stem and progenitor cell (HSPC) transplantation. Nfix-deficient HSPCs exhibit a severe loss of repopulating activity, increased apoptosis, and a loss of colony-forming potential. However, the underlying mechanism remains elusive. Here, we performed cellular indexing of transcriptomes and epitopes by high-throughput sequencing (CITE-seq) on Nfix-deficient HSPCs and observed a loss of long-term hematopoietic stem cells and an accumulation of megakaryocyte and myelo-erythroid progenitors. The genome-wide binding profile of NFIX in primitive murine hematopoietic cells revealed its colocalization with other hematopoietic TFs, such as PU.1. We confirmed the physical interaction between NFIX and PU.1 and demonstrated that the 2 TFs co-occupy super-enhancers and regulate genes implicated in cellular respiration and hematopoietic differentiation. In addition, we provide evidence suggesting that the absence of NFIX negatively affects PU.1 binding at some genomic loci. Our data support a model in which NFIX collaborates with PU.1 at super-enhancers to promote the differentiation and homeostatic balance of hematopoietic progenitors.
Introduction
Hematopoietic stem cells (HSCs) are responsible for blood maintenance and can reconstitute the hematopoietic hierarchy when transplanted into a host whose hematopoietic system has been ablated. The previous 2 decades have witnessed an increased trend in the survival of people diagnosed with hematopoietic diseases and treated with HSC transplantation (HSCT).1 However, there remain many complications associated with HSCT, such as engraftment failure, infection, and mortality. To better understand the molecular regulation of HSCT, we have sought to identify novel molecular regulators of HSC repopulating activity. We recently identified nuclear factor I-X (NFIX) as required for optimal HSC in vivo repopulating activity after transplantation.2
NFIX belongs to the Nfi gene family, which also includes NFIA, NFIB, and NFIC.3,4 The NFI family function as transcriptional activators or repressors and bind to a palindromic DNA sequence (TGG(N6-7)GCCA) as homodimers or heterodimers. NFI family members are expressed in a variety of cell types, including multiple adult stem cell compartments.5-9 NFIC regulates the differentiation, proliferation, and apoptosis of dental follicle stem cells.6 NFIB is expressed by epithelial hair follicle stem cells, promoting proliferation and differentiation.7 NFIA functions as a transcriptional switch in multiple stem and progenitor cell compartments. It promotes gliogenesis in the developing chick neural tube while inhibiting further neurogenesis of ventricular zone progenitor cells and regulates the granulocytic/erythroid fate choice of human hematopoietic progenitors during in vitro differentiation.8,9 NFIX itself regulates the quiescence, adhesion, migration, and differentiation of neural stem cells.10-12 Recently, NFIX mutations have been associated with acute erythroid leukemia.13
We have shown that NFIX regulates HSC survival after transplantation: Nfix-deficient hematopoietic stem and progenitor cells (HSPCs) exhibit increased apoptosis acutely after transplantation and reduced colony-forming potential in vitro.2 In contrast, enforced NFIX expression protects hematopoietic cells from apoptosis and prolongs hematopoietic cell growth in vitro in a thrombopoietin-dependent manner.14 NFIX is well characterized as a vital regulator of transcription in immature cells; however, little is known about the role of NFIX in hematopoiesis. Furthermore, it is currently unknown whether NFIX collaborates with other characterized transcriptional regulators of hematopoiesis.
To explore the NFIX-regulated transcriptome in HSPCs, we generated a transcriptional, cellular atlas of wild-type and Nfix-deficient bone marrow (BM) HSPCs. Here, we report that the loss of Nfix results in perturbations in the relative frequencies of specific hematopoietic progenitor subsets. Via chromatin immunoprecipitation (ChIP)-seq and assay for transposase-accessible chromatin (ATAC)-seq, we show for the first time that NFIX partners with additional key hematopoietic transcription factors (TFs), including PU.1, especially at superenhancers. Moreover, we provide evidence suggesting the absence of NFIX negatively affects PU.1 binding at some genomic loci. We also computationally predict target genes of NFIX and PU.1 across the hematopoietic hierarchy and investigate their regulatory functions during hematopoietic differentiation.
Methods
Fluorescence-activated cell sorting (FACS) and cellular indexing of transcriptomes and epitopes by high-throughput sequencing (CITE-seq)
BM was extracted from the femurs, pelvic bones, tibias, and spines of tamoxifen (TAM)-treated Nfixflox/floxRosa26CreERT2+/+ and Nfixflox/floxRosa26CreERT2+/T mice via crushing. c-Kit+ cells were isolated via magnetic enrichment using anti-CD117 microbeads (Miltenyi Biotec, Carlsbad, CA) and an autoMACS magnetic cell separator (Miltenyi Biotec, Carlsbad, CA). For c-Kit+ cells, HSPCs (Lineage-Sca1+ c-Kit+), and long-term hematopoietic stem cells (LT-HSCs) (Lineage-Sca1+ c-Kit+CD48−CD150+) isolation, cells were stained with anti-Lineage BV605 cocktail, anti-Sca-1-PerCP-Cy5.5 (E13-161.7), anti-c-Kit-APC-e780 (2B8), anti-CD48-Alexa Fluor 700 (HM48-1), and anti-CD150-PE-Cy7 (TC15-12F12.2). For common myeloid progenitor (CMP) (Lineage−Sca-1−c-Kit+CD34+CD16/32med) isolation, cells were stained with anti-Lineage BV605 cocktail, anti-Sca-1-PerCP-Cy5.5 (E13-161.7), anti-c-Kit-APC-e780 (2B8), anti-CD34-FITC (RAM34), and anti-CD16/32-Alexa Fluor 700 (2.4G2). In addition to FACS antibodies, cells were labeled with cellular indexing of transcriptomes and epitopes by high-throughput sequencing (CITE-seq) antibodies (see supplemental Table 1). 4′,6-diamidino-2-phenylindole (Sigma-Aldrich, St Louis, MO) was used for dead cell exclusion. All cell isolations were performed using cell sorting on FACSAria III (BD Biosciences, San Jose, CA). Then, 7000 Lineage−cKit+ BM, 1000 HSPCs, 1000 CMPs, and 1000 LT-HSCs were pooled together for 10x genomics (see supplemental Methods).
Single-cell RNA-seq (scRNA-seq) analysis
Single-cell data were analyzed using cellranger count (version 6.0.0)15 with default parameters for CITE-seq (--feature-ref). The transcriptome database was cellranger-mm10-3.0.0. Cells were filtered based on: (1) number of expressed genes (ie, nFeature_RNA) between 200 and 6000 and (2) percentage of mitochondrial reads (ie, percent.mt) less than 10%. The weighted nearest neighbor Uniform Manifold Approximation and Projection (UMAP) and cell clusters were generated using Seurat (v4.0.4),16 following the reference guide of the weighted nearest neighbor analysis of CITE-seq, RNA + ADT. Specifically, principal component analysis (PCA) was used to generate embeddings for both the RNA and the protein space (ie, NormalizeData() %>% FindVariableFeatures() %>% ScaleData() %>% RunPCA()). Then the 2 spaces were integrated using the FindMultiModalNeighbors method with default parameters. Finally, cell clusters were identified using RunUMAP with default parameters and FindClusters with algorithm = 3 and resolution = 1. Cell cycle prediction was done using CellCycleScoring. The 2D kernel density estimation (KDE) was done using MASS (v7.3-54)17 and was plotted using ggplot2 (v3.3.5, the contour plot)18 and rayshader (v0.26.2, the 3D view).19,P value for the density difference comparing Nfix+/+ with NfixΔ/Δ for each cluster was calculated using the kde.test function (with parameter of binned = T) from the ks R package (version 1.13.2).20 The cluster sizes were fixed to 500 cells by sampling with replacement. The kernel density test was repeated 1000 times for each cluster, and the mean P value was used. To show that density differences were not a bias of parameter settings, we conducted: (1) KDE analysis with a range of bandwidth from 0.5 to 5 (supplemental Figure 4A-F) and (2) differential abundance analysis (DAseq)21 with a range of scales from 20 to 5000 (supplemental Figure 4G-H). The DA cells were generated using the getDAcells function based on the PCA embedding and UMAP embedding. The clustering of DA cells and differential test were done using the getDAregion function with default parameters and min.cell = 50. Differential gene expression analysis was done using FindMarkers with “slot = data, logfc.threshold = 0.2, test.used = wilcox, pseudocount.use = 1.” Genes were further filtered by false discovery rate ≤0.01, which was calculated using the Benjamini-Hochberg method in the R package.
Motif and peak analysis and motif distance analysis
NFIX and PU.1 overlapped peaks were determined using bedtools (eg, bedtools intersect -u) (v2.25.0).22 NFIX and PU.1 motifs were downloaded from the Homer motif database.23 Motif scanning was performed using FIMO (version 4.11.2)24 with default P value cutoff of 10−4. NFI motif was only searched on the positive strand. Motif distance was calculated using “bedtools closest -a NFIX -b PU.1 -k 1 -d.” Only unique NFIX and PU.1 occurrences were used in the calculation. Background sequences were randomly sampled from mm9 with “bedtools random -l 10000 -n 743.” The reason to set 10 000 for sequence length is to increase the chance of NFIX and PU.1 co-occurrence. Random sampling was performed 100 times. DeepTools (version 3.2.0)25 was used to draw footprint plots with a bin size of 2bp.
Prediction of NFIX-PU.1 cobinding peaks
Catchitt was used to identify lineage-specific NFIX-PU.1 cobinding peaks.26 The input contains: (1) training labels of NFIX-PU.1 cobinding peaks from HPC5 (generated by the “labels” subcommand); (2) NFIX and PU.1 motifs (generated by the “motif” subcommand); and (3) a cell-type–specific ATAC-seq signal (generated by the “access” subcommand with d = “Bigwig”). ATAC-seq data for LT-HSC, short-term HSC (ST-HSC), and multipotent progenitors (MPP) were downloaded from PRJNA335747. ATAC-seq data for the remaining lineages was downloaded from the mouse VISION project GSE143270.27,28 ATAC-seq data were normalized using S3norm.29 Chromosome 19 was a hold-out set for crossvalidation, with an area under the receiver operating characteristic of 0.95. Cobinding labels on all other chromosomes were put into the iterative learning model (command is “itrain i = 5 abb = 1 aba = 4 P = .01”). The top 1000 predicted cobinding peaks in each blood lineage (eg, LT-HSC, ST-HSC, MPP, megakaryocyte progenitor [MKP], CMP, granulocyte-macrophage progenitor [GMP], myelo-erythroid progenitor [MEP], erythroid progenitors, and erythroid cells [Ery]) were selected and used for downstream analyses.
Target gene assignment
Direct target genes were only assigned to the predicted NFIX-PU.1 peaks if they met both of the following requirements: (1) colocalized within the same topologically associating domain of the peaks or NFIX-PU.1 peaks interacted with the gene promoter based on promoter capture-HiC30 and (2) was a differentially expressed gene (DEG) between Nfix+/+ and NfixΔ/Δ in each cluster. To assess the quality of our approach, we randomly selected 1000 regions and applied the same target-finding pipeline. We hypothesized that cell-type–specific NFIX-PU.1 peaks would better explain DEG than random genomic loci, and thus the number of target genes for NFIX-PU.1 peaks would be higher than the number of target genes from a random approach. To test this hypothesis, we performed a paired t test. Specifically, for each of the 11 clusters, we had 2 values (ie, number of target genes): one obtained using NFIX-PU.1 peaks and the other obtained using random regions. We repeated the procedure 100 times. The average P value was 0.0049, confirming that our analysis was robust.
Results
An scatlas of Nfix-deficient hematopoietic progenitors
To illuminate the role of NFIX as a transcriptional regulator of HSPCs, we performed CITE-seq using Nfixflox/floxRosa26-CreERT2+/+ and Nfixflox/floxRosa26-CreERT2+/T mice treated with TAM (Figure 1A).31 Fourteen days of TAM treatment resulted in the efficient deletion of Nfix in BM HSPCs (supplemental Figure 1). TAM-treated Nfixflox/floxRosa26-CreERT2+/+ and Nfixflox/floxRosa26-CreERT2+/T HSPCs will be referred to as Nfix+/+ and NfixΔ/Δ, respectively, hereafter. After TAM treatment, BM was collected and simultaneously stained with CITE-seq and fluorophore-conjugated antibodies (Figure 1A; supplemental Table 1). To ensure adequate capture of each key hematopoietic progenitor population for sc profiling, rare populations (ie, LT-HSCs and CMPs) were collected separately via cell sorting and pooled with more abundant populations as follows: 7000 Lineage−c-Kit+ hematopoietic progenitors, 1000 HSPCs (Lineage−cKit+Sca1+), 1000 CMPs (Lineage−cKit+Sca1−CD34+CD16/32med), and 1000 LT-HSCs (Lineage− Sca1+ c-Kit+ CD150+CD48−). Pooled cells were then processed for scRNA-seq and cell-surface protein abundance using 10x genomics (Figure 1A). Although our sampling strategy may not reflect the true cellular abundance in the BM, our goal is to interrogate shifts in clusters found within these well-defined progenitor pools based on gene and surface marker expression (ie, CITE-seq).
To identify distinct cell populations, we first performed dimensional reduction based on the weighted nearest neighbor workflow in Seurat version 416 and generated a UMAP visualization of Nfix+/+ (7341 cells) and NfixΔ/Δ (6608 cells) data based on a weighted combination of RNA and protein expression of surface markers (Figure 1B). We then performed clustering analysis using the smart local moving algorithm and identified 11 unique clusters (Figure 1B).32 Clusters were annotated based on cell surface and known gene markers (Figure 1C-D; supplemental Figure 2; supplemental Table 2).33 CD150, CD48, and SCA1 were used to identify phenotypic LT-HSCs, ST-HSCs, and MPPs (Figure 1C; supplemental Figure 2).34 Interestingly, LT-HSC.2 express high levels of CD41 and CD150 relative to LT-HSC.1 and fall in close proximity to the MKP cluster (Figure 1B; supplemental Figure 2). Thus, LT-HSC.2 likely represent megakaryocyte-primed HSCs.35 MKPs were annotated based on high CD41 expression (Figures 1C; supplemental Figure 2). CD71 and CD105, markers of committed progenitors and Ery, were used to identify CMPs, GMPs, erythroid progenitors, and Ery (Figure 1C; supplemental Figures 2 and 3).36 CMPs and GMPs were further annotated based on high CD16/32 expression (Figure 1C; supplemental Figure 2). The cell-type annotation based on surface markers can be further confirmed using gene expression markers. For example, high expression of erythroid gene markers, Gata2 and Apoe, indicated megakaryocytic and erythroid populations (Figure 1D; supplemental Figure 3).37,38 Annotated HSCs and MPPs expressed high levels of Mllt3, Hlf, Mecom, Meis1, and Mpl (Figure 1D; supplemental Figure 3).39-43,Cavin2 and Pbx1 were both highly expressed in annotated MKPs (Figure 1D; supplemental Figure 3).44,45 Hbb-bs and Hba-a1 were expressed in the Ery cluster (Figure 1D; supplemental Figure 3).46
We thus created an single-cell atlas of c-Kit+ mouse BM based on gene expression and cell-surface markers.
Loss of Nfix depletes a LT-HSC subpopulation and enriches for MKPs
We next examined our sincle-cell atlas to ask if the abundance of any BM HSPCs was perturbed by the loss of NFIX. Two-dimensional KDE confirmed a significant loss of LT-HSC.1 (P value = 1.2 × 10−3), MPPs (P value = 9.2 × 10−4), and GMPs (P value = 1.1 × 10−3) in NfixΔ/Δ HSPCs (Figure 2Ai-iii). NfixΔ/Δ HSPCs displayed a concomitant enrichment of LT-HSC.2 (P value = 3.9 × 10−3), ST-HSC.1 (P value = 9.9 × 10−3), MKP (P value = 3.5 × 10−3), and MEP (P value = 2.8 × 10−5) relative to Nfix+/+ HSPC (Figure 2Ai-iii). Consistently, we observed a significant decrease in the fraction of cells assigned to LT-HSC.1, MPP, and GMP clusters and an increase in those assigned to LT-HSC.2, ST-HSC.1, MKP, and MEP clusters in NfixΔ/Δ vs Nfix+/+ HSPCs (Figure 2B). To test the robustness of the cell population differences and to exclude possible estimation bias caused by parameter settings, we performed KDE analysis and DA analysis using a range of parameters. The KDE analyses with different bandwidths all show more crowded contour lines in the MKP and MEP clusters and less crowded lines in the LT-HSC.1 cluster for the NfixΔ/Δ sample (supplemental Figure 4A-F). The DA analysis result is another independent confirmation showing more enriched cells in ST-HSC.1, MEP, and MKP and more depleted cells in the LT-HSC1 cluster (supplemental Figure 4G-H). Consistent with our cluster annotations, the LT-HSC and ST-HSC clusters were largely quiescent. Thus, differences in cluster abundance did not correlate with changes in cell cycle status (supplemental Figure 4I).
In sum, loss of Nfix results in a selective loss of LT- and ST-HSC subsets and MPP, as well as an accumulation of downstream MKP and MEP clusters, suggesting a possible block in differentiation. This is in line with a known role for a highly related gene, NFIA, in the granulocytic/erythroid fate choice of human hematopoietic progenitors.9
To better understand the perturbations in these populations, we next examined changes in gene expression using the gene set enrichment analysis across clusters (Figure 2C; supplemental Figure 5A-B). Differential gene expression analysis comparing NfixΔ/Δ and Nfix+/+ cells revealed enrichment of gene ontology (GO) terms associated with each cell cluster (supplemental Tables 3 and 4; supplemental Figure 5C). In LT-HSC.1, whose abundance is depleted in NfixΔ/Δ HSPCs, we observed decreased expression of genes associated with ribosome complex formation and translation (eg, macromolecule biosynthetic process) and cellular respiration (eg, cellular metabolic process; Figure 2C; supplemental Tables 3 and 4). Protein synthesis is tightly regulated in HSCs to preserve their function.47 These data suggest a novel role for NFIX in regulating metabolic processes. Multiple genes involved in modulating the electrochemical gradient and mitochondrion are downregulated in the absence of Nfix (Cox genes, Uqcr genes, and Atp synthase genes; supplemental Table 3; supplemental Figure 5B), consistent with previous studies showing mitochondrial membrane potential is important for HSPC self-renewal and differentiation.48 In addition, gene set enrichment analysis terms associated with differentiation were enriched (eg, regulation of cell differentiation; Figure 2C; supplemental Table 4), further supporting the altered hematopoiesis in NfixΔ/Δ HSPC.
We also observed an increase in LT-HSC.2 in NfixΔ/Δ HSPCs. Here, genes involved in the microenvironment (eg, response to stimuli and cell periphery) were perturbed (Figure 2C; supplemental Table 4). The BM niche is a rich microenvironment that supports the maintenance and survival of HSPCs.49 We again found GO terms related to differentiation (eg, cell differentiation and regulation of cell population proliferation) enriched in LT-HSC.2 DEG (Figure 2C; supplemental Table 4). These GO terms included the upregulated genes, Cd74 and Mcl1 (supplemental Table 3; supplemental Figure 6). Cd74 and Mcl1 play a role in HSC maintenance and survival.50,51 Activation of CD74 indirectly induces Bcl2 expression, a prosurvival factor.52 Consistently, we observe a significant increase of Bcl2 expression in NfixΔ/Δ MKPs (supplemental Table 3; supplemental Figure 5B), which is downstream of LT-HSC.2 and may explain its increased abundance in NfixΔ/Δ HSPCs (Figure 2B).
Taken together, we show that the loss of Nfix perturbs the abundance of BM HSPCs. Select LT-HSCs, MPPs, and GMPs are lost, which is accompanied by increases in a distinct LT-HSC subset, ST-HSCs, MKPs, and MEPs (summarized in Figure 2D). Our data are consistent with prior studies linking NFIX to HSPC survival and further implicate NFIX in HSPC differentiation and proliferation, ribosome biogenesis and translation, and cellular respiration.2,14,53
The scRNA-seq data we generated is illuminating and would pair well with functional studies that directly interrogate NFIX genetic targets. However, to our knowledge, there are no studies that describe how NFIX directly regulates hematopoiesis or identify potential transcriptional partners of NFIX.
Global profiling of NFIX shows co-occupancy with multiple hematopoietic factors
To investigate how NFIX regulates hematopoiesis, we sought to generate a global map of NFIX-binding sites. Currently, there are few commercial antibodies that can reliably discriminate between different NFI families because of high sequence homology. To address this, we generated a NFIX-specific monoclonal antibody (clone: 7B5.3). Our highly specific anti-NFIX monoclonal antibody exclusively immunoprecipitated NFIX-FLAG in 293T cells overexpressing NFIX-FLAG but not in cells overexpressing NFIA-FLAG (supplemental Figure 7A).
We then performed ChIP followed by high-throughput sequencing (ChIP-seq) using our novel anti-NFIX antibody and Nfix+/+ HPC5 cells as an in vitro surrogate for HSPCs.54 We identified 6783 high-quality NFIX peaks with an irreproducible discovery rate cutoff at 5% (supplemental Table 5; supplemental Figure 7C-E).55 To validate observed peaks as true NFIX-binding sites, we generated Nfix−/− HPC5 cells with sgRNAs combined with Cas9 targeting the third exon of Nfix (supplemental Figure 7B). NFIX ChIP-seq in Nfix−/− HPC5 cells detected only 155 peaks (Figure 3A; supplemental Figure 7C-E), which were excluded from further analysis. We performed motif discovery analysis and observed that the NFI consensus motif was the most overrepresented motif in NFIX peaks (supplemental Figure 7E), further confirming that these ChIP-seq peaks were directly occupied by NFIX. Genome-wide distribution reveals that most of the NFIX peaks (77%) were found in distal regulatory elements (Figure 3B). Previous studies found that NFIX primarily acts from enhancer regions.12,56 To study NFIX-occupied enhancers, we performed H3K27ac ChIP-seq in HPC5 cells and identified 2698 NFIX and H3K27ac overlapped peaks (Figure 3C). Motif enrichment analysis of these sites revealed enrichment for PU.1, NFE2, and RUNX2 motifs (Figure 3C), suggesting they are cofactors of NFIX. To validate co-occupancy of NFIX and the hematopoietic TFs suggested by motif analysis, we first compared global NFIX binding with ChIP-seq data from 27 published hematopoietic TFs in HPC7 cells.28,57,58 HPC7 cells have murine embryonic stem cell origins, whereas HPC5 cells are derived from murine adult BM, however, both cell lines overexpress the LIM-homeobox gene, Lhx2, and are cytokine-dependent.54,59 The genome-wide ChIP-seq signal correlation showed NFIX clustered with PU.1, as well as active histone marks such as H3K27ac, H3K9ac, and H3K4me3 (supplemental Figure 8). In contrast, there was no significant correlation between NFIX-binding signals and the repressive mark, H3K27me3 (supplemental Figure 8). For example, 1 distal regulatory element located ∼4kb upstream of the Il10ra gene is co-occupied by 10 different TFs in HPC7 (Figure 3D; supplemental Table 6), suggesting that NFIX regulates gene expression, such as Il10ra, via a protein complex. Consistently, NFIX occupancy signals are enriched in the binding sites of multiple TFs in HPC7 cells (Figure 3E).
Collectively, these data support a model where NFIX collaborates with additional hematopoietic TFs to regulate gene expression at enhancers.
NFIX and PU.1 physically interact and cobind genomic regions
As PU.1 was implicated as an NFIX transcriptional partner by ChIP-seq, we then focused on validating if NFIX and PU.1 physically interact in a complex via coimmunoprecipitation. In cells overexpressing MYC-tagged NFIX and FLAG-tagged PU.1, both proteins were found in the same complex recovered during coimmunoprecipitation (Figure 4A). To further understand how NFIX and PU.1 interact at DNA in an endogenous chromatin environment, we performed PU.1 ChIP-seq in HPC5 cells and identified 743 peaks cobound by NFIX and PU.1 (supplemental Table 7). To better understand the mechanism of cobinding, we investigated the motif footprints of NFIX and PU.1 using high sequencing depth ATAC-seq data from HPC5 cells (Figure 4B). The median distance between NFI and PU.1 motifs is 147bp, which is significantly less than that expected by random chance (P value = 5 × 10−127), suggesting that NFIX and PU.1 co-occupied sites likely occur within 1 nucleosome or 2 adjacent nucleosomes (supplemental Figure 9). Overall, these data reveal that NFIX and PU.1 are capable of physically interacting and can bind DNA in close proximity, supporting a model of cooperativity between the 2 TFs.
Similar to the NFIX-binding patterns, 90% of NFIX and PU.1 cobound peaks are located 2 kilobases away from the transcription start site (TSS) (Figure 4C), suggesting that these NFIX-PU.1 cobound peaks are distal regulatory elements. Compared with other NFIX peaks, NFIX-PU.1 cobound peaks show higher chromatin accessibility and signals of H3K27ac (Figure 4D). In fact, 61% (n = 451) of NFIX-PU.1 cobound peaks overlap with predicted super-enhancers based on clustering of H3K27ac signals in ATAC-seq peaks (Figure 4E). To further glean the functional importance of NFIX with PU.1, we investigated changes in PU.1 binding in the absence of NFIX using the HPC5 Nfix−/− cell line. Here, we find significant differences in the PU.1 ChIP signal between Nfix+/+ and Nfix−/− HPC5 cells. Differential PU.1 peak analysis revealed 2916 differentially bound peaks between Nfix+/+ and Nfix−/− HPC5 cells (supplemental Table 8). For example, PU.1 binding was decreased at the Pdcd, Socs3, and Meis1 loci (Figure 4F).
Collectively, this functional genomic evidence indicates that NFIX and PU.1 interact and likely function at active distal super-enhancers.
NFIX and PU.1 regulate hematopoietic progenitor differentiation
To interrogate NFIX and PU.1 coregulatory function further, we sought to generate a genome-wide cobinding profile across blood lineages. However, because of limited cell numbers, it is not possible to perform conventional ChIP-seq assays to map NFIX-PU.1–binding sites in each primitive hematopoietic population. Thus, we used chromatin accessibility and TF motifs to predict NFIX and PU.1 co-occupancy using Catchitt (Figure 5A).26 Catchitt is an iterative machine-learning algorithm that combines open chromatin signals (ie, ATAC-seq) and DNA sequence features (ie, TF motifs) to predict TF-binding sites. We first used this algorithm to train a NFIX-PU.1 co-occupancy model using ATAC-seq signals and PU.1 and NFI motifs in HPC5 cells (Figure 5A). We then used the bona fide co-occupancy sites identified by ChIP-seq to assess the prediction accuracy on a hold-out chromosome (ie, excluded from training; Figure 5A; see “Methods”). The area under the receiver operating characteristic curve (see “Methods”) was close to 0.95, confirming the good performance of the algorithm. We then used publicly available ATAC-seq (from27,28) data from different primary primitive hematopoietic populations to predict NFIX-PU.1–binding sites (Figure 5A). We then selected the top 1000 predicted co-occupancy sites for each of the 11 scRNA-seq clusters for downstream analysis (supplemental Table 9). To correlate DEG patterns with NFIX-PU.1 sites, we performed target gene assignment for the DEG identified in each scRNA-seq cluster. The NFIX-PU.1 sites were assigned to the target genes if they fell within the same topologically associating domain or if their interactions were supported by promoter-capture HiC interactions.30 For all cell clusters, a combined total of 279 direct gene targets for NFIX-PU.1 putative target sites was found (Figure 5B; supplemental Table 10). We hypothesized that if our target gene assignments were accurate, the number of target genes assigned by NFIX-PU.1 sites should be higher than the number of target genes assigned by random genomic sites. Indeed, NFIX-PU.1 sites explained DEG significantly better than random sites (P value = .0049; see “Methods”; supplemental Figure 10).
We then investigated the relationship between NFIX-PU.1 cobinding and gene expression changes in our cell clusters (Figure 2Ai). Interestingly, in HSC and MPP clusters, most of the NFIX-PU.1–predicted target genes were upregulated in NfixΔ/Δ HSPCs (Figure 5B). In contrast, NFIX-PU.1–predicted target genes were downregulated in committed progenitors (Figure 5B). These data suggest a developmentally regulated functional switch from a repressive to an activating NFIX-PU.1–containing transcriptional complex as HSC differentiate. NFIX has been shown to function as a switch during the genetic transition from embryonic muscle to fetal muscle.60
To glean how NFIX and PU.1 collaborate to regulate hematopoiesis, we performed GO analysis on all 279 genes predicted as targets of NFIX-PU.1 for each cell cluster. Consistent with our previous findings (Figure 2C), NFIX and PU.1 were predicted to positively regulate genes associated with cytoplasmic translation and oxidative phosphorylation in LT-HSC.1, suggesting a role in translation and cellular respiration (Figure 5C; supplemental Table 11). Also consistent with our previous findings (Figure 2C), NFIX and PU.1 were predicted to regulate genes related to HSC differentiation specifically in the NfixΔ/Δ MPP cluster. All together, we observed a similar enrichment of GO terms here as in our scRNA-seq data, confirming that NFIX and PU.1 likely work together to regulate cellular respiration and hematopoietic differentiation.
Discussion
Here, we investigated how NFIX regulates hematopoietic progenitors by interrogating its role in the hematopoietic hierarchy, genome-wide occupancy profiles, putative target genes, and transcriptional partners in hematopoietic cells. We find that NFIX binds active distal regulatory elements to directly regulate genes involved in ribosome biogenesis and translation, cellular respiration, and differentiation. In addition, NFIX cooperates with multiple hematopoietic transcriptional regulators, such as PU.1, at super-enhancers. Using integrative analysis (ie, scRNA-seq, ChIP-seq, and ATAC-seq), this study represents the first comprehensive description of the transcriptional targets and coregulatory partners of NFIX in hematopoietic progenitors as well as the effects of Nfix loss on the hierarchy of BM HSPCs.
NFIX is the first member of the NFI family linked to HSPC function in vivo, necessitating a more thorough understanding of its activity.2 We show that when Nfix is deleted, there is a significant loss of LT-HSC.1, MPPs, and GMPs that is concomitant with an increase in LT-HSC.2, ST-HSC.1, MKPs, and MEPs. Based on gene expression and cell-surface CD41 protein expression, LT-HSC.2 appear to be megakaryocyte-biased. Our data suggest a shift toward megakaryocyte commitment when NFIX is lost (Figure 2C). These changes do not result from bias in the sorting and pooling of cell populations (supplemental Figure 4A-B). Rather, changes in cell population frequency may represent a novel role for NFIX in regulating hematopoietic heterogeneity, especially within the LT-HSC compartment. Further investigation of the regulatory mechanisms affected by NFIX (eg, paired scRNA-seq and scATAC-seq) will clarify its precise role in hematopoietic differentiation.
The association of NFIX with genes involved in ribosome biogenesis, translation, cellular respiration, and hematopoietic differentiation is consistent with a role for NFIX in other cell contexts and adult stem cell populations.5-9 For example, Nfix−/− neural progenitors exhibit perturbed differentiation.10,11 The downregulation of multiple ribosomal RNA genes suggests that biosynthetic processes may be stunted, which can result in impaired lineage commitment.61 Indeed, characterization of Nfix−/− HPC5 cells revealed significantly increased apoptosis compared with Nfix+/+ HPC5 cells (supplemental Figure 11A). This is in contrast to our previous observations, where ectopic expression of Nfix in HSPCs results in protection from apoptosis.14 In addition, we observed a significant reduction in the number of colonies formed from Nfix−/− HPC5 cells compared with Nfix+/+ HPC5 cells (supplemental Figure 11B). These data when combined, indicate the importance of NFIX for the survival and regulation of hematopoietic cells.
NFIX concentrated at superenhancers with a master regulator of hematopoiesis, PU.1. Adam et al62 also observed NFI family member occupancy at superenhancers in adult stem cells. Likewise, PU.1 has an established role in binding to active superenhancers in multiple cellular contexts.63-65 These findings suggest that NFIX and PU.1 cooperate at super-enhancer elements to putatively coregulate genes associated with hematopoietic differentiation. We also show decreased binding of PU.1 in the absence of NFIX, suggesting PU.1 may require NFIX for efficient binding. In addition, our data suggest that NFIX likely function in multiprotein complexes that include PU.1, STAT3, FLI1, LYL1, RUNX1, and GATA2, all well-known hematopoietic TFs, to coregulate gene transcription. These data and observations consistently support a model in which NFIX collaborates with multiple key hematopoietic TFs to promote the survival of primitive hematopoietic cells. One caveat for this co-occupancy analysis is that NFIX ChIP-seq was performed using HPC5 cells; in contrast, previously published ChIP-seq for other TFs was performed using HPC7 cells. We did confirm the cobinding between NFIX and PU.1 by performing PU.1 ChIP-seq using HPC5 cells, and future work should include confirmation of other cobinding factors suggested by our analysis. Future work toward understanding the precise role of NFIX will be informative, especially its function in the molecular processes of HSPC transplantation and cooperation with other key hematopoietic factors.
Acknowledgments
The authors thank the flow cytometry core, especially Richard Ashmun, Stacie Woolard, and Luisheng He. The authors acknowledge the library preparation and sequencing performed by the Hartwell Genome Sequencing Facility and VUMC Vanderbilt Core Facilities.
This research was supported by National Institutes of Health grants R01DK104028 (S.M.-F.) and R35GM133614 (Y.C.).
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Authorship
Contribution: M.W., Y.L., A.M.-H., Q.Q., and C.P. conceptualized the study and were responsible for the methodology and validation; Y.L. worked on the software and curated the data; M.W. and Y.L. conducted the formal analysis; M.W. investigated; M.W., Y.L., S.B., and C.C. provided the resources; M.W., Y.L., W.K.C., S.M.-F., and Y.C. wrote the original draft, and reviewed and edited the manuscript; W.K.C., M.W., and Y.L. visualized; and W.K.C., S.M.-F., and Y.C. supervised.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Yong Cheng, 262 Danny Thomas Pl, MS341, Room D3007F, Memphis, TN 38105-3678; e-mail: yong.cheng@stjude.org; and Shannon McKinney-Freeman, 262 Danny Thomas Pl, MS341, Room D3007E, Memphis, TN 38105-3678; e-mail: shannon.mckinney-freeman@stjude.org.
References
Author notes
∗M.W. and Y.L. contributed equally to this study.
The data sets generated and/or analyzed reported in this article have been deposited in the NCBI Gene Expression Omnibus database (accession number GSE166922).
All codes used in this study are available at https://github.com/YichaoOU/NFIX_integrative_analysis.
Data are available on request from the corresponding author, Shannon McKinney-Freeman (shannon.mckinney-freeman@stjude.org).
The full-text version of this article contains a data supplement.