Figure 3.
Integration of scRNA-seq and scATAC-seq data sets. (A-B) UMAP representation of the merged scRNA-seq and scATAC-seq data set. First, a gene activity matrix was calculated for the scATAC-seq data set, using the top 50 differentially expressed genes of each stage from the scRNA-seq data set and the number of scATAC-seq reads within genes of interest, to find and set anchors. Gene expression values for the scATAC-seq data set were then predicted from the global gene expression values of the scRNA-seq data set and the identified anchors. Finally, the scRNA-seq and scATAC-seq gene expression matrices were merged. (C) Number of cells predicted using the gene activity matrix vs the actual number of cells observed at each stage (MBC, prePB, PB, and PC). (D) Percentage of prePB and PB predicted as MBC, prePB, PB, and PC. (E) Volcano plot showing differentially expressed genes (using the gene activity matrix) between the prePB predicted as prePB and the prePB predicted as PC. Genes identified as significantly differentially expressed are colored in blue (P value <.05 and log2(fold change) >0.25). (F) IFI6 expression observed in MBC, prePB, PB, and PC using the scRNA-seq (top) and scATAC-seq (bottom) data sets. High and low expression are shown in dark blue and yellow, respectively.
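
The percentages in panel D amount to a row-normalized cross-tabulation of observed stage against predicted stage. A minimal Python sketch of that tabulation follows. It is not the authors' code: the function name `predicted_stage_percentages` and the toy labels are hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import Counter, defaultdict

# Stage names as used in the figure.
STAGES = ["MBC", "prePB", "PB", "PC"]


def predicted_stage_percentages(observed, predicted, stages=STAGES):
    """For each observed stage, return the percentage of its cells
    assigned to each predicted stage (hypothetical helper)."""
    counts = defaultdict(Counter)
    for obs, pred in zip(observed, predicted):
        counts[obs][pred] += 1

    table = {}
    for obs in stages:
        total = sum(counts[obs].values())
        if total == 0:
            continue  # no cells observed at this stage
        table[obs] = {p: 100.0 * counts[obs][p] / total for p in stages}
    return table


# Toy labels for illustration only; these are not data from the study.
observed = ["prePB", "prePB", "prePB", "prePB", "PB", "PB"]
predicted = ["prePB", "prePB", "PC", "MBC", "PB", "PC"]

table = predicted_stage_percentages(observed, predicted)
for stage, row in table.items():
    print(stage, row)
```

Each row sums to 100, so the rows can be read directly as the stacked bars of a plot like panel D.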