NRF1 binding sites are highly enriched in multiple myeloma (MM). (A) Schematic representation of the strategy used to identify TFs involved in MM and stratify them by their PI. (B) Heat map depicting the patient data set. From top to bottom: bar chart of the number of significant accessible regions for each of the 55 samples investigated in this study; cytogenetics determined by fluorescence in situ hybridization; and clinical information (disease status, purple for NDMM and orange for treated MM; sex, light blue for male [M] and pink for female [F]; percentage of malignant PCs, purple gradient; age, green gradient). (C) Heat map of unsupervised clustering analysis showing the enrichment score of each TF detected at each PI value (range, 1-55). The enrichment score represents the ratio between observed enrichment in open chromatin regions from our in-house MM cohort and expected enrichment in random chromatin accessibility sampling scenarios, with red at PI 55 and white at PI 1. Each row represents a TF; rows were clustered across PI groups using the Ward.D method with Euclidean distance as the dissimilarity measure. The clustering analysis identified 3 groups: C1 containing TFs (n = 120) with heterogeneous PI, C2 containing TFs (n = 45) with high PI, and C3 containing TFs (n = 143) with low PI. The enlargement of C2 displays the detailed enrichment scores (right). (D) MSigDB pathway ontology analysis of TFs that are highly shared (PI between 36 and 55) among patients in our in-house cohort. The x-axis represents the –log10 of the FDR. (E) Motif analysis at the most penetrant loci showing the NRF1 motif (top), detected by an independent methodology using TOBIAS (ref 69) from the BAM file of each sample. Footprinting calls were performed using all 55 BAM files at the most penetrant loci (PI 36 to PI 55).
(F) Correlation analysis of TF enrichment scores calculated at increasing population percentiles (5% increments from 5% to 100%) between our in-house MM cohort and the Lund MM cohort. The x-axis indicates the rank index assigned to each TF based on its Spearman correlation coefficient, which is shown on the y-axis. Both cohorts were preprocessed with the same pipeline, as described in “Methods.” ATR, ataxia telangiectasia and Rad3-related protein; C1, cluster 1; FDR, false discovery rate; MSigDB, molecular signatures database; Obs/Exp, observed vs expected; PLK, polo-like kinase.
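As an illustration of the quantities plotted in panels C and F, the following is a minimal sketch and not the pipeline used in this study. It assumes that each TF has one enrichment score per population percentile in each cohort, computes the observed/expected ratio and a per-TF Spearman coefficient between cohorts, and then ranks the TFs. The function names, the TF labels (TF_A, TF_B), the toy scores, and the descending rank order are all hypothetical.

```python
from statistics import mean


def enrichment_score(observed, expected):
    """Observed/expected ratio (Obs/Exp); expected must be positive."""
    if expected <= 0:
        raise ValueError("expected enrichment must be positive")
    return observed / expected


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with >= 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


def rank_tfs(cohort_a, cohort_b):
    """Rank TFs shared by both cohorts by Spearman coefficient.

    Each argument maps a TF name to its enrichment scores across
    percentiles. Descending order is an assumption of this sketch.
    """
    shared = sorted(set(cohort_a) & set(cohort_b))
    rho = {tf: spearman(cohort_a[tf], cohort_b[tf]) for tf in shared}
    return sorted(rho.items(), key=lambda item: item[1], reverse=True)


# Toy scores at four percentiles (hypothetical values, not study data).
in_house = {"TF_A": [1.0, 1.5, 2.0, 3.0], "TF_B": [1.0, 2.0, 3.0, 4.0]}
external = {"TF_A": [0.9, 1.2, 2.5, 2.8], "TF_B": [4.0, 3.0, 2.0, 1.0]}
ranking = rank_tfs(in_house, external)
```

In this toy input, TF_A rises monotonically in both cohorts and TF_B moves in opposite directions, so TF_A is placed first in the ranking.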