Key Points
A unique plasma-based transcriptional signature associated with frequency of pain in individuals with SCD was discovered.
The biological signature associated with pain frequency was enriched in immune-mediated pathways; top hub genes were TNF, CCL2, and ITGAM.
Visual Abstract
Pain is the most common complication of sickle cell disease (SCD). The underlying biology of SCD pain is not well understood, which is a barrier to novel, effective analgesic and preventive therapies. A wide variability in the phenotypic expression of pain exists among individuals with SCD, despite the inheritance of a similar defective hemoglobin gene. This interindividual pain variability further complicates the ability to understand the biology and effectively treat pain. We sought to discover a biological signature comprising differentially expressed genes unique to SCD that could differentiate between individuals with varied pain frequency. We conducted plasma-induced transcription analysis from 149 individuals with SCD and 60 Black individuals without SCD from multiple sites. We discovered 3028 differentially expressed genes that underwent weighted gene coexpression network analysis to distinguish gene modules significantly associated with pain frequency. We identified 524 genes, significantly associated with pain frequency (≥|0.3| and P < .05), that were further analyzed using the “database for annotation, visualization, and integrated discovery” (DAVID) tool to delineate the biological pathways associated with these genes. The highest ranked Gene Ontology process from DAVID was inflammatory response (P = 1.67E-12) and many related pathways were enriched (eg, response to lipopolysaccharide, and chemokine and cytokine signaling). The top 10 hub genes identified within our biological signature were TNF, CCL2, ITGAM, ITGAX, ICAM1, CCR5, CXCL2, IFNG, CCR1, and CXCL3. Future work should focus on further validating this signature and investigating the potential targets uncovered for their mechanistic and potentially therapeutic role in SCD pain.
Introduction
Sudden and unpredictable severe acute pain and chronic debilitating daily pain are the most common complications of sickle cell disease (SCD),1 and higher frequency of acute pain is associated with increased mortality.2 A wide variability in the phenotypic expression of pain exists among individuals with SCD, despite common inheritance of the hemoglobin S (HbS) gene. Approximately 29% of individuals with SCD have no acute care visits per year, 28% have between 1 and 3 acute care visits per year, and almost 17% have ≥3 acute care visits per year.3 Pain-related patient-reported outcomes assessed during baseline health support this phenotypic variability, with 47% of individuals reporting low functioning, 35% reporting intermediate functioning, and only 18% reporting high functioning.4 The understanding of why some individuals with SCD experience frequent pain and have impaired functioning, whereas others rarely experience pain and have normal functioning, is limited. The inability to identify the biological factors that contribute to pain heterogeneity is a barrier to tailored, targeted, and effective pain treatment.
The use of genomic technologies to understand disease biology and heterogeneity has facilitated rapid advances in hematology, including SCD. One previously published study based on whole-blood RNA-sequencing profiles in a small sample of individuals with SCD suggested that there is a set of differentially expressed genes during acute pain compared with baseline health within the same individuals.5 However, leveraging genomic research to understand the underlying biology that contributes to interindividual variability of SCD pain frequency, remains very limited. Thus, identifying a biological signature, defined as a group of genes specific to individuals with SCD, that is reflective of interindividual differences in pain frequency is imperative to understanding the clinical variability seen in SCD pain and to develop targeted, novel treatments.
New biological insight has been gained by applying plasma-induced transcriptional analysis to type 1 diabetes, cystic fibrosis, and inflammatory bowel disease, but not for SCD. In this study, we sought to identify a biological signature, unique to individuals with SCD, that could differentiate between individuals with varied pain frequency. To characterize the extracellular milieu, we conducted plasma-induced transcription analysis, followed by weighted gene coexpression network analysis (WGCNA), and leveraged the “database for annotation, visualization, and integrated discovery” (DAVID) tool to identify biological pathways associated with SCD pain.6-8 We hypothesized that we could identify a set of differentially expressed genes in individuals with SCD that are associated with increased frequency of health care use for pain.
Methods
Study population and clinical data
This observational cohort study includes individuals with SCD in baseline health, and Black individuals without SCD.
Individuals with SCD during baseline health
Plasma samples were collected during the National Institutes of Health–funded multisite, randomized, controlled trial, Magnesium for Children in Crisis (MAGiC; ClinicalTrials.gov identifier: NCT01197417).9 Inclusion criteria for MAGiC were children aged 4 to 21 years with HbSS or HbSβ0 thalassemia who required hospitalization after failing emergency department pain management. Exclusion criteria to ensure inclusion of only uncomplicated pain events are outlined in detail in the primary MAGiC publication.10 Plasma samples in the MAGiC trial were collected during clinic visits between 2 weeks and 3 months after hospitalization for acute pain treatment. Samples from individuals with SCD collected during baseline health at a single site were also used.
Black individuals without SCD
Plasma samples were collected from individuals aged ≥7 years without SCD and were biobanked from a prior single-site observational study conducted by the research team. Exclusion criteria were another disease resulting in pain-related disorder and neurological disease.
As per prior protocols,10 plasma was collected from all individuals via peripheral blood draw (into sodium citrate anticoagulant), immediately centrifuged, and stored at −80°C.
Clinical data
Demographic, laboratory, and clinical data were extracted from the electronic medical record. Hydroxyurea use within prior 3 months was self-reported by the patient/family at the time of MAGiC trial enrollment. Proof of adherence, dose, and therapy duration were not assessed. Our phenotypic trait of interest, pain frequency, was defined as the total number of hospitalizations for an acute pain episode in the 3 years before plasma collection and was analyzed as a continuous variable.
Overview of methods
Figure 1 illustrates the study methods. Each step is subsequently described in detail. The study was approved by the institutional review board. Informed written consent/assent was obtained from all participants, or the legal guardian and child when appropriate.
Overview of study methods. The figure displays an overview of the process and methods used for plasma analyses and integration of the gene expression data with clinical data.
Overview of study methods. The figure displays an overview of the process and methods used for plasma analyses and integration of the gene expression data with clinical data.
Plasma-induced transcriptional signature analysis
Plasma-induced transcriptional signature analyses were conducted as described previously.7,8,11-15 Briefly, participant plasma was cocultured with cryopreserved peripheral blood mononuclear cells (PBMCs) obtained as a single draw from a single healthy blood donor (Cellular Technology Ltd, Shaker Heights, OH). PBMCs are used as a biosensor that transcriptionally respond to the associated factors in study participant plasma. Purified RNA from PBMCs was extracted with TRIzol reagent (Invitrogen Life Technologies, Carlsbad, CA) and the induced gene expression was comprehensively measured with Affymetrix GeneChip Human Genome U133 Plus 2.0 array (Affymetrix, Santa Clara, CA), and normalized with robust multichip analysis (www.bioconductor.org). We identified transcripts that exhibited a fold change of >1.4 and were significantly differentially expressed (analysis of variance at a P value of < .05 with a false discovery rate [FDR] of <10%) between individuals with and without SCD. Analyses were done using the using Genomics Suite 7.0 (Partek, St. Louis, MO). The same process was performed for single-center cohorts and multicenter cohorts.
Creation of a gene input list for WGCNA
We further filtered differentially expressed transcripts to create the gene input list for WGCNA using an intersection of 2 filtering methodologies. For the first method, we retained genes that were in the top 25% of transcripts meeting stated thresholds in both single and multiple center analyses. For the second method, we retained genes in the top 5000 transcripts with the highest median absolute deviation (MAD) in both analyses. Age and sex were investigated and were not found to contribute to the measured variance. We leveraged 2 independent data sets (single site, multisite) to identify the maximum number of the most variant transcripts present in the combined data sets. This approach, as used in our previous work,7 increases the likelihood that the observed variance is based on biological as opposed to technical variance. The transcripts that demonstrated the most variance and that were present in both data sets were used for the final gene input list for WGCNA.
Construction of WGCNs
WGCNA is an unsupervised classification tool that is useful for identification of prognostic and diagnostic biomarkers and potential therapeutic targets. It has been used in diabetes, cardiovascular diseases, cancer, and osteoporosis.7,16-19 The main strength of WGCNA is that it provides a solution to the multiple testing problem that exists in genomics studies when correlating transcript expression levels to clinical data by clustering variant transcripts into a limited number of coexpression modules. We used WGCNA to construct the coexpression modules from the genes included in the retained gene list that was generated by plasma-induced transcription analyses and our filtering criteria described earlier.20,21 The automatic network construction function was used to construct the coexpression network. Signed scale independence and mean connectivity were used to identify the soft threshold to which coexpression similarity is raised. This transcript input list was hierarchically clustered based on topological overlap matrix and pruned by the “dynamic tree cut” function resulting in coexpression modules. Each module is represented by an eigengene, defined as the first principal component of a given module and is representative of the expression profiles of genes within that module. WGCNA assumes genes within the same module collectively contribute to the same or related biological pathways. Pearson correlation coefficients were calculated to determine the relationship between the module eigengene and our primary phenotypic characteristic of interest, total number of acute care encounters for pain in the 3 years before plasma collection.
Correlation of WGCNA modules with clinical traits
We first performed WGCNA from a single center to identify modules that displayed significant associations with the frequency of health care encounters for pain, defined as correlation of ≥|0.3| with a P value < .05. Each module in our results is labeled with a distinct color. Next, we applied the identified modules that showed significant correlations with clinical pain data in single site data to multisite data that were derived from MAGiC trial plasma to determine whether these modules were reproducible in a larger multisite data set. Cohorts with <15 individuals were excluded from further analyses because WGCNA requires a minimum sample size of 15 to ensure reliable results. A meta–P value was calculated across the centers by adaptively weighted statistic for modules that showed a significant correlation between the eigengene and frequency of acute care encounters for pain in at least 50% of the centers. A FDR-adjusted P value < .05 was considered significant.
GO analysis
Significant genes (P < .05) in the modules that exhibited a correlation of ≥|0.3| with the frequency of acute cure use for pain and a meta–P < .05 in at least 50% of the centers were analyzed using DAVID (https://david.ncifcrf.gov/).22 Heat maps for all of these significant genes were created with Genesis.23 An annotated heat map was also created for genes in our data set that have been described in the published pain literature. The Gene Ontology (GO) biological processes that included at least 5 genes and a P value < .01 were retained for further consideration.
PPI network construction for selected modules and hub genes identification
The STRING (search tool for the retrieval of interacting genes; http://string-db.org/) was used to construct a protein-protein interaction (PPI) network of the previously described genes. The PPI network was constructed with a high-confidence (0.7) minimum required interaction score to ensure that identified interactions are likely to be true positives. We used Cytoscape visualization software version 3.6.1 to display the network. We used cytoHubba, a Cytoscape plugin, to identify and rank hub genes within the PPI. Maximal clique centrality was used as a topology method.
Validation studies for plasma-based gene expression analyses
We selected 20 samples from each of the 2 groups: high C-C motif ligand 2 (CCL2) gene expression and low CCL2 gene expression. SCD plasma from these 2 groups (n = 40 samples) was cocultured with PBMCs from the same donor as described and used earlier. Conditioned media from these cultures was recovered and enzyme-linked immunosorbent assay for CCL2 (R&D Systems, Minneapolis, MN) was conducted. CCL2 levels were compared between the 2 groups (high and low CCL2 gene expression) using an independent sample t test.
Results
Study population
Our SCD study population included 149 individuals with SCD (27 from a single-center cohort, 122 from multicenter cohorts). There were 60 Black individuals without SCD. Demographics of the study population are shown in Table 1. There were no significant differences in age and sex between individuals with and without SCD.
Identification of WGCNA input gene list
The Affymetrix HG-U133 Plus 2.0 gene chip possesses 54 675 probe sets. Our analysis began by filtering low-signal probe sets, because these data are unreliable. We retained probe sets in which at least 25% of the samples possessed a log2 intensity of ≥4 relative fluorescence unit in both the MAGiC and single-site data sets. A total of 31 202 probe sets met this criterion. Next, we identified the transcripts exhibiting the highest variance. For this, we determined the MAD for each probe set within the single-site data set and for each individual MAGiC study site. For each data set we identified the 5000 probe sets exhibiting the highest MAD and retained those present in the single-site data set and ≥4 MAGiC site data sets. We found 3265 probe sets that were differentially expressed. When intersecting the gene lists obtained from these analyses, we identified 3028 probe sets (Figure 2A).
Plasma-induced transcription and WGCNA. (A) Identification of gene input list. Affymetrix U133 plus 2.0 gene chip has probe sets for interrogation of >54 000 transcripts. We filtered differentially expressed transcripts to create the gene input list for WGCNA. This included using an intersection of 2 filtering methodologies. For the first method, we retained genes that were in the top 25% of transcripts meeting stated thresholds in both single-site and multisite analyses. We retained probe sets where at least 25% of samples possessed log2 intensity ≥4 RFU. These data are represented by the upper portion of the Venn diagram. For the second method, we retained genes that were in the top 5000 transcripts with the highest MAD in single-site and multisite analyses. These data are represented by the lower portion of the Venn diagram. We leveraged 2 independent data sets (single site, multisite) to identify the maximum number of the most variant transcripts that were present in the combined data sets. This approach increases the likelihood that the observed variance is based in biological as opposed to technical variance. The intersection of these 2 analyses was used as the WGCNA input list (n = 3028 probe sets). (B) Determination of soft-thresholding power. Power of 6 was chosen based on signed scale independence and mean connectivity. (C) Dendrogram representing coexpression modules. Vertical lines represent genes. Lower values indicate greater similarity of transcript expression profiles across samples. Coexpression modules are indicated by color. (D) Correlations between gene modules and number of acute care visits for pain in the prior 3 years in 6 different centers using WGCNA. Columns represent different centers, number of participants from each center is displayed. Coexpression modules are represented by rows and labeled by color. The number of probe sets assigned to each module are indicated in the far-left column labeled gene (n). The P value shown within each box is between the eigengene (defined as the first principal component of a principal component analysis on the gene expression of all genes from a module) and number of acute care visits for pain in the previous 3 years for each center; all red shaded boxes represent correlation coefficient ≥|0.3| and P <.05. FDR-adjusted meta–P value is also displayed that was calculated for those modules that had significant correlations with clinical data from ≥50% of the centers (ie, ≥3 centers). N/A, not applicable; Q3, third quartile; RFU, relative fluorescence unit.
Plasma-induced transcription and WGCNA. (A) Identification of gene input list. Affymetrix U133 plus 2.0 gene chip has probe sets for interrogation of >54 000 transcripts. We filtered differentially expressed transcripts to create the gene input list for WGCNA. This included using an intersection of 2 filtering methodologies. For the first method, we retained genes that were in the top 25% of transcripts meeting stated thresholds in both single-site and multisite analyses. We retained probe sets where at least 25% of samples possessed log2 intensity ≥4 RFU. These data are represented by the upper portion of the Venn diagram. For the second method, we retained genes that were in the top 5000 transcripts with the highest MAD in single-site and multisite analyses. These data are represented by the lower portion of the Venn diagram. We leveraged 2 independent data sets (single site, multisite) to identify the maximum number of the most variant transcripts that were present in the combined data sets. This approach increases the likelihood that the observed variance is based in biological as opposed to technical variance. The intersection of these 2 analyses was used as the WGCNA input list (n = 3028 probe sets). (B) Determination of soft-thresholding power. Power of 6 was chosen based on signed scale independence and mean connectivity. (C) Dendrogram representing coexpression modules. Vertical lines represent genes. Lower values indicate greater similarity of transcript expression profiles across samples. Coexpression modules are indicated by color. (D) Correlations between gene modules and number of acute care visits for pain in the prior 3 years in 6 different centers using WGCNA. Columns represent different centers, number of participants from each center is displayed. Coexpression modules are represented by rows and labeled by color. The number of probe sets assigned to each module are indicated in the far-left column labeled gene (n). The P value shown within each box is between the eigengene (defined as the first principal component of a principal component analysis on the gene expression of all genes from a module) and number of acute care visits for pain in the previous 3 years for each center; all red shaded boxes represent correlation coefficient ≥|0.3| and P <.05. FDR-adjusted meta–P value is also displayed that was calculated for those modules that had significant correlations with clinical data from ≥50% of the centers (ie, ≥3 centers). N/A, not applicable; Q3, third quartile; RFU, relative fluorescence unit.
WGCNA output
There were 3 sites from the MAGiC trial excluded from WGCNA analyses because of small sample size (n = 10, n = 7, and n = 6 individuals, respectively). The study population for WGCNA and all subsequent analyses included 126 individuals with SCD. The scale independence and mean connectivity used to choose the soft threshold are shown in Figure 2B, with the power of 6 selected as the optimal soft threshold. Gene dendrogram, shown in Figure 2C, illustrates 11 clustered modules.
Figure 2D displays the WGCNA analyses results, depicting correlation and significance of identified modules’ eigengenes with pain history. There were 11 modules identified by WGCNA, 4 modules exhibited significant correlations with our primary outcome of pain history (ie, total number of hospitalizations for pain in the previous 3 years) in at least 50% of the centers (Figure 2D). These included the blue, green, red, and brown modules (Figure 2D). The FDR-adjusted meta–P values across 6 centers for these 4 modules were also significant (blue, P = 4.00E-06; green, P = .0001; red, P = .0002; brown, P = .002). These 4 modules encompass 1354 individual genes.
Ontological analyses using DAVID
Within the modules significantly correlated with pain frequency, there were 524 genes. These were subjected to ontological analysis, generation of the PPI network, and identification of the top hub genes. Gene expression levels of these significant genes (n = 524) are illustrated by the heat map in Figure 3A-B. Figure 3C includes an annotated heat map that displays variation in expression level of genes significantly associated with clinical pain data in our study cohort and are related to pain in the published literature. The majority of these 30 genes in Figure 3C are from the blue module (60% [n = 18]), 17% (n = 5) are from the red module, 13% (n = 4) are from the green module, and 10% (n = 3) are from the brown module. The gene list and corresponding module are included as supplemental Data 1.
Expression levels of transcripts identified through WGCNA. (A) Mean expression levels of 524 transcripts. Mean expression of all 524 genes for individuals with and without SCD from the single-center cohort (2 columns on left) and multicenter cohort (2 columns on right), each row represents a gene. (B) Individual expression levels of transcripts grouped by center. Variation in gene expression between individuals with SCD, separated by different centers (unique centers labeled A through F, total number of individuals in each center indicated in parentheses). The columns within each center group represent an individual from the respective center (labeled as A through F), each row represents a gene. (C) Annotated heat map of expression levels of pain-related genes. Variation in gene expression displayed of genes significantly associated with clinical pain data in the study cohort and that have been shown to be related to pain in the published literature. Each row represents a gene. The columns within each group (separated by centers A through F) represent an individual from the respective center. Most of the genes (60%) are from the blue module. Note: the individuals in panels B-C are ordered by hierarchical clustering that is done center by center using Genesis. For each center, hierarchical clustering was done based on gene expression level.
Expression levels of transcripts identified through WGCNA. (A) Mean expression levels of 524 transcripts. Mean expression of all 524 genes for individuals with and without SCD from the single-center cohort (2 columns on left) and multicenter cohort (2 columns on right), each row represents a gene. (B) Individual expression levels of transcripts grouped by center. Variation in gene expression between individuals with SCD, separated by different centers (unique centers labeled A through F, total number of individuals in each center indicated in parentheses). The columns within each center group represent an individual from the respective center (labeled as A through F), each row represents a gene. (C) Annotated heat map of expression levels of pain-related genes. Variation in gene expression displayed of genes significantly associated with clinical pain data in the study cohort and that have been shown to be related to pain in the published literature. Each row represents a gene. The columns within each group (separated by centers A through F) represent an individual from the respective center. Most of the genes (60%) are from the blue module. Note: the individuals in panels B-C are ordered by hierarchical clustering that is done center by center using Genesis. For each center, hierarchical clustering was done based on gene expression level.
The biological processes identified via DAVID for these 524 significant genes are displayed per module in Figure 4A. The top 10 ontological processes that included at least 5 significant genes at P <.01 are in Figure 4B. The highest ranked GO process identified was related to the inflammatory response (P = 1.67E-12; Figure 4B). Fold enrichment for each pathway is displayed. The list of genes that represent each GO term in Figure 4 are included as supplemental Data 2 and 3.
Ontological analysis using the DAVID. (A) Gene clusters for each of the significant modules as identified by WGCNA were analyzed. Displayed are the main GO terms reflected by the genes contained within each module. Also included are the numbers of genes represented by the GO process, the fold enrichment value, and the P value. (B) Significant genes from each of these modules were combined and then analyzed using DAVID to identify the most significant overall GO biological processes. Displayed are the top 10 most significant GO terms, gene count, and fold enrichment value. ∗The list of genes that represent each GO term shown in panels A-B are included as the supplemental Data 2 and 3.
Ontological analysis using the DAVID. (A) Gene clusters for each of the significant modules as identified by WGCNA were analyzed. Displayed are the main GO terms reflected by the genes contained within each module. Also included are the numbers of genes represented by the GO process, the fold enrichment value, and the P value. (B) Significant genes from each of these modules were combined and then analyzed using DAVID to identify the most significant overall GO biological processes. Displayed are the top 10 most significant GO terms, gene count, and fold enrichment value. ∗The list of genes that represent each GO term shown in panels A-B are included as the supplemental Data 2 and 3.
PPI network and hub genes
Construction of a PPI network revealed 361 nodes (proteins) and 357 edges (connections). The average node degree (number of connected nodes with the individual node) was 1.98, indicating the identified protein had an interaction with ∼2 other proteins on average. Figure 5 displays the top 10 of the 524 hub genes identified. These genes include tumor necrosis factor (TNF), CCL2, integrin subunit αM, integrin subunit αX, intercellular adhesion molecule 1 (ICAM1), C-C motif chemokine receptor 5 (CCR5), C-X-C motif chemokine ligand 2 (CXCL2), interferon gamma, CCR1, and CXCL3.
Top 10 hub genes as identified by cytoHubba. The level of ranking of the hub genes is indicated by color, with red indicating the highest rank and yellow representing the lowest rank. The lines indicate connections among genes.
Top 10 hub genes as identified by cytoHubba. The level of ranking of the hub genes is indicated by color, with red indicating the highest rank and yellow representing the lowest rank. The lines indicate connections among genes.
Validation of gene expression analyses
We found significantly elevated levels of CCL2 in the conditioned media of those with increased CCL2 gene expression when compared with the CCL2 levels of those with decreased CCL2 gene expression. Gene expression of CCL2 is displayed in Figure 3C and the comparison of CCL2 levels in the conditioned media between these 2 groups is displayed in Figure 6.
Comparison of CCL2 levels between individuals with high and low CCL2 gene expression. Plasma from individuals with previously identified high (n = 20) and low (n = 20) levels of CCL2 gene expression was cultured with PBMCs. The conditioned media from these cultures was retained and CCL2 levels were determined via enzyme-linked immunosorbent assay.
Comparison of CCL2 levels between individuals with high and low CCL2 gene expression. Plasma from individuals with previously identified high (n = 20) and low (n = 20) levels of CCL2 gene expression was cultured with PBMCs. The conditioned media from these cultures was retained and CCL2 levels were determined via enzyme-linked immunosorbent assay.
Discussion
We identified a set of genes induced by plasma from individuals with SCD that is associated with increased frequency of acute pain episodes. To our knowledge, this is the first study that uses plasma-based transcriptional analyses and unsupervised methods to determine a biological signature behind pain heterogeneity in individuals with SCD. Our approach allows for comprehensive unbiased gene expression analyses, which overcomes inherent biases when measuring multiple select plasma mediators, and it captures combinatorial effects of protein and nonprotein mediators.
Our findings suggest the induction of genes related to inflammatory pathways may collectively play a key role in the interindividual variability of SCD pain frequency. The GO term “inflammatory response” was the most significant biological process associated with pain frequency in our cohort. Other significant identified processes that contribute to enhanced inflammatory or immunoregulatory response include “cellular responses to lipopolysaccharide,” “positive regulation of chemokine production,” and “cytokine-mediated signaling pathway.” “Apoptotic processes,” also significantly represented by our data, are shown to contribute to the development of neuropathic pain and are often triggered by proinflammatory cytokines during cross talk with autophagic activities.24 Our data that show the association between genes related to inflammation and pain frequency are supported by prior SCD studies. Many inflammatory cytokines (ie, interleukin-1β [IL-1β], IL-6, IL-8, and TNF) and chemokines (ie, CCL2, CXCL10, and CXCL1) are significantly elevated in individuals with SCD during baseline health as compared with controls and increase further during acute pain.25-29 However, a knowledge gap exists in understanding how inflammatory mediators, when assessed during baseline health, differentiate the heterogeneity of SCD pain and how combinational effects affect pain in vivo. Our data show these inflammatory pathways could underly the interindividual variability and heterogeneity of SCD pain. Notably, these inflammatory pathways rose to the top regarding their association with pain frequency. Other SCD pain mechanisms could contribute to the variability in pain frequency including nervous system sensitization via mechanisms beyond neuroimmune interactions (eg, effects of lipids and microbiome), genetic differences in pain sensitivity, and/or altered coagulation mechanisms.
The role of immune mediators in SCD pain likely extends beyond vaso-occlusion. Chronic inflammation is a likely consequence of ongoing vaso-occlusion. However, the downstream effects of chronic inflammation on the neurobiology of pain is an active area of investigation in SCD pain– and non-SCD pain–related disorders,30 further emphasizing the importance of our findings. Immune mediators can interact directly with pain nociceptors in sensory neurons leading to neuronal sensitization and recurrent pain.31 Data show mast cell–induced neurogenic inflammation leading to substance P and calcitonin-gene–related peptide release is associated with SCD pain biology and contributes to hyperalgesia in SCD mice.32,33 Data also support that direct neural and mast cell trap interactions in SCD mice lead to neural injury, and inhibition of mast cell extracellular trap formation contributes to decreased pain behaviors in these mice.34 These data further support neuroimmune interactions in SCD and their contributions to pain. Because our unbiased method identified a biological signature heavily enriched with immune pathways that were associated with pain frequency, it is conceivable that neuroimmune interactions are a potential mechanism of recurrent and chronic SCD pain. Additional work is needed to understand neuroimmune interactions in SCD and their contribution to nervous system sensitization and SCD pain heterogeneity.
We identified TNF as the top hub gene in our data, indicating it may be a potential therapeutic target for SCD pain. Although TNF inhibitors have not yet been investigated for individuals with SCD, a case report of an individual with SCD who received adalimumab (TNF inhibitor) for psoriasis described decreased SCD pain intensity after drug initiation, with pain resolution at week-17 follow-up.35 Individuals with SCD and rheumatoid arthritis have been treated with TNF blockade (eg, etanercept, infliximab, and adalimumab) with improved arthritis symptoms and potential amelioration of SCD pain; however, it is challenging to discern pain attributable to arthritis or SCD.36 A SCD murine study suggested TNF blockade with etanercept has an ameliorating effect on inflammatory markers including VCAM1, IL-6, monocyte chemoattractant protein 1, substance P, and calcitonin-gene–related peptide. Behavioral pain was not evaluated in this etanercept study because of small number of available animals; however, data support a decrease in neuropeptides shown to contribute to pain.37 Thus, future work with TNF blockade may show an impact on pain behaviors in SCD mice. Another study found higher TNF-receptor 1 plasma levels in individuals with severe as compared with mild SCD, as evaluated by sickle cell anemia severity score (ie, estimates probability of dying from SCD complications within next 5 years).38 TNF has also been used to induce vaso-occlusive crises in SCD mice because it promotes red blood cell adhesion to endothelial cells and vessels and subsequent vaso-occlusive events.39,40 The use of TNF blockade as a biological treatment for SCD pain has not been investigated. Our data suggest this pathway warrants further exploration in SCD murine pain models and potentially in individuals with SCD based on preclinical data.
Other top hub genes we identified could be potential therapeutic SCD pain targets. CCL2, also called monocyte chemoattractant protein 1, has been described as an important contributor to chronic inflammation and neuropathic pain. It is shown to be elevated in SCD during and between crises.41 SCD murine model data show that inhibition of CCR2, the main receptor for CCL2, leads to alleviation of mechanical and cold behavioral hypersensitivity.42 Imatinib, which ameliorated hyperalgesia in SCD mice, was shown to inhibit CCL2, further supporting the potential mechanistic role of CCL2 in SCD pain biology and an important therapeutic target to further investigate.32 Our data show that CCL2 level was significantly higher in the conditioned media after stimulation of PBMCs by plasma from individuals with SCD collected during baseline state of health, further supporting the importance of this chemokine in SCD pain biology. ICAM1 has also been investigated in SCD. A study evaluating associations between functional single-nucleotide polymorphisms and pain intensity among individuals with SCD found those carrying 1 or 2 copies of the minor allele for ICAM1 reported lower average pain levels.43 Another study evaluated interferon gamma levels in SCD pain and found levels were significantly higher among patients with >3 vaso-occlusive crises per year.44 A possible explanation for this finding was the high percentage of CD4+CD28nul T cells, also identified in this study, which lose their sensitivity to apoptosis, cannot be suppressed, and thus drive an enhanced inflammatory response. Interestingly, the apoptotic process is 1 of the most significant GO processes identified in our study. In summary, these published findings support our data that identified similar genes and biological pathways in our transcriptional signature using an unbiased approach. Collectively, these data support our hypothesis that the collection of genes identified by our plasma-based transcriptional data may play a role in the biology of the interindividual variability in SCD pain frequency.
Our data support the discovery of potential novel pathways in the context of SCD pain. Some of the top 10 identified hub genes are known contributors to other painful and inflammatory conditions but have yet to be described in SCD pain. Integrin subunit αM and integrin subunit αX encode integrins that form leukocyte-specific complement receptors 3 and 4. These receptors are expressed on most white blood cells and contribute to adherence of neutrophils and monocytes to endothelial cells and phagocytosis of complement-coated particles.45 For CXCL2 and CXCL3, previous data show their expression is increased during cycling hypoxia.46 Because hypoxic conditions exacerbate sickling, vaso-occlusion, and chronic inflammation, CXCL2 and CXCL3 should be further investigated as potential SCD pain mediators. In addition, CXCL2 is shown to be elevated in a laboratory-induced trigeminal neuropathic pain model.47 Intrathecal injections of CXCL2 and CXCL3 induced pain-related behaviors in naïve mice. Furthermore, CXCL3 was involved in the development of neuropathic pain in this model, with reduced pain behavior after administration of CXCL3 neutralizing antibody.48 CCR1 and CCR5 are receptors for multiple chemokine ligands. CCR1 is shown to modulate pain by controlling neutrophil trafficking to the inflammatory site.49 CCR5 is a potential contributor to the development of neuropathic pain; it is upregulated after partial sciatic nerve ligation, with subsequent development of allodynia and thermal hyperalgesia. CCL5, a ligand for CCR1 and CCR5, is shown to contribute to inflammatory and neuropathic pain; data support pain induction after intradermal injection of CCL5 into rat paws. The mechanism of pain development is likely due to migrating macrophages and T cells to the inflammatory site.50 In summary, many of our top identified hub genes may be linked to pain in non-SCD pain models and disorders and deserve further investigation in SCD pain mechanisms.
Notably, 2 distinct pain constructs exist, 1 is differences in the biology that underlies variability in pain frequency, and the other is individual variability in the experience and physiological determinants of pain that dictates pain severity and analgesic needs despite the same underlying pain biology. Our work investigates pathways that may contribute to pain biology that drives varied pain frequency. This work was not intended to investigate individual variability in the experience and physiological determinants of pain. Regardless, the terminal event of pain the individual experienced was severe enough to seek acute care and this drove our primary outcome. The biological etiology driving pain could be different than the sensation of pain and the biology that leads to this sensation. Everyone has variability in their pain experience that may or may not lead them to seek acute care for pain. We decided a priori that if an individual sought acute care for pain that their pain reached a sufficient threshold for needing intervention. Because our data support that the identified pathways are significantly associated with frequency of acute care use for pain, it may be possible that the biology is contributing to both the event and perception of pain. However, the answer to this question requires future work.
Our study is limited by participants from multiple sites contributing data; it is possible that differences in sample collection and processing occurred despite directing sites to follow the same protocol. However, this limitation could also be considered a strength, thereby increasing generalizability of our findings because multisite data are more reflective of true participant heterogeneity, pain variability, and other factors that cannot be controlled. The use of acute care encounters to evaluate pain frequency likely underestimates the true frequency of pain episodes that an individual experiences because many with SCD manage their pain at home. However, this limitation does not invalidate our findings and would be problematic if acute care encounters overestimated the amount of pain. Study plasma samples were obtained from a biobank. Use of biobanked research samples is common and all sites were directed to use the same processing protocol and samples were preserved at −80°C. All samples were analyzed via a fresh thaw; the absence of prior freeze/thaw cycles sustained the quality/integrity of the samples and analyses. The 3-year period before blood collection used to define pain acute care encounters may not reflect the longitudinal pain experience and captures a snapshot in time for an individual and it is possible that those with no pain events minimized the differences that we may have found.
In conclusion, we identified distinct molecular expression profiles in individuals with SCD that are significantly associated with the frequency of pain. We also identified hub genes contained within these profiles that may be linked mechanistically to SCD pain and leveraged as potential therapeutic pain targets. Future work should focus on further validating this signature and investigating the potential targets uncovered for their mechanistic role in SCD pain. Future work will leverage these data to create a potential biomarker to differentiate pain phenotypes and prognosticate SCD severity that is rooted in the potential biological pathways associated with SCD pain.
Acknowledgments
The authors acknowledge all the dedicated individuals with sickle cell disease and healthy individuals who contributed their time, biospecimens, and data for this study. The authors also acknowledge the research coordinators who assisted in the collection and processing of the biospecimens for this study.
A.M.B. was supported by grants from the National Institutes of Health (NIH)/National Heart, Lung, and Blood Institute (NHLBI; 1R01HL142657-01) and NIH/National Institute of Neurological Disorders and Stroke (1R61NS114954-01 and R33NS114954-02). D.C.B. was supported by a grant from the NIH/NHLBI (5R01HD062347-03).
The views expressed are those of J.A.P. and do not necessarily represent the views of the NIH or the US Government.
Authorship
Contribution: L.M.K. assisted with conceiving the study, data analyses/interpretation, and writing and editing of the manuscript; S.J. conducted data analyses/interpretation and assisted with writing and editing the manuscript; A.S. assisted with data analyses/interpretation and writing and editing the manuscript; M.F.R. assisted with laboratory assay completion and analyses and editing of the manuscript; J.A.P. conceived the study and assisted with editing the manuscript; D.C.B. assisted with biospecimen collection and edited the manuscript; M.J.H. led the study, assisted with data analyses/interpretation, and wrote and edited the manuscript; and A.M.B. conceived and designed the study, led the study, and wrote and edited the manuscript.
Conflict-of-interest disclosure: A.M.B. is on an adjudication committee for a clinical trial sponsored by Pfizer (unrelated to the submitted research). The remaining authors declare no competing financial interests.
The current affiliation for J.A.P. is National, Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD.
Correspondence: Amanda M. Brandow, Section of Hematology/Oncology/Bone Marrow Transplantation, Department of Pediatrics, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226; email: abrandow@mcw.edu.
References
Author notes
Data are available on request from the corresponding author, Amanda M. Brandow (abrandow@mcw.edu).
The full-text version of this article contains a data supplement.