Key Points
Patient-specific comorbidities affect treatment selection and survival in myelofibrosis.
An extended DIPSS model including weight of patient-specific comorbidities reveals significantly higher discriminating power.
Abstract
Treatment decisions in primary myelofibrosis (PMF) are guided by numerous prognostic systems. Patient-specific comorbidities have influence on treatment-related survival and are considered in clinical contexts but have not been routinely incorporated into current prognostic models. We hypothesized that patient-specific comorbidities would inform prognosis and could be incorporated into a quantitative score. All patients with PMF or secondary myelofibrosis with available DNA and comprehensive electronic health record (EHR) data treated at Vanderbilt University Medical Center between 1995 and 2016 were identified within Vanderbilt’s Synthetic Derivative and BioVU Biobank. We recapitulated established PMF risk scores (eg, Dynamic International Prognostic Scoring System [DIPSS], DIPSS plus, Genetics-Based Prognostic Scoring System, Mutation-Enhanced International Prognostic Scoring System 70+) and comorbidities through EHR chart extraction and next-generation sequencing on biobanked peripheral blood DNA. The impact of comorbidities was assessed via DIPSS-adjusted overall survival using Bonferroni correction. Comorbidities associated with inferior survival include renal failure/dysfunction (hazard ratio [HR], 4.3; 95% confidence interval [95% CI], 2.1-8.9; P = .0001), intracranial hemorrhage (HR, 28.7; 95% CI, 7.0-116.8; P = 2.83e-06), invasive fungal infection (HR, 41.2; 95% CI, 7.2-235.2; P = 2.90e-05), and chronic encephalopathy (HR, 15.1; 95% CI, 3.8-59.4; P = .0001). The extended DIPSS model including all 4 significant comorbidities showed a significantly higher discriminating power (C-index 0.81; 95% CI, 0.78-0.84) than the original DIPSS model (C-index 0.73; 95% CI, 0.70-0.77). In summary, we repurposed an institutional biobank to identify and risk-classify an uncommon hematologic malignancy by established (eg, DIPSS) and other clinical and pathologic factors (eg, comorbidities) in an unbiased fashion. 
The inclusion of comorbidities into risk evaluation may augment prognostic capability of future genetics-based scoring systems.
Introduction
Myelofibrosis is a devastating myeloproliferative neoplasm (MPN) hallmarked by marrow fibrosis, symptomatic extramedullary hematopoiesis, and vascular thromboembolism.1 Debilitating symptoms, aberrant hematopoiesis, and considerable risk of leukemic transformation confer a substantial burden to this patient population beyond diminished overall survival (OS).2,3 Whereas primary myelofibrosis (PMF) is a de novo disease, secondary myelofibrosis (sMF) refers to clonal myelofibrosis that is preceded by another MPN (usually polycythemia vera or essential thrombocythemia). The most common molecular drivers include Janus kinase 2 (JAK2),4 calreticulin (CALR),5 and myeloproliferative leukemia virus oncogene (MPL) pathway mutations.6 Additional mutations in ASXL1, CBL, EZH2, IDH1, IDH2, RUNX1, SRSF2, and TP53 are associated with inferior OS.7
To guide treatment decisions, available prognostic systems for myelofibrosis have expanded beyond the first widely adopted tool, the International Prognostic Scoring System.8 These systems include the Dynamic International Prognostic Scoring System (DIPSS),9 DIPSS plus,10 Genetics-Based Prognostic Scoring System (GPSS),11 Mutation-Enhanced International Prognostic Scoring System (MIPSS),12 MIPSS70,13 MIPSS70+,14 Genetically Inspired Prognostic Scoring System, and the MPN personalized risk calculator.15,16 Each relies on validated disease-specific parameters including peripheral blood counts, disease-specific clinical characteristics, cytogenetics, or high molecular risk. Although comorbidities affect survival in myeloid neoplasms,17,18 patient-specific comorbidities (eg, renal failure, cardiac impairment, neurologic dysfunction) are not incorporated into current prognostic models. Retrospective evaluations of cohorts of myelofibrosis patients have noted decreased OS with increased comorbidity burden, as measured by the adult comorbidity evaluation-27 and hematopoietic cell transplantation comorbidity index.19,20 However, these evaluations are limited to the brief list of selected comorbidities included in each validated tool and do not account for the vast heterogeneity of comorbid conditions concurrent with the diagnosis of myelofibrosis that may influence treatment decision making and subsequent prognosis.
Dramatic improvements in comorbidity analysis have incorporated electronic health records (EHRs), the International Classification of Diseases, Ninth Revision (ICD-9), and groups of ICD, Ninth Revision, codes that better mimic clinical phenotypes (phecodes).21 EHRs contain the entirety of disease-specific characteristics used in PMF risk scores, while also containing patient-specific comorbidities that span the patient’s lifetime in that EHR (EHR lifetime). To achieve next-generation myelofibrosis risk prognostication, strategies must integrate EHR data, including unbiased and comprehensive comorbidity data, with the clinical and genomic data currently used.
Here, we present a different approach that uses our institutional patient DNA biorepository to create an unbiased EHR evaluation of all myelofibrosis patients’ comorbidities and their impact on survival based on their genotypic risk scores. We demonstrate that comorbidities, which current prognostic scores largely do not consider, could be a latent factor contributing to heterogeneity in risk category assignment.
Methods
Cohort identification
This study was approved by our institutional review board at Vanderbilt University and conducted in accordance with the Declaration of Helsinki.
We have identified a myelofibrosis cohort within Vanderbilt’s Synthetic Derivative (SD), a deidentified, research-dedicated mirror of the EHRs at Vanderbilt University Medical Center that contains >3 million distinct individual records corresponding to 148 million ICD codes, 125 million clinical notes, and 1.1 billion medication records. The SD is linked to BioVU, a DNA biorepository that includes DNA from the peripheral blood of ∼300 000 patients,22,23 from which several genome-wide association studies with reproducible genotype-phenotype associations have been generated.24-26
We interrogated the SD and BioVU database using a sensitive approach to identify possible myelofibrosis cases (Figure 1). As an exploratory endpoint, these possible cases were refined using a myelofibrosis algorithm that relied on ICD codes, natural language processing of physician notes discussing myelofibrosis, and medication history to identify high-probability cases. However, given the need for high specificity and reliability with PMF/sMF diagnosis, confirmation of diagnoses was based on hematologist manual review of accessible hematopathology details including bone marrow biopsy and cytogenetic reports, strictly adhering to the 2016 World Health Organization (WHO) criteria.1 When available, molecular variant testing was noted. Patients with only a single visit to our institution or transformation to acute myeloid leukemia (AML) at presentation were excluded. Automatic extraction of age, sex, race, and clinical and laboratory parameters was performed.
Strict inclusion criteria limited the cohort to a conservative “gold set” of patients with bone marrow meeting criteria for PMF/sMF per WHO criteria coincident to the detection of driver mutations and other prognostic features.1 Utilization of automatic extraction of EHR data led to 100% coverage of bone marrow biopsy reports, cytogenetic reports, pertinent chemistry and blood counts, and comorbidities among other data points. Evaluation excluded patients with transformation to acute leukemia within the first 30 days of referral or with a single visit to our institution.
Risk score recapitulation
Extracted comprehensive clinical, hematopathology, laboratory, and next-generation sequencing (NGS) data were included for prognosis evaluation. We recapitulated DIPSS, DIPSS Plus, MIPSS70+, and GPSS from the EHRs using published methods.9-11,14 The prognostic predictors of these methods included score-specific cutoffs for age, leukocyte count (white blood cells), hemoglobin (Hgb), platelets (PLT), and circulating myeloid blasts. Given heterogeneity in physician documentation of night sweats or fevers that occurred at home, we limited constitutional symptom status to the objective measurement of 10% weight loss compared with baseline. Transfusion dependence was extracted using algorithmic evaluation of Current Procedural Terminology coding and Hgb at the time of each patient visit, with restriction to the time after referral to our center. Only patients with bone marrow biopsy reports and karyotype data were included. Survival was calculated as the interval between myelofibrosis diagnosis or referral and death or last follow-up (censor); patients who underwent allogeneic hematopoietic stem cell transplant or transformed to AML were censored at that respective date, as similarly described in established risk scoring systems.13,10,15 Survival from myelofibrosis diagnosis was estimated using the Kaplan-Meier method, and the log-rank test was used to compare Kaplan-Meier survival curves. The discriminating power of a prognostic model was estimated using the Harrell concordance index (C-index), and bootstrap resampling with 1000 repetitions was used to compute the bias-corrected 95% confidence intervals (CIs) for the C-index measure.27
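As an illustration of the concordance measure described above, the following is a minimal sketch of the Harrell C-index with a bootstrap interval. It is not the authors' code: the function names and the toy data are hypothetical, and the interval shown is a plain percentile bootstrap, which is simpler than the bias-corrected interval used in the study.

```python
import random


def harrell_c_index(times, events, risk_scores):
    """Harrell concordance index for right-censored survival data.

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index_ci(times, events, risk_scores, n_boot=1000, seed=0):
    """Percentile bootstrap 95% interval (assumption: not bias-corrected)."""
    harrell_c_index(times, events, risk_scores)  # fail early if undefined
    rng = random.Random(seed)
    n = len(times)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        try:
            estimates.append(harrell_c_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risk_scores[k] for k in idx],
            ))
        except ValueError:
            continue  # this resample had no comparable pairs; draw again
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Hypothetical toy cohort: months of follow-up, death indicator, risk points.
toy_times = [5, 10, 15, 20, 25, 30]
toy_events = [1, 1, 0, 1, 0, 0]
toy_risk = [4, 3, 3, 2, 1, 0]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
toy_ci = bootstrap_c_index_ci(toy_times, toy_events, toy_risk, n_boot=200)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the DIPSS and extended DIPSS models are compared.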
Comorbidity analysis
We investigated the correlation between comorbidity burden and OS via a phenome-wide association study (PheWAS).28 Survival was calculated using the same method as described previously. To evaluate each patient’s overall comorbidity burden, we interrogated patient phecodes, which are grouped ICD-9 codes that better mimic clinical phenotypes. Specifically, we extracted all ICD codes within 365 days of diagnosis or referral and converted them to phecodes using the map available at https://phewascatalog.org/phecodes. We excluded PMF-specific disease-related phecodes such as AML and codes that corresponded to DIPSS-dependent variables (eg, leukocytosis). For deceased patients, we also excluded phecodes that were assigned fewer than 7 days before the date of death. As a result, we identified 374 phecodes at PMF/sMF diagnosis in our cohort of 193 patients. Based on this phecode list, we conducted PheWAS to test the relationship of each phecode with survival.28 We used the Cox proportional hazards model adjusted for the DIPSS predictors. Bonferroni correction was used to account for multiple comparisons, where significance is reported as P < .00013 (ie, α = 0.05/374).
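The Bonferroni threshold stated above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the P values are the rounded figures reported in the Results of this article, so values printed as .0001 are treated as exactly 1e-4.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# P values as reported (rounded) in the Results section of this article.
reported_p_values = {
    "renal failure": 1e-4,
    "intracranial hemorrhage": 2.83e-06,
    "invasive fungal infection": 2.90e-05,
    "encephalopathy": 1e-4,
    "pulmonary congestion": 0.004,
    "pneumonia": 0.004,
}

# 374 phecodes were tested, so the threshold is 0.05 / 374.
threshold = bonferroni_threshold(0.05, 374)

# Phecodes whose reported P value falls below the corrected threshold.
significant = sorted(
    name for name, p in reported_p_values.items() if p < threshold
)
```

Under this threshold, the 4 comorbidities that the article reports as significant pass, whereas the cardiopulmonary phecodes with P = .004 do not.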
Comorbidity burden and risk score agreement
To further investigate how comorbidity influences survival in myelofibrosis, we developed a simple prognosis model that accounts for comorbidity burden at PMF diagnosis. In this model, 4 prognosis groups were considered according to the quartile cutoff values for the number of distinct phecodes at PMF diagnosis. As a result, the low-risk quartile 1 group included patients with 0 to 2 phecodes; quartile 2 included patients with 3 to 7 phecodes; quartile 3 included patients with 8 to 17 phecodes; and the high-risk quartile 4 group included patients with ≥18 phecodes at PMF diagnosis. We computed Spearman rank correlations between the number of phecodes at diagnosis and prognosis points of the existing models.
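The quartile grouping described above amounts to a simple binning rule on the number of distinct phecodes. The following sketch shows one way to express it; the function names and the example phecode strings are hypothetical. Counts of 18 or more are assigned to quartile 4 here so that every count falls into exactly one group.

```python
def count_distinct_phecodes(phecodes):
    """Number of distinct phecodes recorded for one patient."""
    return len(set(phecodes))


def comorbidity_group(phecode_count):
    """Map a distinct-phecode count to comorbidity quartile 1 (low) to 4 (high).

    Cutoffs follow the ranges given in the text: 0-2, 3-7, 8-17, and 18 or more.
    """
    if phecode_count < 0:
        raise ValueError("phecode count cannot be negative")
    if phecode_count <= 2:
        return 1
    if phecode_count <= 7:
        return 2
    if phecode_count <= 17:
        return 3
    return 4


# Hypothetical patient with a repeated code; duplicates are counted once.
example_codes = ["585.1", "428.1", "585.1", "480"]
example_group = comorbidity_group(count_distinct_phecodes(example_codes))
```

Because the cutoffs are derived from this cohort's quartiles, they would need to be re-estimated before being applied to a different population.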
Correlations between these comorbidity quartiles and existing myelofibrosis risk score categories were assessed using the correlation coefficient, reported as r. Significance is reported as P < .05. Risk score agreement by comorbidity was further investigated by generating Sankey plots of competing risk scores within each comorbidity severity quartile. “Severity” in these quartiles is defined by the range of the number of phecodes, chosen to balance the 4 quartiles.
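For readers unfamiliar with the Spearman rank correlation mentioned above, the sketch below computes it as the Pearson correlation of average ranks. It is a standard textbook formulation rather than the authors' implementation, and the function names and toy values are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy data: phecode counts and risk points for 5 patients.
toy_phecode_counts = [0, 3, 8, 20, 5]
toy_risk_points = [0, 1, 2, 4, 1]
toy_r = spearman_r(toy_phecode_counts, toy_risk_points)
```

Ranking before correlating makes the measure insensitive to the skewed distribution of phecode counts, which is a common reason to prefer it over the Pearson coefficient for count data.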
Results
Patient clinical and laboratory characteristics
We identified 193 cases of PMF/sMF at Vanderbilt University from 1995 to 2016 that also had cytogenetic and bone marrow biopsy reports in addition to meeting strict WHO inclusion criteria. Patient characteristics and laboratory values are summarized in Table 1 and supplemental Figure 1, respectively. Median age of diagnosis was 59 years (range, 24-87), and 42% were female. Laboratory data were available on all 193 cases, and 35 patients were referred >1 year after diagnosis. Median OS was 39 months (range, 1-265), with 23 patients developing AML at a median time of 37 months (range, 1-265 months) from diagnosis. There were 40 patients treated with allogeneic hematopoietic stem cell transplant at a median time of 30 months from diagnosis.
Variable | All patients, No. (%) | Seen at VUMC within 1 y of diagnosis, No. (%) | Seen at VUMC after 1 y of diagnosis, No. (%) |
---|---|---|---|
No. of patients | 193 | 158 | 35 |
Median age | 59 | 59.5 | 56 |
Range | (24-87) | (24-87) | (34-78) |
Primary myelofibrosis | 156 (80.8) | | |
After polycythemia vera - myelofibrosis | 29 (15) | | |
After essential thrombocythemia - myelofibrosis | 26 (13.5) | | |
Female | 82 (42.5) | 66 (41.8) | 16 (45.7) |
Median follow-up, y (range) | 4 (1-22) | 2.5 (1-16) | 6 (1-22) |
Age >65 y | 63 (29.9) | 51 (32.3) | 12 (34.3) |
Laboratory characteristics | | | |
Leukocytes, ×10⁹/L | 193 | 158 | 35 |
Median | 13.8 | 13 | 16.4 |
Range | (0.8-256.3) | (0.8-256.3) | (1.7-179.6) |
Leukocytes >25 ×10⁹/L | 57 (29.5) | 42 (26.6) | 15 (29.5) |
Hemoglobin, g/dL | 193 | 158 | 35 |
Median | 9.3 | 8.95 | 9.9 |
Range | (4.8-15.7) | (4.8-15.7) | (5.3-15.7) |
Hemoglobin ≤10 g/dL | 109 (56.5) | 91 (57.6) | 18 (51.4) |
Platelets, ×10⁹/L | 193 | 158 | 35 |
Median | 147 | 145 | 151 |
Range | (5-1007) | (5-1007) | (8-712) |
Platelets <100 ×10⁹/L | 75 (38.9) | 61 (38.6) | 14 (40) |
Platelets <200 ×10⁹/L | 119 (61.7) | 98 (62) | 21 (60) |
Circulating blasts | 110 (57) | 90 (57) | 20 (57.1) |
Circulating blasts >1% | 83 (43) | 71 (44.9) | 12 (34.3) |
Circulating blasts >2% | 66 (34.2) | 55 (34.8) | 11 (31.4) |
Cytogenetics | 193 | 158 | 35 |
High risk (DIPSS Plus definition) | 12 (6.2) | 9 (5.7) | 3 (8.6) |
Very high risk (GPSS definition) | 10 (5.2) | 8 (5.1) | 2 (5.7) |
High risk (GPSS definition) | 32 (16.6) | 26 (16.5) | 6 (17.1) |
High risk (MIPSS70 Plus definition) | 22 (11.4) | 19 (12) | 3 (8.6) |
Next-generation sequencing | 140 (72.5) | 112 (70.9) | 28 (80) |
Mutation detected | 118 (84.3) | 96 (85.7) | 22 (78.6) |
Driver mutation | | | |
JAK2WT, MPLWT, and CALRWT (triple negative) | 71 (50.7) | | |
JAK2V617F | 78 (55.7) | | |
CALR mutation | 16 (11.4) | | |
MPL mutation | 9 (6.4) | | |
High molecular risk | | | |
ASXL1 | 17 (12.1) | | |
EZH2 | 8 (5.7) | | |
SRSF2 | 15 (10.7) | | |
IDH1/2 | 3 (2.1) | | |
≥2 High molecular risk (MIPSS 70+) | 8 (5.7) | | |
Outcomes | | | |
Allogeneic stem cell transplant | 40 (20.7) | 32 (20.3) | 8 (22.9) |
Leukemic transformation | 23 (11.9) | 20 (12.7) | 10 (28.6) |
Death | 42 (21.8) | 36 (22.8) | 6 (17.1) |
Patient clinical and laboratory characteristics stratified by time of referral to Vanderbilt University (N = 193).
VUMC, Vanderbilt University Medical Center.
DIPSS, DIPSS Plus, GPSS, and MIPSS70+
Standard scores for each prognostication model are summarized here and in Table 2. Within our DIPSS model (N = 191), the 5-year OS for each subgroup was as follows: low risk (94%), intermediate-1 (91%), intermediate-2 (78%), and high risk (0%); P = .001 (Figure 2A). For DIPSS, 2 patients were excluded because of an insufficient number of points. Integrating karyotype data and PLT count to calculate DIPSS Plus (N = 193) revealed 5-year OS for each subgroup: low risk (94%), intermediate-1 (93%), intermediate-2 (81%), and high risk (0%); P = .001 (Figure 2B). MIPSS70+ (N = 113) 5-year OS: low (96%), intermediate (93%), high (79%), and very high (32%); P ≤ .0010 (Figure 2C). In contrast, GPSS (N = 140) 5-year OS noted low risk (100%), intermediate-1 (100%), intermediate-2 (83%), and high risk (77%); P = .03 (Figure 2D).
Category | DIPSS, N (%) | DIPSS Plus, N (%) | MIPSS70+, N (%) | GPSS, N (%) | Comorbidity, N (%) |
---|---|---|---|---|---|
Very high | — | — | 8 (7) | — | — |
High | 10 (5) | 26 (13) | 47 (42) | 30 (21) | 47 (24) |
Intermediate-2 | 79 (41) | 82 (42) | — | 62 (44) | 41 (21) |
Intermediate-1 | 71 (37) | 57 (30) | — | 38 (27) | 71 (37) |
Intermediate | — | — | 19 (17) | — | — |
Low | 33 (17) | 28 (15) | 39 (35) | 10 (7) | 34 (18) |
No. evaluated | 193 | 193 | 113 | 140 | 193 |
No. excluded | 0 | 0 | 80∗ | 53∗∗ | 0 |
Total | 193 | 193 | 193 | 193 | 193 |
Total patients (N = 193) were evaluated for their respective DIPSS, DIPSS Plus, MIPSS70+, GPSS, and comorbidity risk scores. This table summarizes the heterogeneity of calculated risk among different risk scores.
Prognostic model | C-index | 95% CI |
---|---|---|
DIPSS | 0.73 | 0.70-0.77 |
DIPSS + comorbidity predictors | 0.81 | 0.78-0.84 |
Comorbidity analysis: PheWAS analysis
To conduct PheWAS,28 we evaluated 15 559 ICD codes corresponding to 1866 comorbidity phecodes, of which 374 were present at the onset of disease in our cohort. Comorbidity analysis revealed 4 phecodes significantly associated with reduced OS after Bonferroni correction (31 patients [16%]) and 23 phecodes with a strong trend toward statistical significance (P < .01) (Figure 3). In further sensitivity analysis, adding the distinction between sMF and PMF as a predictor did not change the pattern of significant phecodes or survival (Figure 2F), nor did it influence the C-index significantly (C = 0.808 vs 0.812 without the additional predictor). Diagnoses with adverse impact on survival included 3 uncommon diagnoses that were most commonly coded immediately preceding death: intracranial hemorrhage (hazard ratio [HR], 28.7; 95% CI, 7.0-116.8; P = 2.83e-06), invasive fungal infection (HR, 41.2; 95% CI, 7.2-235.2; P = 2.90e-05), and encephalopathy or coma (HR, 15.1; 95% CI, 3.8-59.4; P = .0001). With regard to intracranial hemorrhage, 3 patients carried this phecode, and in only 1 was it mechanistically connected to death. In contrast, the presence of renal failure in 22 patients was also significantly associated with decreased survival (HR, 4.3; 95% CI, 2.1-8.9; P = .0001), although in none of these patients was it connected with the instance of death. Eighteen of these patients appeared to have onset of renal dysfunction within 12 months of PMF/sMF diagnosis. Neoplastic disease–associated renal pathology is possible within this cohort, as other neoplastic parameters such as uric acid levels were elevated and correlated with the diagnosis of renal failure concurrent with PMF/sMF (N = 18, mean 9.3 mg/dL vs control [mean 7.6 mg/dL] without prior renal dysfunction, N = 132; P = .016).
The prognostic model using the DIPSS predictors and the 4 comorbidities found as significant after Bonferroni correction in the PheWAS analysis corresponds to a C-index of 0.81 (95% CI, 0.78-0.84), which is significantly higher compared with the discriminating power of the original DIPSS model estimated on our dataset (C-index 0.73; 95% CI, 0.70-0.77) (Table 3).
Additional phecodes that were associated with an adverse impact on survival but fell short of the Bonferroni cutoff included renal dysfunction (HR, 54.6; 95% CI, 5.9-506.2; P = .0004), as well as several cardiopulmonary comorbid conditions such as pulmonary congestion (HR, 6.0; 95% CI, 1.7-21; P = .004), cardiomyopathy (HR, 5.0; 95% CI, 1.3-19; P = .016), congestive heart failure (HR, 4.3; 95% CI, 1.3-14; P = .015), and pneumonia (HR, 3.3; 95% CI, 1.4-7; P = .004).
Comorbidity analysis: OS by comorbidity quartiles vs established risk scores
Comorbidity risk quartiles stratified by standard low, intermediate-1, intermediate-2, and high-risk groups demonstrate significant recategorization of risk with respect to 5-year OS (P < .001 by log-rank test) (Figure 2E): low comorbidity group (95%), intermediate-1 (92%), intermediate-2 (83%), and high risk (70%). There was no difference in age among quartiles (P = .29). However, laboratory values tracked with comorbidity burden, with high comorbidity patients having the greatest elevation in white blood cells (P < .0001), lowest Hgb (P < .0001), and lowest PLT (P < .0001) (supplemental Figure 2A-D). The 4 prognosis scores (DIPSS, DIPSS Plus, MIPSS70+, and GPSS) were each compared with comorbidity groups (Figure 4A-D). Comorbidity burden demonstrated correlation with DIPSS (r = 0.45, P = 2.8e-11) and DIPSS Plus (r = 0.51, P = 2.8e-14), but diminished correlation with MIPSS70+ (r = 0.27, P = .004) and no correlation with GPSS (r = 0.048, P = .57). This is consistent with the variance in risk assignment seen between molecular and nonmolecular based scoring systems (supplemental Figure 3A,B). The addition of cytogenetics, measurement of transfusion dependence, and thrombocytopenia in DIPSS Plus does enhance risk classification, but only slightly from DIPSS (supplemental Figure 3A). By contrast, DIPSS risk assignments are divergent from those of GPSS (supplemental Figure 3B). Separating the comparison of DIPSS vs GPSS by comorbidity quartile revealed different variance patterns from low to high risk. For example, the low-risk comorbidity group included a number of DIPSS low-risk patients who were reclassified as GPSS intermediate-2 or high risk (supplemental Figure 4). However, the high-risk comorbidity group included 0 DIPSS low-risk patients and had a prognosis consistent with GPSS high risk.
This would suggest that in low comorbidity states, disease-specific genomics can play an outsized role in prognosis, whereas in high comorbidity states, weight is shifted to the nongenomic features. This lends support that comorbidities can be the unmeasured variable that leads to inter-risk score variance.
Discussion
In this study, we leveraged EHR/biobank data and applied DIPSS, DIPSS Plus, MIPSS70+, and GPSS scoring systems for identified myelofibrosis patients. To our knowledge, there has not been an unbiased assessment of comorbidities augmenting the prognosis determined by traditional risk prediction tools. Intuitively, clinicians consider comorbidities when making treatment decisions for patients with MPNs, including when to enroll in clinical trials and when to refer to allogeneic SCT. Delineating which patient-specific comorbidities are most impactful has remained limited to physician judgment and experience with only scant validated tools such as the adult comorbidity evaluation-27.20 The standardization of significant comorbidities in genotypic prognostication could be valuable because patients with certain comorbidity profiles could be excluded from high-risk therapy. This is especially true because age is becoming less of a factor in SCT as reduced intensity conditioning and post-SCT supportive care improve.20,29
Our results revealed that despite current risk prediction assessments, survival curves change with the incorporation of comorbidity burden (Figure 2). These comorbidities were gathered at the time of diagnosis and within 1 year, which suggests that neoplasm-associated comorbidities occur more commonly than expected. The extended DIPSS model with comorbidity predictors showed a high degree of discriminatory power, as measured by the C-index (0.81 vs 0.73 for DIPSS alone). Interestingly, we note that survival curves of intermediate- and higher risk groups appear to overlap more consistently, suggesting that prognostic decision-making including comorbidities could account for variations not previously considered.
There was a high incidence and association of renal failure and dysfunction with reduced OS, and this may account for the decreased survival seen in the subgroup of patients with high comorbidity risk (Figure 2E). As might be expected, hyperuricemia in the first year after diagnosis was common compared with those patients with myelofibrosis who did not develop renal failure at any point. Renal complications of myelofibrosis are myriad and commonly involve renal-toxic medications, extramedullary disease in the kidney,30,31 glomerular disease,32 and cell lysis.33,34 Perhaps patients with more proliferative disease with hyperuricemia and resultant nephrotic stress account for this associated increased mortality imparting a clinical finding and comorbidity linked to disease biology. This may suggest that renal dysfunction, even mild, at the time of diagnosis should be explored further in future iterative prediction models.
Similar to previous published reports, highly morbid infectious complications in our cohort were associated with reduced OS.35 Infectious complications often coexist in PMF, particularly with treatment. With Bonferroni correction, invasive fungal infections were predictors of decreased survival. There was a statistical trend toward significance for bacterial infections after diagnosis of myelofibrosis. The former results are well-established harbingers of end-stage disease in immunocompromised patients with myeloid neoplasms, and the latter result is intriguing because infectious complications are intuitively a consideration in treatment decision-making. Similarly, the association of cardiopulmonary comorbidities including pulmonary hypertension and cardiomyopathy did not meet the stringent Bonferroni cutoff but did illustrate a strong trend toward significance, which is likely a function of the limitations of the size of our cohort and our strict inclusion criteria. Pulmonary hypertension from myeloproliferative disorders is categorized as WHO group V pulmonary hypertension and has been previously associated with poorer outcomes in PMF.36,37 This supports further investigation in larger data sets and perhaps prospective evaluation of myelofibrosis patients with echocardiography or brain natriuretic peptide levels.
We do note several limitations in the systematic use of EHR phecodes for comorbidity assessment. Many of the comorbidities identified could have secondary causes or be related to myelofibrosis itself. Though we made efforts to include only comorbidities coincident with onset of the diagnosis of myelofibrosis (Table 1), it is difficult for the algorithm to elucidate this precisely. Further, even if these comorbidities were caused by the patient’s MPN, they may still serve as a proxy of aggressive phenotypes that may not be captured by genotypic risk assessment. This approach has the potential to refine patient risk, avoid exclusion of patients on the basis of comorbidities of lesser importance, and, importantly, better counsel patients on treatment rationale. Although our method as a proof of concept identifies important signals of myelofibrosis-specific and nonspecific organ dysfunction, next-generation prediction models should consider accounting for this in clinical decision making, especially for transplant. For example, there is known variance between molecular and nonmolecular risk scores, and this variance seems to be preserved across the comorbidity spectrum. Although we evaluated specific comorbidities, we also measured the burden of comorbidities as a factor in risk assessments.
Integration of patient-specific comorbidities has the potential to serve as a de facto tiebreaker for difficult cases and may serve well to aid in decisions to proceed or not proceed with allogeneic stem cell transplant or other therapies. Our approach accurately identified both patients with PMF and sMF in the EHR, and each myelofibrosis risk model was preserved, illustrating a statistically significant survival distribution, despite survival analyses limited by the nature of unbiased retrospective review. Specifically, patients stratified from low to high risk among all 4 established risk scoring systems. To achieve this, we overcame the challenges of case identification including heterogeneity of clinician ICD coding and the considerable overlap between diagnostic criteria for other MPNs and subjective variability in clinical pathology.
For those patients who had clinical NGS-detected JAK2V617F mutation reported in the EHR, we found 97% agreement between EHR and biobanked NGS results. The capacity to retrospectively genotype patients who received care before the advent of routine NGS for myeloid disease represents a clear strength of working within an EHR and biobank resource. Importantly, we also detected cases of JAK2V617F-mutated myelofibrosis in patients who were not previously genotyped, which allowed for the addition of GPSS and MIPSS70+ even in patients whose disease course predated the discovery of JAK2V617F. This methodology creates the possibility for similar investigations across other hematologic malignancies and patients with clonal hematopoiesis but without hematologic malignancies.
PMF remains a complex and challenging disease that will require a continued effort to improve patient outcomes. BioVU is unique as a fully annotated deidentified patient record of millions of patients, and to our knowledge a similar deidentified data source with this level of necessary annotation is not available elsewhere. Still, we have demonstrated reliable identification of myelofibrosis within an EHR, and further implementation of natural language processing and data extraction algorithms is actively being pursued to improve our ability to identify hematologic malignancy in these databases. Our repurposing of an institution-wide biobank for hematologic malignancy evaluation, with the potential for clonal evolution assessment, is a novel way to study relatively large populations of otherwise rare myeloid disease. Further, we identified renal failure as a form of organ dysfunction that may have a sizeable effect on patient outcomes. In aggregate, our findings suggest that a more objective measurement of patient-specific comorbidities is needed to best individualize therapy in this highly comorbid patient population. For this type of novel analysis, a separate cohort to validate the methodology would be ideal. Although comorbidity assessment is instinctively a part of clinical practice, we demonstrate that an unbiased and systematic approach to quantifying the burden of comorbidities appears useful in aiding the clinician with risk assessment. By integrating these patient-specific comorbidity assessments and their impact on prognostic risk scores, our investigation continues to refine tools for selecting candidates for novel therapy, standard-of-care JAK inhibition, and allogeneic stem cell transplant.
Acknowledgments
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
The datasets used for the analyses described were obtained from Vanderbilt University Medical Center’s BioVU, which is supported by the CTSA grant ULTR000445 from the National Institutes of Health National Center for Advancing Translational Sciences. Research reported in this publication was supported by the National Institutes of Health under award numbers T32GM007347 (National Institute of General Medical Sciences) and the Ruth L. Kirschstein National Research Service Award F30DK127699 (National Institute of Diabetes and Digestive and Kidney Diseases). A.B. is supported by National Institutes of Health Office of the Director grant DP5-OD029586 and a career award for medical scientists from the Burroughs Wellcome Foundation. M.S. is a Leukemia and Lymphoma Society Clinical Scholar and is supported by the E.P. Evans Foundation, the Biff Ruttenberg Foundation, the Adventure Alle Fund, the Beverly and George Rawlings Directorship, and National Institutes of Health National Cancer Institute grant RO1CA262287.
Authorship
Contribution: A.L.S., C.B., and M.R.S. designed the research study; A.L.S., C.B., A.P., and M.R.S. wrote the paper; A.L.S., C.B., S. Zhao, A.P., T.P.S., A.J.S., S.S.S., D.D., K.P., M.B., N.S., S. Zhang, C.S., Y.X., and M.R.S. performed the research; A.L.S., C.B., S. Zhao, N.S., S. Zhang, T.S., A.G.B., Y.X., and M.R.S. analyzed the data; and M.R.S. supervised the study.
Conflict-of-interest disclosure: M.S. reports membership on a board or advisory committee for AbbVie, Bristol Myers Squibb, CTI, Geron, Karyopharm, Novartis, Ryvu, Sierra Oncology, Taiho, Takeda, and TG Therapeutics; patents and royalties with Boehringer Ingelheim; research funding from ALX Oncology, Astex, Incyte, Takeda, and TG Therapeutics; equity ownership with Karyopharm and Ryvu; and consultancy for Karyopharm and Ryvu. A.N. reports serving on a data monitoring committee for MEI and speaker bureaus for Incyte and Novartis. The remaining authors declare no competing financial interests.
Correspondence: Michael R. Savona, Medicine and Cancer Biology, Vanderbilt University School of Medicine, 2200 Pierce Ave, Preston Research Building 777, Nashville, TN 37232; e-mail: michael.savona@vanderbilt.edu.
References
Author notes
Sequencing data can be found at https://prod.tbilab.org/myelofibrosis_comorbidity. Please direct other inquiries to the corresponding author, Michael R. Savona (michael.savona@vanderbilt.edu).
The full-text version of this article contains a data supplement.
A.L.S. and C.A.B. contributed equally to this work.