The global severity score (GS) was devised by experts during the 2005 NIH Consensus for chronic graft-versus-host disease (GVHD) to reflect the overall severity of chronic GVHD at any time point. According to this scale, overall involvement is categorized as mild, moderate or severe. Several studies have demonstrated that the NIH GS is associated with subsequent mortality in patients with chronic GVHD. Using multivariate time-varying analysis that accounts for longitudinal changes in the severity of individual organs, we sought to develop a better global mortality score than the NIH GS.
Data were derived from the Chronic GVHD Consortium, a prospective, multicenter, longitudinal, observational study. A total of 2202 visits from 574 adult patients with systemically treated chronic GVHD through January 2013 were used for analysis. At follow-up visits every 3-6 months, clinicians reported standardized information including NIH organ severity scores. Objective medical data, including laboratory and pulmonary function test results, were abstracted through standardized chart review after each visit. Time-varying Cox models that accounted for varied times of entry into the study and for longitudinal changes in organ severity were used to correlate severity in individual sites with overall mortality. All models were adjusted for time after transplantation, study site and known chronic GVHD mortality risk factors. A random sample of 424 patients with 1602 visit ratings was used to develop a new global mortality score, and the remaining 150 patients with 600 visit ratings were used to validate the model. Risk stratification was compared between the final model and the NIH GS model.
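As an illustration of the data layout that a time-varying Cox model with delayed entry typically expects, the following minimal sketch converts one patient's visits into counting-process intervals (start, stop, event). The function name, the tuple layout and the example values are hypothetical and are not taken from the study; the time scale is left to the analyst.

```python
def to_intervals(visits, end_time, died):
    """Convert one patient's visits into (start, stop, event, scores) rows.

    visits   : list of (visit_time, organ_scores) pairs (hypothetical layout)
    end_time : time of death or last follow-up, on the same time scale
    died     : True if follow-up ended with death, False if censored
    """
    visits = sorted(visits, key=lambda v: v[0])
    if not visits:
        raise ValueError("at least one visit is required")
    if end_time <= visits[-1][0]:
        raise ValueError("end_time must be later than the last visit")

    rows = []
    for i, (start, scores) in enumerate(visits):
        is_last = i == len(visits) - 1
        # Scores recorded at a visit apply until the next visit (or end of follow-up).
        stop = end_time if is_last else visits[i + 1][0]
        # Only the final interval can carry the death event.
        event = 1 if (died and is_last) else 0
        rows.append((start, stop, event, scores))
    return rows


# Hypothetical patient: enrolled at month 6, lung score worsens at month 12,
# dies at month 20. The first interval starts at enrollment, not at time 0,
# which is how delayed entry is represented.
example = to_intervals(
    [(6, {"lung": 0}), (12, {"lung": 1})], end_time=20, died=True
)
```

Each row contributes to the risk set only between its start and stop times, so a patient's organ scores can change from visit to visit without double counting follow-up.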
In the training phase, multivariate models showed that organ severity scores in skin, gastrointestinal tract, liver and lung were independently associated with overall mortality (Table 1), while scores in mouth, eyes, joints and fascia, and genital tract were not. Mortality scores were assigned based on observed hazard ratios, and 3 risk groups were identified based on the total mortality score (Table 2). In the validation cohort (n=150), the intermediate-risk (329 visit ratings, HR 3.7, p=0.05) and high-risk categories (89 visit ratings, HR 10.8, p=0.001) were associated with higher risk of overall mortality compared with the low-risk category (185 visit ratings). Using the NIH GS instead, the severe category (196 visit ratings, HR 3.1, p=0.01) was associated with higher risk of overall mortality compared with the mild or moderate category (22 plus 382 visit ratings). Although agreement in risk categories between the two models was low (weighted kappa = 0.32), the models did not differ statistically based on log likelihood ratios, suggesting that they performed comparably in identifying groups with different mortality risks.
The influence of organ scores on mortality differs by site, with the largest influence from lung, followed by liver, skin and gastrointestinal tract. Model fit did not differ statistically between the 2 models, suggesting that the NIH GS is an adequate model for mortality-based risk stratification in patients with chronic GVHD.
Table 1. Organ severity scores and overall mortality in the training cohort
Organ and score | No. of visits | HR | P | Mortality score |
Skin | ||||
Score 0 | 806 | 1.00 | – | 0 |
Score 1 | 283 | 1.49 | 0.20 | 0 |
Score 2 | 343 | 2.14 | 0.013 | 1 |
Score 3 | 170 | 2.73 | 0.002 | 1 |
GI | ||||
Score 0 | 1265 | 1.00 | – | 0 |
Score 1 | 282 | 1.71 | 0.047 | 1 |
Score 2-3* | 55 | 1.54 | 0.35 | 1 |
Liver | ||||
Score 0 | 982 | 1.00 | – | 0 |
Score 1 | 393 | 1.29 | 0.34 | 0 |
Score 2 | 175 | 1.70 | 0.11 | 0 |
Score 3 | 52 | 3.71 | 0.004 | 2 |
Lung | ||||
Score 0 | 821 | 1.00 | – | 0 |
Score 1 | 564 | 2.10 | 0.009 | 1 |
Score 2 | 186 | 4.54 | <0.001 | 2 |
Score 3 | 31 | 9.71 | <0.001 | 4 |
*Scores 2-3 were combined since score 3 occurred in only 6 visits.
Table 2. Risk categories by total mortality score in the training cohort
Risk category | Total score | No. of visits | HR | P |
Low | 0 | 457 | 1.00 | – |
Intermediate | 1-2 | 948 | 2.95 | 0.002 |
High | ≥3 | 197 | 10.6 | <0.001 |
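The scoring scheme in Tables 1 and 2 can be sketched as a small lookup. This is a minimal illustration, assuming the total mortality score is the sum of the per-organ mortality scores across skin, GI, liver and lung; the function and variable names are hypothetical.

```python
# Mortality score per NIH organ severity score, transcribed from Table 1.
# GI scores 2 and 3 share one value because they were combined in the analysis.
MORTALITY_POINTS = {
    "skin":  {0: 0, 1: 0, 2: 1, 3: 1},
    "gi":    {0: 0, 1: 1, 2: 1, 3: 1},
    "liver": {0: 0, 1: 0, 2: 0, 3: 2},
    "lung":  {0: 0, 1: 1, 2: 2, 3: 4},
}


def total_mortality_score(organ_scores):
    """Sum the mortality scores for the four organs (assumed additive)."""
    total = 0
    for organ, points in MORTALITY_POINTS.items():
        severity = organ_scores[organ]  # KeyError if an organ is missing
        if severity not in points:
            raise ValueError(f"invalid {organ} score: {severity}")
        total += points[severity]
    return total


def risk_category(total):
    """Map a total mortality score to the risk groups of Table 2."""
    if total == 0:
        return "low"
    if total <= 2:
        return "intermediate"
    return "high"


# Hypothetical visit: skin score 2, no GI or liver involvement, lung score 2.
visit = {"skin": 2, "gi": 0, "liver": 0, "lung": 2}
visit_total = total_mortality_score(visit)   # 1 + 0 + 0 + 2 = 3
visit_risk = risk_category(visit_total)      # "high"
```

Under this reading, a lung score of 3 alone (4 points) places a visit in the high-risk group, consistent with lung having the largest hazard ratio in Table 1.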
No relevant conflicts of interest to declare.