Abstract
In 2005, the hematopoietic cell transplantation-comorbidity index (HCT-CI) was introduced as a weighted scoring system to predict mortality risk following allogeneic HCT. Since then, not all investigators have been able to validate the HCT-CI at their respective institutions. In 2007, a collaborative multi-institutional study was initiated to investigate 1) whether the HCT-CI was predictive of outcomes across different institutions, 2) the degree of homogeneity of outcome prediction, and 3) the reasons for lack of agreement among investigators. To this end, data were collected from 3347 consecutive patients (pts) treated with allogeneic HCT between 2000 and 2006 from HLA-matched related or unrelated donors at 5 institutions. All data were collected by a single investigator, blinded to the final outcomes of pts, to ensure consistent comorbidity coding. Numbers of pts, percentages of available comorbidity data, and other transplant and pt characteristics differed statistically significantly among institutions (Table 1). Pts missing comorbidity or other covariate data were excluded from further analyses, yielding a final sample size of 2523.
Table 1

Characteristic | A (n=1073), % | B (n=973), % | C (n=336), % | D (n=237), % | E (n=206), % | p |
---|---|---|---|---|---|---|
Missing comorbidity data | <1 | 20 | 2 | 6 | 23 | <0.001 |
HCT-CI score: 0 | 29 | 30 | 32 | 42 | 32 | <0.001 |
HCT-CI score: 1–2 | 34 | 28 | 29 | 28 | 22 | |
HCT-CI score: ≥3 | 37 | 43 | 39 | 30 | 46 | |
Donor: unrelated | 50 | 38 | 51 | 40 | 31 | <0.001 |
Age ≥50 years | 42 | 29 | 47 | 21 | 51 | <0.001 |
Conditioning regimen: high-dose | 53 | 67 | 79 | 67 | 46 | <0.001 |
Conditioning regimen: reduced-intensity | 13 | 29 | 10 | 13 | 31 | |
Conditioning regimen: nonmyeloablative | 34 | 4 | 10 | 21 | 23 | |
ATG in regimen | 11 | 4 | 3 | 15 | 14 | <0.001 |
Diagnosis: myeloid | 63 | 56 | 59 | 57 | 51 | <0.001 |
Diagnosis: lymphoid | 28 | 41 | 38 | 25 | 46 | |
Diagnosis: other cancers | 2 | 3 | 1 | 3 | 1 | |
Diagnosis: non-malignant diseases | 7 | 0 | 2 | 15 | 4 | |
Disease risk: high | 59 | 62 | 67 | 51 | 67 | <0.001 |
Stem cell source: marrow | 19 | 19 | 24 | 56 | 10 | <0.001 |
Pt CMV: positive | 56 | 73 | 70 | 65 | 51 | <0.001 |
KPS ≤80 | 29 | 18 | 30 | 38 | 25 | <0.001 |
Prior regimens ≥4 | 23 | 22 | 24 | 20 | 30 | 0.25 |
Overall, pts with HCT-CI scores of 0 vs. 1–2 vs. ≥3 had 2-year non-relapse mortality (NRM) rates of 14%, 23%, and 39% (p <0.0001), respectively, and 2-year overall survival (OS) rates of 74%, 61%, and 39% (p <0.0001), respectively. Proportional hazards models were used to estimate the hazard ratios (HRs) for NRM and OS associated with HCT-CI scores at each of the 5 institutions (Table 2). The models were adjusted for the covariates listed in Table 1. Higher HCT-CI scores were associated with higher HRs for NRM and OS at all 5 institutions, and these increases were highly statistically significant except at institution E, which had the smallest sample size. Of note, the magnitudes of the increases in HRs were not entirely comparable across institutions. In a unified model including all institutions, we found a statistically significant lack of homogeneity across institutions for the HRs associated with scores 1–2 (p=0.03) and ≥3 (p=0.04) for NRM and with scores ≥3 (p=0.01) for OS, but not with scores 1–2 for OS (p=0.18). We also found a statistically significant, independent impact of institution on NRM (p=0.001) and OS (p<0.001).
Table 2

Institution | NRM HR, score 0 | NRM HR, score 1–2 | NRM HR, score ≥3 | NRM p | OS HR, score 0 | OS HR, score 1–2 | OS HR, score ≥3 | OS p |
---|---|---|---|---|---|---|---|---|
A | 1.0 | 1.4 | 2.5 | <0.0001 | 1.0 | 1.36 | 2.23 | <0.0001 |
B | 1.0 | 2.88 | 4.15 | <0.0001 | 1.0 | 1.88 | 2.77 | <0.0001 |
C | 1.0 | 1.3 | 3.62 | <0.0001 | 1.0 | 1.33 | 3.28 | <0.0001 |
D | 1.0 | 1.65 | 6.89 | <0.0001 | 1.0 | 1.84 | 5.81 | <0.0001 |
E | 1.0 | 1.76 | 2.66 | 0.09 | 1.0 | 1.13 | 2.28 | 0.09 |
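
The three HCT-CI score groups used in the analyses above (0, 1–2, and ≥3) can be written as a small helper. This is only an illustrative sketch: the function name `hct_ci_risk_group` and the ASCII group labels are assumptions made here, not part of the study's methods, and the function takes an already computed total score rather than deriving it from individual comorbidities.

```python
def hct_ci_risk_group(score: int) -> str:
    """Map a total HCT-CI score to the three groups used in the abstract.

    The name and the ASCII labels ("0", "1-2", ">=3") are illustrative
    assumptions; only the cut points come from the text.
    """
    if score < 0:
        raise ValueError("HCT-CI score cannot be negative")
    if score == 0:
        return "0"
    if score <= 2:
        return "1-2"
    return ">=3"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    for s in (0, 1, 2, 3, 7):
        print(s, hct_ci_risk_group(s))
```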
We then assessed, among 80 pts from institution A, the inter-observer variability in scoring comorbidity between two individual investigators and between each of them and unidentified individuals from a pool of other evaluators. Weighted kappa statistics were highest (0.59) between the two single evaluators and lower between each of them and the multiple evaluators (0.43 and 0.55, respectively). The principal investigator then developed a comprehensive guideline for coding comorbidities and used it to train the other single investigator in a single session. Additional evaluation of inter-observer agreement demonstrated marked improvement of the weighted kappa statistic to 0.78. The reported disagreements on the validity of the HCT-CI may be explained by different institutional experiences in managing transplant pts, small numbers of pts at some institutions, and inter-observer variability in score assignment.
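
For readers unfamiliar with the agreement measure used above, the following is a minimal sketch of Cohen's weighted kappa for two raters on ordered categories. The abstract does not state which weighting scheme was used, so the linear disagreement weights here (with quadratic weights available through `power=2`) are an assumption, as are the function name `weighted_kappa` and the made-up ratings in the demo.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, power=1):
    """Cohen's weighted kappa for two raters on ordered categories.

    Disagreement weights are (|i - j| / (k - 1)) ** power, i.e. linear for
    power=1 and quadratic for power=2. The weighting actually used in the
    study is not stated; linear weights are an assumption of this sketch.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of pts")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint counts and each rater's marginal counts.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    def weight(i, j):
        return (abs(i - j) / (k - 1)) ** power

    # Weighted observed disagreement and disagreement expected by chance.
    obs = sum(weight(i, j) * c / n for (i, j), c in observed.items())
    exp = sum(
        weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - obs / exp


if __name__ == "__main__":
    # Hypothetical score groups assigned by two evaluators to four pts.
    groups = ["0", "1-2", ">=3"]
    evaluator_1 = ["0", "1-2", ">=3", ">=3"]
    evaluator_2 = ["0", "1-2", "1-2", ">=3"]
    print(round(weighted_kappa(evaluator_1, evaluator_2, groups), 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; disagreements between adjacent categories are penalized less than disagreements between distant ones.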
The HCT-CI validly discriminates relative risks of mortality after HCT across different institutions and should be used routinely for counseling pts and for clinical trial design. Efforts to improve methods for coding comorbidity are in progress.
No relevant conflicts of interest to declare.