Introduction: Capturing venous thromboembolism (VTE) outcomes using only International Classification of Diseases (ICD) codes can lead to misclassification of events and to inaccurate conclusions. Manual chart review, the gold standard, is labor-intensive and not always feasible. The current study aims to validate an improved computable VTE phenotype algorithm that combines ICD codes with natural language processing (NLP) in patients undergoing hematopoietic cell transplantation (HCT).

Methods: All patients undergoing first allogeneic HCT at Fred Hutchinson Cancer Center (FHCC) from 2006-2019 were included in the current study. To capture as many VTE events as possible, we applied a sensitive screening method to the period from one year before to one year after the transplant date. The screen encompassed 1) all patients with ICD-9 and ICD-10 codes for acute, chronic, or historical pulmonary embolism (PE), deep venous thrombosis (DVT), and phlebitis and thrombophlebitis, and 2) all patients with at least 1 radiology report containing a pertinent VTE-related keyword from venous doppler ultrasound, contrast computed tomography, or ventilation-perfusion scans. All patients in this screened subset were then reviewed by chart abstractors (JA, KM) to establish the gold standard of incident VTE events, defined as the new onset of radiologically confirmed PE, lower extremity DVT, or upper extremity/catheter-related DVT within 1 year post-transplant.

We then tested the performance of the acute VTE ICD-9/10 codes (selected codes from 415, 451, 453; I26, I80, I82) from inpatient or outpatient encounters between stem cell infusion and 1 year post-transplant. We also tested the utility of an NLP algorithm that was previously validated in a separate cancer cohort using unstructured radiology impressions (PMID: 35647478). Finally, we evaluated the combination of these two algorithms against the gold standard and report positive predictive value (PPV) and sensitivity (Sn).
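The comparison against the gold standard can be illustrated with a minimal sketch, assuming each algorithm yields a set of flagged patient identifiers. The function name, variable names, and toy identifiers below are hypothetical and are not taken from the study; combining the two algorithms is modeled here as a simple set union (ICD or NLP).

```python
def sensitivity_and_ppv(flagged, gold):
    """Return (sensitivity, PPV) for flagged patient IDs versus gold-standard IDs."""
    true_positives = len(flagged & gold)
    # Sensitivity: share of gold-standard events that the algorithm flagged.
    sensitivity = true_positives / len(gold) if gold else float("nan")
    # PPV: share of flagged patients that are gold-standard events.
    ppv = true_positives / len(flagged) if flagged else float("nan")
    return sensitivity, ppv


# Toy identifiers for illustration only (not study data).
gold = {1, 2, 3, 4}        # chart-review confirmed VTE
icd_pos = {1, 2, 5, 6}     # flagged by acute ICD codes
nlp_pos = {2, 3, 7}        # flagged by the radiology NLP algorithm
combined = icd_pos | nlp_pos  # flagged by ICD or NLP

for name, flagged in [("ICD", icd_pos), ("NLP", nlp_pos), ("ICD or NLP", combined)]:
    sn, ppv = sensitivity_and_ppv(flagged, gold)
    print(f"{name}: Sn={sn:.2f}, PPV={ppv:.2f}")
```

In this toy example the union raises sensitivity relative to either algorithm alone, the same trade-off the combined screen is designed to probe.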

Results: Among 2,879 patients who underwent allogeneic HCT over 15 years, 740 (26%) met study inclusion criteria. Based on the gold standard of detailed medical record review, 275 (10%) had a radiologically confirmed VTE event within 1 year post-transplant (Figure 1). A further 389 (14%) had historical VTE events confirmed before transplant. The acute VTE ICD-9/10 codes identified 339 patients and the NLP algorithm predicted VTE in 245 patients (155 patients were flagged by both).

The ICD codes alone for acute VTE had an estimated Sn of 73% and PPV of 59%. The NLP radiology algorithm alone for VTE had an estimated Sn of 73% and PPV of 82%. Approximately 3 in 10 VTE events were missed by each algorithm when used alone. However, the combination of ICD or NLP identified 245/275 of all VTE events (Sn 89%). The PPVs for ICD+/NLP+, ICD-/NLP+, and ICD+/NLP- were 89%, 64%, and 27%, respectively (Table 1). In summary, patients with concordant ICD/NLP predictions had excellent PPVs, and approximately 8% of patients, those with discordant ICD/NLP predictions (n=234/2,879), would require additional chart review to achieve a final PPV >90% and Sn >90%.
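The reported percentages can be recomputed from the reported counts as a quick arithmetic check. This is a sketch only; the helper name and dictionary keys are hypothetical labels, and all counts are those stated in the Results.

```python
TOTAL_HCT = 2879   # patients who underwent allogeneic HCT
GOLD_VTE = 275     # gold-standard VTE events within 1 year post-transplant


def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


recomputed = {
    "met inclusion criteria": pct(740, TOTAL_HCT),
    "confirmed VTE post-transplant": pct(GOLD_VTE, TOTAL_HCT),
    "historical VTE before transplant": pct(389, TOTAL_HCT),
    "Sn of ICD or NLP": pct(245, GOLD_VTE),
    "discordant, needing chart review": pct(234, TOTAL_HCT),
}

for label, value in recomputed.items():
    print(f"{label}: {value}%")
```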

Conclusion: In the current study, the sensitivity of either the acute ICD codes or the NLP algorithm alone was suboptimal (each missing about 3 in 10 events), and a combined screen should be considered (missing about 1 in 10). The use of ICD-9/10 codes alone for new VTE had poor accuracy in our cohort (PPV of 59%), suggesting that additional features, such as concurrent anticoagulant medications, are needed. In contrast, the NLP algorithm was validated with a high PPV of 82% (89% when combined with a positive acute ICD screen) in the current cohort and may not require additional confirmation, though caution is warranted when applying it in other studies that lack dedicated follow-up in which radiology reports are captured and stored in one unified healthcare system. One limitation of the study is the lack of review of patients who screened negative on the initial ICD code and radiology keyword searches. While the initial screen was designed to be highly sensitive, we may have missed a small number of true VTE events, and the reported Sn in this study represents the best-case scenario. In conclusion, while computable phenotype algorithms represent a promising future for the identification of VTE, a hybrid approach that reserves manual chart review for cases where the ICD and NLP screens disagree may provide the highest yield and help minimize labor-intensive manual review.
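The hybrid approach described above can be sketched as a simple triage rule. This is an illustrative assumption about how such a workflow might be coded, not the study's implementation; the function name, the label strings, and the toy screen results are hypothetical, and treating double-negative patients as having no VTE is an assumption of the sketch.

```python
from collections import Counter


def triage(icd_positive: bool, nlp_positive: bool) -> str:
    """Route a patient based on agreement between the ICD and NLP screens."""
    if icd_positive and nlp_positive:
        return "accept_vte"        # concordant positive: accept without review
    if icd_positive or nlp_positive:
        return "manual_review"     # discordant: send to chart abstraction
    return "no_vte"                # concordant negative (assumed no event)


# Toy screen results for illustration only: (icd_positive, nlp_positive).
screens = [(True, True), (True, False), (False, True), (False, False), (False, False)]
workload = Counter(triage(icd, nlp) for icd, nlp in screens)
print(workload)
```

Counting the `manual_review` bucket gives the chart-review workload that such a rule would generate in a given cohort.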

Lee: Amgen: Research Funding; AstraZeneca: Research Funding; Equillium: Consultancy, Honoraria; Incyte: Research Funding; Kadmon: Consultancy, Honoraria, Research Funding; Mallinckrodt: Consultancy, Honoraria; National Marrow Donor Program: Membership on an entity's Board of Directors or advisory committees; Novartis: Membership on an entity's Board of Directors or advisory committees; Pfizer: Research Funding; Syndax: Research Funding. Rojas Hernandez: ANTHOS Therapeutics: Research Funding; ASPEN Pharmaceuticals: Research Funding; Daiichi Sankyo: Research Funding.

* Asterisk with author names denotes non-ASH members.
