Key Points
Quantitative CT lung strain can diagnose early BOS before decline in percent predicted forced expiratory volume in 1 second.
Quantitative CT lung strain can distinguish types of BOS.
Abstract
Bronchiolitis obliterans syndrome (BOS) after hematopoietic cell transplantation (HCT) is associated with substantial morbidity and mortality. Quantitative computed tomography (qCT) can help diagnose advanced BOS meeting National Institutes of Health (NIH) criteria (NIH-BOS) but has not been used to diagnose early, often asymptomatic BOS (early BOS), limiting the potential for early intervention and improved outcomes. Using pulmonary function tests (PFTs) to define NIH-BOS, early BOS, and mixed BOS (NIH-BOS with restrictive lung disease) in patients from 2 large cancer centers, we applied qCT to identify early BOS and distinguish between types of BOS. Patients with transient impairment or healthy lungs were included for comparison. PFTs were done at months 0, 6, and 12. Analysis was performed with association statistics, principal component analysis, conditional inference trees (CITs), and machine learning (ML) classifier models. Our cohort included 84 allogeneic HCT recipients, 66 with BOS (NIH-defined, early, or mixed) and 18 without BOS. All qCT metrics had moderate correlation with forced expiratory volume in 1 second, and each qCT metric differentiated participants with BOS from those without BOS (non-BOS; P < .0001). CITs distinguished 94% of participants with BOS vs non-BOS, 85% of early BOS vs non-BOS, and 92% of early BOS vs NIH-BOS. ML models diagnosed BOS with area under the curve (AUC) of 0.84 (95% confidence interval [CI], 0.74-0.94) and early BOS with AUC of 0.84 (95% CI, 0.69-0.97). qCT metrics can identify individuals with early BOS, paving the way for closer monitoring and earlier treatment in this vulnerable population.
Introduction
Bronchiolitis obliterans syndrome (BOS) is an obstructive lung disease associated with chronic graft-versus-host disease (cGVHD) after hematopoietic cell transplantation (HCT) that leads to progressive morbidity and is associated with a 5-year survival of 40%.1-3 National Institutes of Health (NIH) criteria for BOS require substantial impairment in pulmonary function testing (PFT), which limits the utility of screening PFT to detect early disease and contributes to high rates of false-negative testing.4,5 Delayed diagnosis of BOS worsens pulmonary morbidity, can result in treatment initiation after the onset of irreversible airway fibrosis, and can increase economic burden, complicating the care of HCT recipients.6,7 Novel diagnostic markers are needed for more accurate diagnosis of BOS, including earlier in the course of the disease.
Quantitative computed tomography (qCT) of the chest can use the wealth of data in inspiratory and expiratory CT scans to generate voxel-level measures of pulmonary function that augment PFT and are associated with important patient-related outcomes such as shortness of breath, functional capacity, and mortality.8 qCT metrics include threshold-based measures, such as parametric response mapping (PRM), a technique that identifies functional small airway disease (ie, air trapping) that is the hallmark of BOS; and strain metrics that quantify the magnitude of volume change (Jacobian) and the asymmetry of volume change across space (anisotropic deformation index [ADI]).9,10 ADI measures whether deformation across 3 dimensions occurs in equal magnitude (ie, isotropy) or, if not, the degree to which deformation occurs in unequal magnitude (ie, anisotropy).11 Reduced Jacobian and reduced ADI have been shown to correlate with increased disease severity of chronic obstructive pulmonary disease (COPD) as assessed by Global Initiative for Chronic Obstructive Lung Disease stage.12,13 Although thoracic CT evaluations are common in HCT recipients to diagnose lung infections,14 quantitative imaging has not been fully used to diagnose noninfectious pulmonary complications after HCT.
Previously, we have applied PRM and quantified air trapping to identify advanced BOS after HCT.15,16 Analyses with qCT have not been used to distinguish early BOS from those with transient impairment or to differentiate types of BOS. To examine the utility of lung strain as a diagnostic biomarker for BOS, we analyzed HCT recipients with cGVHD at 2 comprehensive cancer centers with quantitative measurements of inspiratory/expiratory CT scans performed at the time of diagnosis with BOS or of pulmonary impairment by PFT not meeting NIH criteria for BOS. We hypothesized that lung strain metrics would assist in the diagnosis of early BOS and help differentiate subtypes of BOS.
Materials and methods
Study participants
We included recipients of allogeneic HCT from MD Anderson Cancer Center and Stanford University Hospital who received a diagnosis of pulmonary impairment between 2018 and 2021. Inclusion criteria were age at least 18 years and qCT within 1 month of the diagnosis of pulmonary impairment by PFT, or in the case of a control group with no lung disease, qCT within 3 months of the diagnosis of cGVHD requiring systemic immune modulation. Participants were additionally required to have PFT at 6 and 12 months thereafter. We excluded patients who did not receive follow-up PFT, had initial evidence of acute lower respiratory tract infection, or had pre-HCT interstitial or obstructive lung diseases (eg, asthma or COPD) that might confound the diagnosis of BOS. Study approval was obtained from each institute’s institutional review board (MD Anderson Cancer Center institutional review board 2018-0288; Stanford University Hospital institutional review board 66591). Data were collected and analyzed in compliance with the Health Insurance Portability and Accountability Act.
Definitions
A panel of 3 clinicians (H.S., J.H., and A.S.) adjudicated the clinical diagnosis and the severity of disease for patients. NIH-BOS was defined by forced expiratory volume in 1 second (FEV1)/forced vital capacity of <0.7, percentage predicted FEV1 (FEV1%) of <75%, evidence of air trapping, and absence of infection, as detailed fully in supplemental Material in the section on NIH criteria for defining BOS.17 Early BOS was defined as an irreversible decline by month 6 of ≥10% in absolute FEV1 and not meeting NIH criteria, a definition that has been applied to HCT recipients in prior studies.4,5 Mixed BOS was NIH-BOS with concomitant sclerotic skin cGVHD of the thorax, resulting in additional extraparenchymal restriction as defined by a ≥10% reduction in total lung capacity or, when total lung capacity was not available, a ≥20% reduction in forced vital capacity compared with baseline.18 Transient impairment included persons who had obstructive physiology by PFT at month 0 that resolved by month 6. Controls were individuals with cGVHD requiring systemic immune medication who did not have evidence of lung disease.
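The PFT-based part of these definitions can be sketched as a simple rule. The following is a minimal illustration only, not the adjudication procedure used by the clinician panel: the names `PFT`, `meets_nih_bos`, and `classify` are hypothetical, the ≥10% decline is assumed to be a relative decline from a reference FEV1 to the month 6 FEV1, and the sketch does not capture irreversibility, mixed BOS, or transient impairment.

```python
from dataclasses import dataclass


@dataclass
class PFT:
    """One pulmonary function test (field names are illustrative)."""
    fev1_l: float         # absolute FEV1 in liters
    fvc_l: float          # forced vital capacity in liters
    fev1_pct_pred: float  # percentage predicted FEV1 (FEV1%)


def meets_nih_bos(pft: PFT, air_trapping: bool, infection: bool) -> bool:
    """NIH-BOS as summarized in the text: FEV1/FVC < 0.7, FEV1% < 75%,
    evidence of air trapping, and absence of infection."""
    return (
        pft.fev1_l / pft.fvc_l < 0.7
        and pft.fev1_pct_pred < 75
        and air_trapping
        and not infection
    )


def classify(reference_fev1_l: float, month6: PFT,
             air_trapping: bool, infection: bool) -> str:
    """Return 'NIH-BOS', 'early BOS', or 'neither'.

    Assumption: the >=10% decline is computed relative to a reference
    FEV1 (for example, the value before the decline began).
    """
    if meets_nih_bos(month6, air_trapping, infection):
        return "NIH-BOS"
    decline = (reference_fev1_l - month6.fev1_l) / reference_fev1_l
    if decline >= 0.10:
        return "early BOS"
    return "neither"
```

In this sketch a participant whose FEV1 falls from 2.8 L to 2.4 L (a 14% decline) with a preserved FEV1/FVC ratio would be labeled early BOS, because the NIH thresholds are not met.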
CT scanning method and analysis
Inspiratory/expiratory CT scans of the chest were deidentified and analyzed in the Digital Imaging and Communications in Medicine format with strain map analysis by software from VIDA Diagnostics, Inc. CT scans were performed at full inspiration (total lung capacity) and end expiration (residual volume), using helical CTs (Siemens Force, Siemens Medical Systems or GE Discovery CT750 HD, GE Healthcare). Further radiographic technical details are included in supplemental Material, Radiology section. Strain mapping included calculation of Jacobian, ADI, disease probability measure (DPM) of percentage air trapping, and percentage normal lung.19,20 DPM is a probabilistic measure of air trapping, which contrasts with the discrete Hounsfield unit thresholds used to calculate the air trapping metric of PRM. DPM air trapping has been found to correlate with PRM air trapping but to differ in magnitude.21 Jacobian of >1.0 indicates lung expansion and <1.0 lung contraction. ADI starts at 0, which indicates symmetric expansion across 3 dimensions. As ADI increases above 0, it represents increasing asymmetry of expansion. The relationship of Jacobian and ADI to lung physiology is delineated in supplemental Material in the section on Jacobian and Anisotropic Deformation Index. To gain insight into the heterogeneity of strain metrics within an individual’s lungs, we measured the intralung standard deviation of Jacobian (JacobianSD) and of ADI (ADISD).
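To make the two strain quantities concrete, the sketch below computes a Jacobian and an ADI for a single voxel from its three principal stretches (the stretch ratios along the principal axes of deformation). This is an illustration under stated assumptions: the ADI formula used here is one published formulation from the lung registration literature and is assumed, not confirmed, to match what the commercial software computes; the function names are hypothetical.

```python
import math


def jacobian(l1: float, l2: float, l3: float) -> float:
    """Local volume ratio: the product of the three principal stretches.
    Values > 1.0 indicate expansion and values < 1.0 contraction."""
    return l1 * l2 * l3


def adi(l1: float, l2: float, l3: float) -> float:
    """Anisotropic deformation index (assumed formulation).

    With stretches sorted so that a >= b >= c:
        ADI = sqrt(((a - b) / b)^2 + ((b - c) / c)^2)
    ADI is 0 when all three stretches are equal (isotropic deformation)
    and grows as the stretches become more unequal.
    """
    a, b, c = sorted((l1, l2, l3), reverse=True)
    return math.sqrt(((a - b) / b) ** 2 + ((b - c) / c) ** 2)


# Isotropic expansion: volume increases, ADI is exactly 0.
print(jacobian(1.2, 1.2, 1.2), adi(1.2, 1.2, 1.2))

# Anisotropic expansion: one direction stretches more than the others.
print(jacobian(1.5, 1.2, 1.0), adi(1.5, 1.2, 1.0))
```

The intralung standard deviations (JacobianSD and ADISD) would then be the standard deviations of these voxel-level values across the lung.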
Statistical methods
Pearson correlation was performed between strain metrics and FEV1%. The Kruskal-Wallis test was used to assess differences across groups, and the Wilcoxon rank-sum test was used for pairwise comparisons between groups. The Benjamini-Hochberg procedure was applied to correct for multiple comparisons. The χ2 test was used for associations with categorical variables. P values ≤.05 were considered significant. Principal component (PC) analysis (PCA) was performed with strain metrics (Jacobian, JacobianSD, ADI, ADISD, and DPM air trapping) on the 3 disease states, transient impairment, and controls. Binary recursive partitioning with conditional inference trees (CITs) was conducted using patient classifications as outcomes and strain metrics as covariates. For CITs, a χ2 comparison of classifications was performed at each node of the decision tree, and a split was made only when P ≤ .05, which selected the lung strain covariates with the strongest association with the outcome. Monte Carlo simulations with 1000 repeats were done to assess the distribution of tree permutations. Machine learning (ML) classifiers were applied using lung strain metrics as covariates. A K-nearest neighbor (KNN) model was used to distinguish any form of BOS from transient impairment and controls; a Bayesian model was used to distinguish early BOS from transient impairment and controls; and a KNN model was used to distinguish early BOS from NIH-BOS. Training occurred with 10-fold cross-validation on a 75% training set, with model performance assessed on an unexposed, held-out 25% test set. Additional details for the ML analysis are in supplemental Material in the section on class imbalance in machine learning models. All statistical analyses were performed in R version 4.3.2.
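As an illustration of the multiple-comparison step, the Benjamini-Hochberg adjustment can be written in a few lines. The study itself used R; this Python sketch is only a restatement of the standard procedure, and the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each P value is multiplied by m / rank (m = number of tests, rank =
    position when sorted ascending), and a running minimum taken from the
    largest P value downward keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

A comparison would then be called significant when its adjusted P value is ≤.05, matching the threshold stated above.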
Results
A total of 84 participants were included in our analysis, comprising patients from MD Anderson Cancer Center (n = 26) and Stanford University Hospital (n = 58). Based on our a priori classification of PFT and clinical course, 79% of participants were classified as having BOS (NIH-BOS [n = 47], early BOS [n = 13], and mixed BOS [n = 6]), and the remaining did not develop BOS (transient impairment [n = 8], and control [n = 10]). Participant demographics are presented in Table 1. Age, sex, and race were similar among groups. No significant differences were seen in demographics between the 2 cancer centers (see supplemental Table 1). Acute myeloid leukemia (37%) was the most common reason for transplant, followed by acute lymphoblastic leukemia (16%) and myelodysplastic syndrome (14%). The majority of patients had either matched related donors (48% of total) or matched unrelated donors (42% of total), and the majority of patients had myeloablative conditioning. Pretransplant PFT values were similar for patients across disease states.
Characteristic | NIH-BOS (n = 47) | Early BOS (n = 13) | Mixed BOS (n = 6) | Transient impairment (n = 8) | Control (n = 10) | P value∗ |
---|---|---|---|---|---|---|
Age, y (mean, SD) | 53.6 (14.1) | 54.9 (14.5) | 55.8 (13.8) | 43.8 (15.6) | 46.3 (15.9) | .24 |
Sex (male, n, %) | 19 (40.4) | 7 (53.8) | 3 (50.0) | 5 (62.5) | 6 (60.0) | .64 |
Race/ethnicity (n, %) | | | | | | .91 |
Asian | 6 (12.8) | 1 (7.7) | 1 (16.7) | 3 (37.5) | 1 (10.0) | |
Black | 1 (2.1) | 0 | 0 | 0 | 0 | |
Latino | 6 (12.8) | 2 (15.4) | 0 | 1 (12.5) | 1 (10.0) | |
White | 34 (72.3) | 10 (76.9) | 5 (83.3) | 4 (50.0) | 8 (80.0) | |
Indication for transplant (n, %) | | | | | | .01 |
ALL | 8 (17.0) | 0 | 1 (16.7) | 3 (37.5) | 1 (10.0) | |
AML | 19 (40.4) | 5 (38.5) | 3 (50.0) | 2 (25.0) | 2 (20.0) | |
Aplastic anemia | 0 | 0 | 0 | 0 | 2 (20.0) | |
CLL | 2 (4.3) | 0 | 0 | 0 | 3 (30.0) | |
CML | 2 (4.3) | 2 (15.4) | 0 | 1 (12.5) | 0 | |
CMML | 1 (2.1) | 0 | 0 | 0 | 0 | |
GATA immunodeficiency | 0 | 0 | 1 (16.7) | 0 | 0 | |
HD | 0 | 1 (7.7) | 1 (16.7) | 0 | 1 (10.0) | |
MDS | 8 (17.0) | 3 (23.1) | 0 | 1 (12.5) | 0 | |
Myelofibrosis | 3 (6.4) | 1 (7.7) | 0 | 0 | 0 | |
NHL | 4 (8.5) | 1 (7.7) | 0 | 1 (12.5) | 1 (10.0) | |
Donor CMV positive (n, %) | 25 (53.2) | 7 (53.8) | 3 (50.0) | 7 (87.5) | 4 (40.0) | .35 |
Recipient CMV positive (n, %) | 29 (61.7) | 8 (61.5) | 3 (50.0) | 6 (75.0) | 3 (30.0) | .32 |
Matching status (n, %) | | | | | | .001 |
Haploidentical | 2 (4.3) | 0 | 0 | 0 | 1 (10.0) | |
Matched related donor | 24 (51.1) | 6 (46.2) | 3 (50.0) | 3 (37.5) | 4 (40.0) | |
Matched unrelated donor | 21 (44.7) | 7 (53.8) | 2 (33.3) | 5 (62.5) | 0 | |
Umbilical cord blood | 0 | 0 | 0 | 0 | 1 (10.0) | |
Unrelated donor | 0 | 0 | 1 (16.7) | 0 | 4 (40.0) | |
Myeloablative conditioning (n, %) | 33 (70.2) | 8 (61.5) | 3 (50.0) | 7 (87.5) | 5 (50.0) | .41 |
Acute GVHD (n, %) | 34 (72.3) | 9 (69.2) | 1 (16.7) | 4 (50.0) | 4 (40.0) | .04 |
Nonpulmonary organs with cGVHD (mean, SD) | 2.64 (1.45) | 2.64 (1.03) | 3.00 (1.10) | 1.67 (1.15) | 1.50 (0.76) | .12 |
Pre-HCT FEV1, L (mean, SD) | 2.92 (0.63) | 3.23 (1.00) | 2.82 (0.78) | 3.32 (0.62) | 3.00 (0.63) | .49 |
Pre-HCT FEV1 percent predicted (mean, SD) | 98.48 (20.24) | 96.77 (12.47) | 95.50 (20.11) | 102.88 (10.51) | 90.80 (9.40) | .65 |
Pre-HCT FVC, L (mean, SD) | 3.79 (0.86) | 4.13 (1.12) | 3.76 (0.93) | 4.05 (0.75) | 3.79 (0.85) | .82 |
Pre-HCT FVC percent predicted (mean, SD) | 98.97 (18.46) | 96.31 (10.44) | 94.50 (18.04) | 103.38 (11.55) | 88.70 (6.86) | .31 |
Months from HCT to study onset (mean, SD) | 35.89 (41.79) | 24.12 (16.14) | 54.82 (21.30) | 34.50 (37.02) | 41.67 (35.15) | .53 |
ALL, acute lymphoblastic leukemia; AML, acute myeloid leukemia; CLL, chronic lymphocytic leukemia; CMML, chronic myelomonocytic leukemia; CML, chronic myelogenous leukemia; CMV, cytomegalovirus; FVC, forced vital capacity; HD, Hodgkin disease; MDS, myelodysplastic syndrome; NHL, non-Hodgkin lymphoma.
∗Kruskal-Wallis test was used for group-wise statistical significance.
Qualitative imaging attributes are associated with PFT and qCT
Given that radiology studies on BOS identify diagnostic criteria such as mosaic attenuation (MA), airway dilation, bronchial wall thickening, and centrilobular nodules, we analyzed these attributes as interpreted by radiologists vs FEV1% and qCT metrics.22-24 χ2 analysis revealed that expiratory MA (P = .0001), wall thickening (P = .0045), and airway dilation (P = .0046) achieved statistical significance in distinguishing between the 5 patient groups (supplemental Table 2). We then assessed the frequency of each imaging attribute and found that they did not distinguish among disease states (supplemental Table 3). NIH-BOS had the highest airway dilation (66%) and thickening (64%), but MA was highly variable across groups. No qualitative imaging attribute distinguished between early BOS and transient impairment. Control and transient impairment CT scans were largely normal on qualitative inspection, although inspiratory MA was seen for all individuals with transient impairment and expiratory MA was seen for all individuals who were controls.
To understand the link between these radiologist-assigned qualitative imaging attributes and quantitative metrics such as PFT and qCT, we used univariate logistic regression to analyze the association of each radiology attribute (the outcome) with FEV1% or with the qCT metrics air trapping, Jacobian, or ADI. Inspiratory MA (P = .0004), airway dilation (P = .0001), bronchial wall thickening (P = .0035), and centrilobular nodules (P = .0310) were associated with lower FEV1% (supplemental Table 4). Inspiratory MA (P = .001) and wall thickening (P = .0052) were associated with greater percentage air trapping (supplemental Table 5). Inspiratory MA, wall thickening, and airway dilation were associated with lower Jacobian (P = .0023, .0017, and .0064, respectively) and lower ADI (P = .0003, .0069, and .0054, respectively; supplemental Tables 6 and 7). Together, these results show that although some qualitative imaging findings such as inspiratory MA and wall thickening are associated with FEV1% and qCT metrics, they are limited in that they do not distinguish between patient groups and do not provide sufficient information to differentiate types of BOS.
qCT correlates with PFT
Table 2 shows the correlation of FEV1% with strain metrics. In all participants, FEV1% had a moderate positive correlation with normal lung percentage (R = 0.671; 95% confidence interval [CI], 0.532-0.775), Jacobian (R = 0.666; 95% CI, 0.527-0.771), JacobianSD (R = 0.635; 95% CI, 0.487-0.750), ADI (R = 0.626; 95% CI, 0.476-0.741), and ADISD (R = 0.632; 95% CI, 0.483-0.746). FEV1% had a moderate negative correlation with air trapping percentage (R = −0.665; 95% CI, −0.770 to −0.525). Table 3 displays the numerical values of strain metrics along with initial FEV1% by group classification. NIH-BOS had the lowest FEV1% (51%), the lowest percentage of normal lung (54%), and the highest percentage of air trapping (37%); its mean Jacobian and mean ADI were the lowest of all groups apart from mixed BOS, which had nearly identical values. Mean FEV1% was 77% ± 10% for early BOS, which was higher than for NIH-BOS (51% ± 12%) and lower than for transient impairment (85% ± 5%) and controls (90% ± 12%).
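As a worked check on the reported intervals, a 95% CI for a Pearson correlation can be approximated with the Fisher z transformation. The text does not state which CI method was used, so this is an assumption; the function name is hypothetical. With R = 0.671 and n = 84, the sketch yields an interval of roughly 0.53 to 0.77, close to the reported 0.532-0.775.

```python
import math


def pearson_ci(r: float, n: int, z_crit: float = 1.959964):
    """Approximate CI for a Pearson correlation via the Fisher z transform.

    z = atanh(r) is approximately normal with standard error
    1 / sqrt(n - 3); the interval is back-transformed with tanh.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# FEV1% vs normal lung percentage in all 84 participants.
low, high = pearson_ci(0.671, 84)
print(round(low, 2), round(high, 2))  # prints 0.53 0.77
```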
qCT metric | Correlation coefficient | 95% CI | P value |
---|---|---|---|
Normal lung percentage | 0.671 | 0.532-0.775 | <.0001 |
Air trapping percentage | −0.665 | −0.770 to −0.525 | <.0001 |
Jacobian mean | 0.666 | 0.527-0.771 | <.0001 |
JacobianSD | 0.635 | 0.487-0.750 | <.0001 |
ADI mean | 0.626 | 0.476-0.741 | <.0001 |
ADISD | 0.632 | 0.483-0.746 | <.0001 |
Measure | NIH-BOS (n = 47) | Early BOS (n = 13) | Mixed BOS (n = 6) | Transient impairment (n = 8) | Control (n = 10) | P value∗ |
---|---|---|---|---|---|---|
FEV1 % month 0, mean (SD) | 51.45 (12.32) | 77.23 (9.99) | 70.67 (8.04) | 84.50 (5.13) | 90.30 (11.54) | <.001 |
Normal lung %, mean (SD) | 53.78 (23.47) | 73.51 (12.47) | 59.92 (21.28) | 88.95 (6.34) | 87.11 (11.87) | <.001 |
Air trapping %, mean (SD) | 36.80 (17.61) | 22.95 (12.71) | 32.42 (18.71) | 9.59 (5.04) | 11.14 (9.90) | <.001 |
Jacobian, mean (SD) | 1.62 (0.31) | 1.92 (0.44) | 1.61 (0.29) | 2.29 (0.24) | 2.29 (0.44) | <.001 |
JacobianSD, mean (SD) | 0.32 (0.17) | 0.47 (0.23) | 0.35 (0.19) | 0.71 (0.17) | 0.69 (0.26) | <.001 |
ADI, mean (SD) | 0.30 (0.11) | 0.38 (0.13) | 0.30 (0.08) | 0.55 (0.13) | 0.51 (0.17) | <.001 |
ADISD, mean (SD) | 0.16 (0.06) | 0.20 (0.08) | 0.17 (0.05) | 0.32 (0.07) | 0.28 (0.09) | <.001 |
Each value is reported by mean and SD.
∗Kruskal-Wallis test was used for group-wise statistical significance.
qCT differentiates BOS types
Figure 1 illustrates representative coronal cuts of Jacobian and ADI maps for groups in the cohort, with corresponding lung strain ranges displayed. Figure 2 shows violin plots of strain metrics across disease states, revealing notable separation between Jacobian, JacobianSD, ADI, and ADISD when comparing (1) NIH-BOS with early BOS, (2) NIH-BOS with control and transient impairment, and (3) early BOS with control and transient impairment. For all metrics, early BOS had strain metric values in between those for NIH-BOS and transient impairment or controls. Groupwise comparison achieved statistical significance (P < .0001) for each metric.
Because of the favorable performance of individual strain metrics in discriminating among disease types, we applied PCA to explore aggregate associations with the 5 metrics (air trapping, Jacobian, JacobianSD, ADI, and ADISD; Figure 3). In our analysis, the first 2 PCs captured 95% of the data set’s total variance. In the figure an arrow’s direction represents the relative contribution of a given metric to the statistical variance of the data set. Accordingly, the PCA illustrates that air trapping discriminated persons with NIH-BOS from those classified as having healthy lungs (controls) and those with transient impairment. Conversely, Jacobian and ADI discriminated controls and persons with transient impairment from those with NIH-BOS. For those with early BOS, air trapping, Jacobian, and ADI were intermediate between NIH-BOS and controls or transient impairment. Thus, PCA was congruent with the clinical expectation that individuals with NIH-BOS had the most severe air trapping and the lowest Jacobian and ADI. Individuals with early BOS had moderate impairment as represented by strain metrics, and participants with transient impairment and controls had the least impairment by strain metrics. Taken together, the data suggest that lung strain metrics are associated with the expected physiology of types of BOS, including early BOS.
We then examined the role of CIT decision tree analysis to identify numerical thresholds of strain metrics that characterize BOS types (ie, NIH-BOS, early BOS, and mixed BOS). Figure 4A and supplemental Table 8 show that Jacobian and air trapping successfully distinguished classifications via sequential partitioning by CIT. Classifications were initially discriminated by Jacobian > 1.98 separating 100% of individuals with transient impairment (n = 8) and 70% of controls (n = 7) from all other individuals (node 1-5, P < .001). After this separation, the vast majority of the remaining 62 individuals in node 2 had some form of BOS (n = 59). A subsequent filtering from node 2 to 3 of individuals with air trapping of ≤33% separated 80% (n = 8) of those with early BOS from other disease states (P = .048). This implies that the first step to identify patients who do not have BOS is to assess for a large volume expansion of the lungs between respiratory phases (ie, Jacobian > 1.98). If an individual has a smaller volume expansion of the lungs (Jacobian ≤1.98) as well as air trapping of ≤33%, they are more likely to have early BOS. Together, this analysis reveals specific lung strain thresholds that can be applied to individuals to differentiate among BOS types, including early BOS.
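The two sequential thresholds described above can be written as a short rule to show how the partitioning reads in practice. This is a sketch of the reported splits only, not a validated clinical tool: the function name and the wording of the returned labels are hypothetical, and the example inputs are the group means from Table 3 rather than individual patients.

```python
def triage_by_strain(jacobian_mean: float, air_trapping_pct: float) -> str:
    """Apply the two CIT thresholds reported in the text, in order.

    1. Jacobian > 1.98 separated most participants without BOS.
    2. Among the rest, air trapping <= 33% separated most early BOS.
    """
    if jacobian_mean > 1.98:
        return "BOS less likely"
    if air_trapping_pct <= 33:
        return "early BOS more likely"
    return "other disease state more likely"


# Group means from Table 3 (Jacobian, air trapping %).
print(triage_by_strain(2.29, 9.59))   # transient impairment mean
print(triage_by_strain(1.92, 22.95))  # early BOS mean
print(triage_by_strain(1.62, 36.80))  # NIH-BOS mean
```

Applied to the group means, the rule places transient impairment above the Jacobian threshold, early BOS in the low air trapping branch, and NIH-BOS in the remaining branch, which agrees with the ordering described in the text.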
qCT can diagnose BOS vs non-BOS
To investigate the utility of algorithmic analysis in identifying individuals with BOS vs individuals without BOS (non-BOS), we analyzed lung strain metrics with CIT decision trees and ML algorithms. We used the algorithmic partitioning of CIT decision trees to study any form of BOS (NIH-BOS, early BOS, and mixed BOS) as a composite group (n = 66) compared with participants with transient impairment and controls (n = 18). As shown in Figure 4B, a CIT decision tree found that sequential thresholds of JacobianSD ≤0.62 (P < .001) and air trapping of >11% (P < .003) distinguished 89% (59/66) of individuals with BOS from those without BOS (see supplemental Table 9 for number of patients per node). Specifically, JacobianSD ≤0.62 differentiated 62 of 66 patients with BOS (moving from node 1 to 2). Of the remaining 62 patients with BOS in node 2, the second filtering with air trapping of >11% differentiated 59 of 62 patients with BOS (moving from node 2-4). This implied that strain metric thresholds showing homogeneity of lung expansion (ie, low JacobianSD) and moderate air trapping of >11% were useful to identify persons with any form of BOS from those without BOS.
Based on this result of finding strain metric thresholds that identify BOS, we built a diagnostic tool with a KNN classifier model, using strain metrics as covariates (air trapping, Jacobian mean, JacobianSD, ADI mean, and ADISD) and the 2 composite groups as classes. The model identified individuals with any form of BOS with area under the curve (AUC) of 0.84 (95% CI, 0.74-0.94), statistical accuracy of 0.84, sensitivity of 0.85, and specificity of 0.83. The positive predictive value (PPV) of the model was 0.95, and the negative predictive value (NPV) was 0.60. Together, these results indicate that qCT strain metrics can be used as a clinical tool to identify patients at high risk for having BOS and can diagnose BOS with a high degree of accuracy.
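For readers less familiar with AUC, it can be computed directly as the probability that a randomly chosen participant with BOS receives a higher model score than a randomly chosen participant without BOS. The sketch below uses this rank-based definition; the function name and the scores are hypothetical and are not study data.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities of BOS on a held-out test set.
bos = [0.9, 0.8, 0.4]
non_bos = [0.5, 0.3]
print(auc(bos, non_bos))  # 5 of 6 pairs are correctly ordered
```

An AUC of 0.5 corresponds to chance ordering and 1.0 to perfect separation, which gives context for the reported value of 0.84.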
qCT can diagnose early BOS
To address the clinical challenge of distinguishing early BOS vs transient impairment or controls, we applied a CIT decision tree algorithm using strain metrics and found that air trapping alone achieved a statistically significant difference (P = .002) in partitioning those with early BOS. When air trapping was >11%, the decision tree separated 85% (n = 11) of patients with early BOS (node 1 to 3 in supplemental Table 10). When air trapping was ≤11%, the decision tree separated 78% (n = 14) of patients with transient impairment or controls (node 1 to 2). To diagnose early BOS, we then built a Naïve Bayes ML classifier model with strain metrics as covariates and early BOS (n = 13) vs transient impairment or controls (n = 18) as classes. Early BOS was detected with AUC of 0.84 (95% CI, 0.69-0.97), accuracy of 0.85, sensitivity of 0.77, and specificity of 0.89. There was a high level of true positive and true negative detection, with PPV of 0.83 and NPV of 0.84.
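The test characteristics quoted here all derive from a 2 × 2 confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical, chosen only to match the group sizes (13 early BOS, 18 non-BOS) and to land near the reported rounded values, and they are not the confusion matrix from the study.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard test characteristics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among diseased
        "specificity": tn / (tn + fp),  # true negatives among non-diseased
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 13 early BOS (10 detected), 18 non-BOS (16 cleared).
for name, value in diagnostic_metrics(tp=10, fn=3, tn=16, fp=2).items():
    print(f"{name}: {value:.2f}")
```

Because PPV and NPV depend on how common early BOS is in the tested group, these two values would shift in a population with a different prevalence even if sensitivity and specificity stayed the same.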
Finally, we used CIT to find strain thresholds that distinguish early BOS (n = 13) vs more advanced BOS (NIH-BOS or mixed BOS [n = 53]) and found that Jacobian alone achieved a statistically significant difference (P = .004) in partitioning those with early BOS. In the CIT, a Jacobian of >1.55 separated 92% (n = 12) of patients with early BOS (node 1 to 3 in supplemental Table 11). To diagnose persons with early BOS compared with more advanced BOS, we then built a KNN ML classifier model with strain metrics as input variables and early BOS vs advanced BOS as classes. Early BOS was detected with AUC of 0.78 (95% CI, 0.66-0.90), accuracy of 0.74, sensitivity of 0.71, specificity of 0.85, PPV of 0.95, and NPV of 0.42.
Discussion
In this large cohort of patients with a rare disease from 2 experienced transplant and imaging centers, our results show that quantitative strain metrics can characterize BOS types and that stepwise application of CIT decision trees and ML classifier models can diagnose individuals with early BOS. To our knowledge, this study is the first to apply quantitative strain mapping to algorithmic analysis of lung disease in HCT recipients. The results are important because qCT methodologies may guide more accurate and earlier diagnosis of BOS. The use of lung strain qCT leading to earlier treatment carries the potential to improve BOS-related morbidity and mortality.
Previous studies have found that the obstructive physiology of COPD and severe asthma is hallmarked by a lower Jacobian, lower JacobianSD, and lower ADI.8,13,21,25 The small airway narrowing and fibrosis of BOS also lead to obstructive airway disease,18,26 and, as in asthma and COPD, we found that lower Jacobian, lower ADI, and lower intralung heterogeneity of these parameters occur with more severe disease after HCT. These findings reflect pulmonary physiology, given that hyperinflation from air trapping can decrease volume expansion, and severe obstructive disease with reduced pulmonary compliance would make lung deformation more equally distributed (isotropic).27 Our work also builds upon prior reports that showed that qCT assessments of air trapping identified advanced BOS after HCT and subtypes of lung allograft dysfunction that were at higher risk of mortality.15,28,29 Prior studies show that qualitative assessments of air trapping (eg, MA) by chest radiologists are insufficient for identifying BOS or anticipating progression of BOS in HCT recipients or in lung transplant recipients. Strain metrics with qCT can thus address an important gap of identifying BOS in the HCT population earlier and with greater accuracy.16,29
On the basis of strong associations between strain metrics and BOS as a composite group as well as types of BOS, including early BOS, we applied CIT decision trees and ML models to discover strain thresholds that could be applicable to clinical practice and to develop automated qCT-based ML models for the identification of BOS. The Jacobian values of 1.98 that differentiated BOS from non-BOS and of 1.55 that differentiated NIH-BOS from early BOS were consistent with Jacobian values of 1.5 to 2.0 found in prior studies to identify individuals with COPD or to stratify asthma severity.21,25 In our study, an air-trapping threshold of 33% differentiated NIH-BOS vs early BOS and was consistent with the threshold (28% air trapping) seen previously when PRM was used to characterize advanced BOS after HCT.10 When these metrics were applied to an ML classifier, we were able to identify patients with BOS with high accuracy and favorable test characteristics. To our knowledge, there are no qCT studies that have differentiated early BOS from transient impairment. In this context we built an ML model that detected early BOS vs transient impairment and controls with high statistical accuracy and with high PPV. This implies that qCT can confirm the diagnosis of early BOS, whereas standard PFT-based diagnosis has been shown to have a low PPV.5,30 Therefore, the use of qCT-based models to identify early BOS can facilitate earlier treatment and potentially reduce longer-term morbidity and mortality.6
Our work has limitations. First, patients did not have qCT before development of BOS, which limits our understanding of the natural history of lung strain mechanics. This was addressed, in part, by the inclusion of individuals who did not go on to develop BOS. Our group is also currently conducting a multicenter longitudinal observational study to assess the natural history of BOS and the efficacy of PRM and other qCT measurements as predictors of BOS in adult and pediatric HCT recipients with cGVHD (R01HL162661).31 An additional goal of the ongoing study is to determine whether PRM at the onset of BOS can predict the trajectory of lung function decline. Second, screening at both centers was based upon referral patterns to a pulmonary clinic, potentially introducing a selection bias. Both centers, however, have established referral pipelines with their respective blood and marrow transplant programs and in this context have nearly complete catchment of patients with BOS. Third, the absolute numbers included in this combined cohort remain smaller than those in studies of asthma or COPD, because this is a rare disease. This limitation was mitigated by performing analysis through multiple statistical and analytical approaches, which repeatedly revealed strong associations and high levels of model performance. A larger multicenter study including comprehensive assessment of patients before the development of any impairment is currently being conducted and will help inform the validity of our results.31 Fourth, inspiratory/expiratory CT scans of the chest are prone to variability, particularly at end-expiration. Centers with less experience with these technologies may require training for consistently reliable results, limiting external generalizability. Our centers use a standardized protocol to obtain high-quality inspiratory and expiratory images, including an established methodology to determine appropriate respiratory effort on exhalation.32 Fifth, it is not clear whether strain metrics need to be adjusted for age, height, and sex, as is done with PFT. Finally, the etiology of transient impairment in patients was not investigated. Although patients were excluded from the study if they had evidence of active infection on respiratory viral swab, sputum cultures, or bronchoalveolar lavage, there remained a possibility of undiagnosed infection. Nonetheless, even if patients had transient airflow obstruction secondary to infection, the qCT metrics associated with transient impairment were not significantly different from those of control patients without pulmonary impairment.
Lung strain metrics can identify individuals with any form of BOS and, of particular importance, those with early BOS. The diminished expansion and anisotropy of lungs with BOS, as well as their diminished intralung heterogeneity of lung strain, provide novel insight into the pathophysiology of airflow obstruction in a manner beyond traditional modalities such as PFT and possibly before PFT decline. Given the morbidity and high mortality of pulmonary complications after HCT,33-36 future studies are needed for this important and vulnerable population.
Acknowledgments
The authors acknowledge the patients in the lung GVHD clinics of Stanford University Medical Center and MD Anderson Cancer Center.
During this study period, G.-S.C. was funded by National Institutes of Health/National Heart, Lung, and Blood Institute (NIH/NHLBI) grant R01HL161037 and NIH/National Cancer Institute (NCI) grant P30 CA015704. G.Y. and C.J.G. are funded by NIH/NHLBI grant R01HL162661. J.H. was funded through NIH/NHLBI grant R01HL157414-01. A.S. is funded by NIH/National Institute of Allergy and Infectious Diseases grant K23 AI117024.
Authorship
Contribution: H.S. had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis and of the content of the manuscript; H.S., J.H., and A.S. contributed substantially to study design, data analysis and interpretation, and writing of the manuscript; and C.D.B., M.A., M.H., Z.M., C.B., I.T., L.B., B.F.D., G.-S.C., G.Y., C.J.G., H.H.G., M.C.B.G., J.M.R., E.A.H., M.C., G.R., A.M.A., R.E.C., E.J.S., Y.L., S.P., K.D., and M.R.N. contributed to design of the work, interpretation of data, and critical review and drafting of the manuscript.
Conflict-of-interest disclosure: J.M.R. is a shareholder in VIDA Diagnostics, Inc and serves as a consultant for Auris Health, Inc. The remaining authors declare no competing financial interests.
Correspondence: Husham Sharifi, Division of Pulmonary, Allergy, and Critical Care Medicine, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305; email: husham@stanford.edu.
Author notes
J.H. and A.S. are joint senior authors.
Presented in poster form at the annual meeting of The American College of Chest Physicians, Honolulu, HI, 8-11 October 2023.
Data are available on request from the corresponding author, Husham Sharifi (husham@stanford.edu).
The full-text version of this article contains a data supplement.