Key Points
Genomic analysis of 1193 donor–recipient samples found no association between autosomal minor histocompatibility antigens and acute GVHD.
Y-chromosome–encoded minors that mismatch paralogous sites in female donors associate with acute GVHD.
Abstract
Allogeneic hematopoietic stem cell transplantation (allo-HCT) is a curative option for blood cancers, but the coupled effects of graft-versus-tumor and graft-versus-host disease (GVHD) limit its broader application. Outcomes improve with matching at HLAs, but other factors are required to explain residual risk of GVHD. In an effort to identify genetic associations outside the major histocompatibility complex, we conducted a genome-wide clinical outcomes study on 205 acute myeloid leukemia patients and their fully HLA-A–, HLA-B–, HLA-C–, HLA-DRB1–, and HLA-DQB1–matched (10/10) unrelated donors. HLA-DPB1 T-cell epitope permissibility mismatches were observed in less than half (45%) of acute GVHD cases, motivating a broader search for genetic factors affecting clinical outcomes. A novel bioinformatics workflow adapted from neoantigen discovery found no associations between acute GVHD and known, HLA-restricted minor histocompatibility antigens (MiHAs). These results were confirmed with microarray data from an additional 988 samples. On the other hand, Y-chromosome–encoded single-nucleotide polymorphisms in 4 genes (PCDH11Y, USP9Y, UTY, and NLGN4Y) did associate with acute GVHD in male patients with female donors. Males in this category with acute GVHD had more Y-encoded variant peptides per patient with higher predicted HLA-binding affinity than males without GVHD who matched X-paralogous alleles in their female donors. Methods and results described here have an immediate impact for allo-HCT, warranting further development and larger genomic studies where MiHAs are clinically relevant, including cancer immunotherapy, solid organ transplant, and pregnancy.
Introduction
Allogeneic hematopoietic cell transplantation (allo-HCT) can cure certain inherited diseases and acquired malignancies of the blood, yet biological mechanisms that provide beneficial effects, such as graft-versus-tumor (GVT),1-3 also contribute to life-threatening graft-versus-host disease (GVHD).4 Outcomes improve dramatically with donor–recipient matching of HLAs, but GVHD still occurs at frequencies of up to 70% in fully matched unrelated pairs, and to a lesser degree in related, HLA-identical transplant recipients,5 suggesting unaccounted-for genetic factors impact clinical responses.
Minor histocompatibility antigens (MiHAs) are germline-encoded immunogenic peptides presented by specific HLA molecules on the surface of cancer cells or normal tissues. Although donor and recipient mismatching at the major histocompatibility complex (MHC) confers the highest proportional risk of GVHD, many clinically relevant MiHAs with defined HLA restriction have been identified,6 including Y-chromosome–encoded antigens that affect outcomes in sex-mismatched HCT.7 In other nonmalignant conditions, such as solid organ transplant or pregnancy, MiHAs carry risk of rejection8-10 or miscarriage,11,12 respectively.
In leukemia, there is evidence for tumor-specific antigenicity by exogenous activation of gene expression,13-15 gene fusion,16 and alternative splicing.17 In all cancers, driver and passenger mutations mark tumor progression,18-20 which may guide biomarker discovery21,22 and individualized treatment.23-25 A subset of cancer variants gives rise to immunoreactive neoantigens encoded by somatic changes in tumor DNA, and these changes are presented exclusively by tumors and targeted by patients’ normal immune systems.26 In a clinical setting, this effect may theoretically be exploited for GVT in allo-HCT27 or precision medicine approaches to cancer immunotherapy.
Despite the physiological connection between MiHAs and neoantigens, there are important differences that should guide genomic analysis. Neoantigen discovery from DNA or RNA sequences requires high sensitivity to detect rare or private variants from heterogeneous tumor tissue,28 which is often chemically preserved.29 Sequencing patient and adjacent normal samples adds cost but also reduces false positives.30,31 MiHAs, on the other hand, arise from heritable germline polymorphisms that may be common in populations and accessible with less expensive microarrays or lower-coverage sequencing panels.32 In both cases, antigens are only immunoreactive if they are displayed by patient HLA in affected tissues. Therefore, it is important to annotate variants with predicted MHC restriction, binding affinity, and tissue-specific expression.
We sought a controlled, clinical-outcomes–based study in HLA-matched donor–recipient pairs to discover genetic variation outside the MHC that may contribute to the risk of acute GVHD following allo-HCT.
Methods
Study design
The study population consisted of high-resolution HLA-A–, HLA-B–, HLA-C–, HLA-DRB1–, and HLA-DQB1–matched (10/10) unrelated donor and recipient allo-HCT pairs. Patients were selected to obtain equal numbers with and without clinical evidence of grade II-IV acute GVHD, which was assessed as described based on severity or degree of organ involvement before day 100 after transplant.33 All patients received myeloablative conditioning for acute myeloid leukemia (AML) or other blood cancers in complete remission (CR1 or CR2). After quality control and other filtering (see “Whole-genome sequencing”), acute GVHD positive and negative cohorts were balanced for age, disease status, self-reported race or ethnicity, GVHD prophylaxis, and other factors (Table 1).
Clinical data collection
Clinical data were collected by the Center for International Blood and Bone Marrow Transplant Research (CIBMTR), a collaboration between the National Marrow Donor Program and the Medical College of Wisconsin representing a worldwide network of transplant centers that contribute detailed data on HCT. The CIBMTR conducts research in compliance with all applicable federal regulations pertaining to the protection of human research participants. All participants provided informed consent for participation in the CIBMTR research program, including submission of biological samples to the Research Repository, and this study was approved by the National Marrow Donor Program Institutional Review Board.
HLA typing and histocompatibility matching
HLA matching was determined at high resolution for HLA-A, HLA-B, HLA-C, HLA-DRB1, and HLA-DQB1 through retrospective typing of stored pretransplant samples and/or typing reported by the transplant center, and match assessment was performed per CIBMTR criteria as previously described.34 Five-locus haplotype matching was performed with the HapLogic algorithm.35
Whole-genome sequencing
Two hundred fifty donor and 250 HCT recipient samples (500 samples total) were sequenced at Human Longevity, Inc. (San Diego, CA) to a mean coverage depth of 30× with 2 × 150 bp paired reads using Illumina HiSeq X instruments. One hundred twenty-five pairs came from transplants with clinical evidence of acute GVHD; 125 pairs came from transplants without evidence of GVHD. Ten recipient samples did not produce adequate sequencing data. A further 2 recipient samples and 1 donor sample failed the heterozygosity test that was applied to remove contaminated samples. An additional 32 samples were missing data for their paired donor or recipient and were removed from analysis. The final set included 205 pairs of donor–recipient samples (102 acute GVHD and 103 non-GVHD). Secondary analysis with Isaac alignment and variant calling pipeline36 resulted in 1 binary alignment map37 and 1 variant call format38 file per sample using the human genome reference assembly hg38. Variants with below average read depth (30×) were excluded from analysis.
Microarray data and analysis
The microarray data and primary analysis for supplemental Table 1 have been described previously.39
Bioinformatics
Genomic similarity was measured using identity-by-descent (IBD) sequencing with default parameters.40 This technique determines phase for donor and patient genotypes to form haplotype segments of varying lengths, which indicate common ancestry. Normalizing the lengths of these segments to those of specific genomic features (including the whole genome itself) gives a relative measure of genetic similarity for each feature (Figure 1). For comparison, the null distribution of normalized IBD in each region is simulated from an all-by-all pairing of donors and recipients (excluding actual HLA-matched pairs). X and Y chromosomes were excluded from analysis. Removal of low-quality variants due to read misalignment resulted in small broken intervals in the ARS and MHC, explaining lower than expected genetic similarity for HLA-matched donor–recipient pairs within these regions.
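The normalization described above can be illustrated with a minimal Python sketch. It assumes IBD segments have already been inferred and are represented as non-overlapping half-open base-pair intervals; the function names, interval representation, and coordinates are hypothetical and are not taken from the published workflow.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs


def overlap(a: Interval, b: Interval) -> int:
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def normalized_ibd(segments: List[Interval], region: Interval) -> float:
    """Fraction of `region` covered by IBD segments.

    Assumes the segments do not overlap one another, so their
    overlaps with the region can simply be summed.
    """
    region_len = region[1] - region[0]
    if region_len <= 0:
        raise ValueError("region must have positive length")
    shared = sum(overlap(seg, region) for seg in segments)
    return shared / region_len


# Hypothetical IBD segments for one donor-recipient pair on one chromosome.
segments = [(100, 400), (900, 1500)]
# Hypothetical genomic feature spanning positions 300-1300.
region = (300, 1300)
print(normalized_ibd(segments, region))  # 0.5
```

Applying the same function to every non-matched donor–recipient pairing would give one way to approximate the null distribution mentioned above.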
Comparisons of donor–recipient variant call format files (Figures 3-5; supplemental Figure 2) were performed with RTG tools41 to generate patient-specific variants, which were functionally annotated with snpEff.42 Sex-mismatched pairs were considered as special cases with Y-chromosome–specific variants in male recipients aligned to paralogous sites on the X chromosome. In all samples, missense and nonsense variants were mapped to their corresponding primary transcript and translated into amino acid sequences for proteasomal cleavage site prediction with netChop 3.1.43 MHC binding prediction was performed with netMHCpan 3.044 using patient HLA typing to determine MHC restriction. Ranked peptides were further annotated with minor allele frequencies from dbSNP45 build 147. Acute GVHD usually affects the skin, liver, and gastrointestinal tract.46 While patient-specific MiHA expression is most informative, collecting these data requires invasive tissue biopsy specimens. Therefore, we opted to corroborate our results with public data from the Genotype-Tissue Expression Project47-49 using previously described methods50 to associate MiHAs with a measure of broad tissue-specific gene expression. The entire workflow is freely available at https://github.com/wwang-nmdp/MiHAIP.
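Two steps of this workflow lend themselves to a short illustration: isolating recipient-specific variants and enumerating candidate peptides around a missense position. The Python sketch below is a simplified stand-in, not the published implementation. It does not call RTG tools, snpEff, netChop, or netMHCpan, and the variant tuples, function names, 9-residue peptide length, and toy protein sequence are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def recipient_specific(recipient: Set[Variant], donor: Set[Variant]) -> Set[Variant]:
    """Variants called in the recipient but absent from the donor."""
    return recipient - donor


def peptides_spanning(protein: str, pos: int, length: int = 9) -> List[str]:
    """All `length`-mers of `protein` that include 0-based residue index `pos`."""
    if not 0 <= pos < len(protein):
        raise ValueError("pos outside protein")
    first = max(0, pos - length + 1)
    last = min(pos, len(protein) - length)
    return [protein[i:i + length] for i in range(first, last + 1)]


# Hypothetical variant calls for one donor-recipient pair.
recipient = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")}
donor = {("chr1", 1000, "A", "G")}
print(recipient_specific(recipient, donor))

# Hypothetical 20-residue protein with a variant residue at index 10.
protein = "ACDEFGHIKLMNPQRSTVWY"
windows = peptides_spanning(protein, 10)
print(len(windows))  # 9
```

In a fuller pipeline, each window would then be passed to cleavage and HLA-binding predictors using the patient's HLA typing.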
Statistics
P values were calculated using the χ² test for Table 1 and the Wilcoxon rank sum test with continuity correction for Figures 2, 4A-C, 5A,D-E, and 6A,D-E. All other P values were calculated using hypergeometric tests with sample and population counts limited to patients with specified MHC restriction. Benjamini-Hochberg false discovery rate correction was applied to account for multiple hypothesis testing in supplemental Table 1. All tests were performed in R with default parameters.
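For readers unfamiliar with the Benjamini-Hochberg procedure, the following Python sketch shows the standard adjustment. The study itself used R, so this is an illustrative reimplementation rather than the authors' code, and the input P values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each adjusted
    # value never exceeds the adjusted value of any larger P value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four tested antigens.
raw = [0.01, 0.04, 0.03, 0.20]
print([round(p, 4) for p in benjamini_hochberg(raw)])
```

An antigen would be reported as significant only if its adjusted value fell below the chosen false discovery rate.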
Results
Donor-recipient matching extends beyond five HLA loci
There is strong evidence that HLA-DPB1 T-cell epitope (TCE) matching correlates with allo-HCT outcome.52-58 Generally speaking, mismatched alleles between donor and recipient may be benign (permissible) or alloreactive (nonpermissible) in either direction (graft versus host or host versus graft), with clinical consequences that include GVHD or rejection, respectively. Several methods are available to determine the direction and permissibility of HLA-DPB1 mismatching. Although pairs in this cohort were not explicitly matched at this locus at the time of transplant, a retrospective analysis revealed 16%, 60%, 68%, and 76% of donor–recipient pairs were matched by HLA-DPB1 allele, TCE permissibility,52 expression,59 or functional distance,60 respectively (supplemental Figure 1). This is consistent with baseline likelihoods of finding HLA-DPB1 matches with productive 10/10 searches.61 We found that HLA-DPB1 allele mismatching did not associate with acute GVHD (P < .92), whereas TCE mismatching did associate as expected (P < .038; 1-sided Fisher’s exact test), leaving 56 out of 102 acute GVHD cases (55%) unaccounted for by mismatching at 6 HLA loci.
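A 1-sided Fisher's exact test of this kind is a hypergeometric tail probability on a 2 × 2 table. The Python sketch below shows the calculation under that definition. The table counts are hypothetical and are not the study's data, and the function name is an assumption.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(top-left cell >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins.

    Rows could be read as TCE mismatched / matched and columns as
    acute GVHD / no GVHD; that labeling is an assumption of this sketch.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


# Hypothetical counts, chosen only to keep the arithmetic easy to check:
# (16 + 1) / 70 = 17/70.
print(fisher_one_sided(3, 1, 1, 3))
```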
We hypothesized that HLA-matched unrelated donor–recipient pairs share genetic material outside the MHC. We used IBD inference40 to measure broad genomic similarity (see Methods), which revealed matching at the MHC regions as expected (Figure 1A). Overall, high rates of IBD were observed at the MHC, indicated by many outliers in randomized pairs, which can be attributed to very strong and recent natural selection acting upon these loci in the human population.62 Genetic similarities extended further, albeit to a lesser degree, across chromosome 6 (Figure 1B) and genome-wide (P < 2.2e-16; Figure 1C). Unexpectedly, there was a single outlier in control experiments where donors and patients were randomly paired. This simulated pair shared 50% of their DNA, likely representing a parent–child or full-sibling relationship. To protect confidentiality, we did not analyze the relationship further.
Autosomal MiHAs do not associate with acute GVHD
To investigate patient-specific variation further, we developed an integrative bioinformatics workflow adapted from neoantigen discovery to perform comparative analysis of all HLA-matched donor–recipient pairs regardless of TCE permissibility (supplemental Figure 2).
The acute GVHD and non-GVHD groups displayed comparable numbers of missense variants (P = .32; Figure 2A) and known MiHAs (P = .80; Figure 2B) restricted with patient HLA (P = .76; Figure 2C). Ordering MHC-restricted MiHAs by log ratio (Figure 2D) revealed DPH1 (rs35394823) and LB-NISCH-1A (rs887515) as the lowest and highest ranking; however, no associations achieved statistical significance. Thus, we expanded our study to include preexisting single-nucleotide polymorphism microarray data from nonoverlapping patient samples. With the addition of 988 HLA-matched donor–recipient pairs (456 acute GVHD, 532 non-GVHD), no statistically significant associations were identified for 17 known MiHAs represented in both data sets (supplemental Table 1).
Y-chromosome–encoded variants associate with acute GVHD
There were 89 sex-mismatched cases in our cohort (Figure 3A). Male recipients with female donors (F>M) were more likely to develop acute GVHD (78%) than male recipients with male donors (52%, P < .02). Sequence analysis of the entire Y chromosome of F>M pairs identified only 6 missense variants (relative to the reference genome hg38) encoding a total of 9 variant peptides in 10 out of 21 recipients (48%) with acute GVHD. By contrast, the Y chromosomes of all 6 non-GVHD males matched the reference (Figure 3B). The variant peptides were confined to 4 genes: PCDH11Y, USP9Y, and UTY, which have reactive minor histocompatibility epitopes determined in vitro,63,64 and NLGN4Y, a neuroligin with unknown HCT significance. Except PCDH11Y, which is specific to the brain and heart, all genes have broad tissue expression (supplemental Figure 3) and thus make qualified candidates for MiHA presentation in GVHD-affected tissues. Filtering by class I MHC restriction (Figure 3C) revealed several variant and reference peptides with strong affinity for their respective HLA allele in both the acute GVHD and non-GVHD groups (Figure 3D); however, there were significantly more predicted binders per GVHD male (P < .015; Figure 3E), suggesting a possible compound effect of multiple Y-linked MiHAs. HLA-DPB1 alone did not explain the association, as 12 out of 21 F>M patients with acute GVHD (57%) were permissibly matched compared with 3 out of 5 without GVHD (60%) (P < .26; 1 patient was not typed at HLA-DPB1).
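Counting predicted binders per patient, as compared in Figure 3E, can be sketched in a few lines of Python. The text does not state the affinity cutoff used, so the 500 nM threshold below is an assumption based on a widely used convention for HLA class I binding predictions. The patient labels, peptides, and IC50 values are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumed cutoff in nM; the threshold used in the study is not stated in the text.
BINDER_THRESHOLD_NM = 500.0


def binders_per_patient(predictions: Iterable[Tuple[str, str, float]]) -> Counter:
    """Count peptides per patient whose predicted IC50 (nM) is at or below the cutoff."""
    counts: Counter = Counter()
    for patient, _peptide, ic50_nm in predictions:
        if ic50_nm <= BINDER_THRESHOLD_NM:
            counts[patient] += 1
    return counts


# Hypothetical (patient, peptide, predicted IC50 in nM) records.
predictions = [
    ("P1", "AAAAAAAAA", 35.0),
    ("P1", "CCCCCCCCC", 1200.0),
    ("P2", "DDDDDDDDD", 480.0),
    ("P1", "EEEEEEEEE", 499.0),
]
counts = binders_per_patient(predictions)
print(counts["P1"], counts["P2"], counts["P3"])  # 2 1 0
```

Per-patient counts of this kind could then be compared between groups with a rank-based test such as the Wilcoxon rank sum test named in the Statistics section.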
Paralogous X-Y mismatching explains acute GVHD risk in male recipients with female donors
Risk of chronic GVHD from allo-HCT is higher in male patients with female donors because of B-cell alloreactivity,7,65-67 which is detectable by antibody response that occurs after,68 but not before, transplant.69 Results presented here extend risk in this patient segment to the acute form of GVHD by implicating male-specific variants in four genes. Although definitive clinical recommendations require confirmatory analysis, it is possible to investigate the genetic basis for risk in this cohort.
PCDH11Y, USP9Y, UTY, and NLGN4Y have paralogs on the X chromosome70 with 72%, 91%, 86%, and 24% amino acid identity, respectively. We mapped Y-encoded variant peptides from each male patient to paralogous sites on their female donor’s X chromosomes. All the variant peptides observed in males with acute GVHD mismatched corresponding sites (Figure 4A). By contrast, the 6 males without GVHD were X-Y matched at these sites, suggesting their donor-female immune systems were educated, and consequently nonalloreactive, to same-as-self peptides encoded at these positions.
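The X-Y comparison can be expressed as a simple lookup once Y-encoded residues have been aligned to paralogous X sites. The Python sketch below assumes that alignment has already been done. The site labels, residues, and function name are hypothetical, and treating a site with no donor call as mismatched is a choice made only for this illustration.

```python
from typing import Dict, List, Set


def xy_mismatched_sites(recipient_y: Dict[str, str],
                        donor_x: Dict[str, Set[str]]) -> List[str]:
    """Sites where the recipient's Y-encoded residue is absent from the donor's X alleles.

    `recipient_y` maps an aligned site label to the single Y-encoded residue.
    `donor_x` maps the same label to the residues on the donor's two X chromosomes.
    Sites missing from `donor_x` are reported as mismatched (an assumption).
    """
    return sorted(
        site
        for site, residue in recipient_y.items()
        if residue not in donor_x.get(site, set())
    )


# Hypothetical aligned sites for one F>M pair.
recipient_y = {"UTY:site1": "L", "USP9Y:site2": "R", "NLGN4Y:site3": "T"}
donor_x = {"UTY:site1": {"P"}, "USP9Y:site2": {"R", "K"}, "NLGN4Y:site3": {"T"}}
print(xy_mismatched_sites(recipient_y, donor_x))  # ['UTY:site1']
```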
The only other category with increased (albeit statistically insignificant) risk of acute GVHD was male recipients with male donors (Figure 3A). Y-Y mismatching was explored as a possible explanation; however, the number of predicted high-affinity binders per patient (maximum 3) was comparable between recipients with and without acute GVHD in the M>M direction (P = .52; Figure 4B) and considerably lower than acute GVHD recipients in the F>M direction (maximum 12; Figure 3E). Furthermore, all female donors, regardless of recipient sex, lacked variants representing high-affinity binding peptides. These findings associate acute GVHD risk with explanatory genetic factors specifically in male patients with female donors, at least in this cohort.
Discussion
Allo-HCT is a curative option for many disorders, yet side effects limit its widespread application. GVHD remains a principal barrier to more effective treatment and improved quality of life, but immune responses that contribute to therapeutic benefit and adverse events are physiologically coupled (Figure 5). In malignant conditions, tumor and normal cells are genetically distinct and analytically separable. In the context of allo-HCT, immunoreactive peptides resulting from tumor-specific somatic mutations (neoantigens) may contribute specifically to graft-versus-leukemia (GVL) effects. On the other hand, the tissue-specific expression and immunogenicity of germline polymorphisms (MiHAs) determine their relative contributions to GVL or GVHD. As treatment options advance, it is important to precisely define genetic factors that affect (or do not affect) clinical outcomes.
Despite research associating several autosomal MiHAs with clinical outcomes, none are routinely matched in allo-HCT. Target tissue expression partially determines the predominance of GVT or GVHD. For example, HA-2 (rs61739531) is expressed in cells of hematopoietic origin71 where there is evidence for GVL in AML with low risk of GVHD.72 However, expression patterns are not wholly determinant. For example, ZAPHIR (rs2074071) associates with GVT, but not GVHD, in renal cell carcinoma patients receiving nonmyeloablative allo-HCT.73 Similarly, other ubiquitously expressed MiHAs are associated with GVL in chronic myelogenous leukemia without evidence of GVHD, suggesting complex alloreactivities from antigen processing, presentation, and costimulation.74 This is consistent with studies of cancer vaccines where therapeutic benefit results from the synergistic effects of multiple cancer-specific neoepitopes combined with immune checkpoint blockade.75,76 In all cases, MHC restriction is an important qualifier, but filtering patients by HLA reduces the number of samples available for retrospective analysis.
Here, we analyzed common autosomal MiHAs with characterized HLA restriction in separate cohorts of 205 and 988 matched samples. We extended the capabilities of commonly used bioinformatics tools to aid comparative genomic analysis of donor–recipient pairs, incorporating MHC matching and antigen restriction as well as HLA-predicted binding affinity and tissue expression into a common workflow. This study was designed specifically to interrogate acute GVHD in AML patients who were in remission at the time of transplant. Consequently, leukemic cell counts were relatively low, and whole-genome sequences represented primarily germline polymorphism. Thus, bioinformatics analysis focused on MiHAs with broad tissue expression patterns. Future studies will apply these methods to patients with active disease, analyzing somatic variants (possible neoantigens) expressed in cells of hematopoietic origin within larger cohorts that are balanced for GVL-related outcomes including relapse.
Our analysis of autosomes revealed no statistically significant associations with acute GVHD among individual MiHAs. These results confirm a recent genome-wide association study of unrelated allo-HCT where MHC mismatching outweighed other genetic factors as contributors to GVHD risk.77 As with neoantigens, it seems plausible that multiple recipient-specific variants contribute to GVHD; however, unlike clonal expansion of somatic mutations in cancer, population-genetic mechanisms account for the co-occurrence of germline-encoded MiHAs. Indeed, there is evidence that arbitrary HLA-matched donor–recipient pairs may present thousands of MiHAs,78 which have a cumulative effect on T-cell responses.79 Although MiHAs are individually common (with minor allele frequencies in our cohort ranging from 19% to 61%; supplemental Table 1), alloreactive combinations may be rare, making it difficult to power case–control studies. Indeed, segmenting patients into subsets sharing ≥2 MiHAs lacked statistical power even when HLA restriction was limited to common alleles. Larger unrelated cohorts are necessary. Additionally, we plan comparable studies in related and haploidentical HCT pairs where shared donor–recipient haplotypes should reduce the number of MiHA combinations under consideration. These studies will also assess whether results reported here are relevant for patients receiving non–calcineurin-based GVHD prophylaxis.
Our comprehensive analysis of sex-linked variation revealed multiple MiHAs encoded on the Y chromosome that associate with acute GVHD specifically in F>M allo-HCT patients. Relative to other chromosomes, the Y is better suited to case–control MiHA association studies, because it lacks population-scale genetic variability due to extremely low rates of diversifying recombination.80 Furthermore, since genetic and phenotypic sex are tightly coupled, it is easy to presegment genomic analysis into clinically weighted categories such as sex match or mismatch. Our limited cohort of primarily white patients suggests the majority of Y haplotypes in this population increase risk of acute GVHD for males with female donors. This is consistent with previous observations of increased chronic GVHD and lower relapse in F>M allo-HCT,81-83 adding a genetic basis for choosing HLA-matched male donors over nonparous females.84,85 However, in cases where a female donor is otherwise the best option for male patients, results reported here may help select a more suitable match.
The full-text version of this article contains a data supplement.
Acknowledgments
The authors thank Darryl Carter, Bronwen Shaw, Mary Horowitz, Amir Toor, and Yung-Tsi Bolon for insightful critical review of the manuscript.
Funding for this work was provided by HLI, Inc., as well as the US Office of Naval Research (ONRN00014-17-1-2388).
Authorship
Contribution: W.W., H.H., M. Halagan, M. Heuer, M. Haagenson, N.M.P., J.U., M.M., and C.J.K. developed the bioinformatics workflow and analyzed data; M. Heuer and J.E.B. developed and administered compute infrastructure and information technology; C.V.-G. and S.S. collected and curated data; R.H.S., A.T., and W.B. performed sample preparation, WGS, and primary data analysis; and C.J.K. wrote the manuscript.
Conflict-of-interest disclosure: W.B. is an employee of Human Longevity, Inc. N.M.P. is founder of Root Deep Insight, Inc., a personal immunogenomics company. The remaining authors declare no competing financial interests.
Correspondence: Caleb J. Kennedy, Center for International Blood and Marrow Transplant Research, Minneapolis, MN 55401; e-mail: caleb.kennedy@nmdp.org.
Author notes
W.W. and H.H. contributed equally to this study.