JAK2V617F is an acquired mutation associated with polycythemia vera (PV), essential thrombocythemia (ET), and primary myelofibrosis (PMF). We tested the hypothesis that the paradox of a single disease allele associated with 3 distinctive clinical phenotypes could be explained in part by host-modifying influences. We screened for genetic variation within 4 candidate genes involved in JAK-STAT signaling, including receptors for erythropoietin (EPOR), thrombopoietin (MPL), and granulocyte colony-stimulating factor (GCSFR), and JAK2. We genotyped 32 linkage disequilibrium tag single nucleotide polymorphism (SNP) loci in 179 white patients: 84 had PV, 58 had PMF, and 37 had ET. Genotype-phenotype analysis showed 3 JAK2 SNPs (rs7046736, rs10815148, and rs12342421) to be significantly but reciprocally associated with PV (P < .001 for all; odds ratio = 0.16, 2.72, and 2.46, respectively) and ET (P < .001 for all; odds ratio = 3.05, 0.29, and 0.30, respectively) but not with PMF. Three additional JAK2 SNPs (rs10758669, rs3808850, and rs10974947) and a single EPOR SNP (rs318699) were also significantly associated with PV but not with ET or PMF. Finally, intragene haplotypes in JAK2 were significantly associated with PV only. Thus, host genetic variation may contribute to phenotypic diversity among myeloproliferative disorders, including in the presence of a shared disease allele.
Introduction
The recent discovery of JAK2V617F and related mutations in bcr-abl–negative myeloproliferative disorders (MPDs) has spawned great interest in their precise role in the pathogenesis of these disorders.1 JAK2V617F is found in more than 90% of patients with polycythemia vera (PV), and approximately 50% of patients with either primary myelofibrosis (PMF) or essential thrombocythemia (ET).2 It was unexpected that a single disease allele would be associated with these 3 distinct, although overlapping, clinical phenotypes. There are at least 2 possible explanations for this apparent paradox: other disease alleles that influence phenotype or host-modifying influences or both. Several lines of evidence support a role for other disease alleles. These include (1) the demonstration of heritable predisposition alleles for development of JAK2V617F-positive PV,3,4 (2) the demonstration of clonal hematopoiesis by X chromosome inactivation pattern (XCIP) analysis in informative females with JAK2V617F-negative ET,5 (3) the observation that patients with JAK2V617F-positive PV may progress to acute myeloid leukemia that is JAK2V617F negative,6 and (4) the demonstration that only a proportion of clonal PV cells are JAK2V617F positive.7,8 However, host modifiers may contribute to phenotypic pleiotropy of MPDs, in the presence or absence of JAK2V617F. The observation that there is strain-specific variation in leukocytosis and myelofibrosis in murine models of JAK2V617F-mediated myeloproliferative disease provides indirect evidence in this regard.9
To examine the contribution of genetic factors other than JAK2V617F in the distinction between PV, PMF, and ET, we used a candidate gene approach. The choice of candidate genes reflects the key role of JAK-STAT signaling, which is constitutively activated through acquisition of somatic mutations (eg, JAK2V617F), in MPD pathogenesis. JAK2 plays a central role in mediating signaling downstream of key cytokine receptors that are required for normal hematopoietic development, including receptors for erythropoietin (EPOR), thrombopoietin (MPL), and granulocyte colony-stimulating factor (GCSFR).10,11 Therefore, we hypothesized that single nucleotide polymorphisms (SNPs) in EPOR, MPL, GCSFR, or JAK2 might influence MPD phenotype, possibly through altered interaction of the involved cytokine receptor with wild-type or mutant JAK2 (eg, erythrocytosis favoring a PV phenotype may ensue from the interaction between a “gain-of-function” SNP in EPOR and JAK2V617F). Data supporting this hypothesis include (1) Janus kinases (JAKs) intimately associate with cytokine receptors and regulate the cell-surface expression of at least some of these receptors (eg, JAK2 regulates EPOR and MPL expression)12,13 and (2) JAK2V617F is most efficient in transforming hematopoietic cells that express type I cytokine receptors that lack a common chain, including EPOR, MPL, and GCSFR.14 This analysis of host genetic variation in these 4 candidate genes and its association with MPDs using SNP association and haplotype analyses supports a role for host modifiers in the phenotypic pleiotropy of MPDs.
Methods
The current study was approved by the Mayo Clinic Institutional Review Board. Verbal and written informed consent was obtained from all patients and research was performed in accordance with the principles of the Declaration of Helsinki. We identified white patients with PV, ET, or PMF from our database of patients with MPD for analysis. Patient clinical data were carefully reviewed by A.P. and A.T., and diseases were classified according to World Health Organization criteria.15 DNA from peripheral blood granulocytes was isolated, and genotyping for JAK2V617F was performed with the use of a previously described assay (sensitivity ≤ 1%).16
Granulocyte DNA was used for SNP genotyping. We selected 32 LD tagSNPs using the Carlson method17 with a minor allele frequency of at least 5% and an r² threshold of 0.80 in the 4 candidate genes using the HapMap CEU database18 (JAK2 = 13, EPOR = 4, MPL = 5, GCSFR = 10; Table S1, available on the Blood website; see the Supplemental Materials link at the top of the online article).
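The idea behind LD-based tagSNP selection can be illustrated with a simplified greedy binning routine. This is only a sketch in the spirit of the Carlson approach, not the software used in the study: the function name `greedy_tag_snps`, the SNP labels, and the r² values are hypothetical, and refinements such as allowing several interchangeable tags per bin are omitted.

```python
def greedy_tag_snps(snps, r2, threshold=0.80):
    """Greedy LD binning: repeatedly pick the SNP that tags the most untagged SNPs.

    snps: list of SNP identifiers.
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2.
    Returns a list of (tag_snp, sorted_bin_members) tuples.
    """
    def tagged_by(candidate, pool):
        # A SNP always tags itself; others need r^2 >= threshold with it.
        return {s for s in pool
                if s == candidate
                or r2.get(frozenset((candidate, s)), 0.0) >= threshold}

    untagged = set(snps)
    bins = []
    while untagged:
        # Iterate in input order so that ties are broken deterministically.
        best = max((s for s in snps if s in untagged),
                   key=lambda s: len(tagged_by(s, untagged)))
        members = tagged_by(best, untagged)
        bins.append((best, sorted(members)))
        untagged -= members
    return bins


# Hypothetical r^2 values for four made-up SNPs (not data from this study).
example_r2 = {
    frozenset(("snpA", "snpB")): 0.92,
    frozenset(("snpA", "snpC")): 0.85,
    frozenset(("snpB", "snpC")): 0.60,
    frozenset(("snpC", "snpD")): 0.10,
}
example_bins = greedy_tag_snps(["snpA", "snpB", "snpC", "snpD"], example_r2)
```

In this toy input, snpA exceeds the threshold with both snpB and snpC and so tags a bin of three, while snpD is left in a bin by itself; genotyping the two tags would then stand in for all four SNPs.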
Genotyping was performed using the GenomeLab SNPstream Genotyping System (Beckman Coulter, Fullerton, CA), with details provided in Document S1. Primers were designed using the web-based design site http://www.autoprimer.com provided by Beckman Coulter (Table S3). Controls included 2 genomic DNAs, each with 8 replicates per 384-well plate, and 6 no-DNA-template wells. Call rates for each SNP ranged between 90% and 99.9%.
Assessment of linkage disequilibrium between the 32 tagSNPs and JAK2V617F using the measures of D′ and r² was completed with the use of the software package Haploview (Broad Institute, http://www.broad.mit.edu/mpg/haploview/).19 The contribution of genetic variation in candidate genes in discriminating among PV, ET, and PMF was assessed with logistic regression adjusting for covariates of age at diagnosis, sex, and JAK2V617F status. The SNPs were coded as 0, 1, or 2 according to the number of rare alleles (ie, additive model), and JAK2V617F status was modeled as presence (either homozygous or heterozygous) or absence of the mutation. A total of 6 models were fit, one for presence or absence of each MPD, including and excluding JAK2V617F status from the model. Because the study participants were all of a similar “population” (ie, white), we did not control for population stratification. Nominal P values are reported for the single SNP analysis; many of the significant findings remained significant after applying the overconservative Bonferroni correction for multiple testing.
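The additive coding and the multiple-testing threshold can be made concrete with a short sketch. It assumes that the reported odds ratios are per-copy effects on the log-odds scale (which the 0/1/2 coding implies) and, for illustration only, that the Bonferroni correction counts the 32 tagSNPs as the number of tests; the text does not state the count actually used. All function names are hypothetical.

```python
def additive_code(genotype, rare_allele):
    """Count copies of the rare allele in a two-letter genotype such as 'CT'."""
    return sum(1 for allele in genotype if allele == rare_allele)


def odds_multiplier(per_allele_or, copies):
    """Under an additive log-odds model, each copy multiplies the odds by the same factor."""
    return per_allele_or ** copies


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Coding of three possible genotypes at a C/T SNP where C is the rare allele.
codes = [additive_code(g, "C") for g in ("TT", "CT", "CC")]

# A per-allele odds ratio of 3.05 implies that two copies multiply the odds by 3.05 squared.
two_copy_odds = odds_multiplier(3.05, 2)

# Assumed number of tests: one per tagSNP (an illustration, not the study's stated value).
threshold = bonferroni_alpha(0.05, 32)
```

Under these assumptions a nominal P value would need to fall below roughly .0016 to survive correction, which is consistent with associations reported at P < .001 remaining significant.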
For haplotype analysis, both intragene haplotypes and haplotypes based on a sliding window of 3 SNPs within each gene were considered, because of the large number of possible intragene haplotypes for the JAK2 and GCSFR genes. Because haplotypes are not observed directly, we accounted for the unknown phase of haplotypes composed of tagSNPs by use of the score statistics developed by Schaid et al20 and implemented in the Splus library of HaploStat software (Mayo Clinic, http://mayoresearch.mayo.edu/mayo/research/schaid_lab/software.cfm). Simulated P values are reported for haplotype analysis, adjusting for multiple testing within the haplotype analysis. Because parameter estimates and effect sizes are not estimated with the score test, logistic regression models were fit to produce estimates of the haplotype effect sizes for haplotypes with observed counts greater than 5. To do so, we first estimated, for each person, all possible haplotypes and the posterior probability associated with each haplotype using the EM algorithm outlined by Excoffier and Slatkin,21 which is implemented in the Splus library of HaploStat (Mayo Clinic).20 This produces a design matrix containing the expected proportion of haplotypes for each person. Using this design matrix with posterior probabilities, logistic regression models were fit, treating the expected haplotypes as covariates in the model, resulting in an additive haplotype genetic model. Maximum likelihood estimates for the haplotype effect sizes were subsequently produced. This approach for haplotype analysis, in which the analysis is based on the expected proportion of haplotypes, is described in detail by Zaykin et al.22 Two models were fit: one in which covariates of age at diagnosis and sex were adjusted for in the haplotype analysis and one in which covariates of age at diagnosis, sex, and JAK2V617F status were adjusted for in the haplotype analysis.
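The role of the EM algorithm in resolving unknown phase can be sketched for the simplest case of 2 biallelic SNPs. This is an illustrative reduction of the general approach, not the HaploStat implementation: the function name, the fixed iteration count, and the toy genotypes are assumptions. Only persons heterozygous at both SNPs have ambiguous phase, and the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
from collections import Counter

# Rare-allele count at one SNP -> alleles carried on the two chromosomes.
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}


def em_two_snp_haplotypes(genotypes, n_iter=50):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), each the rare-allele count (0, 1, or 2) at a SNP.
    Returns a dict mapping haplotype (a1, a2) -> estimated frequency.
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haps}  # start from equal frequencies
    counts_by_genotype = Counter(genotypes)
    total = 2 * len(genotypes)  # two haplotypes per person
    for _ in range(n_iter):
        expected = {h: 0.0 for h in haps}
        for (g1, g2), n in counts_by_genotype.items():
            if g1 == 1 and g2 == 1:
                # E-step: double heterozygotes are either 00/11 (cis) or 01/10 (trans).
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                w_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
                for h in ((0, 0), (1, 1)):
                    expected[h] += n * w_cis
                for h in ((0, 1), (1, 0)):
                    expected[h] += n * (1.0 - w_cis)
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                a1, a2 = ALLELES[g1], ALLELES[g2]
                expected[(a1[0], a2[0])] += n
                expected[(a1[1], a2[1])] += n
        # M-step: update frequencies from the expected haplotype counts.
        freq = {h: expected[h] / total for h in haps}
    return freq


# Toy genotypes (not study data): 10 persons each of three genotype classes.
toy = [(0, 0)] * 10 + [(2, 2)] * 10 + [(1, 1)] * 10
toy_freq = em_two_snp_haplotypes(toy)
```

The per-person weights computed in the E-step (`w_cis` and its complement) are the posterior probabilities that, in the full method, populate the design matrix of expected haplotype proportions used as covariates in the logistic regression.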
The Cochran-Armitage trend test was used to assess differences in genotype frequencies between patients with MPD and the HapMap founder CEU population, with nominal P values reported.
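For readers unfamiliar with the trend test, the following sketch computes the standard Cochran-Armitage statistic for a 2 × 3 table of genotype counts with additive weights (0, 1, 2) and a two-sided P value from the normal approximation. The function name and the counts in the example are hypothetical and are not data from this study.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Two-sided Cochran-Armitage trend test for a 2 x k table of counts.

    cases, controls: genotype counts ordered by number of rare alleles.
    Returns (z, p_value) using the normal approximation.
    """
    r1, r2 = sum(cases), sum(controls)
    n = r1 + r2
    cols = [a + b for a, b in zip(cases, controls)]
    # Trend statistic: weighted difference between the two rows.
    t = sum(w * (a * r2 - b * r1) for w, a, b in zip(weights, cases, controls))
    # Variance of the statistic under the null hypothesis of no trend.
    var = (r1 * r2 / n) * (
        sum(w * w * c * (n - c) for w, c in zip(weights, cols))
        - 2 * sum(weights[i] * weights[j] * cols[i] * cols[j]
                  for i in range(len(cols))
                  for j in range(i + 1, len(cols)))
    )
    if var == 0:
        return 0.0, 1.0
    z = t / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: identical genotype distributions give no trend.
z_null, p_null = cochran_armitage_trend([10, 20, 10], [20, 40, 20])

# Hypothetical counts with a strong excess of rare alleles among cases.
z_trend, p_trend = cochran_armitage_trend([5, 10, 30], [30, 10, 5])
```

With weights (0, 1, 2) the test targets a linear change in the proportion of patients across genotype classes, which matches the additive coding used in the regression models.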
Results
We studied a total of 179 white patients seen in our MPD practice for whom complete clinical information, as well as archived granulocyte DNA, was available. Of these patients, 84 had PV, 58 had PMF, and 37 had ET. Demographic data, JAK2V617F status, and other relevant clinical data for study patients are presented in Table 1. Patients with PV and PMF were older than patients with ET at the time of diagnosis (median age, 56 and 58 years vs 47 years, respectively), and patients with PMF were tested for JAK2V617F approximately a year later in the disease course compared with patients with PV and ET (median, 21 months vs 12 and 11 months after diagnosis, respectively). Prevalence of JAK2V617F in each MPD was in accordance with published data.23
We selected 32 linkage disequilibrium (LD) tagSNPs using specific criteria (see “Methods”) in the 4 candidate genes: JAK2 = 13, EPOR = 4, MPL = 5, and GCSFR = 10 (Table S1). One SNP within GCSFR showed evidence of deviation from Hardy-Weinberg equilibrium (rs4026505; P < .001). The association of individual SNPs with a particular MPD was studied after adjusting for age at diagnosis and sex. Comparisons were made among the study patients with PV, PMF, or ET. In this analysis, 3 SNP loci within JAK2 (rs7046736, rs10815148, and rs12342421) were found to be significantly but reciprocally associated with PV (P < .001 for all; odds ratio = 0.16, 2.72, and 2.46, respectively) and ET (P < .0007 for all; odds ratio = 3.05, 0.29, and 0.30, respectively; Table 2). In other words, the presence of the minor allele increased the odds of one phenotype (say ET), while decreasing the odds of the other phenotype (PV). For instance, for SNP rs7046736, the presence of the C allele increased the odds of ET (odds ratio = 3.05) but decreased the odds of PV (odds ratio = 0.16) (Table 2). These 3 SNPs, which were not associated with PMF, exhibited high LD, with r² measures of LD between 0.78 and 0.87 (Figure 1). Furthermore, 3 additional JAK2 SNPs (rs10758669, rs3808850, and rs10974947) were significantly associated with PV (P = .003, .009, and .005, respectively; odds ratio = 2.71, 0.36, and 0.34, respectively; Table 2) but not with ET or PMF (data not shown). Finally, the presence of the A allele at a single SNP locus in EPOR (rs318699, P = .001) significantly increased the odds of PV only (odds ratio, 1.87; Table 2).
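A check for deviation from Hardy-Weinberg equilibrium, such as the one noted for rs4026505, can be sketched as a chi-square goodness-of-fit test with 1 degree of freedom. The text does not say which test was used (an exact test is a common alternative), so this is an assumption; the function name and genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_common_hom, n_het, n_rare_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Returns (chi2, p_value).
    """
    n = n_common_hom + n_het + n_rare_hom
    q = (2 * n_rare_hom + n_het) / (2 * n)  # rare-allele frequency
    p = 1.0 - q
    observed = (n_common_hom, n_het, n_rare_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts that match Hardy-Weinberg proportions exactly.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)

# Hypothetical counts with no heterozygotes at all, a gross deviation.
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)
```

A marked deviation at a single SNP often points to a genotyping artifact rather than a biological effect, which is why such SNPs are usually flagged before association results are interpreted.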
When there are multiple causative variants, haplotypes offer increased power over individual SNPs to detect genotype-phenotype associations.24 When assessing haplotypes that span the gene (ie, intragene haplotypes), we found a significant or marginally significant association between haplotypes within JAK2 (P < .001) and PV but not ET or PMF (Table 3). When we looked instead at haplotypes based on a sliding window of 3 SNPs, we similarly observed haplotypes within JAK2 to be associated with PV alone (data not shown). Likewise, many of the SNPs found to be individually associated with PV (Table 2) were significant in the sliding window haplotype analysis (eg, rs7046736, rs10815148, and rs12342421) (data not shown). In contrast, no haplotypes within any of the candidate genes examined were found to be associated with ET or PMF.
To assess the effect of JAK2V617F, we examined the association between each SNP and MPD phenotype after adjusting for the presence or absence of JAK2V617F, in addition to age at diagnosis and sex (Table 4). The 3 previously identified JAK2 SNPs (rs7046736, rs10815148, and rs12342421) remained significantly associated with PV and ET, even after adjusting for JAK2V617F status (Table 4), and JAK2V617F was in low LD (r² < 0.13) with tag SNPs in the 4 genes (Figure 1). Likewise, when JAK2V617F status was included as a covariate for haplotype analysis, several haplotypes within JAK2 maintained global significance of association with PV (Table 3).
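The LD measures referred to here follow from the standard two-locus definitions, sketched below for haplotype and allele frequencies supplied directly. The function name and the frequencies in the examples are hypothetical; Haploview estimates these quantities from genotype data, and that step is not reproduced.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    # The largest |D| attainable given the allele frequencies depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: two perfectly correlated loci.
perfect = ld_measures(0.5, 0.5, 0.5)

# Hypothetical frequencies: an allele at 10% found only on B-carrying haplotypes.
nested = ld_measures(0.1, 0.1, 0.5)
```

The second example gives D′ of 1 but r² of about 0.11, showing that r² stays low when allele frequencies differ even if one allele occurs only on a particular background. Because r² measures how well one locus predicts the other, a low r² indicates that a tag SNP genotype carries little information about the second locus, which is why adjustment for JAK2V617F status can leave the SNP associations largely intact.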
We compared genotype frequencies for the study population to those found in the HapMap white population (founder CEU population; http://www.hapmap.org/; Table S2). When considering the entire group of patients with MPD compared with the HapMap white founder population, we found highly significant differences in genotype frequency (P < .001) at 6 SNP loci in the JAK2 gene (rs10758669, rs3808850, rs7849191, rs7046736, rs10815148, and rs12342421), but not at loci in EPOR, MPL, or GCSFR. Although the HapMap population may not be the ideal control for this comparative analysis, it does underscore the point that genetic variability in JAK2, and not in the EPOR, MPL, or GCSFR genes, is the distinguishing characteristic between the 2 populations.
Finally, we tested for the clinical correlates of PV-associated alleles in patients with PMF and ET. Complete clinical and pathologic information at diagnosis was available for 32 (of 58) patients with PMF; we grouped patients based on frequency of the PV-associated allele (0, 1, or 2) at the relevant SNP loci. The PV-associated allele was present homozygously at one or more of the following SNPs: rs7046736, rs10815148, and rs12342421 (group 1; n = 4), or SNPs rs7046736, rs10815148, rs12342421, and rs10758669 (group 2; n = 8). Both groups showed significant association with leukocytosis (P = .009 and .03, respectively). Furthermore, group 1 showed a significant association with JAK2V617F (P = .02) and group 2 with lower platelet count (P = .05). Both groups also showed a trend toward higher hemoglobin level, although the association did not achieve statistical significance. Other clinical variables (eg, splenomegaly) did not show a significant association in this analysis. Similarly, we had complete data at diagnosis for 17 (of 37) patients with ET; again, both groups showed a significant association with JAK2V617F (P < .05).
Discussion
Our findings reflect the distinctive genetic underpinnings of phenotypically related MPDs, including in the presence of a shared disease allele. The data suggest that (1) several SNPs and haplotypes within JAK2 show strong association with PV or ET, but not with PMF, and the particular distribution of alleles at the involved loci contributes to phenotypic discrimination between the 2 MPDs; and (2) in contrast, genetic variation in EPOR, MPL, and GCSFR genes does not contribute to MPD phenotypic diversity (with one exception; EPOR SNP rs318699 in PV).
Because we analyzed LD tagSNPs, the currently identified JAK2 and EPOR alleles represent markers for genomic regions of interest and not necessarily “disease-predisposing” or “causative” alleles for MPDs. Thus, it would be premature to speculate on the potential mechanism(s) underlying the association of a particular SNP allele with PV or ET or both based on current data. In this regard, higher resolution SNP analysis within JAK2 and its flanking regions on chromosome 9p in a larger cohort of patients and relevant controls will be required to identify specific alleles relevant for MPD pathogenesis.
Our analysis indicates that the currently identified JAK2 SNP alleles contribute to PV or ET expression regardless of JAK2V617F status (through accounting for JAK2V617F in our statistical model). This point has limited relevance for PV, given that virtually all patients (up to 95%) with PV harbor JAK2V617F. In contrast, the contribution of these alleles to the phenotypic distinction between JAK2V617F-harboring ET and PV will need to be confirmed in a large enough sample size that allows for such a stratified analysis.
Finally, it is possible that genetic variation at the currently identified SNPs also contributes to interindividual variation in blood counts in healthy persons, an effect that may be amplified through acquisition of somatic mutations such as JAK2V617F. A study of allele distribution at these SNPs and correlation with blood counts (eg, hemoglobin level and platelet count) in a large cohort of healthy persons may be informative in this regard.
An Inside Blood analysis of this article appears at the front of this issue.
The online version of this article contains a data supplement.
Acknowledgments
This work was supported by the Myeloproliferative Disorders Foundation, Chicago, IL (A.P., T.L., and A.T.) and by the Robert A. Kyle Hematologic Malignancies Program.
Authorship
Contribution: A.P., B.L.F., D.G.G., and A.T. wrote the paper; A.P. and A.T. participated in conception and design of the study; A.P., B.L.F., T.L.L., D.G.G., and A.T. performed research or participated in data analysis; A.P., T.L.L., and A.T. participated in collecting clinical data.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Animesh Pardanani, Mayo Clinic, Division of Hematology, 200 First Street SW, Rochester, MN 55905; e-mail: pardanani.animesh@mayo.edu.