In the current study, we identified 2 genetic markers for susceptibility to chronic myeloid leukemia (CML) using a genome-wide analysis. A total of 2744 subjects (671 cases and 2073 controls) were included, with 202 Korean CML patients and 497 control subjects enrolled as a discovery set. Significant findings in the discovery set were validated in a second Korean set of 237 patients and 1000 control subjects and in an additional Canadian cohort of European descent, including 232 patients and 576 control subjects. Analysis revealed significant associations of 2 candidate loci, 6q25.1 and 17p11.1, with CML susceptibility, with the lowest combined P values of 2.4 × 10−6 and 1.3 × 10−12, respectively. Candidate genes in those regions include RMND1, AKAP12, ZBTB2, and WSB1. The locus 6q25.1 was validated in both Korean and European cohorts, whereas 17p11.1 was validated only in the Korean cohort. These findings suggest that genetic variants of 6q25.1 and 17p11.1 may predispose one to the development of CML.

Chronic myeloid leukemia (CML) is a rare, clonal myeloproliferative disorder characterized by enhanced proliferative capacity, prolonged survival, and reduced apoptosis of hematopoietic stem cells. CML has relatively low incidence rates, from 0.6 to 2.0 cases per 100 000 persons, and its incidence does not appear to differ between Western and East Asian countries. The formation of the BCR-ABL oncogene on the Philadelphia chromosome (Ph), which codes for a constitutively active Bcr-Abl fusion tyrosine kinase, contributes to the pathogenesis of CML. Although the Bcr-Abl fusion tyrosine kinase is a key molecular marker of CML, it is still unclear which molecular or cellular events initiate the leukemogenesis of CML or drive the BCR-ABL translocation.

Several genome-wide association studies have successfully revealed the associations between genetic variants and certain types of hematologic malignancies, including chronic lymphocytic leukemia,1,2  acute lymphoblastic leukemia,3-5  and therapy-related myeloid leukemia.6  Previous association studies for CML suggest that the BCL2 gene polymorphism is related to CML susceptibility.7  However, a genome-wide approach has never been used to identify genetic markers of CML risk. Accordingly, in the present study, we attempted to find genetic markers of CML susceptibility using a genome-wide analysis with a total of 2744 subjects (671 cases and 2073 controls).

Participants

A total of 202 Korean CML patients and 497 control subjects were recruited for use as a discovery set. A second Korean set of 237 patients and 1000 control subjects was used to replicate our findings. Another set was recruited for independent validation and consisted of 232 patients and 576 control subjects from Canada of European descent. The study was approved by the Institutional Research Board at the Samsung Medical Center, Sungkyunkwan University, Seoul, Korea.

Genotyping

A total of 906 530 single nucleotide polymorphisms (SNPs) were genotyped using the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix). Yields of pure, double-stranded genomic DNA were determined using the QIAamp DNA blood Maxi Kit (QIAGEN). Peripheral blood samples were taken during the course of therapy, usually when or after achieving complete cytogenetic response and after informed consent was obtained in accordance with the Declaration of Helsinki. Samples were normalized to 50 ng/μL, and the normalized genomic DNA (5 μL) from each sample was used as a template for Affymetrix Version 6.0 assays. Genotyping reactions were performed using Affymetrix Genome-Wide Human SNP Nsp/Sty, Version 6.0 kit reagents and protocols. Genotypes were called using the Birdseed algorithm of the Affymetrix Genotyping Console Version 3.0.2. After genotyping, SNPs that showed erroneous genotype clustering patterns were excluded by visual inspection. One sample with a missing genotype rate > 5% was excluded from analysis. We also excluded 192 348 SNPs with > 1% missing genotypes, 274 345 SNPs with a minor allele frequency (MAF) < 5%, and 6532 SNPs that showed significant deviations from Hardy-Weinberg equilibrium (HWE; P < .001) in controls. In the end, a total of 456 522 autosomal SNPs in 201 cases and 497 controls were examined. For validation using the Korean cohort, a total of 88 SNPs were genotyped in 237 cases and 1000 controls using the MassARRAY system (Sequenom). Genotypes for SNPs that deviated from HWE in the validation set were confirmed by sequence analysis (Applied Biosystems), and SNPs of discordant genotypes were excluded in the analysis. A total of 9 SNPs that were validated in the Korean cohort were genotyped again for validation in the European cohort.
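The per-SNP quality-control thresholds described above (missing genotypes > 1%, MAF < 5%, HWE P < .001 in controls) can be illustrated with a minimal sketch. The function names are hypothetical, and the HWE check uses a 1-df chi-square goodness-of-fit approximation for brevity; the text does not state which HWE test was applied, so this is an assumption rather than the authors' procedure.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from genotype counts (AA, AB, BB) among called samples."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df

def passes_qc(all_counts, control_counts, n_missing, n_samples,
              max_missing=0.01, min_maf=0.05, hwe_alpha=0.001):
    """Apply the three SNP-level filters; thresholds follow the text."""
    missing_rate = n_missing / n_samples
    return (missing_rate <= max_missing
            and minor_allele_frequency(*all_counts) >= min_maf
            and hwe_chi2_p(*control_counts) >= hwe_alpha)
```

Here MAF and missingness are computed over all samples and HWE over controls only, matching the wording of the exclusion criteria; whether MAF was evaluated in all samples or in controls is not specified in the text.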

Statistical analysis

The population structures of our samples were examined to confirm genetic homogeneity and assess stratification using the multidimensional scaling method.8  Affymetrix Version 6.0 data for East Asian (JPT + CHB), Caucasian (CEU), and African (YRI) populations from the International HapMap Project were used for multidimensional scaling analysis. The genomic inflation factor (λ) was calculated based on the median χ2 statistic. Associations between SNP markers and disease susceptibility were tested using Cochran-Armitage trend tests with one degree of freedom. To select loci for further investigation, we searched for candidate regions in which the minimum P value was < 5.0 × 10−5 and more than 5 SNPs with P < .001 lay within 1 Mb. For validation in the Korean cohort, we selected a set of 88 nonredundant SNPs, which included 39 SNPs that passed quality control among the top 50 SNPs and 49 additional SNPs with P < .001 in the candidate regions. We selected 9 of the 88 SNPs with P < .05 for further validation in the European cohort. All statistical analyses, including association analyses, were performed using PLINK Version 1.06.9  Linkage disequilibrium structure was assessed using HaploView Version 4.1.10  We also performed SNP imputation to increase genome-wide coverage for further analyses. The IMPUTE program (Version 1.0.0) was used to impute 633 644 polymorphic SNPs that were not covered by the Affymetrix Version 6.0 array.11  The reference panel used for imputation was composed of 90 known JPT + CHB haplotypes from the International HapMap Project data (phase 2 Public Release #22 NCBI Build 36).
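For readers unfamiliar with the Cochran-Armitage trend test, the following is a minimal sketch of the 1-df statistic for a single SNP, assuming additive weights (0, 1, 2) for the number of copies of the tested allele. The function name and the example counts are illustrative only; the analyses in this study were run in PLINK, not with this code.

```python
import math

def cochran_armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Return (chi2, p) for the 1-df Cochran-Armitage trend test.

    case_counts / control_counts: genotype counts ordered by the number
    of copies of the tested allele (0, 1, 2).
    """
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    if n_cases == 0 or n_controls == 0:
        return 0.0, 1.0  # an empty group carries no information
    n_total = n_cases + n_controls
    col_totals = [r + s for r, s in zip(case_counts, control_counts)]

    # Trend statistic: weighted difference between cases and controls.
    t_stat = sum(w * (r * n_controls - s * n_cases)
                 for w, r, s in zip(weights, case_counts, control_counts))

    # Variance of the statistic under the null hypothesis of no association.
    square_term = sum(w * w * c * (n_total - c)
                      for w, c in zip(weights, col_totals))
    cross_term = sum(weights[i] * weights[j] * col_totals[i] * col_totals[j]
                     for i in range(len(weights))
                     for j in range(i + 1, len(weights)))
    variance = (n_cases * n_controls / n_total) * (square_term - 2 * cross_term)
    if variance == 0:
        return 0.0, 1.0  # monomorphic SNP: no evidence of trend

    chi2 = t_stat ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    return chi2, p_value

# Made-up genotype counts, not data from this study.
chi2, p = cochran_armitage_trend((10, 20, 30), (30, 20, 10))
```

With additive weights the statistic equals N times the squared correlation between genotype score and case status, which is a convenient way to check an implementation by hand.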

A total of 671 CML patients and 2073 control subjects of Korean and European descent were enrolled in our study to identify common genetic variants associated with CML (Figure 1).

Figure 1

Work flow.

Discovery set

In the discovery stage, we evaluated 456 522 common SNPs (MAF > 5%) for 201 cases and 497 controls in the Korean discovery set. Multidimensional scaling analysis demonstrated that the genetic variations exhibited by the Korean subjects overlap with those from JPT and CHB and are clearly distinct from CEU and YRI, according to the International HapMap Project data (supplemental Figure 1, available on the Blood Web site; see the Supplemental Materials link at the top of the online article). The distributions of observed P values for association tests across all SNPs tested showed no evidence of overall systematic bias (λ = 1.024) from the expected P values, and the excess of low P values was consistent with the presence of true associations (quantile-quantile plot; supplemental Figure 2). These observations indicate that our samples are genetically homogeneous and that significant associations are attributable to genetic differences in CML susceptibility.
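The genomic inflation factor can be sketched as the ratio of the median observed χ2 statistic to the median of a 1-df χ2 distribution (about 0.455); a value near 1, such as the λ = 1.024 reported here, indicates little systematic inflation. The helper names below are hypothetical and the snippet is an illustration of the definition, not the code used in the study.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 df (about 0.4549).
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2

def p_to_chi2(p_value):
    """Convert a two-sided P value (0 < p <= 1) to a 1-df chi-square statistic."""
    return _STD_NORMAL.inv_cdf(p_value / 2.0) ** 2

def genomic_inflation(chi2_stats):
    """Lambda: median observed chi-square over the expected null median."""
    return median(chi2_stats) / EXPECTED_MEDIAN_CHI2

# Evenly spaced P values mimic the null distribution, so lambda is close to 1.
null_p_values = [(i + 0.5) / 1001 for i in range(1001)]
null_lambda = genomic_inflation([p_to_chi2(p) for p in null_p_values])
```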

In genome-wide association analysis within the Korean discovery set (n = 698), a total of 56 SNPs with P < 5.0 × 10−5 (Cochran-Armitage trend test) were identified (the top 50 SNPs are listed in supplemental Table 1). Five loci fulfilled our criteria for significance (minimum P < 5.0 × 10−5 and P < .001 in more than 5 SNPs within 1 Mb): 5q33.3, 6p24.1, 6q25.1, 10q21.3, and 17p11.1 (Figure 2; supplemental Table 2). The minimum P values at these loci ranged from 1.4 × 10−6 to 3.5 × 10−5. Approximately 18 annotated genes, including the potentially cancer-related candidate genes RMND1, AKAP12, ZBTB2, EBF1, CTNNA3, and WSB1, were located on or near these loci. Regional association plots of the typed and imputed SNPs around the 5 loci revealed clusters of significant association peaks in the same regions (Figure 3; supplemental Figure 3).
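The locus-selection criteria can be expressed as a sliding-window scan. The sketch below makes two assumptions that the text does not spell out: "within 1 Mb" is read as a window spanning at most 1 Mb, and the lead SNP counts toward the "more than 5 SNPs with P < .001". The function name and the input layout (chromosome label, base-pair position, P value) are hypothetical.

```python
from collections import defaultdict

def candidate_regions(snps, lead_p=5.0e-5, support_p=1.0e-3,
                      min_support=6, window_bp=1_000_000):
    """Return merged (chrom, start, end) regions meeting both criteria.

    snps: iterable of (chrom, position_bp, p_value) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, p in snps:
        if p < support_p:  # keep only SNPs with P < .001
            by_chrom[chrom].append((pos, p))

    regions = []
    for chrom, hits in sorted(by_chrom.items()):
        hits.sort()
        left = 0
        for right in range(len(hits)):
            # Shrink the window until it spans at most window_bp.
            while hits[right][0] - hits[left][0] > window_bp:
                left += 1
            window = hits[left:right + 1]
            if len(window) >= min_support and min(p for _, p in window) < lead_p:
                start, end = window[0][0], window[-1][0]
                if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
                    # Merge with the previous overlapping region.
                    prev = regions[-1]
                    regions[-1] = (chrom, prev[1], max(prev[2], end))
                else:
                    regions.append((chrom, start, end))
    return regions
```

A cluster of six sub-threshold SNPs without any SNP below 5.0 × 10−5, or a strong lead SNP with too few supporting SNPs nearby, is not reported under this reading of the criteria.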

Figure 2

Manhattan plot of genome-wide association results. P values are from the Cochran-Armitage trend test. For further validation, we selected genomic regions in which more than 5 SNPs with P < .001 within 1 Mb were observed and the minimum P value was less than 5.0 × 10−5. The 5 selected genomic loci are shown in this figure.

Figure 3

Regional association plots. (A) 6q25.1. (B) 17p11.1. P values are from the Cochran-Armitage trend test. ● represents genotyped SNPs in the discovery set; and ○, genotyped SNPs in the validation set. The rs numbers of the validated SNPs are provided in Table 1. Gray shaded circles represent imputed SNPs; and gray line, recombination rates (estimated using the HapMap combined data). Arrows indicate the locations of genes. (Bottom panel) A linkage disequilibrium map based on r2 values computed using the HapMap JPT + CHB data.


Validation set

Validation studies were performed in cohorts of Korean (additional set of 237 CML cases and 1000 controls) and European (232 cases and 576 controls) descent for 88 SNPs, including 39 of the top 50 SNPs and 49 additional SNPs from the 5 significant loci. Locus 6q25.1 was validated in both Korean and European cohorts. Four SNPs on 6q25.1 (rs4869742, rs7765741, rs3900024, and rs6931104) showed significant associations in both validation sets (Table 1). The P values of the SNPs ranged between 5.5 × 10−6 and 8.4 × 10−4 in the discovery set, 2.6 × 10−3 and 5.0 × 10−2 in the Korean validation set, 2.4 × 10−6 and 2.8 × 10−5 in the pooled set of Korean cohorts, and 1.4 × 10−3 and 2.3 × 10−3 in the European validation set (Table 1; supplemental Table 3); rs4869742 was excluded from the analysis in the European validation set because of significant deviation from HWE. The directions of associations in the European set were identical to those in the Korean discovery and validation sets, and the associations were still significant after Bonferroni correction, suggesting that the locus 6q25.1 is associated with susceptibility to CML across populations.

For a selected representative SNP, rs7765741, the T allele frequency of cases in the combined Korean set was lower than that of controls (0.35 vs 0.44), and a similar trend was also observed in the European set (0.67 vs 0.75), although the T allele frequencies were higher than in the Korean set. The odds ratios were 0.69 (95% confidence interval, 0.59-0.81) and 0.68 (95% confidence interval, 0.53-0.85) in Koreans and Europeans, respectively.
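As a rough consistency check, an allelic odds ratio with a Woolf (log-scale) confidence interval can be recomputed from the rounded allele frequencies above. This sketch assumes that every sample had a called genotype (438 Korean cases and 1497 Korean controls after pooling, 232 European cases and 576 European controls) and that an allelic odds ratio was reported; the text does not state how the odds ratios were derived, so small differences from the published intervals are expected.

```python
import math

def allelic_odds_ratio(freq_cases, freq_controls, n_cases, n_controls, z=1.96):
    """Allelic OR and Woolf 95% CI from allele frequencies and sample sizes."""
    a = freq_cases * 2 * n_cases                # tested allele, cases
    b = (1.0 - freq_cases) * 2 * n_cases        # other allele, cases
    c = freq_controls * 2 * n_controls          # tested allele, controls
    d = (1.0 - freq_controls) * 2 * n_controls  # other allele, controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# T allele of rs7765741; sample sizes are assumptions (see lead-in).
korean = allelic_odds_ratio(0.35, 0.44, n_cases=438, n_controls=1497)
european = allelic_odds_ratio(0.67, 0.75, n_cases=232, n_controls=576)
```

Under these assumptions both point estimates and intervals land within about 0.01 of the reported values, which is the level of agreement one would expect from frequencies rounded to two decimals.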

Among the candidate SNPs, 3 in 17p11.1 (rs7221571, rs4795519, and rs33962847) remained significant in the Korean validation set (P values: 3.3 × 10−8 to 8.0 × 10−7; Table 1), even after Bonferroni correction (P values: 2.9 × 10−6 to 7.0 × 10−5), but not in the European validation set. The significances of these 3 SNPs were enhanced when the Korean discovery and validation sets were pooled (P values: 1.3 × 10−12 to 5.6 × 10−10). For the most significant SNP, rs4795519, the MAF in cases and controls was 0.37 and 0.52, respectively, with an odds ratio of 0.54 (95% confidence interval, 0.46-0.64). Regional association plots indicated that the locus is located near the 5′-end of the WSB1 gene (Figure 3). The significant SNPs are located in a strong linkage disequilibrium block that has been identified in East Asian samples. The −log10(P) values abruptly dropped when they crossed a recombination hotspot that is located in the upstream region of WSB1 (Figure 3).
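The Bonferroni-corrected values quoted above are consistent with multiplying each raw P value by 88, the number of SNPs genotyped in the Korean validation set. The short sketch below reproduces that arithmetic; the multiplier of 88 is inferred from the reported numbers rather than stated explicitly in the text.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Assumption: the correction covered the 88 SNPs in the Korean validation set.
N_VALIDATION_SNPS = 88

raw_p_values = [3.3e-8, 8.0e-7]  # range reported for the 3 SNPs in 17p11.1
corrected = [bonferroni(p, N_VALIDATION_SNPS) for p in raw_p_values]
```

The two corrected values are about 2.9 × 10−6 and 7.0 × 10−5, matching the range given in the text.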

The key finding of the present study is that the chromosomal loci 6q25.1 and 17p11.1 are associated with susceptibility to CML. In the discovery set, 5 chromosomal loci (6q25.1, 17p11.1, 5q33.3, 6p24.1, and 10q21.3) were initially identified as candidate sites (minimum P values: 1.4 × 10−6 to 3.5 × 10−5). Locus 6q25.1 was validated not only in an additional Korean cohort, but also in a Caucasian cohort. Several genes are located within a 1-Mb region of the locus 6q25.1, including ZBTB2, RMND1, C6orf211, C6orf97, and AKAP12 (Figure 3). Recently, 6q25.1, which is upstream of the estrogen receptor 1 gene, was revealed to be associated with breast cancer susceptibility.12  However, it is not known whether ZBTB2, RMND1, C6orf211, C6orf97, and AKAP12 are involved in tumorigenesis, particularly in CML.

ZBTB2 is a POK family transcription factor and a potent repressor of the ARF-HDM2-p53-p21 pathway in cell cycle regulation.13 ZBTB2 can inhibit p53 binding and repress transcriptional activation by p53. Therefore, ZBTB2 could potentially be a master control gene of the p53 pathway.13 ZBTB2 is also a potent transcriptional repressor of the cell cycle arrest gene p21, acting by inhibiting p53 and Sp1.13  Thus, ZBTB2 might be involved in leukemogenesis via deregulation of the p53 pathway. ZBTB2 SNPs, especially those upstream of ZBTB2, appear to be significantly associated with CML susceptibility. Based on the results of the current study, further functional studies are needed to elucidate the role of ZBTB2 in CML leukemogenesis.

Another candidate gene is AKAP12/gravin, which is down-regulated not just in CML, but also in acute myeloid leukemia and myelodysplastic syndromes, suggesting an association between AKAP12 gene expression and myeloid neoplasm leukemogenesis.14  In addition, AKAP12 is inactivated by epigenetic mechanisms in gastric carcinoma and shows growth suppressor activity.15  The chromosomal position of the SNP with the lowest P value in 6q25.1 is near the 3′-untranslated region of AKAP12 (Figure 3).

In addition, the association of the locus 17p11.1 with CML susceptibility was validated in the additional Korean cohort in which 3 SNPs (rs7221571, rs4795519, and rs3396847) were significant (P = 3.3 × 10−8 to P = 8.0 × 10−7), but the locus was not validated in the Caucasian cohort. The 5′-end of a candidate gene, WSB1, is located near the significant peak. Although its precise function is still unknown, WSB1 is associated with resistance to apoptosis, cell cycle acceleration, and protection from cell death induced by cellular stresses, such as DNA damage and genotoxic stress.16,17  In addition, changes in copy number of WSB1 are related to overexpression of WSB1 and are associated with the survival of neuroblastoma patients.18  Furthermore, WSB1 is known to be involved in the progression of pancreatic cancer.19  Although the 17p11.1 locus (on which WSB1 is located) was only significantly associated with CML susceptibility in the Korean population, it cannot be completely ruled out as a CML risk locus and further validation studies should be conducted.

Because of the relatively low incidence of CML (0.6-2.0 cases per 100 000 persons), it is difficult to recruit a large number of cases (> 1000) for genome-wide association studies. In the present study, a total of 2744 subjects were recruited, including 671 cases and 2073 controls. To the best of our knowledge, this study performed genome-wide analysis on the largest number of CML patients. Another strength of our study is that we validated our findings across ethnicities.

In conclusion, the present study identified 2 chromosomal loci, 6q25.1 and 17p11.1, that are associated with CML susceptibility. SNPs in the 6q25.1 locus were associated with CML in cohorts of both Korean and European descent, whereas SNPs at the 17p11.1 locus were associated with CML only in the Korean cohort. Our findings may provide new insights into the pathophysiology of CML and suggest a basis for CML disease prediction and therapeutics.

The online version of this article contains a data supplement.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (NRF-2010-0010208), and the Korea Healthcare Technology R&D Project, Ministry of Health & Welfare, Republic of Korea (A092255). Hyeoung-Joon Kim was supported by the Korea Health 21 R&D Project, Ministry of Health & Welfare, Republic of Korea (A010385).

Contribution: D.H.K. and J.-W.K. designed the study; D.H.K., S.-T.L., H.-H.W., and J.-W.K. performed data analyses and interpreted the results; H.-H.W. conducted statistical and bioinformatic analyses; S.K. and M.-J.K. supervised statistical analyses; Hee-Jin Kim, S.-H.K., Hyeoung-Joon Kim, Y.-K.K., S.K.S., J.H.M., and C.W.J. helped collect data for the Korean cohort; J.H.L. was responsible for data collection from the Canadian cohort and experimental validation procedures; D.H.K., S.-T.L., and H.-H.W. wrote the manuscript; and all authors contributed to the final version of the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Dong Hwan (Dennis) Kim, Department of Hematology/Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Ilwon-dong 50, Gangnam-gu, Seoul, Korea, 135-710; e-mail: drkiim@medimail.co.kr; and Jong-Won Kim, Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Ilwon-dong 50, Gangnam-gu, Seoul, Korea, 135-710; e-mail: culture.jkim@gmail.com.

1. Di Bernardo MC, Crowther-Swanepoel D, Broderick P, et al. A genome-wide association study identifies six susceptibility loci for chronic lymphocytic leukemia. Nat Genet. 2008;40(10):1204-1210.
2. Crowther-Swanepoel D, Broderick P, Di Bernardo MC, et al. Common variants at 2q37.3, 8q24.21, 15q21.3 and 16q24.1 influence chronic lymphocytic leukemia risk. Nat Genet. 2010;42(2):132-136.
3. Papaemmanuil E, Hosking FJ, Vijayakrishnan J, et al. Loci on 7p12.2, 10q21.2 and 14q11.2 are associated with risk of childhood acute lymphoblastic leukemia. Nat Genet. 2009;41(9):1006-1010.
4. Sherborne AL, Hosking FJ, Prasad RB, et al. Variation in CDKN2A at 9p21.3 influences childhood acute lymphoblastic leukemia risk. Nat Genet. 2010;42(6):492-494.
5. Prasad RB, Hosking FJ, Vijayakrishnan J, et al. Verification of the susceptibility loci on 7p12.2, 10q21.2, and 14q11.2 in precursor B-cell acute lymphoblastic leukemia of childhood. Blood. 2010;115(9):1765-1767.
6. Knight JA, Skol AD, Shinde A, et al. Genome-wide association study to identify novel loci associated with therapy-related myeloid leukemia susceptibility. Blood. 2009;113(22):5575-5582.
7. Kim DH, Xu W, Ma C, et al. Genetic variants in the candidate genes of the apoptosis pathway and susceptibility to chronic myeloid leukemia. Blood. 2009;113(11):2517-2525.
8. Price AL, Zaitlen NA, Reich D, Patterson N. New approaches to population stratification in genome-wide association studies. Nat Rev Genet. 2010;11(7):459-463.
9. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559-575.
10. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21(2):263-265.
11. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009;5(6):e1000529.
12. Zheng W, Long J, Gao YT, et al. Genome-wide association study identifies a new breast cancer susceptibility locus at 6q25.1. Nat Genet. 2009;41(3):324-328.
13. Jeon BN, Choi WI, Yu MY, et al. ZBTB2, a novel master regulator of the p53 pathway. J Biol Chem. 2009;284(27):17935-17946.
14. Boultwood J, Pellagatti A, Watkins F, et al. Low expression of the putative tumour suppressor gene gravin in chronic myeloid leukaemia, myelodysplastic syndromes and acute myeloid leukaemia. Br J Haematol. 2004;126(4):508-511.
15. Choi MC, Jong HS, Kim TY, et al. AKAP12/Gravin is inactivated by epigenetic mechanism in human gastric carcinoma and shows growth suppressor activity. Oncogene. 2004;23(42):7095-7103.
16. Choi DW, Seo YM, Kim EA, et al. Ubiquitination and degradation of homeodomain-interacting protein kinase 2 by WD40 repeat/SOCS box protein WSB-1. J Biol Chem. 2008;283(8):4682-4689.
17. Dentice M, Bandyopadhyay A, Gereben B, et al. The Hedgehog-inducible ubiquitin ligase subunit WSB-1 modulates thyroid hormone activation and PTHrP secretion in the developing growth plate. Nat Cell Biol. 2005;7(7):698-705.
18. Chen QR, Bilke S, Wei JS, et al. Increased WSB1 copy number correlates with its over-expression which associates with increased survival in neuroblastoma. Genes Chromosomes Cancer. 2006;45(9):856-862.
19. Archange C, Nowak J, Garcia S, et al. The WSB1 gene is involved in pancreatic cancer progression. PLoS ONE. 2008;3(6):e2475.

Author notes

* D.H.K., S.-T.L., and H.-H.W. contributed equally to this study.
