Abstract
Individual variability, including disease susceptibility, is determined by the interaction of inherited single base differences (single nucleotide polymorphisms, SNPs) and copy number variants (CNVs) of large genomic regions. A complex combination of these factors may result in a genetic background predisposing to disease. CNV regions account for approximately 12% of the human genome, include coding sequences, and range in size from kilobases to megabases. Recent studies have investigated the correlation between CNVs and complex conditions, including mental retardation, lupus and cardiovascular disease. While SNPs have been intensely investigated in many diseases, the influence of CNVs on disease susceptibility remains poorly understood. With the advent of high-throughput, high-density array technology, global analysis of complex disease predisposition traits, including CNVs, can be performed. We have applied high-density SNP arrays (SNP-A) to the analysis of somatic chromosomal defects in various hematologic disorders. During our studies we noted a high frequency of germ-line CNVs, which complicated our analysis of somatic defects. This observation led us to hypothesize that CNVs can themselves constitute predisposition factors for disease, and we chose to systematically investigate their type and frequency in myeloid disorders including aplastic anemia (AA; N=65), myelodysplastic syndrome (MDS; N=145) and primary and secondary (non-core binding factor) acute myeloid leukemia (AML; N=75). We performed whole genome scanning in patients and a cohort of healthy controls (N=79) using the Affymetrix 250K SNP array. We first identified and catalogued CNVs in controls; their frequency was compared to that reported in the Database of Genomic Variants (http://projects.tcag.ca/variation/) and found to be similar. The CNVs ranged in size from 245.6 Kb to 2.32 Mb (average 805.9 Kb) and were identified on all chromosomes except 5, 13, 16, 18 and 21.
We next analyzed copy number changes in patients with myeloid disorders. Using controls (both our cohort and those in the literature) as a reference, we determined the frequencies of recurrent CNVs in patients. For most of the CNVs the frequency was <10% within the individual patient groups, similar to what was seen in controls. Nonetheless, four regions (two distinct loci in the pericentromeric region of 14q, pericentromeric 15q and a locus on 17q21.31) were identified in over 15% of samples studied. We then asked whether distinct CNVs are associated with specific disease risk. While for most CNVs the frequencies found in patients were similar to those in controls, two regions, 3q29 and 14q11.2, were more frequently encountered in patients with AML (3q29, 27/75 vs. 13/79 in controls, p=0.01; 14q11.2, 20/75 vs. 8/79 in controls, p=0.014). The region at 3q29 contains several genes and is a common breakpoint region for hematologic malignancies including MDS and AML, suggesting that this chromosomal area is sensitive to physical rearrangement. The locus at 14q11.2 is a known hypervariable region containing T cell receptor genes. In sum, in addition to SNPs, CNVs may be a part of complex genetic traits in patients with AA, MDS and AML and may constitute disease predisposition factors. Beyond their potential role in disease, CNVs have to be excluded in SNP array-based analysis of somatic chromosomal lesions.
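The AML versus control comparison above can be re-derived from the reported counts. The abstract does not state which statistical test was used, so the following is only a minimal sketch that assumes a chi-square test with Yates' continuity correction on a 2x2 table; the function name `yates_chi2_p` and the table layout are illustrative choices, not the authors' method.

```python
import math

def yates_chi2_p(a, b, c, d):
    """Chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]],
    using Yates' continuity correction (1 degree of freedom).
    Assumed test: the abstract does not name the one actually used."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator, clamped at zero so the correction cannot overshoot.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Counts from the abstract: (carriers in AML, AML N, carriers in controls, control N).
regions = {"3q29": (27, 75, 13, 79), "14q11.2": (20, 75, 8, 79)}
for name, (k_aml, n_aml, k_ctl, n_ctl) in regions.items():
    chi2, p = yates_chi2_p(k_aml, n_aml - k_aml, k_ctl, n_ctl - k_ctl)
    print(f"{name}: AML {k_aml / n_aml:.1%} vs controls {k_ctl / n_ctl:.1%}, "
          f"chi2={chi2:.2f}, p={p:.4f}")
```

Under this assumed test, the p-values for both regions come out close to the reported 0.01 and 0.014; a different test (for example Fisher's exact test) would give somewhat different values.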
Author notes
Disclosure: No relevant conflicts of interest to declare.