Abstract
Complex, non-Mendelian genetic traits may be responsible for the predisposition to many conditions. MDS likely evolves as a result of the interaction between environmental agents and genetic background. Examples of genetic factors that could affect the risk of developing MDS or influence its course include immunogenetic factors, polymorphisms of detoxifying enzymes, proteins of the DNA repair machinery and senescence-associated genes. Analysis of the complex interactions between a multitude of gene variants and disease phenotypes is difficult, and empiric approaches have little chance of identifying polymorphisms associated with specific genotypes. Recently, the development of whole genome scans based on the detection of individual single nucleotide polymorphisms (SNPs) has provided an excellent tool to search for disease-specific haplotypes. We applied the Affymetrix 50K SNP array (SNP-A) to study 66 MDS patients (21 RA/RCMD, 17 RARS, 22 RAEB1/2, 6 CMML) and 29 control marrow specimens. In total, 3.135 × 10⁶ and 1.377 × 10⁶ genotypes were obtained for MDS and controls, respectively. As a reference, we also used 40 matched controls from the HapMap project. Genotype calls were computed using GTYPE and analyzed with Exemplar software. All samples with inadequate call rates were excluded. Statistical analysis was performed using χ², Yates, Fisher exact, odds ratio, LD, TDT, HHRR and haplotype estimate analyses; the necessary corrective measures (Bonferroni, Wilcox/permutation and false discovery rate) were applied, as well as an expectation-maximization (EM) algorithm module that performs haplotype-based analysis to identify possible associations between haplotype pairs and the phenotype. Automated analysis selected SNPs based on multiple lines of statistical evidence, and we reduced the pool of potentially informative SNP markers to the 100 most significant ones present in the entire MDS group.
Of these, SNPs present in <20% of the control population were selected for further analysis. For example, potentially predisposing SNPs associated with the genes PARP16 and RASGRF2, occurring in homozygous form in controls at frequencies of 1% and 7%, were found in patients at frequencies of 20% and 26%, respectively (p<0.01 for both). Analogous analyses were performed for more stringently defined subsets of MDS patients, compared with each other as well as with controls. When patients with potentially pathogenetically distinct RARS and CMML were excluded from the analysis, 80 SNPs could be selected at a cutoff value of p=0.001 that were selectively associated with RA/RAEB (in LD with PIPK2A, USP31, RNGTT), while certain SNPs (and associated genes) segregated with the entire MDS group (e.g. those in LD with RASGGRP3). Finally, some SNPs were shared between the total group and patients with RA/RAEB. These examples illustrate that individual genetic risk likely results from the interplay of a multitude of gene polymorphisms. Due to the inherent complexity of possible associations, a multilocus analysis is currently being performed to create predictive models that use a hierarchy of SNPs to predict specific phenotypes. It is possible that such decision tree models using 3–5 SNPs will achieve a specificity of around 90% in a validation group.
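The single-SNP association tests named above (odds ratio, Fisher exact, Bonferroni correction) can be sketched as follows. This is a minimal illustration only, not the GTYPE/Exemplar pipeline used in the study: the function names, the genotype counts and the number of tests are hypothetical and were chosen purely to show the calculation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def odds_ratio(a, b, c, d):
    """Sample odds ratio; adds 0.5 to every cell if any cell is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    return (a * d) / (b * c)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


# HYPOTHETICAL counts (not study data): homozygous carriers vs. all others.
cases_hom, cases_other = 12, 48        # 12 of 60 patients (20%)
controls_hom, controls_other = 3, 57   # 3 of 60 controls (5%)
N_TESTS = 50_000                       # assumed number of SNPs tested

table = (cases_hom, cases_other, controls_hom, controls_other)
p_nominal = fisher_exact_two_sided(*table)
print("odds ratio:", odds_ratio(*table))
print("nominal p:", p_nominal)
print("Bonferroni-adjusted p:", bonferroni(p_nominal, N_TESTS))
```

With counts of this size, a nominally significant p-value does not survive correction across tens of thousands of markers, which is one reason that selecting SNPs on multiple lines of statistical evidence and multilocus models are of interest.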
Disclosure: No relevant conflicts of interest to declare.