Introduction: At the beginning of 2015, the FDA issued a public call for feedback on its regulatory approach to diagnostic tests based on next-generation sequencing technology. To establish the clinical performance of such tests, one proposal was to use community-derived databases to classify variants, especially ClinVar, which is gaining increasing traction. Diagnosis and prognostication in patients with myelodysplastic syndromes (MDS) may be improved by high-throughput mutation profiling.

Aim: We used our previously described and well-characterized MDS mutation dataset (Haferlach et al, Leukemia 2014) to investigate whether the data present in variation databases are sufficient to classify variants unambiguously and confidently.

Patients and Methods: A total of 944 patients with various MDS subtypes were screened for mutations in 104 known or putative MDS-relevant genes using targeted deep sequencing (Illumina, San Diego, CA). Of the 104 genes investigated, a subset of 6 genes with high mutation incidence (mutated in >10% of all cases) was selected for this study: TET2, SF3B1, ASXL1, SRSF2, DNMT3A, and RUNX1. The following databases were used for this assessment: ClinVar (release 2015-01-08), COSMIC (v71), and dbSNP (v142).

Results: Of the 944 examined cases, 713 were mutated in at least one of the 6 selected genes, carrying a total of 1431 mutations, 736 of which were distinct. We investigated 25 of these mutations, each occurring in at least 5 MDS patients, which together comprised 42% of all mutations within this subset of genes. Surprisingly, although some of these mutations are well characterized, none of them is referenced in ClinVar. COSMIC lists 24/25 mutations as somatic, 22 bearing the status "Confirmed as somatic" and 2 mutations in TET2 bearing "Variant of unknown origin". One mutation in SF3B1 (c.1866G>T) is ambiguously flagged as a SNP despite also being "Confirmed as somatic". Of the 25 mutations, 8 are listed in dbSNP. In 6 of these instances, no global minor allele frequency (MAF) is given and the validation status lists only a single submitter. Ideally, every SNP is defined by its MAF, an intrinsic property denoting the frequency at which the variant base occurs in the population. In two instances, however, the dbSNP entry is supported by the 1000 Genomes Project and a MAF, which increases the likelihood that these mutations are filtered out in automated variant-calling processes. We also analyzed these mutations with two tools (PolyPhen-2 and SIFT) that predict the possible impact of an amino acid substitution on the structure and function of human proteins. Results were available for 18/25 mutations. In 16/18 cases PolyPhen-2 predicted a probably or possibly damaging consequence (score > 0.5), with 2/18 being rated as benign (score < 0.2). SIFT predicted a deleterious impact in 15/18 cases (score ≤ 0.05) and a tolerated status in 3/18 (score > 0.1). Surprisingly, the benign ratings of PolyPhen-2 and the tolerated ratings of SIFT were assigned to different mutations with no overlap, leaving 5 mutations non-interpretable based on the combined use of both tools.
All investigated mutations were missense variants, with the exception of 4 frameshift mutations and 2 nonsense mutations, for which the amino acid impact prediction did not yield any results (see Table 1).
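The combined use of both prediction tools described above can be sketched as follows. This is only an illustration, not the pipeline used in the study: the function names, the "indeterminate" bin for scores falling between the quoted cut-offs, and the example scores are assumptions introduced here. Only the cut-off values themselves are taken from the Results.

```python
from typing import Optional

def polyphen_call(score: float) -> str:
    """Bin a PolyPhen-2 score using the cut-offs quoted in the Results."""
    if score > 0.5:
        return "damaging"        # probably or possibly damaging
    if score < 0.2:
        return "benign"
    return "indeterminate"       # assumption: gap between the two cut-offs

def sift_call(score: float) -> str:
    """Bin a SIFT score; low scores indicate a deleterious substitution."""
    if score <= 0.05:
        return "damaging"        # reported by SIFT as deleterious
    if score > 0.1:
        return "benign"          # reported by SIFT as tolerated
    return "indeterminate"       # assumption: gap between the two cut-offs

def consensus(polyphen: Optional[float], sift: Optional[float]) -> str:
    """Combine both tools; any disagreement is non-interpretable."""
    if polyphen is None or sift is None:
        # e.g. frameshift or nonsense mutations, which yield no prediction
        return "no prediction"
    calls = {polyphen_call(polyphen), sift_call(sift)}
    if calls == {"damaging"}:
        return "damaging"
    if calls == {"benign"}:
        return "benign"
    return "non-interpretable"

# Hypothetical scores for illustration only, not data from the study.
examples = {
    "variant_A": (0.98, 0.01),   # both tools agree
    "variant_B": (0.10, 0.01),   # PolyPhen-2 benign, SIFT deleterious
    "variant_C": (None, None),   # no amino acid substitution to score
}
for name, (pp, sf) in examples.items():
    print(name, consensus(pp, sf))
```

Under this scheme, a mutation rated benign or tolerated by only one of the two tools falls into the non-interpretable group, which corresponds to the 5 discordant mutations reported above.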

Conclusion: 1) Applied to these MDS mutation profiling data, the assessment of mutations based on today's available databases is disappointing, especially with regard to the data pool of ClinVar. 2) COSMIC, on the other hand, seems to be useful in instances where the gene under investigation belongs to the Cancer Gene Census, a subset of genes whose variants undergo expert-level manual curation. 3) Prediction tools for novel mutations (with no record in any database) seem to perform well in quite a few instances, but a consensus of multiple tools is urgently needed, as individual tools may yield contradictory results. 4) Given the increasing awareness in hematology that the presence of a gene sequence variation in a patient's germline does not allow the conclusion that it is a benign polymorphism, as has been shown for CEBPA, RUNX1 and TP53 mutations, future databases have to take the functional implications of gene sequence variations into account.
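The filtering risk noted in the Results, where a dbSNP entry with a MAF may cause a somatic mutation to be discarded automatically, can be illustrated with a small triage sketch. The record layout, field names, status strings, and the naive filter rule are all assumptions made for this illustration. They do not reflect the actual schema of COSMIC or dbSNP, nor any specific variant-calling pipeline.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VariantEvidence:
    """Hypothetical per-variant summary of database look-ups."""
    name: str
    cosmic_status: Optional[str]   # e.g. "Confirmed as somatic"; None if not listed
    in_dbsnp: bool
    dbsnp_maf: Optional[float]     # global MAF if dbSNP reports one

def naive_snp_filter(v: VariantEvidence) -> bool:
    """Assumed automated rule: drop anything listed in dbSNP with a MAF."""
    return v.in_dbsnp and v.dbsnp_maf is not None

def triage(v: VariantEvidence) -> str:
    """Flag variants where the naive filter contradicts somatic evidence."""
    would_filter = naive_snp_filter(v)
    if would_filter and v.cosmic_status == "Confirmed as somatic":
        return "conflict: manual review"
    if would_filter:
        return "filtered as polymorphism"
    if v.cosmic_status is not None:
        return "retained: COSMIC evidence"
    return "retained: no database evidence"

# Hypothetical records for illustration only, not data from the study.
records = [
    VariantEvidence("variant_A", "Confirmed as somatic", True, 0.001),
    VariantEvidence("variant_B", "Confirmed as somatic", True, None),
    VariantEvidence("variant_C", None, False, None),
]
for r in records:
    print(r.name, "->", triage(r))
```

Routing conflicting records to manual review rather than silently discarding them is one possible design choice. It keeps well-characterized somatic mutations from being lost only because a population database also lists them.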

Disclosures

Nadarajah:MLL Munich Leukemia Laboratory: Employment. Meggendorfer:MLL Munich Leukemia Laboratory: Employment. Kern:MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Haferlach:MLL Munich Leukemia Laboratory: Employment, Equity Ownership. Haferlach:MLL Munich Leukemia Laboratory: Employment, Equity Ownership.

