Ley TJ, Mardis ER, Ding L, et al. Nature. 2008;456:66-72.

In 2004, an essentially complete version of the human genome sequence was reported.1 Little more than four years later, using nucleotide sequence data from that project as a blueprint, Bentley, et al.2 reported sequencing the entire genome of a Nigerian male in eight weeks at “low cost.” In the same issue of Nature, Wang and colleagues published the genome sequence of a Han Chinese individual,3 and Timothy Ley, Elaine Mardis, and co-workers from Washington University, St. Louis, reported sequencing not only an acute myeloid leukemia (AML) genome but also its matched normal counterpart from the patient’s skin, with a direct comparison of the two sequences. All three projects used “next generation” technology called massively parallel synthetic sequencing. For the AML genome, starting from 1 µg of DNA, 98 billion bases were sequenced, providing 32.7-fold haploid coverage of the 3-billion-base human genome. For the normal skin control, 41.8 billion bases were sequenced, giving 13.9-fold haploid coverage.
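The coverage figures quoted above follow from dividing the number of bases sequenced by the genome size. A minimal Python sketch of that arithmetic, assuming the rounded 3-billion-base genome size given in the text (the function and variable names are illustrative, not taken from the study):

```python
# Fold coverage = total bases sequenced / haploid genome size.
# GENOME_SIZE_BP is the rounded 3-billion-base figure used in the text (an assumption).
GENOME_SIZE_BP = 3_000_000_000


def fold_coverage(bases_sequenced: float, genome_size: float = GENOME_SIZE_BP) -> float:
    """Return the average number of times each base of the genome was read."""
    return bases_sequenced / genome_size


tumor_coverage = fold_coverage(98e9)    # AML genome: 98 billion bases
skin_coverage = fold_coverage(41.8e9)   # normal skin: 41.8 billion bases

print(f"AML genome:  {tumor_coverage:.1f}-fold")
print(f"Normal skin: {skin_coverage:.1f}-fold")
```

Rounded to one decimal place, the two values agree with the 32.7-fold and 13.9-fold coverage reported for the tumor and skin samples.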

In addition to the remarkable technical achievement of demonstrating that whole-genome sequencing is a feasible approach to unbiased discovery of tumor-specific somatic mutations, the study of Ley, Mardis, and colleagues provided a number of new insights into the pathobiology of cytogenetically normal AML. The study focused on identifying non-synonymous sequence variants (i.e., nucleotide substitutions that change the amino acid sequence of proteins). To get to that point, 3,813,205 single nucleotide variants (SNVs) were identified in the AML genome. Of those, 2,647,695 were supported by Decision Tree analysis, of which 2,584,418 were also found in the skin sample, leaving 63,277 tumor-specific SNVs. The 31,645 tumor-specific SNVs that were present in the dbSNP/Watson/Venter databases were eliminated from analysis, leaving 31,632 new tumor-specific SNVs. After exclusion of SNVs in non-genic regions (20,440), intronic regions (10,735), and untranslated regions (216), 241 tumor-specific SNVs were localized to coding regions. Sixty of these coding SNVs were synonymous, and after elimination of 173 that were false positives, germline, or unvalidatable for technical reasons, 8 SNVs were validated as non-synonymous somatic mutations. Two further somatic mutations, both well known in AML, were insertions rather than SNVs (an internal tandem duplication of FLT3 and a 4-bp insertion affecting NPM1), for a total of 10. All eight novel mutations were considered heterozygous; two were nonsense mutations and six were missense mutations. Surprisingly, none of the eight novel mutations was identified in 187 other cases of AML. The rarity of somatic variants argues that neither genetic instability nor defects in DNA repair contributed to the pathophysiology of AML in this case.
Expression was detected for five of the eight novel somatically mutated genes, but no functional studies designed to assess the role of the mutant genes in AML pathogenesis were reported. It is therefore conceivable that some (or all) of the novel somatic mutations are non-pathogenic (i.e., passenger rather than driver mutations).
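The filtering steps described above can be checked as a running subtraction. The following Python sketch uses only the counts quoted in the text; the step labels and the `apply_filters` helper are illustrative assumptions, not part of the published analysis pipeline:

```python
def apply_filters(start, steps):
    """Subtract each (label, removed) step in turn and return the running totals."""
    remaining = start
    totals = []
    for label, removed in steps:
        remaining -= removed
        totals.append((label, remaining))
    return totals


# Starting point: SNVs supported by Decision Tree analysis (from the text).
supported_snvs = 2_647_695

# Each step removes the number of SNVs quoted in the text.
steps = [
    ("also found in skin sample", 2_584_418),
    ("present in dbSNP/Watson/Venter", 31_645),
    ("non-genic regions", 20_440),
    ("intronic regions", 10_735),
    ("untranslated regions", 216),
    ("synonymous coding changes", 60),
    ("false positive, germline, or unvalidatable", 173),
]

totals = apply_filters(supported_snvs, steps)
for label, remaining in totals:
    print(f"after removing {label}: {remaining:,}")
```

The intermediate totals reproduce the 63,277, 31,632, and 241 quoted in the text, and the final subtraction leaves 8 SNVs, matching the eight novel mutations. The FLT3 internal tandem duplication and the NPM1 4-bp insertion are insertions rather than single nucleotide variants, so they do not appear in this tally.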

To understand fully the role of somatic mutations in the pathogenesis of AML (and other malignant neoplasms), many more genomes must be sequenced, a task that even a decade ago seemed unrealistic. However, use of whole human genome sequencing as a clinical tool now appears imminent, given the remarkable studies of Ley, et al., Bentley, et al., and Wang, et al.; the commitment of the National Institutes of Health (35 grants totaling $56 million to universities and companies to develop technology that can produce a whole-genome sequence for $1,000); and the interest of the private sector (the X Prize offers $10 million to the first group that can sequence 100 human genomes in 10 days for $10,000 or less per genome).

1. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004;431:931-45.
2. Bentley DR, Balasubramanian S, Swerdlow HP, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53-9.
3. Wang J, Wang W, Li R, et al. The diploid genome sequence of an Asian individual. Nature. 2008;456:60-5.

Competing Interests

Dr. Parker indicated no relevant conflicts of interest.