Figure 2.
Key concepts in sequencing-based diagnostics. (A) VAF is the number of sequencing reads that contain a variant divided by the total number of reads at that position. Because most somatic mutations are heterozygous, doubling the VAF generally indicates the fraction of cells with the mutation (except when the mutation occurs in a region of copy number alteration). (B) Coverage represents the number of sequencing reads (red and blue indicating forward and reverse reads, respectively) that span a particular region. Approximate coverage levels for different sequencing approaches are compared. Higher coverage (or more independent observations) generally yields more sensitive sequencing. Shown on the right is the coverage depth required to detect mutations at various VAFs, expressed as the binomial sampling probability of detecting variants with VAFs of 50% (typical inherited variants; black), 2% (general sensitivity for targeted panels; red), and 0.1% (MRD assays; blue), assuming each variant must be seen at least twice. (C) DNA-sequencing methods. In WGS, libraries are created by ligating sequencing adapters (gray and orange) to the 3′ and 5′ ends of short genomic DNA fragments called “inserts.” Gene panel or exome sequencing enriches DNA of interest from a library using antisense capture probes (green) labeled with biotin, which are hybridized to DNA inserts from the sequencing library and then physically enriched using streptavidin-coated magnetic beads. (D) High-sensitivity sequencing for MRD detection requires error correction to reliably identify mutations below the intrinsic error rate of the sequencer and to account for PCR errors. Error-corrected deep sequencing reduces false-positive calls for low-VAF variants by tagging individual DNA molecules with unique molecular identifiers (UMIs). In this example, a “true” mutation “T” is present in a single DNA molecule that is labeled with a UMI (green). 
Library amplification and sequencing will result in duplicate DNA molecules each labeled with the same UMI. Randomly accumulated sequencing and PCR errors (orange) will be present in only a subset of reads with a particular UMI (green, purple, red). During sequencing analysis, variants present on only a subset of reads from a particular “read family” with the same UMI will be discarded as errors; true mutations present in the original DNA molecule will be detected in all reads within a read family with the same UMI. UMI methods can be further improved by tracking both DNA strands using “duplex sequencing,” which can yield sensitivities of 10⁻⁶ (ref. 31). Professional illustration by Patrick Lane, ScEYEnce Studios.
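
The relationships described in panels A and B can be sketched numerically. The Python snippet below is a minimal illustration, not the method used to generate the figure: it assumes each read is an independent draw that carries the variant with probability equal to the VAF, and it ignores sequencing error. The function names `detection_probability` and `mutant_cell_fraction` and the example depths are hypothetical choices made for this sketch.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_reads: int = 2) -> float:
    """Probability of observing at least `min_reads` variant reads.

    Assumes the variant read count follows Binomial(depth, vaf),
    i.e. reads are independent and error-free (a simplification).
    """
    # P(X < min_reads): sum of the binomial terms for 0 .. min_reads - 1 reads
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


def mutant_cell_fraction(vaf: float) -> float:
    """Approximate fraction of cells carrying a heterozygous mutation.

    Valid only in a copy-neutral region; capped at 1.0.
    """
    return min(1.0, 2.0 * vaf)


if __name__ == "__main__":
    # VAFs taken from the caption; depths are illustrative assumptions
    for vaf in (0.5, 0.02, 0.001):
        for depth in (30, 500, 10_000):
            p = detection_probability(depth, vaf)
            print(f"VAF={vaf:.3f} depth={depth:>6} P(seen >= 2 times)={p:.4f}")
```

Under these assumptions, a variant at 0.1% VAF is almost never seen twice at a depth of 30 reads but is seen twice in nearly every experiment at a depth of 10 000 reads, which is why MRD assays need far deeper coverage than assays for inherited variants.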