Figure 2.
Patterns of uncorroborated mutations and prolific data contamination. (A) Corroboration status for variants reported in 9 representative regions affected by clustered mutations according to Panea_S3. For each region, mutations reported in multiple patients are shown on a separate row, with blue boxes indicating the patient(s) in which the variant could be corroborated and yellow boxes indicating uncorroborated mutations. The rows annotated with sets of arrows correspond to the variants labeled as “prolific” in panel B. The reported existence of each of these prolific variants was uncorroborated in multiple patients and typically corroborated in only 1 to 2 patients. (B) Representative examples of prolific mutations that could not be corroborated. Integrative Genomics Viewer visualizations of the sequencing data for regions of the BACH2 (top) and HIST1H1E (bottom) genes include boxes outlining the locations of corroborated (dark blue) or uncorroborated (red) variants in individual genomes. Sample 2965 is the source of the contamination, both the directional contamination of sample 2966 and that of the other samples shown. Samples marked with red arrows (A) or red labels (B) were reported as mutated in Panea_S2 but were absent from supplemental Table 3. (C) Effect of prolific contamination on the reported rate of mutations in affected genes. The red bars indicate genes with a significantly lower mutation frequency in the reanalysis than reported in Panea_S2. bp, base pair.