Figure 1.
Patterns of uncorroborated mutations and directional data contamination. (A) Heatmap of a subset of the mutations reported in the 101 patients, with each mutation colored by its read support: supported in the sequencing data from the sample specified (“corroborated,” blue), uncorroborated in data from this patient but present in data from a sample identified as a potential contaminant (“data contamination,” yellow), or with no read support in either (“uncorroborated,” red). The patients are arranged according to their subtype and Epstein-Barr virus (EBV) status. The rows are ordered by hierarchical clustering, with the dendrogram (left) showing the clustering based on the mutations reported in each patient. We noted that a minority of uncorroborated variants could not be resolved by any directional contamination, and these “unresolvable” variants appeared to be more common among the sporadic and HIV-associated BL cohorts. Genomes were classified based on the predominant pattern where this could be inferred. (B) The average coverage depth for the genomes. (C) Box plot showing the distribution of the number of reads supporting the nonreference allele for corroborated variants. Genomes whose variants are supported by a minimal number of reads (mean supporting reads, <4) are highlighted in yellow. (D) Forest plot showing the log-transformed odds ratio estimates from Fisher exact tests comparing the mutation frequency of corroborated variants between EBV+ and EBV− cases. Genes with points above y = 0 had more mutations in EBV+ cases. SNTB2, the only gene reported as enriched for mutations in EBV− cases, is no longer significant (q > 0.1, false discovery rate). Bold red type indicates a gene that is significantly associated with EBV status in this analysis but not in other studies that have compared mutation frequency between EBV+ and EBV− BL.1,3 (E) The mutation burden of each patient, based on the corroborated variants, shown as a box-whisker plot with patients stratified by the reported EBV type.
Cases that benefited from directional contamination are shown as red triangles and the rest as black points. Although a significant global difference is observed (Kruskal-Wallis test), post hoc pairwise tests show no significant difference between cases with type 1 and type 2 EBV (Wilcoxon rank-sum test).
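
The three-way coloring in panel A and the low-support flag in panel C can be read as simple decision rules. The following is a minimal sketch of those rules as described in the caption, not the authors' pipeline. The function names, the `min_reads` parameter (set to 1 because the caption defines “uncorroborated” as no read support), and the input layout are assumptions made for illustration.

```python
from statistics import mean


def classify_variant(reads_in_sample, reads_in_contaminant=None, min_reads=1):
    """Assign a reported variant to one of the three panel A categories.

    reads_in_sample: reads supporting the variant in the specified sample.
    reads_in_contaminant: supporting reads in a sample identified as a
        potential contaminant, or None if no such sample was identified.
    min_reads: hypothetical cutoff for "read support" (assumption: 1).
    """
    if reads_in_sample >= min_reads:
        return "corroborated"
    if reads_in_contaminant is not None and reads_in_contaminant >= min_reads:
        return "data contamination"
    return "uncorroborated"


def has_minimal_support(supporting_reads, threshold=4):
    """Flag a genome as in panel C: mean supporting reads below threshold.

    supporting_reads: nonreference read counts for the genome's
        corroborated variants (zero counts are ignored).
    """
    corroborated = [n for n in supporting_reads if n > 0]
    return bool(corroborated) and mean(corroborated) < threshold


# Toy counts, invented purely to exercise the rules.
calls = [
    classify_variant(12, 0),
    classify_variant(0, 7),
    classify_variant(0, None),
]
print(calls)
print(has_minimal_support([2, 3, 5]))
```

With the toy counts above, the three calls fall into the corroborated, data contamination, and uncorroborated categories in that order, and the example genome is flagged because its mean support (about 3.3 reads) is below 4.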