Figure 2.
Normalized amino acid conservation scores across ENG, ACVRL1 (ALK1), and SMAD4. The degree of evolutionary conservation of each amino acid in the human protein sequences of ENG (NM_0011147, 658 amino acids), ACVRL1 (NM_000020.2, 503 amino acids), and SMAD4 (NM_005359.5, 552 amino acids) is plotted against the respective amino acid position. Conservation, which reflects the retention of macromolecular function, is shown as normalized conservation scores with 95% confidence intervals (CIs) obtained using ConSurf.58 Lower scores indicate greater conservation. In all 6 plots, the selected amino acids are plotted in red, and all other amino acids are plotted in black. (A) The selected amino acids are the sites of pathogenic or likely pathogenic HHT missense substitutions from current cohort 1 (n = 18), cohort 2 (n = 15), and the 2018 HHT Mutation Database40 (n = 64). Amino acids at which pathogenic or likely pathogenic variants were located were more conserved than amino acids with nonpathogenic variants (ENG mean difference, −0.52 [95% CI, −0.81 to −0.24; P = .00072]; ACVRL1, −0.77 [95% CI, −0.94 to −0.61; P = 6.6 × 10⁻¹⁵]; SMAD4, −0.80 [95% CI, −0.89 to −0.72; P = 3.9 × 10⁻⁶¹]). Notably, however, not all pathogenic or likely pathogenic variants were at conserved sites, and for endoglin, the normalized conservation scores and CIs were highly variable in regions other than the transmembrane domain (near amino acid 600) and the C-terminal cytoplasmic tail (amino acids 635-658). (B) The selected amino acids are the sites of likely benign missense substitutions in the gnomAD database.57 Amino acids at which benign variants were located were less conserved than other amino acids (ENG mean difference, 0.34 [95% CI, 0.19-0.50; P = 1.6 × 10⁻⁵]; ACVRL1, 0.48 [95% CI, 0.30-0.66; P = 3.2 × 10⁻⁷]; SMAD4, 0.54 [95% CI, 0.32-0.77; P = 3.0 × 10⁻⁶]).
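The mean differences, 95% CIs, and P values quoted in the caption compare the normalized conservation scores of the selected (red) sites with those of a comparison group of sites. The caption does not state which statistical test was used, so the following is only a minimal sketch of one way such a comparison could be computed, assuming an unequal-variance standard error with a normal approximation. The function name `mean_difference_ci` and the example scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(selected, others, level=0.95):
    """Return (difference, (ci_low, ci_high), p_value) for mean(selected) - mean(others).

    Assumption: unequal-variance standard error with a normal approximation;
    the test actually used for the figure is not stated in the caption.
    """
    diff = mean(selected) - mean(others)
    # Standard error of the difference between two independent sample means.
    se = sqrt(variance(selected) / len(selected) + variance(others) / len(others))
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Two-sided P value under the normal approximation.
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, (diff - z * se, diff + z * se), p_value


# Hypothetical normalized conservation scores (lower = more conserved).
pathogenic_sites = [-1.1, -0.9, -0.7, -1.0, -0.4, -0.8]
other_sites = [0.3, -0.2, 0.8, 0.1, -0.5, 0.6, 1.2, 0.0]

diff, ci, p_value = mean_difference_ci(pathogenic_sites, other_sites)
print(f"mean difference {diff:.2f} [95% CI, {ci[0]:.2f} to {ci[1]:.2f}; P = {p_value:.2g}]")
```

With ConSurf-style scores, a negative difference means that the selected sites are on average more conserved than the comparison sites, which matches the direction reported for panel A. A positive difference, as in panel B, means that they are less conserved.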