Individual DNA sample genotyping to validate variants identified by pooled exome sequencing
Gene symbol | Protein change | Variant ID | WES P value | WES accuracy (%) | Individual P value | Validation P value |
---|---|---|---|---|---|---|
GOLGB1 | Y1212C | rs3732410 | 5.6 × 10−4 | 97.8 | <.0001 | .035 |
ENPP1 | K173Q | rs1044498 | 7.7 × 10−5 | 96.3 | .0011 | .031 |
PKD1L3 | S1176R | rs1035543 | 8.7 × 10−7 | 98.1 | .0002 | .109 |
DUOX2 | P138L | rs2001616 | 5.2 × 10−6 | 82.5 | .0003 | 1.000 |
HSPB9 | Q2P | rs1122326 | 7.0 × 10−4 | 93.4 | .0018 | .181 |
TULP2 | A18T | rs7260579 | 3.2 × 10−9 | 93.7 | .0039 | .423 |
ARGFX | T196I | rs61750878 | 1.4 × 10−4 | 91.5 | .0042 | .309 |
MS4A14 | I56FS | INDEL_11_60165352 | 1.4 × 10−6 | 94.7 | .0056 | 1.000 |
CEACAM20 | S369F | rs10414398 | 9.2 × 10−7 | 87.0 | .0074 | .289 |
DFNA5 | P142T | rs754554 | 1.8 × 10−7 | 93.7 | .0092 | .148 |
ANKRD33 | Y5F | rs697636 | 9.5 × 10−5 | 76.0 | .0153 | .495 |
PON1 | Q192R | rs662 | 9.0 × 10−4 | 95.7 | .0252 | .348 |
PHF14 | K115R | rs218966 | 5.5 × 10−10 | 87.2 | .0261 | .091 |
ZNF880 | Y150C | rs324125 | 7.6 × 10−9 | 84.6 | .0987 | ND |
PKD1L2 | P512L | rs7205673 | 5.6 × 10−4 | 96.5 | .1066 | ND |
SLC7A13 | Q470K | rs9693999 | 5.3 × 10−7 | 77.8 | .1732 | ND |
DTHD1 | A387C | rs12507599 | 2.8 × 10−5 | 89.9 | .1970 | ND |
SLC41A3 | L501FS | INDEL_3_125725268 | 1.8 × 10−5 | 63.6 | .2580 | ND |
C5 | V145I | rs17216529 | 1.5 × 10−5 | 80.5 | .4514 | ND |
TNFRSF1B | M196R | rs1061622 | 3.0 × 10−4 | 91.6 | .4726 | ND |
ALG1L | N135D | rs3828357 | 1.3 × 10−4 | 84.1 | .7306 | ND |
PRG3 | C3R | rs669661 | 7.5 × 10−16 | 29.2 | 1.0000 | ND |
Twenty candidate single-nucleotide variants (SNVs) and 2 insertion-deletion variants with the strongest statistical association with stroke risk (P < .001) were selected for follow-up verification. The DNA samples within each DNA pool were individually genotyped for each of the 22 variants, and the allele frequency was determined. The mean accuracy of the whole-exome sequencing (WES) genotyping was calculated by comparing the minor allele frequency (MAF) of each DNA pool obtained by individual genotyping with the MAF determined by pooled WES. A variant with no difference between the two genotyping methods would have an accuracy of 100%, whereas a variant with completely discordant calls would have an accuracy of 0%. The individual genotype calls were then used to repeat the statistical association testing between the stroke and control groups. In the discovery cohorts (stroke n = 120; control n = 104), 13 of the 22 variants maintained their initial association with stroke in the same direction (increased or decreased stroke risk) as in the pooled WES discovery (individual P value). These 13 variants were then tested in an independent validation cohort (stroke n = 57; control n = 231) to validate their association with stroke risk (validation P value). All statistical tests were performed using the Fisher exact test with an allele model.
ND, no analysis performed.
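The two calculations described in the footnote can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the footnote gives only the 100% and 0% endpoints of the accuracy measure, so the linear form used in `pool_accuracy` is an assumption, and all function names and example genotypes are hypothetical. The Fisher exact test is implemented as the usual two-sided test on a 2 × 2 table of allele counts, where each individual contributes 2 alleles (allele model).

```python
from math import comb


def pool_accuracy(maf_individual, maf_wes):
    """Accuracy (%) for one DNA pool.

    ASSUMPTION: linear in the absolute allele-frequency difference,
    chosen only because it gives 100% for identical frequencies and
    0% for fully discordant ones (0 vs 1). The source does not state
    the exact formula.
    """
    return 100.0 * (1.0 - abs(maf_individual - maf_wes))


def mean_accuracy(pools):
    """Mean accuracy over (maf_individual, maf_wes) pairs, one per pool."""
    return sum(pool_accuracy(i, w) for i, w in pools) / len(pools)


def allele_counts(genotypes):
    """Convert per-individual minor-allele counts (0, 1, or 2)
    into (minor, major) allele totals for the allele model."""
    minor = sum(genotypes)
    major = 2 * len(genotypes) - minor
    return minor, major


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one. Integer
    weights are compared directly, so no floating-point tolerance is needed.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def weight(x):
        # Unnormalised probability of a table whose top-left cell is x.
        return comb(col1, x) * comb(n - col1, row1 - x)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, row1)


# Hypothetical genotypes (minor-allele count per individual).
stroke = [0, 1, 2, 0, 1, 1]
control = [0, 0, 1, 0, 0, 0]
s_minor, s_major = allele_counts(stroke)
c_minor, c_major = allele_counts(control)
p_value = fisher_exact_two_sided(s_minor, s_major, c_minor, c_major)
```

For the cohorts in the table, the rows of the 2 × 2 table would be the stroke and control groups and the columns the minor and major allele counts, so a cohort of 120 individuals contributes 240 alleles.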