Figure 1.
The rs311103C allele correlates with the Xg(a−) phenotype. (A) The XG gene partly resides in pseudoautosomal region 1 (PAR1) on both sex chromosomes; the first 3 exons lie in PAR1, whereas the remaining 7 exist only on the X chromosome. Thus, XG is truncated to the first 3 exons on the Y chromosome, where it does not produce a functional transcript. The frequencies of rs311103C, a single-nucleotide polymorphism (SNP) located 3709 bp upstream of the transcription start site, as derived from the 1000 Genomes Project,15 closely match the expected allele distributions for Xga negativity. (B) Transcription factor binding analyses of the GATA motif identified at rs311103 show decreased binding preferences with nucleotide substitutions, as adapted from the JASPAR database.17 The XG GATA motif with rs311103C converts GATA to CATA and reduces the relative binding energy score for GATA1 from 0.888 to 0.775, thus bringing it below the default threshold of 0.8. Similarly, the ACKR1 upstream GATA site may carry the rs2814778C SNP on the complementary strand, which results in a comparably lowered score for GATA1 and the Fy(b−) phenotype.18 (C) The 158 donors were serologically typed as Xg(a+) or Xg(a−) and genotyped for the rs311103 allele. The asterisk indicates 1 serologically Xg(a−) heterozygous female donor who was weakly positive on flow cytometric analysis; see also panel E. The XG complementary DNA (cDNA) sequence from this donor was found to be identical to the reference sequence (NM_175569.2). (D) For a subset of donors (n = 59; 29 female and 30 male donors), messenger RNA (mRNA) was isolated and converted to cDNA for XG transcript analysis. (E) We also performed flow cytometry on this subset.
Genotypes of the donors correlated with XG transcript levels by reverse transcription quantitative polymerase chain reaction (RT-qPCR) and Xga antigen levels by flow cytometry (mean fluorescence intensity [MeanFI]); solid and open circles represent serologically Xg(a+) and Xg(a−) samples, respectively, from the 29 female and 30 male donors. Note the overlap between 2 dots on the median line for the GC males. Bars represent the median. *P < .05, **P < .01, ***P < .001. ns, not significant.