Abstract
A previously undefined transcript with significant homology to the pseudo-α2 region of the α-globin locus on human chromosome 16 was detected as part of an effort to better define the transcriptional profiles of human reticulocytes. Cloning and sequencing of that transcript (GenBank AY698022; named μ-globin) revealed an insert with a 423-nucleotide open reading frame. BLASTP, ClustalW, and phylogenetic analyses of the predicted protein demonstrated a high level of homology with the avian α-D globin. In addition, the heme- and globin-binding amino acids of μ-globin and avian α-D globin are largely conserved. Using quantitative real-time polymerase chain reaction (PCR), μ-globin was detected at a level of approximately 0.1% of that measured for α-globin in erythroid tissues. Erythroid-specific expression was detected by Northern blot analysis, and maximal expression during terminal erythroblast differentiation was also detected. Despite this highly regulated pattern of μ-globin gene transcription, μ-globin protein was not detected by mass spectrometry. These results suggest the human genome encodes a previously unrecognized globin member of the avian α-D family that is transcribed in a highly regulated pattern in erythroid cells. (Blood. 2005;106:1466-1472)
Introduction
The globin genes and their products have been intensively investigated for the past 50 years. Those studies led to the description of structural and regulatory elements that are useful for the recognition and comparison of hundreds of globin gene family members. The divergence of ancestral α- and β-globin genes is estimated to have occurred 500 million years ago. Those genes subsequently evolved and were modified by a variety of genetic processes, including duplication events.1 In humans, the α-globin gene family resides on chromosome 16p13.3 and is composed of a cluster of 3 genes (ζ2-α2-α1) with protein products that bind heme and assemble into hemoglobin. Transcription of the ζ2 gene is silenced during fetal life, and the 2 α genes (α2 and α1) are expressed in a balanced fashion for the remainder of ontogeny. In addition to those 3 genes, concerted efforts in the 1980s led to the discovery of other α-like sequences. The downstream region of the α locus contains an unusual gene named θ-globin that generates no detected globin protein in humans.2,3 θ-globin gene transcription is regulated, and the transcripts contain no obvious defects to explain the lack of detectable protein in erythroid tissues.2-5 Three pseudo-globin genes (pseudo-ζ1,6 pseudo-α2,7 and pseudo-α18 ) were also identified in the α-globin locus.
During the past 2 years, investigators have begun a transition toward postgenomic approaches to basic and clinical research. Hypotheses are now generated with the knowledge of whole genomic DNA sequences,9 full-length cDNA collections,10 and millions of expressed sequence tags (ESTs)11 from humans and other species. Comparisons of DNA and RNA sequences with advanced bioinformatics analyses12 have become essential. Hematology is ideally suited for this type of genome-based research due to the ease with which purified populations of hematopoietic cells are isolated. We hypothesized that human reticulocytes contain sufficient mRNA from the terminal stages of differentiation for the study of globin gene expression patterns in high throughput. Levels of globin mRNA detection and differences in globin gene transcription between cord and adult blood were studied to determine the potential of this approach for clinical assessments of hemoglobinopathies and hemoglobin switching. Using oligonucleotide arrays, significantly different globin transcription patterns were found in cord and adult blood samples. Evidence for transcription of the major globin genes was clearly demonstrated. Surprisingly, we also identified transcription from the genomic region previously thought to encode the pseudo-α2 gene. The source of that transcription is characterized in this report as a previously unrecognized globin gene.
Materials and methods
Preparation of reticulocyte RNA
Blood was collected from healthy adult donors and from placental umbilical cords. All cells were collected according to approved guidelines regarding human subjects. Blood samples were centrifuged, and plasma and buffy coat layers were removed. Packed red blood cells were diluted with 4 vol of 1 × phosphate-buffered saline (PBS) and were filtered through two consecutively linked RCXL2 high-efficiency leukocyte reduction filters (Pall, East Hill, NY). Platelets were removed by repeated low-speed centrifugation. RNA was isolated with Trizol LS (Invitrogen, Carlsbad, CA) and then treated with DNaseI to degrade residual genomic DNA in the RNeasy Mini Kit (Qiagen, Valencia, CA) according to the manufacturer's protocol. For erythropoiesis assays, peripheral blood CD34+ cells from healthy donors were cultured for 14 consecutive days in erythropoietin-containing medium, as previously described.13
Microarray data analysis
Microarray analyses were performed using 5 μg total RNA from each sample with one cycle of complementary RNA amplification according to the Affymetrix (Santa Clara, CA) protocol. After the hybridization and washing steps were performed, microarray chips were scanned using MAS 5.0 software. Collected data were analyzed using Partek Pro 6.0 software (Partek, MO). Expression levels were clustered and displayed by Spotfire DecisionSite 8.0 (Spotfire, Somerville, MA). A complete description of the 28 array data sets is being prepared as a separate publication.
Cloning full-length coding sequences
Fifty nanograms of first-strand cDNA made from 1 μg adult blood reticulocyte total RNA was amplified with the forward (5′-CCA TGC TCA GCG CCC AGG AG-3′) and reverse (5′-AGC ACA GGG CTC AGC GGT ATT TTT C-3′) primers using the BD Advantage-GC cDNA polymerase chain reaction (PCR) kit (BD Clontech, Palo Alto, CA) with the cycle conditions as follows: 94°C pre-denaturation for 3 minutes, 94°C for 30 seconds, and 68°C annealing and extension for 3 minutes for 30 cycles. The amplified PCR product was purified with the MinElute PCR Cleanup Kit (Qiagen) and was cloned into pcDNA3.1-V5-6His and pCR2.1 vector (Invitrogen).
Northern blotting
The full-length cDNA clone insert was excised by restriction enzyme digestion and purified by gel extraction using the MinElute Gel Extraction Kit (Qiagen). Five hundred nanograms of insert was labeled with [α-32P]dCTP (Amersham Bioscience, Piscataway, NJ) and purified from unincorporated nucleotides using a G-50 column. The labeled probe was hybridized at 43.5°C onto a nylon membrane containing, in each lane, 10 μg total RNA from cord blood reticulocytes (5 pooled samples), adult blood reticulocytes (5 pooled samples), fetal liver (BD Clontech), or adult bone marrow (BD Clontech). The hybridized membrane was washed and exposed on BioMax MS film (Eastman Kodak, Rochester, NY).
Quantitative real-time PCR
For quantitative PCR, the sequence-specific primers and probes were designed to span the border between exons 2 and 3 of the α-globin mRNA (forward primer, 5′-GGG TGG ACC CGG TCA ACT T-3′; reverse primer, 5′-GAG GTG GGC GGC CAG GGT-3′; probe, FAM-5′-AAG CTC CTA AGC CAC TGC CTG CTG-3′-TAMRA) and the μ-globin mRNA (forward primer, 5′-GCG TGG ACC CAG CCA ACT T-3′; reverse primer, 5′-CAG GTG GGA GGC CAG CAC-3′; probe, FAM-5′-TCC GCT GCT AAT CCA GTG TTT CCA C-3′-TAMRA). Copy numbers were calculated by comparison with standard curves. The specificity of each primer and probe set was verified using α-globin and μ-globin cDNA templates. For each PCR reaction, 5 ng cDNA made from pooled total RNA was mixed with 2 × TaqMan Master Mix (Applied Biosystems, Foster City, CA) and 10 pmol each of primer and FAM/TAMRA-labeled probe and was amplified using the ABI 7700 Sequence Detection System (Applied Biosystems). Results were analyzed by Sequence Detector 1.7 software (Applied Biosystems).
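The standard-curve calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the usual log-linear relationship between threshold cycle (Ct) and starting copy number; the dilution series, the Ct values, and the function names are hypothetical and are not taken from the study.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series of a cloned cDNA standard.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_cts = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)

# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1.0 / slope) - 1.0

# Estimated copy number for a hypothetical unknown sample with Ct = 25.0.
unknown_copies = copies_from_ct(25.0, slope, intercept)
```

Dividing the estimated copy number by the nanograms of cDNA in the reaction gives values in copies/ng cDNA, the unit used for the reported results.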
Surface-enhanced laser desorption/ionization: time-of-flight mass spectrometry
H4 ProteinChips and calibration standard molecules for the surface-enhanced laser desorption/ionization: time-of-flight (SELDI-TOF) mass spectrometer were purchased from Ciphergen Biosystems (Fremont, CA). Sinapinic acid (SA) was obtained from Sigma (St Louis, MO). The SELDI-TOF mass spectrometer was externally calibrated using the [M + H]+ ion peaks of somatostatin at 1637.9 m/z, insulin β-chain at 3495.9 m/z, human recombinant insulin at 5807.6 m/z, and hirudin at 7033.6 m/z. All mass spectra were recorded in the positive-ion mode using a Ciphergen PBS IIc ProteinChip Array mass spectrometer with time-lag focusing.14 Before SELDI-TOF mass spectrometry analysis, the H4 ProteinChip was prewashed with 10% aqueous acetonitrile containing 0.1% trifluoroacetic acid (TFA). After drying, 1 μL sample was applied to the ProteinChip, air dried, and washed with 5% aqueous acetonitrile. Once the chip had dried again, 1 μL matrix (saturated SA in 50% aqueous acetonitrile containing 0.1% TFA) was added to each feature of the ProteinChip array. Data were analyzed using the computer software provided by the manufacturer and are reported as mass averages.15
Bioinformatics analyses
The comparisons of mRNA sequences of μ-globin were performed by basic local alignment search tool (BLAST),16 and alignment to the human genome was performed by BLAST-like alignment tool (BLAT).17 For promoter analyses, the 200 base pairs (bp) upstream from the translation start site were examined using PromoterInspector18 with the default setting. Phylogenetic analyses were performed by maximum parsimony (MP) using μ-globin and 291 known α-like globin protein sequences deposited in GenBank. A complete alignment with gap was performed using ClustalX software.19 Aligned sequences were input to PAUP (version 4.0b10 for UNIX; Sinauer Associates, Sunderland, MA), which defined the MP tree, using the heuristic search command. The MP tree20 was chosen when PAUP had not improved the score after several hours of searching. The final tree was drawn by PhyloDraw software.21 Maximum likelihood, neighbor-joining (NJ), BIONJ,45 least-squares, and balanced minimum evolution analyses were performed and demonstrated similar results.
Results
Microarray comparison of globin gene expression
To compare the mRNA profiles in the reticulocytes circulating at the time of birth with those in adults, high-throughput arrays were generated from the blood of 28 separate donors (14 cord blood, 14 adult blood). Platelets were removed from the samples by low-speed centrifugation, and nucleated cells were removed by leukocyte reduction filtering. Samples were analyzed using HG-U133 A and B chips from Affymetrix, and the expression of 44 229 probe sets was ranked. The focus of this report is the globin genes; a description of the other probe sets will be provided in a separate manuscript. As expected, the globin gene transcripts achieved high ranking because of their abundance in reticulocytes. The ranks of signal intensities for α2, α1, β, Aγ, and Gγ globins in adult blood reticulocytes were 1st, 2nd, 3rd, 10th, and 8th, and the ranks in the cord blood reticulocytes were 1st, 2nd, 3rd, 4th, and 5th, respectively. The higher ranking of Aγ compared with Gγ in cord blood was unexpected because Gγ is known to represent approximately 70% of the total γ chains at birth.22 In comparison with α, β, and γ transcripts, the levels of ϵ-, ζ-, and θ-globin were low in all the samples, and δ-globin was reduced in the cord blood samples.
In addition to the expected globins, we identified a probe set (240336_AT) described by the Affymetrix software as having homology with a hemoglobin-based blood substitute (Rhb1.1).23 The expression rank of that probe set was 21st of 44 229 probes in adult and 11th of 44 229 probes in cord blood reticulocytes. To place the expression pattern of 240336_AT probe in the context of the other globin probe sets, the signal intensities were clustered by unsupervised hierarchic clustering (Figure 1). The clustered arrangement of the signal intensities corresponding to the 9 globin gene probes and 240336_AT from 14 cord blood and 14 adult blood samples are shown. Cord blood and adult blood samples were segregated appropriately on the basis of switching patterns of intensity. The α1, α2, and β-globin genes clustered together according to their high intensity in all the samples. The gamma genes also co-clustered, and they were expressed within the same range as β-globin in cord blood. As shown, expression of the probe 240336_AT did not cluster with any other globin probe. The average signal intensity of that probe was higher than the intensities of ϵ, θ, ζ, and δ but lower than those of γ, β, and α-globin. The pattern of 240336_AT expression was variegated between donors with decreased mean expression in adult blood compared with cord blood samples.
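To put the reported ranks in perspective, the short calculation below converts a rank into the fraction of all ranked probe sets at or above it. It is only an arithmetic sketch; the function name is hypothetical, and the ranks and the total of 44 229 probe sets are the values stated in the text.

```python
TOTAL_PROBE_SETS = 44229  # probe sets ranked on the HG-U133 A and B chips


def top_percent(rank, total=TOTAL_PROBE_SETS):
    """Percentage of all probe sets ranked at or above the given rank."""
    return 100.0 * rank / total


adult_top_percent = top_percent(21)  # rank of 240336_AT in adult reticulocytes
cord_top_percent = top_percent(11)   # rank of 240336_AT in cord reticulocytes

print(f"adult: top {adult_top_percent:.3f}% of probe sets")
print(f"cord:  top {cord_top_percent:.3f}% of probe sets")
```

Both values fall well below 1% of the ranked probe sets, which is why the signal stood out among the non-globin-annotated probes.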
Bioinformatics analyses and cloning of reticulocyte μ-globin
Public sequence data describing the human genome, full-length cDNA, and ESTs provided a clear path for further investigation of the 240336_AT probe. The probe set was designed using more than 25 million EST sequences deposited on dbEST.11 The template EST sequence (GenBank, BE244453) that aligns with the 240336_AT probe was identified by using the reference sequence for a BLAT search.17 Surprisingly, 240336_AT aligned in the same region as the pseudo-α2 gene rather than in other gene regions (Figure 2A). Unlike the pseudo-α2 globin nucleotide sequences,7 the 240336_AT EST sequences aligned to generate a gene structure familiar to the other globin genes. Based on this bioinformatics comparison, a full-length cDNA was cloned from reticulocyte RNA to generate a 506-bp transcript encoding the gene probed by 240336_AT. Four additional bases were identified by 5′ rapid amplification of cDNA ends (RACE). The 510-bp reticulocyte-derived sequence was originally deposited in GenBank in July 2004 (AY698022, NM_001003938) and was named mu (μ) because of the smaller size of the predicted globin product (141 amino acids compared with 142 for the other human α-globin genes).
Additional bioinformatics analyses of the μ-globin sequence are shown in Figure 2. The μ gene aligned within the 3′ region of pseudo-α2 on chromosome 16p13. The gene contained a Kozak sequence (24 bp downstream of the transcriptional start site) indicating the predicted translation initiation site of the μ-globin gene. Although that Kozak sequence (CGCCATGC) was not found in the other human globin genes, it is encoded in approximately 5% to 10% of vertebrate genes,24 including the duck α-D globin (gi62724). The μ-globin sequence also has a poly(A) signal sequence (AATAAA) located 35 bp downstream of the translational stop codon in the 3′ untranslated region (3′UTR). The α-globin 3′UTR is thought to increase mRNA stability by forming a specialized secondary structure that is complexed with α-globin poly(C) binding protein (αCP).25 The predicted secondary structure of the μ-globin 3′UTR suggests that μ-globin mRNA may be less stable because of decreased resistance against degradation by endonucleases or exonucleases (data not shown).
The μ-globin (HBM) sequence was aligned to the α-globin mRNA using ClustalW,20 and its overall similarity was 59% (302 of 510 bases). The predicted protein sequence from the open reading frame (ORF) was identified in GenBank as NP_001003938. We aligned this sequence to the human α-globin protein using ClustalW. It showed incomplete conservation of heme binding and αβ-globin contact sites. As shown in Figure 2B, the μ-globin promoter region (upstream 200 bp) was also examined using PromoterInspector.18 Unlike the α and ζ promoters, μ-globin promoter did not contain a CAAT motif. A muscle TATA (TATAGA) core sequence was identified 60 bp upstream of the ATG. Erythroid Kruppel-like factor (EKLF) and GATA binding factor 1 (GATA1) binding sites were identified at -70 and -82 bp, respectively. The μ-globin gene also contained a hypoxia-inducible factor (HIF) binding site commonly associated with higher-affinity hemoglobins26 but not identified in other human globin gene promoters.
μ-globin gene expression
In addition to array-based assays, μ-globin gene expression was examined by Northern blot analysis (Figure 3). Hybridizations containing 10 μg total RNA for each of 4 erythroid tissues (cord blood reticulocytes, adult blood reticulocytes, fetal liver, and bone marrow) were performed. As expected, α-globin expression was detected at similar levels in adult blood reticulocytes and cord blood reticulocytes. Equivalent bands were also detected in fetal liver and adult bone marrow (Figure 3A). In contrast to bone marrow, no μ-globin signal was detected in nonerythroid tissues (Figure 3C). The expression of μ-globin in cord blood reticulocytes was approximately 5 times higher than it was in adult blood reticulocytes. The μ-globin expression in fetal liver was also higher than in adult bone marrow (Figure 3B). Consistent with the array data, these data suggest that the levels of μ-globin mRNA in erythroid tissues decrease during postnatal development.
Because of concerns that the hybridization signals detected by array and Northern blot analyses may be nonspecific, quantitative PCR was performed (Figure 4). Sequence-specific primers and a probe spanning the μ-globin exon 2 and exon 3 boundary were designed to avoid the amplification of unprocessed RNA, genomic DNA, or other α-globin transcripts. In confirmation of the array and Northern data, μ-globin mRNA levels were significantly higher in cord blood reticulocytes than in adult blood reticulocytes (1.71 × 10⁵ ± 9.51 × 10⁴ copies/ng cDNA in cord blood and 2.17 × 10⁴ ± 6.84 × 10³ copies/ng cDNA in adult blood; P < .0002). This pattern of decreased adult expression was also noted by the comparison of fetal liver and adult bone marrow (3.04 × 10⁴ ± 1.68 × 10³ copies/ng cDNA in fetal liver and 1.15 × 10⁴ ± 1.19 × 10³ copies/ng cDNA in bone marrow; P < .0002) (Figure 4A). α-globin amplification was performed for comparison. Although the expression of α-globin was 2 to 3 orders of magnitude higher than that of μ-globin, the levels of α-globin mRNA were equivalent in fetal and adult tissues (Figure 4B). Therefore, at the transcriptional level, μ-globin expression is only approximately 0.1% of the normal adult α-globin.
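The fold differences implied by these mean copy numbers can be worked out directly. The sketch below uses only the means reported in the text; the variable and function names are hypothetical, and the standard deviations are ignored for simplicity.

```python
# Mean mu-globin copy numbers (copies/ng cDNA) as reported in the text.
mu_copies_per_ng = {
    "cord blood reticulocytes": 1.71e5,
    "adult blood reticulocytes": 2.17e4,
    "fetal liver": 3.04e4,
    "adult bone marrow": 1.15e4,
}


def fold_difference(numerator, denominator):
    """Ratio of two mean copy numbers."""
    return numerator / denominator


reticulocyte_fold = fold_difference(
    mu_copies_per_ng["cord blood reticulocytes"],
    mu_copies_per_ng["adult blood reticulocytes"],
)
tissue_fold = fold_difference(
    mu_copies_per_ng["fetal liver"],
    mu_copies_per_ng["adult bone marrow"],
)


def percent_of_reference(orders_of_magnitude):
    """Express an n-orders-of-magnitude deficit as a percentage."""
    return 100.0 / (10 ** orders_of_magnitude)


print(f"cord vs adult reticulocytes: {reticulocyte_fold:.1f}-fold")
print(f"fetal liver vs bone marrow:  {tissue_fold:.1f}-fold")
print(f"3 orders below alpha-globin: {percent_of_reference(3)}%")
```

A difference of 3 orders of magnitude corresponds to the stated 0.1%, and a difference of 2 orders would correspond to 1%.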
μ-globin expression during erythropoiesis was examined and compared with that of α-globin using cultures of adult CD34+ cells.13 μ-globin was not detected above background levels until day 4, when large, immature erythroblasts began to appear in culture. After day 4, a rapid increase in μ-globin was detected until day 10; this was followed by an equally rapid loss as the cells underwent terminal maturation. This pattern was similar to that identified for α-globin, but the α-globin peak occurred later during the culture period on day 12. The overall expression level of μ-globin was 100-fold less than that of α-globin throughout the culture period (compare scales on Figures 4C, D).
SELDI-TOF mass spectrometry of red blood cell lysate
Based on the highly regulated transcription of the μ-globin gene, assays were developed to determine whether significant quantities of μ-globin proteins are expressed in circulating erythrocytes. Importantly, the literature provides little evidence for the existence of this protein in humans or other mammals. Efforts to raise μ-specific antibodies using peptide sequences of μ-globin have not been successful to date (data not shown). Therefore, cord blood and adult blood lysates were directly examined by SELDI-TOF mass spectrometry. This method is one of the more sensitive proteomic detection tools and may have some advantages over matrix-assisted laser desorption/ionization (MALDI).14,27 A detection sensitivity of 20 ng per sample was determined using serially diluted hemoglobin standards. In parallel assays, cation exchange high-performance liquid chromatography (HPLC) peaks13 were not detected at levels below 300 ng per sample. SELDI analysis of cell lysates from 3 cord blood and adult blood samples revealed the relative amount of α, β, and γ protein in each sample over the 15-kDa to 16.5-kDa mass range (Figure 5). We predicted the molecular weight of each globin protein without the initial methionine residue because of the adjacent valine residue.28 The α-globin chain (15 126 Da) was identified as the major peak in adult and cord blood samples. The β (15 867 Da) and γ (15 996 Da) globin peaks were detected in cord blood samples, but the γ peak was not detected in the adult lysates. No peak was seen at the expected size of μ-globin (15 487 Da) in any samples (ie, no significant peaks with sufficient signal-to-noise ratio were observed). This suggested the absence of measurable quantities of μ-globin in the lysates. The identities of the other peaks demonstrated by this method of globin analysis are being studied separately (data not shown).
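The mass prediction described above (average mass of the chain after removal of the initiator methionine) can be sketched as follows. This is an illustration only: the residue masses are standard average values, the rule that methionine is removed when the next residue is small is an assumption stated here for the example, the function names are hypothetical, and the short peptide is a toy input rather than a full globin chain.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain for the free N- and C-termini

# Assumption for this sketch: initiator Met is cleaved when followed by a
# small residue, which includes the valine mentioned in the text.
SMALL_RESIDUES = set("GASTCPV")


def mature_sequence(seq):
    """Drop the initiator methionine if the second residue is small."""
    if len(seq) > 1 and seq[0] == "M" and seq[1] in SMALL_RESIDUES:
        return seq[1:]
    return seq


def average_mass(seq, process_n_terminus=True):
    """Average molecular mass of an unmodified polypeptide chain."""
    if process_n_terminus:
        seq = mature_sequence(seq)
    return sum(AVERAGE_RESIDUE_MASS[r] for r in seq) + WATER


toy_peptide = "MVLSPADK"  # toy input; Met is followed by Val
with_met = average_mass(toy_peptide, process_n_terminus=False)
without_met = average_mass(toy_peptide)
print(f"Met retained: {with_met:.2f} Da, Met removed: {without_met:.2f} Da")
```

Removing the methionine lowers the predicted mass by about 131 Da, which is large relative to the spacing of the globin peaks in the 15-kDa to 16.5-kDa window, so the choice matters when deciding where to look for a peak.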
Homology comparisons and phylogenetic analyses
At the nucleotide level, several mammalian orthologues were identified by EST alignments, but no significant homologies were noted with avian or reptilian mRNA examined to date (data not shown). The predicted protein sequence of μ-globin was also searched against the GenBank database using BLASTP default parameters. The search demonstrated that the predicted μ-globin protein most closely aligned with the avian α-D globin chain of bar-headed goose (GenBank, gi70296) with a similarity of 55% (78 of 141 amino acids [aa]). A lower similarity of μ-globin with the human α-globin chain was found (64 of 141 aa). Alignments of the heme-binding sites and of the α1-β1 and α1-β2 contact sites were also studied. Heme-binding homologies were equivalent among μ, the human α chain, and the avian α-D chain (84% [16 of 19]). However, the α1-β1 and the α1-β2 contact sites demonstrated considerably more homology between μ-globin and the avian α-D globin. Interestingly, μ-globin and the avian α-D globin chain have the same length of 141 amino acid residues.
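The percentages quoted above follow from simple match counts. The sketch below recomputes them and also shows one way to count matches over a gapped pairwise alignment. The function names and the toy alignment are hypothetical; the counts (78, 64, and 16) are those reported in the text, and the text does not state whether they refer to identical or merely similar residues.

```python
def percent(matches, total):
    """Matches expressed as a percentage of the positions considered."""
    return 100.0 * matches / total


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return percent(matches, compared)


mu_vs_avian_alpha_d = percent(78, 141)  # reported as 55%
mu_vs_human_alpha = percent(64, 141)    # count reported without a percentage
heme_binding = percent(16, 19)          # reported as 84%

# Toy alignment: 9 gap-free columns, 8 of them identical.
toy_identity = percent_identity("VLSPADK-TN", "VLSAADKATN")
```

Note that the helper divides by the number of gap-free columns, whereas the text divides by the full chain length of 141 residues; the two conventions agree only when the alignment contains no gaps.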
Based on the genetic and predicted protein homologies between μ-globin and avian α-D globin, a more complete phylogenetic comparison was performed. A total of 291 α-like globin protein sequences were collected from GenBank. These amino acid sequences were used for the construction of a phylogenetic tree using the MP algorithm. The constructed tree demonstrated several clustered globin families (Figure 6). As expected, the human α chain was clustered with other mammalian α chains, and human ζ-globin clustered with the mammalian ζ-globin group. The θ-globin group clustered most closely to the α family. In comparison, human μ-globin clustered with the avian and reptilian α-D chains at the greatest distance from the α-globin cluster (Figure 6). No other mammalian globin within the group of 291 was placed within the α-D family.
Discussion
In this report, the availability of a fully sequenced genome and high-throughput expression profiles led to a re-examination of the region of the α-globin locus identified as pseudo-α2.7 A novel globin transcript was identified and named μ-globin. μ-globin is not a pseudogene because it is transcribed from a 510-nucleotide (nt) genomic sequence that contains 2 introns, and it has an ORF encoding 141 amino acids without disruption.29 The μ-globin gene also contains a promoter region with erythroid transcription factor-binding sites, a 24-nt mRNA leader sequence, a Kozak sequence,24 and a functional polyadenylation signal (Figure 2A). In contrast, the originally described pseudo-α2 pseudogene7 was reported to contain no promoter because of the proximity of its first exon to the ζ1-globin gene located just upstream. Pseudo-α2 was also reported to contain a mutated 5′ splice site for intron 1, several frameshift deletions, an insert in the second and third exons, and significant mutations in the polyadenylation signal region compared with the α2 gene. When aligned with current maps, the originally reported genomic sequence for pseudo-α2 was found to contain several unmatched nucleotides or gaps (data not shown). Therefore, the pseudogene annotation might have resulted from DNA sequencing limitations that existed 20 years ago.
In the context of the 44 229 probes examined by microarray, relatively high-level expression from the μ-globin probe set (240336_AT) was detected in erythroid cells in vivo (ranked among the top 0.2% of reticulocyte transcripts). Among differentiating primary erythroblasts, μ-globin gene expression was highly regulated with a pattern nearly identical to that of α-globin. However, the level of μ-globin mRNA represented only a small percentage of the amount of α-globin mRNA in fetal and adult erythroid tissues. The differences in the levels of μ- and α-globin gene transcripts may be attributed to differences in their promoters or in the stability of their mRNA. The delayed peak level of α-globin transcripts compared with μ-globin (culture day 12 vs 10, respectively) is consistent with increased stability of the α-globin transcripts.
Because the expression pattern of the β-globin genes in humans during ontogeny generally follows their gene order in the cluster,30 the location of the μ-globin gene between the embryonic and adult genes in the α-globin cluster also suggests that the gene might be developmentally regulated. Northern blot analyses of other tissues revealed no detectable μ-globin gene expression among nonerythroid tissues, suggesting tissue specificity. Microarray, Northern, and quantitative PCR analyses consistently demonstrated significantly higher levels of μ-globin in the fetal tissues compared with that found in the adult tissues. Hence, μ-globin demonstrates erythroid-specific expression with a pattern during ontogeny similar to that described for the γ-globin genes in the β cluster.
To determine the similarity of the predicted μ-globin protein with other known α-like proteins, homology analyses were performed. The ORF-predicted protein from human μ-globin was compared with 291 known α-like globin protein sequences deposited in GenBank using an MP algorithm. BLASTP and ClustalW alignments were also performed, including focused comparisons of heme, α1-β1, and α1-β2 binding. In each case, the predicted human μ-globin was most closely related to the avian and reptilian α-D globins. Initial analyses of primate, bovine, and porcine genomes or associated ESTs suggest α-D-encoded ORFs will soon be identified in a variety of mammals using comparative genomics. Hence, the μ-globin gene may represent an expressed homolog of an ancient globin gene.1 The α-D globin family was first identified as an α chain of hemoglobin M in chicken embryos.31 α-D proteins assemble into high-oxygen affinity hemoglobins among avian31-34 and reptilian species.35-38 In both groups, hemoglobins containing the α-D chain are expressed at all stages of ontogeny. Embryonic expression may be advantageous as an adaptation to hypoxia. Adult expression may also provide a survival advantage associated with the ability to respond to the hypoxic conditions of high-altitude flight33,34 or those associated with prolonged submersion.35,36 Thus, it was postulated that an evolutionary advantage for α-D globin expression might have arisen from hypoxic or anoxic conditions.39 The higher level of μ-globin gene expression in cord blood is consistent with the usefulness of high-oxygen affinity hemoglobins during fetal life. Unfortunately, the absence of detectable μ-globin protein in human erythroid tissues makes it difficult to extrapolate the avian and reptilian functional data to humans.
Interestingly, most of the proteins predicted by recent genome mapping efforts have not yet been detected in nature.40,41 This may be attributed to low-level expression or low sensitivity of protein assays. In this context, our inability to detect μ-globin is not unusual. However, it is extremely curious that μ-globin is not the only gene in the human α-globin cluster that lacks a detectable hemoglobin product. The θ-globin gene is also transcribed, but no protein or hemoglobin product has been detected.2-5 Both genes are well conserved at the genomic level with appropriate splicing junctions and maintenance of ORFs. Like μ-globin, θ-globin also has a highly regulated pattern of transcription in erythroid cells.5 In addition, both genes demonstrate only fractional levels of transcription compared with the dominant α genes, and their deletion in humans has no reported effects on the clinical phenotype.42,43 Therefore, the evolutionary conservation of the μ- and θ-globin genes in the absence of a hemoglobin product represents a biologic paradox. This is especially puzzling when considering the low levels of gene products compared with the amount of globin required in humans for the transport of oxygen. In the case of μ-globin, the protein homology between the human, avian, and reptilian species in the absence of any significant genetic homology suggests a selective pressure to sustain the ORF. Therefore, it is uncertain whether this gene is evolving toward becoming a pseudogene. Instead, the possibility exists that this newly discovered, but ancient, globin has a function for which high-level protein expression is not required.
Prepublished online as Blood First Edition Paper, April 26, 2005; DOI 10.1182/blood-2005-03-0948.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.
We thank Drs Douglas R. Higgs, Robert L. Danner, and Alan S. Schechter for helpful discussions. The ongoing cell-processing support of the National Institutes of Health Department of Transfusion Medicine is greatly appreciated.