To further our understanding of the regulation of vertebrate globin loci, we have isolated cosmids containing α- and β-globin genes from the pufferfish Fugu rubripes. By DNA fluorescence in situ hybridization (FISH) analysis, we show that Fugu contains 2 distinct hemoglobin loci situated on separate chromosomes. One locus contains only α-globin genes (α-locus), whereas the other also contains a β-globin gene (αβ-locus). This is the first poikilothermic species analyzed in which the physical linkage of the α- and β-globin genes has been uncoupled, supporting a model in which the separation of the α- and β-globin loci has occurred through duplication of a locus containing both types of genes. Surveys for transcription factor binding sites and DNaseI hypersensitive site mapping of the Fugu αβ-locus suggest that a strong distal locus control region regulating the activity of the globin genes, as found in mammalian β-globin clusters, may not be present in the Fugu αβ-locus. Searching the human and mouse genome databases with the genes surrounding the pufferfish hemoglobin loci reveals that homologues of some of these genes are proximal to cytoglobin, a recently described novel member of the globin family. This provides evidence that duplication of the globin loci has occurred several times during evolution, resulting in the 5 human globin loci known to date, each encoding proteins with specific functions in specific cell types.
Introduction
The large amount of intergenic sequences makes genome analysis a difficult task in most higher vertebrates. Current research has indicated the pufferfish (Fugu rubripes and the closely related Spheroides nephelus) as an ideal species for just this task because it has a relatively compact genome of 400 Mb, approximately 7.5 times smaller than the human genome.1,2 Nevertheless, the Fugu genome contains a complement of genes similar to that found in humans.3-5 As a consequence, genes occur approximately once every 8 kb in the Fugu genome. Thus, it provides a suitable model for the comparison with gene loci from higher vertebrates. In our laboratory, we study the regulation of the human β-globin gene cluster. This locus is found on the short arm of chromosome 11 and contains 5 genes that are arranged in the same order in which they are expressed during development: 5′-ε (embryonic) Gγ-Aγ (fetal) δ-β (adult)–3′. The α-globin genes are in a separate locus on the short arm of chromosome 16, close to the telomeric end. Two α-like and 2 β-like polypeptides together form the tetrameric oxygen-carrying hemoglobin molecule. In poikilothermic jawed vertebrates (fish, amphibians, and reptiles), the α- and β-globin genes are found closely linked in the same locus. It is thought that a common ancestral globin gene gave rise to distinct α- and β-globin genes through gene duplication events, followed by the separation of the α- and β-genes to different chromosomes as they are found in today's homoiothermic vertebrates (birds and mammals).6 In humans, the genes are in very different chromosomal environments. The α-globin genes are located in a gene-rich telomeric region of chromosome 16 with a constitutively open chromatin structure in all cell types.
The genes have methylation-free CpG islands, and the major regulatory element (α-MRE) is a single erythroid-specific DNaseI hypersensitive site located in the intron of a ubiquitously expressed gene, some 40 kb telomeric to the α-genes.7 The β-globin cluster is AT-rich, has no CpG islands, and adopts an open chromatin structure in erythroid cells only.8 The major regulatory element called the locus control region (LCR) is located approximately 20 kb upstream of the structural genes and contains 5 erythroid-specific DNaseI hypersensitive sites.9 Thus, there are considerable differences in the regulatory mechanisms of these 2 loci. It is therefore interesting to determine the structure of the globin loci in more primitive vertebrates. Here, we describe the isolation of cDNAs encoding α- and β-globin from F rubripes. We have used these cDNAs to isolate the genomic loci of these genes. Surprisingly, the genes have split onto separate chromosomes in the pufferfish, as demonstrated by DNA–fluorescence in situ hybridization (FISH) analysis. Sequence analysis of the cosmid containing the β-globin gene reveals close linkage with 2 α-globin genes, α3 and α4; we refer to this locus as the αβ-locus. Another α-globin cDNA is derived from a globin locus on a different chromosome. This locus contains 2 α-globin genes, α1 and α2, but no β-globin gene,10 and is therefore referred to as the α-locus.
To find potential regulatory elements in the αβ-locus, we have performed DNaseI hypersensitive site analysis in Fugu erythroid cells, searched for transcription factor binding sites that are hallmarks of the mammalian distal regulatory elements LCR and α-MRE, and analyzed expression of the αβ-locus in transgenic mice. These analyses suggest that the αβ-locus may not contain a strong distal element regulating the activity of the globin genes.
Both pufferfish globin loci are flanked by a highly conserved gene encoding a protein homologous to Drosophila rhomboid. The observation that a mammalian homologue of this gene, C16orf8, is found closely linked to the mammalian α-globin genes strongly supports the hypothesis that the α- and β-globin loci have evolved from a single ancestral hemoglobin locus. Furthermore, we have found a short region of homology between the pufferfish hemoglobin loci and human chromosome 17. This region contains a C16orf8 homologue closely linked to a gene encoding a recently identified novel member of the globin family, called cytoglobin11 or histoglobin.12 Collectively, our data indicate that duplication of the globin loci occurred several times in evolution. Each locus then diverged from the ancestral locus, resulting in the 5 human globin loci known to date, with characteristic features regarding chromosomal environment, flanking genes, function, and expression pattern.
Materials and methods
Construction of the genomic cosmid library
High-molecular-weight DNA was isolated from adult Fugu blood (a kind gift from Dr Ichiro Nakayama, Tamaki, Japan) and was partially digested with MboI. The DNA was size-fractionated by centrifugation through a salt gradient, and DNA fragments in the 20- to 40-kilobase (kb) size range were ligated to the arms of the pTCF cosmid cloning vector.13 Ligation reactions were packaged with in vitro packaging extracts followed by infection of HB101 host cells. The resultant cosmid library was plated on ampicillin-containing LB-agar, and filters derived from this library were screened for the presence of cosmids containing the α- and β-globin genes. In addition, we used a gridded cosmid library constructed in the lawrist 4 vector.14
Isolation and characterization of cosmid clones
Fugu α1- and β-globin cDNA clones were isolated from a cDNA library made from adult Fugu blood, using salmon α- and β-globin cDNAs as probes.15 These cDNA clones were then used to screen the genomic cosmid libraries. Two positive cosmids were obtained with the α1-globin probe and one with the β-globin probe. The cosmids were subjected to restriction mapping, subcloning, and sequencing.
Transgenic mice
The β-cosmid was digested with EcoRI, and the 22-kb DNA fragment containing the globin genes (Figure 4A) was purified on a salt gradient. It was then used at a concentration of 2 μg/mL to generate transgenic mice.16 DNA isolated from tail clips was used to identify transgenic founder mice by Southern blotting. Transgenic F1 offspring were mated to wild-type FVB mice, and expression of the pufferfish β-globin gene was analyzed by reverse transcription–polymerase chain reaction (RT-PCR) of RNA isolated from yolk sac (day-11.5 embryos), fetal liver (day-13.5 fetuses), and peripheral blood (adult mice). Primers used were TGGACTGATCAAGAGCGC (sense) and GTCCATGTTCTTCACAGC (antisense); the expected product size on cDNA was 215 base pairs (bp). No amplification product was expected on genomic DNA because the sense primer bridges exons 1 and 2.
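The logic of the exon-bridging sense primer can be illustrated with a minimal in silico PCR sketch. The primer sequences are those given above; the templates are toy sequences invented for illustration (they are not the Fugu β-globin sequence, and the position at which the toy intron splits the primer is an arbitrary assumption), so the toy product size does not reproduce the 215-bp product.

```python
# Minimal in silico PCR sketch: perfect-match primer search only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase/lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def amplicon_length(template, sense, antisense):
    """Return the product size if both primers anneal perfectly in the
    expected orientation on the given strand, otherwise None."""
    template = template.upper()
    start = template.find(sense.upper())
    if start == -1:
        return None
    # The antisense primer anneals to the top strand as its reverse complement.
    site = reverse_complement(antisense)
    end = template.find(site, start + len(sense))
    if end == -1:
        return None
    return end + len(site) - start

SENSE = "TGGACTGATCAAGAGCGC"      # from the text
ANTISENSE = "GTCCATGTTCTTCACAGC"  # from the text

# Hypothetical templates (illustration only).
filler = "ACGT" * 10
toy_cdna = "GG" + SENSE + filler + reverse_complement(ANTISENSE) + "CC"
toy_intron = "GTAAGT" + "T" * 20 + "AG"
# Assumed exon 1/exon 2 junction after primer base 9 (arbitrary choice).
toy_genomic = ("GG" + SENSE[:9] + toy_intron + SENSE[9:] + filler
               + reverse_complement(ANTISENSE) + "CC")

print(amplicon_length(toy_cdna, SENSE, ANTISENSE))     # product on spliced cDNA
print(amplicon_length(toy_genomic, SENSE, ANTISENSE))  # None: intron splits the primer site
```

On the spliced toy cDNA the primer site is contiguous and a product is predicted; on the toy genomic template the intron interrupts the sense primer site, so no product is predicted.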
DNA sequencing and analysis
Subclones from the β-globin cosmid were sequenced on an ABI automated sequencer. Overlapping subclones were then used to assemble the sequences into a contig. Final gaps in the sequence were closed by direct sequencing of cosmid DNA with custom-designed primers. Sequence homology searches were performed against the public databases using the BLAST computer programs17 (http://www.ncbi.nlm.nih.gov/BLAST/, http://www.ensembl.org); the private database of the Celera company (Rockville, MD; human and mouse genomes); and the Fugu genomic and cDNA databases (http://fugu.hgmp.mrc.ac.uk/Analysis/). The loci drawn in Figure 5 represent the consensus of the public and private databases (human, mouse, and Fugu genome; January 2002) and published papers (April 2002). Genscan was used to find potential exons (http://genes.mit.edu/GENSCAN.html). Alignments of human and Fugu sequences were made with the BLAST 2 sequences program (http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html). To increase the sensitivity of the searches, the Fugu contig was divided into 1.1-kb subsequences with 100-bp overlaps. Alignments of C16orf8 orthologues were visualized with VISTA (Lawrence Berkeley National Laboratory, Berkeley, CA).18 The accession number of the αβ-locus is AY170464.
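The division of the contig into 1.1-kb subsequences with 100-bp overlaps can be sketched as follows. This is an illustration of the tiling scheme only; the function name, the handling of the final shorter tile, and the toy sequence are assumptions, not the procedure actually used.

```python
def tile_sequence(seq, window=1100, overlap=100):
    """Split seq into windows of `window` bp, each sharing `overlap` bp
    with the next. Returns a list of (start, subsequence) tuples.
    The last tile may be shorter than `window`."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    step = window - overlap
    tiles = []
    start = 0
    while True:
        tiles.append((start, seq[start:start + window]))
        if start + window >= len(seq):
            break
        start += step
    return tiles

# Toy 3500-bp sequence (hypothetical, generated deterministically).
toy_contig = "".join("ACGT"[(i * i) % 4] for i in range(3500))
for start, sub in tile_sequence(toy_contig):
    print(start, len(sub))
```

Each tile could then be submitted as a separate query, so that a short conserved region is not diluted by the rest of the contig.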
Preparation of Fugu metaphase spreads
Live pufferfish (fingerlings 4-10 cm in size) were purchased from Green Science, Yamaguchi, Japan. Chromosome spreads were prepared as previously described.19 Briefly, the pufferfish were intraperitoneally injected with 0.1% colchicine (approximately 0.1 mL/10 g fish), the fish were killed after 6 to 8 hours, and the kidneys were isolated. Kidney cell suspensions were subjected to hypotonic treatment, and, after fixation, the cells were dropped onto slides and stained with Giemsa.
In situ hybridization of Fugu interphase nuclei and metaphase spreads
DNA-FISH was performed on interphase nuclei and chromosomal spreads from the pufferfish using biotin- or digoxigenin-labeled plasmid probes containing the Fugu α- and αβ-cosmid DNAs. Preparation of the samples and hybridizations were carried out as described by Mulder et al.19 The hybridized biotin probe was detected with 2 layers of avidin–fluorescein isothiocyanate (FITC), and the hybridized digoxigenin probe was detected with an anti-digoxigenin antibody followed by a Texas red–labeled secondary antibody. DNA was counterstained with DAPI.
DNaseI hypersensitive site mapping
Nuclei were isolated from frozen F rubripes tissues as described.20 In some instances, mouse fetal livers were added as a source of carrier nuclei. A time-course (0-10 min) of DNaseI digestion was performed at 37°C.20 The reactions were stopped by the addition of sodium dodecyl sulfate (SDS) to 1% final concentration and EDTA (ethylenediaminetetraacetic acid) to 5 mM final concentration. After purification, the DNA was digested with SphI, SacI, or XhoI, fractionated on 0.8% agarose gels, and Southern blotted. Blots were hybridized with various probes from different regions of the αβ-locus to cover the entire locus on overlapping restriction fragments. These probes were made by PCR using the following primer pairs: probe 1, CCGACAAGCGTTGCAGTAAT and ATTCTCCTTTGGCCTGCTTC (product 1082 bp); probe 2, TTCAGACAGGCTAGAATGCC and CGTATGTGGCTTGTTCCCTT (product 605 bp); probe 3, AAGCTGTGTTCTTGACTGGG and ACCAGGAGTTGCTTTGGAAC (product 1098 bp); probe 4, TGACAACTCGCTGGTAACTG and TCCACAAGGTCCCTGTATTC (product 554 bp); probe 5, TCAGTGGCGACATTTCACCT and GGAAGGTTCATTTGCACACG (product 970 bp); probe 6, AGCTTGACTCCCGATGAACT and AGAATCTGCCTCGAAGAAGC (product 974 bp); probe 7, GCAGCAGGTTCTCAATCATC and TAGACACCCAAAGCCTTGAC (product 711 bp).
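As a quick sanity check on the probe primers listed above, basic primer properties can be tabulated. The sketch below computes primer length, GC fraction, and a rough melting temperature using the Wallace rule (2 °C per A/T, 4 °C per G/C); this rule is only a coarse estimate for short oligonucleotides and is an assumption made here for illustration, not a calculation reported in the text.

```python
# Primer pairs and product sizes as listed in the text.
PROBE_PRIMERS = {
    1: ("CCGACAAGCGTTGCAGTAAT", "ATTCTCCTTTGGCCTGCTTC", 1082),
    2: ("TTCAGACAGGCTAGAATGCC", "CGTATGTGGCTTGTTCCCTT", 605),
    3: ("AAGCTGTGTTCTTGACTGGG", "ACCAGGAGTTGCTTTGGAAC", 1098),
    4: ("TGACAACTCGCTGGTAACTG", "TCCACAAGGTCCCTGTATTC", 554),
    5: ("TCAGTGGCGACATTTCACCT", "GGAAGGTTCATTTGCACACG", 970),
    6: ("AGCTTGACTCCCGATGAACT", "AGAATCTGCCTCGAAGAAGC", 974),
    7: ("GCAGCAGGTTCTCAATCATC", "TAGACACCCAAAGCCTTGAC", 711),
}

def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)

def wallace_tm(primer):
    """Rough melting temperature in degrees C (Wallace rule, short oligos only)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

for probe, (fwd, rev, size) in PROBE_PRIMERS.items():
    for primer in (fwd, rev):
        print(probe, primer, len(primer),
              round(gc_fraction(primer), 2), wallace_tm(primer), size)
```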
Expression analysis
One microgram total RNA, isolated from various tissues of adult mice, was reverse transcribed with an oligo-dT primer followed by PCR reactions on one fifth of the total synthesis. Primers used are 5′-GGCACACACCAAGAGTTCAGG-3′ and 5′-GCACGGTAGCCACAGCAGTA-3′ for cytoglobin (293-bp product). Primers for cyclophilin A (5′-TCACCATTTCCGACTGTGGAC-3′ and 5′-ACAGGACATTGCGAGCAGATG-3′) were used as an internal control (99-bp product). PCR cycles used were determined to be within the linear range of the reactions.
Results
Isolation of cDNAs encoding F rubripes globins
To begin to analyze the globin genes of the pufferfish, we prepared RNA from peripheral blood and used this to construct a cDNA library. This library was screened under low-stringency conditions with salmon α- and β-globin cDNA probes.15 Twenty-four positive clones were picked and were analyzed by restriction mapping followed by sequencing and database searches. This resulted in the isolation of cDNAs encoding Fugu α- and β-globin proteins (Figure 1). The α-globin cDNA is designated α1-globin in this paper. BLAST searches of the Fugu cDNA database revealed many perfect matches with our α1-globin cDNA and many imperfect matches with another α-globin cDNA. For β-globin, we found many highly similar cDNAs aligning with our β-globin cDNA, indicating that these are all derived from the same gene and suggesting that Fugu contains only one functional β-globin gene. The deduced amino acid sequence of the Fugu α- and β-chains predicts that they would form a Bohr-type hemoglobin tetramer.24
Chromosomal clones containing the F rubripes α- and β-globin genes
The Fugu pTCF cosmid library was screened with the Fugu α1- and β-globin cDNAs as probes. Two cosmids that hybridized strongly with the α1-globin probe (α-cosmids) were recovered. One of the α-cosmids was rearranged and is not further considered. In addition, we screened the gridded Fugu lawrist 4 cosmid library14 with the β-globin probe. This resulted in the isolation of one cosmid (ICRFc66E1840; αβ-cosmid; see below) strongly hybridizing with this probe. Restriction mapping and Southern hybridizations failed to demonstrate the presence of overlapping DNA fragments between the α- and αβ-cosmids, suggesting that their inserts are not closely linked in the Fugu genome.
DNA-FISH analysis of F rubripes α- and αβ-globin loci
To investigate whether the 2 globin cosmids are physically linked in F rubripes, we used DNA-FISH on interphase nuclei and metaphase chromosome spreads.19 To obtain metaphase spreads of pufferfish chromosomes, fish were injected intraperitoneally with colchicine and were killed 6 to 8 hours later. Because the kidney is the site of hematopoiesis in fish, it is likely to contain relatively large numbers of dividing cells. We therefore used kidney cells to prepare metaphase spreads according to standard procedures.19 We used α- and αβ-cosmid DNA probes to evaluate the chromosomal localization of the 2 globin loci. In nuclei, we found 2 red spots with the α-cosmid and 2 green spots with the αβ-cosmid. Confocal microscopy shows that these spots are always clearly separated, and we did not observe colocalization of the red and green signals (Figure 2A). Although these data do not exclude that the loci are on the same chromosome, they show that the loci are not closely linked in the pufferfish genome. The analysis of spread metaphases was more difficult because of the low frequency of dividing cells and the inefficiency of probe hybridization. Nevertheless, specific hybridization signals could be detected and categorized into chromosomes bearing green signals and chromosomes bearing red signals. Both signals were observed at the telomeric ends of the chromosomes. However, colocalization of red and green signals on the same chromosome was never found. Furthermore, the red α-cosmid signal is present on a much larger chromosome than the green αβ-cosmid signal (Figure 2B). We conclude that the α- and αβ-cosmids represent 2 hemoglobin loci that have separated onto different chromosomes in the pufferfish.
F rubripes α3-, α4-, and β-globin genes
Because the work in our laboratory is focused on the analysis of the human β-globin gene cluster, we analyzed the pufferfish cosmid containing the β-globin gene in more detail. We sequenced the β-globin gene and flanking sequences. We found that the cosmid contains one β-globin gene that matches our β-globin cDNA perfectly. In addition, we found that 2 putative α-globin genes flank the β-globin gene. To validate the assignment of the Fugu globins as either α-type or β-type proteins, alignments of human and Fugu globins are shown in Figure 1.
The α-cosmid contains the α1- and α2-globin genes but no gene encoding a β-globin polypeptide.10 We therefore refer to our α-globin genes as the α3- and α4-globin genes and to our globin locus as the αβ-locus. Because the α3-globin gene aligns perfectly with α-globin cDNAs in the Fugu cDNA database, we conclude that it is a functional gene. The α3-/α4- and β-globin genes are in the opposite transcriptional orientation (Figure 3A). The introns of these globin genes are relatively small (88-551 bp), as would be expected in the pufferfish, but the classical 3 exon/2 intron structure of the vertebrate globin genes is conserved.6 The splice donor and acceptor sites conform to the GT/AG rule, and we find canonical poly-adenylation signals in all 3 genes (data not shown). The α4 globin gene is most closely related to the αd-globin gene of birds, the minor adult α-globin.25 Given that the gene lacks a TATAA box in the canonical position and there are no perfect matches in the Fugu cDNA database, we consider it possible that this gene is no longer active.
We find a number of distinctive hallmarks in the promoters of the Fugu α3- and β-globin genes (Figure 3B-C). Both promoters contain noncanonical TATA-box motifs at the expected positions. Perhaps the most interesting observation is the presence of an inverted CACC-box motif (TGGGTGGGG) in the β promoter. In mammals, this motif is essential for high-level β-globin expression26,27 through the interaction with the erythroid-specific transcription factor, EKLF.28-30 This suggests that expression of the Fugu β-globin gene is also regulated by an EKLF homologue. To functionally characterize the Fugu αβ-locus, we isolated a 22-kb EcoRI fragment containing the globin genes (Figure 4A) and used it to generate transgenic mice. Two founder mice transmitted the transgene to their offspring. Although these lines appeared to contain intact copies of the transgene as judged by Southern blot analysis, we could not detect the expression of transgene-derived β-globin mRNA in embryonic, fetal, and adult erythroid cells (data not shown). Thus, either the 22-kb fragment lacks elements essential for globin gene activation or the evolutionary distance precludes activation of the pufferfish αβ-locus in mice.
Search for distal regulatory elements
We searched for combinations of erythroid-specific transcription factor binding sites (EKLF, GATA, NF-E2)10 to identify regulatory elements of globin expression outside the promoters of the genes. Although we found a clustering of potential NF-E2 binding sites upstream of the α4-globin gene, positioned around 25.5 kb in Figure 4A, these sites were part of a 27-bp sequence tandemly repeated 3 times and located in an area of repetitive DNA (Figure 4A). Such an arrangement does not resemble previously characterized globin control elements, and the clustering of these sites is possibly spurious because of the tandem repeats. BLAST alignments with other vertebrate globin loci did not reveal any clues to the presence of regulatory elements. However, sequence conservation in regulatory modules is usually very poor. We therefore used DNaseI hypersensitive site (HS) mapping as an alternative approach to obtain information about potential regulatory elements in the pufferfish αβ-locus. We isolated nuclei from peripheral blood and digested these with increasing amounts of DNaseI to reveal the presence of erythroid-specific DNaseI HS in the locus. As nonerythroid control tissue we used liver. We chose restriction digests and PCR-generated hybridization probes suitable for HS mapping (Figure 4A). Southern blots revealing hypersensitive sites at the globin gene promoters are shown in Figure 4B. We found that the promoters of the α3- and β-globin genes were in an open chromatin conformation in erythroid cells only, in agreement with the notion that these genes are actively transcribed in red blood cells. We did not find hypersensitivity associated with the α4 promoter, in agreement with our hypothesis that this promoter is no longer functional. The repetitive sequences around 25.5 kb appear to be hypersensitive to DNaseI digestion in erythroid cells, but some hypersensitivity is also found in the control tissue (Figure 4B).
Thus, this might reflect an intrinsic property of these repetitive sequences. In conclusion, the analysis of DNaseI sensitivity in the pufferfish αβ-locus chromatin demonstrated the presence of erythroid-specific hypersensitive sites associated with the promoters of the α3- and β-globin genes but has not revealed the presence of strong erythroid hypersensitive sites at other positions in the locus. This suggests that activation of the globin genes in the pufferfish αβ-locus does not require the presence of distant regulatory elements.
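A search for transcription factor binding sites of the kind described in this section can be sketched as a consensus scan over both strands. The consensus strings below (WGATAR for GATA, YGCTGASTCAY for NF-E2, CCNCNCCCN for EKLF) are commonly cited forms and are assumptions made for illustration; the text does not state which matrices or consensus sequences were used. The 27-bp repeat unit in the example is invented and is not the Fugu sequence; it only shows how a tandem repeat produces an apparent cluster of sites.

```python
import re

# IUPAC codes needed for the consensus strings used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumed consensus sequences (illustrative, not taken from the text).
MOTIFS = {"GATA": "WGATAR", "NF-E2": "YGCTGASTCAY", "EKLF": "CCNCNCCCN"}

def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]

def to_regex(consensus):
    # A lookahead is used so that overlapping matches are all reported.
    return re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")

def scan(seq, motifs=MOTIFS):
    """Return (name, start, strand, matched_site) tuples, sorted by start.
    Start is always given in forward-strand coordinates (0-based).
    Palindromic sites are reported once per strand."""
    seq = seq.upper()
    rc = revcomp(seq)
    hits = []
    for name, consensus in motifs.items():
        pattern = to_regex(consensus)
        for m in pattern.finditer(seq):
            hits.append((name, m.start(), "+", m.group(1)))
        for m in pattern.finditer(rc):
            start = len(seq) - m.start() - len(consensus)
            hits.append((name, start, "-", m.group(1)))
    return sorted(hits, key=lambda h: h[1])

# The inverted CACC box from the beta promoter is found on the minus strand.
print(scan("AAATGGGTGGGGAAA"))

# Hypothetical 27-bp unit repeated 3 times: one site becomes a "cluster" of 3.
unit = "TGCTGAGTCAT" + "AC" * 8
print([h for h in scan(unit * 3) if h[0] == "NF-E2"])
```

The second example illustrates why a cluster of sites inside a tandem repeat is weak evidence for a regulatory element: the count reflects the repeat copy number rather than independent sites.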
Genes flanking the F rubripes αβ-locus
In mammals, the β-globin locus is flanked by genes encoding odorant receptors.31 If this represents the archetypal β-globin locus, a similar setting might be found for the Fugu αβ-locus. Using the Genscan computer program, we found a number of potential exons in the region downstream of the α3-globin gene. These exons are highly homologous to the human full-length cDNA FLJ22357, which is encoded by the C16orf8 gene located close to the human α-globin cluster (gene 5 in Flint et al10). Because some of the exons and introns are extremely small (65 bp), the exon–intron structure of this gene is not readily predicted by Genscan. We made use of the FLJ22357 cDNA and deduced protein sequence to determine the intron–exon structure of the Fugu gene. We find that the human and pufferfish genes contain 18 exons and that all the exon–intron boundaries are in the same positions. Furthermore, alignment of the predicted proteins reveals that 72% of the amino acids are identical and 83% are similar, with just 23 gaps in the alignment of the 855 amino acid (aa) proteins. This degree of conservation is much higher than that observed for the hemoglobins (48%-49% identical residues). We conclude that in the αβ-locus, a homologue of gene 5 is the first gene flanking the globin genes on the left (Figures 4A, 5). This is surprising because we have found previously that homologues of genes telomeric to the human α-globin cluster are present in the pufferfish α-locus in the order gene 4, gene 3.1, gene 5, gene 6, and gene 7, with gene 7 closest to the α-globin genes10 (Figure 5).
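Identity and similarity percentages such as those quoted above are derived from a pairwise protein alignment. The sketch below shows one way to compute them from two gapped, equal-length alignment strings. The similarity groups are a simplified, hypothetical classification of conservative substitutions (alignment programs typically use a scoring matrix instead), and percentages are taken over the full alignment length; both choices are assumptions for illustration.

```python
# Hypothetical groups of "similar" amino acids (illustration only).
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]

def alignment_stats(a, b):
    """Count identical, similar, and gapped columns in a pairwise alignment.
    `a` and `b` are aligned sequences of equal length, with '-' for gaps.
    Identical columns also count as similar."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = gaps = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            gaps += 1
        elif x == y:
            identical += 1
            similar += 1
        elif any(x in g and y in g for g in SIMILAR_GROUPS):
            similar += 1
    length = len(a)
    return {
        "length": length,
        "identical": identical,
        "similar": similar,
        "gaps": gaps,
        "pct_identical": 100.0 * identical / length if length else 0.0,
        "pct_similar": 100.0 * similar / length if length else 0.0,
    }

# Toy alignment (not real sequence data).
print(alignment_stats("MKLV-AD", "MRLIQAE"))
```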
To the right of the α4-globin gene, we found potential exons encoding parts of a protein with extensive homology to the human leucine carboxyl methyltransferase (LCMT) enzyme. Using the human LCMT cDNA sequence, we were able to identify the remaining exons of the Fugu LCMT gene. Both genes contain 11 exons, and the exon–intron boundaries are well conserved between the species. The Fugu LCMT gene is in the same transcriptional orientation as the α-globin genes (Figure 4A). The putative LCMT protein is highly conserved between human and Fugu: 65% of the amino acids are identical, and 80% of the residues are similar. Furthermore, optimal alignment does not require the introduction of gaps in either the human or the Fugu sequence. This degree of similarity is much higher than that observed with the hemoglobins, supporting the notion that the putative Fugu LCMT gene is functional. We find that the human LCMT gene is located on chromosome 16, some 30 Mb away from the α-globin locus. We therefore conclude that the region of homology between the human and Fugu globin loci stops immediately to the right of the α4 gene. These data are consistent with previous reports showing that the regions of homology between α-globin loci break down at comparable positions.10
Comparative analysis reveals the presence of a novel globin locus in mammals
In the human genome, a paralogue of the C16orf8 gene, encoding FLJ22341, is found on chromosome 17. The pufferfish hemoglobin loci also contain genes highly homologous to the C16orf8 gene. Multiple alignment of the human and pufferfish genes generated with the aid of VISTA18 demonstrates the conservation of the coding exons between the C16orf8-related genes (Figure 6). Because the pufferfish globin loci are flanked by C16orf8 homologues, we searched the surroundings of the FLJ22341 gene on human chromosome 17 for the presence of the other genes found in the pufferfish α- and αβ-globin loci. This comparison yielded 2 remarkable results. First, the FLJ22341 gene is flanked on the left by the AANAT gene, encoding the arylalkylamine N-acetyltransferase protein (NP_001079). This gene is present in a similar position in the pufferfish α-locus, but not in the human α-locus (Figure 5). Second, the gene immediately flanking the FLJ22341 gene on the right encodes a novel member of the globin family (XM_058818)11,12; the official name assigned to this globin is cytoglobin (CYGB). Our expression analysis of cytoglobin in the mouse (Figure 7A-B) confirms previous observations that it is widely expressed11,12 but also reveals large differences in expression between tissues.
In the human and pufferfish α-loci, the MPG and C16orf35 genes are between the C16orf8 gene and the globin genes, but in the pufferfish αβ-locus, the C16orf8 homologue is immediately flanked by a globin gene (Figure 5). Thus, this order of genes is the same between the cytoglobin locus on human chromosome 17 and the pufferfish αβ-locus. Furthermore, this syntenic region on human chromosome 17 is completely conserved with a syntenic region on mouse chromosome 11, confirming the common evolutionary origin of this area of the genome. Based on these observations, we conclude that we have identified a novel globin locus on human chromosome 17/mouse chromosome 11.
Discussion
The α- and β-globins in pufferfish
Here, we describe the isolation of α- and β-globin genes from the pufferfish F rubripes. We present evidence that Fugu contains one β-globin gene and at least 2 functional α-globin genes. This conclusion is supported by BLAST searches in the most recent version of the Fugu genome5 (version 8.1.1; release date, July 18, 2002) that indicate the Fugu genome does not harbor hemoglobin genes in addition to those contained in the α- and αβ-loci.
The presence of 3 functional hemoglobin genes has been reported previously for the black rock cod, Notothenia coriiceps. It has been suggested that these fish have no requirement for hemoglobin molecules with different oxygen affinities because there is little variation in temperature and oxygen levels in their habitat.32,33 This could also apply to the pufferfish. The Fugu β-globin gene is closely linked to the α3- and α4-globin genes; such close linkage is commonly observed in poikilothermic jawed vertebrates. It is interesting that the α1- and α2-globin genes are located in a different globin cluster on a separate chromosome. This globin locus encodes only α-globin.10 To the best of our knowledge, this is the first example of the split of α- and β-globin genes onto separate chromosomes in poikilothermic vertebrates. The Fugu α-locus contains the α1- and α2-globin genes, of which the α1-globin is active. The αβ-locus contains the α3- and α4-globin genes, of which the α3 gene is active. The α2 and α4 genes are reminiscent of the rat γ1 gene that has retained its coding capacity but is not expressed because of an inactive promoter.34 We hypothesize that these apparently superfluous globin genes have been silenced to maintain a proper α/β chain ratio.
Regulation of globin gene expression
One of the aims of the present study was to gain insight into the regulatory mechanisms underlying globin gene expression. We anticipated that the small size of the Fugu globin clusters would facilitate the elucidation of the requirements for high-level, erythroid-specific gene expression. However, we found no expression of the Fugu αβ-cosmid globin genes in transgenic mice. Other examples have been reported for the Fugu WT1 gene (N. Hastie, personal communication, March 2000) and the Huntingtin gene.35 Possibly, the evolutionary distance between Fugu and mouse precludes the activation of Fugu genes in the mouse.
The major regulatory elements of the mammalian hemoglobin loci, α-MRE and LCR, are characterized by the presence of strong DNaseI HSs in erythroid cells. We therefore performed DNaseI hypersensitive site mapping of the Fugu αβ-locus and searched for transcription factor binding sites to identify candidate regulatory elements. We find erythroid-specific hypersensitive sites overlapping the promoters of the α3- and β-globin genes. We note that the promoter of the Fugu β-globin gene contains an EKLF consensus site, in a position similar to that found in the mammalian β-promoters. This binding site is required for high-level β-globin expression in mammals through the interaction with the erythroid-specific EKLF transcription factor.30 In contrast, we do not find evidence for the presence of distal regulatory elements in the αβ-locus, suggesting that activation of the globin genes in the αβ-locus may not require remote activating elements. It is intriguing that the region of homology with the putative α-MRE, located in a conserved position (intron 5 of gene 7)10 in the Fugu α-locus, is absent in the αβ-locus. A strong erythroid-specific DNaseI hypersensitive site coincides with this putative α-locus MRE (data not shown), and we have previously shown that this element serves as an enhancer in transfection experiments.10 Collectively, these data argue that remote regulatory elements do exist in fish. Therefore, it remains possible that such elements are part of the αβ-locus outside the area analyzed in this study.
Evolution of hemoglobin loci
The physical separation of the α- and β-genes is thought to be advantageous for the generation of novel α- and β-chain variants because gene conversion events would suppress the separate evolution of these 2 globins when the genes are in cis. Furthermore, separation of the loci increases the flexibility of the spatio-temporal regulation of globin gene expression, as exemplified by the relatively recent recruitment of a fetal β-like globin gene in some euplacental mammals such as goats and humans.37 It is believed that the α- and β-globins have evolved from an ancestral globin gene through in cis gene duplication events. Later, the α- and β-globin genes split onto separate chromosomes through in trans duplication of the locus followed by the elimination of the α-genes from the β-locus and the β-genes from the α-locus, resulting in the distinct α- and β-globin loci found in today's birds and mammals.37 This model of the common evolutionary origin of the human α- and β-globin loci is strongly supported by our observation that C16orf8 homologues are linked to both Fugu hemoglobin loci. However, the mammalian β-globin loci are flanked by olfactory receptor genes.31 In Fugu, we find no evidence for homology with mammalian and chicken β-globin loci in the chromosomal regions around the β-gene. Possibly, the locus duplication events leading to the Fugu and mammalian hemoglobin loci have occurred independently during evolution. This is supported by the fact that thus far no α-only or β-only loci have been found in the 2 major amphibian lineages, frogs38 and salamanders (T.McM. and S.P., unpublished data, May 2001). Alternatively, the loci may have arisen from the same duplication event, with the present day β-globin loci in homoiothermic vertebrates separated from their original flanking genes through additional chromosomal rearrangements. The analysis of genes flanking the hemoglobin loci in amphibians might help to distinguish between these 2 possibilities.
Common evolutionary origin of human globin loci?
The comparative analysis of pufferfish and human globin loci is consistent with a common evolutionary origin of the human globin loci, since we find short regions of homology flanking the α-globin (pufferfish and human), αβ-globin (pufferfish), and cytoglobin (human) clusters. The current annotation of the Fugu genome indicates that neuroglobin, myoglobin, and cytoglobin genes are present in this fish species, but it is not yet clear whether any of these globin genes are also linked to AANAT or C16orf8 genes, or both (S.P., unpublished observations, August 2002).
Recently, evidence has been presented that ancient genome duplications contributed to the vertebrate genome.39,40 In agreement with these data, our work supports a model in which globin loci have evolved through duplication events followed by diversification and specialization of the separate loci.37 Evidence for the original chromosomal rearrangements that gave rise to human myoglobin, neuroglobin, and β-globin loci may no longer be recognizable because other, unrelated genes now flank these loci (Figure 5). In contrast, the cytoglobin locus has retained linkage with the AANAT and C16orf8 genes, providing evidence of the common evolutionary origins of this locus and the α- and αβ-loci. Cytoglobin appears to be the most primitive of these loci because it contains only one globin gene encoding a globin of ancient origin.11,12 We therefore suggest that the cytoglobin locus reflects the gene arrangement in an ancient vertebrate globin locus from which the modern-day human myoglobin, cytoglobin, α-, and β-globin loci have been derived (see below; Figure 8). This notion is supported by the observation that the C16orf8 homologues of the human α-globin and the Fugu α- and αβ-loci are more closely related to each other than to the C16orf8 homologue of the cytoglobin locus (Figure 6B).
Model for the evolutionary origin of the human globin loci
Recently, a model of the evolution of vertebrate globins has been proposed, based on a phylogenetic analysis of the globins.11 In Figure 8, we have adapted this proposal to accommodate the evolution of the human globin loci. The very early ancestor to vertebrates contained a single ancestral globin gene. This globin gene may already have been linked to ancestral AANAT and C16orf8 genes, but there is no experimental evidence to support such linkage. Based on the antiquity of neuroglobin, it has been proposed that the last common ancestor to all vertebrates contained 2 globin loci.11 Thus, duplication of the ancestral globin locus resulted in 2 globin loci, developing into loci encoding neuroglobin and cellular globin. Our data support linkage of the cellular globin locus to the C16orf8 and AANAT genes at this stage. Next, duplication of the cellular globin locus resulted in separate cellular and hemoglobin loci. Linkage of C16orf8 and AANAT genes to cytoglobin and hemoglobin loci supports this mechanism. A further duplication of the cellular globin locus allowed the development of the myoglobin and cytoglobin loci. It is unclear from the phylogenetic data whether this occurred before the jawed vertebrates diverged from the jawless vertebrates (agnathans: lampreys and hagfish),11,36 and it will therefore be of interest to determine whether agnathans have both a myoglobin and a cytoglobin locus. In the hemoglobin locus, gene duplication gave rise to a cluster encoding several monomeric hemoglobins, as found in today's agnathans. This allowed the specialization of individual genes in α-type or β-type hemoglobins. Finally, additional locus duplication events followed by deletions of globin genes resulted in hemoglobin loci with only α-type or β-type globins, typical of birds and mammals. The presence of the α-locus, containing only α-type globins, and the αβ-locus, containing both types of globins, in the pufferfish supports this mechanism.
Furthermore, this suggests that the locus that gave rise to the human α-globin locus was already an “α-only” locus when amphibians diverged from bony fishes, approximately 400 million years ago, predicting that α-only loci are also present in the amphibian and reptile lineages. Alternatively, the β-gene might have been lost from these loci independently after the divergence of bony fishes and amphibians.
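The order of locus duplications in this model can be written down as a short list of events. The following Python sketch is purely illustrative and not part of the analysis in this study: the event list is transcribed from the description above, the names `DUPLICATIONS` and `derive_loci` are hypothetical, and the in cis gene duplications and subsequent gene deletions within the hemoglobin locus are collapsed into a single final step.

```python
# Locus duplication events of the proposed model, in the order described
# in the text. Each entry is (parent locus, (daughter locus, daughter locus)).
# ASCII names are used for the alpha- and beta-globin loci.
DUPLICATIONS = [
    ("ancestral globin", ("neuroglobin", "cellular globin")),
    ("cellular globin", ("cellular globin", "hemoglobin")),
    ("cellular globin", ("myoglobin", "cytoglobin")),
    # Simplification: in cis gene duplication, locus duplication, and gene
    # deletions are treated here as one step yielding alpha-only and
    # beta-only loci, as in birds and mammals.
    ("hemoglobin", ("alpha-globin", "beta-globin")),
]


def derive_loci(events, root="ancestral globin"):
    """Apply each duplication in order and return the resulting loci."""
    loci = [root]
    for parent, daughters in events:
        if parent not in loci:
            raise ValueError(f"parent locus not present: {parent}")
        loci.remove(parent)      # the parent locus is replaced ...
        loci.extend(daughters)   # ... by its two daughter loci
    return sorted(loci)


if __name__ == "__main__":
    print(derive_loci(DUPLICATIONS))
```

Under these assumptions, applying the four events in order yields five loci, matching the 5 human globin loci discussed here. Loci retained in the pufferfish, such as the αβ-locus, are not represented in this simplified list.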
Future directions
The model for the evolutionary origin of the human globin loci, presented in Figure 8, makes several predictions that can be experimentally tested through in silico analysis of globin loci in agnathans, amphibians, and reptiles. In combination with in vitro and in vivo assays, this “functional genomics” approach will provide detailed insight into the evolution and regulation of vertebrate globin gene clusters.
Prepublished online as Blood First Edition Paper, November 27, 2002; DOI 10.1182/blood-2002-09-2850.
Supported by the Dutch Organization for Scientific Research NWO.
N.G. and T.M. contributed equally to this work.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.
Author notes
Sjaak Philipsen, Erasmus MC Department of Cell Biology, PO Box 1738, 3000 DR Rotterdam, The Netherlands; e-mail: philipsen@ch1.fgg.eur.nl.