Abstract
Much of our understanding of human physiology, and of many aspects of pathology, has its antecedents in laboratory and clinical studies of hemoglobin. Over the last century, knowledge of the genetics, functions, and diseases of the hemoglobin proteins has been refined to the molecular level by analyses of their crystallographic structures and by cloning and sequencing of their genes and surrounding DNA. In the last few decades, research has opened up new paradigms for hemoglobin related to processes such as its role in the transport of nitric oxide and the complex developmental control of the α-like and β-like globin gene clusters. It is noteworthy that this recent work has had implications for understanding and treating the prevalent diseases of hemoglobin, especially the use of hydroxyurea to elevate fetal hemoglobin in sickle cell disease. It is likely that current research will also have significant clinical implications, as well as lessons for other aspects of molecular medicine, the origin of which can be largely traced to this research tradition.
Introduction
During the past 60 years, the study of human hemoglobin, probably more than any other molecule, has allowed the birth and maturation of molecular medicine. Laboratory research, using physical, chemical, physiological, and genetic methods, has greatly contributed to, but also built upon, clinical research devoted to studying patients with a large variety of hemoglobin disorders. During this period, the pioneering work of Linus Pauling, Max Perutz, Vernon Ingram, Karl Singer, Herman Lehmann, William Castle, Ruth and Reinhold Benesch, Titus Huisman, Ernst Jaffé, Ernest Beutler, and many others still active has been instrumental in these studies. Our understanding of the molecular basis of hemoglobin developmental and genetic control, structure-function relations, and its diseases and their treatment is probably unparalleled in medicine. Indeed, this field, especially during the first 25 years of the existence of the American Society of Hematology, provided the model for developments in many other areas of research in hematology and other subspecialities. This review attempts to highlight some recent developments in hemoglobin research most relevant to the hematologist in the context of the current understanding of the functions of these proteins and their genes. I am occasionally asked, “What's new in hemoglobin?” I believe that this review will show that we are still learning much that is very relevant to our understanding of human physiology and disease.
Hemoglobin structure
The human hemoglobin molecules are a set of very closely related proteins formed by symmetric pairing of a dimer of polypeptide chains, the α- and β-globins, into a tetrameric structural and functional unit. The α2β2 molecule forms the major adult hemoglobin. The main function of hemoglobins in mammals is to transport oxygen (O2) from the lungs to tissues, but they also specifically interact with 3 other gases, carbon dioxide (CO2), carbon monoxide (CO), and nitric oxide (NO), that have important biological roles.
The functional properties of hemoglobin molecules are primarily determined by the characteristic folds of the amino acid chains of the globin proteins, including 7 stretches of the peptide α-helix in the α-chains and 8 in the β-chains (Figure 1).1,2 These helices are in turn folded into a compact globule that heterodimerizes and then forms the tetramer structure.3 These 4 polypeptides of the hemoglobin tetramer each have a large central space into which a heme prosthetic group, an iron-protoporphyrin IX molecule, is bound by noncovalent forces, and thus the iron atom is protected from access of the surrounding aqueous solution. The iron atoms in this environment are primarily in the physiologic ferrous (FeII) chemical valence state, coordinated to 4 pyrrole nitrogen atoms in one plane, to an imidazole nitrogen atom of the invariant histidine amino acid at position 8 of the “F”-helix, and to a gas atom on the side opposite (with respect to the porphyrin plane) the histidine residue. The reversible binding of gases to these 4 ferrous iron atoms in the tetramer of globin polypeptides allows hemoglobin to transport O2, CO, and NO.4 CO2 is transported in the blood in solution and by interactions with the amino-terminal residues of hemoglobin as a weak carbamino complex and not by binding to the iron atoms.
In recent years, knowledge of the properties of the characteristic folds of each of the globin polypeptides and their ability to bind heme prosthetic groups has led to the development of a detailed evolutionary tree to describe the ontogeny of this family of genes from bacteria to vertebrates.5,6 In bacteria, they are known as flavohemoglobins and appear to be primarily NO dioxygenases for detoxifying NO; in the protist and plant taxa, these single-chain globin proteins are largely involved with electron transfer and O2 storage and scavenging. In invertebrates, the O2 transport function of the globins develops as do several other biochemical functions. It is in the vertebrate taxa that the characteristic pattern of highly expressed intracellular globins, frequently functioning as multimers, for oxygen transport over relatively long distances evolved (Figure 2). These several globin proteins also include, however, the single-chain myoglobin, in high concentrations in many muscle tissues, as well as the homologous (to myoglobin and to each other) α- and β-globins and their very stable α/β dimers that pair to form hemoglobin. In the highly specialized mammalian enucleated cell, the erythrocyte, these molecules are expressed at very high concentrations (Figure 1), resulting in a tremendously efficient transport mechanism. The genes of myoglobin (and other globins) separated from the α- and β-globin genes during vertebrate evolution, and these 2 genes themselves evolved into complex genetic loci on separate chromosomes. The numbers of these genes, their chromosomal locations, and their developmental control vary greatly among species; however, the basic globin gene structure and protein folds are conserved in evolution among all mammals.
Myoglobin has a very high affinity for O2 compared with hemoglobin, but a detailed understanding of its function has still not been achieved. Mice with knockout of the myoglobin gene have almost normal physiology. Myoglobin is thought to serve more to facilitate oxygen diffusion in muscle, especially to mitochondria, than to act as a storage site, as previously thought.7 Myoglobin also appears to act as an NO dioxygenase and a nitrite reductase. In the last decade, several other homologous proteins—neuroglobin and cytoglobin—have been detected in low amounts in certain tissues and appear to protect against hypoxia; again, however, there is much controversy about their functions.8,9
Hemoglobin function
The role of erythrocyte-encapsulated hemoglobin in transporting oxygen has been the focus of many of the greats of physiology, including Christian Bohr, August Krogh, J. B. Haldane, F. J. W. Roughton, and others in the last century and has been reviewed in detail.10,11 More recently elucidated was how this finely tuned system is regulated via heterotropic interactions with other molecules, such as protons, anions, and bisphosphoglyceric acid (2,3 BPG or, in the older convention, 2,3 DPG),12 and by intramolecular, or homotropic, interactions for optimal normal respiratory function.13 Cooperative oxygen binding can be explained very precisely in terms of the allosteric model14 of protein regulation of Monod, Wyman, and Changeux, but alternative models are still being developed.15 Understanding the physiologic fine-tuning of this function by proton binding (the Bohr effect) or 2,3 BPG binding has been a triumph of basic protein chemistry and applied physiology during the last 50 years.10,11
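The cooperative binding described above can be illustrated numerically. The following is a minimal sketch of the two-state Monod, Wyman, and Changeux saturation function for a protein with 4 binding sites. The function name and the parameter values (k_r, l0, c) are illustrative assumptions and are not fitted to hemoglobin data or taken from this review.

```python
def mwc_saturation(po2, k_r=1.0, l0=1.0e4, c=0.01, n=4):
    """Fractional ligand saturation under the two-state MWC model.

    po2 : ligand (oxygen) level, in the same units as k_r
    k_r : dissociation constant of the high-affinity R state (assumed value)
    l0  : [T]/[R] equilibrium constant with no ligand bound (assumed value)
    c   : K_R / K_T, ratio of R- to T-state dissociation constants (assumed)
    n   : number of binding sites (4 for the hemoglobin tetramer)
    """
    a = po2 / k_r  # ligand level normalized to the R-state affinity
    numerator = a * (1 + a) ** (n - 1) + l0 * c * a * (1 + c * a) ** (n - 1)
    denominator = (1 + a) ** n + l0 * (1 + c * a) ** n
    return numerator / denominator


if __name__ == "__main__":
    # Compare the cooperative curve with the non-cooperative limit (l0 = 0),
    # which reduces to a simple hyperbola a / (1 + a).
    for level in (0.1, 1.0, 10.0, 100.0):
        cooperative = mwc_saturation(level)
        hyperbolic = mwc_saturation(level, l0=0.0)
        print(f"ligand={level:6.1f}  MWC={cooperative:.4f}  hyperbola={hyperbolic:.4f}")
```

With l0 set to zero the protein is always in the R state and the curve is hyperbolic, as for myoglobin; a large l0 with a small c gives the sigmoid shape characteristic of cooperative binding. Heterotropic effectors such as protons or 2,3 BPG could be represented in this sketch as changes in l0, which is a simplification.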
In the 1950s, the methods of protein sequence determination and X-ray crystallography allowed the determination of the amino acid sequences of various hemoglobins and the spatial arrangements of their atoms. This work, marked in particular by the high-resolution structural analysis—among the first for any protein—by Nobelist Max Perutz (Figure 3) and his colleagues in the late 1960s,2,16 soon resulted in a detailed explanation of the relationship of hemoglobin function as an oxygen transporter to its molecular structure. Furthermore, this information allowed for the explanation of the clinical phenotypes of most of the many hundreds of characterized mutations in the globin genes and proteins, which cause changes in function and include the many “hemoglobinopathy” diseases, in terms of this molecular structure. These correlations, initiated by Perutz and Lehmann17 and pioneered in the United States by Ranney, Beutler, Nathan, Bunn, Forget, and others, remain among the landmark accomplishments of the then-new field of molecular medicine. Although much of this information is now securely rooted in the textbooks, studies of hemoglobin function have recently become quite active again.
In the last decade, there has been considerable attention to understanding the interactions of normal hemoglobin with CO, in recognition of the fact that as well as being a toxic hazard, CO is produced in the body from free heme by heme oxygenase and can itself activate soluble guanylyl cyclase.18 For this and other reasons, it has potential pharmacological applications. This attention to non–oxygen-related functions has been even more applicable to the study of NO/hemoglobin interactions since the important realization in the mid-1980s that NO is a ubiquitously produced cell signaling molecule, acting via both soluble guanylyl cyclase production of cyclic GMP and other mechanisms, throughout almost all life forms. It is especially important in mammals in the regulation of vascular tone, cell interactions, and neural function.19
It has been known since before World War I that NO reacts with oxyhemoglobin to produce methemoglobin, with ferric (FeIII) iron and nitrate ions. Recent work suggests that most of the methemoglobin circulating in red blood cells is derived from this oxidation process,20 which is normally reversed by the erythrocytic methemoglobin reductase system. In the past 40 years, a second reaction of NO with deoxyhemoglobin to form nitrosyl(heme)hemoglobin (NO-hemoglobin), with the NO liganded to the ferrous iron atom, has also been studied intensively. Like the reaction with oxyhemoglobin, this reaction had generally been assumed to be irreversible. However, there is now evidence that NO-hemoglobin in the circulating red blood cell may be capable of releasing NO molecules—thus potentially allowing a mechanism for hemoglobin-based, endocrine-like transport of NO from one tissue to another within the body.21
Ten years ago, a third reaction of NO with oxyhemoglobin was postulated to be physiologically important: the binding of NO to the strongly conserved β-chain cysteine amino acid at position 93 (Figure 1) to form S-nitrosylhemoglobin (SNO-hemoglobin).22 It was suggested that SNO-hemoglobin can physiologically dissociate to release NO at low oxygen concentrations. Thus, this could be a mechanism for homeostatic control of blood flow to tissues, because the NO released would promote vascular dilatation and increase blood flow and oxygen delivery. This hypothesis, although teleologically attractive, has been very controversial, with many studies negating it, and very recent work with transgenic mice lacking the β93 cysteine residues appears to disprove it.23 More recently, an alternate hypothesis to account for the transport of NO by erythrocytes has been advanced. It has been suggested that nitrite ions within erythrocytes can be reduced to NO by deoxyhemoglobin—with reaction kinetics maximal at approximately 50% oxygen saturation—so that NO is increasingly generated as red blood cells enter regions of relative hypoxia.24 Thus, there are now several potential explanations for a likely central function of NO in controlling blood flow via hypoxic vasodilation (Figure 4).
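The dependence of nitrite reduction on oxygen saturation described above can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the rate is proportional to the fraction of deoxygenated hemes multiplied by an effective rate constant that rises linearly with saturation. The function name and the constants k_t and k_r are hypothetical and are not taken from the cited work.

```python
def nitrite_reduction_rate(saturation, k_t=1.0, k_r=50.0):
    """Relative nitrite reduction rate at a given fractional O2 saturation.

    Toy assumption: rate = (fraction of deoxy hemes) * (effective rate constant),
    where the rate constant rises linearly from k_t (fully deoxygenated)
    to k_r (fully oxygenated). k_t and k_r are arbitrary illustrative values.
    """
    if not 0.0 <= saturation <= 1.0:
        raise ValueError("saturation must lie between 0 and 1")
    deoxy_fraction = 1.0 - saturation
    effective_rate_constant = k_t + (k_r - k_t) * saturation
    return deoxy_fraction * effective_rate_constant


# Scan saturation from 0% to 100% and locate the maximum of the toy rate.
saturations = [i / 100 for i in range(101)]
peak_saturation = max(saturations, key=nitrite_reduction_rate)

if __name__ == "__main__":
    print(f"toy rate is maximal near {peak_saturation:.0%} saturation")
```

In this sketch the rate vanishes at full saturation, because no deoxygenated hemes remain, and is small at zero saturation, because the assumed rate constant is low, so the maximum falls at intermediate saturation whenever k_r is much larger than k_t.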
These recent studies of NO interactions with hemoglobin point to the increasing realization in the last few years that hemoglobin has evolved with functional properties important for the physiology of several gases, especially NO, as well as that of the paradigmatic delivery of O2. There is also some indication that abnormalities in hemoglobin levels or localization (for example, the increases in total intracellular hemoglobin that occur in polycythemia or of cell-free hemoglobin in chronic and acute anemias [Figure 4]) may result in clinical abnormalities because of their overall tendency to deplete available NO. The major toxicities of all hemoglobin-based blood substitutes seem to be similar and are likely to be due largely to enhanced destruction of NO by the cell-free hemoglobin25 but could possibly be overcome by replacement of the NO.26
The hemoglobin phenotype
In erythrocytes of normal human adults, hemoglobin A (α2β2) accounts for approximately 97% of the protein molecules, hemoglobin A2 (α2δ2) for 2%, and hemoglobin F or fetal hemoglobin (α2γ2) for 1% (Figure 5). This distribution reflects the patterns of expression of the α-globin gene locus on human chromosome 16 and the β-globin gene locus on human chromosome 11. After the evolutionary separation of the 2 mammalian globin loci, each locus has undergone complex changes that resulted in the presence of multiple genes and nonexpressed pseudogenes in the human genome. The pattern of expression of these genes shifts from the more 5′ genes on the DNA to more 3′ genes during fetal, then neonatal, and then adult development stages (Figure 6).27 In the fetus, the ζ and ε genes are initially expressed primarily in the yolk sac, para-aortic region, and then the liver, resulting in the formation of hemoglobins Gower 1, Gower 2, and Portland. Their down-regulation in early embryonic life is followed by the expression of the 2 α-genes and the 2 γ-genes (Gγ and Aγ); they are functionally identical but are different in that there is either a glycine or an alanine at position 136. This causes the accumulation of hemoglobin F, which predominates in the last 2 trimesters of gestation and has a slightly higher oxygen affinity than the adult hemoglobins because it binds 2,3 BPG less strongly. At birth, although the α genes remain fully active, the γ genes are effectively down-regulated and the β-like (δ and β) genes are up-regulated so that, normally, by the end of the first year of life, the “adult” hemoglobin phenotype, hemoglobins A and A2, is predominant. In some cases, expression of the γ-globin persists in adult erythroid cells; this largely asymptomatic state is known as hereditary persistence of fetal hemoglobin (HPFH).28
The covalent modification of the major adult hemoglobin by nonenzymatic glycation of the β-chain amino-terminal residue by glucose forms hemoglobin A1c.29 This was observed in electrophoretic studies of hemoglobin phenotypes and has opened a vast area of diabetes-related research. There has also been much progress in understanding the diverse causes, manifestations, and treatment of methemoglobinemia.30,31 Indeed, the discovery of a deficiency of cytochrome b5 reductase (methemoglobin reductase) as a cause of familial methemoglobinemia may be considered the first description of an enzyme defect in a hereditary disorder.32 There have also been significant advances in understanding the complex physiologic adaptive responses to acute and chronic hypoxia, especially of populations at high altitudes.33
During the last 30 years, an enormous amount of effort has been devoted to understanding the molecular and cellular mechanisms that underlie these changes (called hemoglobin “switching”) in expression of the α- and β-globin gene clusters.34 This has been because of the intrinsic interest of this system as one of developmental gene control but also because of the potential relevance of this information to developing therapies for the 2 most common groups of genetic diseases of hemoglobin, the sickle cell syndromes and the thalassemia syndromes. Before reviewing these studies of globin developmental control, I note some of the relevant work—especially recent findings—on the pathophysiology of these 2 groups of diseases and how altering the hemoglobin phenotype might be clinically beneficial.
Sickle cell disease
The discovery in 1949 by Linus Pauling and his associates35 that the molecular basis of sickle cell anemia is an abnormal hemoglobin virtually created the field of molecular medicine and moved research hematology to its forefront. It is sometimes forgotten that this molecular medicine paradigm also required understanding of the inheritance pattern of this disease, which was supplied in the same year by J. V. Neel,36 whose publication is also one of the founding articles of the field of medical genetics. We now have a detailed understanding of how a single nucleotide change (A to T) in the β-globin gene leads to the valine for glutamic acid substitution37 in the β-globin protein. This in turn allows the formation of stable intermolecular interactions (linear polymers of the tetramers) in the concentrated intracellular solutions of deoxyhemoglobin S (α2β2S or sickle hemoglobin).38 This process is the basis for our understanding of the pathophysiology of this disease39,40 at the genetic, molecular, and cellular levels. Sickle cell anemia pathophysiology is a consequence of this reduced solubility, causing polymerization of hemoglobin S tetramers in red blood cells upon partial deoxygenation and the impaired flow of these cells in the microcirculation.38 Other mechanisms secondary to intracellular polymerization have been extensively studied, especially in animal models, but their relative importance to human pathophysiology remains unclear.
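The effect of the single A-to-T change on the encoded protein can be sketched in a few lines. The example below uses a deliberately partial codon table and a short stretch from the start of the β-globin coding sequence. The sequence is included for illustration only and should be checked against a reference database before reuse, and the helper names are hypothetical.

```python
# Partial codon table: only the codons needed for this short example.
CODON_TABLE = {
    "ATG": "M", "GTG": "V", "CAT": "H", "CTG": "L",
    "ACT": "T", "CCT": "P", "GAG": "E", "AAG": "K",
}


def translate(dna):
    """Translate a DNA coding sequence, read in frame, to one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitute(dna, index, new_base):
    """Return a copy of dna with the base at a 0-based index replaced."""
    return dna[:index] + new_base + dna[index + 1:]


# Start of the beta-globin coding sequence (illustrative; verify before reuse).
normal_dna = "ATGGTGCATCTGACTCCTGAGGAGAAG"

# The middle base of the first GAG codon (0-based index 19) changes from A to T.
sickle_dna = substitute(normal_dna, 19, "T")

if __name__ == "__main__":
    print("normal:", translate(normal_dna))
    print("sickle:", translate(sickle_dna))
```

The two translated peptides differ at a single position, glutamic acid (E) in the normal sequence and valine (V) in the sickle sequence, which is the substitution described above.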
More than 50 years ago it was postulated (probably first by J. B. S. Haldane) that the sickle mutation results in increased resistance to malaria in heterozygotic persons or carriers41; subsequent work indicates that this is true for thalassemia as well.42 Current research suggests the importance of redox and immunologic processes in this protection, but the exact cellular mechanisms are not yet clear.42,43 Again, as with so many other studies of the clinical biology of hemoglobin, this concept of selective advantages for carriers of certain disease-causing (in the homozygous state) genes has been applied widely.
Another new concept from sickle cell anemia research that was quickly extended to other diseases was the realization in 1978 by Y. W. Kan and his colleagues44 that restriction enzymes could be used to detect DNA polymorphisms linked to the abnormal β-globin gene to identify prenatally those fetuses who have one or both of the mutant hemoglobin genes. These studies also initiated the gradual transition of the molecular diagnosis of hemoglobin disorders from protein methods to the current wide range of extremely sensitive and precise nucleic acid analyses.45
However, despite the detailed characterization of the abnormal gene and protein and the behavior of hemoglobin S in red cells, we understand relatively little of how these abnormalities affect specific organs and the overall health of affected persons. The best indicator of this conundrum is the unexplained heterogeneity in age of onset and severity of disease in persons whose hemoglobin genotype and phenotype appear similar or identical.46 Unlike “classical” monogenic diseases, even many of the thalassemic syndromes, clinical progression and the need for treatment in patients with sickle cell anemia can only be predicted in limited circumstances, such as in children detected to have abnormal blood flow in the large vessels of the brain as measured by the transcranial-Doppler ultrasound method47 or in adults with pulmonary hypertension.48
Although many other measurements, such as globin cluster haplotype analysis or white blood cell levels, have been suggested to have explanatory and predictive value, only 2, the presence of α-thalassemia and the levels of hemoglobin F, have been validated comprehensively. Coexisting α-thalassemia leads to a reduction in MCHC, which inhibits hemoglobin S polymerization, but this beneficial effect seems to be counter-balanced by the increase in total hemoglobin levels, which may have some deleterious effects.49,50
The beneficial effects of hemoglobin F have been confirmed by clinical observations, epidemiologic studies, biophysical measurements, and therapeutic trials. In 1948, Janet Watson51 noted that until adult hemoglobin displaces the form present at birth (hemoglobin F), manifestations of sickle cell disease are limited. Population studies among different groups of persons with sickle cell disease (eg, Saudi Arabs vs African populations) or within single geographic areas, as well as a large “natural history” study in the United States, confirmed that various measures of severity were inversely related to hemoglobin F levels but suggested that very high (> 25%) levels were needed for major benefit. At the same time, diverse laboratory studies showed the mechanism by which hemoglobin F had a sparing effect on intracellular polymerization (Figure 7) and confirmed clinical estimates of the levels of hemoglobin F needed for benefit.52 Equally importantly, several drugs were found to increase hemoglobin F levels in nonhuman primates. DeSimone and Heller53 and Letvin et al54 showed that 5-azacytidine and hydroxyurea (now frequently designated as hydroxycarbamide) had such effects. This work was extended to patients by Platt, Charache, Dover, Nienhuis, Ley, Rodgers, and their colleagues (reviewed by Rodgers55). In a multicenter, double-blinded study of adults with frequent pain crises, led by Charache,56 hydroxyurea improved several clinical parameters compared with placebo. In 1998, hydroxyurea was approved by the US Food and Drug Administration for treating these types of patients, and a recent systematic review has confirmed its efficacy in adult patients with sickle cell disease.57
However, many patients do not respond at all to hydroxyurea with elevations of hemoglobin F, whereas some clinical manifestations seem to be little affected by even the 10% to 15% levels of hemoglobin F obtained in some patients with the drug. Furthermore, there is as yet only limited evidence that the drug prevents damage to crucial organs, such as the lungs, kidneys, and brain, or improves survival in adult patients.58 Long-term and controlled studies in children to assess the effects and safety of hydroxyurea have only recently been initiated. Thus, in addition to further clinical outcome studies with hydroxyurea, there is a strong need to find other agents that may singly or in combination with hydroxyurea have a more robust effect on hemoglobin F levels. To this end, studies with erythropoietin, butyrate compounds, and deoxyazacytidine are being pursued, but at present, none seems strongly promising. Clearly, much more work, both clinical and laboratory, is needed to test these drugs further and to find new agents if we are to improve on this partially effective pharmacologic approach to this disease.
Although the utility of hydroxyurea in treating sickle cell disease has been generally accepted in the academic community, its more general use in the treatment of patients with sickle cell disease has been quite limited, even among patients who are likely to benefit.59 Further complicating its evaluation has been that its clinical effects have been attributed by some to mechanisms other than inducing hemoglobin F, such as lowering neutrophil counts or adhesion molecules, generating NO, and others. The evidence for these is minimal, and some of these effects may be indirect results of elevating hemoglobin F. Likewise, many pathophysiologic mechanisms have been proposed in sickle cell disease studies as alternatives or complements to the intracellular polymerization of deoxyhemoglobin S and its effects on the rheologic properties of the sickle erythrocyte. None has the weight of evidence that surrounds the primary polymerization phenomenon, and they are likely to be secondary factors; none of the therapeutic approaches based on these hypotheses has been promising up to now compared with inhibiting polymerization of deoxyhemoglobin S with hemoglobin F.
However, it has recently been proposed that as a consequence of the fragility of the sickle erythrocyte (due to intracellular polymerization, which results in intravascular hemolysis and the chronic anemia characteristic of this disease), circulating cell-free hemoglobin levels are increased, and this acts as a strong NO scavenger. This hypothesis60 postulates that some of the clinical manifestations of sickle cell disease (pulmonary hypertension, leg ulcers, and possibly stroke) relate to this NO deficiency, whereas others (vaso-occlusive pain crises, acute chest syndrome) are due primarily to occlusion of microcirculatory flow by red cells made rigid (not necessarily “sickled”) by intracellular polymer. Patients with sickle cell disease appear to differ in the relative importance of the hemolytic and the occlusive mechanisms for reasons that are not clear. This model suggests possible treatment with NO to dilate vessels so as to diminish sickle cell entrapment, in addition to attempts to inhibit polymerization itself, and thus should be amenable to clinical testing in the near future.
The thalassemia syndromes
Studies of the α- and β-thalassemia syndromes, especially by Weatherall, Wood, Higgs, Nathan, and their colleagues, during the past 50 years have also tremendously informed the basic understanding of the hemoglobin genes and proteins, just as the laboratory studies have informed the clinical description of patients with these syndromes.61 The genetic mechanisms causing reduction of α-globin synthesis in the α-thalassemias and of β-globin synthesis in the β-thalassemias have been models for the study of other genetic diseases. These have been reviewed in detail,61,62 as has been the explanation of pathophysiology as being due to chain imbalances within the thalassemic erythroid precursors resulting in ineffective erythropoiesis and medullary as well as intravascular hemolysis, perhaps as a result of oxidative processes and resulting apoptosis-like events during erythroid development.63 In recent years, studies of patients with de novo α-thalassemia and mental retardation and with α-thalassemia and myelodysplastic syndrome have identified a somatic mutation in the gene ATRX, the role of which in chromatin remodeling has a strong effect on α-globin gene expression.64 These studies again illustrate the continued impact of hemoglobin genetics on other aspects of molecular medicine.
Among the major unanswered questions in thalassemia research has been the variability of clinical symptoms for some patients with β-thalassemia intermedia65 but especially for the large numbers of patients who are doubly heterozygous for hemoglobin E and β-thalassemia.66 Recent studies have led to the conceptualization of this variability in terms of primary, secondary, and tertiary factors.67,68 The primary factors are those of the β-globin genotype, in particular the amount of globin mRNA and protein (eg, β° with no globin versus β+ with reduced globin levels of expression from each mutant gene). The secondary factors are other genetic changes in the globin gene clusters that either contribute to levels of α-globin or determine γ-globin levels in response to deficiencies of β-globin expression. Among the latter are the XmnI-Gγ polymorphism at position −158 of the Gγ-gene, which up-regulates its expression, as well as other less well understood genetic changes that may cause the markedly high levels of hemoglobin F seen in HPFH or increase these levels only slightly. The tertiary factors are much more diverse and include factors that affect iron absorption, bilirubin metabolism, and other factors known and unknown. A protein that binds to free α-chains, the α hemoglobin stabilizing protein, has been identified as a molecular chaperone (affecting the folding of the α-chains) in globin biosynthesis69 and is postulated to influence the severity of β-thalassemia syndromes.70
These analyses confirm the likelihood that pharmacologic or genetic methods to elevate hemoglobin F could have great therapeutic value for most cases of β-thalassemia, as with sickle cell disease. Unfortunately, the responses to agents such as hydroxyurea and even the more potent 5-azacytidine, which work in many patients with sickle cell disease, have been relatively disappointing in patients with thalassemia.71 It has been suggested that recent blood transfusions, which are generally more necessary in patients with thalassemia than in those with sickle cell disease, blunt the effects of these drugs because they seem to need rapidly proliferating bone marrow to increase expression of hemoglobin F.72,73 Clinical studies, with respect to timing of therapies or the use of other factors, such as erythropoietin and iron, may establish regimens with greater clinical benefit. With regard to other recent research in β-thalassemia, it may be noted that new understanding of the control of iron absorption and metabolism and improved chelation agents may also allow amelioration of some of the clinical complications that cause morbidity and mortality without changing the fundamental genetic imbalance.74
Genetic testing and the availability of prenatal diagnosis have greatly reduced the numbers of β-thalassemia patients in certain countries during the last 2 decades. Stem cell transplantation has also been very successful in treating many patients with these syndromes when the requisite medical facilities are available. Overall, survival and cure of the thalassemic syndrome approach 90% in several major centers.75 High-resolution human leukocyte antigen typing has extended this success to unrelated donors, and the use of less intense conditioning regimens, with the goal of achieving stable mixed chimerism, promises also to expand the utility of stem cell transplants.
However, both transplants and prenatal diagnosis have been much less used or successful in the sickle cell disease population. Stem cell transplants seem to be unfeasible for most patients for economic reasons or because of the lack of matched stem cell donors.
Treatment of the thalassemic and sickle cell syndromes by gene transfer therapy, the original goal of many gene therapy investigators, has been beset by all of the tribulations that have affected this field in general. It has not been possible to achieve high-efficiency transfer of globin gene vectors into erythroid cells with resultant prolonged robust expression of the normal globin gene. Equally important, the recent cases of leukemia resulting from insertional mutagenesis in children with X-linked severe combined immunodeficiency who were being treated with gene therapy have greatly increased safety concerns with regard to viral vectors. Furthermore, there are reasons to expect that because of the need for high levels of globin protein production with precise chain balances, these diseases will not be easy to treat. However, clinical advances related to the stem cell transplants mentioned above, as well as improvements in the design of globin gene vectors and transfection methods, suggest that clinical studies of gene therapy protocols for some genetic diseases of hemoglobin will soon be undertaken.76-78 An entirely new approach to therapy of genetic diseases, with induced pluripotent stem cells (iPS) generated from autologous skin, has just now been demonstrated in a sickle cell anemia mouse model.79
Human globin genetics
The α-like globin gene cluster shown in Figure 5 is at the p13.3 locus of chromosome 16 in a region of ubiquitously expressed genes near the telomere. The β-like genes are in the p15.5 region of chromosome 11, which contains multiple DNA sequences that act as strong tissue and developmental stage-specific enhancers of transcription. These differences probably account for the fact that the 2 α-globin genes are expressed strongly and continuously in erythroid cells from a short time after early embryonic development to adulthood (Figure 6). However, the 5′ α-2 gene is expressed much more strongly than the more 3′ but identical α-1 gene. In contrast, the β-globin gene cluster undergoes sequential expression of the ε-globin gene, then the 2 γ-globin genes, and then the adult δ- and β-globin genes, but with a marked preponderance of β-globin compared with δ-globin. The embryonic ζ-globin gene, which is expressed briefly in early fetal life, is 5′ to the 2 α-globin genes on chromosome 16.
In the α-globin gene locus, a DNA region 5′ to the cluster called hypersensitivity site-40 (HS-40) acts as an erythroid-specific enhancer of transcription of closely linked genes.61,80 In contrast, on chromosome 11, the β-globin gene is regulated by its proximity to a group of at least 5 DNA sites, termed the locus control region (LCR), that are hypersensitive to cleavage by the DNase I nuclease (for reviews, see Li et al81 and Dean82). The LCR appears to contribute significantly to regulation of the sequential 5′-to-3′ expression of the globin genes during development, as well as to their very high level of expression, which is necessary to obtain, with matched contributions from the α-locus, the very high hemoglobin levels of the normal erythrocyte.
Each of the segments of DNA in the 2 globin gene clusters that codes for an RNA transcript for a particular globin protein is by convention called a globin “gene.” All of these genes, 8 in the human, have very similar structures: 3 coding exons separated by 2 intervening sequences, or introns. The RNA copies of the introns are spliced out of the pre-messenger RNA after transcription, and the introns are subject to many mutations that affect splicing efficiency. The preservation of this structure of the globin genes, which may be related to the preservation of the protein folds, among the human genes and even among many mammalian species, is in contrast to the much greater variability of the number and arrangement of the genes among species.
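The 3-exon, 2-intron organization described above can be illustrated with a minimal sketch of splicing. The sequence and coordinates below are invented for illustration (they are not taken from any real globin gene), and the function names are hypothetical; the only biological convention assumed is that introns typically begin with GU and end with AG in the pre-mRNA.

```python
# Toy pre-mRNA: exons in upper case, introns in lower case.
# The sequence is hypothetical and is NOT a real globin transcript.
PRE_MRNA = (
    "AUGGCU"        # exon 1
    "guaaguuuag"    # intron 1 (starts with gu, ends with ag)
    "GAACCG"        # exon 2
    "guaagcccag"    # intron 2
    "UUCUAA"        # exon 3
)

# Exon intervals, 0-based and end-exclusive, in 5'-to-3' order (assumed values).
EXONS = [(0, 6), (16, 22), (32, 38)]


def introns_between(exons):
    """Return the intervals lying between consecutive exons."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def splice(pre_mrna, exons):
    """Join the exon intervals in order, discarding everything between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)


mature = splice(PRE_MRNA, EXONS)
introns = introns_between(EXONS)
```

In this toy model, a splice-site mutation would correspond to shifting one of the exon boundaries, which changes the joined product; real splicing defects are of course far more varied.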
Immediately 5′ to each gene are regulatory DNA sequences, termed promoters. These, and other more distant DNA elements, contribute to the regulation of expression of each of these genes (for reviews, see Stamatoyannopoulos et al,34 Weatherall and Clegg,61 Higgs et al,80 Orkin,83 Martin et al,84 Chakalova et al,85 and Mahajan et al86 ), and mutations in them also affect the transcription efficiency of each gene. Such mutations, as well as those within the coding sequence that affect post-transcriptional modifications or translational efficiency, are manifest clinically as the thalassemia syndromes discussed above. Such regulatory single nucleotide or point mutations, now numbering in the hundreds, are more common in the β-thalassemias than in the α-thalassemia syndromes, which are mainly caused by large deletions in the α-gene cluster. A registry of these hemoglobin variants and much other information about these molecules and their genes, initiated by Titus Huisman, is available online at the database HbVar (http://globin.bx.psu.edu/hbvar).87,88
Globin gene regulatory mechanisms
Directly flanking each of the globin genes at its 5′ end are regions of DNA sequences designated the proximal promoter that regulate the binding of a complex of proteins that control the initiation and rate of transcription of the mRNA.34,89,90 These cis-acting elements include, in the immediate or proximal region, the ATA, CCAAT, and CACCC nucleotide sequences, frequently found in these regions in many genes. More distant 5′ sequences, up to 1 or 2 kilobases from the gene and called the distal promoter, may also serve to contribute to activation or silencing of each of the globin genes. GATA sequences [(A/T)GATA(A/G)] in the DNA, which bind GATA-1 transcription regulatory proteins, are found throughout the globin gene clusters and have strong erythroid specificity. The transcription regulatory protein (or trans-acting factor) EKLF was originally thought to have strong specificity for positive regulation of the β-globin gene but is now known to also have effects on many erythroid genes. Figure 8 shows a relatively simple model compatible with these results for how developmental control of the β-globin gene cluster may occur via interactions with the LCR and these major factors, but a host of other factors (see below) clearly also play crucial roles.
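As a rough illustration of what the consensus notation (A/T)GATA(A/G) means in practice, the following sketch scans both strands of a DNA string for that 6-base pattern. The function names and the test sequence are hypothetical, and a consensus match of this kind only marks a candidate site; it says nothing about whether GATA-1 actually binds there in a cell.

```python
import re

# Consensus quoted in the text: (A/T)GATA(A/G).
# The lookahead lets overlapping candidate sites be reported.
GATA_MOTIF = re.compile(r"(?=([AT]GATA[AG]))")
MOTIF_LENGTH = 6

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """Return sorted (position, strand, motif) tuples for consensus matches.

    Positions are 0-based coordinates of the leftmost base on the forward
    strand; the motif is reported as read 5'-to-3' on the matching strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in GATA_MOTIF.finditer(seq)]
    for m in GATA_MOTIF.finditer(reverse_complement(seq)):
        # Convert a reverse-strand start into a forward-strand coordinate.
        forward_start = len(seq) - (m.start() + MOTIF_LENGTH)
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical test sequence with one site on each strand.
sites = find_gata_sites("CCTGATAAGGCTTATCAGG")
```

The same approach extends to the other short promoter elements named above (for example CCAAT or CACCC) by swapping the pattern, again only as a first-pass scan.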
In recent years, based on the pioneering studies of Felsenfeld, Orkin, Bieker, Engel, and others, evidence has been presented for a role of a variety of other transcription factors or complexes in the control of individual globin genes or their sequential expression, including NF-E2, BP-1, SSP, FOG, FKLF, DRED, and PYR (for reviews, see Stamatoyannopoulos et al,34 Orkin,83 Martin et al,84 Chakalova et al,85 Mahajan et al,86 Giardino et al,88 and Stamatoyannopoulos89 ). No simple picture of how any of these might regulate the developmental pattern of the globin genes has emerged. DNA sequencing of the globin gene clusters has revealed an enormous number of other DNA motifs in the globin cluster, some within the LCR, that appear to be binding sites for many other proteins that also contribute to enhancing or silencing transcription, revealing the enormous complexity of this system.
The DNA sequences coding for the pre-mRNA of each globin gene include those for the mRNA cap site, the initiator codon, splicing sites, terminator codons, and mRNA polyadenylating signals. Mutations in any of these can affect transcription efficiency; mRNA processing, stability, and transport from the nucleus to the cytoplasm; and the efficiency of translation into globin protein on ribosomes. Impairment in any of these can lead to a thalassemia syndrome, as can more distal mutations or deletions in the globin cluster. Conversely, mutations affecting the binding of transcription factors near the γ-globin genes or DNA deletions that change the relationship of these genes to cis-acting sequences can up-regulate γ-gene expression as in the HPFH syndromes.28
The molecular mechanisms of the β-globin LCR have been partially elucidated by 2 decades of research by the laboratories of Tuan, Grosveld, Groudine, and others, with in vitro, cellular, and genetically modified animal assays. These closely spaced DNase hypersensitivity sites were initially postulated to act singly or as a group as a major regulator of globin gene expression, both as a strong general enhancer of transcription and as a specific mediator of developmental control of the β-cluster. Both mechanisms are believed to occur by interactions of all or parts of the LCR sequentially with the embryonic, then the fetal, and then finally the adult-like globin genes during ontogeny and, in adult life, during maturation of marrow erythroblasts (Figure 6). They seem to somehow “open” the chromatin of the relevant segments of DNA, allowing the transcription initiation complex and other trans-acting factors to bind at the appropriate gene.91
Much effort has been expended to understand how the LCR affects transcription of the entire β-globin gene cluster. Models have ranged from those based on binding of transcription factors to the LCR and their subsequent migration or “tracking” to the individual genes (compatible with observed intergene RNA transcripts) to the more generally accepted “looping” concepts. Recently, de Laat et al92 have refined these looping models by showing the formation of “active chromatin hubs” or “transcription factories” during erythroid maturation and stochastic interactions of the LCR with the individual globin genes. Current work also emphasizes the possible importance of cell-cycle control and of intergenic transcription in the formation of these transcription domains.93
In some species, parts of the LCR seem to act as insulators, which can shield the β-cluster from the effects of other cis-acting elements or, conversely, limit the spread of the effects of chromatin changes that occur within the globin gene cluster from affecting genes outside of the cluster.94,95 A hypersensitive site 3′ to the β-globin gene cluster has also been noted, but its role is not well characterized. Recent data, however, have suggested that very strong enhanced transcription of later genes may give the appearance of “switching” as early genes are diluted out.96 Finally, it should be noted that virtually all the work just reviewed is focused on initiation and rates of transcription. It is clear that post-transcriptional factors, such as stability of mRNA of the α-globin genes or translation rates (as affected by heme) are also very important in the regulation of intracellular hemoglobin phenotypes.97,98 Unfortunately, relatively little research has occurred in these areas in recent years because the transcription paradigm has dominated the field.
Developmental biology of hemoglobin
Much information about the developmental modulation of hemoglobin initially came from clinical examination of human fetuses and newborns. In recent years, most data have come from studies of erythroid cells in culture or from genetically modified animal models, none of which is a precise model of these processes in humans.99 With that caveat, it should be noted that investigations in mice and, to a limited extent, in humans suggest that erythropoiesis can be divided into a primitive phase or stage in the yolk sac and then a definitive stage, initially in the liver and spleen and later in the bone marrow of the fetus. Extensive delineation of the genetic signals and hematopoietic factors that control primitive erythropoiesis has been accomplished100 ; mechanisms of the control of globin gene expression during the clinically important definitive erythropoiesis stage have been much less tractable to these approaches.
It has been generally accepted that the specificity of expression of the globin genes is not directly related to the site of hematopoiesis, to clonal control of cell maturation, or to growth factors or other signaling molecules. However, recent work has provided evidence for cells of different phenotypes during ontogeny in humans96 and, at least in vitro, the influence of certain cytokines,101 such as stem cell factor (SCF), on the hemoglobin phenotype. Erythropoietin, a very important cell growth factor, stimulates production of erythroblasts that mature into hemoglobin producing cells and thus markedly increases the production of hemoglobin102 but does not seem to affect directly the expression of the globin genes or their developmental control.
Mechanisms of hemoglobin F inducers
In an era striving for “rational” therapeutics, our chances of increasing the efficacy of hydroxyurea for sickle cell disease or obtaining agents that work in the thalassemia syndromes would clearly be improved if we understood the mechanisms of the drugs currently used. A major conceptual advance in this field was based on the observations of Stamatoyannopoulos, Papayannopoulou, and their collaborators that adult erythroid cells in culture could express fetal globins and their development of a model based on this to separate cell differentiation and proliferation events.34,89 These concepts helped frame studies in the late 1970s of the importance of methylation of CpG “islands” in the DNA of promoters in suppressing transcription103 and led to the use by others of 5-azacytidine, a demethylating agent, to elevate hemoglobin F in baboons. However, controversy about this mechanism, and the suggestion that the hemoglobin F induction was due to an effect of cytostasis on cell differentiation, led to trials with hydroxyurea and other agents. Although less potent than 5-azacytidine, hydroxyurea appeared safer and thus became the focus of most subsequent research. Later, butyrate compounds, which are elevated in the plasma of mothers with diabetes whose fetuses have delayed “switching,” were tried, and their effect has been ascribed to another gene regulatory mechanism, the acetylation of chromatin proteins.104 Thus attention focused on agents that appeared to change the epigenetic expression of the globin genes directly or indirectly by affecting the cell cycle.
Unfortunately, despite 2 decades of such sophisticated mechanistic research, our understanding of how any of these compounds affects the hemoglobin phenotype of human erythroid cells is really quite limited.89,105,106 Nor do we have good assays for such agents except for studies in nonhuman primates and in humans. Thus it is likely that the search for new drugs must remain both empiric and mechanistic, but now with very precise analytical tools for measuring RNA and protein changes, for the immediate future.
Conclusions
Hemoglobin, perhaps the best studied of all macromolecules, has not revealed all its secrets even at the clinically relevant levels, to say nothing of biophysical studies at the levels of its atoms and electrons. In recent years, although unexpected new functions have been found, the central goal of most biomedical hemoglobin research has been the development of a mechanistic description of the developmental control of the α- and β-globin gene clusters. This field of research has been of great interest to those concerned with the whole range of hemoglobin studies, from the most basic molecular genetics, to various “translational” models, to clinical problems in treating patients. It has been the hope that understanding these control mechanisms would lead to the discovery or design of drugs to treat the genetic hemoglobin diseases by efficient elevation of fetal hemoglobin and would also improve the efficiency of stem cell and gene transfer approaches to therapy. Although some of these therapies have progressed greatly during this period, we are still far from understanding the basic processes controlling developmental changes in the globin gene clusters. Despite the enormous body of experimental data obtained from cell, animal, and clinical studies, no predictive model has yet been proposed to explain the control of this obviously complex system.
New experimental approaches to these problems are needed. Very recently genome-wide association studies have demonstrated a gene (BCL11A) encoding a zinc-finger protein on chromosome 2p15 that modulates hemoglobin F levels.107,108 There is much current interest in the potential role of microRNA (miRNA) molecules, rather than more traditional protein trans-factors, in controlling erythroid differentiation.109 It has recently been reported that the transcription factor GATA-1 activates transcription of 2 miRNA molecules that are essential for erythropoiesis.110 This newly emerging paradigm for the control of much gene expression by small, noncoding RNA molecules (the RNA “world”) is only now being investigated in the hemoglobin gene loci.
These and other totally new approaches suggest that our concepts of hemoglobin, and especially its genetics, will be as different in 10 or 20 years as our current concepts differ from those of the investigators who made the first electrophoretic separations of this abundant and convenient protein 60 years ago. However, studies of normal human beings and patients with hemoglobin diseases must continue as vigorously as those of the many model systems that complement clinical work in this field. If improving health is the goal, results from model studies will only be useful if the reiterative processes of “bedside to bench to bedside” continue at all stages of the research.
Acknowledgments
I thank the physicians and scientists of virtually every discipline of biomedical research who have, for many decades, formed an international “college” devoted to hemoglobin and its diseases. I apologize to those I could not name explicitly and for not being able to cite hundreds of important articles because of space limitations. Indeed, except for some papers that I consider classics in the relationship of hemoglobin research to the development of the field of molecular medicine, I have generally cited—where possible—recent review articles rather than the many original papers. In addition, I have emphasized studies clearly relevant to the human hemoglobin molecules and have skirted the vast—but frequently contradictory—literature on “model” systems. In addition, I thank my mentors, my colleagues within and without the intramural National Institutes of Health research program, and especially my “students,” from whom I have learned the most—and in whose accomplishments I have taken the most pleasure.
National Institutes of Health
Authorship
Contribution: A.N.S. wrote the manuscript.
Conflict-of-interest disclosure: A.N.S. is a coinventor of a patent held by the National Institutes of Health for the use of nitrite salts for the treatment of cardiovascular diseases.
Correspondence: Alan N. Schechter, Molecular Medicine Branch, NIDDK, Building 10, 9N-314, National Institutes of Health, Bethesda, MD 20892; e-mail: aschecht@helix.nih.gov.