Biochemical Journal

Review article

Metabolic control of transcription: paradigms and lessons from Saccharomyces cerevisiae

Robert N. Campbell, Michael K. Leverentz, Louise A. Ryan, Richard J. Reece


The genome of the comparatively simple eukaryote Saccharomyces cerevisiae comprises some 6000 individual genes. Specific sets of these genes can be transcribed co-ordinately in response to particular metabolic signals. The resultant integrated response to nutrient challenge allows the organism to survive and flourish in a variety of environmental conditions while minimal energy is expended upon the production of unnecessary proteins. The Zn(II)2Cys6 family of transcriptional regulators is composed of some 46 members in S. cerevisiae and many of these have been implicated in mediating transcriptional responses to specific nutrients. Gal4p, the archetypical member of this family, is responsible for the expression of the GAL genes when galactose is utilized as a carbon source. The regulation of Gal4p activity has been studied for many years, but we are still uncovering both nuances and fundamental control mechanisms that impinge on its function. In the present review, we describe the latest developments in the regulation of GAL gene expression and compare the mechanisms employed here with the molecular control of other Zn(II)2Cys6 transcriptional regulators. This reveals a wide array of protein–protein, protein–DNA and protein–nutrient interactions that are employed by this family of regulators.

  • galactose
  • glucose
  • haem
  • meiosis
  • oxygen


The Zn(II)2Cys6 family of proteins is defined by a conserved motif. This motif is composed of six cysteine residues in the form of: -Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-8-Cys- where X represents any amino acid (Figure 1) [1,2]. The genome of the simple eukaryotic yeast Saccharomyces cerevisiae encodes 46 proteins that contain this motif. Proteins bearing a Zn(II)2Cys6 domain are also found in other fungi, such as the milk yeast Kluyveromyces lactis, the fission yeast Schizosaccharomyces pombe, and the human pathogens Candida albicans and Aspergillus nidulans. The motif has, however, not been observed in either prokaryotes or higher eukaryotes. The consensus sequence binds two zinc ions (Zn2+) which promote the folding of the motif. In many cases, members of the Zn(II)2Cys6 family have been implicated as transcriptional regulators. The consensus sequence represents the site of interaction between the protein and DNA. The DNA-binding sites for some, but not all, of these regulators contain trinucleotide sequences (often 5′-CGG-3′) present in the DNA-recognition site either singly or in repeat forms. The orientation of the CGG triplets with respect to each other (e.g. inverted, direct or everted repeats) and the spacing between pairs of triplets play important roles in the determination of DNA-binding site specificity. For example, Gal4p, Put3p and Ppr1p each bind to DNA as homodimers and interact with inverted CGG sequences (Figures 2A–2C). Gal4p binds to inverted repeats spaced by 11 bp, whereas the spacing between the repeats is different for Put3p (10 bp) [3] and for Ppr1p (6 bp) [4]. In each of these cases, the Zn(II)2Cys6 motif of each monomer faces the motif from the other monomer, leading to a level of reflectional symmetry down the centre of the protein and DNA-binding site. The differential three-dimensional spacing and orientation of the Zn(II)2Cys6 domains is achieved by sequences directly adjacent to the consensus motif [5]. This variable linker region connects the Zn(II)2Cys6 domain to a coiled-coil dimerization domain, located at the C-terminal side, that extends away from the DNA (Figure 2). Leu3p also binds to DNA as a homodimer, but interacts with an everted CGG repeat pair (Figure 2D). In this case, the two Zn(II)2Cys6 domains face away from one another. The Hap1p homodimer, on the other hand, binds DNA such that the Zn(II)2Cys6 domains face the same direction in order to bind to a direct repeat of CGG triplets (see Figure 2E). In addition to the CGG triplets, the nucleotides surrounding them may also determine DNA-binding affinities to some extent. Other Zn(II)2Cys6 domain proteins have been shown to interact with DNA as either monomers (e.g. the A. nidulans AlcR protein [6]) or heterodimers (e.g. Oaf1p/Pip2p [7] and Pdr1p/Pdr3p [8]).
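The spacing rule for the inverted CGG repeats can be illustrated with a short Python sketch. It scans a DNA string for 5′-CGG-N(n)-CCG-3′ using only the three spacer lengths quoted above (Gal4p 11 bp, Put3p 10 bp, Ppr1p 6 bp). The function names and the test fragment are hypothetical, and a match is no more than a candidate site: the sketch ignores the contribution of flanking and spacer nucleotides to binding affinity, and it does not cover the everted (Leu3p) or direct (Hap1p) arrangements.

```python
import re

# Spacer lengths (bp) between the inverted CGG triplets, as quoted in the text.
INVERTED_REPEAT_SPACING = {"Gal4p": 11, "Put3p": 10, "Ppr1p": 6}


def find_inverted_cgg_sites(dna: str, spacing: int) -> list[int]:
    """Return 0-based start positions of 5'-CGG-N(spacing)-CCG-3' in dna."""
    # A zero-width lookahead lets overlapping candidate sites be reported too.
    pattern = re.compile(rf"(?=CGG[ACGT]{{{spacing}}}CCG)")
    return [m.start() for m in pattern.finditer(dna.upper())]


def candidate_regulators(dna: str) -> dict[str, list[int]]:
    """Map each regulator to the candidate sites found for its spacing."""
    return {
        name: find_inverted_cgg_sites(dna, n)
        for name, n in INVERTED_REPEAT_SPACING.items()
    }


# Hypothetical test fragment containing one site with an 11 bp spacer.
example = "TTCGG" + "ATATATATATA" + "CCGAA"
print(candidate_regulators(example))
```

Because an inverted repeat of this kind reads the same on both strands, scanning one strand is sufficient for these three sites.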

Figure 1 The Zn(II)2Cys6 zinc cluster motif

(A) The amino acid sequence of the Gal4p Zn(II)2Cys6 zinc cluster motif. The cysteine residues of the motif (consensus sequence -Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-8-Cys-) are highlighted in orange. The lysine residues at positions 17 and 18 are highlighted in red. These residues make specific contacts with DNA in the Gal4p recognition site (5′-CGGN11CCG-3′). Below the sequence is a box diagram indicating which parts of the sequence adopt an α-helical conformation. (B) The binding of zinc ions facilitates the tight folding of the Zn(II)2Cys6 motif. The motif (grey) within Gal4p binds two zinc ions (yellow) utilizing six cysteine residue side chains (orange). The bonds between the zinc ions and the cysteine residues are shown as blue broken lines. The side chains of Lys17 and Lys18 are highlighted in red. This Figure was generated using PyMOL (DeLano Scientific) and the PDB co-ordinates 1D66. An interactive three-dimensional version of this Figure can be found at
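The consensus -Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-8-Cys- translates directly into a regular expression, which gives a quick way to locate candidate zinc clusters in a protein sequence. The Python sketch below is illustrative only. The 40-residue string is an assumption: it is the N-terminus of S. cerevisiae Gal4p as recalled from public sequence records, not taken from this review, and should be checked against a sequence database before use. The function name is hypothetical.

```python
import re

# Consensus from the text: Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-8-Cys.
# Each cysteine is captured so that its position can be reported.
ZINC_CLUSTER = re.compile(r"(C).{2}(C).{6}(C).{5,12}(C).{2}(C).{6,8}(C)")

# Assumed residues 1-40 of S. cerevisiae Gal4p (verify against a database).
GAL4_NTERM = "MKLLSSIEQACDICRLKKLKCSKEKPKCAKCLKNNWECRY"


def find_zinc_cluster(protein: str):
    """Return (start, end, cysteine_positions), all 1-based, or None."""
    m = ZINC_CLUSTER.search(protein.upper())
    if m is None:
        return None
    cysteines = [m.start(i) + 1 for i in range(1, 7)]
    return m.start() + 1, m.end(), cysteines


print(find_zinc_cluster(GAL4_NTERM))
```

With the assumed sequence, the six captured cysteines fall at positions 11, 14, 21, 28, 31 and 38, and residues 17 and 18 are both lysine, consistent with the Lys17 and Lys18 contacts described in the caption. A match to the pattern indicates only that the spacing of cysteines is compatible with the motif, not that the protein binds zinc or DNA.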

Figure 2 The DNA complexes of proteins containing a Zn(II)2Cys6 zinc cluster

(A) The Gal4p–DNA complex [102]. (B) The Put3p–DNA complex [103]. (C) The Ppr1p–DNA complex [104]. (D) The Leu3p–DNA complex [105]. (E) The Hap1p–DNA complex [106]. Each figure was generated using PyMOL (DeLano Scientific) with the following PDB co-ordinates from the references stated: 1D66, 1ZME, 1PYI, 1ERE and 1HWT respectively. In all cases, the DNA (red) and the protein (blue) are shown in cartoon format. The zinc ions are shown as yellow spheres. The consensus DNA-binding site for each protein is written directly beneath the appropriate structure.

Regulators of eukaryotic RNA polymerase II transcription usually possess two distinct, and often separable, domains [9]. A DNA-binding domain is required to direct the transcriptional regulator to specific genes and an activation domain recruits the protein complexes that eventually bring RNA polymerase to the appropriate start site of transcription [10]. Although the proteins in the Zn(II)2Cys6 family may share a common DNA-binding domain, the mechanisms by which the transcriptional activity of each protein is controlled are diverse. Perhaps the two most common forms of regulating transcription factor function are the control of DNA-binding function (either directly, perhaps through post-translational modification of the DNA-binding domain itself, or indirectly by sequestering the protein in the cytoplasm) and the control of activation domain function. In the present review, we will discuss the regulation of the transcriptional activity of four Zn(II)2Cys6 proteins. These proteins each regulate specific sets of genes in response to defined signals. In particular, we will concentrate on the molecular details of inducer recognition and the manner in which this recognition is converted into a transcriptional output.


The yeast GAL genes encode the enzymes of the Leloir pathway that are required to catalyse the conversion of galactose into a more metabolically useful form, glucose 6-phosphate. The enzymes of the Leloir pathway and the expression of the GAL genes have been reviewed a number of times [11–16]. Therefore in the present review we will concentrate on the most recent advances, including the three-dimensional structures of some of the proteins involved in the process of transcriptional regulation.

When yeast cells are grown in the absence of galactose, the GAL genes are largely transcriptionally inert. If, however, galactose is the only available carbon source, then the GAL genes are transcribed, both rapidly and to a high level [17]. Galactose is a comparatively poor sugar source for the cell, and yeast will metabolize other sources of carbon (e.g. glucose) in preference to galactose, even if a mixture of glucose and galactose is available to them [18]. Glucose will trigger catabolite repression of the GAL genes, and the expression of the activator upon which they depend, the Zn(II)2Cys6 protein Gal4p, is severely reduced [19]. In the presence of other carbon sources, such as raffinose or glycerol, Gal4p is produced in the cell and can be found tethered upstream of the GAL genes. Under these conditions, the activity of DNA-bound Gal4p is inhibited by its interaction with another protein, Gal80p [20]. Although the presence of galactose within the cell triggers the activation of Gal4p, neither Gal4p nor Gal80p functions as the galactose sensor. Instead, a transcriptional inducer or ligand sensor, Gal3p, interacts with the transcriptional inhibitor, Gal80p, in a galactose- and ATP-dependent manner [21]. Gal3p seems to require galactose and ATP so that it can adopt a conformation to enable it to interact with Gal80p [11]. The net result of this interaction is that Gal4p becomes active and transcription of the GAL genes proceeds.
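The logic of the switch described above can be summarized as a small truth function. The Python sketch below is a deliberately qualitative abstraction: it treats each component as present or absent, ignores expression levels and kinetics, and uses a hypothetical function name. It is not a model taken from the literature.

```python
def gal_genes_transcribed(glucose: bool, galactose: bool, atp: bool = True) -> bool:
    """Qualitative on/off state of the GAL genes for the given conditions."""
    # Glucose triggers catabolite repression and severely reduces Gal4p expression.
    gal4_present = not glucose
    # Gal3p interacts with Gal80p only when both galactose and ATP are available.
    gal3_binds_gal80 = galactose and atp
    # Gal80p inhibits DNA-bound Gal4p unless it is engaged by Gal3p.
    gal80_inhibits_gal4 = not gal3_binds_gal80
    return gal4_present and not gal80_inhibits_gal4


# Galactose as sole carbon source: genes on.
print(gal_genes_transcribed(glucose=False, galactose=True))
# Raffinose or glycerol (neither sugar present): Gal4p is bound but inhibited.
print(gal_genes_transcribed(glucose=False, galactose=False))
# Mixture of glucose and galactose: glucose repression dominates.
print(gal_genes_transcribed(glucose=True, galactose=True))
```

The three calls correspond to the three growth conditions discussed in the paragraph; only the first returns True.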

The GAL regulatory systems in the baker's yeast S. cerevisiae and the milk yeast K. lactis share a number of similarities, but there are also some important differences. The regulatory proteins in the two yeasts are, at least in part, interchangeable. For example, Gal4p from both S. cerevisiae (ScGal4p) and K. lactis (KlGal4p) will complement a gal4 mutation in either yeast [22,23], despite the two proteins sharing comparatively little overall sequence similarity (28% amino acid identity and 57% similarity over their entire length). The Gal80p proteins from these yeasts are highly related (58% amino acid identity and 82% similarity) and will inhibit the transcriptional activity of either version of Gal4p. However, although KlGal1p can complement both a Scgal1 (galactokinase-defective) and Scgal3 (ligand sensor-defective) mutation [24], ScGal3p cannot complement the non-inducible phenotype of a Klgal1 deletion mutant unless the KlGAL80 gene is also replaced by ScGAL80 [21]. In addition, studies have suggested that there are differences in the cellular locations of the three GAL regulatory proteins in the two yeasts, and consequently potential differences in the mechanism of transcriptional activation. In both cases, Gal4p is presumed to be nuclear, whereas ScGal80p can be found in both the nucleus and the cytoplasm and is capable of shuttling between the two. ScGal3p is predominantly cytoplasmic [25,26]. Recently, KlGal80p has been identified as an exclusively nuclear protein, whereas KlGal1p is present both in the nucleus and the cytoplasm [27]. On the basis of these and other results, it has been suggested that the S. cerevisiae GAL switch is activated when galactose and ATP bind to Gal3p in the cytoplasm. This traps Gal80p in the cytoplasm, thereby freeing Gal4p from its repressing effects and allowing transcriptional activation to occur [11]. In K. lactis, the GAL switch appears to be controlled by competition in the nucleus between KlGal1p and KlGal4p for KlGal80p binding [27]. In either model, Gal80p must interact with two very different proteins: Gal4p, via its C-terminal activation domain, and Gal3p (Gal1p in K. lactis) when galactose and ATP are bound to the ligand sensor.

A variety of experimental data suggests that the yeast GAL genes physically relocate to the nuclear periphery upon transcriptional activation [28,29]. Galactose-induced GAL gene re-localization has been found to be diminished following the deletion of NUP1 (a gene encoding a nuclear pore complex component), SAC3 (suppressor of actin 3), SUS1 (SI gene upstream of ySa1) or ADA2 (transcriptional ADAptor), suggesting that the GAL genes were being physically tethered to the nuclear periphery via interactions between the nuclear pore complex, mRNA export factors and SAGA (Spt/Ada/Gcn5 acetyltransferase) [28]. However, the removal of Nup1p or Ada2p from the cell was found not to affect induced GAL1 mRNA levels, suggesting that gene re-localization was not involved in regulating gene expression [28]. The removal of Mex67p, another mRNA export factor that interacts with Sac3p, and the TREX (transcriptional export) mRNA export complex [30,31] from cells was found to abolish induced re-localization of the GAL10 gene. Again, the deletion of MEX67 (mRNA export factor of 67 kDa) does not appear to affect GAL10 mRNA expression levels [32]. Hence, GAL gene re-localization appears to be due to interactions between gene, transcription factors, mRNA processing factors and the nuclear pore complex. GAL gene re-localization is not necessary for gene activation and, instead, appears to be a consequence of downstream mRNA processing events. It is yet to be determined whether gene re-localization to the nuclear periphery plays a regulatory role in these downstream events.

When the galactose in the yeast medium has been used up, the transcription of the GAL genes diminishes. However, after the transcriptional shut-off, the GAL gene cluster is retained at the nuclear periphery, possibly by interaction between the nuclear pore complex and the histone variant H2A.Z [33]. In this state, the GAL genes can be reactivated more rapidly if galactose again becomes available to the cell, than in cells that have not previously been exposed to the sugar. While the presence of H2A.Z may contribute to rapid re-activation of the GAL genes, it is most likely that the observed ‘transcriptional memory’ at the GAL loci is due to sustained levels of the galactokinase Gal1p in the cytoplasm of yeast cells [34]. Gal1p is highly similar to Gal3p, the ligand sensor, and in the absence of Gal3p, Gal1p can function as a ligand sensor for the GAL genetic switch, albeit with considerably less efficiency than Gal3p. Unlike the levels of Gal3p that increase only modestly in the presence of galactose, Gal1p levels increase by over 1000-fold following galactose induction [35]. It seems that enough Gal1p survives in the cytoplasm after the GAL genes are no longer being transcribed and after subsequent cell divisions to enable rapid GAL gene activation on future exposure to galactose [34]. In the absence of Gal1p, expression of Gal3p from the GAL1 promoter also enabled rapid reactivation of the GAL genes, providing more evidence that the ‘transcriptional memory’ of galactose was a result of positive feedback by trans-acting cytoplasmic factors rather than by chromatin modifications and gene localization events [34]. Presumably, the increased concentration of the inducer (Gal1p or Gal3p) allows for a more rapid alleviation of the inhibitory effect of Gal80p on Gal4p. However, it is not clear how the GAL locus remains tethered to the nuclear periphery after several generations of inactivity, as was previously observed [33].
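A back-of-envelope calculation shows why a cytoplasmic protein could plausibly carry such a memory across divisions. The Python sketch below assumes, purely for illustration, that after transcriptional shut-off Gal1p is neither synthesized nor degraded and is split equally between mother and daughter at each division. These assumptions, and the function names, are not taken from the cited studies; only the roughly 1000-fold induction figure comes from the text.

```python
import math


def fold_excess_after(fold_induction: float, generations: int) -> float:
    """Fold excess over the uninduced level after n two-fold dilutions."""
    return fold_induction / 2 ** generations


def generations_to_baseline(fold_induction: float) -> float:
    """Divisions needed for dilution alone to return to the uninduced level."""
    return math.log2(fold_induction)


# With a 1000-fold induction, four divisions still leave a 62.5-fold excess.
print(fold_excess_after(1000, 4))
# Dilution alone needs about ten divisions to erase the induced pool.
print(round(generations_to_baseline(1000), 2))
```

Under these assumptions the induced pool persists for several generations, which is compatible with the proposal that residual Gal1p underlies the memory. How much Gal1p is actually needed for rapid re-induction is not addressed by this arithmetic.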

Recently, the three-dimensional structures of some of the GAL regulatory proteins have been determined. The structure of the galactokinase from S. cerevisiae has been solved at 2.4 Å (1 Å=0.1 nm) resolution [36] (see Figure 3A). The galactokinase, Gal1p, shares an extraordinarily high degree of sequence similarity, ∼70% amino acid sequence identity and ∼90% similarity, with the ligand sensor Gal3p. It has therefore been possible to build predictive structural models for the latter based on the structure of the former. Despite this sequence similarity, Gal3p does not possess galactokinase activity [37]. Like a galactokinase, Gal3p interacts with galactose and ATP, but unlike galactokinase enzymes, Gal3p does not promote the conversion of these substrates into galactose 1-phosphate. Instead, it is assumed that Gal3p requires the presence of the substrates to adopt a conformation that allows an interaction to occur between it and Gal80p. Overall, Gal3p is a monomer [38] and is anticipated to adopt a two-domain structure similar to that observed for Gal1p. The ligands bind between the two domains (representing the N- and C-terminal portions of the protein) and the association of the domains represents a structural mechanism by which the presence of the ligands may be transmitted.

Figure 3 Structures of the yeast GAL regulatory proteins

(A) The structure of Gal1p from S. cerevisiae [36]. The protein is shown as a grey cartoon with galactose (red) and ADPNP (blue), an ATP analogue, as stick models. The magnesium ion in the active site is shown as a pink sphere. (B) The structure of Gal80p from K. lactis [40]. (C) The structure of the Gal80p–NAD complex from S. cerevisiae [39]. In (B) and (C), one monomer of the Gal80p dimer has been coloured in blue and the other in beige. In (C), NAD is shown as a stick model (red). Each Figure was generated using PyMOL (DeLano Scientific) with the following PDB co-ordinates from the references stated: 2AJ4, 2NVW and 3BTS respectively. Interactive three-dimensional versions of these Figures can be found at

The structures of both K. lactis and S. cerevisiae Gal80p have been solved recently [39,40]. As expected from their sequence similarity, the structures of the two proteins are very similar. Both proteins crystallize as dimers and show a high degree of structural homology with oxidoreductase enzymes [41]. The N-terminal half of each monomer forms a Rossmann fold composed of six parallel β-strands, flanked on one side by two α-helices and by three α-helices on the other [40]. The C-terminal domain of the protein is dominated by a nine-stranded mixed β-sheet that forms the dimerization surface. The most prominent feature of Gal80p is a deep cleft (Figure 3B) in each monomer that, based on the location of mutants that are defective in Gal4p binding, is predicted to be the docking site for the Gal4p activation domain [40].

Although the structure of the K. lactis Gal80p was solved as an apo-protein structure [40], the structure of the protein from S. cerevisiae (Figure 3C) was solved as a complex with the dinucleotide NAD [39]. An analysis of the structures of the K. lactis protein in the absence of dinucleotide and the S. cerevisiae protein in its presence suggests that very few structural changes occur on interaction with the dinucleotide. What, therefore, is the role of NAD in the regulation of GAL gene expression? NAD itself would appear to have no effect on the formation of either the Gal4p–Gal80p or the Gal3p–Gal80p complexes, but NADP may have the effect of disrupting the Gal4p–Gal80p complex [39]. Additionally, mutants of Gal80p that are predicted to be defective in dinucleotide binding would appear to have slightly modified GAL gene induction kinetics in vivo [39]. It is, however, not clear if the effect of dinucleotides on GAL gene expression represents a real control point or if they represent some kind of remnant that arises as a result of the evolution of Gal80p from an oxidoreductase enzyme.

The structure of Gal80p bound to a peptide representing the activation domain of Gal4p has also been reported recently [39]. However, the structure was not at sufficiently high resolution to determine the side-chain interactions that occur between the Gal4p peptide and Gal80p. These results do, however, suggest that Gal4p binds to Gal80p in the deep cleft on the surface of the latter (Figure 3B). Again, because of the lack of resolution, it is not clear from the structure if the binding of Gal4p and the dinucleotide are mutually exclusive. Clearly, higher resolution structural information will be required before this issue can be resolved.

Yet another level of control of the activity of Gal4p would appear to be mediated through the addition of ubiquitin to the protein [42,43]. The ubiquitylation of proteins that involves the addition of multiple ubiquitin moieties usually targets the protein for degradation by the proteasome [44]. However, the addition of single ubiquitin moieties to proteins can result in modified function rather than degradation. Studies involving the potential interaction between Gal4p and the proteasome extend over some 16 years [45], beginning with the discovery that the transcriptional activation domain of Gal4p was capable of interacting with Sug1p, a proteasomal ATPase [46]. More recently, however, it has been shown that the interaction between Gal4p and components of the proteasome may, in fact, regulate the DNA-binding activity of Gal4p rather than resulting in degradation of the protein [43,47]. DNA-tethered versions of transcriptional activators bearing the Gal4p DNA-binding domain were found to be mono-ubiquitylated, and these forms of the protein were resistant to removal from the DNA by the proteasome [47]. Additionally, it has been shown that the mono-ubiquitylation of Gal4p requires parts of the activation domain. This suggests that, in addition to its functions in binding Gal80p and in binding parts of the transcriptional machinery that will eventually result in the recruitment of RNA polymerase II to the promoter, the activation domain of Gal4p also serves as a docking site for an as yet unidentified E3 ubiquitin ligase complex [43]. A version of Gal4p lacking this portion of the activation domain, and hence unable to be mono-ubiquitylated, was found to be hyper-sensitive to removal from the DNA such that its overall DNA-binding activity was drastically reduced in comparison with the wild-type protein [43]. The site of ubiquitin addition within Gal4p has not yet been identified.

Despite over 50 years of intensive research effort, the regulation of yeast GAL gene expression continues to surprise. Nuances continue to be uncovered but, in addition, paradigm shifts in our understanding of this exquisitely sensitive and elegant genetic switch still happen. It is clear, however, that many of the recent advances in understanding the protein–protein and protein–small molecule interactions that regulate gene expression have stemmed from solving the three-dimensional structures of the GAL regulatory proteins. In the future, solving the structures of the Gal4p–Gal80p and Gal3p–Gal80p complexes will be required to extend our knowledge yet further.


Glucose is the preferred carbon source for S. cerevisiae and its presence triggers catabolite repression of genes required for growth on non-preferred carbon sources, while activating those required for growth on glucose [48]. Glucose regulates transcription in yeast by a complex signalling network [49]. The first step of glucose utilization is the import of the sugar into the cell, which occurs via specialized glucose-transporter proteins. In yeast, these proteins are encoded by the HXT (hexose transporter) genes, and the transcription of HXT1–4 is regulated in response to glucose levels. The HXT genes are repressed in the absence of glucose, and are expressed only if glucose is available to the cell [50]. The regulation of these genes is dependent upon the Zn(II)2Cys6 family protein Rgt1p (restores glucose transport-1) [51].

The regulation of Rgt1p function is complex, involving the integration of several glucose signalling pathways, and appears to be effected through phosphorylation of Rgt1p [52]. Rgt1p is constitutively expressed and localized to the nucleus [51,53]. In the absence of glucose, Rgt1p binds to the promoters of HXT genes at the consensus DNA sequence 5′-CGGANNA-3′ [52]. Unlike the binding sites of many other Zn(II)2Cys6 proteins (Gal4p, Put3p etc.), the Rgt1p-binding site is non-palindromic, suggesting that Rgt1p binds as a monomer. However, there are multiple Rgt1p-binding sites within the HXT gene promoters [52]. The HXT genes have been identified as the primary targets of Rgt1p, with the HXK2 (hexokinase isoenzyme 2) gene being a notable exception ([54], and see below).

The transcriptional repression of the HXT genes in the absence of glucose requires both Rgt1p and the Ssn6p–Tup1p general co-repressor complex, suggesting that the function of Rgt1p is to recruit Ssn6p–Tup1p to HXT gene promoters and bring about gene repression [51,55]. HXT2 and HXT4 encode high-affinity glucose transporters and are not required for growth at high glucose concentrations. These genes are subject to an additional level of regulation through repression by Mig1p in high glucose [56].

Upon the addition of glucose to the cell, Rgt1p becomes hyper-phosphorylated and is released from DNA [52]. Polish et al. [55] used a two-hybrid assay to show that the central region of Rgt1p interacts with a region adjacent to the Rgt1p DNA-binding domain in a glucose-dependent manner. Rgt1p hyper-phosphorylation is mediated by PKA (protein kinase A; cAMP-dependent protein kinase), which phosphorylates Rgt1p on at least four separate serine residues [57]. Mutation of these residues, or loss of PKA activity, resulted in constitutive binding of Rgt1p to DNA, and the loss of the glucose-dependent intramolecular interaction. PKA activity is regulated by cellular cAMP levels, which are increased in high glucose following activation of the Gpr1p/Gpa2p and Ras1p/Ras2p pathways and adenylate cyclase (Cyr1p) [57]. The above suggests that glucose-induced phosphorylation of Rgt1p by PKA results in a conformational change within Rgt1p that inhibits its DNA-binding activity and hence allows derepression of the HXT genes.

Rgt1p appears to possess an activating function in addition to its more well-studied function as a repressor. The full induction of the HXT1 gene by high glucose levels (>2%) is not seen in an RGT1 knockout strain, and a LexA–Rgt1p fusion protein activates transcription in high glucose [50,51]. The finding that Rgt1p functions as an activator seems paradoxical, as Rgt1p is no longer bound to DNA in high glucose [52]. It may be that Rgt1p activates HXT1 by indirect means, although the factors involved in this have yet to be identified. Indeed, results have suggested that Rgt1p may be capable of activating transcription by itself [51].

Rgt1p-mediated gene repression also requires Mth1p and Std1p, two homologous proteins which interact with Rgt1p in the absence of glucose [55], and which are themselves regulated by glucose levels. Mth1p appears to prevent the glucose-dependent intramolecular interaction of Rgt1p from occurring, an attractive model being that Mth1p protects Rgt1p from phosphorylation [55]. Std1p interacts with a region of Rgt1p that was found to be necessary for transcriptional activation, and it was proposed that Std1p functions by shielding the Rgt1p activation domain in the absence of glucose, in a manner similar to the Gal80p–Gal4p interaction. However, as noted above, exactly how Rgt1p functions as an activator remains to be elucidated [55].

As discussed above, Rgt1p phosphorylation is signalled via the Gpr1p/Gpa2p and Ras1p/Ras2p pathways [57]. However, the overall regulation of Rgt1p activity by glucose is more complex and involves the integration of several regulatory circuits. Both Mth1p and Std1p are targeted for proteasome-mediated degradation when glucose is available, following activation of the Snf3p–Rgt2p glucose sensors and the Yck1/2-Grr1p pathway [58–60]. However, in the absence of glucose, Mth1p and Std1p are phosphorylated by the active Snf1p–kinase complex, preventing their degradation [61].

Mth1p and Std1p are also regulated by glucose at the level of transcription. The MTH1 (MSN three homologue 1) gene is repressed by the Mig1p-glucose repressing pathway, reinforcing the effect of glucose on relieving Rgt1p-mediated gene repression [60]. Control over Rgt1p DNA-binding ability and transcriptional repression function would ultimately seem to be hinged on the level of Mth1p in the cell. Meanwhile, STD1 (suppressor of Tbp deletion) is actually activated by high glucose levels, seemingly by the relief of Rgt1p-mediated repression, suggesting a feedback mechanism that was proposed to enable rapid establishment of HXT repression if glucose levels drop [60].

Although Rgt1p is thought to mediate repression of the HXT genes by recruiting Ssn6p–Tup1p [51,55], Rgt1p may also inhibit gene transcription in a more direct way. It was recently reported that Rgt1p mediates the formation of a DNA loop at the HXK2 gene promoter in low glucose [62]. First, two-hybrid experiments demonstrated that Rgt1p interacts with Med8p, a component of the RNA polymerase II mediator co-activator complex [62]. Med8p is known to bind to specific elements in several genes, promoting either gene repression or activation depending on the context [63,64]. Next, ChIP (chromatin immunoprecipitation) experiments suggested that Rgt1p and Med8p interact on the HXK2 promoter, resulting in the formation of a DNA loop, with the implication that this DNA looping prevents gene transcription [62]. It is not clear from the current results whether this type of DNA looping represents a general mechanism of transcriptional inhibition by Rgt1p or is specific to the HXK2 gene.

In summary, Rgt1p functions primarily as a repressor of the HXT genes and HXK2, in conditions where glucose is not available to the cell (Figure 4). When functioning as a repressor, Rgt1p binds DNA, a function which is regulated by glucose-induced hyper-phosphorylation. In low glucose, Rgt1p binds to specific DNA sites and represses transcription in concert with co-repressing factors (Tup1p–Ssn6p, Med8p), which may involve the formation of DNA loops refractory to transcription. In high-glucose environments, Rgt1p becomes hyper-phosphorylated and undergoes a conformational change which prevents DNA binding. Two additional factors are required for Rgt1p to function as a repressor: (i) Mth1p, which prevents Rgt1p phosphorylation, and (ii) Std1p, which may inhibit Rgt1p activator function. Exactly how these additional factors function, in particular Std1p, requires further study, as does the mechanism behind Rgt1p-mediated transcriptional activation. Finally, several glucose-signalling pathways converge on Rgt1p to regulate its function, including Rgt2p–Snf3p and Grr1p-mediated protein degradation, activation of PKA by the Gpr1p/Gpa2p and Ras1p/Ras2p pathways, and the Mig1p and Snf1p–kinase pathways, showing how several separate signals may be integrated into controlling a specific transcriptional response.
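The repressor side of this summary can be written down as a small state function. The Python sketch below is a qualitative abstraction with hypothetical names. It encodes only three statements from the text: Mth1p is degraded when glucose is available, PKA hyper-phosphorylates Rgt1p in glucose unless Mth1p protects it, and hyper-phosphorylated Rgt1p does not bind DNA. The activator function of Rgt1p at high glucose, Std1p, and DNA looping are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rgt1State:
    mth1_present: bool
    rgt1_hyperphosphorylated: bool
    rgt1_bound_to_dna: bool


def rgt1_state(glucose: bool, pka_active: bool = True) -> Rgt1State:
    """Qualitative state of Rgt1p; pka_active=False mimics loss of PKA activity."""
    # Mth1p is degraded when glucose is available and stabilized when it is absent.
    mth1_present = not glucose
    # PKA hyper-phosphorylates Rgt1p in glucose; Mth1p is proposed to prevent this.
    phosphorylated = glucose and pka_active and not mth1_present
    # Hyper-phosphorylated Rgt1p is released from DNA.
    bound = not phosphorylated
    return Rgt1State(mth1_present, phosphorylated, bound)


print(rgt1_state(glucose=False))                    # bound: HXT genes repressed
print(rgt1_state(glucose=True))                     # released: repression relieved
print(rgt1_state(glucose=True, pka_active=False))   # constitutive DNA binding
```

The third call reproduces the reported observation that loss of PKA activity leaves Rgt1p constitutively bound to DNA. Whether that bound form still represses in the absence of Mth1p is not something this sketch attempts to decide.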

Figure 4 The regulation of HXT gene transcription by Rgt1p

(A) In the absence of glucose, Rgt1p binds to the promoters of HXT1–4 genes and represses transcription. (B) If glucose is available to the cell, Rgt1p becomes phosphorylated and, as a consequence, is unable to bind DNA. This relieves Rgt1p-mediated gene repression. (C) At high levels of glucose, Rgt1p activates the expression of HXT1. The molecular mechanism by which this occurs is unclear, although it is likely to involve additional factors.


Oxygen is both essential and deleterious to the cell. If the concentration is too low, respiration and the synthesis of many molecules are not possible. If the concentration is too high, the reactive nature of oxygen can result in oxidative damage. Therefore living systems must possess a means of accurately sensing and adapting to varying levels of oxygen. In S. cerevisiae, the Zn(II)2Cys6 transcription factor Hap1p (haem activator protein 1) acts in this capacity. Like the Zn(II)2Cys6 family members Put3p and Ppr1p, it possesses both sensor and activator functions. Whereas Put3p directly detects proline [65] and Ppr1p detects orotic acid [66], Hap1p binds haem [67]. Haem is the primary indicator of the cellular oxygen status in yeast, as its synthesis is directly dependent upon oxygen concentration. When oxygen is non-limiting, haem is readily synthesized; however, when conditions become hypoxic, at oxygen concentrations below ∼1 μM, haem production ceases [67]. As Hap1p directly associates with haem, it can detect the oxygen status of the cell by proxy [67].

During hypoxic growth, or when haem is absent, Hap1p binds the promoters of ten genes, such as HMG1 (3-hydroxy-3-methylglutaryl-coenzyme A reductase), ERG5 (ergosterol biosynthesis) and CYB5 (cytochrome B), and actively represses them [68]. However, when oxygen is abundant and the haem concentration is high, Hap1p binds haem and directly associates with the promoters of 19 different genes required for aerobic growth, such as CYC1 (iso-1-cytochrome c), CYC7 (iso-2-cytochrome c) and ROX1 (repressor of hypoxic genes), where it activates their transcription [68]. Thus Hap1p allows the cell to alter transcriptional programmes based on the presence or absence of haem, and hence adapt to changing oxygen status.

Hap1p consists of 1483 amino acids arranged into several distinct domains (Figure 5A). At the N-terminus it possesses a Zn(II)2Cys6 DNA-binding domain followed by a dimerization domain. Distal to these canonical domains it possesses RPM (repression module)1/RPM3 followed by six HRMs (haem-responsive motifs; HRM1–6). At the C-terminus, distal to a region free of distinct domains, RPM2 is followed by HRM7 and then the acidic activation domain [69]. Although all of the RPMs are necessary for Hap1p repressor function, it is only HRM7 which plays a role in the activation of Hap1p [70]. In vivo, Hap1p exists in a high-molecular-mass complex consisting of Hap1p, the Ssa products of HSP70 (heat-shock protein 70), as well as the co-chaperones Ydj1p and Sro9p [71]. Purification of this native complex showed that the components exist in a ratio of Hap1p/Ssap/Sro9p of 2:3:5 [71]. Although the HSP70 co-chaperone Ydj1p is a member of this complex, it could not be quantified due to low protein levels [67,71]. All of these associating chaperones are essential for the repressor function of Hap1p and previous studies suggested that Hap1p only existed in this complex, dubbed HMC (high-molecular-mass complex), when haem was absent [72,73]. The addition of haem caused the complex to dissociate and Hap1p to bind DNA in a dimeric form, with much greater efficiency than the HMC. This was therefore considered to be the activated form of Hap1p [72]. These experiments were, however, carried out in vitro and later experiments, where the complex was analysed in more native conditions, suggested that the HMC exists irrespective of haem concentration. Instead, the complex undergoes a conformational change when Hap1p is activated by haem [71].

Figure 5 Hap1p functions as a repressor or an activator depending on the haem concentration

(A) An overview of the structural motifs within Hap1p. (B) Repressor function. The Hap1p complex is bound to half sites when acting as a repressor, regardless of the haem concentration. In the absence of haem the Hap1p complex represses gene expression through association with the universal repressor Tup1p–Ssn6p. When bound to haem, the complex undergoes a conformational change and no longer associates with Tup1p–Ssn6p and repression is relieved. (C) Activator function. The Hap1p complex is bound to repeat DNA-binding sites when functioning as an activator, regardless of haem concentration. When haem is not present, the Hap1p complex is transcriptionally inert. When bound to haem, however, the complex associates more strongly with DNA and undergoes a conformational change mediated by a transient interaction with Hsp90p to activate gene expression.

When haem is present, it binds to the HRM7 domain of Hap1p, inducing a conformational change in the HMC [70,74]. Hsp90p mediates this alteration in HMC conformation and is essential for Hap1p to function as an activator [71,74]. In the absence of haem, Hsp90p interacts very weakly with the HMC, but upon the addition of haem this interaction is moderately stabilized. Although stabilized, this interaction with Hsp90p appears to be transient and weak, as only a minor portion of the Hap1p pool is associated with Hsp90p, even under fully inducing conditions [71]. Additionally, it has been shown that mutations that greatly stabilize the Hsp90p–Hap1p interaction diminish Hap1p activity [70], suggesting that a transient interaction is essential for its function as an activator.

The above observations suggest that Hap1p exists as a repressor or an activator and that the joint actions of haem and Hsp90p moderate this conversion. Indeed, the analysis of Hap1p-regulated genes has demonstrated this to be the case. When haem is not present, the Hap1p complex represses the activity of genes such as ERG5, and as the haem concentration increases these genes become de-repressed in a dose-dependent manner. When HAP1 is deleted, ERG5 loses sensitivity to haem and becomes constitutively active [68]. The dose-dependent loss of repression indicates that the greater the pool of haem-bound Hap1p complex, the less capacity it has to repress gene expression. The fact that deletion of HAP1 ablates this suggests that the Hap1p complex can only fully repress a gene in the absence of haem [68]. However, genes which are activated by the Hap1p complex, such as CYC1, do not respond to haem until a threshold level is reached, and their gene products are not detectable when HAP1 is deleted [68]. This would suggest that although repression is directly dependent upon haem levels, activation operates through an as yet unidentified mechanism. Despite these differences, these results do demonstrate that Hap1p functions either as an activator or as a repressor depending on the haem concentration, much as Rgt1p switches between these functions as the glucose concentration alters [50,51,55].

Although the haem/Hsp90p-mediated conformational changes of the Hap1p complex determine whether it functions as an activator or repressor, both operations are dependent on DNA binding [75,76]. The Hap1p complex is bound to the canonical Hap1p DNA-binding site consisting of two CGG triplets separated by a 6 bp spacer (5′-CGGnnnTAnCGG-3′) in both the presence and absence of haem, although haem moderately increases this binding [69,71]. However, the promoters of genes both activated (ERG2, ERG11 and CYC1) and repressed (ERG5) by Hap1p also possess CGG half sites [68,75,76]. This suggests that Hap1p might bind these half sites in addition to the complete sites.
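
The distinction between complete sites and half sites can be made concrete with a short pattern search. The following Python sketch is illustrative only and is not taken from the studies cited: the function name find_sites is hypothetical, a half site is assumed to be any CGG triplet that does not fall within a complete 5′-CGGnnnTAnCGG-3′ site, only the top strand is scanned, and the example sequence is invented rather than a real promoter.

```python
import re

# Complete site as quoted in the text: 5'-CGGnnnTAnCGG-3' (n = any base).
# Lookaheads are used so that overlapping occurrences are all reported.
COMPLETE_SITE = re.compile(r"(?=(CGG[ACGT]{3}TA[ACGT]CGG))")
CGG_TRIPLET = re.compile(r"(?=(CGG))")
COMPLETE_LENGTH = 12


def find_sites(seq):
    """Return 0-based start positions of complete sites and of CGG
    triplets lying outside any complete site (top strand only)."""
    seq = seq.upper()
    complete = [m.start() for m in COMPLETE_SITE.finditer(seq)]
    covered = set()
    for start in complete:
        covered.update(range(start, start + COMPLETE_LENGTH))
    # Assumption: a lone CGG triplet counts as a candidate half site.
    half = [m.start() for m in CGG_TRIPLET.finditer(seq)
            if m.start() not in covered]
    return {"complete": complete, "half": half}


# Invented sequence for illustration; not a real promoter.
example = "ATCGGACTTATCGGTTTTCGGAAAA"
print(find_sites(example))  # {'complete': [2], 'half': [18]}
```

A search of this kind only lists candidate positions; whether Hap1p actually occupies a given site has to be established experimentally, as in the deletion and pull-down studies discussed here.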

The ERG5 promoter possesses two complete Hap1p-binding sites, in addition to five half sites, and when repressed by the Hap1p complex associates with the universal repressor Tup1p–Ssn6p [68]. Although Hap1p binds to the two complete sites with a greater affinity than the half sites, deletion of a 100 bp region containing the five half sites markedly de-represses ERG5, whereas deletion of the complete sites has little effect. This indicates that it is the half sites, and not the complete sites, that are responsible for Hap1p-mediated repression. It is interesting to note, however, that while Hap1p binding to the complete sites was enhanced in hypoxic conditions, the association with the half sites was not. Therefore it is possible that these sites work in conjunction [68]. The HAP1 gene is also repressed by Hap1p, with the promoter possessing only Hap1p half sites [75,77]. Deletion of the HAP1 promoter region containing the putative UAS (upstream activating sequence: −461 bp to −380 bp from the transcription start site), which contains no half sites, results in constitutive repression of HAP1 [68]. Further analysis has revealed that the repressing half site is located from −461 bp to −345 bp from the start codon [77]. The Hap1p complex was found to bind to this site regardless of haem concentration and, when this sequence was used to purify the Hap1p complex, it co-purified with Ssa, but not Hsp90p, again regardless of haem concentration [77]. When this same experiment was performed with a complete Hap1p-binding site, the Hap1p complex was shown to associate with Hsp90p only in the presence of haem [75]. As the Ssa components of the Hap1p complex are required for repression and Hsp90p for activation, the absence of Hsp90p from the half site pull-down suggests that this binding site is responsible for repression of HAP1. These two separate examples suggest that Hap1p binds to half sites in order to repress gene expression, not unlike Rgt1p [52], and to complete sites to activate it.

The above studies suggest a model for Hap1p function (Figures 5B and 5C) where it exists as the Hap1p–Ssap–Ydj1p–Sro9p complex. When bound to half sites it imposes a repressor function in the absence of haem, which is dependent on the Ssa, Ydj1p and Sro9p chaperones and the Tup1p–Ssn6p complex. When the haem concentration increases, although still bound to the half sites, this repressor function is lost. When bound to complete sites Hap1p functions as an activator through conformational changes brought about by haem binding, mediated through Hsp90p. The Hap1p complex is still present at the complete sites in the absence of haem, but inactive. The function of the Hap1p complex is therefore determined by haem and the nature of the binding site, rather than by DNA binding itself.
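
The model amounts to a two-input decision table, which the following Python sketch writes out explicitly. It is a deliberately coarse illustration, assuming that haem is simply present or absent and ignoring the dose-dependence and threshold behaviour described earlier; the names Site and hap1_output are hypothetical.

```python
from enum import Enum


class Site(Enum):
    HALF = "half"
    COMPLETE = "complete"


def hap1_output(site, haem_present):
    """Transcriptional outcome predicted by the model for one promoter.

    The Hap1p complex is treated as DNA-bound in every case, so only
    the site type and the haem status decide the outcome.
    """
    if site is Site.HALF:
        # Repression requires the chaperones and Tup1p-Ssn6p and is
        # relieved once haem binds.
        return "de-repressed" if haem_present else "repressed"
    # Complete site: occupied with or without haem, but activation
    # needs the haem-induced, Hsp90p-mediated conformational change.
    return "activated" if haem_present else "inert"


for site in Site:
    for haem in (False, True):
        print(site.value, haem, hap1_output(site, haem))
```

Written this way, the table makes the central point of the model visible: DNA occupancy never appears as an input.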


Ume6p is an 836 amino acid (91 kDa) Zn(II)2Cys6 DNA-binding protein that has been implicated in the transcriptional regulation of genes involved in a wide variety of metabolic pathways, including phospholipid biosynthesis, arginine catabolism, nitrogen metabolism, peroxisomal functions and DNA repair [78–81]. It is, however, most widely recognized for its role in the regulation of the EMGs (early meiosis-specific genes) in budding yeast [82]. Under meiotic conditions, i.e. nitrogen depletion and the presence of a non-fermentable carbon source, these genes are transcribed. In medium that promotes vegetative growth with either glucose or acetate as the sole carbon source [83], EMGs are repressed by Ume6p [84].

Ume6p is structurally distinct from the ‘classical’ Zn(II)2Cys6 zinc cluster proteins such as Gal4p. CD spectroscopy and hydrophobicity analysis suggest that extensive folding of Ume6p in aqueous solution is unlikely, and with the exception of the binuclear cluster domain, Ume6p has little or no identifiable secondary motifs [85]. In keeping with its predominant role as a repressor, Ume6p lacks an obvious acidic activation domain. The Zn(II)2Cys6 DNA-binding domain, which is commonly located at the N-terminal end of the protein in this family, is located at the C-terminus of Ume6p (residues 771–798). In addition, the protein lacks the highly conserved coiled-coil elements that are important for dimerization in other family members [85]. This led to the assumption that Ume6p functioned as a monomer, which was confirmed following CD analysis [85].

During vegetative growth, when yeast cells are utilizing a fermentable carbon source in the presence of nitrogen, transcriptional repression by Ume6p is dependent on histone deacetylation and chromatin remodelling activities. Ume6p binds the URS1 element (5′-TAGCCGCCGA-3′) present in the promoters of EMGs [84,86,87] and then recruits the well-characterized Sin3p–Rpd3p HDAC (histone deacetylase complex), as well as the Isw2p chromatin-remodelling complex to the promoter [88–90]. A short region of Ume6p (residues 515–530) interacts with the Sin3p co-repressor and this region is necessary and sufficient for recruitment of the complex to promoters [88,91]. Tethering of Rpd3p to the promoter leads to the deacetylation of histones H3 and H4 over a range of one to two nucleosomes from the site of recruitment, resulting in localized perturbation of the chromatin structure, the inhibition of TATA-binding protein binding and repression of transcription [88,89,92]. Ume6p levels in vegetative cells are maintained through RAS and cAMP-PKA, which are major regulators of the nutritional response in yeast [93]. PKA phosphorylates Cdc20p, an activator of the APC/C (anaphase-promoting complex/cyclosome) ubiquitin ligase, preventing Cdc20p from interacting with Ume6p. This protects the protein from proteolysis and maintains normal Ume6p levels [93]. Two protein kinases that are known to phosphorylate Ume6p, Rim11p [a GSK3 (glycogen synthase kinase 3) homologue] and Rim15p (a nutritional control kinase), are also inhibited through direct phosphorylation by PKA [94].

A number of studies have suggested that meiotic induction requires the conversion of Ume6p from a repressor into an activator by association with the meiotic inducer Ime1p [91,95–98]. Ime1p is required for the initiation of the earliest meiotic events [99], and two-hybrid assays have shown that Ume6p interacts with Ime1p and that this interaction is inhibited by a single amino acid substitution in Ume6p (T99N) [95]. It was proposed that Ime1p provided Ume6p with an activation domain, allowing the induction of the early meiotic genes and progression of meiosis [91,97]. However, more recent data suggest that Ume6p destruction, following association with Ime1p, is required for meiotic gene induction [93].

When cells utilize a non-fermentable carbon source in the presence of nitrogen, reduced PKA activity reduces Cdc20p phosphorylation, allowing the interaction of Cdc20p with Ume6p, which results in the partial degradation of Ume6p [93]. A 50% reduction in Ume6p levels is observed when cells are switched from a fermentable to a non-fermentable carbon source [93]. In response to nitrogen starvation (meiotic entry), transcription of IME1 is induced and Ime1p is recruited to the URS1 elements [96]. The kinase-active non-phosphorylated Rim11p phosphorylates Ime1p and, along with Rim15p and an additional GSK3 homologue, Mck1p, phosphorylates the N-terminal region of Ume6p [94,97]. Elevated phosphorylation of the Ume6p N-terminal region favours an Ume6p–Ime1p interaction. The C-terminal domain of Ime1p (residues 270–360) interacts with the N-terminal region of Ume6p (residues 1–232), and localizes the transcriptional activation domain of Ime1p to the URS1 elements [97]. Mallory et al. [93] revealed that Ume6p is completely eliminated from cells during meiotic induction, with complete Ume6p destruction contingent upon both the entry into the meiotic programme and direct association with Ime1p. Meiotic destruction of Ume6p is prevented when Ime1p and Ume6p association is inhibited by the presence of the Ume6p T99N mutation [93]. The precise nature of how meiotic gene induction proceeds is unclear. A proposed ternary complex between Ime1p, Cdc20p and Ume6p would complete Ume6p destruction and disrupt the Sin3p–Ume1p–Rpd3p HDAC [93]. The Sin3p–Rpd3p HDAC would then be removed from the URS1 elements, relieving Rpd3p repression, and permitting Gcn5p-dependent acetylation of EMG promoters. This would then result in the transcriptional activation of EMGs, promoting meiosis and sporulation [93,96,100]. Ume6p is detected again during spore wall assembly after the completion of meiosis I and II. However, Ume6p does not appear to be required to re-establish EMG repression as the cell completes meiosis [93], suggesting that another system is introduced to perform this function prior to Ume6p-dependent repression.

Ume6p is a transcriptional repressor that is required to repress the expression of EMGs during vegetative growth, and, upon entry into meiosis, Ume6p destruction is required for normal gene induction (Figure 6). The mechanisms controlling Ume6p transcriptional repression, and transcriptional activation as a result of Ume6p destruction, ensure that entry into the developmental pathway of meiosis takes place only under the correct conditions.

Figure 6 The control of Ume6p repression function

(A) In the presence of nitrogen and a fermentable carbon source, Ume6p recruits the Sin3p–Rpd3p HDAC to the promoter of the EMGs, leading to the deacetylation of histones and repression of transcription. Cdc20p phosphorylation prevents an interaction with Ume6p, therefore protecting Ume6p from proteolysis. (B) In the presence of nitrogen and a non-fermentable carbon source, reduced Cdc20p phosphorylation allows its interaction with Ume6p, which results in the partial degradation of Ume6p. (C) In response to nitrogen starvation, transcription of IME1 is induced and both Ime1p and Ume6p are phosphorylated, favouring an Ime1p–Ume6p interaction. A proposed ternary complex between Ime1p, Cdc20p and Ume6p would complete Ume6p degradation, disrupt a Sin3p–Ime1p–Rpd3p complex and relieve Rpd3p repression, allowing the acetylation of EMG promoters and transcription to proceed.
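
The three conditions in Figure 6 can be summarized as a simple lookup from nutrient status to Ume6p fate and EMG expression. The Python sketch below is one reading of the text rather than an established model: the function name ume6_state and the state labels are hypothetical, EMGs are assumed to remain repressed in condition (B) because acetate-grown vegetative cells repress them, and the combination of nitrogen starvation with a fermentable carbon source is not described in the text and is therefore rejected.

```python
def ume6_state(nitrogen, fermentable_carbon):
    """Map nutrient conditions to the Ume6p/EMG state of Figure 6."""
    if nitrogen and fermentable_carbon:
        # (A) PKA-phosphorylated Cdc20p cannot engage Ume6p.
        return {"ume6p": "stable", "emgs": "repressed"}
    if nitrogen and not fermentable_carbon:
        # (B) Reduced Cdc20p phosphorylation allows partial degradation.
        return {"ume6p": "partially degraded", "emgs": "repressed"}
    if not nitrogen and not fermentable_carbon:
        # (C) Ime1p association leads to complete destruction of Ume6p.
        return {"ume6p": "destroyed", "emgs": "induced"}
    raise ValueError("condition not described in the text")


print(ume6_state(nitrogen=True, fermentable_carbon=True))
print(ume6_state(nitrogen=False, fermentable_carbon=False))
```

Raising an error for the undescribed combination keeps the sketch from implying more than the review states.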


The Zn(II)2Cys6 family of proteins in yeast represent a diverse set of functions linked through a common DNA-binding motif. Some of these proteins apparently function solely as transcriptional activators and require often elaborate mechanisms, involving both proteins and small molecules, in order to regulate their transcriptional activity in response to stimuli. Other members of this family can function as either transcriptional activators or transcriptional repressors of certain genes under certain conditions. Again, elaborate mechanisms exist to modulate the switch between these two activities. Finally, members of the Zn(II)2Cys6 family of proteins can function solely as transcriptional repressors. Therefore, although this family of proteins share a common DNA-binding motif, the manner of their activity and the mechanisms by which this activity is controlled are varied and diverse. Perhaps the biggest advance in understanding the molecular basis of the control of transcriptional function in recent years has stemmed from the elucidation of three-dimensional structures for a number of the regulatory proteins involved. In the future, further structural analysis of both complete proteins (rather than isolated domains) and protein complexes will be required to unpick both the gross molecular mechanism and more subtle nuances that control gene expression in response to small-molecule metabolites. Understanding the precise molecular basis by which the presence, or absence, of a small molecule can trigger a series of events ultimately leading to changes in gene expression remains challenging. Although it is relatively straightforward to look at the input and output signals (the presence or absence of the small molecule and the levels of gene expression), the intermediate steps (protein–small molecule, protein–protein and protein–DNA interactions) have been much harder to define. However, the complex, yet elegant, mechanisms employed by members of this family to control gene expression continue to shed light on a variety of fundamental control processes.


Work in the author's laboratory was supported by grants from the BBSRC (Biotechnology and Biological Sciences Research Council) and the Wellcome Trust.

Abbreviations: CYC1, iso-1-cytochrome c; EMG, early meiosis-specific gene; ERG5, ergosterol biosynthesis 5; GSK3, glycogen synthase kinase 3; HAP1, haem activator protein 1; HDAC, histone deacetylase complex; HMC, high-molecular-mass complex; HRM, haem-responsive motif; HSP70, heat-shock protein 70; HXK2, hexokinase isoenzyme 2; HXT, hexose transporter; PKA, protein kinase A (cAMP-dependent protein kinase); RPM, repression module

