Steroid hormones are important endocrine signalling molecules controlling reproduction, development, metabolism, salt balance and specialized cellular responses, such as inflammation and immunity. They are lipophilic in character and act by binding to intracellular receptor proteins. These receptors function as ligand-activated transcription factors, switching on or off networks of genes in response to a specific hormone signal. The receptor proteins have a conserved domain organization, comprising a C-terminal LBD (ligand-binding domain), a hinge region, a central DBD (DNA-binding domain) and a highly variable NTD (N-terminal domain). The NTD is structurally flexible and contains surfaces for both activation and repression of gene transcription, and the strength of the transactivation response has been correlated with protein length. Recent evidence supports a structural and functional model for the NTD that involves induced folding, possibly involving α-helix structure, in response to protein–protein interactions and structure-stabilizing solutes.
- AF1 transactivation domain
- allosteric regulation
- protein–nucleic acid interaction
- protein–protein interaction
- post-translational modification
- secondary structure
- steroid receptor
It is 25 years since the cDNAs for the glucocorticoid [1,2] and oestrogen  receptors were isolated. This led to the cloning of cDNAs for other steroid receptors, as well as receptors for non-steroid ligands and a large number of orphans, and resulted in the creation of the superfamily of nuclear receptors (reviewed in [4–6]). The availability of steroid receptor cDNAs also opened up new opportunities for investigating receptor protein structure and mechanisms of action, and the genetic dissection of hormone physiology.
Steroid hormones and non-steroid ligands are important endocrine-signalling molecules, which control reproduction, development, metabolism, salt balance and specialized cellular responses, such as inflammation and immunity. Although the known hormones and ligands for nuclear receptors are structurally and/or chemically distinct, they share a general lipophilic character, and strikingly the individual receptor proteins show some remarkable conservation in terms of domain organization and structure. SHRs (steroid hormone receptors) are intracellular proteins, which have a well-defined domain organization consisting of a C-terminal LBD (ligand-binding domain), linked via a hinge region to a DBD (DNA-binding domain), which is followed by a variable NTD (N-terminal domain; Figure 1). The LBD binds both agonists and antagonists, and retains variable amino acid identity between receptors (approx. 20–50%). The LBD for all steroid receptors is folded into ‘12’ α-helices arranged to form three separate helical sheets, with helices 3, 4 and 12 integral to ligand binding (Figure 1). In addition, a glutamic acid residue in helix 12 and a lysine residue in helix 3, together with a hydrophobic pocket on the surface of the LBD made up of residues from helices 3, 4 and 5, are important for protein–protein interactions and represent a ligand-dependent transactivation domain, termed AF2 (activation function 2) . High-resolution structures are available for both agonist- [8–13] and, in some cases, antagonist- [8,13,14] bound SHR-LBD (Figure 1).
The core DBD is a highly conserved, defining feature of this family of transcription factors, and is characterized by eight conserved cysteine residues that co-ordinate two zinc ions. The DBD folds to adopt a globular conformation consisting of two perpendicular α-helices, with residues important for DNA recognition and binding forming part of the recognition helix (Figure 1) [15–18]. Steroid receptors bind to palindromic-like sequences, consisting of a 6 bp half-site separated by 3 bp, as homodimers. A subfamily of SHRs, consisting of the androgen, glucocorticoid, mineralocorticoid and progesterone receptors, bind to the half-site sequence AGAACA, whereas the oestrogen receptor recognizes the sequence AGGTCA . Although the LBD and DBD of SHRs share significant homology in both primary amino acid sequence and tertiary structure, the NTD shows little or no sequence homology between the different families of receptor, and structural information is limited. However, the NTD has been found to be important for both activation and repression of steroid-regulated genes. This review will focus on recent advances in our understanding of the structure–function relationships of the SHR-NTDs.
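The response-element architecture described above (two 6 bp half-sites arranged as an inverted repeat and separated by 3 bp) can be illustrated with a short search routine. This is a minimal sketch, not a validated binding-site predictor: the function names are hypothetical, the example sequence is invented, and only perfect matches to the consensus half-sites are reported, whereas natural elements are "palindromic-like" and often deviate from the consensus.

```python
import re

# Half-site consensus sequences quoted in the text.
HALF_SITES = {
    "AR/GR/MR/PR": "AGAACA",
    "ER": "AGGTCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_response_elements(dna: str, half_site: str, spacer: int = 3):
    """List (0-based start, matched sequence) for each perfect inverted repeat.

    The pattern is: half-site, `spacer` arbitrary bases, then the reverse
    complement of the half-site. A lookahead is used so that overlapping
    matches are not skipped.
    """
    dna = dna.upper()
    pattern = re.compile(
        f"(?=({half_site}[ACGT]{{{spacer}}}{reverse_complement(half_site)}))"
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna)]


# Invented example sequence containing one perfect AGAACA-type element.
example = "ccAGAACAtgaTGTTCTgg"
print(find_response_elements(example, HALF_SITES["AR/GR/MR/PR"]))
print(find_response_elements(example, HALF_SITES["ER"]))
```

Allowing mismatches, or scoring against a position weight matrix, would be the natural extension for real promoter sequences; neither is attempted here.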
MAPPING OF FUNCTIONAL DOMAINS WITHIN THE SHR-NTD
Deletion of the LBD of SHRs results in a protein that is now constitutively active in reporter gene assays. This led to the identification of a hormone-independent AF1 (activation function 1) in the NTD. This contrasts with the AF2 activity in the LBD, which is dependent upon the binding of ligand.
Androgen receptor (AR; NR3C4)
In contrast with other members of the SHR subfamily, deletion of the AR-NTD results in a transcriptionally weak protein, providing evidence for the main transactivation function being located within the NTD [20,21]. The AR-AF1 is modular in nature, and regions important for transactivation have been mapped by deletion analysis [20,21], the use of fusion proteins  and by point mutations . These studies identified amino acid residues 101–370 and 360–485 as being critical for receptor-dependent transactivation, and these have been termed TAU1 and TAU5 . A fusion protein containing amino acids 142–485 (human AR), consisting of most of TAU1 and all of TAU5, retained approx. 70% the activity of the full-length NTD when measured in yeast cells (Figure 1) .
Comparison of the amino acid sequence of the AR-NTD from different species has highlighted three regions of conservation: the first 25 amino acids, amino acids 224–258 (within the AF1 domain) and the region immediately adjacent to the DBD (amino acids 500–541) [24–26]. The highly conserved sequence in the AF1 domain has some limited sequence identity with the GR (glucocorticoid receptor) amino acids 66–83  and the highly conserved hydrophobic amino acids have been shown to be important for activity and protein–protein interactions . Naturally occurring mutations in this sequence have also been identified in an animal prostate cancer model [26,27].
The first 20 amino acids of the AR-NTD contain an FxxLF motif (where x is any amino acid) that is important for interactions with the AR-LBD (reviewed in [28,29]). This is a key interaction for receptor function, which stabilizes bound hormone and is important for AR-dependent gene regulation, since mutations that disrupt this interaction impair receptor activity [30–32]. Interestingly, while other SHRs interact with co-activator proteins via LxxLL motifs, which bind in the hydrophobic groove of AF2, the AR-LBD preferentially binds the AR-NTD and co-activators with more bulky hydrophobic residues in the sequence F/WxxLF/W/Y [33,34], which may explain the relatively weak activity observed for AR-AF2. Recently, the structural basis for this preferred binding of the FxxLF motif has been reported [35–37]. X-ray crystallography studies showed that the AR-NTD FxxLF motif forms a ‘charge clamp’ with Glu-897 in helix 12 and Lys-720 at the end of helix 3, and the hydrophobic residues fit better into the surface pocket on the LBD [36,37]. In contrast, an LxxLL motif peptide fails to make hydrogen-bond contacts with the glutamic acid residue in helix 12, and makes fewer hydrophobic contacts with the surface of the LBD [36,37].
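The three motif classes discussed above (FxxLF, LxxLL and the broader F/WxxLF/W/Y pattern) are simple enough to express as regular expressions. The following is a minimal sketch for locating them in a protein sequence; the function name and the test peptide are hypothetical and chosen only for illustration, and a sequence match says nothing about whether a motif is accessible or functional in the folded protein.

```python
import re

# Motif classes named in the text; "." stands for x (any amino acid).
MOTIFS = {
    "FxxLF": "F..LF",
    "LxxLL": "L..LL",
    "F/WxxLF/W/Y": "[FW]..L[FWY]",
}


def find_motifs(protein: str):
    """Map each motif name to a list of (1-based position, matched residues)."""
    protein = protein.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # Lookahead keeps overlapping occurrences from being skipped.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(protein)]
    return hits


# Invented peptide carrying one FxxLF-type and one LxxLL-type motif.
peptide = "GSFQNLFGGLRELLGG"
for motif, found in find_motifs(peptide).items():
    print(motif, found)
```

Note that any FxxLF match is also a match to the broader F/WxxLF/W/Y pattern, which is why the two are reported separately.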
Glucocorticoid receptor (GR; NR3C1)
The AF1 domain of the human GR, originally termed τ1, was mapped by deletion analysis to amino acids 77–262 (Figure 1) , and was subsequently shown to function in yeast cells as a LexA fusion protein , and in yeast or HeLa nuclear extracts in vitro [40–42]. A 58-amino-acid core domain (amino acids 187–244) was mapped by further deletions, which retained 60–70% the activity of full-length AF1 . Comparison of the sequence of the GR from different mammalian species and Xenopus reveals good amino acid conservation for the AF1 domain (32% identity) and the sequences immediately N-terminal of the DBD, amino acids 275–419 (54% identity). However, when fish sequences are included, there is less evidence of amino acid conservation.
Extensive mutagenesis studies revealed an important role for hydrophobic amino acids in AF1 core activity [44,45]. Mutations that reduced the overall acidity of the AF1 domain led to progressive impairment of transactivation . It was speculated that these residues were important structurally, and defined the solvent-exposed surface of the transactivation domain. It is significant, therefore, that mutating key glutamic acid, phenylalanine or tryptophan residues in the corresponding enh2 domain (amino acids 108–317) of the rat GR impaired transactivation activity, but not the ability to repress transcription through protein–protein interactions with the transcription factor AP1 . Such studies emphasize a role for different regions of the GR-NTD in gene regulation, which are likely to be mediated via protein–protein interactions (see below) . More recently, using transactivation domain mutations, Yamamoto and colleagues  showed differential requirements for AF1 and AF2 in target gene expression. These studies emphasize a further level of selectivity in SHR signalling, after hormone and DNA binding that operates at the level of assembly of transcriptionally competent protein complexes on target gene promoters.
Mineralocorticoid receptor (MR; NR3C2)
Initial studies by Govindan and Warriar  identified amino acids 328–382, in the middle of the MR-NTD, as being important for transactivation function (Figure 1). However, more recent studies have mapped two distinct regions as being critical: AF1a was defined as the first 170 amino acids, and AF1b as the 147 amino acids adjacent to the DBD  (Figure 1). The mapping of quite distinct regions of the protein may suggest cell and/or promoter-selective activity for the MR-NTD transactivation function. Significantly, these three regions of the MR-NTD show a high degree of amino acid conservation between the mammalian MR sequences and a number of fish species: amino acids 1–170 (AF1a), 25% identity; amino acids 244–300, 54% identity; and amino acids 459–566, including AF1b, 46% identity.
Oestrogen receptor-α and -β (ERα and ERβ; NR3A1 and 2 respectively)
The ERα AF1 domain is located between amino acids 51 and 149 [52–54] (Figure 1). However, different regions of the protein were shown to have distinct cell-type and promoter selectivity . More recently, the first 40 amino acids, immediately N-terminal of AF1, have been reported to be involved in receptor-dependent transcription and interactions with the ER-LBD . In 1996, Gustafsson and co-workers  identified and cloned a second ER, termed ERβ, which has a distinct NTD. In contrast with ERα, the NTD of ERβ seems relatively weak at transactivation, but the first 31 amino acids appear to be critical for AF1 activity  (Figure 1).
Comparison of the ERα sequence from species as diverse as mammals, birds and Xenopus reveals high amino acid conservation within the first 39 residues (46% identity) and, overall, approx. 29% identity for the AF1 domain. This level of conservation is lost when the sequences from several fish species are analysed.
Progesterone receptor (PR; NR3C3)
There are two forms of the PR: PR-A and PR-B, which differ by the presence of an N-terminal 165-amino-acid extension (reviewed in ). The AF1 domain has been mapped to 91 amino acids preceding the DBD [58,59] (Figure 1). Interestingly, the autonomous function of this domain required the PR-DBD, suggesting intra-domain communication . A second transactivation domain, termed AF3, is present in the PR-B isoform, and consists of the first 165 amino acids  (Figure 1). Comparison of the PR sequences from mammalian species and chicken indicated small regions of high conservation: amino acids 302–326, 71% identity; amino acids 357–390, 39% identity; and a region including AF1 (amino acids 459–566), 46% identity. Strikingly, the sequences corresponding to AF3, that are unique to PR-B, show only 12% amino acid identity under these conditions.
The two isoforms of the PR (A and B), which have AF1 (NTD) and AF2 (LBD) in common, show different patterns of transcriptional activity, with the B-form being in general the stronger activator (reviewed in ). Although this may reflect the presence of an additional activation surface, AF3, in PR-B, a selective and ‘transferable IF (inhibitory function)’ has also been mapped to the 291 amino acids upstream of AF1 (residues 165–455) . Huse et al.  similarly noted a ‘negative modulation domain’ within 120 amino acids of the PR-NTD (amino acids 230–350). The IF was found to suppress the activity of AF1 and AF2, but interestingly not AF3. Furthermore, the inhibitory activity could be transferred to ERα, reducing ER-dependent transactivation when inserted upstream of the NTD . The mechanism for this selective inhibition is not clear, but does not appear to require additional factors, and may involve intramolecular communication with AF1 or AF2 or the blocking of key protein–protein interactions with these domains . More recently, Wang and Simons  identified amino acids 468–508, within the AF1 domain, as being necessary for co-repressor binding (Table 1, and see below). Significantly, this sequence could substitute for a corresponding activity in the GR-AF1 domain (amino acids 154–236 of the rat GR), despite showing no significant sequence homology . This would argue for a structural basis for this inhibitory activity, but since these sequences are necessary for the binding of the co-repressors NcoR (nuclear receptor co-repressor) and SMRT (silencing mediator of retinoid and thyroid hormone receptor), the mechanism is likely to be distinct from the IF activity identified by Horwitz and co-workers .
The AF1 domain or related SHR-NTD activities have been shown to function in a variety of mammalian cell types and in the budding yeast Saccharomyces cerevisiae. Although this suggests a potential conservation in transactivation activity between SHRs, there is also considerable evidence available that the transactivation activity of individual NTDs can exhibit cell and/or promoter selectivity. When taken together, these studies illustrate that different surfaces, even within a single SHR-NTD, can be involved in activation and/or repression of transcription. Furthermore, the strength of the respective SHR-NTDs for activating transcription is also variable. Wilson and co-workers , using Gal4–DBD fusion proteins, recently demonstrated a nice correlation between AF1 activity (and, inversely, AF2 activity) and the length of the NTD, with the longer AR-, PR- and GR-NTDs showing the highest activity.
SHRs are also targets for post-translational modifications, and the NTD is the site of both phosphorylation and sumoylation (reviewed in [63–65]), whereas lysine residues targeted for acetylation or ubiquitination are found in the hinge region between the DBD and LBD (, and references therein). The roles of these modifications are beginning to be elucidated, but the possible effects on structure are less clear. What has also emerged in recent years is that post-translational modifications may be interdependent and not necessarily ‘stand-alone’ events (see [66,67]). All five classes of SHR are phosphoproteins, with key phosphorylation sites mapped to serine, threonine and tyrosine residues. However, the consequences of phosphorylation for receptor action are less clear.
In the case of the AR, the majority of phosphorylated sites have been identified in the NTD. Initial studies reported phosphorylation of the AR on serine residues 81 and 94 , and more recently, Gioeli et al.  identified an increase in phosphorylation of serines 16, 81, 256, 308, 424 and 650 in response to hormone treatment of LNCaP cells. However, the kinase enzymes responsible for phosphorylating the AR at these sites are less well defined. Activation of the MAPK (mitogen-activated protein kinase) and Akt/PKB (protein kinase B) pathways has also been implicated in AR function. Chang and co-workers  correlated phosphorylation of Ser-514, possibly by MAPK, with an increase in the PSA (prostate-specific antigen) gene transcription. A similar increase in PSA expression, coupled with an increase in cell survival, was observed by Wen et al.  after phosphorylation of Ser-213 and Ser-791 by Akt/PKB. In contrast, Lin et al. observed a decrease in AR-dependent gene expression upon phosphorylation of these two residues by Akt . Consistent with Akt/PKB having a positive role, Manin et al.  observed that the phosphoinositide 3-kinase/Akt pathway was involved in an up-regulation of AR expression levels. However, that study did not determine whether this was due to a direct phosphorylation of AR protein.
The GR is phosphorylated in both mammalian and yeast cells in a similar manner, and as with the AR studies described above, phosphorylation enhanced, repressed or had little effect on GR-dependent gene expression [46,74,75]. The human and rat GRs are phosphorylated at Ser-203 and Ser-211 (numbering for human receptor) by cyclin-dependent kinase in yeast and mammalian cells , whereas Thr-171 and Ser-246 are phosphorylated in the rat GR by glycogen synthase kinase-3 and the MAPK, JNK (c-Jun N-terminal kinase), respectively . Phosphorylation of Ser-203 and Ser-211 has been shown to be hormone-dependent, and using phosphospecific antibodies, Garabedian and co-workers  were able to show that the GR was cytoplasmic when phosphorylated on Ser-203, and both cytoplasmic and nuclear when Ser-211 was modified. This supported the hypothesis that phosphorylation of specific sites altered the intracellular location of the GR, and presumably its function. The MR has also been shown to be phosphorylated, but the domain or residues modified have not been identified [78,79].
A number of potential phosphorylation sites exist within the PR-NTD. The AF3 region, which is unique to the PR-B isoform, is phosphorylated, but mutating phosphorylated residues had little effect on transactivation activity of the full-length receptor or the isolated transactivation domain . In contrast, mutating Ser-190 or a cluster of residues adjacent to the AF1 and DBD did reduce transactivation of the PR-A isoform in a promoter- and cell-context-dependent manner , whereas Ser-294 was reported to be a substrate for MAPK, which then targeted the PR for degradation via the proteasomal pathway .
Both ERα and β have been shown to be phosphorylated on serine or threonine residues in the NTD. Phosphorylation of Ser-87 (human ERβ) and the corresponding residues in ERα, Ser-104 and Ser-106, together with Ser-118, was shown to be important for ligand-independent activation of transcription and recruitment of the co-activator protein SRC-1 (steroid receptor co-activator 1) [82–84]. Similarly, binding of p68/p72 has also been shown to be phosphorylation-dependent , as was the interaction between ERα and the orphan receptor COUP-TF1 (chicken ovalbumin upstream promoter-transcription factor 1) . Ser-87 (ERβ) and Thr-311 (ERα) have both been shown to be substrates for MAPK enzymes , and Ser-118 (ERα) is a target for the GTF (general transcription factor) TFIIH/cyclin-dependent kinase 7 activity . Thus several residues in the ER-NTD can serve as targets for cross-talk with other signalling pathways, and lead to modulation of protein–protein interactions.
SUMO-1 (small ubiquitin-related modifier-1) is part of a family of small peptides that are covalently linked to a range of proteins and is the most recent post-translational modification to be described for SHRs. The E2-ligase, ubc9, and members of the PIAS [protein inhibitor of activated STAT (signal transducers and activators of transcription)] family of E3-ligase enzymes have been shown to interact with the six principal SHRs, and to modulate receptor-dependent activation (Table 1) . Interestingly, all the SHR-NTDs with the exception of ERα and β have one or more consensus motif, ψKxE (where ψ represents a hydrophobic residue), which contains an acceptor lysine (K) residue . Sumoylation of the AR, GR and MR leads to repression of transcription, which appears to be promoter-specific and may also depend upon cell type [87–92]. Similarly, the IF activity identified in the PR-NTD has been correlated with sumoylation of the receptor, since mutation of an acceptor lysine residue (Lys-388) abolishes trans-repression . However, the picture is more complicated since the different PIAS family members have been found to activate as well as repress SHR-dependent transcription. This may be a direct effect of the PIAS proteins , or result from sumoylation of co-activator proteins, such as members of the p160 family .
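The consensus motif ψKxE lends itself to a simple sequence scan for candidate acceptor lysines. The sketch below is illustrative only: the function name is hypothetical, the peptide is invented, and the set of residues counted as hydrophobic (ψ) is an assumption that can be changed through a parameter. A consensus match indicates a candidate site, not demonstrated sumoylation.

```python
import re


def find_sumo_sites(protein: str, hydrophobic: str = "VILMF"):
    """List (1-based position of the acceptor lysine, matched tetrapeptide).

    `hydrophobic` is the residue set accepted at the psi position; the
    default is an assumption made for this sketch, not taken from the text.
    """
    pattern = re.compile(f"(?=([{hydrophobic}]K.E))")
    # m.start() is the 0-based index of psi, so the lysine is at start + 2
    # in 1-based numbering.
    return [(m.start() + 2, m.group(1)) for m in pattern.finditer(protein.upper())]


# Invented peptide: IKTE matches the default residue set, AKDE only matches
# if alanine is also accepted at the psi position.
peptide = "GGIKTEGGAKDEGG"
print(find_sumo_sites(peptide))
print(find_sumo_sites(peptide, hydrophobic="AVILMF"))
```

Exposing the ψ residue set as a parameter makes the dependence of the result on that definition explicit.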
The consensus sumoylation motif has previously been described as a SC (synergy control) sequence . Synergism or co-operativity between SHR dimers  and/or other transcription factors  can involve the NTD, and is likely to play an important role in determining the magnitude of the receptor-dependent transcription response at different promoters. Deletion of the SC sequence resulted in increased activity from multiple hormone response elements , and the data from the above studies support the model that this sequence functions, at least in part, to ‘dampen down’ transactivation activity via modification of the receptor protein by SUMO-1.
Post-translational modification of the SHR-NTD can result in changes in intracellular localizations, turnover and protein–protein interactions. The continuing challenge is to determine the functional and/or structural significance of individual modifications, and to determine whether such modifications may actually act co-operatively to regulate receptor function.
PROTEIN–PROTEIN INTERACTIONS WITH THE SHR-NTD
SHRs function in the main as ligand-activated transcription factors that are targeted to regulated genes via DNA-response element recognition and binding. The DNA-bound SHR may then activate transcription through modifying the underlying chromatin organization and by recruitment of the transcription machinery (reviewed in [98–100]). Alternatively, the DNA-bound receptor may repress transcription through so-called negative DNA-response elements (reviewed in ). These actions involve the assembly of multiprotein complexes through receptor–protein interactions, which are discussed below. In addition, certain SHRs, most notably the GR, can regulate gene expression in the absence of specific DNA-binding through protein–protein interactions with other transcription factors and inhibiting their transactivation function [101,102].
The basal transcription machinery
The RNA polymerase II enzyme comprises 10 or 11 subunits, and is highly conserved in all eukaryotes [103,104]. Transcription initiation involves the assembly of a PIC (pre-initiation complex) containing the multi-subunit RNA polymerase II enzyme and up to six GTFs [103,104]. A number of the subunits making up the PIC exhibit DNA-binding activity, most notably the TBP (TATA-binding protein), which is part of the TFIID complex. The assembly of the PIC can follow a stepwise progression in vitro, but most likely involves recruitment of pre-existing protein complexes, for example TFIID [TBP+TAFs (TBP-associated factors)], holo-RNA polymerase II (polymerase enzyme+mediator complex+GTFs) and the multi-subunit TFIIH factor (see [99,105–109]). TFIIH has a number of defined enzymatic activities that play an important role in initiation and promoter escape by the RNA polymerase enzyme, including a helicase activity required to unwind the DNA to create an open complex and a kinase that phosphorylates the C-terminal domain of the large subunit of RNA polymerase II .
Multiple interactions between SHRs and members of the basal transcription machinery have been reported over the past 10 years (Table 1), and include interactions with TFIID/TBP [42,45,110–113], TFIIF [23,111] and TFIIH , and the elongation factors pTEFb  and ELL . It seems likely that such interactions play a part in recruitment of the GTFs and RNA polymerase II, resulting in the assembly of the PIC and/or regulating the early steps of promoter clearance and transcription elongation.
Co-activators: faithful interactions or promiscuous partners?
Evidence for the existence of co-activator proteins originally came from studies involving the phenomenon of squelching [117–119]. However, perhaps what was less expected was the huge number of proteins that would be identified that could bind to and/or modulate SHR activity in vivo. In order to make sense of the myriad of interactions involving the SHR-NTD, it is helpful to group the different target proteins: (1) those that directly regulate the receptor transactivation function, including components of the general transcription machinery and co-activators that may act as a ‘bridge’ between the DNA-bound receptors and the transcriptional machinery, or those that harbour specific enzyme activities, such as histone acetyltransferases or methyltransferases; (2) co-repressors, proteins that mediate receptor-dependent repression of transcription, which may block the transactivation domain or recruit complexes with histone deacetylase activity; and (3) co-regulatory proteins that act indirectly on receptor-dependent transactivation activity by regulating receptor stability or intracellular localization; again, such proteins may have modifying enzymatic activity, such as sumoylation or phosphorylation. Table 1 lists representatives from each of these three groupings that have been reported to bind to the NTD of one or more SHR.
A significant proportion of research effort has been focused on the identification and characterization of co-activator proteins. Co-activators can be defined as proteins that interact directly with transactivation domains, and are recruited to promoter and enhancer DNA sequences. Such proteins should harbour enzymatic activity and/or interact with components of the general transcription machinery, and function to enhance the level of transcription of receptor-target genes. As co-activators were identified and characterized, the idea of receptor-specificity was an attractive possibility with respect to possible mechanism(s) for gene regulation. However, as shown in Table 1, whereas some co-activators appear restricted to one or two SHRs, e.g. ARA24, ARA160 and DRIP150, others, such as CBP [CREB (cAMP-response-element-binding protein)-binding protein] and the SRC proteins, are more promiscuous in their binding profiles. The SRC or p160 family of co-activators represents a particularly interesting group. Members of the p160 co-activator family [SRC-1, SRC-2/TIF2 (transcription intermediary factor 2)/GRIP1 (glucocorticoid receptor-interacting protein 1) and SRC-3/ACTR (activator for thyroid hormone and retinoid receptors)/RAC3/AIB1] were originally identified as LBD-AF2-binding proteins via so-called NR-box (LxxLL) motifs [7,120]. Subsequently, they have been shown to interact with SHR-NTD sequences and, at least to some degree, increase receptor transactivation function [32,121–126]. However, the level of co-activation for different receptors can vary, and may also be dependent on the cellular background. Interactions with the p160 proteins have also been shown to be important for synergism between the SHR-AF1 and -AF2 domains [55,127], and for ligand-independent activity, possibly resulting from phosphorylation of the NTD .
Beyond the abundance of enzymatic activities exhibited by co-activators, or co-regulatory proteins in general, there are the issues of redundancy and of multi-subunit complexes that share subunits but show distinct patterns of recruitment and/or activity. This was most elegantly demonstrated by the recent Herculean analysis of the transcription events at the oestrogen-regulated pS2 promoter . Gannon and co-workers, using ChIP (chromatin immunoprecipitation) and ‘Re-ChIP’ assays, demonstrated the importance of ER binding to the promoter for the recruitment of different multi-subunit complexes that drive cyclic rounds of transcription initiation at the target gene. These workers found evidence for redundancy and selectivity in the ER interactions, and could distinguish several distinct protein complexes needed for transactivation, including p68/TBP, SRC-1 or SRC-3/CBP or Tip60 and/or methyltransferases, SWI/SNF and TRAP (Mediator) . The recruitment of different protein complexes has also been shown for the AR [106,107] and the GR and PR .
In summary, SHR co-activators can function at several steps during transcription, from recruitment of chromatin-remodelling complexes (SWI/SNF) and histone-modifying complexes [e.g. SAGA (Spt-Ada-Gcn5 acetyltransferase) and CBP–p/CAF (CREB-binding protein-associated factor)] to the regulation of transcription initiation and elongation by the polymerase II holo-enzyme. Initially, it was presumed that co-activators made specific interactions with specific receptors. Indeed, the transactivation potential of many co-activators differs, depending on cell type and receptor employed. Although the majority of co-activators identified to date have been shown to interact with more than one SHR (Table 1), there are likely to be subtle levels of control depending on the levels of co-activator, post-translational modifications and the architecture of target gene promoters. Thus further studies of co-activators and SHR relationships will need to address these issues in order to confidently dissect the molecular and functional interactions.
Co-repressors: more than a road block?
Studies relating to SHR co-repressors have been somewhat overshadowed by the wealth of information gathered relating to co-activators. However, the co-repressors themselves provide an interesting insight into transcriptional repression and the balance between switching genes ‘on’ and ‘off’.
Table 1 illustrates a selection of co-repressors that bind to SHR-NTDs. There generally appear to be fewer AF1-interacting co-repressors in comparison with co-activators. This may reflect a bias in scientific focus, which has generally been directed towards co-activators and ligand (AF2)-dependent gene regulation. Alternatively, the cell may not require a large array of co-repressors to switch ‘off’ transcription; indeed, a handful of co-repressors may be sufficient to accurately fulfil the task in hand.
Co-repressors respond to both inter- and intra-cellular signals when switching ‘off’ transcription. In response to specific cues, co-repressors will bind sites on SHRs, which are distinct but may overlap with the transactivation functions AF1 or AF2, and recruit histone deacetylase complexes, effectively re-condensing the chromatin and packing away target genes. In addition, co-repressors may also affect receptor turnover by recruiting the 26S proteasome subunit or modulate cellular signalling cascades (reviewed in ). Interestingly, co-repressors may physically alter the flexible N-terminal transactivation domain and block the recruitment of co-activators engaging in transcription . For example, elegant AF1 domain-swapping experiments on the GR and PR reveal that the co-repressors NCoR and SMRT bind to both the N- and C-termini, highlighting the importance of intramolecular interactions . In addition, irrespective of the different affinities of co-repressor–GR/PR binding, a TIF2 (SRC-2) peptide was able to compete for AF1 binding. It has already been shown that N/C intra-/inter-domain bridging is important for a fully functioning SHR [28,29], so it is plausible that an additional mechanism of co-repressor action is to inhibit this interaction and prevent recruitment of co-activators such as the p160 family of proteins, thus down-regulating transcription of target genes.
Selective interactions between co-repressors and specific SHRs may complement the actions of the more ‘general’ co-repressors (NCoR, SMRT and Daxx). For example, the basic helix–loop–helix orange protein, HEY-1, appears to be specific for the AR . Discovered through a yeast two-hybrid screen, HEY-1 was shown to interact with residues in the AF1 domain. Furthermore, the authors noted that HEY-1 interacts, albeit weakly, with the GR, PR and ERα, but functions as a specific co-repressor for AR in reporter gene assays . A second co-repressor, FLASH (Fas-associated huge protein), has been shown to interact with both MR and GR AF1 domains . Although that study indicates that FLASH up-regulates MR- and GR-dependent transcription, two further studies by Chrousos and co-workers [130,131] indicate that FLASH is able to repress the actions of the GR, PR and AR, and has no effect on the ER. Such differences are intriguing, and again may reflect differences in cellular environments and the balance between endogenous co-activator and co-repressor proteins.
Taken together, it seems that co-repressors may act in several ways. There may be specific interactions between co-repressors and selective SHRs that may depend on cell context and type. In addition, more general forms of co-repression may involve abrogation of intramolecular bridging or competition with co-activators for similar binding sites or regions.
Co-regulatory proteins represent an interesting set of factors that can affect SHR-dependent transactivation, and include molecular chaperones and a number of proteins with specific enzymatic activities, for example, acetyltransferases and ubiquitin- and SUMO-conjugating enzymes (Table 1). Chaperones such as the BAG-1 family of proteins have been shown to increase AR-dependent transcription, presumably by recruiting a complex of co-activators or through active cycling of the receptor protein . In stark contrast to the AR, BAG-1 proteins have been shown to inhibit GR-dependent transactivation by binding to the hinge region of the receptor .
The importance of the SHR-NTD for receptor-dependent gene regulation is underpinned by the large amount of data amassed on the binding of GTFs, co-activators, co-repressors and co-regulator proteins. The interplay between transactivation and transrepression is likely to be dependent on both receptor type and discrete, but possibly overlapping, amino acid sequences. The differential binding of co-regulators has been implicated in both gene- and cell-specific expression, and taken in context suggests that a large and varied response to stimuli can be achieved in vivo.
STRUCTURAL ANALYSIS OF THE AF1 TRANSACTIVATION DOMAIN
As indicated earlier, there is little, if any, sequence homology between the different SHR-NTDs. However, analysis of the amino acid composition of these domains suggests some shared characteristics. The NTDs of SHRs have a high proportion of proline and/or serine residues (greater than 10% of the total amino acids). The AF1 domains of the AR and ERα show a similarly high content of these residues, and for all the SHRs, the regions mapped as being important for transactivation are rich in the amino acids glycine and/or leucine relative to the corresponding NTD. With the exception of the PR-AF1 domain, the mapped transactivation domains all have acidic isoelectric points (pI 4.4–5.0), which are higher than that of the archetypal acidic activator VP16/Vmw65 (pI 3.4), suggesting that they may not necessarily fit the paradigm of acidic activators. The PR-AF1 domain stands out, with a pI of 7.6 and less than 5% acidic residues.
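The compositional comparison described above can be illustrated with a minimal Python sketch. The function name `fraction_of` and the 20-residue example fragment are hypothetical and chosen only for illustration; the fragment is not a real SHR sequence, and the sketch does not attempt to calculate a pI.

```python
def fraction_of(seq: str, residues: str) -> float:
    """Fraction of positions in a one-letter protein sequence that are
    occupied by any of the residues listed in `residues`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(aa) for aa in residues) / len(seq)


# Hypothetical 20-residue fragment, for illustration only
# (this is NOT a real steroid receptor sequence).
example = "MSPSPLEDGSPLSDEGLPSA"

pro_ser = fraction_of(example, "PS")   # proline + serine content
acidic = fraction_of(example, "DE")    # aspartate + glutamate content

print(f"Pro+Ser: {pro_ser:.0%}, acidic (Asp+Glu): {acidic:.0%}")
```

Applied to a full NTD sequence, the same two fractions could be compared with the thresholds quoted in the text (greater than 10% proline and/or serine; less than 5% acidic residues for PR-AF1).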
In contrast with the DBD and the LBD, there are no high-resolution structures available to date for the NTD of any member of the nuclear receptor superfamily. In the case of the SHR, the lengths of the NTDs are likely to contribute some practical challenges in terms of structure determination, and the situation is complicated further by the lack of amino acid sequence identity and the functional properties of the AF1 domains (as discussed below). However, in the absence of high-resolution structures, significant structural information has been obtained from CD, NMR, fluorescence and Fourier transform infrared (FTIR) spectroscopy, together with secondary structure prediction algorithms and biochemical analysis.
Secondary structure and regions of naturally disordered structure can be predicted on the basis of amino acid sequence and composition. For each SHR-NTD, the distribution of secondary structure types (α-helix, β-strand and coil) is shown in relation to the mapped transactivation domains (Figure 2). It can be seen that the MR-AF1a, GR-AF1, AR-AF1 and ERα-AF1 domains potentially consist of a mixture of α-helix, β-strand and coil conformations. In contrast, PR-AF1 and AF3 are likely to be predominantly α-helix, whereas MR-AF1b and ERβ are predicted to be exclusively β-strand (Figure 2). In the case of the AR-AF1, this predictive analysis has proved informative when combined with mutagenesis to disrupt suspected helical regions [22,134]. Figure 3 shows the relative proportion of α-helix, β-strand and non-ordered/other structure for the different transactivation domains. With the exception of the MR-AF1a/b domains, the mapped transactivation domains are predicted to have similar levels of α-helix (15–20%) and generally less than 10% β-strand. In contrast, the MR-AF1a and -AF1b domains have 40 and 0% α-helix respectively, and MR-AF1b is predicted to be predominantly β-strand (greater than 20%; Figure 3).
It is becoming increasingly apparent that regions or even whole domains of naturally disordered structure can play an important role in protein function (reviewed in [135,136]). The presence of a high proportion of proline, serine and glycine residues in the NTDs of steroid receptors conforms to the amino acid compositional bias associated with a signature for intrinsic disorder . The trained neural network program, PONDR®, uses primary amino acid sequences to predict the occurrence of such regions. Figure 2 shows the PONDR® score plot indicating regions of disorder (threshold 0.5 or above) for each of the six principal SHR-NTDs. Regions of 40 or more consecutive residues represent the highest probability of naturally disordered structure (Figure 2, solid bar). There is a good inverse relationship between predicted secondary structure elements and regions of disordered structure. It is also striking that regions of highest probability for disordered structure generally overlap with regions important for transactivation (Figure 2).
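The rule used to mark the highest-probability disordered regions (a score at or above 0.5 sustained over 40 or more consecutive residues) can be sketched in Python as follows. This is a minimal illustration that assumes per-residue scores are already available as a list of numbers; it does not call PONDR® itself, and the function name `disordered_regions` and the example scores are hypothetical.

```python
def disordered_regions(scores, threshold=0.5, min_length=40):
    """Return (start, end) residue ranges (1-based, inclusive) in which
    every score is >= threshold for at least min_length consecutive
    residues."""
    regions = []
    run_start = None  # 0-based index where the current run began
    for i, score in enumerate(scores):
        if score >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                regions.append((run_start + 1, i))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(scores) - run_start >= min_length:
        regions.append((run_start + 1, len(scores)))
    return regions


# Hypothetical scores: 10 ordered residues, 45 disordered, 5 ordered,
# then 20 disordered (too short to be reported).
scores = [0.2] * 10 + [0.8] * 45 + [0.3] * 5 + [0.7] * 20
print(disordered_regions(scores))  # [(11, 55)]
```

Only the 45-residue stretch is reported; the final 20-residue stretch scores above the threshold but falls short of the 40-residue minimum.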
The actual secondary structure content has been determined for the AR-AF1 [134,137], ERα/β-NTD  and GR-AF1 and AF1 core [138,139] by CD and FTIR spectroscopy. The far-UV CD spectra for all five polypeptides in aqueous solution are characterized by large minima at 190 nm, which is characteristic of proteins with little stable secondary structure content. The experimentally measured α-helical content of AR-AF1 and GR-AF1 was 13–16% and 27% respectively, with non-ordered structure accounting for 24–36% of the AR and 39% of the GR polypeptide [134,137,139]. Both polypeptides adopt a more α-helical conformation, at the expense of non-ordered or possibly β-structure, when placed in a hydrophobic environment or in the presence of the natural osmolyte TMAO (trimethylamine-N-oxide) [134,137–140]. Folding of the GR-AF1 and AR-AF1 domains, in the presence of TMAO, to a more structured conformation was also followed by measuring the steady-state fluorescence emission spectrum for aromatic amino acids and by partial proteolysis [113,134,140]. Under these conditions, the AF1 domain adopts a more protease-resistant conformation, with the tryptophan residues being less solvent-exposed. Three induced α-helical regions within the GR-AF1 core were identified by NMR spectroscopy in the presence of the hydrophobic solvent TFE (trifluoroethanol) . The ability to form α-helix structure was correlated with transactivation potential of the GR through the introduction of proline mutations to disrupt the helical segments and alter the secondary structure of the receptor polypeptide [138,141]. Similarly, introduction of mutations into the AR-AF1 region to disrupt putative helical regions led to alterations in the partial proteolysis profile, consistent with a loss of structure and supporting the existence of α-helical segments in the polypeptide . A common feature of transactivation domains, apart from the general lack of stable structure, has been their modular nature. 
This was illustrated elegantly for the GR-AF1 core domain by analysing the activity of individual and multiple combinations of helices 1 and 2 . These studies demonstrated that, although helices 1 and 2 were able to bind to target proteins (Ada2 and CBP), neither was capable of activating transcription alone. However, multiple copies of either helix or, even better, combinations of the two were as active as the original AF1 core polypeptide . These findings emphasize the importance of multiple protein–protein interactions and the structural architecture of the transactivation domain for function. Taken together, the biophysical and biochemical analysis of the conformation of the AR and GR transactivation domains revealed structurally flexible polypeptides that could fold into more compact conformations in the presence of structure-stabilizing solutes (TFE and TMAO); this folded structure may involve a coil-to-helix transition.
Although the structural predictions for the PR-NTD are not significantly different from those of the other SHRs (Figures 2 and 3), this domain is thought to have significant structure, as revealed by limited proteolysis, which was stabilized further by the presence of the DBD [143,144]. Physicochemical properties suggested that the NTDs of both PR-A and PR-B were monomeric in solution, with a non-globular, extended conformation, and that the AF3 region of the B-isoform, in contrast with AF1, showed a more extended conformation [143,144].
Another procedure that has proved informative, in the absence of high-resolution structures and sequence similarity, is HCA (hydrophobic cluster analysis), again based on examination of the primary amino acid sequence [145,146]. This technique is based on pattern recognition and has proved useful in revealing structural relationships in the absence of any obvious sequence homology; it is therefore ideal for analysing SHR-NTDs. The amino acid sequence is duplicated and represented on a flattened α-helical structure (without making assumptions about the underlying structure), and hydrophobic residues are circled, revealing patches of hydrophobicity, which can be related to the folded conformation. Figure 4 shows the HCA plots for the transactivation domains of six SHR-NTDs. Similarities in the HCA profiles, even in the absence of clear sequence alignments, have helped to predict that proteins may share a similar folded conformation [145,146]. From the plots shown, there are clear patches of hydrophobic residues (Val, Ile, Leu, Phe, Met, Trp and Tyr) and clusters of both proline residues (stars), which break possible secondary structures, and glycine residues (diamonds), which confer flexibility in the polypeptide chain. However, there is no obvious similarity in the patterns of hydrophobic residues among the different transactivation domains (Figure 4). This is in striking contrast with a similar analysis of the AR-, GR- and PR-DBD/LBD sequences, which show high similarity between the HCA plots (I. J. McEwan, unpublished work) and which would be expected from the crystal structures of these SHR-LBDs. Thus the analysis fails to reveal any likely shared conformational folding between the different SHR transactivation domains, but may prove useful in studying the folding of the individual domains. 
This might suggest that, although the SHR-NTD share the general properties of being structurally flexible, with little stable secondary structure and the propensity to adopt α-helix conformation, the overall folding of these domains may result in distinct conformations, which in turn may depend on cellular and target gene environment.
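The idea of grouping hydrophobic residues into clusters, with proline acting as a breaker, can be illustrated with a deliberately simplified Python sketch. It works along the linear sequence only and does not reproduce the two-dimensional helical-net plot used in HCA; the joining rule (`max_gap=3`), the function name `hydrophobic_clusters` and the example sequence are assumptions made for illustration, not the published procedure.

```python
# Hydrophobic residues as listed in the text: Val, Ile, Leu, Phe, Met, Trp, Tyr.
HYDROPHOBIC = set("VILFMWY")


def hydrophobic_clusters(seq, max_gap=3):
    """Group hydrophobic residues into clusters along the sequence.

    Two hydrophobic residues join the same cluster when at most `max_gap`
    other residues separate them and no proline lies in between.
    Returns (start, end) positions, 1-based and inclusive.
    """
    clusters = []
    start = end = None  # 0-based bounds of the cluster being built
    for i, aa in enumerate(seq.upper()):
        if aa == "P":
            # Proline closes any open cluster.
            if start is not None:
                clusters.append((start + 1, end + 1))
            start = end = None
        elif aa in HYDROPHOBIC:
            if start is None:
                start = end = i
            elif i - end - 1 <= max_gap:
                end = i
            else:
                clusters.append((start + 1, end + 1))
                start = end = i
    if start is not None:
        clusters.append((start + 1, end + 1))
    return clusters


# Hypothetical sequence, for illustration only.
print(hydrophobic_clusters("LVSGIPAAFWDDDDDEY"))  # [(1, 5), (9, 10), (17, 17)]
```

Comparing the cluster lists obtained for two domains gives only a crude impression of whether their hydrophobic patterns resemble one another; the graphical HCA plots in Figure 4 carry considerably more information.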
Allosteric regulation: role of DNA-response element binding
There is increasing evidence to support the view that DNA-response elements can play a more active role in transcription factor action than merely tethering the protein to the DNA (reviewed in ). Thus DNA binding may cause structural changes within the SHR-NTD, and reciprocal intramolecular-domain communications may influence response element recognition and/or binding affinity.
Evidence for DNA binding affecting the structure of the NTD of steroid receptors comes again from spectroscopic analysis and sensitivity to limited proteolysis. Changes in the structure of the GR-NTD were observed by both near-UV CD and fluorescence spectroscopy, with the changes possibly involving β-sheet secondary structure and folding of the domain , whereas Greenfield et al.  observed an increase in thermal stability and α-helical content upon binding of the full-length ERα to DNA. Given the known structural content of the DBD and LBD, these changes in conformation are most likely to reflect folding of the NTD and/or the hinge region. Evidence for intradomain communication within ERα was originally suggested from studies using domain-specific antibodies , and was supported further by the observation of altered protease sensitivity for both ERα and β when bound to different EREs (oestrogen-response elements) [151–153]. In the case of ERβ, binding to different EREs led to distinct patterns of recruitment of members of the p160 co-activator family and strongly supported a link between protein–protein interactions and SHR-NTD conformation . More recently, binding to specific DNA sequences was shown to alter the steady-state fluorescence emission spectrum for a construct of the AR-NTD-DBD and alter the pattern of proteolysis for the receptor fragment  or the full-length receptor . In addition, it was found that the presence of the AR-NTD reduced the affinity of the DBD for binding to both selective and non-selective DNA-response elements . Interestingly, although the presence of the DBD stabilized the hinge and NTD of the PR, there was no additional effect observed in the presence of a specific DNA-response element . Taken together, these studies highlight the possibilities for domain communication within the steroid receptor molecule, and the potential reciprocal effects on protein folding and DNA binding.
Allosteric regulation: role of target protein binding
All the structural analyses to date on the isolated SHR-NTD/AF1 indicate that these regions of the protein are likely to lack stable structure, but adopt a more folded conformation under certain conditions. This does not exclude the possibility that, in the context of the full-length receptor, the structure of the NTD may be more ordered and stabilized by the presence of the DBD and/or LBD. However, in recent years there has been increasing interest in proteins or protein domains that appear natively unstructured, but are nonetheless functionally important [135,136]. The proportion of naturally disordered proteins in eukaryotes has been estimated to be as high as 35–51% . A number of functions have been proposed for intrinsically unstructured regions that seem particularly relevant to a discussion of SHR-NTDs, given their role in multiple protein–protein interactions and gene regulation. The first is in molecular recognition, where structural flexibility: (a) allows high specificity, without the need for high affinity; (b) allows multiple different interactions to be made; and (c) results in a relatively large interaction surface as the protein folds around the binding partner [135,136]. Secondly, natural disorder in SHR-NTDs may help to facilitate molecular assembly by allowing proteins to form loose groupings, which then undergo a folding step to form the functional complex . Thirdly, sites of post-translational modification are often found in unstructured regions, which allow access to the amino acid side chain by the modifying enzyme .
Kumar, Thompson and co-workers have shown that folding of the GR-AF1 domain with TMAO resulted in enhanced binding of the targets TBP, CBP and the p160 co-activator, SRC-2 (GRIP1) . These authors have proceeded to show that the binding of TBP to GR-AF1 resulted in a coil-to-α-helix transition and alteration in the chemical shifts of glycine residues present in the GR-AF1 core region . TBP binding also results in a structural change in the ERα-NTD, consistent with an increase in α-helix/β-sheet secondary structure .
The interaction of the AR-AF1 domain with the large subunit of the GTF, TFIIF, also induces folding and an increase in α-helix structure similar to that seen with TMAO [134,137]. Furthermore, folding of the AF1 domain by either TMAO or TFIIF binding resulted in enhanced binding of the co-activator protein SRC-1a . This is a significant finding, since it suggests, together with the data for GR-AF1, that induced folding of the AF1 domain leads to the creation of surfaces for further protein–protein interactions, and may illustrate a mechanism for regulating assembly of different transcriptionally competent complexes. The model would be that some interactions induce folding (Figure 5), whereas other interactions would require a more ordered binding surface. Alternatively, as suggested above, there could be concomitant folding of the receptor transactivation domain with the different target proteins, resulting in the assembly of a multi-subunit, transcriptionally competent complex. These models will need to be tested in order to fully understand SHR-NTD structure and function.
STRUCTURAL BASIS FOR STEROID RECEPTOR ACTION
Thus, in spite of the lack of crystal structure data, a large body of evidence has been gathered supporting an induced structure model for the SHR-NTD/AF1 upon DNA and/or protein binding. Figure 5 illustrates this model, and indicates that different conformers of the SHR-NTD/AF1 may exist in equilibrium (parts a and b), with regions of partial structure stabilized by intramolecular interactions with the DBD (shown) and/or the LBD (not shown). Upon specific protein–protein interaction, there is induction of folding and/or further stabilization of structural elements within the NTD (Figure 5c), which is important in the assembly of a transcriptionally competent receptor complex. Such a model is attractive, as it allows for specificity and multiple target protein binding to the SHR-NTD. Interestingly, the few NTD-target protein interactions that have been analysed kinetically suggest that protein–protein-binding affinities are in the micromolar range ([112,132], and D. N. Lavery and I. J. McEwan, unpublished work). Thus induced folding and the creation of new binding sites within the NTD suggest a way of regulating the different interactions required to assemble transcriptionally competent complexes. It is also possible to include in this model of induced transactivation domain folding distinct roles that charged and hydrophobic amino acids may play. Wright and co-workers have argued, on the basis of kinetic evidence, for a fast step involving long-range charged interactions, followed by a slow folding step involving the burying of hydrophobic residues at the binding surface . In addition to the kinetic data, support for this model of AF1 structure and function comes from mutational analysis, highlighting the importance of charge [46,48], hydrophobicity [25,44,48] and structural flexibility .
Steroid receptors are targeted to specific DNA sequences usually located in the promoter and/or enhancers of target genes. The influence of the steroid receptor is likely to be at one or more of the different steps of the transcription cycle. For example, the DNA-bound receptor may recruit chromatin-remodelling complexes (SWI/SNF) or histone-modifying enzymes (histone acetyltransferases, methyltransferases and histone kinases) in order to open up the chromatin structure to other transcription factors and/or the transcription machinery. However, SHRs can also recruit the transcription machinery either directly through interactions with GTFs or indirectly via co-activator proteins. Subsequent to transcription initiation by RNA polymerase II, steroid receptors may act to regulate transcription elongation or mRNA processing. The different actions of steroid receptors on target gene regulation will depend upon the nature of the DNA-response element and the binding of different combinations of target proteins, and the characteristics of these interactions. Taken together, some or all of these macromolecular interactions are likely to influence the structure of the AF1 domain.
CONCLUSIONS AND FUTURE PERSPECTIVES
The last 25 years have seen tremendous progress in our understanding of the structure–function relationships of members of the nuclear receptor superfamily. This has included the isolation of receptor cDNAs, the solving at atomic resolution of the structure of the isolated DBD and LBD, the identification of binding partners, and a clearer appreciation of the role of mutations in nuclear receptors, resulting in a wide range of pathological states. The NTD is variable in terms of amino acid sequence and length, but has been shown to be important for SHR-dependent gene regulation, and is a major site for post-translational modifications. Thus, in the next quarter century, there will continue to be both challenges to, and advances in, our understanding of the structure and function of this domain. In particular, it will be important to: (i) identify gene targets and regulatory networks for each SHR, which are dependent upon the NTD; (ii) study the role of different post-translational modifications for SHR action, and identify the enzymes responsible and potential interdependence of post-translational events for receptor function; and (iii) arrive at a better structural understanding of the SHR-AF1/NTD. Ultimately, this will depend upon the solving of a high-resolution structure (perhaps of the ER with the smallest NTD or one of the non-steroid receptor members of the superfamily). In the meantime, better quantification (binding kinetics, affinity and stoichiometry) of individual receptor–protein interactions would provide a better picture of the molecular details underpinning receptor-dependent transactivation.
As befits such a large family of proteins involved in development, metabolic regulation and reproduction, there is intensive research into all aspects of SHR structure and function. In the future, it is assured that there will be further dramatic developments as researchers strive to understand the biology and physiology of SHRs, the structural and biochemical basis for receptor action and the role genetic alterations in receptor signalling play in disease.
D.N.L. is supported by a PhD studentship funded by the AICR (Association for International Cancer Research). Work in the I.J.M. laboratory is supported by the AICR and the Biotechnology and Biological Sciences Research Council (U.K.). We are also grateful to colleagues in the field for fruitful and stimulating conversations on the questions addressed in this review.
Abbreviations: AF, activation function; AR, androgen receptor; CBP, CREB (cAMP-response-element-binding protein)-binding protein; DBD, DNA-binding domain; ERα/β, oestrogen receptor α and β respectively; ERE, oestrogen-response element; FLASH, Fas-associated huge protein; FTIR spectroscopy, Fourier-transform infrared spectroscopy; GR, glucocorticoid receptor; GRIP1, glucocorticoid receptor-interacting protein 1; GTF, general transcription factor; HCA, hydrophobic cluster analysis; IF, inhibitory function; LBD, ligand-binding domain; MAPK, mitogen-activated protein kinase; MR, mineralocorticoid receptor; NCoR, nuclear receptor co-repressor; NTD, N-terminal domain; PIAS, protein inhibitor of activated STAT; PIC, pre-initiation complex; PKB, protein kinase B; PR, progesterone receptor; PSA, prostate-specific antigen; SC, synergy control; SHR, steroid hormone receptor; SMRT, silencing mediator of retinoid and thyroid hormone receptor; SRC-1, steroid receptor co-activator 1; TBP, TATA-binding protein; TFE, trifluoroethanol; TIF2, transcription intermediary factor 2; TMAO, trimethylamine-N-oxide
- The Biochemical Society, London