The assembly of proteins into amyloid fibrils can be both an element of protein aggregation diseases and a functional unit in healthy biological pathways. In both cases, it must be kept under tight control to prevent undesired aggregation. In normophysiology, proteins can self-chaperone amyloidogenic segments by restricting their conformational flexibility in an overall stabilizing protein fold. However, some aggregation-prone segments cannot be controlled in this manner and require additional regulatory elements to limit fibrillation. The present review summarizes different molecular mechanisms that proteins use to control their own assembly into fibrils, such as the inclusion of a chaperoning domain or a blocking segment in the proform, the controlled release of an amyloidogenic region from the folded protein, or the adjustment of fibrillation propensity according to pH. Autoregulatory elements can control disease-related as well as functional fibrillar protein assemblies and distinguish a group of self-regulating amyloids across a wide range of biological functions and organisms.
- BRICHOS domain
- HET-S prion
- melanin synthesis
- proinsulin C-peptide
- protein misfolding
- spider silk
For every polypeptide chain, a specific conformation or ensemble of conformations exists that represents its native fold. Yet, many, if not most, proteins can in addition adopt an amyloid state, in which the protein backbone forms a cross-β-sheet structure capable of assembly into elongated unbranched fibrils [1–3]. The reasons why certain proteins can convert from native into amyloid folds in vivo are largely unknown, but can involve mutations, as exemplified by familial Danish and British dementias [4,5], proteolytic processing, as exemplified by aggregation of Aβ1–42 (amyloid β-peptide) in AD (Alzheimer's disease), or post-translational modifications, as exemplified by the hyperphosphorylation of the AD-associated tau protein. Once misfolded, proteins can even cause the conversion of other folded proteins into amyloid. The transmission of protein aggregation disease has been described in the case of misfolded prions, but has more recently also been demonstrated using other amyloid-forming proteins as infectious agents.
However, the ability to form amyloid is not always related to aberrant protein production or modification events. Insulin and SP-C (lung surfactant protein C) are two potentially amyloidogenic proteins even in their normal biologically active forms. Additionally, the assembly of a protein into fibrillar structures can serve a specific biological purpose, e.g. in the form of spider silk, non-genetic inheritance in yeast or bacterial surface adhesion, and must therefore be promoted under specific circumstances .
The potential of a protein to form amyloid can be predicted from the sequence by calculation of the β-sheet propensity of a protein segment [11,12], by correlation of its secondary structure preference with amino acid polarity  or by estimation of the protection provided by the surrounding protein fold . A structure-based prediction method assesses the ability of the side chains in two protein segments to form a ‘steric zipper’ as two complementary β-sheets, a structure found in microcrystals of amyloid-forming protein fragments . Genome-wide screens using both sequence-  and structure-  based prediction methods show that 10–30% of the proteome contains potentially amyloid-forming segments. Mapping of these regions in the available protein structures deposited in the PDB, however, suggests that their geometry and localization in the folded protein makes an assembly into β-sheets improbable. Such regions are preferentially buried inside the protein and significantly underrepresented in flexible regions, with a total of <0.1% fulfilling all criteria for the classification as an amyloid-formation ‘hotspot’ . Liberation of these segments, e.g. by flanking with glycine residues, proteolysis or mutagenesis, can facilitate amyloid formation by a soluble protein .
The implications from this are three-fold: (i) some segments that are prone to form amyloid elude evolutionary modification, which suggests a functional importance and positive evolutionary pressure for their non-elimination in spite of their potential danger; (ii) protein folds have instead evolved to constrain amyloid-forming regions by balancing factors; and (iii) the formation of non-functional amyloid can be linked to aberrant liberation of amyloidogenic segments through miscleavage, post-translational modifications, mutations or interactions with other misfolded proteins, all of which can disturb the balance between amyloidogenic and chaperoning segments.
As can be inferred from this introduction, there are instances of natural biologically active proteins that have a high propensity to form amyloid even under physiological conditions. In our laboratories, we have investigated how this propensity is controlled in three proteins that form amyloid or amyloid-like structures: SP-C, insulin and spider silk protein. These studies led us to compare our findings with the autoregulatory strategies that have been described for other amyloid-forming proteins. In the present review, we evaluate the group of self-regulating amyloids and illustrate their diverse biological functions and molecular mechanisms (Table 1).
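To illustrate the simplest of the sequence-based predictions mentioned above, the following Python sketch scans a sequence with a sliding window and reports segments with a high mean β-sheet propensity. It is not the algorithm of any of the cited prediction methods: the propensity scale is approximately the classic Chou–Fasman β-sheet scale, and the window length, the threshold and the toy sequence are arbitrary assumptions chosen for illustration.

```python
# Illustrative sliding-window beta-sheet propensity scan.
# Scale: approximately the classic Chou-Fasman P(beta) values (assumption).
BETA_PROPENSITY = {
    "V": 1.70, "I": 1.60, "Y": 1.47, "F": 1.38, "W": 1.37,
    "L": 1.30, "C": 1.19, "T": 1.19, "Q": 1.10, "M": 1.05,
    "R": 0.93, "N": 0.89, "H": 0.87, "A": 0.83, "S": 0.75,
    "G": 0.75, "K": 0.74, "P": 0.55, "D": 0.54, "E": 0.37,
}


def window_scores(sequence, window=6):
    """Return (start_index, mean propensity) for every full-length window."""
    seq = sequence.upper()
    scores = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        # Raises KeyError for non-standard residue letters.
        mean = sum(BETA_PROPENSITY[res] for res in segment) / window
        scores.append((start, mean))
    return scores


def candidate_segments(sequence, window=6, threshold=1.3):
    """Return (start, segment, rounded score) for windows at or above threshold.

    The default window and threshold are arbitrary illustrative choices.
    """
    seq = sequence.upper()
    return [
        (start, seq[start:start + window], round(score, 2))
        for start, score in window_scores(seq, window)
        if score >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: a valine stretch and an alanine stretch
    # separated by Gly/Ser spacers. It is not a real protein sequence.
    toy = "GSGSVVVVVVGSGSAAAAAAGS"
    for hit in candidate_segments(toy):
        print(hit)
```

In this toy sequence, only windows overlapping the valine stretch pass the threshold, whereas the alanine stretch does not, in line with the opposite secondary structure preferences of valine and alanine discussed later for SP-C and MaSp1. A scan of this kind ignores the structural context of a segment, which is why the mapping on to folded structures described above reduces the number of predicted ‘hotspots’ so strongly.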
CONTAINMENT OF AN AMYLOIDOGENIC PEPTIDE USING A SPECIALIZED DOMAIN
Whereas flexibility constraints help to prevent amyloid formation in folded proteins, some highly aggregation-prone sequences cannot be ‘caged’ in this manner, as their biological function requires free accessibility or a non-proteinaceous environment. An example is SP-C, an extremely hydrophobic TM (transmembrane) peptide consisting of 35 amino acid residues, which is released from its 197-amino-acid precursor, proSP-C, by proteolytic cleavages [19,20]. Mature SP-C forms a TM α-helix that is secreted into the alveoli as part of the lung surfactant and prevents the lung from collapsing at the end of expiration . The α-helix of SP-C has a very unusual amino acid sequence consisting mainly of valine residues. Owing to the high β-sheet propensity of valine, SP-C is easily converted into a β-strand conformation and forms amyloid-like fibrils in vitro [21–25], as well as in vivo in association with pulmonary fibrosis .
The C-terminal part of proSP-C contains a BRICHOS domain, which has been found in 12 different protein families associated with diseases, such as the dementia-related Bri2 protein [27,28]. In proSP-C, the BRICHOS domain functions as an intramolecular chaperone by assisting the TM part of SP-C to form an α-helix that can be inserted into the ER (endoplasmic reticulum) membrane [26,29,30] (Figure 1).
It was shown recently that mutations in the BRICHOS domain, or in the linker that connects the BRICHOS domain to the TM region, lead to SP-C aggregation and amyloid deposits in the lungs of patients suffering from interstitial lung disease . It is believed that the proSP-C BRICHOS domain before membrane insertion specifically captures a β-hairpin structure, formed by the linker region and the TM part of proSP-C .
The proSP-C BRICHOS domain has no distinct sequence-specific binding preferences, but rather recognizes stretches of amino acid residues that promote membrane insertion and are in a non-helical conformation [31,32]. The AD-associated Aβ peptide contains a TM region and a helix that is predicted to form a β-sheet, similar to SP-C, and can fold into a β-hairpin structure, which makes this whole segment a possible target for the anti-amyloid activity of BRICHOS. In line with this function, the proSP-C BRICHOS domain binds to Aβ and inhibits fibril formation at substoichiometric concentrations of BRICHOS in vitro . Furthermore, the proSP-C BRICHOS domain also inhibits fibril formation of medin, which forms amyloid in the aortic wall [31,33].
The BRICHOS domain of the Bri2 protein exhibits the same activity against Aβ fibril formation as the one found in proSP-C and has been shown to bind its C-terminally derived peptide Bri23, which also has a high β-sheet propensity . All proteins in the BRICHOS family, with the exception of proSP-C, have a C-terminal region outside the BRICHOS domain that contains at least two stretches with predicted preference for β-sheet formation, interrupted by a short coil region . This might indicate that the function of the BRICHOS domains is to chaperone these β-sheet-prone regions in precursor proteins, thereby preventing aggregation and amyloid formation.
STABILIZATION IN CIS AND IN TRANS BY A PROPEPTIDE SEGMENT
The peptide hormone insulin can form amyloid deposits in patients with Type 1 diabetes [35–37]. Its assembly into amyloid fibrils has been studied extensively in vitro . The fibril formation is preceded by an unfolded insulin intermediate which then undergoes aberrant refolding [39,40]. Zinc, which promotes crystallization by forming complexes with hexameric insulin, can inhibit this unfolding step and has been suggested to prevent insulin fibrillation in the secretory granules of β-cells . However, exclusion of zinc from these granules does not impair the secretion of functional insulin .
Proinsulin, the precursor in which the A- and B-chains of mature insulin are linked by a 31-residue C-peptide (proinsulin connecting peptide) and additional residues removed by cleavage, does not assemble into fibrils, but instead forms large amorphous aggregates . Although these aggregates still show ‘amyloid-typical’ ThT (thioflavin T) staining and contain predominantly β-sheet structures, the aggregation lag time is extended >15-fold compared with fibril formation of mature insulin. Singular scission of the linkage between C-peptide and the insulin A- or B-chain increases the propensity of proinsulin to form amyloid fibrils. It has therefore been suggested that the covalent linkage by C-peptide restrains the movement of the amyloidogenic segments in insulin and thus acts as an intrinsic block against proinsulin fibrillation. This approach has some similarities to proproteases, which contain a short active-site-blocking segment until excision to prevent runaway activity.
Interestingly, whereas C-peptide thus serves to inhibit amyloid formation by conformational restraints, it also contains a glycine-rich middle segment that confers a high degree of flexibility . In the same study that reported the low fibrillation tendency of proinsulin, it was shown that mini-proinsulin, in which C-peptide is exchanged for a dipeptide linker, is completely unable to form fibrils . Despite proving that limitation of the conformational freedom of the A- and B-chains prevents aggregation, this finding raises the question of why a long flexible linker is employed at the cost of incomplete ablation of the fibrillation propensity. Additionally, proinsulin with a single scission between C-peptide and A- or B-chain still exhibits a >5-fold longer aggregation lag phase than mature insulin , indicating the presence of a second mechanism by which C-peptide acts on insulin amyloid formation.
A possible answer was suggested by the finding that co-administration of C-peptide with insulin in Type 1 diabetics leads to a faster increase in plasma insulin levels than the administration of insulin only . Surface plasmon resonance and ESI (electrospray ionization)–MS analyses have provided evidence for molecular interactions between both peptides that lead to the reduction of insulin oligomers. In a continuation of these studies, we have recently demonstrated that C-peptide can also block insulin fibril formation in trans, i.e. by intermolecular interactions , which resembles its effect on islet amyloid polypeptide, another amyloidogenic peptide of the secretory granules .
C-peptide therefore appears to be able to prevent insulin aggregation in two manners: (i) by a blocking effect that reduces the flexibility of proinsulin through conformational constraints; and (ii) by a binding effect that stabilizes mature insulin through molecular interactions, which, in both cases, makes amyloidogenic insulin segments unavailable for self-association. It is even possible that C-peptide-mediated blocking of fibrillation occurs in balance with the normal formation of zinc-stabilized insulin crystals, which suggests a three-fold model for the biological prevention of insulin aggregation (Figure 2).
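The fold changes in aggregation lag time quoted in this section can be made concrete with a common empirical description of ThT kinetics, in which the fluorescence follows a sigmoidal curve and the lag time is taken as the transition midpoint minus twice the time constant of the transition. The Python sketch below is an illustration under these assumptions, not the analysis used in the cited studies: the function names and all parameter values are hypothetical and were chosen only so that the example yields a 15-fold difference.

```python
import math


def tht_signal(t, f_max, k, t_half):
    """Sigmoidal ThT fluorescence at time t (baseline assumed to be zero)."""
    return f_max / (1.0 + math.exp(-k * (t - t_half)))


def lag_time(k, t_half):
    """Empirical lag time: midpoint minus two time constants (1/k each)."""
    return t_half - 2.0 / k


def fold_extension(lag_reference, lag_sample):
    """How many times longer the sample lag phase is than the reference."""
    return lag_sample / lag_reference


if __name__ == "__main__":
    # Hypothetical parameters (rate constant in 1/h, midpoint in h).
    # These are NOT measured values for insulin or proinsulin.
    insulin = {"k": 1.0, "t_half": 4.0}
    proinsulin = {"k": 0.5, "t_half": 34.0}

    lag_ins = lag_time(**insulin)
    lag_pro = lag_time(**proinsulin)
    print("lag (insulin-like curve):   ", lag_ins, "h")
    print("lag (proinsulin-like curve):", lag_pro, "h")
    print("fold extension:", fold_extension(lag_ins, lag_pro))
```

With this definition, the lag time corresponds to the point at which the signal has reached about 12% of its maximum, so comparisons between proteins are only meaningful when the same definition is applied to every curve.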
REGULATION OF FUNCTIONAL AMYLOID ASSEMBLY BY THE RELEASE OF AN AMYLOIDOGENIC SEGMENT
Most occurrences of amyloid are seen as an aberrant folding event. Functional amyloids, on the other hand, represent, as mentioned already, a distinct class of protein assemblies with a dedicated biological purpose. This implies that their formation cannot be avoided, but needs to be tightly regulated, as runaway fibrillation would pose a serious threat to the organism. Since it has been suggested that amyloidogenic regions are often contained in a stabilizing environment (see above), it appears likely that a similar mechanism may also control the assembly of functional amyloids. Here, an amyloidogenic segment could be released from its safe containment in the protein fold and initiate fibrillation following an external signal.
This mechanism has been described recently for class I hydrophobins. Hydrophobins are fungal proteins with a length of approximately 100 residues, which share eight conserved half-cystine residues. Class I hydrophobins, classified by the spacing between the half-cystine residues, assemble into ‘fibrous rods’ when exposed to a hydrophobic/hydrophilic interface. Hydrophobin assemblies phenotypically resemble amyloid fibrils, contain a β-sheet structure, can be stained by ThT and have high chemical stability. The high-resolution structure of Neurospora crassa hydrophobin EAS (product of the ‘EASily wettable’ gene) shows a unique fold that is stabilized by four intramolecular disulfide bridges. Using site-directed mutagenesis, model peptides and in silico analyses, the amyloid-forming region of EAS has been mapped to a C-terminal β-sheet that is hydrogen-bonded to the β-barrel core of the protein. Stabilization of the amyloidogenic segment, combined with the high conformational entropy originating from a large intrinsically disordered loop protruding from the central β-barrel core, causes EAS to remain soluble even at high concentrations. In contrast, contact with an air/water interface has been proposed to induce the formation of an amphipathic secondary structure in the disordered loop, which increases the aggregation propensity that drives the assembly via the amyloid-forming segment. A computational model of fibrillated EAS suggests that the β-barrel fold of the protein can remain intact in the fibrillar state. The proposed fibril organization, referred to as domain swapping, is strikingly similar to that of RNase A amyloid formed by liberation of its amyloidogenic segment.
CONTROL OF FIBRILLATION PROPENSITY BY pH
Whereas the EAS hydrophobin supports the hypothesis that amyloid formation occurs through the exposure of an otherwise contained amyloidogenic segment, the autoregulatory mechanism of the functional amyloid protein Pmel17 follows a different approach. Pmel17 has been identified as the sole component of amyloid fibrils found in melanosomes, an organelle that is related to endosomes and lysosomes [52,53]. The amyloid fibrils act as a scaffold for melanin synthesis by sequestering toxic precursors . Pmel17 thus represents a functional amyloid in humans .
Its mature form, termed Mα, is liberated by several proteolytic steps from the 668-residue Pmel17 proprotein that facilitates its intracellular sorting to the melanosomes. Within Mα, a 124-residue glutamate-rich segment, composed of eight imperfect repeats, was identified as the amyloid-forming region. Two observations have revealed clues about the regulation of amyloid formation: (i) the stability of fibrils formed by the repeat segment can be directly correlated with pH; and (ii) fibril formation is synchronized with melanosome maturation in vivo, via a pH change from pH 4 at stage I (small oligomers) to pH 5 at stage II (fibrils) and finally pH 7 at stages III–IV, when melanin deposition occurs. This pH regime is also required for the specific activity of enzymes involved in melanin synthesis. Pfefferkorn et al. have investigated the aggregation kinetics of the repeat domain over the same pH interval and found that pH 5 is the optimal condition for fibre elongation, whereas pH 4 promotes the formation of small aggregates resembling prefibrillar oligomers. Interestingly, elevation of the pH to 7 dissolves the fibrils, which does not occur in vivo, possibly due to the stabilizing effects of the deposited melanin. Furthermore, the pKa shift from 4.5 to 5.5 during aggregation suggests that protonation of specific carboxy groups in the repeat sequence is required for fibril formation. Therefore an autoregulatory model has been proposed in which the protonation of glutamate residues at stages I and II mitigates repulsive effects between the negatively charged repeat segments and allows self-association. Hence Pmel17 aggregation is probably not controlled via self-chaperoning, but instead relies on charge repulsion as an anti-amyloid effect that is encoded in the repeat segment itself and can be abolished following a pH shift.
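The charge argument can be illustrated with the Henderson–Hasselbalch relation, which gives the protonated (uncharged) fraction of a carboxy group as 1/(1 + 10^(pH − pKa)). The following Python sketch evaluates this fraction at the melanosomal pH values and the two apparent pKa values quoted above. It treats all carboxy groups as independent and identical, which is a simplifying assumption, and the number of glutamate residues used for the net-charge estimate is a hypothetical placeholder rather than a value taken from the Pmel17 sequence.

```python
def protonated_fraction(ph, pka):
    """Fraction of carboxy groups that are protonated (uncharged) at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def net_carboxyl_charge(n_groups, ph, pka):
    """Mean net charge contributed by n identical, independent carboxy groups."""
    return -n_groups * (1.0 - protonated_fraction(ph, pka))


if __name__ == "__main__":
    N_GLU = 10  # hypothetical number of glutamate side chains, for illustration only
    for pka in (4.5, 5.5):          # apparent pKa before and after the reported shift
        for ph in (4.0, 5.0, 7.0):  # pH values of the melanosome stages in the text
            frac = protonated_fraction(ph, pka)
            charge = net_carboxyl_charge(N_GLU, ph, pka)
            print(f"pKa {pka}  pH {ph}:  protonated {frac:.2f}  net charge {charge:+.1f}")
```

Under these assumptions, raising the apparent pKa from 4.5 to 5.5 increases the protonated fraction at pH 5 from roughly one quarter to roughly three quarters, whereas at pH 7 more than 95% of the groups are deprotonated in either case, consistent with charge repulsion opposing self-association at neutral pH.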
TERMINATION OF AMYLOID FORMATION VIA A PRION-FORMING DOMAIN
Next, the fungal HET-S protein deserves specific mention, as its fibril-forming segment controls the fold of a regulatory domain rather than the other way around, as is usual. HET-S, which is 95% identical with the HET-s protein, is a prion protein involved in programmed cell death. It contains a natively disordered PFD (prion-forming domain) as well as a globular N-terminal domain. Whereas the PFDs of HET-S and HET-s form amyloid fibrils, the full-length HET-S protein is resistant to aggregation. Its globular domain can inhibit fibrillation in cis and in trans, suggesting a regulatory function. An investigation of the structure–function relationship of the globular domain led to a surprising finding: incorporation of the HET-S PFD in a nascent HET-s fibril requires unfolding of the terminal helices of the globular domain. If two HET-S proteins are in close proximity in the fibril, their globular domains dimerize, causing further un- or re-folding and possibly aggregation, which blocks the fibril elongation site and triggers cell death. It should be noted that the inhibitory effect of HET-S on amyloid formation is activated via its incorporation into an amyloid fibril. This effect makes this protein a ‘prion sensor’ with the ability to identify and terminate amyloid formation.
SENSING OF MULTIPLE SIGNALS FOR SELF-ASSEMBLY BY SPECIAL N- AND C-TERMINAL DOMAINS
Probably the most intricate autoregulatory strategy is the mechanism of spider silk assembly. Here, fibre formation is controlled not by one, but by two regulatory domains, whose interplay, coupled with pulling forces, facilitates the formation of the highly complex dragline silk. Spider dragline silk represents an amyloid-like material in the sense that it is composed of highly organized proteins (spidroins) in mainly β-sheet conformation. Like amyloid fibrils, spider silk fibres are formed from soluble proteins under physiological conditions, but they are several orders of magnitude bigger and the peptide chains in their β-sheets are oriented in the direction of the fibre axis [64–66].
The spider can master two major problems associated with the spinning of its silk: (i) to store the highly aggregation-prone spidroins in soluble form at extreme concentrations (30–50%, w/w) in the silk glands; and (ii) to turn the silk solution into solid fibres within fractions of a second in the spinning duct . Along the silk production pathway there is a gradual decrease of pH from 7.2 to <6.3, changes in ion composition and increases in shear forces by the narrowing duct. Although the molecular mechanisms of silk formation associated with these events are not completely known [67–69], recent publications have revealed several clues and raised several hypotheses.
The spidroins are composed of three distinct parts: a non-repetitive 130-residue NTD (N-terminal domain), an extensive 3500-residue region of blocks of polyalanine and polyglycine, which gives the fibre its mechanical properties, and a non-repetitive 100-residue CTD (C-terminal domain) [70–72]. Upon fibre formation, the polyalanine blocks are converted from mixtures of helical and random coil structures in the gland into antiparallel β-sheets that stack in crystalline formations in the fibre [66,73]. Both terminal domains form homodimers of five α-helices, but are structurally unrelated to each other and to any other known protein [74,75]. Spidroins in the gland are dimeric, interconnected by a conserved disulfide bridge in the CTD, and are believed to be stored in micelles where the terminal domains form the hydrophilic outer shell, whereas the repetitive region is shielded in the centre. Miniature recombinant spider silk proteins in which the CTD is linked to a few repeat motifs are sensitive to shear forces, and the presence of the CTD is crucial for macroscopic fibre formation, but its role in the fibre assembly is not pH-dependent [78,79]. A possible mode of action is the exposure of hydrophobic segments in the CTD when subjected to mechanical stress, which could promote aggregation.
The NTD, on the other hand, confers high solubility of spidroins at pH 7 and rapid fibre formation at pH 6. The observed pH interval for these events is in good agreement with the physiological conditions occurring in the spiders’ silk production system. The NTD is the most conserved part of the spidroins, present in distantly related species and different spidroin types, which further implies an important function in spider silk formation rather than in the properties of the final silk. HDX (hydrogen–deuterium exchange)–MS, ultracentrifugation, NMR, size-exclusion chromatography and ESI–MS experiments have shown that the NTD forms stable dimers in low salt solutions below pH 6.3 [81–83] while undergoing conformational changes. This probably reflects the formation of endless interconnected spidroins that result from the dimerization of the NTD.
The terminal domains correspond to less than 5% of the full-length spidroin and can by themselves hardly account for the global conformational changes that the spidroins undergo during spinning. Considering the nature of the repetitive middle region, it may seem contradictory that the spidroins display stretches of alanine residues that form β-sheets in the final fibre, since alanine has a high α-helix propensity. However, the spidroin repetitive region apparently has evolved to avoid aggregation during storage in the glands. During spinning, the NTD dimerizes and interconnects the spidroins into endless protein chains [81–83,85]. The forces generated by pulling the silk fibre are apparently transmitted via the spidroins and combined with the shear forces from the narrowing duct to induce unfolding of the polyalanine α-helices and trigger β-sheet formation [82,83,85] (Figure 3). In comparison, MaSp1 and SP-C are both extremes, but of opposite types: the spider protein MaSp1 folds α-helix-promoting alanine residues into β-sheets, and SP-C folds β-sheet-promoting valine residues into a metastable α-helix. In both proteins, autoregulatory elements are required to make these structural conversions possible.
As new examples of both functional and disease-related amyloids are discovered, additional autoregulatory strategies are likely to emerge. Two previous reports have already hinted at the existence of additional mechanisms. One such instance is the observation that peptide hormones can be stored in amyloid form in vivo, the formation of which appears to be influenced by processing steps. These peptide aggregates were found to dissociate, dependent on changes in concentration and environmental conditions, which may be controlled by sequence elements that mediate conditional fibril disaggregation .
Another potential autoregulatory mechanism mediates assembly of the Orb2A protein, a cytoplasmic polyadenylation element-binding protein that facilitates long-term memory storage in Drosophila . Orb2A possesses an N-terminal octapeptide which is critical for its assembly in vitro and in vivo, but is located outside of the >100-residue-long amyloid-forming region . Because of its outside position and short length, it appears unlikely that the octapeptide is a self-chaperoning element. Since it does not form amyloid in vitro, it is also not likely to act as a polymerization trigger, which suggests the existence of yet another control mechanism.
Natural strategies to control the unwanted formation of amyloid are apparently in high demand: some biomaterials involve the controlled assembly of amyloid-like structures, e.g. in the use of recombinantly produced spider silk as tissue culture scaffolding , and the controlled delivery of long-acting drugs can be achieved using amyloid deposits . Biomedical research seeks new therapeutic approaches to combat protein aggregation diseases, such as targeted disaggregation of amyloid deposits . In the present review, we have discussed self-regulatory mechanisms that control amyloid formation and have provided a short overview of the different approaches. The strategies are remarkably diverse: amyloid-forming segments can be guarded by a chaperone domain, blocked by a propeptide segment or stabilized by repulsive charges. Still, all of these approaches can be considered instances of self-chaperoning in the sense that each protein guards its amyloidogenic segment to prevent uncontrolled self-assembly. In contrast with the continuous containment of amyloidogenic regions in the protein fold, specialized self-chaperoning that is adapted to a specific biological purpose may be considered a functional feature of proteins that can form amyloid fibrils in normophysiology. These proteins represent the group of self-regulating amyloids. As our knowledge of amyloid formation at the biophysical and biological levels grows, the autoregulatory mechanisms may also be increasingly appreciated, and additional studies on their applications in biomedicine are of high importance.
This work was supported by the Swedish Research Council, Karolinska Research Foundations and a Karolinska Ph.D. student grant.
Abbreviations: Aβ, amyloid β-peptide; AD, Alzheimer's disease; C-peptide, proinsulin connecting peptide; CTD, C-terminal domain; ESI, electrospray ionization; NTD, N-terminal domain; PFD, prion-forming domain; SP-C, lung surfactant protein C; ThT, thioflavin T; TM, transmembrane
- © The Authors Journal compilation © 2012 Biochemical Society