The chaperone/usher pathway controls assembly of fibres of adhesive organelles of Gram-negative bacteria. The final steps of fibre assembly and fibre translocation to the cell surface are co-ordinated by the outer membrane proteins, ushers. Ushers consist of several soluble periplasmic domains and a single transmembrane β-barrel. Here we report isolation and structural/functional characterization of a novel middle domain of the Caf1A usher from Yersinia pestis. The isolated UMD (usher middle domain) is a highly soluble monomeric protein capable of autonomous folding. A 2.8 Å (1 Å=0.1 nm) resolution crystal structure of UMD revealed that this domain has an immunoglobulin-like fold similar to that of donor-strand-complemented Caf1 fibre subunit. Moreover, these proteins displayed significant structural similarity. Although UMD is in the middle of the predicted amphipathic β-barrel of Caf1A, the usher still assembled in the membrane in the absence of this domain. UMD did not bind Caf1M–Caf1 complexes, but its presence was shown to be essential for Caf1 fibre secretion. The study suggests that UMD may play the role of a subunit-substituting protein (dummy subunit), plugging or priming secretion through the channel in the Caf1A usher. Comparison of isolated UMD with the recent structure of the corresponding domain of PapC usher revealed high similarity of the core structures, suggesting a universal structural adaptation of FGL (F1G1 long) and FGS (F1G1 short) chaperone/usher pathways for the secretion of different types of fibres. The functional role of two topologically different states of this plug domain suggested by structural and biochemical results is discussed.
- chaperone/usher pathway
- fimbrial adhesins
- protein structure
The chaperone/usher pathway [1,2] represents one of the most common mechanisms for assembly of surface located fibrillar organelles in Gram-negative bacteria. The surface fibres assembled via this pathway frequently play an important role in bacterial virulence by mediating adhesion to and potentially invasion into host cells, and in some cases also by protecting bacteria from the host's immune system [1,2].
The fibrillar structures assembled by the chaperone/usher pathway are constructed from Ig-like β-sandwich subunits, the pilin subunits. However, in contrast with an Ig domain, a circular permutation in the pilin sequence positions the sequence corresponding to the seventh C-terminal G β-strand of a canonical Ig domain at the N-terminus of the polypeptide chain [3–6]. In a typical Ig fold, the ‘top’ edge of the sandwich, defined by the A and F strands, is capped by the C-terminal G strand, which is hydrogen bonded to the F strand and provides hydrophobic residues to the core of the fold. Because it does not have this strand, a pilin subunit cannot cap its AF edge and a closed hydrophobic core cannot be formed by a single pilin subunit [3–7]. The N-terminal extension of pilins does not contribute to the subunit's globular fold, but deletions or mutations in this region block assembly of pilin subunits into fibres [3–8]. The structure of a Caf1M-(Caf1)2 ternary complex, with the minimal Yersinia pestis F1-antigen fibre [(Caf1)2] bound to the Caf1M chaperone, revealed that fibre subunits are linked together by insertion of the N-terminal extension of one subunit into the hydrophobic cleft of the second subunit [6,7]. The inserted N-terminal segment adopts a β-strand conformation running anti-parallel to the F strand, with hydrophobic side chains bound in acceptor cleft sub-pockets, hence completing the Ig fold of the subunit. This mode of binding, termed DSC (donor strand complementation), had previously been predicted for type 1 and P pilus fibres [3–5], and is likely to be present in all surface polymers assembled through the chaperone/usher pathway. The resulting linear fibres are composed of globular modules, each having an intact Ig topology generated by DSC. 
Depending on subunit composition and mode of subunit–subunit interactions, secreted linear fibres coil into different structures, such as thick rigid monoadhesive pili with a diameter of 7–8 nm, thin flexible polyadhesive pili with a poorly defined diameter (2–4 nm) and thin flexible polyadhesive fibres [1,2].
Biogenesis of surface fibres is assisted by periplasmic chaperones and OM (outer membrane) ushers. As fibre subunits emerge in the periplasm, periplasmic chaperones bind them, ensure correct folding and deliver the folded subunits to the usher. The ushers then act as assembly platforms co-ordinating subunit polymerization and translocation to the cell surface.
The L-shaped periplasmic chaperones comprise two Ig-like domains at a ∼90° angle, with a large cleft between the two domains. The F1 and G1 β-strands in the first, N-terminal, domain are connected by a long and flexible loop. This loop harbours a conserved motif of hydrophobic residues that is critical for subunit binding. Subunits bind in the cleft between the two chaperone domains with the subunit C-terminal carboxyl group anchored by two strictly conserved positively charged residues at the bottom of the subunit-binding cleft. In chaperone–subunit complexes, the A and F edge strands are hydrogen bonded to the A1 and G1 strands of the chaperone, which creates a super-barrel of β-strands from both the subunit and the chaperone [6,7]. In the super-barrel, large hydrophobic residues from the G1 donor strand of the chaperone are inserted between the two sheets of the subunit β-sandwich and become an integral part of the bound subunit's hydrophobic core. On the basis of the lengths of the subunit binding motifs, periplasmic chaperones are grouped in two subfamilies, FGL (F1G1 long) and FGS (F1G1 short) [1,2]. Chaperones from these subfamilies operate with subunits displaying significant structural differences and assemble functionally and morphologically different organelles.
At the usher, chaperone–subunit interactions are replaced by subunit–subunit interactions in a process called DSE (donor strand exchange). No energy input from external sources is required to convert periplasmic chaperone–subunit preassembly complexes into free chaperones and secreted fibres. Instead, assembly is driven by subunit folding energy conserved by the chaperone [6,7]. Whereas the fibre-incorporated subunit adopts a very stable, compact structure, the chaperone-bound subunit is in an open, molten-globule-like, high-energy conformation. The large difference in stability between the chaperone-bound, partially folded conformation and the fully folded native fibre conformation of pilins creates a free energy potential that drives fibre formation. Ushers may accelerate donor strand exchange by binding and positioning subunit–chaperone complexes in an orientation favourable to attack of the subunit complex at the base of a growing fibre [6,10–12]. The exchange of the chaperone G1 donor strand together with the A1 ‘accessory’ strand of the chaperone by the Gd donor strand of the subunit and accompanying collapse of the subunit core appear not to occur through separate steps of complete dissociation and association, but rather by partial displacement in a process termed ‘zip-out-zip-in DSE’ [6,11]. Directing zip-out-zip-in steps is another plausible mechanism for usher-mediated catalysis of DSE.
Structural information is essential to advance our understanding of the function of ushers. Ushers are large integral OM proteins of approx. 800 amino acid residues in length. A recent electron microscopy study revealed that the PapC usher involved in assembly of P pili exists as a dimeric complex, with each usher monomer containing apparent channels of ≈2 nm diameter. This pore size would be wide enough for translocation of individual folded subunits or their polymers from the periplasm to the cell surface. The type-1 pilus usher FimD was shown to possess an N-terminal domain, comprising residues 1–139 [UND (usher N-terminal domain)-FimD]. Isolated UND-FimD specifically binds chaperone–subunit complexes with micromolar affinities. The NMR structure of isolated UND-FimD revealed that this domain consists of a flexible N-terminal ‘tail’, a core structure of a novel α/β fold and a potential hinge segment that connects the structured core to the rest of the usher. UND-FimD was co-crystallized with the complex between the FimC chaperone and the pilin domain of the FimH subunit (FimHP). The crystal structure of this ternary complex revealed that the flexible N-terminal tail of UND-FimD specifically interacts with both the chaperone and subunit, acting as a sensor for loaded FimC molecules. Following targeting to UND-FimD, FimCH chaperone–adhesin complexes form stable interactions with the FimD C-terminus and induce conformational changes in FimD, possibly priming the usher for pilus biogenesis [16,17]. Structural information on the UCD (usher C-terminal domain) is currently unavailable.
Capitani et al. predicted a 22-stranded TM (transmembrane) β-barrel in the centre of the FimD usher. Importantly, this study also predicted the existence of a novel periplasmic domain formed by the most conserved region of the usher. In the prediction, the sequence of this ‘middle domain’ was flanked by TM β-strands of the UTBD (usher TM β-barrel domain). This model was recently confirmed on publication of the low resolution crystal structure of the TM segment of PapC usher. In this structure, residues 130–640 of PapC form a 24-stranded β-barrel with the central predicted ‘middle domain’ located within the central barrel and lying across the channel.
Both FimD and PapC belong to the functional family of FGS chaperone-associated ushers [1,2,20]. These two ushers assemble relatively rigid mono-adhesive pili of complex subunit composition, including a two-domain terminal adhesin subunit [1,2,20]. In contrast, little is known about FGL chaperone-associated ushers involved in assembly of functionally different organelles, the thin poly-adhesive fibres of simple subunit composition. In the present study, using the Caf1A usher from the F1 capsule system of Y. pestis as a prototype member of the FGL family, we present the first structural detail of the usher of any member of this family.
Sequence analysis of Caf1A with the HMM-B2TMR program predicts a structural organization very similar to that predicted for FimD (Figure 1). To check the prediction experimentally, we expressed a series of sequences to locate the core region of the UMD (usher middle domain) and investigated the secondary structure, oligomeric state and stability of isolated UMD. The study revealed a UMD of approx. 80 amino acid residues capable of autonomous folding into a stable β-structural unit. X-ray structure determination revealed that UMD has a pilin-like fold with significant three-dimensional structural similarity to donor-strand-complemented Caf1. This study has also demonstrated the absolute requirement of this domain for Caf1 export, although Caf1A itself can assemble in its absence. Based on these data and some topological differences between the isolated Caf1A UMD and channel-inserted PapC UMD, a model is presented of how this dummy subunit (UMD) might act as a plug in the Caf1A β-barrel. Finally, our comparison of UMDs of Caf1A and PapC suggests a universal structural adaptation of FGL and FGS chaperone/usher pathways for the secretion of different types of fibres and highlights some potentially important differences between these translocation systems.
MATERIALS AND METHODS
Plasmid construction and deletion mutagenesis
To create expression plasmids encoding various lengths of Caf1A UMD, DNA encoding the amino acid fragments from residues 198, 213, 222, 232, 242, 243, 244 and 246 to residue 320 were amplified from the p12R plasmid by PCR using forward F-PMD (1–8) and reverse R-PMD primers (Supplementary Table S1 at http://www.BiochemJ.org/bj/418/bj4180541add.htm). The amplified fragments were cloned into pCR-T7/CT-TOPO-TA (198–, 213–, 222– and 232–320) or pET101D-TOPO (242–, 243–, 244– and 246–320) expression vectors (Invitrogen) essentially as described in the Invitrogen cloning procedures. The genes, flanked with the Met and stop codons, were positioned downstream of the T7 promoter and ribosome-binding site provided in the vector. The resulting plasmids were transformed into expression strains Escherichia coli BL21-AI or BL21(DE3). Deletion mutants caf1AΔ1–4 were created in p12R by inverse PCR using reverse primers Caf1AΔm1, Caf1AΔm2, Caf1AΔm3 or Caf1AΔm4 and forward primer Caf1AΔm-F (Supplementary Table S1). Following mutagenesis, PCR products were digested with BglII, re-ligated and transformed into E. coli DH5α. Mutants were confirmed by BglII digestion of isolated plasmid and DNA sequencing of the complete operon.
Expression and purification of UMD constructs
E. coli BL21-AI or BL21(DE3) transformants growing at 37 °C were induced for protein expression with 0.2% L-arabinose or 1 mM isopropyl β-D-thiogalactoside respectively. The cells were harvested by centrifugation, resuspended in a 20 mM bis-Tris/HCl, pH 6.0, buffer A, and lysed by sonication with Vibra cell (Sonics & Materials). The crude extract obtained was precipitated by slowly adding solid ammonium sulphate (Merck) to 40% saturation followed by incubation for 1 h at 4 °C. The precipitate was dissolved in buffer A and dialysed overnight against the same buffer. The sample was loaded on to an anion exchange Mono Q HR 10/10 column (Pharmacia Biotech). Proteins were eluted with a 0–150 mM gradient of NaCl. The UMD-containing fractions were pooled and protein in the sample was concentrated to 1.6–2 g/l on a Vivaspin device (Vivascience) with molecular-mass cut-off of 5 kDa. The sample was loaded on to a HiLoad Superdex 75 prep grade size exclusion column (Pharmacia Biotech) pre-equilibrated with a 20 mM Hepes and 150 mM NaCl, pH 7.5, buffer. UMD was eluted in a single peak at a flow rate of 1.5 ml/min. The peak fractions were collected and concentrated to 3–30 g/l for analysis or to 30–60 g/l for crystallization experiments.
Expression of caf1A deletion constructs, cell fractionation and surface immunofluorescence
Plasmid p12R encodes all three genes required for assembly of surface F1: caf1M chaperone, caf1A usher and caf1 subunit together with a truncated caf1R regulator. The operon is repressed with 0.6% glucose and induced in the absence of glucose. Overnight cultures of E. coli DH5α/p12R or E. coli DH5α carrying the appropriate plasmid (p12R-A-mΔ1, etc.) were washed and subcultured 1 in 10 in LB (Luria–Bertani) broth containing 100 mg/l ampicillin and 0.6% (w/v) glucose, or no glucose as appropriate, and grown for 2 h at 37 °C with shaking at 250 rev./min. Crude OM preparations were isolated from washed induced cells by differential centrifugation and washing at 20000 g for 30 min at 4 °C following lysis with a French Press (Aminco) at 16000 lbf/in2 (1 lbf/in2=6.9 kPa) and removal of any unlysed whole cells. OM preparations were re-suspended at a final ratio of 0.5 ml PBS to 100 D600 equivalent units of bacterial cells (where 1 D600 equivalent unit corresponds to the total cells recovered from 1 ml bacterial culture with a D600 of 1.0). OM preparations were further purified by equilibrium density centrifugation on 30–65% (w/w) sucrose gradients in PBS, pH 7.4, with centrifugation at 116000 g for 28 h at 15 °C. Co-equilibration of Caf1A constructs with outer membranes was confirmed by immunoblotting for Caf1A and the integral outer membrane protein OmpA.
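The D600 equivalent-unit bookkeeping defined above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the culture volume and density in the usage lines are invented example values, not measurements from this study.

```python
def d600_units(culture_ml, d600):
    """D600 equivalent units: 1 unit = cells from 1 ml of culture at D600 = 1.0."""
    return culture_ml * d600


def om_resuspension_ml(units):
    """PBS volume for OM preparations at 0.5 ml per 100 D600 equivalent units."""
    return 0.5 * units / 100.0


# Hypothetical example: 200 ml of culture harvested at D600 = 0.5
units = d600_units(200.0, 0.5)        # 100 equivalent units
pbs_ml = om_resuspension_ml(units)    # 0.5 ml PBS
```

Normalizing to equivalent units rather than to culture volume keeps the amount of membrane material per ml of PBS comparable between cultures that reach different densities.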
Periplasmic fractions were isolated from induced cells by osmotic shock, using 3 D600 equivalent units to produce 100 μl periplasmic extract. Surface-exposed Caf1 was quantified in a surface immunofluorescence assay using a 1:200 dilution of rabbit anti-Caf1SC antibody and 1:100 dilution of goat anti-rabbit IgG fluorescein conjugate (Sigma). Average fluorescence per 1 D630 of bacterial cells was calculated from three dilutions of each sample and triplicate cultures of each strain were quantified to derive the average fluorescence±S.D., per D630 bacterial cell. E. coli DH5α/pUC19 was used as a background control.
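The averaging scheme described above (fluorescence per D630 over three dilutions, then mean±S.D. over triplicate cultures) could be computed along the following lines. This is a minimal sketch with hypothetical function names and invented readings. The use of the sample standard deviation (n−1) is an assumption, as the text does not state which estimator was used, and no background subtraction is applied here.

```python
from statistics import mean, stdev


def fluorescence_per_d630(readings):
    """Average fluorescence/D630 over the dilutions of a single culture.

    readings: list of (fluorescence, d630) pairs, one per dilution.
    """
    return mean(f / od for f, od in readings)


def strain_summary(cultures):
    """Mean and sample S.D. (assumed n-1) across replicate cultures."""
    per_culture = [fluorescence_per_d630(r) for r in cultures]
    return mean(per_culture), stdev(per_culture)


# Invented readings: three cultures, three dilutions each
example = [
    [(100.0, 1.0), (50.0, 0.5), (25.0, 0.25)],
    [(110.0, 1.0), (55.0, 0.5), (27.5, 0.25)],
    [(90.0, 1.0), (45.0, 0.5), (22.5, 0.25)],
]
avg, sd = strain_summary(example)
```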
Crystallization and structure determination
Crystallization was performed by the hanging-drop vapour-diffusion method at 293 K. For U232-320, large rhomboidal or more complex multi-face crystals were obtained in drops with either 20% PEG [poly(ethylene glycol)] 3350 in 0.2 M NaSCN, pH 6.9, or 10% PEG 8000 in 0.2 M MgCl2 and 0.1 M Tris/HCl, pH 7.0. The crystals belong to space group C2221 with unit cell dimensions a=30.0 Å (1 Å=0.1 nm), b=112.0 Å, c=115.1 Å, and two copies of the UMD per asymmetric unit. Diffraction data were collected under liquid nitrogen cryoconditions at 100 K. To avoid damage on freezing, crystals were soaked for 30–60 s in cryoprotection solution prepared by mixing 1 part of precipitant solution with 1 part of 25% PEG 400. Crystals were flash-cooled by rapidly moving them into the cold nitrogen stream or by dipping them in liquid nitrogen. Native X-ray diffraction data were collected on beam-line ID14eh1 [ESRF (European Synchrotron Radiation Facility), France], using an ADSC (Area Detector Systems Corporation) Q4 CCD (charge-coupled-device)-based detector. To obtain a derivative for heavy metal phasing, crystals were incubated with precipitant solution containing 2.5 mM KAuCl4 for 24 h. Au SAD (single-wavelength anomalous dispersion) experiments were performed at BM14U using an XFlash Detector (Bruker). Results were collected, processed and reduced using MOSFLM and SCALA. Using PHENIX, heavy-atom parameters were obtained and refined, and initial phases calculated. The initial model was constructed using PHENIX and O. Positional, bulk solvent and isotropic B factor refinement was performed using REFMAC5. NCS (non-crystallographic symmetry) restraints were applied during early cycles of modelling and refinement. Progress of refinement and selection of refinement schemes were monitored by the Rfree for a test set comprising 4.6% of the data. Data collection and refinement statistics are given in Table 2.
SDS/PAGE was performed using 20% polyacrylamide for analysis of UMD fragments, 14% for subunit and 8.5% for Caf1A usher. Samples were incubated for 5 min at 95 °C in sample buffer containing 2% (w/v) SDS unless otherwise indicated. Immunoblots were developed with 1:5000 dilution of rabbit anti-Caf1A antibody and 1:20000 dilution of goat anti-rabbit peroxidase conjugate using an ECL® (enhanced chemiluminescence) detection kit (Amersham).
CD spectra were recorded on a J-810 dichrograph (Jasco) equipped with a PTC-423S temperature control system (Jasco) in 0.1 cm quartz cuvettes. Typically, five spectra recorded at a scanning speed of 50 nm/min with band width set to 1 or 2 nm were accumulated.
GdmCl (guanidinium hydrochloride)- and temperature-dependent folding transitions
To study unfolding of UMD, the native protein (1.4 g/l) was diluted 10-fold with 20 mM phosphate buffer (pH 6.4) (buffer B) containing different concentrations of GdmCl, and incubated at 21 °C for 16 h. To study refolding, UMD was first denatured in 4.1 M GdmCl and incubated for 1 h at 21 °C. Unfolded UMD (1.1 g/l) was then diluted 10-fold with buffer B containing different concentrations of GdmCl and incubated at 21 °C for 2–16 h. The GdmCl-induced unfolding transition was measured by following the change of ellipticity at 217 nm, evaluated according to the two-state model of folding using a six-parameter fit, and normalized. Free energy for unfolding of UMD in the absence of GdmCl was determined by linear extrapolation of the dependence of the free energy of unfolding on GdmCl concentration to 0 M. To study temperature denaturation, UMD at a concentration of 0.15 g/l in buffer B was heated from 20 °C to 95 °C in 5 °C steps at a heating rate of 1 °C/min. Spectra were recorded between each 5 °C heating interval. To study the reversibility of temperature denaturation, the sample of UMD preheated to 95 °C was allowed to cool to 21 °C, followed by recording of a far-UV CD spectrum. Progress of protein melting was calculated from the change of ellipticity at 206 nm. Equilibrium constants in the transition region were evaluated based on a two-state folding model. Van't Hoff enthalpy of melting was obtained from the temperature dependence of the equilibrium constants.
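The two-state analysis outlined above can be illustrated with a short Python sketch. It assumes the commonly used form of the six-parameter model (linear native and unfolded baselines, plus a free energy of unfolding that depends linearly on denaturant concentration) and a van't Hoff analysis based on the slope of ln K against 1/T. The exact equations and fitting software used in this study are not given in the text, so all function and parameter names here are hypothetical, and the default temperature of 294.15 K simply reflects the 21 °C incubation.

```python
import math

R = 8.314462618e-3  # gas constant in kJ·mol^-1·K^-1


def delta_g_unfold(denaturant_m, dg_water, m_value):
    """Linear extrapolation model: dG([D]) = dG(H2O) - m·[D], in kJ/mol."""
    return dg_water - m_value * denaturant_m


def two_state_signal(denaturant_m, y_n, s_n, y_u, s_u, dg_water, m_value,
                     temp_k=294.15):
    """Six-parameter two-state curve: intercept and slope of the native
    baseline (y_n, s_n), of the unfolded baseline (y_u, s_u), dG(H2O), m."""
    k_unfold = math.exp(-delta_g_unfold(denaturant_m, dg_water, m_value)
                        / (R * temp_k))
    native = y_n + s_n * denaturant_m
    unfolded = y_u + s_u * denaturant_m
    return (native + unfolded * k_unfold) / (1.0 + k_unfold)


def vant_hoff_enthalpy(temps_k, k_values):
    """Least-squares slope of ln K versus 1/T; returns dH = -R·slope (kJ/mol)."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in k_values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -R * slope
```

In practice the six parameters would be obtained by non-linear least-squares fitting of `two_state_signal` to the measured ellipticity. At the transition midpoint, where the free energy of unfolding is zero, the modelled signal lies halfway between the two baselines.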
10 μl Ni-NTA (Ni2+-nitrilotriacetate) agarose beads (QIAGEN) were incubated for 30 min with 75 μg of Caf1M-Caf1 complex, carrying an N-terminal 6-His tag on Caf1, in 100 μl binding buffer (5 mM imidazole, 25 mM NaH2PO4 and 150 mM NaCl, pH 8.00). The beads were washed twice with 500 μl binding buffer to remove unbound complexes. To study binding between Caf1M-Caf1 and UMD or UND, the beads with bound Caf1M-Caf1 were added to 50 μl binding buffer containing 1 g/l UMD or UND and incubated for 30 min at 24 °C in a shaker. The beads were recovered by centrifugation and washed twice with 500 μl binding buffer. Complexes were eluted from the beads with 30 μl elution buffer (125 mM imidazole, 25 mM NaH2PO4 and 150 mM NaCl, pH 8.0) and were analysed with SDS/20% PAGE.
Multiple sequence alignment and structural superposition
The Caf1A-UMD sequence was used as query for an NCBI (National Center for Biotechnology Information) Psi-BLAST (position-specific iterated BLAST) search using Expect and Psi-BLAST thresholds 8 and 0.003, respectively, against the non-redundant protein sequences database (nr) to identify a large number (>1000) of sequences with E-values significantly better than the cutoff. A vast majority of these sequences were annotated as usher sequences. As this set contained many identical or near-identical sequences, a set of 147 unique sequences was extracted from the top 1000 hits and aligned using ALIGNX (InforMax).
Structures were superimposed using the INDONESIA program (D. Madsen, G. Kleywegt and P. Johansson, http://xray.bmc.uu.se/~dennis/).
RESULTS AND DISCUSSION
Isolation and characterization of the middle domain of Caf1A
Initially the 213–320 fragment of Caf1A (U213–320) corresponding to the entire predicted UMD was expressed and purified from the E. coli cytoplasm. U213–320 was eluted in a single peak from a size exclusion column (Supplementary Figure S1 at http://www.BiochemJ.org/bj/418/bj4180541add.htm). Comparison of the retention coefficient (Kav) for U213–320 with those for globular proteins of known molecular weight suggested an apparent molecular weight of 16.1 kDa (Supplementary Figure S1, right panel). This value is 1.3-fold higher than the deduced sequence weight for U213–320, indicating that U213–320 is likely to exist as a monomer with an abnormally large gyration radius in solution. CD measurement of purified samples of U213–320 revealed a typical β-structural protein far UV CD spectrum with a single ellipticity minimum of −6000 deg·cm2·dmol−1 (per dmole of residue) at 217 nm (Figure 2a). Heating U213–320 to 95 °C or adding 3 M GdmCl dramatically changed the spectrum, suggesting loss of the native structure. The spectrum was completely restored by sample cooling or 10-fold dilution respectively. The maximal CD changes caused by thermal and chemical denaturation were observed at 206 and 217 nm respectively. These wavelengths were chosen to study thermal and chemically induced conformational transitions in U213–320 (Figures 2b and 2c). Both temperature- and GdmCl-dependence of the CD signal have the characteristic sigmoidal shape of co-operative transitions. Analysis of the transitions using the two-state model allowed estimation of thermodynamic parameters for U213–320 stabilization (Table 1). U213–320 is a temperature-resistant protein (Tm=70.4±0.5 °C, ΔHm=313±15 kJ·mol−1, where Tm, melting temperature and ΔHm, van't Hoff enthalpy of melting) with thermodynamic stability at physiological conditions (ΔG295=17.4±1 kJ·mol−1) typical for single domain globular proteins. 
Taken together, these results show that U213–320 is a stable β-structural monomeric unit capable of autonomous folding.
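The conversion of a gel-filtration retention coefficient into an apparent molecular mass can be sketched as follows. The sketch assumes the standard definition Kav = (Ve − V0)/(Vt − V0) and a linear calibration of Kav against log10 of molecular mass. The calibration standards, column volumes and function names are hypothetical and are not taken from Supplementary Figure S1.

```python
import math


def kav(elution_ml, void_ml, total_ml):
    """Retention coefficient Kav = (Ve - V0) / (Vt - V0)."""
    return (elution_ml - void_ml) / (total_ml - void_ml)


def fit_calibration(standards):
    """Least-squares line Kav = slope·log10(mass) + intercept.

    standards: list of (mass_kda, kav_value) pairs for globular markers.
    """
    xs = [math.log10(mass) for mass, _ in standards]
    ys = [k for _, k in standards]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return slope, y_mean - slope * x_mean


def apparent_mass_kda(kav_value, slope, intercept):
    """Invert the calibration line to obtain an apparent mass in kDa."""
    return 10.0 ** ((kav_value - intercept) / slope)
```

Because the calibration uses compact globular standards, a protein with flexible or extended regions elutes earlier and returns an apparent mass above its sequence mass, which is how the 1.3-fold excess for U213–320 is interpreted above.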
To test for the presence of unstructured regions in U213–320, we subjected purified U213–320 to limited proteolysis with trypsin and chymotrypsin. Proteolysis of U213–320 was observed at chymotrypsin concentrations below 1 μg/ml and SDS/PAGE revealed concentration-dependent disappearance of the intact protein with simultaneous appearance of a single truncated species (Figure 2d). Precise molecular masses for the proteolytic products, 9751 and 8658 Da for the trypsin and chymotrypsin proteolysis respectively, were determined in a parallel analysis of the digestion reaction by MS. These values correlated with the masses of sequences Ile229–Ile318 (9748 Da) and Tyr239–Tyr318 (8655 Da) of Caf1A usher, and cleavage of UMD between Lys228–Ile229 and Tyr238–Tyr239, for trypsin and chymotrypsin respectively. Both sites are close to the N-terminus of the recombinant UMD protein. Hence, these results identified an unstructured sequence at the N-terminus of U213–320.
We considered two possibilities: either U213–320 contains an extra sequence at the N-terminus, which does not participate in formation of the core structure of UMD, or, on the contrary, it lacks an essential sequence stabilizing its N-terminus and completing the UMD structure. To test these alternatives we created three additional constructs expressing U198–320, U222–320 and U232–320. The first construct includes an extra 15 amino acid sequence and the other two lack 9 and 19 residues respectively (Figure 1). The gel-filtration analysis of the constructs revealed a slight increase in mobility for U198–320 and a mobility decrease for U222–320 and especially for U232–320, yielding apparent molecular masses of 16.5, 14.2, and 9.1 kDa for the constructs respectively (Supplementary Figure S1). The constructs showed similar CD spectra with some increase of the positive signal below 205 nm, e.g. from about 0 for U198–320 to 4000 deg·cm2·dmol−1 for U232–320 (Figure 3a). Thermal denaturation showed similar transitions for all the constructs, revealing slightly higher stability for U232–320, which melts at 2–3 °C higher temperature than the other constructs (Figure 3b). Hence, the shortest version still contains the core structure of UMD. The slowest mobility during gel filtration and elevated thermal stability observed for this construct could be explained by the lack of flexible sequence at the N-terminus.
Structure of the middle domain of Caf1A
Crystallization screening for the shortest construct (U232–320) quickly revealed two conditions producing well-diffracting crystals. The crystal structure of U232–320 was solved to 2.8 Å resolution using Au SAD (Table 2). The structure consists of a symmetric domain-swapped dimer of similar UMD monomers [rmsd (root mean square deviation) 0.69 Å for 69 Cα atom positions] (Figure 4a). The U232–320 monomer has the classical s-type Ig fold, forming a β-sandwich with the participation of the N-terminus of the second monomer as one of the strands.
To model a non-swapped UMD domain, the N-terminal β-strand of one monomer was built using co-ordinates of atoms of the corresponding segment in the second monomer of the dimer (Figure 4b). The resulting model consists of one three-stranded (A, B and E) and one four-stranded (D, C, F and G) β-sheet (Figure 4c). No electron density was found for the first seven residues of U232–320. The A β-strand starts after Pro239 and consists of only four residues. The last, G, β-strand ends at Tyr310. The core structure is followed by an extension of six residues, Asp311–Ala316, that are involved in crystallographic contacts with a neighbouring dimer. The last four residues of U232–320 are unstructured. The β-sandwich is partially open at the beginning of the B, D and F and ending of the A, C, E and G β-strands. The long loop connecting E and F β-strands fills in this opening, contributing two residues into the hydrophobic core of the protein. The other loops are short, consisting of three to four residues.
The sequence U232–320 contains nearly 50% of the most conserved residues of usher proteins (Figure 1). Most of these residues play clear structural roles as hydrophobic core (Val, Leu, Ile, Phe) or sharp turn (Gly) forming residues (Figure 4b). A highly conserved glutamic acid residue at 298 links the two sheets and caps one end of the β-sandwich. Four highly conserved surface-exposed residues, Ile252, Arg263 (Gln in most UMDs), Glu291 and Asp300 (Figure 4b), have no apparent structural role and may instead be important for UMD functionality.
N-terminal boundary of the middle domain of Caf1A
The high susceptibility of the A β-strand to chymotrypsin digestion (see Figures 1c and 2d) is indicative of elevated mobility. This is further strengthened by the observation of the A β-strand swapping in the UMD dimer (Figure 4a). To check whether this sequence is crucial for protein folding, we created a shorter version of UMD, U242–320, lacking the first half of the A β-strand including two buried residues, Pro239 and Tyr241 (Figure 1c). U242–320 accumulated in cells in soluble form (Figure 3c). Analysis of the protein by gel filtration revealed that, as with the other purified usher fragments, U242–320 behaves as a monomer (Supplementary Figure S1). U242–320 displayed a CD spectrum very similar to that of U232–320 (Figure 3a). However, the deletion caused a fall of more than 10 K in the melting temperature (Figure 3b). Further truncation of the N-terminal sequence by one additional residue, the surface exposed Gln242, significantly decreased cytoplasmic levels of the expressed domain, whereas inclusion of the next buried residue, Trp243, in the deletion of the entire A β-strand (U246–320), completely abolished expression of the isolated UMD (Figure 3c). Hence, the A β-strand is important for UMD folding, and despite its apparently elevated mobility should be viewed as a part of the core structure of this domain.
Structural similarity between the middle domain and F1 capsular subunit Caf1
The Caf1 donor strand complemented (Caf1dsc) fibre subunit (PDB accession code 1P5U) and UMD show similar overall topology (Figure 4c). To check the structural similarity between these proteins we superimposed their structures. Despite the clear difference in size of the proteins, superposition revealed significant structural similarity (Figure 4d). Of 71 core Cα atoms of UMD, 60 were superimposed with the corresponding atoms on Caf1dsc with an rmsd of 1.78 Å. Interestingly, the structurally similar part of Caf1dsc is in the top half of the subunit, as viewed in a vertically positioned fibre. The finding of high similarity with a specific part of the subunit indicates that UMD may represent a case of molecular mimicry, playing a role as a subunit-substituting protein, a ‘dummy subunit’.
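For reference, the rmsd quoted for a superposition is the root mean square distance between paired Cα positions after the optimal fit. A minimal Python sketch of that final calculation is given below. It assumes that the two coordinate lists have already been superimposed and paired by a structure-alignment program, so it does not reproduce the superposition step itself; the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD over paired, already superimposed Cα positions (units as input)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def coverage(n_matched, n_core):
    """Fraction of core Cα atoms included in the superposition."""
    return n_matched / n_core


# 60 of the 71 core Cα atoms of UMD were matched to Caf1dsc
matched_fraction = coverage(60, 71)
```

Reporting the rmsd together with the matched fraction (60 of 71 core Cα atoms here) matters because an rmsd computed over a small subset of atoms can look deceptively low.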
Membrane assembly of Caf1A constructs lacking the middle domain
Topology analysis (Figure 1) suggested a periplasmic location for UMD and that the UMD is bound on either side by membrane spanning β-strands. As the UMD was shown to be an autonomously folding soluble domain, we reasoned that UMD could be deleted, leaving an amphipathic β-barrel. Hence, to investigate the requirement of UMD for F1 assembly, a series of four internal deletions were created within the caf1A gene in the plasmid p12R. One deletion, caf1AΔS236–V315 (AΔ4), precisely removed the immunoglobulin-like fold of UMD, whereas a second caf1AΔY215–V315 (AΔ2) removed the entire predicted UMD domain, leaving two residues following an absolutely conserved glycine residue at the end of the fourth predicted TM strand. As TM predictions for residues upstream of this fourth predicted TM strand were quite high, complicating identification of the fourth TM strand, caf1AΔI205–V315 (AΔ1) and caf1AΔQ192–V315 (AΔ3) were also created to ensure removal of the complete periplasmic middle domain and to remove the fourth TM strand respectively. A protein of approximately the correct size (expected sizes of 78.856 kDa, 80.067 kDa, 77.414 kDa and 82.334 kDa for AΔ1–4 respectively) was present in OM preparations from each of these constructs, and notably the SDS/PAGE profile of AΔ4 was very similar to that of wild-type (Figure 5a). Analysis of the mobility of Caf1A and Caf1AΔ4 on SDS/PAGE following pre-incubation in sample buffer at different temperatures confirmed that Caf1AΔ4 mimicked the heat-modifiable profile of wild-type Caf1A (Figure 5b). At 25 °C and 55 °C, three species were present, the fully denatured monomeric form, a faster migrating doublet migrating approx. 20 kDa faster than the fully denatured form and an oligomeric form, consistent with that of a dimer (181.4 kDa for wild-type Caf1A and 164.7 kDa for Caf1AΔ4). PapC usher was reported to function as a dimeric twin-pore oligomer.
Thus the presence of this oligomeric form and its resistance to heat denaturation (resistant at 55 °C, denatured at 100 °C) are good evidence that Caf1AΔ4 is assembled in the membrane. The other deletion mutants also formed the heat-sensitive oligomeric species, although less of the faster migrating doublet was formed, suggesting that they are also assembled, but possibly in a less native conformation.
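The sizes quoted above can be cross-checked with simple arithmetic. The following minimal Python sketch uses only the deletion boundaries and expected masses stated in the text: it counts the residues removed by each deletion, confirms that a larger deletion corresponds to a smaller expected protein, and doubles each monomer mass to give the size expected for a dimer. The variable and function names are illustrative only and are not part of the original analysis.

```python
# Deletion boundaries (first and last residue removed), as stated in the text.
deletions = {
    "Adelta1": (205, 315),  # caf1A deletion I205-V315
    "Adelta2": (215, 315),  # caf1A deletion Y215-V315
    "Adelta3": (192, 315),  # caf1A deletion Q192-V315
    "Adelta4": (236, 315),  # caf1A deletion S236-V315
}

# Expected monomer masses in kDa, as stated in the text.
expected_kda = {
    "Adelta1": 78.856,
    "Adelta2": 80.067,
    "Adelta3": 77.414,
    "Adelta4": 82.334,
}


def residues_removed(first, last):
    """Number of residues in an inclusive deletion range."""
    return last - first + 1


removed = {name: residues_removed(*span) for name, span in deletions.items()}

# Constructs ordered from largest to smallest deletion ...
by_deletion = sorted(removed, key=removed.get, reverse=True)
# ... and from smallest to largest expected mass; the two orders should agree.
by_mass = sorted(expected_kda, key=expected_kda.get)

# Mass expected for a dimer of each construct, rounded to 0.1 kDa.
dimer_kda = {name: round(2 * mass, 1) for name, mass in expected_kda.items()}

for name in by_deletion:
    print(name, removed[name], "residues removed;",
          expected_kda[name], "kDa monomer;", dimer_kda[name], "kDa dimer")
```

For AΔ4 the doubled monomer mass agrees with the 164.7 kDa quoted for the oligomeric species, which is the basis for describing that species as consistent with a dimer.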
Presence of the middle domain is essential for Caf1 secretion
The ability of Caf1AΔ2 and Caf1AΔ4 to support surface assembly of F1 polymer was assessed by an in vivo immunofluorescence assay. In contrast with E. coli cells expressing the wild-type operon (p12R), and even with cells grown under conditions where the wild-type operon is essentially repressed (E. coli/p12R grown in 0.6% glucose), no Caf1 could be detected on the surface of E. coli cells expressing the mutated Caf1A protein from either p12R-AΔ2 or p12R-AΔ4 (Figure 6a). Instead, accumulation of periplasmic Caf1, bound to chaperone, was observed with all four mutants (Figure 6b).
The middle domain does not bind chaperone–subunit complexes
On the basis of the structural similarity observed between UMD and fibre-inserted Caf1, we considered the possibility that UMD might be involved in orientating the subunit within the pore by directly binding the subunit and/or chaperone. We have previously shown that the N-terminal domain of Caf1A (UND, residues 1–137) participates in recognition of Caf1M–Caf1 complexes by Caf1A (A. Dubnovitsky, A.V. Zavialov, X. Yu and S. D. Knight, unpublished work). To directly compare binding of Caf1M–Caf1 complexes by UMD and UND, we investigated the ability of agarose-bound Caf1M–Caf1 complexes to trap soluble UMD or UND. In this system, we used a previously designed version of the Caf1 subunit with the N-terminal donor sequence replaced by a 6-His tag. This modification arrests Caf1 polymerization, allowing accumulation of stoichiometric Caf1M–Caf16H (binary) complexes [6,8]. Caf1M–Caf16H complexes were bound to Ni-NTA–agarose beads (QIAGEN) via the 6-His tag. The Caf1M–Caf1 beads were then incubated with UMD- or UND-containing solutions and recovered, and bound protein was eluted with imidazole. Analysis by SDS/PAGE revealed a band of UND (Figure 6c, right panel), but no band of UMD co-eluting with the Caf1M–Caf16H complex (Figure 6c, left panel). No binding between UMD and larger Caf1M–Caf1n complexes could be observed in our ion-exchange-based binding tests (results not shown).
Functional role of the middle domain
Our results suggest that UMD is not involved in directly binding Caf1M–Caf1 chaperone–subunit complexes, but that its presence is required for assembly. There could be several different explanations for the requirement of UMD for export. The similar structure and cross-sectional dimensions of UMD and fibre-inserted subunit suggest that UMD might function as a dummy subunit, mimicking a fibre-incorporated subunit and plugging the translocation pore prior to initiation of assembly and secretion. This is consistent with the recent demonstration of the location of the PapC UMD across the central barrel of PapC usher. Plugging the pore might be important to prevent leakage of large periplasmic components through the outer membrane. However, if this were the only role of UMD, one might expect ΔUMD-Caf1A to be functional in assembly and secretion. As this was not the case and the accumulated Caf1 in the periplasm was not accessible to external antibody, UMD appears also to fulfil some other role, possibly in maintaining the very large usher β-barrel in a translocation-competent conformation.
Structural comparison of middle domain from FGS and FGL chaperone/usher translocation systems
Caf1A and PapC usher proteins belong to different functional families of chaperone/usher secretion systems, the FGL and FGS chaperone/usher pathways, which assemble structurally and functionally different organelles. Subunits of these two types of organelles show no sequence homology and display significant structural differences [1,2]. To elucidate structural adaptation of FGL and FGS chaperone/usher translocation systems for the secretion of different types of fibres, we compared the available structures of the middle domains from Caf1A (the isolated UMD) and PapC (the domain located within the usher barrel).
Structural superposition of the two middle domains immediately revealed a single principal difference in their fold (Figure 7a): whereas the isolated Caf1A UMD has a classical seven-stranded Ig fold (Figure 4c), PapC UMD lacks the A β-strand and is an incomplete Ig fold with a six-stranded β-sandwich, consisting of β-sheets BE and DCFG. To compensate for the absent A β-strand, B and E β-strands in PapC are shifted by 1.4–2 Å in the direction of the edge β-strand G of the opposite β-sheet, creating a more compact structure. This conformational difference is reflected in an elevated rmsd between corresponding Cα atoms in B and E β-strands of Caf1A and PapC (1.64 Å), contrasting with a low rmsd value estimated for strands of the opposite β-sheet, DCFG (0.72 Å). While this could be a real topological difference between FGL and FGS ushers, it may reflect the two different states of UMD studied, with important functional implications, as discussed below. Another large difference between the middle domains of Caf1A and PapC is found in the conformation of the loop between the E and F β-strands. In PapC, the central residue in the EF loop, Val302, is involved in the hydrophobic core formation. To be able to interact with hydrophobic core residues, Val302 pulls the central part of the EF loop towards the core structure. In contrast, the corresponding region in Caf1A consists of glycines, which are placed 3–5 Å higher above the core structure of the domain. This conformation is partially stabilized by the hydrogen bonding of the following Ser289 with the G β-strand, which in Caf1A is longer than that in PapC.
Despite the topological difference, the core segments of both UMDs (excluding the N-terminal sequence) show high structural similarity: 62 of 71 core Cα atoms of Caf1A UMD (residues 248–286 and 289–311) were superimposed on the corresponding atoms of PapC UMD (263–301 and 303–325) with an rmsd of only 1.2 Å (pairs of atoms with distance exceeding 3.7 Å were excluded from superposition). Donor-strand-complemented Caf1 and PapE subunits, transported via Caf1A and PapC ushers respectively, are structurally much more dissimilar, displaying an rmsd of 1.94 Å for 87 Cα atoms superposed within 3.7 Å. The much higher structural conservation of UMD compared with that of the subunits suggests that the channel is universally adapted for the transport of organelle subunits of different structure and shape.
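The rmsd values quoted above are calculated over Cα pairs that lie within a 3.7 Å cutoff after superposition. A minimal Python sketch of that final filtering and averaging step is given below. It assumes that the two coordinate sets have already been optimally superposed and paired atom-by-atom (the superposition program is not named in this section). The function name `rmsd_within_cutoff` and the toy coordinates are hypothetical and are not data from the study.

```python
import math


def rmsd_within_cutoff(coords_a, coords_b, cutoff=3.7):
    """Return (rmsd, n_pairs) over paired atoms no further apart than `cutoff`.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples in
    angstroms, already superposed and paired (an assumption of this sketch).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be paired one-to-one")

    squared = []
    for a, b in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        # Pairs further apart than the cutoff are excluded from the statistic.
        if math.sqrt(d2) <= cutoff:
            squared.append(d2)

    if not squared:
        raise ValueError("no atom pairs within the cutoff")

    return math.sqrt(sum(squared) / len(squared)), len(squared)


# Toy example (hypothetical coordinates): two pairs 1 A apart, one pair 5 A apart.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
set_b = [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (2.0, 5.0, 0.0)]

rmsd, n_pairs = rmsd_within_cutoff(set_a, set_b)
print(f"rmsd = {rmsd:.2f} A over {n_pairs} of {len(set_a)} pairs")
```

Because distant pairs are dropped before averaging, both the rmsd and the number of retained pairs (for example 62 of 71 for the two UMDs) need to be reported together for the comparison to be meaningful.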
Identification of conserved UMD–channel wall contacts
Caf1A and PapC not only belong to functionally different secretion systems (FGL and FGS respectively), but they are also members of distantly related phylogenetic families of ‘classical’ ushers (γ and π respectively). Hence the structural comparison of these distantly related ushers, which share only 25 identical residues (33%) in their most conserved middle domain sequences, could be applied to identify conserved surface residues contributing to the UMD–channel wall contact.
Remaut et al. identified several ionic interactions, which they proposed may influence the position of the PapC UMD inside the translocation channel. Surprisingly, according to our superposition, none of the PapC residues involved in these ionic interactions (Arg281, Arg303, Arg305 and Arg316) are identical or even similar to those in corresponding positions in Caf1A. Nevertheless, one such ionic interaction might still be formed in Caf1A. In PapC, Arg305, within the middle domain, forms a salt bridge with Glu467 on the channel wall, which in turn interacts with Arg237 located within a β5–6 hairpin of the wall. In place of Arg305, Caf1A contains a highly conserved Glu291 (Figures 4b and 7a). At the same time, instead of the channel wall glutamate, Caf1A contains several lysines in this region (KPKNK) that could form electrostatic interactions with Glu291 and an aspartate residue replacing the PapC Arg237 within the conserved hairpin sequence (FRS). In PapC, most of the ionic interactions are formed between UMD and either a small α-helix or β-hairpin extending from the channel wall. Although alignments indicate the presence of the similar structural features in Caf1A, differences in arrangement of ionic interactions may reflect localized differences in the structure of the channel and movement of UMD.
Caf1A and PapC middle domains contain a limited number of identical residues on their surface. Interestingly, a majority of these residues, including the highly conserved Ile252 and Gly276 (Figure 4b) and relatively conserved Thr255, Pro274 and Pro277, are closely positioned in space (Figure 7). Together with Leu272, which in PapC is replaced by Met287, these residues form a hydrophobic patch on the surface (Figure 7b). The patch is further extended by the aliphatic part of side chains of residues Asp300 in Caf1A and Asn314 in PapC (Figure 7), which belong to a highly conserved Asp/Asn populated position in the usher alignment (Figures 1c and 4b). In PapC, all these residues form extensive hydrophobic interactions with the inner surface of the β barrel, dominated by the side chains of Tyr540, Tyr601 and Tyr622 or aliphatic parts of side chains of some polar residues (results not shown). The high conservation of these residues as well as conformational similarity of their side chains in distantly related ushers suggests that this hydrophobic patch is likely to be involved in formation of UMD–channel wall contact in all ‘classical’ usher proteins. These hydrophobic interactions, at the ‘tip’ of the middle domain, are likely to play a more important role in the lateral positioning of UMD inside of the translocation channel than the less conserved electrostatic interactions.
Hypothesis: six/seven β-stranded conformation switching and capping/uncapping of the usher channel
Our results showed that the A β-strand is important for the stability of isolated UMD of Caf1A. They also implied some mobility of this sequence. Threading of Caf1A and PapC sequences on the 6- and 7-stranded β-sandwiches of UMD of PapC and Caf1A respectively suggested that both sequences are compatible with both folds (results not shown). This raises the fascinating possibility that the 7-stranded topology presented here represents an alternate state of UMD common to all ushers. In this scenario, when located within the translocation channel, the 6-stranded UMD would be stabilized by surface contact(s) with the channel wall, whereas the released 7-stranded structure would be independently stable. In the PapC 6-stranded structure, the sequence corresponding to the additional Caf1A UMD β-strand plus the upstream sequence form a long flexible strand. However, both this sequence and the C-terminus of UMD are more or less fixed by transmembrane β-strands. Thus, on formation of the additional β-strand, switching from the 6- to the 7-stranded β-sandwich, the N-terminal linker would both contract in length and change its tethering position, leading to tension on the UMD and re-orientation of the entire domain, as schematically illustrated in Figure 8. Such a mechanism would allow a co-operative switching of Caf1A from a closed (plugged by UMD) pore conformation to an open one where UMD has moved out of the pore or docked inside the channel in a vertical orientation (Figure 8b). The critical requirement of residues in the PapC N-terminal linker sequence for pilus biogenesis supports involvement of this sequence in opening of the channel. Our ongoing structural studies of the open conformation of the full-length usher will help to test this hypothesis.
This work was supported by grants from the Swedish Research Council [grant numbers SRC-6212005-5360 (to A. V. Z.) and SRC-2006-4297 (to S. D. K.)] and BBSRC (Biotechnology and Biological Sciences Research Council), U.K. (to S. M.) [grant number BBRSC-5147821].
The structural co-ordinates reported for Caf1A middle domain (232–320 amino acid region) will appear in the Protein Data Bank under accession code 3FCG.
Abbreviations: DSC, donor strand complementation; DSE, donor strand exchange; FGL, F1G1 long; FGS, F1G1 short; GdmCl, guanidinium hydrochloride; ΔHm, van't Hoff enthalpy of melting; LB broth, Luria–Bertani broth; OM, outer membrane; PEG, poly(ethylene glycol); Psi-BLAST, position-specific iterated BLAST; SAD, single-wavelength anomalous dispersion; Tm, melting temperature; TM, transmembrane; UCD, usher C-terminal domain; UMD, usher middle domain; UND, usher N-terminal domain; UTBD, usher TM β-barrel domain
- © The Authors Journal compilation © 2009 Biochemical Society