The secondary structure of lipase 1 from Candida rugosa, a model system for large monomeric enzymes, has been studied by FTIR (Fourier-transform infrared) spectroscopy in water and 2H2O. The secondary structure content, determined by the analysis of the amide I band absorption through second derivative and curve fitting procedures, is in agreement with that estimated by X-ray data and predicts, in addition, the existence of two classes of α-helices. We have also investigated the enzyme stability and aggregation at high temperature by following the protein unfolding. The thermal stability determined by FTIR is in excellent agreement with the temperature dependence of the lipase activity. Furthermore, new insights on the glycosylation of the recombinant protein produced in Pichia pastoris and on its heterogeneity related to different fermentation batches were obtained by the analysis of the IR absorption in the 1200−900 cm−1 carbohydrate region. A drastic reduction of the intensity of this band was found after enzymic deglycosylation of the protein. To confirm that the FTIR absorption in the 1200–900 cm−1 region depends on the carbohydrate content and glycoform distribution, we performed an MS analysis of the protein sugar moieties. Glycosidic structures of the high mannose type were found, with mannoses ranging from 8 to 25 residues.
- Candida rugosa lipase
- conformational stability
- heterologous protein glycosylation
- infrared spectroscopy
- secondary structure
The post-genomic era is witnessing the high-throughput production of recombinant proteins from both prokaryotic and eukaryotic organisms. In this scenario, the availability of simple techniques providing insight into their structural, conformational and functional properties would be of great help.
FTIR (Fourier-transform infrared) spectroscopy is emerging as a powerful technique for the determination of the secondary structure of proteins in solution, with no restriction on their molecular mass. As widely reviewed in the literature [1–4], quantitative information on the secondary-structure elements of the protein can be obtained by the analysis of the amide I absorption in the 1700–1600 cm−1 region. This band originates from the C=O stretching vibration of the peptide group, whose frequency depends on the hydrogen-bonding and coupling along the protein chain, and is therefore sensitive to the protein conformation. Prediction of the protein secondary structure can be obtained by the decomposition of the amide I band into its components through curve fitting procedures [2,5,6].
Here, we present an FTIR study of the enzyme CRL1 (Candida rugosa lipase 1) with the purpose of assessing the usefulness of this technique for the analysis of conformation and post-translational modifications of this enzyme, taken as a model for large monomeric enzymes of interest in both basic and applied research.
CRL1 is a globular protein of 57 kDa, produced as a recombinant product from Pichia pastoris. Its crystal structure has been determined at 2 Å (1 Å=0.1 nm) resolution [9,10]. Information about the structure of CRL1 in solution is not available to date and cannot be obtained by NMR because of the large size of the protein. However, it is well known that lipase activity is strongly conformation-dependent and extremely sensitive to experimental conditions, such as detergents, pressure and solvents. The most significant conformational change, which is crucial for the enzymic activity, is the movement of a surface loop, the lid, on enzyme activation by interaction with the substrate. Noinville et al. have recently investigated this phenomenon in the Humicola lanuginosa lipase through FTIR-attenuated total reflection spectroscopy. Indeed, conformational changes and structural stability have been successfully studied by FTIR spectroscopy for a large number of proteins, including lipases [11–16].
In the present study, we report the determination by FTIR spectroscopy of the CRL1 secondary structure in water solution and in 2H2O. Its quantitative composition in water was compared with X-ray data. We have also investigated the enzyme stability and aggregation at high temperature by monitoring protein unfolding in the temperature range 20–100 °C.
Furthermore, we explored the potential of FTIR spectroscopy for the study of the recombinant protein glycosylation through the analysis of the carbohydrate band in the region 1200–900 cm−1. The new results presented in this study strongly indicate that FTIR spectroscopy can offer a reliable tool for the characterization of this post-translational modification, which is a major issue in recombinant protein production.
Production, purification and deglycosylation of recombinant lipase 1
Recombinant CRL1 was expressed and secreted by P. pastoris as described previously. To decrease the secretion of contaminant proteins, cultures were grown for 7 days in shaking flasks (shaking rate, 150 rev./min) at 30 °C in a minimal medium containing 1.3% (w/v) yeast nitrogen base, 2% (w/v) glucose and 0.1 M phosphate buffer (pH 6). Lipase was recovered from the culture supernatant by centrifugation at 20000 g and concentrated by tangential-flow filtration against deionized water with a Minitan System (Millipore, MA, U.S.A.). CRL1 was then purified by chromatography on phenyl-Sepharose HP (Amersham Biosciences, Uppsala, Sweden) in an FPLC-GP250 PLUS apparatus (Amersham Biosciences). Elution was obtained at a flow rate of 1.5 ml/min, by a 100–0% linear gradient of 1 M (NH4)2SO4 in 10 mM Tris/HCl (pH 7.4). After dialysis against deionized water, the protein solution for FTIR measurements in water was concentrated to 10–20 mg/ml by using Amicon 30 kDa cut-off filter devices (Centricon Plus-20; Millipore, Bedford, MA, U.S.A.). Protein concentration was determined by the method of Bradford, taking BSA as a standard and using the Bio-Rad kit (Bio-Rad).
For FTIR measurements in 2H2O, the same sample was freeze-dried and re-dissolved in 2H2O.
Approximately 70 μg of CRL1 purified by hydrophobic-interaction chromatography was deglycosylated in two successive steps by the addition of a total of 100 units of PNGase F (peptide N-glycosidase F; Sigma). CRL1 was denatured by heating (4 min at 99 °C) before the addition of the first 50 units of PNGase F, and the reaction was performed for 48 h at 37 °C in water (final volume, 55 μl). After the addition of a further 50 units of enzyme, the reaction mixture was incubated at 37 °C for another 24 h. For the FTIR study, released sugars were removed by centrifugation on Amicon 30 kDa cut-off filter devices (Centricon Plus-20).
FTIR absorption spectra from 4000 to 700 cm−1 were collected in transmission using an FTS-40A spectrometer (Bio-Rad, Digilab Division, Cambridge, MA, U.S.A.) equipped with a DTGS (deuterated triglycine sulphate) detector and air dryer purging system, at 1 and 2 cm−1 resolution, a scan speed of 5 kHz, 256 scans co-addition and triangular apodization. Protein solution (15 μl), at a concentration of 10–20 mg/ml, was placed in a temperature-controlled transmission cell (Wilmad, Buena, NJ, U.S.A.), with BaF2 windows and Teflon spacers of 15 μm for water and 30 μm for 2H2O. For thermal unfolding from 20 to 100 °C, the sample was heated at a rate of 0.2 °C/min, each spectrum being collected every 2 °C.
FTIR data analysis
The protein FTIR spectra were obtained after subtraction of the solvent absorption, collected under strictly identical conditions, by adjusting the subtraction factor until a flat baseline was obtained in the 2000–1700 cm−1 region [4,18,19]. Residual vapour absorption was also subtracted when necessary.
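The choice of subtraction factor can also be estimated numerically. The following is a minimal sketch under stated assumptions, not the procedure used in this study: it takes "flat baseline" to mean minimum variance of the difference spectrum in the 2000–1700 cm−1 window and solves for the factor by least squares. The function names are hypothetical.

```python
def subtraction_factor(wavenumbers, sample, solvent, lo=1700.0, hi=2000.0):
    """Return the factor k for which (sample - k * solvent) has minimum
    variance between lo and hi cm^-1 (assumed meaning of 'flat baseline')."""
    idx = [i for i, nu in enumerate(wavenumbers) if lo <= nu <= hi]
    if len(idx) < 2:
        raise ValueError("need at least two points in the baseline window")
    s = [sample[i] for i in idx]
    w = [solvent[i] for i in idx]
    s_mean = sum(s) / len(s)
    w_mean = sum(w) / len(w)
    # Least-squares slope of sample against solvent inside the window.
    cov = sum((a - s_mean) * (b - w_mean) for a, b in zip(s, w))
    var = sum((b - w_mean) ** 2 for b in w)
    if var == 0:
        raise ValueError("solvent spectrum is constant in the window")
    return cov / var


def subtract_solvent(sample, solvent, k):
    """Point-by-point difference spectrum for a given subtraction factor."""
    return [a - k * b for a, b in zip(sample, solvent)]
```

A constant offset in the sample spectrum does not bias this estimate, because only deviations from the window mean enter the calculation. In practice the result would still need visual inspection, as in the manual adjustment described above.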
Second derivative spectra were obtained following the Savitzky–Golay method (third-degree polynomial, five smoothing points), after an 11-point binomial smoothing of the spectrum.
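The two steps just described can be sketched as follows. This is an illustrative implementation only, not the software used in this study. It relies on the standard five-point Savitzky–Golay second-derivative weights for a quadratic or cubic polynomial, (2, −1, −2, −1, 2)/7, and on normalized binomial coefficients for the 11-point smoothing. Dropping the edge points that the windows cannot fully cover is an assumption made here for simplicity, and the function names are hypothetical.

```python
from math import comb


def binomial_smooth(y, points=11):
    """Smooth y with a normalized binomial kernel.
    Only positions fully covered by the kernel are returned,
    so the output is shorter than the input by (points - 1)."""
    n = points - 1
    kernel = [comb(n, k) / 2 ** n for k in range(points)]
    return [sum(w * y[i + k] for k, w in enumerate(kernel))
            for i in range(len(y) - points + 1)]


# Five-point Savitzky-Golay second-derivative weights (quadratic/cubic fit).
SG_D2_5PT = [2 / 7, -1 / 7, -2 / 7, -1 / 7, 2 / 7]


def sg_second_derivative(y, step=1.0):
    """Second derivative of evenly spaced data with point spacing `step`.
    The output is shorter than the input by 4 points (2 at each end)."""
    return [sum(c * y[i + k] for k, c in enumerate(SG_D2_5PT)) / step ** 2
            for i in range(len(y) - 4)]
```

Applied in sequence, `sg_second_derivative(binomial_smooth(spectrum), step)` returns a trace whose minima mark the band components; with the default settings, index `i` of the result corresponds to index `i + 7` of the input spectrum.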
Curve fitting of the amide I band from 1700 to 1600 cm−1, in water, was performed by GRAMS/32 (Galactic Industries Corporation, Salem, NH, U.S.A.) as a linear combination of Gaussian components. In the fitting, the number of components and initial values of their peak positions were taken from the second derivative spectrum. For the choice of the initial values of the Gaussian intensities, we followed the approach of Arrondo et al. [2,6], performing first an FSD (Fourier-self-deconvolution) of the amide I band [2,5] by Win-IR Pro software (Bio-Rad, Digilab Division). By using an enhancement factor K=2.9 and half-bandwidth HW=14 cm−1, we obtained an FSD spectrum displaying the same number of components and peak positions as the second derivative spectrum (results not shown). The FSD peak intensities, adjusted to fit the amide I absorption profile, were then taken as initial values for the Gaussian intensities [2,6]. As initial values for the Gaussian bandwidths, we chose a constant value from the FSD spectrum. Taking the above values as initial parameters, we performed curve fitting of the amide I band by leaving the parameters free to adjust iteratively, with the only restriction on the peak wavenumbers being to vary within a range of ±2 cm−1.
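For illustration, the model underlying this fit can be written down in a few lines. The sketch below is not the GRAMS/32 procedure: it only shows a sum of Gaussian components parameterized by height, peak position and full width at half maximum, the ±2 cm−1 restriction on the peak positions, and the quantity a least-squares fit would minimize. All function names are hypothetical.

```python
import math


def gaussian_band(nu, height, centre, fwhm):
    """Gaussian band evaluated at wavenumber nu (fwhm = full width at half maximum)."""
    return height * math.exp(-4.0 * math.log(2.0) * ((nu - centre) / fwhm) ** 2)


def amide_i_model(nu, bands):
    """Linear combination of Gaussian components.
    bands is a list of (height, centre, fwhm) tuples."""
    return sum(gaussian_band(nu, h, c, w) for h, c, w in bands)


def clamp_centre(centre, initial, max_shift=2.0):
    """Keep a peak position within +/- max_shift cm^-1 of its initial value."""
    return min(max(centre, initial - max_shift), initial + max_shift)


def sum_squared_residuals(wavenumbers, absorbance, bands):
    """Objective that an iterative least-squares fit would minimize."""
    return sum((a - amide_i_model(nu, bands)) ** 2
               for nu, a in zip(wavenumbers, absorbance))
```

An optimizer would vary the heights, widths and positions to reduce `sum_squared_residuals`, applying `clamp_centre` (or an equivalent bound) to each position at every iteration.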
GC-MS and MALDI-MS (matrix-assisted laser-desorption ionization MS) analyses
For the GC-MS carbohydrate analysis, purified CRL1 was dissolved in 500 μl of 1 M methanolic HCl and heated at 80 °C for 16 h. The sample was then dried and re-N-acetylated at room temperature for 15 min with 50 μl of acetic anhydride in 500 μl of methanol containing 10 μl of pyridine. Monosaccharide trimethylsilylation was performed in 100 μl of N,O-bis(trimethylsilyl)trifluoroacetamide at 70 °C for 15 min. The sample was then dried, dissolved in 50 μl of hexane and used for GC-MS analysis.
The GC-MS analysis was performed on an Agilent Technologies MSD 5973N instrument (Palo Alto, CA, U.S.A.) using a fused silica capillary column (30 m, 0.5 mm internal diameter, 0.25 μm) from Hewlett–Packard. The injection temperature was 250 °C and the oven temperature was increased linearly from 90 to 250 °C at a rate of 8 °C/min. In the resulting chromatogram, peaks corresponding to mannose and NAG (N-acetylglucosamine) were assigned by both retention time and their fragmentation spectra.
For the analysis of protein glycoforms, CRL1 (200 μg) was digested with trypsin in 50 mM ammonium bicarbonate (pH 8.5) for 4 h at 37 °C, using an enzyme-to-substrate ratio of 1:25. The sample was then freeze-dried and deglycosylated by an overnight treatment with PNGase F (0.25 units) in 50 mM ammonium bicarbonate (pH 8.5) at 37 °C. The released N-linked oligosaccharides were purified by Sep-pak chromatography, freeze-dried and N-acetylated with acetic anhydride in pyridine (5:1, v/v) at 80 °C for 2 h. The sample was dried, re-dissolved in 0.1% trifluoroacetic acid and used for MALDI-MS analysis.
MALDI-MS analysis of oligosaccharides was performed on a Voyager DE PRO instrument (Applied Biosystems, Foster City, CA, U.S.A.). Typically, 1 μl of analyte solution was mixed with 1 μl of 2,5-dihydroxybenzoic acid (20 mg/ml) in acetonitrile/0.1% trifluoroacetic acid (70:30, v/v). Mass values are reported as average masses.
RESULTS AND DISCUSSION
FTIR absorption spectrum of CRL1 in solution
CRL1 is known to be a mixture of isoenzymes related to each other by high similarity in their amino acid sequences. Therefore some heterogeneity may be expected in the isoenzyme conformation, in the relative amount of each isoform, as well as in the degree of protein glycosylation. For these reasons, to obtain a reliable evaluation of the secondary structure and other properties such as glycosylation and thermal stability, the present study was performed on the pure recombinant isoform 1 (CRL1) obtained from a synthetic coding sequence. This protein was originally expressed in P. pastoris under the control of a methanol-inducible promoter. However, expression driven by the constitutive glyceraldehyde 3-phosphate dehydrogenase promoter (commercially available in plasmid pGAP from Invitrogen) was found to result in a lower secretion of endogenous Pichia proteins and therefore in a lower level of contaminant proteins. Highly purified, active recombinant lipase was obtained from culture supernatants by hydrophobic chromatography, with a final yield of approx. 50 mg of pure CRL1 from 1 litre of culture.
CRL1 conforms to the α/β hydrolase fold with a complex secondary structure [9,10,21]. The FTIR absorption spectrum in water is reported in Figure 1, after subtraction of solvent and vapour as described in the previous section. The major protein absorption bands due to the peptide group vibrations occur in the 1900–1200 cm−1 spectral region: the amide I band (1700–1600 cm−1) mainly due to the C=O stretching vibrations, amide II (1580–1510 cm−1) due to the N–H bending with a contribution of the C–N stretching vibrations and, with weaker intensity, amide III (1400–1200 cm−1) due to the N–H bending, C–Cα and C–N stretching vibrations. Furthermore, Figure 1 shows in the 1200–900 cm−1 region an important absorption of the protein-associated sugar chains, which will be analysed in a separate section.
Here, we examine the amide I absorption band, since this band is most sensitive to the secondary structure of the protein [2–6]. Figure 2 shows the absorption spectrum of the protein in water in the 1700–1500 cm−1 region obtained at 2 cm−1 resolution, and its second derivative spectrum, where minima allow identification of the absorption band components. Because of their large bandwidth and small spectral separations, these components cannot be directly resolved in the absorption spectrum. They are due to the secondary-structure elements of the protein, since the C=O stretching vibration is differently affected in the different backbone conformations by hydrogen bonds and dipole interactions.
In Figure 2, we also present the protein absorption in 2H2O to confirm the results obtained in water. Owing to the much lower absorption of the 2H2O solvent in the region 1700–1500 cm−1, a better signal-to-noise ratio was obtained and a higher resolution spectrum (1 cm−1) is presented. A shift towards lower wavenumbers was observed for the amide I band in 2H2O, together with a decrease in intensity for the amide II band, both indicating that hydrogen–deuterium exchange occurred in the protein [14,22]. The second derivative amide I spectrum is similar to that of the protein in water. Indeed, the secondary-structure components, highly resolved in 2H2O, occurred at peak positions very close to those found in water (see Table 1), indicating that the secondary-structure peptide hydrogens are not involved in the exchange. Their relative intensities are also similar, except for the peak at 1635.8 cm−1.
Two major bands at 1657.9 and 1649.4 cm−1 can be assigned to the α-helix structures of the protein (see Table 1). The β-sheet component, observed in water at 1630.0 cm−1 with a shoulder around 1637 cm−1, appears well resolved into two components in 2H2O, with a higher intensity for the 1635.8 cm−1 peak, as already reported in the literature for other proteins. Furthermore, the bands around 1682 and 1673 cm−1 can be assigned to turns, whereas that at 1691 cm−1 can be assigned to β-sheet and turns. The 1619.0 cm−1 component can instead be attributed to side chains and aggregates [2,4,22,24].
A similar FTIR amide I assignment has been proposed for the homologous acetylcholinesterase.
CRL1 secondary structure as predicted by curve fitting of amide I
To obtain quantitative information about the protein secondary structures in water, a curve fitting analysis of the amide I absorption band was performed as a linear combination of the components identified in the second-derivative spectrum. These components were approximated by Gaussian functions whose peak positions, widths and heights were adjusted iteratively in the curve-fitting procedure. Since the choice of the initial values is crucial for the result of a fitting with a large number of parameters, the Gaussian initial parameters (peaks, band widths and intensities) were determined following the procedure suggested by Arrondo et al. In the minimization procedure, the parameters of the Gaussian components were left free to adjust iteratively, with the only restriction on the peak wavenumbers being to vary within a range of ±2 cm−1. The outcome of the fitting is presented in Figure 3, where the Gaussian components of the best fit are also reported. Their band positions and areas are given in Table 1. The band area of each component, expressed as a percentage of the total amide I area, can then be taken as a measure of the secondary structure assigned to it, assuming that the peptide bond molar absorptivities are the same for the different secondary structures of the protein [3,5,6]. When the number of adjustable parameters is large, however, a fitting does not necessarily lead to a unique solution. For this reason, the reliability of the fitting procedure was evaluated by the analysis of spectral data obtained for 11 independent samples of CRL1. The S.D. of the measured data and of the best-fit parameters are reported in Table 1. Excellent agreement is found between the band positions of the best-fit Gaussian components and those determined from the second derivative spectra, taking also into account that these parameters were left free to adjust in a wavenumber range of ±2 cm−1.
All these results support, therefore, the reliability of the fitting and the predicted secondary structure of the protein.
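The conversion from fitted Gaussian components to secondary-structure percentages can be sketched as follows. The example assumes, as stated above, equal molar absorptivities for all secondary structures, and uses the closed-form area of a Gaussian of height h and full width at half maximum w, namely h·w·sqrt(π/(4 ln 2)). The function names and any input values are hypothetical and are not data from Table 1.

```python
import math


def gaussian_area(height, fwhm):
    """Integrated area of a Gaussian band with the given height and FWHM."""
    return height * fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


def area_percentages(bands):
    """Express each component area as a percentage of the total amide I area.
    bands maps a label (e.g. a peak position) to a (height, fwhm) tuple."""
    areas = {label: gaussian_area(h, w) for label, (h, w) in bands.items()}
    total = sum(areas.values())
    return {label: 100.0 * a / total for label, a in areas.items()}
```

Percentages for components assigned to the same structure, such as the two α-helix bands, would then be summed to obtain the content of that structure.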
Native conformation and thermal unfolding of CRL1
The native conformation of CRL1 in water solution is summarized in Table 1, with the α-helical structures (1657.9 and 1649.4 cm−1) accounting for approx. 40% of the total structures, β-sheets (1630.0 and 1637.2 cm−1) for 23% and turns (1672.7 and 1682.5 cm−1) for 28%. Despite the higher α-helix and β-sheet content predicted by FTIR (see Table 1), this structure is in good agreement with that obtained by X-ray studies. As already discussed in the literature, this discrepancy can be due to the absorption of unordered structures, which overlaps those of the α-helix and the β-sheet. Because of their large bandwidth, the specific contribution of unordered structures cannot be singled out in the amide I derivative spectrum and cannot therefore be taken into account by the FTIR data analysis.
Moreover, it should be noted that, when the secondary structure of CRL1 determined by FTIR spectroscopy is compared with that obtained from X-ray studies, the percentage of structures evaluated from X-ray data might be affected by uncertainties in the definition of the boundaries of secondary-structure elements. In addition, the different environments in the crystal and solution state could also be responsible for structural differences, particularly in a flexible high-molecular-mass protein such as CRL1 [26,27].
Taking into account these considerations, the agreement between the protein secondary structure as obtained by FTIR and by X-rays can be considered highly satisfactory.
It is noteworthy that FTIR spectroscopy allows the identification of two distinct α-helical structures in the protein, with absorption at 1657.9 and 1649.4 cm−1 (Figures 2 and 3). Also, two β-sheet components are resolved by FTIR spectroscopy. The FTIR evidence for two α-helices in CRL1, as well as for two β-sheets, was also reported for the homologous protein acetylcholinesterase and for several other proteins [27–30].
The presence of two α-helix components reflects a difference in the force constant of the carbonyl vibration. The higher band position at 1657.9 cm−1 corresponds to a stronger carbonyl stretching and weaker hydrogen bonding, leading to more flexible helices. However, this difference in the helix flexibility did not result in any significant hydrogen–deuterium exchange (Figure 2).
Having determined the secondary structure of native CRL1, its thermal unfolding was investigated in water and 2H2O by heating the sample at a linear rate of 0.2 °C/min. The temperature dependence of the amide I second derivative spectrum (Figure 4a) showed that the native structure of CRL1 in water is preserved up to 50 °C. At higher temperatures, a loss of both α-helix and β-sheet structures is observed. Interestingly, the results obtained by using 2H2O enabled us to monitor the unfolding of each secondary-structure element better than in water, since they appeared well separated in 2H2O (Figure 4b).
The temperature dependence of the amide II band, at 1548.7 cm−1 in water (Figure 4a), follows that of the secondary structure components of the amide I band.
At approx. 64 °C in water and 70 °C in 2H2O, the α-helix and β-sheet components decreased simultaneously to 50% of their initial values, as shown in Figures 5(a) and 5(b). An increase in the midpoint transition temperature in 2H2O is a feature shared with other proteins.
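A midpoint temperature of this kind can be read from an intensity-versus-temperature series by linear interpolation. The sketch below is illustrative only: it assumes the intensities are supplied as positive magnitudes (second-derivative minima would first be converted to absolute values) and that the first point of the series represents the native state. The function name is hypothetical.

```python
def midpoint_temperature(temps, intensities, fraction=0.5):
    """Temperature at which the intensity first falls to `fraction` of its
    initial value, found by linear interpolation between adjacent points.
    Returns None if the series never crosses that level."""
    target = fraction * intensities[0]
    points = list(zip(temps, intensities))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        # Look for the first interval in which the signal drops through target.
        if y0 >= target >= y1 and y0 != y1:
            return t0 + (y0 - target) * (t1 - t0) / (y0 - y1)
    return None
```

With spectra recorded every 2 °C, as in this study, the interpolation would refine the midpoint within one 2 °C interval.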
Furthermore, the loss of secondary structure was accompanied by the simultaneous appearance of new bands, at 1625 and 1696 cm−1 in water, due to protein-aggregated structures induced by the thermal treatment [3,24]. In 2H2O, these bands were found at 1619 and 1687 cm−1, a result that clearly shows that hydrogen–deuterium exchange takes place in the aggregates.
Thermal denaturation experiments validated, therefore, the use of FTIR spectroscopy to monitor CRL1 unfolding and aggregation, since the stability of the secondary structure correlates well with the temperature dependence of enzymic activity. From this perspective, we believe that FTIR spectroscopy might provide a valuable analytical tool for detecting conformational changes relevant for protein function induced, for example, by site-directed mutagenesis or by processing steps in biocatalysis, such as immobilization or freeze-drying [12,32,33].
CRL1 carries three consensus sequences for N-glycosylation at Asn291, Asn314 and Asn351. Crystallographic analysis of native, purified CRL1 showed sugars linked only at positions 314 and 351. Accordingly, replacement of Asn314 and Asn351 by mutagenesis strongly decreased the enzymic activity, suggesting a functional role for sugar chains in maintaining the enzyme-active conformation. It is known that recombinant proteins expressed in P. pastoris carry oligosaccharide chains composed of two NAG units and a variable number of mannose residues, depending on the fermentation conditions. The carbohydrate content of recombinant CRL1 expressed in P. pastoris was evaluated to be 5% of its weight, but the composition of the sugar moiety was not investigated further.
One of the goals of the present study was to explore the potential of FTIR spectroscopy for the analysis of protein glycosylation. The absorption spectra of protein samples obtained from different fermentations and purification batches are reported in Figure 6 for the region 1750–900 cm−1, after normalization at the amide I peak. Remarkable differences can be observed in the intensity of the carbohydrate band in the region 1200–900 cm−1. These differences are expected to reflect the heterogeneity in protein glycosylation, which might occur in different fermentation batches. First, the effect of the enzymic deglycosylation on the intensity of the IR carbohydrate band was investigated. A sample of recombinant CRL1 was subjected to treatment with the deglycosidase PNGase F until full deglycosylation was observed in SDS/PAGE (Figure 7a). Comparison of the absorption spectra of the protein before and after enzymic treatment (Figure 7b) showed a drastic reduction in the band intensity after the enzyme treatment, confirming that the 1200–900 cm−1 band is due to the absorption of protein carbohydrates. Interestingly, a residual carbohydrate band was observed in the FTIR spectrum of the PNGase F-treated protein (Figure 7b), suggesting the presence of a small amount of residual sugars. This result, which cannot be obtained by SDS/PAGE, relies on the high sensitivity of FTIR to carbohydrates.
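The comparison between batches described above can be made quantitative along the following lines. This is a minimal sketch, assuming that each spectrum is scaled to unit absorbance at its amide I maximum (taken here as the largest value between 1700 and 1600 cm−1) and that the carbohydrate signal is summarized by the trapezoidal area between 1200 and 900 cm−1. The function names are hypothetical, and integration is one possible measure of band intensity rather than the one used in this study.

```python
def normalize_to_amide_i(wavenumbers, absorbance, lo=1600.0, hi=1700.0):
    """Scale a spectrum so that its maximum within the amide I window equals 1."""
    peak = max(a for nu, a in zip(wavenumbers, absorbance) if lo <= nu <= hi)
    return [a / peak for a in absorbance]


def band_area(wavenumbers, absorbance, lo=900.0, hi=1200.0):
    """Trapezoidal area of the spectrum between lo and hi cm^-1."""
    pts = sorted((nu, a) for nu, a in zip(wavenumbers, absorbance)
                 if lo <= nu <= hi)
    return sum(0.5 * (a0 + a1) * (nu1 - nu0)
               for (nu0, a0), (nu1, a1) in zip(pts, pts[1:]))
```

The ratio of `band_area` values for two normalized spectra then gives a simple relative measure of carbohydrate content, for example before and after PNGase F treatment.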
To confirm that the FTIR spectral differences observed in the 1200–900 cm−1 region are due to a different carbohydrate content and glycoform distribution, we performed a detailed structural characterization of the glycosidic moiety of the recombinant protein essentially using MS methodologies [37–39]. As a first step, the carbohydrate content of CRL1 was determined by GC-MS (results not shown). The chromatogram showed the occurrence of peaks corresponding to mannose and NAG components, identified through their retention time and individual fragmentation spectra, thus excluding the presence of other carbohydrates in the samples.
The heterogeneity in the protein glycosylation was studied by MALDI-MS analysis of the mixture of glycoforms released from CRL1. The mass spectral analysis of the oligosaccharides is reported in Figure 8, where glycoforms can be identified by their unique mass values [37,39]. Only glycosidic structures of the high mannose type were observed, with mannoses ranging from 8 to 25 residues. The most abundant species contain a total of 11 mannose residues. The glycoform distribution of CRL1 from different productions displayed a similar pattern, with minor but significant differences in the population of high mannose chains.
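The principle of assigning a glycoform from its mass can be illustrated with a short calculation. The sketch below computes average masses of neutral, underivatized high-mannose compositions (n mannose residues plus two NAG units) from standard residue masses. These constants are general reference values, not figures taken from this study. Because the oligosaccharides analysed here were treated with acetic anhydride before MALDI-MS and are detected as ions, the peak positions in Figure 8 will differ from the values this code returns. The function names are hypothetical.

```python
# Standard average residue masses in Da (assumed reference values).
HEXOSE = 162.1406   # anhydro-hexose residue, e.g. mannose
HEXNAC = 203.1925   # anhydro-N-acetylhexosamine residue, e.g. NAG
WATER = 18.0153     # one water completes the free oligosaccharide


def high_mannose_mass(n_mannose, n_nag=2):
    """Average mass of a neutral, underivatized oligosaccharide
    containing n_mannose mannose and n_nag NAG residues."""
    return n_mannose * HEXOSE + n_nag * HEXNAC + WATER


def mannose_count(mass, n_nag=2, tolerance=1.0):
    """Number of mannose residues matching an underivatized neutral mass,
    or None if no composition lies within `tolerance` Da."""
    n = round((mass - WATER - n_nag * HEXNAC) / HEXOSE)
    if n >= 0 and abs(high_mannose_mass(n, n_nag) - mass) <= tolerance:
        return n
    return None
```

In this underivatized picture, adjacent glycoforms differ by one hexose residue, so a high-mannose series appears as a ladder of evenly spaced peaks.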
Differences in the intensity of the 1200–900 cm−1 band could therefore be due to different glycoform distributions of the protein samples in Figure 6. It is interesting to observe that the secondary structure of these samples, determined by the amide I band analysis described in a previous section, was found to be constant and not affected by the different glycoform distributions, namely by the mannose content of the protein. Enzymic activities of these samples were also similar. These results indicate that the heterogeneity in the mannose content does not affect the protein structure and function, which is expected to be critically dependent on the core of NAG glycosylation [9,10].
In conclusion, analysis of IR absorption in the 1200–900 cm−1 region reported in the present study enabled us to monitor a different mannose content in proteins produced in different fermentation batches. Macro- and microheterogeneity in glycosylation is a major issue in recombinant protein production, since it can affect activity, biological function and antigenicity of the final product. It is well known that the composition of the sugar moiety at the glycosylation sites is dependent not only on the host used for heterologous expression but also on the culture conditions.
We gratefully acknowledge P. Pucci for his help with the MS analysis. This work was partially supported by grants from Fondo di Ateneo per la Ricerca (Università Milano-Bicocca; ex MURST 60%) to S. M. D. and M. L.
Abbreviations: CRL1, Candida rugosa lipase 1; FSD, Fourier-self-deconvolution; FTIR, Fourier-transform infrared; MALDI-MS, matrix-assisted laser-desorption ionization MS; NAG, N-acetylglucosamine; PNGase F, peptide N-glycosidase F
- The Biochemical Society, London