GA (glucoamylase) hydrolyses starch and polysaccharides to β-D-glucose. RoGA (Rhizopus oryzae GA) consists of two functional domains, an N-terminal SBD (starch-binding domain) and a C-terminal catalytic domain, which are connected by an O-glycosylated linker. In the present study, the crystal structures of the SBD from RoGA (RoGACBM21) and the complexes with β-cyclodextrin (SBD–βCD) and maltoheptaose (SBD–G7) were determined. Two carbohydrate binding sites, I (Trp47) and II (Tyr32), were resolved and their binding was co-operative. Besides the hydrophobic interaction, two unique polyN loops comprising consecutive asparagine residues also participate in the sugar binding. A conformational change in Tyr32 was observed between unliganded and liganded SBDs. To elucidate the mechanism of polysaccharide binding, a number of mutants were constructed and characterized by a quantitative binding isotherm and Scatchard analysis. A possible binding path for long-chain polysaccharides in RoGACBM21 was proposed.
- carbohydrate-binding module
- Rhizopus oryzae glucoamylase
- starch-binding domain
Starch, the primary source of stored energy in plants, is composed of amylose and amylopectin. The former consists almost entirely of α-(1,4)-D-glucopyranose units; however, a few α-1,6 branches and linked phosphate groups may be found [1,2]. The latter is composed of α-(1,4)-D-glucose segments connected by approx. 5% α-1,6 branching sites. Both amylose and amylopectin fold into helical structures [4,5] and are further organized into the semicrystalline granular form. In the crystalline state, amylose forms double helices, whereas in solution it is a flexible chain that can adopt a locally helical shape. In animals, glycogen, similar to the starch in plants, is the storage form of glucose. Glycogen is structurally similar to amylopectin, except that it generally has more branching sites and glucose units. GA (glucoamylase), also known as amyloglucosidase or γ-amylase (EC 3.2.1.3), is a biocatalyst capable of hydrolysing α-1,4 and α-1,6 glycosidic linkages in raw starches and related oligosaccharides to produce β-D-glucose [7,8]. The GA from Rhizopus oryzae (RoGA) comprises a C-terminal catalytic domain, classified as a member of GH15 (glycoside hydrolase family 15), connected to an N-terminal SBD (starch-binding domain) by an O-glycosylated linker. The CAZy (http://www.cazy.org/) classification, which is based on primary sequence similarity, groups the SBD of RoGA into CBM (carbohydrate-binding module) family 21 (RoGACBM21) [11,12].
To date, 51 CBM families have been classified, and eight of them (families 20, 21, 25, 26, 34, 41, 45 and 48) have starch-binding activity. The dissociation constants, Kd, for the binding of SBDs to starch are in the micromolar range [14,15]; however, SBDs associated with GA only appear in two families (CBM20 and CBM21). Previous reports on structure-based molecular modelling and NMR spectroscopy of RoGACBM21 revealed that, although these two families share very low sequence identity, they have similar biological function and structural folding. Accordingly, the most recent evolutionary study on CBM20 and CBM21 proposes that these two SBD families be grouped into a new CBM clan. However, the detailed binding mechanisms between CBM21 and starch/glycogen remain unclear.
SBDs can be functionally independent of the catalytic domains and have been proposed to increase the hydrolysis of granular starch by disrupting its structure and concentrating the catalytic domains on the surface of starch [14,18–20]. Ligand binding of RoGACBM21 with starch and soluble oligosaccharides was determined by Chou et al. The Kd values were of a similar order of magnitude to that of CBM20 from AnGA (Aspergillus niger GA), but the maximal amount of bound protein (Bmax) of RoGACBM21 was 40- to 70-fold higher than that of AnGACBM20 [21,22]. The binding affinity between RoGACBM21 and soluble ligands, βCD (β-cyclodextrin) and G7 (maltoheptaose), was measured as approx. 5 μM.
Three-dimensional structures of several starch-binding CBM superfamily members have been reported: CBM20 from AnGA, CBM21 from RoGA and Homo sapiens phosphatase 1 (PDB: 2EEF), CBM25 and CBM26 from Bacillus halodurans maltohexaose-forming amylase, CBM34 from Thermoactinomyces vulgaris α-amylase, CBM41 from Klebsiella aerogenes pullulanase and CBM48 from Rattus norvegicus AMP-activated protein kinase. In the present study, we determined the crystal structures of apo-SBD and the complexes with βCD (SBD–βCD) and maltoheptaose (SBD–G7). Additionally, a number of RoGACBM21 derivatives containing point mutations in aromatic residues (Y32A, W47A, F58A, Y67A, Y83A, Y93A and Y94A) and hydrophilic residues (N29A, K34A, N50A, E68A, N96A and N101A) were isolated and studied. We describe detailed interactions between RoGACBM21 and two substrates (βCD and G7) and propose a polysaccharide-binding path.
Protein expression and purification
Recombinant enzyme expression, purification and the functional assay for RoGACBM21 have been reported previously. However, the quality and quantity of full-length RoGA protein expression were poor owing to the flexible O-glycosylated linker. Briefly, the DNA fragment encoding RoGACBM21 was cloned into the pET23a(+) expression vector and overexpressed in Escherichia coli BL21-Gold (DE3) cells (Novagen). The recombinant SBD sample was purified by His-Bind® affinity column chromatography (Novagen). The purified RoGACBM21 sample was dialysed against sodium acetate buffer (50 mM, pH 5.5). The resulting C-terminally His6-tagged SBD (11.65 kDa) was purified by Ni-NTA (Ni2+-nitrilotriacetate) affinity chromatography with a final yield of 5 mg of purified protein per litre of cells.
All RoGACBM21 mutants were generated using the PCR-based QuikChange® site-directed mutagenesis method (Stratagene) as described  with pET-RoGACBM21 as the template, two complementary primers containing the desired mutation, and Pfu Turbo DNA polymerase (Stratagene). All constructs were transformed into competent E. coli BL21-Gold (DE3) for protein expression.
Quantitative measurement of binding to starch
The starch-binding isotherm was analysed by a saturation-binding assay as reported previously. Wild-type and mutant RoGACBM21 derivatives (100 μl, 5–50 μM) in sodium acetate (50 mM, pH 5.5) were mixed with 0.1 mg of prewashed insoluble starch and incubated at 25 °C with gentle stirring for 16 h. After centrifugation at 16000 g for 10 min at 4 °C, the protein concentration of the supernatant (unbound protein) was determined by the BCA (bicinchoninic acid) protein assay, and the amount of bound protein was calculated from the difference between the initial and unbound protein concentrations. The Kd and Bmax values were determined by non-linear regression of the binding isotherms using a standard single-site binding model.
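The single-site model and the Scatchard analysis mentioned in the summary can be illustrated with a short sketch. It assumes the standard single-site form, bound = Bmax × free/(Kd + free), and estimates Kd and Bmax by the Scatchard linearization (bound/free plotted against bound) rather than by the non-linear regression used in the present study. The function names and all numerical values are hypothetical and are not taken from Table 2.

```python
def single_site_binding(free, bmax, kd):
    """Bound protein predicted by a single-site model for a given free concentration."""
    return bmax * free / (kd + free)


def scatchard_fit(free, bound):
    """Estimate (Kd, Bmax) by least squares on the Scatchard plot.

    Scatchard form: bound/free = Bmax/Kd - bound/Kd,
    so the slope is -1/Kd and the y-intercept is Bmax/Kd.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Hypothetical, noise-free data: free protein in uM, bound protein in umol/g of starch.
example_free = [1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
example_bound = [single_site_binding(f, bmax=40.0, kd=5.0) for f in example_free]
example_kd, example_bmax = scatchard_fit(example_free, example_bound)
```

With real, noisy measurements the Scatchard transform distorts the error structure, which is one reason a direct non-linear fit of the isotherm is usually preferred for the final parameter values.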
Quantitative measurement of binding to soluble carbohydrate
Binding of wild-type or mutant RoGACBM21 to βCD was monitored by fluorescence spectrophotometry, measuring changes in the intrinsic protein fluorescence intensity. Experiments were performed in 50 mM sodium acetate (pH 5.5) at 25 °C using a PerkinElmer LS-55 spectrophotometer. Circular and linear carbohydrates (2–20 mM) were titrated into RoGACBM21 (1 μM, 2 ml), and the fluorescence emission was monitored at 350 nm with a fixed excitation at 280 nm. The relative changes in fluorescence intensity were plotted against the ligand concentration, and the data were fitted to a simulated curve using the appropriate equation for a single binding site.
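The single-site equation is not written out in the text. One commonly used form, given here only as an assumption about what was fitted, relates the relative fluorescence change to the ligand concentration as follows, where ΔF is the observed change, ΔFmax the change at saturation and [L] the ligand concentration (taken as approximately equal to the total ligand concentration, which holds only when ligand is in large excess over protein):

```latex
\frac{\Delta F}{F_0} \;=\; \frac{\Delta F_{\max}}{F_0}\cdot\frac{[L]}{K_d + [L]}
```

In this form Kd is the ligand concentration at which the fluorescence change reaches half of its saturating value.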
Crystallization trials were carried out using the hanging-drop vapour-diffusion method. Protein (1 μl) and reservoir (1 μl) solutions were mixed and equilibrated against a reservoir solution (500 μl) in Linbro plates. Initial crystallization conditions were obtained using Hampton Research Crystal Screen kits and then further optimized to obtain diffraction-quality crystals. The concentration of RoGACBM21 used for crystallization was approx. 10 mg/ml. The βCD and G7 were used at a molar ratio of 1:2 (protein/carbohydrate) to form SBD–βCD and SBD–G7 complexes respectively. The three SBD crystal forms grew in different conditions at 293 K. The apo-SBD crystal was grown in 25% PEG [poly(ethylene) glycol] 4000, the SBD–βCD crystal was grown using 18% PEG 8000 and 0.2 M zinc acetate, and the SBD–G7 crystals were grown in 30% PEG 8000 and 0.6 M ammonium sulfate.
X-ray data collection
The X-ray diffraction data were collected at beamline BL13C1 at the NSRRC (National Synchrotron Radiation Research Center, Taiwan). The data were processed and scaled using the program HKL2000. The apo-SBD, SBD–βCD and SBD–G7 crystals diffracted to 1.25, 1.8 and 2.3 Å (1 Å=0.1 nm) resolution respectively. Both the apo-SBD and the SBD–βCD crystals belong to the orthorhombic P2₁2₁2₁ space group with one molecule per asymmetric unit, and the Matthews coefficient (VM) was calculated as 1.81 Å³·Da⁻¹ and 2.64 Å³·Da⁻¹ respectively. The SBD–G7 crystal belongs to the monoclinic P2₁ space group with four molecules per asymmetric unit, and the VM was calculated as 2.74 Å³·Da⁻¹ (Table 1).
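The VM values quoted above follow from the unit-cell volume divided by the total protein mass in the cell, and they can be converted into an approximate solvent fraction. The sketch below assumes the usual Matthews relations, VM = V/(Z × n × M) and solvent fraction ≈ 1 − 1.23/VM; the function names are illustrative, and the unit-cell volumes themselves are not given in this text (they belong to Table 1), so only the reported VM values are used as input.

```python
def matthews_coefficient(cell_volume_a3, n_symmetry_ops, n_molecules_per_asu, mw_da):
    """VM in A^3/Da: unit-cell volume divided by the total protein mass in the cell."""
    return cell_volume_a3 / (n_symmetry_ops * n_molecules_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction of the crystal, using 1.23 A^3/Da as the protein volume per dalton."""
    return 1.0 - 1.23 / vm


# VM values as reported in the text (A^3/Da).
reported_vm = {"apo-SBD": 1.81, "SBD-bCD": 2.64, "SBD-G7": 2.74}
estimated_solvent = {name: solvent_fraction(vm) for name, vm in reported_vm.items()}
```

On this estimate the apo-SBD crystal is tightly packed (about one-third solvent), whereas both complex crystals contain somewhat more than half solvent by volume.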
Structural determination and refinement
The crystal structures of RoGACBM21 were determined by molecular replacement using the unliganded solution structure of RoGACBM21 (PDB: 2DJM) as a search model. The molecular replacement program MOLREP was used for phase determination. Data between 8.0 and 4.0 Å and a Patterson radius of 20 Å were used to calculate the rotation and translation functions. A similar procedure was applied for the three structures and significant rotation and translation solutions were obtained. Structural model building was carried out using XTALVIEW, and the structural refinement was performed using CNS and the CCP4 program suite. The final statistics of refinement for apo-SBD and the complexes, SBD–βCD and SBD–G7, are summarized in Table 1. The co-ordinates of the RoGACBM21 structures have been deposited in the PDB under the accession codes 2VQ4 (apo-SBD), 2V8L (SBD–βCD) and 2V8M (SBD–G7).
RESULTS AND DISCUSSION
CBM folding topology
Several CBM superfamilies, including CBM20, CBM21, CBM25, CBM26, CBM34, CBM41 and CBM48, contain an SBD in either the N- or C-terminus [24,25,34–40]. These SBDs reveal low sequence identity but a similar immunoglobulin-like folding topology. RoGACBM21 contains 106 residues and has low sequence identity (<15%) with AnGACBM20, as well as with other SBD-containing CBM families. A structure-based multiple-sequence alignment of SBDs from nine representative CBM superfamilies is shown in Figure 1. The sequence alignment was carried out based on the overall structural folding of eight β strands of SBDs and the corresponding ligand-binding sites. Two types of SBD topologies, type I and type II, show a similar overall structure by switching the first and last β strands. Superfamilies CBM20, CBM25, CBM26 and CBM41 belong to the type I topology, whereas CBM21, CBM34 and CBM48 belong to the type II topology.
RoGACBM21 and the SBD–βCD and SBD–G7 complexes share a similar overall structure, which belongs to a typical β-sandwich fold with an immunoglobulin-like architecture with approximate dimensions of 28 Å×30 Å×42 Å. The β-sandwich is symmetric and composed of eight β strands, which are antiparallel except β7 and β8. These strands are paired as four hairpin β strands, β12, β34, β45 and β67/8, and fold as a β barrel. The β barrel contains a substantial hydrophobic core that contributes the major stabilizing force to RoGACBM21. The overall structures of SBD–βCD and SBD–G7 are shown (Figures 2A and 2B). Two carbohydrate-binding sites, designated as sites I and II, were observed in both SBD–βCD and SBD–G7 complexes, in which the sugar ligands, βCD and G7, are located at almost diagonal ends of the SBD in a perpendicular orientation. Binding site I is located at Trp47 around loop β34 and site II is located at Tyr32 around loop β23. The carbohydrate-binding ratio for both SBD–βCD and SBD–G7 complexes is one sugar ligand per SBD molecule in the crystal, in which the SBD shares each binding site with the symmetry-related molecule such that two SBD molecules together hold one sugar ligand. Site I and site II can thus bind one sugar molecule co-operatively. In solution, the binding ratio between the SBD and βCD is determined to be 1:2 by isothermal titration calorimetry (results not shown). The well-defined electron density map of βCD in the SBD–βCD complex is shown in Figure 2(C). The Kd (∼333 μM) of the αCD complex is much higher than that of the βCD and G7 complexes, because the cyclic ring of the six glucose units of αCD is too small to fit the essential binding surface of the SBD, such that its binding ability is dramatically reduced.
There are four molecules per asymmetric unit in the SBD–G7 complex. The overall structures of four SBDs are very similar with RMSD (root mean square deviation) values of 0.57–1.04 Å in Cα. The key aromatic residues, Trp47, Tyr83, Tyr94 and Phe58 are in the same orientation, except for Tyr32, which is still located in the centre of G7 in the SBD–G7 complex, but tilted in a different orientation (Figure 2D). The conformations of four G7s vary; however, they tend to fold into a U shape and fit the curvature of the SBD-binding site. The major conformational difference of G7 appears to be located in the two ends of molecules.
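The Cα RMSD values quoted above summarize how far matched atoms lie apart once two structures have been superimposed. A minimal sketch of the calculation is given below; it assumes that the two co-ordinate lists are already superimposed and matched atom for atom, and it does not perform the superposition itself. The function name and the co-ordinates used to exercise it are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in the units of the input, e.g. A) between two matched, pre-superimposed
    lists of (x, y, z) C-alpha co-ordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("co-ordinate lists must be non-empty and of equal length")
    total_sq = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total_sq += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total_sq / len(coords_a))


# Hypothetical three-atom example: every atom displaced by 1 A along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for (x, y, z) in reference]
example_rmsd = ca_rmsd(reference, shifted)
```

Because the squared deviations are averaged before the square root is taken, a few strongly displaced residues (for example in flexible loops) raise the RMSD more than many small deviations do.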
Site I mainly comprises three conserved aromatic residues, Trp47, Tyr83 and Tyr94, to form a broad, flat and stable hydrophobic environment. The aromatic rings of these residues construct the curvature of the SBD-binding site and interact with the sugar rings by ideal hydrophobic stacking interactions (Figure 2C). Site I is rather rigid because Tyr83 and Tyr94 are located in β strands and Trp47 is locked by Asn46 and Asn50. Site II is generally formed by loops β23 and β45, particularly dominated by two aromatic residues, Tyr32 and Phe58. Nevertheless, both Tyr32 and Phe58 are in the proximity of the sugar ring where the van der Waals surface and the βCD are closely packed. The sugar ring of βCD is nearly parallel to the ring of Tyr32, which protrudes into the non-polar cavity of βCD and forms a hydrogen bond with βCD O27 (Figure 2C). Such unique binding is only observed in the SBD–βCD complex, where the α-glucan chains of βCD wrap around Tyr32 to stabilize the binding. Similar binding patterns were observed in CBM25, CD glycosyltransferase [41,42] and CBM48, the glycogen-binding domain of AMP-activated protein kinase , in which the corresponding residues in site II are Leu600 and Leu146 respectively. Another key residue in site II is Phe58, of which the hydrophobic phenyl ring forms a flat stacking interaction with the sugar ring of the glucosyl unit of βCD. Two planar rings of Tyr32 and Phe58 pack closely against the van der Waals surface of the βCD molecule and act like a clamp to pick up the βCD.
In addition to the hydrophobic interactions, numerous hydrophilic interactions between the SBD and βCD were found. In site I, residues Asn50 (ND2-O61:2.7 Å), Asn96 (ND2-O33:2.9 Å) and Asn101 (ND2-O22:2.6 Å; OD1-O32:2.9 Å) form two unique polyN loops (Figure 2E) and provide the main hydrophilic interactions to βCD. In site II, residues Asn29 (NH2-O3:2.8 Å), Lys34 (NZ-O3:2.8 Å and N-O3:3.3 Å) and Glu68 (OE1-O2:2.5 Å and OE2-O3:2.9 Å) supply several hydrogen-bond interactions with βCD. Meanwhile, these residues interact with each other by hydrogen bonds to provide a tight binding environment for βCD and contribute the forces to stabilize the SBD. The binding area of site II is small and narrow and it protrudes away from the complex more than site I.
Besides the hydrophobic interaction, several asparagine residues from β34 and β78 near site I provide additional hydrophilic interactions for the carbohydrate binding. These two unique polyN loops positioned on the two sides of the sugar molecule interact with the hydroxy groups to protect the hydrophobic curvature of the binding-site groove created by Trp47, Tyr83 and Tyr94 from exposure to solvent (Figure 2E). The polyN loops consisting of consecutive asparagine residues in loop β34 (Asn46, Asn48, Asn49 and Asn50) as well as in loop β78 (Asn96, Asn97, Asn98 and Asn101) (Figure 2F) participate in sugar binding.
The significant interaction networks created in each of the polyN loops of both complexes stabilize the binding loop and facilitate sugar binding. In polyN loop β34, residues Asn46 and Asn50 directly interact with the main chains of Trp47 such that the aromatic ring of Trp47 forms hydrophobic stacking with the sugar ring (Figure 2F). Meanwhile, Asn50 and Trp47 lock together to assist the sugar binding by a hydrogen bond (Figure 2F). Furthermore, the geometry of Asn50 is squeezed out by an interaction network, Asn46–Trp47–Asn48–Asn50–Gly51–Asn52–Asn46, and the side chain of Asn50 is forced to interact with the sugar molecule (Figure 2F). Likewise, in polyN loop β78, Asn96 interacts with Ser99, Ala100 and Asn101 by three hydrogen bonds to build a βCD-binding network (Figure 2F). The orientation of Asn101 is fixed by Asn96 and Asn97 with two hydrogen bonds, as well as controlled by the adjacent residues, Ala100, Tyr102 and Gln103. As a result, Asn101 binds βCD with two hydrogen bonds (Figure 2F). A primary triangle interaction, Asn96:CO–Asn101:N–Asn97:NH2, and a periphery interaction network, Asn96–Ala100–Gln103–Tyr102–Gln95–Asn97, are produced and tight sugar binding is generated. The orientations of Asn96 and Asn101 are especially restricted and do not reveal favourable geometries in the Ramachandran plot.
Some residues in the polyN loops show very restricted geometries because these residues have to link one by one and form a series of chain interactions, which not only stabilize the binding loops, but also confirm the sugar binding. These extra interactions from the polyN loops might produce a higher sugar-binding capacity and affinity for RoGACBM21 as compared with those of other CBM superfamilies. Furthermore, the polyN loops interact with two sides of the βCD molecule to assist in turning the orientation of βCD to form hydrophobic interactions with Trp47, Tyr83 and Tyr94. Interestingly, these characteristic polyN sequences are only observed in RoCBM21 and McCBM21 (where Mc is Mucor circinelloides) (Figure 1) [13,17].
Binding affinity for starch and βCD
To elucidate the mechanism of polysaccharide binding, we examined the essential residues in the vicinity of the binding sites according to the complex structures. A number of RoGACBM21 mutants were generated and characterized by a quantitative binding isotherm and fluorescence spectroscopic analysis (Table 2) . The mutant proteins contained substitutions at the aromatic residues from site I (W47A, Y83A, and Y94A) and site II (Y32A and F58A) involved in βCD and G7 binding and two additional tyrosine residues near sites I and II, Y67A and Y93A. Mutations at hydrophilic residues in site I (N50A, N96A and N101A) and site II (N29A, K34A and E68A) involved in direct hydrogen-bond interactions with ligands were also analysed.
The site I mutants (W47A, Y83A and Y94A) exhibited decreased binding affinity for starch and βCD when compared with the Kd values for wild-type RoGACBM21 (Table 2). No βCD binding could be detected for the W47A mutant. Furthermore, W47A, Y83A and Y94A showed a reduced Bmax for the binding capacity to starch, especially Y83A, possibly because Tyr83 is located on the inside position of the site I-binding pocket. Site I (Trp47) serves as the major carbohydrate-binding site for starch. For the site II mutants (Y32A and F58A), the Kd values of Y32A for starch and βCD binding were similar to those of wild-type SBD. In addition, Y32A had less effect on the binding of soluble oligosaccharides. F58A showed an increased Kd for the binding of starch and had the highest Kd for the binding of βCD among all of the mutants; however, this mutation had almost no effect on the binding capacity for starch, with a Bmax similar to that of wild-type SBD. Because Phe58 significantly affected the binding to βCD and starch, Phe58 might play a key role in site II. The Trp47- and Tyr32-binding sites were co-operative in sugar binding, because the binding affinity of the single mutant, Y32A or W47A, was partially maintained; however, the binding of the double mutant (Y32A/W47A) to insoluble starch was almost completely abolished. The binding roles of Tyr32 and Trp47 are consistent with the structures of the RoGACBM21 complexes (Figure 2C).
From the structure-based alignment (Figure 1), two additional interesting aromatic residues, Tyr67 and Tyr93, were studied. Tyr67 is conserved in most of the SBD-containing CBM superfamilies and Tyr93 is also conserved in most of the CBM21 superfamily members [13,17]. Although Tyr67 and Tyr93 do not directly interact with sugar in either the SBD–βCD or the SBD–G7 complex (Figure 2), the Y67A and Y93A mutants showed apparently lower Bmax and higher Kd values (Table 2). In particular, Y67A had a considerably reduced Bmax (3.7 μmol/g), the lowest starch-binding capacity of all of the mutants. In the SBD–βCD complex, Tyr67 points into the N-terminal loop, in which the hydroxy group of Tyr67 is positioned around Pro4–Ser5–Ser6–Ala7 and forms a hydrogen bond with the O atom of Ala7 and stacks with Pro4 in the N terminus (Figure 3A). The small Bmax of the Y67A mutant might be due to instability of the N terminus. Thus Tyr67 might play a role in the N-terminal stabilization. The overall structure of Y67A should be maintained because the binding activity of Y67A is preserved, and the secondary structure of Y67A examined by circular dichroism was intact (results not shown). The result in the present study demonstrates that Tyr67 might play an important role in governing the starch-binding capacity of RoGACBM21.
The Kd and Bmax values for mutants with substitutions at hydrophilic residues in site I (N50A, N96A and N101A) and site II (N29A, K34A and E68A) are shown in Table 2. Mutant N50A had a reduced Bmax and the highest Kd (∼10-fold higher than wild-type) for starch binding among all site I and II mutants, and this may be attributed to the loss of hydrogen-bond interactions between Asn50 and Tyr83. Asn50 might play a major role in the binding of insoluble polysaccharides and contributes to the overall integrity of site I. Mutants N96A and N101A (site I) as well as K34A and E68A (site II) exhibited inefficient βCD binding but with high Kd values due to a loss of hydrogen-bond interactions. Consequently, we suggest that these hydrophilic residues from sites I and II might play a critical role in the binding of soluble or insoluble polysaccharides, in either an individual or co-operative manner.
A continuous polysaccharide-binding path
Based on sites I and II, as well as two crucial tyrosine residues near sites I and II, Tyr67 and Tyr93, a characteristic connection was observed between two carbohydrate-binding sites (Figure 3A). From the surface electrostatic potential distribution of the SBD–βCD complex (Figure 3B), a continuous hydrophobic surface formed by residues Trp47, Tyr83, Tyr94, Tyr93 and Tyr67 is observed. Furthermore, from the solvent-accessible surface of the SBD–βCD complex (Figure 3C), there are two possible accessible surface paths to link site I and site II. One path is through residues Asn50, Tyr83, Tyr94, Tyr93 and Tyr67. Because the Y67A and Y93A mutants showed substantial effects in the starch-binding assay (Table 2), Tyr67 and Tyr93 might act as a midpoint to link the two binding sites and make a continuous binding path for longer chain and larger polysaccharides, or even starch. The distance of this potential polysaccharide-binding path through Trp47 (site I), Tyr93, Tyr67 and Tyr32 (site II) is in a range of 45–60 Å with an accessible surface area of approx. 1215 Å². The other path is through the consecutive hydrophilic residues Asn50, Lys85, Glu87, Lys35 and Lys34, which might also be able to link sites I and II. The continuous binding path might be important for the raw starch/insoluble polysaccharides, but not for soluble and smaller carbohydrate molecules, such as βCD and G7. During the initial binding step, sites I and II might act as two persistent binding markers for soluble/insoluble polysaccharides, after which the continuous polysaccharide-binding path may assist further binding.
Liganded and unliganded RoGACBM21
The superimposed structures of unliganded and liganded RoGACBM21 are presented in Figure 3(D). They reveal comparable overall structures, in which RMSD values between the SBD and two complexes, SBD–βCD and SBD–G7, are 0.74 and 0.98 Å respectively, and the RMSD value between the two complexes is 0.57 Å. The comparison of the residues around the sugar-binding site of unliganded and liganded SBD is shown in Figure 3(E). An apparent conformational change was observed in Tyr32, which formed a hydrogen bond with Asn29 in unliganded SBD; however, the phenyl ring of Tyr32 flipped over and reoriented to form sandwich stacking with the sugar ring upon ligand binding (Figure 3E). Surprisingly, the essential aromatic residues (Trp47, Tyr83 and Tyr94) as well as other ligand-binding residues remained in the same orientation in unliganded and liganded SBDs. Before ligand binding, numerous water molecules occupied the sugar-binding site and interacted with several hydrophilic residues, such as Lys34, Glu68, Asn50, Asn94 and Asn101. However, upon ligand binding, this water layer was replaced by a sugar molecule and the interactions were established among the sugar and hydrophilic binding residues.
These results indicated that the sugar-binding site of SBD is rather fixed and stable. As long as the β barrel scaffold is formed and the three key binding positions (Trp47, Tyr83 and Tyr94) can be preserved, the ligand binding is likely to take place. For linear polysaccharides, the Kd values of RoGACBM21 decreased with increasing ligand length, but the affinity remained lower than that for βCD. The Y32A mutation significantly diminished the affinity for G7 but had no effect on the binding of short linear carbohydrates, such as maltotriose and maltotetraose. Tyr32 may adjust the orientation of its phenyl ring to accommodate the conformation of a long ligand.
RoGACBM21 complexes and other CBM superfamilies
The three-dimensional structures of two CBM21 members, apo-RoGACBM21 and apo-HsCBM21 are superimposed in Figure 4(A). They share a similar overall structure except for loops β34, β45 and β56. The corresponding residues for ligand binding residues in site I (Trp47, Tyr83 and Tyr94) and site II (Tyr32 and Phe58) in the RoGACBM21 complex were also observed in HsCBM21 (where Hs is Homo sapiens) in putative site I (Trp72, Ala114 and Trp125) and site II (Phe59 and Tyr82) respectively. These residues are located in comparable locations except that the orientations of the side chains vary. We suggest that HsCBM21 might possess binding characteristics similar to that of RoGACBM21.
AnGA, a GA from Aspergillus niger, shares the same glycoside hydrolase characteristic as RoGA. However, the SBDs of both proteins, AnGACBM20 and RoGACBM21, fold with different topologies and link with the catalytic domains through their N- and C-terminal ends respectively [23,43]. The three-dimensional structures of the two SBDs could be superimposed by switching the first and the last β strands; the structural superimposition is presented in Figure 4(B). Most CBM superfamilies contain one sugar-binding site; however, RoGACBM21 and AnGACBM20 complexes contain two binding sites [24–26,40]. RoGACBM21 binds co-operatively to sugar, whereas AnGACBM20 exhibits independent sugar binding. Even though the sugar-binding location is similar in both proteins, several structural differences were noted (Figure 4B), particularly in loops β34, β45 and β78 around the sugar-binding regions, which are in completely different conformations. These distinct loops provide the unique sugar binding for RoGACBM21 and AnGACBM20.
The two binding sites of AnGACBM20 and RoGACBM21 are structurally and functionally different and the corresponding residues in AnGACBM20 are Trp543 and Trp590 for site I and Tyr527 and Tyr556 for site II. In AnGACBM20, site I (Trp543) acts as the initial recognition site for starch, whereas site II (Tyr527) is capable of recognizing a range of starch strand orientations. Site I has a larger surface area, undergoes a conformational change upon sugar binding, and is proposed to act as a more specific site to lock the ligand into place. However, from the structural results and functional assay data (Table 2) for the RoGACBM21 complexes, we assume that site I (Trp47) is the essential binding site and site II (Tyr32) plays an auxiliary role as a recognition site because a conformational change and the protruded orientation of Tyr32 were clearly observed between liganded and unliganded SBDs (Figure 3E). The residues corresponding to Trp47 in most SBD superfamilies are highly conserved, except for BhCBM26 (where Bh is B. halodurans) (Figure 1). Site I reveals a larger and broader binding surface, and undergoes further conformational changes upon βCD binding.
TvAI, an α-amylase from T. vulgaris that catalyses the hydrolysis of α-D-(1,4)-glucoside linkages in starch to release the α-anomer, shows hydrolase characteristics different from those of RoGA. However, both SBDs belong to the N-terminal starch-binding CBM superfamily and fold with the same type II topology [25,43]. Although the sequence identity and similarity between these two molecules are only 16 and 36% respectively, their structures superimposed quite well, especially the β strands. The structural superimposition of RoGACBM21 and TvAICBM34 is shown in Figure 4(C) with an RMSD of 1.6 Å in Cα. Nevertheless, structural differences were observed in loop regions, for example loops β34 and β78 (Figure 4C). Two sugar-binding sites, site-NA and site-N, found in TvAICBM34 correspond to site I and site II respectively in RoGACBM21. Both sugar-binding sites were in similar orientations but two sugar molecules were in a perpendicular orientation in both proteins. Meanwhile, the functional roles of the two sugar-binding sites in both proteins were comparable: site I/site-NA binds sugar specifically and site II/site-N helps the enzyme approach starch by recognizing the starch surface.
A carbohydrate-binding curvature is observed in several CBM families, for example in the TvAICBM34 and BhCBM25 complexes, where it provides the main protein–carbohydrate interactions through several key residues: Trp51, Tyr89 and Tyr119 in TvAICBM34 and His26, Trp34 and Trp74 in BhCBM25. This binding platform was also formed in RoGACBM21 by the corresponding residues, Trp47, Tyr83 and Tyr94. These key residues have a similar location in each protein structure, and their aromatic rings hold the sugar molecules by hydrophobic stacking interactions. Within this binding platform, the most conserved residue is Trp47 in RoGACBM21, corresponding to Trp74 in BhCBM25 and Tyr89 in TvAICBM34. This binding curvature could swing around the substrate to create a tight binding pocket and induce at least three glucose units of sugar molecules to be bent in a segmented shape for binding.
The crystal structures of unliganded and liganded RoGACBM21 (SBD–βCD and SBD–G7) were determined at 1.25, 1.8 and 2.3 Å resolutions respectively. Two carbohydrate-binding sites, I and II, were found on the surface of SBD, of which site I is a flat and broad hydrophobic-binding region created by the aromatic residues Trp47, Tyr83 and Tyr94, and site II is a protruded and narrow binding environment formed by Tyr32 and Phe58. Site I serves as the major sugar-binding site by forming an essential binding curvature for at least three glucose units. Site II may play an auxiliary role for recognizing the long-chain polysaccharides. A significant interaction network, which was created in the polyN loops unique to RoGACBM21, stabilizes the loops and confirms the sugar binding. This might produce a higher binding capacity for RoGACBM21 compared with other CBM superfamilies. An especially stable binding environment in SBD was suggested to accommodate a variety of sugars with different conformations and lengths. From the structural determination and the quantitative binding assay of RoGACBM21, a possible long-chain insoluble polysaccharide-binding path was suggested, in which the polysaccharide is bound by the specific binding sites I and II as well as a continuous binding surface with Tyr67 and Tyr93 between two sites.
This work was supported by grants from the National Science Council of Taiwan (NSC 96-2311-B-007-014 and NTHU 96N2424E1 to Y.-J. S. and NSC 96-2627-B-007-003 to M. D. T. C.). This work was partially supported by a Council of Agriculture grant, National Science and Technology Programme for Agricultural Biotechnology (97AS-1.2.1.-ST-a5, to M. D. T. C.). The RoGACBM21 clone was kindly provided by Mr Chia-Chin Sheu (Simpson Biotech, Taoyan, Taiwan). The X-ray diffraction data were collected at the NSRRC (National Synchrotron Radiation Research Center, Taiwan).
Abbreviations: AnGA, Aspergillus niger GA; CBM, carbohydrate-binding module; βCD, β-cyclodextrin; G7, maltoheptaose; GA, glucoamylase; PEG, poly(ethylene) glycol; RMSD, root mean square deviation; RoGA, Rhizopus oryzae GA; SBD, starch-binding domain
- © The Authors Journal compilation © 2008 Biochemical Society