Thiolases are essential CoA-dependent enzymes in lipid metabolism. In the present study we report the crystal structures of trypanosomal and leishmanial SCP2 (sterol carrier protein, type-2)-thiolases. Trypanosomatidae cause various widespread devastating (sub)-tropical diseases, for which adequate treatment is lacking. The structures reveal the unique geometry of the active site of this poorly characterized subfamily of thiolases. The key catalytic residues of the classical thiolases are two cysteine residues, functioning as a nucleophile and an acid/base respectively. The latter cysteine residue is part of a CxG motif. Interestingly, this cysteine residue is not conserved in SCP2-thiolases. The structural comparisons now show that in SCP2-thiolases the catalytic acid/base is provided by the cysteine residue of the HDCF motif, which is unique for this thiolase subfamily. This HDCF cysteine residue is spatially equivalent to the CxG cysteine residue of classical thiolases. The HDCF cysteine residue is activated for acid/base catalysis by two main chain NH-atoms, instead of two water molecules, as present in the CxG active site. The structural results have been complemented with enzyme activity data, confirming the importance of the HDCF cysteine residue for catalysis. The data obtained suggest that these trypanosomatid SCP2-thiolases are biosynthetic thiolases. These findings provide promise for drug discovery as biosynthetic thiolases catalyse the first step of the sterol biosynthesis pathway that is essential in several of these parasites.
- coenzyme A (CoA)
- lipid metabolism
- oxyanion hole
- sterol biosynthesis
- thioester chemistry
Trypanosoma spp. and Leishmania spp., belonging to the Trypanosomatidae family, are parasitic unicellular eukaryotes. Members of this protist family cause several diseases in humans as well as in animals and plants, especially in the tropical and subtropical regions of Africa, Asia and Latin America. These often deadly diseases are widespread and adequate pharmaceuticals are urgently needed for better treatment. Therefore these diseases are included in the research focus of the WHO (World Health Organization)/TDR programme. For example, Trypanosoma brucei gambiense and Trypanosoma brucei rhodesiense cause human African trypanosomiasis or sleeping sickness. A widespread cattle disease in Africa, nagana, is caused by the subspecies Trypanosoma brucei brucei and some other Trypanosoma species. Another species, Trypanosoma cruzi, causes Chagas’ disease in humans in Latin American countries. Leishmania donovani is responsible for a form of leishmaniasis called kala-azar in India. The subspecies Leishmania mexicana, which is also a human pathogen, is cultivated easily and is widely used in leishmanial research. The WHO estimates that in 2011 there were more than 2 million new cases of these neglected trypanosomal and leishmanial diseases. The trypanosomatid parasites complete their life cycles in two different host systems, an insect vector and a mammalian host, such as humans or cattle. For T. brucei brucei, the procyclic form, found in the tsetse fly's midgut, and the long-slender bloodstream form present in the mammalian host, have been studied in great detail. Leishmania spp. also exhibit two predominant forms, the promastigote and amastigote, which are found in the insect vector and intracellularly in the phagolysosomes of mammalian macrophages respectively. One of the areas of trypanosomatid research concerns the role of mitochondrial enzymes in lipid metabolism. The sterol biosynthesis pathway has been shown to be important in the human pathogenic stages of T. cruzi and Leishmania spp. and is known to occur, at least partially, in the mitochondrion. The first step of this pathway is catalysed by a thiolase.
Thiolases are CoA-dependent enzymes involved in synthetic and degradative fatty acid CoA metabolism. The fatty acid substrates are conjugated via a thioester bond to CoA, which activates the fatty acid moiety for further metabolic conversions. No cofactors or metal ions are required for the catalysis. Thiolases are tight dimers or tetramers (dimers of tight dimers) [6,7]. The reaction catalysed by a 3-oxoacyl-CoA (3-ketoacyl-CoA) thiolase in the degradative direction is shown in Figure 1. In the synthetic direction, thiolases catalyse a Claisen condensation reaction. Six thiolase isoenzymes have been identified in humans (Table 1). All these thiolases are related in sequence, and belong to the same thiolase superfamily. The structural and enzymological properties of human CT, human T2 and the bacterial Zoogloea ramigera thiolase have been well studied [8–10]. However, only a few studies have been performed on the dimeric peroxisomal SCP2 (sterol carrier protein, type-2)-thiolases. In mammals, the SCP2-thiolase is known to be involved in the bile acid synthesis pathway, in which it catalyses a degradative β-oxidation thiolytic cleavage reaction of bile acid CoA conjugates. These thiolases therefore have unique substrate specificity, as their substrates are 2-methyl-branched fatty acids, which have in addition a steroid moiety at their ω-end. Furthermore, only the mammalian SCP2-thiolase has an extra domain, the SCP2 domain (approximately 130 residues), located at its C-terminus.
The thiolase fold consists of three domains, the N-domain, the loop domain and the C-domain. The N- and C-domains share the same βαβαβαββ topology. The standard nomenclature and numbering for the secondary structure elements and loops are taken from the Z. ramigera thiolase, as shown in Figure 2. Both the N- and C-domains provide catalytic residues. The loop domain shapes the binding pocket of the CoA moiety. The pantetheine-binding tunnel is shaped by the covering loop located at the beginning of the loop domain and the pantetheine-binding loop at the end of this same domain. The protruding tetramerization and cationic loops of the loop domain are important for the assembly and function of the tetrameric thiolases. The active site is near the dimer interface and its shape is defined by residues of both subunits. However, the catalytic residues of each of the two active sites belong to the same subunit. The catalytic site is deeply buried inside the bulk of the protein. It is located at the end of the pantetheine-binding tunnel, close to the N-termini of the Nα3-helix and Cα3-helix. The catalytic site of thiolase is rather polar. The best studied thiolase is the bacterial Z. ramigera thiolase, which is known to have high catalytic efficiency. It has a small pocket for binding the acyl moiety and can only use short-chain acyl-CoA molecules as substrates. Three residues are directly involved in the catalytic reaction: two cysteine residues and one histidine residue. The first cysteine residue (Cys89) acts as a nucleophile and participates in the covalent intermediate formation while the second cysteine residue (Cys378) acts as an acid/base. The role of the histidine residue (His348) (Figure 1) is to activate Cys89 and increase its nucleophilicity. Previous studies have highlighted the role of two oxyanion holes (OAH1 and OAH2), which are critically important for the thioester-dependent catalytic steps. There are two active site water molecules (Wat49 and Wat82 in Z. ramigera thiolase). The precise role of these two water molecules is not fully understood. Wat82 is a hydrogen-bond donor to OAH1, and it is anchored in its position by the fourth catalytic residue, Asn316. Wat49 is hydrogen bonded to Wat82 and both of these water molecules affect the acid/base properties of the C-terminal catalytic Cys378 (Figure 1). These four catalytic residues belong to four loops with characteristic sequence motifs, as highlighted in Figure 2. CxS provides the nucleophilic cysteine residue (Cys89), NEAF the Asn316 side chain, GHP the His348 side chain that activates the nucleophilic cysteine residue, and CxG provides the catalytic acid/base, Cys378. Sequence alignments show that in the SCP2-thiolase only two sequence fingerprints are preserved, being CxS and GHP (Figure 2). Interestingly, the NEAF sequence motif is replaced by the HDCF motif, whereas the CxG motif is replaced by MGG (Figure 2). The latter sequence feature implies that the corresponding catalytic acid/base is absent in the SCP2-thiolase subfamily, and consequently the reaction mechanism of this subclass must be uniquely different from that of the canonical thiolases.
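The fingerprint logic described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which relies on alignment positions; it assumes that the motifs occur in the order CxS, NEAF/HDCF, GHP, CxG along the sequence and treats "x" as any residue. The function name `classify_thiolase` and the toy sequences are hypothetical. Short patterns such as CxS can match by chance in real sequences, so a real analysis would anchor them in an alignment.

```python
import re

# Ordered fingerprint patterns; "." stands for the variable residue "x".
# Assumption: the motifs occur in this order along the polypeptide chain.
SCP2_PATTERN = re.compile(r"C.S.*HDCF.*GHP")
CLASSICAL_PATTERN = re.compile(r"C.S.*NEAF.*GHP.*C.G")


def classify_thiolase(sequence: str) -> str:
    """Label a sequence by its fingerprint set (toy heuristic only)."""
    seq = sequence.upper()
    if SCP2_PATTERN.search(seq):
        return "SCP2-like"        # CxS, HDCF and GHP all present
    if CLASSICAL_PATTERN.search(seq):
        return "classical-like"   # CxS, NEAF, GHP and CxG all present
    return "unclassified"


# Toy sequences (hypothetical, not real thiolase sequences).
toy_scp2 = "AAACASAAAHDCFAAAGHPAAAMGG"
toy_classical = "AAACASAAANEAFAAAGHPAAACIG"
print(classify_thiolase(toy_scp2))
print(classify_thiolase(toy_classical))
```

The SCP2 pattern is tested first because a chance CxG match elsewhere in a sequence should not override the presence of the HDCF motif.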
Genomic searches have established that only one thiolase, being an SCP2-thiolase, exists in African trypanosomes, whereas Leishmania spp. have two thiolases, the SCP2-thiolase and the TFE (trifunctional enzyme)-thiolase. Unlike the mammalian SCP2-thiolase, the trypanosomal and leishmanial SCP2-thiolases do not have the extra SCP2 domain. Recent studies of the trypanosomal thiolase showed that it is a predominantly mitochondrial enzyme, although it carries both a mitochondrial transit peptide and a peroxisomal targeting signal. This thiolase has been purified and it was shown to be a dimeric enzyme. It was found to have significant degradative and synthetic catalytic activities. In the present paper we report in-depth structural and enzymological investigations of this trypanosomal SCP2-thiolase as well as of the leishmanial SCP2-thiolase. These two homologous thiolases are very similar to each other (Figure 2). The structures provide a clue to the function of the HDCF sequence fingerprint and, together with the biochemical assays, also reveal new insights into the possible function of these thiolases.
Cloning, expression, purification and kinetic assays
The wild-type Lm-thiolase (L. mexicana SCP2-thiolase) and Tb-thiolase (T. brucei brucei SCP2-thiolase) genes were PCR amplified and ligated into the expression vector pET28a (Novagen) between NdeI and BamHI restriction sites. Both of these thiolases were expressed and purified using the Escherichia coli BL21 (DE3) bacterial cell line containing the pGro7 plasmid (TaKaRa Bio) which encodes GroEL and GroES. The details of the cloning, expression and purification of Lm- and Tb-thiolase can be found in the Supplementary Online data (at http://www.biochemj.org/bj/455/bj4550119add.htm). Enzyme activity assays were performed as reported previously [7,9] for both the degradative direction as well as the synthetic direction and as described in the Supplementary Online data. Each activity was measured four times using two different batches of purified protein. The activity of the protein remained unchanged upon freezing the sample at −70°C and subsequent thawing.
Determination of the molecular mass and oligomeric state
The molecular mass of the Tb-thiolase has been determined previously by a SLS (static light scattering) system (Wyatt Minidawn). Purified Lm-thiolase was also analysed using this SLS system to establish its molecular mass and oligomeric state (Supplementary Figure S1 at http://www.biochemj.org/bj/455/bj4550119add.htm). For this experiment, the SLS flow cell was connected to an ÄKTA purifier (GE Healthcare). Lm-thiolase (~2.0 mg) was loaded on to the Superdex 200 (10/300) GL column attached to the ÄKTA purifier and the SLS profiles were analysed using the Astra software package.
Single site-directed mutagenesis was achieved using the single primer extension method. The PCR products obtained were sequenced to confirm the presence of the mutations (Biocenter Oulu sequencing facility). The purification protocol of each of the variants was identical with that of the wild-type. In Lm-thiolase, the nucleophilic cysteine residue (Cys123) of the CxS-motif was mutated into alanine or serine. The variant C123A was used for crystallographic binding experiments with CoA. For Tb-thiolase, C120A and C337A variants were made. The C120A variant was generated to probe the importance of the cysteine residue of the CxS motif, and the C337A mutation was made to probe the role of the cysteine residue of the HDCF motif in the catalytic reaction. The C337A variant was also used for crystallographic studies.
Crystallization, data collection, data processing, structure determination and refinement
The crystallization experiments were performed at room temperature (25°C) using the sitting drop vapour diffusion method. The detailed crystallization and crystal handling procedures (summarized in Table 2) are described in the Supplementary Online data.
The data of the wild-type Lm-thiolase crystals were collected at BESSY II, Berlin, Germany at beamline BL14.1, at 1.85 Å (1 Å=0.1 nm) resolution. The data were processed using MOSFLM and SCALA of the CCP4 suite. The diffraction data of all Lm-thiolase variants as well as the CoA-complexed crystals were collected using a home source Bruker Proteum X8 system. All these data were processed using the PROTEUM2 software suite (Bruker AXS). The Lm-thiolase structure was solved by the molecular replacement method using PHASER and the yeast peroxisomal thiolase structure (PDB code 1AFW) as the initial phasing model. The structures of the Lm-thiolase variants C123S (at 1.90 Å resolution), C123A (at 1.90 Å) and the complex of C123A with CoA (at 2.45 Å) were determined by molecular replacement using the refined Lm-thiolase wild-type structure as the initial phasing model. All Lm-thiolase structures have been refined using REFMAC5 of the CCP4 suite.
The data of the wild-type Tb-thiolase crystals were collected using the microfocus beam line ID23-2 at ESRF (European Synchrotron Radiation Facility, Grenoble, France) at 2.45 Å resolution. The data were processed using MOSFLM and SCALA of the CCP4 suite. Data for a crystal of the C337A variant of Tb-thiolase were collected at 2.90 Å resolution using the home source Bruker Proteum X8 system. These data were processed using the PROTEUM2 software suite (Bruker AXS). The wild-type Tb-thiolase structure was determined by molecular replacement using PHASER, with the Lm-thiolase structure as the search model. The structure of the C337A Tb-thiolase variant was determined using the wild-type Tb-thiolase structure as the initial phasing model in the molecular replacement calculations. All Tb-thiolase structures were refined with PHENIX.REFINE using the option for twin refinement.
The final statistics of the data collection, data processing and refinement of the Lm- and Tb-thiolase datasets are summarized in Table 3. The detailed procedure of data collection, data processing, structure determination and refinement can be found in the Supplementary Online data.
The structure of Z. ramigera thiolase, in particular the complex of the acetylated enzyme with bound acetyl-CoA (PDB code 1DM3), has been used as the reference for structural comparisons presented in the Results section. The structures of the human T2-thiolase complexed with CoA (PDB code 2IBW), yeast AB-thiolase complexed with MPD (2-methyl-2,4-pentanediol) (PDB code 1AFW) and the Z. ramigera thiolase complexed with acetoacetyl-CoA (PDB code 1M1O) have also been used for the comparisons. All structural superimpositions were achieved using the SSM feature of COOT. The geometries of the final models were examined using the program PROCHECK of the CCP4 suite. Further structure analyses, such as the B-factor profiles, were carried out using the program BAVERAGE in the CCP4 suite. The PISA webserver was used to identify the dimer interface residues and to evaluate the dimeric interactions. All Figures were made with CCP4MG. The structures of the leishmanial and trypanosomal thiolases contain one or two dimers in the asymmetric unit respectively (Table 3). Subunit B was used for the structural analysis.
Bioinformatics and sequence comparisons
The human SCP2-thiolase domain sequence was used as the query in a BLAST search with the five mammalian genomes (Macaca fascicularis, Rattus norvegicus, Mus musculus, Oryctolagus cuniculus and Bos taurus) found in the NCBI database and eight trypanosomatid genomes (T. brucei brucei, T. brucei gambiense, Trypanosoma congolense, T. cruzi, L. mexicana, Leishmania major, Leishmania braziliensis and Leishmania infantum) available in the GeneDB database (http://www.genedb.org/blast/submitblast/GeneDB_transcripts/omni). Only sequences with the SCP2-thiolase fingerprints (CxS, HDCF and GHP motifs) were included in the final set. One SCP2-thiolase sequence could be detected in each of the eight trypanosomatid genomes. Subsequently, a multiple sequence alignment (Supplementary Figure S2 at http://www.biochemj.org/bj/455/bj4550119add.htm) of these eight parasite-specific SCP2-thiolases, the thiolase domain of human SCP2-thiolase and five other mammalian SCP2-thiolases was obtained using ClustalW, as integrated in the program MEGA5. This sequence alignment was used for the evolutionary analysis. A phylogenetic tree was created using the neighbour-joining method with 10000 bootstrap replicates in MEGA5. Independent alignments were made for the human SCP2-thiolase domain and the trypanosomatid thiolases using the EMBOSS needle program (http://www.ebi.ac.uk/Tools/psa/emboss_needle/) to determine pairwise sequence identities.
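As an illustration of how a pairwise sequence identity can be derived from a gapped global alignment, the following Python sketch counts identical columns in two pre-aligned sequences. It is not the EMBOSS needle implementation. The choice of denominator (the full alignment length, including gapped columns) is an assumption made for this sketch, and the function name `percent_identity` and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both rows.

    Both inputs must already be globally aligned, with '-' marking gaps.
    Assumption: the denominator is the alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if it holds the same non-gap residue.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical): 4 identical columns out of 6.
print(round(percent_identity("ACDE-G", "ACDFHG"), 1))
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so reported identities are only comparable when the same convention is used.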
Several structures of wild-type and mutated variants of leishmanial (L. mexicana) and trypanosomal (T. brucei brucei) thiolases have been determined in the present study. The structures include unliganded and liganded complexes, as summarized in Tables 2 and 3. The structural data have been complemented by enzyme assay data (Table 4). Sequence alignments of Lm- and Tb-thiolases with the well-characterized thiolases of Saccharomyces cerevisiae and Z. ramigera (Figure 2) indicate that the nucleophilic cysteine residues of leishmanial and trypanosomal thiolases are Cys123 and Cys120 respectively, whereas the cysteine residues of the HDCF loops are Cys340 and Cys337 respectively. Therefore the structures of C123A and C123S Lm-thiolase were also determined. As shown in the subsequent sections, the cysteine residue of the HDCF motif of Lm- and Tb-thiolase is likely to be functionally equivalent to the CxG cysteine residues of the classical thiolases. Therefore the structure of C337A Tb-thiolase was also determined.
Structures of Lm-thiolases were determined in two different crystal forms (Table 3). The unliganded wild-type Lm-thiolase, C123A Lm-thiolase and C123S Lm-thiolase crystallized with two protomers (subunits A and B) in the asymmetric unit of P65. Only a few residues (1–10) at the N-terminus of each of these structures could not be modelled in the electron density maps. The nucleophilic Cys123 appears to be partially oxidized in the wild-type Lm-thiolase and has been modelled in both active sites as an oxidized cysteine sulfonate moiety. Crystals of the CoA complex (refined at 2.45 Å) could only be obtained by co-crystallization of C123A Lm-thiolase with CoA. In these crystals, there are two dimers in the asymmetric unit of P21 (Table 3). The mode of binding of CoA is well defined by its density in the A, B and D subunits (Supplementary Figure S3 at http://www.biochemj.org/bj/455/bj4550119add.htm) and is best defined in the B-subunit. There are no other structural differences between the unliganded (wild-type, oxidized) and the CoA-complexed (C123A) structures. In each of the variants good electron density was observed for the mutated loops and the associated B-factors were low. Furthermore, the mutated loops adopt the same conformation as in the wild-type, indicating that the single amino acid substitutions have not caused significant structural changes.
The wild-type Tb-thiolase (refined at 2.45 Å) and the C337A Tb-thiolase were crystallized in the absence of CoA in space group P31 (Table 3) with two dimers in the asymmetric unit. These structures have been refined using the twin refinement option in PHENIX.REFINE. In these structures, 11 residues at the N-terminus, as well as residues 34–43 (Nβ1–Nα1) and 102–104 (Nα2–Nβ3) could not be modelled in the map. The B-factor plots of the refined structures show that the HDCF loop is well defined in the wild-type as well as in the C337A variant. The conformation of the HDCF loop is the same in both structures, showing that the C337A mutation does not induce structural changes in the loop. In contrast with Lm-thiolase, the N-terminal nucleophilic catalytic Cys120 is not oxidized.
The Lm-thiolase and Tb-thiolase sequences share 71% sequence identity (Figure 2). As can be expected from this high degree of identity, the Lm-thiolase and Tb-thiolase structures are very similar. The RMSD for the superimposed Cα-atoms of the B-subunits of wild-type Lm-thiolase and Tb-thiolase calculated using COOT is 0.8 Å. The active site geometries are also very similar. It can be noted that the Nβ1–Nα1 loop is ordered in Lm-thiolase, but disordered in Tb-thiolase. In the subsequent structure analysis, unless otherwise stated, the B-subunit of the Lm-thiolase–CoA complex (PDB code 3ZBN) has been used for the comparisons.
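For readers unfamiliar with the RMSD measure quoted above, the following Python sketch shows the underlying arithmetic for two sets of Cα coordinates. It assumes that the structures have already been superimposed and that the atoms are listed in equivalent pairs; it does not perform the superposition itself (done with the SSM feature of COOT in this study). The function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (in the input units, e.g. Å) between
    paired atoms of two structures that are already superimposed."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    if not coords_a:
        raise ValueError("no atoms supplied")
    # Sum of squared distances over all equivalent atom pairs.
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates (hypothetical): every atom is displaced by 1 Å along z.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(ca_rmsd(structure_1, structure_2))
```

Because the deviations are squared before averaging, a few poorly matching loops can dominate the value, which is why the set of atom pairs included in the calculation matters when comparing reported RMSDs.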
The overall structure of Lm-thiolase and the mode of CoA binding
Analyses of the structures using the PISA server show that the two protomers in the asymmetric unit of Lm-thiolase and Tb-thiolase represent the physiological dimer. The solvent-accessible surface area of the dimer interface is 2300 Å². SLS and gel-filtration experiments with Lm-thiolase also confirm its dimeric state (Supplementary Figure S1). The structures show that this dimerization mode is the same as that of the previously described canonical thiolases. The detailed structural comparisons described in the present study have been made with respect to the structure of the tetrameric Z. ramigera CT-thiolase, which is a short-chain biosynthetic thiolase (Table 1). To a lesser extent, the structure of the dimeric yeast (S. cerevisiae) AB-thiolase, which is a long-chain degradative thiolase, and that of the T2-thiolase (for studying the mode of binding of the 2-methyl fatty acid CoA moiety) have also been used for comparisons.
Major deletions and insertions in Lm- and Tb-thiolases with respect to the Z. ramigera and the yeast thiolases occur in the loop domain (Figure 2). For example, Lm- and Tb-thiolases do not have the tetramerization and cationic loops (Figure 3), whereas the adenine-binding loop and the covering loop are longer. Furthermore, the helix, Lα1, is in a different orientation (Figures 3 and 4). The more extended covering loop in Lm-thiolase is also involved in monomer–monomer contacts at the dimer interface. The Nβ1–Nα1 loop is longer by six residues in Lm-thiolase (Figure 2) and partially fills the space occupied by the cationic loop of Z. ramigera thiolase. The longer adenine-binding loop of Lm-thiolase causes large differences in the mode of binding of the adenosine 3′,5′-diphosphate part of CoA (Figure 3). The adenosine-binding pocket is lined by two positively charged residues, Lys28 and Arg159. Lys28 of the Nβ1–Nα1 loop points towards the pyrophosphate moiety whereas Arg159 (Lα1) points to the 3′-phosphate. In the CoA–Lm-thiolase complex, the adenine moiety points away from the Lα3-helix (Figure 3) and is much more exposed to bulk solvent. The different mode of binding of CoA in Lm-thiolase and Z. ramigera thiolase is consistent with the observation that the adenine-binding pocket in Z. ramigera is filled by larger side chains in Lm-thiolase such as those of Leu228 (Ala223 in Z. ramigera thiolase), Tyr212 (Gln183) and Thr256 (Gly244). The pantetheine part of CoA points inwards and is buried in a tunnel (Figure 3), as in the Z. ramigera thiolase. In Lm-thiolase as well as in the other structurally characterized thiolases, CoA interacts with Ser259 of the pantetheine-binding loop of Lm-thiolase (see also Figure 5). The side chain of this serine residue is weakly hydrogen bonded to the pantetheine N4 atom. The pantetheine-binding tunnel is completed by Leu165 of the covering loop and the Phe182 side chain situated at the beginning of helix Lα2. The mode of binding of the pantetheine moiety defines the catalytic pocket located near the terminal sulfur of CoA.
The active site of Lm-thiolase
The catalytic cavity of Lm-thiolase is deeply buried (Figure 3) as in the Z. ramigera thiolase. The side chains and main-chain segments which outline the active site are well defined in the electron density map of the apo and complex structures. In Lm-thiolase, all these residues are from the same subunit. In particular, the four loops associated with the characteristic sequence fingerprints (Figure 2) are important for the substrate and catalytic specificities. Two cysteine residues protrude out of these loops into the active site, being the CxS Cys123 and the HDCF Cys340. When compared with Z. ramigera thiolase, there are major sequence and structural differences in the substrate specificity (Figure 4) and catalytic (Figure 5) loops of Lm-thiolase. As discussed in the subsequent sections, some of these loops also shape the two oxyanion holes that are essential for stabilizing the intermediates of the catalytic cycle. These oxyanion holes, OAH1 and OAH2, are preserved in Lm-thiolase, but their precise geometry is distinctly different (Figure 6).
Enzymological properties of the active site variants of Lm-thiolase and Tb-thiolase
The importance of the two active site cysteine residues for catalysis was probed by studying the enzyme activity of wild-type Lm-thiolase and Tb-thiolase as well as of variants in which these cysteine residues were mutated into an alanine or a serine (Table 4). The activity data show that the wild-type enzyme is active in both the degradative (using acetoacetyl-CoA as substrate) and synthetic (using acetyl-CoA as substrate) directions. The enzyme is 6-fold more active in the synthetic direction than in the degradative direction (Table 4). For the variants in which these cysteine residues are mutated into an alanine, no activity could be detected in either direction. These data confirm the importance of the Cys123 (the CxS cysteine) as well as the Cys340 (the HDCF cysteine) for catalysis. Residual activity (approximately 10-fold lower as compared with the wild-type) is observed for the C123S variant of Lm-thiolase in the degradative direction. No activity could be detected for this variant in the synthetic direction.
The Lm-thiolase and Tb-thiolase genes were identified by genome searches, using the sequence of the human SCP2-thiolase domain as the query. The amino acid sequence identity of the trypanosomatid and the human SCP2-thiolase is low: 26.9% and 26.1% for Lm-thiolase and Tb-thiolase respectively. Therefore a more extensive evolutionary sequence analysis was carried out (Figure 7) which suggested that the family of SCP2 sequences can be grouped in two separate divergent classes. The alignment of these sequences confirms that the nucleophilic cysteine residue of the CxS motif as well as the histidine residue of the GHP motif is conserved. Additionally, in all of these SCP2-thiolases, the HDCF-motif (Supplementary Figure S2) is conserved and the catalytic base, Cys378 of the CxG motif, is absent. The presence of the HDCF motif and the absence of the CxG motif therefore appear to define the SCP2-thiolase subfamily.
The substrate-specificity loops of Lm-thiolase
Lm-thiolase and Tb-thiolase have been classified as belonging to the SCP2-thiolase subfamily because of their overall sequence relationship with the mammalian SCP2-thiolases. The mammalian SCP2-thiolases are characterized by their specificity for long-chain fatty acyl CoA substrates with a 2-methyl group as well as having a steroid moiety at the ω-end of the tail. Therefore two different aspects need to be considered with respect to the substrate specificity of Lm-thiolase: the possibility of binding of a ligand with a 2-methyl group and a long-chain fatty acid tail. Three different loops, the covering loop (of the loop-domain), the Cβ1–Cα1 loop, and the Cβ4–Cβ5 loop, are important for defining substrate specificity.
Regarding the possible mode of binding of an extended fatty acid tail to Lm-thiolase, like the steroid moiety, the best insight is provided by structural studies on the degradative yeast AB-thiolase that accepts ligands with extended fatty acid tails. These studies have shown that the helix Lα1 and the subsequent covering loop adopt a completely different conformation when compared with those of the Z. ramigera thiolase structure (Figure 3), providing the extra space needed for the binding of the long-chain fatty acid tail (Figure 3). In the unliganded yeast AB-thiolase, this extra space is occupied by an MPD molecule (Figure 3). In Lm-thiolase, Lα1 and the following covering loop adopt a conformation similar to that of the Z. ramigera thiolase, suggesting that Lm-thiolase may not bind long-chain fatty acid tails.
Concerning the binding pocket for the 2-methyl branch, the mitochondrial short-chain T2-thiolase structure has shown the importance of the Cβ1–Cα1 loop in this respect. In T2-thiolase, the structure and sequence of this loop, highlighted in Figure 2 by the VMG sequence fingerprint of the Z. ramigera thiolase, create the space required for the 2-methyl group of 2-methyl-acetoacetyl-CoA. In Lm-thiolase, the Cβ1–Cα1, Cβ4–Cβ5 and the covering loop adopt very different folds (Figure 4). The side chain of Phe182 of the Lm-thiolase covering loop and the main chain of the Cβ4–Cβ5 loop, in particular O(Gly428), fill the space corresponding to the binding pocket found in T2-thiolase for the 2-methyl group. This implies that 2-methyl-branched fatty acyl CoAs are not substrates of Lm-thiolase.
Comparison of the Lm-thiolase structure with that of the Z. ramigera structure complexed with acetoacetyl-CoA (PDB code 1M1O) shows that the acetoacetyl moiety could fit snugly in its predicted binding cavity in Lm-thiolase (Figure 4), formed by the side chains of residues Phe81, Ala122, Leu165, Ala168 and Phe182. Phe81 protrudes out of the Nβ2–Nα2 loop and Ala122 points out of the Nβ3–Nα3 loop. Leu165 and Ala168 protrude out of the covering loop and Phe182 is at the beginning of Lα2. The side chain of Phe182 adopts two conformations, correlated with the absence/presence of an active site ligand, being either CoA or the sulfonate moiety. Also the side chains of Val390 of the GHP Cβ3–Cα3 loop and Met426 of the MGG motif of the Cβ4–Cβ5 loop (Figure 2) point into this tight, hydrophobic binding pocket, suggesting that the Lm-thiolase is probably specific to short-chain substrates.
The substrate specificity of Lm-thiolase has not been studied by enzyme activity experiments. The structural data of the present study suggest, surprisingly, that Lm-thiolase is unlikely to bind 2-methyl-branched or long-chain fatty acyl CoA substrate molecules. These structural data therefore indicate that the Lm-thiolase may be considered as a prototype of a separate class within the family of SCP2-thiolases, displaying a substrate specificity that is distinctly different from that of the mammalian homologues. Indeed, the existence of two classes of SCP2-thiolases agrees well with the bioinformatics analysis (Figure 7).
The catalytic loops of Lm-thiolase
In Lm-thiolase, there are no structural changes with respect to Z. ramigera thiolase in the Nβ3–Nα3 CxS loop that provides the nucleophilic cysteine residue. Also the histidine residue side chain of the Cβ3–Cα3 GHP loop, which activates the nucleophilic cysteine residue, is conserved. Interestingly, significant structural and sequence changes are observed in the other two catalytic loops: the Cβ2–Cα2 NEAF/HDCF loop and the Cβ4–Cβ5 CxG loop. The conserved fingerprint of the latter loop (CxG in the Z. ramigera thiolase) is MGG in Lm-thiolase. This loop also provides the main-chain NH-group for OAH2 in the Z. ramigera thiolase [by N(Gly380), together with N(Cys89)]. OAH2 binds the thioester oxygen atom of the acetylated enzyme (Figure 8). The structural comparison of the Z. ramigera and the oxidized Lm-thiolase, presented in Figure 6, shows that the OAH2 geometry is preserved in Lm-thiolase, being formed by N(Cys123) and N(Gly428). In the active site of the non-oxidized Lm-thiolase, a water molecule is hydrogen bonded to these main-chain N-atoms of OAH2.
The main-chain conformation of the HDCF loop in Lm-thiolase and the corresponding NEAF loop in Z. ramigera thiolase are similar (Figure 5). The HDCF loop is a conserved feature of SCP2-thiolases (Supplementary Figure S2). The structural comparison shows that the first residue of the HDCF loop, His338, replaces the Asn316–Wat82 diad of OAH1 and its NE2 atom contributes to OAH1. OAH1 binds the thioester oxygen of the bound acyl-CoA molecule (Figure 8). The second hydrogen bond donor of OAH1 is the NE2-atom of His388 of GHP. This property of OAH1, being formed by two histidine side chains (Figure 6), is a unique feature of the SCP2-thiolase subfamily and has not been observed in any of the previously characterized thiolases. However, the presence of two histidine residues pointing into the catalytic site is also seen in some other members of the thiolase superfamily, referred to as the CHH-subfamily. Typical examples of members of this latter subfamily are the type-I and type-II β-oxoacyl acyl carrier protein synthases (KAS-I and KAS-II), which are involved in the fatty acid synthesis pathway.
The second residue of the HDCF motif, Asp339, points inwards and appears to function as an anchor, like the glutamate residue of the NEAF motif. Interestingly, the position of the sulfur of Cys340 in the HDCF-motif of the SCP2-thiolase is spatially close (3.3 Å) to that of the corresponding atom of the acid/base cysteine residue of the Z. ramigera thiolase, present in the Cβ4–Cβ5 CxG loop (Figure 5). The sulfur atoms of these two cysteine residues point into the fatty acid tail binding pocket from the same side. Comparison of the active sites (Figure 5) shows that Cys340 is well positioned to function as the catalytic acid/base allowing for proton exchange with the C2-carbon of the fatty acid tail. The geometry of the active site of Lm-thiolase (Figure 5) therefore strongly suggests that the HDCF cysteine residue is the acid/base catalytic residue of the SCP2-thiolase subfamily, as discussed further in the next section.
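The spatial comparison above reduces to a distance between two atoms once both structures have been superposed in one coordinate frame. A minimal sketch of that calculation is given here; the function name is illustrative and the coordinates are hypothetical placeholders, not values taken from the deposited structures.

```python
import math

def atom_distance(a, b):
    """Euclidean distance (in Å) between two atoms given as (x, y, z) tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical placeholder coordinates (Å) for two sulfur atoms after
# superposition of two structures; NOT taken from any PDB entry.
sg_hdcf_cys = (10.0, 12.0, 8.0)
sg_cxg_cys = (12.0, 14.0, 9.0)

print(f"{atom_distance(sg_hdcf_cys, sg_cxg_cys):.1f} Å")
```

In practice the two coordinate tuples would be read from the superposed co-ordinate files for the atoms of interest; the superposition itself is outside the scope of this sketch.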
The side chain of Phe341 of the HDCF-motif in Lm-thiolase lines the pantetheine-binding tunnel, just as Phe319 of the NEAF loop in the Z. ramigera thiolase (Figure 6). In summary, the side chains of the residues of this HDCF loop contribute to two important geometric features of the catalytic pocket: (i) the oxyanion hole OAH1 (His338) and (ii) the acid/base (Cys340).
The reaction mechanism
The catalytic site of Lm-thiolase is polar and deeply buried (Figure 3). Centrally positioned in this catalytic site is the nucleophilic Cys123. Previous studies on other thiolases have shown that, depending on the crystallization conditions, this cysteine residue is sometimes oxidized to a sulfenic acid, with the extra oxygen pointing into OAH2. Interestingly, in the wild-type Lm-thiolase active site, this cysteine residue is oxidized to a sulfonate moiety and the structure shows that one oxygen atom of the sulfonate moiety points into OAH1, and another oxygen atom points into OAH2 (Figure 6).
Comparisons of the catalytic loops clearly show that the catalytic machinery of the Lm-thiolase is different from that of the Z. ramigera thiolase. Key differences between the catalytic site of Lm-thiolase and the Z. ramigera thiolase are: (i) in the Lm-thiolase the catalytic Cys378 of the CxG-loop of Z. ramigera is absent, and apparently replaced by the cysteine residue of the HDCF loop; (ii) OAH1 is formed by two NE2 (histidine) side-chain atoms; and (iii) in the Z. ramigera thiolase the catalytic acid/base Cys378 is activated via two water molecules referred to as Wat82 and Wat49 (Figure 6), whereas in the catalytic site of Lm-thiolase these two water molecules are absent. It has been proposed that, in the Z. ramigera thiolase, these two water molecules stabilize the deprotonated acid/base cysteine residue, which is an important intermediate in the catalytic cycle. In Lm-thiolase the corresponding Cys340 is covered by the side chains of Phe186 and Phe341. Interestingly, the Cys340 side chain sulfur atom is hydrogen bonded to two main-chain NH-groups, of Cys340 and Phe341 of the HDCF loop (Figure 5), forming an oxyanion hole that can stabilize the deprotonated cysteine during the course of catalysis. This different mode of activation of the deprotonated acid/base cysteine residue is visualized in Figure 8.
The enzyme activity data of the C123A and C123S mutants of Lm-thiolase confirm the importance of Cys123 for catalysis. C123A Lm-thiolase is inactive, whereas C123S Lm-thiolase retains only residual activity in the degradative direction, as also observed for the corresponding point mutation in the study of Z. ramigera thiolase. With Z. ramigera thiolase, the rate of catalysis in the C89S mutant is approximately 100-fold reduced in the thiolytic direction, whereas in the synthetic direction the rate decrease is approximately 2000-fold. Also in the C123S variant of Lm-thiolase the rate in the synthetic direction is much more affected, as shown by the undetectable catalytic activity.
The enzyme activity data of the HDAF variant show the critical role of Cys340 in catalysis, as this variant of Tb-thiolase is inactive. Likewise the functional role of the corresponding catalytic acid/base, Cys378, of Z. ramigera thiolase has also been demonstrated and it was found that the kcat value of the C378G variant is 50000-fold lower in the thiolytic direction.
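The fold changes quoted above for the Z. ramigera thiolase variants can be placed on an energy scale. Under transition-state theory, and assuming (this is an illustrative assumption, not a result of the present study) that a mutation changes only the barrier of the rate-limiting step, an n-fold rate reduction corresponds to an apparent barrier increase of ΔΔG‡ = RT ln(n). A minimal sketch, with 298.15 K as an assumed temperature and a hypothetical function name:

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def barrier_increase(fold_reduction, temperature_k=298.15):
    """Apparent increase in activation free energy (kcal/mol) for a
    given fold reduction in rate: ddG = R * T * ln(fold)."""
    if fold_reduction <= 0:
        raise ValueError("fold_reduction must be positive")
    return R_KCAL * temperature_k * math.log(fold_reduction)

# Fold reductions quoted in the text for Z. ramigera thiolase variants.
for label, fold in [("C89S, thiolytic", 100),
                    ("C89S, synthetic", 2000),
                    ("C378G, thiolytic", 50000)]:
    print(f"{label}: {barrier_increase(fold):.1f} kcal/mol")
```

On these assumptions the three fold changes correspond to roughly 2.7, 4.5 and 6.4 kcal/mol respectively. These are order-of-magnitude guides only, since a mutation may also change which step is rate limiting.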
The enzyme activity assays (Table 4) show that Lm-thiolase and Tb-thiolase are more active in the synthetic than in the degradative direction, contrary to what has been found for the Z. ramigera thiolase. Apparently, the different mechanisms by which the reactive intermediates of the catalytic cycles of Lm-thiolase and Z. ramigera thiolase are stabilized result in significant changes in the respective free-energy profiles of the catalytic cycle. This subsequently results in differences in the rate-limiting steps of the respective catalytic cycles, to the extent that in Lm-thiolase the rate in the synthetic direction is faster than in the degradative direction.
The function of the SCP2-thiolase in trypanosomatids
The high degree of identity between the Lm-thiolase and Tb-thiolase sequences (Figure 2) suggests that these two thiolases exert the same metabolic role in the two different parasites. The high similarity of the active sites of Lm-thiolase and Tb-thiolase is also indicative of the same reaction mechanism. In addition, their specific enzyme activities (Table 4) are very similar. The bioinformatics study (Figure 7) shows that the trypanosomatid thiolases form a separate subgroup distinct from the mammalian SCP2-thiolases. Pinning down the precise physiological function of the SCP2-thiolase in Leishmania spp. and T. brucei is difficult, as these parasites have complicated life cycles each involving two very different hosts and different compartments within their hosts, and they thus encounter environments with different nutrients. Thiolases are known to be functional in synthetic pathways, catalysing the Claisen condensation reaction, as well as in degradative pathways, in which 3-oxo-acyl-CoA molecules are shortened, such as in the bile acid synthesis pathway and in the β-oxidation pathway. The structural (Figure 4) and enzymological data (Table 4) of Lm-thiolase and Tb-thiolase suggest that their SCP2-thiolases are probably short-chain, acetoacetyl-CoA thiolases, implying that they could be involved in a synthetic pathway.
Interestingly, sterol biosynthesis has been reported to be an essential metabolic pathway in the human pathogenic stages of some trypanosomatid parasites. Sterol molecules are important constituents of the trypanosomatid cell membrane. It is known that trypanosomatids like T. cruzi and Leishmania spp. do synthesize several sterols and this pathway has been validated as a drug target for both these organisms. Sterol synthesis comprises several pathways including the mevalonate and isoprenoid pathways. The first enzyme of the mevalonate pathway is a biosynthetic thiolase. The studies reported in the present paper suggest that this activity could be provided by the Lm-thiolase and it is reasonable to assume that this is also the case in T. cruzi, which also contains an SCP2-thiolase (Figure 7). Therefore this enzyme in these two parasites deserves further research for drug target validation and discovery. The other two enzymes of this pathway, HMG (3-hydroxy-3-methylglutaryl)-CoA synthase and HMG-CoA reductase, are also mitochondrial proteins, like the trypanosomal Tb-thiolase. The enzyme is not expressed in the bloodstream form of T. brucei and it is interesting to note that knockdown of its expression in the procyclic form does not result in a growth phenotype. Therefore better understanding of the role and mechanism of trypanosomal sterol biosynthesis is required.
The crystallographic studies of these SCP2-thiolases have provided, for the first time, insight into the unique features of the active site of this subfamily of thiolases. The function of the HDCF loop has been unravelled, showing that the cysteine residue of this motif is the catalytic acid/base residue in this class of thiolases, replacing the corresponding function of the CxG cysteine residue in the Cβ4–Cβ5 loop of the canonical thiolases. The mode of activation of the HDCF cysteine residue appears to be different from the CxG cysteine residue, as its deprotonated form is stabilized by two main-chain NH hydrogen-bond donors (Figure 8), instead of two water molecules. The histidine side chain of the HDCF motif also has an important role in the active site, as it replaces the Asn316–Wat82 diad of OAH1 of the Z. ramigera thiolase. The substrate-specificity pocket of the Tb-thiolase and Lm-thiolase is found to be rather narrow and hydrophobic, due to the structures of the covering loop, the Cβ1–Cα1 loop and the Cβ4–Cβ5 loop and additional side chains (Figure 4). The enzyme activity data and the geometry of the active site, surprisingly, suggest that the trypanosomatid SCP2-thiolase is likely to be a biosynthetic acetoacetyl-CoA thiolase, for example participating in the sterol biosynthesis pathway, and thus has a function different from that of the mammalian SCP2-thiolase. In the human pathogenic stages of T. cruzi and Leishmania spp., the sterol biosynthesis pathway is known to be essential. For understanding the function of the trypanosomal SCP2-thiolase, it is important to note that the Tb-thiolase has been found to be located predominantly in the mitochondrion of the procyclic form. It is also of interest to note that the Tb-thiolase is the only thiolase encoded in the T. brucei genome. Future structural enzymological studies will provide more insight into the substrate and catalytic specificity, and consequently the function, of the trypanosomatid SCP2-thiolases.
Rajesh Harijan performed overexpression, purification, enzyme assay, crystallization, X-ray data collection, structure determination and bioinformatics. Mikael Karjalainen made the mutations. Manfred Weiss performed the data collection of wild-type Lm-thiolase. Paul Michels provided the Lm- and Tb-thiolase clones. Rik Wierenga, Tiila Kiema, Paul Michels and M.R.N. Murthy conceived and supervised the project. Rajesh Harijan and Rik Wierenga analysed the results and wrote the paper. Rajesh Harijan, Rik Wierenga, Tiila Kiema, Paul Michels, M.R.N. Murthy and Neelanjana Janardan finalized the paper.
This research was supported through the Academy of Finland [grant number 131795 (to R.K.W.)]. The research of M.R.N.M. was partially covered by an INDO-Finnish grant.
The intensity data were collected at the Biocenter Oulu Protein Crystallography core facility of the Department of Biochemistry, at the BM ID-23-2 beamline, ESRF, Grenoble, France and at the BL 14.1 beamline, BESSY II, Berlin, Germany. We thank Dr J. Barycki (Biochemistry Department, University of Nebraska, Lincoln, NE, U.S.A.) for providing the plasmid of the short-chain L-3-hydroxyacyl-CoA dehydrogenase and Dr Rajaram Venkatesan for collecting the data at the ESRF. We also thank Ville Ratas for expert support.
The structural co-ordinates reported will appear in the PDB under accession codes 3ZBG, 3ZBN, 3ZBK, 3ZBL, 4BI9 and 4BIA.
Abbreviations: HMG, 3-hydroxy-3-methylglutaryl; Lm-thiolase, Leishmania mexicana SCP2-thiolase; MPD, 2-methyl-2,4-pentanediol; OAH, oxyanion hole; SCP2, sterol carrier protein, type-2; SLS, static light scattering; Tb-thiolase, Trypanosoma brucei brucei SCP2-thiolase; TFE, trifunctional enzyme; WHO, World Health Organization
- © The Authors Journal compilation © 2013 Biochemical Society