Review article

A common structural blueprint for plant UDP-sugar-producing pyrophosphorylases

Leszek A. Kleczkowski, Matt Geisler, Elisabeth Fitzek, Malgorzata Wilczynska


Plant pyrophosphorylases that are capable of producing UDP-sugars, key precursors for glycosylation reactions, include UDP-glucose pyrophosphorylases (A- and B-type), UDP-sugar pyrophosphorylase and UDP-N-acetylglucosamine pyrophosphorylase. Although not sharing significant homology at the amino acid sequence level, the proteins share a common structural blueprint. Their structures are characterized by the presence of the Rossmann fold in the central (catalytic) domain linked to enzyme-specific N-terminal and C-terminal domains, which may play regulatory functions. Molecular mobility between these domains plays an important role in substrate binding and catalysis. Evolutionary relationships and the role of (de)oligomerization as a regulatory mechanism are discussed.

  • oligomerization
  • protein structure
  • sugar activation
  • UDP-sugar synthesis


UDP-sugars serve as direct precursors for sucrose and for most polysaccharides in plants, including cellulose, hemicelluloses and pectins. They are also precursors for carbohydrate chains of glycolipids and glycoproteins, and for glycosylation of myriad secondary metabolites, among other functions [1,2]. UDP-sugars are, by far, the main precursors for biomass production in plants [2]. UDP-Glc (where Glc is glucose), the major UDP-sugar in plants and a key substrate for sucrose and cellulose synthesis, may also serve as a precursor for synthesis of other UDP-sugars or UDP-sugar-analogues, e.g. UDP-Gal (via UDP-Glc epimerase) or UDP-GlcA (via UDP-Glc dehydrogenase) [1–4] (where Gal is galactose and GlcA is glucuronic acid).

The pyrophosphorylases discussed in the present review reversibly catalyse the transfer of a uridyl group from UTP to a sugar monophosphate (sugar-1-P) (or a sugar-1-P analogue), producing UDP-sugar (or UDP-sugar analogue) and PPi. These proteins include UGPase (UDP-Glc pyrophosphorylase), USPase (UDP-sugar pyrophosphorylase) and UAGPase [UDP-GlcNAc (UDP-N-acetylglucosamine) pyrophosphorylase], and they differ in the specificity and efficiency of their reactions. Whereas UGPase is fairly specific for UTP and Glc-1-P as substrates [4], both USPase and UAGPase can also use a variety of other phosphorylated sugars or sugar analogues. Those include Gal-1-P, GlcA-1-P, Ara-1-P and Xyl-1-P (for USPase) [5–11] as well as GlcNAc-1-P and GalNAc-1-P (for UAGPase) [12–14] (where Ara is arabinose, Xyl is xylose, and GalNAc is N-acetylgalactosamine). All of these proteins, much like other eukaryotic pyrophosphorylases [15], carry out an ordered reaction, where UTP has to bind first to the active site before sugar-1-P binds [5,13,16–21].

There are two types of UGPase: UGPase-A, a largely cytosolic enzyme [4]; and UGPase-B, a recently discovered plastidial protein [20]. Both UAGPase and USPase are cytosolic [4]. Plants contain two distinct isoenzymes of each of UGPase-A and UAGPase, whereas both USPase and UGPase-B exist as single proteins per species [2,4,14,20,22]. Both UGPase-A and USPase are essential for reproductive processes, with male sterility as the most serious consequence in the loss-of-function mutants [6,11,23–26], whereas UGPase-B is essential in sulfolipid formation [20]. Nothing is known about the role of UAGPase in plants.


Despite sharing the same catalytic function (production of UDP-Glc and PPi from Glc-1-P and UTP), UGPases (either A or B type), USPase and UAGPase have very low homology at the aa (amino acid) level, with at most 22% identity. Within those low homology numbers, UAGPase and USPase are consistently closer to each other, whereas UGPase-B is most distant (Supplementary Figure S1). Both UGPase-A and UAGPase occur in all eukaryotes, whereas UGPase-B is apparently specific for plants and cyanobacteria [20,27]. Plants are the only organisms to contain all four types of the pyrophosphorylases [4]. USPase was believed to be present in plants only [5], but related proteins have recently been described for the protozoans Leishmania and Trypanosoma [9,10] and some bacteria [11]. USPase-like activities reported for animal tissues (e.g. [28]) are likely to belong to animal UGPase-A which, in contrast with plant UGPase-A [22], has some non-specific activity with a variety of sugar phosphates [11,29]. On the basis of aa sequence comparisons, as proposed previously [5,8,20,27,30], all of the UDP-sugar producing pyrophosphorylases can be phylogenetically categorized into four distinct groups that diverged early, possibly in prokaryotic or early eukaryotic ancestors (Figure 1A).
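To make the identity figures above concrete, the following is a minimal sketch of how a pairwise percentage identity can be computed from two already-aligned sequences. It is an illustration only, not the procedure used for the comparisons in this review: the function name percent_identity, the choice to ignore gapped columns, and the toy sequences are all assumptions introduced here. Reported identity values depend on the alignment method and on which columns are counted in the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the shorter sequence length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments, invented for illustration (not real enzyme sequences).
toy_a = "MKT-LLGAVE"
toy_b = "MRTALL-AIE"
print(percent_identity(toy_a, toy_b))  # 6 identical of 8 compared columns
```

With this convention, two proteins can score near 20% identity and still share a fold, which is why the structural comparison that follows is informative.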

Figure 1 Phylogenetic tree based on amino acid sequences and structural comparison of UDP-sugar-producing pyrophosphorylases

(A) Maximum parsimony tree for UAGPase, UGPase-A, UGPase-B and USPase from Arabidopsis (At), Leishmania (Lm), humans (Hs) and Candida albicans (Ca). This includes four proteins that had their crystal structures resolved (with PDB codes listed) and three proteins that were homology modelled (HM) (see Figure 2). A dendrogram (B) was constructed for the same proteins on the basis of their structural similarity (delta QH), where their crystal structures and homology models were aligned and compared using maximum parsimony. Structures of AtUAGPase and AtUSPase were modelled using HsUAGPase (PDB code 1JVG) and LmUSPase (PDB code 3OH4) respectively as templates, whereas AtUGPase-B was modelled on CaUAGPase (PDB code 2YQJ). A structural comparison of the position of all carbon backbone atoms for all seven sequences was performed according to the method of Roberts et al. [33].

Figure 2 Structures of Arabidopsis UDP-sugar-producing pyrophosphorylases

(A) Crystal structure for Arabidopsis UGPase-A (PDB code 2ICY) [17]. Parts of the protein corresponding to the N- and C-terminal domains and the positioning of the NB-loop are indicated. (B) Model of UAGPase2 based on crystal structures of AGX (PDB code 1JVG) [13]. (C) Model of USPase based on Leishmania USPase (PDB code 3OH4) [21]. (D) Model of UGPase-B based on Candida albicans UAGPase (PDB code 2YQJ) [18]. (E) Superpositioning of all four pyrophosphorylases. (F) Close-up view of the active site from the entrance side, with the NB- and SB-loops indicated for UGPase-A and UGPase-B as examples. Homology structures were generated using 3D-Jigsaw [48], Swiss-Model and DeepView (version 4.01), and the models were refined using 700 iterations of steepest descent and 500 iterations of conjugate gradients for energy minimization in DeepView. The positioning of substrate (UDP-Glc, white spacefill) within the active site of each protein was based on structural superposition with crystallized UGPase-A. The structures are shown in identical orientation and scale according to best-fit superpositioning as cartoon ribbon diagrams using the VMD software package [49]. Superpositioning was performed using the VMD MultiSeq plugin [33].

Protein function is associated with the occurrence of key aa and, more importantly, the three-dimensional structure or fold. In many anciently diverged protein families, e.g. pseudouridine synthases and aminoacyl tRNA synthases, aa sequence homology can be very low despite proteins having the same function, and structural similarity is often a better predictor of function [31,32]. In the UDP-sugar-producing pyrophosphorylase superfamily, a structure comparison (Figure 1B) using QH (Q-homology) [33] revealed that UGPase-B was structurally quite similar to UAGPase, despite low aa identity and distant evolutionary origin (compare the phylogenetic tree to the QH tree in Figure 1). USPase is also phylogenetically closer to UAGPases, but has less related structure than UGPase-B. On the other hand, UGPase-A has the most divergent structure, but is closer in sequence evolution to the UAGPase/USPase families than UGPase-B.


Studies on crystal structures of UAGPase [13,18], UGPase-A [17,19,34] and USPase [21] revealed that the proteins share similar molecular architecture. In Figure 2, we compared structures of crystallized Arabidopsis UGPase-A (PDB code 2ICY) with homology models for Arabidopsis UAGPase2, UGPase-B and USPase. Each homology model was based on crystal structures of a related eukaryotic protein of the same family [i.e. Leishmania USPase (PDB code 3OH4) for Arabidopsis USPase, and AGX (human UAGPase; PDB code 1JVG) for Arabidopsis UAGPase], with the exception of UGPase-B, which was modelled on the closest homologous sequence with known structure (yeast UAGPase).

All of those pyrophosphorylases have elongated structures that are built from three domains: the central catalytic domain and two flanking N- and C-terminal domains. In all cases, the central domains reveal a common structure that includes a dominant single Rossmann fold that is built of a central mixed β-sheet, where each β-strand is, at both ends, linked to α-helices. The active centre of the pyrophosphorylases is in the form of a two-lobed pocket and is supported from one side by the central β-sheet. The first lobe encompasses the so called ‘NB (nucleotide-binding)-loop’, which interacts with the nucleotide substrate. The second lobe is involved in sugar binding and includes a mobile ‘SB (sugar-binding)-loop’. Both N- and C-domains have enzyme-specific folds and they are likely to have different regulatory functions in different pyrophosphorylases. However, similarities have been found in the way that the domains are linked to the central domain. The link for N-terminal domains is tight, as they are linked via two loops that protrude from the active-centre region of the central domain and are integrated into the N-terminal domain. This tight hinge is responsible for high interconnectivity between the central and N-terminal domains. In contrast, the connection between the central and C-terminal domains consists of a single long α-helix.

The first insight into eukaryotic UGPase-A structure came from a homology model for barley UGPase-A [30] that was computed on the basis of the crystal structure of AGX (PDB code 1JVG) [13]. Subsequent studies on crystallized UGPase-A from Leishmania, Arabidopsis and yeast have confirmed that both UAGPase and UGPase-A share general structure details; however, the two proteins substantially differ in the details of their C-terminal domains [17,19,34] (Figure 2A). Instead of several β-sheets connected by loops at the C-terminal domain, as it is in UAGPase, the C-terminal domain of UGPase-A forms a left-handed parallel β-helix, which moves towards the central domain upon substrate binding.

AGX was the first eukaryotic pyrophosphorylase of any type to have its crystal structure resolved [13]. A previous study [35] demonstrated that AGX exists as two isoforms, AGX1 and AGX2, differing in a 17 aa-long insert that was proposed to modify specificity of the UAGPase from preferential synthesis of UDP-GalNAc to that of UDP-GlcNAc. The X-ray structure has revealed that the 17 aa-long loop, the so called ‘I-loop’, is located in the C-terminal domain and is responsible for the oligomerization property of AGX1. In Arabidopsis, there are two isoenzymes of UAGPase that are products of distinct genes [14]. The homology model of Arabidopsis UAGPase2 (Figure 2B) generally overlaps with that of UAGPase1, with only a few variations in some loop regions [14]. However, the two isoenzymes clearly differed in substrate specificity, with UAGPase2, but not UAGPase1, being able to use Glc-1-P as an alternative substrate. The difference must have a structural basis in protein architecture, and probably involves a loop closest to the binding site of the sugar moiety of the substrate [14]. For both UAGPase isoenzymes, the C-terminal domain is relatively small, and its axis is oriented almost perpendicular to that in UGPase-A.

The only USPase structure resolved is for the enzyme from Leishmania (PDB code 3OH4) [21]. Its central domain resembles analogous domains in other UDP-sugar-producing pyrophosphorylases (a central sheet and arrangement of α-helices in the Rossmann fold). The N- and C-terminal domains of USPase have a certain structural similarity to those of human and yeast UAGPase [13,18], and plant and Leishmania UGPase-A [17,19] respectively. The relatively big C-terminal domain is built of two parts: a distorted β-sheet (similar to UAGPase) and a left-handed parallel β-helix (similar to UGPase-A) (Figure 2C). The β-sheet contains a loop similar to the I-loop of AGX, where it was shown to facilitate formation of an inactive dimer from active monomers [13]. However, Leishmania USPase apparently exists exclusively as a monomer, and there was no evidence for oligomerization [21].

The least-studied pyrophosphorylase is UGPase-B, with no X-ray structure known. On the basis of the crystal structure of Candida albicans UAGPase (PDB code 2YQJ) [18], we were able to model a major part of Arabidopsis UGPase-B (aa 190–733) that encompasses the entire central domain and fragments of its N- and C-terminal domains (Figure 2D). The beginning of the N-terminal region (aa 1–230) of UGPase-B could be modelled separately (results not shown), using the TASSER server, from multiple structural alignments including Shigella ArsH (a NADPH-dependent FMN reductase) and rat glutathione transferase (PDB codes 2FZV and 1R4W respectively), but the last 151 aa-long fragment of the C-terminal domain (aa 733–883) could not be modelled, as it did not align significantly with any known structure. The large N-terminal and C-terminal domains account for the fact that UGPase-B (composed of 883 aa) is much larger than UGPase-A, UAGPase and USPase (469–611 aa). In addition, UGPase-B is a plastidial protein and in its unprocessed form has a signal peptide, corresponding to the first 73 aa, which is not part of the mature Arabidopsis protein [20].

Superpositioning of UAGPase, UGPases and USPase structures (Figure 2E) supports the view of a common structural blueprint for those pyrophosphorylases, especially for the central catalytic domain. From a structural point of view, it seems that all UDP-sugar-producing pyrophosphorylases evolved from a simple precursor that had only one domain, i.e. a catalytic domain. In support of this view, bacterial UGPases (PDB codes 2PA4 and 2E3D) have some structural similarity to the central catalytic domain of all plant UDP-sugar-producing pyrophosphorylases, but they lack N- and C-terminal domains. Apparently, during evolution, not only modifications/mutations within the catalytic domain, but also the acquirement of different N- and C-terminal extensions, resulted in the panel of pyrophosphorylases that we have today: enzymes with a common catalytic mechanism, but with different substrate specificities and oligomerization abilities (see below).

As the enzymes catalyse mechanistically similar reactions, their reactive centres have an overall similar structure (Figure 2F). Several residues involved in substrate binding are conserved, and this especially involves the NB-loop: any mutation or deletion in this region had strong negative effects on activity and substrate binding [13,36–38]. However, major differences occur in the SB area, e.g. the SB cavity is larger in USPase than in UAGPase and UGPases, accounting for the fact that USPase can use multiple sugar substrates. The SB site of USPase is less shielded from the environment and contains a highly flexible region responsible for binding C-5 and C-6 of sugar substrates. This ensures that specific determinants of individual substrates are matched by specific interactions [21]. The smallest substrate-binding cavity is present in UGPases, again accounting for the fact that those enzymes are usually specific for Glc-1-P as substrate, and reflecting spatial restraints in the cavity close to C-6 of the sugar [17,21].

On the basis of comparison between structures of apoenzymes and enzymes complexed with substrates, it was proposed that UDP-sugar-producing pyrophosphorylases undergo substantial conformational changes during enzymatic catalysis [17–19]. The apoenzyme is characterized by an ‘open conformation’ with a broad entrance to the active centre. Upon substrate binding, the NB-loop closes to the nucleotide, and the SB-loop on the opposite side of the active site closes to the sugar moiety. The movement of the SB-loop induces movement of the N-terminal domain, whereas movement of the NB-loop is linked with tilting of the C-terminal domain. These conformational changes result in a ‘closed conformation’ of the enzyme where the substrates are tightly bound in the active-centre pocket, allowing catalytic reaction. The largest conformational changes occur in UGPase-A, where the C-terminal domain rotates by approximately 17° towards the central domain [17,19]. The smallest molecular mobility was found for USPase, where the N-terminal domain moves only slightly, and movements of the central and C-terminal domains are restricted to ligand-binding regions [21].
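A domain rotation such as the approximately 17° quoted above is commonly expressed as the angle of the rigid-body rotation that carries a domain from its position in one conformation to its position in the other, once the central domains have been superposed. The sketch below assumes such a 3×3 rotation matrix has already been obtained (the superposition step is not shown) and recovers the angle from its trace. The function names are hypothetical, the test matrix is synthetic, and this is not presented as the method used in the cited structural studies.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle (degrees) of a proper 3x3 rotation matrix.

    Uses the identity trace(R) = 1 + 2*cos(theta).
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(deg):
    """Synthetic rotation matrix about the z axis, used only to test the function."""
    t = math.radians(deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


# Synthetic check: a 17 degree rotation should be recovered as about 17 degrees.
angle = rotation_angle_deg(rotation_about_z(17.0))
print(round(angle, 3))
```

The angle alone does not say where the hinge lies; the rotation axis and the residues that stay fixed are needed to describe the movement fully.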


The UDP-sugar-producing pyrophosphorylases have complex patterns of oligomerization: some of them are active as monomers and inactivated by oligomerization, some are active only as oligomers. This most probably reflects the fact that all oligomerizations involve the N- and/or C-terminal domains, and these domains are enzyme specific. AGX and plant UGPase-A are active as monomers, and they are inactivated upon dimer/oligomer formation. Both enzymes were crystallized as monomers and dimers [13,17]. Similar evidence was obtained by separating various oligomerization forms of both proteins by native PAGE [35,36,38,39]. For UAGPase, dimers were proposed to dissociate to monomers under assay conditions [13]. A similar mechanism was demonstrated for barley UGPase-A [39], where the oligomerization status of the protein was additionally affected by subtle changes in hydrophobicity and by protein crowding conditions [36,38,39]. On the other hand, only monomers were observed for USPase [9], and nothing is known as to whether UGPase-B undergoes any (de)oligomerization process.

UAGPase and UGPase-A differ in the nature of the structural determinants of the oligomerization process. For AGX, an extended loop (I-loop) at the C-terminal domain makes extensive contacts with the active site of its dimeric partner [13]. On the other hand, in dimers of plant UGPase-A, the N-terminal domain of each monomer is positioned against the C-terminal β-helix of the other monomer and directly across its active site [17]. This probably restricts the entry of substrate into the active site, and interferes with catalysis. The dimer assembly could also restrict the molecular mobility that seems to be an essential mechanism for UGPase-A activity [17].

The oligomerization status of the active form of UGPase-A differs between eukaryotes. In plants and Leishmania, UGPase-A monomer is the only active form [19,22,36,38,39], whereas UGPase-A octamer is the active form in yeast and humans [29,33,40]. Whereas dimerization of plant UGPase-A involves interactions of both N- and C-terminal domains, in the yeast protein the octamers are held entirely by interactions of the C-terminal domain [34]. In such a complex, substrate binding to the active site (located on the central domain) of each of the components of the octamer is apparently not obstructed. As one would expect, aa residues that participate in the formation of the octameric complex are highly conserved in animal and fungal, but not plant, UGPase-A sequences [34].

A short peptide at the very end of the C-terminal domain, corresponding to the last exon, has been suggested to stabilize the octamer structure of yeast UGPase-A [34]. For barley UGPase-A, however, deletion of this region resulted in a highly active, exclusively monomeric, form of the enzyme [36]. Thus, whereas for yeast UGPase-A the peptide corresponding to the last exon helps to maintain the active form of the enzyme (octamer), an analogous peptide in the plant enzyme appears to hinder the formation of fully active protein (monomer), suggesting a regulatory role. Interestingly, bacterial UGPases, although unrelated to eukaryotic UGPases at the derived aa sequences [41,42], have also been suggested to be regulated by the (de)oligomerization phenomenon [43,44].


With the homology-derived models for plant UAGPase, USPase and UGPase-B (Figure 2), basic function/structure properties of these proteins may now be experimentally verified through biochemical approaches. However, precise information on their function/structure properties can only be obtained when their crystal structures become available. This is especially important for UGPase-B, which has not yet been crystallized from any source. Since USPase was crystallized only from Leishmania and the protein has at most 35% identity with plant USPases [11], the latter will also need to be crystallized to obtain precise information about their structure, especially details of their active sites. Crystal structures may also be required for each of the two plant UAGPases, given the fact that they differ in substrate specificity [14].

Despite the essential role of UDP-sugars for a plethora of glycosylation reactions [4], there are no known specific inhibitors for any of the pyrophosphorylases discussed in the present review. The availability of crystal structures opens up possibilities for the design of inhibitors fitting the active-site architecture of a given target protein [45] or for approaches based on high-throughput screening of chemical libraries [46]. Besides pharmacological applications, as suggested for inhibitors of Leishmania UGPase-A [19] and USPase [9,21], inhibitors could be essential to distinguish, for instance, between UDP-Glc-producing activities of the pyrophosphorylases in crude extracts or partially purified preparations. Inhibitors that can discriminate between different isoenzymes of a given protein would also be valuable [47].


The work of the authors is supported, in part, by The Swedish Research Council.

Abbreviations: aa, amino acid(s); AGX, human UDP-GlcNAc (UDP-GalNAc) pyrophosphorylase; Gal, galactose; GalNAc, N-acetylgalactosamine; Glc, glucose; GlcA, glucuronic acid; GlcNAc, N-acetylglucosamine; NB, nucleotide-binding; QH, Q-homology; SB, sugar-binding; UAGPase, UDP-GlcNAc (UDP-GalNAc) pyrophosphorylase; UGPase, UDP-Glc pyrophosphorylase; USPase, UDP-sugar pyrophosphorylase

