Research article

The structure of SENP1–SUMO-2 complex suggests a structural basis for discrimination between SUMO paralogues during processing

Lin Nan Shen, Changjiang Dong, Huanting Liu, James H. Naismith, Ronald T. Hay


The SUMO (small ubiquitin-like modifier)-specific protease SENP1 (sentrin-specific protease 1) can process the three forms of SUMO to their mature forms and deconjugate SUMO from modified substrates. It has been demonstrated previously that SENP1 processed SUMO-1 more efficiently than SUMO-2, but displayed little difference in its ability to deconjugate the different SUMO paralogues from modified substrates. To determine the basis for this substrate specificity, we have determined the crystal structure of SENP1 in isolation and in a transition-state complex with SUMO-2. The interface between SUMO-2 and SENP1 has a relatively poor complementarity, and most of the recognition is determined by interaction between the conserved C-terminus of SUMO-2 and the cleft in the protease. Although SENP1 is rather similar in structure to the related protease SENP2, these proteases have different SUMO-processing activities. Electrostatic analysis of SENP1 in the region where the C-terminal peptide, removed during maturation, would project indicates that it is the electrostatic complementarity between this region of SENP1 and the C-terminal peptides of the various SUMO paralogues that mediates selectivity.

  • protease
  • sentrin-specific protease 1 (SENP1)
  • small ubiquitin-like modifier (SUMO)
  • ubiquitin-like protein (Ubl)
  • ubiquitin-like protein-specific protease (Ulp)


Ubiquitin and Ubls (ubiquitin-like proteins) are covalently linked to lysine side chains in target proteins and confer altered properties on the modified proteins. Ubiquitin, NEDD8 (neural precursor cell-expressed developmentally down-regulated 8) and SUMO (small ubiquitin-like modifier) all have important roles in vivo and are required for normal cell growth and division in lower and higher eukaryotes. In lower eukaryotes, a single SUMO gene is expressed, whereas, in vertebrates, three paralogues, designated SUMO-1 {also known in humans as SMT3c [suppressor of MIF2 (mitotic fidelity protein 2)], PIC1 [PML (promyelocytic leukaemia protein) interacting clone-1], GMP1 (GTPase-activating protein-modifying protein 1), sentrin 1 and Ubl1}, SUMO-2 (also known as SMT3a and sentrin 3) and SUMO-3 (also known as SMT3b and sentrin 2) are expressed. The conjugated forms of SUMO-2 and SUMO-3 only differ from one another by three N-terminal residues and form a distinct subfamily known as SUMO-2/3 that are 50% identical in sequence with SUMO-1. Proteomic analysis has indicated that there are a large number of SUMO substrates and has demonstrated paralogue-specific modification. Many of the SUMO-modified proteins identified appeared to be involved in transcriptional regulation, chromatin organization and RNA metabolism [1–4]. A fourth SUMO paralogue was reported to be expressed in kidney cells [5], but it was noted previously that the intronless SUMO-4 gene might be a non-expressed pseudogene [6]. Further analysis will be required to establish expression profiles of this gene in different tissues.

SUMO is linked to substrate proteins by an enzymatic cascade involving a SUMO-activating enzyme (E1), a SUMO-conjugating enzyme (E2) and, typically, a SUMO protein ligase (E3). In the first step in this reaction, SUMO-activating enzyme [a heterodimer containing SAE1 (SUMO-activating enzyme subunit 1) and SAE2] catalyses the formation of adenylated SUMO in which the C-terminal carboxy group of SUMO is covalently linked to AMP. Breakage of the SUMO–AMP bond is followed by formation of a covalent intermediate in which the C-terminal carboxy group of SUMO forms a thioester bond with the thiol group of a cysteine residue in SAE2 (Cys173). In the second step of the reaction, SUMO is transesterified from SAE2 to Cys93 in the SUMO-conjugating enzyme Ubc9 (ubiquitin-conjugating enzyme 9). A feature of Ubc9 that distinguishes it from conjugating enzymes of other ubiquitin-like proteins is its ability to directly recognize substrate proteins. Thus the Ubc9–SUMO thioester can catalyse formation of an isopeptide bond between the C-terminal carboxy group of SUMO and the ϵ-amino group of lysine in the substrate protein, provided that the lysine residue is part of a SUMO-conjugation motif. Typically, lysine residues subject to SUMO modification are found within a SUMO modification consensus motif, ψKXE (where ψ is a large hydrophobic residue and X is any residue), although modification at non-consensus sites has been reported. SUMO-2 and −3 each possess exposed SUMO-modification consensus motifs that can be utilized to form polymeric SUMO chains, although their role in vivo has yet to be determined (reviewed in [7]). In the presence of SAE1/SAE2 and Ubc9 only, SUMO is specifically conjugated to substrates containing the ψKXE motif. 
This motif is contacted directly by Ubc9 [8–10], but, with the notable exception of RanGAP1 (Ran GTPase-activating protein 1), SUMO modification with only SAE1/SAE2 and Ubc9 is rather inefficient and SUMO-specific E3 ligases are required for efficient conjugation (reviewed in [11]).
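As an illustration of the ψKXE consensus described above, the following minimal sketch scans the three precursor sequences used in this study for candidate sites. It is not part of the original analysis. The residue set chosen for ψ (I, L, V, M, F) is an assumption, since "large hydrophobic" is not defined residue by residue in the text, and the names `PSI`, `MOTIF` and `consensus_lysines` are hypothetical.

```python
import re

# Precursor sequences as listed in this article.
PRE_SUMO = {
    "SUMO-1": ("MSDQEAKPSTEDLGDKKEGEYIKLKVIGQDSSEIHFKVKMTTHLKKLKESYCQRQGVPMN"
               "SLRFLFEGQRIADNHTPKELGMEEEDVIEVYQEQTGGHSTV"),
    "SUMO-2": ("MSEEKPKEGVKTENDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCERQGLSMRQIRFR"
               "FDGQPINETDTPAQLEMEDEDTIDVFQQQTGGVPESSLAGHSF"),
    "SUMO-3": ("MADEKPKEGVKTENNDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCERQGLSMRQIRF"
               "RFDGQPINETDTPAQLEMEDEDTIDVFQQQTGGVY"),
}

# Assumption: psi ("large hydrophobic residue") is taken to be I, L, V, M or F.
PSI = "ILVMF"

# A lookahead is used so that overlapping candidate sites would all be found.
MOTIF = re.compile(rf"(?=([{PSI}]K.E))")


def consensus_lysines(seq):
    """Return 1-based positions of lysines that sit inside a psi-K-X-E match."""
    # m.start() is the 0-based index of psi; the lysine follows it directly.
    return [m.start() + 2 for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    for name, seq in PRE_SUMO.items():
        print(name, consensus_lysines(seq))
```

Under these assumptions the scan reports Lys11 in pre-SUMO-2 and pre-SUMO-3 and no match in pre-SUMO-1. A sequence match only flags a candidate site; whether it is exposed and actually modified has to be established experimentally.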

Like most other Ubls, SUMO paralogues are synthesized as larger precursors that must be processed to reveal the C-terminal glycine residue that is linked to lysine side chains in target proteins. The C-terminal sequences removed by processing are unrelated between SUMO-1, −2 and −3. This processing is carried out by SUMO-specific proteases that also remove SUMO from modified substrates and deconjugate polySUMO chains. In Saccharomyces cerevisiae, two SUMO-specific proteases, Ulp1 (Ubl-specific protease 1) and Ulp2, have been characterized and are detected at the nuclear pore and nucleoplasm respectively [11–14]. Structural analysis and sequence comparisons of the C-terminal protease domains of Ulp1 and Ulp2 indicate that they are cysteine proteases belonging to the family typified by the adenovirus protease [12,15]. S. cerevisiae strains in which the Ulp1 gene was deleted are not viable, whereas yeast deleted for Ulp2 are viable but grow abnormally and are hypersensitive to DNA damage. Defects in these strains appear to be a consequence of the lack of isopeptidase activity rather than of the loss of C-terminal hydrolase activity. Yeast strains deleted for Ulp1 or Ulp2 have distinct patterns of SUMO-modified proteins, suggesting that the substrates for Ulp1 and Ulp2 are different. In Schizosaccharomyces pombe, the homologue of Ulp1 is not necessary for viability, but cells lacking Ulp1 are defective for many nuclear processes [16].

Database searching initially identified eight genes for human proteins with significant sequence identity with yeast Ulp1 that were believed to be SUMO-specific proteases [17]. Although the products of these genes may function as proteases for ubiquitin-like proteins, they are not all specific for SUMO, as SENP8 (sentrin-specific protease 8) has recently been revealed as the NEDD8-specific protease, NEDP1/DEN1 (deneddylase1) [18–20]. Of the remaining seven genes, SENP1, SENP2 [also designated Axam, SuPr-1, SSP3 (sentrin-specific protease 3), SMT3IP2 (SMT3-specific isopeptidase 2)] and SENP3 (SMT3IP1) have been shown to function as SUMO-specific proteases (reviewed in [11]). Each of these proteins appears to have a distinct subcellular localization that is dictated by their non-conserved N-terminal regions. SENP1 is nuclear and SENP3 is nucleolar, but differential splicing generates SENP2 proteins that can be cytoplasmic, nuclear-pore-localized or nuclear-body-localized. Recent analysis of a mouse strain in which SENP1 expression was dramatically decreased owing to a retroviral insertion indicated that this protease was required for normal mouse development [21].

SENP1 is capable of both processing pre-SUMO-1, pre-SUMO-2 and pre-SUMO-3 [22] and deconjugating SUMO-1, SUMO-2 and SUMO-3 from modified proteins [23]. To determine the structural basis for SUMO recognition and cleavage specificity, we have determined the structure of SENP1 to 2.45 Å (1 Å=0.1 nm). Recent work on the structurally related NEDP1 indicated that NEDD8 binding induced a substantial conformational change in the protease [24]. To determine whether this was also the case for SENP1, we used NaBH4 to trap a stable thiohemiacetal transition-state analogue [15,25] between Cys602 of SENP1 and Gly92 of SUMO-2 and determined the structure of this complex to 3.2 Å. Biochemical and structural analysis revealed the basis for SENP1 discrimination between SUMO-1, SUMO-2 and SUMO-3 during processing of the pre-proteins to the mature forms of SUMO.


Protein preparation

All constructs were generated by a standard PCR-based cloning method. The catalytic core domain of SENP1 (amino acids 415–643) and mutants were cloned into the vector pEHISTEV and expressed as N-terminally His-tagged proteins. The recombinant proteins were expressed in Escherichia coli BL21(DE3) cells and purified using Ni-NTA (Ni2+-nitrilotriacetate)–agarose resin (Qiagen). The His-tag of purified protein was removed by TEV (tobacco etch virus) protease in 50 mM Tris/HCl, pH 8.0, 50 mM NaCl and 5 mM 2-mercaptoethanol. After TEV protease cleavage, SENP1 was purified further by Ni-NTA-affinity chromatography and gel filtration (Superdex 200 column; Amersham Biosciences). N-terminally His-tagged full-length SUMO-1 (101 amino acids), SUMO-2 (103 amino acids) and SUMO-3 (104 amino acids with nine extra amino acids ESSLAGHSF from the SUMO-2 C-terminus) were also expressed from the pEHISTEV vector in E. coli BL21(DE3) cells and purified by Ni-NTA-affinity chromatography and gel filtration (Superdex 75; Amersham Biosciences). All constructs were verified by automated DNA sequence analysis and were shown to be identical with those reported previously for SENP1, SUMO-1, SUMO-2 and SUMO-3 (GenBank® accession numbers Q9P0U3, AAH53528, AAH68465 and NP_008867 respectively). All of the mutants of SENP1 were generated using PCR-based mutagenesis and verified by DNA sequence analysis (DNA Sequencing Unit, Dundee University, Dundee, U.K.).

Generation of the SENP1–SUMO-2 complex

The covalent adduct of SENP1 with SUMO-2 was prepared using a His-SENP1/SUMO-2 molar ratio of 1:5 in a buffer containing 50 mM Tris/HCl, pH 8.0, 50 mM NaCl and 5 mM 2-mercaptoethanol. Ten aliquots of NaBH4 (40 mg) were added to the reaction mixture over 30 min to a final concentration of 30 mM. After the reaction, the His–SENP1–SUMO-2 complex was purified using Ni-NTA-affinity chromatography and gel filtration (Superdex 200 column). The purified complex was concentrated to ∼15 mg/ml in a buffer containing 20 mM Tris/HCl, pH 8.0, and 50 mM NaCl and was used for crystallization trials.

Crystallization, data collection and structure determination of SENP1

SENP1 crystallization was performed at 20 °C using a sitting-drop vapour-diffusion method. Single diamond-shaped crystals were grown after 2 days from equal volumes of protein solution (20 mg/ml in 20 mM Tris/HCl, pH 8.0, and 50 mM NaCl) and reservoir solution containing 100 mM CoCl2, 0.1 M Mes, pH 6.5, and 1.8 M (NH4)2SO4. Before being subjected to X-ray diffraction, crystals were protected in a cryoprotectant buffer containing reservoir buffer plus 15% (v/v) glycerol. Diffraction data were collected at ID14-4 of the ESRF (European Synchrotron Radiation Facility). The data were indexed and integrated with MOSFLM/SCALA [26]. SENP1 crystals belong to the space group P3121 with cell dimensions of a=b=72.0 Å, c=200.6 Å, α=β=90° and γ=120° (Table 1). The structure of SENP1 was determined by molecular replacement with PHASER [27] using human SENP2 (PDB code 1TH0) as a search model. The model was built using the program O [28], and the structural refinement was carried out using REFMAC [29].

Table 1 Crystallographic data

Crystallization and structure determination of the SENP1–SUMO-2 complex

Crystals were grown using the sitting-drop method by mixing the SENP1–SUMO-2 complex (15 mg/ml) with an equal volume of reservoir solution containing 25% propane-1,2-diol, 0.1 M phosphate/citrate, pH 4.2, 5% (w/v) PEG [poly(ethylene glycol)] 3000 and 10% PEG 8000. Diffraction data were collected at ID14-1 of the ESRF. The data were reduced with MOSFLM/SCALA [26]. The SENP1–SUMO-2 crystals belong to the space group P3221 with cell dimensions of a=b=143.4 Å, c=71.9 Å, α=β=90° and γ=120° (Table 1). The structure of the SENP1–SUMO-2 complex was determined with PHASER [27] using the structure of the human SENP2–SUMO-1 complex (PDB code 1TGZ) as a search model. The model was built using the program O [28] and the structural refinement was carried out using REFMAC [29].
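For readers who wish to relate the reported cell dimensions to unit-cell volumes, a minimal sketch using the general triclinic volume formula is given below. It is an illustration only and was not part of the structure determination. The function name `unit_cell_volume` is hypothetical, and the input values are the cell dimensions quoted in the text for the two crystal forms.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms from edges (A) and angles (degrees).

    Uses the general formula
    V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2 cos(alpha) cos(beta) cos(gamma)),
    which reduces to V = a*b*c*sin(gamma) when alpha = beta = 90 degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell dimensions quoted in the text.
v_senp1 = unit_cell_volume(72.0, 72.0, 200.6, 90.0, 90.0, 120.0)
v_complex = unit_cell_volume(143.4, 143.4, 71.9, 90.0, 90.0, 120.0)

if __name__ == "__main__":
    print(f"SENP1 crystal:        {v_senp1:.0f} cubic angstroms")
    print(f"SENP1-SUMO-2 crystal: {v_complex:.0f} cubic angstroms")
```

The volumes come out at roughly 9.0 × 10^5 Å^3 for the SENP1 crystal and 1.28 × 10^6 Å^3 for the complex. Estimating solvent content from them would further require the number of molecules per cell, which is not derived here.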

In vitro processing and desumoylation assays

To assay the processing activity, the native SUMO-1, SUMO-2 and SUMO-3 (with nine extra amino acids ESSLAGHSF from the SUMO-2 C-terminus) precursors (all 20 μM) were incubated with purified SENP1 (10 nM) at 25 °C in a buffer containing 50 mM Tris/HCl, pH 8.0, 150 mM NaCl and 5 mM 2-mercaptoethanol. Samples were removed at various times during incubation and analysed by SDS/PAGE (10% gels) followed by Coomassie Brilliant Blue R250 staining. Protein quantification was performed using a CCD (charge-coupled device) camera system (LAS1000 Plus system; Fujifilm) after separation by SDS/PAGE (10% gels).

To survey the effects of various SENP1 mutants, an equal amount (300 nM) of the wild-type or mutant SENP1 protein was incubated with the SUMO-2 precursor (20 μM) at 37 °C in reaction buffer containing 50 mM Tris/HCl, pH 8.0, 150 mM NaCl and 5 mM 2-mercaptoethanol. The reaction was stopped by the addition of sample buffer containing SDS, and the products were analysed by SDS/PAGE (10% gels).

To assay the desumoylation activity of SENP1 mutants, an equal amount of the wild-type or mutant SENP1 proteins (1 nM) was incubated with SUMO-2-conjugated GST (glutathione S-transferase)–PML at 37 °C in reaction buffer containing 50 mM Tris/HCl, pH 8.0, 150 mM NaCl and 5 mM 2-mercaptoethanol. The reaction was terminated by addition of sample buffer containing SDS, and the products were analysed by SDS/PAGE (10% gels).

In vivo analysis of desumoylation activity of SENP1 mutants

H1299 cells were co-transfected with HA (haemagglutinin)–SUMO-2 and either wild-type or mutant SENP1-SV5 (simian virus 5). At 36 h after transfection, cell extracts were subjected to SDS/PAGE (10% gels) and Western blotting and then were probed with mouse monoclonal antibody 12CA5 (1:2000 dilution; obtained from BabCO), which recognizes influenza HA to reflect total SUMO-2 or anti-SV5 monoclonal antibody (1:2000 dilution; a gift from Professor Rick Randall, School of Biology, University of St. Andrews) to reveal total levels of SENP1.


It should be noted that there is some confusion in the literature over the designation of SUMO-1, SUMO-2 and SUMO-3. We originally [30] used a previous designation [31] that was based on analysis of the same genes in mice [32]. As the first functional comparison of the SUMO paralogues was provided in [31], we have continued to use this nomenclature, although, in recent publications [22,33], what is described as SUMO-2 is equivalent to SUMO-3 in the present paper. For clarity, the human SUMO proteins used in this study are pre-SUMO-1, MSDQEAKPSTEDLGDKKEGEYIKLKVIGQDSSEIHFKVKMTTHLKKLKESYCQRQGVPMNSLRFLFEGQRIADNHTPKELGMEEEDVIEVYQEQTGGHSTV, pre-SUMO-2, MSEEKPKEGVKTENDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCERQGLSMRQIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGGVPESSLAGHSF, and pre-SUMO-3, MADEKPKEGVKTENNDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCERQGLSMRQIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGGVY.
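To make the relationship between these precursors and their mature forms explicit, the following minimal sketch splits each sequence after its C-terminal Gly-Gly dipeptide, which is where processing exposes the glycine used for conjugation. It is an illustration only. The helper name `split_precursor` is hypothetical, and the rule of cutting after the last "GG" in the sequence is a simplifying assumption that happens to hold for these three sequences.

```python
# Precursor sequences as listed in this article.
PRE_SUMO = {
    "SUMO-1": ("MSDQEAKPSTEDLGDKKEGEYIKLKVIGQDSSEIHFKVKMTTHLKKLKESYCQRQGVPMN"
               "SLRFLFEGQRIADNHTPKELGMEEEDVIEVYQEQTGGHSTV"),
    "SUMO-2": ("MSEEKPKEGVKTENDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCERQGLSMRQIRFR"
               "FDGQPINETDTPAQLEMEDEDTIDVFQQQTGGVPESSLAGHSF"),
    "SUMO-3": ("MADEKPKEGVKTENNDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCERQGLSMRQIRF"
               "RFDGQPINETDTPAQLEMEDEDTIDVFQQQTGGVY"),
}


def split_precursor(seq):
    """Return (mature form, removed C-terminal tail), cutting after the last GG."""
    idx = seq.rfind("GG")
    if idx < 0:
        raise ValueError("no Gly-Gly dipeptide found")
    cut = idx + 2  # keep both glycines in the mature form
    return seq[:cut], seq[cut:]


if __name__ == "__main__":
    for name, seq in PRE_SUMO.items():
        mature, tail = split_precursor(seq)
        print(f"{name}: precursor {len(seq)} aa, mature {len(mature)} aa, tail {tail}")
```

With the sequences above, the removed tails are HSTV (pre-SUMO-1), VPESSLAGHSF (pre-SUMO-2) and VY (pre-SUMO-3), and the final glycine of mature SUMO-2 is residue 92.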


SENP1 structure

Human SENP1 is a 643-amino-acid protein containing a C-terminal region similar to the catalytic domain of yeast Ulp1 and human SENP2 (Figure 1A). On the basis of such sequence alignments, we generated a construct that would express a region of SENP1 (amino acids 415–643) predicted to contain the catalytic domain. Residues 415–643 of SENP1 were expressed and purified from E. coli and were shown to be catalytically active in SUMO processing and deconjugation. The purified protein was crystallized, and X-ray diffraction data were collected. The SENP1 structure was solved by molecular replacement using the SENP2 catalytic domain [33] as a search model. As expected from sequence homology in the SENP protease superfamily [33], SENP1 adopts a fold that identifies it as a member of the cysteine protease superfamily and contains a characteristic catalytic triad of cysteine (Cys602), histidine (His533) and aspartate (Asp550). The fold of the protein has been described in detail elsewhere [15,24,33,34]. Briefly, SENP1 contains a five-stranded mixed β-sheet in which the middle strand β-5 (Figures 1 and 2) is antiparallel to the other four. This sheet sits against two helices. A large helix, identified previously as the central helix, sits on the opposite side of a central cleft in the protein. The key nucleophile, Cys602, is located at the N-terminus of the central helix, and His533 and Asp550 are both located on the β-sheet. The N- and C-termini of the domain are close together, remote from the active site. The rmsd (root mean square deviation) between the two monomers in the asymmetric unit is 0.52 Å; this decreases to less than 0.4 Å if crystal contact residues are excluded. Superposition of SENP1 upon SENP2 gives an rmsd of 0.8 Å for 221 out of a possible 224 Cα atoms. For the 221 superimposed residues, there is 58% sequence identity.
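The rmsd values quoted above summarize how far matched Cα atoms lie from one another after two structures have been superimposed. A minimal sketch of that final calculation is shown below. It assumes that the two coordinate lists are already paired atom by atom and already superimposed, and it does not perform the superposition itself. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom is displaced by exactly 1 angstrom along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(a, b))
```

Because the value depends on which atoms are paired, an rmsd is only meaningful together with the number of atoms matched, which is why the text reports both.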

Figure 1 Structure-based sequence alignment of SENP and SUMO family members

(A) Sequence alignment of SENP1 with other members of the SUMO protease family. (B) Sequence alignment of the unprocessed forms of SUMO-1, SUMO-2 and SUMO-3. Conserved residues are shaded in black. Gaps are denoted by a single dot. Secondary-structure elements are indicated, with α-helices, β-strands and coil depicted as rectangles, arrows and lines respectively.

Figure 2 Structure of SENP1 alone and in complex with SUMO-2

(A) Stereo view of SENP1 alone. The active-site residues and the N- and C-termini are indicated. (B) Stereo view of the SENP1–SUMO-2 complex. SENP1 is in blue and SUMO-2 is in cyan. The side chains of SENP1 residues altered by mutagenesis are indicated.

SENP1–SUMO-2 complex

To determine the structure of a complex between SENP1 and SUMO-2, NaBH4 was used to selectively reduce the deacylation intermediate formed during proteolytic cleavage, yielding a chemically stable transition-state analogue [25]. Thus a complex containing a covalent thiohemiacetal linkage between the active-site cysteine (Cys602) residue of SENP1 and the C-terminal glycine residue of SUMO-2 was prepared and crystallized. The structure of the complex (Figure 2B) was determined by molecular replacement using SUMO-1 from the SENP2–SUMO-1 complex [33] and SENP1 (above). Given the low resolution of the complex, only an overall single B-factor was refined for the complex. Full statistics are given in Table 1. The low resolution precludes any detailed analysis of the protein–protein interface, since at this resolution there is some ambiguity and uncertainty in the experimental location of side chains. The structure is reliable in positioning the main chains of the two molecules and therefore the residues at the interface can be identified. There is unambiguous density showing that Trp465 of the protease alters its conformation, folding down upon the C-terminus of SUMO-2, but, apart from that, there is no indication of any conformational change in SENP1 occurring on binding SUMO-2. The C-terminus of SUMO-2 forms an elongated strand that binds in the large cleft of SENP1. There are three contact regions for SUMO-2: a small area at the N-terminus (Ala22–Gln24), a larger patch between Arg55 and Arg74, and the C-terminus (Asp81–Gly92). In SENP1, there are six regions of contact: Asp441–Asp456, Trp465–Trp472, Asn494–Lys500, Arg511–Lys514, His529–Trp534 and Ser600–Cys602. Superposition of the entire SENP2–SUMO-1 complex on to SENP1–SUMO-2 gives 295 matching Cα atoms (out of a possible 302) with an rmsd of 1.1 Å. The individual components superimpose with an rmsd of 0.9 Å for 221 Cα atoms of SENP1 and SENP2, and 0.7 Å for 77 common Cα atoms of SUMO-1 and SUMO-2. 
This suggests that there are some significant differences in how SENP2 recognizes SUMO-1 and how SENP1 recognizes SUMO-2. The C-terminus of SUMO-2 binds to SENP1 in almost exactly the same way as SUMO-1 binds SENP2. It is the core of SUMO-2 that is slightly displaced compared with SUMO-1 with respect to the protease structure. These differences appear to be reflected in the protein–protein interfaces. In comparison with the SENP2–SUMO-1 complex, the SENP1–SUMO-2 complex buries approx. 25% more surface. Although the C-terminal portion of SUMO-2 is conserved in SUMO-1 and SUMO-3 (Figure 1B), the other two regions of SUMO-2 that contact SENP1 are not, having only six of 22 residues absolutely conserved. A similar observation was made studying the SENP2–SUMO-1 complex: aside from the conserved C-terminus, only four of 11 residues of SUMO-1 in contact with the protease are absolutely conserved. Gaps and cavities are found at the protein–protein interface in both complexes (Figure 2), suggesting poor complementarity of fit. In electrostatic terms, there is better agreement: both proteases have negative and positive surface patches around the central cleft, which bind to oppositely charged interfaces on both SUMO-1 and SUMO-2 (Figure 3).

Figure 3 Comparison of the electrostatic potentials of the complementary surfaces of SENP1–SUMO-2 and SENP2–SUMO-1

(A) On the left is the electrostatic analysis of SENP1 in complex with SUMO-2. SUMO-2 is shown as a cyan ribbon. On the right the SUMO-2 structure has been rotated 180° to show the electrostatic potential of the surface bound to SENP1. (B) On the left is the electrostatic analysis of SENP2 in complex with SUMO-1 [33]. SUMO-1 is shown as a green ribbon. On the right, the SUMO-1 structure has been rotated 180° to show the electrostatic potential of the surface bound to SENP2.

Mutational analysis of SENP1

To validate our structural analysis, we have employed site-directed mutagenesis to alter residues in SENP1 predicted to participate in substrate recognition and catalysis (Figure 2). Mutated versions of the SENP1 protease domain protein were obtained in the same way as for native protein. To measure processing activity, full-length SUMO-2 was incubated with the altered SENP1 proteins. Cleavage to mature SUMO-2 was determined by analysis of the reaction products using SDS/PAGE (10% gels) with Coomassie Blue staining (Figure 4A). To assay for isopeptidase activity, GST–PML was linked to a polymeric chain of SUMO-2 [30] and was used as substrate. Active protein releases free mature SUMO-2 and GST–PML that are identified by SDS/PAGE (10% gels) and Coomassie Blue staining (Figure 4B).

Figure 4 SUMO-2 processing and deconjugation activities of SENP1 mutant proteins

(A) SUMO-2-processing activity of SENP1 mutants (300 nM) was determined with pre-SUMO-2 (20 μM) as substrate. Reactions took place at 37 °C for 30 min, and the products were fractionated by SDS/PAGE (10% gels) and stained with Coomassie Blue. The locations of pre-SUMO-2 and SUMO-2 are indicated. (B) SUMO-2-deconjugation activity of SENP1. The substrate for deconjugation was GST–PML bearing a polymeric chain of SUMO-2 [30]. Substrate (20 μM SUMO-2) was incubated with wild-type (WT) or mutant SENP1 (1 nM) at 37 °C for 30 min, and the reaction products were analysed as described in (A). M, molecular-mass markers (sizes are given in kDa). Lane −, control (no protease).

As expected, mutants with changes in the absolutely conserved catalytic triad (C602A, H533A and D550A) were completely inactive in processing of SUMO-2 and in deconjugating SUMO-2. As such, these mutants serve as a useful baseline for assessing the activity of other mutants. We mutated Trp465, Phe496, Trp534 and His529, which, according to the structure of the complex, could interact with the C-terminus of SUMO-2. In each case, the processing of the pre-SUMO-2 to its mature form was seriously impaired or eliminated. This confirms our structural model that these residues are important in binding SUMO-2. In this model, the absolutely conserved Trp465 and Trp534 form a clamp which locks the C-terminus of SUMO-2 in place. The dimensions of this constriction, which is evident in all SUMO-like protease complexes [15], explain the requirement for the Gly-Gly dipeptide before the cleavage site, as any side chain projecting from the polypeptide backbone would clash with the walls of the tunnel through which the C-terminus of SUMO has to pass. In the case of F496A and H529A mutants, the altered proteins retain significant levels of deconjugation activity, although both residues make seemingly important van der Waals contacts with SUMO-2. Phe496 stacks against Phe86 of SUMO-2, and His529 interacts with Gln89 and possibly with Thr90. Both Phe496 and His529 are absolutely conserved in SUMO proteases, but not in NEDD8 protease (Phe496 found as tyrosine and His529 found as asparagine). Phe86 of SUMO-2 and Gln89 are conservatively replaced by tyrosine and by glutamate in other SUMO proteins, indicating that these interactions would be preserved for all combinations of protease and SUMO. The differences in behaviour exhibited by the Phe496 and His529 mutants in our two assays most likely reflect the differences in kcat/Km ratios for deconjugation and processing. 
In support of this, we note that SUMO-2 processing by native SENP1 is rather inefficient and requires 300 nM SENP1 for complete processing of SUMO-2 (20 μM), whereas SUMO-2 deconjugation is efficient and 20 μM SUMO-2 is fully deconjugated by 1 nM native SENP1 (Figure 4). Our structure suggested that the conserved residue Val532 could play a role in SUMO-2 recognition; however, the V532A mutation had no effect on either deconjugation or processing.
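The difference in enzyme requirement noted above can be expressed as substrate-to-enzyme molar ratios. The short sketch below does only that arithmetic with the concentrations quoted in the text. It is not a kinetic analysis and does not estimate kcat/Km, and the function name `substrate_per_enzyme` is hypothetical.

```python
def substrate_per_enzyme(substrate_nm, enzyme_nm):
    """Molar excess of substrate over enzyme, both given in nM."""
    return substrate_nm / enzyme_nm


# Concentrations quoted in the text: 20 uM SUMO-2 substrate in both assays.
processing = substrate_per_enzyme(20_000, 300)   # 300 nM SENP1 for processing
deconjugation = substrate_per_enzyme(20_000, 1)  # 1 nM SENP1 for deconjugation

# Ratio of the two molar excesses: how much less enzyme deconjugation needed.
fold_difference = deconjugation / processing

if __name__ == "__main__":
    print(f"processing:    {processing:.1f} substrate molecules per enzyme")
    print(f"deconjugation: {deconjugation:.0f} substrate molecules per enzyme")
    print(f"fold difference in enzyme required: {fold_difference:.0f}")
```

The roughly 300-fold difference in enzyme required is only a crude indicator, since both reactions were scored at a single 30 min end point rather than from initial rates.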

To test the importance of predicted interactions between SUMO-2 and SENP1 that were distinct from the C-terminal interactions described above, D441A, D468A, R511A and W512A mutants were tested for activity in processing and deconjugation assays. Arg511 and Trp512 are found together on one side of the central protease cleft, while Asp441 and Asp468 are found on the opposite face of the central protease cleft. All four residues make contacts with SUMO-2 and thus could contribute to substrate recognition. Trp512 interacts with the main chain of Asp62 and Gly63 of SUMO-2, and removal of the tryptophan side chain severely impairs processing activity and reduces the deconjugation activity of the mutated protein. Arg511 interacts with the side chain of Asp62, and the R511A mutant has reduced processing activity, but is unaffected in deconjugation. The side chain of Asp468 is in close proximity to SUMO-2 Arg55 and Gln87, yet the D468A mutated protein is only slightly impaired in processing and not at all in deconjugation. Although Asp441 appears to approach the side chain of SUMO-2 Arg55, the activity of the D441A mutant protein is indistinguishable from that of the wild-type protein in both processing and deconjugation. Thus, while Trp512, Arg511 and Asp468 are all conserved in SUMO proteases, only Trp512 appears to play an important role in SUMO-2 recognition, with Arg511 and Asp468 playing a less critical role. The non-conserved Asp441 does not appear to contribute to SUMO-2 recognition.

The Q596A mutant of SENP1 is severely impaired in both deconjugation and processing. This conserved residue is located remotely from the interface with SUMO-2; however, it appears to play a role in hydrogen-bonding to the protein main chain of SENP1 close to the catalytic site. Our data suggest that this residue plays an important structural role in forming the correct active site for peptide-bond cleavage.

The assays employed above determine the activity of wild-type and mutant versions of the protease domain of SENP1 in vitro. However, it is important to determine the effect of these mutations within the context of the full-length protein in vivo. Therefore the same mutations tested in vitro were introduced into a cDNA encoding full-length SENP1 in a eukaryotic expression vector. Expression constructs for wild-type or mutant forms of SENP1 were co-transfected with an expression construct for HA-tagged SUMO-2 into H1299 cells. In the absence of co-transfected SENP1, HA–SUMO-2 was found as high-molecular-mass conjugates. However, co-transfection of wild-type SENP1 results in deconjugation of SUMO-2 from modified substrates and a dramatic decrease in the quantity of high-molecular-mass SUMO-modified species (Figure 5A). Consistent with the in vitro data, mutants that had little activity in the in vitro deconjugation assay are also defective for SUMO-2 deconjugation in vivo (Figure 5A). The wild-type and mutant SENP1 proteins did not display any significant variation in levels of expression that would explain the observed differences in in vivo SUMO-deconjugation activity (Figure 5B). These experiments in vivo confirm the findings for the isolated protease domain in vitro. It was notable, however, that, while the W512A mutant had a reduced deconjugation activity in vitro, its deconjugation activity in vivo was only impaired to a small extent. This might be explained by different enzyme/substrate ratios in vitro and in vivo.

Figure 5 Ability of SENP1 to deconjugate SUMO-2 in vivo

H1299 cells were co-transfected with expression plasmids for HA-tagged SUMO-2 and SENP1 mutants as indicated. At 36 h after transfection, cells were lysed in SDS and analysed by Western blotting with antibodies against the HA tag (A) and SENP1 (B). Molecular-mass sizes are given in kDa.

Ability of SENP1 to discriminate between SUMO paralogues in processing and deconjugation

It has been reported that SENP1, SENP2 and Ulp1 have distinct processing activities on SUMO-1, SUMO-2 and SUMO-3 [22,33]. To analyse SENP1 selectivity, the rates at which SENP1 processed SUMO-1, SUMO-2 and SUMO-3 were compared. Pre-SUMO-1 was processed rapidly, while pre-SUMO-2 was processed relatively slowly. Pre-SUMO-3 was processed by SENP1 at an intermediate rate (Figure 6A). Equivalent amounts of SUMO-1, SUMO-2 and SUMO-3 were conjugated to RanGAP1 and incubated with SENP1, and the rate of deconjugation was determined. Rates at which SUMO-1, SUMO-2 and SUMO-3 were deconjugated from RanGAP1 were indistinguishable (Figure 6B). We investigated whether the protein to which SUMO-1 was conjugated influenced the rate of deconjugation by comparing RanGAP1 (residues 418–587) and SP100 (speckled protein of 100 kDa) (residues 181–360) SUMO-1 conjugates. While SUMO-1-modified RanGAP1 was efficiently deconjugated by SENP1, SUMO-1-modified SP100 was less efficiently deconjugated, but at a rate that was comparable with the rate at which SENP1 processed pre-SUMO-1 to the mature form (Figure 6C). Although this suggests a degree of specificity in SENP1 deconjugation, this is unlikely to be absolute, as exogenously expressed SENP1 can deconjugate tagged SUMO-2 from most substrates in vivo (Figure 5).

Figure 6 SUMO-processing and -deconjugation specificity of SENP1

(A) Rate of processing of pre-SUMO-1, pre-SUMO-2 and pre-SUMO-3 (all 20 μM) by SENP1 (10 nM). Reaction products were analysed as described in Figure 4 and were quantified using a Fujifilm LAS1000 Plus system. Experiments were carried out in triplicate and the results are means±S.D. (B) Rate of deconjugation of SUMO-1, SUMO-2 and SUMO-3 from RanGAP1 (all 10 μM) by SENP1 (1 nM). Reaction products were analysed and quantified as in (A). (C) Comparison of rates of processing and deconjugation of different substrates. Substrates pre-SUMO-1, SUMO-1-conjugated RanGAP1, SUMO-1-conjugated SP100 (all 10 μM) were incubated with SENP1 (1 nM), and reaction products were analysed and quantified as in (A).


Lack of discrimination of SUMO-1, SUMO-2 and SUMO-3 during deconjugation

Both SENP2 [33] and SENP1 (Figure 6B) deconjugate SUMO-1, SUMO-2 and SUMO-3 from their isopeptide bond to RanGAP1 with equal efficiency. Yet our complex shows that, aside from the C-terminus of SUMO-1 and SUMO-2, there is very little similarity in the protein–protein interfaces between the various complexes. This complete lack of discrimination is an unusual observation, as protein–protein interfaces are usually exquisitely specific and small changes in sequence often perturb complex formation. We have already noted that the protein–protein interface appears to have a poor complementarity. We conclude that almost the entire burden of recognition during deconjugation falls upon the interactions between the conserved C-terminus of the SUMOs and the cleft in the proteases. Indeed, we have shown that it is a single amino acid substitution in this region of NEDD8 that underlies the discrimination between NEDD8 and ubiquitin [24]. The other interactions between SUMO-2 and SENP1 (and SUMO-1 and SENP2) are often not even conserved and do not appear to be crucial to recognition. Mutations in SENP1 that disrupt interactions with the C-terminus of SUMO-2 have a deleterious effect on SENP1 activity. In contrast, mutations elsewhere in SENP1 that disrupt side chain–side chain interactions have little, if any, effect on SENP1 activity (Figure 4). Trp512, although remote from the C-terminus, interacts with the main chain of SUMO-2 and its mutation does decrease SENP1 activity. Thus it appears that, so long as the interactions between the C-terminus of SUMO and the protease are conserved, there is considerable latitude in the interaction between the bulk of SUMO-1/2/3 and its proteases.

Discrimination of SUMO-1, SUMO-2 and SUMO-3 during processing

Unlike deconjugation, the ability of SENP1 to process pre-SUMO-1, pre-SUMO-2 and pre-SUMO-3 varies (Figure 6 and [22]), with SENP1 preferentially processing SUMO-1 over SUMO-3 and SUMO-2 being only slowly processed. It has been shown previously [33] that SENP2 processes SUMO-2 in preference to SUMO-1, with SUMO-3 being the most slowly processed. Note that SUMO-2 as described in [33] is equivalent to SUMO-3 in the present paper and vice versa (as explained in the Materials and methods section). By swapping the C-terminal peptides, the origin of the specificity for SENP2 was established as residing in two residues immediately after the cleavage site of pre-SUMO-1, pre-SUMO-2 and pre-SUMO-3. It has subsequently been shown [22] that it is His98 of SUMO-1 (immediately after the Gly-Gly dipeptide) which confers rapid processing by SENP1 in vitro. A mutant of pre-SUMO-3 with a V94H mutation is processed by SENP1 as rapidly as SUMO-1 and much more rapidly than wild-type pre-SUMO-3. The same study also showed that Pro94, the second residue after the Gly-Gly dipeptide in pre-SUMO-2, is responsible for the slow processing of pre-SUMO-2 by SENP1 and SENP2. Mutation of Pro94 by site-directed mutagenesis leads to much more rapid processing of pre-SUMO-2 by both SENP1 and SENP2. Equally, its insertion into pre-SUMO-3 greatly decreases its processing rate. That the secondary amino acid proline inhibits protein processing is unsurprising. Proline introduces kinks into protein structure and provides a clear structural hypothesis for the apparent slowness of SUMO-2 processing. Why a single histidine residue should promote processing by SENP1 and not SENP2 is unclear. There is no obvious answer from sequence comparison of the two proteases, and no hypothesis from other structural studies has emerged. As SUMO-1 and SUMO-2/3 appear to respond differently to cell stress [31], it is likely that the different SUMO paralogues have different functions in vivo.
Thus the distinct processing activities of SENP1 and SENP2 may be responsible for generating distinct pools of processed SUMO paralogues in particular cell types that can be used for conjugation. Although SENP1 processes SUMO-1 more efficiently than SUMO-2, we chose to determine the structure of the SENP1–SUMO-2 complex, as a trapped SUMO-2 complex had not been described previously. It should be noted, however, that, while SENP1 processes SUMO-1 and SUMO-2 at different rates, there is no difference in the rate at which SUMO-1 and SUMO-2 are deconjugated, and the complex is equivalent whether it was trapped from a deconjugation or a processing reaction.

Before processing, the residues immediately after the Gly-Gly dipeptide of pre-SUMO-2 would project from the other side of the hydrophobic tunnel formed by Trp465 and Trp534. It was pointed out that, in SENP2, the region which would interact with the C-terminal peptide is largely hydrophobic [33]. In contrast (Figure 7), electrostatic analysis of SENP1 reveals that this region is quite strongly acidic. Pre-SUMO-1, with its His-Ser dipeptide, has a polar and positively charged C-terminal peptide, complementing the acidic patch of SENP1. In contrast, the relatively hydrophobic and electrostatically neutral C-terminal peptide of SUMO-3 (with Val-Tyr after the Gly-Gly dipeptide) complements the similarly hydrophobic patch on SENP2. The structural data obtained from our study suggest that it is the complementarity between the electrostatic properties of the C-terminal peptides of the SUMO paralogues and those of SENP1 and SENP2 which underlies selectivity. Although electrostatic complementarity is far from a novel concept in biology, its presence here was unsuspected. This is because the difference between SENP1 and SENP2 is not due to a single obvious amino acid change and is therefore very difficult to probe by site-directed mutagenesis of SENP1. Rather, the difference arises from the opening up of a pocket in which Asp550 is now exposed. Support for our proposal comes from re-examination of the Ulp1 protease [15]. As with SENP1 and SENP2, it deconjugates SUMO-1, SUMO-2 and SUMO-3 with equivalent efficiency, indicating that, as before, the burden of recognition falls upon the C-terminus of SUMO. Interestingly, Ulp1 does not process SUMO-2 or SUMO-3, but readily processes SUMO-1. Ulp1, like SENP1, has a strongly acidic patch which would be expected to interact with the C-terminal His-Ser dipeptide of SUMO-1.
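
The contrast drawn above between the His-Ser and Val-Tyr dipeptides can be illustrated with a rough calculation. The following is a minimal sketch and not part of the analysis reported here: it assumes the Kyte-Doolittle hydropathy scale as a crude proxy for polarity and treats histidine as a residue that can carry a positive charge (which depends on its protonation state and local environment). The function names are hypothetical, and the real electrostatic analysis (Figure 7) was performed on the three-dimensional structures, not on sequence alone.

```python
# Illustrative only: a sequence-level proxy for the polarity of the
# dipeptides that follow the Gly-Gly motif. This is NOT the structure-based
# electrostatic analysis shown in Figure 7.

# Kyte-Doolittle hydropathy values for the four residues used below
# (positive = more hydrophobic, negative = more polar).
KYTE_DOOLITTLE = {"H": -3.2, "S": -0.8, "V": 4.2, "Y": -1.3}

# Dipeptides immediately after Gly-Gly, as stated in the text.
TAILS = {"SUMO-1": "HS", "SUMO-3": "VY"}

# Residues that can carry a positive side-chain charge. Including histidine
# is an assumption: its charge depends on pH and local environment.
POTENTIALLY_POSITIVE = set("HKR")


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy over the residues of a peptide."""
    return sum(KYTE_DOOLITTLE[res] for res in peptide) / len(peptide)


def can_carry_positive_charge(peptide):
    """True if any residue has a side chain that can be positively charged."""
    return any(res in POTENTIALLY_POSITIVE for res in peptide)


if __name__ == "__main__":
    for paralogue, tail in TAILS.items():
        print(paralogue, tail, mean_hydropathy(tail), can_carry_positive_charge(tail))
```

On this crude measure, the His-Ser tail scores as polar and potentially positively charged, whereas the Val-Tyr tail scores as hydrophobic with no potentially positive side chain, in line with the qualitative description above.
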
Recently, analysis of a mouse strain where SENP1 expression has been ablated as a consequence of a retroviral insertion in the SENP1 gene has revealed the biological role for SENP1 [21]. This mutation results in placental abnormalities that are incompatible with normal embryonic development. Analysis of the forms of SUMO affected by this mutation revealed that the levels of SUMO-1 conjugates were increased while the levels of SUMO-2/3 conjugates were unaffected. Furthermore, mutant cells appeared to be defective in processing SUMO-1, but not SUMO-2/3. While the lack of SUMO-1 processing in the mutant cells is entirely consistent with the strong preference that SENP1 displays for processing of SUMO-1 in vitro (Figure 6A), the observed accumulation of SUMO-1, but not SUMO-2 or SUMO-3, conjugates in the mutant cells is more puzzling. In vitro analysis of SUMO deconjugation by the isolated protease domain of SENP1 revealed that it deconjugated SUMO-1, SUMO-2 and SUMO-3 at similar rates (Figure 6B and [22]). It is entirely possible that the N-terminal domain of SENP1, which is absent from the bacterially expressed protein used in the studies reported here, confers additional targeting properties on the protein that direct it to SUMO-1-modified substrates.

Figure 7 Electrostatic analysis of the region of SUMO-specific proteases predicted to define processing specificity

(A) SENP1 has a negatively charged area (circled, red) in the region predicted to interact with the C-terminal SUMO peptides. SUMO-2 is shown as a cyan ribbon. (B) SENP2 has a neutral area (circled, white) in the region predicted to interact with the C-terminal SUMO peptides. SUMO-1 is shown as a green ribbon. (C) Ulp1 has a negatively charged area (circled, red) in the region predicted to interact with the C-terminal SUMO peptides. SMT3 is shown as a magenta ribbon.


We thank Heidi Mendoza (University of Dundee) and Barbara Ink (GlaxoSmithKline) for provision of the original SENP1 constructs. This work was supported by Cancer Research UK, Wellcome Trust and BBSRC (Biotechnology and Biological Sciences Research Council).


  • The structural coordinates for SENP1 (sentrin-specific protease 1) and SENP1 complexed with SUMO-2 (small ubiquitin-like modifier 2) have been deposited in the Protein Data Bank under codes 2CKG and 2CKH respectively.

Abbreviations: GST, glutathione S-transferase; HA, haemagglutinin; IPTG, isopropyl β-D-thiogalactoside; NEDD8, neural precursor cell-expressed developmentally down-regulated 8; NEDP1, NEDD8-specific protease; Ni-NTA, Ni2+-nitrilotriacetate; PEG, poly(ethylene glycol); PML, promyelocytic leukaemia protein; RanGAP1, Ran GTPase-activating protein 1; SAE, SUMO-activating enzyme subunit; SENP, sentrin-specific protease; SP100, speckled protein of 100 kDa; SMT3, suppressor of MIF2 (mitotic fidelity protein 2); SMT3IP, SMT3-specific isopeptidase; SUMO, small ubiquitin-like modifier; SV5, simian virus 5; TEV, tobacco etch virus; rmsd, root mean square deviation; Ubc9, ubiquitin-conjugating enzyme 9; Ubl, ubiquitin-like protein; Ulp1, Ubl-specific protease 1

