Research article

Probing the S1 specificity pocket of the aminopeptidases that generate antigenic peptides

Efthalia Zervoudi, Athanasios Papakyriakou, Dimitra Georgiadou, Irini Evnouchidou, Anna Gajda, Marcin Poreba, Guy S. Salvesen, Marcin Drag, Akira Hattori, Luc Swevers, Dionisios Vourloumis, Efstratios Stratikos


ERAP1 (endoplasmic reticulum aminopeptidase 1), ERAP2 and IRAP (insulin-regulated aminopeptidase) are three homologous enzymes that play critical roles in the generation of antigenic peptides. These aminopeptidases excise amino acids from N-terminally extended precursors of antigenic peptides in order to generate the correct length epitopes for binding on to MHC class I molecules. The specificity of these peptidases can affect antigenic peptide selection, but has not yet been investigated in detail. In the present study we utilized a collection of 82 fluorigenic substrates to define a detailed selectivity profile for each of the three enzymes and to probe structural and functional features of the S1 (primary specificity) pocket. Molecular modelling of the three S1 pockets reveals substrate–enzyme interactions that are critical determinants for specificity. The substrate selectivity profiles suggest that IRAP largely combines the S1 specificity of ERAP1 and ERAP2, consistent with its proposed biological function. IRAP, however, does not achieve this dual specificity by simply combining structural features of ERAP1 and ERAP2, but rather by a unique amino acid change at position 541. The results of the present study provide insights into antigenic peptide selection and may prove valuable in designing selective inhibitors or activity markers for this class of enzymes.

  • aminopeptidase
  • antigen
  • kinetics
  • peptide specificity
  • site-directed mutagenesis
  • substrate library


Antigen presentation and processing

Cytotoxic T-lymphocytes identify infected or transformed cells by recognizing small antigenic peptides bound on to cell-surface receptors of MHC I. These antigenic peptides are derived from the proteolysis of intra- or extra-cellular proteins and constitute an indicator of the health status of the cell. Aberrant generation of antigenic peptides can lead either to immune evasion or autoimmunity. Antigenic peptides are generated intracellularly by complex proteolytic pathways [1]. A key component of these pathways is the proteasome, a large intracellular multi-subunit protease that generates fragments from intracellular or endocytosed proteins. Peptides generated by the proteasome are transported into the ER (endoplasmic reticulum) by a specialized ATP-dependent peptide transporter, TAP (transporter associated with antigen processing) [2]. A similar but distinct pathway, the cross-presentation pathway, operates in specialized intracellular vesicles that contain endocytosed extracellular proteins [3]. The proteasome-generated peptides usually have the correct C-terminus as the final antigenic peptides, but also have N-terminal extensions that make them too large to bind on to MHC class I molecules that have stringent length requirements with a general preference for nonamers. Although these extensions vary from one to six amino acids long, the most common extension is one amino acid [4]. Inside the ER, resident aminopeptidases trim these antigenic peptide precursors to generate the mature antigenic peptides that can then bind on to nascent MHC class I molecules [5,6].

ERAP1 (ER aminopeptidase 1) and ERAP2 are two specialized aminopeptidases that reside in the ER and have been demonstrated to trim antigenic peptide precursors to generate mature antigenic peptides [7–10]. Recently, a homologous aminopeptidase, IRAP (insulin-regulated aminopeptidase; or PLAP) has been demonstrated to perform a similar function in cross-presentation vesicles [11]. These three aminopeptidases share approximately 50% sequence identity. As a result of their shared homology and function, it has been proposed that they constitute a distinct sub-family of aminopeptidases within the metalloprotease classification M1 [12]. ERAP1 is the best characterized of the three and has been shown to affect antigen presentation in vivo, shaping the pool of antigenic peptides and influencing immunodominance [13–17]. Inhibition of ERAP1 by the broad-spectrum metalloprotease inhibitor leucinethiol was sufficient to replicate gene knock-down experiments in cells and to induce alterations in the repertoire of the antigenic peptides [14]. ERAP1 has enzymatic properties that are unusual for an aminopeptidase, preferring to trim longer peptides down to a length of 8–10 amino acids, the appropriate length for MHC class I binding [18]. It has relatively broad substrate specificity showing preferences for side chains throughout the peptide-substrate sequence [19]. IRAP shares some of the molecular properties of ERAP1 in generating mature antigenic epitopes, although recent findings indicate that it does so in distinct patterns, suggesting differences in specificity [20].

The trimming specificity of the N-terminal amino acid from antigenic peptide precursors by aminopeptidases is a strong determinant for the generation of mature antigenic peptides and the determination of the antigenic peptide repertoire. A large number of antigenic peptide precursors carry only a single amino acid extension, whose trimming will be largely affected by the N-terminal specificity of the aminopeptidase [4]. The in vitro trimming preferences of ERAP1 have been recently demonstrated to largely determine antigenic peptide presentation in cultured cells [21]. Although highly homologous, ERAP1/2 and IRAP do not have the same specificity. Using chromogenic substrates it has been reported that the preferred residue for ERAP1 is leucine, whereas for ERAP2 it is arginine [22,23]. IRAP can cleave both substrates [24]. The exact role of these specificity differences in the biological function of these enzymes is not clear, nor have they been investigated in any detail. In the present study we set out to characterize in detail the shape, size and composition of the S1 specificity pocket of each enzyme, in an effort to better understand the molecular determinants that contribute to antigenic peptide repertoire generation. By a combination of substrate library screening, molecular modelling and site-directed mutagenesis, we unravel key features of the S1 pocket of these enzymes that are consistent with their distinct biological functions and may be valuable for the rational design of selective inhibitors or activity markers.


Protein expression and purification

Recombinant ERAP1 was produced by insect cell culture after infection with recombinant baculovirus carrying the ERAP1 coding sequence and isolated from the cell supernatant as previously described [19]. A recombinant and soluble form of IRAP was produced by 293F cells grown in suspension after transfection with a plasmid vector carrying the IRAP coding sequence as previously described [20].

For production of recombinant ERAP2, the sequence coding for full-length human ERAP2 was inserted in the pFastBac1 vector between the BssHII and NotI restriction endonuclease sites. The final construct contained the 21-bp A-rich sequence derived from a lobster tropomyosin cDNA leader sequence adjacent to the initiation codon and a C-terminal His6 tag for efficient expression and purification. The pFastBac1-ERAP2 vector was used to generate recombinant baculovirus according to the manufacturer's instructions (Invitrogen). The recombinant baculovirus was used to infect Hi5 cells grown in suspension in SF900II serum-free medium. At 3 days post-infection, recombinant ERAP2 was found in the cell supernatant, harvested by centrifugation and isolated by Ni-NTA (Ni2+-nitrilotriacetate) affinity chromatography as previously described for ERAP1 [19].


Site-directed mutagenesis for the construction of the E541R mutation in human IRAP was performed using the QuikChange® II XL kit according to the manufacturer's instructions (Agilent Technologies). The primers used for the mutagenesis were 5′-TCATCTGTTCAGTCTTCAGAACAAATTCGAGAAATGTTTGATTCTCTTTCC-3′ (sense) and 5′-GGAAAGAGAATCAAACATTTCTCGAATTTGTTCTGAAGACTGAACAGATGA-3′ (antisense). Successful mutagenesis was confirmed by DNA sequencing.

Library synthesis

Of the 82 fluorigenic substrates in the library, 61 have been described previously [25]. All new compounds [D-amino acids-ACC (where ACC is 7-amino-4-carbamoylmethylcoumarin), L-hTyr (homo-tyrosine)-ACC, L-4-guanidino-phenylalanine-ACC and L-dehydrotryptophan-ACC] were synthesized using protocols described previously [25]. HPLC purification and post-purification analysis of all new compounds were conducted on a Waters M600 solvent delivery module equipped with a Waters M2489 Detector system using preparative Waters Spherisorb S10ODS2 or analytical Waters Spherisorb S5ODS2 columns. Solvent composition: system A [water/0.1% TFA (trifluoroacetic acid)] and system B [acetonitrile/water 80%:20% (v/v) with 0.1% TFA]. All substrates were at least 95% pure and were validated by ESI–MS (electrospray ionization MS) at the Mass Spectrometry Facility at the Department of Chemistry of University of Wroclaw. The chemical structures for all 82 substrates are shown in Supplementary Figure S1.

Fluorigenic assay

Trimming of the fluorigenic peptide substrates by ERAP1, ERAP2 and IRAP was followed using a TECAN infinite® M200 microplate fluorescence reader. The samples were excited at 380 nm and fluorescence was recorded at 460 nm. The reactions were followed for 5–10 min at 24 °C. In all cases the rise in fluorescence was linear with time, indicating steady-state kinetics. The slope of the time-course was used to calculate the reaction rate. L-AMC [L-leucine-AMC (7-amino-4-methylcoumarin)] and R-AMC (L-arginine-AMC) substrate controls were included in every plate to allow comparison between data collected from different plates.
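The rate calculation described above amounts to fitting a straight line to each fluorescence time course. The following is a minimal sketch in Python, not the authors' analysis code: the function names and the example readings are hypothetical, and dividing by a per-plate control slope is only one assumed way of using the L-AMC and R-AMC controls to compare plates.

```python
def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (signal units per second)."""
    n = len(times_s)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


def relative_to_control(sample_slope, control_slope):
    """Express a slope relative to the control substrate on the same plate.

    Assumed normalization scheme; the text only states that controls were
    included to allow comparison between plates.
    """
    if control_slope <= 0:
        raise ValueError("control slope must be positive")
    return sample_slope / control_slope


# Hypothetical readings every 60 s over 5 min (arbitrary fluorescence units).
times = [0, 60, 120, 180, 240, 300]
sample = [100.0, 250.0, 400.0, 550.0, 700.0, 850.0]
control = [100.0, 400.0, 700.0, 1000.0, 1300.0, 1600.0]

sample_rate = initial_rate(times, sample)
control_rate = initial_rate(times, control)
print(sample_rate, control_rate, relative_to_control(sample_rate, control_rate))
```

A linear fit over the whole trace is only appropriate when the trace is linear, as the text reports; a curved trace would call for fitting the initial portion only.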

Homology modelling

Multiple sequence alignment of human ERAP1 (isoform a, NP_057526.3), ERAP2 (NP_001123612.1) and IRAP (isoform a, NP_005566.2) was performed using ClustalW2 with the default parameters (Supplementary Figure S2). The good overall sequence identity of ERAP2 and IRAP with ERAP1 (49% and 44% respectively), especially considering the higher degree of identity at the catalytic subsites of interest, provided a solid template for homology modelling. On the basis of the crystal structure of ERAP1 (PDB code 2XDT), ERAP2 and IRAP models were generated using Modeller 9v4 [26]. Residues Pro46–Arg940 from the PDB 2XDT structure were used as a template for the generation of ERAP2 (Arg61–Thr960) and IRAP (Leu60–Leu1025) models (excluded residues are shown in Supplementary Figure S2). The model with the lowest objective function value was selected for further optimization using AMBER 9 [27]. Hydrogen and missing heavy atoms, including Zn(II), disulfide and metal–ligand bonds were added using XLEaP. The AMBER-based parm99SB force field was applied to all protein atoms, and parameters for the Zn(II) co-ordination sphere were taken from [28]. Subsequently, the position of hydrogen atoms and the metal ion site was optimized with energy minimization in a vacuum using a distance-dependent dielectric and a 20 Å (1 Å=0.1 nm) cut-off for non-bonded interactions. The quality of the ERAP2 and IRAP models was assessed using the Structural Analysis and Verification Server; the models exhibited a reasonable degree of quality by virtue of their sequence alignment.

Substrate docking

The substrate library was generated starting from the SMILES (simplified molecular input line entry specification) representation of each compound, then OMEGA [29] was used to calculate the initial three-dimensional co-ordinates and QUACPAC (Openeye Inc.) to apply AM1-BCC atomic charges [30]. Docking of the substrates was performed using AutoDock 4.2 [31] with default parameters except for the number of docking rounds, which was increased to 100. Non-polar hydrogen atoms were merged and Kollman charges were applied to the protein atoms using AutoDockTools 1.4.5. Ligands were treated as fully flexible excluding amide bonds and guanidinium groups. The docked model chosen for analysis was among the highest binding energy conformations with proper orientation for substrate binding (i.e. a scissile bond C=O···Zn(II) distance of <2.5 Å). No further optimization of the predicted enzyme–substrate interactions was attempted. Electrostatic potential surfaces were generated using APBS and PME electrostatics packages [32]. VMD 1.8.6 and PyMOL were used for visual inspection and rendering of the Figures [34].
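The pose-selection rule described above can be sketched as a distance filter followed by a ranking on binding energy. This is an illustration under stated assumptions, not the authors' procedure: the `Pose` class, the example poses and their co-ordinates are hypothetical, "highest binding energy" is taken to mean the most favourable (most negative) estimated energy, and the sketch returns the single best qualifying pose, whereas the text says only that the chosen pose was among the best.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    name: str
    binding_energy: float   # kcal/mol; more negative is assumed more favourable
    carbonyl_o_xyz: tuple   # scissile-bond carbonyl oxygen position, in angstroms


def select_pose(poses, zn_xyz, max_distance=2.5):
    """Return the most favourable pose whose C=O...Zn(II) distance is below the cut-off."""
    productive = [p for p in poses if math.dist(p.carbonyl_o_xyz, zn_xyz) < max_distance]
    if not productive:
        return None
    return min(productive, key=lambda p: p.binding_energy)


# Hypothetical docking output with the Zn(II) ion placed at the origin.
zn = (0.0, 0.0, 0.0)
poses = [
    Pose("A", -8.1, (3.0, 0.0, 0.0)),   # best energy, but carbonyl too far from Zn(II)
    Pose("B", -7.4, (2.0, 0.5, 0.0)),   # about 2.06 angstroms from Zn(II)
    Pose("C", -6.9, (1.9, 0.0, 0.0)),   # 1.9 angstroms from Zn(II)
]

chosen = select_pose(poses, zn)
print(chosen.name if chosen else "no productive pose")
```

Filtering on geometry before ranking on energy keeps a well-scored but catalytically unproductive pose (pose A here) from being selected.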


Screening strategy

We used a collection of 82 fluorigenic substrates to generate a selectivity profile for ERAP1, ERAP2 and IRAP. The experimental conditions for each screen were designed so that every substrate used would be assayed under sub-Km concentrations so that the rate of cleavage would linearly correlate with the kcat/Km value of each substrate and enzyme as previously described [25]. To ensure this, we first generated Michaelis–Menten plots for the best-known substrate for each enzyme. We used L-ACC for ERAP1 and IRAP and R-ACC for ERAP2. The Km for ERAP1 is larger than 1 mM (estimated to be 1150±305 μM) and the Km values for ERAP2 and IRAP are 90±3 μM and 85±30 μM respectively (results not shown). As a result we chose to do all screening assays at substrate concentrations below 10 μM. After the preliminary screening, substrates for which no signal was measured were re-screened at a concentration of 100 μM in an effort to quantify trimming rates for poor substrates. Using this strategy we estimated a minimum trimming rate difference between good and non-processed substrates of approximately 200-fold.
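The reasoning behind the sub-Km screen can be checked numerically: when [S] is much smaller than Km, the Michaelis–Menten rate reduces to (kcat/Km)[E][S], and the ratio of the true rate to this linear approximation is Km/(Km+[S]). The sketch below uses the Km values quoted above and treats 10 μM as the upper bound on the screening concentration. The function name is an illustrative choice, and the reported uncertainties on Km are ignored.

```python
def linearity_ratio(km_uM, substrate_uM):
    """Ratio of the Michaelis-Menten rate to its low-[S] linear approximation.

    v_MM / v_linear = Km / (Km + [S]); kcat and [E] cancel out.
    """
    if km_uM <= 0 or substrate_uM < 0:
        raise ValueError("Km must be positive and [S] non-negative")
    return km_uM / (km_uM + substrate_uM)


# Km values (in micromolar) as reported in the text for the reference substrates.
KM_UM = {"ERAP1": 1150.0, "ERAP2": 90.0, "IRAP": 85.0}

for enzyme, km in KM_UM.items():
    ratio = linearity_ratio(km, 10.0)
    print(f"{enzyme}: rate at 10 uM is {ratio:.3f} of the (kcat/Km)[E][S] estimate")
```

For these reference substrates the linear approximation overestimates the rate by at most about 10% at 10 μM. At the 100 μM re-screening concentration the deviation would be much larger for substrates with a Km near 90 μM, although the Km of the poor substrates being re-screened is not known.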

Selectivity profiles for ERAP1, ERAP2 and IRAP

The relative trimming rates for each of the 82 substrates with each of the three enzymes are shown in Figure 1. Rates are plotted as a fraction of the best substrate for each enzyme. The three enzymes share key preferences but also display marked differences. ERAP1 efficiently trimmed approximately 16 of the 82 substrates, showing significant preference for hydrophobic and aromatic amino acids, as well as for long aliphatic side chains. Accordingly, the best performing substrate was hTyr. ERAP2 displayed a significantly different profile than ERAP1, with strong preferences for positively charged amino acids. Overall, ERAP2 efficiently trimmed approximately 10 of the 82 substrates, with several key differences from ERAP1. The best two substrates, arginine and hArg (homo-arginine), had a guanidinium group, revealing a strong preference for extended chains with positively charged ends. Shorter hydrophobic side chains were processed to a smaller degree. Similarly to ERAP1, ERAP2 appeared to prefer extended carbon side chains, but in contrast with ERAP1, ERAP2 displayed a very strong preference for a positive charge at the end of those chains. The selectivity profile of IRAP was the most permissive of the three enzymes. IRAP was able to trim at least 25 of the 82 substrates in the library. Interestingly, in almost all cases, IRAP was able to process the substrates that were trimmed by either ERAP1 or ERAP2. This finding suggests that IRAP has the combined specificity of ERAP1 and ERAP2. However, some exceptions were evident since a few substrates were not processed by ERAP1 or ERAP2 but trimmed by IRAP (cyclopentyl-glycine, Abu, Bpa) and vice versa (3-NO2-tyrosine). These observations suggest that although IRAP can generally process the sum of substrates of ERAP1 and ERAP2, it may use distinct molecular interactions to achieve this specificity.

Figure 1 Selectivity profiles of ERAP1, ERAP2 and IRAP

Trimming rates were calculated for each substrate and then normalized for the best substrate for each enzyme. Error bars correspond to the S.D. for three to six measurements. Substrates for which no bar is drawn failed to be hydrolysed by the enzyme even when measured at a substrate concentration of 100 μM. See Supplementary Figure S1 for substrate structures. (A) Natural amino acid side chains in L- or D- configuration. (B) Non-natural amino acid side chains. (C) Reaction kinetics and specific rates for the best substrate for each enzyme (hTyr-ACC for ERAP1, Arg-ACC for ERAP2 and hArg-ACC for IRAP). Error bars indicate the S.D. of three measurements.

Non-natural side chains probe the characteristics of the S1 pockets

We employed amino acids with unnatural side chains to gain insight into structural and functional features of the S1 pockets (Figure 1B). Interestingly, we identified a large number of unnatural side chains as good substrates for all three enzymes. Accordingly, the best substrate for ERAP1 was hTyr, hArg was the second best substrate for ERAP2, and both of those substrates were optimal for IRAP. This observation suggests that the S1 pockets are not strictly optimized for natural amino acids but can easily accommodate more complex structures.

D-amino-acid-based substrates were poorly processed by all three enzymes, suggesting that the L-configuration is a prerequisite for binding and/or catalysis. Under typical experimental conditions only ERAP2 was found to be able to trim D-arginine, albeit ~50-fold slower than L-arginine. Michaelis–Menten analysis of these two substrates suggested that the lower trimming rate was due to changes in both the Km (90±3 μM for L-Arg and 1053±304 μM for D-Arg) and kcat parameters (0.177±0.003 s−1 for L-Arg and 0.038±0.018 s−1 for D-Arg) (see Supplementary Figure S3). These findings suggest that the L-configuration is crucial for both binding and catalysis for this family of enzymes.
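As a consistency check on the numbers above, the ratio of catalytic efficiencies (kcat/Km) for L-Arg and D-Arg can be computed directly from the reported parameters. The sketch below uses only the central values quoted in the text; the reported standard deviations are not propagated, and the variable and function names are illustrative.

```python
# Kinetic parameters for ERAP2 as quoted in the text (central values only).
L_ARG = {"kcat_per_s": 0.177, "km_uM": 90.0}
D_ARG = {"kcat_per_s": 0.038, "km_uM": 1053.0}


def catalytic_efficiency(kcat_per_s, km_uM):
    """Return kcat/Km in M^-1 s^-1, converting Km from micromolar to molar."""
    return kcat_per_s / (km_uM * 1e-6)


eff_l = catalytic_efficiency(**L_ARG)
eff_d = catalytic_efficiency(**D_ARG)

km_fold = D_ARG["km_uM"] / L_ARG["km_uM"]             # weaker apparent binding of D-Arg
kcat_fold = L_ARG["kcat_per_s"] / D_ARG["kcat_per_s"]  # slower turnover of D-Arg
efficiency_fold = eff_l / eff_d                        # equals km_fold * kcat_fold

print(f"kcat/Km L-Arg: {eff_l:.0f} M^-1 s^-1")
print(f"kcat/Km D-Arg: {eff_d:.1f} M^-1 s^-1")
print(f"Km fold: {km_fold:.1f}, kcat fold: {kcat_fold:.1f}, overall: {efficiency_fold:.1f}")
```

The overall ratio comes out at roughly 54-fold, in line with the ~50-fold difference in trimming rate observed under sub-Km conditions. The Km term (about 12-fold) contributes more than the kcat term (about 5-fold).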

Similarly to human aminopeptidase N (CD13) [25], the enzymes analysed in the present paper had a very strong preference for amino acids with an amino group in the α position and were completely inactive toward substrates with a hydroxy group in the α position, such as Apns, or amino acids with no amino group present in this position, such as 6-Ahx or β-Ala. This finding is consistent with the important role for the peptide N-terminus in substrate recognition [35].

Some of the substrates in the library have side chains of substantial size and would be expected to fit only in large S1 pockets. The S1 pocket of IRAP in particular appears to be able to accommodate the most bulky and hydrophobic substrates in the library (Bpa and Igl), whereas ERAP1 and ERAP2 processed them poorly. This result indicates that the S1 pocket of IRAP may be the largest of the three. Finally, lack of processing of conformationally restricted substrates such as 1-Nal, 2-Nal or Bip by ERAP1, ERAP2 and IRAP suggests that although the pocket is large, it is well-defined and rigid so as to exclude side-chain structures that are not flexible enough to adopt appropriate configurations. Overall, these observations suggest that it may be possible to optimize S1 recognition for each enzyme by incorporating bulky non-natural side chains in the substrate.

ERAP1 and ERAP2 mixture behaves similarly to IRAP

Saveanu et al. [10] have previously suggested that ERAP1 and ERAP2 operate in a concerted manner in the ER. In contrast, IRAP has been suggested to operate on a separate pathway of cross-presentation, distal from compartmentalized ERAP1 and ERAP2 [11]. To investigate possible effects in S1 specificity when ERAP1 and ERAP2 are mixed, we screened the L-substrate library in the presence of a 2:1 molar ratio of ERAP1/ERAP2, according to the molar ratio of the two enzymes reported previously [10]. The resulting specificity profile was found to closely follow the sum of the individual selectivity profiles of each enzyme, revealing no strong synergism or allosteric effects under these experimental conditions (Figure 2). Again, the selectivity profile of the mixture of ERAP1 and ERAP2 closely resembled the profile of IRAP, although some differences were obvious. We conclude that IRAP largely combines the specificity of ERAP1 and ERAP2, but retains unique profile features that suggest differences in the molecular determinants of its S1 pocket.
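The statement that the mixture profile follows the sum of the individual profiles corresponds to a simple additive model with no synergy: the rate expected for the mixture is the sum of each enzyme's per-mole rate weighted by its amount. The sketch below illustrates that null model only. The substrate rates are placeholder numbers, not measured values, and the inputs are assumed to be specific rates on a common scale rather than profiles already normalized per enzyme.

```python
def predicted_mixture_profile(rates_a, rates_b, moles_a, moles_b):
    """Additive (no-synergy) prediction, normalized to the best substrate of the mixture."""
    shared = sorted(set(rates_a) & set(rates_b))
    raw = {s: moles_a * rates_a[s] + moles_b * rates_b[s] for s in shared}
    best = max(raw.values())
    if best <= 0:
        raise ValueError("no substrate is processed by either enzyme")
    return {s: rate / best for s, rate in raw.items()}


# Placeholder per-mole rates (arbitrary units); illustrative only.
erap1_rates = {"Leu": 1.0, "Arg": 0.05, "Val": 0.02}
erap2_rates = {"Leu": 0.1, "Arg": 1.0, "Val": 0.01}

# 2:1 molar ratio of ERAP1 to ERAP2, as used in the screen described above.
mixture = predicted_mixture_profile(erap1_rates, erap2_rates, moles_a=2.0, moles_b=1.0)
for substrate, fraction in mixture.items():
    print(f"{substrate}: {fraction:.3f}")
```

Systematic deviations of a measured mixture profile from such a prediction would point to synergism or allosteric effects, which the text reports were not observed under these conditions.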

Figure 2 Comparing the S1 specificity profile

ERAP1 and ERAP2 were mixed at a 2:1 molar concentration and the selectivity profile of the mix was compared with that of IRAP using the L-substrates.

The three enzymes present similarities primarily for the substrates they do not process efficiently

Despite their differences, the three enzymes presented some striking similarities in the substrates they were unable to process efficiently. None of the enzymes were able to process a proline side chain, presumably due to the absence of a free N-terminal group to be recognized by the aminopeptidase GAMEN motif [12]. Very short hydrophilic side chains were not preferred, presumably due to the hydrophobic nature of the S1 pockets. β-Branched side chains such as valine, isoleucine and threonine were also poorly tolerated. Finally, negatively charged side chains were very poor substrates for all three enzymes. These observations suggest that the S1 pockets of the three enzymes share common structural features that exclude some categories of side chains from being effectively recognized.

Molecular modelling suggests critical features of the S1 pocket that control specificity

To understand the molecular basis for the specificity effects unravelled by the library screen, we utilized a recently released crystallographic structure of ERAP1 (PDB code 2XDT) to dock the best substrates and analyse the atomic level interactions in the S1 pocket. Since no crystallographic structures are yet available for ERAP2 and IRAP, we used the structure of ERAP1 to construct homology models of the other two aminopeptidases. The high homology shared between the three aminopeptidases (50% identity) and the relatively few amino acid differences in the vicinity of the S1 pocket result in homology models of higher accuracy compared with our previous report [19].

Docking of model substrates, in combination with the positioning of key catalytic features of the enzyme [such as the residues of the HEXXH motif that binds the catalytic Zn(II) atom and the GAMEN motif that is responsible for the recognition of the N-terminus of the peptidic substrate], helps define the spatial orientation of the S1 pockets (Figure 3). For all three enzymes, the general shape and size of the pockets are similar, although IRAP has a larger exit channel towards the solvent. The S1 pocket is relatively large (being able to easily accommodate even the largest of the docked substrates) and elongated, originating from the catalytic site Zn(II) atom and forming a shallow channel towards the solvent. The channel is capped by residues from the C-terminal domain of the protein, forming a closed structure with minimal solvent access, suggesting that a conformational change may be necessary to allow substrate binding and product release. The overall electrostatic potential of the pocket is strongly negative, an observation that may explain the poor processing of negatively charged side chains. This potential is largely derived from the presence of two conserved glutamate residues that provide the N-terminal docking site and from additional negatively charged side chains in the S1 pocket. ERAP2 has the most negatively charged residues of the three enzymes (Glu177, Asp198 and Asp888), whereas IRAP has two (Glu426 and Glu541) and ERAP1 only one (Glu865). The only basic residue within the three S1 pockets belongs to ERAP1 (Arg430), leading to altered electrostatic potential near the top of the S1 pocket (Figure 3, blue coloured region), an observation that supports the poorer ability of ERAP1 to trim positively charged P1 substrates.

Figure 3 Surface representation of the S1 pocket for ERAP1, ERAP2 and IRAP coloured by electrostatic potential

The best substrates for each enzyme are shown as stick models in the predicted conformations.

Key residues that control specificity

By analysing the interactions between docked substrates and protein side chains, we were able to define the residues that line the S1 pocket for the three enzymes. These residues are listed in Table 1 and indicated in the alignment in Supplementary Figure S2. Half of these residues are conserved between the three enzymes (positions 184, 314, 319, 433, 864 and 868 in ERAP1 numbering) and presumably contribute to the common general characteristics of the pocket. Five of these conserved residues are non-polar (Pro184, Phe314, Met319, Phe433 and Phe864 in ERAP1) and may support the preference of all three enzymes for non-polar P1 substrates. Interestingly, Phe433/Phe450/Phe544 comes in close proximity to the β-carbon of the substrate backbone, leading to unfavourable steric hindrance with any substrates with β-branched side chains such as valine, threonine or isoleucine (Figure 4A). Instead, the Phe433/Phe450/Phe544 residues are predicted to provide favourable aromatic π interactions with the guanidinium groups of arginine and hArg P1-bearing substrates as well as favourable CH–π interactions with linear aliphatic chains (Figure 4C). Accordingly, the longer hTyr, hLeu (homo-leucine) and norleucine are the same or better substrates in comparison with tyrosine, leucine and isoleucine. On the opposite side of the pocket, Met319/Met336/Met430 is predicted to make contacts with the Cβ atom of the substrates, leading to steric hindrance for substrates with D-configuration and decreased binding affinity (Figure 4B). D-Substrates are predicted to bind with a relatively different configuration of their scissile peptide bond compared with L-substrates (compare Figure 4A with 4B), consistent with the reduced catalytic efficiency we observed (lower kcat).

Table 1 Residues of ERAP1, ERAP2 and IRAP that are predicted to provide key interactions with substrates and help form the S1 specificity pocket

Conserved residues between the three enzymes are in bold.

Figure 4 Key residues that define the S1 pocket

(A) Phe433 in ERAP1 is stacked closely with a leucine side chain of the substrate. (B) Met319 makes unfavourable steric interactions with the Cβ of D-leucine, leading to an altered binding conformation of the scissile peptide bond. (C) Simulated interactions between a 4-guanyl-phenylalanine side chain with Phe544 and Glu541 in IRAP. (D) The six non-conserved amino acids that define the S1 pocket of each enzyme; the predicted conformation of hArg is indicated by an arrow.

Six of the 12 residues that define the S1 pockets vary between the enzymes and contribute to differences between the three S1 pockets that underlie changes in specificity (Figure 4D). Of these residues, two were found to be of particular importance for interactions that appear critical for the differences in specificity between the three enzymes. The polar residue at position 181/198/293 (ERAP1/ERAP2/IRAP numbering) is a glutamine in ERAP1 and IRAP, but is an aspartate in ERAP2. Its positioning adjacent to the GAMEN motif makes it appropriate for interactions with positively charged side chains and has been shown to be important for the selectivity of ERAP2 by site-directed mutagenesis [36]. Interestingly, although IRAP, similarly to ERAP2, is able to process substrates with positively charged side chains, it does not contain an aspartate residue at position 181/198/293 but resembles ERAP1 by having a glutamine residue. This observation raises the question of how IRAP is able to recognize positive charges in the S1 pocket. Docking of positively charged substrates in IRAP suggests that at least one non-conserved, negatively charged amino acid in IRAP, Glu541, may be a candidate residue for providing salt-bridge interactions to stabilize positively charged substrates in the S1 cavity of IRAP (Figure 4C). ERAP1 has an arginine residue at the equivalent position (Arg430) and ERAP2 has a glutamine residue (Gln447).

Mutagenesis confirms that Glu541 in IRAP is important for positively charged substrate recognition

To test the prediction that Glu541 in IRAP is important for the enzyme's preference for positively charged side chains, we used site-directed mutagenesis to change the Glu541 in IRAP to an arginine, the equivalent residue in ERAP1. The IRAP E541R variant was expressed in recombinant form and purified to homogeneity (see Supplementary Figure S4). We probed the substrate selectivity of the mutant IRAP using the L-substrate library and compared it with the wild-type protein (Figure 5). As predicted, the mutant IRAP had an altered selectivity profile and was much less potent in trimming positively charged residues (noted by arrows in Figure 5). In this context the selectivity profile of E541R IRAP was similar to the profile of ERAP1. Michaelis–Menten analysis using the fluorigenic substrate R-AMC indicated that the difference in specificity for IRAP E541R was primarily due to loss in affinity (Km) (Figure 6). We concluded that Glu541 in IRAP is largely responsible for allowing IRAP to mimic the substrate preferences of ERAP2, without losing the preferences of ERAP1.

Figure 5 Selectivity profile of IRAP E541R mutation compared with wild-type protein

Data have been normalized for leucine. Arrows highlight the most significant changes brought about by the mutation. WT, wild-type.

Figure 6 Michaelis–Menten kinetics for hydrolysis of the substrate R-AMC by E541R IRAP as well as the wild-type enzyme

The enzymatic parameters Km and kcat are depicted for the wild-type (WT) and mutant (E541R) IRAP enzyme.


The importance of antigenic peptide precursor trimming by aminopeptidases has emerged in the last few years as both a necessary step for antigenic peptide generation and also as a novel paradigm of regulation of the adaptive immune response [17,37]. However, the discovery that three distinct aminopeptidases participate in antigen processing has raised important questions regarding the regulation of antigenic peptide generation that are far from answered. We hypothesized that the necessity for multiple aminopeptidases performing what is seemingly an identical role lies in key differences between the specificity of these enzymes. Towards testing this hypothesis, we systematically characterized the S1 specificity of ERAP1, ERAP2 and IRAP. We discovered that these three enzymes share many features between their S1-binding pockets, but at the same time have key differences that may help explain their distinct biological functions.

Our analysis suggests that to a large extent IRAP combines the N-terminal specificity of ERAP1 and ERAP2. This is consistent with the recently proposed function of IRAP in a distinct processing compartment inside the cell [11]. ERAP1 and ERAP2 have been proposed to function in tandem inside the ER, with ERAP2 behaving as an accessory protease, assisting ERAP1 in trimming sequences that would otherwise be poorly processed. IRAP, however, appears to act alone inside cross-presentation compartments and as a result it needs to be able to process both ERAP2 and ERAP1 substrates. ERAP2 gains the ability to process positively charged amino acids by a key change at position 181/198/293 [36]. However, this particular change reduces its affinity for hydrophobic chains, specializing it for positively charged amino acids. IRAP cannot afford this option; it needs to be able to trim both ERAP1 and ERAP2 substrates. IRAP achieves this not by altering position 181/198/293 but by altering position 430/447/541 instead, allowing it to combine both specificities. This elegant solution to the apparent specificity problem is indicative of how key amino acid changes inside a specificity pocket can guide selectivity in this family of aminopeptidases. Generally it has proven difficult to alter the primary specificity of proteases by single amino acid replacements because the S1 pocket is influenced by a large number of interactions and may even be intrinsically disordered, as seen for example in the trypsin/chymotrypsin family [38]. However, certain scaffolds tolerate specificity switching by single residue substitutions, for example the chymase/granzyme family [39].

Regardless of the differences in their specificity, the three enzymes share some striking similarities in the side chains they fail to recognize. None of the three enzymes can process negatively charged side chains, presumably owing to the strong negative electrostatic potential of the general region of the S1 pocket. Furthermore, all three enzymes fail to process substrates with side chains that carry β-carbon or oxygen branching (such as valine, isoleucine or threonine) owing to the limited space in the S1 pocket and the stringent stereochemical requirements for the recognition of the N-terminus of the substrate by the conserved GAMEN motif. Phe544 plays a key role in this phenomenon and is critical for enzyme activity [40]. These findings, however, raise a crucial question. If none of the aminopeptidases that perform antigenic peptide processing before MHC class I loading can efficiently process common amino acids such as valine, isoleucine, threonine, glutamate or aspartate, how do such antigenic peptide precursors get processed? Inspection of the SYFPEITHI antigen database reveals that many antigenic peptides may be derived from precursors that would require the excision of such amino acids. One notable example is the antigenic peptide from chicken ovalbumin, SIINFEKL, which can be processed by ERAP1 efficiently, although a common precursor sequence contains a glutamate residue (ESIINFEKL) [9]. A possible answer to this question lies in the specificity of ERAP1 for amino acids distal to the N-terminus of the substrate [19], a property that may be shared by ERAP2 and IRAP. An alternative explanation would include the participation of a currently unidentified accessory aminopeptidase.
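The question raised above can be made concrete with a minimal sketch that flags N-terminal extension residues belonging to the poorly processed classes named in the text (valine, isoleucine, threonine, glutamate and aspartate). The function name `flag_extension` and the set `POORLY_TRIMMED` are hypothetical names introduced only for illustration. The sketch assumes that a precursor differs from the mature epitope only by an N-terminal extension, and it reports a qualitative flag rather than any measured rate.

```python
# Residues the text reports as poorly processed at the N-terminus by all three
# enzymes: beta-branched (V, I, T) and negatively charged (E, D).
# Illustrative set only; it encodes the qualitative statement, not kinetic data.
POORLY_TRIMMED = frozenset("VITED")


def flag_extension(precursor: str, epitope: str) -> list[tuple[str, bool]]:
    """Return (residue, is_poorly_trimmed) for each N-terminal extension residue.

    Assumes the precursor is the mature epitope plus an N-terminal extension.
    """
    if not precursor.endswith(epitope):
        raise ValueError("precursor must end with the mature epitope")
    extension = precursor[: len(precursor) - len(epitope)]
    return [(aa, aa in POORLY_TRIMMED) for aa in extension]


# The ovalbumin example from the text: one glutamate must be excised.
print(flag_extension("ESIINFEKL", "SIINFEKL"))  # [('E', True)]
```

A flagged residue here only marks a precursor for which S1 specificity alone would predict poor trimming. As the ESIINFEKL example shows, such precursors can still be processed efficiently, which is why the flag should be read as a prompt for further analysis and not as a prediction.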

The important role of these aminopeptidases in antigen presentation, in combination with their distinct selectivity profiles, suggests that selective inhibition of a single enzyme may lead to subtle alterations of the antigenic peptide repertoire that can be used to modulate the immune response. Recently, polymorphic variation in ERAP1 and ERAP2 has been linked with predisposition to autoimmunity, viral infection and cancer, suggesting that manipulation of the activity of these aminopeptidases may have important therapeutic potential [41]. Indeed, use of the non-selective metalloproteinase inhibitor leucine-thiol in cultured cells has been demonstrated to alter antigen presentation [17]. Furthermore, we recently demonstrated that polymorphic variation in ERAP1 can affect antigen processing in vitro [42]. Therefore the development of highly potent and selective inhibitors for this class of enzymes may constitute a useful approach towards the modulation of the adaptive immune response. In addition, the development of highly selective substrates can be useful for investigating established pathogenic links and developing diagnostic and prognostic markers. Our results suggest that although these three enzymes are highly homologous, they still carry key differences in their S1 pockets that can be exploited for the design of selective inhibitors or specific activity markers that can be used to follow antigen processing in vivo or ex vivo.

In summary, we have performed a detailed analysis of the S1 specificity of ERAP1, ERAP2 and IRAP, three enzymes that process antigenic peptide precursors and are crucial to the functioning of the adaptive immune system. By combining small-substrate library screening, molecular modelling and mutagenesis we revealed key differences and similarities between the three enzymes that underlie their biological function. Furthermore, our analysis can facilitate efforts towards the rational design of small-molecular-mass selective inhibitors and activity markers that can be used to manipulate and characterize the adaptive immune response.


Efthalia Zervoudi performed and analysed the enzymatic experiments. Athanasios Papakyriakou and Dionisios Vourloumis performed and helped interpret the molecular modelling. Dimitra Georgiadou, Irini Evnouchidou, Akira Hattori and Luc Swevers designed the ERAP2 and IRAP expression systems, purified the recombinant enzymes and helped interpret the specificity data. Anna Gajda, Marcin Poreba and Marcin Drag designed and synthesized the substrate library and helped interpret the specificity data. Marcin Drag, Athanasios Papakyriakou and Guy Salvesen helped design the study, interpret the data and write the paper. Efstratios Stratikos conceived and supervised the project, helped analyse the data and wrote the paper. All authors read and approved the final manuscript.


This work was funded in part by an NCSR Demokritos internal grant (to E.S.) and by the Foundation for Polish Science and the State for Scientific Research Grant [grant number N N401 042838 (to M.D.)]. I.E. and D.G. acknowledge financial support by the National Centre for Scientific Research “Demokritos” graduate fellowship programme.


We thank Dr Florentia Fostira for help with confirming mutagenesis reactions by DNA sequencing and Paulina Piatek for help with synthesis of fluorigenic substrates.

Abbreviations: ACC, 7-amino-4-carbamoylmethylcoumarin; AMC, 7-amino-4-methylcoumarin; ER, endoplasmic reticulum; ERAP, ER aminopeptidase; hArg, homo-arginine; hTyr, homo-tyrosine; IRAP, insulin-regulated aminopeptidase; R-AMC, L-arginine-AMC

