Biochemical Journal

Review article

Biological signalling activity measurements using mass spectrometry

Pedro R. Cutillas, Claus Jørgensen

Abstract

MS (mass spectrometry) techniques are rapidly evolving to high levels of performance and robustness. This is allowing the application of these methods to the interrogation of signalling networks with unprecedented depth and accuracy. In the present review we discuss how MS-based multiplex quantification of kinase activities and phosphoproteomics provide complementary means to assess biological signalling activity. In addition, we discuss how a wider application of these analytical concepts to quantify kinase signalling will result in a more comprehensive understanding of normal and disease biology at the system level.

  • cancer
  • heterogeneity
  • quantitative analysis
  • kinase network
  • systems biology

INTRODUCTION

A fundamental property of all living beings is their ability to adapt to their environment. Cells interpret cues in their surroundings to regulate metabolic homoeostasis, to move, to differentiate, to proliferate or to die. Thus from the chemotaxis of bacteria towards a food source to the formation of complex organs in mammals, organisms exist and survive because of their ability to respond to external signals. At the cellular level, extracellular information is interpreted by a network of biochemical reactions that link input signals initiated at surface receptors to biological outputs of diverse nature. Some of the best known and described regulatory functions controlling the flux of signalling from receptor to biological output involve phosphorylation-dependent signalling. Through the ability to control the formation of multi-protein complexes, protein localization, stability and enzymatic activity, phosphorylation and thereby its regulators (kinases, phosphatases and phospho-binding-domain-containing proteins) play an intricate part in cell signalling [1].

In order to conceptualize the biochemical structure of signalling processes, kinases and other signalling proteins are often grouped into pathways. A considerable amount of work has been dedicated to identifying components of pathways involved in metabolic regulation, mitogen response and survival processes, as well as to delineating how these contribute to the regulation of fundamental cellular processes (e.g. [2–8]) and how their deregulation influences disease progression. Indeed, it is now known that defects in cellular signal processing contribute to the onset and progression of diseases including metabolic syndromes [3,9,10], neurodegeneration [11,12], autoinflammatory conditions [6,8,13] and cancer [14–17].

Early biochemical work has been essential to lay the foundation of our understanding of cell signalling. An important property is that individual pathway components cross-talk with each other, thereby forming branched rather than linear cascades [18–22]. Feedback loops confer on kinase signalling some of its inherent properties, such as the ability to integrate, potentiate (such as by ultrasensitivity), terminate and diversify signalling, thereby adding complexity to the structure of these signalling pathways [23–26]. The complexity in the interconnections is such that it is now widely accepted that signalling components are in fact not arranged as linear pathways, but instead form a network of complex design and architecture [27]. One benefit of network modularity compared with linear cascades is that such structures are less susceptible to the malfunction of individual members, thereby conferring robustness to the system [26]. Networks are also better suited to interpret and integrate the multiplicity of signals which cells need to decode and respond to, thus allowing cells to integrate multiple simultaneous signalling cues [28,29]. Importantly, malfunction in disease is more common at the signalling nodes that are central in the network and receive multiple inputs [30,31].

The extent and complexity of kinase signalling is illustrated by the fact that the human genome contains more than 500 protein-kinase-encoding genes [1] which regulate the phosphorylation of thousands of sites on proteins; it is estimated that >30% of proteins are phosphorylated at any one time [32], and that mammalian cells may contain >100000 phosphorylation sites [30]. Further complexity arises because protein kinases do not work in isolation but are interwoven with other types of signal transmission, such as lipid signalling and GTPases, as well as other forms of post-translational modifications, including acetylation and nitrosylation. Protein–protein and protein–lipid interactions also modulate the ability and specificity of kinase cascades to transmit signals through the formation of multi-protein signalling complexes [33,34].

Several kinases have an impact on the ability of cells to survive and proliferate; conversely, kinases required for the survival of specific cell types may not be essential in other cell types [35]. This can be illustrated by the heterogeneous nature of cancer cell sensitivity to kinase inhibitors. Approximately 25% of breast cancer patients are predicted to respond to Herceptin {an inhibitor of the tyrosine kinase HER2 [human EGF (epidermal growth factor) receptor 2]} based on their HER2 expression status [36,37], although in reality only ~50% of HER2-positive patients respond to this drug (i.e. only approx. 13% of breast cancer patients show clinical benefit from treatment with Herceptin). Similarly, not all CML (chronic myeloid leukaemia) patients show improved recovery when treated with Gleevec {an inhibitor of the tyrosine kinase BCR (breakpoint cluster region)-Abl (V-abl Abelson murine leukaemia viral oncogene homologue 1); [38]} and those that initially respond eventually develop resistance. These data indicate that although a significant proportion of breast and leukaemia cancer cells display hyperactivated HER2 and Abl respectively, most tumours are not solely dependent on the increased activation of these kinases. Unresponsive cancers may be the result of resistance-conferring mutations directly within the target kinase. Alternatively, cells may be proliferating using alternative signalling routes, and hence they may activate a different set of kinase(s). These observations indicate that kinases confer cancer cells with oncogenic signals in a context-dependent manner and that cells can take different signalling routes to proliferate.

In attempts to identify a core set of kinases essential to cell survival, RNAi (RNA interference) was used to systematically knock down the expression of all kinases in several types of normal and cancer cells [39–41]. Interestingly, these results showed that whereas some poorly studied kinases were indispensable for certain cell types to survive, other kinases, with well-known mitogenic and survival roles, were dispensable for the survival of these cells. Similarly, systematic gene sequencing of kinases in cancer has identified putative driver mutations in poorly studied kinases of unknown function [42], suggesting that activation of these kinases may confer a survival advantage on cells and thereby contribute to cancer progression [43]. Collectively these data indicate that our accumulated understanding of kinase signalling in homoeostasis and disease is still limited.

These data also give the perception that the field has focused on the study of a few kinases whereas others, with perhaps equally important roles in biology, are much less well characterized (or not characterized at all). This bias is of course not intentional, but it can be attributed to the availability of reagents, especially antibodies and small molecule inhibitors, to study well-known kinases, rather than to the relative importance of these enzymes to cell biology. It may be argued that, in order to significantly advance, the field now requires new analytical techniques that can complement classical biochemical studies focusing on one kinase at a time. Ideal methods for the analysis of kinase signalling should allow their study without a preconception of the nodes in the network that may be important for specific biological outputs. In this respect, as briefly mentioned above, RNAi is providing new opportunities to systematically investigate the biological roles of novel kinases from a functional perspective [44]. As for mechanistic studies, the advent of MS-based techniques for comprehensive quantification of kinase signalling is revolutionizing the way we can study signalling at the systems biological level. MS-based techniques are evolving from being screening tools (which are inherently time consuming and have low specificity) to offer a level of throughput and robustness that makes MS a suitable method for ‘routinely’ assaying signalling components at an ‘-omic’ scale. We envisage that a wider application of these novel techniques to signalling studies will provide a quantum leap in our understanding of the molecular properties of kinase networks at the system level and how these are regulated in normal and disease biology.

In the present review we discuss the concepts behind the analysis of kinase signalling using MS, with particular focus on the techniques that can be used for multiplex kinase activity measurements and can thus be used to investigate the signalling network at the system level. The analytical concepts reviewed are kinase activity measurements using MS and MS-based techniques for phosphoproteomics (Figure 1). The emphasis of the discussion is on the significance of the findings obtained with both techniques to the understanding of cell signalling processes.

Figure 1 Workflows for kinase activity measurements using MS

(A) Kinase activity can be measured in cell lysates by incubating them with peptide kinase substrates in in vitro enzymatic reactions. Phosphorylated reaction products are then quantified by LC-MS [46]. (B) MS can also be used to quantify endogenous phosphorylation, which is a reflection of net kinase activity in cells [57].

DIRECT MULTIPLEX QUANTIFICATION OF KINASE ACTIVITIES BY MS

The classical way of assaying kinase activity involves isolating the kinase under study from other cellular components (e.g. by immunoprecipitation using specific antibodies) and mixing it with its substrates (a known protein or peptide substrate and radioactive [32P]ATP) and magnesium and/or manganese (cofactors needed by most kinases to be active). Activity is then calculated by measuring the incorporation of radioactivity into the polypeptide substrate over time. In addition to the undesirable use of radioactivity, a shortcoming of this technique is that it is restricted by the availability of antibodies specific for the kinase under study.

Using PI3K (phosphoinositide 3-kinase)/Akt [also known as PKB (protein kinase B)] signalling as an illustrative example, we have previously shown that MS can be used to quantify kinase activity with great precision and sensitivity [45,46]. Several signalling cues affect PI3K/Akt activity, such as insulin, antigen, GPCRs (G-protein-coupled receptors) and growth factor receptors [47]. In addition, PI3K/Akt signalling is deregulated in a plethora of diseases ranging from diabetes to inflammation, neurodegeneration and cancer [48,49]. The technique for quantifying PI3K activity involves mixing cell lysates with a peptide substrate (with sequence RPRAATF) known to be relatively specific for protein kinases downstream of PI3K, including PKB and SGK (serum- and glucocorticoid-induced protein kinase) [50,51]. Although this peptide could in principle be phosphorylated by other kinases present in the lysate, they seemed to do so with much slower kinetics [50]. Thus, by employing specific inhibitors, it was demonstrated that the assay was specific for the measurement of PI3K/Akt activity and provided a proof of concept for the use of MS to quantify kinase activities in whole-cell lysates [46].
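The way an activity value might be derived from such an assay can be illustrated with a minimal sketch. It assumes that the phosphopeptide signal is recorded at a few time points and that activity is taken as the least-squares slope of signal against time; the time points, signal values and the `initial_rate` helper are hypothetical and are not taken from the cited studies.

```python
# Minimal sketch (hypothetical data): estimate kinase activity as the rate of
# phosphopeptide formation and compare lysates with and without an inhibitor.

def initial_rate(times_min, signals):
    """Least-squares slope of signal versus time (signal units per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_s = sum(signals) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in zip(times_min, signals))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den

# Made-up LC-MS phosphopeptide signals (arbitrary units) at four time points.
times = [0, 5, 10, 15]
control_signal = [0.0, 50.0, 100.0, 150.0]
inhibited_signal = [0.0, 5.0, 10.0, 15.0]

rate_control = initial_rate(times, control_signal)
rate_inhibited = initial_rate(times, inhibited_signal)

# Fraction of the measured activity that is sensitive to the inhibitor.
fraction_inhibited = 1 - rate_inhibited / rate_control
print(rate_control, rate_inhibited, fraction_inhibited)
```

In this illustration, a large inhibitor-sensitive fraction would be taken as evidence that the signal reports on the targeted pathway, in the spirit of the specificity controls described above.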

One of the advantages of using MS to determine kinase activity is that the incorporation of phosphate to the substrate can be determined in a highly specific manner. This contrasts with the use of [32P]ATP which measures the incorporation of phosphate not only on the substrate added to the reaction, but also on other proteins present in the lysate due to unwanted side reactions involving other kinases also present in the lysate. Since MS allows detection of the phosphopeptide product of the kinase reaction with ultimate specificity, it is possible to use total cell lysate as the enzyme source, thus obviating the need for immunoprecipitation of the kinase. Another advantage of measurements based on activity is that the signal of the target kinase is amplified; this is because each kinase molecule catalyses the conversion of a large number of substrates into product (which is what is being measured) per unit time. This, in combination with the sensitivity of modern mass spectrometers, makes these assays extremely sensitive, thus allowing investigation of signalling in tissues, such as biopsies and rare cell populations, which are not amenable to extensive investigations using traditional biochemical methods.

In addition to specificity and sensitivity, the main attraction of MS as an analytical technique is that it allows the measurement of hundreds/thousands of analytes in the same assay. Thus MS could in principle be used to quantify the activity of all of the kinases present in the genome. This would involve incubating cell lysates with peptide substrates for all known kinases (plus ATP and Mg2+) and using MS to measure the intensities of phosphopeptides produced as a result (Figure 2). This analytical concept was introduced at the 2004 Conference of the American Society for Mass Spectrometry [52] as activitomics. Later, the practical validity of this analytical concept was demonstrated for the analysis of 90 kinase activities in a single assay [53].

Figure 2 The principle of multiplex analysis of kinase activities

MS can be used to multiplex the assessment of kinase activities in total cell lysates by quantifying the reaction products of in vitro enzymatic assays.

Methods for high-content analysis of kinase activities, in which all kinases may be assayed simultaneously, would have great utility in many different areas of biomedical research and for drug discovery and validation. In addition, a direct measure of activities (instead of proxies such as genetic mutations, protein amounts or phosphorylation status) would be ideal for inferring the circuitry of signalling networks. However, practical issues make designing an assay for the quantification of all kinases difficult, as our current knowledge of kinase specificity still is a limiting factor. Although progress in our knowledge of kinase specificity is being made [54,55], this is still short of kinome-wide coverage and therefore the technique is at present restricted to studying the regulation of well-characterized enzymes. Moreover, it is not clear whether or not it may be possible to identify polypeptide substrates that may be used to assay kinases with similar specificities; thus Akt1, Akt2 and Akt3 are predicted to have the same substrate specificities and these may also be shared with other basophilic kinases such as SGK and S6Ks (S6 kinases) [56].

Thus MS-based methods for multiplex analysis of activities still need further development in order to constitute a general tool for unbiased discovery and monitoring of signalling networks. However, with an understanding of their strengths and limitations, these MS-based activity assays can still be invaluable in numerous applications of biomedical and pharmacological research. For example, these assays can be very useful to study the regulation of kinase groups and to quantify the activities of well-characterized kinases, which could serve as ideal pharmacodynamic markers to assess target inhibition by investigative drugs in small amounts of material and to profile off-target effects of these kinase inhibitors in primary tissues. Kinase activities may also be ideal predictive biomarkers to stratify patients in clinical trials and for personalized therapies.

USING PHOSPHOPROTEOMICS TO MEASURE KINASE/PHOSPHATASE ACTIVITY

By definition, phosphorylation sites on proteins are the result of the activity of kinases and phosphatases acting on them (Figure 3). Therefore measuring the extent of phosphorylation of specific residues provides a means to assess the equilibrium of kinase/phosphatase reaction pairs [57]. Consequently there is great interest in the design and use of phosphoproteomic techniques in many areas of biology and biomedical research as a means to investigate the wiring and regulation of kinase signalling networks. Traditionally, modulation of signalling dynamics has been measured using antibodies recognizing specific phosphorylation sites. More recently, the development of antibody microarrays has allowed multiplexed measurement of protein biomarkers with very good sensitivity and throughput. As an example of a recent study, van Oostrum et al. [58] used reversed-phase protein arrays to monitor the phosphorylation of six sites known to be activation markers of pathways downstream of growth factor receptors. However, the workhorse of most successful current phosphoproteomics methods is MS, which has been used to identify and quantify thousands of phosphorylation sites on proteins in a variety of studies. The advantage of MS over immunochemical techniques is that analysis is not restricted by the availability of antibodies, and that signalling can be analysed without a preconception of the signalling processes that may (or may not) be involved in the biological system under investigation.

Figure 3 Phosphoproteomics as a measure of kinase activity

(A) By definition, each phosphorylation site is the result of kinase activity, opposed by phosphatase activity. (B) The hypothetical case of phosphorylations on proteins with different levels of expression is presented. (C) The data in (B) is shown as total phosphorylation or stoichiometry of phosphorylation. This hypothetical scenario shows that quantification of phosphoproteins and their phosphorylation sites can give different information when total or stoichiometry of phosphorylation are considered.

A question that needs to be addressed is the significance and utility, for the understanding of cell signalling, of quantifying phosphorylation on such a large scale. Unfortunately the assertion that phosphorylation leads to the activation of enzymes is frequently made; according to this view, measuring phosphorylation is a means to infer the activity of the protein bearing the phosphorylation site and thus phosphorylation and activation are frequently used interchangeably. It is correct that phosphorylation on the activation loop of certain protein kinases correlates with their activity status; however, the functional consequence of phosphorylation varies from site to site. For example, many phosphorylations decrease the activity of kinases, as illustrated by the phosphorylation of GSK3β (glycogen synthase kinase 3β) by PKB at position Ser9, which decreases its enzymatic activity [59]. In addition, whereas phosphorylation of STAT (signal transducer and activator of transcription) transcription factors at a tyrosine residue around position 700 by tyrosine kinases [chiefly JAK (Janus kinase)] increases their transcriptional activity [14], FOXO (forkhead box O) transcription factors are deactivated by PKB phosphorylation. In fact, most of the phosphorylations by PKB/Akt are deactivating [60]. Other phosphorylations may not alter the enzymatic activity of the protein, but may result in it being sequestered into a specific subcellular compartment or changing its binding partners. It has also been proposed that some phosphorylations bear no functional significance for the protein being phosphorylated [61,62]; such non-functional phosphorylation is argued to exist because there is no selection pressure against it.
The existence of non-functional phosphorylation has been debated [63,64], but its occurrence is difficult to prove or disprove because phosphorylations that do not affect enzymatic activity may confer phosphoproteins with other properties, such as an ability to interact with other proteins or to be located in their physiological intracellular space.

Since there is no general rule for the functional consequence that phosphorylation has on the function of proteins bearing these modifications, what then is the value of large-scale studies that quantify thousands of phosphorylation sites? We believe that the value of phosphoproteomics is in its potential to offer a comprehensive readout of the activity status of kinases and regulatory loops within the kinase network. As mentioned above, each phosphorylation site identified by MS is the result of a kinase reaction. Thus, even when the physiological consequence of a particular phosphorylation site may not be fully understood, it is still a readout of a kinase/phosphatase reaction, and as such, measuring the extent of phosphorylation provides a readout of net kinase activity (discussed in more depth below). In addition, only by measuring the behaviour of the signalling network as a whole (as well as the dynamics) can associations between individual or groups of phosphorylations and cellular behaviour be identified. This will in turn allow in-depth functional studies of these phosphorylation events (i.e. prioritization), as well as identification of kinases and phosphatases predicted to control the specific phosphorylation events.

CAVEATS OF THE USE OF PHOSPHOPROTEOMICS AS READOUTS OF KINASE/PHOSPHATASE ACTIVITIES

A caveat of the use of phosphoproteomics as a measure of kinase activity is that we do not know the identities of the kinases acting on most of the sites quantifiable by MS. Thus, although conceptually we could use phosphoproteomics to quantify all of the activities of the kinome, in practice we can only follow those activities for which well-characterized substrates are known. This is analogous to classical signalling studies which use immunoblotting to analyse the extent of phosphorylation of sites such as STAT5A phospho-Tyr694, PKB phospho-Ser473 and ERK (extracellular-signal-regulated kinase) phospho-Tyr204 to quantify the activation of JAK, PI3K/PDK1 (phosphoinositide-dependent kinase 1) and MEK [MAPK (mitogen-activated protein kinase)/ERK kinase] respectively, as these sites are thought to be specifically phosphorylated by these upstream kinases.
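The idea of reading kinase activities from annotated sites can be sketched as a simple lookup. The site-to-kinase pairs below are those named in the text; the `kinase_readouts` helper, the intensity values, the `PROTEIN_X` entry and the choice to sum intensities per kinase are illustrative assumptions only.

```python
# Minimal sketch: map quantified phosphorylation sites to the upstream kinase
# they are thought to report on. Only the pairs named in the text are used.
SITE_TO_KINASE = {
    ("STAT5A", "Tyr694"): "JAK",
    ("PKB", "Ser473"): "PI3K/PDK1",
    ("ERK", "Tyr204"): "MEK",
    ("GSK3B", "Ser9"): "PKB",
}

def kinase_readouts(site_intensities):
    """Sum intensities per annotated kinase; collect sites with no known kinase."""
    readouts = {}
    unassigned = []
    for site, intensity in site_intensities.items():
        kinase = SITE_TO_KINASE.get(site)
        if kinase is None:
            unassigned.append(site)  # most sites fall here in practice
        else:
            readouts[kinase] = readouts.get(kinase, 0.0) + intensity
    return readouts, unassigned

# Hypothetical phosphopeptide intensities from one LC-MS run.
example = {
    ("STAT5A", "Tyr694"): 2.0,
    ("PKB", "Ser473"): 3.0,
    ("PROTEIN_X", "Ser1"): 5.0,  # hypothetical site without an annotated kinase
}
readouts, unassigned = kinase_readouts(example)
print(readouts, unassigned)
```

The `unassigned` list makes the caveat concrete: any site without an entry in the lookup contributes nothing to the kinase readout, however accurately it is quantified.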

In order to evaluate different kinase nodes quantifiable in a single LC (liquid chromatography)-MS run, the data from [65] was mined for phosphorylation sites for which well-characterized upstream kinases are known. Table 1 shows examples of phosphorylation sites that provide readouts of several kinases and which are readily quantifiable by MS. The challenge of the field is to identify the kinases acting on the many other thousands of phosphorylation sites that can be quantified by MS, so as to provide a link between the phosphoproteome and the kinome and to transform phosphoproteomic studies from descriptive exercises to studies of exceptional functional and mechanistic significance.

Table 1 Examples of phosphorylation sites quantifiable by direct LC-MS for which upstream kinases are known

These phosphorylation sites, identified by LC-MS/MS and readily quantified by LC-MS [99], provide readouts of the activity of kinases acting upstream. Potential kinases for each phosphorylation site were obtained from the phosphoELM database [86] and the literature. BTK, Bruton's tyrosine kinase; CaMK IV, Ca2+/calmodulin-dependent protein kinase IV; CDK1, cyclin-dependent kinase 1; GSK3β, glycogen synthase kinase 3β; IRAK1, interleukin-1-receptor-associated kinase 1; MAPK, mitogen-activated protein kinase; MAPKAPK2, MAPKAP kinase-2; mTOR, mammalian target of rapamycin; p70S6K, p70 S6 kinase; PAK1, p21-activated kinase 1; PDK1, phosphoinositide-dependent kinase 1; PKA, protein kinase A; PKC, protein kinase C; ROCK1, Rho-associated kinase 1; RSK-1, ribosomal S6 kinase 1.

To this end, intense research is leading to an increase in our knowledge of substrate specificities of individual kinases or kinase groups. The approaches used for this purpose include both computational methods and brute-force biochemical studies. At the computational level, linear motifs and protein–protein interactions have been used to predict kinases responsible for phosphorylating experimentally identified phosphorylation sites [54,66].
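As a minimal sketch of linear motif matching, the snippet below scans a sequence for a simplified basophilic motif, R-x-R-x-x-[S/T]. This pattern is an assumption introduced here for illustration and is not the method of the cited studies. As noted earlier for Akt, SGK and the S6Ks, a motif of this kind cannot by itself distinguish kinases with overlapping specificities.

```python
import re

# Simplified basophilic motif R-x-R-x-x-[S/T] (an assumption for illustration).
# The lookahead allows overlapping matches to be reported.
BASOPHILIC_MOTIF = re.compile(r"(?=R.R..[ST])")

def candidate_sites(sequence):
    """Return 0-based positions of Ser/Thr residues that sit in the motif."""
    return [m.start() + 5 for m in BASOPHILIC_MOTIF.finditer(sequence)]

# The substrate peptide mentioned in the text matches at its threonine.
print(candidate_sites("RPRAATF"))
```

Real predictors combine such motifs with contextual information such as protein–protein interactions; the regular expression above captures only the sequence component.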

Experimental wet laboratory approaches for systematic identification of kinase substrates have also been reported. Philip Cohen's laboratory reported an approach, termed Kestrel [32], to identify substrates of kinases in which the recombinant kinase, for which new substrates are being searched, is incubated with [32P]ATP and dephosphorylated cellular proteins and an in vitro kinase reaction is allowed to occur. The phosphorylated reaction products were detected by radiography and, after separation by multi-dimensional HPLC, identified by MS. As an extension of this approach, modern phosphoproteomic techniques are now being used to identify phosphorylations in in vitro kinase reactions in a global fashion. In order to lower the probability for false positives in in vitro kinase reactions, elegant chemical biological approaches have been described in which kinases are engineered to accept analogues of ATP as a co-substrate of the kinase activity. These ATP analogues are not substrates of cellular kinases, but are effectively used by recombinant kinases engineered at the ATP-binding pocket. The approach was pioneered in the laboratory of Shokat and Morgan and used to identify CDK1 (cyclin-dependent kinase 1) substrates [67]. Huber's laboratory recently used a similar approach to identify MEK1 substrates [68].

TRANSLATING PHOSPHOPROTEOMIC DATA INTO KINASE ACTIVITY DATA

Since, by definition, each phosphorylation site is the result of a kinase reaction (Figure 3), it should in principle be possible to use phosphorylation data to infer kinase activation on a proteomic scale. However, even when the identities of the kinase(s) catalysing the addition of phosphates to a given residue are known, inferring kinase activation from phosphorylation data is not straightforward. Quantification of phosphorylation stoichiometry would be needed for assessing specific kinase activity (Figure 3). However, since these measurements cannot easily be obtained systematically, the value of phosphoproteomics is in the ability to provide a measure of total kinase activity. In other words, quantification of a phosphorylation site, without a knowledge of the expression of the protein bearing the modification site, tells us about how active the kinase(s) acting on these sites have been in cells under a defined set of conditions. This concept can be illustrated with an example in which a hypothetical kinase phosphorylates a protein substrate (Figure 3). For the purpose of this discussion, we will refer to specific activity as the ability of a kinase to phosphorylate its substrates, whereas net activity will be used to denote how many times the kinase has actually phosphorylated such substrates before cell lysis and under defined experimental conditions. In this hypothetical scenario, phosphatase activity is implied in net kinase activity. The hypothetical case of low substrate expression and high specific activity would result in large stoichiometry of phosphorylation (Figure 3). Even at low substrate expression, since its specific activity is high, the kinase manages to phosphorylate substrates in this hypothetical case. Analytically, this scenario would be detected as high stoichiometry of phosphorylation, but total amounts of phosphopeptide quantified would be low. 
Thus net kinase activity (the number of times the hypothetical kinase has phosphorylated this substrate) would be relatively low, simply because the substrate is not highly expressed. It is important to note that, in cases in which substrates for a particular kinase are not expressed, total enzymatic activity would be zero even if the kinase was highly expressed and activated to its maximum. Conversely, the hypothetical case of high substrate expression and low kinase activation would result in low stoichiometry of phosphorylation, yet net kinase activity would be high, simply because high substrate expression would allow the kinase to phosphorylate many of these molecules per unit time (Figure 3). Thus total phosphorylation and stoichiometry provide complementary information on kinase activation, and the ability to obtain both values would be beneficial for the understanding of signalling.
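The two hypothetical cases can be made concrete with a small numerical sketch. The amounts below are invented for illustration, and stoichiometry is taken here as the phosphorylated fraction of the substrate, phospho/(phospho + unphospho).

```python
# Minimal sketch with made-up amounts (arbitrary units) contrasting total
# phosphorylation with stoichiometry of phosphorylation.

def stoichiometry(phospho, unphospho):
    """Fraction of substrate that is phosphorylated; None if substrate is absent."""
    total = phospho + unphospho
    if total == 0:
        return None  # undefined when the substrate is not expressed
    return phospho / total

# (phosphorylated amount, unphosphorylated amount) for each hypothetical case.
scenarios = {
    "low substrate, high specific activity": (9.0, 1.0),
    "high substrate, low specific activity": (100.0, 900.0),
}

summary = {
    name: {"total_phospho": p, "stoichiometry": stoichiometry(p, u)}
    for name, (p, u) in scenarios.items()
}
for name, values in summary.items():
    print(name, values)
```

In this example the first case has the higher stoichiometry (0.9 compared with 0.1) but the lower total phosphorylation (9 compared with 100 units), which is the complementarity described above.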

Thus although methods for systematic measurement of stoichiometry of phosphorylation would be desirable, at present phosphoproteomics can be used to quantify net phosphorylation, which we believe is a useful measure of kinase activity. An example of this is the activation of STATs, which requires phosphorylation of a tyrosine residue at position 700 of their sequence. Since only phosphorylated STATs translocate from the cytoplasm to the nucleus, the transcriptional activity of STATs depends on the total amount of phosphoprotein rather than on their stoichiometry of phosphorylation at this residue [69]. Thus, at least in this case, a measure of total phosphorylation of the STAT Tyr700 residue is more useful to quantify the activity status of the JAK/STAT signalling than the determination of stoichiometry for this residue. Nevertheless, attempts to provide methods for global quantification of stoichiometry by MS have been reported [70]. This technique involved measuring total protein as well as the peptide bearing the site of phosphorylation and its unphosphorylated counterpart. These measurements can rarely be made for a large number of sites and samples with the current capabilities of modern MS; therefore the approach is only possible for the analysis of highly abundant proteins and in those laboratories with significant instrumentation and bioinformatics support.

EXAMPLES OF PHOSPHOPROTEOMIC STUDIES AND WHAT WE ARE LEARNING FROM THIS WORK

The technology for quantitative phosphoproteomics has recently been extensively reviewed [57,71–73]. We will thus only describe these methods in the context in which these were used for biological research.

A typical phosphoproteomics experiment involves comparing the intensities of phosphopeptides in LC-MS/MS (MS/MS is tandem MS) runs across the experimental conditions chosen for the study. The first step in the workflow is the digestion of all of the proteins present in cell lysates, thus creating a peptide mixture containing hundreds of thousands of peptides (Figure 4). Phosphopeptides are subsequently enriched from this peptide mixture using different forms of chromatographic media, the most popular being IMAC (immobilized metal-affinity chromatography) and TiO2 (titanium dioxide). A higher coverage of the phosphoproteome can be achieved by performing an initial separation step prior to phosphopeptide enrichment; this can be accomplished by SCX (strong cation exchange), HILIC (hydrophilic interaction LC) or RP (reversed-phase)-HPLC. Most of the peptides present in IMAC or TiO2 eluates are phosphopeptides containing between one and three phosphorylation sites (although phosphopeptides with a higher number of sites and unphosphorylated acidic peptides are also present). Peptides in these phospho-fractions are then identified and quantified by LC-MS/MS.

Figure 4 A workflow for large-scale isotope-based quantification of phosphorylation

A typical quantitative phosphoproteomics experiment involves several steps in its workflow. More than ten thousand phosphorylation sites can routinely be quantified with this approach.

Although LC-MS is inherently quantitative, this technique does not produce an output that can be easily translated into a quantitative readout of peptide abundance. Therefore methods for quantitative proteomics based on LC-MS had to be developed in order to harness the quantitative information inherent in MS data. Perhaps the most popular of such methods is based on metabolic labelling. Although different forms of metabolic labelling for quantitative proteomics have been described, SILAC (stable isotope labelling by amino acids in cell culture) has been extensively used to quantify phosphorylation [70,74,75]. Quantification of phosphorylation sites by LC-MS was described in the mid-2000s [76,77]; however, the study by Olsen et al. [78] was the first to apply the workflow described above (Figure 4) in a large-scale phosphoproteomics experiment. In that study, proteins in HeLa cells were labelled with SILAC and the cells were stimulated with EGF over five time points. After protein digestion, peptide separation by SCX HPLC into 13 fractions and enrichment from these fractions using TiO2 chromatography, phosphopeptides were identified and quantified by LC-MS/MS. Of the 6600 phosphorylation sites detected on 2400 proteins, approx. 2000 were shown to be regulated by EGF. This study was influential because it showed the power of state-of-the-art LC-MS/MS instrumentation for the quantification of signalling on a global scale. The study also showed unexpected complexity of the phosphorylation dynamics downstream of a growth factor receptor. SILAC has also been used to profile phosphorylation downstream of other signalling pathways including insulin and PDGF (platelet-derived growth factor) signalling [79,80]. Related techniques have been used to profile more than 20000 phosphorylation sites across the cell cycle [70].
However, although powerful, the methodology is limited by its complexity and time-consuming nature (see Figure 4), which restrict its implementation to laboratories with significant technological support. Figure 4 illustrates that a typical SILAC experiment comparing just two, or at most three, samples requires more than 1 month of experimental time. Therefore an experiment with three biological replicates would require at least 3 months of work (not taking into account bioinformatic analysis). Analysis of technical replicates requires less time because SILAC labelling and sample preparation do not need to be performed again. However, although technical replicates are important to evaluate analytical reproducibility, biological repeats are needed to assess biological variability.
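As a rough illustration of how the readout of a SILAC experiment such as those described above is commonly summarized, the sketch below converts paired light and heavy intensities for each phosphorylation site into log2 ratios and flags the sites that change by at least a chosen factor. The site labels, intensities and the twofold cut-off are assumptions made for this example and are not values from the studies cited.

```python
import math
from typing import Dict, Tuple


def log2_ratio(heavy: float, light: float) -> float:
    """log2 of the heavy/light intensity ratio for one phosphopeptide."""
    if heavy <= 0 or light <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(heavy / light)


def regulated_sites(
    intensities: Dict[str, Tuple[float, float]], fold_cutoff: float = 2.0
) -> Dict[str, float]:
    """Return sites whose heavy/light ratio changes by at least fold_cutoff."""
    threshold = math.log2(fold_cutoff)
    hits = {}
    for site, (light, heavy) in intensities.items():
        ratio = log2_ratio(heavy, light)
        if abs(ratio) >= threshold:
            hits[site] = ratio
    return hits


# Hypothetical (light, heavy) intensities: light = unstimulated cells,
# heavy = stimulated cells.
example = {
    "site_1": (1.0e6, 4.0e6),  # increased fourfold
    "site_2": (2.0e6, 2.2e6),  # essentially unchanged
    "site_3": (8.0e5, 2.0e5),  # decreased fourfold
}

hits = regulated_sites(example)
for site, ratio in hits.items():
    print(f"{site}: log2(heavy/light) = {ratio:+.2f}")
```

A fixed fold-change cut-off is used here only for simplicity; with biological replicates a statistical test on the ratios would normally replace it.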

As an alternative to SILAC, quantitative phosphoproteomic methods that rely on chemical labelling instead of metabolic labelling have also been reported. The rationale for the use of chemical labelling methods over SILAC is that these are applicable to primary samples, human tissues and cell cultures not amenable to SILAC labelling. For phosphoproteomics, iTRAQ (isobaric tag for relative and absolute quantitation) labelling was used to profile phosphorylation on proteins that bind small molecule kinase inhibitors and derive binding constants for different inhibitors [81]. Similar techniques have been used to profile phosphorylation downstream of growth factor receptors [82,83].

Collectively, phosphoproteomic experiments performed to date have illustrated the enormous complexity of the phosphoproteome. It is estimated that there may be more than 100000 sites of phosphorylation in cells, and many of them are regulated by RTKs (receptor tyrosine kinases), GPCRs and/or during the cell cycle. This is in line with the known function of phosphorylation in controlling fundamental cellular processes at the molecular level.

TARGETED PHOSPHOPROTEOMICS AS AN ALTERNATIVE TO LARGE-SCALE METHODS

As an alternative to large-scale phosphoproteomics, the field is also showing an interest in the use of targeted approaches to quantify pre-selected phosphopeptides that more faithfully represent the cellular signalling network of interest (Figure 5). The advantage of targeted approaches is that these have higher throughput and they overcome the problem of undersampling of data-dependent LC-MS/MS; this results in the quantification of identical peptides each time the analysis is performed, thus allowing comparison of a sufficient number of samples and replicates needed in order to assess the statistical significance of the quantitative data. A typical LC-MS/MS run that targets the quantification of several hundred phosphopeptides takes approx. 2 h of mass spectrometer time (Figure 5). Therefore a comparison of two samples with five biological replicates can be done in less than 1 day. Although this throughput cannot be considered to be high, it contrasts with that afforded by discovery strategies based on chemical or metabolic labelling which, as discussed above, may only be able to analyse two samples (without replicates) in 2 days of MS time. From a pragmatic point of view, the higher throughput of targeted MS means that phosphoproteomic strategies could be applied to address research questions intractable with other methods.
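The throughput estimate above follows from simple arithmetic, which the sketch below spells out. It assumes that every sample and replicate is analysed in its own run of approx. 2 h and ignores column equilibration, blanks and quality-control injections; the function name is hypothetical.

```python
def instrument_hours(n_samples: int, n_replicates: int, hours_per_run: float) -> float:
    """Total mass spectrometer time when each replicate of each sample is one run."""
    return n_samples * n_replicates * hours_per_run


# Two samples, five biological replicates each, approx. 2 h per targeted run.
targeted_hours = instrument_hours(n_samples=2, n_replicates=5, hours_per_run=2.0)
fits_in_one_day = targeted_hours < 24.0

print(f"Targeted analysis: {targeted_hours:.0f} h of instrument time")
print(f"Completed in less than 1 day: {fits_in_one_day}")
```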

Figure 5 A workflow for targeted quantification of phosphorylation

This approach involves a first analytical step in which phosphorylated peptides are identified by LC-MS/MS. In subsequent experiments, these phosphopeptides are then quantified across samples with relatively high throughput.

Therefore targeted methods for proteomics and phosphoproteomics offer a realistic alternative to immunochemical techniques, such as Western blots and protein arrays. As discussed above, an advantage of LC-MS over immunochemical techniques is that the analysis is not restricted by the availability of antibodies so that signalling may be investigated without a preconception of the nodes that may be involved. In addition, by introducing an internal standard such as phosphopeptides enriched with heavy isotopes of carbon and nitrogen, the technique also allows absolute quantification of individual phosphorylation events [84]. This feature is particularly useful for translating these assays into clinical applications, which require absolute rather than relative quantification in order to provide an objective measure of activity [46].
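The principle of absolute quantification with an isotope-labelled internal standard can be sketched as follows. The example assumes a single-point calibration in which the endogenous (light) and spiked (heavy) phosphopeptides respond identically in the mass spectrometer; the function name, peak areas and spiked amount are hypothetical.

```python
def absolute_amount(light_area: float, heavy_area: float, spiked_fmol: float) -> float:
    """Estimate the endogenous peptide amount from the light/heavy peak-area ratio.

    Assumes equal response of the light and heavy forms, so that
    amount_light = (area_light / area_heavy) * amount_heavy.
    """
    if heavy_area <= 0:
        raise ValueError("the internal standard must give a positive signal")
    return light_area / heavy_area * spiked_fmol


# Hypothetical peak areas for one phosphopeptide, with 50 fmol of heavy standard added.
amount_fmol = absolute_amount(light_area=3.0e5, heavy_area=1.5e5, spiked_fmol=50.0)
print(f"Endogenous phosphopeptide: {amount_fmol:.1f} fmol")
```

In practice a calibration curve over several spiked amounts would be needed to confirm that the response is linear in the range of interest.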

Phosphorylation sites of interest can be found in databases such as Phosida [85], PhosphoElm [86] and PhosphoSite [87], and in repositories of experimentally identified peptides (and phosphopeptides) including Tranche [88] and Pride [89]. However, these freely available resources are useful only as a starting point because targeted LC-MS requires information on retention time (which is instrument-specific), in addition to the mass and fragmentation information of phosphopeptides. Therefore such experiments normally require a first experimental step in which LC-MS/MS is used to identify the phosphopeptides that may be quantified in subsequent targeted experiments (Figure 5).

Two different approaches may be used for targeted quantitative LC-MS/MS analysis of peptides (and, by extension, phosphopeptides), namely SRM [selected reaction monitoring, also known as MRM (multiple reaction monitoring)] and XIC (extracted ion chromatogram) of precursor ions (Figure 6). Although SRM-like experiments can be performed in Q-TOF (quadrupole–time-of-flight) instruments [90], SRM is most efficiently performed in triple quadrupole mass analysers and involves mass selection of a parent ion, which is then fragmented in a collision cell. At least one of the fragments produced as a result is then mass-analysed by a second quadrupole (Figure 6). The technique is reputed to offer ultimate specificity because the method involves selection based on the mass of the analyte and that of a fragment originating from the same compound. Because of their specificity, SRM methods are the gold standards for the analysis of small molecules, such as drugs or pesticides, in clinical, forensic and environmental laboratories [91]. In proteomics, these methods are increasingly used to monitor biomarkers in biological fluids [92,93] and have also been used to quantify phosphorylation sites in small-scale studies [94]. At present, the number of peptides that can be quantified in a single SRM/MRM run is limited to several hundred [95].

Figure 6 Mass spectrometric techniques for targeted proteomics and phosphoproteomics

Phosphopeptides can be quantified by SRM (A) or by XIC (B). SRM involves quantifying at least one fragment derived from the precursor phosphopeptide ion. Specificity of quantification is due to the two stages of mass selection. SRM is normally performed in triple quadrupole instruments [95], although pseudo SRM/MRM experiments are also possible in Q-TOF-type instruments [90]. Quantification by XIC involves plotting the elution profile of the precursor phosphopeptide over the chromatographic run. Specificity of quantification is due to the narrow mass window for selection. In addition, filters of charge, isotope distributions and retention time contribute to the specificity of the quantification. These experiments are better performed in high-resolution and high-mass-accuracy instruments such as Q-TOFs, Orbitraps or FT-ICR (Fourier-transform ion cyclotron resonance) [97]. ESI, electrospray ionization.

Quantification by XIC of parent ions is an alternative to SRM for targeted analysis of peptides and phosphopeptides (Figure 6). This technique involves plotting the signal intensity of the mass-to-charge ratio of the (phospho)peptides being quantified as a function of the elution time during the LC-MS run, thus constructing XICs. The peak areas and heights on these peptide-elution profiles correlate with their abundance across samples [76,96,97]. Modern mass spectrometers of high resolution and high mass accuracy allow these XICs to be obtained with very narrow mass windows, thus increasing the specificity of the analysis. Specificity can be increased further by considering the charge of the peptide and its isotopic distribution, which eliminates the possibility of selecting a peak with the same mass but different charge, or the second or third isotope of a co-eluting ion [98].
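The construction of an XIC described above can be sketched in a few lines. The example assumes that each MS scan is available as a retention time together with a list of (m/z, intensity) pairs, and it applies only the narrow mass window; the charge, isotope and retention-time filters mentioned above are omitted. The data layout, function names, the 10 p.p.m. tolerance and all values are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[float, float]        # (m/z, intensity)
Scan = Tuple[float, List[Peak]]   # (retention time in min, peaks in the scan)


def extract_xic(scans: List[Scan], target_mz: float, tol_ppm: float = 10.0) -> List[Tuple[float, float]]:
    """Sum, for each scan, the intensity found within tol_ppm of target_mz."""
    half_width = target_mz * tol_ppm / 1e6
    trace = []
    for rt, peaks in scans:
        signal = sum(i for mz, i in peaks if abs(mz - target_mz) <= half_width)
        trace.append((rt, signal))
    return trace


def peak_area(trace: List[Tuple[float, float]]) -> float:
    """Integrate the elution profile with the trapezoidal rule."""
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(trace, trace[1:]):
        area += 0.5 * (i0 + i1) * (rt1 - rt0)
    return area


# Invented scans: the ion near m/z 500.25 elutes around 11 min, whereas the ion
# at m/z 500.30 falls outside the 10 p.p.m. window and is ignored.
scans = [
    (10.0, [(500.2500, 0.0), (500.3000, 50.0)]),
    (10.5, [(500.2502, 100.0), (500.3000, 40.0)]),
    (11.0, [(500.2499, 200.0)]),
    (11.5, [(500.2501, 100.0), (500.3000, 60.0)]),
    (12.0, [(500.2500, 0.0)]),
]

trace = extract_xic(scans, target_mz=500.2500, tol_ppm=10.0)
area = peak_area(trace)
print(f"XIC peak area: {area:.1f}")
```

Widening the tolerance in this example (for instance to 200 p.p.m.) would add the signal of the neighbouring ion to the trace, which illustrates why narrow mass windows increase specificity.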

These MS-based targeted approaches can be particularly suitable as quick readouts of kinase activities by their ability to quantify specific phosphorylation sites known to be activation markers of kinases. As an example of this concept, we used the XIC approach to quantify phosphorylation in two leukaemia cell lines showing strikingly different sensitivities to inhibitors that have PI3K and Src as their main targets. More than 2000 phosphorylation sites could be quantified in a single LC-MS run, of which >90% showed accuracy deviations lower than 50% [99]. This study demonstrated that LC-MS can quantify thousands of phosphorylation sites and thus produce high content and accurate data without the need for metabolic or chemical labelling. Importantly, these assays were performed in time frames compatible with the analysis of a sufficient number of samples for assessing the statistical significance of the data and to serve as a general analytical tool in cell signalling studies.

CONCLUDING REMARKS

Although more technological developments are needed for MS-based quantification of signalling to reach maturity, the current state-of-the-art is already allowing the interrogation of signalling networks with unprecedented depth and accuracy. More work is needed to identify the kinases acting on the phosphorylation sites that can be routinely quantified by LC-MS and to develop methods to routinely quantify stoichiometry. These advances will allow the construction of signalling networks which overlay quantitative phosphoproteomics data and fully exploit the information inherent in the phosphoproteome. The advent of targeted phosphoproteomics will contribute to the interrogation of the activation of defined signalling nodes with higher throughput than that afforded with classical techniques based on data-dependent MS/MS, yet with more depth and accuracy than by techniques based on immunochemistry, thus allowing us to ask biological questions intractable by other means. We envisage that application of the concepts reviewed here to quantify the activity of other enzyme classes important in signalling, such as acetyltransferases or GTPases, will further contribute to advance our understanding of cell biology at the system level.

FUNDING

The P.R.C. laboratory is supported by Bart's and the London Charity [grant number 297/997], Cancer Research UK [grant number C27327/A9914] and the Biotechnology and Biological Sciences Research Council [grant number BB/G015023/1]. C.J. thanks the Institute of Cancer Research and the Biotechnology and Biological Sciences Research Council for support.

Acknowledgments

We thank members of the Centre for Cell Signalling for helpful discussions and feedback on the manuscript.

Abbreviations: Abl, V-abl Abelson murine leukaemia viral oncogene homologue 1; EGF, epidermal growth factor; GPCR, G-protein-coupled receptor; HER2, human EGF receptor 2; ERK, extracellular-signal-regulated kinase; IMAC, immobilized metal-affinity chromatography; JAK, Janus kinase; LC, liquid chromatography; MEK, MAPK (mitogen-activated protein kinase)/ERK kinase; MRM, multiple reaction monitoring; MS/MS, tandem MS; PI3K, phosphoinositide 3-kinase; PKB, protein kinase B; Q-TOF, quadrupole–time-of-flight; RNAi, RNA interference; SCX, strong cation exchange; SGK, serum- and glucocorticoid-induced protein kinase; SILAC, stable isotope labelling by amino acids in cell culture; SRM, selected reaction monitoring; STAT, signal transducer and activator of transcription; XIC, extracted ion chromatogram

References
