Biochemical Journal

Research article

Molecular dynamics analysis of conserved hydrophobic and hydrophilic bond-interaction networks in ErbB family kinases

Andrew J. Shih , Shannon E. Telesco , Sung-Hee Choi , Mark A. Lemmon , Ravi Radhakrishnan


The EGFR (epidermal growth factor receptor)/ErbB/HER (human EGFR) family of kinases contains four homologous receptor tyrosine kinases that are important regulatory elements in key signalling pathways. To elucidate the atomistic mechanisms of dimerization-dependent activation in the ErbB family, we have performed molecular dynamics simulations of the intracellular kinase domains of three members of the ErbB family (those with known kinase activity), namely EGFR, ErbB2 (HER2) and ErbB4 (HER4), in different molecular contexts: monomer against dimer and wild-type against mutant. Using bioinformatics and fluctuation analyses of the molecular dynamics trajectories, we relate sequence similarities to correspondence of specific bond-interaction networks and collective dynamical modes. We find that in the active conformation of the ErbB kinases, key subdomain motions are co-ordinated through conserved hydrophilic interactions: activating bond-networks consisting of hydrogen bonds and salt bridges. The inactive conformations also demonstrate conserved bonding patterns (albeit less extensive) that sequester key residues and disrupt the activating bond network. Both conformational states have distinct hydrophobic advantages through context-specific hydrophobic interactions. We show that the functional (activating) asymmetric kinase dimer interface forces a corresponding change in the hydrophobic and hydrophilic interactions that characterize the inactivating bond network, resulting in motion of the αC-helix through allostery. Several of the clinically identified activating kinase mutations of EGFR act in a similar fashion to disrupt the inactivating bond network. The present molecular dynamics study reveals a fundamental difference in the sequence of events in EGFR activation compared with that described for the Src kinase Hck.

  • epidermal growth factor receptor (EGFR)
  • homology modelling
  • kinase activation
  • molecular dynamics
  • principal component analysis (PCA)


RTKs (receptor tyrosine kinases) are transmembrane glycoproteins important in intercellular communication and oncogenesis [1]; they comprise a large ligand-binding extracellular domain, a single transmembrane α-helix, an intracellular tyrosine kinase domain and a C-terminal tail that harbours regulatory tyrosine autophosphorylation sites (reviewed in [2,3]). The ErbB family consists of four homologous RTKs: EGFR [epidermal growth factor receptor; ErbB1/HER1 (human EGFR1)], ErbB2 (HER2/Neu), ErbB3 (HER3) and ErbB4 (HER4). Binding of growth factors to the extracellular ligand-binding domains of ErbB receptors promotes their homo- and/or hetero-dimerization, which in turn leads to activation of the cytoplasmic kinase domain. This triggers a multi-layered signalling network of crucial pathways regulating cell proliferation, differentiation, migration etc. [4]. Aberrant signalling by EGFR and ErbB2 is correlated with a variety of diseases, from psoriasis to cancer [5]. In particular, clinically identified mutations in EGFR have been shown to increase the basal activity of the EGFR kinase domain, and NSCLC (non-small-cell lung cancer) patients carrying these mutations respond remarkably to the EGFR RTK inhibitor gefitinib [68]. ErbB2 is the target of the therapeutic Herceptin antibody, and its amplification and overexpression in breast cancer correlates with a poor prognosis [9,10]. The loss of ErbB4 signalling in mice has been shown to result in defective heart, nervous system and mammary gland function [1113].

In RTK signalling, the intracellular kinase domain catalyses transfer of the γ-phosphate of ATP to tyrosine residues on both the RTK itself and in other target substrates (reviewed in [2]). Regulation of the RTK kinase domain is thought to involve contributions from several conserved sub-regions: the C-loop (catalytic loop), the A-loop (activation loop), the P-loop (glycine-rich nucleotide-binding loop) and the αC-helix, which together define the active site in the cleft between the β-strand-rich N-lobe and the helical C-lobe. The catalytic loop residues directly participate in phosphoryl transfer. The A-loop, P-loop and αC-helix (marked in Figure 1A) modulate the activity of the kinase domain by regulating accessibility of the active site to binding and co-ordinating both ATP and the substrate tyrosine. The ~20 amino acid A-loop in ErbB kinases contains one phosphorylatable tyrosine residue (Tyr845 in EGFR, Tyr877 in ErbB2 and Tyr850 in ErbB4). In many kinases, e.g. IRK (insulin receptor kinase), phosphorylation of this A-loop tyrosine residue triggers conformational changes that allow substrate access to the active site. Thus A-loop conformational changes constitute a key event in RTK regulation (Figure 1B). The αC-helix and P-loop must also be re-positioned to co-ordinate the ATP and the substrate tyrosine for effective phosphoryl transfer. The four regulatory subdomains are highly conserved across the ErbB family, with 88% sequence identity between EGFR and ErbB2, and 77% sequence identity between EGFR and ErbB4 (Figure 1A). ErbB3 is unique in the ErbB family as it lacks specific conserved residues considered to be required for full kinase activity.

Figure 1 ErbB kinases sequence and structure

(A) Alignment of the four catalytic subdomains comprising the active site in the ErbB kinases, with accompanying alignment scores for the subdomains alone and for the entire kinase. (Uniprot accession numbers: EGFR, P005333; ErbB2, P04626; ErbB4, Q15303) (BF) Comparison of the inactive and active conformations of the ErbB kinases highlighting the conserved features (structures of EGFR pictured, PDB codes 2GS7 and 1M14 respectively). The C-loop is highlighted in red, with the active conformation shown in blue and the inactive conformation in green (B). The hydrophobic core is shown in orange (C and D), and the C-spine is shown in yellow and the R-spine is in purple (E and F). (G) The residues comprising the C-spine and R-spine for EGFR, ErbB2 and ErbB4 kinase. The residues highlighted in green are hydrophobic, the residues in red are hydrophilic and those in white are neutral. (H) The residues comprising the hydrophobic core and the αC–β4 region for EGFR, ErbB2 and ErbB4 kinase. Serine is shown in pink as it is slightly hydrophilic, although it is still considered neutral.

Previous structural studies have revealed highly conserved hydrophobic ‘spines’ within kinases that are considered important for defining their catalytic state [14,15], shown in Figures 1(E) and 1(F). The R-spine (regulatory spine) consists of four hydrophobic side chains (Met742, Leu753, His811 and Phe832 in EGFR) anchored by an aspartic acid residue in the αF-helix (Asp872 in EGFR). The R-spine spans several key regulatory subdomains, and co-ordinates the motion of the N- and C-lobes of the kinase [14]. The C-spine (catalytic spine) involves eight hydrophobic side chains (Val702, Ala719, Leu774, Val819, Leu820, Val821, Thr879 and Leu883 in EGFR) that help support and co-ordinate the adenine ring of ATP in the active state [15]. Similarly, in the inactive state there is a small hydrophobic ‘core’ formed between the αC-helix and the A-loop, which maintains the kinase in the inactive conformation (Figures 1C and 1D). Disruption of this hydrophobic core by single point mutations has been shown to activate EGFR [6,8,1618].

EGFR stands out among RTKs in appearing not to require A-loop phosphorylation for its activity [19]. Mutation of the EGFR A-loop Tyr845 to a phenylalanine residue does not appear to change the level of receptor auto-phosphorylation in cells or in vitro [20], in contrast with other kinases such as IRK (although Tyr845 is phosphorylated by Src in EGFR signalling [21]). Crystal structures have confirmed that the EGFR and ErbB4 kinase domains can adopt active-like conformations even without Tyr845 (Tyr850 in ErbB4) phosphorylation [16,22], and have revealed an allosteric mechanism for kinase domain activation [20]. Activation of the EGFR tyrosine kinase domain involves the formation of an asymmetric ‘head-to-tail’ dimer in which one kinase domain (the ‘receiver’) becomes activated through allosteric changes arising from contacts between its N-lobe and the C-lobe of its neighbour (the ‘activator’). The C-lobe of the activator kinase appears to play a cyclin-like function in activating its dimerization partner (the receiver). The importance of the asymmetric dimer interface was confirmed by mutational studies in EGFR and ErbB4 [20,22]. Further studies have shown that the intracellular juxtamembrane region of the receptor also contributes to formation of the asymmetric dimer interface, in a manner that is necessary for maximal activation [2325].

Considering the high degree of sequence similarity and structural homology across the ErbB family members (Figures 1A, 1G and 1H), we sought to identify the degree to which molecular mechanisms of activation are conserved across the ErbB family, and to identify differences in overall function that arise from variability in primary structure. Previously, we and others have hypothesized the existence of distinct networks of intramolecular non-covalent bonds that characterize the active and inactive conformations of kinases (for Lyn [26,27], Abl [28], EGFR [2830] and ErbB2 [31]), with transitions between the states necessitating a shift in these bond networks. In the present paper, we describe bioinformatics and fluctuation analyses of MD (molecular dynamics) trajectories of ErbB kinase domains and relate sequence similarities to correspondence of specific bond-interaction networks and resemblances in collective dynamical modes. We investigate how the various stimuli/perturbations, such as dimerization, phosphorylation of the A-loop tyrosine and mutations seen in cancer patients, have an impact on both the active and inactive conformations of the ErbB family kinase domains. The solvated systems of the truncated ErbB family kinases we show in the present study even have a physiological relevance to cell studies. The protein tyrosine kinases Src and Abl have a highly similar active structure to those in RTKs [2,32]. Furthermore, ErbB4 is cleaved from the membrane into the s80 protein, a fully active soluble form of the ErbB4 kinase domain [13].


MD simulations

Models for ErbB1 (EGFR) kinase were derived from the PDB 1M14 (active) and 2GS7 (inactive) structures [16,20]. Models for ErbB4 were derived from the structures described by Qiu et al. [22], PDB codes 3BCE and 3BBW. Structures for ErbB2 were constructed using homology modelling following the procedure described in [31]. Models for kinase dimers were constructed on the basis of the asymmetric dimer interface described previously [20]. Each system was simulated as a fully atomistic, explicitly solvated system in NAMD [33], using the CHARMM 27 forcefield [34]. The missing hydrogens in the protein were added using the hbuild plugin in the VMD algorithm [35]. To simulate a physiological pH of 7.0, the histidine residues were constructed with +1 protonation state on the δ-nitrogen. The entire system consisted of the protein, water molecules (TIP3P model [36]) and ions at 75 mM ionic strength to simulate physiological conditions; the water molecules and ions were placed at locations of electrostatic extrema (maxima for negative Cl ions and minima for positive Na+ ions) as determined by a Debye–Huckel potential using Solvate 1.0 ( The placement of the counter-ions was also restricted to 8 Å (1 Å = 0.1 nm) away from any protein residue. The Rattle algorithm [38] was employed to constrain the hydrogens to allow for a 2 fs timestep of integration. Periodic boundaries were implemented in all three dimensions and long-range electrostatic interactions were accounted for using the Particle Mesh Ewald algorithm [39]. A conjugate gradient algorithm was employed for all energy minimizations. The solvated system was energy-minimized and the volume of each system was subsequently equilibrated with CPT (constant pressure and temperature) simulations. Both the temperature and pressure were constrained using a Langevin algorithm [40]. The pressure Langevin piston was set with a reference pressure of 1 atm, a mass of 2000 amu and a collision frequency of 5 (1/ps), and the temperature Langevin piston was set with a reference temperature of 300 K and a mass of 10 000 kcal·ps2 (1 kcal =4.184 kJ). Following volume equilibration, the energy of the system was equilibrated using constant volume and temperature (NVT) simulations. Each system was simulated for at least 10 ns (see Supplementary Table S1 for simulation times, available at For the dimer systems, restraints of 1 kcal/Å2 were placed on each kinase to their initial state in the inactive asymmetric dimers to maintain the equilibrated structures while removing steric clashes. The kinase systems were equilibrated and then the restraints were reduced to 0.5 kcal/Å2 and 0.1 kcal/Å2. Following energy and volume equilibration (see Supplementary Figure S1 at, the production simulations were performed.

Analyses of MD simulations

RMSD (root mean square deviation) calculations were performed using the RMSD tool plugin in VMD by first removing global translation and rotation, and then computing the RMSD of the selected sub-regions (A-loop, C-loop, P-loop and αC-helix) relative to a reference structure (the respective active or inactive crystal structure). PCA (principal component analysis) was performed using the software Carma [41], by constructing the covariance matrix of atomic fluctuations σ in Cartesian space and then diagonalizing σ to obtain the principal components (eigenvalues and eigenvectors). An analysis of hydrogen-bond patterns in the production simulations was performed using CHARMM in conjunction with VMD. The trajectories were first analysed in CHARMM with a hydrogen-bond cut-off of 3.4 Å and a cut-off angle of 150°. CHARMM was used to generate a list of hydrogen bonds that were present in at least 60% of the trajectory; we note that the threshold of 60% was varied from 50–80% in our sensitivity analysis and the results were not altered significantly. These hydrogen bonds were then visualized in VMD and any bonds not providing a consistent sustained bond were removed to reveal the persistent hydrogen bonds. During the hydrogen-bond analysis, acidic and basic residues formed strong hydrogen bonds. Such hydrogen bonds were considered salt bridges if they were between the side chain of an acidic and basic residue, with a bond length less than 1.6 Å and present in the majority (>60% of the trajectory) of the simulation. Statistical hydrogen-bonding criteria was chosen to highlight the contributions of specific residues within the protein regardless of its surrounding energetic environment [42]. SASA (solvent-accessible surface area) values were calculated in VMD using the measure SASA module using a probe radius of 1.4 Å larger than the van der Waals radius. The SASA was calculated for each step in the trajectory from which the mean and S.D. were computed. As an alternative measure of hydrophobicity in heterogeneous environments, following Garde and co-workers [43,44], normalized water density fluctuations were computed by recording the ratio of VolN×(〈N2〉−〈N2〉)/〈N2〉, where VolN is the volume of interest, 〈N2〉−〈N2〉 is the variance in the fluctuation of water number in VolN, and 〈N〉 is the mean associated with the number of water molecules. We chose the region of interest to be within 5 Å of a specified hydrophobic sub-region in the EGFR monomer kinase. The results were then divided by the water density fluctuations of bulk water for normalization. Although the results are presented for a cut-off of 5 Å, other cut-offs ranging from 3 to 15 Å were investigated and similar trends were recorded.


Following MD simulation of each active or inactive monomeric kinase system for at least 10 ns, the time evolution of the RMSD was used to monitor equilibration and to track any reorganization of the A-loop and αC-helix conformations; no conformational switching towards active or inactive states was observed (see Supplementary Figure S2 and the Supplementary online data at Whereas the majority of the protein backbone, including the C-loop and the P-loop, aligns closely between the inactive and active states, the A-loop and αC-helix conformations differ considerably. In transitioning from the inactive to the active conformation, the αC-helix rotates toward the C-lobe, with the rotating end shifting by ~9 Å toward the base of the cleft between the N- and C-lobes (Figure 1B). This helix is also extended by two turns (involving residues 728–732 in EGFR) in the active conformation compared with the inactive conformation. In the inactive kinase, the A-loop maintains a ‘closed’ conformation (mainly through inter-region hydrogen bonds) and partially blocks the catalytic site. By contrast, the A-loop appears ‘unfurled’ in the active kinase, and lies against the C-lobe (blue in Figure 1B).

PCA reveals a tightly co-ordinated motion in all active ErbB members

PCA revealed that motions occurring within the active sites of the inactive and active EGFR monomers differ significantly, particularly in the A-loop (see Figure 2). The inactive EGFR monomer exhibits large-amplitude motion in both the αC-helix and the A-loop (4.95 and 7.34 Å respectively), with smaller fluctuations in the P-loop and C-loop (4.17 and 3.65 Å respectively). We attribute the larger amplitudes to a more flexible protein segment; this is consistent with the observation that part of the activation loop is unresolved in almost all of the crystal structures of ErbB kinases to date. Our previous work with the loop modelling program MODELLER [45] discusses the sensitivity of the modelled region (which is absent from the crystallographic structures). In particular, several candidate structures do not show significant differences in dynamics [31]. Moreover, alternate structures of ErbB4 also using MODELLER are remarkably close to initial simulation structures (results not shown). In contrast, the active EGFR monomer demonstrates a uniform level of motion across all four subdomains of the active site with only low-amplitude fluctuations (2–3 Å), and shows no significant local deformations. This implies that the motions are more tightly co-ordinated across the active site in the active conformation than in the inactive conformation. The kinase monomers of the other ErbB family members (modelled ErbB2 and the ErbB4 structure) demonstrate similar motions, as shown in Figure 2. The inactive ErbB4 kinase exhibits a dominant motion in the A-loop (6.46 Å), whereas the P-loop, C-loop and the αC-helix undergo smaller lateral motions (4.45 Å, 2.64 Å and 2.30 Å respectively). In the active conformation, the ErbB4 kinase presents a fluctuation profile similar to EGFR: the subdomains all have similar small-amplitude motions (2–3 Å) with no large local deformations. Thus for the three homologous members of the ErbB family, which have a high degree of sequence similarity, not only are the principal motions conserved across the systems, but the characteristic differences between the inactive and active kinase conformations are maintained in character. This finding suggests large (and possibly similar) differences in the internal network of bonds between the two activity states of each kinase. Indeed, the identified bonding network (Table 1, and Supplementary Table S2 at shows clear conservation across the members of the ErbB family, and also reflects the differences in principal motions between the inactive and active conformations. In general, the inactive conformations have significantly fewer persistent bonds in this network when compared with the active conformations, consistent with the larger amplitude motion.

Figure 2 Visualization of the first principal component for the four key subdomains in the inactive and active conformation of the ErbB kinases

The motions are overlaid sequentially where the large-amplitude motion in each frame is highlighted in red and the low-amplitude motion is highlighted in blue. For the two dimer systems, the activating dimer is shown in orange and green. The inactive conformations exhibit large localized motions, whereas the active conformations demonstrate smaller co-ordinated motions.

View this table:
Table 1 Persistent hydrogen bonds and salt bridges in the three homologous ErbB kinase monomer systems

The salt bridges are in bold and homologous bonds are aligned. The Table presents the subdomains with key conserved bonds across the ErbB family, see Supplementary Tables S2, S3 and S4 (at for an exhaustive list.

Activating bond network

We find that several bonds are conserved across all members of the ErbB family in the active state (EGFR numbering is used in this discussion: see Table 1): two salt bridges, Glu734–Lys851 and Glu738–Lys721; three hydrogen bonds, Leu834–Arg812, Lys836–Val810 and Leu838–Arg808; as well as the bond between Asp813 and Arg817, which is a salt bridge in EGFR and ErbB4, but an hydrogen bond in ErbB2. The Glu738–Lys721 salt bridge is highly conserved across all active kinases and helps co-ordinate the α- and β-phosphates of ATP bound in the active site. The Glu734–Lys851 salt bridge connects the A-loop and the αC-helix, co-ordinating the movements of these two subdomains and dampening larger fluctuations. Similarly, three conserved hydrogen bonds link the A-loop and the C-loop, coupling the motions of these two loops. These can be regarded as ‘fastening’ hydrogen bonds that maintain the N-terminal side of the A-loop open in its active state – the alternative (in the inactive state) being steric hindrance to the binding of ATP and peptide substrates. A fourth hydrogen bond is seen only in ErbB2: Glu876–Arg898 fastens the C-terminal side of the A-loop open [31]. Although there is no analogous bond in EGFR and ErbB4, the Tyr845–Tyr867 bond in EGFR and the homologous Tyr850–Phe872 bond in ErbB4 serve a similar role. The conserved Asp813–Arg817 bond positions the side chain of the catalytic Asp813 in the active site and thus probably facilitates the preorganization of the catalytic site in the active kinase system.

Inactivating bond network

In contrast with the case for the active configurations, few intramolecular bonds in the inactive kinase conformation are conserved across the ErbB family (Table 1 and Supplementary Table S2). In fact, only one such bond is conserved across EGFR, ErbB2 and ErbB4: a hydrogen bond between Met742–Leu753 that pins the C-terminal end of the αC-helix in its location away from the active conformation. The inactive kinases do share an autoinhibitory pattern in which key residues required for kinase activation are sequestered. Similarly to the situation described for Lck [26,27], Glu738 of EGFR is salt-bridged to Lys836 in a manner that ‘sequesters’ this glutamate residue and prevents it from forming the highly conserved Glu738–Lys721 salt bridge [31] required for full activation. For ErbB2 and ErbB4, the homologous Lys836 is replaced with an arginine residue. In ErbB4, this arginine residue contacts additional residues (as well as the Glu738 equivalent), apparently dampening the fluctuations of the αC-helix. In ErbB2, the arginine residue that replaces the Lys836 equivalent is flipped away from the αC-helix, so cannot sequester the Glu738 residue equivalent. In ErbB2 and ErbB4, the other half of the highly conserved salt bridge is sequestered; namely, the homologous Lys721 interacts with the Asp831 side chain, and this interaction in turn prevents Lys721 from forming the co-ordinating salt bridge.

Activation in the asymmetric dimer occurs through disruption of the inactivating bond network

We also analysed fluctuations in the EGFR kinase within the context of the asymmetric dimer described by Kuriyan and co-workers [20]. The fluctuations recorded for active EGFR in this context are very similar to those seen in the active EGFR monomer (Figure 2), with the conserved bonds described above being mostly preserved (see Supplementary Table S3 at Slight variations in bond patterns in the active configuration include a shift from Glu734–Lys851 and Leu838–Arg808 in the monomer to Glu734–Lys836 and Ala840–Arg808 in the dimer, but dimerization has little overall effect on the activating bond network. In contrast, in the inactive dimer the first principal component reveals substantial motion of the αC-helix that is much greater than seen in the inactive monomer system (a 7.49 Å shift in the dimer compared with 4.14 Å in the monomer, see Figure 2). Simulations of the symmetric dimer interface of the inactive kinase in the unphosphorylated and phosphorylated states does not result in any conformational switching (see Supplementary Figure S3 at, each system is stable in the inactive state. Furthermore, in previous simulations of ErbB2 [31] we used loop modelling to reduce the effect of steric clashes with no significant differences in dynamic motions. Therefore we attribute the motion of the αC-helix uniquely to the introduction of the asymmetric dimer interface. Even in the short timescale of 30 ns of the EGFR dimer trajectory, we observe a rearrangement of the αC-helix position towards the active conformation. Consistent with the allosteric activation mechanism proposed by Zhang et al. [20], several interactions in the inactivating bond network surrounding the A-loop and the αC-helix are indeed disrupted in the dimer trajectory, including Tyr740–Ser744, Leu834–Asp813, His846–Arg865 and Lys851–Arg812 interactions (see Supplementary Table S3). Some bonds (e.g. Glu738–Lys836) are still present, although the population statistics indicate that their survival percentage (fraction present in the trajectory) has decreased from >90% in the inactive monomer trajectory to ~60% in the inactive dimer trajectory over 30 ns.

When compared with their monomeric counterparts, the ErbB2 and ErbB4 inactive dimers demonstrate a similar loss of bonds surrounding the αC-helix and the A-loop (see Supplementary Figure S4, Supplementary Table S3 and the Supplementary online data at, and [31]). For ErbB4, a list of bonds disrupted upon dimerization includes Glu739–Arg841, Asp742–Arg841, Glu743–Arg817, Gly838–Arg817, Gly855–Glu730 and Lys856–Glu844. Similarly to the Glu738–Lys836 salt bridge in EGFR, the Glu743–Arg841 salt bridge in ErbB4 shows a marked decrease in survival time from >90% in the monomer trajectory to ~70% in the dimer trajectory. Moreover, two of the bonds broken (Glu739–Arg841 and Asp742–Arg841) involve the Arg841 residue, resulting in a significant weakening of the bonds in the inactivating bond network discussed above that sequester key side chains in the inactive state.

EGFR mutations activate the kinase by disrupting the inactivating bond network

In light of the allosteric activation mechanism described in the context of the EGFR dimer system, we examined the effect of EGFR-activating mutations clinically identified in NSCLC. The two mutants we examined are those most commonly found in NSCLC: a small in-frame deletion in the αC-helix (del 723–729 ins S) and a point mutation in the A-loop (L834R) (Figure 3). The deletion mutant physically shifts the αC-helix, removing residues 723–729 and leading to the shortening of a disordered segment that adjoins the αC-helix. This shifts the αC-helix towards the active state. The shortening of the αC-helix also alters the overall movement of the αC-helix, with a distinctly different motion from all other systems (Figure 3). The bond patterns we recorded in the dimer of the deletion mutant are similar to those of the wild-type inactive EGFR dimer (see Supplementary Table S3 and the Supplementary online data). The majority of the differences between the two systems are seen in the αC-helix: the deletion mutant loses two bonds seen in the wild-type inactive dimer between the αC-helix and the A-loop: Glu738–Phe832 and Glu738–Gly833. The L834R mutation alters the conformation of the A-loop slightly; however, our trajectory shows that the mutant kinase is still in a distinctly inactive conformation. The L834R mutant dimer system shares a similar bond pattern (see Supplementary Table S3) with the wild-type inactive EGFR dimer system with some exceptions: namely, Gly833 has hydrogen-bonded with His811, similar to the N-terminal fastening bonds that couple the A-loop and C-loop in the active state. Moreover, Arg834 has hydrogen-bonded to Arg865, representing a fastening bond to the C-lobe.

Figure 3 Effect of EGFR-activating mutations identified in NSCLC

Top panels, comparison of the first principal component for the EGFR inactive mutant dimer systems: del 723–729 ins S and L834R. Bottom panels, mapping of the RMSD of the inactive EGFR dimers as a scatter plot with the RMSD from inactive on one axis and the RMSD from active on the other. The end structures are visualized with an active reference (orange) and an inactive reference (purple). The conformational switching between active and inactive EGFR of the A-loop and αC-helix are separable. aC-helix, αC-helix; Del, deletion; WT, wild-type.

Hydrophobic interactions help differentiate active and inactive conformations

To investigate the effect of hydrophobic interactions on the ErbB kinase conformations, we analysed the hydrophobicity as well as the SASA of relevant hydrophobic sub-regions, namely, the C-spine, R-spine, hydrophobic core and the αC–β4 region (Figure 1). The four regions have a high percentage of hydrophobic side chains; however, some minor differences between members of the ErbB family exist (Figure 1), particularly in the αC–β4 region. The hydrophobicity of the sub-region is a non-additive quantity, dependent not only upon the primary structure, but also on the surrounding environment. Garde and co-workers [43,44] have recently proposed an approach for quantifying the hydrophobicity of heterogeneous surfaces using normalized water density fluctuations (see the Experimental section), according to which increased normalized water density fluctuations are used as a signature of a more hydrophobic surface.

The active conformations in the C-spine are noted by the elevated normalized water density fluctuations, with the exception of the EGFR monomer (where the difference is minor) (Figure 4A). The active conformations also consistently expose a mean surface area of approximately 500 Å2 across the members of the ErbB family (represented as a horizontal line), which is consistently less than the exposed surface area in the inactive conformations (mean value of over 700 Å2) (Figure 4B). Together, these trends reflect a more hydrophobic C-spine region in the active compared with the inactive conformation. The hydrophobicity of the R-spine is not significantly different between the active and inactive conformations (Figure 4C). However, the R-spine analysis does reveal that the active conformations have a lower mean SASA value than the inactive conformations (Figure 4D); for some systems, the difference is not so clearly delineated, in part because it consists of just four residues, so SASA is subject to larger fluctuations. With similar levels of water density fluctuations between active and inactive conformations, the decreased mean SASA values for the R-spines in the active conformations imply a more stable hydrophobic context, consistent with the finding that the well-formed spines are characteristic of the active kinase conformations. Interestingly, all ErbB2 systems present a higher fluctuation in water density, which is consistent with the notion that hydrophobicity is particularly important in the context of ErbB2, owing to its interaction with Hsp90 (heat-shock protein 90) that is known to be mediated by hydrophobic contacts; see also the analysis surrounding the αC–β4 region discussed below. We also record a decrease in the R-spine SASA for inactive EGFR in the dimeric form compared with the monomeric form, with similar hydrophobicity (Figures 4C and 4D). The addition of the dimer interface in inactive EGFR reduces the mean SASA for the R-spine from 140 Å2 to 80 Å2, implying that dimerization can provide additional stabilization due to hydrophobic interactions in EGFR.

Figure 4 Normalized water density fluctuations and the mean SASA (exposed) values for functionally important sub-regions

Normalized water density fluctuations and mean SASA values for the C-spine (A and B) and the R-spine (C and D). For the normalized water density fluctuations (A and C), a higher value is correlated with a higher hydrophobicity, which would be expected to have lower SASA. See also Figure 5.

With respect to the water density fluctuations in the hydrophobic core, the inactive conformation is more hydrophobic than the active conformation (with the exception of ErbB2) (Figure 5A). This difference is consistent with the hydrophobic stabilization of the inactive conformation, especially for EGFR and ErbB4, but not for ErbB2. Notably, mutations of hydrophobic residues in the hydrophobic core are reported for EGFR and ErbB4 in clinical studies (see the Discussion section), whereas for ErbB2, such mutations are found surrounding the αC–β4 region. The analysis in Figure 5(A) also implies that the difference in hydrophobicity between the inactive and active conformations is reduced in the EGFR dimer compared with the monomer. This is consistent with the allosteric activation mechanism, namely that dimerization significantly reduces the hydrophobic advantage and provides a stimulus for activation. As for the SASA results, the hydrophobic core does not show a clear separation between the active and inactive conformations (Figure 5B). However, to describe the highly irregular nature of the surface of the hydrophobic core, the water density analysis may be more suitable that the SASA.

Figure 5 Normalized water density fluctuations and the mean SASA (exposed) values for functionally important sub-regions

SASA values for the hydrophobic core (A and B), the αC–β4 region (C and D) and the dimer interface (E and F). For the normalized water density fluctuations (A, C and E), a higher value is correlated with a higher hydrophobicity, which would be expected to bury more SASA.

The αC–β4 region is an unstructured span between the αC-helix and the β4 sheet in RTKs. From a sequence perspective, only in ErbB2 is the αC–β4 region predominantly hydrophobic (Figure 1, see primary sequence; Gly776 in ErbB2 corresponds to Ser744 and Ser749 in EGFR and ErbB4 respectively; Gly778 in ErbB2 corresponds to Asp746 and Asp751 in EGFR and ErbB4 respectively). The water density analysis clearly reflects this trend by singling out the ErbB2 monomer systems as being particularly hydrophobic (Figure 5C); the mean SASA for the αC–β4 region in ErbB2 systems is also consistently lower than in other members of the ErbB family (Figure 5D). As discussed in [31], this unique feature of ErbB2 is thought to be responsible for its preferential association with the molecular chaperone Hsp90.

The asymmetric dimer interface consists largely of hydrophobic side chains in the N-lobe of the receiver kinase (Leu680, Ile682, Leu736, Leu758 and Val762) and the C-lobe of the activator kinase (Ile917, Tyr920, Met921, Val924 and Met928) [20]. With respect to water density analysis, the dimer interface presents a similar hydrophobicity across all members, with the ErbB2 monomers a little more hydrophobic, particularly in the active conformation (Figure 5E). The monomeric systems record a mean SASA value of approximately 350 Å2 for the dimer interface residues in the active conformations compared with 200 Å2 in the inactive conformations (Figure 5F). This decrease for the inactive monomers may imply a preference for the inactive state in the monomer context. Notably, the dimeric systems record much lower SASA values of approximately 75 Å2, implying that hydrophobic stabilization provides a dominant driving force for dimerization; interestingly, for an EGFR dimer, the SASA for the inactive state is greater than that for the active case and hence the preference for the inactive conformation is not implied in the context of the dimer. Thus dimerization also provides a stimulus for activation.


The analysis described in the present study identifies conserved intramolecular non-covalent bond networks across the ErbB family kinases that emerge from sequence homology and lead to conserved dynamic characteristics as well as function. The hydrophilic and hydrophobic networks are co-operative interactions which help differentiate the active and inactive conformations while modulating key loop fluctuations and bonds. Investigation of the networks allows identification of network fragilities, which can be exploited through mutations to alter the basal kinase activity.

The bond networks highlight key conserved residues and bonds that are characteristic to each conformational state. Some of the bonds have been highlighted previously, for example the highly conserved salt bridge Glu738–Lys721 [16] and the fastening bonds Leu834–Arg812, Lys836–Val810 and Leu838–Arg808 [31]. We add to this another salt bridge co-ordinating the αC-helix and the A-loop, the Glu734–Lys851 salt bridge and the hydrogen bond Asp813–Arg817 which facilitates the placement of the Asp813 side chain in a catalytically competent orientation. Overall, the six conserved interactions tightly co-ordinate the N-lobe to the αC-helix, the αC-helix to the A-loop, and the A-loop to the C-loop. The nature of this bond network suggests that interactions among the nearest key subdomains in the ErbB family helps in maintaining the proximity of important catalytic residues, and positioning them so they can contribute directly to kinase activity. The conserved bond network in the inactive conformations of the ErbB family kinases is less extensive (in EGFR, Glu738–Lys836 and Asp831–Lys721), but appears to serve a crucial role in sequestering key catalytic residues and thus preventing activity.

Hydrophobic interactions appear to provide context-specific contributions to stability to the active and inactive conformations of ErbB kinases. Since a single amino acid change can alter the hydrophobicity of a region, we considered the water fluctuation analysis in conjunction with SASA to present the contributions from hydrophobicity for irregular surfaces characteristic of sub-regions in the kinase. Both SASA and fluctuation analyses of local water densities imply that the inactive monomer conformations of ErbB kinases are preferentially stabilized through hydrophobic interactions associated with the dimer interface. Moreover, consistent across the ErbB family, kinase domain dimerization further reduces the SASA of the dimer interface residues, implying that hydrophobic interactions provide a dominant impetus for dimerization.

The hydrophobic interactions with respect to the C-spine and R-spine regions are found to benefit the active conformations overall, in particular, the SASA analysis suggests that the active conformations benefit from a larger buried surface area. Interestingly, we found ErbB2 to possess a greater hydrophobicity in the spine regions. Thus the αC–β4 region as well as the C-spine and R-spine, are found to display uniquely hydrophobic characters in ErbB2, consistent with its association with Hsp90. Also, in the case of the ErbB2 dimer interface, the active conformation is more hydrophobic than the inactive conformation, but also exposes more surface area, implying a destabilizing hydrophobic force in the active conformation; considering that ErbB2 does not have a known ligand, this could serve as an auto-inhibitory mechanism against activation. It is also intriguing to note that in the hydrophobic core, there is a large differential between the inactive and active conformations in the water fluctuations, consistent with preferential benefit for the inactive conformation in EGFR and ErbB4, but not for ErbB2. Thus the analyses for the spine regions and the hydrophobic core collectively lead to the remarkable prediction that although the αC–β4 region and the R-spine confer hydrophobic benefit to the inactive conformation of ErbB2, the hydrophobic core has the same effect for EGFR and ErbB4. Indeed, this correlates well with clinical studies, where activating point mutations in the hydrophobic core have been found in EGFR and ErbB4, but those in the αC–β4 region are found in ErbB2, suggesting that the hydrophobic analysis enables the context-specific identification of fragile sub-regions.

The active kinase conformations have a greater number of conserved intramolecular persistent bonds (hydrophilic-specific interactions) than the inactive systems, whereas the inactive kinase conformations appear to have a distinct hydrophobic advantage in the dimer interface region. This observation is fully consistent with the view that for ErbB kinases, the dominant stimuli that activate them operate by modulating the dimer interface (activation mechanism in the wild-type) and the hydrophobic core regions (mode of activation in clinically identified mutation) to destabilize the inactive conformation, which we discuss below.

For the wild-type systems, this finding provides strong support for the allosteric mechanism associated with formation of the asymmetric dimer interface reported by Zhang et al. [20] and Qiu et al. [22] and the studies of juxtamembrane domain associations [2325]. In this respect, the emerging view on allosteric EGFR regulation posits that, rather than forcing the protein to a new conformation, the allosteric interface guides the kinase domain to the active state through pre-existing conformational ensembles [46]. Our results for the EGFR dimer simulations (Figures 2 and 3) are consistent with this view; namely, the allosteric dimer interface causes the αC-helix to alter the conformational space it samples and thus biases it towards a more active conformation. This leads to a metastable intermediate characterized by the αC-helix adopting an active-like conformation, albeit still remaining partially molten. We note that although we have captured the initial conformation change associated with the introduction of the allosteric asymmetric dimer interface, which suggests a shift towards the active conformation, we are unable to capture the entire conformational shift because of limited timescales accessed in our simulations. We have also not attempted to compute the free energy landscape, which will be pursued in future studies.

We hypothesize a sequence of events that characterize the EGFR kinase activation pathway (see Supplementary Figure S7 and the Supplementary online data at (i) formation of the dimer interface triggers a shift of the αC-helix conformation toward the active state; and (ii) formation of the additional helical turns in the αC-helix and the initiation of the A-loop movement either happen concurrently or sequentially, but the A-loop does not complete its conformational switching until the αC-helix is fully formed. This sequence differs from the mapped pathway of the Src kinase Hck [4749]; in Hck, following A-loop tyrosine phosphorylation, the A-loop can adopt an open (active) conformation although the αC-helix is still in the inactive conformation. In EGFR, the opposite is true; namely, upon dimerization the αC-helix moves towards the active conformation with the A-loop in the closed (inactive) conformation (see the Supplementary online data).

Intriguingly, kinase activation in the two clinically identified EGFR activating mutants studied here (L834R and del 723–729 ins S) also appears to be governed by the same principles. The L834R mutation, apart from directly disrupting the hydrophobic core in lieu of the hydrophilic arginine residue, leads to additional coupling of the A-loop to the C-loop (through the Gly833–His811 hydrogen bond) and the A-loop to the C-lobe (through the Arg834–Arg865 interaction) not seen in other systems. Hence the activating stimulus for this system possibly stems from a disruption of some of the hydrophobic interactions within the protein kinase responsible for stabilizing the inactive conformation. The deletion mutation, however, directly alters the αC-helix to cause a shift toward its active conformation, although it also disrupts the normal dynamics and potential extension of the αC-helix. Hydrophobic analysis also reveals that the L834R and deletion mutant function through different mechanisms. The deletion mutant has increased normalized water density fluctuations compared with L834R and the wild-type systems, particularly around the hydrophobic core but also increased SASA (see Figures 5A and 5B), indicating that the hydrophobic region is not shielded from water exposure. Moreover, based on Figures 5(E) and 5(F), the dimer interface for the deletion mutant does not show a pronounced hydrophobicity indicated by a lower normalized water density fluctuations. Consistent with these salient observations for the deletion mutant, in the functional studies of Choi et al. [8], the deletion mutant showed an increase in basal phosphorylation, but a decrease in the overall phosphorylation under EGF stimulation. By interfering with the key interactions surrounding the helix, it can potentially stabilize the active conformation of the monomer kinase. However, upon dimerization the active conformation is destabilized because the deformation disrupts both the dimer interface due to its proximity and because the αC-helix cannot fully form. This justifies the dual response of the deletion mutant on kinase phosphorylation activity in the presence/absence of EGF stimulation.

The hydrophilic and hydrophobic bond networks we have identified also enable us to propose a set of mechanisms that increase the basal level of kinase activity for the other EGFR mutations in NSCLC [68]: E685S, G695S, S744I and L837Q, as well as a set of mutations in ErbB4 found in melanoma [50], namely, E836K, E872K and G936R. Similar to L834R, in the L837Q mutation, replacing the hydrophobic leucine residue side chain with the hydrophilic glutamine residue side chain is likely to cause it to rotate away from the small hydrophobic core between the αC-helix and the A-loop, thereby disrupting the hydrophobic core and inducing a local steric effect. This reconfiguration may also disrupt the interaction of Lys836 with Glu738. The S744I mutation in NSCLC probably disrupts the Met742–Leu753 bond, allowing transition of the αC-helix into its active position. E685G and G695S are in the N-lobe, close to the asymmetric dimer interface, so mutations here are likely to alter kinase activity either by increasing the dimerization affinity or by reconfiguring the EGFR RTK monomer by partially mimicking the formation of the asymmetric dimer interface. Two of the ErbB4 mutations seen in melanoma, E836K and E872K, are situated around the small hydrophobic core, and the change from a negatively charged side chain to a positively charged chain is poised to disrupt the core's stability, thereby providing an activating stimulus. The G936R mutant in ErbB4 is located close to the asymmetric dimer interface in the C-lobe and potentially alters the dimerization affinity. Analogous mutations in ErbB2 have not been reported. Instead, the activating clinical mutations in ErbB2 are in the αC–β4 region [51,52], shown here to be uniquely hydrophobic in the ErbB family. Disruption of the hydrophobicity of the αC–β4 region would alter the binding to Hsp90 [53] and increase heterodimerization [54], and thus the activity of ErbB2. Characterization of another set of mutations in ErbB4 found in NSCLC show two inactivating mutations, G802dup and D861Y [55]. Asp861 is the start of the highly conserved DFG motif in RTKs and mutation of it would reduce activity, whereas the G802dup is spatially located near the P-loop and probably affects ATP-binding affinity. Similarly, in ErbB4, two of the point mutations are located near the active site: Glu836 is located next to the C-loop, whereas Glu872 is situated in the A-loop and, owing to their proximity to the active site, the mutations are also poised to alter substrate binding and phosphorylation; such mechanisms are discussed in a recent computational study in the context of EGFR (Y. Liu and R. Radhakrishnan, unpublished work).

As mentioned in the Introduction, A-loop phosphorylation is thought to not be required for ErbB family kinase activation [19]. Consistent with this notion, although we record some context-specific structural changes between systems in which the A-loop is phosphorylated compared with unphosphorylated (see Supplementary Table S4, Supplementary Figures S4–S6 and the Supplementary online data at, our results indicate that this phosphorylation does not provide a dominant activating effect. We reached a similar conclusion in recent studies of HER2 using a free energy perturbation approach [31].

In conclusion, our results help to establish the molecular context governing the stability and activation stimuli of ErbB kinases. Given that RTKs are important signalling elements in cells, and mutations within these kinases cause subtle changes in molecular behaviour that result in substantial alterations of downstream signalling [29], the present study helps to establish the crucial link between molecular mechanisms of kinase activation and the ensuing signalling response.


All of the authors helped to design the study. Andrew Shih and Shannon Telesco conducted the research and performed the simulations presented. Andrew Shih, Shannon Telesco, Sung-Hee Choi and Mark Lemmon analysed and interpreted the data. Andrew Shih, Shannon Telesco, Mark Lemmon and Ravi Radhakrishnan wrote, revised and finalized the manuscript.


This work was supported by NSF (National Science Foundation) [grant numbers CBET-0730955, CBET-0853389 and CBET-0853539]. S.E.T. received support through the NSF Graduate Research Fellowship, a GAANN (Graduate Assistance in Areas of National Need) Award and a Ruth L. Kirchstein National Research Service Award from the NIH (National Institutes of Health)-NHLBI (National Heart, Lung and Blood Institute) [grant number 2T32HL007954]. Computational resources were provided in part by the National Partnership for Advanced Computational Infrastructure [under grant number MCB060006].


The authors thank the members of the Lemmon and Radhakrishnan laboratories for helpful discussions.

Abbreviations: A-loop, activation loop; C-loop, catalytic loop; C-spine, catalytic spine; EGFR, epidermal growth factor receptor; HER, human EGFR; Hsp90, heat-shock protein 90; IRK, insulin receptor kinase; MD, molecular dynamics; NSCLC, non-small-cell lung cancer; P-loop, glycine-rich nucleotide-binding loop; PCA, principal component analysis; R-spine, regulatory spine; RMSD, root mean square deviation; RTK, receptor tyrosine kinase; SASA, solvent-accessible surface area


View Abstract