Research article

Multiple isoforms of the translation initiation factor eIF4GII are generated via use of alternative promoters, splice sites and a non-canonical initiation codon

Mark J. Coldwell, Ulrike Sack, Joanne L. Cowan, Rachel M. Barrett, Markete Vlasak, Keiley Sivakumaran, Simon J. Morley


During the initiation stage of eukaryotic mRNA translation, the eIF4G (eukaryotic initiation factor 4G) proteins act as an aggregation point for recruiting the small ribosomal subunit to an mRNA. We previously used RNAi (RNA interference) to reduce expression of endogenous eIF4GI proteins, resulting in reduced protein synthesis rates and alterations in the morphology of cells. Expression of EIF4G1 cDNAs, encoding different isoforms (f–a) which arise through selection of alternative initiation codons, rescued translation to different extents. Furthermore, overexpression of the eIF4GII paralogue in the eIF4GI-knockdown background was unable to restore translation to the same extent as eIF4GIf/e isoforms, suggesting that translation events governed by this protein are different. In the present study we show that multiple isoforms of eIF4GII exist in mammalian cells, arising from multiple promoters and alternative splicing events, and have identified a non-canonical CUG initiation codon which extends the eIF4GII N-terminus. We further show that the rescue of translation in eIF4GI/eIF4GII double-knockdown cells by our novel isoforms of eIF4GII is as robust as that observed with either eIF4GIf or eIF4GIe, and more than that observed with the original eIF4GII. As the novel eIF4GII sequence diverges from eIF4GI, these data suggest that the eIF4GII N-terminus plays an alternative role in initiation factor assembly.

  • alternative splicing
  • eukaryotic initiation factor
  • non-canonical initiation codon
  • translation initiation


Recent assessments of the human genome suggest that it only contains 20000–25000 protein-coding genes [1], far fewer than previous estimates [2] and surprisingly low compared with genomes of simpler organisms such as Caenorhabditis elegans (~13000) or Drosophila melanogaster (~18000). In contrast with the original hypothesis that one gene is translated into one protein, several mechanisms exist which generate diversity in the proteome. Multiple proteins can be produced from a single gene locus by alternative usage of promoters, splice sites or TICs (translation initiation codons). Large-scale studies of the human transcriptome have confirmed the widespread use of alternative transcriptional start sites [3] and alternative splicing [4], but much less is known about how alternative TICs create diversity in the N-termini of variant proteins, and how these alternative TICs are selected. However, examples of proteins with alternate isoforms that arise from different TICs have been found to: (i) function differently {e.g. the K2P (two-pore domain potassium) channels TREK (TWIK-1-related K+ channel) 1 and TREK2 show altered biophysical properties and ion selectivity [5,6]}; (ii) demonstrate different subcellular localization {e.g. BAG-1 (Bcl-2-associated athanogene 1) [7]}; and (iii) retain the same function but with different levels of efficiency {e.g. eIF4GI (eukaryotic translation initiation factor 4GI) [8]}.

The initiation phase of mRNA translation is a key stage in gene expression that can rapidly alter the temporal and spatial expression of either the whole transcriptome or specific mRNAs, without requiring new transcription (reviewed in [9]). The major mechanism of translation initiation in eukaryotic cells requires the assembly of the ribosome and associated initiation factors at the m7G (7-methylguanylate) cap structure at the 5′-end of an mRNA. Central to cap-dependent translation initiation is the multi-domain eukaryotic translation initiation factor eIF4G, which acts as a scaffold for the assembly of the initiation complex via contacts with the mRNA cap-binding protein eIF4E, the RNA helicase eIF4A, the ribosome-binding complex eIF3, the eIF4E kinase Mnk (mitogen-activated protein kinase signal integrating kinase) and PABP [poly(A)-binding protein] [10]. eIF4G is expressed as two paralogues in mammalian cells, eIF4GI and eIF4GII, alongside a truncated isoform termed DAP5 (death-associated protein 5) (Figure 1A). In the present study we focus on the regulation of expression of eIF4GII.

Figure 1 There are wide variations in EIF4G1 and EIF4GII mRNA levels and expression of protein isoforms in a human cell line panel

(A) Different forms of human eIF4GI arise from alternative AUG initiation codons (f–a), and are named according to the nomenclature adopted by Bradley et al. [17]. Several EIF4G1 transcripts exist, arising from different promoters and splicing events [39], but these are omitted for clarity. eIF4GII shares approximately 46% homology with eIF4GI, with the greatest homology at partner binding sites. DAP5/p97/NAT1 is homologous with the C-terminal portion of eIF4GI/eIF4GII and is translated from a conserved non-canonical GUG initiation codon. (B) Total RNA was reverse-transcribed and analysed by quantitative real-time PCR with primers specific for EIF4G1 or EIF4GII. Both values were normalized to the level of 18S rRNA in each sample. (C) Equal amounts of extract from the cell types indicated were subjected to SDS/PAGE using 4% polyacrylamide precast gels and proteins were transferred on to PVDF membranes. The membrane was then probed with the antibodies shown to visualize the isoforms of eIF4GI (indicated on the left-hand side of the top panel) and eIF4GII.

Following the assembly of the 48S pre-initiation complex at the m7G cap, the 40S ribosomal subunit and associated factors must translocate along the 5′ UTR (untranslated region) of the mRNA to the initiation codon [11,12]. Localization of the 40S ribosomal subunit to the start codon of most mRNAs is thought to occur by the ‘scanning’ mechanism, where the translational machinery translocates along the mRNA in a 5′ to 3′ direction until it reaches an AUG TIC in an efficient consensus [GCC(AG)CCAUGG] [13], which is recognized by eIF1 and eIF1A. In addition, non-canonical initiation codons can also be used in mammalian cells in specific mRNAs, with initiation demonstrated from 21 other possible TICs [14,15]. For example, DAP5 is translated from a GUG (Figure 1A) [16], and synthesis of the p50 isoform of BAG-1 begins from a CUG codon [7].

The scanning model of translation initiation postulates that an initiation codon in a good context closest to the 5′-end of the mRNA is used for initiation. However, if initiation does not occur in an efficient manner, this allows some 40S ribosomal subunits to resume scanning to a downstream TIC giving the potential for multiple isoforms of a protein with N-terminal truncations to be produced from alternative TICs. Leaky scanning, the term for this phenomenon, occurs on multiple mRNAs encoding eIF4GI, with five isoforms expressed from alternative AUGs (Figure 1A) [17,18], the mRNA encoding the shortest eIF4GIa also being expressed from an internal promoter. We have previously used RNAi (RNA interference) to reduce the expression of endogenous eIF4GI proteins, which resulted in reduced protein synthesis rates and alterations in the morphology of cells [8]. Some of the effects of eIF4GI silencing could be rescued by expressing EIF4G1 cDNAs that were immune to silencing, with the different translational isoforms being able to rescue translation rates to different extents [8]. Although an initial report suggested that eIF4GI and eIF4GII were interchangeable [19], overexpression of the eIF4GII paralogue in the eIF4GI-knockdown background was unable to restore translation to the same extent as eIF4GIf/eIF4GIe in our previous study [8]. This observation was perhaps not surprising given that the overall conservation of these proteins is 46%, with 61% conservation from the eIF4E-binding site to the C-terminus, but only 22% when considering the N-terminus to the eIF4E-binding site. However, in our previous work we were using a single EIF4GII (encoding eIF4GII) cDNA which matched the published sequence originally identified in a study which suggested that this paralogue was functionally similar to eIF4GI [19]. 
However, further work has shown that the two proteins are cleaved by picornaviral proteases with different kinetics [20], different fragments are produced by caspase-mediated cleavage during apoptosis [21], and eIF4GI, but not eIF4GII, can be cleaved by HIV protease [22]. Furthermore, eIF4GII has been reported to play a particular role in the assembly of pre-initiation complexes following differentiation [23].

In the present paper we report the discovery of multiple isoforms of eIF4GII which arise from multiple promoters, alternative splicing events and the use of an upstream non-canonical CUG initiation codon. We now show that the rescue of translation in eIF4GI/eIF4GII-knockdown cells by a novel isoform of eIF4GII was as robust as that observed with either eIF4GIf or eIF4GIe, and more robust than that observed with the original eIF4GII sequence. These results suggest that the novel N-terminal sequences in this isoform may play an important role in initiation factor assembly.



Sequences of oligonucleotide primers used are provided in Supplementary Table S1. In all cases, the presence of the correct siRNA (short interfering RNA) or EIF4G sequence was verified by restriction digestion and automated sequencing. Plasmids expressing eIF4GI-specific siRNA hairpins have been described previously [8]. Potential siRNA target sites (si59, si243, si354 and si355) were identified in the EIF4GII sequence using the siRNA design tool, and corresponding oligonucleotides were inserted into the pSilencer 3.0 H1 plasmid (Ambion). For each eIF4GII-specific siRNA, a corresponding control siRNA containing mismatches at the centre of the target site was created, and the oligonucleotides were inserted into the same vector. The vector pcDNAmyc and variants containing different EIF4G1 ORFs (open reading frames) and the original EIF4GII ORF [19] have been described previously [24]. The QuikChange® Site-Directed Mutagenesis method (Stratagene) was used either to introduce nucleotide changes in vectors containing EIF4GII cDNA sequences that would maintain the correct protein sequence but prevent siRNA-mediated degradation of the exogenous mRNA, or to mutate TICs. The vector pcDNA_3F containing a triple FLAG tag following a multiple cloning site was constructed by annealing the oligonucleotide pair (3F forward and 3F reverse), then ligating into pcDNA3.1(+) (Invitrogen) digested with XhoI and XbaI.

Cell culture and transient transfection

Materials for tissue culture were from Invitrogen and FBS (fetal bovine serum) was from Labtech International. Cell lines were HeLa (cervical cancer), MCF-7 (breast adenocarcinoma), A549 (epithelial lung adenocarcinoma), U2OS (bone osteosarcoma epithelial cells), Raji (Burkitt lymphoma cell line), UVW and T98G (glioma), SH-SY5Y (neuroblastoma), HEK (human embryonic kidney)-293T and MRC-5 (fetal lung fibroblasts). All cells were obtained from the HPACC (Health Protection Agency Culture Collections, Salisbury, U.K.) and maintained in DMEM (Dulbecco's modified Eagle's medium; for HeLa and HEK-293T) or RPMI 1640 medium (for MCF-7) supplemented with 10% FBS at 37°C in a humidified atmosphere containing 5% CO2. For transient transfection, cells were seeded on to 5 cm plates at a density of 100 000 cells per plate and, 24 h later, were transfected with 1 μg of pSilencer DNA and 2 μg of isoform-specific pcDNAm4G using FuGENE® 6 (Roche), according to the manufacturer's protocol. The transfection mixture was removed 24 h later, and the cells were washed twice with PBS and further incubated in fresh medium. Preparation of cell lysates, measurement of protein synthesis rates by incorporation of [35S]methionine into protein and m7GTP (7-methylguanosine-triphosphate)–Sepharose affinity isolation of eIF4E and associated factors were carried out as described previously [8].

RNA methods

Total cellular RNA was isolated using the RNAqueous kit (Ambion) according to the manufacturer's protocol. Total cDNA for quantitative PCR assay was prepared from total RNA with the Improm II Reverse Transcription system (Promega) using random primers. The QuantiTect SYBR Green PCR kit (Qiagen) was used to quantify levels of mRNA or rRNA, using primers described previously [8] or described in Supplementary Table S1. Primers were designed using the Primer Express Program v2.0 (Applied Biosystems). Amplification reactions were carried out and analysed using the Applied Biosystems 7500 Real Time PCR System. For 5′-RACE (rapid amplification of cDNA ends) experiments, mRNA was purified using the Dynabeads mRNA DIRECT Kit (Invitrogen) and 5′-RACE was performed using the First Choice RLM (RNA-ligase-mediated)-RACE kit (Applied Biosystems), with gene-specific primers. Briefly, mRNA from three cell lines (HEK-293T, HeLa and MCF-7) was treated with alkaline phosphatase to remove phosphates from any uncapped messages. Following purification, tobacco acid pyrophosphatase was used to remove m7G caps from intact mRNAs, leaving a phosphate which was used to ligate an RNA oligonucleotide on to the 5′-end of all previously capped mRNAs. These oligonucleotide-capped messages were then subjected to reverse transcription, after which 5′-ends were amplified with a high-fidelity polymerase, using forward primers complementary to the RNA oligonucleotide and reverse primers complementary to portions of exon 6, and then exon 5, of EIF4GII.
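The selectivity of the RLM-RACE procedure for previously capped messages can be illustrated with a minimal sketch. The model below is a simplification written for illustration only: the class `Rna`, the function names and the example pool are hypothetical, and each enzyme is reduced to its effect on the 5′-end of the RNA.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rna:
    """Toy RNA molecule described only by the state of its 5'-end."""
    name: str
    five_prime: str          # "cap", "phosphate" or "hydroxyl"
    adapter: bool = False    # True once the RNA oligonucleotide is ligated


def alkaline_phosphatase(rna):
    # Removes exposed 5' phosphates from uncapped RNAs; capped RNAs are unaffected.
    if rna.five_prime == "phosphate":
        return replace(rna, five_prime="hydroxyl")
    return rna


def tobacco_acid_pyrophosphatase(rna):
    # Removes the m7G cap and leaves a 5' phosphate on the formerly capped RNA.
    if rna.five_prime == "cap":
        return replace(rna, five_prime="phosphate")
    return rna


def ligate_oligonucleotide(rna):
    # Ligation of the RNA oligonucleotide requires a 5' phosphate.
    if rna.five_prime == "phosphate":
        return replace(rna, adapter=True)
    return rna


def rlm_race(pool):
    """Apply the three steps in order and return names of adapter-tagged RNAs."""
    steps = (alkaline_phosphatase, tobacco_acid_pyrophosphatase, ligate_oligonucleotide)
    tagged = []
    for rna in pool:
        for step in steps:
            rna = step(rna)
        if rna.adapter:
            tagged.append(rna.name)
    return tagged


# Hypothetical pool: one intact capped mRNA and one uncapped degradation fragment.
pool = [Rna("capped_mRNA", "cap"), Rna("degraded_fragment", "phosphate")]
print(rlm_race(pool))
```

In this simplified model only the capped message acquires the oligonucleotide, so only genuine 5′-ends are available for amplification with the oligonucleotide-specific forward primer.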

Following amplification, the resultant EIF4GII 5′-ends were ligated into pGEM T-Easy (Promega) and sequenced. All RACE-generated cDNA sequences are available through the NCBI EST (expressed sequence tag) database [25], with sequences and accession numbers listed in Supplementary Table S2. Novel EIF4GII 5′ cDNA ends were amplified from these plasmids with forward and reverse primers containing respective NheI and XhoI restriction sites, allowing their introduction into the pcDNA_3F plasmid. cDNAs encoding extended ORFs were amplified and fused on to the existing EIF4GII cDNA by megaprimer PCR [26], using the original EIF4GII cDNA as the template. In the first round, the primers HindIII 4GII CUG and 4GIIinner RACE were used. The amplified product was subsequently used as the forward primer in a second PCR with 4GII 5′R BspEI as the reverse primer. The resulting product was digested with HindIII and BspEI and used to replace the AUG-initiated portion of EIF4GII cDNA in the original plasmid [8], which had been removed by digestion with the same enzymes.

Immunoblotting and antibodies

Cell lysates adjusted to contain equal amounts of protein or obtained from m7GTP–Sepharose affinity isolation were subjected to SDS/PAGE, transferred on to PVDF membrane (GE Healthcare) and proteins were visualized with the antibodies described below. Appropriate secondary antibodies were conjugated either to horseradish peroxidase and visualized with ECL (enhanced chemiluminescence) reagents (Perbio), or to fluorophores and visualized via a LI-COR Odyssey instrument. Polyclonal rabbit antibodies against eIF4GI and eIF4GII have been described previously [8,24,27]. The anti-Myc 4A6 monoclonal antibody was purchased from Upstate, whereas the anti-FLAG M2 (affinity-purified) and the anti-β-actin polyclonal antibodies were from Sigma–Aldrich. In all cases, care was taken to ensure that detection was within the linear response of the individual antiserum to the protein.


eIF4GI and eIF4GII expression varies between different cell lines

In 1998, Gradi et al. [19] first identified the EIF4GII mRNA in which the initiation codon for eIF4GII translation corresponds to the second AUG used to initiate translation of eIF4GI (eIF4GIe; Figure 1A). AUG codons further downstream (corresponding to ORFs d, c, b and a) which could lead to the expression of additional eIF4GII isoforms are conserved, and there are additional in-frame AUG codons not present in eIF4GI. However, the equivalent to the longest eIF4GIf initiation codon is an AUA in the EIF4GII sequence. To obtain an overview of expression of the eIF4G paralogues, and to determine whether further isoforms of eIF4GII existed, we examined the level of expression of eIF4GI and eIF4GII mRNA and protein in a human cell line panel.

Quantitative real-time PCR was used to determine the amount of EIF4G1 and EIF4GII transcripts, with the amount of 18S rRNA determined and used for normalization of EIF4G levels to ensure the comparison of the same amount of cDNA from each cell line. The amounts of EIF4G1 and EIF4GII transcripts were found to vary widely, but in all cell types EIF4G1 transcripts were more abundant than EIF4GII transcripts (Figure 1B). Subsequently, equal amounts of protein from each cell line were subjected to SDS/PAGE and immunoblotting (Figure 1C) using conditions we had successfully used to resolve individual isoforms of eIF4GI [8], revealing that the pattern of eIF4GII isoform expression was distinct from that observed with eIF4GI. As described previously [8], the majority of eIF4GI proteins correspond to the longest eIF4GIf and eIF4GIe ORFs. In contrast, there appeared to be multiple isoforms of eIF4GII which did not resolve into clearly defined species, and the levels of expression of these proteins clearly varied between cell lines. It was therefore important to elucidate what events were responsible for the generation of these novel species of eIF4GII protein.

A novel exon extends the EIF4GII ORF in the central domain of the protein

When we subcloned the original EIF4GII ORF [19] by RT (reverse transcription)–PCR from HeLa cell mRNA, we amplified the expected sequence, as well as a cDNA which contained a small central exon, which we designated 13a (Figure 2A). Exon 13a maintains and extends the ORF by 37 amino acids (Figure 2B), in a region where there is low homology. However, part of this corresponding region in eIF4GI (amino acids 681–721, Figure 1A) has been described as being “critical for ribosome scanning” in rabbit reticulocyte lysate programmed with viral RNAs [28]. BLASTp analysis identified a further human transcript (GenBank® accession number BC072413) containing this exon, and also identified this exon in sequences from Equus caballus and Callithrix jacchus. The extended coding potential was also present in the Mus musculus genome, as ascertained by tblastn. Using oligonucleotide primers specific to exon 13/exon 13a splice junctions, compared with a pair that amplified exons 23–24, we observed that exon 13a was present in only approximately 10% of transcripts in our cell line panel (Figure 2C). Using our knockdown/add-back approach [8], we have addressed the role of this central region in the eIF4G protein in the context of the full-length protein. First, we deleted amino acids 681–721 from eIF4GIf. Next, we inserted the corresponding region of either the original eIF4GII sequence (HybN) or the novel version containing exon 13a (Hyb+), creating chimaeras. These N-terminally Myc-tagged eIF4G proteins were expressed in HeLa cells for 96 h while expression of all endogenous EIF4G1 transcripts was reduced using siRNA31 [8,29]. Successful reduction of expression of all isoforms of eIF4GI was confirmed (Figure 2D, lane 2 compared with lane 1), and the Myc-tagged proteins were detected with differing migrations using SDS/PAGE (4% gels) in accordance with the sizes of deleted and inserted sequences.

Figure 2 A novel coding exon that extends the ORF of EIF4GII may play a similar functional role to a domain in EIF4G1

(A) The genomic arrangement of the original published EIF4GII transcript [19]. Subcloning of EIF4GII mRNA identified a novel exon 13a that maintains and extends the ORF. Black boxes denote non-coding exons, grey boxes denote coding exons. (B) Sequence alignment of the original and extended versions of the eIF4GII ORF against the equivalent region in the eIF4GI paralogue. Bold type denotes the region in eIF4GI determined to be critical for ribosome scanning. Bold italic type denotes the translation of the sequence present in exon 13a of the extended eIF4GII. (C) Quantitative real-time PCR was used to assay the proportion of messages containing exon 13a in a panel of human cell lines. Values obtained by quantitative real-time PCR were normalized to those obtained from an amplicon which detected exons 23–24 of EIF4GII mRNA. Data were collected from three independent experiments, with assays each performed in triplicate; values are means±S.E.M. (D) Plasmids were created that expressed Myc-tagged eIF4GIf, or versions deleted of the region between 681 and 721 (Δ), or where the deleted region was replaced by the equivalent region from the original or extended eIF4GII sequences (HybN and Hyb+ respectively). All cDNAs were immune to silencing by siRNA31 [8]. HeLa cells were co-transfected for 96 h with siRNA vectors and plasmids expressing the different versions of eIF4GI as indicated. Proteins were detected using the antibodies indicated. (E) Before harvest after 96 h of transfection, cells were incubated with [35S]methionine for 1 h, extracts prepared, and the incorporation of radioactive methionine into protein was determined as described in the text (c.p.m./μg of total protein, and expressed relative to that obtained in untransfected cells, set at 1).

To analyse overall translation rates, cells were pulse-labelled for 1 h with [35S]methionine before cell harvest. Reduced expression of eIF4GI resulted in a decrease in the incorporation of radiolabel to a level observed previously [8]. However, the rescue in translation rates with full-length wild-type eIF4GIf was greater than described previously [8], possibly because of the more robust expression observed in these experiments (Figure 2E). Expression of eIF4GIf lacking the deleted region reversed the reduction in translation rate, although less markedly than the wild-type protein, which may, in part, reflect reduced levels of protein expression. Furthermore, the chimaeric versions of eIF4GI, where the central region had been replaced with alternative eIF4GII sequences, were also able to rescue translation, to an extent not markedly different from that of the wild-type. In conclusion, the novel region found in eIF4GII neither enhanced nor inhibited translation, and the role of exon 13a in eIF4G function will need further investigation.

Extending the ORF of EIF4GII by 5′-RACE

When the original and extended eIF4GII sequences were expressed in HeLa cells and analysed using SDS/PAGE (4% gel), these higher-molecular-mass species did not co-migrate with the slowest migrating endogenous eIF4GII detected with a specific antibody (results not shown), indicating that there are multiple isoforms of eIF4GII. Analysis of EIF4GII transcripts in NCBI GenBank® identified BC072413, which contained a 5′-end that extended upstream of the published sequence (NM_003760.1, Figure 3A). Furthermore, BC072413 contained an in-frame non-canonical CUG codon that could extend the ORF. The sequence surrounding this codon (CAUCGCCUGA) does not closely map to the Kozak consensus [13], with neither a G at +4 nor a purine at −3. Nor does it resemble an alternative consensus proposed for non-AUG initiation codons [30], although the sequence proposed by this work [30] was based on only 45 non-AUG initiation events, and was not supported by any experimental evidence. Other EIF4GII transcripts in GenBank® also contained extra exons upstream of the original sequence (Figure 3A). It should be noted that the current build of the chromosome 1 sequence lacks ‘exon 1’ in the RefSeq versions of EIF4GII NM_003760 (versions 1, 2 and 4, and also found in BC030578), although it is present in the genomic clone AL627311.16. To determine whether an extension of the EIF4GII mRNA was responsible for the slower migrating forms of eIF4GII observed in the cell line panel, we undertook experiments to amplify the 5′-end of the EIF4GII mRNA. The ‘oligonucleotide-capping’ method [31] was used to determine if any longer ORFs existed for EIF4GII in HEK-293T, HeLa and MCF-7 cells. The mRNAs obtained by this method (some of which are shown diagrammatically in Figure 3B; see also Supplementary Table S2) exhibited a great deal of heterogeneity in both transcriptional start site (identifying three further promoters) and in incorporated exons. 
Promoters before exon −6 and within exon 4 lack a single defined transcriptional start site, with respective differences of approximately 70 or 100 nt between the most 5′ and 3′ start sites (Figure 3C). Such a phenomenon (termed a ‘broad’ promoter) is most associated with CpG-island-type promoters, as opposed to the much more ‘focal’ transcriptional initiation found with TATA promoters [32].
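The comparison of the CUG context with the Kozak consensus can be reproduced with a short sketch. The function name `start_context` and its return format are illustrative choices; only the two positions discussed in the text (−3 and +4, numbered relative to the first base of the initiation codon as +1) are examined, which is a simplification of the full consensus.

```python
def start_context(window, codon_offset):
    """Report the -3 and +4 bases around a putative initiation codon.

    window       -- RNA sequence containing the codon and flanking bases
    codon_offset -- zero-based index of the first base of the codon (+1)
    """
    minus3 = window[codon_offset - 3]   # there is no position 0, so -3 is three bases upstream
    plus4 = window[codon_offset + 3]    # the base immediately after the codon
    return {
        "codon": window[codon_offset:codon_offset + 3],
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in "AG",
        "g_at_plus4": plus4 == "G",
    }


# Sequence surrounding the CUG codon, as quoted in the text.
cug = start_context("CAUCGCCUGA", 6)
# One form of the efficient consensus, GCC(A/G)CCAUGG, with A at -3.
kozak = start_context("GCCACCAUGG", 6)

print(cug)
print(kozak)
```

For the CUG-containing sequence both checks fail (C at −3 and A at +4), whereas the consensus sequence passes both, in agreement with the description above.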

Figure 3 Extending the ORF of EIF4GII by 5′-RACE

(A) The genomic arrangement of the EIF4GII transcripts in the NCBI GenBank® database, showing the original sequence (AUGo, reference [19]) and putative extended ORFs arising from possible non-canonical initiation at CUG codons (CUGx–CUGz). White boxes denote non-coding exons, black boxes denote coding exons. (B) Genomic arrangements of cDNA sequences obtained in the present study following 5′ RLM-RACE using mRNA from HeLa, MCF-7 and HEK-293T cells as a template. Only those transcripts where the ORF is maintained in the 5′ extensions are illustrated. (C) Representation of multiple splicing events observed in the 5′-end of the EIF4GII gene locus (including those identified by previous studies). Novel promoters identified in the present study are denoted with grey arrows, with broken lines representing those promoters where transcriptional initiation events take place over a large range. Exons are illustrated to scale whereas introns are not, and exons are numbered according to those identified in the original NM_003760 transcript. For the remaining EIF4GII exons (6–28), refer to Figure 2(A). (D) An alternative representation of the EIF4GII gene locus in chromosome 1p36.12, where introns are illustrated to scale, with exon positions marked by vertical lines. Exon −7 is found in the RCC1 gene locus (1p36.1) and may arise through a HeLa-cell-specific chromosomal rearrangement. (E) Quantitative real-time PCR was used to assay the proportion of messages containing the putative CUG initiation codon (exon −6) in the panel of human cell lines used in Figures 1(B) and 2(C). Values obtained by quantitative real-time PCR were normalized to those obtained from an amplicon which spanned exons 2–3 of EIF4GII mRNA. Data were collected from three independent experiments, with assays each performed in triplicate; values are means±S.E.M.

The furthest upstream promoter was identified only in transcripts from HeLa cells, with the first exon (−7) aligning to the RCC1 gene locus; this may therefore be due to a chromosomal rearrangement in this cell type.

We did not clone any sequences corresponding exactly to the original sequence, but did obtain transcripts similar to those previously identified. In addition, the exon arrangement and splice junctions found in some of the novel transcripts (which in some cases splice with alternative donor or acceptor sites) do not maintain the postulated extended ORF and are not illustrated here. The different arrangements of the splicing events between exons −1 and 2, and −6 and −5, are shown in Figure 3(C), with a scale representation of the chromosome from exon −6 to 5 shown in Figure 3(D). To precisely determine promoter usage and splice variants of EIF4GII in each cell line, a deep sequencing strategy would be appropriate, but is beyond the scope of the present study.

Quantitative real-time PCR was carried out to determine what proportion of EIF4GII transcripts contain the novel exon −6, which is postulated to contain the CUG initiation codon (Figure 3E). It can be observed that, in all cell lines, the novel exon accounts for at least half the amount of total EIF4GII mRNA and, in some cell lines, this exon is present in all transcripts.
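One way in which such a proportion could be derived from quantitative real-time PCR data is sketched below. This is an assumption-laden illustration rather than the calculation used in the present study: it assumes a comparative threshold-cycle (Ct) approach with equal, approximately 100%, amplification efficiency for both amplicons, and all Ct values shown are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_abundance(ct_target, ct_reference):
    """Abundance of the target amplicon relative to the reference amplicon.

    Assumes that the product doubles every cycle for both amplicons, so a
    target that crosses the threshold one cycle later is half as abundant.
    """
    return 2 ** (ct_reference - ct_target)


def mean_sem(values):
    """Mean and standard error of the mean for independent experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct values from three independent experiments:
# (exon -6 amplicon, reference amplicon spanning exons 2-3).
experiments = [(25.0, 24.0), (25.2, 24.1), (24.9, 24.0)]

ratios = [relative_abundance(target, ref) for target, ref in experiments]
average, sem = mean_sem(ratios)
print(f"proportion of transcripts containing the exon: {average:.2f} +/- {sem:.2f}")
```

Where amplification efficiencies differ between primer pairs, an efficiency-corrected calculation or a standard curve would be needed in place of the fixed base of 2.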

Extended EIF4GII ORFs can be translated in vivo

To ascertain whether the novel putative EIF4GII ORFs were being translated in a cellular environment, the short cDNAs obtained by RLM-RACE, as shown in Figure 3(B), were subcloned in-frame with a vector containing a C-terminal 3× FLAG epitope (Figure 4A, top panel). HeLa cells were transfected with these constructs and the presence of FLAG-tagged extended ORFs was clearly observed by immunoblotting (Figure 4A, bottom panel). Furthermore, the plasmid containing 5′ 4GII-clone 2 generated three species (lane 3), with upper and lower forms corresponding to translation from the non-canonical CUG and original AUG codons respectively. The third under-represented band, which is also observed in lanes 4 and 5, may reflect initiation at an alternative in-frame codon.

Figure 4 Extended EIF4GII ORFs can be translated in vivo

(A) The 5′-RACE clones illustrated in Figure 3(B) with an upstream CUG in-frame with the original ORF were subcloned upstream of a pcDNA 3.1(+) vector which had previously had three copies of the FLAG epitope tag inserted into the multiple cloning site. Protein expression was analysed in cells transfected with the 5′-RACE clones by immunoblotting with an anti-FLAG M2 antibody. Molecular mass markers were used to ascertain the migration of proteins synthesized from upstream non-canonical initiation codons (black arrows) or the canonical AUG (*). The expected migration of a protein arising from the original AUG, fused to the C-terminal 3×FLAG tag was 11.9 kDa, with those from 5′ 4GII clones 1–4 predicted to be 26.5, 27.9, 26.7 and 17.7 kDa respectively. Naming of eIF4GII clones follows that shown in Figure 3(B). The molecular mass in kDa is indicated on the left-hand side. (B) Usage of the alternative initiation codons was confirmed by site-directed mutagenesis. CUG or AUG initiation codons in the 3×FLAG vector containing EIF4GII 5′-RACE clone 2 (CUGb, lane 2) were mutated to the best possible context (AUG+) or to a UAC triplet. Proteins were detected with an anti-FLAG antibody. (C) Sequence alignment of the N-termini of the original (AUGo) and extended versions of the EIF4GII ORFs found in the present study (CUGa–CUGd), assigned to transcripts found in previous studies (CUGx–CUGz) and the eIF4GI paralogue.

To confirm their usage, the novel CUG (CUG1) and original AUG (AUGo) initiation codons in the 5′ 4GII-clone 2 vector were mutated to an AUG codon in an efficient context (AUG+) or to a UAC triplet (Figure 4B, illustrated in the top panel). A second CUG codon (CUG2), just upstream and in-frame with the AUG, was also mutated in this way to determine whether initiation occurred at this point. The wild-type plasmid again drove expression in HeLa cells of the two major species that were previously observed (Figure 4B, bottom panel, lane 2), and initiation at the first CUG codon was confirmed by the co-migration of the upper species with the protein produced when this triplet is mutated to an AUG (lane 3). Furthermore, CUG2 only becomes a novel start site when mutated to an AUG in an efficient context (lane 5).

Changing the CUG codon to a strong AUG prevents translation from the downstream AUGo, suggesting that leaky scanning is solely responsible for initiation at this codon. We have no evidence for internal ribosome entry in 5′ regions of EIF4GII mRNAs, as subcloning the 5′ UTR upstream of the CUG, the region between CUG1 and AUGo, or the whole sequence upstream of AUGo into the pRF dicistronic luciferase vector used to test for IRES (internal ribosome entry site) activity [33,34] gives no increase in expression of the second firefly luciferase cistron (results not shown). This differs from EIF4G1, where internal ribosome entry was proposed to drive initiation at some of the alternative AUG initiation codons [18], and an IRES is used to produce one of the isoforms of BAG-1 from within the coding region [33].

Surprisingly, the UAC1 mutation designed to prevent initiation still allowed some translation (lane 4). We have endeavoured to determine whether initiation from the CUG1 and UAC1 initiation codons is recapitulated in rabbit reticulocyte lysate, but have been unable to observe translation from either initiation codon, although the AUG1 works as expected (results not shown). However, this result may be transcript-specific within this in vitro translation system, as we have observed initiation from a novel non-canonical GUG codon when examining a different cellular mRNA, and this initiation is weakly maintained when the GUG is mutated to UAC (L.S. Perry, J.L. Cowan and M.J. Coldwell, unpublished work). It is possible that UAC could be a hitherto unexpected initiation codon; however, this is unlikely as mutation of AUGo to the same codon (lane 8) results in loss of initiation from this point (although initiation from AUGo is already weak due to it being the seventh AUG encountered by a scanning ribosome). It is more likely that upstream or downstream sequences may pause a scanning ribosome to facilitate usage of the CUG1/UAC1 initiation codon. The sequence immediately downstream of the CUG (from +1 to +60) is composed of 70% G-C bases and is predicted to form a secondary structure with a free energy of −18.8 kcal/mol (1 kcal=4.184 kJ). A hairpin with a free energy of −19 kcal/mol downstream of a UUG or GUG codon was sufficient to increase translation in previous models [35,36].
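The two numerical statements above (G-C content of a 60-nucleotide window and the kcal-to-kJ conversion) can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the window sequence is a made-up placeholder with the same G-C fraction, not the EIF4GII sequence, and the free energy itself is taken from the text, since predicting it would require a dedicated RNA folding program.

```python
KJ_PER_KCAL = 4.184  # conversion factor quoted in the text


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def kcal_to_kj(kcal: float) -> float:
    """Convert an energy in kcal/mol to kJ/mol."""
    return kcal * KJ_PER_KCAL


# Hypothetical 60-nt window containing 42 G/C bases (70%).
window = "GC" * 21 + "AU" * 9
print(len(window), round(gc_fraction(window), 2))  # 60 0.7

# Free energy reported in the text, expressed in kJ/mol.
print(round(kcal_to_kj(-18.8), 1))  # -78.7
```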

Alternatively, a bioinformatics study postulated the existence of the extended EIF4GII ORF [37], but assigned the initiation codon as an AUC 15 nucleotides upstream of the CUG initiation codon we identified. This could be the initiation codon that remains active in the UAC mutant (Figure 4B), although our unpublished work suggests that an AUC initiation codon in the optimal Kozak consensus supports less than 0.5% of the translation events of an AUG, compared with 6% if a CUG is used (J.L. Cowan and M.J. Coldwell, unpublished work).

The in-frame N-terminally extended ORFs that we have identified are shown in Figure 4(C), aligned to the ORFs from other database sequences and illustrating the divergence from the eIF4GIf paralogue. The nomenclature we adopt and use in the remainder of the present paper reflects the initiation codon used, with a lower-case letter denoting either the original exon arrangement (o) or one of the novel arrangements illustrated in Figures 3(A) and 3(B). Returning to our initial data on expression of eIF4GII (Figure 1C), the diverse patterns of protein expression indicate that there may be further isoforms of eIF4GII to be found, although most of the isoforms broadly correspond in migration to the CUG-initiated forms, in agreement with the quantitative PCR data (Figure 3E).
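The idea of an in-frame N-terminal extension can be made concrete with a short Python sketch that walks upstream from an annotated AUG, one codon at a time, and reports candidate non-canonical start codons that are not separated from the AUG by an in-frame stop codon. The function name, its parameters and the toy sequences are hypothetical; this is a generic illustration and not the procedure used to identify the EIF4GII isoforms, which relied on 5′-RACE and mutagenesis.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def upstream_inframe_starts(mrna, aug_index, candidates=("CUG",)):
    """Return (index, codon, extra_codons) for each candidate start codon that
    lies upstream of and in-frame with the AUG at `aug_index`, stopping at the
    first in-frame stop codon encountered while walking upstream."""
    mrna = mrna.upper().replace("T", "U")
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG")
    hits = []
    pos = aug_index - 3
    while pos >= 0:
        codon = mrna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop closes the extended reading frame
        if codon in candidates:
            hits.append((pos, codon, (aug_index - pos) // 3))
        pos -= 3
    return hits[::-1]  # report in 5'-to-3' order


# Toy sequence: two in-frame CUG codons upstream of the AUG at index 14.
toy = "GG" + "CUG" + "GCC" + "CUG" + "AAA" + "AUG" + "GCU"
print(upstream_inframe_starts(toy, 14))  # [(2, 'CUG', 4), (8, 'CUG', 2)]

# An in-frame stop codon hides the more distal CUG.
blocked = "CUG" + "UAA" + "CUG" + "AUG"
print(upstream_inframe_starts(blocked, 9))  # [(6, 'CUG', 1)]
```

Passing a longer `candidates` tuple (for example `("CUG", "GUG", "UUG")`) widens the search to other near-cognate codons.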

Extended eIF4GII isoforms co-migrate with the endogenous protein and are recruited to the TIC

To prove that CUG-initiated forms of eIF4GII co-migrate with the endogenous protein, extended EIF4GII ORFs were subcloned on to the original cDNA by megaprimer PCR and inserted into a vector containing an N-terminal Myc tag (Figure 5A). eIF4GII-AUGe initiates from the original AUG, but with the extra coding exon 3a (Figure 3C). No 5′ UTR sequences were incorporated and the N-terminal Myc tag initiates at an AUG initiation codon in a favourable Kozak consensus. The plasmids were transfected into HeLa cells (Figure 5B) and protein expression was analysed with antibodies against eIF4GII (top panel) or the Myc tag (bottom panel). Using low-percentage acrylamide gels, we were able to show that the isoforms initiating from the CUG co-migrated with the upper band of endogenous eIF4GII, with the AUG-initiated forms co-migrating with the lower band. The multiple isoforms generated by alternative initiation and splicing identified in the present study demonstrate why eIF4GII migration in SDS/PAGE is not as defined as that observed with eIF4GI (Figure 1C).

Figure 5 Extended eIF4GII isoforms co-migrate with the endogenous protein and are recruited to the TIC

(A) To prove that CUG-initiated forms of eIF4GII co-migrate with the endogenous protein, extended eIF4GII ORFs were subcloned on to the original cDNA by megaprimer PCR and inserted into a vector containing an N-terminal Myc tag. The AUGe variant corresponds to the original AUG-initiated form, but also contains the short coding exon 3a. (B) cDNAs were transfected into HeLa cells and protein expression was analysed with antibodies against eIF4GII (top panel) or the Myc tag (bottom panel). (C) Both endogenous and exogenous extended eIF4GII isoforms can associate with eIF4E. Aliquots of extracts containing equal amounts of protein were subjected to m7GTP–Sepharose affinity chromatography to recover eIF4E and associated factors. Proteins were resolved by SDS/PAGE and visualized by immunoblotting using the antisera indicated. A + after a cDNA name indicates that the coding exon 13a is also present.

To investigate whether the extended eIF4GII isoforms form eIF4F complexes, we generated extracts from cells transfected with eIF4GIf, the original AUGo-initiated EIF4GII cDNAs (with and without exon 13a), or cDNAs arising from the CUGa-initiated ORF (again, ±exon 13a) and incubated them with m7GTP–Sepharose, which isolates eIF4E and any associated proteins. Following elution, immunoblotting showed that all four forms of eIF4GII investigated could be co-isolated with eIF4E (Figure 5C).

Extended EIF4GII is more effective than the original ORF at rescuing translation in an eIF4GI/eIF4GII double-knockdown background

To be able to assay any differences in eIF4G isoforms, as previously carried out with eIF4GI [8], we used siRNA to down-regulate expression of endogenous eIF4GII in HeLa cells (Supplementary Figure S1). A number of experimental replicates established that the most potent siRNA was siRNA243, which we then used to reduce the expression of eIF4GII in the following experiments. To determine the effects of knocking down both eIF4G proteins in the same cells, we used siRNA243 to target eIF4GII and siRNA31 [8,29,38] to reduce endogenous eIF4GI levels (Figure 6A). Immunoblotting showed that the eIF4GI- or eIF4GII-specific siRNAs only reduced expression of their respective targets. Furthermore, measuring incorporation of [35S]methionine into protein demonstrated that reduction of eIF4GII levels had a similar effect on global protein synthesis rates to that previously noted for eIF4GI (Figure 6B).

Figure 6 Restoration of protein synthesis that has been reduced by knockdown of eIF4G expression is induced with differing efficiencies by alternative forms of eIF4GII

(A) Immunoblotting of endogenous eIF4GI and eIF4GII isoforms resolved by SDS/PAGE following transfection of HeLa cells with siRNA-expressing plasmids for 96 h. (B) Prior to the harvest after 96 h of transfection, cells were incubated with [35S]methionine for 1 h, extracts were prepared and the incorporation of radioactive methionine into protein was determined (expressed as c.p.m./μg of total protein). (C–E) HeLa cells were co-transfected for 96 h with siRNA vectors and plasmids expressing different Myc-tagged isoforms of eIF4GI or eIF4GII, as indicated. Cell extracts were prepared and aliquots containing equal amounts of total protein were resolved by SDS/PAGE, as indicated. Endogenous and exogenous eIF4GI (D) or eIF4GII (C) was visualized by immunoblotting using the indicated antiserum, with actin being used as a loading control. Exogenous proteins were also detected with an antibody capable of recognizing their N-terminal Myc tag (E). (F) As described in (B), incorporation of [35S]methionine into protein was measured in cells expressing siRNA to reduce expression of both isoforms of eIF4G, or cells in which the expression of individual forms of eIF4G was maintained by expression of exogenous cDNAs, as indicated. Data were collected from three independent experiments, with assays each performed in triplicate; values are means±S.E.M.
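The summary statistic quoted in the legend (means±S.E.M. from three independent experiments, each assayed in triplicate) can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the legend: technical triplicates are first averaged within each experiment, and the S.E.M. is then calculated across the experiment means. The c.p.m./μg values are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(experiments):
    """Return (mean, S.E.M.) across independent experiments.

    `experiments` is a list of lists; each inner list holds the technical
    replicates (e.g. c.p.m./ug of total protein) from one experiment.
    Assumption: replicates are averaged per experiment before the S.E.M.
    is taken over the n experiment means.
    """
    per_experiment = [mean(replicates) for replicates in experiments]
    n = len(per_experiment)
    if n < 2:
        raise ValueError("need at least two independent experiments")
    return mean(per_experiment), stdev(per_experiment) / sqrt(n)


# Invented incorporation values: three experiments, each in triplicate.
example = [[10, 12, 14], [20, 22, 24], [30, 32, 34]]
m, sem = mean_sem(example)
print(f"{m:.1f} ± {sem:.1f}")  # 22.0 ± 5.8
```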

Finally, we wished to determine whether the reduction in translation rates could be rescued by the addition of exogenous EIF4G cDNAs, including the novel extended isoforms of eIF4GII identified in the present study. Single knockdowns were performed as above, and double-knockdowns were performed by co-transfecting siRNA31 and siRNA243. Furthermore, vectors expressing Myc-tagged eIF4GIf, eIF4GIe [8], eIF4GII AUGo [19] or the extended eIF4GII sequences (CUGa, CUGb and CUGd) were co-transfected into cells that had received the double-knockdown. The expression of endogenous proteins was reduced as before, including in those cells transfected with both siRNA plasmids (Figures 6C and 6D), albeit to a lesser extent, possibly for the reasons outlined below. Furthermore, assaying translation rates in the double-knockdowns demonstrated that the reduction in protein synthesis measured in the single-knockdowns was not additive in the double-knockdowns (compare Figure 6B with 6F).

It should be noted that the double-knockdown resulted in the death of many of the cells, so it is possible that those cells that survived to the point of harvest may have only been transfected with one or other of the siRNA-expressing vectors. Furthermore, it is possible that not all translation initiation is dependent on the assembly of initiation factors on an eIF4G scaffold, with molecules containing the central MIF4G and MA3 domains, such as DAP5, CBP80 and NOM1, able to recruit other components of the translation initiation machinery, albeit in a modified manner.

Expression of any of the single eIF4G proteins can rescue translation in the double-knockdown cells, and interestingly the rescue of translation by the extended eIF4GII (CUGa, CUGb or CUGd) was much more robust than that observed with the original eIF4GII sequence (Figure 6F), suggesting that the novel N-terminal region may play a particular role in initiation factor assembly. Given that the three versions (CUGa, CUGb and CUGd) used in the present study share some sequence similarity (Figure 4C), this will enable us to narrow our search for functional regions in the novel extension. Furthermore, this rescue is achieved with less expression of the eIF4GII proteins, as determined by immunoblotting of the N-terminal Myc tag on all exogenous proteins (Figure 6E, compare lanes 2 and 3 with lanes 4–7). Whether the absence of eIF4G, or the presence of a sole form of eIF4G (extended or not), results in the expression of a different subset of proteins remains to be elucidated. Simply looking at differences in radiolabelling of proteins produced in the hour prior to harvest by one-dimensional SDS/PAGE does not show any obvious changes (results not shown), although more detailed approaches, such as pSILAC (pulsed stable isotope labelling by amino acids in cell culture) or ribosomal profiling, may tell us more.


The eIF4G proteins play a central role in the assembly of initiation factor complexes by providing a scaffold that leads both to circularization of mRNAs and to recruitment of the small 40S ribosomal subunit. Our previous work identified differences in the ability of eIF4GI isoforms initiated at alternative AUG codons to rescue translation in a background depleted of the endogenous protein [8]. The results of the present study demonstrate that eIF4GII is also synthesized in cells as multiple isoforms, but in this case different promoters, splicing events and the use of a non-canonical CUG initiation codon lead to the production of a multitude of different proteins. The N-termini of the eIF4G isoforms are only poorly conserved, with the exception of the site necessary for PABP binding, and the present study suggests that these regions may have different functions. Work is currently underway to determine whether the N-termini of eIF4GI and eIF4GII have alternative binding partners, and what their function may be in translation initiation.

The extended forms of eIF4GII are not the only proteins synthesized from non-canonical TICs, and we and others are currently investigating a growing number of instances where such events can generate alternative forms of a protein. Moreover, the RACE protocol and assays used in the present paper may provide a rapid and simple way in which postulated alternative initiation events can be tested.


Mark Coldwell designed and executed the experimental programme and wrote the majority of the paper, including drafting the Figures. Ulrike Sack carried out the quantitative PCR and immunoblot analysis in Figure 1 and was involved in the initial siRNA experiments (Supplementary Figure S1 and Figure 6). Joanne Cowan completed replicates for Supplementary Figure S1 and further experiments for Figure 6. Rachel Barrett contributed to initial experiments for Figure 6. Markete Vlasak contributed to cloning cDNAs and experiments for Figure 5. Keiley Sivakumaran contributed to cloning cDNAs and experiments for Figure 2. Simon Morley obtained funding for the project and contributed to the writing of the paper.


This work was supported by the Biotechnology and Biological Sciences Research Council [grant number BB/D007593/1].

Abbreviations: BAG-1, Bcl-2-associated athanogene 1; DAP5, death-associated protein 5; eIF, eukaryotic initiation factor; FBS, fetal bovine serum; HEK, human embryonic kidney; IRES, internal ribosome entry site; m7G, 7-methylguanylate; m7GTP, 7-methylguanosine-triphosphate; ORF, open reading frame; PABP, poly(A)-binding protein; RACE, rapid amplification of cDNA ends; RLM, RNA-ligase-mediated; siRNA, short interfering RNA; TIC, translation initiation codon; UTR, untranslated region

