CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) is a nucleic acid processing system in bacteria and archaea that interacts with mobile genetic elements. CRISPR DNA and RNA sequences are processed by Cas proteins: in Escherichia coli K-12, one CRISPR locus links to eight cas genes (cas1, 2, 3 and casABCDE), whose protein products promote protection against phage. In the present paper, we report that purified E. coli Cas3 catalyses ATP-independent annealing of RNA with DNA forming R-loops, hybrids of RNA base-paired into duplex DNA. ATP abolishes Cas3 R-loop formation and instead powers Cas3 helicase unwinding of the invading RNA strand of a model R-loop substrate. R-loop formation by Cas3 requires magnesium as a co-factor and is inactivated by mutagenesis of a conserved amino acid motif. Cells expressing the mutant Cas3 protein are more sensitive to plaque formation by the phage λvir. A complex of CasABCDE (‘Cascade’) also promotes R-loop formation and we discuss possible overlapping roles of Cas3 and Cascade in E. coli, and the apparently antagonistic roles of Cas3 catalysing RNA–DNA annealing and ATP-dependent helicase unwinding.
- CRISPR-associated 3 (Cas3)
- clustered regularly interspaced short palindromic repeat (CRISPR)
CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) is a nucleic acid processing system in bacteria and archaea that interacts with mobile genetic elements. CRISPR loci are functionally coupled to Cas proteins and can protect cells against infection by phage or plasmids [2–8]. CRISPR/Cas also contributes to other physiological roles in some organisms: prevention of lysogeny-dependent biofilm formation in Pseudomonas aeruginosa [9,10], DNA repair or recombination in Escherichia coli and partitioning of newly replicated plasmids during cell division in halophilic archaea. In some organisms, CRISPR/Cas may self-target, interacting with host cell genes [13,14]. The various functions of CRISPR/Cas probably reflect the impressive variability in organization and distribution of CRISPR loci and cas genes across bacterial and archaeal domains, prompting their classification into CRISPR/Cas sub-types. This and other aspects of CRISPR/Cas function have been detailed in recent reviews [15–19].
Key events in CRISPR/Cas processing include acquisition of DNA fragments from mobile genetic elements into host cell CRISPR loci as ‘spacers’ alongside new repeat sequences, and transcription of CRISPR loci to RNA (pre-crRNA) that is processed into shorter RNA (crRNA). Therefore crRNA contains sequence that is complementary to antecedent mobile elements, and can be utilised to target invading nucleic acid. In bacteria, crRNA targets DNA [3,5,6,20], and in the archaeon Pyrococcus furiosus, invading RNA is targeted, leading to its destruction . Processing and targeting of crRNA is catalysed by ‘core’ Cas proteins, working with less widespread Csx (Cas subtype-specific) proteins or archaeal CMR proteins [Cas RAMP (repeat-associated mysterious proteins) module]. Acquisition of DNA spacer-repeats into CRISPR loci is by unknown mechanisms that may involve Cas1 or Cas2 nucleases [11,22,23]. Endonucleolytic trimming of transcribed pre-crRNA into shorter (50–100 nucleotide) RNA (crRNA) [24,25] is catalysed by subtype-specific Cas proteins; Csy4 in Ps. aeruginosa , Cas6 in the archaeon P. furiosus  or CasE (also called Cse3) in E. coli . E. coli CasE is part of a larger protein–RNA complex known as Cascade, encoded by casABCDE .
CRISPR loci were first described in E. coli. Four such loci have been analysed in this organism [29,30], one of which (CRISPR-1) is associated with eight cas genes (cas1, cas2, cas3 and casA, casB, casC, casD and casE) (Figure 1A). Transcription of CRISPR-1 and casA–E is repressed by H-NS and de-repressed by LeuO or BaeR. Activation of E. coli CRISPR-1/Cas improved the survival of cells attacked by lysogenic and lytic cycles of infection by λ phage [3,4,7]. This required expression of a crRNA sequence engineered with an anti-λ spacer (T3). In two studies [4,7] the T3 spacer was recombineered into the CRISPR-1 locus, and increased protection from λ phage required casA–E. In another study, protection from phage was induced by expression of crRNA, casA–E and cas3 in multi-copy plasmids. E. coli chromosomal cas3 is located upstream of casA–E (Figure 1A) and is not controlled by H-NS, although cas3 transcription in Thermus thermophilus is increased 2.5-fold in response to infection by phage φYS40.
Cas3 is a core Cas protein found in most CRISPR/Cas sub-types. Evidence from in vivo studies indicates that Cas3 functions downstream of endonucleolytic crRNA processing in Ps. aeruginosa  and co-operates with the Cascade–crRNA complex in E. coli . Amino acid sequence motifs in Cas3 predict ATP-dependent helicase activities for Cas3 and a HD-phosphohydrolase domain . As is consistent with containing a helicase domain, purified Cas3 from Streptococcus thermophilus catalysed displacement of 20 nucleotide DNA and RNA strands annealed to M13 closed circular ssDNA (single-stranded DNA) . Although the HD domain of S. thermophilus Cas3 was required for ssDNA nuclease activity on M13, the same domain in Sulfolobus solfataricus Cas3 was active at nicking dsDNA (double-stranded DNA) in preference to ssDNA , a nicking activity not tested by Sinkunas et al. . In the present study, we describe the ATP-dependent helicase activity of E. coli Cas3 that displaces an RNA strand from an R-loop, a substrate in which RNA is base paired into complementary duplex DNA. We also describe a new biochemical activity for Cas3: ATP-independent annealing of RNA and DNA strands. Cas3 purified from the archaeal species Mth (Methanothermobacter thermautotrophicus) showed similar annealing and helicase activities, indicating that these Cas3 functions are evolutionarily conserved. We discuss apparently antagonistic functions of Cas3 that depend on the presence of ATP.
Cloning cas3 from E. coli MG1655 and Mth ΔH
E. coli K12 MG1655 ygcB (encoding Cas3) was cloned via PCR from genomic DNA into plasmid pET14b using NdeI and EcoRI sites and verified by sequencing, generating pEB489. pEB489 was used as a PCR template to clone ygcB into pMal-c2 using EcoRI and XbaI sites in-frame with malE, creating plasmid pAH1. This construct overproduced soluble Cas3 fused at its N-terminus to MBP (maltose-binding protein), giving a protein of approximately 143 kDa as shown in Figure 1(C). Supplementary Table S1 (at http://www.BiochemJ.org/bj/439/bj4390085add.htm) lists primer sequences used for cloning ygcB into pMal-c2 (ygcB-X2 and ygcB-Y2) and for Cas3 mutagenesis to generate H74G, D75G, K78L, K320L and G317S/S318Y.
Mth ORF1086 (encoding Cas3) was cloned from Mth genomic DNA (a gift from Dr James Chong, Department of Biology, University of York, York, U.K.) via PCR using primers with NdeI and HindIII sites as detailed in Supplementary Table S1. PCR products were cloned into pT7-7 generating pEB359, and this construct was effective at overexpressing soluble MthCas3. Mth D347A Cas3 was generated by mutagenic PCR on pEB359 using primers listed in Supplementary Table S1.
Purification of E. coli and Mth Cas3 proteins
E. coli MBP–Cas3 was purified from pAH1 in DH5α grown in MU broth containing ampicillin (50 μg/ml). At D=0.5, cells were induced by IPTG (isopropyl β-D-thiogalactopyranoside; 0.5 mM) for 30 min at 37 °C. Recovered cells were resuspended in amylose column buffer [20 mM Tris/HCl, pH 7.5, 200 mM NaCl, 1 mM EDTA, 1 mM DTT (dithiothreitol) and 0.1 mM PMSF] and frozen at −80 °C. Cas3 was purified by FPLC at room temperature (22 °C), and each purification step was followed by SDS/PAGE and Coomassie Blue staining of proteins. Protein concentrations were measured against a BSA standard curve using Bradford's reagent (Bio-Rad Laboratories).
Biomass was lysed by sonication and centrifuged (20000 rev./min, 30 min in an Avanti J25 machine with rotor JA25.50). Soluble material (S1) was passed into a 7 ml amylose (High Flow, New England Biolabs) column equilibrated with amylose column buffer. MBP–Cas3 bound to amylose and was eluted in 2 mM maltose in amylose column buffer. Cas3 was loaded directly on to a 10 ml heparin column equilibrated with buffer C (20 mM Tris/HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 1 mM DTT and 0.1 mM PMSF). Cas3 did not bind to heparin: the flow-through was loaded directly on to a 10 ml DEAE-Sepharose column, eluting in 300–400 mM NaCl. Cas3 pooled fractions were loaded directly on to a 5 ml fibrous cellulose phosphate column pre-equilibrated with buffer C containing 350 mM NaCl. Cas3 did not bind cellulose phosphate and was finally re-loaded on to the amylose column to concentrate the protein. Cas3 fractions were pooled and dialysed against buffer C containing 35% glycerol for storage at −80 °C. Mutant Cas3 proteins were purified in the same way.
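Centrifugation speeds in this section are given in rev./min for a specific rotor. To reproduce the clarification step on a different rotor, the speed can be converted into relative centrifugal force. The following is a minimal sketch of that conversion; the maximum radius of 108 mm assumed here for the JA-25.50 rotor and the function name are not taken from the text and should be checked against the rotor manual.

```python
import math

# Assumption: maximum radius of the JA-25.50 rotor in millimetres.
# This value is not stated in the text; confirm it in the rotor manual.
JA2550_RMAX_MM = 108.0

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf_from_rpm(rpm: float, radius_mm: float) -> float:
    """Return relative centrifugal force (multiples of g) at a given radius."""
    if rpm < 0 or radius_mm <= 0:
        raise ValueError("rpm must be non-negative and radius must be positive")
    omega = 2.0 * math.pi * rpm / 60.0          # angular velocity in rad/s
    acceleration = omega ** 2 * radius_mm / 1000.0  # centripetal acceleration, m/s^2
    return acceleration / STANDARD_GRAVITY


if __name__ == "__main__":
    # Speeds quoted in the purification protocols.
    for rpm in (18000, 20000):
        rcf = rcf_from_rpm(rpm, JA2550_RMAX_MM)
        print(f"{rpm} rev./min at {JA2550_RMAX_MM} mm is about {rcf:,.0f} x g")
```

The result applies to the outermost point of the tube; force at the average or minimum radius is proportionally lower.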
Mth Cas3 was purified from pEB359 transformed into BL21 CodonPlus, inducing Cas3 expression at 37 °C for 2 h with IPTG (0.5 mM). Cells were harvested into buffer C and frozen at −80 °C. Thawed biomass was lysed by sonication and clarified (20000 rev./min, 20 min in an Avanti J25 machine with rotor JA25.50) to obtain soluble proteins (S1). S1 containing Mth Cas3 was passed through a heparin column in buffer C and did not bind: the flow-through was bound on to a HiTrap Q-Sepharose column and eluted at approximately 600 mM NaCl. Mth Cas3 was pooled and loaded directly on to a 10 ml phenyl-Sepharose column from which Mth Cas3 eluted when the column was washed with water. Mth Cas3 was then dialysed into buffer C for storage.
Cloning, overexpression and purification of E. coli CasC, CasD and CasE proteins
E. coli casC (ygcJ), casD (ygcI) and casE (ygcH) were cloned by PCR from MG1655 genomic DNA using primers shown in Supplementary Table S1. Primers for casC included restriction sites for EcoRI and BamHI for cloning PCR products into pT7-7, giving plasmid pEB488. Primers for casD and casE included restriction sites for NdeI and XhoI, or NdeI and EcoRI respectively for cloning of PCR products into pET14b giving plasmids pSDC3 and pSDC4 respectively.
CasC was overexpressed from pEB488 in BL21 Codon Plus in flasks containing 400 ml of MU broth with ampicillin (50 μg/ml) and chloramphenicol (20 μg/ml) grown at 37 °C. At D=0.5, CasC expression was induced with IPTG (0.5 mM) for 2 h. Cells were harvested, resuspended in <5 ml of buffer CP (20 mM Tris/HCl, pH 8.0, 1 mM EDTA, 1 mM DTT, 0.1 mM PMSF and 100 mM potassium acetate) and frozen at −80 °C. Thawed cells were lysed by sonication generating soluble material (S1) that was recovered by centrifugation (20000 rev./min for 30 min in an Avanti J25 machine with rotor JA25.50). CasC was purified by FPLC at room temperature, with protein retrieval monitored by SDS/PAGE and Coomassie Blue staining. S1 was passed through a tandem arrangement of a 5 ml HiTrap Q-Sepharose column linked to a 5 ml HiTrap SP-Sepharose column and this array was developed using a gradient of potassium acetate (0.1–2.0 M), with CasC eluting at 500–750 mM. CasC fractions were dialysed into buffer CP and loaded on to a 1 ml HiTrap heparin column developed with potassium acetate (0.1–1.5 M), CasC eluting at approximately 550–700 mM. CasC fractions were pooled, dialysed into buffer CP, and loaded on to an S200 gel-filtration column equilibrated in buffer CP. Pooled fractions containing CasC were dialysed into buffer CP containing 30% glycerol for storage.
CasD and CasE were purified as hexahistidine-tagged proteins from biomass overexpressed and harvested as described for CasC. Lysis of thawed cells by sonication provided CasE that was present in soluble material (S1), but CasD was insoluble (P1). Hexahistidine-tagged CasE was purified from S1 by using the HisGraviTrap procedure (GE Healthcare) and purified CasE was assessed by SDS/PAGE prior to storage after dialysis into buffer C containing 25% glycerol. Insoluble (P1) CasD was solubilized and refolded on a HisGraviTrap column: samples were washed with buffer C containing 1 M NaCl and the pellet resulting from the washing (P2) was recovered by centrifugation at 18000 rev./min in a JA-25.50 rotor (Beckman Coulter). P2 was then dissolved by gentle stirring at room temperature in 6 M urea and the solution was clarified by centrifugation, giving soluble material that was loaded directly on to a nickel-charged HisGraviTrap column.
Other proteins in the present study
Purified E. coli MBP for negative control reactions was bought from New England Biolabs, as were RNase H, exonuclease I and enzymes for molecular cloning (Vent DNA polymerase, T4 DNA ligase, restriction endonucleases, Antarctic phosphatase) and end-labelling of nucleic acids (T4 polynucleotide kinase). RecG was a gift from Professor Bob Lloyd (School of Biology, Centre for Genetics and Genomics, University of Nottingham, Nottingham, U.K.). T3 and T7 RNA polymerases for in vitro transcription of crRNA were bought from Invitrogen. Hel308 protein, used as controls for helicase and ATPase reactions, was purified as described previously .
DNA substrates used in the present study are summarized in Table 1 and detailed in the Supplementary Experimental section and Supplementary Table S2 (at http://www.BiochemJ.org/bj/439/bj4390085add.htm). Oligonucleotides were purchased from MWG and were used to generate Cas3 substrates as detailed in . Plasmid DNA used in R-loop assays was generated from E. coli strain DH5α. Any ssDNA in these plasmid preparations was controlled for by digestion with exonuclease I, as described in the Results section. In reactions using linearized plasmids, plasmid was incubated with XbaI for 1 h and then heated to inactivate XbaI. R-loop-forming reactions described in the present study used radiolabelled RNA molecules comprising sequences of E. coli CRISPR-1 (crRNA) that were complementary to target DNA. Target DNA was generated by cloning fragments of the E. coli CRISPR-1 locus into pBluescript or pT7-7. Details of how these substrates were generated are listed in the Supplementary Experimental section.
RNA substrates (crRNA) were generated in the present study by in vitro transcription, and their DNA templates are summarized in Table 1 and detailed in Supplementary Table S2. RNA substrates ranged from 250 to 900 nucleotides in length, comparable with the 600-nt crRNA expressed with Cas3 and Cascade during anti-viral defence described previously .
R-loop reactions were in 20 μl volumes for the times stated in the Figures, at 37 °C for E. coli Cas3 or 44 °C for Cas3 from the moderately thermophilic archaeon Mth. Standard reactions contained Cas3 protein, 5 mM magnesium chloride, 5 mM DTT, 7% glycerol, plasmid DNA (300 ng) and radiolabelled RNA (25 ng) at pH 7.5 in Tris/HCl and RNase-free water. Reactions were terminated by the addition of proteinase K (5 μl of 10 mg/ml stock) and EDTA (5 mM). Reactions were loaded directly on to 1.2% agarose gels for electrophoresis in Tris/borate/EDTA buffer (45 mM Tris/borate and 1 mM EDTA) for approximately 3 h at 50 V. Gels were dried on a heated bed with vacuum suction and laid down for phosphorimaging on a Fuji FLA3000 machine.
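Reaction components above are given by mass, whereas Cas3 is titrated in nM, so comparing protein with nucleic acid requires converting mass into molar concentration. The sketch below shows one way to do this. The average masses per base pair and per nucleotide are common approximations, the 3200 bp plasmid length is a purely hypothetical placeholder (the plasmid size is not stated here), and the 269-nt RNA length is the transcript used in the Results.

```python
# Common approximations, not values taken from the paper.
AVG_MASS_PER_BP_DSDNA = 650.0  # g/mol per base pair of duplex DNA
AVG_MASS_PER_NT_SSRNA = 340.0  # g/mol per nucleotide of single-stranded RNA


def nanomolar(mass_ng: float, length: int, mass_per_unit: float,
              volume_ul: float) -> float:
    """Convert a nucleic acid mass (ng) in a reaction volume (ul) into nM."""
    if length <= 0 or volume_ul <= 0:
        raise ValueError("length and volume must be positive")
    moles = (mass_ng * 1e-9) / (length * mass_per_unit)  # grams / (g/mol)
    litres = volume_ul * 1e-6
    return moles / litres * 1e9


if __name__ == "__main__":
    reaction_volume_ul = 20.0

    # 25 ng of a 269-nt RNA, as used in the annealing reactions.
    rna_nm = nanomolar(25.0, 269, AVG_MASS_PER_NT_SSRNA, reaction_volume_ul)

    # 300 ng of plasmid; 3200 bp is a hypothetical length for illustration.
    plasmid_nm = nanomolar(300.0, 3200, AVG_MASS_PER_BP_DSDNA, reaction_volume_ul)

    print(f"RNA: {rna_nm:.1f} nM, plasmid: {plasmid_nm:.1f} nM")
```

Under these assumptions both nucleic acids are present at low nanomolar concentrations, so the Cas3 concentrations used in titrations are in molar excess over RNA and plasmid.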
R-loops were recovered from de-proteinized Cas3 reactions by precipitation at −20 °C for at least 2 h in ammonium acetate (1 M) and isopropyl alcohol (50%, v/v), using glycogen (1 μg/μl) as a carrier. The nucleic acid pellet was resuspended in buffer for RNase H digestion (New England Biolabs) or helicase buffer (5 mM MgCl2, 5 mM ATP and 10 mM Tris/HCl, pH 7.5) for dissociation by RecG. RNase H and RecG reactions were for 30 min at 37 °C.
In vivo phage protection assays
We used E. coli strain MG1655, and a modified version of the method described previously for E. coli strain W3110, in which crRNA, Cas3 and Cascade were co-expressed from three plasmids. crRNA was expressed from pWUR478 (a gift from Professor John van der Oost, Laboratory of Microbiology, University of Wageningen, Wageningen, The Netherlands). cas3 or K78L cas3 and the cascade operon were cloned in the present study into pRSF-1b (pEB547 and pEB548) or pET-Duet (pEB550) respectively. Cloning procedures for these plasmids are detailed in the Supplementary Experimental section with primer details given in Supplementary Table S1.
For plaque assays, BL21 DE3 cells, which lack CRISPR/cas, were transformed with pEB550, pWUR478 and pEB547 or pEB548. Overnight cultures were inoculated in TB broth containing maltose (0.02%), ampicillin (50 μg/ml), kanamycin (30 μg/ml) and chloramphenicol (15 μg/ml). Cultures were grown to D=0.5 and then 500 μl of each culture was mixed with 3 ml of 0.6% LB (Luria–Bertani) agar before pouring over LB agar plates containing each of the three aforementioned antibiotics and either IPTG (0.6 mM) or no IPTG as indicated. Serial dilutions (10⁻² to 10⁻⁷) of stock phage λvir were applied as 10 μl spots to the soft agar top and incubated at 30 °C, as shown in Figure 4(C).
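Spot assays of this kind are usually interpreted by converting plaque counts at a countable dilution into a titre and comparing titres between conditions. The sketch below illustrates that arithmetic for 10 μl spots. The plaque counts are hypothetical numbers chosen for illustration only and are not data from this study.

```python
import math


def titre_pfu_per_ml(plaques: int, dilution: float,
                     spot_volume_ul: float = 10.0) -> float:
    """Plaque-forming units per ml of undiluted stock from one spot."""
    if dilution <= 0 or spot_volume_ul <= 0:
        raise ValueError("dilution and spot volume must be positive")
    volume_ml = spot_volume_ul / 1000.0
    return plaques / (dilution * volume_ml)


def fold_reduction(titre_control: float, titre_test: float) -> float:
    """How many times lower the apparent titre is on the test lawn."""
    if titre_test <= 0:
        raise ValueError("test titre must be positive")
    return titre_control / titre_test


if __name__ == "__main__":
    # Hypothetical counts: uninduced lawn counted at 10^-6,
    # IPTG-induced lawn counted at 10^-3.
    uninduced = titre_pfu_per_ml(plaques=12, dilution=1e-6)
    induced = titre_pfu_per_ml(plaques=15, dilution=1e-3)
    print(f"uninduced: {uninduced:.2e} pfu/ml")
    print(f"induced:   {induced:.2e} pfu/ml")
    print(f"fold reduction: {fold_reduction(uninduced, induced):.0f}")
    assert math.isfinite(uninduced) and math.isfinite(induced)
```

Counting at the most dilute spot that still shows discrete plaques keeps the estimate within the linear range of the assay.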
Malachite Green ATPase assays
ATP hydrolysis to phosphate by Cas3 and Cas3 K320L was measured by absorbance at 660 nm in Malachite Green reporter assays and compared with reactions containing no protein. For Cas3, the magnesium chloride concentration was kept constant at 5 mM and ATP (or other NTP) was titrated from 1 to 10 mM. Figure 5(A) shows optimal ATPase activity against time arising from Cas3, in 4 mM ATP:2 mM Mg2+ at 37 °C. A phosphate standard curve was used to measure nmol of phosphate produced per nmol of Cas3.
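Converting absorbance at 660 nm into nmol of phosphate per nmol of Cas3 involves fitting the phosphate standard curve, subtracting the no-protein reaction and normalising to the amount of enzyme. The sketch below shows one plausible way to carry out that calculation with an ordinary least-squares line. All absorbance values and the enzyme amount are hypothetical, and the exact procedure used in the study may differ.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("standard amounts must not all be identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phosphate_per_enzyme(a660_sample: float, a660_no_protein: float,
                         slope: float, enzyme_nmol: float) -> float:
    """nmol phosphate released per nmol enzyme, corrected for the blank.

    The intercept of the standard curve cancels when the no-protein
    reaction is subtracted, so only the slope is needed here.
    """
    net_phosphate_nmol = (a660_sample - a660_no_protein) / slope
    return net_phosphate_nmol / enzyme_nmol


if __name__ == "__main__":
    # Hypothetical standard curve: nmol phosphate against A660.
    standards_nmol = [0.0, 1.0, 2.0, 4.0, 8.0]
    standards_a660 = [0.05, 0.15, 0.25, 0.45, 0.85]
    slope, intercept = fit_line(standards_nmol, standards_a660)

    # Hypothetical sample, no-protein control and enzyme amount.
    turnover = phosphate_per_enzyme(0.65, 0.10, slope, enzyme_nmol=0.02)
    print(f"slope={slope:.3f} A660/nmol, intercept={intercept:.3f}")
    print(f"{turnover:.0f} nmol phosphate per nmol enzyme")
```

Samples whose absorbance falls outside the range of the standards should be diluted and re-measured rather than extrapolated.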
Helicase and nuclease assays
R-loop helicase reactions were performed at 37 °C for E. coli Cas3 or 44 °C for Mth Cas3 in a 20 μl final reaction volume. Standard reactions contained the indicated Cas3 protein, 5 mM magnesium chloride, 5 mM DTT, 7% glycerol and R-loop substrate (5 ng). Reactions contained either 5 mM ATP or 5 mM ATP[S] (adenosine 5′-[γ-thio]triphosphate) and were performed at pH 7.5 in Tris/HCl, and were stopped by the addition of proteinase K and EDTA before being loaded on to 10% acrylamide TBE gels.
The reactions for nuclease digestion by E. coli Cas3 contained either 32P end-labelled ssDNA or dsDNA (5 ng), or M13 or φX174 circular ssDNA (200 ng), as indicated in the Supplementary Figures. The reactions contained MgCl2 from 1–10 mM and were incubated at 37 °C for 10 min to 3 h. For short linear duplex substrates (Table 1), products of exonuclease or endonuclease digestion were detected by deproteinized reactions running on 15% acrylamide TBE gels and phosphorimaging. For nicking reactions, products were mixed with formamide-loading dye, heated at 75 °C for 5 min and loaded on to 15% acrylamide TBE gels containing 4 M urea.
E. coli Cas3 promotes ATP-independent annealing of complementary nucleic acids
Cas3 encoded within CRISPR-1/Cas of E. coli K-12 MG1655 (Figure 1A) has amino acid sequence motifs characteristic of HD-superfamily proteins  and ATPases of superfamily 2 (SF2) helicases (Figure 1B). E. coli Cas3 has been implicated as an effector of CRISPR protection against λ phage by its actions with the Cascade protein complex (encoded by casA–E, Figure 1A) . We investigated the biochemical activities of purified E. coli Cas3 fused at its N-terminus to E. coli MBP (MBP–Cas3, 143 kDa; Figure 1C). Fusion of Cas3 to MBP overcame significant problems encountered when trying to generate stable Cas3 amenable to biochemistry.
ATP-dependent displacement of 20–22 nt oligonucleotides paired to M13 ssDNA has been described previously for Cas3 from S. thermophilus . In the present study, we confirm a helicase activity for Cas3 using the E. coli enzyme on R-loop substrates, described below (see also Figure 5). In addition, we observed that Cas3 promoted ATP-independent annealing of a 70 nt DNA strand to complementary sequence in φX174 ssDNA (Supplementary Figure S1A at http://www.BiochemJ.org/bj/439/bj4390085add.htm). In these reactions, Cas3 could anneal 70% of free radiolabelled oligonucleotide compared with 13% spontaneous annealing in the absence of Cas3 (Supplementary Figure S1B). One interpretation of these observations is that E. coli Cas3 has biochemical activities additional to ATP-dependent helicase unwinding. Annealing by Cas3 bore similarities to that reported for other helicases, most notably those of the RecQ family [40–43]. We wanted to test this activity of Cas3 in more detail, using CRISPR RNA and duplex DNA substrates, which may be more relevant to the function of E. coli CRISPR/Cas than M13 ssDNA or φX174 ssDNA.
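Percentages such as those quoted above are typically derived from phosphorimager band intensities, as the fraction of total radiolabel in each lane that migrates as annealed product. The sketch below illustrates that calculation and the resulting fold stimulation over spontaneous annealing. It assumes background-subtracted intensities and a simple product/(product + free) definition, neither of which is specified in the text, and the example intensities are hypothetical.

```python
def percent_annealed(product_signal: float, free_signal: float) -> float:
    """Percentage of total lane signal present in the annealed product."""
    total = product_signal + free_signal
    if product_signal < 0 or free_signal < 0 or total <= 0:
        raise ValueError("signals must be non-negative with a positive total")
    return 100.0 * product_signal / total


def fold_over_spontaneous(with_protein_pct: float,
                          without_protein_pct: float) -> float:
    """Ratio of annealing with protein to spontaneous annealing."""
    if without_protein_pct <= 0:
        raise ValueError("spontaneous annealing must be above zero")
    return with_protein_pct / without_protein_pct


if __name__ == "__main__":
    # Hypothetical background-subtracted band intensities (arbitrary units).
    plus_cas3 = percent_annealed(product_signal=700.0, free_signal=300.0)
    minus_cas3 = percent_annealed(product_signal=130.0, free_signal=870.0)
    print(f"+Cas3: {plus_cas3:.0f}%  -Cas3: {minus_cas3:.0f}%")
    print(f"fold stimulation: {fold_over_spontaneous(plus_cas3, minus_cas3):.1f}")
```

With the reported values of 70% and 13%, this definition corresponds to roughly a fivefold stimulation of annealing by Cas3.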
Cas3 annealing activity base pairs RNA into complementary duplex DNA
We investigated whether E. coli Cas3 promoted ATP-independent annealing of RNA into complementary sequence within duplex DNA. To monitor these reactions, radiolabelled RNA was generated by in vitro transcription of E. coli CRISPR-1 fragments cloned into pBluescript, summarized in Table 1. In one set of reactions, Cas3 (50–800 nM) was incubated for 45 min in buffered magnesium chloride (5 mM) with 269-nt RNA (25 ng) that was 100% complementary in sequence to part of the uncut plasmid DNA (pJLH14, 250 ng). Reactions were deproteinized and products were separated by electrophoresis to detect positions of radiolabelled RNA by phosphorimaging. Two products containing RNA migrated much more slowly than free RNA, their formation corresponding to increasing Cas3 (Figure 2A, lanes 4–8). Using a fixed Cas3 concentration converted approximately 20% of free RNA into product in 45 min (Figure 2B, but see also Figure 3). Products were consistent with Cas3 promoting association of RNA with plasmid DNA in a joint molecule that was stable during migration through agarose gels. This observation was supported by comparing positions of ethidium bromide-stained plasmid DNA with radiolabelled RNA in agarose gels of identical de-proteinized reaction products (Figure 2C). Plasmid DNA in reactions lacking Cas3 (Figure 2C, lane 1) or with Cas3 added (Figure 2C, lane 2) migrated as two species, but DNA in the Cas3 reaction was slower migrating than when Cas3 was absent. This slower migrating DNA (Figure 2C, lane 2) corresponded exactly to the position of radiolabelled RNA from an identical reaction imaged to detect RNA (Figure 2C, lane 4). We concluded that Cas3 promoted association of RNA with plasmid DNA.
Control reactions were used to ensure that formation of the observed RNA–DNA products was specific to Cas3. Reactions containing ‘mock’ proteins, MBP, Hel308 helicase or Cas3 that had been pre-heated to 85 °C showed no detectable RNA–DNA product (Figure 2D). Further support for ATP-independent RNA–DNA products of Cas3 was obtained by substituting E. coli Cas3 with Cas3 purified from the archaeon Mth (Supplementary Figures S2A–S2C at http://www.BiochemJ.org/bj/439/bj4390085add.htm). Mth Cas3 also promoted formation of the RNA–DNA product (Supplementary Figure S2C), indicating that this Cas3 activity is evolutionarily conserved.
Cas3 also promoted formation of RNA–DNA products in reactions containing RNA mixed with uncut plasmid DNA molecules in which sequence complementarity was 60 nt (Supplementary Figure S3A, lanes 1–8 at http://www.BiochemJ.org/bj/439/bj4390085add.htm) or 1000 nt (Supplementary Figure S3A lanes 9–14). In all of these assays, RNA was generated by in vitro transcription to generate sense strand (+) RNA from cloned E. coli CRISPR-1 fragments, as described in Table 1. Cas3 also promoted RNA–DNA product formation from RNA sequences that were antisense (−) CRISPR-1 RNA, or from RNA sequences unrelated to CRISPR-1 (Supplementary Figure S3B), indicating that Cas3-promoted ATP-independent annealing of RNA to DNA was independent of nucleic acid sequence or RNA structure specificity.
Cas3 anneals RNA into duplex DNA to form R-loops, and is most efficient when target duplex DNA is linearized
When uncut plasmid was used as a target for annealing complementary RNA by Cas3, we observed approximately 20% of free RNA converted into product. This activity was improved 3–6-fold if plasmid was linearized prior to mixing with RNA and Cas3, measured with fixed Cas3 concentrations (Figures 3A and 3B). A single reaction product was observed from linearized plasmid DNA compared with two from uncut plasmid (Figure 3B, compare lanes 4 and 8). We also observed a new product containing radiolabelled RNA that migrated close to free RNA (Figure 3B, lane 6, labelled Z).
We next asked whether the RNA–DNA product promoted by Cas3 was an R-loop: an RNA strand invaded into duplex DNA by base pairing to complementary sequences, displacing a ‘loop’ of ssDNA. R-loops are stable, consistent with the observed migratory properties of Cas3 reaction products. We assessed directly whether RNA–DNA products of Cas3 reactions might be R-loops using RNase H and RecG, two bacterial enzymes that degrade or dissociate R-loops [44,45]. Cas3 reactions set up as in Figure 3(B) were deproteinized and nucleic acid products were recovered by precipitation. The precipitation step was required because we observed RNase H to be inactive when added directly to Cas3 R-loop reactions. RNA–DNA product could be recovered by precipitation (Figure 3C, lane 2), and this product was partially dissociated by RecG (in the presence of ATP), re-generating free radiolabelled RNA (Figure 3C, lane 3). Most significantly, products of Cas3 reactions were substrates for RNase H, which efficiently degraded RNA within them, consistent with RNA base-paired to complementary DNA (Figure 3C, compare lanes 2 and 4). Taken together, these data are consistent with an activity of Cas3 promoting the formation of hydrogen-bonded base pairs between complementary RNA invaded into duplex DNA as R-loops.
We tested the possibility that Cas3 annealed RNA only to regions of complementary ssDNA that can arise during preparation of plasmid DNA, although this seemed unlikely considering the observed enhancement of product formation by linearizing plasmids shown in Figure 3(B). However, control reactions were treated with exonuclease I to remove ssDNA in the target plasmid DNA substrate, followed by incubation of this treated DNA with RNA and Cas3. Treatment of plasmid with exonuclease I had no effect on product formation by Cas3 (Supplementary Figure S4A, compare lanes 2 and 3 at http://www.BiochemJ.org/bj/439/bj4390085add.htm). We also examined the possibility that the nuclease activity of E. coli Cas3 could facilitate spontaneous pairing of RNA to ssDNA. Despite efforts using several substrates and a variety of incubation conditions, we were unable to detect exo- or endo-nuclease activity of E. coli Cas3 on plasmid DNA, M13 closed covalent ssDNA or on end-labelled model DNA substrates analysed on neutral or denaturing gels (summarized in Supplementary Figure S4B–S4D). This indicated that Cas3 could promote the ATP-independent annealing of RNA substrates into complementary duplex DNA (R-loop formation) without prior detectable endo- or exo-nuclease processing of DNA.
R-loop formation by E. coli Cas3 requires magnesium and a HD amino acid motif
Cas3 R-loop formation required magnesium or manganese as a co-factor, but was not supported by zinc or in EDTA (Figure 3D). To explore this requirement of Cas3, we focussed on its HD motif (Figure 1B), which binds to metal ions in HD superfamily proteins of diverse functions [46–48]. We purified Cas3 proteins with amino acid substitutions in the HD motif (D75G and K78L, Figure 1B) and measured their ability to promote R-loop formation. Cas3 D75G and K78L were less effective at forming R-loops than wild-type protein in reactions containing either uncut plasmid DNA (Figure 4A, compare lane 2 with lanes 4 and 5) or linearized plasmid (Figure 4B). This indicated the importance of the HD motif to this Cas3 function, and confirmed that a trace contaminant of E. coli Cas3 purification is unlikely to cause the observed R-loop-forming activity. Similar to wild-type Cas3, binding of K78L Cas3 to duplex DNA was detected in EMSAs (electrophoretic mobility-shift assays; Supplementary Figure S4E), indicating that the K78L amino acid substitution had not caused a more general misfolding or structural aberration that might otherwise account for reduced catalysis. Amino acid substitutions to the Cas3 ATPase motif I/Walker A box (K320L and G317S/S318Y, Figure 1B) did not reduce R-loop formation (Figure 4A, lanes 6 and 8). Interestingly, Cas3 protein with combined substitutions in both Lys78 and Lys320 (K78L/K320L) was proficient at forming R-loops (Figure 4A, lane 7), an unexpected observation given reduced levels of R-loop formation catalysed by K78L Cas3.
Mutation of Cas3 HD motif reduces protection of cells against phage λvir in vivo
E. coli cells can be protected against lysis by phage λvir using inducible plasmid-based co-expression of Cas3, Cascade and crRNA engineered with λvir sequence [3,20]. We also observed a protective effect using similar plasmid co-expression of wild-type Cas3, Cascade and crRNA (summarized in Figure 4C). In these assays, serial dilutions of phage λvir suspensions were applied to a lawn of E. coli cells to observe plaque formation (Figure 4C, lane 1) that was reduced 100–1000-fold upon IPTG-induced expression of crRNA, Cas3 and Cascade (Figure 4C, lane 2). Expression of Cas3 K78L HD mutated protein instead of wild-type Cas3 in the same assay (Figure 4C, lanes 3 and 4) led to a 100-fold greater sensitivity to plaque formation. This confirmed the importance of the HD-motif to Cas3 function and its contribution to protection against the phage observed in this assay.
E. coli Cas3 helicase activity dissociates R-loops
Addition of ATP to Cas3 R-loop reactions correlated with a total absence of R-loop product, compared with reactions containing magnesium only (Figure 5A, compare lanes 2–5 with lanes 6–10). This could be explained by ATP-dependent helicase unwinding by Cas3 in dismantling R-loops. To investigate this, we first assayed for the ATPase activity of purified E. coli Cas3. ATP (but not GTP, CTP or TTP) was hydrolysed by Cas3 in the absence of nucleic acids, and pre-incubation of Cas3 with φX174 ssDNA or φX174 dsDNA had no stimulatory effect on ATPase activity (Figure 5B). Hel308, a ssDNA-stimulated ATPase helicase , was used as a positive control for DNA-stimulated ATPase activity. ATP hydrolysis was most effective at a 2:1 molar ratio of ATP/magnesium. Cas3 K320L, mutated in ATPase motif I, had 2–4-fold reduced ATPase activity when compared with wild-type Cas3 at fixed protein concentrations as a function of time (Figure 5C), as expected and in agreement with S. thermophilus Cas3 data .
To test directly whether ATPase activity of Cas3 could drive helicase unwinding of an R-loop, we generated a model substrate from oligonucleotides summarized in Table 1 and detailed in Supplementary Table S2. Incubation of this R-loop with Cas3, ATP and magnesium gave a product (Figure 5D, lanes 2–4) that was barely detectable when ATP was replaced by poorly hydrolysable ATP[S] (lanes 15–17), or if Cas3 was disrupted in ATPase motif II (Walker B/DExH box, H455L; Figure 5D, lanes 8–10). Helicase activity of Cas3 was unaffected by mutation in the HD motif (Figure 5D, lanes 5–7), and was moderately reduced by mutation of glycine and serine residues in ATPase motif I (Walker A, ‘GSSY’; Figure 5D, lanes 11–13). The product formed by Cas3 in these helicase reactions was DNA lacking the ‘invading’ RNA, indicated by comparing its migration through gels with a marker substrate also lacking the RNA strand (Figure 5E). Purified archaeal Mth Cas3 also unwound the R-loop substrate in the same way (Supplementary Figure S2D), indicating that this is a conserved function of Cas3 across domains of life, in addition to ATP-independent R-loop formation. The ability of E. coli Cas3 to dissociate a model R-loop substrate extends the known helicase activity of Cas3, but raises the question of how Cas3 catalyses apparently antagonistic functions of forming and dissociating R-loops, discussed below.
E. coli Cascade complex also promotes R-loop formation
Expression of Cascade was essential for the protection of E. coli cells against phage λvir, described in Figure 4(C) and previously, and a recent study showed that R-loop formation by purified E. coli Cascade complex is independent of ATP. In agreement with this, we observed that a mixture of purified CasC, CasD and CasE proteins, but not individual proteins, promoted formation of R-loop products indistinguishable from those described for Cas3 (Figure 6). This indicated that the crRNA substrates that we transcribed in vitro for the present study are relevant for recognition not just by Cas3, but also by the key CRISPR effector complex Cascade. This is also consistent with the expression of 600-nt crRNA that was used with Cascade and Cas3 to effect in vivo protection of cells against λvir, described previously and also shown in the present study. These partly overlapping functions of Cas3 and Cascade in promoting R-loop formation are discussed below.
Cas3 R-loop formation and the HD domain
E. coli Cas3 and Cascade (CasABCDE) participate in processing crRNA and complementary duplex DNA sequences during interference stages of CRISPR immunity. Models of Cascade and Cas3 function place crRNA processing by Cascade upstream of Cas3. Helicase activity of S. thermophilus Cas3 can unwind duplexes formed from ssRNA (single-stranded RNA) and ssDNA, and Cas3 nuclease activity is proposed to destroy invading DNA [36,49]. Helicase data shown in the present study support a model for Cas3 helicase processing, but go further by showing that Cas3 can dissociate an R-loop substrate of ssRNA invaded into a duplex of DNA, and does so by removing the invaded RNA strand. We also report a novel activity of Cas3: ATP-independent annealing of RNA into duplex DNA to form an R-loop. Helicase and R-loop-forming activities were also observed using Cas3 from the archaeal species Mth, supporting an evolutionarily conserved role for each Cas3 function. R-loops have been implicated in several areas of nucleic acid metabolism, including replication of mitochondrial genomes and T4 phage, recombination, and control of DNA topology. In each case a stable RNA–DNA hybrid is required, formed from duplex DNA and ssRNA, as has been proposed for targeting reactions in CRISPR immunity.
R-loop formation by E. coli Cas3 required magnesium as a co-factor and a functional HD motif. These pre-requisites for annealing and R-loop formation by Cas3 indicate a catalytic mechanism rather than a passive effect of Cas3 binding to complementary nucleic acids and promoting their annealing by simply bringing them into proximity. Since HD domains were first described, they have been characterized as phosphatases and nucleotidases acting on nucleotide substrates, requiring binding to a metal ion co-factor [47,54–56]. HD motifs of Cas3 proteins support different activities: the S. thermophilus Cas3 HD domain was required for robust nuclease degradation on M13 ssDNA substrate, but no nuclease activity was detected on uncut dsDNA plasmid. In contrast, S. solfataricus protein Sso2001, which is homologous with the HD domain of fused HD-helicase Cas3, showed nicking activity on dsDNA and dsRNA (double-stranded RNA) but not ssDNA. These differing nuclease activities suggest remarkable variability in substrate specificities of Cas3 HD motifs from different organisms. Amino acid sequence alignments of predicted HD domains of Sso2001 and Cas3 from S. thermophilus and E. coli do not reveal any obvious differences in conserved residues that are known to be required for catalysis and which might account for these different substrate preferences. It may be that information obtained from atomic structures will illuminate interpretation of substrate preferences that might be controlled by aspects of Cas3 HD domain protein architecture other than individual catalytic residues.
We were unable to detect robust nicking or nuclease activity of E. coli Cas3, despite assays in a wide range of buffers and magnesium concentrations analysed on neutral or denaturing gels using substrates similar to those described previously [36,37]. This suggests that the observed R-loop-forming activity of Cas3, although requiring the HD domain, is independent of nuclease activity, at least that which could be detected in the assays in the present study. Our negative outcomes from Cas3 nuclease assays cannot exclude the possibility that E. coli Cas3 may have cryptic nuclease activity that is activated in vivo by some other factor, or by recognition of a specific DNA/RNA structure. This type of activation has been described for the annealing helicase AH2, and might help to ensure that potentially destructive nuclease activities are not deployed unless strictly targeted. It is unlikely that the N-terminal MBP tag fused to Cas3 impeded nuclease function. Factor Xa digestion of MBP–Cas3 to remove the MBP tag did not reveal a nuclease activity, and in any case R-loop-forming and helicase activities of Cas3 were proficient in the presence of the MBP tag. At this stage there is no understanding of the structure of Cas3 proteins or relative positioning of HD and helicase domains. However, our observation that the negative effect of HD motif mutants on R-loop formation by Cas3 can be reversed by also mutating the nucleotide-binding ATPase motif I (Walker A) raises an intriguing possibility of communication or other cross-talk between the HD and helicase domains of Cas3. This might indicate that metal binding by the HD domain may be needed for activation of R-loop formation via some kind of structural effect on Cas3 protein. However, we emphasize that this is currently speculative and presentation of detailed atomic structures of Cas3 should lead to major insights into domain functioning.
Roles of Cas3 and Cascade
The importance of the HD motif for Cas3 functioning in protection of E. coli cells against plaque formation by λvir was emphasized by reduced protection of cells expressing K78L Cas3 compared with wild-type Cas3. K78L Cas3 showed 4–6-fold reduced R-loop formation in vitro, an observation that is consistent with reduced interference of viral DNA by crRNA in vivo and a corresponding increased sensitivity to infection. However, we emphasize that we are not proposing that R-loop formation during CRISPR immunity is exclusive to Cas3, having been demonstrated for Cascade in work appearing during preparation of the present paper. Consistent with this, we also observed R-loops formed by the Cascade CDE complex using the same substrates as those for Cas3. We propose that Cas3 and Cascade overlap in function in that both promote R-loop formation. Cascade catalyses endonucleolytic processing of pre-crRNA to crRNA that is bound stably by the Cascade complex for targeting to complementary DNA sequence. This is likely to control seeding of R-loop formation, but leaves open the question of how Cas3 gains access to an R-loop for its further helicase and other processing. It is clear that Cas3 can recognize R-loops by its ability to unwind them, and it might be that Cascade primes formation of the R-loop that is presented to Cas3 for extension and stabilization. More research will be needed to unravel whether Cascade physically recruits Cas3 or if Cas3 can independently act on an R-loop without association with Cascade. We also note that an RNA–DNA hybrid (R-loop) formed during interference stages of CRISPR immunity would be a provocative substrate for digestion by RNaseH that might need to be protected for effective targeting, a topic that has not yet been explored and may involve a nucleation or protection role for either Cascade or Cas3.
Finally, we cannot rule out the possibility that R-loop formation by Cas3 is involved in another aspect of Cas3 function that is not directly part of crRNA–DNA interference stages of CRISPR immunity. This might be hinted at by differing transcriptional control of cascade compared with cas3 in E. coli.
Antagonistic roles of Cas3 forming and dissociating similar substrates dependent on ATP status
In line with a helicase activity of Cas3 that removes the RNA strand from an R-loop, we observed that ATP abolished detectable R-loop formation. An important question arising here is how Cas3 can have apparently antagonistic roles in both forming and unwinding R-loops. One possibility is that ATP might be central to Cas3 not only for powering helicase unwinding, but also for some other aspect of Cas3 functioning. Engagement of ATP by Cas3 might trigger a change in oligomeric or conformational state of Cas3, in turn controlling biochemical activities. This effect has been reported for Rad52 and RecQ1 proteins, which also promote annealing of complementary nucleic acids in the absence of ATP, through assembly of higher-order protein oligomers. In the case of RecQ1, these oligomers are dispersed by ATP, leading to DNA helicase activity. This type of ATP effect on Cas3, and possibly its interaction with Cascade proteins, might form an interesting avenue for future research.
Edward Bolt conceived the project and wrote the paper. Experiments were designed and performed by Edward Bolt and Jamieson Howard, and by Ivana Ivančić-Baće and Stephane Delmas for Figures 4 and 5 respectively.
This research was funded by a Biotechnology and Biological Sciences Research Council (BBSRC) Ph.D. studentship (to J.L.H.), by the Croatian Ministry of Science, Education and Sports [grant number 119-1191196-1201 (to I.I.-B.)] and by an award from the Croatian Academy of Sciences and Arts. We are also grateful to The British Council for pump-prime funding of the collaboration between E.L.B. and I.I.-B. in 2008.
We thank Alexandra Hughes and Jamie Webster for support, and Peter McGlynn and Mark Dillingham for useful discussion.
Abbreviations: ATP[S], adenosine 5′-[γ-thio]triphosphate; Cas, CRISPR-associated; Cascade, CasABCDE; CRISPR, clustered regularly interspaced short palindromic repeat; crRNA, CRISPR RNA; dsDNA, double-stranded DNA; DTT, dithiothreitol; IPTG, isopropyl β-D-thiogalactopyranoside; LB, Luria–Bertani; MBP, maltose-binding protein; Mth, Methanothermobacter thermautotrophicus; ssDNA, single-stranded DNA; ssRNA, single-stranded RNA; TBE, Tris/borate/EDTA
- © The Authors Journal compilation © 2011 Biochemical Society