1. Introduction
Among the many biotechnological approaches for improving the properties of agricultural plants, genome editing has the potential to play a key role. Unlike traditional strategies and breeding methods, Cas endonuclease technology provides a fast path to the creation of modified genotypes through site-directed mutagenesis or precise editing of the nucleotide sequences of the respective genes [1] [2]. To date, this technology has made it possible to modify many agronomically important traits in major cultivated crops such as corn, rice, wheat, potato, soybean and sugarcane [3] [4].
Sorghum (Sorghum bicolor (L.) Moench) is one of the most important drought-tolerant cereal crops in the arid regions of the Earth. Due to global warming, the importance of this crop is expected to grow steadily. Sorghum grains do not contain gluten and can serve as a source of protein for people with gluten intolerance, who must follow a gluten-free diet. However, compared to other cereals, sorghum grain has a lower nutritional value, mainly because its grain storage proteins (kafirins) are resistant to protease digestion [5] [6] [7]. The poor digestibility of kafirins, in turn, restricts the access of amylolytic enzymes to starch granules and thus reduces the digestibility of starch and the nutritional value of sorghum grain [8].
Cas endonuclease technology offers a way to address this problem. The targeted induction of mutations in genes encoding different classes of kafirins, including gene knockouts, using genome editing has the potential to significantly improve the digestibility of proteins in sorghum grain and increase its nutritional value. The reduction of kafirin synthesis induces changes in the ultrastructure of endosperm protein bodies and increases their digestibility by proteases [9] [10] [11]. As a further consequence, the proteome of caryopses may be rebalanced via enhanced synthesis of other proteins [10], including those with a higher content of essential amino acids such as lysine [11] [12]. Recently published work on the induction of mutations in the α-kafirin nucleotide sequence has shown the potential of Cas endonuclease technology to improve the nutritional value of sorghum grain [13].
Previous studies have revealed many aspects that have to be considered when generating transformation vectors for plant genome editing using Cas endonucleases [14] [15] [16]. The aim of this work was to create highly efficient vectors, and agrobacterial clones containing these vectors, to mutate the α- and γ-KAFIRIN genes of sorghum. Accordingly, major features of the constructs generated in the present study include the rice U3 promoter and the maize POLYUBIQUITIN 1 (UBI1) promoter to drive gRNA (guide RNA) and cas9 expression, respectively. Further, the Phosphinothricin acetyltransferase (Bar) gene of Streptomyces hygroscopicus, equipped with an intron to prevent agrobacterial expression and driven by the maize UBI1 promoter, was used as the plant selectable marker.
2. Materials and Methods
pSH121 (NCBI: txid2338066) (Figure 1(a)) [17] was used as the basic vector for the introduction of target-specific sequences of kafirin-encoding genes upon cleavage with BsaI to complement the gRNA expression units. This vector contains the nucleotide sequence of a maize codon-optimized cas9 gene under control of the maize UBI1 promoter and sites for the SfiI restriction enzyme for the directed transfer of a fragment containing the cas9 and gRNA expression units into a binary vector of the p7i series. As a binary vector from this series, we chose B479p7oUZm-LH (Figure 1(b)) which contains the bar gene and also carries the SfiIA and SfiIB sites compatible with pSH121. This vector was purchased from DNA Cloning Service (https://www.dna-cloning.com/). Bioinformatics analysis of the nucleotide sequences of the pSH121 and B479p7oUZm-LH vectors was performed using the SnapGene Viewer software.
Figure 1. Vectors pSH121 (a) and B479p7oUZm-LH (b) used in this work.
The genomic sequences of the α- and γ-KAFIRIN genes were taken from the site https://phytozome.jgi.doe.gov (α-KAFIRIN (k1C5): Sobic.005G193100, Chr05: 67654898 … 67655764; γ-KAFIRIN (gKAF1): Sobic.002G211700, Chr02: 60423442 … 60424313). The selection of target motifs was carried out using the online tools CRISPOR (http://crispor.tefor.net/) and CHOPCHOP (https://chopchop.cbu.uib.no/) [18] [19].
For molecular cloning, conventional techniques were used unless specified otherwise [20]. The restriction endonucleases Eco31I, MluI and SfiI were purchased from Thermo Scientific. The restriction endonuclease SfiI is unusual in that it recognizes a 13-nucleotide site and forms sticky ends, which is particularly useful for transferring DNA fragments in a directed fashion. Fractionation of linearized plasmid DNA was carried out in agarose gel in 1x TAE buffer. Subsequent purification of DNA was performed using the ISOLATE II PCR and Gel Kit (BIOLINE) along with Quantum Prep™ Freeze ’N Squeeze DNA Gel Extraction Spin Columns (Bio-Rad Laboratories). Ligation of targets and plasmids with 5’- and 3’-overhangs was performed using T4 DNA ligase (Thermo Scientific). The created constructs were introduced into E. coli XL-1 Blue bacterial cells. The presence of target-specific inserts was monitored by DNA sequencing on an ABI 3130 genetic analyzer using the OsU3p-F3 sequencing primer GACAGGCGTCTTCTACTGGTGCTAC. To validate the correct assembly of the cloned binary plasmids, restriction endonuclease analysis was performed using the enzymes MluI and SfiI. The created vectors were transferred by electroporation into the A. tumefaciens strain AGL0.
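Why SfiI lends itself to directed transfer can be illustrated with a short Python sketch. It assumes the commonly documented recognition sequence GGCCNNNN^NGGCC (cleavage after the fourth variable base, leaving a 3-nucleotide 3’ overhang made up of variable bases). The function name and both example sequences are hypothetical and are not taken from pSH121 or B479p7oUZm-LH.

```python
import re

# SfiI site: GGCC + 5 arbitrary bases + GGCC (13 nt in total).
# The lookahead lets the search report overlapping sites as well.
SFII_SITE = re.compile(r"(?=(GGCC[ACGT]{5}GGCC))")

def find_sfii_sites(seq):
    """Return (1-based start, site, 3' overhang) for each SfiI site in seq."""
    hits = []
    for m in SFII_SITE.finditer(seq.upper()):
        site = m.group(1)
        # Top strand is cut after base 8, bottom strand after base 5,
        # so bases 6-8 of the site form the 3' overhang.
        overhang = site[5:8]
        hits.append((m.start() + 1, site, overhang))
    return hits

# Hypothetical sequences: two sites that differ only in their variable bases
site_a = find_sfii_sites("AAAGGCCATTGAGGCCTTT")
site_b = find_sfii_sites("AAAGGCCTCAGTGGCCTTT")
print(site_a)
print(site_b)
```

Because the overhang consists of variable bases, two SfiI sites with different inner sequences produce non-identical sticky ends, so a fragment flanked by them can be ligated in only one orientation.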
3. Results and Discussion
Transformation vectors for site-directed mutagenesis of kafirin genes were created by the following steps:
1) Retrieve kafirin gene sequences from databases and select target motifs within their coding sequences.
2) Clone the target-specific parts of the gRNAs into the generic vector pSH121.
3) Verify the cloned DNA targets by sequencing.
4) Subclone a fragment containing the cas9 and gRNA expression units into the generic binary vector B479p7oUZm-LH.
5) Perform restriction endonuclease analysis to confirm the correct generation of vectors.
The genetic maps of the pSH121 and B479p7oUZm-LH vectors used in this study are shown in Figure 1.
3.1. Bioinformatics Analysis and Oligonucleotide Design for the gRNA Expression Units
Signal sequences play an important role in the packaging of kafirins into protein bodies and, consequently, in the accumulation of storage proteins in sorghum grain. For example, a single nucleotide substitution (G → A) at position 61 relative to the first nucleotide of the start codon of the α-KAFIRIN gene distinguishes the hdhl mutant, which has a high digestibility of kafirins and a high lysine content, from other sorghum varieties [21]. This missense mutation results in the amino acid threonine (Thr) instead of alanine (Ala) at the last position of the signal peptide. This mutation is thought to render the protein resistant to processing and to trigger the unfolded protein response (UPR) and the formation of irregular protein bodies [21]. Therefore, we chose the nucleotide sequences encoding the signal peptides of α- and γ-kafirins as target motifs for the RNA-guided Cas9 used in this study.
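The effect of the G → A substitution at position 61 can be checked with a minimal Python sketch. It uses the 63 bp α-KAFIRIN signal sequence quoted in the note to Table 1 and the standard genetic code; the helper names are illustrative, and the mutant sequence is constructed here only for demonstration.

```python
# Standard genetic code, built from the conventional TCAG ordering
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate a coding sequence codon by codon (length must be a multiple of 3)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))

# 63 bp signal sequence of alpha-KAFIRIN (k1C5) as given in the note to Table 1
wild_type = "ATGGCTACCAAGATATTTGTCCTCCTTGCGCTCCTTGCTCTTTCAGTGAGCACAACAACTGCA"

# Position 61 (1-based) is index 60; replace G with A
mutant = wild_type[:60] + "A" + wild_type[61:]

print(translate(wild_type))  # signal peptide of the wild type
print(translate(mutant))     # signal peptide carrying the substitution
```

The last codon changes from GCA to ACA, so the final residue of the 21-amino-acid signal peptide changes from Ala to Thr.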
Analysis of the 63 bp signal sequence of α-kafirin with the CRISPOR and CHOPCHOP online tools identified four target motifs, from which the two with the best features, such as specificity score, predicted efficiency, frequency of out-of-frame mutations and number of off-targets, were selected (Table 1). The same procedure was followed for the 57 bp signal sequence of γ-kafirin, which revealed five target motifs, from which another two were selected (Table 2). The results provided by the two platforms were very similar; therefore, only the data delivered by the CRISPOR tool are shown here.
The nucleotides of the signal sequences of α-KAFIRIN (k1C5) and γ-KAFIRIN (gKAF1) genes with the location of target sites are shown in the scheme (Figure 2).
Table 1. Selection of target motifs within the signal peptide-encoding sequence of the α-KAFIRIN gene using the CRISPOR online tool.
The 63 bp input sequence ATGGCTACCAAGATATTTGTCCTCCTTGCGCTCCTTGCTCTTTCAGTGAGCACAACAACTGCA was used from Sorghum bicolor (pz9Sbicolor), chromosome_5:58133820-58133882, reverse genomic strand. It contains four possible target motifs. Expected cleavage positions are located −4 to −3 bp upstream of the Cas9-bound triplet (PAM).
Table 2. Selection of target motifs within the signal peptide-encoding sequence of the γ-KAFIRIN gene using the CRISPOR online tool.
The 57 bp input sequence ATGAAGGTGTTGCTCGTTGCCCTCGCTCTCCTGGCTCTCGCGGCGAGCGCCGCCTCC was used from Sorghum bicolor (pz9Sbicolor), chromosome_2:60425298-60425354, forward genomic strand. It contains five possible target motifs. Expected cleavage positions are located −4 to −3 bp upstream of the Cas9-bound triplet (PAM).
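The enumeration of candidate target motifs in the two input sequences can be approximated with a short Python sketch. It only looks for NGG PAMs on both strands whose 20-nt protospacer lies entirely within the input; it is not the CRISPOR implementation and computes none of the specificity or efficiency scores. The function names and the position convention (1-based coordinate of the leftmost PAM base on the forward strand) are assumptions made for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_targets(seq, guide_len=20):
    """List (strand, position, protospacer, PAM) for every NGG PAM whose
    protospacer fits inside seq. Position is the 1-based forward-strand
    coordinate of the leftmost PAM base (an assumed convention)."""
    n = len(seq)
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i in range(n - guide_len - 3 + 1):
            pam = s[i + guide_len:i + guide_len + 3]
            if pam[1:] != "GG":
                continue
            if strand == "+":
                pos = i + guide_len + 1
            else:
                # convert reverse-complement coordinates back to the forward strand
                pos = n - (i + guide_len + 3) + 1
            hits.append((strand, pos, s[i:i + guide_len], pam))
    return sorted(hits, key=lambda h: h[1])

alpha = "ATGGCTACCAAGATATTTGTCCTCCTTGCGCTCCTTGCTCTTTCAGTGAGCACAACAACTGCA"
gamma = "ATGAAGGTGTTGCTCGTTGCCCTCGCTCTCCTGGCTCTCGCGGCGAGCGCCGCCTCC"

for name, seq in (("alpha", alpha), ("gamma", gamma)):
    for strand, pos, guide, pam in find_targets(seq):
        print(name, strand, pos, guide, pam)
```

For the two sequences above, this scan returns four and five candidates, respectively, in line with the counts stated in the notes to Tables 1 and 2. Under the position convention assumed here, the candidates include positions 8 and 33 for α-KAFIRIN and 32 and 41 for γ-KAFIRIN, which appear to coincide with the motif labels α8, α33, γ32 and γ41.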
Figure 2. The nucleotides of signal sequences (highlighted in blue) of α-KAFIRIN (k1C5) ((a), (b)) and γ-KAFIRIN (gKAF1) ((c), (d)) genes with the location of PAM sites (underlined with a solid line) and selected target motifs (underlined with a dotted line).
According to the chosen target motifs, oligonucleotides were designed for subsequent cloning of gRNA/cas9 vectors. The sequences of the oligonucleotides are shown in Table 3.
3.2. Design and Cloning of gRNA/Cas9 Vectors
Canonical target motifs for U3 promoter-driven guide RNAs and Cas9 have the generic sequence AN19NGG (encompassing the target motif-specific part of the gRNA and the PAM (protospacer adjacent motif)). For efficient transcription of the gRNA under the control of the RNA polymerase III-dependent OsU3 promoter, an A was used as an additional 5’-terminal nucleotide in all gRNAs, because suitable target motifs that themselves start with an A are not available in the targeted gene regions.
Table 3. Targets of kafirin genes used in the work.
The principles of cloning target-specific derivatives of vector pSH121 are shown in Figure 3. The sequences of the generic and derived vectors pSH121 differ in size (12,396 bp and 12,199 bp, respectively). The design of forward (single strand) oligonucleotides was as follows: 5’-TGGCA (or G) N2-20-3’. The design of reverse single strand oligonucleotides was accordingly as follows: 5’-AAAC (complementary to N20-2) T (or C)-3’. A double-stranded nucleotide fragment for integration in pSH121 can be easily created by annealing these two complementary single-stranded oligonucleotides (see Table 3). The double-stranded insert fragment has sticky ends compatible with the BsaI-created DNA-ends of the linearized vector pSH121.
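The oligonucleotide design rule described above can be sketched in Python as follows. The sketch follows the stated formulas literally (forward: TGGC, then A or G, then N2-20; reverse: AAAC, then the reverse complement of that 20-nt stretch, ending in T or C). Whether the leading A replaces or precedes the first protospacer base is treated here as replacement, which is an assumption. The function names and the example protospacer are hypothetical placeholders, not the sequences of Table 3.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def design_oligos(protospacer, lead="A"):
    """Return (forward, reverse) oligonucleotides for cloning into BsaI-cut pSH121.

    Assumption: the first base of the 20-nt protospacer is replaced by `lead`
    (A or G), following the pattern 5'-TGGC-A/G-N2-20-3'.
    """
    if len(protospacer) != 20 or lead not in ("A", "G"):
        raise ValueError("expected a 20-nt protospacer and lead 'A' or 'G'")
    core = lead + protospacer[1:]      # A/G followed by N2..N20
    forward = "TGGC" + core            # 5' overhang TGGC after annealing
    reverse = "AAAC" + revcomp(core)   # 5' overhang AAAC; ends in T or C
    return forward, reverse

# Hypothetical 20-nt protospacer used only to demonstrate the rule
fwd, rev = design_oligos("GCTAGCTAGGATCCATCGAT")
print("F:", fwd)
print("R:", rev)
```

After annealing, the 20-nt cores of the two oligonucleotides pair with each other, while the four 5’-terminal bases of each strand (TGGC and AAAC) remain single-stranded and provide the overhangs for ligation into the linearized vector.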
The cloning protocol included the following steps.
1) Plasmid pSH121 was digested with the BsaI (Eco31I) restriction enzyme to allow for the insertion of the target-specific insert. The resulting BsaI fragments of 1227 bp (SpecR) and 10,972 bp were separated on a 1% agarose gel. The latter fragment was isolated and purified from the gel.
2) The assembly of the target-specific double-stranded (ds) oligonucleotide was performed by heating a mixture of an equimolar amount of each of the single-stranded F and R oligonucleotides followed by their annealing via slow cooling.
3) The assembled ds oligonucleotide was ligated using compatible overhangs with the 10,972 bp BsaI (Eco31I)-fragment of plasmid pSH121.
4) The ligation products were transformed into competent E. coli cells, which were then grown and selected on LB medium with kanamycin. The plasmids isolated from the selected colonies were cleaved using endonuclease MluI and then sequenced to confirm the presence of the insert.
5) Upon digestion using SfiI, the fragment containing the expression units for gRNA and cas9 was ligated with the SfiI-linearized vector B479p7oUZm-LH, thereby combining all functional elements and both borders of the T-DNA. The resultant binary vector also carries a bacterial selectable marker gene conferring resistance to streptomycin and spectinomycin.
The correct insertion of the target-specific parts of the gRNA into pSH121 was verified by Sanger sequencing using the OsU3p-F3 sequencing primer GACAGGCGTCTTCTACTGGTGCTAC as shown in Figures 4-7.
Figure 3. Workflow of inserting target gene-specific fragments into the generic vector pSH121.
Figure 4. Confirmation by Sanger sequencing of the correct insertion of the target motif #1 (α8)-specific part (indicated by black background) into the gRNA expression unit of pSH121 for the targeted mutagenesis of the α-KAFIRIN gene.
Figure 5. Confirmation by Sanger sequencing of the correct insertion of the target motif #2 (α33)-specific part (indicated by black background) into the gRNA expression unit of pSH121 for the targeted mutagenesis of the α-KAFIRIN gene.
Figure 6. Confirmation by Sanger sequencing of the correct insertion of the target motif #1 (γ32)-specific part (indicated by black background) into the gRNA expression unit of pSH121 for the targeted mutagenesis of the γ-KAFIRIN gene.
3.3. Restriction Analysis
To verify the successful assembly of the binary vectors, restriction endonuclease analysis was performed using the enzymes MluI and SfiI. The MluI recognition site is unique in pSH121 and absent in B479p7oUZm-LH, while both of the generic vectors pSH121 and B479p7oUZm-LH have two SfiI restriction sites each. Figure 8 shows the digestion of each of the newly created vectors (1C, 2C, 3C, 4C), each of which has a size of 17,846 bp. Whereas MluI produced a single fragment, cleavage with SfiI yielded two fragments whose sizes correspond to the expected values (10,223 bp and 7623 bp).
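The expected banding pattern follows from simple arithmetic on a circular plasmid, which the following Python sketch illustrates. The cut coordinates are hypothetical values chosen only so that the fragment sizes match those reported above; the real site positions in the binary vectors are not given here.

```python
def circular_digest(length, cut_sites):
    """Fragment sizes (largest first) after cutting a circular plasmid
    of the given length at the given coordinates."""
    cuts = sorted(set(cut_sites))
    if not cuts:
        return [length]  # uncut plasmid stays circular
    # fragments between neighbouring cuts ...
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    # ... plus the fragment that spans the origin of the circle
    fragments.append(length - cuts[-1] + cuts[0])
    return sorted(fragments, reverse=True)

VECTOR_SIZE = 17846  # size of the binary vectors 1C-4C, in bp

# Hypothetical coordinates: one MluI site, two SfiI sites 10,223 bp apart
mlui = circular_digest(VECTOR_SIZE, [5000])
sfii = circular_digest(VECTOR_SIZE, [1000, 11223])
print(mlui)
print(sfii)
```

A single cut linearizes the plasmid into one full-length fragment, whereas two cuts give two fragments whose sizes sum to the plasmid size (10,223 bp + 7623 bp = 17,846 bp).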
Figure 7. Confirmation by Sanger sequencing of the correct insertion of the target motif #2 (γ41)-specific part (indicated by black background) into the gRNA expression unit of pSH121 for the targeted mutagenesis of the γ-KAFIRIN gene.
Figure 8. Restriction analysis of newly constructed binary vectors. Lanes 1 - 4: binary vectors 1C, 2C, 3C and 4C after digestion with MluI restriction enzyme; M—DNA fragment length marker (“SibEnzyme” Russia, cat. No. M30 http://russia.sibenzyme.com/info815.php); 5 - 8: binary vectors 1C, 2C, 3C and 4C after digestion with restriction endonuclease SfiI.
Figure 9. Cross sections of kernels set on the panicle of the sorghum plant #2C-1.2.5 carrying a genetic construct for α-KAFIRIN gene editing (target #2C) ((b), (c), (d)); (a): kernel of the original cv. Avans. Bar: 1 mm.
4. Conclusion
It is expected that the population of the Earth will reach 9.6 billion people by the middle of this century. The demand for staple crops will thus increase by up to 60% [22]. To cope with this challenge, a significant improvement of plant breeding and plant production methods is required. In this regard, genome editing is among the most promising approaches [23], with Cas endonucleases currently being the most powerful platform. Using this technology, the improvement of grain quality via targeted mutagenesis of the KAFIRIN genes of sorghum may be achieved in a comparatively short time [24]. The vectors we have created represent an important step towards this goal. One of these vectors, 2C for α-KAFIRIN gene editing, was used to transform sorghum via Agrobacterium (strain AGL0)-mediated DNA transfer to immature embryos of cv. Avans. In these experiments, we obtained four plants (T0 generation) with modified endosperm texture (Figure 9), as would be expected in the case of disturbed synthesis of α-kafirins, and with improved in vitro digestibility of endosperm proteins [11] [12] [13] [21]. The incorporation of the vectors during transformation was confirmed by PCR analysis. Amplification and sequencing of the target regions from the transgenic plants are in progress.
Acknowledgements
The work was funded in part by the Russian Foundation for Basic Research, grant 19-016-00117.