Before genetically modified crops are commercialized, the events carrying the novel DNA must be thoroughly evaluated for agronomic, nutritional, and molecular characteristics. Over the years, Polymerase Chain Reaction-based methods, Southern blot, and short-read sequencing techniques have been used to collect molecular characterization data. Multiple genomic applications are necessary to determine the insert location, analyze the flanking sequences, characterize the inserted DNA, and determine whether any native genes are interrupted. These techniques are time-consuming and labor-intensive, making it difficult to characterize multiple events. Current advances in sequencing technologies are enabling whole-genome sequencing of modified crops to obtain full molecular characterization. However, in polyploids, such as the tetraploid potato, it is a challenge to obtain the whole-genome sequencing coverage needed for regulatory approval of the genetic modification. Here we describe an alternative to labor-intensive applications: a novel procedure using Samplix Xdrop® enrichment technology and next-generation Nanopore sequencing technology to more efficiently characterize the T-DNA insertions of four genetically modified potato events developed by the Feed the Future Global Biotech Potato Partnership: DIA_MSU_UB015, DIA_MSU_UB255, GRA_MSU_UG234 and GRA_MSU_UG265 (derived from the regionally important varieties Diamant and Granola). Using the Xdrop®/Nanopore technique, we obtained very high sequence read coverage within the T-DNA and junction regions. In three of the four events, we were able to use the data to confirm single T-DNA insertions, identify insert locations, identify flanking sequences, and characterize the inserted T-DNA. We further used the characterization data to identify native gene interruption and confirm the stability of the T-DNA across clonal cycles.
These results demonstrate the functionality of the Xdrop®/Nanopore technique for T-DNA characterization. This research will contribute to meeting regulatory safety and approval requirements for commercialization with smallholder farmers in target countries within our partnership.

Plasmid and T-DNA Materials

The plasmid pSIM4392 was developed by Simplot Plant Sciences (Boise, ID). The genetic elements within the T-DNA are listed in the supplementary data (S2. Table 1). To summarize, pSIM4392 has a T-DNA that contains four cassettes. The first cassette (elements 5-11, S2. Table 1) contains the selectable marker gene nptII, whose expression confers the kanamycin resistance used to select plants containing the T-DNA. The second cassette (elements 13-15, S2. Table 1) contains the Rpi-vnt1 (vnt1) gene from Solanum venturii (Foster, S.J., 2009). The third cassette (elements 17-19, S2. Table 1) contains the Rpi-mcq1 (mcq1) gene from Solanum mochiquense (Aguilera-Galvez, C., 2020). The fourth cassette (elements 21-23, S2. Table 1) contains the Rpi-blb2 (blb2) gene from Solanum bulbocastanum (van der Vossen, E.A., 2005). The gene products from the last three cassettes, VNT1, MCQ1 and BLB2, are R-proteins involved in the plant immune response that protects potato from foliar late blight infection caused by P. infestans (Jones, J.D. and Dangl, J.L., 2006). These genes are in the CC-NB-LRR (coiled-coil, nucleotide-binding, leucine-rich repeat) class of resistance (R) genes (Paluchowska, P., et al., 2022). Each of the three R-gene cassettes is a cisgene expressed under its native promoter and terminator: pVnt1 and tVnt1 for Rpi-vnt1, pMcq1 and tMcq1 for Rpi-mcq1, and pBlb2 and tBlb2 for Rpi-blb2. The sequence of the pSIM4392 plasmid can be found in the Dryad Dataset (Zarka, KA., 2023). A map of the entire pSIM4392 plasmid is shown in Figure 1.

Plant Materials

Potato plant events were produced using Agrobacterium transformation as part of the Global Biotech Potato Partnership and in collaboration with Simplot Plant Sciences (Boise, Idaho). The C58-derived Agrobacterium strain AGL1 (Lazo et al., 1991), carrying pSIM4392, was used to transform potato internode explants following the method described by Richael et al. (2008). A flowchart highlighting the development and selection of lead potato events transformed with the T-DNA in plasmid pSIM4392 is shown in the supplementary material (S1 Figure 2). Transformed internode explants were regenerated on medium containing 150 mg/l kanamycin to select for lines containing a T-DNA insert. The pSIM4392 backbone contains the isopentenyl transferase (ipt) gene. Events expressing the ipt gene show a cytokinin phenotype (stunted growth) or atypical phenotypes such as elongated trichomes or chlorotic leaves (Kunkel et al., 1999). Such events would also have received some or all of the plasmid backbone, and they were eliminated from further analysis. For both the Diamant and the Granola host varieties, around 300 events were advanced to T-DNA copy number analysis. The T-DNA copy number was determined by droplet digital Polymerase Chain Reaction (ddPCR) according to the protocol in Collier et al. (2017). Events with more than one copy were eliminated from further analysis. Internal regions of the T-DNA were tested in the remaining events by Polymerase Chain Reaction (PCR) analysis, and any negative events were eliminated. R-gene function was tested in growth chamber plant pathology bioassays and in field trials (unpublished, Douches, D., 2023). The plant events selected as the lead events and used in this study are DIA_MSU_UB015 and DIA_MSU_UB255 from the host variety Diamant, and GRA_MSU_UG234 and GRA_MSU_UG265 from the host variety Granola. These events will be referred to as UB015, UB255, UG234 and UG265, respectively.
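The copy number step can be illustrated with the standard Poisson correction used in droplet digital PCR. The following is a minimal sketch and not the protocol of Collier et al. (2017): the function names, the droplet counts, and the assumption that the reference assay targets a gene present in four copies per tetraploid genome are all hypothetical and chosen for illustration only.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    # A droplet is negative only if it received zero copies: P(0) = exp(-lambda).
    return -math.log(1.0 - positive / total)


def tdna_copy_number(target_pos: int, target_total: int,
                     ref_pos: int, ref_total: int,
                     ref_copies_per_genome: int = 4) -> float:
    """Estimate T-DNA copies per genome from target and reference assays.

    ref_copies_per_genome=4 is an assumption (one reference locus on each
    of the four homologous chromosomes of a tetraploid).
    """
    ratio = (copies_per_droplet(target_pos, target_total)
             / copies_per_droplet(ref_pos, ref_total))
    return ratio * ref_copies_per_genome


# Hypothetical droplet counts for one event.
estimate = tdna_copy_number(1000, 20000, 3710, 20000)
print(f"Estimated T-DNA copies per genome: {estimate:.2f}")
```

Under these assumptions, an estimate close to 1 would be consistent with a single-copy event, and estimates near 2 or higher would flag events for elimination.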

Genomic DNA Isolation

For PCR analysis and ddPCR analysis, genomic DNA was isolated from leaf tissue using the DNeasy Plant Mini Kit (Cat. No. 69104, Qiagen) according to the manufacturer’s instructions. The Xdrop® enrichment technology method required high molecular weight genomic DNA, which is essential to obtain long sequencing reads that span not only the flanking region of the insert location but also the T-DNA. High molecular weight DNA was isolated from leaf tissue of greenhouse-grown plants using a procedure modified from Saghai-Maroof et al. (1984). Fresh leaf tissue (2 g) was ground with a mortar and pestle in 7 ml of extraction buffer (0.1 M Tris, pH 8.0/1.4 M NaCl/0.02 M EDTA/2% hexadecyltrimethylammonium bromide/1% 2-mercaptoethanol). The ground leaf tissue mixture was filtered through 2 layers of cheesecloth and incubated at 65°C for 30 min with occasional gentle mixing. All subsequent transfers of DNA-containing solutions were done only with universal pipet tips with wide tip openings (USA Scientific, Ocala, FL). An equal volume of chloroform/isoamyl alcohol 24:1 (vol:vol) was added, and the solution was mixed by inversion to form an emulsion that was centrifuged at 3000 rpm for 10 min at room temperature. The aqueous phase was removed, and 2/3 vol of isopropanol was added and mixed by gentle inversion. The precipitated DNA was washed with 1 ml of 70% ethanol and then dissolved in 300 μl of resuspension buffer (10 mM Tris, 1 mM EDTA). The DNA samples were evaluated for DNA size distribution by capillary electrophoresis on a TapeStation™ instrument, using Genomic DNA ScreenTape (Agilent Inc., Santa Clara, CA) according to the manufacturer’s instructions. The DNA samples were then shipped to Samplix (Denmark).

Inserted T-DNA analysis

In polyploids, such as the tetraploid potato, it is difficult to reach the whole-genome sequencing coverage needed to meet regulatory review recommendations. Samplix developed an enrichment instrument technology called Xdrop®, which enables targeted DNA fragments to be encapsulated and enriched so they can be sequenced using next-generation sequencing. As mentioned in the introduction, Blondal et al. (2021) previously described using this technology to identify flanking regions of inserted T-DNA. Here we describe using it to achieve high sequence coverage across the entire T-DNA region of each of the lead 3R-gene late blight resistant events, as well as to identify the flanking regions on either side of the T-DNAs.

A. Xdrop® enrichment technology

The Xdrop® enrichment technology uses the Xdrop® instrument, cartridges, and reagents along with the DNA samples of interest. The workflow includes two parts: 1. Primer design, enrichment and quantification. 2. Droplet Polymerase Chain Reaction (dPCR) generation, sorting of Xdrop® droplets, droplet Multiple Displacement Amplification (dMDA) and evaluation of enrichment. A graphic of the workflow was previously published in Blondal et al. (2021) and is included here in the supplementary material (S1 Figure 1).

1. Primer design, enrichment and quantification

The DNA samples were evaluated for size distribution and quality with the TapeStation™ System (Agilent Technologies Inc.), using Genomic DNA ScreenTape according to the manufacturer’s instructions. Primer sets for enrichment and quantification were designed specifically to detect sequences within the insert. This was done to achieve coverage across the entire T-DNA and to obtain genomic flanking sequence data as well. The primers were tested and successfully used to enrich two Regions of Interest (ROIs) and are listed in the supplementary material (S2. Table 2). The primer set ROI1_8F and ROI1_8R is located within the 3’ region of the Rpi-mcq1 gene. The primer set ROI2_9F and ROI2_9R is located within the 3’ region of the Rpi-vnt1 gene. The highest amount of enrichment occurs in the sequence surrounding the ROIs. Therefore, the ROIs are located near the center of the T-DNA to ensure that high-quality sequence data spanning the entire T-DNA can be obtained after enrichment. The primers described in the supplementary material (S2. Table 2) were evaluated by quantitative polymerase chain reaction (qPCR) as previously described in Blondal et al. (2021).

The DNA samples were purified using the HighPrep™ PCR Clean-up Bead System (MAGBIO Genomics) according to the manufacturer’s instructions, with the following changes: the bead-to-sample ratio was 1:1 (vol:vol), and elution was performed by heating the sample in the elution buffer for 3 minutes at 55 °C before separation on the magnet. The samples were eluted in 20 μl of 10 mM Tris, pH 8. Purified DNA samples were quantified with a Quantus™ Fluorometer (Promega Inc.) according to the manufacturer’s instructions.

2. dPCR Generation, Sorting of Xdrop® droplets, Droplet Multiple Displacement Amplification (dMDA) and Evaluation of Enrichment

The dPCR generation, sorting of Xdrop® droplets and droplet Multiple Displacement Amplification (dMDA) were performed as previously described in Blondal et al. (2021). In short, each DNA sample was compartmentalized by the Xdrop® instrument into millions of double emulsion droplets that also contained dPCR master mix and the ROI primer sets. After droplet production, the DNA within the droplets was subjected to droplet PCR (dPCR) amplification using the enrichment primers described above, to generate droplets carrying the ROI. After the dPCR protocol, the droplets were collected and dyed to generate a fluorescent signal in droplets carrying the ROI. The positive droplet populations were sorted from the negative droplets using a fluorescence-activated cell sorting (FACS) instrument, specifically a SONY benchtop SH800S cell sorter with a 100 μm nozzle (Sony Biotechnology). A more detailed description of the process is presented in the Xdrop® manufacturer instructions. DNA from the positive droplets was released, re-encapsulated into single emulsion droplets by Xdrop®, and subjected to multiple displacement amplification (dMDA) according to the Xdrop® manufacturer instructions. The amplified DNA was isolated and quantified. The resulting enrichment of the ROI targets was evaluated by quantitative Polymerase Chain Reaction (qPCR) according to the Xdrop® manufacturer instructions: the dMDA reactions were diluted in molecular grade H₂O (1:9 vol/vol) and subjected to qPCR using validated qPCR assays. 10 ng of target DNA was used as a control, together with the dMDA controls for background and contamination evaluation. The Enrichment Calculation Tool (https://samplix.com/calculations/per-amountof-genetic-material/actual-enrichment-calculator) was used to calculate the enrichment of the dMDA samples according to the Xdrop® manufacturer instructions. Sample Cycle Threshold (CT) values in Real-time Polymerase Chain Reaction (RT-PCR) and the clusters in the FACS analysis were evaluated according to the Xdrop® manufacturer instructions.
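The idea behind a qPCR-based enrichment estimate can be sketched as follows. This is not the Samplix Enrichment Calculation Tool, whose exact formula is not given here. It is a generic delta-CT calculation under stated assumptions: perfect doubling per cycle (efficiency of 2), a known DNA mass per qPCR reaction for both the unenriched control and the diluted dMDA sample, and hypothetical CT values and function names.

```python
def fold_enrichment(ct_input: float, ng_input: float,
                    ct_enriched: float, ng_enriched: float,
                    efficiency: float = 2.0) -> float:
    """Fold increase in target copies per ng of DNA after enrichment.

    ct_input / ng_input: CT and DNA mass of the unenriched control reaction.
    ct_enriched / ng_enriched: CT and DNA mass of the dMDA sample reaction.
    efficiency=2.0 assumes the target doubles in every PCR cycle.
    """
    if ng_input <= 0 or ng_enriched <= 0:
        raise ValueError("DNA amounts must be positive")
    # Each cycle of earlier detection corresponds to `efficiency` times more target.
    relative_copies = efficiency ** (ct_input - ct_enriched)
    # Normalize to the DNA mass loaded in each reaction.
    return relative_copies * (ng_input / ng_enriched)


# Hypothetical values: 10 ng of control DNA at CT 30, 1 ng of dMDA DNA at CT 22.
print(fold_enrichment(ct_input=30.0, ng_input=10.0,
                      ct_enriched=22.0, ng_enriched=1.0))
```

With these hypothetical values the target is detected 8 cycles earlier from one tenth of the DNA, which corresponds to a 2560-fold enrichment under the stated assumptions.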

B. Nanopore Genomic Sequencing and Bioinformatic Analysis

The Oxford Nanopore MinION sequencing platform was used to generate long-read sequencing data from the dMDA samples according to the manufacturer’s instructions (Premium whole genome amplification protocol (SQK-LSK109) with the Native Barcoding Expansions 1-12 and 13-24 (EXP-NBD104/114)). For each of the events, 1.2 μg of amplified dMDA DNA was treated with T7 Endonuclease I, followed by size selection, end-repair, barcoding, and adaptor ligation using the Oxford Nanopore Technology (ONT) Ligation Sequencing Kit. After library generation, the sample (20 fmol) was loaded onto a GridION R9.4.1 flow cell and run for 16–24 hours under standard conditions as recommended by the manufacturer (Oxford Nanopore Inc.). To obtain sequences with high accuracy, the raw data files (FAST5) were base called using Guppy 5.0.17 with the super high accuracy model and filtering at a quality score of 10.
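The quality 10 filter can be illustrated with a short calculation. The sketch below assumes the read-level score is derived from the mean per-base error probability rather than from the arithmetic mean of the Phred scores, and that quality strings use the usual Phred+33 encoding. The function names and example strings are hypothetical, and this is not Guppy's own code.

```python
import math


def mean_qscore(quality_string: str, offset: int = 33) -> float:
    """Read-level Phred score computed from the mean error probability."""
    if not quality_string:
        raise ValueError("empty quality string")
    # Convert each Phred-encoded character to an error probability.
    probs = [10 ** (-(ord(char) - offset) / 10) for char in quality_string]
    mean_error = sum(probs) / len(probs)
    return -10 * math.log10(mean_error)


def passes_filter(quality_string: str, min_q: float = 10.0) -> bool:
    """Keep a read only if its mean quality reaches the threshold."""
    return mean_qscore(quality_string) >= min_q


# '+' encodes Q10 and '?' encodes Q30: low-quality bases dominate the mean.
print(round(mean_qscore("+?"), 2))
print(passes_filter("5" * 10))   # ten Q20 bases
print(passes_filter("!!!!5"))    # four Q0 bases and one Q20 base
```

A score of 10 corresponds to a mean error probability of 0.1, so under this assumption the filter discards reads whose expected per-base accuracy is below 90%.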

The potato reference genome Solanum tuberosum DM1-3 PGSC v4.04 pseudomolecules was downloaded from http://spuddb.uga.edu/index.shtml (SPUD DB Potato Genomics Research, 2023), and the T-DNA sequence was mapped to this genome with Minimap2. Using the bedtools bamtobed utility, a bed file was created corresponding to the regions of the genome to which the T-DNA mapped. Those regions were masked using the bedtools maskfasta utility to create a new reference genome. The T-DNA-only region of the pSIM4392 sequence was then added as an extra chromosome. All reads were then mapped to the new masked genome containing the T-DNA, using Minimap2 with default settings for Oxford Nanopore reads (-ax map-ont). The Integrative Genomics Viewer (IGV) was used to examine the coverage profile for the T-DNA.
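The reference construction can be sketched in plain Python to make the logic explicit. In the workflow itself this step is done with bedtools bamtobed and bedtools maskfasta. The code below only mimics the effect on in-memory sequences: it assumes BED-style 0-based, half-open intervals and hard masking with N, and the function names, contig names and sequences are hypothetical.

```python
def mask_regions(genome, bed_intervals):
    """Return a copy of the genome with the given intervals replaced by N.

    genome: dict mapping chromosome name to sequence string.
    bed_intervals: iterable of (chrom, start, end), 0-based and half-open.
    """
    masked = {}
    for name, seq in genome.items():
        chars = list(seq)
        for chrom, start, end in bed_intervals:
            if chrom != name:
                continue
            # Clamp the interval to the sequence bounds before masking.
            for i in range(max(start, 0), min(end, len(chars))):
                chars[i] = "N"
        masked[name] = "".join(chars)
    return masked


def build_masked_reference(genome, bed_intervals, tdna_name, tdna_seq):
    """Mask native regions similar to the T-DNA, then add the T-DNA as a contig."""
    if tdna_name in genome:
        raise ValueError("T-DNA contig name collides with a chromosome name")
    reference = mask_regions(genome, bed_intervals)
    reference[tdna_name] = tdna_seq
    return reference


# Toy example: positions 2-4 of chr01 resemble part of the T-DNA.
toy_genome = {"chr01": "ACGTACGTAC", "chr02": "TTTTTT"}
toy_reference = build_masked_reference(toy_genome, [("chr01", 2, 5)], "TDNA", "GGCC")
print(toy_reference)
```

Masking matters here because the R-gene cassettes are cisgenes from Solanum species: without it, reads from the insert could also align to similar native sequences in the reference instead of to the T-DNA contig.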

Reads with primary or supplementary alignments to the T-DNA were extracted using the utilities samtools view and seqkit grep. These are the reads of interest for finding the location of the T-DNA insertions in the genome. Minimap2 was then used to map the extracted reads to the masked genome that includes the T-DNA. Using the samtools view utility, a tag containing the read name was added to the bam file, followed by the bedtools genomecov utility to generate a list of regions with coverage from T-DNA reads in the bam file. IGV was then used to identify the insertion borders. All sequences from each event were compared with the plasmid T-DNA to determine the integrity of the T-DNA sequences for each event. The workflow for characterizing the T-DNA in a GM event after Xdrop® enrichment and ONT sequencing is described in Figure 2. A table of the software, tools and resources used is available in the supplementary material (S2. Table 3).
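The read selection can be sketched by parsing SAM records directly. In the workflow itself this step is done with samtools view and seqkit grep. The code below is only an illustration of the selection criterion: it keeps reads with a primary or supplementary alignment on the T-DNA contig and skips unmapped (flag 0x4) and secondary (flag 0x100) records. The contig name "TDNA", the function name and the example records are hypothetical.

```python
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100


def tdna_read_names(sam_lines, tdna_contig):
    """Collect names of reads with a primary or supplementary hit on the T-DNA."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & FLAG_UNMAPPED or flag & FLAG_SECONDARY:
            continue
        if fields[2] == tdna_contig:
            names.add(fields[0])
    return names


example_sam = [
    "@SQ\tSN:TDNA\tLN:100",
    "read1\t0\tTDNA\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",      # primary
    "read2\t2048\tTDNA\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",   # supplementary
    "read3\t256\tTDNA\t9\t0\t4M\t*\t0\t0\tACGT\tIIII",     # secondary
    "read4\t0\tchr01\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",    # genome only
]
print(sorted(tdna_read_names(example_sam, "TDNA")))
```

Supplementary alignments are kept because a read that spans a junction is typically split into one segment on the genome and one on the T-DNA, and either segment may be the one reported as primary.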

Flanking sequence analysis and Sanger Sequencing

The reads mapping to the construct were mapped back to the genome to identify the sequences that span the breakpoint between insert and genome. These sequences were then used to identify the chromosomal location of the insert at both the left border and the right border of the T-DNA. The identified insert site was confirmed by PCR across the left and right breakpoints between insert and genome, followed by Sanger sequencing. The PCR primers and conditions are found in the supplementary material (S2. Table 4). The confirmatory PCR analysis and Sanger sequencing covered the junction as well as 1000 bp of the flanking regions. Sanger sequencing was conducted by submitting amplicon DNA and primers to the Research Technology Support Facility (RTSF) at Michigan State University.
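How a junction-spanning read points to a breakpoint can be sketched as follows. This is a simplified illustration and not the procedure used in the study, where borders were inspected in IGV and confirmed by PCR and Sanger sequencing. It assumes each read has been split into one genomic and one T-DNA alignment segment with 0-based, half-open coordinates. The Segment type, function names and coordinates are hypothetical.

```python
from collections import Counter, namedtuple

# One aligned piece of a read: reference interval, read interval, and strand.
Segment = namedtuple("Segment", "ref ref_start ref_end query_start query_end strand")


def genomic_breakpoint(genome_seg, tdna_seg):
    """Genomic coordinate where the read switches between genome and T-DNA."""
    genome_first = genome_seg.query_start < tdna_seg.query_start
    forward = genome_seg.strand == "+"
    # The junction lies at the read-internal end of the genomic segment.
    # On the forward strand that is ref_end if the genome comes first in
    # the read and ref_start otherwise; the reverse strand swaps the two.
    if genome_first == forward:
        return genome_seg.ref_end
    return genome_seg.ref_start


def consensus_breakpoint(segment_pairs):
    """Most frequently supported breakpoint across junction-spanning reads."""
    if not segment_pairs:
        raise ValueError("no junction-spanning reads supplied")
    counts = Counter(genomic_breakpoint(g, t) for g, t in segment_pairs)
    return counts.most_common(1)[0][0]


tdna = Segment("TDNA", 0, 300, 500, 800, "+")
reads = [
    (Segment("chr01", 1000, 1500, 0, 500, "+"), tdna),
    (Segment("chr01", 900, 1500, 0, 500, "+"), tdna),
    (Segment("chr01", 1100, 1502, 0, 500, "+"), tdna),  # noisy alignment end
]
print(consensus_breakpoint(reads))
```

Because Nanopore alignment ends can be off by a few bases, a consensus over many reads only narrows down the junction, which is one reason confirmation by PCR and Sanger sequencing across the breakpoints remains useful.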

Insertion Analysis in non-GM varieties

The sequence of the potato reference genome Solanum tuberosum DM1-3 PGSC v4.04 pseudomolecules downloaded from http://spuddb.uga.edu/index.shtml could differ from the genome sequence of the host varieties of our events. Therefore, to further characterize the insert location, the sequence of the genomic insertion site in the non-GM variety of each event was determined using primers that hybridize to the flanking genomic regions of the insertion sites identified in the flanking sequence analysis. Primers and PCR conditions for this analysis can be found in the supplementary material (S2. Table 5). PCR amplification resulted in amplicons that covered the insert loci in the non-GM variety of each event. Each of the resulting fragments was ligated into the TOPO-TA plasmid using the Invitrogen™ TOPO™ TA Cloning™ Kit for Sequencing (Cat. No. 450030) according to the manufacturer’s instructions. Each ligation was transformed into Takara’s Stellar™ Competent E. coli HST08 strain cells according to the manufacturer’s instructions. The resulting colonies were PCR tested using M13 primer sets to confirm insertion. Selected clones were Sanger sequenced using the M13 primers. Each of the sequenced amplicons was aligned to the corresponding region of the potato reference genome for comparison using BLAST (BLAST, 2023). Insertion sites for each event were analyzed for chromosomal deletions.
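The check for a chromosomal deletion at an insertion site can be illustrated with a toy comparison. The sketch below assumes exact string matches between the event's flanking sequences and the non-GM amplicon. Real sequences contain allelic differences and sequencing errors, which is why an alignment tool such as BLAST is used in practice. The function name and the sequences are hypothetical.

```python
def deleted_bases(wild_type, left_flank, right_flank):
    """Bases present in the non-GM locus but missing between the event's flanks.

    wild_type: sequence of the insertion locus from the non-GM variety.
    left_flank / right_flank: genomic sequence adjoining each T-DNA border.
    Returns an empty string when the flanks are directly adjacent.
    """
    left_pos = wild_type.find(left_flank)
    if left_pos < 0:
        raise ValueError("left flank not found in wild-type sequence")
    left_end = left_pos + len(left_flank)
    # Search for the right flank only downstream of the left flank.
    right_start = wild_type.find(right_flank, left_end)
    if right_start < 0:
        raise ValueError("right flank not found downstream of left flank")
    return wild_type[left_end:right_start]


# Toy locus in which three bases (GGG) were lost on T-DNA integration.
print(deleted_bases("AAACCCGGGTTT", "AAACCC", "TTT"))
```

An empty result would indicate that the T-DNA integrated without loss of host sequence at that site, under the exact-match assumption stated above.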

Native Gene Interruption Analysis

Each potato event was examined to determine whether a native gene was interrupted by the insertion of the T-DNA. The potato reference genome used in this study is from the Potato Genome Sequencing Consortium and was derived from a doubled monoploid (Diambra, L. A., 2011). The sequence they obtained was integrated with a sequence from a heterozygous diploid line and is provided as the potato reference genome Solanum tuberosum DM1-3 PGSC v4.04 pseudomolecules at http://spuddb.uga.edu/index.shtml. The insert location of each event was analyzed in the genome browser for this reference genome at the same site to determine whether any native genes were interrupted. The database's graphical viewer was used to visually inspect each locus for disrupted genes.
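The visual inspection corresponds to a simple interval overlap test, sketched below. The study used the genome browser rather than code, so this is only an illustration. It assumes gene annotations are available as 1-based, inclusive coordinates (as in GFF3 files), and the gene identifiers, coordinates and function name are hypothetical.

```python
def interrupted_genes(genes, chrom, site_start, site_end=None):
    """Return IDs of annotated genes that overlap an insertion site.

    genes: iterable of (gene_id, chrom, start, end), 1-based inclusive.
    site_start / site_end: insertion position, or the span of any host
    sequence deleted at the site; site_end defaults to site_start.
    """
    if site_end is None:
        site_end = site_start
    hits = []
    for gene_id, gene_chrom, gene_start, gene_end in genes:
        # Two closed intervals overlap when each starts before the other ends.
        if gene_chrom == chrom and gene_start <= site_end and site_start <= gene_end:
            hits.append(gene_id)
    return hits


annotation = [
    ("geneA", "chr01", 100, 500),
    ("geneB", "chr01", 800, 1200),
    ("geneC", "chr02", 100, 500),
]
print(interrupted_genes(annotation, "chr01", 450))  # inside geneA
print(interrupted_genes(annotation, "chr01", 600))  # intergenic
```

An empty list would suggest an intergenic insertion with respect to the annotation used. Because the reference genome may differ from the host varieties, such a result is only as reliable as the annotation and the coordinate mapping.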

Stability of T-DNA Across Clonal Cycles

Potatoes are propagated asexually, so progeny plants are both phenotypically and genetically identical to the mother plant as well as to each other. The progeny plants (events) have not recombined meiotically. The stability of the inserts in the UB015, UB255, UG234, and UG265 events was assessed to show that DNA introduced into potato through transformation is stable over several clonal cycles. Genetic stability of the T-DNA inserts was assessed using PCR to determine the presence or absence of the insert in plants that had been carried through three generations. The presence of the insert in the originally transformed events (G0) was demonstrated by extracting and evaluating DNA from the events that had been propagated in vitro. For generation-3 (G3) analyses, 3 tubers from 3 plants of each event and a plant from each non-transgenic control were collected from an MSU confined field trial (East Lansing, MI) after 3 tuber cycles (1 in the greenhouse and 2 in field trials). Genomic DNA from events UB015, UB255, UG234, and UG265 was isolated from fresh plant tissue using the Qiagen DNeasy Plant Mini Kit (Cat. No. 69104) according to the manufacturer’s instructions. PCR testing was done with the primer sets found in the supplementary material (S2. Table 6). Two primer sets are located within the T-DNA: the VNT1 set is located in the native terminator of the Rpi-vnt1 gene, and the p4274 set is located in the promoter region of the Rpi-mcq1 gene. The stability of the inserted T-DNA was also shown by PCR analysis with event-specific primer sets, in which one primer is located in the T-DNA (at either the right or left border) and the other primer is located in the chromosomal region specific to the event. There are 4 T-DNA right border region primer sets, one specific for each event. Each PCR was performed using the following conditions: 95°C for 3 min, then 35 cycles of 95°C for 30 sec, the annealing temperature (see S2. Table 6) for 30 sec, and 72°C for 1 min 20 sec. The PCR samples were then electrophoresed on a 0.8% agarose gel using a 1 kb standard (STD) (NEB 1 kb N3232S, New England Biolabs, USA).
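The interpretation of the stability PCRs can be summarized as a presence/absence rule, sketched below. The acceptance rule shown (every event sample positive for every assay, and the non-transgenic control negative for every assay) is an assumption made for illustration and is not stated in this form in the text. The sample names, the assay label "RB_junction", the function name and the results are hypothetical.

```python
def insert_is_stable(event_results, control_results):
    """Evaluate PCR presence/absence calls for one event.

    event_results: {sample_name: {assay_name: True if a band was seen}}
        covering the G0 and G3 samples of the event.
    control_results: {assay_name: True if a band was seen} for the
        non-transgenic control of the same variety.
    """
    if not event_results:
        raise ValueError("no event samples supplied")
    all_event_positive = all(
        all(assays.values()) for assays in event_results.values()
    )
    control_negative = not any(control_results.values())
    return all_event_positive and control_negative


# Hypothetical calls for one event: two internal T-DNA assays and one
# event-specific right border junction assay.
example_event = {
    "G0_in_vitro": {"VNT1": True, "p4274": True, "RB_junction": True},
    "G3_plant1": {"VNT1": True, "p4274": True, "RB_junction": True},
    "G3_plant2": {"VNT1": True, "p4274": True, "RB_junction": True},
}
example_control = {"VNT1": False, "p4274": False, "RB_junction": False}
print(insert_is_stable(example_event, example_control))
```

Requiring a negative control guards against calling an insert stable on the basis of contamination or non-specific amplification.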

Software/Tools/Resource	Link
bedtools	https://github.com/arq5x/bedtools2
Guppy	https://github.com/timkahlke/LongRead_tutorials
Integrative Genomics Viewer (IGV)	https://github.com/igvteam/igv
minimap2	https://lh3.github.io/minimap2
samtools	https://github.com/samtools/samtools
seqkit	https://github.com/shenwei356/seqkit

Dataset: T-DNA characterization of genetically modified 3-R-gene late blight resistant potato events with a novel procedure utilizing the Samplix Xdrop® Enrichment Technology
