Data from: Vegetable phylloplane microbiomes harbour class 1 integrons in novel bacterial hosts and drive the spread of chlorite resistance
Data files
Sep 25, 2024 version files 9.74 GB
ML1_1.fastq: 605.69 MB
ML1_2.fastq: 834.33 MB
ML1_3.fastq: 641.13 MB
ML2_1.fastq: 512.59 MB
ML2_2.fastq: 359.70 MB
ML2_3.fastq: 374.63 MB
ML3_1.fastq: 563.43 MB
ML3_2.fastq: 711.04 MB
ML3_3.fastq: 465.84 MB
README.md: 3.36 KB
S1_1.fastq: 661.46 MB
S1_2.fastq: 551.29 MB
S1_3.fastq: 565.64 MB
S2_1.fastq: 589.61 MB
S2_2.fastq: 320.66 MB
S2_3.fastq: 525.44 MB
S3_1.fastq: 667.88 MB
S3_2.fastq: 397.93 MB
S3_3.fastq: 389.84 MB
Abstract
Bacterial hosts in vegetable phylloplanes carry mobile genetic elements such as plasmids and transposons that are associated with integrons. These mobile genetic elements and their cargo genes can enter human microbiomes via consumption of fresh agricultural produce, including uncooked vegetables. This presents a risk of acquiring antimicrobial resistance genes from uncooked vegetables. To better understand horizontal gene transfer of class 1 integrons in these compartments, we applied epicPCR, a single-cell fusion-PCR surveillance technique, to link the class 1 integron integrase (intI1) gene with phylogenetic markers of their bacterial hosts. Ready-to-eat salads carried class 1 integrons from the phyla Bacteroidota and Pseudomonadota, including four novel genera that were previously not known to be associated with intI1. We whole-genome sequenced Pseudomonas and Erwinia hosts of pre-clinical class 1 integrons that are embedded in Tn402-like transposons. The proximal gene cassette in these integrons was identified as a chlorite dismutase gene cassette, which we showed experimentally to confer chlorite resistance. Chlorine-derived compounds such as acidified sodium chlorite and chlorine dioxide are used to disinfect raw vegetables in food processing facilities, suggesting selection for chlorite resistance in phylloplane integrons. The spread of integrons conferring chlorite resistance has the potential to exacerbate integron-mediated antimicrobial resistance (AMR) via co-selection of chlorite resistance and AMR, thus highlighting the importance of monitoring chlorite residues in agricultural produce. These results demonstrate the strength of combining epicPCR and culture-based isolation approaches for identifying hosts and dissecting the molecular ecology of class 1 integrons.
README: Vegetable phylloplane microbiomes harbour class 1 integrons in novel bacterial hosts and a chlorite-resistance-conferring gene cassette
https://doi.org/10.5061/dryad.f1vhhmh4k
The eighteen FASTQ files that are available for download are as follows:
S1_1, S1_2, S1_3, S2_1, S2_2, S2_3, S3_1, S3_2, S3_3, ML1_1, ML1_2, ML1_3, ML2_1, ML2_2, ML2_3, ML3_1, ML3_2, ML3_3
Abbreviations
S = spinach leaf salad; ML = mixed leaf salad
The first digit after the letters denotes biological replicates. Each biological replicate represents a packaged salad product purchased from the supermarket on a different day and was processed for microbiome extraction independently. The second digit after the underscore denotes the technical replicates.
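The naming convention above can be decoded programmatically. The following is a minimal sketch, assuming file names follow exactly the pattern shown in the file list (prefix, biological replicate, underscore, technical replicate, optional `.fastq` extension); the names `SampleName` and `parse_sample_name` are hypothetical helpers and are not part of the dataset.

```python
import re
from typing import NamedTuple

# Abbreviations as defined in this README.
SALAD_TYPES = {"S": "spinach leaf salad", "ML": "mixed leaf salad"}

# Prefix, biological replicate, underscore, technical replicate.
NAME_PATTERN = re.compile(r"^(S|ML)(\d+)_(\d+)(?:\.fastq)?$")


class SampleName(NamedTuple):
    salad_type: str
    biological_replicate: int
    technical_replicate: int


def parse_sample_name(name: str) -> SampleName:
    """Split a name such as 'ML2_3.fastq' into its three components."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognised sample name: {name!r}")
    prefix, biological, technical = match.groups()
    return SampleName(SALAD_TYPES[prefix], int(biological), int(technical))
```

For example, `parse_sample_name("S1_2.fastq")` would describe technical replicate 2 of biological replicate 1 of the spinach leaf salad.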
Code/Software
The epicPCR Nanopore reads in the FASTQ files were filtered to ensure that they were flanked by the nested forward and reverse primers in the correct orientation using pychopper (v2.7.9) (https://github.com/epi2me-labs/pychopper) [parameters: -t 48 -m edlib -p -b]. Next, full-length epicPCR amplicon reads were filtered using chopper (v0.7.0), ensuring a minimum average base-level accuracy of Q10 and a length between 850 and 950 bp [parameters: -q 10 -l 850 --maxlength 950 --threads 48]. Pychopper was then re-run on the filtered full-length amplicons to identify and trim the 16S rRNA gene and intI1 regions from the full-length epicPCR products [parameters: -t 48 -m edlib -b]. For the 16S region, the reverse complement of the bridging primer (used as the forward primer) and the reverse nested primer (AP27_short) were used by pychopper. Conversely, the intI1 region was extracted by pychopper using the forward nested primer (HS915) together with the bridging primer.
Taxonomic profiling of the trimmed 16S rRNA regions of the epicPCR amplicons was performed using Emu (v3.4.5) against the Greengenes2 database (v2022.10). Emu is optimised for full-length, error-prone 16S rRNA reads and estimates relative abundance by mapping the reads to the Greengenes2 taxonomic database, employing an expectation-maximisation algorithm for accurate error-correction by iteratively refining taxon relative abundances based on total read mapping counts. We employed Emu by setting a minimum relative abundance cut-off of 0.5% [parameters: abundance --min-abundance 0.005 --threads 48], and then using emu collapse-taxonomy to collapse taxonomic profiles at the genus level [parameter: rank="genus"]. Genera were retained in the analysis output only if they were observed in at least two independent replicates (n ≥ 2) within each salad sample type.
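The expectation-maximisation idea mentioned above can be illustrated with a toy example. This is a sketch of the general technique only, not Emu's implementation: the function name `em_abundance`, the input format (one dictionary of per-taxon read likelihoods for each read), and the fixed iteration count are all assumptions made for illustration.

```python
from typing import Dict, List


def em_abundance(read_likelihoods: List[Dict[str, float]],
                 n_iter: int = 100) -> Dict[str, float]:
    """Toy EM estimate of taxon relative abundances.

    read_likelihoods holds one dict per read, mapping each candidate
    taxon to P(read | taxon). Hypothetical input format, not Emu's.
    """
    taxa = sorted({taxon for read in read_likelihoods for taxon in read})
    if not taxa:
        return {}
    # Start from a uniform abundance estimate.
    abundance = {taxon: 1.0 / len(taxa) for taxon in taxa}
    for _ in range(n_iter):
        expected = {taxon: 0.0 for taxon in taxa}
        # E-step: share each read among taxa in proportion to
        # (current abundance x read likelihood).
        for read in read_likelihoods:
            weights = {t: abundance[t] * p for t, p in read.items()}
            total = sum(weights.values())
            if total == 0:
                continue
            for taxon, weight in weights.items():
                expected[taxon] += weight / total
        # M-step: renormalise expected read counts into abundances.
        assigned = sum(expected.values())
        if assigned == 0:
            break
        abundance = {t: count / assigned for t, count in expected.items()}
    return abundance
```

In this sketch a read that matches two taxa equally well is divided between them according to the current abundance estimate, so taxa supported by many unambiguous reads gradually attract a larger share of the ambiguous ones.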
References
De Coster, W.; Rademakers, R. NanoPack2: population-scale evaluation of long-read sequencing data. Bioinformatics 2023, 39 (5).
Curry, K. D.; Wang, Q.; Nute, M. G.; Tyshaieva, A.; Reeves, E.; Soriano, S.; Wu, Q.; Graeber, E.; Finzer, P.; Mendling, W.; et al. Emu: Species-level microbial community profiling of full-length 16S rRNA Oxford Nanopore sequencing data. Nat Methods 2022, 19 (7), 845-853.
McDonald, D.; Jiang, Y.; Balaban, M.; Cantrell, K.; Zhu, Q.; Gonzalez, A.; Morton, J. T.; Nicolaou, G.; Parks, D. H.; Karst, S. M.; et al. Greengenes2 unifies microbial data in a single reference tree. Nat Biotechnol 2023.
Methods
Emulsion, paired-isolation and concatenation PCR (epicPCR)
Approximately 150,000 bacterial cells were resuspended in 75 µL of PCR reagents containing GC buffer (1x), dNTP mix (0.4 mM), Phusion High-Fidelity DNA polymerase (0.05 U/µL) (Thermo Scientific, United States), bovine serum albumin (1 µg/µL) (Promega, United States), Lucigen Ready-Lyse lysozyme (500 U/µL) (LGC Biosearch Technologies, United States), and three primers: R926 (2 µM), intI1_outer_2 (1 µM), and R519-HS458RC bridge primer (0.04 µM), with the final concentrations of each reagent in parentheses. For each of the six phylloplane microbiome samples, three technical replicates of epicPCR were performed, one of which was spiked with a class 1 integron-free Synechococcus species CC9311 strain at a population frequency of 10% to test whether false associations could be observed between the 16S rRNA markers of Synechococcus CC9311 and class 1 integrons in each sample.
Each PCR suspension was added to 425 µL of ABIL oil and was agitated at 4 m s⁻¹ for 45 s with a FastPrep-24 bead beating system (MP Biomedicals, United States). Each emulsion was aliquoted into eight portions prior to epicPCR. The conditions for the first stage of epicPCR were: 37°C for 10 min (lysozyme lysis); 98°C for 5 min (initial denaturation); 35 cycles of 98°C for 10 s (denaturation), 55°C for 30 s (annealing), and 72°C for 30 s (extension); and finally, 72°C for 5 min. The aqueous phase containing DNA was separated from the oil phase by using a mixture of 2-methyl-1-propanol and sodium chloride solution (Qi et al., 2023). The extracted DNA was further purified using the Monarch PCR & DNA Cleanup Kit (New England Biolabs, United States). In the second stage of epicPCR, the DNA templates from the previous step were PCR amplified in four aliquots of 25 µL reactions containing GC buffer (1x), dNTP mix (0.4 mM), Phusion High-Fidelity DNA polymerase (0.04 U/µL), AP27_short primer (0.8 µM), nested primer HS915 (0.4 µM), and forward and reverse blocking primers (0.32 µM each) under the following conditions: 98°C for 30 s; 30 cycles of 98°C for 10 s, 55°C for 30 s and 72°C for 15 s; followed by 72°C for 2 min. Gel electrophoresis was performed to confirm that the epicPCR products were approximately 880 bp in size.
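The two thermal cycling programs described above can be written down as structured data, which makes the step durations easy to check. The sketch below is illustrative only: the `CyclingProgram` class and its field names are hypothetical, and the totals count programmed hold times only, ignoring the ramping time between temperatures that any real thermal cycler adds.

```python
from dataclasses import dataclass
from typing import List, Tuple

Step = Tuple[float, int]  # (temperature in degrees C, hold time in seconds)


@dataclass
class CyclingProgram:
    initial_steps: List[Step]  # holds before cycling begins
    cycles: int                # number of repeated cycles
    cycle_steps: List[Step]    # denaturation, annealing, extension
    final_step: Step           # final extension

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding ramping between steps."""
        initial = sum(seconds for _, seconds in self.initial_steps)
        per_cycle = sum(seconds for _, seconds in self.cycle_steps)
        return initial + self.cycles * per_cycle + self.final_step[1]


# Stage 1: lysozyme lysis, initial denaturation, 35 cycles, final extension.
STAGE_1 = CyclingProgram(
    initial_steps=[(37, 10 * 60), (98, 5 * 60)],
    cycles=35,
    cycle_steps=[(98, 10), (55, 30), (72, 30)],
    final_step=(72, 5 * 60),
)

# Stage 2: nested amplification with blocking primers, 30 cycles.
STAGE_2 = CyclingProgram(
    initial_steps=[(98, 30)],
    cycles=30,
    cycle_steps=[(98, 10), (55, 30), (72, 15)],
    final_step=(72, 2 * 60),
)
```

Calling `STAGE_1.hold_seconds()` and `STAGE_2.hold_seconds()` gives the minimum run time of each stage in seconds under these assumptions.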
epicPCR products were treated with the NEBNext Ultra II End Repair/dA-Tailing Module (New England Biolabs, United States). Native barcodes from the SQK-NBD114.96 Native Barcoding Kit (ONT, United Kingdom) were ligated to end-repaired DNA using the Blunt/TA Ligase Master Mix (New England Biolabs, United States). Individually barcoded DNA samples were pooled and ligated to unique native adaptors using NEBNext Quick T4 DNA Ligase (New England Biolabs, United States). The multiplexed sequencing libraries were loaded into an R10.4.1 MinION flow cell for Nanopore sequencing on a MinION Mk1B sequencer (ONT, United Kingdom). Simplex basecalling, trimming of adaptors and barcodes, and demultiplexing of sequencing data were performed using Dorado (v0.5.2) with the super accurate basecalling model dna_r10.4.1_e8.2_400bps_sup@v4.3.0 (Oxford Nanopore, United Kingdom).
Filtering and taxonomic profiling of epicPCR Nanopore reads
We first filtered epicPCR Nanopore reads, ensuring that they were flanked by the nested forward and reverse primers in the correct orientation, using pychopper (v2.7.9) (https://github.com/epi2me-labs/pychopper) [parameters: -t 48 -m edlib -p -b]. These full-length epicPCR amplicon reads were then filtered using chopper v0.7.0 (De Coster and Rademakers, 2023), ensuring a minimum average base-level accuracy of Q10 and a length between 850 and 950 bp [parameters: -q 10 -l 850 --maxlength 950 --threads 48]. Pychopper was then re-run on the filtered full-length amplicons to separately identify and trim the 16S rRNA gene and intI1 regions from the full-length epicPCR products [parameters: -t 48 -m edlib -b]. The 16S rRNA gene region was extracted by pychopper using the reverse complement of the bridging primer (used as the forward primer) and the reverse nested primer (AP27_short). Conversely, the intI1 gene region was extracted by pychopper using the forward nested primer (HS915) together with the bridging primer.
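The quality and length criteria applied in the chopper step can be sketched in plain Python. This is an illustration of the stated thresholds (Q10, 850 to 950 bp), not a reimplementation of chopper: it assumes four-line FASTQ records with Phred+33 quality encoding, and it assumes that the read-level average is taken over per-base error probabilities rather than over the Phred scores themselves. The function names are hypothetical.

```python
import math
from typing import Iterator, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ."""
    with open(path) as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                break
            sequence = handle.readline().rstrip()
            handle.readline()  # '+' separator line
            quality = handle.readline().rstrip()
            yield header, sequence, quality


def mean_phred(quality: str) -> float:
    """Average read quality computed via mean per-base error probability.

    Assumes Phred+33 encoding; averaging error probabilities (rather than
    Phred scores) is an assumption about how 'average quality' is defined.
    """
    error_probs = [10 ** (-(ord(char) - 33) / 10) for char in quality]
    return -10 * math.log10(sum(error_probs) / len(error_probs))


def passes_filter(sequence: str, quality: str, min_quality: float = 10.0,
                  min_length: int = 850, max_length: int = 950) -> bool:
    """Apply the length window first, then the average-quality threshold."""
    if not min_length <= len(sequence) <= max_length:
        return False
    return mean_phred(quality) >= min_quality
```

Reads from a file could then be screened with `[r for r in read_fastq(path) if passes_filter(r[1], r[2])]`, where `path` points to one of the FASTQ files in this dataset.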
Taxonomic profiling of the trimmed 16S rRNA regions of the epicPCR amplicons was performed using Emu v3.4.5 (Curry et al., 2022) against the Greengenes2 database v2022.10 (McDonald et al., 2023). Emu is optimised for taxonomic profiling of amplicons from long-read sequencing and estimates relative abundance by mapping the reads to a taxonomic database. It employs an expectation-maximisation algorithm for accurate error-correction by iteratively refining taxon relative abundances based on total read mapping counts. We employed Emu by setting a minimum relative abundance cut-off of 0.5% [parameters: abundance --min-abundance 0.005 --threads 48], and then using emu collapse-taxonomy to collapse taxonomic profiles at the genus level [parameter: rank="genus"]. Analyses with these parameters showed that no spiked Synechococcus 16S rRNA sequence was detected in our set. We further filtered genera, retaining only those observed in at least two independent replicates (n ≥ 2) within each salad sample type. To assess whether intI1 carriage had been previously reported in the genera we observed, the nucleotide sequence of intI1 from the IncW R388 plasmid was queried against the NCBI Nucleotide database by BLASTN using default parameters. The database searches were carried out individually for each of the thirteen genera observed in this study.
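The replicate-based filtering of genera can be sketched as follows. This is a minimal illustration under stated assumptions: the input format (a dictionary keyed by salad type and replicate label, each holding genus-level relative abundances) and the function name `filter_genera` are hypothetical, and which replicates count as "independent" is left to the caller's choice of replicate labels.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def filter_genera(
    profiles: Dict[Tuple[str, str], Dict[str, float]],
    min_abundance: float = 0.005,
    min_replicates: int = 2,
) -> Dict[str, Set[str]]:
    """Keep genera seen in at least `min_replicates` replicates per salad type.

    profiles maps (salad_type, replicate_label) to {genus: relative abundance}.
    """
    # Record which replicates of each salad type contain each genus.
    seen = defaultdict(set)
    for (salad_type, replicate), abundances in profiles.items():
        for genus, abundance in abundances.items():
            if abundance >= min_abundance:
                seen[(salad_type, genus)].add(replicate)

    # Retain a genus only where it recurs across enough replicates.
    kept = defaultdict(set)
    for (salad_type, genus), replicates in seen.items():
        if len(replicates) >= min_replicates:
            kept[salad_type].add(genus)
    return dict(kept)
```

Because the count is kept per salad type, a genus found once in spinach leaf salad and once in mixed leaf salad would not pass the filter in either type.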
References
Curry KD, Wang Q, Nute MG, Tyshaieva A, Reeves E, Soriano S, et al. Emu: Species-level microbial community profiling of full-length 16S rRNA Oxford Nanopore sequencing data. Nat Methods 2022; 19: 845-853.
De Coster W, Rademakers R. NanoPack2: population-scale evaluation of long-read sequencing data. Bioinformatics 2023; 39.
McDonald D, Jiang Y, Balaban M, Cantrell K, Zhu Q, Gonzalez A, et al. Greengenes2 unifies microbial data in a single reference tree. Nat Biotechnol 2023.
Qi Q, Ghaly TM, Penesyan A, Rajabal V, Stacey JA, Tetu SG, et al. Uncovering bacterial hosts of class 1 integrons in an urban coastal aquatic environment with a single-cell fusion-polymerase chain reaction technology. Environ Sci Technol 2023; 57: 4870-4879.