Data and code from: Acquisition of novel arrays via horizontal gene transfer rewire CRISPR-mediated defense in Pseudomonas aeruginosa
Data files
May 09, 2026 version files (319.60 MB)
- map-spacers-withslip.py (3.06 KB)
- map-spacers.py (2.82 KB)
- raw-gels.zip (319.59 MB)
- README.md (1.13 KB)
- reads-to-spacers.py (3.77 KB)
May 21, 2026 version files (320.70 MB)
- figure_s18-interactive.html (1.10 MB)
- map-spacers-withslip.py (3.06 KB)
- map-spacers.py (2.82 KB)
- raw-gels.zip (319.59 MB)
- README.md (1.22 KB)
- reads-to-spacers.py (3.77 KB)
Abstract
The type I-F CRISPR-Cas system of Pseudomonas aeruginosa ATCC 10145 (PA10145) is composed of a cas operon flanked by two divergently organized arrays. Interestingly, an isolated CRISPR array, CRISPR3, was also found ~1.3 million bp away from cas and is prevalent across P. aeruginosa (Psa) genomes. The cas operon and the three CRISPR arrays together provide adaptive immunity, eliminating plasmids engineered with protospacer targets. When plasmids carried an intact protospacer adjacent motif (PAM), hyperactive adaptation was stimulated in all CRISPR arrays of PA10145, whereas minimal to no adaptation was observed when the PAM was mutated. Spacer acquisition via interference-driven adaptation proceeds through strand-biased priming in PA10145. The isolated CRISPR3 and the cas-adjacent CRISPR2 have nearly identical leader sequences, differing by 3 bp. In a survey of CRISPR loci in 1,198 Psa genomes, isolated arrays occur only as type I-F with CRISPR2-like leaders. Highly transmissible mobile genetic elements (MGEs) associate only with CRISPR2 and CRISPR3, suggesting that isolated arrays might have originated from recombination events involving CRISPR2. Tracing the evolutionary trajectories of the isolated CRISPR3 relative to cas-adjacent arrays revealed that CRISPR3 was horizontally acquired by Psa. Taken together, these results implicate isolated arrays in CRISPR-mediated pan-immunity as gateways for mobilizing genetic memories.
Dataset DOI: 10.5061/dryad.tqjq2bwdr
Description of the data and file structure
Python scripts for mapping novel spacers from expanded CRISPR arrays
Files and variables
File: reads-to-spacers.py
Description:
- partitions each read into spacers based on the repeat qualifier
- bins each spacer into its CRISPR array based on a provided CSV file containing the pre-existing spacers
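The two steps listed for reads-to-spacers.py can be sketched roughly as follows. This is an illustrative outline, not the script itself: the repeat and spacer sequences are short placeholders, the CSV column names (`array`, `spacer`) are hypothetical, and exact string matching is assumed where the real script may tolerate sequencing errors.

```python
import csv
import io
import re

# Placeholder repeat and spacers for illustration only (not the PA10145 sequences).
REPEAT = "GTTCAC"

# Hypothetical CSV layout: one pre-existing spacer per row, labelled with its array.
KNOWN_CSV = """array,spacer
CRISPR1,ACGT
CRISPR1,CCCC
CRISPR2,GGGA
"""

def load_known_spacers(handle):
    """Map each pre-existing spacer sequence to the name of its CRISPR array."""
    return {row["spacer"]: row["array"] for row in csv.DictReader(handle)}

def split_read_into_spacers(read, repeat):
    """Return the sequences lying between consecutive exact copies of the repeat."""
    starts = [m.start() for m in re.finditer(re.escape(repeat), read)]
    return [read[left + len(repeat):right] for left, right in zip(starts, starts[1:])]

def bin_spacers(spacers, known):
    """Assign a read to one array via its known spacers and list the novel ones."""
    arrays = {known[s] for s in spacers if s in known}
    # Ambiguous or unanchored reads are left unassigned (None).
    array = arrays.pop() if len(arrays) == 1 else None
    novel = [s for s in spacers if s not in known]
    return array, novel

known = load_known_spacers(io.StringIO(KNOWN_CSV))
read = "AA" + REPEAT + "TTTT" + REPEAT + "ACGT" + REPEAT + "CCCC" + REPEAT + "GG"
spacers = split_read_into_spacers(read, REPEAT)
array, novel = bin_spacers(spacers, known)
print(spacers, array, novel)
```

In this toy read, the spacer nearest the leader end is the only one absent from the CSV, so it is reported as novel and the read is assigned to the array that owns the two known spacers.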
File: map-spacers.py
Description:
- maps spacers to a position on the provided template
- includes both positive and negative strands
- must be executed inside each folder generated by reads-to-spacers.py (one per binned CRISPR array), alongside that folder's input.fasta
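As a rough illustration of mapping on both strands, the sketch below searches a template for a spacer and for its reverse complement. The sequences are toy examples, the function names are hypothetical, and the template is treated as linear, so a match spanning the origin of a circular plasmid would be missed; the actual script may handle these points differently.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def map_spacer(spacer, template):
    """List (start, strand) for every exact match of the spacer on either strand.

    `start` is a 0-based coordinate on the positive strand of the template.
    """
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = template.find(query)
        while start != -1:
            hits.append((start, strand))
            start = template.find(query, start + 1)
    return hits

template = "TTGACCATGCAAGGTTCCAA"  # toy template, not the T plasmid backbone
plus_hit = map_spacer("ACCATG", template)
minus_hit = map_spacer("AACCTT", template)
print(plus_hit, minus_hit)
```

Reporting minus-strand hits in positive-strand coordinates keeps all positions on a single axis, which is convenient when plotting acquisition hotspots along the template.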
File: map-spacers-withslip.py
Description:
- same as map-spacers.py, with added detection of PAM slippage
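One way to picture slippage detection is to check whether the PAM sits exactly at the expected flank of a mapped protospacer or a base or so away from it. The sketch below makes several assumptions that are not stated in this README: the PAM is read 5' of the protospacer on the matching strand, the default `pam="CC"` and `max_slip=1` are illustrative parameters, and the sign convention for the offset is arbitrary.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def classify_pam(template, start, length, strand, pam="CC", max_slip=1):
    """Return the PAM offset for a mapped protospacer, or None if no PAM is nearby.

    0  -> PAM lies immediately 5' of the protospacer (no slippage)
    <0 -> PAM lies further upstream, leaving a gap before the protospacer
    >0 -> PAM overlaps the first base(s) of the protospacer
    `start` is the 0-based positive-strand coordinate of the match.
    """
    if strand == "+":
        oriented, pos = template, start
    else:
        # Re-express the match on the reverse-complemented template.
        oriented = reverse_complement(template)
        pos = len(template) - (start + length)
    # Try the canonical position first, then increasing offsets.
    for slip in sorted(range(-max_slip, max_slip + 1), key=abs):
        lo = pos - len(pam) + slip
        if lo >= 0 and oriented[lo:lo + len(pam)] == pam:
            return slip
    return None

toy = "TTCCACGTACGTAA"  # toy template containing a single CC
print(classify_pam(toy, 4, 6, "+"), classify_pam(toy, 5, 6, "+"))
```

Separating the offset from the mapping step means the same hit list can be tallied with or without slipped spacers.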
File: raw-gels.zip
Description:
- raw images of agarose gel electrophoresis (AGE) gels used to detect CRISPR expansion
File: figure_s18-interactive.html
Description:
- interactive 3D network shown in Figure S18
Code/software
Python 3.8
Biopython 1.81
pandas 2.0.3
Detection of CRISPR expansion
PCR templates were obtained from the pooled populations of day 5 cultures or from three random sensitive colonies. Each primer pair binds the sequence upstream of a CRISPR array and that array's third spacer. Products were resolved by agarose gel electrophoresis (AGE) on a 2% agarose gel at 100 V for 35 minutes.
High-throughput spacer acquisition analysis
CRISPR arrays were PCR-amplified from populations 5 days post-exposure to the T plasmid. Bands larger than 200 bp were excised from the gel, extracted, and pooled according to the PS of the T plasmid set-up prior to purification. DNA samples were then mixed in equimolar amounts for library preparation as instructed by Oxford Nanopore Technologies (ONT). The library was loaded onto a SpotON flow cell in a MinION Mk1B. Custom Python scripts were then used to parse the sequencing data: reads containing two or more repeats were retained, and each spacer flanked by two repeats was mapped to a position on the backbone of the T plasmid. Each novel spacer was also binned according to the CRISPR array that had acquired it, based on the two pre-existing spacers included within the read.
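The read-filtering criterion described here (two or more repeats per read) could look something like the sketch below. It is only an outline under stated assumptions: the repeat is a short placeholder, reads are re-oriented by whichever strand carries more repeat copies (an assumption, since nanopore reads can come from either strand), and exact matching is used even though real ONT reads may need error-tolerant matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def orient_and_filter(reads, repeat, min_repeats=2):
    """Keep reads with at least `min_repeats` repeat copies, in repeat orientation."""
    kept = []
    for read in reads:
        flipped = reverse_complement(read)
        forward_count = read.count(repeat)
        reverse_count = flipped.count(repeat)
        # Pick the orientation in which the repeat is seen more often.
        if forward_count >= reverse_count:
            best, count = read, forward_count
        else:
            best, count = flipped, reverse_count
        if count >= min_repeats:
            kept.append(best)
    return kept

REPEAT = "GTTCAC"  # placeholder, not the type I-F repeat
two_repeats = "AA" + REPEAT + "TTTT" + REPEAT + "GG"
one_repeat = "AA" + REPEAT + "TTTT"
kept = orient_and_filter(
    [two_repeats, reverse_complement(two_repeats), one_repeat], REPEAT
)
print(kept)
```

A read with a single repeat cannot contain a spacer flanked on both sides, which is why it is dropped before any mapping is attempted.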
Changes after May 9, 2026:
Added figure_s18-interactive.html
- interactive version of the network shown in Figure S18
