Data and code from: Accelerated evolution in networked metapopulations of Pseudomonas aeruginosa
Data files
Apr 03, 2026 version files (1.68 MB)
breseq_uploads.zip - 1.62 MB
Isolate_carrying_capacity_copy.zip - 29.61 KB
Metapop_Growth_rate_CC.zip - 21.87 KB
README.md - 7.44 KB
Abstract
Natural populations are often spatially structured, meaning they exist as metapopulations composed of subpopulations connected by migration. Little is known about the impact of spatial structure, in particular the topology of connections among subpopulations, on adaptive evolution. Typically, spatial structure slows adaptation, although some models suggest topologies that concentrate dispersing individuals through a central hub can accelerate adaptation above that of a well-mixed system. We provide evidence to support this claim and show that acceleration is accompanied by high rates of parallel evolution. Our results suggest metapopulation topology can be a potent force driving evolutionary dynamics and patterns of genomic repeatability in structured landscapes such as those involving the spread of pathogens or invasive species.
https://doi.org/10.5061/dryad.sf7m0cggv
Description of the data and file structure
The data were collected from a short-term parallel evolution experiment performed with 64 Pseudomonas aeruginosa metapopulations growing under sub-inhibitory concentrations of the antibiotic ciprofloxacin in rich laboratory media. The metapopulations were propagated on either the Star network (subpopulations connected via a central hub) or the well-mixed network (all subpopulations connected to each other). The mutation supply rate to each subpopulation was varied by manipulating either the effective population size (large or small) or the migration rate between subpopulations (high or low).
At the end of the evolution experiment:
(a) we measured the maximum growth rate and carrying capacity (OD at the 24-hour time point) of the replicate metapopulations at different time points, using samples cryopreserved during the evolution experiment;
(b) we measured the carrying capacity of clones isolated from the evolved metapopulations in the selection media (ciprofloxacin);
(c) we performed whole-genome, whole-population sequencing of selected evolved metapopulations and calculated gene-level parallelism.
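The factorial design described above can be sketched in a few lines of Python to confirm that the factor levels multiply out to the 64 metapopulations. This is an illustration only, not part of the deposited R code; the level labels are taken from this README, and the assumption that the 8 replicates apply to each network within every population size and migration rate combination is ours.

```python
from itertools import product

# Factor levels as named in this README (labels are illustrative).
TREATMENTS = ("STAR", "Well-mixed")
POP_SIZES = ("large", "small")
MIG_RATES = ("high", "low")
# Assumption: 8 replicate metapopulations per network within each
# population size x migration rate combination.
REPLICATES = range(1, 9)


def enumerate_metapopulations():
    """Return one (treatment, pop_size, mig_rate, replicate) tuple per metapopulation."""
    return list(product(TREATMENTS, POP_SIZES, MIG_RATES, REPLICATES))


design = enumerate_metapopulations()
print(len(design))  # 2 x 2 x 2 x 8 = 64
```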
File/Folder contents
Folder: Metapop_Growth_rate_CC.zip
Description: The calculated growth rate and carrying capacities of evolved metapopulations at different time-points of the evolution experiment. The datasets for evolved metapopulations contain variables describing the name of the network Treatment (STAR/AMP or Well-mixed/WM), Population size (large or small), Migration rate (high or low), Replicate (1 to 8 for each population size and migration rate combination), Measurement_replicate (1 to 5, every biological replicate was measured 5 times), Day (1,3,7,11,15 for large or 1,3,5,7 for small metapopulations, respectively), Block (1 or 2), r (maximum growth rate), and k (carrying capacity).
The datasets for ancestors contain variables describing the identity of the Ancestor (PA14/PA14-LacZ), Day (1,3,7,11,15 for large or 1,3,5,7 for small metapopulations, respectively), Block (1 or 2), r_anc (maximum growth rate of the ancestor), and k_anc (carrying capacity of the ancestor).
Files:
GR final output P7_large_metapop.csv - dataset for large metapopulations
GR final output P5_small_metapop.csv - dataset for small metapopulations
P5_small_metapop_anc.csv - dataset for ancestors grown as a control when assaying for small metapopulations
P7_large_metapop_anc.csv - dataset for ancestors grown as a control when assaying for large metapopulations
growth output markdown_upload.Rmd - The R Markdown file that calls all the files in the Metapop_Growth_rate_CC folder to create figures 2 and 3 in the article and perform the statistical analyses.
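Because every biological replicate was measured five times, a common first step with the metapopulation datasets is to collapse the measurement replicates to one mean per biological replicate and day. The sketch below shows one way to do that with the Python standard library. It is not the authors' analysis (which is in the R Markdown file); the column headers are assumed from the variable names listed above and may not match the CSV files exactly, and the rows are made-up values for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; headers are assumed, not copied
# from the deposited CSV files.
EXAMPLE_CSV = """Treatment,Replicate,Measurement_replicate,Day,r,k
STAR,1,1,1,0.50,1.00
STAR,1,2,1,0.70,1.20
WM,1,1,1,0.40,0.90
WM,1,2,1,0.60,1.10
"""


def average_measurement_replicates(handle, keys=("Treatment", "Replicate", "Day")):
    """Collapse measurement replicates to one mean r and k per group of `keys`."""
    groups = defaultdict(lambda: {"r": [], "k": []})
    for row in csv.DictReader(handle):
        group = tuple(row[key] for key in keys)
        groups[group]["r"].append(float(row["r"]))
        groups[group]["k"].append(float(row["k"]))
    return {g: {"r": mean(v["r"]), "k": mean(v["k"])} for g, v in groups.items()}


means = average_measurement_replicates(io.StringIO(EXAMPLE_CSV))
for group, values in sorted(means.items()):
    print(group, round(values["r"], 3), round(values["k"], 3))
```

With the real files, pass an open file handle instead of the in-memory example and extend `keys` with the migration rate column, since each file contains both migration rates.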
Folder: Isolate_carrying_capacity_copy.zip
Description: The carrying capacities of evolved isolates collected from the metapopulations at the end of the evolution experiment, assayed in ciprofloxacin. A total of 12 isolates were collected from each subpopulation of a metapopulation (in total 48 isolates per metapopulation).
Each dataset (with *Whole in the filename) contains the carrying capacity of 768 isolates for one Population size X Migration rate combination. The variables describe the network Treatment (STAR/AMP or Well-mixed/WM), Env (CIProfloxacin), Metapop (Metapopulations, 1-8), Subpop (Subpopulations, 1-4), Pop_size (Population size, large or small), Mig_rate (Migration rate, high or low), Block (1 or 2), Isolate (1-12), and V9 (carrying capacity in ciprofloxacin).
The datasets for ancestors (with *Ancestor_CC in the filename) contain variables describing the identity of the Ancestor (PA14/PA14-LacZ), Block (1 or 2; ancestors were assayed in either block 1 or block 2 of the evolved isolates), Isolate range (6 or 12, since isolates 1-6 and 7-12 were assayed on different days), Replicate (1 or 2, two biological replicates), Env (CIProfloxacin), and CC (carrying capacity of the ancestor).
The other datasets (with *_Anc in the filename) repeat the ancestral carrying capacity values so that they match each row of the evolved isolates (depending on the ancestry of the isolate and the isolate range), for ease of downstream data analysis and plotting.
Files:
P5_HM_Whole.csv - dataset for small metapopulations with high migration rate
P7_HM_Whole.csv - dataset for large metapopulations with high migration rate
P5_LM_Whole.csv - dataset for small metapopulations with low migration rate
P7_LM_Whole.csv - dataset for large metapopulations with low migration rate
Ancestor_CC_P5_HM.csv - Ancestors grown with small metapopulations with high migration rates as controls
Ancestor_CC_P7_HM.csv - Ancestors grown with large metapopulations with high migration rates as controls
Ancestor_CC_P5_LM.csv - Ancestors grown with small metapopulations with low migration rates as controls
Ancestor_CC_P7_LM.csv - Ancestors grown with large metapopulations with low migration rates as controls
P5_HM_Anc.csv - repeated Ancestor_CC_P5_HM.csv
P7_HM_Anc.csv - repeated Ancestor_CC_P7_HM.csv
P5_LM_Anc.csv - repeated Ancestor_CC_P5_LM.csv
P7_LM_Anc.csv - repeated Ancestor_CC_P7_LM.csv
K_histogram_upload.R - The R file that calls the first eight files in the Isolate_carrying_capacity_copy folder to create figure 5 in the article and perform the statistical analyses.
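The isolate counts given above (12 isolates per subpopulation, 4 subpopulations per metapopulation, 8 metapopulations per network, 2 networks) multiply out to the 768 rows expected in each *Whole dataset. The following sketch checks a table against that layout. It is an illustrative helper, not part of K_histogram_upload.R; the treatment labels and the use of integer Metapop and Subpop values are assumptions, and the table built here is synthetic.

```python
from collections import Counter
from itertools import product

ISOLATES_PER_SUBPOP = 12
SUBPOPS_PER_METAPOP = 4
METAPOPS_PER_TREATMENT = 8
TREATMENTS = ("STAR", "WM")  # assumed labels; the files may use AMP/WM or others


def expected_rows():
    """Number of isolate rows expected in one *Whole dataset."""
    return (len(TREATMENTS) * METAPOPS_PER_TREATMENT
            * SUBPOPS_PER_METAPOP * ISOLATES_PER_SUBPOP)


def incomplete_subpopulations(rows):
    """Return {(treatment, metapop, subpop): n} where n is not exactly 12 isolates."""
    counts = Counter((r["Treatment"], r["Metapop"], r["Subpop"]) for r in rows)
    expected = product(TREATMENTS,
                       range(1, METAPOPS_PER_TREATMENT + 1),
                       range(1, SUBPOPS_PER_METAPOP + 1))
    return {key: counts.get(key, 0) for key in expected
            if counts.get(key, 0) != ISOLATES_PER_SUBPOP}


# Synthetic, fully populated table standing in for a real *Whole file.
rows = [
    {"Treatment": t, "Metapop": m, "Subpop": s, "Isolate": i}
    for t, m, s, i in product(TREATMENTS, range(1, 9), range(1, 5), range(1, 13))
]
print(expected_rows(), len(rows), incomplete_subpopulations(rows))
```

Rows read with `csv.DictReader` arrive as strings, so Metapop and Subpop would need converting to `int` before calling `incomplete_subpopulations`.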
Folder: breseq_uploads.zip
Description: Identities of the mutations (locus tag, gene names, positions, specific amino acid changes, mutation category, etc.) segregating in the evolved metapopulations at the end of the evolution experiment. These are accompanied by the mutations already present in the ancestral backgrounds (PA14/PA14-LacZ) and a parsed .gbff file for all genes of Pseudomonas aeruginosa (UCBPP-PA14 - assembly GCF_000014625.1). The raw genome sequencing data have been submitted to NCBI under BioProject accession PRJNA1401813. Raw reads were trimmed with Trimmomatic, then aligned and variant-called against the PA14 reference genome using Breseq.
Files:
ancestors.csv - Standard Breseq output for all mutations present in the PA14 and PA14-LacZ ancestral populations.
breseq_all_mutations.csv - Standard Breseq output for all mutations present in the sequenced metapopulations. In the sample column, P5_HM = small metapopulation/high migration, P5_LM = small metapopulation/low migration, P7_HM = large metapopulation/high migration, P7_LM = large metapopulation/low migration. A = STAR/AMP, W = Well-mixed/WM, number = replicate metapopulation.
parsed_gbff_PA14.csv - Reference PA14 genome used for alignment and variant calling.
breseq_analyses_upload.R - The R file that calls all the data files in the breseq_uploads folder to create figures 4 (number of mutations) and 6 (identity of mutations) and table 1 (genetic parallelism) in the paper.
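The sample naming scheme described for breseq_all_mutations.csv can be decoded programmatically, and a simple tally of how many metapopulations carry a mutation in each gene gives a first look at parallelism. The sketch below is hedged on two points: the exact layout of the sample strings is not given in this README, so the pattern (for example "P7_HM_A3") is hypothetical, and the per-gene tally is a plain count rather than the parallelism statistic computed in breseq_analyses_upload.R. The gene names in the example are placeholders.

```python
import re
from collections import defaultdict

# Hypothetical label layout, e.g. "P7_HM_A3"; the real sample strings may differ.
SAMPLE_PATTERN = re.compile(r"^(P5|P7)_(HM|LM)_([AW])(\d+)$")

POP_SIZE = {"P5": "small", "P7": "large"}
MIG_RATE = {"HM": "high", "LM": "low"}
NETWORK = {"A": "STAR/AMP", "W": "Well-mixed/WM"}


def decode_sample(label):
    """Split a sample label into population size, migration rate, network, replicate."""
    match = SAMPLE_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised sample label: {label!r}")
    size, mig, net, rep = match.groups()
    return {
        "pop_size": POP_SIZE[size],
        "mig_rate": MIG_RATE[mig],
        "network": NETWORK[net],
        "replicate": int(rep),
    }


def metapopulations_per_gene(records):
    """Map each gene to the number of distinct samples with a mutation in it."""
    samples = defaultdict(set)
    for sample, gene in records:
        samples[gene].add(sample)
    return {gene: len(hit) for gene, hit in samples.items()}


# Placeholder (sample, gene) pairs; not taken from the deposited data.
records = [
    ("P7_HM_A1", "geneA"),
    ("P7_HM_A2", "geneA"),
    ("P7_HM_A2", "geneA"),  # second mutation in the same gene and sample
    ("P7_HM_W1", "geneB"),
]
print(decode_sample("P7_HM_A3"))
print(metapopulations_per_gene(records))
```

Counting distinct samples rather than rows keeps a gene with several mutations in one metapopulation from being scored as repeated evolution across metapopulations.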
Code/software
All data visualization and statistical analyses were conducted in the open-source statistical software R (version 4.2.2). All necessary packages and dependencies are listed in the individual R scripts.
These data were collected from experimentally evolving laboratory metapopulations of P. aeruginosa strain PA14 under a sub-inhibitory concentration of ciprofloxacin (40 ng/ml). The data include the growth rates and carrying capacities of the evolved metapopulations, the carrying capacities of evolved isolates collected from these metapopulations, and the results of the genomic analyses (identities of the mutations) of the metapopulations. The data are accompanied by the corresponding R files used to plot and statistically analyse them.
