Data and code from: Universal bacterial clade dynamics dominate under predation despite altered phenotypes and mutation targets
Data files
Mar 30, 2026 version files 32.40 MB
- ancestral_evolved_aucs.csv (72.48 KB)
- chosen_sample_df.csv (19.90 KB)
- clonal_order_df.csv (2.41 KB)
- clonal_structure_inference.ipynb (379.14 KB)
- clonal_structure.py (20.10 KB)
- cluster_clone_match.tsv (3.85 KB)
- linear_mixed_models.Rmd (30 KB)
- mutation_statistics.ipynb (387.97 KB)
- no_interaction_R2_decomposition.csv (704 B)
- phenotype_analysis.ipynb (7.21 MB)
- phenotyping.tar.gz (15.73 MB)
- pyclone_inputs.tar.gz (353.47 KB)
- pyclone_output.tar.gz (253.56 KB)
- README.md (41.07 KB)
- rep_time.tsv (31.37 KB)
- Sampling_experiment_day_matches.txt (531.67 KB)
- tree_analysis.ipynb (608.26 KB)
- utils.py (5.32 KB)
- vcfs.tar.gz (6.71 MB)
Abstract
Recent studies have revealed bacterial genome-wide evolution to be complex and dynamic even in a constant environment, characterized by the emergence of new clades competing or temporarily coexisting as each clade undergoes evolutionary change. Previous studies on predator-prey dynamics tracking simple ecological and phenotypic metrics have shown predation to fundamentally alter prey evolution, facilitating defense evolution followed by coevolution and frequency-dependent selection between defended and undefended prey genotypes. Here, we sought to consolidate these fields by examining genome-wide evolution in five bacterial prey species separately subjected to long-term evolution under ciliate predation. We hypothesized that the presence of predation could change the pattern of clonal dynamics, for example, by more frequently producing selective sweeps if predation-defense-related mutations are under strong selection. For all species, we found mutational signals of prey adaptation, with phenotypic data and genomic mutation targets demonstrating changes in composition between the experimental treatments. Intriguingly, despite higher variant counts, overall temporal clade dynamics across the coevolved prey species were strikingly similar to those of bacteria evolving alone, with constant emergence, competition and quasi-stable coexistence of clades. This study shows that long-term molecular evolution in bacterial prey under predation is more interesting and less predictable than we might expect based on existing coevolutionary theories.
Dataset DOI: 10.5061/dryad.ffbg79d8s
Description of the data and file structure
Experimental and derived datasets for the study "Universal bacterial clade dynamics dominate under predation despite altered phenotypes and mutation targets". The experimental data include LogPhase measurements for each of the five single-species experiments (evolved alone or coevolved with the predator), grown on three different growth media (files in the "phenotyping.tar.gz" archive; see the full article for details). Genomic variants are provided in .vcf files (in the "vcfs.tar.gz" archive). Raw genomic sequences are deposited in ENA (accession: PRJEB85532). The remaining files are program code and various intermediate data that facilitate replication of the results.
Files and variables
Note: IDs of study populations used across files
Evolutionary histories are denoted NP (no predator) and PS (predator selection, i.e., with predator). Species are denoted Bd (Brevundimonas diminuta), Ct (Comamonas testosteroni), Pf (Pseudomonas fluorescens), Sc (Sphingomonas capsulata), Sm (Serratia marcescens), and "r01", "r02", "r03" refer to the three replicate populations of the treatment-species combination.
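The naming scheme above can be decoded programmatically. The following is a minimal sketch, assuming population IDs have the layout "<history>_<species>_<replicate>" seen in file names such as "NP_Bd_r01_pyclone_input.tsv"; the helper name parse_population_id is hypothetical and not part of the deposited code.

```python
import re

# Lookup tables taken from the note on population IDs.
HISTORIES = {"NP": "no predator", "PS": "predator selection"}
SPECIES = {
    "Bd": "Brevundimonas diminuta",
    "Ct": "Comamonas testosteroni",
    "Pf": "Pseudomonas fluorescens",
    "Sc": "Sphingomonas capsulata",
    "Sm": "Serratia marcescens",
}

# Assumed layout: history, species and replicate joined by underscores.
ID_PATTERN = re.compile(r"^(NP|PS)_(Bd|Ct|Pf|Sc|Sm)_r(\d{2})")


def parse_population_id(name):
    """Decode a population ID (or a file name that starts with one)."""
    match = ID_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised population ID: {name!r}")
    history, species, replicate = match.groups()
    return {
        "history": HISTORIES[history],
        "species": SPECIES[species],
        "replicate": int(replicate),
    }


info = parse_population_id("PS_Sm_r03_final.tsv")
```

Because the pattern only anchors at the start of the string, the same helper works on bare IDs and on file names that begin with an ID.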
File: chosen_sample_df.csv
Description: a table containing a list of curated genomic sample filenames for each experimental population. The index column (unnamed) is an ID for each study population.
Variables
- Chosen samples: a list of names of the sequencing samples that were used to compute the clonal structure of each population, intended to be read in Python.
File: clonal_order_df.csv
Description: a table containing a list of tuples for each experimental population. Each tuple contains the number of the emerged clone in the population and its emergence time. The index column (unnamed) is an ID for each study population.
Variables
- Order: a list of tuples containing the number (i.e., PyClone cluster) of the emerged clone and its emergence time index. Intended to be read in Python.
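Columns described as "intended to be read in Python" hold lists inside a single table cell. The sketch below shows one way to parse them, assuming each cell is stored as a Python literal (for example "[(0, 0), (3, 2)]"); this storage format is an assumption and should be checked against the actual files. The function name read_list_column and the demo rows are illustrative only.

```python
import ast
import csv
import io


def read_list_column(handle, column, delimiter=","):
    """Map each population ID (first, unnamed column) to the parsed cell."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    position = header.index(column)
    # ast.literal_eval only evaluates literals, so it is safer than eval().
    return {row[0]: ast.literal_eval(row[position]) for row in reader if row}


# Made-up rows in the assumed layout, not values from the dataset.
demo = io.StringIO(',Order\nNP_Bd_r01,"[(0, 0), (3, 2)]"\n')
orders = read_list_column(demo, "Order")
```

With the real files, the handle would come from something like open("clonal_order_df.csv", newline=""), and delimiter="\t" would be passed for the .tsv tables.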
File: cluster_clone_match.tsv
Description: a table containing the clone number, the clone name (represented both as a single letter and as a string of characters that follows the clone hierarchy), and the time of its emergence.
Variables
- Replicate: an ID for each study population.
- Cluster: PyClone cluster number.
- Clone: Clone encoding label, representing clonal hierarchy
- Clone letter: each clone is assigned a letter (used to generate the clone encoding label).
- Time of emergence: time point index (used with "rep_time.tsv" to get the exact number of days).
File: rep_time.tsv
Description: a table containing a list for each experimental population, representing the curated list of sampling time points in days during the long-term experimental evolution experiment. The index column (unnamed) is an ID for each study population.
Variables
- Days: a list of days, intended to be read in Python. Files that use time point indices (like "cluster_clone_match.tsv") have to refer to this list to get the exact number of days in the experiment.
- Samples: a list of samples that correspond to each day in the experiment, intended to be read in Python.
- Dates: a list of dates that correspond to each day and sample in the experiment, intended to be read in Python.
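Converting a time point index into experiment days amounts to a lookup in the "Days" list. The sketch below assumes the index is a zero-based position in that list; the README does not state the indexing base, so the zero_based switch is provided and the assumption should be verified against the analysis notebooks. The function name and the demo values are hypothetical.

```python
def emergence_day(days, time_index, zero_based=True):
    """Return the experiment day for a time point index.

    The indexing base is an assumption; set zero_based=False if the
    indices turn out to start at 1.
    """
    position = time_index if zero_based else time_index - 1
    if not 0 <= position < len(days):
        raise IndexError(f"time index {time_index} is outside the Days list")
    return days[position]


# Made-up sampling days, not values from rep_time.tsv.
demo_days = [0, 28, 56, 84]
day = emergence_day(demo_days, 2)
```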
File: Sampling_experiment_day_matches.txt
Description: the main experimental metadata file.
Variables
- id: unique experimental population identifier;
- bacterial_strain: species name;
- organism_type: bacteria or predator;
- experiment: evolutionary history identifier, denoting evolving-alone (NP) and coevolving with predator (PS) populations;
- replicate: replicate experimental population (three for each species and evolutionary history combination);
- sampling_date: the calendar date the experimental population was sampled;
- day_in_experiment: the day in the experiment that corresponds to the sampling date;
- unreliable_pop_size: a boolean variable denoting whether the experimental population size (prey optical density) was trustworthy (e.g., the optical density close to the detection threshold was considered unreliable);
- notes: additional notes related to the experimental population;
- OD600: optical density measurement of the prey population in the experimental population;
- pred_cells_ml: ciliate cell counts in the experimental population;
- pred_corrected: corrected ciliate cell counts in the experimental population.
File: pyclone_inputs.tar.gz
Description: an archive with input files to the PyClone-VI software for each experimental population, representing curated genomic variant trajectories. These files are generated using variant calling output files (see "vcfs.tar.gz" file description). The variables are shared across all files and are defined below.
Variables
- mutation_id: mutation ID in the corresponding VCF file.
- sample_id: genomic sequencing sample ID.
- ref_counts: number of reads where the variant was not detected.
- alt_counts: number of reads where the variant was detected.
- major_cn: major copy number (all are 1; needed for PyClone to run).
- minor_cn: minor copy number (all are 0; needed for PyClone to run).
- normal_cn: normal copy number (all are 1; needed for PyClone to run).
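As a quick sanity check of these inputs, the fraction of reads supporting each variant can be computed as alt_counts / (ref_counts + alt_counts). The sketch below is only an illustration using the column names listed above; it is not PyClone's cellular prevalence estimate, and the demo row is made up.

```python
import csv
import io


def variant_allele_frequency(ref_counts, alt_counts):
    """Fraction of reads carrying the variant, or None at zero depth."""
    depth = ref_counts + alt_counts
    if depth == 0:
        return None
    return alt_counts / depth


# Made-up row using the documented column names.
demo = io.StringIO(
    "mutation_id\tsample_id\tref_counts\talt_counts\tmajor_cn\tminor_cn\tnormal_cn\n"
    "mut1\tsampleA\t75\t25\t1\t0\t1\n"
)
frequencies = {
    (row["mutation_id"], row["sample_id"]): variant_allele_frequency(
        int(row["ref_counts"]), int(row["alt_counts"])
    )
    for row in csv.DictReader(demo, delimiter="\t")
}
```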
File: NP_Bd_r01_pyclone_input.tsv
Description: PyClone input file for NP_Bd_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Bd_r02_pyclone_input.tsv
Description: PyClone input file for NP_Bd_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Bd_r03_pyclone_input.tsv
Description: PyClone input file for NP_Bd_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Ct_r01_pyclone_input.tsv
Description: PyClone input file for NP_Ct_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Ct_r02_pyclone_input.tsv
Description: PyClone input file for NP_Ct_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Ct_r03_pyclone_input.tsv
Description: PyClone input file for NP_Ct_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Pf_r01_pyclone_input.tsv
Description: PyClone input file for NP_Pf_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Pf_r02_pyclone_input.tsv
Description: PyClone input file for NP_Pf_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Pf_r03_pyclone_input.tsv
Description: PyClone input file for NP_Pf_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Sc_r01_pyclone_input.tsv
Description: PyClone input file for NP_Sc_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Sc_r02_pyclone_input.tsv
Description: PyClone input file for NP_Sc_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Sc_r03_pyclone_input.tsv
Description: PyClone input file for NP_Sc_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Sm_r01_pyclone_input.tsv
Description: PyClone input file for NP_Sm_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Sm_r02_pyclone_input.tsv
Description: PyClone input file for NP_Sm_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: NP_Sm_r03_pyclone_input.tsv
Description: PyClone input file for NP_Sm_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Bd_r01_pyclone_input.tsv
Description: PyClone input file for PS_Bd_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Bd_r02_pyclone_input.tsv
Description: PyClone input file for PS_Bd_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Bd_r03_pyclone_input.tsv
Description: PyClone input file for PS_Bd_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Ct_r01_pyclone_input.tsv
Description: PyClone input file for PS_Ct_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Ct_r02_pyclone_input.tsv
Description: PyClone input file for PS_Ct_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Ct_r03_pyclone_input.tsv
Description: PyClone input file for PS_Ct_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Pf_r01_pyclone_input.tsv
Description: PyClone input file for PS_Pf_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Pf_r02_pyclone_input.tsv
Description: PyClone input file for PS_Pf_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Pf_r03_pyclone_input.tsv
Description: PyClone input file for PS_Pf_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Sc_r01_pyclone_input.tsv
Description: PyClone input file for PS_Sc_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Sc_r02_pyclone_input.tsv
Description: PyClone input file for PS_Sc_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Sc_r03_pyclone_input.tsv
Description: PyClone input file for PS_Sc_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Sm_r01_pyclone_input.tsv
Description: PyClone input file for PS_Sm_r01 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Sm_r02_pyclone_input.tsv
Description: PyClone input file for PS_Sm_r02 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: PS_Sm_r03_pyclone_input.tsv
Description: PyClone input file for PS_Sm_r03 sample (see Note at the top of this README to interpret the file name). See the "pyclone_inputs.tar.gz" file description for variable description.
File: pyclone_output.tar.gz
Description: an archive with PyClone software output files for each experimental population. See https://github.com/Roth-Lab/pyclone-vi for output file details.
Variables
- mutation_id: mutation ID in the corresponding VCF file.
- sample_id: genomic sequencing sample ID.
- cluster_id: PyClone-assigned cluster number.
- cellular_prevalence: proportion of population members carrying the relevant mutation.
- cellular_prevalence_std: standard error of the cellular_prevalence estimate.
- cluster_assignment_prob: posterior probability of the mutation assignment to the cluster.
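To follow clusters through time, the output rows can be regrouped by cluster and sample. The sketch below uses only the column names listed above and assumes the files are tab-separated with a header row; the function name cluster_trajectories and the demo rows are hypothetical. It keeps one cellular_prevalence value per cluster and sample, on the assumption that mutations in the same cluster share that value.

```python
import csv
import io
from collections import defaultdict


def cluster_trajectories(handle):
    """Return ({cluster: {sample: prevalence}}, {cluster: mutation count})."""
    prevalence = defaultdict(dict)
    mutations = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        cluster = int(row["cluster_id"])
        # Later rows overwrite earlier ones for the same cluster and sample.
        prevalence[cluster][row["sample_id"]] = float(row["cellular_prevalence"])
        mutations[cluster].add(row["mutation_id"])
    sizes = {cluster: len(ids) for cluster, ids in mutations.items()}
    return dict(prevalence), sizes


# Made-up rows in the documented column layout.
rows = [
    ["mutation_id", "sample_id", "cluster_id", "cellular_prevalence",
     "cellular_prevalence_std", "cluster_assignment_prob"],
    ["m1", "s1", "0", "0.9", "0.01", "1.0"],
    ["m1", "s2", "0", "0.5", "0.01", "1.0"],
    ["m2", "s1", "0", "0.9", "0.01", "1.0"],
    ["m2", "s2", "0", "0.5", "0.01", "1.0"],
    ["m3", "s1", "1", "0.1", "0.01", "1.0"],
    ["m3", "s2", "1", "0.4", "0.01", "1.0"],
]
demo = io.StringIO("\n".join("\t".join(row) for row in rows) + "\n")
trajectories, cluster_sizes = cluster_trajectories(demo)
```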
File: NP_Bd_r01_final.tsv
Description: PyClone output file for NP_Bd_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Bd_r02_final.tsv
Description: PyClone output file for NP_Bd_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Bd_r03_final.tsv
Description: PyClone output file for NP_Bd_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Ct_r01_final.tsv
Description: PyClone output file for NP_Ct_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Ct_r02_final.tsv
Description: PyClone output file for NP_Ct_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Ct_r03_final.tsv
Description: PyClone output file for NP_Ct_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Pf_r01_final.tsv
Description: PyClone output file for NP_Pf_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Pf_r02_final.tsv
Description: PyClone output file for NP_Pf_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Pf_r03_final.tsv
Description: PyClone output file for NP_Pf_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Sc_r01_final.tsv
Description: PyClone output file for NP_Sc_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Sc_r02_final.tsv
Description: PyClone output file for NP_Sc_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Sc_r03_final.tsv
Description: PyClone output file for NP_Sc_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Sm_r01_final.tsv
Description: PyClone output file for NP_Sm_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Sm_r02_final.tsv
Description: PyClone output file for NP_Sm_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: NP_Sm_r03_final.tsv
Description: PyClone output file for NP_Sm_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Bd_r01_final.tsv
Description: PyClone output file for PS_Bd_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Bd_r02_final.tsv
Description: PyClone output file for PS_Bd_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Bd_r03_final.tsv
Description: PyClone output file for PS_Bd_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Ct_r01_final.tsv
Description: PyClone output file for PS_Ct_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Ct_r02_final.tsv
Description: PyClone output file for PS_Ct_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Ct_r03_final.tsv
Description: PyClone output file for PS_Ct_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Pf_r01_final.tsv
Description: PyClone output file for PS_Pf_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Pf_r02_final.tsv
Description: PyClone output file for PS_Pf_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Pf_r03_final.tsv
Description: PyClone output file for PS_Pf_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Sc_r01_final.tsv
Description: PyClone output file for PS_Sc_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Sc_r02_final.tsv
Description: PyClone output file for PS_Sc_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Sc_r03_final.tsv
Description: PyClone output file for PS_Sc_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Sm_r01_final.tsv
Description: PyClone output file for PS_Sm_r01 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Sm_r02_final.tsv
Description: PyClone output file for PS_Sm_r02 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: PS_Sm_r03_final.tsv
Description: PyClone output file for PS_Sm_r03 sample (see Note at the top of this README to interpret the file name). See "pyclone_output.tar.gz" file description for variable description.
File: phenotyping.tar.gz
Description: an archive with LogPhase600 measurement device output files, measuring optical density of the prey populations, sampled at different time points across the long-term experimental evolution experiment, growing on three different media (see main article for details).
The archive includes two directories for each study species (BD - Brevundimonas diminuta, CT - Comamonas testosteroni, PF - Pseudomonas fluorescens, SC - Sphingomonas capsulata, SM - Serratia marcescens), representing the two evolutionary lines (Evo - bacteria evolved alone, CoEvo - bacteria coevolved with the ciliate). Additionally, the "Ancestors" directory contains growth curves of the ancestral strains of each study species. Inside each directory, there are several Excel files, each of them containing 12 sheets, named Plate [1, 2, 3, 4] - [Results, Raw data, Procedure]. See "Plate_overview_for_12_plate_growth_curve_assay.xlsx" for mapping of plates to isolates, study population replicates, and sampling timing.
File names refer to the combination of species (BD, CT, PF, SC, SM), evolutionary line (Evo, CoEvo), and growth medium: KB (control), KCl (salt stress), or tetra (predator). Due to the design of the experiment, one Excel file may contain more than one species-history-medium combination. For example, the filename "BD_CoEvo_KB_1-3_BD_CoEvo_KCl_1_01-kesä-2023_11-11-06.xlsx" indicates that the sheets named "Plate 1-", "Plate 2-" and "Plate 3-" contain measurements for the three technical replicates of BD-CoEvo in KB medium, whereas the fourth plate ("Plate 4") is technical replicate 1 of BD-CoEvo in KCl. The remaining replicates for this treatment combination are stored in other Excel files. The rest of the filename (in this case, "01-kesä-2023_11-11-06.xlsx") contains the measurement date and machine output IDs, which are not relevant to the measurement results and can be ignored. Some measurements include a fourth technical replicate; these were excluded from the analysis to keep the number of replicates per treatment combination consistent. Some Excel files are duplicated if they contain plates for two different species.
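The filename convention can be turned into a plate-to-treatment map. The following is a rough sketch, assuming plates are numbered in the order the combinations appear in the filename, as in the example above; this has not been checked against every file in the archive. The pattern tolerates the spelling variants visible in the file list (e.g., "EVO", "CoeEvo", "KCL", "KB1-3"), and the function name plate_layout is hypothetical.

```python
import re

# Species, evolutionary line, medium, then a replicate number or range.
COMBO = re.compile(
    r"(BD|CT|PF|SC|SM)_(CoEvo|CoeEvo|Evo)_(KB|KCl|tetra)_?(\d)(?:-(\d))?",
    re.IGNORECASE,
)


def plate_layout(filename):
    """Map plate number -> (species, line, medium, technical replicate)."""
    layout = {}
    plate = 1
    for species, line, medium, first, last in COMBO.findall(filename):
        last = last or first  # a single replicate has no "-N" part
        for replicate in range(int(first), int(last) + 1):
            layout[plate] = (species.upper(), line, medium, replicate)
            plate += 1
    return layout


layout = plate_layout("BD_CoEvo_KB_1-3_BD_CoEvo_KCl_1_01-kesä-2023_11-11-06.xlsx")
```

Line and medium labels are returned as spelled in the filename, so they may need normalising (for instance with str.lower()) before comparing across files.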
The "Results" sheet contains information for each measured experimental plate well and has 5 columns: "Well" maps the information to a specific well location on the 96-well experimental plate; "Name" is empty (the species and medium combination is in the file name; see above); "Lagtime" is the LogPhase-computed growth lag time (in hours:minutes:seconds); "Max Rate (OD/min)" is the maximum growth rate of the recorded growth curve; "Stationary phase" refers to the beginning of the stationary phase of the population growth (in hours:minutes:seconds).
The "Raw data" sheet contains the raw OD curves for each well of the 96-well experimental plate. Each column is named after a well ID, except "Time", which gives the times at which measurements were made (every 10 minutes) in hours:minutes:seconds format.
The "Procedure" sheet contains a summary of the LogPhase600 machine settings.
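Values given in hours:minutes:seconds (lag time, start of stationary phase, and the "Time" column) usually need converting to a numeric scale before analysis. The sketch below assumes the values arrive as text such as "2:30:00"; depending on how the Excel files are read, they may instead arrive as time objects, in which case this helper (hypothetical, not part of the deposited code) does not apply.

```python
def hms_to_hours(text):
    """Convert an "hours:minutes:seconds" string to decimal hours."""
    hours, minutes, seconds = (int(part) for part in text.strip().split(":"))
    return hours + minutes / 60 + seconds / 3600


lag_hours = hms_to_hours("2:30:00")
```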
File: Plate_overview_for_12_plate_growth_curve_assay.xlsx
Description: This file contains a map of the 96-well experimental plate to the sampling time points, isolates, and replicate measurements. One plate contains one technical replicate of each isolate in one treatment for all samples of one species from either CoEvo or Evo evolutionary lines (each plate contains isolates from the three replicate populations). Only relevant to non-ancestral LogPhase600 measurements. The file includes color formatting and is intended to be used as a visual guide to the experimental design of the phenotypic measurement assay of this study. The 96-well plates are divided into several sections that house different sampling time points (TP3 was not used in the analyses of the present study) and 10 isolates for each study population. Green color indicates time point 1 (early), blue color indicates time point 2 (late), orange indicates TP3 (unused in the present study), and pink color indicates control wells (no bacterial cells present, just the growth medium). The plate design is the same for all three growth media.
File: ANC_test_platemap.xlsx
Description: This file contains a map of the 96-well experimental plate to the treatments (species and growth medium). Only relevant to ancestral LogPhase600 measurements. The file includes color formatting and is intended to be used as a visual guide to the experimental design of the phenotypic measurement assay of this study. The 96-well plate is divided into three sections, one for each growth medium (KB: control, KCl: salt stress, Tetra: predator). Yellow color represents the control medium, pink represents the predator medium, and light blue represents the salt stress medium. See Note at the top of the README document for species encodings (Ec and F/Ct were unused in our analysis; they represent Escherichia coli and Flavobacterium sp. (HAMBI 3533)).
File: ANC_test_30-kesä-2023_13-12-23.xlsx
Description: LogPhase600 output file for ancestral strain measurements. Needs "ANC_test_platemap.xlsx" file to be interpreted. See "phenotyping.tar.gz" file description for file structure details.
File: BD_CoEvo_KB_1-3_BD_CoEvo_KCl_1_01-kesä-2023_11-11-06.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: BD_CoEvo_KCl_2-3_BD_CoEvo_tetra_1-2_01-kesä-2023_11-11-00.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: BD_CoEvo_tetra_3_CT_CoEvo_KB_1-3_01-kesä-2023_11-10-55.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: BD_Evo_KB_1-4_05-helmi-2023_11-37-27.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: BD_Evo_KCL_1-4_05-helmi-2023_11-37-01.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: BD_Evo_Tetra_1-4_05-helmi-2023_11-37-17.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: CT_CoEvo_KCl_1-3_01-kesä-2023_11-10-49.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: CT_CoEvo_Tetra_1-3_01-kesä-2023_11-10-42.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: CT_Evo_KB_1-4_27-tammi-2023_12-59-04.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: CT_Evo_KCL_1-4_27-tammi-2023_12-59-32.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: CT_Evo_Tetra_1-4_27-tammi-2023_13-00-38.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: PF_CoEvo_KB_1-4_07-tammi-2023_15-24-26.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: PF_CoEvo_KCL_1-4_07-tammi-2023_15-25-01.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: PF_CoEvo_Tetra_1-4_07-tammi-2023_15-25-55.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: PF_EVO_KB1-4_16.1.2023_18-tammi-2023_13-27-00.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: PF_EVO_KCL1-4_16.1.2023_18-tammi-2023_13-27-03.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: PF_EVO_tetra1-4_16.1.2023_18-tammi-2023_13-26-44.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SC_CoeEvo_KB_2-3_SC_CoEvo_Tetra_1-2_11-touko-2023_11-31-25.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SC_CoEvo_KCl1-3_SC_CoEvo_KB1_11-touko-2023_11-31-32.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SC_CoEvo_Tetra3_SC_Evo_KB1-3_11-touko-2023_11-31-19.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SC_Evo_KCl1-3_11-touko-2023_11-31-01.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SC_Evo_tetra_1-3_11-touko-2023_11-31-12.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SM_CoEvo_KB1-3_SM_CoEvo_KCL1_26-touko-2023_10-57-44.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SM_CoEvo_KCL2-3_SM_CoEvo_Tetra_1-2_26-touko-2023_10-57-38.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SM_CoEvo_Tetra_3_SM_Evo_Tetra_1-3_26-touko-2023_10-57-32.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SM_Evo_KB1-3_26-touko-2023_10-57-26.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: SM_Evo_KCL1-3_26-touko-2023_10-57-18.xlsx
Description: LogPhase600 output file. See "phenotyping.tar.gz" file description for file structure details. Needs "Plate_overview_for_12_plate_growth_curve_assay.xlsx" file to be interpreted.
File: vcfs.tar.gz
Description: an archive containing the detected and annotated genomic variants (VCF files) for each experimental population. Variants were called with GATK's Mutect2 and the resulting files were filtered with GATK's FilterMutectCalls; only SNPs and indels that passed this filter (FILTER value PASS) were retained. Each variant was then annotated with snpEff. See https://github.com/samtools/hts-specs for the VCF file specification.
File: NP_Bd_r01_filtered_anno.vcf
Description: variant calling result file for NP_Bd_r01 sample (see Note at the top of this README to interpret the file name).
File: NP_Bd_r02_filtered_anno.vcf
Description: variant calling result file for NP_Bd_r02 sample (see Note at the top of this README to interpret the file name).
File: NP_Bd_r03_filtered_anno.vcf
Description: variant calling result file for NP_Bd_r03 sample (see Note at the top of this README to interpret the file name).
File: NP_Ct_r01_filtered_anno.vcf
Description: variant calling result file for NP_Ct_r01 sample (see Note at the top of this README to interpret the file name).
File: NP_Ct_r02_filtered_anno.vcf
Description: variant calling result file for NP_Ct_r02 sample (see Note at the top of this README to interpret the file name).
File: NP_Ct_r03_filtered_anno.vcf
Description: variant calling result file for NP_Ct_r03 sample (see Note at the top of this README to interpret the file name).
File: NP_Pf_r01_filtered_anno.vcf
Description: variant calling result file for NP_Pf_r01 sample (see Note at the top of this README to interpret the file name).
File: NP_Pf_r02_filtered_anno.vcf
Description: variant calling result file for NP_Pf_r02 sample (see Note at the top of this README to interpret the file name).
File: NP_Pf_r03_filtered_anno.vcf
Description: variant calling result file for NP_Pf_r03 sample (see Note at the top of this README to interpret the file name).
File: NP_Sc_r01_filtered_anno.vcf
Description: variant calling result file for NP_Sc_r01 sample (see Note at the top of this README to interpret the file name).
File: NP_Sc_r02_filtered_anno.vcf
Description: variant calling result file for NP_Sc_r02 sample (see Note at the top of this README to interpret the file name).
File: NP_Sc_r03_filtered_anno.vcf
Description: variant calling result file for NP_Sc_r03 sample (see Note at the top of this README to interpret the file name).
File: NP_Sm_r01_filtered_anno.vcf
Description: variant calling result file for NP_Sm_r01 sample (see Note at the top of this README to interpret the file name).
File: NP_Sm_r02_filtered_anno.vcf
Description: variant calling result file for NP_Sm_r02 sample (see Note at the top of this README to interpret the file name).
File: NP_Sm_r03_filtered_anno.vcf
Description: variant calling result file for NP_Sm_r03 sample (see Note at the top of this README to interpret the file name).
File: PS_Bd_r01_filtered_anno.vcf
Description: variant calling result file for PS_Bd_r01 sample (see Note at the top of this README to interpret the file name).
File: PS_Bd_r02_filtered_anno.vcf
Description: variant calling result file for PS_Bd_r02 sample (see Note at the top of this README to interpret the file name).
File: PS_Bd_r03_filtered_anno.vcf
Description: variant calling result file for PS_Bd_r03 sample (see Note at the top of this README to interpret the file name).
File: PS_Ct_r01_filtered_anno.vcf
Description: variant calling result file for PS_Ct_r01 sample (see Note at the top of this README to interpret the file name).
File: PS_Ct_r02_filtered_anno.vcf
Description: variant calling result file for PS_Ct_r02 sample (see Note at the top of this README to interpret the file name).
File: PS_Ct_r03_filtered_anno.vcf
Description: variant calling result file for PS_Ct_r03 sample (see Note at the top of this README to interpret the file name).
File: PS_Pf_r01_filtered_anno.vcf
Description: variant calling result file for PS_Pf_r01 sample (see Note at the top of this README to interpret the file name).
File: PS_Pf_r02_filtered_anno.vcf
Description: variant calling result file for PS_Pf_r02 sample (see Note at the top of this README to interpret the file name).
File: PS_Pf_r03_filtered_anno.vcf
Description: variant calling result file for PS_Pf_r03 sample (see Note at the top of this README to interpret the file name).
File: PS_Sc_r01_filtered_anno.vcf
Description: variant calling result file for PS_Sc_r01 sample (see Note at the top of this README to interpret the file name).
File: PS_Sc_r02_filtered_anno.vcf
Description: variant calling result file for PS_Sc_r02 sample (see Note at the top of this README to interpret the file name).
File: PS_Sc_r03_filtered_anno.vcf
Description: variant calling result file for PS_Sc_r03 sample (see Note at the top of this README to interpret the file name).
File: PS_Sm_r01_filtered_anno.vcf
Description: variant calling result file for PS_Sm_r01 sample (see Note at the top of this README to interpret the file name).
File: PS_Sm_r02_filtered_anno.vcf
Description: variant calling result file for PS_Sm_r02 sample (see Note at the top of this README to interpret the file name).
File: PS_Sm_r03_filtered_anno.vcf
Description: variant calling result file for PS_Sm_r03 sample (see Note at the top of this README to interpret the file name).
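As a quick sanity check on the extracted VCF files, the following minimal sketch counts SNP and indel records per file using only the Python standard library. It is not part of the deposited code. It assumes the archive has been extracted into ./vcfs/ and that the files are plain-text, tab-separated VCFs; the function name `summarise_vcf` is hypothetical. Records are classified by allele length only, which is a simplification.

```python
from pathlib import Path


def summarise_vcf(path):
    """Count SNP, indel and non-PASS records in a plain-text VCF file."""
    counts = {"snp": 0, "indel": 0, "non_pass": 0}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip meta-information, header and blank lines
            fields = line.rstrip("\n").split("\t")
            # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
            ref, alts, filt = fields[3], fields[4].split(","), fields[6]
            if filt != "PASS":
                counts["non_pass"] += 1
                continue
            # Simplified rule: a SNP has a single-base REF and single-base ALT alleles.
            is_snp = len(ref) == 1 and all(len(alt) == 1 for alt in alts)
            counts["snp" if is_snp else "indel"] += 1
    return counts


if __name__ == "__main__":
    vcf_dir = Path("vcfs")  # assumed extraction directory
    if vcf_dir.is_dir():
        for vcf in sorted(vcf_dir.glob("*_filtered_anno.vcf")):
            print(vcf.name, summarise_vcf(vcf))
```

Because only PASS variants were retained, the `non_pass` count is expected to be zero for every file; a non-zero value would suggest the file differs from what is described here.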
File: ancestral_evolved_aucs.csv
Description: a table with computed ancestral to (co-)evolved prey growth area under the curve (AUC) ratios, used to compute linear mixed models in R.
Variables
- Species: a species identifier (BD: Brevundimonas diminuta, CT: Comamonas testosteroni, PF: Pseudomonas fluorescens, SC: Sphingomonas capsulata, SM: Serratia marcescens).
- Treatment: evolutionary line identifier (Evo: bacteria evolved alone, CoEvo: bacteria coevolved with ciliate).
- Medium: growth medium for growth phenotype measurement (KB: control medium, KCl: salt stress medium, Tetra: predator medium).
- Timepoint: time point of the phenotypic sampling (TP1: early, TP2: late).
- Ratio: ancestral to (co-)evolved prey growth curve AUC ratio.
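The table can be inspected without R. The sketch below, which is not part of the deposited code, averages Ratio within each Species, Treatment, Medium and Timepoint combination. It assumes the CSV is comma-separated and that its header uses exactly the variable names listed above; the function name `mean_ratio_by_group` is hypothetical. Rows whose Ratio cannot be parsed as a number are skipped.

```python
import csv
from collections import defaultdict
from pathlib import Path
from statistics import mean

# Column names as listed in this README (assumed to match the CSV header).
GROUP_COLUMNS = ("Species", "Treatment", "Medium", "Timepoint")


def mean_ratio_by_group(path):
    """Return the mean AUC ratio for each combination of the grouping columns."""
    groups = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            try:
                ratio = float(row["Ratio"])
            except (TypeError, ValueError):
                continue  # skip rows with a missing or non-numeric ratio
            key = tuple(row[col] for col in GROUP_COLUMNS)
            groups[key].append(ratio)
    return {key: mean(values) for key, values in groups.items()}


if __name__ == "__main__":
    csv_path = Path("ancestral_evolved_aucs.csv")
    if csv_path.is_file():
        for key, value in sorted(mean_ratio_by_group(csv_path).items()):
            print(*key, f"{value:.3f}", sep="\t")
```

This is only a descriptive summary; the linear mixed models in linear_mixed_models.Rmd remain the reference analysis.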
File: no_interaction_R2_decomposition.csv
Description: linear mixed model R-squared decomposition results from R, to be used with Python code.
Variables
- Medium: growth medium for growth phenotype measurement.
- Term: term of the no-interaction linear mixed model (see manuscript for details).
- R2: R-squared metric.
Code/software
All code was written in Python (version 3.10.14) and R (version 4.5.2). This repository includes the program code files listed below. To run the code, place the non-archived input files in the same directory as the code files, and extract each archive into a directory of the same name (e.g., "vcfs.tar.gz" should be extracted into ./vcfs/).
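The directory layout described above can be prepared with a short script such as the following sketch, which is not part of the deposited code. The helper name `extract_archive` is hypothetical. Whether each archive already wraps its contents in a top-level folder is not stated here, so the sketch checks for that and avoids creating a nested directory such as ./vcfs/vcfs/.

```python
import tarfile
from pathlib import Path

# Archives listed in this dataset.
ARCHIVES = (
    "phenotyping.tar.gz",
    "pyclone_inputs.tar.gz",
    "pyclone_output.tar.gz",
    "vcfs.tar.gz",
)


def extract_archive(archive, base_dir="."):
    """Extract e.g. vcfs.tar.gz so that its files end up under <base_dir>/vcfs/."""
    archive = Path(archive)
    base = Path(base_dir)
    name = archive.name.removesuffix(".tar.gz")
    with tarfile.open(archive, "r:gz") as tar:
        # Collect the first path component of every archive member.
        top_levels = {Path(m).parts[0] for m in tar.getnames() if Path(m).parts}
        # If the archive already contains a single folder of the same name,
        # extract next to it; otherwise extract into a new folder of that name.
        dest = base if top_levels == {name} else base / name
        dest.mkdir(parents=True, exist_ok=True)
        tar.extractall(dest)
    return base / name


if __name__ == "__main__":
    for archive_name in ARCHIVES:
        if Path(archive_name).is_file():
            print("extracted to", extract_archive(archive_name))
```

Run the script from the directory that holds the downloaded files, so that the notebooks find the extracted directories next to them.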
Jupyter (IPython) notebooks:
clonal_structure_inference.ipynb - shows how the clonal structure is computed for a given experiment.
mutation_statistics.ipynb - computes mutation count and recurrence statistics and produces figures reported in the main article.
phenotype_analysis.ipynb - analyses the phenotypic data and produces figures reported in the main article.
tree_analysis.ipynb - analyses clonal structure trees, computes statistics and produces figures reported in the main article.
Python script files:
clonal_structure.py - main code with clonal structure inference algorithm.
utils.py - some helper functions.
R markdown files:
linear_mixed_models.Rmd - fits linear mixed models to the growth phenotype measurements (ancestral to (co-)evolved prey growth AUC ratios) of the experimental species grown on three different media.
Access information
Other publicly accessible locations of the data:
Data was derived from the following sources:
- ENA (accession: PRJEB85532)
