A phylogenetic host-range index reveals ecological constraints in phage specialisation and virulence
Data files
Jun 19, 2025 version files 219.26 KB
- All_phage_variables.csv (3.93 KB)
- Becket_test_for_modular_structure_of_matrices.md (7.63 KB)
- Contingency_table_analysis_moduls.md (1.53 KB)
- Modular_Matrix_Structure_test.md (32.52 KB)
- Nested_Matrix_Structure_test.md (5.32 KB)
- OTUMauri.csv (5.14 KB)
- OTUReu.csv (3.53 KB)
- Phage_virulence_raw_data_Mauritius.csv (13.92 KB)
- Phage_virulence_raw_data_Reunion.csv (14.32 KB)
- Phage_virulence.csv (25.50 KB)
- PHRI_and_phage_load_correlation_test.md (2.03 KB)
- Phylogenetic_signal_of_bacterial_hosts.md (4.03 KB)
- README.md (6.34 KB)
- Stats_tests_all_phage_variables.md (2.13 KB)
- Table_S1_Mauritius.csv (48.51 KB)
- Table_S1_Reunion.csv (39.97 KB)
- TreeRalstoMauri (1.61 KB)
- TreeRalstoReu (1.30 KB)
Abstract
Phages are typically known for having a limited host range, targeting particular strains within a bacterial species, but accurately measuring their specificity remains challenging. Factors like the genetic diversity or epidemiology of host bacteria are often disregarded, despite their potential influence on phage specialisation and virulence. This study focuses on the Ralstonia solanacearum species complex (RSSC), which comprises genetically diverse bacteria responsible for a major plant disease. It uses a diversified collection of RSSC phages to develop new host-range analysis methods and to test ecological and evolutionary hypotheses on phage host range. We introduce a new "phylogenetic host-range index" that employs an ecological diversity index to account for the genetic diversity of bacterial hosts, allowing systematic classification of phages along a continuum between specialists and generalists. We propose and provide evidence that the CRISPR-Cas immune system of bacteria more frequently targets generalist phages than specialist phages. We explore the hypothesis that generalist phages might exhibit lower virulence than specialist ones due to potential evolutionary trade-offs between host-range breadth and virulence. Importantly, contrasted correlations between phage virulence and host range depend on the epidemiological context. A trade-off was confirmed in a context of low bacterial diversity, but not in a context of higher bacterial diversity, where no apparent costs were detected for phages adapted to a wide range of hosts. This study highlights the need for genetic analyses in phage host range and for investigating ecological trade-offs that could improve both fundamental phage knowledge and applications in biocontrol or therapy.
Dataset DOI: 10.5061/dryad.tmpg4f596
Description of the data and file structure
Here, we aim to assess phage host range by integrating available phylogenetic data with phenotypic data and applying quantitative network analyses of phage-bacteria interactions.
All included R scripts were run in the following version of RStudio: Posit Team (2024). RStudio: IDE for R (Version 2024.04 “Chocolate Cosmos”). Posit Software, PBC. https://posit.co/
Files and variables
File: TreeRalstoReu
Description: Phylogenetic tree of the RSSC endoglucanase (egl) gene for the Reunion bacterial hosts. The phylogenetic distance between bacterial hosts was calculated from this tree, as previously established to study RSSC diversity (Fegan & Prior, 2005).
File: TreeRalstoMauri
Description: Phylogenetic tree of the RSSC endoglucanase (egl) gene for the Mauritius bacterial hosts. The phylogenetic distance between bacterial hosts was calculated from this tree, as previously established to study RSSC diversity (Fegan & Prior, 2005).
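The two trees supply the phylogenetic distances between hosts that a host-range index can build on. The exact PHRI formula is not given in this description, so the following is only a minimal sketch in Python (the dataset's own scripts are in R) that uses the mean pairwise distance among infected hosts as a stand-in diversity measure. The host names, the distance values and the function `mean_pairwise_distance` are hypothetical.

```python
from itertools import combinations


def mean_pairwise_distance(dist, hosts):
    """Mean phylogenetic distance over all unordered pairs of infected hosts.

    This is a stand-in diversity measure, NOT the PHRI formula of the study.
    """
    pairs = list(combinations(sorted(hosts), 2))
    if not pairs:
        return 0.0  # zero or one host: no pairwise diversity to measure
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


# Toy symmetric distances between three hypothetical hosts (invented values).
dist = {
    frozenset(("hostA", "hostB")): 0.02,  # closely related hosts
    frozenset(("hostA", "hostC")): 0.30,
    frozenset(("hostB", "hostC")): 0.28,
}

# A phage infecting only close relatives scores lower than one that
# also infects a distant host.
specialist = mean_pairwise_distance(dist, {"hostA", "hostB"})
generalist = mean_pairwise_distance(dist, {"hostA", "hostB", "hostC"})
print(specialist, generalist)
```

In this toy case the phage that also infects the distant `hostC` gets the higher score, which is the kind of specialist-to-generalist ordering a phylogenetically informed index is meant to capture.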
File: Stats_tests_all_phage_variables.md
Description: R scripts with basic statistical tests of association between phage variables
File: Phylogenetic_signal_of_bacterial_hosts.md
Description: R scripts to calculate the phylogenetic signal of the bacterial hosts of each phage
File: PHRI_and_phage_load_correlation_test.md
Description: R scripts to calculate the phylogenetic host-range index (PHRI) of phages and to test its correlation with phage load
File: Nested_Matrix_Structure_test.md
Description: R scripts to test the nested structure of a phage-bacteria interaction matrix
File: Contingency_table_analysis_moduls.md
Description: R scripts to analyse the matrix modules with contingency tables
File: Becket_test_for_modular_structure_of_matrices.md
Description: R scripts to test the modular structure of a phage-bacteria interaction matrix
File: Modular_Matrix_Structure_test.md
Description: R scripts to test the modular structure of a phage-bacteria interaction matrix
File: All_phage_variables.csv
Description: Data on phage phenotypic and genotypic traits
NA stands for "not applicable"
Variables
- island: island (Reunion or Mauritius)
- ph_name: taxonomic name of clone
- phsp: phage species
- phgenera: phage genus
- phmorph: morphology
- phgensize: genome size
- phGC: GC content
- phprot: encoded protein number
- phlifecyc: temperate or virulent
- haplotype: bacterial host
- inh: phage inhibitory capacity of bacterial growth
- maxinh: phage maximum inhibitory capacity of bacterial growth
- PHRI: Phylogenetic host range index
- MODULE_July2024: module
- mean_logpfu: average phage load
- phylosignal: phylogenetic signal value of the bacterial hosts targeted by the phage
- fitness:
File: Phage_virulence_raw_data_Reunion.csv
Description: Data on bacterial growth with and without phages for Reunion
Variables
- plate: day block of the experiment
- phname: phage name
- phage: yes (presence); no (absence)
- replicate: a, b or c
- haplotype: bacterial haplotype
- od: optical density (OD) at 600 nm
File: Phage_virulence_raw_data_Mauritius.csv
Description: Data on bacterial growth with and without phages for Mauritius
Variables
- plate: day block of the experiment
- phname: phage name
- phage: yes (presence); no (absence)
- replicate: a, b or c
- haplotype: bacterial haplotype
- od: optical density (OD) at 600 nm
File: Phage_virulence.csv
Description: Data on phage inhibitory capacity of bacterial growth
Variables
- phname: see above
- phsp: see above
- phgenera: see above
- phfamily: phage family
- phgensize: see above
- phGC: see above
- phprot: see above
- phlifecyc: see above
- phage: see above
- replicate: experimental replicate
- sequevar: bacterial sequevar
- average_od: cumulative OD values over 24 h
- positive_noph: average OD values of the bacterial strain without phage
- inhibition_value: phage inhibitory capacity of bacterial growth
- island: see above
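How `inhibition_value` was derived from `average_od` and `positive_noph` is not stated in this description. As a hedged illustration only, the sketch below assumes a simple relative reduction, 1 - average_od / positive_noph, and applies it to toy rows that reuse the column names listed above. The phage names, sequevar label, OD values and the function `relative_inhibition` are hypothetical, and the sketch is in Python although the dataset's scripts are in R.

```python
import csv
import io

# Toy rows with the documented column names; all values are invented.
toy_csv = """phname,replicate,sequevar,average_od,positive_noph
phageX,a,seq1,2.0,8.0
phageX,b,seq1,4.0,8.0
phageY,a,seq1,8.0,8.0
"""


def relative_inhibition(average_od, positive_noph):
    """Assumed formula: fraction of phage-free growth suppressed by the phage."""
    return 1.0 - average_od / positive_noph


rows = list(csv.DictReader(io.StringIO(toy_csv)))
inhibition = [
    relative_inhibition(float(row["average_od"]), float(row["positive_noph"]))
    for row in rows
]

for row, value in zip(rows, inhibition):
    print(row["phname"], row["replicate"], value)
```

Under this assumed formula, a value of 0 means growth equal to the phage-free control and values closer to 1 mean stronger inhibition. The `inhibition_value` column in the file should be treated as the authoritative figure.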
File: OTUReu.csv
Description: Data on average Reunion phage load per strain
Variables
- (unnamed first column): bacterial strain names
- Simangalove: average phage LOG10 pfu/mL on each bacterial strain
- Adzire: average phage LOG10 pfu/mL on each bacterial strain
- Sarlave: average phage LOG10 pfu/mL on each bacterial strain
- Elie: average phage LOG10 pfu/mL on each bacterial strain
- Anchaing: average phage LOG10 pfu/mL on each bacterial strain
- Albius: average phage LOG10 pfu/mL on each bacterial strain
- Raharianne: average phage LOG10 pfu/mL on each bacterial strain
- Cimandef: average phage LOG10 pfu/mL on each bacterial strain
- Dimitile: average phage LOG10 pfu/mL on each bacterial strain
- Heva: average phage LOG10 pfu/mL on each bacterial strain
File: OTUMauri.csv
Description: Data on average Mauritian phage load per strain
Variables
- (unnamed first column): bacterial strain names
- Jenny: average phage LOG10 pfu/mL on each bacterial strain
- Bakoly: average phage LOG10 pfu/mL on each bacterial strain
- Dina: average phage LOG10 pfu/mL on each bacterial strain
- Hennie: average phage LOG10 pfu/mL on each bacterial strain
- Firinga: average phage LOG10 pfu/mL on each bacterial strain
- Hyacinthe: average phage LOG10 pfu/mL on each bacterial strain
- Alix: average phage LOG10 pfu/mL on each bacterial strain
- Claudette: average phage LOG10 pfu/mL on each bacterial strain
- Darius: average phage LOG10 pfu/mL on each bacterial strain
- Gervaise: average phage LOG10 pfu/mL on each bacterial strain
- Eline: average phage LOG10 pfu/mL on each bacterial strain
- Gamede: average phage LOG10 pfu/mL on each bacterial strain
- Gerry: average phage LOG10 pfu/mL on each bacterial strain
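OTUReu.csv and OTUMauri.csv are both laid out as a strain-by-phage matrix with an unnamed first column. The sketch below shows one way such a matrix could be read and turned into a count of infected strains per phage. It is written in Python (the dataset's scripts are in R) and rests on assumptions not confirmed by this description: that a load above 0 marks an infection and that missing cells are empty or `NA`. The strain names and values are invented; the three column names are taken from the OTUMauri.csv variable list.

```python
import csv
import io

# Toy matrix in the documented layout: first column unnamed, one column per phage.
toy_csv = """,Jenny,Bakoly,Dina
strain1,6.5,0,3.2
strain2,0,0,4.1
strain3,7.0,NA,5.0
"""

reader = csv.reader(io.StringIO(toy_csv))
header = next(reader)
phages = header[1:]  # skip the unnamed strain-name column

# For each phage, collect the strains on which a positive load was recorded.
infected = {phage: set() for phage in phages}
for row in reader:
    strain = row[0]
    for phage, cell in zip(phages, row[1:]):
        # Assumption: empty or NA cells are missing; load > 0 means infection.
        if cell not in ("", "NA") and float(cell) > 0:
            infected[phage].add(strain)

# Host-range breadth as a plain count of infected strains.
breadth = {phage: len(hosts) for phage, hosts in infected.items()}
print(breadth)
```

A plain count like this ignores how related the infected strains are, which is the gap a phylogenetically weighted index such as the PHRI is designed to fill.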
Files: Table_S1_Reunion.csv and Table_S1_Mauritius.csv
Description: Lists of Ralstonia solanacearum strains used for assessing the host range of phages from Reunion and Mauritius, respectively. Each list includes phylogenetic assignment, year and location of isolation, along with the corresponding egl sequence used for sequevar classification.
NA stands for "not applicable", ND stands for "not determined"
Variables
- Species
- Phylotype-sequevar
- Code (strain name)
- Haplotype
- Origin (country)
- Sequence EGL
