The global network of domestic mammal hosts and zoonotic bacteria: Implications for disease transmission and detection
Data files
Sep 26, 2025 version (total size: 77.89 KB)
- cytb_supertree.tree (40.73 KB)
- domestic_year.csv (1.46 KB)
- OIK11077_R2.R (21.65 KB)
- README.md (3.75 KB)
- zoo_bac_dom2.csv (10.29 KB)
Abstract
Zoonotic diseases, defined as those transmitted from animals to humans, represent over 75% of emerging infectious diseases and pose ongoing challenges to public health. Understanding the structure of host–pathogen interactions is critical for anticipating and mitigating such threats. In this study, we analysed the global bipartite network formed by zoonotic bacteria and their domestic mammal hosts. A bipartite network links two distinct sets of nodes—in this case, bacteria and hosts—through documented interactions. We aimed to (1) describe the structure of this host–bacteria (H–B) network using ecological network metrics, and (2) assess whether host traits, namely time since domestication and phylogenetic proximity to humans, predict the number of associated zoonotic bacteria per host species. The network, built from literature-based associations involving 24 domestic mammals and 51 zoonotic bacteria, exhibited high nestedness (a pattern where specialist bacteria infect subsets of hosts used by generalist bacteria) and low modularity (indicating weak compartmentalisation into distinct host–pathogen clusters) compared with null models. This suggests a non-random, core–periphery structure. Hosts with longer domestication histories hosted significantly more bacterial species, whereas phylogenetic distance from humans had no significant effect. These findings suggest that early-domesticated hosts may act as key nodes for pathogen accumulation and redistribution. The nested architecture also revealed likely “missing links” (unreported but plausible host–pathogen interactions), offering a practical basis for targeted surveillance. While network analysis is already established in zoonotic research, our global-scale findings highlight its continued potential to uncover structure and dynamics in increasingly complex host–pathogen systems.
María del Rosario Belén Pacheco & Mariano Devoto
Dataset DOI: 10.5061/dryad.ghx3ffbzd
Description of the data and file structure
This dataset was assembled to analyse the global bipartite network formed by zoonotic bacteria and their domestic mammal hosts. It compiles documented interactions between 24 domestic mammal species and 51 zoonotic bacterial species, drawn from published reviews and literature spanning 2002–2022, as well as classic compendia. The aim was to describe the structure of the host–bacteria network (nestedness, modularity, connectance) and to test whether domestication history and phylogenetic proximity to humans predict bacterial richness per host.
Files and variables
The dataset includes four files in addition to this README.md:
1. zoo_bac_dom2.csv
Contains the host–bacteria interaction data in long (edge-list) format. Each row corresponds to a recorded interaction between one domestic mammal species and one zoonotic bacterial species. Columns are:
- bacterium_name: Scientific name of the zoonotic bacterium (character string; e.g., Brucella abortus).
- host_name: Scientific name of the domestic mammal species (character string; matches the names in domestic_year.csv; e.g., Bos taurus).
- bacterium_num: Numeric identifier assigned to each bacterial species (integer index).
- host_num: Numeric identifier assigned to each mammal species (integer index).
Presence of a row indicates a documented host–bacterium association (equivalent to a “1” in a binary matrix). Absence of a pair indicates no reported interaction (equivalent to “0”).
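The conversion from edge list to binary matrix described above can be sketched in base R. This is a minimal illustration, not the code in OIK11077_R2.R: the helper name edges_to_matrix and the toy host and bacterium names are hypothetical, and only the column names bacterium_name and host_name come from the file description.

```r
# Toy edge list using the column names of zoo_bac_dom2.csv.
# The names below are placeholders, not records from the dataset.
# For the real data (assumed usage): edges <- read.csv("zoo_bac_dom2.csv")
edges <- data.frame(
  bacterium_name = c("Bacterium x", "Bacterium x", "Bacterium y", "Bacterium x"),
  host_name      = c("Host A", "Host B", "Host A", "Host A"),  # last row is a duplicate
  stringsAsFactors = FALSE
)

# Hypothetical helper: hosts in rows, bacteria in columns, 1 = documented link.
edges_to_matrix <- function(edges) {
  counts <- table(edges$host_name, edges$bacterium_name)
  # Collapse repeated rows to presence/absence (1/0).
  (unclass(counts) > 0) * 1L
}

web <- edges_to_matrix(edges)

# Connectance: realised links divided by all possible host-bacterium pairs.
connectance <- sum(web) / (nrow(web) * ncol(web))

print(web)
print(connectance)
```

A matrix of this shape (rows and columns named by species) is the usual input format for the network functions in the bipartite and vegan packages.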
2. domestic_year.csv
Provides metadata on each domestic mammal species included in the network. Columns are:
- Order: Taxonomic order of the mammal species (categorical).
- host_name: Scientific name of the domestic mammal species (character string; matches the rows in zoo_bac_dom2.csv).
- DOMYear: Estimated year of first domestication, expressed as years before present (BP) (integer; e.g., 4514 = domesticated ~4514 years ago). Missing values are indicated by NA.
- domestic_category: Broad functional category of domestication (categorical; e.g., “Livestock”, “Small Livestock”, “Utility”, “Pet”).
- publ_number: Number of publications that reported bacterial associations for this species (integer count; unit = publications).
- global_livestock: Estimated global population size for the species (numeric; unit = number of individuals). Missing values are indicated by NA.
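Because host_name is shared by both CSV files, bacterial richness per host can be joined to the host metadata. The sketch below uses base R and toy values; the column n_bacteria and the placeholder hosts are assumptions made for illustration, and the phylogenetic GLMMs fitted in OIK11077_R2.R are not reproduced here.

```r
# Toy stand-ins for the two CSV files; values are illustrative only.
# For the real data (assumed usage):
#   edges <- read.csv("zoo_bac_dom2.csv")
#   hosts <- read.csv("domestic_year.csv")
edges <- data.frame(
  bacterium_name = c("Bacterium x", "Bacterium y", "Bacterium x"),
  host_name      = c("Host A", "Host A", "Host B"),
  stringsAsFactors = FALSE
)
hosts <- data.frame(
  host_name = c("Host A", "Host B", "Host C"),
  DOMYear   = c(10000, 4514, NA),  # years before present; NA = unknown
  stringsAsFactors = FALSE
)

# Count distinct bacteria per host (duplicates removed first).
pairs <- unique(edges[, c("host_name", "bacterium_name")])
richness <- aggregate(bacterium_name ~ host_name, data = pairs, FUN = length)
names(richness)[2] <- "n_bacteria"  # hypothetical column name

# Left join onto the metadata; hosts without any recorded link get 0.
host_table <- merge(hosts, richness, by = "host_name", all.x = TRUE)
host_table$n_bacteria[is.na(host_table$n_bacteria)] <- 0L

# Rows with a known domestication time, e.g. for modelling richness ~ DOMYear.
complete <- host_table[!is.na(host_table$DOMYear), ]
print(host_table)
```

Dropping rows with NA in DOMYear before modelling is one possible choice; the script may handle missing values differently.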
3. cytb_supertree.tree
A phylogenetic tree in Newick format, representing relationships among the 24 domestic mammal species plus humans. The topology is derived from mitochondrial cytochrome b data. No tabular variables apply beyond the tree structure itself.
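As a rough sketch of how a Newick tree can be turned into distances from each host to humans, the example below uses read.tree and cophenetic.phylo from the ape package on a toy tree. The tip label "Homo_sapiens", the placeholder host labels, and the helper distance_to_human are assumptions; check tree$tip.label in cytb_supertree.tree for the labels actually used.

```r
library(ape)

# Toy Newick string; branch lengths and tip labels are illustrative only.
# For the real tree (assumed usage): tree <- read.tree("cytb_supertree.tree")
tree <- read.tree(
  text = "((Host_A:2,Host_B:2):3,(Homo_sapiens:4,Host_C:4):1);"
)

# Hypothetical helper: patristic (path-length) distance from each tip to humans.
distance_to_human <- function(tree, human_label = "Homo_sapiens") {
  stopifnot(human_label %in% tree$tip.label)
  d <- cophenetic.phylo(tree)          # pairwise tip-to-tip distances
  others <- setdiff(rownames(d), human_label)
  sort(d[others, human_label])         # closest relatives first
}

dist_h <- distance_to_human(tree)
print(dist_h)
```

The resulting named vector can be matched to host_name once tip labels and species names are put in the same format (for example, underscores versus spaces).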
4. OIK11077_R2.R
An R script containing the full analysis workflow. It reads the three data files, builds and analyses the host–bacteria bipartite network, calculates network- and node-level metrics, conducts null model comparisons, computes phylogenetic distances and Mantel tests, fits phylogenetic GLMMs, and generates figures.
Code/software
All analyses were performed in R (version 4.0 or later). Required open-source packages include igraph, bipartite, ape, vegan, and phyr. The workflow is entirely contained in OIK11077_R2.R, which imports the data files and runs the analyses sequentially. Users can reproduce the figures and statistical models by running this script with the datasets in the same working directory.
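Before running the script, it may help to confirm that the required packages are installed and the data files are in the working directory. The following is a minimal pre-flight sketch in base R; check_setup is a hypothetical helper and is not part of OIK11077_R2.R.

```r
# Packages and data files named in this README.
required_packages <- c("igraph", "bipartite", "ape", "vegan", "phyr")
required_files <- c("zoo_bac_dom2.csv", "domestic_year.csv", "cytb_supertree.tree")

# Hypothetical helper: report which packages and files are missing.
check_setup <- function(packages, files, dir = getwd()) {
  installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  present <- file.exists(file.path(dir, files))
  list(
    missing_packages = packages[!installed],
    missing_files    = files[!present]
  )
}

status <- check_setup(required_packages, required_files)
print(status)

# If both entries are empty, the workflow can be run with:
# source("OIK11077_R2.R")
```

Missing packages can be installed from CRAN with install.packages() before sourcing the script.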
