Faecal pathogens and ectoparasites associated with small mammals in forest fringes around Sydney, Australia
Data files
Dec 12, 2025 version files 8.64 MB
- 20241123_sp80_40r_hellinger.csv (204.56 KB)
- 20250226_biom_16s.csv (7.62 MB)
- 20250304_asv_mat_16s.csv (171.69 KB)
- 20250304_asv_mat_30r_hellinger.csv (254.62 KB)
- 20250304_asv_taxmat_16s.csv (144.80 KB)
- 20250304_asv_taxmat_30r_hellinger.csv (163.48 KB)
- 20250304_sample_depth_16s.csv (1.68 KB)
- 20250305_pathogens_16s_sm.csv (6.83 KB)
- 20250306_human_pathogen_fungi_sm.csv (13.74 KB)
- 20250409_ectoparasite_counts.csv (11.23 KB)
- 20250804_small_mammal_pathogens_DRYAD.R (38.31 KB)
- README.md (8.65 KB)
Abstract
This dataset contains curated and Hellinger-transformed sequencing data obtained from DNA extracted from small mammal scats collected in Ku-ring-gai Chase National Park and surrounding urban reserves. We provide the CSV files necessary to reproduce the published results of our study, in which we aimed to analyse the influence of host species identity and traits (i.e. sex, body mass index; BMI) and seasonality on the presence of faecal pathogenic fungi and bacteria, as well as the ectoparasites associated with small mammals inhabiting forest reserves near urban areas in the Sydney region, New South Wales (NSW), Australia. Across samples, we identified 12 pathogenic fungi, nine bacterial pathogens, and 15 ectoparasite taxa. The most abundant representatives of each group were Malassezia japonica (fungi), Escherichia coli (bacteria), and Siphonaptera (fleas). Host traits influenced pathogen and ectoparasite occurrence in distinct ways. Host sex affected flea prevalence, with males more frequently infested than females. Host body mass index had no detectable effect on pathogen or ectoparasite presence. Host species was a strong predictor, with Rattus fuscipes being more likely to carry fleas and mites, whereas Antechinus stuartii had a higher likelihood of harbouring fungal and bacterial pathogens in its scats. Seasonality also shaped pathogen and ectoparasite dynamics. Pathogenic fungi, bacteria, and ticks were more common in autumn (wet season), whereas flea prevalence was highest in spring. Overall, our findings underscore the importance of broad-scale assessments of pathogen communities in wildlife species that live near humans, as such work is critical for identifying potential vectors and emerging zoonoses.
Dataset DOI: 10.5061/dryad.8sf7m0d2g
Description of the data and file structure
The provided data were derived from sequencing of DNA extracted from the scats of live-trapped small mammals, using the fungal primers ITS1F-ITS2R and the 16S V3-V4 primers 341F and 806R. The DADA2 pipeline was used to curate these data; taxonomic assignment of the 16S sequences used the SILVA taxonomic training data version 138.2 as the reference database. The data on ectoparasite counts were obtained by direct inspection of small mammals and morphological identification.
Files and variables
File: 20241123_sp80_40r_hellinger.csv
Description: Taxonomic identification of fungi with a bootstrap confidence above 80 % found in small mammal scat samples, and those fungal species with more than 50 reads. The sequencing depth was rarefied to 40,000 reads. The resulting read counts were transformed using the Hellinger transformation.
Variables
- sp_ID: Consecutive ID of the fungal species
- Kingdom to Species: Taxonomic classification of each identified fungus
- Taxonomy: Fungal species
- boot.Genus.80: Bootstrap confidence value above 80 % for genus (yes/no)
- boot.Species.80: Bootstrap confidence value above 80 % for species (yes/no)
- reads.50: Whether the taxon had more than 50 reads (yes/no)
- K.1.1 to U.5.E2: Small mammal scat samples
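As a reading aid, the Hellinger transformation mentioned above takes the square root of each taxon's share of the reads in a sample. The following is a minimal sketch in Python, not the authors' R code; the function name `hellinger` and the example counts are hypothetical, and it assumes one list of counts per sample.

```python
import math

def hellinger(counts_by_sample):
    """Hellinger-transform read counts: sqrt(count / total reads in the sample)."""
    transformed = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts)
        # A sample with no reads is returned as zeros to avoid dividing by zero.
        transformed[sample] = [
            math.sqrt(c / total) if total else 0.0 for c in counts
        ]
    return transformed

# Hypothetical rarefied counts for three taxa in two samples (not real data).
example = {"K.1.1": [25, 75, 0], "U.5.E2": [50, 0, 50]}
result = hellinger(example)
print(result["K.1.1"])
```

A useful check on any Hellinger-transformed sample is that its squared values sum to 1.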
File: 20250306_human_pathogen_fungi_sm.csv
Description: Human pathogenic fungi identified with a bootstrap confidence above 80 % found in small mammal scat samples, and those fungal species with more than 50 reads. The read counts were transformed using the Hellinger transformation. Data on the small mammal individuals are included.
Variables
- ID: Consecutive ID
- sample_ID: Sample ID
- site: Site ID (Nine different sites)
- date: Date of sample collection
- season: Season of sample collection
- vector.genus: Small mammal individual genus
- vector: Small mammal individual species
- site.type: Site type (forest/urban)
- sex: Small mammal sex
- weight: Small mammal weight
- body_length: Small mammal body length
- bmi: Small mammal Body Mass Index
- pathogen_pa: Presence/absence of fungal pathogens
- Exophiala.xenobiotica to Malassezia.restricta: Fungal pathogens identified in small mammal scat samples. The cell values are the Hellinger-transformed reads of each fungal species per sample.
File: 20250226_biom_16s.csv
Description: Taxonomic identification of all the bacteria identified in small mammal scat samples.
Variables
- sequence: DNA sequence of the identified ASV.
- tax.Kingdom to tax.Species: Taxonomic classification of the identified ASV.
- taxonomy: ASV species
- taxonomy_level_80: Taxonomic level of the identification with a confidence of > 80 %.
- boot_80_taxonomy: Bootstrap confidence value above 80 % for the specified taxonomic level (yes/no)
- boot.Kingdom to boot.Species: Bootstrap confidence value for each taxonomic level
- reads: Number of reads of the identified species across all the small mammal scat samples.
- K-1-12-16S to U-5-1-16S: Sample IDs. The cell values are the Hellinger-transformed reads of each identified taxon per sample.
File: 20250304_sample_depth_16s.csv
Description: Small mammal scat sample general data
Variables
- sample: Sample ID
- depth: Sequencing depth of the sample
- site_type: Site type where the sample was collected (forest/urban)
- site: Site identifier
File: 20250304_asv_taxmat_16s.csv
Description: Taxonomic identity matrix for the bacteria identified in small mammal scat samples
Variables
- ASV_ID: ASV identifier
- Kingdom to Genus: Taxonomic classification of the bacteria
- Taxonomy: Taxa of bacteria identified in the sample with a bootstrap confidence level > 80 %
- taxonomy_level_80: Taxonomic level at which the identification reached a bootstrap confidence > 80 %
- boot_80_taxonomy: Confirmation that the bootstrap confidence at that level is > 80 % (yes)
File: 20250304_asv_mat_16s.csv
Description: ASV matrix with reads data for small mammal scat samples
Variables
- ASV_ID: ASV identifier
- K.1.12.16S to U.5.1.16S: Sample identifier
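Because the ASV matrix and the taxonomy matrix share the ASV_ID column, the two can be joined on it. The sketch below is one possible way to do this in Python with the standard library; the inline CSV text, the ASV identifiers, and the read counts are hypothetical stand-ins for the real files, and only a subset of columns is shown.

```python
import csv
import io

# Hypothetical excerpts shaped like 20250304_asv_mat_16s.csv and
# 20250304_asv_taxmat_16s.csv (identifiers and values are made up).
asv_mat_csv = """ASV_ID,K.1.12.16S,U.5.1.16S
ASV_1,120,0
ASV_2,30,45
"""
taxmat_csv = """ASV_ID,Genus,Taxonomy
ASV_1,Escherichia,Escherichia
ASV_2,Bacteroides,Bacteroides
"""

def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

# Look-up table from ASV identifier to its taxonomic label.
taxonomy = {row["ASV_ID"]: row["Taxonomy"] for row in read_rows(taxmat_csv)}

merged = []
for row in read_rows(asv_mat_csv):
    # Every column other than ASV_ID is treated as a sample column.
    counts = {k: int(v) for k, v in row.items() if k != "ASV_ID"}
    merged.append({
        "ASV_ID": row["ASV_ID"],
        "Taxonomy": taxonomy.get(row["ASV_ID"], "unassigned"),
        "total_reads": sum(counts.values()),
    })

for record in merged:
    print(record)
```

To use the real files, the inline strings could be replaced with the contents of the two CSV files opened from the working directory.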
File: 20250304_asv_mat_30r_hellinger.csv
Description: ASV matrix with reads data for small mammal scat samples with rarefaction to 30,000 reads, and Hellinger transformation applied to the reads.
Variables
- X: ASV identifier
- K.1.12.16S to U.5.1.16S: Sample identifier. Cell values are the Hellinger-transformed reads per ASV.
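Rarefaction to a fixed depth can be illustrated as random subsampling of reads without replacement. This is a minimal sketch under that assumption, not the authors' implementation (their R script should be consulted for the actual procedure); the function name `rarefy`, the seed, and the example counts are hypothetical, and a small depth is used in place of 30,000 to keep the example short.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a vector of ASV counts.

    Returns None when the sample has fewer reads than the target depth.
    """
    if sum(counts) < depth:
        return None
    rng = random.Random(seed)
    # Expand the counts into one ASV index per read, then draw `depth` reads.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]
    kept = rng.sample(reads, depth)
    rarefied = [0] * len(counts)
    for i in kept:
        rarefied[i] += 1
    return rarefied

# Hypothetical counts for three ASVs in one sample (not real data).
original = [600, 300, 100]
rarefied = rarefy(original, depth=100)
print(rarefied, sum(rarefied))
```

After rarefaction every retained sample has the same total number of reads, which makes the subsequent Hellinger transformation comparable across samples.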
File: 20250304_asv_taxmat_30r_hellinger.csv
Description: Taxonomy matrix of the ASVs found in small mammal scat samples with rarefaction to 30,000 reads, and Hellinger transformation applied to the reads.
Variables
- X: ASV identifier
- Kingdom to Genus: Taxonomic identification of the bacteria
- Taxonomy: Taxa of bacteria identified in the sample with a bootstrap confidence level > 80 %
- taxonomy_level_80: Taxonomic level at which the identification reached a bootstrap confidence > 80 %
- boot_80_taxonomy: Confirmation that the bootstrap confidence at that level is > 80 % (yes)
File: 20250305_pathogens_16s_sm.csv
Description: Bacterial human pathogens identified with a bootstrap confidence above 80 % found in small mammal scat samples, and those bacterial taxa with more than 50 reads. The read counts were transformed using the Hellinger transformation. Data on the small mammal individuals are included.
Variables
- site_type: site type where the samples were collected (urban/forest)
- site: site identifier
- sample_ID: sample ID
- date: Date of sample collection
- season: Season of sample collection
- sex: Small mammal sex
- vector.genus: small mammal genus
- vector: small mammal species
- weight: small mammal weight
- body_length: small mammal body length
- bmi: body mass index of small mammal individual
- pathogen_pa: presence/absence of bacterial pathogens in the sample
- Escherichia to Mycobacterium.fortuitum: Pathogenic bacterial taxa identified in the samples. The cell values are the Hellinger-transformed reads of each bacterial taxon per sample.
File: 20250409_ectoparasite_counts.csv
Description: Data on the counts of ectoparasites collected from small mammals.
Variables
- site.type: site type where the ectoparasites were collected
- site: site identifier
- ID: sample identifier
- date: date of sample collection
- season: season of collection
- trap.ID: ID of the trap where the inspected small mammal was captured
- weight: small mammal host weight
- body.length: small mammal host body length
- bmi: small mammal host body mass index
- sex: small mammal host sex
- vector.genus: small mammal host genus
- vector: small mammal host species
- ectoparasite_load: total ectoparasite load of small mammal host, i.e., total number of ectoparasites collected from the inspected individual.
- ixodida_load: number of ticks collected from the host
- mesostigmata_load: number of mites collected from the host
- siphonaptera_load: number of fleas collected from the host
- ectoparasite_pa: presence/absence of ectoparasites on the small mammal inspected
- ixodida_pa: presence/absence of ticks on the small mammal inspected
- mesostigmata_pa: presence/absence of mites on the small mammal inspected
- siphonaptera_pa: presence/absence of fleas on the small mammal inspected
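The presence/absence columns can be read as indicators derived from the corresponding load columns. The sketch below shows that relationship for a single hypothetical host; it assumes presence is coded as 1 and absence as 0, which should be checked against the actual file, and the load values are made up.

```python
def presence_absence(load):
    """Return 1 if at least one ectoparasite was collected, otherwise 0."""
    return 1 if load > 0 else 0

# Hypothetical loads for one inspected host (not real data).
host = {"ixodida_load": 2, "mesostigmata_load": 0, "siphonaptera_load": 5}

# Derive a *_pa indicator from each *_load column.
pa = {
    name.replace("_load", "_pa"): presence_absence(count)
    for name, count in host.items()
}
print(pa)
```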
File: 20250804_small_mammal_pathogens_DRYAD.R
Description: Code used to analyse the collected data on bacterial and fungal pathogens identified in small mammal scats, and on small mammal ectoparasites found in urban reserves and forest in Sydney, NSW, Australia.
Code/software
Small mammal pathogen prevalence and ectoparasite load
By Margarita Gil-Fernández
The code used to run the analyses can be found in the file 20250804_small_mammal_pathogens_DRYAD.R
This code was created and run in RStudio with R version 4.4.1. We gathered all the files needed to run the code in a single folder, which was used as the working directory.
Access information
Other publicly accessible locations of the data:
- The fungal sequencing data were also used to analyse the diversity of ectomycorrhizal fungi in small mammal scats.
Data was derived from the following sources:
- Sequencing data and complementary processing information for the fungal sequences can be found in: DOI:10.5061/dryad.jwstqjqns
We live-trapped small mammals at five sites in Ku-ring-gai Chase National Park (KNP), each paired with one surrounding urban reserve. The urban reserves were located at least 100 m outside the KNP boundary and separated from one another by at least 1.5 km. At each site, we installed 50 Elliott traps in a sampling grid with 10 m spacing. At the forest sites, the sampling grids were rectangles of 5 × 10 traps. At the urban reserves, this grid arrangement was not always feasible due to the irregular shape of some sites. We sampled each pair of sites simultaneously. Each trap was baited with a mixture of peanut butter, vanilla essence, and oats. We sampled the sites during both spring (September 2023) and autumn (April 2024). Upon capture, any small mammals showing signs of stress (16% of captures) were immediately released. The remaining individuals were taxonomically identified.
After processing the captured individuals, we collected any fresh scats present inside the Elliott traps in 1.5 mL Eppendorf tubes. We did not collect scats from recaptured individuals. The scat samples were placed in a cooler box with ice packs immediately after collection and transferred to a -80 °C freezer on return to the laboratory. We extracted DNA from the scat samples after each sampling season using the DNeasy PowerSoil Pro kit from Qiagen®. We processed 108 scat samples, which were sequenced with the ITS1F and ITS2R primers, and used the DADA2 pipeline to process the sequencing data and identify the fungal species using the UNITE database for fungi version 10.0. These data were then assigned to functional guilds using the package FUNGuildR.
Sixty of these samples were also sequenced using the 16S V3-V4 primers 341F and 806R. The DADA2 pipeline was used to curate these data, and taxonomic assignment was done using the reference database SILVA taxonomic training data version 138.2. The identified bacterial taxa were classified into ecological functional groups using the FAPROTAX and NJC19 databases in the microeco R package, and only human pathogenic bacteria were retained for further analyses.
For the ectoparasites, each captured small mammal was inspected for two minutes. Upon collection, the ectoparasites from each mammal were submerged in 100% ethanol in an Eppendorf tube. Individual ectoparasites were examined under a stereomicroscope for identification.
Data analyses. We calculated the overall prevalence of fungal and bacterial pathogens, as well as ectoparasites, for each host species and season. We then developed binomial generalised linear models to predict the probability of presence for each pathogen group and for each ectoparasite order (i.e., Ixodida, Mesostigmata and Siphonaptera). We generated a global model in MuMIn to assess the influence of host BMI and species, as well as season, on the presence/absence of each of the above-mentioned pathogen and ectoparasite groups.
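The prevalence calculation described above amounts to the proportion of positive individuals within each host species and season. The following Python sketch illustrates that arithmetic only; it is not the authors' R code, the records are hypothetical, and it assumes the presence/absence column is coded 1/0. The column names `vector`, `season`, and `siphonaptera_pa` follow the variable lists above.

```python
from collections import defaultdict

def prevalence(records, pa_column, group_columns=("vector", "season")):
    """Proportion of records with presence (1) in each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        key = tuple(record[c] for c in group_columns)
        totals[key] += 1
        positives[key] += int(record[pa_column])
    return {key: positives[key] / totals[key] for key in totals}

# Hypothetical inspection records (not real data).
records = [
    {"vector": "Rattus fuscipes", "season": "spring", "siphonaptera_pa": 1},
    {"vector": "Rattus fuscipes", "season": "spring", "siphonaptera_pa": 1},
    {"vector": "Rattus fuscipes", "season": "spring", "siphonaptera_pa": 0},
    {"vector": "Antechinus stuartii", "season": "autumn", "siphonaptera_pa": 0},
    {"vector": "Antechinus stuartii", "season": "autumn", "siphonaptera_pa": 1},
]

flea_prevalence = prevalence(records, "siphonaptera_pa")
for group, value in flea_prevalence.items():
    print(group, round(value, 2))
```

The same function could be applied to the `pathogen_pa` column of the fungal and bacterial pathogen files, since those files also carry `vector` and `season` columns.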
