Parasite abundance-occupancy relationships across biogeographic regions: Joint effects of niche breadth, host availability, and climate
Data files
Dec 10, 2024 version files 139.60 KB
File | Size |
---|---|
D.lin_pt.sel_final_metadata.csv | 2.16 KB |
D.lin_pt.sel_final.csv | 118.34 KB |
README.md | 19.11 KB |
Abstract
Aim
Changing biodiversity and environmental conditions may allow multi-host pathogens to spread amongst host species and affect prevalence. There are several widely acknowledged theories about mechanisms that may influence variation in pathogen prevalence, including the controversially debated dilution effect and abundance-occupancy relationship hypotheses. Here, we explore such abundance-occupancy relationships for unique lineages of three vector-borne avian blood parasite genera (the avian malaria parasite Plasmodium and the related haemosporidian parasites Parahaemoproteus and Leucocytozoon) across biogeographical regions.
Location
Nearctic-Neotropical and Palearctic-Afrotropical regions
Methods
We compiled a cross-continental dataset of 17,116 bird individuals surveyed from 46 bird assemblages across the Nearctic-Neotropical and Palearctic-Afrotropical regions and explored relationships between local parasite lineage prevalence and host assemblage metrics in a Bayesian random regression framework.
Results
Most lineages from these three genera infected ≥5 host species and exhibited clear phylogenetic or functional host specificity. Lineage prevalence from all three genera increased with host range, but also with higher degrees of specialisation to phylogenetically or functionally related host species. Local avian community features were also found to be important drivers of prevalence. For example, bird species richness was positively correlated with lineage prevalence for Plasmodium and Leucocytozoon, whereas higher relative abundances of the main host species were associated with lower prevalence for Plasmodium and Parahaemoproteus but higher prevalence for Leucocytozoon.
Conclusions
Our results broadly support several of the leading hypotheses about mechanisms that influence pathogen prevalence. These include the niche breadth hypothesis, in that higher avian host species diversity and broader host range amplify prevalence through increasing ecological opportunities, and the trade-off hypothesis, in that specialisation among subsets of available host species may increase prevalence. Furthermore, the three studied avian haemosporidian genera exhibited different abundance-occupancy relationships across the major global climate gradients and in relation to host availability, emphasising that these relationships do not strictly follow common rules for vector-borne parasites with different life histories.
README: Parasite abundance-occupancy relationships across biogeographic regions: joint effects of niche breadth, host availability, and climate
https://doi.org/10.5061/dryad.gmsbcc2xg
Description of the data and file structure
Variable | Description |
---|---|
Site.ID | ID of the site where birds were captured |
Lineage | Name of the haemosporidian lineage (corresponding to names used in the MalAvi database) |
Genus | Genus of haemosporidian lineage |
Seq.CytB | DNA sequence of the lineage/sample based on a cytochrome b (cyt-b) sequence |
seq.AminoA | Amino-acid sequence matched to DNA sequence using the R package Biostrings |
seq.AminoA.ID | Amino-acid sequence ID |
nsamp_all | Number of all sampled bird individuals from the local assemblage |
nsamp_host | Number of sampled bird individuals recorded to be a host |
nsamp_infect | Number of sampled bird individuals infected |
nspec_all | Number of all sampled bird species |
nspec_host | Number of sampled bird species recorded as host species (either in local assemblage or elsewhere) |
nspec_infect | Number of sampled bird species infected |
HostRange_Chao_mean | Mean of the host range estimate based on Chao1 species richness estimator |
HostRange_Chao_sd | One SD of the host range estimate based on Chao1 species richness estimator |
Lineage_nobs | Number of haemosporidian lineages observed |
B.ind.phyl | Mean of the estimate of phylogenetic host specificity (B.phyl) |
B.ind.phyl.sd | One SD of the estimate of phylogenetic host specificity (B.phyl) |
B.ind.func | Mean of the estimate of functional host specificity (B.func) |
B.ind.func.sd | One SD of the estimate of functional host specificity (B.func) |
nsamp_host.main | Number of all sampled bird individuals from the main host species (i.e. the single host species for which the largest number of infected birds were recorded within a local assemblage) |
nsamp_infect.main | Number of infected sampled bird individuals from the main host species |
Longitude | Longitude value of the geographic coordinate of the study site (WGS84) |
Latitude | Latitude value of the geographic coordinate of the study site (WGS84) |
Country | Country where the study site is located |
realm | Zoogeographical region based on Holt et al. 2013 (doi: 10.1126/science.1228282) |
Bird.specrich | Local bird species richness of terrestrial birds based on a map that summarizes species distributions from BirdLife International range maps (https://biodiversitymapping.org/) |
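As an illustration of how the count columns above relate to each other, the following minimal Python sketch computes a naive lineage prevalence per record. It assumes that the CSV is comma-delimited with the column names listed in the table, and that prevalence is taken as `nsamp_infect / nsamp_all`; this definition, the function name `lineage_prevalence`, and the demo records are illustrative assumptions, not necessarily the estimator used in the study.

```python
import csv
import io


def lineage_prevalence(rows):
    """Return {(Site.ID, Lineage): nsamp_infect / nsamp_all} per record.

    ASSUMPTION: naive prevalence = infected individuals / all sampled
    individuals; the study itself models prevalence in a Bayesian framework.
    """
    out = {}
    for row in rows:
        n_all = int(row["nsamp_all"])
        n_infected = int(row["nsamp_infect"])
        if n_all > 0:  # skip records without sampled individuals
            out[(row["Site.ID"], row["Lineage"])] = n_infected / n_all
    return out


# Made-up records (hypothetical site and lineage names), shown only to
# illustrate the expected column layout.
demo_csv = """Site.ID,Lineage,nsamp_all,nsamp_infect
S1,LIN_A,200,20
S2,LIN_B,80,12
"""
demo_prev = lineage_prevalence(csv.DictReader(io.StringIO(demo_csv)))
```

The same function could be applied to `D.lin_pt.sel_final.csv` by passing a `csv.DictReader` opened on that file, provided the delimiter and headers match the assumptions above.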
Access information
Table of data sources and countries of origin of the data compiled for this study. For each dataset, the geographical coordinates, the country, the total number of bird individuals and species, and links to published work are listed.
Longitude | Latitude | Country | No. of bird individuals | No. of bird species | Data source/owner |
---|---|---|---|---|---|
-148.23181 | 63.32385 | USA | 48 | 13 | https://doi.org/10.1111/1365-2656.13089 |
-149.29083 | 65.28899 | USA | 95 | 20 | https://doi.org/10.1111/1365-2656.13089 |
-37.26666667 | -6.566666667 | Brazil | 176 | 38 | https://doi.org/10.1007/s00248-023-02283-x |
-38.682305 | -9.73225 | Brazil | 97 | 36 | https://doi.org/10.1017/S0031182022001317 |
-41 | -16.43333333 | Brazil | 164 | 48 | https://doi.org/10.1016/j.ijpara.2021.01.001 |
-42.13333333 | -19.78333333 | Brazil | 126 | 33 | https://doi.org/10.1016/j.ijpara.2021.01.001 |
-45.73333333 | -15.16666667 | Brazil | 127 | 26 | https://doi.org/10.1111/mec.15094 |
-46.73280556 | -23.99369444 | Brazil | 403 | 68 | https://doi.org/10.1017/S0031182022001317 |
-47.613628 | -15.546044 | Brazil | 790 | 53 | https://doi.org/10.1111/mec.15094 |
-57.49573611 | -30.20556389 | Brazil | 244 | 63 | https://doi.org/10.1111/1365-2656.13117 |
-59.68274444 | -13.82132778 | Brazil | 136 | 42 | https://doi.org/10.1111/mec.15094 |
-6.97 | 31.72 | Morocco | 75 | 17 | https://doi.org/10.1111/mec.15545 |
-71.35 | -42.91666667 | Argentina | 688 | 25 | https://doi.org/10.1017/S0031182022001317 |
-72.16 | -13.19933333 | Peru | 30 | 15 | https://doi.org/10.1111/ele.13263 |
-75.486 | 39.984 | USA | 1451 | 62 | https://doi.org/10.1007/s00248-023-02283-x |
-77.4167 | -6.7167 | Peru | 419 | 115 | https://doi.org/10.1111/ele.13263 |
-77.55 | -6.5833 | Peru | 433 | 113 | https://doi.org/10.1111/ele.13263 |
-77.6833 | -6.6833 | Peru | 169 | 56 | https://doi.org/10.1111/ele.13263 |
-77.73333333 | -2.116666667 | Ecuador | 343 | 88 | https://doi.org/10.1016/j.ijpara.2015.08.001 |
-78.64518 | -8.38685 | Peru | 85 | 20 | https://doi.org/10.1111/ele.13263 |
-78.77826667 | -7.398033333 | Peru | 135 | 37 | https://doi.org/10.1111/ele.13263 |
-79.2333 | -5.1 | Peru | 161 | 62 | https://doi.org/10.1111/ele.13263 |
-83.69 | 35.95 | USA | 127 | 26 | https://doi.org/10.1073/pnas.1515309112 |
-84.12 | 35.61 | USA | 156 | 32 | https://doi.org/10.1073/pnas.1515309112 |
-85.34944 | 42.32667 | USA | 381 | 46 | https://doi.org/10.1073/pnas.1515309112 |
-86.75175926 | 39.06638889 | USA | 500 | 34 | https://doi.org/10.1073/pnas.1515309112 |
-87.760704 | 41.746465 | USA | 2023 | 52 | https://doi.org/10.1073/pnas.1515309112 |
-89.22 | 35.12 | USA | 625 | 44 | https://doi.org/10.1073/pnas.1515309112 |
-89.71083 | 30.4025 | USA | 157 | 10 | https://doi.org/10.1073/pnas.1515309112 |
-91.0374 | 37.12555 | USA | 1438 | 49 | https://doi.org/10.1073/pnas.1515309112 |
-96.45972222 | 47.95611111 | USA | 150 | 13 | https://doi.org/10.1007/s00248-023-02283-x |
-99.25152 | 19.24604 | Mexico | 83 | 27 | https://doi.org/10.1645/18-130 |
13.433 | 55.6833 | Sweden | 2905 | 64 | https://doi.org/10.1111/oik.07280 |
20.9285 | -0.1703 | Democratic Republic Congo | 118 | 30 | Shannon Hackett & Heather Skeen, |
30.3806 | 0.3587 | Uganda | 136 | 44 | Shannon Hackett & Heather Skeen, |
30.4289 | 0.5061 | Uganda | 124 | 31 | Shannon Hackett & Heather Skeen, |
33.45 | -10.86666667 | Malawi | 206 | 90 | https://doi.org/10.1371/journal.pone.0121254 |
33.8 | -10.58333333 | Malawi | 95 | 30 | https://doi.org/10.1371/journal.pone.0121254 |
33.96666667 | -10.73333333 | Malawi | 93 | 26 | https://doi.org/10.1371/journal.pone.0121254 |
34.0506 | -18.4667 | Mozambique | 192 | 43 | Shannon Hackett & Heather Skeen, |
36.8778 | 0.0061 | Kenya | 61 | 14 | Wanyoike Wamiti, National Museums of Kenya, Nairobi & Jason Weckstein, Drexel University, Philadelphia, unpublished data, 2003 |
46.2315433 | 40.05055 | Karabakh | 37 | 6 | https://doi.org/10.1111/mec.15545 |
46.24 | 38.9 | Armenia | 72 | 18 | https://doi.org/10.1111/mec.15545 |
8.9772 | 9.8756 | Nigeria | 458 | 59 | https://doi.org/10.1111/j.1365-294X.2007.03227.x |
-49.21348 | -1.11058 | Brazil | 288 | 98 | https://doi.org/10.1016/j.ympev.2023.107828 |
-64.72866 | -7.41419 | Brazil | 296 | 97 | https://doi.org/10.1016/j.ympev.2023.107828 |
Methods
To examine the prevalence-host specificity relationships of unique haemosporidian lineages in bird assemblages, we systematically compiled field and molecular screening data from studies that reported individual-level haemosporidian infection status and parasite cytochrome-b (cyt-b) sequences in bird assemblages with reasonably large sample sizes (≥ 30 individuals and more than five host species surveyed at a single location), based on a dataset updated from our previous study (Fecchio et al., 2021). The assembled data set included 36,896 individual birds with recorded presence-absence of infection with the haemosporidian genera Plasmodium, Parahaemoproteus, and Leucocytozoon, with all infections confirmed by molecular sequencing of a 477 bp nucleotide cyt-b fragment (Bensch, Perez-Tris, Waldenström, & Hellgren, 2004). Sequence identities were verified with a local BLAST against the MalAvi database (Bensch, Hellgren, & Perez-Tris, 2009). We then matched nucleotide sequences to their respective amino-acid sequences using the R package Biostrings (Pagès, Aboyoun, Gentleman, & DebRoy, 2024), reducing the 2,070 unique nucleotide lineages to 1,983 ‘functional lineages’ based on amino-acid sequences. We acknowledge that, because we only have data for a fragment of the cyt-b sequence, these lineage assignments may not represent the true functional lineage or species diversity; we nevertheless assume them to be the best possible representation given the data currently available, amid the recent onset of genomic sequencing of haemosporidians (Videvall, 2019).
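The collapse of nucleotide lineages into ‘functional lineages’ can be illustrated with a minimal Python sketch: sequences that differ only by synonymous substitutions translate to the same amino-acid string and are grouped together. The study used the R package Biostrings; the sketch below is not that workflow. The standard genetic code, a reading frame starting at the first base, the function names, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... ASSUMPTION: the genetic code and reading
# frame appropriate for the real cyt-b fragment may differ.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a nucleotide string codon by codon ('X' for unknown codons)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def functional_lineages(nucleotide_lineages):
    """Group lineage names by their translated amino-acid sequence."""
    groups = defaultdict(list)
    for name, seq in nucleotide_lineages.items():
        groups[translate(seq)].append(name)
    return dict(groups)


# Toy sequences (hypothetical): lin1 and lin2 differ by a synonymous change.
toy = {"lin1": "ATGGCT", "lin2": "ATGGCC", "lin3": "ATGGAT"}
groups = functional_lineages(toy)
```

In this toy case three nucleotide lineages collapse to two amino-acid groups, analogous to the reduction from 2,070 to 1,983 lineages described above.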
We obtained data on bird species' phylogenetic relationships from the open-source Birdtree.org phylogenetic supertree (Jetz, Thomas, Joy, Hartmann, & Mooers, 2012), using a consensus tree from a random selection of 100 possible tree topologies from the supertree's ‘Ericson All Species’ Bayesian posterior distribution (available at Birdtree.org/subsets/). Bird species names from field data were revised according to the taxonomy used in these trees. Phylogenetic distances among pairs of bird species were calculated as the mean pairwise distance across the selected trees.
For all bird species, we obtained data on their body mass and the proportional use of ten diet categories and seven foraging habitats from the EltonTraits v1.0 database (Wilman et al., 2014). We quantified pairwise functional distances using a Gower's distance matrix (Gower, 1971; Pavoine, Vallet, Dufour, Gachet, & Daniel, 2009). For analysis, phylogenetic and functional distance matrices were scaled (dividing by the maximum for each matrix).
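The functional distance step can be sketched in Python as follows. This is a simplified version of Gower's distance for purely numeric traits (the mean of range-scaled absolute differences), followed by division by the matrix maximum as described above. The study cites a generalised Gower approach (Pavoine et al., 2009) that treats proportional variables differently, so the functions, trait values, and species below are hypothetical illustrations only.

```python
def gower_distance(traits):
    """Pairwise Gower distance for numeric traits.

    traits: list of equal-length numeric lists, one per species.
    Each trait difference is divided by that trait's observed range,
    and the scaled differences are averaged across traits.
    """
    n, k = len(traits), len(traits[0])
    ranges = []
    for j in range(k):
        column = [row[j] for row in traits]
        # A constant trait has range 0; use 1.0 so it contributes zero distance.
        ranges.append((max(column) - min(column)) or 1.0)
    dist = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            d = sum(abs(traits[a][j] - traits[b][j]) / ranges[j] for j in range(k)) / k
            dist[a][b] = dist[b][a] = d
    return dist


def scale_by_max(dist):
    """Divide every entry by the matrix maximum (no-op if all entries are zero)."""
    m = max(max(row) for row in dist)
    return [[v / m for v in row] for row in dist] if m > 0 else dist


# Hypothetical traits for three species:
# [log body mass, proportion invertebrate diet, proportion ground foraging]
toy_traits = [[1.0, 0.0, 0.0], [2.0, 0.5, 1.0], [3.0, 1.0, 0.5]]
func_dist = scale_by_max(gower_distance(toy_traits))
```

Scaling both the phylogenetic and the functional matrix to a maximum of 1 puts the two specificity metrics on a comparable scale.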
To analyse the prevalence of parasite lineages within local assemblages in the context of host specificity, we focused on sufficiently well-sampled lineages that allowed reasonable estimates of prevalence and host specificity in a given local context. For this, we filtered the cleaned data and selected only those data for lineages that have been reported infecting ≥ 8 host individuals in local assemblages with a total of ≥ 10 host individuals and ≥ 5 host species sampled (a sample size of ten translates to an ~ 80% probability of detecting a parasite with a true prevalence of 15%, and we assume a record of eight infections within a community of ≥ 5 host species to provide reasonable insights into the variation in host specificity of different lineages). This resulted in a dataset of 144 unique lineage-location records from 46 different locations and 17,116 sampled bird individuals, of which 2,830 individuals were recorded as infected. Locations were in the zoogeographical regions of the Nearctic, Palearctic, Neotropics, and Afrotropics (a single lineage from the Saudi-Arabian region was included in the Afrotropics group for analysis) (Tables S1, S2, Figure S1). The number of bird species sampled at the different locations ranged from 6 to 115 (mean 51 ± SD 22), and the number of locally recorded and infected host species for the different lineages ranged from 1 to 21 (mean 5 ± SD 4) (Figure S2), comprising 1% to 69% of the locally sampled bird species.
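The ~ 80% detection probability quoted above follows from the probability of sampling at least one infected individual, 1 − (1 − p)^n. The short Python sketch below reproduces the arithmetic, assuming independent sampling of individuals and perfect diagnostic sensitivity; the function name is illustrative.

```python
def detection_probability(prevalence, n_sampled):
    """Probability that at least one of n_sampled individuals is infected.

    ASSUMPTIONS: individuals are sampled independently and every infection
    present in a sampled individual is detected.
    """
    return 1.0 - (1.0 - prevalence) ** n_sampled


# True prevalence of 15% and a sample of ten individuals: roughly 0.80.
p_detect = detection_probability(0.15, 10)
```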
According to host range estimates (Chao-1 species richness estimates based on the frequency with which different bird species were recorded as hosts), the true local host ranges were likely larger than the number of recorded host species (species recorded to be infected by a lineage) (Figure S3). However, the number of recorded host species was not correlated with the respective sample sizes (Spearman’s r = 0.05), and the proportion of host individuals in a sample was only weakly correlated with sample sizes (Spearman’s r = 0.31); we therefore assumed that sample sizes were sufficient for inference.
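To show why a Chao-1 estimate can exceed the number of recorded host species, here is a minimal Python sketch of the bias-corrected Chao-1 formula, S_obs + f1(f1 − 1) / (2(f2 + 1)), where f1 and f2 are the numbers of host species recorded with exactly one and exactly two infected individuals. The study computed host range with the R package iNEXT, which may use a different variant and also reports uncertainty; the counts below are made up.

```python
def chao1_bias_corrected(counts):
    """Bias-corrected Chao-1 richness estimate from per-species counts.

    counts: number of infected individuals recorded per host species.
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                       # observed host species
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical lineage: five host species with 5, 2, 1, 1 and 1 infections.
host_range = chao1_bias_corrected([5, 2, 1, 1, 1])
```

Many singleton host species relative to doubletons push the estimate above the observed count, which is the pattern behind the larger estimated host ranges reported above.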
Host assemblage and environmental metrics
Within a local context, each haemosporidian lineage was recorded from a number of infected individuals that comprised a subset of the sampled bird individuals, which, in turn, can be considered a random sample of the local bird assemblage. To summarize key aspects of the infected and sampled species assemblage as proxies of aspects of host specificity and host availability, we computed the following metrics:
1) Phylogenetic host specificity (B.phyl) – host specificity calculated as a model-based estimate of the relative difference in phylogenetic distances among all pairs of infected versus sampled bird individuals. We regressed all possible pairwise phylogenetic distances $d$ against a binary indicator variable $C$ that records whether a distance was computed for a pair of infected individuals (‘1’) or for a pair of sampled, uninfected individuals (‘0’) in a linear model, such that:
$$d \sim \mathcal{N}(\mu + \beta_{\mathrm{phyl}} C, \sigma^2) \qquad \text{(eqn. 1)}$$
$\mathcal{N}(\mu, \sigma^2)$ denotes the Gaussian distribution of the linear model with mean $\mu$ and variance $\sigma^2$, and the coefficient estimate $\beta_{\mathrm{phyl}}$ represents the difference between observed and potential host distances. Values of $\beta_{\mathrm{phyl}} < 0$ indicate smaller distances between infected individuals than those in the entire sample and therefore stronger host specificity, whereas values > 0 suggest that a lineage infects more distantly related hosts than expected (Wells, Gibson, & Clark, 2019). The means and SDs of the model-based $\beta_{\mathrm{phyl}}$ estimates, computed independently for each lineage with the glm() function in R (R Core Team, 2023), were used to define the uncertainty and priors in our Bayesian random regression of the relationship between lineage prevalence and specificity (refer to statistical analysis). Host diversity calculated as Rao’s quadratic entropy (a measure of within and among community diversity of infected versus all sampled individuals from different bird species that takes phylogenetic relationships into account) correlated with B.phyl (Spearman r < 0.7) and was therefore not considered in analyses.
2) Functional host specificity (B.func) – host specificity calculated as a model-based estimate of the relative difference in functional distances among all pairs of infected versus sampled bird individuals, using the same concept as for computing phylogenetic specificity described above.
3) Main host availability (mainHavail) – the availability of known suitable main host individuals within a local assemblage. This metric is a probabilistic estimate of the proportion of main host individuals based on the number of individuals from the main host species (i.e. the single host species for which the largest number of infected birds were recorded within a local assemblage) and the total number of individuals sampled from a local assemblage. We used a multinomial model linked to a Dirichlet prior to model the proportion of available host individuals belonging to the main host as described below.
4) Host range (HR.Chao) – we computed host range as the Chao species richness estimate based on the locally sampled number of infected bird individuals from different species; host range estimates were computed with the package iNEXT (Hsieh, Ma, & Chao, 2016).
5) Bird species richness (SpRich) – We measured local bird species richness of terrestrial birds based on a map that summarizes species distributions from BirdLife International range maps (https://biodiversitymapping.org/; refer also to Jenkins, Pimm, & Joppa, 2013). This measure counts all terrestrial birds (potentially available as host) regardless of whether they have been sampled or not.
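The regression behind metrics 1 and 2 (eqn. 1) can be sketched in Python. For a linear model with a single binary predictor, the least-squares slope equals the difference between the two group means, so the point estimate of the specificity coefficient is the mean pairwise distance among infected individuals minus the mean among uninfected individuals. The sketch assumes that pairs with one infected and one uninfected individual are left out, which is one possible reading of the description; the study fitted the model with glm() in R and also used the SDs of the estimates. Function name, species, and distances are hypothetical.

```python
from itertools import combinations


def specificity_beta(species, infected, dist):
    """Point estimate of the slope in eqn. 1 as a difference in group means.

    species:  species label of each sampled individual
    infected: True/False infection status of each individual
    dist:     nested dict of scaled pairwise species distances
    ASSUMPTION: mixed pairs (one infected, one uninfected) are excluded.
    """
    d_infected, d_uninfected = [], []
    for i, j in combinations(range(len(species)), 2):
        d = dist[species[i]][species[j]]
        if infected[i] and infected[j]:
            d_infected.append(d)        # C = 1
        elif not infected[i] and not infected[j]:
            d_uninfected.append(d)      # C = 0
    if not d_infected or not d_uninfected:
        raise ValueError("need at least one pair in each group")
    return sum(d_infected) / len(d_infected) - sum(d_uninfected) / len(d_uninfected)


# Hypothetical assemblage: species A and B are close relatives, C is distant.
toy_dist = {
    "A": {"A": 0.0, "B": 0.2, "C": 1.0},
    "B": {"A": 0.2, "B": 0.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "C": 0.0},
}
toy_species = ["A", "A", "B", "B", "C", "C"]
toy_infected = [True, True, True, False, False, False]
beta = specificity_beta(toy_species, toy_infected, toy_dist)
```

In this toy assemblage the infections are concentrated in the closely related species A and B, so the estimate is negative, which corresponds to stronger host specificity. Passing a functional distance matrix instead of a phylogenetic one gives the analogous quantity for B.func.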
To account for key climatic differences across locations that are likely to affect many aspects of host (and vector) species, we selected six climate variables from the WorldClim database of gridded climate data at a 0.01-degree resolution (Fick & Hijmans, 2017; http://worldclim.org/version2): bio1 (annual mean temperature), bio4 (temperature seasonality), bio7 (temperature annual range), bio12 (annual precipitation), bio14 (precipitation of driest month), and bio15 (precipitation seasonality based on coefficient of variation).