Database Bd modelling for South Africa, Lesotho and eSwatini
Data files
Apr 10, 2024 version files 24.48 KB
Abstract
This dataset was used to create a predictive, lineage-based distribution map of Batrachochytrium dendrobatidis (Bd) in South Africa. It contains records obtained from published sources as well as records from our own fieldwork. Lineage was identified using published lineage-specific primers, in conjunction with lineage typing confirmed by whole genome sequencing in O'Hanlon et al. 2018.
README: Database Bd modelling South Africa Lesotho eSwatini
This dataset contains records from Bd surveillance and isolation work. The records originate from publicly available, peer-reviewed articles, along with our own fieldwork records. Isolates from our fieldwork that were whole genome sequenced and published in O'Hanlon et al. 2018 have also been entered into the database. For each record, the dataset provides the country, the common name of the site, the total number of samples taken, the number of positive and negative samples, the lineage (if available from the evidence), the coordinates, and the date of collection. The family, genus and species names of the sampled individuals are also given.
Please note that a site was categorised as POSITIVE if even a single individual tested positive.
"None" indicates where no positives were found.
"Not Available" refers to historical samples included for which the lineage was not available.
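The categorisation rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not code used to build the dataset: the function names, the "NEGATIVE" label for sites without positives, and the placeholder lineage name are assumptions made for the example.

```python
from typing import Optional


def site_status(n_positive: int) -> str:
    """Site-level category: POSITIVE if at least one individual tested positive.

    The "NEGATIVE" label is an assumption; the README only names POSITIVE.
    """
    if n_positive < 0:
        raise ValueError("n_positive must be non-negative")
    return "POSITIVE" if n_positive >= 1 else "NEGATIVE"


def lineage_label(n_positive: int, lineage: Optional[str]) -> str:
    """Lineage field as described in the README.

    "None"          -> no positives were found at the site
    "Not Available" -> positives exist, but no lineage was recorded
    otherwise       -> the recorded lineage name
    """
    if n_positive == 0:
        return "None"
    if lineage is None:
        return "Not Available"
    return lineage


# Hypothetical records: (number of positive samples, recorded lineage)
examples = [(0, None), (1, None), (7, "LineageX")]  # "LineageX" is a placeholder
labels = [(site_status(n), lineage_label(n, lin)) for n, lin in examples]
```

Keeping the site status and the lineage label as separate functions mirrors the dataset, where a site can be POSITIVE while its lineage remains "Not Available".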
Methods
We modelled lineage distributions using Maxent (version 3.4.4; Phillips et al., 2006) and the algorithms described by Elith et al. (2011) to correlate environmental parameters with the presence of either Bd lineage. A database of all known Bd records for South Africa, Lesotho and Eswatini was created with the following information: locality and coordinates of sample origin, sample size, prevalence of Bd, lineage found (if available), date of sampling, and host genus and species name. Bioclimatic variables used for modelling were obtained from WorldClim (https://www.worldclim.org/data/bioclim.html), and the contributions of the different variables are given in Table 1. Maxent used 80 % of the positive records to train the model and the remaining 20 % to test the accuracy of its predictions. Model replications were set to 100 using the subsample option, with the regularisation multiplier set to 1. Random seed was set to true, with all other settings left at default.
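The replicated 80/20 subsampling described above can be illustrated with a short Python sketch. It is not Maxent's implementation and was not used to produce the models; it only shows the idea of drawing a fresh random training and test partition of the positive records for each replicate. The function name, its parameters and the use of integers as stand-in records are assumptions for the example.

```python
import random


def subsample_splits(records, n_replicates=100, test_fraction=0.2, seed=None):
    """Yield (training, test) partitions, one fresh random split per replicate."""
    rng = random.Random(seed)  # seed=None gives a different partition sequence each run
    n_test = round(len(records) * test_fraction)
    for _ in range(n_replicates):
        shuffled = list(records)
        rng.shuffle(shuffled)
        # The first n_test shuffled records are held out for testing
        yield shuffled[n_test:], shuffled[:n_test]


# Hypothetical example: 50 positive records identified by integers
positive_records = list(range(50))
splits = list(subsample_splits(positive_records, seed=1))
first_train, first_test = splits[0]
```

Because every replicate reshuffles the full set of records, a given record can fall in the test set of one replicate and the training set of another, which is what distinguishes subsampling from a single fixed hold-out.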
Model performance was evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve and using jack-knife tests. The jack-knife tests examined the importance of each variable, firstly by removing one variable at a time and secondly by testing each variable in isolation (Figures S1 and S2). An AUC value of 1 indicates a perfect model, while a value of 0.5 indicates that the model has no predictive ability. For display purposes, the lineage-specific prediction models were converted to a binary output by applying the “10 percentile training presence logistic threshold” to the average results of the 100 model replications.
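As a rough illustration of the two evaluation steps above, the following Python sketch computes an AUC by pairwise comparison of presence and background scores, and derives a 10th percentile training presence threshold to binarise predictions. It is a minimal sketch under stated assumptions, not Maxent's own code: the function names and example scores are hypothetical, and the percentile convention (the lowest 10 % of training presences fall below the threshold) is an assumption that may differ in detail from Maxent's calculation.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence outscores a random background point.

    Ties count as half a win, so 1.0 is a perfect ranking and 0.5 is no
    better than chance.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def tenth_percentile_threshold(training_presence_scores):
    """Score below which the lowest 10 % of training presences fall (assumed convention)."""
    ordered = sorted(training_presence_scores)
    n_below = len(ordered) // 10  # number of presences allowed under the threshold
    return ordered[n_below]


def to_binary(scores, threshold):
    """Convert continuous suitability scores to 1 (suitable) or 0 (unsuitable)."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical logistic scores at ten training presence localities
training_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
threshold = tenth_percentile_threshold(training_scores)
binary_map = to_binary(training_scores, threshold)
```

With this threshold, about 90 % of the training presences are classified as suitable, which excludes the lowest-scoring presence records from the displayed range.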