Data from: A comparison of diversity estimators applied to a database of host-parasite associations
Data files
May 27, 2020 version files (2.78 GB)
- GMPD_main_zeroPrevIncluded_2016-12-01.csv (11.21 MB)
- increased_sampling.rds (1.32 GB)
- Readme.rtf (1.85 KB)
- samplingAnalysis.Rmd (36.92 KB)
- samplingSimulations.Rmd (28.76 KB)
- simulated_sampling.rds (1.44 GB)
- simulateSampling.R (8.92 KB)
- SpadeR_functions_edited.R (36.36 KB)
Abstract
    Understanding the drivers of biodiversity is important for forecasting changes in the distribution of life on Earth. However, most studies of biodiversity are limited by uneven sampling effort, with some regions or taxa better sampled than others. Numerous methods have been developed to account for differences in sampling effort, but most were developed for systematic surveys in which all study units are sampled using the same design and assemblages are sampled randomly. Databases compiled from multiple sources, such as the literature, often violate these assumptions because they are composed of studies that vary widely in their goals and methods. Here, we compared the performance of several popular methods for estimating parasite diversity based on a large and widely used parasite database, the Global Mammal Parasite Database (GMPD). We created artificial datasets of host-parasite interactions based on the structure of the GMPD, then used these datasets to evaluate which methods best control for differential sampling effort. We evaluated the precision and bias of seven methods, including species accumulation and nonparametric diversity estimators, and compared them to analyzing the raw data without controlling for sampling variation. We found that nonparametric estimators, particularly the Chao2 and second-order jackknife estimators, performed better than other methods. However, these estimators still performed poorly relative to systematic sampling, and effect sizes should be interpreted with caution because they tend to be lower than actual effect sizes. Overall, these estimators are more useful for comparing diversity among groups than for obtaining accurate absolute estimates of diversity. We make recommendations for future sampling strategies and statistical methods that would improve estimates of global parasite diversity.
  
  
  
  All code is provided as .R or .Rmd files, to be run in R.
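
  For readers unfamiliar with the two estimators highlighted in the abstract, the following is a minimal sketch of the standard incidence-based Chao2 and second-order jackknife formulas. It is not the authors' implementation (their analyses rely on the scripts listed above, including SpadeR_functions_edited.R). The function names `chao2` and `jackknife2` and the toy matrix are hypothetical and chosen only for illustration. The sketch assumes an incidence matrix with sampling units (for example, studies) as rows and parasite species as columns.

```r
# Chao2: observed richness plus a correction based on the number of species
# seen in exactly one (q1) and exactly two (q2) sampling units.
chao2 <- function(inc) {
  inc <- (inc > 0) * 1              # coerce counts to presence/absence
  m <- nrow(inc)                    # number of sampling units
  freq <- colSums(inc)              # number of units in which each species occurs
  s_obs <- sum(freq > 0)            # observed species richness
  q1 <- sum(freq == 1)              # "uniques"
  q2 <- sum(freq == 2)              # "duplicates"
  k <- (m - 1) / m
  if (q2 > 0) {
    s_obs + k * q1^2 / (2 * q2)
  } else {
    # bias-corrected form, used when no species occurs in exactly two units
    s_obs + k * q1 * (q1 - 1) / 2
  }
}

# Second-order jackknife (requires at least two sampling units).
jackknife2 <- function(inc) {
  inc <- (inc > 0) * 1
  m <- nrow(inc)
  stopifnot(m >= 2)
  freq <- colSums(inc)
  s_obs <- sum(freq > 0)
  q1 <- sum(freq == 1)
  q2 <- sum(freq == 2)
  s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2)^2 / (m * (m - 1))
}

# Hypothetical toy data: 4 sampling units (rows) by 5 parasite species (columns).
toy <- matrix(c(1, 1, 0, 0, 1,
                1, 0, 1, 0, 0,
                1, 1, 0, 0, 0,
                1, 0, 0, 1, 0),
              nrow = 4, byrow = TRUE)

chao2(toy)
jackknife2(toy)
```

  Both estimators add to the observed richness a term driven by rarely detected species, which is why they respond to uneven sampling effort differently from raw species counts.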
