Data from: Collection methods and distribution modeling for Strepsiptera in the United States
Data files
May 09, 2024 version files (115.48 KB)
- GBIF_strepsiptera_US_occurrences.xlsx (96.83 KB)
- NY_strepsiptera_survey.xlsx (11.50 KB)
- README.md (7.14 KB)
May 21, 2024 version files (115.74 KB)
- README.md (7.41 KB)
- Table_S1_NY_strepsiptera_survey.xlsx (11.50 KB)
- Table_S2_GBIF_strepsiptera_US_occurrences.xlsx (96.83 KB)
May 30, 2024 version files (116.04 KB)
Abstract
The twisted-wing parasite order (Strepsiptera Kirby, 1813) is difficult to study due to the complexity of strepsipteran life histories, small body sizes, and a lack of accessible distribution data for most species. Here, we present a review of the strepsipteran species known from New York State. We also demonstrate successful collection methods and a survey of species carried out in an old-growth deciduous forest dominated by native New York species (Black Rock Forest, Cornwall, NY) and a private site in the Catskill Mountains (Shandaken, NY). Additionally, we model suitable habitat for Strepsiptera in the United States with species distribution modeling. We base our models on host distributions and climatic variables to inform predictions of where these twisted-wing parasites are likely to be found. With this work, we hope to provide a useful reference for the future collection of Strepsiptera.
README: Supplemental data for "Collection methods and distribution modeling for Strepsiptera in the United States"
https://doi.org/10.5061/dryad.n5tb2rc34
These files are supplemental information related to surveys of the twisted-wing parasites (Strepsiptera) in New York State and species distribution modeling for Strepsiptera in the United States.
The latest version (8) contains organizational changes only: the two .MOV files in the supplemental files hosted by Zenodo have been renamed to match their citations in the related manuscript.
Version 7 also contained organizational changes only: the two .xlsx files were renamed to match their citations in the related manuscript.
Data and file structure
- Video_S1_Xenos_pull.MOV (renamed as of version 8 to reflect manuscript)
  - A video file of a female Xenos peckii being pulled from a female Polistes fuscatus host wasp
  - This video is included to demonstrate how the wasp was immobilized with wire and foam for the purpose of isolating the parasite.
- Video_S2_Eupathocera_pull.MOV (renamed as of version 8 to reflect manuscript)
  - A video file of a gravid female Eupathocera auripedis being pulled from host wasp Isodontia mexicana
  - This video is included as visual accompaniment to the description of the gravid female E. auripedis and her larvae.
- Map_BRF.jpeg
  - A map displaying the collection sites for this study in Black Rock Forest as an inset over a map of New York State
- Twisted-winged Insect (Pseudoxenos sp.) larvae hatching from female in Eumeninae host 20230720_5072.jpg
  - An image of a gravid Pseudoxenos tigridis cephalothorax protruding from the sclerites of its host wasp Ancistrocerus adiabatus, with larvae visible in the brood canal
  - Taken by John and Kendra Abbott, original image, 2023, used with permission of Abbott Nature Photography. This image is not covered by the terms of the license of this dataset. For permission to reuse, please contact the rights holder.
- Table_S1_NY_strepsiptera_survey.xlsx (renamed as of version 7 to reflect manuscript)
  - An Excel spreadsheet of the full collection data for the survey of Strepsiptera in some sites of New York State conducted in this study
  - Metadata for this dataset can be found in the second tab.
- Table_S2_GBIF_strepsiptera_US_occurrences.xlsx (renamed as of version 7 to reflect manuscript)
  - An Excel spreadsheet of the full GBIF and host information used to inform Table 2 and all models featured in the manuscript related to these supplemental data
  - Metadata for the first dataset in this spreadsheet (GBIF_data) can be found at https://doi.org/10.15468/dl.q43zd3
  - Metadata for the second dataset in this spreadsheet (host_species) can be found in the third tab.
- US_shapefiles.zip
  - A compressed folder containing all files related to a border shapefile of the contiguous United States
  - Files: US_contiguous_final.cpg, US_contiguous_final.dbf, US_contiguous_final.prj, US_contiguous_final.qmd, US_contiguous_final.shp, US_contiguous_final.shx; the .shp file can be opened for use in QGIS and the .shp, .shx, and .dbf files are for use in species distribution modeling with Wallace.
- final_scripts.zip
  - A compressed folder containing all scripts for the species distribution models featured in this manuscript as R markdown files
  - The models and their corresponding files are as follows:
    - Predicted Range of Xenos peckii and Polistes sp.: xenos_peckii_prediction.Rmd, polistes_fuscatus_prediction.Rmd
    - Predicted Range of Elenchus sp. and Delphacidae: elenchus_prediction.Rmd, elenchus_host_prediction.Rmd
    - Predicted Range of Halictophagus sp. and Cicadellidae: halictophagidae_prediction.Rmd, cicadellidae_prediction.Rmd
    - Predicted Range of Eupathocera sp. and Sphecidae: eupathocera_prediction.Rmd, eupathocera_host_prediction.Rmd
    - Predicted Range of Pseudoxenos sp. and Eumeninae: pseudoxenos_prediction.Rmd, pseudoxenos_host_prediction.Rmd
    - Predicted Range of Xenos sp. and Polistes sp.: xenos_prediction.Rmd, polistes_prediction.Rmd
    - Predicted Range of Stylops sp. and Andrena sp.: stylops_prediction.Rmd, andrena_prediction.Rmd
- plots_with_reference_data.zip
  - A compressed folder containing all plots of strepsipteran SDMs in the main body of the manuscript, with the GBIF/survey reference data for each model plotted to contextualize their accuracy
  - Files: elenchus_references.pdf, eupathocera_references.pdf, halictophagus_references.pdf, pseudoxenos_references.pdf, stylops_references.pdf, xenospeckii_references.pdf, xenosspp_references.pdf
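
Because Wallace needs the .shp, .shx, and .dbf parts of the border shapefile together, it can be worth checking a downloaded copy of US_shapefiles.zip before use. The following is a minimal Python sketch of such a check, not part of the dataset's own scripts. The function names are hypothetical, and the internal layout of the archive (files at the top level or inside a folder) is an assumption, so the check matches on file name only.

```python
import zipfile
from pathlib import PurePosixPath

# Shapefile parts the README lists as needed for modeling with Wallace.
REQUIRED_SUFFIXES = {".shp", ".shx", ".dbf"}


def shapefile_components(zip_source, stem="US_contiguous_final"):
    """Map each file suffix found for `stem` to its path inside the archive.

    `zip_source` may be a path or a binary file-like object.
    """
    found = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Match on the base name so nested folders do not matter.
            if path.stem == stem:
                found[path.suffix.lower()] = name
    return found


def missing_components(zip_source, stem="US_contiguous_final"):
    """Return the required suffixes that are absent from the archive, sorted."""
    present = set(shapefile_components(zip_source, stem))
    return sorted(REQUIRED_SUFFIXES - present)
```

Calling `missing_components("US_shapefiles.zip")` on a complete download should return an empty list; any suffix it reports needs to be restored before the shapefile is loaded.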
Sharing/Access information
Data was derived from the following sources:
- GBIF.org (30 December 2023) GBIF Occurrence Download https://doi.org/10.15468/dl.q43zd3
- User, G. O. (2023). Occurrence Download [dataset]. The Global Biodiversity Information Facility. https://doi.org/10.15468/DL.WVPJE9
- United States Government. (2023, May 23). Cartographic Boundary Files. United States Census Bureau. https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html
Code/Software
US shapefiles were generated with QGIS v3.2.6.
- Flenniken, J. M., Stuglik, S., & Iannone, B. V. (2020). Quantum GIS (QGIS): An introduction to a free alternative to more costly GIS platforms: FOR359/FR428, 2/2020. EDIS, 2020(2), 7–7. https://journals.flvc.org/edis/article/download/108810/120175
All R markdown scripts were generated with the R package “wallace” and its modeling application Wallace v2.0 (Kass et al., 2018, 2023), using the algorithm MaxEnt (Maximum Entropy) (Phillips et al., 2004) and incorporating Bioclim environmental data (Booth et al., 2014) as explanatory variables driving species presence.
Kass, J. M., Pinilla-Buitrago, G. E., Paz, A., Johnson, B. A., Grisales-Betancur, V., Meenan, S. I., Attali, D., Broennimann, O., Galante, P. J., Maitner, B. S., Owens, H. L., Varela, S., Aiello-Lammens, M. E., Merow, C., Blair, M. E., & Anderson, R. P. (2023). wallace 2: A shiny app for modeling species niches and distributions redesigned to facilitate expansion via module contributions. Ecography. https://doi.org/10.1111/ecog.06547
Kass, J. M., Vilela, B., Aiello-Lammens, M. E., Muscarella, R., Merow, C., & Anderson, R. P. (2018). Wallace: A flexible platform for reproducible modeling of species niches and distributions built for community expansion. Methods in Ecology and Evolution / British Ecological Society, 9(4), 1151–1156. https://doi.org/10.1111/2041-210x.12945
Phillips, S. J., Dudík, M., & Schapire, R. E. (2004). A maximum entropy approach to species distribution modeling. Proceedings of the Twenty-First International Conference on Machine Learning, 83. https://doi.org/10.1145/1015330.1015412
Booth, T. H., Nix, H. A., Busby, J. R., & Hutchinson, M. F. (2014). bioclim: the first species distribution modelling package, its early applications and relevance to most current MaxEnt studies. Diversity & Distributions, 20(1), 1–9. https://doi.org/10.1111/ddi.12144
Methods
Our specimens were collected in Black Rock Forest (BRF), Cornwall, New York, over the course of six trips in July and August of 2022 and 2023. BRF is an old-growth forest protected and maintained by a namesake scientific organization dedicated to its study; as such, it provides a uniquely mature and native environment in which to collect ecological data. We sampled six areas: native growth by the BRF Science Center (41.41408°, -74.011919°), a patch of wild growth in the parking lot (41.413249°, -74.011421°), the meadow of the Upper Reservoir (41.411015°, -74.007048°), Aleck Meadow (41.406405°, -74.014587°), meadows of Jim’s Pond (41.387490°, -74.020348°), and brush near the Stone House (41.397177°, -74.021423°) (Figure S1). In addition to the BRF sites, we sampled one privately owned site in the Catskill Mountains, Shandaken, New York, in June and July 2023 (42.129425°, -74.377613°).
To generate predictive models of host and Strepsiptera ranges, we gathered occurrence data from the Global Biodiversity Information Facility (GBIF) for each host-parasite pair for which collection coordinates were available, and combined them with the locality data from our collection efforts. Of the 78 strepsipteran species documented in the United States, only a subset had occurrence data. Of these, 51 species included specific coordinate data, and only 15 species had multiple unique coordinates. If hosts of these strepsipterans did not have occurrence data, we excluded these host species from the predictive analyses as well. Since our models require at least 5 occurrence datapoints to run, we ran models on genera instead of species to ensure that our predictions were robust. Our species list was based on a checklist of strepsipteran species and their hosts in the United States (Kathirithamby, 2005), a United States checklist (Zabinski & Cook, 2023), and a world checklist of the genus Stylops (Straka et al., 2015). Our GBIF search parameters specified human observation and preserved specimens as basis of record, data with coordinates, and the United States as an administrative area to restrict the search. When necessary to reduce computation time, we thinned the data by restricting coordinate uncertainty to between 0 and 1 meters.
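
The record filtering described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used: the search was specified through GBIF's own download parameters, whereas this sketch applies comparable filters to records after download. The field names follow Darwin Core terms as they appear in GBIF downloads; the function names are hypothetical; and counting unique coordinate pairs toward the five-point minimum is an assumption, since the text does not say whether duplicate coordinates count.

```python
from collections import defaultdict

# Basis-of-record values kept in the search described above.
KEPT_BASIS = {"HUMAN_OBSERVATION", "PRESERVED_SPECIMEN"}
MIN_POINTS = 5  # minimum occurrence points needed to run a model


def usable(record, max_uncertainty_m=None):
    """Return True if a record (a dict) passes the search filters."""
    if record.get("basisOfRecord") not in KEPT_BASIS:
        return False
    if record.get("countryCode") != "US":
        return False
    if record.get("decimalLatitude") is None or record.get("decimalLongitude") is None:
        return False
    if max_uncertainty_m is not None:
        # Optional thinning step: keep only precisely georeferenced records.
        uncertainty = record.get("coordinateUncertaintyInMeters")
        if uncertainty is None or not 0 <= uncertainty <= max_uncertainty_m:
            return False
    return True


def unique_points_by_genus(records, max_uncertainty_m=None):
    """Collect the set of distinct (lat, lon) pairs for each genus."""
    points = defaultdict(set)
    for record in records:
        if usable(record, max_uncertainty_m):
            points[record["genus"]].add(
                (record["decimalLatitude"], record["decimalLongitude"])
            )
    return dict(points)


def modelable_genera(records, max_uncertainty_m=None):
    """List genera with enough distinct points to run a model (assumed rule)."""
    points = unique_points_by_genus(records, max_uncertainty_m)
    return sorted(g for g, pts in points.items() if len(pts) >= MIN_POINTS)
```

Passing `max_uncertainty_m=1` mimics the optional thinning step; leaving it as `None` keeps every record that has coordinates.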
We took a species distribution modeling approach with the R package “wallace” and its modeling application Wallace v2.0 (Kass et al., 2018, 2023), using the algorithm MaxEnt (Maximum Entropy) (Phillips et al., 2004) and incorporating Bioclim environmental data (Booth et al., 2014) as explanatory variables driving species presence. For each species of Strepsiptera, we incorporated its host presence-absence prediction (10-percentile training presence threshold visualization) as a categorical variable. We standardized our models by restricting their region of study to a shapefile of the 48 contiguous United States, which we generated in QGIS using publicly available data (United States Government, 2023). We chose each model based on corrected Akaike information criterion (AICc), average omission rate when applying a 10-percentile training presence threshold to withheld validation data (OR.10p), and area under the curve of a receiver operating characteristic plot (auc.val.avg) (Kass et al., 2021; Peterson et al., 2011). Our R scripts for each model are openly available at Dryad. We visualized all data resulting from our models in QGIS v3.2.6 (Flenniken et al., 2020), and generated our host-parasite and species richness maps by using the QGIS Raster Calculator addition function.
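
Two steps in this workflow, thresholding a continuous prediction into presence-absence and adding binary maps together, can be illustrated with a short Python sketch. It is only an approximation of what Wallace and the QGIS Raster Calculator do: grids are plain nested lists rather than rasters, `None` stands in for no-data cells, all function names are hypothetical, and the percentile convention used here (keep the top 90% of training presences, rounding the count up) is one common choice that may differ in detail from the one Wallace applies.

```python
def p10_threshold(train_scores):
    """Suitability value that retains the top 90% of training presences."""
    if not train_scores:
        raise ValueError("need at least one training presence score")
    ranked = sorted(train_scores, reverse=True)
    # Integer form of ceil(0.9 * n), which avoids floating-point rounding.
    keep = max(1, (9 * len(ranked) + 9) // 10)
    return ranked[keep - 1]


def binarize(grid, threshold):
    """Convert suitability values to 1 (present) or 0 (absent); keep no-data."""
    return [
        [None if value is None else int(value >= threshold) for value in row]
        for row in grid
    ]


def add_grids(*grids):
    """Cell-wise sum of equally sized grids; no-data in any input stays no-data."""
    summed = []
    for rows in zip(*grids):
        summed.append([
            None if any(value is None for value in cells) else sum(cells)
            for cells in zip(*rows)
        ])
    return summed
```

Adding a binarized parasite grid to a binarized host grid yields cells valued 0, 1, or 2, where 2 marks predicted overlap; adding the binary grids of several taxa gives a simple richness count per cell.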