
Occurrence datasets, model outputs, and R script for 12 termite species used for niche modeling

Cite this dataset

Goodman, Aaron et al. (2022). Occurrence datasets, model outputs, and R script for 12 termite species used for niche modeling [Dataset]. Dryad.


The advent of citizen-science databases, in conjunction with museum specimen locality information, has substantially increased the power and accuracy of ecological niche modeling (ENM). The growth in occurrence data offers considerable potential to understand the distributions of lesser-known or endangered species, including arthropods. Although niche modeling of termites has been conducted in the context of invasive and pest species, few studies have examined the distribution of basal termite genera. Using specimen records from the American Museum of Natural History (AMNH) as well as locality databases, we generated ecological niche models for 12 basal termite species belonging to six genera and three families. We extracted environmental data from the WorldClim v2 dataset of 19 bioclimatic variables, along with SoilGrids datasets, and generated models using MaxEnt. We chose optimal models based on partial receiver operating characteristic (pROC) and omission rate criteria and determined variable importance using permutation analysis. We also calculated response curves to understand how suitability changes with environmental variables. Optimal models for our 12 termite species ranged in complexity, but no discernible pattern was noted among genera, families, or geographic range. Permutation analysis revealed that habitat suitability is affected predominantly by seasonal or monthly variation in temperature and precipitation. Our findings highlight the efficacy of datasets drawn largely from citizen science and museum collections, and our models provide a baseline for predictions of future abundance of lesser-known arthropod species in the face of habitat destruction and climate change.
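As an illustration of the omission rate criterion mentioned above, the following is a minimal Python sketch and not the kuenm implementation. It assumes a threshold set so that a fraction e of the calibration occurrences falls below it; the function name, the argument names, and the default e = 0.05 are assumptions made for illustration and are not stated in this record.

```python
import math

def omission_rate(train_suit, test_suit, e=0.05):
    """Fraction of test occurrences predicted below a suitability threshold.

    The threshold is the training suitability value that leaves a fraction
    `e` of training occurrences below it (an assumed rule for this sketch).
    """
    ranked = sorted(train_suit)
    n_omit = math.floor(len(ranked) * e)  # training points allowed below the threshold
    threshold = ranked[n_omit]
    omitted = sum(1 for s in test_suit if s < threshold)
    return omitted / len(test_suit)
```

A model whose omission rate on independent test points stays close to e is omitting about as many occurrences as the threshold allows by design; rates well above e suggest overfitting to the calibration data.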


We acquired occurrence records of non-Kalotermitidae, non-neoisopteran species from GBIF and iNaturalist. We selected occurrences backed by preserved museum specimens, along with research-grade observations, that is, occurrences with verified latitude and longitude coordinates, a photograph of the sighting, a date, and at least ⅔ community agreement on species identification. We filtered the occurrences further by removing sightings with erroneous localities (middle of the ocean, locations of large museums). Additional localities were acquired from undatabased specimens housed in the AMNH termite collection. We used gazetteers to obtain coordinates for museum specimens that lacked latitude and longitude data but had sufficiently specific locality information. We acquired environmental rasters at 2.5 arc-minute resolution (~5 km at the equator) from the WorldClim 2.0 database, along with environmental variables from the Global Soil Information Facilities (GSIF) SoilGrids database at 250 m resolution for the 0-5 cm depth interval. We omitted four of the bioclimatic layers (bio08, bio09, bio18, bio19) because of their known spatial artifacts (Moo-Llanes et al. 2021).
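The record selection and layer exclusion described above can be sketched in Python as follows. This is a hedged illustration only: the dictionary keys (lat, lon, basis, quality) and their values are hypothetical and do not reflect the GBIF or iNaturalist export schemas, and the removal of ocean and museum localities is not shown because it would require a land mask and a list of museum coordinates.

```python
# The four layers omitted in the text because of known spatial artifacts.
EXCLUDED_LAYERS = {"bio08", "bio09", "bio18", "bio19"}

def retained_bioclim_layers():
    """Return the names of the bioclimatic layers kept after exclusion."""
    all_layers = [f"bio{i:02d}" for i in range(1, 20)]  # bio01 .. bio19
    return [name for name in all_layers if name not in EXCLUDED_LAYERS]

def keep_record(record):
    """Keep museum specimens and research-grade observations with valid coordinates.

    `record` is a dict with hypothetical keys: lat, lon, basis, quality.
    """
    lat, lon = record.get("lat"), record.get("lon")
    if lat is None or lon is None:
        return False
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return False
    is_specimen = record.get("basis") == "preserved_specimen"
    is_research_grade = record.get("quality") == "research"
    return is_specimen or is_research_grade
```

Applied to the 19 WorldClim variables, the exclusion leaves 15 bioclimatic layers to combine with the SoilGrids variables.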

Usage notes

The first zip file contains all of the species occurrence data, maxent.jar model outputs, calibration results, and final model results.

The file organization follows Figure 2 of Cobos et al. (2019).

The data files are structured as they appear after the kuenm R package functions have been executed.

The second zip file contains the environmental data layers ('ENVS') used in our analyses, which are in ASCII format.
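Before loading the layers, it can help to check that they share the same extent and cell size. The Python sketch below reads the header of a grid file, assuming the layers are ESRI ASCII grids (.asc), the text raster format MaxEnt accepts; this record only says "ascii format", so that is an assumption, and the function name is illustrative.

```python
# Header keywords defined by the ESRI ASCII grid format (lower-cased).
HEADER_KEYS = {
    "ncols", "nrows", "xllcorner", "yllcorner",
    "xllcenter", "yllcenter", "cellsize", "nodata_value",
}

def read_ascii_grid_header(lines):
    """Parse the leading 'keyword value' lines of an ESRI ASCII grid into a dict."""
    header = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or parts[0].lower() not in HEADER_KEYS:
            break  # first data row reached
        header[parts[0].lower()] = float(parts[1])
    return header
```

For a file on disk, the function can be called on an open file handle, for example `with open(path) as f: header = read_ascii_grid_header(f)`, where `path` is whichever layer is being inspected. Comparing the resulting dictionaries across layers shows whether the grids are aligned.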

The final file is the R script for running the analysis.


National Science Foundation, Award: 1950610