Digital distribution maps of the bats of Texas
Data files
Feb 02, 2026 version files 32.91 MB
- Digital_Distribution_Maps_for_the_Bats_of_Texas.zip (32.89 MB)
- README.md (12.57 KB)
Abstract
Of all the terrestrial mammals in Texas, bats (order Chiroptera) are the most imperiled, with 23 species (72% of species in the order that occur in Texas) listed by the Texas Parks and Wildlife Department as Species of Greatest Conservation Need (SGCN). Despite so many bat species being categorized as SGCN, we have only a coarse understanding of their distributions and little quantitative understanding of the relative imperilment of species stemming from a variety of anthropogenic threats. High-resolution estimates of distribution would greatly aid in directing conservation efforts in the state and could identify the species needing the greatest conservation attention. Recent advances in ecological niche modeling allow construction of digital distribution maps that are much more finely resolved and that provide estimates of habitat suitability reflecting the probability of occurrence of species. We generated ecological niche models (ENMs) for 34 species of bats occurring in the state of Texas. This dataset contains the data we used to create these models and the results we obtained.
Dataset DOI: 10.5061/dryad.hx3ffbgsz
Description of the data and file structure
This submission contains our data (Data folder), which identify the Texas species we analyzed (Texas species from grant.csv), the standardized names we used for each species (Species name code.csv), and the smaller Texas collections we requested and received data from (Distribution Data from Smaller Texas Collections folder). It also contains the results of our MaxEnt models for each Texas bat species (Results folder) as geotiffs (MaxEnt raster predictions sub-folder) and tables (MaxEnt Tables sub-folder). The tables include MaxEnt tuning results (final_results.csv), variable importance measures (permutation_importance.csv), the total number of thinned coordinates per species (species_thinrec_counts.csv), and a master table containing the values of our environmental variables and species record data for each raster cell (master_values_table.csv). Our R code (Codes folder) covers preparing the environmental layers (Layer prep sub-folder), preparing background points and occurrence records (Occ record and background prep sub-folder), MaxEnt tuning and modeling (Maxent tuning and models sub-folder), and creating final maps and variable response curve plots (Maps, final plots, and master table sub-folder). Finally, our raw data are in a Supplementary folder, which contains our environmental raster layers as geotiffs (Environmental rasters folder), a shapefile for our study area (study_area folder), and our occurrence records (Occurrences folder), including our filtered but non-thinned museum occurrence records (Final_Chiroptera_records.csv) and the thinned version of these records (total_thin_coords.csv).
Files and variables
File: Digital_Distribution_Maps_for_the_Bats_of_Texas.zip
Data folder
Texas species from grant.csv
- List of all the species from this project in the order they appear in our final report
Species name code.csv
- Verbatim Name = the species name exactly as it appears in the GBIF field, which can vary widely for the same species
- Standardized = the single name we assigned to each species to reconcile the varying GBIF species names
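As a minimal sketch of how these two columns might be used together, the Python snippet below builds a lookup from Verbatim Name to Standardized and applies it to one record. The column headers follow the descriptions above; the sample rows are invented placeholders, not values copied from Species name code.csv, and the function names are hypothetical.

```python
import csv
import io

# Invented placeholder rows; the real data live in "Species name code.csv".
SAMPLE = """Verbatim Name,Standardized
Myotis velifer incautus,Myotis velifer
MYOTIS VELIFER,Myotis velifer
"""


def load_name_lookup(handle):
    """Map each verbatim GBIF name to its standardized name."""
    reader = csv.DictReader(handle)
    return {
        row["Verbatim Name"].strip(): row["Standardized"].strip()
        for row in reader
    }


def standardize(name, lookup):
    """Return the standardized name, or None if the verbatim name is unknown."""
    return lookup.get(name.strip())


lookup = load_name_lookup(io.StringIO(SAMPLE))
print(standardize("Myotis velifer incautus", lookup))
```

To run this against the real file, the `io.StringIO(SAMPLE)` argument would be replaced with an open file handle, assuming the headers match those described above.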
Distribution Data from Smaller Texas Collections sub-folder
Museum Request Status.csv
- Which museums we requested data from, their response, and the contact at the institutions that responded
- 'N/A' = not applicable; no contact is listed for collections that hold no bats
Sul Ross Bats.csv
- The bat records including their catalog number, taxonomy, locality, date, collector, preservation, and sex from Sul Ross State University
- 'N/A' = not available; the collection did not provide this value in the species record
Tarleton Bats.csv
- The bat records including taxonomy, locality, catalogue number, sex, and date from Tarleton State University
Results folder
MaxEnt raster predictions sub-folder
- The cloglog prediction output from each species' final tuned MaxEnt model as geotiff rasters
- Predictions are in the same projection, North America Albers Equal Area Conic projection (ESRI:102008), and resolution (19 x 19 km) as our environmental rasters
- Values range from 0 to 1 and represent the probability of suitable habitat
MaxEnt Tables sub-folder
final_results.csv
- Represents the optimal parameters and evaluation metrics of the final MaxEnt model after tuning, given by the ENMeval R package
- Species = the species that was modeled
- fc = feature class
- rm = regularization multiplier
- tune.args = number of argument combinations, all 1 here as only 1 combination was provided after filtering non-contributing variables to the final MaxEnt model
- auc.train = AUC calculated on the full dataset
- cbi.train = Continuous Boyce Index calculated on the full dataset
- auc.diff = average and standard deviation of the difference between auc.train and auc.val
- auc.val = average and standard deviation of the AUC calculated on the validation datasets
- cbi.val = average and standard deviation of the Continuous Boyce Index calculated on the validation datasets
- or.10p = average and standard deviation of the omission rate with threshold as the minimum suitability value across occurrence records after removing the lowest 10 percent
- or.mtp = average and standard deviation of the omission rate with threshold as the minimum suitability value across occurrence records
- AICc = AIC corrected for small sample sizes
- delta.AICc = this model's AICc value minus the lowest AICc value across all models, zero here as only the final model was entered per species
- w.AIC = AIC weights, all 1 here as only the final model was entered per species
- ncoef = the number of non-zero coefficients in the final MaxEnt model
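The AICc, delta.AICc, and w.AIC columns follow the usual information-criterion definitions, and a short worked sketch shows why a single candidate model always yields a delta of 0 and a weight of 1. The Python below uses the standard textbook formulas; it is not the ENMeval implementation, and the function names and numeric inputs are illustrative assumptions.

```python
import math


def aicc(log_likelihood, k, n):
    """AIC corrected for small samples (k = parameters, n = records)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)


def delta_and_weights(aicc_values):
    """Return (deltas, weights) for a set of candidate models.

    delta_i = AICc_i - min(AICc); weight_i = exp(-delta_i / 2) / sum over models.
    """
    best = min(aicc_values)
    deltas = [value - best for value in aicc_values]
    relative = [math.exp(-d / 2) for d in deltas]
    total = sum(relative)
    return deltas, [r / total for r in relative]


# One candidate model (the situation described for final_results.csv):
print(delta_and_weights([1234.5]))

# Two hypothetical candidates, for contrast:
print(delta_and_weights([100.0, 102.0]))
```

With one model in the set, the minimum is the model's own AICc, so the delta is 0 and the weight is 1 regardless of the AICc value.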
permutation_importance.csv
- Permutation importance values for each environmental variable in each species' final MaxEnt model
- --- = variable was not included in that species' final MaxEnt model
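When reading this table programmatically, the `---` placeholder needs to be treated as missing rather than as a number. The sketch below shows one way to do that in Python. The layout it assumes (variables as rows, one column per species, a first column named `Variable`) and the sample values are assumptions for illustration and should be checked against the actual file.

```python
import csv
import io

# Invented layout and values; verify against permutation_importance.csv.
SAMPLE = """Variable,Species A,Species B
Bio1,42.1,---
Bio12,---,17.5
"""


def parse_importance(cell):
    """Convert a table cell to float, mapping the '---' placeholder to None."""
    cell = cell.strip()
    return None if cell == "---" else float(cell)


def load_importance(handle, variable_column="Variable"):
    """Return {species: {variable: importance or None}}."""
    table = {}
    for row in csv.DictReader(handle):
        variable = row[variable_column]
        for species, cell in row.items():
            if species == variable_column:
                continue
            table.setdefault(species, {})[variable] = parse_importance(cell)
    return table


table = load_importance(io.StringIO(SAMPLE))
# Variables actually used in the hypothetical "Species A" model:
print([v for v, imp in table["Species A"].items() if imp is not None])
```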
species_thinrec_counts.csv
- Number of thinned records (one record per 19x19km raster cell) that went into each species final MaxEnt model
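Thinning to one record per raster cell can be pictured as assigning each projected coordinate to a 19 x 19 km grid cell and keeping a single record per cell. The Python below is a minimal sketch of that idea, not the authors' R code: the grid origin, the choice to keep the first record encountered, and the function names are all assumptions.

```python
import math

CELL_SIZE_M = 19_000  # 19 x 19 km cells, in projected meters


def cell_index(x, y, origin_x=0.0, origin_y=0.0, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing (x, y)."""
    return (
        math.floor((x - origin_x) / cell_size),
        math.floor((y - origin_y) / cell_size),
    )


def thin_records(records, **grid):
    """Keep the first record that falls in each grid cell."""
    seen = set()
    kept = []
    for x, y in records:
        key = cell_index(x, y, **grid)
        if key not in seen:
            seen.add(key)
            kept.append((x, y))
    return kept


# Hypothetical coordinates: the first two share a cell, the third does not.
records = [(1_000.0, 1_000.0), (5_000.0, 2_000.0), (20_000.0, 1_000.0)]
print(thin_records(records))
```

The count reported per species in species_thinrec_counts.csv would correspond to the length of the thinned list in a scheme like this one.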
master_values_table.csv
- Cell ID = id number given to each raster cell with data in our study area
- x,y = center coordinates for each raster cell with data in our study area
- Bio1 = annual mean temperature values given in C
- Bio2 = mean diurnal range (mean of monthly (max temp - min temp)) given in C
- Bio3 = isothermality (Bio2/Bio7) x 100, a unitless percentage
- Bio4 = temperature seasonality (standard deviation x 100) given in C
- Bio5 = max temperature of warmest month given in C
- Bio6 = min temperature of coldest month given in C
- Bio7 = temperature annual range (Bio5 - Bio6) given in C
- Bio8 = mean temperature of wettest quarter given in C
- Bio9 = mean temperature of driest quarter given in C
- Bio10 = mean temperature of warmest quarter given in C
- Bio11 = mean temperature of coldest quarter given in C
- Bio12 = annual precipitation given in mm
- Bio13 = precipitation of wettest month given in mm
- Bio14 = precipitation of driest month given in mm
- Bio15 = precipitation seasonality (coefficient of variation), a unitless percentage
- Bio16 = precipitation of wettest quarter given in mm
- Bio17 = precipitation of driest quarter given in mm
- Bio18 = precipitation of warmest quarter given in mm
- Bio19 = precipitation of coldest quarter given in mm
- Cropland = proportion of raster cell that is designated as cropland
- DEM_elev = elevation given in m
- Evgrn Broadleaf = proportion of raster cell that is designated as evergreen broadleaf forest
- Evgrn Needleleaf = proportion of raster cell that is designated as evergreen needleleaf forest
- Decds Broadleaf = proportion of raster cell that is designated as deciduous broadleaf forest
- Grass/Shrubland = proportion of raster cell that is designated as grass/shrubland
- Igneous = proportion of raster cell that is designated as above ground igneous rock
- Metamorphic = proportion of raster cell that is designated as above ground metamorphic rock
- Sedimentary = proportion of raster cell that is designated as above ground sedimentary rock
- Pasture = proportion of raster cell that is designated as pasture
- Surface Water = proportion of raster cell that is designated as open surface water
- Prob of = the predicted probability of suitable habitat for each species in the raster cell, ranging from 0 to 1 (0 to 100%)
- P/A = the presence (1) or absence (0) of each species
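Two of the bioclimatic columns are defined in terms of others (Bio7 = Bio5 - Bio6 and Bio3 = Bio2/Bio7 x 100), which allows a quick sanity check on any row of the master table. The Python below is a sketch under assumptions: the row values are invented, the function names are hypothetical, and because the layers may have been resampled to the 19 x 19 km grid, the check uses a tolerance rather than exact equality.

```python
def bio7(bio5, bio6):
    """Temperature annual range: max of warmest month minus min of coldest month."""
    return bio5 - bio6


def bio3(bio2, bio7_value):
    """Isothermality as a percentage: (Bio2 / Bio7) x 100."""
    if bio7_value == 0:
        raise ValueError("Bio7 is zero; isothermality is undefined")
    return bio2 / bio7_value * 100


def is_consistent(row, tol=0.5):
    """Check that a row's Bio3 and Bio7 agree with their definitions within tol."""
    expected_bio7 = bio7(row["Bio5"], row["Bio6"])
    expected_bio3 = bio3(row["Bio2"], expected_bio7)
    return (
        abs(row["Bio7"] - expected_bio7) <= tol
        and abs(row["Bio3"] - expected_bio3) <= tol
    )


# Invented example row, not taken from master_values_table.csv.
row = {"Bio2": 14.0, "Bio3": 40.0, "Bio5": 35.0, "Bio6": 0.0, "Bio7": 35.0}
print(is_consistent(row))
```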
Codes folder
The R code sub-folders below are listed in workflow order
Layer prep sub-folder
- R code that prepares the environmental layers by masking them to North America and reprojecting them from WGS84 (EPSG:4326) to the North America Albers Equal Area Conic projection (ESRI:102008)
- Run wc_layer_prep.R first, then lithology_layer_prep.R, surface_water_layer_prep.R, landcover_layer_prep.R, and aridity_layer_prep.R
Occ record and background prep sub-folder
- R code, background_points.R, that prepares background points for MaxEnt models
- R code, Coords_cleaner.R, that prepares the occurrences for MaxEnt models by highlighting dubious records for further review
- R code, Coords_NA.R, run after the final dubious records have been removed, that snaps records falling outside the raster data to the nearest raster cell with data
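The snapping step can be illustrated with a brute-force nearest-neighbor search: for a record that falls outside the raster data, pick the cell center with data that is closest in projected space. The Python below is a sketch of that idea only. It is not a translation of Coords_NA.R, and the function name and coordinates are hypothetical.

```python
import math


def snap_to_nearest(point, data_cell_centers):
    """Return the center of the nearest raster cell that has data.

    Distances are Euclidean, which is reasonable for coordinates already in
    a projected system measured in meters.
    """
    if not data_cell_centers:
        raise ValueError("no raster cells with data were supplied")
    return min(data_cell_centers, key=lambda center: math.dist(point, center))


# Hypothetical cell centers on a 19 km grid and one stray record.
centers = [(0.0, 0.0), (19_000.0, 0.0), (0.0, 19_000.0)]
print(snap_to_nearest((18_000.0, 1_000.0), centers))
```

A linear scan is adequate for a few thousand records and cells; a spatial index would be the natural replacement for larger inputs.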
Maxent tuning and models sub-folder
- R codes that tune and find optimized MaxEnt models for each species
- Most species were processed through MaxEnt models full.R, which tunes over the full set of feature class combinations and regularization multiplier values
- Species with more covariates than occurrences in complex models, here Diphylla ecaudata and Nyctinomops femorosaccus, were tuned with only the L and LQ feature classes through MaxEnt models reduced.R to avoid overfitting
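The choice between the full and reduced tuning sets amounts to building a grid of feature class and regularization multiplier combinations, with a smaller set of feature classes for data-poor species. The Python below sketches that logic. Only the reduced set (L and LQ) comes from the description above; the full feature class list, the regularization multiplier values, and the simple "more covariates than occurrences" rule are illustrative assumptions, not the settings used in the R scripts.

```python
from itertools import product

REDUCED_FC = ["L", "LQ"]                   # stated in this README
FULL_FC = ["L", "LQ", "H", "LQH", "LQHP"]  # assumed for illustration
RM_VALUES = [1.0, 2.0, 3.0]                # assumed for illustration


def tuning_grid(n_occurrences, n_covariates):
    """Return the list of {fc, rm} combinations to evaluate for one species.

    Uses the reduced feature classes when covariates outnumber occurrences,
    a simplified stand-in for the rule described above.
    """
    feature_classes = REDUCED_FC if n_covariates > n_occurrences else FULL_FC
    return [
        {"fc": fc, "rm": rm} for fc, rm in product(feature_classes, RM_VALUES)
    ]


# Hypothetical species: one with 10 thinned records, one with 200.
print(len(tuning_grid(n_occurrences=10, n_covariates=30)))
print(len(tuning_grid(n_occurrences=200, n_covariates=30)))
```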
Maps, final plots, and master table sub-folder
- R codes that gather results and make final figures
- R code, final_maps.R, that creates final figures for each species with a North American distribution map, North American cloglog MaxEnt prediction map, and Texas cloglog MaxEnt prediction map with state (level 1) and county boundaries (level 2)
- R code, variable_response_curves.R, to plot all the marginal response curves for each species
- R code, master_values_table.R, to create the master table, which contains all the environmental raster values, the presence/absence of each species, and the predicted probability of suitable habitat for each species in table format
File: Supplementary.zip
Occurrences folder
Final_Chiroptera_records.csv
- x and y = coordinates in North America Albers Equal Area Conic projection (ESRI:102008)
- gbifID = occurrence record ID on GBIF
- species = field from GBIF
- verbatimSpecies = standardized scientific names for species of each record
- year, month, and day = fields from GBIF describing the collection date of the specimen
- countryCode, county, locality, and verbatimLocality = fields from GBIF describing the collection locality of the specimen
- 'N/A' = not available; the value was not provided in the GBIF record downloaded for this species
total_thin_coords.csv
- Thinned coordinates, x and y, for each species that went into the MaxEnt models
Environmental rasters folder
- All rasters as geotiffs
- Layers have been trimmed from global extents to North America and are in North America Albers Equal Area Conic projection (ESRI:102008)
- Layers include measures of aridity, elevation, and 19 bioclimatic variables measuring temperature and precipitation as well as the proportion of above ground lithology, available surface water, and landcover type in each raster cell
- Landcover forest layers (forest) are divided by tree type, coniferous (con) and deciduous (dec), and by leaf type, broadleaf (broad) and needleleaf (needle)
- Above ground lithological layers (litho) divided into igneous, metamorphic, and sedimentary rock in each cell
Study area folder
study_area_whole.shp
- Shapefile with country (level 0) and state/province (level 1) boundaries
- Read in the .shp file; the .shx, .dbf, and .prj files must be in the same folder for it to load properly
- NAME_0 = country names and NAME_1 = state/province names
- geometry = the polygon shapes, which were used to trim the environmental data
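Because a shapefile is really a set of sibling files, it can help to confirm that the companion files are present before trying to load study_area_whole.shp. The Python below is a small sketch of such a check; the function name is hypothetical and the path in the usage line is a placeholder for wherever the folder was extracted.

```python
from pathlib import Path

# Companion files named in this README as required alongside the .shp file.
REQUIRED_SIDECARS = (".shx", ".dbf", ".prj")


def missing_sidecars(shp_path):
    """Return the required companion extensions not found next to the .shp file."""
    shp = Path(shp_path)
    return [ext for ext in REQUIRED_SIDECARS if not shp.with_suffix(ext).exists()]


# Placeholder path; adjust to the extracted location of the study area folder.
print(missing_sidecars("study_area_whole.shp"))
```

An empty list means all three companion files sit beside the .shp file.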
Code/software
- All data were prepared and all analyses were run in the R statistical software
Access information
Key Information Sources
Occurrence data gathered from:
- GBIF
Environmental raster data gathered from:
- WorldClim, historical climate, dataset of 19 bioclimatic variables at 10 arcminutes
- Version 3 of the Global Aridity Index and Potential Evapotranspiration Database (described in Scientific Data; figshare link in the paper)
- World Ecological Land Units (ELUs) 2015 from USGS Science Data Catalog
- HydroRIVERS and HydroLAKES from HydroSHEDS project and Global Lakes and Wetlands Database accessed through World Wildlife Fund
- Winkler, K. et al. (2020): HILDA+ Global Land Use Change between 1960 and 2019, data found on PANGAEA
Shapefiles for country, state/province, and county boundaries from:
- Database of Global Administrative Areas (GADM)
