Data from: Predicting road encounter hotspots for infrequently detected species using opportunistic data – a case study with Blanding’s turtle (Emydoidea blandingii)

Jackson, Sean1; Burrows, Alexandra1; Johnson, Glenn2; McCluskey, Eric3; Langen, Tom1

Published Mar 13, 2026 on Dryad. https://doi.org/10.5061/dryad.fj6q5747t

Data files

Mar 13, 2026 version files (354.06 KB total)

Data_Jacksonetal__PredictingBlandingsHotspots.ods (343.95 KB)

README.md (10.11 KB)

Abstract

For road mitigation measures to prevent roadkill and conserve landscape connectivity to be effective, the measures must be located where animals are most likely to encounter roads. However, accurate identification of road encounter hotspots is difficult when presence records are sparse and collected haphazardly, which is often the case with small, uncommon species. Blanding’s Turtle (Emydoidea blandingii) is a threatened species for which road mortality contributes to population declines. Using opportunistic detections of Blanding’s Turtle along roads, we investigated whether it is possible to predict road encounter hotspots throughout an extensive road network with such data. First, we used general linear modeling (GLM) to infer landscape features associated with Blanding’s Turtle road encounter records. After locating spatial clusters of encounters, GLM was used to identify landscape features associated with these hotspots. Next, Blanding’s Turtle's least cost movement paths were delineated within the landscape, and sites where paths crossed roads were located. Blanding’s Turtle locations were positively associated with proximity and extent of wetlands, and negatively associated with grasslands and developed land use. Hotspots were located along predicted Blanding’s Turtle least cost movement paths, indicating that behavioral movement models are useful for predicting encounter locations. A significant fraction of road encounter records came from a small number of hotspot sites, located along the predicted movement paths. We conclude that it is possible to generate predictive models of road encounter hotspots even when data are sparse, collected opportunistically, and subject to spatial biases in reporting across a road network. These models can be applied throughout a road network to identify road segments that are good candidates for effective road mitigation.

Dataset DOI: 10.5061/dryad.fj6q5747t

Description of the data and file structure

We used a dataset of georeferenced, opportunistically detected road encounter records of Blanding's turtle (Emydoidea blandingii) (BT) from the St. Lawrence River Valley of New York State, including roadkill and live turtles on or near a roadway, to evaluate whether such data can be used to validly predict where the species most frequently encounters roads throughout a road network, and hence where mitigation may be most effective for conservation. First, we used a GLM approach to infer landscape features associated with the BT road encounters in our database. Second, we located spatial clusters of road encounters in the database, and again statistically identified landscape features associated with these hotspots. Third, we predicted BT movement patterns within the landscape, and located areas where movement paths encounter roadways. Finally, we compared the predicted hotspots based on the modeled movement trajectories with the actual locations of BT road encounter hotspots, as indicated in our database. Our goal was to answer the question: Is it possible to validly predict road encounter hotspots throughout an extensive road network for an infrequently detected species using sparse, haphazardly collected data?

Note: Georeferenced Blanding's turtle locations are sensitive data because of active poaching for the pet trade. Location coordinates have been removed from all data. A georeferenced version of the data file is available for legitimate research purposes by request to the corresponding author. All Blanding's turtle georeferenced records are also archived with the New York State Heritage program, where they can be accessed for legitimate research purposes.

Files and variables

File: Data_Jacksonetal__PredictingBlandingsHotspots.ods

Description: An OpenDocument spreadsheet (.ods) file with worksheets that provide the data used in the analyses presented in Jackson et al., Predicting Road Encounter Hotspots for Infrequently Detected Species with Opportunistic Data – a Case Study with Blanding’s Turtle (Emydoidea blandingii), to be published in Ecology & Evolution. The methods used to generate and analyze these data are described in detail in the paper.

Worksheet 1: Metadata: Metadata for all of the worksheets.

Worksheet 2: Point LocationsLULC: Land use and land cover (LULC) around the turtle encounter locations and pseudo-random null points. See Table 3 in the paper.

Worksheet 3: PairedLocationsLULC: Same information as Worksheet 2, but restricted to turtle points that have one or more paired null points. See Table 4 in the paper.

Worksheet 4: TurtleNullModelAICResults: Results of exploratory multiple logistic regression model selection comparing turtle points versus null (pseudo-random) points. See Table 5 in the paper.

Worksheet 5: LogisticModelPredvActual: Selected logistic model's prediction of each point (turtle or null) in comparison to its actual category. See section 3.1 in the paper.

Worksheet 6: HotspotvRandomLCP: LULC and distance to least cost paths for hotspots and random points. See Table 6 and sections 3.2, 3.3 in the paper.
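The comparison in Worksheet 5 (model output versus actual point category) can be illustrated with a minimal Python sketch. It cross-tabulates the Logistic output against PointType. The 0.5 cutoff, the function name `confusion_counts`, and the four example values are assumptions made for illustration only; they are not taken from the paper or the data file.

```python
from collections import Counter


def confusion_counts(logistic_outputs, point_types, threshold=0.5):
    """Count (actual, predicted) pairs for a binary turtle/null classification.

    threshold=0.5 is an assumed cutoff, not necessarily the one used in the paper.
    """
    counts = Counter()
    for p, actual in zip(logistic_outputs, point_types):
        predicted = 1 if p >= threshold else 0  # 1 = turtle, 0 = null
        counts[(actual, predicted)] += 1
    return counts


# Made-up illustrative values, NOT taken from the dataset.
outputs = [0.91, 0.35, 0.62, 0.08]  # "Logistic" column (0-1)
actual = [1, 1, 0, 0]               # "PointType" column (1 = turtle, 0 = null)

table = confusion_counts(outputs, actual)
# Share of points whose predicted category matches the actual category.
accuracy = (table[(1, 1)] + table[(0, 0)]) / len(actual)
```

Here `table[(1, 0)]` counts turtle points that the model scored below the cutoff, and `table[(0, 1)]` counts null points scored at or above it.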

Metadata

Variable	Description
Type	Point type (i.e. turtle or null)
Nearest Wetland (meters)	Distance from the point to the nearest wetland
Opposite Wetland (meters)	Distance from the point to the wetland on the opposite side of the roadway from the nearest wetland
Distance Btwn 2 (meters)	Distance between the two wetlands on opposite sides of the roadway, measured as the displacement between them
Notes	Notes recorded explaining the point's location and surroundings
50km	Indicates the beginning of landcover classes for the 50 km buffer
OPEN WATER	Land classified by the National Land Cover Data (NLCD) as open water
DEVELOPED OPEN SPACE	Land classified by the NLCD as developed open space
DEVELOPED	Land classified by the NLCD as developed
DECIDUOUS FOREST	Land classified by the NLCD as deciduous forest
EVERGREEN FOREST	Land classified by the NLCD as evergreen forest
MIXED FOREST	Land classified by the NLCD as mixed forest
SHRUB/SCRUB	Land classified by the NLCD as shrub/scrub
PASTURE/GRASSLAND	Land classified by the NLCD as pasture/grassland
CULTIVATED CROPS	Land classified by the NLCD as cultivated crops
WOODY WETLAND	Land classified by the NLCD as woody wetland
EMERGENT HERBACEOUS WETLANDS	Land classified by the NLCD as emergent herbaceous wetland
100km	Indicates the beginning of landcover classes for the 100 km buffer
250km	Indicates the beginning of landcover classes for the 250 km buffer
Model	Type of model
AIC	Akaike Information Criterion estimator of prediction error
Delta AIC	Difference between the AIC of the best model and the model being compared
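exp(-5*DeltaAC)	Relative likelihood of the model, conventionally computed as exp(-0.5 × Delta AIC)
AIC Weight	Akaike weight: the model's relative likelihood divided by the sum of relative likelihoods over all candidate models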
Near	Nearest Wetland (meters)
Opp	Opposite Wetland (meters)
Btwn	Distance Between Wetlands (meters)
Dev	Developed land (% of 250 m buffer)
Shr	Shrub/Pasture (% of 250 m buffer)
Wet	Wetland (% of 250 m buffer)
PointType	Type of point (turtle present = 1, turtle absent/null = 0)
Columns B to G	Same variables as Sheet 4
Formula	Logistic regression model (see Section 3.1 of the paper)
Logistic	Output of the logistic model (0–1). Values closer to 1 indicate higher likelihood of a turtle point
Class	Hotspot or Random point
Columns C to P	% coverage by NLCD class within a 250 m buffer around the point
Distance to LCP	Distance to the nearest least cost path (meters)
Hotspot Length	Length of a hotspot (meters)
Cluster Strength	Relative density of points within a hotspot cluster; higher values indicate stronger clustering
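
The model selection columns listed above (AIC, Delta AIC, relative likelihood, AIC Weight) are related by the standard Akaike weight calculation. The following Python sketch shows that relationship. The function name `akaike_weights` and the three AIC scores are hypothetical and are not values from Worksheet 4.

```python
import math


def akaike_weights(aic_values):
    """Return (delta AIC, relative likelihood, Akaike weight) lists for candidate models."""
    best = min(aic_values)                          # lowest AIC = best-supported model
    deltas = [a - best for a in aic_values]         # Delta AIC
    rel = [math.exp(-0.5 * d) for d in deltas]      # relative likelihood
    total = sum(rel)
    weights = [r / total for r in rel]              # weights sum to 1
    return deltas, rel, weights


# Hypothetical AIC scores for three candidate models (illustration only).
example_aic = [100.0, 102.0, 110.0]
deltas, rel_lik, weights = akaike_weights(example_aic)
```

The best model always has a Delta AIC of 0 and a relative likelihood of 1, so its weight is the largest of the candidate set.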

Code/software

The data file is provided in an open format: OpenDocument Spreadsheet (.ods).
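
Because an .ods file is a ZIP archive containing a `content.xml` document, the worksheet names can be listed with the Python standard library alone. The sketch below is one possible way to do this; the helper name `list_ods_sheets` is an assumption introduced here, not part of the dataset.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument table namespace, as defined by the ODF specification.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"


def list_ods_sheets(path):
    """Return the worksheet names of an .ods file, in document order."""
    # Sheet contents are stored in content.xml inside the ZIP archive.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return [
        table.get(f"{{{TABLE_NS}}}name")
        for table in root.iter(f"{{{TABLE_NS}}}table")
    ]
```

Calling `list_ods_sheets("Data_Jacksonetal__PredictingBlandingsHotspots.ods")` on a local copy of the data file should return the worksheet names. For loading the cell values themselves, pandas `read_excel` with `engine="odf"` (which requires the odfpy package) is a common alternative.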