Data from: Factors influencing spatial and temporal patterns of Lanius ludovicianus (Loggerhead Shrike) occupancy at a grassland-sagebrush ecotone
Data files
Jan 14, 2026 version files (307.22 KB total)
- LOSH_2015-2024.csv (302.10 KB)
- README.md (5.12 KB)
Abstract
This dataset contains 10 years (2015–2024) of spatially referenced detection and non-detection data for Lanius ludovicianus (Loggerhead Shrike) collected during point-count surveys at a grassland–sagebrush ecotone in northeastern Wyoming, USA. Surveys were conducted at fixed points spaced 250 m apart, with repeated visits during some years (2015–2017) and single visits during others (2018–2024). Each record includes survey location, date, observer information, and whether L. ludovicianus was detected, along with associated site-level habitat measurements (e.g., vegetation structure, fence length, tree presence) and survey covariates (e.g., cloud cover, day of year). These data were used to estimate detection, initial occupancy, colonization, and local extinction probabilities within a dynamic (multi-season) occupancy modeling framework, while accounting for imperfect detection and spatial autocorrelation among adjacent survey points.
Dataset DOI: 10.5061/dryad.rbnzs7hr9
Description of the data and file structure
The data were collected as part of a long-term monitoring project designed to evaluate Loggerhead Shrike (Lanius ludovicianus) occupancy dynamics in a grassland–sagebrush ecotone of northeastern Wyoming, USA. Standardized point-count surveys were conducted annually from 2015 to 2024, with repeated visits in some years to estimate detection probability. Habitat characteristics (e.g., vegetation structure, tree presence, fence length) and survey conditions (e.g., cloud cover, date) were recorded to assess their influence on shrike detection and site use over time.
Files and variables
File: LOSH_2015-2024.csv
Description: This file contains point-count survey data on Loggerhead Shrikes (Lanius ludovicianus) collected from 2015–2024 at a grassland–sagebrush ecotone in northeastern Wyoming, USA. Each row represents a survey point within a transect, with information on detections/non-detections, survey conditions, observer identity, and associated habitat variables. The dataset is structured for dynamic occupancy analysis, with paired columns (_1 and _2) indicating multiple surveys conducted at the same point within a given year. If a site was surveyed only once in a given year, the corresponding _2 column values are recorded as NA.
Variables
- Transect: Transect identifier.
- Point: Point-count location identifier within transect.
- Year_1 / Year_2: Survey years corresponding to each sampling occasion in a dynamic occupancy framework.
- Date_Year_1 / Date_Year_2: Calendar date of survey.
- Obs_Year_1 / Obs_Year_2: Observer conducting the survey.
- Time_Year_1 / Time_Year_2: Time of survey.
- Sky_Year_1 / Sky_Year_2: Sky condition during survey (categorical code, e.g., clear, partly cloudy, overcast).
- Wind_Year_1 / Wind_Year_2: Wind condition during survey (categorical code or Beaufort scale value).
- Temp_Year_1 / Temp_Year_2: Air temperature (°C) at time of survey.
- Year_1_AC / Year_2_AC: Binary variable indicating whether Loggerhead Shrike was detected at any adjacent (neighboring) survey point (1 = yes, 0 = no).
- Veg_Height: Average height of herbaceous vegetation (cm).
- Fence_Leng: Total length of fence within survey point buffer (m).
- Tree_Area_Bin: Binary variable indicating presence (1) or absence (0) of a tree within survey point buffer.
- Herb_Year: Percentage of herbaceous ground cover (%) during survey year.
- LTR_Year: Percentage of litter ground cover (%) during survey year.
- SHR_Year: Percentage of shrub ground cover (%) during survey year.
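The paired-visit layout described above can be turned into a site-by-occasion detection matrix with a few lines of base R. The following is a minimal sketch on a toy data frame; the detection column names (`Det_2015_1`, etc.) and the chosen years are hypothetical placeholders for illustration, so substitute the actual headers in LOSH_2015-2024.csv.

```r
# Toy stand-in for the CSV; column names here are hypothetical placeholders,
# not the actual headers in LOSH_2015-2024.csv.
losh <- data.frame(
  Transect   = c("A", "A", "B"),
  Point      = c(1, 2, 1),
  Det_2015_1 = c(0, 1, 0),
  Det_2015_2 = c(1, 1, NA),   # NA = no second visit at this point
  Det_2018_1 = c(0, 0, 1),
  Det_2018_2 = c(NA, NA, NA)  # single-visit year: second visit is always NA
)

years  <- c(2015, 2018)
visits <- 1:2

# Order columns as: year 1 visit 1, year 1 visit 2, year 2 visit 1, ...
det_cols <- paste0("Det_",
                   rep(years, each = length(visits)), "_",
                   rep(visits, times = length(years)))

# Rows = survey points, columns = sampling occasions
y <- as.matrix(losh[, det_cols])
rownames(y) <- paste(losh$Transect, losh$Point, sep = "_")

# Number of completed visits per point and year (non-NA entries)
n_visits <- sapply(years, function(yr) {
  cols <- grep(paste0("_", yr, "_"), colnames(y))
  rowSums(!is.na(y[, cols, drop = FALSE]))
})
colnames(n_visits) <- years

print(y)
print(n_visits)
```

Keeping the NA entries, rather than dropping rows, preserves the distinction between "surveyed, not detected" (0) and "not surveyed" (NA).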
Code/software
The dataset is provided in .csv format and can be opened with any standard text editor, spreadsheet program (e.g., Microsoft Excel), or statistical software. All analyses for the associated study were conducted in R (version 4.3.1; R Core Team 2023), an open-source statistical programming language freely available at https://www.r-project.org.
Analyses were implemented primarily using the unmarked package for dynamic occupancy modeling, along with several additional packages for data cleaning, visualization, and model selection. The following R packages were loaded during the workflow:
- unmarked (1.4.1) – occupancy and abundance models
- ggplot2 (3.5.0) – data visualization
- tidyr / tidyverse – data wrangling
- lubridate – date and time handling
- dplyr – data manipulation
- corrplot – correlation visualization
- PerformanceAnalytics – descriptive statistics and correlations
- ggpubr – publication-ready graphics
- scales – formatting scales in plots
- stringr – string manipulation
- MuMIn – model selection and averaging
- AICcmodavg – model comparison and AICc-based inference
- bbmle – likelihood-based model fitting
- graphics – base R graphics
- readxl – import of Excel files (ancillary data)
Workflow: Survey data were first imported and cleaned using readxl, tidyverse, dplyr, and lubridate. Habitat covariates were merged and formatted for occupancy analysis. Detection/non-detection data were structured into the appropriate format for multi-season occupancy modeling using unmarked. Candidate model sets were compared using MuMIn and AICcmodavg, and diagnostic plots and visualizations were generated using ggplot2, ggpubr, and corrplot.
No proprietary software is required to access or reuse the dataset; only free and open-source tools were used in the analysis.
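As a rough illustration of the structuring step in the workflow, the sketch below builds an `unmarkedMultFrame` from simulated data. All values are randomly generated placeholders, not study data, and the dimensions (4 points, 3 years, 2 visits) and the covariate names `sky` and `date` are assumptions chosen to mirror the example model; the authors' actual preparation code may differ.

```r
library(unmarked)

set.seed(1)
n_sites  <- 4  # survey points
n_years  <- 3  # primary periods
n_visits <- 2  # maximum visits per year

# Detection histories: columns ordered year 1 visit 1, year 1 visit 2, ...
y <- matrix(rbinom(n_sites * n_years * n_visits, 1, 0.4),
            nrow = n_sites, ncol = n_years * n_visits)
y[, 6] <- NA  # year 3 had a single visit, so its second visit is NA

# Site-level covariates: one row per point (simulated values)
site_covs <- data.frame(
  Veg_Height    = c(12, 30, 18, 25),
  Fence_Leng    = c(0, 150, 40, 0),
  Tree_Area_Bin = c(0, 1, 0, 1)
)

# Yearly site covariates: one points-by-years matrix per covariate
yearly_covs <- list(
  LTR = matrix(runif(n_sites * n_years, 0, 60), n_sites, n_years),
  SHR = matrix(runif(n_sites * n_years, 0, 40), n_sites, n_years)
)

# Observation covariates: same shape as y
obs_covs <- list(
  sky  = matrix(sample(0:2, length(y), replace = TRUE), n_sites),
  date = matrix(sample(140:190, length(y), replace = TRUE), n_sites)
)

umf <- unmarkedMultFrame(
  y              = y,
  siteCovs       = site_covs,
  yearlySiteCovs = yearly_covs,
  obsCovs        = obs_covs,
  numPrimary     = n_years
)
summary(umf)
```

The `numPrimary` argument tells unmarked how to split the columns of `y` into years, which is why the column ordering of the detection matrix matters.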
Example model:
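library(unmarked)

# Covariate names in each formula must match the names stored in umf
Model <- colext(
  psiformula     = ~ Veg_Height + Fence_Leng + Tree_Area_Bin,  # initial occupancy
  gammaformula   = ~ LTR,                                      # colonization
  epsilonformula = ~ SHR,                                      # local extinction
  pformula       = ~ sky + date + AC,                          # detection probability
  data           = umf                                         # unmarkedMultFrame object
)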
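To show how the parameters in the example model relate to one another, the sketch below applies the standard dynamic occupancy recursion, psi[t+1] = psi[t] * (1 - epsilon) + (1 - psi[t]) * gamma, with constant rates. The function name `project_occupancy` and the parameter values are illustrative assumptions, not estimates from the associated study.

```r
# Project occupancy forward from initial occupancy (psi1) under constant
# colonization (gamma) and local extinction (epsilon) probabilities.
project_occupancy <- function(psi1, gamma, epsilon, n_years) {
  psi <- numeric(n_years)
  psi[1] <- psi1
  for (t in seq_len(n_years - 1)) {
    # Occupied sites persist with prob. (1 - epsilon);
    # unoccupied sites are colonized with prob. gamma.
    psi[t + 1] <- psi[t] * (1 - epsilon) + (1 - psi[t]) * gamma
  }
  psi
}

# Illustrative values only
traj <- project_occupancy(psi1 = 0.30, gamma = 0.10, epsilon = 0.20, n_years = 10)

# Long-run equilibrium occupancy under constant rates
equilibrium <- 0.10 / (0.10 + 0.20)

print(round(traj, 3))
print(round(equilibrium, 3))
```

With covariates on gamma and epsilon, as in the example model, these rates vary by site and year, so the recursion would be applied per site using the fitted values instead of single constants.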
