
Data from: Evaluating the use of lidar to discern snag characteristics important for wildlife

Cite this dataset

Stitt, Jessica M. et al. (2022). Data from: Evaluating the use of lidar to discern snag characteristics important for wildlife [Dataset]. Dryad.


Standing dead trees (known as snags) are historically difficult to map and model using airborne laser scanning (ALS), or lidar. Specific snag characteristics are important for wildlife; for instance, a larger snag with a broken top can serve as a nesting platform for raptors. The objective of this study was to evaluate whether characteristics such as top intactness could be inferred from discrete-return ALS data. We collected structural information for 198 snags in closed-canopy conifer forest plots in Idaho. We selected 13 lidar metrics within 5 m diameter point clouds to serve as predictor variables in random forest (RF) models to classify snags into four groups by size (small [<40 cm diameter] or large [≥40 cm diameter]) and intactness (intact or broken top) across multiple iterations. We ran these models first with all snags combined, and then ran the same models with only small or only large snags. Overall accuracies were highest in RF models with large snags only (77%), but kappa statistics for all models were low (0.29–0.49). ALS data alone were not sufficient to identify top intactness for large snags; future studies that combine ALS data with other remotely sensed data to improve classification of snag characteristics important for wildlife are encouraged.


The study was conducted in closed canopy forest stands within the Idaho Panhandle National Forest (IPNF), Idaho, USA. Discrete-return airborne lidar (also airborne laser scanning, ALS) data were acquired as part of an ongoing effort by the U.S. Forest Service Rocky Mountain Research Station (USFS RMRS; Moscow, ID, USA) to develop advanced protocols for forest mapping and monitoring. The locations of ALS acquisitions were used as the basis for the two study sites (based on IPNF Ranger Districts): Coeur d’Alene River (CdA) and Saint Joe (StJ). The lidar extent of CdA spanned 78,706 ha to the east of Coeur d'Alene, ID, USA (centered at 47°44'40.0"N, 116°36'38.8"W). The lidar extent of StJ spanned 7,598 ha to the east of Avery, ID, USA (centered at 47°08'28.5"N, 116°05'38.9"W).

All snags evaluated for this study were located within 25 m fixed-radius survey plots. Survey plot locations were selected through a stratified random sampling design based on canopy cover and height metrics from ALS and constrained using criteria to enable subsequent point count surveys of forest birds, including a 300 m minimum distance from other plot centers and a road buffer (minimum of 60 m and maximum of 250 m away from roads). Plot centers were recorded using a global navigation satellite system (Trimble GNSS) receiver and differentially corrected in post-processing. Snags identified within a 25 m radius survey plot were measured, provided they satisfied the criteria, including (1) a minimum height equal to breast height (the height at which diameter at breast height, DBH, is measured; 1.37 m above ground), (2) a minimum DBH of 0.15 m, (3) a lean of less than 45° in any direction, and (4) no visible live parts (primarily green needles; red/brown needles were accepted and noted). The minimum DBH was selected based on criteria for woodpecker use. Snag locations were determined using the same Trimble GNSS receiver as used for plot centers, with the receiver positioned on the north side of the bole.
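The four snag inclusion criteria listed above can be expressed as a simple filter. The following is a minimal sketch in Python, not the authors' field protocol or code; the function name, argument names, and the treatment of each threshold as inclusive or exclusive are assumptions made for illustration.

```python
BREAST_HEIGHT_M = 1.37  # height above ground at which DBH is measured
MIN_DBH_M = 0.15        # minimum diameter at breast height
MAX_LEAN_DEG = 45.0     # lean must be strictly less than this


def is_eligible_snag(height_m, dbh_m, lean_deg, has_live_parts):
    """Return True if a field-measured snag meets all four stated criteria.

    Hypothetical helper: the argument names are illustrative and are not
    column names from the dataset.
    """
    return (
        height_m >= BREAST_HEIGHT_M   # (1) at least breast height
        and dbh_m >= MIN_DBH_M        # (2) minimum DBH of 0.15 m
        and lean_deg < MAX_LEAN_DEG   # (3) lean of less than 45 degrees
        and not has_live_parts        # (4) no visible live parts
    )
```

For example, a 12 m snag with a DBH of 0.30 m, a 10° lean, and no green needles passes, while the same snag with a DBH of 0.10 m does not.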

The georeferenced ALS data were processed at the survey plot level using the R environment (version 4.0.3). Point clouds were preprocessed using the lidR package within a 50 m radius of plot centers (0.8 ha plots) to generate buffered, height-normalized point clouds. All ground, near-ground, and understory returns below 1.37 m were excluded to retain only canopy returns. Outliers below 50 m height above ground were not removed as they may represent snag hits. Snag center coordinates were calculated by subtracting half the measured DBH from the northing value collected in the field, to correct for the GNSS receiver being positioned on the north side of the bole. Snag center coordinates were then used to clip the height-normalized ALS point clouds within each survey plot. Individual snag point clouds were clipped to a 2.5 m radius circular buffer around each snag center, in order to encompass the full tree footprint, offset any error in GNSS location, and preserve enough returns to compute all metrics.
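The northing correction and the circular clip described above can be sketched as follows. This is an illustrative Python stand-in, not the lidR workflow the authors ran in R; it assumes projected coordinates in metres, heights already normalized to height above ground, and points held as plain (x, y, z) tuples. The function names are hypothetical.

```python
import math

MIN_HEIGHT_M = 1.37   # returns below this height are excluded
CLIP_RADIUS_M = 2.5   # circular buffer around each snag center


def snag_center(easting, northing, dbh_m):
    """Shift the GNSS fix, taken on the north side of the bole, south by DBH/2."""
    return easting, northing - dbh_m / 2.0


def clip_snag_points(points, center, radius=CLIP_RADIUS_M, min_height=MIN_HEIGHT_M):
    """Keep returns inside the circular buffer and at or above min_height.

    points: iterable of (x, y, z) tuples with z as height above ground (m).
    """
    cx, cy = center
    return [
        (x, y, z)
        for x, y, z in points
        if z >= min_height and math.hypot(x - cx, y - cy) <= radius
    ]
```

With a field fix at (500000.0, 5200000.0) and a DBH of 0.5 m, the corrected center sits 0.25 m further south, and only returns within 2.5 m horizontally of that point are retained.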

Lidar-derived metrics were calculated for each 2.5 m-radius snag point cloud, with a minimum height set to 1.37 m above ground. A subset of the full suite of standard height and return-based metrics (as provided in the lidR package) was chosen alongside additional structural and topographic metrics for this study, in order to best evaluate differences among the four classes of snag structural characteristics. We focused on lidar metrics previously shown to be effective indicators of forest biomass across different forest types and those which would have specific utility for individual snags, as compared to individual live tree or plot-level analyses. In total, 13 metrics were used for our evaluation.

To analyze whether airborne lidar metrics could be used to discriminate between snag classes, we conducted Random Forest (RF) classification in the R environment with the packages randomForest and caret. For our response variable, we focused on two aspects from field inventory data for each snag: diameter and intactness. We divided the candidate snags into two categories for each of these variables, in order to have distinct groups to compare lidar metrics against. Snag diameter was divided into small (DBH less than 40 cm) and large (DBH greater than or equal to 40 cm) groups, based on literature suggesting these are ecologically relevant distinctions, as well as important for ALS detectability. Snag intactness was divided into intact (where the snag bole still had a visible top) and broken (where at least part of the bole had broken off). Therefore, the distinct snag classes we used for evaluation were “small intact” (SI), “small broken” (SB), “large intact” (LI), and “large broken” (LB). A total of 198 snags were used for these analyses.
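The four-class response variable follows directly from the 40 cm diameter threshold and the intact/broken distinction. A minimal Python sketch of that labeling is shown here; the function and argument names are hypothetical and are not taken from the dataset files.

```python
LARGE_DBH_CM = 40.0  # snags at or above this DBH are "large"


def snag_class(dbh_cm, top_intact):
    """Return one of "SI", "SB", "LI", "LB" for a single snag."""
    size = "L" if dbh_cm >= LARGE_DBH_CM else "S"
    top = "I" if top_intact else "B"
    return size + top
```

A snag with a DBH of exactly 40 cm and an intact top is labeled "LI", since the large group includes the threshold value.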

For validation purposes, RF models were run within a bootstrap with 20 iterations. For each iteration, we drew 60% of the snags (ntrain = 118) from the total available snag population (ntotal = 198) to train the RF model. The remaining snags not drawn per iteration (ntest = 80) were used for independent validation of the RF model. For each bootstrap iteration, confusion matrices of observed versus predicted classifications were used to calculate overall accuracy and the kappa statistic (a coefficient ranging from 0 to 1, where 0 is equated with random chance and 1 is perfect concordance), as well as producer’s accuracy (the complement of omission error) and user’s accuracy (the complement of commission error) for each class.
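The validation statistics named above can be computed from paired observed and predicted labels. The sketch below is plain Python rather than the randomForest and caret calls the authors used, and it does not fit a model. It assumes each iteration's training draw is made without replacement, which the reported counts (118 + 80 = 198) suggest but the text does not state outright. All function names are hypothetical.

```python
import random
from collections import Counter


def split_train_test(items, n_train, rng):
    """Randomly draw n_train items for training; the rest are held out."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def accuracy_metrics(observed, predicted):
    """Return (overall accuracy, kappa, producer's accuracy, user's accuracy)."""
    classes = sorted(set(observed) | set(predicted))
    n = len(observed)
    pairs = Counter(zip(observed, predicted))   # confusion matrix cells
    obs_totals = Counter(observed)              # totals per observed class
    pred_totals = Counter(predicted)            # totals per predicted class

    overall = sum(pairs[(c, c)] for c in classes) / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(obs_totals[c] * pred_totals[c] for c in classes) / (n * n)
    kappa = (overall - expected) / (1 - expected) if expected < 1 else float("nan")

    # Producer's accuracy = 1 - omission error; user's = 1 - commission error.
    producers = {c: pairs[(c, c)] / obs_totals[c] for c in classes if obs_totals[c]}
    users = {c: pairs[(c, c)] / pred_totals[c] for c in classes if pred_totals[c]}
    return overall, kappa, producers, users


# One iteration's split over 198 snag indices (the seed is arbitrary).
train_ids, test_ids = split_train_test(range(198), 118, random.Random(0))
```

Repeating the split 20 times with different random states and collecting the four statistics from each held-out set would mirror the structure of the validation described above.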

In addition to the confusion matrix, an evaluation of predictor variable importance was reported for the top RF model (RFALL) in terms of mean decrease in the Gini Index (a measure of the decrease in node impurity). To evaluate the importance of each predictor relative to one another among the four snag classes (and across models), the class-specific variable importance scores were standardized using the Model Improvement Ratio (MIR), whereby all raw variable importance scores are divided by the maximum score.
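The MIR standardization described above amounts to dividing every raw importance score by the largest one, so the top predictor scores 1. A minimal Python sketch follows; the metric names and values in the usage note are placeholders, not the study's 13 predictors or its results.

```python
def model_improvement_ratio(importance):
    """Scale raw variable importance scores by the maximum score.

    importance: dict mapping predictor name to raw importance
    (for example, mean decrease in Gini).
    """
    top = max(importance.values())
    return {name: score / top for name, score in importance.items()}
```

For example, placeholder scores of 4.0, 2.0, and 1.0 for three hypothetical metrics become 1.0, 0.5, and 0.25, which makes importance comparable across classes and models.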

We conducted additional RF modeling on subsets of the snags by diameter, modeling only small snags (ntotal = 82) and only large snags (ntotal = 116) against the same suite of predictor variables and using the same RF parameters and validation methods. The only change was that classification was reduced from the four snag classes to only two based on snag intactness (intact versus broken top) for each model. The confusion matrix and predictor variable importance tables for the top RF models (small snags = RFSM; large snags = RFLG) were also reported, which let us further evaluate which of the 13 variables were most important for each size class.

Usage notes

Please see ReadMe files for additional information.


National Science Foundation

Idaho Space Grant Consortium

NASA, Award: NNH15AZ06I