Benefits of modelling abundance for rare species conservation: a case study with multiple birds across one million hectares
Data files
Nov 22, 2024 version files 565.24 KB
- BIRD_OBSERVATIONS_CHESTNUT_QUAILTHRUSH.csv (10.66 KB)
- BIRD_OBSERVATIONS_CRESTED_BELLBIRD.csv (10.65 KB)
- BIRD_OBSERVATIONS_MALLEE_EMUWREN.csv (10.66 KB)
- BIRD_OBSERVATIONS_MALLEE_STRIATED_GRASSWREN.csv (10.66 KB)
- BIRD_OBSERVATIONS_REDLORED_WHISTLER.csv (10.66 KB)
- BIRD_OBSERVATIONS_SHY_HEATHWREN.csv (10.66 KB)
- BIRD_OBSERVATIONS_SLENDERBILLED_THORNBILL.csv (5.80 KB)
- BIRD_OBSERVATIONS_SOUTHERN_SCRUBROBIN.csv (10.66 KB)
- OBSERVATION_COVATIATES_SLENDERBILLED_THORNBILL.csv (74.64 KB)
- OBSERVATION_COVATIATES.csv (204.64 KB)
- README.md (10.94 KB)
- SITE_COVARIATES_SLENDERBILLED_THORNBILL.csv (77.41 KB)
- SITE_COVARIATES.csv (117.22 KB)
Abstract
Aim: Many management programs that are based on the needs of rare or threatened species are ineffective because they fail to collect enough data to reliably estimate abundance and map distributions for their target species. Information that does exist for rare species is often based on presence-only data, because it is difficult to collect sufficient data on abundance for such species. We targeted ten rare bird species that were excluded from a recent study due to insufficient data. For these species, we aimed to (a) collect sufficient abundance data, (b) identify important locations and (c) estimate population sizes.
Location: A large reserve system (~1M-ha) in south-eastern Australia.
Methods: We undertook intensive field surveys, using repeat area searches of 660 independent 25-ha sites, totalling 2,640 hrs of surveys (2-hr surveys; two surveys per site). We used N-mixture models to estimate abundance whilst accounting for imperfect detection.
Results: This survey effort returned enough high-quality data on nine rare bird species to identify important locations and estimate their population sizes. To illustrate potential applications of mapped important locations, we used our results to assess the likely impact of a planned burn program in part of the study region. We identified planned burns that are likely to have a significant impact on important locations for rare species that may not have otherwise been identified.
Populations were generally larger than previously estimated using expert opinion. For example, our population estimate for the threatened Red-lored Whistler (Pachycephala rufogularis) was ~16 times larger than the previous estimate.
Main Conclusions: Our results show (a) the benefits of using abundance to identify important locations for rare species, (b) the value of developing bespoke survey methods for estimating abundance of rare species with low detectability and (c) a pathway for the application of mapped important locations in conservation land management.
https://doi.org/10.5061/dryad.c59zw3rgg
This dataset contains all the files required to run N-Mixture models for each target species in R ('unmarked' package; Fiske & Chandler 2011). Each species was analysed separately, and each species required three file types: Bird Observations (species counts per site per survey), Site Covariates (e.g. fire-age, elevation) and Observation Covariates (e.g. wind, time of day). Details of each file type are provided below.
Description of the data and file structure
The data files are formatted for N-Mixture modelling i.e. separate files for Bird Observations, Site Covariates and Observation Covariates.
BIRD_OBSERVATIONS_SPECIES_NAME.csv files (separate file for each species):
Each of these files includes the bird survey results for a single species. Each 25-hectare study site has a unique row. Each survey round has a unique column. The numerical values represent the number of individuals counted during a survey. "0" values indicate an absence during that survey, whereas blank cells indicate that no survey was undertaken for that survey round at that site. There are four columns because the maximum number of surveys at a site was four. However, there are many blank cells because most sites were only surveyed twice. This is the format required for input to the 'unmarked' package in R (Fiske & Chandler 2011).
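The distinction between "0" and a blank cell matters when the files are read outside R. The following is a minimal Python sketch of one way to preserve it, assuming the file has a header row and exactly four count columns; the header names and the sample rows are hypothetical and not taken from the dataset.

```python
import csv
import io

# Hypothetical excerpt in the layout described above: one row per site,
# one column per survey round. Header names are assumptions.
SAMPLE = """SURVEY_1,SURVEY_2,SURVEY_3,SURVEY_4
0,2,,
1,0,0,
0,,,
"""

def read_counts(handle):
    """Return one list per site; None marks 'no survey', 0 marks 'none counted'."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return [[int(cell) if cell.strip() else None for cell in row] for row in reader]

counts = read_counts(io.StringIO(SAMPLE))
# Number of surveys actually undertaken at each site.
surveys_per_site = [sum(y is not None for y in row) for row in counts]

print(counts[0])
print(surveys_per_site)
```

Keeping blanks as missing values rather than zeros prevents unsurveyed rounds from being treated as non-detections.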
OBSERVATION_COVARIATES.csv:
Each column in this file relates to a different observation covariate i.e. a variable related to the environmental conditions at the time of the survey. Each row represents a survey round at a unique site. As a result, there are four rows per site (because the maximum number of surveys at a site was four). Because not all sites had four surveys, some rows are blank (i.e. those rows where no survey was undertaken).
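Because this file has one row per site and survey round, while the bird observation files have one row per site, a covariate usually needs reshaping into a site-by-round layout before the two can be aligned. The sketch below shows one way to do that in Python. It assumes the SITE_ID_24ha and SurveyRound columns are filled in even for rounds with no survey; the site identifiers and values shown are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt: four rows per site, blank covariates where no survey occurred.
SAMPLE = """SITE_ID_24ha,SurveyRound,Month,Observer
BD_1_1,1,5,A
BD_1_1,2,5,B
BD_1_1,3,,
BD_1_1,4,,
BD_1_2,1,6,A
BD_1_2,2,6,A
BD_1_2,3,6,C
BD_1_2,4,,
"""

def to_wide(rows, column, n_rounds=4):
    """Reshape one covariate to {site: [round 1, ..., round 4]}, with None for blanks."""
    wide = defaultdict(lambda: [None] * n_rounds)
    for row in rows:
        value = (row[column] or "").strip()
        wide[row["SITE_ID_24ha"]][int(row["SurveyRound"]) - 1] = value or None
    return dict(wide)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
month = to_wide(rows, "Month")
observer = to_wide(rows, "Observer")
print(month["BD_1_1"])
```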
Descriptions of Observation covariates:
SITE_ID_24ha: String. Unique site identifier. The sites were 600 x 400 m i.e. 24 hectares. We added a 5 m buffer on all sides which made them 25 hectares during analysis.
SurveyRound: Numeric. Ranging from 1-4. Most sites were surveyed twice. The data is formatted so that all sites have four rows - one for each survey round. For sites with fewer than four surveys, some rows are left blank.
Region: String. BD = Big Desert (An overarching name for the reserve network found in the State of Victoria; the eastern half of the study region) or NGAR = Ngarkat (The name of the reserve found in the State of South Australia; the western half of the study region).
Set: Numeric. Sites were spatially arranged into sets of three that a single surveyor could cover in one day. Sites with the same Region name and Set number are in the same set and are therefore located near one another (within 3 km).
Site: Numeric. Ranging from 1-3. The unique identifier for each of the three sites within a Region and Set.
Decimal Start Time: Numeric. Minutes since midnight.
Average of Wind 6CATS: Numeric continuous. Wind speed was assessed by surveyors on a scale of 1-6. Wind speed was assessed 6 times per site survey (every 20 mins over the 2-hr survey). The average value of these 6 assessments was used in analysis.
Average of Wind 3CATS: Numeric continuous. Same as above, except that the six classes used in the assessment were dissolved into three: the first class included 1 and 2, the second 3 and 4, and the third 5 and 6.
Month: Numeric. Number represents the month of the year on a scale in which January is 1 and December is 12. Our surveys spanned months 5-10.
Season: Surveys divided by time of year into Spring or Autumn. Autumn surveys: May, June, July. Spring: August, September, October.
Decimal_Sunrise: Numeric. Minutes after midnight that sunrise occurred. Values were based on the month of the survey and the region.
MinutesSinceSunrise: Numeric. The start time of the survey in relation to sunrise. Negative values indicate that the survey started before sunrise.
MinutesSinceSunrise_negs_removed: Numeric. The start time of the survey in relation to sunrise. Same as above except that surveys that started before sunrise are given values of "0".
TimeSinceSunrise_2CATS: Numeric. Ranging from 1-2. Activity for many birds seemed greater around sunrise. This variable captures that field observation by classing the time into two classes: Sunrise = 1. All later times = 2.
TimeSinceSunrise_3CATS: Numeric. Ranging from 1-3. The survey design produced natural breaks in the data that made it appropriate to split survey start times into categories: Sunrise = 1, Mid-morning = 2 and Noon = 3.
Observer: Name of surveyor.
SKILL_LEVEL_2CATS: Bird ID Skill level of observer, as assessed by Simon Verdon (lead author). 2 classes: Beginner = 1; Expert = 2.
SKILL_LEVEL_3CATS: Bird ID Skill level of observer, as assessed by Simon Verdon (lead author). 3 classes: Beginner = 1; Intermediate = 2; Expert = 3.
X: Longitude. Decimal degrees.
Y: Latitude. Decimal degrees.
SpeciesSurveyed: Some surveyors (experts) recorded all species encountered ("All Species"). Remaining surveyors recorded only the target species ("Only Target Species").
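Several of the observation covariates above are simple transformations of others. The Python sketch below illustrates how they could be reproduced. It assumes that MinutesSinceSunrise is the start time minus the sunrise time, and that the three-class wind average is taken after collapsing each of the six assessments; neither step is spelled out in the descriptions, and the example values are hypothetical.

```python
def minutes_since_sunrise(start_minutes, sunrise_minutes):
    """Start time relative to sunrise; negative means the survey began before sunrise."""
    return start_minutes - sunrise_minutes

def negs_removed(minutes):
    """Surveys that started before sunrise are given a value of 0."""
    return max(0, minutes)

def collapse_wind(score):
    """Collapse the six wind classes into three: 1-2 -> 1, 3-4 -> 2, 5-6 -> 3."""
    if not 1 <= score <= 6:
        raise ValueError("wind score must be between 1 and 6")
    return (score + 1) // 2

def season(month):
    """May-July -> Autumn, August-October -> Spring, as defined for Season."""
    if month in (5, 6, 7):
        return "Autumn"
    if month in (8, 9, 10):
        return "Spring"
    raise ValueError("surveys spanned months 5-10 only")

# Hypothetical survey: started 06:20 (380 min), sunrise at 07:00 (420 min).
raw = minutes_since_sunrise(380, 420)
clipped = negs_removed(raw)

# Hypothetical wind assessments, one every 20 minutes over the 2-hr survey.
wind_scores = [1, 2, 2, 3, 4, 2]
wind_6cats = sum(wind_scores) / len(wind_scores)
wind_3cats = sum(collapse_wind(s) for s in wind_scores) / len(wind_scores)
print(raw, clipped, round(wind_6cats, 2), round(wind_3cats, 2))
```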
SITE_COVARIATES.csv:
Each column in this file relates to a different site covariate i.e. a variable related to the environmental conditions at each site. Each row represents a unique site.
Descriptions of site covariates:
UNIQUE SITE ID: String. Unique site identifier. The sites were 600 x 400 m i.e. 24 hectares. We added a 5 m buffer on all sides which made them 25 hectares during analysis.
SURVEY ROUND: All rows have the same value: "1". This column was used as a lookup value to extract one row per unique site from the survey data, which had four rows per unique site (one row for each survey round).
UNIQUE SURVEY ID: A site+survey code creating a unique identifier for each site and survey combination. This code adds the survey round number to the end of the unique site ID.
EASTING: Longitude. Decimal degrees.
NORTHING: Latitude. Decimal degrees.
EASTING INTEGER: Longitude converted to integer so that it can be included in analyses as a random effect. Some error distribution families can only work with whole numbers (e.g. Poisson), making this field necessary for some analyses.
VEG TYPE GROUND: Vegetation type as assessed in the field. This could not be used in analysis because we required remote-sensed data layers for all site covariates in order to produce maps and estimates of abundance across the entire reserve system.
VEG TYPE 3CAT GROUND: Vegetation type as assessed in the field. Dissolved into three broad classes: Heath, Mallee and Callitris. Like the variable above, this could not be used in analysis.
REGION: String. BD = Big Desert (An overarching name for the reserve network found in the State of Victoria; the eastern half of the study region) or NGAR = Ngarkat (The name of the reserve found in the State of South Australia; the western half of the study region).
TSF UNK 1950: Years since the last fire. Where there was no recorded fire history, the fire year was set to 1950 (72 years since fire at the time of the surveys).
MID SUCCESSIONAL: Binary. This field splits the fire-ages on offer into two classes: mid-successional (12-49 years since fire) and not mid-successional (0-11 years since fire and 50+ years since fire). This transformation can be used to more accurately assess the effect of time since fire on mid-successional species. Generalised Additive Models are a better approach, but N-mixture models are based on Generalised Linear Models i.e. they have limited capacity to identify mid-successional fire responses with untransformed data.
FIRETYPE UNK WILD2: Binary. 2 = Planned burn/management burn. 1 = Wildfire. Fires with no recorded firetype were classed as Wildfires.
FIRETYPE UNK WILD3: Binary. 0 = Planned burn/management burn. 1 = Wildfire. Fires with no recorded firetype were classed as Wildfires.
FIRETYPE UNK WILD2 TEXT: String. Same as above but string rather than binary data.
ANRAIN2: Mean annual rainfall at each site taken from the Australian Government Bureau of Meteorology.
VEG TYPE REMOTE2: Binary. 2 = Heath vegetation. 1 = Mallee vegetation.
VEGTYPE REMOTE3: Binary. 0 = Heath vegetation. 1 = Mallee vegetation.
VEGTYPE REMOTE2 TEXT: String. Same as above but string rather than binary data.
HA MALLEE 1KMR2: Count of Hectares of Mallee vegetation in the 1-km radius surrounding the site. Functions as a landscape level variable for vegetation type.
HA OLD 1KMR2: Count of Hectares of old growth vegetation (> 40 years since fire) in the 1-km radius surrounding the site. Functions as a landscape level variable for time since fire.
ELEV2: Elevation in metres above sea level.
HAR2: Height Above Rivers. An index of localised elevation where the baseline is set by the mean height of rivers in the region rather than metres above sea level.
RUGGED2: Terrain Ruggedness Index. An index of topographic complexity representing the elevational difference between each location (200 x 200 m raster cell) and the surrounding terrain (raster cells). Ruggedness at each location was assessed by comparing it to the surrounding 32 hectares (surrounding eight raster cells).
TWI2: Topographic Wetness Index. An index representing local topographic position (from exposed dune-top to sheltered swale). Based on the elevation and aspect of a cell compared to the surrounding 32 hectares (surrounding eight raster cells). This variable was rendered in SAGA GIS with default settings. This variable differs from Wind Exposure Index (below), in that the rain direction is not incorporated into the analysis (as is the case for wind exposure index).
WIND EXP MEAN: Wind Exposure Index. An index representing local topographic position (from exposed dune-top to sheltered swale). Based on the elevation and aspect of a cell compared to the surrounding 32 hectares (surrounding eight raster cells). This variable was rendered in SAGA GIS with default settings (Böhner & Antonić 2009).
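Some of the site covariates above are recodings of one another. The Python sketch below shows how the fire-related fields and the paired binary codings relate. It assumes a survey year of 2022 (consistent with "72 years since fire" for the 1950 default) and that mid-successional sites are coded 1; the coding direction for MID SUCCESSIONAL is not stated in the description.

```python
SURVEY_YEAR = 2022          # assumed from 72 years since a fire year of 1950
DEFAULT_FIRE_YEAR = 1950    # used where there was no recorded fire history

def years_since_fire(fire_year=None):
    """TSF UNK 1950: years since last fire, defaulting unknown fire years to 1950."""
    year = DEFAULT_FIRE_YEAR if fire_year is None else fire_year
    return SURVEY_YEAR - year

def mid_successional(tsf):
    """1 if 12-49 years since fire, else 0 (the 1/0 direction is an assumption)."""
    return 1 if 12 <= tsf <= 49 else 0

def recode_21_to_01(value):
    """Map the 2/1 coding to the 0/1 coding, e.g. FIRETYPE UNK WILD2 -> WILD3."""
    return {2: 0, 1: 1}[value]

print(years_since_fire())                  # unknown fire history
print(mid_successional(years_since_fire(2000)))
print(recode_21_to_01(2))                  # planned burn under the 0/1 coding
```

The same 2/1 to 0/1 mapping applies to the remote-sensed vegetation type fields, where Heath moves from 2 to 0 and Mallee stays at 1.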
Note that separate OBSERVATION_COVARIATES.csv and SITE_COVARIATES.csv files are provided for the Slender-billed Thornbill because we used a restricted dataset for this species. This was necessary because we only used data from expert surveyors for this species due to its similarity to the Buff-rumped Thornbill. These files are called OBSERVATION_COVARIATES_SLENDERBILLED_THORNBILL.csv and SITE_COVARIATES_SLENDERBILLED_THORNBILL.csv.
Sharing/Access information
There are no other publicly accessible locations of the data.
Code/Software
The data is formatted for use in R ('unmarked' package; Fiske & Chandler 2011).
This dataset comprises all the data required to run the N-Mixture models for each of the target species. It is formatted for this purpose, with separate files for bird observations, site covariates (e.g. fire-age, elevation) and observation covariates (e.g. wind, time of day). Details of the data collection methods are given below.
Site selection
We randomly sampled 660 sites (25 ha each; 410 x 610 m), stratified according to fire-age (years since fire) and fire type (planned burn or wildfire). Through stratification we attempted to balance the dataset by maximising the number of sites in uncommon fire-age classes and in planned burns, which were scarce compared to wildfires. All sites were separated by > 1 km. Sites were arranged in ‘sets’ of three so that a single surveyor could complete one set per day. At least one site per set was within 1 km of the nearest track.
Survey method
From May to October 2022, we conducted 1,346 surveys (2,692 hrs covering 33,650 ha). To achieve this survey effort, we conducted 10 trips (9 days each, 4-10 people per trip). In total there were 57 surveyors, 54 of whom were volunteers. To account for inter-observer variability, we rated the experience level of all surveyors and incorporated this into analyses.
Each surveyor surveyed one set of three sites per day. Surveys started 40 minutes before dawn (± 15 mins), resulting in three distinct survey periods labelled ‘dawn’, ‘mid-morning’ and ‘noon’. Sites were surveyed 2.04 times on average (once: 45 sites; twice: 545 sites; three times: 70 sites). Repeat surveys of the same site were conducted on consecutive days. The order in which sites were visited was reversed for each repeat survey day.
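The survey effort figures reported above can be checked with a few lines of arithmetic. The Python sketch below uses only numbers stated in the text. The per-site visit breakdown tallies to 1,345 surveys, one fewer than the 1,346 reported for total effort; the text does not explain the difference.

```python
SURVEYS = 1346            # total surveys reported
HOURS_PER_SURVEY = 2      # each survey was a 2-hr area search
SITE_AREA_HA = 25         # buffered site area used in analysis

total_hours = SURVEYS * HOURS_PER_SURVEY
total_area_ha = SURVEYS * SITE_AREA_HA

# Number of sites by number of visits, as reported.
sites_by_visits = {1: 45, 2: 545, 3: 70}
n_sites = sum(sites_by_visits.values())
n_visits = sum(visits * sites for visits, sites in sites_by_visits.items())
mean_visits = n_visits / n_sites

print(total_hours, total_area_ha)
print(n_sites, n_visits, round(mean_visits, 2))
```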
Surveys consisted of a single person conducting a 2 hr area search, walking 1,500-2,500 m per survey and recording counts (abundance) of each target species. To manage surveying such large sites, surveys were broken up into six consecutive 20 minute survey bouts, covering adjacent ~4 ha cells (~200 x 200 m). The data from the six survey bouts was combined at the end of each survey to form the site survey data. Each site was also given a 5 m buffer on all sides, so that birds recorded exactly on the boundary were counted as occurring within the site. For analysis, sites were therefore 410 x 610 m, i.e. ~25 ha. Survey bouts began when the surveyor entered the cell. The surveyor then conducted a 7 minute meandering area search on the way to the centre point (~200 m). At all times the surveyor was free to wander throughout the cell to search for birds and confirm bird species identities and abundances. At the centre-point of each cell, the surveyor conducted playback for eight of the ten target species (20 seconds per species, 20 second gap between species). The Crested Bellbird and Black-eared Miner were excluded from playback because it is ineffective for the Crested Bellbird, and playback of the Black-eared Miner can negatively affect detectability of other species due to its interspecific aggression (MF Clarke pers. comm.). The playback process at the centre point of each cell took ~6 minutes. This included time required to take site photos, record wind speed, take general notes and identify any birds detected. The surveyor then conducted another 7 minute meandering area search on the way to the cell boundary where that survey bout ended (also ~200 m).
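The timing and geometry described above fit together as follows. This short Python sketch uses only the figures stated in the text: two 7 minute searches plus ~6 minutes at the centre point per bout, six bouts per survey, 200 x 200 m cells, and a 5 m buffer around a 600 x 400 m site.

```python
SQ_M_PER_HA = 10_000

# One survey bout: search to the centre, playback and notes, search to the boundary.
bout_minutes = 7 + 6 + 7
survey_minutes = 6 * bout_minutes

# Cell and site areas in hectares.
cell_area_ha = 200 * 200 / SQ_M_PER_HA
core_area_ha = 600 * 400 / SQ_M_PER_HA
buffer_m = 5
buffered_area_ha = (600 + 2 * buffer_m) * (400 + 2 * buffer_m) / SQ_M_PER_HA

print(bout_minutes, survey_minutes)
print(cell_area_ha, core_area_ha, round(buffered_area_ha, 2))
```

Six 4 ha cells make up the 24 ha core, and the buffer brings the site to roughly 25 ha.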
Detecting birds through their calls is the most common form of detection for the target species. After initial detection by call, surveyors attempted to sight birds to confirm numbers. If a bird was heard in the 25 ha site but in an adjacent cell (i.e., not the cell currently being surveyed), the surveyor still recorded that individual as in the 25 ha site. Surveyors took great care to avoid double-counting birds in site surveys, using information relating to bird species mobility, direction and time of previous detection and movement of the surveyor in that time. We did not use external speakers to amplify playback because we wanted to minimise the risk of ‘calling birds in’ from outside the site, which would inflate estimates of bird abundance (Kéry & Royle 2015). We continued surveys in light rain. In moderate-heavy rain we paused surveys and waited for the rain to pass. In moderate-heavy and consistent rain we cancelled surveys for that day.
Covariates
We used abundance covariates (i.e. representing the environment at sites) and detection covariates (i.e. representing survey conditions). We used eight abundance covariates that were hypothesised to affect the occurrence and abundance of the target species. We only used abundance covariates that had associated spatial data because we aimed to extrapolate model predictions to estimate abundance across the entire study area. We transformed the spatial data for each abundance covariate to generate a single value for each 25 ha site (e.g. using the mean value for the site).
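For readers unfamiliar with N-mixture models, the two covariate types map onto the two parts of the model. The block below writes out the standard binomial N-mixture formulation as a sketch. The Poisson abundance distribution and the symbols are assumptions for illustration; the text does not state which abundance distribution was fitted for each species.

```latex
% i indexes sites, j indexes survey rounds at a site.
% N_i    : unobserved true abundance at site i
% y_{ij} : count recorded at site i in survey round j
\begin{align}
  N_i &\sim \mathrm{Poisson}(\lambda_i), \\
  y_{ij} \mid N_i &\sim \mathrm{Binomial}(N_i,\, p_{ij}), \\
  \log(\lambda_i) &= \beta_0 + \sum_{k} \beta_k \, x_{ik}
    && \text{(abundance covariates } x_{ik}\text{)}, \\
  \operatorname{logit}(p_{ij}) &= \alpha_0 + \sum_{l} \alpha_l \, w_{ijl}
    && \text{(detection covariates } w_{ijl}\text{)}.
\end{align}
```

Abundance covariates vary only by site, whereas detection covariates can vary by site and survey round, which is why the two are supplied in separate files.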
We used detection covariates to account for factors affecting detectability of species during surveys. Detection covariates were: Observer Skill (ordinal: Beginner, Intermediate, Expert; classified by the lead author), Time Of Day (ordinal: Dawn, Mid-Morning or Noon), Season (continuous numeric: Month of the Year: 5-10 corresponding to May-October) and Wind Speed (continuous numeric: scored on a qualitative scale but calibrated between observers during a workshop at the start of each trip: None = 0, Slight = 1, Breezy = 2, Gusty = 3, Strong Winds = 4, Gale = 5).
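To illustrate how repeat counts and detection covariates enter an N-mixture model, the Python sketch below computes the marginal likelihood of one site's counts. It assumes a Poisson abundance distribution, and the coefficients and wind values are hypothetical. It is not the code used for the analysis, which was run with the 'unmarked' package in R.

```python
import math

def inv_logit(x):
    """Convert a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def site_likelihood(counts, lam, p, K=100):
    """Marginal likelihood of one site's counts under a Poisson N-mixture model.

    counts: one entry per survey round; None marks a round with no survey.
    lam:    expected abundance at the site (must be > 0).
    p:      per-round detection probabilities, aligned with counts.
    K:      upper bound for the sum over the unobserved abundance N.
    """
    visits = [(y, p_j) for y, p_j in zip(counts, p) if y is not None]
    if not visits:
        return 1.0  # no surveys, so the site carries no information
    total = 0.0
    for n in range(max(y for y, _ in visits), K + 1):
        log_prior = -lam + n * math.log(lam) - math.lgamma(n + 1)
        detect = 1.0
        for y, p_j in visits:
            detect *= math.comb(n, y) * p_j ** y * (1.0 - p_j) ** (n - y)
        total += math.exp(log_prior) * detect
    return total

# Hypothetical site: two surveys, detection declining with wind speed.
wind = [1.5, 3.0, None, None]
p = [inv_logit(0.4 - 0.5 * w) if w is not None else None for w in wind]
likelihood = site_likelihood([0, 2, None, None], lam=1.8, p=p)
print(likelihood)
```

Rounds with no survey are skipped rather than treated as zero counts, so they do not affect the likelihood.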
