Data from: Social and ecological factors associated with innovation in urban sulphur-crested cockatoos (Cacatua galerita)
Data files
Dec 26, 2025 version files 374.55 MB
- README.md (12.13 KB)
- repository.zip (374.53 MB)
Abstract
Why some species thrive in urban environments while others do not is a central question in behavioural ecology. Behavioral innovations have been proposed as a key mechanism facilitating this adaptation. At the individual level, innovativeness varies with cognitive and behavioral traits. However, at the population level, innovation rates can also be influenced by social and ecological factors, including group size, and environmental novelty and complexity. The role of these factors is still under-explored, especially at within-city scales. To disentangle factors influencing group-level variation in innovation rates, we presented roosts of wild sulphur-crested cockatoos (Cacatua galerita) with extractive-foraging tasks that required innovative problem-solving. We installed three tasks of different levels of difficulty on trees at fifteen communal roost sites across an urban matrix. We matched these with direct measures of roost size and connectivity, and with high-resolution remote-sensing mapping to estimate variation in urbanization and environmental heterogeneity. We found that approach time was significantly associated with urbanization, with individuals in more urban sites approaching tasks more quickly, suggesting reduced neophobia with urbanization or increased familiarity with human-derived objects. In contrast, the time to innovate in our study was explained by task difficulty rather than environmental and social factors. While we detected no significant effects of group size, connectivity, and environmental heterogeneity, larger sample sizes may be needed to reveal more subtle influences on innovation. Together, these results suggest that urbanization gradients can shape behavioral responses to novelty independently of problem-solving abilities.
Description of the data and file structure
repository.zip contains data to support the study: Social and ecological factors associated with innovation in wild sulphur-crested cockatoos (Cacatua galerita)
Contact information
Primary contact:
Lisa Fontana
Research School of Biology
Australian National University
Canberra, ACT 2601, Australia
Email: lisa.fontana@anu.edu.au
Principal Investigator:
Dr. Lucy M. Aplin
Research School of Biology
Australian National University
Canberra, ACT 2601, Australia
Email: lucy.aplin@anu.edu.au
Study system
The experiment was conducted between 14 June and 24 October 2023, during the austral winter and early spring. Three tasks, one per difficulty level, were installed at 15 roost sites (45 tasks total) in Canberra, ACT, Australia. Tasks were randomly positioned to the west, east, or center of the roost relative to geographic north, at approximately two-thirds of tree height. Each box was monitored with a HyperFire 2™ camera trap (Reconyx Inc.) positioned approximately 1 m away on the same branch. Cameras were motion-activated with integrated infrared sensors for low-light recording. Installation time was recorded at each site, and puzzle boxes remained in place for up to 8 weeks or until solved.
ACCESS INFORMATION
1. Licenses/restrictions placed on the data
This dataset is released under the CC0 (Creative Commons Zero) license.
2. Data derived from other sources
Publicly available data sources:
- ESA WorldCover 2020 land cover data (10 m resolution). European Space Agency. https://esa-worldcover.org/en
- Microsoft Road Detections dataset (2020-2022). https://github.com/microsoft/RoadDetections
- 2024 Buildings © Geoscape, Australia. Building footprints, heights, and attributes used to calculate urbanization indices. Available from https://geoscape.com.au/
Note: Raw building data cannot be shared due to copyright restrictions. Final calculated urbanization indices for all study sites are provided in this repository.
DATA FILES AND VARIABLES
1. FinalDataset_SCC.csv and FinalDataset.csv
Raw video observation data from camera traps monitoring puzzle box tasks at 15 sulphur-crested cockatoo roost sites in Canberra, Australia (June-October 2023). Contains all recorded approaches by SC-cockatoos and other species.
Approach Definition and Categorization
An "approach" was defined as presence on camera in close proximity to the puzzle box setup.
- Start time: Subject's appearance on camera
- End time: Subject no longer visible for at least 15 seconds
- Approach number: number of approaches by SCC
Approaches were categorized as:
- presence: Proximity to setup without interaction with puzzle box
- attempt: Clear attention to or active attempt to solve the box
- solving: Successful solving of the puzzle and access to reward
Approaches categorized as presence or attempt were grouped together and treated the same way.
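The approach definition above can be sketched as a simple interval merge. This is an illustration in Python, not the coding procedure used for the videos (the repository scripts are in R); the function name `segment_approaches` and the example sightings are hypothetical, and only the 15-second threshold comes from the definition.

```python
from typing import List, Tuple

GAP_SECONDS = 15  # from the definition: not visible for at least 15 s ends an approach


def segment_approaches(sightings: List[Tuple[float, float]],
                       gap: float = GAP_SECONDS) -> List[Tuple[float, float]]:
    """Merge visibility intervals (start, end), in seconds, into approaches."""
    approaches: List[Tuple[float, float]] = []
    for start, end in sorted(sightings):
        if approaches and start - approaches[-1][1] < gap:
            # Subject reappeared before the gap threshold: same approach continues.
            prev_start, prev_end = approaches[-1]
            approaches[-1] = (prev_start, max(prev_end, end))
        else:
            # First sighting, or an absence of at least `gap` seconds: new approach.
            approaches.append((start, end))
    return approaches


# Hypothetical sightings (seconds since the start of a recording), not real data.
example = [(0, 20), (28, 40), (55, 70), (200, 230)]
approaches = segment_approaches(example)
```

Here the 8-second absence between the first two sightings keeps them in one approach, while the 15-second absence before the third sighting starts a new one.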
Species Notes-> Final_dataset_SCC.csv
Multiple species approached the boxes, but only sulphur-crested cockatoo (SC-cockatoo) data were included in the final analysis. Approaches by other species were documented but excluded from statistical models.
2. Roost.csv
Description
Final roost size estimates for all 15 focal roost sites. Combines counts performed specifically for this experiment with supplementary counts from concurrent lab projects.
Variables
- Roost: Roost site name/identifier (matches roost_id in other files)
- CODE: Roost site abbreviation code
- Lat: Roost site latitude (decimal degrees, WGS84)
- Long: Roost site longitude (decimal degrees, WGS84)
- Count: Final roost size used in analysis
* Range: 41-454 birds
* Median: 165 birds
3. RoostsCounts_Lisa.csv
Roost counts performed during the experiment period by primary researcher (LF).
- Roost: Roost site name
- Date: Date of count (YYYY-MM-DD)
- Time: Time period of count (dawn = sunrise departure count, dusk = sunset arrival count)
- Count: Number of sulphur-crested cockatoos counted
- Weather: Weather conditions during count
- Notes: Relevant observations
Count method: Birds were counted either departing roosts at sunrise or arriving at dusk.
Description
Roost counts performed specifically during the puzzle box experiment period (June-October 2023).
Variables
- Roost: Roost site identifier
- Count: Number of SC-cockatoos counted
- Lat: Roost latitude
- Long: Roost longitude
4. RoostCounts_Julia_Median.csv
Description
Additional roost counts from concurrent lab projects, used to estimate roost sizes (median) for sites where weather prevented counting on installation day.
Variables
- Roost: Roost site identifier (code)
- Date_count: Date of count (YYYY-MM-DD)
- Number of birds: Number of SC-cockatoos counted
- Median: median count
6. Envindexes.csv
Site-level environmental indices calculated from remote sensing and GIS data.
- roost_id: Roost site identifier
- roost_name: Roost site name
- latitude: Roost latitude (decimal degrees, WGS84)
- longitude: Roost longitude (decimal degrees, WGS84)
- urbanization_index: PC1 from PCA of land cover and building variables within 2 km radius (negative = suburban with high tree cover, positive = urban centers with tall buildings; range: -1.23 to 1.41, mean ± SD = 0.39 ± 0.80)
- environmental_entropy: O'Neill's absolute entropy calculated from land cover, buildings, and roads within 2 km radius (higher values = greater landscape heterogeneity)
Data sources: ESA WorldCover 2020 (land cover), Buildings © Geoscape Australia 2024 (buildings - proprietary), Microsoft Road Detections (roads). Raw building data cannot be redistributed; final indices provided.
Spatial scale: 2 km radius around each roost, approximating observed home range size.
7. network_metrics_[day/night].csv
Social network centrality metrics quantifying roost connectivity.
- Roost: Roost site name
- WEIGHTED_DEGREE: Weighted Freeman's degree centrality (sum of edge weights for roosts within 2.5 km; higher = more neighboring roosts)
- UNWEIGHTED_DEGREE: Unweighted Freeman's degree centrality (count of roosts within 2.5 km)
Network construction: Nodes = all known roost sites (27 total; 15 focal + 12 non-focal), Edges = connections between roosts within 2.5 km, Network type = undirected graph
Rationale: Sulphur-crested cockatoos visit multiple roosts and have overlapping foraging ranges; centrality serves as proxy for extended population size.
Analysis note: WEIGHTED_DEGREE was used as "roost connectivity" predictor in statistical models.
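The network construction described above can be illustrated with a short Python sketch of the unweighted degree (the repository itself uses the sna package in R). The coordinates and roost labels below are hypothetical, the great-circle distance is an assumption about how "within 2.5 km" is measured, and the edge weights behind WEIGHTED_DEGREE are not specified in this README, so they are not reproduced here.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
THRESHOLD_KM = 2.5           # edge threshold stated in this README


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in km between two (lat, long) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def unweighted_degree(roosts: Dict[str, Tuple[float, float]],
                      threshold_km: float = THRESHOLD_KM) -> Dict[str, int]:
    """Count, for each roost, the other roosts lying within the threshold."""
    return {
        name: sum(
            1
            for other, other_pos in roosts.items()
            if other != name and haversine_km(pos, other_pos) <= threshold_km
        )
        for name, pos in roosts.items()
    }


# Hypothetical roost coordinates (decimal degrees, WGS84), not taken from Roost.csv.
roosts = {
    "A": (-35.280, 149.130),
    "B": (-35.290, 149.140),
    "C": (-35.350, 149.200),
}
degree = unweighted_degree(roosts)
```

In this toy network A and B are roughly 1.4 km apart and therefore connected, while C is isolated.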
8. PUZZLEBOXES
Materials for replicating the experimental apparatus.
- SVG - Vector files (.svg) for laser cutting puzzle box components
- STL_Files - 3D printer files (.stl) for puzzle box moving parts
CODE SCRIPTS AND WORKFLOW
All analyses were conducted in R.
Recommended workflow:
1. Data preparation:
- DataCleaning.R: Processes raw observations into analysis-ready dataset
- networkmetrics_centrality_analysis.R: Calculates roost connectivity metrics
- EnvironmentalAnalysis/1_Data_cleaning_preparation.R: Prepares environmental data
- EnvironmentalAnalysis/2_UI.R: Calculates urbanization index via PCA
- EnvironmentalAnalysis/3_ONeill.R: Calculates O'Neill's entropy
2. Primary analyses (run after data preparation):
- Cox_Frequentist.R: Cox proportional hazards models (Model 1: time to approach, Model 2: time to solve)
- Cox_Bayesian.R: Bayesian survival analysis using brms (validation of frequentist results)
3. Supplementary analyses (optional):
- Supplementary/Number_approaches.R: Analysis of multiple approach events
- Supplementary/Engagement_data.R: Calculates cumulative interaction time
- Supplementary/Engagement_models.R: Models using engagement time metric
- Supplementary/graphs.R: Main text figures
- Supplementary/Scatterplot.R: Relationship visualizations
1. DataCleaning.R
Processes camera trap video observations into time-to-event format. Calculates approach and solving latencies from installation timestamps and video-coded events. Handles censoring for non-approached/non-solved tasks and camera failures. Merges with environmental and social predictors.
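As a rough illustration of the time-to-event format (not the logic of DataCleaning.R itself), the Python sketch below computes a latency in minutes and a censoring indicator. The function name, the timestamps, and the 1/0 status coding are assumptions; a camera failure could be handled by passing the failure time as `monitoring_end`.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def time_to_event(installed: datetime,
                  event: Optional[datetime],
                  monitoring_end: datetime) -> Tuple[float, int]:
    """Return (minutes since installation, status).

    status 1 = event observed; status 0 = right-censored at monitoring_end.
    """
    if event is not None and event <= monitoring_end:
        return (event - installed).total_seconds() / 60, 1
    # No event during monitoring: the task contributes its full observation time.
    return (monitoring_end - installed).total_seconds() / 60, 0


# Hypothetical timestamps, not taken from the dataset.
installed = datetime(2023, 6, 14, 8, 0)
monitoring_end = installed + timedelta(weeks=8)  # boxes stayed up to 8 weeks
first_approach = datetime(2023, 6, 14, 10, 30)

approach_row = time_to_event(installed, first_approach, monitoring_end)
unsolved_row = time_to_event(installed, None, monitoring_end)
```

In this example the approach latency is 150 minutes with status 1, and the unsolved task is censored after the full 8 weeks (80,640 minutes).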
2. networkmetrics_centrality_analysis.R
Calculates weighted and unweighted degree centrality for roost sites using sna package. Constructs undirected network with edges between roosts within 2.5 km radius. Outputs used as roost connectivity predictor.
3. EnvironmentalAnalysis/1_Data_cleaning_preparation.R
Imports and prepares land cover, building, and road data for environmental index calculations. Handles raster processing and spatial subsetting to 2 km radius around each roost.
4. EnvironmentalAnalysis/2_UI.R
Calculates urbanization index via Principal Component Analysis of land cover proportions (tree, grass, vegetation, built-up) and building characteristics (height, area, residential proportion) within 2 km radius. PC1 used as urbanization metric.
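To show what "PC1 used as urbanization metric" means in practice, here is a minimal pure-Python sketch that standardizes a few variables and extracts the first principal component by power iteration. It is not the code in 2_UI.R: the three columns are a hypothetical subset of the variables listed above, the values are invented, and the sign is oriented so that built-up cover loads positively, matching the interpretation of the index in Envindexes.csv.

```python
import math
from typing import List


def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Convert each column to z-scores (sample standard deviation)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def first_component(z: List[List[float]], iterations: int = 500) -> List[float]:
    """Loadings of PC1: dominant eigenvector of the correlation matrix."""
    n, p = len(z), len(z[0])
    corr = [[sum(row[a] * row[b] for row in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    v = [1.0] + [0.0] * (p - 1)  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical site values (percent tree cover, percent built-up, mean height in m).
COLUMNS = ["tree_cover", "built_up", "building_height"]
sites = [[60, 15, 4], [50, 25, 5], [35, 40, 8], [20, 60, 15], [10, 75, 25]]

z = standardize(sites)
loadings = first_component(z)
# The sign of a principal component is arbitrary; orient it so built-up is positive.
if loadings[COLUMNS.index("built_up")] < 0:
    loadings = [-x for x in loadings]
scores = [sum(l * x for l, x in zip(loadings, row)) for row in z]
```

With this orientation, leafy sites receive negative scores and heavily built sites positive ones.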
5. EnvironmentalAnalysis/3_ONeill.R
Calculates O'Neill's absolute entropy as measure of landscape heterogeneity using SpatEntropy package. Combines land cover, building, and road layers within 2 km radius.
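For intuition, O'Neill's entropy can be read as Shannon entropy computed on pairs of contiguous cells rather than on single cells. The Python sketch below applies that idea to a small categorical grid. It is an illustration only: the pairing convention (each cell paired with its right and lower neighbour, order preserved) is an assumption and may differ in detail from the SpatEntropy implementation used in 3_ONeill.R, and the grids are invented.

```python
import math
from collections import Counter
from typing import Hashable, List


def oneill_entropy(grid: List[List[Hashable]]) -> float:
    """Shannon entropy (natural log) of ordered pairs of bordering cells."""
    pairs: Counter = Counter()
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # pair with the cell to the right
                pairs[(grid[r][c], grid[r][c + 1])] += 1
            if r + 1 < rows:  # pair with the cell below
                pairs[(grid[r][c], grid[r + 1][c])] += 1
    total = sum(pairs.values())
    return sum((k / total) * math.log(total / k) for k in pairs.values())


# Hypothetical land-cover grids, not derived from the study layers.
uniform = [["tree", "tree"], ["tree", "tree"]]
checker = [["tree", "built"], ["built", "tree"]]

h_uniform = oneill_entropy(uniform)
h_checker = oneill_entropy(checker)
```

A homogeneous patch gives an entropy of zero, and the value grows as more distinct category pairs occur, which is the sense in which higher values indicate greater landscape heterogeneity.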
6. Cox_Frequentist.R
Primary statistical analysis using Cox proportional hazards mixed models (survival, coxme packages).
- Model 1: Time to first approach ~ urbanization + entropy + roost size + connectivity + task level + (1|roost_id)
- Model 2: Time to solve ~ urbanization + entropy + roost size + connectivity + task level + (1|roost_id)
- Model 3: Binomial logistic regression for solving probability
Includes model diagnostics (Schoenfeld residuals for proportional hazards assumption).
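For readers less familiar with mixed-effects Cox models, the formulas for Models 1 and 2 correspond to a hazard of roughly the following form, written here for task i at roost j. This is the generic formulation with a Gaussian random intercept, given as an assumption about the fitted structure rather than a transcript of the script; task level is shown as a single term but may be dummy-coded into several coefficients.

```latex
h_{ij}(t) = h_0(t)\,\exp\!\bigl(
    \beta_1\,\mathrm{urbanization}_j
  + \beta_2\,\mathrm{entropy}_j
  + \beta_3\,\mathrm{roost\ size}_j
  + \beta_4\,\mathrm{connectivity}_j
  + \beta_5\,\mathrm{task\ level}_{ij}
  + b_j \bigr),
\qquad b_j \sim \mathcal{N}(0, \sigma^2)
```

Here h_0(t) is the unspecified baseline hazard, and exp(β) for a predictor is its hazard ratio: values above 1 indicate a shorter expected time to approach or solve.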
7. Cox_Bayesian.R
Bayesian survival analysis using brms package to validate frequentist results. Same model structure as Cox_Frequentist.R. Run with 4 chains, 20,000 iterations (5,000 warmup), adapt_delta = 0.999. Weakly informative priors: normal(0,1) for fixed effects, exponential(1) for random effect SD.
8. Supplementary/Number_approaches.R
Analyzes frequency and timing of multiple approach events per task (descriptive statistics and visualizations).
9. Supplementary/Engagement_data.R
Calculates cumulative interaction time (sum of all approach durations) as alternative metric to time-to-first-approach.
10. Supplementary/Engagement_models.R
Re-runs Cox models using engagement time instead of simple time-to-event (sensitivity analysis comparing temporal scales).
11. Supplementary/graphs.R
Generates main manuscript figures including Kaplan-Meier survival curves, effect plots, and study site map.
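The Kaplan-Meier curves mentioned above estimate the proportion of tasks not yet approached (or solved) over time while accounting for censoring. A minimal Python sketch of the estimator follows; it is illustrative only, since the figures themselves are produced in R, and the times and event indicators are invented.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events: 1 = event observed at that time, 0 = right-censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = removed = 0
        # Collect every record sharing this time point.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored records both leave the risk set
    return curve


# Hypothetical latencies in minutes with censoring indicators, not real data.
times = [10, 20, 20, 30, 40]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

In this example the estimated proportion of tasks still without an event drops to 0.8, 0.6, and 0.3 at 10, 20, and 30 minutes; the censored records reduce the risk set without producing a step.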
12. Supplementary/Scatterplot.R
Creates scatterplots of the relationships between predictors to check for collinearity.
SOFTWARE VERSIONS
R version 4.3.2 (2023-10-31)
Core analysis packages:
- survival: 3.5-7 (Cox models)
- coxme: 2.2-18 (mixed effects Cox models)
- survminer: 0.4.9 (survival curve visualization)
- brms: 2.20.4 (Bayesian survival analysis)
- rstan: 2.32.3 (Stan interface for brms)
Data manipulation:
- tidyverse: 2.0.0
- dplyr: 1.1.3
- lubridate: 1.9.3
Spatial analysis:
- sf: 1.0-14
- raster: 3.6-23
- SpatEntropy: 0.1.0 (O'Neill's entropy)
Network analysis:
- sna: 2.7-1 (centrality calculations)
- igraph: 1.5.1
Visualization:
- ggplot2: 3.4.4
- ggmap: 3.0.2
QGIS 3.30.2: Used for raster processing and spatial data preparation (ESA WorldCover, building layers, road layers).
ArcGIS 10.6.1: Used for initial spatial data extraction and processing of building characteristics.
VIDEOS: Video recordings from Lyneham roost site (Level 2 and 3) are included as representative examples of the data collection methodology and recording quality.
This dataset contains time-to-event data (approach and solving times in minutes) for puzzle box tasks presented to sulphur-crested cockatoos at 15 roost sites in Canberra, Australia. Environmental predictors include urbanization index (derived from PCA of land cover and building characteristics), environmental heterogeneity (O'Neill's entropy), roost size, and roost connectivity. Data were analyzed using Cox proportional-hazards mixed models with roost ID as a random effect, examining effects of environmental and social factors on approach and solving times. Censored observations indicate tasks not approached or solved during the experimental period.
