Systematic review of validation of supervised machine learning models in accelerometer-based animal behaviour classification literature
Data files (Jun 24, 2025 version, 46.22 KB total):
- README.md (2.53 KB)
- Systematic_Review_Supplementary.xlsx (43.69 KB)
Abstract
Supervised machine learning has been used to detect fine-scale animal behaviour from accelerometer data, but a standardised protocol for implementing this workflow is currently lacking. As the application of machine learning to ecological problems expands, it is essential to establish technical protocols and validation standards that align with those in other "big data" fields. Overfitting is a prevalent and often misunderstood challenge in machine learning. Overfit models overly adapt to the training data to memorise specific instances rather than to discern the underlying signal. Associated results can indicate high performance on the training set, yet these models are unlikely to generalise to new data. Overfitting can be detected through rigorous validation using independent test sets. Our systematic review of 119 studies using accelerometer-based supervised machine learning to classify animal behaviour reveals that 79% (94 papers) did not validate their models sufficiently well to robustly identify potential overfitting. Although this does not inherently imply that these models are overfit, the absence of independent test sets limits the interpretability of their results. To address these challenges, we provide a theoretical overview of overfitting in the context of animal accelerometry and propose guidelines for optimal validation techniques. We aim to equip ecologists with the tools necessary to adapt general machine learning validation theory to the specific requirements of biologging, facilitating reliable overfitting detection and advancing the field.
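The failure mode described in the abstract can be illustrated with a toy sketch (not drawn from any of the reviewed studies): a 1-nearest-neighbour classifier fit to pure label noise memorises its training set perfectly, yet performs near chance on held-out data. Only the independent test set reveals the problem.

```python
import random

def nn_predict(train, x):
    # 1-nearest-neighbour: memorises the training set exactly.
    return min(train, key=lambda t: abs(t[0] - x))[1]

rng = random.Random(0)
# Noisy toy data: the feature carries no signal about the class label.
data = [(rng.random(), rng.choice([0, 1])) for _ in range(200)]
train, test = data[:100], data[100:]

train_acc = sum(nn_predict(train, x) == y for x, y in train) / len(train)
test_acc = sum(nn_predict(train, x) == y for x, y in test) / len(test)

# Training accuracy is perfect (each point is its own nearest neighbour),
# while test accuracy sits near chance: overfitting that is only
# detectable because an independent test set was held out.
assert train_acc == 1.0
assert test_acc < train_acc
```

The same train/test gap is the diagnostic the review argues for: high training-set metrics alone cannot distinguish a generalisable model from a memorised one.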
https://doi.org/10.5061/dryad.fxpnvx14d
Description of the data and file structure
Files and variables
File: Systematic_Review_Supplementary.xlsx
Description: Methods information from animal accelerometer-based behaviour classification literature utilising supervised machine learning techniques.
Variables
- Citation: Citation information for the paper
- Title: Extracted title from citation information
- Year: Year of publication
- ModelCategory: General category of the supervised machine learning model used (e.g., all Support Vector Machines are listed as SVM)
- DT — Decision Tree
- EM — Expectation Maximisation
- Ensemble — Ensemble methods (e.g., boosting, bagging)
- HMM — Hidden Markov Model
- Isolation Forest — Anomaly detection using Isolation Forest
- kNN — k-Nearest Neighbours
- Multiple — Multiple models trialled and compared
- NB — Naive Bayes
- NN — Neural Network (any architecture)
- QDA — Quadratic Discriminant Analysis
- RF — Random Forest
- SVM — Support Vector Machine
- Tree — Other tree-based models (e.g., CART)
- Species: Main research species (common name)
- Free/Captive: Whether the species was free-roaming or captive for the duration of the study (free-roaming/captive/split)
- SampleSize: Number of individuals' data included in the study (numeric)
- Overlap: % overlap between windows during feature generation (numeric, 'vague' if unclear from publication, or publication descriptor, e.g., "rolling")
- FeatureSelection: Whether feature selection was performed prior to model construction (yes/no/blank for not reported)
- HyperparameterTuning: Whether model hyperparameters were tuned prior to the selection of the final model (yes/no/blank for not reported)
- ValidationSplit: How data were stratified between training, validation, and test sets (random/chronological/individual)
- ValidationSet: Inclusion of a dataset specifically for model tuning (yes/no/blank for not reported)
- ValidationMethod: Use of single or cross-validated validation (single/cross-validation/blank for not reported)
- Fscore - AUC: Performance metrics reported in the publication (numeric)
- Other_performance_metrics: Any other metrics reported in the publication (name and numeric value)
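The ValidationSplit distinction recorded above (random versus individual stratification) can be sketched with a minimal, hypothetical helper, not taken from the dataset or any reviewed study: an individual-level split holds out whole animals, so no individual contributes windows to both the training and test sets.

```python
import random

def split_by_individual(windows, test_fraction=0.3, seed=42):
    """Split labelled windows into train/test sets by individual,
    so no animal appears in both sets (an 'individual' split, as
    opposed to a random window-level split)."""
    ids = sorted({w["id"] for w in windows})
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    test_ids = set(ids[:n_test])
    train = [w for w in windows if w["id"] not in test_ids]
    test = [w for w in windows if w["id"] in test_ids]
    return train, test

# Toy data: 10 feature windows from each of 5 animals.
windows = [{"id": i, "behaviour": "rest", "features": [0.1, 0.2]}
           for i in range(5) for _ in range(10)]
train, test = split_by_individual(windows)

# No individual is shared between the two sets.
assert not ({w["id"] for w in train} & {w["id"] for w in test})
assert len(train) + len(test) == len(windows)
```

A random window-level split would instead let highly autocorrelated windows from the same animal land on both sides of the split, inflating test-set performance, which is why individual-level splits are the stricter test of generalisation to new animals.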
We defined eligibility criteria as 'peer-reviewed primary research papers published 2013-present that use supervised machine learning to identify specific behaviours from raw, non-livestock animal accelerometer data'. We excluded analyses of livestock behaviour because agricultural methods often operate under different constraints from those applied to wild animals, and that body of literature has developed largely in isolation from wild-animal research. Our search was conducted on 27/09/2024. An initial keyword search across three databases (Google Scholar, PubMed, and Scopus) yielded 249 unique papers. Papers outside the search criteria were then excluded, including hardware and software advances, non-ML analyses, studies with insufficient accelerometry application (e.g., research focused on other sensors with accelerometry providing minimal support), unsupervised methods, and research limited to activity intensity or active/inactive states, resulting in 119 papers.
