Dryad

Data from: Evaluating machine learning models for multi-species wildlife detection and identification on remote sensed nadir imagery in South African savanna

Data files

Jan 16, 2026 version files 446.21 GB


Abstract

This research paper investigates the efficacy of leading machine learning (ML) models for detecting and identifying ungulate species in the African savanna using nadir imagery from unmanned aerial vehicles (UAVs). Traditional aerial counting methods, while widely used, suffer from significant limitations in accuracy and precision, in part due to human biases. We examine the use of ML and its potential for aerial censuses by evaluating the performance of nine leading ML models, focusing on their ability to detect and identify five ungulate species: impala (Aepyceros melampus), nyala (Tragelaphus angasii), sable (Hippotragus niger), roan (Hippotragus equinus), and buffalo (Syncerus caffer). Using a UAV, 20,137 nadir images were obtained from two properties in north-east South Africa. Data were manually annotated using bounding boxes and split into training, validation and test sets. All ML models were trained on the same sets and run both for the detection of wildlife as a single class and for the identification of each individual species. The models were compared across four metrics: precision, recall, F1-score and mean average precision (mAP). The highest wildlife detection scores were: precision 86.7%, recall 81%, F1-score 82.6% and mAP 85%, with newer models generally achieving higher scores than older models, and smaller models generally achieving higher scores than larger ones. Our results show that the ML models' animal detection rates are comparable to the highest human detection rates during aerial censuses. However, species identification results were lower overall, with the highest scores being: precision 59.4%, recall 74.2%, F1-score 52.1% and mAP 55.7%, with significant variation between models and species, influenced by body size, colour and dataset size. The lower scores in species identification demonstrate that ML models are not yet suitable for performing fully automated censuses. Incorporating ML in a semi-automated process may, however, achieve higher precision than using human observers through the removal of human biases, and greater repeatability through the ability to pre-programme flight paths.
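
For readers reusing the annotations, the sketch below shows how precision, recall and F1-score are conventionally derived from counts of true positive, false positive and false negative detections. It is a minimal illustration, not the authors' evaluation code: the function name `detection_metrics` and the example counts are hypothetical, and how a predicted box is matched to an annotated box (for example, by an overlap threshold) is assumed to have been decided beforehand. mAP is omitted because it also requires ranking detections by confidence score.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall and F1 from detection counts.

    tp: predicted boxes matched to an annotated animal
    fp: predicted boxes with no matching annotation
    fn: annotated animals the model missed
    (Hypothetical helper; the matching rule is assumed to be applied upstream.)
    """
    # Precision: share of predictions that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of annotated animals that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only; these are not values from the dataset.
scores = detection_metrics(tp=80, fp=20, fn=20)
print({name: round(value, 3) for name, value in scores.items()})
```

Because F1 is the harmonic mean of precision and recall for a single model, it always lies between the two when computed on the same predictions.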