Dryad

Bounding-box detection data for delphinid whistles

Data files

Aug 01, 2025 version files (8.25 GB)

Abstract

Deep learning methods offer automated solutions for detecting marine mammal calls, yet optimizing neural network performance requires time-intensive development, including careful data curation and the design of a robust network architecture. Using data collected in two aquariums and two open ocean environments, we evaluated the performance of three pre-trained object detection networks (CSP-DarkNet-53, ResNet-50, and Tiny YOLO) in detecting highly variable bottlenose dolphin (Tursiops truncatus) whistles using DeepAcoustics, a user-friendly deep learning tool. We compared the F1-score, average precision (AP), and mean AP of all network architectures across combinations of training samples from each acoustic environment. CSP-DarkNet-53 outperformed Tiny YOLO and ResNet-50 on most test datasets, demonstrating robustness, although it underperformed in select scenarios. AP and mean AP values remained higher for aquarium data than for open ocean data, indicating that the networks detected whistles more accurately in aquarium environments. Moreover, networks trained on open ocean datasets showed only slightly improved APs on open ocean data, underscoring the challenge of achieving generalizability across divergent acoustic environments. This effort highlights the importance of network architecture selection and the effects of different acoustic environments on deep learning methods for detecting complex underwater vocalizations.
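For readers working with the bounding-box annotations, the following is a minimal Python sketch of how F1-score and AP can be computed from detections. It is not the DeepAcoustics implementation. The box layout (start time, low frequency, end time, high frequency), the intersection-over-union threshold of 0.5, the greedy score-ordered matching, the uninterpolated form of AP, and all function names are assumptions made for illustration, and the example boxes are invented rather than taken from the dataset.

```python
from typing import List, Tuple

# Assumed box layout: (t_start, f_low, t_end, f_high) on a spectrogram.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(
    detections: List[Tuple[Box, float]], truths: List[Box], threshold: float = 0.5
) -> List[bool]:
    """Flag each detection as a true positive (True) or false positive (False).

    Detections are visited in descending score order, and each ground-truth
    box can be matched at most once (greedy matching, an assumption).
    """
    used = set()
    flags = []
    for box, _score in sorted(detections, key=lambda d: d[1], reverse=True):
        best_index, best_iou = -1, threshold
        for j, truth in enumerate(truths):
            if j in used:
                continue
            overlap = iou(box, truth)
            if overlap >= best_iou:
                best_index, best_iou = j, overlap
        if best_index >= 0:
            used.add(best_index)
            flags.append(True)
        else:
            flags.append(False)
    return flags


def f1_score(flags: List[bool], n_truth: int) -> float:
    """F1 = 2TP / (2TP + FP + FN)."""
    tp = sum(flags)
    fp = len(flags) - tp
    fn = n_truth - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_precision(flags: List[bool], n_truth: int) -> float:
    """Uninterpolated AP: mean of the precision at each true-positive rank."""
    if n_truth == 0:
        return 0.0
    tp = 0
    total = 0.0
    for rank, is_tp in enumerate(flags, start=1):
        if is_tp:
            tp += 1
            total += tp / rank
    return total / n_truth


# Invented example: two annotated whistles and three scored detections.
truths = [(0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0)]
detections = [
    ((0.0, 0.0, 1.0, 1.0), 0.9),
    ((5.0, 5.0, 6.0, 6.0), 0.8),
    ((2.0, 2.0, 3.0, 3.2), 0.7),
]
flags = match_detections(detections, truths)
f1 = f1_score(flags, len(truths))
ap = average_precision(flags, len(truths))
```

Mean AP is then the arithmetic mean of the AP values over whichever groups are being averaged. Other tools may use interpolated AP or a different overlap threshold, so values from this sketch are not directly comparable with those reported for the networks.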