Dryad

Tephritid26: A standardized, multi-angle image dataset of quarantine-significant true fruit flies for deep learning-based identification

Data files

May 26, 2026 version; files total 119.44 GB

Abstract

Accurate and rapid identification of quarantine-significant tephritids is critical to global agricultural biosecurity, but the application of deep learning is limited by the lack of large public image datasets. To address this gap, we present Tephritid26, a multi-angle image dataset of 26 tephritid species. The dataset includes 38,081 images from 1,473 specimens across seven genera and two subfamilies, assembled through a global collaborative effort to source these regulated species. Specimens were mounted using a novel protocol that combines varied thoracic attachment points and pin angles. A rotational imaging setup was then used to capture each specimen systematically from multiple perspectives, mimicking real inspection conditions. The dataset is formatted for machine learning workflows. To demonstrate its utility, we trained deep learning models for species identification. ResNet-50, ConvNeXt-B, ViT-Small, and Swin-Tiny all attained high species-level accuracy (macro-averaged F1-score > 96.75%). Gradient-weighted Class Activation Mapping (Grad-CAM) confirmed that the models focused on taxonomically informative morphological regions. This dataset serves as a benchmark for developing automated identification tools in phytosanitary applications.
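For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro-averaged F1-score can be computed: the F1-score is calculated for each class separately and then averaged without weighting, so rare species count as much as common ones. The function name `macro_f1` and the labels "A", "B", and "C" are hypothetical placeholders, not part of the dataset or the authors' evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores.

    Classes are taken from the union of true and predicted labels.
    This is an illustrative sketch, not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but wrong
            fn[t] += 1   # class t was missed

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical species labels, for illustration only.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "C", "C"]
score = macro_f1(y_true, y_pred)
print(f"macro-averaged F1: {score:.4f}")
```

In this toy example the per-class scores are 1.0, 2/3, and 0.8, so the macro average is about 0.822, or 82.2% when expressed as a percentage as in the abstract.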