A high-quality sport ball dataset annotated from videos
Data files (Apr 30, 2026 version, 9.14 MB total)
- footBall.zip (3.27 MB)
- README.md (1.84 KB)
- tableTennis.zip (3.49 MB)
- tennis.zip (2.38 MB)
Abstract
In object detection, sport balls are a comparatively challenging target.
This is primarily due to their small size and lack of distinctive features, compounded by the motion blur that the high-speed movement of spherical objects produces in images.
We have observed that most data sources for sport ball detection consist of consecutive video frames from sports scenes.
Building on this, we have created a ball detection dataset composed of table tennis, tennis, and soccer videos (each category containing over 10,000 target objects) to assist in training models for detecting sport balls.
Our dataset annotations follow the same format used by Ultralytics for object detection: bounding boxes specified by x, y, w, h.
Experimental results presented in the accompanying paper demonstrate that our dataset is of high quality, with various models achieving strong performance on it.
Our data are divided into three categories (football, tennis, and table tennis), each stored in a separate archive.
All annotation files are stored in the "Labels" directory, under the footBall, tableTennis, and tennis directories within footBall.zip, tableTennis.zip, and tennis.zip, respectively. For details on data reading and the correspondence between labels and frames, please refer to the README file in the folder.
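As a quick orientation before consulting the README, the sketch below shows how labels in the standard Ultralytics YOLO text format (one object per line: class index followed by normalized x-center, y-center, width, height) can be parsed and converted to pixel coordinates. The exact file layout and class indices in this dataset may differ; this is a minimal assumption-based example, not the dataset's official reader.

```python
# Minimal sketch, assuming the standard Ultralytics YOLO label format:
# each line is "class x_center y_center width height", all coordinates
# normalized to [0, 1] relative to the image size.

def parse_yolo_labels(text, img_w, img_h):
    """Parse YOLO-format label text into (class_id, x1, y1, x2, y2) tuples
    in pixel coordinates."""
    boxes = []
    for line in text.strip().splitlines():
        cls, xc, yc, w, h = line.split()
        xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
        # Convert normalized center/size to pixel-space corner coordinates.
        x1 = (xc - w / 2) * img_w
        y1 = (yc - h / 2) * img_h
        x2 = (xc + w / 2) * img_w
        y2 = (yc + h / 2) * img_h
        boxes.append((int(cls), x1, y1, x2, y2))
    return boxes

# Example: one hypothetical ball annotation in a 1920x1080 frame.
sample = "0 0.5 0.5 0.05 0.05\n"
print(parse_yolo_labels(sample, 1920, 1080))
```

In practice the same conversion is done internally by Ultralytics training tools, so parsing by hand is only needed for custom inspection or visualization of the annotations.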
The dataset is constructed based on sports videos, all of which have been automatically desensitized using the YOLO26 model.
However, even after desensitization, the videos still contain a small amount of sensitive information, which prevents release under the CC0 public-domain license (the material does, however, satisfy CC BY-NC-ND 4.0). We have therefore published the videos corresponding to the labels elsewhere, at https://doi.org/10.5281/zenodo.19874311, with DOI 10.5281/zenodo.19874312.
Reference for YOLO26 (used to remove sensitive information), in BibTeX:
@software{yolo26_ultralytics,
  author  = {Glenn Jocher and Jing Qiu},
  title   = {Ultralytics YOLO26},
  version = {26.0.0},
  year    = {2026},
  url     = {https://github.com/ultralytics/ultralytics},
  orcid   = {0000-0001-5950-6979, 0000-0003-3783-7069},
  license = {AGPL-3.0}
}
Human subjects data
All our data are sourced from open-source online videos or competition videos that have already been published. All released data have been de-identified using YOLO26: we detect all individuals in each video and apply blurring to each of them on a frame-by-frame basis. As a result, the processed videos contain essentially no identifiable human-subject information. Additionally, we confirm that we obtained explicit consent from the participants to publish the de-identified data in the public domain.
