Song recordings and annotation files of 3 canaries used to evaluate training of TweetyNet models for birdsong segmentation and annotation
Cohen, Yarden (2022), Song recordings and annotation files of 3 canaries used to evaluate training of TweetyNet models for birdsong segmentation and annotation, Dryad, Dataset, https://doi.org/10.5061/dryad.xgxd254f4
Many analyses of birdsong require time-consuming manual annotation of the individual elements of song, known as syllables or notes. We developed the first automated algorithm for birdsong annotation, "TweetyNet", that is applicable to complex song such as canary song. TweetyNet is trained with a small amount of hand-labeled data using supervised learning methods. We evaluate the amount of data required for training TweetyNet models using vocalizations of two songbird species - Bengalese finches and canaries. This dataset contains the song audio files and their accompanying annotation files for the three canaries used in this analysis.
This dataset was acquired between late April and early May 2018 - a period during which canaries perform their mating season songs. Birds were individually housed in soundproof boxes and recorded for 7-10 days (Audio-Technica AT831B Lavalier Condenser Microphone, M-Audio M-track amplifiers, and VOS games' Boom Recorder software on a Mac Pro desktop computer). In-house software was used to detect and save only sound segments that contained vocalizations.
The vocalizations of the 3 canaries are stored in 3 separate folders.
For each animal, two annotation files describe all the labeled vocalization segments in two formats:
1. A .csv file containing the annotations as a table, with one row per annotated canary syllable and the following columns:
- label - Identity of the syllable
- onset_s - Time (sec from file onset) of the syllable onset
- offset_s - Time (sec from file onset) of the syllable offset
- audio_file - Path to the audio file
- annot_file - Path to the annotation file
- sequence - n.a.
- annotation - number of audio files
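The .csv table above can be read with standard tools. A minimal sketch in Python using pandas, with made-up rows standing in for the real annotation files (the column names match the description above; the label values, times, and paths here are illustrative only):

```python
import io
import pandas as pd

# Illustrative stand-in for one of the per-bird annotation .csv files;
# real files contain one row per annotated syllable.
sample = io.StringIO(
    "label,onset_s,offset_s,audio_file,annot_file,sequence,annotation\n"
    "2,0.135,0.310,bird1/audio_001.wav,bird1/annot.csv,0,0\n"
    "2,0.350,0.520,bird1/audio_001.wav,bird1/annot.csv,0,0\n"
)
df = pd.read_csv(sample)

# Group syllables by their source audio file to reconstruct each song,
# and compute per-file syllable counts and durations.
for audio_file, syllables in df.groupby("audio_file"):
    durations = syllables["offset_s"] - syllables["onset_s"]
    print(audio_file, len(syllables), "syllables,",
          f"mean duration {durations.mean():.3f} s")
```

To read a real annotation file, replace the `io.StringIO` object with the path to one of the per-bird .csv files.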
2. A Matlab file containing 2 cell arrays:
- keys - cell array of strings - audio file names.
- elements - cell array of structs (matching the files in 'keys') with fields:
- filenum: file number
- segFileStartTimes (vector): Time (sec from file onset) of the syllable onsets
- segFileEndTimes (vector): Time (sec from file onset) of the syllable offsets
- segType (vector): syllable identities
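The Matlab annotation file can also be read from Python. A minimal sketch using scipy.io, which first writes a tiny synthetic file (in memory, so the example is self-contained) mimicking the 'keys'/'elements' layout described above; the file name and field values are illustrative assumptions, not taken from the dataset:

```python
import io
import numpy as np
from scipy import io as sio

# Build a synthetic stand-in for a per-bird annotation .mat file:
# 'keys' is a cell array of audio file names, 'elements' a matching
# cell array of structs. Values here are made up for illustration.
element = {
    "filenum": 1,
    "segFileStartTimes": np.array([0.135, 0.350]),  # syllable onsets (s)
    "segFileEndTimes": np.array([0.310, 0.520]),    # syllable offsets (s)
    "segType": np.array([2, 2]),                    # syllable identities
}
buf = io.BytesIO()
sio.savemat(buf, {"keys": np.array(["audio_001.wav"], dtype=object),
                  "elements": np.array([element], dtype=object)})
buf.seek(0)

# squeeze_me=True collapses MATLAB's 1x1 wrappers into plain values.
mat = sio.loadmat(buf, squeeze_me=True)
keys = np.atleast_1d(mat["keys"])
elements = np.atleast_1d(mat["elements"])

for fname, elem in zip(keys, elements):
    # atleast_1d guards against single-syllable files squeezing to scalars.
    onsets = np.atleast_1d(elem["segFileStartTimes"])
    labels = np.atleast_1d(elem["segType"])
    print(fname, "->", len(onsets), "syllables, labels", labels)
```

For a real annotation file, pass its path to `sio.loadmat` instead of the in-memory buffer; the 'keys'/'elements' access pattern stays the same.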
National Institutes of Health, Awards: R01NS104925, R24NS098536