soundscape_IR: A source separation toolbox for exploring acoustic diversity in soundscapes

Cite this dataset

Sun, Yi-Jen; Yen, Shih-Ching; Lin, Tzu-Hao (2022). soundscape_IR: A source separation toolbox for exploring acoustic diversity in soundscapes [Dataset]. Dryad.


1. Soundscapes contain rich acoustic information associated with animal behaviors, environmental characteristics, and human activities, providing opportunities for predicting biodiversity changes and associated drivers. However, assessing the diversity of animal vocalizations remains challenging due to the interference of environmental and anthropogenic noise. A tool for separating sound sources and delineating changes in acoustic signals is crucial for an effective assessment of acoustic diversity.

2. We present soundscape_IR, an open-source Python toolbox dedicated to soundscape information retrieval in which non-negative matrix factorization is applied. This toolbox provides algorithms for supervised and unsupervised source separation (SS). It also enables the use of a snapshot recording for model training and subsequently applying adaptive and semi-supervised SS when target species produce sounds with varying features and when unseen sound sources are encountered.

3. Our results demonstrated that SS could enhance the vocalizations of target species, characterize the complexity of vocal repertoires, and investigate the spatio-temporal divergence of soundscapes. In tropical forest soundscapes, the application of SS effectively detected the rutting vocalizations of sika deer and revealed a graded structure in their acoustic characteristics. In subtropical estuarine soundscapes, SS automated the process of identifying distinct biotic and abiotic sounds, and the result uncovered divergent sound compositions between inshore and offshore waters.

4. Implementation of SS in soundscape analysis offers a promising method for streamlining the assessment of acoustic diversity in diverse environments. Future application of SS will open new directions to acoustically quantify ecological interactions across individual, species, and ecosystem levels.


Tropical Forest Soundscape: This dataset is a collection of tropical forest recordings that contain the rutting vocalizations of sika deer (Cervus nippon) and sounds produced by diverse soniferous species. Rutting vocalizations play a crucial role in the breeding behavior of sika deer and vary among individuals. To monitor the population status and behavior of sika deer at Sheding Nature Park, Pingtung, Taiwan (21°58′02.7′′N, 120°49′02.4′′E), one Song Meter SM4 acoustic recorder (Wildlife Acoustics Inc., Maynard, MA, USA) was fixed on a tree at 1.5 m above the ground on November 18, 2017. The recorder was scheduled to run one 5-min recording every 15 min, with a sampling frequency of 44.1 kHz.

Subtropical Estuary Soundscape: This dataset features underwater recordings of a subtropical estuary in the waters off western Taiwan. Estuaries support diverse marine species, including many soniferous fish and marine mammals. Listening to estuary soundscapes can help assess soniferous communities and anthropogenic activities. From August 26 to August 27, 2018, two recording sites were established in Chunggang River Estuary, Taiwan. The first site was located in shallow water where the mean depth was 9 m (24°40′48.4′′N, 120°49′07.1′′E), whereas the second site was located in relatively deep water where the mean depth was 17.5 m (24°41′32.3′′N, 120°48′48.8′′E). The distance between the two sites was approximately 1.5 km. At each site, one SoundTrap ST300 HF recorder (Ocean Instruments, Auckland, New Zealand) was bottom-mounted 1 m above the seafloor and scheduled to record sounds every 5 min using a sampling frequency of 192 kHz.

Usage notes

Tropical Forest Soundscape: The SM4 recorder generated uncompressed audio files (WAV format) with two channels. The file name encodes the recording site and the start time of each recording (site_yyyymmdd_HHMMSS.wav). For example, KT08_20171118_011500.wav is the recording from site KT08 that began on November 18, 2017, at 01:15. Audio recordings were collected by Shih-Ching Yen and Tzu-Hao Lin.
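The naming convention above can be read programmatically. The following is a minimal Python sketch using only the standard library; the function name parse_sm4_name is hypothetical and is not part of soundscape_IR or of the dataset.

```python
from datetime import datetime
from pathlib import Path


def parse_sm4_name(file_name):
    """Split 'site_yyyymmdd_HHMMSS.wav' into (site, start time).

    Hypothetical helper for illustration; not part of soundscape_IR.
    """
    stem = Path(file_name).stem  # drop the '.wav' extension
    # Split from the right so a site code containing '_' would still parse.
    site, date_part, time_part = stem.rsplit("_", 2)
    start = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return site, start


site, start = parse_sm4_name("KT08_20171118_011500.wav")
print(site, start.isoformat())  # KT08 2017-11-18T01:15:00
```

The returned datetime carries no time zone, because the usage notes do not state one.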

Subtropical Estuary Soundscape: Two folders are provided, one for the audio recordings collected at the inshore site and one for those collected at the offshore site. The SoundTrap recorder generated lossless compressed single-channel audio files (SUD format). SUD files were converted into uncompressed WAV files using the SoundTrap Host software (Ocean Instruments, NZ). The file name encodes the serial number of the recorder and the start time of each recording (SN.yymmddHHMMSS.wav). For example, 1208791070.180826091503.wav is the recording obtained with SoundTrap #1208791070 that began on August 26, 2018, at 09:15:03. The sensitivity is -179.2 dB re 1 V/μPa for SoundTrap #1208791070 and -178.6 dB re 1 V/μPa for SoundTrap #1207963682. Audio recordings were collected by Observer Ecological Consultant.
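The SoundTrap file names and sensitivities can be handled in the same way. The sketch below is illustrative only: the function names are hypothetical, and the level calculation assumes the signal has already been expressed as an RMS voltage. How WAV sample values map to volts is not specified in these notes and would need to be confirmed for the recorder.

```python
import math
from datetime import datetime
from pathlib import Path

# Sensitivities quoted in the usage notes, in dB re 1 V/μPa, keyed by serial number.
SENSITIVITY_DB = {"1208791070": -179.2, "1207963682": -178.6}


def parse_soundtrap_name(file_name):
    """Split 'SN.yymmddHHMMSS.wav' into (serial number, start time).

    Hypothetical helper for illustration; not part of soundscape_IR.
    """
    serial, stamp = Path(file_name).stem.split(".")
    return serial, datetime.strptime(stamp, "%y%m%d%H%M%S")


def level_db_re_1upa(v_rms, sensitivity_db):
    """Sound pressure level (dB re 1 μPa) from an RMS voltage.

    Assumes v_rms is in volts and the sensitivity is in dB re 1 V/μPa.
    """
    return 20 * math.log10(v_rms) - sensitivity_db


serial, start = parse_soundtrap_name("1208791070.180826091503.wav")
print(serial, start.isoformat())  # 1208791070 2018-08-26T09:15:03
```

For example, level_db_re_1upa(v_rms, SENSITIVITY_DB[serial]) gives the received level for a recording from that recorder once v_rms is known.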


Ministry of Science and Technology, Taiwan, Award: 109-2621-B-001-007-MY3

Industrial Technology Research Institute