Data for: Hearable devices with sound bubbles
Data files (Nov 18, 2024 version, 60.64 GB total)
README.md - 1.91 KB
syn_1_5m.tar - 18.53 GB
syn_1m.tar - 18.53 GB
syn_2m.tar - 18.53 GB
syn_test.tar - 5.05 GB
Abstract
The human auditory system has a limited ability to perceive distance and distinguish speakers in crowded settings. A headset technology that can create a sound bubble, in which all speakers within the bubble are audible but speakers and noise outside the bubble are suppressed, could augment human hearing. However, developing such technology is challenging. Here we report an intelligent headset system capable of creating sound bubbles. The system is based on real-time neural networks that use acoustic data from up to six microphones integrated into noise-canceling headsets and are run on-device, processing 8 ms audio chunks in 6.36 ms on an embedded central processing unit. Our neural networks can generate sound bubbles with programmable radii between 1 and 2 meters, and with output signals that reduce the intensity of sounds outside the bubble by 49 decibels. With previously unseen environments and wearers, our system can focus on up to two speakers within the bubble, with one to two interfering speakers and noise outside the bubble. This repository contains the synthetic datasets behind the results reported in the paper for the 1 m, 1.5 m, and 2 m bubble sizes.
https://doi.org/10.5061/dryad.r7sqv9smv
This is the dataset for the paper “Hearable devices with sound bubbles”
Description of the data and file structure
This dataset supports creating a sound bubble on a hearable: all speakers within the bubble remain audible, while speakers and noise outside the bubble are suppressed. The data in this repository consist of speech simulated with pyroomacoustics (https://github.com/LCAV/pyroomacoustics). Each data sample in the train/val/test sets contains the following files (a loading sketch follows the list):
metadata.json - JSON file containing all sample information, including room geometry, microphone array positions, sound source information, and so on.
mixture.wav - the multi-channel mixture of all sources at the microphones (the model input).
micsxx_yy.wav - the recording of a single sound source yy at microphone xx (the training target).
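A minimal sketch of loading one sample is shown below. The sample directory path, the exact placeholder format in micsxx_yy.wav, and the soundfile dependency are assumptions for illustration, not prescribed by the dataset:

```python
import json
from pathlib import Path

import soundfile as sf  # third-party: pip install soundfile

# Hypothetical path to one sample after extracting syn_1m.tar
sample_dir = Path("syn_1m/train/00000")

# Room geometry, microphone array positions, and source information
with open(sample_dir / "metadata.json") as f:
    metadata = json.load(f)

# Multi-channel mixture: array of shape (num_frames, num_channels)
mixture, sr = sf.read(sample_dir / "mixture.wav")

# Per-(microphone, source) training targets, keyed by file name
targets = {p.name: sf.read(p)[0] for p in sorted(sample_dir.glob("mics*_*.wav"))}

print(sr, mixture.shape, sorted(targets))
```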
Simulation data for sound bubbles
syn_1m.tar - train and val set for 1m bubble evaluation
syn_1_5m.tar - train and val set for 1.5m bubble evaluation
syn_2m.tar - train and val set for 2m bubble evaluation
syn_test.tar - test set for 1m, 1.5m, and 2m bubble evaluations to reproduce the results in Figure 3
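The recordings in these archives were generated with pyroomacoustics. The sketch below shows the general shape of such a simulation; the room dimensions, wall absorption, and source/microphone positions are illustrative assumptions, not the configuration used in the paper:

```python
import numpy as np
import pyroomacoustics as pra
import soundfile as sf

fs = 16000
speech, _ = sf.read("near_speaker.wav")     # hypothetical mono source inside the bubble
interferer, _ = sf.read("far_speaker.wav")  # hypothetical mono source outside the bubble

# Shoebox room with frequency-independent wall absorption (values are illustrative)
room = pra.ShoeBox([6.0, 5.0, 3.0], fs=fs, materials=pra.Material(0.3), max_order=10)

# A speaker within ~1 m of the array and an interferer a few meters away
room.add_source([3.2, 2.6, 1.5], signal=speech)
room.add_source([0.5, 4.5, 1.5], signal=interferer)

# Six microphones around the wearer's head; columns are (x, y, z) positions
mics = np.array([
    [2.9, 3.1, 2.9, 3.1, 2.9, 3.1],
    [2.5, 2.5, 2.6, 2.6, 2.7, 2.7],
    [1.6, 1.6, 1.6, 1.6, 1.6, 1.6],
])
room.add_microphone_array(pra.MicrophoneArray(mics, fs))

# Render the multi-channel mixture at the microphones
room.simulate()
mixture = room.mic_array.signals  # shape (num_mics, num_samples)
```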
Files and variables
File: syn_1m.tar
Description: train and val set for the 1 m bubble evaluation
File: syn_1_5m.tar
Description: train and val set for the 1.5 m bubble evaluation
File: syn_2m.tar
Description: train and val set for the 2 m bubble evaluation
File: syn_test.tar
Description: test set for the 1 m, 1.5 m, and 2 m bubble evaluations, to reproduce the results in Figure 3 of the paper
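To unpack the archives, a short sketch using Python's standard tarfile module (the output directory is an assumption; the internal layout of each archive is described by the per-sample file list above):

```python
import tarfile
from pathlib import Path

out = Path("data")  # hypothetical output directory
out.mkdir(exist_ok=True)

for name in ["syn_1m.tar", "syn_1_5m.tar", "syn_2m.tar", "syn_test.tar"]:
    with tarfile.open(name) as tar:
        # On Python 3.12+, pass filter="data" for safer extraction
        tar.extractall(path=out)
```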
Code/software
Source code, including training and evaluation scripts, is available at: