Dryad

Benchmarking speech-to-text robustness in noisy emergency medical dialogues: An evaluation of STT models under realistic acoustic conditions

Data files

Dec 03, 2025 version files 1.33 MB


Abstract

This dataset provides a representative JSON file containing 99 fully synthetic German emergency medical service (EMS) dialogue texts used as ground-truth material in our benchmarking study on speech-to-text (STT) robustness. Each dialogue represents a prehospital scenario with conversational exchanges between EMS personnel and patients, and includes synthetic clinical information (e.g., diagnoses, medications, and vital signs).

In the associated study, these texts served as the basis for generating synthetic audio samples with text-to-speech systems and for evaluating multiple STT models under controlled noise conditions. Only the underlying dialogue texts are included here, as they form the necessary foundation for reproducing the audio corpus or for conducting related benchmarking tasks.
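For readers who want to use the texts as ground truth in their own benchmarking, the following is a minimal sketch of scoring an STT transcript against a reference dialogue. It assumes word error rate (WER) as the metric; this abstract does not state which metric the associated study used. The function name `word_error_rate`, the lowercase-and-whitespace normalization, and the example sentences are illustrative assumptions, not part of the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Assumption: text is normalized by lowercasing and splitting on
    whitespace only. A real evaluation would likely also need to handle
    punctuation and number formatting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example sentences (not taken from the dataset).
reference_text = "der patient hat brustschmerzen"
stt_output = "der patient hat bauchschmerzen"
print(word_error_rate(reference_text, stt_output))
```

Here one of the four reference words is substituted, so the score is one error divided by four words. Comparing such scores for the same dialogue across noise conditions is one way to quantify robustness.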

All contents are fully synthetic, contain no personal or sensitive information, and may be used freely for research and methodological development.