DAPS (Device and Produced Speech)

Studio-quality speech from 20 speakers, paired with time-aligned recordings of the same speech captured on consumer devices (tablet, smartphone) in real-world environments.

Recommendation

Well suited as clean reference and parallel data for speech enhancement, dereverberation, and device-channel robustness, and as a high-quality reference for TTS evaluation. The corpus is small (~14 min per speaker, ~4.5 h of unique content) and licensed for non-commercial use only, so it is not a training-scale corpus.
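As a quick sanity check on the size figures above, the sketch below multiplies the speaker count by the approximate per-speaker duration. The function name is illustrative only. Because ~14 min is a rounded figure, the result lands slightly above the quoted ~4.5 h but in the same range.

```python
SPEAKERS = 20
MINUTES_PER_SPEAKER = 14  # approximate figure from the description above


def unique_hours(speakers=SPEAKERS, minutes_per_speaker=MINUTES_PER_SPEAKER):
    """Estimate hours of unique speech from speaker count and per-speaker minutes."""
    return speakers * minutes_per_speaker / 60.0


print(f"{unique_hours():.1f} h")  # prints "4.7 h"
```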

Getting the data

Obtain from the dataset homepage.

Also mirrored on Zenodo (record 4660670). The corpus comes in 15 versions (3 produced + 12 device/environment), so the same ~4.5 h of unique speech is replicated in each version.
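Because every version contains the same utterances, parallel (reference, degraded) pairs can be built by matching files across two version folders. The sketch below is a minimal illustration under an assumed layout, not a documented API: it assumes one folder per version under the extraction root and file names of the form <utterance>_<version>.wav. The folder names "clean" and "ipad_livingroom1" and the function pair_versions are assumptions for illustration; check them against the actual archive before use.

```python
from pathlib import Path


def pair_versions(root, reference="clean", degraded="ipad_livingroom1"):
    """Return sorted (reference_path, degraded_path) pairs sharing an utterance key.

    Assumed layout (verify against the real archive):
        <root>/<version>/<utterance>_<version>.wav
    """
    root = Path(root)

    def index(version):
        # Map utterance key -> file path by stripping the "_<version>" suffix.
        suffix = f"_{version}"
        return {
            p.stem[: -len(suffix)]: p
            for p in (root / version).glob("*.wav")
            if p.stem.endswith(suffix)
        }

    ref, deg = index(reference), index(degraded)
    # Keep only utterances present in both versions.
    return [(ref[key], deg[key]) for key in sorted(ref.keys() & deg.keys())]
```

Calling pair_versions("path/to/daps") would then return the matched pairs for the two chosen versions; utterances missing from either folder are silently skipped, which may or may not be the behavior you want for a strict evaluation set.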