DAPS (Device and Produced Speech)

Professional studio-quality speech from 20 speakers, paired with time-aligned recordings of the same speech captured on consumer devices (tablet and smartphone) in real-world environments.

Task: tts
Languages: en
Hours: 4.5
Domain: clean studio + device-recorded
License: CC BY-NC 4.0
Homepage: https://ccrma.stanford.edu/~gautham/Site/daps.html
Paper: https://doi.org/10.1109/LSP.2015.2438544

Recommendation

Well suited as clean reference and parallel data for speech enhancement, dereverberation, and device-channel robustness work, and as a high-quality reference for TTS evaluation. It is small (~14 min per speaker, ~4.5 h of unique content) and licensed for non-commercial use only, so it is not a training-scale corpus.

Getting the data

Download from the dataset homepage listed above.

Also mirrored on Zenodo (record 4660670). The release contains 15 versions of the recordings (3 studio-produced + 12 device/environment combinations); the same ~4.5 h of unique speech is repeated in every version.
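
Because every version carries the same speech, a common first step for enhancement or channel-robustness work is to pair each reference file with its device-recorded counterpart. The sketch below is one way to do that in Python. It assumes, without confirmation from this card, that file names follow a `<speaker>_<script>_<version>.wav` pattern; the version names `clean` and `ipad_office1` and the helper names `utterance_key` and `pair_versions` are hypothetical and should be adjusted to the layout of the downloaded archive.

```python
from pathlib import Path


def utterance_key(path, version):
    """Return the file stem without the assumed '_<version>' suffix.

    Example (assumed naming): 'f1_script1_clean.wav' -> 'f1_script1'.
    If the suffix is absent, the stem is returned unchanged.
    """
    stem = Path(path).stem
    suffix = "_" + version
    return stem[: -len(suffix)] if stem.endswith(suffix) else stem


def pair_versions(reference_paths, degraded_paths, reference, degraded):
    """Pair files from two versions that share the same utterance key.

    Files present in only one version are skipped. Pairs are returned
    as (reference_path, degraded_path) tuples sorted by key.
    """
    ref = {utterance_key(p, reference): Path(p) for p in reference_paths}
    deg = {utterance_key(p, degraded): Path(p) for p in degraded_paths}
    shared = sorted(ref.keys() & deg.keys())
    return [(ref[key], deg[key]) for key in shared]
```

In practice the two path lists would come from something like `Path(root, "clean").glob("*.wav")`, where `root` is wherever the archive was unpacked. Matching on a shared key rather than on sorted order keeps the pairing correct even if a file is missing from one version.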