LibriSpeech

Read English audiobooks; the standard English ASR benchmark.

Recommendation

The default starting point for English ASR: clean, well-segmented read speech with transcripts. Prototype on train-clean-100 (about 100 hours); for production, train on the full 960 h, which is train-clean-100, train-clean-360, and train-other-500 combined. It is not representative of conversational or noisy audio.

Getting the data

Downloadable via VoxKitchen (librispeech, source: openslr, size: 299 MB - 28.5 GB):

vkit docker download --tag slim librispeech --root ./data/librispeech

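Subsets: dev-clean, dev-other, test-clean, test-other, train-clean-100, train-clean-360, train-other-500. The numeric suffix on each training subset is its approximate size in hours; "clean" subsets contain the easier, higher-quality recordings and "other" subsets the more challenging ones.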
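
Once a subset is on disk, a quick way to sanity-check it is to walk the transcripts. The sketch below is not part of VoxKitchen; it assumes the standard OpenSLR layout, where each subset contains speaker/chapter directories, each with one <speaker>-<chapter>.trans.txt file and one .flac file per utterance. The function name iter_utterances is hypothetical, and where VoxKitchen places the subset directories under --root is an assumption to verify locally.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_utterances(subset_dir: Path) -> Iterator[Tuple[str, Path, str]]:
    """Yield (utterance_id, audio_path, transcript) for one subset directory.

    Assumes the standard LibriSpeech layout (not verified against VoxKitchen):
        <subset>/<speaker>/<chapter>/<speaker>-<chapter>.trans.txt
        <subset>/<speaker>/<chapter>/<speaker>-<chapter>-<index>.flac
    """
    for trans_file in sorted(subset_dir.glob("*/*/*.trans.txt")):
        with trans_file.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                # Each line is "<utterance-id> <TRANSCRIPT TEXT>".
                utt_id, _, text = line.partition(" ")
                # The audio file sits next to the transcript file.
                yield utt_id, trans_file.parent / f"{utt_id}.flac", text
```

For example, calling list(iter_utterances(Path("./data/librispeech/LibriSpeech/dev-clean"))) would return every utterance in dev-clean if the archive was extracted to that (assumed) location. The function only builds audio paths and does not check that the .flac files exist.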

Suggested processing

A recommended VoxKitchen pipeline ships in the repository at examples/pipelines/librispeech-asr.yaml; run it with vkit docker run.