Skip to content

LJSpeech

Single-speaker English TTS corpus (24 h, 13,100 clips) recorded from LibriVox readings. Universally used as a single-speaker TTS baseline.

Recommendation

Ideal for single-speaker TTS experiments and for quickly checking a pipeline end-to-end due to its small size. Because it is a single speaker it is not suitable for multi-speaker or voice-cloning experiments. Prefer LibriTTS or VCTK when speaker diversity matters.

Getting the data

Downloadable via VoxKitchen (ljspeech, source: keithito, size: 2.6 GB):

vkit docker download --tag slim ljspeech --root ./data/ljspeech

Subsets: default.

Suggested processing

A recommended VoxKitchen pipeline ships in the repository at voxkitchen/templates/pipelines/tts-data-prep.yaml — run it with vkit docker run.