Skip to content

AISHELL-2

1000 hours of clean Mandarin read-speech from ~1991 speakers covering entertainment, finance, technology, sports, and place-of-interest commands, recorded over iOS/Android/microphone channels.

Recommendation

A standard industrial-scale Mandarin ASR training set — pick it for a serious read-speech baseline beyond AISHELL-1 or THCHS-30. Distribution is gated (free for academic use after a signed agreement with Shell Shell Technology); not redistributable.

Getting the data

Obtain from the dataset homepage.

Apply to AISHELL/Shell Shell with an institutional agreement. iOS channel is the free academic release; Android/Mic channels are dev/test only.

Suggested processing

A recommended VoxKitchen pipeline ships in the repository at voxkitchen/templates/pipelines/asr-training-data.yaml — run it with vkit docker run.