KeSpeech

1,542 hours of speech from 27,237 speakers across 34 cities, covering standard Mandarin and its 8 subdialects, with transcription, speaker, and subdialect labels.

Recommendation

Well suited to Mandarin ASR, accent and subdialect identification, and speaker recognition; it also includes parallel Mandarin-vs-subdialect recordings. Choose it when dialectal robustness or large speaker diversity matters. Download requires accepting a custom usage agreement.
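To judge whether a subset gives enough dialect and speaker coverage, it can help to tally audio duration and distinct speakers per subdialect label. The following is a minimal sketch only: the record layout (utterance ID, speaker ID, subdialect, duration in seconds) and the label names are hypothetical placeholders, not the dataset's actual file format or label set.

```python
from collections import defaultdict

# Hypothetical records: (utterance_id, speaker_id, subdialect, duration_seconds).
# The real dataset's metadata layout and label names may differ.
records = [
    ("utt0001", "spk001", "Mandarin", 4.0),
    ("utt0002", "spk001", "subdialect-A", 6.0),
    ("utt0003", "spk002", "subdialect-A", 2.0),
    ("utt0004", "spk003", "subdialect-B", 3.0),
]


def summarize_by_subdialect(rows):
    """Return total seconds and distinct-speaker count for each subdialect label."""
    seconds = defaultdict(float)
    speakers = defaultdict(set)
    for _utt_id, speaker_id, subdialect, duration in rows:
        seconds[subdialect] += duration
        speakers[subdialect].add(speaker_id)
    return {
        label: {"seconds": seconds[label], "speakers": len(speakers[label])}
        for label in seconds
    }


summary = summarize_by_subdialect(records)
for label, stats in sorted(summary.items()):
    print(label, stats["seconds"], stats["speakers"])
```

A heavily skewed summary suggests resampling or reweighting by subdialect before training.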

Getting the data

Obtain the data from the dataset homepage.

Published in the NeurIPS 2021 Datasets & Benchmarks track.

Suggested processing

A recommended VoxKitchen pipeline ships in the repository at voxkitchen/templates/pipelines/asr-training-data.yaml; run it with vkit docker run.