AISHELL-3
85-hour multi-speaker Mandarin TTS corpus with 218 speakers in clean recording conditions; the standard Chinese multi-speaker TTS baseline.
- Task: tts
- Languages: zh
- Hours: 85
- Domain: read speech
- License: CC BY-NC-ND 4.0
- Homepage: https://www.openslr.org/93
- Paper: https://arxiv.org/abs/2010.11567
Recommendation
Best starting point for multi-speaker Mandarin TTS. Clean studio conditions and a large speaker pool make it suitable for voice cloning research. Non-commercial license — check before production use.
Getting the data
Downloadable via VoxKitchen (aishell3, source: openslr, size: 17.7 GB):
vkit docker download --tag slim aishell3 --root ./data/aishell3
Subsets: data_aishell3.
Suggested processing
A recommended VoxKitchen pipeline ships in the repository at voxkitchen/templates/pipelines/tts-data-prep.yaml — run it with vkit docker run.