MSP-Podcast

Large-scale naturalistic emotional speech mined from Creative Commons podcasts, annotated by multiple raters with categorical emotions and valence/arousal/dominance attributes.

Recommendation

Best for realistic, in-the-wild speech emotion recognition at scale; it is the standard benchmark for recent SER challenges. Caveats: size is version-dependent and grows with each release, naturalistic emotion yields lower inter-rater agreement than acted speech, and access is gated behind a signed institutional agreement.
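Because each utterance carries ratings from several annotators, and agreement on natural emotion is comparatively low, it usually helps to collapse the ratings into one label plus an agreement score before training. The sketch below is one possible way to do that: a plurality vote over the categorical labels and a plain mean over the attributes. The input layout (a list of dicts with the keys emotion, valence, arousal and dominance) and the function name aggregate_ratings are assumptions made for illustration, not the corpus's actual label file format or its official consensus rule.

```python
from collections import Counter
from statistics import mean

ATTRIBUTES = ("valence", "arousal", "dominance")


def aggregate_ratings(ratings):
    """Collapse per-rater annotations for one utterance.

    `ratings` is a hypothetical structure: a non-empty list of dicts,
    each with an 'emotion' string and numeric 'valence', 'arousal'
    and 'dominance' scores. Adapt it to the real label files.
    """
    if not ratings:
        raise ValueError("need at least one rating")

    # Plurality vote over the categorical labels.
    counts = Counter(r["emotion"] for r in ratings)
    ranked = counts.most_common()
    best_count = ranked[0][1]
    winners = [label for label, n in ranked if n == best_count]

    result = {
        # A tie means no consensus; leave the label unset.
        "emotion": winners[0] if len(winners) == 1 else None,
        # Share of raters who chose the most frequent label.
        "agreement": best_count / len(ratings),
    }

    # Attribute scores are averaged across raters.
    for name in ATTRIBUTES:
        result[name] = mean(r[name] for r in ratings)
    return result


if __name__ == "__main__":
    example = [
        {"emotion": "happy", "valence": 5, "arousal": 4, "dominance": 4},
        {"emotion": "happy", "valence": 6, "arousal": 5, "dominance": 4},
        {"emotion": "neutral", "valence": 4, "arousal": 3, "dominance": 4},
    ]
    print(aggregate_ratings(example))
```

Utterances whose label comes back as None, or whose agreement falls below a threshold you choose, can be filtered out or handled with soft labels, depending on the experiment.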

Getting the data

Obtain from the dataset homepage.

Requires a free academic license signed by your institution; released by the UT Dallas MSP Lab. The corpus is continually expanding, so its size depends on the release you obtain.

Suggested processing

A recommended VoxKitchen pipeline ships in the repository at examples/pipelines/emotion-recognize.yaml; run it with vkit docker run.