CREMA-D

7,442 acted audio-visual emotional clips from 91 demographically diverse actors speaking 12 sentences in 6 emotions at 4 intensity levels.

Recommendation

Good for demographically diverse acted emotion recognition and audio-visual affect work; crowd-sourced perceptual ratings are included. Choose it when speaker or ethnic diversity matters, keeping in mind that the emotion is acted and the corpus is modest in size.

Getting the data

Obtain from the dataset homepage.

Openly available on GitHub under Open Data Commons licenses.
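
Once the clips are downloaded, the labels can usually be read straight from the file names. The sketch below assumes the naming scheme used in the public CREMA-D repository, ActorID_Sentence_Emotion_Level (for example 1001_DFA_ANG_XX.wav), which this page does not spell out, so verify it against your copy. The names ClipLabel and parse_clip_name are hypothetical helpers for illustration, not part of VoxKitchen.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumed code tables, matching the 6 emotions and 4 intensity levels
# described above. Check them against the files you actually downloaded.
EMOTIONS = {
    "ANG": "anger",
    "DIS": "disgust",
    "FEA": "fear",
    "HAP": "happy",
    "NEU": "neutral",
    "SAD": "sad",
}
LEVELS = {"LO": "low", "MD": "medium", "HI": "high", "XX": "unspecified"}


@dataclass(frozen=True)
class ClipLabel:
    """Labels recovered from one clip's file name (hypothetical helper type)."""
    actor_id: int
    sentence: str
    emotion: str
    level: str


def parse_clip_name(path):
    """Split an ActorID_Sentence_Emotion_Level file name into its labels."""
    stem = Path(path).stem  # drop the directory and the extension
    parts = stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected file name layout: {stem!r}")
    actor, sentence, emotion, level = parts
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion code: {emotion!r}")
    if level not in LEVELS:
        raise ValueError(f"unknown level code: {level!r}")
    return ClipLabel(int(actor), sentence, EMOTIONS[emotion], LEVELS[level])


if __name__ == "__main__":
    print(parse_clip_name("AudioWAV/1001_DFA_ANG_XX.wav"))
```

Rejecting unknown codes rather than passing them through makes a renamed or partially downloaded file fail loudly instead of silently producing a bad label.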

Suggested processing

A recommended VoxKitchen pipeline ships in the repository at examples/pipelines/emotion-recognize.yaml. Run it with vkit docker run.