A full-day workshop in conjunction with ACM Multimedia 2021

Please scroll down for instructions on how to get access to the data.


The MuSe-Wilder and MuSe-Sent sub-challenges use the MuSe-CaR database. It is a large, multimodal dataset gathered in-the-wild to further the understanding of multimodal sentiment analysis in real-world settings, e.g., the emotional engagement that takes place during product reviews (here, automobile reviews), where a sentiment is linked to a topic or entity.

The estimated age of the professional, semi-professional (influencers), and casual reviewers ranges from the mid-20s to the late 50s. Most are native English speakers from the UK or the US, while a small minority are non-native, yet fluent, English speakers. We designed MuSe-CaR to be of high voice and video quality, as both informative social media video content and everyday recording devices have improved in recent years. [dataset paper] [last year's baselines] [baseline paper] {baseline git code}

Symbolic photo by Tim Gouw on Unsplash; not taken from the database (for privacy reasons)


For MuSe-Physio and MuSe-Stress, the novel Ulm-TSST database is used, supplying a multimodal, richly annotated dataset of self-reported and external dimensional ratings of emotion and mental well-being. After a brief period of preparation, the subjects are asked to give an oral presentation within a job-interview setting. Ulm-TSST includes biological recordings, such as Electrocardiogram (ECG), Electrodermal Activity (EDA), Respiration, and Heart Rate (BPM), as well as continuous arousal and valence annotations. With 105 participants (69.5% female) aged between 18 and 39 years, a total of 10 hours was accumulated. [baseline paper] {baseline git code}

Get the Data

Please download the respective End User License Agreement (EULA), fill it out and send it via the form below or to to access the data. We will review your application and get in touch as soon as possible. [baseline paper] {baseline code}

  • MuSe-Wilder and MuSe-Sent: MuSe-CaR EULA

  • MuSe-Stress and MuSe-Physio: Ulm-TSST EULA


You have to be logged in with a Google account in order to see the form above (otherwise it is greyed out).

If you do not have or do not want a Google account, you can also submit the form to Processing may take longer.

Both datasets may be used for research purposes only.

EULA Summary

1. Intended use

The MuSe 2021 datasets can only be used for the purpose of benchmarking audio, video, or audiovisual affect and fusion recognition systems, according to the guidelines of MuSe 2021.

2. Responsibility

The signee, who is responsible for the team, must hold a permanent position at an academic institute. Up to five other researchers affiliated with the same institute (e.g., PhD students) may be named, which allows them to work with this dataset. We are not responsible for the content or the meaning of the videos.

3. Data storage

Without protection, the data may be stored only on the computers of the signees of this document. If stored on a local network, the data must be subject to user-level access control.

4. Commercial use

Users are not allowed to use the database for any commercial purposes. The database and annotations are available for non-commercial research purposes only. Commercial purposes include, but are not limited to:

  • proving the efficiency of commercial systems

  • testing commercial systems

  • using screenshots of subjects from the database in advertisements

  • selling data from the database

5. Distribution

The user may not distribute the database in any way. No portion (screenshots, audio clips, etc.) may be distributed in any publication or presentation, with the exception of presentations and documents in the context of this challenge and the workshop. Participants agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit for any commercial purpose the data, any portion of the images, or any portion of derived data. They also agree not to further copy, publish, or distribute any portion of the annotations of the dataset. The user will forward all requests for copies of the database to the MuSe database administrators.

6. External systems

  • Participants can use the raw as well as preprocessed scene/background/audio/body pose etc. features along with the provided information.

  • Participants are free to use external data for training along with the MuSe data. However, this use should be reproducible and clearly discussed in the accompanying paper. Participants are also free to use any commercial or academic feature extractors, pre-trained networks, and libraries.

At the end of the challenges, we may ask the best teams to send us:

- team name, name of team members

- the predictions on the test set (or the submission number of a previous one)

- a link to a Github repository or the zipped source code including parameters for the replication of the results

- a link to an arXiv paper of 2-6 pages describing their proposed methodology, the data used, and the results.

We encourage participants to submit a paper describing their solution and results to our workshop. The winners of each sub-challenge must submit a paper in order to be announced as winners. Papers submitted to the Challenge Workshop for inclusion in the ACM Multimedia proceedings have to meet the conference requirements.