MuSe Workshop Scope
A core discipline of multimedia research is to handle big data and develop effective embedding and fusion strategies in order to combine multiple modalities and better understand multimedia content. The recordings in the database come from a fast-growing source of big data: user-generated content in a natural setting. They inherently contain three modalities: video, in the form of domain context, perceived gestures, and facial expressions; audio, through voice prosody and intonation as well as domain-dependent environmental sounds; and text, via natural spoken language.
Photo by BMW
The ACM-MM and CHI conferences have recently seen an upsurge of interest in topics related to affective computing, with ACM-MM running a dedicated area on ‘Emotional and Social Signals’ since 2015. Furthermore, automatic understanding of trustworthiness in user-generated content is an under-researched but highly relevant and novel field of human social signals.
ACM-MM seeks to find new ways of bridging the vision, audio, and language communities. Of note, MuSe’s goal is to bring together machine learning researchers from signal-oriented audio-visual processing and symbolic natural language processing, as well as researchers working specifically on affect/emotion and sentiment. This interdisciplinary approach to sentiment understanding is central to MuSe 2020 because of the unique, richly annotated modalities available in the large-scale MuSe-CAR dataset.