CFP: The Second Workshop on Multimodal Semantic Representations

Event Notification Type: 
Call for Papers
Abbreviated Title: 
MMSR II
Location: 
Co-located with ECAI 2024 (https://www.ecai2024.eu/)
Saturday, 19 October 2024
Country: 
Spain
City: 
Santiago de Compostela
Contact: 
Kenneth Lai
Lucia Donatelli
Ricky Brutti
James Pustejovsky
Nikhil Krishnaswamy
Submission Deadline: 
Wednesday, 15 May 2024

The demand for more sophisticated, natural human-computer and human-robot interactions is rapidly increasing as users grow accustomed to conversing with AI and NLP systems. Such interactions require not only the robust recognition and generation of expressions through multiple modalities (language, gesture, vision, action, etc.), but also the encoding of situated meaning.

When communication becomes multimodal, each modality in operation provides an orthogonal angle through which to probe the computational models of the other modalities, including the behaviors and communicative capabilities each affords. Multimodal interactions thus require a unified framework and control language through which systems interpret inputs and behaviors and generate informative outputs. This is vital for intelligent and often embodied systems to understand the situation and context they inhabit, whether in the real world or in a mixed-reality environment shared with humans.

Furthermore, multimodal large language models appear to offer the possibility of more dynamic and contextually rich interactions across various modalities, including facial expressions, gestures, actions, and language. We invite discussion on how semantic representations and processing pipelines might integrate such state-of-the-art language models.

We solicit papers on multimodal semantic representation, including but not limited to the following topics:

- Semantic frameworks for individual linguistic co-modalities (e.g., gaze, facial expressions);
- Formal representation of situated conversation and embodiment, including knowledge graphs designed to represent epistemic state;
- Design, annotation, and corpora of multimodal interaction and meaning representation;
- Challenges (including cross-lingual and cross-cultural) in multimodal representation and/or processing;
- Criteria or frameworks for evaluation of multimodal semantics;
- Challenges in aligning co-modalities in formal representation and/or NLP tasks;
- Design and implementation of neurosymbolic or fusion models for multimodal processing (with a representational component);
- Methods for probing knowledge of multimodal (language and vision) models;
- Virtual and situated agents that embody multimodal representations of common ground.

Submission Information
Two types of submissions are solicited: long papers and short papers. Long papers should describe original research and must not exceed 8 pages, excluding references. Short papers (typically system or project descriptions, or reports on ongoing research) must not exceed 4 pages, excluding references. Both types of papers will be published in the workshop proceedings, and accepted papers will be allowed one additional page in the camera-ready version.

We strongly encourage students to submit to the workshop.