Deadline Extended: Fourth Workshop on Visually Grounded Interaction and Language

Event Notification Type: 
Call for Papers
Abbreviated Title: 
ViGIL
Location: 
Co-located with NAACL 2021
Thursday, 10 June 2021
Contact Email: 
Contact: 
ViGIL Workshop Organizers
Submission Deadline: 
Friday, 19 March 2021

We would like to invite you to submit to the Visually Grounded Interaction and Language (ViGIL) Workshop, to be held at NAACL 2021 on June 10, 2021, in Mexico City. The workshop will take place virtually or in person (see the NAACL website for more information).

ViGIL will be a one-day interdisciplinary workshop on grounded language learning, aiming to bridge the fields of human cognition and machine learning through discussions on combining language with perception and interaction. We have a series of exciting speakers from varied research domains (see the list of speakers in the addendum).

In addition, this year’s ViGIL workshop will also be hosting the 2nd GQA challenge, which focuses on compositional reasoning for visual question answering.

The paper submission deadline is March 19, 2021. Papers should be submitted to: https://cmt3.research.microsoft.com/ViGIL2021.

More details are available on the workshop website: https://vigilworkshop.github.io/.

---

You are welcome to submit a 4-page paper based on in-progress work, or your relevant paper being presented at NAACL 2021, on any of the following topics, or other topics related to the workshop:

- grounded and interactive language acquisition;
- reasoning and planning in language, vision, and interactive domains;
- machine translation with visual cues;
- transfer learning in language and vision tasks;
- visual captioning, dialog, storytelling, and question-answering;
- visual synthesis from language;
- embodied agents: language instructions, agent coordination through language, interaction;
- language-grounded robotic learning with multimodal inputs;
- human-machine interaction with language through robots or within virtual worlds;
- audio-visual scene understanding and dialog systems;
- novel tasks that combine language, vision, interactions, and other modalities;
- understanding and modeling the relationship between language and vision in humans;
- semantic systems and modeling of representations of natural language and visual stimuli in the human brain;
- epistemology and research reflections on language grounding, human embodiment, and other related topics;
- visual and linguistic cognition in infants and/or adults.

Accepted papers will be presented during joint poster sessions, with exceptional submissions selected for spotlight oral presentations. Accepted papers will be made publicly available as non-archival reports, allowing future submissions to archival conferences or journals.

Submissions should be up to 4 pages excluding references, acknowledgments, and supplementary material, and should follow the NAACL paper format. The review process will be double-blind.

We welcome review and position papers that may foster discussion. We also encourage submissions of published papers from *non-ML* fields, e.g. epistemology, cognitive science, psychology, and neuroscience, that fall within the scope of the workshop.

---
Speakers at the workshop:

- Sandra Waxman (Northwestern University) focuses on infant language acquisition, the development of concepts and language, and the relation between the two.
- Trevor Darrell (UC Berkeley), whose research interests include computer vision, language, machine learning, graphics, and perception-based human computer interfaces.
- Max Garagnani (University of London) focuses on the implementation of biologically realistic neural-network models of language, memory, and visual perception.
- Roger Levy (MIT) focuses on understanding the cognitive underpinnings of natural language processing and acquisition.
- Yejin Choi (University of Washington / AI2), whose research is at the intersection of natural language and machine learning, with interests in computer vision and digital humanities.
- Stefanie Tellex (Brown University), whose focus is constructing robots that seamlessly use natural language to communicate with humans.
- Katerina Fragkiadaki (CMU) explores building machines that understand the stories that videos portray, and using videos to teach machines about the world.
- Justin Johnson (University of Michigan / FAIR) focuses on visual reasoning, vision and language, image generation, and 3D reasoning using deep neural networks.

---
Important dates:
- Paper submission deadline: 19 March 2021 (previously: 12 March)
- Decision notifications: 15 April 2021
- Camera ready deadline: 15 May 2021
- Workshop date: 10 June 2021

All deadlines are 11:59pm Pacific Time.