2020Q1 Reports: ACL 2020

Revision as of 03:27, 18 February 2020

General Chair

Dan Jurafsky, Stanford University

The 58th annual meeting of the Association for Computational Linguistics (ACL) will take place at the Hyatt Regency Seattle in downtown Seattle, Washington, from July 5th through July 10th, 2020.

We have a great set of chairs! We are continuing 2019's new roles (Diversity and Inclusion Chairs, Remote Presentation Chairs, AV Chairs) and adding a new one (Sustainability Chairs), and we are doing well in demographic representation among our chairs (gender and region).

Following advice from last year, we have been using Slack for most intra-committee communication (and we put the Slack channel into the ACL pro space, so it can be preserved for future years), and using email only when absolutely necessary.

As usual, the growing size of the conference (in both papers and attendees) is a challenge, but we have been doing well on both papers and space (see the individual chair summaries below).

[this summary in progress]

Program Chairs

Joyce Chai, University of Michigan

Natalie Schluter, IT University of Copenhagen, Denmark

Joel Tetreault, Dataminr, USA

Local Organisation Chairs

Priscilla Rasmussen, ACL

With advice from:

Jianfeng Gao, Microsoft Research

Luke Zettlemoyer, University of Washington

Tutorial Chairs

Agata Savary, University of Tours, France

Yue Zhang, Westlake University

The call, submission, reviewing and selection of tutorials were coordinated jointly for 4 conferences: ACL, AACL-IJCNLP, COLING and EMNLP.

Before drafting the call, we collected the lists of tutorials offered within the past 4 years. We analysed previous calls for tutorials and reports from tutorial chairs (from 2016, 2017 and 2018). We consulted previous tutorial chairs with a questionnaire including questions about: the number of submissions, encouraging submissions on specific topics or from specific lecturers, the review procedure, the evaluation criteria, the post-tutorial availability of the slides and code, and lessons learned from tutorial coordination. We also discussed the publication of slides and video recordings from future tutorials with the people in charge of the ACL Anthology. As a result of these steps, we created two new sections for the ACL Conference Handbook (future chairs are kindly requested to update these documents yearly):

the list of past tutorials at ACL, COLING, EACL, EMNLP, and NAACL in 2016-2019
a tutorial chair handbook

The final call differs from previous calls in several respects: (i) the expectations about tutorial proposals were made clearer, (ii) the teachers' payment policy was replaced by a fee-waiving policy, (iii) the required submission details include two new items: diversity considerations and agreement to open access publication of slides, code, data and video recordings, and (iv) the evaluation criteria (see below) were announced in the call.

We recruited a review committee of 19 members, including the 8 tutorial chairs and 11 external members selected for their broad understanding of the NLP domain and solid experience in reviewing and/or tutorial teaching:

Review Committee

Timothy Baldwin (University of Melbourne, Australia) - AACL-IJCNLP 2020 tutorial chair
Daniel Beck (University of Melbourne, Australia) - COLING 2020 tutorial chair
Emily M. Bender (University of Washington, WA, USA)
Erik Cambria (Nanyang Technological University, Singapore)
Gaël Dias (University of Caen Normandie, France)
Stefan Evert (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Yang Liu (Tsinghua University, Beijing, China)
Agata Savary (University of Tours, France) - ACL 2020 tutorial chair
João Sedoc (Johns Hopkins University, Baltimore, MD, USA)
Lucia Specia (University of Sheffield, UK) - COLING 2020 tutorial chair
Xu Sun (Peking University, China)
Yulia Tsvetkov (Carnegie Mellon University, Pittsburgh, PA, USA)
Benjamin Van Durme (Johns Hopkins University, Baltimore, MD, USA) - EMNLP 2020 tutorial chair
Aline Villavicencio (University of Sheffield, UK and Federal University of Rio Grande do Sul, Brazil) - EMNLP 2020 tutorial chair
Taro Watanabe (Google, Inc., Tokyo, Japan)
Aaron Steven White (University of Rochester, NY, USA)
Fei Xia (University of Washington, WA, USA) - AACL-IJCNLP 2020 tutorial chair
Yue Zhang (Westlake University, Hangzhou, China) - ACL 2020 tutorial chair
Meishan Zhang (Tianjin University, China)

In total, we received 43 submissions for the 4 conferences. Each proposal received 3 reviews. The selection criteria included: clarity and preparedness, novelty or timely character of the topic, lecturers' experience, likely audience interest, open access of the teaching material, diversity aspects (multilingualism, gender, age and country of the lecturers), and compatibility with the preferred venues. We accepted 31 proposals, 2 of which were subsequently withdrawn by the authors.

The final selection for ACL 2020 consists of the following 8 tutorials of 3 hours each (each of them had ACL as the preferred or the second preferred venue):

Morning Tutorials

T1: Interpretability and Analysis in Neural NLP (cutting-edge)
Yonatan Belinkov, Sebastian Gehrmann and Ellie Pavlick
While deep learning has transformed the NLP field and impacted the larger computational linguistics community, the rise of neural networks is stained by their opaque nature: It is challenging to interpret the inner workings of neural network models, and explicate their behavior. Therefore, in the last few years, an increasingly large body of work has been devoted to the analysis and interpretation of neural network models in NLP.
This body of work is so far lacking a common framework and methodology. Moreover, approaching the analysis of modern neural networks can be difficult for newcomers to the field. This tutorial aims to fill this gap and introduce the nascent field of interpretability and analysis of neural networks in NLP.
The tutorial covers the main lines of analysis work, such as probing classifiers, behavior studies and test suites, psycholinguistic methods, visualizations, adversarial examples, and other methods. We highlight not only the most commonly applied analysis methods, but also the specific limitations and shortcomings of current approaches, in order to inform participants where to focus future efforts.

T2: Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web (cutting-edge)
Xin Luna Dong, Hannaneh Hajishirzi, Colin Lockard and Prashant Shiralkar
The World Wide Web contains vast quantities of textual information in several forms: unstructured text, template-based semi-structured webpages (which present data in key-value pairs and lists), and tables. Methods for extracting information from these sources and converting it to a structured form have been a target of research from the natural language processing (NLP), data mining, and database communities. While these researchers have largely separated extraction from web data into different problems based on the modality of the data, they have faced similar problems such as learning with limited labeled data, defining (or avoiding defining) ontologies, making use of prior knowledge, and scaling solutions to deal with the size of the Web.
In this tutorial we take a holistic view toward information extraction, exploring the commonalities in the challenges and solutions developed to address these different forms of text. We will explore the approaches targeted at unstructured text that largely rely on learning syntactic or semantic textual patterns, approaches targeted at semi-structured documents that learn to identify structural patterns in the template, and approaches targeting web tables which rely heavily on entity linking and type information.
While these different data modalities have largely been considered separately in the past, recent research has started taking a more inclusive approach toward textual extraction, in which the multiple signals offered by textual, layout, and visual clues are combined into a single extraction model made possible by new deep learning approaches. At the same time, trends within purely textual extraction have shifted toward full-document understanding rather than considering sentences as independent units. With this in mind, it is worth considering the information extraction problem as a whole to motivate solutions that harness textual semantics along with visual and semi-structured layout information. We will discuss these approaches and suggest avenues for future work.

T3: Reviewing Natural Language Processing Research (introductory)
Kevin Cohen, Karën Fort, Margot Mieskes and Aurélie Névéol
As the demand for reviewing grows, so must the pool of reviewers. As the survey presented by Graham Neubig at ACL 2019 showed, a considerable number of reviewers are junior researchers, who might lack the experience and expertise necessary for high-quality reviews. Some of them might lack the environment or the opportunities that would allow them to learn the necessary skills. A tutorial on reviewing for the NLP community might increase reviewers’ confidence, as well as the quality of the reviews. This introductory tutorial will cover the goals, processes, and evaluation of reviewing research papers in natural language processing.

T4: Stylized Text Generation: Approaches and Applications (cutting-edge)
Lili Mou and Olga Vechtomova
Text generation has played an important role in various applications of natural language processing (NLP), and in recent studies, researchers are paying increasing attention to modeling and manipulating the style of the generated text, which we call stylized text generation. In this tutorial, we will provide a comprehensive literature review in this direction. We start from the definition of style and different settings of stylized text generation, illustrated with various applications. Then, we present different settings of stylized generation, such as parallel supervised, style label-supervised, and unsupervised. In each setting, we delve deep into machine learning methods, including embedding learning techniques to represent style, adversarial learning and reinforcement learning with cycle consistency to match content but to distinguish different styles. We also introduce current approaches to evaluating stylized text generation systems. We conclude our tutorial by presenting the challenges of stylized text generation and discussing future directions, such as small-data training, non-categorical style modeling, and a generalized scope of style transfer (e.g., controlling the syntax as a style).

Afternoon Tutorials

T5: Achieving Common Ground in Multi-modal Dialogue (cutting-edge)
Malihe Alikhani and Matthew Stone
All communication aims at achieving common ground (grounding): interlocutors can work together effectively only with mutual beliefs about what the state of the world is, about what their goals are, and about how they plan to make their goals a reality. Computational dialogue research offers some classic results on grounding, which unfortunately offer scant guidance to the design of grounding modules and behaviors in cutting-edge systems. In this tutorial, we focus on three main topic areas: 1) grounding in human-human communication; 2) grounding in dialogue systems; and 3) grounding in multi-modal interactive systems, including image-oriented conversations and human-robot interactions. We highlight a number of achievements of recent computational research in coordinating complex content, show how these results lead to rich and challenging opportunities for doing grounding in more flexible and powerful ways, and canvass relevant insights from the literature on human-human conversation. We expect that the tutorial will be of interest to researchers in dialogue systems, computational semantics and cognitive modeling, and hope that it will catalyze research and system building that more directly explores the creative, strategic ways conversational agents might be able to seek and offer evidence about their understanding of their interlocutors.

T6: Commonsense Reasoning for Natural Language Processing (introductory)
Maarten Sap, Vered Shwartz, Antoine Bosselut, Dan Roth and Yejin Choi
In our tutorial, we (1) outline the various types of commonsense (e.g., physical, social), and (2) discuss techniques to gather and represent commonsense knowledge, while highlighting the challenges specific to this type of knowledge (e.g., reporting bias). We will then (3) discuss the types of commonsense knowledge captured by modern NLP systems (e.g., large pretrained language models), and (4) present ways to measure systems' commonsense reasoning abilities. We finish with (5) a discussion of various ways in which commonsense reasoning can be used to improve performance on NLP tasks, exemplified by (6) an interactive session on integrating commonsense into a downstream task.

T7: Integrating Ethics into the NLP Curriculum (introductory)
Emily M. Bender, Dirk Hovy and Alexandra Schofield
Our goal in this tutorial is to empower NLP researchers and practitioners with tools and resources to teach others about how to ethically apply NLP techniques. Our tutorial will present both high-level strategies for developing an ethics-oriented curriculum, based on experience and best practices, and specific sample exercises that can be brought to a classroom. We plan to make this a highly interactive work session culminating in a shared online resource page that pools lesson plans, assignments, exercise ideas, reading suggestions, and ideas from the attendees. In our session, we consider three primary topics that frequently underlie ethical issues in NLP research: dual use, bias and privacy.
In this setting, a key lesson is that there is no single approach to ethical NLP: each project requires thoughtful consideration about what steps can be taken to best support people affected by that project. However, we can learn (and teach) what kinds of issues to be aware of and what kinds of strategies are available for mitigating harm. To teach this process, we apply and promote interactive exercises that provide an opportunity to ideate, discuss, and reflect. We plan to facilitate this in a way that encourages positive discussion, emphasizing the creation of ideas for the future instead of negative opinions of previous work.

T8: Recent Advances in Open-Domain Question Answering (cutting-edge)
Danqi Chen and Scott Wen-tau Yih
Open-domain (textual) question answering (QA), the task of finding answers to open-domain questions by searching a large collection of documents, has been a long-standing problem in NLP, information retrieval (IR) and related fields (Voorhees et al., 1999; Moldovan et al., 2000; Brill et al., 2002; Ferrucci et al., 2010). Traditional QA systems were usually constructed as a pipeline, consisting of many different components such as question processing, document/passage retrieval and answer processing. With the rapid development of neural reading comprehension (Chen, 2018), modern open-domain QA systems have been restructured by combining traditional IR techniques and neural reading comprehension models (Chen et al., 2017; Yang et al., 2019) or even implemented in a fully end-to-end fashion (Lee et al., 2019; Seo et al., 2019). While the system architecture has been drastically simplified, two technical challenges remain critical: (1) “Retriever”: finding documents that (might) contain an answer from a large collection of documents; (2) “Reader”: finding the answer in a given paragraph or a document.
In this tutorial, we aim to provide a comprehensive and coherent overview of recent advances in this line of research. We will start by giving a brief historical background of open-domain question answering, discussing the basic setup and core technical challenges of the research problem. The focus will then shift to modern techniques and resources proposed for open-domain QA, including the basics of the latest neural reading comprehension systems, new datasets and models. The scope will also be broadened to cover the information retrieval component and how to effectively identify passages relevant to the questions. Moreover, in-depth discussions will be given on the use of traditional / neural IR modules, as well as the trade-offs between modular design and end-to-end training. If time permits, we also plan to discuss some hybrid approaches for answering questions using both text and large knowledge bases (e.g., Sun et al., 2018) and give a critical review of how structured data complements the information from unstructured text.
At the end of our tutorial, we will discuss some important questions, including (1) How much progress have we made compared to the QA systems developed in the last decade? (2) What are the main challenges and limitations of current approaches? (3) How to trade off the efficiency (computational time and memory requirements) and accuracy in the deep learning era? We hope that our tutorial will not only serve as a useful resource for the audience to efficiently acquire up-to-date knowledge, but also provide new perspectives to stimulate the advances of open-domain QA research in the next phase.

Workshop Chairs

Milica Gašić, Heinrich Heine University Düsseldorf

Dilek Hakkani-Tur, Amazon Alexa AI

Saif M. Mohammad, National Research Council Canada

Ves Stoyanov, Facebook AI

Student Research Workshop Chairs and Faculty Advisors

Rotem Dror, Technion - Israel Institute of Technology

Jiangming Liu, The University of Edinburgh

Shruti Rijhwani, Carnegie Mellon University

Omri Abend, Hebrew University of Jerusalem

Sujian Li, Peking University

Zhou Yu, University of California, Davis

Audio-Video Chairs

Hamid Palangi, Microsoft Research, Redmond

Lianhui Qin, University of Washington

Conference Handbook Chair

Nanyun Peng, University of Southern California

Demo Chairs

Asli Celikyilmaz, Microsoft Research, Redmond

Shawn Wen, PolyAI

Diversity & Inclusion (D&I) Chairs

Cecilia Ovesdotter Alm, Rochester Institute of Technology

Vinodkumar Prabhakaran, Google

Local Sponsorship Chairs

Hoifung Poon, Microsoft

Kristina Toutanova, Google

Publication Chairs

Steven Bethard, University of Arizona

Ryan Cotterell, University of Cambridge

Rui Yan, Peking University

Starting from the style files from ACL 2019, we have produced new LaTeX style files for ACL 2020. Most of the description was retained, but the order of sections was overhauled to make sure that important information wasn't scattered so haphazardly across the document. Other improvements were also made, like using the recommended citation style consistently throughout the LaTeX source, and separating out all the LaTeX-specific stuff into clearly marked sections. The MS Word version was derived from these LaTeX versions to match as closely as possible. The LaTeX version was also posted to the Overleaf gallery. The most recent .bib file for the entire ACL Anthology was included in the style file distribution to encourage authors to use the official citations for ACL Anthology publications. All style file changes were merged into https://github.com/acl-org/acl-pub/tree/gh-pages/paper_styles.

Publicity Chair

Emily M. Bender, University of Washington

Dissemination

Durable accounts for the ACL meeting on Twitter and Facebook have been created:

* https://twitter.com/aclmeeting
* https://www.facebook.com/aclmeeting/

These will be passed along to the ACL 2021 publicity chair(s) so that they don't have to build up followers separately. As of Feb 4, 2020 the Twitter account has 4,061 followers and the Facebook account has 181. We have not yet been making use of the Instagram account, but we have been using the Twitter and Facebook accounts to publicize important dates as well as blog posts. The Twitter account especially has been useful for fielding questions from the community. Calls for papers have also gone out over the ACL member portal and several mailing lists, as well as websites such as WikiCFP. (These are maintained in a spreadsheet which can be handed off to the ACL 2021 publicity chair(s).)

Next Steps

* Recruit co-chairs, especially to coordinate live-tweeting of the conference
* Contact local media for coverage
* Develop land acknowledgement in consultation with the Duwamish Tribe (on whose land the meeting will take place). The Duwamish publish this information about land acknowledgments: https://www.duwamishtribe.org/land-acknowledgement

Remote Presentation Chairs

Hao Fang, Microsoft Semantic Machines

Yi Luan, Google AI Language

Sustainability Chairs

Ananya Ganesh, Educational Testing Service

Klaus Zechner, Educational Testing Service

Our main goal for this new focus area is to engage the ACL community in discussions about how best to reduce the carbon footprint of future ACL conferences in order to contribute to sustainable and livable conditions on this planet. Because air travel is the main contributor to the carbon footprint of international conferences, one of the main directions we are currently envisioning is to encourage and support virtual participation by conference attendees through live streaming of conference events.

Website & Conference App Chairs

Sudha Rao, Microsoft Research, Redmond

Yizhe Zhang, Microsoft Research, Redmond

Business Office