SIGDAT - Summer 2011 Report

SIGDAT is ACL's special interest group for linguistic data and corpus-based approaches to NLP.

On October 9-11, 2010, SIGDAT held its annual Conference on Empirical Methods in Natural Language Processing (EMNLP) at the Stata Center in Cambridge, Massachusetts. Hang Li and Lluís Màrquez served as program co-chairs, Regina Barzilay was local arrangements chair and Eric Fosler-Lussier was publications chair. There were 18 area chairs and 460 reviewers. Kevin Knight, Andrew Ng and Amit Singhal presented invited talks. 500 submissions were received, and 125 papers were accepted (an overall rate of 25%). 70 papers (14%) were accepted for oral presentation, and 55 (11%) were accepted for poster presentation. The Fred Jelinek Best Paper Award was presented to Terry Koo, Alexander Rush, Michael Collins, Tommi Jaakkola and David Sontag for "Dual Decomposition for Parsing with Non-Projective Head Automata". The conference was held in 3 parallel sessions and the proceedings exceeded 1300 pages. Nearly 300 people attended, and the relatively large budget surplus will help keep registration fees low for upcoming meetings.

In July 2011, SIGDAT will present the 15th anniversary EMNLP conference in Edinburgh, Scotland, at the University of Edinburgh John McIntyre Conference Centre and Informatics Forum. The main session will take place on July 27-29, followed by 2 days of workshops on July 30-31. Paola Merlo is the general chair, Regina Barzilay and Mark Johnson are program co-chairs, Bonnie Webber is the local arrangements chair, Marie Candito is the workshop chair and Wanxiang Che is the publications chair. 626 submissions were received, and 149 papers were accepted (an overall rate of 23.8%). 95 papers (15%) were accepted for oral presentation, and 54 (9%) were accepted for poster presentation. Three parallel sessions are planned. In addition, for the first time, the EMNLP conference will be held as a stand-alone anchor event with 7 associated workshops. These will include:

  • WMT11 — Sixth Workshop on Statistical Machine Translation
  • UNSUP-2011 — First Workshop on Unsupervised Learning in NLP
  • SLPAT-2011 — Workshop on Speech and Language Processing for Assistive Technologies
  • TextInfer-2011 — Workshop on Textual Entailment
  • GEMS-2011 — GEometrical Models of Natural Language Semantics
  • Dialects-2011 — First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties
  • UCNLG+Eval — The 4th UCNLG Workshop: Language Generation and Evaluation

The EMNLP 2012 meeting is planned to be held jointly with CoNLL 2012, co-located with ACL 2012 in Jeju, Korea.

Overall, SIGDAT has a continuing history of organizational success, with five consecutive years of 400-650 submissions and 230-300+ attendees. This suggests that SIGDAT can viably continue to organize off-phase and/or stand-alone EMNLP meetings when the NLP conference calendar is relatively thin (as in 2011), while helping to provide critical mass to co-located ACL conferences in very crowded years such as 2012.