2014Q3 Reports: Program Chairs
Program Chairs (Kristina Toutanova and Hua Wu)
Innovations
As compared to ACL conferences in prior years, the main innovations this year were:
- We allocated reviewers to areas based on the reviewers’ preferences over areas, the preferences of area chairs over reviewers, and the number of submissions in each area (using a tool developed by Mark Dredze and applied to ACL with the help of Jiang Guo; this tool had been applied to NAACL and EMNLP in the past but was applied at ACL for the first time this year).
- We optimized the conference schedule based on feedback from attendees on the talks they would like to see.
- All talks will be recorded at ACL 2014, as in NAACL 2013.
- We scheduled two large poster sessions on two evenings of the conference, to accommodate the large number of poster presentations. Instead of a banquet, we will have a no-fee light social event, and the president’s talk will be given on the first morning of the conference.
- We grouped oral TACL papers in thematic sessions together with ACL paper presentations, instead of in separate TACL-only sessions.
- We solicited nominations for outstanding reviewers and acknowledged them in the proceedings (around 14% of reviewers were acknowledged).
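The first item above combines three inputs: reviewer preferences over areas, area chair preferences over reviewers, and per-area submission counts. The sketch below illustrates one simple way such inputs could be combined. It is not Mark Dredze's tool, whose internals this report does not describe. The function names, the additive scoring, the greedy strategy, and the toy reviewers are all assumptions made for illustration; only the three submission counts come from the table in this report.

```python
from collections import Counter


def area_quotas(submissions, n_reviewers):
    """Split n_reviewers across areas in proportion to submission counts,
    using largest-remainder rounding so the quotas sum to n_reviewers."""
    total = sum(submissions.values())
    exact = {a: n_reviewers * s / total for a, s in submissions.items()}
    quotas = {a: int(x) for a, x in exact.items()}
    leftover = n_reviewers - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda a: (exact[a] - quotas[a], a), reverse=True)
    for area in by_remainder[:leftover]:
        quotas[area] += 1
    return quotas


def allocate(reviewer_prefs, chair_prefs, submissions):
    """Greedy allocation (an assumed strategy): visit (reviewer, area) pairs from
    highest to lowest combined preference, filling each area up to its quota."""
    quotas = area_quotas(submissions, len(reviewer_prefs))
    candidates = []
    for reviewer, prefs in reviewer_prefs.items():
        for area in submissions:
            score = prefs.get(area, 0) + chair_prefs.get(area, {}).get(reviewer, 0)
            candidates.append((score, reviewer, area))
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    assignment = {}
    for _score, reviewer, area in candidates:
        if reviewer not in assignment and quotas[area] > 0:
            assignment[reviewer] = area
            quotas[area] -= 1
    return assignment


# Submission counts for three areas, taken from the table in this report.
submissions = {"Machine Translation": 148, "Semantics": 90, "Generation": 20}

# Hypothetical reviewers and preference scores (higher means stronger preference).
reviewer_prefs = {
    "r1": {"Machine Translation": 3, "Semantics": 1},
    "r2": {"Machine Translation": 3},
    "r3": {"Semantics": 3, "Generation": 2},
    "r4": {"Machine Translation": 2, "Generation": 3},
    "r5": {"Semantics": 2, "Machine Translation": 2},
    "r6": {"Machine Translation": 1, "Semantics": 1},
}
# Hypothetical area chair preferences over reviewers.
chair_prefs = {
    "Machine Translation": {"r1": 2, "r2": 1},
    "Semantics": {"r3": 2, "r5": 1},
    "Generation": {"r4": 2},
}

quotas = area_quotas(submissions, len(reviewer_prefs))
assignment = allocate(reviewer_prefs, chair_prefs, submissions)
```

A greedy pass is easy to follow but not optimal in general; a real allocation would likely need a proper assignment solver and constraints such as conflicts of interest.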
Submissions and Presentations
ACL 2014 received a total of 1123 submissions, of which 572 were long papers and 551 were short papers. 15 long papers and 19 short papers were rejected without review due to non-anonymity or formatting issues. The remaining submissions were assigned to one of 20 areas, and managed by a program committee of 33 area chairs and 779 reviewers.
146 (26.2%) of the 557 qualifying long papers and 139 (26.1%) of the 532 qualifying short papers were selected for presentation at the conference. Of the accepted long papers, 95 were selected for oral presentation, and 51 for poster presentation. Of the accepted short papers, 51 have oral and 88 have poster presentation slots. The oral versus poster decision was based not on the quality of the work, but on the estimated appeal to a wide audience.
In addition, 19 TACL papers will be presented at ACL – 13 as talks and 6 as posters. Including TACL papers, there will be 159 oral and 145 poster presentations at the main ACL conference. The table below shows the number of reviewed submissions in each area for long and short papers, as well as the number of papers accepted in each area. The table also shows the number of qualifying long and short papers that were withdrawn prior to the completion of the review process (11 long and 34 short papers were withdrawn).
Area | Long Received | Long Accepted | Short Received | Short Accepted | Total Submissions | Percent of Total Submissions | Total Accepts | Percent of Total Accepts | Area Accept Rate |
Cognitive Modeling and Psycholinguistics | 9 | 3 | 14 | 4 | 23 | 2.11% | 7 | 2.46% | 30.43% |
Dialogue and Interactive Systems | 10 | 2 | 8 | 2 | 18 | 1.65% | 4 | 1.40% | 22.22% |
Discourse, Coreference, and Pragmatics | 22 | 5 | 20 | 5 | 42 | 3.86% | 10 | 3.51% | 23.81% |
Document Categorization, Sentiment Analysis, and Topic Models | 53 | 14 | 48 | 13 | 101 | 9.27% | 27 | 9.47% | 26.73% |
Generation | 13 | 6 | 7 | 4 | 20 | 1.84% | 10 | 3.51% | 50.00% |
Information Extraction and Text Mining | 54 | 13 | 49 | 14 | 103 | 9.46% | 27 | 9.47% | 26.21% |
Information Retrieval | 8 | 2 | 9 | 2 | 17 | 1.56% | 4 | 1.40% | 23.53% |
Language Resources and Evaluation | 31 | 8 | 28 | 10 | 59 | 5.42% | 18 | 6.32% | 30.51% |
Lexical Semantics and Ontology | 26 | 7 | 23 | 6 | 49 | 4.50% | 13 | 4.56% | 26.53% |
Machine Learning for Language Processing | 39 | 13 | 15 | 5 | 54 | 4.96% | 18 | 6.32% | 33.33% |
Machine Translation | 76 | 18 | 72 | 19 | 148 | 13.59% | 37 | 12.98% | 25.00% |
Multilinguality and Multimodal NLP | 12 | 3 | 14 | 3 | 26 | 2.39% | 6 | 2.11% | 23.08% |
NLP Applications and NLP-enabled Technology | 32 | 6 | 34 | 9 | 66 | 6.06% | 15 | 5.26% | 22.73% |
NLP for the Web and Social Media | 29 | 5 | 29 | 7 | 58 | 5.33% | 12 | 4.21% | 20.69% |
Question Answering | 6 | 2 | 4 | 0 | 10 | 0.92% | 2 | 0.70% | 20.00% |
Semantics | 53 | 16 | 37 | 11 | 90 | 8.26% | 27 | 9.47% | 30.00% |
Summarization | 19 | 6 | 11 | 4 | 30 | 2.75% | 10 | 3.51% | 33.33% |
Spoken Language Processing | 9 | 2 | 10 | 3 | 19 | 1.74% | 5 | 1.75% | 26.32% |
Tagging, Chunking, Syntax and Parsing | 35 | 12 | 48 | 14 | 83 | 7.62% | 26 | 9.12% | 31.33% |
Phonology, Morphology and Word Segmentation | 10 | 4 | 18 | 4 | 28 | 2.57% | 8 | 2.81% | 28.57% |
Withdrawn | 11 | 0 | 34 | 0 | 45 | 4.13% | 0 | 0% | 0% |
Total | 557 | 146 | 532 | 139 | 1089 | 100% | 285 | 100% | 26.17% |
The five areas with the most submissions this year were Machine Translation, Information Extraction, Document Categorization/Sentiment Analysis/Topic Models, Semantics, and Tagging/Chunking/Syntax/Parsing.
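As a quick arithmetic check, the headline figures in this section can be re-derived from the counts stated in the text. The snippet below is only a worked calculation; the variable names and the `rate` helper are ours, and every number is copied from the paragraphs above.

```python
def rate(accepted, reviewed):
    """Acceptance rate as a percentage, rounded to two decimals."""
    return round(100.0 * accepted / reviewed, 2)


# Counts stated in the text of this section.
long_submitted, short_submitted = 572, 551
long_desk_rejected, short_desk_rejected = 15, 19
long_accepted, short_accepted = 146, 139

# Qualifying papers are the submissions that were not rejected without review.
long_qualifying = long_submitted - long_desk_rejected
short_qualifying = short_submitted - short_desk_rejected

# Oral and poster totals: long + short + TACL.
oral_total = 95 + 51 + 13
poster_total = 51 + 88 + 6

if __name__ == "__main__":
    print("long accept rate (%):", rate(long_accepted, long_qualifying))
    print("short accept rate (%):", rate(short_accepted, short_qualifying))
    print("overall accept rate (%):",
          rate(long_accepted + short_accepted, long_qualifying + short_qualifying))
    print("oral / poster presentations:", oral_total, poster_total)
```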
Review process
As mentioned above, we used Mark Dredze’s tool for reviewer assignment, which Jiang Guo helped apply to ACL this year. Thanks to the tool and the collaboration among the area chairs, the committee organized a smooth review process with a sufficient number of expert reviewers. The submissions were reviewed under different categories, using different review forms for empirical/data-driven, theoretical, applications/tools, resources/evaluation, and survey papers. For the short papers we additionally used a negative results category, and we were glad to see that the community is becoming more open to publishing useful negative results (two of the six short papers submitted under the negative results category were accepted; additional papers reporting negative results but submitted under other categories were also accepted).
We modified the review forms slightly. One modification was to change the final recommendation range from the 1-to-6 range used last year to a 1-to-5 range. We believe this reduces some unnecessary variance in the scores. Another modification was to reword the questions asking reviewers to check whether any submitted code was well structured, because we thought a thorough code review would be an unrealistic demand on the reviewers.
We increased the page limit to 9 pages for submitted long papers and 5 pages for submitted short papers and kept the same limits for the camera-ready versions. The change will likely not be implemented next year.
Formatting and anonymity were checked with the help of the area chairs and the student volunteer Jiang Guo. More than 10% of the submissions had author names in the properties of their files. We chose to remove this identifying information manually, but it would be better if this could be automated. Jiang Guo and Jason Eisner helped us apply automated ways to do this, but they did not work in all cases. Papers that had author names listed under the title were rejected without review unless the authors sent us an email acknowledging their error within 48 hours of the deadline; in that case we removed the author list from the paper for them.
We did not reject papers for non-anonymity of their supplementary data, software, or notes, since the call for papers did not say anything about whether such materials should be anonymized.
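To illustrate what an automated check of file properties might look like, here is a deliberately crude sketch. It is not the method applied by Jiang Guo and Jason Eisner, which this report does not describe. It scans raw PDF bytes for an uncompressed `/Author (...)` literal string in the document information dictionary. The function name and the sample bytes are hypothetical, and the heuristic misses names stored as hex strings, inside compressed object streams, or in XMP metadata, which fits the observation above that automation did not work in all cases.

```python
import re

# Matches "/Author (value)" where value is a PDF literal string.
# Escaped characters such as "\)" are allowed inside the value; nested
# unescaped parentheses are NOT handled by this simplified pattern.
AUTHOR_ENTRY = re.compile(rb"/Author\s*\(((?:\\.|[^\\)])*)\)")


def find_author_metadata(pdf_bytes):
    """Return the non-empty /Author values found in raw PDF bytes.

    A heuristic only: an empty result does not prove the file is anonymous.
    """
    values = []
    for match in AUTHOR_ENTRY.finditer(pdf_bytes):
        value = match.group(1).strip()
        if value:
            values.append(value.decode("latin-1"))
    return values


# Hypothetical fragments standing in for the Info dictionary of a submission.
flagged = find_author_metadata(b"<< /Title (Untitled) /Author (Jane Doe) >>")
clean = find_author_metadata(b"<< /Title (Untitled) /Author () >>")
```

A check like this could warn authors at upload time, while a dedicated PDF library would be needed for anything approaching complete coverage.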
Best paper awards
The area chairs nominated sixteen papers for the long best paper award. We selected a list of seven candidates, which were ranked by a specially formed best paper committee of seven members: six area chairs and one external member. Based on the ranking, we selected a long best paper and a long best student paper winner. The selection of a best short paper is in process. The best long papers will be presented in a plenary session, whereas the best short paper will be presented in one of the parallel sessions.
Presentations
The oral presentations are arranged in five parallel sessions. There are two large poster sessions including dinner on two evenings of the conference.
We optimized the conference schedule based on feedback from attendees on the talks they would like to see. We collected attendee responses using a scheduling survey developed with the help of David Yarowsky and Svitlana Volkova, and we optimized the conference schedule to assign popular sessions to large conference rooms and to reduce the chance that two talks an attendee is interested in are scheduled at the same time. By the time the schedule was due, we had collected 307 responses. The top five topics with the highest interest per paper were Semantics, Lexical Semantics, Information Extraction, Question Answering, and Discourse, Dialogue, Coreference and Pragmatics. We started by manually grouping papers into session groups. We then automatically optimized the assignment of groups to rooms and times, and the order of papers within each group. Compared to the original manually created schedule, the optimized schedule had 644 fewer instances where an attendee was interested in seeing two papers scheduled at the same time. More thorough and exact optimization methods could be applied in the future.
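The quantity reported above, the number of instances where an attendee wants to see two papers scheduled at the same time, can be sketched as follows. This is an illustration under stated assumptions, not the procedure we actually ran: the data layout, the function names, the simple swap-based hill climbing, and the toy survey responses are all hypothetical. The sketch only swaps the time blocks of session groups and does not reorder papers within a group or model room sizes.

```python
from itertools import combinations


def paper_times(schedule):
    """Map each paper to its (time block, position within session) pair."""
    times = {}
    for block, papers in schedule.values():
        for position, paper in enumerate(papers):
            times[paper] = (block, position)
    return times


def count_conflicts(schedule, interests):
    """Count (attendee, paper pair) instances where both papers run at the same time."""
    times = paper_times(schedule)
    total = 0
    for wanted in interests:
        known = sorted(p for p in wanted if p in times)
        for a, b in combinations(known, 2):
            if times[a] == times[b]:
                total += 1
    return total


def improve_by_swapping(schedule, interests):
    """Hill climbing: swap the time blocks of two groups whenever that lowers conflicts."""
    current = {g: (block, list(papers)) for g, (block, papers) in schedule.items()}
    best = count_conflicts(current, interests)
    improved = True
    while improved:
        improved = False
        for g1, g2 in combinations(sorted(current), 2):
            (b1, p1), (b2, p2) = current[g1], current[g2]
            if b1 == b2:
                continue  # swapping within one time block changes nothing
            trial = dict(current)
            trial[g1], trial[g2] = (b2, p1), (b1, p2)
            score = count_conflicts(trial, interests)
            if score < best:
                current, best, improved = trial, score, True
    return current, best


# Toy manual schedule: group name -> (time block, papers in presentation order).
manual = {
    "MT-1": (0, ["mt_a", "mt_b"]),
    "Sem-1": (0, ["sem_a", "sem_b"]),
    "IE-1": (1, ["ie_a", "ie_b"]),
    "Parse-1": (1, ["parse_a", "parse_b"]),
}
# Toy survey responses: the set of papers each attendee would like to see.
interests = [
    {"mt_a", "sem_a"},
    {"mt_a", "sem_a", "ie_b"},
    {"mt_b", "sem_b"},
    {"ie_a", "parse_a"},
]

before = count_conflicts(manual, interests)
optimized, after = improve_by_swapping(manual, interests)
```

Swapping time blocks between groups keeps the number of parallel sessions per block unchanged, which is why it is a convenient move for a sketch. Hill climbing can stop at a local optimum, in line with the remark above that more thorough and exact methods could be applied.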
Recommendations
Based on our work this year, we would like to make the following recommendations for future years:
- Recommendations for the tool for allocating reviewers to areas: the tool was very useful for ensuring that a balanced and sufficient number of reviewers was assigned to each area, and we recommend using it in future years. (i) Integrate the tool into START: this would make things easier and less confusing for area chairs and reviewers. (ii) Enable area chairs to bid on reviewers after the submissions are in, since there are some very narrow areas for which there are only a handful of expert reviewers.
- Add tools for checking non-anonymity and formatting issues to START. We rejected 15 long and 19 short papers for such issues, which could have been prevented if the system had automatically alerted the authors. If implementing this poses large technical problems, a student volunteer on the organizing committee could help alert authors who submit at least a few hours before the deadline.
- Enable merging of the schedules for long, short, and TACL papers in START, as well as uploading metadata and papers for TACL papers, to simplify the process of deriving the conference program for the website, the conference handbook, and the downloadable proceedings.
- Enable the persistence of reviewers and their assignment completion rate in START.
- Issue a policy on whether self-published papers (published on arXiv or the authors’ web pages) should constitute previously published work. This year we published a policy applicable to ACL-14 only, but such policies change from year to year and from one CL conference to another.
- Issue a policy on whether supplementary materials should be anonymized.