Difference between revisions of "2014Q3 Reports: Program Chairs"

Latest revision as of 13:42, 5 June 2014

Program Chairs (Kristina Toutanova and Hua Wu)

Innovations

As compared to ACL conferences in prior years, the main innovations this year were:

We allocated reviewers to areas based on the reviewers’ preferences over areas, the preferences of area chairs over reviewers, and the number of submissions in each area (using a tool developed by Mark Dredze and applied to ACL with the help of Jiang Guo; this tool had been applied to NAACL and EMNLP in the past but was applied at ACL for the first time this year).
We optimized the conference schedule based on feedback from attendees on the talks they would like to see.
All talks will be recorded at ACL 2014, as in NAACL 2013.
We scheduled two large poster sessions on two evenings of the conference, to accommodate the large number of poster presentations. Instead of a banquet, we will have a no-fee light social event, and the president’s talk will be given on the first morning of the conference.
We grouped oral TACL papers in thematic sessions together with ACL paper presentations, instead of in separate TACL-only sessions.
We solicited nominations for outstanding reviewers and acknowledged them in the proceedings (around 14% of reviewers were acknowledged).

Submissions and Presentations

ACL 2014 received a total of 1123 submissions, of which 572 were long papers and 551 were short papers. 15 long papers and 19 short papers were rejected without review due to non-anonymity or formatting issues. Each of the remaining submissions was assigned to one of 20 areas, and the review was managed by a program committee of 33 area chairs and 779 reviewers.

146 (26.2%) of the 557 qualifying long papers and 139 (26.1%) of the 532 qualifying short papers were selected for presentation at the conference. Of the accepted long papers, 95 were selected for oral presentation, and 51 for poster presentation. Of the accepted short papers, 51 have oral and 88 have poster presentation slots. The oral versus poster decision was based not on the quality of the work, but on its estimated appeal to a wide audience.

In addition, 19 TACL papers will be presented at ACL – 13 as talks and 6 as posters. Including TACL papers, there will be 159 oral and 145 poster presentations at the main ACL conference. The table below shows the number of reviewed submissions in each area for long and short papers, as well as the number of papers accepted in each area. The table also shows the number of qualifying long and short papers that were withdrawn prior to the completion of the review process (11 long and 34 short papers were withdrawn).

Area	Long Received	Long Accepted	Short Received	Short Accepted	Total Submissions	Percent of Total Submissions	Total Accepts	Percent of Total Accepts	Area Accept Rate
Cognitive Modeling and Psycholinguistics	9	3	14	4	23	2.11%	7	2.46%	30.43%
Dialogue and Interactive Systems	10	2	8	2	18	1.65%	4	1.40%	22.22%
Discourse, Coreference, and Pragmatics	22	5	20	5	42	3.86%	10	3.51%	23.81%
Document Categorization, Sentiment Analysis, and Topic Models	53	14	48	13	101	9.27%	27	9.47%	26.73%
Generation	13	6	7	4	20	1.84%	10	3.51%	50.00%
Information Extraction and Text Mining	54	13	49	14	103	9.46%	27	9.47%	26.21%
Information Retrieval	8	2	9	2	17	1.56%	4	1.40%	23.53%
Language Resources and Evaluation	31	8	28	10	59	5.42%	18	6.32%	30.51%
Lexical Semantics and Ontology	26	7	23	6	49	4.50%	13	4.56%	26.53%
Machine Learning for Language Processing	39	13	15	5	54	4.96%	18	6.32%	33.33%
Machine Translation	76	18	72	19	148	13.59%	37	12.98%	25.00%
Multilinguality and Multimodal NLP	12	3	14	3	26	2.39%	6	2.11%	23.08%
NLP Applications and NLP-enabled Technology	32	6	34	9	66	6.06%	15	5.26%	22.73%
NLP for the Web and Social Media	29	5	29	7	58	5.33%	12	4.21%	20.69%
Question Answering	6	2	4	0	10	0.92%	2	0.70%	20.00%
Semantics	53	16	37	11	90	8.26%	27	9.47%	30.00%
Summarization	19	6	11	4	30	2.75%	10	3.51%	33.33%
Spoken Language Processing	9	2	10	3	19	1.74%	5	1.75%	26.32%
Tagging, Chunking, Syntax and Parsing	35	12	48	14	83	7.62%	26	9.12%	31.33%
Phonology, Morphology and Word Segmentation	10	4	18	4	28	2.57%	8	2.81%	28.57%
Withdrawn	11	0	34	0	45	4.13%	0	0%	0%
Total	557	146	532	139	1089	100%	285	100%	26.17%

The five areas with the most submissions this year were Machine Translation, Information Extraction and Text Mining, Document Categorization/Sentiment Analysis/Topic Models, Semantics, and Tagging/Chunking/Syntax/Parsing.
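The derived columns of the table can be recomputed from the received and accepted counts. The following is a minimal sketch that assumes the percentages are rounded to two decimals and uses 1089 (the "Total Submissions" figure from the Total row) as the denominator. Only a subset of rows is copied from the table, and the names `ROWS`, `TOTAL_SUBMISSIONS`, and `area_stats` are illustrative, not part of any tool used by the committee.

```python
# A subset of rows copied from the table above:
# (area, long received, long accepted, short received, short accepted)
ROWS = [
    ("Machine Translation", 76, 18, 72, 19),
    ("Information Extraction and Text Mining", 54, 13, 49, 14),
    ("Document Categorization, Sentiment Analysis, and Topic Models", 53, 14, 48, 13),
    ("Semantics", 53, 16, 37, 11),
    ("Tagging, Chunking, Syntax and Parsing", 35, 12, 48, 14),
    ("Generation", 13, 6, 7, 4),
    ("Question Answering", 6, 2, 4, 0),
]

# Denominator taken from the "Total" row of the table (includes withdrawn papers).
TOTAL_SUBMISSIONS = 1089


def area_stats(row):
    """Recompute the derived columns for one area."""
    name, long_recv, long_acc, short_recv, short_acc = row
    received = long_recv + short_recv
    accepted = long_acc + short_acc
    return {
        "area": name,
        "received": received,
        "accepted": accepted,
        # Share of all reviewed submissions, in percent.
        "share": round(100 * received / TOTAL_SUBMISSIONS, 2),
        # Acceptance rate within the area, in percent.
        "accept_rate": round(100 * accepted / received, 2),
    }


stats = {row[0]: area_stats(row) for row in ROWS}

# Areas ordered by number of submissions, largest first.
top_five = sorted(stats.values(), key=lambda s: s["received"], reverse=True)[:5]

for s in top_five:
    print(f'{s["area"]}: {s["received"]} submissions, {s["accept_rate"]}% accepted')
```

Using the full set of 20 area rows in `ROWS` would reproduce the complete table in the same way.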

Review process

As mentioned above, we used Mark Dredze’s tool for reviewer assignment, which Jiang Guo helped apply to ACL this year. Thanks to the tool and the collaboration among the area chairs, the committee organized a smooth review process with a sufficient number of expert reviewers. The submissions were reviewed under different categories, with a separate review form for each: empirical/data-driven, theoretical, applications/tools, resources/evaluation, and survey papers. For the short papers we additionally used a negative results category and were glad to see that the community is becoming more open to publishing useful negative results (two out of six short papers submitted under the negative results category were accepted; additional papers reporting negative results but submitted under other categories were also accepted).

We modified the review forms slightly. One modification was to change the final recommendation range from the 1-to-6 range used last year to a 1-to-5 range. We believe this reduces some unnecessary variance in the scores. Another modification was to reword the questions asking reviewers to check whether any submitted code was well structured, because we thought that a thorough code review would be an unrealistic demand on the reviewers.

We increased the page limit to 9 pages for submitted long papers and 5 pages for submitted short papers and kept the same limits for the camera-ready versions. The change will likely not be implemented next year.

Formatting and anonymity were checked with the help of the area chairs and the student volunteer Jiang Guo. More than 10% of the submissions had author names in the properties of their files. We chose to remove this identifying information manually, but it would be better if this step could be automated. Jiang Guo and Jason Eisner helped us apply automated ways to do this, but they did not work in all cases. Papers that had author names listed under the title were rejected without review unless the authors sent us an email acknowledging their error in the 48 hours following the deadline; in the latter case we removed the author list from the paper for them.

We did not reject papers for non-anonymity of their supplementary data, software, or notes, since the call for papers did not say anything about whether such materials should be anonymized.
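As an illustration of the kind of automated check discussed above for author names in file properties, the sketch below scans the raw bytes of a PDF for a non-empty `/Author` entry. It is a rough heuristic under stated assumptions, not the method the committee used: it only sees entries written as uncompressed literal strings, so hex-encoded strings, compressed object streams, and XMP metadata would be missed. The function name `find_author_entries` is hypothetical.

```python
import re

# Matches a literal-string entry such as "/Author (Jane Doe)".
# The value may contain backslash escapes but no unescaped parentheses.
_AUTHOR_ENTRY = re.compile(rb"/Author\s*\(((?:\\.|[^\\()])*)\)")


def find_author_entries(pdf_bytes):
    """Return the non-empty /Author values found in the raw PDF bytes."""
    names = []
    for match in _AUTHOR_ENTRY.finditer(pdf_bytes):
        value = match.group(1).decode("latin-1").strip()
        if value:
            names.append(value)
    return names


# Hand-written byte strings standing in for the Info dictionary of a PDF.
flagged = find_author_entries(b"<< /Title (A Paper) /Author (Jane Doe) >>")
clean = find_author_entries(b"<< /Title (A Paper) /Author () >>")
print(flagged, clean)
```

A submission system could run such a check at upload time and warn the author, which is the scenario the recommendations at the end of this report have in mind.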

Best paper awards

The area chairs nominated sixteen papers for the long best paper award. We selected a list of seven candidates, which were ranked by a specially formed best paper committee of seven members: six area chairs and one external member. Based on the ranking, we selected a long best paper and a long best student paper winner. The selection of a best short paper is in process. The best long papers will be presented in a plenary session, whereas the best short paper will be presented in one of the parallel sessions.

Presentations

The oral presentations are arranged in five parallel sessions. There are two large poster sessions including dinner on two evenings of the conference.

We optimized the conference schedule based on feedback from attendees on the talks they would like to see. We collected attendee responses using a scheduling survey developed with the help of David Yarowsky and Svitlana Volkova, and we optimized the conference schedule to assign popular sessions to large conference rooms, and to reduce the chance that two talks that an attendee is interested in are scheduled at the same time. By the time the schedule was due, we had collected 307 responses. The top five topics with highest interest per paper were Semantics, Lexical Semantics, Information Extraction, Question Answering, and Discourse, Dialogue, Coreference and Pragmatics.

We started by manually grouping papers into sessions. We then automatically optimized the assignment of groups to rooms and times, and the order of papers within each group. Compared to the original manually created schedule, the optimized schedule had 644 fewer instances where an attendee was interested in seeing two papers scheduled at the same time. More thorough and exact optimization methods could be applied in the future.
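The conflict count mentioned above can be made concrete with a small sketch. It assumes that each group of papers forms one session, that a session occupies one time block, and that two papers clash when they sit in different sessions at the same block and the same position within the session. The hill-climbing step (swapping the time blocks of two groups) is a stand-in chosen for illustration; the report does not specify which optimization method was actually used, and it omits room assignment and within-group reordering. All function names are hypothetical.

```python
from itertools import combinations


def talk_slots(groups, block_of_group):
    """Map each paper to (time block, position within its session)."""
    return {
        paper: (block_of_group[g], pos)
        for g, papers in enumerate(groups)
        for pos, paper in enumerate(papers)
    }


def count_conflicts(groups, block_of_group, interests):
    """Count (attendee, paper pair) instances scheduled at the same time."""
    slot = talk_slots(groups, block_of_group)
    return sum(
        1
        for wanted in interests
        for a, b in combinations(sorted(wanted), 2)
        if slot[a] == slot[b]
    )


def improve_by_swaps(groups, block_of_group, interests):
    """Greedy local search: swap the time blocks of two groups if it helps."""
    blocks = list(block_of_group)
    best = count_conflicts(groups, blocks, interests)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(groups)), 2):
            if blocks[i] == blocks[j]:
                continue  # swapping identical blocks changes nothing
            blocks[i], blocks[j] = blocks[j], blocks[i]
            cost = count_conflicts(groups, blocks, interests)
            if cost < best:
                best, improved = cost, True
            else:
                blocks[i], blocks[j] = blocks[j], blocks[i]  # undo the swap
    return blocks, best


# Toy data: four sessions of two papers each, two time blocks, three attendees.
groups = [["p1", "p2"], ["p3", "p4"], ["p5", "p6"], ["p7", "p8"]]
manual_blocks = [0, 0, 1, 1]
interests = [{"p1", "p3"}, {"p1", "p3", "p6"}, {"p2", "p4"}]

before = count_conflicts(groups, manual_blocks, interests)
optimized_blocks, after = improve_by_swaps(groups, manual_blocks, interests)
print(before, after, optimized_blocks)
```

Because swaps only exchange blocks between groups, the number of parallel sessions per block stays the same as in the manual schedule.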

Recommendations

Based on our work this year, we would like to make the following recommendations for future years:

Recommendations for the tool for allocating reviewers to areas: the tool was very useful to ensure a balanced and sufficient number of reviewers were assigned to each area; we recommend using it in future years. (i) Integrate the tool in START: it would make things easier and less confusing for area chairs and reviewers. (ii) Enable area chairs to bid on reviewers after the submissions are in since there are some very narrow areas for which there are only a handful of expert reviewers.
Add tools for checking non-anonymity and formatting issues to START. We rejected 15 long and 19 short papers for such issues, which could have been prevented if the system had automatically alerted the authors. If there are large technical problems with implementing this, a student volunteer on the organizing committee could help alert authors who submit at least a few hours before the deadline.
Enable merging of the schedules for long, short, and TACL papers in START, as well as uploading metadata and papers for TACL papers, to simplify the process of deriving the conference program for the website, the conference handbook, and the downloadable proceedings.
Enable the persistence of reviewers and their assignment completion rate in START.
Issue a policy on whether self-published papers (published on arXiv or the authors’ web pages) should constitute previously published work. This year we published a policy applicable to ACL-14 only, but the policy changes from year to year and from one CL conference to another.
Issue a policy on whether supplementary materials should be anonymized.
