2017Q3 Reports: Program Chairs
Innovations
Compared to previous ACL conferences, this year's main innovations were:
- We used a blog to communicate and interact with the community about a number of important conference- and field-related issues.
- We opted for a single joint deadline for papers, both long and short.
- We added DOIs to the footers of the papers and changed the style files to incorporate DOI (and, as a fallback, ACL Anthology) references, to facilitate wayfinding between citing and cited papers.
- We used the Toronto Paper Matching System (TPMS) to match reviewers to papers.
- We opted to combine a few areas into single, larger areas.
- Through the blog and social media channels, we recruited area chairs and reviewers partially through a crowdsourcing process. Invited speaker candidates were also nominated through this process.
- We shortened the initial review cycle to two weeks and lengthened the discussion period. We asked the recruited area chairs to be sensitive to this requirement.
- We added a direct-to-AC communication text box on the review form, allowing authors to escalate concerns to the area chairs when they felt reviewers had misunderstood key aspects of their work.
- We renamed the "Other" area, introduced last year, to "Multidisciplinary" to cater to a wider audience and to carry a more positive sentiment.
Rationale
We wanted to make the process of organizing the conference transparent. For this reason we started the blog, which evolved into an ongoing online dialogue about key issues. We stayed away from using the blog as an announcement channel for conference-related matters, as these are better handled through mass email and/or social media channels.
We felt that the dialogue between authors and reviewers does not always work out well. A few of our innovations were designed to address this, most notably the shortened initial review period and the lengthened discussion period. We stressed this when recruiting both area chairs and reviewers. Unlike previous conferences, we did not require formal meta-reviews, but we worked closely with the area chairs to ensure that a healthy dialogue between authors and reviewers was maintained all the way until the final acceptance decisions. Larger areas also reduce fracturing among disciplines and give reviewers and the assignment software a better chance of finding qualified reviewers. This meant that each large area needed a "meta area chair" to oversee the entire process and facilitate direct communication with the PC chairs, lessening confusion among the ACs of big areas.
We also wanted to ensure that the legacy of the conference, carried by its papers, is accessible to the wider, global audience. This meant adding hyperlinks to references and ensuring that our papers can be easily cited by other fields.
Submissions and Presentations
ACL 2017 received a total of 1288 valid submissions, of which 825 were long papers and 463 were short papers. 21 long papers and 9 short papers were rejected without review due to non-anonymity or formatting issues. The remaining submissions were each assigned to one of 19 areas and managed by a program committee of 38 area chairs and 884 reviewers (including secondary reviewers indicated on the review forms). 231 (28%) of the 825 qualifying long papers and 97 (21%) of the 463 qualifying short papers were selected for presentation at the conference. Of the accepted long papers, 116 were selected for oral presentation and 115 for poster presentation; of the accepted short papers, 49 for oral and 48 for poster presentation. The oral-versus-poster decision was based on the recommendations of reviewers, which we took as a noisy signal of the intended audience's preferred format for each paper.
In addition, 25 TACL papers will be presented at ACL: 24 as talks and one as a poster. Including TACL papers, there will be 189 oral and 164 poster presentations at the main ACL conference. The table below shows the number of reviewed long and short submissions in each area, as well as the number of papers accepted in each area. A further 59 short and 52 long papers were withdrawn before review was completed; these are not included in the table.
| area | long reviewed | long accepted | short reviewed | short accepted | total submissions | % of total submissions | total accepted | % of total accepted | area acceptance rate | outstanding papers |
|---|---|---|---|---|---|---|---|---|---|---|
| Semantics | 114 | 43 | 66 | 13 | 180 | 14.0% | 56 | 17.1% | 31.1% | 3 |
| Information Extraction, Question Answering, and Text Mining | 122 | 27 | 48 | 9 | 170 | 13.2% | 36 | 11.0% | 21.2% | 1 |
| Sentiment Analysis and Opinion Mining | 77 | 9 | 28 | 3 | 105 | 8.2% | 12 | 3.7% | 11.4% | |
| Document Analysis | 53 | 13 | 49 | 8 | 102 | 7.9% | 21 | 6.4% | 20.6% | |
| Machine Translation | 58 | 15 | 36 | 9 | 94 | 7.3% | 24 | 7.3% | 25.5% | 1 |
| Tagging, Chunking, Syntax, and Parsing | 48 | 18 | 33 | 9 | 81 | 6.3% | 27 | 8.2% | 33.3% | 2 |
| Social Media | 39 | 6 | 30 | 8 | 69 | 5.4% | 14 | 4.3% | 20.3% | |
| Machine Learning | 46 | 15 | 22 | 6 | 68 | 5.3% | 21 | 6.4% | 30.9% | 1 |
| Resources and Evaluation | 44 | 17 | 21 | 6 | 65 | 5.0% | 23 | 7.0% | 35.4% | |
| Multidisciplinary | 34 | 10 | 27 | 5 | 61 | 4.7% | 15 | 4.6% | 24.6% | |
| Discourse and Pragmatics | 42 | 15 | 18 | 4 | 60 | 4.7% | 19 | 5.8% | 31.7% | |
| Summarization | 29 | 5 | 19 | 2 | 48 | 3.7% | 7 | 2.1% | 14.6% | |
| Multilinguality | 19 | 6 | 24 | 5 | 43 | 3.3% | 11 | 3.4% | 25.6% | |
| Phonology, Morphology, and Word Segmentation | 23 | 6 | 10 | 4 | 33 | 2.6% | 10 | 3.0% | 30.3% | 1 |
| Generation | 20 | 8 | 9 | 3 | 29 | 2.3% | 11 | 3.4% | 37.9% | |
| Dialog and Interactive Systems | 20 | 7 | 8 | 0 | 28 | 2.2% | 7 | 2.1% | 25.0% | 1 |
| Cognitive Modeling and Psycholinguistics | 17 | 7 | 8 | 1 | 25 | 1.9% | 8 | 2.4% | 32.0% | 1 |
| Vision, Robots, and Grounding | 14 | 3 | 6 | 2 | 20 | 1.6% | 5 | 1.5% | 25.0% | |
| Speech | 6 | 1 | 1 | 0 | 7 | 0.5% | 1 | 0.3% | 14.3% | |
| total | 825 | 231 | 463 | 97 | 1288 | 100.0% | 328 | 100.0% | 25.5% | 11 |
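As a consistency check, the per-area and headline acceptance rates in the table above can be recomputed directly from the reviewed and accepted counts. The following is a minimal sketch; only a few illustrative rows are transcribed, and the figures are copied from the table rather than from the submission system.

```python
# Recompute acceptance statistics from the table above.
# Tuples: (area, long reviewed, long accepted, short reviewed, short accepted).
# Only three illustrative rows are shown here.
areas = [
    ("Semantics", 114, 43, 66, 13),
    ("Machine Translation", 58, 15, 36, 9),
    ("Speech", 6, 1, 1, 0),
]

for name, long_rev, long_acc, short_rev, short_acc in areas:
    reviewed = long_rev + short_rev
    accepted = long_acc + short_acc
    rate = 100.0 * accepted / reviewed  # area acceptance rate
    print(f"{name}: {accepted}/{reviewed} = {rate:.1f}%")

# Headline rates: 231 of 825 long and 97 of 463 short papers accepted.
print(f"long: {100 * 231 / 825:.0f}%, short: {100 * 97 / 463:.0f}%")
```

Running this reproduces the table's per-area rates (e.g., Semantics 56/180 = 31.1%) and the headline 28% long and 21% short acceptance rates.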
The top five areas with the highest number of submissions this year were Semantics; Information Extraction, Question Answering, and Text Mining; Sentiment Analysis and Opinion Mining; Document Analysis (including text categorization, topic models, and retrieval); and Machine Translation.
Outstanding and Best Papers
The area chairs nominated outstanding papers from their areas. Following this stage, the PC chairs selected a pool of 22 papers (approximately 1.7% of the 1288 submissions) from the areas, using criteria intended to normalize nominations across areas and to promote diversity.
Out of these, the best paper committee of 5 (headed by Min as PC chair) awarded the "best long paper", "best short paper", and an additional "best resource paper" awards, in a two-stage process. Papers were given to the committee in their camera-ready form, with author affiliations. Reviews and meta-reviews of the nominated papers were also provided to the committee for reference, but members were asked to provide their recommendations and justifications (n.b., not reviews) first, without reference to the supplied reviews.
Presentations
The oral presentations are arranged in five parallel sessions. There are two large poster sessions, including dinner, on the first two evenings of the conference; these also provide poster presentation space for the system demonstrations and the student research workshop. We manually grouped the papers into sessions largely by area, and manually assigned the TACL papers into the same area hierarchy.
We followed the previous guidelines for poster presentations and allotted 11 m² for every poster presented in the poster sessions, to make the space comfortable and easy to move around in.
Timeline
In anticipation of a larger pool of submissions, we intentionally scheduled the short and long paper deadlines as a single joint date.
The complete timeline after submission is given below:
- Feb 9-12 Paper bidding
- Feb 13 ACs assign papers to reviewers
- Feb 13-27 Paper reviewing
- Feb 28 AC check that all reviews have been received
- Mar 13-15 Author response period
- Mar 16-20 Reviewers update reviews after reading author response
- Mar 25 ACs send preliminary ranking to PCs
- Mar 28 ACs produce meta-reviews for borderline papers; ACs produce final rankings and accept/reject decisions
- Mar 30 Notification of acceptance
- April 22 Camera ready deadline
We had one exception to this schedule: the transmission of acceptance decisions went out about 12 hours late, due to operational difficulties.
Also, recruitment of the invited speakers started later than we had initially envisioned, which may have contributed to our difficulty in recruiting an appropriate external-to-the-field speaker.
Recommendations
We recommend starting the recruitment of a good external speaker as far in advance as possible, as things become busy quite early on in planning the submission process.
We recommend that ACL keep using TPMS to help assign reviewers to papers. However, TPMS is only as good as its profiles; to benefit from it, ACL needs to support and encourage its use. Potential costs are another difficulty: at the outset we were not informed that TPMS would incur a cost, but ACL was later billed 2K USD for its use (this was eventually waived). A clear agreement needs to be in place before its use. Even though we feel TPMS mitigated assignment difficulties, reviewer assignment is still not a solved matter and needs a lot of care; manual intervention and checks are necessary with any amount of automation.
We also detail our actions with respect to the outcomes and recommendations from ACL 2016, on two of their relevant points; see [2016 Program Chairs' report].
> 2. Many reviews were late. At the time that author response started, one third of the papers had at least one review missing, and some papers had all three reviews missing. We recommend leaving a few extra days between the end of reviewing and the start of author response, and starting some way of passing information about delinquent reviewers forward from conference to conference.
We mitigated this somewhat by having a shorter initial review cycle. While certain reviewers were late at this stage, the lengthened dialogue period made it much easier to absorb delays in incoming reviews. We also recommend establishing an outstanding-reviewer recognition award given to a somewhat large proportion of reviewers (perhaps 5%), to spur on-time reviews and the service time necessary to do a good job of reviewing.
> 3. As discussed above, the reviewer load balancing task needs a more principled solution so that enough reviewers are recruited in advance of the deadlines and so that load balancing is handled smoothly with a good outcome.
This was mitigated somewhat by re-using the reviewer rosters from the previous NAACL and ACL conferences. However, this has the potential problem (noted by Michael Strube) that reviewers' personal information is circulated to new chairs without their explicit permission. Post-conference, we will try to solicit reviewers' explicit permission for ACL to store their personal profiles for subsequent program committees.