2014Q3 Reports: Publications Chairs

From Admin Wiki
Revision as of 20:19, 3 June 2014 by Yusuke

Summary

The publication chairs produced the proceedings of the main conference (two volumes, for long and short papers) and coordinated the work of the 22 book chairs for the individual workshops.

This process went reasonably smoothly. We innovated over earlier years by producing machine-readable proceedings in an XML format, and by imposing a stricter paper testing regime on the upload of camera-ready papers.


Changes from previous years

We invited authors to optionally submit the LaTeX source of their paper along with the camera-ready PDF file. We then automatically converted these LaTeX sources into XML using the LaTeXML tool, and are making the resulting files available alongside the PDF files in the Anthology. These XML files are meant to be machine-readable, in order to support future research using the ACL proceedings as a corpus. Roughly 75% of all papers that were prepared using LaTeX were submitted together with their sources. We managed to convert about 90% of these, yielding a grand total of 177 papers that will now be available in XML.

Our experience with the LaTeX-to-XML conversion was positive. We are going to document the process in detail, and would encourage future publication chairs to convert their papers as well. As a pilot experiment, we focused this year on converting the papers for the main conference. Future publication chairs might extend the conversion to the other books as well.
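As an illustration of what such a batch conversion might look like, here is a minimal Python sketch. It is not the script we actually used. It assumes the `latexml` command is installed and on the PATH, and uses only its `--destination` option; the function names (`build_latexml_command`, `convert_papers`, `success_rate`) and the directory layout are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def build_latexml_command(tex_file: Path, out_dir: Path) -> List[str]:
    """Build a latexml call that writes <out_dir>/<stem>.xml (hypothetical layout)."""
    destination = out_dir / (tex_file.stem + ".xml")
    return ["latexml", f"--destination={destination}", str(tex_file)]


def convert_papers(
    tex_files: Iterable[str],
    out_dir: str,
    run: Callable[[List[str]], int] = subprocess.call,
) -> Dict[str, bool]:
    """Try to convert each LaTeX source; map each input path to success/failure.

    `run` receives the command list and returns an exit code. It is a
    parameter so that the loop can be exercised without latexml installed.
    """
    results: Dict[str, bool] = {}
    for tex_file in tex_files:
        command = build_latexml_command(Path(tex_file), Path(out_dir))
        try:
            exit_code = run(command)
        except OSError:
            exit_code = 1  # e.g. latexml is missing or not executable
        results[str(tex_file)] = exit_code == 0
    return results


def success_rate(results: Dict[str, bool]) -> float:
    """Fraction of papers that converted successfully (0.0 if there were none)."""
    return sum(results.values()) / len(results) if results else 0.0
```

Recording a per-paper success flag, rather than stopping at the first failure, makes it easy to report the share of sources that converted and to follow up on the failures by hand.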

Publishing papers in XML is becoming popular in other areas (e.g. PubMed Central), because it allows content to be processed automatically (e.g. combining running text with online ontologies) and makes papers more accessible on various media, such as smartphones and speech synthesizers. Since our research field is focused on processing language data, this effort is particularly worth encouraging. We do not intend to replace PDF completely with XML, but we encourage future publication chairs to continue and improve this effort to produce machine-readable proceedings.

Further changes:

  • Together with the Softconf team, we changed the page for camera-ready submissions to perform some automated tests on the papers. When an author submits a PDF file, the system asks for the number of pages and automatically checks the consistency of the page count and the font embeddings. The system also shows an image of the paper, so that the author can fix the margins if necessary.
  • In collaboration with the Softconf team, we included supplementary materials, such as datasets, software, and notes, in the CD-ROM proceedings and the ACL Anthology. For some previous conferences (e.g. ACL 2012) the Anthology already includes such materials, but this was done by hand because the START system did not support it. This year the START system has a function to include supplementary materials in the proceedings automatically, which keeps the extra workload for publication chairs small. Future publication chairs are encouraged to continue this effort.
  • We only made the proceedings available as a download this year, instead of distributing them on a USB stick. This simplifies the production of the proceedings, but we will have to evaluate carefully whether participants like this change.
  • We decided to stick to the A4 paper format, instead of the Letter format that has been customary for ACLs in North America. This became possible because the proceedings are electronic only, and allows us to remain consistent across years.
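The automated camera-ready tests mentioned in the first item above can be sketched roughly as follows. This is an assumption-laden illustration, not the Softconf implementation: the names `CameraReadyReport` and `check_camera_ready` are hypothetical, and the actual page count and the per-font embedding flags are assumed to have been extracted from the PDF beforehand (for example with a tool such as Poppler's `pdfinfo` and `pdffonts`).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CameraReadyReport:
    """Result of the checks; an empty problem list means the upload passes."""
    problems: List[str]

    @property
    def ok(self) -> bool:
        return not self.problems


def check_camera_ready(
    declared_pages: int,
    actual_pages: int,
    fonts: Dict[str, bool],
) -> CameraReadyReport:
    """Compare the author's declared page count with the PDF and flag fonts
    that are not embedded.

    `fonts` maps a font name to True if the font is embedded in the PDF.
    """
    problems: List[str] = []
    if declared_pages != actual_pages:
        problems.append(
            f"page count mismatch: author declared {declared_pages}, "
            f"PDF has {actual_pages}"
        )
    for name in sorted(fonts):  # sorted so that messages come in a stable order
        if not fonts[name]:
            problems.append(f"font not embedded: {name}")
    return CameraReadyReport(problems)
```

Returning all problems at once, instead of rejecting on the first one, lets the author fix everything in a single resubmission.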


Problems

  • Coordinating with the authors to enforce the style requirements was a very substantial amount of work. Frequent problems were page formats, font choices, and the spelling of author names. Future publication chairs should continue to automate style checking as much as possible. The format-checking function introduced in the START system this year should be improved further to enforce stricter validation. Margins can technically be checked automatically; this year we only showed an image of the paper to encourage authors to correct them. Page size should also be checked; it was not checked this year, even though many authors use the wrong style files. Checking of page numbers is somewhat more difficult to include.
  • We simply inherited the ACL style files this year, but some details are inconsistent between the human-readable style guide and the LaTeX style file, and these should be corrected. Known problems are:
  • Ensuring that the books prepared by the individual book chairs adhered to the requirements was also a remarkably time-intensive effort. Future publication chairs should fill in as much workshop metadata in the START system as possible; this time investment will pay off later. Workshop chairs should also be instructed in the correct use of the "order" file as early as possible, in order to make the local chairs' lives easier when they generate the conference handbook.
  • Checking that all papers had the correct copyright-transfer signatures was a minor amount of work, but it still felt unnecessary. Perhaps this could be automated in the future, so that camera-ready papers are not even accepted if the signature is missing.
  • Including images in proceedings volumes (e.g. sponsor logos, or a logo for the cover page) is tricky and not well documented. This should be simplified in the START system.
  • It seems excessive that ACL requires authors to transfer the entire copyright of their paper. We should look into a less complete transfer of rights to ACL.
  • Compared to last year, the schedule of the conference remained pretty stable, and did not cause major problems in regenerating the proceedings. However, including the schedule in the proceedings feels anachronistic when the proceedings are no longer actually printed as physical books. Future chairs might consider fully decoupling the preparation of the proceedings from the preparation of the schedule by ordering the papers in some other way and removing the schedule from the proceedings.
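The automatic page-size and margin checks suggested in the first item of this list could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the page dimensions and the bounding box of the printed content are assumed to be already known in PDF points, the function names are hypothetical, and the minimum margin is a parameter because the value required by the style guide is not reproduced here.

```python
from typing import List, Optional, Tuple

MM_TO_PT = 72.0 / 25.4  # one PDF point is 1/72 inch; one inch is 25.4 mm

# Nominal page sizes in points (width, height), portrait orientation.
PAGE_SIZES_PT = {
    "A4": (210 * MM_TO_PT, 297 * MM_TO_PT),
    "Letter": (8.5 * 72.0, 11 * 72.0),
}


def classify_page_size(
    width_pt: float, height_pt: float, tolerance_pt: float = 1.0
) -> Optional[str]:
    """Return "A4" or "Letter" if the page matches within the tolerance, else None."""
    for name, (width, height) in PAGE_SIZES_PT.items():
        if abs(width_pt - width) <= tolerance_pt and abs(height_pt - height) <= tolerance_pt:
            return name
    return None


def margin_violations(
    page_size_pt: Tuple[float, float],
    content_box_pt: Tuple[float, float, float, float],
    min_margin_pt: float,
) -> List[str]:
    """List the sides on which the content comes closer to the page edge than allowed.

    `content_box_pt` is (left, bottom, right, top) with the origin at the
    bottom-left corner of the page, as in PDF coordinates.
    """
    width, height = page_size_pt
    left, bottom, right, top = content_box_pt
    margins = {
        "left": left,
        "bottom": bottom,
        "right": width - right,
        "top": height - top,
    }
    return [side for side, margin in margins.items() if margin < min_margin_pt]
```

A submission system could, for instance, reject an upload when `classify_page_size` does not return "A4", and show the author the list returned by `margin_violations` next to the page image.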