***First Call for Papers***
*** AI & Scientific Discovery (AISD) Workshop at NAACL 2025 ***
Website: https://ai-and-scientific-discovery.github.io
Tentative First call for workshop papers: November 22 (Friday), 2024
Dates
-----------
All deadlines are 11:59 PM AoE time.
Workshop Paper Due Date: Jan 30, 2025
Pre-reviewed (ARR) submission deadline: Feb 20, 2025
Notification of acceptance: Mar 1, 2025
Camera-ready papers due: Mar 10, 2025
Workshop date: TBD
(NAACL 2025 workshops will take place in Albuquerque, New Mexico, May 3–May 4, 2025)
List of Topics
------------------
Just as coding assistants have dramatically increased productivity for coding tasks over the last two years, researchers in the NLP community have begun to explore methods and opportunities for creating scientific assistants that can help with the process of scientific discovery and increase the pace at which novel discoveries are made. Over the last year, language models have been used to create problem-general scientific discovery assistants that are not restricted to narrow problem domains or formulations. Such applications hold opportunities for assisting researchers in broad domains, or with scientific reasoning more generally. Beyond assisting, a growing body of work has begun to focus on the prospect of creating largely autonomous scientific discovery agents that can make novel discoveries with minimal human intervention.
These recent developments highlight the possibility of rapidly accelerating the pace of scientific discovery in the near term. Given the influx of researchers into this quickly expanding subfield, this workshop aims to bring together a diverse set of perspectives, helping to disseminate the latest results, standardize evaluation, foster collaboration between groups, and discuss aspirational goals for 2025 and beyond. As such, the workshop welcomes and covers a wide range of topics, including (but not limited to):
***Literature-based Discovery:*** This topic focuses on using existing scientific articles to facilitate novel discoveries, by finding “undiscovered public knowledge”, conditioning novel studies based on existing work, or easing the search and reading process. Systems that read the literature at scale may ultimately help us come up with research ideas that are more novel, broader in scope, and operationalized in more useful experiments.
***Agent-centered Approaches:*** Agent models are being used to form end-to-end scientific discovery pipelines, with separate agents taking on different roles, such as generating hypotheses, executing experiments, and reviewing experiments.
***Automated Experiment Execution:*** Agents that can automatically execute experiments require a large number of pragmatic skills that emerging benchmarks are beginning to evaluate, including downloading data, writing code, and interfacing with apps.
***Automated Replication:*** Science is generally suffering a “reproducibility crisis”, and NLP is not immune to this, with (for example) 59% of reproducibility attempts finding worse performance than the original papers reported. Automated replication systems attempt to read original papers, GitHub repositories, and instructions in order to replicate experiments fully automatically.
***Data-driven Discovery:*** Reading existing papers and combining or re-analyzing existing datasets frequently allows for novel data-driven discoveries without having to collect new data empirically, and may yield inexpensive novel discoveries.
***Discovery in Virtual Environments:*** Virtual environments, including text-based simulations, allow training and evaluating agents on aspects of the scientific discovery process in simulated proxy tasks without running physical experiments, which are costly and time-consuming.
***Discovery with Humans in the Loop:*** How can we make the textual knowledge from literature best engage with humans in the discovery process, through various interfaces, processes, and surfaces?
***Assistants for Scientific Writing:*** Language models can be used as assistants in the scientific dissemination process, for example as planners for writing text such as academic articles or blog posts intended for public consumption.
Submission Guidelines
--------------------------------
We welcome three types of papers: archival workshop papers, non-archival papers, and non-archival cross-submissions. Only regular archival workshop papers will be included in the workshop proceedings. All submissions should be in PDF format and made through the OpenReview website set up for this workshop (https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/AISD). In line with the ACL main conference policy, camera-ready versions of papers will be given one additional page of content.
***Archival, regular workshop papers:*** Authors should submit a paper up to 8 pages (both short and long papers are welcome), with unlimited pages for references, following the ACL author guidelines. The reported research should be substantially original. All submissions will be reviewed in a single track, regardless of length. Accepted papers will be presented as posters by default, and best papers may be given the opportunity for a brief talk to introduce their work. Reviewing will be double-blind, and thus no author information should be included in the papers; self-reference that identifies the authors should be avoided or anonymised. Accepted papers will appear in the workshop proceedings. Preference for oral presentation slots in the workshop will be given to archival papers.
***Non-archival regular workshop papers:*** This is the same as the option above, but these papers will not appear in the proceedings and will typically only receive poster presentation slots. Non-archival submissions in this category will still undergo the review process. This is appropriate for nearly finished work that is intended for submission to another venue at a later date.
***Non-archival cross-submissions:*** We also solicit cross-submissions, i.e., papers on relevant topics that have already appeared in other venues (e.g., workshop or conference papers at NLP, ML, or cognitive science venues, among others). Accepted papers will be presented at the workshop, with an indication of original venue, but will not be included in the workshop proceedings. Cross-submissions are ideal for related work which would benefit from exposure to the AISD audience. Papers in this category do not need to follow the ACL format, and the submission length is determined by the original venue. The paper selection will be solely determined by the organizing committee in a non-blind fashion. These papers will typically receive poster presentation slots.
In addition, we welcome papers on relevant topics that are under review or to be submitted to other venues (including the ACL 2024 main conference). These papers must follow the regular workshop paper format and will not be included in the workshop proceedings. Papers in this category will be reviewed by workshop reviewers.
Note to authors: When you submit your paper through OpenReview (https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/AISD), please select the appropriate “Submission Type” based on the guidelines above.
For questions about the submission guidelines, please contact the workshop organizers via aisd-organizers [at] googlegroups.com.
Organizers
----------------
Peter Jansen, University of Arizona/Ai2
Bodhisattwa Prasad Majumder, Allen Institute for AI (Ai2)
Bhavana Dalvi Mishra, Allen Institute for AI (Ai2)
Tushar Khot, Allen Institute for AI (Ai2)
Harsh Trivedi, Stony Brook University
Tom Hope, Hebrew University Jerusalem/Ai2
Doug Downey, Allen Institute for AI (Ai2)
Eric Horvitz, Microsoft