BEA 2023 Shared Task on Generating AI Teacher Responses in Educational Dialogues

Event Notification Type: Call for Participation
Location: ACL 2023, Toronto, Canada
Date: Thursday, 13 July 2023
Contact: Anaïs Tack
Submission Deadline: Friday, 5 May 2023

Conversational agents offer promising opportunities for education. They can fulfill various roles (e.g., intelligent tutors and service-oriented assistants) and pursue different objectives (e.g., improving student skills and increasing instructional efficiency) (Wollny et al. 2021). Among these roles, the most prevalent is that of an AI teacher who helps a student improve particular skills and provides more opportunities for practice. Recent meta-analyses have even reported a significant effect of chatbots on skill improvement, for example in language learning (Bibauw et al. 2022). Moreover, recent advances in AI and natural language processing have led to conversational agents built on ever more powerful generative language models.

Despite these promising opportunities, the use of powerful generative models as a foundation for downstream tasks also presents several crucial challenges. In the educational domain in particular, it is important to ascertain whether that foundation is solid or flimsy. Bommasani et al. (2021: pp. 67-72) stressed that, if we want to put these models into practice as AI teachers, it is imperative to determine whether they can (a) speak to students like a teacher, (b) understand students, and (c) help students improve their understanding. Therefore, Tack and Piech (2022) formulated the AI teacher test challenge: How can we test whether state-of-the-art generative models are good AI teachers, capable of replying to a student in an educational dialogue?

Following the AI teacher test challenge, we are organizing the first shared task on the generation of teacher language in educational dialogues. The goal of the task is to use NLP and AI methods to generate teacher responses in real-world samples of teacher-student interactions. These samples are taken from the Teacher Student Chatroom Corpus (TSCC; Caines et al. 2020; Caines et al. 2022). Each training sample consists of a dialogue context (i.e., several teacher-student utterances) together with the teacher's response. For each test sample, participants are asked to submit their best generated teacher response.
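As a rough illustration of this setup, the sketch below loads hypothetical test samples and generates a candidate teacher response for each with an off-the-shelf dialogue model. The file names, the JSON fields ("context", "response"), and the choice of model are assumptions made for the example; they are not the official TSCC data format or a required baseline.

    # Minimal sketch of a submission pipeline. The file layout, field names,
    # and model choice are assumptions, not the official task format.
    import json
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
    model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

    with open("test_samples.json") as f:  # hypothetical file name
        samples = json.load(f)

    for sample in samples:
        # Join the dialogue context (several teacher-student utterances)
        # into one prompt, separated by the model's end-of-utterance token.
        prompt = tokenizer.eos_token.join(u["text"] for u in sample["context"])
        input_ids = tokenizer.encode(prompt + tokenizer.eos_token,
                                     return_tensors="pt")
        output_ids = model.generate(input_ids, max_new_tokens=60,
                                    pad_token_id=tokenizer.eos_token_id)
        # Keep only the newly generated tokens as the candidate response.
        sample["response"] = tokenizer.decode(
            output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)

    with open("submission.json", "w") as f:
        json.dump(samples, f, ensure_ascii=False, indent=2)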

The purpose of the task is to benchmark the ability of generative models to act as AI teachers, replying to a student in a teacher-student dialogue. Submissions will be ranked according to several automated dialogue evaluation metrics, and the top submissions will be selected for further human evaluation. During this manual evaluation, human raters will compare pairs of teacher responses in terms of the three abilities above: whether the response speaks like a teacher, understands the student, and helps the student (Tack & Piech 2022). As such, we adopt an evaluation method akin to ACUTE-Eval for dialogue systems (Li et al. 2019).
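The announcement does not list the exact automated metrics, so the snippet below is only a sketch of what scoring candidate responses against the reference teacher responses might look like, here using BERTScore, a common reference-based dialogue evaluation metric. The example responses are invented.

    # Illustration only: the task's actual metric suite is not specified here.
    # BERTScore is one common reference-based dialogue evaluation metric.
    from bert_score import score

    candidates = ["Well done! Now try to use that verb in the past tense."]
    references = ["Good job. Can you put that verb in the past tense?"]

    # F1 is typically reported as the headline BERTScore number.
    P, R, F1 = score(candidates, references, lang="en")
    print(f"BERTScore F1: {F1.mean().item():.3f}")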

PARTICIPATION

The shared task is hosted on CodaLab (Pavao et al. 2022). Anyone participating in the shared task will be asked to:

  1. Register on the CodaLab platform.
  2. Fill in the registration form with their CodaLab ID. Participants must comply with the terms and conditions of the task and of the TSCC data, as outlined in the form.
  3. Register for the CodaLab competition using the same CodaLab ID. Only those who have submitted the registration form will be accepted. Note that you may participate as a member of one team only.

IMPORTANT DATES

  • March 24, 2023: Training data release
  • May 1, 2023: Test data release
  • May 5, 2023: Final submissions due
  • May 8, 2023: Results announced
  • May 12, 2023: Human evaluation results announced
  • May 22, 2023: System papers due
  • May 26, 2023: Paper reviews returned
  • May 30, 2023: Camera-ready papers due
  • June 12, 2023: Pre-recorded video due
  • July 13, 2023: BEA Workshop at ACL

ORGANIZERS

  • Anaïs Tack, KU Leuven
  • Ekaterina Kochmar, MBZUAI
  • Zheng Yuan, King’s College London
  • Serge Bibauw, Universidad Central del Ecuador
  • Chris Piech, Stanford University

Webpage: https://sig-edu.org/sharedtask/2023