Call For Participation: Shared Task on Automatic Evaluation for Code-Switched Text Generation

Event Notification Type: 
Call for Participation
Location: 
NAACL
Saturday, 3 May 2025
State: 
New Mexico
Country: 
USA
City: 
Albuquerque
Contact: 
Genta Winata
Sudipta Kar
Marina Zhukova
Submission Deadline: 
Friday, 21 February 2025

Task Description

This shared task focuses on developing automatic evaluation metrics for code-switched (CS) text generation. Participants will build systems that assess the quality of synthetically generated CS text in terms of both fluency and accuracy. This is crucial because:

  • Scarcity of CS Data: CS text data is limited, making automatic generation vital for data augmentation and improving model performance.
  • Growing Demand: Demand for CS text is increasing, particularly in dialogue systems and chatbots, where it enables more natural and inclusive interactions.
  • Lack of Robust Evaluation: Current methods for evaluating CS text are insufficient, hindering progress in this field.

This shared task aims to address this gap and drive further research in automatic evaluation metrics for CS text generation.

Goal

The goal of this shared task is to encourage the development of robust and reliable automatic evaluation metrics for CS text generation, ultimately leading to more fluent and accurate CS language models.

Important Dates

  • Jan 23: Platform release (ready for submissions)
  • Feb 14: Test set release
  • Feb 21: Results submission
  • Feb 28: Paper submission
  • Mar 8: Acceptance notification

Languages Supported

  • Public Leaderboard: English-Hindi, English-Tamil, English-Malayalam
  • Private Leaderboard: English-Indonesian, Indonesian-Javanese, Singlish (English-Chinese)

Metric

Accuracy: Systems will be evaluated on how accurately they predict human preferences for CS text. This is measured by comparing the system's preferred sentence (Sent 1, Sent 2, or Tie) with the human annotations in the CSPref dataset.
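For concreteness, here is a minimal sketch of the accuracy computation, assuming predictions and gold labels are parallel lists of "Sent 1" / "Sent 2" / "Tie" strings (the helper name is ours, not part of the task infrastructure):

    # Minimal sketch: fraction of instances where the system's preference
    # matches the human annotation in the CSPref "Chosen" field.
    def preference_accuracy(predictions, gold):
        assert len(predictions) == len(gold), "prediction/gold length mismatch"
        correct = sum(p == g for p, g in zip(predictions, gold))
        return correct / len(gold)

    # Example: 2 of 3 predictions match the human preference -> ~0.667
    print(preference_accuracy(["Sent 1", "Tie", "Sent 2"],
                              ["Sent 1", "Sent 2", "Sent 2"]))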

Dataset

The CSPref dataset will be used for this task. It contains the following fields:

  • Original L1: English sentences
  • Original L2: Hindi, Tamil, or Malayalam sentences
  • Sent 1, Sent 2: Two different CS generations based on the original sentences
  • Chosen: Human annotation indicating the preferred sentence (Sent 1, Sent 2, or Tie)
  • Lang: Language pair
The data is available at https://huggingface.co/datasets/garrykuwanto/cspref; a minimal loading sketch follows.
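As a rough sketch (not official starter code), the dataset can be loaded with the Hugging Face datasets library; the split name "train" is an assumption, so check the dataset card for the actual splits and column names:

    from datasets import load_dataset

    # Load CSPref from the Hugging Face Hub. The "train" split name is an
    # assumption -- consult the dataset card for the released splits.
    cspref = load_dataset("garrykuwanto/cspref", split="train")

    # Print one instance to see the fields described above
    # (Original L1/L2, Sent 1, Sent 2, Chosen, Lang).
    print(cspref[0])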

Evaluation

  • Systems will be ranked on a public leaderboard based on their accuracy in predicting human preferences on the English-Hindi, English-Tamil, and English-Malayalam language pairs.
  • A private leaderboard will evaluate system performance on unseen language pairs (English-Indonesian, Indonesian-Javanese, Singlish) to assess generalization ability.
  • Final rankings on each leaderboard will be determined by this prediction accuracy.

Submission

Participants will submit their system's prediction for each instance in the test set, indicating the preferred sentence (Sent 1, Sent 2, or Tie).
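The exact submission format will be specified on the platform; purely as an illustration, here is a hypothetical sketch that writes one prediction per test instance to a CSV file (the column names and layout are our assumptions):

    import csv

    # Hypothetical submission writer: the platform's actual file format is
    # not specified here, so the "id"/"prediction" columns are assumptions.
    def write_submission(predictions, path="predictions.csv"):
        # predictions: list of "Sent 1" / "Sent 2" / "Tie" strings,
        # ordered to match the test set instances.
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["id", "prediction"])
            for i, pred in enumerate(predictions):
                writer.writerow([i, pred])

    write_submission(["Sent 1", "Tie", "Sent 2"])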