Call for Participation: Multilingual Word Sense Disambiguation

Event Notification Type: 
Call for Participation
Abbreviated Title: 
SemEval 2013 - Task 12
Location: 
Co-located with NAACL HLT 2013
Thursday, 13 June 2013 to Friday, 14 June 2013
State: 
Georgia
Country: 
USA
City: 
Atlanta
Contact: 
Roberto Navigli
David Jurgens
Submission Deadline: 
Friday, 15 February 2013

Call For Participation

Multilingual Word Sense Disambiguation
SemEval 2013 - Task #12

http://www.cs.york.ac.uk/semeval-2013/task12/

The aim of this task is to evaluate Word Sense Disambiguation systems in an all-words multilingual setting.

INTRODUCTION

Task 12 provides a traditional setup for evaluating Word Sense Disambiguation (WSD) systems in an all-words, multilingual setting: occurrences of potentially polysemous words in five languages (English, French, German, Italian, Spanish) are marked with sense labels drawn from a multilingual sense inventory. To enable multilinguality, we use the BabelNet sense inventory [1], a wide-coverage semantic network built by merging WordNet with Wikipedia to provide an “encyclopedic dictionary.” BabelNet concepts are lexicalized in many languages using Wikipedia’s inter-language links and the output of a state-of-the-art machine translation system. Task 12 will use a validated version of BabelNet 1.1 (http://babelnet.org) in which the Wikipedia-WordNet mappings of all senses of lemmas in the test data have been manually verified.

PARTICIPANT SENSE INVENTORY

Participants are free to work on the full BabelNet sense inventory or to work on either of its inventory subsets, i.e. WordNet 3.0 or Wikipedia page titles. They are also free to participate using a single language of their choice or all five languages.

TASK DETAILS

Following the traditional WSD “all-words” experimental setting [2], systems will be expected to link all occurrences of noun phrases within arbitrary texts in different languages to the most suitable senses in the sense inventory of their choice. For instance, given the sentence:

The dramatic force of Miller's play derives in part from expressionistic techniques he used to portray Loman's psychological anguish and guilt-ridden fantasy life.

a disambiguation system should link “Miller” to one of (1) the BabelNet synset for Arthur Miller, (2) the Wikipedia sense corresponding to the page http://en.wikipedia.org/wiki/Arthur_Miller, or (3) Miller#n#3 (i.e. the third WordNet sense for Miller), depending on the participant’s choice of sense inventory. Note that the BabelNet synset will contain, where applicable, both the Wikipedia page and the WordNet synset in its representation.
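
As an illustration of the three labelling options above, the following Python sketch shows how a single concept might be exposed under each sense inventory. It is only a sketch under stated assumptions: the `SenseEntry` class, the `label_for` function, and the `bn:PLACEHOLDER` identifier are hypothetical and are not part of BabelNet or of the task's submission format; only the Wikipedia page title and the WordNet label are taken from the example above.

```python
from dataclasses import dataclass
from typing import Optional

# Inventory names used by this sketch only (hypothetical identifiers).
INVENTORIES = ("babelnet", "wikipedia", "wordnet")


@dataclass(frozen=True)
class SenseEntry:
    """One concept as seen from each sense inventory (hypothetical structure)."""
    babelnet_id: str                # placeholder value, not a real BabelNet ID
    wikipedia_title: Optional[str]  # None if the concept has no Wikipedia page
    wordnet_sense: Optional[str]    # None if the concept has no WordNet sense


def label_for(entry: SenseEntry, inventory: str) -> Optional[str]:
    """Return the label a system would output for the chosen inventory."""
    if inventory not in INVENTORIES:
        raise ValueError(f"unknown sense inventory: {inventory!r}")
    if inventory == "babelnet":
        return entry.babelnet_id
    if inventory == "wikipedia":
        return entry.wikipedia_title
    return entry.wordnet_sense


# The "Miller" example from the text; the BabelNet ID is a stand-in.
arthur_miller = SenseEntry(
    babelnet_id="bn:PLACEHOLDER",
    wikipedia_title="Arthur_Miller",
    wordnet_sense="Miller#n#3",
)

for inventory in INVENTORIES:
    print(inventory, "->", label_for(arthur_miller, inventory))
```

Because the BabelNet synset bundles the Wikipedia page and the WordNet sense where both exist, a single entry of this kind is enough to answer in any of the three inventories; the optional fields cover concepts that appear in only one of the two underlying resources.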

Participants will be evaluated in groups based on their choice of sense inventory and target language. Information about each submitted system (such as the training data and resources it uses) will be reported in the task paper.
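
The grouping described above can be pictured with a short Python sketch that buckets submitted runs by sense inventory and target language. The system names, the tuple layout, and the `group_runs` function are assumptions made for illustration; they do not describe the organizers' actual evaluation scripts.

```python
from collections import defaultdict

# Languages and inventories named in this call.
TASK_LANGUAGES = ("English", "French", "German", "Italian", "Spanish")
TASK_INVENTORIES = ("BabelNet", "WordNet", "Wikipedia")


def group_runs(runs):
    """Group (system, inventory, language) triples by (inventory, language)."""
    groups = defaultdict(list)
    for system, inventory, language in runs:
        if inventory not in TASK_INVENTORIES:
            raise ValueError(f"unknown sense inventory: {inventory!r}")
        if language not in TASK_LANGUAGES:
            raise ValueError(f"unknown language: {language!r}")
        groups[(inventory, language)].append(system)
    return dict(groups)


# Hypothetical submissions; the system names are invented for the example.
runs = [
    ("system_a", "BabelNet", "English"),
    ("system_b", "WordNet", "English"),
    ("system_a", "BabelNet", "Italian"),
    ("system_c", "BabelNet", "English"),
]

groups = group_runs(runs)
for (inventory, language), systems in sorted(groups.items()):
    print(f"{inventory} / {language}: {', '.join(systems)}")
```

Under this reading, a system that submits runs for several languages or inventories appears in several groups and is compared only with the other systems in each group.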

DATASETS

No training data will be provided as part of this task; however, participants may use any freely available training data to build their systems.

By mid-February we will provide a gold-standard version of BabelNet 1.1, used to annotate the test set, in which all synsets appearing in the test data have been manually verified for correctness.

IMPORTANT DATES

February 15, 2013 - Registration Deadline
March 1, 2013 onwards - Start of evaluation period
March 15, 2013 - End of evaluation period
April 9, 2013 - Paper submission deadline [TBC]
April 23, 2013 - Reviews due [TBC]
May 4, 2013 - Camera-ready due [TBC]

MORE INFORMATION

The SemEval-2013 Task #12 website, for signup and details, is:

http://www.cs.york.ac.uk/semeval-2013/task12/

If you are interested in the task, please join our mailing list for updates:

http://groups.google.com/group/semeval13-multilingual-wsd/

ORGANIZERS

Roberto Navigli (lastname [at] di.uniroma1.it), Sapienza University of Rome, Italy
David Jurgens (lastname [at] di.uniroma1.it), Sapienza University of Rome, Italy

REFERENCES

1. Roberto Navigli and Simone Paolo Ponzetto. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence, 193, 2012, pp. 217-250.
2. Roberto Navigli. Word Sense Disambiguation: A survey. ACM Computing Surveys, 41(2), ACM Press, 2009, pp. 1-69.