File:TWSI397.zip

From ACL Wiki

Latest revision as of 19:22, 1 February 2010

This file describes the data format of the TWSI (Turk bootstrap Word Sense Inventory), version 1.0. For a description of the acquisition process, please consult the paper. In short, three Amazon Mechanical Turk (MTurk) tasks were used to yield the data provided here:
- "Substitutable words in context": Workers are presented with a sentence containing a target word and supply substitutions.
- "Are these words used with the same meaning?": Workers are presented with a pair of sentences with the same target word marked in bold and decide whether the meanings are identical, similar or different.
- "Match the Meaning": Workers are presented with a sense inventory represented by prototypical sentences and align further sentences with the same target word to those senses.

The TWSI is organized by target word. For the 397 most frequent nouns in English Wikipedia (dump from January 3rd, 2008), all targets are organized into senses. Each sense is associated with substitutions and with sentences in which the target word was used in that sense.
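
The organization described above (target word, then senses, then substitutions and sentences per sense) can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the class names, the `build_inventory` helper, the sense identifiers and the toy records are all hypothetical, and nothing here reflects the actual file layout inside TWSI397.zip.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Sense:
    """One sense of a target word (hypothetical structure, not the TWSI file format)."""
    sense_id: str
    substitutions: List[str] = field(default_factory=list)
    sentences: List[str] = field(default_factory=list)


@dataclass
class TargetEntry:
    """All senses recorded for one target noun."""
    target: str
    senses: List[Sense] = field(default_factory=list)


# A record is (target, sense_id, substitutions, sentences); this shape is an assumption.
Record = Tuple[str, str, List[str], List[str]]


def build_inventory(records: Iterable[Record]) -> Dict[str, TargetEntry]:
    """Group flat records by target word, one Sense per record."""
    inventory: Dict[str, TargetEntry] = {}
    for target, sense_id, substitutions, sentences in records:
        entry = inventory.setdefault(target, TargetEntry(target))
        entry.senses.append(Sense(sense_id, list(substitutions), list(sentences)))
    return inventory


# Invented toy data for illustration only; not taken from the TWSI.
toy_records: List[Record] = [
    ("bank", "bank-1", ["lender"], ["She deposited the check at the bank."]),
    ("bank", "bank-2", ["shore", "riverside"], ["They sat on the bank of the river."]),
]

inventory = build_inventory(toy_records)
for sense in inventory["bank"].senses:
    print(sense.sense_id, sense.substitutions, len(sense.sentences))
```

Keying the dictionary by target word mirrors the "organized by target word" description; a loader for the real data would need to follow the format documented with the download.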

This data has been curated and extracted from the output of a turk bootstrapping acquisition cycle. Raw data is not included here, but is available upon request.

File history


Date/Time | Dimensions | User | Comment
current: 19:22, 1 February 2010 | 6.44 MB | Biem | The TWSI is organized by target word: For the most frequent 397 nouns in English Wikipedia (dump used from January 3rd, 2008), all targets are organized into senses. With each sense, there are associated substitutions and sentences where the target word w
18:05, 1 February 2010 | 6.06 MB | Biem | This file describes the data format of the TWSI (Turk bootstrap Word Sense Inventory) version 1.0. For the description of the process, please consult the paper for further documentation. In short, three Mturk tasks were used to yield the data provided he
