File:TWSI397.zip

From ACLWiki
Revision as of 20:05, 1 February 2010 by Biem (Talk | contribs)
TWSI397.zip (file size: 6.44 MB, MIME type: application/zip)

This file describes the data format of the TWSI (Turk bootstrap Word Sense Inventory) version 1.0. For a description of the acquisition process, please consult the paper. In short, three MTurk tasks were used to yield the data provided here:
- "Substitutable words in context": Workers are presented with a sentence containing a target word and supply substitutions.
- "Are these words used with the same meaning?": Workers are presented with a pair of sentences in which the same target word is marked in bold, and decide whether the meanings are identical, similar or different.
- "Match the Meaning": Workers are presented with a sense inventory represented by prototypical sentences and align further sentences containing the same target word to those senses.

The TWSI is organized by target word. It covers the 397 most frequent nouns in English Wikipedia (dump of January 3rd, 2008), and each target word is organized into senses. Each sense has associated substitutions and sentences in which the target word is used in that sense.
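
The organization described above (target word, then senses, then substitutions and sentences per sense) can be pictured with a small in-memory model. This is only a sketch: the class names, field names, the helper function and the example entry for "bank" are all hypothetical and are not taken from the files in the archive, whose actual on-disk format is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of a target word (field names are illustrative assumptions)."""
    sense_id: str
    substitutions: list[str] = field(default_factory=list)
    sentences: list[str] = field(default_factory=list)


@dataclass
class TargetWord:
    """A target noun together with its senses."""
    lemma: str
    senses: list[Sense] = field(default_factory=list)


def senses_with_substitution(
    inventory: dict[str, TargetWord], lemma: str, substitute: str
) -> list[str]:
    """Return the ids of all senses of `lemma` that list `substitute`."""
    target = inventory.get(lemma)
    if target is None:
        return []
    return [s.sense_id for s in target.senses if substitute in s.substitutions]


# Invented example entry, not drawn from the TWSI data.
inventory = {
    "bank": TargetWord(
        lemma="bank",
        senses=[
            Sense(
                sense_id="bank@1",
                substitutions=["financial institution", "lender"],
                sentences=["She opened an account at the bank."],
            ),
            Sense(
                sense_id="bank@2",
                substitutions=["shore", "riverside"],
                sentences=["They camped on the bank of the river."],
            ),
        ],
    )
}

matching = senses_with_substitution(inventory, "bank", "shore")
```

A lookup keyed by lemma mirrors the "organized by target word" layout; a real loader would have to be written against the format documented in the archive itself.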

This data has been curated and extracted from the output of a turk bootstrapping acquisition cycle. Raw data is not included here, but is available upon request.

File history


Date/Time | Dimensions | User | Comment
current: 21:22, 1 February 2010 | 6.44 MB | Biem | The TWSI is organized by target word: For the most frequent 397 nouns in English Wikipedia (dump used from January 3rd, 2008), all targets are organized into senses. With each sense, there are associated substitutions and sentences where the target word w
20:05, 1 February 2010 | 6.06 MB | Biem | This file describes the data format of the TWSI (Turk bootstrap Word Sense Inventory) version 1.0. For the description of the process, please consult the paper for further documentation. In short, three Mturk tasks were used to yield the data provided he
