File:Disco2011-shared-task-complete-dataset.zip

From ACL Wiki
Revision as of 02:59, 30 June 2011 by Biem (Talk | contribs)

Disco2011-shared-task-complete-dataset.zip (file size: 265 KB, MIME type: application/zip)

This archive contains data sets of compositionality judgments for English and German, together with the official scoring scripts. The data was collected on Amazon Mechanical Turk. Workers were shown a sentence with the target phrase in bold and asked to rate how literal the phrase was on a scale from 0 to 10. For each phrase, 4-5 different sentences, randomly sampled from the WaCKy corpora for UK English and German, were each presented to 4 workers.
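As a rough illustration of how judgments like these could be combined, the sketch below averages the 0 to 10 scores that workers gave across the sentences of one phrase. The archive's file format and the official aggregation method are not described on this page, so the function name `mean_literality`, the nested-list layout and the sample scores are all assumptions made for illustration only.

```python
from statistics import mean


def mean_literality(scores_per_sentence):
    """Average all worker scores collected for one phrase.

    scores_per_sentence: one inner list per sentence, holding the 0-10
    scores from the workers who saw that sentence (hypothetical layout,
    not the archive's actual format).
    """
    all_scores = [s for sentence in scores_per_sentence for s in sentence]
    if not all_scores:
        raise ValueError("no scores given")
    for s in all_scores:
        if not 0 <= s <= 10:
            raise ValueError(f"score out of range: {s}")
    return mean(all_scores)


# Made-up scores: 4 sentences, each rated by 4 workers.
example = [[8, 9, 7, 8], [6, 7, 8, 7], [9, 9, 8, 10], [7, 6, 8, 7]]
print(mean_literality(example))
```

A plain mean is only one possible choice; the official scoring scripts in the archive may treat the judgments differently.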

Phrases consist of two lemmas and come in three grammatical relations:
- ADJ_NN: adjective modifying a noun
- V_SUBJ: noun as the subject of a verb
- V_OBJ: noun as the object of a verb
Passive constructions were resolved to active constructions for relation assignment purposes.
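One way to hold such a phrase in code is a small record that carries the relation label and the two lemmas. This is a minimal sketch: the class name `Phrase`, its field names, the order of the lemmas and the example phrase are assumptions, not the layout used in the archive.

```python
from dataclasses import dataclass

# The three relation labels named in the description above.
RELATIONS = {"ADJ_NN", "V_SUBJ", "V_OBJ"}


@dataclass(frozen=True)
class Phrase:
    """A two-lemma phrase with its grammatical relation (hypothetical layout)."""
    relation: str
    first_lemma: str
    second_lemma: str

    def __post_init__(self):
        # Reject labels outside the three documented relations.
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")


# Made-up example, not taken from the data set.
phrase = Phrase("ADJ_NN", "red", "tape")
print(phrase)
```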

Phrases were extracted semi-automatically. The relations were assigned by patterns and then manually checked for validity. Phrases were selected so as to balance the data set while controlling for frequency.

File history


Date/Time | Dimensions | User | Comment
current: 11:09, 10 March 2014 | (265 KB) | Biem | DISCO 2011 Complete Dataset (Training and Test Data, Eval Scripts)
02:59, 30 June 2011 | (265 KB) | Biem | This archive contains data sets for compositionality judgments for English and German as well as the official scoring scripts. The data was collected from Amazon turk. Workers were presented a sentence with a bolded target phrase and were asked to score h
