The following table lists the results of the ablation tests submitted by participants, which were introduced as a mandatory track in the RTE5 campaign.<br><br>
The first column contains the specific resource which has been ablated.<br>The second column lists the Team Run in the form ''[name_of_the_Team][number_of_the_submitted_run].[submission_task]'' (e.g. BIU1.2way, Boeing3.3way).<br>The third and fourth columns present the normalized difference between the accuracy of the complete system run and the accuracy of the ablation run (i.e. the output of the complete system without the ablated resource), showing the impact of the resource on the performance of the system. The third column refers to the score obtained in the 2-way task, the fourth to the score obtained in the 3-way task. For all runs submitted to the 3-way task, the 2-way derived accuracy has also been calculated.<br>
Finally, the fifth column contains a brief description of the specific usage of the resource. It is based on the information provided both in the "readme" files submitted together with the ablation tests and in the system reports published in the RTE5 proceedings.<br><br>

Participants are kindly invited to check that all the inserted information is correct and complete.
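The exact normalization used by the organizers is not restated here; as a reading aid only, an impact score can be interpreted as the accuracy difference

<math>\mathrm{impact} = \mathrm{acc}_{\mathrm{complete\ run}} - \mathrm{acc}_{\mathrm{ablation\ run}}</math>

apparently expressed in percentage points, so a positive value suggests that the ablated resource contributed positively to the complete system, while a negative value suggests that the system scored higher without it.<br><br>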
{|class="wikitable sortable" cellpadding="3" cellspacing="0" border="1"
|- bgcolor="#CDCDCD"
! Ablated Resource
! Team Run<ref>For further information about participants, click here: [[RTE Challenges - Data about participants]]</ref>
! <small>Resource impact - 2way</small>
! <small>Resource impact - 3way</small>
! Resource Usage Description
|- bgcolor="#ECECEC" align="left"
| Acronym guide
| Siel_093.3way
| style="text-align: center;"| 0
| style="text-align: center;"| 0
| The acronyms are expanded using the acronym database, so the acronyms are also matched with the expanded acronyms, and entailment is predicted accordingly
|- bgcolor="#ECECEC" align="left"
| Acronym guide + <br>UAIC_Acronym_rules
| UAIC20091.3way
| style="text-align: center;"| 0.17
| style="text-align: center;"| 0.16
| We start from the acronym guide, but additionally we use a rule that, for expressions like Xaaaa Ybbbb Zcccc, considers the acronym XYZ, regardless of the length of the expression with this form.
|- bgcolor="#ECECEC" align="left"
| DIRT
| BIU1.2way
| style="text-align: center;"| 1.33
| style="text-align: center;"| &mdash;
| Inference rules
|- bgcolor="#ECECEC" align="left"
| DIRT
| Boeing3.3way
| style="text-align: center;"| -1.17
| style="text-align: center;"| 0
| Verb paraphrases
|- bgcolor="#ECECEC" align="left"
| DIRT
| UAIC20091.3way
| style="text-align: center;"| 0.17
| style="text-align: center;"| 0.33
| We transform text and hypothesis into dependency trees with MINIPAR, and use DIRT relations to map verbs in T to verbs in H
|- bgcolor="#ECECEC" align="left"
| Framenet + <br/>WordNet
| DLSIUAES1.2way
| style="text-align: center;"| 1.16
| style="text-align: center;"| &mdash;
| Frame-to-frame similarity metric
|- bgcolor="#ECECEC" align="left"
| Framenet + <br/>WordNet
| DLSIUAES1.3way
| style="text-align: center;"| -0.17
| style="text-align: center;"| -0.17
| Frame-to-frame similarity metric
|- bgcolor="#ECECEC" align="left"
| Framenet
| UB.dmirg3.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| If two lexical items are covered in a single FrameNet frame, then the two items are treated as semantically related.
|- bgcolor="#ECECEC" align="left"
| Grady Ward’s MOBY Thesaurus + <br>Roget's Thesaurus
| Venses2.2way
| style="text-align: center;"| 2.83
| style="text-align: center;"| &mdash;
| Semantic fields are used for semantic similarity matching in all cases of non-identical lemmas
|- bgcolor="#ECECEC" align="left"
| MontyLingua Tool
| Siel_093.3way
| style="text-align: center;"| 0
| style="text-align: center;"| 0
| For VerbOcean, the verbs have to be in their base form. We used the "MontyLingua" tool to convert the verbs into their base form
|- bgcolor="#ECECEC" align="left"
| NEGATION_rules by UAIC
| UAIC20091.3way
| style="text-align: center;"| 0
| style="text-align: center;"| -1.34
| Negation rules check the branches descending from verbs in the dependency trees to see whether categories of words that change the meaning are found.
|- bgcolor="#ECECEC" align="left"
| NER (RASP Parser nertag)
| JU_CSE_TAC1.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| Named Entity match: measure based on the number of NEs in the hypothesis that match in the corresponding text. For named entity recognition, the RASP Parser (Briscoe et al., 2006) nertag component has been used.
|- bgcolor="#ECECEC" align="left"
| NE component
| UI_ccg1.2way
| style="text-align: center;"| 4.83
| style="text-align: center;"| &mdash;
| Named Entity recognition/comparison
|- bgcolor="#ECECEC" align="left"
| PropBank
| cswhu1.3way
| style="text-align: center;"| 2
| style="text-align: center;"| 3.17
| Syntactic and semantic parsing
|- bgcolor="#ECECEC" align="left"
| Stanford NER
| QUANTA1.2way
| style="text-align: center;"| 0.67
| style="text-align: center;"| &mdash;
| We use Named Entity similarity as a feature
|- bgcolor="#ECECEC" align="left"
| Stopword list
| FBKirst1.2way
| style="text-align: center;"| 1.5
| style="text-align: center;"| &mdash;
| A list of the 572 most frequent English words has been collected in order to prevent assigning high costs to the deletion/insertion of terms that are unlikely to bring relevant information to detect entailment, and to avoid substituting these terms with any content word.
|- bgcolor="#ECECEC" align="left"
| Training data from RTE1, 2, 3
| PeMoZa3.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| 
|- bgcolor="#ECECEC" align="left"
| Training data from RTE2
| PeMoZa3.2way
| style="text-align: center;"| 0.66
| style="text-align: center;"| &mdash;
| 
|- bgcolor="#ECECEC" align="left"
| Training data from RTE2, 3
| PeMoZa3.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| 
|- bgcolor="#ECECEC" align="left"
| VerbOcean
| DFKI1.3way
| style="text-align: center;"| 0
| style="text-align: center;"| 0.17
| VerbOcean relations are used to calculate relatedness between verbs in T and H
|- bgcolor="#ECECEC" align="left"
| VerbOcean
| DFKI2.3way
| style="text-align: center;"| 0.33
| style="text-align: center;"| 0.5
| VerbOcean relations are used to calculate relatedness between verbs in T and H
|- bgcolor="#ECECEC" align="left"
| VerbOcean
| DFKI3.3way
| style="text-align: center;"| 0.17
| style="text-align: center;"| 0.17
| VerbOcean relations are used to calculate relatedness between verbs in T and H
|- bgcolor="#ECECEC" align="left"
| VerbOcean
| FBKirst1.2way
| style="text-align: center;"| -0.16
| style="text-align: center;"| &mdash;
| Extraction of 18232 entailment rules for all the English verbs connected by the "stronger-than" relation. For instance, if "kill [stronger-than] injure", then the rule "kill ENTAILS injure" is added to the rules repository.
|- bgcolor="#ECECEC" align="left"
| VerbOcean
| QUANTA1.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| We use the "opposite-of" relation in VerbOcean as a feature
|- bgcolor="#ECECEC" align="left"
| VerbOcean
| Siel_093.3way
| style="text-align: center;"| 0
| style="text-align: center;"| 0
| Similarity/antonymy/unrelatedness between verbs
|- bgcolor="#ECECEC" align="left"
| Wikipedia
| BIU1.2way
| style="text-align: center;"| -1
| style="text-align: center;"| &mdash;
| Lexical rules extracted from Wikipedia definition sentences, title parentheses, redirect and hyperlink relations
|- bgcolor="#ECECEC" align="left"
| Wikipedia
| cswhu1.3way
| style="text-align: center;"| 1.33
| style="text-align: center;"| 3.34
| Lexical semantic rules
|- bgcolor="#ECECEC" align="left"
| Wikipedia
| FBKirst1.2way
| style="text-align: center;"| 1
| style="text-align: center;"| &mdash;
| Rules extracted from Wikipedia using Latent Semantic Analysis (LSA)
|- bgcolor="#ECECEC" align="left"
| Wikipedia
| UAIC20091.3way
| style="text-align: center;"| 1.17
| style="text-align: center;"| 1.5
| Relations between named entities
|- bgcolor="#ECECEC" align="left"
| Wikipedia + <br>NERs (LingPipe, GATE) + <br>Perl patterns
| UAIC20091.3way
| style="text-align: center;"| 6.17
| style="text-align: center;"| 5
| NE module: NERs, in order to identify Persons, Locations, Jobs, Languages, etc.; Perl patterns built by us for RTE4 in order to identify numbers and dates; our own resources extracted from Wikipedia in order to identify a "distance" between a named entity in the hypothesis and named entities in the text
|- bgcolor="#ECECEC" align="left"
| WordNet
| AUEBNLP1.3way
| style="text-align: center;"| -2
| style="text-align: center;"| -2.67
| Synonyms
|- bgcolor="#ECECEC" align="left"
| WordNet
| BIU1.2way
| style="text-align: center;"| 2.5
| style="text-align: center;"| &mdash;
| Synonyms, hyponyms (2 levels away from the original term), hyponym_instance and derivations
|- bgcolor="#ECECEC" align="left"
| WordNet
| Boeing3.3way
| style="text-align: center;"| 4
| style="text-align: center;"| 5.67
| WordNet synonym and hypernym relationships between (senses of) words, plus "similar" (SIM), "pertains" (PER), and "derivational" (DER) links, to recognize equivalence between T and H
|- bgcolor="#ECECEC" align="left"
| WordNet
| DFKI1.3way
| style="text-align: center;"| -0.17
| style="text-align: center;"| 0
| Argument alignment between T and H
|- bgcolor="#ECECEC" align="left"
| WordNet
| DFKI2.3way
| style="text-align: center;"| 0.16
| style="text-align: center;"| 0.34
| Argument alignment between T and H
|- bgcolor="#ECECEC" align="left"
| WordNet
| DFKI3.3way
| style="text-align: center;"| 0.17
| style="text-align: center;"| 0.17
| Argument alignment between T and H
|- bgcolor="#ECECEC" align="left"
| WordNet
| DLSIUAES1.2way
| style="text-align: center;"| 0.83
| style="text-align: center;"| &mdash;
| Similarity between lemmata, computed by WordNet-based metrics
|- bgcolor="#ECECEC" align="left"
| WordNet
| DLSIUAES1.3way
| style="text-align: center;"| -0.5
| style="text-align: center;"| -0.33
| Similarity between lemmata, computed by WordNet-based metrics
|- bgcolor="#ECECEC" align="left"
| WordNet
| JU_CSE_TAC1.2way
| style="text-align: center;"| 0.34
| style="text-align: center;"| &mdash;
| WordNet-based unigram match: if any synset of the H unigram matches any synset of a word in T, then the hypothesis unigram is considered a WordNet-based unigram match.
|- bgcolor="#ECECEC" align="left"
| WordNet
| PeMoZa1.2way
| style="text-align: center;"| -0.5
| style="text-align: center;"| &mdash;
| Derivational morphology from WordNet
|- bgcolor="#ECECEC" align="left"
| WordNet
| PeMoZa1.2way
| style="text-align: center;"| 1.33
| style="text-align: center;"| &mdash;
| Verb entailment from WordNet
|- bgcolor="#ECECEC" align="left"
| WordNet
| PeMoZa2.2way
| style="text-align: center;"| 1
| style="text-align: center;"| &mdash;
| Derivational morphology from WordNet
|- bgcolor="#ECECEC" align="left"
| WordNet
| PeMoZa2.2way
| style="text-align: center;"| -0.33
| style="text-align: center;"| &mdash;
| Verb entailment from WordNet
|- bgcolor="#ECECEC" align="left"
| WordNet
| QUANTA1.2way
| style="text-align: center;"| -0.17
| style="text-align: center;"| &mdash;
| We use several relations from WordNet, such as synonym, hyponym and hypernym
|- bgcolor="#ECECEC" align="left"
| WordNet
| Rhodes1.3way
| style="text-align: center;"| 3.17
| style="text-align: center;"| 4
| Lexicon-based match: we chose a very simple metric: matching between words in T and H based on a path of distance at most 2 in the WordNet graph, using any links (hyponymy, hypernymy, meronymy, pertainymy, etc.)
|- bgcolor="#ECECEC" align="left"
| WordNet
| Sagan1.3way
| style="text-align: center;"| 0
| style="text-align: center;"| -0.83
| The system is based on a machine learning approach. The ablation test was obtained with two fewer WordNet-based features (namely, string similarity based on Levenshtein distance and semantic similarity) in the training and testing steps.
|- bgcolor="#ECECEC" align="left"
| WordNet
| Siel_093.3way
| style="text-align: center;"| 0.34
| style="text-align: center;"| -0.17
| Similarity between nouns using the WN tool
|- bgcolor="#ECECEC" align="left"
| WordNet
| ssl1.3way
| style="text-align: center;"| 0
| style="text-align: center;"| 0.67
| WordNet analysis
|- bgcolor="#ECECEC" align="left"
| WordNet
| UB.dmirg3.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| Synonyms, hypernyms (2 levels away from the original term)
|- bgcolor="#ECECEC" align="left"
| WordNet
| UI_ccg1.2way
| style="text-align: center;"| 4
| style="text-align: center;"| &mdash;
| Word similarity == identity
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>FrameNet
| UB.dmirg3.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| WN: synonyms, hypernyms (2 levels away from the original term). FN: if two lexical items are covered in a single FrameNet frame, then the two items are treated as semantically related.
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>VerbOcean
| DFKI1.3way
| style="text-align: center;"| 0
| style="text-align: center;"| 0.17
| VerbOcean is used to calculate relatedness between nominal predicates in T and H, after using WordNet to change the nouns into verbs.
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>VerbOcean
| DFKI2.3way
| style="text-align: center;"| 0.5
| style="text-align: center;"| 0.67
| VerbOcean is used to calculate relatedness between nominal predicates in T and H, after using WordNet to change the nouns into verbs.
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>VerbOcean
| DFKI3.3way
| style="text-align: center;"| 0.17
| style="text-align: center;"| 0.17
| VerbOcean is used to calculate relatedness between nominal predicates in T and H, after using WordNet to change the nouns into verbs.
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>VerbOcean
| UAIC20091.3way
| style="text-align: center;"| 2
| style="text-align: center;"| 1.50
| Contradiction identification
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>VerbOcean + <br>DLSIUAES_negation_list
| DLSIUAES1.2way
| style="text-align: center;"| 0.66
| style="text-align: center;"| &mdash;
| Antonym relations between verbs (VO+WN); polarity based on negation terms (a short list constructed by the participants themselves)
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>VerbOcean + <br>DLSIUAES_negation_list
| DLSIUAES1.3way
| style="text-align: center;"| -1
| style="text-align: center;"| -0.5
| Antonym relations between verbs (VO+WN); polarity based on negation terms (a short list constructed by the participants themselves)
|- bgcolor="#ECECEC" align="left"
| WordNet +<br>XWordNet
| UAIC20091.3way
| style="text-align: center;"| 1
| style="text-align: center;"| 1.33
| Synonymy, hyponymy and hypernymy, and eXtended WordNet relations
|- bgcolor="#ECECEC" align="left"
| System component
| DirRelCond3.2way
| style="text-align: center;"| 4.67
| style="text-align: center;"| &mdash;
| The ablation test (abl-1) was meant to test one component of the most complex condition for entailment used in step 3 of the system
|- bgcolor="#ECECEC" align="left"
| System component
| DirRelCond3.2way
| style="text-align: center;"| -1.5
| style="text-align: center;"| &mdash;
| The ablation test (abl-2) was meant to test one component of the most complex condition for entailment used in step 3 of the system
|- bgcolor="#ECECEC" align="left"
| System component
| DirRelCond3.2way
| style="text-align: center;"| 0.17
| style="text-align: center;"| &mdash;
| The ablation test (abl-3) was meant to test one component of the most complex condition for entailment used in step 3 of the system
|- bgcolor="#ECECEC" align="left"
| System component
| DirRelCond3.2way
| style="text-align: center;"| -1.16
| style="text-align: center;"| &mdash;
| The ablation test (abl-4) was meant to test one component of the most complex condition for entailment used in step 3 of the system
|- bgcolor="#ECECEC" align="left"
| System component
| DirRelCond3.2way
| style="text-align: center;"| 4.17
| style="text-align: center;"| &mdash;
| The ablation test (abl-5) was meant to test one component of the most complex condition for entailment used in step 3 of the system
|- bgcolor="#ECECEC" align="left"
| Other
| UAIC20091.3way
| style="text-align: center;"| 4.17
| style="text-align: center;"| 4
| Pre-processing module, using MINIPAR, the TreeTagger tool and some transformations, e.g. ''hasn't'' > ''has not''
|- bgcolor="#ECECEC" align="left"
| Other
| DLSIUAES1.2way
| style="text-align: center;"| 1
| style="text-align: center;"| &mdash;
| Everything ablated except lexical-based metrics
|- bgcolor="#ECECEC" align="left"
| Other
| DLSIUAES1.2way
| style="text-align: center;"| 3.33
| style="text-align: center;"| &mdash;
| Everything ablated except semantic-derived inferences
|- bgcolor="#ECECEC" align="left"
| Other
| DLSIUAES1.3way
| style="text-align: center;"| -0.17
| style="text-align: center;"| -0.33
| Everything ablated except lexical-based metrics
|- bgcolor="#ECECEC" align="left"
| Other
| DLSIUAES1.3way
| style="text-align: center;"| 2.33
| style="text-align: center;"| 3.17
| Everything ablated except semantic-derived inferences
|- bgcolor="#ECECEC" align="left"
| Other
| FBKirst1.2way
| style="text-align: center;"| 2.84
| style="text-align: center;"| &mdash;
| The automatic estimation of operation costs from the run-1 modules was removed: the set of costs was assigned manually.
|- bgcolor="#ECECEC" align="left"
| Other
| JU_CSE_TAC1.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| Skip bigram match
|- bgcolor="#ECECEC" align="left"
| Other
| JU_CSE_TAC1.2way
| style="text-align: center;"| 0
| style="text-align: center;"| &mdash;
| Bigram match
|- bgcolor="#ECECEC" align="left"
| Other
| JU_CSE_TAC1.2way
| style="text-align: center;"| -0.5
| style="text-align: center;"| &mdash;
| Longest Common Subsequence
|- bgcolor="#ECECEC" align="left"
| Stemmer
| JU_CSE_TAC1.2way
| style="text-align: center;"| -0.5
| style="text-align: center;"| &mdash;
| Stemming, using the WordNet stemmer
|- bgcolor="#ECECEC" align="left"
| Other
| PeMoZa1.2way
| style="text-align: center;"| -2.5
| style="text-align: center;"| &mdash;
| Idf score
|- bgcolor="#ECECEC" align="left"
| Other
| PeMoZa1.2way
| style="text-align: center;"| -0.66
| style="text-align: center;"| &mdash;
| Proper noun Levenshtein distance
|- bgcolor="#ECECEC" align="left"
| Other
| PeMoZa1.2way
| style="text-align: center;"| 0.34
| style="text-align: center;"| &mdash;
| J&C (Jiang and Conrath, 1997) similarity score on nouns, adjectives
|- bgcolor="#ECECEC" align="left"
| Other
| PeMoZa2.2way
| style="text-align: center;"| 1
| style="text-align: center;"| &mdash;
| Idf score
|- bgcolor="#ECECEC" align="left"
| Other
| PeMoZa2.2way
| style="text-align: center;"| 0.17
| style="text-align: center;"| &mdash;
| Proper noun Levenshtein distance
|- bgcolor="#ECECEC" align="left"
| Other
| PeMoZa2.2way
| style="text-align: center;"| 0.5
| style="text-align: center;"| &mdash;
| J&C (Jiang and Conrath, 1997) similarity score on nouns, adjectives
|- bgcolor="#ECECEC" align="left"
| Other
| Rhodes1.3way
| style="text-align: center;"| -0.17
| style="text-align: center;"| -0.17
| Acronym match: we match words in all caps against sequences of capitalized words whose initial characters concatenate to form the acronym
|- bgcolor="#ECECEC" align="left"
| Other
| Rhodes1.3way
| style="text-align: center;"| 3.33
| style="text-align: center;"| 1.83
| Proper nouns match: exact string match between T and H, for proper nouns
|- bgcolor="#ECECEC" align="left"
| Other
| Rhodes1.3way
| style="text-align: center;"| 0.33
| style="text-align: center;"| 0.17
| Numbers match: exact string match between T and H, for numbers
|- bgcolor="#ECECEC" align="left"
| Other
| Rhodes1.3way
| style="text-align: center;"| 3.17
| style="text-align: center;"| 4
| Edit-distance-based matching: two words match if 80% of the letters of an H word occur in one or more adjacent T words in the same order
|- bgcolor="#ECECEC" align="left"
| Other
| UI_ccg1.2way
| style="text-align: center;"| 1
| style="text-align: center;"| &mdash;
| Less sophisticated NE similarity metric: mainly Jaro-Winkler-based
|}
<br>
<br>

==Footnotes==
<references />

Return to [[RTE Knowledge Resources]]
