RTE5 - Ablation Tests
Ablated Resource | Team Run | Δ Accuracy % (2-way) | Δ Accuracy % (3-way) | Resource Usage Description |
---|---|---|---|---|
Acronym guide | Siel_093.3way | 0 | 0 | Acronym Resolution |
Acronym guide + UAIC_Acronym_rules | UAIC20091.3way | 0.17 | 0.16 | We start from the acronym guide, but additionally we use a rule that considers, for expressions like Xaaaa Ybbbb Zcccc, the acronym XYZ, regardless of the length of the expression with this form.
DIRT | BIU1.2way | 1.33 | — | Inference rules |
DIRT | Boeing3.3way | -1.17 | 0 | Verb paraphrases |
DIRT | UAIC20091.3way | 0.17 | 0.33 | We transform the text and hypothesis into dependency trees with MINIPAR; DIRT relations are used to map verbs in T to verbs in H
FrameNet | DLSIUAES1.2way | 1.16 | — | Frame-to-frame similarity metric
FrameNet | DLSIUAES1.3way | -0.17 | -0.17 | Frame-to-frame similarity metric
FrameNet | UB.dmirg3.2way | 0 | — | |
Grady Ward’s MOBY Thesaurus + Roget's Thesaurus | VensesTeam2.2way | 2.83 | — | Semantic fields are used for semantic similarity matching in all cases of non-identical lemmas
MontyLingua Tool | Siel_093.3way | 0 | 0 | For VerbOcean, the verbs have to be in their base form. We used the "MontyLingua" tool to convert the verbs into their base form
NEGATION_rules by UAIC | UAIC20091.3way | 0 | -1.34 | Negation rules check the branches descending from verbs in the dependency trees to see whether certain categories of meaning-changing words are present.
NER | UI_ccg1.2way | 4.83 | — | Named Entity recognition/comparison |
PropBank | cswhu1.3way | 2 | 3.17 | syntactic and semantic parsing |
Stanford NER | QUANTA1.2way | 0.67 | — | We use Named Entity similarity as a feature |
Stopword list | FBKirst1.2way | 1.5 | — | A list of the 572 most frequent English words was collected in order to prevent assigning high costs to the deletion/insertion of terms that are unlikely to bring relevant information to detect entailment, and to avoid substituting these terms with any content word.
Training data from RTE1, 2, 3 | PeMoZa3.2way | 0 | — | |
Training data from RTE2 | PeMoZa3.2way | 0.66 | — | |
Training data from RTE2, 3 | PeMoZa3.2way | 0 | — | |
VerbOcean | DFKI1.3way | 0 | 0.17 | VerbOcean relations are used to calculate relatedness between verbs in T and H |
VerbOcean | DFKI2.3way | 0.33 | 0.5 | VerbOcean relations are used to calculate relatedness between verbs in T and H |
VerbOcean | DFKI3.3way | 0.17 | 0.17 | VerbOcean relations are used to calculate relatedness between verbs in T and H |
VerbOcean | FBKirst1.2way | -0.16 | — | Extraction of 18232 entailment rules for all the English verbs connected by the "stronger-than" relation. For instance, if "kill [stronger-than] injure", then the rule "kill ENTAILS injure" is added to the rules repository.
VerbOcean | QUANTA1.2way | 0 | — | We use the "opposite-of" relation in VerbOcean as a feature
VerbOcean | Siel_093.3way | 0 | 0 | Similarity/antonymy/unrelatedness between verbs
Wikipedia | BIU1.2way | -1 | — | Lexical rules extracted from Wikipedia definition sentences, title parenthesis, redirect and hyperlink relations
Wikipedia | cswhu1.3way | 1.33 | 3.34 | Lexical semantic rules
Wikipedia | FBKirst1.2way | 1 | — | Rules extracted from WP using Latent Semantic Analysis (LSA)
Wikipedia | UAIC20091.3way | 1.17 | 1.5 | Relations between named entities
Wikipedia + NERs (LingPipe, GATE) + Perl patterns | UAIC20091.3way | 6.17 | 5 | NE module: NERs, in order to identify Persons, Locations, Jobs, Languages, etc.; Perl patterns built by us for RTE4 in order to identify numbers and dates; our own resources extracted from Wikipedia in order to identify a "distance" between a named entity from the hypothesis and named entities from the text
WordNet | AUEBNLP1.3way | -2 | -2.67 | Synonyms |
WordNet | BIU1.2way | 2.5 | — | Synonyms, hyponyms (2 levels away from the original term), hyponym_instance and derivations |
WordNet | Boeing3.3way | 4 | 5.67 | WordNet synonym and hypernym relationships between (senses of) words, plus "similar" (SIM), "pertains" (PER), and "derivational" (DER) links, to recognize equivalence between T and H
WordNet | DFKI1.3way | -0.17 | 0 | Argument alignment between T and H |
WordNet | DFKI2.3way | 0.16 | 0.34 | Argument alignment between T and H |
WordNet | DFKI3.3way | 0.17 | 0.17 | Argument alignment between T and H |
WordNet | DLSIUAES1.2way | 0.83 | — | Similarity between lemmata, computed by WordNet-based metrics |
WordNet | DLSIUAES1.3way | -0.5 | -0.33 | Similarity between lemmata, computed by WordNet-based metrics |
WordNet | JU_CSE_TAC1.2way | 0.34 | — | WordNet-based unigram match: WordNet synsets are identified for each of the unmatched unigrams in the hypothesis. If any synset of the H unigram matches any synset of a word in T, the hypothesis unigram is considered a WordNet-based unigram match (see the sketch below the table).
WordNet | PeMoZa1.2way | -0.5 | — | Derivational Morphology from WordNet |
WordNet | PeMoZa1.2way | 1.33 | — | Verb Entailment from Wordnet |
WordNet | PeMoZa2.2way | 1 | — | Derivational Morphology from WordNet |
WordNet | PeMoZa2.2way | -0.33 | — | Verb Entailment from Wordnet |
WordNet | QUANTA1.2way | -0.17 | — | We use several relations from WordNet, such as synonym, hyponym, and hypernym.
WordNet | Sagan1.3way | 0 | -0.83 | The system is based on a machine learning approach. The ablation test was obtained with two fewer WordNet-based features in the training and testing steps.
WordNet | Siel_093.3way | 0.34 | -0.17 | Similarity between nouns using WN tool |
WordNet | ssl1.3way | 0 | 0.67 | WordNet Analysis |
WordNet | UB.dmirg3.2way | 0 | — | |
WordNet | UI_ccg1.2way | 4 | — | word similarity == identity |
WordNet + FrameNet | UB.dmirg3.2way | 0 | — | |
WordNet + VerbOcean | DFKI1.3way | 0 | 0.17 | VerbOcean is used to calculate relatedness between nominal predicates in T and H, after using WordNet to change the nouns into verbs.
WordNet + VerbOcean | DFKI2.3way | 0.5 | 0.67 | VerbOcean is used to calculate relatedness between nominal predicates in T and H, after using WordNet to change the nouns into verbs.
WordNet + VerbOcean | DFKI3.3way | 0.17 | 0.17 | VerbOcean is used to calculate relatedness between nominal predicates in T and H, after using WordNet to change the nouns into verbs.
WordNet + VerbOcean | UAIC20091.3way | 2 | 1.50 | Contradiction identification
WordNet + VerbOcean + DLSIUAES_negation_list | DLSIUAES1.2way | 0.66 | — | Antonym relations between verbs (VO+WN); polarity based on negation terms (a short list constructed by the participants themselves)
WordNet + VerbOcean + DLSIUAES_negation_list | DLSIUAES1.3way | -1 | -0.5 | Antonym relations between verbs (VO+WN); polarity based on negation terms (a short list constructed by the participants themselves)
WordNet + XWordNet | UAIC20091.3way | 1 | 1.33 | Synonymy, hyponymy, and hypernymy relations, plus eXtended WordNet relations
System component | DirRelCond3.2way | 4.67 | — | The ablation test (abl-1) was meant to test one component of the most complex condition for entailment used in step 3 of the system |
System component | DirRelCond3.2way | -1.5 | — | The ablation test (abl-2) was meant to test one component of the most complex condition for entailment used in step 3 of the system |
System component | DirRelCond3.2way | 0.17 | — | The ablation test (abl-3) was meant to test one component of the most complex condition for entailment used in step 3 of the system |
System component | DirRelCond3.2way | -1.16 | — | The ablation test (abl-4) was meant to test one component of the most complex condition for entailment used in step 3 of the system |
System component | DirRelCond3.2way | 4.17 | — | The ablation test (abl-5) was meant to test one component of the most complex condition for entailment used in step 3 of the system |
System component | UAIC20091.3way | 4.17 | 4 | Pre-processing module |
Other | DLSIUAES1.2way | 1 | — | Everything ablated except lexical-based metrics |
Other | DLSIUAES1.2way | 3.33 | — | Everything ablated except semantic-derived inferences |
Other | DLSIUAES1.3way | -0.17 | -0.33 | Everything ablated except lexical-based metrics |
Other | DLSIUAES1.3way | 2.33 | 3.17 | Everything ablated except semantic-derived inferences |
Other | FBKirst1.2way | 2.84 | — | The automatic estimation of operation costs from the run-1 modules was removed: the set of costs was assigned manually.
Other | JU_CSE_TAC1.2way | 0 | — | Named Entity match |
Other | JU_CSE_TAC1.2way | 0 | — | Skip bigram |
Other | JU_CSE_TAC1.2way | 0 | — | Bigram match |
Other | JU_CSE_TAC1.2way | -0.5 | — | Longest Common Subsequence |
Other | JU_CSE_TAC1.2way | -0.5 | — | Unigram match after stemming |
Other | PeMoZa1.2way | -2.5 | — | Idf score |
Other | PeMoZa1.2way | -0.66 | — | Proper Noun Levenshtein Distance
Other | PeMoZa1.2way | 0.34 | — | J&C (Jiang and Conrath, 1997) similarity score on nouns and adjectives
Other | PeMoZa2.2way | 1 | — | Idf score |
Other | PeMoZa2.2way | 0.17 | — | Proper Noun Levenshtein Distance
Other | PeMoZa2.2way | 0.5 | — | J&C (Jiang and Conrath, 1997) similarity score on nouns and adjectives
Other | UI_ccg1.2way | 1 | — | Less sophisticated NE similarity metric: mainly Jaro-Winkler-based |
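
The Δ Accuracy figures in the table are the difference, in percentage points, between the accuracy of a team's complete run and that of the corresponding ablation run, so a positive value indicates that the ablated resource contributed positively to the result. A minimal sketch of that calculation, assuming this sign convention and using hypothetical label lists rather than any submitted run, is:

```python
# Minimal sketch of the Δ Accuracy computation, assuming the convention
# Δ = accuracy(complete run) - accuracy(ablation run), in percentage points.
# The label lists below are hypothetical, not taken from any submitted run.

def accuracy(gold, predicted):
    """Fraction of T-H pairs whose predicted judgment matches the gold judgment."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted runs must cover the same pairs")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

def ablation_delta(gold, complete_run, ablated_run):
    """Δ Accuracy in percentage points: positive means the ablated resource helped."""
    return 100.0 * (accuracy(gold, complete_run) - accuracy(gold, ablated_run))

# Toy 2-way example:
gold     = ["ENTAILMENT", "NO ENTAILMENT", "ENTAILMENT", "NO ENTAILMENT"]
complete = ["ENTAILMENT", "NO ENTAILMENT", "ENTAILMENT", "ENTAILMENT"]
ablated  = ["ENTAILMENT", "NO ENTAILMENT", "NO ENTAILMENT", "ENTAILMENT"]
print(ablation_delta(gold, complete, ablated))  # 25.0
```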
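
The WordNet-based unigram match described in the JU_CSE_TAC1 row is the most procedural description in the table. The following is a minimal sketch of that idea using NLTK's WordNet interface; the function and variable names and the example tokens are illustrative assumptions, not the participants' actual implementation.

```python
# A minimal sketch of a WordNet-based unigram match, assuming NLTK with the
# WordNet corpus downloaded (nltk.download("wordnet")). Names are illustrative.
from nltk.corpus import wordnet as wn

def wordnet_unigram_matches(text_tokens, hypothesis_tokens):
    """Return hypothesis unigrams that are not literal matches but share a
    WordNet synset with some word of the text."""
    text_synsets = {s for t in text_tokens for s in wn.synsets(t)}
    matches = []
    for h in hypothesis_tokens:
        if h in text_tokens:
            continue  # already matched literally, not an "unmatched unigram"
        if any(s in text_synsets for s in wn.synsets(h)):
            matches.append(h)
    return matches

# Toy usage with hypothetical tokens:
print(wordnet_unigram_matches(
    ["the", "car", "was", "bought", "yesterday"],
    ["the", "automobile", "was", "purchased"]))
```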