Difference between revisions of "RTE Knowledge Resources"

From ACL Wiki
m (Reverted edits by Creek (talk) to last revision by Dagiampi)
 
(34 intermediate revisions by 7 users not shown)
Line 8: Line 8:
 
<br>
 
<br>
 
=== Call for Resources ===
 
=== Call for Resources ===
[[RTE6 - Call for Resources]]
 
<br>
 
  
 +
In order to help the research, all the participants are invited to contribute, sharing their own resources with the RTE community.
 +
<br>
 +
Making the resources available to be used by other systems has several advantages. On the one hand, it helps improve the TE technology; on the other hand, it offers an opportunity to further test and evaluate the resource.
 +
<br>
 +
<br>
 +
*[[RTE6 - Call for Resources]]
 +
*[[RTE7 - Call for Resources]]
 +
<br>
 
=== Ablation Tests ===
 
=== Ablation Tests ===
[[RTE5 - Ablation Tests]]
+
An ablation test consists of removing one module at a time from a system, and rerunning the system on the test set with the other modules, except the one tested.
 +
<br>
 +
Ablation tests are meant to help better understand the relevance of the knowledge resources used by RTE systems, and evaluate the contribution of each of them to the systems' performances. In fact, comparing the results achieved in the ablation tests to those obtained by the systems as a whole allows assessing the contribution given by each single resource.
 +
<br>
 +
<br>
 +
*[[RTE5 - Ablation Tests]]
 +
*[[RTE6 - Ablation Tests]]
 +
*[[RTE7 - Ablation Tests]]
 
<br>
 
<br>
  
Line 26: Line 39:
 
! width="45"|<small>RTE4 Users<ref name:"rtefour">RTE-4 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref></small>
 
! width="45"|<small>RTE4 Users<ref name:"rtefour">RTE-4 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref></small>
 
! width="45"|<small>RTE5 Users<ref name:"rtefive">RTE-5 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref></small>
 
! width="45"|<small>RTE5 Users<ref name:"rtefive">RTE-5 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref></small>
 +
! width="45"|<small>RTE6 Users<ref name:"rtesix">RTE-6 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref></small>
 
! class="unsortable" width="30"|<small>Usage info</small>
 
! class="unsortable" width="30"|<small>Usage info</small>
  
Line 36: Line 50:
 
| style="text-align: center;"|21
 
| style="text-align: center;"|21
 
| style="text-align: center;"|18
 
| style="text-align: center;"|18
 +
|style="text-align: center;"| 22
 
| [[WordNet - RTE Users|Users]]
 
| [[WordNet - RTE Users|Users]]
  
Line 46: Line 61:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|2
 
| style="text-align: center;"|2
 +
|style="text-align: center;"| 0
 
| [[eXtended WordNet - RTE Users|Users]]
 
| [[eXtended WordNet - RTE Users|Users]]
  
Line 56: Line 72:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Augmented Wordnet - RTE Users|Users]]
 
| [[Augmented Wordnet - RTE Users|Users]]
  
Line 66: Line 83:
 
| style="text-align: center;"|2
 
| style="text-align: center;"|2
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Verbnet - RTE Users|Users]]
 
| [[Verbnet - RTE Users|Users]]
  
Line 76: Line 94:
 
| style="text-align: center;"|3
 
| style="text-align: center;"|3
 
| style="text-align: center;"|6
 
| style="text-align: center;"|6
 +
|style="text-align: center;"| 7
 
| [[VerbOcean - RTE Users|Users]]
 
| [[VerbOcean - RTE Users|Users]]
  
Line 86: Line 105:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|2
 
| style="text-align: center;"|2
 +
|style="text-align: center;"| 3
 
| [[Framenet - RTE Users|Users]]
 
| [[Framenet - RTE Users|Users]]
  
Line 96: Line 116:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[NomBank Resource - RTE Users|Users]]  
 
| [[NomBank Resource - RTE Users|Users]]  
  
Line 106: Line 127:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[PropBank Resource - RTE Users|Users]]
 
| [[PropBank Resource - RTE Users|Users]]
  
Line 116: Line 138:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[Nomlex Plus - RTE Users|Users]]
 
| [[Nomlex Plus - RTE Users|Users]]
  
Line 126: Line 149:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 2
 
| [[Dekang Lin’s Thesaurus - RTE Users|Users]]
 
| [[Dekang Lin’s Thesaurus - RTE Users|Users]]
  
Line 136: Line 160:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Grady Ward's Moby Thesaurus - RTE Users|Users]]
 
| [[Grady Ward's Moby Thesaurus - RTE Users|Users]]
  
Line 146: Line 171:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Roget's Thesaurus - RTE Users|Users]]
 
| [[Roget's Thesaurus - RTE Users|Users]]
  
Line 156: Line 182:
 
| style="text-align: center;"|3
 
| style="text-align: center;"|3
 
| style="text-align: center;"|6
 
| style="text-align: center;"|6
 +
|style="text-align: center;"| 6
 
| [[Wikipedia - RTE Users|Users]]
 
| [[Wikipedia - RTE Users|Users]]
  
Line 166: Line 193:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Umbel - RTE Users|Users]]
 
| [[Umbel - RTE Users|Users]]
  
Line 176: Line 204:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[YAGO - RTE Users|Users]]
 
| [[YAGO - RTE Users|Users]]
  
Line 186: Line 215:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[DBpedia - RTE Users|Users]]
 
| [[DBpedia - RTE Users|Users]]
  
Line 196: Line 226:
 
| style="text-align: center;"|4
 
| style="text-align: center;"|4
 
| style="text-align: center;"|3
 
| style="text-align: center;"|3
 +
|style="text-align: center;"| 4
 
| [[DIRT Paraphrase Collection - RTE Users|Users]]
 
| [[DIRT Paraphrase Collection - RTE Users|Users]]
  
Line 206: Line 237:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[Tease Collection - RTE Users|Users]]
 
| [[Tease Collection - RTE Users|Users]]
  
Line 216: Line 248:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[BADC Acronym and Abbreviation List - RTE Users|Users]]
 
| [[BADC Acronym and Abbreviation List - RTE Users|Users]]
 +
 +
|- bgcolor="#ECECEC" "align="left"
 +
| [http://clipdemos.umiacs.umd.edu/catvar/ Catvar The Categorial Variation Database (English)]
 +
| Word List
 +
| University of Maryland
 +
| A Categorial-Variation Database (or Catvar) is a database of clusters of uninflected words (lexemes) and their categorial (i.e. part-of-speech) variants. The database was developed for English using a combination of resources and algorithms, including the LCS Verb and Preposition Databases (Dorr 2001), the Brown Corpus section of the Penn Treebank (Marcus et al. 1994), an English morphological analysis lexicon developed for PC-Kimmo (ENGLEX) (Antworth 1990), WordNet1.6 (Fellbaum 1998), an English Verb-Noun list extracted from Nomlex (Macleod et al. 1998), a similar list extracted from LDOCE (Procter 1978) and the Porter stemmer (Porter 1980).
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|0
 +
|style="text-align: center;"| 1
 +
| [[Categorial-Variation Database - RTE Users|Users]]
 +
  
 
|- bgcolor="#ECECEC" "align="left"
 
|- bgcolor="#ECECEC" "align="left"
Line 226: Line 271:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|3
 
| style="text-align: center;"|3
 +
|style="text-align: center;"| 0
 
| [[Acronym Guide - RTE Users|Users]]
 
| [[Acronym Guide - RTE Users|Users]]
  
Line 236: Line 282:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[Web1T - RTE Users|Users]]
 
| [[Web1T - RTE Users|Users]]
  
Line 246: Line 293:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Normalized Google Distance (RTE3&RTE4)- RTE Users|Users]]
 
| [[Normalized Google Distance (RTE3&RTE4)- RTE Users|Users]]
  
Line 256: Line 304:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Normalized Google Distance (RTE5)- RTE Users|Users]]
 
| [[Normalized Google Distance (RTE5)- RTE Users|Users]]
  
Line 266: Line 315:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[GNIS - RTE Users|Users]]
 
| [[GNIS - RTE Users|Users]]
  
Line 276: Line 326:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[Geonames - RTE Users|Users]]
 
| [[Geonames - RTE Users|Users]]
  
Line 286: Line 337:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[Sekine's Paraphrase Database - RTE Users|Users]]
 
| [[Sekine's Paraphrase Database - RTE Users|Users]]
  
Line 296: Line 348:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[Microsoft Research Paraphrase Corpus - RTE Users|Users]]
 
| [[Microsoft Research Paraphrase Corpus - RTE Users|Users]]
  
Line 306: Line 359:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[Downward entailing operators - RTE Users|Users]]
 
| [[Downward entailing operators - RTE Users|Users]]
  
Line 316: Line 370:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
|style="text-align: center;"| 0
 
| [[WikiRules! - RTE Users|Users]]
 
| [[WikiRules! - RTE Users|Users]]
  
Line 326: Line 381:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[DART - RTE Users|Users]]
 
| [[DART - RTE Users|Users]]
  
 
|- bgcolor="#ECECEC" "align="left"
 
|- bgcolor="#ECECEC" "align="left"
|[http://cs.biu.ac.il/~nlp/download/FRED/latest/FRED.zip FRED]
+
|[http://cs.biu.ac.il/~nlp/downloads/FRED.html FRED]
 
| FrameNet-derived entailment rule-base
 
| FrameNet-derived entailment rule-base
 
| Bar-Ilan University
 
| Bar-Ilan University
| This package contains the outputs of the FRED algorithm, an algorithm which extracts entailment rules from FrameNet.
+
| This package contains the outputs of the FRED algorithm which extracts entailment rules from FrameNet.
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[FRED - RTE Users|Users]]
 
| [[FRED - RTE Users|Users]]
  
Line 346: Line 403:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[DIRECT - RTE Users|Users]]
 
| [[DIRECT - RTE Users|Users]]
  
Line 357: Line 415:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 
| [[BinaryDIRT- RTE Users|Users]]
 
| [[BinaryDIRT- RTE Users|Users]]
  
 +
|- bgcolor="#ECECEC" "align="left"
 +
|[http://u.cs.biu.ac.il/~nlp/downloads/unaryBInc.html unaryBInc]
 +
| Entailment rules between unary templates using BInc algorithm
 +
| Bar-Ilan University
 +
| This resource contains entailment rules over unary templates learned over the Reuters corpus using
 +
the BInc algorithm of Szpektor and Dagan (2008).
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|0
 +
|style="text-align: center;"| 0
 +
| [[unaryBInc- RTE Users|Users]]
  
 
|- bgcolor="#ECECEC" "align="left"
 
|- bgcolor="#ECECEC" "align="left"
Line 368: Line 438:
 
| style="text-align: center;"|  
 
| style="text-align: center;"|  
 
| style="text-align: center;"|  
 
| style="text-align: center;"|  
 +
|style="text-align: center;"|
 
| [[New Resource2 - RTE Users|Users]]
 
| [[New Resource2 - RTE Users|Users]]
  
Line 387: Line 458:
 
! width="30"|<small>RTE4 Users</small>
 
! width="30"|<small>RTE4 Users</small>
 
! width="30"|<small>RTE5 Users</small>
 
! width="30"|<small>RTE5 Users</small>
 +
! width="30"|<small>RTE6 Users</small>
 
! class="unsortable" width="30"|<small>Usage info</small>
 
! class="unsortable" width="30"|<small>Usage info</small>
  
Line 396: Line 468:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| [[Parc Polarity Lexicon - RTE Users|Users]]
 
| [[Parc Polarity Lexicon - RTE Users|Users]]
Line 405: Line 478:
 
| Cities and other geographical names
 
| Cities and other geographical names
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
Line 416: Line 490:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| [[Geographic Ontology - RTE Users|Users]]
 
| [[Geographic Ontology - RTE Users|Users]]
Line 427: Line 502:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[Geo - RTE Users|Users]]
 
| [[Geo - RTE Users|Users]]
  
Line 437: Line 513:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[Regex - RTE Users|Users]]
 
| [[Regex - RTE Users|Users]]
  
Line 447: Line 524:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[Syntactic rule base - RTE Users|Users]]
 
| [[Syntactic rule base - RTE Users|Users]]
  
Line 457: Line 535:
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 
| style="text-align: center;"|0  
 
| style="text-align: center;"|0  
 +
| style="text-align: center;"|0
 
| [[Polarity rule base - RTE Users|Users]]
 
| [[Polarity rule base - RTE Users|Users]]
  
Line 466: Line 545:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| [[Lexical-Syntactic rule base - RTE Users|Users]]
 
| [[Lexical-Syntactic rule base - RTE Users|Users]]
Line 475: Line 555:
 
| Collections of rules, patterns etc. for RTE purpose, extracted from Reuter corpus parsed using Minipar.
 
| Collections of rules, patterns etc. for RTE purpose, extracted from Reuter corpus parsed using Minipar.
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0  
 
| style="text-align: center;"|0  
Line 487: Line 568:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1
 
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[Abbr - RTE Users|Users]]
 
| [[Abbr - RTE Users|Users]]
  
Line 497: Line 579:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|1  
 
| style="text-align: center;"|1  
 +
| style="text-align: center;"|0
 
| [[UAIC Negation_list - RTE Users|Users]]
 
| [[UAIC Negation_list - RTE Users|Users]]
  
Line 506: Line 589:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
| style="text-align: center;"|1  
+
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[DLSIUAES Negation_list - RTE Users|Users]]
 
| [[DLSIUAES Negation_list - RTE Users|Users]]
  
Line 516: Line 600:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
| style="text-align: center;"|1  
+
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[UAIC Quantifier_list - RTE Users|Users]]
 
| [[UAIC Quantifier_list - RTE Users|Users]]
  
Line 526: Line 611:
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
 
| style="text-align: center;"|0
| style="text-align: center;"|1  
+
| style="text-align: center;"|1
 +
| style="text-align: center;"|0
 
| [[FBKirst StopWord list - RTE Users|Users]]
 
| [[FBKirst StopWord list - RTE Users|Users]]
 +
 +
|- bgcolor="#ECECEC" "align="left"
 +
| IKOMA Dictionary of Named Entities Acronyms and Synonyms
 +
| Dictionary of Named Entities Acronyms and Synonyms
 +
| IKOMA; NEC Corporation, Takayama, Ikoma, Nara, Japan
 +
| Acronym dictionary constructed automatically from the corpus and a synonym dictionary that contains geographical terms.
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|0
 +
| style="text-align: center;"|1
 +
| [[IKOMA2 - RTE Users|Users]]
  
 
|}
 
|}
Line 534: Line 631:
 
==Footnotes==
 
==Footnotes==
 
<references/>
 
<references/>
 +
 +
Return to: [[Textual_Entailment_Resource_Pool]]

Latest revision as of 05:18, 25 June 2012

Knowledge resources have shown their relevance for applied semantic inference, and are extensively used by applied inference systems, such as those developed within the Textual Entailment framework.

This page presents a list of the knowledge resources used by systems that have participated in the last RTE challenges. The first table lists the publicly available resources, the second one lists unpublished resources. Both tables are sortable by Resource name, type, author and number of users.

RTE participants are encouraged to add information about all kinds of knowledge resources they have used, from standard existing resources (e.g. WordNet) to knowledge collections created for specific purposes, which can be made available to the community.


Call for Resources

To support research, all participants are invited to contribute by sharing their own resources with the RTE community.
Making a resource available to other systems has two advantages: it helps improve TE technology, and it offers an opportunity to further test and evaluate the resource.


Ablation Tests

An ablation test consists of removing one module at a time from a system and rerunning the system on the test set with all the remaining modules.
Ablation tests are meant to help better understand the relevance of the knowledge resources used by RTE systems, and to evaluate the contribution of each resource to a system's performance. Comparing the results achieved in the ablation tests with those obtained by the complete system makes it possible to assess the contribution of each individual resource.
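
The procedure described above can be sketched in a few lines of Python. This is only an illustrative toy, not the evaluation code used in any RTE challenge: the "system" simply takes the best score among its resources, and the names `word_overlap`, `synonym_match`, `SYNONYMS`, `DATA` and the 0.9 threshold are all assumptions made up for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a "resource" scores how well a text supports a hypothesis.
Resource = Callable[[str, str], float]
Pair = Tuple[str, str, bool]  # (text, hypothesis, gold entailment label)

# Toy synonym list standing in for a lexical resource (illustrative only).
SYNONYMS = {("bought", "purchased")}


def word_overlap(text: str, hyp: str) -> float:
    """Fraction of distinct hypothesis words that appear verbatim in the text."""
    t, h = set(text.lower().split()), set(hyp.lower().split())
    return len(t & h) / len(h) if h else 0.0


def synonym_match(text: str, hyp: str) -> float:
    """Like word_overlap, but a word also counts if a synonym is in the text."""
    t, h = set(text.lower().split()), set(hyp.lower().split())

    def covered(w: str) -> bool:
        return w in t or any((x, w) in SYNONYMS or (w, x) in SYNONYMS for x in t)

    return sum(covered(w) for w in h) / len(h) if h else 0.0


def predict(resources: Dict[str, Resource], text: str, hyp: str,
            threshold: float = 0.9) -> bool:
    """Toy system: predict entailment if any resource scores above the threshold."""
    best = max((r(text, hyp) for r in resources.values()), default=0.0)
    return best >= threshold


def accuracy(resources: Dict[str, Resource], data: List[Pair]) -> float:
    correct = sum(predict(resources, t, h) == gold for t, h, gold in data)
    return correct / len(data)


def ablation_tests(resources: Dict[str, Resource], data: List[Pair]) -> Dict[str, float]:
    """For each resource, rerun the system without it and report the accuracy drop."""
    full = accuracy(resources, data)
    report = {}
    for name in resources:
        reduced = {k: v for k, v in resources.items() if k != name}
        report[name] = full - accuracy(reduced, data)
    return report


RESOURCES: Dict[str, Resource] = {
    "word_overlap": word_overlap,
    "synonym_match": synonym_match,
}
DATA: List[Pair] = [
    ("John bought a car", "John purchased a car", True),
    ("John bought a car", "John bought a car", True),
    ("Mary sold a house", "John bought a car", False),
    ("a dog barked loudly", "a cat slept", False),
]

report = ablation_tests(RESOURCES, DATA)
for name, drop in report.items():
    print(f"{name}: accuracy drop when removed = {drop:.2f}")
```

On this toy data, removing `synonym_match` lowers accuracy by 0.25, while removing `word_overlap` changes nothing because the other resource covers the same cases. A zero difference in an ablation test therefore does not necessarily mean a resource is useless, only that it is redundant within that particular system.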


Publicly available Resources

{| class="sortable"
! Resource !! Type !! Author !! Brief description !! PAST Users<ref name="rtethree">RTE-3 data have been provided by participants by means of a questionnaire.</ref> !! RTE4 Users<ref name="rtefour">RTE-4 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref> !! RTE5 Users<ref name="rtefive">RTE-5 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref> !! RTE6 Users<ref name="rtesix">RTE-6 data have been provided by participants and have been integrated with information extracted from the related proceedings.</ref> !! class="unsortable" | Usage info
|-
| WordNet || Lexical DB || Princeton University || Lexical database of English nouns, verbs, adjectives and adverbs || 3 || 21 || 18 || 22 || Users
|-
| eXtended WordNet || Lexical DB || Human Language Technology Research Institute, University of Texas at Dallas || Extension of WordNet based on the exploitation of the information contained in WordNet definitional glosses: the glosses are syntactically parsed and transformed into logic forms, and content words are semantically disambiguated. The Extended WordNet is an ongoing project. || 0 || 0 || 2 || 0 || Users
|-
| Augmented Wordnet || Lexical DB || Stanford University || The resource is the result of the application of a learning algorithm for inducing semantic taxonomies from parsed text. The algorithm automatically acquires items of world knowledge, and uses these to produce significantly enhanced versions of WordNet (up to 40,000 synsets more). || 0 || 0 || 1 || 0 || Users
|-
| VerbNet || Lexical DB || University of Colorado Boulder || Lexicon for English verbs organized into classes extending Levin (1993) classes through refinement and addition of subclasses to achieve syntactic and semantic coherence among members of a class || 2 || 2 || 1 || 0 || Users
|-
| VerbOcean || Lexical DB || Information Sciences Institute, University of Southern California || Broad-coverage semantic network of verbs || 2 || 3 || 6 || 7 || Users
|-
| FrameNet || Lexical DB || ICSI (International Computer Science Institute) - UC Berkeley || Lexical resource for English words, based on frame semantics (valences) and supported by corpus evidence || 1 || 1 || 2 || 3 || Users
|-
| NomBank || Lexical DB || New York University || Lexical resource containing syntactic frames for nouns, extracted from annotated corpora || 2 || 1 || 0 || 0 || Users
|-
| PropBank || Lexical DB || University of Colorado Boulder || Lexical resource containing syntactic frames for verbs, extracted from annotated corpora || 2 || 1 || 1 || 0 || Users
|-
| Nomlex Plus || Lexical DB || New York University || Dictionary of English nominalizations: it describes the allowed complements for a nominalization and relates the nominal complements to the arguments of the corresponding verb || 0 || 1 || 0 || 0 || Users
|-
| Dekang Lin’s Thesaurus || Thesaurus || University of Alberta || Thesaurus automatically constructed using a parsed corpus, based on distributional similarity scores || 0 || 1 || 1 || 2 || Users
|-
| Grady Ward's Moby Thesaurus || Thesaurus || University of Sheffield || Thesaurus containing 30,260 root words, with 2,520,264 synonyms and related terms. Grady Ward placed this thesaurus in the public domain in 1996. || 0 || 0 || 1 || 0 || Users
|-
| Roget's Thesaurus || Thesaurus || Peter Mark Roget (Electronic version distributed by University of Chicago) || Roget's Thesaurus is a widely used English thesaurus, created by Dr. Peter Mark Roget in 1805. The original edition had 15,000 words, and each new edition has been larger. The electronic edition (version 1.02) is made available by University of Chicago. || 1 || 0 || 1 || 0 || Users
|-
| Wikipedia || Encyclopedia || || Free encyclopedia. Used for extraction of lexical-semantic rules (from its more structured parts), named entity recognition, geographical information etc. || 0 || 3 || 6 || 6 || Users
|-
| Umbel || Ontology || Structured Dynamics LLC, Coralville, IA || UMBEL stands for Upper Mapping and Binding Exchange Layer and is a lightweight ontology structure for relating Web content and data to a standard set of subject concepts || 0 || 0 || 1 || 0 || Users
|-
| YAGO || Ontology || Max-Planck Institute for Informatics, Saarbrücken, Germany || Lightweight and extensible ontology. It contains more than 2 million entities and 20 million facts about these entities. The facts have been automatically extracted from Wikipedia and unified with WordNet. || 0 || 0 || 1 || 0 || Users
|-
| DBpedia || Ontology || Open community project || DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. The DBpedia knowledge base currently describes more than 2.9 million things in 91 different languages and consists of 479 million pieces of information. || 0 || 0 || 1 || 0 || Users
|-
| DIRT Paraphrase Collection || Collection of paraphrases || University of Alberta || DIRT (Discovery of Inference Rules from Text) is both an algorithm and a resulting knowledge collection. The DIRT knowledge collection is the output of the DIRT algorithm over a 1GB set of newspaper text. || 2 || 4 || 3 || 4 || Users
|-
| TEASE Collection || Collection of Entailment Rules || Bar-Ilan University || Output of the TEASE algorithm || 0 || 0 || 0 || 0 || Users
|-
| BADC Acronym and Abbreviation List || Word List || BADC (British Atmospheric Data Centre) || Acronym and Abbreviation List || 0 || 1 || 1 || 0 || Users
|-
| Catvar The Categorial Variation Database (English) || Word List || University of Maryland || A Categorial-Variation Database (or Catvar) is a database of clusters of uninflected words (lexemes) and their categorial (i.e. part-of-speech) variants. The database was developed for English using a combination of resources and algorithms, including the LCS Verb and Preposition Databases (Dorr 2001), the Brown Corpus section of the Penn Treebank (Marcus et al. 1994), an English morphological analysis lexicon developed for PC-Kimmo (ENGLEX) (Antworth 1990), WordNet 1.6 (Fellbaum 1998), an English Verb-Noun list extracted from Nomlex (Macleod et al. 1998), a similar list extracted from LDOCE (Procter 1978) and the Porter stemmer (Porter 1980). || 0 || 0 || 0 || 1 || Users
|-
| Acronym Guide || Word List || Acronym-Guide.com || Acronym and Abbreviation Lists for English, branched in thematic directories || 1 || 1 || 3 || 0 || Users
|-
| Web1T 5-grams || Word List || Linguistic Data Consortium, University of Pennsylvania; Google Inc. || Data set containing English word n-grams and their observed frequency counts. The n-gram counts were generated from approximately 1 trillion word tokens of text from publicly accessible Web pages || 0 || 1 || 0 || 0 || Users
|-
| Normalized Google Distance (RTE3&RTE4) || Word Pair Co-occurrence || Saarland University || Co-occurrence of the word pairs in RTE3 and RTE4 using Normalized Google Distance (Cilibrasi and Vitanyi, 2004). The word pairs are all the possible combinations of content words in T and H. In practice, we used Yahoo! as the search engine. || 0 || 0 || 1 || 0 || Users
|-
| Normalized Google Distance (RTE5) || Word Pair Co-occurrence || Saarland University || Co-occurrence of the word pairs in RTE3 and RTE4 using Normalized Google Distance (Cilibrasi and Vitanyi, 2004). The word pairs are all the possible combinations of content words in T and H. In practice, we used Yahoo! as the search engine. || 0 || 0 || 1 || 0 || Users
|-
| GNIS - Geographic Names Information System || Gazetteer || USGS (United States Geological Survey) || Database containing the Federal and national standard toponyms for USA, associated areas and Antarctica || 0 || 1 || 0 || 0 || Users
|-
| Geonames || Gazetteer || || Database containing eight million geographical names. It integrates geographical data such as names of places in various languages, elevation, population and others from various sources. || 0 || 1 || 0 || 0 || Users
|-
| Sekine's Paraphrase Database || Collection of paraphrases || Department of Computer Science, New York University || Database created using Sekine's method, not cleaned up by humans. It includes 19,975 sets of paraphrases with 191,572 phrases. || 0 || 0 || 0 || 0 || Users
|-
| Microsoft Research Paraphrase Corpus || Collection of paraphrases || Microsoft Research || Text file containing 5800 pairs of sentences which have been extracted from news sources on the web, along with human annotations indicating whether each pair captures a paraphrase/semantic equivalence relationship. || 0 || 0 || 0 || 0 || Users
|-
| Downward entailing operators || Collection of entailing operators || Department of Computer Science, Cornell University, Ithaca NY || System output of an unsupervised algorithm recovering many Downward Entailing operators, like 'doubt'. || 0 || 0 || 1 || 0 || Users
|-
| WikiRules! || Lexical Reference rule-base || Bar-Ilan University || Extraction of about 8 million lexical reference rules from the text body (first sentence) and from metadata (links, redirects, parentheses) of Wikipedia. Provides better performance than other automatically constructed resources and comparable performance to WordNet. Offers complementary knowledge to WordNet. || 0 || 1 || 1 || 0 || Users
|-
| DART || Collection of "world knowledge" propositions || Boeing Research and Technology || 23 million tuples such as "airplanes can fly to airports", "rivers can flood" collected from abstracted parse trees. || 0 || 0 || 0 || 0 || Users
|-
| FRED || FrameNet-derived entailment rule-base || Bar-Ilan University || This package contains the outputs of the FRED algorithm which extracts entailment rules from FrameNet. || 0 || 0 || 0 || 0 || Users
|-
| DIRECT || Directional Distributional Term-Similarity Resource || Bar-Ilan University || This is a resource of directional distributional term-similarity rules (mostly lexical entailment rules) automatically extracted using the inclusion relation as described in (Kotlerman et al., ACL-09). || 0 || 0 || 0 || 0 || Users
|-
| binaryDIRT || Entailment rules between binary templates using DIRT algorithm || Bar-Ilan University || This resource contains entailment rules over binary templates learned over the Reuters corpus using the DIRT algorithm of Lin and Pantel. || 0 || 0 || 0 || 0 || Users
|-
| unaryBInc || Entailment rules between unary templates using BInc algorithm || Bar-Ilan University || This resource contains entailment rules over unary templates learned over the Reuters corpus using the BInc algorithm of Szpektor and Dagan (2008). || 0 || 0 || 0 || 0 || Users
|-
| New resource || || || Participants are encouraged to contribute || || || || || Users
|}



Not available Resources

The following table lists the unpublished resources used by RTE participants. Some of them were developed by the users themselves specifically for RTE. Interested readers may contact the authors for further information.

{| class="sortable"
! Resource !! Type !! Author !! Brief description !! PAST Users !! RTE4 Users !! RTE5 Users !! RTE6 Users !! class="unsortable" | Usage info
|-
| PARC Polarity Lexicon || Lexical DB || PARC - Palo Alto Research Center || Classification of verbs with respect to semantic polarity || 0 || 1 || 0 || 0 || Users
|-
| Gazetteer from TREC || Gazetteer || NIST - National Institute of Standards and Technology || Cities and other geographical names || 1 || 0 || 0 || 0 || Users
|-
| DFKI Geographic Ontology (to be released) || Ontology || DFKI - German Research Center for Artificial Intelligence || Ontology containing geographic terms and two kinds of relations: the directional part-of relation, and the equal relation for synonyms and abbreviations of the same geographic area (e.g. the United Kingdom, the UK, Great Britain, etc.) || 0 || 1 || 0 || 0 || Users
|-
| Geo || Collection of Entailment Rules || Bar-Ilan University; Tel-Aviv University || Meronymy entailment rules, based on TREC’s TIPSTER gazetteer. || 0 || 0 || 1 || 0 || Users
|-
| Regex || Collection of Entailment Rules || Bar-Ilan University; Tel-Aviv University || Small set of entailment rules based on regular expressions, intended to address lexical variability involving temporal phrases || 0 || 0 || 1 || 0 || Users
|-
| Syntactic rule base (to be released) || Collection of Entailment Rules || Bar-Ilan University; Tel-Aviv University || A manually composed collection of entailment rules which define parse tree transformations. The rules cover generic syntactic phenomena such as appositions, conjunctions, passive, relative clause, etc. (Bar-Haim et al., AAAI-07) || 0 || 1 || 1 || 0 || Users
|-
| Polarity rule base (to be released) || Collection of Entailment Rules || Bar-Ilan University; Tel-Aviv University || A manually composed collection of entailment rules which detect predicates whose polarity is negative (e.g. didn't dance) or unknown (e.g. plans to dance). The rules capture diverse phenomena that affect polarity, e.g. verbal negation, modal verbs, conditionals, and certain verbs that induce negative or "unknown" polarity context. The latter were taken mainly from VerbNet. Extends a resource described in (Bar-Haim et al., AAAI-07) || 0 || 1 || 0 || 0 || Users
|-
| Lexical-Syntactic rule base || Collection of Entailment Rules || Bar-Ilan University; Tel-Aviv University || Lexical-syntactic entailment rules for predicates (verbal and nominal), including argument mapping. The resource is based on WordNet, Nomlex-Plus and Unary DIRT (Szpektor and Dagan, Coling 08) || 0 || 1 || 0 || 0 || Users
|-
| OPENU Collection || Collection of Entailment Rules and Patterns || Open University || Collections of rules, patterns etc. for RTE purposes, extracted from the Reuters corpus parsed using Minipar. || 1 || 0 || 0 || 0 || Users
|-
| Abbr || Collection of rules for abbreviation || Bar-Ilan University; Tel-Aviv University || 2000 abbreviation rules, extracted from BADC and Acronym Guide || 0 || 0 || 1 || 0 || Users
|-
| UAIC Negation_list || Negation rules || „Al. I. Cuza“ University, Iasi, Romania || List of negative terms and words (verbs, adjectives, nouns) affecting modality or factuality of an infinitive verb preceded by the particle "to" (e.g. "believe", "necessary", "attempt") || 0 || 0 || 1 || 0 || Users
|-
| DLSIUAES Negation_list || List of negative terms || University of Alicante || Basic list of negative terms. || 0 || 0 || 1 || 0 || Users
|-
| UAIC Quantifier_list || List of quantifiers || „Al. I. Cuza“ University, Iasi, Romania || List of quantifiers affecting entailment judgment. The quantifiers are taken from a list which contains expressions like “more than”, “less than”, or words such as “over”, “under”, etc. || 0 || 0 || 1 || 0 || Users
|-
| FBKirst StopWord list || List of frequent words || FBK-Irst; University of Trento - Italy || A list of the 572 most frequent English words. || 0 || 0 || 1 || 0 || Users
|-
| IKOMA Dictionary of Named Entities Acronyms and Synonyms || Dictionary of Named Entities Acronyms and Synonyms || IKOMA; NEC Corporation, Takayama, Ikoma, Nara, Japan || Acronym dictionary constructed automatically from the corpus and a synonym dictionary that contains geographical terms. || 0 || 0 || 0 || 1 || Users
|}


Footnotes

<references/>

Return to: Textual_Entailment_Resource_Pool