Difference between revisions of "Textual Entailment Resource Pool"

Revision as of 08:43, 18 June 2010

Textual entailment systems rely on many different types of NLP resources, including term banks, paraphrase lists, parsers, and named-entity recognizers. With so many resources being continuously released and improved, it can be difficult to know which particular resource to use when developing a system.

In response, the Recognizing Textual Entailment (RTE) shared task community initiated a new activity for building this Textual Entailment Resource Pool. RTE participants and all other members of the NLP community are encouraged to contribute to the pool.

In an effort to determine the relative impact of the resources, RTE participants are strongly encouraged to report, whenever possible, the contribution of each resource they use to overall system performance. Formal qualitative and quantitative results should be included in a separate section of the system report and posted on the talk pages of this Textual Entailment Resource Pool.

Adding a new resource is easy: see Help:Using Templates for how to use the existing templates.

Complete RTE Systems

VENSES (from Ca' Foscari University of Venice, Italy)
Nutcracker (available for download)
Entailment Demo (from the University of Illinois at Urbana-Champaign)
EDITS - Edit Distance Textual Entailment Suite (open source software developed by the Human Language Technology (HLT) group at FBK-Irst)

RTE data sets

FrameNet manually annotated RTE 2006 Test Set. Provided by the SALSA project, Saarland University.
Manually Word Aligned RTE 2006 Data Sets. Provided by the Natural Language Processing Group, Microsoft Research.
RTE data sets annotated for a 3-way decision: entails, contradicts, unknown. Provided by the Stanford NLP Group.
BPI RTE data set - 250 pairs, focusing on world knowledge. Provided jointly by Boeing, Princeton, and ISI.
Textual Entailment Specialized Data Sets - 90 RTE-5 Test Set pairs annotated with linguistic phenomena + 203 monothematic pairs (i.e. pairs where only one linguistic phenomenon is relevant to the entailment relation) created from the 90 annotated pairs. Provided jointly by FBK-Irst and CELCT.
RTE-5 Search Pilot Data Set annotated with anaphora and coreference information - RTE-5 Search Data Set annotated with anaphora/coreference information + Augmented RTE-5 Search Data Set, where all the referring expressions which need to be resolved in the entailing sentences are substituted by explicit expressions on the basis of the anaphora/coreference annotation. Provided by CELCT and distributed by NIST at the Past TAC Data web page (2009 Search Pilot, annotated test/dev data).
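
As an illustration of the 3-way decision mentioned in the list above (entails, contradicts, unknown), the labels could be modelled as a small enumeration. This is only a sketch: the names `ThreeWay` and `to_two_way` are hypothetical and do not come from any of the listed data sets, and the collapse rule assumes that both "contradicts" and "unknown" count as "no entailment" when a system only makes a 2-way decision.

```python
from enum import Enum


class ThreeWay(Enum):
    """The three labels of the 3-way decision (hypothetical naming)."""
    ENTAILS = "entails"
    CONTRADICTS = "contradicts"
    UNKNOWN = "unknown"


def to_two_way(label: ThreeWay) -> bool:
    """Collapse a 3-way label to a 2-way one.

    Assumption: only ENTAILS maps to True; CONTRADICTS and UNKNOWN
    are both treated as "no entailment".
    """
    return label is ThreeWay.ENTAILS


# Made-up gold labels, used only to show the conversion.
gold = [ThreeWay.ENTAILS, ThreeWay.UNKNOWN, ThreeWay.CONTRADICTS]
two_way_gold = [to_two_way(label) for label in gold]
```

Keeping the 3-way label as the stored value and deriving the 2-way label on demand means no information is lost when the same annotations are reused for both kinds of evaluation.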

Knowledge Resources

The RTE Knowledge Resources page presents:

a call for resources, inviting system developers to share the resources used by their own TE engines, to both help improve the TE technology and further test and evaluate such resources;
the ablation tests carried out in the RTE challenges in order to evaluate the impact of knowledge resources and tools on TE system performance;
lists of knowledge resources, both publicly available and unpublished, used by systems participating in the last RTE challenges.
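
The idea behind an ablation test can be sketched in a few lines: run the complete system, run it again with one resource removed, and report the difference. The function name, the resource names, and the accuracy figures below are all hypothetical placeholders, not results from any RTE challenge.

```python
def ablation_impact(full_accuracy, ablated_accuracy):
    """Return, per resource, the accuracy lost when that resource is removed.

    full_accuracy    -- accuracy of the complete system
    ablated_accuracy -- dict mapping a resource name to the accuracy of the
                        system re-run without that resource
    A positive value means the resource helped; a negative value means the
    system did better without it.
    """
    return {
        resource: full_accuracy - accuracy
        for resource, accuracy in ablated_accuracy.items()
    }


# Placeholder numbers for illustration only.
impact = ablation_impact(
    full_accuracy=0.625,
    ablated_accuracy={"resource_a": 0.5, "resource_b": 0.75},
)
for resource, delta in sorted(impact.items()):
    print(f"{resource}: {delta:+.3f}")
```

Reporting the signed difference rather than only the ablated score makes it easy to see at a glance which resources contribute and which ones hurt.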

Tools

Parsers

C&C parser for Combinatory Categorial Grammar
Minipar
Shallow Parser - from the University of Illinois at Urbana-Champaign, see a web demo of this tool

Role Labelling

ASSERT
Shalmaneser
Semantic Role Labeler - from the University of Illinois at Urbana-Champaign, see a web demo of this tool

Entity Recognition Tools

Illinois Named Entity Tagger - see a web demo of this tool
Illinois Multi-lingual Named Entity Discovery Tool - see a web demo of this tool

Similarity / Relatedness Tools

UKB: Open source WordNet-based similarity/relatedness tool; also includes pre-computed semantic vectors for all words

Corpus Readers

NLTK provides a corpus reader for the data from RTE Challenges 1, 2, and 3 - see the Corpus Readers Guide for more information.
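
To give a feel for what such a corpus reader does, here is a minimal standard-library sketch that parses text/hypothesis pairs from XML. It is not NLTK's implementation or API (the NLTK reader itself is reached through `nltk.corpus.rte` once the data is installed; see the Corpus Readers Guide). The sample document is hand-written, and its layout (`pair` elements with `t` and `h` children and a `value` or `entailment` attribute) is an assumption about the challenge files; `Pair` and `read_pairs` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hand-written sample in the assumed layout; not taken from the RTE data.
SAMPLE = """<entailment-corpus>
  <pair id="1" value="TRUE" task="IE">
    <t>The cat sat on the mat.</t>
    <h>A cat is on a mat.</h>
  </pair>
  <pair id="2" entailment="NO" task="QA">
    <t>The meeting was held in Venice.</t>
    <h>The meeting was held in Trento.</h>
  </pair>
</entailment-corpus>"""


@dataclass
class Pair:
    id: str
    task: str
    text: str
    hyp: str
    entails: bool


def read_pairs(xml_text):
    """Parse every <pair> element into a Pair record."""
    root = ET.fromstring(xml_text)
    pairs = []
    for node in root.iter("pair"):
        # Assumption: the gold label is stored either as value="TRUE"/"FALSE"
        # or as entailment="YES"/"NO", depending on the file.
        raw = node.get("value") or node.get("entailment") or ""
        pairs.append(Pair(
            id=node.get("id", ""),
            task=node.get("task", ""),
            text=(node.findtext("t") or "").strip(),
            hyp=(node.findtext("h") or "").strip(),
            entails=raw.upper() in {"TRUE", "YES"},
        ))
    return pairs


pairs = read_pairs(SAMPLE)
```

Normalising the label to a boolean at read time lets downstream code ignore which attribute name a particular file happens to use.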

Related Libraries

PyPES - general-purpose library containing an evaluation environment for RTE and the McPIET text inference engine, which is based on the ERG (English Resource Grammar)

Links

@@ Line 23: / Line 23: @@
 == Knowledge Resources ==
-The [[RTE Knowledge Resources]] page presents a list of knowledge resources that have been used in the RTE challanges. It also gives access to the [[RTE Knowledge Resources#Ablation tests|'''Ablation tests''']] carried out in RTE-5 and to [[RTE Knowledge Resources#Call for Resources|'''the RTE-6 Call for Resources''']].
+The [[RTE Knowledge Resources]] page presents:
-The content of the page is as follows:
-* [[RTE Knowledge Resources#Call for Resources|Call for Resources]]
-* [[RTE Knowledge Resources#Ablation tests|Ablation tests]]
-* [[RTE Knowledge Resources#Publicly available Resources|Publicly available resources]]
-* [[RTE Knowledge Resources#Not available Resources|Not available resources]]
+* a [[RTE Knowledge Resources#Call for Resources|call for resources]], inviting system developers to share the resources used by their own TE engines, to both help improve the TE technology and further test and evaluate such resources;
+* [[RTE Knowledge Resources#Ablation tests|the ablation tests]] carried out in the RTE challenges in order to evaluate the impact of knowledge resources and tools on TE system performance;
+* [[RTE Knowledge Resources#Publicly available Resources|lists of knowledge resources]], both publicly available and unpublished, used by systems participating in the last RTE challenges.
 == Tools ==