Difference between revisions of "Textual Entailment Resource Pool"

From ACL Wiki
(→‎RTE data sets: Adding MultiNLI)
 
(4 intermediate revisions by 3 users not shown)
Line 30: Line 30:
 
* [http://www.nist.gov/tac/2011/RTE/index.html RTE7 dataset] - provided by [http://www.nist.gov/index.html NIST] - freely available upon request. For details see [http://www.nist.gov/tac/data/forms/index.html TAC User Agreements]
 
 
* [http://www.cs.york.ac.uk/semeval-2013/task7/  The Joint Student Response Analysis and 8th Recognizing Textual Entailment Challenge] at SemEval 2013
 
 
+
* [http://www.nyu.edu/projects/bowman/multinli/ The MultiGenre NLI Corpus] (433k examples, used in the [https://repeval2017.github.io/shared/ RepEval 2017 Shared Task])
  
 
=== RTE data sets translated in other languages ===
 
 
* [http://www.dfki.de/~neumann/resources/RTE3_DE_V1.2_2013-12-02.zip RTE3 dataset translated in German] - provided by [https://sites.google.com/site/excitementproject/ EXCITEMENT]
 
* [http://www.excitement-project.eu/attachments/article/97/RTE3-ITA_V1_2012-10-04.zip RTE3 dataset translated in Italian] - provided by [https://sites.google.com/site/excitementproject/ EXCITEMENT]
+
* [https://sites.google.com/site/excitementproject/results/RTE3-ITA_V1_2012-10-04.zip RTE3 dataset translated in Italian] - provided by [https://sites.google.com/site/excitementproject/ EXCITEMENT]
  
  
 
=== Other data sets ===
 
 +
* [http://nlp.stanford.edu/projects/snli The Stanford Natural Language Inference (SNLI) corpus], a 570k example manually-annotated TE dataset with accompanying leaderboard.
 
* [http://www.coli.uni-saarland.de/projects/salsa/fate FrameNet manually annotated RTE 2006 Test Set.] Provided by  [http://www.coli.uni-saarland.de/projects/salsa/ SALSA project, Saarland University.]
 
 
* [http://www.cs.biu.ac.il/~nlp/files/RTE_2006_Aligned.zip Manually Word Aligned RTE 2006 Data Sets.] Provided by  [http://research.microsoft.com/nlp/ the Natural Language Processing Group, Microsoft Research.]
 
Line 52: Line 53:
 
* [http://www.cs.york.ac.uk/semeval-2012/task8/ Cross-Lingual Textual Entailment for Content Synchronization] The Cross-Lingual Textual Entailment task at [http://www.cs.york.ac.uk/semeval-2012/ SemEval 2012].
 
* [http://www.cs.york.ac.uk/semeval-2013/task8/ Cross-Lingual Textual Entailment for Content Synchronization] The Cross-Lingual Textual Entailment task at [http://www.cs.york.ac.uk/semeval-2013/ SemEval 2013].
 +
* [http://nilc.icmc.usp.br/assin/ ASSIN] a shared task on TE for Portuguese with 10,000 pairs.
  
 
== Knowledge Resources ==
 

Latest revision as of 08:31, 29 May 2017



Textual entailment systems rely on many different types of NLP resources, including term banks, paraphrase lists, parsers, and named-entity recognizers. With so many resources being continuously released and improved, it can be difficult to know which one to use when developing a system.

In response, the Recognizing Textual Entailment (RTE) shared task community started building this Textual Entailment Resource Pool. RTE participants and all other members of the NLP community are encouraged to contribute to the pool.

To help determine the relative impact of the resources, RTE participants are strongly encouraged to report, whenever possible, how much each resource they use contributes to overall performance. Formal qualitative and quantitative results should be included in a separate section of the system report and posted on the talk pages of this Textual Entailment Resource Pool.

Adding a new resource is easy: Help:Using Templates explains how to do it with the existing templates.

Complete RTE Systems

RTE data sets

Past campaigns data sets

RTE data sets translated in other languages


Other data sets

Knowledge Resources

The RTE Knowledge Resources page presents:

  • a call for resources, inviting system developers to share the resources used by their own TE engines, both to help improve TE technology and to further test and evaluate such resources;
  • the ablation tests carried out in the RTE challenges in order to evaluate the impact of knowledge resources and tools on TE system performance;
  • lists of knowledge resources, both publicly available and unpublished, used by systems participating in the last RTE challenges.
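The arithmetic behind an ablation test can be sketched as follows: score the full system, score it again with one resource removed, and report the difference. This is a minimal illustration, not the official RTE evaluation procedure. The function names, the resource names, and the toy labels are all hypothetical.

```python
from typing import Dict, List


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of pairs whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def ablation_impact(
    gold: List[str],
    full_run: List[str],
    ablated_runs: Dict[str, List[str]],
) -> Dict[str, float]:
    """Accuracy of the full system minus accuracy without each resource.

    A positive value means the resource helped; a negative value means
    the system scored higher without it.
    """
    full_acc = accuracy(gold, full_run)
    return {
        resource: full_acc - accuracy(gold, run)
        for resource, run in ablated_runs.items()
    }


# Toy, invented labels (not real RTE results), only to show the calculation.
gold = ["YES", "NO", "YES", "NO"]
full_run = ["YES", "NO", "YES", "YES"]          # 3 of 4 correct
ablated_runs = {
    "resource_a": ["YES", "NO", "NO", "YES"],   # 2 of 4 correct without it
    "resource_b": ["YES", "NO", "YES", "NO"],   # 4 of 4 correct without it
}
impact = ablation_impact(gold, full_run, ablated_runs)
```

Reporting the signed difference per resource, rather than only the full-system score, shows which resources hurt as well as which ones help.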

Projects

Tools

Parsers

Role Labelling

Entity Recognition Tools

Similarity / Relatedness Tools

  • UKB: an open-source, WordNet-based similarity/relatedness tool that also includes pre-computed semantic vectors for all words

Corpus Readers

  • NLTK provides a corpus reader for the data from RTE Challenges 1, 2, and 3 - see the Corpus Readers Guide for more information.
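To show what a corpus reader for this kind of data does, here is a minimal standard-library sketch. It is not NLTK's implementation. It assumes an XML layout of `<pair>` elements with `<t>` (text) and `<h>` (hypothesis) children and a label stored in either an `entailment` or a `value` attribute. The sample pairs are invented for illustration, and the names `Pair` and `read_pairs` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple


class Pair(NamedTuple):
    pair_id: str
    text: str
    hypothesis: str
    entails: bool


# Invented sample in the assumed layout; not taken from any RTE data set.
SAMPLE = """
<entailment-corpus>
  <pair id="1" entailment="YES" task="IE">
    <t>A cat sat on the mat in the kitchen.</t>
    <h>A cat sat on a mat.</h>
  </pair>
  <pair id="2" value="FALSE" task="IR">
    <t>The shop opens at nine.</t>
    <h>The shop is closed all day.</h>
  </pair>
</entailment-corpus>
"""

# Label strings treated as a positive entailment judgement (assumption).
POSITIVE = {"YES", "TRUE"}


def read_pairs(xml_text: str) -> List[Pair]:
    """Parse text/hypothesis pairs from an XML string in the assumed layout."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for node in root.iter("pair"):
        label = node.get("entailment") or node.get("value") or ""
        pairs.append(
            Pair(
                pair_id=node.get("id", ""),
                text=(node.findtext("t") or "").strip(),
                hypothesis=(node.findtext("h") or "").strip(),
                entails=label.upper() in POSITIVE,
            )
        )
    return pairs


pairs = read_pairs(SAMPLE)
```

For the actual challenge data, the NLTK reader described in the Corpus Readers Guide is the better starting point; the sketch only makes the pair structure concrete.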

Related Libraries

  • PyPES: a general-purpose library containing an evaluation environment for RTE and McPIET, a text inference engine based on the ERG (English Resource Grammar)

Text Normalizers

  • Java number normalizer (Beta): a tool for converting textual representations of numbers to a standard numerical string.
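The kind of conversion such a normalizer performs can be sketched in a few lines. The following is an independent illustration in Python, not the Java tool's code or API. It assumes English cardinal numbers below one billion, and the function name `normalize_number` is hypothetical.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def normalize_number(phrase: str) -> str:
    """Convert an English number phrase to a string of digits."""
    tokens = phrase.lower().replace("-", " ").split()
    total = 0        # sum of completed groups (thousands, millions)
    current = 0      # value of the group currently being built
    seen_number = False
    for token in tokens:
        if token == "and":
            continue  # "three hundred and five": the connective adds nothing
        if token in UNITS:
            current += UNITS[token]
        elif token in TENS:
            current += TENS[token]
        elif token == "hundred":
            # A bare "hundred" is read as "one hundred".
            current = max(current, 1) * 100
        elif token in SCALES:
            total += max(current, 1) * SCALES[token]
            current = 0
        else:
            raise ValueError(f"not a number word: {token!r}")
        seen_number = True
    if not seen_number:
        raise ValueError("no number words found")
    return str(total + current)
```

For example, `normalize_number("two thousand seventeen")` yields `"2017"`, so a text and a hypothesis that spell the same quantity differently can be compared as identical strings.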

References

Links