Difference between revisions of "Paraphrase Identification (State of the art)"

Revision as of 13:00, 24 March 2009

Microsoft Research Paraphrase Corpus (MSRP)
see Dolan, Quirk, and Brockett (2004)
train: 4,076 sentence pairs (2,753 positive: 67.5%)
test: 1,725 sentence pairs (1,147 positive: 66.5%)
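
The class proportions above imply a trivial reference point for the results table. The following is a minimal sketch, not part of the corpus distribution: it recomputes the positive rates from the counts listed above and derives the accuracy and F-measure of a hypothetical classifier that labels every test pair as a paraphrase. The function names are illustrative, and the F-measure is assumed to be the balanced F1 on the positive class.

```python
def positive_rate(positives, total):
    """Fraction of sentence pairs labelled as true paraphrases."""
    return positives / total


def all_positive_baseline(positives, total):
    """Accuracy and F1 of a classifier that predicts 'paraphrase' for every pair.

    Assumption: F is the balanced F1 computed on the positive class.
    """
    precision = positives / total  # every pair is predicted positive
    recall = 1.0                   # so no true paraphrase is missed
    accuracy = precision           # correct exactly on the positive pairs
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Counts taken from the corpus description above.
train_rate = positive_rate(2753, 4076)
test_acc, test_f1 = all_positive_baseline(1147, 1725)

print(f"train positive rate: {train_rate:.1%}")
print(f"test all-positive baseline: accuracy {test_acc:.1%}, F1 {test_f1:.1%}")
```

Under these assumptions the all-positive baseline reaches about 66.5% accuracy and 79.9% F1 on the test set, which is useful context when reading the reported scores.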

Sample data

Sentence 1: Amrozi accused his brother, whom he called "the witness", of deliberately distorting his evidence.
Sentence 2: Referring to him as only "the witness", Amrozi accused his brother of deliberately distorting his evidence.
Class: 1 (true paraphrase)

Table of results

Algorithm	Reference	Description	Accuracy	F-measure
MCS	Mihalcea et al. (2006)	unsupervised combination of several word similarity measures	70.3%	81.3%
WDDP	Wan et al. (2006)	supervised dependency-based features	75.0%	73.0%
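
For readers who want to score their own system in the same terms, the sketch below shows one common way to compute accuracy and F-measure from gold labels and predictions. It assumes the F column is the balanced F1 on the positive (paraphrase) class; the table itself does not state this, and the function name and the toy labels are hypothetical, not drawn from the corpus.

```python
def accuracy_and_f1(gold, predicted):
    """Return (accuracy, F1 on the positive class) for binary labels 0/1."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    correct = sum(1 for g, p in pairs if g == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Toy labels for illustration only (1 = paraphrase, 0 = not a paraphrase).
gold = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]

acc, f1 = accuracy_and_f1(gold, predicted)
print(f"accuracy {acc:.1%}, F1 {f1:.1%}")
```

Because roughly two thirds of the pairs are positive, F1 on the positive class can exceed accuracy by a wide margin, so the two columns are best read together.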

References

Dolan, B., Quirk, C., and Brockett, C. (2004). [http://acl.ldc.upenn.edu/C/C04/C04-1051.pdf Unsupervised construction of large paraphrase corpora: Exploiting massively parallel news sources], Proceedings of the 20th international conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, pp. 350-356.

Mihalcea, R., Corley, C., and Strapparava, C. (2006). [http://reference.kfupm.edu.sa/content/c/o/corpus_based_and_knowledge_based_measure_3759629.pdf Corpus-based and knowledge-based measures of text semantic similarity], Proceedings of the National Conference on Artificial Intelligence (AAAI 2006), Boston, Massachusetts, pp. 775-780.

Wan, S., Dras, M., Dale, R., and Paris, C. (2006). [http://www.alta.asn.au/events/altw2006/proceedings/swan-final.pdf Using dependency-based features to take the "para-farce" out of paraphrase], Proceedings of the Australasian Language Technology Workshop (ALTW 2006), pp. 131-138.

@@ Line 27: / Line 27: @@
 | 70.3%
 | 81.3%
+|-
+| WDDP
+| Wan et al. (2006)
+| supervised dependency-based features
+| 75.0%
+| 73.0%
 |-
 |}
@@ Line 35: / Line 41: @@
 Exploiting massively parallel news sources], ''Proceedings of the 20th international conference on Computational Linguistics (COLING 2004)'', Geneva, Switzerland, pp. 350-356.
-Mihalcea, R., Corley, C., and Strapparava, C. (2006). Corpus-based and knowledge-based measures of text semantic similarity, ''Proceedings of the National Conference on Artificial Intelligence (AAAI 2006)'', Boston, Massachusetts, pp. 775-780.
+Mihalcea, R., Corley, C., and Strapparava, C. (2006). [http://reference.kfupm.edu.sa/content/c/o/corpus_based_and_knowledge_based_measure_3759629.pdf Corpus-based and knowledge-based measures of text semantic similarity], ''Proceedings of the National Conference on Artificial Intelligence (AAAI 2006)'', Boston, Massachusetts, pp. 775-780.
+Wan, S., Dras, M., Dale, R., and Paris, C. (2006). [http://www.alta.asn.au/events/altw2006/proceedings/swan-final.pdf Using dependency-based features to take the "para-farce" out of paraphrase], ''Proceedings of the Australasian Language Technology Workshop (ALTW 2006)'', pp. 131-138.
