Paraphrase Identification (State of the art)
Latest revision as of 01:34, 29 November 2016
- source: Microsoft Research Paraphrase Corpus (MSRP)
- task: given a pair of sentences, classify the pair as paraphrase or non-paraphrase
- see: Dolan et al. (2004)
- train: 4,076 sentence pairs (2,753 positive: 67.5%)
- test: 1,725 sentence pairs (1,147 positive: 66.5%)
- see also: Similarity (State of the art)
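The class proportions quoted above can be re-derived from the raw counts. The short sketch below does that arithmetic and also notes what a trivial classifier that always answers "paraphrase" would score on the test split; the variable and function names are illustrative only and do not come from the corpus distribution.

```python
# Counts copied from the corpus description above.
TRAIN_PAIRS, TRAIN_POSITIVE = 4076, 2753
TEST_PAIRS, TEST_POSITIVE = 1725, 1147


def positive_share(positive: int, total: int) -> float:
    """Fraction of sentence pairs labelled as paraphrases."""
    return positive / total


train_share = positive_share(TRAIN_POSITIVE, TRAIN_PAIRS)
test_share = positive_share(TEST_POSITIVE, TEST_PAIRS)

# A classifier that always predicts "paraphrase" is right exactly on the
# positive pairs, so its test accuracy equals the positive share.
always_positive_accuracy = test_share

print(f"train positive share: {train_share:.1%}")
print(f"test positive share:  {test_share:.1%}")
print(f"always-positive test accuracy: {always_positive_accuracy:.1%}")
```

Because roughly two thirds of the pairs are positive, accuracy figures in the mid-60s are close to what this trivial strategy would already achieve.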
Sample data
- Sentence 1: Amrozi accused his brother, whom he called "the witness", of deliberately distorting his evidence.
- Sentence 2: Referring to him as only "the witness", Amrozi accused his brother of deliberately distorting his evidence.
- Class: 1 (true paraphrase)
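To make the task concrete, here is a minimal sketch of a similarity-threshold classifier applied to the sample pair. It is only loosely in the spirit of the vector-based baseline listed in the results table: raw term counts stand in for tf-idf weights (idf would need a background corpus), and the tokenizer, the function names, and the 0.5 threshold are assumptions made for illustration, not values taken from any of the cited papers.

```python
import math
import re
from collections import Counter


def tokenize(sentence: str) -> list[str]:
    """Lowercase the sentence and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the two bag-of-words count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_paraphrase(a: str, b: str, threshold: float = 0.5) -> int:
    """Return 1 (paraphrase) if similarity reaches the assumed threshold."""
    return 1 if cosine_similarity(a, b) >= threshold else 0


s1 = ('Amrozi accused his brother, whom he called "the witness", '
      'of deliberately distorting his evidence.')
s2 = ('Referring to him as only "the witness", Amrozi accused his '
      'brother of deliberately distorting his evidence.')

similarity = cosine_similarity(s1, s2)
label = predict_paraphrase(s1, s2)
print(f"similarity={similarity:.3f} predicted class={label}")
```

The two sentences share most of their words, so the similarity is high and the sketch predicts class 1, matching the gold label. Pairs that overlap lexically without being paraphrases are where such a surface measure breaks down.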
Table of results
- Listed in order of increasing F score.
Algorithm | Reference | Description | Supervision | Accuracy | F |
---|---|---|---|---|---|
Vector Based Similarity (Baseline) | Mihalcea et al. (2006) | cosine similarity with tf-idf weighting | unsupervised | 65.4% | 75.3% |
ESA | Hassan (2011) | explicit semantic space | unsupervised | 67.0% | 79.3% |
KM | Kozareva and Montoyo (2006) | combination of lexical and semantic features | supervised | 76.6% | 79.6% |
LSA | Hassan (2011) | latent semantic space | unsupervised | 68.8% | 79.9% |
RMLMG | Rus et al. (2008) | graph subsumption | unsupervised | 70.6% | 80.5% |
MCS | Mihalcea et al. (2006) | combination of several word similarity measures | unsupervised | 70.3% | 81.3% |
STS | Islam and Inkpen (2007) | combination of semantic and string similarity | unsupervised | 72.6% | 81.3% |
SSA | Hassan (2011) | salient semantic space | unsupervised | 72.5% | 81.4% |
QKC | Qiu et al. (2006) | sentence dissimilarity classification | supervised | 72.0% | 81.6% |
ParaDetect | Zia and Wasif (2012) | PI using semantic heuristic features | supervised | 74.7% | 81.8% |
Vector-based similarity | Milajevs et al. (2014) | Additive composition of vectors and cosine distance | unsupervised | 73.0% | 82.0% |
SDS | Blacoe and Lapata (2012) | simple distributional semantic space | supervised | 73.0% | 82.3% |
matrixJcn | Fernando and Stevenson (2008) | JCN WordNet similarity with matrix | unsupervised | 74.1% | 82.4% |
FHS | Finch et al. (2005) | combination of MT evaluation measures as features | supervised | 75.0% | 82.7% |
PE | Das and Smith (2009) | product of experts | supervised | 76.1% | 82.7% |
WDDP | Wan et al. (2006) | dependency-based features | supervised | 75.6% | 83.0% |
SHPNM | Socher et al. (2011) | recursive autoencoder with dynamic pooling | supervised | 76.8% | 83.6% |
MTMETRICS | Madnani et al. (2012) | combination of eight machine translation metrics | supervised | 77.4% | 84.1% |
L.D.C Model | Wang et al. (2016) | Sentence Similarity Learning by Lexical Decomposition and Composition | supervised | 78.4% | 84.7% |
Multi-Perspective CNN | He et al. (2015) | Multi-perspective Convolutional NNs and structured similarity layer | supervised | 78.6% | 84.7% |
REL-TK | Filice et al. (2015) | Combination of Convolution Kernels and similarity scores | supervised | 79.1% | 85.2% |
SAMS-RecNN | Cheng and Kartsaklis (2015) | Recursive NNs using syntax-aware multi-sense word embeddings | supervised | 78.6% | 85.3% |
TF-KLD | Ji and Eisenstein (2013) | Matrix factorization with supervised reweighting | supervised | 80.4% | 85.9% |
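The two score columns can be computed from gold and predicted labels as sketched below. The page does not spell out the definition of "F"; the sketch assumes it is the balanced F-measure (harmonic mean of precision and recall) on the positive "paraphrase" class, and the function name is illustrative. As a usage example it scores a hypothetical classifier that labels every test pair as a paraphrase, using the test-set counts given at the top of the page.

```python
def accuracy_and_f(gold: list[int], predicted: list[int]) -> tuple[float, float]:
    """Accuracy and positive-class F-measure for binary labels (1 = paraphrase)."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    accuracy = correct / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f_measure


# Test split: 1,725 pairs, of which 1,147 are positive.
gold = [1] * 1147 + [0] * (1725 - 1147)
predicted = [1] * 1725  # hypothetical always-positive classifier

acc, f = accuracy_and_f(gold, predicted)
print(f"accuracy={acc:.1%} F={f:.1%}")
```

Under these assumptions the always-positive strategy reaches about 66.5% accuracy and 79.9% F, which is useful context when reading the lower rows of the table: F is much more forgiving of over-predicting the majority class than accuracy is.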
References
- Listed alphabetically.
Blacoe, W. and Lapata, M. (2012). A comparison of vector-based representations for semantic composition, Proceedings of EMNLP, Jeju Island, Korea, pp. 546-556.
Cheng, J. and Kartsaklis, D. (2015). Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning, Proceedings of EMNLP 2015, Lisbon, Portugal, pp. 1531-1542.
Das, D., and Smith, N. (2009). Paraphrase identification as probabilistic quasi-synchronous recognition, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Suntec, Singapore, pp. 468-476.
Dolan, B., Quirk, C., and Brockett, C. (2004). Unsupervised construction of large paraphrase corpora: Exploiting massively parallel news sources, Proceedings of the 20th international conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, pp. 350-356.
Fernando, S., and Stevenson, M. (2008). A semantic similarity approach to paraphrase detection, Computational Linguistics UK (CLUK 2008) 11th Annual Research Colloquium.
Filice, S., Da San Martino, G., and Moschitti, A. (2015). Structural Representations for Learning Relations between Pairs of Texts, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL 2015), Beijing, China, pp. 1003-1013.
Finch, A., Hwang, Y.S., and Sumita, E. (2005). Using machine translation evaluation techniques to determine sentence-level semantic equivalence, Proceedings of the Third International Workshop on Paraphrasing (IWP 2005), Jeju Island, South Korea, pp. 17-24.
Hassan, S. (2011). Measuring Semantic Relatedness Using Salient Encyclopedic Concepts, Ph.D. dissertation, August 2011.
He, H., Gimpel, K., and Lin, J. (2015). Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks, Proceedings of EMNLP 2015, Lisbon, Portugal, pp. 1576-1586.
Islam, A., and Inkpen, D. (2007). Semantic similarity of short texts, Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2007), Borovets, Bulgaria, pp. 291-297.
Ji, Y. and Eisenstein, J. (2013). Discriminative Improvements to Distributional Sentence Similarity, Proceedings of Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, Washington, USA, pp. 891-896.
Kozareva, Z., and Montoyo, A. (2006). Paraphrase identification on the basis of supervised machine learning techniques, Advances in Natural Language Processing: 5th International Conference on NLP (FinTAL 2006), Turku, Finland, pp. 524-533.
Madnani, N., Tetreault, J., and Chodorow, M. (2012). Re-examining Machine Translation Metrics for Paraphrase Identification, Proceedings of 2012 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), pp. 182-190.
Mihalcea, R., Corley, C., and Strapparava, C. (2006). Corpus-based and knowledge-based measures of text semantic similarity, Proceedings of the National Conference on Artificial Intelligence (AAAI 2006), Boston, Massachusetts, pp. 775-780.
Milajevs, D., Kartsaklis, D., Sadrzadeh, M. and Purver, M. (2014). Evaluating Neural Word Representations in Tensor-Based Compositional Settings, Proceedings of EMNLP 2014, Doha, Qatar, pp. 708–719.
Qiu, L., Kan, M.Y., and Chua, T.S. (2006). Paraphrase recognition via dissimilarity significance classification, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), pp. 18-26.
Rus, V., McCarthy, P.M., Lintean, M.C., McNamara, D.S., and Graesser, A.C. (2008). Paraphrase identification with lexico-syntactic graph subsumption, FLAIRS 2008, pp. 201-206.
Socher, R., Huang, E.H., Pennington, J., Ng, A.Y., and Manning, C.D. (2011). Dynamic pooling and unfolding recursive autoencoders for paraphrase detection, Advances in Neural Information Processing Systems 24.
Wan, S., Dras, M., Dale, R., and Paris, C. (2006). Using dependency-based features to take the "para-farce" out of paraphrase, Proceedings of the Australasian Language Technology Workshop (ALTW 2006), pp. 131-138.
Wang, Z., Mi, H., and Ittycheriah, A. (2016). Sentence Similarity Learning by Lexical Decomposition and Composition, COLING 2016.
Zia Ul-Qayyum and Wasif Altaf (2012). Paraphrase Identification using Semantic Heuristic Features, Research Journal of Applied Sciences, Engineering and Technology, 4(22): 4894-4904.