Parsing (State of the art)
Revision as of 01:52, 25 June 2012
- Performance measure: PARSEVAL - the evalb program
- Training data: sections 2-21 of the Wall Street Journal corpus (Penn Treebank)
- Testing data: section 23 of the Wall Street Journal corpus (Penn Treebank)
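The PARSEVAL measure compares the labeled constituents (brackets) of a parser's output tree with those of the gold-standard tree. The following is a minimal sketch of labeled bracket precision, recall, and F1 in Python, assuming Penn-Treebank-style bracketed strings as input. The function names (`parse_tree`, `labeled_brackets`, `parseval`) are hypothetical, and the sketch does not reproduce evalb's parameter-file behaviour (for example punctuation handling or label equivalences), so its numbers may differ from evalb's.

```python
from collections import Counter


def parse_tree(s):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def labeled_brackets(tree):
    """Count (label, start, end) spans, skipping part-of-speech preterminals."""
    brackets = Counter()

    def walk(node, start):
        label, children = node
        # A preterminal dominates exactly one word and is not counted.
        if len(children) == 1 and isinstance(children[0], str):
            return start + 1
        end = start
        for child in children:
            end = end + 1 if isinstance(child, str) else walk(child, end)
        brackets[(label, start, end)] += 1
        return end

    walk(tree, 0)
    return brackets


def parseval(gold_trees, test_trees):
    """Return (precision, recall, F1) over paired gold and test tree strings."""
    matched = gold_total = test_total = 0
    for gold, test in zip(gold_trees, test_trees):
        gb = labeled_brackets(parse_tree(gold))
        tb = labeled_brackets(parse_tree(test))
        matched += sum((gb & tb).values())  # multiset intersection
        gold_total += sum(gb.values())
        test_total += sum(tb.values())
    precision = matched / test_total if test_total else 0.0
    recall = matched / gold_total if gold_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative trees (not taken from the corpus): the test tree attaches the PP too high.
gold = "(S (NP (DT the) (NN cat)) (VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))"
test = "(S (NP (DT the) (NN cat)) (VP (VBD sat)) (PP (IN on) (NP (DT the) (NN mat))))"
p, r, f = parseval([gold], [test])
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this illustrative pair, four of the five brackets in each tree match, so precision, recall, and F1 all come out at 0.8. Counts are summed over the whole test set before the ratios are taken.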
Table of results
System name | Short description | Main publications | Software | Results (PARSEVAL) | Comments |
---|---|---|---|---|---|
Charniak & Johnson's Parser | Lexicalized N-Best PCFG + Discriminative reranking | Charniak and Johnson (2005) | ? | 91.4% | also works well on Brown |
Self-trained Charniak & Johnson Parser | Above + self-training on ~2 million raw sentences from NANC | McClosky, Charniak, and Johnson (2006) | ? | 92.1% | also works well on Brown |
Collins' Parser | Lexicalized PCFG | Collins (1999), Bikel (2004) | Dan Bikel's implementation | ? | ? |
Berkeley Parser | Automatically induced PCFG | Petrov et al. (2006), Petrov and Klein (2007) | Berkeley Parser | 90.1% | also works well for Chinese and German |
Link Grammar | Dependency grammar | Temperley, Sleator, Lafferty, others (1995-2006) | Actively supported project | ? | Persian, Arabic, Chinese, German, Russian dictionaries have been developed. |
References
Bikel, D. (2004). On The Parameter Space of Generative Lexicalized Statistical Parsing Models. PhD Thesis, Computer and Information Science, University of Pennsylvania.
Collins, M. (1999). Head-driven Statistical Models for Natural Language Parsing. PhD Thesis, Computer and Information Science, University of Pennsylvania.
Charniak, E. and Johnson, M. (2005). Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. Proceedings of the 43rd Annual Meeting of the ACL, pages 173–180, Ann Arbor, June 2005.
McClosky, D., Charniak, E., and Johnson, M. (2006). Effective Self-Training for Parsing. Proceedings of HLT/NAACL 2006, pages 152–159, New York City, USA, June 2006.
Petrov, S., Barrett, L., Thibaux, R., and Klein, D. (2006). Learning accurate, compact, and interpretable tree annotation. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, pages 433–440, Sydney.
Petrov, S., and Klein, D. (2007). Improved inference for unlexicalized parsing. Proceedings of NAACL 2007, pages 404–411.
Sleator, D., and Temperley, D. (1993). Parsing English with a Link Grammar. Third International Workshop on Parsing Technologies.
External links
- PARSEVAL - the evalb program