POS Tagging (State of the art)
- Performance measure: per-token accuracy
- Training data: sections 0-18 of Wall Street Journal corpus
- Testing data: sections 22-24 of Wall Street Journal corpus
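The performance measure above can be sketched as follows. This is a minimal illustration, assuming accuracy is pooled over all tokens in the test set (not averaged per sentence) and that gold and predicted tags are already aligned token by token. The function name `per_token_accuracy` and the toy tag sequences are hypothetical and are not taken from any of the systems listed below.

```python
from typing import Sequence


def per_token_accuracy(
    gold: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag.

    Assumption: counts are pooled over all sentences (micro-average).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_tags, pred_tags in zip(gold, predicted):
        if len(gold_tags) != len(pred_tags):
            raise ValueError("each sentence needs one predicted tag per gold tag")
        # Count positions where the predicted tag matches the gold tag.
        correct += sum(g == p for g, p in zip(gold_tags, pred_tags))
        total += len(gold_tags)

    if total == 0:
        raise ValueError("no tokens to evaluate")
    return correct / total


# Toy data invented for illustration (not from the Wall Street Journal corpus).
gold = [["DT", "NN", "VBZ"], ["PRP", "VBD", "DT", "NN"]]
predicted = [["DT", "NN", "VBZ"], ["PRP", "VBN", "DT", "NN"]]

accuracy = per_token_accuracy(gold, predicted)
print(f"{accuracy:.2%}")  # 6 of 7 tokens correct -> 85.71%
```

Pooling over tokens means long sentences contribute more to the score than short ones, which is the usual reading of "per-token" accuracy.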
Table of results
System name | Short description | Main publications | Software | Results |
---|---|---|---|---|
SVMTool | SVM-based tagger and tagger generator | Giménez and Márquez (2004) | SVMTool | 97.16% |
Stanford Tagger | learning with cyclic dependency network | Toutanova et al. (2003) | Stanford Tagger | 97.24% |
POS tagger | bidirectional perceptron learning | Shen et al. (2007) | POS tagger | 97.33% |
GENIA Tagger | ? | Tsuruoka et al. (2005) | GENIA | 96.94% on WSJ, 98.26% on biomedical text |
References
- Giménez, J., and Márquez, L. (2004). SVMTool: A general POS tagger generator based on Support Vector Machines. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC'04). Lisbon, Portugal.
- Shen, L., Satta, G., and Joshi, A. (2007). Guided learning for bidirectional sequence classification. Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics (ACL 2007), pages 760-767.
- Toutanova, K., Klein, D., Manning, C. D., and Singer, Y. (2003). Feature-rich part-of-speech tagging with a cyclic dependency network. Proceedings of HLT-NAACL 2003, pages 252-259.
- Tsuruoka, Y., Tateishi, Y., Kim, J.-D., Ohta, T., McNaught, J., Ananiadou, S., and Tsujii, J. (2005). Developing a robust part-of-speech tagger for biomedical text. Advances in Informatics - 10th Panhellenic Conference on Informatics, LNCS 3746, pages 382-392.
- Tsuruoka, Y., and Tsujii, J. (2005). Bidirectional inference with the easiest-first strategy for tagging sequence data. Proceedings of HLT/EMNLP 2005, pages 467-474.