NP Chunking (State of the art)
From ACLWiki
Revision as of 12:45, 27 June 2007
- Performance measure: F = 2 * Precision * Recall / (Recall + Precision)
- Precision: percentage of NPs found by the algorithm that are correct
- Recall: percentage of NPs defined in the corpus that were found by the chunking program
- Training data: sections 15-18 of the Wall Street Journal corpus (Ramshaw and Marcus)
- Testing data: section 20 of the Wall Street Journal corpus (Ramshaw and Marcus)
- original data of the NP chunking experiments by Lance Ramshaw and Mitch Marcus
- data contains one word per line; each line has six fields, of which only the first three are relevant: the word, the part-of-speech tag assigned by the Brill tagger, and the correct IOB tag
- dataset is available from ftp://ftp.cis.upenn.edu/pub/chunker/
- more information is available from http://ifarm.nl/erikt/research/np-chunking.html
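
The performance measure and data format described above can be illustrated with a minimal sketch. It is not the official evaluation script. It assumes whitespace-separated fields, blank lines between sentences, and tags whose first character is `I`, `O`, or `B`; the function names `read_tags`, `np_chunks`, and `precision_recall_f` are hypothetical. An NP counts as correct only if both of its boundaries match the corpus annotation.

```python
from typing import Iterable, List, Set, Tuple

Chunk = Tuple[int, int]  # (start, end) token offsets, end exclusive


def read_tags(lines: Iterable[str]) -> List[str]:
    """Collect the IOB tag (third field) of every token line.

    Assumption: blank lines separate sentences. They are mapped to "O"
    so that no chunk can continue across a sentence boundary.
    """
    tags = []
    for line in lines:
        fields = line.split()
        if not fields:
            tags.append("O")
        else:
            tags.append(fields[2])  # fields[0] = word, fields[1] = POS tag
    return tags


def np_chunks(tags: List[str]) -> Set[Chunk]:
    """Return the set of NP spans encoded by a tag sequence.

    A chunk starts at "B", or at "I" when no chunk is currently open.
    This handles data where "B" marks every chunk start as well as data
    where "B" only marks an NP that directly follows another NP.
    """
    chunks = set()
    start = None
    for i, tag in enumerate(tags):
        kind = tag[:1]  # only the first character is inspected
        if kind in ("B", "O") and start is not None:
            chunks.add((start, i))  # close the currently open chunk
            start = None
        if kind == "B" or (kind == "I" and start is None):
            start = i  # open a new chunk
    if start is not None:
        chunks.add((start, len(tags)))
    return chunks


def precision_recall_f(gold: List[str], predicted: List[str]):
    """Chunk-level precision, recall and F = 2PR / (P + R)."""
    gold_chunks = np_chunks(gold)
    pred_chunks = np_chunks(predicted)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


if __name__ == "__main__":
    gold = ["I", "I", "O", "I", "B", "I", "O"]
    predicted = ["I", "I", "O", "I", "I", "I", "O"]
    print(precision_recall_f(gold, predicted))
```

In the usage example, the gold tags encode three NPs and the predicted tags encode two, of which one matches exactly. Precision is therefore 1/2, recall is 1/3, and F is 0.4. The functions return fractions; multiply by 100 to compare with the percentages in the table.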
| System name | Short description | Main publications | Software | Results (F) |
|---|---|---|---|---|
| KM00 | B-I-O tagging using SVM classifiers with polynomial kernel | Kudo and Matsumoto (2000) | YAMCHA Toolkit, http://chasen.org/~taku/software/yamcha/ (models are not provided) | 93.79 |
| KM01 | Learning as in KM00, but with voting between different representations | Kudo and Matsumoto (2001) | No | 94.22 |
| SS05 | Specialized HMM with voting between different representations | Shen and Sarkar (2005) | No | 95.23 |
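KM00 - Taku Kudo and Yuji Matsumoto. 2000. Use of Support Vector Learning for Chunk Identification. In Proceedings of CoNLL-2000 and LLL-2000. http://citeseer.comp.nus.edu.sg/rd/0%2C394415%2C1%2C0.25%2CDownload/http://citeseer.comp.nus.edu.sg/cache/papers/cs/18905/http:zSzzSzlcg-www.uia.ac.bezSzconll2000zSzpszSz14244kud.pdf/kudoh00use.pdf
KM01 - Taku Kudo and Yuji Matsumoto. 2001. Chunking with Support Vector Machines. In Proceedings of NAACL-2001. http://cactus.aist-nara.ac.jp/~taku-ku/publications/naacl2001.pdf
SS05 - Hong Shen and Anoop Sarkar. 2005. Voting between Multiple Data Representations for Text Chunking. In Proceedings of the Eighteenth Meeting of the Canadian Society for Computational Intelligence, Canadian AI 2005. http://www.cs.sfu.ca/~anoop/papers/pdf/ai05.pdf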