Difference between revisions of "NP Chunking (State of the art)"

From ACL Wiki

Revision as of 11:03, 27 June 2007

  • Performance measure: F = 2 * Precision * Recall / (Recall + Precision)
  • Precision: percentage of NPs found by the algorithm that are correct
  • Recall: percentage of NPs defined in the corpus that were found by the chunking program
  • Training data: sections 15-18 of the Wall Street Journal corpus (Ramshaw and Marcus)
  • Testing data: section 20 of the Wall Street Journal corpus (Ramshaw and Marcus)
  • Original data: the data of the NP chunking experiments by Lance Ramshaw and Mitch Marcus
  • Data format: one word per line; each line contains six fields, of which only the first three are relevant: the word, the part-of-speech tag assigned by the Brill tagger, and the correct IOB tag
  • The dataset is available from ftp://ftp.cis.upenn.edu/pub/chunker/
  • More information is available from NP Chunking
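
The evaluation described above can be illustrated with a minimal Python sketch. It is not the official scoring script. It assumes that fields are whitespace-separated, that the IOB tag begins with the letter I, O, or B, and that an NP counts as correct only when its start and end both match the corpus annotation exactly. The function names (parse_line, np_spans, f_score) are hypothetical and chosen only for this example.

```python
from typing import Iterable, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def parse_line(line: str) -> Tuple[str, str, str]:
    """Return (word, POS tag, IOB tag): the first three fields of a data line.

    Assumption: fields are separated by whitespace.
    """
    word, pos, iob = line.split()[:3]
    return word, pos, iob


def np_spans(tags: Iterable[str]) -> Set[Span]:
    """Collect NP spans from one sequence of IOB tags.

    A chunk starts at a B tag, or at an I tag that follows O or begins the
    sequence; it ends before the next B or O tag, or at the end of the sequence.
    """
    tag_list = list(tags)
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tag_list):
        kind = tag[:1]  # assumption: the tag starts with I, O, or B
        if kind in ("B", "O") and start is not None:
            spans.add((start, i))  # close the chunk that was open
            start = None
        if kind == "B" or (kind == "I" and start is None):
            start = i  # open a new chunk
    if start is not None:
        spans.add((start, len(tag_list)))
    return spans


def f_score(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) as fractions, using exact span matches."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (recall + precision)
    return precision, recall, f


# Made-up tag sequences for illustration only (not taken from the corpus).
gold_tags = ["I", "I", "O", "I", "B", "I", "O"]  # three NPs
pred_tags = ["I", "I", "O", "I", "I", "I", "O"]  # two NPs, one of them correct
precision, recall, f = f_score(np_spans(gold_tags), np_spans(pred_tags))
print(f"P={precision:.2%} R={recall:.2%} F={f:.2%}")
```

In this made-up example the program finds two NPs, of which one matches one of the three NPs in the reference tags, so precision is 1/2, recall is 1/3, and F is 0.4.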


System name | Short description | Main publications | Software | Results (F)
KM00 | B-I-O tagging using SVM classifiers with polynomial kernel | Kudo and Matsumoto (2000) | YAMCHA Toolkit, http://chasen.org/~taku/software/yamcha/ (but models are not provided) | 93.79%
KM01 | learning as in KM00, but voting between different representations | Kudo and Matsumoto (2001) | No | 94.22%
SS05 | specialized HMM + voting between different representations | Shen and Sarkar (2005) | No | 95.23%


Kudo, T., and Matsumoto, Y. (2000). Use of support vector learning for chunk identification. Proceedings of the 4th Conference on CoNLL-2000 and LLL-2000, pages 142-144, Lisbon, Portugal.

Kudo, T., and Matsumoto, Y. (2001). Chunking with support vector machines. Proceedings of NAACL-2001.

Shen, H., and Sarkar, A. (2005). Voting between multiple data representations for text chunking. Proceedings of the Eighteenth Meeting of the Canadian Society for Computational Intelligence, Canadian AI 2005.