CONLL-2003 (State of the art)

From ACL Wiki
Latest revision as of 06:29, 12 July 2019

  • Performance measure: F = 2 * Precision * Recall / (Recall + Precision)
  • Precision: percentage of named entities found by the algorithm that are correct
  • Recall: percentage of named entities defined in the corpus that were found by the algorithm
  • Precision and recall count exact matches only: all words of a chunk must match (see CONLL scoring software)
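
The definitions above can be illustrated with a minimal sketch. It assumes each named entity is represented as a (start, end, type) tuple over token positions, so that set intersection implements exact matching. The function name and the toy data are hypothetical, and this is not the official CONLL scoring software.

```python
from typing import Set, Tuple

# Assumed representation: (start token index, end token index exclusive, entity type).
Entity = Tuple[int, int, str]


def precision_recall_f(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) under exact-match scoring."""
    # An entity counts as correct only if its boundaries and type both match exactly.
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (recall + precision)
    return precision, recall, f


# Toy data (illustrative only): two of four predictions match a gold entity exactly.
gold = {(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG")}
predicted = {(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG"), (12, 13, "MISC")}

precision, recall, f = precision_recall_f(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} F={f:.4f}")
```

In this toy case the prediction (9, 10, "ORG") overlaps a gold entity but has the wrong right boundary, so it earns no credit: it lowers precision and the corresponding gold entity counts as missed for recall.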


  • Training data: Train split of CONLL-2003 corpus
  • Dry-run (development) data: Testa split of CONLL-2003 corpus
  • Testing data: Testb split of CONLL-2003 corpus
  • The corpus contains a very high ratio of metonymic references (city names standing for sports teams)
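
As a rough sketch of how one of these splits might be loaded, the reader below assumes the commonly distributed layout: one token per line, whitespace-separated columns with the entity tag last, blank lines between sentences, and -DOCSTART- lines marking document boundaries. The function name and the sample lines are illustrative assumptions, not part of this page.

```python
from typing import Iterable, List, Tuple

# Assumed representation: a sentence is a list of (token, entity tag) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group token lines into sentences, keeping the first and last columns."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in lines:
        line = line.strip()
        # Blank lines and document markers both end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Illustrative sample in the assumed four-column layout (word, POS, chunk, entity tag).
sample = """-DOCSTART- -X- O O

EU NNP I-NP I-ORG
rejects VBZ I-VP O
German JJ I-NP I-MISC
call NN I-NP O

Peter NNP I-NP I-PER
Blackburn NNP I-NP I-PER
""".splitlines()

sentences = read_conll(sample)
print(len(sentences), sentences[1])
```

To read a real split, the same function could be called on an open file handle, since it only iterates over lines.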


Table of results

| System name | Short description | System type (1) | Main publications | Software | Results (F) |
|---|---|---|---|---|---|
| FIJZ | Best CONLL-2003 participant | S | Florian, Ittycheriah, Jing and Zhang (2003) | - | 88.76% |
| Baseline | Vocabulary transfer from training to testing | S | Tjong Kim Sang and De Meulder (2003) | - | 59.61% |
| Balie | Unsupervised approach: no prior training | U | Nadeau, Turney and Matwin (2006) | sourceforge.net | 55.98% |
| BI-LSTM-CRF | Bidirectional LSTM-CRF Model | S | Huang et al. (2015) | - | 90.10% |
| BI-LSTM-CRF | Bidirectional LSTM-CRF Model | S | Akbik, Blythe, & Vollgraf (2018) | https://github.com/zalandoresearch/flair | 93.09% |

  • (1) System type: R = hand-crafted rules, S = supervised learning, U = unsupervised learning, H = hybrid

References

Florian, R., Ittycheriah, A., Jing, H. and Zhang, T. (2003) Named Entity Recognition through Classifier Combination. Proceedings of CoNLL-2003. Edmonton, Canada.

Nadeau, D., Turney, P. D. and Matwin, S. (2006) Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity. Proceedings 19th Canadian Conference on Artificial Intelligence. Québec, Canada. http://iit-iti.nrc-cnrc.gc.ca/publications/nrc-48727_e.html

Tjong Kim Sang, E. F. and De Meulder, F. (2003) Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. Proceedings of CoNLL-2003. Edmonton, Canada. http://www.cnts.ua.ac.be/conll2003/pdf/14247tjo.pdf

Huang, Z., Xu, W. and Yu, K. (2015) Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv:1508.01991. http://arxiv.org/abs/1508.01991

Akbik, A., Blythe, D. and Vollgraf, R. (2018) Contextual String Embeddings for Sequence Labeling. Proceedings of the 27th International Conference on Computational Linguistics, pp. 1638-1649.

See also