CoNLL-2003 (State of the art)
- Performance measure: F = 2 * Precision * Recall / (Recall + Precision)
- Precision: percentage of named entities found by the system that are correct
- Recall: percentage of named entities annotated in the corpus that were found by the system
- Exact match (all words of a chunk, with the correct type) is required when calculating precision and recall (see the CoNLL scoring software)
- Training data: train split of the CoNLL-2003 corpus
- Dry-run (development) data: testa split of the CoNLL-2003 corpus
- Testing data: testb split of the CoNLL-2003 corpus
- The corpus contains a high proportion of metonymic references (for example, city names standing for sports teams)
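The exact-match scoring described above can be sketched as follows. This is a minimal illustration, not the official CoNLL scoring software: the function names `extract_entities` and `score` are hypothetical, and the tag format is assumed to be IOB-style labels such as `B-PER`, `I-PER` and `O`.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, entity type)
Entity = Tuple[int, int, str]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (start, end, type) spans from a sequence of IOB-style tags."""
    entities: Set[Entity] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        prefix, _, label = tag.partition("-")
        # An open span ends on "O", on a "B-" tag, or when the type changes.
        if start is not None and (prefix in ("O", "B") or label != etype):
            entities.add((start, i, etype))
            start, etype = None, None
        # A span starts on "B-", or on "I-" when no span is open.
        if prefix in ("B", "I") and start is None:
            start, etype = i, label
    return entities


def score(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) using exact match on whole chunks."""
    gold_ents = extract_entities(gold)
    pred_ents = extract_entities(pred)
    # An entity counts only if its boundaries and its type are both identical.
    correct = len(gold_ents & pred_ents)
    precision = correct / len(pred_ents) if pred_ents else 0.0
    recall = correct / len(gold_ents) if gold_ents else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


# Hypothetical example: the system truncates the two-word ORG chunk,
# so that entity counts as wrong for both precision and recall.
gold_tags = ["B-ORG", "I-ORG", "O", "B-PER", "O", "B-LOC"]
pred_tags = ["B-ORG", "O", "O", "B-PER", "O", "B-LOC"]
precision, recall, f = score(gold_tags, pred_tags)
```

In the example, two of the three predicted entities match a gold entity exactly, and two of the three gold entities are found, so precision, recall and F are all 2/3. A partially matched chunk earns no credit.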
Table of results
System name | Short description | System type (1) | Main publications | Software | Result (F-measure) |
---|---|---|---|---|---|
FIJZ | Best CoNLL-2003 participant | S | Florian, Ittycheriah, Jing and Zhang (2003) | | 88.76% |
Baseline | Vocabulary transfer from training to testing | S | Tjong Kim Sang and De Meulder (2003) | | 59.61% |
Balie | Unsupervised approach: no prior training | U | Nadeau, Turney and Matwin (2006) | sourceforge.net | 55.98% |
- (1) System type: R = hand-crafted rules, S = supervised learning, U = unsupervised learning, H = hybrid
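The "vocabulary transfer" baseline in the table can be illustrated with a rough sketch. It assumes the baseline behaves as Tjong Kim Sang and De Meulder (2003) describe it: only phrases that carry a single entity class in the training data are tagged, and the longest matching phrase wins. The function names `build_lexicon` and `tag` and the toy sentences are hypothetical, and the code is not the shared-task implementation.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end exclusive, entity type)
Phrase = Tuple[str, ...]


def build_lexicon(train: List[Tuple[List[str], List[Span]]]) -> Dict[Phrase, str]:
    """Keep only phrases that appear with exactly one entity type in training."""
    seen: Dict[Phrase, Set[str]] = {}
    for tokens, entities in train:
        for start, end, etype in entities:
            seen.setdefault(tuple(tokens[start:end]), set()).add(etype)
    return {p: next(iter(types)) for p, types in seen.items() if len(types) == 1}


def tag(tokens: List[str], lexicon: Dict[Phrase, str]) -> List[Span]:
    """Tag known phrases left to right, preferring the longest match."""
    max_len = max((len(p) for p in lexicon), default=0)
    found: List[Span] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in lexicon:
                found.append((i, i + n, lexicon[phrase]))
                i += n
                break
        else:
            i += 1  # no known phrase starts at this token
    return found


# Toy training data (assumed): "New York" occurs both as a team (ORG)
# and as a place (LOC), so it is ambiguous and dropped from the lexicon.
train_data = [
    (["New", "York", "beat", "Boston"], [(0, 2, "ORG"), (3, 4, "ORG")]),
    (["He", "lives", "in", "New", "York"], [(3, 5, "LOC")]),
    (["York", "is", "old"], [(0, 1, "LOC")]),
]
lexicon = build_lexicon(train_data)
predicted = tag(["Boston", "visited", "New", "York"], lexicon)
```

The toy data also shows why metonymy hurts such a baseline: a phrase seen with two classes is discarded, and a name seen only as a team is tagged as a team even when the test sentence refers to the city.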
References
Florian, R., Ittycheriah, A., Jing, H. and Zhang, T. (2003) Named Entity Recognition through Classifier Combination. Proceedings of CoNLL-2003. Edmonton, Canada.
Nadeau, D., Turney, P. D. and Matwin, S. (2006) Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity. Proceedings of the 19th Canadian Conference on Artificial Intelligence. Québec, Canada.
Tjong Kim Sang, E. F. and De Meulder, F. (2003) Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. Proceedings of CoNLL-2003. Edmonton, Canada.