A Neural Multi-digraph Model for Chinese NER with Gazetteers

Ruixue Ding, Pengjun Xie, Xiaoyan Zhang, Wei Lu, Linlin Li, Luo Si


Abstract
Gazetteers have been shown to be useful resources for named entity recognition (NER). Many existing approaches to incorporating gazetteers into machine-learning-based NER systems rely on manually defined selection strategies or handcrafted templates, which may not always lead to optimal effectiveness, especially when multiple gazetteers are involved. The problem is more pronounced in Chinese NER, where text is not naturally segmented into words, which introduces additional ambiguities. To automatically learn how to incorporate multiple gazetteers into an NER system, we propose a novel approach based on graph neural networks with a multi-digraph structure that captures the information that the gazetteers offer. Experiments on various datasets show that our model is effective in incorporating rich gazetteer information while resolving ambiguities, outperforming previous approaches.
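The abstract does not describe how the multi-digraph is constructed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that nodes are character positions, that consecutive characters are linked by a plain sequence edge, and that each gazetteer match adds parallel edges labeled with the name of the gazetteer it came from. The function names, the edge label "seq", the gazetteer names, and the toy entries are all hypothetical.

```python
def find_matches(chars, entries, max_len=10):
    """Return inclusive (start, end) spans whose text is in the gazetteer.

    Brute-force substring lookup; max_len is an assumed cap on entry length.
    """
    n = len(chars)
    spans = []
    for i in range(n):
        for j in range(i + 1, min(n, i + max_len) + 1):
            if "".join(chars[i:j]) in entries:
                spans.append((i, j - 1))
    return spans


def build_multidigraph(sentence, gazetteers):
    """Build a labeled multi-digraph as a list of (src, dst, label) edges.

    Assumption: two nodes may be joined by several edges with different
    labels, which is what makes this a multi-digraph rather than a digraph.
    """
    chars = list(sentence)
    # Plain edges between adjacent characters.
    edges = [(k, k + 1, "seq") for k in range(len(chars) - 1)]
    # One set of labeled edges per gazetteer match.
    for name, entries in gazetteers.items():
        for start, end in find_matches(chars, entries):
            for k in range(start, end):
                edges.append((k, k + 1, name))
    return chars, edges


# Toy example with two hypothetical gazetteers.
gazetteers = {"PER": {"张三"}, "LOC": {"北京"}}
chars, edges = build_multidigraph("张三在北京", gazetteers)
for edge in edges:
    print(edge)
```

In this simplified form a single-character match adds no edge, and overlapping matches from different gazetteers simply coexist as parallel edges. Deciding which of those conflicting matches to trust is left to the downstream model, which is the kind of ambiguity the abstract says the graph neural network is meant to resolve.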
Anthology ID:
P19-1141
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1462–1467
URL:
https://aclanthology.org/P19-1141
DOI:
10.18653/v1/P19-1141
Cite (ACL):
Ruixue Ding, Pengjun Xie, Xiaoyan Zhang, Wei Lu, Linlin Li, and Luo Si. 2019. A Neural Multi-digraph Model for Chinese NER with Gazetteers. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1462–1467, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Neural Multi-digraph Model for Chinese NER with Gazetteers (Ding et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1141.pdf
Supplementary:
 P19-1141.Supplementary.pdf
Software:
 P19-1141.Software.zip
Code:
 PhantomGrapes/MultiDigraphNER