DM_NLP at SemEval-2018 Task 8: neural sequence labeling with linguistic features

Chunping Ma, Huafei Zheng, Pengjun Xie, Chen Li, Linlin Li, Luo Si


Abstract
This paper describes our submissions to SemEval-2018 Task 8: Semantic Extraction from CybersecUrity REports using NLP. The DM_NLP team participated in two subtasks: Subtask 1 classifies whether a sentence is useful for inferring malware actions and capabilities, and Subtask 2 predicts token labels (“Action”, “Entity”, “Modifier” and “Others”) for a given malware-related sentence. Since we use the results of Subtask 2 directly to infer the result of Subtask 1, the paper focuses on the system for Subtask 2. Treating Subtask 2 as a sequence labeling task, our system relies on a recurrent neural network, BiLSTM-CNN-CRF, with rich linguistic features such as POS tags, dependency parsing labels, chunking labels, NER labels, and Brown clustering features. Our system achieved the highest F1 score at both the token level and the phrase level.
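The abstract says the Subtask 1 result is inferred directly from the Subtask 2 output but does not state the rule. As a minimal sketch only, one plausible rule is to mark a sentence as relevant when at least one token is labeled something other than “Others”. The function name `sentence_is_relevant` and the `min_hits` threshold are hypothetical and are not taken from the paper.

```python
from typing import List

# Label used for tokens that are not part of any annotated span (from the abstract).
NON_RELEVANT = "Others"


def sentence_is_relevant(token_labels: List[str], min_hits: int = 1) -> bool:
    """Assumed aggregation rule, not the authors' confirmed method:
    a sentence counts as relevant if at least `min_hits` tokens carry
    a label other than "Others"."""
    hits = sum(1 for label in token_labels if label != NON_RELEVANT)
    return hits >= min_hits


# Hypothetical token-level predictions for two sentences.
tagged = ["Others", "Entity", "Action", "Modifier", "Entity", "Others"]
untagged = ["Others", "Others", "Others", "Others"]

print(sentence_is_relevant(tagged))
print(sentence_is_relevant(untagged))
```

Raising `min_hits` would make the sentence classifier more conservative; whether the authors applied any such threshold is not stated in the abstract.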
Anthology ID:
S18-1114
Volume:
Proceedings of the 12th International Workshop on Semantic Evaluation
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, Marine Carpuat
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
707–711
URL:
https://aclanthology.org/S18-1114
DOI:
10.18653/v1/S18-1114
Cite (ACL):
Chunping Ma, Huafei Zheng, Pengjun Xie, Chen Li, Linlin Li, and Luo Si. 2018. DM_NLP at SemEval-2018 Task 8: neural sequence labeling with linguistic features. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 707–711, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
DM_NLP at SemEval-2018 Task 8: neural sequence labeling with linguistic features (Ma et al., SemEval 2018)
PDF:
https://aclanthology.org/S18-1114.pdf