Definition Extraction Using a Sequential Combination of Baseline Grammars and Machine Learning Classifiers

Łukasz Degórski, Michał Marcińczuk, Adam Przepiórkowski


Abstract
The paper deals with the task of definition extraction from a small and noisy corpus of instructive texts. Three approaches are presented: Partial Parsing, Machine Learning (ML), and a sequential combination of both. We show that applying ML methods with the support of a trivial grammar gives better results than a relatively complicated partial grammar, and much better results than a pure ML approach.
Anthology ID:
L08-1294
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/213_paper.pdf
Cite (ACL):
Łukasz Degórski, Michał Marcińczuk, and Adam Przepiórkowski. 2008. Definition Extraction Using a Sequential Combination of Baseline Grammars and Machine Learning Classifiers. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Definition Extraction Using a Sequential Combination of Baseline Grammars and Machine Learning Classifiers (Degórski et al., LREC 2008)
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/213_paper.pdf