Towards a Standardized Linguistic Annotation of the Textual Content of Labels in Knowledge Representation Systems

Thierry Declerck, Piroska Lendvai


Abstract
We propose applying standardized linguistic annotation to terms included in labels of knowledge representation schemes (taxonomies or ontologies), hypothesizing that this would help improve ontology-based semantic annotation of texts. We share the view that currently used methods for including lexical and terminological information in such hierarchical networks of concepts are not satisfactory, and thus put forward, as a preliminary step to our annotation goal, a model for modular representation of conceptual, terminological and linguistic information within knowledge representation systems. Our CTL model is based on two recent initiatives that describe the representation of terminologies and lexicons in ontologies: the Terminae method for building terminological and ontological models from text (Aussenac-Gilles et al., 2008), and the LexInfo metamodel for ontology lexica (Buitelaar et al., 2009). CTL goes beyond the mere fusion of the two models and introduces an additional level of representation for linguistic objects, which are no longer limited to lexical information but cover the full range of linguistic phenomena, including constituency and dependency. We also show that the approach benefits the linguistic and semantic analysis of external documents, which often have to be linked to semantic resources for enrichment with newly extracted or inferred concepts.
Anthology ID:
L10-1575
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/832_Paper.pdf
Cite (ACL):
Thierry Declerck and Piroska Lendvai. 2010. Towards a Standardized Linguistic Annotation of the Textual Content of Labels in Knowledge Representation Systems. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Towards a Standardized Linguistic Annotation of the Textual Content of Labels in Knowledge Representation Systems (Declerck & Lendvai, LREC 2010)
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/832_Paper.pdf