Multilingual Corpus Development for Opinion Mining

Julia Maria Schulz, Christa Womser-Hacker, Thomas Mandl


Abstract
Opinion mining is a discipline that has recently attracted some attention. Most research in this field has been done for English or Asian languages, owing to the lack of resources in other languages. In this paper we describe an approach to building a manually annotated multilingual corpus for the domain of product reviews, which can serve as a basis for fine-grained opinion analysis that also considers direct and indirect opinion targets. For each sentence in a review, two annotators manually label the product features mentioned, together with the polarity of the opinion on each feature and its strength on a scale from 0 to 3. The corpus covers English, German and Spanish and consists of about 500 product reviews per language. After a short introduction and a description of related work, we illustrate the annotation process, including the annotation methodology and the tool developed for it. We then present first results on inter-annotator agreement for opinions and product features, and conclude the paper with an outlook on future work.
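The annotation scheme summarised in the abstract (product features per sentence, each with a polarity and a strength from 0 to 3, labelled by two annotators) can be illustrated with a minimal Python sketch. The record layout, the field names and the polarity label set are assumptions made for illustration, not the format used by the authors' tool. Cohen's kappa is shown only as one common agreement measure for two annotators; the abstract does not state which measure the paper reports.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureOpinion:
    """One labelled product feature in a sentence (hypothetical layout)."""
    feature: str    # product feature mentioned, e.g. "battery"
    polarity: str   # e.g. "positive" or "negative" (label set assumed)
    strength: int   # opinion strength on the 0-3 scale from the abstract

    def __post_init__(self):
        # Reject values outside the 0-3 scale described in the abstract.
        if not 0 <= self.strength <= 3:
            raise ValueError("strength must be between 0 and 3")


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)
```

Passing the strength labels that two annotators assigned to the same aligned feature mentions, for example `cohen_kappa([0, 1, 2, 3, 2, 1], [0, 1, 2, 2, 2, 0])`, yields a chance-corrected agreement score. Plain kappa treats the four strength values as unordered categories; a weighted variant would be a natural alternative for an ordinal scale.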
Anthology ID:
L10-1476
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/689_Paper.pdf
Cite (ACL):
Julia Maria Schulz, Christa Womser-Hacker, and Thomas Mandl. 2010. Multilingual Corpus Development for Opinion Mining. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Multilingual Corpus Development for Opinion Mining (Schulz et al., LREC 2010)
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/689_Paper.pdf