Generalising Lexical Translation Strategies for MT Using Comparable Corpora

Bogdan Babych, Serge Sharoff, Anthony Hartley


Abstract
We report on an ongoing research project aimed at increasing the range of translation equivalents that MT systems can discover automatically. The methodology is based on semi-supervised learning of indirect translation strategies from large comparable corpora and on applying them at run time to generate novel, previously unseen translation equivalents. This approach differs from methods based on parallel resources, which currently can reuse only individual translation equivalents. Instead, it models translation strategies that generalise individual equivalents and can successfully generate an open class of new translation solutions. The project aims to integrate the developed technology into open-source MT systems.
Anthology ID:
L08-1111
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/340_paper.pdf
Cite (ACL):
Bogdan Babych, Serge Sharoff, and Anthony Hartley. 2008. Generalising Lexical Translation Strategies for MT Using Comparable Corpora. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Generalising Lexical Translation Strategies for MT Using Comparable Corpora (Babych et al., LREC 2008)
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/340_paper.pdf