Making Text Resources Accessible to the Reader: the Case of Patent Claims

Simon Mille, Leo Wanner


Abstract
Hardly any other kind of text structures is as notoriously difficult to read as patents. This is first of all due to their abstract vocabulary and their very complex syntactic constructions. Especially the claims in a patent are a challenge: in accordance with international patent writing regulations, each claim must be rendered in a single sentence. As a result, sentences with more than 200 words are not uncommon. Therefore, paraphrasing of the claims in terms the user can understand is of high demand. We present a rule-based paraphrasing module that realizes paraphrasing of patent claims in English as a rewriting task. Prior to the rewriting proper, the module implies the stages of simplification and discourse and syntactic analyses. The rewriting makes use of a full-fledged text generator and consists in a number of genuine generation tasks such as aggregation, selection of referring expressions, choice of discourse markers and syntactic generation. As generator, we use the MATE-work bench, which is based on the Meaning-Text Theory of linguistics.
Anthology ID:
L08-1318
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/352_paper.pdf
DOI:
Bibkey:
Cite (ACL):
Simon Mille and Leo Wanner. 2008. Making Text Resources Accessible to the Reader: the Case of Patent Claims. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Making Text Resources Accessible to the Reader: the Case of Patent Claims (Mille & Wanner, LREC 2008)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/352_paper.pdf