Difference between revisions of "Downloadable NLG systems"

From ACL Wiki

Latest revision as of 04:25, 29 June 2020


The natural language generation systems listed below are available for download over the web. If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.

ASTROGEN

http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html

ASTROGEN (Aggregated deep and Surface naTuRal language GENerator) is a Prolog-based system.

Chimera

https://github.com/AmitMY/chimera

Chimera is a component-based, step-by-step pipeline for data-to-text generation, based on https://arxiv.org/abs/1904.03396. It handles the pre-processing needed for text planning and surface realization, both of which use neural networks, and it performs referring-expression generation. It can automatically evaluate datasets that have a train-dev-test split, using both BLEU and data coverage.
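To make the idea of a coverage-style evaluation concrete, here is a minimal sketch of one simple reading of "data coverage": the fraction of input entity strings that show up in the generated text. This is an illustrative assumption, not Chimera's actual metric or code; the function name `data_coverage` and the string-matching rule are hypothetical.

```python
from typing import Sequence


def data_coverage(entities: Sequence[str], text: str) -> float:
    """Return the fraction of input entities mentioned in the generated text.

    Hypothetical definition for illustration only: an entity counts as
    covered if its surface string occurs in the text, ignoring case.
    """
    if not entities:
        # Nothing to cover, so treat the text as fully covering the input.
        return 1.0
    lowered = text.lower()
    covered = sum(1 for entity in entities if entity.lower() in lowered)
    return covered / len(entities)


# Two of the three input entities are mentioned in the output text.
example_score = data_coverage(
    ["John Doe", "London", "1980"],
    "John Doe was born in London.",
)
```

A real evaluation would likely need to match paraphrased or inflected mentions as well, which plain substring matching cannot do.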

CRISP

http://code.google.com/p/crisp-nlg/

CRISP is Alexander Koller's NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a language that runs on the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.

CODA Tools software Release 1.1

http://computing.open.ac.uk/coda/resources/tools_form.html

This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).

Elvex

https://github.com/lionelclement/Elvex

Elvex is an NLG system based on a functional unification grammar close to LFG. It is implemented in C++, and is freely available under the GNU GPL.

FUF/SURGE

https://www.cs.bgu.ac.il/~elhadad/install-fuf.html

FUF/SURGE is a surface realisation system, based on functional unification grammar.

GenI

http://kowey.github.io/GenI

GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification). Toy example grammars provided for English and French. Largish core grammar for French is under development (contact us for details). GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well). Written in Haskell. Source code available via hackage, GitHub, or hub.darcs.net.

Grammar Explorer

http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html

The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are organized and what kinds of things they cover. It can be used to explore the KPML resources. Downloadable standalone executables of the grammar explorer are available for Windows 95/98/NT. These already include a version of the Nigel grammar of English and pre-installed examples.

GoPhi : an AMR to ENGLISH VERBALIZER

https://github.com/rali-udem/gophi

GoPhi (Generation Of Parenthesized Human Input) is a system for generating a literal reading of Abstract Meaning Representation (AMR) structures. The system, written in SWI-Prolog, uses a symbolic approach to transform the original rooted graph into a tree of constituents that is transformed into an English sentence by jsRealB.

jsRealB

http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser

jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a node.js module. Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub (https://github.com/rali-udem/jsRealB).

KPML

http://www.purl.org/net/kpml

The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.

A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html) for current examples. The development of further languages and of extensions to existing resources is very welcome!

LKB

http://wiki.delph-in.net/moin/LkbTop

LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG. It includes a realiser that takes Minimal Recursion Semantics (MRS) as input. LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.

Multimodal Unification Grammar

http://www.david-reitter.com/compling/mug/

MUG Workbench is a development and debugging tool for multimodal NLG. The grammar formalism supported is Multimodal Functional Unification Grammar (MUG). The MUG system runs MUG grammars with fixed (test cases) and arbitrary input specifications to produce output in a natural language, a graphical user interface and possibly other modes. It is designed to do three things:

- Multimodal fission (distributing output to interaction/communication modes)
- Some sentence planning (choosing information to include in the utterance)
- Natural language and graphical user interface realization (producing some form of output)

The MUG system does these three jobs in parallel. MUG Workbench can serve to inspect the data structures used during generation. It should help you to learn more about the nature of unification grammars used for parsing or natural language generation. Furthermore, the MUG Workbench is helpful in debugging your grammars.

NaturalOWL

http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)

NaturalOWL generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. It currently supports English and Greek. Extensions for other languages are welcome. NaturalOWL can also be used as a Protégé (http://protege.stanford.edu/) plug-in. See http://www.aueb.gr/users/ion/publications.html for publications describing NaturalOWL. (GPL)

NLGen and NLGen2

https://launchpad.net/nlgen

https://launchpad.net/nlgen2

The NLGen natural language generation system applies the SegSim strategy for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of RelEx output. Java, Apache license. See demo: Demo of AI Virtual Pet Answering Simple Questions.

NLGen2 uses RelEx dependency parses, together with Link Grammar linkage analysis to generate English-language output. Java, Apache license. Reference: Blake Lemoine, "NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System".

OpenCCG

http://openccg.sourceforge.net/

OpenCCG is both a parser and a realizer for Combinatory Categorial Grammar. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.
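The idea of using an n-gram model to choose among candidate realizations can be sketched as follows. This is a generic add-one-smoothed bigram scorer, not OpenCCG's own implementation or API; the function names, the toy corpus, and the candidate strings are assumptions made for illustration, and hypertagging is not covered.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left contexts over whitespace-tokenized text."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a sentence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def rank(candidates, model):
    """Order candidate realizations from most to least probable."""
    return sorted(candidates, key=lambda c: log_prob(c, model), reverse=True)


# Toy training corpus and two candidate word orders (hypothetical data).
corpus = [
    "the flight leaves at noon",
    "the flight arrives at noon",
    "a flight leaves today",
]
model = train_bigram_model(corpus)
candidates = ["noon at leaves flight the", "the flight leaves at noon"]
print(rank(candidates, model)[0])
```

A realizer typically produces several strings for the same logical form; scoring them with a language model is one way to prefer the most fluent word order.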

rLDCP: Text Generation from Data

https://cran.r-project.org/web/packages/rLDCP/index.html

R package for text generation from data

RNNLG

https://github.com/shawnwun/RNNLG

RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.

RosaeNLG

https://rosaenlg.org

RosaeNLG is a Natural Language Generation library for node.js or client-side (browser) execution, based on the Pug template engine. It was previously known as FreeNLG. It supports English, French, German and Italian, and is complete enough for writing production-grade, real-life NLG applications.

SimpleNLG

https://github.com/simplenlg/simplenlg (English)

https://github.com/rali-udem/SimpleNLG-EnFr (English and French)

https://github.com/citiususc/SimpleNLG-GL (Galician)

https://github.com/citiususc/SimpleNLG-ES (Spanish)

https://github.com/sebischair/SimpleNLG-DE (German)

SimpleNLG is a simple Java-based realiser. Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively easy for people to learn how to use it. It has a Java API, and can be used from other languages via an XML interface. There are "unofficial" ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including Dutch, Italian, and Mandarin.

SPUD

http://www.cs.rutgers.edu/~mdstone/nlg.html

SPUD (Sentence Planner Using Descriptions) is Matthew Stone's LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.

STANDUP

https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php

STANDUP (System To Augment Non-speakers' Dialogue Using Puns) is a collaborative project on generating simple jokes through a graphical user interface suitable for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.

Suregen-2

http://www.suregen.de/00023.html

Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.” The system Suregen-2 is written in (Allegro) Common Lisp. A demo system which runs under Windows is available for download. A screencast video shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in VLC if you run into problems.) Perhaps this system could be considered an instance of the WYSIWYM approach.

Syntax Maker

Syntax Maker is an open-source surface realization tool for Finnish. It can inflect words into their correct morphology based on case government and agreement.

Paper: https://www.aclweb.org/anthology/W18-0205/

TGen

TGen is a statistical generator that produces sentences from dialogue acts or similar representations, based on the sequence-to-sequence (seq2seq) neural network architecture. Beam search candidates produced by the seq2seq model are reranked according to whether they conform to the input meaning representation. The system is written in Python and uses TensorFlow.

Link: https://github.com/UFAL-DSG/tgen

Paper: https://aclweb.org/anthology/P16-2008
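The reranking idea described for TGen can be illustrated with a minimal sketch. This is not TGen's code: the keyword lookup below is a stand-in for a trained classifier, and the function names, slot names, example restaurant texts, scores, and the penalty weight are all assumptions made for illustration.

```python
def detect_slots(text, slot_keywords):
    """Stand-in for a trained classifier: a slot counts as expressed
    if any of its (hypothetical) keywords occurs in the text."""
    lowered = text.lower()
    return {
        slot
        for slot, keywords in slot_keywords.items()
        if any(keyword in lowered for keyword in keywords)
    }


def rerank(beam, input_slots, slot_keywords, penalty=100.0):
    """Reorder (text, log_prob) candidates, penalizing each slot that is
    missing from the text or expressed without being in the input."""
    def adjusted_score(item):
        text, log_prob = item
        expressed = detect_slots(text, slot_keywords)
        mismatches = len(expressed ^ input_slots)  # missing + superfluous slots
        return log_prob - penalty * mismatches
    return sorted(beam, key=adjusted_score, reverse=True)


# Hypothetical input meaning representation and beam search output.
slot_keywords = {
    "name": ["golden palace"],
    "food": ["chinese"],
    "area": ["riverside", "city centre"],
}
input_slots = {"name", "food"}
beam = [
    ("Golden Palace is a restaurant.", -1.2),
    ("Golden Palace serves Chinese food in the riverside area.", -1.5),
    ("Golden Palace serves Chinese food.", -1.9),
]

best_text, _ = rerank(beam, input_slots, slot_keywords)[0]
print(best_text)
```

Here the candidate with the highest model score omits the `food` slot, and the second adds an `area` that was never requested, so the faithful third candidate moves to the top after reranking.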

UralicNLP

UralicNLP provides morphological generators in various languages including Finnish, Russian, German, Norwegian, Arabic, Erzya, Moksha, Skolt Sami...

Paper: https://doi.org/10.21105/joss.01345

This page was imported semi-automatically from the NLG Resources Wiki which was run by ACL SIGGEN in the years 2005–2009. Please correct conversion errors and help update its contents.

Now this page is associated with the Natural Language Generation Portal.