Combinatory Categorial Grammar

From ACL Wiki
Revision as of 01:52, 12 August 2008 by Ioan (Talk | contribs)

Introduction

Combinatory Categorial Grammar (CCG) is an efficiently parseable, yet linguistically expressive grammar formalism. It has a completely transparent interface between surface syntax and underlying semantic representation, including predicate-argument structure, quantification and information structure.
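
To make the syntax-semantics interface concrete, here is a minimal sketch of the two application rules of CCG (forward: X/Y Y => X; backward: Y X\Y => X), with each syntactic step applying the functor's meaning to the argument's meaning. The class names (Atom, Functor, Sign), the tiny lexicon and the string-based semantics are assumptions made for illustration; they are not taken from any of the tools listed on this page.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class Atom:
    name: str  # atomic category such as "S" or "NP"

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Functor:
    result: "Category"
    slash: str  # "/" seeks its argument to the right, "\\" to the left
    arg: "Category"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


Category = Union[Atom, Functor]


@dataclass(frozen=True)
class Sign:
    cat: Category  # syntactic category
    sem: Any       # meaning: a string, or a function awaiting arguments


def forward_apply(left: Sign, right: Sign) -> Optional[Sign]:
    # Forward application: X/Y  Y  =>  X
    c = left.cat
    if isinstance(c, Functor) and c.slash == "/" and c.arg == right.cat:
        return Sign(c.result, left.sem(right.sem))
    return None


def backward_apply(left: Sign, right: Sign) -> Optional[Sign]:
    # Backward application: Y  X\Y  =>  X
    c = right.cat
    if isinstance(c, Functor) and c.slash == "\\" and c.arg == left.cat:
        return Sign(c.result, right.sem(left.sem))
    return None


# Hypothetical three-word lexicon.
S, NP = Atom("S"), Atom("NP")
TV = Functor(Functor(S, "\\", NP), "/", NP)  # transitive verb: (S\NP)/NP

john = Sign(NP, "john")
mary = Sign(NP, "mary")
likes = Sign(TV, lambda obj: lambda subj: f"like({subj}, {obj})")

vp = forward_apply(likes, mary)       # "likes Mary" has category S\NP
sentence = backward_apply(john, vp)   # "John likes Mary" has category S
print(sentence.cat, sentence.sem)
```

Every syntactic combination performs exactly one function application on the semantic side, which is the sense in which the interface is transparent. A full CCG also includes composition and type-raising rules, which this sketch omits.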

Software

OpenCCG: The OpenNLP CCG Library

OpenCCG, the OpenNLP CCG Library, is an open source natural language processing library written in Java, which provides parsing and realization services based on Mark Steedman's Combinatory Categorial Grammar (CCG) formalism. The library makes use of multi-modal extensions to CCG developed by Jason Baldridge as part of the Grok system (the precursor to OpenCCG). Current development efforts, led by Michael White, are focused on making the realizer practical to use in dialogue systems. For the latest news about OpenCCG, check out the SourceForge project page.

The C&C Parser and Supertagger

The C&C CCG parser and supertagger form part of the language processing tools developed by James Curran and Stephen Clark. The tools are written in C++ and have been designed to be efficient enough for large-scale NLP tasks.

StatCCG

StatCCG is a statistical CCG parser (trained on CCGbank) written by Julia Hockenmaier. Executables are available here.

Boxer

Boxer, developed by Johan Bos, generates formal semantic representations from CCG derivations. It takes CCG derivations as input and produces DRSs (Discourse Representation Structures, from Hans Kamp's Discourse Representation Theory) as output. It is distributed with the C&C tools. Boxer produces standard DRS syntax, uses a neo-Davidsonian analysis for events (with thematic roles from VerbNet), incorporates Van der Sandt's algorithm for presupposition, is fully compatible with first-order logic (FOL), and normalises cardinal and date expressions. DRSs can be generated in various output formats: resolved or underspecified, in Prolog or XML, as flattened or recursive structures, with discourse referents represented by Prolog atoms or variables, and with or without pretty-printing. It is also possible to output FOL formulas translated from the DRSs.
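
The translation from DRSs to first-order logic can be illustrated with a short sketch of the standard textbook mapping: the referents of a DRS become existential quantifiers over the conjunction of its conditions, while the referents of the antecedent of an implication become universal quantifiers. The data classes, the plain-text formula syntax and the role name "agent" below are assumptions chosen for illustration; they do not reproduce Boxer's actual code or output formats.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Pred:
    name: str
    args: Tuple[str, ...]


@dataclass
class Neg:
    drs: "DRS"


@dataclass
class Imp:
    antecedent: "DRS"
    consequent: "DRS"


Condition = Union[Pred, Neg, Imp]


@dataclass
class DRS:
    referents: List[str]
    conditions: List[Condition]


def conj(parts: List[str]) -> str:
    # Conjoin formulas; an empty condition list is trivially true.
    if not parts:
        return "true"
    if len(parts) == 1:
        return parts[0]
    return "(" + " & ".join(parts) + ")"


def cond_to_fol(c: Condition) -> str:
    if isinstance(c, Pred):
        return f"{c.name}({','.join(c.args)})"
    if isinstance(c, Neg):
        return f"~({drs_to_fol(c.drs)})"
    if isinstance(c, Imp):
        # Antecedent referents are universally quantified over the implication.
        body = conj([cond_to_fol(x) for x in c.antecedent.conditions])
        formula = f"({body} -> {drs_to_fol(c.consequent)})"
        for x in reversed(c.antecedent.referents):
            formula = f"all {x}.{formula}"
        return formula
    raise TypeError(f"unknown condition: {c!r}")


def drs_to_fol(drs: DRS) -> str:
    # Referents of a DRS are existentially quantified over its conditions.
    formula = conj([cond_to_fol(c) for c in drs.conditions])
    for x in reversed(drs.referents):
        formula = f"exists {x}.{formula}"
    return formula


# "Every man walks", with a neo-Davidsonian event variable e.
every_man_walks = DRS([], [
    Imp(
        DRS(["x"], [Pred("man", ("x",))]),
        DRS(["e"], [Pred("walk", ("e",)), Pred("agent", ("e", "x"))]),
    )
])

print(drs_to_fol(every_man_walks))
# all x.(man(x) -> exists e.(walk(e) & agent(e,x)))
```

The sketch covers only basic conditions, negation and implication; presupposition resolution and the other features described above are outside its scope.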

CCGbank

Publications

People