The Geppetto Development Environment
Geppetto is an environment
designed to facilitate the development of linguistic modules and resources for
NLP. It provides facilities for:
- editing and debugging grammars and lexica;
- linking linguistic data to a parser and/or a generator;
- integrating domain knowledge stored in KBs;
- using already available specialized processors (e.g. morphological
analyzers).
Geppetto is based on a formalism for specifying linguistic data that is
oriented to Typed Feature Logic (TFL; Carpenter92). TFL specifications are
compiled into a graph format, where each node represents a Typed Feature
Structure (TFS). The standard TFL formalism has been modified to accommodate:
- Declaration statements specifying, for instance, that a certain
object is not an ordinary TFS. In case its properties must be assessed by
other, possibly external, modules, such a fact can be specified by means of
...
- External constraints. They provide explicit links to external
modules, e.g. morphological processors, independent KBs, etc.
- Directives for the unifier. For instance, it is possible to force
the unifier to consider certain paths first. Such paths may be
those that have been observed to cause failures most frequently (Uszkoreit91).
- Macros.
Declaration statements and external constraints
have been employed to interface the grammars and lexica produced with
Geppetto to existing KBs containing domain knowledge.
Geppetto provides a number
of unification algorithms, among which the user can select the one that best
suits their purposes. The algorithms have been designed to:
- carefully control and minimize the amount of copying needed with
non-deterministic parsing schemes (Wroblewski87, Kogure90); as is well known,
this is crucial for reducing both time and space requirements at runtime;
- provide a better match between the characteristics of the unifiers and
those of the linguistic processors using them. It can be observed, in fact,
that different linguistic processors may benefit from different unification
algorithms. Geppetto therefore allows the user to choose the unification
algorithm best suited to the needs of the particular linguistic processor at hand.
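Why copying matters can be seen in a minimal sketch of non-destructive unification with structure sharing. It is a simplified illustration only, not the algorithms of Wroblewski87 or Kogure90 and not Geppetto's unifiers: feature structures are assumed to be nested Python dicts with atomic values, without types or re-entrancy, and the names `unify` and `UnificationFailure` are hypothetical. A new node is allocated only where the result actually differs from the first argument; everything else is shared.

```python
from typing import Any


class UnificationFailure(Exception):
    """Raised when two atomic values (or an atom and a structure) clash."""


def unify(a: Any, b: Any) -> Any:
    """Unify a and b without modifying either input.

    Returns `a` itself when `b` adds no information, so no copy is made.
    """
    if a is b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = None  # allocated lazily, only if `a` has to be extended
        for feature, b_val in b.items():
            if feature in a:
                merged = unify(a[feature], b_val)
                if merged is a[feature]:
                    continue  # nothing new under this feature: keep sharing
            else:
                merged = b_val  # share b's substructure instead of copying it
            if result is None:
                result = dict(a)  # shallow copy: untouched values stay shared
            result[feature] = merged
        return a if result is None else result
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} clashes with {b!r}")


agr = {"num": "sg", "per": "3"}
np = {"cat": "np", "agr": agr}
same = unify(np, {"agr": {"num": "sg"}})   # adds nothing: returns np itself
extended = unify(np, {"case": "nom"})      # one new top node, agr is shared
```

In a non-deterministic parser many unifications fail or add little, so avoiding copies in those cases saves both allocation time and memory, which is the concern the first point above describes.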
Other relevant characteristics are the following:
- Types are implemented as bit-vectors (Ait-Kaci89). This makes it possible to
handle very large type hierarchies efficiently and provides a
straightforward way to account for type disjunction.
- A memory management scheme has been designed to control the amount of
garbage collection done at run time. In particular, the creation of TFS-graph
nodes is reduced by drawing on a stock of available nodes.
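The bit-vector idea can be sketched as follows. This is a simplified encoding in the spirit of the cited approach, not Geppetto's actual implementation: each type is assumed to receive one bit of its own plus the bits of all its subtypes, Python integers serve as bit-vectors, and the small agreement hierarchy and the function names are hypothetical. Under these assumptions subsumption is a mask test, type unification is a bitwise AND, and a disjunction of types is a bitwise OR.

```python
from typing import Dict, List

# Hypothetical hierarchy: each type is mapped to its immediate subtypes.
HIERARCHY: Dict[str, List[str]] = {
    "top": ["sing", "third"],
    "sing": ["1sg", "3sg"],
    "third": ["3sg", "3pl"],
    "1sg": [],
    "3sg": [],
    "3pl": [],
}


def encode(subtypes: Dict[str, List[str]]) -> Dict[str, int]:
    """Give every type its own bit, OR-ed with the codes of its subtypes."""
    own_bit = {t: 1 << i for i, t in enumerate(subtypes)}
    codes: Dict[str, int] = {}

    def code(t: str) -> int:
        if t not in codes:
            c = own_bit[t]
            for s in subtypes[t]:
                c |= code(s)
            codes[t] = c
        return codes[t]

    for t in subtypes:
        code(t)
    return codes


def subsumes(codes: Dict[str, int], general: str, specific: str) -> bool:
    """A type subsumes another if it contains all of the other's bits."""
    return (codes[general] & codes[specific]) == codes[specific]


def decode(codes: Dict[str, int], vector: int) -> List[str]:
    """Most general types lying wholly inside `vector` ([] means a clash)."""
    inside = [t for t, c in codes.items() if (c & vector) == c]
    return [t for t in inside
            if not any(u != t and (codes[u] & codes[t]) == codes[t]
                       for u in inside)]


CODES = encode(HIERARCHY)
glb = decode(CODES, CODES["sing"] & CODES["third"])    # type unification
clash = decode(CODES, CODES["1sg"] & CODES["3pl"])     # incompatible types
one_of = CODES["1sg"] | CODES["3pl"]                   # disjunction 1sg OR 3pl
narrowed = decode(CODES, one_of & CODES["sing"])       # disjunction meets sing
```

Because every operation is a word-level bitwise instruction, its cost does not depend on the depth of the hierarchy, and a disjunctive type needs no representation beyond another bit-vector.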
The editing
environment offers graphical facilities for editing linguistic data:
- A grapher for editing the type inheritance hierarchy. The grapher displays
mouse-sensitive nodes, allowing the user to input the TFL specifications for
each type of the hierarchy.
- Browsers for grammar rules, lexical items and macros.
- Specialized windows for TFL constraint editing.
- TFL-syntax error checking.
- Debugging facilities.
Geppetto makes available a library of
linguistic processors: chart parsers (island-driven, CYK) and generators
(non-deterministic Head-Driven Bottom-Up (Pianesi93)), together with the
possibility of integrating user-designed ones. As said above, the user can
specify the unification algorithm most appropriate to the chosen linguistic
processor.
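For readers unfamiliar with CYK chart parsing, the following is a minimal recognizer sketch. It shows only the textbook algorithm over atomic categories and a grammar in Chomsky normal form; it does not reproduce Geppetto's parsers, which operate on typed feature structures and rely on unification. The toy lexicon, the rules and the function name are hypothetical.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def cyk_recognize(words: List[str],
                  lexicon: Dict[str, Set[str]],
                  rules: Dict[Tuple[str, str], Set[str]],
                  start: str = "S") -> bool:
    """Return True if `words` can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(lexicon.get(word, ()))
    for width in range(2, n + 1):            # shorter spans are filled first
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):        # every split point of the span
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= rules.get((left, right), set())
    return start in chart[0][n]


# Hypothetical toy grammar in Chomsky normal form.
LEXICON = {"the": {"Det"}, "dog": {"N"}, "barks": {"VP"}}
RULES = {("Det", "N"): {"NP"}, ("NP", "VP"): {"S"}}
accepted = cyk_recognize("the dog barks".split(), LEXICON, RULES)
rejected = cyk_recognize("dog the barks".split(), LEXICON, RULES)
```

In a unification-based setting, the dictionary lookup on category pairs would be replaced by an attempt to unify the daughters of each rule with the chart entries, which is where the choice of unification algorithm affects parser performance.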
Relevant Publications:
- Fabio Ciravegna, Alberto Lavelli, Daniela Petrelli, Fabio Pianesi.
Developing Language Resources and Applications with Geppetto.
In Proceedings of the First International Conference on Language Resources and
Evaluation, pages 619-625, Granada, Spain, May 28-30, 1998.
- Fabio Ciravegna, Alberto Lavelli, Daniela Petrelli, Fabio Pianesi.
Participatory Design for Linguistic Engineering: the Case of the GEPPETTO
Development Environment.
In Proceedings of the ACL/EACL Workshop on Computational Environments for
Grammar Development and Linguistic Engineering, pages 16-23, Madrid, Spain,
July 12, 1997.
- Fabio Ciravegna, Alberto Lavelli, Daniela Petrelli, Fabio Pianesi.
The GEPPETTO Development Environment. Version 2.1. User Manual.
Technical Report, IRST, November 1997.
Last modified: Mon Oct 4 14:22:47 MET DST 1999