Multimodal Functional Unification Grammar

MUG System, MUG Workbench

A natural language generation development environment for multimodal applications

Download is here

What is it?

MUG Workbench is a development and debugging tool for Multimodal Natural Language Generation (NLG). The grammar formalism supported is Multimodal Functional Unification Grammar (MUG).

The MUG system runs MUG grammars with fixed (test-case) and arbitrary input specifications to produce output in natural language, in a graphical user interface, and possibly in other modes. It is designed to do three things:

  • Multimodal Fission (distributing output to interaction/communication modes)
  • Some sentence planning (choosing the information to include in the utterance)
  • Natural language and graphical user interface realization (producing the surface output)

The MUG system does these three jobs in parallel. The MUG Workbench can be used to inspect the data structures used during generation. It should help you learn more about the nature of unification grammars used for parsing or natural language generation. Furthermore, the MUG Workbench is helpful in debugging your grammars.

Technically, the MUG system consists of several components, among them

  • A grammar formalism based on the unification of functional, weakly typed attribute-value matrices.
  • A hybrid generation algorithm using soft and hard constraints, which model the efficiency and efficacy of the output. (See our HLT/NAACL-04 paper: UI on the Fly.)
  • A central realizer with various constraint optimization algorithms, among them iterative-deepening branch&bound depth-first search with an admissible heuristic, which turned out to be the most efficient search method.
  • A knowledge base, i.e. a type hierarchy (example)
  • The MUG Workbench with several visualization options, including publication-ready markup of attribute-value matrices for LaTeX.
  • The FASiL VPA grammar (sending e-mail) and smaller example grammars
  • Some initial forays into discourse generation, including referring expressions with a unification-based implementation of Centering theory.
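To illustrate the search strategy named above, here is a minimal Python sketch of branch&bound depth-first search with an admissible heuristic. This is an illustration of the general technique only (the iterative-deepening component is omitted for brevity), not the system's Prolog implementation; the example cost and heuristic functions are invented.

```python
def branch_and_bound(start, extend, cost, heuristic):
    """Depth-first branch-and-bound: explore completions of a partial
    solution, pruning branches whose cost lower bound already meets or
    exceeds the best complete solution found so far."""
    best = [float("inf"), None]   # [best cost, best solution]

    def dfs(node):
        children = extend(node)
        if not children:          # complete solution reached
            c = cost(node)
            if c < best[0]:
                best[0], best[1] = c, node
            return
        for child in children:
            # The heuristic never overestimates the remaining cost
            # (it is admissible), so pruning cannot discard the optimum.
            if cost(child) + heuristic(child) < best[0]:
                dfs(child)

    dfs(start)
    return best[1]

# Toy problem: pick one number per step so that the total is minimal.
choices = [[3, 1], [2, 5], [4, 1]]
best = branch_and_bound(
    (),                                                   # empty partial solution
    lambda s: [s + (c,) for c in choices[len(s)]] if len(s) < 3 else [],
    lambda s: sum(s),                                     # cost so far
    lambda s: sum(min(step) for step in choices[len(s):]) # admissible lower bound
)
print(best)  # -> (1, 2, 1)
```

In the MUG setting, the nodes would be partially realized output variants and the cost would derive from the constraint scores.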

More documentation can be made available on request. A tutorial introducing the MUG formalism and the workbench is available with the system.

Credits and Contact

The MUG System was written by David Reitter while at MIT Media Lab Europe. Contact via e-mail: dreitter @

Contributions from Erin Panttaja (erin @), who provided MUG components, grammars, and designs, and Eva Maguire (Centering components).

MUG was developed with funds from MIT Media Lab Europe and the European Commission under the FASiL grant. More about the multimodal interaction developed in the FASiL project can be found here. There is a FASiL website, too.

What's Multimodal Functional Unification Grammar?

If you know a bit about NLG, you'll probably know how Functional Unification Grammar (FUG) works. FUF is a common formalism, and SURGE is probably the best-known example of an actual grammar (SURGE generates English sentences). FUG works with feature structures (attribute-value matrices), and it works by unifying (constituent) substructures of the input specification with suitable grammar rules. It's similar to a unification-based production grammar, except that symbols don't replace each other -- you end up with one big structure. (There are many other aspects to Functional Unification, but I shouldn't go into detail here. Read Reiter & Dale.)
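To make the mechanism concrete, here is a minimal Python sketch of feature-structure unification, with attribute-value matrices represented as nested dicts. This illustrates the general idea only, not MUG's actual implementation, and all feature names are made up.

```python
def unify(a, b):
    """Unify two attribute-value matrices (nested dicts).
    Returns the merged structure, or None if a feature clashes."""
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                sub = unify(result[key], value)
                if sub is None:
                    return None   # feature clash: unification fails
                result[key] = sub
            else:
                result[key] = value
        return result
    return None                   # differing atomic values do not unify

# A constituent of the input specification...
input_spec = {"cat": "np", "sem": {"type": "person", "name": "Eva"}}
# ...unified with a grammar rule that contributes realization features:
rule = {"cat": "np", "syntax": {"head": "noun"}}
print(unify(input_spec, rule))
# A clashing rule fails outright:
print(unify(input_spec, {"cat": "vp"}))  # -> None
```

Note that, as in FUG, the rule does not replace the input constituent; its features are merged into one growing structure.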

MUG extends FUG by saying: let's unify one grammar rule for each mode (modality: screen, voice, gesture and the like) with the constituent substructures. Thereby we ensure that the output is coherent across modes where necessary. (For details, please read our HLT/NAACL-04 paper.)

The generation process can adapt to situations and devices. While the semantic specification that is input to the generator and the grammar itself both impose hard constraints, situations and devices usually give soft constraints: "The colors on this device's screen don't show a lot of contrast", or "In this situation, it's hard to hear all of the audio." Such soft constraints are encoded in a fitness function, a scoring heuristic that determines how well a particular output variant works in the given situation on the given device. Remember that the grammar potentially generates many output variants (it's ambiguous!). We need to pick one of them to display, and that's what the fitness function helps us with.
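As an illustration, a fitness function of this kind might look like the following Python sketch. The situation flags, weights, and variant fields are invented for the example; MUG's actual fitness functions belong to its device and situation models.

```python
def pick_variant(variants, situation):
    """Score each output variant against soft constraints from the
    situation/device model and return the best-scoring one."""
    def fitness(variant):
        score = 0.0
        # Hypothetical soft constraints:
        if situation.get("noisy") and variant["mode"] == "speech":
            score -= 2.0   # hard to hear all of the audio here
        if situation.get("low_contrast") and variant["mode"] == "screen":
            score -= 1.0   # colors show little contrast on this screen
        score -= 0.1 * variant["length"]   # mildly prefer concise output
        return score
    return max(variants, key=fitness)

variants = [
    {"mode": "speech", "length": 12, "text": "You have three new messages."},
    {"mode": "screen", "length": 4,  "text": "3 new msgs"},
]
# In a noisy situation, the screen variant wins:
print(pick_variant(variants, {"noisy": True}))
```

The same two variants would be ranked differently on a low-contrast screen, where the speech variant scores better.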

How do I get started?

Just download the MUG System and install it.

If you want to use MUG in a dialogue system, we recommend additionally installing YAP-Prolog for the runtime module.

It runs on Linux, Mac OS X, and even on Windows.

Download the MUG System 1.2 with the complete source code:


Follow the instructions in the README file. Usually, you will just need to extract the MUG System package and install SWI-Prolog (easy).

The MUG Workbench runs in a web browser (IE, Mozilla, Safari, etc.).


This is a view that shows variants coming out of the grammar (along with a score), and a feature structure describing part of the information exchanged during the generation process:

You can demo the result in a PDA-size screen:

You can inspect a log of the generation process - useful in learning how the algorithm works:

Select among several device and situation models (extensible!):


Publications

David Reitter. A development environment for multimodal functional unification generation grammars. In Proceedings of the Third International Conference on Natural Language Generation (INLG-04), 2004.
[ abstract | bib | .pdf ]

Erin Panttaja, David Reitter, and Fred Cummins. The evaluation of adaptable multimodal system outputs. In Proceedings of the DUMAS Workshop on Robust and Adaptive Information Processing for Mobile Speech Interfaces, at COLING, 2004.
[ bib | .pdf ]

David Reitter, Erin Panttaja, and Fred Cummins. UI on the fly: Generating a multimodal user interface. In Proceedings of Human Language Technology conference 2004 / North American chapter of the Association for Computational Linguistics (HLT/NAACL-04), 2004.
[ bib | .pdf ]