Adaptive multimodal natural language generation

This subproject focuses on improving the presentation of answers in a QA system. The answer is extended using information about discourse structure, possibly in combination with other text summarization techniques, and enriched with graphical output (layout and pictures, and possibly animations or movies). The general aim is to generate an answer that is more informative and more appropriate for the current user than the 'plain' output of answer extraction.

So far, an algorithm has been developed for answer-focused summarization (Bosma 2005a). Once a QA engine has found an answer sentence in a collection of documents, the rhetorical structure of the source document (annotated according to Rhetorical Structure Theory, RST) is used to summarize the document in which the answer was found, providing the user with a more informative 'extended' answer. Bosma (2005b) presents a preliminary evaluation of this approach, which suggests that participants indeed prefer such extensions. In addition, several tools were developed for RST annotation and for manipulating RST-annotated documents (e.g., for visualization and summarization).

After an answer is generated, a collection of pictures from web documents is searched for a picture that may be relevant to the answer, so that it can be presented to the user along with the answer. Each picture is associated with the text surrounding it in its original document, and the picture whose surrounding text is most similar to the answer text is taken to be the most relevant. Text similarity is measured using Latent Semantic Indexing (LSI). The process is described in Bosma (2005c). An answer presentation module incorporating both results has been developed and integrated into the IMIX demonstrator.
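The picture selection step described above can be illustrated with a small sketch. It is not the implementation of Bosma (2005c): the function names, the crude tokenizer, the raw term counts, and the choice of k = 2 latent dimensions are all assumptions made for illustration. The sketch builds a term-by-document matrix from the texts surrounding each picture, reduces it with a truncated SVD (the core of LSI), folds the answer into the same latent space, and returns the picture whose surrounding text has the highest cosine similarity to the answer.

```python
import re
import numpy as np


def tokenize(text):
    """Lowercase and keep runs of letters; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(texts):
    """Build a raw-count term-by-document matrix and its vocabulary index."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    matrix = np.zeros((len(vocab), len(texts)))
    for col, text in enumerate(texts):
        for tok in tokenize(text):
            matrix[vocab[tok], col] += 1.0
    return matrix, vocab


def select_picture(answer, surrounding_texts, k=2):
    """Return (index, similarity) of the picture whose surrounding text is
    closest to the answer in a k-dimensional latent space (hypothetical API)."""
    matrix, vocab = term_document_matrix(surrounding_texts)
    if matrix.size == 0:
        raise ValueError("need at least one non-empty surrounding text")
    # Truncated SVD: matrix is approximated by U_k * diag(s_k) * V_k^T.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(k, int(np.sum(s > 1e-10)))
    u_k, s_k = u[:, :k], s[:k]
    docs_k = vt[:k, :].T  # one row of latent coordinates per picture
    # Fold the answer into the latent space: q_k = diag(1/s_k) * U_k^T * q.
    query = np.zeros(len(vocab))
    for tok in tokenize(answer):
        if tok in vocab:
            query[vocab[tok]] += 1.0
    query_k = (u_k.T @ query) / s_k
    # Cosine similarity; vectors that vanish in the latent space score 0.
    sims = []
    for doc in docs_k:
        denom = np.linalg.norm(doc) * np.linalg.norm(query_k)
        sims.append(float(doc @ query_k / denom) if denom > 1e-12 else 0.0)
    best = int(np.argmax(sims))
    return best, sims[best]


if __name__ == "__main__":
    # Made-up surrounding texts, one per candidate picture.
    contexts = [
        "The heart pumps blood through the arteries and veins.",
        "The lungs exchange oxygen and carbon dioxide during breathing.",
        "Blood pressure is measured in the arteries near the heart.",
    ]
    index, similarity = select_picture("How does the heart pump blood?", contexts)
    print(index, round(similarity, 3))
```

On a realistic collection one would normally weight the counts (for example with tf-idf) and use far more latent dimensions; both are omitted here to keep the sketch short.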

With some adaptations, the answer-focused summarization method of Bosma (2005a, b) can also be used to create extended answers that combine sentences from different source documents. Bosma (2006) describes a knowledge-poor method for doing this, in which discourse structure is determined automatically from document layout and redundancy between documents is detected using the dependency tree alignment algorithm of Marsi et al. (2006). This summarization method was evaluated at DUC 2006 on various aspects of content and linguistic quality. Averaged over all summaries and all evaluated aspects of linguistic quality, the system ranked second among 34 participants.
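The role of redundancy detection when combining sentences from several documents can be sketched as follows. This is only an illustration under stated assumptions, not the method of Bosma (2006): the relevance scores are taken as given (in the approach described above they would be derived from discourse structure), redundancy is approximated by Jaccard word overlap rather than by dependency tree alignment, and the names extend_answer, max_sentences and redundancy_threshold are hypothetical.

```python
import re


def words(sentence):
    """Set of lowercased word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; a stand-in for tree alignment."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def extend_answer(answer, candidates, max_sentences=3, redundancy_threshold=0.5):
    """Greedily extend an answer with relevant, non-redundant sentences.

    candidates: list of (sentence, relevance) pairs, possibly drawn from
    different source documents. The relevance scores are assumed inputs.
    """
    selected = [answer]
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    for sentence, _relevance in ranked:
        if len(selected) >= max_sentences:
            break
        # Keep a sentence only if it is dissimilar to everything kept so far.
        if all(jaccard(sentence, kept) < redundancy_threshold for kept in selected):
            selected.append(sentence)
    return selected


if __name__ == "__main__":
    # Made-up sentences and scores, for illustration only.
    answer = "The museum opens at nine in the morning."
    candidates = [
        ("The museum opens at nine in the morning on weekdays.", 0.9),
        ("Tickets can be bought online.", 0.7),
        ("Guided tours start every hour.", 0.5),
    ]
    for line in extend_answer(answer, candidates):
        print(line)
```

A greedy pass of this kind favors the most relevant sentences and drops later near-duplicates; the threshold controls how aggressively overlapping sentences are filtered out.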

For an overview of all results in this subproject, see Bosma (2008).