The Delivery of Multimedia Presentations in a Graphical
User Interface Environment
Nathalie Colineau, Julien Phalip, Andrew Lampert
CSIRO ICT Centre
Locked Bag 17, North Ryde
NSW 1670, Australia
phone: (+61) 2 9325 3100
{nathalie.colineau, julien.phalip, andrew.lampert}@csiro.au
Copyright is held by the author/owner(s).
IUI'06, January 26-February 1, 2006, Sydney, Australia.
ACM 1-59593-287-9/06/0001.

ABSTRACT
A major issue in many domains is to present information to people
that is tailored to their needs, in such a way that it supports
them in their tasks. In this paper, we present the Virtual
Document Planner (VDP), a platform we developed for generating
tailored interactive multimedia presentations in the surveillance
domain. Integrated with the surveillance operators' graphical
interface, the VDP provides tailored information delivery
mechanisms that adapt the operators' information-rich environment
to their tasks and information needs.

Categories and Subject Descriptors
H.5.2 [User Interfaces]: Natural Language, Graphical User
Interfaces; H.5.3 [Group and Organization Interfaces]:
Synchronous Interaction; I.2.7 [Natural Language Processing]:
Discourse, Language Generation.

General Terms
Design, Human Factors.

Keywords
Tailored information delivery, multimedia presentation
generation, discourse approach, task-sensitive user interface.

1. INTRODUCTION
In the context of innovative Airborne Early Warning and Control
(AEW&C) platforms, we have investigated the delivery of
multimedia information to support the surveillance operators in
their tasks. In their current environment, operators access
information that comes from heterogeneous sources, and that is
delivered in a number of ways, across several displays.

To support operators in their tasks, we have enhanced their
graphical interface with information delivery mechanisms that
tailor this information-rich and complex environment to their
tasks and information needs. These delivery mechanisms have
been integrated within the operators' interface, minimizing the
number of displays and ensuring the most relevant information is
always available and prominently displayed. To this end, we have
implemented a platform, called the Virtual Document Planner
(VDP), for generating tailored interactive multimedia
presentations. Based on an analysis of the operators' tasks and an
understanding of the context in which operators perform their
tasks (see [2]), the VDP determines the relevant information to
present to the operators. This information is then organized and
delivered on the operators' graphical interface.

The VDP is based on a typical Natural Language Generation
(NLG) architecture, where the linguistic resources are separate
from the engine, and, like many multimedia presentation systems
(e.g., [1], [4] and [6]), it employs a discourse approach based on
Rhetorical Structure Theory (RST) [5]. However, our approach is
different in a number of ways. Unlike these systems, the
presentation we produce requires a real integration between the
information tailored by the VDP and the information already on
screen. Indeed, the aim was not to generate the operators'
interface, essentially a radar display, but rather to augment their
radar display with additional information. To do this, we have
designed a workspace and integrated it into the operators' radar
display. This is where most of the information generated by the
VDP is displayed. When needed, the information can also be
overlaid directly on the radar display, for example to visualize
a track's kinematic information or to display flight routes. In
addition, while we use simple techniques to do the media
allocation (not as sophisticated as [8]), we have focused our
effort on two specific points that we present in this article:

- Developing a flexible approach to specify the layout of the
  presentation. The design is based on a template approach
  which allows us to keep a clear separation between our
  delivery platform and the delivery device; and

- Developing a new approach to resolve the media objects'
  synchronization. The originality consists of using the
  underlying discourse structure of the presentation to link the
  media objects together and determine their behavior.

2. THE SURVEILLANCE DOMAIN
Surveillance operators work with information that comes from
various sources and that may be visualized on different displays.
The primary source of information is real-time information from
sensors such as radars that indicate the location of airborne
objects in the area. In addition to the sensors' returns, other
information may come from a variety of sources. It is the role of
the surveillance operators to collect and integrate this additional
information to facilitate the interpretation of the battle space. The
combined output of the surveillance operators' work is the
Recognized Air Picture. To facilitate the operators' task, it was
proposed to provide their interface with mechanisms that select
and aggregate relevant information from a variety of sources, and
then deliver it in one integrated display rather than distributing
it across several screens.

3. PLATFORM OVERVIEW
To effectively deliver information on the operators' interface, it
was necessary to adapt their current environment centered on the
radar display. The aim was not to design the next generation of
their interface, but to propose a solution for how the additional
information could be delivered to the operators, and how the
operators would interact with it. Based on a study of their current
interfaces and work practices (internal technical report), we
proposed to add to the radar display an integrated workspace that
gathers and delivers information tailored by the VDP, as
illustrated in Figure 1. The objective was simple: minimize visual
clutter and only display task- and situation-tailored information.
This integrated workspace was also a way to overcome the
potential intrusiveness of unsolicited delivery of information.

Figure 1: Graphical User Interface Organization

Figure 2 provides an overview of the platform architecture. The
platform is composed of five components:

Simulator: The primary input to the GUI is from a simulator
module, which generates tracks (i.e., simulated series of radar
returns, each associated with a single individual aircraft).

Data Repository: The GUI uses a series of data stores to record
and maintain information about objects within a simulation (e.g.,
tracks).

Graphical User Interface (GUI): The GUI is the interface through
which operators interact with the system. It is a prototype, which
implements a subset of features that allow us to demonstrate the
technology without seeking absolute real-world verisimilitude. Our
GUI is implemented in Java. The display of spatial information is
handled using the OpenMap toolkit (http://openmap.bbn.com).

Application Programming Interface (API): The API operates as an
interface layer between the VDP and the GUI. Its role is to
interpret information sent by the VDP that will be rendered by the
GUI. To this end, we use BeanShell (http://www.beanshell.org), a
free Java source interpreter. By using the BeanShell interpreter,
we create a very flexible scripting interface for the GUI.

Virtual Document Planner (VDP): The VDP generates multimedia
information tailored to operators' tasks. It is based on a typical
Natural Language Generation (NLG) architecture, where the
linguistic resources are separate from the engine. The
presentations generated are scripting commands that are
interpreted by the API and then rendered by the GUI. The VDP
has been used in different applications to generate a variety of
documents.

4. GENERATING A PRESENTATION
The VDP operates in three stages as illustrated in Figure 2: a) the
planning process; b) the assembly process; and c) the realization,
which consists of producing a script of the presentation that is
then sent to the API for interpretation and rendered by the GUI. In
the following, we briefly outline our approach to assembling and
realizing the presentation.

Figure 2: Overview of the generation process

The first step when generating a presentation of information is to
decide what information is to be included, and how to convey this
information. In the VDP, this takes place in two planning stages:
1) the content planning, in which discourse rules are selected and
combined to specify what content should be included (i.e., what
needs to be conveyed) and how to organize it coherently; 2) the
presentation planning, in which presentation rules are selected and
combined to specify the most appropriate way to convey the
content selected. It is also during the presentation planning that
the content is actually retrieved from the data resources.

4.1 The Assembly Process
Once the presentation has been planned, the next step consists of
assembling the presentation, and, in particular, specifying how the
content is to be laid out. In this application, the content consists
mostly of figures, labels and short specifications. It is mainly
presented in windows and delivered on a GUI. To lay out the
content in a window and precisely control every detail of the
window layout, we use a template approach. The template
representation used by the VDP is different from other approaches
as it uses a generic representation that is not expressed in any
specific programming language. The templates are simple
declarative XML files, and combined together, they specify how a
particular presentation layout has to be achieved. To realize the
actual layout, the VDP templates are then associated with XSLT
style sheets (XSLT is an extensible style sheet language for
defining XML document transformation and presentation), which
describe how to display content in a syntax understood by the
delivery device (in our case the GUI environment). In doing so,
we ensure a clear separation between the VDP and the rendering
device.

The assembly is a bottom-up process. It starts at the leaves of the
discourse tree built during the planning process, selects the
templates to be used, and step by step embeds the templates
together, specifying how each piece of information must be laid
out. When the VDP templates are associated with the
corresponding style sheets, this also automatically specifies how
the style sheets are to be combined.

While assembling the templates, the assembly process further
annotates the discourse tree structure. The annotations indicate,
for each node, which media object was built. These
annotations are used later on by the realization process when
interpreting the discourse relations holding between the nodes
(i.e., discourse segments). The final step for the VDP is the
realization of an output script that contains all the presentation
specifications. It is then interpreted by the API and eventually
rendered on the GUI.

4.2 The Realization Process
The realization process is responsible for generating an output
script embedding all the information necessary to render the
presentation on the display. This is done during a second
bottom-up pass through the discourse tree. At this point, the
annotations made during the assembly process are interpreted and
the discourse relations holding between nodes are processed. To
ensure a coherent interaction process with the user, it is important
to coordinate the behavior of the media objects constituting the
presentation. An easy way to determine how to link them together
is to use the discourse tree structure created during the planning
process. Our approach consists of reasoning about the discourse
structure underlying the presentation to derive the relationship
between the media objects, and then using the type of discourse
relation that links a media object to another in the discourse tree
structure to assign them a particular behavior.

There exist various languages to specify the temporal dimension
of interactive multimedia presentations (e.g., [3], [7]), enabling
very detailed specifications of the media object synchronization.
However, our approach to assigning a behavior to media objects
is different in a number of ways:

- First, in coordinating the media objects' behavior based on the
  underlying discourse structure, it provides a motivation as to
  why and how media objects should be synchronized. The
  discourse structure provides us with an understanding of how
  information is related together, and how each piece of
  information is contributing to the whole;

- Second, it provides an easy way to define or change objects'
  behavior by modifying the association between the discourse
  relations and the behavior rules, or by changing the discourse
  relation holding between two nodes;

- Finally, it ensures consistency in the way all the media objects
  behave across the presentation. Indeed, two objects linked by
  the same discourse relation will share and also exhibit the
  same behavior.

Once the output script has been generated, the file is interpreted
by the API. At this stage, the behavior rules that control the media
object synchronization are activated. They ensure that media
objects of similar importance behave in a similar way. In doing
so, this approach enables the VDP to deliver not just content, but
also to specify interaction and synchronization constraints.

5. CONCLUSION
We have presented the VDP platform designed for the delivery of
multimedia information. This platform has been integrated within
a surveillance environment to augment the operators' graphical
interface with information delivery mechanisms.

6. ACKNOWLEDGMENTS
We wish to thank Robert Tot for his work on the task analysis and
modeling and the RAAF surveillance operators and fighter
controllers who participated in the task analysis interviews. We
also wish to thank Cécile Paris for participating in discussions and
for her comments on an early version of this paper. We
acknowledge the support of Boeing (Research Agreement
WT-SIDA 010.2) and CSIRO.

7. REFERENCES
[1] André, E. and Rist, T. (1995). Generating coherent
    presentations employing textual and visual material. In
    Artificial Intelligence Review, Special volume on the
    Integration of Natural Language and Vision Processing,
    9(2/3), 147-165.
[2] Colineau, N. and Paris, C. (2003). Task-Driven Information
    Presentation. In Proc. of OZCHI'03, Brisbane, Australia,
    Nov 25-28.
[3] Dalal, M., Feiner, S., McKeown, K., Zhou, M., Höllerer, T.,
    Shaw, J., Feng, Y. and Fromer, J. (1996). Negotiation for
    automated generation of temporal multimedia presentations.
    In ACM Multimedia Conference, Boston, MA, USA, 55-64.
[4] De Carolis, B., De Rosis, F., Andreoli, C., Cavallo, V. and
    De Cicco, M.L. (1998). The dynamic generation of hypertext
    presentations of medical guidelines. In The New Review of
    Hypermedia and Multimedia, 4, 67-88.
[5] Mann, W. and Thompson, S. (1988). Rhetorical Structure
    Theory: Towards a Functional Theory of Text Organization.
    In Text, 8(3), 243-281.
[6] Rutledge, R., Bailey, B., van Ossenbruggen, J., Hardman, L.
    and Geurts, J. (2000). Generating Presentation Constraints
    from Rhetorical Structure. In Proc. of the 11th ACM
    Conference on Hypertext and Hypermedia, San Antonio,
    Texas, May 30-June 3, 19-28.
[7] SMIL (Synchronized Multimedia Integration Language).
    http://www.w3.org/TR/2005/REC-SMIL2-20050107/
[8] Zhou, M., Wen, Z. and Aggarwal, V. (2005). A graph-matching
    approach to dynamic media allocation in intelligent
    multimedia interfaces. In Proc. of IUI'05, January 9-12, San
    Diego, California, US, 114-121.
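
The realization pass described in Section 4.2 can be illustrated with a minimal Java sketch. It is not the VDP implementation. The class and method names, the behavior labels (show-on-demand, collapsed-by-default, always-visible) and the media object identifiers are all assumptions introduced here for illustration; only the relation names (elaboration, background) are taken from RST. The sketch assumes that each discourse tree node carries the relation linking it to its parent and the media object annotation added during assembly, and that a lookup table associates relations with behavior rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: deriving media object behavior from discourse relations. */
final class DiscourseBehaviorSketch {

    /** A discourse tree node, annotated with the media object built for it. */
    static final class Node {
        final String relation;     // relation to the parent node (null at the root)
        final String mediaObject;  // assembly annotation; null if no object was built
        final List<Node> children = new ArrayList<>();

        Node(String relation, String mediaObject, Node... children) {
            this.relation = relation;
            this.mediaObject = mediaObject;
            this.children.addAll(Arrays.asList(children));
        }
    }

    /** Assumed fallback for the root and for relations without a rule. */
    static final String DEFAULT_BEHAVIOR = "always-visible";

    /** Assumed association between discourse relations and behavior rules. */
    static final Map<String, String> BEHAVIOR_RULES = new HashMap<>();
    static {
        BEHAVIOR_RULES.put("elaboration", "show-on-demand");
        BEHAVIOR_RULES.put("background", "collapsed-by-default");
    }

    /** Returns media object -> behavior, in bottom-up visiting order. */
    static Map<String, String> realize(Node root) {
        Map<String, String> behaviors = new LinkedHashMap<>();
        visit(root, behaviors);
        return behaviors;
    }

    /** Bottom-up (post-order) pass: children are handled before their parent. */
    private static void visit(Node node, Map<String, String> behaviors) {
        for (Node child : node.children) {
            visit(child, behaviors);
        }
        if (node.mediaObject != null) {
            String behavior = node.relation == null
                    ? DEFAULT_BEHAVIOR
                    : BEHAVIOR_RULES.getOrDefault(node.relation, DEFAULT_BEHAVIOR);
            behaviors.put(node.mediaObject, behavior);
        }
    }

    /** A made-up tree: one window elaborated by two panels, plus background. */
    static Node exampleTree() {
        return new Node(null, "trackSummaryWindow",
                new Node("elaboration", "kinematicsPanel"),
                new Node("elaboration", "flightRoutePanel"),
                new Node("background", "aircraftSpecPanel"));
    }

    public static void main(String[] args) {
        realize(exampleTree()).forEach((object, behavior) ->
                System.out.println(object + " -> " + behavior));
    }
}
```

In this sketch, the two panels attached by the same relation receive the same behavior, which mirrors the consistency argument of Section 4.2, and changing one entry of the lookup table changes the behavior of every object attached by that relation.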