Technical Report - Ulmer Informatik-Berichte No. 99-01
Department of Computer Science, University of Ulm, Germany.
A Comparison of Multimedia Document Models Concerning
Advanced Requirements
Susanne Boll, Wolfgang Klas, Utz Westermann
Databases and Information Systems (DBIS),
Computer Science Department, University of Ulm, Germany
{boll, klas, westermann}@informatik.uni-ulm.de
Abstract
Existing multimedia document models like HTML, MHEG, SMIL, and HyTime lack appropriate
modeling primitives that meet specific requirements given by advanced multimedia information
system applications. In traditional multimedia applications, multimedia document models just had to
cope with the modeling of the temporal, spatial, and interactive course of a multimedia presentation.
However, we seriously question whether existing models fit the needs of next generation multimedia
applications that bring up requirements like reusability of multimedia content in different
presentations and contexts, and adaptation to user preferences. In this paper, we motivate and present new
requirements stemming from advanced multimedia applications and the resulting consequences for
multimedia document models. Along these requirements, we discuss HTML, HyTime, MHEG, SMIL,
and ZYX, a new model that has been developed with special focus on reusability and adaptation. The
analysis and comparison of the models show the limitations of existing models, point the way to the
need for new flexible multimedia document models, and throw light on the many implications for
authoring systems, multimedia content management, and presentation.
Keywords:
Multimedia document model, multimedia databases, educational medical applications.
1 Introduction
The initial requirements for multimedia documents were the modeling of the temporal and spatial course
of a multimedia presentation. Soon the importance of interactivity for multimedia applications was
understood, and interaction modeling formed an additional requirement. To offer suitable support for
multimedia applications, the development of multimedia document models began. On the one side,
standardization activities started, and on the other side, commercial tools were evolving. The development
and the passing of standards took quite a long time, while in the meantime very sophisticated commercial
multimedia authoring tools came to the market that support their own proprietary format and only
now start to cross the bridge to document standards.
However, we think that the standards and the commercial tools developed so far only partially offer
the necessary prerequisites for multimedia document modeling, as "next generation" multimedia
applications extend the requirements given above by far: demand for reusability of the media including entire
documents and parts of documents, modeling of adaptation to user-specific needs and context-dependent,
fine-grained reuse of the multimedia material, and widespread use on the Internet.
Why do we consider these to be the new requirements for multimedia documents? As authoring of
multimedia information is a very time-consuming and costly task, the reuse of material is definitely of
high interest simply from an economic point of view. But reuse by means of "cut and paste" obviously
cannot be a solution; rather, distinct and fine-grained reuse of multimedia content is highly demanded.
Personalization and adaptation of information systems to personal needs and personal interests become
more and more important (e.g., [Bul98]). The trend to offer a user the most suitable and narrowed-down
multimedia information can be seen in research prototypes from different research areas, e.g., a
user-adaptive diagram assistant [C-L98], an adaptive tutorial agent [SSW98], adaptive textbooks on the WWW
[EBS97], a personalized newspaper [KBA93], personalized delivery of news [KLAV98], etc. Personalization
and adaptation in consequence call for the enhancement of multimedia content with metadata to allow
for the targeted context-specific selection of multimedia content. Another new requirement for a document
model is its Internet applicability, i.e., how it can cope with the demands of the heterogeneous environment
of the Internet.
Within our project "Gallery of Cardiac Surgery" (Cardio-OP)¹, which aims at the development of an
Internet-based and database-driven multimedia information system in the domain of cardiac surgery,
we find a representative application that explicitly requires a model for multimedia material which can
be extensively reused in different contexts. Based on a multimedia content repository, the system is
going to serve as a common information and education base for its different types of users (physicians,
medical lecturers, students, and patients), who are provided with multimedia information according to
their user-specific request to the multimedia information system, their different understanding of the
selected subject, their location, and their technical infrastructure. For example, a high-quality multimedia
presentation dynamically composed during a lecture at the university campus should be available for
students at home for revision, although they do not need the high quality of videos or images at home.
Therefore, either the multimedia material must be delivered to the student with a lower quality and
lower bitrate, or high data volume parts like video are replaced by "comparable" but less voluminous
parts like a slide show.
To achieve this kind of functionality, a suitable multimedia document model is needed that allows for
flexible and context-dependent reuse of a multimedia document and parts of it. On our way to such a "next
generation" multimedia document model, we first tracked down and identified the advanced requirements
by looking at upcoming advanced applications like Cardio-OP. When investigating the applicability of the
existing models HTML, HyTime, MHEG, and SMIL, we unfortunately find serious limitations and
drawbacks. It is quite a challenge to push these limits and to define a multimedia document model
providing modeling primitives that go beyond those found in existing models. An example of such a
model is ZYX, presented in [BK99].
The implications of approaches trying to resolve the shortcomings of existing models are manifold:
There arises an urgent need for suitable authoring systems that support fine-grained reuse of multimedia
content, adaptability of content to user needs and individual interest, and, as a direct consequence, the
presentation-neutral representation of material, e.g., in a database. The latter is analogous to the
principle of data independence of applications well known from database systems. What is known as
data independence for "traditional" applications must be enhanced by presentation independence for
multimedia applications. In addition, presentation-neutral representation of multimedia content directly
impacts the design of presentation tools, since these have to "deliver" flexibility and adaptability to end
users.
In this paper, we motivate the development of next generation multimedia document
models like ZYX [BK99] by identifying advanced application requirements, analyzing existing document
models to show the limits of current approaches, and calling for concerted action on developing next
generation authoring and management tools for advanced multimedia applications.
The remainder of the paper is organized as follows: Section 2 presents the new requirements for
multimedia document models. Section 3 introduces the reader to the different models for multimedia
documents we compare in this paper: HTML, SMIL, MHEG-5, HyTime, and ZYX. Section 4 presents
the comparison of the models along the requirements identified in Section 2. The paper concludes with
a reflection on the analysis and points the way to the future of multimedia document models.
2 Requirements to Next Generation Multimedia Document Models
In this section, we identify requirements for multimedia document models. These can be divided into
traditional requirements, which we consider to be imperative for any multimedia document model, and
advanced requirements, which we expect to be demanded more and more by future multimedia
applications. The availability of a temporal model, a spatial model, as well as support for the modeling of
interaction are traditional requirements, while reusability of multimedia document content, adaptation
to user-specific needs, and presentation-neutral representation of multimedia document content are
advanced requirements. Each of these requirements is motivated and illustrated in its different facets in
the following subsections. The requirements form a metric along which selected multimedia document
models are analyzed in Section 4.
¹ Cardio-OP - Gallery of Cardiac Surgery - is partially funded by the German Ministry of Research and Education,
grant number 08C58456. Our project partners are the University Hospital of Ulm, Dept. of Cardiac Surgery and Dept.
of Cardiology, the University Hospital of Heidelberg, Dept. of Cardiac Surgery, an associated Rehabilitation Hospital, the
publishers Barth-Verlag and dpunkt-Verlag, Heidelberg, FAW Ulm, and ENTEC GmbH, St. Augustin. For details see also
URL www.informatik.uni-ulm.de/dbis/Cardio-OP/
2.1 Temporal Model
As the presentation of multimedia documents is time-dependent, one of the basic requirements for a
multimedia document model is the modeling of the temporal course of the presentation. Thus, a temporal
model must be provided to describe temporal dependencies between the media elements that a multimedia
document comprises. We find three types of temporal models: point-based temporal models,
interval-based temporal models, and event-based temporal models.
In the point-based model, the temporal extent of each media element in the multimedia document is
modeled by points in time. These determine at which points on the time axis the presentation
of a media element starts and ends, respectively. For any two points in time, one of the relationships
before (<), after (>), or equals (=) holds. This is a simple representation of time with a small number of
temporal relationships.
Existing representations of temporal aspects in the context of multimedia presentations are mainly
based on some or all of the 13 binary temporal relations between time intervals as defined by Allen [All83].
These models (e.g., Object Composition Petri Nets (OCPN) [LG93]), however, do not support time
intervals of unknown duration that occur, for instance, in the context of user interaction in multimedia
presentations. Therefore, enhanced interval-based temporal models have been proposed to handle open time
intervals and indefinite interval relationships [DK95, HFK95, WR94].
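A minimal sketch of how the 13 interval relations can be told apart, assuming intervals of known,
positive duration given as (start, end) pairs. The function name allen_relation and the relation
spellings are illustrative choices and are not taken from [All83].

```python
def allen_relation(a, b):
    """Return the name of the Allen relation holding between intervals a and b.

    Each interval is a (start, end) pair with start < end (assumption: the
    duration is known, which is exactly what open intervals would violate).
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have positive duration")
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only proper overlap remains: all four endpoints differ.
    return "overlaps" if a_start < b_start else "overlapped-by"
```

For example, a caption shown from second 3 to second 4 of a five-second video stands in the
relation "during" to that video.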
In an event-based model of time, events determine the temporal course of the presentation. An event
is connected to actions, and when an event occurs, e.g., a video reaches a certain point in time, the
corresponding actions, typically the start and stop of the presentation of other media elements, are
carried out.
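The event-based idea can be sketched as a small dispatcher that maps events to start and stop
actions. The class EventScheduler, its methods, and the event names are hypothetical and serve
only to illustrate the principle; they do not reproduce any of the standards discussed here.

```python
from collections import defaultdict


class EventScheduler:
    """Toy event-based temporal model: events trigger start/stop actions."""

    def __init__(self):
        # event name -> list of (action, media element) pairs
        self._actions = defaultdict(list)
        # media elements that are currently being presented
        self.active = set()

    def on(self, event, action, media):
        """Connect `action` ("start" or "stop") on `media` to `event`."""
        if action not in ("start", "stop"):
            raise ValueError("action must be 'start' or 'stop'")
        self._actions[event].append((action, media))

    def fire(self, event):
        """Signal that `event` occurred and carry out the connected actions."""
        for action, media in self._actions[event]:
            if action == "start":
                self.active.add(media)
            else:
                self.active.discard(media)


scheduler = EventScheduler()
scheduler.on("video.reached_10s", "start", "caption")
scheduler.on("video.ended", "stop", "caption")
scheduler.fire("video.reached_10s")  # the caption is now presented
```

Note that the point in time at which "video.ended" fires need not be known in advance, which is
what distinguishes this style from a fixed timeline.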
Another way to specify temporal relations between media elements is by the use of scripts, i.e., programs
written in a scripting language which can comprise temporal operations. If the scripting language forms
a complete programming language, this mechanism allows for very complex and powerful specifications
of temporal dependencies between media elements.
2.2 Spatial Model
If a presentation consists of visual media elements, not only the temporal synchronization of these elements
is of interest but also their spatial positioning on the presentation medium (e.g., a window). This positioning
can be specified by the use of a spatial model. In general, three approaches to spatial models can be
distinguished: absolute positioning, directional relations, and topological relations.
With absolute positioning, the media element is placed on the presentation area at a fixed absolute
position specified by a coordinate pair. To handle overlapping, a third value may be introduced by which
the ordering of overlapping media elements is defined.
A more flexible way to define the spatial positioning of visual media elements is the specification of
directional relations [PTSE95, PS94], like north, north-west, etc. At a finer granularity, by introducing
relations like strong-north and weak-north to specify overlapping, 169 different directional relations between
two rectangles in 2D space can be distinguished [PTSE95].
Another way to define spatial relationships is by the use of topological relations [EF91]. Between any
two continuous region objects, the following eight topological relations can be distinguished:
disjoint, meet, overlap, covers, covered-by, contains, inside, and equal.
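For the special case of axis-aligned rectangles, the eight relations can be sketched as follows.
The restriction to rectangles with positive width and height, the tuple layout
(x_min, y_min, x_max, y_max), and the function name topological_relation are assumptions made for
illustration; [EF91] defines the relations for general regions.

```python
def topological_relation(a, b):
    """Classify two axis-aligned rectangles (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    if (ax1, ay1, ax2, ay2) == (bx1, by1, bx2, by2):
        return "equal"
    # The rectangles share no point at all.
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"
    # They share boundary points only; the interiors do not intersect.
    if ax2 == bx1 or bx2 == ax1 or ay2 == by1 or by2 == ay1:
        return "meet"
    a_in_b = bx1 <= ax1 and ax2 <= bx2 and by1 <= ay1 and ay2 <= by2
    b_in_a = ax1 <= bx1 and bx2 <= ax2 and ay1 <= by1 and by2 <= ay2
    if a_in_b:
        # "inside" requires that the boundaries do not touch.
        strictly = bx1 < ax1 and ax2 < bx2 and by1 < ay1 and ay2 < by2
        return "inside" if strictly else "covered-by"
    if b_in_a:
        strictly = ax1 < bx1 and bx2 < ax2 and ay1 < by1 and by2 < ay2
        return "contains" if strictly else "covers"
    return "overlap"
```

A subtitle region placed fully within a video region without touching its border is "inside" the
video region; if it is aligned with the bottom edge, it is "covered-by" the video region.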
2.3 Interaction
A distinct feature of a multimedia document model is the ability to specify user interaction in order to
let a user choose between different presentation paths. Multimedia documents without user interaction
are not very interesting, as the course of their presentation is exactly known in advance and, hence, could
be recorded as a movie. For the modeling of user interaction, one can identify at least two basic types of
interaction: navigational interactions and design interactions.
With navigational interactions, a user can determine the flow of a multimedia presentation. An example
is the selection of a link or an item from a menu to decide which presentation path is to be followed.
Design interactions influence the visual and audible layout of a presentation. Examples are the
adjustment of speaker volume, fonts, scaling of images, and the like.
2.4 Reusability
As motivated in the introduction, reusability of multimedia content is a desired feature of a next
generation multimedia document model. Reusability of document content can be characterized along three
dimensions: the granularity of reuse, the kind of reuse, and the selection and identification of reusable
components.
Granularity: The granularity of reuse determines what can be reused. Regarding multimedia document
models, we can distinguish at least three levels of granularity of reusable components: reuse of complete
multimedia documents, reuse of fragments of multimedia documents like single scenes or chapters, and
reuse of individual atomic media elements such as a video or audio.
Kind of re-usage: For all three levels of granularity, we distinguish between different ways of how to
reuse material for the composition of new documents: identical re-usage, i.e., the components are reused
including all temporal, spatial, design, and interaction relationships and constraints as originally specified
by the author(s), and structural re-usage, i.e., we separate the layout from the structure of components
and reuse only the structural parts.
Selection and identification: Before we can reuse components, we have to identify and select them
within an information system. This calls for metadata and a mechanism for classifying, indexing, and
querying components. Hence, a document model should provide support for annotation of reusable
components with metadata.
2.5 Adaptation
The presentation of multimedia documents should depend on the user context; hence, the multimedia
presentation needs to be adapted to this user context. It is also of interest whether all possible
adaptation alternatives have to be known and modeled at authoring time of a multimedia document or whether
they are left for evaluation at the actual presentation time.
Parameters of adaptability: For the user context, we distinguish between adaptation to personal
interest and adaptation to technical infrastructure. Consider a professor on campus who is interested in
seeing in-depth multimedia material on coronary artery bypass grafting, and an undergraduate student at
home who needs to get only an abstraction of the same material. In this example, the presentation needs to
be adapted to the personal interest, here identified by the personal interest "coronary artery bypass grafting"
and the professional levels "professor" and "student". In addition to this kind of semantic adaptation of
multimedia documents, the multimedia presentation can be adapted according to the technical capabilities
of the environment a user is working in, i.e., "on campus" or "at home". The professor may run a
high-quality presentation on the university campus, which provides excellent network bandwidth and computing
power, whereas the student views the presentation at home, where he does not have the same excellent
technical prerequisites. A document model supports these types of adaptation if it sufficiently supports the
modeling of user-specific and system-specific parameters as "input parameters" for adaptation.
Definition of presentation alternatives: Depending on when the different "alternatives" that can be
exploited for adaptation are defined, we distinguish between static adaptation and dynamic adaptation.
With static adaptation, the adaptable alternatives must be known and included in the document at
authoring time, whereas for dynamic adaptation, the available alternatives are determined according to the
specific context at presentation time. One could therefore say that models that allow static and/or
dynamic adaptation allow for "early and/or late adaptation binding".
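The difference between early and late binding of alternatives can be sketched as follows. All
names, dictionary keys, and bandwidth figures are hypothetical; the sketch only assumes that the
context carries a professional level and an available bandwidth as input parameters.

```python
def select_static(alternatives, context):
    """Static adaptation: choose among alternatives fixed at authoring time.

    `alternatives` is the ordered list stored inside the document itself.
    """
    for alt in alternatives:
        if alt["min_bandwidth_kbps"] <= context["bandwidth_kbps"]:
            return alt["media"]
    return None


def select_dynamic(repository, topic, context):
    """Dynamic adaptation: determine the alternatives only at presentation time.

    `repository` stands for a metadata-annotated content store that may have
    grown since the document was authored.
    """
    candidates = [
        item for item in repository
        if item["topic"] == topic
        and item["level"] == context["level"]
        and item["min_bandwidth_kbps"] <= context["bandwidth_kbps"]
    ]
    # Prefer the most demanding candidate the connection can still carry.
    best = max(candidates, key=lambda item: item["min_bandwidth_kbps"], default=None)
    return best["media"] if best else None
```

In the static case the document must be edited to offer a new alternative; in the dynamic case it
is sufficient to add suitably annotated material to the repository.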
2.6 Presentation-neutral Representation
Reuse of multimedia content in different contexts does not mean that the material is always presented
identically. Rather, reuse of content may require structural reuse of material and assignment of a different
visual and audible layout according to the context. In addition, advanced distributed multimedia
applications often face a heterogeneous environment with regard to operating systems and hardware platforms.
It is desirable that the multimedia material of such an application can be presented within this
heterogeneous environment with minimal implementation effort. Thus, it makes perfect sense to try to reuse
existing presentation software, e.g., HTML browsers or MHEG engines, on these systems.
As a consequence, the multimedia material has to be modeled in a presentation-neutral way, i.e.,
independent of the actual realization (layout) of a presentation. This is a challenging problem as it calls
for automatic conversion of the multimedia document model used for the presentation-neutral description
of multimedia content into the multimedia document model used for presentation of the multimedia
content. In general, two major characteristics influence the convertibility between multimedia document
models: multimedia functionality and semantic level of a model [RvOB97].
Multimedia functionality: The multimedia functionality of a multimedia document model describes
the expressiveness of its modeling primitives. In a conversion process, this means that if the target
document model does not offer multimedia functionality equivalent to that offered by the source model,
the conversion will be lossy.
Semantic level: The semantic level of a multimedia document model plays an important role in the
automatic conversion for presentation. If the target document model of such a conversion provides a
semantic description of multimedia content on a higher level than the source document model, i.e., a
description of structure rather than a description of presentation, the conversion requires the analysis
of the document specified in the source model and the derivation of its semantics for its encoding in
the target model. In general, this requires knowledge about the multimedia content that often only the
author will have. In these cases, automatic conversion will not be possible. However, the automatic
conversion of a multimedia document represented on a high level of semantics into a model based on a
comprehensive set of low-level semantic constructs can be performed much more easily. In order to avoid the
problems of automatic conversion, the presentation-neutral representation of multimedia content should,
besides covering rich multimedia functionality, take place on a high level of semantics.
3 Existing Multimedia Document Models
In this section, we briefly present the most important and relevant existing standards and data models for
multimedia documents. We give an introduction to the existing document models HTML, SMIL, MHEG-5,
and HyTime, and also to ZYX, an example of a model offering more advanced modeling primitives. These
models will be analyzed and compared in Section 4 along the requirements of the previous section.
3.1 HTML
The Hypertext Markup Language (HTML) [RLJ98] is based on SGML [ISO86] and defines a syntax
to enrich text pages with structural information using SGML elements. For instance, elements can be
inserted into the text to organize it into paragraphs, to mark headings of different levels, to define tables,
and to define quotations. Furthermore, it is possible to include various kinds of objects like media elements
(e.g., images, videos, and audio tracks), Java applets, ActiveX components, and scripts. In addition,
HTML allows for the definition of hyperlinks between documents. These hyperlinks are a means to
define interactions, i.e., an interaction (e.g., a mouse click) with the link anchor results in the presentation
of the document specified by the link target. Scripts, applets, and ActiveX components included with
a document are executed at presentation time by the presentation environment, the so-called HTML
browser software. However, the HTML standard defines neither the syntax nor the semantics of the scripting
languages, so the presentation behaviour of an HTML page that includes scripts depends on the employed
browser software.
There are efforts by the large HTML browser software vendors Netscape and Microsoft to allow for the
manipulation of the structure, layout, and content of an HTML document with scripting languages. Thus,
scripts can dynamically manipulate HTML documents, a technique which is also called Dynamic HTML
(DHTML). The price for this increased flexibility is that portability problems arise due to differences
between the scripting languages employed by Microsoft and Netscape.
3.2 SMIL
The Synchronized Multimedia Integration Language (SMIL) [HBB+98] is a W3C standard which aims
at synchronized multimedia presentations on the web. A SMIL document provides synchronization of
continuous media elements and constitutes an integrated presentation. SMIL is defined by an XML DTD
[BPSM98] and, hence, the language can be understood as a set of element definitions specified in terms
of XML. SMIL defines schedule elements to describe temporal synchronization between media elements.
Furthermore, the spatial layout of the media elements can be defined. SMIL also allows links to be specified
between documents or parts of documents, which are equivalent to HTML links. An interesting feature of
SMIL is the switch element, which is a simple means for modeling alternatives in the course and quality of
a presentation. With the help of switch elements, an author can specify different presentation alternatives
among which one is chosen at presentation time according to external parameters.
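The selection performed for a switch element can be sketched as follows, assuming that the only
external parameter is the available bitrate, expressed through the SMIL 1.0 test attribute
system-bitrate, and that the first acceptable child in document order is chosen. The file names
and the function choose_alternative are hypothetical; a real player evaluates further test
attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical switch with three alternatives of decreasing data volume.
SMIL_FRAGMENT = """
<switch>
  <video src="surgery-high.mpg" system-bitrate="1000000"/>
  <video src="surgery-low.mpg" system-bitrate="56000"/>
  <img src="surgery-slides.gif"/>
</switch>
"""


def choose_alternative(switch_xml, available_bitrate):
    """Return the src of the first child whose system-bitrate test passes."""
    switch = ET.fromstring(switch_xml.strip())
    for child in switch:
        required = child.get("system-bitrate")
        # A child without a test attribute is always acceptable.
        if required is None or int(required) <= available_bitrate:
            return child.get("src")
    return None
```

Because the first acceptable child wins, the author orders the alternatives from most to least
demanding and places an unconditional fallback last.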
3.3 MHEG-5 and MHEG-6
MHEG-5 [ISO95, JR95] is an adaptation of the MHEG-1 standard [MBE95] to the needs of
video-on-demand and kiosk applications for set-top boxes and low-end PCs. MHEG-5 encodes applications in their
final form and aims at an efficient realization of MHEG-1, attracting the interest of the telecommunication
and entertainment industries in this standard. MHEG-5 provides an object-oriented data model for
multimedia documents. The standard defines a hierarchy of MHEG-5 classes. This hierarchy comprises
classes for various uses. For example, there are classes that represent media elements like videos and
audios, classes that represent interaction elements like buttons, and even classes that provide the variable
functionality of programming languages. Classes possess attributes, can perform actions (which closely
resemble methods in object-oriented programming languages), and fire events. An MHEG-5 document
is a collection of instances of these classes organized in scenes, which are the main structural primitives.
A scene corresponds to a "page" on a screen and, hence, only one scene can be presented at a time. In
addition, each MHEG-5 document features one instance of the class Application defining the
entry point for the document's presentation. Moreover, this application object can contain objects which are
global to every scene. The presentation behaviour of an MHEG-5 document is defined by means of
links, which resemble event-condition-action rules.
MHEG-6 [ISO96, Hof96] is an extension of MHEG-5 that introduces an interface between an MHEG-5
engine and a Java Virtual Machine. With MHEG-6, it is possible to include Java programs in an
MHEG-5 document. Such a program has access to the objects of the document and, hence, can influence
its presentation behaviour.
3.4 HyTime
HyTime [ISO92, DD94, NKN91] is a standard which allows for the description of the structure of
multimedia documents. Based on SGML [ISO86], HyTime provides a well-defined set of primitives which
allows for the interlinking of media objects without specifying the encoding of the media objects. The
primitives provided by HyTime are offered by means of architectural forms and are organized in terms
of modules. Architectural Forms (AF) are HyTime elements with pre-defined multimedia semantics and
attributes. An AF can be used in any SGML DTD by extending an SGML element type by an attribute
HyTime bearing the name of the AF to be used. In that way, the element type inherits the semantics and
attributes of the AF.
The modules of HyTime are the Base-Module, which defines the basic concepts of HyTime; the
Location-Address-Module, which implements the powerful construct of locators, providing an abstract mechanism
for addressing external document objects; the Hyperlink-Module, which implements the concept of links;
the Finite-Coordinate-Space-Module, which provides means for the synchronized presentation of media
objects based on n-dimensional coordinate spaces; the Event-Projection-Module, which allows event
schedules defining the temporal execution of a presentation to be transformed; and the
Object-Modification-Module, which allows presentation objects to be transformed, e.g., by fading.
3.5 ZYX
The ZYX multimedia document model [BK99] has been developed by our group in the context of the
Cardio-OP project, which aims at the development of a database-driven multimedia information system
with special needs for reusability, adaptation, interaction, and presentation-neutral description of
multimedia content. ZYX describes complete multimedia documents or fragments of them by means of a tree
(for an illustration, see Figure 1). The nodes of the tree are called presentation elements. Each
presentation element has a binding point associated with it. Such a binding point can be bound to one variable
of another presentation element, thus creating the edges of the tree. The presentation elements are the
generic elements of the model. They can represent atomic media elements (e.g., videos, images, and
text) or more complex compositions of media elements. Another group of presentation elements, the
operator elements, combine presentation elements with certain semantics. There are operator elements that
allow for temporal synchronization, definition of interaction, adaptation, and for the spatial, audible, and
visible layout (the so-called projector elements) of the document.
It is possible to delay the process of variable binding by leaving variables unbound. This allows
for the definition of templates which can be customized to a specific problem at a later point in time.
Furthermore, a tree can be encapsulated by a complex media element, which can then be used in other
trees (see Figure 2) like any other presentation element. Unbound variables of an encapsulated tree are
exported by the complex media element, allowing for the encapsulation of templates. Thus, a complex
media element is, in a sense, a black-box view of a ZYX tree.
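The interplay of variables, binding points, and templates can be sketched with a small tree
structure. The class PresentationElement, its attributes, and its methods are hypothetical names
chosen for illustration; they are not the data model of [BK99].

```python
class PresentationElement:
    """Node of a ZYX-style tree; `variables` maps variable names to child nodes."""

    def __init__(self, name, variable_names=()):
        self.name = name
        # A value of None marks an unbound variable.
        self.variables = {v: None for v in variable_names}
        # (parent, variable) once this element's single binding point is used.
        self.bound_to = None

    def bind(self, variable, child):
        """Bind the binding point of `child` to one of this element's variables."""
        if variable not in self.variables:
            raise KeyError(f"{self.name} has no variable {variable!r}")
        if self.variables[variable] is not None:
            raise ValueError(f"variable {variable!r} is already bound")
        if child.bound_to is not None:
            raise ValueError(f"{child.name} is already bound elsewhere")
        self.variables[variable] = child
        child.bound_to = (self, variable)

    def unbound_variables(self):
        """Collect (element name, variable) pairs left unbound in this subtree."""
        result = []
        for variable, child in self.variables.items():
            if child is None:
                result.append((self.name, variable))
            else:
                result.extend(child.unbound_variables())
        return result


# A sequence of three slides in which the third slot is left open.
seq = PresentationElement("seq", ["v1", "v2", "v3"])
seq.bind("v1", PresentationElement("Slide 1"))
seq.bind("v2", PresentationElement("Slide 2"))
```

As long as unbound_variables() returns a non-empty list, the tree acts as a template; binding
the remaining variable turns it into a complete fragment.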
Figure 1: A sample ZYX tree in graphical notation (a seq element with Slide 1, Slide 2, and Slide 3;
legend: presentation element, binding point, variable, binding edge)
Figure 2: Template encapsulated by a complex media element in graphical notation (labels: par, seq,
Video, Audio, Text)
4 Analysis
In this section, we analyze how the multimedia document models introduced in the previous section fulfill
the requirements outlined in Section 2. First, we investigate the document models HTML, MHEG-5,
HyTime, SMIL, and ZYX with respect to the traditional requirements, i.e., their temporal and spatial model,
and their interaction modeling. Then, we examine how these models behave concerning the advanced
requirements reusability, adaptation, and presentation-neutral description of document content. Figure
4 illustrates the summary of this analysis.
4.1 Temporal Model
HTML: As HTML has been developed for hypertext, the standard itself does not offer constructs to
specify temporal synchronization between the media elements included in an HTML document. However,
by the use of DHTML, the temporal course of a presentation can be programmed in a scripting language.
Recently, there has been a proposal by Microsoft for adding temporal synchronization support to HTML
called HTML+TIME [SYS98]. This approach integrates HTML with the temporal modeling mechanisms
introduced by SMIL, but it does not seem to be very mature yet.
MHEG-5: MHEG-5 specifies the temporal course of a presentation by means of its link concept.
Presentable MHEG classes (e.g., the video and audio classes) define a variety of events relating to time.
These events can be associated via links with actions which are performed when the corresponding event
occurs. Thus, the temporal model of MHEG-5 is event-based.
HyTime: In HyTime, the coordinated presentation of media elements can be specified using the
Finite-Coordinate-Space module. This module provides a means to define n-dimensional coordinate spaces which
can include time dimensions. Media elements can be placed into such a coordinate space. This placement
is called an event and exactly defines the point in the coordinate space where the media element associated
with the event will be presented. Hence, the temporal model of HyTime is point-based. Several events can
be grouped into an event schedule which describes the course of the presentation of a HyTime document.
SMIL: SMIL follows an interval-based approach to temporal synchronization. Each media element has
an associated presentation interval. These intervals can be coordinated by the use of schedule elements.
In general, SMIL defines two kinds of schedule elements. On the one hand, there is the parallel element,
which defines the parallel presentation of n intervals. Using attributes, a more detailed definition of
a parallel presentation is possible. For instance, time delays, lip-sync synchronization, and loops can
be specified. On the other hand, SMIL provides the sequential element, which allows for the sequential
presentation of n intervals. Again, a more detailed specification of this presentation is possible with the
help of element attributes. The different schedule elements can be nested, thus allowing for the modeling
of complex temporal relations.
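How nesting composes intervals can be sketched by computing the overall duration of a schedule.
The sketch assumes media elements with known intrinsic durations, no delays or loops, and a
parallel element that ends when its longest child ends; the tuple encoding and the function name
duration are illustrative and not SMIL syntax.

```python
def duration(node):
    """Duration in seconds of a nested schedule.

    A node is either a number (the intrinsic duration of a media element)
    or a pair ("seq" | "par", [children]).
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    child_durations = [duration(child) for child in children]
    if kind == "seq":
        # Children are presented one after the other.
        return sum(child_durations)
    if kind == "par":
        # Children start together; the longest one determines the end.
        return max(child_durations, default=0)
    raise ValueError(f"unknown schedule element: {kind!r}")


# A 10 s title, then a 30 s video in parallel with 20 s of audio, then a 5 s credit.
schedule = ("seq", [10, ("par", [30, 20]), 5])
total = duration(schedule)  # 10 + max(30, 20) + 5 = 45
```

The same recursion applies to the seq and par operator elements of ZYX under the same assumptions.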
ZYX: The temporal model of ZYX is closely related to the temporal model of SMIL. ZYX defines the
temporal operator elements seq and par, which resemble the sequential and parallel elements of SMIL.
In contrast to SMIL, loops and time delays are not specified using attributes. Instead, these are handled
by dedicated temporal operator elements, the loop operator element and the delay operator element.
Due to its close resemblance to SMIL, the ZYX temporal model must be considered interval-based.
4.2 Spatial Model
HTML:
The control of the spatial layout of the media elements included in an HTML document is
very limited. However, the concept of framesets makes it possible to partition the presentation area of
an HTML document (i.e., the HTML browser window) into rectangular regions, so-called frames. In such a
frame, another HTML document can be displayed, which itself can define further frames. This allows for
frame nesting. As frames are specified by their size and position, this constitutes a kind of absolute
positioning. Exact positioning of media elements is possible by the use of DHTML. Scripts can set and
modify the coordinates at which a media element is presented in the browser window. Hence, (D)HTML
offers absolute positioning.
MHEG-5:
Each MHEG-5 class that represents visual media elements provides attributes defining the
coordinates of the presentation area at which the visual media element has to be presented. By the use
of the link concept, these coordinates can be set and changed as the result of events. This is a kind of
absolute positioning.
HyTime:
As mentioned above, the Finite-Coordinate-Space module provides means to specify the
course of the presentation of a HyTime document by means of event schedules referring to
n-dimensional coordinate spaces. Such a coordinate space can include, besides a temporal dimension,
one or more spatial dimensions. Thus, an event describes not only the point in time at which the
associated media element will be presented but also its spatial position. Therefore, HyTime allows for
absolute positioning.
SMIL:
SMIL provides a mechanism for the absolute spatial positioning of media elements.
In the head of a SMIL document, rectangular regions of the presentation area, called channels, can be
specified. Each channel is defined by its position, its size, and a value that determines the stacking
order of overlapping channels. Each media element in the document body can reference a channel, thereby
specifying its spatial position on the presentation area.
ZYX:
Spatial layout in ZYX is defined by the use of spatial projector elements. A spatial projector
element defines the rectangular region of the presentation area in which the subtree below the spatial
projector element is presented. Like channels in SMIL, such a region is defined by its position, its size,
and a value to resolve the overlapping of regions. Spatial projector elements can be nested. A spatial
projector element p in the subtree under a spatial projector element o is positioned in the context of o
and not of the entire presentation area. All in all, ZYX employs absolute positioning to specify the
spatial layout of a document.
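The effect of nesting spatial projector elements can be sketched as a simple coordinate resolution. The class below is not part of the ZYX specification; the name `SpatialProjector`, its fields, and the pixel values are assumptions chosen for illustration. It shows how the position of a nested projector p, given relative to its enclosing projector o, resolves to a position on the entire presentation area.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpatialProjector:
    """Rectangular region positioned relative to its enclosing projector."""
    x: int
    y: int
    width: int
    height: int
    parent: Optional["SpatialProjector"] = None

    def absolute_origin(self) -> Tuple[int, int]:
        """Resolve the region origin against the entire presentation area."""
        if self.parent is None:
            return (self.x, self.y)
        parent_x, parent_y = self.parent.absolute_origin()
        return (parent_x + self.x, parent_y + self.y)


# o is placed on the presentation area; p is placed in the context of o.
o = SpatialProjector(x=100, y=50, width=400, height=300)
p = SpatialProjector(x=20, y=10, width=200, height=100, parent=o)
origin = p.absolute_origin()
```

Because p is positioned relative to o, a subtree can be moved or reused under a different projector without changing its internal layout.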
4.3 Interaction
HTML:
HTML provides the concept of links, which allows for navigational interaction. Moreover,
HTML allows for the definition of data entry forms, which consist of different controls like buttons and
text-input fields. The results of an interaction with a form cannot be specified using HTML itself. This is
left to CGI scripts running at a web server. However, if the employed browser software supports DHTML,
more sophisticated interactions like design interactions can be programmed using scripts.
MHEG-5:
MHEG-5 provides a small set of basic interaction classes for the modeling of user interaction.
MHEG-5 separates the element that initiates an interaction from the effect of an interaction. By the
definition of links, the interaction with an interaction element such as a button can trigger an action
resulting in a navigational or in a design interaction. Since MHEG-6 allows for the integration of Java
programs, it can support additional, sophisticated user interactions.
HyTime:
As explained above, the Finite-Coordinate-Space module of HyTime provides mechanisms
to define the spatial and temporal coordination of a presentation. This is done by event schedules,
which require all spatial and temporal positions of media objects to be known in advance. This excludes
ad-hoc navigational interaction by the user. HyTime does not support design interactions, as the primitives
provided by the Object-Modification-Module and the Event-Projection-Module do not include interaction
semantics.
SMIL:
The concept of links in SMIL provides for navigational interaction. However, no support is given for
the specification of design interactions.
ZYX:
The requirement to support the modeling of interactive multimedia presentations is met by ZYX's
interaction elements. There are two types of interaction elements, navigational interaction elements and
design interaction elements. Examples of navigational interaction elements are the link element, which
makes it possible to specify hypertext structure as in SMIL or HTML, and the menu element, with which
one can interactively follow one path out of a set of presentation paths. The design interaction elements
are interactive versions of the projector elements. For example, for the typographic projector, which
specifies the font, size, and style of a text, the interactive typographic projector element specifies
that these settings can be carried out interactively when the document is presented.
4.4 Reusability
HTML:
HTML allows whole documents and single media elements to be referenced via uniform resource
locators (URLs). However, it is not possible to reference just a fragment of a document. Thus, reusability
is only supported on the highest and lowest level of granularity as identified in Section 2.
As there is no clear distinction between structure and layout of an HTML document, reuse
can only be identical. Although HTML 4.0 tries to separate structure from the layout of a document in
a more rigid way and promotes the use of external cascading style sheets, it is still possible to mix layout
and structure for backward compatibility reasons.
In order to support classification and identification of documents, HTML allows for the specification of
meta attributes by means of attribute-value pairs in the head of a document.
MHEG-5:
Considering the granularity of reuse, it is important to notice that MHEG-5 structures the
media elements of an application into groups, which can be addressed globally. Hence, groups constitute
the units that can be reused. As an application object is a group, it is possible to reuse entire MHEG-5
documents within an MHEG-5 document. Likewise, scenes are MHEG-5 groups and, hence, could be
reused in principle. However, since scenes can refer to objects global to an MHEG-5 document, which are
contained in the application object, it is not possible to reuse scenes that depend on such global objects.
Only fully independent and isolated scenes could be candidates for reuse in other documents. Therefore,
there is no general support for reusability at the level of document fragments. Regarding reusability
at the level of mere media elements (like videos and audios), it is important to know that a media element
must be associated with exactly one group and is addressed through this group. Thus, it is not possible to
use one and the same media element in two different scenes. However, as media elements do not have to
include the underlying data but can also just refer to their data, groups can at least share the data of
media elements.
Since MHEG-5 aims less at modeling the structure of a multimedia application than at representing
its final presentation form, which includes the layout, groups can only be reused identically.
The identification and selection of groups to be reused is a serious problem in MHEG-5, as no
information can be assigned to MHEG objects. One can use neither annotations, keywords, or any other
kind of metadata for the classification of and search for media objects, nor semantically useful names
for the identification of parts to be reused.
HyTime:
HyTime allows for reusability on all levels of granularity as identified in the requirements
section. As HyTime is built on SGML, single media elements and complete documents can be referenced as
entities and therefore be reused. Moreover, the Location-Address-Module provides powerful mechanisms
to locate and address fragments of HyTime documents. Using locators, parts of a HyTime document can
be referenced by name, by position, or even by the use of a powerful query language.
As any SGML DTD can be made HyTime-compliant, HyTime documents describe the structure
of a document rather than its presentation semantics. Thus, reuse in HyTime is semantic reuse.
Moreover, because HyTime is independent of a DTD, a DTD can be provided with support for the
classification of (parts of) documents, e.g., by the use of attribute-value pairs. Hence, HyTime offers
support for classification and identification of reusable components.
SMIL:
As SMIL can reference complete documents and single media elements by the use of URLs, SMIL
allows for reuse at the corresponding levels of granularity as defined in Section 2. However, SMIL does
not support the reuse of fragments of documents.
SMIL separates layout specifications, which have to go into the head of the document, from the
structural specifications given in the body. But as both kinds of specifications are closely interrelated,
SMIL provides only for identical reuse.
Like HTML, SMIL allows meta-attributes to be defined within the head element of a document. Such
meta-attributes can be used to classify and retrieve documents, providing support for selection and
identification.
ZYX:
The ZYX document model has been designed with all levels of granularity of reuse in mind. To
support reusability of media elements, atomic media elements are provided, which can be reused in any
ZYX specification. Likewise, complex media elements, which encapsulate specifications, can be reused in
any other specification. As the encapsulated specifications can range smoothly from small logical parts
of a document to entire documents, ZYX supports reuse both on the level of entire documents and of
fine-grained document fragments. Moreover, the ability to encapsulate templates in complex media elements
provides for the reuse of document templates.
The ability to delay the process of variable binding, especially the binding of projector variables,
allows for the clear separation of the presentation elements building the structure of a document and the
projector elements determining its layout. This allows for structural reuse of ZYX specifications. As ZYX
complex media elements may include projector elements defining visual and audible layout, identical reuse
of components is provided for as well.
Concerning selection and identification of reusable elements, ZYX allows media elements, either
complex or atomic, to be annotated with key-value pairs.
4.5 Adaptation
HTML:
Since HTML does not offer any mechanism to specify adaptation of a document to user interest
or to technical infrastructure, we consider only DHTML here. DHTML makes it possible to manipulate
the structure and content of HTML documents dynamically. Therefore, adaptation to user interest or
technical infrastructure can be implemented by the use of scripts. In a first step, such a script has to
determine the user or system profile, for example by a database query. In a second step, the script has to
change the structure of the HTML document according to the profile. As the author must code, and thus
know at authoring time, all adaptation alternatives inside scripts, this kind of adaptation must be
considered static.
MHEG-5:
MHEG-5 defines classes for variables whose contents can be tested. Hence, variables can
be used to choose between different branches of a presentation. Thus, a profile defining user interest and
technical infrastructure could be modeled using variables. However, the problem is how such a profile
is set. MHEG-5 allows variables to be set only from within a document. User-specific adaptation would
therefore require making the determination of the profile a part of the MHEG-5 document. In MHEG-6, the
MHEG engine could call a Java program that retrieves the actual values for a given profile and then
sets the variables of the document. So, with the use of MHEG-6, adaptation of a presentation to user
interest or technical infrastructure is possible. Since all adaptation alternatives must be specified within
a document at authoring time, this is static adaptation.
HyTime:
Since HyTime can be used with any concrete DTD, it is always possible to define specific
attributes for the elements of a DTD that characterize (parts of) documents or media elements in terms
of user interest or technical properties like the bandwidth needed or the resolution or frame rate
required. It is also possible to check for values of such element attributes by using the Query-Locator
provided by the Location-Address-Module. But the results of such queries checking attribute values are
fully determined by the concrete document content and cannot be modified by external parameters like
those in a user or system profile. Hence, it is not possible to adapt a HyTime document according to
external parameters like a profile.
SMIL:
SMIL offers the switch element to model alternative presentation variants. Using this element,
different adaptation alternatives can be specified inside the document at authoring time. Thus, the switch
element allows for static adaptation. The selection among the alternatives is guided by simple predicates
that include parameters set outside the SMIL document. These parameters are predefined by the standard
and describe mainly technical features like the available bandwidth. This makes it possible to adapt a
SMIL document to the technical infrastructure.
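The selection performed by a switch element can be sketched as follows. This is an illustrative model and not SMIL syntax: the function `evaluate_switch`, the parameter key `bitrate`, the file names, and the rule that the first alternative with a satisfied predicate wins are assumptions of this example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# System parameters are set outside the document; the key name is an assumption.
SystemParams = Dict[str, int]
Alternative = Tuple[Callable[[SystemParams], bool], str]


def evaluate_switch(alternatives: List[Alternative], params: SystemParams) -> Optional[str]:
    """Return the first alternative whose predicate holds, or None if none does."""
    for predicate, media in alternatives:
        if predicate(params):
            return media
    return None


# All alternatives are fixed by the author at authoring time (static adaptation).
alternatives: List[Alternative] = [
    (lambda p: p["bitrate"] >= 1_000_000, "lecture_high.mpg"),
    (lambda p: p["bitrate"] >= 56_000, "lecture_low.mpg"),
    (lambda p: True, "lecture_slides.gif"),  # fallback without a test
]
chosen_modem = evaluate_switch(alternatives, {"bitrate": 56_000})
chosen_lan = evaluate_switch(alternatives, {"bitrate": 10_000_000})
```

The sketch makes the limitation visible: the predicates range only over a fixed set of system parameters, and nothing outside the authored list can ever be selected.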
ZYX:
As mentioned above, each media element of a ZYX document can be annotated with a set of
key-value pairs that describes its content. In addition, a user profile, likewise a set of key-value pairs,
can be defined to capture values that describe a user's topics of interest, the presentation system
environment, network connection characteristics, and the like. The ZYX model offers operator elements to
support adaptation to a user's profile, namely the switch element and the query element.
As in SMIL, a switch element makes it possible to specify different presentation alternatives for a part
of the document, allowing for static adaptation. One of the alternatives is selected according to the user
profile. In contrast to SMIL, the scope of a switch element is not limited to predefined parameters. A
switch element is used if all adaptation alternatives are known to the author of a document. In order
to allow for dynamic adaptation, the query element is provided. This element is a placeholder for a
media element or fragment that is described by means of a query. The query is represented by a
set of key-value pairs. When the document is selected for presentation, the query element is evaluated
and replaced by the complex or atomic media element that best matches the set of key-value pairs with
regard to the user profile. Thus, ZYX allows for adaptation to user interest and technical infrastructure.
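The evaluation of a query element can be sketched as a best-match search over annotated media elements. The text does not define the matching function, so the scoring used here (counting agreeing key-value pairs, with query pairs taking precedence over profile pairs) is an assumption, as are the names `match_score` and `resolve_query` and the sample annotations.

```python
from typing import Dict, List, Optional, Tuple

Annotations = Dict[str, str]


def match_score(wanted: Annotations, annotations: Annotations) -> int:
    """Count the key-value pairs of `wanted` that the annotations satisfy."""
    return sum(1 for key, value in wanted.items() if annotations.get(key) == value)


def resolve_query(query: Annotations, profile: Annotations,
                  candidates: List[Tuple[str, Annotations]]) -> Optional[str]:
    """Pick the candidate that best matches the query combined with the profile."""
    wanted = {**profile, **query}  # assumed: query pairs override profile pairs
    best_name: Optional[str] = None
    best_score = 0
    for name, annotations in candidates:
        score = match_score(wanted, annotations)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical repository content, annotated with key-value pairs.
candidates = [
    ("bypass_video_en", {"topic": "bypass", "language": "en", "medium": "video"}),
    ("bypass_text_de", {"topic": "bypass", "language": "de", "medium": "text"}),
    ("valve_video_de", {"topic": "valve", "language": "de", "medium": "video"}),
]
selected = resolve_query({"topic": "bypass"}, {"language": "de"}, candidates)
```

Unlike a switch element, the set of candidates is not fixed in the document: elements added to the repository after authoring time can be selected as well, which is what makes the adaptation dynamic.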
4.6 Presentation-neutral Representation
Figure 3 shows the relationships between various formats and models with respect to their support for
multimedia and for presentation-neutral representation.
[Figure: HTML, DHTML, MHEG-5, SMIL, HyTime, and ZYX plotted along the axes "Semantic level" and
"Multimedia functionality"]
Figure 3: Presentation-neutral representation and multimedia support of the different formats and models
HTML:
As HTML does not clearly separate the layout of a document from its structure, the semantic
level of an HTML document description is not as high as that of HyTime, though it is comparable to that
of SMIL. However, HTML offers only extremely limited multimedia functionality (even simple temporal
synchronization is not possible). To offer more multimedia functionality, DHTML must be employed. Since
DHTML scripts must implement multimedia functionality imperatively, their use greatly reduces the
semantic level of a document description. Furthermore, DHTML introduces portability problems between
the browsers of different vendors. Thus, neither HTML nor DHTML is well suited for the
presentation-neutral representation of multimedia document content.
MHEG-5:
MHEG-5 primarily aims at a detailed and platform-independent description of the presentation,
i.e., the layout, of a document. To achieve this goal, the standard provides MHEG-5 with rich
multimedia functionality. However, the description of the structure of an MHEG-5 document is very poor.
Hence, the level of semantic modeling is very low, and with regard to the semantics of the structure of
multimedia documents MHEG-5 might be viewed as a "multimedia-assembler". Therefore, MHEG-5 cannot
support presentation-neutral representation of multimedia documents.
HyTime:
Since HyTime mainly specifies the structure and semantics of a multimedia document, it is
quite well suited for presentation-neutral representation of multimedia documents. HyTime offers the
specification of document content at a high semantic level, though it lacks multimedia functionality
(especially in the area of interaction).
SMIL:
In contrast to HyTime documents, a SMIL document describes the presentation of
the document in detail but the structure of the document in less detail. However, SMIL offers more
multimedia functionality than HyTime. Compared to MHEG-5, the description of SMIL documents takes place
at a higher level of semantics, though it lacks the multimedia functionality of MHEG-5. Hence, SMIL ranks
between HyTime and MHEG-5 with respect to its support for presentation-neutral representation.
ZYX:
As it is possible to separate structure and layout of a document, due to the ability to delay
the process of variable binding and to encapsulate templates in complex media elements, the semantic
level of a document description is quite high and thus suited for presentation-neutral representation of
multimedia document content. The amount of multimedia functionality offered by ZYX exceeds that of SMIL
but ranks below that of MHEG-5.
                              HTML         DHTML        SMIL         MHEG-5       HyTime       ZYX
Temporal Model                -            script       interval-    event-       point-       interval-
                                                        based        based        based        based
Spatial Model                 absolute     absolute     absolute     absolute     absolute     absolute
                              positioning  positioning  positioning  positioning  positioning  positioning
Interaction
  Navigational                +            +            +            +            -            +
  Design                      -            +            -            +            -            +
Reusability
  Granularity
    Media Elements            +            +            +            +            +            +
    Fragments                 -            -            -            -            +            +
    Documents                 +            +            +            +            +            +
  Kind of Reusage
    Identical                 +            +            +            +            -            +
    Structural                -            -            -            -            +            +
  Identification/Selection    +            +            +            -            +            +
Adaptation
  Parameters of Adaptability
    User Interest             -            +            -            MHEG-6       -            +
    Technical Infrastructure  -            +            +            MHEG-6       -            +
  Definition of Alternatives
    Static                    -            +            +            MHEG-6       -            +
    Dynamic                   -            -            -            -            -            +
Presentation-neutral Representation
  Semantic Level              medium       low          medium       very low     very high    high
  Multimedia Functionality    very low     high         medium       very high    low          high

Figure 4: Summary of the support of the requirements by (D)HTML, SMIL, MHEG-5, HyTime, and
ZYX (+ support, - no support)
4.7 Summary
Summarizing (see also Figure 4), we can say that none of the examined document models HTML, MHEG-5,
HyTime, and SMIL offers sufficient support for all requirements arising from advanced multimedia
applications. HTML can hardly be characterized as a multimedia document model because it lacks
support for even the most basic multimedia requirement, a temporal model. Though HTML can become
a quite powerful multimedia document model through the extension to DHTML, it still does not support
reuse at every level of granularity and suffers from a low semantic level of content description, which
leaves DHTML unsuitable for presentation-neutral description of multimedia content. This is also the
case with MHEG-5. Although MHEG-5 offers high multimedia functionality, it mainly describes the
presentation and not the structure of a multimedia document and, therefore, cannot be employed for
presentation-neutral modeling of multimedia document content. Furthermore, reuse at the level of
fragments is severely hampered by the inflexible scene-based document structure.
Powerful support for reuse is the strength of HyTime. Moreover, HyTime describes document content
at a very high semantic level and, thus, is perfectly suited for presentation-neutral modeling of
document content. However, the lack of capabilities for modeling interaction and adaptation is
a serious drawback. In contrast to HyTime, SMIL offers the modeling of static adaptation to technical
infrastructure and of navigational interaction. Furthermore, the semantic level of a SMIL document
description ranks between MHEG-5 and HyTime and, hence, is quite well suited for the presentation-neutral
description of multimedia document content. However, reuse at the level of fragments is not possible,
nor is the modeling of design interactions.
Since ZYX has been designed with the fulfillment of the advanced requirements in mind, it offers
reuse on all three levels of granularity, static and dynamic adaptation to user-specific needs, a quite high
semantic level of document description, and presentation-neutral representation of multimedia content.
Regarding the traditional requirements, enough multimedia functionality has been provided to allow for
interesting multimedia presentations including design interactions.
5 Conclusion and Future Work
Driven by our advanced multimedia information system application Cardio-OP, we have first identified a
new set of requirements for multimedia document models: reusability of multimedia content, adaptation of
multimedia content to user needs and interests, and presentation-neutral description of the structure and
content of multimedia documents. These requirements complement the more traditional, well-known
requirements for multimedia document models, i.e., temporal, spatial, and interaction modeling.
We have then presented an analysis of the relevant standard formats and models, i.e., HTML, SMIL,
MHEG-5, and HyTime, as well as of the ZYX model, which has been designed to meet the advanced requirements.
We have presented the capabilities and identified the limitations of these models. The shortcomings of
the standards call for a new initiative for next-generation multimedia document models. As illustrated
by ZYX [BK99], it is very well possible to push the limits of existing approaches and to meet the new
requirements.
We would like to point out that the implications of our analysis and of approaches trying to resolve
the shortcomings of existing models are significant: there arises an urgent need for appropriate authoring
tools that support fine-grained reuse of multimedia content, adaptability of content to user needs and
individual interest, and, as a direct consequence, the presentation-neutral representation of material, e.g.,
in a database. We learned this the hard way when developing the multimedia content repository of
Cardio-OP based on ZYX. Our group has already developed a DataBlade module for the object-relational
database system Informix Dynamic Server / Universal Data Option capable of managing ZYX documents
and fragments [BKW99]. We are currently developing an authoring tool and a presentation engine for ZYX,
since presentation-neutral representation of multimedia content as well as adaptation support directly
impact the design of authoring and presentation tools.
References
[All83] J. F. Allen. Maintaining Knowledge about Temporal Intervals. Communications of the ACM,
          26(11):832-843, November 1983.
[BK99] S. Boll and W. Klas. ZYX - A Semantic Model for Multimedia Documents and Presentations.
          To be published in: Proceedings of the 8th IFIP Conference on Data Semantics (DS-8): "Semantic
          Issues in Multimedia Systems". Kluwer Academic Publishers, Rotorua, New Zealand, 5-8 January 1999.
[BKW99] S. Boll, W. Klas, and U. Westermann. Exploiting OR-DBMS Technology to Implement the ZYX
          Data Model for Multimedia Documents and Presentations. Submitted to: A. Buchmann, editor,
          Datenbanksysteme in Büro, Technik und Wissenschaft (BTW). GI-Fachtagung, March 1999.
[BPSM98] T. Bray, J. Paoli, and C. M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0 - W3C
          Recommendation 10-February-1998. W3C, URL: http://www.w3.org/TR/1998/REC-xml-19980210,
          February 1998.
[Bul98] D. C. A. Bulterman. User-centered Abstractions for Adaptive Hypermedia Presentations. In Proc. of
          the 6th ACM Multimedia Conference, Bristol, UK, September 1998.
[C-L98] C-LAB - Research Institute of the University of Paderborn and Siemens AG. IDIAS - An Intelligent
          Diagram Assistant, 1998. URL: http://www.c-lab.de/ ucmm/idias/root.html.
[DD94] S. DeRose and D. G. Durand. Making Hypermedia Work: A User's Guide to HyTime. Kluwer
          Academic Publishers, Dordrecht, 1994.
[DK95] A. Duda and C. Keramane. Structured temporal composition of multimedia data. In Proc. IEEE
          International Workshop on Multimedia-Database-Management Systems, Blue Mountain Lake, August
          1995.
[EBS97] J. Eklund, P. Brusilovsky, and E. Schwarz. Adaptive textbooks on the www. In H. Ashman,
          P. Thistewaite, R. Debreceny, and A. Ellis, editors, Proceedings of AUSWEB97, The Third Australian
          Conference on the World Wide Web, Queensland, Australia, pages 186-192. Southern Cross
          University Press, July 5-9, 1997.
[EF91] M. J. Egenhofer and R. Franzosa. Point-Set Topological Spatial Relations. Int. Journal of Geographic
          Information Systems, 5(2), March 1991.
[HBB+98] P. Hoschka, S. Bugaj, D. Bulterman, et al. Synchronized Multimedia Integration Language - W3C
          Working Draft 2-February-98. W3C, URL: http://www.w3.org/TR/1998/WD-smil-0202, February
          1998.
[HFK95] N. Hirzalla, B. Falchuk, and A. Karmouch. A Temporal Model for Interactive Multimedia Scenarios.
          IEEE Multimedia, 2(3):24-31, Fall 1995.
[Hof96] P. Hofmann. MHEG-5 and MHEG-6: Multimedia Standards for Minimal Resource Systems. Technical
          Report, Technische Universität Berlin, April 1996.
[ISO86] ISO. Information processing - Text and Office Systems - Standard Generalized Markup Language
          (SGML), 1986. ISO-IS 8879.
[ISO92] ISO/IEC. Information Technology - Hypermedia/Time-based Structuring Language (HyTime), 1992.
          ISO/IEC IS 10744.
[ISO95] ISO/IEC JTC1/SC29/WG12. Information Technology - Coding of Multimedia and Hypermedia
          Information - Part 5: Support for Base-Level Interactive Applications, ISO/IEC IS 13522-5. ISO/IEC,
          1995.
[ISO96] ISO/IEC JTC1/SC29/WG12. Information Technology - Coding of Multimedia and Hypermedia
          Information - Part 6: Support for Enhanced Interactive Applications, ISO/IEC IS 13522-6. ISO/IEC,
          1996.
[JR95] R. Joseph and J. Rosengren. MHEG-5: An Overview. Technical Report, GMD-FOKUS, Berlin,
          URL://www.fokus.gmd.de/ovma/mug/archives/doc/mheg-reader/rd1206.html, December 1995.
[KBA93] T. Kamba, K. Bharat, and M. C. Albers. The Krakatoa Chronicle - An Interactive, Personalized
          Newspaper on the Web. URL: http://www.w3.org/Conferences/WWW4/Papers/93/, 1993.
[KLAV98] W. Klippgen, T. D. C. Little, G. Ahanger, and D. Venkatesh. The Use of Metadata for the Rendering
          of Personalized Video Delivery. In [SK98], New York, 1998. McGraw-Hill.
[LG93] T. D. C. Little and A. Ghafoor. Interval-Based Conceptual Models for Time-Dependent Multimedia
          Data. IEEE Transactions on Knowledge and Data Engineering, 5(4), August 1993.
[MBE95] T. Meyer-Boudnik and W. Effelsberg. MHEG Explained. IEEE Multimedia, 2(1), Spring 1995.
[NKN91] S. R. Newcomb, N. A. Kipp, and V. T. Newcomb. "HyTime" - The Hypermedia/Time-Based
          Document Structuring Language. Communications of the ACM, 34(11), November 1991.
[PS94] D. Papadias and T. Sellis. Qualitative Representation of Spatial Knowledge in Two-Dimensional Space.
          VLDB Journal, 3(4), October 1994.
[PTSE95] D. Papadias, Y. Theodoridis, T. Sellis, and M. J. Egenhofer. Topological Relations in the World
          of Minimum Bounding Rectangles: A Study with R-Trees. In Proceedings of the ACM SIGMOD
          Conference on Management of Data, San Jose, May 1995.
[RLJ98] D. Raggett, A. Le Hors, and I. Jacobs. HTML 4.0 Specification - W3C Recommendation, revised on
          24-April-1998. W3C, URL: http://www.w3.org/TR/1998/REC-html40-19980424, April 1998.
[RvOB97] L. Rutledge, J. van Ossenbruggen, and D. C. A. Bulterman. A Framework for Generating Adaptable
          Hypermedia Documents. In Proc. ACM Multimedia Conference, Seattle, November 1997.
[SK98] A. Sheth and W. Klas. Multimedia Data Management - Using Metadata to Integrate and Apply Digital
          Media. McGraw-Hill, New York, 1998.
[SSW98] V. Schoch, M. Specht, and G. Weber. ADI - An Empirical Evaluation of a Tutorial Agent.
          In T. Ottmann and I. Tomek, editors, Proceedings of the ED-Media and ED-TELECOM 1998,
          Freiburg, Germany. Association for the Advancement of Computing in Education, June 1998. URL:
          http://apsymac33.uni-trier.de:8080/ADI.html.
[SYS98] D. Schmitz, J. Yu, and P. Satangeli. Timed Interactive Multimedia Extensions for HTML
          (HTML+TIME). W3C, URL: http://www.w3.org/TR/1998/NOTE-HTMLplusTIME-19980918,
          September 1998.
[WR94] T. Wahl and K. Rothermel. Representing Time in Multimedia Systems. In Proc. IEEE International
          Conference on Multimedia Computing and Systems, pages 538-543, Boston, MA, May 1994.