In Proceedings First International Workshop on Multimedia Intelligent Storage and Retrieval Management (MISRM),
Orlando, Florida, October 30th, 1999
Architecture of a DataBlade Module for the Integrated
Management of Multimedia Assets
Utz Westermann, Wolfgang Klas
Databases and Information Systems (DBIS), Computer Science Department
Oberer Eselsberg, University of Ulm
James-Franck-Ring, 89069 Ulm, Germany
Tel: +49-731-50-24131, Fax: +49-731-50-24134
{westermann, klas}@informatik.uni-ulm.de
Abstract
Advanced multimedia applications require adequate support by database technology for the integrated,
uniform management, retrieval, and delivery of media data of different types. While there has been
substantial effort to provide database support for multimedia data and for its content-based retrieval,
research has mostly focused on support for single media types, usually videos or images. This has led to
a variety of isolated solutions for database support, each optimized for its respective media data type,
which are difficult to integrate; a uniform view is lacking. In this paper, we describe the design and
the architecture of the Media Integration Blade, a DataBlade module for the object-relational DBMS
Informix Dynamic Server/Universal Data Option. Building upon existing media type-specific DataBlade
modules, the Media Integration Blade establishes an integration layer which offers uniform, homogeneous
access to the different types of media data. It allows for the uniform retrieval of media data of
different types by technical characteristics and content while maintaining the functionality of the
underlying media type-specific DataBlade modules. Thereby, the Media Integration Blade forms a generic
core component that can be employed for the uniform management of and access to media data of different
types in any database-driven multimedia information system.
1 Introduction
The advantages of building multimedia information systems on top of database systems have long been
recognized [KA97, RNL95, AN97]. The nature of multimedia data, however, imposes specific requirements on
database systems. In general, media data have high volume and, in the case of continuous media data
types, are time-dependent. To provide for the management and content-based retrieval of media data,
support for media data types must be seamlessly integrated with the database management system (DBMS)
[KA97].
So far, research in the area of multimedia database systems has focused on database support for specific
single media data types, mostly video and image. This has resulted in a variety of available prototypes.
For example, the MARS system [ORC+98] and the QBIC system [F+95] are image database prototypes whose
contents can be queried by example images using sophisticated similarity search algorithms. The STARCH
database [BG99] facilitates querying for images with the help of an ontology specified by description
logics, while the DISIMA system [OOL+97] allows images to be queried both by similarity search
algorithms and by concepts taken from an ontology. As an example of a video database, the OVID system
[OT93] allows sets of consecutive frames to be defined as video objects that can be queried with the
VideoSQL query language. Jiang and Elmagarmid [JE98] introduce a prototype of a video database based on
the Logical Hypervideo Data Model, which allows video objects to be defined inside video frames; these
objects can be indexed by free text and connected to other video objects via hyperlinks. In the
literature, one can also find a few examples of databases concerned with the management of audio data
and speech [LR95, WS98]. On the commercial market, database technology is evolving that supports
different media data types. For instance, the object-relational DBMS Informix Dynamic Server/Universal
Data Option (IDS/UD) can be extended by DataBlade modules for the support of media data like image,
text, and video.
As the prototype systems and database extensions each focus on just one particular media data type,
their implementations vary in the location where media data is stored, in the way technical metadata is
made available, and in the way content-based retrieval of media data is performed. Considering the
storage location of media data, variants range from storage in binary large objects inside a database,
in a file system, or on a web server to storage on a full-fledged media server allowing for the
streaming delivery of continuous media data. Handling of technical metadata is focused on each specific
media data type. In consequence, for some media data types the support for technical metadata is not
necessarily comprehensive. For instance, the Informix Video Foundation DataBlade offers no support for
color depth at all, though color depth is desirable for videos. Additionally, technical metadata
applicable to different media data types, e.g., the size of image and video, may be handled and named
differently. Content-based retrieval of media data ranges from similarity search based on automatically
extractable features like the color distribution of an image to search based on content-descriptive
semantic annotations.
As long as an application is concerned with only a single media data type, these variations cause no
problems. There are, however, applications that need to manage media data of different types in a
uniform and integrated fashion. To overcome the variations, such an application itself has to cope with
the heterogeneity of the media data type-specific concepts. For instance, depending on the storage
location of a medium and its particular media data type, the application has to employ different
mechanisms to access media data of different types. Moreover, content-based retrieval spanning several
different media types is difficult to achieve by the application, as it must map a given query posed to
the information system to several media type-dependent facilities for content-based retrieval. Thus,
means for content-based retrieval applicable to different media data types are a necessity. Finally, it
is difficult for the application to limit the search for media data using technical characteristics
applicable to different media data types, such as the size in bytes, since media data type-specific
prototype systems and database extensions manage technical metadata differently. Here, an integrated
view on media data would be helpful.
With the project “Gallery of Cardiac Surgery” (Cardio-OP, see footnote 1) [KGF99], which aims at the
development of an Internet-based and database-driven multimedia information system in the domain of
cardiac surgery, we find such an application that explicitly requires uniform management and uniform
content-based retrieval of multimedia material of different types ranging from videos, images, and
texts to full-fledged multimedia presentations. Based on a multimedia repository, the system is going
to serve as a common information and education base for its different types of users (physicians,
medical lecturers, students, and patients), who are provided with multimedia data according to their
user-specific request to the multimedia information system, their different understanding of the
selected subject, their location, and their technical infrastructure. The underlying database
technology of the repository is the object-relational DBMS IDS/UD, which has been chosen for reasons of
flexibility, profound extensibility, and industrial strength [BKW99a].
In this paper, we describe the design and the architecture of the Media Integration Blade (MIB)
DataBlade module for IDS/UD, which establishes an integration layer upon media data type-specific
DataBlade modules to overcome the heterogeneity of the media data type-specific concepts and to offer
applications transparent and uniform access to media data of different media data types. Based on
commercial DataBlade modules for text, image, and video data, the MIB provides transparent access to
media data of different types by encapsulating the storage location in an abstract data type. The MIB
builds an information layer that offers a homogeneous view on the technical metadata of the different
media data types. To support uniform, media data type-independent content-based retrieval, the MIB
allows for the semantic annotation of media data of different types with concepts taken from a common
ontology and offers query operators for these annotations. The media data type-specific functionality
for content-based retrieval based on automatically extractable features offered by the underlying
DataBlade modules nevertheless remains accessible. The concepts illustrated in this paper and
implemented by the MIB are generic to the extent that the overall approach can be applied to other
database technologies as well.
The paper is organized as follows: Section 2 shows how to achieve transparent media access. Section 3
describes the organization of the media data to accomplish a homogeneous, integrated view of the
technical metadata. Section 4 introduces the annotation and content-based querying facilities of the
MIB. Section 5 illustrates the application of the MIB module. Section 6 concludes the paper and gives an
outlook on ongoing and future work.
2 Transparent media access
As mentioned in the introduction, the locations where media data is stored and accessed often differ
between media data type-specific database extensions. Among others, media data may be stored on a media
server, on a web server, in a file system, or in binary large objects of a database. Each storage
location comes with its own access methods. For instance, access to files is done with the help of
operating system routines while a media server might define a specific network protocol to access media
data. This heterogeneity of access to media data hinders applications requiring a uniform model of
media access. A known solution to this problem is the employment of locators. A locator is a data type
which abstracts from the way media data is stored by encapsulating its storage location. A locator
comes with a set of access routines. Depending on the particular storage location encapsulated in an
instance of the locator, the access routines use the access methods appropriate for that storage
location. Thus, where media data is stored and by which method exactly it is accessed is hidden from an
application, as the application always calls the same access routines.
Footnote 1: Cardio-OP - Gallery of Cardiac Surgery - is partially funded by the German Ministry of
Research and Education, grant number 08C58456. Our project partners are the University Hospital of Ulm,
Dept. of Cardiac Surgery and Dept. of Cardiology, the University Hospital of Heidelberg, Dept. of
Cardiac Surgery, an associated Rehabilitation Hospital, the publishers Barth-Verlag and dpunkt-Verlag,
Heidelberg, FAW Ulm, and ENTEC GmbH, St. Augustin. For details see also URL
www.informatik.uni-ulm.de/dbis/Cardio-OP/
Ideally, the problem of offering transparent access to a variety of storage locations could be solved
by employing one comprehensive locator data type. In practice, however, different locator types have
evolved that not only support different storage locations but also come with different access routines.
For this reason, it is difficult to settle on one of these variants. To illustrate this point, consider
the locator variants provided with the Excalibur Text DataBlade, the Excalibur Image DataBlade, and the
Informix Video Foundation DataBlade modules for IDS/UD. The Excalibur Text DataBlade module makes use
of the LLD Locator data type. This locator allows for transparent access to media data stored in the
file system of the IDS/UD server or in a binary large object of a database. The Excalibur Image
DataBlade module references media data by means of the IfdLocator, which is essentially similar to
LLD Locator but can store additional application-specific information. While the differences between
LLD Locator and IfdLocator are mainly structural, the locator MedLoc introduced by the Video Foundation
DataBlade supports other storage locations. The MedLoc locator serves to reference media data stored on
a media server with the help of the Virtual Storage Interface (VSI), an interface provided by Informix
for accessing data on media servers.
For the design of the MIB, which aims at the integration of the DataBlade modules mentioned above, a
suitable concept for a uniform locator has to be developed. This design must take into consideration
that the functionality of the underlying media data type-specific DataBlade modules should remain
usable for applications. For instance, one would still like to use the functions and index structures
of the Excalibur Image DataBlade responsible for the similarity search among images; these expect the
media data type-specific locator IfdLocator, which is not necessarily applicable to functions of other
media data type-specific DataBlade modules. Therefore, the MIB defines a uniform locator named
uniLocator that abstracts from the different variants. It is able to encapsulate an instance of any one
of the locator variants. Depending on the encapsulated locator variant, an instance of uniLocator
accommodates a suitable structure to represent the type and the data of the encapsulated variant. In
order to obtain instances of uniLocator, typecasts from the different locator variants to uniLocator
are defined. Typecasts in the opposite direction are also defined to allow access to the media data
type-specific functionality of the underlying DataBlade modules. However, not all instances of
uniLocator can be cast to all locator variants. Whether a cast is possible depends on the locator
variant from which the uniLocator instance was constructed: that variant might refer to a storage
location not supported by the locator variant to which the instance of uniLocator is to be cast.
The uniform locator uniLocator has associated access routines that facilitate uniform, transparent
access to media data. For this purpose, the MIB implements several user-defined routines. The access
routine uniLocatorToClient copies the media data to the file system of the client calling this routine.
For access at a finer granularity, the MIB implements routines like open, close, read, and seek with
the usual file access semantics, which in turn make use of the appropriate access methods provided with
the specific locator variant encapsulated by the instance of uniLocator.
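
The dispatch idea behind such a uniform locator can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MIB implementation: the classes BlobLocator and FileLocator are hypothetical stand-ins for locator variants, UniLocator and cast_to are invented names, and the cast rule is simplified so that a cast back succeeds only for the variant the instance was built from.

```python
import io
from dataclasses import dataclass


@dataclass
class BlobLocator:
    """Hypothetical variant: media bytes held in a database large object."""
    data: bytes


@dataclass
class FileLocator:
    """Hypothetical variant: media stored as a file on the server."""
    path: str


class UniLocator:
    """Encapsulates one locator variant behind a single set of access routines."""

    def __init__(self, variant):
        self.variant = variant
        self._stream = None

    def open(self):
        # Dispatch to the access method that fits the encapsulated variant.
        if isinstance(self.variant, BlobLocator):
            self._stream = io.BytesIO(self.variant.data)
        elif isinstance(self.variant, FileLocator):
            self._stream = open(self.variant.path, "rb")
        else:
            raise TypeError("unsupported locator variant")
        return self

    def read(self, size=-1):
        return self._stream.read(size)

    def seek(self, offset):
        return self._stream.seek(offset)

    def close(self):
        self._stream.close()
        self._stream = None

    def cast_to(self, variant_type):
        # Simplification (assumption): only the original variant can be recovered.
        if not isinstance(self.variant, variant_type):
            raise TypeError("storage location not supported by target variant")
        return self.variant
```

An application would call open, read, seek, and close in the same way regardless of which variant is wrapped, which is the transparency property the text describes.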
The locator mechanism of the MIB as described above can be extended to cover additional storage
locations for media data with relatively little effort. For this purpose, the type uniLocator must be
extended to accommodate a suitable structure to reference media data in the new storage location.
Furthermore, the user-defined routines for uniform media access must be extended to integrate the
access to the new storage location. So far, we have provided uniLocator with support for the locator
variants LLD Locator, IfdLocator, and MedLoc, thereby being able to transparently access media data on
the file system of the database server, in binary large objects in a database, and on media servers
supporting VSI. We plan to extend the locator mechanism with the capability to access media data stored
on a web server using HTTP.
3 Media organization
In the previous section, we have introduced the concepts provided by the MIB to transparently access
media data stored in various locations. In addition to transparent access to the actual media data, an
organization of media data that allows a user to find and select relevant material efficiently is of
importance. Regarding Cardio-OP, this applies on the one hand to users looking for information; if they
cannot find the information they want quickly, the acceptance of the system will diminish. On the other
hand, efficient retrieval and selection of media is also important during a multimedia authoring
process; if it is more difficult for an author of multimedia content to find existing media data and to
reuse it than to reproduce it, the degree of reuse will be very low, which is not cost-effective.
Hence, a multimedia repository should support sophisticated, fine-grained retrieval of media data
according to media type, the associated technical metadata, and content. For instance, it should be
possible to limit a search for media data to those images coded in JPEG format not exceeding a width of
500 pixels and a height of 250 pixels. Additionally, the multimedia repository should support the
selection of media data by information content or, even better, the mixed retrieval of media data by
technical metadata and content. This section concentrates on the organization of media data according
to media data type and technical metadata while the support for content-based retrieval is presented in
Section 4.
The underlying DataBlade modules, on which the MIB is based, provide technical metadata for the media
data types they manage. However, this metadata is spread over the various modules and, as mentioned
before, cannot be accessed in an integrated fashion by applications. Metadata applicable to several
media data types might be named differently, might follow different units of measurement, or might not
even be managed at all by the respective DataBlade modules. What applications need is an integrated
view on media data and the associated technical characteristics. Such a view should not restrict itself
to technical metadata applicable to all supported media data types. The lowest common denominator would
not be very helpful for applications as, apart from the size of media data in bytes and its coding
format, there would not be much technical metadata applicable to all media data types.
The approach taken by the MIB is to create a new layer of information that provides a homogeneous and
uniform model of media data and the associated technical metadata. Depending on the applications to be
supported, a comprehensive subset is selected from all technical characteristics available with the
different media data types and integrated in this layer. Each technical characteristic of this set is
associated with all the media data types it applies to. To support sophisticated cross-media type
queries, a further organization of this set is reasonable. A user might be interested in media data
which can be viewed but not care whether it is of type image or video. Another user might not be
interested in a limitation to certain media types at all. To address this, the selected technical
characteristics are organized in a specialization hierarchy (see Figure 1). The top of the hierarchy
models those technical characteristics applicable to all media data types, i.e., the lowest common
denominator. The leaves of the specialization hierarchy represent the technical metadata that are
applicable to specific media data types. The inner nodes of the specialization hierarchy group
technical metadata of closely interrelated media data types. For example, Viewable subsumes technical
metadata common to all media data types that can be viewed by humans, like width and height. This
specialization hierarchy is based on the assumption that the inner nodes reflect distinctions between
media data types that are likely to be important for user queries.
[Figure 1: Media organization. Specialization hierarchy with root Medium, inner nodes Viewable, Audible, and Readable, and leaves Image, Video, Audio, and Text.]
[Figure 2: Video v with discrete and continuous annotations. Continuous annotations along the time axis carry the captions "cardiovascular drugs", "infusion", and "thorac" on interval a (weight 1), interval b (weight 0.8), and interval c (weight 0.2). Discrete annotations: "operation" (weight 1) and "emergency" (weight 0.8).]
Employing this organization, an application is able to search for media data with regard to certain
technical characteristics without necessarily limiting the search to one particular media data type.
The organization of media data is not limited to the specialization hierarchy as depicted in Figure 1.
New media data types can be included in this organization by placing them into the specialization
hierarchy under the most suitable inner node. In doing so, the set of technical metadata is extended by
additional technical characteristics applicable only to the new media data type. New types of queries
can be supported by the reorganization and/or more fine-grained specialization of the hierarchy.
The specialization hierarchy of Figure 1 is realized by the MIB with a table hierarchy, with each table
representing a node. For this, the MIB utilizes primitives of the object-relational DBMS IDS/UD which
allow specialization relationships between tables to be defined. The columns of these tables model the
technical metadata associated with the respective node. An additional column is provided with the root
table that references the actual media data with the help of the uniLocator. This establishes the link
between technical metadata and media data.
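
To make the effect of such a hierarchy on queries concrete, the following Python sketch mirrors it with class inheritance instead of IDS/UD table hierarchies. It is a rough illustration only: the attribute names, the sample rows, the language attribute of Text, the placement of Video under Viewable alone, and the helper select are assumptions made for the example, not part of the MIB schema.

```python
from dataclasses import dataclass


@dataclass
class Medium:                 # root: characteristics shared by all media types
    locator: str              # stands in for the uniLocator column
    size_bytes: int
    coding_format: str


@dataclass
class Viewable(Medium):       # inner node: media that can be viewed
    width: int
    height: int


@dataclass
class Image(Viewable):        # leaf
    pass


@dataclass
class Video(Viewable):        # leaf (assumed to sit under Viewable only)
    duration_s: float


@dataclass
class Readable(Medium):       # inner node
    pass


@dataclass
class Text(Readable):         # leaf with a hypothetical attribute
    language: str


def select(repository, node, predicate=lambda m: True):
    """Return all media at or below the given hierarchy node that satisfy the predicate."""
    return [m for m in repository if isinstance(m, node) and predicate(m)]


repository = [
    Image("img-1", 40_000, "JPEG", 400, 200),
    Image("img-2", 90_000, "JPEG", 800, 600),
    Video("vid-1", 5_000_000, "MPEG-1", 352, 240, 120.0),
    Text("txt-1", 2_000, "HTML", "en"),
]

# JPEG images not exceeding 500 x 250 pixels, as in the example of this section.
small_jpegs = select(
    repository, Image,
    lambda m: m.coding_format == "JPEG" and m.width <= 500 and m.height <= 250,
)
# Anything viewable up to 500 pixels wide, regardless of being an image or a video.
small_viewables = select(repository, Viewable, lambda m: m.width <= 500)
```

Querying at the inner node Viewable returns both the image and the video, while querying at the root Medium ranges over all media, which corresponds to selecting from a supertable in a table hierarchy.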
4 Content-based retrieval
In the previous section, we have explained how the MIB organizes media data according to technical
metadata. In addition, support for content-based retrieval of media data in a uniform fashion,
independent of the particular media data type, is necessary. However, the facilities for content-based
retrieval offered by the media data type-specific database extensions are tailored to the respective
media data types. Thus, uniform and integrated cross-media type content-based retrieval is difficult to
achieve. For integrated, uniform content-based access to media data of different types, a uniform layer
of content-descriptive information is established on which uniform content-based queries can be
executed. To this end, we first introduce the notion of annotations, which relate media data of
different types to concepts organized in a hierarchy. Based on this annotation scheme, we informally
introduce the different semantics of content-based retrieval to be supported by the MIB. Following
these query semantics, we define a set of query operators for content-based retrieval that implement
the introduced semantics.
4.1 Media annotation
Media annotations relate media data to concepts from an application domain with a given confidence in
order to semantically describe the content of the media data. A concept is an abstract idea of an
entity important for a particular application domain. An example of a concept taken from the
application domain of Cardio-OP is “chest of the human body”. Each concept is associated with one or
more not possibly ambiguous terms called captions in some language. For instance, the captions “thorac”
and “chest” are two terms for the concept “chest of the human body”. Concepts are organized in a
specialization hierarchy. This means that if media data is related to a concept $c_1$ which is a
subconcept of $c_2$, then the media data related to $c_1$ is considered to be related to $c_2$ as well.
The concept hierarchy reflects the knowledge of the application domain. We distinguish between discrete
annotations and continuous annotations. Discrete annotations relate complete media data to concepts.
However, in the case of continuous media data types, it might be of interest to relate concepts only to
temporal intervals of the media data. Thereby, a more fine-grained description of content can be
achieved.
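
The propagation along the concept hierarchy described above (media related to $c_1$ are also related to every superconcept $c_2$) can be sketched as follows. The hierarchy fragment and the function names are hypothetical and serve only to illustrate the rule; the actual concept hierarchy of Cardio-OP is not reproduced here.

```python
# Hypothetical fragment of a concept hierarchy: subconcept -> direct superconcept.
PARENT = {
    "bypass operation": "heart operation",
    "heart operation": "operation",
}


def superconcepts(concept, parent=PARENT):
    """Return all concepts that the given concept specializes, nearest first."""
    result = []
    while concept in parent:
        concept = parent[concept]
        result.append(concept)
    return result


def related_concepts(annotated_concept, parent=PARENT):
    """A medium annotated with a concept is also related to all its superconcepts."""
    return [annotated_concept] + superconcepts(annotated_concept, parent)
```

Under these assumed data, a medium annotated with "bypass operation" would also count as related to "heart operation" and "operation".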
The confidence of an annotation, either continuous or discrete, is given by a weight $w \in [0, 1]$
specifying how strongly the content of a medium is related to a concept. This paves the way for “fuzzy”
content-based retrieval by which media may qualify with a given confidence for a query. Qualifying
media can then be presented to the user ordered by confidence. It is widely accepted that for
content-based retrieval a simple boolean model in which media either qualify or not is too restrictive
for practical use [BYRN99].
Note that we do not make any assumptions on the way annotations are created. They might be created
manually. They might also be (semi-)automatically extracted from metadata facilities provided with the
underlying media data type-specific database extensions.
The symbols introduced in Definition 1 are used in the formal definitions to follow. In Definition 2 we
then introduce the notions of discrete annotation and continuous annotation formally.
Definition 1. Symbols: $M$, $CM$, $DM$, $C$, $T$, $\mathit{duration}$, $I_m$
Let $M$ denote the set of all media data.
Let $CM \subseteq M$ denote the set of all media data of continuous media data type.
Let $DM \subseteq M$ denote the set of all media data of discrete media data type.
Let $C$ denote the set of concepts organized in the concept hierarchy.
Let $T$ denote the set of all captions associated with the concepts $C$.
The function $\mathit{duration}$ returns the duration of a continuous medium $m \in CM$.
For all $m \in CM$, let $I_m$ denote the set of all valid time intervals $i$ of $m$, $i = [s, e]$, with
$s \le e$, $s \ge 0$, and $e \le \mathit{duration}(m)$.
Definition 2. Annotation
The quadruple $a_m = (m, c, i, w)$, $m \in M$, $c \in C$, $i \in I_m \cup \{\varepsilon\}$,
$w \in [0, 1]$ is called an annotation of $m$ with weight $w$. If $i = \varepsilon$ then $a_m$ is
called a discrete annotation. Otherwise, $a_m$ is called a continuous annotation.
The following condition must hold for $a_m$: $m \in DM \Rightarrow i = \varepsilon$.
The annotation $a_m = (m, c, i, w)$ is called suiting to caption $t \in T$ iff $t$ is a caption
associated with $c$ or a subconcept of $c$.
Definition 2 ensures that only continuous media data can be annotated continuously. With the notion of
suiting, we provide means to find those concepts corresponding to a given caption.
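
The constraints of Definitions 1 and 2 can be checked mechanically. The sketch below is one possible encoding in Python, given as an illustration rather than as the MIB data model: None plays the role of $\varepsilon$, a medium without a duration is treated as discrete, and the class names MediumInfo and Annotation are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MediumInfo:
    name: str
    duration: Optional[float] = None  # None marks a discrete medium (element of DM)


@dataclass(frozen=True)
class Annotation:
    medium: MediumInfo
    concept: str
    interval: Optional[Tuple[float, float]]  # None plays the role of epsilon
    weight: float

    def __post_init__(self):
        # w must lie in [0, 1].
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        if self.interval is None:
            return  # discrete annotation, allowed for every medium
        # m in DM implies i = epsilon.
        if self.medium.duration is None:
            raise ValueError("discrete media admit only discrete annotations")
        # i = [s, e] with 0 <= s <= e <= duration(m).
        start, end = self.interval
        if not 0.0 <= start <= end <= self.medium.duration:
            raise ValueError("interval must lie within [0, duration(m)]")

    @property
    def is_discrete(self):
        return self.interval is None
```

With this encoding, attaching an interval to an image raises an error, while a video accepts both discrete and continuous annotations.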
For the management of the concept hierarchy, the MIB relies on the commercially available COCOON
DataBlade module by dimedis [dim98]. The MIB models annotations by providing a table called
Annotations, each row representing an annotation. This table features one column for referencing the
media data, one column which contains the id of the concept with which the media data is annotated, a
column specifying the weight of the annotation, and a column describing the time interval in which the
annotation is valid. This latter column is NULL in case of a discrete annotation. The table Annotations
realizes the association of concepts with media data.
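
A table of this shape can be approximated in any relational system. The following sketch uses SQLite through Python's sqlite3 module purely for illustration; it is not the IDS/UD schema of the MIB. The column names are assumptions, and the single interval column described above is split into two nullable columns because SQLite has no interval type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Annotations (
        medium_id      INTEGER NOT NULL,  -- reference to the media data
        concept_id     INTEGER NOT NULL,  -- id of the annotating concept
        weight         REAL NOT NULL CHECK (weight BETWEEN 0 AND 1),
        interval_start REAL,              -- both NULL for a discrete annotation
        interval_end   REAL,
        CHECK ((interval_start IS NULL) = (interval_end IS NULL))
    )
    """
)
conn.executemany(
    "INSERT INTO Annotations VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 0.8, 40.0, 90.0),  # continuous annotation on an interval
        (1, 20, 1.0, None, None),  # discrete annotation of the whole medium
    ],
)
# Discrete annotations are the rows whose interval is NULL.
discrete = conn.execute(
    "SELECT concept_id, weight FROM Annotations WHERE interval_start IS NULL"
).fetchall()
```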
4.2 Semantics of content-based retrieval
The annotations introduced in the previous subsection offer a means to describe the content of media
data independent of the respective media data type. In order to exploit these annotations for
content-based retrieval, we distinguish between continuous and discrete query semantics.
In the discrete query semantics, annotations of both discrete and continuous media data are regarded as
related to the entire media data. Continuous annotations are treated as if they were discrete, i.e.,
$a_m = (m, c, i, w)$, $i \neq \varepsilon$ is considered equivalent to $a'_m = (m, c, \varepsilon, w)$.
The results of queries with discrete semantics are always weighted references to entire media.
Consider, for example, the annotated video $v$ shown in Figure 2. With discrete semantics, a query for
media data referring to concepts “thorac” and “infusion” would return a weighted reference to the
entire video $v$, even though these concepts have been annotated only to intervals $a$ and $b$ of $v$,
respectively.
The results of queries with discrete semantics are very coarse-grained, especially if continuous media
data are involved. It is not very helpful for a user to retrieve a video clip with a duration of one
hour that at some point in time relates to certain concepts. A user would rather like to know when the
content refers to the desired concepts. For this kind of query, we provide continuous query semantics
which respect the time intervals of continuous annotations.
In the case of discrete media data, the continuous query semantics return weighted references to entire
media data just as with discrete semantics. In the case of continuous media data, however, the
continuous query semantics return weighted references to time intervals of continuous media data. These
are the time intervals of continuous media data that are relevant with regard to the query. Discrete
annotations of continuous media data are treated as continuous annotations that apply to the entire
media data, i.e., the discrete annotation $a_m = (m, c, \varepsilon, w)$ is considered equivalent to
the continuous annotation $a'_m = (m, c, [0, \mathit{duration}(m)], w)$. Consider a query with
continuous semantics for media data referring to the concepts “thorac” and “infusion”. Regarding the
example given in Figure 2, the question is whether the depicted video $v$ would strongly qualify as a
result for this query. If it were sufficient that at least two intervals exist in $v$, each associated
with one of the concepts, $v$ would be a highly weighted part of the query result. However, if the
qualifying condition additionally demands that the intervals which bear relevant annotations for the
query must overlap, $v$ is not very relevant to the query.
In order to define the query semantics exactly, operators for the retrieval, conjunction, and
disjunction of concepts are introduced and formally defined in the following section.
4.3 Query operators
For the two different query semantics, we now formally define the query operators that realize these
semantics. These query operators are $\mathit{retrieve}$, $\mathit{and}$, and $\mathit{or}$. One variant
of $\mathit{retrieve}$ is provided for each of the discrete and continuous query semantics, while
$\mathit{and}$ and $\mathit{or}$ can be applied to both query semantics. The query operators return sets
of weighted references to media data as a result, with the weight specifying the confidence with which
a reference to media data is relevant for a query. The result of a query can thus be sorted by the
weights in order to present the most relevant media to the user first. Like annotations, a reference
may be discrete, in case the whole media data is referenced, or continuous, in case only a time
interval of media data of a continuous media data type is referenced. Definition 3 formally introduces
the notion of references.
Definition 3. Reference
The triple $r_m = (m, i, w)$, $m \in M$, $i \in I_m \cup \{\varepsilon\}$, $w \in [0, 1]$ is called a
reference to $m$ with weight $w$. If $i = \varepsilon$ then $r_m$ is called a discrete reference.
Otherwise, $r_m$ is called a continuous reference. The following condition must hold for $r_m$:
$m \in DM \Rightarrow i = \varepsilon$.
Given the notion of reference, we can now introduce the first of the query operators,
$\mathit{retrieve}$, in Definition 5. The operator selects the media data referring to concepts
described by a given caption $t \in T$. This operator is the cornerstone of the content-based retrieval
facilities. The variant for discrete query semantics, $\mathit{retrieve}_d$, returns only discrete
references to media data while the variant for continuous query semantics, $\mathit{retrieve}_c$,
returns fine-grained continuous references wherever possible. The continuous variant
$\mathit{retrieve}_c$ ensures that only continuous references to continuous media data are returned in
the result set by converting discrete references on continuous media data $m$ to the equivalent
continuous reference on interval $[0, \mathit{duration}(m)]$. This is done to simplify the following
formal definitions of query operators for continuous semantics. Additional symbols used in the ensuing
definitions are given by Definition 4.
Definition 4. Symbols: $R$, $A$
Let $R$ denote the set of all possible references.
Let $A$ denote the set of annotations.
Definition 5. Query operator $\mathit{retrieve}$
The query operator for discrete query semantics $\mathit{retrieve}_d : T \rightarrow 2^R$ is defined as
follows:
$r_m = (m, \varepsilon, w) \in \mathit{retrieve}_d(t)$ iff $\exists a_m = (m, c, i, w) \in A$ with
$a_m$ suiting to $t$.
The query operator for continuous semantics $\mathit{retrieve}_c : T \rightarrow 2^R$ is defined as
follows:
$\forall a_m = (m, c, i, w) \in A$: if $i = \varepsilon$ and $m \in CM$ and $a_m$ is suiting to $t$
then $r_m = (m, [0, \mathit{duration}(m)], w) \in \mathit{retrieve}_c(t)$. Otherwise, if $a_m$ is
suiting to $t$ then $r_m = (m, i, w) \in \mathit{retrieve}_c(t)$.
Considering the video $v$ in Figure 2, the query $\mathit{retrieve}_d(\text{“thorac”})$ returns the weighted discrete
reference $(v, \varepsilon, 0.8)$ among its results. In contrast, the query $\mathit{retrieve}_c(\text{“thorac”})$ returns the
continuous reference $(v, b, 0.8)$ among its results. The query result for $\mathit{retrieve}_c(\text{“operation”})$ would
encompass the weighted reference $(v, [0, \mathit{duration}(v)], 1)$. This is due to the handling of discrete references on
continuous media data mentioned above.
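The two variants of the retrieve operator can be sketched in Python as follows. This is only an illustration of Definition 5, not the MIB implementation. The predicate `suits` (plain equality between the annotated concept and the caption), the `durations` dictionary that marks continuous media data, the 100-second duration of `v`, and the concrete bounds of the interval standing in for $b$ are all assumptions.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set, Tuple

Interval = Tuple[float, float]  # assumed encoding of a time interval as (start, end)


class Annotation(NamedTuple):
    """An annotation a_m = (m, c, i, w); interval None stands for epsilon."""
    medium: str
    concept: str
    interval: Optional[Interval]
    weight: float


class Reference(NamedTuple):
    """A reference r_m = (m, i, w); interval None stands for epsilon."""
    medium: str
    interval: Optional[Interval]
    weight: float


def suits(annotation: Annotation, caption: str) -> bool:
    # Assumption: an annotation suits a caption if its concept carries exactly that caption.
    return annotation.concept == caption


def retrieve_d(caption: str, annotations: Iterable[Annotation]) -> Set[Reference]:
    """Discrete semantics: every suiting annotation yields a discrete reference."""
    return {Reference(a.medium, None, a.weight) for a in annotations if suits(a, caption)}


def retrieve_c(caption: str, annotations: Iterable[Annotation],
               durations: Dict[str, float]) -> Set[Reference]:
    """Continuous semantics; `durations` lists the continuous media data (CM)."""
    result: Set[Reference] = set()
    for a in annotations:
        if not suits(a, caption):
            continue
        if a.interval is None and a.medium in durations:
            # Discrete annotation on continuous media data: span the whole duration.
            result.add(Reference(a.medium, (0.0, durations[a.medium]), a.weight))
        else:
            result.add(Reference(a.medium, a.interval, a.weight))
    return result


durations = {"v": 100.0}  # hypothetical duration of video v in seconds
annotations = [
    Annotation("v", "thorac", (30.0, 60.0), 0.8),  # (30, 60) stands in for interval b
    Annotation("v", "operation", None, 1.0),
]
print(retrieve_d("thorac", annotations))
print(retrieve_c("thorac", annotations, durations))
print(retrieve_c("operation", annotations, durations))
```

Under these assumptions the three calls reproduce the shape of the results discussed above: a discrete reference with weight 0.8, a continuous reference on the annotated interval, and a reference spanning the whole duration of `v`.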
It is not very satisfying to be able to query for the content of media data according to one caption only. Rather, a
user might want to use a conjunction of several concepts to query media data. For this purpose, the query operator
$\mathit{and}$ is provided, which is formally described in Definition 6. This operator basically calculates some sort of
intersection between two sets of weighted references to media data [footnote 2]. However, the formal definition is
complicated by the fact that the operator considers weights and, in the case of continuous references, temporal intervals.
Definition 6: Query operator $\mathit{and}$
Let $w_a : [0, 1] \times [0, 1] \to [0, 1]$ with $w_a(w_1, w_2) = 1 - \sqrt{\frac{(1 - w_1)^2 + (1 - w_2)^2}{2}}$ denote the
weight-merge function for $\mathit{and}$ [footnote 3]. The query operator $\mathit{and} : 2^R \times 2^R \to 2^R$ is then defined as follows:
$\forall r_m = (m, i, w) \in R$: $r_m \in \mathit{and}(R_1, R_2)$ iff one of the ensuing conditions holds:
1. $\exists r'_m = (m, i_1, w_1) \in R_1$ with $i_1 \neq \varepsilon$ and $\exists r''_m = (m, i_2, w_2) \in R_2$ with $i_2 \neq \varepsilon$,
   $i_1$ temporally overlaps $i_2$ and $i = i_1 \cap i_2$ such that $w = w_a(w_1, w_2)$.
2. $\exists r'_m = (m, i, w_1) \in R_1$ with $i \neq \varepsilon$ and $w = w_a(w_1, 0)$ such that
   $\neg\exists r''_m = (m, i_2, w_2) \in R_2$ with $i_2 \neq \varepsilon$ and $i_2$ temporally overlaps $i$ and
   $\neg\exists r''_m = (m, \varepsilon, w_2) \in R_2$.
3. $\exists r''_m = (m, i, w_2) \in R_2$ with $i \neq \varepsilon$ and $w = w_a(0, w_2)$ such that
   $\neg\exists r'_m = (m, i_1, w_1) \in R_1$ with $i_1 \neq \varepsilon$ and $i_1$ temporally overlaps $i$ and
   $\neg\exists r'_m = (m, \varepsilon, w_1) \in R_1$.
4. $\exists r'_m = (m, \varepsilon, w_1) \in R_1$ and $\exists r''_m = (m, i, w_2) \in R_2$ such that $w = w_a(w_1, w_2)$.
5. $\exists r''_m = (m, \varepsilon, w_2) \in R_2$ and $\exists r'_m = (m, i, w_1) \in R_1$ such that $w = w_a(w_1, w_2)$.
6. $\exists r'_m = (m, i, w_1) \in R_1$ with $i = \varepsilon$ and $w = w_a(w_1, 0)$ such that
   $\neg\exists r''_m = (m, i_2, w_2) \in R_2$.
7. $\exists r''_m = (m, i, w_2) \in R_2$ with $i = \varepsilon$ and $w = w_a(0, w_2)$ such that
   $\neg\exists r'_m = (m, i_1, w_1) \in R_1$.
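To make the behaviour of the weight-merge function more tangible, the following worked evaluation plugs a few sample weights into the formula for $w_a$ from Definition 6. The sample points are chosen for illustration and are not taken from the original text.

```latex
\begin{align*}
w_a(1, 1)   &= 1 - \sqrt{\tfrac{0^2 + 0^2}{2}}   = 1 \\
w_a(0, 0)   &= 1 - \sqrt{\tfrac{1^2 + 1^2}{2}}   = 0 \\
w_a(1, 0)   &= 1 - \sqrt{\tfrac{0^2 + 1^2}{2}}   = 1 - \tfrac{1}{\sqrt{2}} \approx 0.29 \\
w_a(1, 0.8) &= 1 - \sqrt{\tfrac{0^2 + 0.2^2}{2}} = 1 - \sqrt{0.02} \approx 0.86
\end{align*}
```

The third line shows why a reference without a counterpart in the other set keeps only a reduced weight: even a perfectly weighted reference merged with weight 0 drops to roughly 0.29.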
Condition 1 states that should two continuous references to the same media data $m$ exist in the sets $R_1$ and $R_2$
and should the time intervals of these references overlap, then a continuous reference to the intersection of
the intervals on $m$ is inserted into the result set with merged weight. Condition 2 and Condition 3 ensure that
if a continuous reference $r_m$ to media data $m$ exists in only one set, or there are only continuous references to
$m$ in the other set which do not temporally overlap with $r_m$, then $r_m$ is present in the result set with reduced
weight. Conditions 4 and 5 deal with occurrences of discrete and continuous references to the same media data
$m$ in both sets. In this case, the continuous reference is included in the result set with merged weight, since
discrete references on continuous media data are treated like continuous references spanning the whole duration of
the media data. Conditions 4 and 5 also deal with discrete references to the same media data $m$ in both sets in the
way that one of these references is inserted into the result set with merged weight. Condition 6 and Condition 7
are the equivalents of Condition 2 and Condition 3 for discrete references.
Footnote 2: It is an objective of our design that the query operators $\mathit{and}$ and $\mathit{or}$ can be arbitrarily nested. This
allows for the flexible composition of complex queries. To achieve this goal, the operators $\mathit{and}$ and $\mathit{or}$ do not take
captions as arguments like the $\mathit{retrieve}$ operator, but rather sets of weighted references. These sets may, of course,
be obtained as a result of a $\mathit{retrieve}$ operator. For instance, in order to achieve a conjunction between two concepts,
the $\mathit{and}$ operator is employed on two $\mathit{retrieve}$ operators which in turn deliver the references to the media data
referring to the desired concepts.
Footnote 3: The choice of the weight-merge function $w_a$ is motivated by the fact that $\mathit{and}$ intends to have semantics
similar to the boolean $\mathit{and}_b$. As the boolean $\mathit{and}_b(b_1, b_2)$ yields $1$ for $b_1 = b_2 = 1$, the optimal situation for
$\mathit{and}$ is having two references $r'_m = (m, i_1, w_1) \in R_1$ and $r''_m = (m, i_2, w_2) \in R_2$ to the same media data $m$
with weights $w_1 = w_2 = 1$. Thus, following [SFW83], we use the complement of the normalized Euclidean distance of the
point $(w_1, w_2)$ from the desirable position $(1, 1)$ as a measure of similarity $w_a$ to the optimal situation for $\mathit{and}$.
Let us give some examples of the $\mathit{and}$ query operator. Regarding the video $v$ shown in Figure 2, the query
using continuous query semantics $\mathit{and}(\mathit{retrieve}_c(\text{“cardiovascular drugs”}), \mathit{retrieve}_c(\text{“thorac”}))$ returns the
reference $(v, a \cap b, 0.86)$ among its results by Condition 1. In contrast to that, the query using continuous query
semantics $\mathit{and}(\mathit{retrieve}_c(\text{“cardiovascular drugs”}), \mathit{retrieve}_c(\text{“infusion”}))$ returns the references
$(v, a, 0.29)$ and $(v, c, 0.09)$ as results by Conditions 2 and 3. Considering a query with discrete semantics,
$\mathit{and}(\mathit{retrieve}_d(\text{“cardiovascular drugs”}), \mathit{retrieve}_d(\text{“operation”}))$ yields the reference $(v, \varepsilon, 1)$ as a
result by applying Conditions 4 and 5.
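The seven conditions of Definition 6 can be sketched in Python as shown below. This is a hedged illustration, not the user-defined routine of the MIB. Figure 2 is not reproduced here, so the interval bounds standing in for $a$, $b$, and $c$ and the annotation weights 1.0 for "cardiovascular drugs" and 0.2 for "infusion" are assumptions, chosen because they are consistent with the weights reported in the examples above. The helper names `overlaps`, `has_partner`, and `and_op` are hypothetical.

```python
import math
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple

Interval = Tuple[float, float]  # assumed (start, end) encoding; None stands for epsilon


class Reference(NamedTuple):
    medium: str
    interval: Optional[Interval]
    weight: float


def w_a(w1: float, w2: float) -> float:
    """Weight-merge function of Definition 6."""
    return 1.0 - math.sqrt(((1.0 - w1) ** 2 + (1.0 - w2) ** 2) / 2.0)


def overlaps(i1: Interval, i2: Interval) -> bool:
    # Assumption: two intervals overlap if they share more than a single point.
    return max(i1[0], i2[0]) < min(i1[1], i2[1])


def has_partner(ref: Reference, others: Iterable[Reference]) -> bool:
    """True if `others` holds a reference to the same media data that is
    discrete, overlaps `ref`, or if `ref` itself is discrete."""
    for o in others:
        if o.medium != ref.medium:
            continue
        if ref.interval is None or o.interval is None:
            return True
        if overlaps(ref.interval, o.interval):
            return True
    return False


def and_op(r1: Set[Reference], r2: Set[Reference]) -> Set[Reference]:
    result: Set[Reference] = set()
    for a in r1:
        for b in r2:
            if a.medium != b.medium:
                continue
            w = w_a(a.weight, b.weight)
            if a.interval is not None and b.interval is not None:
                if overlaps(a.interval, b.interval):      # Condition 1
                    lo = max(a.interval[0], b.interval[0])
                    hi = min(a.interval[1], b.interval[1])
                    result.add(Reference(a.medium, (lo, hi), w))
            elif a.interval is None:                      # Condition 4
                result.add(Reference(a.medium, b.interval, w))
            else:                                         # Condition 5
                result.add(Reference(a.medium, a.interval, w))
    for a in r1:                                          # Conditions 2 and 6
        if not has_partner(a, r2):
            result.add(Reference(a.medium, a.interval, w_a(a.weight, 0.0)))
    for b in r2:                                          # Conditions 3 and 7
        if not has_partner(b, r1):
            result.add(Reference(b.medium, b.interval, w_a(0.0, b.weight)))
    return result


def show(refs: Set[Reference]) -> List[tuple]:
    """Round weights to two digits and order the references for display."""
    return sorted(((r.medium, r.interval, round(r.weight, 2)) for r in refs), key=repr)


# Hypothetical stand-ins for the intervals a, b, c of Figure 2.
drugs = {Reference("v", (0.0, 40.0), 1.0)}
thorac = {Reference("v", (30.0, 60.0), 0.8)}
infusion = {Reference("v", (70.0, 90.0), 0.2)}

print(show(and_op(drugs, thorac)))    # [('v', (30.0, 40.0), 0.86)]
print(show(and_op(drugs, infusion)))  # [('v', (0.0, 40.0), 0.29), ('v', (70.0, 90.0), 0.09)]
```

Because the operator consumes and produces sets of references, its result can be passed directly to another operator call, which is the nesting property the design aims for.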
Similar to the $\mathit{and}$ query operator that allows for the conjunction of several captions in a query, the $\mathit{or}$ query
operator has been provided to support the disjunction of several captions as well. The operator basically calculates
some sort of union between two sets of weighted references to media data. Definition 7 formally introduces the
$\mathit{or}$ query operator.
Definition 7: Query operator $\mathit{or}$
Let $w_o : [0, 1] \times [0, 1] \to [0, 1]$ with $w_o(w_1, w_2) = \sqrt{\frac{w_1^2 + w_2^2}{2}}$ denote the weight-merge function
for $\mathit{or}$ [footnote 4]. The query operator $\mathit{or} : 2^R \times 2^R \to 2^R$ is defined as follows:
$\forall r_m = (m, i, w) \in R$: $r_m \in \mathit{or}(R_1, R_2)$ iff one of the ensuing conditions holds:
1. $\exists r'_m = (m, i, w_1) \in R_1$ and $\exists r''_m = (m, i_2, w_2) \in R_2$ such that $w = w_o(w_1, w_2)$.
2. $\exists r'_m = (m, i_1, w_1) \in R_1$ and $\exists r''_m = (m, i, w_2) \in R_2$ such that $w = w_o(w_1, w_2)$.
3. $\exists r'_m = (m, i, w_1) \in R_1$ with $w = w_o(w_1, 0)$ such that $\neg\exists r''_m = (m, i_2, w_2) \in R_2$.
4. $\exists r''_m = (m, i, w_2) \in R_2$ with $w = w_o(0, w_2)$ such that $\neg\exists r'_m = (m, i_1, w_1) \in R_1$.
Condition 1 and Condition 2 state that references, either continuous or discrete, to the same media data $m$
in both sets $R_1$ and $R_2$ appear in the result set with merged weight. Conditions 3 and 4 deal with references to
media data $m$ that are made only in one of the sets $R_1$ and $R_2$. Such references are inserted into the result set with
reduced weight.
To give some further examples regarding video $v$ of Figure 2, the query
$\mathit{or}(\mathit{retrieve}_d(\text{“thorac”}), \mathit{retrieve}_d(\text{“infusion”}))$ returns among the result set the discrete reference
$(v, \varepsilon, 0.68)$ by Condition 1 and Condition 2. The query $\mathit{or}(\mathit{retrieve}_c(\text{“thorac”}), \mathit{retrieve}_c(\text{“nurse”}))$
returns the continuous reference $(v, b, 0.46)$ among its results, since $v$ is not annotated with the concept “nurse”
(Condition 3).
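A minimal Python sketch of the four conditions of Definition 7 follows. It is an illustration under stated assumptions rather than the MIB routine: the names `w_o` and `or_op` are hypothetical, and the references to the media data `x` and `y` are made up for the example and do not correspond to Figure 2.

```python
import math
from typing import NamedTuple, Optional, Set, Tuple

Interval = Tuple[float, float]  # assumed (start, end) encoding; None stands for epsilon


class Reference(NamedTuple):
    medium: str
    interval: Optional[Interval]
    weight: float


def w_o(w1: float, w2: float) -> float:
    """Weight-merge function of Definition 7."""
    return math.sqrt((w1 ** 2 + w2 ** 2) / 2.0)


def or_op(r1: Set[Reference], r2: Set[Reference]) -> Set[Reference]:
    result: Set[Reference] = set()
    for a in r1:
        partners = [b for b in r2 if b.medium == a.medium]
        if partners:
            for b in partners:  # Condition 1: keep the interval of the R1 reference
                result.add(Reference(a.medium, a.interval, w_o(a.weight, b.weight)))
        else:                   # Condition 3: no reference to the same media data in R2
            result.add(Reference(a.medium, a.interval, w_o(a.weight, 0.0)))
    for b in r2:
        partners = [a for a in r1 if a.medium == b.medium]
        if partners:
            for a in partners:  # Condition 2: keep the interval of the R2 reference
                result.add(Reference(b.medium, b.interval, w_o(a.weight, b.weight)))
        else:                   # Condition 4: no reference to the same media data in R1
            result.add(Reference(b.medium, b.interval, w_o(0.0, b.weight)))
    return result


# Made-up input sets: both refer to x, only the second refers to y.
r1 = {Reference("x", (0.0, 10.0), 0.6)}
r2 = {Reference("x", (20.0, 30.0), 0.8), Reference("y", None, 0.5)}

for ref in sorted(or_op(r1, r2), key=repr):
    print(ref.medium, ref.interval, round(ref.weight, 4))
```

Unlike the conjunction, the disjunction does not require intervals to overlap, so both references to `x` survive with the merged weight, while the reference to `y` is kept with a reduced weight.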
The operators we have introduced so far can be arbitrarily nested and, therefore, provide a powerful means
of uniform, content-based retrieval of media data. For instance, the nested query
$\mathit{and}(\mathit{or}(\mathit{retrieve}_c(\text{“thorac”}), \mathit{retrieve}_c(\text{“medicament”})), \mathit{retrieve}_c(\text{“complication”}))$
returns weighted references to all media data of any type with contents dealing with the concepts “thorac” or
“medicament” and with any complications occurring at the same time.
The MIB implements the query operators of both semantics introduced above as user-defined routines. These
are based on the primitives for content-based retrieval offered by the COCOON DataBlade module. These primitives
allow the retrieval of all database rows that contain references to a given concept of the concept hierarchy managed
by COCOON. Moreover, a user-defined routine named medToTempTable has been provided which copies the
results of a query to a temporary table. This table can then simply be joined with a table of the organization
hierarchy of Section 3, thereby allowing for the mixed retrieval of media data according to technical characteristics and
content.
Footnote 4: For the choice of the weight-merge function $w_o$, an argument similar to the choice of $w_a$ applies. As the
boolean operator $\mathit{or}_b(b_1, b_2)$ yields $0$ only for $b_1 = b_2 = 0$, the least optimal situation for $\mathit{or}$ is having two
references $r'_m = (m, i_1, w_1) \in R_1$ and $r''_m = (m, i_2, w_2) \in R_2$ to the same media data $m$ with weights $w_1$, $w_2$
close to $0$. Hence, we use the normalized Euclidean distance of the point $(w_1, w_2)$ from the least desirable position
$(0, 0)$ as a measure of similarity $w_o$ to the optimal situation for $\mathit{or}$.
5 Application of the Media Integration DataBlade Module
The MIB has been successfully implemented and employed by our group for the management of the media data
in the context of the Cardio-OP project. Recently, we have developed a graphical media browser which can be
employed for the intuitive retrieval of media data managed by the MIB according to content and technical metadata.
This browser exploits the features of the MIB with regard to the organization of technical
metadata and the sophisticated content-based retrieval. Figure 3 shows a screenshot of the media browser. The left
part of the browser shows the concept hierarchy (with German captions) managed by the MIB (see Section 4.1), out
of which a user can select the desired set of concepts. The top right part of the browser allows the user to limit the
search to specific media data types. According to this selection, the browser offers means to further narrow down the
search by specifying constraints for those technical metadata applicable to the selected media data types. To the bottom
right, the query semantics for the content-based retrieval as presented in Section 4.2 can be chosen. The query
operators as presented in Section 4.3 are used implicitly by the browser when it comes to formulating the query to
the DBMS. Finally, the search button activates the retrieval process and returns references to all qualifying media
data using the uniform locator mechanism presented in Section 2. The returned locators can be transferred into a
result pool (see the tab named “Pool” to the top left) that persists across several queries and can later be used to
access the media data transparently.
Figure 3: Screenshot of the media browser component
The media browser has been developed as a Java Bean for reuse in further applications that need media
management and media browsing facilities. This component is currently being reused for the development of a graphical
annotation editor and an authoring tool for multimedia presentations. The graphical annotation editor aims at the
comfortable creation and manipulation of annotations to discrete and continuous media data. Here, the media
browser component is used to select media data for annotation and to browse the annotations. The authoring tool
allows users to build multimedia presentations using the ZYX document model [BK99], which has been developed in
the context of the Cardio-OP project since standard models like SMIL, MHEG-5, and HyTime do not meet the
project’s specific requirements for adaptation, reuse, and presentation neutrality of multimedia content [BKW99b].
The authoring tool utilizes the media browser component, and thereby the MIB, to browse and select media data
for use in the multimedia composition. These tools show the application of the MIB for the integrated, uniform
management and retrieval of media data of different types. Other applications exploiting the MIB for the intelligent
management of media data can be imagined. Our group, for instance, plans to employ the MIB with a modified
media organization scheme for the management of publications that have been downloaded from the Internet.
6 Conclusion and Outlook
Starting out from the need of multimedia information systems for the integrated and uniform management of and
access to media data of different media data types, we introduced the Media Integration DataBlade module as
a generic component which unites media data type-specific database extensions under a common roof. We
introduced the locator mechanism of the MIB allowing for the transparent, uniform access to media data stored in
various locations. We then explained how the media data is organized to effectively support sophisticated search and
retrieval according to technical characteristics. We illustrated the facilities for content-based retrieval
and showed that the MIB allows for the mixed retrieval of media data of various types by content and
by technical metadata. We demonstrated how the MIB can be employed in applications requiring sophisticated,
uniform, intelligent management of media data of different types: browsing and content-based retrieval of media data,
management of annotations to media data, and multimedia authoring. We plan to extend the MIB with support
for further media data types. For this purpose, we are evaluating database extensions for the management of audio data,
MPEG streams, and animations. In addition, there is ongoing work on a streaming server placed on top of
the MIB, following the architecture of [BKL96], that supports the delivery of continuous media data over a network.
Acknowledgments: We would like to thank Susanne Boll for her valuable comments and her great efforts to
prepare and improve this version of the paper.
References
[AN97] D. A. Adjeroh and K. C. Nwosu. Multimedia Database Management – Requirements and Issues. IEEE Multimedia,
4(3), 1997.
[BG99] S. Bechhofer and C. Goble. Classification Based Navigation and Retrieval for Picture Archives. In Proceedings 8th
IFIP Conference on Data Semantics (DS-8), Rotorua, New Zealand, January 1999. Kluwer Academic Publishers.
[BK99] S. Boll and W. Klas. ZYX - A Semantic Model for Multimedia Documents and Presentations. In Proceedings
of the 8th IFIP Conference on Data Semantics (DS-8): “Semantic Issues in Multimedia Systems”, Rotorua, New
Zealand, January 1999.
[BKL96] S. Boll, W. Klas, and M. Löhr. Integrated Database Services for Multimedia Presentations. In S. M. Chung, editor,
Multimedia Information Storage and Management. Kluwer Academic Publishers, Dordrecht, 1996.
[BKW99a] S. Boll, W. Klas, and U. Westermann. Exploiting OR-DBMS Technology to Implement the ZYX Data Model for
Multimedia Documents and Presentations. In Proceedings Datenbanksysteme in Büro, Technik und Wissenschaft
(BTW), Freiburg, Germany, March 1999.
[BKW99b] S. Boll, W. Klas, and U. Westermann. Multimedia Document Formats - Sealed Fate or Setting Out for New
Shores? In Proceedings IEEE International Conference on Multimedia Computing and Systems (ICMCS),
Florence, Italy, June 1999.
[BYRN99] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, Harlow, England, 1999.
[dim98] dimedis. The COCOON DataBlade Module Users Guide. dimedis GmbH, Cologne, Germany, 1998.
[F+95] M. Flickner et al. Query by Image and Video Content: The QBIC System. IEEE Computer, 28(9), 1995.
[JE98] H. Jiang and A. K. Elmagarmid. Spatial and Temporal Content-Based Access to Hypervideo Databases. VLDB
Journal, 7(4), 1998.
[KA97] W. Klas and K. Aberer. Multimedia and its Impact on Database System Architectures. In P.M.G. Apers, H.M.
Blanken, and M.A.W. Houtsma, editors, Multimedia Databases in Perspective. Springer, London, 1997.
[KGF99] W. Klas, C. Greiner, and R. Friedl. Cardio-OP - Gallery of Cardiac Surgery. In Proceedings IEEE International
Conference on Multimedia Computing and Systems (ICMCS), Florence, Italy, June 1999.
[LR95] M. Löhr and T. C. Rakow. Audio Support for an Object-Oriented Database Management System. Multimedia
Systems Journal, 3(5-6), 1995.
[OOL+97] V. Oria, M. T. Özsu, L. Liu, et al. Modeling Images for Content-Based Queries: The DISIMA Approach. In
Proceedings 2nd International Conference on Visual Information Systems, San Diego, California, December 1997.
[ORC+98] M. Ortega, Y. Rui, K. Chakrabarti, et al. Supporting Ranked Boolean Similarity Queries in MARS. IEEE
Transactions on Knowledge and Data Engineering, 10(6), 1998.
[OT93] E. Oomoto and K. Tanaka. OVID: Design and Implementation of a Video-Object Database System. IEEE
Transactions on Knowledge and Data Engineering, 5(4), 1993.
[RNL95] T. C. Rakow, E. J. Neuhold, and M. Löhr. Multimedia Database Systems – The Notions and the Issues. In
Proceedings Datenbanksysteme in Büro, Technik und Wissenschaft (BTW), Dresden, 1995. Springer.
[SFW83] G. Salton, E. A. Fox, and H. Wu. Extended Boolean Information Retrieval. Communications of the ACM, 26(11),
1983.
[WS98] M. Wechsler and P. Schäuble. Metadata for Content-Based Retrieval of Speech. In A. Sheth and W. Klas, editors,
Multimedia Data Management. McGraw-Hill, New York, 1998.