Architecture of a DataBlade Module for the Integrated Management of Multimedia Assets

In Proceedings of the First International Workshop on Multimedia Intelligent Storage and Retrieval Management (MISRM),

Orlando, Florida, October 30th, 1999


Utz Westermann, Wolfgang Klas

Databases and Information Systems (DBIS), Computer Science Department

Oberer Eselsberg, University of Ulm

James-Franck-Ring, 89069 Ulm, Germany

Tel: +49-731-50-24131, Fax: +49-731-50-24134

{westermann, klas}@informatik.uni-ulm.de

Abstract

Advanced multimedia applications require adequate support by database technology for the integrated, uniform management, retrieval, and delivery of media data of different types. While there has been substantial effort to provide database support for multimedia data and for its content-based retrieval, research has mostly focused on support for single media types, usually videos or images. However, this has led to a variety of isolated solutions of database support optimized for the respective media data types which are difficult to integrate; a uniform view is lacking. In this paper, we describe the design and the architecture of the Media Integration Blade, a DataBlade module for the object-relational DBMS Informix Dynamic Server/Universal Data Option. Building upon existing media type-specific DataBlade modules, the Media Integration Blade establishes an integration layer which offers uniform, homogeneous access to the different types of media data. It allows for the uniform retrieval of media data of different types by technical characteristics and content while maintaining the functionality of the underlying media type-specific DataBlade modules. Thus, the Media Integration Blade forms a generic core component that can be employed for the uniform management of and access to media data of different types in any database-driven multimedia information system.

1 Introduction

The advantages of building multimedia information systems on top of database systems have long been recognized [KA97, RNL95, AN97]. The nature of multimedia data, however, imposes specific requirements on database systems. In general, media data have high volume and, in case of continuous media data types, are time-dependent. To provide for the management and content-based retrieval of media data, support for media data types must be seamlessly integrated with the database management system (DBMS) [KA97].

So far, research in the area of multimedia database systems has focused on database support for specific single media data types, mostly video and image. This has resulted in a variety of available prototypes. For example, the MARS system [ORC+98] and the QBIC system [F+95] are image database prototypes which allow querying their contents by example images using sophisticated similarity search algorithms. The STARCH database [BG99] facilitates querying for images with the help of an ontology specified by description logics, while the DISIMA system [OOL+97] allows for the querying of images by both similarity search algorithms and concepts taken from an ontology. As an example of a video database, the OVID system [OT93] allows defining sets of consecutive frames as video objects that can be queried by the VideoSQL query language. Jiang and Elmagarmid [JE98] introduce a prototype of a video database based on the Logical Hypervideo Data Model that allows defining video objects inside video frames which can be indexed by free text and connected to other video objects via hyperlinks. In the literature, one can also find a few examples of databases concerned with the management of audio data and speech [LR95, WS98]. On the commercial market, database technology is evolving that supports different media data types. For instance, the object-relational DBMS Informix Dynamic Server/Universal Data Option (IDS/UD) can be extended by DataBlade modules for support of media data like image, text, and video.

As the prototype systems and database extensions focus on just one particular media data type, their implementations vary in the location where media data is stored, in the way technical metadata is available, and in the way content-based retrieval of media data is performed. Considering the storage location of media data, variants range from storage of media data in binary large objects inside a database, in a file system, or on a web server to storage on a full-fledged media server allowing for the streaming delivery of continuous media data. Handling of technical metadata is focused on each specific media data type. In consequence, for some media data types the support for technical metadata is not necessarily comprehensive. For instance, the Informix Video Foundation DataBlade offers no support for color depth at all, though color depth is desirable for videos. Additionally, technical metadata applicable to different media data types, e.g., the size of image and video, may be handled and named differently. Content-based retrieval of media data ranges from similarity search based on automatically extractable features like the color distribution of an image to search based on content-descriptive semantic annotations.

As long as an application is concerned with only a single media data type, these variations cause no problems. There are, however, applications that need to manage media data of different types in a uniform and integrated fashion. To overcome the variations, such an application itself has to cope with the heterogeneity of the media data type-specific concepts. For instance, depending on the storage location of a medium and its particular media data type, the application has to employ different mechanisms to access media data of different types. Moreover, content-based retrieval of media data spanning several different media types is difficult to achieve by the application, as it must map a given query imposed on the information system to several media type-dependent facilities for content-based retrieval. Thus, means for content-based retrieval applicable to different media data types are a necessity. Finally, it is difficult for the application to limit the search for media data using technical characteristics applicable to different media data types, such as the size in bytes, since media data type-specific prototype systems and database extensions manage technical metadata differently. Here, an integrated view on media data would be helpful.

With the project “Gallery of Cardiac Surgery” (Cardio-OP¹) [KGF99], which aims at the development of an Internet-based and database-driven multimedia information system in the domain of cardiac surgery, we find such an application that explicitly requires uniform management and uniform content-based retrieval of multimedia material of different types ranging from videos, images, and texts to full-fledged multimedia presentations. Based on a multimedia repository, the system is going to serve as a common information and education base for its different types of users (physicians, medical lecturers, students, and patients), who are provided with multimedia data according to their user-specific request to the multimedia information system, their different understanding of the selected subject, their location, and their technical infrastructure. The underlying database technology of the repository is given by the object-relational DBMS IDS/UD, which has been chosen for reasons of flexibility, profound extensibility, and industrial strength [BKW99a].

In this paper, we describe the design and the architecture of the Media Integration Blade (MIB) DataBlade module for IDS/UD, which establishes an integration layer upon media data type-specific DataBlade modules to overcome the heterogeneity of the media data type-specific concepts and to offer applications transparent and uniform access to media data of different media data types. Based on commercial DataBlade modules for text, image, and video data, the MIB provides transparent access to media data of different types by encapsulating the storage location in an abstract data type. The MIB builds an information layer that offers a homogeneous view on the technical metadata of the different media data types. To support uniform, media data type-independent content-based retrieval, the MIB allows for the semantic annotation of media data of different types with concepts taken from a common ontology and offers query operators for these annotations. However, media data type-specific functionality for content-based retrieval based on automatically extractable features offered by the underlying DataBlade modules remains accessible. The concepts illustrated in this paper and implemented by the MIB are generic to the extent that the overall approach can be applied to other database technologies as well.

The paper is organized as follows: Section 2 shows how to achieve transparent media access. Section 3 describes the organization of the media data to accomplish a homogeneous, integrated view on the technical metadata. Section 4 introduces the annotation and content-based querying facilities of the MIB. Section 5 illustrates the application of the MIB module. Section 6 concludes the paper and gives an outlook on ongoing and future work.

2 Transparent media access

As mentioned in the introduction, the locations where media data is stored and accessed often differ between media data type-specific database extensions. Among others, media data may be stored on a media server, on a web server, in a file system, or in binary large objects of a database. Each storage location comes with its own access methods. For instance, access to files is done with the help of operating system routines while a media server might define a specific network protocol to access media data. This heterogeneity of access to media data hinders applications requiring a uniform model of media access. A known solution to this problem is the employment of locators. A locator is a data type which abstracts from the way media data is stored by encapsulating its storage location. A locator comes with a set of access routines. Depending on the particular storage location encapsulated in an instance of the locator, the access routines use the access methods appropriate for that storage location. Thus, it is hidden from an application where media data is stored and by which method exactly it is accessed, as an application always calls the same access routines.

¹ Cardio-OP (Gallery of Cardiac Surgery) is partially funded by the German Ministry of Research and Education, grant number 08C58456. Our project partners are the University Hospital of Ulm, Dept. of Cardiac Surgery and Dept. of Cardiology, the University Hospital of Heidelberg, Dept. of Cardiac Surgery, an associated Rehabilitation Hospital, the publishers Barth-Verlag and dpunkt-Verlag, Heidelberg, FAW Ulm, and ENTEC GmbH, St. Augustin. For details see also URL www.informatik.uni-ulm.de/dbis/Cardio-OP/

Ideally, the problem of offering transparent access to a variety of storage locations could be solved by employing one comprehensive locator data type. In practice, however, different locator types have evolved that not only support different storage locations but also come with different access routines. For this reason, it is difficult to settle on one of these variants. To illustrate this point, consider the locator variants provided with the Excalibur Text DataBlade, Excalibur Image DataBlade, and the Informix Video Foundation DataBlade modules for the IDS/UD. The Excalibur Text DataBlade module makes use of the LLD Locator data type. This locator allows for transparent access to media data stored in the file system of the IDS/UD server or in a binary large object of a database. The Excalibur Image DataBlade module references media data by the use of the IfdLocator which is essentially similar to LLD Locator but can store additional application-specific information. While the differences between LLD Locator and IfdLocator are mainly structural, the locator MedLoc introduced by the Video Foundation DataBlade supports other storage locations. The MedLoc locator serves to reference media data stored on a media server with the help of the Virtual Storage Interface (VSI), an interface provided by Informix for accessing data on media servers.

For the design of the MIB, which aims at the integration of the DataBlade modules mentioned above, a suitable concept for a uniform locator has to be developed. This design must take into consideration that the functionality of the underlying media data type-specific DataBlade modules should remain usable for applications. For instance, one would still like to use the functions and index structures of the Excalibur Image DataBlade responsible for the similarity search among images; these expect the media data type-specific locator IfdLocator, which is not necessarily applicable to functions of other media data type-specific DataBlade modules. Therefore, the MIB defines a uniform locator named uniLocator that abstracts from the different variants. It is able to encapsulate an instance of one of the various locator variants. Depending on the encapsulated locator variant, an instance of uniLocator accommodates a suitable structure to represent the type and the data of the encapsulated variant. In order to obtain instances of uniLocator, typecasts from the different locator variants to uniLocator are defined. Typecasts are also defined in the opposite direction to allow access to the media data type-specific functionality of the underlying DataBlade modules. However, not all instances of uniLocator can be cast to all locator variants. This depends on the locator variant from which the uniLocator instance was constructed: the locator variant used for the construction of the uniLocator instance might refer to a storage location not supported by the locator variant to which the instance of uniLocator is to be cast.
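The cast behaviour described above can be illustrated with a small Python sketch. It is not the MIB implementation, which relies on typecasts inside IDS/UD; the classes `VariantLocator` and `UniLocator`, their method names, and the capability table `SUPPORTED_STORAGE` are assumptions made for illustration. The storage locations listed per variant follow the descriptions of the three locator variants given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantLocator:
    """Stand-in for a media type-specific locator (illustrative only)."""
    variant: str    # e.g. "LLD Locator", "IfdLocator", "MedLoc"
    storage: str    # e.g. "file", "blob", "media_server"
    reference: str  # opaque address inside that storage location


# Assumed capability table: storage locations each variant can address.
SUPPORTED_STORAGE = {
    "LLD Locator": {"file", "blob"},
    "IfdLocator": {"file", "blob"},
    "MedLoc": {"media_server"},
}


@dataclass(frozen=True)
class UniLocator:
    """Uniform locator that wraps exactly one variant instance."""
    wrapped: VariantLocator

    @classmethod
    def from_variant(cls, locator: VariantLocator) -> "UniLocator":
        # Cast from a variant to the uniform locator: always possible.
        return cls(locator)

    def to_variant(self, variant: str) -> VariantLocator:
        # Cast back: only possible if the target variant can address the
        # storage location the instance was originally constructed from.
        if self.wrapped.storage not in SUPPORTED_STORAGE[variant]:
            raise TypeError(
                f"{variant} cannot reference storage '{self.wrapped.storage}'"
            )
        return VariantLocator(variant, self.wrapped.storage, self.wrapped.reference)
```

In this sketch, a uniform locator built from a file-based LLD Locator can be cast to an IfdLocator, whereas the cast to MedLoc raises an error because MedLoc is assumed to address media servers only.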

The uniform locator uniLocator has associated access routines that facilitate uniform, transparent access to media data. For this purpose, the MIB implements several user-defined routines. The access routine uniLocatorToClient copies the media data to the file system of the client calling this routine. For access at a finer granularity, the MIB implements routines like open, close, read, and seek with the usual file access semantics which in turn make use of the appropriate access methods provided with the specific locator variant encapsulated by the instance of uniLocator.
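To illustrate how uniform access routines can delegate to location-specific access methods, the following Python sketch uses two stand-in storage locations. All names (`FileLocation`, `BlobLocation`, `Locator`, `read_from`, `demo`) are hypothetical; they only mirror the idea of open, seek, and read with file access semantics, whereas the actual MIB routines are user-defined routines inside the DBMS.

```python
import io
import os
import tempfile
from dataclasses import dataclass
from typing import BinaryIO, Tuple, Union


@dataclass
class FileLocation:
    """Hypothetical storage location: a file in a file system."""
    path: str


@dataclass
class BlobLocation:
    """Hypothetical storage location: bytes held in a binary large object."""
    data: bytes


class Locator:
    """Encapsulates a storage location behind one set of access routines."""

    def __init__(self, location: Union[FileLocation, BlobLocation]):
        self._location = location

    def open(self) -> BinaryIO:
        # Pick the access method that fits the encapsulated location.
        if isinstance(self._location, FileLocation):
            return open(self._location.path, "rb")
        if isinstance(self._location, BlobLocation):
            return io.BytesIO(self._location.data)
        raise TypeError("unsupported storage location")


def read_from(locator: Locator, offset: int, count: int) -> bytes:
    """Uniform routine: the same call regardless of where the data lives."""
    with locator.open() as stream:
        stream.seek(offset)
        return stream.read(count)


def demo() -> Tuple[bytes, bytes]:
    """Read the same byte range through two different storage locations."""
    payload = b"media bytes"
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(payload)
        path = handle.name
    try:
        from_file = read_from(Locator(FileLocation(path)), 6, 5)
    finally:
        os.remove(path)
    from_blob = read_from(Locator(BlobLocation(payload)), 6, 5)
    return from_file, from_blob
```

Supporting a further storage location in this sketch means adding one location class and one branch in `open`, which corresponds to the extension effort discussed for the locator mechanism.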

The locator mechanism of the MIB as described above can be extended to cover additional storage locations for media data with relatively little effort. For this purpose, the type uniLocator must be extended to accommodate a suitable structure to reference media data in the new storage location. Furthermore, the user-defined routines for uniform media access must be extended to integrate the access to the new storage location. So far, we have provided uniLocator with support for the locator variants LLD Locator, IfdLocator, and MedLoc, thereby being able to transparently access media data on the file system of the database server, in binary large objects in a database, and on media servers supporting VSI. We plan to extend the locator mechanism with the capability to access media data stored on a web server using HTTP.

3 Media organization

In the previous section, we have introduced the concepts provided by the MIB to transparently access media data

stored in various locations. In addition to transparent access to the actual media data, an organization of media

data that allows a user to find and select relevant material efficiently is of importance. Regarding Cardio-OP, this

on the one hand applies to users looking for information; if they cannot find the information they want quickly,

the acceptance of the system will diminish. On the other hand, efficient retrieval and selection of media is also

important during a multimedia authoring process; if it is more difficult for an author of multimedia content to

find existing media data and to reuse it than to reproduce it, the degree of reuse will be very low which is not

cost-effective. Hence, a multimedia repository should support sophisticated, fine-grained retrieval of media data

according to media type, the associated technical metadata, and content. For instance, it should be possible to

limit a search for media data to those images coded in JPEG format not exceeding a width of 500 pixels and a

height of 250 pixels. Additionally, the multimedia repository should support the selection of media data by the

information content, or even better, the mixed retrieval of media data by technical metadata and content. This

section concentrates on the organization of media data according to media data type and technical metadata while

the support for content-based retrieval is presented in Section 4.

The underlying DataBlade modules, on which the MIB is based, provide technical metadata for the media data types they manage. However, this metadata is spread over the various modules and, as mentioned before, cannot be accessed in an integrated fashion by applications. Metadata applicable to several media data types might be named differently, might follow different units of measurement, or might not even be managed at all by the respective DataBlade modules. What is needed for applications is an integrated view on media data and the associated technical characteristics. Such a view should not restrict itself to technical metadata applicable to all supported media data types. The lowest common denominator would not be very helpful for applications as, apart from the size of media data in bytes and its coding format, there would not be much technical metadata applicable to all media data types.

The approach taken by the MIB is to create a new layer of information that provides a homogeneous and uniform model of media data and the associated technical metadata. Depending on the applications to be supported, a comprehensive subset is selected from all technical characteristics available with the different media data types and integrated in this layer. Each technical characteristic of this set is associated with all the media data types it applies to. To support sophisticated cross-media type queries, a further organization of this set is reasonable. A user might be interested in media data which can be viewed but does not care whether it is of type image or video. Another user might not be interested in a limitation to certain media types at all. To tackle this problem, the selected technical characteristics are organized in a specialization hierarchy (see Figure 1). The top of the hierarchy models those technical characteristics applicable to all media data types, i.e., the lowest common denominator. The leaves of the specialization hierarchy represent the technical metadata that are applicable to specific media data types. The inner nodes of the specialization hierarchy group technical metadata of closely interrelated media data types. For example, Viewable subsumes technical metadata common to all media data types that can be viewed by humans like width and height. This specialization hierarchy is based on the assumption that the inner nodes reflect distinctions between media data types that are likely to be important for user queries.

Figure 1: Media organization (specialization hierarchy with root Medium, inner nodes Readable, Audible, and Viewable, and leaves Text, Image, Video, and Audio)

Figure 2: Video $v$ with discrete and continuous annotations (continuous annotations for “cardiovascular drugs”, “infusion”, and “thorac” along the time axis, with interval b of weight 0.8, interval a of weight 1, and interval c of weight 0.2; discrete annotations “operation” with weight 1 and “emergency” with weight 0.8)

Employing this organization, an application is able to search for media data with regard to certain technical characteristics without necessarily limiting the search to one particular media data type. The organization of media data is not limited to the specialization hierarchy as depicted in Figure 1. New media data types can be included in this organization by placing them into the specialization hierarchy under the most suitable inner node. In doing so, the set of technical metadata is extended by additional technical characteristics applicable only to the new media data type. New types of queries can be supported by the reorganization and/or more fine-grained specialization of the hierarchy.

The specialization hierarchy of Figure 1 is realized by the MIB with a table hierarchy, with each table representing a node. For this, the MIB utilizes primitives of the object-relational DBMS IDS/UD which allow the definition of specialization relationships between tables. The columns of these tables model the technical metadata associated with the respective node. An additional column is provided with the root table that references the actual media data with the help of the uniLocator. This establishes the link between technical metadata and media data.
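The effect of such a table hierarchy on cross-media type queries can be sketched in Python, with class inheritance standing in for the specialization relationships between tables. This is an analogy under stated assumptions rather than the MIB schema: the attribute names, the placement of Video under Viewable only, and the sample rows are invented for illustration, and the `locator` string merely stands in for the uniLocator column of the root table.

```python
from dataclasses import dataclass
from typing import List, Type


@dataclass
class Medium:
    locator: str        # stands in for the uniLocator column of the root table
    size_bytes: int
    coding_format: str


@dataclass
class Viewable(Medium):
    width: int
    height: int


@dataclass
class Image(Viewable):
    pass


@dataclass
class Video(Viewable):
    duration_s: float


@dataclass
class Text(Medium):
    pass


def select(rows: List[Medium], node: Type[Medium]) -> List[Medium]:
    """Return the rows of a node and of all its specializations."""
    return [row for row in rows if isinstance(row, node)]


# Invented sample rows.
ROWS: List[Medium] = [
    Image("img-1", 40_000, "JPEG", 400, 200),
    Image("img-2", 90_000, "JPEG", 800, 600),
    Video("vid-1", 5_000_000, "MPEG", 352, 240, 60.0),
    Text("txt-1", 2_000, "ASCII"),
]

# Cross-media type query: anything viewable of at most 500 x 250 pixels.
small_viewables = [
    row.locator
    for row in select(ROWS, Viewable)
    if row.width <= 500 and row.height <= 250
]
```

Selecting on `Viewable` covers images and videos alike, while selecting on `Medium` covers every row; a new media type would be added as a further subclass under the most suitable node.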

4 Content-based retrieval

In the previous section, we have explained how the MIB organizes media data according to technical metadata. In addition, support for content-based retrieval of media data in a uniform fashion, independent of the particular media data type, is necessary. However, the facilities for content-based retrieval offered by the media data type-specific database extensions are tailored to the respective media data types. Thus, uniform and integrated cross-media type content-based retrieval is difficult to achieve. For the integrated, uniform content-based access to media data of different types, a uniform layer of content-descriptive information is established on which uniform content-based queries can be executed. To this end, we first introduce the notion of annotations which relate media data of different types to concepts organized in a hierarchy. Based on this annotation scheme, we informally introduce the different semantics of the content-based retrieval to be supported by the MIB. Following these query semantics, we define a set of query operators for content-based retrieval that implement the introduced semantics.

4.1 Media annotation

Media annotations relate media data to concepts from an application domain with a given confidence in order to semantically describe the content of the media data. A concept is an abstract idea of an entity important for a particular application domain. An example of a concept taken from the application domain of Cardio-OP is “chest of the human body”. Each concept is associated with one or more not necessarily unambiguous terms called captions in some language. For instance, the captions “thorac” and “chest” are two terms for the concept “chest of the human body”. Concepts are organized in a specialization hierarchy. This means that if media data is related to a concept $c_1$ which is a subconcept of $c_2$, then the media data related to $c_1$ is considered to be related to $c_2$ as well. The concept hierarchy reflects the knowledge of the application domain. We distinguish between discrete annotations and continuous annotations. Discrete annotations relate complete media data to concepts. However, in case of continuous media data types, it might be of interest to relate concepts only to temporal intervals of the media data. Thereby, a more fine-grained description of content can be achieved.

The confidence of an annotation, either continuous or discrete, is given by a weight $w \in [0, 1]$ specifying how strongly the content of a medium $m$ is related to a concept $c$. This paves the way for “fuzzy” content-based retrieval by which media may qualify with a given confidence for a query. Qualifying media can then be presented to the user ordered by confidence. It is widely accepted that for content-based retrieval a simple Boolean model in which media either qualify or not is too restrictive for practical use [BYRN99].

Note that we do not make any assumptions on the way annotations are created. They might be created manually. Also, they might be (semi-)automatically extracted from metadata facilities provided with the underlying media data type-specific database extensions.

The symbols introduced in Definition 1 are used in the formal definitions to follow. In Definition 2 we then introduce the notions of discrete annotation and continuous annotation formally.

Definition 1 — Symbols:

Let $M$ denote the set of all media data.

Let $M_c \subseteq M$ denote the set of all media data of continuous media data type.

Let $M_d \subseteq M$ denote the set of all media data of discrete media data type.

Let $C$ denote the set of concepts organized in the concept hierarchy.

Let $T$ denote the set of all captions associated with the concepts $c \in C$.

The function $duration(m)$ returns the duration of a continuous medium $m \in M_c$.

For all $m \in M_c$, let $I_m$ denote the set of all valid time intervals $[s, e]$, with $0 \leq s \leq e$, and $e \leq duration(m)$.

Definition 2 — Annotation

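The quadruple $(m, c, i, w)$ with $m \in M$, $c \in C$, and $w \in [0, 1]$ is called an annotation of $m$ with weight $w$. If $i = \varepsilon$ then $(m, c, i, w)$ is called a discrete annotation. Otherwise, $(m, c, i, w)$ is called a continuous annotation.

The following condition must hold for $(m, c, i, w)$: $i \neq \varepsilon \Rightarrow m \in M_c \wedge i \in I_m$, i.e., a time interval may only be given for a continuous medium $m$ and must be a valid time interval of $m$.

The annotation $(m, c, i, w)$ is called suiting to caption $t$ iff $t$ is a caption associated with $c$, or $c$ is a subconcept of a concept associated with $t$.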

Definition 2 ensures that only continuous media data can be annotated continuously. With the notion of suiting, we provide means to find those concepts corresponding to a given caption.
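The two notions of this definition, validity of an annotation and suiting to a caption, can be made concrete with a minimal Python sketch. The concept hierarchy, the captions, and the media durations below are hypothetical sample data, `None` plays the role of the missing interval of a discrete annotation, and suiting is read in line with the specialization rule stated earlier (media annotated with a subconcept also count for the superconcept).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

# Hypothetical concept hierarchy: concept id -> parent concept id (None = root).
PARENT: Dict[str, Optional[str]] = {
    "body_part": None,
    "chest": "body_part",
    "heart": "chest",
}

# Hypothetical captions per concept.
CAPTIONS: Dict[str, Set[str]] = {
    "body_part": {"body part"},
    "chest": {"thorac", "chest"},
    "heart": {"heart"},
}

# Hypothetical durations (seconds) of continuous media; others are discrete.
DURATION: Dict[str, float] = {"video_v": 3600.0}


@dataclass(frozen=True)
class Annotation:
    medium: str
    concept: str
    interval: Optional[Tuple[float, float]]  # None marks a discrete annotation
    weight: float

    def is_valid(self) -> bool:
        if self.interval is None:
            return True  # discrete annotation, allowed for any medium
        if self.medium not in DURATION:
            return False  # only continuous media may carry an interval
        start, end = self.interval
        return 0 <= start <= end <= DURATION[self.medium]


def is_subconcept_or_same(concept: str, ancestor: str) -> bool:
    """Walk up the hierarchy from `concept` looking for `ancestor`."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


def suits(annotation: Annotation, caption: str) -> bool:
    """True if `caption` names the annotated concept or one of its superconcepts."""
    return any(
        caption in captions and is_subconcept_or_same(annotation.concept, concept)
        for concept, captions in CAPTIONS.items()
    )
```

With this sample data, an annotation with the concept `heart` suits the caption “thorac” because `heart` is a subconcept of `chest`, whereas an annotation with `chest` does not suit the caption “heart”.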

For the management of the concept hierarchy, the MIB relies on the commercially available COCOON DataBlade module by dimedis [dim98]. The MIB models annotations by providing a table called Annotations, each row representing an annotation. This table features one column for referencing the media data, one column which contains the id of the concept with which the media data is annotated, a column specifying the weight of the annotation, and a column describing the time interval in which the annotation is valid. This latter column is NULL in case of a discrete annotation. The table Annotations realizes the association of concepts with media data.

4.2 Semantics of content-based retrieval

The annotations introduced in the previous subsection offer a means to describe the content of media data independent of the respective media data type. In order to exploit these annotations for content-based retrieval, we distinguish between continuous and discrete query semantics.

In the discrete query semantics, annotations of both discrete and continuous media data are understood as relating to the entire media data. Continuous annotations are treated as if they were discrete, i.e., $(m, c, i, w)$ is considered equivalent to $(m, c, \varepsilon, w)$. The results of queries with discrete semantics are always weighted references to entire media. Consider, for example, the annotated video $v$ shown in Figure 2. With discrete semantics, a query for media data referring to the concepts “thorac” and “infusion” would return a weighted reference to the entire video $v$, even though these concepts have been annotated only to individual intervals of $v$.

The results of queries with discrete semantics are very coarse-grained, especially if continuous media data are involved. It is not very helpful for a user to retrieve a video clip with a duration of one hour that at some point in time relates to certain concepts. A user would rather like to know when the content refers to the desired concepts. For this kind of query, we provide continuous query semantics which respect the time intervals of continuous annotations.

In case of discrete media data, the continuous query semantics return weighted references to entire media data just as with discrete semantics. In the case of continuous media data, however, the continuous query semantics return weighted references to time intervals of continuous media data. These are the time intervals of continuous media data that are relevant with regard to the query. Discrete annotations of continuous media data are treated as continuous annotations that apply to the entire media data, i.e., the discrete annotation $(m, c, \varepsilon, w)$ is considered equivalent to the continuous annotation $(m, c, [0, duration(m)], w)$. Consider a query for media data referring to the concepts “thorac” and “infusion” with continuous semantics. Regarding the example given in Figure 2, the question is whether the depicted video $v$ would strongly qualify as a result for this query or not. If it were sufficient that at least two intervals exist in $v$, each associated with one of the concepts, $v$ would be a highly weighted part of the query result. However, if the qualifying condition additionally demands that the intervals which bear relevant annotations for the query must overlap, $v$ is not very relevant to the query.

In order to exactly define the query semantics, operators for the retrieval, conjunction, and disjunction of concepts are introduced and formally defined in the following section.

4.3 Query operators

For the two different query semantics, we now formally define the query operators that realize these semantics. These query operators are $retrieve$, $and$, and $or$. One variant of $retrieve$ is provided for each of the discrete and continuous query semantics, while $and$ and $or$ can be applied to both query semantics. The query operators return sets of weighted references to media data as a result, with the weight specifying the confidence with which a reference to media data is relevant for a query. The result of a query can thus be sorted by the weights in order to present the most relevant media to the user first. Like annotations, a reference may be discrete, in case the whole media data is referenced, or continuous, in case only a time interval of media data of a continuous media data type is referenced. Definition 3 formally introduces the notion of references.

Definition 3 — Reference

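The triple $(m, i, w)$ with $m \in M$ and $w \in [0, 1]$ is called a reference to $m$ with weight $w$. If $i = \varepsilon$ then $(m, i, w)$ is called a discrete reference. Otherwise, $(m, i, w)$ is called a continuous reference. The following condition must hold for $(m, i, w)$: $i \neq \varepsilon \Rightarrow m \in M_c \wedge i \in I_m$, i.e., only a continuous medium may be referenced by a time interval, which must be a valid time interval of that medium.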

Given the notion of reference, we can now introduce the first of the query operators, $retrieve$, in Definition 5. The operator allows selecting the media data referring to concepts described by a given caption $t$. This operator is the cornerstone of the content-based retrieval facilities. The variant for discrete query semantics, $retrieve_d$, returns only discrete references to media data while the variant for continuous query semantics, $retrieve_c$, returns fine-grained continuous references wherever possible. The continuous variant $retrieve_c$ ensures that only continuous references to continuous media data are returned in the result set by converting discrete references on continuous media data $m$ to the equivalent continuous reference on interval $[0, duration(m)]$. This is done to simplify the following formal definitions of query operators for continuous semantics. Additional symbols used in the ensuing definitions are given by Definition 4.

Definition 4 — Symbols:

Let $R$ denote the set of all possible references.

Let $A$ denote the set of annotations.

Definition 5 — Query operator

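$retrieve$

The query operator for discrete query semantics, $retrieve_d$, is defined as follows: $(m, \varepsilon, w) \in retrieve_d(t)$ iff there exists an annotation $(m, c, i, w) \in A$ with $(m, c, i, w)$ suiting to $t$.

The query operator for continuous semantics, $retrieve_c$, is defined as follows: if $(m, c, i, w) \in A$ with $m \in M_c$ and $i = \varepsilon$, and $(m, c, i, w)$ is suiting to $t$, then $(m, [0, duration(m)], w) \in retrieve_c(t)$. Otherwise, if $(m, c, i, w) \in A$ is suiting to $t$, then $(m, i, w) \in retrieve_c(t)$.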

Considering the video $v$ in Figure 2, the query $\mathit{retrieve}_d(\text{“thorac”})$ returns a weighted discrete reference $(v, \varepsilon, w)$ among its results. In contrast, the query $\mathit{retrieve}_c(\text{“thorac”})$ returns a continuous reference $(v, b, w)$ among its results. The query result for $\mathit{retrieve}_c(\text{“operation”})$ would encompass a weighted reference $(v, [0, \mathit{duration}(v)], w)$. This is due to the handling of discrete references on continuous media data mentioned above.
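The two variants of the retrieve operator in Definition 5 can be illustrated with a minimal Python sketch. It is not the MIB implementation: the names `Annotation`, `Reference`, `suits`, `retrieve_discrete`, and `retrieve_continuous` are hypothetical, `None` stands in for the discrete marker, the sample annotations and durations are invented, and caption matching is reduced to string equality, whereas the MIB resolves captions against the concept hierarchy.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set, Tuple

Interval = Tuple[float, float]


class Annotation(NamedTuple):
    media: str
    concept: str
    interval: Optional[Interval]  # None plays the role of the discrete marker
    weight: float


class Reference(NamedTuple):
    media: str
    interval: Optional[Interval]
    weight: float


# Hypothetical sample data: durations exist only for continuous media.
DURATION: Dict[str, float] = {"v": 120.0}
ANNOTATIONS = [
    Annotation("v", "operation", None, 0.7),
    Annotation("v", "thorac", (30.0, 60.0), 0.6),
    Annotation("img", "thorac", None, 0.5),
]


def suits(concept: str, caption: str) -> bool:
    """Assumed matching rule: plain equality instead of a concept hierarchy."""
    return concept == caption


def retrieve_discrete(caption: str, annotations: Iterable[Annotation]) -> Set[Reference]:
    """Discrete semantics: every matching annotation yields a discrete reference."""
    return {Reference(a.media, None, a.weight)
            for a in annotations if suits(a.concept, caption)}


def retrieve_continuous(caption: str, annotations: Iterable[Annotation],
                        duration: Dict[str, float]) -> Set[Reference]:
    """Continuous semantics: discrete annotations on continuous media are
    widened to the interval [0, duration(m)]; all others keep their interval."""
    result = set()
    for a in annotations:
        if not suits(a.concept, caption):
            continue
        if a.interval is None and a.media in duration:
            result.add(Reference(a.media, (0.0, duration[a.media]), a.weight))
        else:
            result.add(Reference(a.media, a.interval, a.weight))
    return result
```

With this sample data, `retrieve_continuous("operation", ANNOTATIONS, DURATION)` widens the discrete annotation on the video to its whole duration, while the discrete annotation on the image stays discrete under both semantics.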

It is not very satisfying to be able to query for the content of media data according to one caption only. Rather, a user might want to use a conjunction of several concepts to query media data. For this purpose, the query operator $\mathit{and}$ is provided, which is formally described in Definition 6. This operator basically calculates some sort of intersection between two sets of weighted references to media data². However, the formal definition is complicated by the fact that the operator considers weights and, in case of continuous references, temporal intervals.

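Definition 6 — Query operator $\mathit{and}$

Let $\mu_{and}: [0, 1] \times [0, 1] \to [0, 1]$ with $\mu_{and}(w_1, w_2) = 1 - \sqrt{\frac{(1 - w_1)^2 + (1 - w_2)^2}{2}}$ denote the weight-merge function for $\mathit{and}$³. The query operator $\mathit{and}$ on two sets of references $R_1$ and $R_2$ is then defined as follows: $(m, i, w) \in \mathit{and}(R_1, R_2)$ iff one of the ensuing conditions holds:

1. $\exists\, (m, i_1, w_1) \in R_1$ with $i_1 \neq \varepsilon$ and $\exists\, (m, i_2, w_2) \in R_2$ with $i_2 \neq \varepsilon$: $i_1$ temporally overlaps $i_2$ and $i = i_1 \cap i_2$ such that $w = \mu_{and}(w_1, w_2)$.

2. $\exists\, (m, i, w_1) \in R_1$ with $i \neq \varepsilon$ and $w = \mu_{and}(w_1, 0)$ such that $\nexists\, (m, i_2, w_2) \in R_2$ with $i_2 \neq \varepsilon$ and $i$ temporally overlaps $i_2$, and $\nexists\, (m, \varepsilon, w_2) \in R_2$.

3. $\exists\, (m, i, w_2) \in R_2$ with $i \neq \varepsilon$ and $w = \mu_{and}(0, w_2)$ such that $\nexists\, (m, i_1, w_1) \in R_1$ with $i_1 \neq \varepsilon$ and $i_1$ temporally overlaps $i$, and $\nexists\, (m, \varepsilon, w_1) \in R_1$.

4. $\exists\, (m, \varepsilon, w_1) \in R_1$ and $\exists\, (m, i, w_2) \in R_2$ such that $w = \mu_{and}(w_1, w_2)$.

5. $\exists\, (m, \varepsilon, w_2) \in R_2$ and $\exists\, (m, i, w_1) \in R_1$ such that $w = \mu_{and}(w_1, w_2)$.

6. $\exists\, (m, i, w_1) \in R_1$ with $i = \varepsilon$ and $w = \mu_{and}(w_1, 0)$ such that $\nexists\, (m, i_2, w_2) \in R_2$.

7. $\exists\, (m, i, w_2) \in R_2$ with $i = \varepsilon$ and $w = \mu_{and}(0, w_2)$ such that $\nexists\, (m, i_1, w_1) \in R_1$.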

Condition 1 states that, should two continuous references to the same media data $m$ exist in the sets $R_1$ and $R_2$ and should the time intervals of these references overlap, a continuous reference to the intersection of the intervals on $m$ is inserted into the result set with merged weight. Condition 2 and Condition 3 ensure that if a continuous reference $r$ to media data $m$ exists in only one set, or there are only continuous references to $m$ in the other set which do not temporally overlap with $r$, then $r$ is present in the result set with reduced weight. Conditions 4 and 5 deal with occurrences of discrete and continuous references to the same media data $m$ in both sets. In this case, the continuous reference is included in the result set with merged weight, since discrete references on continuous media data are treated like continuous references spanning the whole duration of the media data. Conditions 4 and 5 also deal with discrete references to the same media data $m$ in both sets, in the way that one of these references is inserted into the result set with merged weight. Condition 6 and Condition 7 are the equivalents to Condition 2 and Condition 3 for discrete references.

² It is an objective of our design that the query operators $\mathit{and}$ and $\mathit{or}$ can be arbitrarily nested. This allows for the flexible composition of complex queries. To achieve this goal, the operators $\mathit{and}$ and $\mathit{or}$ do not take captions as arguments like the $\mathit{retrieve}$ operator, but rather sets of weighted references. These sets may, of course, be obtained as a result of a $\mathit{retrieve}$ operator. For instance, in order to achieve a conjunction between two concepts, the $\mathit{and}$ operator is employed on two $\mathit{retrieve}$ operators, which in turn deliver the references to the media data referring to the desired concepts.

³ The choice of the weight-merge function $\mu_{and}$ is motivated by the fact that $\mathit{and}$ intends to have semantics similar to the boolean $\mathit{and}$. As the boolean $\mathit{and}(x, y)$ yields $1$ only for $x = y = 1$, the optimal situation for $\mathit{and}$ is having two references $(m, i_1, w_1)$ and $(m, i_2, w_2)$ to the same media data $m$ with weights $w_1 = w_2 = 1$. Thus, following [SFW83], we use the complement of the normalized Euclidean distance of the point $(w_1, w_2)$ from the desirable position $(1, 1)$ as a measure of similarity to the optimal situation for $\mathit{and}$.

Let us give some examples of the $\mathit{and}$ query operator. Regarding the video $v$ shown in Figure 2, the query using continuous query semantics $\mathit{and}(\mathit{retrieve}_c(\text{“cardiovascular drugs”}), \mathit{retrieve}_c(\text{“thorac”}))$ returns the reference $(v, a \cap b, 0.86)$ among its results by Condition 1. In contrast to that, the query using continuous query semantics $\mathit{and}(\mathit{retrieve}_c(\text{“cardiovascular drugs”}), \mathit{retrieve}_c(\text{“infusion”}))$ returns the references $(v, a, 0.29)$ and $(v, c, 0.09)$ as results by Conditions 2 and 3. Considering a query with discrete semantics, $\mathit{and}(\mathit{retrieve}_d(\text{“cardiovascular drugs”}), \mathit{retrieve}_d(\text{“operation”}))$ yields a reference $(v, \varepsilon, w)$ with merged weight $w$ as a result by applying Conditions 4 and 5.
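The seven conditions of the conjunction operator can be traced in a short Python sketch. It assumes that the weight-merge function is the complement of the normalized Euclidean distance from (1, 1), as the footnote on its choice describes, and that references are plain tuples `(media, interval, weight)` with `None` as the discrete marker. The names `merge_and`, `overlap`, and `and_op` as well as all weights and intervals are hypothetical and not read off Figure 2.

```python
import math


def merge_and(w1, w2):
    """Complement of the normalized Euclidean distance of (w1, w2) from (1, 1)."""
    return 1.0 - math.sqrt(((1.0 - w1) ** 2 + (1.0 - w2) ** 2) / 2.0)


def overlap(i1, i2):
    """Intersection of two intervals, or None if they do not overlap."""
    start, end = max(i1[0], i2[0]), min(i1[1], i2[1])
    return (start, end) if start < end else None


def and_op(r1, r2):
    """Conjunction of two sets of references (media, interval, weight)."""
    result = set()
    # Conditions 1, 4 and 5: both sets reference the same media data.
    for (m1, i1, w1) in r1:
        for (m2, i2, w2) in r2:
            if m1 != m2:
                continue
            w = merge_and(w1, w2)
            if i1 is not None and i2 is not None:
                common = overlap(i1, i2)
                if common is not None:
                    result.add((m1, common, w))  # Condition 1
            elif i1 is None:
                result.add((m1, i2, w))          # Condition 4
            else:
                result.add((m1, i1, w))          # Condition 5
    # Conditions 2, 3, 6 and 7: a reference without a partner in the other
    # set is kept with reduced weight (merged with weight 0).
    for own, other in ((r1, r2), (r2, r1)):
        for (m, i, w) in own:
            partners = [i2 for (m2, i2, _w2) in other if m2 == m]
            if i is None:
                matched = bool(partners)
            else:
                matched = any(i2 is None or overlap(i, i2) is not None
                              for i2 in partners)
            if not matched:
                result.add((m, i, merge_and(w, 0.0)))
    return result


# Two non-overlapping continuous references with assumed weights 0.9 and 0.2.
unmatched = and_op({("v", (0.0, 40.0), 0.9)}, {("v", (50.0, 60.0), 0.2)})
```

In this run both references survive with reduced weight, which corresponds to Conditions 2 and 3. Replacing the second interval with one that overlaps the first yields a single reference to the intersection (Condition 1).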

Similar to the $\mathit{and}$ query operator that allows for the conjunction of several captions in a query, the $\mathit{or}$ query operator has been provided to support the disjunction of several captions as well. The operator basically calculates some sort of union between two sets of weighted references to media data. Definition 7 formally introduces the $\mathit{or}$ query operator.

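Definition 7 — Query operator $\mathit{or}$

Let $\mu_{or}: [0, 1] \times [0, 1] \to [0, 1]$ with $\mu_{or}(w_1, w_2) = \sqrt{\frac{w_1^2 + w_2^2}{2}}$ denote the weight-merge function for $\mathit{or}$⁴. The query operator $\mathit{or}$ on two sets of references $R_1$ and $R_2$ is defined as follows: $(m, i, w) \in \mathit{or}(R_1, R_2)$ iff one of the ensuing conditions holds:

1. $\exists\, (m, i, w_1) \in R_1$ and $\exists\, (m, i_2, w_2) \in R_2$ such that $w = \mu_{or}(w_1, w_2)$.

2. $\exists\, (m, i_1, w_1) \in R_1$ and $\exists\, (m, i, w_2) \in R_2$ such that $w = \mu_{or}(w_1, w_2)$.

3. $\exists\, (m, i, w_1) \in R_1$ with $w = \mu_{or}(w_1, 0)$ such that $\nexists\, (m, i_2, w_2) \in R_2$.

4. $\exists\, (m, i, w_2) \in R_2$ with $w = \mu_{or}(0, w_2)$ such that $\nexists\, (m, i_1, w_1) \in R_1$.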

Condition 1 and Condition 2 state that references, either continuous or discrete, to the same media data $m$ in both sets $R_1$ and $R_2$ appear in the result set with merged weight. Conditions 3 and 4 deal with references to media data $m$ that are made only in one of the sets $R_1$ and $R_2$. Such references are inserted into the result set with reduced weight.

Giving again some examples regarding video $v$ of Figure 2, the query $\mathit{or}(\mathit{retrieve}_d(\text{“thorac”}), \mathit{retrieve}_d(\text{“infusion”}))$ returns among the result set the discrete reference $(v, \varepsilon, 0.68)$ by Condition 1 and Condition 2. The query $\mathit{or}(\mathit{retrieve}_c(\text{“thorac”}), \mathit{retrieve}_c(\text{“nurse”}))$ returns the continuous reference $(v, b, 0.46)$ among its results, since $v$ is not annotated with the concept “nurse” (Condition 3).
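The disjunction operator can be sketched in the same style. The sketch assumes that the weight-merge function is the normalized Euclidean distance from (0, 0), as the footnote on its choice describes, and again uses tuples `(media, interval, weight)` with `None` as the discrete marker. The names `merge_or` and `or_op` and the sample weights are hypothetical.

```python
import math


def merge_or(w1, w2):
    """Normalized Euclidean distance of (w1, w2) from the position (0, 0)."""
    return math.sqrt((w1 ** 2 + w2 ** 2) / 2.0)


def or_op(r1, r2):
    """Disjunction of two sets of references (media, interval, weight)."""
    result = set()
    for own, other in ((r1, r2), (r2, r1)):
        for (m, i, w) in own:
            partner_weights = [w2 for (m2, _i2, w2) in other if m2 == m]
            if partner_weights:
                # Conditions 1 and 2: the media data is referenced in both
                # sets, so the reference is kept with merged weight.
                for w2 in partner_weights:
                    result.add((m, i, merge_or(w, w2)))
            else:
                # Conditions 3 and 4: referenced in one set only, so the
                # reference is kept with reduced weight.
                result.add((m, i, merge_or(w, 0.0)))
    return result


# A continuous reference with assumed weight 0.65 and no partner.
one_sided = or_op({("v", (30.0, 60.0), 0.65)}, set())
```

Unlike the conjunction, the disjunction never intersects intervals: each input reference keeps its own interval and only its weight changes.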

The operators we have introduced so far can be arbitrarily nested and, therefore, provide a powerful means of uniform, content-based retrieval of media data. For instance, the nested query $\mathit{and}(\mathit{or}(\mathit{retrieve}(\text{“thorac”}), \mathit{retrieve}(\text{“medicament”})), \mathit{retrieve}(\text{“complication”}))$ returns weighted references to all media data of any type with contents dealing with the concepts “thorac” or “medicament” and with any complications occurring at the same time.

The MIB implements the query operators of both semantics introduced above as user-defined routines. These are based on the primitives for content-based retrieval offered by the COCOON DataBlade module. These primitives allow retrieving all database rows that contain references to a given concept of the concept hierarchy managed by COCOON. Moreover, a user-defined routine named medToTempTable has been provided which copies the results of a query to a temporary table. This table can then simply be joined with a table of the organization hierarchy of Section 3, thereby allowing for the mixed retrieval of media data according to technical characteristics and content.
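The idea of joining a temporary result table with technical metadata can be illustrated with a small, self-contained Python sketch. It uses an in-memory SQLite database as a stand-in for the Informix server and does not call medToTempTable itself. The table names `video` and `query_result`, their columns, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of the organization hierarchy holding technical metadata.
conn.execute("CREATE TABLE video (media_id TEXT PRIMARY KEY, frame_rate REAL)")
conn.executemany("INSERT INTO video VALUES (?, ?)",
                 [("v", 25.0), ("v2", 15.0)])

# Stand-in for the outcome of a content-based query: weighted references
# copied into a temporary table.
references = [("v", 0.86), ("v2", 0.40), ("img", 0.55)]
conn.execute("CREATE TEMP TABLE query_result (media_id TEXT, weight REAL)")
conn.executemany("INSERT INTO query_result VALUES (?, ?)", references)

# Mixed retrieval: restrict the content-based result by a technical
# characteristic and rank the remaining media data by weight.
rows = conn.execute(
    "SELECT q.media_id, q.weight "
    "FROM query_result AS q JOIN video AS t ON t.media_id = q.media_id "
    "WHERE t.frame_rate >= ? ORDER BY q.weight DESC",
    (20.0,),
).fetchall()
```

Only media data that appears in both tables and satisfies the technical constraint remains in `rows`, so the content-based ranking and the metadata filter are combined in a single relational query.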

⁴ For the choice of the weight-merge function $\mu_{or}$, an argument similar to the choice of $\mu_{and}$ applies. As the boolean operator $\mathit{or}(x, y)$ yields $0$ only for $x = y = 0$, the least optimal situation for $\mathit{or}$ is having two references $(m, i_1, w_1)$ and $(m, i_2, w_2)$ to the same media data $m$ with weights $w_1, w_2$ close to $0$. Hence, we use the normalized Euclidean distance of the point $(w_1, w_2)$ from the least desirable position $(0, 0)$ as a measure of similarity to the optimal situation for $\mathit{or}$.

5 Application of the Media Integration DataBlade Module

The MIB has been successfully implemented and employed by our group for the management of the media data in the context of the Cardio-OP project. Recently, we have developed a graphical media browser which can be employed for the retrieval of media data managed by the MIB according to content and technical metadata in an intuitive manner. This browser exploits the features of the MIB with regard to the organization of technical metadata and the sophisticated content-based retrieval. Figure 3 shows a screenshot of the media browser. The left part of the browser shows the concept hierarchy (with German captions) managed by the MIB (see Section 4.1), out of which a user can select the desired set of concepts. The top right part of the browser allows limiting the search to specific media data types. According to this selection, the browser offers means to further narrow down the search by specifying constraints for those technical metadata applicable to the selected media data types. At the bottom right, the query semantics for the content-based retrieval as presented in Section 4.2 can be chosen. The query operators presented in Section 4.3 are used implicitly by the browser when it comes to formulating the query to the DBMS. Finally, the search button activates the retrieval process and returns references to all qualifying media data using the uniform locator mechanism presented in Section 2. The returned locators can be transferred into a result pool (see the tab named “Pool” at the top left) that persists across several queries and can later be used to access the media data transparently.

Figure 3: Screenshot of the media browser component

The media browser has been developed as a Java Bean for reuse in further applications that need media management and media browsing facilities. This component is currently reused for the development of a graphical annotation editor and an authoring tool for multimedia presentations. The graphical annotation editor aims at the comfortable creation and manipulation of annotations to discrete and continuous media data. Here, the media browser component is used to select media data for annotation and to browse the annotations. The authoring tool allows building multimedia presentations using the ZYX document model [BK99], which has been developed in the context of the Cardio-OP project since standard models like SMIL, MHEG-5, and HyTime do not meet the project’s specific requirements for adaptation, reuse, and presentation neutrality of multimedia content [BKW99b]. The authoring tool utilizes the media browser component, and thereby the MIB, to browse and select media data for use in the multimedia composition. These tools show the application of the MIB for the integrated, uniform management and retrieval of media data of different types. Other applications exploiting the MIB for the intelligent management of media data can be imagined. Our group, for instance, plans to employ the MIB with a modified media organization scheme for the management of publications that have been downloaded from the Internet.

6 Conclusion and Outlook

Starting out from the need of multimedia information systems for the integrated and uniform management of and access to media data of different media data types, we introduced the Media Integration DataBlade module as a generic component which unites media data type-specific database extensions under a common roof. We introduced the locator mechanism of the MIB, which allows for the transparent, uniform access to media data stored in various locations. We then explained how the media data is organized to effectively support sophisticated search and retrieval according to technical characteristics. We illustrated the facilities for content-based retrieval and showed that the MIB allows for the mixed retrieval of media data of various types by content and by technical metadata. We demonstrated how the MIB can be employed in applications requiring sophisticated, uniform, intelligent management of media data of different types: browsing and content-based retrieval of media data, management of annotations to media data, and multimedia authoring. We plan to extend the MIB with support for further media data types. For this purpose, we are evaluating database extensions for the management of audio data, MPEG streams, and animations. In addition, there is ongoing work on a streaming server placed on top of the MIB, following the architecture of [BKL96], which supports the delivery of continuous media data over a network.

Acknowledgments: We would like to thank Susanne Boll for her valuable comments and her great efforts to

prepare and improve this version of the paper.

References

[AN97] D. A. Adjeroh and K. C. Nwosu. Multimedia Database Management – Requirements and Issues. IEEE Multimedia, 4(3), 1997.

[BG99] S. Bechhofer and C. Goble. Classification Based Navigation and Retrieval for Picture Archives. In Proceedings 8th IFIP Conference on Data Semantics (DS-8), Rotorua, New Zealand, January 1999. Kluwer Academic Publishers.

[BK99] S. Boll and W. Klas. ZYX — A Semantic Model for Multimedia Documents and Presentations. In Proceedings of the 8th IFIP Conference on Data Semantics (DS-8): “Semantic Issues in Multimedia Systems”, Rotorua, New Zealand, January 1999.

[BKL96] S. Boll, W. Klas, and M. Löhr. Integrated Database Services for Multimedia Presentations. In S. M. Chung, editor, Multimedia Information Storage and Management. Kluwer Academic Publishers, Dordrecht, 1996.

[BKW99a] S. Boll, W. Klas, and U. Westermann. Exploiting OR-DBMS Technology to Implement the ZYX Data Model for Multimedia Documents and Presentations. In Proceedings Datenbanksysteme in Büro, Technik und Wissenschaft (BTW), Freiburg, Germany, March 1999.

[BKW99b] S. Boll, W. Klas, and U. Westermann. Multimedia Document Formats — Sealed Fate or Setting Out for New Shores? In Proceedings IEEE International Conference on Multimedia Computing and Systems (ICMCS), Florence, Italy, June 1999.

[BYRN99] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, Harlow, England, 1999.

[dim98] dimedis. The COCOON DataBlade Module Users Guide. dimedis GmbH, Cologne, Germany, 1998.

95] M. Flickner et al. Query by Image and Video Content: The QBIC System. IEEE Multimedia, 28(9), 1995.

[JE98] H. Jiang and A. K. Elmagarmid. Spatial and Temporal Content-Based Access to Hypervideo Databases. VLDB Journal, 7(4), 1998.

[KA97] W. Klas and K. Aberer. Multimedia and its Impact on Database System Architectures. In P.M.G. Apers, H.M. Blanken, and M.A.W. Houtsma, editors, Multimedia Databases in Perspective. Springer, London, 1997.

[KGF99] W. Klas, C. Greiner, and R. Friedl. Cardio-OP — Gallery of Cardiac Surgery. In Proceedings IEEE International Conference on Multimedia Computing and Systems (ICMCS), Florence, Italy, June 1999.

[LR95] M. Löhr and T. C. Rakow. Audio Support for an Object-Oriented Database Management System. Multimedia Systems Journal, 3(5-6), 1995.

[OOL+97] V. Oria, M. T. Özsu, L. Liu, et al. Modeling Images for Content-Based Queries: The DISIMA Approach. In Proceedings 2nd International Conference on Visual Information Systems, San Diego, California, December 1997.

[ORC+98] M. Ortega, Y. Rui, K. Chakrabarti, et al. Supporting Ranked Boolean Similarity Queries in MARS. IEEE Transactions on Knowledge and Data Engineering, 10(6), 1998.

[OT93] E. Oomoto and K. Tanaka. OVID: Design and Implementation of a Video-Object Database System. IEEE Transactions on Knowledge and Data Engineering, 5(4), 1993.

[RNL95] T. C. Rakow, E. J. Neuhold, and M. Löhr. Multimedia Database Systems – The Notions and the Issues. In Proceedings Datenbanksysteme in Büro, Technik und Wissenschaft (BTW), Dresden, 1995. Springer.

[SFW83] G. Salton, E. A. Fox, and H. Wu. Extended Boolean Information Retrieval. Communications of the ACM, 26(11), 1983.

[WS98] M. Wechsler and P. Schäuble. Metadata for Content-Based Retrieval of Speech. In A. Sheth and W. Klas, editors, Multimedia Data Management. McGraw-Hill, New York, 1998.