Conclusion - Data Warehouse Systems: Design and Implementation


data warehouses must be designed in a way similar to traditional data

warehouses. Possible dimensions for image and video data can be the size of

the image or video, the width and height of the frames, the creation date,

and so on. Many of these dimensions also apply to other kinds of multimedia

data.

The main problem in multimedia data warehouses is their high dimensionality.

This is because multimedia objects like images are

represented in a database by descriptors, which can be of two types:

content-based (or feature) descriptors and description-based (or textual) descriptors.

The former represent the intrinsic content of data (like color, texture, or

shape). The latter represent alphanumeric data like acquisition date, author,

topic, and so on. Most content-based descriptors are set-oriented rather

than single-valued. As a consequence, we may need, for example, to define

each distinct color as a separate dimension. Given this high-dimensional

scenario, the main challenge is to be able to perform multimedia

analysis in reasonable execution time.
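
To make the dimensionality problem concrete, the following is a minimal sketch, assuming a toy collection in which each image carries a set-valued color descriptor. The image identifiers, the color names, and the helper to_dimensions are hypothetical and serve only as illustration. The sketch shows how every distinct value of a set-valued descriptor becomes its own 0/1 dimension.

```python
# Hypothetical data: each image has a set-valued "colors" descriptor.
images = {
    "img1": {"red", "blue"},
    "img2": {"green"},
    "img3": {"red", "green", "yellow"},
}


def to_dimensions(descriptors):
    """Turn a set-valued descriptor into one 0/1 dimension per distinct value."""
    # Every distinct value seen in any image becomes a dimension.
    all_values = sorted(set().union(*descriptors.values()))
    rows = {
        image_id: [1 if value in values else 0 for value in all_values]
        for image_id, values in descriptors.items()
    }
    return all_values, rows


dimensions, rows = to_dimensions(images)
print(dimensions)       # one dimension per distinct color
print(rows["img1"])     # 0/1 flags aligned with the dimension list
print(len(dimensions))  # grows with the number of distinct colors
```

With only three images the descriptor already yields four dimensions. A realistic color vocabulary would produce far more, which is why analysis time becomes the main concern.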

Image OLAP aims at supporting multidimensional on-line analysis of

image data. An example of the efforts in this field is the work by Jin et

al. [97], who proposed Visual Cube to perform multidimensional OLAP on

image collections such as web images indexed by search engines, product

images (e.g., from online shops), and photos shared on social networks. Visual

Cube defines two kinds of dimensions: metainformation dimensions such as

date, title, file name, owner, URL, tag, description, and GPS location, and

visual dimensions (based on image visual features) such as image size, major

colors, face dimension (indicating the existence of faces), and a color/texture

histogram. To address the dimensionality problem mentioned above, the

authors propose two kinds of schemes, namely, a multiple-dimension scheme

(MDS) and a single-dimension scheme (SDS). In an MDS representation, each

possible value of a feature is considered a dimension. For example, Sunny can

be a dimension. Each record corresponding to an image of a sunny day will

contain a '1' in this dimension. On the other hand, in an SDS representation,

the many possible features will be replaced by a single dimension denoted Tag. Thus,

an image of a sunny day will contain the value sunny on the Tag dimension.
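
The difference between the two schemes can be sketched as follows. This is an illustrative reading of the MDS and SDS layouts, not the authors' implementation: the image identifiers, the tags, the field names image_ids and count, and the helpers build_mds and build_sds are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical image collection: image id -> tags describing the image.
image_tags = {1: ["sunny", "beach"], 2: ["sunny"], 3: ["snow"]}


def build_mds(image_tags):
    """MDS sketch: one record per image, one 0/1 dimension per tag value."""
    tags = sorted({tag for tag_list in image_tags.values() for tag in tag_list})
    return {
        image_id: {tag: int(tag in tag_list) for tag in tags}
        for image_id, tag_list in image_tags.items()
    }


def build_sds(image_tags):
    """SDS sketch: a single Tag dimension, with the matching image
    identifiers (set-valued) and their number (single-valued)."""
    ids_by_tag = defaultdict(set)
    for image_id, tag_list in image_tags.items():
        for tag in tag_list:
            ids_by_tag[tag].add(image_id)
    return [
        {"Tag": tag, "image_ids": ids, "count": len(ids)}
        for tag, ids in sorted(ids_by_tag.items())
    ]


mds = build_mds(image_tags)
sds = build_sds(image_tags)
print(mds[1])  # image 1 has a 1 in the "sunny" and "beach" dimensions
print(sds)     # one row per tag value instead of one column per tag value
```

In the MDS layout the number of columns grows with the tag vocabulary, whereas in the SDS layout the schema stays fixed and new tag values only add rows.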

In addition, a set-valued attribute will contain the identifiers of the images

with that feature, and a single-valued attribute will contain the total number

of such identifiers. The measures in Visual Cube can be a representative

image in a cluster or the number of elements in such a cluster. Clusters

are computed using techniques like the ones studied in Chap. 9. Records

in a cluster have a combination of descriptors corresponding to the cube

dimensions. In this way, OLAP operations can be performed. For example,

drill-down can be performed by clicking on an image to find others in the

cluster. Open questions include, for example, the efficient evaluation of top-k queries

(given a query cell, find the top-k similar cells measured by the similarity
