X-WACoDa - Data Warehousing Design and Advanced Engineering Applications


approach, the authors assume that all dimensions are part of fact data and that each fact is described in a single XML document.

Rajugan et al. (2005) also propose a view-driven approach for modeling and designing an XML fact repository, named GxFact. GxFact gathers xFACTs (distributed XML warehouses and datamarts) in a global company setting. The authors also provide three design strategies for building and managing GxFact to model further hierarchical dimensions and/or global document warehouses.

Baril and Bellahsène (2003) envisage XML data warehouses as collections of views represented by XML documents. Views defined in the warehouse make it possible to filter and restructure XML sources. A warehouse is defined as a set of materialized views and provides a mediated schema that constitutes a uniform interface for querying the XML data warehouse. Following this approach, Baril and Bellahsène have developed a system named DAWAX.
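To make the idea of a view that filters and restructures an XML source more concrete, here is a minimal sketch in Python. It is not DAWAX's view language or implementation. The sample document, the element names (`orders`, `order`, `big_orders`, `sale`) and the function `materialize_view` are all hypothetical and serve only to illustrate the general notion of a materialized view stored as an XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML source; the structure is an assumption for illustration.
SOURCE = """<orders>
  <order id="1"><customer>Ann</customer><total>120</total></order>
  <order id="2"><customer>Bob</customer><total>40</total></order>
</orders>"""


def materialize_view(source_xml: str, min_total: float) -> ET.Element:
    """Build a new XML tree that keeps only some orders (filtering)
    and reshapes each kept order into a flat <sale> element (restructuring)."""
    root = ET.fromstring(source_xml)
    view = ET.Element("big_orders")
    for order in root.findall("order"):
        total = float(order.findtext("total"))
        if total >= min_total:
            # Child elements of the source become attributes in the view.
            ET.SubElement(
                view,
                "sale",
                customer=order.findtext("customer"),
                total=str(total),
            )
    return view


view = materialize_view(SOURCE, 100)
print(ET.tostring(view, encoding="unicode"))
```

In this sketch the warehouse would simply be a collection of such materialized trees, each one queryable on its own.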

Finally, Zhang et al. (2005) propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries. The authors apply a hierarchical clustering technique to merge these queries and thereby build the warehouse.
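The general idea of merging historical queries by hierarchical clustering can be sketched as follows. This is an illustrative assumption, not the algorithm of Zhang et al.: each query is modeled as a set of XML paths, similarity is measured with the Jaccard distance, and clusters are merged with single linkage until no pair is closer than a threshold. The sample paths, the threshold, and the function names are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Distance between two queries seen as sets of XML paths."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_queries(queries, threshold):
    """Agglomerative (single-linkage) clustering: repeatedly merge the two
    closest clusters until the closest pair is farther than `threshold`."""
    clusters = [[q] for q in queries]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = min(
                jaccard_distance(a, b)
                for a in clusters[i]
                for b in clusters[j]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters


def merged_pattern(cluster):
    """Union of the paths in a cluster: a candidate fragment to materialize."""
    return frozenset().union(*cluster)


# Hypothetical query log, each query reduced to the paths it touches.
queries = [
    frozenset({"/sales/item", "/sales/price"}),
    frozenset({"/sales/item", "/sales/price", "/sales/date"}),
    frozenset({"/hr/employee"}),
]
clusters = cluster_queries(queries, threshold=0.5)
for c in clusters:
    print(sorted(merged_pattern(c)))
```

Each resulting cluster yields one merged pattern, which stands in here for a candidate fragment of the warehouse to materialize.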

Multidimensional Analysis over XML Data

Though several studies from the literature address the issue of XML data warehousing, fewer actually push through the whole decision-support process and address the multidimensional analysis of XML data. To query XML cubes, Park et al. (2005) propose a multidimensional expression language, XML-MDX. The authors supplement the Microsoft multidimensional expression language, MDX, with two additional statements: CREATE XQ-CUBE to create XML cubes, and SELECT to query them. In addition, the authors define seven aggregation operators: ADD, LIST, COUNT, SUMMARY, TOPIC, TOP KEYWORD and CLUSTER. Some operators are inherited from the relational context, while others are designed for non-additive data and exploit text mining techniques.

Beyer et al. (2005) argue that analytical queries written in XQuery are difficult to read, write, and process efficiently. To address these issues, the authors propose to extend XQuery FLWOR expressions with an explicit syntax for grouping and numbering query results. They also present solutions dealing with the homogeneous and hierarchical aspect of XML data for explicit grouping problems.

In the same context, Wang et al. (2005) present concepts for XOLAP (OLAP on XML data). The authors define a general XML aggregation operator, GXaggregation. This operator permits property extraction from dimensions and measures through their XPath expressions. Hence, computing statistics over XML data becomes more flexible. This process is performed with functions that aggregate heterogeneous data over hierarchies. The authors also envisage embedding GXaggregation in an XML query language such as XQuery.

Finally, Ben Messaoud et al. (2006a) propose an OLAP aggregation operator that is based on an automatic clustering method: OpAC. The authors' proposal enables precise analyses and provides semantic aggregates for complex data represented by XML documents. OpAC has been applied to XML cubes produced by the XWarehousing approach (Boussaïd et al., 2006).

XML WAREHOUSING AND ANALYSIS METHODOLOGY

In a data warehousing process, the data integration phase is crucial. Data integration is a hard task that involves reconciliation at various levels (data models, data schemas, data instances, semantics). Nowadays, in most organizations, XML documents are becoming a common way to represent
