approach, the authors assume that all dimensions
are part of fact data and that each fact is described
in a single XML document.
Rajugan et al. (2005) also propose a
view-driven approach for modeling and designing an
XML fact repository, named GxFact. GxFact
gathers xFACTs (distributed XML warehouses
and datamarts) in a global company setting. The
authors also provide three design strategies for
building and managing GxFact to model further
hierarchical dimensions and/or global document
warehouses.
Baril and Bellahsène (2003) envisage XML
data warehouses as collections of views
represented by XML documents. Views, defined in
the warehouse, make it possible to filter and restructure
XML sources. A warehouse is defined as a set
of materialized views and provides a mediated
schema that constitutes a uniform interface for
querying the XML data warehouse. Following this
approach, Baril and Bellahsène have developed
a system named DAWAX.
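To make the idea of a warehouse as a set of materialized views more concrete, the following Python sketch filters an XML source and restructures the selected elements into a new XML document. It is only an illustration under simple assumptions and does not reproduce DAWAX: the `materialize_view` helper, the sample `orders` source, and all element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML source; element and attribute names are assumptions.
SOURCE = """
<orders>
  <order id="1" country="FR"><item>book</item><total>30</total></order>
  <order id="2" country="US"><item>pen</item><total>5</total></order>
  <order id="3" country="FR"><item>lamp</item><total>45</total></order>
</orders>
"""


def materialize_view(source_root, select_path, keep, view_name):
    """Filter elements matching select_path, keep only the listed child
    elements, and return the result as a new XML tree (the stored view)."""
    view = ET.Element(view_name)
    for elem in source_root.findall(select_path):
        row = ET.SubElement(view, elem.tag, {"id": elem.get("id")})
        for child_tag in keep:
            child = elem.find(child_tag)
            if child is not None:
                ET.SubElement(row, child_tag).text = child.text
    return view


source = ET.fromstring(SOURCE)

# Filter: French orders only. Restructure: keep only the <total> child.
french_orders = materialize_view(
    source, "./order[@country='FR']", ["total"], "french_orders"
)

# In this sketch the "warehouse" is just a named collection of such views.
warehouse = {"french_orders": french_orders}
```

Queries would then be answered against the stored view documents rather than against the original sources.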
Finally, Zhang et al. (2005) propose an
approach to materialize XML data warehouses
based on frequent query patterns discovered from
historical queries. The authors apply a hierarchical
clustering technique to merge these queries and
thereby build the warehouse.
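The merging step can be pictured with a small agglomerative clustering sketch in Python. It assumes, purely for illustration, that each frequent query pattern is reduced to the set of paths it touches and that patterns are compared with Jaccard similarity; the pattern representation and similarity measure actually used by Zhang et al. are not described here, and `merge_query_patterns` and the sample paths are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of paths."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_query_patterns(patterns, threshold):
    """Bottom-up (agglomerative) clustering: repeatedly merge the two most
    similar clusters until no pair reaches the similarity threshold."""
    clusters = [frozenset(p) for p in patterns]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = jaccard(clusters[i], clusters[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        if sim < threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical frequent query patterns, each given as the paths it accesses.
patterns = [
    {"/sales/sale/product", "/sales/sale/amount"},
    {"/sales/sale/product", "/sales/sale/amount", "/sales/sale/date"},
    {"/customers/customer/name"},
    {"/customers/customer/name", "/customers/customer/city"},
]

views = merge_query_patterns(patterns, threshold=0.5)
```

With a threshold of 0.5, the four sample patterns collapse into two merged patterns, one over the sales paths and one over the customer paths, each of which could then serve as a candidate view to materialize.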
Multidimensional Analysis over XML Data
Though several studies from the literature address
the issue of XML data warehousing, fewer
actually push through the whole decision-support
process and address the multidimensional analysis
of XML data. To query XML cubes, Park et al.
(2005) propose a multidimensional expression
language, XML-MDX. The authors supplement
the Microsoft multidimensional expression
language, MDX, with two additional statements:
CREATE XQ-CUBE to create XML cubes, and
SELECT to query them. In addition, the authors
define seven aggregation operators: ADD, LIST,
COUNT, SUMMARY, TOPIC, TOP KEYWORD,
and CLUSTER. Some operators are inherited from
the relational context, while others are designed
for non-additive data and exploit text mining
techniques.
Beyer et al. (2005) argue that analytical
queries written in XQuery are difficult to read,
write, and process efficiently. To address these
issues, the authors propose to extend XQuery
FLWOR expressions with an explicit syntax for
grouping and numbering query results. They also
present solutions dealing with the homogeneous
and hierarchical aspect of XML data for explicit
grouping problems.
In the same context, Wang et al. (2005)
present concepts for XOLAP (OLAP on XML data).
The authors define a general XML aggregation
operator, GXaggregation. This operator permits
property extraction from dimensions and measures
through their XPath expressions. Hence, computing
statistics over XML data becomes more flexible.
This process is performed with functions that
aggregate heterogeneous data over hierarchies. The
authors also envisage embedding GXaggregation in
an XML query language such as XQuery.
Finally, Ben Messaoud et al. (2006a) propose
an OLAP aggregation operator that is based on an
automatic clustering method: OpAC. The authors'
proposal enables precise analyses and provides
semantic aggregates for complex data represented
by XML documents. OpAC has been applied to
XML cubes output by the XWarehousing approach
(Boussaïd et al., 2006).
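As a rough illustration of what aggregating over XML facts involves, the following Python sketch groups fact elements by a dimension value and applies an aggregate function to a measure, with both located through simple path expressions. It is not XML-MDX, GXaggregation, or OpAC: the `aggregate` helper, the sample cube, and all element names are hypothetical, and the standard library's ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical XML cube; the last fact deliberately lacks a measure.
CUBE = """
<cube>
  <fact><location><city>Lyon</city></location><sales>10</sales></fact>
  <fact><location><city>Paris</city></location><sales>25</sales></fact>
  <fact><location><city>Lyon</city></location><sales>5.5</sales></fact>
  <fact><location><city>Paris</city></location></fact>
</cube>
"""


def aggregate(root, fact_path, dimension_path, measure_path, operator):
    """Group facts by the text found at dimension_path and apply `operator`
    to the list of numeric values found at measure_path."""
    groups = defaultdict(list)
    for fact in root.findall(fact_path):
        key = fact.findtext(dimension_path)
        value = fact.findtext(measure_path)
        if key is None or value is None:
            continue  # skip facts missing the dimension or the measure
        groups[key].append(float(value))
    return {key: operator(values) for key, values in groups.items()}


root = ET.fromstring(CUBE)

# Additive aggregate (sum) and two non-numeric-result aggregates.
totals = aggregate(root, "./fact", "location/city", "sales", sum)
counts = aggregate(root, "./fact", "location/city", "sales", len)
listed = aggregate(root, "./fact", "location/city", "sales", list)
```

Because the dimension and the measure are passed as paths rather than fixed column names, the same helper can be pointed at differently nested documents, which is the kind of flexibility that path-based aggregation is meant to provide.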
XML WAREHOUSING AND ANALYSIS METHODOLOGY
In a data warehousing process, the data integration
phase is crucial. Data integration is a hard task
that involves reconciliation at various levels (data
models, data schemas, data instances, semantics).
Nowadays, in most organizations, XML documents
are becoming a common way to represent