Figure 8. Sample XQuery over the DDSM XML warehouse
In this case study, we selected and processed 1406
XML documents.
The second step in our approach is to design
a dw-model.xml document representing user
analysis requirements, with the help of the
UML2XML software. In the present case study, this
document represents a star schema composed
of Suspicious region facts (suspected cancerous
regions) characterized by the Region length
and Number of regions measures. dw-model.xml
also describes dimensions and their hierarchies.
We obtain ten dimensions: Patient, Lesion type,
Assessment, Subtlety, Pathology, Date of study,
Date of digitization, Digitizer, Scanner image
and Boundary.
In a third step, both the XML documents
representing complex medical data and
dw-model.xml are submitted to the X-warehousing
software to actually build an XML data warehouse.
X-warehousing outputs ten XML documents
representing dimensions: Patient.xml,
Lesion_type.xml, Assessment.xml, Subtlety.xml,
Pathology.xml, Date_of_study.xml,
Date_of_digitization.xml, Digitizer.xml,
Scanner_image.xml and Boundary.xml; and one XML
document containing facts: facts.xml. These
documents constitute the XML data warehouse. We
chose to store this warehouse within the X-Hive
XML-native DBMS. X-Hive allows the native storage
of large documents and supports XQuery. It also
provides APIs for storing, querying, retrieving,
transforming and publishing XML data.
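The star schema from the second step and the
document layout from the third step can be
summarized in a short Python sketch. The
StarSchema class and the warehouse_documents
helper are hypothetical names introduced here for
illustration; they are not part of UML2XML or
X-warehousing, and the sketch does not reproduce
the actual dw-model.xml format. The rule that
spaces in dimension names become underscores in
file names is an assumption inferred from the
listed file names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StarSchema:
    """Illustrative container for a star schema (not the dw-model.xml format)."""
    fact: str
    measures: tuple
    dimensions: tuple


# Fact, measure and dimension names are taken from the case study text.
DDSM_MODEL = StarSchema(
    fact="Suspicious region",
    measures=("Region length", "Number of regions"),
    dimensions=(
        "Patient", "Lesion type", "Assessment", "Subtlety", "Pathology",
        "Date of study", "Date of digitization", "Digitizer",
        "Scanner image", "Boundary",
    ),
)


def warehouse_documents(schema):
    """Return one file name per dimension, followed by facts.xml.

    Assumption: spaces in dimension names become underscores in file names.
    """
    names = [dim.replace(" ", "_") + ".xml" for dim in schema.dimensions]
    names.append("facts.xml")
    return names


if __name__ == "__main__":
    for name in warehouse_documents(DDSM_MODEL):
        print(name)
```

Under these assumptions, the helper lists eleven
documents: ten dimension documents and facts.xml.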
Finally, our analysis application exploits a set
of decision-support queries expressed in XQuery.
Figure 8 provides an example of an analytical query
that returns the total number of suspicious regions
for fifty-eight-year-old patients. It performs one
join operation between the Patient dimension and
facts, a selection and an aggregation operation.
Variable q stores the Number of regions measure
values used by the aggregation function.
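To make the three operations concrete, the
following Python sketch mimics the logic of such
a query (selection on patient age, join between
the Patient dimension and the facts, then
aggregation) over two tiny in-memory documents.
It is not the XQuery of Figure 8: the element and
attribute names (instance, attribute, fact,
patient, number_of_regions) and the sample values
are assumptions invented for this illustration,
not the actual warehouse schema or DDSM data.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of Patient.xml (structure assumed for this sketch).
PATIENT_XML = """
<dimension name="Patient">
  <instance id="p1"><attribute name="age" value="58"/></instance>
  <instance id="p2"><attribute name="age" value="63"/></instance>
  <instance id="p3"><attribute name="age" value="58"/></instance>
</dimension>
"""

# Hypothetical miniature of facts.xml (structure assumed for this sketch).
FACTS_XML = """
<facts>
  <fact patient="p1" number_of_regions="2"/>
  <fact patient="p2" number_of_regions="5"/>
  <fact patient="p3" number_of_regions="1"/>
  <fact patient="p1" number_of_regions="3"/>
</facts>
"""


def total_regions_for_age(patient_xml, facts_xml, age):
    """Sum the Number of regions measure over facts of patients with the given age."""
    patients = ET.fromstring(patient_xml.strip())
    facts = ET.fromstring(facts_xml.strip())

    # Selection: keep the identifiers of patients whose age matches.
    selected_ids = {
        inst.get("id")
        for inst in patients.findall("instance")
        if any(
            attr.get("name") == "age" and attr.get("value") == str(age)
            for attr in inst.findall("attribute")
        )
    }

    # Join: keep facts that reference a selected patient; q plays the role
    # of the variable holding the measure values.
    q = [
        int(fact.get("number_of_regions"))
        for fact in facts.findall("fact")
        if fact.get("patient") in selected_ids
    ]

    # Aggregation: total number of suspicious regions.
    return sum(q)


if __name__ == "__main__":
    print(total_regions_for_age(PATIENT_XML, FACTS_XML, 58))
```

With the sample data above, patients p1 and p3
are selected, so the call for age 58 sums the
values 2, 3 and 1.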
CONCLUSION AND PERSPECTIVES
The data processed by decision-support systems
(DSSs) are increasingly complex and pose new
challenges to the data warehousing community. To
efficiently manage and analyze complex data, we
propose a full, generic, XML-based data
warehousing and on-line analysis approach:
X-WACoDa. This approach includes complex data
integration, multidimensional modeling and
analysis. In this paper, we identified
substantial heterogeneity among the XML warehouse
models from the literature, and thus focused on
proposing a unified reference XML data warehouse
architecture. We also presented a software
platform, also named X-WACoDa, which implements
our ideas.
Research perspectives in the young field of
XML warehousing are numerous. Regarding
complex data integration, we aim at extracting
useful knowledge for warehousing from the data
themselves, by applying data mining techniques.
We plan to study a metadata representation of
data mining results in mixed structures combining
XML Schema and the Resource Description
Framework (RDF). These description languages
are indeed well suited for expressing semantic
properties and relationships between metadata.