Conclusion - Data Warehouse Systems: Design and Implementation


data types, 3D topological operators, and spatial operations and functions

that can operate on 3D data types. After that, issues such as the aggregation

of spatial measures should be addressed. Thus, there is fertile land for research

in the field of 3D spatial data warehouses. Combining these with trajectory

data warehouses such as the ones we studied in Chap. 12 would lead to 4D

spatial data warehouses.

Another important issue in this respect is to cope with multiple

representations of spatial data, which means to allow the same real-world

object to have several geometries. Dealing with multiple representations is a

common requirement in spatial databases, in particular as a consequence of

dealing with multiple levels of spatial resolution. This is also an important

aspect in data warehouses since spatial data may be integrated from source

systems containing data at different spatial resolutions. In this book,

we have implicitly assumed that we have selected one representation from

those available. However, we may need to support multiple representations

in a multidimensional model. Again, conceptual models should be extended

to allow multiple representations of spatial data, as is the case for the

spatiotemporal model MADS [155]. However, this raises some important issues.

For example, if levels forming a hierarchy can have multiple representations,

additional conditions are necessary to establish meaningful roll-up and

drill-down operations.

15.3 Text Analytics and Text Data Warehouses

Other topics that we envision to be important in the future are text

analytics and text data warehouses. This follows clearly from statistics

that report that only 20% of corporate data are in transactional systems and

the remaining 80% are in other formats, mainly text [171, 206]. In addition,

the advent of social media has produced enormous amounts of text data, and

the tools studied in Chap. 13 have made possible the analysis of these data.

Text data warehouses can help to perform this task, as we explain next.

Automatic extraction of structured information from text has been studied

for a long time. There are two main approaches for information extraction:

the machine learning approach and the rule-based one. Most systems in both

categories were built for academic settings to be used by specialists and are,

in general, not scalable to heavy workloads.
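To make the contrast between the two approaches concrete, the following is a minimal sketch in Python, not a description of any particular system. The acquisition pattern, the function names, and the tiny labeled examples are all hypothetical and chosen only for illustration. The first part hand-codes an extraction rule as a regular expression; the second learns a multinomial naive Bayes classifier with add-one smoothing from labeled documents.

```python
import math
import re
from collections import Counter, defaultdict

# Rule-based approach: a hand-written pattern extracts structured fields.
# Hypothetical rule: "<Buyer> acquired <Target> for $<amount> million".
ACQUISITION_RULE = re.compile(
    r"(?P<buyer>[A-Z]\w+) acquired (?P<target>[A-Z]\w+) "
    r"for \$(?P<amount>\d+) million"
)

def extract_acquisitions(text):
    """Return one dict of named fields per match of the rule."""
    return [m.groupdict() for m in ACQUISITION_RULE.finditer(text)]

# Machine learning approach: a tiny multinomial naive Bayes classifier.
def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(labeled_docs):
    """Collect per-label document and word counts from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return doc_counts, word_counts, vocabulary

def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1)
                / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data, for illustration only.
TRAINING_DOCS = [
    ("the delivery was late and the package damaged", "complaint"),
    ("refund requested because the product broke", "complaint"),
    ("great service and fast delivery", "praise"),
    ("excellent product, very happy", "praise"),
]

if __name__ == "__main__":
    print(extract_acquisitions("Acme acquired Globex for $120 million."))
    model = train_naive_bayes(TRAINING_DOCS)
    print(classify(model, "the product broke and was damaged"))
```

The rule is precise but only covers the phrasing it was written for, whereas the classifier generalizes from examples at the cost of needing labeled training data.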

In the machine learning approach, techniques like the ones studied

in Chap. 9 are used. For example, automatic text classification has been

extensively addressed mainly using supervised learning techniques, where

predefined category labels are assigned to documents. Examples are the

Rocchio algorithm, k-nearest neighbor, decision trees, naïve Bayes algorithm,

neural networks, and support vector machines, among others. More
