data types, 3D topological operators, and spatial operations and functions
that can operate on 3D data types. After that, issues such as the aggregation
of spatial measures should be addressed. Thus, there is fertile ground for research
in the field of 3D spatial data warehouses. Combining these with trajectory
data warehouses such as the ones we studied in Chap. 12 would lead to 4D
spatial data warehouses.
Another important issue in this respect is to cope with multiple
representations of spatial data, which means to allow the same real-world
object to have several geometries. Dealing with multiple representations is a
common requirement in spatial databases, in particular as a consequence of
dealing with multiple levels of spatial resolution. This is also an important
aspect in data warehouses since spatial data may be integrated from source
systems containing data at different spatial resolutions. In this book,
we have implicitly assumed that we have selected one representation from
those available. However, we may need to support multiple representations
in a multidimensional model. Again, conceptual models should be extended
to allow multiple representations of spatial data, as is the case for the
spatiotemporal model MADS [155]. However, this raises some important issues.
For example, if levels forming a hierarchy can have multiple representations,
additional conditions are necessary to establish meaningful roll-up and drill-
down operations.
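
As a minimal sketch of the idea of multiple representations, the following Python fragment lets one real-world object carry several geometries, keyed by spatial resolution. The class name SpatialMember, the resolution labels, and the WKT strings are hypothetical and chosen for illustration only; they are not taken from MADS or from any particular system, and the sketch does not attempt to state the additional roll-up conditions mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SpatialMember:
    """A real-world object that may have several geometries (hypothetical)."""
    name: str
    # Maps an assumed resolution label (e.g. "1:1M") to a geometry in WKT.
    geometries: Dict[str, str] = field(default_factory=dict)

    def add_representation(self, resolution: str, wkt: str) -> None:
        """Store (or overwrite) the geometry used at the given resolution."""
        self.geometries[resolution] = wkt

    def geometry_at(self, resolution: str) -> str:
        """Return the geometry stored for the requested resolution."""
        if resolution not in self.geometries:
            raise KeyError(f"{self.name} has no geometry at {resolution}")
        return self.geometries[resolution]


# The same city is a point at a coarse resolution and a polygon at a fine one.
city = SpatialMember("Brussels")
city.add_representation("1:1M", "POINT (4.35 50.85)")
city.add_representation(
    "1:25k",
    "POLYGON ((4.3 50.8, 4.4 50.8, 4.4 50.9, 4.3 50.9, 4.3 50.8))",
)
```

A query would then pick one representation explicitly, for instance city.geometry_at("1:25k"), instead of assuming that a single geometry exists.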
15.3 Text Analytics and Text Data Warehouses
Other topics that we expect to become important in the future are text
analytics and text data warehouses. This follows from statistics
reporting that only 20% of corporate data are in transactional systems and
the remaining 80% are in other formats, mainly text [171, 206]. In addition,
the advent of social media has produced enormous amounts of text data, and
the tools studied in Chap. 13 have made possible the analysis of these data.
Text data warehouses can help to perform this task, as we explain next.
Automatic extraction of structured information from text has been studied
for a long time. There are two main approaches for information extraction:
the machine learning approach and the rule-based one. Most systems in both
categories were built for academic settings to be used by specialists and are,
in general, not scalable to heavy workloads.
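
To make the rule-based approach concrete, here is a minimal sketch in Python in which hand-written regular expressions play the role of extraction rules. The rule names, the patterns, and the sample sentence are hypothetical; actual rule-based systems typically use much richer rule languages than this.

```python
import re
from typing import Dict, List

# Each rule pairs a field name with a hand-written pattern (both hypothetical).
RULES = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "amount": re.compile(r"\b\d+(?:\.\d+)? (?:EUR|USD)\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def extract(text: str) -> Dict[str, List[str]]:
    """Apply every rule to the text and collect the matches per field."""
    return {name: pattern.findall(text) for name, pattern in RULES.items()}


# Turn one free-text sentence into a structured record.
record = extract(
    "On 2013-05-17 the customer (ann@example.com) paid 120.50 EUR for order 42."
)
```

The resulting dictionary has one list of matches per rule, which is the kind of structured output that could then be loaded into a warehouse table.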
In the machine learning approach, techniques like the ones studied
in Chap. 9 are used. For example, automatic text classification has been
extensively addressed, mainly using supervised learning techniques, where
predefined category labels are assigned to documents. Examples are the
Rocchio algorithm, k-nearest neighbor, decision trees, the naïve Bayes algorithm,
neural networks, and support vector machines, among others. More