Database Reference
In-Depth Information
Most applications focus on the analysis of data produced by objects like
customers, suppliers, and so on, assuming that these objects are static ,inthe
sense that their position in space and time is not relevant for the application
at hand. Nevertheless, many applications require the analysis of data about
moving objects , that is, objects that change their position in space and
time. The possibilities and interest of mobility data analysis have expanded
dramatically with the availability of positioning devices. Trac data, for
example, can be captured as a collection of sequences of positioning signals
transmitted by the cars' GPS along their itineraries. Although such sequences
can be very long, they are often processed by being divided in segments of
movement called trajectories , which are the unit of interest in the analysis
of movement data. Extending data warehouses to cope with trajectory data
leads to the notion of trajectory data warehouses . These are studied in
Chap. 12 .
1.3 New Domains and Challenges
Nowadays, the availability of enormous amounts of data is calling for a shift in
the way data warehouse and business intelligence practices have been carried
out since the 1990s. It is becoming clear that for certain kinds of business
intelligence applications, the traditional approach, where day-to-day business
data produced in an organization are collected in a huge common repository
for data analysis, needs to be revised, to account for eciently handling large-
scale data. In many emerging domains where business intelligence practices
are gaining acceptance, such as social networks or geospatial data analytics,
massive-scale data sources are becoming common, posing new challenges
to the data warehouse research community. In addition, new database
architectures are gaining momentum. Parallelism is becoming a must for large
data warehouse processing. Column-store database systems (like MonetDB
and Vertica) and in-memory database systems (like SAP HANA) are strong
candidates for data warehouse architectures since they deliver much better
performance than classic row-oriented databases for fact tables with a large
number of attributes. The MapReduce programming model is also becoming
increasingly popular, challenging traditional parallel database management
systems architectures. Even though at the time of writing this topic it is
still not clear if this approach can be applied to all kinds of data warehouse
and business intelligence applications, many large data warehouses have been
built based on this model. As an example, the Facebook data warehouse
was built using Hadoop (an open-source implementation of MapReduce).
Chapter 13 discusses these new data warehousing challenges.
We already commented that the typical method of loading data into a
data warehouse is through an ETL process. This process pulls data from
source systems periodically (e.g., daily, weekly, or monthly), obtaining a
Search WWH ::




Custom Search