CHAPTER 8
Workload Management in the Data Warehouse
For the machine meant the conquest of horizontal space. It also meant a sense of that space
which few people had experienced before—the succession and superimposition of views,
the unfolding of landscape in flickering surfaces as one was carried swiftly past it, and an
exaggerated feeling of relative motion (the poplars nearby seeming to move faster than the
church spire across the field) due to parallax. The view from the train was not the view from
the horse. It compressed more motifs into the same time. Conversely, it left less time in which
to dwell on any one thing.
—Robert Hughes, The Shock of the New
INTRODUCTION
Many systems have been built for data warehousing over the last 30 years. The primary goal of a
data warehouse has evolved from providing a rear-view look at the business and its metrics for
decision making to serving as a real-time and predictive engine. This evolution has been far faster
in the last 5 years than in the 25 years before, and such a pace of growth requires changes across
the entire data processing ecosystem so that it can scale up and scale out to meet user demands and
data processing requirements. A key consideration in designing the newer architectures is
understanding what we are processing and what the system requires for that processing to complete
with acceptable performance. Beyond the initial processing, we need to design the system to sustain
that performance, remain scalable, and remain financially viable for any enterprise. The answer lies
in understanding the workloads that run on the system today and those expected in the future, and in
creating architectures that meet these requirements.
Workload-driven data warehousing is a concept that helps architects and system administrators
create a solution based on the workload of processing each data type and integrating it into the
data warehouse. It is key to understanding how to build Big Data platforms that remain independent
of the database, with zero or minimal dependency on it. This concept is the focus of this chapter;
the design and integration are discussed in Chapters 11 and 12.
Current state
Performance, throughput, scalability, and flexibility are all areas that have challenged the data
warehouse and will continue to challenge data warehousing until we understand the