Database Reference
In-Depth Information
Chapter 34
Introduction to Columnstore Indexes
Columnstore indexes are Enterprise Edition feature, introduced in SQL Server 2012. They are part of the new family
of technologies called xVelocity , which optimizes the performance of analytic queries that scan and aggregate large
amounts of data.
Columnstore indexes use a different storage format for data, storing compressed data on a per-column rather
than a per-row basis. This storage format benefits query processing in a Data Warehousing environment where,
although queries typically read a very large number of rows, they work with just a subset of the columns from a table.
Design and implementation of Data Warehouse systems is a very complex process that is not covered in this
book. This chapter, however, will reference common database design patterns frequently encountered in such
systems. Moreover, it will provide an overview of columnstore indexes and their storage format, discuss batch-mode
data processing, and outline several tips that can improve the performance of Data Warehouse solutions.
Data Warehouse Systems Overview
Data Warehouse systems provide the data that is used for analysis, reporting, and decision support purposes. In
contrast to OLTP (Online Transactional Processing) systems, which are designed to support operational activity and
which process simple queries in short transactions, Data Warehouse systems handle complex queries that usually
perform aggregations and process large amounts of data.
For example, consider a company that sells articles to customers. A typical OLTP query from the company's
Point-of-Sale (POS) system might have the following semantic: Provide a list of orders that were placed by this
particular customer this month . Alternatively, a typical query in a Data Warehouse system might read as follows:
Provide the total amount of sales year to date, grouping the results by article categories and customer regions .
There are other differences between Data Warehouse and OLTP systems. Data in OLTP systems is usually volatile.
Such systems serve a large number of requests simultaneously, and they often have a performance SLA associated
with the customer-facing queries. Alternatively, the data in Data Warehouse systems is relatively static and is often
updated based on a set schedule, such as at nights or during weekends. Those systems usually serve a small number
of customers, typically business analysts, managers, and executives who can accept the longer execution time of the
queries due to the amount of data that needs to be processed.
To put things into perspective, the response time of the short OLTP queries usually needs to be in the
milliseconds range. However, for complex Data Warehouse queries, a response time in seconds or even minutes is
often acceptable.
Note
 
 
Search WWH ::




Custom Search