Database Reference
In-Depth Information
The radical reduction in the number of reads required to retrieve the data and the marginal increase in speed are
all the result of being able to reference information that is indexed by column instead of by row. The foregoing query
ran in what's called row mode . A columnstore index has two operations: batch mode and row mode. Of the two, batch
mode is much faster. It takes advantage of a special method of accessing the data that allows for pulling large sets
of data, in batches, so that decompression isn't necessary. The batch mode processing also has extra benefits when
working with multiprocessor machines and modern processors with big caches. To see whether a columnstore ran in
batch or row mode, you just have to check the properties of the columnstore operation. For example, in the query in
question, Figure 9-16 shows the property sheet from the columnstore scan operator.
Figure 9-16. Actual execution mode
There is a limited set of operations, documented in Books Online, that result in batch mode processing, but when
working with those operations on a system with enough processors, you will see yet another substantial performance
enhancement.
Columnstore indexes don't require you to have the columns in a particular order, unlike clustered and nonclustered
indexes. Also, unlike these other indexes, you should place multiple columns within a columnstore index so that you get
benefits across those columns. Put another way, if you anticipate that you'll need to query the column at some point, add
it proactively to the columnstore index definition. But if you're retrieving large numbers of columns from a columnstore
index, you might see some performance degradation.
You can see a clustered columnstore index in action too. Since you can't have constraints, you'll simply re-create
the table with this script:
SELECT *
INTO dbo.TransactionHistoryArchive
FROM Production.TransactionHistoryArchive;
CREATE INDEX ClusteredColumnstoreTest
ON dbo.TransactionHistoryArchive
(TransactionID);
From there, it's possible to create a clustered columnstore index, effectively converting the table from a
traditional clustered table to a columnstore table because, like with regular clustered indexes, the clustered
columnstore index stores the data. Here's the syntax:
CREATE CLUSTERED COLUMNSTORE INDEX ClusteredColumnstoreTest
ON dbo.TransactionHistoryArchive
WITH (DROP_EXISTING = ON);
This results in a table that is now stored as a clustered columnstore. If you modify the aggregate query from
earlier to run against this new table and you point your queries at this table, you'll see the same execution plan and
performance as you saw against the nonclustered columnstore index. But, this new structure can be updated, unlike
the index you created previously.
While the limitations for the use of the columnstore index are somewhat stringent, especially with the
nonclustered columnstore index, the benefits for structures that can operate within those limitations are clear.
Be sure to clean up any of the remaining objects.
DROP TABLE dbo.TransactionHistoryArchive;
DROP INDEX Production.TransactionHistoryArchive.ix_csTest;
 
Search WWH ::




Custom Search