Database Reference
In-Depth Information
In PDW, these kinds of queries often return in minutes, and with a well-designed
schema, where the rows you're joining are stored together on the same distribution,
the query can even execute in seconds. This is because the architecture is optimized
for scans; PDW expects to scan every row in the table. Remember how we said that every
distribution has its own dedicated CPU, memory, and storage? When you submit the
query to join 1 billion rows to 50 million rows, each distribution scans only its own
portion of the data: a Sales table of roughly 13,800,000 rows and a Customer table of
roughly 695,000 rows.
These smaller volumes are much more manageable, and an idle distribution can handle
this workload with ease. The results are then sent back to the Control node across an
ultra-fast dual-InfiniBand channel, where they are consolidated and returned to the end
user. It is this divide-and-conquer strategy that allows PDW to significantly outperform
SMP systems.
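The divide-and-conquer idea can be sketched in a few lines of Python. This is only an illustration, not how PDW is implemented: the count of 72 distributions is an assumption inferred from the per-distribution figures above (it is not stated in the text), and the function names, the hash-on-customer-key scheme, and the tuple layouts are all hypothetical.

```python
from collections import defaultdict

# Assumption: 72 distributions, inferred from the per-distribution figures
# in the text (1 billion / 72 is about 13.9 million). Not stated in the text.
NUM_DISTRIBUTIONS = 72


def rows_per_distribution(total_rows, num_distributions=NUM_DISTRIBUTIONS):
    """Average number of rows each distribution scans if data is spread evenly."""
    return total_rows // num_distributions


def distribute(rows, key_index, num_distributions):
    """Assign each row to a distribution by hashing its distribution key."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[key_index]) % num_distributions].append(row)
    return buckets


def distributed_join(sales, customers, num_distributions):
    """Join sales to customers locally on each distribution, then consolidate.

    sales rows are (customer_id, amount); customer rows are (customer_id, name).
    """
    sales_parts = distribute(sales, 0, num_distributions)
    customer_parts = distribute(customers, 0, num_distributions)
    consolidated = []  # stands in for the result set gathered on the Control node
    for dist_id in range(num_distributions):
        # Both tables are hashed on customer_id, so matching rows are co-located
        # and each distribution can join using only its own data.
        names = dict(customer_parts.get(dist_id, []))
        for customer_id, amount in sales_parts.get(dist_id, []):
            if customer_id in names:
                consolidated.append((customer_id, names[customer_id], amount))
    return consolidated


# Hypothetical sample data for a quick run.
customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
sales = [(1, 10.0), (2, 5.0), (1, 2.5), (4, 9.0)]

print(rows_per_distribution(1_000_000_000))  # Sales rows per distribution
print(rows_per_distribution(50_000_000))     # Customer rows per distribution
print(sorted(distributed_join(sales, customers, 4)))
```

Because both tables are distributed on the same key, no rows need to move between distributions before the join; only the joined results travel to the consolidation step.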
Clustered Columnstore Indexes
Columnstore refers to the storing of data in a columnar, or column-oriented, storage
format. In traditional RDBMS systems, data is stored using a rowstore, or row-oriented,
storage format. Rowstores are generally well suited for transactional applications
where the application is concerned with most or all columns for one or a small number
of rows. Columnstores, on the other hand, are better suited for analytical applications,
which are generally concerned with a subset of columns and a large portion, or even
all, of the rows in the table.
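The difference between the two layouts can be illustrated with a small Python sketch. It is a toy model only: the column names and sample values are hypothetical, and real columnstore storage involves far more than plain lists.

```python
# Hypothetical sample data; the column names are illustrative only.
COLUMNS = ("sale_id", "customer_id", "amount")

# Rowstore: all the values for one row sit together.
rowstore = [
    (1, 101, 20.0),
    (2, 102, 35.5),
    (3, 101, 12.5),
]


def to_columnstore(rows, columns=COLUMNS):
    """Pivot row tuples into one list of values per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def fetch_row(rows, sale_id):
    """Transactional access pattern: one row, every column."""
    for row in rows:
        if row[0] == sale_id:
            return row
    return None


def total_amount(columnstore):
    """Analytical access pattern: every row, but only the one column needed."""
    return sum(columnstore["amount"])


# Columnstore: all the values for one column sit together.
columnstore = to_columnstore(rowstore)

print(fetch_row(rowstore, 2))
print(total_amount(columnstore))
```

In the row layout, fetching one sale returns every column in a single lookup, which suits transactional work. In the column layout, the total reads only the amount values and never touches the other two columns, which is the access pattern analytical queries tend to have.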
If you are familiar with indexing in SQL Server, you may think of a
clustered columnstore index (CCI) as analogous to a clustered index in a row-oriented
table. Unlike a clustered index, however, once a CCI has been defined, additional
indexes such as nonclustered indexes may not be created on that table.
CCIs offer several performance improvements over traditional rowstores in PDW.
Some customers have seen query performance improve by up to ten times and data
compression improve by up to seven times. Because of the compression and related
performance improvements, Microsoft recommends CCI as the standard storage format
for tables in PDW.
Tip: Columnstores are available in both PDW and SQL Server Enterprise Edition.
For more information on columnstore indexes, refer to the “Columnstore Indexes
Described” topic on MSDN, or navigate to
http://msdn.microsoft.com/en-us/library/gg492088.