Database Reference
In-Depth Information
DATABASe ToolS
Many people working in the SQL Server world don't understand the need for
analytic databases. Especially in organizations with either small databases or
really good SQL developers, some of the benefits are not immediately apparent.
The first benefit of analytic databases is related to performance, and this ben-
efit is typically the one that blinds some SQL developers to the other benefits.
SQL databases are really good at row-by-row work, but not as good at doing
aggregations. Summary tables have often been used to solve the issue of
aggregation, and OLAP was one approach to this.
olAP
OLAP stands for Online Analytic Processing, coined in a 1993 white paper
by Edgar F. Codd. The generic term for working with Multi-Dimensional
Analysis (MDA) is cubes, but this has been mostly used to apply to OLAP
cubes. There are subsets of OLAP: MOLAP is Multi-dimensional OLAP; and
ROLAP is Relational OLAP, in which the queries are passed through to SQL;
and Hybrid OLAP (or HOLAP) is a hybrid of the first two.
OLAP (and hence Microsoft's OLAP product, Analysis Services) provides per-
formance benefits by aggregating data according to user defined hierarchies.
For instance, the designer of the cube defines a date hierarchy as rolling up
from Day to Month to Year (YMD), and the OLAP engine creates aggregations
at each level—for example, summarizing sales for the day, for the month, and
for the year. Other hierarchies across multiple dimensions such as the organiza-
tion or location dimensions are also created and aggregated. This can lead to
a problem called database explosion (where the size of the database increases
exponentially), and thus needs a fair degree of skill to optimize.
Partitioning is another performance enhancement technique in Analysis
Services. This works by splitting the data into partitions (for instance by region
or by month), and then queries are optimized, either by reducing the size of
the data to be queried when it exists within a single partition, or by allowing
parallelization when the data is split across multiple partitions.
An alternative method for accelerating queries is to use a column store data-
base. Whereas SQL is a row store and stores data in rows, pages, and extents,
column stores instead store each column independently. This has benefits in
Search WWH ::




Custom Search