Architecture of Query Execution—Retrieving Data from Storage - Microsoft SQL Server 2008 Analysis Services

Databases Reference

In-Depth Information

First, Analysis Services use the Query Analyzer to analyze the request and divide it in two

parts: the request to the main measure group (the measure group that contains a many-to-

many dimension) and the request to the lookup measure group (the intermediate measure

group), which builds a lookup table used to resolve indirect dimensions.

For example, assume that you have a cube with two measure groups. The main measure

group contains information about sales to customers, so it has a Customer dimension and

a Sales measure that contains the amount customers have paid. To analyze sales by demo-

graphic groups, your cube has the Demographic measure group that contains the

Customers and Demographic Group dimensions. Note that the same customer can belong

to a few different demographic groups. When a user sends a query to produce sales by

demographic groups, the storage engine subsystem gets a request to the main measure

group, Sales , with a subcube that has granularity on the Demographic Group dimension.

This request comes to the Query Analyzer subsystem, which detects that it includes a

many-to-many dimension— Demographic Groups . Analysis Services breaks this request into

two parts: the request to the Sales measure group to bring data for sales of all the

customers, and the lookup request to the lookup measure group to bring data about the

demographic groups of customers.

Analysis Services creates subcubes for both the requests and two jobs: Data and Lookup

jobs. It also creates a dependency between the Data job and the Lookup job to make sure

that the Data job doesn't start until the Lookup job completes.

Analysis Services then starts the jobs. The Job Coordinator subsystem starts the Lookup

job first, and it scans segments of partitions in the intermediate measure group (as

discussed earlier in this chapter). Analysis Services collects the result in the Datacache

object and passes it to the Data job. The Datacache object contains two columns:

Customers and Demographic Group .

The Data job uses a lookup Datacache to build a table that it will use to look up a list of

all the demographic groups for each customer. Then, Analysis Services scans data of the

main measure group, and creates the Datacache objects for each partition's segment,

which has two columns: Customers and Sales . The Data job creates a new Datacache

object with two columns: Demographic Groups and Sales . For each customer in the

Datacache s, it looks up a record in the lookup Datacache , and then generates a record

with the DataID of the demographic group that corresponds to the current customer and

sales of that customer. If a customer belongs to multiple demographic groups, Analysis

Services creates a new record for each demographic group for this customer. In the next

phase, Analysis Services aggregates data from every segment into the resulting Datacache .

After all the segments produce data, the Datacache contains the result.

If your cube contains a large number of sales transactions, your measure group could

potentially contain hundreds of millions of records. However, if the number of customers

is not very large and there are just a few demographic groups, this scheme works effi-

ciently and scales well for a large number of sales. However, if the number of customers

and the number of demographic groups grows, you may encounter performance problems

because the lookup Datacache can become very large and will take a lot of memory (or

Search WWH ::

Custom Search

Home