Architecture of Query Execution—Calculating MDX Expressions - Microsoft SQL Server 2008 Analysis Services

• TupleCounter provides access to the number of tuples in the set. For some types of

sets, the number of tuples is known in advance or easy to calculate; for others,

Analysis Services has to materialize the set to find the number of its tuples.

• TupleRanker provides access to the ordinal position of the tuple in the set, or the

tuple's rank. Similar to TupleCounter, for some sets it's easy and fast to find out

the tuple's ordinal position; for others, the set has to be materialized.
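The difference between a count or rank that is easy to calculate and one that requires materialization can be sketched as follows. This is only an illustration, not the internal implementation of Analysis Services: the class names CrossJoinSet and FilteredSet and their methods are hypothetical, and the choice of a cross join as the "easy" case and a filter as the "materialize first" case is our assumption.

```python
class CrossJoinSet:
    """Hypothetical set: the cross join of several member lists."""

    def __init__(self, *member_lists):
        self.member_lists = [list(members) for members in member_lists]

    def count(self):
        # Known without enumerating tuples: the product of the input sizes.
        total = 1
        for members in self.member_lists:
            total *= len(members)
        return total

    def rank(self, tup):
        # 0-based ordinal position computed arithmetically (mixed radix),
        # with the last member list varying fastest.
        position = 0
        for members, member in zip(self.member_lists, tup):
            position = position * len(members) + members.index(member)
        return position


class FilteredSet:
    """Hypothetical set: the tuples of a source set that pass a predicate."""

    def __init__(self, source_tuples, predicate):
        self.source_tuples = source_tuples
        self.predicate = predicate
        self._materialized = None

    def _materialize(self):
        # The result size is unknown until every source tuple is tested.
        if self._materialized is None:
            self._materialized = [
                t for t in self.source_tuples if self.predicate(t)
            ]
        return self._materialized

    def count(self):
        return len(self._materialize())

    def rank(self, tup):
        return self._materialize().index(tup)


years = ["2007", "2008"]
products = ["Bikes", "Cars", "Trucks"]
cross = CrossJoinSet(years, products)
print(cross.count(), cross.rank(("2008", "Cars")))

all_tuples = [(y, p) for y in years for p in products]
filtered = FilteredSet(all_tuples, lambda t: t[1] != "Cars")
print(filtered.count(), filtered.rank(("2008", "Bikes")))
```

In this sketch the cross join answers both questions from the sizes of its inputs, whereas the filtered set has to build its full tuple list before it can answer either one.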

Optimizing Multidimensional Space by Removing Empty Tuples

The logical space of a cube that can be addressed from an MDX query is large. It includes

combinations of all the members of all the hierarchies, regardless of whether any data for

those combinations exists. There are many scenarios in which you might like to remove a

coordinate (tuple) that would result in empty cells at the intersection of coordinates on

the other axes. To do this, you can use the NON EMPTY operator or the NonEmpty function, as

discussed in Chapter 11.

Analysis Services uses the same algorithm to implement both the NON EMPTY operator and

the NonEmpty function, so we'll explain both of them with an example of the NON EMPTY

operator. Analysis Services has two approaches to implementing the NON EMPTY operator.

Using the first one, the naïve approach, Analysis Services simply iterates over all the

tuples on the axis with the NON EMPTY operator, calculates the values of the cells at the

intersection of the current tuple with all combinations of tuples on the other axes, and only

then removes the tuples whose cells are all empty. In a large and sparse data set, this

algorithm will take a long time and consume a lot of memory.
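The naïve approach can be sketched in a few lines. This is a simplified illustration under our own assumptions, not the Analysis Services code: the function names non_empty_naive and cell_value are hypothetical, the cube is reduced to a dictionary of stored values, and there is only one other axis.

```python
def cell_value(facts, row, column):
    """Hypothetical cell evaluator: returns None for an empty cell."""
    return facts.get((row, column))


def non_empty_naive(row_tuples, column_tuples, facts):
    """Keep the row tuples that have at least one non-empty cell.

    Every cell in the logical space rows x columns is evaluated,
    whether or not any data exists for it.
    """
    kept = []
    evaluated = 0
    for row in row_tuples:
        values = []
        for column in column_tuples:
            values.append(cell_value(facts, row, column))
            evaluated += 1
        # Only after all cells for this tuple are calculated do we decide.
        if any(value is not None for value in values):
            kept.append(row)
    return kept, evaluated


products = ["Bikes", "Cars", "Trucks", "Boats"]
years = ["2007", "2008"]
facts = {("Bikes", "2007"): 10.0, ("Trucks", "2008"): 4.5}

kept, evaluated = non_empty_naive(products, years, facts)
print(kept, evaluated)
```

The number of evaluated cells is always the size of the logical space (here 4 x 2), no matter how few values are actually stored, which is why the cost grows quickly on large, sparse data.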

The second approach, the data-bound approach, is based on the idea that the physical

space of the cube is much smaller than its logical space and only non-empty values are

physically stored in the cube. Therefore, this algorithm first issues a request to the storage

engine subsystem to retrieve the non-null values covering the coordinate space defined by the

query, and then projects those non-null values onto the axes.
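The data-bound idea can be sketched the same way. Again, this is a minimal illustration under our own assumptions rather than the actual implementation: the function name non_empty_data_bound is hypothetical, and a dictionary of stored values stands in for the storage engine.

```python
def non_empty_data_bound(row_tuples, column_tuples, facts):
    """Keep the row tuples that have at least one stored, non-null value.

    Only the physically stored records are visited; the logical space
    rows x columns is never enumerated.
    """
    rows_in_query = set(row_tuples)
    columns_in_query = set(column_tuples)

    # "Storage engine request": collect the non-null records that fall
    # inside the coordinate space defined by the query.
    non_null_rows = set()
    scanned = 0
    for (row, column), value in facts.items():
        scanned += 1
        if (
            value is not None
            and row in rows_in_query
            and column in columns_in_query
        ):
            non_null_rows.add(row)

    # Project the non-null records onto the axis, preserving axis order.
    kept = [row for row in row_tuples if row in non_null_rows]
    return kept, scanned


products = ["Bikes", "Cars", "Trucks", "Boats"]
years = ["2007", "2008"]
facts = {
    ("Bikes", "2007"): 10.0,
    ("Trucks", "2008"): 4.5,
    ("Planes", "2008"): 1.0,  # stored, but outside the queried space
}

kept, scanned = non_empty_data_bound(products, years, facts)
print(kept, scanned)
```

Here the work is proportional to the number of stored records rather than to the number of tuple combinations on the axes, so the sparser the data, the larger the saving.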

This approach is much faster than the first one, especially when the data set is large and

sparse. In most scenarios, Analysis Services 2008 uses the data-bound approach. Even

when there are calculations covering the space over which Analysis Services performs a

NON EMPTY operation, it will use a physical plan (as we discuss shortly) to retrieve the

necessary data first and then use the data-bound approach. There is a small set of scenarios

in which Analysis Services cannot use the data-bound approach and falls back to the naïve one.

For example, when a NON EMPTY algorithm is applied to a multidimensional space with

a recursive MDX calculation or many overlapping calculations, Analysis Services might

choose to use the naïve algorithm.

Analysis Services always starts execution of NON EMPTY with the data-bound approach, and

then it analyzes the multidimensional space defined by the query and falls back to the
