• TupleCounter provides access to the number of tuples in the set. For some types of
sets, the number of tuples is known in advance or easy to calculate; for others,
Analysis Services has to materialize the set to find the number of its tuples.
• TupleRanker provides access to the ordinal position of a tuple in the set, that is,
the tuple's rank. As with TupleCounter, for some sets it's easy and fast to find
the tuple's ordinal position; for others, the set has to be materialized.
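The difference between a set that can answer these questions cheaply and one that has to be materialized can be sketched in Python. This is only an illustration of the idea, not Analysis Services code: the class names CrossJoinSet and FilterSet, their methods, and the sample members are all hypothetical, and ranks here are zero-based.

```python
from itertools import product


class CrossJoinSet:
    """Hypothetical set whose size and ranks are known without enumeration."""

    def __init__(self, left, right):
        self.left = list(left)
        self.right = list(right)

    def count(self):
        # Cheap: the size is the product of the operand sizes.
        return len(self.left) * len(self.right)

    def rank(self, tup):
        # Cheap: the position follows from the positions in the operands.
        l, r = tup
        return self.left.index(l) * len(self.right) + self.right.index(r)

    def __iter__(self):
        return iter(product(self.left, self.right))


class FilterSet:
    """Hypothetical set that must be materialized before it can be counted."""

    def __init__(self, source, predicate):
        self.source = source
        self.predicate = predicate
        self._tuples = None  # filled in on first use

    def _materialize(self):
        # Expensive: every tuple of the source has to be tested once.
        if self._tuples is None:
            self._tuples = [t for t in self.source if self.predicate(t)]
        return self._tuples

    def count(self):
        return len(self._materialize())

    def rank(self, tup):
        return self._materialize().index(tup)


years = ["2007", "2008"]
products = ["Bikes", "Helmets", "Gloves"]
cross = CrossJoinSet(years, products)
filtered = FilterSet(cross, lambda t: t[1] != "Helmets")
```

Here cross.count() needs only two list lengths, whereas filtered.count() cannot be answered until every tuple of the source has been tested against the predicate.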
Optimizing Multidimensional Space by Removing Empty Tuples
The logical space of a cube that can be addressed from an MDX query is large. It includes
combinations of all the members of all the hierarchies, regardless of whether any data for
those combinations exists. There are many scenarios in which you might like to remove a
coordinate (tuple) that would result in empty cells at the intersection of coordinates on
the other axes. To do this, you can use the NON EMPTY operator or the NonEmpty function, as
discussed in Chapter 11.
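To get a feel for how large the logical space is compared with the data that actually exists, consider a back-of-the-envelope calculation. The hierarchy names, member counts, and number of stored cells below are made-up figures chosen only for illustration.

```python
from math import prod

# Hypothetical member counts per hierarchy (illustrative numbers only).
member_counts = {"Date": 365, "Product": 1000, "Customer": 10000}

# Logical space: every combination of members is addressable from MDX.
logical_cells = prod(member_counts.values())

# Assumed number of cells for which data is actually stored.
stored_cells = 2_000_000

# Fraction of the logical space that holds data.
density = stored_cells / logical_cells

print(f"logical cells: {logical_cells:,}")
print(f"stored cells:  {stored_cells:,}")
print(f"density:       {density:.6f}")
```

With these assumed figures, far fewer than one in a thousand addressable cells contains data, which is why removing empty tuples matters.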
Analysis Services uses the same algorithm to implement both the NON EMPTY operator and
the NonEmpty function, so we'll explain both of them with an example of the NON EMPTY
operator. Analysis Services has two approaches to implementing the NON EMPTY operator.
Using the first one, the naïve approach, Analysis Services simply iterates over all the
tuples on the axis with the NON EMPTY operator, calculates the values of the cells at the
intersection of the current tuple with all combinations of tuples on the other axes, and only
then removes the tuples whose cells are all empty. In a large and sparse data set, this
algorithm will take a long time and consume a lot of memory.
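The naïve approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: the cube is modeled as a hypothetical dictionary of stored cells, there are only two axes, and cell_value stands in for full cell evaluation.

```python
# Hypothetical sparse cube: only non-empty cells are stored.
facts = {
    ("Bikes", "2007"): 120.0,
    ("Bikes", "2008"): 95.0,
    ("Gloves", "2008"): 12.5,
}

rows = ["Bikes", "Helmets", "Gloves"]  # axis carrying NON EMPTY
columns = ["2006", "2007", "2008"]     # the other axis


def cell_value(row, col):
    """Stand-in for full cell evaluation; returns None for an empty cell."""
    return facts.get((row, col))


def non_empty_naive(rows, columns):
    """Evaluate every cell, then keep rows with at least one non-empty cell."""
    kept = []
    evaluated = 0
    for row in rows:
        values = []
        for col in columns:
            values.append(cell_value(row, col))
            evaluated += 1
        if any(v is not None for v in values):
            kept.append(row)
    return kept, evaluated


kept_rows, cells_evaluated = non_empty_naive(rows, columns)
```

The number of evaluated cells is always the product of the axis sizes, regardless of how few cells actually hold data.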
The second approach, the data-bound approach, is based on the idea that the physical
space of the cube is much smaller than its logical space and that only non-empty values are
physically stored in the cube. Therefore, this algorithm first issues a request to the storage
engine subsystem to retrieve the non-null values covering the coordinate space defined by the
query, and then projects those non-null values onto the axes.
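A comparable sketch of the data-bound idea is shown below. Again, this is an illustration under assumptions rather than the actual engine logic: the storage engine is modeled as a hypothetical dictionary of stored cells, and the projection is done onto a single axis only.

```python
# Hypothetical sparse cube: only non-empty cells are stored.
facts = {
    ("Bikes", "2007"): 120.0,
    ("Bikes", "2008"): 95.0,
    ("Gloves", "2008"): 12.5,
}

rows = ["Bikes", "Helmets", "Gloves"]  # axis carrying NON EMPTY
columns = ["2006", "2007", "2008"]     # the other axis


def non_empty_data_bound(rows, columns):
    """Start from the stored records and project them onto the row axis."""
    row_set, col_set = set(rows), set(columns)

    # Step 1: a single request for the stored (non-null) records that
    # fall inside the coordinate space defined by the query.
    records = [(r, c) for (r, c) in facts if r in row_set and c in col_set]

    # Step 2: project the records onto the axis; a row survives if at
    # least one record refers to it. The original axis order is kept.
    hit = {r for r, _ in records}
    kept = [r for r in rows if r in hit]
    return kept, len(records)


kept_rows, records_touched = non_empty_data_bound(rows, columns)
```

The work here is proportional to the number of stored records in the queried space rather than to the product of the axis sizes, which is where the advantage on large, sparse data comes from.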
This approach is much faster than the first one, especially when the data set is large and
sparse. In most scenarios, Analysis Services 2008 uses the data-bound approach. Even
when there are calculations covering the space over which Analysis Services performs a
NON EMPTY operation, it will use a physical plan (as we discuss shortly) to retrieve the
necessary data first and then use the data-bound approach. There is a small set of scenarios
in which Analysis Services cannot use the data-bound approach and falls back to the naïve one.
For example, when a NON EMPTY algorithm is applied to a multidimensional space with
recursive MDX calculations or many overlapping calculations, Analysis Services might
choose to use the naïve algorithm.
Analysis Services always starts execution of NON EMPTY with the data-bound approach, and
then it analyzes the multidimensional space defined by the query and falls back to the