Designing the Data Warehouse for Analysis Services - Expert Cube Development with SSAS Multidimensional Models

Database Reference

In-Depth Information

Data types and Analysis Services

When we design data marts, we need to be aware that Analysis Services does not

treat all data types the same way. The cube will be much faster, for both processing

and querying, if we use the right data type for each column. Here, we provide a brief

table that helps us during the design of the data mart, to choose the best data type for

each column type:

Fact column type

Fastest SQL Server data types

Surrogate keys

tinyint , smallint , int , and bigint

Date key

int in the format yyyymmdd

Integer measures

tinyint , smallint , int , and bigint

Numeric measures

smallmoney , money , real , and float

(Note that decimal and vardecimal require more CPU power to

process than money and float types)

Distinct count

columns

tinyint , smallint , int , and bigint

(If your count column is char , consider either hashing or

replacing with surrogate key)

Clearly, we should always try to use the smallest data type that will be able to hold

any single value within the whole range of values needed by the application.

This is the rule for relational tables. However, you also need to

remember that the equivalent measure data type in Analysis Services

must be large enough to hold the largest aggregated value of a given

measure, not just the largest value present in a single fact table row.

Always remember that there are situations in which the rules must be overridden.

If we have a fact table containing 20 billion rows, each composed of 20 bytes and a

column that references a date, then it might be better to use a SMALLINT column for

the date if we find a suitable representation that holds all necessary values. We will

gain 2 bytes for each row, and that means a 10 percent reduction in the size of the

whole table.

Search WWH ::

Custom Search

Home