Database Reference
In-Depth Information
Data types and Analysis Services
When we design data marts, we need to be aware that Analysis Services does not
treat all data types the same way. The cube will be much faster, for both processing
and querying, if we use the right data type for each column. Here, we provide a brief
table that helps us during the design of the data mart, to choose the best data type for
each column type:
Fact column type
Fastest SQL Server data types
Surrogate keys
tinyint , smallint , int , and bigint
Date key
int in the format yyyymmdd
Integer measures
tinyint , smallint , int , and bigint
Numeric measures
smallmoney , money , real , and float
(Note that decimal and vardecimal require more CPU power to
process than money and float types)
Distinct count
columns
tinyint , smallint , int , and bigint
(If your count column is char , consider either hashing or
replacing with surrogate key)
Clearly, we should always try to use the smallest data type that will be able to hold
any single value within the whole range of values needed by the application.
This is the rule for relational tables. However, you also need to
remember that the equivalent measure data type in Analysis Services
must be large enough to hold the largest aggregated value of a given
measure, not just the largest value present in a single fact table row.
Always remember that there are situations in which the rules must be overridden.
If we have a fact table containing 20 billion rows, each composed of 20 bytes and a
column that references a date, then it might be better to use a SMALLINT column for
the date if we find a suitable representation that holds all necessary values. We will
gain 2 bytes for each row, and that means a 10 percent reduction in the size of the
whole table.
Search WWH ::




Custom Search