Database Reference
In-Depth Information
determining which OLTP tables you intend to use as the source for your dimension and fact tables and then
noting their current size.
SQL Server Management Studio reports are rendered using a miniature version of the Report Server
engine and do not require that SSRS be installed or configured on your machine. For information on setting up
custom reports, search the Web for “Customer Reports Management Studio.”
Note
This is a simple effective way to estimate the size of your data warehouse, but it will not work for all
occasions. For instance, you may have to pull data from sources other than SQL Server, such as a set of flat files or
an Oracle server. In those cases, you have to estimate the database size the old-fashioned way: by estimating the
size of each row in a table and multiplying it by the number of rows in that table. You can find more information
on estimating the size of a database on Microsoft's website at http://msdn.microsoft.com/en-us/library/
ms187445.aspx .
Setting the Recovery Model
After you have selected the name, owner, and size of your data warehouse, you may want to move to the options
page and change the recovery model. The recovery model controls the behavior of a database transaction log file.
Setting this correctly can increase your data warehouse loading performance.
The three settings are Full, Simple, and Bulk-Logged. Each functions slightly differently. In both full and
simple modes, the log file will record every row that you add to the table. The difference between these two
modes is that in simple mode, the transaction log file is periodically cleared automatically. When the database
is set to full mode, the log file continues to fill up until you back up the log file. In an OLTP database, it is best
practice to put the database in full mode and do regular backups on the transaction log. This makes sense, given
that the purpose of an OLTP database is to process transactions. Therefore, making sure that you have a backup of
these transactions is important. In an OLAP database, however, the focus is not on transactions but on analysis of
data. Because of this, using full mode is not necessary because you do not need to make backups of the individual
transactions.
A SQL Server transaction will occur whenever you insert, update or delete data from a table. Consequently,
you will still end up with transactional events each time you import to the data warehouse. In both simple and
full mode, if you import a million rows, the log records a million insert transactions. Therefore, regardless of
whether you use the simple or the full, the log file records each and every insert.
Using the full mode increases the size of the log files substantially because it is not automatically cleared for
you but instead relies on regular backups. Using the simple mode, which has automatic truncations, will keep the
log file size small, although it still has the extra overhead of tracking each and every insert. This brings us to the
third option: bulk-logged.
The bulk-logged option may be a more logical choice for an OLAP database. Figure 5-14 shows this selection.
With this setting, large imports of data are recorded with the minimal amount of transaction log entries,
which provides additional performance during data warehouse loading and minimal impact on the size of the
transaction log. You will, however, still need to perform regular backups on the log file to clear it out. Also, there
will still be times when bulk-logged is not the best choice, such as when you have slow-changing columns that
you could lose data from, as we discuss in a moment.
 
 
Search WWH ::




Custom Search