Database Reference
In-Depth Information
With this information, we can see that PDW handles the data file in the
primary filegroup differently from all other files and filegroups. In short,
the primary always grows irrespective of the CREATE DATABASE setting for
AUTOGROW . This makes sense; it's not something we've actually specified in
the DDL, and the space consumed is tiny. We certainly wouldn't want a DDL
request such as CREATE TABLE to fail for lack of storage. I have included
this here just for completeness of information.
More important is that the other database files and log files are handled
differently. Furthermore, these allocations are fixed by the release and do
not change (at the time of this writing) based on size of the database; that is,
itdoesnotmatterwhetheryourdatabase is1TBor1PB,the AUTOGROW value
is the same. It also doesn't matter whether the filegroup is for distributed or
replicated tables, the autogrow works in exactly the same way.
Finally, it should also be clear that PDW is not simply offering up SQL
Server functionality. AUTOGROW in PDW is very different than AUTOGROW
in SQL Server. Each filegroup type has been evaluated for the AUTOGROW
function, and a predesigned, tested configuration is applied on your behalf.
This is classic appliance behavior.
Therefore, in conclusion, the AUTOGROW function is really there to prevent
failures and 3:00-a.m. alarm support calls! We should definitely be
monitoring for consumption of our pre-sized database and log files and be
proactive with our management of storage capacity. You can consider it a
good practice to preallocate your distributed, replicated, and log sizes for
your anticipated growth. A typical projection would be roughly six months
to a year. That gives you enough space to fine-tune future allocations
without over-allocating in the first instance.
Distributed Tables
Distributed tables are the heart of PDW and are key to understanding its
parallelism. In PDW, a table is defined as being distributed when it is
created.Wepickacolumn(yes,onlyone,justlikepartitioning) andhashthe
distinct values of this column, allocating each distinct value into one of our
buckets. To be a bit more precise, the hash is performed by the DMS as the
data is written to the table. Let's look at an example:
CREATE TABLE [dbo].[FactInternetSales]
(
Search WWH ::




Custom Search