Database Reference
In-Depth Information
data is first loaded into a temporary table in the staging database before it is inserted in-
to the permanent table in the destination database.
Tip Using a staging database? Make sure your staging database has enough space
available to accommodate all tables being loaded concurrently. If you do not allocate
enough space initially, don't worry; you'll still be okay—the staging database will
autogrow. Your loads may just slow down while the autogrow is occurring. Also, your
staging database will likely need to be larger when you perform the initial table loads
during system deployment and migration. However, once your system becomes more
mature and the initial ramp-up is complete, you can recover some space by dropping
and re-creating a smaller staging database.
From within SSDT, execute the code in Listing 8-5 on your PDW appliance to cre-
ate a staging database.
Listing 8-5 . PDW Code to Run from SSDT to Create a Staging Database
CREATE DATABASE StageDB_Example
WITH
(
AUTOGROW
= ON
, REPLICATED_SIZE
= 1 GB
, DISTRIBUTED_SIZE
= 5 GB
, LOG_SIZE
= 1 GB
);
PDW introduces the concept of replicated and distributed tables. In a distributed
table , the data is split across all nodes using a distribution hash specified during table
creation. In a replicated table , the full table data exist on every Compute node. When
used correctly, a replicated table can often improve join performance. As a hypothetical
example, consider a small DimCountry dimension table with 200 rows. DimCoun-
try would likely be replicated, whereas a much larger FactSales table would be
distributed. This design allows any joins between FactSales and DimCountry to
take place locally on each node. Although you would essentially be creating ten copies
of DimCountry , one on each Compute node, the performance benefit of a local join
outweighs the minimal cost of storing duplicate copies of such a small table.
 
 
Search WWH ::




Custom Search