data,
replace HOUR with MINUTE or SECOND */
SET @startDate = DATEADD(HOUR, 1, @startDate);
END;
This script generates roughly 26,000 rows in the FactSales table, one per hour
across 3 years. You can easily increase the number of rows generated by replacing
HOUR in the DATEADD statement with MINUTE or even SECOND.
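To see how the row count scales with the interval, the following sketch estimates it with DATEDIFF. It assumes the script inserts one row per loop iteration, and the start and end dates are illustrative values chosen to cover three non-leap years; they are not taken from the script above.

```sql
-- Assumed, illustrative date range covering three non-leap years
-- (2009, 2010, 2011); substitute the dates your script actually uses.
DECLARE @rangeStart DATETIME = '20090101';
DECLARE @rangeEnd   DATETIME = '20120101';

-- Each DATEDIFF counts the interval boundaries crossed, which equals the
-- number of rows produced when the loop advances by that interval.
SELECT
    DATEDIFF(HOUR,   @rangeStart, @rangeEnd) AS rowsPerHour,    -- 26,280
    DATEDIFF(MINUTE, @rangeStart, @rangeEnd) AS rowsPerMinute,  -- 1,576,800
    DATEDIFF(SECOND, @rangeStart, @rangeEnd) AS rowsPerSecond;  -- 94,608,000
```

Switching to SECOND produces tens of millions of rows, so a row-by-row loop may take a long time to finish at that granularity.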
Now that you have a data source to work with, you are ready to start working on
your Integration Services package.
Package Overview
Let's discuss what your package will do. You are going to configure a data flow that
will move data from SQL Server to PDW. You will create a connection to your data
source via an OLE DB Source. Because UNIQUEIDENTIFIER (also known as a
GUID) is not yet supported as a data type in PDW, you will transform the
UNIQUEIDENTIFIER to a Unicode string (DT_WSTR) using a data conversion.
You will then configure the PDW destination adapter to load data into the APS
appliance. Lastly, you will multithread the package, which takes advantage of PDW's
parallelization to improve load performance.
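The package performs the GUID conversion inside the data flow. As a rough T-SQL equivalent, shown only to illustrate what the conversion produces, a UNIQUEIDENTIFIER can be cast to a 36-character Unicode string. The variable name here is hypothetical.

```sql
-- Hypothetical variable holding a freshly generated GUID.
DECLARE @orderID UNIQUEIDENTIFIER = NEWID();

-- A GUID rendered as text is 32 hexadecimal digits plus 4 hyphens,
-- so NVARCHAR(36) holds it without truncation.
SELECT
    @orderID                             AS originalGuid,
    CAST(@orderID AS NVARCHAR(36))       AS guidAsUnicodeString,
    LEN(CAST(@orderID AS NVARCHAR(36)))  AS stringLength;  -- 36
```

A width of 36 is therefore a reasonable choice for the DT_WSTR output column in the data conversion.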
One easy way to multithread is to create multiple data flows that execute in parallel
for the same table. You can have up to ten simultaneous loads—ten data flows—for a
table. The challenge with simultaneous loading, however, is to avoid causing too much
contention on the source system. You can minimize contention by isolating each data
flow to a separate, equal portion of the clustered index. Better yet, if you have SQL
Server Enterprise Edition, you can isolate each data flow by loading by partition. This
latter method is preferable and is the approach our example will use.
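Loading by partition can be sketched as follows. Every object name in this sketch (pf_DemoByYear, ps_DemoByYear, dbo.DemoFactSales, orderID, orderDate) is hypothetical and does not come from the example database; the point is only that each data flow's source query filters on a single partition number through $PARTITION.

```sql
-- Hypothetical partition function: two boundaries yield three partitions.
CREATE PARTITION FUNCTION pf_DemoByYear (DATETIME)
    AS RANGE RIGHT FOR VALUES ('20100101', '20110101');

-- Hypothetical scheme mapping every partition to the PRIMARY filegroup.
CREATE PARTITION SCHEME ps_DemoByYear
    AS PARTITION pf_DemoByYear ALL TO ([PRIMARY]);

-- Hypothetical stand-in for the fact table, partitioned on orderDate.
CREATE TABLE dbo.DemoFactSales
(
    orderID   UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
    orderDate DATETIME         NOT NULL
) ON ps_DemoByYear (orderDate);

-- Source query for data flow 1: reads only partition 1.
SELECT orderID, orderDate
FROM dbo.DemoFactSales
WHERE $PARTITION.pf_DemoByYear(orderDate) = 1;

-- Source query for data flow 2: reads only partition 2.
SELECT orderID, orderDate
FROM dbo.DemoFactSales
WHERE $PARTITION.pf_DemoByYear(orderDate) = 2;
```

Because each query reads a different partition, the parallel data flows do not compete for the same rows on the source system.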
Now that you understand the general structure of the Integration Services package,
let's create it.
The Data Source
If you have not already done so, create a new Integration Services project named
PDW_Example (File > New > Project > Integration Services Project).