server needs to communicate with the Control node to identify which Compute nodes
should receive data. Because this overhead is incurred on every single load,
transactional load patterns (such as singleton inserts) should be avoided. PDW
performs at its best when data is loaded in large, incremental batches. You will see
much better performance loading 10 files with 100,000 rows each, or a single file
with 1,000,000 rows, than loading 1,000,000 rows individually.
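
The batching arithmetic above can be sketched in a few lines of Python. This is only an illustration of grouping rows into large batches and counting how many separate loads result. It does not call any PDW or SSIS API, and the names batched and load_count are hypothetical helpers introduced here, not part of any product.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of up to batch_size rows (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_count(total_rows: int, batch_size: int) -> int:
    """Number of separate loads needed, using ceiling division."""
    return -(-total_rows // batch_size)


TOTAL_ROWS = 1_000_000

# Each load pays the per-load coordination overhead once, so fewer,
# larger loads pay that overhead fewer times.
singleton_loads = load_count(TOTAL_ROWS, 1)       # one row per load
batched_loads = load_count(TOTAL_ROWS, 100_000)   # 100,000 rows per load

print(singleton_loads, batched_loads)  # prints: 1000000 10
```

With the figures from the text, singleton inserts pay the per-load overhead 1,000,000 times, while 100,000-row batches pay it only 10 times. The actual cost of each load is not modeled here; it depends on the appliance and the load tool.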
Summary
We've covered a lot of material in this chapter. You have learned about the
architectures of Microsoft's Analytics Platform System (APS) and SQL Server Parallel
Data Warehouse (PDW). You've learned about the differences between SMP and MPP
systems and why MPP systems are better suited for large analytical workloads. You
have learned about different methods for loading data into PDW and ways to improve
load performance. You have also discovered some best practices along the way. Lastly,
you walked through a step-by-step exercise to parallelize loading data from SQL
Server into PDW using SSIS.