CHAPTER 8
Loading a PDW Region in APS
SQL Server Parallel Data Warehouse (PDW) is Microsoft's massively parallel processing
(MPP) offering and is available as part of Microsoft's Analytics Platform System
(APS). APS is a turnkey solution focused on big data analytics. It offers two regions, or
software options, to customers: PDW and HDInsight, which is Microsoft's 100%
Apache Hadoop distribution. PDW is built upon the SQL Server platform, although it is
a separate product with a build of SQL Server specifically designed to support MPP
operations.
Massively Parallel Processing
As the name suggests, massively parallel processing (MPP) uses multiple servers working
as one system, called an appliance, to achieve much greater performance and scan
rates than in traditional SMP systems. SMP refers to symmetric multiprocessing; most
database systems, such as all other versions of SQL Server, are SMP.
To obtain a better understanding of the difference between SMP and MPP systems,
let's examine a common analogy. Imagine you are handed a shuffled deck of 52 playing
cards and are asked to retrieve all of the queens. Even at your fastest, it would take you
several seconds to retrieve the requested cards. Let's now take that same deck of 52
cards and divide it among ten people. No matter how quick you are, these ten people
working together can retrieve all of the queens much faster than you can by yourself.
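The card analogy can be sketched in a few lines of Python. This is only an illustration of the divide-and-conquer idea, not a description of how PDW distributes or scans data: the function names, the round-robin deal, and the use of a thread pool are all assumptions made for the example, and Python threads will not make a 52-card scan measurably faster.

```python
from concurrent.futures import ThreadPoolExecutor
import random

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]


def build_deck(seed=0):
    """Return a shuffled 52-card deck as (rank, suit) tuples."""
    deck = [(rank, suit) for rank in RANKS for suit in SUITS]
    random.Random(seed).shuffle(deck)
    return deck


def find_queens(cards):
    """Scan a pile of cards and keep only the queens."""
    return [card for card in cards if card[0] == "Q"]


def scan_alone(deck):
    """One person scans the whole deck (the single-system case)."""
    return find_queens(deck)


def scan_in_parallel(deck, workers=10):
    """Deal the deck among `workers` people, let each scan their own
    share, then combine the partial results (the many-workers case)."""
    # Hypothetical round-robin deal: person i gets cards i, i+10, i+20, ...
    shares = [deck[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(find_queens, shares))
    # Merge every person's queens into one answer.
    return [card for part in partial_results for card in part]


deck = build_deck()
alone = scan_alone(deck)
together = scan_in_parallel(deck)
print(sorted(alone) == sorted(together))
```

Both approaches return the same four queens; the difference is that each of the ten workers only has to look at five or six cards instead of all 52.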
As you may have inferred, you represent the SMP system, and the ten people represent
the MPP system. This divide-and-conquer strategy is why MPP appliances are
particularly well suited for high-volume, scan-intensive data warehousing environments,