Server was designed to scale up. It was also not designed from the ground
up with data warehousing in mind. PDW, however, was. It is a
workload-specific appliance focused on the data warehouse.
The fact that PDW is a scale-out technology is incredibly important. It
places Microsoft in the same bracket as other vendors of MPP distributed
databases, such as Teradata, Netezza, Oracle, SAP HANA, and Pivotal, whose
technology can scale to the demands of big data projects.
The only relational database technologies with any real presence in the
world of big data are scale-out MPP databases. Each and every one of them
purports to integrate with Hadoop in some form or other. PDW is no
exception.
MPP databases offer some compelling benefits for the data warehouse and
for big data. The primary benefit is the ability to scale across servers,
which enables a divide-and-conquer approach to data processing. By
spreading work over a number of servers, PDW can address many more CPU
cores than would ever be possible in an SMP configuration.
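The divide-and-conquer idea can be illustrated with a minimal sketch. This is
not PDW code. It assumes a hypothetical four-node cluster, a hypothetical
customer key used to place rows, and plain Python functions standing in for
the work each node would do. Rows are hashed to a node, each node aggregates
only its own rows, and the partial results are merged at the end.

```python
from collections import defaultdict
from zlib import crc32

NODE_COUNT = 4  # hypothetical cluster size, chosen only for illustration


def node_for(key: str, node_count: int = NODE_COUNT) -> int:
    """Pick a node by hashing the distribution key (stable across runs)."""
    return crc32(key.encode("utf-8")) % node_count


def distribute(rows, node_count: int = NODE_COUNT):
    """Divide: place each (customer, amount) row in its node's bucket."""
    buckets = [[] for _ in range(node_count)]
    for customer, amount in rows:
        buckets[node_for(customer, node_count)].append((customer, amount))
    return buckets


def local_totals(bucket):
    """Conquer: each node sums amounts per customer over its local rows only."""
    totals = defaultdict(int)
    for customer, amount in bucket:
        totals[customer] += amount
    return dict(totals)


def combine(partials):
    """Merge the per-node partial totals into a single result."""
    merged = defaultdict(int)
    for partial in partials:
        for customer, amount in partial.items():
            merged[customer] += amount
    return dict(merged)


rows = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3), ("bob", 1)]
buckets = distribute(rows)
result = combine(local_totals(b) for b in buckets)
print(result)
```

Because all rows for a given key land on the same node, each node's partial
totals are independent of the others, so in a real cluster the local step
could run on every node at the same time.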
In its biggest configuration, PDW supports 56 data processing servers
(known as compute nodes) comprising the following resources:
• 896 physical CPU cores
• 14TB of memory
• 6PB+ of storage capacity
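As a quick sanity check on these figures, the per-node resources can be
derived from the totals. The sketch below assumes the cores and memory are
spread evenly across the 56 compute nodes and that 1TB is counted as 1,024GB.
Both are assumptions for illustration, not statements from the product
documentation.

```python
COMPUTE_NODES = 56      # from the configuration described above
TOTAL_CORES = 896       # physical CPU cores across all compute nodes
TOTAL_MEMORY_TB = 14    # total memory across all compute nodes

# Assumption: resources are divided evenly between compute nodes.
cores_per_node = TOTAL_CORES // COMPUTE_NODES

# Assumption: 1 TB = 1024 GB.
memory_per_node_gb = TOTAL_MEMORY_TB * 1024 // COMPUTE_NODES

print(f"{cores_per_node} cores and {memory_per_node_gb} GB per compute node")
```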
What is even more impressive is that PDW forces you to use all these
resources. In other words, it forces parallelism into your queries. This is
fantastic for data warehousing.
Imagine that SQL Server let you run a query with a minimum degree of
parallelism, something like MINDOP(448). You can't? No, of course you
can't, because there is no such option in SQL Server. With PDW, you have
that by default.