This is quite an advanced topic, so we aren't going to be able to cover
it here. Suffice it to say that by replicating small dimension tables
and inner-joining them to our distributed table, we avoid the need to
redistribute data because of the join. Hopefully you can see that we
need to pay careful attention to our table design, as it can have a dramatic
impact on performance.
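To make the idea concrete, here is a minimal sketch in Python rather than in PDW itself. It assumes a toy modulo function in place of PDW's real hash distribution, and the table names, column names, and helper functions are all hypothetical. It shows only that when every node holds a full copy of the dimension table, each node can complete its share of the inner join using local rows alone.

```python
NODE_COUNT = 6  # assumption: a six-compute-node appliance


def distribute(rows, key, node_count=NODE_COUNT):
    """Place each row on one node, chosen from its distribution key.

    The modulo is a toy stand-in for PDW's real hash function.
    """
    nodes = {node: [] for node in range(node_count)}
    for row in rows:
        nodes[row[key] % node_count].append(row)
    return nodes


def replicate(rows, node_count=NODE_COUNT):
    """Place a full copy of the table on every node."""
    return {node: list(rows) for node in range(node_count)}


def local_inner_join(fact_by_node, dim_by_node, key):
    """Inner-join on each node using only the rows stored on that node."""
    joined = []
    for node, fact_rows in fact_by_node.items():
        lookup = {dim[key]: dim for dim in dim_by_node[node]}
        for fact in fact_rows:
            if fact[key] in lookup:
                joined.append({**fact, **lookup[fact[key]]})
    return joined


# Hypothetical sample data: a sales fact table and a small product dimension.
sales = [{"sale_id": i, "product_id": i % 3, "amount": 10 * i} for i in range(12)]
products = [{"product_id": p, "name": n} for p, n in enumerate(["bolt", "nut", "washer"])]

fact_by_node = distribute(sales, "sale_id")  # distributed on sale_id
dim_by_node = replicate(products)            # full copy on all six nodes
result = local_inner_join(fact_by_node, dim_by_node, "product_id")

print(len(result))  # 12: every sale found its product without leaving its node
```

Had the product table been distributed on product_id instead, many sales rows would sit on a different node from their matching product, and rows would have to be moved between nodes before the join could run.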
Naturally, there is a price to pay for this read enhancement. That price
comes in the form of slower writes. Consider for a moment our
six-compute-node deployment of PDW. If we have a replicated table, we need
to write the same row six times, once on each node, to ensure consistency.
Writes to replicated tables are therefore much slower than writes to
distributed tables. You can imagine that if a user were able to read the
data partway through this write, that user would end up with inconsistent
results. Therefore, the write is also a blocking transaction.
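The arithmetic behind the write penalty can be sketched as follows. This is a deliberately simple model, not a measurement: it counts physical row writes only and ignores logging, network transfer, and parallelism, so it says nothing about elapsed time. The function name and row count are hypothetical.

```python
def physical_row_writes(logical_rows, node_count, replicated):
    """Count physical row writes for a load under a simple model.

    A replicated table stores every row on every node; a distributed
    table stores each row on exactly one node.
    """
    if replicated:
        return logical_rows * node_count
    return logical_rows


NODES = 6            # the six-compute-node deployment from the text
ROWS_LOADED = 1_000  # hypothetical load size

replicated_writes = physical_row_writes(ROWS_LOADED, NODES, replicated=True)
distributed_writes = physical_row_writes(ROWS_LOADED, NODES, replicated=False)

print(replicated_writes, distributed_writes)  # 6000 1000
```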
Under normal operation, these are good trade-offs. The write penalty is
paid only once, when the data is written, and hopefully we can batch these
writes up to maximize efficiency.
Finally, PDW also performs the same abstraction for table names of
replicated tables as it does for distributed ones. A slight difference exists,
though, inasmuch as we need to create only one mapping name, and so
the naming convention differs slightly. For replicated tables, it is
TABLE_32AlphanumericCharacters. To see the value created, we can
use the sys.pdw_table_mappings catalog view as before.
Hopefully, you now have an appreciation for the what, why, and how of
PDW and are intrigued enough to consider it in your environment. In the
next section, we are going to talk about the one feature that is unique
to PDW, its jewel in the crown: Project Polybase.
Project Polybase
Project Polybase was devised by the Gray Systems Lab
(http://gsl.azurewebsites.net/) at the University of Wisconsin-Madison,
which is managed by Technical Fellow Dr. David DeWitt. In fact, Dr. DeWitt
was also instrumental in the development of Polybase, and is himself an
expert in distributed database technology.