Data Hub/Broker
Like PDW, Hadoop can often be considered a downstream system. In
other words, it receives data from other sources, and that data is then
brought in for analysis.
Integrating that data into Hadoop does require some additional skills and
possibly additional tooling. However, you could mitigate that need by using PDW
as a data hub and delivery mechanism for the warehouse and Hadoop
environments. This would have some nice benefits:
• No new skills
• No new tools
• Consistency of data
• Streamlined operations through consolidated feeds
Speculating on the Future for Polybase
Sometimes it's worth doing some research and wider reading. You can
occasionally come across some hidden gems that make a worthwhile pursuit
an invaluable one. Bear this in mind as we dust off our crystal ball and
“speculate” on the future for Polybase.
Partitioned Appliance
In the June 2013 SIGMOD whitepaper authored by Dr. David DeWitt and
others titled “Split Query Processing in Polybase,” I unearthed the following
information morsel: “ . . . there are tentative plans to allow customers to
partition their appliances into disjoint Hadoop and PDW regions . . .”
This first comment largely speaks for itself. To me, it is clear that the PDW
team plans to enable PDW customers to use scale units for Hadoop. I'd
expect this to be in addition to the features we have seen in Polybase already.
Consequently, I think what we will have as users is a choice. We can
integrate with Hadoop configured inside the appliance, with Hadoop
outside of the appliance, or with both. That'd be a pretty
impressive option, don't you think? Other vendors tend to favor one
deployment model and only work with a single distribution. As we look to
the future, the agnostic architecture of Polybase is really starting to show its
true worth.