The goal for Polybase was simple: provide T-SQL over Hadoop, a single pane
of glass for analysts and developers to interact with data residing in HDFS
and to use that “nonrelational” data in conjunction with the relational data
in conventional tables.
The goal was simple, but the solution was not. It was so big that Polybase
had to be broken down into phases.
We've talked about Polybase and Hadoop integration a few times in this
chapter. This section dives right into it:
• Polybase architecture
• Business use cases for Polybase today
• The future for Polybase
Polybase Architecture
Polybase is unique to PDW. It is integrated within PDW's Data Movement
Service (DMS), and the DMS isn't shipped with any other product in the SQL
Server family. Therefore, I think it's fair to claim Polybase for PDW.
Polybase extends the DMS by adding an HDFS Bridge component to its
architecture. The HDFS Bridge abstracts the complexity of Hadoop away
from PDW and allows the DMS to reuse its existing functionality: namely,
data type conversion (to ODBC types), generating the hash for data
distribution, and loading data into the SMP SQL Servers residing on the
compute nodes.
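To make the reused DMS steps more concrete, here is a minimal Python sketch of the general idea: convert delimited text fields into typed values, then hash a distribution column to decide where each row lands. It is an illustration only, not PDW's implementation. The schema, the pipe delimiter, the function names, the use of MD5, and the count of eight distributions are all assumptions made for the example; the actual hash function and type mappings used by the DMS are not shown here.

```python
import hashlib
from datetime import date

# Hypothetical schema for a delimited file in HDFS: (column name, converter).
# The real DMS converts to ODBC types; plain Python types stand in for them here.
SCHEMA = [("customer_id", int), ("visit_date", date.fromisoformat), ("url", str)]


def parse_line(line, delimiter="|"):
    """Split one text line and convert each field to its typed value."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} fields, got {len(fields)}")
    return {name: convert(raw) for (name, convert), raw in zip(SCHEMA, fields)}


def distribution_for(key, num_distributions):
    """Map a distribution-column value to a distribution number.

    MD5 is an assumption chosen because it is deterministic across runs;
    it is not the hash that PDW uses.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_distributions


def route_rows(rows, key_column, num_distributions):
    """Group typed rows by the distribution their key hashes to."""
    buckets = {d: [] for d in range(num_distributions)}
    for row in rows:
        buckets[distribution_for(row[key_column], num_distributions)].append(row)
    return buckets


if __name__ == "__main__":
    lines = ["1|2013-01-05|/home", "2|2013-01-06|/cart", "1|2013-01-07|/checkout"]
    buckets = route_rows([parse_line(l) for l in lines], "customer_id", 8)
    for number, rows in buckets.items():
        if rows:
            print(number, [r["url"] for r in rows])
```

The property worth noticing is that rows sharing a distribution-column value always land in the same distribution, which is what lets later joins on that column run without moving data again.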
This section details the following:
• HDFS Bridge
• Imposing structure with external tables
• Querying across relational and nonrelational data
• Importing data
• Exporting data
HDFS Bridge
The HDFS Bridge is an extension of the DMS. Consequently, it is a feature
unique to PDW. Its job is to abstract away the complexity of Hadoop and
isolate it from the rest of PDW while providing the gateway to data residing