Database Reference
Running these BigInsights functions gives you an easy way to integrate
with Hadoop from your traditional application framework. With these
functions, database applications (which are otherwise Hadoop-unaware) can
access data in a BigInsights cluster using the same SQL interface they use
to get relational data out of the database. Such applications can now
leverage the parallelism and scale of a BigInsights cluster without
requiring extra configuration or other overhead. Although this approach
incurs additional performance overhead compared with a conventional Hadoop
application, it is a very useful way to integrate Big Data processing into
your existing IT application infrastructure.
The IBM PureData System for Analytics Adapter
BigInsights includes a connector that enables data exchange between a
BigInsights cluster and IBM PureData System for Analytics (or its earlier
incarnation, the Netezza appliance). This adapter supports splitting tables
(a concept similar to splitting files): the table is partitioned and each
portion is assigned to a specific mapper. This way, your SQL statements can
be processed in parallel.
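To make the idea of table splitting concrete, here is a minimal sketch in
Python. It is not the adapter's actual algorithm, which the text does not
describe; it assumes a table with a dense integer key and simply divides that
key range into one contiguous slice per mapper. The function names, the table
name orders, and the column name order_id are hypothetical.

```python
def split_ranges(min_key, max_key, num_mappers):
    """Divide the inclusive key range [min_key, max_key] into contiguous slices.

    Illustrative only: assumes a dense integer key, which a real adapter
    may not rely on.
    """
    total = max_key - min_key + 1
    base, extra = divmod(total, num_mappers)
    ranges = []
    start = min_key
    for i in range(num_mappers):
        # The first `extra` mappers take one additional key each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more mappers than keys: leave this mapper idle
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def split_queries(table, key_column, min_key, max_key, num_mappers):
    """Build one SQL statement per mapper, each covering a single slice."""
    return [
        f"SELECT * FROM {table} WHERE {key_column} BETWEEN {lo} AND {hi}"
        for lo, hi in split_ranges(min_key, max_key, num_mappers)
    ]


if __name__ == "__main__":
    # Hypothetical table and key column, for illustration only.
    for query in split_queries("orders", "order_id", 1, 10, 3):
        print(query)
```

Each generated statement touches a disjoint slice of the table, so the
mappers can run their statements at the same time without reading the same
rows twice.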
The adapter leverages the Netezza technology's external table feature,
which you can think of as a materialized external UNIX pipe. External tables
use JDBC. In this scenario, each mapper acts as a database client: it
connects to the database and starts a read from a UNIX file that's created
by the IBM PureData System's infrastructure.
JDBC Module
The Jaql JDBC module enables you to read data from, and write data to, any
relational database that has a standard JDBC driver. This means you can
easily exchange data and issue SQL statements with every major database
warehouse product on the market today.
With Jaql's MapReduce integration, each map task can access a specific
part of a table, enabling SQL statements to be processed in parallel for
partitioned databases.
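The pattern of one map task per table partition can be sketched as follows.
This is a hedged illustration rather than Jaql code: it uses Python's built-in
sqlite3 module as a stand-in for a JDBC data source, threads as a stand-in for
map tasks, and a simple modulo predicate as the partitioning scheme. The table
name sales and all function names are hypothetical.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def build_demo_db(path, num_rows):
    """Create a small table that stands in for a relational source."""
    conn = sqlite3.connect(path)
    with conn:  # commits the transaction on success
        conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER)")
        conn.executemany(
            "INSERT INTO sales (id, amount) VALUES (?, ?)",
            [(i, i * 10) for i in range(1, num_rows + 1)],
        )
    conn.close()


def map_task(path, task_index, num_tasks):
    """One simulated map task: open its own connection, read only its slice."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(
            "SELECT id, amount FROM sales WHERE id % ? = ? ORDER BY id",
            (num_tasks, task_index),
        ).fetchall()
    finally:
        conn.close()


def parallel_read(path, num_tasks):
    """Run all simulated map tasks concurrently and collect their slices."""
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        return list(pool.map(lambda k: map_task(path, k, num_tasks), range(num_tasks)))


def run_demo(num_rows=12, num_tasks=3):
    """Build a temporary database and read it back in parallel slices."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        build_demo_db(path, num_rows)
        return parallel_read(path, num_tasks)


if __name__ == "__main__":
    for index, rows in enumerate(run_demo()):
        print(index, rows)
```

Because every task opens its own connection and filters on a different
remainder, the slices are disjoint and together cover the whole table, which
is the property that lets the statements run in parallel.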
 