InfoSphere Streams for Data in Motion
As you'll discover in Chapter 7, Streams is the IBM solution for real-time
analytics on streaming data. Streams includes a sink adapter for BigInsights,
which lets you store streaming data directly into your BigInsights cluster.
Streams also includes a source adapter for BigInsights, which lets Streams
applications read data from the cluster. The integration between BigInsights
and Streams raises a number of interesting possibilities. At a high level, you
can create an infrastructure to respond to changes detected in data in real
time (as the data is being processed in motion by Streams), while using a
wealth of existing data (stored and analyzed at rest by BigInsights) to inform
the response. You could also use Streams as a large-scale data ingest engine
to filter, decorate, or otherwise manipulate a stream of data to be stored in the
BigInsights cluster.
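The ingest role described above can be illustrated with a minimal Python sketch. It is not Streams code (Streams applications are not written this way) and it does not use any BigInsights API; the record fields, the function names, and the lookup table are all hypothetical. It only shows the shape of a filter-then-decorate pipeline whose output would then be handed to a sink for storage.

```python
from typing import Dict, Iterable, Iterator, List


def keep_valid(records: Iterable[dict]) -> Iterator[dict]:
    """Filter step: drop records that carry no reading."""
    for record in records:
        if record.get("reading") is not None:
            yield record


def decorate(records: Iterable[dict], site_by_sensor: Dict[str, str]) -> Iterator[dict]:
    """Decorate step: attach reference data (a hypothetical site name) to each record."""
    for record in records:
        yield {**record, "site": site_by_sensor.get(record["sensor"], "unknown")}


def ingest(records: Iterable[dict], site_by_sensor: Dict[str, str]) -> List[dict]:
    """Chain the steps; a real pipeline would write the result to the cluster instead."""
    return list(decorate(keep_valid(records), site_by_sensor))
```

Because each step is a generator, records flow through one at a time rather than being collected first, which is the general idea behind processing data in motion.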
Using the BigInsights sink adapter, a Streams application can write a control
file to the BigInsights cluster. BigInsights can be configured to respond to the
appearance of such a file by triggering a deeper analytics operation in the
cluster. For more advanced scenarios, the trigger file from Streams could also
contain query parameters to customize the analysis in BigInsights.
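The control-file pattern can be sketched in Python as follows. This is an illustration under stated assumptions, not the BigInsights or Streams mechanism: a local directory stands in for the cluster file system, the `.trigger` and `.done` suffixes and the JSON parameter format are invented for the example, and `write_control_file`, `poll_once`, and `example_analysis` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable, List


def write_control_file(directory: Path, name: str, params: dict) -> Path:
    """Writer side (standing in for the streaming application)."""
    tmp = directory / (name + ".tmp")
    tmp.write_text(json.dumps(params))
    final = directory / (name + ".trigger")
    # Rename only after the content is complete, so the watcher
    # never picks up a half-written file.
    tmp.rename(final)
    return final


def poll_once(directory: Path, run_analysis: Callable[[dict], str]) -> List[str]:
    """Watcher side: run the analysis once per trigger file, then mark it done."""
    results = []
    for path in sorted(directory.glob("*.trigger")):
        params = json.loads(path.read_text())  # query parameters carried by the file
        results.append(run_analysis(params))
        path.rename(path.with_suffix(".done"))  # prevents reprocessing on the next poll
    return results


def example_analysis(params: dict) -> str:
    """Hypothetical deeper analysis; here it only echoes its parameters."""
    return "analyze {} over last {} hours".format(
        params["customer_id"], params["window_hours"]
    )
```

A caller would write a file such as `write_control_file(d, "alert-001", {"customer_id": "C42", "window_hours": 24})` and then invoke `poll_once(d, example_analysis)` on a schedule. Carrying the parameters inside the file keeps the two sides decoupled: the writer needs to know only where to put the file, not how the analysis is launched.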
InfoSphere DataStage
DataStage is an extract, transform, and load (ETL) platform that is capable of
integrating high volumes of data across a broad, heterogeneous range of source
and target systems. It offers an easy-to-use interface and design tools; supports
massive scalability; transforms data from multiple, disparate information
sources; and delivers data in batch or in real time. Expanding its role as a data
integration agent, DataStage has been extended to work with BigInsights and
can push and pull data to and from BigInsights clusters. The DataStage
connector to BigInsights integrates with HDFS, taking advantage of the clustered
architecture so that any bulk writes to the same file are done in parallel.
The result of DataStage integration is that BigInsights can quickly exchange
data with any other software product able to connect with DataStage.