How InfoSphere Streams Works
As previously mentioned, Streams is all about analytics on data in motion. In
Streams, data flows continuously through a sequence of operators in a pipeline
fashion, much like ingredients flowing through an assembly line in a
chocolate factory. Some operators discard data that isn't useful or relevant in
the same way that a sorter of cocoa beans might discard beans that are too
small. Other operators might transform data into a derived data stream in the
same way that cocoa beans are crushed and liquefied. Some operators combine
different types of data in the same way that nuts are combined with a
chocolate mixture in just the right proportions.
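The three operator roles described above (discard, transform, combine) can be sketched with ordinary Python generators. This is only an illustration of the pipeline idea, not the Streams API: the function names, the tuple fields, and the 0.5 yield factor are all hypothetical, and a short list stands in for an unbounded stream.

```python
from typing import Iterable, Iterator

# Hypothetical operators for illustration only; not part of InfoSphere Streams.

def discard_small(beans: Iterable[dict], min_size_mm: float) -> Iterator[dict]:
    """Filter operator: drop tuples that are not useful downstream."""
    for bean in beans:
        if bean["size_mm"] >= min_size_mm:
            yield bean

def crush(beans: Iterable[dict]) -> Iterator[dict]:
    """Transform operator: derive a new stream from the input stream."""
    for bean in beans:
        # Assumed yield factor of 0.5, chosen only to keep the numbers simple.
        yield {"liquor_g": bean["weight_g"] * 0.5}

def combine(liquor: Iterable[dict], nuts: Iterable[dict]) -> Iterator[dict]:
    """Combine operator: pair up tuples arriving on two different streams."""
    for liquor_item, nut_item in zip(liquor, nuts):
        yield {"liquor_g": liquor_item["liquor_g"], "nuts_g": nut_item["nuts_g"]}

# Finite lists stand in for continuous streams.
beans = [
    {"size_mm": 22.0, "weight_g": 1.0},
    {"size_mm": 9.0, "weight_g": 0.5},   # too small, will be discarded
    {"size_mm": 25.0, "weight_g": 1.5},
]
nuts = [{"nuts_g": 0.25}, {"nuts_g": 0.25}]

# Operators are chained so each tuple flows through the stages one at a time.
mixture = list(combine(crush(discard_small(beans, min_size_mm=15.0)), nuts))
```

Because generators are lazy, each tuple passes through every stage before the next one is read, which loosely mirrors how data moves through a pipeline of operators rather than being processed in whole batches.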
When operators are too slow to keep up, a data stream can be split up and
sent to parallel instances of those operators, in much the same way that a
factory might arrange its assembly line to have parallel molding machines. Some
operators might send different kinds of data to different downstream operators,
like sending a candy bar with nuts to one wrapping station, and the
candy bar without nuts to another wrapping station. Operators might even
send signals to earlier stages of analysis to change behavior, in much the same
way that quality control in the factory might increase the nuts-to-chocolate
ratio if samples are not meeting specifications.
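As a rough sketch of the first two ideas in this paragraph, the snippet below splits one stream across parallel lanes and routes tuples by kind. Everything here is an assumption made for illustration: the helper names, the round-robin policy, and the `nuts` field are hypothetical, and the lanes are plain lists rather than real concurrently running operator instances.

```python
from collections import defaultdict
from itertools import cycle

# Hypothetical helpers for illustration only; not part of InfoSphere Streams.

def split_round_robin(items, n_workers):
    """Spread one stream across n parallel instances of the same operator."""
    lanes = [[] for _ in range(n_workers)]
    # cycle() hands out lanes in turn; zip() stops when the items run out.
    for lane, item in zip(cycle(lanes), items):
        lane.append(item)
    return lanes

def route_by_kind(bars):
    """Send different kinds of tuples to different downstream operators."""
    routes = defaultdict(list)
    for bar in bars:
        key = "with_nuts" if bar["nuts"] else "plain"
        routes[key].append(bar)
    return dict(routes)

# Six candy bars; even ids contain nuts.
bars = [{"id": i, "nuts": i % 2 == 0} for i in range(6)]

lanes = split_round_robin(bars, n_workers=2)   # two "molding machines"
routes = route_by_kind(bars)                   # two "wrapping stations"
```

Round-robin is only one possible splitting policy; a real deployment might instead partition by key so that related tuples always reach the same parallel instance.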
Unlike an assembly line that can be changed only during a temporary
plant shutdown, however, Streams operators can be improved, added, or
removed dynamically without stopping the analysis. Streams is an excellent
approach for high-throughput, timely analysis; it enables businesses to
leverage just-in-time intelligence to perform actions in real time, ultimately
yielding better results for the business. Streams provides operators to store data
and results in an at-rest engine, to send action signals, or to just toss out the
data if it's deemed to be of no value during in-flight analysis.
What's a Lowercase “stream”?
A stream is a continuous sequence of data elements. A Streams application
can be viewed as a graph of nodes connected by directed edges. Each node in the
graph is an operator or adapter that processes the data from a stream. Operators
can have zero or more inputs and zero or more outputs. The output (or outputs)
from one operator connects to the input (or inputs) of another operator
(or operators). The edges of the graph that join the nodes together represent