      output upperedTxt : line = upper(line);
    }
    () as Sink = FileSink(upperedTxt) {
      param file : "/dev/stdout";
            format : line;
    }
}
In this SPL snippet, the built-in FileSource operator reads data from the
specified file one line at a time and puts it into a stream called LineStream
that has a single tuple attribute called line. The built-in Functor operator
consumes the LineStream stream, converts the line attribute of each
streamed tuple to uppercase text, and creates a new output stream called
upperedTxt using the same tuple format as LineStream. The FileSink operator,
invoked here as Sink, then reads the upperedTxt stream and sends the tuples to
standard output (STDOUT). Notice that the application is wrapped in a composite
operator that encapsulates the function, as described in the previous
sections; again, all applications consist of one or more composite operators.
This snippet represents the simplest stream application: a single source, a
single operation, and a single sink. Of course, the power of Streams is that it
can run massively parallel jobs across large clusters of servers, where each
operator, or a group of operators, can run on a separate server. But before we
get into the enterprise-class capabilities of Streams, let's look at some of the
most popular adapters and operators that are available in this product. As
described earlier in this chapter, Streams provides nearly 30 built-in operators
in the standard toolkit, dozens of operators from special toolkits, such as data
mining and text analytics, and literally hundreds of functions out of the box
for developers.
Source and Sink Adapters
It goes without saying that in order to perform analysis on a stream of data,
the data has to enter a stream. Likewise, a stream of data has to go somewhere
when the analysis is done (even if that “somewhere” is a void where bits get
dumped into “nowhere”). Let's look at the most basic source adapters available
to ingest data, along with the most basic sink adapters to which data can be
sent. Perhaps the most powerful adapters we describe are import and export;
these operators provide dynamic connections for jobs that can be configured at
deployment time and at run time.
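
As a rough illustration of the import and export adapters, the following SPL
sketch shows one job publishing a stream by property and a second job
subscribing to it. The composite names (LinePublisher, LineSubscriber), the
input file name, and the kind property are hypothetical, chosen only for
illustration. The Export and Import invocations follow the standard toolkit as
we understand it, so check the details against the operator reference for your
Streams release.

```spl
// Job 1 (hypothetical): reads lines from a file and exports the stream.
composite LinePublisher {
  graph
    stream<rstring line> LineStream = FileSource() {
      param file : "input.txt";   // assumed file name, for illustration only
            format : line;
    }
    // Publish LineStream with a property that other jobs can match on.
    () as Exporter = Export(LineStream) {
      param properties : { kind = "lines" };   // assumed property name and value
    }
}

// Job 2 (hypothetical): subscribes to any exported stream whose
// kind property equals "lines" and writes the tuples to standard output.
composite LineSubscriber {
  graph
    stream<rstring line> Imported = Import() {
      param subscription : kind == "lines";
    }
    () as Sink = FileSink(Imported) {
      param file : "/dev/stdout";
            format : line;
    }
}
```

Because the two jobs are connected only through the exported property and the
matching subscription, either job can be submitted or cancelled without
recompiling the other, which is the sense in which the connection is dynamic.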
 