Databases Reference
In-Depth Information
//Do nothing
}
return
;
}
String
str
;
BufferedReader
reader
=
new
BufferedReader
(
fileReader
);
try
{
while
((
str
=
reader
.
readLine
())
!=
null
){
this
.
collector
.
emit
(
new
Values
(
str
));
}
}
catch
(
Exception
e
){
throw
new
RuntimeException
(
"Error reading tuple"
,
e
);
}
finally
{
completed
=
true
;
}
}
Values is an implementation of ArrayList, where the elements of the list
are passed to the constructor.
nextTuple()
is called periodically from the same loop as the
ack()
and
fail()
methods.
It must release control of the thread when there is no work to do so that the other
methods have a chance to be called. So the first line of nextTuple checks to see if
processing has finished. If so, it should sleep for at least one millisecond to reduce load
on the processor before returning. If there is work to be done, each line in the file is
read into a value and emitted.
A tuple is a named list of values, which can be of any type of Java object
(as long as the object is serializable). By default, Storm can serialize
common types like strings, byte arrays, ArrayList, HashMap, and Hash-
Set.
Bolts
We now have a spout that reads from a file and emits one
tuple
per line. We need to
create two bolts to process these tuples (see
Figure 2-1
). The bolts implement the
backtype.storm.topology.IRichBolt
interface.
The most important method in the bolt is
void execute(Tuple input)
, which is called
once per tuple received. The bolt will emit several tuples for each tuple received.
A bolt or spout can emit as many tuples as needed. When the
nextTu
ple
or
execute
methods are called, they may emit 0, 1, or many tuples.
You'll learn more about this in
Chapter 5
.