Further Reading
This chapter has given a short overview of Flume. For more detail, see Using Flume by Hari Shreedharan (O'Reilly, 2014). There is also a lot of practical information about designing ingest pipelines (and building Hadoop applications in general) in Hadoop Application Architectures by Mark Grover, Ted Malaska, Jonathan Seidman, and Gwen Shapira (O'Reilly, 2014).
[90] Note that a source has a channels property (plural) but a sink has a channel property (singular). This is because a source can feed more than one channel (see Fan Out), but a sink can only be fed by one channel. It's also possible for a channel to feed multiple sinks. This is covered in Sink Groups.
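
As a minimal sketch of the plural/singular distinction, the fragment below wires one source to two channels and gives each sink exactly one channel. The agent and component names (agent1, source1, channel1a, and so on) are illustrative assumptions, and the per-component type settings that a complete configuration needs are omitted.

```properties
# Declare the components of a hypothetical agent named agent1
agent1.sources = source1
agent1.sinks = sink1a sink1b
agent1.channels = channel1a channel1b

# A source may feed several channels, so the property is plural
agent1.sources.source1.channels = channel1a channel1b

# A sink is fed by exactly one channel, so the property is singular
agent1.sinks.sink1a.channel = channel1a
agent1.sinks.sink1b.channel = channel1b
```
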
[91] For a logfile that is continually appended to, you would periodically roll the logfile and move the old file to the spooling directory for Flume to read it.
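
The roll-and-move step could be sketched as follows. This is only an illustration, not part of Flume: the function name roll_into_spool and the timestamp-suffix naming scheme are assumptions, and the sketch assumes the logfile and the spooling directory are on the same filesystem so that the move is a single rename.

```python
import os
import time
from pathlib import Path
from typing import Optional


def roll_into_spool(logfile: Path, spool_dir: Path) -> Optional[Path]:
    """Move logfile into spool_dir under a timestamped name.

    Returns the new path, or None if there was nothing to roll.
    """
    # Skip missing or empty logs so no empty files are spooled
    if not logfile.exists() or logfile.stat().st_size == 0:
        return None
    spool_dir.mkdir(parents=True, exist_ok=True)
    # A nanosecond timestamp suffix is assumed to be unique enough
    # for periodic rolling (hypothetical naming scheme)
    target = spool_dir / f"{logfile.name}.{time.time_ns()}"
    # On a single filesystem this is one rename, so the file shows up
    # in the spooling directory complete rather than partially written
    os.replace(logfile, target)
    return target
```

A scheduler would call this periodically, for example roll_into_spool(Path("/var/log/app.log"), Path("/tmp/spooldir")), where both paths are placeholders. The application must stop writing to the old file before it is moved (typically by reopening its logfile), because files in the spooling directory are expected not to change once they are there.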
[92] Table 14-1 describes the interceptors that Flume provides.
[93] The Avro sink-source pair is older than the Thrift equivalent, and (at the time of writing) has some features that the Thrift one doesn't provide, such as encryption.