Flume Sinks
Flume sinks sit on the other side of a Flume channel, acting as a destination
for events. Flume ships with a large number of built-in sinks, most of which
are focused on writing events to some sort of persistent store. Some
examples of sinks include an HDFS sink that writes data to Hadoop; a File
Roll sink that persists events to disk; and an Elasticsearch sink which writes
events to the Elasticsearch data store.
All of these sinks are important parts of a full-fledged data pipeline, but they
are not particularly applicable to processing streaming data. As such, they
are mostly set aside in this section in favor of the sinks that are more
useful for stream processing.
Avro Sink
The Avro sink's primary use case is implementing multi-tier topologies
in Flume environments. It is also a good way of forwarding events to
a stream-processing service for consumption. The Avro sink has a type
property of avro and minimally requires a destination hostname and port:
agent_1.sinks.sink-1.type = avro
agent_1.sinks.sink-1.hostname = localhost
agent_1.sinks.sink-1.port = 4141
There are also a number of optional parameters that control the sink's
behavior. The connection and request timeouts can be set explicitly, as
can a reconnection interval. Periodic reconnection is useful when the
destination sits behind a load balancer, because load can often only be
balanced at connection time; by reconnecting occasionally, the load is
distributed more evenly:
agent_1.sinks.sink-1.connect-timeout = 20000
agent_1.sinks.sink-1.request-timeout = 10000
agent_1.sinks.sink-1.reset-connection-interval = 600000
Connections can also be compressed by setting the compression-type
property to deflate, currently the only supported option:
agent_1.sinks.sink-1.compression-type = deflate
agent_1.sinks.sink-1.compression-level = 5
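Putting the pieces together, a sink must also be bound to a channel before
the agent will run it. The following sketch shows a complete Avro sink
definition; the channel name channel-1 and the hostname
collector.example.com are illustrative placeholders, not values from the
examples above:

agent_1.sinks = sink-1
agent_1.sinks.sink-1.type = avro
agent_1.sinks.sink-1.channel = channel-1
agent_1.sinks.sink-1.hostname = collector.example.com
agent_1.sinks.sink-1.port = 4141
agent_1.sinks.sink-1.compression-type = deflate
agent_1.sinks.sink-1.compression-level = 5

Note that when compression is enabled, the Avro source on the receiving
agent must also set compression-type to deflate, or the events cannot be
deserialized.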