Directory source attempts to overcome the limitations of the Tail source by
providing support for the most common form of logging—logs with rotation.
To use the Spool Directory source, the application must set up its log
rotation such that when a log file is rolled it is moved to a spool directory.
This is usually fairly easy to do with most logging systems. The Spool
Directory source monitors this directory for new files and writes them to
Flume. This minimally requires the definition of the spoolDir property
along with setting the type to spooldir:
agent_1.sources.source-1.type=spooldir
agent_1.sources.source-1.spoolDir=/var/logs/
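The two properties above only describe the source itself. As a minimal sketch of how they might sit inside a complete agent definition, the fragment below also declares a channel and a sink and wires the source to them. The names channel-1 and sink-1, the memory channel, and the logger sink are assumptions chosen for illustration and are not taken from the text; note that Flume property keys use the plural prefixes sources, channels, and sinks.

```properties
# Declare the components of the agent (channel-1 and sink-1 are hypothetical names)
agent_1.sources=source-1
agent_1.channels=channel-1
agent_1.sinks=sink-1

# Spool Directory source: watch /var/logs/ for rolled files
agent_1.sources.source-1.type=spooldir
agent_1.sources.source-1.spoolDir=/var/logs/
agent_1.sources.source-1.channels=channel-1

# In-memory channel, assumed here only to keep the sketch short
agent_1.channels.channel-1.type=memory

# Logger sink, assumed here so that events are visible in the agent log
agent_1.sinks.sink-1.type=logger
agent_1.sinks.sink-1.channel=channel-1
```

A production agent would likely swap the memory channel and logger sink for durable equivalents; they are used here only so the wiring is visible in a few lines.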
A common pattern for applications is to write to a file ending in .log and
then to roll it to another log file ending in a timestamp. Normally the active
log file causes errors for the Spool Directory source, because the source
expects files to remain unchanged once they appear in the directory, but it can
be ignored by setting the ignorePattern property to ignore files ending in .log:
agent_1.sources.source-1.ignorePattern=^.*\.log$
When the Spool Directory source finishes processing a file, it marks the
file by appending .COMPLETED to the filename; the suffix is controlled by the
fileSuffix property. This allows log cleanup tools to identify files that
can be safely deleted from the filesystem. The source can also delete the files
when it is done processing them by setting the deletePolicy property to
immediate:
agent_1.sources.source-1.fileSuffix=.DONE
agent_1.sources.source-1.deletePolicy=immediate
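In practice the two cleanup styles are probably best treated as alternatives, since a file that is deleted immediately has no use for a completion suffix. The sketch below separates them, assuming the same agent and source names as above and the plural sources prefix used by Flume property files; the value never for deletePolicy is, as far as I know, the default, and is written out here only for clarity.

```properties
# Option A: keep processed files and mark them with a custom suffix,
# leaving deletion to an external cleanup tool
agent_1.sources.source-1.fileSuffix=.DONE
agent_1.sources.source-1.deletePolicy=never

# Option B: let the source delete each file once it is fully processed
# (uncomment this line and drop Option A to use it)
# agent_1.sources.source-1.deletePolicy=immediate
```

Option A leaves an audit trail on disk at the cost of needing a separate cleanup job; Option B keeps the spool directory small but removes the ability to reprocess a file.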
By default, the source assumes that the records in the files are text separated
by newlines. This is the behavior of the LINE deserializer, which ships
with Flume. The base Flume installation also ships with AVRO and BLOB
deserializers. The AVRO deserializer reads Avro-encoded data from the files
according to a schema specified by the deserializer.schemaType property. The
BLOB deserializer reads a large binary object from each file,
usually with just a single object per file. Like the HTTP BlobHandler, this
can be dangerous because the entire Blob is read into memory. The choice
of deserializer is via the deserializer property: