Splunk will note the length of files matching the pattern, but followTail instructs
Splunk to ignore everything currently in these files. Any new events written to the
files will be indexed. Remember that there is no easy way to alter this if you change
your mind later.
It is not currently possible to say "ignore all events older than X", but since most logs
roll on a daily basis, this is not commonly a problem.
When to use crcSalt
To keep track of which files it has seen before, Splunk stores a checksum of
the first 256 bytes of each file it sees. This is usually plenty, as most files start with
a timestamped log message, which is almost guaranteed to be unique.
This breaks down when the first 256 bytes are not unique on the same server.
I have seen two cases where this happens, as follows:
1. Logs that start with a common header containing product version
information, for instance:
================================================================
== Great product version 1.2 brought to you by Great company ==
== Server kernel version 3.2.1 ==
2. A server writing many thousands of files with low time resolution, for
instance:
12:13:12 Session created
12:13:12 Starting session
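Both cases come down to the same collision, which can be reproduced in a few lines. The following is a minimal Python sketch, not Splunk's actual implementation: zlib.crc32 stands in for whatever checksum Splunk uses internally, and head_checksum, banner, and the sample events are hypothetical names and data chosen for illustration.

```python
import zlib

HEAD_BYTES = 256  # per the text, only the first 256 bytes are checksummed


def head_checksum(data: bytes) -> int:
    """Checksum of the first HEAD_BYTES bytes (zlib.crc32 is an assumed stand-in)."""
    return zlib.crc32(data[:HEAD_BYTES])


# Hypothetical 260-byte banner shared by every log, like the version header above.
banner = (b"=" * 64 + b"\n") * 4

log_a = banner + b"12:00:01 first event in file A\n"
log_b = banner + b"09:30:45 unrelated event in file B\n"

# The files differ, but only after byte 256, so the checksums collide.
print(head_checksum(log_a) == head_checksum(log_b))  # True

# A file whose very first byte differs gets a different checksum.
log_c = b"#" + log_a[1:]
print(head_checksum(log_a) == head_checksum(log_c))  # False
```

With a collision like the first one, the second file looks like one that has already been seen, even though its events are new.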
To deal with these cases, we can add the path to the log to the checksum, or "salt our
crc". This is accomplished like so:
[monitor:///opt/B/logs/application.log*]
sourcetype = access
crcSalt = <SOURCE>
The value <SOURCE> tells Splunk to include the full path to this log in the checksum.
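To see why adding the path helps, here is a minimal Python sketch of a salted checksum. It is an illustration under assumptions, not Splunk's code: zlib.crc32 stands in for the real checksum, and salted_checksum and the dated file names are hypothetical.

```python
import zlib

HEAD_BYTES = 256  # per the text, only the first 256 bytes are checksummed


def salted_checksum(path: str, data: bytes) -> int:
    """Checksum the path first, then continue over the head of the file."""
    salt = zlib.crc32(path.encode("utf-8"))
    return zlib.crc32(data[:HEAD_BYTES], salt)


# Two hypothetical logs whose first 256 bytes are identical.
head = (b"=" * 64 + b"\n") * 4
path_a = "/opt/B/logs/application.log.2012-11-10"
path_b = "/opt/B/logs/application.log.2012-11-11"

# Same content, different paths: the salted checksums no longer collide.
print(salted_checksum(path_a, head) == salted_checksum(path_b, head))  # False
```

The same property cuts the other way: identical bytes under a new path produce a new checksum, so a renamed file would look like a file that has never been seen.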
This method will only work if each log has a unique name. The easiest way
to accomplish this is to include the current date in the name of the log when it is
created. You may need to change the pattern for your log names so that the date is
always included and the log is never renamed.
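If you control the application that writes the logs, a date-stamped name can be built when the file is first opened. The following is a minimal Python sketch, assuming a hypothetical helper dated_log_name and the application.log base name from the stanza above; the date shown is only an example.

```python
from datetime import date
from fnmatch import fnmatch


def dated_log_name(base: str, day: date) -> str:
    """Append an ISO date so each day's log has its own permanent name."""
    return f"{base}.{day.isoformat()}"


name = dated_log_name("application.log", date(2012, 11, 10))
print(name)  # application.log.2012-11-10

# The dated name still matches the wildcard used in the monitor stanza.
print(fnmatch(name, "application.log*"))  # True
```

Because the date is part of the name from the start, the file never needs to be renamed when the log rolls.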
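Do not use crcSalt if your logs change names! A renamed log has a new path and
therefore a new checksum, so Splunk would treat it as a new file and index its
contents again.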
 