• Splunk relies on the modification time to determine whether new
events have been written to a file. File metadata may not be updated
as quickly on a network share.
• A large directory structure will cause the Splunk process that reads logs
to use a large amount of RAM and a large percentage of the CPU. It is
advisable to move old logs out of the monitored directory to minimize
the number of files Splunk must track.
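As a sketch of such a cleanup process (the `*.log.*` naming pattern, the seven-day window, and the function name are assumptions for illustration, not Splunk requirements), a scheduled shell function could sweep rolled logs into an archive directory outside the monitored tree:

```shell
#!/bin/sh
# move_old_logs: move rolled logs older than roughly DAYS days out of a
# monitored directory, so Splunk has fewer files to track.
# Usage: move_old_logs MONITORED_DIR ARCHIVE_DIR DAYS
move_old_logs() {
    monitored="$1"; archive="$2"; days="$3"
    mkdir -p "$archive"
    # -mtime "+$days" matches files whose last modification is more than
    # $days 24-hour periods ago; rolled logs are assumed to be named like
    # app.log.1 or app.log.2023-01-01, while the active app.log is untouched.
    find "$monitored" -maxdepth 1 -type f -name '*.log.*' -mtime "+$days" \
        -exec mv {} "$archive/" \;
}
```

Run from cron, this keeps the monitored directory limited to the active log and recently rolled files.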
This setup often looks like the following figure:
This configuration may look simple, but unfortunately, it does not scale easily.
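Another way to limit how many files Splunk tracks on a share is the documented `ignoreOlderThan` setting on a monitor input in `inputs.conf`; the path, window, and sourcetype below are placeholders:

```ini
[monitor:///mnt/logs]
# Skip files whose modification time is older than seven days, so the
# tailing processor does not track every historical file on the share.
ignoreOlderThan = 7d
sourcetype = app_logs
disabled = false
```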
Consuming logs in batch
Another less common approach is to gather logs periodically from servers after the
logs have rolled. This is very similar to monitoring logs on a shared drive, except
that the problems of scale are possibly even worse.
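In `inputs.conf` terms, this pattern corresponds to Splunk's batch input, which indexes each file once and then deletes it. A minimal sketch, assuming `/opt/splunk_batch` as the drop directory:

```ini
[batch:///opt/splunk_batch]
# sinkhole is required for batch inputs: each file is indexed once and
# then deleted, so the same rolled log is never indexed twice.
move_policy = sinkhole
sourcetype = rolled_logs
```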
The advantages of this approach include:
• A forwarder does not need to be installed on each server that is writing its
logs to the share.
The disadvantages of this approach include:
• When new logs are dropped, if the files are large, the Splunk process will
only read events from one file at a time. When this directory is on an indexer,
this is fine, but when a forwarder is trying to distribute events across
multiple indexers, only one indexer will receive events at a time.
• The oldest events in the rolled log will not be loaded until the log is rolled
and copied.
• An active log cannot be copied safely, as events may be truncated
mid-copy, or Splunk may treat the updated file as a new log and index
the entire file again.
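To avoid handing Splunk a partially written file, a rolled log can be copied under a temporary name and then renamed into the watched directory, since a rename within the same filesystem is atomic. A minimal sketch (the function name and the `.staging` subdirectory are assumptions; the staging path would need to be excluded from the Splunk input, for example with a blacklist):

```shell
#!/bin/sh
# deliver_log: copy a rolled log into the directory Splunk watches without
# ever exposing a half-written file.
# Usage: deliver_log /path/to/app.log.1 /opt/splunk_batch
deliver_log() {
    src="$1"; dest_dir="$2"
    # Stage on the same filesystem as the destination so the final mv is
    # an atomic rename rather than a slow, visible copy.
    staging="$dest_dir/.staging"
    mkdir -p "$staging"
    base=$(basename "$src")
    cp "$src" "$staging/$base"
    mv "$staging/$base" "$dest_dir/$base"
}
```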