Advanced Deployments - Implementing Splunk: Big Data Reporting and Development for Operational Intelligence


If we increase the average number of concurrent queries, increase the amount of data indexed per day, or decrease our IOPS, the number of indexers needed should scale more or less linearly.

If we scale up a bit more, say to 120 gigabytes a day, 5 concurrent user queries, and 2 summary queries running on average, we grow as follows:

950/800 IOPS
* 120/100 gigs
* (2 concurrent summary queries + 5 concurrent user queries) / 4
≈ 2.5 indexers

Three indexers would cover this load, but if one indexer is down, we will struggle to keep up with data from forwarders. Ideally, in this case, we should have four or more indexers.
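
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not a Splunk tool: the function names are hypothetical, the baselines of 100 gigabytes a day and 4 concurrent queries are read off the denominators in the formula, and the IOPS ratio is passed in exactly as the text gives it.

```python
import math

# Baselines read off the denominators of the formula in the text (assumption):
# one indexer per 100 GB/day and per 4 concurrent queries.
BASELINE_GB_PER_DAY = 100
BASELINE_CONCURRENT_QUERIES = 4


def estimate_indexers(iops_ratio, gb_per_day, user_queries, summary_queries):
    """Return the fractional indexer count suggested by the sizing formula."""
    data_ratio = gb_per_day / BASELINE_GB_PER_DAY
    query_ratio = (user_queries + summary_queries) / BASELINE_CONCURRENT_QUERIES
    return iops_ratio * data_ratio * query_ratio


def indexers_to_deploy(estimate, spares=1):
    """Round up to whole machines and add spare capacity (hypothetical helper)."""
    return math.ceil(estimate) + spares


# The scenario from the text: 950/800 IOPS, 120 GB/day, 5 user + 2 summary queries.
estimate = estimate_indexers(950 / 800, 120, user_queries=5, summary_queries=2)
print(f"estimated load: {estimate:.2f} indexers")
print("minimum whole indexers:", math.ceil(estimate))
print("with one spare:", indexers_to_deploy(estimate))
```

Rounding the estimate up gives the three indexers that cover the load, and adding one spare gives the four we would ideally deploy so that losing a single indexer does not leave us short.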

Planning redundancy

The term redundancy can mean different things, depending on your concern. Splunk has features to help with some of these concerns but not others. In a nutshell, up to and including version 4.3, Splunk is excellent at making sure data is captured but provides essentially no mechanism for reliably replicating data across multiple indexers. Splunk 5, not covered here, adds data replication features that can eliminate most of these concerns.

Indexer load balancing

Splunk forwarders are responsible for load balancing across indexers. This is accomplished most simply by providing a list of indexers in outputs.conf, as shown in the following code:

[tcpout:nyc]
server=nyc-splunk-index01:9997,nyc-splunk-index02:9997

If an indexer is unreachable, the forwarder will simply choose another indexer in the list. This scheme works very well and powers most Splunk deployments. If the DNS entry for a listed host returns multiple addresses, Splunk will balance between the addresses on the port specified.
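
To make the failover behavior concrete, here is a minimal Python sketch that reads the stanza above and picks a reachable indexer from the list. It is a simulation for illustration only and does not reproduce the forwarder's actual selection logic; the helper names `parse_servers` and `choose_indexer` and the `is_reachable` callback are hypothetical.

```python
import configparser
import random

# The stanza from the text, embedded so the sketch is self-contained.
OUTPUTS_CONF = """
[tcpout:nyc]
server=nyc-splunk-index01:9997,nyc-splunk-index02:9997
"""


def parse_servers(conf_text, stanza):
    """Return the (host, port) pairs listed in a tcpout stanza."""
    # Restrict delimiters to "=" so the colons in host:port stay in the value.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.read_string(conf_text)
    servers = []
    for entry in parser[stanza]["server"].split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def choose_indexer(servers, is_reachable, rng=random):
    """Try the indexers in random order and return the first reachable one.

    is_reachable is a caller-supplied function (an assumption of this sketch)
    standing in for a real connection attempt. Returns None if every
    indexer in the list is down.
    """
    candidates = list(servers)
    rng.shuffle(candidates)
    for host, port in candidates:
        if is_reachable(host, port):
            return host, port
    return None


servers = parse_servers(OUTPUTS_CONF, "tcpout:nyc")

# Pretend index01 is down: the only possible choice is index02.
chosen = choose_indexer(servers, lambda host, port: host != "nyc-splunk-index01")
print("sending to:", chosen)
```

Passing the reachability check in as a function keeps the sketch runnable without a network; a real check would attempt a TCP connection to the host and port.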
