The math then looks like:
2,880 summary events / 10,000 events processed per second = 0.288 seconds
That's still a significant improvement over 100 seconds.
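The arithmetic above can be reproduced in a few lines. This is only a sketch: the function name is hypothetical, it assumes search time scales linearly with the number of events scanned, and the figure of 1,000,000 raw events is inferred from 100 seconds at 10,000 events per second rather than stated directly.

```python
def estimated_search_seconds(event_count, events_per_second=10_000):
    """Rough search time, assuming time grows linearly with events scanned."""
    return event_count / events_per_second


# Events read when searching the summary index.
summary_seconds = estimated_search_seconds(2_880)

# Assumption: 1,000,000 raw events, inferred from 100 s at 10,000 events/s.
raw_seconds = estimated_search_seconds(1_000_000)

# How many times faster the summary search is under these assumptions.
speedup = raw_seconds / summary_seconds

print(f"summary: {summary_seconds:.3f} s, raw: {raw_seconds:.0f} s")
```

Under these assumptions the summary search reads a few hundred times fewer events, which is where the saving comes from.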
When to not use a summary index
There are several cases where summary indexes are either inappropriate or
inefficient. Consider the following:
When you need to see the original events: In most cases, summary indexes
are used to store aggregate values. A summary index could be used to store
a separate copy of events, but this is not usually the case. The more events
you have in your summary index, the less advantage it has over the
original index.
When the possible number of categories of data is huge: For example, if
you want to know the top IP addresses seen per day, it may be tempting
to simply capture a count of every IP address seen. This can still be a huge
amount of data, and may not save you a lot of search time, if any. Likewise,
simply storing the top 10 addresses per slice of time may not give an accurate
picture over a long period of time. We will discuss this scenario under the
Calculating top for a large time frame section.
When it is impractical to slice the data across sufficient dimensions:
If your data has a large number of dimensions or attributes, and it is useful
to slice the data across many of these dimensions, then the resulting
summary index may not be enough smaller than your original index
to be worth the effort.
When it is difficult to know the acceptable time slice: As we set up a few
summary indexes, we have to pick the slice of time to which we aggregate.
If you think 1 hour is an acceptable time slice, and you find out later that you
really need 10 minutes of resolution, it is not the easiest task to recalculate
the old data into these 10-minute slices. It is, however, very simple to later
change your 10-minute search to one hour, as the 10-minute slices should
still work for your hourly reports.
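The last point, that fine-grained slices can always be combined into coarser ones, can be sketched outside of any search language. The following is a minimal illustration in Python, not how a summary index is actually queried: the function name and the sample rows are hypothetical, and it assumes each summary row holds a simple event count for a 10-minute slice.

```python
from collections import defaultdict
from datetime import datetime


def roll_up_to_hours(ten_minute_counts):
    """Sum (slice_start, count) pairs into one total per hour."""
    hourly = defaultdict(int)
    for slice_start, count in ten_minute_counts:
        # Truncate the slice start to the top of its hour.
        hour_start = slice_start.replace(minute=0, second=0, microsecond=0)
        hourly[hour_start] += count
    return dict(hourly)


# Hypothetical summary rows: six 10-minute slices in one hour, one in the next.
slices = [
    (datetime(2024, 1, 1, 9, 0), 12),
    (datetime(2024, 1, 1, 9, 10), 7),
    (datetime(2024, 1, 1, 9, 20), 9),
    (datetime(2024, 1, 1, 9, 30), 15),
    (datetime(2024, 1, 1, 9, 40), 3),
    (datetime(2024, 1, 1, 9, 50), 4),
    (datetime(2024, 1, 1, 10, 0), 8),
]

hourly_counts = roll_up_to_hours(slices)
```

The reverse direction has no equivalent: an hourly total does not record how its events were spread across the hour, so 10-minute values cannot be recovered from it. Note also that this simple summing works for counts and sums; a value such as an average would need its underlying count stored alongside it to be combined correctly.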
 