This is required because information is often omitted from each
record in the interest of reducing log sizes. For example, timestamps
in each record might be missing the year or time zone, which is
provided in the file name or known externally. The MDA uses
values that are provided in the batch's metadata to fill in the missing
fields. Without this standardization of timestamp data, apples-to-apples
comparisons of log files are impossible.
4. Event enrichment. User-specified metadata, such as server name,
data center name, or application name, can be associated with
machine data batches during the ingestion process. For example, the
server name is normally not included in batches of log records relating
directly to the server itself, but it would be very useful when analyzing
this information alongside batches of logs from other servers.
5. Event generalization. Machine data records usually contain
varying values, such as timestamps, IP addresses, measurements,
percentages, and messages. By replacing the varying values with
constant values (masking), the events can be generalized. Generalized
events are collected and given unique IDs, which are then used for
downstream analysis. These generalized events can be used in
frequent sequence analysis to identify which sequences of generalized
events occur most frequently. They can also be used in significance
testing to identify which generalized events are the most significant
with respect to a specific error. The fields to be masked can vary with
the log type. Event generalization is optional, so when users don't
provide any fields, generalization is not performed.
6. Extraction validation in BigSheets. Before running the extraction
operation, you can preview the results in BigSheets to ensure that
the correct fields are being extracted, and that the standardization,
enrichment, and generalization operations were applied correctly.
7. Extracted log storage. The resulting data is stored as compressed
binary files in a hierarchy of directories where each directory
contains the parsed log records from a batch of logs. The logs are
formatted as JSON records; each record contains the original log
record and the extracted fields for the log record.
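
The timestamp standardization and event enrichment steps described above can be illustrated with a minimal Python sketch. This is not the MDA's own code: the function names, the metadata keys (year, utc_offset_hours, server, data_center, application), and the input timestamp layout are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def standardize_timestamp(raw, batch_meta):
    """Complete a timestamp such as 'Mar 07 14:05:09' that lacks a year
    and a time zone, using values from the batch metadata (assumed keys)."""
    # Prepend the year before parsing so the date is complete from the start.
    parsed = datetime.strptime(f"{batch_meta['year']} {raw}", "%Y %b %d %H:%M:%S")
    tz = timezone(timedelta(hours=batch_meta["utc_offset_hours"]))
    return parsed.replace(tzinfo=tz).isoformat()


def enrich(record, batch_meta):
    """Return a copy of the record with a standardized timestamp and any
    user-specified batch metadata (hypothetical field names) attached."""
    enriched = dict(record)
    enriched["timestamp"] = standardize_timestamp(record["timestamp"], batch_meta)
    for key in ("server", "data_center", "application"):
        if key in batch_meta:
            enriched[key] = batch_meta[key]
    return enriched
```

Once every record carries a full timestamp and its server name, records from different batches can be sorted and compared on the same footing.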
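
Event generalization (masking) can be sketched in a similar way. The example below is a simplification under stated assumptions: it masks values with regular expressions applied to the message text, whereas the text above describes masking user-specified fields. The mask tokens, the patterns, and the EventCatalog class are hypothetical.

```python
import re

# Order matters: mask IP addresses and percentages before bare numbers.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d+(?:\.\d+)?%"), "<PCT>"),
    (re.compile(r"\b\d+(?:\.\d+)?\b"), "<NUM>"),
]


def generalize(message):
    """Replace varying values in a message with constant mask tokens."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return message


class EventCatalog:
    """Assign one stable ID to each distinct generalized event."""

    def __init__(self):
        self._ids = {}

    def event_id(self, message):
        template = generalize(message)
        # A new template gets the next ID; a known template keeps its ID.
        return self._ids.setdefault(template, len(self._ids) + 1)
```

Two messages that differ only in their IP address or retry count map to the same ID, which is what makes frequent sequence analysis and significance testing over generalized events possible.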
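
The storage layout in step 7 can be approximated as follows. The actual on-disk format is not specified here, so this sketch assumes gzip-compressed files with one JSON record per line; the file name part-00000.json.gz and the record keys original and fields are likewise hypothetical.

```python
import gzip
import json
from pathlib import Path


def store_batch(root, batch_id, records):
    """Write one batch of (original_line, extracted_fields) pairs into its
    own directory as a gzip-compressed file of JSON records."""
    batch_dir = Path(root) / batch_id
    batch_dir.mkdir(parents=True, exist_ok=True)
    out_path = batch_dir / "part-00000.json.gz"
    with gzip.open(out_path, "wt", encoding="utf-8") as fh:
        for original, fields in records:
            # Each record keeps the original log line next to its fields.
            fh.write(json.dumps({"original": original, "fields": fields}) + "\n")
    return out_path


def load_batch(path):
    """Read the JSON records back from a compressed batch file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh]
```

Keeping the original record alongside the extracted fields means an extraction can be audited or rerun without going back to the raw log files.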