Mining Finite-State Automata with Annotations - Mining Software Specifications: Methodologies and Applications

2.5 Inferring FSA Annotated with Data-Flow Information

This section describes KLFA, a technique that derives FSAs annotated
with data-flow information [21]. While gkTail focuses on the values that are
assigned to attributes, KLFA focuses on the patterns of occurrence of values
across events within the same trace (we call these recurrences data-flow
patterns). KLFA represents data-flow patterns by replacing the monitored events
(both the event names and their attribute values) with new events that do not
include attributes but incorporate information about the occurrence of the
attribute values within the labels, as illustrated by the example in Figure 2.12.
KLFA implements three rewriting strategies that can identify different
data-flow patterns: global ordering, relative to instantiation, and relative
to access.
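To make the idea of rewriting concrete, the following is a minimal sketch of one possible rewriting, not KLFA's actual implementation. The three strategies are only named in this section, so the scheme used here (each concrete value is replaced by the order in which it first appears in the trace) is an assumption, as are the function name, the trace representation, and the `name_index` label format.

```python
from typing import Dict, List, Tuple

# Hypothetical trace representation: (event name, attribute values).
Event = Tuple[str, Tuple[str, ...]]


def rewrite_by_first_occurrence(trace: List[Event]) -> List[str]:
    """Replace each concrete attribute value with the order of its first
    appearance in the trace, and fold that number into the event label.

    This is an assumed, illustrative scheme; the section does not define
    how KLFA's rewriting strategies compute their labels.
    """
    first_seen: Dict[str, int] = {}
    rewritten: List[str] = []
    for name, values in trace:
        symbols = []
        for value in values:
            if value not in first_seen:
                # Number values 1, 2, 3, ... in order of first appearance.
                first_seen[value] = len(first_seen) + 1
            symbols.append(str(first_seen[value]))
        rewritten.append("_".join([name, *symbols]))
    return rewritten


trace = [
    ("open", ("f1",)),
    ("open", ("f2",)),
    ("read", ("f1",)),
    ("close", ("f1",)),
    ("close", ("f2",)),
]
print(rewrite_by_first_occurrence(trace))
```

Under this scheme, two traces that use different concrete values but the same pattern of reuse produce the same sequence of labels, which is the property that lets an FSA inferred from the rewritten traces capture recurrences rather than specific values.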

The KLFA inference process consists of two phases: data preprocessing and
model generation. In the data preprocessing phase, KLFA rewrites traces. In
the model generation phase, KLFA infers an FSA that incorporates data-flow
information from the preprocessed traces.

2.5.1 Preprocessing Data

KLFA rewrites the events in the traces in three steps. In the first step,
KLFA identifies clusters of related attributes, that is, attributes that refer
to homogeneous types. This step avoids identifying data-flow patterns that
incorrectly relate heterogeneous quantities. For instance, it may make sense
to relate occurrences of values that represent distances, but it does not make
sense to relate occurrences of values that represent distances with values that
represent names of persons. In the second step, each cluster of homogeneous
attributes is rewritten according to the three rewriting strategies implemented
by KLFA (global ordering, relative to instantiation, and relative to access),
thus producing three versions of each data cluster. In the third
step, KLFA heuristically identifies the best rewritten version of each cluster
among the three available alternatives. KLFA may select different rewriting
strategies for different data clusters in the same system.

2.5.1.1 Identifying Data Clusters

In the first step, KLFA automatically identifies sets of attributes that are
assigned homogeneous values, namely, the data clusters.
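As a rough illustration of this step, the sketch below groups attributes whose sets of distinct observed values overlap. It is a sketch under stated assumptions, not KLFA's actual heuristic: the overlap ratio (shared values divided by the size of the smaller set), the 0.5 threshold, the transitive merging of pairs into clusters, and all names are hypothetical choices made for the example.

```python
from itertools import combinations
from typing import Dict, List, Set


def identify_clusters(
    observed: Dict[str, Set[str]], threshold: float = 0.5
) -> List[Set[str]]:
    """Group attributes whose distinct value sets overlap enough.

    `observed` maps an attribute name to the set of distinct values it
    took in the traces. The ratio and threshold are assumptions.
    """
    # Union-find over attribute names, so that related pairs merge
    # transitively into a single cluster.
    parent = {attr: attr for attr in observed}

    def find(attr: str) -> str:
        while parent[attr] != attr:
            parent[attr] = parent[parent[attr]]  # path halving
            attr = parent[attr]
        return attr

    for a, b in combinations(observed, 2):
        smaller = min(len(observed[a]), len(observed[b]))
        if smaller == 0:
            continue  # an attribute with no values cannot be compared
        shared = len(observed[a] & observed[b])
        if shared / smaller >= threshold:
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for attr in observed:
        clusters.setdefault(find(attr), set()).add(attr)
    return list(clusters.values())


observed = {
    "open.file": {"f1", "f2", "f3"},
    "close.file": {"f1", "f2"},
    "login.user": {"alice", "bob"},
    "logout.user": {"alice"},
}
clusters = identify_clusters(observed)
print(sorted(sorted(c) for c in clusters))
```

In this example the file attributes and the user attributes end up in separate clusters because they never share a value, which mirrors the goal of not relating heterogeneous quantities.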

KLFA automatically identifies data clusters by comparing the values
assigned to attributes in the traces. Given the sets of distinct values assigned
to two attributes in the traces, KLFA heuristically assumes that these two
attributes refer to the same or a comparable quantity if they share a relevant
