[Plot: curves for Stream A, Stream B, Stream C, and Aggregate; horizontal axis: Days to Detection (2 to 7); vertical axis ticks from 0 to 2.]
Figure 9.4
AMOC curves for univariate detectors (Streams A, B, and C) compared against the result of
using Fisher's method of p-value aggregation. The vertical axis of the graph corresponds to the
average frequency of detections outside of the duration of known events; the horizontal axis
denotes the time to detection counted from the first day of the event.
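The two axes described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `amoc_point` and `amoc_curve`, the per-day normalization of false alerts, and the penalty assigned to a missed event are all assumptions made for illustration. The sketch handles a single scored series with one known event; averaging over many injections, as in the figure, would repeat it per injection.

```python
def amoc_point(scores, event_start, event_length, threshold):
    """Return (false_alerts_per_day, days_to_detection) for one threshold.

    scores       -- daily detector scores, higher meaning more anomalous
    event_start  -- index of the first day of the known event
    event_length -- number of days the known event lasts
    """
    event_days = range(event_start, event_start + event_length)

    # Alerts raised outside the known event count as false alerts.
    outside = [s for day, s in enumerate(scores) if day not in event_days]
    false_alerts = sum(1 for s in outside if s >= threshold)
    false_rate = false_alerts / len(outside) if outside else 0.0

    # Days to detection, counted from the first day of the event (day 1).
    # Assumption: a missed event is charged event_length + 1 days.
    delay = event_length + 1
    for offset, day in enumerate(event_days, start=1):
        if day < len(scores) and scores[day] >= threshold:
            delay = offset
            break
    return false_rate, delay


def amoc_curve(scores, event_start, event_length, thresholds):
    """Sweep the alert threshold to trace one AMOC curve."""
    return [amoc_point(scores, event_start, event_length, t) for t in thresholds]


# Hypothetical scores: one spurious spike on day 1, event on days 3 to 6.
example_scores = [0.1, 0.9, 0.2, 0.3, 0.6, 0.8, 0.95, 0.2]
example_curve = amoc_curve(example_scores, 3, 4, [0.5, 0.7, 0.99])
```

Lowering the threshold shortens the detection delay but tends to raise the false alert rate, which is the trade-off an AMOC curve makes visible.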
passed and failed sanitary inspections conducted at slaughterhouses regulated
by the U.S. Department of Agriculture (Stream C). The task is to detect
known patterns of increased activity that were synthetically injected into the
actual field data in order to measure the power of the detectors under
consideration. The synthetic activity manifests as linearly increasing event
counts, induced independently but starting on the same date, over seven
consecutive days. The average performance of the univariate temporal anomaly
detectors and of Fisher's aggregate, computed on a sample of 100 synthetic
injections, is shown in Figure 9.4. The graph, which plots Activity Monitoring
Operating Characteristic (AMOC) curves, clearly shows the benefit of
aggregating corroborating evidence across multiple streams of data. The
aggregate detector reliably calls the event on its fourth day, two days
ahead of the best of the univariate methods. Earlier detections are also
possible, at the cost of additional alerts generated outside the known
synthetic events. The average frequency of such alerts is substantially lower
with evidence aggregation than with the less informed alternatives.
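Fisher's method of p-value aggregation can be sketched in a few lines of Python. The sketch is not taken from the source: the function names are hypothetical, and it assumes that the p-values reported by the individual stream detectors are independent under the null hypothesis. The combined statistic is -2 times the sum of the log p-values, which under that assumption follows a chi-squared distribution with 2k degrees of freedom for k streams. For an even number of degrees of freedom the tail probability has a closed form, so only the standard library is needed.

```python
import math


def chi2_sf_even_df(x, k):
    """Upper-tail probability of a chi-squared variable with 2*k degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!,
    which is exact when the number of degrees of freedom is even.
    """
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent p-values into one aggregate p-value (Fisher's method)."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, len(p_values))


# Hypothetical p-values from three stream detectors on the same day.
combined = fisher_combine([0.05, 0.05, 0.05])
```

In this illustrative case three moderately small p-values combine into an aggregate p-value smaller than any of them, which is how corroborating evidence across streams can trigger an alert sooner than any single stream would.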
9.4 Temporal Aggregation for Cross-Stream Analysis
Understanding relationships between disparate streams of event data can
be difficult. The problems increase if the available data are sparse and if the
events under consideration occur infrequently. In such cases, straightforward
application of classical regression methods may not yield statistically reliable
results due to the elevated risk of overfitting. Sometimes, simplification of
 