}
public ThresholdDetector(Forecaster f, int n) {
  this(f, n, 3);
}
public ThresholdDetector(Forecaster f) {
  this(f, 30);
}
When a new observation arrives, it is run through the forecaster to
determine the forecast error. If the magnitude of the error exceeds the
number of standard deviations specified by sigma, the observation is
considered an outlier. In that case, it is not used to update the standard
deviation of the errors, which keeps the outlier from skewing the estimate.
Otherwise, the standard deviation calculation is updated to reflect the
non-outlier value:
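public boolean observe(double y) {
  double err = y - f.forecast(y);
  //Only test once the window holds at least two errors. With an empty
  //window sig is 0, so every nonzero error would be flagged and s2
  //would never be updated.
  int count = values.size();
  if(count > 1) {
    //Divide by the number of stored errors, not the window size n,
    //so sig is not underestimated while the window is still filling
    double sig = Math.sqrt(s2/((double)count-1.0));
    //If this is an outlier don't include it in s2
    if(Math.abs(err)/sig > sigma)
      return true;
  }
  //Otherwise update our standard deviation
  s2 += err*err;
  //Drop the oldest squared error once the window is full
  if(values.size() == n)
    s2 -= values.removeFirst();
  values.add(err*err);
  return false;
}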
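The update rule above can be exercised end to end with a small, self-contained sketch. Everything outside the body of observe is an assumption made for illustration: the one-method Forecaster interface is a hypothetical stand-in (its real definition is not shown in this section), the class is named ThresholdDetectorSketch to keep it distinct from the listing above, sigma is taken to be a double, the window is an ArrayDeque of squared errors, and the warm-up guard that skips the test until two errors have been stored is an addition rather than part of the original listing.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the Forecaster type used in the text.
interface Forecaster {
    double forecast(double y);
}

// Illustrative sketch of a sliding-window threshold detector.
class ThresholdDetectorSketch {
    private final Forecaster f;
    private final int n;         // window size (number of squared errors kept)
    private final double sigma;  // threshold, in standard deviations
    private final Deque<Double> values = new ArrayDeque<>();
    private double s2 = 0.0;     // running sum of squared errors in the window

    ThresholdDetectorSketch(Forecaster f, int n, double sigma) {
        this.f = f;
        this.n = n;
        this.sigma = sigma;
    }

    boolean observe(double y) {
        double err = y - f.forecast(y);
        int count = values.size();
        // Assumed warm-up guard: no test until two errors are stored.
        if (count > 1) {
            double sig = Math.sqrt(s2 / (count - 1.0));
            // Outliers are reported but never folded into s2.
            if (Math.abs(err) / sig > sigma) {
                return true;
            }
        }
        s2 += err * err;
        // Evict the oldest squared error once the window is full.
        if (values.size() == n) {
            s2 -= values.removeFirst();
        }
        values.add(err * err);
        return false;
    }
}

class ThresholdDemo {
    public static void main(String[] args) {
        // Assumed toy forecaster: always predicts 10.0.
        ThresholdDetectorSketch d = new ThresholdDetectorSketch(y -> 10.0, 5, 3.0);
        double[] ys = {10.5, 9.5, 10.5, 9.5, 10.5, 9.5, 25.0, 10.5};
        for (double y : ys) {
            System.out.println(y + " outlier=" + d.observe(y));
        }
    }
}
```

With this toy input only the spike at 25.0 is reported as an outlier. Because the spike is never added to s2, the 10.5 that follows it is judged against the same standard deviation as before and passes.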
There are other approaches, but most of them use this same basic framework
for their updates. For example, rather than using the standard deviation,
many outlier detectors declare a value an outlier when it lies more than 1.5
or 3 times the interquartile range below the first quartile or above the
third quartile. This rule was originally used to identify outliers in boxplot
visualizations and has since been repurposed for outlier detection. Scan
statistic approaches generalize it further by using the percentiles of the
error to determine whether the process is in an outlier state.
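The interquartile-range rule can be sketched in the same spirit. This is a minimal illustration, not the book's code: the class and method names (IqrRule, percentile, isOutlier) are hypothetical, the percentile uses linear interpolation between order statistics (one of several common conventions), and the array of recent errors is assumed to be non-empty.

```java
import java.util.Arrays;

// Illustrative sketch of the boxplot-style outlier rule.
class IqrRule {
    // Percentile of an already sorted, non-empty array using linear
    // interpolation between the two nearest order statistics.
    static double percentile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    // True if err lies more than k interquartile ranges below the first
    // quartile or above the third quartile of the recent errors.
    static boolean isOutlier(double[] recentErrors, double err, double k) {
        double[] sorted = recentErrors.clone();
        Arrays.sort(sorted);
        double q1 = percentile(sorted, 0.25);
        double q3 = percentile(sorted, 0.75);
        double iqr = q3 - q1;
        return err < q1 - k * iqr || err > q3 + k * iqr;
    }
}

class IqrDemo {
    public static void main(String[] args) {
        double[] errors = {-2.0, -1.0, 0.0, 1.0, 2.0};
        // Quartiles are -1 and 1, so the 1.5 x IQR fences are -4 and 4.
        System.out.println(IqrRule.isOutlier(errors, 3.0, 1.5));
        System.out.println(IqrRule.isOutlier(errors, 5.0, 1.5));
    }
}
```

A streaming detector would keep the recent errors in a sliding window, as the standard deviation version does, and leave flagged values out of that window. Sorting on every call is adequate for small windows but would be replaced by an incremental quantile structure for large ones.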