Outlier Detection in GARCH Models
Outliers are data points that are grossly different from or inconsistent with the rest of the data (Han & Kamber 2001). The usual strategy for outlier mining is to find a model that captures as much of the information in the normal data as possible and to treat samples inconsistent with that model as outliers. Based on this strategy, numerous successful outlier mining models have been proposed, which can be categorized into four approaches: the statistical approach, the distance-based approach, the deviation-based approach, and the density-based approach.
The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model was introduced by Bollerslev (1986). It is an econometric model for modeling and forecasting the time-dependent variance, and hence the volatility, of stock price returns. It expresses the current variance in terms of past variances. The parameters of the model are usually determined by Maximum Likelihood Estimation.
The basic GARCH model is typically referred to as the GARCH(1,1) model. The (1,1) in parentheses is standard notation in which the first number refers to how many autoregressive lags, or Autoregressive Conditional Heteroscedasticity (ARCH) terms (Gourieroux 1997), appear in the equation, while the second number refers to how many moving average lags are specified, often called the number of GARCH terms. Sometimes models with more than one lag are needed to obtain good variance forecasts.
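To illustrate the recursion described above, the following sketch computes GARCH(1,1) conditional variances; the parameter names omega, alpha and beta and the numeric values are assumptions for illustration only, since in practice they would be obtained by Maximum Likelihood Estimation.

import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    # GARCH(1,1): sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1],
    # i.e. one ARCH term (lagged squared return) and one GARCH term (lagged variance).
    sigma2 = np.empty(len(returns))
    sigma2[0] = np.var(returns)  # initialize with the sample variance
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical parameter values, for illustration only.
returns = np.random.normal(0.0, 0.02, size=500)
sigma2 = garch11_variance(returns, omega=1e-6, alpha=0.08, beta=0.90)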
The GARCH model is a popular approach to abnormal return detection. Franses and Dijk (2000) studied this issue and adapted the outlier detection method proposed by Chen and Liu (1993). They generated critical values for the relevant test statistic and evaluated their method in an extensive simulation study. This outlier detection and correction method was applied to ten years of weekly returns, from 1986 to 1995, on the stock markets of Amsterdam, Frankfurt, Paris, Hong Kong, Singapore and New York, which amounts to approximately 500 observations. Franses and Dijk (2000) then used weekly data from 1996 to 1998 to evaluate the out-of-sample forecast performance of conditional volatility with GARCH(1,1) models estimated on the series before and after outlier correction. The results show that correcting for a few outliers yields substantial improvements in out-of-sample forecasts.
Outlier Test
Dixon (1950) first introduced his ratio R to test for outliers in a sample. It has been shown to be robust and applicable to any distribution (Chernick 1982). In the Dixon Ratio Test, the range of the test values is calculated, and the results are used to measure the variation of the stock markets. The Dixon Ratio R is the ratio of the difference between the two highest values to the range of all samples. Let H1 be the highest value and H2 be the second highest value. Let LV be the lowest value.
R = (H1 - H2) / (H1 - LV) (1)
The closer the value of R is to one, the more likely it is that the highest value comes from another distribution and is an outlier with respect to the current set of values.
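As a minimal sketch of formula (1) (the function and variable names below are ours, not from Dixon 1950):

def dixon_ratio(values):
    # Dixon Ratio R from formula (1): (H1 - H2) / (H1 - LV).
    ordered = sorted(values, reverse=True)
    h1, h2 = ordered[0], ordered[1]  # highest and second highest values
    lv = ordered[-1]                 # lowest value
    return (h1 - h2) / (h1 - lv)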
The Dixon Ratio Test detects a single value that deviates extremely from the rest of the data. However, it fails to identify outliers when all of the top k highest values are outliers. Therefore, in our research, we used a modified Dixon Ratio to test for outliers (Luo et al. 2008). We define LA as the average of all values except the highest values and replace H2 in formula (1) with LA. Our test ratio is calculated as follows:
R = (H1 - LA) / (H1 - LV) (2)
This adjustment makes R suitable for detecting multiple outliers.
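A corresponding sketch of formula (2), under the reading that LA excludes only the single highest value (names again ours):

def modified_dixon_ratio(values):
    # Modified Dixon Ratio from formula (2): (H1 - LA) / (H1 - LV).
    ordered = sorted(values, reverse=True)
    h1, lv = ordered[0], ordered[-1]
    la = sum(ordered[1:]) / (len(ordered) - 1)  # average of all values except the highest
    return (h1 - la) / (h1 - lv)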