Description of V-BOMM and P-BOMM
results. In our research, we choose the latter be-
cause it keeps the original features of individual
time series. This approach also facilitates utilizing
the previous research outcomes.
Another issue is which measures to select as the
multiple time series for exception mining. For the
task of stock market surveillance, financial
experts offer valuable experience that can guide
the choice of measures. Price is the
most important measure of stock performance. We
can also use the outcomes of previous financial
research to choose measures. For example, there
is a great deal of financial research on the
relationship between abnormal behavior and the
response of the stock. Meulbroek (1992) studied
the relationship between insider trading, price
movement and trading amounts, and concluded that
these elements are associated. Fishe & Robe (2002)
reached a similar conclusion. Therefore, price
movement and trading amount are regarded as good
measures for detecting anomalies. Price movement
can be measured by the price return and the price
fluctuation range during one day. The price
fluctuation range is defined as the difference
between the highest price and the lowest price in
one day.
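The two price measures can be sketched in a few lines of Python. This is only an illustration: the text does not define price return, so the sketch assumes the simple close-to-close return, and the `DailyBar` type, the function names and the sample prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DailyBar:
    """One trading day of a stock (hypothetical record layout)."""
    high: float
    low: float
    close: float


def price_return(prev_close: float, close: float) -> float:
    # Assumed definition: simple one-day return between consecutive closes.
    return (close - prev_close) / prev_close


def fluctuation_range(bar: DailyBar) -> float:
    # As in the text: highest price minus lowest price in one day.
    return bar.high - bar.low


# Made-up sample data for three consecutive days.
bars = [
    DailyBar(high=10.5, low=9.8, close=10.0),
    DailyBar(high=11.2, low=10.1, close=11.0),
    DailyBar(high=11.1, low=10.4, close=10.45),
]

# One return per pair of consecutive days, one range per day.
returns = [price_return(p.close, c.close) for p, c in zip(bars, bars[1:])]
ranges = [fluctuation_range(b) for b in bars]
```

Each resulting list is one time series, so a stock described by return, fluctuation range and trading amount would yield three series of the kind used in the example that follows.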
Our OMM consists of two components: gen-
erators of outliers on individual time series and
integrators of multiple time series. The generators
produce outliers on each individual time series by
using existing outlier mining technologies.
Currently, we use VOMM (Qi & Wang 2004) for this
task, because it has been shown to be an effective
and efficient outlier mining technology for stock
market surveillance. The generated outliers are
then passed to the integrator, which combines the
multiple time series in order to refine the
results. We propose two integration approaches.
One is based on majority voting (V-BOMM) and
the other is based on probabilities (P-BOMM).
In order to illustrate our proposed OMM clearly,
we provide an example and demonstrate how the
V-BOMM and P-BOMM work.
Suppose we are given 100 points on three time
series X, Y and Z, described as
$[P_1(x_1, y_1, z_1), P_2(x_2, y_2, z_2), \ldots, P_{100}(x_{100}, y_{100}, z_{100})]$,
where $x_1, x_2, \ldots, x_{100}$ are the values of the
points on X, $y_1, y_2, \ldots, y_{100}$ are the values of
the points on Y, and $z_1, z_2, \ldots, z_{100}$ are the
values of the points on Z.
First, we generate three lists of candidate
outliers on each time series by using VOMM.
The number of candidate outliers is determined
based on domain experience. Generally speaking,
the fewer candidate outliers we keep, the more
accurate the result, but the worse the coverage. In
this example, we choose 3 candidate outliers on
each time series. Assume that the list of candidate
outliers obtained from time series X is
$[P_1, P_3, P_5]$, and the candidate outliers from Y
and Z are $[P_1, P_3, P_{10}]$ and $[P_1, P_5, P_2]$,
respectively. After that,
V-BOMM is used to refine the candidate outliers.
The V-BOMM produces the final outliers with
majority voting. There are 3 time series in total,
so the majority should be no less than 2. That is,
if a point appears in 2 or more lists of candidate
outliers, it will be regarded as one of the final
outliers. In the above example, $P_1$, $P_3$ and $P_5$ are
the final outliers because they appear in 2 or 3 of
the above lists. By contrast, $P_{10}$ and $P_2$ are not
included in the final outliers because each
appears in only one candidate list.
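The voting step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `v_bomm` is hypothetical, points are represented by their integer indices, and the threshold for an arbitrary number of series is assumed to be a strict majority (which gives 2 for the 3 series used here).

```python
from collections import Counter


def v_bomm(candidate_lists):
    """Keep the points that appear in a majority of the candidate lists."""
    # Assumed rule: strict majority, i.e. 2 votes when there are 3 lists.
    majority = len(candidate_lists) // 2 + 1
    # set() ensures a point gets at most one vote per time series.
    votes = Counter(p for candidates in candidate_lists for p in set(candidates))
    return sorted(p for p, n in votes.items() if n >= majority)


# Candidate outliers from the example, as point indices.
candidates_x = [1, 3, 5]
candidates_y = [1, 3, 10]
candidates_z = [1, 5, 2]

final_outliers = v_bomm([candidates_x, candidates_y, candidates_z])
print(final_outliers)  # [1, 3, 5]
```

Point 1 receives three votes, points 3 and 5 receive two each, and points 10 and 2 receive one each, so only the first three pass the threshold.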
P-BOMM produces the final points ranked by
their probability of being an outlier. First, we
generate three lists of candidate outliers, one per
time series, by using VOMM. At the same time, an
outlier test ratio is calculated based on Formula
(2). This ratio gives the probability of being an
outlier for each point. For example, one list of
candidate outliers could be
[{$P_1$, 98%}, {$P_3$, 92%}, {$P_9$, 88%}] on