dispersion of the value of a market measure
for a security over a specific period. It
frequently refers to return volatility, which
is the standard deviation of the returns over
the period. A higher volatility means that
the price of a security can change dramatically
over a short time period in either direction.
Written as a standard deviation, the formula of
return volatility is
$V = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (R_t - \bar{R})^2}$,
with $\bar{R}$ the mean of the returns $R_t$ over the period.
practice. Many semi-supervised learning
methods have been proposed in the literature.
Based on their underlying assumptions, they can
be organized into five classes: SSL with
generative models, SSL with low-density
separation, graph-based methods, co-training
methods and self-training methods.
The task of time series classification is to
map each time series to one of the predefined
classes. However, few exceptions in the stock
market are actually identified in reality, so
labeled examples are rare while unlabeled data
is abundant. For example, only about 30 insider
trading cases were found on the Hong Kong Stock
Exchange from 1997 to 2007, and it is almost
impossible to train a good classifier from so
few identified cases. An alternative is to use
semi-supervised classification techniques to
construct accurate classifiers from few labeled
examples.
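As a rough illustration of how a few labeled cases and many unlabeled series can be combined, the sketch below shows one simple form of self-training. It is only a sketch under stated assumptions: the base classifier (1-nearest-neighbour with Euclidean distance), the function names `self_train` and `classify`, and the equal-length series format are all hypothetical choices, not a method taken from the cited works.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def self_train(labeled, unlabeled):
    """Self-training with a 1-nearest-neighbour base classifier (assumed).

    labeled   -- non-empty list of (series, label) pairs
    unlabeled -- list of series without labels
    In each round the unlabeled series that lies closest to any labeled
    series receives that neighbour's label and joins the labeled set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        best = None  # (distance, index in pool, predicted label)
        for i, series in enumerate(pool):
            for known, label in labeled:
                d = euclidean(series, known)
                if best is None or d < best[0]:
                    best = (d, i, label)
        _, i, label = best
        labeled.append((pool.pop(i), label))
    return labeled

def classify(series, labeled):
    """Label a new series by its nearest neighbour in the labeled set."""
    return min(labeled, key=lambda pair: euclidean(series, pair[0]))[1]

# Hypothetical toy data: two labeled series, four unlabeled ones.
seed = [([0, 0, 0, 0], "normal"), ([0, 5, 0, 5], "exception")]
pool = [[0, 1, 0, 0], [0, 4, 0, 5], [1, 1, 0, 1], [0, 4, 1, 4]]
model = self_train(seed, pool)
print(classify([1, 3, 1, 4], model))
```

Labeling the most confident (here, the closest) series first lets later decisions build on the enlarged labeled set, which is the basic idea behind self-training. An early mistake is also propagated, so practical variants usually add a stopping criterion.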
In the return volatility formula, $V$ is the return
volatility over a time range of $T$ and $R_t$ is
the return.
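Since return volatility is described as the standard deviation of the returns over the period, it can be computed in a few lines. The sketch below makes two assumptions that the text does not state: returns are simple returns, (P_t - P_{t-1}) / P_{t-1}, and the standard deviation is the population form, dividing by T. The function names are hypothetical.

```python
import math

def simple_returns(prices):
    """Simple returns between consecutive prices (assumed return definition)."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def return_volatility(returns):
    """Population standard deviation of the returns over the period."""
    t = len(returns)
    if t == 0:
        raise ValueError("at least one return is required")
    mean = sum(returns) / t
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / t)

# Hypothetical closing prices for one security.
prices = [100.0, 110.0, 99.0, 108.9, 98.01]
print(return_volatility(simple_returns(prices)))
```

Dividing by T - 1 instead of T gives the sample standard deviation, which may be the better choice when the period is short.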
Spread is the difference between the bid
and ask prices of a security. The formula to
compute spread is $S = P_a - P_b$, where $S$ is
the spread, $P_b$ is the bid price and $P_a$ is the
ask price. Spread is used to measure
market liquidity, which refers to the ability
of securities to be bought or sold in the
market without causing a significant movement
in the price and with minimum loss
of value. In general, the wider the spread,
the lower the liquidity.
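The relation between spread and liquidity can be illustrated with a small sketch. It assumes the usual convention that the spread is the ask price minus the bid price, so that it is non-negative. The function names and the quotes are hypothetical.

```python
def spread(bid, ask):
    """Bid-ask spread, taken here as ask minus bid (assumed convention)."""
    return ask - bid

def rank_by_liquidity(quotes):
    """Order symbols from narrowest to widest spread.

    quotes -- dict mapping a symbol to a (bid, ask) pair
    A narrower spread is read as higher liquidity.
    """
    return sorted(quotes, key=lambda symbol: spread(*quotes[symbol]))

# Hypothetical quotes: symbol -> (bid price, ask price).
quotes = {"AAA": (10.0, 10.5), "BBB": (20.0, 20.25)}
print(rank_by_liquidity(quotes))
```

Comparing raw spreads across securities with very different price levels is crude, so this ranking is best read as a first approximation.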
Shape-Based Analysis
Application of Semi-Supervised Classification
Shape-based analysis is another possible way
of identifying exceptional patterns (Fu et al. 2007).
It is important to define the movement pattern
based on the shape of the time series. For example,
a time series may increase, decrease and then
increase again, which forms a wave shape. This
approach is based on the assumption that unique
shapes of time series correspond to some
exceptions in the stock market. The frequent
shapes across multiple time series are identified
first, and then the exceptional shapes are found.
However, it is challenging to define both the
shapes of time series movement and the similarity
between two time series. Another challenge is how
to split the series into time slices so that time
series can be compared.
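One very simple way to make the notion of a movement pattern concrete is to reduce a series to its sequence of increases and decreases. The sketch below is a toy encoding under that assumption, with hypothetical function names. It is not the method of Fu et al. (2007), and it ignores noise and the size of each move.

```python
def movement_shape(series):
    """Reduce a series to its sequence of movements.

    Each step is 'increase', 'decrease' or 'flat'; consecutive steps of
    the same kind are merged into one.
    """
    shape = []
    for prev, cur in zip(series, series[1:]):
        if cur > prev:
            step = "increase"
        elif cur < prev:
            step = "decrease"
        else:
            step = "flat"
        if not shape or shape[-1] != step:
            shape.append(step)
    return shape

def is_wave(series):
    """True if the series increases, decreases and then increases again."""
    return movement_shape(series) == ["increase", "decrease", "increase"]

# Hypothetical price series.
print(movement_shape([1, 2, 3, 2, 1, 2, 4]))
print(is_wave([1, 2, 3, 4]))
```

Two series with the same encoded shape can still differ greatly in scale and duration, which is one reason why defining similarity between shapes is hard.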
In shape-based analysis, the identification
of perceptually important points (PIPs) is the most
important issue. By identifying the PIPs, we can
measure or compare multiple time series. In
particular, it is challenging to identify the PIPs
Time series classification has attracted great
interest in the last decade. However, current research
assumes the existence of large amounts of labeled
training data. In reality, such data are often very
difficult or expensive to obtain. Therefore, a
semi-supervised technique for building time series
classifiers is valuable in many domains (Ratsaby
& Venkatesh 1995).
The idea of using unlabeled data to help
classification may initially sound counterintuitive.
However, several studies in the literature have
indicated the utility of unlabeled data for
classification. Learning from both labeled and
unlabeled data is called semi-supervised learning
(SSL) (Chapelle et al. 2006). Because
semi-supervised learning requires less human
labeling effort and can achieve higher accuracy
than learning from the labeled data alone, it is
of great interest both in theory and in