Graph Clustering
Correlating Burst Events on Streaming Stock Market Data
A clustering algorithm divides records in a given
database into groups or clusters such that records
within a cluster are similar to each other and
records from different clusters are dissimilar. In
graph clustering, data sets can be represented
as weighted graphs, where nodes correspond to
the entities to cluster and edges correspond to a
similarity measure between those entities (Kannan
et al. 2000). The problem of graph clustering is
well studied and the literature on the subject is
very rich (Everitt 1980, Jain & Dubes 1988). The
best known graph clustering algorithms attempt
to optimize specific criteria such as k-median,
minimum sum, minimum diameter, etc. Other
algorithms are application-specific and take
advantage of the underlying structure or other
known characteristics of the data.
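
To make the optimization criteria concrete, the following minimal sketch scores two candidate partitions of a small weighted graph under two of the criteria named above. The distance values, the node names, and the helper names k_median_cost and max_diameter are illustrative assumptions, not taken from the cited works. Distances are used here as the inverse notion of edge similarity, and the k-median variant shown restricts each median to a member of its own cluster.

```python
from itertools import combinations

# Hypothetical symmetric distances between four entities
# (smaller distance = more similar).
DIST = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "c")): 4.0,
    frozenset(("a", "d")): 5.0,
    frozenset(("b", "c")): 3.0,
    frozenset(("b", "d")): 6.0,
    frozenset(("c", "d")): 1.5,
}

def dist(u, v):
    """Distance between two entities; zero for an entity and itself."""
    return 0.0 if u == v else DIST[frozenset((u, v))]

def k_median_cost(clusters):
    """Sum, over clusters, of distances to the best in-cluster median."""
    total = 0.0
    for cluster in clusters:
        total += min(sum(dist(m, x) for x in cluster) for m in cluster)
    return total

def max_diameter(clusters):
    """Largest pairwise distance inside any single cluster."""
    return max(
        (dist(u, v) for c in clusters for u, v in combinations(c, 2)),
        default=0.0,
    )

good = [["a", "b"], ["c", "d"]]  # groups similar entities together
bad = [["a", "c"], ["b", "d"]]   # mixes dissimilar entities

print(k_median_cost(good), max_diameter(good))
print(k_median_cost(bad), max_diameter(bad))
```

Under both criteria the first partition scores lower than the second, which matches the intuition that records within a cluster should be similar to each other.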
Palshikar and Apte (2008) study collusion set
detection using graph clustering. Their research
aims to help identify stock market manipulation.
In stock markets, some manipulations are carried
out by a collusion set of traders. For example, a
group of traders may engage in “heavy trading”
among themselves to create a false impression of
certain stocks and attract other investors to
buy. Palshikar and Apte (2008) use graph
clustering to address this problem. In their
model, the label φ(u, v) on a directed edge
(u, v) is the total quantity of shares sold by u
to v. Therefore, the higher the edge label, the
closer the vertices are in terms of “heaviness”
of trading. The authors test their models on
synthetic trading data and real data sets. The
results show that graph clustering can
effectively detect collusion sets.
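
The edge labeling described above can be sketched as follows, assuming trades are available as (seller, buyer, quantity) tuples. The trade records, the threshold parameter, and the grouping rule (connected components over trader pairs whose two-way volume is heavy) are illustrative assumptions. They are a simplified stand-in for a clustering step and are not the algorithm of Palshikar and Apte (2008).

```python
from collections import defaultdict

def edge_labels(trades):
    """phi[(u, v)] = total quantity of shares sold by u to v."""
    phi = defaultdict(int)
    for seller, buyer, qty in trades:
        phi[(seller, buyer)] += qty
    return dict(phi)

def heavy_groups(phi, threshold):
    """Group traders linked by pairs whose two-way volume >= threshold.

    Simplified stand-in for graph clustering: connected components
    of the undirected "heavy trading" graph.
    """
    volume = defaultdict(int)
    for (u, v), qty in phi.items():
        volume[frozenset((u, v))] += qty
    adj = defaultdict(set)
    for pair, qty in volume.items():
        if qty >= threshold and len(pair) == 2:  # skip self-trades
            u, v = tuple(pair)
            adj[u].add(v)
            adj[v].add(u)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        groups.append(sorted(comp))
    return groups

# Hypothetical trades: t1, t2, t3 trade heavily among themselves.
trades = [
    ("t1", "t2", 500), ("t2", "t3", 450), ("t3", "t1", 480),
    ("t2", "t1", 300), ("t4", "t5", 20), ("t1", "t4", 10),
]
phi = edge_labels(trades)
groups = heavy_groups(phi, threshold=400)
print(groups)
```

In this toy data only t1, t2 and t3 end up in a group, since the light trades involving t4 and t5 fall below the assumed threshold.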
One obvious limitation of the proposed approach
is that it does not classify the detected
collusion sets by type of manipulation, such as
marking the end and matched orders.
Burst events are events of importance that occur
within the same time frame. Identifying burst
events can significantly help monitoring and
surveillance tasks. In particular, identifying
burst events is critical for recognizing
anomalous activity in fraud detection
applications. For example, in stock markets, some
manipulations take the form of burst events, and
burst detection technologies are used to capture
suspicious activities in large stock market
volumes (Lerner & Shasha 2003).
Vlachos et al. (2008) provide a solution for
monitoring and identifying correlated burst
patterns in multi-stream time series databases.
The authors first identify the burst sections in
the data and then store them in an efficient
in-memory index for easy retrieval. The burst
detection scheme imposes a variable threshold on
the examined data and takes advantage of the
skewed distribution typically encountered in many
applications. The detected bursts are compacted
into burst intervals and stored in an interval
index. The index facilitates the identification
of correlated bursts by performing very efficient
overlap operations on the stored burst regions.
Their approach was tested on NYSE stock data,
where the target burst patterns were the events
in stock trading volumes during the days before
and after the 9/11 event. The results showed that
it efficiently detected burst events in
multi-stream time series datasets.
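
The three steps above (thresholding, compaction into intervals, and overlap queries) can be illustrated with a minimal sketch. It assumes the threshold is the series mean plus x standard deviations, which is one plausible reading of "variable threshold" and not a formula given in the text. A plain pairwise overlap check stands in for the interval index, and the two volume series are invented for illustration.

```python
from statistics import mean, pstdev

def burst_intervals(series, x=1.5):
    """Return [start, end] index intervals where values exceed mean + x*std."""
    threshold = mean(series) + x * pstdev(series)
    intervals = []
    for i, value in enumerate(series):
        if value > threshold:
            if intervals and intervals[-1][1] == i - 1:
                intervals[-1][1] = i      # extend the current burst
            else:
                intervals.append([i, i])  # start a new burst
    return intervals

def overlapping(a, b):
    """Pairs of intervals from a and b that share at least one time index."""
    return [(p, q) for p in a for q in b if p[0] <= q[1] and q[0] <= p[1]]

# Hypothetical daily trading volumes for two stocks.
vol_a = [10, 11, 9, 10, 50, 55, 10, 9, 11, 10]
vol_b = [20, 21, 19, 20, 20, 90, 95, 21, 20, 19]

bursts_a = burst_intervals(vol_a)
bursts_b = burst_intervals(vol_b)
common = overlapping(bursts_a, bursts_b)
print(bursts_a, bursts_b, common)
```

Because each threshold is derived from its own series, the two streams can be compared on their burst intervals even though their absolute volumes differ. A real interval index would avoid the quadratic pairwise comparison used here.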
A possible extension of this research is the
detection of cross-correlation between multiple
data streams based on their burst characteristics.
Artificial-Immune Abnormal-Trading-Detection System (AIAS)
As a biologically inspired system, the natural
immune system (NIS) has been studied in a variety
of areas, such as feature extraction,
self-regulation and