13.13
What are the major challenges faced in bringing data mining research to market? Illustrate
one data mining research issue that, in your view, may have a strong impact on the
market and on society. Discuss how to approach such a research issue.
13.14
Based on your view, what is the most challenging research problem in data mining? If
you were given a number of years and a good number of researchers and implementors,
what would your plan be to make good progress toward an effective solution to such a
problem?
13.15
Based on your experience and knowledge, suggest a new frontier in data mining that was
not mentioned in this chapter.
13.8 Bibliographic Notes
For mining complex data types, there are many research papers and books covering
various themes. We list here some recent topics and well-cited survey or research articles
for reference.
Time-series analysis has been studied in statistics and computer science communities
for decades, with many textbooks such as Box, Jenkins, and Reinsel [BJR08];
Brockwell and Davis [BD02]; Chatfield [Cha03b]; Hamilton [Ham94]; and Shumway
and Stoffer [SS05]. A fast subsequence matching method in time-series databases
was presented by Faloutsos, Ranganathan, and Manolopoulos [FRM94]. Agrawal, Lin,
Sawhney, and Shim [ALSS95] developed a method for fast similarity search in the
presence of noise, scaling, and translation in time-series databases. Shasha and Zhu present
an overview of the methods for high-performance discovery in time series [SZ04].
Sequential pattern mining methods have been studied by many researchers,
including Agrawal and Srikant [AS95]; Zaki [Zak01]; Pei, Han, Mortazavi-Asl, et al.
[PHM-A+04]; and Yan, Han, and Afshar [YHA03]. The study on sequence classification
includes Ji, Bailey, and Dong [JBD05] and Ye and Keogh [YK09], with a survey by
Xing, Pei, and Keogh [XPK10]. Dong and Pei [DP07] provide an overview of sequence
data mining methods.
Methods for analysis of biological sequences, including Markov chains and hidden
Markov models, are introduced in many books or tutorials such as Waterman [Wat95];
Setubal and Meidanis [SM97]; Durbin, Eddy, Krogh, and Mitchison [DEKM98];
Baldi and Brunak [BB01]; Krane and Raymer [KR03]; Rabiner [Rab89]; Jones and
Pevzner [JP04]; and Baxevanis and Ouellette [BO04]. Information about BLAST
(see also Korf, Yandell, and Bedell [KYB03]) can be found at the NCBI web site
www.ncbi.nlm.nih.gov/BLAST/.
Graph pattern mining has been studied extensively, with contributions including Holder,
Cook, and Djoko [HCD94]; Inokuchi, Washio, and Motoda [IWM98]; Kuramochi and Karypis
[KK01]; Yan and Han [YH02, YH03a]; Borgelt and Berthold [BB02]; Huan, Wang,
Bandyopadhyay, et al. [HWB+04]; and the Gaston tool by Nijssen and Kok [NK04].