Databases Reference
In-Depth Information
[Figure: Complex Types of Data]
Sequence Data: time-series data (e.g., stock market data); symbolic sequences (e.g., customer shopping sequences, web click streams); biological sequences (e.g., DNA and protein sequences).
Graphs and Network Mining: homogeneous (nodes/links are of the same type) or heterogeneous (nodes/links are of different types); examples include graphs, social networks, and information networks.
Other Kinds of Data: spatial data, spatiotemporal data, cyber-physical system data, multimedia data, text data, web data, data streams.
Figure 13.1 Complex data types for mining.
web data, and data streams. Due to the broad scope of these themes, this section presents
only a high-level overview; these topics are not discussed in depth here.
13.1.1 Mining Sequence Data: Time-Series, Symbolic
Sequences, and Biological Sequences
A sequence is an ordered list of events. Sequences may be categorized into three groups,
based on the characteristics of the events they describe: (1) time-series data, (2) symbolic
sequence data, and (3) biological sequences. Let's consider each type.
Time-series data consist of long sequences of numeric data,
recorded at equal time intervals (e.g., per minute, per hour, or per day). Time-series
data can be generated by many natural and economic processes, such as stock markets,
and by scientific, medical, or natural observations.
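As a minimal sketch of the "equal time intervals" property, the following Python snippet represents a time series as (timestamp, value) pairs and checks that the sampling interval is constant. The helper name `is_equally_spaced` and the price values are made up for illustration; they do not come from the text.

```python
from datetime import datetime, timedelta


def is_equally_spaced(timestamps):
    """Return True if consecutive timestamps are separated by one constant interval."""
    if len(timestamps) < 3:
        return True  # zero or one interval is trivially constant
    step = timestamps[1] - timestamps[0]
    return all(b - a == step for a, b in zip(timestamps, timestamps[1:]))


# Hypothetical hourly observations (illustrative values only).
start = datetime(2024, 1, 2, 9, 30)
hourly_times = [start + timedelta(hours=i) for i in range(5)]
hourly_values = [101.2, 101.9, 100.7, 102.4, 103.0]
time_series = list(zip(hourly_times, hourly_values))

# A sequence with a missing observation is no longer equally spaced.
irregular_times = [start, start + timedelta(hours=1), start + timedelta(hours=3)]

print(is_equally_spaced(hourly_times))     # regular sampling
print(is_equally_spaced(irregular_times))  # irregular sampling
```

A check like this is a simple way to tell whether a data set fits the time-series description above or is better treated as an irregularly timed event sequence.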
Symbolic sequence data consist of long sequences of event or nominal data, which
typically are not observed at equal time intervals. For many such sequences, gaps (i.e.,
lapses between recorded events) do not matter much. Examples include customer shopping
sequences and web click streams, as well as sequences of events in science and
engineering and in natural and social developments.
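To illustrate why gaps often do not matter for symbolic sequences, here is a small Python sketch. It assumes a hypothetical web click stream recorded at irregular times, keeps only the order of events, and tests whether a pattern occurs as an ordered (not necessarily contiguous) subsequence. The function name `contains_subsequence` and the event names are illustrative assumptions.

```python
def contains_subsequence(sequence, pattern):
    """Return True if the items of `pattern` appear in `sequence` in the same
    order, allowing any number of other events in between."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until it finds `item`,
    # so each later item is searched for only after the earlier one.
    return all(item in remaining for item in pattern)


# Hypothetical click stream: (seconds since session start, page) pairs,
# observed at irregular intervals.
timed_clicks = [(0, "home"), (4, "search"), (95, "product"), (97, "cart"), (610, "checkout")]

# Order by time, then drop the timestamps: only the ordering is kept.
clicks = [page for _, page in sorted(timed_clicks)]

print(contains_subsequence(clicks, ["home", "cart"]))   # ordered pattern
print(contains_subsequence(clicks, ["cart", "home"]))   # wrong order
```

Discarding the timestamps is a design choice that matches the description above; an application in which the lapse between events is significant would need to keep them.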
Biological sequences include DNA and protein sequences. Such sequences are typically
very long, and carry important, complicated, but hidden semantic meaning. Here,
gaps are usually important.
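One way to see how gaps can matter for biological sequences is to score two already-aligned sequences in which a gap is written as "-". The Python sketch below is only an illustration: the function name `score_alignment` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example, not values given in the text.

```python
def score_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Score two aligned sequences of equal length, position by position.
    A '-' in either sequence marks a gap and is penalized explicitly."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # gap penalty (assumed value)
        elif x == y:
            score += match      # identical symbols
        else:
            score += mismatch   # substituted symbol
    return score


print(score_alignment("ACGT", "ACGT"))  # four matches
print(score_alignment("ACTT", "ACGT"))  # one mismatch
print(score_alignment("AC-T", "ACGT"))  # one gap
```

With these assumed penalties, a single gap lowers the score more than a single mismatch, which is one simple way to encode the idea that gaps carry weight in this kind of data.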
Let's look into data mining for each of these sequence data types.
 