Information Technology Reference
In-Depth Information
Fig. 2. Codebook Generation
noting that a subsequence is a subinterval of a time series data instance. For
example a subsequence is represented as x i =( x 2 ,x 3 ,x 4 ) with a window size 3.
To extract representative local patterns, we first extract subsequences from
all the time series data instances by a sliding window for each time series data
instance. More specifically, we will extract p
w +1 subsequences from each time
series data instance and we will have n ( p
w +1) subsequences from the training
data. Note that w is the window size. The clustering algorithms can be applied
to find the local patterns. In our experiments, we use k -means to cluster these
subsequences with d centroids. All these centroids are regarded as codewords of
the codebook D
w×d . The process is shown in Fig 2. In the following, we
will address how to use these features to describe a time series instance.
R
2.2 Feature Encoding with Codebook
w×d is learned. As we mentioned above, the
codewords of a codebook can be viewed as features of time series data instances.
Each feature (codeword) is represented as a local pattern which could improve
the ability to define the class. Thus, how to utilize these local patterns is critical
step for new representation.
For BoW model in text classification, we investigate the histogram of the
words in a document. That is, we utilize the distribution of the words in the
dictionary to classify the unlabeled document. Similar to text classification, we
also aim to check the distribution of local patterns (i.e., the codewords) for a
We assume that a codebook D
R
Search WWH ::




Custom Search