measure. In the univariate case (a), the medoid contains a large number of patterns that
recur in the time series objects of the corresponding cluster, making it an excellent
prototype. As expected, in the multivariate case (b), the medoid time series contains
fewer and shorter intervals of recurring patterns.
12.10 Application
Having introduced our recurrence plot-based distance measure, we are finally in
a position to present BestTime, a platform-independent Matlab application with
a graphical user interface, which enables us to find representatives that best
comprehend the recurring temporal patterns contained in a given time series dataset.
Although BestTime was originally designed to analyze vehicular sensor data and
identify characteristic operational profiles that comprise frequent behavior patterns
[ 32 ], our extended version [ 36 ] can be used to find representatives in arbitrary sets
of single- or multi-dimensional time series of variable length.
As described above, our approach to finding representatives in time series datasets
is based on agglomerative hierarchical clustering [ 14 ]. We define a representative as
the time series that is closest to the corresponding cluster center of gravity [ 25 ]. Since
we want a representative to comprehend the recurring temporal patterns contained
in the time series of the respective cluster, we need a distance measure that accounts
for similar subsequences regardless of their position in time [ 32 ].
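
The clustering and selection step can be sketched in a few lines of Python. This is a minimal illustration rather than the BestTime implementation: it assumes a precomputed matrix of pairwise distances, uses average linkage (the text does not state which linkage is used), and approximates "closest to the cluster center of gravity" by the medoid, i.e. the member with the smallest total distance to the other members. The matrix `D` and the function names are hypothetical.

```python
from itertools import combinations


def agglomerative_clusters(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed distance matrix.

    dist is a symmetric list of lists; returns a sorted list of clusters,
    each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average linkage.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


def representative(dist, cluster):
    """Return the member with the smallest total distance to the other members."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Hypothetical pairwise distances between five time series.
D = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.8, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.0],
]
clusters = agglomerative_clusters(D, n_clusters=2)
reps = [representative(D, c) for c in clusters]
print(clusters, reps)
```

Because only pairwise distances are needed, any distance measure can be plugged in by changing how `D` is computed.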
However, as mentioned before, traditional time series distance measures, such as
the Euclidean distance (ED) and Dynamic Time Warping (DTW), are not suited to
matching similar subsequences that occur in arbitrary order [ 1 , 4 ]. Hence, we proposed
to employ Recurrence Plots (RPs) and corresponding Recurrence Quantification
Analysis (RQA) [ 21 , 38 ] to measure the pairwise (dis)similarity of time series with
similar patterns at arbitrary positions [ 34 ]. Above, we introduced a novel recurrence
plot-based distance measure, which is used by our BestTime tool to cluster time
series and find representatives.
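
To illustrate why recurrence-based comparison helps, the following sketch builds a plain cross recurrence matrix for two univariate series and reports the longest diagonal line, which corresponds to the longest subsequence that both series share at any position. It is not the distance measure introduced earlier in this chapter; the threshold `eps`, the sample series, and all function names are assumptions chosen for the example.

```python
def cross_recurrence(x, y, eps):
    """Binary cross recurrence matrix: R[i][j] = 1 if |x[i] - y[j]| <= eps."""
    return [[1 if abs(xi - yj) <= eps else 0 for yj in y] for xi in x]


def longest_diagonal(R):
    """Length of the longest run of ones along any diagonal of R."""
    best = 0
    rows, cols = len(R), len(R[0])
    run = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if R[i][j]:
                # Extend the run ending at the upper-left neighbour, if any.
                run[i][j] = 1 + (run[i - 1][j - 1] if i and j else 0)
                best = max(best, run[i][j])
    return best


def euclidean(x, y):
    """Point-wise Euclidean distance between two equal-length series."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


# Two hypothetical series built from the same two patterns in swapped order.
ramp = [0.0, 1.0, 2.0, 3.0]
dip = [5.0, 4.0, 5.0, 4.0]
x = ramp + dip
y = dip + ramp

R = cross_recurrence(x, y, eps=0.1)
print(euclidean(x, y))      # large, although both series contain the same patterns
print(longest_diagonal(R))  # length of the longest shared subsequence
```

The point-wise Euclidean distance is large because the patterns are aligned at different positions, whereas the diagonal lines in `R` recover each four-sample pattern in full.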
In the following, we briefly describe the operation of our BestTime application
and illustrate the data processing for a small set of sample time series, see Figs. 12.8
and 12.9 . Please feel free to download our BestTime tool [ 36 ] to follow the stepwise
operating instructions given below.
Input Data. BestTime is able to analyze multivariate time series of the same
dimensionality and of variable length. Each individual time series needs to be stored
in a separate CSV (comma-separated values) file, where rows correspond to
observations and columns correspond to variables. Optionally, the first row may
specify the names of the variables. The user selects an input folder that should
contain all time series in the specified CSV format. A small set of sample time series
that we use as input is illustrated in Fig. 12.8.
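
For readers who want to prepare or check their own input folder, the layout described above can be read with a short Python sketch. This is not part of BestTime; the function names, the `var1`, `var2`, ... fallback names, and the rule that a first row containing any non-numeric cell is a header are assumptions made for the example.

```python
import csv
from pathlib import Path


def _is_number(text):
    """Return True if the cell can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def load_series(path):
    """Read one CSV file into (variable_names, rows_of_floats)."""
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle) if row]
    # Assumption: a first row with any non-numeric cell holds variable names.
    if rows and not all(_is_number(cell) for cell in rows[0]):
        names, rows = [c.strip() for c in rows[0]], rows[1:]
    else:
        names = [f"var{k + 1}" for k in range(len(rows[0]))] if rows else []
    return names, [[float(cell) for cell in row] for row in rows]


def load_folder(folder):
    """Load every *.csv file in folder and check that dimensionality matches."""
    dataset = {}
    for path in sorted(Path(folder).glob("*.csv")):
        dataset[path.name] = load_series(path)
    widths = {len(names) for names, _ in dataset.values()}
    if len(widths) > 1:
        raise ValueError(f"time series differ in dimensionality: {sorted(widths)}")
    return dataset
```

Each entry maps a file name to its variable names and its observations, one row per time step, so series of different lengths can sit side by side as long as they have the same number of columns.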