6.2. Control Charts
In this data set there are six different classes of control charts, synthetically
generated by the process in [Alcock and Manolopoulos (1999)]. Each time
series is of length $n = 60$, and it is defined by $y(t)$, with $1 \le t \le n$:
Normal: $y(t) = m + rs$, where $m = 30$, $s = 2$ and $r$ is a random number in $[-3, 3]$.
Cyclic: $y(t) = m + rs + a \sin(2\pi t/T)$. $a$ and $T$ are in $[10, 15]$.
Increasing: $y(t) = m + rs + gt$. $g$ is in $[0.2, 0.5]$.
Decreasing: $y(t) = m + rs - gt$.
Upward: $y(t) = m + rs + kx$. $x$ is in $[7.5, 20]$ and $k = 0$ before time $t_3$ and 1 after this time. $t_3$ is in $[n/3, 2n/3]$.
Downward: $y(t) = m + rs - kx$.
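The six definitions above can be sketched as a small generator. This is a minimal illustration and not the original generation code: the function name `control_chart` and the class labels are hypothetical, and it assumes that $r$ is redrawn at every time step while $a$, $T$, $g$, $x$ and $t_3$ are drawn once per series, uniformly from the stated intervals.

```python
import math
import random

N = 60            # series length n, from the text
M, S = 30.0, 2.0  # m and s, from the text


def control_chart(kind, n=N, rng=None):
    """Return one synthetic series y(1), ..., y(n) of the given class."""
    rng = rng or random.Random()
    # Per-series parameters, drawn once (an assumption of this sketch).
    a = rng.uniform(10, 15)             # cyclic amplitude
    T = rng.uniform(10, 15)             # cyclic period
    g = rng.uniform(0.2, 0.5)           # trend gradient
    x = rng.uniform(7.5, 20)            # shift magnitude
    t3 = rng.uniform(n / 3, 2 * n / 3)  # shift time

    series = []
    for t in range(1, n + 1):
        r = rng.uniform(-3, 3)   # noise redrawn at each step (assumption)
        base = M + r * S
        k = 1 if t >= t3 else 0  # treatment of t == t3 is an assumption
        if kind == "normal":
            y = base
        elif kind == "cyclic":
            y = base + a * math.sin(2 * math.pi * t / T)
        elif kind == "increasing":
            y = base + g * t
        elif kind == "decreasing":
            y = base - g * t
        elif kind == "upward":
            y = base + k * x
        elif kind == "downward":
            y = base - k * x
        else:
            raise ValueError(f"unknown class: {kind}")
        series.append(y)
    return series
```

Calling `control_chart("upward", rng=random.Random(0))` returns a list of 60 values; passing a seeded `random.Random` makes the series reproducible.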
Figure 5 shows two examples of each class. The data used was obtained
from the UCI KDD Archive [Hettich and Bay (1999)]. The results are shown
in Figure 6(A) and in Table 7. The results are clearly better for interval-based
literals, with the exception of early classification with very short
fragments. This is possibly because most of the literals refer to
intervals that come after these initial fragments. In any case, early
classification is not useful with such short fragments, because the error
rates are too large.
Variable Length Version. The Control data set was also altered to a
variable length variant, Control-Var. In this case the resulting series have
lengths between 30 and 60. The results are shown in Figure 6(B) and in
Table 8.
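The text does not say how the series were shortened. One plausible reading, shown here purely as an assumption, is that each length-60 series is truncated to a prefix whose length is drawn uniformly between 30 and 60. The helper name `to_variable_length` is hypothetical.

```python
import random


def to_variable_length(series, min_len=30, rng=None):
    """Keep a random-length prefix of `series` (assumed truncation scheme)."""
    rng = rng or random.Random()
    # randint is inclusive on both ends, so the full series may be kept.
    length = rng.randint(min_len, len(series))
    return series[:length]
```

With a 60-point input this yields series of 30 to 60 points, matching the range reported for Control-Var. Other schemes, such as dropping points from the start, would give the same range of lengths.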
6.3. Trace
This data set was introduced in [Roverso (2000)]. It is proposed as a
benchmark for classification systems of temporal patterns in the process industry.
The data set was generated artificially. There are four variables, and each
variable has two behaviors, as shown in Figure 7. The combination of the
behaviors of the variables produces 16 different classes. A training/test
partition is specified for this data set.
The length of the series is between 268 and 394. The running times
of the algorithms depend on the length of the series, so the data set was
preprocessed by averaging the values in groups of four. The examples in the
resulting data set have lengths between 67 and 99.
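The averaging step can be sketched as follows. The function name `average_in_groups` is hypothetical. The sketch assumes that a trailing group with fewer than four values is also averaged. That reading is an inference from the reported lengths: 268 / 4 = 67 exactly, while 394 / 4 = 98.5, which gives 99 only if the last partial group is kept.

```python
def average_in_groups(series, size=4):
    """Replace each consecutive group of `size` values by its mean.

    A shorter trailing group is averaged as well (assumption).
    """
    reduced = []
    for start in range(0, len(series), size):
        group = series[start:start + size]
        reduced.append(sum(group) / len(group))
    return reduced
```

For example, a six-value series `[1, 2, 3, 4, 5, 6]` reduces to two values: the mean of the first four and the mean of the remaining two.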