Database Reference
In-Depth Information
since
B
preserves dot products as an orthonormal matrix (by inductive
hypothesis) and
v
(
w
0
,l
)
are orthonormal by construction.
i
The detailed hierarchical SVD algorithm is shown below. In practice,
the maximum level
L
is determined based on the length
m
of the time
series so far,
L
≈
log
W
(
m/w
0
).
m
,
w
0
,
W
,
L
,
k
=6)
Algorithm 4
Hierarchical (
x
∈
R
{
Start with level
l
= 0, corresponding to window
w
0
}
V
(
w
0
,
0)
,
P
(
w
0
,
0)
,
Σ
(
w
0
,
0)
,π
(
w
0
,
0)
←
LocalPattern(
x
,w
0
,k
)
W
l
{
Levels
l
, corresponding to window
w
l
=
w
0
·
}
for
level
l
=1to
L
do
V
(
w
0
,l
)
,
P
(
w
0
,l
)
,
Σ
(
w
0
,l
)
,π
(
w
0
,l
)
←
LocalPattern(
P
(
w
0
,l−
1)
,W,k
)
Compute patterns
v0
(
w
0
,l
)
for window size
w
l
are based on Equa-
i
tion (5.7)
end for
Choosing the initial window
The initial window
w
0
has some im-
pact on the quality of the approximations. This also depends on the
relationship of
k
to
w
0
(the larger
k
is, the better the approximation
and if
k
=
w
0
then
P
(
w
0
,
1)
=
X
(
w
0
)
, i.e., no information is discarded at
the first level). However, we want
k
to be relatively small since, as we will
see, it determines the buffering requirements of the streaming approach.
Hence, we fix
k
= 6. We found that this simple choice works well for
real-world sequences, but we could also use energy-based thresholding
[33], which can be done incrementally.
If
w
0
is too small, then we discard too much of the variance too
early. If
w
0
is unnecessarily big, this increases buffering requirements
and the benefits of the hierarchical approach diminish. In practice, a
good compromise is a value in the range 10
≤ w
0
≤
20.
Finally, out of the six patterns we keep per level, the first two or
three are of interest and reported to the user. The remaining are kept
to ensure that
X
(
w
0
,l
)
is a good approximation of
X
(
w
l
)
.
Choosing the scales
As discussed in Section 8.1.1, if there is a sharp
drop of
π
(
T
)
at window
w
=
T
, then we will also observe drops at
multiples
w
=
iT
,
i
=2
,
3
,...
. Therefore, we choose a few different
starting windows
w
0
and scale factors
W
that are relatively prime to
each other. In practice, the following set of three choices is sucient