Database Reference
In-Depth Information
is still less than max error ) then the algorithm continues to increment
i
as in the classic algorithm. Otherwise they begin to decrement
until the
measured error is less than max error . This optimization can greatly speed
up the algorithm if the mean length of segments is large in relation to
the standard deviation of their length. The monotonically non-decreasing
property of residual error also allows binary search for the length of the
segment. Surprisingly, no one we are aware of has suggested this.
The Sliding Window algorithm can give pathologically poor results
under some circumstances, particularly if the time series in question con-
tains abrupt level changes. Most researchers have not reported this [Qu
et al. (1998), Wang and Wang (2000)], perhaps because they tested the
algorithm on stock market data, and its relative performance is best on
noisy data. Shatkay (1995), in contrast, does notice the problem and gives
elegant examples and explanations [Shatkay (1995)]. They consider three
variants of the basic algorithm, each designed to be robust to a certain
case, but they underline the diculty of producing a single variant of the
algorithm that is robust to arbitrary data sources.
Park et al. (2001) suggested modifying the algorithm to create “ mono-
tonically changing ” segments [Park et al. (2001)]. That is, all segments con-
sist of data points of the form of
i
t 1 ≥ t 2 ≥ ··· ≥ t n .
This modification worked well on the smooth synthetic dataset it was
demonstrated on. But on real world datasets with any amount of noise,
the approximation is greatly overfragmented.
Variations on the Sliding Window algorithm are particularly popular
with the medical community (where it is known as FAN or SAPA), since
patient monitoring is inherently an online task [Ishijima et al. (1983), Koski
et al. (1995), McKee et al. (1994), Vullings et al. (1997)].
t 1 ≤ t 2 ≤ ··· ≤ t n or
2.2. The Top-Down Algorithm
The Top-Down algorithm works by considering every possible partitioning
of the times series and splitting it at the best location. Both subsections
are then tested to see if their approximation error is below some user-
specified threshold. If not, the algorithm recursively continues to split the
subsequences until all the segments have approximation errors below the
threshold. The pseudocode for the algorithm is shown in Table 3.
Variations on the Top-Down algorithm (including the 2-dimensional
case) were independently introduced in several fields in the early 1970's.
In cartography, it is known as the Douglas-Peucker algorithm [Douglas and
 
Search WWH ::




Custom Search