Fig. 4. An example taxonomy of symbols (root Any; children D, S and I; D has children D_S and D_W)
since their number is usually much smaller than the number of symbols in a
TA series. Such a process has been proposed in a CBR context in [14], and is
illustrated in figure 3.
3.2 TA-Based Retrieval in Medical CBR
Our research group is currently working on the definition of a time series retrieval
framework, which exploits TA for dimensionality reduction in medical domains.
In particular, our framework allows for multi-level abstractions, according to
two dimensions, namely a taxonomy of (trend or state) TA symbols, and a
variety of time granularities.
TA symbols can be organized in a conventional is-a taxonomy, in
order to provide different levels of detail in the description of episodes. An example
taxonomy of symbols for trend TA is the one illustrated in figure 4, in which
the symbol Any is specialized into D (decrease), S (stationary) and I (increase),
and D is further specialized into D_S (strong decrease) and D_W (weak decrease),
according to the slope.
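The taxonomy of figure 4 can be sketched in Python as follows. This is only an illustrative encoding, assuming child-to-parent links stored in a dictionary; the names PARENT, ancestors and abstract_to_level are hypothetical and are not part of the framework described here.

```python
# Hypothetical encoding of the is-a taxonomy of figure 4 as child -> parent links.
PARENT = {
    "D_S": "D",   # strong decrease is-a decrease
    "D_W": "D",   # weak decrease is-a decrease
    "D": "Any",
    "S": "Any",
    "I": "Any",
}

def ancestors(symbol):
    """Return the chain from `symbol` up to the root, most specific first."""
    chain = [symbol]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def abstract_to_level(series, level):
    """Describe a symbol series at taxonomy depth `level` (0 = root).

    Symbols deeper than `level` are replaced by their ancestor at that depth;
    symbols already at or above `level` are left unchanged.
    """
    result = []
    for symbol in series:
        chain = ancestors(symbol)
        depth = len(chain) - 1            # distance of the symbol from the root
        steps_up = max(0, depth - level)  # how far to climb in the taxonomy
        result.append(chain[steps_up])
    return result
```

For instance, abstract_to_level(["D_S", "D_W", "S", "I"], 1) describes both kinds of decrease simply as D, while leaving S and I unchanged.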
On the other hand, time granularities allow one to describe episodes at increasingly
more abstract levels of temporal aggregation (see figure 5). Obviously,
the number of levels in the time granularities taxonomy and in the symbol taxonomy,
as well as the dimension of granules, can be set differently depending on
the application domain. However, it is worth noting that our approach is applicable
in the absence of domain knowledge as well (in the worst case, by resorting
to flat taxonomies). Observe that the time dimension requires that aggregation
be “homogeneous” at every given level, in the sense that each granule at a
given level must be an aggregation of exactly the same number of consecutive
granules at the lower level (although this number may vary from level to level; for
instance, two 1-hour-long granules compose a 2-hour-long granule, while three
20-minute-long granules compose a 1-hour-long granule). This “homogeneity”
restriction is motivated by the fact that, in this way, the duration of
each episode is (implicitly) represented in the sequence of symbols. For example,
at the time granularity level of 20 minutes, the string DDDS represents a
1-hour episode of D followed by 20 minutes of S.
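The way durations are implicitly encoded can be illustrated with a short Python sketch. It assumes, for simplicity, that each symbol is a single character and that granule lengths are expressed in minutes; the function names episodes and granule_lengths are hypothetical.

```python
from itertools import groupby

def episodes(symbols, minutes_per_granule):
    """Collapse runs of equal symbols into (symbol, duration_in_minutes) pairs."""
    return [
        (symbol, sum(1 for _ in run) * minutes_per_granule)
        for symbol, run in groupby(symbols)
    ]

def granule_lengths(base_minutes, factors):
    """Granule length at each level of a homogeneous granularity taxonomy.

    `factors[i]` is the (fixed) number of granules at level i that compose
    one granule at level i + 1; it may differ from level to level.
    """
    lengths = [base_minutes]
    for factor in factors:
        lengths.append(lengths[-1] * factor)
    return lengths
```

With the values used in the text, episodes("DDDS", 20) yields a 60-minute episode of D followed by a 20-minute episode of S, and granule_lengths(20, [3, 2]) gives granules of 20, 60 and 120 minutes.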
Although the use of a symbol taxonomy and/or of a temporal granularity
taxonomy has already been advocated in other works (e.g. in a data warehouse
context, see [55]), to the best of our knowledge we are proposing the first approach
attempting to fully exploit the advantages of taxonomical knowledge in
 