Mining Linguistic Trends from Time Series - Data Mining: Foundations and Practice


data dimension was reduced, they further transformed the data into a discrete

representation and mined k -motifs from the transformed time series. Agrawal

et al. proposed an algorithm to capture shapes from a historical time-series

database by using a simple translation [2]. They first transformed the difference

value of every two adjacent data points into a predefined category, such

as increase, steep increase, steep decrease, decrease, no-change, and zero. The

same time series may be labeled with more than one category. In other words, the

intervals of these categories overlap slightly. The transformed

symbolic series were then used for querying desired results.
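
The labeling step described above can be illustrated with a minimal sketch. The category names come from the text, but the interval bounds below are purely illustrative assumptions (the cited work defines its own), and the helper name label_differences is hypothetical. The bounds are chosen so that neighbouring intervals overlap, which is why one difference value can receive two labels.

```python
# Hypothetical category intervals (lower, upper) on the difference value.
# The bounds are illustrative assumptions, chosen so that neighbours overlap.
CATEGORIES = {
    "steep decrease": (float("-inf"), -4.0),
    "decrease": (-5.0, -0.5),
    "no-change": (-1.0, 1.0),
    "zero": (0.0, 0.0),
    "increase": (0.5, 5.0),
    "steep increase": (4.0, float("inf")),
}


def label_differences(series):
    """For each pair of adjacent points, list every category whose
    interval contains the difference between them."""
    labels = []
    for prev, curr in zip(series, series[1:]):
        diff = curr - prev
        labels.append(
            [name for name, (lo, hi) in CATEGORIES.items() if lo <= diff <= hi]
        )
    return labels


if __name__ == "__main__":
    for labels in label_differences([10, 10, 10.8, 15.5, 9]):
        print(labels)
```

With these assumed bounds, a difference of about 0.8 falls into both "no-change" and "increase", which is the overlap the text refers to.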

Most of the above approaches, however, usually require predefined crisp

intervals for each category. They thus need domain knowledge and depend on

the application. Udechukwu et al. therefore proposed a domain-independent

trend-encoding method to mine frequent trends [13]. They transformed the difference

value between two adjacent data points into an angle, instead of the difference

value itself. The angles lay within the range −90° to 90° and were partitioned

into 52 predefined angular categories, represented by letters. They then used

the data structure of suffix trees to find the maximally repeated patterns as

frequent trends. In this way, the effect of the domain knowledge could be

reduced. Their approach, however, had too many angular categories, which

might make it hard for users to understand the meaning of the patterns.
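
A rough sketch of this kind of angle-based encoding follows. Several details are assumptions rather than the cited method: the 52 categories are taken to be equal-width bins labeled a-z then A-Z, the time step dt is fixed at 1, and a naive substring count stands in for the suffix tree (it reports every repeated substring, not only the maximal ones). The function names are hypothetical.

```python
import math
import string
from collections import Counter

# 52 symbols: a-z followed by A-Z. Equal-width bins are an assumption.
SYMBOLS = string.ascii_lowercase + string.ascii_uppercase


def encode_trend(series, dt=1.0):
    """Encode each pair of adjacent points as the letter of its angular bin."""
    symbols = []
    for prev, curr in zip(series, series[1:]):
        # With dt > 0 the angle lies strictly between -90 and 90 degrees.
        angle = math.degrees(math.atan2(curr - prev, dt))
        index = int((angle + 90.0) / 180.0 * len(SYMBOLS))
        symbols.append(SYMBOLS[min(index, len(SYMBOLS) - 1)])
    return "".join(symbols)


def repeated_patterns(encoded, min_len=2, min_count=2):
    """Count substrings naively (a stand-in for a suffix tree) and keep
    those that occur at least min_count times."""
    counts = Counter(
        encoded[i:i + n]
        for n in range(min_len, len(encoded) + 1)
        for i in range(len(encoded) - n + 1)
    )
    return {p: c for p, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    encoded = encode_trend([0, 1, 2, 1, 0, 1, 2, 1, 0])
    print(encoded)
    print(repeated_patterns(encoded))
```

Because the encoding depends only on angles, the same letters result whatever the application domain, but with 52 letters the mined strings are not easy to read, which is the drawback noted above.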

As to fuzzy data mining, Hong et al. proposed several fuzzy mining algorithms

to mine linguistic association rules from quantitative data [6, 7, 10].

They transformed each quantitative item into a fuzzy set and used fuzzy operations

to find fuzzy rules. Their approaches, however, focused on transaction

data. For time-series data, Song et al. proposed a fuzzy stochastic time series

and built a model by assuming the values are fuzzy sets [12]. Chen et al. proposed

a two-factor time-variant fuzzy time-series model to deal with forecasting

problems [4]. Au and Chan proposed a fuzzy mining approach to find fuzzy

rules for classifying time series [1]. Watanabe exploited the Takagi-Sugeno

model to build a time-series model [14].

In this chapter, we thus propose a mining algorithm based on angles of

adjacent points in a time series to find linguistic trends. Several fuzzy sets

for angles are predefined to represent semantic concepts understandable to

human beings. An Apriori-like fuzzy mining algorithm is then used to generate

linguistic trends. Appropriate post-processing is also performed to remove

redundant patterns. Since the final results are represented by linguistic terms,

they are easier for humans to understand than quantitative representations.
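
To make the idea of fuzzy sets over angles concrete, the following is a minimal sketch under stated assumptions. The three linguistic terms, their triangular membership functions, the fixed time step, and the min-based fuzzy count are all illustrative choices that are not taken from the chapter; the function names are hypothetical as well.

```python
import math

# Hypothetical triangular fuzzy sets over angles in degrees: (left, peak, right).
# Peaks at -90 and 90 let the outer sets reach full membership at the range ends.
FUZZY_SETS = {
    "falling": (-180.0, -90.0, 0.0),
    "stable": (-90.0, 0.0, 90.0),
    "rising": (0.0, 90.0, 180.0),
}


def to_angles(series, dt=1.0):
    """Angle in degrees of the segment between each pair of adjacent points."""
    return [math.degrees(math.atan2(b - a, dt)) for a, b in zip(series, series[1:])]


def membership(x, left, peak, right):
    """Triangular membership value of x."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_support(angles, pattern):
    """Average, over all window positions, of the minimum membership of
    consecutive angles in the consecutive terms of the pattern."""
    windows = len(angles) - len(pattern) + 1
    if windows <= 0:
        return 0.0
    total = 0.0
    for start in range(windows):
        total += min(
            membership(angles[start + i], *FUZZY_SETS[term])
            for i, term in enumerate(pattern)
        )
    return total / windows


if __name__ == "__main__":
    angles = to_angles([0, 1, 2, 2, 1])
    print(fuzzy_support(angles, ("rising",)))
    print(fuzzy_support(angles, ("rising", "stable")))
```

A pattern such as ("rising", "stable") can be read directly as "the series rises and then levels off", which is the kind of output linguistic terms are meant to provide. An Apriori-like procedure would keep only patterns whose support exceeds a user-chosen threshold and extend them term by term.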


2 Mining Linguistic Trends for Time Series

The proposed fuzzy mining algorithm integrates fuzzy sets, the Apriori

mining algorithm, and time-series concepts to find appropriate

linguistic trends from a time series. The proposed approach first transforms

data values into angles, and then uses a sliding window to generate continuous
