[Figure: Transformation Sequence-1, made up of two transformations: (1) remove attributes product1, product2; (2) bin 'income' as [0..80K) -> '1' and [80K..300K) -> '2'.]
Figure 18-2 Transformations sequence example.
Generally, case filtering transformations are excluded from the
reusable TransformationSequence. The first Transformation object
defines which attributes to exclude from the dataset. Based on the
specification to remove attributes found to be constants, the data
mining engine (DME) removed product1 and product2. Here, perhaps
all cases show that the product was purchased, and hence the
attributes provide no useful information for mining. The second
Transformation instance contains the specific binning of the income
attribute. Here, the DME selected $80K as the split point for the
binning, assigning income values from $0 up to (but not including)
$80,000 to bin 1, and values from $80,000 up to (but not including)
$300,000 to bin 2.
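
The two steps described above can be sketched in plain Java. This is only an illustration of the idea, not the JDM API: the class TransformationSequenceSketch, its methods, and the representation of cases as maps are all hypothetical, and returning null for incomes outside the two bins is an assumption. The sketch finds the constant attributes once, on the build data, and then reuses that decision together with the fixed bin boundaries whenever the sequence is applied.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch; none of these names belong to the JDM API.
class TransformationSequenceSketch {

    // Attributes found to be constant in the build data (step 1).
    final Set<String> removedAttributes;

    private TransformationSequenceSketch(Set<String> removedAttributes) {
        this.removedAttributes = removedAttributes;
    }

    // Derive the reusable sequence from the build data:
    // an attribute is constant if every case holds the same value for it.
    static TransformationSequenceSketch build(List<Map<String, Object>> cases) {
        Set<String> constants = new HashSet<>();
        if (!cases.isEmpty()) {
            for (String attr : cases.get(0).keySet()) {
                Object first = cases.get(0).get(attr);
                boolean constant = true;
                for (Map<String, Object> c : cases) {
                    if (!Objects.equals(first, c.get(attr))) {
                        constant = false;
                        break;
                    }
                }
                if (constant) {
                    constants.add(attr);
                }
            }
        }
        return new TransformationSequenceSketch(constants);
    }

    // Step 2: half-open bins as in Figure 18-2: [0..80K) -> "1", [80K..300K) -> "2".
    static String binIncome(double income) {
        if (income >= 0 && income < 80_000) {
            return "1";
        }
        if (income >= 80_000 && income < 300_000) {
            return "2";
        }
        return null; // assumption: values outside both bins are left unassigned
    }

    // Apply both transformations, in order, to any dataset with the same attributes.
    List<Map<String, Object>> apply(List<Map<String, Object>> cases) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> c : cases) {
            Map<String, Object> t = new LinkedHashMap<>(c);
            t.keySet().removeAll(removedAttributes);
            if (t.containsKey("income")) {
                double income = ((Number) t.get("income")).doubleValue();
                t.put("income", binIncome(income));
            }
            out.add(t);
        }
        return out;
    }
}
```

Because the removed attributes are stored in the object rather than recomputed, applying the same instance to new data repeats exactly the transformations chosen at build time, which is the point of a reusable sequence.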
18.2 Time Series
A time series is a sequence of numerical values, ordered in time.
Examples of time series data include the Dow Jones Industrial Average
(DJIA) daily values over the course of a year, a retailer's sales each
day for the quarter, or the hourly rate of production on an assembly
line dating back to the first production run. Figure 18-3 depicts the
DJIA over the period April 2005-April 2006. This time series data is
called the signal or, in data mining terminology, the target attribute. In
some time series data, additional information can be provided, called
interventions, such as a stock market crash, retail sale promotion, or
failure of a key piece of equipment on the assembly line. Interventions
consist of unusual or irregularly occurring events. They can be
specified as one-time events, or events that occur over several periods in
the time series, with a certain characteristic (e.g., a slow decay rate) or
a constant impact over the period.
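
One way to picture the intervention shapes listed above is as a per-period effect laid alongside the signal. The following is a minimal sketch under stated assumptions: the class InterventionSketch and its effect method are hypothetical names, and modeling decay as a fixed multiplier per period is one simple choice, not something the text prescribes. A duration of 1 gives a one-time event, and a decay of 1.0 gives a constant impact over the period.

```java
// Hypothetical sketch; the names and the geometric-decay model are assumptions.
class InterventionSketch {

    // Returns the intervention's effect at each period of a series.
    //   length   number of periods in the time series
    //   start    index of the first affected period (assumed >= 0)
    //   duration number of periods the intervention lasts (1 = one-time event)
    //   impact   size of the effect in the first affected period
    //   decay    multiplier applied each period (1.0 = constant impact)
    static double[] effect(int length, int start, int duration,
                           double impact, double decay) {
        double[] e = new double[length]; // periods outside the event stay 0.0
        double current = impact;
        int end = Math.min(length, start + duration);
        for (int t = start; t < end; t++) {
            e[t] = current;
            current *= decay;
        }
        return e;
    }
}
```

For example, effect(5, 1, 3, 8.0, 0.5) yields [0.0, 8.0, 4.0, 2.0, 0.0]: an event that starts in the second period, lasts three periods, and halves in strength each period.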
Time series analysis [Chatfield 2004] is the data mining technique
that extracts the underlying patterns present in the time series data.