2. To remove outlying observations based on patterns between two or more
attributes:
a. Isolate observation(s) using the filter sliders.
b. Right-click on the slider and select “Make dataset from filter”.
c. In the Control Center:
i. Drag and drop the newly created dataset of the isolated outlying
observation(s) onto the parent dataset.
ii. Select “Create dataset from difference” to generate a dataset
containing all but the isolated observation(s).
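The filter-then-difference steps above can be sketched outside the tool as follows. This is only an illustration in plain Python, not the tool's own implementation: the sample rows, the attribute names (id, income, spend), and both function names are hypothetical, chosen to mirror the menu labels.

```python
# Hypothetical sample data; the attribute names are illustrative only.
parent = [
    {"id": 1, "income": 42_000, "spend": 1_200},
    {"id": 2, "income": 45_000, "spend": 1_350},
    {"id": 3, "income": 15_000, "spend": 9_800},  # unusual combination
    {"id": 4, "income": 51_000, "spend": 1_500},
]


def make_dataset_from_filter(rows, predicate):
    """Return the rows that pass the filter (the isolated observations)."""
    return [row for row in rows if predicate(row)]


def create_dataset_from_difference(rows, isolated, key="id"):
    """Return every row of `rows` whose key does not appear in `isolated`."""
    isolated_keys = {row[key] for row in isolated}
    return [row for row in rows if row[key] not in isolated_keys]


# Filter on a pattern between two attributes: low income with high spend.
outliers = make_dataset_from_filter(
    parent, lambda r: r["income"] < 20_000 and r["spend"] > 5_000
)

# The difference keeps all but the isolated observation(s).
cleaned = create_dataset_from_difference(parent, outliers)
print([row["id"] for row in cleaned])  # prints [1, 2, 4]
```

Matching on a key column assumes each observation has a unique identifier; without one, the difference would have to compare whole rows.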
Dimension reduction
Dimensions (attributes) should be removed from a dataset when they are not
considered applicable or of value with respect to a planned data mining task.
Some attributes are obvious candidates for elimination. For example,
customer account numbers would not be expected to contribute to any planned
pattern analyses. Use features of the Control Center to eliminate these
attributes. Other attributes should be eliminated if, after review, they appear
unrelated or weakly related to the planned data mining task. The correlation
matrix and parallel coordinate plot viewers can help to first identify and
then eliminate these attributes.
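As a rough illustration of what "weakly related" can mean, the sketch below computes the Pearson correlation of each attribute with a target attribute and lists those that fall below a cutoff. The column names, the sample values, and the 0.3 threshold are all assumptions made for this example; the viewers described here support visual review and do not prescribe a numeric cutoff.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column shows no linear relationship
    return cov / sqrt(var_x * var_y)


def weakly_related(columns, target, threshold=0.3):
    """Names of attributes whose |correlation| with `target` is below threshold."""
    return [
        name
        for name, values in columns.items()
        if name != target and abs(pearson(values, columns[target])) < threshold
    ]


# Hypothetical columns; the names and values are illustrative only.
columns = {
    "ad_spend": [1, 2, 3, 4, 5],
    "shoe_size": [9, 7, 10, 7, 9],
    "sales": [2, 4, 6, 8, 10],
}

weak = weakly_related(columns, target="sales")
print(weak)  # prints ['shoe_size']
```

A low linear correlation is only a hint: an attribute can still matter through a non-linear relationship, which is one reason to review the parallel coordinate plot before eliminating it.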
Attribute elimination using the Control Center:
1. Selectively removing columns:
a. Right-click on the dataset.
b. Select “Create derived dataset”.
c. Check only those columns to be included in the new dataset.
2. Creating a single summary attribute to replace multiple attributes:
a. While in the “Create derived dataset” option, create a “computed
column” based on the attributes to be summarized. For example, create
a column TotalSales as the sum of SalesX, SalesY, and SalesZ.
b. Remove (leave unchecked) the newly summarized attributes from the
derived dataset.
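Both Control Center procedures above amount to building a new dataset from chosen columns plus optional computed columns. A minimal sketch in plain Python follows, reusing the TotalSales example; the function name create_derived_dataset, the Region column, and the sample values are hypothetical and do not reflect the tool's internals.

```python
# Hypothetical sample data; only SalesX, SalesY, and SalesZ come from the text.
rows = [
    {"Region": "North", "SalesX": 10, "SalesY": 20, "SalesZ": 30},
    {"Region": "South", "SalesX": 5, "SalesY": 15, "SalesZ": 25},
]


def create_derived_dataset(rows, keep, computed=None):
    """Build a new dataset from the checked columns plus computed columns.

    keep:     names of the columns to carry over (the checked boxes)
    computed: mapping of new column name -> function of the original row
    """
    computed = computed or {}
    derived = []
    for row in rows:
        new_row = {name: row[name] for name in keep}
        for name, fn in computed.items():
            new_row[name] = fn(row)  # computed from the original row
        derived.append(new_row)
    return derived


# Keep Region, add TotalSales, and leave the three summarized columns unchecked.
derived = create_derived_dataset(
    rows,
    keep=["Region"],
    computed={"TotalSales": lambda r: r["SalesX"] + r["SalesY"] + r["SalesZ"]},
)
print(derived[0])  # prints {'Region': 'North', 'TotalSales': 60}
```

Computing the summary from the original row means the source columns can be dropped in the same step, as in step 2b.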
Attribute elimination using the correlation matrix:
1. Ctrl-click on unwanted attribute names.
2. Click the “Create Subset” button to save.