Each Variety in the dataset is now represented by a different color. You
should recognize that the clusters of points correspond to the different varieties.
Setosa is the lower right cluster; Versicolor and Virginica are in the upper left.
You should also note that within the Versicolor-Virginica cluster there is a direct
relationship between PetalLength and SepalWidth, rather than the inverse
relationship reported by the correlation matrix. A correlation computed across
all varieties at once can be dominated by the separation between clusters and
so hide the relationship inside each one.
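The gap between the pooled and within-cluster relationships can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own calculation: the `pearson` helper and the two small clusters are hypothetical, and the numbers are made-up values chosen to resemble two clusters, not actual iris measurements.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical (PetalLength, SepalWidth) pairs, for illustration only.
# Cluster A: short petals, wide sepals. Cluster B: long petals, narrower sepals.
cluster_a = [(1.3, 3.2), (1.4, 3.4), (1.5, 3.6), (1.6, 3.8)]
cluster_b = [(4.0, 2.4), (4.5, 2.7), (5.0, 3.0), (5.5, 3.3)]

def correlation_of(points):
    """Correlation between the first and second element of each pair."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return pearson(xs, ys)

within_a = correlation_of(cluster_a)             # direct within cluster A
within_b = correlation_of(cluster_b)             # direct within cluster B
pooled = correlation_of(cluster_a + cluster_b)   # inverse once pooled

print(f"within A: {within_a:+.2f}")
print(f"within B: {within_b:+.2f}")
print(f"pooled:   {pooled:+.2f}")
```

Within each cluster the two measures rise together, yet the pooled coefficient is negative because cluster B sits lower on SepalWidth and higher on PetalLength than cluster A. Coloring the points by category makes this visible in the plot.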
Suppose that the objective of your data mining activity is to determine a set
of classification rules to predict iris variety based on the four flower measurements.
The scatter plot can help you formulate those rules. For example, in the plot
of PetalLength versus PetalWidth, with Variety selected as the category
(Figure 2.14), you can clearly see that Setosa flowers have much smaller petals.
You can also see that Versicolor is next in size and Virginica is the largest. Note
also that although there is a distinct separation between Setosa and the others,
there is some overlap between Versicolor and Virginica, so it will be more
difficult to distinguish between these two varieties.
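Rules read off such a plot can be written down as simple threshold tests. The sketch below is one possible form, not rules stated in the text: the function name `classify_iris` and the cut-off values (2.5 cm for petal length, 1.75 cm for petal width) are assumptions chosen for illustration, and you would adjust them to match what you see in your own plot.

```python
# Assumed cut-offs in centimeters; illustrative, not taken from the text.
SETOSA_MAX_PETAL_LENGTH = 2.5
VERSICOLOR_MAX_PETAL_WIDTH = 1.75

def classify_iris(petal_length, petal_width):
    """Predict a variety from two petal measures using threshold rules."""
    # Setosa is well separated from the others: its petals are much shorter.
    if petal_length < SETOSA_MAX_PETAL_LENGTH:
        return "Setosa"
    # Versicolor and Virginica overlap, so this rule will misclassify
    # some flowers that fall near the boundary.
    if petal_width < VERSICOLOR_MAX_PETAL_WIDTH:
        return "Versicolor"
    return "Virginica"

# Illustrative (petal_length, petal_width) measurements in centimeters.
samples = [(1.4, 0.2), (4.7, 1.4), (6.0, 2.5)]
for length, width in samples:
    print(length, width, classify_iris(length, width))
```

The first rule relies on the clear gap between Setosa and the rest. The second rule is necessarily cruder, which mirrors the overlap between Versicolor and Virginica seen in the plot.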
You can add a third (Z) dimension to the scatter plot by selecting another
attribute using the “Z Axis” drop-down. Try selecting SepalWidth. Static 3-D
Figure 2.14. Petal Width versus Petal Length