9
Clustering, Classifying,
and Working with Weka
In this chapter, we will cover the following recipes:
• Loading CSV and ARFF files into Weka
• Filtering, renaming, and deleting columns in Weka datasets
• Discovering groups of data using K-Means clustering
• Finding hierarchical clusters in Weka
• Clustering with SOMs in Incanter
• Classifying data with decision trees
• Classifying data with the Naive Bayesian classifier
• Classifying data with support vector machines
• Finding associations in data with the Apriori algorithm
Introduction
Looking for patterns in our dataset is a large part of data analysis. Of course, a dataset of
any real complexity holds more than the human mind can take in unaided, so we rely on
computers, statistics, and machine learning to augment our insights.
In this chapter, we'll take a look at a number of methods used to cluster and classify data.
Depending on the nature of the data and the question(s) we're trying to answer, different
algorithms will be more or less useful. For instance, while K-Means clustering is great for
clustering numeric datasets, it's poorly suited for working with nominal data.
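To see why the kind of data matters, recall that standard K-Means represents each cluster by the mean of its members. The following is a minimal sketch in plain Python, not Weka's implementation; the color values and their integer coding are hypothetical and chosen only for illustration. It shows that averaging works for a numeric column but gives a meaningless answer for an arbitrarily coded nominal one.

```python
from statistics import mean

# Numeric feature: the mean is a sensible cluster center.
heights_cm = [150.0, 160.0, 170.0]
numeric_centroid = mean(heights_cm)

# Nominal feature, integer-coded in an arbitrary order.
# (Hypothetical coding, assumed for this example only.)
color_codes = {"red": 0, "green": 1, "blue": 2}
observed = ["red", "blue"]
nominal_centroid = mean([color_codes[c] for c in observed])

# Decode the "centroid" back to a label.
code_to_color = {code: color for color, code in color_codes.items()}
decoded = code_to_color[round(nominal_centroid)]

print(numeric_centroid)  # 160.0
print(decoded)           # green
```

The "average" of red and blue comes out as green, a value that never appeared in the data and that changes if we reorder the codes. This is why nominal data usually calls for a different algorithm or a different distance measure.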