Clustering, Classifying, and Working with Weka - Clojure Data Analysis

9

Clustering, Classifying, and Working with Weka

In this chapter, we will cover the following recipes:

- Loading CSV and ARFF files into Weka
- Filtering, renaming, and deleting columns in Weka datasets
- Discovering groups of data using K-Means clustering
- Finding hierarchical clusters in Weka
- Clustering with SOMs in Incanter
- Classifying data with decision trees
- Classifying data with the Naive Bayesian classifier
- Classifying data with support vector machines
- Finding associations in data with the Apriori algorithm

Introduction

Looking for patterns in our dataset is a large part of data analysis. Of course, a dataset of any complexity is too much for the human mind to see patterns in, so we rely on computers, statistics, and machine learning to augment our insights.

In this chapter, we'll take a look at a number of methods used to cluster and classify data. Depending on the nature of the data and the question(s) we're trying to answer, different algorithms will be more or less useful. For instance, while K-Means clustering is great for clustering numeric datasets, it's poorly suited for working with nominal data.
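To see why K-Means favors numeric data, it helps to look at the two operations it relies on: measuring the distance between points and averaging points into a centroid. The following is a minimal sketch in plain Clojure, not Weka's implementation. The function names euclidean-distance and centroid, and the color codes used for the nominal example, are assumptions made up for this illustration.

```clojure
;; Illustrative sketch only: plain Clojure, no Weka involved.
;; The names and the sample data below are hypothetical.

(defn euclidean-distance
  "Straight-line distance between two equal-length numeric vectors."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

(defn centroid
  "Component-wise mean of a non-empty collection of numeric vectors."
  [points]
  (let [n (count points)]
    (apply mapv (fn [& xs] (/ (reduce + xs) n)) points)))

;; Numeric data: distance and mean are both meaningful.
(euclidean-distance [0.0 0.0] [3.0 4.0]) ;; => 5.0
(centroid [[1.0 2.0] [3.0 4.0]])         ;; => [2.0 3.0]

;; Nominal data encoded as arbitrary integers
;; (assumed codes: 0 = red, 1 = green, 2 = blue).
;; The "mean" of red and blue comes out as green, which is meaningless.
(centroid [[0] [2]])                     ;; => [1]
```

The last call shows the problem: once categories are encoded as numbers, the arithmetic still runs, but the resulting centroid and distances reflect only the arbitrary encoding, not any real similarity between the categories.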
