Clustering, Classifying, and Working with Weka - Clojure Data Analysis


;; The algorithm's class and invocation function are
;; used here to actually perform the processing.
(doto (new ~a-class)
  (.setOptions options#)
  (. ~a-method dataset#)))))

4. Now we can define a wrapper for K-Means clustering (as well as the other algorithms we'll introduce later in the chapter) very quickly. This also makes clear how the macro has helped us: it has allowed us to DRY up (Don't Repeat Yourself) the options list. Now we can clearly see what options an algorithm takes and how it uses them:

(defanalysis
  k-means SimpleKMeans buildClusterer
  [["-N" k 2]
   ["-I" max-iterations 100]
   ["-V" verbose false :flag-true]
   ["-S" seed 1 random-seed]
   ["-A" distance EuclideanDistance .getName]])

5. We can now call this wrapper function and get the results. We'll first load the dataset and then filter it into a new dataset that only includes the columns related to the petal size. Our clustering will be based upon those attributes:

user=> (def iris (load-arff "data/UCI/iris.arff"))
user=> (def iris-petal
         (filter-attributes iris
                            [:sepallength :sepalwidth :class]))
user=> (def km (k-means iris-petal :k 3))
user=> km
#<SimpleKMeans
kMeans
======
Number of iterations: 8
Within cluster sum of squared errors: 1.7050986081225123

…
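The "Within cluster sum of squared errors" line in this output names the quantity that K-Means tries to keep small: for every point, the squared distance to the centroid of the cluster it was assigned to, summed over all points. The following is a minimal sketch of that definition in plain Clojure. The function names and the cluster layout (maps with `:centroid` and `:points` keys) are assumptions made for this illustration and are not Weka structures. Weka computes its own figure on its internal, typically normalized, attribute values, so this sketch illustrates the definition rather than reproducing the number above.

```clojure
(defn squared-distance
  "Squared Euclidean distance between two equal-length numeric vectors."
  [a b]
  (reduce + (map (fn [x y]
                   (let [d (- x y)]
                     (* d d)))
                 a b)))

(defn within-cluster-sse
  "Sums, over every point, the squared distance to its cluster's centroid.
  `clusters` is a seq of {:centroid [...] :points [[...] ...]} maps,
  a hypothetical layout used only for this sketch."
  [clusters]
  (reduce +
          (for [{:keys [centroid points]} clusters
                p points]
            (squared-distance centroid p))))

;; Two tiny clusters; every point lies exactly 1.0 from its centroid.
(within-cluster-sse
 [{:centroid [1.0 1.0] :points [[0.0 1.0] [2.0 1.0]]}
  {:centroid [5.0 5.0] :points [[5.0 4.0] [5.0 6.0]]}])
;; => 4.0
```

A lower value means the points sit closer to their centroids, which is why the figure is useful when comparing runs with different values of `:k` or different seeds.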

How it works…

There are several interesting things to talk about in this recipe.
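One of them is how a declarative options list can become the array of strings that Weka's `setOptions` method expects. The sketch below is not the recipe's macro. It is a simplified, function-based illustration under stated assumptions: the names `k-means-spec` and `spec->options` are hypothetical, options are keyed by keywords rather than symbols, and only plain values and the `:flag-true` modifier are handled (the seed and distance conversions from the recipe are left out).

```clojure
;; Hypothetical, simplified version of the options list:
;; each entry is [flag option-key default-value & optional-modifier].
(def k-means-spec
  [["-N" :k 2]
   ["-I" :max-iterations 100]
   ["-V" :verbose false :flag-true]
   ["-S" :seed 1]])

(defn spec->options
  "Merges caller-supplied values over the defaults in `spec` and flattens
  the result into a vector of strings. An entry marked :flag-true emits
  only its flag, and only when its value is truthy."
  [spec overrides]
  (vec
   (mapcat (fn [[flag option-key default modifier]]
             (let [v (get overrides option-key default)]
               (if (= modifier :flag-true)
                 (if v [flag] [])
                 [flag (str v)])))
           spec)))

(spec->options k-means-spec {:k 3})
;; => ["-N" "3" "-I" "100" "-S" "1"]

(spec->options k-means-spec {:verbose true})
;; => ["-N" "2" "-I" "100" "-V" "-S" "1"]
```

To hand such a vector to a Weka object, it would first be converted with `(into-array String ...)`, since `setOptions` takes a Java `String[]`. Keeping the flag, name, and default together in one entry is what lets a single list drive both the wrapper function's keyword arguments and the option strings.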
