There's more...
• The Weka website's documentation has a good page on the LibSVM class at
  http://weka.wikispaces.com/LibSVM
• R. Berwick has written An Idiot's guide to Support vector machines (SVMs), which is
  an excellent introduction to the history and theoretical background of SVMs. You can
  find it at http://www.cs.ucf.edu/courses/cap6412/fall2009/papers/Berwick2003.pdf
• More information on the ionosphere dataset is available at
  http://archive.ics.uci.edu/ml/datasets/Ionosphere
Finding associations in data with the Apriori algorithm
One of the main goals of data mining and clustering is to learn the implicit relationships
in the data. The Apriori algorithm helps to do this by teasing out such relationships into an
explicit set of association rules. A common example of this type of analysis is what grocery
stores do. They analyze receipts to see which items are commonly bought together, and
then they can modify the store layout and marketing to suggest the second item once you've
decided to buy the first item.
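To make the grocery example concrete, here is a minimal sketch in plain Clojure of the two measures that association rules are usually scored by: support (how often a set of items appears together) and confidence (how often the second item appears given the first). The baskets data and the support and confidence functions are hypothetical names made up for this illustration. They are not part of Weka, and this is not the Apriori algorithm itself, which also prunes the candidate item sets it has to examine.

```clojure
(require '[clojure.set :as set])

;; Hypothetical toy receipts: each basket is the set of items on one receipt.
(def baskets
  [#{:bread :milk}
   #{:bread :butter :milk}
   #{:bread :butter}
   #{:milk :eggs}])

(defn support
  "Fraction of baskets that contain every item in `items`."
  [baskets items]
  (/ (count (filter #(set/subset? items %) baskets))
     (count baskets)))

(defn confidence
  "Of the baskets that contain `lhs`, the fraction that also contain `rhs`.
  Assumes `lhs` appears in at least one basket."
  [baskets lhs rhs]
  (/ (support baskets (set/union lhs rhs))
     (support baskets lhs)))

;; Bread and butter appear together in 2 of the 4 baskets.
(support baskets #{:bread :butter})

;; Of the 3 baskets with bread, 2 also contain milk.
(confidence baskets #{:bread} #{:milk})
```

A rule such as "bread implies milk" is interesting only if both numbers are high enough. The thresholds are a judgment call that depends on the data.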
In this recipe, we'll use this algorithm to extract the relationships from the mushroom dataset
that we've already seen several times in this chapter.
Getting ready
First, we'll use the same dependencies that we did in the Loading CSV and ARFF files
into Weka recipe.
We'll use only one import in our script or REPL:
(import [weka.associations Apriori])
We'll also use the mushroom dataset that we introduced in the Classifying data with decision
trees recipe. We'll set the class attribute to the column indicating whether the mushroom is
edible or poisonous:
(def shrooms (doto (load-arff "data/UCI/mushroom.arff")
               (.setClassIndex 22)))
Finally, we'll use the defanalysis macro from the Discovering groups of data using K-Means
clustering recipe.
 