How to do it…
For this recipe, we'll write several utility functions and then use them to load the data:
1. First, we'll need a utility function to convert options into an array of strings:
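;; Drops top-level nils, flattens nested collections, and converts each
;; remaining option to a string before packing them into a String array.
(defn ->options
  [& opts]
  (into-array String
              (map str (flatten (remove nil? opts)))))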
2. Next, we'll create a function that takes a filename and an optional :header keyword
argument and returns the Weka dataset of instances:
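;; Requires weka.core.converters.CSVLoader and java.io.File to be imported.
;; :header defaults to true; when it is false, "-H" tells the loader that
;; the file has no header row.
(defn load-csv [filename & {:keys [header]
                            :or {header true}}]
  (let [options (->options (when-not header "-H"))
        loader (doto (CSVLoader.)
                 (.setOptions options)
                 (.setSource (File. filename)))]
    (.getDataSet loader)))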
3. Finally, we can use this to load CSV files:
(def data (load-csv "data/chn-land.csv"))
4. Alternatively, if we have a file without a header row, we can do this:
(def data (load-csv "data/chn-land.csv"
                    :header false))
5. We can use a similar function to load ARFF files:
;; Requires weka.core.converters.ArffLoader and java.io.File to be imported.
(defn load-arff [filename]
  (.getDataSet
    (doto (ArffLoader.)
      (.setFile (File. filename)))))
There are ARFF files of standard datasets already created and available to download from
http://weka.wikispaces.com/Datasets . We'll use some of these in later recipes.
How it works…
Weka can be used in a number of ways. Although we're using it as a library here, it can also
be run as a GUI or as a command-line application. In many ways, using Weka as a library is
the same as using it from the command line, just without calling it from the shell. Whether
we use it from a GUI or programmatically, at some point we're setting options using a
command-line-style string array, which is why load-csv passes "-H" to the loader through
->options.
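To make the command-line-style string array concrete, the following minimal sketch shows what ->options produces for a few inputs. The function is restated from step 1 so that the snippet runs on its own without Weka on the classpath. The "-N" "first-last" pair is included only to illustrate how nested collections are flattened; it is an assumption for the example, not something the recipe itself passes to the loader.

```clojure
;; Restated from step 1 so this sketch is self-contained.
(defn ->options
  [& opts]
  (into-array String
              (map str (flatten (remove nil? opts)))))

;; What load-csv builds when :header is true: when-not yields nil,
;; nil is removed, and the result is an empty String array.
(def with-header (->options (when-not true "-H")))

;; What load-csv builds when :header is false: a one-element array.
(def without-header (->options (when-not false "-H")))

;; Illustrative only: nils are dropped and nested vectors are flattened,
;; giving the same shape as arguments typed on a command line.
(def nested (->options "-H" nil ["-N" "first-last"]))

(prn (vec with-header))     ; prints []
(prn (vec without-header))  ; prints ["-H"]
(prn (vec nested))          ; prints ["-H" "-N" "first-last"]
```

Because the result is a plain Java String array, it can be handed directly to a setter such as the .setOptions call in load-csv.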
 