Converting datasets to matrices
Although datasets are often convenient, we'll often want to treat our data as a matrix
from linear algebra. In Incanter, matrices store a table of doubles. This provides good
performance in a compact data structure. Moreover, we'll need matrices many times because
some of Incanter's functions, such as trans, only operate on a matrix. Plus, the matrix type
implements Clojure's ISeq interface, so interacting with matrices is also convenient.
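As a minimal sketch of these points, the following builds a small dataset, converts it to a matrix, and transposes it. The column names and values are made up for illustration, and we assume that Incanter 1.5.5 is on the classpath:

```clojure
(require '[incanter.core :as i])

;; Hypothetical toy data: three rows and two numeric columns.
(def toy-data
  (i/dataset [:pop :housing]
             [[100 40]
              [250 90]
              [75 30]]))

;; to-matrix turns the dataset into a 3x2 matrix of doubles.
(def toy-matrix (i/to-matrix toy-data))

;; trans accepts the matrix and returns its 2x3 transpose.
(def toy-transposed (i/trans toy-matrix))

;; Row and column counts before and after transposing.
(println (i/nrow toy-matrix) (i/ncol toy-matrix))
(println (i/nrow toy-transposed) (i/ncol toy-transposed))

;; Because the matrix can be treated as a sequence, first returns its first row.
(println (first toy-matrix))
```

The sketch uses a namespace alias rather than use only to make it obvious which functions come from Incanter.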
Getting ready
For this recipe, we'll need the Incanter libraries, so we'll use this project.clj file:
(defproject inc-dsets "0.1.0"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [incanter "1.5.5"]])
We'll use the core and io namespaces, so we'll load these into our script or REPL:
(use '(incanter core io))
We'll use the Virginia census data that we've used periodically throughout the book. See the
Managing program complexity with STM recipe from Chapter 3, Managing Complexity with
Concurrent Programming, for information on how to get this dataset. You can also download
it from http://www.ericrochester.com/clj-data-analysis/data/all_160_in_51.P35.csv.
This line binds the file name to the identifier data-file:
(def data-file "data/all_160_in_51.P35.csv")
How to do it…
For this recipe, we'll create a dataset, convert it to a matrix, and then perform some
operations on it:
1. First, we need to read the data into a dataset, as follows:
(def va-data (read-dataset data-file :header true))
2. Then, in order to convert it to a matrix, we just pass it to the to-matrix function.
Before we do this, we'll pull out a few of the columns since matrices can only contain
floating-point numbers:
(def va-matrix
  (to-matrix ($ [:POP100 :HU100 :P035001] va-data)))
 