Getting ready
First, we'll need to declare a dependency on Incanter in the project.clj file:
(defproject inc-dsets "0.1.0"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [incanter "1.5.5"]
                 [org.clojure/data.csv "0.1.2"]])
Next, we'll include Incanter core and io in our script or REPL:
(require '[incanter.core :as i]
         '[incanter.io :as i-io])
For data, we'll use the census race data for all the states. You can download it from
These lines will load the data into the race-data name:
(def data-file "data/all_160.P3.csv")
(def race-data (i-io/read-dataset data-file :header true))
How to do it…
Incanter lets you group rows for further analysis or to summarize them with the $group-by function. All you need to do is pass the data to $group-by with the column or function to group on:
(def by-state (i/$group-by :STATE race-data))
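As a minimal sketch of the "summarize" half of that sentence, the following counts the rows in each group. It uses a small in-memory dataset instead of the census file so that it runs on its own. The names sample-places, sample-groups, and rows-per-state are made up for this illustration, and the two rows for state 99 are hypothetical filler, not census data.

```clojure
(require '[incanter.core :as i])

;; Hypothetical stand-in for race-data. The state 51 rows echo the
;; recipe's sample output; the state 99 rows are invented filler.
(def sample-places
  (i/dataset [:STATE :NAME :POP100]
             [[51 "Abingdon town" 8191]
              [51 "Accomac town" 519]
              [51 "Alberta town" 298]
              [99 "Example town" 100]
              [99 "Sample city" 200]]))

;; Group the rows on the :STATE column.
(def sample-groups (i/$group-by :STATE sample-places))

;; Walk the group map and count the rows in each group's dataset,
;; keyed by the bare state code rather than the {:STATE n} key map.
(def rows-per-state
  (into {}
        (for [[k ds] sample-groups]
          [(:STATE k) (i/nrow ds)])))
```

The same walk over the group map should work for other per-group summaries by swapping i/nrow for a different function of the group's dataset.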
How it works…
This function returns a map where each key is itself a map of the fields and values that identify the group, and each value is a dataset containing that group's rows. For example, this is how the keys look:
user=> (take 5 (keys by-state))
({:STATE 29} {:STATE 28} {:STATE 31} {:STATE 30} {:STATE 25})
We can get the data for Virginia back out by querying the group map for state 51.
user=> (i/$ (range 3) [:GEOID :STATE :NAME :POP100]
            (by-state {:STATE 51}))
|  :GEOID | :STATE |         :NAME | :POP100 |
|---------+--------+---------------+---------|
| 5100148 |     51 | Abingdon town |    8191 |
| 5100180 |     51 |  Accomac town |     519 |
| 5100724 |     51 |  Alberta town |     298 |
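The lookup above works because the group map is an ordinary Clojure map whose keys are query maps. The sketch below illustrates this on a small in-memory dataset. The names sample, groups, virginia, virginia-too, and filtered are made up for this example, and the state 99 rows are hypothetical. It also shows that, as far as we can tell, the same key map passed to $where selects the same rows from the ungrouped data.

```clojure
(require '[incanter.core :as i])

;; Hypothetical stand-in for race-data. The state 51 rows come from the
;; output shown above; the state 99 rows are invented filler.
(def sample
  (i/dataset [:GEOID :STATE :NAME :POP100]
             [[5100148 51 "Abingdon town" 8191]
              [5100180 51 "Accomac town" 519]
              [5100724 51 "Alberta town" 298]
              [9900001 99 "Example town" 100]
              [9900002 99 "Sample city" 200]]))

(def groups (i/$group-by :STATE sample))

;; Calling the map as a function and using get are equivalent lookups.
(def virginia (groups {:STATE 51}))
(def virginia-too (get groups {:STATE 51}))

;; The key map doubles as a $where query against the original dataset.
(def filtered (i/$where {:STATE 51} sample))
```

Grouping once and looking groups up by key avoids refiltering the full dataset for every state, which is the main reason to prefer it over repeated $where calls.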