Selecting columns with $
Often, you need to cut the data to make it more useful. One common transformation is to
pull out all the values from one or more columns into a new dataset. This can be useful for
generating summary statistics or aggregating the values of some columns.
The Incanter function $ slices out parts of a dataset. In this recipe, we'll see this in action.
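As a rough illustration of that slicing, here is a minimal sketch. It assumes Incanter is already on the classpath, and the sample dataset with its column names and values is made up for this example rather than taken from the World Bank data used later in the recipe.

```clojure
(require '[incanter.core :as i])

;; A tiny, made-up dataset; column names and values are illustrative only.
(def sample
  (i/dataset [:country :year :value]
             [["China" 2010 1.0]
              ["China" 2011 2.0]
              ["China" 2012 3.0]]))

;; A single column keyword pulls out that column's values as a sequence.
(def years (i/$ :year sample))

;; A vector of column keywords produces a new, narrower dataset.
(def year-value (i/$ [:year :value] sample))

(println years)
(println (i/col-names year-value))
```

The single-column form is convenient when the values feed straight into a summary function, while the vector form keeps the result as a dataset for further processing.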
Getting ready
For this recipe, we'll need to have Incanter listed in our project.clj file:
(defproject inc-dsets "0.1.0"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [incanter "1.5.5"]
                 [org.clojure/data.csv "0.1.2"]])
We'll also need to include these libraries in our script or REPL:
(require '[clojure.java.io :as io]
         '[clojure.data.csv :as csv]
         '[clojure.string :as str]
         '[incanter.core :as i])
Moreover, we'll need some data. This time, we'll use some country data from the World Bank.
Point your browser to http://data.worldbank.org/country and select a country. I
picked China. Under World Development Indicators, there is a button labeled Download
Data. Click on this button and select CSV. This will download a ZIP file. I extracted its contents
into the data/chn directory in my project. I bound the filename for the primary data file to the
data-file name.
How to do it…
We'll use the $ function in several different ways to get different results. First, however, we'll
need to load the data into a dataset, which we'll do in steps 1 and 2:
1. Before we start, we'll need a couple of utilities that load the data file into a sequence
of maps and make a dataset out of those:
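;; Treats the first row as the header and turns every later row into a map
;; keyed by the header names, with spaces replaced by hyphens.
(defn with-header [coll]
  (let [headers (map #(keyword (str/replace % \space \-))
                     (first coll))]
    (map (partial zipmap headers) (next coll))))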
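To see what with-header produces, here is a small sketch that runs it on hand-written rows. The sample-rows and records names and their contents are hypothetical, shaped like the output of a CSV parser, and the function definition is repeated only so that the example runs on its own.

```clojure
(require '[clojure.string :as str])

;; Same definition as in step 1, repeated so the example is self-contained.
(defn with-header [coll]
  (let [headers (map #(keyword (str/replace % \space \-))
                     (first coll))]
    (map (partial zipmap headers) (next coll))))

;; Hypothetical rows: the first row is the header, the rest are data.
(def sample-rows
  [["Country Name" "Indicator Name" "Value"]
   ["China" "Indicator A" "1.5"]
   ["China" "Indicator B" "2.5"]])

;; Each data row becomes a map keyed by the hyphenated header keywords.
(def records (with-header sample-rows))

(println (first records))
```

Replacing the spaces matters because the headers become keywords, and a keyword such as :Country-Name is far easier to type and read back than one containing a space.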
 