How to do it…
For this recipe, we'll read in the data, break it into chunks, and use separate threads to total
the number of housing units and the number of families in each chunk. Each chunk will add
its totals to some global references:
1. We need to define two references that the STM will manipulate: one for the total of
housing units and one for families:
(def total-hu (ref 0))
(def total-fams (ref 0))
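As a minimal sketch of how references like these behave, the following uses two hypothetical refs (demo-hu and demo-fams, named differently so they do not disturb the recipe's own totals) and updates both inside a single transaction. The values added are made up for illustration.

```clojure
;; Hypothetical refs for illustration only; the recipe's own refs are
;; total-hu and total-fams.
(def demo-hu (ref 0))
(def demo-fams (ref 0))

;; alter may only be called inside a transaction. Both updates commit
;; together or not at all.
(dosync
  (alter demo-hu + 10)
  (alter demo-fams + 3))

;; Dereferencing outside a transaction reads the committed values.
[@demo-hu @demo-fams]
;; => [10 3]
```

Because both alter calls sit in one dosync, no other thread can observe the housing-unit total updated without the matching family total.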
2. Now, we'll need a couple of utility functions to safely read a CSV file into a lazy
sequence. The first is lazy-read-csv from the Lazily processing very large data
sets recipe in Chapter 2, Cleaning and Validating Data. We'll also define a new
function, with-header, that uses the first row to create maps from the rest of the
rows in the dataset:
(defn with-header [coll]
  (let [headers (map keyword (first coll))]
    (map (partial zipmap headers) (next coll))))
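To see what with-header produces, here is a small sketch that applies it to in-memory rows instead of a CSV file. The column names and values in sample-rows are assumptions made up for this example, not taken from the recipe's dataset.

```clojure
;; Same definition as in the recipe, repeated so the sketch is self-contained.
(defn with-header [coll]
  (let [headers (map keyword (first coll))]
    (map (partial zipmap headers) (next coll))))

;; Hypothetical rows, shaped the way a CSV reader would return them:
;; a header row followed by data rows of strings.
(def sample-rows
  [["STATE" "HU100" "P035001"]
   ["AL" "120" "80"]
   ["AK" "30" "20"]])

(def sample-maps (with-header sample-rows))
;; => ({:STATE "AL", :HU100 "120", :P035001 "80"}
;;     {:STATE "AK", :HU100 "30", :P035001 "20"})
```

Note that the values are still strings at this point; converting them to numbers is left to a later step.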
3. Next, we'll define some utility functions. One (->int) will convert a string into an
integer. Another (sum-item) will calculate the running totals for the fields we're
interested in. A third function (sum-items) will calculate the sums from a collection
of data maps:
(defn ->int [i] (Integer. i))
(defn sum-item
  ([fields] (partial sum-item fields))
  ([fields accum item]
   (mapv + accum (map ->int (map item fields)))))
(defn sum-items [accum fields coll]
  (reduce (sum-item fields) accum coll))
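The following sketch shows these three functions working together on a couple of hand-written data maps. The keys :HU100 and :P035001 and their values are assumptions chosen for illustration; substitute whatever field names your data actually uses.

```clojure
;; Same definitions as in the recipe, repeated so the sketch is self-contained.
(defn ->int [i] (Integer. i))

(defn sum-item
  ([fields] (partial sum-item fields))
  ([fields accum item]
   (mapv + accum (map ->int (map item fields)))))

(defn sum-items [accum fields coll]
  (reduce (sum-item fields) accum coll))

;; Hypothetical data maps with string values, as with-header would produce.
(def sample-items
  [{:HU100 "120" :P035001 "80"}
   {:HU100 "30" :P035001 "20"}])

;; The accumulator holds one running total per field, in field order.
(sum-items [0 0] [:HU100 :P035001] sample-items)
;; => [150 100]
```

The one-argument arity of sum-item returns a partially applied function, which is what lets sum-items hand it straight to reduce.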
4. We can now define the function that will actually interact with the STM. The
update-totals function takes a list of fields that contain the housing unit
and family data, plus a collection of items. It totals those fields across the
items passed into the function and updates the STM references with the results:
(defn update-totals [fields items]
(let [mzero (mapv (constantly 0) fields)
 