Managing program complexity with STM
The basis of Clojure's concurrency is its software transactional memory (STM) system, which
extends the semantics of database transactions to the computer's memory.
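As a minimal sketch of what transactional semantics in memory look like, the snippet below keeps two values in refs and updates both inside one dosync block, so other threads never observe one change without the other. The names checking, savings, and transfer! are hypothetical and are not part of this recipe.

```clojure
;; Hypothetical example: two balances held in refs (not part of the recipe).
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to the other inside a single transaction.
  Either both alters commit or neither does."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

;; Dereferencing outside a transaction reads the committed values.
[@checking @savings] ; => [70 30]
```

If another transaction changes either ref before this one commits, the STM retries the whole dosync body, which is why the body should be free of side effects.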
For this recipe, we'll use the STM to calculate the families per housing unit from a piece of
U.S. census data. We'll use future-call to perform the calculations in the thread pool and
spread the execution over multiple cores. Afterwards, we'll go into more detail about how
the STM works in general, and how it's applied in this particular recipe.
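Before turning to the census data, here is a small sketch of how future-call can spread work over several threads. It assumes a stand-in workload (summing chunks of a range) rather than the recipe's real calculation, and the names sum-chunk, chunks, tasks, and total are hypothetical.

```clojure
;; Hypothetical stand-in for a more expensive calculation.
(defn sum-chunk [xs]
  (reduce + 0 xs))

;; Split the input into four chunks of 250 numbers each.
(def chunks (partition-all 250 (range 1000)))

;; future-call takes a no-argument function and runs it on another thread.
;; mapv is eager, so every task is submitted before any result is awaited.
(def tasks
  (mapv (fn [chunk] (future-call #(sum-chunk chunk))) chunks))

;; deref blocks until each future has finished, then the partial sums are combined.
(def total (reduce + (map deref tasks)))

total ; => 499500
```

In a standalone script, calling (shutdown-agents) once all of the work is done lets the JVM exit promptly, since the threads that run futures otherwise linger for a while.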
Getting ready
To prepare for this recipe, we first need to list our dependencies in the Leiningen
project.clj file:
(defproject concurrent-data "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/clojure "1.6.0"]
[org.clojure/data.csv "0.1.2"]])
We also need to import these dependencies into our script or REPL:
(require '[clojure.java.io :as io]
'[clojure.data.csv :as csv])
Finally, we need our data file. I downloaded one of the bulk data files from the
Investigative Reporters and Editors' U.S. census site at
http://census.ire.org/data/bulkdata.html. This recipe uses the family census data for
Virginia. I've also uploaded this data to
http://www.ericrochester.com/clj-data-analysis/data/all_160_in_51.P35.csv. You can
download it from there and save it to a directory named data. Let's bind the filename to a
variable for easy access:
(def data-file "data/all_160_in_51.P35.csv")
Here's the data file, opened in a spreadsheet and showing the first few rows:
 