The data that we'll work on will be a sequence of strings that contain words and numbers.
We'll convert all of the letters to lowercase and all of the numbers to integers. Based on this
specification, the first step of the processing pipeline will be str/lower-case. The second
step will be the ->int function:
(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))
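As a quick check of how ->int behaves, here is a small sketch. The sample inputs are made up for illustration; the point is that strings that parse as integers become Longs, and anything else is returned unchanged.

```clojure
;; ->int as defined above, repeated so that the snippet runs on its own.
(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

(->int "42")      ; parses cleanly, so this returns the Long 42
(->int "items.")  ; not a number, so the string "items." comes back unchanged
(->int "4.2")     ; Long/parseLong rejects decimals, so "4.2" comes back as is
```

Because the catch clause swallows every Exception, ->int never throws on unparseable strings; it simply passes them through.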
The data that we'll work on will be this list:
;; Assumes clojure.string is aliased as str.
(require '[clojure.string :as str])

(def data
  (str/split (str "This is a small list. It contains 42 "
                  "items. Or less.")
             #"\s+"))
If you run this using clojure.core/map, you will get the results that you would expect:
user=> (map ->int
            (map str/lower-case
                 data))
("this" "is" "a" "small" "list." "it" "contains" 42 "items." "or"
 "less.")
The problem with this approach isn't the results; it's what Clojure is doing between the two
calls to map. In this case, the first map creates an entirely new lazy sequence. The second
map walks over it again before throwing it and its contents away. Repeatedly allocating
sequences and immediately throwing them away is wasteful. It takes more time and can
potentially consume more memory than you have available. In this case, this isn't really a
problem, but for longer pipelines of map calls (potentially processing long sequences) this
can be a performance problem.
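To make the intermediate sequence visible, the nested call can be unrolled so that the result of the first map is bound to a name. This is only a sketch for illustration: the names lowered and converted are hypothetical and do not appear in the original pipeline.

```clojure
(require '[clojure.string :as str])

(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

(def data
  (str/split (str "This is a small list. It contains 42 "
                  "items. Or less.")
             #"\s+"))

;; Step 1: the inner map produces its own lazy sequence of lowercased strings.
(def lowered (map str/lower-case data))

;; Step 2: the outer map walks that sequence and produces a second one.
(def converted (map ->int lowered))

(first lowered)    ; "this", an element of the intermediate sequence
(nth converted 7)  ; 42, the one token that parsed as an integer
```

In the nested form, lowered is never named, but it is still allocated, consumed once, and discarded. Each additional map in a pipeline adds another such throwaway sequence.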
This is the problem that reducers address. Let's change our calls to map into calls to
clojure.core.reducers/map (aliased here as r/map) and see what happens:
user=> (r/map ->int
              (r/map str/lower-case
                     data))
#<reducers$folder$reify__1529 clojure.core.reducers$folder$reify__1529@37577fd6>