The data that we'll work on will be a sequence of strings that contain words and numbers.
We'll convert all of the letters to lowercase and all of the numbers to integers. Based on this
specification, the first step of the processing pipeline will be str/lower-case. The second
step will be the ->int function:
(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))
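To make the behavior of ->int concrete, here is a small sketch of what it returns for the two kinds of input this pipeline sees. The sample strings are only illustrative:

```clojure
;; Same definition as above, repeated so the example is self-contained.
(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

;; A string of digits parses to a long.
(def parsed (->int "42"))

;; Anything that fails to parse is returned unchanged.
(def untouched (->int "items."))
```

Because the catch clause returns the original argument, words pass through the function untouched and only numeric strings change type.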
We'll use this list as the input data:
;; clojure.string must be available under the str alias.
(require '[clojure.string :as str])

(def data
  (str/split (str "This is a small list. It contains 42 "
                  "items. Or less.")
             #"\s+"))
If you run this pipeline using clojure.core/map, you get the expected results:
user=> (map ->int
            (map str/lower-case
                 data))
("this" "is" "a" "small" "list." "it" "contains" 42 "items." "or"
"less.")
The problem with this approach isn't the results; it's what Clojure is doing between the two
calls to map. In this case, the first map creates an entirely new lazy sequence. The second
map walks over that sequence, after which it and its contents are thrown away. Repeatedly
allocating sequences and immediately discarding them is wasteful: it takes more time and can
potentially consume more memory than you have available. In this case, this isn't really a
problem, but for longer pipelines of map calls (potentially processing long sequences), it can
become a performance problem.
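The intermediate sequence is easier to see when the nested call is written out in steps. The following is only a sketch: the names lowered, result, and fused are hypothetical, and the single-pass version uses comp purely for comparison. It is not the reducers approach.

```clojure
(require '[clojure.string :as str])

(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

(def data
  (str/split (str "This is a small list. It contains 42 "
                  "items. Or less.")
             #"\s+"))

;; Step 1: the inner map builds an intermediate lazy sequence.
(def lowered (map str/lower-case data))

;; Step 2: the outer map walks that sequence to build the final one.
(def result (map ->int lowered))

;; For comparison: composing the two functions visits each item once,
;; so no intermediate sequence is created.
(def fused (map (comp ->int str/lower-case) data))
```

Both versions produce the same items; the difference is only in how many sequences are allocated along the way.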
This is the problem that reducers address. Let's change our calls to map into calls to
clojure.core.reducers/map (aliased here as r) and see what happens:
user=> (r/map ->int
              (r/map str/lower-case
                     data))
#<reducers$folder$reify__1529 clojure.core.reducers$folder$reify__1529@37577fd6>
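The value printed here is a reducible object rather than a sequence, so nothing has been computed yet. As a minimal sketch, assuming clojure.core.reducers is aliased as r and clojure.string as str, the pipeline can be realized by reducing it, for example with into. The names pipeline, realized, and item-count are hypothetical:

```clojure
(require '[clojure.core.reducers :as r]
         '[clojure.string :as str])

(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

(def data
  (str/split (str "This is a small list. It contains 42 "
                  "items. Or less.")
             #"\s+"))

;; Building the pipeline does no work yet; it only describes the steps.
(def pipeline
  (r/map ->int
         (r/map str/lower-case
                data)))

;; into is built on reduce, so it runs both steps in a single pass
;; and collects the results into a vector.
(def realized (into [] pipeline))

;; Any reduce call with an initial value works the same way.
(def item-count (reduce (fn [n _] (inc n)) 0 pipeline))
```

Here realized holds the same items that the clojure.core/map version produced, and item-count is 11.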
 