5.
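We'll use that function in compute-file, which does the primary processing for
each file. It also uses send-off to safely queue the next task for this agent.
Because the call sits inside dosync, the send is held until the transaction
commits, so a retried transaction does not queue duplicate tasks: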
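(defn compute-file [fs]
  (dosync
    (if-let [[s & ss] (seq fs)]
      ;; There is at least one file left: tokenize and count it.
      (let [tokens (tokenize-brown (slurp s))
            tc (count tokens)
            fq (reduce accum-freq {} tokens)]
        ;; commute is enough here because these updates are order-independent.
        (commute total-docs inc)
        (commute total-words #(+ tc %))
        (commute freqs #(merge-with + % fq))
        ;; Queue the next file; the remaining files become the agent's new state.
        (send-off *agent* compute-file)
        ss)
      ;; No files left: flag completion and leave an empty list as the state.
      (do (alter finished (constantly true))
          '()))))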
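The commute calls above can be tried in isolation. The following is a minimal sketch, not the recipe's code: it assumes in-memory token vectors instead of files, uses the built-in frequencies in place of accum-freq, and add-doc! is a hypothetical helper name. It only illustrates that commutative updates from several threads end in the same totals regardless of commit order.

```clojure
;; Hypothetical stand-ins for the recipe's shared state.
(def total-docs (ref 0))
(def total-words (ref 0))
(def freqs (ref {}))

(defn add-doc!
  "Folds one document's tokens into the shared counters in one transaction."
  [tokens]
  ;; `frequencies` stands in for (reduce accum-freq {} tokens); this is an assumption.
  (let [fq (frequencies tokens)]
    (dosync
      (commute total-docs inc)
      (commute total-words + (count tokens))
      (commute freqs #(merge-with + % fq)))))

;; Process three small "documents" on several threads.
;; dorun forces the lazy sequence that pmap returns.
(dorun (pmap add-doc! [[:a :b :a] [:b :c] [:a]]))

;; Afterwards: @total-docs is 3, @total-words is 6,
;; and @freqs is {:a 3, :b 2, :c 1}.
```

Since addition and merge-with + give the same result in any order, commute lets these transactions proceed with less contention than alter would.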
6.
Another function will update the report in parallel:
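(defn compute-report [{term :term, :as report}]
  (dosync
    ;; Keep re-queuing this function until all files have been processed.
    (when-not @finished
      (send *agent* compute-report))
    ;; ensure reads the refs and protects them from changes by other
    ;; transactions, so term-freq and tc come from a consistent snapshot.
    (let [term-freq (term (ensure freqs) 0)
          tc (ensure total-words)
          r (if (zero? tc)
              nil
              (float (/ term-freq tc)))]
      (assoc report
             :frequency term-freq
             :ratio r))))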
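The arithmetic in compute-report can be checked without any refs or agents. This is a sketch under stated assumptions: report-for is a hypothetical pure function that takes the frequency map and word total as plain arguments rather than reading them from freqs and total-words.

```clojure
(defn report-for
  "Returns report with :frequency and :ratio filled in for its :term.
  :ratio is nil while no words have been counted, to avoid dividing by zero."
  [report freqs total-words]
  (let [term-freq (get freqs (:term report) 0)]
    (assoc report
           :frequency term-freq
           :ratio (when-not (zero? total-words)
                    (float (/ term-freq total-words))))))

;; 3 occurrences out of 20 words: :frequency is 3 and :ratio is about 0.15.
(def sample-report
  (report-for {:term :committee} {:committee 3, :the 10} 20))

;; Nothing counted yet: :frequency is 0 and :ratio is nil.
(def empty-report
  (report-for {:term :committee} {} 0))
```

Keeping the calculation in a pure function like this makes it easy to test; the transactional version only adds consistent reads of the shared state.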
7.
Finally, compute-frequencies gets the entire thing started:
(defn compute-frequencies [inputs term]
  (let [a (agent inputs)]
    (send running-report #(assoc % :term term))
    (send running-report compute-report)
    (send-off a compute-file)))
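compute-frequencies relies on the two sends to running-report being handled in the order they were issued, so the term is in place before compute-report first runs. A small sketch of that ordering guarantee, using a hypothetical agent named report and a made-up :seen-term key:

```clojure
;; Hypothetical agent standing in for running-report.
(def report (agent {}))

;; Actions sent to one agent from the same thread run in the order sent.
(send report assoc :term :committee)
(send report #(assoc % :seen-term (:term %)))

;; Block until both actions have run.
(await report)

;; @report is now {:term :committee, :seen-term :committee}:
;; the second action saw the term set by the first.
```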
8.
To use this, we just call compute-frequencies with the inputs and a term, and
then we poll finished and running-report to see how processing is going:
user=> (compute-frequencies input-files :committee)
#<Agent@1830f455: (…)>
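Polling can be as simple as dereferencing finished and running-report at the REPL from time to time. For a scripted wait, something like the following sketch may help. It is not part of the recipe: wait-until-finished is a hypothetical helper, finished is redefined here as a stand-in ref so the example runs on its own, and a future simulates the worker flipping the flag.

```clojure
;; Stand-in for the recipe's `finished` ref (assumption: it holds a boolean).
(def finished (ref false))

(defn wait-until-finished
  "Polls `finished` every interval-ms. Returns true once it is set,
  or false if timeout-ms elapses first."
  [interval-ms timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (cond
        @finished true
        (>= (System/currentTimeMillis) deadline) false
        :else (do (Thread/sleep (long interval-ms))
                  (recur))))))

;; Simulate processing that completes after a short delay.
(future
  (Thread/sleep 50)
  (dosync (ref-set finished true)))

;; Check every 10 ms, giving up after 2 seconds.
(def done? (wait-until-finished 10 2000))
```

The timeout keeps the loop from spinning forever if an agent fails before it sets the flag.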
 