(defn get-person
  "This takes a list item and returns a map of the person's
  name and relationship."
  [li]
  (let [[{pnames :content} rel] (:content li)]
    {:name (apply str pnames)
     :relationship (string/trim rel)}))

(defn get-rows
  "This takes an article and returns the person mappings,
  with the family name added."
  [article]
  (let [family (get-family article)]
    (map #(assoc % :family family)
         (map get-person
              (html/select article [:ul :li])))))

(defn load-data
  "This downloads the HTML page and pulls the data out of
  it."
  [html-url]
  (let [html (html/html-resource (URL. html-url))
        articles (html/select html [:article])]
    (i/to-dataset (mapcat get-rows articles))))
2. Now that these functions are defined, we just call load-data with the URL that we
want to scrape:
user=> (load-data (str "http://www.ericrochester.com/"
                       "clj-data-analysis/data/"
                       "small-sample-list.html"))
| :family | :name | :relationship |
|----------------+-----------------+---------------|
| Addam's Family | Gomez Addams | — father |
| Addam's Family | Morticia Addams | — mother |
| Addam's Family | Pugsley Addams | — brother |
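
To see how the destructuring in get-person produces one of these rows, the function can be tried on a hand-built node without downloading anything. This is only a sketch: Enlive represents elements as maps with :tag, :attrs, and :content keys, but the sample-li value below and its :em child tag are assumptions made for illustration, not the actual markup of the sample page.

```clojure
(require '[clojure.string :as string])

;; Same function as in the recipe, repeated so this snippet stands alone.
(defn get-person
  [li]
  (let [[{pnames :content} rel] (:content li)]
    {:name (apply str pnames)
     :relationship (string/trim rel)}))

;; Hypothetical node, shaped like <li><em>Gomez Addams</em> — father</li>.
;; The first child is an element whose :content holds the name;
;; the second child is the bare text that follows it.
(def sample-li
  {:tag :li
   :attrs nil
   :content [{:tag :em :attrs nil :content ["Gomez Addams"]}
             " — father"]})

(get-person sample-li)
;; => {:name "Gomez Addams", :relationship "— father"}
```

The vector pattern binds the first child's :content to pnames and the trailing text to rel, so string/trim only strips the surrounding whitespace. That is why the dash still appears in the :relationship column above.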
 