Importing Data for Analysis - Clojure Data Analysis


How it works…

The let bindings in load-data tell the story here. Let's talk about them one by one.

The first binding has Enlive download the resource and parse it into Enlive's internal

representation:

(let [page (html/html-resource (URL. url))

The next binding selects the table whose ID is data:

table (html/select page [:table#data])

Now, select all of the header cells from the table, extract the text from them, convert each to a

keyword, and then convert the entire sequence into a vector. This gives headers for the dataset:

headers (->>
          (html/select table [:tr :th])
          (map html/text)
          (map to-keyword)
          vec)
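
To make the header step concrete, here is a minimal sketch that runs the same text-to-keyword-to-vector pipeline on plain strings instead of Enlive nodes. The to-keyword definition shown here is an assumption, standing in for the helper the recipe defines elsewhere, and the header strings are made up for illustration:

```clojure
(require '[clojure.string :as string])

;; Assumed stand-in for the recipe's to-keyword helper:
;; lower-case the text, turn spaces into hyphens, and make a keyword.
(defn to-keyword [input]
  (-> input
      string/lower-case
      (string/replace " " "-")
      keyword))

;; Hypothetical header text, as html/text might return it for each :th cell.
(def header-text ["Given Name" "Surname" "Relation"])

;; Same shape as the headers binding: map over the text, then realize a vector.
(def headers
  (->> header-text
       (map to-keyword)
       vec))

(println headers) ; prints [:given-name :surname :relation]
```

Ending the pipeline with vec turns the lazy sequence produced by map into a concrete vector.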

First, select each row individually. The next two steps are wrapped in map so that the cells in

each row stay grouped together. In these steps, select the data cells in each row and extract

the text from each. Last, use filter seq, which removes any rows with no data, such as the

header row:

rows (->> (html/select table [:tr])
          (map #(html/select % [:td]))
          (map #(map html/text %))
          (filter seq))]
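
The role of filter seq is easiest to see on plain data. In this sketch, each row is a sequence of cell strings. These are hypothetical values standing in for what html/text would return. The header row contains no td cells, so it arrives as an empty sequence:

```clojure
;; Hypothetical result of selecting the :td cells in each row and
;; extracting their text. The first entry stands in for the header row,
;; which holds :th cells only and so yields no :td matches.
(def cell-text
  [()
   ["Addam" "Marley" "father"]
   ["Eve" "Marley" "mother"]])

;; (seq coll) returns nil for an empty collection, so filter drops that row.
(def rows (filter seq cell-text))

(println (count rows)) ; prints 2
```

Using seq as the predicate is the usual Clojure idiom for "not empty", which is why the header row disappears without any explicit check for th cells.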

Here's another view of this data. In this image, you can see some of the code from this web

page. The variable names and select expressions are placed beside the HTML structures that

they match. This should make it clearer how the select expressions correspond to the

HTML elements:
