How it works…
Examining the web page shows that each family is wrapped in an article tag that contains a
header with an h2 tag. get-family pulls that tag out and returns its text.
get-person processes each person. The people in each family are in an unordered list
(ul), and each person is in an li tag. The person's name itself is in an em tag. A let form
gets the contents of the li tag and destructures them in order to pull out the name and
relationship strings. get-person puts both pieces of information into a map and returns it.
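The destructuring step can be sketched as follows. This is a minimal illustration rather than the recipe's exact code: it assumes the li node arrives in the {:tag :attrs :content} map form that Enlive produces, and sample-li, with its name and relationship text, is a made-up stand-in for the real page data.

```clojure
(require '[clojure.string :as string])

;; Hypothetical sample: an <li> node as a {:tag :attrs :content} map.
;; The name and relationship text are invented for illustration.
(def sample-li
  {:tag :li
   :attrs nil
   :content [{:tag :em :attrs nil :content ["Jane Doe"]}
             " (daughter)"]})

(defn get-person
  "Returns a map holding the person's name and relationship."
  [li]
  ;; The first child is the em node (we keep only its :content);
  ;; the second child is the text that follows it.
  (let [[{name-parts :content} relationship] (:content li)]
    {:name (apply str name-parts)
     :relationship (string/trim relationship)}))

(get-person sample-li)
;; => {:name "Jane Doe", :relationship "(daughter)"}
```

The nested destructuring form does two jobs at once: the outer vector splits the li contents into the em node and the trailing text, and the inner map pattern reaches into the em node for its content.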
get-rows processes each article tag. It calls get-family to get that information from
the header, gets the list item for each person, calls get-person on the list item, and adds
the family to each person's mapping.
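The way get-rows combines the family name with each person can be sketched on plain data. This is an assumption-laden illustration, not the recipe's code: by-tag and node-text are hypothetical helpers standing in for Enlive's selectors, get-person is a simplified stand-in, and sample-article is invented data in the {:tag :content} node form.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper: all descendant nodes (including the node itself)
;; that carry the given tag, in document order.
(defn by-tag [node tag]
  (filter #(= tag (:tag %)) (tree-seq map? :content node)))

;; Hypothetical helper: concatenates every text fragment under a node.
(defn node-text [node]
  (apply str (filter string? (tree-seq map? :content node))))

(defn get-family
  "Returns the text of the article's h2 heading."
  [article]
  (node-text (first (by-tag article :h2))))

(defn get-person
  "Simplified stand-in: name from the em tag, relationship from the
  text that sits directly inside the li."
  [li]
  {:name (node-text (first (by-tag li :em)))
   :relationship (string/trim (apply str (filter string? (:content li))))})

(defn get-rows
  "Returns one map per person, each tagged with the family name."
  [article]
  (let [family (get-family article)]
    (map #(assoc % :family family)
         (map get-person (by-tag article :li)))))

;; Invented sample data for illustration only.
(def sample-article
  {:tag :article
   :content [{:tag :header :content [{:tag :h2 :content ["Doe"]}]}
             {:tag :ul
              :content [{:tag :li :content [{:tag :em :content ["Jane Doe"]} " (daughter)"]}
                        {:tag :li :content [{:tag :em :content ["John Doe"]} " (father)"]}]}]})

(get-rows sample-article)
;; => ({:name "Jane Doe", :relationship "(daughter)", :family "Doe"}
;;     {:name "John Doe", :relationship "(father)", :family "Doe"})
```

Reading the family name once per article and attaching it with assoc keeps get-person unaware of the surrounding article, so each function handles exactly one level of the HTML structure.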
Here's how the HTML structures correspond to the functions that process them. Each function
name is mentioned beside the elements it parses:
Finally, load-data ties the process together by downloading and parsing the HTML file and
pulling the article tags from it. It then calls get-rows to create the data mappings and
converts the output to a dataset.
 