The load-xml-data function implements this process. It takes three parameters:
- The input filename
- A function that takes the root node of the parsed XML and returns the first data node
- A function that takes a data node and returns the next data node, or nil if there are
  no more nodes
First, the function parses the XML file and wraps it in a zipper (we'll talk more about zippers in
the next section). Then, it uses the two functions that are passed in to extract all of the data
nodes as a sequence. For each data node, the function retrieves that node's child nodes and
converts them into a series of tag name / content pairs. The pairs for each data node are
converted into a map, and the sequence of maps is converted into an Incanter dataset.
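The steps described above can be sketched as follows. This is a minimal illustration and not the recipe's own listing: the names xml-rows and sample-xml are hypothetical, the XML is parsed from an in-memory string so that the example is self-contained, and the sketch stops at the sequence of maps. In the recipe itself, the last step would hand that sequence to Incanter (for example, with incanter.core/to-dataset).

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])
(import '(java.io ByteArrayInputStream))

;; Hypothetical sample document, used only for this illustration.
(def sample-xml
  (str "<data>"
       "<person><given-name>Gomez</given-name><surname>Addams</surname></person>"
       "<person><given-name>Morticia</given-name><surname>Addams</surname></person>"
       "</data>"))

(defn xml-rows
  "Hypothetical helper: parses input, walks the data nodes using first-data
  and next-data, and returns one map per data node."
  [input first-data next-data]
  (let [pair (fn [node] [(:tag node) (first (:content node))])]
    (->> (xml/parse input)            ; parse the XML
         zip/xml-zip                  ; wrap the parsed tree in a zipper
         first-data                   ; move to the first data node
         (iterate next-data)          ; keep stepping to the next data node
         (take-while some?)           ; stop when next-data returns nil
         (map zip/children)           ; child elements of each data node
         (map #(into {} (map pair %))))))  ; tag/content pairs -> map

(def rows
  (xml-rows (ByteArrayInputStream. (.getBytes sample-xml "UTF-8"))
            zip/down     ; first data node: first child of the root
            zip/right))  ; next data node: the sibling to the right
```

Here, zip/down and zip/right play the roles of the second and third parameters, which suits a document whose data nodes are all direct children of the root element.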
There's more...
We used a couple of interesting data structures or constructs in this recipe. Both are common
in functional programming or Lisp, but neither has made its way into more mainstream
programming. We should spend a minute with them.
Navigating structures with zippers
The first thing that happens to the parsed XML is that it gets passed to
clojure.zip/xml-zip. Zippers are standard data structures that encapsulate the data at a
position in a tree structure, as well as the information necessary to navigate back out.
The xml-zip function takes Clojure's native XML data structure and turns it into something
that can be navigated quickly using commands such as clojure.zip/down and
clojure.zip/right. Being a functional programming language, Clojure encourages you to use
immutable data structures, and zippers provide an efficient, natural way to navigate and
modify a tree-like structure, such as an XML document.
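As a small illustration of this kind of navigation, the following sketch builds a tree by hand in the map format that clojure.xml produces, moves around it, and makes an edit. The document and the names doc, root-loc, first-person, second-person, and edited are made up for this example.

```clojure
(require '[clojure.zip :as zip])

;; Hypothetical document in Clojure's native XML structure
;; (maps with :tag, :attrs, and :content keys).
(def doc
  {:tag :data
   :attrs nil
   :content [{:tag :person :attrs nil :content ["Gomez"]}
             {:tag :person :attrs nil :content ["Morticia"]}]})

(def root-loc (zip/xml-zip doc))                ; location at the root
(def first-person (zip/down root-loc))          ; move to the first child
(def second-person (zip/right first-person))    ; move to its right sibling

;; zip/node returns the data at the current location.
(zip/node second-person)

;; zip/edit returns a new location with the change applied, and zip/root
;; navigates back out to the top and returns the rebuilt tree.
;; The original doc is left untouched.
(def edited
  (-> second-person
      (zip/edit assoc :attrs {:role "parent"})
      zip/root))
```

Each call returns a new location rather than changing the tree in place, which is why the original doc can still be used after the edit.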
Zippers are very useful and interesting, and understanding them can help you understand
and work better with immutable data structures. For more information on zippers, the
Clojure-doc page is helpful
(http://clojure-doc.org/articles/tutorials/parsing_xml_with_zippers.html). However, if you
would rather dive into the deep end, see Gerard Huet's paper, The Zipper
(http://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf).
Processing in a pipeline
We used the ->> macro to express our process as a pipeline. It threads the result of each
form into the next form as its last argument. For deeply nested function calls, this macro
lets you read the code from the left-hand side to the right-hand side, which makes the
process's data flow and series of transformations much clearer.
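To make the difference concrete, here is the same computation written both ways. The computation itself (summing the squares of the odd numbers below 10) and the names nested and piped are chosen only for illustration.

```clojure
;; Nested form: has to be read from the inside out.
(def nested
  (reduce + (map #(* % %) (filter odd? (range 10)))))

;; Pipeline form: ->> passes each result as the last argument of the next form.
(def piped
  (->> (range 10)
       (filter odd?)
       (map #(* % %))
       (reduce +)))

;; Both evaluate to 165.
```

The two definitions expand to the same calls. The pipeline version simply lists the steps in the order in which they happen.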
 