Many of the recipes in this chapter and in this book use Incanter (http://incanter.org/)
to import data and target Incanter datasets. Incanter is a library for statistical
analysis and graphics in Clojure, similar to R (http://www.r-project.org/), an open source
language for statistical computing. Incanter might not be suitable for every task
(for example, we'll use the Weka library for machine learning later), but it is still an important
part of our toolkit for data analysis in Clojure. This chapter is a collection of recipes
for gathering data and making it accessible to Clojure.
For the very first recipe, we'll take a look at how to start a new project. We'll start with very
simple formats such as comma-separated values (CSV) and move into reading data from
relational databases using JDBC. We'll examine more complicated data sources, such as
web scraping and linked data (RDF).
Creating a new project
Over the course of this book, we're going to use a number of third-party libraries and external
dependencies. We will need a tool to download them and track them. We also need a tool to
set up the environment and either start a REPL (read-eval-print loop, or interactive interpreter)
that can access our code or execute our program. A REPL lets you program interactively. It's a
great environment for exploratory programming, whether that means exploring
library APIs or exploring data.
We'll use Leiningen for this (http://leiningen.org/). It has become a standard
package automation and management system.
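As a small illustration of the exploratory style a REPL allows, here is a sketch of a session that pokes at a library API and then at a bit of data. It uses only clojure.string from the standard library; the sample strings and the name row are made up for this example. Each form would be typed at the prompt one at a time.

```clojure
;; Load a library namespace under a short alias.
(require '[clojure.string :as str])

;; Explore an API by calling it on a small, made-up input.
(str/split "id,name,score" #",")
;; => ["id" "name" "score"]

;; Explore data: bind a sample row and inspect it piece by piece.
(def row (str/split "1,Ada,97" #","))
(count row)
;; => 3
(first row)
;; => "1"
```

Because every form is evaluated immediately, a wrong guess about an API costs only a few seconds to correct.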
Getting ready
Visit the Leiningen site and download the lein script. The script will download the Leiningen JAR
file when it's needed. The instructions are clear, and it's a simple process.
How to do it...
To generate a new project, use the lein new command, passing the name of the project
to it:
$ lein new getting-data
Generating a project called getting-data based on the default template.
To see other templates (app, lein plugin, etc), try lein help new.
There will be a new subdirectory named getting-data. It will contain files with stubs for the
getting-data.core namespace and for tests.
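The new project also contains a project.clj file, which is where Leiningen tracks the project's dependencies. The following is a minimal sketch of how Incanter might be declared there, not the exact file the template generates. The version strings are assumptions chosen for illustration, so check for current releases before using them.

```clojure
;; project.clj (sketch; version numbers are assumptions)
(defproject getting-data "0.1.0-SNAPSHOT"
  ;; Each dependency is a vector of artifact name and version string.
  :dependencies [[org.clojure/clojure "1.10.1"]
                 [incanter "1.9.3"]])
```

With a declaration like this in place, running lein deps or starting a REPL with lein repl from the project directory should download any missing libraries.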