Database Reference
In-Depth Information
Fortunately, there is a command-line tool called
sql2csv
, which is part of the Csvkit
suite. Because it leverages the Python SQLAlchemy package, we only have to use one
tool to execute queries on many different databases, including MySQL, Oracle, Post‐
greSQL, SQLite, Microsoft SQL Server, and Sybase. The output of
sql2csv
is, as its
name suggests, in CSV format.
We can obtain data from relational databases by executing a
SELECT
query on them.
(
sql2csv
also support
INSERT
,
UPDATE
, and
DELETE
queries, but that's not the purpose
of this chapter.) To select a specific set of data from an SQLite database named
iris.db
,
sql2csv
can be invoked as follows:
$
sql2csv --db
'sqlite:///data/iris.db'
--query
'SELECT * FROM iris '
\
>
'WHERE sepal_length > 7.5'
sepal_length,sepal_width,petal_length,petal_width,species
7.6,3.0,6.6,2.1,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
Here, we're selecting all rows where
sepal_length
is larger than 7.5. The
--db
option
specifies the database URL, of which the typical form is:
dialect+driver://user
name:password@host:port/database
.
Downloading from the Internet
The Internet provides without a doubt the largest resource for data. This data is avail‐
able in various forms, using various protocols. The command-line tool cURL (Sten‐
berg, 2012) can be considered the command line's Swiss Army knife when it comes to
downloading data from the Internet.
When you access a URL, which stands for
uniform resource locator
, through your
browser, the data that is downloaded can be interpreted. For example, an HTML file
is rendered as a website, an MP3 file may be automatically played, and a PDF file may
be automatically opened by a viewer. However, when cURL is used to access a URL,
the data is downloaded as is, and is printed to standard output. Other command-line
tools may then be used to process this data further.
The easiest invocation of
curl
is to simply specify a URL as a command-line argu‐
ment. For example, to download Mark Twain's
Adventures of Huckleberry Finn
from
Project Gutenberg, we can run the following command:
$
curl -s http://www.gutenberg.org/cache/epub/76/pg76.txt | head -n 10
The Project Gutenberg EBook of Adventures of Huckleberry Finn, Complete
by Mark Twain (Samuel Clemens)