Database Reference
In-Depth Information
Fanny and Alexander (1982),FannyandAlexander(1982),1982,210,41,8,y,n/a,n/a,
Beauty and the Beast (1991),BeautyandtheBeast(1991),1991,249,2,8,y,n/a,n/a,
As you can see, CSV by default is not too readable. You can pipe the data to a tool
called csvlook (Groskopf, 2014), which will nicely format the data into a table. Here,
we'll display a subset of the columns using csvcut such that the table fits on the page:
$ in2csv data/imdb-250.xlsx | head | csvcut -c Title,Year,Rating | csvlook
|------------------------------------------+------+---------|
| Title | Year | Rating |
|------------------------------------------+------+---------|
| Sherlock Jr. (1924) | 1924 | 8 |
| The Passion of Joan of Arc (1928) | 1928 | 8 |
| His Girl Friday (1940) | 1940 | 8 |
| Tokyo Story (1953) | 1953 | 8 |
| The Man Who Shot Liberty Valance (1962) | 1962 | 8 |
| Persona (1966) | 1966 | 8 |
| Stalker (1979) | 1979 | 8 |
| Fanny and Alexander (1982) | 1982 | 8 |
| Beauty and the Beast (1991) | 1991 | 8 |
|------------------------------------------+------+---------|
A spreadsheet can contain multiple worksheets. By default, in2csv extracts the first
worksheet. To extract a different worksheet, you need to pass the name of worksheet
to the --sheet option.
The tools in2csv , csvcut , and csvlook are actually part of Csvkit, which is a collec‐
tion of command-line tools to work with CSV data. Csvkit will be used quite often in
this topic because it has so many valuable tools. If you're running the Data Science
Toolbox, you already have Csvkit installed. Otherwise, see Appendix A for instruc‐
tions on how to install it.
An alternative approach to in2csv is to open the spreadsheet in
Microsoft Excel or an open source variant such as LibreOffice Calc,
and manually export it to CSV. While this works as a one-off solu‐
tion, the disadvantage is that it does not scale well to multiple files
and is not automatable. Furthermore, when you are working on the
command line of a remote server, chances are that you don't have
such an application available.
Querying Relational Databases
Most companies store their data in a relational database. Examples of relational data‐
bases are MySQL, PostgreSQL, and SQLite. These databases all have a slightly differ‐
ent way of interfacing with them. Some provide a command-line tool or a command-
line interface, while others do not. Moreover, they are not very consistent when it
comes to their usage and output.
Search WWH ::




Custom Search