head -n $[NUM] |
sed -E "s/.*ebooks\/([0-9]+)\">([^<]+)<.*/\\1,\\2/" > $OUTPUT
We can now rebuild the first step by specifying the %html tag:
$ drake -w 03.drake '=%html'
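To see what the tail of that step does to a single line, the same pipeline can be tried directly in a shell, outside Drake. This is only a sketch under stated assumptions: the sample list item is hypothetical (it is not taken from the real page), and NUM is set as an ordinary shell variable because the $[NUM] substitution is performed by Drake, not by the shell.

```shell
# Hypothetical sample of one list item, as it might appear in the downloaded HTML.
sample='<li><a href="/ebooks/1342">Pride and Prejudice by Jane Austen (1234)</a></li>'

# Plain shell variable standing in for Drake's $[NUM] substitution (assumption).
NUM=1

# Same commands as in the step: keep the first NUM lines,
# then rewrite each line as "id,title".
result=$(printf '%s\n' "$sample" |
  head -n "$NUM" |
  sed -E "s/.*ebooks\/([0-9]+)\">([^<]+)<.*/\\1,\\2/")

echo "$result"
```

For this sample line, the command prints 1342,Pride and Prejudice by Jane Austen (1234): the ebook number, a comma, and the link text.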
Discussion
One of the beauties of the command line is that it allows you to play with your data.
You can easily execute different commands and process different datafiles. It's a very
interactive and iterative process. After a while, it is easy to forget which steps you
have taken to get the desired result. It's therefore very important to document your
steps every once in a while. This way, if you or one of your collaborators picks up
your project after some time, the same result can be produced again by executing the
same steps.
This chapter has shown that just putting every command in one Bash script is suboptimal.
We have proposed using Drake as a command-line tool to manage your data
workflow. By using a running example, we have shown you how to define steps and
the dependencies between them. We've also discussed how to use variables and tags.
There's nothing more fun than just playing with your data and forgetting everything
else. But trust us when we say that it's worthwhile to keep a record of what you have
done (by means of a Drake workflow). Not only will it make your life easier, but you
will also start thinking about your data workflow in terms of steps. Just as your
own data science toolbox, which you expand over time, makes you more efficient,
Drake workflows make for a more organized setup. The more steps you have
defined, the easier it gets to keep doing it, because very often you can reuse certain
steps. We hope that you will get used to Drake, and that it'll make your life easier.
This chapter has only scratched the surface of all Drake has to offer. Some of its more
advanced features are:
• Asynchronous execution of steps
• Support for inline Python and R code
• Uploading and downloading data to and from HDFS and S3
Further Reading
• Factual. (2014). Drake. Retrieved from https://github.com/Factual/drake.