Example 6-3. Drake workflow with dependencies (02.drake)
NUM:=5
BASE=data/

top.html <- [-timecheck]
    curl -s 'http://www.gutenberg.org/browse/scores/top' > $OUTPUT

top-$[NUM] <- top.html
    < $INPUT grep -E '^<li>' |
    head -n $[NUM] |
    sed -E "s/.*ebooks\/([0-9]+)\">([^<]+)<.*/\\1,\\2/" > $OUTPUT
You can define variables in Drake, preferably at the beginning of the file, by
writing the variable name, then an equals sign, and then the value. Variable
names don't have to be in all capitals, but capitals make them stand out more.
As you can see, for the variable NUM we have used the notation := instead of
= . This means that if NUM is already set, it will not be overridden, which
allows us to specify the value of NUM from the command line before we run
Drake.
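The "set only if not already set" behavior has a close analogue in plain
shell, which may help make it concrete. The sketch below is ordinary shell
parameter expansion, not Drake syntax, and the helper variables first and
second are introduced only for illustration:

```shell
# Plain-shell analogue of Drake's NUM:=5 (an illustration, not Drake syntax).

# Case 1: NUM is not set beforehand, so the default of 5 is assigned.
unset NUM
: "${NUM:=5}"
first=$NUM

# Case 2: NUM is already set, so the default does not override it.
NUM=10
: "${NUM:=5}"
second=$NUM

echo "unset beforehand: $first"
echo "set beforehand:   $second"
```

In the first case the value ends up as 5; in the second it stays 10.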
The BASE variable is a special variable. Drake will treat every file specified in the
workflow as if it were in this base directory.
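As a rough illustration of the effect, the following shell sketch prefixes
the two file names from the workflow with the base directory. It only mimics
the resulting paths; how Drake resolves them internally is not shown here,
and the variable resolved is a name chosen for this example:

```shell
# Mimic how BASE changes the paths of the files named in the workflow.
BASE=data/
NUM=5

# top.html and top-$NUM are the two targets defined in 02.drake.
resolved=$(for name in top.html "top-$NUM"; do
  echo "${BASE}${name}"
done)

echo "$resolved"
```

The two paths printed, data/top.html and data/top-5, are the ones Drake
reports when the workflow is run.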
We now have two steps. The first step has the same input as before, but its
output is now a different file, namely, top.html . That same file is declared
as the input of step 2, which is how Drake knows that the second step depends
on the first step.
We have used two more special variables: INPUT and OUTPUT . Values of these two
special variables are set to what we have defined as the input and output of that
step, respectively. This way we don't have to specify the input and output of a
certain step twice. Furthermore, it allows us to easily reuse certain steps in
future workflows.
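To see what the pipeline in the second step produces, it can be tried outside
Drake on a few lines of input. The sketch below uses a small, made-up HTML
snippet in place of top.html (the book titles and numbers are hypothetical),
and an ordinary shell variable NUM in place of Drake's $[NUM]:

```shell
# Hypothetical stand-in for top.html: one heading and three list items.
sample='<h2>Top</h2>
<li><a href="/ebooks/11">Book A (300)</a></li>
<li><a href="/ebooks/22">Book B (200)</a></li>
<li><a href="/ebooks/33">Book C (100)</a></li>'

NUM=2

# Same three commands as in step 2: keep the list items, take the first
# NUM of them, and rewrite each one as "id,title".
result=$(printf '%s\n' "$sample" |
  grep -E '^<li>' |
  head -n "$NUM" |
  sed -E "s/.*ebooks\/([0-9]+)\">([^<]+)<.*/\\1,\\2/")

echo "$result"
```

With this sample input, each remaining line is reduced to the ebook ID and
the link text, separated by a comma.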
Let's execute this new workflow using Drake:
$ drake -w 02.drake
The following steps will be run, in order:
1: data/top.html <- [missing output]
2: data/top-5 <- data/top.html [projected timestamped]
Confirm? [y/n] y
Running 2 steps with concurrence of 1...
--- 0. Running (missing output): data/top.html <-
--- 0: data/top.html <- -> done in 0.89s
--- 1. Running (missing output): data/top-5 <- data/top.html