zcat 3.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
zcat 4.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
zcat 5.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
zcat 6.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
zcat 7.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
zcat 8.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
zcat 9.json.gz | ./jq -r '.borough' | sort | uniq -c | awk '{print $2","$1}'
This long command breaks down as follows:
Print the list of files using ls and pipe it into parallel.
Transmit the jq binary to each remote machine. (Luckily, jq has no dependencies.)
This file will be removed from the remote machine at the end because we
specified the --trc option (which implies the --cleanup option).
The --trc {.}.csv option is short for --transfer --return {.}.csv --cleanup.
(The replacement string {.} gets replaced with the input filename
without the last extension.) Here, this means that the JSON file gets transferred
to the remote machine, the CSV file gets returned to the local machine, and both
files will be removed from the remote machine after each job.
Specify a list of hostnames. Remember, if you want to try this out locally, you can
specify --sshlogin : instead of --slf instances.
Note the escaping in the awk expression. Quoting can sometimes be tricky. Here,
the dollar signs and the double quotes are escaped. If quoting ever gets too
confusing, remember that you can turn the pipeline into a separate command-line
tool, just as we did with sum.
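The {.} substitution and the escaping described above can be tried without parallel or any remote machine. The following is only a sketch under stated assumptions: the variable names file, base, and cmd are made up for illustration, the shell's own parameter expansion stands in for {.}, and the redirect into the CSV file is assumed rather than taken from the output shown earlier.

```shell
#!/bin/sh
# Hypothetical stand-in for one filename that parallel would read from ls.
file="1.json.gz"

# Mimic the {.} replacement string: strip only the last extension.
# (This is plain shell parameter expansion, not parallel itself.)
base="${file%.*}"

# Inside double quotes, \$ and \" stop the local shell from expanding
# $2 and $1 or from ending the string early, so awk receives them literally.
# Assumption: the pipeline's output is redirected into the CSV file.
cmd="zcat $file | ./jq -r '.borough' | sort | uniq -c | awk '{print \$2\",\"\$1}' > $base.csv"

# Show the command line as the remote shell would receive it.
printf '%s\n' "$cmd"
```

The printed line contains awk '{print $2","$1}' with the backslashes gone, and the output name is 1.json.csv, which matches the filename seen on the remote machine.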
If we run ls on one of the remote machines while this command is executing, we
would see that parallel indeed transfers (and cleans up) the binary jq, the JSON
files, and the CSV files:
$ ssh $(head -n 1 instances) ls
1.json.csv
1.json.gz
jq
Each CSV file looks like this:
$ cat 1.json.csv
bronx,3
brooklyn,5
manhattan,24
queens,3
staten_island,2
unspecified,63
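To see how the counting stage of the pipeline produces lines in this shape, it can be run on a handful of values without jq or any JSON files. This is a minimal sketch: the borough values below are made up and merely stand in for what jq -r '.borough' would print, and the variable name counts is hypothetical.

```shell
#!/bin/sh
# sort groups identical values on adjacent lines, uniq -c prefixes each
# distinct value with its count, and awk swaps the two columns and joins
# them with a comma.
# The input values are invented sample data, not taken from the real files.
counts=$(printf '%s\n' queens bronx manhattan bronx manhattan manhattan |
  sort | uniq -c | awk '{print $2","$1}')

printf '%s\n' "$counts"
```

For this sample input the script prints bronx,2, manhattan,3, and queens,1, one per line.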