The command line rules
If you ever wonder whether your GNU Parallel command is set up
correctly, you can add the --dryrun option. Instead of actually
executing the commands, GNU Parallel will print them exactly as
they would have been executed.
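As a minimal sketch of this tip, assuming GNU Parallel is installed and on your PATH, the following previews the commands for three inputs instead of running them. The --keep-order option is an addition made here for illustration, not part of the tip itself; it keeps the printed lines in the same order as the input.

```shell
# Preview the commands without executing them.
# Assumes GNU Parallel is installed; otherwise a notice is printed and nothing runs.
if command -v parallel >/dev/null 2>&1; then
    seq 3 | parallel --dryrun --keep-order "echo Hi {}"
else
    echo "GNU Parallel not found" >&2
fi
```

Each printed line is the full command that would have been run for one input value, so a wrong placeholder or a quoting mistake shows up before anything is executed.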
Controlling the Number of Concurrent Jobs
By default, parallel runs one job per CPU core. You can control the number
of jobs that run in parallel with the --jobs or -j option. Specifying
a number, say n, means that n jobs will be run in parallel. If you put a plus sign in
front of the number n, then parallel will run m+n jobs, where m is the number of
CPU cores. If you put a minus sign in front of the number, then parallel will run
m-n jobs. You can also pass a percentage to the -j option; the default is 100% of
the number of CPU cores. The optimal number of jobs to run in parallel depends on
the actual commands you are running:
$ seq 5 | parallel -j0 "echo Hi {}"
Hi 1
Hi 2
Hi 3
Hi 4
Hi 5
$ seq 5 | parallel -j200% "echo Hi {}"
Hi 1
Hi 2
Hi 3
Hi 4
Hi 5
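The rules for the -j option can be sketched as a small shell function. Note that jobs_for is a hypothetical helper written for illustration and is not part of GNU Parallel; the core count m is passed in by hand, and both the lower bound of one job and the integer rounding of percentages are assumptions of this sketch.

```shell
# Hypothetical helper (not part of GNU Parallel): given a -j value and the
# number of CPU cores m, print how many jobs would run at the same time.
jobs_for() {
    spec=$1   # value passed to -j, for example 3, +2, -1, 200%
    m=$2      # number of CPU cores
    case $spec in
        0)  echo "unlimited"; return ;;              # -j0: as many jobs as possible
        +*) k=${spec#+}; n=$(( m + k )) ;;           # -j+n: m+n jobs
        -*) k=${spec#-}; n=$(( m - k )) ;;           # -j-n: m-n jobs
        *%) p=${spec%\%}; n=$(( m * p / 100 )) ;;    # -jP%: P percent of m
        *)  n=$spec ;;                               # -jn: exactly n jobs
    esac
    # Assumption of this sketch: never report fewer than one job.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

jobs_for 3 4      # prints 3
jobs_for +2 4     # prints 6
jobs_for -1 4     # prints 3
jobs_for 200% 4   # prints 8
jobs_for 0 4      # prints unlimited
```

On a machine with four cores, for example, -j200% corresponds to eight simultaneous jobs and -j-1 leaves one core free.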
If you specify -j1, then the commands will be run in serial. Even though this doesn't
do the name of the tool justice, it still has its uses, for example when you need to
access an API that only allows one connection at a time. If you specify -j0, then
parallel will run as many jobs in parallel as possible. This can be compared to
looping with subshells, which is not advised.
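For comparison, looping with subshells might look like the following sketch in plain shell, with no GNU Parallel involved. The function name hi_with_subshells is made up for this example. Every iteration is started in the background at once, with no limit on the number of concurrent processes, and the order of the output lines is not guaranteed.

```shell
# Start one background subshell per input value, all at the same time.
hi_with_subshells() {
    for i in $(seq 5); do
        (echo "Hi $i") &
    done
    wait   # block until every background subshell has finished
}

hi_with_subshells
```

With five inputs this is harmless, but with thousands of inputs the same loop would launch thousands of processes at once, which is why it is not advised.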
Logging and Output
To save the output of each command, you might be tempted to do the following:
$ seq 5 | parallel "echo \"Hi {}\" > data/hi-{}.txt"
This will save the output into individual files. Or, if you want to save everything into
one big file, you could do the following: