Bash — parallel command execution

There are three ways to execute commands in parallel using bash:

  • plain bash: no external dependencies to install
  • parallel: a tool built specifically for running commands in parallel
  • xargs: a Swiss Army knife that every Linux user should know

plain bash

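Save the following as configuration.cfg. Each line holds a directory to search, a file name pattern, and the file that receives the results, separated by commas:

/var,*.log,log_res.txt
/var,*.gz,gzip_res.txt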

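Save the following as bash_plain_parallel.sh:

#!/bin/bash
# Each configuration line has the form DIRECTORY,PATTERN,RESULT_FILE
for n in $(cat ./configuration.cfg)
do
    DIRECTORY=$(echo "$n" | cut -d "," -f 1)
    FILES=$(echo "$n" | cut -d "," -f 2)
    RESULTS=$(echo "$n" | cut -d "," -f 3)
    # Single quotes inside the string stop the shell from expanding the pattern
    CMD="find '${DIRECTORY}' -name '${FILES}' > '${RESULTS}'"
    # Run the command in the background
    eval "${CMD}" &
done
# Block until every background job has finished
wait
echo "Script execution completed"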

Make the file executable:

$ chmod +x bash_plain_parallel.sh

Explaining the script

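The script reads configuration.cfg and stores each entry in the $n variable. The loop splits the file on whitespace, so it sees one entry per line only as long as the lines themselves contain no spaces:

for n in $(cat ./configuration.cfg)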

Each $n entry is then split using "," as the delimiter:

DIRECTORY=$(echo "$n" | cut -d "," -f 1)
FILES=$(echo "$n" | cut -d "," -f 2)
RESULTS=$(echo "$n" | cut -d "," -f 3)

Next we build the command to execute by concatenating fixed text with the parameters parsed from the configuration file. The single quotes keep the shell from expanding a pattern such as *.log before find sees it:

CMD="find '${DIRECTORY}' -name '${FILES}' > '${RESULTS}'"

We execute the command using eval. The & at the end of the line tells bash to run the command in the background:

eval "${CMD}" &
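Because eval re-parses its argument as shell code, a path or pattern containing spaces or shell metacharacters could change the command that actually runs. A minimal sketch of an eval-free variant follows. It assumes the same three-field, comma-separated format, and it searches a throwaway directory (the demo variable and the file names in it are made up for illustration) rather than /var so that it can be run anywhere.

```shell
#!/bin/bash
# Hypothetical demo data: a scratch directory standing in for /var.
demo=$(mktemp -d)
touch "$demo/a.log" "$demo/b.log" "$demo/c.gz"

# Same format as configuration.cfg: DIRECTORY,PATTERN,RESULT_FILE
cat > "$demo/configuration.cfg" <<EOF
$demo,*.log,$demo/log_res.txt
$demo,*.gz,$demo/gzip_res.txt
EOF

# IFS=, splits each line into three fields; read -r keeps backslashes literal.
while IFS=, read -r directory pattern results
do
    # Quoted expansions reach find unchanged, so no eval is needed.
    find "$directory" -name "$pattern" > "$results" &
done < "$demo/configuration.cfg"

wait
echo "Script execution completed"
```

Reading the file with while read also means a line is only ever split on commas, so a directory name containing spaces would survive intact.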

Finally, we wait for all the background processes to complete using the wait command. With no arguments, wait blocks until every background job started from the current shell has terminated. Using it is good practice because the script can then do something, such as print a message, once all the background commands have finished:

wait
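A bare wait returns once every background job has finished, but it does not report whether any of them failed. The sketch below, which assumes bash and uses two placeholder jobs (sleep, true and false stand in for real commands), records each PID from $! and waits on them one by one so that failures can be counted.

```shell
#!/bin/bash
pids=()

# Placeholder jobs: the first succeeds, the second fails.
( sleep 0.2; true ) &
pids+=("$!")
( sleep 0.1; false ) &
pids+=("$!")

failed=0
for pid in "${pids[@]}"
do
    # "wait PID" returns the exit status of that particular job.
    if ! wait "$pid"; then
        failed=$((failed + 1))
    fi
done

echo "Jobs failed: $failed"
```

The same loop could exit with a non-zero status when failed is greater than zero, which makes the script usable from cron jobs or CI pipelines that check exit codes.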

parallel

parallel does what the plain bash solution does, but with more options for tuning. It is usually not installed by default and needs to be installed separately.

Create the following file and save it as parallel_config.cfg. Each line is a command that will be executed; the patterns are quoted so that the shell does not expand them before find runs:

find /var -name '*.gz' > gz_files.txt
find /var -name '*.log' > log_files.txt

The commands can be executed with:

$ parallel -j 2 < parallel_config.cfg

  • -j is the maximum number of parallel jobs
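To try this without touching /var, the sketch below builds a small command file in a scratch directory and feeds it to parallel. It assumes the GNU implementation of parallel; the demo directory and file names are hypothetical, and the script skips the run if GNU parallel is not found.

```shell
#!/bin/bash
# Hypothetical scratch directory standing in for /var.
demo=$(mktemp -d)
touch "$demo/a.gz" "$demo/b.log"

# Same idea as parallel_config.cfg: one complete command per line.
cat > "$demo/parallel_config.cfg" <<EOF
find "$demo" -name '*.gz' > "$demo/gz_files.txt"
find "$demo" -name '*.log' > "$demo/log_files.txt"
EOF

# Run only if GNU parallel (not another tool of the same name) is present.
if command -v parallel >/dev/null 2>&1 &&
   parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    parallel -j 2 < "$demo/parallel_config.cfg"
    echo "Both searches finished"
else
    echo "GNU parallel not found; skipping"
fi
```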

xargs

The true power of xargs comes from its ability to parallelize tasks fed to it by another command:

$ find . -name '*.txt' | xargs -P10 -I {} grep -i 'something' {}

xargs reads the file names produced by find from stdin and runs at most 10 grep commands at a time (-P10). In each command, {} is replaced by one of those file names (-I {}).
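By default xargs treats quotes and backslashes in its input specially, so a file name such as it's.txt can break the pipeline. A minimal sketch of a more defensive variant follows, assuming find and xargs support -print0 and -0 (the GNU and BSD versions do). The scratch directory, file names and search word are made up for illustration.

```shell
#!/bin/bash
# Hypothetical demo files, including names with a space and a quote.
demo=$(mktemp -d)
printf 'Something here\n' > "$demo/first file.txt"
printf 'nothing to see\n' > "$demo/it's second.txt"
printf 'SOMETHING else\n' > "$demo/third.txt"

# -print0 / -0 delimit names with NUL bytes, so spaces and quotes are safe.
# -P 4 runs up to four grep processes at once; -l prints matching file names.
matches=$(find "$demo" -name '*.txt' -print0 |
    xargs -0 -P 4 -I {} grep -il 'something' {} | sort)

echo "$matches"
```

The sort at the end is there only because parallel jobs finish in no particular order; drop it if the order of results does not matter.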
