This question already has answers here: While loop stops reading after the first line in Bash
I've created a bash script to connect to a number of servers and execute a program. The IPs and the quantity per IP should be read from a config file that is structured like this:
127.0.0.1 10
127.0.0.1 1
127.0.0.1 3
etc
j=$((0))
while IFS=' ' read -r ip quantity; do
echo "${ip} x ${quantity}";
for (( i = 1; i <= quantity; i++ ))
do
echo "ssh root#${ip} cd test/libhotstuff && ./examples/hotstuff-app --conf ./hotstuff.gen-sec${j}.conf > log${j} 2>&1"
ssh root#"${ip}" "cd test/libhotstuff && ./examples/hotstuff-app --conf ./hotstuff.gen-sec${j}.conf > log${j} 2>&1" &
j=$((j+1))
done
sleep 1
done < ips
I noticed that this while loop stops early if the execution takes too long. With the sleep 1 in place it stops after the first iteration of the outer loop; if I remove it but the inner loop takes too long, a subset of the lines is never read.
What is the problem here?
Here's a version that starts your background processes with a 1 second delay between each, waits 6 minutes, and then kills them one by one with a 1 second delay between each, to give them approximately the same running time.
You should also add some options to ssh to prevent it from interfering with stdin and terminating your loop prematurely while running (see the demonstration after this list).
-n
Prevents reading from stdin
-oBatchMode=yes
Passphrase/password querying will be disabled
-oStrictHostKeyChecking=no
Connect to host even if the host key has changed
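A minimal demonstration of why -n matters here (hosts is a hypothetical file with one hostname per line):
while read -r host; do ssh "$host" true; done < hosts     # ssh drains stdin: only the first line is processed
while read -r host; do ssh -n "$host" true; done < hosts  # -n detaches stdin: every line is processed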
#!/bin/bash
sshopts=(-n -oBatchMode=yes -oStrictHostKeyChecking=no)
j=0
pids=()
while IFS=$' \t\n' read -r ip quantity; do
echo "${ip} x ${quantity}";
for (( i = 0; i < quantity; ++i ))
do
remotecmd="cd test/libhotstuff && ./examples/hotstuff-app --conf ./hotstuff.gen-sec${j}.conf > log${j} 2>&1"
localcmd=(ssh "${sshopts[@]}" "root@${ip}" "$remotecmd")
echo "${localcmd[@]}"
"${localcmd[@]}" &
# store the background pid
pids+=($!)
(( ++j ))
sleep 1
done
done < ips
seconds=360
echo "running ${pids[#]} in the background $seconds seconds"
sleep $seconds
echo "telling the background processes to terminate"
for pid in "${pids[@]}"
do
echo killing $pid
kill $pid
sleep 1
done
echo "waiting for all the background processes to terminate"
wait
echo Done
Here is a version that offloads the loop and the parallel processes to the remote shell. It generates a remote shell script from a here-document parameterized with quantity, and waits for all the background processes to terminate before exiting.
#!/usr/bin/env sh
while IFS=$' \t\n\r' read -r ip quantity || [ -n "$quantity" ]
do
{
# When satisfied by the output:
# Uncomment the line below and delete the following line with the echo and cat
# ssh "root#$ip" <<EOF
echo ssh "root#$ip"; cat <<EOF
if cd test/libhotstuff
then
i=$quantity
until
i=\$((i - 1))
[ \$i -lt 0 ]
do
./examples/hotstuff-app \\
--conf "./hotstuff.gen-sec\$i.conf" >"log\$i" 2>&1 &
done
wait
fi
EOF
} &
done <ips
# Wait for all child processes to terminate
wait
echo "All child ssh done!"
Another way, replacing the dynamic here-document with an inline shell script called with a quantity argument:
#!/usr/bin/env sh
while IFS=$' \t\n\r' read -r ip quantity || [ -n "$quantity" ]; do
echo ssh "root#$ip" sh -c '
if cd test/libhotstuff
then
i=0
while [ $i -lt "$1" ]; do
./examples/hotstuff-app --conf "./hotstuff.gen-sec$i.conf" >"log$i" 2>&1 &
i=$((i + 1))
done
wait
fi
' _ "$quantity" &
done <ips
# Wait for all child processes to terminate
wait
echo "All child ssh done!"
Here is my code:
count=0
head -n 10 urls.txt | while read LINE; do
curl -o /dev/null -s "$LINE" -w "%{time_total}\n" &
count=$((count+1))
[ 0 -eq $((count % 3)) ] && wait && echo "process wait" # wait for 3 urls
done
echo "before wait"
wait
echo "after wait"
I am expecting the last curl to finish before printing the last echo, but that's not the case:
0.595499
0.602349
0.618237
process wait
0.084970
0.084243
0.099969
process wait
0.067999
0.068253
0.081602
process wait
before wait
after wait
➜ Downloads 0.088755 # already exited the script
Does anyone know why it's happening? And how to fix this?
As described in BashFAQ #24, this is caused by your pipeline causing the while loop to be performed in a different shell from the rest of your script.
Consequently, your curls are subprocesses of that subshell, not the outer interpreter; so the outer interpreter cannot wait for them.
This can be resolved by not piping to while read, but instead redirecting its input in a way that doesn't shuffle it into a pipeline element -- as with <(...), a process substitution:
#!/usr/bin/env bash
# ^^^^ - NOT /bin/sh; also, must not start with "sh scriptname"
count=0
while IFS= read -r line; do
curl -o /dev/null -s "$line" -w "%{time_total}\n" &
count=$((count+1))
(( count % 3 == 0 )) && { wait; echo "process wait"; } # wait for 3 urls
done < <(head -n 10 urls.txt)
echo "before wait"
wait
echo "after wait"
why it's happening?
Because you run the processes in the subshell, the parent process can't wait for them.
$ echo | { echo subshell; sleep 100 & }
$ wait # exits immediately
$
Call wait from the same process in which the background processes were spawned:
someotherthing | {
while someotherthing; do
something &
done
wait # will wait for something
}
And how to fix this?
I recommend not using a crude while read loop for this; a dedicated tool handles it better. Use GNU xargs with the -P option to run 3 processes concurrently:
head -n 10 urls.txt | xargs -P3 -n1 -d '\n' curl -o /dev/null -w "%{time_total}\n" -s
But you could also just move wait into the subshell as above, or alternatively make the while loop execute in the parent shell (see the sketch below).
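A minimal sketch of that parent-shell alternative, using bash's lastpipe option (bash 4.2+; it only takes effect when job control is off, as in a non-interactive script):
#!/usr/bin/env bash
shopt -s lastpipe  # run the last element of a pipeline in the current shell
count=0
head -n 10 urls.txt | while IFS= read -r line; do
  curl -o /dev/null -s "$line" -w "%{time_total}\n" &
  count=$((count+1))
  (( count % 3 == 0 )) && { wait; echo "process wait"; }
done
wait  # now waits for the curls, since they were started by this shell
echo "after wait"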
I would like to scan multiple ports on multiple hosts. I used this script but it takes a long time to show the result.
#!/bin/bash
hosts=(
"server1"
"server2"
)
for host in "${hosts[#]}"
do
echo "=========================================="
echo "Scanning $host"
echo "=========================================="
for port in {21,22,80}
do
echo "" > /dev/tcp/$host/$port && echo "Port $port is open"
done 2>/dev/null
done
Some people suggested using telnet or netcat instead, but I prefer to do it without installing any new packages. So, are there any ways to speed it up, with multithreading or some other way?
You could use GNU Parallel to run all the checks in parallel. I am not the best at using it, and @OleTange (the author) normally has to correct me, but I keep trying. So, let's try your case, by building up to it slowly:
parallel echo {1} {2} ::: 192.168.0.1 192.168.0.8 ::: 21 22 80
192.168.0.8 22
192.168.0.8 80
192.168.0.8 21
192.168.0.1 80
192.168.0.1 22
192.168.0.1 21
looks kind of hopeful to me. Then I add in -k to keep the results in order, and I supply a command that takes those IP addresses and ports as arguments:
parallel -k 'echo "" > /dev/tcp/{1}/{2} && echo {1}:{2} is open' ::: 192.168.0.1 192.168.0.8 ::: 21 22 80 2>/dev/null
192.168.0.1:80 is open
192.168.0.8:21 is open
192.168.0.8:22 is open
192.168.0.8:80 is open
This will run 8 jobs in parallel if your CPU has 8 cores. However, echo is not very resource intensive, so you can probably run 32 in parallel; just add -j 32 after the -k:
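parallel -k -j 32 'echo "" > /dev/tcp/{1}/{2} && echo {1}:{2} is open' ::: 192.168.0.1 192.168.0.8 ::: 21 22 80 2>/dev/null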
If you wanted to stick closer to your own script, you can do it like this:
#!/bin/bash
hosts=(
"192.168.0.1"
"192.168.0.8"
)
for host in "${hosts[#]}"
do
for port in {21,22,80}
do
echo "(echo > /dev/tcp/$host/$port) 2>/dev/null && echo Host:$host Port:$port is open"
done
done | parallel -k -j 32
Basically, instead of running your commands, I am just sending them to the stdin of parallel so it can do its magic with them.
You could run all three pokes in the background, then wait for them all to finish, and probably slash the running time to 1/3.
for port in 21 22 80; do
echo "" > /dev/tcp/$host/$port 2>/dev/null &
pid[$port]=$!
done
for port in 21 22 80; do
wait "${pid[$port]}" && echo "Port $port is open"
done
You could add parallelism by running multiple hosts in the background, too, but that should be an obvious extension.
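A hedged sketch of that extension, wrapping each host's scan in its own background subshell (hosts is the array from the question):
for host in "${hosts[@]}"; do
  (
    for port in 21 22 80; do
      (echo > /dev/tcp/$host/$port) 2>/dev/null && echo "Host:$host Port:$port is open" &
    done
    wait  # wait for this host's three probes
  ) &
done
wait  # wait for all hosts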
#!/bin/bash
function alarm {
local timeout=$1; shift;
# execute command, store PID
bash -c "$#" &
local pid=$!
# sleep for $timeout seconds, then attempt to kill PID
{
sleep "$timeout"
kill $pid 2> /dev/null
} &
wait $pid 2> /dev/null
return $?
}
function scan {
if [[ -z $1 || -z $2 ]]; then
echo "Usage: ./scanner <host> <port, ports, or port-range>"
echo "Example: ./scanner google.com 79-81"
return
fi
local host=$1
local ports=()
# store user-provided ports in array
case $2 in
*-*)
IFS=- read start end <<< "$2"
for ((port=start; port <= end; port++)); do
ports+=($port)
done
;;
*,*)
IFS=, read -ra ports <<< "$2"
;;
*)
ports+=($2)
;;
esac
# attempt to write to each port, print open if successful, closed if not
for port in "${ports[#]}"; do
alarm 1 "echo >/dev/tcp/$host/$port" &&
echo "$port/tcp open" ||
echo "$port/tcp closed"
done
}
scan $1 $2
I have more than 10 tasks to execute, and the system restricts me to at most 4 tasks running at the same time.
My task can be started like:
myprog taskname
How can I write a bash shell script to run these tasks? The most important thing is that when one task finishes, the script starts another immediately, keeping the running task count at 4 all the time.
Use xargs:
xargs -P <maximum-number-of-process-at-a-time> -n <arguments-per-process> <command>
Details here.
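A minimal concrete invocation for this case, assuming a hypothetical tasks.txt listing one task name per line:
xargs -P 4 -n 1 myprog < tasks.txt
-P 4 keeps up to four myprog processes running, and -n 1 passes one task name to each invocation; as soon as one finishes, xargs starts the next.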
I chanced upon this thread while looking into writing my own process pool and particularly liked Brandon Horsley's solution, though I couldn't get the signals working right, so I took inspiration from Apache and decided to try a pre-fork model with a fifo as my job queue.
The following function is the function that the worker processes run when forked.
# \brief the worker function that is called when we fork off worker processes
# \param[in] id the worker ID
# \param[in] job_queue the fifo to read jobs from
# \param[in] result_log the temporary log file to write exit codes to
function _job_pool_worker()
{
local id=$1
local job_queue=$2
local result_log=$3
local line=
exec 7<> ${job_queue}
while [[ "${line}" != "${job_pool_end_of_jobs}" && -e "${job_queue}" ]]; do
# workers block on the exclusive lock to read the job queue
flock --exclusive 7
read line <${job_queue}
flock --unlock 7
# the worker should exit if it sees the end-of-job marker or run the
# job otherwise and save its exit code to the result log.
if [[ "${line}" == "${job_pool_end_of_jobs}" ]]; then
# write it one more time for the next sibling so that everyone
# will know we are exiting.
echo "${line}" >&7
else
_job_pool_echo "### _job_pool_worker-${id}: ${line}"
# run the job
{ ${line} ; }
# now check the exit code and prepend "ERROR" to the result log entry
# which we will use to count errors and then strip out later.
local result=$?
local status=
if [[ "${result}" != "0" ]]; then
status=ERROR
fi
# now write the error to the log, making sure multiple processes
# don't trample over each other.
exec 8<> ${result_log}
flock --exclusive 8
echo "${status}job_pool: exited ${result}: ${line}" >> ${result_log}
flock --unlock 8
exec 8>&-
_job_pool_echo "### _job_pool_worker-${id}: exited ${result}: ${line}"
fi
done
exec 7>&-
}
You can get a copy of my solution at Github. Here's a sample program using my implementation.
#!/bin/bash
. job_pool.sh
function foobar()
{
# do something
true
}
# initialize the job pool to allow 3 parallel jobs and echo commands
job_pool_init 3 0
# run jobs
job_pool_run sleep 1
job_pool_run sleep 2
job_pool_run sleep 3
job_pool_run foobar
job_pool_run foobar
job_pool_run /bin/false
# wait until all jobs complete before continuing
job_pool_wait
# more jobs
job_pool_run /bin/false
job_pool_run sleep 1
job_pool_run sleep 2
job_pool_run foobar
# don't forget to shut down the job pool
job_pool_shutdown
# check the $job_pool_nerrors for the number of jobs that exited non-zero
echo "job_pool_nerrors: ${job_pool_nerrors}"
Hope this helps!
Using GNU Parallel you can do:
cat tasks | parallel -j4 myprog
If you have 4 cores, you can even just do:
cat tasks | parallel myprog
From http://git.savannah.gnu.org/cgit/parallel.git/tree/README:
Full installation
Full installation of GNU Parallel is as simple as:
./configure && make && make install
Personal installation
If you are not root you can add ~/bin to your path and install in
~/bin and ~/share:
./configure --prefix=$HOME && make && make install
Or if your system lacks 'make' you can simply copy src/parallel
src/sem src/niceload src/sql to a dir in your path.
Minimal installation
If you just need parallel and do not have 'make' installed (maybe the
system is old or Microsoft Windows):
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
mv parallel sem dir-in-your-$PATH/bin/
Test the installation
After this you should be able to do:
parallel -j0 ping -nc 3 ::: foss.org.my gnu.org freenetproject.org
This will send 3 ping packets to 3 different hosts in parallel and print
the output when they complete.
Watch the intro video for a quick introduction:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
I would suggest writing four scripts, each one of which executes a certain number of tasks in series. Then write another script that starts the four scripts in parallel. For instance, if you have scripts, script1.sh, script2.sh, script3.sh, and script4.sh, you could have a script called headscript.sh like so.
#!/bin/sh
./script1.sh &
./script2.sh &
./script3.sh &
./script4.sh &
I found the best solution proposed in the A Foo Walks into a Bar... blog, using the built-in functionality of the well-known xargs tool.
First create a file commands.txt with the list of commands you want to execute:
myprog taskname1
myprog taskname2
myprog taskname3
myprog taskname4
...
myprog taskname123
and then pipe it to xargs like this to execute them in a pool of 4 processes:
cat commands.txt | xargs -I CMD --max-procs=4 bash -c CMD
You can modify the number of processes as needed.
Following @Parag Sardas' answer and the documentation linked there, here's a quick script you might want to add to your .bash_aliases.
Relinking the doc link because it's worth a read
#!/bin/bash
# https://stackoverflow.com/a/19618159
# https://stackoverflow.com/a/51861820
#
# Example file contents:
# touch /tmp/a.txt
# touch /tmp/b.txt
if [ "$#" -eq 0 ]; then
echo "$0 <file> [max-procs=0]"
exit 1
fi
FILE=${1}
MAX_PROCS=${2:-0}
cat $FILE | while read line; do printf "%q\n" "$line"; done | xargs --max-procs=$MAX_PROCS -I CMD bash -c CMD
I.e.
./xargs-parallel.sh jobs.txt 4 # maximum of 4 processes, reading from jobs.txt
You could probably do something clever with signals.
Note this is only to illustrate the concept, and thus not thoroughly tested.
#!/usr/local/bin/bash
this_pid="$$"
jobs_running=0
sleep_pid=
# Catch alarm signals to adjust the number of running jobs
trap 'decrement_jobs' SIGALRM
# When a job finishes, decrement the total and kill the sleep process
decrement_jobs()
{
jobs_running=$(($jobs_running - 1))
if [ -n "${sleep_pid}" ]
then
kill -s SIGKILL "${sleep_pid}"
sleep_pid=
fi
}
# Check to see if the max jobs are running, if so sleep until woken
launch_task()
{
if [ ${jobs_running} -gt 3 ]
then
(
while true
do
sleep 999
done
) &
sleep_pid=$!
wait ${sleep_pid}
fi
# Launch the requested task, signalling the parent upon completion
(
"$#"
kill -s SIGALRM "${this_pid}"
) &
jobs_running=$((${jobs_running} + 1))
}
# Launch all of the tasks, this can be in a loop, etc.
launch_task task1
launch_task task2
...
launch_task task99
This tested script runs 5 jobs at a time and will start a new job as soon as one finishes (due to the kill of the sleep 10.9 when we get a SIGCHLD). A simpler version of this could use direct polling (change the sleep 10.9 to sleep 1 and get rid of the trap); a sketch follows the script.
#!/usr/bin/bash
set -o monitor
trap "pkill -P $$ -f 'sleep 10\.9' >&/dev/null" SIGCHLD
totaljobs=15
numjobs=5
worktime=10
curjobs=0
declare -A pidlist
dojob()
{
slot=$1
time=$(echo "$RANDOM * 10 / 32768" | bc -l)
echo Starting job $slot with args $time
sleep $time &
pidlist[$slot]=`jobs -p %%`
curjobs=$(($curjobs + 1))
totaljobs=$(($totaljobs - 1))
}
# start
while [ $curjobs -lt $numjobs -a $totaljobs -gt 0 ]
do
dojob $curjobs
done
# Poll for jobs to die, restarting while we have them
while [ $totaljobs -gt 0 ]
do
for ((i=0;$i < $curjobs;i++))
do
if ! kill -0 ${pidlist[$i]} >&/dev/null
then
dojob $i
break
fi
done
sleep 10.9 >&/dev/null
done
wait
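A minimal sketch of that simpler polling variant, with monitor mode and the trap removed (the slot bookkeeping is simplified here, and a random sleep stands in for the real job):
#!/usr/bin/bash
totaljobs=15
numjobs=5
declare -A pidlist
dojob()
{
  slot=$1
  sleep $(( RANDOM % 10 + 1 )) &  # stand-in for the real work
  pidlist[$slot]=$!
  totaljobs=$(( totaljobs - 1 ))
}
# fill the initial slots (assumes totaljobs >= numjobs)
for (( slot=0; slot < numjobs; slot++ )); do dojob $slot; done
# poll once a second; refill any slot whose job has exited
while [ $totaljobs -gt 0 ]; do
  for (( slot=0; slot < numjobs; slot++ )); do
    if ! kill -0 ${pidlist[$slot]} 2>/dev/null; then
      wait ${pidlist[$slot]} 2>/dev/null  # collect its exit status
      dojob $slot
    fi
  done
  sleep 1
done
wait  # let the last batch finish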
The other answer about 4 shell scripts does not fully satisfy me, as it assumes all tasks take approximately the same time and requires manual setup. But here is how I would improve it.
The main script will create symbolic links to executables following a certain naming convention. For example,
ln -s executable1 ./01-task.01
where the numeric prefix is for sorting and the suffix identifies the batch (01-04).
Now we spawn 4 shell scripts that take the batch number as input and do something like this:
for t in $(ls ./*-task.$batch | sort); do
  "$t"
  rm "$t"
done
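And a hedged sketch of spawning those 4 workers, assuming the loop above lives in a hypothetical worker.sh that takes the batch number as $1:
for batch in 01 02 03 04; do
  ./worker.sh $batch &
done
wait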
Look at my implementation of job pool in bash: https://github.com/spektom/shell-utils/blob/master/jp.sh
For example, to run at most 3 processes of cURL when downloading from a lot of URLs, you can wrap your cURL commands as follows:
./jp.sh "My Download Pool" 3 curl http://site1/...
./jp.sh "My Download Pool" 3 curl http://site2/...
./jp.sh "My Download Pool" 3 curl http://site3/...
...
Here is my solution. The idea is quite simple. I create a fifo as a semaphore, where each line stands for an available resource. When reading the queue, the main process blocks if there is nothing left. And, we return the resource after the task is done by simply echoing anything to the queue.
function task() {
local task_no="$1"
# doing the actual task...
echo "Executing Task ${task_no}"
# which takes a long time
sleep 1
}
function execute_concurrently() {
local tasks="$1"
local ps_pool_size="$2"
# create an anonymous fifo as a Semaphore
local sema_fifo
sema_fifo="$(mktemp -u)"
mkfifo "${sema_fifo}"
exec 3<>"${sema_fifo}"
rm -f "${sema_fifo}"
# every 'x' stands for an available resource
for i in $(seq 1 "${ps_pool_size}"); do
echo 'x' >&3
done
for task_no in $(seq 1 "${tasks}"); do
read dummy <&3 # blocks until a resource is available
(
trap 'echo x >&3' EXIT # returns the resource on exit
task "${task_no}"
)&
done
wait # wait until all forked tasks have finished
}
execute_concurrently 10 4
The script above will run 10 tasks, 4 at a time concurrently. You can change the $(seq 1 "${tasks}") sequence to the actual task queue you want to run; see the variation below.
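For example, a hedged variation of the inner loop that reads commands from a hypothetical tasks.txt (one command per line) instead of numbering tasks; it assumes the fd 3 semaphore set up by the function above:
while IFS= read -r cmd; do
  read dummy <&3            # blocks until a resource is available
  (
    trap 'echo x >&3' EXIT  # returns the resource on exit
    eval "$cmd"
  ) &
done < tasks.txt
wait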
I made my modifications based on methods introduced in Writing a process pool in Bash.
#!/bin/bash
#set -e # this doesn't work here for some reason
POOL_SIZE=4 # number of workers running in parallel
#######################################################################
# populate jobs #
#######################################################################
declare -a jobs
for (( i = 1988; i < 2019; i++ )); do
jobs+=($i)
done
echo '################################################'
echo ' Launching jobs'
echo '################################################'
parallel() {
local proc procs jobs cur
jobs=("$#") # input jobs array
declare -a procs=() # processes array
cur=0 # current job idx
morework=true
while $morework; do
# if process array size < pool size, try forking a new proc
if [[ "${#procs[#]}" -lt "$POOL_SIZE" ]]; then
if [[ $cur -lt "${#jobs[#]}" ]]; then
proc=${jobs[$cur]}
echo "JOB ID = $cur; JOB = $proc."
###############
# do job here #
###############
sleep 3 &
# add to current running processes
procs+=("$!")
# move to the next job
((cur++))
else
morework=false
continue
fi
fi
for n in "${!procs[#]}"; do
kill -0 "${procs[n]}" 2>/dev/null && continue
# if process is not running anymore, remove from array
unset procs[n]
done
done
wait
}
parallel "${jobs[#]}"
xargs with -P and -L options does the job.
You can extract the idea from the example below:
#!/usr/bin/env bash
workers_pool_size=10
set -e
function doit {
cmds=""
for e in 4 8 16; do
for m in 1 2 3 4 5 6; do
cmd="python3 ./doit.py --m $m -e $e -m $m"
cmds="$cmd\n$cmds"
done
done
echo -e "All commands:\n$cmds"
echo "Workers pool size = $workers_pool_size"
echo -e "$cmds" | xargs -t -P $workers_pool_size -L 1 time > /dev/null
}
doit
#! /bin/bash
doSomething() {
<...>
}
getCompletedThreads() {
  _runningThreads=("$@")
  removableThreads=()
  for pid in "${_runningThreads[@]}"; do
    if ! ps -p $pid > /dev/null; then
      removableThreads+=($pid)
    fi
  done
  echo "${removableThreads[@]}"
}
releasePool() {
  while [[ ${#runningThreads[@]} -eq $MAX_THREAD_NO ]]; do
    echo "releasing"
    removableThreads=( $(getCompletedThreads "${runningThreads[@]}") )
    if [ ${#removableThreads[@]} -eq 0 ]; then
      sleep 0.2
    else
      for removableThread in "${removableThreads[@]}"; do
        runningThreads=( ${runningThreads[@]/$removableThread} )
      done
      echo "released"
    fi
  done
}
waitAllThreadComplete() {
  while [[ ${#runningThreads[@]} -ne 0 ]]; do
    removableThreads=( $(getCompletedThreads "${runningThreads[@]}") )
    for removableThread in "${removableThreads[@]}"; do
      runningThreads=( ${runningThreads[@]/$removableThread} )
    done
    if [ ${#removableThreads[@]} -eq 0 ]; then
      sleep 0.2
    fi
  done
}
MAX_THREAD_NO=10
runningThreads=()
sequenceNo=0
for i in {1..36}; do
releasePool
((sequenceNo++))
echo "added $sequenceNo"
doSomething &
pid=$!
runningThreads+=($pid)
done
waitAllThreadComplete
I was wondering how, if possible, I can create a simple job management in BASH to process several commands in parallel. That is, I have a big list of commands to run, and I'd like to have two of them running at any given time.
I know quite a bit about bash, so here are the requirements that make it tricky:
The commands have variable running time so I can't just spawn 2, wait, and then continue with the next two. As soon as one command is done a next command must be run.
The controlling process needs to know the exit code of each command so that it can keep a total of how many failed
I'm thinking somehow I can use trap but I don't see an easy way to get the exit value of a child inside the handler.
So, any ideas on how this can be done?
Well, here is some proof of concept code that should probably work, but it breaks bash: invalid command lines generated, hanging, and sometimes a core dump.
# need monitor mode for trap CHLD to work
set -m
# store the PIDs of the children being watched
declare -a child_pids
function child_done
{
echo "Child $1 result = $2"
}
function check_pid
{
# check if running
kill -s 0 $1
if [ $? == 0 ]; then
child_pids=("${child_pids[#]}" "$1")
else
wait $1
ret=$?
child_done $1 $ret
fi
}
# check by copying pids, clearing list and then checking each, check_pid
# will add back to the list if it is still running
function check_done
{
to_check=("${child_pids[#]}")
child_pids=()
for ((i=0;$i<${#to_check};i++)); do
check_pid ${to_check[$i]}
done
}
function run_command
{
"$#" &
pid=$!
# check this pid now (this will add to the child_pids list if still running)
check_pid $pid
}
# run check on all pids anytime some child exits
trap 'check_done' CHLD
# test
for ((tl=0;tl<10;tl++)); do
run_command bash -c "echo FAIL; sleep 1; exit 1;"
run_command bash -c "echo OKAY;"
done
# wait for all children to be done
wait
Note that this isn't what I ultimately want, but would be groundwork to getting what I want.
Followup: I've implemented a system to do this in Python, so anybody using Python for scripting can have the above functionality. Refer to shelljob.
GNU Parallel is awesomesauce:
$ parallel -j2 < commands.txt
$ echo $?
It will set the exit status to the number of commands that failed. If you have more than 253 commands, check out --joblog. If you don't know all the commands up front, check out --bg.
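For example, a minimal --joblog run (the log path is arbitrary):
parallel -j2 --joblog /tmp/joblog.txt < commands.txt
cat /tmp/joblog.txt  # one line per command: sequence, runtime, exit value, and the command itself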
Can I persuade you to use make? This has the advantage that you can tell it how many commands to run in parallel (modify the -j number)
echo -e ".PHONY: c1 c2 c3 c4\nall: c1 c2 c3 c4\nc1:\n\tsleep 2; echo c1\nc2:\n\tsleep 2; echo c2\nc3:\n\tsleep 2; echo c3\nc4:\n\tsleep 2; echo c4" | make -f - -j2
Stick it in a Makefile and it will be much more readable
.PHONY: c1 c2 c3 c4
all: c1 c2 c3 c4
c1:
sleep 2; echo c1
c2:
sleep 2; echo c2
c3:
sleep 2; echo c3
c4:
sleep 2; echo c4
Beware, those are not spaces at the beginning of the lines, they're a TAB, so a cut and paste won't work here.
Put an "#" infront of each command if you don't the command echoed. e.g.:
#sleep 2; echo c1
This would stop on the first command that failed. If you need a count of the failures you'd need to engineer that in the makefile somehow. Perhaps something like
command || echo F >> failed
Then check the length of failed.
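A hedged sketch of that counting idea, applied to one of the targets above (remember the recipe line must start with a TAB; the failed file accumulates one line per failed command):
c1:
	sleep 2; echo c1 || echo F >> failed
and after the make run:
[ -f failed ] && echo "$(wc -l < failed) commands failed"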
The problem you have is that you cannot wait for one of multiple background processes to complete. If you observe job status (using jobs) then finished background jobs are removed from the job list. You need another mechanism to determine whether a background job has finished.
The following example starts two background processes (sleeps). It then loops, using ps to see if they are still running. If not, it uses wait to gather the exit code and starts a new background process.
#!/bin/bash
sleep 3 &
pid1=$!
sleep 6 &
pid2=$!
while ( true ) do
running1=`ps -p $pid1 --no-headers | wc -l`
if [ $running1 == 0 ]
then
wait $pid1
echo process 1 finished with exit code $?
sleep 3 &
pid1=$!
else
echo process 1 running
fi
running2=`ps -p $pid2 --no-headers | wc -l`
if [ $running2 == 0 ]
then
wait $pid2
echo process 2 finished with exit code $?
sleep 6 &
pid2=$!
else
echo process 2 running
fi
sleep 1
done
Edit: Using SIGCHLD (without polling):
#!/bin/bash
set -bm
trap 'ChildFinished' SIGCHLD
function ChildFinished() {
running1=`ps -p $pid1 --no-headers | wc -l`
if [ $running1 == 0 ]
then
wait $pid1
echo process 1 finished with exit code $?
sleep 3 &
pid1=$!
else
echo process 1 running
fi
running2=`ps -p $pid2 --no-headers | wc -l`
if [ $running2 == 0 ]
then
wait $pid2
echo process 2 finished with exit code $?
sleep 6 &
pid2=$!
else
echo process 2 running
fi
sleep 1
}
sleep 3 &
pid1=$!
sleep 6 &
pid2=$!
sleep 1000d
I think the following example answers some of your questions; I am looking into the rest of the question.
(cat list1 list2 list3 | sort | uniq > list123) &
(cat list4 list5 list6 | sort | uniq > list456) &
from:
Running parallel processes in subshells
There is another package for debian systems named xjobs.
You might want to check it out:
http://packages.debian.org/wheezy/xjobs
If you cannot install parallel for some reason, this will work in plain shell or bash:
# String to detect failure in subprocess
FAIL_STR=failed_cmd
result=$(
(false || echo ${FAIL_STR}1) &
(true || echo ${FAIL_STR}2) &
(false || echo ${FAIL_STR}3)
)
wait
if [[ ${result} == *"$FAIL_STR"* ]]; then
failure=`echo ${result} | grep -E -o "$FAIL_STR[^[:space:]]+"`
echo The following commands failed:
echo "${failure}"
echo See above output of these commands for details.
exit 1
fi
Where true & false are placeholders for your commands. You can also echo $? along with the FAIL_STR to get the command status.
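A hedged sketch of that variation (mycmd1 and mycmd2 are placeholders for your commands; FAIL_STR is as defined above):
result=$(
  (mycmd1 || echo "${FAIL_STR}1:$?") &
  (mycmd2 || echo "${FAIL_STR}2:$?") &
  wait
)
# each failure line now looks like: failed_cmd1:<exit status>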
Yet another bash-only example for your interest. Of course, prefer the use of GNU parallel, which offers many more features out of the box.
This solution involves creating temporary output files for collecting job status.
We use /tmp/${$}_ as the temporary file prefix; $$ is the parent process number and is the same for the whole script execution.
First, the loop for starting parallel jobs in batches. The batch size is set with max_parallel_connection. try_connect_DB() is a slow bash function in the same file. Here we collect stdout + stderr (2>&1) for failure diagnostics.
nb_project=$(echo "$projects" | wc -w)
i=0
parallel_connection=0
max_parallel_connection=10
for p in $projects
do
i=$((i+1))
parallel_connection=$((parallel_connection+1))
try_connect_DB $p "$USERNAME" "$pass" > /tmp/${$}_${p}.out 2>&1 &
if [[ $parallel_connection -ge $max_parallel_connection ]]
then
echo -n " ... ($i/$nb_project)"
wait
parallel_connection=0
fi
done
if [[ $nb_project -gt $max_parallel_connection ]]
then
# final new line
echo
fi
# wait for all remaining jobs
wait
After all jobs have finished, review all the results:
SQL_connection_failed is our error convention, output by try_connect_DB(); you may filter job success or failure in whatever way suits your needs.
Here we decided to only output failed results, in order to reduce the amount of output on large jobs, especially if most of them, or all, passed successfully.
# displaying result that failed
file_with_failure=$(grep -l SQL_connection_failed /tmp/${$}_*.out)
if [[ -n $file_with_failure ]]
then
nb_failed=$(wc -l <<< "$file_with_failure")
# we will collect DB name from our output file naming convention, for post treatment
db_names=""
echo "=========== failed connections : $nb_failed/$nb_project"
for failure in $file_with_failure
do
echo "============ $failure"
cat $failure
db_names+=" $(basename $failure | sed -e 's/^[0-9]\+_\([^.]\+\)\.out/\1/')"
done
echo "$db_names"
ret=1
else
echo "all tests passed"
ret=0
fi
# temporary files cleanup; could be kept in case of error, adapt to suit your needs.
rm /tmp/${$}_*.out
exit $ret
I need to write an infinite loop that stops when any key is pressed.
Unfortunately this one loops only when a key is pressed.
Ideas please?
#!/bin/bash
a=0
while : ; do
# dummy action
echo -n "$a "
let "a+=1"
# detect any key press
read -n 1 keypress
echo $keypress
done
echo "Thanks for using this script."
exit 0
You need to put the standard input in non-blocking mode. Here is an example that works:
#!/bin/bash
if [ -t 0 ]; then
SAVED_STTY="`stty --save`"
stty -echo -icanon -icrnl time 0 min 0
fi
count=0
keypress=''
while [ "x$keypress" = "x" ]; do
let count+=1
echo -ne $count'\r'
keypress="`cat -v`"
done
if [ -t 0 ]; then stty "$SAVED_STTY"; fi
echo "You pressed '$keypress' after $count loop iterations"
echo "Thanks for using this script."
exit 0
Edit 2014/12/09: Add the -icrnl flag to stty to properly catch the Return key, use cat -v instead of read in order to catch Space.
It is possible that cat reads more than one character if it is fed data fast enough; if not the desired behaviour, replace cat -v with dd bs=1 count=1 status=none | cat -v.
Edit 2019/09/05: Use stty --save to restore the TTY settings.
read has a number of characters parameter -n and a timeout parameter -t which could be used.
From bash manual:
-n nchars
read returns after reading nchars characters rather than waiting for a complete line of input, but honors a delimiter if fewer than nchars characters are read before the delimiter.
-t timeout
Cause read to time out and return failure if a complete line of input (or a specified number of characters) is not read within timeout seconds. timeout may be a decimal number with a fractional portion following the decimal point. This option is only effective if read is reading input from a terminal, pipe, or other special file; it has no effect when reading from regular files. If read times out, read saves any partial input read into the specified variable name. If timeout is 0, read returns immediately, without trying to read any data. The exit status is 0 if input is available on the specified file descriptor, non-zero otherwise. The exit status is greater than 128 if the timeout is exceeded.
However, the read builtin uses the terminal which has its own settings. So as other answers have pointed out we need to set the flags for the terminal using stty.
#!/bin/bash
old_tty=$(stty --save)
# Minimum required changes to terminal. Add -echo to avoid output to screen.
stty -icanon min 0;
while true ; do
if read -t 0; then # Input ready
read -n 1 char
echo -e "\nRead: ${char}\n"
break
else # No input
echo -n '.'
sleep 1
fi
done
stty "$old_tty"
Usually I don't mind breaking a bash infinite loop with a simple CTRL-C. This is the traditional way for terminating a tail -f for instance.
Pure bash: unattended user input over loop
I've done this without having to play with stty:
loop=true
while $loop; do
trapKey=
if IFS= read -d '' -rsn 1 -t .002 str; then
while IFS= read -d '' -rsn 1 -t .002 chr; do
str+="$chr"
done
case $str in
$'\E[A') trapKey=UP ;;
$'\E[B') trapKey=DOWN ;;
$'\E[C') trapKey=RIGHT ;;
$'\E[D') trapKey=LEFT ;;
q | $'\E') loop=false ;;
esac
fi
if [ "$trapKey" ] ;then
printf "\nDoing something with '%s'.\n" $trapKey
fi
echo -n .
done
This will
loop with a very small footprint (max 2 milliseconds)
react to keys cursor left, cursor right, cursor up and cursor down
exit loop with key Escape or q.
Here is another solution. It works for any key pressed, including space, enter, arrows, etc.
The original solution tested in bash:
IFS=''
if [ -t 0 ]; then stty -echo -icanon raw time 0 min 0; fi
while [ -z "$key" ]; do
read key
done
if [ -t 0 ]; then stty sane; fi
An improved solution tested in bash and dash:
if [ -t 0 ]; then
old_tty=$(stty --save)
stty raw -echo min 0
fi
while
IFS= read -r REPLY
[ -z "$REPLY" ]
do :; done
if [ -t 0 ]; then stty "$old_tty"; fi
In bash you could even leave out the REPLY variable for the read command, because it is the default variable there.
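A one-line illustration of that default variable:
read -rsn 1 && echo "You pressed: $REPLY"  # with no name given, read fills REPLY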
I found this forum post and rewrote era's post into this pretty general use format:
# stuff before main function
printf "INIT\n\n"; sleep 2
INIT(){
starting="MAIN loop starting"; ending="MAIN loop success"
runMAIN=1; i=1; echo "0"
}; INIT
# exit script when MAIN is done, if ever (in this case counting out 4 seconds)
exitScript(){
trap - SIGINT SIGTERM # clear the trap
kill -- -$$ # Send SIGTERM to child/sub processes
kill $( jobs -p ) # kill any remaining processes
}; trap exitScript SIGINT SIGTERM # set trap
MAIN(){
echo "$starting"
sleep 1
echo "$i"; let "i++"
if (($i > 4)); then printf "\nexiting\n"; exitScript; fi
echo "$ending"; echo
}
# main loop running in a subshell due to the '&' after 'done'
{ while ((runMAIN)); do
if ! MAIN; then runMAIN=0; fi
done; } &
# --------------------------------------------------
tput smso
# echo "Press any key to return \c"
tput rmso
oldstty=`stty -g`
stty -icanon -echo min 1 time 0
dd bs=1 count=1 >/dev/null 2>&1
stty "$oldstty"
# --------------------------------------------------
# everything after this point will occur after user inputs any key
printf "\nYou pressed a key!\n\nGoodbye!\n"
Run this script