Limiting the number of subshells spawned - bash

I'm trying to limit the number of subshells spawned by a script I'm using to sniff our internal network and audit the Linux servers on it. The script works as intended, but because of the way I'm nesting the for loop it spawns 255 subshells per network, which adds up to over 1000 processes and kills the CPU. I need to limit the number of concurrent processes, and since variables lose their values inside a subshell, I can't figure out a way to make this work. Again, the script works; it just spawns a ton of processes. I need to limit it to, say, 10 processes max:
#!/bin/bash
FILE=/root/ats_net_final
for network in `cat $FILE`; do
    for ip in $network.{1..255}; do
        (
            SYSNAME=`snmpwalk -v2c -c public -t1 -r1 $ip sysName.0 2>/dev/null | awk '{ print $NF }'`
            SYSTYPE=`snmpwalk -v2c -c public -t1 -r1 $ip sysDescr.0 2>/dev/null | grep -o Linux`
            if [ $? -eq 0 ]; then
                echo "$SYSNAME"
                exit 0
            else
                echo "Processed $ip"
                exit 0
            fi
        ) &
    done
done
I found a solution that almost works, but not in my case: no matter what, it still spawns the processes before the limiting logic runs. I think maybe I've just been looking at the code too long and it's a simple logic issue; I'm placing things in the wrong area, or in the wrong order.
Answer Accepted:
I've accepted the answer from huitseeker. He pointed me in the right direction on how the logic works, allowing me to get it working. Final script:
#!/bin/bash
FILE=/root/ats_net_final
for network in `cat $FILE`; do
    #for ip in $network.{1..255};do
    for ip in {1..255}; do
        (
            ip=$network.$ip
            SYSNAME=`snmpwalk -v2c -c public -t1 -r1 $ip sysName.0 2>/dev/null | awk '{ print $NF }'`
            SYSTYPE=`snmpwalk -v2c -c public -t1 -r1 $ip sysDescr.0 2>/dev/null | grep -o Linux`
            if [ $? -eq 0 ]; then
                echo "$SYSNAME"
                exit 0
            else
                echo "Processed $ip"
                exit 0
            fi
        ) &
        if (( $ip % 10 == 0 )); then wait; fi
    done
    wait
done

An easy way to limit the number of concurrent subshells to 10 is:
for ip in $(seq 1 255); do
    (<whatever you did with $ip in the subshell here, do with $network.$ip instead>) &
    if (( $ip % 10 == 0 )); then wait; fi
done
wait
The last wait is useful so that the subshells from the final round of the inner loop don't overlap with those created in the first round of the next outer iteration.

I think I have found a better solution. I implemented it using make:
#!/usr/bin/make -f
IPFILE := ipfile
IPS := $(foreach ip,$(shell cat $(IPFILE)),$(foreach sub,$(shell seq 1 1 254),$(ip).$(sub)))

all: $(IPS)

$(IPS):
	SYSNAME=`snmpwalk -v2c -c public -t1 -r1 $@ sysName.0 2>/dev/null | awk '{ print $$NF }'`; \
	SYSTYPE=`snmpwalk -v2c -c public -t1 -r1 $@ sysDescr.0 2>/dev/null | grep -o Linux`; \
	if [ $$? -eq 0 ]; then \
		echo "$$SYSNAME"; \
	else \
		echo "Processed $@"; \
	fi
The first part of each IP (e.g. 192.168.1) should be placed in ipfile. The makefile then generates all the IP addresses into the variable IPS (like 192.168.1.1 ... 192.168.1.254 ...).
Copy these lines to e.g. test.mak and add execute rights to the file. If you run it as ./test.mak, it will process the IPs one by one; run as ./test.mak -j 10, it will process 10 IPs at once. It can also be run as ./test.mak -j 10 -l 0.5, which runs at most 10 processes, starting new ones only while the system load stays below 0.5.
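For instance, a hypothetical session using the makefile above (the file name test.mak and the sample prefixes are this example's own, not from the answer):

printf '%s\n' 192.168.1 10.0.0 > ipfile   # one network prefix per line
chmod +x test.mak
./test.mak -j 10                          # probe up to 10 addresses at a time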

Related

Does pushing a block of code to background in Bash result in parallelization? [duplicate]

Let's say I have a loop in Bash:
for foo in `some-command`
do
    do-something $foo
done
do-something is CPU-bound and I have a nice shiny 4-core processor. I'd like to be able to run up to 4 do-somethings at once.
The naive approach seems to be:
for foo in `some-command`
do
    do-something $foo &
done
This will run all the do-somethings at once, but there are a couple of downsides, mainly that do-something may also have significant I/O, and performing it all at once might slow things down a bit. The other problem is that this code block returns immediately, so there's no way to do other work while all the do-somethings finish.
How would you write this loop so there are always X do-somethings running at once?
Depending on what you want to do, xargs can also help (here: converting documents with pdf2ps):
cpus=$( ls -d /sys/devices/system/cpu/cpu[[:digit:]]* | wc -w )
find . -name \*.pdf | xargs --max-args=1 --max-procs=$cpus pdf2ps
From the docs:
--max-procs=max-procs
-P max-procs
Run up to max-procs processes at a time; the default is 1.
If max-procs is 0, xargs will run as many processes as possible at a
time. Use the -n option with -P; otherwise chances are that only one
exec will be done.
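Incidentally, on systems with GNU coreutils, nproc is a simpler way to get the CPU count than listing /sys entries (an alternative to the cpus= line above, not what the answer used):

cpus=$(nproc)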
With GNU Parallel http://www.gnu.org/software/parallel/ you can write:
some-command | parallel do-something
GNU Parallel also supports running jobs on remote computers. This will run one job per CPU core on the remote computers, even if they have different numbers of cores:
some-command | parallel -S server1,server2 do-something
A more advanced example: here we have a list of files that we want my_script to run on. Each file has an extension (maybe .jpeg). We want the output of my_script to be put next to the files as basename.out (e.g. foo.jpeg -> foo.out). We want to run my_script once per CPU core, and we want to run it on the local computer, too. For the remote computers we want each file to be transferred to the computer that processes it. When my_script finishes, we want foo.out transferred back, and we then want foo.jpeg and foo.out removed from the remote computer:
cat list_of_files | \
parallel --trc {.}.out -S server1,server2,: \
"my_script {} > {.}.out"
GNU Parallel makes sure the output from each job does not mix, so you can use the output as input for another program:
some-command | parallel do-something | postprocess
See the videos for more examples: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
maxjobs=4
parallelize () {
    while [ $# -gt 0 ] ; do
        jobcnt=(`jobs -p`)
        if [ ${#jobcnt[@]} -lt $maxjobs ] ; then
            do-something $1 &
            shift
        else
            sleep 1
        fi
    done
    wait
}
parallelize arg1 arg2 "5 args to third job" arg4 ...
Here is an alternative solution that can be inserted into .bashrc and used for everyday one-liners:
function pwait() {
    while [ $(jobs -p | wc -l) -ge $1 ]; do
        sleep 1
    done
}
To use it, all one has to do is put & after the jobs and a pwait call; the parameter gives the number of parallel processes:
for i in *; do
    do_something $i &
    pwait 10
done
It would be nicer to use wait instead of busy-waiting on the output of jobs -p, but there doesn't seem to be an obvious solution to wait till any of the given jobs is finished instead of all of them.
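Note that newer bash versions (4.3 and later) do have such a primitive: wait -n blocks until any single background job finishes. A minimal sketch of pwait rebuilt around it, assuming such a bash is available:

function pwait() {
    # wait -n (bash >= 4.3) returns as soon as any one background job exits,
    # so this no longer busy-waits with sleep
    while [ $(jobs -p | wc -l) -ge $1 ]; do
        wait -n
    done
}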
Instead of plain bash, use a Makefile, then specify the number of simultaneous jobs with make -jX, where X is the number of jobs to run at once.
Or you can use wait ("man wait"): launch several child processes, call wait, and it will return when the child processes finish. The idea, as (roughly) runnable bash rather than pseudocode:

job () {
    ...  # whatever one job does
}

maxjobs=10
while read -r line; do
    jobsrunning=0
    while [ "$jobsrunning" -lt "$maxjobs" ]; do
        job "$line" &
        jobsrunning=$((jobsrunning + 1))
    done
    wait
done < file.txt
If you need to store the jobs' results, assign each result to a variable. After wait you just check what the variable contains.
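One caveat: a backgrounded job runs in a child process, which cannot set variables in the parent shell. A hedged sketch of the usual workaround, collecting each job's output in a temporary file and reading it back after wait (do_something stands in for the real work):

# each background job writes to its own temp file, since a child process
# cannot write variables back into the parent shell
tmpdir=$(mktemp -d)
for i in 1 2 3; do
    do_something "$i" > "$tmpdir/$i.out" &
done
wait
results=$(cat "$tmpdir"/*.out)   # inspect the collected results here
rm -rf "$tmpdir"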
If you're familiar with the make command, most of the time you can express the list of commands you want to run as a makefile. For example, if you need to run $SOME_COMMAND on files *.input, each of which produces *.output, you can use the makefile
INPUT = a.input b.input
OUTPUT = $(INPUT:.input=.output)

%.output : %.input
	$(SOME_COMMAND) $< $@

all: $(OUTPUT)
and then just run
make -j<NUMBER>
to run at most NUMBER commands in parallel.
While doing this right in bash is probably impossible, you can do a semi-right fairly easily. bstark gave a fair approximation of right, but his has the following flaws:
Word splitting: You can't pass any jobs to it that use any of the following characters in their arguments: spaces, tabs, newlines, stars, question marks. If you do, things will break, possibly unexpectedly.
It relies on the rest of your script not to background anything. If you do, or if you later add something to the script that gets sent to the background because you forgot you weren't allowed to use backgrounded jobs on account of his snippet, things will break.
Another approximation which doesn't have these flaws is the following:
scheduleAll() {
    local job i=0 max=4 pids=()

    for job; do
        (( ++i % max == 0 )) && {
            wait "${pids[@]}"
            pids=()
        }

        bash -c "$job" & pids+=("$!")
    done

    wait "${pids[@]}"
}
Note that this one is easily adaptable to also check the exit code of each job as it ends, so you can warn the user if a job fails, or set an exit code for scheduleAll according to the number of jobs that failed, or something.
The problem with this code is just that:
It schedules four (in this case) jobs at a time and then waits for all four to end. Some might be done sooner than others, which will cause the next batch of four jobs to wait until the longest-running job of the previous batch is done.
A solution that takes care of this last issue would have to use kill -0 to poll whether any of the processes has disappeared, instead of wait, and schedule the next job as soon as one has. However, that introduces a small new problem: there is a race condition between a job ending and kill -0 checking whether it has ended. If the job ended and another process on your system started up at the same time, taking a random PID which happens to be that of the job that just finished, kill -0 won't notice your job has finished and things will break again.
A perfect solution isn't possible in bash.
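That said, here is a hedged sketch of the exit-code adaptation mentioned above; the function name and the failure-counting convention are this sketch's own, not from the original answer:

scheduleAllChecked() {
    # variant of scheduleAll that waits on each PID individually and
    # returns the number of jobs that exited non-zero
    local job i=0 max=4 failed=0 pid pids=()
    for job; do
        (( ++i % max == 0 )) && {
            for pid in "${pids[@]}"; do wait "$pid" || (( failed++ )); done
            pids=()
        }
        bash -c "$job" & pids+=("$!")
    done
    for pid in "${pids[@]}"; do wait "$pid" || (( failed++ )); done
    return "$failed"
}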
Maybe try a parallelizing utility instead of rewriting the loop? I'm a big fan of xjobs. I use xjobs all the time to mass-copy files across our network, usually when setting up a new database server.
http://www.maier-komor.de/xjobs.html
A function for bash:
parallel ()
{
    awk "BEGIN{print \"all: ALL_TARGETS\\n\"}{print \"TARGET_\"NR\":\\n\\t@-\"\$0\"\\n\"}END{printf \"ALL_TARGETS:\";for(i=1;i<=NR;i++){printf \" TARGET_%d\",i};print\"\\n\"}" | make $@ -f - all
}
using:
cat my_commands | parallel -j 4
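To see what this does: for two input lines, echo a and echo b, the awk program would emit roughly the following makefile (reconstructed by tracing the awk script above, not captured from a real run):

all: ALL_TARGETS

TARGET_1:
	@-echo a

TARGET_2:
	@-echo b

ALL_TARGETS: TARGET_1 TARGET_2

make then builds TARGET_1 and TARGET_2, i.e. runs the commands, at whatever -j level you passed in.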
Really late to the party here, but here's another solution.
A lot of solutions don't handle spaces/special characters in the commands, don't keep N jobs running at all times, eat CPU in busy loops, or rely on external dependencies (e.g. GNU parallel).
With inspiration from dead/zombie process handling, here's a pure bash solution:
function run_parallel_jobs {
    local concurrent_max=$1
    local callback=$2
    local cmds=("${@:3}")
    local jobs=( )

    while [[ "${#cmds[@]}" -gt 0 ]] || [[ "${#jobs[@]}" -gt 0 ]]; do
        while [[ "${#jobs[@]}" -lt $concurrent_max ]] && [[ "${#cmds[@]}" -gt 0 ]]; do
            local cmd="${cmds[0]}"
            cmds=("${cmds[@]:1}")

            bash -c "$cmd" &
            jobs+=($!)
        done

        local job="${jobs[0]}"
        jobs=("${jobs[@]:1}")

        local state="$(ps -p $job -o state= 2>/dev/null)"

        if [[ "$state" == "D" ]] || [[ "$state" == "Z" ]]; then
            $callback $job
        else
            wait $job
            $callback $job $?
        fi
    done
}
And sample usage:
function job_done {
    if [[ $# -lt 2 ]]; then
        echo "PID $1 died unexpectedly"
    else
        echo "PID $1 exited $2"
    fi
}

cmds=( \
    "echo 1; sleep 1; exit 1" \
    "echo 2; sleep 2; exit 2" \
    "echo 3; sleep 3; exit 3" \
    "echo 4; sleep 4; exit 4" \
    "echo 5; sleep 5; exit 5" \
)

# cpus="$(getconf _NPROCESSORS_ONLN)"
cpus=3
run_parallel_jobs $cpus "job_done" "${cmds[@]}"
The output:
1
2
3
PID 56712 exited 1
4
PID 56713 exited 2
5
PID 56714 exited 3
PID 56720 exited 4
PID 56724 exited 5
For per-process output handling, $$ could be used to log to a file, for example:
function job_done {
    cat "$1.log"
}

cmds=( \
    "echo 1 \$\$ >\$\$.log" \
    "echo 2 \$\$ >\$\$.log" \
)

run_parallel_jobs 2 "job_done" "${cmds[@]}"
Output:
1 56871
2 56872
The project I work on uses the wait command to control parallel shell (ksh, actually) processes. To address your concerns about I/O: on a modern OS, it's possible parallel execution will actually increase efficiency. If all processes are reading the same blocks on disk, only the first process has to hit the physical hardware; the other processes can often retrieve the block from the OS's disk cache in memory. Obviously, reading from memory is several orders of magnitude quicker than reading from disk. Also, the benefit requires no coding changes.
This might be good enough for most purposes, but is not optimal.
#!/bin/bash
n=0
maxjobs=10

for i in *.m4a ; do
    # ( DO SOMETHING ) &

    # limit jobs
    if (( ++n % maxjobs == 0 )) ; then
        wait # wait until all have finished (not optimal, but most times good enough)
        echo $n wait
    fi
done
Here is how I managed to solve this issue in a bash script:
#!/bin/bash
MAX_JOBS=32
FILE_LIST=($(cat ${1}))

echo Length ${#FILE_LIST[@]}

for ((INDEX=0; INDEX < ${#FILE_LIST[@]}; INDEX=$((${INDEX}+${MAX_JOBS})) )); do
    JOBS_RUNNING=0
    while ((JOBS_RUNNING < MAX_JOBS)); do
        I=$((${INDEX}+${JOBS_RUNNING}))
        FILE=${FILE_LIST[${I}]}
        if [ "$FILE" != "" ]; then
            echo $JOBS_RUNNING $FILE
            ./M22Checker ${FILE} &
        else
            echo $JOBS_RUNNING NULL &
        fi
        JOBS_RUNNING=$((JOBS_RUNNING+1))
    done
    wait
done
You can use a simple nested for loop (substitute appropriate integers for N and M below):
for i in {1..N}; do
    (for j in {1..M}; do do_something; done &) ;
done
This will execute do_something N*M times in M rounds, each round executing N jobs in parallel. You can make N equal to the number of CPUs you have.
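As a concrete (hedged) instance, N=4 and M=25 covers 100 items with four parallel streams; the trailing wait is an addition so the script blocks until every round finishes, and do_something stands in for the real command:

for i in {1..4}; do
    ( for j in {1..25}; do do_something "$i.$j"; done ) &
done
wait   # added so the script doesn't return while jobs are still running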
My solution to always keep a given number of processes running, keep track of errors, and handle uninterruptible/zombie processes:
function log {
    echo "$1"
}

# Take a list of commands to run, and run them with numberOfProcesses commands running simultaneously
# Returns the number of non-zero exit codes from the commands
function ParallelExec {
    local numberOfProcesses="${1}" # Number of simultaneous commands to run
    local commandsArg="${2}"       # Semicolon-separated list of commands

    local pid
    local runningPids=0
    local counter=0
    local commandsArray
    local pidsArray
    local newPidsArray
    local retval
    local retvalAll=0
    local pidState
    local commandsArrayPid

    IFS=';' read -r -a commandsArray <<< "$commandsArg"

    log "Running ${#commandsArray[@]} commands in $numberOfProcesses simultaneous processes."

    while [ $counter -lt "${#commandsArray[@]}" ] || [ ${#pidsArray[@]} -gt 0 ]; do

        while [ $counter -lt "${#commandsArray[@]}" ] && [ ${#pidsArray[@]} -lt $numberOfProcesses ]; do
            log "Running command [${commandsArray[$counter]}]."
            eval "${commandsArray[$counter]}" &
            pid=$!
            pidsArray+=($pid)
            commandsArrayPid[$pid]="${commandsArray[$counter]}"
            counter=$((counter+1))
        done

        newPidsArray=()
        for pid in "${pidsArray[@]}"; do
            # Handle uninterruptible sleep state or zombies by omitting them from the running process array (how to kill what is already dead? :)
            if kill -0 $pid > /dev/null 2>&1; then
                pidState=$(ps -p$pid -o state= 2>/dev/null)
                if [ "$pidState" != "D" ] && [ "$pidState" != "Z" ]; then
                    newPidsArray+=($pid)
                fi
            else
                # pid is dead, get its exit code from the wait command
                wait $pid
                retval=$?
                if [ $retval -ne 0 ]; then
                    log "Command [${commandsArrayPid[$pid]}] failed with exit code [$retval]."
                    retvalAll=$((retvalAll+1))
                fi
            fi
        done
        pidsArray=("${newPidsArray[@]}")

        # Add a trivial sleep time so bash won't eat all the CPU
        sleep .05
    done

    return $retvalAll
}
Usage:
cmds="du -csh /var;du -csh /tmp;sleep 3;du -csh /root;sleep 10; du -csh /home"
# Execute 2 processes at a time
ParallelExec 2 "$cmds"
# Execute 4 processes at a time
ParallelExec 4 "$cmds"
DOMAINS="list of some domains in commands"
i=1
for foo in $DOMAINS
do
    eval `some-command for $foo` &
    job[$i]=$!
    i=$(( i + 1 ))
done

Ndomains=$(echo $DOMAINS | wc -w)
for i in $(seq 1 1 $Ndomains)
do
    echo "wait for ${job[$i]}"
    wait "${job[$i]}"
done
This concept will work for parallelizing. The important thing is that the last line of the eval ends with '&', which puts the commands into the background.

Parallelizing a ping sweep

I'm attempting to sweep an IP block totaling about 65,000 addresses. We've been instructed to use ICMP packets specifically, with bash, and to find a way to parallelize it. Here's what I've come up with:
#!/bin/bash
ping() {
    if ping -c 1 -W 5 131.212.$i.$j >/dev/null
    then
        ((++s))
        echo -n "*"
    else
        ((++f))
        echo -n "."
    fi
    ((++j))
    # if j has reached 255, set it to zero and increment i
    if [ $j -gt 255 ]; then
        j=0
        ((++i))
        echo "Pinging 131.212.$i.xx IP Block...\n"
    fi
}

s=0 # number of responses received
f=0 # number of failures received
i=0 # IP increment 1
j=0 # IP increment 2

curProcs=$(ps | wc -l)
maxProcs=$(getconf OPEN_MAX)
while [ $i -lt 256 ]; do
    curProcs=$(ps | wc -l)
    if [ $curProcs -lt $maxProcs ]; then
        ping &
    else
        sleep 10
    fi
done
echo "Found "$s" responses and "$f" timeouts."
echo /usr/bin/time -l
However, I've been running into the following error (on macOS):
redirection error: cannot duplicate fd: Too many open files
My understanding is that I'm going over a resource limit, which I've attempted to rectify by only starting new ping processes if the existing process count is less than the specified max, but this does not solve the issue.
Thank you for your time and suggestions.
EDIT:
There are a lot of good suggestions below for doing this with preexisting tools. Since I was limited by academic requirements, I ended up splitting the ping loops into a different process for each 12.34.x.x block, which, although ugly, did the trick in under 5 minutes. This code has a lot of problems, but it might be a good starting point for someone in the future:
#!/bin/bash

#############################
#     Ping Subfunction      #
#############################
# blocks with more responses will complete first since the worst-case scenario
# is O(n) if no IPs generate a response
pingSubnet() {
    for ((j = 0 ; j <= 255 ; j++)); do
        # send a single ping with a timeout of 1 sec, piping output to the bitbucket
        if ping -c 1 -W 1 131.212."$i"."$j" >/dev/null
        then
            ((++s))
        else
            ((++f))
        fi
    done
    #echo "Received $s responses with $f timeouts in block $i..."
    # output the number of successful results to the pipe opened at the start
    echo "$s" >"$pipe"
    exit 0
}

#############################
#   Variable Declaration    #
#############################
start=$(date +%s) # start of execution time
startMem=$(vm_stat | awk '/Pages free/ {print $3}' | awk 'BEGIN { FS = "\." }; {print ($1*0.004092)}' | sed 's/\..*$//');
startCPU=$(top -l 1 | grep "CPU usage" | awk '{print 100-$7;}' | sed 's/\..*$//')
s=0 # number of responses received
f=0 # number of failures received
i=0 # IP increment 1
j=0 # IP increment 2

#############################
#    Pipe Initialization    #
#############################
# create a pipe for child procs to write to
# child procs inherit the runtime environment of the parent proc, but cannot
# write back to it (like passing by value in C, but the whole env)
# hence, they need somewhere else to write back to that the parent
# proc can read back in
pipe=/tmp/pingpipe
trap 'rm -f $pipe' EXIT
if [[ ! -p $pipe ]]; then
    mkfifo $pipe
    exec 3<> $pipe
fi

#############################
#    IP Block Iteration     #
#############################
# adding an ampersand to the end forks the command to a separate, backgrounded
# child process. this allows for parallel computation but adds logistical
# challenges since children can't write to the parent's variables
echo "Initiating scan processes..."
while [ $i -lt 256 ]; do
    #echo "Beginning 131.212.$i.x block scan..."
    # ping subnet asynchronously
    pingSubnet &
    ((++i))
done

echo "Waiting for scans to complete (this may take up to 5 minutes)..."
peakMem=$(vm_stat | awk '/Pages free/ {print $3}' | awk 'BEGIN { FS = "\." }; {print ($1*0.004092)}' | sed 's/\..*$//')
peakCPU=$(top -l 1 | grep "CPU usage" | awk '{print 100-$7;}' | sed 's/\..*$//')
wait
echo -e "done" >$pipe

#############################
#    Concat Pipe Outputs    #
#############################
# read each line from the pipe we created earlier, adding the number
# of successes up in a variable
success=0
echo "Tallying responses..."
while read -r line <$pipe; do
    if [[ "$line" == 'done' ]]; then
        break
    fi
    success=$((line+success))
done

#############################
#     Output Statistics     #
#############################
echo "Gathering Statistics..."
fail=$((65535-success))
# output program statistics
averageMem=$((peakMem-startMem))
averageCPU=$((peakCPU-startCPU))
end=$(date +%s) # end of execution time
runtime=$((end-start))
echo "Scan completed in $runtime seconds."
echo "Found $success active servers and $fail nonresponsive addresses with a timeout of 1."
echo "Estimated memory usage was $averageMem MB."
echo "Estimated CPU utilization was $averageCPU %"
This should give you some ideas with GNU Parallel
parallel --dry-run -j 64 -k ping 131.212.{1}.{2} ::: $(seq 1 3) ::: $(seq 11 13)
ping 131.212.1.11
ping 131.212.1.12
ping 131.212.1.13
ping 131.212.2.11
ping 131.212.2.12
ping 131.212.2.13
ping 131.212.3.11
ping 131.212.3.12
ping 131.212.3.13
-j 64 runs 64 pings in parallel at a time
--dry-run means do nothing but show what it would do
-k means keep the output in order (just so you can understand it)
The ::: introduces the arguments and I have repeated them with different numbers (1 through 3, and then 11 through 13) so you can distinguish the two counters and see that all permutations and combinations are generated.
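To sweep the whole 131.212.0.0/16 block from the question, the same pattern scales up by feeding the full octet ranges (a hedged example, untested here, and it will start real pings):

parallel -j 64 ping -c 1 -W 5 131.212.{1}.{2} ::: {0..255} ::: {0..255}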
Don't do that.
Use fping instead. It will probe far more efficiently than your program will.
$ brew install fping
will make it available, thanks to the magic of brew.
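For example, fping can generate and sweep the whole block itself (a hedged one-liner; flag behavior can vary between fping versions):

# -g generates the target list from a CIDR block, -a prints alive hosts,
# -q suppresses per-probe noise
fping -a -q -g 131.212.0.0/16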
Of course it's not as optimal as what you are trying to build above, but you could start the maximum allowed number of processes in the background, wait for them to end, and then start the next batch, something like this (except I'm using sleep 1 instead of a real ping):
for i in {1..20} # iterate some
do
    sleep 1 &            # start in the background
    if ! ((i % 5))       # after every 5th (using mod to detect)
    then
        wait %1 %2 %3 %4 %5 # wait for all five jobs to finish
    fi
done

fping hosts in file and return down ips

I want to use fping to ping multiple IPs contained in a file and output the failed IPs into a file, i.e.
hosts.txt
8.8.8.8
8.8.4.4
1.1.1.1
ping.sh
#!/bin/bash
HOSTS="/tmp/hosts.txt"
fping -q -c 2 < $HOSTS
if ip down
    echo ip > /tmp/down.log
fi
So I would like to end up with 1.1.1.1 in the down.log file
It seems that parsing the data from fping is somewhat difficult: it readily gives you the hosts that are alive, but not the ones that are dead. As a way round the issue, and to allow multiple hosts to be processed simultaneously with -f, all the hosts that are alive are placed in a variable called alive, and then the hosts in the /tmp/hosts.txt file are looped through and grepped against alive to decide whether each host is alive or dead. A grep return code of 1 signifies that it cannot find the host in alive, and hence the host is added to down.log.
alive=$(fping -c 1 -f ipsfile | awk -F: '{ print $1 }')

while read line
do
    grep -q -o $line <<<$alive
    if [[ "$?" == "1" ]]
    then
        echo $line >> down.log
    fi
done < /tmp/hosts.txt
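Depending on your fping version, the -u option may make all of this unnecessary, since it prints exactly the unreachable targets (a hedged one-liner; check fping's man page for your version):

# -u prints targets that were unreachable; targets are read from stdin
fping -u < /tmp/hosts.txt > /tmp/down.log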
Here's one way to get the result you want. Note, however, that I didn't use fping anywhere in my script. If the usage of fping is crucial to you, then I might have missed the point entirely.
#!/bin/bash
HOSTS="/tmp/hosts.txt"

declare -i DELAY=$1  # Amount of time in seconds to wait for a packet
declare -i REPEAT=$2 # Amount of times to retry pinging upon failure

# Read HOSTS line by line
while read -r line; do
    c=0
    while (( c < REPEAT )); do
        # If pinging an address does not return the words "0 received", we assume the ping has succeeded
        if [[ -z $(ping -q -c $REPEAT -W $DELAY $line | grep "0 received") ]]; then
            echo "Attempt[$(( c + 1 ))] $line : Success"
            break
        fi
        echo "Attempt[$(( c + 1 ))] $line : Failed"
        (( c++ ))
    done
    # If an address failed as many pings as the REPEAT count, we assume the address is down
    if (( c == REPEAT )); then
        echo "$line : Failed" >> /tmp/down.log # Log the failed address
    fi
done < $HOSTS
Usage: ./script [delay] [repeatCount], where 'delay' is the total number of seconds we wait for a response to a ping, and 'repeatCount' is how many times we retry pinging upon failure before deciding the address is down.
Here we read /tmp/hosts.txt line by line and evaluate each address using ping. If pinging an address succeeds, we move on to the next one. If an address fails, we try again as many times as the user has specified. If the address fails all of the pings, we log it in /tmp/down.log.
The conditions for checking whether a ping failed or succeeded may not be accurate for your use cases, so you may have to edit that. Still, I hope this gets the general idea across.

Bash script to ping a IP and if the ms is over 100 print a echo msg

I'm new to bash scripting.
I need a script that gets the time in ms of a ping to an IP and, if the time is over 100, prints an echo message.
For the example, let's do it with the Google IP 8.8.8.8.
Could you please help me?
Edit:
Okay, how do I make it like this:
#!/bin/sh
echo '>> Start ping test 2.0'
/bin/ping 8.8.8.8 | awk -F' |=' '$10=="time"'
if [$11>100]
then
    echo "Slow response"
else
    echo "Fast response"
fi
Okay... First off, you are not writing a bash script: your script is invoked with #!/bin/sh, so even if your system uses bash as its system shell, it's being run in sh compatibility mode. So you can't use bashisms. Write your script as I've shown below instead.
So... it seems to me that if you want ping's output to be handled by your script, then ping needs to actually EXIT. Your if will never get processed, because ping never stops running. And besides, $11 within the awk script isn't the same as $11 within the shell script. So something like this might work:
#!/bin/bash

while sleep 5; do
    t="$(ping -c 1 8.8.8.8 | sed -ne '/.*time=/{;s///;s/\..*//;p;}')"
    if [ "$t" -gt 100 ]; then
        : # do something
    else
        : # do something else
    fi
done
This while loop, in shell (or bash), will run ping every five seconds with only one packet sent (the -c 1), and parse its output using sed. The sed script works like this:
/.*time=/{...} - look for a line containing the time, and run the stuff in the curly braces on that line...
s/// - substitute the previously found expression (everything up to and including time=) with nothing, erasing it from the line
s/\..*// - replace everything from the first period to the end of the line with nothing (since shell math only handles integers)
p - and print the remaining data from the line.
For example, the line "64 bytes from 8.8.8.8: icmp_seq=0 ttl=59 time=41.101 ms" comes out as "41".
An alternate way of handling this is to parse ping's output as a stream instead of spawning a new ping process for each test. For example:
#!/bin/bash

ping -i 60 8.8.8.8 | while read line; do
    case "$line" in
        *time=*ms)
            t=${line##*=}   # strip off everything up to the last equals sign
            t=${t% *}       # strip off everything from the last space to the end
            t=${t%.*}       # drop the fractional part, since shell math is integer-only
            if (( t > 100 )); then
                : # do something
            else
                : # do something else
            fi
            ;;
    esac
done
These solutions are a bit problematic in that they fail to report when connectivity goes away ENTIRELY. But perhaps you can adapt them to handle that case too.
Note that these may not be your best solution. If you really want a monitoring system, larger scale things like Nagios, Icinga, Munin, etc, are a good way to go.
For small-scale ping monitoring like this, you might also want to look at fping.
There are a couple of transformations you'll need to do to the ping output in order to get the actual number of milliseconds.
First, to make this simple, use the -c 1 flag for ping to only send one packet.
The output for ping will look like:
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=59 time=41.101 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 41.101/41.101/41.101/0.000 ms
Since you want the '41.101' piece, you'll need to parse out the second-to-last field of the second line.
To extract the second line you can use the FNR variable in awk, and to get the second-to-last column you can use the NF (number of fields) variable.
ping -c 1 8.8.8.8 | awk 'FNR == 2 { print $(NF-1) }'
This will give you time=41.101; to get just the number, use cut to extract the field after the equals sign:
ping -c 1 8.8.8.8 | awk 'FNR == 2 { print $(NF-1) }' | cut -d'=' -f2
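The same extraction can also be done in one awk invocation by splitting on both spaces and equals signs (a hedged variant of the pipeline above):

# split fields on space or '=' so the millisecond value becomes its own field
ping -c 1 8.8.8.8 | awk -F'[ =]' 'FNR == 2 { print $(NF-1) }'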
This is what I did to get a trace of slow ping times and also have a mail sent to me (or anyone else, if you want that):
#!/bin/bash

if [ "$#" -ne 1 ]; then
    echo "You must enter 1 command line argument - the address you want to ping against"
    exit
fi

hostname=$(hostname)

while true; do
    RESULT=$(ping -c 1 $1 | awk -v time="$(date +"%Y-%m-%d %H:%M:%S")" -F'time=' 'NF>1{if ($2+0 > 1) print $1 $2 $4 $3 $5 " "time }')
    if [ "$RESULT" != "" ]; then
        echo $RESULT >> pingscript.log
        echo $RESULT | mail -s "pingAlert between $hostname - $1" foo@bar.com
    fi
    sleep 2
done
Here's a script I pulled together from different examples. It prints out the date/time along with "ok", or "FAIL" if the ping was slower than 100 ms:
#!/bin/bash
host=$1

if [ -z $host ]; then
    echo "Usage: `basename $0` [HOST]"
    exit 1
fi

function pingTestTime()
{
    while :; do
        test2 $1
        sleep 1
    done
}

function test2()
{
    now=`date "+%a %D %r"`
    timeinms=$(ping -c 1 $1 | grep -oP 'time=\K\S+')
    status=$?
    timeint=${timeinms%.*}
    if [ ! -z "$timeint" ]
    then
        if (( $timeint > 100 )); then
            extraText="FAIL (Slow)"
        else
            extraText="ok"
        fi
    else
        extraText="FAIL (Not Connected)"
    fi
    #echo "Status="$status
    echo $now $timeinms"ms" $extraText
}

pingTestTime $host

