Is there a way to stop scripts that are running simultaneously if one of them sends an echo? - bash

I need to find if a value (actually it's more complex than that) is in one of 20 servers I have. And I need to do it as fast as possible. Right now I am sending the scripts simultaneously to all the servers. My main script is something like this (but with all the servers):
#!/bin/sh
#mainScript.sh
value=$1
c1=`cat serverList | sed -n '1p'`
c2=`cat serverList | sed -n '2p'`
sh find.sh $value $c1 & sh find.sh $value $c2
#!/bin/sh
#find.sh
#some code here .....
if [ $? -eq 0 ]; then
rm $tempfile
else
myValue=`sed -n '/VALUE/p' $tempfile | awk 'BEGIN{FS="="} {print substr($2, 8, length($2)-2)}'`
echo "$myValue"
fi
So the script only prints a response if it finds the value on the server. I would like to know if there is a way to stop the other scripts once one of them has returned a value.
I tried adding an "exit" to the find.sh script, but it doesn't stop all the scripts. Can somebody please tell me if what I want to do is possible?

I would suggest that you use something that can handle this for you: GNU Parallel. From the linked tutorial:
If you are looking for success instead of failures, you can use success. This will finish as soon as the first job succeeds:
parallel -j2 --halt now,success=1 echo {}\; exit {} ::: 1 2 3 0 4 5 6
Output:
1
2
3
0
parallel: This job succeeded:
echo 0; exit 0

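I suggest you start by modifying your find.sh so that its return code depends on its success; that will let us identify a successful call more easily. The exit status of the command substitution is that of awk, which is 0 even when nothing matched, so test the extracted value itself; for instance:
myValue=`sed -n '/VALUE/p' $tempfile | awk 'BEGIN{FS="="} {print substr($2, 8, length($2)-2)}'`
# awk exits with 0 even without a match, so check that a value was extracted
[ -n "$myValue" ]
success=$?
echo "$myValue"
exit $success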
To terminate all the find.sh processes spawned by your script, you can use pkill with a parent process ID criterion and a command name criterion:
pkill -P $$ find.sh # $$ refers to the current process' PID
Note that this requires that you start the find.sh script directly rather than passing it as a parameter to sh. Normally that shouldn't be a problem, but if you have a good reason to call sh rather than your script, you can replace find.sh in the pkill command by sh (assuming you're not spawning other scripts you wouldn't want to kill).
Now that find.sh exits with success only when it finds the expected string, you can plug the two actions with && and run the whole thing in background :
{ find.sh $value $c1 && pkill -P $$ find.sh; } &
The first occurrence of find.sh that terminates with success will invoke the pkill command that will terminate all others (those killed processes will have non-zero exit codes and therefore won't run their associated pkill).
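If bash 4.3 or later is available, the same "first success wins" idea can also be sketched with the wait -n builtin instead of pkill. This is only an illustration: find_value is a hypothetical stand-in for find.sh (it pretends that only server2 holds the value), and first_success and the server names are invented for the example.

```bash
#!/bin/bash
# Hypothetical stand-in for find.sh: only "server2" has the value.
find_value() {
    local value=$1 server=$2
    if [ "$server" = "server2" ]; then
        sleep 0.2
        echo "$value found on $server"
        return 0
    fi
    sleep 1
    return 1
}

# Run one search per server in parallel; return 0 as soon as one succeeds,
# or 1 if every search failed.
first_success() {
    local value=$1; shift
    local server pids=() remaining status=1
    for server in "$@"; do
        find_value "$value" "$server" &
        pids+=("$!")
    done
    remaining=${#pids[@]}
    while [ "$remaining" -gt 0 ]; do
        if wait -n; then          # the job that just finished exited with 0
            status=0
            break
        fi
        remaining=$((remaining - 1))
    done
    # Stop whatever is still running, then reap it quietly.
    kill "${pids[@]}" 2>/dev/null
    wait 2>/dev/null
    return "$status"
}
```

Calling first_success "$value" server1 server2 server3 prints the line from the first search that succeeds. Note that kill only signals the background subshells; anything they started themselves (an ssh session, for example) may need its own cleanup.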

Related

Kill bash command when line is found

I want to kill a bash command when I found some string in the output.
To clarify, I want the solution to be similar to a timeout command:
timeout 10s looping_program.sh
This executes looping_program.sh and kills it after 10 seconds of execution.
Instead I want something like:
regexout "^Success$" looping_program.sh
Which will execute the script until it matches a line that just says Success in the stdout of the program.
Note that I'm assuming that this looping_program.sh does not exit at the same time it outputs Success for whatever reason, so simply waiting for the program to exit would waste time if I don't care about what happens after that.
So something like:
bash -e looping_program.sh > /tmp/output &
PID="$(ps aux | grep looping_program.sh | head -1 | tr -s ' ' | cut -f 2 -d ' ')"
echo $PID
while :; do
echo "$(tail -1 /tmp/output)"
if [[ "$(tail -1 /tmp/output)" == "Success" ]]; then
kill $PID
exit 0
fi
sleep 1
done
Where looping_program.sh is something like:
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Success"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
echo "Fail"
sleep 1;
But that is not very robust (uses a single tmp file... might kill other programs...) and I want it to just be one command. Does something like this exist? I may just write a c program to do it if not.
P.S.: I provided my code as an example of what I wanted the program to do. It does not use good programming practices. Notes from other commenters:
@KamilCuk: Do not use temporary file. Use a fifo.
@pjh: Note that any approach that involves using kill with a PID in shell code runs the risk of killing the wrong process. Use kill in shell programs only when it is absolutely necessary.
There are more suggestions below from other users, I just wanted to make sure no one came across this and thought it would be good to model their code after.
looping_program() {
for i in 1 2 3; do echo $i; sleep 1; done
echo Success
yes
}
coproc looping_program
while IFS= read -r line; do
if [[ "$line" =~ Success ]]; then
break
fi
done <&${COPROC[0]}
exec {COPROC[0]}>&- {COPROC[1]}>&-
kill ${COPROC_PID}
wait ${COPROC_PID}
Notes:
Do not use temporary file. Use a fifo.
Do not use tail -n1 to read last line. Read from the stream in a loop.
Do not repeat tail -1 twice. Cache the result.
Wait for pid after killing to synchronize.
When you're using a coprocess, use COPROC_PID to get the PID
When you're not using a coprocess, use $! to get the PID of a background process started from the current shell.
When you can't use $! (because the process you're trying to get a PID of was not spawned in the background as a direct child of the current shell), do not use ps aux | grep to get the pid. Use pgrep.
Do not use echo $(stuff). Just run the stuff, no echo.
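The notes above can be combined into a single reusable function, close to the regexout command the question asks for. This is a rough sketch, not a hardened tool: run_until_match is a hypothetical name, the pattern is treated as a bash extended regex, and the FIFO lives in a directory created with mktemp -d.

```bash
#!/bin/bash
# run_until_match PATTERN COMMAND [ARGS...]
# Hypothetical helper: echo COMMAND's output until a line matches PATTERN,
# then terminate COMMAND. Returns 0 on a match, 1 if COMMAND ended first.
run_until_match() {
    local pattern=$1; shift
    local dir line pid status=1
    dir=$(mktemp -d) || return 2
    mkfifo "$dir/out" || { rm -rf "$dir"; return 2; }
    "$@" > "$dir/out" &
    pid=$!                        # direct child of this shell, so $! is its PID
    while IFS= read -r line; do   # read the stream line by line, no tail
        printf '%s\n' "$line"
        if [[ $line =~ $pattern ]]; then
            status=0
            break
        fi
    done < "$dir/out"
    kill "$pid" 2>/dev/null       # harmless if the command already exited
    wait "$pid" 2>/dev/null       # synchronize with the killed process
    rm -rf "$dir"
    return "$status"
}
```

For example, run_until_match '^Success$' ./looping_program.sh prints every line up to and including Success and then stops the program.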
With expect
#!/usr/bin/env -S expect -f
set timeout -1
spawn ./looping_program.sh
expect "Success"
send -- "\x03"
expect eof
Call it looping_killer:
$ ./looping_killer
spawn ./looping_program.sh
Fail
Fail
Fail
Success
^C
To pass the program and pattern:
./looping_killer some_program "some pattern"
You'd change the expect script to
#!/usr/bin/env -S expect -f
set timeout -1
spawn [lindex $argv 0]
expect -- [lindex $argv 1]
send -- "\x03"
expect eof
Assuming that your looping program exits when it tries to write to a broken pipe, this will print all output up to and including the 'Success' line and then exit:
./looping_program | sed '/^Success$/q'
You may need to disable buffering of the looping program output. See Force line-buffering of stdout in a pipeline and How to make output of any shell command unbuffered? for ways to do it.
See Should I save my scripts with the .sh extension? and Erlkonig: Commandname Extensions Considered Harmful for reasons why I dropped the '.sh' suffix.
Note that any approach that involves using kill with a PID in shell code runs the risk of killing the wrong process. Use kill in shell programs only when it is absolutely necessary.

Check if bash script already running except itself with arguments

So I've looked up other questions and answers for this and as you can imagine, there are lots of ways to find this. However, my situation is kind of different.
I'm able to check whether a bash script is already running or not and I want to kill the script if it's already running.
The problem is that with the code below, since I'm running it within the same script, the script kills itself too because it sees a script already running.
result=`ps aux | grep -i "myscript.sh" | grep -v "grep" | wc -l`
if [ $result -ge 1 ]
then
echo "script is running"
else
echo "script is not running"
fi
So how can I check whether another instance of the script is already running, besides itself, and have the script kill itself if there is one, or else continue without killing itself?
I thought I could combine the above code with the $$ variable to find the script's own PID and differentiate them this way, but I'm not sure how to do that.
Also a side note, my script can be run multiple times at the same time within the same machine but with different arguments and that's fine. I only need to identify if script is already running with the same arguments.
pid=$(pgrep myscript.sh | grep -x -v $$)
# filter non-existent pids
pid=$(<<<"$pid" xargs -n1 sh -c 'kill -0 "$1" 2>/dev/null && echo "$1"' --)
if [ -n "$pid" ]; then
echo "Other script is running with pid $pid"
echo "Killing him!"
kill $pid
fi
pgrep lists the pids that match the name myscript.sh. From that list we filter out the current shell's pid, $$, with grep -v. If the result is non-empty, then you can kill the other pid.
It would work without the xargs, but pgrep myscript.sh also picks up the temporary pid created for the command substitution or the pipe. So the pid would never be empty, and the kill would always execute, complaining about the non-existent process. To avoid that, for each pid I check whether it exists with kill -0. If it does, it is printed, effectively filtering out all nonexistent pids.
You could also use a normal for loop to filter the pids:
# filter non-existent pids
pid=$(
for i in $pid; do
if kill -0 "$i" 2>/dev/null; then
echo "$i"
fi
done
)
Alternatively, you could use flock to lock the file and use lsof to list the processes that currently have it open, filtering out the current one. As it is now, I think it will also kill editors that are editing the file and such. I believe the lsof output could be filtered better to accommodate this.
if [ "${FLOCKER}" != "$0" ]; then
pids=$(lsof -p "^$$" -- ./myscript.sh | awk 'NR>1{print $2}')
if [ -n "$pids" ]; then
echo "Other processes with $(echo $pids) found. Killing them"
kill $pids
fi
exec env FLOCKER="$0" flock -en "$0" "$0" "$@"
fi
I would go with either of 2 ways to solve this problem.
1st solution: Create a watchdog file, say a .lck file, at a known location before starting the script's execution, and remove it once the script has completed successfully. (Make sure to use trap so that the .lck file is also removed if the script is aborted.)
Example script for the 1st solution: This is just a test example. We need to take care of interruptions in the script: if the script gets interrupted, it has not completed and you may need to kick it off again, so a trap should remove the lock file in that case too.
cat file.ksh
#!/bin/bash
PWD=`pwd`
watchdog_file="$PWD/script.lck"
if [[ -f "$watchdog_file" ]]
then
echo "Please wait script is still running, exiting from script now.."
exit 1;
else
touch $watchdog_file
fi
while true
do
echo "singh" > test1
done
if [[ -f "$watchdog_file" ]]
then
rm "$watchdog_file"
fi
2nd solution: Save the pid of the currently running shell, $$, in a file. On startup, check whether that process is still running: if it is, exit the script; if it is not, move on to run the statements in the script.
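Both solutions can be combined into one small sketch: an atomically created lock directory that stores the owner's pid, with the lock name derived from the arguments so that runs with different arguments do not block each other (as the question requires). All names here (lock_path_for, acquire_lock, release_lock, the myscript prefix) are hypothetical, and the stale-lock takeover has a small race that a production script would need to close.

```bash
#!/bin/bash
# Derive a lock path from the arguments, so that the same arguments map to
# the same lock and different arguments map to different locks.
lock_path_for() {
    local key
    key=$(printf '%s\0' "$@" | cksum | cut -d' ' -f1)
    printf '%s/myscript.%s.lock\n' "${TMPDIR:-/tmp}" "$key"
}

# Return 0 if the lock was taken, 1 if a live process already holds it.
acquire_lock() {
    local lockdir=$1 owner
    if mkdir "$lockdir" 2>/dev/null; then      # mkdir is atomic
        echo "$$" > "$lockdir/pid"
        return 0
    fi
    owner=$(cat "$lockdir/pid" 2>/dev/null)
    if [ -n "$owner" ] && kill -0 "$owner" 2>/dev/null; then
        return 1                               # owner is still running
    fi
    # Stale lock left by an aborted run: take it over (racy, see lead-in).
    rm -rf "$lockdir"
    mkdir "$lockdir" 2>/dev/null && echo "$$" > "$lockdir/pid"
}

release_lock() {
    rm -rf "$1"
}
```

A script would typically call lock=$(lock_path_for "$@"), exit if acquire_lock "$lock" fails, and then register trap 'release_lock "$lock"' EXIT so the lock is removed even when the script is interrupted.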

question on using bwait to wait for multiple bsub jobs to finish

I am new to using LSF (been using PBS/Torque all along).
I need to write code/logic to make sure all bsub jobs finish before other commands/jobs can be fired.
Here is what I have done: I have a master shell script which calls multiple other shell scripts via bsub commands. I capture the job ids from bsub in a log file and I need to ensure that all jobs get finished before the downstream shell script should execute its other commands.
Master shell script
#!/bin/bash
...Code not shown for brevity..
"Command 1 invoked with multiple bsubs" > log_cmd_1.txt
Need Code logic to use bwait before downstream Commands can be used
"Command 2 will be invoked with multiple bsubs" > log_cmd_2.txt
and so on
stdout captured from Command 1 within the Master Shell script is stored in log_cmd_1.txt which looks like this
Submitting Sample 101
Job <545> is submitted to .
Submitting Sample 102
Job <546> is submitted to .
Submitting Sample 103
Job <547> is submitted to .
Submitting Sample 104
Job <548> is submitted to .
I have used the codeblock shown below after Command 1 in the master shell script.
However, it does not seem to work for my situation. Looks like I would have gotten the whole thing wrong below.
while sleep 30m;
do
#the below gets the JobId from the log_cmd_1.txt and tries bwait
grep '^Job' <path_to>/log_cmd_1.txt | perl -pe 's!.*?<(\d+)>.*!$1!' | while read -r line; do res=$(bwait -w "done($line)");echo $res; done 1> <path_to>/running.txt;
# the below sed command deletes lines that start with Space
sed '/^\s*$/d' running.txt > running2.txt;
# -s file check operator means "file is not zero size"
if [ -s $WORK_DIR/logs/running2.txt ]
then
echo "Jobs still running";
else
echo "Jobs complete";
break;
fi
done
The question: What's the correct way to do this using bwait within the master shell script.
Thanks in advance.
bwait will block until the condition is satisfied, so the loops are probably not necessary. Note that since you're using done, if the job fails then bwait will exit and inform you that the condition can never be satisfied. Make sure to check that case.
What you have should work. At least the following test worked for me.
#!/bin/bash
# "Command 1 invoked with multiple bsubs" > log_cmd_1.txt
( bsub sleep 0; bsub sleep 0 ) > log_cmd_1.txt
# Need Code logic to use bwait before downstream Commands can be used
while sleep 1
do
#the below gets the JobId from the log_cmd_1.txt and tries bwait
grep '^Job' log_cmd_1.txt | perl -pe 's!.*?<(\d+)>.*!$1!' | while read -r line; do res=$(bwait -w "done($line)");echo "$res"; done 1> running.txt;
# the below sed command deletes lines that start with Space
sed '/^\s*$/d' running.txt > running2.txt;
# -s file check operator means "file is not zero size"
if [ -s running2.txt ]
then
echo "Jobs still running";
else
echo "Jobs complete";
break;
fi
done
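Since bwait blocks until its condition is satisfied, the polling loop can in principle be reduced to a single call. The sketch below only builds the condition string from the log file; build_wait_condition is a hypothetical helper, and it assumes that bwait -w accepts the same dependency-expression syntax as bsub -w, including the && operator.

```bash
#!/bin/bash
# Hypothetical helper: turn the "Job <NNN> is submitted ..." lines of a bsub
# log into one wait condition such as "done(545) && done(546)".
build_wait_condition() {
    local logfile=$1 id cond=""
    while IFS= read -r id; do
        cond+="${cond:+ && }done($id)"
    done < <(sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p' "$logfile")
    printf '%s\n' "$cond"
}

# Usage (requires LSF, so it is left commented out here):
#   cond=$(build_wait_condition log_cmd_1.txt)
#   if bwait -w "$cond"; then
#       echo "Jobs complete"
#   else
#       echo "bwait failed: a job may have exited with an error" >&2
#       exit 1
#   fi
```

Checking the exit status of bwait covers the case mentioned above, where a failed job means the done condition can never be satisfied.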
Another way to do it, which may be a little cleaner, is to use job arrays and job dependencies. Job arrays combine several pieces of work so that they can be managed as a single job. So your
"Command 1 invoked with multiple bsubs" > log_cmd_1.txt
could be submitted as a single job array. You'll need a driver script that can launch the individual jobs. Here's an example driver script.
$ cat runbatch1.sh
#!/bin/bash
# $LSB_JOBINDEX goes from 1 to 10
if [ "$LSB_JOBINDEX" -eq 1 ]; then
# do the work for job batch 1, job 1
...
elif [ "$LSB_JOBINDEX" -eq 2 ]; then
# etc
...
fi
Then you can submit the job array like this.
bsub -J 'batch1[1-10]' sh runbatch1.sh
This command will run 10 job array elements. The driver script's environment will contain the variable LSB_JOBINDEX to let you know which element the driver is running. Since the array has a name, batch1, it's easier to manage. You can submit a second job array that won't start until all elements of the first have completed successfully. The second array is submitted with this command.
bsub -w 'done(batch1)' -J 'batch2[1-10]' sh runbatch2.sh
I hope that this helps.

Run bash script in background by default

I know I can run my bash script in the background by using bash script.sh & disown or alternatively, by using nohup. However, I want to run my script in the background by default, so when I run bash script.sh or after making it executable, by running ./script.sh it should run in the background by default. How can I achieve this?
Self-contained solution:
#!/bin/bash
# Re-spawn as a background process, if we haven't already.
if [[ "$1" != "-n" ]]; then
nohup "$0" -n &
exit $?
fi
# Rest of the script follows. This is just an example.
for i in {0..10}; do
sleep 2
echo $i
done
The if statement checks if the -n flag has been passed. If not, it calls itself with nohup (to dissociate it from the calling terminal, so closing the terminal doesn't close the script) and & (to put the process in the background and return to the prompt). The parent then exits to leave the background version to run. The background version is explicitly called with the -n flag, so it won't cause an infinite loop (which is hell to debug!).
The for loop is just an example. Use tail -f nohup.out to see the script's progress.
Note that I pieced this answer together from two other answers, but neither was succinct or complete enough to be a duplicate.
Simply write a wrapper that calls your actual script with nohup actualScript.sh &.
Wrapper script wrapper.sh
#! /bin/bash
nohup ./actualScript.sh &
Actual script in actualScript.sh
#! /bin/bash
for i in {0..10}
do
sleep 10 #script is running, test with ps -eaf|grep actualScript
echo $i
done
tail -n 10 -f nohup.out
0
1
2
3
4
...
Adding to Heath Raftery's answer, what worked for me is a variation of what he suggested such as this:
if [[ "$1" != "-n" ]]; then
$0 -n & disown
exit $?
fi

Quit from pipe in bash

For following bash statement:
tail -Fn0 /tmp/report | while [ 1 ]; do echo "pre"; exit; echo "past"; done
I got "pre", but it didn't return to the bash prompt. Then, if I wrote something into /tmp/report, the pipeline quit and I got back to the bash prompt.
I think that's reasonable: the 'exit' makes the 'while' statement quit, but 'tail' is still alive. If something is written into /tmp/report, 'tail' writes it to the pipe, detects that the pipe is closed, and quits.
Am I right? If not, would anyone provide a correct interpretation?
Is it possible to add anything to the 'while' statement to quit from the whole pipeline immediately? I know I could save the pid of tail in a temporary file, read this file in the 'while', and then kill tail. Is there a simpler way?
Let me extend my question. If I use this tail|while in a script file, is it possible to fulfill the following items simultaneously?
a. If Ctrl-C is entered or the main shell process is signalled, the main shell and the various subshells and background processes spawned by the main shell will quit.
b. I can quit from tail|while only on a trigger condition, and keep the other subprocesses running.
c. Preferably without using a temporary file or pipe file.
You're correct. The while loop is executing in a subshell because it is part of a pipeline, and exit just exits from that subshell.
If you're running bash 4.x, you may be able to achieve what you want with a coprocess.
coproc TAIL { tail -Fn0 /tmp/report.txt ;}
while [ 1 ]
do
echo "pre"
break
echo "past"
done <&${TAIL[0]}
kill $TAIL_PID
http://www.gnu.org/software/bash/manual/html_node/Coprocesses.html
With older versions, you can use a background process writing to a named pipe:
pipe=/tmp/tail.$$
mkfifo $pipe
tail -Fn0 /tmp/report.txt >$pipe &
TAIL_PID=$!
while [ 1 ]
do
echo "pre"
break
echo "past"
done <$pipe
kill $TAIL_PID
rm $pipe
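The subshell behaviour behind all of this can be seen with a small experiment. The two functions below are purely illustrative (their names are made up): the first runs its loop as part of a pipeline, the second feeds the loop through a redirection from a process substitution, which keeps the loop in the current shell. It assumes bash with the lastpipe option off, which is the default.

```bash
#!/bin/bash
# The loop is the last element of a pipeline, so it runs in a subshell:
# the increments to n are lost when the subshell ends.
count_in_pipeline() {
    local n=0
    printf 'a\nb\n' | while read -r _; do n=$((n + 1)); done
    echo "$n"
}

# The loop reads from a redirection, so it runs in the current shell:
# the increments to n are still visible afterwards.
count_with_redirect() {
    local n=0
    while read -r _; do n=$((n + 1)); done < <(printf 'a\nb\n')
    echo "$n"
}

count_in_pipeline     # prints 0
count_with_redirect   # prints 2
```

The same applies to exit and break: in the pipeline form they only affect the subshell, which is why the coprocess and named-pipe variants above use a redirection instead.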
You can (unreliably) get away with killing the process group:
tail -Fn0 /tmp/report | while :
do
echo "pre"
sh -c 'PGID=$( ps -o pgid= $$ | tr -d \ ); kill -TERM -$PGID'
echo "past"
done
This may send the signal to more processes than you want. If you run the above command in an interactive terminal you should be okay, but in a script it is entirely possible (indeed likely) that the process group will include the script running the command. To avoid sending the signal to the wrong processes, it would be wise to enable monitoring and run the pipeline in the background to ensure that a new process group is formed for the pipeline:
#!/bin/sh
# In POSIX shells that support the User Portability Utilities option
# (this includes bash & ksh), executing "set -m" turns on job control.
# Background processes run in a separate process group. If the shell
# is interactive, a line containing their exit status is printed to
# stderr upon their completion.
set -m
tail -Fn0 /tmp/report | while :
do
echo "pre"
sh -c 'PGID=$( ps -o pgid= $$ | tr -d \ ); kill -TERM -$PGID'
echo "past"
done &
wait
Note that I've replaced the while [ 1 ] with while : because while [ 1 ] is poor style. (It behaves exactly the same as while [ 0 ]).
