I have several binaries in the same folder that I want to run in sequence.
None of them terminates on its own; each waits for data on a socket interface. I also need to decide whether to run the next binary based on the output of the previous one. My idea was to run each binary in the background, redirect its output to a file, and "grep" that file for a keyword. However, unless I use wait, I can't capture all the output I need from the previous binary; but if I do use wait, I never get control back, because the binary is listening on the socket and will not return.
What can I do here?
Here is some sample code:
/home/test_1 > test_1_log &
test_1_id=$!
wait
# ===> I also want to grep "Success" in test_1_log here.
# ===> can't get here because of wait.
/home/test_2 > test_2_log &
test_2_id=$!
wait
Thanks
Can you use sleep instead of wait?
The problem is that you can't wait for it to return, because it won't. At the same time, you have to wait for some output. If you know that "Success" or something similar will be output, then you can loop, with a sleep, until that line appears.
RC=1
while [ $RC -ne 0 ]
do
    sleep 1
    grep -q 'Success' test_1_log
    RC=$?
done
That also allows you to stop waiting after, say, 10 iterations or so, making sure your script eventually exits.
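For example, a bounded variant of that loop (a sketch; the 10-iteration cap and the error handling are arbitrary choices):

tries=0
until grep -q 'Success' test_1_log; do
    tries=$((tries + 1))
    if [ "$tries" -ge 10 ]; then
        echo "test_1 did not report Success in time" >&2
        exit 1
    fi
    sleep 1
done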
I saw here the use of:
while ps | grep " $my_pid "
Question: In this kind of syntax while -command-, what is the while loop checking, return code of the command or stdout?
It's checking the return value of the process pipeline, which happens to be the return value of the last element in that pipeline (unless pipefail is set, but it usually isn't). The bash doco has this to say:
while list-1; do list-2; done
The while command continuously executes the list list-2 as long as the last command in the list list-1 returns an exit status of zero.
Elsewhere, it states:
The return status of a pipeline is the exit status of the last command
So this while statement continues as long as grep returns zero. And the grep doco states:
The exit status is 0 if a line is selected.
So the intent is almost certainly to keep looping as long as the process you're monitoring is still alive.
Of course, this is a rather "flaky" way of detecting whether your process is still running. Without the surrounding spaces in the pattern, a my_pid of 60 would also match processes 602 or 3060, or the 60 in a sleep 3600 command line.
Even with the spaces, and depending on your ps output format, it can still match a " 60 " appearing in some other column, such as the argument of a sleep 60 process, no matter what that process's ID is.
Perhaps a better way, assuming you're on a system with procfs, is to use something like:
while [[ -d /proc/$my_pid ]] ; do ...
This will solve everything but the possibility that process IDs may be recycled so that a different process may start up with the same PID between checks but, since most UNIX-like systems allocate PIDs sequentially with wrap-around, that's very unlikely.
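A minimal liveness loop built on that check might look like this (a sketch; on systems without procfs, kill -0 "$my_pid" performs a similar "does this PID exist" test):

# Poll once a second until the watched process disappears from /proc:
while [[ -d /proc/$my_pid ]]; do
    sleep 1
done
echo "process $my_pid has exited"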
I want to send multiple jobs to a remote computer, so I wrote a for loop that iterates over the jobs, each of which consists of several subcommands. I need to pause before the next iteration until a certain subcommand has executed and the job is actually running on the remote computer.
The idea is to check whether the string "PEND" appears in the output of a command on the remote computer, and to let the for loop continue only once "PEND" changes to "RUN". I don't know whether an if statement is the right tool here. A fixed waiting time using sleep won't do the trick, because the status change from PEND to RUN is highly irregular.
Additional information: the subcommands include compiling an executable.
Erroneous pseudocode:
for i in {1..10}
do
    subcommands
    ...
    if [[ jobs | grep "PEND" == TRUE ]]; then sleep 1
    fi
done
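For reference, the intended check reads more naturally as a polling loop than an if. In the sketch below, job_status_command is a placeholder for whatever command prints the PEND/RUN state of the job on the remote machine:

for i in {1..10}
do
    subcommands
    ...
    # Poll until the job is no longer pending:
    while job_status_command | grep -q "PEND"
    do
        sleep 10
    done
done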
I have a bash script with a loop that calls a heavy calculation routine on every iteration, and I use the results of each calculation as input to the next. I need to make bash stop reading the script until each calculation is finished.
for i in $(cat calculation-list.txt)
do
    ./calculation
    (other commands)
done
I know about the sleep program, and I used to use it, but the time the calculations take now varies greatly.
Thanks for any help you can give.
P.S.
"./calculation" is another program, and it opens a subprocess. The script then passes instantly to the next step, but I get an error in the calculation because the previous one is not finished yet.
If your calculation daemon will work with a precreated empty logfile, and it closes that file just once, then the inotify-tools package might serve:

touch "$logfile"
inotifywait -qqe close "$logfile" & ipid=$!
./calculation
wait $ipid
If it's doing an open/write/close loop, perhaps you can modify the daemon process to wrap some other filesystem event around the execution:
#!/bin/sh
# Uglier, but handles logfile being closed multiple times before exit:
# Have ./calculation start this shell script, perhaps by substituting
# it for the program it would otherwise start
trap 'echo >closed-on-calculation-exit' 0 1 2 3 15
./real-calculation-daemon-program
Well, guys, I've solved my problem with a different approach: when the calculation is finished, a logfile is created, so I wrote a simple until loop with a sleep command. Although this is very ugly, it works for me and it's enough.
for i in $(cat calculation-list.txt)
do
    rm -f "$logfile"   # clear any stale logfile from a previous iteration
                       # (assumes $logfile is the same path each time)
    (calculations routine)
    until [[ -f $logfile ]]; do
        sleep 60
    done
    (other commands)
done
Easy. Get the process ID (PID) via some awk magic and then use wait to wait for that PID to end. Here are the details on wait from the Advanced Bash-Scripting Guide:
Suspend script execution until all jobs running in background have
terminated, or until the job number or process ID specified as an
option terminates. Returns the exit status of waited-for command.
You may use the wait command to prevent a script from exiting before a
background job finishes executing (this would create a dreaded orphan
process).
And using it within your code should work like this:
for i in $(cat calculation-list.txt)
do
    ./calculation >/dev/null 2>&1 &
    CALCULATION_PID=$(jobs -l | awk '{print $2}')
    wait ${CALCULATION_PID}
    (other commands)
done
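For what it's worth, bash also records the PID of the most recently started background job in $!, so the awk pipeline can be replaced with a plain assignment (a sketch of the same loop body):

./calculation >/dev/null 2>&1 &
CALCULATION_PID=$!        # PID of the job just backgrounded
wait ${CALCULATION_PID}   # block until that job exits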
I'm creating a startup/shutdown script for WebSEAL. It's written to allow several instances to be stopped/started in parallel. The only problem is verifying that it completed without issue. With other infrastructures, I could simply grep for a particular keyword in the output (which I redirect to a log file), but WebSEAL does not give any success/error message.
Instead, I thought to use $? to put the exit status into a dynamically named variable that will be checked after the startups have occurred (during log consolidation).
Here is the code that starts/stops an instance and then creates the variable:
${PDCOMMAND} >> ${LOGDIR}/${APP}.txt 2>&1 &
let return_${APP}=$?
PDCOMMAND is a valid startup/stop command, e.g. pdweb start my_instance
APP is the name of the instance, e.g. my_instance
The goal is that return_${APP} (return_my_instance) will have a value of 0 (success) or 1 (failure) when I check it at a later point in the script.
Are there problems with using $? for a command that may not have completed at the time $? is read, or is it only set once the command completes? So let's say I have 3 instances:
instance_1, instance_2, instance_3
if I ran the following:
pdweb start instance_1 &
let return_instance_1=$?
pdweb start instance_2 &
let return_instance_2=$?
pdweb start instance_3 &
let return_instance_3=$?
would return_instance_[1|2|3] have the correct values if they started in unequal amounts of time? If instance_3 starts before instance_1, for example, will it still output the result of instance_3 to return_instance_3?
Basically, I'm trying to figure out how the command line treats an asynchronous request in regards to the exit status.
Thanks in advance
No; the exit status code is only available when the command finishes. (That's why it's called "exit status".) If you successfully spawned a service and it is up and running, it does not yet have an exit status.
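You can see this with a quick demonstration script: immediately after backgrounding a command, $? only reports whether the shell managed to spawn the job, not how the command will eventually exit:

#!/bin/bash
false &       # a command guaranteed to exit nonzero
echo $?       # prints 0: the status of spawning the job, not of false
wait $!
echo $?       # prints 1: the real exit status, available only after wait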
If I have correctly guessed what you are trying to accomplish, you could save the value of $! after starting each instance, wait for a "reasonable" time (a few seconds?), and then check that the processes you started are still running. If one has terminated, there was a problem.
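A sketch of that approach (the associative array requires bash 4+; the five-second grace period is an arbitrary choice, and LOGDIR is taken from the question):

declare -A pids
for APP in instance_1 instance_2 instance_3; do
    pdweb start "$APP" >> "${LOGDIR}/${APP}.txt" 2>&1 &
    pids[$APP]=$!                  # remember each instance's PID
done
sleep 5                            # arbitrary grace period
for APP in "${!pids[@]}"; do
    if kill -0 "${pids[$APP]}" 2>/dev/null; then
        declare "return_${APP}=0"  # still running: startup looks good
    else
        declare "return_${APP}=1"  # already gone: startup presumably failed
    fi
done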
I have a task that fits very well inside a bash for loop. The problem is that a few of the iterations seem never to terminate. What I'm looking for is a way to introduce a timeout, so that if an iteration of the command hasn't terminated after e.g. two hours it is killed, and the loop moves on to the next iteration.
Rough outline:
for somecondition; do
    while time-run(command) < 2h do
        continue command
    done
done
One (tedious) way is to start the process in the background, then start another background process that attempts to kill the first one after a fixed timeout.
timeout=7200   # two hours, in seconds
for somecondition; do
    command & command_pid=$!
    ( sleep $timeout & wait; kill $command_pid 2>/dev/null ) & sleep_pid=$!
    wait $command_pid
    kill $sleep_pid 2>/dev/null    # in case command completes before the timeout
done
The first wait blocks until the original command completes, whether naturally or because it was killed when the sleep expired. The sleep $timeout & wait idiom inside the subshell keeps the watcher interruptible: a shell that is running a foreground command does not act on signals until that command finishes, whereas the wait builtin returns as soon as a signal arrives.
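As an aside, if GNU coreutils is available, its timeout utility packages this whole pattern into one call (somecondition and command are placeholders carried over from the outline above):

for somecondition; do
    timeout 2h command   # kill command if it is still running after two hours
done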
If I'm understanding your requirement properly, you have a process that needs to run, but you want to make sure that if it gets stuck it moves on, right? I don't know if this will fully help you out, but here is something I wrote a while back to do something similar (I've since improved it a bit, but I only have access to a gist at present; I'll update with the better version later).
#!/bin/bash
######################################################
# Program: logGen.sh
# Date Created: 22 Aug 2012
# Description: parses logs in real time into daily error files
# Date Updated: N/A
# Developer: #DarrellFX
######################################################
#Prefix for pid file
pidPrefix="logGen"
#Output directory
outDir="/opt/Redacted/logs/allerrors"

#Simple function to see if running on primary
checkPrime ()
{
    if /sbin/ifconfig eth0:0 | /bin/grep -wq inet; then isPrime=1; else isPrime=0; fi
}

#Function to kill previous instances of this script
killScript ()
{
    /usr/bin/find /var/run -name "${pidPrefix}.*.pid" | while read pidFile; do
        if [[ "${pidFile}" != "/var/run/${pidPrefix}.${$}.pid" ]]; then
            /bin/kill -- -$(/bin/cat ${pidFile})
            /bin/rm ${pidFile}
        fi
    done
}

#Check to see if primary
#If so, kill any previous instance and start log parsing
#If not, just kill leftover running processes
checkPrime
if [[ "${isPrime}" -eq 1 ]]; then
    echo "$$" > /var/run/${pidPrefix}.$$.pid
    killScript
    commands && commands && commands #Where the actual commands to run go.
else
    killScript
    exit 0
fi
I then set this script to run from cron every hour. Every time the script runs, it:
1. creates a lock file, named after a variable that describes the script, containing the PID of that instance of the script
2. calls the function killScript, which uses the find command to find all lock files for that version of the script (this lets more than one of these scripts be scheduled in cron at once, for different tasks); for each file it finds, it kills the process group recorded in that lock file and removes the lock file (it automatically checks that it's not killing itself)
3. starts doing whatever it is I need to run and not get stuck (I've omitted that, as it's hideous bash string manipulation that I've since redone in Python)
If this doesn't get you squared away, let me know.
A few notes:
- the checkPrime function is poorly done, and should either return a status or just exit the script itself
- there are better ways to create lock files and be safe about it, but this has worked for me thus far (famous last words); one such way is sketched below
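For instance, flock(1) from util-linux can take care of the locking (a sketch; the lock path and the trailing commands are placeholders):

#!/bin/bash
# Hold an exclusive, non-blocking lock on fd 9 for the life of the script;
# if another instance already holds the lock, give up immediately.
exec 9>"/var/run/${pidPrefix}.lock"
flock -n 9 || exit 0
commands && commands && commands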