Bash script not exiting once a process is running - bash

How should I modify my bash script logic so it exits the while loop and exits the script itself once a process named custom_app is running on my local Ubuntu 18.04? I've tried using break and exit inside an if statement with no luck.
If custom_app starts on, say, the 1st attempt and I then quit the app, run_custom_app.sh lingers in the background and resumes retrying a 2nd, 3rd, 4th and 5th time. It should do nothing at that point, since the app already ran successfully and the user quit it intentionally.
Below is run_custom_app.sh used to run my custom app triggered from a website button click.
Script logic
Check if custom_app process is running already. If so, don't run the commands in the while code block. Do nothing. Exit run_custom_app.sh.
While custom_app process is NOT running, retry up to 5 times.
Once custom_app process is running, stop while loop and exit run_custom_app.sh as well.
In cases where 5 run retries have been attempted but custom_app process is still not running, display a message to the user.
#!/bin/sh
RETRYCOUNT=0
PROCESS_RUNNING=`ps cax | grep custom_app`
# Try to connect until process is running. Retry up to 5 times. Wait 10 secs between each retry.
while [ ! "$PROCESS_RUNNING" ] && [ "$RETRYCOUNT" -le 5 ]; do
    RETRYCOUNT="`expr $RETRYCOUNT + 1`"
    commands
    sleep 10
    PROCESS_RUNNING=`ps cax | grep custom_app`
    if [ "$PROCESS_RUNNING" ]; then
        break
    fi
done
# Display an error message if not connected after 5 connection attempts
if [ ! "$PROCESS_RUNNING" ]; then
    echo "Failed to connect, please try again in about 2 minutes" # I need to modify this later so it opens a Terminal window displaying the echo statement, not yet sure how.
fi

I have tested this code with VirtualBox standing in for your custom_app. An earlier version of this post used an until loop with pgrep instead of ps. As suggested by DavidC.Rankin, pidof is more correct, but if you want to use ps then I suggest ps -C custom_app -o pid=
#!/bin/sh
retrycount=0
until my_app_pid=$(ps -C VirtualBox -o pid=); do ##: save the output of ps in a variable so we can check/test it later.
    echo commands ##: Just echoed the command here, not sure which commands you are using/running.
    if [ "$retrycount" -eq 4 ]; then ##: We started at 0 so the fifth count is 4
        break ##: exit the loop
    fi
    sleep 10
    retrycount=$((retrycount+1)) ##: increment by one using shell syntax without expr
done
if [ -n "$my_app_pid" ]; then ##: if $my_app_pid is not empty
    echo "app is running"
else
    echo "Failed to connect, please try again in about 2 minutes" >&2 ##: print the message to stderr
    exit 1 ##: exit with a failure which is not 0
fi
The my_app_pid=$(ps -C VirtualBox -o pid=) variable assignment has a useful exit status so we can use it.
Basically the until loop is just the opposite of the while loop.
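The retry pattern can also be wrapped in a small helper. This is only a sketch: retry_until and its arguments are names invented here, not part of the answer above, and it assumes the check command exits with 0 once the process is running.

```shell
#!/bin/sh
# retry_until MAX DELAY CMD...  (hypothetical helper, not from the answer above)
# Runs CMD until it succeeds, at most MAX times, sleeping DELAY seconds between tries.
retry_until() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1            # gave up after MAX failed checks
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0                    # CMD succeeded
}

# Demo with commands that always succeed or always fail:
if retry_until 5 0 true; then
    echo "check succeeded on attempt $attempt"
fi
if ! retry_until 3 0 false; then
    echo "gave up after $attempt attempts"
fi
```

In run_custom_app.sh the check could be something like retry_until 5 10 pgrep -x custom_app, with the commands that start the app run before each check; how to combine the two depends on what those commands are.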

Related

bash run multiple files exit condition

I have a function like so
function generic_build_a_module(){
move_to_the_right_directory
echo 'copying the common packages'; ./build/build_sdist.sh;
echo 'installing the api common package'; ./build/cache_deps.sh;
}
I want to exit the function if ./build/build_sdist.sh doesn't finish successfully.
Here is the content of ./build/build_sdist.sh
... multiple operations....
echo "installing all pip dependencies from $REQUIREMENTS_FILE_PATH and placing their tar.gz into $PACKAGES_DIR"
pip install --no-use-wheel -d $PACKAGES_DIR -f $PACKAGES_DIR -r $REQUIREMENTS_FILE_PATH $PACKAGES_DIR/*
In other words, how does the main function generic_build_a_module "know" whether ./build/build_sdist.sh finished successfully?
You can check the exit status of a command by surrounding it with an if. ! inverts the exit status. Use return 1 to exit your function with exit status 1.
generic_build_a_module() {
    move_to_the_right_directory
    echo 'copying the common packages'
    if ! ./build/build_sdist.sh; then
        echo "Aborted due to error while executing build."
        return 1
    fi
    echo 'installing the api common package'
    ./build/cache_deps.sh;
}
If you don't want to print an error message, the same program can be written shorter using ||.
generic_build_a_module() {
    move_to_the_right_directory
    echo 'copying the common packages'
    ./build/build_sdist.sh || return 1
    echo 'installing the api common package'
    ./build/cache_deps.sh;
}
Alternatively, you could use set -e. This will exit your script immediately when some command exits with a non-zero status.
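A minimal sketch of the set -e behaviour, run in a child sh so that the demonstration does not abort the calling shell. The commands inside the child are placeholders. Keep in mind that set -e is ignored for commands tested by if, while, && or ||, so it is not a drop-in replacement for explicit checks.

```shell
#!/bin/sh
# The child shell stops at the first failing command, so its echo never runs.
if output=$(sh -c 'set -e; false; echo "not reached"'); then
    status=0
else
    status=$?                   # exit status of the child shell
fi
echo "child exit status: $status, output: '$output'"
```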
You have to do the following:
Run both scripts in the background and store their process IDs in two variables.
Check at a regular interval, say every 1 to 2 seconds, whether the scripts have completed.
Kill any process that has not completed after a specific time, say 30 seconds.
Example:
sdist=$(ps -fu "$USER" | grep -v "grep" | grep "build_sdist.sh" | awk '{print $2}')
OR
sdist=$(ps -fu "$USER" | grep '[b]uild_sdist.sh' | awk '{print $2}')
deps=$(ps -fu "$USER" | grep -v "grep" | grep "cache_deps.sh" | awk '{print $2}')
Now use a while loop to check the status at a regular interval, or just check the status once after 30 seconds, like below:
sleep 30
# kill -0 sends no signal; it only tests whether the process still exists
if [ -n "$sdist" ] && kill -0 $sdist 2>/dev/null; then
    kill -9 $sdist
fi
if [ -n "$deps" ] && kill -0 $deps 2>/dev/null; then
    kill -9 $deps
fi
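The same idea can be sketched without searching the ps output, by starting the command in the background and keeping its PID from $!. The function name run_with_timeout and the status 124 for a timeout (borrowed from GNU timeout) are assumptions made for this example.

```shell
#!/bin/sh
# run_with_timeout SECONDS CMD...  (hypothetical helper)
# Runs CMD in the background, polls once per second, and kills it after SECONDS.
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    pid=$!                      # PID of the background command
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            kill "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
            return 124          # timed out
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"                 # return the command's own exit status
}

# Demo with placeholder commands:
run_with_timeout 5 true && echo "finished in time"
run_with_timeout 1 sleep 10 || echo "timed out, status $?"
```

Where GNU coreutils is available, timeout 30 ./build/build_sdist.sh does much the same in one line.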
You can check the exit code status of the last executed command by checking the $? variable. Exit code 0 is a typical indication that the command completed successfully.
Exit codes can be set by using exit followed by the code number within a script.
Here's a previous question regarding the use of $? with more detail, but to simply check this value try:
echo "test";echo $?
# Example
echo 'copying the common packages'; ./build/build_sdist.sh;
if [ $? -ne 0 ]; then
echo "The last command exited with a non-zero code"
fi
[ $? -ne 0 ] checks whether the exit code of the last executed command is not equal to 0. Exit codes are reported in the range 0 to 255 (in bash, exit -1 shows up as 255), so this test also captures codes that were set as negative numbers such as -1.
The caveat of the above approach is that we have only checked against the last command executed and not the ... multiple operations.... that you mentioned, so we may have missed an error generated by a command executed before pip install.
Depending on the situation you could set -e within a subsequent script, which instructs the shell to exit the script at the first instance a command exits with a non-zero status.
Another option would be to perform a similar operation to the example within ./build/build_sdist.sh, checking the exit code of each command. This would give you the most control over when and how the script finishes and allows the script to set its own exit code.
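Checking each command could look roughly like this. It is a sketch only: run_step is a hypothetical helper, and the three steps are placeholders for the real operations in build_sdist.sh.

```shell
#!/bin/sh
# run_step CMD...  (hypothetical helper): run one command and report a failure.
run_step() {
    "$@"
    rc=$?                       # capture immediately; any later command overwrites $?
    if [ "$rc" -ne 0 ]; then
        echo "step failed with exit code $rc: $*" >&2
    fi
    return "$rc"
}

# Placeholder steps: the second one fails, so the third never runs.
final=0
run_step true && run_step sh -c 'exit 3' && run_step echo "never printed" || final=$?
echo "script would exit with $final"
```

A real script would end with exit "$final", so the caller sees the code of the step that failed.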

get the number of seconds left for the sleep command to end in a shell script

I built a shell script that sleeps for a specified number of minutes and shows a notification when it is done.
TIME=$(zenity --scale --title="Next Session in (?) minutes")
sleep $TIME'm'
BEEP="/usr/share/sounds/freedesktop/stereo/complete.oga"
paplay $BEEP
notify-send "Next Session" "Press <Ctrl><Shift><s> to run the script again"
I prevented multiple instances of the program from executing by using a file-based approach at the beginning of the code. When a user tries to run the script while another instance is running, it shows a notification that the script is already running.
LOCKFILE=/tmp/lock.txt
if [ -e ${LOCKFILE} ] && kill -0 `cat ${LOCKFILE}`; then
notify-send "Already Running" $SECONDS
exit
fi
trap "rm -f ${LOCKFILE}; exit" INT TERM EXIT
echo $$ > ${LOCKFILE}
and finally remove the temporary file at the end of the script
rm -f ${LOCKFILE}
Now I want to add a text to the notification that tells how many seconds are left for the sleep command in my shell script to end. (changing the already running notification as follows)
notify-send "Already Running" $SECONDS
To implement the sleep command with my own controlled while loop would affect the overall performance of the computer. I think the sleep command is a better option as it optimizes the process by sending itself to a waiting state in the process queue.
Is there any way I can go around the problem?
Store the time when the script is supposed to end in the lock file.
if [ -e "$LOCKFILE" ]; then
    read pid endtime < "$LOCKFILE"
    if kill -0 "$pid"; then
        # seconds left = stored end time minus the current time
        notify-send "Already running" "$((endtime - $(date +%s)))"
        exit
    fi
fi
trap "rm -f ${LOCKFILE}" EXIT # Use cascaded trap
trap 'exit 127' INT TERM
echo $$ $(($(date +%s) + (60 * TIME))) >"$LOCKFILE"
There is a race condition here: if two scripts are started at almost the same time, the first could be inside the if but before the echo when the second starts. If you really need to prevent that, use a lock directory instead of a file. Directory creation is atomic, and either succeeds or fails at a single point in time. You will then need to clean out the stale directory in the mystery scenario where the directory exists but is not owned by a running process, for example after a careless OOM killer.
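A lock directory could be sketched as follows. The helper name acquire_lock and the demo path are assumptions for illustration, and stale-lock cleanup is left out.

```shell
#!/bin/sh
# acquire_lock DIR  (hypothetical helper): mkdir either creates DIR or fails,
# so only one caller can win.
acquire_lock() {
    mkdir "$1" 2>/dev/null || return 1
    echo "$$" > "$1/pid"        # record the owner for later stale-lock checks
}

# Demo path (an assumption); a real script would use one fixed, well-known path.
LOCKDIR="${TMPDIR:-/tmp}/session_timer_demo.$$.lock"
if acquire_lock "$LOCKDIR"; then
    echo "lock acquired"
fi
if ! acquire_lock "$LOCKDIR"; then
    echo "second attempt refused"
fi
rm -rf "$LOCKDIR"               # release; normally done from an EXIT trap
```

The end time could be stored in a second file inside the directory, next to pid, so the notification can still report the seconds left.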
I think Triplee has a fine answer. Another way to handle it, which can be applied to any running process that may block, is to start the process in the background, save the assigned PID ($!) to a file, and then wait for it (fg needs job control, which is off in non-interactive scripts).
From there you can do the math and get the seconds via ps:
TIME=$(zenity --scale --title="Next Session in (?) minutes")
SLEEP_PID_FILE="/tmp/__session_ui_sleep_pid__"
sleep "${TIME}m" &
echo $! > "${SLEEP_PID_FILE}" # overwrite, so the file holds only the current PID
wait $!
BEEP="/usr/share/sounds/freedesktop/stereo/complete.oga"
paplay $BEEP
notify-send "Next Session" "Press <Ctrl><Shift><s> to run the script again"
Then afterward you can find the current elapsed time with something like:
notify-send "Already running for $(($(date +%s)-$(date -d"$(ps -o lstart= -p$(< "${SLEEP_PID_FILE}"))" +%s))) seconds..."

Bash script: `exit 0` fails to exit

So I have this Bash script:
#!/bin/bash
PID=`ps -u ...`
if [ "$PID" = "" ]; then
echo $(date) Server off: not backing up
exit
else
echo "say Server backup in 10 seconds..." >> fifo
sleep 10
STARTTIME="$(date +%s)"
echo nosave >> fifo
echo savenow >> fifo
tail -n 3 -f server.log | while read line
do
if echo $line | grep -q 'save complete'; then
echo $(date) Backing up...
OF="./backups/backup $(date +%Y-%m-%d\ %H:%M:%S).tar.gz"
tar -czhf "$OF" data
echo autosave >> fifo
echo "$(date) Backup complete, resuming..."
echo "done"
exit 0
echo "done2"
fi
TIMEDIFF="$(($(date +%s)-STARTTIME))"
if ((TIMEDIFF > 70)); then
echo "Save took too long, canceling backup."
exit 1
fi
done
fi
Basically, the server takes input from a fifo and outputs to server.log. The fifo is used to send stop/start commands to the server for autosaves. At the end, once it receives the message from the server that the server has completed a save, it tar's the data directory and starts saves again.
It's at the exit 0 line that I'm having trouble. Everything executes fine, but I get this output:
srv:scripts $ ./backup.sh
Sun Nov 24 22:42:09 EST 2013 Backing up...
Sun Nov 24 22:42:10 EST 2013 Backup complete, resuming...
done
But it hangs there. Notice how "done" echoes but "done2" fails. Something is causing it to hang on exit 0.
ADDENDUM: Just to avoid confusion for people looking at this in the future, it hangs at the exit line and never returns to the command prompt. Not sure if I was clear enough in my original description.
Any thoughts? This is the entire script, there's nothing else going on and I'm calling it direct from bash.
Here's a smaller, self contained example that exhibits the same behavior:
echo foo > file
tail -f file | while read; do exit; done
The problem is that since each part of the pipeline runs in a subshell, exit only exits the while read loop, not the entire script.
It will then hang until tail finds a new line, tries to write it, and discovers that the pipe is broken.
To fix it, you can replace
tail -n 3 -f server.log | while read line
do
...
done
with
while read line
do
...
done < <(tail -n 3 -f server.log)
By redirecting from a process substitution instead, the flow doesn't have to wait for tail to finish like it would in a pipeline, and the loop won't run in a subshell, so exit will actually exit the entire script.
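The subshell behaviour can be seen without tail. The sketch below assumes bash (with its default settings) or dash, where every part of a pipeline runs in a subshell; some other shells run the last part in the current shell.

```shell
#!/bin/sh
# exit inside a piped while loop ends only the subshell running the loop.
loop_status=0
printf 'a\nb\n' | while read -r line; do
    exit 3
done || loop_status=$?
echo "script still running; the loop's subshell exited with $loop_status"
```

The final echo runs because exit 3 only set the exit status of the pipeline, which the script can inspect and act on.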
But it hangs there. Notice how "done" echoes but "done2" fails.
done2 won't be printed at all since exit 0 has already ended your script with return code 0.
I don't know the details of bash subshells inside loops, but normally the appropriate way to exit a loop is to use the "break" command. In some cases that's not enough (you really need to exit the program), but refactoring that program may be the easiest (safest, most portable) way to solve that. It may also improve readability, because people don't expect programs to exit in the middle of a loop.

How can I wait for certain output from a process then continue in Bash?

I'm trying to write a bash script to do some stuff, start a process, wait for that process to say it's ready, and then do more stuff while that process continues to run. The issue I'm running into is finding a way to wait for that process to be ready before continuing, and allowing it to continue to run.
In my specific case I'm trying to set up a PPP connection. I need to wait until it has connected before I run the next command. I would also like to stop the script if PPP fails to connect. pppd prints to stdout.
In pseudo code what I want to do is:
[some stuff]
echo START
[set up the ppp connection]
pppd <options> /dev/ttyUSB0
while 1
if output of pppd contains "Script /etc/ppp/ipv6-up finished (pid ####), status = 0x0"
break
if output of pppd contains "Sending requests timed out"
exit 1
[more stuff, and pppd continues to run]
echo CONTINUING
Any ideas on how to do this?
I had to do something similar waiting for a line in /var/log/syslog to appear. This is what worked for me:
FILE_TO_WATCH=/var/log/syslog
SEARCH_PATTERN='file system mounted'
# Quote the pattern: unquoted, its extra words would be passed to grep as file names
tail -f -n0 "${FILE_TO_WATCH}" | grep -qe "${SEARCH_PATTERN}"
if [ $? -eq 1 ]; then
    echo "Search terminated without finding the pattern"
fi
It pipes all new lines appended to the watched file to grep and instructs grep to exit quietly as soon as the pattern is discovered. The following if statement detects if the 'wait' terminated without finding the pattern.
The quickest solution I came up with was to run pppd with nohup in the background and check the nohup.out file, which receives its stdout. It ended up something like this:
sudo nohup pppd [options] 2> /dev/null &
# check to see if it started correctly
PPP_RESULT="unknown"
while true; do
    if [[ $PPP_RESULT != "unknown" ]]; then
        break
    fi
    sleep 1
    # read in the file containing the std out of the pppd command
    # and look for the lines that tell us what happened
    while read line; do
        if [[ $line == Script\ /etc/ppp/ipv6-up\ finished* ]]; then
            echo "pppd has been successfully started"
            PPP_RESULT="success"
            break
        elif [[ $line == LCP:\ timeout\ sending\ Config-Requests ]]; then
            echo "pppd was unable to connect"
            PPP_RESULT="failed"
            break
        elif [[ $line == *is\ locked\ by\ pid* ]]; then
            echo "pppd is already running and has locked the serial port."
            PPP_RESULT="running"
            break;
        fi
    done < <( sudo cat ./nohup.out )
done
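A shorter variant of the same polling idea, with a timeout, might look like this. It is a sketch under assumptions: wait_for_line is a hypothetical helper, and the demo uses a temporary file in place of nohup.out. Unlike tail -n0, grep here scans the whole file, so lines written before the call also match.

```shell
#!/bin/sh
# wait_for_line FILE PATTERN TIMEOUT  (hypothetical helper)
# Polls FILE once per second until a line matches PATTERN or TIMEOUT seconds pass.
wait_for_line() {
    file=$1
    pattern=$2
    remaining=$3
    until grep -q -e "$pattern" "$file" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1            # timed out without seeing the pattern
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Demo with a temporary file standing in for nohup.out.
log=$(mktemp)
( sleep 1; echo "Script /etc/ppp/ipv6-up finished" >> "$log" ) &
if wait_for_line "$log" 'ipv6-up finished' 5; then
    echo "pattern seen"
fi
if ! wait_for_line "$log" 'is locked by pid' 1; then
    echo "gave up waiting"
fi
wait
rm -f "$log"
```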
There's a tool called "Expect" that does almost exactly what you want. More info: http://en.wikipedia.org/wiki/Expect
You might also take a look at the man pages for "chat", which is a pppd feature that does some of the stuff that expect can do.
If you go with expect, as @sblom advised, please check autoexpect.
You run what you need via the autoexpect command and it will create an expect script.
Check the man page for examples.
Sorry for the late response, but a simpler way would be to use wait.
wait is a BASH built-in command which waits for a process to finish
Following is the excerpt from the MAN page.
wait [n ...]
Wait for each specified process and return its termination status. Each n may be a process ID or a job specification; if a job spec is given, all processes in that job's pipeline are waited for. If n is not given, all currently active child processes are waited for, and the return status is zero. If n specifies a non-existent process or job, the return status is 127. Otherwise, the return status is the exit status of the last process or job waited for.
For further reference on usage:
Refer to wiki page
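A minimal sketch of wait with a background job. The job here is a placeholder command. Note that wait returns when the process exits, not when it prints a particular line, so it fits only if the script can wait for the process to finish.

```shell
#!/bin/sh
# Start a placeholder job in the background, do other work, then collect its status.
sh -c 'sleep 1; exit 7' &
job_pid=$!
echo "doing other work while job $job_pid runs"
job_status=0
wait "$job_pid" || job_status=$?
echo "job exited with status $job_status"
```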

killall httpd for sleep process

This shell script shows the issue.
After I execute the .sh file it hangs and nothing happens. Any clue where my mistake is?
It should kill httpd if there are more than 10 sleeping processes and start httpd again with zero sleeping processes.
#!/bin/bash
#this means loop forever
while [ 1 ];
do HTTP=`ps auwxf | grep httpd | grep -v grep | wc -l`;
#the above line counts the number of httpd processes found running
#and the following line says if there were less then 10 found running
if [ $[HTTP] -lt 10 ];
then killall -9 httpd;
#inside the if now, so there are less then 10, kill them all and wait 1 second
sleep 1;
#start apache
/etc/init.d/httpd start;
fi;
#all done, sleep for ten seconds before we loop again
sleep 10;done
Why would you kill the child processes? If you do that, you are killing all ongoing sessions. Would it not be easier to set up your web server configuration so that it matches your needs?
As Dennis has mentioned already your script should look like:
#!/bin/bash
BINNAME=httpd # Name of the process
TIMEOUT=10 # Seconds to wait until next loop
MAXPROC=10 # Maximum amount of procs for given daemon
while true
do
    # Count number of procs
    HTTP=`pgrep $BINNAME | wc -l`
    # Check if more than $MAXPROC are running
    if [ "$HTTP" -gt "$MAXPROC" ]
    then
        # Kill the procs
        killall -9 $BINNAME
        sleep 1
        # start http again
        /etc/init.d/httpd start
    fi
    sleep $TIMEOUT
done
Formatting makes code more readable ;)
I can't see anything wrong with it.
This line:
if [ $[HTTP] -lt 10 ];
should probably be:
if [ ${HTTP} -lt 10 ];
even though yours works.
If you add this as the last line, you should never see its output since you're in an infinite while loop.
echo "At end"
If you do, then that's really weird.
Make your first line look like this and it will display the script line-by-line as it executes to help you see where it's going wrong:
#!/bin/bash -x
Watch out for killall if you are trying to write portable scripts. It doesn't mean the same thing on every system: while on Linux it means "kill processes named like this", on some systems it means "kill every process I have permission to kill".
If you run the latter version as root, one of the things you kill is init. Oops.
