Compound conditional in a bash while loop

I'm modifying an existing bash script and having some trouble getting the while loop to behave correctly. This is the original code:
while ! /usr/bin/executable1
do
# executable1 returned an error. So sleep for some time and try again
sleep 2
done
I would like to change this to the following
while ! /usr/bin/executable1 && ! $(myfunc)
do
# executable1 and myfunc both were unsuccessful. So sleep for some time
sleep 2
done
executable1 returns 0 on success and 1 on failure. I understand that "true" in bash evaluates to 0, so that's why the original script would keep looping until executable1 returned success.
Accordingly, myfunc is coded like this:
myfunc ()
{
# Check if the file exists. If it exists, return 0; if not, return 1
if [ -e someFile ]; then
return 0
fi
return 1
}
I notice that my new while loop does not seem to call executable1. It always calls myfunc() and then exits the loop immediately. What am I doing wrong?
I tried various ways of coding the while loop (with (( )), [ ], [[ ]], etc.), but nothing seems to fix it.

You don't need $(...) to call a function; you only need it to capture the function's standard output. You simply want
while ! /usr/bin/executable1 && ! myfunc
do
sleep 2
done
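To illustrate the difference (a minimal sketch, assuming myfunc from the question is already defined):
myfunc                # runs the function in the current shell; $? now holds its return status
echo "status: $?"
output=$(myfunc)      # runs the function in a subshell and captures its stdout (empty here); only needed when you actually want the output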
Note that myfunc can also be more simply written
myfunc () {
[ -e someFile ]
}
or even (in bash)
myfunc () [[ -e someFile ]]
Either way, it's almost not worth defining myfunc separately; just use
while ! /usr/bin/executable1 && ! [[ -e someFile ]]
do
sleep 2
done
It might also be simpler to use an until loop:
until /usr/bin/executable1 || [[ -e someFile ]]; do
sleep 2
done

Related

shell: failed to save error stream code to file

I am trying to detect whenever the following script (random_fail.sh) fails (which happens rarely) by running it inside a while loop in a second script (catch_error.sh):
#!/usr/bin/env bash
# random_fail.sh
n=$(( RANDOM % 100 ))
if [[ n -eq 42 ]]; then
echo "Something went wrong"
>&2 echo "The error was using magic numbers"
exit 1
fi
echo "Everything went according to plan"
#!/usr/bin/env bash
# catch_error.sh
count=0 # The number of times before failing
error=0 # assuming everything initially ran fine
while [ "$error" != 1 ]; do
# running till non-zero exit
# writing the error code from the random_fail script into /tmp/error
bash ./random_fail.sh 1>/tmp/msg 2>/tmp/error
# reading from the file, assuming a 0 is written there most of the time
error="$(cat /tmp/error)"
echo "$error"
# updating the count
count=$((count + 1))
done
echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error code: $(cat /tmp/error)"
echo "Ran ${count} times, before failing"
I was expecting that the catch_error.sh will read from /tmp/error and come out of the loop once a particular run of random_fail.sh exits with 1.
Instead, the catch script seems to be running forever. I think this is because the error code is not being redirected to the /tmp/error file at all.
Please help.
You aren't catching the error code in the proper/usual manner. Also, there is no need to prefix the execution with the "bash" command when the script already contains a shebang. Lastly, I'm curious why you don't simply use #!/bin/bash instead of #!/usr/bin/env bash.
Your second script should be modified to look like this:
#!/usr/bin/env bash
# catch_error.sh
count=0 # The number of times before failing
error=0 # assuming everything initially ran fine
while [ "$error" != 1 ]; do
# running till non-zero exit
# saving random_fail.sh's stdout and stderr to files; its exit code is captured just below
./random_fail.sh 1>/tmp/msg 2>/tmp/error
error=$?
echo "$error"
# updating the count
count=$((count + 1))
done
echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error code: ${error}"
echo "Ran ${count} times, before failing"
[ "$error" != 1 ] is true if random_fail.sh prints a lone digit 1 to stderr. As long as this doesn't happen, your script will loop. You could instead test whether there has been written anything to stderr. There are several possibilities to achieve this:
printf '' >/tmp/error
while [[ ! -s /tmp/error ]]
or
error=
while (( ${#error} == 0 ))
or
error=
while [[ -z $error ]]
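For example, the first of these folded into the question's loop might look like this (a sketch that keeps the question's file names):
count=0
printf '' >/tmp/error                 # start with an empty error file
while [[ ! -s /tmp/error ]]; do       # keep looping while /tmp/error is empty
./random_fail.sh 1>/tmp/msg 2>/tmp/error
count=$((count + 1))
done
echo "Ran ${count} times, before failing"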
/tmp/error will always be either empty or will contain the line "The error was using magic numbers". It will never contain 0 or 1. If you want to know the exit value of the script, just check it directly:
if ./random_fail.sh 1>/tmp/msg 2>/tmp/error; then error=0; else error=1; fi
Or, you can do:
./random_fail.sh 1>/tmp/msg 2>/tmp/error
error=$?
But don't do either of those. Just do:
while ./random_fail.sh; do ...; done
As long as random_fail.sh (please read https://www.talisman.org/~erlkonig/documents/commandname-extensions-considered-harmful/ and stop naming your scripts with a .sh suffix) returns 0, the loop body will be entered. When it returns non-zero, the loop terminates.
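Putting that together, a minimal version of the catch script along those lines could look like this (a sketch that keeps the question's file names; the redirections are only there so the failing run's output is still available afterwards):
#!/usr/bin/env bash
# catch_error.sh
count=0
while ./random_fail.sh 1>/tmp/msg 2>/tmp/error; do
count=$((count + 1))                  # counts the successful runs
done
echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error message: $(cat /tmp/error)"
echo "Ran $((count + 1)) times, before failing"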

equivalent of exit on command line

I have a bash script that runs a series of python scripts. It always runs all of them, but exits with a failure code if any script exited with a failure code. At least that's what I hope it does. Here it is ...
#!/bin/bash
res=0
for f in scripts/*.py
do
python "$f";
res=$(( $res | $? ))
done
exit $res
I'd like to run this as a bash command in the terminal, but I can't work out how to replace exit so that the command fails like a failed script rather than exiting the terminal. How do I do that?
Replace your last line exit $res with
$(exit ${res})
This exits the spawned subshell with the exit value of ${res}, and because it is the last statement, this is also the exit value of your script.
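For example, the whole loop pasted as a single command line (a sketch assuming the scripts/*.py layout from the question):
res=0; for f in scripts/*.py; do python "$f"; res=$(( res | $? )); done; $(exit ${res})
echo $?    # shows the combined status, and the terminal stays open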
Bash doesn't have a concept of anonymous functions (as in, e.g., Go) that you can define inline and get the return value from; you need to do it explicitly. Wrap the whole code in a function, say f():
f() {
local res=0
for f in scripts/*.py
do
python "$f";
res=$(( $res | $? ))
done
return $res
}
and use the exit code on the command line:
if ! f; then
printf '%s\n' "one more python scripts failed"
fi
If it's true that the value of the error code doesn't matter, then I have another solution:
#!/bin/bash
total=0
errcount=0
for f in scripts/*.py
do
let total++
python "$f" || let errcount++
done
if ((errcount))
then
printf '%u of %u scripts failed\n' $errcount $total >&2
exit 1
fi
exit 0
#!/bin/bash
for f in scripts/*.py
do
python "$f" && echo "1" >> stack || echo "0" >> stack
done
grep -q 0 stack && rm -v ./stack && exit 1   # a 0 in the stack file means some script failed
I am rather stoned at the moment, so I apologise if I am misinterpreting, but I believe that this will do what you need. Every time a python script returns an exit code of 0 (which means it worked), a 1 is echoed into a stack file; otherwise a 0 is echoed. At the end of the loop, the stack is checked for the presence of a 0, and if one is found, the script removes the stack file and exits with an error code of 1, which is the code for general errors.

How to return from sourced bash script automatically on any error?

I have a bash script which is only meant to be used when sourced.
I want to return from it automatically on any error, similar to what set -e does.
However setting set -e doesn't work for me because it will also exit the users shell.
Right now I'm handling the return manually, like this: command || return 1, for each command.
You can also use command || true or command || return.
If your requirement is something different, please update your question to be more precise.
You can use trap. E.g.:
# foo.sh
function func() {
trap 'if [ $? -ne 0 ]; then echo "Trapped!"; return ; fi' DEBUG
echo 'foo'
find -name "foo" . 2> /dev/null
echo 'bar'
}
func
Two notes. First, the trap needs to be inside the function as shown. It won't work if it's just inside the script.
Second, there is a significant limitation. Even if you give the return in the trap a value (e.g., return 1), and although func exits after the bad find command, $? is still zero, no matter what. I'm not sure if there's a way around that, so if it's important to preserve the exit value of the failed command, this may not work.
E.g., if you had:
func
func_return=$?
echo "return value is: $func_return"
func_return will always be zero. I've played around with trying to get the exit value of the failed command to pass out of the function trap and into the function exit value, but have not found a way to do it.
If you need to preserve the return value, you could update a global variable inside the debug trap.
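A rough sketch of that idea (the variable name last_status is made up here; everything else mirrors the example above):
function func() {
trap 'last_status=$?; if [ $last_status -ne 0 ]; then echo "Trapped!"; return; fi' DEBUG
echo 'foo'
find -name "foo" . 2> /dev/null
echo 'bar'
}
func
trap - DEBUG                          # clear the trap once func has returned
echo "captured status: $last_status"  # non-zero, even though $? from func is 0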
If I understand correctly, you can set -e locally in each function.
cat sourced
f1 () {
local -   # make the shell options local to this function (bash 4.4+)
set -e
[ "$1" -eq "$1" ] 2> /dev/null && echo "$1"
}
cat script.sh
. sourced
param='bad'
ret=$(f1 "$param")
[ $? -eq 0 ] && echo "result = $ret" || \
echo "error in sourced file with param $param"
param=3
ret=$(f1 "$param")
[ $? -eq 0 ] && echo "result = $ret" || \
echo "error in sourced file with param $param"

why does pgrep fail in this process monitor?

I have a monitor shell script that does effectively monitor and keep a process running. But it often fails in the sense that it starts a second, third or more instance of the process. I have also seen the pgrep command (pgrep -n -f wx_nanoserver) return the wrong pid at the command line...
Here's my script:
#!/bin/bash
check_process() {
# echo "$ts: checking $1"
[ "$1" = "" ] && return 0
[ `pgrep -n -f $1` ] && return 1 || return 0
}
while [ 1 ]; do
# timestamp
ts=`date +%T`
NOW=`date +"%Y%m%d-%H%M%S"`
# echo "$ts: begin checking..."
check_process "wx_nanoserver"
[ $? -eq 0 ] && echo "$ts: not running, restarting..." && `php /var/www/wx_nanoserver.php > /var/www/logs/wx_output_$NOW.log 2> /var/www/logs/wx_error_$NOW.log &`
sleep 5
done
try:
pgrep -n -f "$1" && return 1 || return 0
If you use [ ], you are checking pgrep's stdout output rather than its exit status (and the script never compares that output with anything, such as an empty string); without [ ], the function uses pgrep's exit code directly.
Two weird things about your script:
[ `pgrep -n -f $1` ] && return 1 || return 0
works through side effects. The `pgrep -n -f $1` part in backticks evaluates to either the pid of the process if found, or nothing if no process is found. The single [ notation is a synonym for the test builtin (or command on earlier systems), which happens to return true if its argument is a nonempty string and false if it is given no argument. So when a pid is found, the test becomes something like [ 1234 ], which evaluates to true, and [ ] otherwise, which evaluates to false. That is indeed what you want, but it would be cleaner to write:
pgrep -n -f "$1" &>/dev/null && return 1 || return 0
Another thing is
`php /var/www/wx_nanoserver.php > /var/www/logs/wx_output_$NOW.log 2> /var/www/logs/wx_error_$NOW.log &`
where you use command substitution for no apparent reason. You're asking bash to evaluate the output of your command rather than simply running it. As its output is redirected, it always evaluates to an empty string, so it has no further effect. A side effect is that the command is run in a subshell, which is a good thing if you want to daemonize it. Though it would be cleaner to write:
( php /var/www/wx_nanoserver.php > /var/www/logs/wx_output_$NOW.log 2> /var/www/logs/wx_error_$NOW.log & )
I'm not sure, though, what the actual problem might be. It seems to work that way anyhow.
Final note: the backtick `...` notation has been deprecated in favour of the $(...) notation.

writing from a function in a Bash script leaking file descriptors

We have a shell script that is called by cron and runs as root.
This script outputs logging and debug info, and has been failing at one certain point. This point varies based on how much output the script creates (it fails sooner if we enable more debugging output, for example).
However, if the script is called directly, as a user, then it works without a problem.
We have since created a simplified test case which demonstrates the problem.
The script is:
#!/bin/bash
function log_so () {
local msg="$1"
if [ -z "${LOG_FILE}" ] ; then warn_so "It's pointless use log_so() if LOG_FILE variable is undefined!" ; return 1 ; fi
echo -e "${msg}"
echo -e "${msg}" >> ${LOG_FILE}
(
/bin/true
)
}
LOG_FILE="/usr/local/bin/log_bla"
linenum=1
while [[ $linenum -lt 2000 ]] ; do
log_so "short text: $linenum"
let linenum++
done
The highest this has reached is 244 before dying (when called via cron).
Some other searches recommended using a no-op subshell from the function and also calling /bin/true, but not only did this not work, the subshell option is also not feasible in the main script.
We have also tried changing the file descriptor limit for root, but that did not help, and have tried using both #!/bin/sh and #!/bin/bash for the script.
We are using bash 4.1.5(1)-release on Ubuntu 10.04 LTS.
Any ideas or recommendations for a workaround would be appreciated.
What about opening a fd by hand and cleaning it up afterwards? I don't have a bash 4.1 to test with, but it might help.
LOG_FILE="/usr/local/bin/log_bla"
exec 9<> "$LOG_FILE"
function log_so () {
local msg="$1"
if [ -z "${LOG_FILE}" ] ; then warn_so "It's pointless use log_so() if LOG_FILE variable is undefined!" ; return 1 ; fi
echo -e "${msg}"
echo -e "${msg}" >&9
return 0
}
linenum=1
while [[ $linenum -lt 2000 ]] ; do
log_so "short text: $linenum"
let linenum++
done
exec 9>&-
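As a small, hedged variation: if the file is only ever appended to as a log (an assumption about intent here), opening the descriptor for appending avoids the read-write <> form writing from the start of an existing file:
exec 9>> "$LOG_FILE"    # append-only; the rest of the script stays the same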
