I have a function like this:
function generic_build_a_module(){
move_to_the_right_directory
echo 'copying the common packages'; ./build/build_sdist.sh;
echo 'installing the api common package'; ./build/cache_deps.sh;
}
I want to exit the function if ./build/build_sdist.sh doesn't finish successfully.
Here is the content of ./build/build_sdist.sh:
... multiple operations....
echo "installing all pip dependencies from $REQUIREMENTS_FILE_PATH and placing their tar.gz into $PACKAGES_DIR"
pip install --no-use-wheel -d $PACKAGES_DIR -f $PACKAGES_DIR -r $REQUIREMENTS_FILE_PATH $PACKAGES_DIR/*
In other words, how does the main function generic_build_a_module "know" whether ./build/build_sdist.sh finished successfully?
You can check the exit status of a command by surrounding it with an if. ! inverts the exit status. Use return 1 to exit your function with exit status 1.
generic_build_a_module() {
move_to_the_right_directory
echo 'copying the common packages'
if ! ./build/build_sdist.sh; then
echo "Aborted due to error while executing build."
return 1
fi
echo 'installing the api common package'
./build/cache_deps.sh;
}
If you don't want to print an error message, the same program can be written shorter using ||.
generic_build_a_module() {
move_to_the_right_directory
echo 'copying the common packages'
./build/build_sdist.sh || return 1
echo 'installing the api common package'
./build/cache_deps.sh;
}
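If the caller needs the original exit status rather than a fixed 1, a bare return passes on the status of the command that just failed. A minimal sketch, where build_sdist and cache_deps are hypothetical stand-ins for the two build scripts (they are not part of the question):

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for ./build/build_sdist.sh and ./build/cache_deps.sh.
build_sdist() { return 3; }              # pretend the build fails with status 3
cache_deps()  { echo "cache_deps ran"; }

generic_build_a_module() {
    # A bare "return" hands back the status of the last command run,
    # which here is the failing build_sdist.
    build_sdist || return
    cache_deps
}

status=0
generic_build_a_module || status=$?
echo "generic_build_a_module returned $status"
```

The caller can then branch on the status, for example to tell a build failure apart from other errors.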
Alternatively, you could use set -e. This will exit your script immediately when some command exits with a non-zero status.
You have to do the following:
Run both scripts in the background and store their process IDs in two variables.
Keep checking at an interval, say every 1 to 2 seconds, whether the scripts have completed.
Kill any process that has not completed after a specific time, say 30 seconds.
Example:
sdist=$(ps -fu $USER|grep -v "grep"|grep "build_sdist.sh"| awk '{print $2}')
OR
sdist=$(ps -fu $USER|grep [b]uild_sdist.sh| awk '{print $2}')
deps=$(ps -fu $USER|grep -v "grep"|grep "cache_deps.sh"| awk '{print $2}')
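Now use a while loop to check the status at a regular interval, or simply check once after 30 seconds as below. kill -0 only tests whether the process still exists; the plain kill then sends SIGTERM.
sleep 30
if [ -n "$sdist" ] && kill -0 $sdist 2>/dev/null; then
    kill $sdist
fi
if [ -n "$deps" ] && kill -0 $deps 2>/dev/null; then
    kill $deps
fi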
You can check the exit code status of the last executed command by checking the $? variable. Exit code 0 is a typical indication that the command completed successfully.
Exit codes can be set by using exit followed by the code number within a script.
Here's a previous question regarding the use of $? with more detail, but to simply check this value try:
echo "test";echo $?
# Example
echo 'copying the common packages'; ./build/build_sdist.sh;
if [ $? -ne 0 ]; then
echo "The last command exited with a non-zero code"
fi
[ $? -ne 0 ] checks whether the exit code of the last executed command is not equal to 0. This catches every kind of failure, because exit codes are always in the range 0 to 255; a script that calls exit -1 in bash, for example, is reported as 255.
The caveat of the above approach is that the exit code of ./build/build_sdist.sh is the exit code of the last command it ran (here pip install), not of the ... multiple operations.... that you mentioned, so an error from a command executed before pip install may be missed.
Depending on the situation, you could use set -e within the called script, which instructs the shell to exit the script the first time a command exits with a non-zero status.
Another option would be to check the exit code of each command within ./build/build_sdist.sh, as in the example above. This gives you the most control over when and how the script finishes, and allows the script to set its own exit code.
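One detail matters when checking $? by hand: every command, including echo, overwrites it, so the value has to be saved right after the command of interest. A small sketch, where failing_step is a hypothetical stand-in for one of the operations inside the build script:

```shell
#!/usr/bin/env bash
failing_step() { return 2; }   # hypothetical stand-in that fails with status 2

failing_step
status=$?                      # save the status immediately
echo "logging something"       # this echo succeeds and resets $? to 0
after_echo=$?

echo "saved status: $status, \$? after echo: $after_echo"
```

Checking $? only after the echo would report success even though failing_step failed.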
I have a Bash shell script that invokes a number of commands.
I would like to have the shell script automatically exit with a return value of 1 if any of the commands return a non-zero value.
Is this possible without explicitly checking the result of each command?
For example,
dosomething1
if [[ $? -ne 0 ]]; then
exit 1
fi
dosomething2
if [[ $? -ne 0 ]]; then
exit 1
fi
Add this to the beginning of the script:
set -e
This will cause the shell to exit immediately if a simple command exits with a nonzero exit value. A simple command is any command not part of an if, while, or until test, or part of an && or || list.
See the bash manual on the "set" internal command for more details.
It's really annoying to have a script stubbornly continue when something fails in the middle and breaks assumptions for the rest of the script. I personally start almost all portable shell scripts with set -e.
If I'm working with bash specifically, I'll start with
set -Eeuo pipefail
This covers more error handling in a similar fashion. I consider these as sane defaults for new bash programs. Refer to the bash manual for more information on what these options do.
To add to the accepted answer:
Bear in mind that set -e is sometimes not enough, especially if you have pipes.
For example, suppose you have this script
#!/bin/bash
set -e
./configure > configure.log
make
... which works as expected: an error in configure aborts the execution.
Tomorrow you make a seemingly trivial change:
#!/bin/bash
set -e
./configure | tee configure.log
make
... and now it does not work. This is explained here, and a workaround (Bash only) is provided:
#!/bin/bash
set -e
set -o pipefail
./configure | tee configure.log
make
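The difference can be seen in isolation with false | true, which stands in for a failing ./configure piped into tee. This is only a sketch; each variant runs in its own bash process so the two settings do not affect each other:

```shell
#!/usr/bin/env bash
# Without pipefail the pipeline's status is that of its last command (true).
without=0
bash -c 'false | true' || without=$?

# With pipefail a failure anywhere in the pipeline makes the whole pipeline fail.
with=0
bash -c 'set -o pipefail; false | true' || with=$?

echo "without pipefail: $without, with pipefail: $with"
```

Since set -e only looks at the status of the pipeline as a whole, it can only react to the failing first command in the second variant.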
The if statements in your example are unnecessary. Just do it like this:
dosomething1 || exit 1
If you take Ville Laurikari's advice and use set -e then for some commands you may need to use this:
dosomething || true
The || true makes the command list return a true value even if the command fails, so the -e option will not kill the script.
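A short sketch of the effect, using false as a stand-in for a command that is allowed to fail. Both variants run in a separate bash process so that set -e cannot end the demo itself:

```shell
#!/usr/bin/env bash
# Guarded: the failure is absorbed by "|| true", so the script carries on.
guarded=$(bash -c 'set -e; false || true; echo "reached"')

# Unguarded: set -e ends the inner script at the failing command.
unguarded_status=0
unguarded=$(bash -c 'set -e; false; echo "not reached"') || unguarded_status=$?

echo "guarded output: '$guarded'"
echo "unguarded output: '$unguarded' (exit status $unguarded_status)"
```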
If you have cleanup you need to do on exit, you can also use 'trap' with the pseudo-signal ERR. This works the same way as trapping INT or any other signal; bash throws ERR if any command exits with a nonzero value:
# Create the trap with
# trap COMMAND SIGNAME [SIGNAME2 SIGNAME3...]
trap "rm -f /tmp/$MYTMPFILE; exit 1" ERR INT TERM
command1
command2
command3
# Partially turn off the trap.
trap - ERR
# Now a control-C will still cause cleanup, but
# a nonzero exit code won't:
ps aux | grep blahblahblah
Or, especially if you're using "set -e", you could trap EXIT; your trap will then be executed when the script exits for any reason, including a normal end, interrupts, an exit caused by the -e option, etc.
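As a rough sketch of the EXIT variant: the inner script below creates a temporary file, registers a cleanup trap, and then fails under set -e. The variable names are made up for the illustration, and the inner script runs in its own bash process so its failure can be observed from outside:

```shell
#!/usr/bin/env bash
inner_script='
set -e
tmpfile=$(mktemp)
trap "rm -f \"$tmpfile\"" EXIT   # $tmpfile is expanded now; the trap runs on any exit
echo "$tmpfile"
false                             # fails, so set -e ends the script here
echo "not reached"
'

inner_status=0
tmpfile_path=$(bash -c "$inner_script") || inner_status=$?

if [ -e "$tmpfile_path" ]; then
    echo "temp file still exists"
else
    echo "temp file was cleaned up (inner exit status $inner_status)"
fi
```

The trap does not change the exit status here: the inner script still ends with the status of the failed command.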
The $? variable is rarely needed. The pseudo-idiom command; if [ $? -eq 0 ]; then X; fi should always be written as if command; then X; fi.
The cases where $? is required are when it needs to be checked against multiple values:
command
case $? in
(0) X;;
(1) Y;;
(2) Z;;
esac
or when $? needs to be reused or otherwise manipulated:
if command; then
echo "command successful" >&2
else
ret=$?
echo "command failed with exit code $ret" >&2
exit $ret
fi
Run it with -e or set -e at the top.
Also look at set -u.
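For illustration, here is roughly what set -u does, assuming undefined_variable_for_demo is not set anywhere in your environment (the name is invented for this sketch). The commands run in a separate bash process because the error ends a non-interactive shell:

```shell
#!/usr/bin/env bash
# With -u, expanding an unset variable is an error and the script stops.
u_status=0
bash -uc 'echo "value: $undefined_variable_for_demo"; echo "not reached"' 2>/dev/null || u_status=$?

# A default via ${var:-fallback} keeps -u satisfied when a variable may be unset.
with_default=$(bash -uc 'echo "value: ${undefined_variable_for_demo:-fallback}"')

echo "exit status with the unset variable: $u_status"
echo "$with_default"
```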
On error, the below script will print a RED error message and exit.
Put this at the top of your bash script:
# BASH error handling:
# exit on command failure
set -e
# keep track of the last executed command
trap 'LAST_COMMAND=$CURRENT_COMMAND; CURRENT_COMMAND=$BASH_COMMAND' DEBUG
# on error: print the failed command
trap 'ERROR_CODE=$?; FAILED_COMMAND=$LAST_COMMAND; tput setaf 1; echo "ERROR: command \"$FAILED_COMMAND\" failed with exit code $ERROR_CODE"; tput sgr0;' ERR INT TERM
An expression like
dosomething1 && dosomething2 && dosomething3
will stop processing when one of the commands returns with a non-zero value. For example, the following command will never print "done":
cat nosuchfile && echo "done"
echo $?
1
#!/bin/bash -e
should suffice.
Here is one more example for reference, since there was a follow-up question on Mark Edgar's answer; it touches on the overall topic:
[[ `cmd` ]] && echo success_else_silence
Note that this tests whether cmd printed any output, not its exit status, so it is not quite the same as the cmd || exit errcode form someone showed.
For example, I want to make sure a partition is unmounted if mounted:
[[ `mount | grep /dev/sda1` ]] && umount /dev/sda1
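To make the distinction concrete, the sketch below compares a test on command output with a test on exit status. The mounts_sample text is made up so that the example does not depend on what is actually mounted:

```shell
#!/usr/bin/env bash
# "true" succeeds but prints nothing, so a test on its output is false.
if [[ $(true) ]]; then output_test="non-empty"; else output_test="empty"; fi

# grep -q prints nothing and reports a match through its exit status instead.
mounts_sample=$'/dev/sda1 on /mnt type ext4\n/dev/sdb1 on /data type ext4'   # made-up sample
if grep -q '^/dev/sda1 ' <<<"$mounts_sample"; then
    status_test="mounted"
else
    status_test="not mounted"
fi

echo "output test: $output_test, status test: $status_test"
```

With real data the second form would read something like mount | grep -q /dev/sda1 && umount /dev/sda1.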
I need to make sure that all commands in my script finished successfully (returned status 0). That's why my Slurm script includes the following lines:
set -e
set -x
Now I would like the exit status of the whole script to be written in the logfile that's automatically created by slurm. I have tried echo $SLURM_JOB_EXIT_CODE (with no success) or echo $? (which I am not sure is what I need) as a last line of my script.
What's the proper way to do this? I need to differentiate between "failed" and "completed" jobs, preferably by checking logfiles only.
Catching the exit code of the script within the script is impossible so you should either
wrap your script in another script that would take proper action based on its return code, or
get the return code from Slurm's accounting with the sacct command.
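The wrapper option could look roughly like this. run_and_log is a hypothetical helper, and the wording of the status line is arbitrary; the idea is only that the last line written to stdout (which Slurm captures in the job's output file) records the outcome:

```shell
#!/usr/bin/env bash
# run_and_log runs the given command, prints a final status line,
# and returns the command's own exit status.
run_and_log() {
    local status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "JOB COMPLETED (exit status 0)"
    else
        echo "JOB FAILED (exit status $status)"
    fi
    return "$status"
}

# In a batch script the call would be something like: run_and_log ./real_job.sh
# Two stand-in commands show both outcomes here.
run_and_log true
run_and_log bash -c 'exit 7' || true
```

A grep for "JOB FAILED" over the log files is then enough to tell the two cases apart.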
I know this is an old question but here is my way of appending the final job status to the Slurm output.
res=$(sbatch job.sh)
echo $res
sleep 10s
ST="PENDING"
while [[ "$ST" != "COMPLETED" && "$ST" != "FAILED" ]] ; do
ST=$(sacct -j ${res##* } -o State | awk 'FNR == 3 {print $1}')
sleep 10s
done
echo "$ST" >> job.out # assuming stdout writes to job.out
I am a noob in shell-scripting. I want to print a message and exit my script if a command fails. I've tried:
my_command && (echo 'my_command failed; exit)
but it does not work. It keeps executing the instructions following this line in the script. I'm using Ubuntu and bash.
Try:
my_command || { echo 'my_command failed' ; exit 1; }
Four changes:
Change && to ||
Use { } in place of ( )
Introduce ; after exit and
spaces after { and before }
Since you want to print the message and exit only when the command fails (exits with a non-zero value), you need || rather than &&.
cmd1 && cmd2
will run cmd2 when cmd1 succeeds (exit value 0), whereas
cmd1 || cmd2
will run cmd2 when cmd1 fails (non-zero exit value).
Using ( ) makes the commands inside run in a sub-shell, and calling exit from there exits only the sub-shell, not your original shell; hence execution continues in your original shell.
To overcome this, use { }.
The last two changes are required by bash.
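The sub-shell behaviour is easy to check side by side. In this sketch, false stands in for my_command, and each variant runs in its own bash process so the exit can be observed from outside:

```shell
#!/usr/bin/env bash
# ( ): exit only leaves the sub-shell, so the next command still runs.
paren_status=0
paren_out=$(bash -c 'false || (echo "failed"; exit 1); echo "still running"') || paren_status=$?

# { }: exit leaves the script itself, so nothing after it runs.
brace_status=0
brace_out=$(bash -c 'false || { echo "failed"; exit 1; }; echo "still running"') || brace_status=$?

printf '( ) exit status %s, output:\n%s\n' "$paren_status" "$paren_out"
printf '{ } exit status %s, output:\n%s\n' "$brace_status" "$brace_out"
```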
The other answers have covered the direct question well, but you may also be interested in using set -e. With that, any command that fails (outside of specific contexts like if tests) will cause the script to abort. For certain scripts, it's very useful.
If you want that behavior for all commands in your script, just add
set -e
set -o pipefail
at the beginning of the script. This pair of options tells the bash interpreter to exit whenever a command returns with a non-zero exit code. (For more details about why pipefail is needed, see http://petereisentraut.blogspot.com/2010/11/pipefail.html)
This does not allow you to print an exit message, though.
Note also, each command's exit status is stored in the shell variable $?, which you can check immediately after running the command. A non-zero status indicates failure:
my_command
if [ $? -eq 0 ]
then
echo "it worked"
else
echo "it failed"
fi
I've hacked up the following idiom:
echo "Generating from IDL..."
idlj -fclient -td java/src echo.idl
if [ $? -ne 0 ]; then { echo "Failed, aborting." ; exit 1; } fi
echo "Compiling classes..."
javac *java
if [ $? -ne 0 ]; then { echo "Failed, aborting." ; exit 1; } fi
echo "Done."
Precede each command with an informative echo, and follow each command with that same
if [ $? -ne 0 ];... line. (Of course, you can edit that error message if you want to.)
Provided my_command is conventionally designed, i.e. returns 0 on success, then && is exactly the opposite of what you want. You want ||.
Also note that ( does not seem right to me in bash, but I cannot try from where I am. Tell me.
my_command || {
echo 'my_command failed' ;
exit 1;
}
You can also use, if you want to preserve exit error status, and have a readable file with one command per line:
my_command1 || exit
my_command2 || exit
This, however, will not print any additional error message. But in some cases, the error will be printed by the failed command anyway.
The trap shell builtin allows catching signals, and other useful conditions, including failed command execution (i.e., a non-zero return status). So if you don't want to explicitly test return status of every single command you can say trap "your shell code" ERR and the shell code will be executed any time a command returns a non-zero status. For example:
trap "echo script failed; exit 1" ERR
Note that as with other cases of catching failed commands, pipelines need special treatment; the above won't catch false | true.
Using exit directly may be tricky, as the script may be sourced from other places (e.g. from a terminal). I prefer using a subshell with set -e instead (plus, errors should go to stderr, not stdout):
set -e
ERRCODE=0
my_command || ERRCODE=$?
test $ERRCODE == 0 ||
(>&2 echo "My command failed ($ERRCODE)"; exit $ERRCODE)
Greetings all. I'm setting up a cron job to execute a bash script, and I'm worried that the next one may start before the previous one ends. A little googling reveals that a popular way to address this is the flock command, used in the following manner:
flock -n lockfile myscript.sh
if [ $? -eq 1 ]; then
echo "Previous script is still running! Can't execute!"
fi
This works great. However, what do I do if I want to check the exit code of myscript.sh? Whatever exit code it returns will be overwritten by flock's, so I have no way of knowing if it executed successfully or not.
It looks like you can use the alternate form of flock, flock <fd>, where <fd> is a file descriptor. If you put this into a subshell, and redirect that file descriptor to your lock file, then flock will wait until it can write to that file (or error out if it can't open it immediately and you've passed -n). You can then do everything in your subshell, including testing the return value of scripts you run:
(
if flock -n 200
then
myscript.sh
echo $?
fi
) 200>lockfile
According to the flock man page, flock has a -E or --conflict-exit-code flag you can use to set what the exit code of flock should be in the case a conflict occurs:
-E, --conflict-exit-code number
The exit status used when the -n option is in use, and the conflicting lock exists, or the -w option is in use, and the timeout is reached. The default value is 1. The number has to be in the range of 0 to 255.
The man page also states:
EXIT STATUS
The command uses sysexits.h exit status values for everything, except when using either of the options -n or -w which report a failure to acquire the lock with a exit status given by the -E option, or 1 by default. The exit status given by -E has to be in the range of 0 to 255.
When using the command variant, and executing the child worked, then the exit status is that of the child command.
So, in the case of the -n or -w flags while using the "command" variant, you can see both exit statuses.
Example:
$ flock --exclusive /tmp/flock.lock bash -c 'exit 42'; echo $?
42
$ flock --exclusive /tmp/flock.lock flock --exclusive --nonblock --conflict-exit-code 100 /tmp/flock.lock bash -c 'exit 42'; echo $?
100
In the first example, we see that we get back the exit status of the process we're running with flock. In the second example, we are creating contention for the lock. In that case, flock itself returns the status code we tell it (100). If you do not specify a value with the --conflict-exit-code flag, it will return 1 instead. However, I prefer setting less common values to prevent confusion with other processes/scripts which also might return a value of 1.
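Put together for the cron case, this might look like the sketch below. It assumes a flock from util-linux that supports -E, as in the man page quoted above; run_locked is a hypothetical helper and 100 is an arbitrary conflict code, which only works if the script itself never exits with 100:

```shell
#!/usr/bin/env bash
LOCK_BUSY=100   # arbitrary conflict code chosen for this sketch

# run_locked LOCKFILE COMMAND [ARGS...]
# Runs COMMAND under a non-blocking lock and returns either COMMAND's own
# exit status or LOCK_BUSY when the lock is already held.
run_locked() {
    local lockfile=$1
    shift
    local status=0
    flock -n -E "$LOCK_BUSY" "$lockfile" "$@" || status=$?
    if [ "$status" -eq "$LOCK_BUSY" ]; then
        echo "Previous run still holds $lockfile, skipping" >&2
    elif [ "$status" -ne 0 ]; then
        echo "Command failed with exit status $status" >&2
    fi
    return "$status"
}

# Example call, using the names from the question:
# run_locked lockfile myscript.sh
```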
#!/bin/bash
if ! pgrep myscript.sh; then
flock -n lockfile myscript.sh
fi
If I understand you right, you want to make sure 'myscript.sh' is not running before cron attempts to run your command again. Assuming that's right, we check to see if pgrep failed to find myscript.sh in the processes list and if so we run the flock command again.
Perhaps something like this would work for you.
#!/bin/bash
RETVAL=0
lockfailed()
{
echo "cannot flock"
exit 1
}
(
flock -w 2 42 || lockfailed
false
RETVAL=$?
echo "original retval $RETVAL"
exit $RETVAL
) 42>|/tmp/flocker
RETVAL=$?
echo "returned $RETVAL"
exit $RETVAL