Best Option for resumable script - shell

I am writing a script that executes around 10 back-end processes in sequence, each one running only if the previous process completed without errors.
Now assume a scenario in which, say, the 5th process fails and the script exits. I want to code it so that the next time the user runs it (after fixing the error that caused the script to exit last time), it resumes from the 5th process onwards and does not start again from the 1st process.
To be more specific, assume following is the script:
Script Starts
Process1
if [ $? -eq 0 ]; then
Process2
if [ $? -eq 0 ]; then
Process3
if [ $? -eq 0 ]; then
..
..
..
..
if [ $? -eq 0 ]; then
Process10
else
exit
So the script will exit whenever any one of the processes fails to complete with status 0. Again, if process5 fails and the user corrects the problem and restarts the script, the script should start with process5 again, not process1; or at least the user should be offered the choice to resume the script or to start it from the beginning, i.e. process1.
What are the possible ways to code this kind of script? Please also bear in mind that I am not allowed to use a temporary database where I could store the status of each process.
I need to code this in sh (shell script) on Unix.

A simple solution would be to write stamp files:
#!/bin/sh
set -e # Automatically abort if any simple command fails
if ! test -f cmd1-stamp; then cmd1; fi
touch cmd1-stamp
if ! test -f cmd2-stamp; then cmd2; fi
touch cmd2-stamp
When the script executes, if cmd1-stamp exists, cmd1 is not executed; otherwise cmd1 is executed, and the script aborts if it fails. Note that it is very tempting to write test -f cmd1-stamp || cmd1, and this seems to work (in bash), but the shell specs state that the shell shall abort if the simple command that fails is not part of an AND or OR list, and I suspect this is (yet another) instance of bash not conforming to the spec. (Although the spec doesn't seem to say that the shell shall not abort if the failing command is part of an AND or OR list.)
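To tie the stamp-file idea back to the original question (resume from the failed step, with an optional full restart), here is a minimal sketch along the same lines. The stamp directory, the step names, and the "fresh" argument are assumptions made up for illustration, not something prescribed by the answer above:
#!/bin/sh
set -e                        # abort as soon as any step fails

STAMP_DIR=./stamps            # assumed location for the stamp files
mkdir -p "$STAMP_DIR"

# Optional full restart: "./run.sh fresh" clears all stamps first
if [ "$1" = "fresh" ]; then
    rm -f "$STAMP_DIR"/*.stamp
fi

run_step() {
    # $1 is the step name, the rest is the command to run;
    # skip the step entirely if its stamp file already exists
    name=$1; shift
    if [ ! -f "$STAMP_DIR/$name.stamp" ]; then
        "$@"
        touch "$STAMP_DIR/$name.stamp"
    fi
}

run_step process1 ./process1
run_step process2 ./process2
# ... and so on, up to ...
run_step process10 ./process10
On a rerun, every step that already has a stamp file is skipped, so execution effectively resumes at the step that failed last time.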

Related

How can I write and reference global bash scripts which halt the parent script's execution upon conditions?

I have written a simple script
get-consent-to-continue.sh
echo "Would you like to continue [y/n]?"
read response
if [ "${response}" != 'y' ]; then
    exit 1
fi
I have added this script to ~/.bashrc as an alias
~/.bashrc
alias getConsentToContinue="source ~/.../get-consent-to-continue.sh"
My goal is to be able to call this from another script
~/.../do-stuff.sh
#!/usr/bin/env bash
# do stuff
getConsentToContinue
# do other stuff IF given consent, ELSE stop execution without closing terminal
Goal
I want to be able to
bash ~/.../do-stuff.sh
And then, when getConsentToContinue is called, if I respond with anything != 'y', then do-stuff.sh stops running without closing the terminal window.
The Problem
When I run
bash ~/.../do-stuff.sh
the alias is not accessible.
When I run
source ~/.../do-stuff.sh
Then the whole terminal closes when I respond with 'n'.
I just want to cleanly reuse this getConsentToContinue script to short-circuit execution of whatever script happens to be calling it. It's just for personal use when automating repetitive tasks.
A script can't force its parent script to exit, unless you source the script (since it's then executing in the same shell process).
Use an if statement to test how getConsentToContinue exited.
if ! getConsentToContinue
then
exit 1
fi
or more compactly
getConsentToContinue || exit
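For instance, one way to wire this up, sketched under the assumption that you turn the helper into a function and source it from do-stuff.sh (the paths here are placeholders, not the asker's real layout):
# get-consent-to-continue.sh (sourced, so it only defines a function)
getConsentToContinue() {
    printf 'Would you like to continue [y/n]? '
    read response
    [ "$response" = "y" ]    # the test's result becomes the function's exit status
}

# do-stuff.sh
. ~/get-consent-to-continue.sh    # adjust the path to wherever the helper lives
# do stuff
getConsentToContinue || exit 1
# do other stuff; only reached if the user answered 'y'
Because the exit now happens in do-stuff.sh itself, running bash ~/.../do-stuff.sh stops cleanly without closing the terminal.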
You could pass the PID of the calling script
For instance say you have a parent script called parent.sh:
# do stuff
echo "foo"
check_before_proceed $$
echo "bar"
Then, your check_before_proceed script would look like:
#!/bin/sh
echo "Would you like to continue [y/n]?"
read response
if [ "${response}" != 'y' ]; then
    kill -9 "$1"
fi
The $$ denotes the PID of the parent.sh script itself; the relevant docs are under the shell's special parameters. When we pass $$ as a parameter to the check_before_proceed script, we have access to the PID of the running parent.sh via the positional parameter $1 (see positional parameters).
Note: in my example, the check_before_proceed script would need to be accessible on $PATH

Concise way to run command if previous command completes successfully, when previous command is already running

I use the following pattern often:
some_long_running_command && echo success
However, sometimes I forget to attach the && echo success, so I write out the status check on the next line:
some_long_running_command
[ $? -eq 0 ] && echo success
As you can see, the status check in the second example ([ $? -eq 0 ] &&) is much more verbose than the check from the first example (&&).
Is there a more concise way to run a command only if the previous command completes successfully, when the previous command is already running?
exit exits its shell using the same exit status as the most recently completed command. This is true even if that command runs in the parent of the shell that executes exit: you can run exit in a subshell.
some_long_running_command
(exit) && echo success
Depending on the command, you can also suspend the job, then resume it with fg, which will have the same exit status as the command being resumed. This has the benefit of not having to wait for the command to complete before adding the new && list.
some_long_running_command
# type ctrl-z here
fg && echo success
Generally, this will work as long as some_long_running_command doesn't have to keep track of some resource that is changing in real-time, so that you don't lose data that came and went while the job was (briefly) suspended.
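As a quick, hypothetical illustration of the subshell trick (run interactively; false and true stand in for the long-running command):
false
(exit) && echo success    # prints nothing: the subshell inherits $? = 1 and exits with it
true
(exit) && echo success    # prints "success": the inherited status is 0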

How to detect if a script in Julia got "Killed"?

So I'm running a Julia (v 0.6.3) script from a bash script called ./run.sh, like so:
./julia/bin/julia scripts/my_script.jl
Now the script terminates prematurely. I'm sure of this because it doesn't finish outputting all the data it's supposed to. When I run a parsing script (written in Python) afterwards, it fails because of missing data.
I think the script terminates due to insufficient RAM (I'm running it in a Docker container). If I bump up the allocated RAM, the script works fine.
To catch this error in my Julia script I did the following:
try
    main()
catch e
    println(e)
    exit(1)
end
exit(0)
On top of that, I updated the bash script to check if the Julia script failed:
./julia/bin/julia scripts/my_script.jl
echo "Julia script returned: $?"
if [ $? -ne 0 ]; then
echo "Julia script failed"
exit 1
fi
However, no exception is printed by the Julia script. Furthermore, the return code is 0, so the bash script doesn't detect any errors either.
If I just run the script directly from the terminal, at the very end of the output there's the word Killed. Immediately afterwards I ran echo $? and got 137, which is definitely not a successful return status. So it seems Julia and bash both know the script was terminated, but not if I run the Julia script from within a bash script...?
Another weird thing is that when I run the Julia script from the bash script, the word Killed doesn't appear at all!
How can I reliably detect whether a script was prematurely terminated? Is there a way to get the reason it was killed as well (e.g. not enough RAM, stack overflow, etc)?
Your code if [ $? -ne 0 ]; then checks whether the echo before it completed successfully, not the Julia script (see Cyrus's comment).
Sometimes it makes sense to put the return value in a variable:
./julia/bin/julia scripts/my_script.jl
retval=$?
if [ $retval -ne 0 ]; then
echo "Julia script failed with $retval"
exit $retval
fi
ps: Reports a snapshot of the status of currently running processes.
ps -ef | grep 'julia'
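As an additional, hedged sketch (not part of the original answer): when a process is killed by a signal, the shell reports its status as 128 plus the signal number, so SIGKILL (signal 9, which the out-of-memory killer sends) shows up as 137. As long as the script itself never exits with a status above 127, you can use that to tell a signal death apart from an ordinary failure:
./julia/bin/julia scripts/my_script.jl
retval=$?
if [ $retval -ge 128 ]; then
    # killed by a signal: 137 = 128 + 9 (SIGKILL), e.g. the OOM killer
    echo "Julia script was killed by signal $((retval - 128))"
    exit $retval
elif [ $retval -ne 0 ]; then
    echo "Julia script failed with $retval"
    exit $retval
fi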

error handling in shell script to stop flow execution on next line

I would like to set up error handling in my shell script so that if an invocation of a script fails with an error, I can stop execution and the flow does not go on to the next line.
For example, in my main script I make a call to the script below:
sh /usr/oracle/StopServer.sh
If this script fails with an error, the next script in this main file should not execute. Please help.
You can check the return value of the command execution; one way to do this is:
sh /usr/oracle/StopServer.sh
if [ $? -ne 0 ]; then
# exit or take action
fi
It should do the trick.
Here you go.
sh /usr/oracle/StopServer.sh && sh my_next_line_that_only_happens_if_Stop_server_exits_without_error
For more information: the && operator in bash (and most languages) exhibits McCarthy evaluation, which is basically just lazy evaluation for boolean conditionals. What this means is that for an and (&&), the second term of the expression is only evaluated if the first part is true (because otherwise the result is guaranteed to be false). Similarly, if we did A || B (or), B would only be executed if A were false, which in shell means A returned a non-zero exit code and thus failed. If you had a program that you wanted to execute if the first program exited normally and another that you wanted to execute if it exited in a failed state (I'm going to call them normal and fail), then you could execute them like so:
sh condition.sh && sh normal.sh || sh fail.sh
EDIT:
if sh condition.sh; then
    # do whatever
else
    # this is what to do if it failed
fi
EDIT #2:
If you want to see what happens, try running this:
if ls --bad-option; then
    echo "passed"
else
    echo "failed"
fi
# ls rejects the unknown option and exits non-zero, so "failed" is printed
It fails because of the bad option, so failed is run; if you had just put ls (with no bad option) it would have echoed passed. This is where you can either exit or run the next script, depending on which path you take.
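A different approach, not mentioned in the answers above, is to let the shell abort for you: with set -e the main script stops as soon as any command (including the invoked StopServer.sh) exits with a non-zero status. A minimal sketch, where NextStep.sh is a placeholder name for whatever comes next:
#!/bin/sh
set -e                           # abort the main script on the first failing command

sh /usr/oracle/StopServer.sh     # if this fails, nothing below runs
sh /usr/oracle/NextStep.sh       # placeholder for the next step
Note that set -e deliberately ignores failures that are part of && or || lists or if conditions, so it complements rather than replaces the explicit checks shown above.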

Bash: How to lock also when perform an outside script

Here is my bash code:
(
flock -n -e 200 || (echo "This script is currently being run" && exit 1)
sleep 10
...Call some functions which are written in another script...
sleep 5
) 200>/tmp/blah.lockfile
I'm running the script from two shells in succession, and as long as the first one is at "sleep 5" everything works: the second one doesn't start. But when the first one moves on to the code from the other script (the other file), the second run starts to execute.
So I have two questions here:
What should I do to prevent this script and all its "children" from running while the script OR one of its "children" is still running?
(I didn't find a more appropriate expression for running another script than a "child", sorry for that :) ).
According to the man page, -n causes the process to exit when it fails to gain the lock, but as far as I can see it just waits until it can run. What am I missing?
Your problem may be fairly mundane. Namely,
false || ( exit 1 )
does not cause the script to exit. Rather, the exit instructs the inner subshell to exit. So change your first line to:
flock -n -e 200 || { echo "This script is currently being run"; exit 1; } >&2
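Putting that fix back into the question's layout, a sketch of the whole script might look like this (the lock-file path and the sleeps are taken from the question):
(
    flock -n -e 200 || { echo "This script is currently being run"; exit 1; } >&2
    sleep 10
    # ... call the functions from the other script here; the lock is held
    # for as long as this ( ... ) block keeps file descriptor 200 open
    sleep 5
) 200>/tmp/blah.lockfile
Here exit 1 leaves the outer ( ... ) block, so nothing after the failed flock runs in the second invocation.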
