BASH: Recursive design, linear implementation

The idea
Say I have a few scripts. For example:
script1
script2
script3
I want each script to:
Do something
Run next script
Wait
Cleanup
The wait is simply to wait for the next script to complete.
The problem
A recursive solution is rather straightforward. The problem is that each script then needs to check whether there is a next script. This is OK, but a minor mistake in any one script turns it into a debugging hell, especially if there are many scripts.
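For concreteness, a single recursive worker might look roughly like this (a sketch with hypothetical script names; the echo lines stand in for the real work and cleanup):

#!/bin/bash
echo "script1: doing its work"   # the real work goes here

next=./script2                   # each script has to know its successor
if [ -x "$next" ]; then
    "$next" &                    # run the next script in the chain...
    wait $!                      # ...and wait for it to complete
fi

echo "script1: cleaning up"      # cleanup runs after the chain returns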
For this reason I was thinking of doing it in a linear way, with a main script (script1) keeping control of everything. The main issue is the wait part.
How do I make script1 pause script2 until script3 has completed, so that script2 can then clean up?

The easiest would be to simply split each worker script into two parts: the real work and the cleanup. Then your master script can run each of the work scripts in sequence, followed by each of the cleanup scripts.
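A minimal sketch of that master script, assuming hypothetical names such as script1_work.sh and script1_cleanup.sh for the two halves of each worker:

#!/bin/bash
# Run the work halves in order, then the cleanup halves in reverse order.
# set -e stops the sequence at the first failure.
set -e

./script1_work.sh
./script2_work.sh
./script3_work.sh

./script3_cleanup.sh
./script2_cleanup.sh
./script1_cleanup.sh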
Another way to go about this would be to use a "build system" like SCons, which may work well if you can define the inputs and outputs of each script as filenames and let SCons schedule the work and support the "clean" command. This will be a bit of a steep learning curve, but for serious systems where debugging may be needed often, it may be more beneficial.

Related

Correct use Bash wait command with unknown processes number

I'm writing a bash script that essentially fires off a Python script that takes roughly 10 hours to complete, followed by an R script that checks the outputs of the Python script for anything I need to be concerned about. Here is what I have:
ProdRun="python scripts/run_prod.py"
echo "Commencing Production Run"
$ProdRun #Runs python script
wait
DupCompare="R CMD BATCH --no-save ../dupCompareTD.R" #Runs R script
$DupCompare
Now my issue is that the Python script can often generate a whole heap of different processes on our Linux server depending on its input, with lots of different PIDs, AND we have heaps of workers using the same server firing off scripts. As far as I can tell from reading, the 'wait' command must wait for all processes to finish or for a specific PID to finish, but when I cannot tell what or how many PIDs will be assigned or processes will be run, how exactly do I use it?
EDIT: Thank you to all who helped; here is what caused my dilemma, for anyone Google-searching this. I broke the ProdRun Python script up into the individual scripts it was itself calling, but still had the issue. I then found that one of these scripts was calling yet another, smaller script with a "&" at the end of it, which made it ignore any commands to wait on it inside the Python script itself. Simply removing this and calling that script with os.system() instead allowed all the code to run sequentially.
It sounds like you are trying to implement a job scheduler with possibly some complex dependencies between different tasks. I recommend using a job scheduler instead. It allows you to specify how those jobs run, whilst also benefitting from features like monitoring, handling of exceptional cases, errors, ...
Examples are: the open source rundeck https://github.com/rundeck/rundeck or the commercial one http://www.bmcsoftware.uk/it-solutions/control-m.html
Make your Python program wait on the children it spawns. That's the proper way to fix this scenario. Then you don't have to wait for Python after it finishes (sic).
(Also, don't put your commands in variables.)
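A sketch of the wrapper along those lines, using the commands from the question, without the variables and without the redundant wait (each command blocks until it finishes, and the Python-side fix from the question's edit is assumed):

#!/bin/bash
echo "Commencing Production Run"
python scripts/run_prod.py               # runs in the foreground; blocks until done
R CMD BATCH --no-save ../dupCompareTD.R  # only starts once the Python run has finished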

How to exit the entire call stack of shell scripts if a child script fails?

I have a set of shell scripts, around 20-30, that are used for performing one big task as a whole. The wrapper script mainly calls the high-level task scripts, but internally those scripts call other scripts, and the flow goes on in a nested manner.
I want to know if there is a way to exit the entire call stack if some critical script fails. Normally I run an exit 125 command and then catch that in the caller script, and so on, but I find that a little complicated. Is there a special exit that will abort the entire call stack? I don't want to use the kill command to abort the wrapper script process.
You could have your main wrapper script start every sub-script in its own process group, using e.g. chpst -P.
Then the sub-scripts, as well as their children, could kill their own process group by sending it a KILL signal, and this would not affect the main wrapper script.
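A rough sketch of that approach, assuming chpst from the runit package and hypothetical sub-script names:

# In the wrapper: start each sub-script in its own process group.
chpst -P ./subtask1.sh
chpst -P ./subtask2.sh

# Inside a sub-script, or anything it calls: kill the whole process group
# (pid 0 means "my own process group") without touching the wrapper.
kill -KILL 0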
I think this would be a bad idea, though, and that what you're currently doing is the better way (because it makes the code easier to follow).
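For reference, the propagation pattern the question describes looks roughly like this in each caller (a sketch with a hypothetical child script name):

./child_task.sh
status=$?
if [ "$status" -ne 0 ]; then
    echo "child_task.sh failed with status $status; aborting" >&2
    exit "$status"
fi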

bash shell script sleep or cronjob which is preferred?

I want to do a task every 5 minutes, and I want to control when I start it and when I stop it.
One way is to use sleep in a while true loop; another way is to use a cron job. Which one is preferred performance-wise?
Thanks!
cron is almost always the best solution.
If you try to do it yourself with a simple script running in a while loop:
while true; do
    task
    sleep 300
done
you eventually find that nothing is happening because your task failed due to a transient error. Or the system rebooted. Or some such. Making your script robust enough to deal with all these eventualities is hard work, and unnecessary. That's what cron is for, after all.
Also, if the task takes some non-trivial amount of time, the above simple-minded while loop will slowly shift out of sync with the clock. That could be fixed:
while true; do
    task
    sleep $((300 - $(date +%s) % 300))
done
Again, it's hardly worth it since cron will do that for you, too. However, cron will not save you from starting the task before the previous invocation finished, if the previous invocation got stuck somehow. So it's not a completely free ride, but it still provides you with some additional robustness.
A simple approach to solving the stuck-task problem is to use the flock utility. For example, you could cron a script containing the following:
(
    flock -n 8 || {
        logger -p user.warning Task is taking too long
        # You might want to kill the stuck task here. See pkill
        exit 1
    }
    # Do the task here
) 8> /tmp/my_task.lck
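The corresponding crontab entry for a five-minute schedule is a single line (a sketch; /path/to/my_task.sh is a hypothetical location for the script above, and the */5 step syntax assumes a Vixie-style cron):

*/5 * * * * /path/to/my_task.sh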
Use a cron job. Cron is made for this type of use case. It frees you from having to code the while loop yourself.
However, cron may be unsuitable if the run time of the script is unpredictable and exceeds the timer schedule.
Performance-wise, it is hard to tell unless you share what the script does and how often it runs. But generally speaking, neither option should have a negative impact on performance.

BASH shell process control - any other examples of controlling/scheduling work

I've inherited a medium-sized project in which the main (batch) program is fed work through a large set of shell scripts that do a lot of process control (waiting for processes to complete, sleeping, checking for conditions, etc.) [and reprocessed through Perl scripts].
Are there other examples of process control by shell scripts? I would like to see what other people have done as a comparison (as I'm not really fond of the 6,668-line shell script).
The comparison may lead to the conclusion that the current program works and doesn't need to be messed with, or, for maintenance reasons, that it's too cumbersome and doing it another way will be easier to maintain; either way, I need other examples.
To reduce the "generality" of the question, here's an example of what I'm looking for: procsup
The Inquisitor project relies extensively on process control from shell scripts. You might want to look at its directory with the main function set or its directory with tests (i.e. slave processes) that it runs.
This is quite a general question, and therefore giving specific answers may be a little bit difficult. (And you won't be happy with a 5000-line example.) Most probably the architecture of your application is faulty and requires a rather complete rework.
As you probably already know, process control with bash is pretty simple:
./test_script.sh &
test_script_pid=$!
wait $test_script_pid # waits until it's done
./test_script2.sh
echo $? # Prints return code of previous command
You can do the same things with, for example, Python's subprocess module (or with Perl, obviously). If you have a complex architecture with a large number of different programs, then the process is obviously non-trivial.
That is an awfully big shell script. Have you considered refactoring it?
From the sound of it, there may be a lot of instances where you could replace several lines of code with a call to a shell function. If you can simplify the code in this way, then it will be easier to see where there are errors in the logic.
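For example, a repeated run-and-check pattern can be collapsed into a single function (a minimal sketch with hypothetical script names):

# Run a command and abort the whole script with a message if it fails.
run_or_die() {
    if ! "$@"; then
        echo "FAILED: $*" >&2
        exit 1
    fi
}

run_or_die ./prepare_batch.sh
run_or_die ./submit_batch.sh nightly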
I've used this tactic successfully with a humongous PERL script and it turned out to have some serious logic errors and to be a security risk because it had embedded passwords that were obfuscated in an easily reversible way. The passwords that were exposed could have been used by persons unknown (well, a disgruntled employee) to shut down an entire global network.
Some managers were leaning towards making a security exception because this script was so important, but when the logic error was explained and it was clear that this script was providing incorrect data, it was decided that no data was better than dirty data. The guy who wrote that script taught himself programming with a PERL book and the writing of the script.

Pitfalls of using shell scripts to wrap a program?

Consider I have a program that needs an environment set. It is in Perl, and I want to modify the environment (to search for libraries in a special spot).
Every time I mess with the standard way to do things in UNIX, I pay a heavy price and a penalty in flexibility.
I know that by using a simple shell script I will inject an additional process into the process tree. Any process accessing its own process tree might be thrown for a little bit of a loop.
Anything recursive in a nontrivial way would need to defend against multiple expansions of the environment.
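Such a defence can be as small as a sentinel variable (a sketch; MYAPP_ENV_SET and /opt/myapp/lib are hypothetical names):

# Only extend the environment the first time the wrapper runs.
if [ -z "$MYAPP_ENV_SET" ]; then
    export MYAPP_ENV_SET=1
    export PERL5LIB=/opt/myapp/lib${PERL5LIB:+:$PERL5LIB}
fi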
Anything resembling being in a pipe of programs (or closing and opening STDIN, STDOUT, or STDERR) is my biggest area of concern.
What am I doing to myself?
What am I doing to myself?
Getting yourself all het up over nothing?
Wrapping a program in a shell script in order to set up the environment is actually quite standard and the risk is pretty minimal unless you're trying to do something really weird.
If you're really concerned about having one more process around (and UNIX processes are very cheap, by design), then use the exec keyword, which, instead of forking a new process, simply execs a new executable in place of the current one. So, where you might have had
#!/bin/bash -
export FOO=hello                      # export, so the Perl process actually sees it
export PATH=/my/special/path:${PATH}
perl myprog.pl
You'd just say
#!/bin/bash -
export FOO=hello
export PATH=/my/special/path:${PATH}
exec perl myprog.pl
and the spare process goes away.
This trick, however, is almost never worth the bother; the one counter-example is that if you can't change your default shell, it's useful to say
$ exec zsh
in place of just running the shell, because then you get the expected behavior for process control and so forth.

Resources