Correct use of the Bash wait command with an unknown number of processes - bash

I'm writing a bash script that essentially fires off a python script that takes roughly 10 hours to complete, followed by an R script that checks the outputs of the python script for anything I need to be concerned about. Here is what I have:
ProdRun="python scripts/run_prod.py"
echo "Commencing Production Run"
$ProdRun #Runs python script
wait
DupCompare="R CMD BATCH --no-save ../dupCompareTD.R" #Runs R script
$DupCompare
Now my issue is that the python script can often generate a whole heap of different processes on our linux server depending on its input, with lots of different PIDs, AND we have heaps of workers using the same server firing off scripts. As far as I can tell from reading, the 'wait' command either waits for all processes to finish or for a specific PID to finish, but when I cannot tell what or how many PIDs will be assigned, or how many processes will run, how exactly do I use it?
EDIT: Thank you to all who helped; here is what caused my dilemma, for anyone google searching this. I broke the ProdRun python script up into the individual scripts it was itself calling, but still had the issue. I then found that one of those scripts was calling yet another smaller script with a "&" at the end, which made it ignore any attempt to wait on it inside the python script itself. Simply removing this and calling it with os.system() instead allowed all the code to run sequentially.
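For anyone else landing here: the bare wait builtin only ever waits for jobs that the current shell itself put in the background, which is why it did nothing useful in the wrapper above (the python script ran in the foreground, so there was nothing left to wait for). A minimal sketch, with sleep standing in for real work:

sleep 5 &      # job 1, a background child of this shell
sleep 7 &      # job 2, another background child
wait           # blocks until both background jobs above have exited
echo "all background jobs of this shell are done"

python scripts/run_prod.py   # a foreground command needs no wait at all
wait                         # no-op here: nothing is left in the background

Note that wait can never be pointed at processes that are not children of this shell, so anything the python script detaches internally is out of bash's reach.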

It sounds like you are trying to implement a job scheduler, possibly with some complex dependencies between different tasks. I recommend using a job scheduler instead. It lets you specify how those jobs should run, while also giving you features like monitoring, handling of exceptional cases, errors, and so on.
Examples are the open-source Rundeck (https://github.com/rundeck/rundeck) or the commercial Control-M (http://www.bmcsoftware.uk/it-solutions/control-m.html).

Make your Python program wait on the children it spawns. That's the proper way to fix this scenario. Then you don't have to wait for the Python script at all: once it finishes, everything it started has finished too.
(Also, don't put your commands in variables.)
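Put together, a minimal sketch of what the wrapper can look like once the Python script waits for its own children (set -e is an optional extra, not part of the original script):

#!/bin/bash
set -e                                    # optional: stop if any step fails

echo "Commencing Production Run"
python scripts/run_prod.py                # foreground, so bash waits for it implicitly

R CMD BATCH --no-save ../dupCompareTD.R   # only starts after the python step has exited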

Related

How to monitor and control background processes in shell script

I need to write a shell (bash) script that will be executing several Hive queries.
Each of the queries will produce a directory with a lot of files.
After all queries are finished I need to process all these files in a specific order.
I want to run the Hive queries in parallel as background processes, as each one might take a couple of hours.
I would also like to parallelize the resulting file processing, but there are some complications that I don't know how to handle. E.g. I can start processing the results of the first and second queries as soon as they are finished, but for the third, I need to hold off until the first two processors are done. Similarly for the fourth and fifth.
I won't have any problems writing such a program in Java, but how to do it in shell - beats me.
If someone can give me a hint on how can I monitor execution of these components in the shell script, I would appreciate it greatly.
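One way to express those dependencies in plain bash is to remember each background PID with $! and then wait on exactly the PIDs a step depends on. A rough sketch, where the query files and processing commands are hypothetical placeholders:

hive -f query1.hql & q1=$!             # fire off the long-running queries in parallel
hive -f query2.hql & q2=$!
hive -f query3.hql & q3=$!

wait "$q1"; process_result1 & p1=$!    # process query 1 as soon as it is done
wait "$q2"; process_result2 & p2=$!    # likewise for query 2
wait "$p1" "$p2"                       # the third processor needs the first two processors
wait "$q3"; process_result3

wait accepts several PIDs at once and only returns when all of them have finished, which is enough to encode a simple dependency graph like this one.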

Is it ok to check PIDs for rare exceptions?

I read this interesting question, which basically says that I should always avoid fetching the PID of processes that aren't child processes. It's well explained and makes perfect sense.
BUT, while the OP was trying to do something that cron isn't meant for, I'm in a very different situation:
I want to run a process say every 5 minutes, but once in a hundred times it takes a little more than 5 minutes to run (and I can't have two instances running at once).
I don't want to kill or manipulate other processes, I just want to end my process without doing anything if another instance of the process is running.
Is it ok to fetch the PID of "non-child processes" in that case? If so, how would I do it?
I've tried doing if pgrep "myscript"; then ... or stuff like that, but the process finds its own PID. I need to detect if it finds more than one.
(Initially, before being redirected, I read this question, but the solution given doesn't work: it can give the PID of the process using it.)
EDIT: I should have mentioned it before, but if the script is already in use I still need to write something to a log file, at least: date>>script.log; echo "Script already in use">>script.log. I may be wrong, but I think flock doesn't allow me to do that.
Use lckdo or flock to avoid duplicate runs.
DESCRIPTION
lckdo runs a program with a lock held, in order to prevent multiple
processes from running in parallel. Use just like nice or nohup.
Now that util-linux contains a similar command named flock, lckdo is
deprecated, and will be removed from some future version of moreutils.
Of course, you can implement this primitive lockfile feature yourself:
if [ ! -f /tmp/my.lock ]; then
    touch /tmp/my.lock        # claim the lock (note: this test-then-touch is racy)
    run prog                  # "run prog" stands in for the real command
    rm -f /tmp/my.lock        # release the lock
fi
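Regarding the edit about logging: flock does not prevent that, because you decide what happens when the lock is already held. A sketch using a non-blocking flock on a dedicated file descriptor (the lock file name is assumed, the log file name is taken from the question):

#!/bin/bash
exec 9>/tmp/myscript.lock            # open the lock file on fd 9
if ! flock -n 9; then                # -n: fail immediately instead of blocking
    date >> script.log
    echo "Script already in use" >> script.log
    exit 1
fi

# ... do the real work here; the lock is released when the script exits ...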

How to execute multiple procs in tcl scripting

I have 4 procs in my tcl script. Each proc contains a while loop that waits for a task to finish and then processes the result files. My goal now is to run these 4 processes in parallel instead of one by one. Does anyone have any idea?
Background:
The normal way up to now has been to open 4 terminals in KDE/GNOME and execute the different tasks, so the 4 different tasks actually run together.
Tcl threads can do the job just fine: http://www.tcl.tk/man/tcl8.6/ThreadCmd/thread.htm
Of course you may just leave everything as it is and run your scripts in the background within one terminal, if that's what you are looking for, e.g.
script1.tcl &
script2.tcl &
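If the four tasks are already separate scripts, the same shell pattern as in the main question applies: start all of them in the background and let the bare wait collect them. A sketch, assuming tclsh (executable scripts with a proper #! line work the same way):

tclsh script1.tcl &
tclsh script2.tcl &
tclsh script3.tcl &
tclsh script4.tcl &
wait                     # returns once all four background jobs have finished
# ... process the result files here ...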
Threading is the better option for this scenario and gives you better control over your subprocesses. You can refer to the following link for a simple example: https://www.activestate.com/blog/2016/09/threads-done-right-tcl

How to exit the entire call stack of shell scripts if a child script fails?

I have a set of shell scripts, around 20-30, that are used to perform one big task as a whole. The wrapper script mainly calls the high-level task scripts, but internally those scripts call other scripts, and the flow goes on in a nested manner.
I want to know if there is a way to exit the entire call stack if some critical script fails. Normally I run an exit 125 command and then catch that in the caller script, and so on, but that feels a little complicated. Is there a special exit that will abort the entire call stack? I don't want to use the kill command to abort the wrapper script process.
You could have your main wrapper script start every sub-script in its own process group, using e.g. chpst -P.
Then the sub-scripts, as well as their children, could kill their own process group by sending it a KILL signal, and this would not affect the main wrapper script.
I think this would be a bad idea, though, and that what you're currently doing is the better way (because it makes the code easier to follow).
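For completeness, a rough sketch of that process-group idea (chpst comes from the runit package; setsid from util-linux can play a similar role). The stage script names are hypothetical:

#!/bin/bash
# wrapper.sh: each stage runs in its own process group
chpst -P ./stage1.sh || echo "stage1 aborted, but the wrapper is still alive"
chpst -P ./stage2.sh

# Inside any script nested under stage1.sh, a fatal error can take down the
# whole stage without touching the wrapper:
#   kill -TERM 0    # pid 0 means "every process in my own process group"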

BASH: Recursive design, linear implementation

The idea
Say I have a few scripts. For example:
script1
script2
script3
I want each script to:
Do something
Run next script
Wait
Cleanup
The wait is simply to wait for the next script to complete.
The problem
A recursive solution is rather straightforward. The problem is that each script then needs to check whether there is a next script. This is OK, but a minor mistake in one script and it becomes debugging hell, especially if there are many scripts.
For this reason I was thinking of doing it in a linear way, having a main script (script1) keep control of everything. The main issue is the wait part.
How do I make script1 pause script2 until script3 has completed, so that it can then run its cleanup?
The easiest would be to simply split each worker script into two parts: the real work and the cleanup. Then your master script can run each of the work scripts in sequence, followed by each of the cleanup scripts.
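A rough sketch of that split, with hypothetical script names; the cleanups run in reverse order here to mirror the innermost-first unwinding of the recursive design, but plain forward order works too if they are independent:

#!/bin/bash
set -e               # abort the whole chain if any step fails

./script1-work       # the "Do something" parts, in order
./script2-work
./script3-work

./script3-cleanup    # the cleanup parts, unwinding like the recursive version
./script2-cleanup
./script1-cleanup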
Another way to go about this would be to use a "build system" like SCons, which may work well if you can define the inputs and outputs of each script as filenames and let SCons schedule the work and support the "clean" command. This will be a bit of a steep learning curve, but for serious systems where debugging may be needed often, it may be more beneficial.
