Making qsub block until job is done? - performance

Currently, I have a driver program that runs several thousand instances of a "payload" program and does some post-processing of the output. The driver currently calls the payload program directly, using a shell() function, from multiple threads. The shell() function executes a command in the current working directory, blocks until the command is finished running, and returns the data that was sent to stdout by the command. This works well on a single multicore machine. I want to modify the driver to submit qsub jobs to a large compute cluster instead, for more parallelism.
Is there a way to make the qsub command output its results to stdout instead of a file and block until the job is finished? Basically, I want it to act as much like "normal" execution of a command as possible, so that I can parallelize to the cluster with as little modification of my driver program as possible.
Edit: I thought all the grid engines were pretty much standardized. If they're not and it matters, I'm using Torque.

You don't mention what queuing system you're using, but SGE supports the '-sync y' option to qsub which will cause it to block until the job completes or exits.
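For example, a blocking submission under SGE would look something like this (the script name is just a placeholder):
qsub -sync y myjob.sh
qsub will then not return until myjob.sh has finished, and its exit status reflects that of the job.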

In TORQUE this is done using the -I and -x options: qsub -I specifies that the job should be interactive, and -x says to run only the specified command. For example:
qsub -I -x myscript.sh
will not return until myscript.sh finishes execution.

In PBS you can use qsub -Wblock=true <command>

Related

How to create a job using only default bash commands?

So, I started studying UNIX systems about a month ago, and now I have a basic question about job control.
How can I create a job that will contain several processes, using only default bash commands?
Jobs are usually an interactive-shell concept, as there is usually a controlling terminal involved.
A shell script is executed in a non-interactive, non-login shell session, hence there is no job control by default.
You can force job control inside a script by setting:
set -m
inside the script.
From help set:
-m Job control is enabled.
echo | ping google.com &
That was a fine example, because both processes work independently and only need the pipe symbol (|) to be combined into a single job running in the background.
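A minimal sketch, assuming bash (the -c 3 flag is added here only so the example terminates on its own; otherwise it mirrors the pipeline above):
#!/bin/bash
set -m                          # enable job control in this non-interactive script
echo | ping -c 3 google.com &   # the pipeline forms one background job containing two processes
jobs                            # list the job that was just created
wait                            # block until the background job finishes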

LSF - BSUB Running a script if the job is killed

I'm working with LSF, running bsub commands.
I'm using the -Ep switch to run a post-exec script. This works great until the job is killed or hits a memory limit, run limit, etc.
Is there any way for the job to detect that it's running out of resources and then run the script? Or to force it to run the script even if it's been killed?
I guess my other option is running a job with a dependency on that job, which will run the "post exec" script when it finishes.
Any thoughts?
Kind Regards,
TheBigPeeler
From the documentation, you should be seeing the behaviour that you want.
A post-execution command runs after the job finishes, regardless of the exit state of the job. Once a post-execution command is associated with a job, that command runs even if the job fails. You cannot configure the post-execution command to run only under certain conditions.
I thought that maybe the interaction with JOB_INCLUDE_POSTEXEC (lsb.params) could account for the difference, but from my test the post-exec still runs in both cases. I used runlimit (bsub -W) to trigger the job kill.
Is it possible that the post exec is running, but exits early?
What version of LSF are you using? (What's the output of mbatchd -V and sbatchd -V)
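For what it's worth, the dependency workaround mentioned in the question could be sketched roughly like this (job names and script paths are placeholders, assuming LSF's -w dependency expressions):
# submit the main job under a known name, with a run limit
bsub -J mainjob -W 60 ./payload.sh
# submit the cleanup step, triggered once the main job ends, whatever its exit state
bsub -w 'ended(mainjob)' ./post_exec.sh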

Force shell script to run tasks in sequence

I'm running a shell script that executes several tasks. The thing is that the script does not wait for one task to end before starting the next one. My script should work differently, waiting for one task to complete before the next one starts. Is there a way to do that? My script looks like this:
sbatch retr.sc 19860101 19860630
scp EN/EN1986* myhostname#myhost.it:/storage/myhostname/MetFiles
The first command runs retr.sc, which retrieves files and takes roughly half an hour. The second command, though, runs right away, moving only some of the files to the destination. I want the scp command to run only once the first one has completed.
Thanks in advance.
You have several options:
use srun rather than sbatch: srun retr.sc 19860101 19860630
use sbatch for the second command as well, and make it depend on the first one
like this:
RES=$(sbatch retr.sc 19860101 19860630)
sbatch --depend=after:${RES##* } --wrap "scp EN/EN1986* myhostname#myhost.it:/storage/myhostname/MetFiles"
create one script that incorporates both retr.sc and scp and submit that script (see the sketch below).
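A rough sketch of that third option, assuming retr.sc can simply be run from inside a batch script (paths and the scp target are taken from the question):
#!/bin/bash
#SBATCH --job-name=retr_and_copy
# retrieve the files first; the script blocks here until retrieval finishes
bash retr.sc 19860101 19860630
# only then copy the results to the destination host
scp EN/EN1986* myhostname#myhost.it:/storage/myhostname/MetFiles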
sbatch exits immediately after submitting the job to Slurm.
salloc will wait for the job to finish before exiting.
From the man page:
$ salloc -N16 xterm
salloc: Granted job allocation 65537
(at this point the xterm appears, and salloc waits for xterm to exit)
salloc: Relinquishing job allocation 65537
Thanks for your replies.
I've sorted it out this way:
RES=$(sbatch retr.sc $date1 $date2)
array=(${RES// / })
JOBID=${array[3]}
year1=${date1:0:4}
sbatch --dependency=afterok:${JOBID} scp.sh $year1
where scp.sh is the script for transferring the file to my local machine

sun grid engine qsub to all nodes

I have a master and two nodes. They are installed with SGE. And I have a shell script ready on all the nodes as well. Now I want to use qsub to submit the job to all my nodes.
I used:
qsub -V -b n -cwd /root/remotescript.sh
but it seems that only one node is doing the job. I am wondering how to submit jobs to all nodes. What would the command be?
SGE is meant to dispatch jobs to worker nodes. In your example, you create one job, so one node will run it. If you want to run a job on each of your nodes, you need to submit more than one job. If you want to target specific nodes, you should probably use something closer to:
qsub -V -b n -cwd -l hostname=node001 /root/remotescript.sh
qsub -V -b n -cwd -l hostname=node002 /root/remotescript.sh
The "-l hostname=*" parameter will require a specific host to run the job.
What are you trying to do? The general use case for a grid engine is to let the scheduler dispatch the jobs so you don't have to use the "-l hostname=*" parameter. So technically you should just submit a bunch of jobs to SGE and let it dispatch them according to node availability.
Finch_Powers's answer describes well how SGE allocates resources, so I'll elaborate below on the specifics of your question, which may be why you are not getting the desired outcome.
You mention launching the remote script via:
qsub -V -b n -cwd /root/remotescript.sh
Also, you mention again that these scripts are located on the nodes:
"And I have a shell script ready on all the nodes as well"
This is not how SGE is designed to work, although it can do this. Typical usage is to have the same script (or scripts) accessible to all execution nodes via network-mounted storage and to let SGE decide which nodes to run them on.
To run remote code, you may be better served using plain SSH.
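A plain-SSH version, as suggested above, might look roughly like this (node names follow the earlier examples; it assumes passwordless SSH to the nodes):
#!/bin/bash
for node in node001 node002; do
    ssh "$node" /root/remotescript.sh &   # run the script on each node in parallel
done
wait    # block until every remote script has finished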

running multiple instances of a single MPI executable in parallel on a cluster using mpirun options?

I'm trying to write a shell script to perform some kind of algorithm, and part of it requires parallel execution of an MPI executable across multiple input files on a grid engine cluster. From what I have read, it seems that mpirun supports MPMD execution either by using colon-separated command lines or by using an application context/schema file and then running mpirun --app my_appfile. Below is what my my_appfile looks like:
-np 12 /path/to/executable /path/to/dir1/input1
-np 12 /path/to/executable /path/to/dir2/input2
-np 12 /path/to/executable /path/to/dir3/input3
...
-np 12 /path/to/executable /path/to/dir10/input10
I was trying to execute 10 instances of the same executable in parallel and assign the cluster's resources accordingly (120 processes in this case, in SGE's orte parallel environment).
However, there was a problem. Each run is supposed to generate its output in the same directory as its particular input file. When I submitted the job (the submission script contains only the mpirun --app my_appfile line), only the output from input1 appeared in dir1, not the rest. So I wonder what the problem is here. Is it a problem with the mpirun options or with how the cluster handles the job? Any help would be highly appreciated. Thank you!
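For reference, the submission script described would presumably look something like this (the parallel environment name and slot count follow what the question states; it is a sketch of the setup, not a fix for the missing outputs):
#!/bin/bash
#$ -pe orte 120     # request 120 slots in the orte parallel environment
#$ -cwd             # run from the submission directory
mpirun --app my_appfile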
