Capture the job ID of a job submitted by qsub - bash

I have been looking for a simple way to capture the job ID of a job submitted by qsub. I saw a suggestion to give the job a name and then refer to it by that name, but that's an indirect method. I tried the following, but I get an error:
jobID="qsub job.sh"
35546.cell0 (This is the output I want to capture)
$jobID
qsub -W depend=afterok:$jobID analyze.sh
Can anyone please suggest a neat way to capture the job ID from qsub?
Thank you very much.

You may try
qsub -W depend=afterok:$(qsub job.sh) analyze.sh
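If you want to keep the ID in a variable first, here is a minimal sketch, assuming your scheduler prints only the job ID on stdout (Torque does; on SGE you may need qsub -terse):
# capture the ID printed by qsub, then submit the dependent job
jobID=$(qsub job.sh)                       # e.g. 35546.cell0
qsub -W depend=afterok:$jobID analyze.sh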

Related

Send Mail using cron job and shell script

I am using a cron job to run my shell script every 2 minutes. My shell script contains Pig and Hive scripts. I am searching for high-risk persons with my Hive query, and I can get their email IDs from my Hive table. I want to know whether I can send mail to those persons, and how. I checked on the internet but was not able to understand it properly; it would be a great help if you could help me with this. Thanks
This code solves my problem
$ echo "hello world" | mail -s "a subject" xxx#xxx.com

QSUB: Specify output and error files for each task in Job Array

Hopefully this is not a duplicate, and also not just a problem with our cluster's configuration...
I am submitting a job array to a cluster using qsub with the following command:
qsub -q QUEUE -N JOBNAME -t 1-10 -e ${ERRFILE}_$SGE_TASK_ID /path/to/script.sh
where
ERRFILE=/home/USER/somedir/errors.
The idea is to specify an error file (also analogously the output file) that also contains the task ID from within the job array.
So far I have learned that the line
#$ -e ${ERRFILE}_$SGE_TASK_ID
inside script.sh does not work, because it is a comment and is not evaluated by bash. My first attempt (the qsub command above) does not work either, because $SGE_TASK_ID is only set AFTER the job is submitted.
I read here that escaping the evaluation of $SGE_TASK_ID (in that link it's PBS's $PBS_JOBID, but it's a similar problem) should work, but when I tried
qsub -q QUEUE -N JOBNAME -t 1-10 -e ${ERRFILE}_\$SGE_TASK_ID /path/to/script.sh
it did not work as expected.
Am I missing something obvious? Is it possible to use $SGE_TASK_ID in the name of an error file (the automatic naming of error files does that, but I want to specify the directory and if possible the name, too)?
Some additional remarks:
I am using the -cwd option for qsub inside script.sh, but that is NOT where I want my error files to be stored.
I have next to no control over how the cluster works and no root access (wouldn't know what I could need it for in this context but anyway...).
Apparently our cluster does not use PBS.
Yes, my scripts are all executable and, where applicable, start with #!/bin/bash (I also specified the use of bash with the -S /bin/bash option for qsub).
There seems to be a solution here, but I am not quite sure how that works and it also appears to be using PBS. If that answer DOES apply to my question and I misunderstood it, please let me know.
I would appreciate any hint into the right direction.
Thank You!
I didn't know this either, but it looks like Grid Engine has something called "pseudo environment variables", such as $TASK_ID, for this purpose. This should work:
qsub -q QUEUE -N JOBNAME -t 1-10 -e ${ERRFILE}_\$TASK_ID /path/to/script.sh
From the man page:
-e [[hostname]:]path,...
...
If the pathname contains certain pseudo environment variables, their value will be expanded at runtime of the job and will be used to constitute the standard error stream path name. The following pseudo environment variables are supported currently:
$HOME      home directory on execution machine
$USER      user ID of job owner
$JOB_ID    current job ID
$JOB_NAME  current job name (see -N option)
$HOSTNAME  name of the execution host
$TASK_ID   array job task index number
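For completeness, the same trick should work for the output file via -o; a sketch based on the question's command, where OUTFILE is a hypothetical variable analogous to ERRFILE:
# escape the $ so the pseudo variable reaches Grid Engine unexpanded
qsub -q QUEUE -N JOBNAME -t 1-10 \
     -e ${ERRFILE}_\$TASK_ID \
     -o ${OUTFILE}_\$TASK_ID /path/to/script.sh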

How to see the output of a job submitted through qsub in my terminal?

I am submitting this simple job to SGE through qsub. How can I see the output of the job, which is a simple echo, directly in my terminal? I mean I want it on screen, not diverted to a logfile or anything like that.
So here is the job stored in Dummyjob:
#!/bin/sh
#$ -j y
#$ -S /bin/sh
#$ -q long.q
sleep 30
echo "I'm done!"
And this is the qsub command:
qsub -N job_1 -cwd ./Dummyjob
Thank you!
It doesn't do that. You're referring to a batch facility, e.g., How to submit a job using qsub.
Looking at the command-line options, these are the possibilities:
-o <output_logfile> name of the output log file
-e <error_logfile> name of the error log file
-m ea Will send email when job ends or aborts
You can ask it to send mail when the job is done (successfully or not). Or you might be able to make it write to a fifo, e.g., in one terminal you would do
mkfifo myFakeFile
tail -f myFakeFile
and then use
-o myFakeFile
when submitting (in that order, so that something is waiting). But if the program does any checking, it will not write to a fifo (because it is not a regular file).
Further reading:
qsub - submit a batch job to Sun Grid Engine.
6.3.2 Creating a FIFO (The Linux Programmer's Guide)
The previous answer mentions that you are submitting a 'batch job script', and this is true, so you will not see the output on your terminal (tty); the stdout/stderr will be sent to output files. However, that doesn't mean you can't run an interactive job through Grid Engine. You can: just use 'qrsh' instead of 'qsub' and the script will be run on a remote machine chosen by Grid Engine, with the results displayed on your screen.
Note: You might have to configure qrsh in your Grid Engine Cluster for this to work.
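For example, reusing the Dummyjob script from the question (assuming qrsh is configured on your cluster):
# runs on a remote host picked by Grid Engine; output appears in this terminal
qrsh -q long.q ./Dummyjob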

Autosys job automation

We get multiple user requests to hold, run, or kill jobs running on our Autosys server.
I want to create a script that takes the job name and the desired send event as input parameters and then triggers that event on the respective job.
Can it be done? Please help.
Sounds like the answer is the killall command (Examples)
Yes, it is possible.
You need to create a script with the following logic.
You can store all the job names you want to act on in a text file (a parm file) and read that file in the script. The script loops over the job names from the file, and when running it you pass the ICE_TYPE (it can be ON_ICE/OFF_ICE/KILLJOB/... or any other event).
If you create a script called autosysICE.sh, you run it in the following manner:
Shell > autosysICE.sh KILLJOB
The following is the important line to embed in the script; the event you pass in (e.g. KILLJOB) replaces $ICE_TYPE:
sendevent -E $ICE_TYPE -J JOB_NAME
The rest you can do with an array: read the job names from the file and loop through them.
Code snippet (note: I haven't tested this, but this is the logic):
open(DATA,"<autosysParm.txt") or die "Can't open data";
#jobList= <DATA>;
close(DATA);
foreach my $jobName (#jobList) {
sendevent -E $ICE_TYPE -J $jobName
}
Let me know if you're still having an issue.
Thanks!

How to get information about completed PBS or Torque jobs?

I have the IDs of completed jobs. How do I check their detailed information, such as execution time, allocated nodes, etc.? I remember SGE has a command for this (qacct?), but I could not find one for PBS or Torque. Thanks.
Since viewing completed jobs through job accounting requires root access, or requires that the cluster admins have installed pbstools (both out of the control of a user), I've found that the easiest thing to do is to place a
tracejob $PBS_JOBID
on the last line of the submission script. If the scheduler is MAUI, then checkjob -vv $PBS_JOBID is an alternative. These commands can be redirected to a separate output file:
tracejob $PBS_JOBID > $PBS_O_WORKDIR/$PBS_JOBID.tracejob
It should also be possible to run this as a user epilogue script to make it more reusable from job to job.
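In context, the end of a submission script might look like this minimal sketch, where the #PBS directives and the workload are placeholders:
#!/bin/bash
#PBS -N myjob
#PBS -l nodes=1:ppn=1
cd $PBS_O_WORKDIR
./run_simulation    # hypothetical workload
# last line: dump the accounting trace next to the job's other output
tracejob $PBS_JOBID > $PBS_O_WORKDIR/$PBS_JOBID.tracejob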
I was looking at this thread to find out how to do this on my HPC system running PBSPro 19.2.3. As of PBSPro 18 the solution is similar to John Damm Sørensen's reply, but the -w flag is used instead of -1 to display each field's output on a single line, and you need to add the -x flag to also see the details of finished jobs, so you don't need to run the command within the job script (p. 203, section 2.59.2.2 of the Reference Guide).
qstat -fxw $PBS_JOBID
You can then grep the requested information out of it, such as resources used, exit status, etc.:
qstat -fxw $PBS_JOBID | grep -E "resources_used|Exit_status|array_index"
For Torque, you can check at least part of the information you seek using the "tracejob" command.
Official documentation:
http://docs.adaptivecomputing.com/torque/Content/topics/11-troubleshooting/usingTracejobToLocateFailures.htm
One thing you should notice is that this tool is a convenience that parses the logs. By default it will only check the last day. Be sure to read the doc for the "-n" option.
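For example, to parse the last week of logs for a given job (the job ID here is a placeholder):
# -n <days>: how many days of logs tracejob parses, counting back from today
tracejob -n 7 5000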
On a Torque-based system, I find that the best way to get stats for a job is to add this to the end of the submitted job script. The output will be added to the STDOUT file.
qstat -f -1 $PBS_JOBID
Right now the only way to get this in TORQUE is to look at the accounting logs. You can grep for the job id and view the accounting records for the job, which look like this:
04/30/2014 15:20:18;Q;5000.bob;queue=batch
04/30/2014 15:33:00;S;5000.bob;user=dbeer group=dbeer jobname=STDIN queue=batch ctime=1398892818 qtime=1398892818 etime=1398892818 start=1398893580 owner=dbeer@bob exec_host=bob/0
04/30/2014 15:36:20;E;5000.bob;user=dbeer group=dbeer jobname=STDIN queue=batch ctime=1398892818 qtime=1398892818 etime=1398892818 start=1398893580 owner=dbeer@bob exec_host=bob/0 session=22933 end=1398893780 Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=2580kb resources_used.vmem=37072kb resources_used.walltime=00:03:20
Unfortunately, to do this directly you have to have root access. To get around this, there are tools such as pbsacct that make these records easier to browse. pbsacct is part of the pbstools package, which is where that link takes you.
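For reference, if you do have access, grepping the accounting logs directly might look like this, assuming the default Torque server location (the path may differ on your cluster; the daily files are named YYYYMMDD):
grep '5000.bob' /var/spool/torque/server_priv/accounting/20140430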
