Autosys Job failing due to underlying script not being exited

I have an Autosys job that calls a wrapper script, which in turn calls a script that starts a server and keeps running.
The problem is that once the job starts, it keeps waiting for an exit signal or for the script to exit. Because the script never exits, the job ends up in a FAILED state.
Can anyone provide a workaround for this?
For example:
Autosys job: A
Wrapper script: W.sh
Main restart script: serverrestart.sh
A
|---W.sh
    |---serverrestart.sh (always running)
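
One possible workaround, offered only as a sketch: have W.sh start the server detached from itself, so the wrapper can return right away while the server keeps running. The helper name start_detached and the log path are hypothetical, and whether Autosys then treats the job as successful depends on how the job is defined, so this needs checking in your environment.

```bash
#!/bin/bash
# Start a command detached from the calling script and print its PID.
# Usage: start_detached LOGFILE COMMAND [ARGS...]
# (start_detached is a hypothetical helper, not part of Autosys.)
start_detached() {
    local log=$1
    shift
    # nohup ignores hangups; redirecting stdin/stdout/stderr keeps the
    # child from holding the wrapper's file descriptors open.
    nohup "$@" >"$log" 2>&1 </dev/null &
    echo $!
}

# Inside W.sh this could be used as follows (paths are assumptions):
#   pid=$(start_detached /tmp/serverrestart.log ./serverrestart.sh)
#   echo "server started with PID $pid"
#   exit 0
```

With this layout the job only reports whether the server was launched, not whether it stays healthy, so a separate check would be needed for that.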

Related

bash: stop subshell script marked as failed if one step exits with an error

I am running a script through the SLURM job scheduler on HPC.
I am invoking a subshell script through a master script.
The subshell script contains several steps. One step sometimes fails because of the quality of the data. This step is not required for the later steps, but if it fails, my whole subshell script is marked with a "failed" status in the job scheduler. However, I need this subshell script to have a "completed" status in the job scheduler, because it is a dependency in my master script.
I tried setting
set +e
in my subshell script right before the optional step, but it doesn't seem to work: I still get a non-zero exit code and a FAILED status in the job scheduler.
In short: I need the subshell script to have the status "completed" in the job scheduler, regardless of whether one particular step finishes with errors. I would appreciate any help with this.

For Slurm jobs submitted with sbatch, the job exit code is taken to be the return code of the submission script itself. The return code of a Bash script is that of the last command in the script.
So if you just end your script with exit 0, Slurm should consider it COMPLETED no matter what.
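
A minimal sketch of that idea, assuming placeholder step names (required_step, optional_step and final_step are hypothetical stand-ins for the real commands). Note that set +e only stops the shell from aborting on an error; it does not change the exit status the script finally returns, which is why the failure has to be absorbed explicitly.

```bash
#!/bin/bash
# Hypothetical stand-ins for the real pipeline steps.
required_step() { echo "required step done"; }
optional_step() { echo "optional step failed" >&2; return 1; }
final_step()    { echo "final step done"; }

run_steps() {
    required_step || return 1

    # Tolerate a failure of the optional step only.
    if ! optional_step; then
        echo "optional step failed, continuing" >&2
    fi

    final_step || return 1
    return 0
}

# In the submission script itself, the last lines would be:
#   run_steps
#   exit 0    # forces a zero exit code even if run_steps failed
```

Wrapping only the optional step keeps real failures in the required steps visible, whereas an unconditional exit 0 at the end hides all of them.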

LSF - BSUB Running a script if the job is killed

I'm working with LSF, running bsub commands.
I'm using the -Ep switch to run a post-exec script. This works great until the job is killed or hits a memory limit, run limit, etc.
Is there any way for the job to detect that it is running out of resources and then run the script, or to force it to run the script even if the job has been killed?
I guess my other option is to run a second job with a dependency on the first one, which runs the "post exec" script when the first job finishes.
Any thoughts?
Kind Regards,
TheBigPeeler

From the documentation, you should be seeing the behaviour that you want.
A post-execution command runs after the job finishes, regardless of
the exit state of the job. Once a post-execution command is associated
with a job, that command runs even if the job fails. You cannot
configure the post-execution command to run only under certain
conditions.
I thought that maybe the interaction with JOB_INCLUDE_POSTEXEC (lsb.params) could account for the difference, but from my test the post-exec still runs in both cases. I used runlimit (bsub -W) to trigger the job kill.
Is it possible that the post exec is running, but exits early?
What version of LSF are you using? (What's the output of mbatchd -V and sbatchd -V)

Autosys job not failing when the shell script fails

I am moving existing manual shell scripts to run as Autosys jobs. However, even after adding exit 1 for the failure cases, the Autosys job does not fail and Autosys shows an exit code of 0.
I tried the simple script below:
#!/bin/ksh
exit 1;
When I execute this, the Autosys job shows a success status. I have not changed the success code or max success code in Autosys; everything is at its default. What am I missing?

Why does scheduling Spark jobs through cron fail (while the same command works when executed on terminal)?

I am trying to schedule a Spark job using cron.
I have written a shell script, and it runs fine from the terminal.
However, when the script is run by cron, it fails with an insufficient memory to start JVM thread error.
There is no issue whenever I start the script from the terminal. The problem only appears when the script is started by cron.
Any suggestions would be appreciated.

DATASTAGE: how to run more instance jobs in parallel using DSJOB

I have a question.
I want to run several instances of the same job in parallel from within a script: I have a loop in which I invoke the jobs with dsjob, without the "-wait" and "-jobstatus" options.
I want the jobs to complete before the script terminates, but I don't know how to check whether a job instance has finished.
I thought of using the wait command, but it is not appropriate here.
Thanks in advance.

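First, make sure the job is compiled with the "Allow Multiple Instance" option selected.
Second:
#!/bin/bash
# Load the DataStage environment (sets DSHOME and related paths).
. /home/dsadm/.bash_profile
INVOCATION=(1 2 3 4 5)
cd "$DSHOME/bin" || exit 1
for id in "${INVOCATION[@]}"
do
    # -wait blocks until this instance has finished.
    ./dsjob -run -mode NORMAL -wait test demo."$id"
done
Here test is the project, demo is the job, and $id is the invocation id.
The two lines that source .bash_profile and change to $DSHOME/bin make sure the environment and path are set up so that dsjob can be found and run.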

Run the jobs like you say without the -wait, and then loop around running dsjob -jobinfo and parse the output for a job status of 1 or 2. When all jobs return this status, they are all finished.
You might find, though, that you check the status of the job before it actually starts running and you might pick up an old status. You might be able to fix this by first resetting the job instance and waiting for a status of "Not running", prior to running the job.
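
The polling loop described above could look roughly like this. It is a sketch under assumptions: it assumes dsjob -jobinfo prints a line of the form "Job Status : RUN OK (1)", and the names job_status_code, wait_for_instances, POLL_SECONDS and MAX_POLLS are hypothetical. Check the parsing against the real output of your DataStage version.

```bash
#!/bin/bash
# Read dsjob -jobinfo output on stdin and print the numeric status code
# from the "Job Status" line, e.g. "Job Status : RUN OK (1)" -> 1.
# The line format is an assumption.
job_status_code() {
    sed -n 's/^Job Status[^(]*(\([0-9][0-9]*\)).*/\1/p' | head -n 1
}

# Poll until every instance reports status 1 or 2, or give up.
# Usage: wait_for_instances PROJECT JOB ID [ID...]
wait_for_instances() {
    local project=$1 job=$2
    shift 2
    local id code pending polls=0
    while :; do
        pending=0
        for id in "$@"; do
            code=$(dsjob -jobinfo "$project" "$job.$id" 2>/dev/null | job_status_code)
            case "$code" in
                1|2) ;;            # finished (status 1 or 2)
                *) pending=1 ;;    # still running, or status not readable
            esac
        done
        [ "$pending" -eq 0 ] && return 0
        polls=$((polls + 1))
        # Give up eventually: a job that never reaches 1 or 2 would
        # otherwise keep this loop running forever.
        [ "$polls" -ge "${MAX_POLLS:-360}" ] && return 1
        sleep "${POLL_SECONDS:-10}"
    done
}
```

A call such as wait_for_instances test demo 1 2 3 4 5 would go after the loop that starts the instances. It does not address the stale-status problem mentioned above; resetting the instances first is still needed for that.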

Invoke the jobs in a loop without the -wait or -jobstatus option.
After the loop, check the status of each job with the dsjob command.
Example: dsjob -jobinfo projectname jobname.invocationid
You can write a second loop for this check and use the sleep command inside it.
Write your further logic according to the status of the jobs.
However, it is better to create a Job Sequence that invokes this multi-instance job simultaneously with different invocation ids.
Create a single sequence job if the instances belong to the same process.
Otherwise, create separate sequences, or separate scripts, that trigger these jobs simultaneously with their invocation ids, and schedule them for the same time.
The best option is to create one standard, generalized script in which everything is created or set from command-line parameters.
Example: log files named from the job name plus the invocation id.
Then schedule the same script with different parameters or invocation ids.
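
As a rough illustration of such a generalized script, the sketch below takes the project, job, invocation id and log directory as parameters and names the log file after the job name and invocation id. The function names log_path and run_instance are hypothetical, and the exit code returned by dsjob with -jobstatus should be checked against the DataStage documentation before relying on it.

```bash
#!/bin/bash
# Build a per-instance log path from the job name and invocation id.
log_path() {
    local logdir=$1 job=$2 id=$3
    printf '%s/%s.%s.log\n' "$logdir" "$job" "$id"
}

# Run one instance, wait for it, and write its output to its own log.
# Usage: run_instance PROJECT JOB INVOCATION_ID LOGDIR
run_instance() {
    local project=$1 job=$2 id=$3 logdir=$4
    dsjob -run -mode NORMAL -jobstatus "$project" "$job.$id" \
        >"$(log_path "$logdir" "$job" "$id")" 2>&1
}

# Example of parallel use (values are assumptions):
#   for id in 1 2 3; do run_instance test demo "$id" /tmp/dslogs & done
#   wait
```

Because each call is a separate background process of the script, the shell's own wait command is enough here to block until all instances have finished.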
