What are the different ways to check if a MapReduce program ran successfully - hadoop

If we need to automate a MapReduce program or run it from a script, what are the different ways to check if the MapReduce program ran successfully? One way is to check if the _SUCCESS file is created in the output directory. Does the command "hadoop jar program.jar hdfs:/input.txt hdfs:/output" return 0 or 1 based on success or failure?

Just like any other command in Linux, you can check the exit status of a hadoop jar command using the built-in variable $?.
You can use:
echo $?
after executing the hadoop jar command to check its status.
The exit status value ranges from 0 to 255. An exit status of zero implies that the command executed successfully, while a non-zero value indicates that the command failed.
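For example, a minimal sketch (using the example paths from the question) that checks both the exit status and the _SUCCESS marker file; the hdfs dfs -test -e check is an optional extra:

#!/bin/bash
# Run the job; the paths are the example paths from the question.
hadoop jar program.jar hdfs:/input.txt hdfs:/output
status=$?

if [ "$status" -ne 0 ]; then
    echo "MapReduce job failed with exit status $status" >&2
    exit "$status"
fi

# Optional second check: the _SUCCESS marker in the output directory.
# hdfs dfs -test -e returns 0 if the given path exists.
if hdfs dfs -test -e hdfs:/output/_SUCCESS; then
    echo "MapReduce job completed successfully"
else
    echo "Output exists but the _SUCCESS marker is missing" >&2
    exit 1
fi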
Edit: To see how to achieve automation or to run from a script, refer to Hadoop job fails when invoked by cron.

Related

bash: stop subshell script marked as failed if one step exits with an error

I am running a script through the SLURM job scheduler on HPC.
I am invoking a subshell script through a master script.
The subshell script contains several steps. One step in the script sometimes fails because of the quality of the data; this step is not required for the later steps, but if it fails, my whole subshell script is marked with "FAILED" status in the job scheduler. However, I need this subshell script to have a "COMPLETED" status in the job scheduler, as it is a dependency in my master script.
I tried setting
set +e
in my subshell script right before the optional step, but it doesn't seem to work: I still get an exit code with errors and a FAILED status in the job scheduler.
In short: I need the subshell script to have the status COMPLETED in the job scheduler, no matter whether one particular step finishes with errors or not. I would appreciate help with this.
For Slurm jobs submitted with sbatch, the job exit code is taken to be the return code of the submission script itself. The return code of a Bash script is that of the last command in the script.
So if you just end your script with exit 0, Slurm should consider it COMPLETED no matter what.
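A minimal sketch of such a subshell script, where optional_step and later_step are hypothetical placeholders for the real commands:

#!/bin/bash
# optional_step and later_step are hypothetical placeholders for the real commands.

# Absorb a failure of the optional step so it cannot become the script's return code.
optional_step || echo "optional step failed, continuing anyway" >&2

later_step

# End with an explicit zero return code so Slurm records the job as COMPLETED.
exit 0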

Running shell script from jenkins

When I try to execute the job from the terminal it runs for one hour without any issue. When I try to execute the shell script from Jenkins it runs for just one minute and then stops. The output from the Jenkins console is as follows:
Creating folder path in /jenkins/workspace/load_test/scripts/loadtest/loadtest1
PWD is : /jenkins/workspace/load_test/scripts/loadtest
Running /jenkins/workspace/load_test/scripts/loadtest/loadtest1/testRestApi.sh
1495126268
1495129868
3600
Process leaked file descriptors. See http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build for more information
Finished: SUCCESS
Any ideas/suggestions to make the script run for one hour from the Jenkins job?
Have you tried with BUILD_ID=dontKillMe? It is commonly used for daemons: https://wiki.jenkins-ci.org/display/JENKINS/ProcessTreeKiller. This should let your script keep running.
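For reference, a minimal sketch of what the "Execute shell" build step might look like with that variable set; the script path is the one from the console output above:

#!/bin/bash
# Prevent Jenkins' ProcessTreeKiller from killing processes spawned by this build.
export BUILD_ID=dontKillMe

# Run the load-test script from the question.
/jenkins/workspace/load_test/scripts/loadtest/loadtest1/testRestApi.sh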

ExitCode of RunProgramInGuest in Jenkins job

I'm running a batch file in a virtual machine from a Jenkins job. I am using the following command to run it:
..path..\vmrun.exe -T ws -gu username -gp password runProgramInGuest "c:\vm_image.vmx" -activeWindow -interactive "C:\Installer.bat"
The job is running correctly and installing the software (by running the batch file).
But sometimes it exits with exit code 2.
So Jenkins shows the job as failed.
What does exit code 2 mean for this job?
What are the other possible exit codes for this command and their meanings?
How can I find out whether the job passed or failed?
If I understood what you ran correctly, these are the relevant VIX error codes:
0 – VIX_OK
The operation was successful.
1 – VIX_E_FAIL
Unknown error.
2 – VIX_E_OUT_OF_MEMORY
Memory allocation failed: out of memory.
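If the Jenkins build step can run a shell, a hedged sketch of surfacing these codes might look like the following; VMRUN, GUEST_USER, GUEST_PASS and VMX_PATH are placeholders for the values used in the question:

#!/bin/bash
# Hypothetical sketch: VMRUN points at the vmrun binary, and the guest
# credentials and .vmx path are placeholders for the ones in the question.
"$VMRUN" -T ws -gu "$GUEST_USER" -gp "$GUEST_PASS" runProgramInGuest \
    "$VMX_PATH" -activeWindow -interactive "C:\\Installer.bat"
code=$?

# Map the documented VIX return codes to readable messages.
case "$code" in
    0) echo "VIX_OK: the operation was successful" ;;
    1) echo "VIX_E_FAIL: unknown error" >&2 ;;
    2) echo "VIX_E_OUT_OF_MEMORY: memory allocation failed" >&2 ;;
    *) echo "vmrun exited with code $code" >&2 ;;
esac

# Propagate the code so Jenkins marks the build as failed on any non-zero value.
exit "$code"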

Autosys job not failing when the shell script fails

I am moving existing manual shell scripts to execute via Autosys jobs. However, after adding exit 1 for each failure case in the scripts, the Autosys job is not failing and Autosys shows the exit code as 0.
I tried the simple script below:
#!/bin/ksh
exit 1;
When I execute this, the Autosys job shows a success status. I have not updated the success code or max success code in Autosys; everything is default. What am I missing?

How to invoke an oozie workflow via shell script and block/wait till workflow completion

I have created a workflow using Oozie that is composed of multiple action nodes, and I have been able to run it successfully via a coordinator.
I want to invoke the Oozie workflow via a wrapper shell script.
The wrapper script should invoke the Oozie command, wait till the oozie job completes (success or error) and return back the Oozie success status code (0) or the error code of the failed oozie action node (if any node of the oozie workflow has failed).
From what I have seen so far, I know that as soon as I invoke the oozie command to run a workflow, the command exits with the job id printed on the Linux console, while the oozie job keeps running asynchronously in the backend.
I want my wrapper script to block till the oozie coordinator job completes and return back the success/error code.
Can you please let me know how/if I can achieve this using any of the oozie features?
I am using Oozie version 3.3.2 and bash shell in Linux.
Note: In case anyone is curious about why I need such a feature - the requirement is that my wrapper shell script should know how long an oozie job has been running and when an oozie job has completed, and accordingly return back the exit code so that the parent process that is calling the wrapper script knows whether the job completed successfully or not, and, if it errored out, raise an alert/ticket for the support team.
You can do that by using the job id, then starting a loop and parsing the output of oozie job -info. Below is the shell code for the same.
Start the Oozie job:
oozie_job_id=$(oozie job -oozie http://<oozie-server>/oozie -config job.properties -run );
echo $oozie_job_id;
sleep 30;
Parse the job id from the output. Here the output format is "job: jobid":
job_id=$(echo $oozie_job_id | sed -n 's/job: \(.*\)/\1/p');
echo $job_id;
Check the job status at a regular interval to see whether it is still RUNNING:
while true
do
  job_status=$(oozie job -oozie http://<oozie-server>/oozie -info "$job_id" | sed -n 's/Status\(.*\): \(.*\)/\2/p');
  if [ "$job_status" != "RUNNING" ];
  then
    echo "Job is completed with status $job_status";
    break;
  fi
  # this sleep depends on your job, please change the value accordingly
  echo "sleeping for 5 minutes";
  sleep 5m
done
This is a basic way to do it; you can modify it as per your use case.
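If the wrapper also needs to hand a success/failure code back to its caller, as the question asks, one way is to map the final status to an exit code after the loop; this assumes SUCCEEDED is the terminal status of a successful workflow run:

# After the loop: translate the final Oozie status into the wrapper's exit code.
# SUCCEEDED is assumed to be the terminal status of a successful workflow run.
if [ "$job_status" = "SUCCEEDED" ]; then
    exit 0
else
    echo "Oozie job $job_id finished with status $job_status" >&2
    exit 1
fi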
To upload the workflow definition to HDFS, use the following command:
hdfs dfs -copyFromLocal -f workflow.xml /user/hdfs/workflows/workflow.xml
To fire up the Oozie job you need the two commands below. Note that each command is written on a single line.
JOB_ID=$(oozie job -oozie http://<oozie-server>/oozie -config job.properties -submit)
oozie job -oozie http://<oozie-server>/oozie -start ${JOB_ID#*:} -config job.properties
You then need to parse the result coming from the command below while its return code is 0; otherwise it's a failure. Simply loop, sleeping X amount of time after each attempt.
oozie job -oozie http://<oozie-server>/oozie -info ${JOB_ID#*:}
echo $?   # shows whether the command executed successfully or not
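Putting that together, a hedged sketch of the polling loop for this approach (the 5-minute interval is an arbitrary choice):

# Poll until the workflow leaves the RUNNING state, then report the final status.
while true
do
    info=$(oozie job -oozie http://<oozie-server>/oozie -info ${JOB_ID#*:})
    if [ $? -ne 0 ]; then
        # The -info command itself failed (e.g. the Oozie server is unreachable).
        echo "oozie -info failed" >&2
        exit 1
    fi
    status=$(echo "$info" | sed -n 's/Status\(.*\): \(.*\)/\2/p')
    if [ "$status" != "RUNNING" ]; then
        echo "Workflow finished with status $status"
        break
    fi
    # Arbitrary polling interval; adjust to your job's expected runtime.
    sleep 5m
done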
