Workflow error logs disabled in Oozie 4.2 - hadoop

I am using Oozie 4.2 that comes bundled with HDP 2.3.
While working with a few of the example workflows that come with the Oozie package, I noticed that the job error log is disabled, which makes debugging really difficult in the event of a failure. I tried running the commands below:
# oozie job -config /home/santhosh/examples/apps/hive/job.properties -run
job: 0000063-150904123805993-oozie-oozi-W
# oozie job -errorlog 0000063-150904123805993-oozie-oozi-W
Error Log is disabled!!
Can someone please tell me how to enable the workflow error log for oozie?

In the Oozie UI, 'Job Error Log' is a tab that was introduced in HDP v2.3 with Oozie v4.2.
It is the simplest way to look up the errors for a specific Oozie job in the Oozie log file.
To enable Oozie's Job Error Log, make the following changes in the Oozie log4j properties file.
Add the lines below after the log4j.appender.oozie settings and before log4j.appender.oozieops:
log4j.appender.oozieError=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.oozieError.RollingPolicy=org.apache.oozie.util.OozieRollingPolicy
log4j.appender.oozieError.File=${oozie.log.dir}/oozie-error.log
log4j.appender.oozieError.Append=true
log4j.appender.oozieError.layout=org.apache.log4j.PatternLayout
log4j.appender.oozieError.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - SERVER[${oozie.instance.id}] %m%n
log4j.appender.oozieError.RollingPolicy.FileNamePattern=${log4j.appender.oozieError.File}-%d{yyyy-MM-dd-HH}
log4j.appender.oozieError.RollingPolicy.MaxHistory=720
log4j.appender.oozieError.filter.1 = org.apache.log4j.varia.LevelMatchFilter
log4j.appender.oozieError.filter.1.levelToMatch = WARN
log4j.appender.oozieError.filter.2 = org.apache.log4j.varia.LevelMatchFilter
log4j.appender.oozieError.filter.2.levelToMatch = ERROR
log4j.appender.oozieError.filter.3 = org.apache.log4j.varia.LevelMatchFilter
log4j.appender.oozieError.filter.3.levelToMatch = FATAL
log4j.appender.oozieError.filter.4 = org.apache.log4j.varia.DenyAllFilter
Then change log4j.logger.org.apache.oozie=WARN, oozie to log4j.logger.org.apache.oozie=ALL, oozie, oozieError
Restart the Oozie service. The job error log will then be available for new jobs launched after the restart.
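Before restarting, it can help to confirm that both changes actually landed in the file. The following is only a minimal sketch: the function name check_oozie_error_appender is made up for illustration, and the path in the usage note is an assumption about where your distribution keeps the Oozie log4j file.

```shell
#!/bin/sh
# Hypothetical helper: verify that a log4j properties file defines the
# oozieError appender and that the org.apache.oozie logger is attached to it.
check_oozie_error_appender() {
    props="$1"
    # The appender itself must be defined.
    grep -q '^log4j\.appender\.oozieError=' "$props" || {
        echo "oozieError appender not defined in $props" >&2
        return 1
    }
    # The logger line must list oozieError among its appenders.
    grep -E '^log4j\.logger\.org\.apache\.oozie[[:space:]]*=' "$props" \
        | grep -q 'oozieError' || {
        echo "org.apache.oozie logger does not reference oozieError" >&2
        return 1
    }
    echo "oozieError appender looks configured in $props"
}
```

For example, check_oozie_error_appender /etc/oozie/conf/oozie-log4j.properties (path assumed; adjust to your installation). Once the service is back up, the oozie job -errorlog command from the question can be re-run against a newly launched job.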

As mentioned, the error log is new and may have been left disabled for good reasons. However, it seems that you have the wrong expectation of the Oozie error log.
The error log is meant to be a subset of the log file, not an addition to it.
So yes, it can make debugging easier, but if you checked the Oozie log and did not find what you were looking for, the error log will not solve that for you.
You will probably want to look at the logs of the underlying tasks, which can be found via the external ID.
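As a rough sketch of following the external ID to the task logs: oozie job -info <workflow-id> lists each action with its external ID, and on a YARN cluster that ID can be handed to yarn logs. This assumes the external ID has the usual job_<timestamp>_<sequence> form and that log aggregation is enabled; the function names are made up for illustration.

```shell
#!/bin/sh
# Convert a Hadoop job ID (job_<ts>_<seq>) into the matching
# YARN application ID (application_<ts>_<seq>).
to_application_id() {
    printf '%s\n' "$1" | sed 's/^job_/application_/'
}

# Fetch the aggregated container logs for an action's external ID.
# Requires the yarn CLI on the PATH.
fetch_action_logs() {
    yarn logs -applicationId "$(to_application_id "$1")"
}
```

Usage would be something like fetch_action_logs job_1441366805993_0063, where the ID is a placeholder for whatever external ID your own action reports.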

Related

How to resolve Oozie error : JA009: Cannot initialize Cluster.check configuration for mapreduce.framework.name

I have been using oozie to schedule spark jobs.
I am trying to deploy a Spark job on a 2.x cluster using the Spark action available in Oozie.
In my job.properties, I have the following
`nameNode=hdfs://hostname:8020
jobTracker=hostname:8050
master=yarn-cluster
queueName=default
oozie.use.system.libpath=true`
When I submit the Oozie job, I receive this error:
Error:
ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:457)
What am I doing wrong here? Does anything need to be changed in the properties file?
Thanks
I was also getting the same JA009 error. In my case the NameNode was running in safe mode; once it left safe mode, the cluster could be initialized and I was able to submit the Oozie job.
You can also refer to the following Oozie error codes list.
https://oozie.apache.org/docs/4.1.0/oozie-default.xml
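To check for the safe mode situation described above, the NameNode state can be queried with hdfs dfsadmin -safemode get. The snippet below is a minimal sketch with made-up function names; it assumes the hdfs CLI is on the PATH and that the command reports "Safe mode is ON" or "Safe mode is OFF".

```shell
#!/bin/sh
# Return 0 if the given `hdfs dfsadmin -safemode get` output reports ON.
safemode_is_on() {
    case "$1" in
        *"Safe mode is ON"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the NameNode and print a hint (requires the hdfs CLI).
check_namenode_safemode() {
    status="$(hdfs dfsadmin -safemode get)"
    if safemode_is_on "$status"; then
        echo "NameNode is in safe mode; Oozie submissions may fail with JA009"
        echo "An HDFS admin can run: hdfs dfsadmin -safemode leave"
    else
        echo "NameNode is not in safe mode"
    fi
}
```

Forcing the NameNode out of safe mode is only appropriate once you know why it entered it, so treat the leave command as a last step rather than a default fix.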

OOZIE status check throws java.lang.NullPointerException

I am new to Oozie and am trying to write an Oozie workflow in CDH4.1.1. I started the Oozie service and then checked the status using this command:
sudo service oozie status
I got the message:
running
Then I tried this command for checking the status:
oozie admin --oozie http://localhost:11000/oozie status
And I got the below exception:
java.lang.NullPointerException
at java.io.Writer.write(Writer.java:140)
at org.apache.oozie.client.AuthOozieClient.writeAuthToken(AuthOozieClient.java:182)
at org.apache.oozie.client.AuthOozieClient.createConnection(AuthOozieClient.java:137)
at org.apache.oozie.client.OozieClient.validateWSVersion(OozieClient.java:243)
at org.apache.oozie.client.OozieClient.createURL(OozieClient.java:344)
at org.apache.oozie.client.OozieClient.access$000(OozieClient.java:76)
at org.apache.oozie.client.OozieClient$ClientCallable.call(OozieClient.java:410)
at org.apache.oozie.client.OozieClient.getSystemMode(OozieClient.java:1299)
at org.apache.oozie.cli.OozieCLI.adminCommand(OozieCLI.java:1323)
at org.apache.oozie.cli.OozieCLI.processCommand(OozieCLI.java:499)
at org.apache.oozie.cli.OozieCLI.run(OozieCLI.java:466)
at org.apache.oozie.cli.OozieCLI.main(OozieCLI.java:176)
null
Reading the exception stack, I am unable to figure out the reason for this exception. Please let me know why I got this exception and how to resolve this.
Try disabling the property USE_AUTH_TOKEN_CACHE_SYS_PROP in your cluster. Going by your stack trace and the client code, the exception is thrown while the client writes its cached authentication token.
Clusters are usually set up with Kerberos-based authentication. I am not sure whether you want to do that, but I wanted to mention it as an FYI.

Oozie Job getting Suspended and not reaching YARN

I am trying to start an Oozie shell action job via the CLI:
oozie job -config jobprops/jos.prioperties -run
The job starts, gives me a unique ID, and I can see the job in the Oozie UI.
However, the YARN console shows no submitted jobs, and on checking the Oozie log I get the following message:
Error starting action [folder-structure].
ErrorType [TRANSIENT], ErrorCode [JA009]
Message [JA009: Permission denied: user=vikas.r, access=WRITE, inode="/":hdfs:hadoop:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257).
The job finally goes to SUSPENDED state.
Why is the job trying to access "/"? How can this be resolved?
I am running as the Unix user vikas.r, with all folders in HDFS under /user/vikas.r
The error message is quite straightforward. Your Oozie job is trying to write something to / as the user vikas.r, who lacks permission to do so.

How can you resolve Oozie error JA009

I am running a simple Oozie workflow on Cloudera VM. The sub-workflow calls a shell script which sends a test email. However, I am getting the JA009 error:
(JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.).
I have already changed mapreduce.framework.name from yarn to classic in the following places:
/etc/oozie/conf/hadoop-conf/core-site.xml
/etc/oozie/conf/hadoop-config.xml
/etc/hadoop/conf/mapred-site.xml
Also, in /etc/hadoop/conf/hadoop-env.sh I changed:
"export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce"
to:
"export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce"
Is there anything I am missing? Yarn is not showing up in hadoop fs -ls /user (hive, pig, spark etc are). So I am assuming Yarn is not pre-installed here.

How can I check Oozie logs

My coordinator failed with Error: E0301 invalid resource [filename].
When I run hadoop fs -ls [filename], the file is listed.
How can I debug what is wrong? How can I check the log files?
oozie job -log requires a job ID, but in my case I don't have one. How can I see the logs in that case? I would appreciate any responses.
Thank you
If you are looking for a command line way to do this, you can run the following:
oozie job -oozie http://localhost:11000/oozie -info <wfid>
oozie job -oozie http://localhost:11000/oozie -log <wfid>
If you have $OOZIE_URL set, you do not need the -oozie option in the commands above. The first command shows the status of the job and of each action. The second command digs into the Oozie log and displays the part that pertains to the workflow ID that was passed in.
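A minimal sketch of the $OOZIE_URL approach, assuming the server is at the default local address (change it for your cluster). The function names are made up for illustration. Listing recent coordinator jobs is one way to recover a job ID when you did not note it at submission time, provided the job got far enough to be registered.

```shell
#!/bin/sh
# Set OOZIE_URL once so later commands can omit -oozie.
# The default address here is an assumption; keep any value already exported.
export OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Show the status and the log for a workflow ID (requires the oozie CLI).
show_job() {
    oozie job -info "$1"
    oozie job -log "$1"
}

# List the ten most recent coordinator jobs to look up a job ID.
list_coordinators() {
    oozie jobs -jobtype coordinator -len 10
}
```

After sourcing this, show_job <wfid> runs both commands from the answer without repeating the server URL.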
cd /var/log/oozie/
ls
You should see the log file there.
I highly recommend using the oozie webconsole when new to oozie. If you are using Cloudera it's under "Enabling the Oozie Web Console" here http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_17_6.html for CDH4. CDH3 link is similar.
Also the jobid is printed when you submit the job.
