Hive Oozie Error Handling

Does anyone have any suggestion on what is the best practice around Oozie Exception/Error handling?
We have Hive actions within Oozie workflows and find that errors are not logged with enough detail. We need the full stack trace and more context around each failure.
Any suggestions?
Thanx in advance...
Himanshu

Once the Oozie job is submitted, YARN is responsible for running the action through to completion as a MapReduce job. Check the logs in the MapReduce JobHistory Server after the job is submitted to YARN, or check the job logs in the Oozie web UI, which lists the error codes.
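For example, assuming the default Oozie port and that YARN log aggregation is enabled (the IDs are placeholders), something like:
oozie job -oozie http://localhost:11000/oozie -info <workflow-id>
oozie job -oozie http://localhost:11000/oozie -log <workflow-id>
yarn logs -applicationId <application-id>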

The logging level of the Hive action can be set in the Hive action configuration using the property oozie.hive.log.level. The default value is INFO.
You can change it to DEBUG by including it in the Hive action configuration of your workflow.xml:
<configuration>
<property>
<name>oozie.hive.log.level</name>
<value>DEBUG</value>
</property>
</configuration>
This log level is in turn passed on to log4j, I believe:
https://github.com/apache/oozie/blob/master/sharelib/hive/src/main/java/org/apache/oozie/action/hadoop/HiveMain.java
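For reference, a minimal sketch of where this block sits inside a Hive action in workflow.xml (the action name, schema version, and script name here are placeholders, not from the question):
<action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>oozie.hive.log.level</name>
                <value>DEBUG</value>
            </property>
        </configuration>
        <script>script.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>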

Related

Oozie on YARN - oozie is not allowed to impersonate hadoop

I'm trying to use Oozie from Java to start a job on a Hadoop cluster. I have very limited experience with Oozie on Hadoop 1, and now I'm struggling to do the same thing on YARN.
I'm given a machine that doesn't belong to the cluster, so when I try to start my job I get the following exception:
E0501 : E0501: Could not perform authorization operation, User: oozie is not allowed to impersonate hadoop
Why is that, and what should I do?
I read a bit about core-site.xml properties that need to be set:
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>users</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>master</value>
</property>
Does it seem that this is the problem? Should I contact the people responsible for the cluster to fix it?
Could there be problems because I'm using the same code for YARN as I did for Hadoop 1? Should something be changed? For example, I'm setting nameNode and jobTracker in workflow.xml. Should jobTracker still exist, given that there is now a ResourceManager? I have set the address of the ResourceManager but left the property name as jobTracker; could that be the error?
Maybe I should also mention that Ambari is used...
Hi, please update core-site.xml:
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
As for jobTracker: on YARN its value should be the ResourceManager address, so that is not the cause of the error. Once you update the core-site.xml file, it will work.
Reason:
This type of error occurs when you run the Oozie server as the hadoop user but define oozie as the proxy user in core-site.xml.
Solution:
Change the ownership of the Oozie installation directory to the oozie user and run the Oozie server as the oozie user, and the problem will be solved.
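For completeness, since the question mentions starting the job from Java: a minimal sketch using the Oozie client API, with the ResourceManager address in the jobTracker property (the URL, hosts, and paths are placeholders, not from the question):
import java.util.Properties;
import org.apache.oozie.client.OozieClient;

public class SubmitWorkflow {
    public static void main(String[] args) throws Exception {
        // All hosts and paths below are placeholders for your cluster.
        OozieClient client = new OozieClient("http://oozie-host:11000/oozie");
        Properties conf = client.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://master:8020/user/hadoop/app");
        conf.setProperty("nameNode", "hdfs://master:8020");
        // On YARN, the "jobTracker" property carries the ResourceManager address.
        conf.setProperty("jobTracker", "master:8032");
        String jobId = client.run(conf);
        System.out.println("Started workflow " + jobId);
    }
}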

Nutch 1.7 with Hadoop 2.6.0 "Wrong FS" Error

We have been trying to use Nutch 1.7 with Hadoop 2.6.0.
After installation, when we try to submit a job to Nutch, we receive the following error:
INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://master:9000/user/ubuntu/crawl/crawldb/436075385, expected: file:///
Job is submitted using the following command:
./crawl urls crawl_results 1
Also, we have checked that the fs.default.name setting in core-site.xml uses the hdfs protocol:
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
It happens when the crawl command is sent to Nutch, after it reads the input URLs from the file and attempts to insert the data into the crawl db.
Any insights would be appreciated.
Thanks in advance.
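One way to narrow this down: "expected: file:///" means the Hadoop Configuration visible to the job resolved the default filesystem to the local one, which usually means core-site.xml is not on the classpath the crawl script builds. A quick diagnostic sketch (the class name is made up):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // If core-site.xml is not on the classpath, this prints the
        // built-in default (file:///) instead of hdfs://master:9000.
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
        System.out.println("resolved to     = " + FileSystem.get(conf).getUri());
    }
}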

How to use JobClient in hadoop2(yarn)

(Solved) I want to contact a Hadoop cluster and get some job/task information.
In Hadoop 1, I was able to use JobClient (local pseudo-distributed mode, using Eclipse):
JobClient jobClient = new JobClient(new InetSocketAddress("127.0.0.1",9001),new JobConf(config));
JobID job_id = JobID.forName("job_xxxxxx");
RunningJob job = jobClient.getJob(job_id);
.....
Today I set up a pseudo-distributed Hadoop 2 YARN cluster; however, the above code doesn't work. I use the ResourceManager port (8032).
JobClient jobClient = new JobClient(new InetSocketAddress("127.0.0.1",8032),new JobConf(config));
This line gives exception:
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
I searched for this exception, but none of the solutions work. I use Eclipse, and I have added all the Hadoop jars, including hadoop-mapreduce-client-xxx. Also, I can successfully run the example programs on my cluster.
Any suggestions on how to use JobClient on hadoop2 yarn?
Update: I was able to solve this issue by compiling with the same Hadoop libraries as the ResourceManager server. In Eclipse it still gives this exception, but after I compiled and deployed my project it works fine (not sure why, as in Hadoop 1 it works in Eclipse). There is no need to change the API; JobClient still works in Hadoop 2.
Have you configured the mapred-site.xml file as follows? In Hadoop 2.x it is located in $HADOOP_HOME/etc/hadoop/.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
edit: Also make sure that your yarn-site.xml (same location) contains the following property:
<property>
<name>yarn.resourcemanager.address</name>
<value>host:port</value>
</property>
One last thing: I strongly advise you to work with hostnames instead of IPs. There are known cases of Hadoop failures when IPs are set in the configuration files.
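If you would rather keep everything in code than rely on the XML files, the same two settings can be applied to the Configuration directly before building the JobClient (a sketch; the ResourceManager host:port is a placeholder):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

public class JobInfo {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // Equivalent of mapred-site.xml / yarn-site.xml, set in code;
        // replace the host:port with your ResourceManager address.
        config.set("mapreduce.framework.name", "yarn");
        config.set("yarn.resourcemanager.address", "127.0.0.1:8032");
        JobClient jobClient = new JobClient(new JobConf(config));
        // Job id placeholder taken from the question; use a real id here.
        RunningJob job = jobClient.getJob(JobID.forName("job_xxxxxx"));
        System.out.println(job == null ? "job not found" : "state: " + job.getJobState());
    }
}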

Default Oozie options in cloudera

I'm using the latest Cloudera CDH4.
By default, all of Oozie's default parameters are in /etc/oozie/conf/oozie-default.xml.
I have changed oozie.service.CoordMaterializeTriggerService.lookup.interval to 30:
<property>
<name>oozie.service.CoordMaterializeTriggerService.lookup.interval</name>
<value>30</value>
</property>
Then the cluster was restarted.
But in the Hue UI, in the Oozie config, I still see:
oozie.service.CoordMaterializeTriggerService.lookup.interval 300
Why does this happen, and how can I change it?
You should override the property in /etc/oozie/conf/oozie-site.xml.
If using CM, you should put it in the Oozie Safety Valve.
And restart Oozie in both cases.

Getting OOZIE error E0900: Jobtracker [localhost:8021] not allowed, not in Oozies whitelist

I am trying to run the Oozie examples on the CDH virtual machine. I have Cloudera Manager running and I execute the following command:
oozie job -oozie http://localhost:11000/oozie -config examples/apps/map-reduce/job.properties -run
when I check the status I get the HadoopAccessorException.
I checked the oozie log and I see the following stack trace:
2013-07-22 14:25:56,179 WARN org.apache.oozie.command.wf.ActionStartXCommand:
USER[cloudera] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000001-130722142323751-oozie-oozi-W]
ACTION[0000001-130722142323751-oozie-oozi-W#mr-node] Error starting action [mr-node].
ErrorType [ERROR], ErrorCode [HadoopAccessorException],
Message [HadoopAccessorException: E0900: Jobtracker [localhost:8021] not allowed, not in Oozies whitelist]
org.apache.oozie.action.ActionExecutorException: HadoopAccessorException: E0900:
Jobtracker [localhost:8021] not allowed, not in Oozies whitelist
The oozie-site.xml and the oozie-default.xml have the oozie.service.HadoopAccessorService.jobTracker.whitelist and oozie.service.HadoopAccessorService.nameNode.whitelist set.
Any help would be appreciated.
Thanks.
Dave
I believe Cloudera Manager doesn't read your oozie-site.xml file and rather maintains its own config somewhere.
You should be able to go in the UI to Oozie Server Role, Processes, Configuration Files/Environment and click Show; this is where you can define the whitelists for your Oozie server, as opposed to just doing it in the files.
Once this is changed, restart Oozie and you should be able to execute your command.
I know I am very late on this, but someone looking for answers might find this helpful. I got a similar error. I went to the same location in the Cloudera Manager UI: Oozie Server Role, Processes, Configuration Files/Environment.
Then I clicked on the oozie-site.xml link and looked at the properties below:
<property>
<name>oozie.service.HadoopAccessorService.nameNode.whitelist</name>
<value>server1:8020,server2:8020,<name></value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.jobTracker.whitelist</name>
<value>server1:8032,server2:8032,yarnRM</value>
</property>
I used yarnRM as the jobtracker value in the workflow.xml file, and it got past the error when running the workflow.
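In other words, the job-tracker value in the workflow just has to match an entry in the whitelist. A sketch of the relevant part of workflow.xml (the host and mapper class are placeholders; the action body is abbreviated to the relevant elements):
<action name="mr-node">
    <map-reduce>
        <job-tracker>yarnRM</job-tracker>
        <name-node>hdfs://server1:8020</name-node>
        <configuration>
            <property>
                <name>mapred.mapper.class</name>
                <value>org.apache.oozie.example.SampleMapper</value>
            </property>
        </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
</action>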
