Oozie variable [user] cannot be resolved - hadoop

I'm trying to use Oozie's Hive action in Hue. My Hive script is very simple:
create table test.test_2 as
select * from test.test
This Oozie action has only 3 steps:
start
hive_query
end
My job.properties:
jobTracker=worker-1:8032
mapreduce.job.user.name=hue
nameNode=hdfs://batchlayer
oozie.use.system.libpath=true
oozie.wf.application.path=hdfs://batchlayer/user/hue/oozie/workspaces/_hue_-oozie-4-1425575226.04
user.name=hue
I added hive-site.xml twice: as a file and as job.xml. The Oozie action starts and stops on the second step. The job is 'accepted', but in the Hue console I get an error:
variable [user] cannot be resolved
I'm using Apache Oozie 4.2, Apache Hive 0.14 and Hue 3.7 (from GitHub).
UPDATE:
This is my workflow.xml:
bash-4.1$ bin/hdfs dfs -cat /user/hue/oozie/workspaces/*.04/work*
<workflow-app name="ccc" xmlns="uri:oozie:workflow:0.4">
<start to="ccc"/>
<action name="ccc">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>/user/hue/hive-site.xml</job-xml>
<script>/user/hue/hive_test.hql</script>
<file>/user/hue/hive-site.xml#hive-site.xml</file>
</hive>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>

I tried running a sample Hive action in Oozie following similar steps, and was able to resolve the error you are facing with the following steps:
1) Remove the extra addition of hive-site.xml (you currently add it both as <file> and as job.xml).
2) Add the following line to your job.properties:
oozie.libpath=${nameNode}/user/oozie/share/lib
3) Loosen the permissions on the hive-site.xml file kept in HDFS; you may have very restrictive permissions on it (in my case 500).
With this, both the 'variable [user] cannot be resolved' error and the subsequent errors got resolved.
Hope it helps.
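For reference, a minimal sketch of steps 2) and 3), assuming the share lib really lives under /user/oozie/share/lib and using the hive-site.xml path from the workflow above:

# 2) job.properties: point the workflow at the Oozie share lib
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.use.system.libpath=true

# 3) make hive-site.xml readable by the launcher user (it was mode 500 in my case)
hdfs dfs -chmod 644 /user/hue/hive-site.xml
hdfs dfs -ls /user/hue/hive-site.xml    # verify the new permissions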

This message can be really misleading. You should check the YARN logs and diagnostics.
In my case it was a configuration problem with reduce task and container memory: by mistake, the container memory limit was lower than the memory limit of a single reduce task. After looking into the YARN application logs I saw the true cause in the 'diagnostics' section, which was:
REDUCE capability required is more than the supported max container capability in the cluster. Killing the Job. reduceResourceRequest: <memory:8192, vCores:1> maxContainerCapability:<memory:5413, vCores:4>
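In other words, the reduce-task memory request must fit within YARN's maximum container size. A rough sketch of the two settings involved (these are the standard Hadoop property names; the values are only illustrative, not a recommendation):

<!-- mapred-site.xml: memory requested for each reduce task -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>

<!-- yarn-site.xml: largest container YARN will grant; must be >= the reduce request above -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>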
Regards

Related

Oozie workflow with spark application reports out of memory

I've tried to execute an Oozie workflow with a Spark program as its single step.
I've used a jar which executes successfully with spark-submit or spark-shell (the same code):
spark-submit --packages com.databricks:spark-csv_2.10:1.5.0 --master yarn-client --class "SimpleApp" /tmp/simple-project_2.10-1.1.jar
The application shouldn't demand a lot of resources: it loads a single CSV (<10MB) into Hive using Spark.
Spark version: 1.6.0
Oozie version: 4.1.0
The workflow was created with Hue's Oozie Workflow Editor:
<workflow-app name="Spark_test" xmlns="uri:oozie:workflow:0.5">
<start to="spark-589f"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="spark-589f">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapreduce.map.java.opts</name>
<value>-XX:MaxPermSize=2g</value>
</property>
</configuration>
<master>yarn</master>
<mode>client</mode>
<name>MySpark</name>
<jar>simple-project_2.10-1.1.jar</jar>
<spark-opts>--packages com.databricks:spark-csv_2.10:1.5.0</spark-opts>
<file>/user/spark/oozie/jobs/simple-project_2.10-1.1.jar#simple-project_2.10-1.1.jar</file>
</spark>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
I got the following logs after running the workflow:
stdout:
Invoking Spark class now >>>
Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exception invoking main(), PermGen space
stderr:
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Yarn application state monitor"
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exception invoking main(), PermGen space
syslog:
2017-03-14 12:31:19,939 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: PermGen space
Please suggest which configuration parameters should be increased.
You have at least 2 options here:
1) Increase the PermGen size for the launcher MR job by adding this to workflow.xml (see the sketch after this list for where it goes):
<property>
<name>oozie.launcher.mapreduce.map.java.opts</name>
<value>-XX:PermSize=512m -XX:MaxPermSize=512m</value>
</property>
see details here: http://www.openkb.info/2016/07/memory-allocation-for-oozie-launcher-job.html
2) The preferred way is to use Java 8 instead of the outdated Java 7 (PermGen was removed in Java 8).
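For option 1), a sketch of where that property sits in the workflow from the question; the oozie.launcher.* prefix is what makes it apply to the launcher job (the plain mapreduce.map.java.opts property from the question does not affect the launcher). Only the configuration block is new, the rest is copied from the question:

<action name="spark-589f">
    <spark xmlns="uri:oozie:spark-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- raises PermGen for the Oozie launcher mapper, where SparkMain runs in yarn-client mode -->
            <property>
                <name>oozie.launcher.mapreduce.map.java.opts</name>
                <value>-XX:PermSize=512m -XX:MaxPermSize=512m</value>
            </property>
        </configuration>
        <master>yarn</master>
        <mode>client</mode>
        <name>MySpark</name>
        <jar>simple-project_2.10-1.1.jar</jar>
        <spark-opts>--packages com.databricks:spark-csv_2.10:1.5.0</spark-opts>
        <file>/user/spark/oozie/jobs/simple-project_2.10-1.1.jar#simple-project_2.10-1.1.jar</file>
    </spark>
    <ok to="End"/>
    <error to="Kill"/>
</action>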
PermGen is a non-heap memory region used to store class metadata and string constants. It does not usually grow drastically unless there is runtime class loading via Class.forName() or additional third-party JARs.
If you get this error message as soon as you launch your application, then it means that the allocated permanent generation space is smaller than actually required by all the class files in your application.
"-XX:MaxPermSize=2g"
You already set 2 GB for PermGen memory. You can increase this value gradually, see which value no longer throws an OutOfMemoryError, and keep that value. You can also use a profiler to monitor the memory usage of the permanent generation and set the right value.
If this error is triggered at run time, then it might be due to runtime class loading or excessive creation of string constants in the permanent generation. Fixing it requires profiling your application and setting the right value for the -XX:MaxPermSize parameter.
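One lightweight way to watch the permanent generation without a full profiler, assuming a JDK 7 is installed and <pid> is the process id of the JVM you care about (a hypothetical placeholder here):

# PC = perm capacity (KB), PU = perm used (KB); sample every 5 seconds
jstat -gc <pid> 5000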

Oozie workflow fails - Mkdirs failed to create file

I am using an Oozie workflow to run a pyspark script, and I'm running into an error I can't figure out.
When running the workflow (either locally or on YARN), a MapReduce job is run before Spark starts. After a few minutes the task fails (before the Spark action), and digging through the logs shows the following error:
java.io.IOException: Mkdirs failed to create file:/home/oozie/oozie-oozi/0000011-160222043656138-oozie-oozi-W/bulk-load-node--spark/output/_temporary/1/_temporary/attempt_1456129482428_0003_m_000000_2 (exists=false, cwd=file:/hadoop/yarn/local/usercache/root/appcache/application_1456129482428_0003/container_e68_1456129482428_0003_01_000004)
(Apologies for the length)
There are no other evident errors. I do not directly create this folder (I assume given the name that it is used for temporary storage of MapReduce jobs). I can create this folder from the command line using mkdir -p /home/oozie/blah.... It doesn't appear to be a permissions issue, as setting that folder to 777 made no difference. I have also added default ACLs for oozie, yarn and mapred users for that folder, so I've pretty much ruled out permission issues. It's also worth noting that the working directory listed in the error does not exist after the job fails.
After some Googling I saw that a similar problem is common on Mac systems, but I'm running on CentOS. I am running the HDP 2.3 VM Sandbox, which is a single node 'cluster'.
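For reference, the checks described above look roughly like this, using the base directory from the error message (a sketch of what was already tried, not a fix):

# create the base path and relax its permissions
mkdir -p /home/oozie/oozie-oozi
chmod -R 777 /home/oozie/oozie-oozi
# default ACLs for the oozie, yarn and mapred users
setfacl -R -d -m u:oozie:rwx,u:yarn:rwx,u:mapred:rwx /home/oozie/oozie-oozi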
My workflow.xml is as follows:
<workflow-app xmlns='uri:oozie:workflow:0.4' name='SparkBulkLoad'>
<start to = 'bulk-load-node'/>
<action name = 'bulk-load-node'>
<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn</master>
<mode>client</mode>
<name>BulkLoader</name>
<jar>file:///test/BulkLoader.py</jar>
<spark-opts>
--num-executors 3 --executor-cores 1 --executor-memory 512m --driver-memory 512m\
</spark-opts>
</spark>
<ok to = 'end'/>
<error to = 'fail'/>
</action>
<kill name = 'fail'>
<message>
Error occurred while bulk loading files
</message>
</kill>
<end name = 'end'/>
</workflow-app>
and job.properties is as follows:
nameNode=hdfs://192.168.26.130:8020
jobTracker=http://192.168.26.130:8050
queueName=spark
oozie.use.system.libpath=true
oozie.wf.application.path=file:///test/workflow.xml
If necessary I can post any other parts of the stack trace. I appreciate any help.
Update 1
After checking my Spark History Server, I can confirm that the actual Spark action is not starting: no new Spark apps are being submitted.

Not able to run a shell script with Oozie

Hi, I am trying to run a shell script through Oozie. While running the shell script I am getting the following error:
org.apache.oozie.action.hadoop.ShellMain], exit code [1]
My job.properties file:
nameNode=hdfs://ip-172-31-41-199.us-west-2.compute.internal:8020
jobTracker=ip-172-31-41-199.us-west-2.compute.internal:8032
queueName=default
oozie.libpath=${nameNode}/user/oozie/share/lib/
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true
oozieProjectRoot=shell_example
oozie.wf.application.path=${nameNode}/user/karun/${oozieProjectRoot}/apps/shell
My workflow.xml:
<workflow-app xmlns="uri:oozie:workflow:0.1" name="pi.R example">
<start to="shell-node"/>
<action name="shell-node">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>script.sh</exec>
<file>/user/karun/oozie-oozi/script.sh#script.sh</file>
<capture-output/>
</shell>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Incorrect output</message>
</kill>
<end name="end"/>
</workflow-app>
My shell script (script.sh):
export SPARK_HOME=/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/spark
export YARN_CONF_DIR=/etc/hadoop/conf
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export HADOOP_CMD=/usr/bin/hadoop
/SparkR-pkg/lib/SparkR/sparkR-submit --master yarn-client examples/pi.R yarn-client 4
Error log file:
WEBHCAT_DEFAULT_XML=/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/etc/hive-webhcat/conf.dist/webhcat-default.xml:
CDH_KMS_HOME=/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/hadoop-kms:
LANG=en_US.UTF-8:
HADOOP_MAPRED_HOME=/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/hadoop-mapreduce:
=================================================================
Invoking Shell command line now >>
Stdoutput Running /opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/spark/bin/spark-submit --class edu.berkeley.cs.amplab.sparkr.SparkRRunner --files hdfs://ip-172-31-41-199.us-west-2.compute.internal:8020/user/karun/examples/pi.R --master yarn-client /SparkR-pkg/lib/SparkR/sparkr-assembly-0.1.jar hdfs://ip-172-31-41-199.us-west-2.compute.internal:8020/user/karun/examples/pi.R yarn-client 4
Stdoutput Fatal error: cannot open file 'pi.R': No such file or directory
Exit code of the Shell command 2
<<< Invocation of Shell command completed <<<
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://ip-172-31-41-199.us-west-2.compute.internal:8020/user/karun/oozie-oozi/0000035-150722003725443-oozie-oozi-W/shell-node--shell/action-data.seq
Oozie Launcher ends
I don't know how to solve the issue. Any help will be appreciated.
sparkR-submit ... examples/pi.R ...
Fatal error: cannot open file 'pi.R': No such file or directory
The message is really explicit: your shell tries to read an R script from the local filesystem. But local to what, actually?
Oozie uses YARN to run your shell, so YARN allocates a container on a random machine. It's something you must put into your head so that it becomes a reflex: all resources required by an Oozie action (scripts, libraries, config files, whatever) must be
- available in HDFS beforehand
- downloaded at execution time thanks to <file> instructions in the Oozie script
- accessed as local files in the current working directory
In your case:
<exec>script.sh</exec>
<file>/user/karun/oozie-oozi/script.sh</file>
<file>/user/karun/some/place/pi.R</file>
Then
sparkR-submit ... pi.R ...
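To make those <file> entries resolve, the script has to exist in HDFS first; a minimal sketch, where /user/karun/some/place is just the placeholder path used above:

# upload the R script next to the other workflow resources
hdfs dfs -mkdir -p /user/karun/some/place
hdfs dfs -put -f examples/pi.R /user/karun/some/place/pi.R
# at run time the <file> instruction downloads it into the container's working directory,
# so the shell can refer to it simply as pi.R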

Sqoop - Hive import using Oozie failed

I am trying to execute a Sqoop import from Oracle to Hive, but the job fails with the error:
WARN [main] conf.HiveConf (HiveConf.java:initialize(2472)) - HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
Oozie Launcher failed, finishing Hadoop job gracefully
I have all the jar files in place.
hive-site.xml is also in place with the Hive metastore configuration:
<property>
<name>hive.metastore.uris</name>
<value>thrift://sv2lxgsed01.xxxx.com:9083</value>
</property>
I am able to run a Sqoop import (using Oozie) to HDFS successfully.
I am also able to execute a Hive script (using Oozie) successfully.
I can also execute the Sqoop-to-Hive import from the command line, but the same command fails when I execute it using Oozie.
My workflow.xml is as below:
<workflow-app name="WorkflowWithSqoopAction" xmlns="uri:oozie:workflow:0.1">
<start to="sqoopAction"/>
<action name="sqoopAction">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<command>import --connect jdbc:oracle:thin:#//sv2axcrmdbdi301.xxx.com:1521/DI3CRM --username xxxxxxx --password xxxxxx --table SIEBEL.S_ORG_EXT --hive-table eg.EQX_EG_CRM_S_ORG_EXT --hive-import -m1</command>
<file>/user/oozie/oozieProject/workflowSqoopAction/hive-site.xml</file>
</sqoop>
<ok to="end"/>
<error to="killJob"/>
</action>
<kill name="killJob">
<message>"Killed job due to error: ${wf:errorMessage(wf:lastErrorNode())}"</message>
</kill>
<end name="end" />
</workflow-app>
I can also find the data being loaded in HDFS.
You need to do 2 things:
1) Copy hive-site.xml into the Oozie workflow directory.
2) In your Sqoop action (the one doing the Hive import), tell Oozie to use that hive-site.xml.
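A minimal sketch of what that can look like, assuming hive-site.xml has been copied next to workflow.xml in the application directory (a relative <file> path resolves against the workflow directory; the command is abbreviated from the question):

<sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <command>import --connect ... --hive-import -m1</command>
    <!-- hive-site.xml sitting in the workflow directory, shipped to the task's
         working directory under the name after the # -->
    <file>hive-site.xml#hive-site.xml</file>
</sqoop>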

Do I need to provide configuration in workflow.xml and job.properties in oozie?

I'm trying to run a job that looks like this (workflow.xml):
<workflow-app name="FirstWorkFlow" xmlns="uri:oozie:workflow:0.2">
<start to="FirstJob"/>
<action name="FirstJob">
<pig>
<job-tracker>hadoop1:50300</job-tracker>
<name-node>hdfs://hadoop1:8020</name-node>
<script>lib/FirstScript.pig</script>
</pig>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end" />
</workflow-app>
FirstScript:
dual = LOAD 'default.dual' USING org.apache.hcatalog.pig.HCatLoader();
store dual into '/user/oozie/dummy_file.txt' using PigStorage();
job.properties:
nameNode=hdfs://hadoop1:8020
jobTracker=hadoop1:50300
oozie.wf.application.path=/user/oozie/FirstScript
oozie.use.system.libpath=true
My question is: do I need to provide the nameNode and jobTracker configuration both in job.properties and workflow.xml?
I'm quite confused, because no matter whether I set these parameters or not I get this error (from the Hue interface):
E0902: Exception occured: [Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]
Regards
Pawel
First, to answer your question about job.properties: it is used to parametrize the workflow (the variables in the flow are replaced with the values specified in job.properties). So you can set the job tracker and namenode in job.properties and reference them as variables in workflow.xml, or you can set them directly in workflow.xml.
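For illustration, a sketch of the parametrized form using the values from your files (whether those values are correct is a separate question, addressed below):

# job.properties
nameNode=hdfs://hadoop1:8020
jobTracker=hadoop1:50300

<!-- workflow.xml: the action references the variables instead of hard-coded values -->
<pig>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <script>lib/FirstScript.pig</script>
</pig>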
Are you sure that your Job Tracker's port is 50300? It seems suspicious: normally the job tracker's web UI is accessible at http://ip:50030, but that is not the port you are supposed to use for this configuration; for a Hadoop job configuration, the job tracker port is usually 8021, 9001, or 8012.
So it seems your problem is setting the correct job tracker and name node (as opposed to setting them in the correct place). Check your Hadoop settings in mapred-site.xml and core-site.xml for the correct ports and IPs. Alternatively, you can simply SSH to the machines running your Hadoop nodes, run netstat -plnt, and look for the listening ports.
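A quick sketch of that check, assuming the client configuration lives under /etc/hadoop/conf (the property names are the classic MR1-era ones this setup uses; adjust the path for your distribution):

# what the cluster is configured with
grep -A1 "fs.default.name" /etc/hadoop/conf/core-site.xml
grep -A1 "mapred.job.tracker" /etc/hadoop/conf/mapred-site.xml
# what is actually listening on the node
netstat -plnt | grep -E '8020|8021|9001|50300'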
I see a difference in the ports you have specified for the namenode and jobtracker. Just check what you have configured in mapred-site.xml and core-site.xml and put in the appropriate ports.
It might also be that the hadoop1 hostname is not being resolved. Try using the IP address of the server, or add hadoop1 to your /etc/hosts file.
You define the properties file so that the workflow can be parametrized.
Try port 9000, which is the default. Otherwise we need to see the Hadoop configuration files.
