LeaseExpiredException while running Oozie fork - Hadoop

We are trying to run an Oozie workflow with three sub-workflows running in parallel using a fork. Each sub-workflow contains a node running a native MapReduce job and two subsequent nodes running some complex Pig jobs. Finally, the three sub-workflows are joined into a single end node.
When we run this workflow, we get a LeaseExpiredException. The exception occurs at random points while the Pig jobs are running; there is no definite place where it occurs, but it happens every time we run the workflow.
Also, if we remove the fork and run the sub-workflows sequentially, everything works fine. However, our expectation is to have them run in parallel and save some execution time.
Can you please help me understand this issue and give some pointers on where we could be going wrong? We are just starting with Hadoop development and haven't faced such an issue before.
It looks like, with several tasks running in parallel, one of the threads closed a part file, and when another thread tried to close the same file, the error was thrown.
Following is the stack trace of the exception from the Hadoop logs.
2013-02-19 10:23:54,815 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher: 57% complete
2013-02-19 10:26:55,361 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher: 59% complete
2013-02-19 10:27:59,666 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file <hdfspath>/oozie-oozi/0000105-130218000850190-oozie-oozi-W/aggregateData--pig/output/_temporary/_attempt_201302180007_0380_m_000000_0/part-00000 : org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on <hdfspath>/oozie-oozi/0000105-130218000850190-oozie-oozi-W/aggregateData--pig/output/_temporary/_attempt_201302180007_0380_m_000000_0/part-00000 File does not exist. Holder DFSClient_attempt_201302180007_0380_m_000000_0 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1664)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1655)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1710)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1698)
at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:793)
at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1439)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1435)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1433)
Following are samples of the main workflow and one sub-workflow.
Main Workflow:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="MainProcess">
<start to="forkProcessMain"/>
<fork name="forkProcessMain">
<path start="Proc1"/>
<path start="Proc2"/>
<path start="Proc3"/>
</fork>
<join name="joinProcessMain" to="end"/>
<action name="Proc1">
<sub-workflow>
<app-path>${nameNode}${wfPath}/proc1_workflow.xml</app-path>
<propagate-configuration/>
</sub-workflow>
<ok to="joinProcessMain"/>
<error to="fail"/>
</action>
<action name="Proc2">
<sub-workflow>
<app-path>${nameNode}${wfPath}/proc2_workflow.xml</app-path>
<propagate-configuration/>
</sub-workflow>
<ok to="joinProcessMain"/>
<error to="fail"/>
</action>
<action name="Proc3">
<sub-workflow>
<app-path>${nameNode}${wfPath}/proc3_workflow.xml</app-path>
<propagate-configuration/>
</sub-workflow>
<ok to="joinProcessMain"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>WF Failure, '${wf:lastErrorNode()}' failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
Sub-Workflow:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="Sub Process">
<start to="Step1"/>
<action name="Step1">
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${step1JoinOutputPath}"/>
</prepare>
<configuration>
<property>
<name>mapred.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<main-class>com.absd.mr.step1</main-class>
<arg>${wf:name()}</arg>
<arg>${wf:id()}</arg>
<arg>${tbMasterDataOutputPath}</arg>
<arg>${step1JoinOutputPath}</arg>
<arg>${tbQueryKeyPath}</arg>
<capture-output/>
</java>
<ok to="generateValidQueryKeys"/>
<error to="fail"/>
</action>
<action name="generateValidQueryKeys">
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${tbValidQuerysOutputPath}"/>
</prepare>
<configuration>
<property>
<name>pig.tmpfilecompression</name>
<value>true</value>
</property>
<property>
<name>pig.tmpfilecompression.codec</name>
<value>lzo</value>
</property>
<property>
<name>pig.output.map.compression</name>
<value>true</value>
</property>
<property>
<name>pig.output.map.compression.codec</name>
<value>lzo</value>
</property>
<property>
<name>pig.output.compression</name>
<value>true</value>
</property>
<property>
<name>pig.output.compression.codec</name>
<value>lzo</value>
</property>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
</configuration>
<script>${pigDir}/tb_calc_valid_accounts.pig</script>
<param>csvFilesDir=${csvFilesDir}</param>
<param>step1JoinOutputPath=${step1JoinOutputPath}</param>
<param>tbValidQuerysOutputPath=${tbValidQuerysOutputPath}</param>
<param>piMinFAs=${piMinFAs}</param>
<param>piMinAccounts=${piMinAccounts}</param>
<param>parallel=80</param>
</pig>
<ok to="aggregateAumData"/>
<error to="fail"/>
</action>
<action name="aggregateAumData">
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${tbCacheDataPath}"/>
</prepare>
<configuration>
<property>
<name>pig.tmpfilecompression</name>
<value>true</value>
</property>
<property>
<name>pig.tmpfilecompression.codec</name>
<value>lzo</value>
</property>
<property>
<name>pig.output.map.compression</name>
<value>true</value>
</property>
<property>
<name>pig.output.map.compression.codec</name>
<value>lzo</value>
</property>
<property>
<name>pig.output.compression</name>
<value>true</value>
</property>
<property>
<name>pig.output.compression.codec</name>
<value>lzo</value>
</property>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
</configuration>
<script>${pigDir}/aggregationLogic.pig</script>
<param>csvFilesDir=${csvFilesDir}</param>
<param>tbValidQuerysOutputPath=${tbValidQuerysOutputPath}</param>
<param>tbCacheDataPath=${tbCacheDataPath}</param>
<param>currDate=${date}</param>
<param>udfJarPath=${nameNode}${wfPath}/lib</param>
<param>parallel=150</param>
</pig>
<ok to="loadDataToDB"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>WF Failure, '${wf:lastErrorNode()}' failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>

We got the same error when we were running three Pig actions in parallel and one of them failed. That error message is a consequence of an unexpected workflow stop: one action failed, the workflow was stopped, and the other actions were still trying to continue. You must look at the failed action, the one with status ERROR, to know what happened; don't look at the actions with status KILLED. Running oozie job -info <workflow-id> lists each action with its status and error code, which makes the failed one easy to spot.

Related

Oozie over Hive to fetch data from a table

I am trying to do automation through Oozie over Hive. I wrote a simple Hive script that creates a table and runs select queries on that table. When I submit the script, it goes into running mode but doesn't execute. I checked yarn application -list; the job was hung at 95%. The Hive table had been created successfully, but I am not able to fetch data from it. Please let me know how to resolve this problem.
Thanks in Advance.
Workflow.xml
<action name="hive2-node">
<hive2 xmlns="uri:oozie:hive2-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/hive2"/>
<mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<jdbc-url>${jdbcURL}</jdbc-url>
<script>script.q</script>
<param>INPUT=/user/${wf:user()}/${examplesRoot}/input-data/table</param>
<param>OUTPUT=/user/${wf:user()}/${examplesRoot}/output-data/hive2</param>
</hive2>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Hive2 (Beeline) action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
script.q
job.properties
nameNode=hdfs://...:8020
jobTracker=...:8050
queueName=default
jdbcURL=jdbc:hive2://..*.:10000/default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/hive2

How to pick Dynamic File Name from HDFS while inserting into Hive Table

I have a Hive Table.
Now I need to write a workflow where every day the job will search for a file in a location -
/data/data_YYYY-mm-dd.csv
like
/data/data_2015-07-07.csv
/data/data_2015-07-08.csv
...
So each day the workflow will automatically pick up the file name and load the data into the Hive table (MyTable).
I am writing the load script as below -
LOAD DATA INPATH "/data/${filepath}" OVERWRITE INTO TABLE MyTable;
Now, while running this as a plain Hive job I can set filepath to data_2015-07-07.csv, but how do I do that in the Oozie coordinator so that it automatically picks the path named after the date?
I tried to set the workflow parameter from the Oozie coordinator -
clicklog_${YYYY}-{MONTH}-{DAY}.csv
Well, after checking through the Oozie coordinator documentation, I found the solution.
It's simple and straightforward: whatever configuration you have already added in the Hive workflow will be ignored, and the Oozie coordinator will fill it in.
So my Hive workflow was -
<workflow-app name="Workflow__" xmlns="uri:oozie:workflow:0.5">
<start to="hive-cfc5"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="hive-cfc5">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>/user/hive-site.xml</job-xml>
<script>/user/sub/create.hql</script>
</hive>
<ok to="hive-2ade"/>
<error to="Kill"/>
</action>
<action name="hive-2ade">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>/user/hive-site.xml</job-xml>
<script>/user/sub/load_query.hql</script>
<param>filepath=test_2015-06-26.csv</param>
</hive>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
Now I scheduled the same workflow in my Oozie coordinator,
simply by setting the filepath parameter -
test_${YYYY}-{MONTH}-{DAY}.csv
<coordinator-app name="My_Coordinator"
frequency="*/60 * * * *"
start="${start_date}" end="${end_date}" timezone="America/Los_Angeles"
xmlns="uri:oozie:coordinator:0.2"
>
<controls>
<execution>FIFO</execution>
</controls>
<action>
<workflow>
<app-path>${wf_application_path}</app-path>
<configuration>
<property>
<name>filepath</name>
<value>test_${YYYY}-{MONTH}-{DAY}.csv</value>
</property>
<property>
<name>oozie.use.system.libpath</name>
<value>True</value>
</property>
<property>
<name>start_date</name>
<value>2015-07-07T14:50Z</value>
</property>
<property>
<name>end_date</name>
<value>2015-07-14T07:23Z</value>
</property>
</configuration>
</workflow>
</action>
</coordinator-app>
and then I used a cron schedule (*/60 * * * *) to run it every 60 minutes and check whether a file matching the above pattern is available or not.
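If the ${YYYY}-{MONTH}-{DAY} placeholders are not resolved in your environment, a commonly used alternative (not what the answer above relies on) is to build the file name from the coordinator's nominal time with Oozie's coordinator EL functions. A minimal sketch of that property, assuming the same test_ prefix:
<property>
<name>filepath</name>
<!-- hypothetical alternative: derive the date from the coordinator's nominal time -->
<value>test_${coord:formatTime(coord:nominalTime(), 'yyyy-MM-dd')}.csv</value>
</property>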

Oozie flow not able to run hive-ql having UDF add commands

I am creating an Oozie workflow where I am calling Hive SQL scripts sequentially.
The first script has simple transformation logic, while the second has a temporary function creation command and commands to add lookup files. I am using this UDF further on in the SQL.
ADD JAR **;
CREATE TEMPORARY FUNCTION XXXXX AS ...;
ADD FILE *;
<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-wf">
<credentials>
<credential name="hive_credentials" type="hcat">
<property>
<name>hcat.metastore.uri</name>
<value>XXXXXXXX</value>
</property>
<property>
<name>hcat.metastore.principal</name>
<value>XXXXXXXX</value>
</property>
</credential>
</credentials>
<start to="hive-1" />
<action name="hive-1" cred="hive_credentials">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>XXXXXXX</job-tracker>
<name-node>XXXXXXX</name-node>
<job-xml>/XXXXXX/oozie/oozie-hive-site.xml</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<script>/XXXXXXX/hive_1.sql</script>
</hive>
<ok to="hive-2" />
<error to="fail" />
</action>
<action name="hive-2" cred="hive_credentials">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>XXXXXXXX</job-tracker>
<name-node>XXXXXXXX</name-node>
<job-xml>/XXXXXX/oozie/oozie-hive-site.xml</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<script>/XXXXXXX/hive_2.sql</script>
</hive>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end" />
The first HQL script executes successfully. The workflow is killed while executing the second HQL script, giving the error below.
JOB[0000044-140317190624992-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
It throws the error while executing the commands that add the UDF (ADD JAR, CREATE TEMPORARY FUNCTION, ADD FILE).
I searched for this error and found some links suggesting the error be ignored!
But my actual SQL, which uses the Hive UDF and is given in the second HQL script, is not executed.
Can you please help?
The ADD JAR path is a local machine path. Oozie actions are run on data nodes, so there is a possibility that it cannot find the jar on the data node and hence gives an error (check the MapReduce logs; you will find the reason there).
If this is the issue, one workaround is to put the file in HDFS and copy it to the local filesystem on the data node during execution.
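One way to do that copy, sketched under the assumption that the UDF jar has been uploaded to an HDFS path such as /XXXXXX/oozie/lib/myudf.jar (a placeholder name), is to ship it with the action via a <file> element, so Oozie places it in the action's working directory on the data node:
<action name="hive-2" cred="hive_credentials">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>XXXXXXXX</job-tracker>
<name-node>XXXXXXXX</name-node>
<job-xml>/XXXXXX/oozie/oozie-hive-site.xml</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<script>/XXXXXXX/hive_2.sql</script>
<!-- ship the UDF jar from HDFS into the action's working directory -->
<file>/XXXXXX/oozie/lib/myudf.jar#myudf.jar</file>
</hive>
<ok to="end" />
<error to="fail" />
</action>
The script can then reference the jar with a relative path (ADD JAR myudf.jar;) instead of a local machine path; the same approach works for the lookup files added with ADD FILE.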

Executing a MapReduce job using an Oozie workflow in Hue gives wrong output

I'm trying to execute a MapReduce job using an Oozie workflow in Hue. When I submit the job, Oozie executes it successfully but I don't get the expected output. It seems that either the mapper or the reducer is never invoked. Here is my workflow.xml:
<workflow-app name="wordCount" xmlns="uri:oozie:workflow:0.4">
<start to="wordcount"/>
<action name="wordcount">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.input.dir</name>
<value>/user/root/jane/inputPath</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/user/root/jane/outputPath17</value>
</property>
<property>
<name>mapred.mapper.class</name>
<value>MapReduceGenerateReports.Map</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>MapReduceGenerateReports.Reduce</value>
</property>
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
</configuration>
</map-reduce>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
Can anyone please tell me what the problem is?
My new workflow.xml:
<workflow-app name="wordCount" xmlns="uri:oozie:workflow:0.4">
<start to="wordcount"/>
<action name="wordcount">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.input.dir</name>
<value>/user/root/jane/inputPath</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/user/root/jane/outputPath3</value>
</property>
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.class</name>
<value>MapReduceGenerateReports$Map</value>
</property>
<property>
<name>mapreduce.reduce.class</name>
<value>MapReduceGenerateReports$Reduce</value>
</property>
<property>
<name>mapred.output.key.class</name>
<value>org.apache.hadoop.io.LongWritable</value>
</property>
<property>
<name>mapred.output.value.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
</configuration>
</map-reduce>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
jobtracker log:
1)
Kind      % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
map       100.00%      1           0         0         1          0        0 / 0
reduce    100.00%      0           0         0         0          0        0 / 0
2)
Kind Total Tasks(successful+failed+killed) Successful tasks Failed tasks Killed tasks Start Time Finish Time
Setup 1 1 0 0 5-Apr-2014 18:36:22 5-Apr-2014 18:36:23 (1sec)
Map 1 1 0 0 5-Apr-2014 18:33:27 5-Apr-2014 18:33:33 (5sec)
Reduce 0 0 0 0
Cleanup 1 1 0 0 5-Apr-2014 18:33:33 5-Apr-2014 18:33:37 (4sec)
Check out the instructions for using the new API here.
However, if you really need to run MapReduce jobs written using the new (Hadoop 0.20) API in Oozie, below are the changes you need to make in workflow.xml:
change mapred.mapper.class to mapreduce.map.class
change mapred.reducer.class to mapreduce.reduce.class
add mapred.output.key.class
add mapred.output.value.class
and include the following property in the MR action configuration:
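The property in question is presumably the new-API switch that also appears in the corrected workflow above, i.e. something like:
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>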

Submitting applications externally via REST APIs

Is there currently a way to submit applications externally via the supplied REST APIs for MapReduceV1 and/or YARN? I'm hoping to find a way to do this without adding a custom service.
So far I've only figured out how to GET the application status from the ResourceManager using YARN.
Maybe I'm looking at this the wrong way and there's a better way to do this externally?
So after doing some research, I've decided that the Oozie Workflow Scheduler is the way to go.
This is a sample workflow that can be submitted to a REST endpoint running inside your Hadoop system to start a MapReduce job. <action>s are not limited to MapReduce.
<workflow-app xmlns='uri:oozie:workflow:0.1' name='map-reduce-wf'>
<start to='hadoop1' />
<action name='hadoop1'>
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.mapper.class</name>
<value>org.apache.oozie.example.SampleMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>org.apache.oozie.example.SampleReducer</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>input-data</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>output-map-reduce</value>
</property>
<property>
<name>mapred.job.queue.name</name>
<value>unfunded</value>
</property>
</configuration>
</map-reduce>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end' />
</workflow-app>
Sample taken from https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
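Oozie's Web Services (REST) API handles the actual submission: the workflow above is deployed to HDFS, and a small XML configuration pointing at it is POSTed to the jobs endpoint (http://<oozie-host>:11000/oozie/v1/jobs?action=start, Content-Type application/xml;charset=UTF-8). A minimal sketch of such a payload, using placeholder host names and paths:
<configuration>
<property>
<name>user.name</name>
<value>someuser</value>
</property>
<property>
<name>oozie.wf.application.path</name>
<value>hdfs://namenode:8020/user/someuser/apps/map-reduce-wf</value>
</property>
<property>
<name>jobTracker</name>
<value>jobtracker-host:8021</value>
</property>
<property>
<name>nameNode</name>
<value>hdfs://namenode:8020</value>
</property>
</configuration>
The response should contain the new job's ID, which can then be polled for status with GET requests against the same API.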
