Oozie Workflow and Coordinator - hadoop

I have two properties files, one for the workflow and one for the coordinator:
./job.properties and ./coordinator/job.properties
The two files are identical, except that the coordinator one sets a few additional variables, shown below:
coordstartTime=2013-04-08T18:40Z
coordendTime=2020-04-08T18:40Z
coordTimeZone=GMT
oozie.coord.application.path=${workflowRoot}/coordinator
wfPath=${workflowRoot}/workflow-master.xml
Everything is fine when I run the workflow, but I get an error when I run the coordinator:
Error: E0301 : E0301: Invalid resource [filename]
That filename exists, and when I do hadoop fs -ls [filename] it is listed.
What am I doing wrong here?
thanks

The problem was that both oozie.wf.application.path and oozie.coord.application.path existed in the coordinator properties file. I removed oozie.wf.application.path and the coordinator worked.
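For reference, a minimal sketch of what the corrected coordinator/job.properties could look like, reusing the variable names from the question (shared values such as nameNode and workflowRoot are assumed to be defined in the common part of the file):

# coordinator/job.properties (sketch; shared values such as nameNode and workflowRoot assumed to be set above)
coordstartTime=2013-04-08T18:40Z
coordendTime=2020-04-08T18:40Z
coordTimeZone=GMT
oozie.coord.application.path=${workflowRoot}/coordinator
wfPath=${workflowRoot}/workflow-master.xml
# no oozie.wf.application.path here; the coordinator.xml references ${wfPath} instead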
thanks

Related

Could not find or load main class hdfs problem

I am trying to use Apache Rya for some tests (https://rya.apache.org/).
For those who are familiar with Rya and RDF stores, I am trying to do a bulk loading which is explained here: https://github.com/apache/rya/blob/master/extras/rya.manual/src/site/markdown/loaddata.md.
Briefly, I should copy a jar file 'mapreduce/target/rya.mapreduce-<version>-shaded.jar' into an HDFS volume and then run the following command:
hadoop hdfs://volume/rya.mapreduce-<version>-shaded.jar org.apache.rya.accumulo.mr.tools.RdfFileInputTool -Dac.zk=localhost:2181 -Dac.instance=accumulo -Dac.username=root -Dac.pwd=secret -Drdf.tablePrefix=rya_ -Drdf.format=N-Triples hdfs://volume/dir1,hdfs://volume/dir2,hdfs://volume/file1.nt
Well, I copied the needed jar and the input files into HDFS with the bin/hadoop fs -put command and verified that they are really there. My problem is that when I run the command from the official example, I get the following error lines, which I could not understand or resolve.
/project/hadoop/libexec/hadoop-functions.sh: line 2393: HADOOP_HDFS://LOCALHOST:9000/USER/RYA.MAPREDUCE-4.0.0-INCUBATING-SHADED.JAR_USER: invalid variable name
/project/hadoop/libexec/hadoop-functions.sh: line 2358: HADOOP_HDFS://LOCALHOST:9000/USER/RYA.MAPREDUCE-4.0.0-INCUBATING-SHADED.JAR_USER: invalid variable name
/project/hadoop/libexec/hadoop-functions.sh: line 2453: HADOOP_HDFS://LOCALHOST:9000/USER/RYA.MAPREDUCE-4.0.0-INCUBATING-SHADED.JAR_OPTS: invalid variable name
Error: Could not find or load main class hdfs:..localhost:9000.user.rya.mapreduce-4.0.0-incubating-shaded.jar
For information: all environment variables (HADOOP_HOME and HADOOP_PREFIX) are properly set.
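The thread does not include an answer, but the hadoop-functions.sh messages suggest that the hadoop script is treating the jar path as a subcommand name. A likely fix (an assumption, not confirmed in the thread) is to run the tool through the jar subcommand with a locally available copy of the shaded jar, for example:

# sketch: invoke the tool via 'hadoop jar' using a local copy of the shaded jar
# (the local jar path below is illustrative)
hadoop jar mapreduce/target/rya.mapreduce-4.0.0-incubating-shaded.jar \
  org.apache.rya.accumulo.mr.tools.RdfFileInputTool \
  -Dac.zk=localhost:2181 -Dac.instance=accumulo -Dac.username=root -Dac.pwd=secret \
  -Drdf.tablePrefix=rya_ -Drdf.format=N-Triples \
  hdfs://volume/dir1,hdfs://volume/dir2,hdfs://volume/file1.nt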

Hadoop copy from cluster to cluster fails due to "Mismatch in length of source"

I want to copy data from one cluster to another. I use this command:
hadoop distcp hdfs://SOURCE-NAMENODE:9000/dir/ hdfs://DESTINATION-NAMENODE:9000/
And I get this message:
18/04/11 12:05:37 INFO mapred.CopyMapper: Copying hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 to hdfs://DESTINATION-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108
18/04/11 12:05:37 INFO mapred.RetriableFileCopyCommand: Creating temp file: hdfs://DESTINATION-NAMENODE:9000/.distcp.tmp.attempt_local2084770019_0001_m_000000_0
18/04/11 12:05:38 ERROR util.RetriableCommand: Failure in Retriable command: Copying hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 to hdfs://DESTINATION-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108
java.io.IOException: Mismatch in length of source:hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 and target:hdfs://DESTINATION-NAMENODE:9000/.distcp.tmp.attempt_local2084770019_0001_m_000000_0
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareFileLengths(RetriableFileCopyCommand.java:193)...
On the destination I only see the directories created and none of the files.
Any ideas?
That's probably because you are copying a file that is still being written to; the paths in the error are HBase WALs, which are typically still being appended to while the cluster is live.
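If that is the case, one workaround (a sketch, assuming a DistCp version that supports the -filters option, available from Hadoop 2.8 onwards) is to exclude the live WAL files from the copy; the exclude-file path below is illustrative:

# exclude.txt holds one regular expression per line for paths to skip, e.g.
#   .*/WALs/.*
hadoop distcp -filters /path/to/exclude.txt \
  hdfs://SOURCE-NAMENODE:9000/dir/ \
  hdfs://DESTINATION-NAMENODE:9000/

Alternatively, copy a consistent snapshot of the data rather than the live directories.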

Hive issue using yarn

I am running Hive SQL on YARN and it throws an error on a join condition. I am able to create external as well as internal tables, but creating a table fails when I use a command such as
create table <new_table> as select name from student
When I run the same query through the Hive CLI it works fine, but through the Spring job it throws this error:
2016-03-28 04:26:50,692 [Thread-17] WARN org.apache.hadoop.hive.shims.HadoopShimsSecure - Can't fetch tasklog: TaskLogServlet is not supported in MR2 mode.
Task with the most failures(4):
-----
Task ID:
task_1458863269455_90083_m_000638
-----
Diagnostic Messages for this Task:
AttemptID:attempt_1458863269455_90083_m_000638_3 Timed out after 1 secs
2016-03-28 04:26:50,842 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Killed application application_1458863269455_90083
2016-03-28 04:26:50,849 [main] ERROR com.mapr.fs.MapRFileSystem - Failed to delete path maprfs:/home/pro/amit/warehouse/scratdir/hive_2016-03-28_04-24-32_038_8553676376881087939-1/_task_tmp.-mr-10003, error: No such file or directory (2)
2016-03-28 04:26:50,852 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution Error, return code 2 from
As per my findings, I think there is some issue with the scratch directory (scratdir).
Kindly suggest if anyone has faced the same issue.
This issue occurs if the nested directory does not exist; Hive does not create directories recursively.
Please check the existence of every directory from the root down to the child/table level.
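A minimal sketch of that check and fix, assuming the scratch-directory path shown in the log above (maprfs:/home/pro/amit/warehouse/scratdir):

# check whether the full scratch-directory path exists
hadoop fs -ls /home/pro/amit/warehouse/scratdir
# create it (and any missing parent directories) if it is not there
hadoop fs -mkdir -p /home/pro/amit/warehouse/scratdir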
I faced a similar issue while running the Hive query below:
select * from <db_name>.<internal_tbl_name> where <field_name_of_double_type> in (<list_of_double_values>) order by <list_of_order_fields> limit 10;
I performed an explain on the above statement, and below was the result:
fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2700001591]: it still exists.
Time taken: 0.886 seconds, Fetched: 24 row(s)
And I checked the logs with:
yarn logs -applicationId application_1458863269455_90083
The error happened after a MapR upgrade performed by the admin team. It is probably due to some upgrade or installation issue with the Tez configuration (as suggested by line 873 in the log below), or possibly the Hive query is syntactically not supported by the Tez optimization. I say so because another Hive query on an external table runs fine in my case. I still have to check a bit deeper, though.
Though I am not sure, the error line in the logs that looks most relevant is as follows:
2017-05-08 00:01:47,873 [ERROR] [main] |web.WebUIService|: Tez UI History URL is not set
Solution:
It is probably happening due to some open files or applications that are still using some resources. Please check https://unix.stackexchange.com/questions/11238/how-to-get-over-device-or-resource-busy
1. Run explain <your_Hive_statement>.
2. In the resulting execution plan, you can come across the file names/directories that the Hive execution engine fails to delete, e.g.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
3. Go to the path given in step 2, e.g. /hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/
4. In that path, ls -a or lsof +D <path> will show the open process IDs blocking the files from being deleted (a consolidated sketch of these checks follows the list).
5. If you run ps -ef | grep <pid>, you get:
hive_username <pid> 19463 1 05:19 pts/8 00:00:35 /opt/mapr/tools/jdk1.7.0_51/jre/bin/java -Xmx256m -Dhiveserver2.auth=PAM -Dhiveserver2.authentication.pam.services=login -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/mapr/hadoop/hadoop-2.7.0 -Dhadoop.id.str=hive_username -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/mapr/hadoop/hadoop-2.7.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/mapr/hive/hive-2.1/bin/../conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Djavax.net.ssl.trustStore=/opt/mapr/conf/ssl_truststore org.apache.hadoop.util.RunJar /opt/mapr/hive/hive-2.1//lib/hive-cli-2.1.1-mapr-1703.jar org.apache.hadoop.hive.cli.CliDriver
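Pulling steps 3 to 5 together, a consolidated sketch of the checks (the path is the one from step 3, with the <hive_username> placeholder kept from the thread; actual process IDs will differ):

# go to the scratch directory that Hive fails to clean up
cd /hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/
ls -a                 # the leftover .nfs* files show up here
lsof +D .             # lists the processes still holding those files open
ps -ef | grep <pid>   # inspect one of the reported process IDs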
CONCLUSION:
The HiveCliDriver process above clearly shows that running Hive managed ("Hive on Spark") tables through the Hive CLI is no longer supported from Hive 2.0 onwards and is going to be deprecated going forward; you have to use HiveContext in Spark for running such Hive queries. But you can still run queries on Hive external tables through the Hive CLI.

Using Oozie workflow and coordinator - E0302: Invalid parameter error

I'm trying to run a workflow using a coordinator, but when I try to set the workflow and coordinator XML file paths together, I get an error.
This is what my job.properties file looks like:
nameNode=hdfs://10.74.6.155:9000
jobTracker=10.74.6.155:9010
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/examples/apps/test/
oozie.coord.application.path=${nameNode}/user/${user.name}/examples/apps/test/
When I run my workflow from the command line:
bin\oozie job -oozie http://localhost:11000/oozie -config examples\apps\test\job.properties -run
i get the following error:
Error: E0302 : E0302: Invalid parameter [{0}]
What am I doing wrong?
Thanks!
Both workflow and coordinator paths cannot exist in job.properties at the same time; you can run a job either as a workflow or as a coordinator.
Use only your coordinator path in your properties file, and reference your workflow path in the coordinator.xml file:
oozie.use.system.libpath=true
workflowpath=${nameNode}/user/${user.name}/examples/apps/test/
oozie.coord.application.path=${nameNode}/user/${user.name}/examples/apps/test/
In your coordinator.xml file, add this line:
<app-path>${workflowpath}</app-path>
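For context, here is a minimal sketch of where that element sits in coordinator.xml (the app name, frequency, start/end times, timezone, and schema version below are illustrative, not taken from the question):

<coordinator-app name="test-coord" frequency="${coord:days(1)}"
                 start="2018-01-01T00:00Z" end="2020-01-01T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <!-- points at the directory (or XML file) of the workflow to run -->
      <app-path>${workflowpath}</app-path>
    </workflow>
  </action>
</coordinator-app>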

Generating job and topology traces from history folder of multinode cluster using Rumen

I have a single-node cluster; I took its logs, fed them to TraceBuilder, and it works.
I then grouped a 5-node cluster under the default rack and got its logs; here, the job and topology traces are generated properly.
Finally, I set up a 5-node cluster with each node mapped to a different rack.
I have hadoop-0.20.2 set up in my Eclipse Helios, so I ran TraceBuilder using
Main Class: org.apache.hadoop.tools.rumen.TraceBuilder
I ran some jobs on the cluster and used a copy of the master node's /usr/local/hadoop/logs/history folder as input to TraceBuilder.
Arguments: /home/arun/job.json /home/arun/topology.json /home/ubuntu/Documents/testlog
But I get:
11/12/16 12:02:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11/12/16 12:02:38 WARN rumen.TraceBuilder: TraceBuilder got an error while processing the [possibly virtual] file master_1324011575958_job_201112161029_0001_hduser_word+count within Path file:/home/ubuntu/Documents/testlog/master_1324011575958_job_201112161029_0001_hduser_word+count
java.lang.NullPointerException
at org.apache.hadoop.tools.rumen.JobBuilder.processTaskAttemptFinishedEvent(JobBuilder.java:492)
at org.apache.hadoop.tools.rumen.JobBuilder.process(JobBuilder.java:149)
at org.apache.hadoop.tools.rumen.TraceBuilder.processJobHistory(TraceBuilder.java:310)
at org.apache.hadoop.tools.rumen.TraceBuilder.run(TraceBuilder.java:264)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
at org.apache.hadoop.tools.rumen.TraceBuilder.main(TraceBuilder.java:142)
.....................
It generates the job trace JSON file, but fields like hostname and location are "null" in it, and the topology trace JSON file doesn't have the 5 nodes' info; it looks like this:
{
"name" : "<root>",
"children" : [ ]
}
Can anyone help me out?
This error occurs because no expected input file was found in the input directory.
The input directory must contain the job history files, for example job_201205192032_0006_conf.xml. These files are stored inside the logs/history folder, but under subdirectories generated according to the job execution and its date.
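As a quick check (a sketch; the paths and file names below are illustrative, taken from the names mentioned in this thread), list the history folder you pass to TraceBuilder and make sure it contains both the per-job configuration XML and the job history file itself:

# recursively list the history folder handed to TraceBuilder
ls -R /usr/local/hadoop/logs/history/
# expected entries look like:
#   job_201205192032_0006_conf.xml
#   master_1324011575958_job_201112161029_0001_hduser_word+count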
