error while running example of oozie job - hadoop

I tried running my first oozie job by following a blog post.
I used oozie-examples.tar.gz, after extracting, placed examples in hdfs.
I tried running map-reduce job in it but unfortunately got an error.
Ran below command:
oozie job -oozie http://localhost:11000/oozie -config /examples/apps/map-reduce/job.properties -run
Got the error:
java.io.IOException: configuration is not specified at
org.apache.oozie.cli.OozieCLI.getConfiguration(OozieCLI.java:787) at
org.apache.oozie.cli.OozieCLI.jobCommand(OozieCLI.java:1026) at
org.apache.oozie.cli.OozieCLI.processCommand(OozieCLI.java:662) at
org.apache.oozie.cli.OozieCLI.run(OozieCLI.java:615) at
org.apache.oozie.cli.OozieCLI.main(OozieCLI.java:218) configuration is
not specified
I don't know which configuration it is asking for as I am using Cloudera VM and it has by default got all the configurations set in it.

oozie job -oozie http://localhost:11000/oozie -config /examples/apps/map-reduce/job.properties -run
The -config parameter takes an local path not an HDFS path. The workflow.xml needs to be present in the HDFS and path is defined in the job.properties file with the property:
oozie.wf.application.path=<path to the workflow.xml>

Related

Unable to deploy Spark jobs using Oozie

I need to keep a spark job running 24/7 and for this I am using Oozie. To do this I have written a workflow.xml and job.properties files, containing the needful information to invoke it.
However when I try to send the oozie job using this:
oozie job –config /home/oozie/tst/job.properties -run
I get the following error message, which is very clear:
java.io.IOException: configuration is not specified
at org.apache.oozie.cli.OozieCLI.getConfiguration(OozieCLI.java:816)
at org.apache.oozie.cli.OozieCLI.jobCommand(OozieCLI.java:1055)
at org.apache.oozie.cli.OozieCLI.processCommand(OozieCLI.java:686)
at org.apache.oozie.cli.OozieCLI.run(OozieCLI.java:639)
at org.apache.oozie.cli.OozieCLI.main(OozieCLI.java:225)
configuration is not specified
The problem here is that the configuration file (job.properties) exists locally on the path specified. I also PUT the directory containing both files and .jar in the HDFS.
Any idea why is this failing?
Is Oozie the best tool for this task I have?
The config parameter takes local path not HDFS. check job.properties present in /home/oozie/tst/job.properties
check job.properties contain oozie.wf.application.path=PATH_TO_HDFS_PATH_WHERE_WORKFLOW.XML_IS_PRESENT
Plus I see the dash(-) given in config parameter is different then dash(-) in run parameter
Specify the host in your command
oozie job --oozie http://your_host:11000/oozie -config /home/oozie/tst/job.properties -run
11000 is deafult port

Oozie - Run workflow by command line with configuration file in HDFS

as a newbie with Oozie, I tried to run some tutorials by command line. My stepByStep:
upload my Oozie project (workflow xml file, job.properties file, jar and data) to HDFS via HUE interface. In my job.properties files, I've indicated every information like data name node, path to my application, ...
running via HUE interface, simply, I check on check box of workflow xml file and submit
I would like to run my Oozie project by command line:
with job.properties file in local, I run:
oozie job -oozie http://localhost:11000/oozie -config examples/apps/map-reduce/job.properties -run
How can I run my Oozie project instead of with the job.properties in local (instead of the config file in local, I want to run my job with the configuration file in HDFS)?
Thanks for any suggestion and feel free to comment if my question is not clear!
I don't know if there is a direct way, but you certainly could do something like
oozie job -oozie http://localhost:11000/oozie -config <(hdfs dfs -cat examples/apps/map-reduce/job.properties) -run

Oozie validation command throwing Error: One file must be specified

I am trying to use validate method expose by oozie but stuck with error mentioned below.
As per Apache Documentation:
https://oozie.apache.org/docs/3.3.2/DG_CommandLineTool.html#Validating_a_Workflow_XML
oozie validate xx_logger_import/workflow.xml -oozie http://localhost:11000/oozie
Error: One file must be specified
Cloudera CDH-5.8 version is in use.
Oozie Version:
oozie admin -oozie http://localhost:11000/oozie -version
Oozie server build version: 4.1.0-cdh5.8.0
The workflow xml that you are trying to validate should be the last parameter in your validate command -
Example -
oozie validate -oozie $Oozie_URL /home/abc/workflow.xml
Valid workflow-app
oozie validate : validate a workflow, coordinator, bundle XML file
-auth <arg> select authentication type [SIMPLE|KERBEROS]
-oozie <arg> Oozie URL
*
That is because you are passing the -oozie parameter as well. Just use the validate command and pass the required workflow file. Thanks.
oozie validate xx_logger_import/workflow.xml

The oozie job is not submitting

I have checked for oozie service at 11000. It is connecting.
But at the time of submitting job console is stuck.
Command used for submitting is
oozie/bin/oozie job -submit -config /tmp/config.properties -oozie http://127.0.0.1:11000/oozie
I have also checked logs for errors. There isnt any.
You are using the -submit command. You need to use -run for the job to submit. You should get a workflow id when you submit. This is the command you should run:
oozie/bin/oozie job -run -config /tmp/config.properties -oozie http://127.0.0.1:11000/oozie
You can check the status of the job by running:
oozie job -oozie http://localhost:11000/oozie -info <wfid>
I have mentioned wrong yarn port and ip address in config file. Thats why it was not connecting to yarn and it was not submitting. I have updated it and its working fine.

How can I check Oozie logs

My coordinator failed with Error : E0301 invalid resource [filename]
when I do hadoop fs -ls [filename] the file is listed.
how can I debug what is wrong.
how can I check log files???
oozie job -log requires jobId. in my case i dont have job id. how can I see logs in that case. appreciate responses.
thank you
If you are looking for a command line way to do this, you can run the following:
oozie job -oozie http://localhost:11000 -info <wfid>
oozie job -oozie http://localhost:11000 -log <wfid>
If you have the $OOZIE_URL set, then you do not need the -oozie parm in the above statements. This first command will show you the status of the job and each action. The second command will dig into the oozie log and display the part in the log that pertains to the workflow id that was passed in.
cd /var/log/oozie/
ls
You should see the log file there.
I highly recommend using the oozie webconsole when new to oozie. If you are using Cloudera it's under "Enabling the Oozie Web Console" here http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_17_6.html for CDH4. CDH3 link is similar.
Also the jobid is printed when you submit the job.

Resources