I have created two hive scripts script1.hql and script2.hql.
Is it possible to run the script script2.hql from script1.hql?
I read about the source command, but could not figure out how to use it.
Any pointers or reference docs would be appreciated.
Use the source <filepath> command:
source /tmp/script2.hql; --inside script1
The docs are here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli
Hive will include the text of /tmp/script2.hql and execute it in the same context, so all variables defined in the main script will be accessible to the commands in script2.
The source command looks for a local path (not HDFS), so copy the file to a local directory before executing.
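As a rough sketch of the above, the following shell snippet writes two small scripts to a local temporary directory and runs the main one with hive -f. The table name sample_table, the variable name tbl, and the file contents are made up for illustration; the variable sharing is as described in this answer and is not something the snippet verifies by itself.

```shell
#!/bin/sh
# Sketch: script1.hql sets a variable and sources script2.hql from a LOCAL path.
# sample_table and the variable name "tbl" are hypothetical.
workdir=$(mktemp -d)

# script2.hql relies on a variable defined by the caller.
# The quoted 'EOF' stops the shell from expanding ${hivevar:tbl}.
cat > "$workdir/script2.hql" <<'EOF'
SELECT COUNT(*) FROM ${hivevar:tbl};
EOF

# script1.hql defines the variable, then pulls in script2.hql.
# Here the heredoc is unquoted so that $workdir becomes an absolute local path.
cat > "$workdir/script1.hql" <<EOF
SET hivevar:tbl=sample_table;
source $workdir/script2.hql;
EOF

# Only attempt to run when the hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive -f "$workdir/script1.hql"
else
    echo "hive not found; generated scripts are in $workdir"
fi
```

The source line uses an absolute local path because, as noted above, the command does not read from HDFS.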
Try running the script with the -f option and see whether it executes:
hive -f /home/user/sample.sql
I usually source all the macros for the jobs I run on a remote machine using this command:
macros=$\my_directory
But I have seen someone use a different way to get all the macros when submitting jobs on a remote machine. He uses this command:
macros=$(dirname $(readlink -f $BASH_SOURCE))
I would like to know what advantage the dirname approach has over giving the specific macro location. It would be great if someone could explain how sourcing the macros with dirname works.
By using dirname you get the directory where the script is located. That makes it easy to source other files that sit next to your script, without having to fix the path each time the script bundle is relocated.
For instance, if your script contains source $macros/some_script.sh, it will not break whether the bundle is located in /usr/local/bin/, /bin/, or anywhere else.
Regarding $BASH_SOURCE see: https://stackoverflow.com/a/35006505/2146346
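To make this concrete, here is a small sketch that builds a throwaway bundle in a temporary directory and runs it from an unrelated working directory. The file names main.sh and helpers.sh and the function greet are hypothetical; only the macros= line comes from the question.

```shell
#!/bin/sh
# Sketch: a script that finds its neighbours via its own location.
# main.sh, helpers.sh and greet are hypothetical names.
bundle=$(mktemp -d)

cat > "$bundle/helpers.sh" <<'EOF'
greet() { echo "hello from helpers"; }
EOF

cat > "$bundle/main.sh" <<'EOF'
#!/bin/bash
# Directory that contains this script, with symlinks resolved.
macros=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")
source "$macros/helpers.sh"
greet
EOF

# The current directory is / rather than the bundle, yet helpers.sh is found.
output=$(cd / && bash "$bundle/main.sh")
echo "$output"
```

Moving the whole bundle directory elsewhere would not require any edit to main.sh, which is the advantage over a hard-coded path.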
I created a shell script that takes its arguments via getopts, like this:
sh my_code.sh -F"file_name"
where my_code.sh is my unix script name and file_name is the file I am passing to my script using getopts.
This works fine when I invoke my script from the command line.
I want to invoke the same script using Oozie, but I am not sure how to do it.
I tried passing the argument in the "exec" tag as well as in the "file" tag in the XML.
When I pass the argument in the exec tag, it gives a "JavaNullPoint" exception.
exec TAG
<exec>my_code.sh -F file_name</exec>
file TAG
<file>$/user/oozie/my_code.sh#$my_code.sh -F file_name</file>
When I pass the argument in the file tag, I get the error "No such file or directory". It searches for file_name in the /yarn/hadoop directory.
Can anyone suggest how I can achieve this using Oozie?
You need to create a lib/ folder as part of your workflow; Oozie will ship the script from there as part of its process. This directory should also be uploaded to the oozie.wf.application.path location.
This is required because the action can run on any YARN node. If you had, say, a hundred-node cluster, you would otherwise have to ensure that every single server had the /user/oozie/my_code.sh file available, which is hard to track. When the file is placed on HDFS, every node can download it locally.
So if you put the script in the lib directory next to the workflow XML that you submit, you can reference the script by name directly rather than using the # syntax.
Then pass the options with the argument XML tags, as described here:
https://oozie.apache.org/docs/4.3.1/DG_ShellActionExtension.html
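A minimal sketch of how this could fit together: the exec tag names only the script, and each option goes into its own argument tag. The XML fragment in the comment is an assumption based on the element names in the linked shell action docs, and parse_args is a hypothetical stand-in for the getopts loop in my_code.sh.

```shell
#!/bin/sh
# Assumed workflow fragment (layout may differ in your workflow):
#   <exec>my_code.sh</exec>
#   <argument>-F</argument>
#   <argument>file_name</argument>
#
# Each <argument> reaches the script as one positional parameter, so getopts
# sees the same thing as for: sh my_code.sh -F file_name

parse_args() {
    OPTIND=1            # reset so the function can be called more than once
    input_file=""
    while getopts "F:" opt; do
        case "$opt" in
            F) input_file=$OPTARG ;;
            *) echo "usage: my_code.sh -F <file>" >&2; return 1 ;;
        esac
    done
}

# Simulate what the script would receive from the two <argument> elements.
parse_args -F file_name
echo "input file: $input_file"
```

Putting "my_code.sh -F file_name" into exec as a single string makes the launcher look for a program with that whole name, which is consistent with the failures described in the question.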
I have created lib/ folder and uploaded it to oozie.wf.application.path location.
I am able to pass files to my shell action.
I have tested writing Hive query output to a file by executing Hive queries inside a shell script using the hive -e and hive -f options. When I execute the shell script from PuTTY it works fine; however, when the same shell script runs from an Oozie workflow in Hue, it does not write any results to the local file.
Using INSERT OVERWRITE DIRECTORY I can write Hive query output directly to a directory in HDFS; however, it creates a new directory for each query, so I cannot use this option.
Please suggest an alternative way to write the output of multiple Hive queries to a single file by executing a shell script from an Oozie workflow.
Thanks in advance.
When a shell action runs via an Oozie workflow, it can run on any of the data nodes. Check whether the output path is present on the data node where it ran.
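Because the local file ends up on whichever node ran the action, one option is to collect all query output in a single temporary file on that node and then copy the file into HDFS. The following is only a sketch under that assumption; the function name collect_results, the queries, and the target path are hypothetical.

```shell
#!/bin/sh
# collect_results HDFS_TARGET QUERY...
# Runs each query with hive -e, appends all results to ONE local temp file,
# then uploads that single file to HDFS_TARGET.
collect_results() {
    target=$1
    shift
    tmpfile=$(mktemp) || return 1
    for query in "$@"; do
        # -S runs hive in silent mode so only the query result is printed
        hive -S -e "$query" >> "$tmpfile" || { rm -f "$tmpfile"; return 1; }
    done
    # -f overwrites an existing file at the target
    hdfs dfs -put -f "$tmpfile" "$target"
    status=$?
    rm -f "$tmpfile"
    return $status
}

# Example call (hypothetical path and queries):
#   collect_results /user/me/all_results.txt "SELECT ..." "SELECT ..."
```

The result then lives in HDFS, so it no longer matters which node executed the shell action.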
I wrote a shell script where I copy my .bashrc file as well as custom dotfiles to a backup folder and then replace them in my home folder with another .bashrc file which will then source my custom dotfiles.
However, after the script does its job, if I try to use the aliases I included in the new files I get the error No command found. Only after I source the .bashrc file manually in the terminal do I have access to them.
From what I understand, the script I'm running is executing in a sub-shell (?) which will terminate on execution.
How can I run the script and have new commands/aliases/functions available without having to source the .bashrc file myself or restarting the terminal?
Well, it appears that instead of running my script via sh script.sh, I can source it like source script.sh, which will behave exactly as I wanted.
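The difference can be seen in a small sketch: running a file with sh starts a separate process whose definitions disappear when it exits, while sourcing evaluates the file in the current shell. The names MY_SETTING and hello are hypothetical, and a variable and a function stand in for the aliases from the question.

```shell
#!/bin/sh
# Sketch: definitions made by a child shell vanish; sourced ones stay.
# MY_SETTING and hello are hypothetical names.
unset MY_SETTING
script=$(mktemp)
cat > "$script" <<'EOF'
MY_SETTING=enabled
hello() { echo "hello"; }
EOF

sh "$script"                        # runs in a separate process
after_run=${MY_SETTING:-unset}

. "$script"                         # "." is the portable spelling of source
after_source=${MY_SETTING:-unset}

echo "after sh:     $after_run"
echo "after source: $after_source"
hello                               # the function is now defined here too
```

In an interactive bash session the same holds for aliases defined in the sourced file.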
I am invoking a bash shell script using oozie editor in Hue.
I used the shell action in the workflow and tried the following options for the shell command:
Uploaded the shell script using 'choose a file'
Gave local directory path where shell script is present
Gave HDFS path where shell script is present
But all of these options gave the following error:
Cannot run program "sec_test_oozie.sh" (in directory "/data/hadoop/yarn/local/usercache/user/appcache/application_1399542362142_0086/container_1399542362142_0086_01_000002"): java.io.IOException: error=2, No such file or directory
How should I give the shell script execution command?
Where should the shell script file reside?
You need to add the file "sec_test_oozie.sh" to the Oozie shell step, under "Add files".
I think you are creating the file on a Windows machine, which adds extra line break characters. You need to convert the shell script file to Unix format. I faced the same issue; when I created the file on a Linux system instead, it started working. The error is misleading.
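A quick way to check for and remove Windows line endings is sketched below, using a generated demo.sh as a stand-in for the real script. One plausible mechanism for the misleading message is that a carriage return after #!/bin/bash makes the system look for an interpreter whose name ends in that character, which does not exist.

```shell
#!/bin/sh
# Sketch: detect and strip carriage returns from a script.
# demo.sh stands in for the real script file.
workdir=$(mktemp -d)
printf '#!/bin/bash\r\necho ok\r\n' > "$workdir/demo.sh"

cr=$(printf '\r')
if grep -q "$cr" "$workdir/demo.sh"; then
    echo "CRLF line endings found, converting"
    # tr deletes every carriage return; write to a new file, then swap it in
    tr -d '\r' < "$workdir/demo.sh" > "$workdir/demo.unix.sh"
    mv "$workdir/demo.unix.sh" "$workdir/demo.sh"
fi

result=$(sh "$workdir/demo.sh")
echo "$result"
```

After converting, the cleaned file has to be uploaded to HDFS again so that the workflow picks up the fixed copy.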
I want to extend @SergioRG's answer. Oozie, at least with Cloudera's Hue interface, is very counterintuitive.
To run a script file, three conditions should be met:
the file is on the HDFS file system, in a folder accessible by Oozie
the file should be indicated in the shell command field
the file should be added with any other dependent file in the "Files+" part of the task card.
I wonder why they didn't add by default the script file you are calling.
Edit: please also check in advanced options (the gear in the left upper corner) if you need to set the path variable (eg. PATH=/usr/local/bin:/usr/bin).
Did you edit sec_test_oozie.sh with the Hue File Browser? Depending on your Hue version it might have corrupted it: hue-list
I encountered the same problem. In my case the script echoed an unrelated line, and the workflow tried to parse it as a property line. Oozie gave the misleading error message java.io.IOException: error=2, No such file or directory, which only added confusion.
You will need to use <file> to add your script.
If you used <capture-output/>, then you must make sure that your script prints only "key=value" lines, like Java properties. Otherwise you will get the error you see, java.io.IOException: error=2, No such file or directory, with some path pointing to .../yarn/local/usercache/...
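One way to keep the output clean, sketched here with hypothetical property names row_count and status, is to send every diagnostic message to stderr so that stdout carries nothing but key=value lines.

```shell
#!/bin/sh
# Sketch: stdout carries only key=value lines; everything else goes to stderr.
# row_count and status are hypothetical property names.
log() { echo "$@" >&2; }      # diagnostics go to stderr, not stdout

emit_properties() {
    log "starting work"
    echo "row_count=42"
    echo "status=done"
    log "finished"
}

# Capture stdout only, the way a properties parser would see it.
captured=$(emit_properties 2>/dev/null)
echo "$captured"
```

Any command inside the script that writes to stdout needs the same treatment, for example by redirecting its output with >&2.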
We had this issue with a test script. Basically, if you use an editor that adds weird characters or line endings to the file, Oozie will throw this error because the script cannot be used in the container.
Try using nano file.sh to see if any strange characters appear. Then push it back to hdfs with hdfs dfs -put file.sh /path/you/need
Removing the #!/bin/bash from my shell script helped me.
"No such file or directory" means Oozie cannot locate the file. Please check the AddPath setting in the command.
In the edit node section, get the Oozie application HDFS path.
Upload the shell script to the Oozie application path in HDFS.
In the Oozie edit node step, under Shell command, specify the name of the shell script that was uploaded.
Below that there is an AddPath option; use it to add files, and add the shell script that was uploaded to the HDFS path.