Need to pass Variable from Shell Action to Oozie Shell using Hive

All,
Looking to pass a variable from a shell action to the Oozie shell. I am running commands such as these in my script:
#!/bin/sh
evalDate="hive -e 'set hive.execution.engine=mr; select max(cast(create_date as int)) from db.table;'"
evalPartition=$(eval $evalBaais)
echo "evaldate=$evalPartition"
The trick is that it is a hive command run from the shell.
Then I am using this to read the value in Oozie:
${wf:actionData('getPartitions')['evaldate']}
But it comes back blank every time! I can run those commands in my own shell and they work, but they do not work under Oozie. Likewise, if I run the commands on the other boxes of the cluster, they run fine as well. Any ideas?

The issue was with my cluster configuration. When I ran as the oozie user, I had write-permission issues on /tmp/yarn. To get around that, I changed the command to run as:
baais="export HADOOP_USER_NAME=functionalid; hive yarn -hiveconf hive.execution.engine=mr -e 'select max(cast(create_date as int)) from db.table;'"
Where hive allows me to run as yarn.

The solution to your problem is to use the "-S" switch in the hive command for silent output, so that only the query result is printed (see below).
Also, what is "evalBaais"? It is never defined in your script; you probably need to replace it with "evalDate". So your code should look like this:
#!/bin/sh
evalDate="hive -S -e 'set hive.execution.engine=mr; select max(cast(create_date as int)) from db.table;'"
evalPartition=$(eval $evalDate)
echo "evaldate=$evalPartition"
Now you should be able to capture the output.
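For wf:actionData to return anything, the shell action also needs a <capture-output/> element, and the script has to print key=value lines on stdout. A minimal sketch of a defensive version of the script follows. The run_query function and the value 20240131 are hypothetical stand-ins for the real hive call and its result, used only so the example runs without a cluster.

```shell
#!/bin/sh
# Hypothetical stand-in for the real query. In the workflow this would be:
#   hive -S -e 'select max(cast(create_date as int)) from db.table;'
# The stub writes a log line to stderr and a made-up result to stdout.
run_query() {
    echo "INFO: pretend log noise" >&2
    echo "20240131"
}

# Command substitution captures stdout only. Keep just the last line in case
# anything else was printed before the result.
evalPartition=$(run_query 2>/dev/null | tail -n 1)

# Fail loudly instead of handing an empty value to the workflow.
if [ -z "$evalPartition" ]; then
    echo "query returned no value" >&2
    exit 1
fi

# With <capture-output/> set, Oozie reads key=value lines from stdout.
line="evaldate=$evalPartition"
echo "$line"
```

Exiting non-zero on an empty result makes the failure visible in the action itself rather than as a blank value further down the workflow.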

Related

I do not want my Bash script to stop if a Hive command fails

I have a bash script sending a lot of HiveQL commands to hive. The problem is that I do not want it to stop if one of these commands fails. I tried the usual Bash command:
set +e
but it does not work (the script stops running if one of the Hive commands fails). Do you know where the problem is? Is it an option in my Hive config or something :-) ?
Thank you!
EDIT: I use the Hive shell, doing something like this:
#Send my command to hive ...
hive -S -e "\"$MyCommand\""
#... but I want my script to continue running if the command fails :-).

Is it possible to execute more than one Hive query in parallel?

I have a script that reads and executes one HQL file at a time, but I want to execute more than one at a time. Please let me know if there is any way to do so.
If you use hive -e 'some command', you can put each command in the background with the shell's & operator:
hive -e 'some command' &
hive -f someFile.hql &
etc..
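Commands started with & return immediately, so the script needs wait to find out when the queries have finished and whether any of them failed. A minimal sketch of that pattern follows. The run_hql function and the file names are hypothetical stand-ins for hive -f calls, so the example runs without Hive installed.

```shell
#!/bin/sh
# Hypothetical stand-in for `hive -f "$1"`: sleeps briefly, fails for one file.
run_hql() {
    sleep 1
    [ "$1" != "bad.hql" ]
}

# Start all three in the background and remember each process ID.
run_hql first.hql &
pid1=$!
run_hql bad.hql &
pid2=$!
run_hql third.hql &
pid3=$!

# wait <pid> blocks until that job ends and returns its exit status.
failures=0
for pid in "$pid1" "$pid2" "$pid3"; do
    if ! wait "$pid"; then
        failures=$((failures + 1))
    fi
done
echo "failed jobs: $failures"
```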
Approach 1 (oozie):
One of the easiest and most straightforward ways to run all your HQL files is to use Oozie. Create an Oozie job, define the Hive actions to run in parallel (for example, under a fork/join pair), and submit the job.
Approach 2 (Shell):
Create multiple shell scripts, each containing a hive -e '<<query>>' call, and run all the shell scripts in parallel with a cron job (or, again, use Oozie to run the shell scripts).
Although approach 2 works, I'd recommend approach 1, since Oozie is built for coordinating Hive scripts like this.

How to write sqoop jobs in a shell script and run them sequentially?

I need to run a set of sqoop jobs one after another inside a shell script. How can I achieve this? By default, it runs all the jobs in parallel, which results in performance taking a hit. Should I remove the "-m" parameter and run?
The -m parameter sets the number of parallel map tasks used within a single sqoop command; it has no effect on how separate commands are scheduled relative to each other.
So removing the -m parameter will not solve the problem.
First, write a shell script file with your sqoop commands:
#!/bin/bash
sqoop_command_1
sqoop_command_2
sqoop_command_3
The shell runs these one after another, each starting only when the previous one has finished. Save the script under a name such as sqoop_jobs.sh,
then make the file executable:
chmod +x sqoop_jobs.sh
Now you can run the script by issuing the following command in your terminal:
./sqoop_jobs.sh
I hope this helps.
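A plain list of commands keeps going even when an earlier command fails. If later jobs depend on earlier ones, the script can check each exit status and stop. A minimal sketch follows. The three functions are hypothetical stand-ins for real sqoop commands, with the second one made to fail on purpose.

```shell
#!/bin/sh
# Hypothetical stand-ins for real `sqoop ...` command lines.
sqoop_command_1() { echo "job 1 done"; }
sqoop_command_2() { echo "job 2 failed" >&2; return 1; }
sqoop_command_3() { echo "job 3 done"; }

completed=""
for job in sqoop_command_1 sqoop_command_2 sqoop_command_3; do
    # Each job starts only after the previous one has returned.
    if "$job"; then
        completed="$completed $job"
    else
        echo "stopping: $job exited with an error" >&2
        break
    fi
done
echo "completed:$completed"
```

Putting set -e at the top of the script is a shorter way to get the same stop-on-first-error behavior, at the cost of less control over the message.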

Oozie shell Action - Running hive from shell issue

Based on a condition being true, I execute hive -e in a shell script. It works fine. When I put this script in a shell action in Oozie and run it, I get the error scriptName.sh: line 42: hive: command not found.
I tried passing <env-var>PATH=/usr/lib/hive</env-var> in the shell action, but I guess I am making a mistake there, because I get the same error: scriptName.sh: line 42: hive: command not found
Edited:
I used which hive in the shell script. Its output is not consistent. I get two variations of output:
1. /usr/bin/hive, along with a "Delegation Token can be issued only with kerberos or web authentication" Java IOException.
2. which : hive not in {.:/sbin:/usr/bin:/usr/sbin:...}
OK, I finally figured it out. This might be trivial for shell experts, but it can help someone starting out.
1. hive: command not found. It was not a classpath issue; it was a shell issue. The environment I am running in is a Korn shell (run echo $SHELL to find out), but the hive script (/usr/lib/hive/bin/hive.sh) is a bash script. So I changed the shebang in my script to #!/bin/bash and it worked.
2. Delegation Token can be issued only with kerberos or web authentication.
In my Hive script I added SET mapreduce.job.credentials.binary = ${HADOOP_TOKEN_FILE_LOCATION}. HADOOP_TOKEN_FILE_LOCATION is a variable that holds the location of the job token. This token needs to be passed to authenticate access to HDFS data (in my case, an HDFS read operation through a Hive SELECT query) in a secure cluster.
Evidently, the shell environment variables are missing.
To confirm it, run export in the shell script that Oozie calls.
If you have Oozie call a shell, a simple way is to use /bin/bash -l your_script.
PS. PATH is a list of directories, so you need to append ${HIVE_HOME}/bin to your PATH, not ${HIVE_HOME}/bin/hive.
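A minimal sketch of that PATH fix, placed at the top of the script the action runs. The default of /usr/lib/hive is an assumption taken from the path mentioned in the question; the real location depends on how Hive is installed on the cluster nodes.

```shell
#!/bin/sh
# Assumed install location; adjust to wherever Hive lives on the cluster nodes.
HIVE_HOME="${HIVE_HOME:-/usr/lib/hive}"

# Append the directory that contains the hive launcher, not the launcher itself.
case ":$PATH:" in
    *":$HIVE_HOME/bin:"*) ;;                 # already present, nothing to do
    *) PATH="$PATH:$HIVE_HOME/bin" ;;
esac
export PATH

# Print where hive resolves, or warn on stderr if it still cannot be found.
command -v hive || echo "hive not found on PATH" >&2
```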

Why does using set -e cause my script to fail when called in crontab?

I have a bash script that performs several file operations. When any user runs this script, it executes successfully and outputs a few lines of text but when I try to cron it there are problems. It seems to run (I see an entry in cron log showing it was kicked off) but nothing happens, it doesn't output anything and doesn't do any of its file operations. It also doesn't appear in the running processes anywhere so it appears to be exiting out immediately.
After some troubleshooting I found that removing "set -e" resolved the issue, it now runs from the system cron without a problem. So it works, but I'd rather have set -e enabled so the script exits if there is an error. Does anyone know why "set -e" is causing my script to exit?
Thanks for the help,
Ryan
With set -e, the script will stop at the first command which gives a non-zero exit status. This does not necessarily mean that you will see an error message.
Here is an example, using the false command which does nothing but exit with an error status.
Without set -e:
$ cat test.sh
#!/bin/sh
false
echo Hello
$ ./test.sh
Hello
$
But the same script with set -e exits without printing anything:
$ cat test2.sh
#!/bin/sh
set -e
false
echo Hello
$ ./test2.sh
$
Based on your observations, it sounds like your script is failing for some reason (presumably related to the different environment, as Jim Lewis suggested) before it generates any output.
To debug, add set -x to the top of the script (as well as set -e) to show commands as they are executed.
When your script runs under cron, the environment variables and path may be set differently than when the script is run directly by a user. Perhaps that's why it behaves differently?
To test this: create a new script that does nothing but printenv and echo $PATH.
Run this script manually, saving the output, then run it as a cron job, saving that output.
Compare the two environments. I am sure you will find differences...an interactive
login shell will have had its environment set up by sourcing a ".login", ".bash_profile",
or similar script (depending on the user's shell). This generally will not happen in a
cron job, which is usually the reason for a cron job behaving differently from running
the same script in a login shell.
To fix this: At the top of the script, either explicitly set the environment variables
and PATH to match the interactive environment, or source the user's ".bash_profile",
".login", or other setup script, depending on which shell they're using.
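A minimal sketch of both steps in one script: it pins PATH explicitly and writes the environment to a file so the manual and cron runs can be compared. The PATH value and the snapshot location are only examples; copy the real PATH from an interactive echo $PATH.

```shell
#!/bin/sh
# Example PATH only; replace with the value from an interactive `echo $PATH`.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

# Hypothetical snapshot location, one file per invocation (named by PID).
snapshot="${TMPDIR:-/tmp}/env.$$"

# Sorted output makes two snapshots easy to compare with diff.
env | sort > "$snapshot"
echo "environment written to $snapshot"
```

Run it once by hand and once from cron, then diff the two snapshot files to see which variables differ.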
