Hive query inside shell script

I want to run a Hive query inside a shell script, and have the script exit with an error if the query fails.
Right now, even if the query fails, the following steps are still executed. Can someone help with this?
val=$(hive -e "
select col1 from table_name;")
(assuming the table has only one row)
echo "don't run if hive fails"

hive -e "select col1 from table_name"
if test $? -ne 0
then
exit 1
fi
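When the query output also has to be captured, as in the question, the exit status can be checked on the assignment itself, because a plain assignment with a command substitution returns the status of the substituted command. The following is only a minimal sketch: run_query and the HIVE_CMD override are hypothetical names introduced here so the logic can be tried without a cluster, not part of Hive.

```shell
#!/bin/bash
# HIVE_CMD is a hypothetical override hook; it defaults to the real hive CLI.
HIVE_CMD="${HIVE_CMD:-hive}"

# Run the query in $1, print its output, and return non-zero if hive fails.
run_query() {
    # Declare first: "local out=$(...)" would hide hive's exit status behind local's.
    local out
    if ! out=$("$HIVE_CMD" -S -e "$1"); then
        echo "hive query failed: $1" >&2
        return 1
    fi
    printf '%s\n' "$out"
}
```

A caller could then write val=$(run_query "select col1 from table_name;") || exit 1, so that the echo on the next line is never reached when hive fails.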

Related

Return status of a hive script

I have two questions regarding the capture of the return status/exit status of hive script.
Capture the return status in a unix script
try2.hql
select from_unixtime(unix_timestamp(),'yyyy-MM-dd');
This is called in the shell script try1.sh
echo "Start of script"
hive -f try2.hql
echo "End of script"
Now, I need to capture the return status of try2.hql. How can I do this?
Control flow when multiple queries are available
There are a couple of hive queries in a script try3.hql
select stockname, stock_date from mystocks_stg;
select concat('Top10_Stocks_High_OP_',sdate,'_',srnk) as rowkey, sname, sdate, sprice, srnk from (
select stockname as sname, stock_date as sdate, stock_price_open as sprice,rank() over(order by stock_price_open desc) as srnk
from mystocks
where from_unixtime(unix_timestamp(stock_date,'yyyy-MM-dd'),'yyyyMMdd') = '${hiveconf:batch_date}') tab
where tab.srnk <= 10;
try3.hql is called in the script try4.sh by passing the relevant parameters.
My question: in try3.hql, if there is any error in the first query, I must return to the shell script and abort the program without executing the second query.
Please suggest.
For part 1 of your problem, you can change your script to exit with the status of hive:
echo "Start of script"
hive -f try2.hql; hive_status=$?
echo "End of script"
exit $hive_status
And I have a solution for part 2.
You do know that the "hive" CLI is being deprecated in favor of "beeline", according to the documentation?
HiveServer2 (introduced in Hive 0.11) has its own CLI called Beeline,
which is a JDBC client based on SQLLine. Due to new development being
focused on HiveServer2, Hive CLI will soon be deprecated in favor of
Beeline (HIVE-10511).
In beeline, by default, your script will stop as soon as there is an error in it. This is controlled by the "force" parameter.
--force=[true/false] continue running script even after errors
BTW, the solution provided by codeforester for part 1 still works with beeline.
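Put together, a wrapper for try3.hql might look like the sketch below. It assumes beeline returns a non-zero status when a statement fails, as described above. The JDBC URL, the BEELINE_CMD override and the run_hql name are placeholders introduced here, not values from the question.

```shell
#!/bin/bash
# Placeholders: adjust BEELINE_CMD and JDBC_URL to your environment.
BEELINE_CMD="${BEELINE_CMD:-beeline}"
JDBC_URL="${JDBC_URL:-jdbc:hive2://localhost:10000}"

# $1 = HQL script, $2 = batch date exposed to the script as ${hiveconf:batch_date}
run_hql() {
    local status
    # --force=false (the default) stops the script at the first failing statement.
    "$BEELINE_CMD" -u "$JDBC_URL" --force=false --hiveconf batch_date="$2" -f "$1"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "$1 failed with status $status" >&2
    fi
    return "$status"
}
```

In try4.sh this could be called as run_hql try3.hql 20240101 || exit 1, where the date is only an example value.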
echo "Start of script"
hive -f try2.hql
hive_status=$?
echo "End of script"
echo "$hive_status" >> "$HOME/exit_status.log"
In the home directory, you'll find the exit_status.log file created, in which you'll have the exit status of the script.

how to invoke shell script in hive

Could someone please explain how to invoke a shell script from Hive? I looked into this and found that the source FILE command is supposed to be used to invoke a shell script from Hive, but I am not sure exactly how to call my shell script with it. Can someone help me with this? Thanks in advance.
Use ! <command>, which executes a shell command from the Hive shell.
test_1.sh:
#!/bin/sh
echo "This massage is from $0 file"
hive-test.hql:
! echo showing databases... ;
show databases;
! echo showing tables...;
show tables;
! echo runing shell script...;
! /home/cloudera/test_1.sh
output:
$ hive -v -f hive-test.hql
showing databases...
show databases
OK
default
retail_edw
sqoop_import
Time taken: 0.997 seconds, Fetched: 3 row(s)
showing tables...
show tables
OK
scala_departments
scaladepartments
stack
stackover_hive
Time taken: 0.062 seconds, Fetched: 4 row(s)
runing shell script...
This massage is from /home/cloudera/test_1.sh file
To invoke a shell script through HIVE CLI, please look at the example below.
!sh file.sh;
or
!./file.sh;
Please go through the Hive Interactive Shell Commands section at the link below for more information.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli
I don't know if it would suit you, but you can invert the problem: launch the hive commands from the bash shell and work with the query results there. You can even put everything in a single bash script that combines your hive queries with bash commands:
#!/bin/bash
hive -e 'SELECT count(*) from table' > temp.txt
cat temp.txt
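A slightly more defensive variant of the same idea is sketched below: it writes the result to a temporary file, hands that file to a follow-up command only if hive succeeded, and cleans up afterwards. The names query_then_run and HIVE_CMD are hypothetical helpers introduced for this illustration.

```shell
#!/bin/bash
# HIVE_CMD is a hypothetical override hook; it defaults to the real hive CLI.
HIVE_CMD="${HIVE_CMD:-hive}"

# $1 = query; remaining arguments = command that receives the result file as its last argument.
query_then_run() {
    local query tmp rc
    query=$1
    shift
    tmp=$(mktemp) || return 1
    if "$HIVE_CMD" -S -e "$query" > "$tmp"; then
        # Only run the follow-up command when the query succeeded.
        "$@" "$tmp"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}
```

For example, query_then_run 'SELECT count(*) from table' cat prints the count, and cat could be replaced by any script that takes a file name.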

Table not found exception when running hive query via an Oozie shell script

I'm trying to run a Hive count query on a table from a bash action in an Oozie workflow, but I always get a table not found exception.
#!/bin/bash
COUNT=$(hive -S -e "SELECT COUNT(*) FROM <table_name> where <condition>;")
echo $COUNT
The idea is to store the count in a variable for further analysis. This works absolutely fine if I run it directly from a local file on the shell.
I can do this by splitting it into 2 separate actions, where I first output hive query result to a temp directory and then read the file in the bash script.
Any help appreciated. Thanks!
Fixed it. I had a user permissions issue accessing the table, and I also had to set the following property:
SET mapreduce.job.credentials.binary = ${HADOOP_TOKEN_FILE_LOCATION}
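The answer does not say where the property was set. One possible arrangement, shown only as a sketch, is to prepend it to the query inside the shell action and let the shell expand HADOOP_TOKEN_FILE_LOCATION, which is assumed here to be present in the action's environment. The count_rows function and the HIVE_CMD override are hypothetical names.

```shell
#!/bin/bash
# HIVE_CMD is a hypothetical override hook; it defaults to the real hive CLI.
HIVE_CMD="${HIVE_CMD:-hive}"

# $1 = table name, $2 = WHERE condition
count_rows() {
    local query
    query="SELECT COUNT(*) FROM $1 WHERE $2;"
    # Assumption: the variable is only set when delegation tokens are in use.
    if [ -n "${HADOOP_TOKEN_FILE_LOCATION:-}" ]; then
        query="SET mapreduce.job.credentials.binary=${HADOOP_TOKEN_FILE_LOCATION}; $query"
    fi
    "$HIVE_CMD" -S -e "$query"
}
```

The result can be captured as before with COUNT=$(count_rows my_table "col1 = 1") || exit 1, where the table and condition are placeholders.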

Write a report of what the shell would do

My UNIX shell script takes a parameter at startup, which can have one of two values (test or production). The script performs an insert into an Oracle database. I need a condition so that, if the parameter is test, the script writes the spool to another file and does not connect to the database; otherwise it connects and performs the insert normally. In other words, there are two modes: in test I just want to see what the script would do, and in production it performs the insert and its other operations as usual. I tried this after the insert, but I get an error:
if [[ "$choice" = "test" ]];
then
${TMP_PART2DAT} > ${TMP_REPORT}
else
SP_SQLLOGIN="$ORACLE_DB_OWN/$ORACLE_PWD@$ORACLE_SID"
sqlplus -S -L ${SP_SQLLOGIN} @${TMP_PART2SQL}
fi
Any ideas?
Try running your shell script with "bash -x" mode. You would be able to trace the command execution.
Try
cat ${TMP_PART2DAT} > ${TMP_REPORT}
for line 3 of your script.
This will overwrite everything in TMP_REPORT with the contents of TMP_PART2DAT.
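With that fix applied, the whole test/production branch could be wrapped in a function along these lines. This is a sketch under assumptions: run_insert and the SQLPLUS_CMD override are hypothetical names, and the ORACLE_* variables are taken to be set as in the question.

```shell
#!/bin/bash
# SQLPLUS_CMD is a hypothetical override hook; it defaults to the real sqlplus client.
SQLPLUS_CMD="${SQLPLUS_CMD:-sqlplus}"

# $1 = mode (test or anything else), $2 = data file, $3 = SQL script, $4 = report file
run_insert() {
    if [ "$1" = "test" ]; then
        # Dry run: copy the prepared data into the report, never touch the database.
        cat "$2" > "$4"
    else
        "$SQLPLUS_CMD" -S -L "$ORACLE_DB_OWN/$ORACLE_PWD@$ORACLE_SID" "@$3"
    fi
}
```

It would be called as run_insert "$choice" "$TMP_PART2DAT" "$TMP_PART2SQL" "$TMP_REPORT".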

SQL select statement in UNIX inside/within IF..THEN statement

I would like to know how to run a simple SQL select statement in UNIX inside an IF..THEN..FI statement.
I know how to use the 'select' and 'if..then' statements in SQL*Plus, but I'm having difficulty getting a UNIX script to branch on a variable: if 'ABC', then 'Select...'.
Example:
if [ "$?" = 'ABC' ]
then
SELECT employid, name, age FROM tablename;
else
exit 1
fi
if [ "$?" = 'XYZ' ]
then
SELECT employid, name, age FROM tablename;
else
exit 1
fi
How do I write this in a UNIX script with correct syntax, as concisely as possible?
Thanks.
This sounds like you're trying to embed SQLPlus in a shell script. From memory the incantation should look something like:
if [ "$?" = "ABC" ]; then
sqlplus -S USER/PASS@Instance <<EOF
SET echo off;
SET pagesize 0;
SET heading off;
SPOOL foo.out
select foo from bar;
SPOOL OFF
EXIT
EOF
fi
Everything between the SQLPLUS and EOF is passed to SQLPlus, so we have some statements to control the formatting (you may want different ones) and the actual query. The SPOOL command in the SQLPlus script sends the output to a file. For more detailed docs on using SQLPlus, you can download them from Oracle's web site.
Remember that echo is your friend.
if [ "$?" = "ABC" ]; then echo "SELECT employid, name, age FROM tablename;"; else exit 1; fi
You can embed shell script variables and such. Be careful of the need to quote things the shell wants to act upon, like quotes and semi-colons.
Have you considered using Perl or another scripting language that includes database connection functionality? That way you avoid the clunky shell script/SQL*Plus linkage.
The above answer was OK. However, I already know how to use SQLPlus in a shell script, and I don't need the SQLPlus script to send the output to a file. In other words: is there another way of doing this that just prints the output to a log?
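One way to do that, sketched here under assumptions, is to drop SPOOL altogether and let the shell redirect whatever SQLPlus prints into the log. The query_to_log function and the SQLPLUS_CMD override are hypothetical names, and the connect string is supplied by the caller.

```shell
#!/bin/bash
# SQLPLUS_CMD is a hypothetical override hook; it defaults to the real sqlplus client.
SQLPLUS_CMD="${SQLPLUS_CMD:-sqlplus}"

# $1 = connect string (user/password@instance), $2 = log file to append to
query_to_log() {
    # No SPOOL: stdout and stderr of sqlplus are appended to the log by the shell.
    "$SQLPLUS_CMD" -S "$1" >> "$2" 2>&1 <<EOF
SET ECHO OFF
SET PAGESIZE 0
SET HEADING OFF
SELECT employid, name, age FROM tablename;
EXIT
EOF
}
```

For example, query_to_log "USER/PASS@Instance" run.log appends the rows to run.log; piping through tee -a instead of redirecting would also show them on the terminal.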

Resources