passing argument from shell script to hive script - bash

My problem can be framed in two ways:
I need to pass an argument from a shell script to a Hive script,
OR
within a single shell script, I need to include a variable's value in a Hive statement.
Here is an example of each:
1) Passing an argument from a shell script to HiveQL:
My test Hive QL:
select count(*) from demodb.demo_table limit ${hiveconf:num}
My test shell script:
cnt=1
sh -c 'hive -hiveconf num=$cnt -f countTable.hql'
I want the value of 'cnt' to be substituted into the HQL, but that does not happen here. I get this error:
FAILED: ParseException line 2:0 mismatched input '<EOF>' expecting Number near 'limit' in limit clause
I'm sure the error means that the variable's value isn't getting passed on.
2) Passing the argument directly within the shell script:
cnt=1
hive -e 'select count(*) from demodb.demo_table limit $cnt'
In both cases, the argument value is not passed. Any ideas?
PS: I know that using 'limit' with count looks absurd; this is a simplified version of my actual problem. The requirement to pass the argument is the same.
Thanks in advance.

Set the variable this way:
#!/bin/bash
cnt=3
echo "Executing the hive query - starts"
hive -hiveconf num=$cnt -e ' set num; select * from demodb.demo_table limit ${hiveconf:num}'
echo "Executing the hive query - ends"

This works, if put in a file named hivetest.sh, then invoked with sh hivetest.sh:
cnt=2
hive -e "select * from demodb.demo_table limit $cnt"
You are using single quotes instead of double quotes, and the shell does not expand $cnt inside single quotes.
Using double quotes for option 1 also works fine.
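The difference between the two quoting styles can be checked without running Hive at all. The following is a minimal sketch that only builds and prints the query strings; the variable names single and double are hypothetical and exist purely for illustration.

```shell
#!/bin/bash
cnt=2

# Single quotes: the shell keeps $cnt literally, so Hive would see "limit $cnt".
single='select * from demodb.demo_table limit $cnt'

# Double quotes: the shell substitutes the value before Hive ever sees the query.
double="select * from demodb.demo_table limit $cnt"

echo "single-quoted: $single"
echo "double-quoted: $double"
```

The same reasoning explains why ${hiveconf:num} is usually kept inside single quotes: there the goal is for Hive, not the shell, to perform the substitution.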

hadoop@osboxes:~$ export val=2;
hadoop@osboxes:~$ hive -e "select * from bms.bms1 where max_seq=$val";
or
vi test.sh
#########
export val=2
hive -e "select * from bms.bms1 where max_seq=$val";
#####################

Try this
cnt=1
hive -hiveconf number=$cnt -e 'select * from demodb.demo_table limit ${hiveconf:number}'

Related

Parse Strings in HIVE using Shell

I have a shell script that I use to pass a string variable into Hive in order to filter my rows. Both the script and the Hive code are below.
The script sets a variable to a string value, which I then try to pass to Hive:
Shell Script:
name1='"Maria Nash"' *(I use a single quote first and then a double)*
hive --hiveconf name=${name1} -f t2.hql
Hive code (t2.hql)
create table db.mytable as
SELECT *
FROM db.employees
WHERE emp_name='${hivevar:name}';
Conclusion
The final table is created, but it contains no rows, even though the employees table has rows with emp_name "Maria Nash".
I suspect that I am either not passing the string correctly from the shell or not using the correct syntax for the variable in the Hive query.
I would appreciate your help!
You are passing the variable in the hiveconf namespace, but the SQL script reads it from hivevar. Use hiveconf in the script as well:
WHERE emp_name=${hiveconf:name} --hiveconf, not hivevar
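Separately from the namespace mismatch, the unquoted ${name1} in the shell script is worth checking, because its value contains a space. The sketch below uses a hypothetical helper, count_args, in place of hive to show how many arguments the shell would actually hand over; it assumes the default IFS.

```shell
#!/bin/bash
name1='"Maria Nash"'

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted expansion: the value is split at the space into two words.
unquoted=$(count_args --hiveconf name=${name1})

# Quoted expansion: the whole name=... string stays a single argument.
quoted=$(count_args --hiveconf "name=${name1}")

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
```

With the quoted form the whole name="Maria Nash" string arrives as one argument, so hive --hiveconf "name=${name1}" -f t2.hql is the safer spelling.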
Use of the Hive CLI is deprecated; you can use beeline from a shell script instead.
It should look something like this:
beeline << EOF
!connect jdbc:hive2://host:port/db username password
select *
from db.employees
where emp_name = "${1}"
EOF
This assumes that $1 is the input to the script.
This is an example of how to do it rather than a production implementation. Generally, Kerberos would be enabled, so the username and password would not be there and a valid token would be available. You should also validate the input parameters.
Alternatively, you can do it in a single line:
beeline -u jdbc:hive2://hostname:10000 -f {full Path to Script} --hivevar {variable}={value}

Unable to resolve $proc_date from the script

I have multiple HQL files located at /home/ganesh/CopyJobs/hql/. Here is one example:
insert into XYZ.exttbl_form_data PARTITION (load_date="$proc_date") select FORM_DATA_ID,FORM_ID,USER_ID,INTERACTIONS_ID,SUBMISSION_DATETIME,FILEDS from PQR.exttbl_form_data where load_date="$proc_date"
In the main script, I'm reading the HQL files like this:
export proc_date=2018-05-07
while read line
do
export hql=`cat /home/ganesh/CopyJobs/hql/$table_name.hql`
export hql_final=$(`eval echo"$hql"`)
echo "Final HQL: $hql_final"
hive -e "$hql_final;"
done < /home/ganesh/CopyJobs/config/tables.txt
Here tables.txt lists all the HQL files.
I want $proc_date to be resolved, but that is not happening.
Use Hive variable substitution (hiveconf variables). I have fixed your script a little.
HQL file should look like this:
insert into XYZ.exttbl_form_data PARTITION (load_date='${hiveconf:proc_date}')
select FORM_DATA_ID,FORM_ID,USER_ID,INTERACTIONS_ID,SUBMISSION_DATETIME,FILEDS
from PQR.exttbl_form_data where load_date='${hiveconf:proc_date}'
${hiveconf:proc_date} is the variable to be passed to Hive.
The main script:
proc_date=2018-05-07
echo "proc_date is $proc_date"
while read line
do
hql_file=/home/ganesh/CopyJobs/hql/"$line".hql
echo "current hql_file is $hql_file"
hive -hiveconf proc_date="$proc_date" -f "$hql_file"
done < /home/ganesh/CopyJobs/config/tables.txt
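Before running the loop against a cluster, it can help to print the commands it would issue. This is a dry-run sketch, not part of the original answer: print_cmds and the table name other_table are hypothetical, and the table list is supplied inline instead of being read from tables.txt. It also uses read -r and skips blank lines, which guards against an empty line in the list producing a path like /home/ganesh/CopyJobs/hql/.hql.

```shell
#!/bin/bash
proc_date=2018-05-07
hql_dir=/home/ganesh/CopyJobs/hql

# Reads table names from stdin and prints the hive command for each one.
print_cmds() {
  local line
  while IFS= read -r line; do
    # Skip empty lines so we never build a path like ".../.hql".
    [ -n "$line" ] || continue
    printf 'hive -hiveconf proc_date=%s -f %s/%s.hql\n' "$proc_date" "$hql_dir" "$line"
  done
}

# Inline stand-in for tables.txt (the second name is made up; note the blank line).
cmds=$(printf 'exttbl_form_data\n\nother_table\n' | print_cmds)
echo "$cmds"
```

Once the printed commands look right, the printf line can be swapped for the real hive call from the script above.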

Hive - How to store a query result in a variable in a Bash script

I need to store the result of a Hive query in a variable whose value will be used later. So, something like:
$var = select col1 from table;
$var_to_used_later = $var;
All this is part of a bash shell script. How to form the query so as to get the desired result?
Hive should provide command line support for you. I am not familiar with hive but I found this: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli, you can check whether that works.
Personally, I have used mysql to achieve a similar goal before. The command is:
mysql -u root -p`[script to generate the key]` -N -B -e "use XXXDB; select aaa, bbb, COUNT(*) from xxxtable where some_attribute='$CertainValue';"
I used the method shown here and got it! Instead of calling a file as shown, I run the query directly and use the value stored in the variable.
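For reference, the usual shell mechanism for this is command substitution. The sketch below is an assumption about what the linked method amounts to, since the link itself is not preserved. To keep it runnable without a cluster, a hypothetical run_query function stands in for Hive; on a real system its body would be something like hive -S -e "$1", where -S is the CLI's silent mode.

```shell
#!/bin/bash
# Hypothetical stand-in so the sketch runs anywhere.
# On a real cluster the body would be:  hive -S -e "$1"
run_query() {
  echo "42"
}

# $(...) captures whatever the command prints on stdout.
var=$(run_query "select col1 from table")
var_to_be_used_later=$var

echo "Query returned: $var_to_be_used_later"
```

Note that bash assignments take no leading $ and no spaces around the equals sign, unlike the pseudocode in the question.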

Passing Local Parameters to Hadoop script

To my understanding, the following will result in passing a global Hive variable:
hive -hiveconf DATE='01/01/2000' -f test_script.hql
That can be called with
SELECT * FROM DATETABLE WHERE DATE = ${hiveconf:DATE}
And I know that local variables can be defined in the script and called by doing:
set DATE='01/01/2000'
SELECT * FROM DATETABLE WHERE DATE = ${DATE}
But say one wanted to submit many jobs with local parameters set for each script, how can we pass them from the command line?
The emphasis is avoiding one script picking up the hiveconf:DATE set by another script that was submitted in quick succession.
EDIT:
I guess this could work, creating a shell script and passing variables to the shell script and then passing those to the individual queries:
#!/bin/bash
FIRST_QUERY="SELECT * FROM DATETABLE WHERE DATE = '$DATE'"
hive -e "$FIRST_QUERY"
But this seems inefficient, I would still want to know if the option above is possible.
I found the option -define here:
hive -e 'SELECT * FROM DATETABLE WHERE DATE = ${DATE}' -define DATE='01/01/2000'

How to create a "one-liner" for oracle that includes "set" commands as well as sql statements

I want to execute a dynamic sql containing some set commands. Is it possible to do so without embedding newlines?
set heading off ; set lines 1000 ; select * from my_table;
Note the above does not work due to the semicolons between the set commands:
SP2-0158: unknown SET option ";"
Update: The whole point of this question is to do it on one line.
The best I have found for my own purposes is to put my standard SET commands in a file called sql_settings.txt in a directory with an environment variable holding its path and another variable for the connect string:
sqlsets=/directory/where/sql_settings/stored/sql_settings.txt
db_conn=<ConnectStr>
Then execute a one-liner like this, using a shell here-string:
sqlplus -s $db_conn @$sqlsets <<< "select * from my_table;" | less
(Piping to "less" keeps the output from cluttering your shell session.)
You could also get fancy and create a shell function to reduce the typing to just the SQL query:
function mydb { sqlplus -s $db_conn @$sqlsets <<< "$@;" ; }
Then call as such:
mydb 'select * from my_table'
The set command is a SQL*Plus directive and is not part of SQL. You can do it this way:
set heading off lines 1000
select * from my_table;
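If the one-line requirement applies to the shell command rather than to the text SQL*Plus receives, printf can generate the line breaks. This is a sketch under that assumption, not something the answers here confirm; sql_input is a hypothetical variable, and the sqlplus call is left as a comment so the block runs without a database.

```shell
#!/bin/bash
# printf turns each \n into a real newline, so the set directive and the
# query reach SQL*Plus on separate lines even though this is typed as one line.
sql_input=$(printf 'set heading off lines 1000\nselect * from my_table;\n')
echo "$sql_input"

# With a database available, the same text could be piped in, e.g.:
#   printf 'set heading off lines 1000\nselect * from my_table;\n' | sqlplus -s "$db_conn"
```

SQL*Plus still receives two lines here, so this only helps when the constraint is on what is typed in the shell.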
After extensive research, I have concluded this is not possible to perform with oracle.
