Hadoop Hcatalog -How to pass key value pair - hadoop

I have a create table script where the table name will be decided at runtime. How do I pass the value to sql script?
I'm trying something like this
hcat -e "create table ${D:TAB_NAME} (name string)" -DTAB_NAME=person
But I keep getting errors.
Can I get the correct syntax?

Try this:
hcat -e 'create table ${hiveconf:TAB_NAME} (name string);' -DTAB_NAME=person2
Here are two things to note:
In shell, default variable expansion is $ so your ${D:TAB_NAME} is getting expanded to nothing before even getting passed to hcat parser. So, either escape the $ or use strong quoting using: ''.
Use hiveconf instead of D for variable substitution as hcat under the hoods is still using hive to parse commands.

Related

Passing a variable to a Hive script file -- works with integer but not string

I need to pass a variable to an hql file in Hive using putty. I've set up a test scenario. Basically I want to select a row from a table where a value equals the variable. It will work when the variable is an integer but not a string.
The hql file /home_dir_users/username/smb_bau/testy.hql has this code in it:
drop table if exists tam_seg.tbl_ppp;
create table tam_seg.tbl_ppp as
select
*
from
tam_seg.1_testy as b
where
b.column_a = ${hivevar:my_var};
tam_seg.1_testy looks like this:
column_a
A
B
C
D
ZZZ
123
I want to use PuTTY to pass the variable my_var to the hql file. It works if I try 123 using this:
hive --hivevar my_var=123 -f /home_dir_users/username/smb_bau/testy.hql
But it doesn't work if I try to select one of the strings. I have tried the below:
hive --hivevar my_var=ZZZ -f /home_dir_users/username/smb_bau/testy.hql
hive --hivevar my_var='ZZZ' -f /home_dir_users/username/smb_bau/testy.hql
my_var='ZZZ'
hive --hivevar my_var=$my_var -f /home_dir_users/username/smb_bau/testy.hql
But every time I get this error message:
*FAILED: SemanticException [Error 10004]: Line 9:14 Invalid table alias or column reference 'ZZZ': (possible column names are: column_a)*
I have also tried hiveconf, only one dash before it instead of two, not having hiveconf or hivevar before the variable in the code file.
Any ideas what am I doing wrong?
Many thanks.
OK so it looks like I have found the answer below through trial and error. I am leaving the post here in case any other users new to Hive find this useful.
I put single quotes round the variable in the hql file so it looks like this:
select
*
from
tam_seg.1_testy as b
where
b.column_a = '${hivevar:my_var}';
In a way this maybe seems obvious -- I would put single quotes round a string if I weren't using a variable. I guess I had my VBA/SQL Server hat on where a variable would not have quotes round it even if it were a string e.g. = strMyVar or = #STR_MY_VAR (otherwise the result would literally be "${hivevar:my_var}" as a string).

Parse Strings in HIVE using Shell

I have a shell script that I use to parse a string variable into Hive, in order to filter my observations. I provide both the script and the hive code below.
In the following script I have a variable which has a string value and I try to parse it into hive, the example below:
Shell Script:
name1='"Maria Nash"' *(I use a single quote first and then a double)*
hive --hiveconf name=${name1} -f t2.hql
Hive code (t2.hql)
create table db.mytable as
SELECT *
FROM db.employees
WHERE emp_name='${hivevar:name}';
Conclusion
To be accurate, the final table is created but it does not contain any observation. The employees table contains observations which has emp_name "Maria Nash" though.
I think that I might not parse the string correctly from shell or I do not follow the correct syntax on how I should handle the parsed variable in the hive query.
I would appreciate your help!
you are passing variable in hiveconf namespace but in the sql script are using hivevar, you should also use hiveconf:
WHERE emp_name=${hiveconf:name} --hiveconf, not hivevar
Use of the CLI is deprecated
you can use beeline from a shell script
it should look something like
beeline << EOF
!connect jdbc:hive2://host:port/db username password
select *
from db.employees
where emp_name = "${1}"
EOF
assuming that $1 is the input from the script.
This is an example of how to do it rather than a production implementation. Generally,
Kerberos would be enabled so username and password wouldn't be there
and a valid token would be available
Validate the input parameters.
Given that you can do it in a single line
beeline -u jdbc:hive2://hostname:10000 -f {full Path to Script} --hivevar {variable}={value}

Hive - How to store a query result in a variable in a Bash script

I need to store the result of a Hive query in a variable whose value will be used later. So, something like:
$var = select col1 from table;
$var_to_used_later = $var;
All this is part of a bash shell script. How to form the query so as to get the desired result?
Hive should provide command line support for you. I am not familiar with hive but I found this: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli, you can check whether that works.
Personally, I used mysql to achieve similar goal before. The command is:
mysql -u root -p`[script to generate the key]` -N -B -e "use XXXDB; select aaa, bbb, COUNT(*) from xxxtable where some_attribute='$CertainValue';"
I used the method shown here and got it! Instead of calling a file as shown, I run the query directly and use the value stored in the variable.

Hive argument using Variable Substitution (-d|--define) fails with a string argument

When I run the a hive script with the command
hive -d arg_partition1="p1" -f test.hql
It returns the error
FAILED: SemanticException [Error 10004]: Line 3:36 Invalid table alias or column reference 'p1': (possible column names are: line, partition1)
Script with name test.hql
DROP TABLE IF EXISTS test;
CREATE EXTERNAL TABLE IF NOT EXISTS test (Line STRING)
PARTITIONED BY (partition1 STRING);
ALTER TABLE test ADD PARTITION (partition1="p1") LOCATION '/user/test/hive_test_data';
SELECT * FROM test WHERE partition1=${arg_partition1};
If I modify the partition to be an integer then it works fine and returns the correct results.
How do I run a Hive script with a string argument?
You'll have to escape your quotes when invoking hive, such as -d arg_partition1=\"p1\" for this to work.
However, I don't see why you'd have to add the quotes to the replacement string in any case. Presumably you know the data types of your fields when writing the query, so if partition1 is a string then include the quotes in the query, such as WHERE partition1="${arg_partition1}"; and if it's an integer just leave them out entirely.

whitespace character in case of parameter substitution

I want to pass a filter statement with in my pig script using parameter substitution
For that I have tried
exec -param flt='a1==1 AND a2=2' filterscript.pig
But sadly it is throwing an exception message
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: Local file 'AND' does not exist.
Pig version - 0.9.2
I have tried flt='\'a1==1 AND a2=2\'' and flt="a1==1 AND a2==2" suggested by pig users in apache forum as well as seen a similar post in SO.
Any help will be appreciated
I think you are using the parameter passed as it is as a condition. If so you will get an error like this. Instead you can pass them as separate paarmeters and form the condition string inside the pig script.
exec -p p1=1 -p p2=2 filterscript.pig
Inside your filterscript.pig script you can use these parameter values in condition clauses. For example
a1==$p1 AND a2=$p2
If you run your script outside the grunt shell you can do the followings:
pig -param flt="a1\=\=1 AND a2\=\=2" -f filterscript.pig
where filterscript.pig is something like this:
A = load ...
...
B = filter A by $flt;
...
Note that the '=' is also escaped, otherwise the filter condition won't be evalued to boolean.
If you want to use the filter substitution within the grunt shell as you tried with exec,
then you'll encounter the whitespace problem. Since escaping the whitespace character doesn't work, as a workaround you can create a parameter file :
cat params.txt
flt="a1\=\=1 AND a2\=\=2"
Then issue:
exec -param_file params.txt filterscript.pig
Note: I use Pig 0.12

Resources