When I run a Hive script with the command
hive -d arg_partition1="p1" -f test.hql
It returns the error
FAILED: SemanticException [Error 10004]: Line 3:36 Invalid table alias or column reference 'p1': (possible column names are: line, partition1)
The script, test.hql:
DROP TABLE IF EXISTS test;
CREATE EXTERNAL TABLE IF NOT EXISTS test (Line STRING)
PARTITIONED BY (partition1 STRING);
ALTER TABLE test ADD PARTITION (partition1="p1") LOCATION '/user/test/hive_test_data';
SELECT * FROM test WHERE partition1=${arg_partition1};
If I modify the partition to be an integer then it works fine and returns the correct results.
How do I run a Hive script with a string argument?
You'll have to escape your quotes when invoking hive, such as -d arg_partition1=\"p1\" for this to work.
However, I don't see why you'd have to add the quotes to the replacement string in any case. Presumably you know the data types of your fields when writing the query, so if partition1 is a string then include the quotes in the query, such as WHERE partition1="${arg_partition1}"; and if it's an integer just leave them out entirely.
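To see why the escaping matters, you can inspect what the shell actually hands to hive. A minimal sketch — the `show` function is a hypothetical stand-in for hive that just prints the value it receives after `-d`:

```shell
# Stand-in for hive; prints its -d argument value so we can see
# exactly what survived the shell's quote processing.
show() { printf '%s\n' "$2"; }

show -d arg_partition1="p1"     # the shell strips the quotes: arg_partition1=p1
show -d arg_partition1=\"p1\"   # escaped quotes survive:     arg_partition1="p1"
```

In the first form the quotes never reach Hive, so the substituted query contains a bare `p1`, which the parser treats as a column reference — hence the SemanticException.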
Related
I am running the commands below in Hive and have already imported the table 'sales_withcomma', however it is still not working:
SELECT * FROM 'default'.'sales_withcomma'
ALTER TABLE sales_withcomma SET SERDE 'com.bizo.hive.serde.csv.CSVSerde'
Use backticks, not single quotes, around table/schema names. But quoting table names at all is a bad practice (check this question for an explanation of why that is the case). This should work:
SELECT * FROM default.sales_withcomma;
I need to pass a variable to an HQL file in Hive using PuTTY. I've set up a test scenario. Basically I want to select a row from a table where a value equals the variable. It works when the variable is an integer but not a string.
The hql file /home_dir_users/username/smb_bau/testy.hql has this code in it:
drop table if exists tam_seg.tbl_ppp;
create table tam_seg.tbl_ppp as
select
*
from
tam_seg.1_testy as b
where
b.column_a = ${hivevar:my_var};
tam_seg.1_testy looks like this:
column_a
A
B
C
D
ZZZ
123
I want to use PuTTY to pass the variable my_var to the hql file. It works if I try 123 using this:
hive --hivevar my_var=123 -f /home_dir_users/username/smb_bau/testy.hql
But it doesn't work if I try to select one of the strings. I have tried the below:
hive --hivevar my_var=ZZZ -f /home_dir_users/username/smb_bau/testy.hql
hive --hivevar my_var='ZZZ' -f /home_dir_users/username/smb_bau/testy.hql
my_var='ZZZ'
hive --hivevar my_var=$my_var -f /home_dir_users/username/smb_bau/testy.hql
But every time I get this error message:
FAILED: SemanticException [Error 10004]: Line 9:14 Invalid table alias or column reference 'ZZZ': (possible column names are: column_a)
I have also tried hiveconf instead of hivevar, a single dash instead of two, and omitting the hiveconf/hivevar prefix on the variable in the code file.
Any ideas what am I doing wrong?
Many thanks.
OK so it looks like I have found the answer below through trial and error. I am leaving the post here in case any other users new to Hive find this useful.
I put single quotes round the variable in the hql file so it looks like this:
select
*
from
tam_seg.1_testy as b
where
b.column_a = '${hivevar:my_var}';
In a way this seems obvious in hindsight -- I would put single quotes round a string if I weren't using a variable. I guess I had my VBA/SQL Server hat on, where a variable would not have quotes round it even if it were a string, e.g. = strMyVar or = #STR_MY_VAR (there, quoting it would make the result literally the string "${hivevar:my_var}").
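The reason the quotes belong in the template is that Hive's variable substitution is purely textual: the raw value is pasted into the query before parsing. A rough shell sketch of that step, with sed standing in (hypothetically) for Hive's substitution pass:

```shell
# sed stands in for Hive's substitution step: the value is pasted into the
# query text verbatim, and only then is the result parsed.
substitute() { printf '%s\n' "$1" | sed "s/\${hivevar:my_var}/$2/"; }

substitute 'b.column_a = ${hivevar:my_var};' ZZZ
# unquoted template -> b.column_a = ZZZ;    ZZZ is read as a column reference

substitute "b.column_a = '\${hivevar:my_var}';" ZZZ
# quoted template   -> b.column_a = 'ZZZ';  ZZZ is read as a string literal
```

This is also why an integer value works either way: a bare `123` is already a valid literal after substitution, while a bare `ZZZ` is not.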
I am new to Hadoop and I have a scenario where I have to export the dataset/file from HDFS to Oracle table using sqoop export. The file has values of 'null' in it so same is getting exported in table as well. I want to know how we can replace 'null' with blank in database while exporting?
You can create a TSV file from Hive/Beeline; in that process you can turn nulls into blanks with --nullemptystring=true.
Example: beeline -u ${hiveConnectionString} --outputformat=csv2 --showHeader=false --silent=true --nullemptystring=true --incremental=true -e 'set hive.support.quoted.identifiers=none; select * from someSchema.someTable where whatever > something' > /your/local/or/edge-node/path/exportingfile.tsv
You can use the created file in the sqoop export for exporting to Oracle table.
You can also replace the nulls with blanks in the file with Unix sed, e.g.:
sed -i 's/null//g' /your/local/or/edge-node/path/exportingfile.tsv
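A runnable sketch of the sed approach (the file path and contents are hypothetical). Note that a bare s/null//g also rewrites substrings of longer words, so a GNU sed word boundary (\b) is safer:

```shell
# Hypothetical TSV sample containing literal "null" field values.
printf 'a\tnull\tc\nannulled\tnull\n' > /tmp/exportingfile.tsv

# \b restricts the match to the whole word "null" (GNU sed),
# so a value like "annulled" is left untouched.
sed -i 's/\bnull\b//g' /tmp/exportingfile.tsv
cat /tmp/exportingfile.tsv
```

After this, the former "null" fields are empty strings, which Oracle will then store as NULL in varchar columns (see the note below).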
In Oracle, empty strings and nulls are treated the same for varchars; that is why Oracle internally converts empty strings into nulls for varchar columns. When '' is assigned to a char(1) it becomes ' ' (char types are blank-padded strings). See what Tom Kyte says about this: https://asktom.oracle.com/pls/asktom/f?p=100:11:0%3a%3a%3a%3aP11_QUESTION_ID:5984520277372
See this manual: https://www.techonthenet.com/oracle/questions/empty_null.php
Suppose I am writing a script, for example one that creates a table:
hive (test)> create TABLE tlocal
> (id int,
> name string
> addr string);
FAILED: ParseException line 4:5 mismatched input 'addr' expecting ) near 'string' in create table statement.
Here I forgot to add a comma after name string, so I got the error. I want to add the comma after name string and run it again. But, unlike some SQL shells, the Hive prompt does not let me correct only the wrong part of the script -- I have to retype the script from the beginning.
How can I do this?
As Andrew suggested you can write your query in a file and run it using
hive -f <your query file>
Alternatively, you can use Hue, an open-source web interface for Apache Hadoop that includes a SQL editor for Apache Hive.
I have a create table script where the table name will be decided at runtime. How do I pass the value to sql script?
I'm trying something like this
hcat -e "create table ${D:TAB_NAME} (name string)" -DTAB_NAME=person
But I keep getting errors.
Can I get the correct syntax?
Try this:
hcat -e 'create table ${hiveconf:TAB_NAME} (name string);' -DTAB_NAME=person2
Here are two things to note:
In the shell, $ triggers variable expansion inside double quotes, so your ${D:TAB_NAME} gets expanded to nothing before it even reaches the hcat parser. Either escape the $ or use strong quoting with single quotes: ''.
Use hiveconf instead of D for variable substitution, as hcat under the hood still uses Hive to parse commands.
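The quoting point is easy to verify in the shell itself, with echo standing in for hcat:

```shell
# Single quotes (or an escaped \$) keep the text out of the shell's hands,
# so hcat receives ${hiveconf:TAB_NAME} verbatim and performs its own
# substitution from -DTAB_NAME.
echo 'create table ${hiveconf:TAB_NAME} (name string);'
echo "create table \${hiveconf:TAB_NAME} (name string);"
# both print: create table ${hiveconf:TAB_NAME} (name string);
```

With plain double quotes and no escaping, the shell would expand the $-expression itself and hcat would never see the placeholder.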