Hive table creation error through Bash Shell

Can anyone tell me why I am getting an error while creating a partitioned table from the bash shell?
[cloudera@localhost ~]$ hive -e "create table peoplecountry (
name1 string,
name2 string,
salary int,
country string
)
partitioned by (country string)
row format delimited
column terminated by '\n'";
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.10.0-cdh4.7.0.jar!/hive-log4j.properties
Hive history file=/tmp/cloudera/hive_job_log_0fdf7083-8ab4-499f-8048-a85f162d1357_376056456.txt
FAILED: ParseException line 8:0 missing EOF at 'column' near 'delimited'

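Hive's ROW FORMAT DELIMITED clause has no "column terminated by" option, which is why the parser stops at 'column'.
If you meant a newline at the end of each row of your data, then you need to use:
lines terminated by '\n'
instead of column terminated by.
If you meant that each column in the row is separated by a delimiter, then specify it as:
fields terminated by ','
replacing ',' with your actual delimiter.
Refer to:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL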
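
Putting this together, a corrected version of the statement from the question could look like the sketch below. It assumes the data is comma-separated (the ',' delimiter is an assumption, since the question does not say). It also drops country from the regular column list, because Hive does not accept a partition column that is repeated among the table columns. The hive call is only attempted when the CLI is on the PATH.

```shell
#!/bin/bash
# Corrected DDL: "fields terminated by" replaces the invalid "column terminated by".
# Assumption: columns are comma-separated; substitute your real delimiter.
# country appears only in PARTITIONED BY, since Hive rejects a partition
# column that is also listed as a regular column.
ddl="create table peoplecountry (
name1 string,
name2 string,
salary int
)
partitioned by (country string)
row format delimited
fields terminated by ','
lines terminated by '\n'"

# Show the statement that would be sent to Hive.
printf '%s\n' "$ddl"

# Run it only when the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -e "$ddl"
fi
```

Inside bash double quotes the '\n' stays a literal backslash followed by n, so Hive receives the escape sequence unchanged.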

Related

Hive Command Line - problems with backticks in column name

When I try creating a table using beeline / hive command line for the following DDL :
CREATE EXTERNAL TABLE schema.table
(
`Week` string,
`Orders` string,
`Units` string
)
COMMENT 'This table was auto generated'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'separatorChar' = ',',
'quoteChar' = '\"',
'escapeChar' = '\\'
)
STORED AS TEXTFILE
LOCATION '/data/qa/ingest_id=1543338670'
TBLPROPERTIES ("skip.header.line.count"="1");
I get the following error
Error: Error while compiling statement: FAILED: ParseException line 3:0 character '▒' not supported here
line 3:1 character '▒' not supported here
line 3:2 character '▒' not supported here (state=42000,code=40000)
Has anyone faced this issue before? This DDL executes without issues on a GUI client.

The issue was related to UTF-8 encoding. I removed the non-ASCII bytes from the input in the shell:
tr -d '\200-\277' | tr -d '\300-\377'
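
As a sketch of how that filter can be applied, the function below strips every byte outside the 7-bit ASCII range (for example a UTF-8 byte-order mark or stray typographic characters) before the DDL is handed to beeline. The function name strip_non_ascii, the file name create_table.hql, and the JDBC_URL variable are hypothetical. Note that this also deletes any legitimate non-ASCII characters in comments or string literals.

```shell
#!/bin/bash
# Delete all bytes in the range 0x80-0xFF (octal 200-377) from stdin.
# LC_ALL=C makes tr operate on single bytes rather than multibyte characters.
strip_non_ascii() {
  LC_ALL=C tr -d '\200-\377'
}

# Demo: a UTF-8 byte-order mark (octal 357 273 277) in front of a keyword is removed.
printf '\357\273\277CREATE EXTERNAL TABLE\n' | strip_non_ascii

# Hypothetical usage: clean a DDL file, then run the cleaned copy.
#   strip_non_ascii < create_table.hql > create_table.clean.hql
#   beeline -u "$JDBC_URL" -f create_table.clean.hql
```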

How to export a Hive table into a CSV file including header?

I used this Hive query to export a table into a CSV file.
hive -f mysql.sql
row format delimited fields terminated by ','
select * from Mydatabase,Mytable limit 100"
cat /LocalPath/* > /LocalPath/table.csv
However, the output does not include the table's column names.
How can I export the column names to the CSV as well?
With something like show tablename?

You should add set hive.cli.print.header=true; before your select query to get the column names as the first row of your output. The output would look like Mytable.col1, Mytable.col2 ....
If you don't want the table name with the column names, use set hive.resultset.use.unique.column.names=false;. The first row of your output would then look like col1, col2 ...

Invoking the hive command line with the parameters suggested in the other answer here works for a plain select. So you can extract the column names first and create the CSV with them, as follows:
hive -S --hiveconf hive.cli.print.header=true --hiveconf hive.resultset.use.unique.column.names=false --database Mydatabase -e 'select * from Mytable limit 0;' > /LocalPath/table.csv
After that, run the actual data extraction, but this time remember to append to the CSV:
cat /LocalPath/* >> /LocalPath/table.csv ## From your question with >> for append
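
The header and the data can also be written in one step. The sketch below assumes the Hive CLI's default tab-separated output and converts tabs to commas with a helper named tsv_to_csv (a hypothetical name). It is a naive conversion: fields that themselves contain commas or quotes are not escaped.

```shell
#!/bin/bash
# Convert tab-separated text on stdin to comma-separated text on stdout.
# Naive: no quoting or escaping of fields that contain commas.
tsv_to_csv() {
  tr '\t' ','
}

# Demo with two rows shaped like Hive CLI output (header row first).
printf 'col1\tcol2\nalice\t10\n' | tsv_to_csv

# Real export, attempted only when the hive CLI is installed.
# Database, table, and output path are the ones used in the question.
if command -v hive >/dev/null 2>&1; then
  hive -S \
    --hiveconf hive.cli.print.header=true \
    --hiveconf hive.resultset.use.unique.column.names=false \
    --database Mydatabase \
    -e 'select * from Mytable limit 100;' | tsv_to_csv > /LocalPath/table.csv
fi
```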

command line arguments in hive ( .hql) files from a bash script

I have a main bash script that runs several other bash scripts and .hql files. The .hql files contain Hive queries with a where clause on a date field. I am trying to automate a process, and I need the where clause to change based on today's date (which is obtained in the main bash script).
For example the .hql file looks like this:
This is selectrows.hql
DROP TABLE IF EXISTS tv.events_tmp;
CREATE TABLE tv.events_tmp
( origintime STRING,
deviceid STRING,
clienttype STRING,
loaddate STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION 'hdfs://nameservice1/data/full/events_tmp';
INSERT INTO TABLE tv.events_tmp SELECT origintime, deviceid, clienttype, loaddate FROM tv.events_tmp WHERE origintime >= '2015-11-02 00:00:00' AND origintime < '2015-11-03 00:00:00';
Since today is 2015-11-11, I want to be able to pass today's date minus 9 days and today's date minus 8 days from the bash script to the .hql script. Is there a way to pass these two variables from the bash script to the .hql file?
The main bash script looks like this:
#!/bin/bash
# today's date
prodate=`date +%Y-%m-%d`
echo $prodate
dateneeded=`date -d "$prodate - 8 days" +%Y-%m-%d`
echo $dateneeded
# CREATE temp table
beeline -u 'jdbc:hive2://datanode:10000/;principal=hive/datanode@HADOOP.INT.BELL.CA' -d org.apache.hive.jdbc.HiveDriver -f /home/automation/tv/selectrows.hql
echo "created table"
Thanks in advance.

You can use beeline's -e option to execute a query passed as a string, and splice the date parameters into that string.
#!/bin/bash
# today's date
prodate=`date +%Y-%m-%d`
echo $prodate
dateneeded8=`date -d "$prodate - 8 days" +%Y-%m-%d`
dateneeded9=`date -d "$prodate - 9 days" +%Y-%m-%d`
echo $dateneeded8
echo $dateneeded9
hql="
DROP TABLE IF EXISTS tv.events_tmp;
CREATE TABLE tv.events_tmp
( origintime STRING,
deviceid STRING,
clienttype STRING,
loaddate STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION 'hdfs://nameservice1/data/full/events_tmp';
INSERT INTO TABLE tv.events_tmp SELECT origintime, deviceid, clienttype, loaddate FROM tv.events_tmp WHERE origintime >= '"
echo "$hql""$dateneeded9""' AND origintime < '""$dateneeded8""';"
# CREATE temp table
beeline -u 'jdbc:hive2://datanode:10000/;principal=hive/datanode@HADOOP.INT.BELL.CA' -d org.apache.hive.jdbc.HiveDriver -e "$hql""$dateneeded9""' AND origintime < '""$dateneeded8""';"
echo "created table"

An alternative way to pass arguments:
Create a Hive .hql file that references the variables:
vi multi_var_file.hql
SELECT * FROM TEST_DB.TEST_TB WHERE TEST1='${var_1}' AND TEST2='${var_2}';
Then pass values for those variables when running the script:
hive -hivevar var_1='TEST1' -hivevar var_2='TEST2' -f multi_var_file.hql
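
Applied to the original question, the same mechanism can carry the computed dates into selectrows.hql. The sketch below assumes GNU date (for -d), assumes the WHERE clause in the .hql file has been rewritten to use '${hivevar:start_ts}' and '${hivevar:end_ts}' (hypothetical variable names), and uses a placeholder JDBC URL. It only prints the beeline command it would run.

```shell
#!/bin/bash
# Print the date N days before a base date, formatted as YYYY-MM-DD.
# Requires GNU date for the -d option.
days_before() {
  local base="$1" n="$2"
  date -d "$base -$n days" +%Y-%m-%d
}

today=$(date +%Y-%m-%d)
start_ts="$(days_before "$today" 9) 00:00:00"
end_ts="$(days_before "$today" 8) 00:00:00"

# Build the command as an array so values containing spaces stay intact.
# JDBC_URL is a placeholder; substitute your HiveServer2 connection string.
JDBC_URL="${JDBC_URL:-jdbc:hive2://datanode:10000/}"
cmd=(beeline -u "$JDBC_URL"
     --hivevar "start_ts=$start_ts"
     --hivevar "end_ts=$end_ts"
     -f /home/automation/tv/selectrows.hql)

# Print the command instead of executing it; run "${cmd[@]}" to execute.
printf '%q ' "${cmd[@]}"; echo
```

Keeping the query in the .hql file and passing only the values avoids the quote juggling needed when the whole statement is assembled as a shell string.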

hive load data:how to specify file column separator and dynamic partition columns?

I have a question about loading MySQL data into Hive 2: I don't know how to specify the separator. I tried several times without success.
Here is the Hive table; id is the partition column:
0: jdbc:hive2://localhost/> desc test;
+-----------+------------+----------+
| col_name | data_type | comment |
+-----------+------------+----------+
| a | string | |
| id | int | |
+-----------+------------+----------+
When I execute
load data local inpath 'file:///root/test' into table test partition (id=1);
it says:
Invalid path ''file:///root/test'': No files matching path file
but the file does exist.
I want the table to be partitioned dynamically from the contents of the file, so I added the partition column to the file like this:
root@<namenode|~>:#cat /root/test
a,1
b,2
but that also failed. The docs say nothing about this, so I guess it is not supported right now.
Does anyone have an idea? Any help will be appreciated!

If you want to specify the column separator, use this clause in the table definition:
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
Replace the ',' with your separator.
Also, if you want to partition a Hive table, you specify the column you want to partition on like this:
CREATE TABLE Foo (bar int )
PARTITIONED BY (testpartition string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
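
For the dynamic-partition part of the question, a common approach is to load the raw file into an unpartitioned staging table and then let Hive derive the partitions with an INSERT ... SELECT. The sketch below only prints the HQL. The staging table name test_staging is hypothetical, and the comma delimiter matches the sample file in the question. As far as I know, when connected through beeline, LOAD DATA LOCAL reads from the filesystem of the HiveServer2 host, so the file has to exist on that machine.

```shell
#!/bin/bash
# Emit HQL that loads a comma-separated file into a staging table and then
# fills the partitioned table "test" using dynamic partitioning.
# test_staging is a hypothetical table name.
staging_hql() {
  cat <<'EOF'
CREATE TABLE IF NOT EXISTS test_staging (a STRING, id INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

LOAD DATA LOCAL INPATH '/root/test' INTO TABLE test_staging;

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column (id) must come last in the SELECT list.
INSERT INTO TABLE test PARTITION (id)
SELECT a, id FROM test_staging;
EOF
}

# Print the script; redirect it to a file and run it with beeline -f or hive -f.
staging_hql
```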

LOAD DATA query error

What is the problem with this line?
$load ="LOAD DATA INFILE $inputFile INTO TABLE $tableName FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n' IGNORE 1 LINES";
echo $load;
mysql_query($load);
The echo result is:
LOAD DATA INFILE appendpb.csv INTO TABLE appendpb_csv FIELDS TERMINATED BY ',' LINES TERMINATED BY ' ' IGNORE 1 LINES
The error is:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'appendpb.csv INTO TABLE appendpb_csv FIELDS TERMINATED BY ',' LINES TERMINATED B' at line 1

According to the MySQL LOAD DATA reference, the input file name must be enclosed in single quotes:
$load ="LOAD DATA INFILE '$inputFile' INTO TABLE $tableName FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n' IGNORE 1 LINES";
The resulting statement then looks like this:
LOAD DATA INFILE 'appendpb.csv' INTO TABLE appendpb_csv FIELDS TERMINATED BY ',' LINES TERMINATED BY ' ' IGNORE 1 LINES
This assumes that the path to the file is correct.
