I have a simple, working beeline query (below). It runs fine from my script, but I want to pass the output path as a hivevar. When I put the value in my script's .properties file as ='path', it does not seem to work. I think I am missing something with the single quotes, and I can't get it to work.
maxValQuery.hql
WORKING: INSERT OVERWRITE DIRECTORY '/user/tmp/maxVal' select max(${hivevar:MAX_VAL_COL}) from ${hivevar:FACT_TABLE};
WANTED: INSERT OVERWRITE DIRECTORY ${hivevar:PATH_ON_HDFS} select max(${hivevar:MAX_VAL_COL}) from ${hivevar:FACT_TABLE};
script.sh
#! /bin/bash
# I want to add --hivevar PATH_ON_HDFS=${maxValPathOnHDFS}
beeline \
-u $hiveServer2 \
--hivevar DATABASE_NAME_ON_HIVE=${dbNameOnHive} \
--hivevar FACT_TABLE=${mainFactTableOnHive} \
--hivevar MAX_VAL_COL=${factTableIncrementalColumn} \
-f ${maxValQueryFile}
script.properties
dbNameOnHive=poc
mainFactTableOnHive=factTable
factTableIncrementalColumn=aTimeColumn
maxValQueryFile=maxValQuery.hql
#maxValPathOnHDFS='/user/tmp/maxVal'
# I believe the problem is the single quotes on the line above (yes, I uncomment it when I execute).
Update: I removed the single quotes from the properties file and put them around the hivevar in the query instead:
script.properties: maxValPathOnHDFS=/user/tmp/maxVal
maxValQuery.hql: '${hivevar:PATH_ON_HDFS}'
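One possible explanation for the behaviour, sketched under the assumption that script.sh loads script.properties by sourcing it (the question does not show that step): the shell consumes the single quotes while reading the file, so beeline only ever receives the bare path, and the HQL has to supply the quotes itself. The temporary file below stands in for script.properties and is purely illustrative.

```shell
#!/bin/bash
# Assumption: script.properties is loaded with "." (source); not shown in the question.
props_file=$(mktemp)
printf '%s\n' "maxValPathOnHDFS='/user/tmp/maxVal'" > "$props_file"

# Load the properties the way a wrapper script typically would.
. "$props_file"
rm -f "$props_file"

# The shell already removed the single quotes, so the hivevar holds a bare path.
hivevar_arg="PATH_ON_HDFS=${maxValPathOnHDFS}"
printf 'beeline would receive: --hivevar %s\n' "$hivevar_arg"
```

With a bare path in the variable, writing '${hivevar:PATH_ON_HDFS}' in the query gives Hive the quoted string literal that INSERT OVERWRITE DIRECTORY expects.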
Related
I'm trying to read a CSV file and write its contents into a table. The CSV file is located on my local machine (the client). I used the \copy command and it worked, but I hardcoded the file path in the SQL script. I want to parameterize the CSV file path.
Based on my analysis, \copy does not support :variable substitution, but I am not sure.
I believe this can be achieved using shell variables, but when I tried that, it did not work as expected.
My sample scripts follow.
command:
psql -U postgres -h localhost testdb -a -f '/tmp/psql.sql' -v path='"/tmp/userData.csv"'
psql script:
\copy test_user_table('username','dob') from :path DELIMITER ',' CSV HEADER;
I am executing these commands from the shell and I get a "no such file" error. The same script works with a hardcoded path.
Can anyone advise me on this?
Reference :
Variable substitution in psql \copy
https://www.postgresql.org/docs/devel/app-psql.html
I am new to Bash, and this problem is hard for me, so for now I can only do it in a single shell script. Maybe later I can split it into two scripts.
The following is a simple one-script version.
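#!/bin/bash
# Wrap the path in single quotes so SQL sees a string literal.
p="'/mnt/c/Users/JIAN HE/Desktop/test.csv'"
# Note: COPY reads the file on the server; use \copy for a client-side file.
c="copy emp from ${p}"
echo "$c"
psql -U postgres -d postgres -c "${c}"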
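Adapting the same idea to the client-side \copy from the question: psql does not interpolate :variables in the arguments of \copy, so one workaround is to let the shell build the whole \copy line and pass it with -c. This is only a sketch; the table, columns, and path are copied from the question, the column names are written as plain identifiers, and the psql call is commented out because it needs a reachable server.

```shell
#!/bin/bash
# Values taken from the question for illustration.
csv_path="/tmp/userData.csv"

# Build the complete \copy line in the shell; "\\" yields a single backslash.
copy_cmd="\\copy test_user_table(username, dob) from '${csv_path}' DELIMITER ',' CSV HEADER"

# Inspect the command before running it.
printf '%s\n' "$copy_cmd"

# To execute against a real database:
# psql -U postgres -h localhost testdb -c "$copy_cmd"
```

A path that itself contains a single quote would need extra escaping, which this sketch does not handle.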
I have a Hive script that executes some DML statements, drops some tables, and runs shell commands to delete files. I run the script with hive -f myscript.hql.
From within the script I need to remove files from a local directory. I tried !rm /home/myuser/temp_table_id_*; but it throws an error:
rm: cannot remove ‘/home/myuser/temp_table_id_*’: No such file or directory
Command failed with exit code = 1
The * wildcard is not working.
Here is a sample script:
--My HQL File--
INSERT OVERWRITE ....
...
..;
DROP TABLE TEMP_TABLE;
!hadoop fs -rm -r /user/myuser/ext_tables/temp_table;
!rm /home/myuser/temp_table_id_*;
CREATE TABLE NEW_TABLE(
....
...
;
I am calling the script with the command: hive -f myscript.hql
The script runs fine until it reaches the line !rm /home/myuser/temp_table_id_*; where it complains about the *.
When I provide individual file names instead of the *, it works.
But I want to use *.
Try
dfs -rm /home/myuser/temp_table_id_*;
in the HQL. Wildcards work with Hive dfs commands.
From the Hive docs:
dfs <dfs command> - Executes a dfs command from the Hive shell.
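A likely explanation for the original error (my understanding, not something stated in the thread) is that Hive's ! command hands the line to the operating system without a shell in between, so nothing expands the *. The sketch below reproduces the same symptom in plain bash by quoting the pattern, then shows the glob working once a shell expands it. The temporary directory and file names are made up for the demonstration.

```shell
#!/bin/bash
# Scratch directory with two files matching the pattern from the question.
workdir=$(mktemp -d)
touch "$workdir/temp_table_id_1" "$workdir/temp_table_id_2"

# Quoted pattern: rm receives a literal '*', as it would with no shell to expand it.
if rm "$workdir/temp_table_id_*" 2>/dev/null; then
  literal_result="removed"
else
  literal_result="failed"
fi

# Unquoted pattern: the shell expands the glob before rm runs.
rm "$workdir"/temp_table_id_*
remaining=$(ls -A "$workdir" | wc -l | tr -d ' ')

echo "literal pattern: $literal_result, files left after glob: $remaining"
rmdir "$workdir"
```

Note that dfs commands operate on Hadoop filesystem paths, so whether dfs -rm reaches a local directory depends on the cluster's default filesystem.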
I am trying to execute a beeline hql file with the following contents.
INSERT OVERWRITE DIRECTORY "${hadoop_temp_output_dir}${file_pattern}${business_date}" select data from database.${table}
I am executing the script using the following command:
beeline -u "jdbc:hive2://svr.us.XXXX.net:10000/;principal=hive/svr.us.XXXX.net@NAEAST.COM" --hivevar hadoop_temp_output_dir=/tenants/demo/hive/database/ --hivevar file_pattern=sales --hivevar business_date=20180709 -f beeline_test.hql
The variables are not being substituted when the script is executed in Hive. What mistake did I make here?
Also, how do I set up an init.hql (for all configurations) and execute this HQL file?
EDIT: I got the answer: I used double quotes for the variables and corrected a few typos.
I am new to Hive and wanted to know how to execute Hive commands directly from an .hql file.
As mentioned by @rajshukla4696, both hive -f filename.hql and beeline -f filename will work.
You can also execute queries from the command line via "-e":
hive -e "select * from my_table"
There are plenty of useful command line arguments for Hive that can be found here: Hive Command line Options
hive -f filepath;
Example: hive -f /home/Nitin/Desktop/Hive/script1.hql;
Use hive -f filename.hql;
Remember to terminate your command with ;
I am trying to pass command line arguments with the command below, but it's not working. Can anybody help me see what I am doing wrong here?
hive -f test2.hql -hiveconf partition=20170117 -hiveconf -hiveconf datepartition=20170120
Pass your arguments before the query file (and note that your command repeats -hiveconf twice in a row):
hive --hiveconf partition='20170117' --hiveconf datepartition='20170120' -f test2.hql
Then use them in the queries in test2.hql like this:
${hiveconf:partition}
Example:
select * from tablename where partition=${hiveconf:partition} and date=${hiveconf:datepartition}
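When the values come from shell variables, one way to keep the flags tidy is to collect them in a bash array, with exactly one --hiveconf per key=value pair. This is a sketch only: the variable values are hypothetical, and the command is printed rather than executed so it can be checked first.

```shell
#!/bin/bash
# Hypothetical values for illustration; the keys mirror the example above.
partition=20170117
datepartition=20170120

# One --hiveconf flag per key=value pair, followed by the script file.
hive_args=(
  --hiveconf "partition=${partition}"
  --hiveconf "datepartition=${datepartition}"
  -f test2.hql
)

# Print the command for inspection; drop "echo" to actually run it.
echo hive "${hive_args[@]}"
```

Keeping each pair as its own array element avoids the kind of stray or duplicated flag seen in the question's command.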
Some alternatives:
1) If using the hive command line, you can build the whole SQL command with the parameters written out as literals and execute it like:
hive -e <command>
2) If using beeline (preferred over hive), just append this to the command line:
--hivevar myparam='myvalue'