I want to execute a select statement based on a region check. If the region value is HK, the table should be created from temp.temp1; otherwise it should be created from temp.temp2.
eg:
beeline -e "
if [ '$REGION' == 'HK' ]
then
Create table region as Select * from temp.temp1;
else
Create table region as Select * from temp.temp2;
fi
"
Is there any possible way to do it?
Hive itself does not support if-else statements, but there is the HPL/SQL procedural extension, which may be useful in your case.
That said, I'd suggest a slightly different approach: if the $REGION variable comes from outside beeline and the two tables' schemas match, you can union the results, giving each branch the corresponding where clause:
create table region as
select *
from temp.temp1
where '$REGION' == 'HK'
union all
select *
from temp.temp2
where '$REGION' != 'HK'
When Hive builds the execution plan it drops the union branch whose condition is always false, so the extra branch won't affect the real execution time.
Yes, Hive itself doesn't support if-else statements. What I have implemented for now is:
if [ "$REGION" = "HK" ]
then
beeline -e "Select * from temp.temp1;"
else
beeline -e "Select * from temp.temp2;"
fi
I know this is repetitive, but for now this is what we have implemented to execute the queries for different regions/blocks.
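One way to cut the repetition, sketched under the assumption that only the source table differs between regions: pick the table name in the shell first, then call beeline once. The function names source_table_for and region_query are made up for this illustration, and the HK default for REGION is only there so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Map a region code to its source table (table names taken from the question).
source_table_for() {
  if [ "$1" = "HK" ]; then
    echo "temp.temp1"
  else
    echo "temp.temp2"
  fi
}

# Build the single statement to hand to beeline.
region_query() {
  printf 'create table region as select * from %s;\n' "$(source_table_for "$1")"
}

# REGION is assumed to be set by the caller; HK is only a demo default.
REGION="${REGION:-HK}"
region_query "$REGION"

# To run it for real (connection options omitted):
#   beeline -e "$(region_query "$REGION")"
```

The beeline call is left as a comment so the sketch can be tried without a Hive connection; the if-else now lives in one place and each new region only needs another branch in source_table_for.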
Related
I am trying to accomplish a fairly simple goal: react to the possible error of a previous command with a second command. The wrench in the spokes is the need to use a heredoc syntax in the first command.
This (trivialized) example would produce the result I want to catch:
psql -c "select * from table_that_doesnt_exist" || echo "error"
Except that the SQL I need to execute consists of multiple commands and, in my circumstances, I must do this with heredocs:
psql << SQL
select * from good_table;
select * from table_that_doesnt_exist
SQL
When I try to react to the error from this kind of setup (I've tried a million ways), I cannot seem to figure it out. These kinds of methods do not work:
( psql << SQL
select * from good_table;
select * from table_that_doesnt_exist
SQL
) || echo "error"
or
psql << SQL || echo "error"
select * from good_table;
select * from table_that_doesnt_exist
SQL
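A note on the attempts above: the form "cmd << SQL || echo error" is valid shell, and the echo fires whenever the command exits non-zero. As far as I know, psql reading a script from stdin carries on after a failed statement and still exits 0 unless the ON_ERROR_STOP variable is set, in which case it exits with status 3. The sketch below shows the pattern only; sql_client is a hypothetical stand-in for "psql -v ON_ERROR_STOP=1" so that the snippet runs without a database.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `psql -v ON_ERROR_STOP=1`: reads SQL on stdin and
# fails when a statement references the missing table.
sql_client() {
  if grep -q 'table_that_doesnt_exist'; then
    return 3   # mirrors psql's exit status for a script error with ON_ERROR_STOP set
  fi
  return 0
}

# Feed a two-statement batch through a heredoc and react to a non-zero exit.
run_batch() {
  sql_client <<SQL || echo "error"
select * from good_table;
select * from $1
SQL
}

run_batch good_table                # prints nothing
run_batch table_that_doesnt_exist   # prints "error"
```

With a real database the same shape would be "psql -v ON_ERROR_STOP=1 << SQL || echo error", assuming the connection options are supplied elsewhere.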
I am trying to solve the following problem: if all categories of the source table are present in the target, truncate and load the target table; otherwise do nothing.
I haven't found a solution using Hive alone and ended up using a shell script as well.
Is it possible to avoid the shell script?
Current Approach:
create_ind_table.hql:
create temporary table temp.master_source_join
as select case when source.program_type_cd=master.program_type_cd then 1 else 0 end as IND
from source left join master
on source.program_type_cd=master.program_type_cd;
-- ind will be 1 if all the categories from source are present in master, else 0
drop table if exists temp.indicator;
create table temp.indicator
as select min(ind)*max(ind) as ind from temp.master_source_join;
And the following is the script I call to truncate and load the master table if all the source table categories are present in master.
tuncate_load_master.sh
beeline_cmd="beeline -u 'jdbc:hive2://abc.com:2181,abc1.com:2181,abc2.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2' --showHeader=false --silent=true"
${beeline_cmd} -f create_ind_table.hql
## If the indicator is 1, all source categories are present in master; otherwise they are not.
a=$(${beeline_cmd} -e "select ind from temp.indicator;")
## Strip the table borders (-, + and |) from the beeline output.
temp=$(echo $a | sed -e 's/-//g' -e 's/+//g' -e 's/|//g')
echo $temp
if [ ${temp} -eq 1 ]
then
echo "truncate and load the target table"
${beeline_cmd} -e "insert overwrite table temp.master select * from temp.source;"
else
echo "nothing to load"
fi
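The border-stripping step in the script above can be made a little more defensive. This is a minimal sketch, assuming beeline prints the value inside its usual ASCII table; extract_indicator is a name invented here, and the sample text is hand-written rather than captured from a real run. Comparing as a string also avoids the "integer expression expected" error when the output is empty.

```shell
#!/usr/bin/env bash
# Keep only digits and newlines, then return the first line that has a digit.
extract_indicator() {
  printf '%s\n' "$1" | tr -cd '0-9\n' | grep -m1 '[0-9]'
}

# Hand-written sample of table-formatted output (an assumption, not real output).
sample='+------+
| 1    |
+------+'

ind="$(extract_indicator "$sample")"
if [ "$ind" = "1" ]; then
  echo "truncate and load the target table"
else
  echo "nothing to load"
fi
```

Beeline also has an --outputformat option (for example tsv2 or csv2) that prints values without the table borders, which may make the clean-up unnecessary; worth checking against your version.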
A query with dynamic partitioning overwrites only the partitions that exist in the source dataset. Add a dummy partition to your table, like in this answer: https://stackoverflow.com/a/47505850/2700344
You can calculate your flag using analytic min() in the same subquery and filter by it.
The calculated IND will be the same for all rows returned, and analytic min() alone seems to be enough, so there is no need to calculate max(). Filter by IND=1. The query returns no rows if min() over() = 0 and therefore does not overwrite the table.
--enable dynamic partitioning
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table temp.master PARTITION(dummy_part)
select s.col1, s.col2, --list all columns here you need to insert
'dummy_value' as dummy_part --dummy partition column
from
(
select s.*,
min(case when s.program_type_cd=m.program_type_cd then 1 else 0 end ) over() as IND
from source s left join master m
on s.program_type_cd=m.program_type_cd
)s where ind=1 --filter will not return rows if min=0
Wondering if there is a way to validate a query before executing it.
Is there a way to check/validate a query without executing it?
One way that we validate SQL is to add a condition to the SQL that could never be true.
Example:
long ll_rc
long ll_result
string ls_sql, ls_test
string ls_message
//Arbitrary SQL
ls_sql = "SELECT * FROM DUAL"
//This SQL when executed will always return 0 if successful.
ls_test = "select count(*) from ( " + ls_sql + " WHERE 1 = 2 )"
DECLARE l_cursor DYNAMIC CURSOR FOR SQLSA ;
PREPARE SQLSA FROM :ls_test;
OPEN DYNAMIC l_cursor;
ll_rc = SQLCA.SQLCODE
choose case ll_rc
case 0
//Success
ls_message = "SQL is properly formed"
case 100
//Fetched row not found. This should not be the case since we only opened the cursor
ls_message = SQLCA.SQLERRTEXT
case -1
//Error; the statement failed. Use SQLErrText or SQLDBCode to obtain the detail.
ls_message = SQLCA.SQLERRTEXT
end choose
CLOSE l_cursor ; //This will fail if open cursor failed.
messagebox( "Result", ls_message )
Note: If your SQL is VERY complicated, which I suspect it isn't, the database optimizer may take several seconds to prepare your SQL. It will be significantly less time than if you run the entire query.
Since the database is the final arbiter of what is "valid" (table and column names and such), the general answer is no. You could come up with a class in PB which checks statement syntax, object names, etc., so you wouldn't have to touch the db, but it would be obsolete as soon as any changes were made to the db.
Put the select statement in any script and compile it. Part of the work will be to check the SQL syntax against the database you are connected to.
Watch out: you need at least one bound variable in the column list of your SQL statement. This is not the case for other DML statements.
Example:
in my case:
select noms into :ls_ttt from contacts;
results in a message Unknown columns 'noms' in 'field list'.
However,
select nom into :ls_ttt from contacts;
does not show any error.
Hope this helps.
Add the following filter on a column in an SAP HANA analytic view using an IF statement
if(Col1='a') col2=Col2
else if(Col2='b') col2=col2*1
Can someone give me the syntax of a HANA IF statement for this logic?
Why not use the documentation in the first place?
It is not really clear what you are trying to do here. It looks like you are calculating something using col2 based on a comparison on col1. As a view will not allow you to update the value in the column, you will need to create col3 and put the following there:
if("Col1" = 'a',"Col2", if("Col1" = 'b',"Col2" * 1,'not a not b') )
BTW, do you think col2=col2*1 makes any sense?
Is it possible that you (or Shidai) are confusing the IF function with the IF statement? They work differently:
SELECT IF("Col1"=='a', 'aaahhh', 'uhhhhh') FROM DUMMY;
This works just like in Excel: if Col1 is 'a' then the first value is returned, otherwise the second.
DECLARE x VARCHAR(100);
IF "Col1"='a'
THEN
x := "Col2";
ELSEIF "Col2"='b'
THEN
x := "Col2" * 1;
END IF;
This is a control structure and only allowed in a SQLScript block, e.g. a stored procedure or anonymous block. You cannot use it in a simple SELECT statement.
It's not so clear what you are trying to do with assigning to col2, so I used x instead.
Also note:
HANA identifiers are case-sensitive, and unquoted names are converted to upper case. If you want to use the column Col1 you must write "Col1".
There is also CASE, which works similarly to the IF function.
I usually run these queries every day:
select * from table1
where name = 'PAUL';
select * from table2
where id_user = 012345;
select * from table25
where name = 'PAUL';
select * from table99
where name = 'PAUL';
select * from table28
where id_user = 012345;
.
.
.
I run these every day. Today it is 15 tables, tomorrow maybe 20; the only thing that changes is the name or id of the user to search for. I have to put the user name or id into each query by hand, so I need a way to declare a variable once and assign the name and id to it.
.
.
Is there a way to simplify these queries and make them better? Such as:
DECLARE
l_name table1.name %type;
l_id_user table28.id_user %type;
BEGIN
l_id_user := 012345;
l_name := 'PAUL';
select * from table1
where name in ('l_name');
select * from table2
where id_user in (l_id_user);
.
.
.
END;
I tried it that way but it fails. I need this because in most cases I have to look at 20 tables or more.
If what you want is an easy way to repeatedly run a sequence of select statements in SQL*Plus, but with varying criteria, you can do it like this:
ACCEPT l_name PROMPT "Input user name: "
ACCEPT l_id_user PROMPT "Input user id: "
select * from table1
where name = '&l_name';
select * from table2
where id_user = &l_id_user;
select * from table25
where name = '&l_name';
select * from table99
where name = '&l_name';
select * from table28
where id_user = &l_id_user;
In a PL/SQL block (declare ... begin ... end;), select statements behave differently from when you run them standalone in SQL*Plus:
They won't automatically print output.
You must use e.g. a FOR LOOP or a variation of the SELECT INTO syntax to capture the result for subsequent processing. You should seek some documentation on pl/sql programming.
You should understand that pl/sql is definitely not designed for screen output.
It is possible to print out string values using dbms_output.put_line, but you are completely on your own with regards to formatting a row of values into a suitable string.
If your result contains more than a few columns, formatting for screen output becomes very cumbersome, as you won't have any formatting utilities like the Java String.format() to assist you.
And if your result contains more than one row you have to provide some sort of loop construct allowing you to print out the individual rows.
Conclusion:
These are the drawbacks of PL/SQL for this task. I really don't see any benefits unless your aim is to apply some logic to the data being read rather than just printing it to the screen or a file.