hive query using regular expression - hadoop

Hi, I am looking for a way to query a Hive table (user_acc_detl)
where the data in one column (ACC_DETAILS) looks like this:
COUNTRY[0]_united staes~DATE[0]_6/10/2014~AMOUNT[0]_200~ID[0]_20140509065052159324~COUNTRY[1]_united kingdom~DATE[1]_6/17/2014~AMOUNT[1]_125~ID[1]_20140516075156389761~COUNTRY[2]_Canada~DATE[2]_6/26/2014~AMOUNT[2]_200~ID[2]_20140515094013444121~COUNTRY[3]_Mexico~DATE[3]_7/3/2014~AMOUNT[3]_1200~ID[3]_20140601000937914898
I can query the Hive table with:
select ACC_DETAILS["COUNTRY[0]"] as COUNTRY, ACC_DETAILS["DATE[0]"] as DATE, ACC_DETAILS["AMOUNT[0]"] as BILLAMOUNT, ACC_DETAILS["ID[0]"] as PAYMENTID
from user_acc_detl
The query above returns the data for COUNTRY[0], DATE[0], AMOUNT[0], and ID[0], which is fine.
Question: I need to query using just country, date, amount, and so on, without specifying an index such as COUNTRY[0].
Question: is there a way to modify the query accordingly using a regular expression? Please help.

A simple way to achieve this is to wrap your query in a view:
CREATE VIEW user_acc_detl_simple AS
SELECT ACC_DETAILS["COUNTRY[0]"] as COUNTRY
, ACC_DETAILS["DATE[0]"] as DATE
, ACC_DETAILS["AMOUNT[0]"] as BILLAMOUNT
, ACC_DETAILS["ID[0]"] as PAYMENTID
FROM user_acc_detl;
SELECT country, date, billamount, paymentid FROM user_acc_detl_simple;
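The view above only exposes index 0. To illustrate how a regular expression can pull every NAME[index]_value entry out of a string shaped like the sample, here is a minimal Python sketch. It runs outside Hive, assumes the value is available as the "~"-separated text shown in the question, and the names `FIELD` and `parse_acc_details` are made up for this example.

```python
import re
from collections import defaultdict

# Sample value copied from the question (shortened to two entries).
acc_details = (
    "COUNTRY[0]_united staes~DATE[0]_6/10/2014~AMOUNT[0]_200~"
    "ID[0]_20140509065052159324~COUNTRY[1]_united kingdom~"
    "DATE[1]_6/17/2014~AMOUNT[1]_125~ID[1]_20140516075156389761"
)

# One entry looks like NAME[index]_value; entries are separated by "~".
FIELD = re.compile(r"([A-Z]+)\[(\d+)\]_([^~]*)")


def parse_acc_details(text):
    """Group NAME[i]_value entries into one dict per index i."""
    rows = defaultdict(dict)
    for name, index, value in FIELD.findall(text):
        rows[int(index)][name] = value
    # Return the rows ordered by their index.
    return [rows[i] for i in sorted(rows)]


records = parse_acc_details(acc_details)
for record in records:
    print(record["COUNTRY"], record["DATE"], record["AMOUNT"], record["ID"])
```

Each dict can then be read by plain field name (COUNTRY, DATE, AMOUNT, ID) without repeating the index.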

Related

Is it possible to evaluate SQL given as text? (Oracle APEX 21.1)

I have created a classic report region (REGION: REPORT_FILTER_SHOP_TYPE).
It uses the SQL below.
SELECT
ID, SHOP_NAME, SHOP_TYPE, OPEN_YEAR, CITY
FROM SHOP_LIST;
I want to apply a filter to this table. The filter criteria will be selected from list items, and this page has several lists.
For example, if no filter is selected, the SQL above is used as is. But if "SHOP_TYPE" and "OPEN_YEAR" are selected, the SQL below should be executed.
SELECT * FROM (
SELECT
ID, SHOP_NAME, SHOP_TYPE, OPEN_YEAR, CITY
FROM SHOP_LIST
) S
WHERE S.SHOP_TYPE = 'BOOKSTORE' AND S.OPEN_YEAR <2010;
I can already compose the SQL text from the selected list items.
What do I need to set to display this result in REPORT_FILTER_SHOP_TYPE?
Well, most probably not like that; why have parameters on a page if you hardcode values into the report's query? Use parameters!
Something like this:
SELECT id,
shop_name,
shop_type,
open_year
FROM shop_list
WHERE ( shop_type = :P1_SHOP_TYPE
OR :P1_SHOP_TYPE IS NULL)
AND ( open_year < :P1_OPEN_YEAR
OR :P1_OPEN_YEAR IS NULL);
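The "(column = :param OR :param IS NULL)" pattern in the query above makes each filter optional: a NULL parameter switches that condition off. The sketch below tries the same pattern with Python's built-in sqlite3 module rather than Oracle APEX; the table contents and the helper `filter_shops` are hypothetical, chosen only to show the behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shop_list "
    "(id INTEGER, shop_name TEXT, shop_type TEXT, open_year INTEGER)"
)
# Hypothetical sample rows, not taken from the question.
conn.executemany(
    "INSERT INTO shop_list VALUES (?, ?, ?, ?)",
    [
        (1, "Old Pages", "BOOKSTORE", 2005),
        (2, "New Pages", "BOOKSTORE", 2015),
        (3, "Bean There", "CAFE", 2008),
    ],
)

# Same shape as the answer's query, with sqlite3 named parameters.
QUERY = """
SELECT id, shop_name, shop_type, open_year
FROM shop_list
WHERE (shop_type = :shop_type OR :shop_type IS NULL)
  AND (open_year < :open_year OR :open_year IS NULL)
ORDER BY id
"""


def filter_shops(shop_type=None, open_year=None):
    """Return matching shop ids; a None parameter disables that filter."""
    params = {"shop_type": shop_type, "open_year": open_year}
    return [row[0] for row in conn.execute(QUERY, params)]


print(filter_shops())                   # no filter selected
print(filter_shops("BOOKSTORE"))        # only the shop type filter
print(filter_shops("BOOKSTORE", 2010))  # both filters
```

The query text never changes; only the bound values do, so there is no need to build SQL strings from the selected list items.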

Hive join without common field

I have the following tables:
Table1:
user_name Url
Rahul www.cric.info.com
ranbir www.rogby.com
sahil www.google.com
banit www.yahoo.com
Table2:
Keyword category
cric sports
footbal sports
google search
I want to search Table1 by matching the keywords in Table2. I can do this with a case statement and the query works, but it is not the right approach because I have to add another case branch every time I add a new search keyword.
select user_name,
case when url like '%cric%' then 'sports'
else 'undefined'
end as category
from table1;
Thanks, I found a solution for this. First we need to do the join, and after that we need to filter the records.
select user_name, url, keyword, category
from (select table1.user_name, table1.url, table2.keyword, table2.category from table1 left outer join table2) a
where a.url like concat('%', a.keyword, '%');
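To make the join-then-filter logic concrete, here is a small Python sketch of the same idea on the sample rows from the question: pair every user with every keyword, then keep the pairs where the URL contains the keyword. The function name `categorize` is made up for illustration, and this is a model of the logic, not of how Hive executes it.

```python
# Rows copied from the two tables in the question.
table1 = [
    ("Rahul", "www.cric.info.com"),
    ("ranbir", "www.rogby.com"),
    ("sahil", "www.google.com"),
    ("banit", "www.yahoo.com"),
]
table2 = [("cric", "sports"), ("footbal", "sports"), ("google", "search")]


def categorize(users, keywords):
    """Pair every user with every keyword, keep pairs where the URL contains it."""
    return [
        (user_name, url, keyword, category)
        for user_name, url in users
        for keyword, category in keywords
        if keyword in url  # same test as: url LIKE concat('%', keyword, '%')
    ]


matches = categorize(table1, table2)
for row in matches:
    print(row)
```

Note that every user is compared against every keyword, so the work grows with the product of the two table sizes.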
I'm not sure about more current versions, but I've run into a similar problem. The primary issue is that Hive only supports equi-joins; when you apply logic to either side of the join, it has difficulty translating the condition into a MapReduce job.
An alternative, if the field is reliably structured, is to derive a matching key from the larger field. For example, if you know the keyword always sits in the second position of a dot-delimited URL, you could do something like:
select
Table1.url
, split(Table1.url, "\\.")[1] as matchKey
, Table2.category
from
Table1
join Table2 on Table2.keyword = split(Table1.url, "\\.")[1]
;
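The matching-key idea can be sketched in Python as follows: derive the key from a fixed position in the URL, then do an exact lookup, which is what an equi-join amounts to. This assumes, as the answer does, that the keyword always sits in the second dot-delimited position; `match_key` and `join_on_key` are hypothetical helper names.

```python
# Rows copied from the two tables in the question.
table1 = [
    ("Rahul", "www.cric.info.com"),
    ("ranbir", "www.rogby.com"),
    ("sahil", "www.google.com"),
    ("banit", "www.yahoo.com"),
]
keyword_to_category = {"cric": "sports", "footbal": "sports", "google": "search"}


def match_key(url):
    """Return the second dot-delimited part of the URL, or None if absent."""
    parts = url.split(".")
    return parts[1] if len(parts) > 1 else None


def join_on_key(users, categories):
    """Equi-join: keep users whose derived key equals a known keyword."""
    joined = []
    for user_name, url in users:
        key = match_key(url)
        if key in categories:
            joined.append((user_name, url, categories[key]))
    return joined


joined = join_on_key(table1, keyword_to_category)
for row in joined:
    print(row)
```

Unlike a substring test, this only matches when the whole second segment equals the keyword, so a URL such as www.cricket.com would not match "cric".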

Is there a Hive equivalent of SQL “LIKE ANY ( SUBQUERY )”

Hive doesn't support the multi-value LIKE queries that are supported in SQL, for example:
SELECT * FROM user_table WHERE first_name LIKE ANY ( 'root~%' , 'user~%' );
We can convert this into an equivalent Hive query:
SELECT * FROM user_table WHERE first_name LIKE 'root~%' OR first_name LIKE 'user~%'
Does anyone know an equivalent that Hive supports when a sub-query is used with LIKE? Have a look at the example below:
SELECT * FROM user_table WHERE first_name LIKE ANY ( SELECT expr FROM exprTable);
Because the values are not known when the query is written, I can't use the same approach of generating multiple LIKE expressions separated by OR / AND. Initially I thought of writing a Hive UDF for it. Can you please help me support such an expression and find a Hive equivalent?
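You can use Hive's RLIKE operator with alternation, as shown below. RLIKE matches anywhere in the string, so anchor the pattern with ^ to match only at the start, as LIKE 'root~%' does:
SELECT * FROM user_table WHERE first_name RLIKE '^(root~|user~|admin~)';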
Hope this helps!
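If the LIKE patterns already exist as a list, one way to get a single RLIKE pattern is to translate each of them and join them with "|". The Python sketch below does that under simplifying assumptions: it handles only the % and _ wildcards, ignores LIKE escape characters, and uses Python's re module to check the result, which may differ in details from the Java regular expressions Hive uses. `like_to_regex` and `any_like_regex` are made-up helper names.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% and _ only) into a regex fragment."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "".join(parts)


def any_like_regex(patterns):
    """Combine several LIKE patterns into one anchored alternation."""
    return "^(" + "|".join(like_to_regex(p) for p in patterns) + ")$"


regex = any_like_regex(["root~%", "user~%"])
print(regex)
print(bool(re.search(regex, "root~admin")))   # starts with root~
print(bool(re.search(regex, "xroot~admin")))  # root~ is not at the start
```

The anchors ^ and $ matter because LIKE must match the whole value, while a regex search would otherwise succeed on any substring.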
This is a case involving theta joins in Hive. There is a wiki page for this and a JIRA request. Please go through the details on this page: https://cwiki.apache.org/confluence/display/Hive/Theta+Join
Your case is similar to the Side-Table Similarity case described on that page.
You need to convert the expr values into a map and then use a regular expression to do the LIKE matching. Alternatively, you can use UNION ALL with each LIKE expression in a separate query; the query might become tedious to write by hand, so you can generate it programmatically.
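Generating the query programmatically could look like the Python sketch below. For brevity it builds the OR form from the question rather than UNION ALL, and it assumes the patterns have already been read from exprTable into a list. `build_like_any` is a hypothetical helper; it simply rejects patterns containing quotes or backslashes instead of guessing at Hive's escaping rules.

```python
def build_like_any(column, patterns):
    """Build "col LIKE 'p1' OR col LIKE 'p2' ..." from a list of patterns."""
    if not patterns:
        raise ValueError("at least one pattern is required")
    clauses = []
    for pattern in patterns:
        # Refuse anything that would need escaping inside a string literal.
        if "'" in pattern or "\\" in pattern:
            raise ValueError(f"unsupported character in pattern: {pattern!r}")
        clauses.append(f"{column} LIKE '{pattern}'")
    return " OR ".join(clauses)


# In practice these values would come from: SELECT expr FROM exprTable
patterns = ["root~%", "user~%"]
query = "SELECT * FROM user_table WHERE " + build_like_any("first_name", patterns)
print(query)
```

The generated text is the same query the question wrote by hand for two fixed patterns.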
What about using EXISTS?
SELECT * FROM user_table WHERE EXISTS ( SELECT * FROM exprTable WHERE first_name LIKE expr );

creating a hiveQL query that uses UDF function that can return column names

I want to create a Hive UDF that returns specific column names based on some value, say retreivecol(age). If the age is 20, it should return the list of column names to be used in the select query, like 'name,email,fbuserid,friendslist', and if the age is less than 20 it should return 'name' alone. So I want my HiveQL query to look like:
select retreivecol(age) from User_Data;
The above query just prints the names of the columns, like 'name,email,fbuserid,friendslist', as opposed to treating them as column names and selecting those columns. Any pointers are appreciated.
I'm not sure a UDF is the right place to do this, as UDFs simply see the values passed to them; they don't have access to the whole table structure.
Instead, could you do this with a nested query?
select name,email,id FROM
(
select
name,
if(age >= 20, email, NULL) as email,
if(age >= 20, id, NULL) as id
FROM mytable
) a;

Ruby OCI8 DBI, how to check query generated after parameter binding? need to check for "in" queries

While using Ruby-DBI, I am facing issues with parameter binding for "in" clauses in where conditions.
Two questions:
How do I get the SQL generated after parameter binding?
Does an "in" parameter work properly when using DBI with OCI8?
My code looks like this:
dbh = DBI.connect(setting[:tns], setting[:username], setting[:password])
# date and in_params are the parameters to the SQL query.
# In the query they appear as ? (question mark) placeholders.
sth = dbh.execute(File.read('import_values.sql'), date, in_params)
The query looks like this:
SELECT date, col1, col2
FROM TABLEX
WHERE date = ?
AND col1 not in ( ? )
Please help.
I refactored the code so that it no longer uses "in".
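For readers who do need the "in" clause: a placeholder generally binds one scalar value, so a list needs one ? per element. The sketch below shows that expansion with Python's built-in sqlite3 module, not Ruby DBI or OCI8, so treat it as an illustration of the idea only. The table contents, the column name `day` (used in place of `date`), and the helper `rows_excluding` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablex (day TEXT, col1 TEXT, col2 TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO tablex VALUES (?, ?, ?)",
    [
        ("2014-06-10", "a", "x"),
        ("2014-06-10", "b", "y"),
        ("2014-06-10", "c", "z"),
        ("2014-06-11", "a", "w"),
    ],
)


def rows_excluding(day, excluded):
    """Select col1 for one day, skipping the values listed in excluded."""
    if not excluded:
        raise ValueError("excluded must contain at least one value")
    # One "?" per list element: NOT IN (?, ?, ...)
    placeholders = ", ".join("?" for _ in excluded)
    sql = (
        "SELECT col1 FROM tablex "
        f"WHERE day = ? AND col1 NOT IN ({placeholders}) ORDER BY col1"
    )
    print(sql)  # shows the statement text that is sent, before binding
    return [row[0] for row in conn.execute(sql, [day, *excluded])]


result = rows_excluding("2014-06-10", ["a", "b"])
print(result)
```

Printing the statement shows the placeholders rather than the bound values, because the values are sent to the database separately from the SQL text.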
