Hive: Replace string/pattern in row if it exists, else do nothing - hadoop

I have a table A with id, name, age.
> id name age
> {20} Joan 12
> 3 James 12
> 12 Jill 12
> {54} Adam 12
> {10} Bill 12
I need to remove the {} surrounding the 'id' field.
I tried this:
translate(regexp_extract(id, '([^{]*)([^}]*)', 2), '{', '')
which works but returned a null for values with NO {}.
id
3
12
Is there a way I can get the following output?
id
20
3
12
54
10

You could use the regexp_replace UDF to remove the "{}", like this:
select regexp_replace(id, '\\{|\\}', '') as id from A;

Please try the following select statement:
select regexp_replace(col1,'[{}]','') as replaced,col2,col3 from table_name;
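To see why a replace-based approach leaves brace-free values alone, here is a minimal sketch using Python's re.sub as a stand-in for Hive's regexp_replace (an assumption for illustration; Hive uses Java regular expressions, which treat this simple character class the same way). The sample ids come from the question.

```python
import re

# Sample ids from the question; some are wrapped in braces, some are not.
ids = ["{20}", "3", "12", "{54}", "{10}"]

# Same character class as the second answer: delete every '{' or '}'.
# A value with no braces has nothing to match, so it comes back unchanged.
cleaned = [re.sub(r"[{}]", "", value) for value in ids]

print(cleaned)
```

Because a replace only rewrites what it matches, there is no empty result for rows that never contained the pattern, unlike an extract-based approach.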

Related

How to redirect hive query output to text file with header and column name having space

I have a Hive table product with ratings.
Id, productid, rating, ProdBarCode
42 96 5 881107178
168 151 5 884288058
110 307 4 886987260
58 144 4 884304936
62 21 3 879373460
279 832 3 881375854
237 514 4 879376641
I want to write a query that finds the average rating per product and writes it to a pipe-separated text file with a header, using hive -e "query" > output.txt
Output format: |Productid|average rating|
Solution:
hive -e "
select C.value
from (
  select 1 key, '|Productid|average rating|' value
  union all
  select 2 key, concat('|', concat_ws('|', Productid, averagerating), '|') value
  from (
    select CAST(A.productid AS STRING) AS Productid,
           CAST(A.averagerating AS STRING) AS averagerating
    from (
      select productid, avg(rating) averagerating
      from product
      group by productid
      sort by productid
    ) AS A
    where A.averagerating > 2
  ) B
  sort by key
) C
" > output.txt
Is this query correct? Is there a simpler way to redirect the output to a text file with a header when a column name contains spaces (average rating)?
Any suggestions?
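For reference, the file content the query is aiming for can be sketched in a few lines of Python. This is only an illustration of the intended result (header row, average per product, averages above 2, pipe-delimited), not a replacement for the Hive query. The (productid, rating) pairs are the sample rows from the question, and the numeric sort order is an assumption.

```python
from collections import defaultdict

# (productid, rating) pairs taken from the sample rows in the question.
ratings = [(96, 5), (151, 5), (307, 4), (144, 4), (21, 3), (832, 3), (514, 4)]

# Group ratings by product, mirroring GROUP BY productid.
by_product = defaultdict(list)
for product_id, rating in ratings:
    by_product[product_id].append(rating)

# Header first, then one pipe-delimited line per product with average > 2.
lines = ["|Productid|average rating|"]
for product_id in sorted(by_product):  # numeric sort is an assumption
    average = sum(by_product[product_id]) / len(by_product[product_id])
    if average > 2:
        lines.append(f"|{product_id}|{average}|")

print("\n".join(lines))
```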

Remove the last comma in Oracle

COUNTNUM is a column name in a table that has data like below
1,2,3,4,
I used
RTRIM((COUNTNUM),',') COUNTNUM
It didn't work
Desired output
1,2,3,4
Current output
1,2,3,4,
Any suggestions would greatly help!
Thanks
REGEXP_REPLACE((countnum), ',$', '')
Perhaps there are non-digit characters after the comma that need to be removed.
The pattern below accounts for possible non-digits between the comma and the end of countnum.
Explanation:
[^[:digit:]] is the negation of the digit character class
* is a quantifier meaning zero or more
$ is an anchor identifying the end of countnum
WITH d AS (
  SELECT '1,2,3,4, ' countnum FROM dual
  UNION ALL
  SELECT '1,2,3,4,' FROM dual
)
SELECT
  countnum,
  regexp_replace(countnum, ',[^[:digit:]]*$') mod_count_num
FROM d;

COUNTNUM   MOD_COUNT_NUM
1,2,3,4,   1,2,3,4
1,2,3,4,   1,2,3,4
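The difference between trimming and the regex can be checked quickly outside the database. The sketch below uses Python's str.rstrip and re.sub as stand-ins for Oracle's RTRIM and REGEXP_REPLACE (an assumption for illustration; [^0-9] replaces the POSIX class [^[:digit:]], which Python's re module does not support). A trailing space after the comma is one possible reason the RTRIM attempt appeared not to work.

```python
import re

# The two sample values from the answer: one has a space after the last comma.
values = ["1,2,3,4, ", "1,2,3,4,"]

# Trimming ',' only helps when the comma is the very last character.
trimmed = [v.rstrip(",") for v in values]

# A comma, then any run of non-digits, then end of string.
fixed = [re.sub(r",[^0-9]*$", "", v) for v in values]

print(trimmed)
print(fixed)
```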

Select same column for different values on a different column

I did search the forum before posting this and found some topics which were close to the same issue but I still had questions so am posting it here.
EMP_ID SEQ_NR NAME
874830 3 JOHN
874830 4 JOE
874830 21 MIKE
874830 22 BILL
874830 23 ROBERT
874830 24 STEVE
874830 25 JERRY
My output should look like this.
EMP_ID SEQ3NAME SEQ4NAME SEQ21NAME SEQ22NAME SEQ23NAME SEQ24NAME SEQ25NAME
874830 JOHN JOE MIKE BILL ROBERT STEVE JERRY
SELECT A.EMP_ID
,A.NAME SEQ3NAME
,B.NAME SEQ4NAME
FROM AC_XXXX_CONTACT A
INNER JOIN AC_XXXX_CONTACT B ON A.EMP_ID = B.EMP_ID
WHERE A.SEQ_NR = '03' AND B.SEQ_NR = '04'
AND B.EMP_ID = '874830';
The above query helped me get the below results.
EMP_ID SEQ3NAME SEQ4NAME
874830 JOHN JOE
My question: to get all the fields (i.e. up to SEQ_NR = 25), would I have to join the table 5 more times?
Is there a better way to get the results?
I'm querying an Oracle DB.
Thanks for your help.
New Requirement
New Input
STU-ID SEM CRS-NBR
12345 1 100
12345 1 110
12345 2 200
New Output
stu-id crs1 crs2
12345 100 200
12345 110
Not tested since you didn't provide test data (from table AC_XXXX):
(using Oracle 11 PIVOT clause)
select *
from ( select emp_id, seq_nr, name
from ac_xxxx
where emp_id = '874830' )
pivot ( max(name) for seq_nr in (3 as seq3name, 4 as seq4name, 21 as seq21name,
22 as seq22name, 23 as seq23name, 24 as seq24name, 25 as seq25name)
)
;
For Oracle 10 or earlier, pivoting was done "by hand", like so:
select max(emp_id) as emp_id, -- Corrected based on comment from OP
max(case when seq_nr = 3 then name end) as seq3name,
max(case when seq_nr = 4 then name end) as seq4name,
-- etc. (similar expressions for the other seq_nr)
from ac_xxxx
where emp_id = '874830'
;
Alternatively, emp_id doesn't need to be wrapped in max() if we add GROUP BY emp_id; the query then also works without the WHERE clause, which answers a different but related question.
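The "by hand" variant with GROUP BY can be tried without an Oracle instance. The sketch below runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here, and the table name contact is hypothetical; the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; columns follow the question's sample data.
conn.execute("CREATE TABLE contact (emp_id INTEGER, seq_nr INTEGER, name TEXT)")
rows = [
    (874830, 3, "JOHN"), (874830, 4, "JOE"), (874830, 21, "MIKE"),
    (874830, 22, "BILL"), (874830, 23, "ROBERT"), (874830, 24, "STEVE"),
    (874830, 25, "JERRY"),
]
conn.executemany("INSERT INTO contact VALUES (?, ?, ?)", rows)

# One MAX(CASE ...) expression per seq_nr, generated instead of typed out.
seq_numbers = [3, 4, 21, 22, 23, 24, 25]
columns = ", ".join(
    f"MAX(CASE WHEN seq_nr = {n} THEN name END) AS seq{n}name"
    for n in seq_numbers
)

# GROUP BY emp_id yields one output row per employee, with no self-joins.
query = f"SELECT emp_id, {columns} FROM contact GROUP BY emp_id"
pivoted = conn.execute(query).fetchall()
print(pivoted)
```

Each CASE expression returns the name only for its own seq_nr and NULL otherwise, so MAX picks the single non-null value per group.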

Transform row into column and vice-versa using sql - oracle

I have this table:
create table history (
date_check DATE,
type VARCHAR2(30),
id_type NUMBER,
total NUMBER
)
Selecting.....
select * from history order by 1
DATE_CHECK TYPE ID_TYPE TOTAL
14/02/2016 abc 1 14
14/02/2016 abc33 1 14
14/02/2016 bbb 1 40
14/02/2016 bbb33 3 43
14/02/2016 ddd 2 61
14/02/2016 ddd33 2 62
15/02/2016 abc 1 33
15/02/2016 abc33 1 44
15/02/2016 bbb 1 55
15/02/2016 bbb33 3 66
15/02/2016 ddd 2 77
15/02/2016 ddd33 2 88
Type is always one of these 6 values:
abc
abc33
bbb
bbb33
ddd
ddd33
I combine this with "id_type" using a decode like this:
select type || decode(id_type, 1, '- new', 2, '- old', 3, '- xpto') as type from history order by 1
In the end I need something like this:
DATE_CHECK abc - new abc33 - old bbb - new bbb33 - old ....
14/02/2016 14 14 40 43
15/02/2016 33 44 55 66
What is the easiest way to do it? Using pivot?
Try this:
with data as(
select date_check, type, total from (
select date_check, type || ' ' || decode(id_type, 1, '- new', 2, '- old', 3, '- xpto') as type, total from history
))
select * from data
pivot(
max(total) for type in ('abc - new', 'abc33 - new', 'bbb - new',
'bbb33 - xpto', 'ddd - old', 'ddd33 - old')
)
order by date_check;
And for the "vice versa", use UNPIVOT.
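What the reverse direction does can be sketched in plain Python: every non-key column of a wide row becomes its own (date_check, type, total) row. This only illustrates the idea behind UNPIVOT and is not Oracle syntax. The wide rows are the pivoted form of the question's sample data, and the helper name unpivot is hypothetical.

```python
# Wide (pivoted) rows built from the question's sample data.
wide_rows = [
    {"date_check": "14/02/2016", "abc - new": 14, "abc33 - new": 14,
     "bbb - new": 40, "bbb33 - xpto": 43, "ddd - old": 61, "ddd33 - old": 62},
    {"date_check": "15/02/2016", "abc - new": 33, "abc33 - new": 44,
     "bbb - new": 55, "bbb33 - xpto": 66, "ddd - old": 77, "ddd33 - old": 88},
]


def unpivot(rows, key="date_check"):
    """Turn each non-key column into its own (key, column, value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == key:
                continue
            if value is None:
                continue  # Oracle's UNPIVOT also excludes nulls by default
            long_rows.append((row[key], column, value))
    return long_rows


long_rows = unpivot(wide_rows)
for long_row in long_rows:
    print(long_row)
```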
You can reference multiple columns in a PIVOT clause to get your desired output. In your case there is a single aggregated column (TOTAL), but the pivot key is a composite of two columns (TYPE, ID_TYPE), so you can use a pivot query like the following:
select *
from history
PIVOT ( max(TOTAL)
for (TYPE, ID_TYPE) in ( ('abc',1) abc_new
, ('abc',2) abc_old
, ('abc',3) abc_xpto
, ('abc33',1) abc33_new
, ('abc33',2) abc33_old
, ('abc33',3) abc33_xpto
, ('bbb',1) bbb_new
, ('bbb',2) bbb_old
, ('bbb',3) bbb_xpto
, ('bbb33',1) bbb33_new
, ('bbb33',2) bbb33_old
, ('bbb33',3) bbb33_xpto
, ('ddd',1) ddd_new
, ('ddd',2) ddd_old
, ('ddd',3) ddd_xpto
, ('ddd33',1) ddd33_new
, ('ddd33',2) ddd33_old
, ('ddd33',3) ddd33_xpto
)
)
You can adjust the output column headings to suit, if desired, by changing them similar to the following:
...
PIVOT ( max(TOTAL)
for (TYPE, ID_TYPE) in ( ('abc',1) "abc - new"
, ('abc',2) "abc - old"
, ('abc',3) "abc - xpto"
, ('abc33',1) "abc33 - new"
, ...

How can I get values from one table to another via similar values?

I have a table called excel that has 3 columns: name, id, and full_name. The name column is the only one populated, and I need to fill in id and full_name. The other table that contains the data is called tim_pismena and has 2 columns that I need, id and pismeno_name (the actual names are not important, but I'm writing them just for clarity). In pseudo-Oracle code :) the select that gets me the values from the second table would look something like this:
SELECT tp.id, tp.pismeno_name
FROM tim_pismena tp
WHERE upper(tp.pismeno_name) LIKE IN upper('%(SELECT name FROM excel)%')
and when used with an insert, the end result should be something like
name id full_name
Happy Joe 55 Very fun place Happy Joe, isn't it?
Use a MERGE statement:
MERGE
INTO excel tgt
USING tim_pismena src
ON (upper(src.pismeno_name) LIKE '%' || upper(tgt.name) || '%')
WHEN MATCHED THEN
UPDATE
SET tgt.id = src.id,
    tgt.full_name = src.pismeno_name
WHEN NOT MATCHED THEN
-- the always-false WHERE below means no rows are ever inserted
INSERT (tgt.name, tgt.id, tgt.full_name)
VALUES (src.pismeno_name, src.id, src.pismeno_name)
WHERE (1 <> 1);
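The matching rule itself (a case-insensitive "contains" test between the two tables) can be sketched in Python to make the intended result concrete. This is an illustration only. The table and column names follow the question, the single sample row is the one shown there, and keeping the first match is an assumption, since the question does not say how multiple matches should be handled.

```python
# Rows of the lookup table tim_pismena as (id, pismeno_name); sample from the question.
tim_pismena = [(55, "Very fun place Happy Joe, isn't it?")]

# Rows of excel: only name is known; id and full_name are still empty.
excel = [{"name": "Happy Joe", "id": None, "full_name": None}]

for row in excel:
    needle = row["name"].upper()
    for source_id, pismeno_name in tim_pismena:
        # Roughly: upper(pismeno_name) LIKE '%' || upper(name) || '%'
        # (a plain substring test; LIKE wildcards inside name are not modeled)
        if needle in pismeno_name.upper():
            row["id"] = source_id
            row["full_name"] = pismeno_name
            break  # assumption: keep the first match

print(excel)
```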
