Inserting Data using join in Hive - hadoop

Hi, I have two tables, and after joining them I want to insert data into a third table. The problem I am facing is that I have to create multiple records based on a value from the join.
Table 1
A B
-------
1 X
2 y
3 x
Table 2
A C
-------
1 Y
2 N
3 Y
I need to join Table 1 and Table 2 on column A and, based on the value of column C in Table 2, insert records into Table 3.
Rule
If the column C value is 'Y', insert 3 records: 'Red', 'Green', 'Blue'
If the column C value is 'N', insert 2 records: 'White', 'Black'
So Result should be
Table 3
A D
-----------
1 Red
1 Green
1 Blue
2 White
2 Black
3 Red
3 Green
3 Blue
Can you let me know how to achieve this using HiveQL? Thanks

You can create a third table Color
Table Color
------------------
Flag Color
Y Red
Y Blue
Y Green
N White
N Black
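Now you can join them easily. Table1 and Table2 are joined on column A, and the flag in Table2.C picks the matching colors:
SELECT T1.A, C.Color
FROM Table1 T1
JOIN Table2 T2
ON T1.A = T2.A
JOIN Color C
ON T2.C = C.Flag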
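The lookup-table join above can be tried out without a Hive cluster. The following is only a sketch in Python, using an in-memory SQLite database as a stand-in for Hive; the lowercase table names and the INSERT ... SELECT into table3 are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

# In-memory SQLite stands in for Hive here; only the join logic is illustrated.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT);
    CREATE TABLE table2 (a INTEGER, c TEXT);
    CREATE TABLE color  (flag TEXT, color TEXT);
    CREATE TABLE table3 (a INTEGER, d TEXT);
    INSERT INTO table1 VALUES (1, 'X'), (2, 'y'), (3, 'x');
    INSERT INTO table2 VALUES (1, 'Y'), (2, 'N'), (3, 'Y');
    INSERT INTO color VALUES
        ('Y', 'Red'), ('Y', 'Green'), ('Y', 'Blue'),
        ('N', 'White'), ('N', 'Black');
""")

# Each row of table2 fans out to one row per color that carries its flag.
conn.execute("""
    INSERT INTO table3 (a, d)
    SELECT t1.a, c.color
    FROM table1 t1
    JOIN table2 t2 ON t1.a = t2.a
    JOIN color c ON t2.c = c.flag
""")

rows = conn.execute("SELECT a, d FROM table3 ORDER BY a, d").fetchall()
for row in rows:
    print(row)
```

Keeping the rule in a small table rather than in the query means a new flag value or color only needs a new row in Color.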

Related

How to match a value from a table upto a particular position in oracle?

I have to write a query to match values in two tables, Table A and Table B. Table A has values in column XYZ such as "91517181" and "915171812". I want to check whether they exist in Table B. In Table B, the value in column ABC is "9151718", and another column in Table B holds its match length, "10", which means it matches anything up to "9151718XXX".
So I need a query where the value from Table A matches the value in Table B, because in Table B the value may be up to 10 characters long.
Kindly help...
I think that you need something like this:
table a:       table b:
xyz            x          y
----------     ---------- ---
9151718        9151718    10
91517181       91360      5
913601
select a.xyz, rpad(xyz, b.y, 'x') result, b.x pattern, b.y len
from a
left join b on a.xyz like b.x||'%' and length(a.xyz)<=b.y
xyz        result     pattern    len
---------- ---------- ---------- ---
9151718    9151718xxx 9151718    10
91517181   91517181xx 9151718    10
913601                                <- not matched
I think something like this:
select * from a where
exists(select 'x' from b where substr(a.xyz, 1, length(b.x)) = b.x and length(a.xyz) <= b.y)
x - value (prefix) in b
y - maximum length in b
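To check the matching rule on the sample data, here is a small Python sketch using an in-memory SQLite database in place of Oracle. It assumes the values are stored as text and uses the table and column names from the first answer (a.xyz, b.x, b.y); the padding that rpad does in Oracle is done in Python with str.ljust.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (xyz TEXT);
    CREATE TABLE b (x TEXT, y INTEGER);
    INSERT INTO a VALUES ('9151718'), ('91517181'), ('913601');
    INSERT INTO b VALUES ('9151718', 10), ('91360', 5);
""")

# A value matches when it starts with the prefix b.x
# and is no longer than the match length b.y.
matches = conn.execute("""
    SELECT a.xyz, b.x, b.y
    FROM a
    LEFT JOIN b ON a.xyz LIKE b.x || '%' AND length(a.xyz) <= b.y
    ORDER BY a.xyz
""").fetchall()

for xyz, pattern, max_len in matches:
    # Pad with 'x' up to the match length, mirroring rpad in the Oracle query.
    padded = xyz.ljust(max_len, "x") if max_len is not None else None
    print(xyz, padded, pattern, max_len)
```

The value 913601 starts with 91360 but is six characters long, so the length condition rejects it and the left join leaves its pattern columns empty.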

Update with group by

I'm stumped on what seemed to be a simple UPDATE statement.
I'm looking for an UPDATE that uses two values. The first (a) is used to group, the second (b) is used to find a local minimum of values within the respective group. As a little extra there is a threshold value on b: Any value 1 or smaller shall remain as it is.
drop table t1;
create table t1 (a number, b number);
insert into t1 values (1,0);
insert into t1 values (1,1);
insert into t1 values (2,1);
insert into t1 values (2,2);
insert into t1 values (3,1);
insert into t1 values (3,2);
insert into t1 values (3,3);
insert into t1 values (4,1);
insert into t1 values (4,3);
insert into t1 values (4,4);
insert into t1 values (4,5);
-- 1,0 -> 1,0
-- 1,1 -> 1,1
-- 2,1 -> 2,1
-- 2,2 -> 2,2
-- 3,1 -> 3,1
-- 3,2 -> 3,2
-- 3,3 -> 3,2 <-
-- 4,1 -> 4,1
-- 4,3 -> 4,3 <-
-- 4,4 -> 4,3 <-
-- 4,5 -> 4,3 <-
Obviously not sufficient is:
update t1 x
set b = (select min(b) from t1 where b > 1)
;
Whatever more complicated stuff I try, e.g.
UPDATE t1 x
set (a,b) = (select distinct a,b from (
select a, min(b) from t1 where b > 1 group by a)
)
;
I get
SQL error: ORA-01427: single-row subquery returns more than one row
01427. 00000 - "single-row subquery returns more than one row"
which is not overly surprising as I need a row for each value of a.
Of course I could write a PL/SQL Procedure with a cursor loop but is it possible in a single elegant SQL statement? Maybe using partition by?
Your question is a bit confusing.
You say you want to set b to the minimum value of b within its partition a, while rows with b = 1 (or less) remain untouched.
Judging by the comments in your question (which I assume show the expected output), what you actually want is the minimum value of b that is greater than 1 within each partition.
The following SQL query does this:
UPDATE t1 alias
SET b = (
SELECT min(b)
FROM t1
WHERE alias.a = t1.a
AND t1.b > 1 -- this would get the minimum value higher than 1
GROUP BY a
)
WHERE alias.b > 1 -- update will not affect rows with b <= 1
Output after update
a | b
---+---
1 | 0
1 | 1
2 | 1
2 | 2
3 | 1
3 | 2
3 | 2
4 | 1
4 | 3
4 | 3
4 | 3
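The correlated UPDATE can be checked against the sample data with a short Python sketch. It runs on an in-memory SQLite database rather than Oracle, so the inner table is given the alias i (an illustrative name) and the column types differ; the logic is otherwise the same as in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, 0), (1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3),
     (4, 1), (4, 3), (4, 4), (4, 5)],
)

# For every row with b > 1, look up the smallest b > 1 within the same group a.
conn.execute("""
    UPDATE t1
    SET b = (SELECT min(i.b) FROM t1 AS i WHERE i.a = t1.a AND i.b > 1)
    WHERE b > 1
""")

updated = conn.execute("SELECT a, b FROM t1 ORDER BY a, b").fetchall()
for row in updated:
    print(row)
```

The outer WHERE clause is what keeps rows with b of 1 or less untouched; without it those rows would be overwritten with the group minimum.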

Oracle: how to flash back a specific column?

How do I flash back a specific column for all rows in a table?
For example, given this table:
select * from t as of scn 1201789714628;
a b
- -
x 1
y 2
z 3
select * from t;
a b
- -
x 4
y 5
z 6
I can flash back a column in a specific row as follows:
update t set b = (select b from t as of scn 1201789714628 where a='x') where a='x';
select * from t;
a b
- -
x 1
y 5
z 6
but I can't figure out the syntax to set b to its previous value for all rows.
update t t1 set b = (select b from t as of scn 1201789714628) t2 where t1.a = t2.a;
Error at Command Line:11 Column:60
SQL Error: ORA-00933: SQL command not properly ended
You may try this:
update t t1
set b = (select b from (select a, b from t as of scn 1201789714628) t2
where t1.a = t2.a);
P.S. I'd recommend copying your snapshot into a temporary table if you're not going to run the update right away (the snapshot can disappear very soon).
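The suggestion in the P.S. (copy the old rows first, then update from the copy) can be sketched as follows. This is a Python illustration on an in-memory SQLite database, which has no flashback query; the table t_snapshot is a hypothetical stand-in for a copy of the rows as they were at the old SCN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a TEXT, b INTEGER);
    INSERT INTO t VALUES ('x', 4), ('y', 5), ('z', 6);
    -- t_snapshot is a hypothetical copy of the rows as of the old SCN.
    CREATE TABLE t_snapshot (a TEXT, b INTEGER);
    INSERT INTO t_snapshot VALUES ('x', 1), ('y', 2), ('z', 3);
""")

# Restore column b for every row that has a counterpart in the snapshot.
conn.execute("""
    UPDATE t
    SET b = (SELECT s.b FROM t_snapshot s WHERE s.a = t.a)
    WHERE EXISTS (SELECT 1 FROM t_snapshot s WHERE s.a = t.a)
""")

restored = conn.execute("SELECT a, b FROM t ORDER BY a").fetchall()
print(restored)
```

The EXISTS guard is a design choice added here: rows inserted after the snapshot have no old value, and without the guard the subquery would set their b to NULL.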

Generate list for label with barcode printing from table with quantity of rows

I am trying to generate a list of products for printing labels, but all of my attempts (with CONNECT BY LEVEL) fail!
My table:
CREATE TABLE LABELS
(
PRODUCT VARCHAR2(8 BYTE),
Q_ROWS NUMBER
);
Information in the table:
INSERT INTO LABELS (PRODUCT, Q_ROWS) VALUES('D', 3);
INSERT INTO LABELS (PRODUCT, Q_ROWS) VALUES('A', 1);
INSERT INTO LABELS (PRODUCT, Q_ROWS) VALUES('C', 4);
INSERT INTO LABELS (PRODUCT, Q_ROWS) VALUES('B', 2);
Expected result of an Oracle SELECT:
PRODUCT
A
B
B
C
C
C
C
D
D
D
Results: (1 row for A, 2 rows for B, 4 rows for C and 3 rows for D)
Can someone help me?
Use LEVEL to get a "table" that counts from 1 to the maximum number of rows:
SELECT LEVEL AS LabelNum
FROM DUAL
CONNECT BY LEVEL <= (SELECT MAX(Q_Rows) FROM Labels)
This will give you the following table:
LabelNum
--------
1
2
3
4
Next, join this to your LABELS table where LabelNum <= Q_Rows. Here's the whole query:
WITH Mult AS (
SELECT LEVEL AS LabelNum
FROM DUAL
CONNECT BY LEVEL <= (SELECT MAX(Q_Rows) FROM Labels)
)
SELECT Product
FROM Labels
INNER JOIN Mult ON LabelNum <= Q_Rows
ORDER BY Product, LabelNum
There's a working SQLFiddle here.
Finally, good job including the create/populate scripts :)
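The same "number table joined on LabelNum <= Q_Rows" idea can be tried outside Oracle. The sketch below is in Python with an in-memory SQLite database; since SQLite has no CONNECT BY LEVEL, a recursive CTE generates the numbers instead, which is a swapped-in technique and not what the answer above uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labels (product TEXT, q_rows INTEGER);
    INSERT INTO labels VALUES ('D', 3), ('A', 1), ('C', 4), ('B', 2);
""")

# mult counts from 1 up to the largest q_rows; the join then repeats
# each product once for every number that does not exceed its q_rows.
query = """
    WITH RECURSIVE mult(label_num) AS (
        SELECT 1
        UNION ALL
        SELECT label_num + 1 FROM mult
        WHERE label_num < (SELECT MAX(q_rows) FROM labels)
    )
    SELECT product
    FROM labels
    JOIN mult ON label_num <= q_rows
    ORDER BY product, label_num
"""
products = [row[0] for row in conn.execute(query)]
print(products)
```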
Another approach, using model clause:
select product
from labels
model
partition by (product)
dimension by (1 as indx)
measures(q_rows)
rules(
q_rows[for indx from 1 to q_rows[1] increment 1] = q_rows[1]
)
order by product
result:
PRODUCT
----------
A
B
B
C
C
C
C
D
D
D
SQLFiddle Demo

CONNECT BY for two tables with two JOINS

I have 3 tables:
two with hierarchical structures
(like "dimensions" of recursive type of hierarchy);
one with summing data (like "facts" with X column).
They are here:
DIM1 (ID1, PARENT1, NAME1)
DIM2 (ID2, PARENT2, NAME2)
FACTS (ID1, ID2, X)
Example of DIM1 table:
-- 1 0 DIM1
---- 2 1 DIM1-A
------ 3 2 DIM1-A-A
-------- 4 3 DIM1-A-A-A
-------- 5 3 DIM1-A-A-B
------ 6 2 DIM1-A-B
-------- 7 6 DIM1-A-B-A
-------- 8 6 DIM1-A-B-B
------ 9 2 DIM1-A-C
---- 10 1 DIM1-B
------ 11 10 DIM1-B-C
------ 12 10 DIM1-B-D
---- 13 1 DIM1-C
Example of DIM2 table:
-- 1 0 DIM2
---- 2 1 DIM2-A
------ 3 2 DIM2-A-A
-------- 4 3 DIM2-A-A-A
-------- 5 3 DIM2-A-A-B
-------- 6 3 DIM2-A-A-C
------ 7 2 DIM2-A-B
---- 8 1 DIM2-B
---- 9 1 DIM2-C
Example of FACTS table:
1 1 100
1 2 30
1 3 500
-- ................
13 9 200
And I would like to create a single SELECT in which I specify the parent for DIM1 (for example ID1=2 for DIM1-A) and the parent for DIM2 (for example ID2=2 for DIM2-A), and which generates a report like this:
Name_of_1 Name_of_2 Sum_of_X
--------- --------- ----------
DIM1-A-A DIM2-A-A (some sum)
DIM1-A-A DIM2-A-B (some sum)
DIM1-A-B DIM2-A-A (some sum)
DIM1-A-B DIM2-A-B (some sum)
DIM1-A-C DIM2-A-A (some sum)
DIM1-A-C DIM2-A-B (some sum)
I would like to use the CONNECT BY, START WITH, SUM and GROUP BY clauses and an OUTER or INNER (?) JOIN. I need no other Oracle 10.2 extensions.
In other words: only "classic" SQL plus the Oracle extensions for hierarchical queries.
Is it possible?
I experimented with the approach from the question "Mixing together Connect by, inner join and sum with Oracle" (which has a very nice solution, but only for one dimension table ("Tasks"), whereas I need to JOIN two dimension tables to one facts table), but I was not successful.
"Some sum" is not very descriptive, so I don't see why you need CONNECT BY at all.
SELECT dim1.name1, dim2.name2, x
FROM (
    SELECT id1, id2, SUM(x) AS x
    FROM facts
    GROUP BY
        id1, id2
) f
JOIN dim1
ON dim1.id1 = f.id1
JOIN dim2
ON dim2.id2 = f.id2
I think what you're trying to do is get the sum of the values in the FACTS table for all of the descendants of the specified rows, grouped by the topmost children. In your example above, the first result row would then be the sum of any intersections of (DIM1-A-A, DIM1-A-A-A, DIM1-A-A-B) and (DIM2-A-A, DIM2-A-A-A, DIM2-A-A-B, DIM2-A-A-C) found in the FACTS table. With that assumption, I have come to the following solution:
SELECT root_name1, root_name2, SUM(X)
FROM ( SELECT CONNECT_BY_ROOT(name1) AS root_name1,
              id1
       FROM dim1
       CONNECT BY parent1 = PRIOR id1
       START WITH parent1 = 2) d1
CROSS JOIN
     ( SELECT CONNECT_BY_ROOT(name2) AS root_name2,
              id2
       FROM dim2
       CONNECT BY parent2 = PRIOR id2
       START WITH parent2 = 2) d2
LEFT OUTER JOIN
     facts
  ON d1.id1 = facts.id1
 AND d2.id2 = facts.id2
GROUP BY root_name1, root_name2
(This also assumes that the columns of FACTS are named ID1, ID2, and X.)
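The roll-up described above can be tried on a small data set with the following Python sketch. It uses an in-memory SQLite database, so recursive CTEs take the place of CONNECT BY and CONNECT_BY_ROOT (a swapped-in technique, not available in Oracle 10.2). The DIM rows are a subset of the question's example, and the FACTS rows are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim1 (id1 INTEGER, parent1 INTEGER, name1 TEXT);
    CREATE TABLE dim2 (id2 INTEGER, parent2 INTEGER, name2 TEXT);
    CREATE TABLE facts (id1 INTEGER, id2 INTEGER, x INTEGER);
    INSERT INTO dim1 VALUES
        (1, 0, 'DIM1'), (2, 1, 'DIM1-A'), (3, 2, 'DIM1-A-A'),
        (4, 3, 'DIM1-A-A-A'), (5, 3, 'DIM1-A-A-B'), (6, 2, 'DIM1-A-B'),
        (7, 6, 'DIM1-A-B-A'), (8, 6, 'DIM1-A-B-B'), (9, 2, 'DIM1-A-C'),
        (10, 1, 'DIM1-B');
    INSERT INTO dim2 VALUES
        (1, 0, 'DIM2'), (2, 1, 'DIM2-A'), (3, 2, 'DIM2-A-A'),
        (4, 3, 'DIM2-A-A-A'), (5, 3, 'DIM2-A-A-B'), (7, 2, 'DIM2-A-B'),
        (8, 1, 'DIM2-B');
    -- Hypothetical fact rows, not taken from the question.
    INSERT INTO facts VALUES
        (3, 3, 10), (4, 4, 5), (5, 7, 2), (6, 3, 1),
        (7, 4, 4), (10, 3, 100), (3, 8, 50);
""")

# d1 and d2 tag every descendant with the name of the direct child of the
# chosen parent it hangs under; facts are then summed per pair of tags.
query = """
    WITH RECURSIVE
    d1(root_name1, id1) AS (
        SELECT name1, id1 FROM dim1 WHERE parent1 = :p1
        UNION ALL
        SELECT d1.root_name1, dim1.id1
        FROM dim1 JOIN d1 ON dim1.parent1 = d1.id1
    ),
    d2(root_name2, id2) AS (
        SELECT name2, id2 FROM dim2 WHERE parent2 = :p2
        UNION ALL
        SELECT d2.root_name2, dim2.id2
        FROM dim2 JOIN d2 ON dim2.parent2 = d2.id2
    )
    SELECT root_name1, root_name2, SUM(x)
    FROM d1
    CROSS JOIN d2
    LEFT JOIN facts ON d1.id1 = facts.id1 AND d2.id2 = facts.id2
    GROUP BY root_name1, root_name2
    ORDER BY root_name1, root_name2
"""
report = conn.execute(query, {"p1": 2, "p2": 2}).fetchall()
for line in report:
    print(line)
```

The LEFT JOIN keeps combinations that have no fact rows, so they appear in the report with an empty sum instead of disappearing.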
