How to remove duplicate values from SQL inner join tables?

How to remove duplicate values from SQL inner join tables? - oracle

I have two tables:
Table 1:
+-----------+-----------+------------------+
| ID | Value | other |
+-----------+-----------+------------------+
| 123456 | 5 | 12 |
| 987654 | 7 | 15 |
| 456789 | 6 | 22 |
+-----------+-----------+------------------+
Table 2:
+-----------+-----------+------------------+
| ID | Type | other |
+-----------+-----------+------------------+
| 123456 | 00 | 2 |
| 123456 | 01 | 6 |
| 123456 | 02 | 4 |
| 987654 | 00 | 7 |
| 987654 | 01 | 8 |
| 456789 | 00 | 6 |
| 456789 | 01 | 16 |
+-----------+-----------+------------------+
Now I perform the inner join:
SELECT
table1.ID, table2.TYPE, table1.value, table2.other
FROM
table1 INNER JOIN table2 ON table1.ID = table2.ID
Here the SQLfiddle
Result Table:
+-----------+-----------+---------+------------------+
| ID | Type | Value | other |
+-----------+-----------+---------+------------------+
| 123456 | 00 | 5 | 2 |
| 123456 | 01 | 5 | 6 |
| 123456 | 02 | 5 | 4 |
| 987654 | 00 | 7 | 7 |
| 987654 | 01 | 7 | 8 |
| 456789 | 00 | 6 | 6 |
| 456789 | 01 | 6 | 16 |
+-----------+-----------+---------+------------------+
This is totally what I expected but not what I need.
Because if I now want to get the Value per ID the Value gets doubled or tripled for the first cause.
Desired Table:
+-----------+-----------+---------+------------------+
| ID | Type | Value | other |
+-----------+-----------+---------+------------------+
| 123456 | 00 | 5 | 2 |
| 123456 | 01 | - | 6 |
| 123456 | 02 | - | 4 |
| 987654 | 00 | 7 | 7 |
| 987654 | 01 | - | 8 |
| 456789 | 00 | 6 | 6 |
| 456789 | 01 | - | 16 |
+-----------+-----------+---------+------------------+
I tried to achieve a similar output by counting the rows per id and dividing the sum of Value by that count but it did not seem to work and is not the desired output.
Also, I tried grouping but this did not seem to achieve the desired output.
One thing to mention is that the DB I am working with is an ORACLE SQL DB.

How about this:
select table1.id
, table2.type
, case
when row_number() over (partition by table1.id order by table2.type) = 1
then table1.value
end as "VALUE"
, table2.other
from table1
join table2 on table1.id = table2.id
order by 1, 2;
(This is Oracle SQL syntax. Your SQL Fiddle (thanks!) was set to MySQL, which as far as I know doesn't have analytic functions like row_number().)

A way to get the result.
select t1.ID,
t2.type,
t1.value,
t2.other
from table1 t1 inner join table2 t2
ON t1.ID = t2.ID
inner join (select ID, min(type) mv
from table2
group by id) m
on t2.id = m.id
and t2.type = m.mv
union all
select t1.ID,
t2type,
null,
t2.other
from table1 t1 inner join table2 t2
ON t1.ID = t2.ID
and not exists (
select 1 from (
select ID, min(type) mv
from table2
group by id) m
where t2.id = m.id
and t2.type = m.mv
)
order by id,type

You can use a CASE block to display NULL for which it is NOT equal to MIN value of type
SELECT table1.ID,
table2.TYPE,
CASE
WHEN table2.TYPE =
MIN (table2.TYPE)
OVER (PARTITION BY table1.id ORDER BY table2.TYPE)
THEN
Table1.VALUE
END
VALUE,
table2.other
FROM table1 INNER JOIN table2 ON table1.ID = table2.ID;

Related

Join 3 or more tables with Comma separated values in Oracle

I've 3 tables (One parent and 2 childs) as below
Student(Parent)
StudentID | Name | Age
1 | AA | 23
2 | BB | 25
3 | CC | 27
Book(child 1)
BookID | SID | BookName | BookPrice
1 | 1 | ABC | 20
2 | 1 | XYZ | 15
3 | 3 | LMN | 34
4 | 3 | DDD | 90
Pen(child 2)
PenID | SID | PenBrandName | PenPrice
1 | 2 | LML | 20
2 | 1 | PARKER | 15
3 | 2 | CELLO | 34
4 | 3 | LML | 90
I need to join the tables and get an output as Below
StudentID | Name | Age | BookNames | TotalBookPrice | PenBrands | TotalPenPrice
1 | AA | 23 | ABC, XYZ | 35 | PARKER | 15
2 | BB | 25 | null | 00 | LML, CELLO | 54
3 | CC | 27 | LMN, DDD | 124 | LML | 90
This is the code i tried :
Select s.studentID as "StudentID", s.name as "Name", s.age as "AGE",
LISTAGG(b.bookName, ',') within group (order by b.bookID) as "BookNames",
SUM(b.bookPrice) as "TotalBookPrice",
LISTAGG(p.penBrandName, ',') within group (order by p.penID) as "PenBrands",
SUM(p.penPrice) as "TotalPenPrice"
FROM Student s
LEFT JOIN BOOK b ON b.SID = s.StudentID
LEFT JOIN PEN p ON p.SID = s.StudentID
GROUP BY s.studentID, s.name, s.age
The result i get has multiple values of Book and Pen (cross product result in multiple values)
StudentID | Name | Age | BookNames | TotalBookPrice | PenBrands | TotalPenPrice
1 | AA | 23 | ABC,ABC,XYZ,XYZ | 35 | PARKER,PARKER | 15
Please let me know how to fix this.

Instead of Joining the tables and doing aggregation, You have to aggregate first and then join your tables -
Select s.studentID as "StudentID", s.name as "Name", s.age as "AGE",
"BookNames",
"TotalBookPrice",
"PenBrands",
"TotalPenPrice"
FROM Student s
LEFT JOIN (SELECT SID, LISTAGG(b.bookName, ',') within group (order by b.bookID) as "BookNames",
SUM(b.bookPrice) as "TotalBookPrice"
FROM BOOK
GROUP BY SID) b ON b.SID = s.StudentID
LEFT JOIN (SELECT SID, LISTAGG(p.penBrandName, ',') within group (order by p.penID) as "PenBrands",
SUM(p.penPrice) as "TotalPenPrice"
FROM PEN
GROUP BY SID) p ON p.SID = s.StudentID;

Hive - same sequence of records in array

I have a table with data at hour level. I want to find the count of hours and the values for col1 and col2 for all hours in an array. Input Table
+-----+-----+-----+
| hour| col1| col2|
+-----+-----+-----+
| 00 | 0.0 | a |
| 04 | 0.1 | b |
| 08 | 0.2 | c |
| 12 | 0.0 | d |
+-----+-----+-----+
I am using the below query to get the column values in an array
Query:
select count(hr), map_values(str_to_map(concat_ws(',',collect_set(concat_ws(':',reflect('java.util.UUID','randomUUID'),cast(col1 as string)))))) as col1_arr, map_values(str_to_map(concat_ws(',',collect_set(concat_ws(':',reflect('java.util.UUID','randomUUID'),cast(col2 as string)))))) as col2_arr from table;
Output that i am getting, values in col2_arr are not in the same sequence with col1_arr. Please suggest how can i get the values in array/list for different columns in same sequence.
+----------+-----------------+----------+
| count(hr)| col1_arr | col2_arr |
+----------+-----------------+----------+
| 4 | 0.0,0.1,0.2,0.0 | b,a,c,d |
+----------+----------------+-----------+
Required output:
+----------+-----------------+----------+
| count(hr)| col1_arr | col2_arr |
+----------+-----------------+----------+
| 4 | 0.0,0.1,0.2,0.0 | a,b,c,d |
+----------+----------------+-----------+
Thanks

select count(*) as cnt
,concat_ws(',',sort_array(collect_list(hour))) as hour
,regexp_replace(concat_ws(',',sort_array(collect_list(concat_ws(':',hour,cast(col1 as string))))),'..:','') as col1
,regexp_replace(concat_ws(',',sort_array(collect_list(concat_ws(':',hour,col2)))),'..:','') as col2
from mytable
;
+-----+-------------+-------------+---------+
| cnt | hour | col1 | col2 |
+-----+-------------+-------------+---------+
| 4 | 00,04,08,12 | 0,0.1,0.2,0 | a,b,c,d |
+-----+-------------+-------------+---------+

why do they use (+) in where clause for checking not null in a column eg emp_name(+) IS NOT NULL

Why do we use a (+) operator in the where clause for instance emp_name(+) IS NOT NULL, emp_name IS NOT NULL AND emp_name(+) IS NOT NULL is the same

Because removing the (+) from the column you're checking is not null turns the join from an outer join into what is effectively an inner join. Leaving the (+) in tells oracle to get all rows from the "main" table, and then match any rows from the outer joined table where that column is not null.
See the below for an example of why the "extra" (+) is needed:
with t1 as (select 1 id, 'a' val from dual union all
select 2 id, 'b' val from dual union all
select 3 id, 'c' val from dual),
t2 as (select 1 id, null val from dual union all
select 3 id, 'd' val from dual)
select *
from t1,
t2
where t1.id = t2.id (+)
and t2.val (+) is not null
order by t1.id;
ID VAL ID_1 VAL_1
---------- --- ---------- -----
1 a
2 b
3 c 3 d
with t1 as (select 1 id, 'a' val from dual union all
select 2 id, 'b' val from dual union all
select 3 id, 'c' val from dual),
t2 as (select 1 id, null val from dual union all
select 3 id, 'd' val from dual)
select *
from t1,
t2
where t1.id = t2.id (+)
and t2.val is not null
order by t1.id;
ID VAL ID_1 VAL_1
---------- --- ---------- -----
3 c 3 d
You can see the difference easier if you convert the query to the ANSI join syntax:
with t1 as (select 1 id, 'a' val from dual union all
select 2 id, 'b' val from dual union all
select 3 id, 'c' val from dual),
t2 as (select 1 id, null val from dual union all
select 3 id, 'd' val from dual)
select *
from t1
left outer join t2 on (t1.id = t2.id and t2.val is not null)
order by t1.id;
ID VAL ID_1 VAL_1
---------- --- ---------- -----
1 a
2 b
3 c 3 d
with t1 as (select 1 id, 'a' val from dual union all
select 2 id, 'b' val from dual union all
select 3 id, 'c' val from dual),
t2 as (select 1 id, null val from dual union all
select 3 id, 'd' val from dual)
select *
from t1
left outer join t2 on (t1.id = t2.id)
where t2.val is not null
order by t1.id;
ID VAL ID_1 VAL_1
---------- --- ---------- -----
3 c 3 d
In other words, it's the difference between the "col is not null" predicate being a part of the outer join condition, or a filter in the where clause.
You'll note as well that having the "t2.val is not null" in the where clause has the effect of turning the outer join into an inner join, despite the fact that you've requested an outer join:
with t1 as (select 1 id, 'a' val from dual union all
select 2 id, 'b' val from dual union all
select 3 id, 'c' val from dual),
t2 as (select 1 id, null val from dual union all
select 3 id, 'd' val from dual)
select *
from t1
left outer join t2 on (t1.id = t2.id)
--where t2.val is not null
order by t1.id;
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | OMem | 1Mem | O/1/M |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 12 (100)| | | | |
| 1 | SORT ORDER BY | | 3 | 33 | 12 (17)| 00:00:01 | 2048 | 2048 | 1/0/0|
|* 2 | HASH JOIN OUTER| | 3 | 33 | 11 (10)| 00:00:01 | 1156K| 1156K| 1/0/0|
| 3 | VIEW | | 3 | 18 | 6 (0)| 00:00:01 | | | |
| 4 | UNION-ALL | | | | | | | | |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 8 | VIEW | | 2 | 10 | 4 (0)| 00:00:01 | | | |
| 9 | UNION-ALL | | | | | | | | |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 11 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
-----------------------------------------------------------------------------------------------------
with t1 as (select 1 id, 'a' val from dual union all
select 2 id, 'b' val from dual union all
select 3 id, 'c' val from dual),
t2 as (select 1 id, null val from dual union all
select 3 id, 'd' val from dual)
select *
from t1
left outer join t2 on (t1.id = t2.id)
where t2.val is not null
order by t1.id;
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | OMem | 1Mem | O/1/M |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 10 (100)| | | | |
| 1 | SORT ORDER BY | | 1 | 11 | 10 (20)| 00:00:01 | 2048 | 2048 | 3/0/0|
|* 2 | HASH JOIN | | 1 | 11 | 9 (12)| 00:00:01 | 1156K| 1156K| 3/0/0|
| 3 | VIEW | | 2 | 10 | 2 (0)| 00:00:01 | | | |
| 4 | UNION-ALL | | | | | | | | |
|* 5 | FILTER | | | | | | | | |
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 8 | VIEW | | 3 | 18 | 6 (0)| 00:00:01 | | | |
| 9 | UNION-ALL | | | | | | | | |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 11 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
| 12 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |
-----------------------------------------------------------------------------------------------------
Note the change from HASH JOIN OUTER to HASH JOIN in the row with id = 2 in the 2nd explain plan.

ORACLE: INSERT SELECT FROM 2 views and value from param

I'm trying to insert some fields into MYTABLE from views MYVIEW1 and MYVIEW2 and then add a value from a parameter (this is part of a stored procedure) for UPDATED_BY, SYSDATE for UPDATED_ON. How can I correctly do this with INSERT SELECT or some other way entirely?
MYVIEW1
+------+----+-----+-----------+---------+
| YR | MO | QTR | USER_CODE | MO_PERF |
+------+----+-----+-----------+---------+
| 2012 | 1 | 1 | 1099 | 89 |
| 2012 | 2 | 1 | 1099 | 86 |
| 2012 | 3 | 1 | 1099 | 95 |
+------+----+-----+-----------+---------+
MYVIEW2
+------+-----+-----------+----------+
| YR | QTR | USER_CODE | QTR_PERF |
+------+-----+-----------+----------+
| 2012 | 1 | 1099 | 90 |
+------+-----+-----------+----------+
MYTABLE
+------+-----+-----------+---------+---------+---------+---------+-------------+------------+
| YR | QTR | USER_CODE | MO1_PCT | MO2_PCT | MO3_PCT | INC | UPDATED_BY | UPDATED_ON |
+------+-----+-----------+---------+---------+---------+---------+-------------+------------+
| 2012 | 1 | 1099 | 89 | 86 | 95 | 7000 | SAMPLE NAME | 01/16/2013 |
+------+-----+-----------+---------+---------+---------+---------+-------------+------------+
INSERT INTO MYTABLE
(YR,QTR,USER_CODE,MO1_PCT,MO2_PCT,MO3_PCT,INC,UPDATED_BY,UPDATED_ON)
SELECT b.YR,b.QTR,b.USER_CODE,b.MO1_PCT,b.MO2_PCT,b.MO3_PCT,c.INC
FROM MYVIEW1 b,
MYVIEW2 c
How do I insert values for (first month of QTR's MO_PERF) as MO1_PCT and (second month of QTR's MO_PERF) as MO2_PCT and (last month of QTR's MO_PERF) as MO3_PCT, making sure that I've inserted the right month within the right quarter and year.And then check if the MO_PERF values of each month has reached at least 85, else set INC as NULL.
,CASE WHEN MO1_PCT>=85 AND MO2_PCT>=85 AND MO3_PCT>=85 THEN 7000
ELSE NULL
END INC

If you're using oracle 11g then you can use PIVOT like this:
select YR, QTR, USER_CODE, "1_MO_PCT" MO1_PCT, "2_MO_PCT" MO2_PCT, "3_MO_PCT" MO3_PCT ,
case when "1_MO_PCT" >= 85 and "2_MO_PCT" >= 85 and "2_MO_PCT" >= 85 then 7000 end INC,
user updated_by, sysdate updated_on
from (
select m1.yr, m1.mo, m1.qtr, m1.user_code, m1.mo_perf, m2.qtr_perf
from myview1 m1 join myview2 m2 on m1.yr=m2.yr
and m1.qtr = m2.qtr and m1.user_code = m2.user_code )t
pivot(
max(mo_perf) MO_PCT for mo in (1,2,3)
)
Here is a sqlfiddle demo

INSERT trigger with INSERT INTO WHERE condition

I am working on a trigger which needs INSERT INTO with WHERE logic.
I have three tables.
Absence_table:
-----------------------------
| user_id | absence_reason |
-----------------------------
| 1234567 | 40 |
| 1234567 | 50 |
| 1213 | 40 |
| 1314 | 20 |
| 1111 | 20 |
-----------------------------
company_table:
-----------------------------
| user_id | company_id |
-----------------------------
| 1234567 | 10201 |
| 1213 | 10200 |
| 1314 | 10202 |
| 1111 | 10200 |
-----------------------------
employment_table:
--------------------------------------
| user_id | emp_type | emp_no |
--------------------------------------
| 1234567 | Int | 1 |
| 1213 | Int | 2 |
| 1314 | Int | 3 |
| 1111 | Ext | 4 |
--------------------------------------
and finally I have the table out where data should be going only who have emp_type = Int in employment_table and have company_id = 10200
out:
--------------------------------
| employee_id | absence_reason |
--------------------------------
| 1 | 40 |
| 1 | 50 |
| 2 | 40 |
| 3 | 20 |
--------------------------------
Here is my trigger:
CREATE OR REPLACE TRIGGER "INOUT"."ABSENCE_TRIGGER"
AFTER INSERT ON absence_table
FOR EACH ROW
DECLARE
BEGIN
CASE
WHEN INSERTING THEN
INSERT INTO out (absence_reason, employee_id)
VALUES (:NEW.absence_reason, (SELECT employee_id FROM employment_table WHERE user_id = :NEW.user_id)
WHERE user_id IN
(SELECT user_id FROM employment_table WHERE employment_type = 'INT')
AND user_id IN
(SELECT user_id FROM company_table WHERE company_id = '10200');
END CASE;
END absence_trigger;
It is obviously not working and I can't figure out what should I do to make it work. Any suggestions?

change the insert to this:
insert into out (absence_reason, employee_id)
select :NEW.absence_reason, e.emp_no
from employment_table e
inner join company_table c
on c.user_id = e.user_id
where e.user_id = :NEW.user_id
and e.emp_type = 'INT'
and c.company_id = '10200';
which should work. note you had emp_no in your sample structure yet employee_id in the trigger insert too. i've assumed emp_no is right. also emp_type vs employment_type.
Finally in your trigger you have company_id in quotes. Is it really a varchar2? if so OK, if not, don't use quotes.

The parentheses are not balanced. The one for values is not closed. This is the cause of your specific error, but #DazzaL's answer looks like the correct solution.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

How to remove duplicate values from SQL inner join tables? - oracle

Related

Join 3 or more tables with Comma separated values in Oracle

Hive - same sequence of records in array

why do they use (+) in where clause for checking not null in a column eg emp_name(+) IS NOT NULL

ORACLE: INSERT SELECT FROM 2 views and value from param

INSERT trigger with INSERT INTO WHERE condition

Categories

Resources