joining based on columns priority - oracle

I want to join two tables based on column priority.
For example, suppose Table1 has six columns (Col1, Col2, Col3, Col4, Col5, Col6) and I want to join it with Table2 (Col1, Col2, Col3, Col4, Col5, Col7). The query should
Select Table2.col7
where
first, check col1, col2 and col3; if a match is found, there is no need to check further
second, check col1 and col2; if a match is found, there is no need to check further
third, check col1; if a match is found, there is no need to check further
last, ignore col1, col2 and col3 altogether
AND Table1.Col4=Table2.Col4
AND Table1.Col5=Table2.Col5
I may not have explained this clearly; if anything is unclear, please ask.

You cannot tell SQL to try to join on a certain condition first and in case it finds no match to go on searching. What you can do is join all allowed combinations (matches on col4 and col5 in your case) and then rank your matches (such that a match on col1 and col2 and col3 is considered best etc.). Then only keep the best matches:
select col7
from
(
  select
    t2.col7, -- only col7 is needed by the outer query
    row_number() over
    (
      partition by t1.col4, t1.col5
      order by case
        when t2.col1 = t1.col1 and t2.col2 = t1.col2 and t2.col3 = t1.col3 then 1
        when t2.col1 = t1.col1 and t2.col2 = t1.col2 then 2
        when t2.col1 = t1.col1 then 3
        else 4
      end
    ) as rn
  from table1 t1
  join table2 t2 on t2.col4 = t1.col4 and t2.col5 = t1.col5
)
where rn = 1;
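As a rough illustration of the rank-then-filter idea, the sketch below runs an equivalent query against an in-memory SQLite database from Python. SQLite only stands in for Oracle here, and the sample rows and col7 labels are invented for the example, so treat it as a sketch under those assumptions rather than a tested Oracle script.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1, col2, col3, col4, col5, col6);
    CREATE TABLE table2 (col1, col2, col3, col4, col5, col7);
    -- Hypothetical sample rows
    INSERT INTO table1 VALUES ('a', 'b', 'c', 1, 1, 'x');
    INSERT INTO table1 VALUES ('m', 'n', 'o', 2, 2, 'y');
    INSERT INTO table2 VALUES ('a', 'b', 'c', 1, 1, 'all three match');
    INSERT INTO table2 VALUES ('a', 'b', 'z', 1, 1, 'col1 and col2 match');
    INSERT INTO table2 VALUES ('q', 'r', 's', 1, 1, 'fallback');
    INSERT INTO table2 VALUES ('m', 'k', 'k', 2, 2, 'col1 matches');
    INSERT INTO table2 VALUES ('k', 'k', 'k', 2, 2, 'fallback');
""")

# Join every candidate on col4/col5, rank candidates by match quality,
# then keep only the best-ranked candidate per (col4, col5) group.
best = conn.execute("""
    SELECT col4, col7
    FROM (
        SELECT t1.col4 AS col4,
               t2.col7 AS col7,
               ROW_NUMBER() OVER (
                   PARTITION BY t1.col4, t1.col5
                   ORDER BY CASE
                       WHEN t2.col1 = t1.col1 AND t2.col2 = t1.col2 AND t2.col3 = t1.col3 THEN 1
                       WHEN t2.col1 = t1.col1 AND t2.col2 = t1.col2 THEN 2
                       WHEN t2.col1 = t1.col1 THEN 3
                       ELSE 4
                   END
               ) AS rn
        FROM table1 t1
        JOIN table2 t2 ON t2.col4 = t1.col4 AND t2.col5 = t1.col5
    )
    WHERE rn = 1
    ORDER BY col4
""").fetchall()

print(best)  # [(1, 'all three match'), (2, 'col1 matches')]
```

The first group has a full three-column match, so the weaker candidates are discarded; the second group has no full match, so the col1-only candidate wins over the fallback row.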

Select t2.col7
from Table1 t1 inner join Table2 t2
on
  case
    when t1.col1 = t2.col1 then 1
    when t1.col2 = t2.col2 then 1
    when t1.col3 = t2.col3 then 1
    when t1.Col4=t2.Col4
     and t1.Col5=t2.Col5 then 1
    else 0 end = 1
;

Related

delete data from Table1 that doesn't exist in Table2

I have two different tables, and I want to delete the records from Table1 that do not exist in Table2.
Table1:
select col1 from Table1
Table2:
select
concat('A_',col1)
from
Table2
where
Col2 = '748'
and Col3 = 'D'
and Col4 = 'Account'
Now I want to delete the difference from Table1...
This can be done using the MINUS operator and an INSERT INTO statement.
insert into table3(col) (
select col1 from Table1
minus
select
concat('A_',col1)
from
Table2
where
Col2 = '748'
and Col3 = 'D'
and Col4 = 'Account'
)
Records can then be deleted from table1 using a DELETE statement like:
delete from table1
where col1 in (
select col1 from Table1
minus
select
concat('A_',col1)
from
Table2
where
Col2 = '748'
and Col3 = 'D'
and Col4 = 'Account'
)
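To try the set-difference delete on a few rows, here is a small sketch using Python's sqlite3 module. It rests on several assumptions: SQLite stands in for Oracle, so MINUS is written as EXCEPT and concat('A_', col1) as 'A_' || col1, and the sample rows are invented.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle (EXCEPT instead of MINUS,
# '||' instead of concat). The rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col1 TEXT);
    CREATE TABLE Table2 (col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT);
    INSERT INTO Table1 VALUES ('A_1'), ('A_2'), ('A_3'), ('B_9');
    INSERT INTO Table2 VALUES ('1', '748', 'D', 'Account'),
                              ('2', '748', 'D', 'Account'),
                              ('3', '999', 'D', 'Account');
""")

# Rows of Table1 that have no counterpart in the filtered Table2 set.
difference_sql = """
    SELECT col1 FROM Table1
    EXCEPT
    SELECT 'A_' || col1 FROM Table2
    WHERE Col2 = '748' AND Col3 = 'D' AND Col4 = 'Account'
"""
to_delete = sorted(row[0] for row in conn.execute(difference_sql))

# Delete exactly that difference from Table1.
conn.execute("DELETE FROM Table1 WHERE col1 IN (" + difference_sql + ")")
remaining = sorted(row[0] for row in conn.execute("SELECT col1 FROM Table1"))

print(to_delete)  # ['A_3', 'B_9']
print(remaining)  # ['A_1', 'A_2']
```

Running the difference query on its own first, as done here, is a cheap way to check which rows the DELETE would remove.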
delete from table1 t1
where not exists ( select * from table2 where col2 || col3 || col4 = t1.col1 );
This will work except in the following situation; you need to explain what you want in that case, and the DELETE statement can be modified to accommodate it.
If t1.col1 is NULL, the row will be deleted even if there are rows in table2 where col2, col3 and col4 are all NULL. Is that situation possible (where t1.col1 and col2, col3, col4 in table2 are all NULL)? If so, should the row in t1 be kept rather than deleted?

INNER JOIN on multiple columns with multiple matches

How can I find which rows got exactly one match and which rows got more than one match in an INNER JOIN?
SELECT A.Col1, B.Col2 FROM A INNER JOIN B
ON A.Col3 = B.Col3 AND A.Col4 = B.Col4;
As we know, an INNER JOIN returns rows with at least one match. So, to reiterate my question: how do I find which rows matched once and which rows matched more than once?
Regards,
Sachin
You could use a window function to count how many records are coming from B:
SELECT A.Col1, B.Col2, Count(*) OVER (PARTITION BY b.col3, b.col4) as bcount
FROM A
INNER JOIN B
ON A.Col3 = B.Col3 AND A.Col4 = B.Col4;
With the help of JNevill's inputs, here is a working example of what I was looking for. I want to thank JNevill once again.
create table A (col1 number, col3 number, col4 number, col5 number, col6 number);
create table B (col2 number, col3 number, col4 number, col5 number, col6 number);
insert into A values (1,2,3, 4, 5);
insert into A values (2,3,4,5,6);
insert into B values (3,4,5,6,7);
insert into B values (4,2,3,4,5);
insert into B values (5,2,3,8,9);
insert into B values (6,3,4,5,6);
insert into B values (7,3,4,5,6);
SELECT Col1 FROM (
  SELECT A.Col1, B.Col2, A.Col3, A.Col4, A.Col5, A.Col6,
         Count(*) OVER (PARTITION BY B.col3, B.col4, B.col5, B.col6) as bcount
  FROM A
  INNER JOIN B
    ON A.Col3 = B.Col3 AND A.Col4 = B.Col4 AND A.Col5 = B.Col5 AND A.Col6 = B.Col6
) WHERE BCOUNT = 1;
So, I was looking for the column from table A that has exactly one match in table B across all the joining columns.
Regards.
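To make the per-row counts visible, the sketch below loads the same sample rows into an in-memory SQLite database from Python and prints bcount for every joined row before filtering. SQLite is only an assumed stand-in for Oracle here.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table A (col1 number, col3 number, col4 number, col5 number, col6 number);
    create table B (col2 number, col3 number, col4 number, col5 number, col6 number);
    insert into A values (1,2,3,4,5);
    insert into A values (2,3,4,5,6);
    insert into B values (3,4,5,6,7);
    insert into B values (4,2,3,4,5);
    insert into B values (5,2,3,8,9);
    insert into B values (6,3,4,5,6);
    insert into B values (7,3,4,5,6);
""")

joined_sql = """
    SELECT A.col1 AS col1, B.col2 AS col2,
           COUNT(*) OVER (PARTITION BY B.col3, B.col4, B.col5, B.col6) AS bcount
    FROM A
    INNER JOIN B
      ON A.col3 = B.col3 AND A.col4 = B.col4
     AND A.col5 = B.col5 AND A.col6 = B.col6
"""

# Every joined row with the number of B rows sharing its join key.
counts = conn.execute(joined_sql + " ORDER BY col1, col2").fetchall()

# Only the A rows whose key matched exactly one B row.
single = conn.execute(
    "SELECT col1 FROM (" + joined_sql + ") WHERE bcount = 1"
).fetchall()

print(counts)  # [(1, 4, 1), (2, 6, 2), (2, 7, 2)]
print(single)  # [(1,)]
```

The row with Col1 = 2 appears twice with bcount = 2, which is why it is filtered out by BCOUNT = 1.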

Generate difference between 2 tables listing columns from both tables

I have two tables with the same columns. I want to generate the difference between them and show it with all columns from both tables listed.
Example:
select a.*,b.* from (
(
select a.col1,a.col2 from
(select col1, col2 from table1 minus select col1, col2 from table2) as a
)
union
(
select b.col1, b.col2 from
(select col1, col2 from table2 minus select col1, col2 from table2) as b
)
)
The result should be
a.col1    a.col2        b.col1    b.col2
a.FName   a.ZipCode     b.FName   b.ZipCode
John      <same value>  Jane      <same value as A>
Alpha     1234          Beta      2345
My query raises an exception saying that a right parenthesis is missing after the first MINUS keyword.
I think you are trying to find rows from table a which are missing in table b and rows in table b which are missing from table a. However, there is no point in joining these two sets. Try the following query and see if it works for you.
SELECT col1, col2, 'Missing from table 2' title
FROM
(
  SELECT col1,
         col2
  FROM table1
  MINUS
  SELECT col1,
         col2
  FROM table2
)
UNION ALL
SELECT col1, col2, 'Missing from table 1' title
FROM
(
  SELECT col1,
         col2
  FROM table2
  MINUS
  SELECT col1,
         col2
  FROM table1
)
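The two-directional difference can be tried on a few rows with the sketch below, which uses Python's sqlite3 module. The assumptions: SQLite stands in for Oracle, so MINUS becomes EXCEPT, and the sample names and zip codes are invented.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle (EXCEPT instead of MINUS).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (col1 TEXT, col2 TEXT);
    -- Hypothetical sample rows: 'John' is in both tables.
    INSERT INTO table1 VALUES ('John', '1234'), ('Alpha', '1234');
    INSERT INTO table2 VALUES ('John', '1234'), ('Beta', '2345');
""")

# One labelled row per record that exists on only one side.
diff = sorted(conn.execute("""
    SELECT col1, col2, 'Missing from table 2' AS title
    FROM (SELECT col1, col2 FROM table1
          EXCEPT
          SELECT col1, col2 FROM table2)
    UNION ALL
    SELECT col1, col2, 'Missing from table 1' AS title
    FROM (SELECT col1, col2 FROM table2
          EXCEPT
          SELECT col1, col2 FROM table1)
""").fetchall())

for row in diff:
    print(row)
# ('Alpha', '1234', 'Missing from table 2')
# ('Beta', '2345', 'Missing from table 1')
```

Rows present in both tables, such as 'John' here, do not appear in either half of the result.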

SQL Server 2012: Update table with inner join after sorted

I am using SQL Server 2012. I have a table called table1 like below:
Id  col1  col2  col3  Name
1   a     b     abc   null
2   b     c     mno   null
And I have another table table2, like below:
Id  col1  col2  col3  Name
1   %     %     abc   Name1
2   a     %     abc   Name2
3   %     b     abc   Name3
4   a     b     abc   Name4
I have to update the Name column in table1 from the Name column in table2, based on the columns col1, col2 and col3.
The row with Id = 1 in table1 finds all 4 matches in table2 because I am using the LIKE operator on col1 and col2 for the comparison (I use LIKE so that, when there is no exact match, % is accepted as a match).
My problem is this: if there is an exact match on col1, col2 and col3, only that row should be considered, not the rows with a '%' value. For example, for Id = 1 in table1, the result should come from Id = 4 in table2.
I tried the following query:
UPDATE table1
SET name = t2.Name
FROM (SELECT TOP 1
t1.id, t2.name
FROM table1 t1
INNER JOIN table2 t2
ON t1.col3 = t2.col3 AND t1.col1 LIKE t2.col1
AND t1.col2 LIKE t2.col2
ORDER BY t2.col1, t2.col2) AS t3
WHERE id = t3.id;
But I am not getting the result I expected. Also, there are 8,000,000 records in table1, so the solution should not hurt performance.
Please help me fix this issue.
At a first try, I suggest this:
UPDATE table1
SET name = t3.name -- the derived table is aliased t3; t2 is not visible here
FROM (SELECT TOP(1) *
      FROM (SELECT
              t1.id, t2.name, t2.col1, t2.col2, 2 As ord
            FROM table1 t1
            INNER JOIN table2 t2
              ON t1.col3 = t2.col3 AND t1.col1 LIKE t2.col1
              AND t1.col2 LIKE t2.col2
            UNION ALL
            SELECT
              t1.id, t2.name, t2.col1, t2.col2, 1 As ord
            FROM table1 t1
            INNER JOIN table2 t2
              ON t1.col3 = t2.col3 AND t1.col1 = t2.col1
              AND t1.col2 = t2.col2
           ) DT
      ORDER BY ord, col1, col2) AS t3
WHERE table1.id = t3.id; -- qualify id to avoid an ambiguous column reference
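Note that TOP(1) in the derived table keeps a single row for the whole statement rather than one row per table1 row. One possible way to pick the best match separately for each table1 row is to rank the candidates with ROW_NUMBER() partitioned by t1.id, with fewer '%' wildcards ranking higher. The sketch below tries that idea on the sample data from the question; it is an assumption-laden illustration that uses Python's sqlite3 module as a stand-in for SQL Server, and it says nothing about performance on 8,000,000 rows.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT, name TEXT);
    CREATE TABLE table2 (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT, name TEXT);
    INSERT INTO table1 VALUES (1, 'a', 'b', 'abc', NULL), (2, 'b', 'c', 'mno', NULL);
    INSERT INTO table2 VALUES (1, '%', '%', 'abc', 'Name1'),
                              (2, 'a', '%', 'abc', 'Name2'),
                              (3, '%', 'b', 'abc', 'Name3'),
                              (4, 'a', 'b', 'abc', 'Name4');
""")

# Rank the LIKE matches per table1 row: the fewer wildcards, the better.
best = conn.execute("""
    SELECT id, name
    FROM (
        SELECT t1.id AS id, t2.name AS name,
               ROW_NUMBER() OVER (
                   PARTITION BY t1.id
                   ORDER BY CASE WHEN t2.col1 = '%' THEN 1 ELSE 0 END
                          + CASE WHEN t2.col2 = '%' THEN 1 ELSE 0 END
               ) AS rn
        FROM table1 t1
        JOIN table2 t2
          ON t1.col3 = t2.col3
         AND t1.col1 LIKE t2.col1
         AND t1.col2 LIKE t2.col2
    )
    WHERE rn = 1
""").fetchall()

# Apply the best match to each table1 row that has one.
conn.executemany("UPDATE table1 SET name = ? WHERE id = ?",
                 [(name, row_id) for row_id, name in best])
result = conn.execute("SELECT id, name FROM table1 ORDER BY id").fetchall()

print(result)  # [(1, 'Name4'), (2, None)]
```

Row 1 picks Name4 because that candidate has no wildcards; row 2 has no candidate with col3 = 'mno', so its name stays NULL.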

Replace selfjoin with analytic functions

How do I go about replacing the following self join using analytics:
SELECT
  t1.col1 col1,
  t1.col2 col2,
  SUM( extract(hour   FROM (t1.times_stamp - t2.times_stamp)) * 3600
     + extract(minute FROM (t1.times_stamp - t2.times_stamp)) * 60
     + extract(second FROM (t1.times_stamp - t2.times_stamp)) ) div,
  COUNT(*) tot_count
FROM tab1 t1,
     tab1 t2
WHERE t2.col1 = t1.col1
  AND t2.col2 = t1.col2
  AND t2.col3 = t1.sequence_num
  AND t2.times_stamp < t1.times_stamp
  AND t2.col4 = 3
  AND t1.col4 = 4
  AND t2.col5 NOT IN(103,123)
  AND t1.col5 != 549
GROUP BY t1.col1, t1.col2
I'm pretty sure you won't be able to replace the self-join with analytics because you are using inter-row operations (t1.times_stamp - t2.times_stamp). Analytics can only access the values of the current row and the value of aggregate functions over a subset of rows (windowing clause).
See this article from Tom Kyte and this paper for further analysis of the limitations of analytics.
It almost looks like you could eliminate the self join on t2 and replace
t1.times_stamp - t2.times_stamp
with something like
t1.times_stamp - lag(t1.times_stamp) over (partition by col1, col2 order by times_stamp)
The different filters on t1 and t2 on col4 and col5 are what prevent you from doing this.
Analytic functions are applied after the where / group by of the main query, so you'd need a single filter on t1 in order to use lag/lead to refer to following or preceding rows in a sequence.
Also, you'd need to push the sum/group by to an outer query so that the aggregation happens after the analytic function:
select col1, col2, sum(timestamp_diff) from (
  select col1, col2, times_stamp - lag(times_stamp) over(.....) as timestamp_diff
  from tab1
  where ....
) group by col1, col2
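The shape of that query, with lag in an inner query and sum/group by in the outer one, can be tried with the sketch below. It makes several simplifying assumptions: Python's sqlite3 module stands in for Oracle, the timestamps are plain integers (a hypothetical column ts holding seconds), the rows are invented, and the col4/col5 filters from the question are left out.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for Oracle,
# and ts is a hypothetical integer column holding seconds.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col1 TEXT, col2 TEXT, ts INTEGER);
    INSERT INTO tab1 VALUES ('a', 'x', 10), ('a', 'x', 25), ('a', 'x', 40),
                            ('b', 'y', 5),  ('b', 'y', 9);
""")

# Inner query: difference to the previous row within each (col1, col2) group.
# Outer query: aggregate those differences after the analytic function ran.
totals = conn.execute("""
    SELECT col1, col2, SUM(ts_diff) AS total_diff
    FROM (
        SELECT col1, col2,
               ts - LAG(ts) OVER (PARTITION BY col1, col2 ORDER BY ts) AS ts_diff
        FROM tab1
    )
    GROUP BY col1, col2
    ORDER BY col1, col2
""").fetchall()

print(totals)  # [('a', 'x', 30), ('b', 'y', 4)]
```

The first row of each group has no predecessor, so LAG returns NULL there and SUM simply skips it.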
