finding the best match - oracle

I am trying to find best match value from table T2 with respect to the value in table T3. For example, here I am expecting to get 441 as result.
Please advise.
Thanks.
create table t3 (rc varchar2(20));
create table t2 (rcs varchar2(20));
insert into T3 values ('441449729804');
insert into T2 values ('44');
insert into T2 values ('441');
commit;

If you only ever have a single row selected from t3 then:
SELECT t3.rc, t2.rcs
FROM t2
INNER JOIN t3
ON t3.rc LIKE t2.rcs || '%'
ORDER BY LENGTH(t2.rcs) DESC
FETCH FIRST ROW ONLY
If you can have multiple rows from t3 and want the best match for each then:
SELECT t3.rc,
t2.rcs
FROM t3
CROSS JOIN LATERAL (
SELECT rcs
FROM t2
WHERE t3.rc LIKE t2.rcs || '%'
ORDER BY LENGTH(t2.rcs) DESC
FETCH FIRST ROW ONLY
) t2
or:
SELECT rc,
rcs
FROM (
SELECT t3.rc,
t2.rcs,
ROW_NUMBER() OVER (PARTITION BY t3.rc ORDER BY LENGTH(t2.rcs) DESC) AS rn
FROM t2
INNER JOIN t3
ON t3.rc LIKE t2.rcs || '%'
)
WHERE rn = 1;
Which, for your sample data, all output:
RC
RCS
441449729804
441
fiddle

Related

Return non-null value from two tables in Oracle

I have two tables, T1 and T2 with same set of columns. I need to issue a query which will return me value of columns from either table whichever is not null. If both columns are null return null as the value of that column.
The columns are c1,c2,c3,cond1.
I issued the following query. The problem is that if one subquery fails the whole query fails. Somebody please help me. Probably there is another simple way.
SELECT NVL(T1.c1, T2.c1) c1,NVL(T1.c2, T2.c2) c2,NVL(T1.c3, T2.c3) c3
FROM (SELECT c1,c2,c3
FROM T1
WHERE cond1 = 'T10') T1
,(SELECT c1,c2,c3
FROM T2
WHERE cond1 = 'T200') T2 ;
You need something like this:
SELECT NVL((SELECT T1.c1
FROM T1
WHERE T1.c2 = 'T10'),
(SELECT T2.c1
FROM T2
WHERE T2.c2 = 'T200')) AS c1
FROM dual
Or you may prefer a full outer join:
SELECT NVL(T1.c1, T2.c1) AS c1
FROM T1 FULL OUTER JOIN T2 ON 1=1
WHERE T1.c2 = 'T10'
AND T2.c2 = 'T200'
Your result is logical. If the first table is null no combination of values will exist in the natural join.
EDIT. After some new requirements we can use a hack to get the row. Lets get all three possibilities, T1, T2 or all nulls and select the first one:
SELECT *
FROM ( (SELECT T1.*
FROM T1
WHERE T1.c2 = 'T10')
UNION ALL
(SELECT T2.*
FROM T2
WHERE T2.c2 = 'T200')
UNION ALL
(SELECT T2.*
FROM dual
LEFT JOIN T1 ON 1 = 0 ) )
WHERE ROWNUM = 1

Improve performance of stored procedure where only select query is used

In our environment one procedure is taking long time to execute. I have checked the procedure, and below is the summary -
The procedure contains only select block (around 24). Before each select we are checking if data exists. If yes select the data, else do something else. For example :
-- Select block 1 --
IF EXISTS (SELECT 1 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
)
BEGIN
SELECT t1.col1,t2.col2,t2.col3 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
END
ELSE
BEGIN
SELECT 'DEFAULT1', 'DEFAULT2', 'DEFAULT3'
END
-- Select block 2 --
IF EXISTS (SELECT 1 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col5='someValue' AND t2.col5='someValue'
)
BEGIN
SELECT t1.col5,t2.col6,t2.col7 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col5='someValue' AND t2.col5='someValue'
END
ELSE
BEGIN
SELECT 'DEFAULT1', 'DEFAULT2', 'DEFAULT3'
END
I have come to an conclution that, somehow if we can combine the query that is used within IF EXISTS block into one query, and set some value to some variables so that we can identify which where condition returns true, that can reduce the execution time and improve the performance.
Is my thought correct? Is there any option to do that? Can you suggest any other options?
We are using Microsoft SQL Server 2005.
[Editted : Added] - All select statement doesn't return same column types they are different. And all select statements are required. If there are 24 if block, procedure should return 24 result-set.
[Added]
I would like to ask one more thing, which one of the below runs faster -
SELECT 1 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
SELECT COUNT(1) FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
SELECT TOP 1 1 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
Thanks.
Kartic
To enhance the performance of select query...create "index" on columns which you are using in where clause
like you are using the
WHERE t1.col2='someValue' AND t2.col2='someValue'
WHERE t1.col5='someValue' AND t2.col5='someValue'
so create database index on col2 and col5
Temp table
you can use the temp table to store the result. since you are using same query 24 time so first store the result of below query into the temp table (correct the syntax as require)
insert into temp_table (col2, col5)
SELECT col1, col5 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
Now use the temp table for checking
-- Select block 1 --
IF EXISTS (SELECT 1 FROM temp_table
WHERE t1.col2='someValue' AND t2.col2='someValue'
)
BEGIN
SELECT t1.col1,t2.col2,t2.col3 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
END
-- Select block 2 --
IF EXISTS (SELECT 1 FROM temp_table1
WHERE t1.col5='someValue' AND t2.col5='someValue'
)
BEGIN
SELECT t1.col5,t2.col6,t2.col7 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col5='someValue' AND t2.col5='someValue'
END
The current structure is not very efficient - you effectively have to execute each "if" statement (which will be expensive), and then repeat the same where clause (the expensive bit) if the "if" returns true. And you do this 24 times. Worst case (all the queries return data), you're doubling the time for the query.
You say you've checked for indexing - given that each query appears to be subtly different, it would be worth double checking this.
The obvious thing is to refactor the application to execute the 24 select statements, and deal with the fact that sometimes, they don't return any data. That's a fairly large refactoring, and I assume you've considered that...
If you can't do that, consider a less ambitious (though nastier) refactoring. Instead of checking whether data exists, and either returning it or an equivalent default result set, write it as a union:
SELECT t1.col1,t2.col2,t2.col3 FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
UNION
SELECT 'DEFAULT1', 'DEFAULT2', 'DEFAULT3'
This reduces the number of times you're hitting the where clause, but means your client application must filter out the "default" data.
To answer your final question, I'd run it through the query optimizer and look at the execution plan - but I'd imagine that the first version is fastest - the query can complete as soon as it finds the first record that matches the where criteria. The second version must find all records that match and count them; the final version must find all records and select the first one.
You could outer-join the results of a query to a row of default values, then fall back to the defaults when the query's results are empty:
SELECT
col1 = COALESCE(query.col1, defaults.col1),
col2 = COALESCE(query.col2, defaults.col2),
col3 = COALESCE(query.col3, defaults.col3)
FROM
(SELECT 'DEFAULT1', 'DEFAULT2', 'DEFAULT3') AS defaults (col1, col2, col3)
LEFT JOIN
(
SELECT t1.col1, t2.col2, t2.col3
FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
) query
ON 1=1 -- i.e. join all the rows unconditionally
;
The method may not suit you in exactly that form you if the subquery may actually return NULLs and those must not be replaced with default values. In that case, have the subqueries return a flag column (just any value). If that column evaluates to NULL in the final query, that can only mean that the subquery hasn't returned rows. You can use that fact in a CASE expression like this:
SELECT
col1 = CASE WHEN query.HasRows IS NULL THEN defaults.col1 ELSE query.col2 END,
col2 = CASE WHEN query.HasRows IS NULL THEN defaults.col2 ELSE query.col2 END,
col3 = CASE WHEN query.HasRows IS NULL THEN defaults.col3 ELSE query.col2 END
FROM
(SELECT 'DEFAULT1', 'DEFAULT2', 'DEFAULT3') AS defaults (col1, col2, col3)
LEFT JOIN
(
SELECT HasRows = 1, t1.col1, t2.col2, t2.col3
FROM table1 t1
INNER JOIN table2 t2
ON t1.col1=t2.col1
WHERE t1.col2='someValue' AND t2.col2='someValue'
) query
ON 1=1
;

How to optimize this SELECT with sub query Oracle

Here is my query,
SELECT ID As Col1,
(
SELECT VID FROM TABLE2 t
WHERE (a.ID=t.ID or a.ID=t.ID2)
AND t.STARTDTE =
(
SELECT MAX(tt.STARTDTE)
FROM TABLE2 tt
WHERE (a.ID=tt.ID or a.ID=tt.ID2) AND tt.STARTDTE < SYSDATE
)
) As Col2
FROM TABLE1 a
Table1 has 48850 records and Table2 has 15944098 records.
I have separate indexes in TABLE2 on ID,ID & STARTDTE, STARTDTE, ID, ID2 & STARTDTE.
The query is still too slow. How can this be improved? Please help.
I'm guessing that the OR in inner queries is messing up with the optimizer's ability to use indexes. Also I wouldn't recommend a solution that would scan all of TABLE2 given its size.
This is why in this case I would suggest using a function that will efficiently retrieve the information you are looking for (2 index scan per call):
CREATE OR REPLACE FUNCTION getvid(p_id table1.id%TYPE)
RETURN table2.vid%TYPE IS
l_result table2.vid%TYPE;
BEGIN
SELECT vid
INTO l_result
FROM (SELECT vid, startdte
FROM (SELECT vid, startdte
FROM table2 t
WHERE t.id = p_id
AND t.startdte < SYSDATE
ORDER BY t.startdte DESC)
WHERE rownum = 1
UNION ALL
SELECT vid, startdte
FROM (SELECT vid, startdte
FROM table2 t
WHERE t.id2 = p_id
AND t.startdte < SYSDATE
ORDER BY t.startdte DESC)
WHERE rownum = 1
ORDER BY startdte DESC)
WHERE rownum = 1;
RETURN l_result;
END;
Your SQL would become:
SELECT ID As Col1,
getvid(a.id) vid
FROM TABLE1 a
Make sure you have indexes on both table2(id, startdte DESC) and table2(id2, startdte DESC). The order of the index is very important.
Possibly try the following, though untested.
WITH max_times AS
(SELECT a.ID, MAX(t.STARTDTE) AS Startdte
FROM TABLE1 a, TABLE2 t
WHERE (a.ID=t.ID OR a.ID=t.ID2)
AND t.STARTDTE < SYSDATE
GROUP BY a.ID)
SELECT b.ID As Col1, tt.VID
FROM TABLE1 b
LEFT OUTER JOIN max_times mt
ON (b.ID = mt.ID)
LEFT OUTER JOIN TABLE2 tt
ON ((mt.ID=tt.ID OR mt.ID=tt.ID2)
AND mt.startdte = tt.startdte)
You can look at analytic functions to avoid having to hit the second table twice. Something like this might work:
SELECT id AS col1, vid
FROM (
SELECT t1.id, t2.vid, RANK() OVER (PARTITION BY t1.id ORDER BY
CASE WHEN t2.startdte < TRUNC(SYSDATE) THEN t2.startdte ELSE null END
NULLS LAST) AS rn
FROM table1 t1
JOIN table2 t2 ON t2.id IN (t1.ID, t1.ID2)
)
WHERE rn = 1;
The inner select gets the id and vid values from the two tables with a simple join on id or id2. The rank function calculates a ranking for each matching row in the second table based on the startdte. It's complicated a bit by you wanting to filter on that date, so I've used a case to effectively ignore any dates today or later by changing the evaluated value to null, and in this instance that means the order by in the over clause needs nulls last so they're ignored.
I'd suggest you run the inner select on its own first - maybe with just a couple of id values for brevity - to see what its doing, and what ranks are being allocated.
The outer query is then just picking the top-ranked result for each id.
You may still get duplicates though; if table2 has more than one row for an id with the same startdte they'll get the same rank, but then you may have had that situation before. You may need to add more fields to the order by to break ties in a way that makes sens to you.
But this is largely speculation without being able to see where your existing query is actually slow.

conditional UPDATE in ORACLE with subquery

I have a table ( table1 ) that represents item grouping and another table ( table2 ) that represents the items themselves.
table1.id is foreign key to table2 and in every record of table1 I also collect information like the total number of records in table2 associated with that particular record and the sum of various fields so that I can show the grouping and a summary of what's in it without having to query table2.
Usually items in table2 are added/removed one at a time, so I update table1 to reflect the changes in table2.
A new requirement arose, choosen items in a group must be moved to a new group. I thought of it as a 3 step operation:
create a new group in table1
update choosen records in table2 to point to the newly created rec in table1
the third step would be to subtract to the group the number of records / the sum of those other fields I need do show and add them to the new group, data that I can find simply querying table2 for items associated with the new group.
I came up with the following statement that works.
update table1 t1 set
countitems = (
case t1.id
when 1 then t1.countitems - ( select count( t2.id ) from table2 t2 where t2.id = 2 )
when 2 then ( select count( t2.id ) from table2 t2 where t2.id = 2 )
end
),
sumitems = (
case t1.id
when 1 then t1.sumitems - ( select sum( t2.num ) from table2 t2 where t2.id = 2 )
when 2 then ( select sum( t2.num ) from table2 t2 where t2.id = 2 )
end
)
where t1.id in( 1, 2 );
is there a way to rewrite the statement without having to repeat the subquery every time?
thanks
Piero
You can use a cursor and a bulk collect update statement on the rowid. That way you can simply write the join query with the desired result and update the table with those values. I always use this function and make slight adjustments each time.
declare
cursor cur_cur
IS
select ti.rowid row_id
, count(t2.id) countitems
, sum(t2.num) numitems
from table t1
join table t2 on t1.id = t2.t1_id
order by row_id
;
type type_rowid_array is table of rowid index by binary_integer;
type type_countitems_array is table of table1.countitems%type;
type type_numitems_array is table of table1.numitems%type;
arr_rowid type_rowid_array;
arr_countitems type_countitems_array;
arr_numitems type_numitems_array;
v_commit_size number := 10000;
begin
open cur_cur;
loop
fetch cur_cur bulk collect into arr_rowid, arr_countitems, arr_numitems limit v_commit_size;
forall i in arr_rowid.first .. arr_rowid.last
update table1 tab
SET tab.countitems = arr_countitems(i)
, tab.numitems = arr_numitems(i)
where tab.rowid = arr_rowid(i)
;
commit;
exit when cur_cur%notfound;
end loop;
close cur_cur;
commit;
exception
when others
then rollback;
raise_application_error(-20000, 'ERROR updating table1(countitems,numitems) - '||sqlerrm);
end;

2x COUNT in HAVING clause

I have following problem:
I want to count data in one table, count data in second table and compare countings in having clause, and display only that rows which have the same countings
Something like that:
SELECT bla
FROM T1 t1 JOIN T2 t2
ON t1.id = t2.id
HAVING COUNT(counted data from table1) = COUNT(counted data from table2)
Do you have any idea?
Cheers
Standard SQL:
SELECT t1.bla, t1.id, t1.counter, t2.counter
FROM (SELECT t1.bla, t1.id, COUNT(counted_data_from_t1) AS counter
FROM t1
GROUP BY t1.bla, t1.id
) AS t1
JOIN (SELECT t2.id, COUNT(counted_data_from_t2) AS counter
FROM t2
GROUP BY t2.id
) AS t2
ON t1.id = t2.id AND t1.counter = t2.counter
Oracle SQL (because Oracle doesn't like AS before table aliases):
SELECT t1.bla, t1.id, t1.counter, t2.counter
FROM (SELECT t1.bla, t1.id, COUNT(counted_data_from_t1) AS counter
FROM t1
GROUP BY t1.bla, t1.id
) t1
JOIN (SELECT t2.id, COUNT(counted_data_from_t2) AS counter
FROM t2
GROUP BY t2.id
) t2
ON t1.id = t2.id AND t1.counter = t2.counter
You just have to decide where bla comes from; I nominated t1. I'm assuming that for any given value of t1.id, there is a single value of t1.bla. If there isn't, then you need to explain much more clearly what you're counting and where the various columns are, and what the keys of the tables are.
Update: Apologies for not noticing the Oracle tag and giving invalid Oracle syntax.
WITH jezyki as
(SELECT pseudo_wampira, COUNT(*) AS counter
FROM Jezyki_obce_w
GROUP BY pseudo_wampira
)
,sprawnosc as
(SELECT pseudo_wampira, sprawnosc, COUNT(*) AS counter
FROM Sprawnosci_w
GROUP BY pseudo_wampira, sprawnosc
)
SELECT jezyki.pseudo_wampira, sprawnosc.counter
FROM jezyki,sprawnosc
WHERE jezyki.pseudo_wampira = sprawnosc.pseudo_wampira
AND jezyki.counter = sprawnosc.counter

Resources