I'm developing a procedure in Oracle.
I need to write something like this:
--
open curr1 for
select *
from table1
where key_field in (select key_field from tbl_keys where type = 1);
--
open curr2 for
select *
from table2
where key_field in (select key_field from tbl_keys where type = 1);
--
open curr3 for
select *
from table3
where key_field in (select key_field from tbl_keys where type = 1);
--
Is there a better way to do this?
Any way to optimize the inner select?
Thanks!
I think Oracle will try to cache the results of your tbl_keys subquery, so if it returns a small number of rows, your queries are probably fine the way they are.
I don't know if it's better, but if you're having performance problems, an alternative you could try is to join the tables, e.g.
open curr1 for
select t.*
from table1 t
join tbl_keys k
on k.key_field = t.key_field
and k.type = 1;
This might improve performance if your tbl_keys table is very large.
Personally, I prefer using implicit cursor loops whenever possible - they're simple and can be very fast - but I don't know if it would work for your procedure, since you didn't show the rest of it.
for r in (select t.*
from table1 t
join tbl_keys k
on k.key_field = t.key_field
and k.type = 1)
loop
-- output the row somehow
end loop;
for r in (select t.*
from table2 t
join tbl_keys k
on k.key_field = t.key_field
and k.type = 1)
loop
-- output the row somehow
end loop;
...etc
There are around 120k records in the database. Based on a few functions I calculate scores for all the records, and weekly I have to update the table with new records and their respective scores.
Below is a procedure that I am using to merge data into the table:
create or replace procedure scorecalc
as
  score1 number;
  score2 number;
  score3 number;
  cursor cur is
    select id_number from tableA;
begin
  for r_num in cur
  loop
    select functionA(r_num.id_number), functionB(r_num.id_number), functionC(r_num.id_number)
      into score1, score2, score3
      from dual;

    merge into scores A using
      (select r_num.id_number as ID,
              score1          as scorea,
              score2          as scoreb,
              score3          as scorec,
              trunc(sysdate)  as scoredate
         from dual) B
      on (A.ID = B.ID and A.scoredate = B.scoredate)
    when not matched then
      insert (ID, scorea, scoreb, scorec, scoredate)
      values (B.ID, B.scorea, B.scoreb, B.scorec, B.scoredate)
    when matched then
      update set
        A.scorea = B.scorea,
        A.scoreb = B.scoreb,
        A.scorec = B.scorec;

    commit;
  end loop;
end;
functionA/B/C contain complex queries and joins to calculate the scores.
Please suggest any way to improve the performance; currently, with this snippet of code, I am only able to insert some 2k records in an hour. Can I use parallel DML here?
Thank you!
Why are you doing this in a procedure? This could all be done via DML:
MERGE INTO scores a USING
  (SELECT ta.id_number AS id,
          functionA(ta.id_number) AS scorea,
          functionB(ta.id_number) AS scoreb,
          functionC(ta.id_number) AS scorec,
          TRUNC(SYSDATE) AS scoredate
     FROM tableA ta) b
ON (a.id = b.id AND a.scoredate = b.scoredate)
WHEN MATCHED THEN UPDATE SET
  a.scorea = b.scorea,
  a.scoreb = b.scoreb,
  a.scorec = b.scorec
WHEN NOT MATCHED THEN INSERT (id, scorea, scoreb, scorec, scoredate)
VALUES (b.id, b.scorea, b.scoreb, b.scorec, b.scoredate);
If you want to try a PARALLEL hint after that, feel free; a sketch follows below. But you should definitely get rid of that cursor and stop doing "slow-by-slow" processing.
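For what it's worth, a minimal sketch of what that could look like, assuming Enterprise Edition and that functionA/B/C are safe to run in parallel (they may need to be declared PARALLEL_ENABLE); the degree of 4 is an arbitrary illustration:

-- Parallel DML must be enabled for the session before the statement runs.
ALTER SESSION ENABLE PARALLEL DML;

MERGE /*+ PARALLEL(a 4) */ INTO scores a USING
  (SELECT /*+ PARALLEL(ta 4) */ ta.id_number AS id,
          functionA(ta.id_number) AS scorea,
          functionB(ta.id_number) AS scoreb,
          functionC(ta.id_number) AS scorec,
          TRUNC(SYSDATE) AS scoredate
     FROM tableA ta) b
ON (a.id = b.id AND a.scoredate = b.scoredate)
WHEN MATCHED THEN UPDATE SET
  a.scorea = b.scorea, a.scoreb = b.scoreb, a.scorec = b.scorec
WHEN NOT MATCHED THEN INSERT (id, scorea, scoreb, scorec, scoredate)
VALUES (b.id, b.scorea, b.scoreb, b.scorec, b.scoredate);

-- A parallel DML transaction must be committed before the modified
-- table can be queried again in the same session.
COMMIT;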
Something I've had some success with is inserting from a SELECT statement. It performs well because it doesn't involve row-by-row inserting.
In your case, I'm thinking it would be something like:
INSERT INTO scores (ID, scorea, scoreb, scorec, scoredate)
SELECT id_number,
       functionA(id_number),
       functionB(id_number),
       functionC(id_number),
       TRUNC(SYSDATE)
FROM tableA
An example of this can be found at the link below:
https://docs.oracle.com/cd/B12037_01/appdev.101/b10807/13_elems025.htm
To schedule it, just put the statement from #Del in a procedure block:
create or replace procedure Saturday_Night_Merge is
begin
<Put the merge statement here>
end Saturday_Night_Merge;
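If helpful, a sketch of actually scheduling that procedure weekly with DBMS_SCHEDULER; the job name and the Saturday-11pm timing are illustrative assumptions:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'WEEKLY_SCORE_MERGE',        -- illustrative name
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'SATURDAY_NIGHT_MERGE',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=WEEKLY; BYDAY=SAT; BYHOUR=23',  -- assumed timing
    enabled         => TRUE);
END;
/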
I'm currently migrating data from a legacy system to the current system.
I have this INSERT statement inside a stored procedure.
INSERT INTO TABLE_1
(PRIMARY_ID, SEQUENCE_ID, DESCR)
SELECT LEGACY_ID PRIMARY_ID
, (SELECT COUNT(*) + 1
FROM TABLE_1 T1
WHERE T1.PRIMARY_ID = L1.LEGACY_ID) SEQUENCE_ID
, L1.DESCR
FROM LEGACY_TABLE L1;
However, whenever LEGACY_TABLE contains multiple rows with the same LEGACY_ID, the SEQUENCE_ID subquery doesn't increment.
Why is this so? I can't seem to find any documentation on how an INSERT INTO ... SELECT statement works behind the scenes. I am guessing that it first selects all the values from the source table and only then inserts them, which is why the COUNT(*) value doesn't increment?
What other workarounds can I do? I cannot create a SEQUENCE because SEQUENCE_ID must be based on the number of rows already present for each PRIMARY_ID. Together the two columns form the primary key.
Thanks in advance.
Yes, the SELECT is executed first, and only then does the INSERT happen.
The simple PL/SQL block below is a straightforward approach, though not an efficient one.
DECLARE
  V_SEQUENCE_ID  NUMBER;
  V_COMMIT_LIMIT NUMBER := 20000;
  V_ITEM_COUNT   NUMBER := 0;
BEGIN
  FOR REC IN (SELECT LEGACY_ID, DESCR FROM LEGACY_TABLE)
  LOOP
    SELECT COUNT(*) + 1
      INTO V_SEQUENCE_ID
      FROM TABLE_1 T1
     WHERE T1.PRIMARY_ID = REC.LEGACY_ID;

    INSERT INTO TABLE_1
      (PRIMARY_ID, SEQUENCE_ID, DESCR)
    VALUES
      (REC.LEGACY_ID, V_SEQUENCE_ID, REC.DESCR);

    V_ITEM_COUNT := V_ITEM_COUNT + 1;
    IF V_ITEM_COUNT >= V_COMMIT_LIMIT THEN
      COMMIT;
      V_ITEM_COUNT := 0;
    END IF;
  END LOOP;
  COMMIT;
END;
/
EDIT: Using a CTE:
INSERT INTO TABLE_1 (PRIMARY_ID, SEQUENCE_ID, DESCR)
WITH TABLE_1_PRIM_CT AS
(
  SELECT L1.LEGACY_ID AS PRIMARY_ID,
         COUNT(T1.PRIMARY_ID) AS SEQUENCE_ID   -- 0 when no rows exist yet
    FROM LEGACY_TABLE L1
    LEFT OUTER JOIN TABLE_1 T1
      ON (L1.LEGACY_ID = T1.PRIMARY_ID)
   GROUP BY L1.LEGACY_ID
)
SELECT L1.LEGACY_ID,
       CTE.SEQUENCE_ID + ROW_NUMBER() OVER (PARTITION BY L1.LEGACY_ID ORDER BY NULL),
       L1.DESCR
  FROM TABLE_1_PRIM_CT CTE, LEGACY_TABLE L1
 WHERE L1.LEGACY_ID = CTE.PRIMARY_ID;
PS: With your millions of rows, this is going to materialize a temporary result set of the same size during execution. Check the explain plan before actually running it.
I need to make a conversion from Oracle SQL to PostgreSQL.
select * from table1 inner join table2 on table1.id = table2.table1Id
where table1.col1 = 'TEST'
and rownum <=5
order by table2.col1
If I delete "and rownum <= 5" and put "limit 5" at the end, there are differences between the two dialects. In Oracle, the 5 rows are selected first, and after that they are sorted by table2.col1. In Postgres, the whole result set is sorted first, and only AFTER that are the first 5 rows selected.
How can I obtain the same result in Postgres as in Oracle?
Thanks!
To get the behavior you desire, you can use a subquery like this:
SELECT * FROM (
SELECT table1.col1 as t1col1, table2.col1 as t2col1
FROM table1 INNER JOIN table2 ON table1.id = table2.table1Id
WHERE table1.col1 = 'TEST'
LIMIT 5
) AS sub
ORDER BY t2col1;
I named the columns there because in your example both tables had a col1.
Note however that without any ordering on the inner query, the selection of 5 rows you get will be purely random and subject to change.
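If you do want a deterministic set of 5 rows, add an ORDER BY inside the subquery before the LIMIT; here sorting by table1.id, which is an assumed choice, since the original query doesn't define which 5 rows Oracle returns:

SELECT * FROM (
  SELECT table1.col1 AS t1col1, table2.col1 AS t2col1
  FROM table1 INNER JOIN table2 ON table1.id = table2.table1Id
  WHERE table1.col1 = 'TEST'
  ORDER BY table1.id  -- assumed ordering; pick whatever "first 5" should mean
  LIMIT 5
) AS sub
ORDER BY t2col1;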
Depending on the version you are using, PostgreSQL 8.4 and above have window functions. The window function ROW_NUMBER() can implement the functionality of the Oracle pseudocolumn rownum, provided you compute it in a subquery and filter on it in the outer query:
select * from (
  select table1.col1 as t1col1, table2.col1 as t2col1,
         row_number() over () as rn
  from table1 inner join table2 on table1.id = table2.table1Id
  where table1.col1 = 'TEST'
) sub
where rn <= 5
order by t2col1;
Here is my query,
SELECT ID As Col1,
(
SELECT VID FROM TABLE2 t
WHERE (a.ID=t.ID or a.ID=t.ID2)
AND t.STARTDTE =
(
SELECT MAX(tt.STARTDTE)
FROM TABLE2 tt
WHERE (a.ID=tt.ID or a.ID=tt.ID2) AND tt.STARTDTE < SYSDATE
)
) As Col2
FROM TABLE1 a
Table1 has 48850 records and Table2 has 15944098 records.
I have separate indexes on TABLE2: (ID), (ID, STARTDTE), (STARTDTE), (ID2), and (ID2, STARTDTE).
The query is still too slow. How can this be improved? Please help.
I'm guessing that the OR in the inner queries is interfering with the optimizer's ability to use indexes. Also, I wouldn't recommend a solution that would scan all of TABLE2, given its size.
This is why, in this case, I would suggest using a function that efficiently retrieves the information you are looking for (two index scans per call):
CREATE OR REPLACE FUNCTION getvid(p_id table1.id%TYPE)
RETURN table2.vid%TYPE IS
l_result table2.vid%TYPE;
BEGIN
SELECT vid
INTO l_result
FROM (SELECT vid, startdte
FROM (SELECT vid, startdte
FROM table2 t
WHERE t.id = p_id
AND t.startdte < SYSDATE
ORDER BY t.startdte DESC)
WHERE rownum = 1
UNION ALL
SELECT vid, startdte
FROM (SELECT vid, startdte
FROM table2 t
WHERE t.id2 = p_id
AND t.startdte < SYSDATE
ORDER BY t.startdte DESC)
WHERE rownum = 1
ORDER BY startdte DESC)
WHERE rownum = 1;
RETURN l_result;
END;
Your SQL would become:
SELECT ID As Col1,
getvid(a.id) vid
FROM TABLE1 a
Make sure you have indexes on both table2(id, startdte DESC) and table2(id2, startdte DESC). The order of the index is very important.
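For reference, a sketch of those two indexes (the index names are illustrative):

CREATE INDEX table2_id_startdte_ix  ON table2 (id,  startdte DESC);
CREATE INDEX table2_id2_startdte_ix ON table2 (id2, startdte DESC);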
Possibly try the following, though untested.
WITH max_times AS
(SELECT a.ID, MAX(t.STARTDTE) AS Startdte
FROM TABLE1 a, TABLE2 t
WHERE (a.ID=t.ID OR a.ID=t.ID2)
AND t.STARTDTE < SYSDATE
GROUP BY a.ID)
SELECT b.ID As Col1, tt.VID
FROM TABLE1 b
LEFT OUTER JOIN max_times mt
ON (b.ID = mt.ID)
LEFT OUTER JOIN TABLE2 tt
ON ((mt.ID=tt.ID OR mt.ID=tt.ID2)
AND mt.startdte = tt.startdte)
You can look at analytic functions to avoid having to hit the second table twice. Something like this might work:
SELECT id AS col1, vid
FROM (
  SELECT t1.id, t2.vid, RANK() OVER (PARTITION BY t1.id ORDER BY
    CASE WHEN t2.startdte < TRUNC(SYSDATE) THEN t2.startdte ELSE null END
    NULLS LAST) AS rn
  FROM table1 t1
  JOIN table2 t2 ON t1.id IN (t2.id, t2.id2)
)
WHERE rn = 1;
The inner select gets the id and vid values from the two tables with a simple join on id or id2. The RANK function calculates a ranking for each matching row in the second table based on the startdte. It's complicated a bit by your wanting to filter on that date, so I've used a CASE to effectively ignore any dates today or later by changing the evaluated value to null; in this instance that means the ORDER BY in the OVER clause needs NULLS LAST so those rows are ignored.
I'd suggest you run the inner select on its own first - maybe with just a couple of id values for brevity - to see what it's doing and what ranks are being allocated; see the sketch below.
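A sketch of that check, where 1234 and 5678 are placeholder id values:

SELECT t1.id, t2.vid, t2.startdte,
       RANK() OVER (PARTITION BY t1.id ORDER BY
         CASE WHEN t2.startdte < TRUNC(SYSDATE) THEN t2.startdte ELSE null END
         NULLS LAST) AS rn
FROM table1 t1
JOIN table2 t2 ON t1.id IN (t2.id, t2.id2)
WHERE t1.id IN (1234, 5678)  -- placeholder ids for brevity
ORDER BY t1.id, rn;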
The outer query is then just picking the top-ranked result for each id.
You may still get duplicates though; if table2 has more than one row for an id with the same startdte, they'll get the same rank, but then you may have had that situation before. You may need to add more fields to the ORDER BY to break ties in a way that makes sense to you.
But this is largely speculation without being able to see where your existing query is actually slow.
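One way to take the guesswork out is to look at the execution plan of the existing query; a sketch using DBMS_XPLAN:

EXPLAIN PLAN FOR
SELECT ID AS Col1,
       (SELECT VID FROM TABLE2 t
         WHERE (a.ID = t.ID OR a.ID = t.ID2)
           AND t.STARTDTE = (SELECT MAX(tt.STARTDTE)
                               FROM TABLE2 tt
                              WHERE (a.ID = tt.ID OR a.ID = tt.ID2)
                                AND tt.STARTDTE < SYSDATE)) AS Col2
  FROM TABLE1 a;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);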
Can you use something along the lines of
Select * from table(cast(select * from tab1 inner join tab2)) inner join tab3
Take into account that what's inside the table(cast()) is something much more complex than a simple select: it involves a block like with test as (select ...) select * ... etc.
I need a simple way to do this, preferably without a temporary table.
Thank you.
Database: Oracle 10g
Later edit:
I have something like
Select a.dummy1, a.dummy2, wm_concat(t2.dummy3)
from table1 a,
     (with str as
        (Select '1,2,3,4' as str from dual)
      Select a.dummy1, t.dummy3
      from table1 a
      inner join
        (Select regexp_substr(str, '[^,]+', 1, rownum) split
         from str
         connect by level <= length(regexp_replace(str, '[^,]+')) + 1) t
      on instr(a.dummy2, t.split) > 0) t2
where a.dummy1 = 'xyz'
group by a.dummy1, a.dummy2
The main idea is that column t2.dummy3 contains CSVs. That's why I have select '1,2,3,4' from dual.
I need to find all rows that contain at least one of the values from str.
Using any kind of loop is out of the question, because I later need to integrate this into a larger query used for a report in SSRS, and the tables involved are quite large (>1 mil rows).
CAST seems completely irrelevant here. You use CAST to change the perceived datatype of an expression. Here you're passing it a result set, not an expression, and you're not saying what datatype to cast to.
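For contrast, a quick sketch of CAST in its intended role:

-- CAST converts a single expression's datatype, e.g. DATE to TIMESTAMP:
SELECT CAST(sysdate AS TIMESTAMP) FROM dual;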
You should be able to simply remove the TABLE and CAST calls and do something like:
SELECT * FROM (SELECT * FROM tab1 INNER JOIN tab2 ON ...) INNER JOIN tab3 ON ...
e.g.
SELECT * FROM
(SELECT d1.dummy FROM dual d1 INNER JOIN dual d2 ON d1.dummy=d2.dummy) d12
INNER JOIN dual d3 ON d12.dummy = d3.dummy
Subquery factoring should work fine here as well.
WITH x AS (SELECT * FROM DUAL)
SELECT * FROM
(SELECT d1.dummy FROM x d1 INNER JOIN x d2 ON d1.dummy=d2.dummy) d12
INNER JOIN dual d3 ON d12.dummy = d3.dummy;
If you're having difficulty getting that kind of construct to work, try adding more detail to your question about specifically what you've tried and what error you're getting.
Yeah... I found the answer... I was just too much of a SQL n00b to see it, as it was right in front of me...
I just took the "with" statement outside of the query and it worked.
Thank you so much for your help; it was your answer that led me to see my mistake :D
Something like:
with str as
  (Select '1,2,3,4' as str from dual)
Select a.dummy1, a.dummy2, wm_concat(t2.dummy3)
from table1 a,
     (Select a.dummy1, t.dummy3
      from table1 a
      inner join
        (Select regexp_substr(str, '[^,]+', 1, rownum) split
         from str
         connect by level <= length(regexp_replace(str, '[^,]+')) + 1) t
      on instr(a.dummy2, t.split) > 0) t2
where a.dummy1 = 'xyz'
group by a.dummy1, a.dummy2