t-sql TVF function slow when using localvariables in "CASE statement" - performance

I have sql like below inside a Multi-Step Table valued function..I have shortened the query a bit to make it easy to understand.
CREATE FUNCTION [dbo].[fn_ph]
(
-- Add the parameters for the function here
#pAsOfDate date,
#pAccountId nvarchar(15)
)
RETURNS #HH TABLE
(
-- Add the column definitions for the TABLE variable here
AsOfDate date,
AccountId nvarchar(15),
LongName varchar(100),
ShortName varchar(100)
)
AS
BEGIN
declare #longOrShortIndicator int
select #longOrShortIndicator = COUNT(distinct LongOrShortIndicator) from table1 a
select a.*,
case
when a.SecType='CASH' and #longOrShortIndicator > 1 then 'C'
else a.LongOrShortIndicator
end as LongOrShortIndicator
from Table1 a
END
The expression when a.SecType='CASH' and #longOrShortIndicator > 1 then 'C' is the bottleneck.
if i remove the #longOrShortIndicator > 1 the query runs fast
If i run the query outside of the function it returns fast...
Why is the local variable slowing down the entire query? Any help would be appreciated.
thanks

The listing does not show what you want to do with #HH, your return table, but in #longOrShortIndicator you are obviously counting rows in table1. If you're doing it many times, e.g. for all rows of your return table, then it is slow indeed, it would be slow even on the death star.

Guessing here: maybe you could try moving the #LongOrShortIndicator variable out of the query and into an IF statement:
declare #longOrShortIndicator int
select #longOrShortIndicator = COUNT(distinct LongOrShortIndicator) from table1 a
IF #longOrShortIndicator > 1 BEGIN
select a.*,
CASE when a.SecType='CASH' then 'C'
else a.LongOrShortIndicator
end as LongOrShortIndicator
from Table1 a
END ELSE BEGIN
select a.*, a.LongOrShortIndicator
from Table1 a
END

Related

Fastest way to check for duplicates while processing an import in T-SQL

Looking for the fastest way to check for duplicates while processing an import in SQL.
For the most part there will be no duplicates. I am using an IF so that ROW OVER PARTITION will only run when there are duplicates. The IF also dumps the errors to a table.
The issue I am having is I can't use temp tables because it cannot be the same table for both IF and ELSE.
I decided to duplicate the merge for the IF and ELSE. Is there a better/faster way to accomplish this?
--#ImportID is pulled from the filename and datetime
DROP TABLE IF EXISTS [#Duplicate];
SELECT [IDColumn],
COUNT(*) AS [Count]
INTO [#Duplicate]
FROM [TableOfImport]
GROUP BY [IDColumn]
HAVING COUNT(*) > 1;
--SELECT * INTO [#Unique] FROM (
IF 0 = (SELECT MAX(count) FROM [#Duplicate])
PRINT 'Unique'
SELECT *
--INTO #UniqueInfo
FROM [TableOfImport]
-- MERGE STATEMENT HERE --
ELSE
PRINT 'Duplicates'
INSERT INTO [#error] ([Import]
, [Row]
, [Error])
SELECT #ImportID
, [IDColumn]
, 'Duplicate'
FROM [#Duplicate];
SELECT *
--INTO #UniqueInfo
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY [IDColumn] ORDER BY [IDColumn] DESC) rn
FROM [TableOfImport]
) x
WHERE x.rn = 1
-- SAME MERGE STATEMENT HERE --
END
--)

Query taking long when i use user defined function with order by in oracle select

I have a function, which will get greatest of three dates from the table.
create or replace FUNCTION fn_max_date_val(
pi_user_id IN number)
RETURN DATE
IS
l_modified_dt DATE;
l_mod1_dt DATE;
l_mod2_dt DATE;
ret_user_id DATE;
BEGIN
SELECT MAX(last_modified_dt)
INTO l_modified_dt
FROM table1
WHERE id = pi_user_id;
-- this table contains a million records
SELECT nvl(MAX(last_modified_ts),sysdate-90)
INTO l_mod1_dt
FROM table2
WHERE table2_id=pi_user_id;
-- this table contains clob data, 800 000 records, the table 3 does not have user_id and has to fetched from table 2, as shown below
SELECT nvl(MAX(last_modified_dt),sysdate-90)
INTO l_mod2_dt
FROM table3
WHERE table2_id IN
(SELECT id FROM table2 WHERE table2_id=pi_user_id
);
execute immediate 'select greatest('''||l_modified_dt||''','''||l_mod1_dt||''','''||l_mod2_dt||''') from dual' into ret_user_id;
RETURN ret_user_id;
EXCEPTION
WHEN OTHERS THEN
return SYSDATE;
END;
this function works perfectly fine and executes within a second.
-- random user_id , just to test the functionality
SELECT fn_max_date_val(100) as max_date FROM DUAL
MAX_DATE
--------
27-02-14
For reference purpose i have used the table name as table1,table2 and table3 but my business case is similar to what i stated below.
I need to get the details of the table1 along with the highest modified date among the three tables.
I did something like this.
SELECT a.id,a.name,a.value,fn_max_date_val(id) as max_date
FROM table1 a where status_id ='Active';
The above query execute perfectly fine and got result in millisecods. But the problem came when i tried to use order by.
SELECT a.id,a.name,a.value,a.status_id,last_modified_dt,fn_max_date_val(id) as max_date
FROM table1 where status_id ='Active' a
order by status_id desc,last_modified_dt desc ;
-- It took almost 300 seconds to complete
I tried using index also all the values of the status_id and last_modified, but no luck. Can this be done in a right way?
How about if your query is like this?
select a.*, fn_max_date_val(id) as max_date
from
(SELECT a.id,a.name,a.value,a.status_id,last_modified_dt
FROM table1 where status_id ='Active' a
order by status_id desc,last_modified_dt desc) a;
What if you don't use the function and do something like this:
SELECT a.id,a.name,a.value,a.status_id,last_modified_dt x.max_date
FROM table1 a
(
select max(max_date) as max_date
from (
SELECT MAX(last_modified_dt) as max_date
FROM table1 t1
WHERE t1.id = a.id
union
SELECT nvl(MAX(last_modified_ts),sysdate-90) as max_date
FROM table2 t2
WHERE t2.table2_id=a.id
...
) y
) x
where a.status_id ='Active'
order by status_id desc,last_modified_dt desc;
Syntax might contain errors, but something like that + the third table in the derived table too.

Insert into select statement on the same table

I'm currently migrating data from legacy system to the current system.
I have this INSERT statement inside a stored procedure.
INSERT INTO TABLE_1
(PRIMARY_ID, SEQUENCE_ID, DESCR)
SELECT LEGACY_ID PRIMARY_ID
, (SELECT COUNT(*) + 1
FROM TABLE_1 T1
WHERE T1.PRIMARY_ID = L1.LEGACY_ID) SEQUENCE_ID
, L1.DESCR
FROM LEGACY_TABLE L1;
However, whenever I have multiple values of LEGACY_ID from LEGACY_TABLE, the query for the SEQUENCE_ID doesn't increment.
Why is this so? I can't seem to find any documentation on how the INSERT INTO SELECT statement works behind the scenes. I am guessing that it selects all the values from the table you are selecting and then inserts them simultaneously after, that's why it doesn't increment the COUNT(*) value?
What other workarounds can I do? I cannot create a SEQUENCE because the SEQUENCE_ID must be based on the number of PRIMARY_ID that are present. They are both primary ids.
Thanks in advance.
Yes, The SELECT will be executed FIRST and only then the INSERT happens.
A Simple PL/SQL block below, will be a simpler approach, though not efficient.
DECLARE
V_SEQUENCE_ID NUMBER;
V_COMMIT_LIMIT:= 20000;
V_ITEM_COUNT := 0;
BEGIN
FOR REC IN (SELECT LEGACY_ID,DESCR FROM LEGACY_TABLE)
LOOP
V_SEQUENCE_ID:= 0;
SELECT COUNT(*)+1 INTO V_SEQUENCE_ID FROM TABLE_1 T1
WHERE T1.PRIMARY_ID = REC.LEGACY_ID
INSERT INTO TABLE_1
(PRIMARY_ID, SEQUENCE_ID, DESCR)
VALUES
(REC.LEGACY_ID,V_SEQUENCE_ID,REC.DESCR);
V_ITEM_COUNT := V_ITEM_COUNT + 1;
IF(V_ITEM_COUNT >= V_COMMIT_LIMIT)
THEN
COMMIT;
V_ITEM_COUNT := 0;
END IF;
END LOOP;
COMMIT;
END;
/
EDIT: Using CTE:
WITH TABLE_1_PRIM_CT(PRIMARY_ID, SEQUENCE_ID) AS
(
SELECT L1.LEGACY_ID,COUNT(*)
FROM LEGACY_TABLE L1
LEFT OUTER JOIN TABLE_1 T1
ON(L1.LEGACY_ID = T1.PRIMARY_ID)
GROUP BY L1.LEGACY_ID
)
INSERT INTO TABLE_1
(SELECT L1.LEGACY_ID,
CTE.SEQUENCE_ID+ (ROW_NUMBER() OVER (PARTITION BY L1.LEGACY_ID ORDER BY null)),
L1.DESCR
FROM TABLE_1_PRIM_CT CTE, LEGACY_TABLE L1
WHERE L1.LEGACY_ID = CTE.PRIMARY_ID);
PS: With your Millions of Rows, this is going to create a temp table
of same size, during execution. Do Explain Plan before actual execution.

Need to write a procedure to fetch given rownums

I need to write one procedure to pick the record for given rows
for example
procedure test1
(
start_ind number,
end_ind number,
p_out ref cursor
)
begin
opecn p_out for
select * from test where rownum between start_ind and end_ind;
end;
when we pass start_ind 1 and end_ind 10 its working.But when we change start_ind to 5
then query looks like
select * from test where rownum between 5 and 10;
and its fails and not shows the output.
Please assist how to fix this issue.Thanks!
The rownum is assigned and then the where condition evaluated. Since you'll never have a rownum 1-4 in your result set, you never get to rownum 5. You need something like this:
SELECT * FROM (
SELECT rownum AS rn, t.*
FROM (
SELECT t.*
FROM test t
ORDER BY t.whatever
)
WHERE ROWNUM <= 10
)
WHERE rn >= 5
You'll also want an order by clause in the inner select, or which rows you get will be undefined.
This article by Tom Kyte pretty much tells you everything you need to know: http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
SELECT *
from (SELECT rownum AS rn, t.*
FROM MyTable t
WHERE ROWNUM <= 10
ORDER BY t.NOT-Whatever
-- (its highly important to use primary or unique key of MyTable)
WHERE rn > 5
As a hint, :
Typically we use store-procedures for data validation, access control, extensive or complex processing that requires execution of several SQL statements. Stored procedures may return result sets, i.e. the results of a SELECT statement. Such result sets can be processed using cursors, by other stored procedures, by associating a result set locator, or by applications
I think you are going to use the ruw-number to fetch paged queries.
Try to create a generic select query based on the idea mentioned above.
Two possibilities:
1) Your table is an index-organized table. So its data is sorted. You would select those first rows you want to avoid and based on that get the next rows you are looking for:
create or replace procedure get_records
(
vi_start_ind integer,
vi_end_ind integer,
vo_cursor out sys_refcursor
) as
begin
open vo_cursor for
select *
from test
where rownum <= vi_end_ind - vi_start_ind + 1
and rowid not in
(
select rowid
from test
where rownum < vi_start_ind
)
;
end;
2) Your table is not index-organized, which is normally the case. Then its records are not sorted. To get records m to n, you would have to tell the system what order you have in mind:
create or replace procedure get_records
(
vi_start_ind number,
vi_end_ind number,
vo_cursor out sys_refcursor
) as
begin
open vo_cursor for
select *
from test
where rownum <= vi_end_ind - vi_start_ind + 1
and rowid not in
(
select rowid from
(
select rowid
from test
order by somthing
)
where rownum < vi_start_ind
)
order by something
;
end;
All this said, think it over what you want to achieve. If you want to use this procedure to read your table block for block, keep in mind that it will read the same data again and again. To know what rows 1,000,001 to 1,000,100 are, the dbms must read through one million rows first.

Oracle: how to INSERT if a row doesn't exist

What is the easiest way to INSERT a row if it doesn't exist, in PL/SQL (oracle)?
I want something like:
IF NOT EXISTS (SELECT * FROM table WHERE name = 'jonny') THEN
INSERT INTO table VALUES ("jonny", null);
END IF;
But it's not working.
Note: this table has 2 fields, say, name and age. But only name is PK.
INSERT INTO table
SELECT 'jonny', NULL
FROM dual -- Not Oracle? No need for dual, drop that line
WHERE NOT EXISTS (SELECT NULL -- canonical way, but you can select
-- anything as EXISTS only checks existence
FROM table
WHERE name = 'jonny'
)
Assuming you are on 10g, you can also use the MERGE statement. This allows you to insert the row if it doesn't exist and ignore the row if it does exist. People tend to think of MERGE when they want to do an "upsert" (INSERT if the row doesn't exist and UPDATE if the row does exist) but the UPDATE part is optional now so it can also be used here.
SQL> create table foo (
2 name varchar2(10) primary key,
3 age number
4 );
Table created.
SQL> ed
Wrote file afiedt.buf
1 merge into foo a
2 using (select 'johnny' name, null age from dual) b
3 on (a.name = b.name)
4 when not matched then
5 insert( name, age)
6* values( b.name, b.age)
SQL> /
1 row merged.
SQL> /
0 rows merged.
SQL> select * from foo;
NAME AGE
---------- ----------
johnny
If name is a PK, then just insert and catch the error. The reason to do this rather than any check is that it will work even with multiple clients inserting at the same time. If you check and then insert, you have to hold a lock during that time, or expect the error anyway.
The code for this would be something like
BEGIN
INSERT INTO table( name, age )
VALUES( 'johnny', null );
EXCEPTION
WHEN dup_val_on_index
THEN
NULL; -- Intentionally ignore duplicates
END;
I found the examples a bit tricky to follow for the situation where you want to ensure a row exists in the destination table (especially when you have two columns as the primary key), but the primary key might not exist there at all so there's nothing to select.
This is what worked for me:
MERGE INTO table1 D
USING (
-- These are the row(s) you want to insert.
SELECT
'val1' AS FIELD_A,
'val2' AS FIELD_B
FROM DUAL
) S ON (
-- This is the criteria to find the above row(s) in the
-- destination table. S refers to the rows in the SELECT
-- statement above, D refers to the destination table.
D.FIELD_A = S.FIELD_A
AND D.FIELD_B = S.FIELD_B
)
-- This is the INSERT statement to run for each row that
-- doesn't exist in the destination table.
WHEN NOT MATCHED THEN INSERT (
FIELD_A,
FIELD_B,
FIELD_C
) VALUES (
S.FIELD_A,
S.FIELD_B,
'val3'
)
The key points are:
The SELECT statement inside the USING block must always return rows. If there are no rows returned from this query, no rows will be inserted or updated. Here I select from DUAL so there will always be exactly one row.
The ON condition is what sets the criteria for matching rows. If ON does not have a match then the INSERT statement is run.
You can also add a WHEN MATCHED THEN UPDATE clause if you want more control over the updates too.
Using parts of #benoit answer, I will use this:
DECLARE
varTmp NUMBER:=0;
BEGIN
-- checks
SELECT nvl((SELECT 1 FROM table WHERE name = 'john'), 0) INTO varTmp FROM dual;
-- insert
IF (varTmp = 1) THEN
INSERT INTO table (john, null)
END IF;
END;
Sorry for I don't use any full given answer, but I need IF check because my code is much more complex than this example table with name and age fields. I need a very clear code. Well thanks, I learned a lot! I'll accept #benoit answer.
In addition to the perfect and valid answers given so far, there is also the ignore_row_on_dupkey_index hint you might want to use:
create table tq84_a (
name varchar2 (20) primary key,
age number
);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Johnny', 77);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Pete' , 28);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Sue' , 35);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Johnny', null);
select * from tq84_a;
The hint is described on Tahiti.
you can use this syntax:
INSERT INTO table_name ( name, age )
select 'jonny', 18 from dual
where not exists(select 1 from table_name where name = 'jonny');
if its open an pop for asking as "enter substitution variable" then use this before the above queries:
set define off;
INSERT INTO table_name ( name, age )
select 'jonny', 18 from dual
where not exists(select 1 from table_name where name = 'jonny');
You should use Merge:
For example:
MERGE INTO employees e
USING (SELECT * FROM hr_records WHERE start_date > ADD_MONTHS(SYSDATE, -1)) h
ON (e.id = h.emp_id)
WHEN MATCHED THEN
UPDATE SET e.address = h.address
WHEN NOT MATCHED THEN
INSERT (id, address)
VALUES (h.emp_id, h.address);
or
MERGE INTO employees e
USING hr_records h
ON (e.id = h.emp_id)
WHEN MATCHED THEN
UPDATE SET e.address = h.address
WHEN NOT MATCHED THEN
INSERT (id, address)
VALUES (h.emp_id, h.address);
https://oracle-base.com/articles/9i/merge-statement
CTE and only CTE :-)
just throw out extra stuff. Here is almost complete and verbose form for all cases of life. And you can use any concise form.
INSERT INTO reports r
(r.id, r.name, r.key, r.param)
--
-- Invoke this script from "WITH" to the end (";")
-- to debug and see prepared values.
WITH
-- Some new data to add.
newData AS(
SELECT 'Name 1' name, 'key_new_1' key FROM DUAL
UNION SELECT 'Name 2' NAME, 'key_new_2' key FROM DUAL
UNION SELECT 'Name 3' NAME, 'key_new_3' key FROM DUAL
),
-- Any single row for copying with each new row from "newData",
-- if you will of course.
copyData AS(
SELECT r.*
FROM reports r
WHERE r.key = 'key_existing'
-- ! Prevent more than one row to return.
AND FALSE -- do something here for than!
),
-- Last used ID from the "reports" table (it depends on your case).
-- (not going to work with concurrent transactions)
maxId AS (SELECT MAX(id) AS id FROM reports),
--
-- Some construction of all data for insertion.
SELECT maxId.id + ROWNUM, newData.name, newData.key, copyData.param
FROM copyData
-- matrix multiplication :)
-- (or a recursion if you're imperative coder)
CROSS JOIN newData
CROSS JOIN maxId
--
-- Let's prevent re-insertion.
WHERE NOT EXISTS (
SELECT 1 FROM reports rs
WHERE rs.name IN(
SELECT name FROM newData
));
I call it "IF NOT EXISTS" on steroids. So, this helps me and I mostly do so.

Resources