I have too many SELECT statements combined into a single INSERT (maybe hundreds of them), and the system is performing badly.
I will explain in general terms what is happening and what I'm looking for:
Considering the following two pseudo-codes in Oracle PL/SQL, which of them would give the best performance?
Option A:
INSERT INTO MyTable
WITH Fields AS (
  SELECT Field1, Field2, ..., FieldN FROM TableA JOIN TableW .... WHERE <condition1>
  UNION ALL
  SELECT Field1, Field2, ..., FieldN FROM TableB JOIN TableX .... WHERE <condition2>
  UNION ALL
  SELECT Field1, Field2, ..., FieldN FROM TableC JOIN TableB .... WHERE <condition3>
  ....
  UNION ALL
  ....
  SELECT Field1, Field2, ..., FieldN FROM TableZZZ JOIN TableB .... WHERE <conditionN>
)
SELECT * FROM Fields;
Option B:
BEGIN
  INSERT INTO MyTable SELECT Field1, Field2, ..., FieldN FROM TableA JOIN TableZ .... WHERE <condition1>;
  INSERT INTO MyTable SELECT Field1, Field2, ..., FieldN FROM TableB JOIN TableW .... WHERE <condition2>;
  INSERT INTO MyTable SELECT Field1, Field2, ..., FieldN FROM TableC JOIN TableH .... WHERE <condition3>;
  ...
  INSERT INTO MyTable SELECT Field1, Field2, ..., FieldN FROM TableZZZZ JOIN TableX .... WHERE <conditionN>;
END;
I didn't put the real table names, but I would like to know: if I change the current option A to option B, would it give better performance? In other words, is it a good idea to replace UNION ALL with many separate INSERT statements in this case?
Context Switches and Performance
Almost every program PL/SQL developers write includes both PL/SQL and SQL statements. PL/SQL statements are run by the PL/SQL statement executor;
SQL statements are run by the SQL statement executor. When the PL/SQL runtime engine encounters a SQL statement, it stops and passes the SQL statement over to the SQL engine. The SQL engine executes the SQL statement and returns information back to the PL/SQL engine. This transfer of control is called a context switch, and each one of these switches incurs overhead that slows down the overall performance of your programs.
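The cost of per-statement overhead can be demonstrated outside Oracle with an analogy: a minimal Python/sqlite3 sketch (table name `MyTable` taken from the question; the numbers are illustrative). One batched call does the same work as a loop of single-row statements with far fewer crossings between the host program and the database engine, which is the same idea as cutting PL/SQL-to-SQL context switches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (field1 INTEGER)")
rows = [(i,) for i in range(1000)]

# One statement per row: one round trip to the engine per value.
for r in rows:
    conn.execute("INSERT INTO MyTable (field1) VALUES (?)", r)

conn.execute("DELETE FROM MyTable")

# One batched call: same rows, one crossing into the engine per batch,
# analogous to FORALL reducing PL/SQL-to-SQL context switches.
conn.executemany("INSERT INTO MyTable (field1) VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
print(count)  # 1000
```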
So, use this third approach instead:
create view MyView as
  select Field1, Field2, ..., FieldN
  from TableA join TableB .... where <condition1>;

declare
  p_array_size pls_integer := 100;
  type array is table of MyView%rowtype;
  l_data array;
  cursor c is select * from MyView;
begin
  open c;
  loop
    fetch c bulk collect into l_data limit p_array_size;  -- one context switch per batch
    forall i in 1 .. l_data.count                         -- bulk-bind the whole batch
      insert into MyTable values l_data(i);
    exit when c%notfound;
  end loop;
  close c;
end;
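The same fetch-in-batches, insert-in-batches shape can be sketched in Python with sqlite3 (table names `MyView` and `MyTable` reused from the PL/SQL above; `MyView` is a plain table here for simplicity, and `fetchmany`/`executemany` stand in for BULK COLLECT ... LIMIT and FORALL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyView (field1 INTEGER, field2 TEXT);
    CREATE TABLE MyTable (field1 INTEGER, field2 TEXT);
""")
conn.executemany("INSERT INTO MyView VALUES (?, ?)",
                 [(i, "row%d" % i) for i in range(250)])

ARRAY_SIZE = 100  # plays the role of p_array_size / the LIMIT clause
cur = conn.execute("SELECT field1, field2 FROM MyView")
while True:
    batch = cur.fetchmany(ARRAY_SIZE)          # ~ BULK COLLECT ... LIMIT
    if not batch:
        break
    conn.executemany("INSERT INTO MyTable VALUES (?, ?)", batch)  # ~ FORALL

inserted = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
print(inserted)  # 250
```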
Unless your queries are very demanding in terms of memory on your database server, option B is not going to improve the performance of the big query.
To verify this, ask your DBA to check what happens in your database server's SGA while you run the query. If the memory becomes jammed, then it is worth trying to implement option B.
When I say "memory jam", I mean the whole SGA filling up, forcing the server to swap. If you do the inserts in sequence, the SGA can be reused between the inserts.
Related
I have written a stored procedure with 10 DML statements run in series, where each DML statement takes around 3 minutes to execute. The whole stored procedure runs for about 29 minutes in production (each DML statement dumps millions of records).
I need the first two DML statements to run in series; the remaining 8 can run in parallel, since they have no dependencies.
I need advice on achieving this without using dbms_job or dbms_scheduler.
begin
insert all
into table1 values ()
into table2 values ()
select ...;
insert into table3 select ... from table1 join table2...;
insert into table4 select ... from table2 join tableA...;
insert into table5 select ... from table1 join tableB...;
insert into table6 select ... from table1 join tableC...;
insert into table7 select ... from table1 join tableD...;
insert into table8 select ... from table1 join tableE...;
insert into table9 select ... from table1 join tableF...;
insert into table10 select ... from table1 join tableG...;
end;
My requirement is to produce a report from a complex query using an IF condition.
If flag = 0 I must perform one set of SELECT statements; if flag = 1 I must perform another set of SELECT statements against another table.
Is there any way I can achieve this in a single query, rather than writing a function or stored procedure?
Eg:
In SQL I do this
if flag = 0
select var1, vari2 from table1
else
select var1, vari2, var3, vari4 from table2
Is this possible?
There is no if in SQL - there is the case expression, but it is not quite the same thing.
If you have two tables, t1 and t2, and flag is in a scalar table t3 ("scalar" means exactly one column, flag, and with exactly one row, with the value either 0 or 1), you can do what you want but only if t1 and t2 have the same number of columns, with the same data types (and, although not required by syntax, this would only make sense if the columns in t1 and t2 have the same business meaning). Or, at least, if you plan to select only some columns from t1 or from t2, the columns you want to select from either table should be equal in number, have the same data type, and preferably the same business meaning.
For example: t1 and t2 may be employee tables, perhaps for two companies that just merged. If they both include first_name, last_name, date_of_birth and you just want to select these three columns from either t1 or t2 based on the flag value (even if t1 has other columns, not present in t2), you can do it. Same if t1 or t2 or both is not a single table, but the result of a more complicated query. The principle is the same.
The way you can do it is with a UNION ALL, like this:
select t1.col1, t1.col2, ...
from t1 cross join t3
where t3.flag = 0
UNION ALL
select t2.col1, t2.col2, ...
from t2 cross join t3
where t3.flag = 1
;
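The flag-switched UNION ALL can be verified with a minimal sketch in Python using sqlite3 (table and column names `t1`, `t2`, `t3`, `flag` from the answer; the sample values are made up). Only the branch whose flag test matches contributes rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (col1 TEXT, col2 TEXT);
    CREATE TABLE t2 (col1 TEXT, col2 TEXT);
    CREATE TABLE t3 (flag INTEGER);
    INSERT INTO t1 VALUES ('a1', 'b1');
    INSERT INTO t2 VALUES ('a2', 'b2');
    INSERT INTO t3 VALUES (0);          -- flip to 1 to switch branches
""")

rows = conn.execute("""
    SELECT t1.col1, t1.col2 FROM t1 CROSS JOIN t3 WHERE t3.flag = 0
    UNION ALL
    SELECT t2.col1, t2.col2 FROM t2 CROSS JOIN t3 WHERE t3.flag = 1
""").fetchall()
print(rows)  # [('a1', 'b1')] -- only the flag = 0 branch produces rows
```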
I need to convert a query from Oracle SQL to Postgres.
select count(*) from table1 group by column1 having max(rownum) = 4
If I replace "rownum" with "row_number() over()", I have an error message: "window functions are not allowed in HAVING".
Could you help me to get the same result in Postgres, as in Oracle?
The query below will do what your Oracle query is doing.
select count(*) from
(select column1, row_number() over () as x from table1) as t
group by column1 having max(t.x) = 4;
However
Neither Oracle nor Postgres will guarantee the order in which records are read unless you specify an ORDER BY clause. So running the query multiple times may give inconsistent results, depending on how the database decides to process the query. Certainly in Postgres, any updates will change the underlying row order.
In the example below I've added an extra column, seq, which is used to provide a consistent sort.
CREATE TABLE table1 (column1 int, seq int);
insert into table1 values (0,1),(0,2),(0,3),(1,4),(0,5),(1,6);
And a revised query which forces the order to be consistent:
select count(*) from
(select column1, row_number() over (order by seq) as x from table1) as t
group by column1 having max(t.x) = 6;
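Using the answer's own sample data, the revised query can be checked with sqlite3 from Python (sqlite has window functions from version 3.25 onward, which ships with modern Python builds). With row numbers ordered by seq, column1 = 1 owns row numbers 4 and 6, so only that group satisfies having max(x) = 6:

```python
import sqlite3  # window functions require SQLite >= 3.25

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 INT, seq INT);
    INSERT INTO table1 VALUES (0,1),(0,2),(0,3),(1,4),(0,5),(1,6);
""")
result = conn.execute("""
    SELECT count(*) FROM
      (SELECT column1, row_number() OVER (ORDER BY seq) AS x FROM table1) AS t
    GROUP BY column1 HAVING max(t.x) = 6
""").fetchall()
print(result)  # [(2,)] -- the column1 = 1 group has two rows and max(x) = 6
```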
In my Stored Procedure, I have two queries:
Here rec_count is an OUT parameter and cursor_name is an IN OUT parameter.
open cursor_name for
select <col list> from <table1 join table2 inner join...> on <join conditions> where <conditions>;
select count(*) into rec_count from <table1 join table2 inner join...> on <join conditions> where <conditions>;
Is there a way I can do the select and count together as I am providing the same join conditions and where clause again?
Will this affect performance or SQL Optimizer will optimize these two queries?
You could do the analytic count over the entire data set like this -
OPEN cursor_name for
SELECT <col_list> ,
count(*) over () as cnt
from <tables> <join conditions> <where clauses>;
That way every row of the cursor carries a column with the total row count.
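The effect of count(*) over () can be sketched with sqlite3 in Python (table `orders` and its values are illustrative, not from the question; sqlite needs version 3.25+ for window functions). Each row repeats the total count of the result set:

```python
import sqlite3  # window functions require SQLite >= 3.25

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INT);
    INSERT INTO orders VALUES (10),(20),(30);
""")
rows = conn.execute(
    "SELECT id, count(*) OVER () AS cnt FROM orders ORDER BY id"
).fetchall()
print(rows)  # [(10, 3), (20, 3), (30, 3)] -- every row carries the total
```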
There are bigger issues here than the one you are thinking of.
What if another session commits a transaction in between your opening the cursor and running the count? Obviously, the count of rows in the cursor will then not match your select count(*) query.
Oracle doesn't know the count of rows until the last row is fetched.
If you want an exact count of rows, I would suggest the analytic count(*) over () in your existing cursor query.
I have 2 tables, each with about 230,000 records. When I run this query:
select count(*)
from table1
where field1 in (select field2 from table2)
It takes about 0.2 second.
If I use the same query, just changing IN to NOT IN:
select count(*)
from table1
where field1 NOT in (select field2 from table2)
It never ends.
Why ?
It's the difference between a scan and a seek.
When you ask for "IN" you ask for specifically these values.
This means the database engine can use indexes to seek to the correct data pages.
When you ask for "NOT IN" you ask for all values except these values.
This means the database engine has to scan the entirety of the table/indexes to find all values.
The other factor is the amount of data. The IN query likely involves much less data and therefore much less I/O than the NOT IN.
Compare it to a phone book: if you want only people named Smith, you can jump straight to the Smith section and return it. You don't have to read any pages before or after the Smith section.
If you ask for everyone who is not named Smith, you have to read all pages before Smith and all pages after Smith.
This illustrates both the seek/scan aspect and the data amount aspect.
It's better to use NOT EXISTS here; NOT IN can force a row-by-row filter operation, which takes too long.
In the worst case, both queries can be resolved using two full table scans plus a hash join (semi or anti). We're talking a few seconds for 230,000 rows unless something exceptional is going on in your case.
My guess is that either field1 or field2 is nullable. When you use a NOT IN construct with a nullable column, Oracle has to perform an expensive filter operation, basically executing the inner query once for each row in the outer table. That is 230,000 full table scans...
You can verify this by looking at the execution plan. It would look something like:
SELECT
FILTER (NOT EXISTS SELECT 0...)
TABLE ACCESS FULL ...
TABLE ACCESS FULL ...
If there are no NULL values in either column (field1, field2) you can help Oracle with this piece of information so another more efficient execution strategy can be used:
select count(*)
from table1
where field1 is not null
and field1 not in (select field2 from table2 where field2 is not null)
This will generate a plan that looks something like:
SELECT
HASH JOIN ANTI
TABLE ACCESS FULL ...
TABLE ACCESS FULL ...
...or you can change the construct to NOT EXISTS (will generate the same plan as above):
select count(*)
from table1
where not exists(
select 'x'
from table2
where table2.field2 = table1.field1
);
Please note that changing from NOT IN to NOT EXISTS may change the result of the query. Have a look at the following example and try the two different where-clauses to see the difference:
with table1 as(
select 1 as field1 from dual union all
select null as field1 from dual union all
select 2 as field1 from dual
)
,table2 as(
select 1 as field2 from dual union all
select null as field2 from dual union all
select 3 as field2 from dual
)
select *
from table1
--where field1 not in(select field2 from table2)
where not exists(select 'x' from table2 where field1 = field2)
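The difference the example above demonstrates can be reproduced with sqlite3 in Python (same table and column names as the example; the NULL semantics involved are standard SQL, not Oracle-specific). A single NULL in the NOT IN subquery makes every row's test unknown, so NOT IN returns nothing, while NOT EXISTS still returns the genuinely unmatched rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 INT);
    CREATE TABLE table2 (field2 INT);
    INSERT INTO table1 VALUES (1),(NULL),(2);
    INSERT INTO table2 VALUES (1),(NULL),(3);
""")

not_in = conn.execute(
    "SELECT field1 FROM table1 "
    "WHERE field1 NOT IN (SELECT field2 FROM table2)").fetchall()
not_exists = conn.execute(
    "SELECT field1 FROM table1 WHERE NOT EXISTS "
    "(SELECT 1 FROM table2 WHERE table2.field2 = table1.field1)").fetchall()

print(not_in)      # [] -- the NULL in table2 makes every NOT IN test unknown
print(not_exists)  # [(None,), (2,)] -- NULL never equals anything, so both rows survive
```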
Try:
SELECT count(*)
FROM table1 t1
LEFT JOIN table2 t2 ON t1.field1 = t2.field2
WHERE t2.primary_key IS NULL
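A quick sanity check of the anti-join form, again with sqlite3 in Python and the NULL-laden sample data from the earlier example. Since table2 here has no primary key column, the join key field2 stands in for the IS NULL test; this is safe only because matched rows always carry a non-NULL field2 (the join condition guarantees it). The count matches the NOT EXISTS semantics, not NOT IN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 INT);
    CREATE TABLE table2 (field2 INT);
    INSERT INTO table1 VALUES (1),(NULL),(2);
    INSERT INTO table2 VALUES (1),(NULL),(3);
""")
cnt = conn.execute("""
    SELECT count(*)
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.field1 = t2.field2
    WHERE t2.field2 IS NULL
""").fetchone()[0]
print(cnt)  # 2 -- the NULL row and the 2 row of table1 have no match
```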