How can you check query performance with small data set - oracle

All the Oracles out here,
I have an Oracle PL/SQL procedure, but only a very small amount of data for its query to run against. I suspect that when the data gets large, the query might start performing badly. Are there ways to check performance and take corrective measures before the data builds up? If I wait for the data to build up, it might be too late.
Do you have any general and practical suggestions for me? Searching the internet did not turn up anything convincing.

It is better to build yourself some test data to get an idea of how things will perform. It's easy to get started, e.g.
create table MY_TEST as select * from all_objects;
gives you approx 50,000 rows typically. You can scale that easily with
create table MY_TEST as select a.* from all_objects a ,
( select 1 from dual connect by level <= 10);
Now you have 500,000 rows
create table MY_TEST as select a.* from all_objects a ,
( select 1 from dual connect by level <= 10000);
Now you have 500,000,000 rows!
If you want unique values per row, then add rownum, e.g.
create table MY_TEST as select rownum r, a.* from all_objects a ,
( select 1 from dual connect by level <= 10000);
If you want (say) 100,000 distinct values in a column, then apply TRUNC or MOD to rownum. You can also use DBMS_RANDOM to generate random numbers, strings, etc.
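A minimal sketch of those two ideas together. The table name MY_TEST_VALUES and the column names are made up for illustration; the row count and value ranges are arbitrary.

```sql
-- Hypothetical table: one million rows generated from DUAL.
create table MY_TEST_VALUES as
select rownum                               as id,        -- unique per row
       mod(rownum, 100000)                  as bucket,    -- 100,000 distinct values (0..99999)
       trunc(dbms_random.value(1, 1001))    as rand_num,  -- random integer from 1 to 1000
       dbms_random.string('U', 10)          as rand_str   -- 10 random uppercase letters
from dual
connect by level <= 1000000;
```

MOD gives an even spread across the distinct values, while DBMS_RANDOM gives a random one, so pick whichever is closer to how the real data is expected to look.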
Also check out Morten's test data generator
https://github.com/morten-egan/testdata_ninja
for some domain-specific data, and also the Oracle sample schemas on GitHub, which can be scaled using the techniques above.
https://github.com/oracle/db-sample-schemas
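Once a scaled-up table exists, one way to see how a query is likely to behave is to look at its execution plan. A rough sketch, assuming the MY_TEST table built above; the query itself is only a stand-in for whatever your procedure actually runs.

```sql
-- Refresh optimizer statistics so the plan reflects the scaled-up table.
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'MY_TEST');
end;
/

-- Ask the optimizer for its plan without running the statement.
-- (Stand-in query; OWNER and OBJECT_TYPE come from ALL_OBJECTS.)
explain plan for
select owner, count(*)
from MY_TEST
where object_type = 'TABLE'
group by owner;

-- Display the plan that was just generated.
select * from table(dbms_xplan.display);
```

Comparing the plan and the elapsed time at a few different table sizes gives an early hint of where full scans or large sorts will start to hurt.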

Related

how to use 50 thousand Ids at where or join clause in oracle pl/sql for a select query?

I have a list of 50 thousand receipt IDs (hard-coded values). I want to apply these 50 thousand IDs in a WHERE condition or a join. I have used the WITH clause below to build a temporary result set holding those 50 thousand IDs, and then used it in a join for filtering.
with temp_receiptIds(receiptId)
as
(
select 'M0000001' from dual
union
select 'M0000002' from dual
union
select 'M0000003' from dual
union
select 'M0000004' from dual
..
..
...
union
select 'M0049999' from dual
union
select 'M0050000' from dual
)
select sal.receiptId, prd.product_name, prd.product_price, sal.sales_date, sal.seller_name
from product prd
join sales sal on prd.product_id=sal.product_id
join temp_receiptIds tmp on tmp.receiptId=sal.receiptId
Whenever I run the above query to extract the data requested by the business people, it takes about 8 minutes to fetch the result on the production server.
Is my approach above correct? Is there a simpler approach that gives the best performance on the production server?
Please note that the production database is used by customers every second. Since the production database is very busy, can I run this query against it directly, or will it slow down the customer-facing website that calls this database every second? Correct answers would be greatly appreciated! Thanks
Why wouldn't you store those receiptIDs into a table?
create table receiptids as
with temp_receiptIds(receiptId)
as
(
select 'M0000001' from dual
union all --> "union ALL" instead of "union"
...
)
select * from temp_receiptids;
Index it:
create index i1recid on receiptids (receiptId);
See how that query now behaves.
If, for some reason, you can't do that, see whether UNION ALL within the CTE does any good. UNION has to remove duplicates across all the branches, which UNION ALL skips; for 50,000 rows, that could make a difference.
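With the IDs stored in an indexed table, the original join can read from that table directly. A sketch that reuses the table and column names from the question and from the CREATE TABLE above; the actual plan and timing will depend on your data and existing indexes.

```sql
-- Same result columns as the original query, but filtering through
-- the stored RECEIPTIDS table instead of a 50,000-branch CTE.
select sal.receiptId,
       prd.product_name,
       prd.product_price,
       sal.sales_date,
       sal.seller_name
from receiptids r
join sales   sal on sal.receiptId  = r.receiptId
join product prd on prd.product_id = sal.product_id;
```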

Stored procedure for Select and Inner select query in Oracle

I have a query like this:
select PROMOTER_DSMID,
PROMOTER_NAME,
PROMOTER_MSISDN,
RETAILER_DSMID,
RETAILER_MSISDN,
RETAILER_NAME ,
ATTENDANCE_FLAG,
ATTENDANCE_DATE
from PROMO_ATTENDANCE_DETAILS
where PROMOTER_DSMID not in
(SELECT PROMOTER_DSMID
FROM PROMO_ATTENDANCE_DETAILS
WHERE PROMOTERS_ASM_DSMID='ASM123'
AND ATTENDANCE_FLAG='TRUE'
AND TRUNC(ATTENDANCE_DATE) ='16-07-17')
and PROMOTERS_ASM_DSMID='ASM123'
AND ATTENDANCE_FLAG='FALSE'
AND TRUNC(ATTENDANCE_DATE) ='16-07-17';
This query takes too much time when I run it in the PROD database because of the large number of records.
I need to write a procedure for this, but I cannot work out the correct approach. Could somebody please guide me?
"was thinking to write a proc in which inner select statement can put the data in some temporary table and then from that temporary table I can run the outer select statement"
No need for that. Use a WITH clause to select the data once and use it twice.
with cte as (
select PROMOTER_DSMID,
PROMOTER_NAME,
PROMOTER_MSISDN,
RETAILER_DSMID,
RETAILER_MSISDN,
RETAILER_NAME ,
ATTENDANCE_FLAG,
ATTENDANCE_DATE
from PROMO_ATTENDANCE_DETAILS
where PROMOTERS_ASM_DSMID='ASM123'
AND TRUNC(ATTENDANCE_DATE) ='16-07-17'
)
select *
from cte
where ATTENDANCE_FLAG='FALSE'
AND PROMOTER_DSMID not in
(SELECT PROMOTER_DSMID
FROM cte
where ATTENDANCE_FLAG='TRUE')
;
This will perform better than a temporary table, which involves a lot of disk I/O.
There are other possible performance improvements, depending on the usual tuning considerations: data volume and skew, indexes, etc.
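One such improvement, offered only as a sketch: TRUNC(ATTENDANCE_DATE) prevents a plain index on ATTENDANCE_DATE from being used, whereas a date range does not. This assumes '16-07-17' means 16 July 2017 (DD-MM-YY) and that an index on ATTENDANCE_DATE exists or could be added; neither is stated in the question.

```sql
-- Assumption: '16-07-17' is 16 July 2017; adjust the literal if it is not.
with cte as (
  select PROMOTER_DSMID,
         PROMOTER_NAME,
         PROMOTER_MSISDN,
         RETAILER_DSMID,
         RETAILER_MSISDN,
         RETAILER_NAME,
         ATTENDANCE_FLAG,
         ATTENDANCE_DATE
  from PROMO_ATTENDANCE_DETAILS
  where PROMOTERS_ASM_DSMID = 'ASM123'
    -- Half-open range covering the whole day, with no function on the column.
    and ATTENDANCE_DATE >= DATE '2017-07-16'
    and ATTENDANCE_DATE <  DATE '2017-07-16' + 1
)
select *
from cte
where ATTENDANCE_FLAG = 'FALSE'
  and PROMOTER_DSMID not in
      (select PROMOTER_DSMID
       from cte
       where ATTENDANCE_FLAG = 'TRUE');
```

Using an explicit DATE literal also avoids depending on the session's NLS_DATE_FORMAT for the string-to-date conversion.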

Internal Functionality of DUAL table?

Will the local database be disturbed if we create a DUAL table?
Kindly advise.
create table DUAL
(
x varchar2(1)
);
No, you should not create your own DUAL table. DUAL is owned by SYS, and SYS owns the data dictionary, so it is not yours to create or modify.
See the wiki
The DUAL table is a special one-row table present by default in all
Oracle database installations. It is suitable for use in selecting a
pseudocolumn such as SYSDATE or USER. The table has a single
VARCHAR2(1) column called DUMMY that has a value of 'X'.
Even if you do create your own DUAL table, it will cause problems for you, because every time the name is used Oracle has to work out whether you mean your table or the SYS one, and you would need to qualify the name with the schema to be unambiguous. The Oracle optimizer knows exactly what DUAL is and what it should do with it, and it optimizes based on that.
SQL Reference:
DUAL is a table automatically created by Oracle Database along with
the data dictionary. DUAL is in the schema of the user SYS but is
accessible by the name DUAL to all users. It has one column, DUMMY,
defined to be VARCHAR2(1), and contains one row with a value X.
Selecting from the DUAL table is useful for computing a constant
expression with the SELECT statement. Because DUAL has only one row,
the constant is returned only once. Alternatively, you can select a
constant, pseudocolumn, or expression from any table, but the value
will be returned as many times as there are rows in the table. Refer
to "About SQL Functions" for many examples of selecting a constant
value from DUAL.
Beginning with Oracle Database 10g Release 1, logical I/O is not
performed on the DUAL table when computing an expression that does not
include the DUMMY column. This optimization is listed as FAST DUAL in
the execution plan. If you SELECT the DUMMY column from DUAL, then
this optimization does not take place and logical I/O occurs.
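The FAST DUAL behaviour described in the quoted reference can be checked from the execution plan. A small sketch, assuming a 10g or later database and a session that can use EXPLAIN PLAN and DBMS_XPLAN:

```sql
-- Expression that does not touch DUMMY: the plan should list FAST DUAL.
explain plan for select sysdate from dual;
select * from table(dbms_xplan.display);

-- Selecting DUMMY itself: the plan should show an ordinary table access instead.
explain plan for select dummy from dual;
select * from table(dbms_xplan.display);
```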
Will the local database be disturbed if we create a DUAL table?
Yes, of course weird things can and will happen. DUAL is owned by SYS. SYS owns the data dictionary, therefore DUAL is part of the data dictionary. You are not to modify the data dictionary via SQL ever.
And the first question is "How will you guarantee only one row in your own DUAL table"?
This goes back to the original article Self-Managing PL/SQL by Steven Feuerstein, where he explains "Use Your Own DUAL Table". But that was back when the DUAL table was prone to such things.
However, in recent releases the DUAL table structure has been made robust, and it can never hold more than a single row. Here is a proof:
SQL> conn sys@pdborcl as sysdba
Enter password:
Connected.
SQL> insert into dual select * from dual;
1 row created.
SQL> select * from dual;
D
-
X
I know some would argue that we can enforce a single row in our own DUAL table using a trigger or ROWNUM = 1; however, you will soon realize the cons. It is simply not necessary from 10g onwards, as the DUAL table is now a memory structure and you cannot add a row to it, as demonstrated above.
Imagine a situation where you have created your own DUAL table, and you are using the call to DUAL table in your PL/SQL code to get the USER, SYSDATE, SYSTIMESTAMP etc.
This is the code taken from the stdbody.sql file delivered with Oracle Database:
FUNCTION USER
  RETURN VARCHAR2
IS
  c VARCHAR2 (255);
BEGIN
  SELECT USER
    INTO c
    FROM SYS.DUAL;

  RETURN c;
END;
If you ever have more than one row in your own DUAL table, every call to the USER function in your PL/SQL code will fail with a TOO_MANY_ROWS error.
Bottom line: all the discussion about using your own DUAL table made sense before the 10g days. The DUAL table is now a robust memory structure and doesn't allow a row to be added to it. So it makes no sense to use your own DUAL table rather than SYS.DUAL.

Compare two tables with minus operation in oracle

Some tables' data needs to be updated (or deleted, or inserted) in my system.
But I want to know which rows are updated, deleted and inserted.
So before the data is changed, I will back up the table in a different schema, like this:
create table backup_table as select * from schema1.testtable
After the data is changed, I want to find the difference between backup_table and testtable, and save the difference into a table in the backup schema.
The SQL I will run is like this:
CREATE TABLE TEST_COMPARE_RESULT
AS
SELECT 'BEFORE' AS STATUS, T1.*
FROM (
SELECT * FROM backup_table
MINUS
SELECT * FROM schema1.testtable
) T1
UNION ALL
SELECT 'AFTER' AS STATUS, T2.*
FROM (
SELECT * FROM schema1.testtable
MINUS
SELECT * FROM backup_table
) T2
What I am worried about is that I have heard the MINUS operation uses a lot of system resources. In my system, some tables will be over 700M in size. So I want to know how Oracle will read the 700M of data: in memory (PGA?) or in the temporary tablespace?
And how should I make sure that there are enough resources to do the compare operation?
MINUS is indeed a resource-intensive operation. It needs to read both tables and sort them in order to compare them. However, Oracle has advanced techniques to do this. It will not try to keep both tables in memory if they do not fit; it will use, yes, temporary space for the sorts. But I would recommend that you give it a try. Just run the query and see what happens. The database won't suffer, and you can always stop the execution of the statement.
What you can do to improve the performance of the query:
First, if there are columns that you are sure won't change, don't include them.
So, it is better to write:
select a, b from t1
minus
select a, b from t2
than to use select * from t when there are more than these two columns, because there is less work to do.
Second, if the amount of data to compare is really big for your system (too little temp space), you should try to compare it in chunks:
select a, b from t1 where col between val1 and val2
minus
select a, b from t2 where col between val1 and val2
Of course, an alternative to MINUS is to have some log columns, let's say updated_date. Selecting the rows where updated_date is later than the start of the process will show you the updated records. But this depends on whether you can alter the database model and the ETL code.
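A minimal sketch of the log-column idea. The column updated_date and the bind variable :process_start are hypothetical; they assume the table has been altered and that the ETL code records when the change process began.

```sql
-- Hypothetical: UPDATED_DATE is maintained by the application or ETL code,
-- and :process_start holds the time the change process started.
select *
from schema1.testtable
where updated_date >= :process_start;
```

Note that this only surfaces rows that still exist, so deleted rows would still need the backup comparison or a separate log.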

CTE With Insert In Oracle

I am running a query in Oracle with a CTE.
When I execute the query as a SELECT statement it works fine, but when I use it in an INSERT statement it takes a very long time to execute. Any help is appreciated; here is the code:
INSERT INTO port_weeklydailypricesTest (co_code,start_dtm,end_dtm)
SELECT * FROM
(
WITH CTE(co_code, start_dtm, end_dtm) AS
(
SELECT co_code ,
CAST(NEXT_DAY(MIN(dlyprice_date),'FRIDAY')-6 AS DATE) start_dtm ,
CAST(NEXT_DAY(MIN(dlyprice_date),'FRIDAY') AS DATE) end_dtm
FROM feed_dlyprice
GROUP BY co_code
UNION ALL
SELECT co_code ,
CAST(TO_CHAR(end_dtm + INTERVAL '1' DAY,'DD-MON-YYYY') AS DATE),
CAST(TO_CHAR(end_dtm + INTERVAL '7' DAY,'DD-MON-YYYY') AS DATE)
FROM CTE
WHERE CAST(end_dtm AS DATE) <= TO_CHAR(TO_DATE(SYSDATE+1,'DD-MON-YYYY'))
)
SELECT co_code,start_dtm,end_dtm
FROM CTE
);
If, as you say, the performance of the SELECT on its own is satisfactory, the problem must lie with the INSERT part of the statement.
There are a number of things which might cause an insert to run slowly:
The most likely is the presence of a trigger on the target table which executes something very expensive.
Another possibility is that the insert is waiting on a locked resource (say some other process has an exclusive table-level lock on the target table, or on some other shared resource such as a code control table).
It could be a storage allocation issue, chaining or row migration, too many indexes or lots of derived columns.
Perhaps it is down to hardware: an underpowered network, dodgy interconnects, a bad disk.
This is by no means exhaustive. The items at the top are application issues which you should be able to investigate and resolve. The further down the list you go, the more likely it is that you will need the assistance of an on-site DBA.
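The first and third items can be checked from the data dictionary. A sketch, assuming the target table lives in the current schema (otherwise use the ALL_ views and filter on the owner):

```sql
-- Any triggers that fire on the target table, and whether they are enabled.
select trigger_name, triggering_event, status
from user_triggers
where table_name = 'PORT_WEEKLYDAILYPRICESTEST';

-- Indexes that every inserted row has to maintain.
select index_name, uniqueness
from user_indexes
where table_name = 'PORT_WEEKLYDAILYPRICESTEST';
```

The table name is written in upper case because unquoted identifiers are stored that way in the dictionary.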
