Improve DELETE query Oracle - oracle

I have a query to delete some records from a table, but take too much time.
The table is use it in a stored procedure to match another table.
Every time that the SP is executed the table is truncated and filled with 2 or 3 millions of records depending of the received parameters.
The table doesn't have any FK or constraints
The query to delete the records that I am using is:
DELETE FROM TABLE1
WHERE (fecha,hora_ini,origen,destino,tipo,valor,rowsm1) IN (
SELECT fecha_t,hora_t,origen_t,destino_t,tipo,valor,id_t
FROM TABLE2)
I try to decrease the time in execute the query creating an index based in the same columns of the query
CREATE INDEX smb1 ON table1 (fecha,hora_ini,origen,destino,tipo,valor,rowsm1);
And the query take more time to execute.
How can improve the performance of this "DELETE" query.
UPDATE
EXPLAIN PLAN OUTPUT
DELETE TABLE1
TABLE ACCESS TABLE1
TABLE ACCESS FULL TABLE1
TABLE ACCESS FULL TABLE2
TABLE ACCESS FULL TABLE2

The index you created looks like a quite big index:
CREATE INDEX smb1
ON table1 (fecha,hora_ini,origen,destino,tipo,valor,rowsm1);
Sure, this depends on the amount of data but generally I would rather look for one or two selective columns - if possible.
Don't forget, that the index data has to be read as well and if it doesn't help to speed up the query, you even loose performance.
This might for instance happen, if the table is very small, because the database reads data block by block (I think it was about 8K). A small table can be read in one step - no need to use an index here.
Or, if more or less all records are selected. In this case the table has to be read anyway.
If you want to speed up the query you should create the same index (with a good selectivity) on table2. This way the EXPLAIN PLAN will look somewhat lie this:
DELETE STATEMENT
DELETE
NESTED LOOPS SEMI
INDEX FULL SCAN
INDEX RANGE SCAN

You can switch off logging and delete the rows,
Here is an example, you can do it 2 ways,
1.) Physically chaning the table to Nologging
2.) Using Nologging hint in the delete statement.
1.) First approach
both testemp and testemp2 are same tables with same data while testemp takes over a minute , testemp2 takes only 1 second
SQL> delete from testemp;
14336 rows deleted.
Elapsed: 00:01:04.12
SQL>
SQL>
SQL> alter table testemp2 nologging;
Table altered.
Elapsed: 00:00:02.86
SQL>
SQL> delete from testemp2;
14336 rows deleted.
Elapsed: 00:00:01.26
SQL>
The table needs to be put back to logging only when we physically change the table using "Alter" command, if you are using as hint not required please see the example below
2.) Second approach
SQL> set timing on;
SQL> delete from testemp2;
14336 rows deleted.
Elapsed: 00:00:01.51
Deleting data after reinserting same data into table now with nologging;
SQL> delete /*+NOLOGGING*/ from testemp2;
14336 rows deleted.
Elapsed: 00:00:00.28
SQL> select logging from user_Tables where table_name='TESTEMP2';
LOG
---
YES

Related

Oracle 11G - Performance effect of indexing at insert

Objective
Verify if it is true that insert records without PK/index plus create thme later is faster than insert with PK/Index.
Note
The point here is not about indexing takes more time (it is obvious), but the total cost (Insert without index + create index) is higher than (Insert with index). Because I was taught to insert without index and create index later as it should be faster.
Environment
Windows 7 64 bit on DELL Latitude core i7 2.8GHz 8G memory & SSD HDD
Oracle 11G R2 64 bit
Background
I was taught that insert records without PK/Index and create them after insert would be faster than insert with PK/Index.
However 1 million record inserts with PK/Index was actually faster than creating PK/Index later, approx 4.5 seconds vs 6 seconds, with the experiments below. By increasing the records to 3 million (999000 -> 2999000), the result was the same.
Conditions
The table DDL is below. One bigfile table space for both data and
index.
(Tested a separate index tablespace with the same result & inferior overall perforemace)
Flush the buffer/spool before each run.
Run the experiment 3 times each and made sure the results
were similar.
SQL to flush:
ALTER SYSTEM CHECKPOINT;
ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;
Question
Would it be actually true that "insert witout PK/Index + PK/Index creation later" is faster than "insert with PK/Index"?
Did I make mistakes or missed some conditions in the experiment?
Insert records with PK/Index
TRUNCATE TABLE TBL2;
ALTER TABLE TBL2 DROP CONSTRAINT PK_TBL2_COL1 CASCADE;
ALTER TABLE TBL2 ADD CONSTRAINT PK_TBL2_COL1 PRIMARY KEY(COL1) ;
SET timing ON
INSERT INTO TBL2
SELECT i+j, rpad(TO_CHAR(i+j),100,'A')
FROM (
WITH DATA2(j) AS (
SELECT 0 j FROM DUAL
UNION ALL
SELECT j+1000 FROM DATA2 WHERE j < 999000
)
SELECT j FROM DATA2
),
(
WITH DATA1(i) AS (
SELECT 1 i FROM DUAL
UNION ALL
SELECT i+1 FROM DATA1 WHERE i < 1000
)
SELECT i FROM DATA1
);
commit;
1,000,000 rows inserted.
Elapsed: 00:00:04.328 <----- Insert records with PK/Index
Insert records without PK/Index and create them after
TRUNCATE TABLE TBL2;
ALTER TABLE &TBL_NAME DROP CONSTRAINT PK_TBL2_COL1 CASCADE;
SET TIMING ON
INSERT INTO TBL2
SELECT i+j, rpad(TO_CHAR(i+j),100,'A')
FROM (
WITH DATA2(j) AS (
SELECT 0 j FROM DUAL
UNION ALL
SELECT j+1000 FROM DATA2 WHERE j < 999000
)
SELECT j FROM DATA2
),
(
WITH DATA1(i) AS (
SELECT 1 i FROM DUAL
UNION ALL
SELECT i+1 FROM DATA1 WHERE i < 1000
)
SELECT i FROM DATA1
);
commit;
ALTER TABLE TBL2 ADD CONSTRAINT PK_TBL2_COL1 PRIMARY KEY(COL1) ;
1,000,000 rows inserted.
Elapsed: 00:00:03.454 <---- Insert without PK/Index
table TBL2 altered.
Elapsed: 00:00:02.544 <---- Create PK/Index
Table DDL
CREATE TABLE TBL2 (
"COL1" NUMBER,
"COL2" VARCHAR2(100 BYTE),
CONSTRAINT "PK_TBL2_COL1" PRIMARY KEY ("COL1")
) TABLESPACE "TBS_BIG" ;
The current test case is probably good enough for you to overrule the "best practices". There are too many variables involved to make a blanket statement that "it's always best to leave the indexes enabled". But you're probably close enough to say it's true for your environment.
Below are some considerations for the test case. I've made this a community wiki in the hopes that others will add to the list.
Direct-path inserts. Direct-path writes use different mechanisms and may work completely differently. Direct-path inserts can often be significantly faster than regular inserts, although they have some complicated restrictions (for example, triggers must be disabled) and disadvantages (the data is not immediately backed-up). One particular way it affects this scenario is that NOLOGGING for indexes only applies during index creation. So even if a direct-path insert is used, an enabled index will always generate REDO and UNDO.
Parallelism. Large insert statements often benefit from parallel DML. Usually it's not worth worrying about the performance of bulk loads until it takes more than several seconds, which is when parallelism starts to be useful.
Bitmap indexes are not meant for large DML. Inserts or updates to a table with a bitmap index can lock the whole table and lead to disastrous performance. It might be helpful to limit the test case to b-tree indexes.
Add alter system switch logfile;? Log file switches can sometimes cause performance issues. The tests would be somewhat more consistent if they all started with empty logfiles.
Move data generation logic into a separate step. Hierarchical queries are useful for generating data but they can have their own performance issues. It might be better to create in intermediate table to hold the results, and then only test inserting the intermediate table into the final table.
It's true that it is faster to modify a table if you do not also have to modify one or more indexes and possibly perform constraint checking as well, but it is also largely irrelevant if you then have to add those indexes. You have to consider the complete change to the system that you wish to effect, not just a single part of it.
Obviously if you are adding a single row into a table that already contains millions of rows then it would be foolish to drop and rebuild indexes.
However, even if you have a completely empty table into which you are going to add several million rows it can still be slower to defer the indexing until afterwards.
The reason for this is that such an insert is best performed with the direct path mechanism, and when you use direct path inserts into a table with indexes on it, temporary segments are built that contain the data required to build the indexes (data plus rowids). If those temporary segments are much smaller than the table you have just loaded then they will also be faster to scan and to build the indexes from.
the alternative, if you have five index on the table, is to incur five full table scans after you have loaded it in order to build the indexes.
Obviously there are huge grey areas involved here, but well done for:
Questioning authority and general rules of thumb, and
Running actual tests to determine the facts in your own case.
Edit:
Further considerations -- you run a backup while the indexes are dropped. Now, following an emergency restore, you have to have a script that verifies that all indexes are in place, when you have the business breathing down your neck to get the system back up.
Also, if you absolutely were determined to not maintain indexes during a bulk load, do not drop the indexes -- disable them instead. This preserves the metadata for the indexes existence and definition, and allows a more simple rebuild process. Just be careful that you do not accidentally re-enable indexes by truncating the table, as this will render disabled indexes enabled again.
Oracle has to do more work while inserting data into table having an index. In general, inserting without index is faster than inserting with index.
Think in this way,
Inserting rows in a regular heap-organized table with no particular row order is simple. Find a table block with enough free space, put the rows randomly.
But, when there are indexes on the table, there is much more work to do. Adding new entry for the index is not that simple. It has to traverse the index blocks to find the specific leaf node as the new entry cannot be made into any block. Once the correct leaf node is found, it checks for enough free space and then makes the new entry. If there is not enough space, then it has to split the node and distribute the new entry into old and new node. So, all this work is an overhead and consumes more time overall.
Let's see a small example,
Database version :
SQL> SELECT banner FROM v$version where ROWNUM =1;
BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
OS : Windows 7, 8GB RAM
With Index
SQL> CREATE TABLE t(A NUMBER, CONSTRAINT PK_a PRIMARY KEY (A));
Table created.
SQL> SET timing ON
SQL> INSERT INTO t SELECT LEVEL FROM dual CONNECT BY LEVEL <=1000000;
1000000 rows created.
Elapsed: 00:00:02.26
So, it took 00:00:02.26. Index details:
SQL> column index_name format a10
SQL> column table_name format a10
SQL> column uniqueness format a10
SQL> SELECT index_name, table_name, uniqueness FROM user_indexes WHERE table_name = 'T';
INDEX_NAME TABLE_NAME UNIQUENESS
---------- ---------- ----------
PK_A T UNIQUE
Without Index
SQL> DROP TABLE t PURGE;
Table dropped.
SQL> CREATE TABLE t(A NUMBER);
Table created.
SQL> SET timing ON
SQL> INSERT INTO t SELECT LEVEL FROM dual CONNECT BY LEVEL <=1000000;
1000000 rows created.
Elapsed: 00:00:00.60
So, it took only 00:00:00.60 which is faster compared to 00:00:02.26.

How to speed up access to Oracle table after deleting millions of rows?

I'm newbie to Oracle and database staff.
I had a huge Oracle 11g table of data with 200 million rows. I delete them all and kept only one. I read many articles on how to recover table space after deleting rows, etc but my table still takes a long time to be accessed (minutes).
Example:
select * from MyTable; Takes 0.02 seconds to return one record which is ok !
select * from MyTable where ID > 0; Takes more than 20 minutes to return one record.
(The ID field has an index)
I have successfully run the following commands:
alter table MyTable enable row movement;
alter table MyTable shrink space compact;
alter table MyTable deallocate unused;
But I still have the same issue.
Any ideas please ?

How do I UPDATE a large table in oracle pl/sql in batches to avoid running out of undospace?

I have a very large table (5mm records). I'm trying to obfuscate the table's VARCHAR2 columns with random alphanumerics for every record on the table. My procedure executes successfully on smaller datasets, but it will eventually be used on a remote db whose settings I can't control, so I'd like to EXECUTE the UPDATE statement in batches to avoid running out of undospace.
Is there some kind of option I can enable, or a standard way to do the update in chunks?
I'll add that there won't be any distinguishing features of the records that haven't been obfuscated so my one thought of using rownum in a loop won't work (I think).
If you are going to update every row in a table, you are better off doing a Create Table As Select, then drop/truncate the original table and re-append with the new data. If you've got the partitioning option, you can create your new table as a table with a single partition and simply swap it with EXCHANGE PARTITION.
Inserts require a LOT less undo and a direct path insert with nologging (/+APPEND/ hint) won't generate much redo either.
With either mechanism, there would probably sill be 'forensic' evidence of the old values (eg preserved in undo or in "available" space allocated to the table due to row movement).
The following is untested, but should work:
declare
l_fetchsize number := 10000;
cursor cur_getrows is
select rowid, random_function(my_column)
from my_table;
type rowid_tbl_type is table of urowid;
type my_column_tbl_type is table of my_table.my_column%type;
rowid_tbl rowid_tbl_type;
my_column_tbl my_column_tbl_type;
begin
open cur_getrows;
loop
fetch cur_getrows bulk collect
into rowid_tbl, my_column_tbl
limit l_fetchsize;
exit when rowid_tbl.count = 0;
forall i in rowid_tbl.first..rowid_tbl.last
update my_table
set my_column = my_column_tbl(i)
where rowid = rowid_tbl(i);
commit;
end loop;
close cur_getrows;
end;
/
This isn't optimally efficient -- a single update would be -- but it'll do smaller, user-tunable batches, using ROWID.
I do this by mapping the primary key to an integer (mod n), and then perform the update for each x, where 0 <= x < n.
For example, maybe you are unlucky and the primary key is a string. You can hash it with your favorite hash function, and break it into three partitions:
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=0
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=1
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=2
You may have more partitions, and may want to put this into a loop (with some commits).
If I had to update millions of records I would probably opt to NOT update.
I would more likely create a temp table and then insert data from old table since insert doesnt take up a lot of redo space and takes less undo.
CREATE TABLE new_table as select <do the update "here"> from old_table;
index new_table
grant on new table
add constraints on new_table
etc on new_table
drop table old_table
rename new_table to old_table;
you can do that using parallel query, with nologging on most operations generating very
little redo and no undo at all -- in a fraction of the time it would take to update the
data.

Is it safe to put an index on an Oracle Temporary Table?

I have read that one should not analyze a temp table, as it screws up the table statistics for others. What about an index? If I put an index on the table for the duration of my program, can other programs using the table be affected by that index?
Does an index affect my process, and all other processes using the table?
or Does it affect my process alone?
None of the responses have been authoritative, so I am offering said bribe.
Does an index effect my process, and all other processes using the table? or Does it effect my process alone?
I'm assuming we are talking of GLOBAL TEMPORARY tables.
Think of a temporary table as of multiple tables that are created and dropped by each process on the fly from a template stored in the system dictionary.
In Oracle, DML of a temporary table affects all processes, while data contained in the table will affect only one process that uses them.
Data in a temporary table is visible only inside the session scope. It uses TEMPORARY TABLESPACE to store both data and possible indexes.
DML for a temporary table (i. e. its layout, including column names and indexes) is visible to everybody with sufficient privileges.
This means that existence of the index will affect your process as well as other processes using the table in sense that any process that modifies data in the temporary table will also have to modify the index.
Data contained in the table (and in the index too), on the contrary, will affect only the process that created them, and will not even be visible to other processes.
IF you want one process to use the index and another one not to use it, do the following:
Create two temporary tables with same column layout
Index on one of them
Use indexed or non-indexed table depending on the process
I assume you're referring to true Oracle temporary tables and not just a regular table created temporarily and then dropped. Yes, it is safe to create indexes on the temp tables and they will be used according to the same rules as a regular tables and indexes.
[Edit]
I see you've refined your question, and here's a somewhat refined answer:
From:
Oracle® Database Administrator's Guide
10g Release 2 (10.2)
Part Number B14231-02
"Indexes can be created on temporary tables. They are also temporary and the data in the index has the same session or transaction scope as the data in the underlying table."
If you need the index for efficient processing during the scope of the transaction then I would imagine you'll have to explicitly hint it in the query because the statistics will show no rows for the table.
You're asking about two different things, indexes and statistics.
For indexes, yes, you can create indexes on the temp tables, they will be maintained as per usual.
For statistics, I recommend that you explicitly set the stats of the table to represent the average size of the table when queried. If you just let oracle gather stats by itself, the stats process isn't going to find anything in the tables (since by definition, the data in the table is local to your transaction), so it will return inaccurate results.
e.g. you can do:
exec dbms_stats.set_table_stats(user, 'my_temp_table', numrows=>10, numblks=>4)
Another tip is that if the size of the temporary table varies greatly, and within your transaction, you know how many rows are in the temp table, you can help out the optimizer by giving it that information. I find this helps out a lot if you are joining from the temp table to regular tables.
e.g., if you know the temp table has about 100 rows in it, you can:
SELECT /*+ CARDINALITY(my_temp_table 100) */ * FROM my_temp_table
Well, I tried it out and the index was visible and used by the second session. Creating a new global temporary table for your data would be safer if you really need an index.
You are also unable to create an index while any other session is accessing the table.
Here's the test case I ran:
--first session
create global temporary table index_test (val number(15))
on commit preserve rows;
create unique index idx_val on index_test(val);
--second session
insert into index_test select rownum from all_tables;
select * from index_test where val=1;
You can also use the dynamic sampling hint (10g):
select /*+ DYNAMIC_SAMPLING (3) */ val
from index_test
where val = 1;
See Ask Tom
You cannot create an index on a temporary table while it is used by another session, so answer is: No, it cannot affect any other process, because it is not possible.
An existing Index affects only your current session, because for any other session the temporary table appears empty, so it cannot access any index values.
Session 1:
SQL> create global temporary table index_test (val number(15)) on commit preserve rows;
Table created.
SQL> insert into index_test values (1);
1 row created.
SQL> commit;
Commit complete.
SQL>
Session 2 (while session 1 is still connected):
SQL> create unique index idx_val on index_test(val);
create unique index idx_val on index_test(val)
*
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already in use
SQL>
Back to session 1:
SQL> delete from index_test;
1 row deleted.
SQL> commit;
Commit complete.
SQL>
Session 2:
SQL> create unique index idx_val on index_test(val);
create unique index idx_val on index_test(val)
*
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already in use
SQL>
still failing, you first have to disconnect session 1 or table has to be truncated.
Session 1:
SQL> truncate table index_test;
Table truncated.
SQL>
Now you can create the index in Session 2:
SQL> create unique index idx_val on index_test(val);
Index created.
SQL>
This index of course will be used by any session.

Data loading in Oracle

I am facing problem in loading data. I have to copy 800,000 rows from one table to another in Oracle database.
I tried for 10,000 rows first but the time it took is not satisfactory. I tried using the "BULK COLLECT" and "INSERT INTO SELECT" clause but for both the cases response time is around 35 minutes. This is not the desired response I'm looking for.
Does anyone have any suggestions?
Anirban,
Using an "INSERT INTO SELECT" is the fastest way to populate your table. You may want to extend it with one or two of these hints:
APPEND: to use direct path loading, circumventing the buffer cache
PARALLEL: to use parallel processing if your system has multiple cpu's and this is a one-time operation or an operation that takes place at a time when it doesn't matter that one "selfish" process consumes more resources.
Just using the append hint on my laptop copies 800,000 very small rows below 5 seconds:
SQL> create table one_table (id,name)
2 as
3 select level, 'name' || to_char(level)
4 from dual
5 connect by level <= 800000
6 /
Tabel is aangemaakt.
SQL> create table another_table as select * from one_table where 1=0
2 /
Tabel is aangemaakt.
SQL> select count(*) from another_table
2 /
COUNT(*)
----------
0
1 rij is geselecteerd.
SQL> set timing on
SQL> insert /*+ append */ into another_table select * from one_table
2 /
800000 rijen zijn aangemaakt.
Verstreken: 00:00:04.76
You mention that this operation takes 35 minutes in your case. Can you post some more details, so we can see what exactly is taking 35 minutes?
Regards,
Rob.
I would agree with Rob. Insert into () select is the fastest way to do this.
What exactly do you need to do? If you're trying to do a table rename by copying to a new table and then deleting the old, you might be better off doing a table rename:
alter table
table
rename to
someothertable;
INSERT INTO SELECT is the fastest way to do it.
If possible/necessary, disable all indexes on the target table first.
If you have no existing data in the target table, you can also try CREATE AS SELECT.
As with the above, I would recommend the Insert INTO ... AS select .... or CREATE TABLE ... AS SELECT ... as the fastest way to copy a large volume of data between two tables.
You want to look up the direct-load insert in your oracle documentation. This adds two items to your statements: parallel and nologging. Repeat the tests but do the following:
CREATE TABLE Table2 AS SELECT * FROM Table1 where 1=2;
ALTER TABLE Table2 NOLOGGING;
ALTER TABLE TABLE2 PARALLEL (10);
ALTER TABLE TABLE1 PARALLEL (10);
ALTER SESSION ENABLE PARALLEL DML;
INSERT INTO TABLE2 SELECT * FROM Table 1;
COMMIT;
ALTER TABLE 2 LOGGING:
This turns off the rollback logging for inserts into the table. If the system crashes, there's not recovery and you can't do a rollback on the transaction. The PARALLEL uses N worker thread to copy the data in blocks. You'll have to experiment with the number of parallel worker threads to get best results on your system.
Is the table you are copying to the same structure as the other table? Does it have data or are you creating a new one? Can you use exp/imp? Exp can be give a query to limit what it exports and then imported into the db. What is the total size of the table you are copying from? If you are copying most of the data from one table to a second, can you instead copy the full table using exp/imp and then remove the unwanted rows which would be less than copying.
try to drop all indexes/constraints on your destination table and then re-create them after data load.
use /*+NOLOGGING*/ hint in case you use NOARCHIVELOG mode, or consider to do the backup right after the operation.

Resources