Fastest way to insert a million rows in Oracle - oracle

How can I insert more than a million rows in Oracle in an optimal way with the following procedure? It hangs if I increase the FOR loop to a million rows.
create or replace procedure inst_prc1 as
xssn number;
xcount number;
l_start Number;
l_end Number;
cursor c1 is select max(ssn)S1 from dtr_debtors1;
Begin
l_start := DBMS_UTILITY.GET_TIME;
FOR I IN 1..10000 LOOP
For C1_REC IN C1 Loop
insert into dtr_debtors1(SSN) values (C1_REC.S1+1);
End loop;
END LOOP;
commit;
l_end := DBMS_UTILITY.GET_TIME;
DBMS_OUTPUT.PUT_LINE('The Procedure Start Time is '||l_start);
DBMS_OUTPUT.PUT_LINE('The Procedure End Time is '||l_end);
End inst_prc1;

Your approach will lead to memory issues. The fastest way will be this [query edited after David's comment to take care of the null scenario]:
insert into dtr_debtors1(SSN)
select a.S1+level
from dual,(select nvl(max(ssn),0) S1 from dtr_debtors1) a
connect by level <= 10000
An insert-select is the fastest approach, as everything stays in RAM.
This query can become slow if it spills into the global temp area, but then that needs DB tuning. I don't think there can be anything faster than this.
A few more details on memory use by the query:
Each session's server process has its own PGA [Program Global Area], which is basically the RAM available to its queries. If this area is not sufficient to return the query results, the SQL engine starts using the global temp tablespace, which lives on disk, and the query starts becoming slow. If the data needed by the query is so huge that even the temp area is not sufficient, you will get a tablespace error.
So always design the query so that it stays in the PGA; otherwise it's a red flag.
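One way to check whether a running statement has spilled out of the PGA is to look at the active work areas. This is only a sketch; it assumes your account can read the V$ views, and the MB rounding is just for readability.

```sql
-- Work areas that are active right now, with any temp segment they spilled into.
select sid,
       sql_id,
       operation_type,
       round(actual_mem_used / 1024 / 1024) as mem_mb,
       round(tempseg_size / 1024 / 1024)    as temp_mb  -- null while the operation fits in memory
from   v$sql_workarea_active
order  by tempseg_size desc nulls last;
```

A non-null temp_mb for the insert's SQL_ID suggests the statement is using the temp tablespace rather than staying in memory.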

Inserting one row at a time with a single insert statement inside a loop is slow. The fastest way is to use an insert-select like the following, which generates a million rows and inserts them in bulk.
insert into dtr_debtors1(SSN)
select level from dual connect by level <= 1000000;
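If a single CONNECT BY over a million levels runs into memory limits on your system (for example ORA-30009), a possible variation is to cross join two smaller row generators. This is a sketch under that assumption, not something the original answer requires.

```sql
-- 1000 x 1000 = 1,000,000 distinct values, numbered 1 .. 1000000
insert into dtr_debtors1 (SSN)
select (a.n - 1) * 1000 + b.n
from   (select level as n from dual connect by level <= 1000) a
cross  join (select level as n from dual connect by level <= 1000) b;
```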

Try dropping all the indexes created on your table, then insert using the select query, and recreate the indexes afterwards.

1) If you want to insert using PL/SQL, use BULK COLLECT INTO to fetch and a FORALL bulk bind for the insert DML.
2) For a multi-table insert in SQL, use the INSERT ALL statement.
3) Another method is INSERT INTO <tb_nm> SELECT.
4) Use the SQL*Loader utility.
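As a rough illustration of option 1, the procedure from the question could be reshaped along these lines. The collection type and variable names are made up for the example, and the batch size of 10000 simply mirrors the original loop.

```sql
declare
  -- Collection holding the values to insert (type name is illustrative)
  type ssn_nt is table of dtr_debtors1.ssn%type;
  l_ssns ssn_nt := ssn_nt();
  l_base dtr_debtors1.ssn%type;
begin
  -- Read the current maximum once instead of once per row
  select nvl(max(ssn), 0) into l_base from dtr_debtors1;

  l_ssns.extend(10000);
  for i in 1 .. l_ssns.count loop
    l_ssns(i) := l_base + i;
  end loop;

  -- One context switch to the SQL engine for the whole batch
  forall i in 1 .. l_ssns.count
    insert into dtr_debtors1 (ssn) values (l_ssns(i));

  commit;
end;
/
```

For a plain number sequence like this, the single insert-select shown earlier is still simpler; FORALL is mostly useful when the values really have to be computed in PL/SQL.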

Related

How to delete huge rows in oracle table using parallel sessions query quickly

I am using the query below to delete 250 million plus rows from my table, and it is taking too long.
I have tried t_delete limits of up to 20000.
Deletion is still slow.
Please suggest a few optimisations to the same code to get the job done faster.
DECLARE
TYPE tt_delete IS TABLE OF ROWID; t_delete tt_delete;
CURSOR cIMAV IS SELECT ROWID FROM moc_attribute_value where id in (select
id from ORPHANS_MAV);
total Number:=0;
rcount Number:=0;
Stmt1 varchar2(2000);
Stmt2 varchar2(2000);
BEGIN
--- CREATE TABLE orphansInconsistenDelProgress (currentTable VARCHAR(100), deletedCount INT, totalToDelete INT);
--- INSERT INTO orphansInconsistenDelProgress (currentTable, deletedCount,totalToDelete) values ('',0,0);
Stmt1:='ALTER SESSION SET parallel_degree_policy = AUTO';
Stmt2:='ALTER SESSION FORCE PARALLEL DML';
EXECUTE IMMEDIATE Stmt1;
EXECUTE IMMEDIATE Stmt2;
--- ALTER SESSION SET parallel_degree_policy = AUTO;
--- ALTER SESSION FORCE PARALLEL DML;
COMMIT;
--- MOC_ATTRIBUTE_VALUE
SELECT count(*) INTO total FROM ORPHANS_MAV;
UPDATE orphansInconsistenDelProgress SET currentTable='ORPHANS_MAV',
totalToDelete=total;
rcount := 0;
OPEN cIMAV;
LOOP
FETCH cIMAV BULK COLLECT INTO t_delete LIMIT 2000;
EXIT WHEN t_delete.COUNT = 0;
FORALL i IN 1..t_delete.COUNT
DELETE moc_attribute_value WHERE ROWID = t_delete (i);
COMMIT;
rcount := rcount + 2000;
UPDATE orphansInconsistenDelProgress SET deletedCount=rcount;
END LOOP;
CLOSE cIMAV;
COMMIT;
END;
/
A single Oracle parallel query can simplify the code and improve performance.
begin
execute immediate 'alter session enable parallel dml';
delete /*+ parallel */
from moc_attribute_value
where id in (select id from ORPHANS_MAV);
update OrphansInconsistenDelProgress
set currentTable = 'ORPHANS_MAV',
totalToDelete = sql%rowcount;
commit;
end;
/
In general, we want to either let Oracle break the task into pieces or use our own custom chunking. The original code seems to be doing both - it reads the data in chunks, and then submits each chunk to be further divided into a parallel delete. That approach generates lots of tiny pieces, and Oracle likely wastes a lot of time on things like thread coordination.
Deleting a large number of rows is expensive because there's no way to avoid REDO and UNDO. You might want to look into using DDL options, such as truncating a partition, or dropping and recreating the table. (But be careful recreating objects, it's difficult to perfectly recreate complex objects. We tend to forget things like privileges and table options.)
Tuning parallelism and large jobs is complicated. It's important to use the best monitoring tools, to ensure that Oracle is requesting, allocating, and using the right number of parallel processes, and that the execution plan is correct. One strong advantage of using a single SQL statement is that you can use real-time SQL monitoring reports to monitor progress. If you have the Diagnostics and Tuning Pack licenses, find the SQL_ID in GV$SQL and generate the report with select dbms_sqltune.report_sql_monitor('your SQL_ID here');.
Maybe use TRUNCATE TABLE.
Truncating a table is faster and uses fewer resources than a DELETE, but it removes every row, so it only applies if nothing in the table needs to be kept.
If you are keeping only a fraction of the rows, it is likely to be much faster to copy over the rows to keep, then swap tables and delete the old table.
(No I don't know the threshold at which this is faster than DELETEing.)
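A minimal sketch of the copy-and-swap idea, using the tables from the question. The _keep and _old table names are invented for the example, and indexes, constraints, grants and triggers would have to be recreated on the new table by hand.

```sql
-- 1. Copy only the rows that should survive
create table moc_attribute_value_keep as
select mav.*
from   moc_attribute_value mav
where  not exists (select 1 from ORPHANS_MAV o where o.id = mav.id);

-- 2. Swap the tables
alter table moc_attribute_value      rename to moc_attribute_value_old;
alter table moc_attribute_value_keep rename to moc_attribute_value;

-- 3. Recreate indexes, constraints, grants and triggers here, then drop the old copy
drop table moc_attribute_value_old purge;
```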

How to run a stored procedure in batch mode or with parallel processing

We are iterating over 100k+ records from a global temporary table. The stored procedure below iterates over all records from the global temp table one by one and has to perform the three steps below:
check whether the product exists or not
check whether the assets inside the product have the 'category' or not
check whether the assets have file names matching '%pdf%' or not
So each record has to go through these 3 steps, and the final document names are stored in the table for each successful record. If an error occurs in any of the steps, the error message is stored for that record.
The stored procedure below takes a long time because it processes the records sequentially.
Is there any way to make this process faster within the stored procedure itself by doing batch processing?
If that's not possible in the stored procedure, can we move this code into Java and run it in multi-threaded mode? For example, create 10 threads, with each thread taking one record concurrently and processing this code. I would be happy if somebody could give some pseudo code.
Which approach would you suggest?
DECLARE
V_NODE_ID VARCHAR2(20);
V_FILENAME VARCHAR2(100);
V_CATEGORY_COUNT INTEGER :=0;
FINAL_FILNAME VARCHAR2(2000);
V_FINAL_ERRORMESSAGE VARCHAR2(2000);
CURSOR C1 IS
SELECT isbn FROM GT_ADD_ISBNS GT;
CURSOR C2(v_isbn in varchar2) IS
SELECT ANP.NODE_ID NODE_ID
FROM
table1 ANP,
table2 ANPP,
table3 AN
WHERE
ANP.NODE_ID=AN.ID AND
ANPP.NODE_ID=ANP.NODE_ID AND
AN.NAME_ID =26 AND
ANP.CATEORGY='category' AND
ANP.QNAME_ID='categories' AND
ANP.NODE_ID IN(SELECT CHILD_NODE_ID
FROM TABLE_ASSOC START WITH PARENT_NODE_ID IN(v_isbn)
CONNECT BY PRIOR CHILD_NODE_ID = PARENT_NODE_ID);
BEGIN
--Iterating all Products
FOR R1 IN C1
LOOP
FINAL_FILNAME :='';
BEGIN
--To check whether Product is exists or not
SELECT AN.ID INTO V_NODE_ID
FROM TABLE1 AN,
TABLE2 ANP
WHERE
AN.ID=ANP.NODE_ID AND
ANP.VALUE in(R1.ISBN);
V_CATEGORY_COUNT :=0;
V_FINAL_ERRORMESSAGE :='';
--To check Whether Product inside the assets are having the 'category' is applied or not
FOR R2 IN C2(R1.ISBN)
LOOP
V_CATEGORY_COUNT := V_CATEGORY_COUNT+1;
BEGIN
--In this Logic Product inside the assets have applied the 'category' But those assets are having documents LIKE '%pdf%' or not
SELECT ANP.STRING_VALUE into V_FILENAME
FROM
table1 ANP,
table2 ANPP,
table3 ACD
WHERE
ANP.QNAME_ID=21 AND
ACD.ID=ANPP.LONG_VALUE AND
ANP.NODE_ID=ANPP.NODE_ID AND
ANPP.QNAME_ID=36 AND
ANP.STRING_VALUE LIKE '%pdf%' AND
ANP.NODE_ID=R2.NODE_ID;
FINAL_FILNAME := FINAL_FILNAME || V_FILENAME ||',';
EXCEPTION WHEN
NO_DATA_FOUND THEN
V_FINAL_ERRORMESSAGE:=V_FINAL_ERRORMESSAGE|| 'Category is applied for this Product But for the asset:'|| R2.NODE_ID || ':Documents[LIKE %pdf%] were not found ;';
UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE= V_FINAL_ERRORMESSAGE WHERE ISBN= R1.ISBN;
END;--Iterating for each NODEID
END LOOP;--Iterating the assets[Nodes] for each product of catgeory
-- DBMS_OUTPUT.PUT_LINE('R1.ISBN:' || R1.ISBN ||'::V_CATEGORY_COUNT:' || V_CATEGORY_COUNT);
IF(V_CATEGORY_COUNT = 0) THEN
UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE= 'Category is not applied to none of the Assets for this Product' WHERE ISBN= R1.ISBN;
END IF;
EXCEPTION WHEN
NO_DATA_FOUND THEN
UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE= 'Product is not Found:' WHERE ISBN= R1.ISBN;
END;
-- DBMS_OUTPUT.PUT_LINE( R1.ISBN || 'Final documents:'||FINAL_FILNAME);
UPDATE GT_ADD_ISBNS SET FILENAME=FINAL_FILNAME WHERE ISBN= R1.ISBN;
COMMIT;
END LOOP;--looping gt_isbns
END;
You have a number of potential performance hits. Here's one:
"We are iterating 100k+ records from global temporary table"
Global temporary tables can be pretty slow. Populating them means writing all that data to disk; reading from them means reading from disk. That's a lot of I/O which might be avoidable. Also, GTTs use the temporary tablespace so you may be in contention with other sessions doing large sorts.
Here's another red flag:
FOR R1 IN C1 LOOP
... FOR R2 IN C2(R1.ISBN) LOOP
SQL is a set-based language. It is optimised for joining tables and returning sets of data in a highly performant fashion. Nested cursor loops mean row-by-row processing, which is undoubtedly easier to code but may be orders of magnitude slower than the equivalent set operation would be.
--To check whether Product is exists or not
You have several queries selecting from the same tables (AN, ANP) using the same criteria (isbn). Perhaps all these duplicates are the only way of validating your business rules, but it seems unlikely.
FINAL_FILNAME := FINAL_FILNAME || V_FILENAME ||',';
Maybe you could rewrite your query to use listagg() instead of using procedural logic to concatenate a string?
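For instance, the file-name lookup from the inner loop could be turned into one aggregate query per node along these lines. This is only a sketch built from the joins in the question; LISTAGG needs 11g Release 2 or later, and the filenames alias is made up.

```sql
-- One row per node, with all matching file names joined by commas
select anp.node_id,
       listagg(anp.string_value, ',')
         within group (order by anp.string_value) as filenames
from   table1 anp
join   table2 anpp on anpp.node_id = anp.node_id
join   table3 acd  on acd.id = anpp.long_value
where  anp.qname_id  = 21
and    anpp.qname_id = 36
and    anp.string_value like '%pdf%'
group  by anp.node_id;
```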
UPDATE GT_ADD_ISBNS
Again, all your updates are single row operations instead of set ones.
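As an illustration, the "product not found" branch could be written as one set-based statement instead of one update per loop iteration. The sketch reuses the join from the first check in the question and assumes nothing else about the business rules.

```sql
-- Flag every ISBN that has no matching product, in a single statement
update GT_ADD_ISBNS gt
set    gt.ERROR_MESSAGE = 'Product is not Found:'
where  not exists (
         select 1
         from   TABLE1 an
         join   TABLE2 anp on an.ID = anp.NODE_ID
         where  anp.VALUE = gt.ISBN
       );
```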
"Is there any way to make this process faster in the stored procedure itself by doing batch process?"
Without knowing your rules and the context we cannot rewrite your logic for you, but 15-16 hours is way too long for this so you can definitely reduce the elapsed time.
Things to consider:
Replace the writing and reading to the temporary table with the query you use to populate it
Rewrite the loops to use BULK COLLECT with a high LIMIT (e.g. 1000) to improve the select efficiency.
Populate arrays and use FORALL to improve the efficiency of the updates.
Try to remove all those individual look-ups by incorporating the logic into the main query, using OUTER JOIN syntax to test for existence.
These are all guesses. If you really want to know where the procedure is spending the time - and that knowledge is the root of all successful tuning, so you ought to want to know - you should run the procedure under a PL/SQL Profiler. This will tell you which lines cost the most time, and those are usually the ones where you need to focus your tuning effort. If you don't already have access to DBMS_PROFILER you will need a DBA to run the install script for you.
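A profiling run might look roughly like this. It assumes the anonymous block has been wrapped in a procedure (validate_isbns is a hypothetical name) and that the profiler tables have already been installed in your schema.

```sql
declare
  l_status binary_integer;
begin
  l_status := dbms_profiler.start_profiler(run_comment => 'isbn validation run');
  validate_isbns;  -- hypothetical procedure containing the logic from the question
  l_status := dbms_profiler.stop_profiler;
end;
/

-- Most expensive lines first
select u.unit_name, d.line#, d.total_occur, d.total_time
from   plsql_profiler_runs  r
join   plsql_profiler_units u on u.runid = r.runid
join   plsql_profiler_data  d on d.runid = u.runid
                             and d.unit_number = u.unit_number
where  r.run_comment = 'isbn validation run'
order  by d.total_time desc;
```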
" can we change this code into Java and run this code in multi threaded mode?"
Given that one of the reasons for slowing down the procedure is the I/O cost of selecting from the temporary table there's a good chance multi-threading might introduce further contention and actually make things worse. You should seek to improve the stored procedure first.

Oracle insert 1000 rows at a time

I would like to insert 1000 rows at a time with oracle
Example:
INSERT INTO MSG(AUTHOR)
SELECT AUTHOR FROM oldDB.MSGLOG
This insert is taking a very long time, but if I limit it with ROWNUM <= 1000 it inserts right away, so I want to create an import that goes through my X number of rows and inserts 1000 at a time.
Thanks
It is rather doubtful that this will really improve performance particularly given the simplicity of the SELECT statement. That must be doing either a full scan of the table or of an index on author. If that scan is slow, you're much better off diagnosing the underlying problem rather than trying to work around it (for example, perhaps oldDB.MsgLog has a number of empty blocks below the high water mark that forces a full table scan to read many more blocks than is strictly necessary).
If you really want to write some more verbose and less efficient PL/SQL to accomplish the task, though, you certainly can:
DECLARE
TYPE tbl_authors IS TABLE OF msg.author%TYPE;
l_authors tbl_authors;
CURSOR author_cursor
IS SELECT author
FROM oldDB.MsgLog;
BEGIN
OPEN author_cursor;
LOOP
FETCH author_cursor
BULK COLLECT INTO l_authors
LIMIT 1000;
EXIT WHEN l_authors.count = 0;
FORALL i IN 1..l_authors.count
INSERT INTO msg( author )
VALUES( l_authors(i) );
END LOOP;
CLOSE author_cursor;
END;

bulk collect in oracle

How do I query a bulk-collected collection? For example, I have
select name
bulk collect into namesValues
from table1
where namesValues is dbms_sql.varchar2_table.
Now, I have another table XYZ which contains
name is_valid
v
h
I want to update is_valid to 'Y' if name is in table1 else 'N'. Table1 has 10 million rows. After bulk collecting I want to execute
update xyz
set is_valid ='Y'
where name in namesValue.
How do I query namesValue? Or is there another option? Table1 has no index.
Please help.
As Tom Kyte (Oracle Corp. Vice President) says:
My mantra, that I'll be sticking with thank you very much, is:
You should do it in a single SQL statement if at all possible.
If you cannot do it in a single SQL Statement, then do it in PL/SQL.
If you cannot do it in PL/SQL, try a Java Stored Procedure.
If you cannot do it in Java, do it in a C external procedure.
If you cannot do it in a C external routine, you might want to
seriously think about why it is you need to do it…
think in sets...
learn all there is to learn about SQL...
You should perform your update in SQL if you can. If you need to add an index to do this then that might be preferable to looping through a collection populated with BULK COLLECT.
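For the tables in the question, a single-statement version could look like this. It is a sketch that assumes xyz.name and table1.name are directly comparable.

```sql
-- Set every row of xyz in one pass: 'Y' when the name exists in table1, otherwise 'N'
update xyz
set    is_valid = case
                    when exists (select 1 from table1 t where t.name = xyz.name)
                    then 'Y'
                    else 'N'
                  end;
```

With an index on table1.name, the EXISTS probe becomes a cheap lookup for each row of xyz.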
If however, this is some sort of assignment....
You should specify it as such but here's how you would do it.
I have assumed that your DB server does not have the capacity to hold 10 million records in memory so rather than BULK COLLECTing all 10 million records in one go I have put the BULK COLLECT into a loop to reduce your memory overheads. If this is not the case then you can omit the bulk collect loop.
DECLARE
c_bulk_limit CONSTANT PLS_INTEGER := 500000;
--
CURSOR name_cur
IS
SELECT name
FROM table1;
--
TYPE namesValuesType IS TABLE OF table1.name%TYPE
INDEX BY PLS_INTEGER;
namesValues namesValuesType;
BEGIN
-- Populate the collection
OPEN name_cur;
LOOP
-- Fetch the records in a loop limiting them
-- to the c_bulk_limit amount at a time
FETCH name_cur BULK COLLECT INTO namesValues
LIMIT c_bulk_limit;
-- Process the records in your collection
FORALL x IN INDICES OF namesValues
UPDATE xyz
SET is_valid ='Y'
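WHERE name = namesValues(x)
-- NULL != 'Y' is never true, so rows that are still NULL need an explicit test
AND (is_valid IS NULL OR is_valid != 'Y');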
-- Set up loop exit criteria
EXIT WHEN namesValues.COUNT < c_bulk_limit;
END LOOP;
CLOSE name_cur;
-- You want to update all remaining rows to 'N'
UPDATE xyz
SET is_valid ='N'
WHERE is_valid IS NULL;
EXCEPTION
WHEN others
THEN
IF name_cur%ISOPEN
THEN
CLOSE name_cur;
END IF;
-- Re-raise the exception;
RAISE;
END;
/
Depending upon your rollback segment sizes etc. you may want to issue interim commits within the bulk collect loop but be aware that you will not then be able to rollback these changes. I deliberately haven't added any COMMITs to this so you can choose where to put them to suit your system.
You also might want to change the size of the c_bulk_limit constant depending upon the resources available to you.
Your update will still cause you problems if the xyz table is large and there is no index on the name column.
Hope it helps...
"Table1 has no index."
Well there's your problem right there. Why not? Put an index on TABLE1.NAME and use a normal SQL UPDATE to amend the data in XYZ.
Trying to solve this problem with bulk collect is not the proper approach.

Oracle pl sql 10g - move set of rows from a table to a history table with same structure

PL SQL moves older versions of data from a transaction table to a history table of same structure and archive for a certain period -
for each record
insert into tab_hist (select older_versions of current row);
delete from tab (select older_versions of current row);
END
PS: earlier we were not archiving (no insert), but adding the insert has doubled the run time. Can we accomplish the insert and the delete with a single select statement? There is a large amount of data to be processed, across multiple tables.
This is a batch operation, right? In which case you should avoid Row By Row and use set processing. SQL is all about The Joy Of Sets.
Oracle has fantastic bulk SQL processing capabilities. The pseudo code you posted would look something like this:
declare
cursor c_oldrecs is
select * from your_table
where criterion between some_date and some_other_date;
type rec_nt is table of your_table%rowtype;
oldrecs_coll rec_nt;
begin
open c_oldrecs;
loop
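fetch c_oldrecs bulk collect into oldrecs_coll limit 1000;
exit when oldrecs_coll.count() = 0;
forall i in oldrecs_coll.first() .. oldrecs_coll.last()
insert into your_table_hist
values oldrecs_coll(i);
-- before 11g, referencing a record field inside FORALL raises PLS-00436;
-- there, collect the keys into a separate scalar collection instead
forall i in oldrecs_coll.first() .. oldrecs_coll.last()
delete from your_table
where pk_col = oldrecs_coll(i).pk_col;
end loop;
close c_oldrecs;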
end;
/
This bulk processing is faster because it sends one thousand statements to the database at a time, instead of switching between PL/SQL and SQL one thousand times. The LIMIT 1000 clause is there to prevent a really huge selection blowing the PGA. This safeguard may not be necessary in your case, or perhaps you can work with a higher value.
I think your current implementation is wrong. It is better to keep only the current version in the live table, and to keep all the historical versions in a separate table from the off. Use triggers to maintain the history as part of every transaction.
It may be that the slowness you are seeing is due to the logic that selects which rows are to be moved. If so, you might get better results by doing the select once to get the rowids into a nested table in memory, then doing the insert and the delete based on that list; or alternatively, driving your loop with a query that selects the rows to be moved.
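The rowid approach could be sketched as follows. The table names follow the earlier example, while the criterion column and the cut-off date are placeholders for whatever identifies the older versions.

```sql
declare
  type rowid_nt is table of rowid;
  l_rowids rowid_nt;
begin
  -- Run the expensive selection once and remember which rows it found
  select rowid
  bulk   collect into l_rowids
  from   your_table
  where  criterion < date '2020-01-01';  -- placeholder condition

  -- Copy exactly those rows to the history table
  forall i in 1 .. l_rowids.count
    insert into your_table_hist
    select * from your_table where rowid = l_rowids(i);

  -- Then remove exactly the same rows
  forall i in 1 .. l_rowids.count
    delete from your_table where rowid = l_rowids(i);

  commit;
end;
/
```

For very large selections, fetching the rowids through a cursor with a LIMIT clause would keep the collection from growing too large in the PGA.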
You might instead consider creating a trigger on insert that will move the existing rows that "match" the row being inserted. This will slow down the inserts somewhat, but would mean you don't need any process to move the old rows in bulk.
If you are on Enterprise edition with the partitioning option, look at partition exchange.
As simple as this
CREATE TABLE BACKUP_TAB AS SELECT * FROM TAB;
If you are deleting a lot of rows you will be hitting your undo tablespace, and a delete which removes, say, 100k rows can cause performance issues. You are better off deleting in batches of, say, 5k rows at a time and committing after each batch.
BEGIN
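-- Where condition on insert and delete must be the same
loop
INSERT INTO BACKUP_TAB SELECT * FROM TAB t WHERE 1=1 --Your condition here
-- skip rows copied by earlier iterations, otherwise the same rows are inserted
-- again and the loop never ends; pk_col is a placeholder for the table's key column
and not exists (select 1 from BACKUP_TAB b where b.pk_col = t.pk_col)
and rownum < 5000;
exit when SQL%rowcount < 4999;
commit;
end loop;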
loop
DELETE FROM TAB
where 1=1--Your condition here
and rownum < 5000;
exit when SQL%rowcount < 4999;
commit;
end loop;
commit;
END;

Resources