Direct-Path INSERT Oracle - oracle

I am reading about Direct-Path INSERT in oracle documentation Loading Tables
It is written that :
During direct-path INSERT operations, the database appends the inserted data after existing data in the table. Data is written directly into datafiles, bypassing the buffer cache. Free space in the table is not reused, and referential integrity constraints are ignored. Direct-path INSERT can perform significantly better than conventional insert.
Can anyone explain me ,how referential integrity constraints is been ignored,According to my understanding it will load the data into the table ignoring the referential constraint .and after insert it will check for referential constraint.
If this is so ,if i use like this .
FORALL i IN v_temp.first..v_temp.last save exceptions
INSERT /*+ APPEND_VALUES */ INTO orderdata
VALUES(v_temp(i).id,v_temp(i).name);
COMMIT;
Will this will gave me correct index ,in case of any exceptions and how ?.
Sorry to ask so many questions in one ,but they are releated to each other.
How refrential constraint is been ignored
What is Free Space in table above
How it will give correct Index in case of any exceptions.

The first question should really be (Do I want/need to use direct path insert?", and the second should be "Did my query use direct path insert?"
If you need referential integrity checks, then you do not use direct path insert.
If you do not want the table to be exclusively locked for modifications, then do not use direct path insert.
If you remove data by deletion and only insert with this code, then do not use direct path insert.
One quick and easy check on whether direct path insert was used is to immediately, before committing the insert, issue a select of one row from the table. If it succeeds then direct path insert was not used -- you will receive an error message if it was because your change has to be commited before your session can read the table.

Referential Integrity is is not ignored in that statement.
See this AskTom thread for an explanation and an example:
what it seems to neglect to say in that old documentation is that....
insert /*+ append */ will ignore the append hint and use conventional
path loading when the table has referential integrity or a trigger

Free space is an in it doesn't reuse space freed up in the table by deletes, whereas a standard insert would.
I can't see anywhere where it says it will do a referential integrity check after the operation. I suspect you have to do that yourself.
erm what index?
Edited to insert.
Index as in the 3rd row to insert, I believe not necessarily anything to do with the table unless the index in the inserts happens to be the key of the table.
Check to see if it is maintaining referential integrity? Put a "bad" record in e.g. an order with a customerid that doesn't exist.
Free space.
Lets say you have a table of nchar(2) with an int primary key
e.g.
1 AA
2 AB
3 AC
So in your index on the key
1 points to 0
2 points to 4 (unicode one char = two bytes)
3 points to 8
Now you delete record with key 2, now you have
1 points to 0
3 points to 8
If you do a normal insert which reuses free space you get
1 points to 0
3 points to 8
4 points to 4
This direct insert stuff however saves time by not reusing the space so you get
1 points to 0
3 points to 8
4 points to 12
Very simplified scenario for illustrative purposes by the way...

Related

Oracle UPDATE /*+append*/ ROW MOVEMENT

I have a table partitioned according to valid_to_date in Oracle 18. Row movement enabled.
I UPDATE the valid_to_date column so that it moves from one partition to another one.
I suppose, Oracle does internally something like DELETE in one partition and INSERT into another partition.
Does it something like normal INSERT or INSERT /* +append */
Because I know /* + append */ is kind of more efficient...
Row movement does not use append hint, and no, unless you're updating a lot of rows, it is NOT more efficient than a simple insert. One simple reason It can't be used is that after such an insert, you cannot use the table unless you commit/rollback. So, if Oracle does APPEND under the hood, it will probably break your code.
It will necessarily consume processing resources on your machine while running (it will read the table, it will delete/insert the rows at the bottom of the table to move them up, it will generate redo, it will generate undo).
see Asktom post
The "insert the rows at the bottom of the table" clause most probably means using the append hint

Sybase ASE remote row insert locking

Im working on an application which access a Sybase ASE 15.0.2 ,where the current code access a remote database
(CIS) to insert a row using a proxy table definition (the destination table is a DOL - DRL table - The PK
row is defined as identity ,and is always growing). The current code performs a select to check if the row
already exists to avoid duplicate data to be inserted.
Since the remote table also have a PK definition on the table, i do understand that the PK verification will
be done again prior to commiting the row.
Im planning to remove the select check since its being effectively done again by the PK verification,
but im concerned about if when receiving a file with many duplicates, the table may suffer
some unecessary contention when the data is tried to be commited.
Its not clear to me if Sybase ASE tries to hold the last row and writes the data prior to check for the
duplicate. Also, if the table is very big, im concerned also about the time it will spend looking the
whole index to find duplicates.
I've found some documentation for SQL anywhere, but not ASE in the following link
http://dcx.sybase.com/1200/en/dbusage/insert-how-transact.html
The best i could find is the following explanation
https://groups.google.com/forum/?fromgroups#!topic/comp.databases.sybase/tHnOqptD7X8
But it doesn't enlighten in details how the row is locked (and if there is any kind of
optimization to write it ahead or at the same time of PK checking)
, and also if it will waste a full PK look if im positively inserting a row which the PK
positively greater than the last row commited
Thanks
Alex
Unlike SqlAnywhere there is no option for ASE to set wait_for_commit. The primary key constraint is checked during the insert and not at the commit time. The problem as I understand from your post I see is if you have a mass insert from a file that may contain duplicates is to load into a temp table , check for duplicates, remove the duplicates and then insert the unique ones. Mass insert are lot faster though it still checks for primary key violations. However there is no cost associated as there is no rolling back. The insert statement is always all or nothing. Even if one row is duplicate the entire insert statement will fail. Check before insert in more of error free approach as opposed to use of constraint to the verification because it is going to fail and rollback is going to again be costly.
Thanks Mike
The link does have a very quick explanation about the insert from the CIS perspective. Its a variable to keep an eye on given that CIS may become a representative time consumer
if its performing data and syntax checking if it will be done again when CIS forwards the insert statement to the target server. I was afraid that CIS could have some influence beyond the network traffic/time over the locking/PK checking
Raju
I do agree that avoiding the PK duplication by checking if the row already exists by running a select and doing in a batch, but im currently looking for a stop gap solution, and that may be to perform the insert command in batches of about 50 rows and leave the
duplicate key check for the PK.
Hopefully the PK check will be done over a join of the 50 newly inserted rows, and thus
avoid to traverse the index for each single row...
Ill try to test this and comment back
Alex

Inserting data in a column avoiding duplicates

Lets say i have a query which is fetching col1 after joining multiple tables. I want to insert values of that col1 in a table which is on remote db i.e. i would be using dblink to do that.
Now that col1 would be fetched from 4-5 different db's. There is chances that a value1 fetch from db1 would b in db2 as well. How can i avoid duplicates ?
In my remote db, I have created col1 a primary key. so when inserting, an error would be thrown if there is a duplicate key, end result failing rest of the process. Which i don't want to. I was thiking about 2 approaches
Write a PLSQL script, For each value, determine if value already exists or not. If it doesn't then insert.
Write a PLSQL script and insert and catch the duplicate key exception. The exception would be ignore and it will keep inserting (it doesn't sound that good).
Which approach would you prefer? Is there anything else i can do ?
I would use the MERGE statement and WHEN NOT MATCHED THEN INSERT.
The same merger could also update but it doesn't have to, just leave the update part out.
The different databases can have duplicate primary keys but that doesn't mean the records are duplicates. The actual data may be different in each case. Or the records may represent the same real world thing but at different statuses, Don't know, you haven't provided enough explanation.
The point is, you need much more analysis of why duplicate records can exist and probably a more sophisticated approach to handling collisions. Do you need to take all records (in which case you need a synthetic key)? Or do you take only one instance (so how do you decide precedence)? Other scenarios may exist.
In any case, MERGE or PL/SQL loops are likely to be too crude a solution.
First off, I would suggest that your target database drive all of these inserts because inserting/updating across a database link can create some locking issues and further complicate things especially with multiple databases attempting to access and perform DML on the same table. However if that isn't possible the solutions below will work.
I would fix your primary key problem by including a table look-up on the target table for each row.
INSERT INTO customer#dblink.oracle.com cust
(emp_name,
emp_id)
VALUES
(SELECT
cust.employee_name,
cust.employee_id --primary_key
FROM
source_table st
WHERE NOT EXISTS
(SELECT 1
FROM customer#dblink.oracle.com cust
WHERE cust.employee_id = st.emp_id));
Again, I would not recommend DML transactions across database links unless absolutely necessary as you can sometimes have weird locking behavior.
A PL/SQL procedure or anonymous PL/SQL block could be used to create a bulk processing solution as follows:
CREATE OR REPLACE PROCEDURE send_unique_data
AS
TYPE tab_cust IS TABLE OF customer#dblink.oracle.com%ROWTYPE
INDEX BY PLS_INTEGER;
t_records tab_cust;
BEGIN
SELECT
cust.employee_name,
cust.employee_id --primary_key
BULK COLLECT
INTO t_records
FROM source_table;
FORALL i IN t_records.FIRST...t_records.LAST SAVE EXCEPTIONS
INSERT INTO customer#dblink.oracle.com
VALUES t_records(i);
END send_unique_data;
You can also call the system SQL%BULKEXCEPTIONS collection in case you want to do anything with the records that threw exceptions (such as unique_constraint violations). Be warned that this solution will cause the target table to suffers from performance issues if there are lots of duplicate data attempting to be inserted.

create index before adding columns vs. create index after adding columns - does it matter?

In Oracle 10g, does it matter what order create index and alter table comes in?
Say i have a query Q with a where clause on column C in table T. Now i perform one of the following scenarios:
I create index I(C) and then add columns X,Y,Z.
Add columns X,Y,Z then create index I(C).
Q is 'select * from T where C = whatever'
Between 1 and 2 will there be a significant difference in performance of Q on table T when T contains a very large number of rows?
I personally make it a practice to do #2 but others seem to have a different opinion.
thanks
It makes no difference if you add columns to a table before or after creating an index. The optimizer should pick the same plan for the query and the execution time should be unchanged.
Depending on the physical storage parameters of the table, it is possible that adding the additional columns and populating them with data may force quite a bit of row migration to take place. That row migration will generate changes to the indexes on the table. If the index exists when you are populating the three new columns with data, it is possible that populating the data in X, Y, and Z will take a bit longer because of the additional index maintenance.
If you add columns without populating them, then it is pretty quick as it is just a metadata change. Adding an index does require the table to be read (or potentially another index) so that can be very time consuming and of much greater impact than the simple metadata change of recording the new index details.
If the new columns are going to be populated as part of the ALTER TABLE, it is a different matter.
The database may undergo an unplanned shutdown during the course of adding that data to every row of the table data
The server memory may not have room to record every row changed in that table
Therefore those row changes may be written to datafiles before commit, and are therefore written as dirty blocks
The next read of those blocks, after the ALTER table has successfully completed will do a delayed block cleanout (ie record the fact that the change has been committed)
If you add the columns (with data) first, then the create index will (probably) read the table and do the added work of the delayed block cleanout.
If you create the index first then add the columns, the create index may be faster but the delayed block cleanout won't happen and that housekeeping will be picked up by the application later (potentially by the select * from T where C = whatever)

Oracle ROWID as function/procedure parameter

I just would like to hear different opinions about ROWID type usage as input parameter of any function or procedure.
I have normally used and seen primary keys used as input parameters but is there some kind of disadvantages to use ROWID as input parameter? I think it's kind a simple and selects are pretty quick if used in WHERE clause.
For example:
FUNCTION get_row(p_rowid IN ROWID) RETURN TABLE%ROWTYPE IS...
From the concept guide:
Physical rowids provide the fastest possible access to a row of a given table. They contain the physical address of a row (down to the specific block) and allow you to retrieve the row in a single block access. Oracle guarantees that as long as the row exists, its rowid does not change.
The main drawback of a ROWID is that while it is normally stable, it can change under some circumstances:
The table is rebuilt (ALTER TABLE MOVE...)
Export / Import obviously
Partition table with row movement enable
A primary key identifies a row logically, you will always find the correct row, even after a delete+insert. A ROWID identifies the row physically and is not as persistent as a primary key.
You can safely use ROWID in a single SQL statement since Oracle will guarantee the result is coherent, for example to remove duplicates in a table. To be on the safe side, I would suggest you only use the ROWID accross statements when you have a lock on the row (SELECT ... FOR UPDATE).
From a performance point of view, the Primary key access is a bit more expensive but you will normally notice this only if you do a lot of single row access. If performance is critical though, you usually can get greater benefit in that case from using set processing than single row processing with rowid. In particular, if there are a lot of roundtrips between the DB and the application, the cost of the row access will probably be negligible compared to the roundtrips cost.

Resources