Primary Key Constraint Error when using MERGE in Oracle 10g

Requirement: I have one huge table (say, having lakhs of records) that contains duplicate entries when the values of three columns are combined. I need to populate a second, different table with unique records (removing the duplicates from the first table).
Since this requires bulk inserts into the second table, I came across the MERGE feature of Oracle 10g, which is a more optimized way to do bulk inserts. But when I tried it, I got an integrity constraint error for the composite primary key (the three columns mentioned above).
MERGE INTO 2ndTable e
USING firstTable h
ON (e.firstCol = h.firstCol and e.2ndCol = h.2ndCol and e.3rdCol = h.3rdCol)
WHEN NOT MATCHED THEN
INSERT VALUES (h.firstCol, h.2ndCol, h.3rdCol);
Composite primary key for the 2nd table: e.firstCol, e.2ndCol, e.3rdCol
Please let me know your thoughts on this error, or the best way to handle these bulk inserts while removing duplicate records.
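A likely cause, assuming the duplicates live in the source table: MERGE decides "matched / not matched" against the target as it was when the statement started, so two source rows with the same key are both treated as not matched and both get inserted, which violates the primary key. A minimal sketch of one way around this is to deduplicate the source inside the USING clause. The table and column names below (first_table, second_table, first_col, second_col, third_col) are placeholders, not the real ones:

```sql
-- Placeholder names; the target is assumed to have exactly these three columns.
MERGE INTO second_table e
USING (SELECT DISTINCT first_col, second_col, third_col
       FROM   first_table) h          -- each key combination appears once here
ON (    e.first_col  = h.first_col
    AND e.second_col = h.second_col
    AND e.third_col  = h.third_col)
WHEN NOT MATCHED THEN
  INSERT (first_col, second_col, third_col)
  VALUES (h.first_col, h.second_col, h.third_col);
```

If the second table is known to be empty before the load, a plain INSERT ... SELECT DISTINCT over the same three columns should give the same result without the MERGE.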

Related

Oracle 12c - refreshing the data in my tables based on the data from warehouse tables

I need to update some tables in my application from warehouse tables that are refreshed weekly or biweekly. My tables are referenced by foreign keys in other tables, so I cannot just truncate them and reinsert all the data every time. Instead I have to take the delta and update accordingly, based on a few primary key columns which don't change. I need some input on how to implement this approach.
My approach:
Check the last updated time of those tables and views.
If it is more recent, compare each row by primary key between my table and the warehouse table.
Update each column if it is different.
Do nothing if there is no change in the columns.
Insert if there is a new record.
My Question:
How do I implement this? Is writing PL/SQL code a good and efficient way, given that the expected number of records is around 800K?
Please provide any sample code or links.
I would go for PL/SQL with the BULK COLLECT and FORALL method. You can use MINUS in your cursor to reduce the data size and calculate the difference.
You can check this site for more information about bulk collect, forall and engines: http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52plsql-1709862.html
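As a rough sketch of what that could look like (not a tested implementation): the cursor uses MINUS to pick up only new or changed warehouse rows, BULK COLLECT fetches them in batches, and FORALL applies each batch with a MERGE. The tables app_customer and wh_customer and their columns (id, cust_name, status) are made up for illustration, with id assumed to be the unchanging primary key:

```sql
-- Hypothetical tables: wh_customer (warehouse source), app_customer (target).
DECLARE
  CURSOR c_delta IS
    SELECT id, cust_name, status FROM wh_customer
    MINUS
    SELECT id, cust_name, status FROM app_customer;  -- leaves new/changed rows only

  TYPE t_ids      IS TABLE OF wh_customer.id%TYPE;
  TYPE t_names    IS TABLE OF wh_customer.cust_name%TYPE;
  TYPE t_statuses IS TABLE OF wh_customer.status%TYPE;

  l_ids      t_ids;
  l_names    t_names;
  l_statuses t_statuses;
BEGIN
  OPEN c_delta;
  LOOP
    -- Fetch in batches to keep memory use bounded.
    FETCH c_delta BULK COLLECT INTO l_ids, l_names, l_statuses LIMIT 1000;
    EXIT WHEN l_ids.COUNT = 0;

    -- One round trip to the SQL engine per batch.
    FORALL i IN 1 .. l_ids.COUNT
      MERGE INTO app_customer t
      USING (SELECT l_ids(i) AS id, l_names(i) AS cust_name, l_statuses(i) AS status
             FROM dual) s
      ON (t.id = s.id)
      WHEN MATCHED THEN
        UPDATE SET t.cust_name = s.cust_name, t.status = s.status
      WHEN NOT MATCHED THEN
        INSERT (id, cust_name, status) VALUES (s.id, s.cust_name, s.status);
  END LOOP;
  CLOSE c_delta;
  COMMIT;
END;
/
```

The batch size of 1000 is an arbitrary starting point. Rows deleted in the warehouse are not handled by this sketch.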
There are many parts to your question above and I will answer as best I can:
While it is possible to disable the referencing foreign keys, truncate the table, repopulate it with the updated data and then re-enable the foreign keys, given the requirements described above I don't believe truncating the table each time would be optimal.
Yes, in principle PL/SQL is a good way to achieve what you want, as this is too complex to deal with in native SQL and PL/SQL is an efficient alternative.
Conceptually, the approach I would take is something like the following:
Initial set up:
Create a sequence called activity_seq.
Add an "activity_id" column of type number to your source tables, with a unique constraint.
Add a trigger to the source table/s setting activity_id = activity_seq.nextval for each insert / update of a table row.
Create some kind of master table to hold the "last processed activity id" value.
Then bi/weekly:
Retrieve the value of "last processed activity id" from the master table.
Select all rows in the source table/s having an activity_id value > the "last processed activity id" value.
Iterate through the selected source rows and update the target if a match is found based on whatever your match criterion is, or, if no match is found, insert a new row into the target (I assume there is no delete as you do not mention it).
On completion, update the master table "last processed activity id" to the greatest value of activity_id for the source rows processed in step 3 above.
(please note that, depending on your environment and the number of rows processed, the above process may need to be split and repeated over a number of transactions)
I hope this proves helpful
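The set-up steps above could be sketched roughly as follows. Only activity_seq and activity_id come from the description; the source table wh_customer, the master table refresh_control and its columns are invented names for illustration, and the direct sequence assignment in the trigger assumes Oracle 11g or later:

```sql
-- Hypothetical names: wh_customer (source table), refresh_control (master table).
CREATE SEQUENCE activity_seq;

ALTER TABLE wh_customer ADD (activity_id NUMBER);
ALTER TABLE wh_customer
  ADD CONSTRAINT wh_customer_activity_uq UNIQUE (activity_id);

-- Stamp every inserted or updated row with the next sequence value.
CREATE OR REPLACE TRIGGER wh_customer_activity_trg
  BEFORE INSERT OR UPDATE ON wh_customer
  FOR EACH ROW
BEGIN
  :NEW.activity_id := activity_seq.NEXTVAL;
END;
/

-- Master table holding the "last processed activity id" per source.
CREATE TABLE refresh_control (
  source_name      VARCHAR2(30) PRIMARY KEY,
  last_activity_id NUMBER DEFAULT 0 NOT NULL
);

INSERT INTO refresh_control (source_name, last_activity_id)
VALUES ('WH_CUSTOMER', 0);

-- Periodic run: pick up only the rows touched since the last run.
SELECT s.*
FROM   wh_customer s
WHERE  s.activity_id > (SELECT last_activity_id
                        FROM   refresh_control
                        WHERE  source_name = 'WH_CUSTOMER');
```

Rows that already exist when the column is added keep a NULL activity_id until they are next updated, so the very first load would need to be handled separately.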

Oracle - Delete One Row in Dimension Table is Slow

I have a datamart with 5 dimension tables and a fact table.
I'm trying to clean a dimension table that has few rows (4,000), but the fact table has millions of rows (25 GB, with indexes and partitions).
When I try to delete a row from the dimension table, the process becomes very slow. It is just as slow even when the row has no related rows in the fact table (cascade delete).
Is there any way to optimize this? Thanks in advance.
Presumably, there is a cascading delete of some sort between the dimension table and the fact table.
Adding an index on the key column in the fact table may be sufficient. Then Oracle can immediately tell if/where any given value is.
If that doesn't work, drop the foreign key constraint altogether. Delete the unused values and add the constraint back in.
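A rough sketch of both suggestions, under assumed names: fact_table.dim5_fk_col is taken to reference dim5_table.k_col through a constraint called fact_dim5_fk, and none of these names come from the actual schema:

```sql
-- 1. Index the foreign key column so the child lookup for each deleted
--    dimension row does not have to scan the fact table.
CREATE INDEX fact_dim5_fk_idx ON fact_table (dim5_fk_col);

-- 2. If that is still too slow: drop the constraint, delete the unused
--    dimension rows, then put the constraint back.
ALTER TABLE fact_table DROP CONSTRAINT fact_dim5_fk;

DELETE FROM dim5_table d
WHERE  NOT EXISTS (SELECT 1
                   FROM   fact_table f
                   WHERE  f.dim5_fk_col = d.k_col);

-- Re-adding validates the existing fact rows, which takes time on a big table.
-- Add ON DELETE CASCADE here if the original constraint had it.
ALTER TABLE fact_table
  ADD CONSTRAINT fact_dim5_fk
  FOREIGN KEY (dim5_fk_col) REFERENCES dim5_table (k_col);
```

On a partitioned fact table the index could also be created as a local index; which is better depends on how the table is queried.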
You could try these strategies as well :
Create another copy of the fact table, but without the foreign key column of the dimension table to be cleaned:
create table fact_table_new as
select dim1_k, dim2_k, dim3_k, dim4_k, /* dim5_k left out */ fact_1, fact_2, ...
from fact_table ;
or
update fact_table
set dim5_fk_col = null
where dim5_fk_col in (select k_col from dim5_table) ;

Update Index Organized Tables using multiple UPDATE queries (temporary duplicates)

I need to update the primary key of a large Index Organized Table (20 million rows) on Oracle 11g.
Is it possible to do this using multiple UPDATE queries? i.e. Many smaller UPDATEs of say 100,000 rows at a time. The problem is that one of these UPDATE batches could temporarily produce a duplicate primary key value (there would be no duplicates after all the UPDATEs have completed.)
So, I guess I'm asking is it somehow possible to temporarily disable the primary key constraint (but which is required for an IOT!) or alter the table temporarily some other way. I can have exclusive and offline access to this table.
The only solution I can see is to create a new table and when complete, drop the original table and rename the new table to the original table name.
Am I missing another possibility?
You can't disable / drop the primary key constraint from an IOT, since it is a unique index by definition.
When I need to change an IOT like this, I do a CTAS (create table as select) into a new plain heap table, do my maintenance, and then CTAS a new IOT.
Something like:
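create table t_temp as select * from t_iot;
-- do maintenance on the heap copy (e.g. change the key values)
-- recreate as an IOT; key_col and other_col are placeholder column names
create table t_new_iot (key_col, other_col, constraint t_new_iot_pk primary key (key_col))
  organization index as select key_col, other_col from t_temp;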
If, however, you need to simply add or join a new field to the existing key, you can do this in one step by creating the new IOT structure, then populating directly from the old IOT with a query.
Unfortunately, this is one of the downsides to IOTs.
I would recommend the following method:
Create a new IOT, partitioned by system with a single partition, with exactly the same structure as the current one.
Lock the current IOT to prevent any DML.
Insert into the new table as select from the current table, changing the PK values in the select. This step could be repeated several times if needed; in that case it's better to do it in another session to keep the lock on the original table.
Exchange the partition of the new table with the original table.

Fastest way to SELECT a row from a table in a database (Microsoft SQL server)

I have a huge table with one int PRIMARY KEY IDENTITY column.
I guess making the SELECT query using that primary key is the fastest way for the database to find the row in the table, isn't it?
If that is true, I still have a question.
Is that query as fast as a dictionary lookup by key, or does the database still have to read all the rows from the beginning (of the primary key column) until it finds the row?
Thanks in advance ^^
Using primary key is obviously the fastest way to access a particular row.
If you want to understand how it works, you have to understand how index works.
In general it works like this:
Let's say you have a table t1(col1,col2...col10) and you have an index on col1.
Index on col1 means that you have some data structure which contains pairs (col1, rec_id)
and rec_id allows direct access to row with appropriate col1.
The data structure is ordered by col1 and therefore allows efficient searching by col1.
I think a dictionary lookup follows the dictionary's own search algorithm, which should be something like a binary search.
When you declare a column as the primary key of a table, that column is indexed (in SQL Server, by a clustered B-tree index by default), so the lookup walks the index and is definitely NOT a row-by-row search as you described.
Finally, yes, it is the common and fast way, but you should be selective about the number of columns and rows you need in your SQL query. Avoid fetching a large number of rows per select call.

Unique Indexes with Oracle partitioned tables

I have a table Customer_Chronics in Oracle 11g.
The table has three key columns as shown below :
branch_code
customer_id
period
I have partitioned the table by list on branch_code, and now I have a dilemma. Which is better:
Create unique index indexNumberOne on Customer_Chronics (PERIOD, CUSTOMER_ID);
Create unique index indexNumberTwo on Customer_Chronics (branch_code, PERIOD, CUSTOMER_ID);
The actual data must be unique by period, customer_id. If I put a unique index only on these two columns, will Oracle check all partitions of the table when inserting new records?
The only way to enforce uniqueness is with a unique constraint on the columns of interest. So that's your first option. The database will check all values across all partitions in this case. But as it's a unique index, that shouldn't take too long no matter how big the table gets (if that's your concern).
Yes, if you put a unique index on those two columns only, Oracle will create a global index and will check all partitions. This is one of the challenges I face sometimes, because we prefer local indexes for big tables (small tables should be OK).
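To make the difference between the two options concrete, here is a small sketch using the index names from the question. The LOCAL keyword on the second index is an assumption about how it would be built; the question does not say:

```sql
-- Option 1: enforces the real rule, unique (period, customer_id) across the
-- whole table. Without the LOCAL keyword this is a single global index.
CREATE UNIQUE INDEX indexNumberOne
  ON Customer_Chronics (period, customer_id);

-- Option 2: can be a LOCAL index because it includes the partition key
-- (branch_code), but it only guarantees that the full triple is unique.
-- The same (period, customer_id) could still appear under two branches.
CREATE UNIQUE INDEX indexNumberTwo
  ON Customer_Chronics (branch_code, period, customer_id) LOCAL;
```

So if the data really must be unique by period and customer_id alone, only the first index enforces that. The second is easier to maintain per partition but enforces a weaker rule.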

Resources