DB project - improving performance with relationships - performance

I have two tables, let's call them TableA and TableB. One record in TableA is related to one or more in TableB. But there's also one special record within them in TableB for each record from TableA (for example with lowest ID), and I want to have quick access to that special one. Data from both tables aren't deleted - it's a kind of history rarely cleared. How do that the best in terms of performance?
I thought of:
1) two-way relationship, but it will affect insert performance
2) design next table, with primary key as FK_TableA (for TableA record exactly one is "special") and second column FK_TableB and then create view
3) design next table, with primary key as FK_TableA, FK_TableB, make FK_TableA unique and then create view
I'm open for all other ideas :)

4) I'd consider an indexed view to hide the JOIN and row restriction
This is similar to your options 2 and 3 but the DB engine will maintain it for you. With a new table you'll either compromise data integrity or have to manage the data via triggers

Related

Fast data migration on the same database

I'm trying to find a way to perform a migration from two tables on the same database. This migration should be as fast as possible in order to minimize the downtime.
To put it on an example lets say I have a person table like so:
person_table -> (id, name, address)
So a person as an Id, a name and an address. My system will contain millions of person registries and it was decided that the person table should be partitioned. To do so, I've created a new table:
partitioned_person_table->(id,name,address,partition_time)
Now this table will contain an extra column called partition_time. This is the partition key for this table since this is a range partition (one partition every hour).
Finally, I need to find a way to move all the information from the person_table to the partitioned_person_table with the best performance.
The first thing I could try is to simply create a statement like:
INSERT INTO partitioned_person_table (id, name, address, partition_time)
SELECT id, name, address, CURRENT_TIMESTAMP FROM person_table;
The problem is that when it comes to millions of registries this might become very slow (also the temporary tablespace might not be able to handle all this information)
My second approach was to use the EXCHANGE PARTITION method. Unfortunetly, I cannot do this because the tables contain diffrent column numbers.
Is there any other way that I can perfom this with the best performance (less downtime) ?
Thank you.
If you can live with the state, that all the current records would be located in one partition (and your INSERT approach suggest that), you may only
1) add a new column partition_time either as NULL or possible with metadata default only - required 12c
2) switch the table to a partitioned table either with online redefinition (if you have no maintainace window, where the table is offline) or with exchange partition otherwise.

Oracle 12c - refreshing the data in my tables based on the data from warehouse tables

I need to update the some tables in my application from some other warehouse tables which would be updating weekly or biweekly. I should update my tables based on those. And these are having foreign keys in another tables. So I cannot just truncate the table and reinsert the whole data every time. So I have to take the delta and update accordingly based on few primary key columns which doesn't change. Need some inputs on how to implement this approach.
My approach:
Check the last updated time of those tables, views.
If it is most recent then compare each row based on the primary key in my table and warehouse table.
update each column if it is different.
Do nothing if there is no change in columns.
insert if there is a new record.
My Question:
How do I implement this? Writing a PL/SQL code is it a good and efficient way? as the expected number of records are around 800K.
Please provide any sample code or links.
I would go for Pl/Sql and bulk collect forall method. You can use minus in your cursor in order to reduce data size and calculating difference.
You can check this site for more information about bulk collect, forall and engines: http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52plsql-1709862.html
There are many parts to your question above and I will answer as best I can:
While it is possible to disable referencing foreign keys, truncate the table, repopulate the table with the updated data then reenable the foreign keys, given your requirements described above I don't believe truncating the table each time to be optimal
Yes, in principle PL/SQL is a good way to achieve what you are wanting to
achieve as this is too complex to deal with in native SQL and PL/SQL is an efficient alternative
Conceptually, the approach I would take is something like as follows:
Initial set up:
create a sequence called activity_seq
Add an "activity_id" column of type number to your source tables with a unique constraint
Add a trigger to the source table/s setting activity_id = activity_seq.nextval for each insert / update of a table row
create some kind of master table to hold the "last processed activity id" value
Then bi/weekly:
retrieve the value of "last processed activity id" from the master
table
select all rows in the source table/s having activity_id value > "last processed activity id" value
iterate through the selected source rows and update the target if a match is found based on whatever your match criterion is, or if
no match is found then insert a new row into the target (I assume
there is no delete as you do not mention it)
on completion, update the master table "last processed activity id" to the greatest value of activity_id for the source rows
processed in step 3 above.
(please note that, depending on your environment and the number of rows processed, the above process may need to be split and repeated over a number of transactions)
I hope this proves helpful

Update Index Organized Tables using multiple UPDATE queries (temporary duplicates)

I need to update the primary key of a large Index Organized Table (20 million rows) on Oracle 11g.
Is it possible to do this using multiple UPDATE queries? i.e. Many smaller UPDATEs of say 100,000 rows at a time. The problem is that one of these UPDATE batches could temporarily produce a duplicate primary key value (there would be no duplicates after all the UPDATEs have completed.)
So, I guess I'm asking is it somehow possible to temporarily disable the primary key constraint (but which is required for an IOT!) or alter the table temporarily some other way. I can have exclusive and offline access to this table.
The only solution I can see is to create a new table and when complete, drop the original table and rename the new table to the original table name.
Am I missing another possibility?
You can't disable / drop the primary key constraint from an IOT, since it is a unique index by definition.
When I need to change an IOT like this, I either do a CTAS (create table as) for a new plain heap table, do my maintenance, and then CTAS a new IOT.
Something like:
create table t_temp as select * from t_iot;
-- do maintenance
create table t_new_iot as select * from t_temp;
If, however, you need to simply add or join a new field to the existing key, you can do this in one step by creating the new IOT structure, then populating directly from the old IOT with a query.
Unfortunately, this is one of the downsides to IOTs.
I would recommend following method:
Create new IOT table partitioned by system with single partition
with exactly same structure as current one.
Lock current IOT table to prevent any DML.
insert into new table as select from current table changing PK values in select. This step
could be repeated several times if needed. In this case it's better
to do it in another session to keep lock on original table.
Exchange partition of new table with original table.

How to improve deletion times in Oracle for a self-referencing table

In our Oracle 11g database we have a table, that has a primary key I_Node (int) and also a column called I_Parent_Node (int) that references back to another record in the same table. The root node has I_Parent_Node = null. In this way we form a tree structure of nodes, leaves, branches, whatever you want to call them.
Frequently we need to delete an entire branch of nodes at once, meaning a node and all of its children. At times this is many, many records, say 50,000 or more. Since a cascade delete is not allowed on a self-referencing table, we are forced to delete one by one starting with the leaves and working our way back up the tree. We have experienced hours-long delete times.
We are considering doing a "marking for deletion" technique, where a separate program would clean out the nodes marked for deletion during off-peak hours, but I am interested in whether a database design change or some other Oracle construct could help out here. I am not trained in Oracle aside from what I've learned on the job, and the people who created the database did not have such large quantities in mind. I am open to database design changes since it is not yet a fixed design.
You may want to consider separating the hierarchy structure from the main table. So you main table would just have primary ids (lets call it "ID"), and your hierarchy table would have "ID, ParentID, TreeID". ParentID is that ID's parent node, and TreeID is the highest parent in the tree (level 1).
So, a level 1 node would look like:
ID, ParentID, TreeID
1, [null], 1
A level 2 node would look like:
ID, ParentID, TreeID
2, 1, 1
A level 3 node would look like:
ID, ParentID, TreeID
3, 2, 1
And so on.
You would use Oracle hierarchy queries (Connect by queries) to query or traverse the trees. This table will be very thin (not many columns, these 3 + some modified dates maybe), so updating these relationships should be much faster and scale better than messing with the main table.
You should be able to do this with deferrable constraints and a hierarchical query.
If your foreign key constraint (on I_Parent_Node) is not already deferrable, drop it and recreate it with the keyword "DEFERRABLE".
Here's an example using the EMPLOYEES table from Oracle's examples (I modified the DEPARTMENTS table too so that this would execute, that's really not needed for an example though):
Drop & Recreate your foreign key if it's not currently deferrable:
alter table employees drop constraint emp_manager_fk;
alter table employees add constraint emp_manager_fk foreign key (manager_id) references employees(employee_id) deferrable;
In your transaction, defer your contraints, and delete using a hierarchical query:
set constraints all deferred;
delete
from employees e
where employee_id in (select employee_id
from employees
start with employee_id = 108
connect by prior employee_id = manager_id);
The "108" is the ID of my "parent" record.
I assume you've already done standard tuning - i.e. are the node and parent node ID columns suitable indexed?
(1) One approach to the problem is to use PL/SQL. Bulk collect the IDs to be deleted, using a hierarchical query that returns the leaf rows first, into an array; then do a bulk delete (FORALL) using the array.
(2) Another approach is a soft-delete - mark the rows as "deleted", but never actually delete them. You would need to modify your application (or use Oracle VPD to automatically omit the "deleted" rows from queries). This might work reasonably well if deleting a node is relatively rare; but if you're routinely deleting lots of nodes then this would clutter the table with a lot of old data.

oracle - moving data from to identical database

I have two databases with identical table layouts. There are a dozen or so tables of interest. They are a number of FK between them.
I have been asked to write a stored procedure to copy data from database A to database B based on the PK of the parent table at the top of the hierarchy. I may receive just one value, or a list of values. I'm supposed to select all records from database A that match the value(s) and insert/update them into database B. This includes all the records in the child tables too.
My questions is whats the best(most efficent/ best practice) way to do this?
Should I write a dozen select from... insert into... statements?
Should I join the tables together an try to insert into all the tables at the same time?
Thanks!
Additional info:
The record should be inserted if it is not already there. (based on the PK of the respective table). Otherwise it should be updated.
Obviously I need to traverse down to all child tables, so There would only be one record to copy in the parent table, but the child table might have 10, and the child's child table might have 500. I would of course need to update the record if it already existed, insert if it does not for the child tables too...
UPDATE:
I think it would make the solution simpler if I just deleted all records related to the top level key, and then insert all the new records rather than trying to do updates.
So I guess the questions is it best to just do a dozen:
delete from ... where ... in ...
select from ... where ... in ...
insert into...
or is it better to do some kinda of fancy joins to do all the inserts in one sql statement?
I would do this by disabling all the foreign key constraints, then doing a set of MERGE statements to deal with the updates and inserts, then enable all the constraints.
Think about logging. How much redo do you want to generate?
You might find that it's quicker and better to truncate all the target tables and then do inserts of everything with nolog. Could be simpler than the merges.
One major main alternative would be to drop all the target tables and use export and import. Might be a lot faster.
A second alternative would be to use materialized views, particularly if you don't need to do updates on the target tables. That way, Oracle does all the heavy lifting for you. You can force integrity by choosing refresh groups carefully.
There are several ways to deal with this business problem. A PL/SQL program may not be the best.

Resources