Oracle calculate average using a trigger - oracle

For a school project we are required to store redundant information and keep it up to date using triggers. We have a table called 'recipe_ratings' that contains a 'rating' (a number from 0 to 100). In our 'recipes' table we have a redundant column called 'rating' that contains the average rating for that specific recipe.
We tried to create an Oracle trigger like this:
    CREATE OR REPLACE TRIGGER trigger_rating
    AFTER UPDATE
    ON recipe_ratings
    FOR EACH ROW
    DECLARE
        average_rating NUMBER;
    BEGIN
        SELECT ROUND(AVG(rating))
        INTO   average_rating
        FROM   recipe_ratings
        WHERE  rid = :new.rid;

        UPDATE recipe SET rating = average_rating
        WHERE  rid = :new.rid;
    END;
But this gives us: ORA-04091: table name is mutating, trigger/function may not see it. We are experimenting with 'autonomous transaction' but it feels like we're drifting away from our trigger.
How can we make this trigger work?

I hope the professor is not leading you down the path of autonomous transactions. That would be a hideous misuse of the feature, on top of an invalid data model.
In the real world, in order for this sort of thing to work, you would need:
1. A package with a collection of RID values
2. A before statement trigger that initializes this collection
3. A row-level trigger that inserts the :new.rid values into the collection
4. An after statement trigger that reads through the collection and issues the updates on the RECIPE table
Obviously, that sort of thing gets quite cumbersome quite quickly which is why storing redundant data is so problematic.
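On Oracle 11g and later, the package-plus-three-triggers pattern above can be folded into one compound trigger, which is a swapped-in shortcut rather than what the list literally describes. A minimal sketch, assuming the table is named recipe as in the question's code, that rid is an integer that fits in a PLS_INTEGER, and that inserts and deletes should also refresh the average (the trigger name is made up):

```sql
CREATE OR REPLACE TRIGGER trg_recipe_ratings_avg
FOR INSERT OR UPDATE OR DELETE ON recipe_ratings
COMPOUND TRIGGER

  -- Affected recipe ids, indexed by rid so duplicates collapse.
  -- Compound trigger state is re-created for each triggering statement,
  -- so no explicit "before statement" reset is needed here.
  TYPE rid_set_t IS TABLE OF BOOLEAN INDEX BY PLS_INTEGER;
  g_rids rid_set_t;
  l_rid  PLS_INTEGER;

  AFTER EACH ROW IS
  BEGIN
    -- Only remember which recipes were touched; do not query the table here.
    IF :new.rid IS NOT NULL THEN g_rids(:new.rid) := TRUE; END IF;
    IF :old.rid IS NOT NULL THEN g_rids(:old.rid) := TRUE; END IF;
  END AFTER EACH ROW;

  AFTER STATEMENT IS
  BEGIN
    -- The statement has finished, so recipe_ratings is no longer mutating.
    l_rid := g_rids.FIRST;
    WHILE l_rid IS NOT NULL LOOP
      UPDATE recipe r
         SET r.rating = (SELECT ROUND(AVG(rr.rating))
                           FROM recipe_ratings rr
                          WHERE rr.rid = r.rid)
       WHERE r.rid = l_rid;
      l_rid := g_rids.NEXT(l_rid);
    END LOOP;
  END AFTER STATEMENT;

END trg_recipe_ratings_avg;
/
```

On versions before 11g the same logic would have to be split across a package and three separate triggers, as listed above.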
If you only had to handle inserts and you could guarantee that all inserts would be single-row inserts using INSERT ... VALUES, you could query the RECIPE_RATINGS table in your row-level trigger. That doesn't work in the real world, but it may suffice in a classroom.
If you don't mind re-computing the average rating for every recipe every time a single row in RECIPE_RATINGS is updated (something that would be catastrophic in practice but may work on a sufficiently small data set), you could have an after statement trigger that does a correlated update on every row of the RECIPE table.
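A minimal sketch of that last option, assuming the recipe table and rid column from the question (the trigger name is made up). Because there is no FOR EACH ROW clause, the trigger fires once per statement, after recipe_ratings has stopped mutating:

```sql
CREATE OR REPLACE TRIGGER trg_recipe_ratings_stmt
AFTER INSERT OR UPDATE OR DELETE ON recipe_ratings
BEGIN
  -- Recompute the average for every recipe, touched or not.
  UPDATE recipe r
     SET r.rating = (SELECT ROUND(AVG(rr.rating))
                       FROM recipe_ratings rr
                      WHERE rr.rid = r.rid);
END;
/
```

Recipes with no ratings end up with a NULL rating, since AVG over zero rows returns NULL.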

How flexible is your data model?
Rather than storing the average rating on the recipe, can you store the total of all the ratings plus the number of ratings?
An insert trigger on ratings would take the values of the new row and update the parent recipe row, adding the rating to the total and 1 to the count of ratings.
An update trigger would add the difference between the :NEW and :OLD values to the total (and not update the count).
Neither trigger has to query other rows in the ratings table, which prevents the mutating table error and makes it much safer to use in an environment with multiple concurrent users.
The query (or a view or a derived column) would determine the average simply by dividing the total by the count.
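A minimal sketch of this design. The column names rating_total and rating_count, the trigger names and the view name are all hypothetical; it also assumes both columns are NOT NULL with a default of 0 and that rid never changes on update:

```sql
-- Add the new rating to the running total and bump the count.
CREATE OR REPLACE TRIGGER trg_rating_ins
AFTER INSERT ON recipe_ratings
FOR EACH ROW
BEGIN
  UPDATE recipe
     SET rating_total = rating_total + :new.rating,
         rating_count = rating_count + 1
   WHERE rid = :new.rid;
END;
/

-- Apply only the difference; the count stays the same.
CREATE OR REPLACE TRIGGER trg_rating_upd
AFTER UPDATE OF rating ON recipe_ratings
FOR EACH ROW
BEGIN
  UPDATE recipe
     SET rating_total = rating_total + (:new.rating - :old.rating)
   WHERE rid = :new.rid;
END;
/

-- Derive the average on read; NULLIF avoids a divide-by-zero for unrated recipes.
CREATE OR REPLACE VIEW recipe_avg AS
SELECT rid,
       ROUND(rating_total / NULLIF(rating_count, 0)) AS avg_rating
  FROM recipe;
```

A delete trigger would follow the same shape, subtracting :old.rating from the total and 1 from the count.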

This article gives one means of avoiding these errors.
Another thought: would a statement-level trigger, rather than a FOR EACH ROW trigger, be more suitable here? With a row-level trigger, if one statement updates several recipe_ratings rows for the same recipe, you calculate the average once per row while the table is still being changed (hence the mutating table error).

Related

MERGE INTO Performance as table grows

This is a general question about the Oracle MERGE INTO statement with a particular scenario, on Oracle RDBMS 12c.
Daily data will be loaded to StagingTableA - about 10m rows.
This will be MERGEd INTO TableA.
TableA will vary between 0 and 10m rows (matching StagingTableA).
There may be times when TableA will be pruned/emptied and left with 0 rows.
Clearly, when TableA is empty, a straight INSERT will do the job, but the procedure has been written to use a MERGE INTO method to handle all scenarios.
The MERGE .. MATCH is on an indexed column.
My question is about how the MERGE handles the MATCH when TableA starts empty and then grows hugely during the MERGE execution. The MATCH on the indexed column will use a full table scan (FTS), because the stats will show the table has 0 rows.
At some point during the MERGE transaction, this will become inefficient.
Is the MERGE statement clever enough to detect this and change the execution plan, and start using the index instead of the FTS?
If this was done the old way with CURSOR, UPDATE and INSERT then we could potentially introduce an ANALYZE at an appropriate point (say after 50,000 rows processed) on TableA to switch to an optimal plan.
I haven't been able to find any documentation dealing with this specific question.
Hopefully you've got a UNIQUE index on that table, which is based on the incoming data. If I were you, rather than using a simple MERGE I'd:
Mark all indexes on the table as UNUSABLE, except for the unique index.
INSERT all records.
Catch the DUP_VAL_ON_INDEX exception at the time of INSERT and issue the appropriate UPDATE.
DELETE processed rows from the input (staging) table.
Commit every N records (1000? 10000? 100000? Your choice...), calling DBMS_STATS.GATHER_TABLE_STATS for the table you've inserted into after each COMMIT.
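The insert-then-update loop above could be sketched roughly as follows. The column names id and payload and the batch size are assumptions, and the steps that mark indexes UNUSABLE and delete processed staging rows are left out to keep the sketch short:

```sql
DECLARE
  c_batch CONSTANT PLS_INTEGER := 10000;  -- hypothetical batch size
  l_done  PLS_INTEGER := 0;
BEGIN
  FOR s IN (SELECT id, payload FROM stagingtablea) LOOP
    BEGIN
      -- Optimistically insert; the unique index rejects existing keys.
      INSERT INTO tablea (id, payload) VALUES (s.id, s.payload);
    EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
        UPDATE tablea SET payload = s.payload WHERE id = s.id;
    END;

    l_done := l_done + 1;
    IF MOD(l_done, c_batch) = 0 THEN
      COMMIT;
      -- Refresh statistics so later statements see the table's real size.
      DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TABLEA');
    END IF;
  END LOOP;
  COMMIT;
END;
/
```

Committing inside an open cursor loop keeps each transaction small, but on a busy system it can expose the loop to "snapshot too old" errors, so the batch size is worth testing.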
Best of luck.

Oracle 12c - refreshing the data in my tables based on the data from warehouse tables

I need to update some tables in my application from warehouse tables that are refreshed weekly or biweekly. My tables are referenced by foreign keys in other tables, so I cannot just truncate them and reinsert all the data every time. Instead I have to take the delta and update accordingly, matching on a few primary key columns that don't change. I need some input on how to implement this approach.
My approach:
Check the last updated time of those tables, views.
If it is most recent then compare each row based on the primary key in my table and warehouse table.
update each column if it is different.
Do nothing if there is no change in columns.
insert if there is a new record.
My Question:
How do I implement this? Is writing PL/SQL code a good and efficient way, given that the expected number of records is around 800K?
Please provide any sample code or links.
I would go for PL/SQL with the BULK COLLECT and FORALL method. You can use MINUS in your cursor to compute the difference and reduce the amount of data to process.
You can check this site for more information about bulk collect, forall and engines: http://www.oracle.com/technetwork/issue-archive/2012/12-sep/o52plsql-1709862.html
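A minimal sketch of that approach, assuming Oracle 11g or later and two hypothetical tables, wh_customers (warehouse) and app_customers (application), that share the columns id, name and status. MINUS returns only warehouse rows that are new or differ from the application copy:

```sql
DECLARE
  CURSOR c_delta IS
    SELECT id, name, status FROM wh_customers
    MINUS
    SELECT id, name, status FROM app_customers;

  TYPE delta_tab_t IS TABLE OF c_delta%ROWTYPE;
  l_rows delta_tab_t;
BEGIN
  OPEN c_delta;
  LOOP
    -- Fetch in batches to keep memory use bounded.
    FETCH c_delta BULK COLLECT INTO l_rows LIMIT 1000;
    EXIT WHEN l_rows.COUNT = 0;

    -- One round trip to the SQL engine per batch instead of per row.
    FORALL i IN 1 .. l_rows.COUNT
      MERGE INTO app_customers t
      USING (SELECT l_rows(i).id     AS id,
                    l_rows(i).name   AS name,
                    l_rows(i).status AS status
               FROM dual) s
      ON (t.id = s.id)
      WHEN MATCHED THEN
        UPDATE SET t.name = s.name, t.status = s.status
      WHEN NOT MATCHED THEN
        INSERT (id, name, status) VALUES (s.id, s.name, s.status);
  END LOOP;
  CLOSE c_delta;
  COMMIT;
END;
/
```

For a delta this simple, a single MERGE statement using the MINUS query as its source may do the same job without any PL/SQL.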
There are many parts to your question above and I will answer as best I can:
While it is possible to disable the referencing foreign keys, truncate the table, repopulate it with the updated data and then re-enable the foreign keys, given the requirements you describe I don't believe truncating the table each time is optimal.
Yes, in principle PL/SQL is a good way to achieve what you want, as this is too complex to deal with in native SQL and PL/SQL is an efficient alternative.
Conceptually, the approach I would take is something like as follows:
Initial set up:
1. Create a sequence called activity_seq.
2. Add an "activity_id" column of type NUMBER to your source tables, with a unique constraint.
3. Add a trigger to the source table/s setting activity_id = activity_seq.nextval for each insert / update of a table row.
4. Create some kind of master table to hold the "last processed activity id" value.
Then weekly/biweekly:
1. Retrieve the value of "last processed activity id" from the master table.
2. Select all rows in the source table/s having an activity_id value > the "last processed activity id" value.
3. Iterate through the selected source rows and update the target if a match is found based on whatever your match criterion is, or insert a new row into the target if no match is found (I assume there is no delete as you do not mention it).
4. On completion, update the master table "last processed activity id" to the greatest value of activity_id for the source rows processed in step 3 above.
(please note that, depending on your environment and the number of rows processed, the above process may need to be split and repeated over a number of transactions)
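A rough sketch of the set-up and the periodic run. The names source_table, target_table, sync_master, id and name are hypothetical, direct assignment from a sequence assumes 11g or later, and a single MERGE stands in for the row-by-row iteration described above:

```sql
-- One-time set-up
CREATE SEQUENCE activity_seq;

ALTER TABLE source_table
  ADD (activity_id NUMBER CONSTRAINT source_activity_uk UNIQUE);

CREATE OR REPLACE TRIGGER trg_source_activity
BEFORE INSERT OR UPDATE ON source_table
FOR EACH ROW
BEGIN
  -- Stamp every new or changed row with the next activity number.
  :new.activity_id := activity_seq.NEXTVAL;
END;
/

CREATE TABLE sync_master (last_activity_id NUMBER NOT NULL);
INSERT INTO sync_master (last_activity_id) VALUES (0);
COMMIT;

-- Periodic run
DECLARE
  l_last NUMBER;
  l_max  NUMBER;
BEGIN
  SELECT last_activity_id INTO l_last FROM sync_master;

  -- Fix the upper bound first so rows arriving mid-run wait for the next run.
  SELECT NVL(MAX(activity_id), l_last)
    INTO l_max
    FROM source_table
   WHERE activity_id > l_last;

  MERGE INTO target_table t
  USING (SELECT id, name
           FROM source_table
          WHERE activity_id > l_last
            AND activity_id <= l_max) s
  ON (t.id = s.id)
  WHEN MATCHED THEN
    UPDATE SET t.name = s.name
  WHEN NOT MATCHED THEN
    INSERT (id, name) VALUES (s.id, s.name);

  UPDATE sync_master SET last_activity_id = l_max;
  COMMIT;
END;
/
```

Rows that existed before the column was added keep a NULL activity_id and are not picked up, so an initial full load or a one-off backfill would be needed.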
I hope this proves helpful

A BEFORE UPDATE TRIGGER can cause MUTATING TABLE Oracle error?

Please suppose you have, in Oracle Database, a BEFORE UPDATE TRIGGER.
It fires only when a particular column is assigned a certain value (for example, when the update sets the ALPHA column to the string 'SUBSTITUTE'); otherwise it does not fire.
This trigger does many queries and, under certain conditions, updates some records of the triggered table.
Being a BEFORE UPDATE TRIGGER, could it cause MUTATING TABLE error?
You can assume that the body of the trigger does not update the ALPHA column, but could update other columns and/or insert new records in the same table, using :OLD values.
The update of the ALPHA column to the string value 'SUBSTITUTE' provokes the trigger fire.
A mutating table is a table that is currently being modified by an update, delete, or insert statement. If your before-update for-each-row trigger tries to modify the table it is defined against then it will get an ORA-04091: table X is mutating, trigger/function may not see it error. Here's a SQL Fiddle with a trivial example.
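For readers without access to the fiddle, a trivial reproduction might look like the sketch below. The table t42, its columns and the trigger name are made up for illustration:

```sql
CREATE TABLE t42 (
  id    NUMBER PRIMARY KEY,
  alpha VARCHAR2(20),
  beta  VARCHAR2(20)
);

INSERT INTO t42 VALUES (1, 'A', 'x');
INSERT INTO t42 VALUES (2, 'B', 'y');

CREATE OR REPLACE TRIGGER trg_t42_bu
BEFORE UPDATE OF alpha ON t42
FOR EACH ROW
WHEN (new.alpha = 'SUBSTITUTE')
BEGIN
  -- Row-level trigger modifying other rows of its own table.
  UPDATE t42 SET beta = 'touched' WHERE id != :old.id;
END;
/

-- Fires the trigger, which fails with ORA-04091 because t42 is mutating.
UPDATE t42 SET alpha = 'SUBSTITUTE' WHERE id = 1;
```

The trigger compiles without complaint; the error only appears at run time, when the triggering UPDATE executes.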
You'd get the same with an after-update trigger depending on what you're doing; and you can't make it statement-level if you need to act depending on the :new.alpha value.
Both the 'does many queries' part and the update suggest that perhaps a trigger is not the right tool here; this is quite vague though, and what the right tool is depends on what you're doing. A procedure that makes all the necessary changes and is called instead of the simple update might be one solution, for example.

Inserting data in a column avoiding duplicates

Let's say I have a query that fetches col1 after joining multiple tables. I want to insert the values of col1 into a table on a remote database, i.e. I would be using a db link to do that.
Col1 would be fetched from 4-5 different databases, and there is a chance that a value fetched from db1 would be in db2 as well. How can I avoid duplicates?
In my remote database I have made col1 a primary key, so an error is thrown on insert if there is a duplicate key, and the rest of the process fails. I don't want that. I was thinking about 2 approaches:
1. Write a PL/SQL script that, for each value, determines whether the value already exists and inserts it only if it doesn't.
2. Write a PL/SQL script that inserts and catches the duplicate key exception. The exception would be ignored and it would keep inserting (this doesn't sound that good).
Which approach would you prefer? Is there anything else I can do?
I would use the MERGE statement and WHEN NOT MATCHED THEN INSERT.
The same MERGE could also update existing rows, but it doesn't have to: just leave the update part out.
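A minimal sketch of an insert-only MERGE. The names target_table, remote_db and local_source are hypothetical, and how well MERGE behaves against a remote target is worth testing on your own version:

```sql
MERGE INTO target_table@remote_db t
USING (SELECT DISTINCT col1 FROM local_source) s  -- DISTINCT removes duplicates within one source
ON (t.col1 = s.col1)
WHEN NOT MATCHED THEN
  INSERT (col1) VALUES (s.col1);
```

Values that already exist in the target simply fall through, so no duplicate key error is raised and nothing is updated.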
The different databases can have duplicate primary keys, but that doesn't mean the records are duplicates. The actual data may be different in each case. Or the records may represent the same real-world thing but at different statuses. I can't tell, because you haven't provided enough explanation.
The point is, you need much more analysis of why duplicate records can exist and probably a more sophisticated approach to handling collisions. Do you need to take all records (in which case you need a synthetic key)? Or do you take only one instance (so how do you decide precedence)? Other scenarios may exist.
In any case, MERGE or PL/SQL loops are likely to be too crude a solution.
First off, I would suggest that your target database drive all of these inserts because inserting/updating across a database link can create some locking issues and further complicate things especially with multiple databases attempting to access and perform DML on the same table. However if that isn't possible the solutions below will work.
I would fix your primary key problem by including a table look-up on the target table for each row.
    INSERT INTO customer@dblink.oracle.com
        (emp_name,
         emp_id)
    SELECT st.employee_name,
           st.employee_id  -- primary key
    FROM   source_table st
    WHERE  NOT EXISTS
           (SELECT 1
            FROM   customer@dblink.oracle.com cust
            WHERE  cust.emp_id = st.employee_id);
Again, I would not recommend DML transactions across database links unless absolutely necessary as you can sometimes have weird locking behavior.
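A PL/SQL procedure or anonymous PL/SQL block could be used to create a bulk processing solution as follows. FORALL cannot target a table over a database link (it fails with PLS-00739), so this version is written to run on the target database, where customer is a local table:
    CREATE OR REPLACE PROCEDURE send_unique_data
    AS
        TYPE tab_cust IS TABLE OF customer%ROWTYPE
            INDEX BY PLS_INTEGER;
        t_records   tab_cust;
        bulk_errors EXCEPTION;
        PRAGMA EXCEPTION_INIT(bulk_errors, -24381);  -- raised when SAVE EXCEPTIONS collected errors
    BEGIN
        -- Assumes customer has exactly these two columns, in this order.
        SELECT st.employee_name,
               st.employee_id  -- primary key
        BULK COLLECT
        INTO   t_records
        FROM   source_table st;

        -- 1 .. COUNT is safe even when no rows were fetched.
        FORALL i IN 1 .. t_records.COUNT SAVE EXCEPTIONS
            INSERT INTO customer
            VALUES t_records(i);
    EXCEPTION
        WHEN bulk_errors THEN
            NULL;  -- rejected rows (e.g. duplicate keys) are listed in SQL%BULK_EXCEPTIONS
    END send_unique_data;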
You can also read the SQL%BULK_EXCEPTIONS collection in case you want to do anything with the records that threw exceptions (such as unique constraint violations). Be warned that this solution will cause performance problems on the target table if a large number of the inserted rows are duplicates.

create index before adding columns vs. create index after adding columns - does it matter?

In Oracle 10g, does it matter what order create index and alter table comes in?
Say I have a query Q with a where clause on column C in table T. Now I perform one of the following scenarios:
1. I create index I(C) and then add columns X,Y,Z.
2. Add columns X,Y,Z then create index I(C).
Q is 'select * from T where C = whatever'
Between 1 and 2 will there be a significant difference in performance of Q on table T when T contains a very large number of rows?
I personally make it a practice to do #2 but others seem to have a different opinion.
thanks
It makes no difference if you add columns to a table before or after creating an index. The optimizer should pick the same plan for the query and the execution time should be unchanged.
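One way to check this yourself is to build a small test table in each order and compare the plans. A sketch for scenario 1, where the column types and the literal 42 are arbitrary choices:

```sql
-- Scenario 1: index first, then add the columns.
CREATE TABLE t (c NUMBER);
CREATE INDEX i ON t (c);
ALTER TABLE t ADD (x NUMBER, y NUMBER, z NUMBER);

-- Show the plan the optimizer picks for Q.
EXPLAIN PLAN FOR
  SELECT * FROM t WHERE c = 42;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Dropping the table and repeating with the ALTER TABLE before the CREATE INDEX gives the plan for scenario 2. With comparable data and statistics, the two should match.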
Depending on the physical storage parameters of the table, it is possible that adding the additional columns and populating them with data may force quite a bit of row migration to take place. That row migration will generate changes to the indexes on the table. If the index exists when you are populating the three new columns with data, it is possible that populating the data in X, Y, and Z will take a bit longer because of the additional index maintenance.
If you add columns without populating them, then it is pretty quick as it is just a metadata change. Adding an index does require the table to be read (or potentially another index) so that can be very time consuming and of much greater impact than the simple metadata change of recording the new index details.
If the new columns are going to be populated as part of the ALTER TABLE, it is a different matter.
The database may undergo an unplanned shutdown during the course of adding that data to every row of the table data
The server memory may not have room to record every row changed in that table
Therefore those row changes may be written to datafiles before commit, and are therefore written as dirty blocks
The next read of those blocks, after the ALTER TABLE has successfully completed, will do a delayed block cleanout (i.e. record the fact that the change has been committed)
If you add the columns (with data) first, then the create index will (probably) read the table and do the added work of the delayed block cleanout.
If you create the index first then add the columns, the create index may be faster but the delayed block cleanout won't happen and that housekeeping will be picked up by the application later (potentially by the select * from T where C = whatever)
