I have two procedures that I want to run on the same table: one uses the birth date, and the other updates the name and last name taken from a third table.
The one that uses the birthday to update the age field runs over the whole table, and the one that updates the names and last names only updates the rows that appear in the third table, matched on a key.
So I launched both and got deadlocked! Is there a way to prioritize either of them? I read about NOWAIT and SKIP LOCKED for the update, but then how would I go back to the rows that were skipped?
Hope you can help me on this!!
One possibility is to lock all the rows you will update at once. Doing all the updates in a single UPDATE statement will accomplish this. Or:
select whatever from T
where ...
for update;
Another solution is to create what I call a "Gatekeeper" table. Both procedures need to lock the Gatekeeper table in exclusive mode before updating the table in question. The second procedure will block until the first commits, but won't deadlock. In 11g you can create a table with no space allocated.
A variation is to insert a row in the Gatekeeper. Then lock only that row with select for update. Then you can use the Gatekeeper in other situations.
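A rough sketch of both variations (the table name and the dummy row are assumptions for illustration, not a definitive implementation):
create table gatekeeper ( id number primary key );   -- tiny table used only for serialization
insert into gatekeeper values ( 1 );
commit;

-- at the start of each procedure, before touching the real table:
lock table gatekeeper in exclusive mode;
-- or, for the row-level variation:
select id from gatekeeper where id = 1 for update;

-- ... perform the updates ...
commit;   -- releases the lock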
I would guess that you got deadlocked because the update for all the rows and the update for a small set of rows accessed the rows in different orders.
The former used a full scan and reached Row A first, then went on to other rows, eventually trying to lock Row B. However, the other query was driven from an index or a join and already had Row B locked, and was off to lock Row A when it found it was already locked.
So, the fix: firstly, having an age column that needs to be constantly modified is a really bad idea. Perhaps it was done to allow indexing of age, but with a correctly written query an index on date of birth will let you find the same records just as quickly. You've broken normalisation rules and ended up coding yourself a deadlocking application. Hopefully you are only updating the rows that need to be updated, not all of them regardless -- I mean, that would just be insane.
The best solution is to get rid of that design flaw.
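For example (a minimal sketch, assuming a people table with a date_of_birth column), finding everyone aged 18 or over needs no stored age column at all, and the predicate can still use an index on date_of_birth:
select *
from   people
where  date_of_birth <= add_months( trunc(sysdate), -18 * 12 );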
The not so good solution is to deconflict your queries by running them at different times or by using DBMS_Lock so that only one of them can run at any time.
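A sketch of the DBMS_Lock approach (the lock name is an assumption): each procedure requests the same named lock first, so the second one waits instead of deadlocking.
declare
  l_handle varchar2(128);
  l_result integer;
begin
  dbms_lock.allocate_unique( 'AGE_AND_NAME_UPDATES', l_handle );
  l_result := dbms_lock.request( l_handle, dbms_lock.x_mode, release_on_commit => true );
  if l_result = 0 then
    -- run the update here
    commit;   -- also releases the lock
  end if;
end;
/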
Problem: I have a table to which a customer may add columns. This table might have hundreds of columns of varying data types depending on how insane the customer is. I need to deploy an AFTER UPDATE trigger against this table to insert a row in another table for each column value that has changed.
Example:
Table_A, Row 1: Key_Value=1, Col1=123, Col2="foo"...Coln="bar"
becomes
Table_B, Row 1: Key_Value=1, ColName="Col1", ColValue=123
Table_B, Row 2: Key_Value=1, ColName="Col2", ColValue="foo"
Table_B, Row 3: Key_Value=1, ColName="Coln", ColValue="bar"
Since I do not know what columns they may create and this trigger must be deployed with the application, I need to evaluate the OLD vs NEW pseudo records dynamically (if :new.columns[1] != :old.columns[1] then...) to see what has changed and log only the changed columns. The only examples I have been able to find require referencing the columns in the pseudo records explicitly (if :new.col1 != :old.col1 then...).
Question: Is there a way to do this in Oracle?
Caveats: No, this is not for auditing purposes, so I cannot use Oracle's built-in auditing. No, we are not going to rewrite our app because you know how to do it better; this is the way it needs to work, for better or worse.
Any helpful comments are welcome. All snarky DBA drivel is not. Thanks in advance.
No. You can't dynamically reference columns in the :new or :old pseudorecord.
The closest you're likely to come is to write code that dynamically generates the entire trigger body by querying the data dictionary and making static references to the columns in the pseudorecord. That code, however, would need to be run every time a column was added to or removed from the table. Normally, that would be done as part of normal release management.

If you are saying that people are adding and removing columns from this table without going through a release process, you could write a DDL trigger that submits a job via dbms_job to call the procedure that rebuilds the update trigger. That is a lot of moving pieces, and it will be a pain to troubleshoot when something inevitably goes wrong, but if you're not open to alternate ways of implementing the functionality, that's the complexity you'll have to live with.
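A rough sketch of that generation approach (the table, column, and trigger names are assumptions, and the change-detection logic is deliberately simplified):
declare
  l_body varchar2(32767);
begin
  for c in ( select column_name
             from   user_tab_columns
             where  table_name = 'TABLE_A'
             and    column_name <> 'KEY_VALUE'
             order  by column_id )
  loop
    -- build a static :NEW vs :OLD comparison for every current column
    l_body := l_body
      || 'if :new.' || c.column_name || ' <> :old.' || c.column_name
      || ' or (:new.' || c.column_name || ' is null and :old.' || c.column_name || ' is not null)'
      || ' or (:new.' || c.column_name || ' is not null and :old.' || c.column_name || ' is null) then '
      || 'insert into table_b (key_value, colname, colvalue) '
      || 'values (:new.key_value, ''' || c.column_name || ''', to_char(:new.' || c.column_name || ')); '
      || 'end if; ';
  end loop;

  execute immediate
    'create or replace trigger trg_table_a_hist after update on table_a for each row begin '
    || l_body
    || ' end;';
end;
/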
I'm working on an application which accesses a Sybase ASE 15.0.2 server, where the current code accesses a remote database (CIS) to insert a row using a proxy table definition (the destination table is a DOL - DRL table; the PK column is defined as an identity and is always growing). The current code performs a select to check whether the row already exists, to avoid inserting duplicate data.
Since the remote table also has a PK defined, I understand that the PK verification will be done again prior to committing the row.
I'm planning to remove the select check, since it's effectively done again by the PK verification, but I'm concerned that when receiving a file with many duplicates, the table may suffer some unnecessary contention when the data is committed.
It's not clear to me whether Sybase ASE tries to hold the last row and writes the data prior to checking for the duplicate. Also, if the table is very big, I'm concerned about the time it will spend searching the whole index for duplicates.
I've found some documentation for SQL Anywhere, but not for ASE, at the following link:
http://dcx.sybase.com/1200/en/dbusage/insert-how-transact.html
The best I could find is the following explanation:
https://groups.google.com/forum/?fromgroups#!topic/comp.databases.sybase/tHnOqptD7X8
But it doesn't explain in detail how the row is locked (and whether there is any kind of optimization to write it ahead of, or at the same time as, the PK check), nor whether it will waste a full PK lookup when I'm inserting a row whose PK is positively greater than the last row committed.
Thanks
Alex
Unlike SQL Anywhere, there is no wait_for_commit option in ASE; the primary key constraint is checked during the insert, not at commit time. The problem, as I understand it from your post, is a mass insert from a file that may contain duplicates. The approach is to load the file into a temp table, check for duplicates, remove them, and then insert only the unique rows. Mass inserts are a lot faster, though the primary key is still checked for violations; however, there is no rollback cost with this approach. The insert statement is always all or nothing: even if one row is a duplicate, the entire insert statement will fail. Checking before inserting is more of an error-free approach, as opposed to relying on the constraint for verification, because the constraint is going to fail and the rollback is again going to be costly.
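A rough sketch of that staging approach in ASE (table and column names are assumptions; assumes the select into/bulkcopy option is enabled on the database):
select * into #staging from target_table where 1 = 0   -- empty copy to load the file into

-- after bulk loading the file into #staging:
delete #staging
from   #staging, target_table
where  #staging.pk_col = target_table.pk_col            -- drop rows that already exist

insert into target_table
select * from #staging                                  -- only the new keys remain
-- (if the PK is an identity column you may also need: set identity_insert target_table on)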
Thanks Mike
The link does have a very quick explanation about the insert from the CIS perspective. It's a variable to keep an eye on, given that CIS may become a significant time consumer if it performs data and syntax checking that will be done again when CIS forwards the insert statement to the target server. I was afraid that CIS could have some influence, beyond the network traffic/time, over the locking/PK checking.
Raju
I do agree with avoiding the PK duplication by checking whether the row already exists with a select, done in a batch, but I'm currently looking for a stop-gap solution, and that may be to perform the insert command in batches of about 50 rows and leave the duplicate key check to the PK.
Hopefully the PK check will be done over a join of the 50 newly inserted rows, and thus avoid traversing the index for each single row...
I'll try to test this and comment back.
Alex
Background: My team has an ETL job that updates an aggregate table. Each row contains data for a particular date, but a row can and will get updated after its date (which means any row can contain data from multiple jobs). This ETL job missed some data for one day last week, and now I need to backfill it.
Problem: I have the missing data, and what I was planning on doing was dumping that data into a temporary table and then merging it with the agg table. That way I can deal with whether the agg table already contains a row for that data (update) or whether a new row needs to be added (insert). But I don't have sufficient permissions to create a temp table, and I'd prefer not to involve the DBA.
Question: Can I get insert/update behavior without creating a temporary table? (This is Oracle SQL, by the way.)
Edit: The data is coming from a TSV file.
Why do you want to avoid involving the DBA? The DBA should have full knowledge of what's going on in the database, as they are ultimately responsible for the condition of the data within it. So you shouldn't be playing sneaky commando with them.
As you have a file of missing data, the easiest way to present it to the database is with an external table. This requires the creation of the table and probably a directory object as well. You will need the DBA's help with this task.
The only way to avoid creating database objects is to convert your TSV file into a series of DML statements. An IDE which supports regex and/or records macros will prove invaluable here. I like TextPad; other editors are available.
The DML statement for doing upserts in Oracle is the MERGE statement. The one thing you need to watch for is recency. Your missing data comes from last week; if a row exists, it may have been added or amended in the intervening period. You must write your MERGE statement so it does not overwrite more recent data with the older stuff. Hopefully your table has useful metadata columns such as DATE_CREATED and LAST_UPDATED.
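A minimal sketch of such a MERGE (the table, columns, values, and cutoff date are all assumptions for illustration; you would generate one statement like this per TSV row):
MERGE INTO agg_table t
USING ( SELECT DATE '2013-05-06' AS row_date, 42 AS metric_value FROM dual ) s
ON ( t.row_date = s.row_date )
WHEN MATCHED THEN UPDATE
  SET t.metric_value = s.metric_value,
      t.last_updated = SYSDATE
  WHERE t.last_updated < DATE '2013-05-07'   -- don't overwrite rows amended since the missed run
WHEN NOT MATCHED THEN INSERT
  ( row_date, metric_value, date_created, last_updated )
  VALUES ( s.row_date, s.metric_value, SYSDATE, SYSDATE );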
I have a trigger that checks another couple of tables before allowing a row to be inserted. However between the time I check the other tables and insert the row the other tables may get updated.
How do I ensure the tables I'm checking remain in a consistent state until after the new row is inserted? I was thinking of taking locks out but everything I've read boils down to if you are not leaving locking to Oracle you're almost certainly doing it wrong.
Oracle is already doing this for you: when you perform a select, it sees the tables in a consistent state as of the start of the query (or as of the start of the transaction, if you use serializable isolation). This won't stop the data from being changed under you, though; your transaction just won't see it being changed. If you want to stop that data from being changed, then you can use SELECT FOR UPDATE as Justin Cave suggests.
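A hedged sketch of that (the table and column names are assumptions): the trigger locks the row it is validating so it cannot change before the insert commits.
CREATE OR REPLACE TRIGGER trg_check_parent
BEFORE INSERT ON child_table
FOR EACH ROW
DECLARE
  l_status parent_table.status%TYPE;
BEGIN
  SELECT status
  INTO   l_status
  FROM   parent_table
  WHERE  parent_id = :NEW.parent_id
  FOR UPDATE;                              -- row stays locked until this transaction commits

  IF l_status <> 'OPEN' THEN
    RAISE_APPLICATION_ERROR( -20001, 'Parent is not open for new rows' );
  END IF;
END;
/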
I would seriously question what you are doing though, triggers, except in the most trivial cases, almost always lead to unexpected side effects.
I am working on a system to track a project's history. There are 3 main tables: projects, tasks, and clients, plus a history table for each. I have the following trigger on the projects table.
CREATE OR REPLACE TRIGGER mySchema.trg_projectHistory
BEFORE UPDATE OR DELETE
ON mySchema.projects REFERENCING NEW AS New OLD AS Old
FOR EACH ROW
DECLARE
  tmpVersion NUMBER;
BEGIN
  SELECT myPackage.GETPROJECTVERSION( :OLD.project_ID )
  INTO   tmpVersion
  FROM   dual;

  INSERT INTO mySchema.projectHistory
    ( project_ID, ..., version )
  VALUES
    ( :OLD.project_ID,
      ...,
      tmpVersion
    );
EXCEPTION
  WHEN OTHERS THEN
    -- Consider logging the error and then re-raise
    RAISE;
END;
/
I have three such triggers, one for each of my tables (projects, tasks, clients).
Here is the challenge: not everything changes at the same time. For example, somebody could just update a certain task's cost. In that case, only one trigger fires and I get one insert. I'd like to insert one record into all 3 history tables at once, even if nothing changed in the projects and clients tables.
Also, what if somebody changes a project's end_date and the cost, and, say, picks another client? Now I have three triggers firing at the same time. Only in this case will I get one record inserted into each of my three history tables (which is what I want).
If I modify the triggers to insert into all 3 tables for the first example, then I will get 9 inserts when the second example happens.
Not quite sure how to tackle this. Any help?
To me it sounds as if you want a transaction-level snapshot of the three tables created whenever you make a change to any of those tables.
Have a row level trigger on each of the three tables that calls a single packaged procedure with the project id and optionally client / task id.
The packaged procedure inserts into all three history tables the relevant project, client and task rows where there isn't already a history record for that key and transaction (i.e. you don't want duplicates). You have a couple of choices when it comes to the latter: with a unique constraint in place you can use a bulk select and insert with FORALL ... SAVE EXCEPTIONS, DML error logging (LOG ERRORS INTO), or an INSERT...SELECT...WHERE NOT EXISTS, as sketched below.
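A sketch of the last option for the projects history table (the column names and bind variables are assumptions; the version number acts as the per-transaction marker):
INSERT INTO projectHistory ( project_id, project_name, version )
SELECT p.project_id, p.project_name, :p_version
FROM   projects p
WHERE  p.project_id = :p_project_id
AND    NOT EXISTS ( SELECT 1
                    FROM   projectHistory h
                    WHERE  h.project_id = p.project_id
                    AND    h.version    = :p_version );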
You do need to keep track of your transactions. I'm guessing this is what you were doing with myPackage.GETPROJECTVERSION. The trick here is to only increment the version when you have a new transaction. If, when you get a new version number, you hold it in a package-level variable, you can easily tell whether your session has already got a version number or not.
If your session is going to run multiple transactions, you'll need to 'clear out' the session-level version number if it was part of a previous transaction. If you get DBMS_TRANSACTION.LOCAL_TRANSACTION_ID and store that at the package/session level as well, you can determine whether you are in a new transaction or part of the same transaction.
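A sketch of that transaction detection (the sequence name is an assumption, the spec is assumed to already declare the function, and the project id parameter is kept only to match the existing call):
CREATE OR REPLACE PACKAGE BODY myPackage AS
  g_txn_id  VARCHAR2(200);
  g_version NUMBER;

  FUNCTION getProjectVersion( p_project_id IN NUMBER ) RETURN NUMBER IS
    l_txn_id VARCHAR2(200);
  BEGIN
    -- the triggering DML has already started the transaction, so this is non-null
    l_txn_id := DBMS_TRANSACTION.LOCAL_TRANSACTION_ID;

    IF g_txn_id IS NULL OR g_txn_id <> l_txn_id THEN
      -- first call in this transaction: take a fresh version number
      SELECT version_seq.NEXTVAL INTO g_version FROM dual;
      g_txn_id := l_txn_id;
    END IF;

    RETURN g_version;
  END getProjectVersion;
END myPackage;
/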
From your description, it looks like you would be capturing the effective and end dates for each of the history rows once any of the original rows change.
E.g. the project_hist table would have eff_date and exp_date columns holding the start and end dates for a given project, while the project table would just have an effective date (as it is the active project).
I don't see why you want to insert rows into all three history tables when only one of the tables' values is updated. You can pretty much get the details as you need them (as of a given date) using your current logic (inserting the old row into the history table only for the table that was actually updated).
Alternative answer.
Have a look at Total Recall / Flashback Archive.
You can set the retention to 10 years and use a simple AS OF TIMESTAMP to get the data as of any particular timestamp.
Not sure on performance, though. It may be easier to have a daily or weekly retention, and then a separate scheduled job that picks out the older versions using the VERSIONS BETWEEN syntax and stores them in your history table.
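A couple of hedged examples of those queries (the table name, key value, and interval are assumptions):
-- data as of a point in time
SELECT *
FROM   projects AS OF TIMESTAMP ( SYSTIMESTAMP - INTERVAL '7' DAY )
WHERE  project_id = 123;

-- all intermediate versions in a window, e.g. for copying into your own history table
SELECT project_id, versions_starttime, versions_endtime, versions_operation
FROM   projects VERSIONS BETWEEN TIMESTAMP ( SYSTIMESTAMP - INTERVAL '7' DAY ) AND SYSTIMESTAMP
WHERE  project_id = 123;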