Write Skew anomaly in Oracle and PostgreSQL does not roll back the transaction

I noticed the following occurrence in both Oracle and PostgreSQL.
Considering we have the following database schema:
create table post (
id int8 not null,
title varchar(255),
version int4 not null,
primary key (id));
create table post_comment (
id int8 not null,
review varchar(255),
version int4 not null,
post_id int8,
primary key (id));
alter table post_comment
add constraint FKna4y825fdc5hw8aow65ijexm0
foreign key (post_id) references post;
With the following data:
insert into post (title, version, id) values ('Transactions', 0, 1);
insert into post_comment (post_id, review, version, id)
values (1, 'Post comment 1', 459, 0);
insert into post_comment (post_id, review, version, id)
values (1, 'Post comment 2', 537, 1);
insert into post_comment (post_id, review, version, id)
values (1, 'Post comment 3', 689, 2);
If I open two separate SQL consoles and execute the following statements:
TX1: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
TX2: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
TX1: SELECT COUNT(*) FROM post_comment where post_id = 1;
TX1: > 3
TX1: UPDATE post_comment SET version = 100 WHERE post_id = 1;
TX2: INSERT INTO post_comment (post_id, review, version, id) VALUES (1, 'Phantom', 0, 1000);
TX2: COMMIT;
TX1: SELECT COUNT(*) FROM post_comment where post_id = 1;
TX1: > 3
TX1: COMMIT;
TX3: SELECT * from post_comment;
> 0;"Post comment 0";100;1
1;"Post comment 1";100;1
2;"Post comment 2";100;1
1000;"Phantom";0;1
As expected, the SERIALIZABLE isolation level has kept the snapshot data from the beginning of the TX1 transaction and TX1 only sees 3 post_comment records.
Because of the MVCC model in Oracle and PostgreSQL, TX2 is allowed to insert a new record and commit.
Why is TX1 allowed to commit? Since this is a Write Skew anomaly, I was expecting TX1 to be rolled back with a "serialization failure" exception or something similar.
Does the MVCC Serializable model in PostgreSQL and Oracle only offer a snapshot isolation guarantee but no Write Skew anomaly detection?
UPDATE
I even changed Tx1 to issue an UPDATE statement that changes the version column for all post_comment records belonging to the same post.
This way, Tx2 creates a new record and Tx1 commits without knowing that a new record satisfying the UPDATE filtering criteria has been added.
Actually, the only way to make it fail on PostgreSQL is if we execute the following COUNT query in Tx2, prior to inserting the phantom record:
Tx2: SELECT COUNT(*) FROM post_comment where post_id = 1 and version = 0;
TX2: INSERT INTO post_comment (post_id, review, version, id) VALUES (1, 'Phantom', 0, 1000);
TX2: COMMIT;
Then Tx1 is going to be rolled back with:
org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
Hint: The transaction might succeed if retried.
Most likely, PostgreSQL's write-skew prevention mechanism detected this change and rolled back the transaction.
Interestingly, Oracle does not seem to be bothered by this anomaly: since it does not prevent write skew from happening, Tx1 just commits fine.
By the way, you can run all these examples yourself since they are on GitHub.

In the 1995 paper A Critique of ANSI SQL Isolation Levels, Jim Gray and co. described the Phantom Read as:
P3: r1[P]...w2[y in P]...(c1 or a1) (Phantom)
One important note is that ANSI SQL P3 only prohibits inserts (and
updates, according to some interpretations) to a predicate whereas the
definition of P3 above prohibits any write satisfying the predicate
once the predicate has been read — the write could be an insert,
update, or delete.
Therefore, a Phantom Read does not mean that you can simply return a snapshot as of the start of the currently running transaction and pretend that serving the same result for a repeated query protects you against the actual Phantom Read anomaly.
In the original SQL Server 2PL (Two-Phase Locking) implementation, returning the same result for a query implied Predicate Locks.
MVCC (Multi-Version Concurrency Control) Snapshot Isolation (wrongly named Serializable in Oracle) does not actually prevent other transactions from inserting or deleting rows that match the filtering criteria of a query that has already executed and returned a result set in the currently running transaction.
For this reason, we can imagine the following scenario in which we want to apply a raise to all employees:
Tx1: SELECT SUM(salary) FROM employee where company_id = 1;
Tx2: INSERT INTO employee (id, name, company_id, salary)
VALUES (100, 'John Doe', 1, 100000);
Tx1: UPDATE employee SET salary = salary * 1.1;
Tx2: COMMIT;
Tx1: COMMIT;
In this scenario, the CEO runs the first transaction (Tx1), so:
She first checks the sum of all salaries in her company.
Meanwhile, the HR department runs the second transaction (Tx2), as they have just managed to hire John Doe and gave him a $100k salary.
The CEO decides that a 10% raise is feasible given the total sum of salaries, unaware that the salary sum has risen by $100k.
Meanwhile, the HR transaction Tx2 is committed.
Tx1 is committed.
Boom! The CEO has taken a decision on an old snapshot, giving a raise that might not be sustained by the current updated salary budget.
You can view a detailed explanation of this use case (with lots of diagrams) in the following post.
Is this a Phantom Read or a Write Skew?
According to Jim Gray and co, this is a Phantom Read since the Write Skew is defined as:
A5B Write Skew Suppose T1 reads x and y, which are consistent with
C(), and then a T2 reads x and y, writes x, and commits. Then T1
writes y. If there were a constraint between x and y, it might be
violated. In terms of histories:
A5B: r1[x]...r2[y]...w1[y]...w2[x]...(c1 and c2 occur)
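To make A5B concrete, here is a minimal two-console transcript in the style of the examples above, using a hypothetical on_call table and an application-level rule that at least one doctor must stay on call per shift:
Tx1: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Tx2: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Tx1: SELECT COUNT(*) FROM on_call WHERE shift_id = 1;  -- r1: sees 2 (Alice and Bob)
Tx2: SELECT COUNT(*) FROM on_call WHERE shift_id = 1;  -- r2: also sees 2
Tx1: DELETE FROM on_call WHERE shift_id = 1 AND doctor = 'Alice';  -- w1[y]
Tx2: DELETE FROM on_call WHERE shift_id = 1 AND doctor = 'Bob';    -- w2[x]
Tx1: COMMIT;
Tx2: COMMIT;
-- Under plain Snapshot Isolation both commits succeed and the shift ends up
-- unstaffed, violating the rule; under true Serializable, one of the two
-- transactions must be aborted.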
In Oracle, the Transaction Manager might or might not detect the anomaly above, because Oracle does not use predicate locks or index range locks (next-key locks) the way MySQL does.
PostgreSQL manages to catch this anomaly only if the second transaction issues a read against the employee table; otherwise, the phenomenon is not prevented.
UPDATE
Initially, I was assuming that Serializability would imply a time ordering as well. However, as very well explained by Peter Bailis, wall-clock ordering or Linearizability is only assumed for Strict Serializability.
Therefore, my assumptions were made for a Strict Serializable system. But that's not what Serializable is supposed to offer. The Serializable isolation model makes no guarantees about time, and operations are allowed to be reordered as long as the outcome is equivalent to some serial execution.
Therefore, according to the Serializable definition, such a Phantom Read can occur if the second transaction does not issue any read. But in a Strict Serializable model, the one offered by 2PL, the Phantom Read would be prevented even if the second transaction never reads the entries we are trying to guard against phantoms.

What you observe is not a phantom read. That would require a new row to show up when the query is issued a second time (phantoms appear unexpectedly).
You are protected from phantom reads in both Oracle and PostgreSQL with SERIALIZABLE isolation.
The difference between Oracle and PostgreSQL is that SERIALIZABLE isolation level in Oracle only offers snapshot isolation (which is good enough to keep phantoms from appearing), while in PostgreSQL it will guarantee true serializability (i.e., there always exists a serialization of the SQL statements that leads to the same results). If you want to get the same thing in Oracle and PostgreSQL, use REPEATABLE READ isolation in PostgreSQL.
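For example, to get matching snapshot-isolation behavior on both sides (plain statements in each engine, shown here only for illustration):
-- PostgreSQL: snapshot isolation, without the serializability checks
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ... queries ...
COMMIT;
-- Oracle: what it calls SERIALIZABLE is snapshot isolation
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... queries ...
COMMIT;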

I just wanted to point out that Vlad Mihalcea's answer is plain wrong.
Is this a Phantom Read or a Write Skew?
Neither of those -- there is no anomaly here; the transactions are serializable as Tx1 -> Tx2 (the outcome is the same as if Tx1 had run to completion before Tx2 started).
SQL standard states:
"A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions."
PostgreSQL manages to catch this anomaly only if the second transaction issues a read against the employee table; otherwise, the phenomenon is not prevented.
PostgreSQL's behavior here is 100% correct; it just "flips" the apparent transaction order.

The Postgres documentation defines a phantom read as:
A transaction re-executes a query returning a set of rows that satisfy
a search condition and finds that the set of rows satisfying the
condition has changed due to another recently-committed transaction.
Because your select returns the same value both before and after the other transaction committed, it does not meet the criteria for a phantom read.

Related

Is Oracle DB truly isolated during execution of COMMIT?

Consider these two transactions:
INSERT INTO foo VALUES (1, 2, 'bar');
INSERT INTO foo VALUES (1, 4, 'xyz');
COMMIT;
and
SELECT * FROM foo;
Is there any point in time when the SELECT would see only one row inserted from the first transaction?
So far I couldn't find any evidence that the data become visible only after the COMMIT has successfully finished. As Oracle writes the redo log during commit, it writes it in a serial fashion, am I right? So there is a point where the first row is written, but not the second one. And since writers do not block readers in Oracle, if the SELECT hits exactly this window, then it sees only one row. Or is there some other locking mechanism?
No.
The data will not be visible to other sessions until the commit has been successful.
See ATOMICITY.
Of course, in the same session you can see the uncommitted data,
e.g.:
INSERT INTO foo VALUES (1, 2, 'bar');
SELECT * FROM foo;
INSERT INTO foo VALUES (1, 4, 'xyz');
COMMIT;
The select will show the inserted data even though the commit has not yet executed.
Nope. It's impossible to see just one row.
I don't have the exact implementation details, but the main idea is that every record has an associated last-modifying transaction number. When another transaction reads the data, it checks the status of each record's last-modifying transaction (and its own isolation level) and fetches only the records it is allowed to see. (This is pretty common for MVCC databases.)
Moreover, even when the reading transaction uses the READ COMMITTED isolation level, each query takes a snapshot of the currently active transaction statuses before execution and uses it to perform the check above. In effect, every single query runs with snapshot-level consistency. (This is an Oracle-specific feature.)
More details here: https://docs.oracle.com/cd/E25054_01/server.1111/e25789/consist.htm
Check the multiversion read and the statement level read consistency parts.
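As a rough illustration of that bookkeeping, Oracle exposes the SCN of the last change to each row through the ORA_ROWSCN pseudocolumn (shown against the foo table from the question; by default the SCN is tracked per block, not per row, unless the table was created with ROWDEPENDENCIES):
-- Show the last-change SCN next to each row.
SELECT ORA_ROWSCN, foo.* FROM foo;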

When does Oracle sql exclusively lock a row in an update statement?

I'm trying to see whether I can use database lock to deal with race conditions. For example
CREATE TABLE T1
(
    T1_ID NUMBER PRIMARY KEY,
    AMT NUMBER,
    STATUS1 CHAR(1),
    STATUS2 CHAR(1),
    UPDATED_BY VARCHAR(25)
);
insert into t1 values (order_seq.nextval, 1, 'N', 'N', 'U0');
Later two users can update the order record at the same time. The requirement is that only one can proceed while the other should NOT. We could certainly use a distributed lock manager (DLM) to do this, but I figure a database lock may be more efficient.
User 1:
update T1 set status1='Y', updated_by='U1' where status1='N';
User 2:
update T1 set status2='Y', updated_by='U2' where status1='N';
Two users are doing these at the same time. Ideally only one should be allowed to proceed. I played with this in SQL*Plus and also wrote a little Java test program letting two threads do these simultaneously. I got the same result. Let's say User 1 got the DB row lock first. It returns 1 row updated. The second session will be blocked waiting for the row lock until the 1st session commits or rolls back. The question is REALLY this:
An UPDATE with a WHERE clause seems like two operations: first it does an implicit select based on the WHERE clause to pick the rows that will be updated. Since Oracle only supports the READ COMMITTED isolation level, I expected both UPDATE statements to pick the single record in the DB, and as a result I expected both to eventually return "1 row updated", although one would wait until the other transaction commits. HOWEVER, that's not what I saw. The second UPDATE returns "0 rows updated" after the first commits. It feels as if Oracle actually runs the WHERE clause AGAIN after the first session commits, which results in the "0 rows updated" result.
This is strange to me. I thought I would run into the classical "lost update" phenomenon.
Can somebody please explain what's going on here? Thanks very much!

Consecutive application threads and uncommitted data in Oracle

Our application reads a record from an Oracle 'Event' table. When the event record exists we update the 'count' field of that record. If the record doesn't exist we insert it. So we want only 1 record for a particular event in the table.
The problem with this is probably quite predictable: one application thread will read the table, see the event is not there, insert the new event and commit. But before it commits a second thread will also read the table and see the event is not there. And then both threads will insert the event and we end up with 2 records for the same event.
I guess synchronizing access to this particular method in our application will prevent this problem, but what is the best option in Oracle to prevent it? Will MERGE, for example, always prevent this problem?
Serialising access to the procedure that implements this functionality would be trivial to implement, using DBMS_LOCK to define and take an exclusive lock.
Serialising through SQL-based methods is practically impossible, due to the read consistency model.
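A minimal sketch of that approach (the lock name and timeout here are made up; DBMS_LOCK.ALLOCATE_UNIQUE and DBMS_LOCK.REQUEST are the documented calls):
DECLARE
    v_lock_handle VARCHAR2(128);
    v_result      INTEGER;
BEGIN
    -- Map an arbitrary lock name to a handle (note: ALLOCATE_UNIQUE commits).
    DBMS_LOCK.ALLOCATE_UNIQUE('EVENT_UPSERT_LOCK', v_lock_handle);
    -- Take an exclusive lock, waiting up to 10 seconds; 0 means it was granted.
    v_result := DBMS_LOCK.REQUEST(lockhandle        => v_lock_handle,
                                  lockmode          => DBMS_LOCK.X_MODE,
                                  timeout           => 10,
                                  release_on_commit => TRUE);
    IF v_result = 0 THEN
        NULL; -- do the read-then-insert/update here: one session at a time
    END IF;
    COMMIT;  -- releases the lock
END;
/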
CREATE TABLE EVENTS (ID NUMBER PRIMARY KEY, COUNTER NUMBER NOT NULL);
MERGE INTO EVENTS
USING (SELECT ID, COUNTER FROM DUAL LEFT JOIN EVENTS ON EVENTS.ID = :EVENT_ID) SRC
ON (EVENTS.ID = SRC.ID)
WHEN MATCHED THEN UPDATE SET COUNTER = SRC.COUNTER + 1
WHEN NOT MATCHED THEN INSERT (ID, COUNTER) VALUES (:EVENT_ID, 1);
Simple SQL that secures a single record for each ID and consistently increases the counter, no matter which application fires it or how many concurrent threads there are. You don't need to code anything at all, and it's very lightweight as well.
It also doesn't produce any exception related to data consistency, so you don't need any special handling.
UPDATE: It actually produces a unique violation exception if both threads are inserting. I thought the second MERGE would switch to an update, but it doesn't.
UPDATE: I just tested the same case on SQL Server: when executing in parallel and the record doesn't exist, one MERGE inserts and the second updates.
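On Oracle, one hedged way to cope with that insert/insert race, reusing the EVENTS table and MERGE from above: catch DUP_VAL_ON_INDEX and fall back to a plain UPDATE, since by the time the exception fires the row is guaranteed to exist.
BEGIN
    MERGE INTO EVENTS
    USING (SELECT ID, COUNTER FROM DUAL LEFT JOIN EVENTS ON EVENTS.ID = :EVENT_ID) SRC
    ON (EVENTS.ID = SRC.ID)
    WHEN MATCHED THEN UPDATE SET COUNTER = SRC.COUNTER + 1
    WHEN NOT MATCHED THEN INSERT (ID, COUNTER) VALUES (:EVENT_ID, 1);
EXCEPTION
    WHEN DUP_VAL_ON_INDEX THEN
        -- The other session inserted the row first; update it instead.
        UPDATE EVENTS SET COUNTER = COUNTER + 1 WHERE ID = :EVENT_ID;
END;
/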

How to insert while avoiding unique constraints with Oracle

We have a process that aggregates some data and inserts the results into another table that we use for efficient querying. The problem we're facing is that we now have multiple aggregators running at roughly the same time.
We use the original record's id as the primary key in this new table - a unique constraint. However, if two aggregation processes are running at the same time, one of them will error with a unique constraint violation.
Is there a way to specify some kind of locking mechanism which will make the second writer wait until the first is finished? Alternatively, is there a way to tell oracle to ignore that specific row and continue with the rest?
Unfortunately it's not practical to reduce the aggregation to a single process, as the following procedures rely on an up to date version of the data being available and those procedures do need to scale out.
Edit:
The following is my [redacted] query:
INSERT INTO agg_table
SELECT h.id, h.col, h.col2
FROM history h
JOIN call c ON c.callid = h.callid
WHERE h.id > (SELECT coalesce(max(id), 0) FROM agg_table);
It is possible to run an INSERT statement with an error logging clause. The example from the Oracle docs is as follows:
INSERT INTO dw_empl
SELECT employee_id, first_name, last_name, hire_date, salary, department_id
FROM employees
WHERE hire_date > sysdate - 7
LOG ERRORS INTO err_empl ('daily_load') REJECT LIMIT 25
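Note that the err_empl error logging table must exist before the insert runs; a sketch of creating it with the DBMS_ERRLOG package (table names follow the docs example above):
BEGIN
    DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name     => 'DW_EMPL',
                                 err_log_table_name => 'ERR_EMPL');
END;
/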
Alternatively, you could try using a MERGE statement. You would be merging into the summary table with a select from the detail table. If a match is not found, you INSERT, and if it is found, you would UPDATE. I believe this solution will handle your concurrency issues, but you would need to test it.
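A sketch of what that could look like for the [redacted] query above (column names are taken from that query; illustrative, not tested):
MERGE INTO agg_table a
USING (SELECT h.id, h.col, h.col2
       FROM history h
       JOIN call c ON c.callid = h.callid) src
ON (a.id = src.id)
WHEN NOT MATCHED THEN
    INSERT (id, col, col2) VALUES (src.id, src.col, src.col2);
Keep in mind the caveat from the previous question, though: two sessions merging the same new id at the same time can still both take the insert branch, so the error logging clause may still be worth combining with this.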
Have a look at the FOR UPDATE clause. If you correctly write the SELECT statement with the FOR UPDATE clause within a transaction, before your update/insert statements, you will be able to "lock" the required records.
Serialising the inserts is probably the best way, as there's no method that will get you round the problem of the multiple inserts being unable to see what each one is doing.
DBMS_Lock is probably the appropriate serialisation mechanism.

How to find locked rows in Oracle

We have an Oracle database, and the customer account table has about a million rows. Over the years, we've built four different UIs (two in Oracle Forms, two in .Net), all of which remain in use. We have a number of background tasks (both persistent and scheduled) as well.
Something is occasionally holding a long lock (say, more than 30 seconds) on a row in the account table, which causes one of the persistent background tasks to fail. The background task in question restarts itself once the update times out. We find out about it a few minutes after it happens, but by then the lock has been released.
We have reason to believe that it might be a misbehaving UI, but haven't been able to find a "smoking gun".
I've found some queries that list blocks, but that's for when you've got two jobs contending for a row. I want to know which rows have locks when there's not necessarily a second job trying to get a lock.
We're on 11g, but have been experiencing the problem since 8i.
Oracle's locking concept is quite different from that of other systems.
When a row in Oracle gets locked, the record itself is updated with the new value (if any) and, in addition, a lock (which is essentially a pointer to transaction lock that resides in the rollback segment) is placed right into the record.
Locking a record in Oracle thus means updating the record's metadata and issuing a logical page write. For instance, you cannot do SELECT FOR UPDATE on a read-only tablespace.
More than that, the records themselves are not updated after commit: instead, the rollback segment is updated.
This means that each record holds some information about the transaction that last updated it, even if the transaction itself has long since died. To find out if the transaction is alive or not (and, hence, if the record is alive or not), it is required to visit the rollback segment.
Oracle does not have a traditional lock manager, and this means that obtaining a list of all locks requires scanning all records in all objects. This would take too long.
You can obtain some special locks, like locked metadata objects (using v$locked_object), lock waits (using v$session) etc, but not the list of all locks on all objects in the database.
You can find the locked objects in Oracle with the following query:
select c.owner,
       c.object_name,
       c.object_type,
       b.sid,
       b.serial#,
       b.status,
       b.osuser,
       b.machine
from   v$locked_object a,
       v$session b,
       dba_objects c
where  b.sid = a.session_id
and    a.object_id = c.object_id;
Rather than locks, I suggest you look at long-running transactions, using v$transaction. From there you can join to v$session, which should give you an idea about the UI (try the program and machine columns) as well as the user.
Look at dba_blockers, dba_waiters and dba_locks for locking. The names should be self-explanatory.
You could create a job that runs, say, once a minute and logs the values in dba_blockers and the currently active sql_id for each of those sessions (via v$session and v$sqlstats).
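A hedged sketch of the query such a job could log (DBA_BLOCKERS exposes a single HOLDING_SESSION column; the columns pulled from v$session here are just a reasonable starting set):
SELECT s.sid,
       s.serial#,
       s.username,
       s.program,
       s.machine,
       s.sql_id,
       q.sql_text
FROM   dba_blockers b
JOIN   v$session s ON s.sid = b.holding_session
LEFT JOIN v$sqlstats q ON q.sql_id = s.sql_id;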
You may also want to look in v$sql_monitor. By default this will log all SQL that takes longer than 5 seconds. It is also visible on the "SQL Monitoring" page in Enterprise Manager.
The below PL/SQL block finds all locked rows in a table. The other answers only find the blocking session; finding the actual locked rows requires reading and testing each row.
(However, you probably do not need to run this code. If you're having a locking problem, it's usually easier to find the culprit using GV$SESSION.BLOCKING_SESSION and other related data dictionary views. Please try another approach before you run this abysmally slow code.)
First, let's create a sample table and some data. Run this in session #1.
--Sample schema.
create table test_locking(a number);
insert into test_locking values(1);
insert into test_locking values(2);
commit;
update test_locking set a = a+1 where a = 1;
In session #2, create a table to hold the locked ROWIDs.
--Create table to hold locked ROWIDs.
create table locked_rowids(the_rowid rowid);
--Remove old rows if table is already created:
--delete from locked_rowids;
--commit;
In session #2, run this PL/SQL block to read the entire table, probe each row, and store the locked ROWIDs. Be warned, this may be ridiculously slow. In your real version of this query, change both references to TEST_LOCKING to your own table.
--Save all locked ROWIDs from a table.
--WARNING: This PL/SQL block will be slow and will temporarily lock rows.
--You probably don't need this information - it's usually good enough to know
--what other sessions are locking a statement, which you can find in
--GV$SESSION.BLOCKING_SESSION.
declare
    v_resource_busy exception;
    pragma exception_init(v_resource_busy, -00054);
    v_throwaway number;
    type rowid_nt is table of rowid;
    v_rowids rowid_nt := rowid_nt();
begin
    --Loop through all the rows in the table.
    for all_rows in
    (
        select rowid
        from test_locking
    ) loop
        --Try to lock each row.
        begin
            select 1
            into v_throwaway
            from test_locking
            where rowid = all_rows.rowid
            for update nowait;
        --If the row is already locked, record its ROWID.
        exception when v_resource_busy then
            v_rowids.extend;
            v_rowids(v_rowids.count) := all_rows.rowid;
        end;
        rollback;
    end loop;
    --Display the count:
    dbms_output.put_line('Rows locked: '||v_rowids.count);
    --Save all the ROWIDs.
    --(Row-by-row because the ROWID type is weird and doesn't work in types.)
    for i in 1 .. v_rowids.count loop
        insert into locked_rowids values(v_rowids(i));
    end loop;
    commit;
end;
/
Finally, we can view the locked rows by joining to the LOCKED_ROWIDS table.
--Display locked rows.
select *
from test_locking
where rowid in (select the_rowid from locked_rowids);
A
-
1
Given some table, you can find which rows are not locked with SELECT ... FOR UPDATE SKIP LOCKED.
For example, this query will lock (and return) every unlocked row:
SELECT * FROM mytable FOR UPDATE SKIP LOCKED
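For instance, against the TEST_LOCKING example above, running this in session #2 while session #1 still holds its lock should return (and lock) only the free row:
SELECT * FROM test_locking FOR UPDATE SKIP LOCKED;
-- The row being updated by session #1 is skipped; only the other row comes back.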
References
Ask TOM "How to get ROWID for locked rows in oracle".
