Is Oracle DB truly isolated during execution of COMMIT?

Consider these two transactions:
INSERT INTO foo VALUES (1, 2, 'bar');
INSERT INTO foo VALUES (1, 4, 'xyz');
COMMIT;
and
SELECT * FROM foo;
Is there any point in time when the SELECT would see only one row inserted from the first transaction?
So far I couldn't find any evidence that the data only become visible after the COMMIT has successfully finished. Oracle writes the redo log during the commit, and it writes it serially, am I right? So there is a point in time where the first row has been written but not the second. And since writers do not block readers in Oracle, if the SELECT hits exactly this window, wouldn't it see only one row? Or is there some other locking mechanism?

No.
The data will not be visible to other sessions until the commit has completed successfully.
See ATOMICITY.
Of course, in the same session you can see the uncommitted data,
e.g.:
INSERT INTO foo VALUES (1, 2, 'bar');
SELECT * FROM foo;
INSERT INTO foo VALUES (1, 4, 'xyz');
COMMIT;
The SELECT will show the inserted data even though the COMMIT has not yet been executed.

Nope. It's impossible to see just one row.
I don't have the exact implementation details, but the main idea is that every record has an associated last-modifying transaction number. When another transaction reads the data, it checks the status of that last-modifying transaction (and its own isolation level) and fetches only the rows it is allowed to see. (This is pretty common for any MVCC database.)
Moreover, even when the reading transaction has the READ COMMITTED isolation level, each query takes a snapshot of the currently active transaction statuses before it executes and uses it to perform the check above. Effectively, every query runs against a consistent snapshot. (This is an Oracle-specific feature.)
More details here: https://docs.oracle.com/cd/E25054_01/server.1111/e25789/consist.htm
Check the multiversion read consistency and statement-level read consistency parts.
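To make the guarantee concrete, here is a minimal two-session sketch of the scenario from the question (the inline comments describe the expected behaviour under Oracle's read consistency, not output from an actual trace):
-- Session A                              -- Session B
INSERT INTO foo VALUES (1, 2, 'bar');
INSERT INTO foo VALUES (1, 4, 'xyz');
                                          SELECT * FROM foo;   -- sees neither new row
COMMIT;
                                          SELECT * FROM foo;   -- sees both new rows
-- A query that starts while the COMMIT is still in flight is assigned a read SCN
-- that is either below or above the commit SCN, so it too sees none or both of
-- the rows, never exactly one.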

Related

JpaItemWriter<T> still performs writes one item at a time instead of in batch

I have a question about writing operations in Spring Batch on databases through the ItemWriter<T> contract. To quote from The Definitive Guide to Spring Batch by Michael T. Minella:
All of the items are passed in a single call to the ItemWriter where they can be written out at once. This single call to the ItemWriter allows for IO optimizations by batching the physical write. [...] Chunks are defined by their commit intervals. If the commit interval is set to 50 items, then your job reads in 50 items, processes 50 items, and then writes out 50 items at once.
Yet when I use, say, HibernateItemWriter or JpaItemWriter in a step-based job to write to the database in a Spring-Boot-based app with all the Spring Batch infrastructure in place (@EnableBatchProcessing, Step/JobBuilderFactory, etc.), together with monitoring tools for verifying the number of insert/update statements (such as implementations of the MethodInterceptor interface), I notice that the number of inserts performed by the writer is equal to the total number of records to process instead of the number of chunks configured for that job.
For example, upon inspection of the logs in IntelliJ from a job execution of 10 items with a chunk size of 5, I found 10 insert statements
Query:["insert into my_table (fields...
instead of 2. I also checked for insert statements in the general_log_file for my RDS instance and found two 'Prepare insert' statements and one 'Execute insert' statement for each item to process.
Now I understand that a writer such as JpaItemWriter<T>'s method write(List<? extends T> items) loops through the items calling entityManager.persist/merge(item) - thereby inserting a new row into the corresponding table - and eventually entityManager.flush(). But where is the performance gain provided by the batch processing, if there is any?
where is the performance gain provided by the batch processing, if there is any?
There is a performance gain, and this gain is provided by the chunk-oriented processing model that Spring Batch offers, in the sense that all these insert statements are executed in a single transaction:
start transaction
INSERT INTO table ... VALUES ...
INSERT INTO table ... VALUES ...
...
INSERT INTO table ... VALUES ...
end transaction
You would see a performance hit if there was a transaction for each item, something like:
start transaction
INSERT INTO table ... VALUES ...
end transaction
start transaction
INSERT INTO table ... VALUES ...
end transaction
...
But that is not the case with Spring Batch, unless you set the chunk-size to 1 (but that defeats the goal of using such a processing model in the first place).
So yes, even if you see multiple insert statements, that does not mean that there are no batch inserts. Check the transaction boundaries in your DB logs and you should see a transaction around each chunk, not around each item.
As a side note, from my experience, using raw JDBC performs better than JPA (with any provider) when dealing with large inserts/updates.
Performance can be improved by batching inserts with the following configuration
spring.jpa.properties.hibernate.jdbc.batch_size=?
For example, with a batch_size of 3 and a chunk size of 3, when a chunk is committed it will execute the following SQL:
INSERT INTO my_table (id, name)
VALUES (1, 'Pete'), (2, 'Pam'), (3, 'Paul');
rather than multiple single inserts
INSERT INTO my_table (id, name) VALUES (1, 'Pete');
INSERT INTO my_table (id, name) VALUES (2, 'Pam');
INSERT INTO my_table (id, name) VALUES (3, 'Paul');
The following blog highlights its use:
https://vladmihalcea.com/the-best-way-to-do-batch-processing-with-jpa-and-hibernate/

Database read locking

I have a use case where I need to do the following things in one transaction:
start the transaction
1. INSERT an item into a table
2. SELECT all the items in the table
3. dump the selected items into a file (this file is versioned and another program always uses the latest version)
4. if all the above things succeed, commit the transaction; if not, roll back
If two transactions begin almost simultaneously, it is possible that before the first transaction A commits what it has inserted into the table (step 4), the second transaction B has already performed the SELECT operation (step 2), whose result does not yet contain the item inserted by the first transaction (as it is not yet committed by A, and therefore not visible to B). In this case, when A finishes, it will have correctly dumped a file File1 containing its inserted item. Later, B finishes and dumps another file File2 containing only its own inserted item but not the one inserted by A. Since File2 is more recent, we will use File2. The problem is that File2 doesn't contain the item inserted by A even though this item is indeed in the DB.
I would like to know whether it is feasible to solve this problem by blocking reads (SELECTs) of the table while a transaction that has inserted something into it is still open (until its commit or rollback), and if so, how this locking can be implemented in Spring with Oracle as the DB.
You need some sort of synchronization between the transactions:
start the transaction
Obtain a lock to prevent the transaction in another session from proceeding, or wait until the transaction in the other session finishes
INSERT an item into a table
SELECT ......
......
Commit and release the lock
The easiest way is to use the LOCK TABLE command, at least in SHARE mode (SHARE ROW EXCLUSIVE or EXCLUSIVE modes can also be used, but they are more restrictive than needed for this case).
The advantage of this approach is that the lock is automatically released at commit or rollback.
The disadvantage is that this lock can interfere with other transactions in the system that update this table at the same time, and could reduce overall performance.
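A minimal sketch of the LOCK TABLE variant (the items table and values are hypothetical; SHARE ROW EXCLUSIVE is used here instead of plain SHARE so that two sessions that both intend to insert can never hold the lock at the same time):
-- run by every transaction that has to be serialized
LOCK TABLE items IN SHARE ROW EXCLUSIVE MODE;  -- blocks until the competing transaction ends

INSERT INTO items (id, payload) VALUES (42, 'new item');
SELECT * FROM items;                           -- this result set is dumped to the versioned file
COMMIT;                                        -- the table lock is released automatically here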
Another approach is to use the DBMS_LOCK package. Such a lock doesn't affect other transactions that don't explicitly use it. The drawback is that this package is harder to use: by default the lock is released neither on commit nor on rollback, so you must explicitly release it at the end of the transaction, and all exceptions must therefore be handled carefully, otherwise a deadlock could easily occur.
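A sketch of the DBMS_LOCK route, assuming a lock name of 'ITEMS_DUMP_LOCK' invented for this example; the exception handler is there only so the lock is always released:
DECLARE
  l_handle VARCHAR2(128);
  l_status INTEGER;
BEGIN
  -- map an application-chosen lock name to a lock handle (this call commits)
  DBMS_LOCK.ALLOCATE_UNIQUE(lockname => 'ITEMS_DUMP_LOCK', lockhandle => l_handle);

  -- take the lock in exclusive mode, waiting up to 60 seconds
  l_status := DBMS_LOCK.REQUEST(lockhandle        => l_handle,
                                lockmode          => DBMS_LOCK.X_MODE,
                                timeout           => 60,
                                release_on_commit => FALSE);
  IF l_status NOT IN (0, 4) THEN  -- 0 = got the lock, 4 = we already own it
    RAISE_APPLICATION_ERROR(-20001, 'Could not obtain lock, status = ' || l_status);
  END IF;

  -- ... INSERT, SELECT and dump to the file here ...

  COMMIT;
  l_status := DBMS_LOCK.RELEASE(lockhandle => l_handle);  -- must be released explicitly
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;
    l_status := DBMS_LOCK.RELEASE(lockhandle => l_handle);
    RAISE;
END;
/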
One more solution is to create a "dummy" table with a single row in it, for example:
CREATE TABLE my_special_lock_table(
  x INT
);
INSERT INTO my_special_lock_table VALUES(1);
COMMIT;
and then use SELECT x FROM my_special_lock_table FOR UPDATE
or - even easier - a simple UPDATE my_special_lock_table SET x=x at the start of your transaction (see the placement sketch below).
This will place an exclusive lock on the row in this table and synchronize only the transactions that explicitly take this lock.
A drawback is that another "dummy" table must be created.
But this solution doesn't affect the other transactions in the system, the lock is automatically released upon commit or rollback, and it is portable - it should work in all other databases, not only in Oracle.
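The placement sketch referenced above: the locking statement has to run before the INSERT and SELECT, so the rest of the work only starts once any competing transaction has committed or rolled back (the items table is again hypothetical):
-- at the very start of every transaction that has to be serialized
UPDATE my_special_lock_table SET x = x;   -- blocks here while another such transaction is open

INSERT INTO items (id, payload) VALUES (42, 'new item');
SELECT * FROM items;                      -- this result set is dumped to the versioned file
COMMIT;                                   -- the row lock on my_special_lock_table is released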
Use Spring's REPEATABLE_READ or SERIALIZABLE isolation levels:
REPEATABLE_READ A constant indicating that dirty reads and
non-repeatable reads are prevented; phantom reads can occur. This
level prohibits a transaction from reading a row with uncommitted
changes in it, and it also prohibits the situation where one
transaction reads a row, a second transaction alters the row, and the
first transaction rereads the row, getting different values the second
time (a "non-repeatable read").
SERIALIZABLE A constant indicating that dirty reads, non-repeatable
reads and phantom reads are prevented. This level includes the
prohibitions in ISOLATION_REPEATABLE_READ and further prohibits the
situation where one transaction reads all rows that satisfy a WHERE
condition, a second transaction inserts a row that satisfies that
WHERE condition, and the first transaction rereads for the same
condition, retrieving the additional "phantom" row in the second read.
With serializable or repeatable read, the group of statements is protected from non-repeatable reads:
connection 1:                                   connection 2:
set transaction isolation level
    repeatable read
begin transaction
select name from users where id = 1
                                                update users set name = 'Bill' where id = 1
select name from users where id = 1                 |
commit transaction                                  |
                                                    |--> executed here
In this scenario, the update will block until the first transaction is complete.
Higher isolation levels are rarely used because they lower the number of people that can work in the database at the same time. At the highest level, serializable, a reporting query halts any update activity.
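For Oracle specifically, only READ COMMITTED and SERIALIZABLE (plus READ ONLY) are available; REPEATABLE READ is not. If you set the level at the database level rather than through Spring, it is done per transaction or per session, for example (the items table is hypothetical):
-- must be the first statement of the transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

INSERT INTO items (id, payload) VALUES (42, 'new item');
SELECT * FROM items;   -- sees the database as of the start of the transaction
COMMIT;

-- or once for the whole session
ALTER SESSION SET ISOLATION_LEVEL = SERIALIZABLE;
Note that an Oracle SERIALIZABLE transaction that tries to modify data changed by a transaction that committed after it began fails with ORA-08177 and has to be retried.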
I think you need to serialize the whole transaction. While a SELECT ... FOR UPDATE could work, it does not really buy you anything, since you would be selecting all rows. You may as well just take and release a lock using DBMS_LOCK.

When does Oracle SQL exclusively lock a row in an UPDATE statement?

I'm trying to see whether I can use database lock to deal with race conditions. For example
CREATE TABLE ORDERS
(
  T1_ID      NUMBER PRIMARY KEY,
  AMT        NUMBER,
  STATUS1    CHAR(1),
  STATUS2    CHAR(1),
  UPDATED_BY VARCHAR2(25)
);
INSERT INTO ORDERS VALUES (order_seq.nextval, 1, 'N', 'N', 'U0');
Later two users can update the order record at the same time. The requirement is that only one can proceed while the other should NOT. We could certainly use a distributed lock manager (DLM) to do this, but I figure a database lock may be more efficient.
User 1:
update T1 set status1='Y', updated_by='U1' where status1='N';
User 2:
update T1 set status2='Y', updated_by='U2' where status1='N';
Two users are doing this at the same time. Ideally only one should be allowed to proceed. I played with this in SQL*Plus and also wrote a little Java test program letting two threads do it simultaneously, and I got the same result. Let's say User 1 got the DB row lock first; it returns 1 row updated. The second session is blocked waiting for the row lock until the 1st session commits or rolls back. The question is REALLY this:
Update with a where clause seems like two operations: first it does an implicit select based on the where clause to pick the row that will be updated. Since Oracle's default isolation level is READ COMMITTED, I expected both UPDATE statements to pick the single record in the DB. As a result, I expected both UPDATE statements to eventually return "1 row updated", although one would wait until the other transaction commits. HOWEVER that's not what I saw. The second UPDATE returns "0 rows updated" after the first commits. It looks as if Oracle actually runs the where clause AGAIN after the first session commits, which results in "0 rows updated".
This is strange to me. I thought I would run into the classical "lost update" phenomenon.
can somebody please explain what's going on here? Thanks very much!

Consecutive application threads and uncommitted data in Oracle

Our application reads a record from an Oracle 'Event' table. When the event record exists we update the 'count' field of that record. If the record doesn't exist we insert it. So we want only 1 record for a particular event in the table.
The problem with this is probably quite predictable: one application thread will read the table, see the event is not there, insert the new event and commit. But before it commits a second thread will also read the table and see the event is not there. And then both threads will insert the event and we end up with 2 records for the same event.
I guess synchronizing access to this particular method in our application will prevent this problem, but what is the best option in Oracle to prevent this? Will MERGE for example always prevent this problem?
Serialising access to the procedure that implements this functionality would be trivial to implement, using DBMS_LOCK to define and take an exclusive lock.
Serialising through SQL-based methods is practically impossible, due to the read consistency model.
CREATE TABLE EVENTS (ID NUMBER PRIMARY KEY, COUNTER NUMBER NOT NULL);
MERGE INTO EVENTS
USING (SELECT ID, COUNTER FROM DUAL LEFT JOIN EVENTS ON EVENTS.ID = :EVENT_ID) SRC
ON (EVENTS.ID = SRC.ID)
WHEN MATCHED THEN UPDATE SET COUNTER = SRC.COUNTER + 1
WHEN NOT MATCHED THEN INSERT (ID, COUNTER) VALUES (:EVENT_ID, 1);
Simple SQL that secures a single record for each ID and consistently increments the counter, no matter which application fires it or how many concurrent threads there are. You don't need to write any procedural code at all and it's very lightweight as well.
It also doesn't produce any exceptions related to data consistency, so you don't need any special handling.
UPDATE: It actually produces a unique-constraint violation if both threads are inserting. I thought the second MERGE would switch to an update, but it doesn't.
UPDATE: I just tested the same case on SQL Server: when executing in parallel and the record doesn't exist, one MERGE inserts and the second updates.
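If you stay on Oracle, one way to cope with the unique-constraint violation noted above is to catch it and fall back to a plain update; a sketch (the bind variable and table are the ones from the answer):
BEGIN
  MERGE INTO events
  USING (SELECT :EVENT_ID AS id FROM dual) src
  ON (events.id = src.id)
  WHEN MATCHED THEN UPDATE SET counter = counter + 1
  WHEN NOT MATCHED THEN INSERT (id, counter) VALUES (src.id, 1);
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN
    -- the competing session inserted the row and has committed by the time our
    -- blocked insert fails, so a plain update will find it
    UPDATE events SET counter = counter + 1 WHERE id = :EVENT_ID;
END;
/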

Strange Oracle problem

I tried to put it in a sentence but it is better to give an example:
SELECT * FROM someTable WHERE id = someID;
returns no rows
...
some time passes (no inserts are done to the table and no ID updates)
...
SELECT * FROM someTable WHERE id = someID;
returns one row!
Is it possible that some DB mechanism prevents the first SELECT from returning the row?
Oracle log has no errors.
No transactions are rolled back between the two selects.
You can't see uncommitted data from another session. When did the commit happen?
EDIT1: Are you the only one using this database, or are there multiple sessions?
I think in another session you or someone else inserted this row; you do your select and don't see the row yet. After that, a commit happens in the other session (maybe an implicit one because the session is closed) and then you see the row when you select again.
I can think of other explanations, but first I want to know whether you are the only one using this database.
With read consistency as provided by Oracle, you should not see a row appear like that. If you are running in some mode with automatic commits, so that each statement is a self-contained transaction, then read consistency is not being violated. Which program are you using to access the database? I agree with the other observations; the row should not appear if your session is not inserting it and no other session is active at the same time. I don't know of a DBMS that indulges in spontaneous data generation.
Do you have any scheduled jobs in that Oracle database?
