Dirty Reading in hibernate - oracle

Dirty Read: The definition states that
dirty reading occurs when a transaction reads data from a row that has been modified by another transaction but not yet committed.
Assuming the definition is correct, I am unable to fathom any such situation.
Due to the principle of Isolation, the transaction A can not see the uncommitted data of the row that has been modified by transaction B. If transaction B has simply not committed, how transaction A can see it in the first place? It is only possible when both operations are performed under same transaction.
Can someone please explain what am I missing here?

"Dirty", or uncommitted reads (UR) are a way to allow non-blocking reads. Reading uncommitted data is not possible in an Oracle database due to the multi-version concurrency control employed by Oracle; instead of trying to read other transactions' data each transaction gets its own snapshot of data as they existed (committed) at the start of the transaction. As a result all reads are essentially non-blocking.
In databases that use lock-based concurrency control, e.g DB2, uncommitted reads are possible. A transaction using the UR isolation level ignores locks placed by other transactions, and thus it is able to access rows that have been modified but not yet committed.
Hibernate, being an abstraction layer on top of a database, offers the UR isolation level support for databases that have the capability.

Related

What's the performance penalty of long lived DB transactions interleaved with one another?

Could anyone provide an explanation or point me to a good source where it is explained the impact of long lived database transactions when there are other transactions involved?
I'm having difficulties trying to understand what is the real impact in the performance of an application of having transactions where most of the queries are reads and maybe a couple or three are writes, given the different isolation levels.
Mostly I would like to understand it in the situation where:
Neither the rows read nor the rows updated are involved in any other transaction.
The rows read are involved in another transaction but not the rows being updated and this other transaction is read only.
The rows read are involved in another transaction but not the rows being updated and this other transaction is modifying some data being read. I understand here it also affects whether the data is read before or after is being modified.
Both the rows read and the rows updated are involved in another transaction also modifying the data.
These questions come in the context of an application using micro services where all application layer services are annotated with #Transactional using JPA and PostgreSQL and, to transform the data, they need to do some network calls to other micro services within the transaction to fetch some other values.

Difference between LockModeType Jpa

I am confused about the working of LockModeTypes in JPA:
LockModeType.Optimistic
it increments the version while committing.
Question here is : If I have version column in my entity and if I don't specify this lock mode then also it works similarly then what is the use of it?
LockModeType.OPTIMISTIC_FORCE_INCREMENT
Here it increments the version column even though the entity is not updated.
but what is the use of it if any other process updated the same row before this transaction is committed? this transaction is anyways going to fail. so what is the use of this LockModeType.
LockModeType.PESSIMISTIC_READ
This lock mode issues a select for update nowait(if no hint timeout specified)..
so basically this means that no other transaction can update this row until this transaction is committed, then its basically a write lock, why its named a Read lock?
LockModeType.PESSIMISTIC_WRITE
This lock mode also issues a select for update nowait (if no hint timeout specified).
Question here is what is the difference between this lock mode and LockModeType.PESSIMISTIC_READ as I see both fires same queries?
LockModeType.PESSIMISTIC_FORCE_INCREMENT
this does select for update nowait (if no hint timeout specified) and also increments the version number.
I totally didn't get the use of it.
why a version increment is required if for update no wait is there?
I would first differentiate between optimistic and pessimistic locks, because they are different in their underlying mechanism.
Optimistic locking is fully controlled by JPA and only requires additional version column in DB tables. It is completely independent of underlying DB engine used to store relational data.
On the other hand, pessimistic locking uses locking mechanism provided by underlying database to lock existing records in tables. JPA needs to know how to trigger these locks and some databases do not support them or only partially.
Now to the list of lock types:
LockModeType.Optimistic
If entities specify a version field, this is the default. For entities without a version column, using this type of lock isn't guaranteed to work on any JPA implementation. This mode is usually ignored as stated by ObjectDB. In my opinion it only exists so that you may compute lock mode dynamically and pass it further even if the lock would be OPTIMISTIC in the end. Not very probable usecase though, but it is always good API design to provide an option to reference even the default value.
Example:
`LockModeType lockMode = resolveLockMode();
A a = em.find(A.class, 1, lockMode);`
LockModeType.OPTIMISTIC_FORCE_INCREMENT
This is a rarely used option. But it could be reasonable, if you want to lock referencing this entity by another entity. In other words you want to lock working with an entity even if it is not modified, but other entities may be modified in relation to this entity.
Example: We have entity Book and Shelf. It is possible to add Book to Shelf, but book does not have any reference to its shelf. It is reasonable to lock the action of moving a book to a shelf, so that a book does not end up in another shelf (due to another transaction) before end of this transaction. To lock this action, it is not sufficient to lock current book shelf entity, as the book does not have to be on a shelf yet. It also does not make sense to lock all target bookshelves, as they would be probably different in different transactions. The only thing that makes sense is to lock the book entity itself, even if in our case it does not get changed (it does not hold reference to its bookshelf).
LockModeType.PESSIMISTIC_READ
this mode is similar to LockModeType.PESSIMISTIC_WRITE, but different in one thing: until write lock is in place on the same entity by some transaction, it should not block reading the entity. It also allows other transactions to lock using LockModeType.PESSIMISTIC_READ. The differences between WRITE and READ locks are well explained here (ObjectDB) and here (OpenJPA). If an entity is already locked by another transaction, any attempt to lock it will throw an exception. This behavior can be modified to waiting for some time for the lock to be released before throwing an exception and roll back the transaction. In order to do that, specify the javax.persistence.lock.timeout hint with the number of milliseconds to wait before throwing the exception. There are multiple ways to do this on multiple levels, as described in the Java EE tutorial.
LockModeType.PESSIMISTIC_WRITE
this is a stronger version of LockModeType.PESSIMISTIC_READ. When WRITE lock is in place, JPA with the help of the database will prevent any other transaction to read the entity, not only to write as with READ lock.
The way how this is implemented in a JPA provider in cooperation with underlying DB is not prescribed. In your case with Oracle, I would say that Oracle does not provide something close to a READ lock. SELECT...FOR UPDATE is really rather a WRITE lock. It may be a bug in hibernate or just a decision that, instead of implementing custom "softer" READ lock, the "harder" WRITE lock is used instead. This mostly does not break consistency, but does not hold all rules with READ locks. You could run some simple tests with READ locks and long running transactions to find out if more transactions are able to acquire READ locks on the same entity. This should be possible, whereas not with WRITE locks.
LockModeType.PESSIMISTIC_FORCE_INCREMENT
this is another rarely used lock mode. However, it is an option where you need to combine PESSIMISTIC and OPTIMISTIC mechanisms. Using plain PESSIMISTIC_WRITE would fail in following scenario:
transaction A uses optimistic locking and reads entity E
transaction B acquires WRITE lock on entity E
transaction B commits and releases lock of E
transaction A updates E and commits
in step 4, if version column is not incremented by transaction B, nothing prevents A from overwriting changes of B. Lock mode LockModeType.PESSIMISTIC_FORCE_INCREMENT will force transaction B to update version number and causing transaction A to fail with OptimisticLockException, even though B was using pessimistic locking.
LockModeType.NONE
this is the default if entities don't provide a version field. It means that no locking is enabled conflicts will be resolved on best effort basis and will not be detected. This is the only lock mode allowed outside of a transaction

Rolling back multiple transactions with JDBC

Is it possible to rollback multiple already-commited transactions with JDBC?
According to this link here: http://docs.oracle.com/javase/tutorial/jdbc/basics/transactions.html savepoints are only active for the current transaction?
Thanks.
Already committed individual or multiple transactions (unlike savepoints!) are not possible on any databases as far as I know, definitely not on Oracle. Yes, savepoints are relevant only for the current transaction.
I'm not sure what your problem is but if you want to look at old values of a recently committed table you could use SELECT AS OF or similarly, flashback the whole table or even the database.
If you think about it for a while there are lots of constrains while individual transactional rollbacks are sometimes logically impossible without violating a whole lot of data integrity rules...

Achieving ACID properties using JDBC?

First of all i would like to confirm is it the responsibility of developer to follow these properties or responsibilty of transaction Apis like JDBC?
Below is my understanding how we achieve acid properties in JDBC
Atomicity:- as there is one transaction associated with connection, so we do commit or rollback , there are no partial updation.Hence achieved
Consitency:- when some data integrity constraint is voilated (say some check constraint) then sqlexception will be thrown . Then programmer acieve the consistent database by rollbacking the transaction?
one question on above say we do transaction1 and sql excpetion is thrown during transaction 2 as explained above . Now we catch the exception and do the commit will first transaction be commited?
Isolation:- Provided by JDBC Apis.But this leads to the problem of concurrent update . so it has be dealt manually right?
Durability:- Provided by JDBC Apis.
Please let me if above understanding is right?
ACID principles of transactional integrity are implemented by the database not by the API (like JDBC) or by the application. Your application's responsibility is to choose a database and a database configuration that supports whatever transactional integrity you need and to correctly identify the transactional boundaries in your application.
When an exception is thrown, your application has to determine whether it is appropriate to rollback the entire transaction or to proceed with additional processing. It may be appropriate if your application is processing orders from a vendor, for example, to process the 99 orders that succeed and log the 1 order that failed somewhere for users to investigate. On the other hand, you may reject all 100 orders because 1 failed. It depends what your application is doing.
In general, you only have one transaction open at a time (or, more accurately, one transaction per connection). So if you are working in transaction 2, transaction 1 by definition has already completed-- it was either committed or rolled back previously. Exceptions thrown in transaction 2 have no impact on transaction 1.
Depending on the transaction isolation level your application requests (and the transaction isolation levels your database supports) as well as the mechanics of your application, lost updates are something that you may need to be concerned about. If you set your transaction isolation level to read committed, it is possible that you would read a value as 'A' in transaction 1, wait for a user to do something, update the value to 'B', and commit without realizing that transaction 2 updated the value to 'C' between the time you read the data and the time you wrote the data. This may be a problem that you need to deal with or it may be something where it is fine for the last person to update a row to "win".
Your database, on the other hand, should take care of the automatic locking that prevents two transactions from simultaneously updating the same row of the same table. It may do this by locking more than is strictly necessary but it will serialize the updates somehow.

AUTONOMOUS_TRANSACTION: pros and cons

Can be autonomous transactions dangerous? If yes, in which situations? When autonomous transactions are necessary?
Yes, autonomous transactions can be dangerous.
Consider the situation where you have your main transaction. It has inserted/updated/deleted rows. If you then, within that, set up an autonomous transaction then either
(1) It will not query any data at all. This is the 'safe' situation. It can be useful to log information independently of the primary transaction so that it can be committed without impacting the primary transaction (which can be useful for logging error information when you expect the primary transaction to be rolled back).
(2) It will only query data that has not been updated by the primary transaction. This is safe, but superfluous. There is no point to the autonomous transaction.
(3). It will query data that has been updated by the primary transaction. This smacks of a poorly thought through design, since you've overwritten something and then need to go back to see what it was before you overwrote it. Sometimes people think that an autonomous transaction will still see the uncommitted changes of the primary transaction, and it won't. It reads the currently committed state of the database, plus any changes made within the autonomous transaction. Some people (often trying autonomous transactions in response to mutating trigger errors) don't care what state the data is in when they try to read it and these people simply shouldn't be allowed access to a database.
(4). It will try to update/delete data that hasn't been updated by the primary transaction. Again, this smacks of poor design. These changes are going to get committed (or rolled back) whether or not the primary transaction succeeds or fails. Worse you risk issue (5) since it is hard to determine, within an autonomous transaction, whether the data has been updated by the primary transaction.
(5). You try to update/delete data that has already been updated by the primary transaction, in which case it will deadlock and end up in an ugly mess.
Can be autonomous transactions dangerous?
Yes.
If yes, in which situations?
When they're misused. For example, when used to make changes to data which should have been rolled back if the rest of the parent transaction is rolled back. Misusing them can cause data corruption because some portions of a change are committed, while others are not.
When are autonomous transactions necessary?
They are necessary when the effects of one transaction must survive, regardless of whether the parent transaction is committed or rolled back. A good example is a procedure which logs the progress and activity of a process to a database table.
When are autonomous transactions necessary?
Check my question: How can LOCK survive COMMIT or how can changes to LOCKed table be propagated to another session without COMMIT and losing LOCK
We ingest business configurations sequentially and should forbid parallel processing.
I use lock to table with configurations and update other tables accordingly. I commit each batched updates to other tables as we can't afford to keep transaction on all records - probability of collision would be near 0.99.
Each failure because of concurrent access is persisted to log for later update attempt.

Resources