OptimisticLockException while using "for update skip locked" - Spring

I have the below calls using JPA native queries via the entity manager. My understanding was that if we use "for update skip locked", we shouldn't run into locking issues with concurrent processing. The database is MySQL.
But I am still getting "javax.persistence.OptimisticLockException: org.hibernate.exception.LockAcquisitionException" when multiple threads run the code below.
What could be the reason for this?
Query selectQuery = entityManager
        .createNativeQuery(....)  // select id from tb_contacts limit 10 for update SKIP LOCKED
....
.... processing in Java
....
// update tb_contacts set status = 2 where id in (:ids)
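For reference, a minimal sketch of how this pattern is usually wired up, with both statements in one transaction (the row locks taken by SELECT ... FOR UPDATE SKIP LOCKED are only held until commit, and MySQL supports SKIP LOCKED from 8.0); the service wiring and id handling are assumptions, the SQL comes from the question:

import java.math.BigInteger;
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ContactBatchProcessor {

    @PersistenceContext
    private EntityManager entityManager;

    // Both statements must run inside the SAME transaction: the row locks
    // taken by the SELECT ... FOR UPDATE SKIP LOCKED are released on commit,
    // so a transaction boundary between the two statements defeats the point.
    @Transactional
    public void claimAndProcessBatch() {
        @SuppressWarnings("unchecked")
        List<BigInteger> ids = entityManager
                .createNativeQuery("select id from tb_contacts limit 10 for update skip locked")
                .getResultList();
        if (ids.isEmpty()) {
            return;
        }

        // .... processing in Java ....

        entityManager
                .createNativeQuery("update tb_contacts set status = 2 where id in (:ids)")
                .setParameter("ids", ids) // Hibernate expands the collection into the IN list
                .executeUpdate();
    }
}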

Related

How to lock on select and release lock after update is committed using spring?

I started using Spring a few months ago and I have a question about transactions. I have a Java method inside my Spring Batch job which first does a select to get the first 100 rows with status 'NOT COMPLETED', and then updates the selected rows to change the status to 'IN PROGRESS'. Since I'm processing around 10 million records, I want to run multiple instances of my batch job, and each instance has multiple threads.
For a single instance, to make sure two threads are not fetching the same set of records, I have made my method synchronized. But if I run multiple instances of my batch job (multiple JVMs), there is a high probability that the same set of records will be fetched by both instances even if I use an "optimistic" or "pessimistic" lock or "select for update", since we cannot lock records during selection. The example is shown below: Transaction 1 has fetched 100 records, and meanwhile Transaction 2 also fetched 100 records; if I enable locking, Transaction 2 waits until Transaction 1 has updated and committed, but Transaction 2 then does the same update again.
Is there any way in Spring to make Transaction 2's select operation wait until Transaction 1's select is completed?
Transaction1          Transaction2
fetch 100 records
                      fetch 100 records
update 100 records
commit
                      update 100 records
                      commit
@Transactional
public synchronized List<Student> processStudentRecords(){
    List<Student> students = getNotCompletedRecords();
    if(null != students && students.size() > 0){
        updateStatusToInProgress(students);
    }
    return students;
}
Note: I cannot perform the update first and then the select. I would appreciate it if an alternative approach could be suggested.
Transaction synchronization should be left to the database server and not managed at the application level. From the database server's point of view, no matter how many JVMs (threads) you have, they are just concurrent database clients asking for read/write operations. You should not bother yourself with such concerns.
What you should do though is try to minimize contention as much as possible in the design of your solution, for example, by using the (remote) partitioning technique.
if I run multiple instances of my batch job (multiple JVMs), there is a high probability that the same set of records might be fetched by both instances even if I use an "optimistic" or "pessimistic" lock or "select for update", since we cannot lock records during selection
Partitioning data will by design remove all these problems. If you give each instance a set of data to work on, there is no chance that a worker would select the same records as another worker. Michael gave a detailed example in this answer: https://stackoverflow.com/a/54889092/5019386.
(Logical) partitioning, however, will not solve the contention problem, since all workers would read/write from/to the same table, but that's the nature of the problem you are trying to solve. What I'm saying is that you don't need to start locking/unlocking the table in your design; leave this to the database. Some database servers like Oracle can write data of the same table to different partitions on disk to optimize concurrent access (which might help if you use partitioning), but again, that's Oracle's business, not Spring's (or any other framework's).
Not everybody can afford Oracle, so I would look for a solution at the conceptual level. I have successfully used the following solution ("pseudo" physical partitioning) for a problem similar to yours:
Step 1 (in serial): copy/partition unprocessed data into temporary tables.
Step 2 (in parallel): run multiple workers on these tables instead of the source table with millions of rows.
Step 3 (in serial): copy/update processed data back to the original table.
Step 2 removes the contention problem. Usually, the cost of (Step 1 + Step 3) is negligible compared to Step 2 (and even more negligible if Step 2 were done in serial). This works well if the processing is the bottleneck.
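A rough sketch of the three steps using Spring's JdbcTemplate; the table names (student as the source, student_work_0..N-1 as the temporary tables) and the modulo split on id are illustrative assumptions, not part of the original recipe:

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;

public class PseudoPhysicalPartitioner {

    private final JdbcTemplate jdbc;
    private final int workers;

    public PseudoPhysicalPartitioner(DataSource dataSource, int workers) {
        this.jdbc = new JdbcTemplate(dataSource);
        this.workers = workers;
    }

    // Step 1 (in serial): copy unprocessed rows into one work table per worker,
    // split by a stable key so every row lands in exactly one work table.
    public void partition() {
        for (int i = 0; i < workers; i++) {
            jdbc.update("insert into student_work_" + i
                    + " select * from student"
                    + " where status = 'NOT COMPLETED' and mod(id, " + workers + ") = " + i);
        }
    }

    // Step 2 (in parallel): each worker reads/writes only student_work_<i>,
    // so no two workers ever contend for the same rows.

    // Step 3 (in serial): copy the processed status back to the source table
    // (MySQL-style UPDATE ... JOIN; adjust the syntax for your database).
    public void mergeBack() {
        for (int i = 0; i < workers; i++) {
            jdbc.update("update student s join student_work_" + i + " w on s.id = w.id"
                    + " set s.status = w.status");
        }
    }
}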
Hope this helps.

Spring int-jdbc:inbound-channel-adapter transaction

I have gone through this link: spring integration jdbc adapter for multiple nodes, which is quite helpful. I have a doubt about the point below.
I have a multi-threaded environment (multiple nodes), where a select query has n eligible rows, but I have configured max-rows-per-poll=5, followed by an update for these 5 records.
The poller is configured with a transaction.
While these 5 records are processed by one thread in one node, will all the other threads wait, or will they each pick 5 records from the remaining n-5 records and process them?
I am using int-jdbc:inbound-channel-adapter and an Oracle database.
You need to read about the difference between max-messages-per-poll and max-rows: https://docs.spring.io/spring-integration/docs/5.0.7.RELEASE/reference/html/jdbc.html#jdbc-max-rows-per-poll-versus-max-messages-per-poll.
Also, for Oracle I would recommend using FOR UPDATE SKIP LOCKED if you really want to pick up new records and not wait on already-locked ones.
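To make that concrete, a minimal sketch of such an adapter as a Java config bean; the table/column names are assumed, and in the setup from the question the poller would additionally reference a transactional PollerMetadata bean so that the select and the update share one transaction:

import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.jdbc.JdbcPollingChannelAdapter;

@Configuration
public class ItemPollingConfig {

    @Bean
    @InboundChannelAdapter(channel = "itemChannel", poller = @Poller(fixedDelay = "5000"))
    public MessageSource<Object> jdbcMessageSource(DataSource dataSource) {
        // FOR UPDATE SKIP LOCKED lets each node grab a different batch of rows
        // instead of blocking on rows another node has already locked.
        JdbcPollingChannelAdapter adapter = new JdbcPollingChannelAdapter(dataSource,
                "select * from items where status = 0 for update skip locked");
        // The update runs against the rows just selected (:id is bound from the result set).
        adapter.setUpdateSql("update items set status = 1 where id in (:id)");
        adapter.setMaxRowsPerPoll(5); // renamed to setMaxRows(..) in Spring Integration 5.1+
        return adapter;
    }
}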

Spring @Transactional + Isolation.REPEATABLE_READ for Rate Limiting

We are trying out a rate-limiting scenario where the total number of JSON records requested in a month through an API is capped at 10,000.
We are storing the total count of records in a table against client_id and a timestamp (which is the primary key).
Per request, we fetch the record from the table for that client with a timestamp within that month.
From this record we get the current count, increment it by the number of records in the current request, and update the DB.
Using Spring transactions, the pseudocode is as below:
@Transactional(propagation=Propagation.REQUIRES_NEW, isolation=Isolation.REPEATABLE_READ)
public void updateLimitData(String clientId, long currentRecordCount) {
    //step 1
    startOfMonthTimestamp = getStartOfMonth();
    endOfMonthTimestamp = getEndOfMonth();
    //step 2
    //read from DB
    latestLimitDetails = fetchFromDB(startOfMonthTimestamp, endOfMonthTimestamp, clientId);
    latestLimitDetails.count += currentRecordCount;
    //step 3
    saveToDB(latestLimitDetails);
}
We want to make sure that when multiple threads access the updateLimitData() method, each thread gets the up-to-date count for a clientId for the month and does not overwrite it incorrectly.
In the above scenario, if multiple threads access updateLimitData() and reach step 3, the first thread updates the count in the DB, and then the second thread updates the count in the DB, possibly without the latest value.
I understand from Isolation.REPEATABLE_READ that a write lock is placed on the rows only when the update is issued at step 3 (by which time the other thread has stale data). How can I ensure that threads always get the latest count from the table in a multithreaded scenario?
One solution that came to my mind is synchronizing this block, but that will not work well in a multi-server scenario.
Please suggest a solution.
A transaction alone will not help you unless you lock the table/row while doing this operation (don't do that, as it will hurt performance).
You can migrate this into the database by doing the increment within the database itself, using a stored procedure or function call. This will ensure ACID properties and transactional safety, as this is built into the database.
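Alternatively, a single atomic UPDATE gives the same guarantee without a stored procedure, because the database serializes concurrent increments on the row. A minimal sketch with Spring Data JPA, assuming a LimitData entity mapped to a limit_data table with client_id, period_start and record_count columns:

import java.sql.Timestamp;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import org.springframework.transaction.annotation.Transactional;

public interface LimitDataRepository extends JpaRepository<LimitData, Long> {

    // Read-modify-write pushed down into the database: the row lock taken by
    // the UPDATE makes concurrent increments queue up instead of overwriting
    // each other, so no stale count can be written back.
    @Modifying
    @Transactional
    @Query(value = "update limit_data set record_count = record_count + :delta"
            + " where client_id = :clientId and period_start = :periodStart",
            nativeQuery = true)
    int incrementCount(@Param("clientId") String clientId,
                       @Param("periodStart") Timestamp periodStart,
                       @Param("delta") long delta);
}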
I recommend doing this using the standard Spring Actuator to produce a count of API calls; however, this will mean rewriting your service to use the Actuator endpoint and not the database. You can link this to your gateway/firewall/load balancer to deny access to the API once the quota is reached. This means that your API endpoint stays pure and this logic is removed from your API call. All new APIs you develop will automatically get this functionality.

Save and lock entity with Hibernate

I'm looking for a way to save an entity and immediately lock it in the DB, in order to prevent other threads from accessing the entity before the creating thread finishes.
I'm using Hibernate 4.3.11 and Spring 4.2.5.
Thanks in advance.
There is a lock mode - LockMode.WRITE - but, as the documentation states:

A WRITE lock is obtained when an object is updated or inserted. This lock mode is for internal use only and is not a valid mode for load() or lock() (both of which throw exceptions if WRITE is specified).

If you are only inserting rows, then you cannot explicitly lock the database rows using Hibernate, as the rows are not yet committed.
The moment your code (Hibernate or not) inserts rows into the database but has not yet committed, transactional locks are held which get released on transaction commit. The nature of these locks and the manner in which this happens internally is database specific. However, if you are interested in locking some (already existing) rows, then you can query the data using
session.get(TestEntity.class, 1, LockMode.PESSIMISTIC_WRITE);
This will hold a pessimistic lock (typically by issuing SELECT ... FOR UPDATE) for the duration of the transaction, and no other thread/transaction can modify the data on which the lock has been taken.
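For illustration, a sketch of that call in a plain Hibernate unit of work; TestEntity and its setter are assumed for the example:

import org.hibernate.LockMode;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class PessimisticLockExample {

    public void updateWithLock(SessionFactory sessionFactory) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            // Issues SELECT ... FOR UPDATE: concurrent writers block here
            // until this transaction commits or rolls back.
            TestEntity entity = (TestEntity) session.get(
                    TestEntity.class, 1, LockMode.PESSIMISTIC_WRITE);
            entity.setName("updated"); // assumed setter, for illustration only
            tx.commit();               // row lock released here
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}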
A possible approach would be to raise the transaction isolation level to SERIALIZABLE.
This level ensures that data in use by one transaction stays locked and cannot be used by another transaction until the first one completes.
Hibernate offers two types of locks, optimistic and pessimistic. It's straightforward:
1) Optimistic locking uses versioning: a version column in the database is checked before each update, and an exception is thrown if it has changed.
2) Pessimistic locking means the database handles the locking on that row, and the lock gets released after the operation is completed. There are a few options, much like you would imagine: read lock, write lock.
https://docs.jboss.org/hibernate/orm/4.0/devguide/en-US/html/ch05.html
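For the optimistic variant, the version column mentioned in 1) is just a field annotated with @Version; a minimal sketch (entity name and fields are illustrative):

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class VersionedEntity {

    @Id
    private Long id;

    private String name;

    // Hibernate adds "where version = ?" to updates of this row and throws
    // an optimistic locking exception if another transaction has bumped it.
    @Version
    private int version;
}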
If you are using PostgreSQL, I think the below example works:

@Query(value = """
        with ins_artist as (
            insert into artist
            values (301, 'Whoever1')
            returning *
        ) select artist_id
        from ins_artist
        for update""", nativeQuery = true)
@Transactional(propagation = Propagation.REQUIRED)
Long insertArtist(); // returns artist ID

PS: I ran this query on https://postgres.devmountain.com/ , but it would need testing in a Java app.

Quartz org.quartz.jobStore.selectWithLockSQL row lock

I am using Quartz in clustered mode.
I have some row lock contention at the DB level caused by excessive calls to:
org.quartz.jobStore.selectWithLockSQL
"SELECT * FROM QRTZ_LOCKS WHERE SCHED_NAME = :"SYS_B_0" AND LOCK_NAME = :1 FOR UPDATE"
I have read the Quartz docs and it is still not very clear to me why the above query is executed.
What is the purpose of this row lock?
Regards
The locks table is used by Quartz for coordinating multiple schedulers when deployed in cluster mode. In a cluster, only one node should fire the trigger, so a lock is used to prevent multiple nodes from acquiring the same trigger.
From the clustering section of the documentation (http://quartz-scheduler.org/generated/2.2.1/html/qs-all/#page/Quartz_Scheduler_Documentation_Set%2Fre-cls_cluster_configuration.html%23):

Clustering currently only works with the JDBC-Jobstore (JobStoreTX or JobStoreCMT), and essentially works by having each node of the cluster share the same database. Load-balancing occurs automatically, with each node of the cluster firing jobs as quickly as it can. When a trigger's firing time occurs, the first node to acquire it (by placing a lock on it) is the node that will fire it.
In my case, I was experiencing a similar issue. I was using Quartz for running jobs whose logic involved fetching data from a foreign DB. Whenever the connection between the application DB and the foreign DB dropped for some reason and then came back up, the issue of locks surfaced, and we used to get messages like this in the database logs:
2021-01-14 12:06:17.935 KST [46836] STATEMENT:
SELECT * FROM HVACQRTZ_LOCKS WHERE SCHED_NAME = 'schedulerFactoryBean' AND LOCK_NAME = $1 FOR UPDATE
2021-01-14 12:06:18.937 KST [46836] ERROR: current transaction is aborted, commands ignored until end of transaction block
To solve this issue I used the following Quartz property, and once I did, the issue went away. By default the FOR UPDATE part is appended at the end of the query, but since the default query is replaced by the one I wrote in the property file, the FOR UPDATE portion is gone; no locks appear now and everything seems to be working smoothly.
selectWithLockSQL: SELECT * FROM {0}LOCKS WHERE LOCK_NAME = ?
