Spring Boot @Transactional method running on multiple threads

In my Spring Boot application, I have multiple threads running the following @Transactional method in parallel:
@Transactional
public void run(Customer customer) {
    Customer existing = this.clientCustomerService.findByCustomerName(customer.getName());
    if (existing == null) {
        this.clientCustomerService.save(customer);
    }
    // other database operations
}
When this runs on multiple threads at the same time, given that the customer row is not saved until the end of the transaction block, is there any possibility of duplicate customers in the database?

If your Customer entity has an @Id field which defines a primary key column in the customer table, the database will throw an exception such as javax.persistence.EntityExistsException when a duplicate is inserted. Even if you run your code on multiple threads, at any point in time only one of them will acquire the lock on the newly inserted row at the database level. You must also define a @Version column/field at the entity level in order to use optimistic locking.
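A minimal sketch of such a mapping (a hypothetical Customer entity; the unique constraint on name is what actually makes the database reject a concurrent duplicate insert, while @Version enables optimistic locking for concurrent updates):
import javax.persistence.*;
@Entity
@Table(name = "customer", uniqueConstraints = @UniqueConstraint(columnNames = "name"))
public class Customer {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    @Column(nullable = false)
    private String name;
    @Version // incremented by the persistence provider on every update
    private Long version;
    // getters and setters omitted
}
With this mapping, whichever concurrent transaction commits second gets a constraint violation (in Spring typically surfaced as a DataIntegrityViolationException) instead of silently inserting a duplicate.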

Related

Should I use ALLOW FILTERING in Cassandra to delete associated entities in a multi-tenant app?

I have a spring-boot project and I am using Cassandra as the database. My application is a tenant-based application and all my tables include the tenantId. It is always part of the partition key in all tables, but there are also other columns that are part of the partition keys.
So the problem is: I want to remove a specific tenant from my database, but I can't do it directly, because I need the other parts of the partition key.
I have two solutions for it in mind:
1) Allow filtering, select all the tenant-specific entities, and then remove them one by one in the application.
2) Use the findAll() method, fetch all the data, then filter in the application and delete all the tenant-specific data.
Example:
public class DeleteTenant {
    @Autowired
    MyRepository myRepo;
    public void cleanTenantWithoutDbFiltering(String tenantId) {
        myRepo.findAll()
              .stream()
              .filter(entity -> entity.getTenantId().equals(tenantId)) // ??
              .forEach(myRepo::delete);
    }
    public void cleanTenantWithDbFiltering(String tenantId) {
        myRepo.getTenantSpecificData(tenantId)
              .forEach(myRepo::delete);
    }
}
My getTenantSpecificData(String tenantId) query would look like this:
@AllowFiltering
@Query("SELECT * FROM myTable WHERE tenantId = ?1 ALLOW FILTERING")
public List<MyEntity> getTenantSpecificData(String tenantId);
Do you have any other ideas about it? If not, which one do you think would be more efficient: filtering in the application itself, or in Cassandra?
Thanks in advance for your answers!
It isn't clear to me how you've modelled your data because you haven't provided examples of your schema, but in any case the use of ALLOW FILTERING is never going to be a good idea: it means that your query has to do a full table scan of all the relevant tables unless the tenant ID alone is the partition key.
You will need to come up with a different approach, such as writing a Spark app that efficiently goes through the tables to identify the partitions/rows to delete, as sketched below. Cheers!
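For illustration only, a minimal sketch of such a Spark job (assuming the spark-cassandra-connector is on the classpath; keyspace, table, and column names are hypothetical). It scans the table once, in parallel across the cluster, and collects the full primary keys of the tenant's rows so that the actual deletes can then be issued as ordinary per-key statements:
import static org.apache.spark.sql.functions.col;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
public class TenantCleanup {
    public static void main(String[] args) {
        String tenantId = args[0];
        SparkSession spark = SparkSession.builder().appName("tenant-cleanup").getOrCreate();
        // One distributed scan of the table, instead of ALLOW FILTERING per query.
        Dataset<Row> tenantKeys = spark.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "my_keyspace") // hypothetical
                .option("table", "my_table")       // hypothetical
                .load()
                .filter(col("tenantid").equalTo(tenantId))
                .select("tenantid", "other_key");  // all columns of the primary key
        // Each row now carries a complete primary key; feed these into
        // per-key DELETE statements (e.g. via the connector or a CQL driver).
        tenantKeys.show();
    }
}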

JPA @Version behavior when data is changed from unmanaged connection

I enabled @Version on the Customer table and ran the tests below.
@Test
public void actionsTest1() throws InterruptedException {
    CustomerState t = customerStateRepository.findById(1L).get();
    Thread.sleep(20000);
    t.setInvoiceNumber("1");
    customerStateRepository.save(t);
}
While actionsTest1 is sleeping, I run actionsTest2, which updates the invoice number to 2.
@Test
public void actionsTest2() throws InterruptedException {
    CustomerState t = customerStateRepository.findById(1L).get();
    t.setInvoiceNumber("2");
    customerStateRepository.save(t);
}
When actionsTest1 returns from sleeping, it tries to update too and gets an ObjectOptimisticLockingFailureException.
This works as expected.
But if I run actionsTest1 and, while it is sleeping, open a SQL terminal and run a raw update:
update customer
set invoice_number='3' where id=1
then, when actionsTest1 returns from sleeping, the versioning mechanism doesn't catch the conflict and updates the value back to 1.
Is that expected behavior? Does versioning work only with connections managed by JPA?
It works as expected. If you do an update manually, you have to update the version as well.
When you use JPA with @Version, JPA increments the version column for you.
To get your expected result you have to write the statement like this:
update customer set invoice_number='3', version=version+1 where id=1
Is that expected behavior?
Yes.
Does versioning work only with connections managed by JPA?
No, it also works with any other way of updating your data. But everything that updates the data has to adhere to the rules of optimistic locking:
1) Increment the version column whenever performing any update.
2) (Only required when the other process also wants to detect concurrent updates.) On every update, check that the version number hasn't changed since the data on which the update is based was loaded.
Hibernate automatically increments/changes the value in the @Version mapped column in your database.
When you fetch an entity record, Hibernate keeps a copy of the data along with the value of @Version. When performing a merge or update operation, Hibernate checks whether the current value of the version column still matches the one fetched earlier.
If the value matches, the entity is not dirty (not updated by any other transaction); otherwise an exception is thrown.
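In other words, optimistic locking boils down to a guarded UPDATE. A sketch in plain JDBC terms (assuming a Spring JdbcTemplate and the customer table from the example above):
// expectedVersion is the version value read together with the entity.
int updated = jdbcTemplate.update(
        "update customer set invoice_number = ?, version = version + 1 " +
        "where id = ? and version = ?",
        newInvoiceNumber, customerId, expectedVersion);
if (updated == 0) {
    // The row changed (or disappeared) since it was read: concurrent update.
    throw new org.springframework.dao.OptimisticLockingFailureException(
            "customer " + customerId + " was modified concurrently");
}
The raw terminal update above does not increment version, so the guard in the later JPA update still matches and the conflict goes undetected.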

Unexpected behavior in Spring Batch partition when using synchronized

I am using Spring Batch partitioning to do parallel processing, with Hibernate and Spring Data JPA for the database. For the partition step, the reader, processor and writer are step-scoped, so I can inject the partition key and range (from-to) into them. Now in the processor I have one synchronized method and expected this method to run only once at a time, but that is not the case.
I set it up with 10 partitions; all 10 item readers read the correct partitioned range. The problem comes with the item processor. The code below has the same logic I use.
public class AccountProcessor implements ItemProcessor<Item, Item> {
    @Override
    public Item process(Item item) {
        createAccount(item);
        return item;
    }
    // An account has unique constraints on username, gender and email.
    /*
    When 1 thread executes this method, it will create 1 account
    and save it. If the next thread comes in and tries to save the same account,
    it should find the account created by the first thread and do an update.
    But that doesn't happen; instead findIfExist returns null
    and it tries to do another insert of duplicate data.
    */
    private synchronized void createAccount(Item item) {
        Account account = accountRepo.findIfExist(item.getUsername(), item.getGender(), item.getEmail());
        if (account == null) {
            // account doesn't exist yet
            account = new Account();
            account.setUsername(item.getUsername());
            account.setGender(item.getGender());
            account.setEmail(item.getEmail());
            account.setMoney(10000);
        } else {
            account.setMoney(account.getMoney() - 10);
        }
        accountRepo.save(account);
    }
}
The expected behavior is that only one thread runs this method at any given time, so that there is no duplicate insertion in the db, thereby avoiding DataIntegrityViolationException.
The actual result is that the second thread can't find the first account, tries to create a duplicate account and saves it to the db, which causes a DataIntegrityViolationException (unique constraint error).
Since I synchronized the method, threads should execute it in order: the second thread should wait for the first thread to finish and then run, which means it should be able to find the first account.
I tried many approaches, like a volatile set containing all unique accounts, calling saveAndFlush to commit as soon as possible, and using ThreadLocal; none of these works.
Need some help.
Since you made the item processor step-scoped, you don't really need synchronization, as each step will have its own instance of the processor.
But it looks like you have a design problem rather than an implementation issue. You are trying to synchronize threads to act in a certain order in a parallel setup. When you decide to go parallel, divide the data into partitions, and give each worker (either local or remote) a partition to work on, you must accept that these partitions will be processed in an undefined order and that there should be no relation between the records of each partition or between the work done by each worker.
When 1 thread executes that method, it will create 1 account and save it. If the next thread comes in and tries to save the same account, it should find the account created by the first thread and do an update. But that doesn't happen; instead findIfExist returns null and it tries to do another insert of duplicate data.
That's because the transaction of thread 1 may not be committed yet, hence thread 2 won't find the record you think has been inserted by thread 1.
It looks like you are trying to create or update some accounts with a partitioned setup. I'm not sure this setup is suitable for the problem at hand.
As a side note, I would not call accountRepo.save(account); in an item processor, but rather do that in an item writer, as sketched below.
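A minimal sketch of that suggestion (assuming Spring Batch 4's ItemWriter signature and the hypothetical Item, Account and AccountRepository types from the question); the find-or-save logic then runs inside the chunk transaction of the writer:
import java.util.List;
import org.springframework.batch.item.ItemWriter;
public class AccountWriter implements ItemWriter<Item> {
    private final AccountRepository accountRepo;
    public AccountWriter(AccountRepository accountRepo) {
        this.accountRepo = accountRepo;
    }
    @Override
    public void write(List<? extends Item> items) {
        for (Item item : items) {
            Account account = accountRepo.findIfExist(
                    item.getUsername(), item.getGender(), item.getEmail());
            if (account == null) {
                account = new Account();
                account.setUsername(item.getUsername());
                account.setGender(item.getGender());
                account.setEmail(item.getEmail());
                account.setMoney(10000);
            } else {
                account.setMoney(account.getMoney() - 10);
            }
            accountRepo.save(account);
        }
    }
}
Note that this alone does not serialize work across partitions; as explained above, records that must not be processed concurrently should not be split across partitions in the first place.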
Hope this helps.

How to use Hazelcast with MapStore

I am using Hazelcast as the caching solution for my application.
My application has a few inserts and updates to the database, and these need to be synced to the cache as well.
I want to use the MapStore functionality so that when I call IMap.put(), Hazelcast takes care of persisting the object in the underlying db and also updating its cache.
In the overridden store implementation, I want to call my DAO in the following way to persist the data:
public void store(Long key, Employee value) {
    log.info("Storing data for employee {} in the database using the MapStore", value);
    Long employeeId = employeeDao.create(value);
    value.setId(employeeId);
}
There are a few issues, listed below:
1) In the put call, I want to use the key as the employeeId, but this is generated only after the record is inserted in the db. So how do I put into the cache when I don't have the id? I want Hazelcast to use the id generated as part of the store method call (or in any other way) as the key of my object.
IMap.put(key, new Employee("name_of_employee", "age_of_employee"))
2) The MapStore implementation's store method returns void, so I cannot return the id generated for this object to the client. How can I achieve this?
I tried using MapEntryListeners on the map, but the entryAdded callback does not return the new object. I also added the PostProcessingMapStore interface to my MapStore but could not get the new value back to the client.
Please advise.
You have 2 options:
1) Generate the employeeId outside of the database. You can use the IdGenerator from Hazelcast to do this (see the sketch below).
2) If you must let the database generate the id, then you need to put the Employee in the cache manually AFTER it has been stored in the database.
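A minimal sketch of option 1 (assuming Hazelcast 4.x/5.x, where FlakeIdGenerator supersedes the older IdGenerator, and a hypothetical Employee type):
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.flakeidgen.FlakeIdGenerator;
import com.hazelcast.map.IMap;
public class EmployeeCacheExample {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        FlakeIdGenerator ids = hz.getFlakeIdGenerator("employee-ids");
        IMap<Long, Employee> employees = hz.getMap("employees");
        // The id exists before the database ever sees the record, so it can be
        // used both as the map key and as the primary key passed to the DAO.
        long employeeId = ids.newId();
        Employee employee = new Employee(employeeId, "John Doe", 30);
        employees.put(employeeId, employee); // MapStore.store() persists it
    }
}
This also sidesteps issue 2: the client already knows the id it put, so nothing needs to be returned from store().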

Double instances in database after using EntityManager.merge() in Transient Method

I am new to Spring. My application, developed with Spring Roo, has a cron job that every day downloads some files and updates a database.
The update is done, after downloading and parsing the files, using merge().
An entity class Dataset has a list called resources; after the download I do:
dataset.setResources(resources);
dataset.merge();
and dataset.merge() does the following:
@Transactional
public Dataset Dataset.merge() {
    if (this.entityManager == null) this.entityManager = entityManager();
    Dataset merged = this.entityManager.merge(this);
    this.entityManager.flush();
    return merged;
}
I expected that doing dataset.setResources(resources); would overwrite the field resources, and so the database entries would be overwritten as well.
But I get double entries in the database: every resource appears twice, with different (incremental) IDs.
How can I make my application do updates instead of inserts? A naive solution would be to delete the old resources manually and then call merge(); is this the way, or is there a smarter solution?
This situation occurs when you use Hibernate as the persistence engine and your entities have a version field.
Normally the ID field is all we need for merging a detached object with its persistent state in the database, but Hibernate also takes the version field into account, and if you don't set it (i.e. it is null), Hibernate discards the value of the ID field and creates a new object with a new ID.
To know whether you are affected by this strange feature of Hibernate, set a value in the version field: if an exception is thrown, you've got it. In that case the best fix is for the parsed data to contain the right value of the version. Other ways are to disable version checking (see the Hibernate reference guide) or to load the persistent state before merging.
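A sketch of that last option, loading the persistent state before merging (getId(), getVersion() and setVersion() are assumed accessors on the entity):
// Copy the version currently in the database onto the detached object, so
// Hibernate recognizes it as existing state instead of inserting a new row.
// The same must be done for each Resource in the resources list.
Dataset current = entityManager.find(Dataset.class, dataset.getId());
dataset.setVersion(current.getVersion());
Dataset merged = entityManager.merge(dataset);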
