Spring Boot: how to handle fault tolerance in an @Async method?

Suppose I have a caller to distribute work to multiple async tasks:
public class Caller {
    public boolean run() {
        for (int i = 0; i < 100; i++) {
            worker.asyncFindOrCreate(entities[i]);
        }
        return true;
    }
}

public class Worker {
    @Autowired
    Dao dao;

    @Async
    public E asyncFindOrCreate(User entity) {
        return dao.findByName(entity.getName()).orElseGet(() -> dao.save(entity));
    }
}
If we have 2 identical entities:
with the synchronized method, the first one is created and the second one is then retrieved from the existing entity;
with async, the second entity might pass the findByName check and go on to save because the first entity hasn't been saved yet, which causes the save of the second entity to throw a unique-constraint error.
Is there a way to add a fault-tolerance mechanism with features like retry and skipAfterRetry, in particular for database operations?

In this special case you should convert your array to a map, using the name property as the key, so there will be no duplicate entries.
However, if this method can also be called by multiple threads (i.e. it runs in a web server) or there are multiple instances running, it is still not fail-safe.
In general, you should let the database check uniqueness; there is no safer or easier way to do that. Put the save method inside a try-catch block and check/handle the unique-constraint exception.
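A minimal sketch of that try-catch approach, assuming a Spring Data repository whose save translates the duplicate-key error into Spring's DataIntegrityViolationException (the helper name is mine, and a retry library such as Spring Retry could wrap the same logic to get retry/skipAfterRetry semantics):
@Async
public E asyncFindOrCreate(User entity) {
    return dao.findByName(entity.getName())
              .orElseGet(() -> saveOrRecover(entity));
}

// Hypothetical helper: try to save, and on a duplicate-key failure fall back
// to reading the row that a competing thread has just created.
private E saveOrRecover(User entity) {
    try {
        return dao.save(entity);
    } catch (DataIntegrityViolationException e) {
        return dao.findByName(entity.getName()).orElseThrow(() -> e);
    }
}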

Related

Multiple writers for different types in the same Spring Batch step

I am writing a Spring Batch application with the following workflow:
Read some items of type A (using a FlatFileItemReader<A>).
Process an item, transforming it from A to B.
Write the processed items of type B (using a JdbcBatchItemWriter<B>)
Finally, I need to call an external service (a RESTful API, but it could be a SimpleMailMessageItemWriter<A>) using data from the source type A.
How can I configure such a workflow?
So far, I have found the following workaround:
Configuring a CompositeItemWriter<B> which delegates to:
The actual ItemWriter<B>
A custom ItemWriter<B> implementation which converts B back to A and then writes an A
But this is a cumbersome solution because it forces me to either:
Duplicate processing logic: from A to B and back again.
Sneakily hide some attributes from the source object A inside B, polluting the domain model.
Note: since my custom item writer for A needs to invoke an external service, I would like to perform this operation after B has been successfully written.
Here are the relevant parts of the batch configuration code.
@Bean
public Step step(StepBuilderFactory steps, ItemReader<A> reader, ItemProcessor<A, B> processor, CompositeItemWriter<B> writer) {
    return steps.get("step")
            .<A, B>chunk(10)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}

@Bean
public CompositeItemWriter<B> writer(JdbcBatchItemWriter<B> jdbcBatchItemWriter, CustomItemWriter<B, A> customItemWriter) {
    return new CompositeItemWriterBuilder<B>()
            .delegates(jdbcBatchItemWriter, customItemWriter)
            .build();
}
For your use case, I would encapsulate A and B in a wrapper type, such as AB:
class AB {
    private A originalItem;
    private B transformedItem;
}
With that, you would have an ItemReader<A>, an ItemProcessor<A, AB> and an ItemWriter<AB>. The processor creates instances of AB in which it keeps a reference to the original item. The writer can then access both types and delegate to the JdbcBatchItemWriter<B> and SimpleMailMessageItemWriter<A> as needed, something like:
class ABItemWriter implements ItemWriter<AB> {

    private JdbcBatchItemWriter<B> jdbcBatchItemWriter;
    private SimpleMailMessageItemWriter mailMessageItemWriter;

    // constructor with delegates

    @Override
    public void write(List<? extends AB> items) throws Exception {
        jdbcBatchItemWriter.write(getBs(items));
        mailMessageItemWriter.write(getAs(items)); // this would not be called if the jdbc writer fails
    }
}
The methods getAs and getBs would extract items of type A/B from AB. Encapsulation for the win! BTW, a Java record is a good option for type AB.
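For illustration, a hypothetical completion of those pieces (the record requires Java 16+; the accessor names originalItem/transformedItem and the stream-based helpers are my own, not from the original answer):
// A compact immutable carrier for both representations.
record AB(A originalItem, B transformedItem) {}

// Writer helpers that split a chunk back into its two views.
private List<B> getBs(List<? extends AB> items) {
    return items.stream().map(AB::transformedItem).collect(Collectors.toList());
}

private List<A> getAs(List<? extends AB> items) {
    return items.stream().map(AB::originalItem).collect(Collectors.toList());
}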

Why is Spring Boot @Async dropping items in my List argument?

I am experiencing some sort of threading issue with the @Async method annotation: one argument contains a List of enums, and items are being dropped from it. The list is very small, just 2 items. The dropping of items is not immediate; it sometimes takes hours or days to appear.
This is the general flow of our program:
A Controller generates the said List in its @RequestMapping method and passes the list to a Service class, which makes a call to a database for batching and triggers an event for each item from the database, passing the list along. This list eventually gets passed into an @Async method, which then drops either the first item or both items.
Controller.methodA()
-> Creates list with two items in it
-> Calls void Service.methodX(list)
-> Load batch from database
-> Iterate over batch
-> Print items from list --- list intact
-> Calls void AsyncService.asyncMethod(list)
-> Print items from list --- eventually drops items here always the first item, sometimes both.
Code configuration and bare-bones sample:
We configured it to have 2 threads:
@Configuration
@EnableAsync
public class AsyncConfig implements AsyncConfigurer {

    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
        threadPoolTaskExecutor.setMaxPoolSize(5);  // Never actually creates 5 threads
        threadPoolTaskExecutor.setCorePoolSize(2); // Only 2 threads are ever created
        threadPoolTaskExecutor.initialize();
        return threadPoolTaskExecutor;
    }
}
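(As an aside, the behavior noted in those comments is expected: a ThreadPoolTaskExecutor only grows past its core size once its queue is full, and the default queue capacity is unbounded. A sketch of a configuration that could actually reach 5 threads, with an illustrative queue size:)
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(2);
executor.setMaxPoolSize(5);
executor.setQueueCapacity(100); // with a bounded queue, threads 3 to 5 are created once it fills
executor.initialize();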
Here is a local replica that tries to trigger the core issue, but with no luck:
@RestController
public class ThreadingController {

    private final ThreadingService threadingService;

    public ThreadingController(ThreadingService threadingService) {
        this.threadingService = threadingService;
    }

    @GetMapping("/test")
    public void testThreads() {
        List<SomeEnum> list = new ArrayList<>();
        list.add(SomeEnum.FIRST_ENUM);
        list.add(SomeEnum.SECOND_ENUM);
        for (int i = 0; i < 1000; i++) {
            this.threadingService.workSomeThreads(i, list);
        }
    }
}

public enum SomeEnum {
    FIRST_ENUM("FIRST_ENUM"),
    SECOND_ENUM("SECOND_ENUM");

    @Getter
    private String name;

    SomeEnum(String name) {
        this.name = name;
    }
}

@Slf4j
@Service
public class ThreadingService {

    @Async
    public void workSomeThreads(int i, List<SomeEnum> list) {
        try {
            Thread.sleep(100L); // Add some delay to slow things down, to trigger GC or other tests during processing
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        log.info("Count {} ; Here are the list items: {}", i, list.toString());
        assert list.size() == 2;
    }
}
Looking through this, I have one controller simulating both the Controller and Service mentioned earlier. It spins through a batch of data, sending the same list over and over. There's an async method in another class to verify that the list stays the same. I was not able to replicate the issue locally, but this is the core problem.
To my knowledge, Java passes object references by value, so every method call receives its own copy of the reference to the same list in memory; I don't think this would cause us to run out of memory. We are running in PCF and don't see any memory spikes during this time; memory is constant at around 50%. I also tried using a CopyOnWriteArrayList (which is thread-safe) instead of ArrayList, and the problem still exists.
Questions:
Any idea why the @Async method would drop items in the method argument? The list is never modified after construction, so why would items disappear? Why would the first item always disappear, and not the second? Why would both disappear?
Edit: This question had little to do with @Async in the end. I found deeply nested code that removed items from the list, causing items to go missing.
Strictly speaking, Java is pass-by-value, but the value passed for an object is a copy of the reference, so your @Async method operates on the very same list object. The change in your list must therefore be caused by some other code modifying it while the threads are executing; there is no other way the object could change its values.
You should investigate the code around the section below to identify what is modifying the list:
-> Print items from list --- eventually drops items here, always the first item, sometimes both.
-> code following this might be changing the list.
Since the AsyncService executes its code asynchronously, some other code can modify the list in the meantime.
You may as well make the method parameters final, although note that final only prevents reassignment, not mutation of the list's contents.
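If the list cannot be made immutable at its source, a defensive copy at the async boundary is a cheap guard. A minimal sketch (List.copyOf requires JDK 10+; on older JDKs, new ArrayList<>(list) serves the same purpose here):
@Async
public void workSomeThreads(int i, List<SomeEnum> list) {
    // Snapshot the argument at call time; later mutation by the caller
    // or by deeply nested code can no longer affect this execution.
    List<SomeEnum> snapshot = List.copyOf(list);
    log.info("Count {} ; Here are the list items: {}", i, snapshot);
}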

@Transactional does not work for multiple inner methods of the same class, or for methods called on other classes

I have a method annotated with @Transactional which internally calls multiple inner methods of the same class; those inner methods may or may not call methods of other service classes. When the external service class's methods are called, rollback works for one method but not for another method of the same external service class. Can anyone help me here?
@Transactional
public void processPayments(PaymentRequest request) {
    request.getDetails().forEach(payment -> {
        method1(payment);
    });
    // When doSomething1() succeeds, the method below is called next.
    // When this API call fails, it rolls back properly, and the calling process is
    // exactly the same. How come this one rolls back while doSomething1() does not?
    externalService.doSomething2();
}

private void method1(PaymentDetails details) {
    details.getDetails().forEach(detailedPayment -> {
        method1_1(detailedPayment);
    });
    task3();
}

private void method1_1(DetailedPayment detailedPayment) {
    roundPayment();
    task1();
    task2();
}

private void roundPayment() {
}

private SomeObject task1(SomeObject object) {
    // update object based on some if conditions
    repository.save(object);
}

private SomeObject task2() {
    // update object based on some if conditions
    repository.save(object);
}

private SomeObject task3() {
    // repository.save(updateSomeObject(someObject));
    // externalService.doSomething1(double val1, double val2);
    // doSomething1 lives in another service, which in turn uses a further service that
    // calls the external system via RestTemplate. If the HTTP status is anything other
    // than 200, I throw an ExternalAPICallException, which should roll back the full
    // transaction starting from processPayments, but it does not roll back.
}

private void updateSomeObject(SomeObject object) {
    // update object based on a few if conditions
}
Can anyone help me here? I would also like to understand the proper use of @Transactional: multiple inner methods of the same class, inner methods of other classes called by the proxied class, and so on.
The only structural difference between the calls to doSomething1 and doSomething2 is that the first one is called from inside a lambda (which is passed to a stream, whose implementation could be asynchronous in some fancy way).
What happens if you refactor your code this way:
@Transactional
public void processPayments(PaymentRequest request) {
    for (PaymentDetails details : request.getDetails()) {
        method1(details);
    }
    externalService.doSomething2();
}
(If it works, refactor the other method too; there is a good chance that task1 and task2 won't roll back either, because details.getDetails() has a similar implementation.)
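One more thing worth ruling out, since the exception type isn't shown in the question: by default @Transactional rolls back only on unchecked exceptions (RuntimeException and Error). If ExternalAPICallException is a checked exception, the rollback has to be requested explicitly, for example:
// Hypothetical: only needed if ExternalAPICallException extends Exception
// rather than RuntimeException.
@Transactional(rollbackFor = ExternalAPICallException.class)
public void processPayments(PaymentRequest request) {
    // ...
}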

Spring Data Solr @Transactional Commits

I currently have a setup where data is inserted into a database as well as indexed into Solr, with these two steps wrapped in a Spring-managed transaction via the @Transactional annotation. What I've noticed is that spring-data-solr issues an update with the following parameters whenever the transaction is closed: params{commit=true&softCommit=false&waitSearcher=true}
@Transactional
public void save(Object toSave) {
    dbRepository.save(toSave);
    solrRepository.save(toSave);
}
The rate of commits into Solr is fairly high, so ideally I'd like to send data to the Solr index and have Solr auto-commit at regular intervals. I have autoCommit (and autoSoftCommit) set in my solrconfig.xml, but since spring-data-solr is sending those commit parameters, it does a hard commit every time.
I'm aware that I can drop down to the SolrTemplate API and issue commits manually, but I would like to keep the solrRepository.save call within a Spring-managed transaction if possible. Is there a way to modify the parameters that are sent to Solr on commit?
After putting in an IDE debug breakpoint in org.springframework.data.solr.repository.support.SimpleSolrRepository here:
private void commitIfTransactionSynchronisationIsInactive() {
    if (!TransactionSynchronizationManager.isSynchronizationActive()) {
        this.solrOperations.commit(solrCollectionName);
    }
}
I discovered that wrapping my code in @Transactional (plus the other details needed to actually let the framework begin/end the code as a transaction) doesn't achieve what we expect with Spring Data for Apache Solr. The stack trace shows the proxy and transaction interceptor classes for our code's transactional scope, but then it also shows the framework starting its own nested transaction, with another proxy and transaction interceptor of its own. When the framework exits the CrudRepository.save() method my code calls, the commit to Solr is performed by the framework's nested transaction, before our outer transaction exits. So the attempt to batch-process many saves with one commit at the end, instead of one commit per save, is futile. It seems that, for this area of my code, I'll have to use SolrJ to save (update) my entities to Solr and then follow "my" transaction's exit with a commit.
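A minimal SolrJ sketch of that plan (the collection name and entity type are illustrative, and this assumes beans mapped for SolrJ's DocumentObjectBinder):
// Buffer all updates on the Solr side, then issue a single hard commit
// once "my" transaction has completed.
public void saveAll(SolrClient solrClient, List<MyEntity> entities)
        throws SolrServerException, IOException {
    solrClient.addBeans("myCollection", entities); // no commit yet, so nothing is visible
    solrClient.commit("myCollection");             // one commit for the whole batch
}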
If using Spring Data Solr, I found that the SolrTemplate bean allows you to 'batch' updates when adding data to the Solr index. Through the SolrTemplate bean you can use the saveBeans method, which adds a whole collection to the index and does not commit until the end of the transaction. In my case, I started out using solrClient.add(), and it took up to 4 hours for my collection to get saved to the index by iterating over it, as it commits after every single save. By using solrTemplate.saveBeans(Collection<?>), it finishes in just over 1 second, as the commit applies to the entire collection. Here is a code snippet:
@Resource
SolrTemplate solrTemplate;

public void doReindexing(List<Image> images) {
    if (images != null) {
        /* CMSSolrImage is a class with @SolrDocument mappings.
         * The List<Image> images is a collection pulled from my database
         * that I want indexed in Solr.
         */
        List<CMSSolrImage> sImages = new ArrayList<CMSSolrImage>();
        for (Image image : images) {
            CMSSolrImage sImage = new CMSSolrImage(image);
            sImages.add(sImage);
        }
        solrTemplate.saveBeans(sImages);
    }
}
The way I've done something similar is to create a custom repository implementation of the save methods.
Interface for the repository:
public interface FooRepository extends SolrCrudRepository<Foo, String>, FooRepositoryCustom {
}
Interface for the custom overrides:
public interface FooRepositoryCustom {
    Foo save(Foo entity);
    Iterable<Foo> save(Iterable<Foo> entities);
}
Implementation of the custom overrides:
public class FooRepositoryImpl implements FooRepositoryCustom {

    private SolrOperations solrOperations;

    public FooRepositoryImpl(SolrOperations fooSolrOperations) {
        this.solrOperations = fooSolrOperations;
    }

    @Override
    public Foo save(Foo entity) {
        Assert.notNull(entity, "Cannot save 'null' entity.");
        registerTransactionSynchronisationIfSynchronisationActive();
        this.solrOperations.saveBean(entity, 1000);
        commitIfTransactionSynchronisationIsInactive();
        return entity;
    }

    @Override
    public Iterable<Foo> save(Iterable<Foo> entities) {
        Assert.notNull(entities, "Cannot insert 'null' as a List.");
        if (!(entities instanceof Collection<?>)) {
            throw new InvalidDataAccessApiUsageException("Entities have to be inside a collection");
        }
        registerTransactionSynchronisationIfSynchronisationActive();
        this.solrOperations.saveBeans((Collection<? extends Foo>) entities, 1000);
        commitIfTransactionSynchronisationIsInactive();
        return entities;
    }

    private void registerTransactionSynchronisationIfSynchronisationActive() {
        if (TransactionSynchronizationManager.isSynchronizationActive()) {
            registerTransactionSynchronisationAdapter();
        }
    }

    private void registerTransactionSynchronisationAdapter() {
        TransactionSynchronizationManager.registerSynchronization(SolrTransactionSynchronizationAdapterBuilder
                .forOperations(this.solrOperations).withDefaultBehaviour());
    }

    private void commitIfTransactionSynchronisationIsInactive() {
        if (!TransactionSynchronizationManager.isSynchronizationActive()) {
            this.solrOperations.commit();
        }
    }
}
and you also need to provide a SolrOperations bean for the right solr core:
@Configuration
public class FooSolrConfig {

    @Bean
    public SolrOperations getFooSolrOperations(SolrClient solrClient) {
        return new SolrTemplate(solrClient, "foo");
    }
}
Footnote: auto commit is (to my mind) conceptually incompatible with a transaction. An auto commit is a promise from Solr that it will try to start writing the data to disk within a certain time limit. Many things might still stop that from actually happening: a badly timed power or hardware failure, a mismatch between the document and the schema, and so on. But the client won't know that Solr failed to keep its promise, and the transaction will report success when it actually failed.

Spring Data Rest and collections with unique constraints

I'm evaluating spring-data-rest and am running into a situation where the magic no longer appears to be working in my favor.
Say I have a collection of items.
Parent - 1:M - Child
Parent
    Long id
    String foo
    String bar

    @OneToMany(...)
    @JoinColumn(name = "parent_id", referencedColumnName = "id", nullable = false)
    Collection<Child> items

    setItems(items) {
        this.items.clear();
        this.items.addAll(items);
    }

@Table(name = "items", uniqueConstraints = {@UniqueConstraint(columnNames = {"parent_id", "ordinal"})})
Child
    Long id
    String foo
    Integer ordinal
The database has a constraint that children of the same parent can't have conflicting values in one particular field, 'ordinal'.
I want to PATCH to the parent entity, overwriting the collection of children. The problem comes from Hibernate's default behavior: Hibernate doesn't flush the deletions from clearing the collection before it inserts the new items. This violates the constraint, even though the eventual state would not.
Cannot insert duplicate key row in object 'schema.parent_items' with unique index 'ix_parent_items_id_ordinal'
I have tried mapping this constraint onto the child entity using @UniqueConstraint, but this doesn't appear to change the behavior.
I am currently working around this by manually looking at the current items and updating the ones that would cause the constraint violation with the new values.
Am I missing something? This seems like a fairly common use case, but maybe I'm trying too hard to shoehorn Hibernate into a legacy database design. I'd love to be able to make things work against our current data without having to modify the schema.
I see that I can write a custom controller and service, à la https://github.com/olivergierke/spring-restbucks, and this would let me handle the entityManager and flush in between. The problem I see going that way is that it seems that I lose the entire benefit of using spring-data-rest in the first place, which solves 99% of my problems with almost no code. Is there somewhere that I can shim in a custom handler for this operation without rewriting all the other operations I get for free?
Here is my way of customizing Spring Data REST (I should discuss it with the Spring Data REST team):
Consider an exposed repository UserRepository on /users/. You then have at least the following API:
...
/users/{id} GET
/users/{id} DELETE
...
Now suppose you want to override /users/{id} DELETE but keep the other endpoints handled by Spring Data REST.
The natural approach (again, in my opinion) is to write your own UserController (and your custom UserService), as follows:
@RestController
@RequestMapping("/users")
public class UserController {

    @Inject
    private UserService userService;

    @ResponseStatus(value = HttpStatus.NO_CONTENT)
    @RequestMapping(method = RequestMethod.DELETE, value = "/{user}")
    public void delete(@Valid @PathVariable("user") User user) {
        if (!user.isActive()) {
            throw new UserNotFoundException(user);
        }
        user.setActive(false);
        userService.save(user);
    }
}
But by doing this, the mapping /users will now be handled by org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerMapping instead of org.springframework.data.rest.webmvc.RepositoryRestHandlerMapping.
And if you look at the method handleNoMatch of org.springframework.web.servlet.mvc.method.RequestMappingInfoHandlerMapping (the parent of org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerMapping), you can see the following:
else if (patternAndMethodMatches.isEmpty() && !allowedMethods.isEmpty()) {
    throw new HttpRequestMethodNotSupportedException(request.getMethod(), allowedMethods);
}
patternAndMethodMatches.isEmpty(): returns TRUE if the url and method (GET, POST, ...) do not match.
So if you ask for /users/{id} GET it will be TRUE, because GET on that url now only exists on the Spring Data REST exposed repository controller.
!allowedMethods.isEmpty(): returns TRUE if at least one method (GET, POST or something else) matches the given url.
And again this is TRUE for /users/{id} GET, because /users/{id} DELETE exists.
So Spring will throw an HttpRequestMethodNotSupportedException.
To bypass this problem, I created my own HandlerMapping with the following logic:
The HandlerMapping holds a list of HandlerMappings (here RequestMappingInfoHandlerMapping and RepositoryRestHandlerMapping).
The HandlerMapping loops over this list and delegates the request. If an exception occurs we keep it (only the first exception, in fact) and continue with the remaining handlers. At the end, if every handler in the list threw an exception, we rethrow the first (previously kept) exception.
Moreover, we implement org.springframework.core.Ordered in order to place this handler before org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerMapping.
import org.springframework.core.Ordered;
import org.springframework.util.Assert;
import org.springframework.web.servlet.HandlerExecutionChain;
import org.springframework.web.servlet.HandlerMapping;

import javax.servlet.http.HttpServletRequest;
import java.util.List;

/**
 * @author Thibaud Lepretre
 */
public class OrderedOverridingHandlerMapping implements HandlerMapping, Ordered {

    private List<HandlerMapping> handlers;

    public OrderedOverridingHandlerMapping(List<HandlerMapping> handlers) {
        Assert.notNull(handlers);
        this.handlers = handlers;
    }

    @Override
    public HandlerExecutionChain getHandler(HttpServletRequest request) throws Exception {
        Exception firstException = null;
        for (HandlerMapping handler : handlers) {
            try {
                return handler.getHandler(request);
            } catch (Exception e) {
                if (firstException == null) {
                    firstException = e;
                }
            }
        }
        if (firstException != null) {
            throw firstException;
        }
        return null;
    }

    @Override
    public int getOrder() {
        return -1;
    }
}
Now let's create our bean:
@Inject
@Bean
@ConditionalOnWebApplication
public HandlerMapping orderedOverridingHandlerMapping(HandlerMapping requestMappingHandlerMapping,
                                                      HandlerMapping repositoryExporterHandlerMapping) {
    List<HandlerMapping> handlers = Arrays.asList(requestMappingHandlerMapping, repositoryExporterHandlerMapping);
    return new OrderedOverridingHandlerMapping(handlers);
}
Et voilà.
