Persist state information using a StateMachine in Spring Boot - spring-boot

I am working on a project where we use a state machine to implement a workflow. I am having some trouble getting comfortable with what was put in place, and I would like to see if there is a better design/implementation for my problem.
I will try to show what we have at the moment.
Please ignore process_agent at the moment, I would like to focus on process_state only for the beginning. I simply want to create a process and the state machine shall immediately transition from CREATED to ASSIGNED and persist that state in the Entity table (by default I would simply set the current user as the agent for the time being).
There is a table Entity with two columns of interest: process_agent and process_state
There are only three States for the moment, defined as Enums: CREATED, ASSIGNED and IN_PROCESS
There are only two Events at the moment, defined as Enums: ASSIGN_TO_AGENT and START_PROCESS
There is an endpoint in the controller for the creation of a process that simply hands over to the service:
// In the Controller
// mapper is a MapStruct mapper, it simply copies fields from view to entity and vice versa
ResponseEntity<EntityView> create(@RequestBody final EntityView entityView) {
final Entity createdEntity = service.create(entityView);
final EntityView createdEntityView = mapper.toView(createdEntity); //map the entity to its view
return status(CREATED).body(createdEntityView);
}
// In the Service
// mapper is a MapStruct mapper, it simply copies fields from view to entity and vice versa
// stateHandler is a custom class to handle an event, see below
Entity entity = new Entity();
mapper.updateFromView(entityView, entity);
entity.setInitState(CREATED);
final Message<Event> message = MessageBuilder.withPayload(Event.ASSIGN_TO_AGENT).setHeader("ENTITY_HEADER", entity).build();
stateHandler.handleEvent(message);
entity.setProcessAgent(...get the current user's id somehow...);
...
return entity;
StateHandler handles the event messaging. That is the part that I find difficult and feel I should question. One basically gets a state machine, resets it to the given state, and runs it in order to intercept a transition; once a transition is intercepted, the new target state is persisted to the entity's table:
// stateMachineFactory is auto wired into the state handler
// repository is auto wired in the state handler
public void handleEvent(final Message<Event> message) {
    final Entity entity = message.getHeaders().get("ENTITY_HEADER", Entity.class);
    final State currentState = entity.getProcessState();
    final StateMachine<State, Event> machine = stateMachineFactory.getStateMachine();
    // rehydrate the machine to the entity's current state
    machine.getStateMachineAccessor().doWithAllRegions(accessor -> accessor.resetStateMachine(
            new DefaultStateMachineContext<State, Event>(currentState, null, null, null, null)));
    // intercept each transition and persist the new target state
    machine.getStateMachineAccessor().doWithAllRegions(accessor -> accessor.addStateMachineInterceptor(
            new StateMachineInterceptorAdapter<State, Event>() {
                @Override
                public StateContext<State, Event> postTransition(final StateContext<State, Event> stateContext) {
                    final Entity entity1 = stateContext.getMessage().getHeaders().get("ENTITY_HEADER", Entity.class);
                    if (entity1 != null) {
                        entity1.setProcessState(stateContext.getTarget().getId());
                        repository.save(entity1);
                        return stateContext;
                    }
                    // if the entity is null then throw an exception
                    ... omitted exception handling
                }
            }));
    log.debug("Starting state machine to process [{}]", entity);
    machine.start();
    machine.sendEvent(message);
    machine.stop();
}
For completeness the following StateMachineConfig:
@Override
public void configure(final StateMachineConfigurationConfigurer<State, Event> config) throws Exception {
config.withConfiguration()
.autoStartup(false);
}
@Override
public void configure(final StateMachineStateConfigurer<State, Event> states) throws Exception {
states.withStates()
.initial(State.CREATED)
.states(EnumSet.allOf(State.class));
}
@Override
public void configure(final StateMachineTransitionConfigurer<State, Event> transitions) throws Exception {
transitions.withExternal()
.source(State.CREATED)
.target(State.ASSIGNED)
.event(Event.ASSIGN_TO_AGENT)
.and()
.withExternal()
.source(State.ASSIGNED)
.target(State.IN_PROCESS)
.event(Event.START_PROCESS);
}
I hope I have been as complete as possible. Please let me know if any clarifications are needed.
My question is: Is there a better design to implement this state machine, or is what you see here a reasonable approach?

I would guess your workflow is bound to "agent" - so an agent starts a workflow and you want to persist the state of the workflow per agent. Now I don't know if an agent can start multiple workflow instances and progress them in parallel, so the below suggestion might need to be re-adjusted for those cases.
The straightforward approach would be to have a SM instance per agent (and per workflow if multiple workflow instances are possible).
When an agent starts working with a workflow you must identify if it is a completely new workflow or an existing one in a particular state.
If it is a new workflow, return a new SM in the starting state and send the required event.
If it is an existing workflow, you need to create a SM, feed it the current workflow state, and do the necessary transitions upon SM initialization before returning it to the service caller. The state should have been previously persisted to a datastore.
I don't know your domain, so the state could be persisted as part of some Workflow entity or Agent entity or something else - depends on the app context.
There are different approaches as to who persists the state in the DB.
A) The SM can be responsible for this (e.g. upon receiving an event the SM will extract the necessary state information and context (e.g. DB entity ID) from the event and persist it in the DB for that Entity ID and then transition to the next state).
B) A Service XYZ that is "orchestrating" the SM can be responsible for this (e.g. Service XYZ calls "persist" on another repository service and, if that operation succeeds, sends the necessary event to the SM - the SM then only handles the transition to the next state).
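To make option B concrete, here is a minimal sketch of an orchestrating service, assuming hypothetical names (WorkflowService, Workflow, WorkflowRepository) alongside the State/Event enums and StateMachineFactory from the question; persistence happens first, and only on success is the event sent to the SM:
@Service
public class WorkflowService {

    private final WorkflowRepository workflowRepository;                 // hypothetical JPA repository
    private final StateMachineFactory<State, Event> stateMachineFactory; // factory from the question's config

    public WorkflowService(WorkflowRepository workflowRepository,
                           StateMachineFactory<State, Event> stateMachineFactory) {
        this.workflowRepository = workflowRepository;
        this.stateMachineFactory = stateMachineFactory;
    }

    public Workflow assignToAgent(Long workflowId, String agentId) {
        final Workflow workflow = workflowRepository.findById(workflowId).orElseThrow();

        // option B: persist the new state first; if this fails, no event is sent
        workflow.setProcessAgent(agentId);
        workflow.setProcessState(State.ASSIGNED);
        workflowRepository.save(workflow);

        // rehydrate a machine to the previous state and let it handle only the transition
        final StateMachine<State, Event> machine = stateMachineFactory.getStateMachine();
        machine.getStateMachineAccessor().doWithAllRegions(access -> access.resetStateMachine(
                new DefaultStateMachineContext<>(State.CREATED, null, null, null)));
        machine.start();
        machine.sendEvent(Event.ASSIGN_TO_AGENT);
        machine.stop();

        return workflow;
    }
}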

Related

Spring boot change connection schema dynamically inside transaction

In my Spring Boot application I need to read data from a specific schema and write to another one. To do so I followed this guide (https://github.com/spring-projects/spring-data-examples/tree/main/jpa/multitenant/schema) and used this answer (https://stackoverflow.com/a/47776205/10857151) to be able to change the schema used at runtime.
While this works fine inside a service without any transaction scope, it doesn't work in a more complex architecture (exception: session/EntityManager is closed) where a couple of services share a transaction to ensure rollback.
Below is a simple example of the architecture:
//simple jpa repository
private FirstRepository repository;
private SecondRepository secondRepository;
private Mapper mapper;
private SchemaUpdater schemaUpdater;
@Transactional
public void entrypoint(String idSource,String idTarget) {
//copy first object
firstCopyService(idSource, idTarget);
//copy second object
secondCopyService(idSource, idTarget);
}
@Transactional
public void firstCopyService(String idSource,String idTarget) {
//change schema to the source default
schemaUpdater.changeToSourceSchema();
Object obj=repository.get(idSource);
//convert obj before persist - set new id reference and other things
obj=mapper.prepareObjToPersist(obj,idTarget);
//change schema to the target default
schemaUpdater.changeToTargetSchema();
repository.saveAndFlush(obj);
}
@Transactional
public void secondCopyService(String idSource,String idTarget) {
schemaUpdater.changeToSourceSchema();
Object obj=secondRepository.get(idSource);
//convert obj before persist
obj=mapper.prepareObjToPersist(obj);
//change schema to the target default
schemaUpdater.changeToTargetSchema();
secondRepository.saveAndFlush(obj);
}
I need to know the best solution to ensure this dynamic switch while maintaining the transaction scope in each service, without causing problems related to restoring and cleaning the EntityManager session.
Thanks

Spring State Machine | Actions (Calling External API with data & pass data to another State)

I would like to use Action<S,E> to call an external API. How can I add more data into this Action in order to invoke the external API? Another question: what if I want to send back the response (pass data to another State)?
What is the best way to add more data? I'm trying to find an alternative to using the context (which I know is possible, but very ugly with key-value pairs).
Calling an external API is the same as executing any other code; you can wire any executable code into your action. This includes autowiring a Service or Gateway and retrieving the data you need.
Regarding the second question, in my company we use the extended state (context) to expose data. Before we release the state machine, we get the data out of it and serialise it into a response object using an object mapper.
Here is a snippet for illustration
@Configuration
@RequiredArgsConstructor
public class YourAction implements Action<States, Events> {
private final YourService service;
@Override
public void execute(final StateContext<States, Events> context) {
//getting input data examples
final Long yourIdFromHeaders = context.getMessageHeaders().get(key, Long.class);
final Long yourIdFromContext = context.getExtendedState().get(key, Long.class);
//calling service
final var responseData = service.getData(yourIdFromContext);
//storing results
context.getExtendedState().getVariables().put("response", responseData);
}
}
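To illustrate the second part (serialising the extended state into a response before releasing the machine), here is a minimal sketch; the "response" key matches the action above, while ResponseDto is an assumed class and the Jackson ObjectMapper call is just one way to do the mapping:
public ResponseDto extractResponse(final StateMachine<States, Events> machine) {
    // read back what YourAction stored under the "response" key
    final Object raw = machine.getExtendedState().getVariables().get("response");
    // serialise the extended-state payload into a response object (ResponseDto is hypothetical)
    return new ObjectMapper().convertValue(raw, ResponseDto.class);
}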

Spring-Boot: scalability of a component

I am trying out Spring Boot and am thinking about scalability.
Let's say I have a component that does a job (e.g. checking for new mails).
It is done by a scheduled method, e.g.:
@Component
public class MailMan
{
@Scheduled(fixedRateString = "5000")
private void run () throws Exception
{ //... }
}
Now the application gets a new customer. So the job has to be done twice.
How can I scale this component to exist or run twice?
Interesting question, but why multiple components per customer? Can the scheduler not pull the data for every customer on each scheduled run and process the records for each customer? Your component scaling should not be decided based on the entities involved in your application but on the resource utilization of the component. You can have dedicated component types for processing messages from queues and the same for REST. Scale them based on how much each of them is being utilized.
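A minimal sketch of that idea, assuming hypothetical CustomerRepository and MailChecker beans standing in for your own customer lookup and per-customer mail logic; one scheduled run covers every customer:
@Component
public class MailMan {

    private final CustomerRepository customerRepository; // hypothetical lookup of all customers
    private final MailChecker mailChecker;               // hypothetical per-customer mail logic

    public MailMan(CustomerRepository customerRepository, MailChecker mailChecker) {
        this.customerRepository = customerRepository;
        this.mailChecker = mailChecker;
    }

    @Scheduled(fixedRateString = "5000")
    public void run() {
        // one scheduled run processes every customer instead of one component per customer
        customerRepository.findAll().forEach(mailChecker::checkNewMails);
    }
}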
Instead of using annotations to schedule a task, you could do the same thing programmatically by using a ScheduledTaskRegistrar. You can register the same bean multiple times, even if it is a singleton.
public class SomeSchedulingConfigurer implements SchedulingConfigurer {
private final SomeJob someJob; // a bean that is Runnable
public SomeSchedulingConfigurer(SomeJob someJob) {
this.someJob = someJob;
}
@Override
public void configureTasks(@NonNull ScheduledTaskRegistrar taskRegistrar) {
int concurrency = 2;
IntStream.range(0, concurrency).forEach(
__ -> taskRegistrar.addFixedDelayTask(someJob, 5000));
}
}
Make sure the thread executor you are using is large enough to process the number of jobs concurrently. The default executor has exactly one thread :-). Be aware that this approach has scaling limits.
I also recommend adding a delay or skew between jobs, so that not all jobs run at exactly the same moment.
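For example, here is a sketch of giving the registrar a bigger scheduler inside the same configureTasks method (the pool size of 4 is an arbitrary example value):
@Override
public void configureTasks(@NonNull ScheduledTaskRegistrar taskRegistrar) {
    // replace the single default thread with a small pool
    ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
    scheduler.setPoolSize(4);
    scheduler.initialize();
    taskRegistrar.setScheduler(scheduler);

    // register the jobs as in the snippet above
    IntStream.range(0, 2).forEach(__ -> taskRegistrar.addFixedDelayTask(someJob, 5000));
}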
See SchedulingConfigurer and ScheduledTaskRegistrar for reference.
The job needs to run only once even with multiple customers. The component itself doesn't need to scale at all. It is just a mechanism to "signal" that some logic needs to be run at some moment in time. I would keep the component really thin and just call the desired business logic that handles all the rest, e.g.:
@Component
public class MailMan {
@Autowired
private NewMailCollector newMailCollector;
@Scheduled(fixedRateString = "5000")
private void run () throws Exception {
// Collects emails for customers
newMailCollector.collect();
}
}
If you want to check for new e-mails per customer, you might want to avoid using scheduled tasks in a backend service, as that will make the implementation very inflexible.
Better to make an endpoint available for clients to call to trigger that logic.
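A minimal sketch of such an endpoint, assuming hypothetical names (MailController, a NewMailCollector.collect(customerId) overload); the client or an external scheduler decides when, and for which customer, the check runs:
@RestController
public class MailController {

    private final NewMailCollector newMailCollector;

    public MailController(NewMailCollector newMailCollector) {
        this.newMailCollector = newMailCollector;
    }

    // trigger a mail check for one customer on demand
    @PostMapping("/customers/{customerId}/mail-check")
    public ResponseEntity<Void> triggerMailCheck(@PathVariable String customerId) {
        newMailCollector.collect(customerId);
        return ResponseEntity.accepted().build();
    }
}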

Set Current State of Spring StateMachine

I am starting to use Spring Statemachine and I am having some trouble managing the state of my objects.
My state machine is of type StateMachine<ShipmentState, ShipmentEvent>.
My business object, Shipment, has an enum property (state) of type ShipmentState, which should hold the state-machine state of the episode. Here is my desired workflow:
1. Load a Shipment from the database.
2. Set the current state of the Statemachine from the ShipmentState held in that Shipment instance.
3. Send an event to the Statemachine.
4. Get the resultant state from the Statemachine (post event) and set the ShipmentState in my Shipment instance.
5. Save the Shipment instance.
The problem is: How do I set the current state of an existing StateMachine?
My current approach is this one: For every event, create a new StateMachine instance (using a StateMachineBuilder) specifying the initial state according to a Shipment instance. For example:
@Service
public class StateMachineServiceImpl implements IStateMachineService {
@Autowired
private IShipmentService shipmentService;
@Override
public StateMachine<ShipmentState, ShipmentEvent> getShipmentStateMachine(Shipment aShipment) throws Exception {
Builder<ShipmentState, ShipmentEvent> builder = StateMachineBuilder.builder();
builder.configureStates().withStates()
.state(ShipmentState.S1)
.state(ShipmentState.S2)
.state(ShipmentState.S3)
.initial(shipmentService.getState())
.end(ShipmentState.S4);
builder.configureTransitions().withExternal().source(ShipmentState.S1).target(ShipmentState.S1)
.event(ShipmentEvent.S3).action(shipmentService.updateAction()).and().withExternal()
.source(ShipmentState.S1).target(ShipmentState.S2).event(ShipmentEvent.S3)
.action(shipmentService.finalizeAction()).and().withExternal().source(ShipmentState.S3)
.target(ShipmentState.S4).action(shipmentService.closeAction()).event(ShipmentEvent.S5);
return builder.build();
}
}
What do you think of my approach?
There is no issue with the approach. You can reset the state machine to a particular state using the code below.
stateMachine.getStateMachineAccessor().doWithAllRegions(access -> access
.resetStateMachine(new DefaultStateMachineContext<>(state, null, null,null)));
You can pass the arguments to the DefaultStateMachineContext according to your use case.
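Putting it together, here is a minimal sketch of the full workflow around that reset (load, rehydrate, send event, copy the resulting state back, save); the ShipmentRepository, the factory bean, and the getState/setState accessor names are assumptions for illustration:
private final StateMachineFactory<ShipmentState, ShipmentEvent> stateMachineFactory; // assumed factory bean
private final ShipmentRepository shipmentRepository;                                 // assumed JPA repository

public Shipment fire(final Shipment shipment, final ShipmentEvent event) {
    final StateMachine<ShipmentState, ShipmentEvent> machine = stateMachineFactory.getStateMachine();
    // rehydrate the machine to the shipment's persisted state instead of rebuilding it per event
    machine.getStateMachineAccessor().doWithAllRegions(access -> access
            .resetStateMachine(new DefaultStateMachineContext<>(shipment.getState(), null, null, null)));
    machine.start();
    machine.sendEvent(event);
    // copy the resulting state back onto the entity and persist it
    shipment.setState(machine.getState().getId());
    shipmentRepository.save(shipment);
    machine.stop();
    return shipment;
}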

Spring Data Solr @Transactional Commits

I currently have a setup where data is inserted into a database as well as indexed into Solr. These two steps are wrapped in a Spring-managed transaction via the @Transactional annotation. What I've noticed is that spring-data-solr issues an update with the following parameters whenever the transaction is closed: params{commit=true&softCommit=false&waitSearcher=true}
@Transactional
public void save(Object toSave){
dbRepository.save(toSave);
solrRepository.save(toSave);
}
The rate of commits into Solr is fairly high, so ideally I'd like to send data to the Solr index and have Solr auto-commit at regular intervals. I have autoCommit (and autoSoftCommit) set in my solrconfig.xml, but since spring-data-solr is sending those commit parameters, it does a hard commit every time.
I'm aware that I can drop down to the SolrTemplate API and issue commits manually, I would like to keep the solr repository.save call within a spring-managed transaction if possible. Is there a way to modify the parameters that are sent to solr on commit?
After putting in an IDE debug breakpoint in org.springframework.data.solr.repository.support.SimpleSolrRepository here:
private void commitIfTransactionSynchronisationIsInactive() {
if (!TransactionSynchronizationManager.isSynchronizationActive()) {
this.solrOperations.commit(solrCollectionName);
}
}
I discovered that wrapping my code in @Transactional (and the other details needed to actually have the framework begin/end the code as a transaction) doesn't achieve what we expect with "Spring Data for Apache Solr". The stacktrace shows the Proxy and Transaction Interceptor classes for our code's transactional scope, but then it also shows the framework starting its own nested transaction with another Proxy and Transaction Interceptor of its own. When the framework exits the CrudRepository.save() method my code calls, the commit to Solr is done by the framework's nested transaction. It happens before our outer transaction is exited. So the attempt to batch-process many saves with one commit at the end, instead of one commit for every save, is futile. It seems, for this area of my code, I'll have to use SolrJ to save (update) my entities to Solr and then follow "my" transaction's exit with a commit.
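A minimal sketch of that SolrJ-based workaround, assuming an injected SolrClient, a "foo" collection, and that Foo carries SolrJ @Field mappings; the single commit is issued only after the whole batch, i.e. after "my" transaction has finished:
@Service
public class FooIndexer {

    private final SolrClient solrClient; // plain SolrJ client, not the Spring Data repository

    public FooIndexer(SolrClient solrClient) {
        this.solrClient = solrClient;
    }

    public void reindex(List<Foo> foos) throws IOException, SolrServerException {
        // add all documents without committing each one
        solrClient.addBeans("foo", foos);
        // one explicit commit for the whole batch
        solrClient.commit("foo");
    }
}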
If using Spring Solr, I found using the SolrTemplate bean allows you to 'batch' updates when adding data to the Solr index. By using the bean for SolrTemplate, you can use "addBeans" method, which will add a collection to the index and not commit until the end of the transaction. In my case, I started out using solrClient.add() and taking up to 4 hours for my collection to get saved to the index by iterating over it, as it commits after every single save. By using solrTemplate.addBeans(Collect<?>), it finishes in just over 1 second, as the commit is on the entire collection. Here is a code snippet:
@Resource
SolrTemplate solrTemplate;
public void doReindexing(List<Image> images) {
if (images != null) {
/* CMSSolrImage is a class with @SolrDocument mappings.
* the List<Image> images is a collection pulled from my database
* I want indexed in Solr.
*/
List<CMSSolrImage> sImages = new ArrayList<CMSSolrImage>();
for (Image image : images) {
CMSSolrImage sImage = new CMSSolrImage(image);
sImages.add(sImage);
}
solrTemplate.saveBeans(sImages);
}
}
The way I've done something similar is to create a custom repository implementation of the save methods.
Interface for the repository:
public interface FooRepository extends SolrCrudRepository<Foo, String>, FooRepositoryCustom {
}
Interface for the custom overrides:
public interface FooRepositoryCustom {
public Foo save(Foo entity);
public Iterable<Foo> save(Iterable<Foo> entities);
}
Implementation of the custom overrides:
public class FooRepositoryImpl implements FooRepositoryCustom {
private SolrOperations solrOperations;
public FooRepositoryImpl(SolrOperations fooSolrOperations) {
this.solrOperations = fooSolrOperations;
}
@Override
public Foo save(Foo entity) {
Assert.notNull(entity, "Cannot save 'null' entity.");
registerTransactionSynchronisationIfSynchronisationActive();
this.solrOperations.saveBean(entity, 1000);
commitIfTransactionSynchronisationIsInactive();
return entity;
}
@Override
public Iterable<Foo> save(Iterable<Foo> entities) {
Assert.notNull(entities, "Cannot insert 'null' as a List.");
if (!(entities instanceof Collection<?>)) {
throw new InvalidDataAccessApiUsageException("Entities have to be inside a collection");
}
registerTransactionSynchronisationIfSynchronisationActive();
this.solrOperations.saveBeans((Collection<? extends Foo>) entities, 1000);
commitIfTransactionSynchronisationIsInactive();
return entities;
}
private void registerTransactionSynchronisationIfSynchronisationActive() {
if (TransactionSynchronizationManager.isSynchronizationActive()) {
registerTransactionSynchronisationAdapter();
}
}
private void registerTransactionSynchronisationAdapter() {
TransactionSynchronizationManager.registerSynchronization(SolrTransactionSynchronizationAdapterBuilder
.forOperations(this.solrOperations).withDefaultBehaviour());
}
private void commitIfTransactionSynchronisationIsInactive() {
if (!TransactionSynchronizationManager.isSynchronizationActive()) {
this.solrOperations.commit();
}
}
}
and you also need to provide a SolrOperations bean for the right solr core:
@Configuration
public class FooSolrConfig {
@Bean
public SolrOperations getFooSolrOperations(SolrClient solrClient) {
return new SolrTemplate(solrClient, "foo");
}
}
Footnote: auto commit is (to my mind) conceptually incompatible with a transaction. An auto commit is a promise from Solr that it will try to start writing the data to disk within a certain time limit. Many things might stop that from actually happening, however - a timely power or hardware failure, errors between the document and the schema, etc. But the client won't know that Solr failed to keep its promise, and the transaction will see a success when it actually failed.

Resources