Spring Statemachine Forks

I have made good progress with state machines up to now. My most recent problem arose when I wanted to use a fork (I'm using UML). The fork didn't work as it is supposed to, and I think it's because of persistence: I persist my machine in Redis. Refer to the image below.
This is my top-level machine, where Manage-commands is a submachine reference and the top region is as it is.
Now say I persisted some state from the lower region in Redis, and next an ONLINE event comes; the machine does not accept the event, clearly because I have asked the machine to restore its state from Redis with a given key.
But I want both regions to be persisted so that either one is selected according to the event.
Is there any way to achieve this?
Below is how I persist and restore:
private void feedMachine(StateMachine<String, String> stateMachine, String user, GenericMessage<String> event)
        throws Exception {
    stateMachine.sendEvent(event);
    System.out.println("persist machine --- > state :" + stateMachine.getState().toString());
    redisStateMachinePersister.persist(stateMachine, "testprefixSw:" + user);
}

private StateMachine<String, String> resetStateMachineFromStore(StateMachine<String, String> stateMachine,
        String user) throws Exception {
    StateMachine<String, String> machine = redisStateMachinePersister.restore(stateMachine, "testprefixSw:" + user);
    System.out.println("restore machine --- > state :" + machine.getState().toString());
    return machine;
}

It's a bit weird, as I found some other persistence issues which I fixed in 1.2.x. Probably not related to your issues, but I would have expected you to see similar errors. Anyway, could you check RedisPersistTests.java and see if there's anything different from what you're doing. I haven't yet tried submachine refs, but it should not make any difference from a persistence point of view.

Multithreaded Use of Spring Pulsar

I am working on a project to read from our existing ElasticSearch instance and produce messages in Pulsar. If I do this in a highly multithreaded way without any explicit synchronization, I get many occurrences of the following log line:
Message with sequence id X might be a duplicate but cannot be determined at this time.
That is produced from this line of code in the Pulsar Java client:
https://github.com/apache/pulsar/blob/a4c3034f52f857ae0f4daf5d366ea9e578133bc2/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java#L653
When I add a synchronized block to my method, synchronizing on the pulsar template, the error disappears, but my publish rate drops substantially.
Here is the current working implementation of my method that sends Protobuf messages to Pulsar:
public <T extends GeneratedMessageV3> CompletableFuture<MessageId> persist(T o) {
    var descriptor = o.getDescriptorForType();
    PulsarPersistTopicSettings settings = pulsarPersistConfig.getSettings(descriptor);
    MessageBuilder<T> messageBuilder = Optional.ofNullable(pulsarPersistConfig.getMessageBuilder(descriptor))
            .orElse(DefaultMessageBuilder.DEFAULT_MESSAGE_BUILDER);
    Optional<ProducerBuilderCustomizer<T>> producerBuilderCustomizerOpt =
            Optional.ofNullable(pulsarPersistConfig.getProducerBuilder(descriptor));

    PulsarOperations.SendMessageBuilder<T> sendMessageBuilder;
    sendMessageBuilder = pulsarTemplate.newMessage(o)
            .withSchema(Schema.PROTOBUF_NATIVE(o.getClass()))
            .withTopic(settings.getTopic());
    producerBuilderCustomizerOpt.ifPresent(sendMessageBuilder::withProducerCustomizer);
    sendMessageBuilder.withMessageCustomizer(mb -> messageBuilder.applyMessageBuilderKeys(o, mb));

    synchronized (pulsarTemplate) {
        try {
            return sendMessageBuilder.sendAsync();
        } catch (PulsarClientException re) {
            throw new PulsarPersistException(re);
        }
    }
}
The original version of the above method did not have the synchronized(pulsarTemplate) { ... } block. It performed faster, but generated a lot of logs about duplicate messages, which I knew to be incorrect. Adding the synchronized block got rid of the log messages, but slowed down publishing.
What are the best practices for multithreaded access to the PulsarTemplate? Is there a better way to achieve very high throughput message publishing?
Should I look at using the reactive client instead?
EDIT: I've updated the code block to show the minimum synchronization necessary to avoid the log lines, which is just synchronizing during the .sendAsync(...) call.
Your usage without the synchronized block should work. I will look into it to see if there's anything else going on. In the meantime, it would be great if you could give the Reactive client a try.
This issue was initially tracked here, and the final resolution was that it was an issue that has been resolved in Pulsar 2.11.
Please try updating to Pulsar 2.11.
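For completeness, if you do want to experiment with the reactive client suggested above, a minimal sketch could look like the following. This assumes a ReactivePulsarTemplate bean from spring-pulsar-reactive; the builder method names are assumed to mirror the imperative template and may differ between versions, so verify against the API you are on:

// Hedged sketch: reactive send without any external synchronization.
// ReactivePulsarTemplate and its builder methods are assumed; check your spring-pulsar-reactive version.
public <T extends GeneratedMessageV3> Mono<MessageId> persistReactive(T o) {
    PulsarPersistTopicSettings settings = pulsarPersistConfig.getSettings(o.getDescriptorForType());
    return reactivePulsarTemplate.newMessage(o)
            .withSchema(Schema.PROTOBUF_NATIVE(o.getClass()))
            .withTopic(settings.getTopic())
            .send(); // returns Mono<MessageId>
}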

How to use spring state machine with nested state machine

Good Day,
I just started learning Spring State Machine.
I have the following questions.
I would like to know how to configure a state machine that uses a nested state machine.
How can this be done programmatically, i.e. via the state machine builder?
How can this be done via Papyrus UML?
My second question is about how to fire events, i.e. upon getting to the state that has the nested state machine, how can events trigger transitions in the nested state machine?
My third question is how to exit a nested state machine by firing an event that moves from the parent state (i.e. the state that references the nested state machine)
to another state in the parent state machine.
I would really appreciate a reference to some examples.
After studying the javadoc and reading a few links
https://github.com/spring-projects/spring-statemachine/issues/121
I figured it out.
Programmatically
Configure the states and transitions for the parent state machine as usual.
https://www.baeldung.com/spring-state-machine
Follow that link to see how.
For states that reference a nested state machine, see the snippet below:

builder.configureStates()
    .withStates()
        .initial("contactList2")
        .state("newContactSM", newContactSM())
        .end("end1");
The state "newContactSM" references a nested state machine. The nested state machine
is define
public StateMachine<String, String> newContactSM() throws Exception {
    logger.info(" ------ newContactSM() -------- ");
    // checkCurrentFlow();
    Builder<String, String> builder = StateMachineBuilder.builder();
    builder.configureConfiguration().withConfiguration().machineId("newContactBTF");
    logger.info(" configure states ..");
    builder.configureStates()
        .withStates()
            .initial("newContact")
            .end("end2")
            .states(new HashSet<String>(Arrays.asList("otherContact"))); // e.g. Arrays.asList("S1", "S2", "S3")
    logger.info(" states configured ! ");
    // ... remaining configuration (transitions), then build and return the machine
}
To do it via UML, just ensure that you reference the nested state machine in the state "newContactSM".
Once the setup is done, you can fire events as normal; Spring State Machine handles the rest.
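For illustration, a minimal sketch of firing events into the machine built above; the event names ("ADD_CONTACT", "SAVE_CONTACT") are hypothetical placeholders and not part of the original configuration (on Spring Statemachine 3.x the reactive sendEvent variants are preferred):

StateMachine<String, String> machine = builder.build();
machine.start();
// Hypothetical event names: the first drives the parent transition into "newContactSM",
// the second is routed to the nested machine once its submachine state is active.
machine.sendEvent("ADD_CONTACT");
machine.sendEvent("SAVE_CONTACT");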

Recover Hikaricp after OutOfMemoryError

I have a very specific scenario: during the execution of a query, specifically while fetching rows from the DB into my result set, I get an OutOfMemoryError.
The code is simple as it:
public interface MyRepository extends Repository<MyEntity, Long> {

    @EntityGraph(value = "MyBigEntityGraphToFetchAllCollections", type = EntityGraphType.FETCH)
    @QueryHints({@QueryHint(name = "org.hibernate.readOnly", value = "true")})
    MyEntity getOneById(Long id);
}

public class MyService {
    ...
    public MyEntity someMethodCalledInLoop(Long id) {
        try {
            return repository.getOneById(id);
        } catch (OutOfMemoryError error) {
            // Here the connection is closed. How to reset HikariCP?
            System.gc();
            return null;
        }
    }
}
It seems weird that a getOne consumes all the memory, but due to eager fetching of about 80 collections and the resulting multiplication of rows, some cases are unsupportable.
I know I have the option to lazily load the collections, but I don't want to. Hitting the database 1+N times on every load takes more time than my application has. It's a batch process over millions of records, and less than 0.001% of them have this impact on memory. So my strategy is just to discard these few records and process the next ones.
Just after catching the OutOfMemoryError the memory is freed; the troublesome entity becomes garbage. But due to this Error, HikariCP closes (or is forced to close) the connection.
On the next call of the method, HikariCP still gives me a closed connection. It seems that, due to the lack of memory, HikariCP didn't finish the previous transaction correctly and is stuck in this state forever.
My intention, now, is to reset or recover HikariCP. I don't need to care about other threads using the pool.
So, after all, my simple question is: how do I programmatically restart or recover HikariCP to its initial state, without rebooting the application?
Thanks a lot to anyone who reads this.
Try adding this to your Hibernate configuration:
<property name="hibernate.hikari.connectionTestQuery">select 1</property>
This way HikariCP will test that the connection is still alive before giving it to Hibernate.
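If you also want to recover the pool programmatically after such an error, one option (a hedged sketch, assuming the DataSource bean in use is the underlying HikariDataSource) is to soft-evict the current connections so the pool replaces them on demand:

import javax.sql.DataSource;
import com.zaxxer.hikari.HikariDataSource;

// Hedged sketch: mark the pool's connections for eviction; HikariCP creates
// fresh ones as needed. Assumes the injected DataSource is a HikariDataSource.
public void recoverPool(DataSource dataSource) {
    if (dataSource instanceof HikariDataSource) {
        HikariDataSource hikari = (HikariDataSource) dataSource;
        hikari.getHikariPoolMXBean().softEvictConnections();
    }
}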
Nothing has worked so far.
I minimized the problem by adding a query hint to the method:

@QueryHints({@QueryHint(name = "org.hibernate.timeout", value = "10")})
MyEntity getOneById(Long id);

99% of the result sets are fetched in one second or less, but sometimes the result set is so big that it takes longer. This way JDBC stops fetching the results before the memory gets compromised.

Kafka stream state store rocksdb file size not decreasing on manual deletion of messages

I am using the Processor API to delete messages from a state store. The delete works successfully; I confirmed it via interactive queries on the state store by Kafka key, but it does not reduce the Kafka Streams file size on local disk under the directory tmp/kafka-streams.
@Override
public void init(ProcessorContext processorContext) {
    this.processorContext = processorContext;
    processorContext.schedule(Duration.ofSeconds(10), PunctuationType.STREAM_TIME, new Punctuator() {
        @Override
        public void punctuate(long l) {
            processorContext.commit();
        }
    }); // invoke punctuate every 10 seconds
    this.statestore = (KeyValueStore<String, GenericRecord>) processorContext.getStateStore(StateStoreEnum.HEADER.getStateStore());
    log.info("Processor initialized");
}

@Override
public void process(String key, GenericRecord value) {
    statestore.all().forEachRemaining(keyValue -> {
        statestore.delete(keyValue.key);
    });
}
kafka streams directory size
2.3M /private/tmp/kafka-streams
3.3M /private/tmp/kafka-streams
Do I need any specific configuration so that it keeps the file size under control? If it doesn't work this way, is it okay to delete the kafka-streams directory? I assume it should be safe, since such a delete would remove the records from both the state store and the changelog topic.
RocksDB does file compaction in the background. Hence, if you need more aggressive compaction, you should pass in a custom RocksDBConfigSetter via the Streams config parameter rocksdb.config.setter. For more details about RocksDB, check out the RocksDB documentation.
https://docs.confluent.io/current/streams/developer-guide/config-streams.html#rocksdb-config-setter
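As a rough illustration only (a minimal sketch; the chosen option is an example, not a tuning recommendation), such a config setter might look like this:

import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.CompactionStyle;
import org.rocksdb.Options;

// Minimal sketch of a RocksDBConfigSetter; values are illustrative only.
public class CustomRocksDBConfig implements RocksDBConfigSetter {

    @Override
    public void setConfig(String storeName, Options options, Map<String, Object> configs) {
        // Example knob: level-style compaction tends to reclaim space from deleted keys sooner.
        options.setCompactionStyle(CompactionStyle.LEVEL);
    }

    @Override
    public void close(String storeName, Options options) {
        // Release any caches/filters created in setConfig; nothing was created here.
    }
}

It would then be registered through the Streams configuration, e.g. props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CustomRocksDBConfig.class);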
However, I would not recommend changing RocksDB configs as long as there is no real issue -- you can do more harm than good. It seems your store size is quite small, thus I don't see a real problem atm.
Btw: if you go to production, you should change the state.dir config to an appropriate directory so that the state is not lost even after a machine restart. If you put state into the default /tmp location, the state is most likely gone after the machine restarts, and an expensive recovery from the changelog topics would be triggered.
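For example (the path below is only a placeholder for a durable directory):

// Point state.dir at a durable location instead of the default under /tmp
Properties props = new Properties();
props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");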

NHibernate ArgumentOutOfRangeException

I recently ran into an instance where I wanted to hit the database from a Task I have running periodically within a web application. I refactored the code to use the ThreadStaticSessionContext so that I could get a session without an HttpContext. This works fine for reads, but when I try to flush an update from the Task, I get the "Index was out of range. Must be non-negative and less than the size of the collection." error. Normally what I see for this error has to do with using a column name twice in the mapping, but that doesn't seem to be the issue here, as I'm able to update that table if the session is associated with a request (and I looked and I'm not seeing any duplicates). It's only when the Task tries to flush that I get the exception.
Does anyone know why it would work fine from a request, but not from a call from a Task?
Could it be because the Task is asynchronous?
Call Stack:
at System.ThrowHelper.ThrowArgumentOutOfRangeException()
at System.Collections.Generic.List`1.System.Collections.IList.get_Item(Int32 index)
at NHibernate.Engine.ActionQueue.ExecuteActions(IList list)
at NHibernate.Engine.ActionQueue.ExecuteActions()
at NHibernate.Event.Default.AbstractFlushingEventListener.PerformExecutions(IEventSource session)
at NHibernate.Event.Default.DefaultFlushEventListener.OnFlush(FlushEvent event)
at NHibernate.Impl.SessionImpl.Flush()
Session Generation:
internal static ISession CurrentSession {
get {
if(HasSession) return Initializer.SessionFactory.GetCurrentSession();
ISession session = Initializer.SessionFactory.OpenSession();
session.BeginTransaction();
CurrentSessionContext.Bind(session);
return session;
}
}
private static bool HasSession {
get { return CurrentSessionContext.HasBind(Initializer.SessionFactory); }
}
Task that I want to access the database from:
_maid = Task.Factory.StartNew(async () => {
    while(true) {
        if(CleaningSession != null) CleaningSession(Instance, new CleaningSessionEventArgs { Session = UnitOfWorkProvider.CurrentSession });
        UnitOfWorkProvider.TransactionManager.Commit();
        await Task.Delay(AppSettings.TempPollingInterval, _paycheck.Token);
    }
    //I know this function never returns, I'm using the cancellation token for that
    // ReSharper disable once FunctionNeverReturns
}, _paycheck.Token);
_maid.GetAwaiter().OnCompleted(() => _maid.Dispose());
Edit: Quick clarification about some of the types above. CleaningSession is an event that is fired to run the various things that need to be done, and _paycheck is the CancellationTokenSource for the Task.
Edit 2: Oh yeah, and this is using NHibernate version 4.0.0.4000
Edit 3: I have since attempted this using a Timer, with the same results.
Edit 4: From what I can see of the source, it's doing a foreach loop on an IList. Questions pertaining to an IndexOutOfRangeException in a foreach loop tend to suggest a concurrency issue. I still don't see how that would be an issue, unless I misunderstand the purpose of ThreadStaticSessionContext.
Edit 5: I thought it might be because of requests bouncing around between threads, so I tried creating a new SessionContext that combines the logic of the WebSessionContext and ThreadStaticSessionContext. Still getting the issue, though...
Edit 6: It seems this has something to do with a listener I have set up to update some audit fields on entities just before they're saved. If I don't run it, the commit occurs properly. Would it be better to do this through an event other than OnPreInsert, or use an interceptor instead?
After muddling through, I found out exactly where the problem was. Basically, there was a query, run from inside the PreUpdate event in my listener, that loaded the current user record.
I came across two solutions to this. I could cache the user in memory, avoiding the query but risking stale data (not that anything other than the id matters here). Alternatively, I could open a temporary stateless session and use that to look up the user in question.
