Producer consumer via spring application events - producer-consumer

I'm trying to implement the actor model pattern (somewhat mashed together with producer-consumer) using Spring's application events and ThreadPoolExecutors.
My main objective is to decouple each of the layers.
My architecture is as follows:
I have a WAR deployed that receives requests for business transactions through a REST API. At any given moment there can be X transactions alive,
where X is a configurable number. The actual execution must be asynchronous, and each transaction must run in a different thread.
The requests themselves are treated in a FIFO manner, but there is some complexity to it: some requests must wait for others to complete before they can be processed, which does not mean other requests can't be processed in the meantime. E.g., don't process a withdrawal from account 2 if a deposit to account 2 was requested before it. So if I get hits for:
where the numbers are account numbers, I want to process them in this order:
I've built the architecture this way:
I have a REST API that gets the hits, writes them to the DB (it is a distributed system that has to keep its state in the DB), and publishes
a clientrequestevent in the application context.
I have a singleton bean that is in charge of publishing the producer events and monitoring how many events it has sent (i.e., it is in charge of limiting the number of concurrent processes and implementing the above logic),
and I have a few other listeners, one per action (withdrawal, deposit, etc.), that listen to the events published by the latter and publish a done event.
Everything works great: everything is done in different threads and it all flows nicely, but I have
a problem with the middle layer, the one in charge of determining whether or not there is a free slot.
I don't want to have a synchronous method, nor do I want to do tricks around an AtomicLong or something like that. I would rather use some BlockingQueue to publish the events, but I can't find a nice way to determine when an event is done so I can put a new one in.
The biggest problem is that to request new work I have to go to the DB, which is a heavy task, and this system should work under heavy load.
I would like to somehow utilize a BlockingQueue and a thread pool, so that threads take from a size-bounded queue the minute a slot is free.
What would be a good way to handle this?
Thanks in advance
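
One possible shape for the "only go to the DB when a slot is free" requirement, as a minimal sketch rather than a definitive answer: guard a fixed-size thread pool with a counting semaphore, so the dispatching thread blocks until a worker finishes and only then performs the expensive lookup. The class BoundedDispatcher, its methods, and the fetchNext supplier (standing in for the DB query) are hypothetical names, and the per-account ordering rule is not modelled here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: at most maxConcurrent tasks are in flight at any time.
final class BoundedDispatcher {
    private final ExecutorService workers;
    private final Semaphore slots;

    BoundedDispatcher(int maxConcurrent) {
        this.workers = Executors.newFixedThreadPool(maxConcurrent);
        this.slots = new Semaphore(maxConcurrent);
    }

    // Waits for a free slot, then asks fetchNext (assumed to be the DB lookup)
    // for the next unit of work and hands it to a worker thread.
    void dispatchNext(Supplier<Runnable> fetchNext) {
        slots.acquireUninterruptibly();           // block until a slot is free
        boolean handedOff = false;
        try {
            Runnable task = fetchNext.get();      // pay for the lookup only now
            if (task != null) {
                workers.execute(() -> {
                    try {
                        task.run();
                    } finally {
                        slots.release();          // the "done" signal frees the slot
                    }
                });
                handedOff = true;
            }
        } finally {
            if (!handedOff) {
                slots.release();                  // nothing to run, or submit failed
            }
        }
    }

    // Stops accepting work and waits for in-flight tasks to finish.
    void shutdownAndWait() {
        workers.shutdown();
        try {
            workers.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch the release in the finally block plays the role of the "done" event: the dispatcher does not need to listen for completion explicitly, because a finished task is what unblocks the next dispatchNext call.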


Microservice Event driven communication - how to notify the caller only on command / event approach

I wonder, how do you avoid notifying all services that consume the same event? For example, service A and service B both consume event X. Based on some rules, you want to send event X only to service A. I am not talking about consumer groups (Kafka) or even a correlation ID. I am using event-driven microservices with the command & event approach.
I think it's quite easy by using Kafka partitions/partition-key.
For example: In your topic X, you could just create it with many partitions. For each invocation, every service must specify its key, so based on the key, Kafka will do the rest of the job. So every time Service A sends a command, the consumer Service (the one who handles the command) will send the event with the same key. So in the end, Service A (the producer of the command) will receive the event on its own partition and will be the only service receiving it. So based on the Command/Event approach it may work.
On the other hand, by doing so, you are limiting one of the main benefits of partition which is allowing the scalability.
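
To illustrate the key-to-partition idea, here is a simplified sketch. Kafka's default partitioner hashes the serialized key with murmur2; the hash used below is only a stand-in, and the class and method names are hypothetical. The point is just that the same key always maps to the same partition.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of keyed partitioning (not Kafka's actual hash).
final class KeyPartitioning {
    // Same key and partition count always yield the same partition index.
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, numPartitions); // floorMod keeps the result non-negative
    }
}
```

Note that two different keys can still land on the same partition, so under this assumption a service would probably still need to check the key (or be assigned a dedicated partition explicitly) before treating an event as its own.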

Best way to track/trace a JSON Object (a time series data) as it flows through a system of microservices on a IOT platform

We are working on an IoT platform, which ingests many device parameter
values (time series) every second from many devices. Once ingested,
each JSON (a batch of multiple parameter values captured at a particular
instant) flows through many downstream microservices in an event-driven
way. What is the best way to track the JSON as it does so?
We use spring boot technology predominantly and all the services are
E.g.: Option 1 - Is associating a UUID with each object and then updating
the states idempotently in Redis as each microservice processes it
ideal? The problem is that each microservice will now be tied to Redis, and we
have seen Redis performance go down as the number of API calls to Redis
increases, since it is single-threaded (we can scale this out, though).
Option 2 - Zipkin?
Note: We use Kafka/RabbitMQ to process the messages in a distributed
way as you mentioned here. My question is about a strategy to track
each of these messages and its status (to enable replay if needed to
attain only once delivery). Let's say message1 is being processed
by Service A, Service B, Service C. We are now having trouble tracking
whether the message failed to be processed at Service B or Service C, as
we get a lot of messages.
A better approach would be to use Kafka instead of Redis.
Create a topic for every microservice and keep moving the packet from
one topic to another after processing.
topic(raw-data) - |MS One| - topic(processed-data-1) - |MS Two| - topic(processed-data-2) ... etc
Keep appending the results to the same object and keep moving it down the line, until every microservice has processed it.
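
The "append results to the same object" idea can be sketched in plain Java as follows. This is an in-memory illustration only: in the real setup each stage would read from one Kafka topic and write to the next. The names PipelineTrace, Packet, Stage, and the ":OK"/":FAILED" trail format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory model of a packet carrying its own processing trail.
final class PipelineTrace {
    static final class Packet {
        final String id;
        final List<String> trail = new ArrayList<>(); // one entry per stage visited
        Packet(String id) { this.id = id; }
    }

    static final class Stage {
        final String name;
        final Predicate<Packet> work; // returns false (or throws) on failure
        Stage(String name, Predicate<Packet> work) {
            this.name = name;
            this.work = work;
        }
    }

    // Runs the stages in order, recording each outcome on the packet itself,
    // and stops at the first failure so the trail shows where it got stuck.
    static Packet run(Packet packet, List<Stage> stages) {
        for (Stage stage : stages) {
            boolean ok;
            try {
                ok = stage.work.test(packet);
            } catch (RuntimeException e) {
                ok = false;
            }
            packet.trail.add(stage.name + (ok ? ":OK" : ":FAILED"));
            if (!ok) {
                break;
            }
        }
        return packet;
    }
}
```

With a trail like this, the last entry tells you which service to replay from, without a shared status store.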

CQRS - out of order messages

Suppose we have 3 different services producing events, each of them publishing to its own event store.
Each of these services consumes the events of the other producer services.
This is because each service has to process another service's events AND create its own projection. Each service runs on multiple instances.
The most straightforward way to do it (for me) was to put "something" in front of each ES that picks up events and publishes (pub/sub) them to the queues of every other service.
This is perfect because every service can subscribe to each topic it likes while the event publisher does the job, and if a service is unavailable, events are still delivered. This seems to me to guarantee high scalability and availability.
My problem is the queue. I can't get an easily scalable queue that guarantees ordering of the messages. It actually guarantees "slightly out of order" with at-least once delivery: to be clear, it's AWS SQS.
So, the ordering problems are:
No order guaranteed across events from the same event stream.
No order guaranteed across events from the same ES.
No order guaranteed across events from different ES (different services).
I thought I could solve the first two problems just by keeping track of the "sequence number" of the events coming from the same ES.
This would be done by tracking the last sequence number of each topic from which we are consuming events.
This should be easy for reacting to events and also for building our projection.
Then, when I pop an event from the queue, if eventSequenceNumber > previousAppliedEventSequenceNumber + 1, I re-enqueue it (or make it invisible for a certain time).
But it turns out that this solution destroys performance when events are produced at high rates (I can use a visibility timeout or other tricks; the result should be the same).
This is because when I'm expecting event 10 and I ignore event 11 for a moment, I should also ignore all events (from that ES) with sequence numbers after event 11, until event 11 shows up again and is actually processed.
Other difficulties were:
where to keep track of the event's sequence number when building the projection.
how to keep track of the event's sequence number when building the projection so that, when applying it, I have a consistent lastSequenceNumber.
What am I missing?
P.S.: for the third problem, think of the following scenario. We have a UserService and a CartService. The CartService has a projection that, for each user, keeps track of the products in the cart. Each cart's projection must also have the user's name and other info coming from the UserCreated event published by the UserService. If UserCreated comes after ProductAddedToCart the normal flow requires to throw an exception because the user doesn't exist yet.
What am I missing?
You are missing flow -- consumers pull messages from sources, rather than having sources push the messages to the consumers.
When I wake up, I check my bookmark to find out which of your messages I read last, and then ask you if there have been any since. If there have, I retrieve them from you in order (think "document message"), also writing down the new bookmarks. Then I go back to sleep.
The primary purpose of push notifications is to interrupt the sleep period (thereby reducing latency).
With SQS acting as a queue, the idea is that you read all of the enqueued messages at once. If there are no gaps, then you can order the collection then start processing them and acking them. If there are gaps, you either wait (leaving the messages in the queue) or you go to the event store to fetch copies of the missing messages.
There's no magic -- if the message pipeline is promising "at least once" delivery, then the consumers must take steps to recognize duplicate messages as they arrive.
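
A minimal sketch of that consumer-side bookkeeping, assuming each event carries a per-stream sequence number: buffer whatever arrives, drop duplicates, and release events only once they are contiguous with the bookmark. The class SequenceBuffer and its methods are hypothetical names, and the event payload is reduced to a String for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical reorder buffer for one event stream.
final class SequenceBuffer {
    private long lastApplied;                                   // the "bookmark"
    private final TreeMap<Long, String> pending = new TreeMap<>();

    SequenceBuffer(long lastApplied) {
        this.lastApplied = lastApplied;
    }

    // Accepts events in any order and returns, in sequence order, the events
    // that have become safe to apply. Duplicates are silently ignored.
    List<String> offer(long seq, String event) {
        List<String> ready = new ArrayList<>();
        if (seq <= lastApplied) {
            return ready;                                       // already applied
        }
        pending.putIfAbsent(seq, event);                        // duplicate while waiting
        while (!pending.isEmpty() && pending.firstKey() == lastApplied + 1) {
            ready.add(pending.pollFirstEntry().getValue());
            lastApplied++;
        }
        return ready;
    }

    // True while a gap blocks progress; the missing event would then be
    // awaited or fetched from the event store.
    boolean hasGap() {
        return !pending.isEmpty();
    }

    long lastApplied() {
        return lastApplied;
    }
}
```

For example, with the bookmark at 9, offering event 11 returns nothing, and offering event 10 afterwards returns events 10 and 11 in order; event 11 never has to be re-enqueued.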
If UserCreated comes after ProductAddedToCart the normal flow requires to throw an exception because the user doesn't exist yet.
Review Race Conditions Don't Exist, by Udi Dahan: "A microsecond difference in timing shouldn’t make a difference to core business behaviors."
The basic issue is assuming we can get messages IN ORDER...
This is a fallacy in distributed computing...
I suggest you design for no message ordering in your system.
As for your issues, try and use UTC time in the message body/header created by the originator and try and work around this data point. Sequence numbers are going to fail unless you have a central deterministic sequence creator (which will be a non-scalable, single point of failure).
Using Sagas/State machine is a path that can help to make sense of (business) events ordering.

How to handle side effects based on multiple events in a message driven microservice system?

We are currently working in a message-driven microservice environment, and some of our messages/events are event sourced (using Apache Kafka). Now we are struggling with implementing more complex business requirements, where we have to take multiple events into account to create new events and side effects.
In the current situation we are working with devices that can produce errors and we already process them and have a single topic which contains ERROR_OCCURRED and ERROR_RESOLVED events (so they are in order). We also make sure, that all messages regarding a specific device always go onto the same partition. And both messages share an ID that identifies that specific error incident. We already have a projection that consumes those events and provides an API for our customers, s.t. they can see all occurred errors and their current state.
Now we have to deal with the following requirement:
Reporting Errors
We need a push system that reports errors of devices to our external partners, but only after 15 minutes and if they have not been resolved in that timeframe. Our first approach was to consume all ERROR_RESOLVED events, store the IDs and have another consumer that is handling the ERROR_OCCURRED events in a delayed fashion (e.g. by only consuming the next ERROR_OCCURRED event on the topic if its timestamp is at least 15 minutes old). We would then be able to know if that particular error has already been resolved and does not need to be reported (since they share a common ID with the corresponding ERROR_RESOLVED event). Otherwise we send an HTTP request to our external partner and create an ERROR_REPORTED event on a new topic. Is there any better approach for delayed and conditional message processing?
We also have to take the following special use cases into account:
Service restarts: currently we are planning to keep the list of resolved errors in memory, so if a service restarts, that list has to be created from scratch. We could just replay the ERROR_RESOLVED messages, but that may take some time, and during that time no ERROR_OCCURRED events should be processed, because that may result in reporting errors that were resolved in less than 15 minutes without us being aware of it. Are there any good practices regarding replay vs. "normal" processing?
Scaling: we may increase or decrease the number of instances of our service at any time, so the partition assignment may change during runtime. That should not be a problem if we create a consumer group for each service instance when consuming the ERROR_RESOLVED events, s.t. every instance knows all resolved errors while still only handling the ERROR_OCCURRED events of its assigned partitions (in another consumer group which is shared by all instances). Is there a better approach for handling partition reassignment and internal state?
Thanks in advance!
For side effects, I would record all "side" actions in the event store. In your particular example, when it is time to send a notification, I would issue a SEND_NOTIFICATION command that emits a NOTIFICATION_SENT event. These events would be processed by some worker process that does the actual HTTP request.
Actually, I would elaborate this even further: since notifications could fail, I would have, say, two events, NOTIFICATION_REQUIRED and NOTIFICATION_SENT, so we can retry failed notifications.
And finally your logic would be "if error was not resolved in 15 minutes and notification was not sent - send a notification (or just discard if it missed its timeframe)"
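
That rule can be written down as a small pure function, which keeps it easy to test independently of Kafka. This is a sketch under assumptions: ErrorReportPolicy, Action, and the parameter names are hypothetical, a null resolvedAt means "not resolved so far", and the optional "discard if it missed its timeframe" branch is not modelled.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical decision function for the 15-minute reporting rule.
final class ErrorReportPolicy {
    enum Action { WAIT, SEND, SKIP }

    static final Duration GRACE = Duration.ofMinutes(15);

    static Action decide(Instant occurredAt, Instant resolvedAt,
                         boolean notificationSent, Instant now) {
        Instant deadline = occurredAt.plus(GRACE);
        if (notificationSent) {
            return Action.SKIP;                  // already reported
        }
        if (resolvedAt != null && !resolvedAt.isAfter(deadline)) {
            return Action.SKIP;                  // resolved within the grace period
        }
        if (now.isBefore(deadline)) {
            return Action.WAIT;                  // too early to decide
        }
        return Action.SEND;                      // unresolved after 15 minutes
    }
}
```

A worker could evaluate this whenever it looks at an ERROR_OCCURRED event and emit NOTIFICATION_REQUIRED when the result is SEND.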

Spring Integration message processing partitioned by header information

I want to be able to process messages with Spring Integration in parallel. The messages come from multiple devices and we need to process messages from the same device in sequential order but the devices can be processed in multiple threads. There can be thousands of devices so I'm trying to figure out how to assign processor based on mod of the device ID using Spring Integration's semantics as much as possible. What approach should I be looking at?
It's difficult to generalize without knowing other requirements (transaction semantics etc) but probably the simplest approach would be a router sending messages to a number of QueueChannels using some kind of hash algorithm on the device id (so all messages for a particular device go to the same channel).
Then, have a single-threaded poller pulling messages from each queue.
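
As a plain-Java analogue of that layout (not Spring Integration API), each single-thread executor below plays the role of one QueueChannel plus its single-threaded poller, and the hash of the device ID picks the lane. The class DeviceLanes and its methods are hypothetical names used only for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: N lanes, each processed sequentially by one thread.
final class DeviceLanes {
    private final ExecutorService[] lanes;

    DeviceLanes(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // All messages of one device map to the same lane, so they stay in order.
    int laneFor(String deviceId) {
        return Math.floorMod(deviceId.hashCode(), lanes.length);
    }

    void submit(String deviceId, Runnable handler) {
        lanes[laneFor(deviceId)].execute(handler);
    }

    // Stops accepting work and waits for queued messages to be handled.
    void shutdownAndWait() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                lane.awaitTermination(1, TimeUnit.MINUTES);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Different devices may share a lane, which is fine for ordering; it only means a slow device can delay others hashed to the same lane.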
EDIT: (response to comment)
Again, difficult to generalize, but...
See AbstractMessageRouter.determineTargetChannels() - a router actually returns a physical channel object (actually a list, but in most cases a list of 1). So, yes, you can create the QueueChannels programmatically and have the router return the appropriate one, based on the message.
Assuming you want all the messages to then be handled by the same downstream flow, you would also need to create a <bridge/> for each queue channel to bridge it to the input channel of the next component in the flow.
create a QueueChannel
create a BridgeHandler (set the outputChannel to the input channel of the next component)
create a PollingConsumer (constructor takes the channel and handler; set trigger etc)
start() the consumer.
All of this can be done in your custom router's initialization; then implement determineTargetChannels() to select the queue.
Depending on the processing time for your events, I would generally recommend running the downstream flow on the poller thread rather than setting a taskExecutor to avoid issues with the next poll trying to schedule another task before this one's done. You might need to increase the default taskScheduler's pool size.
