Subscription queries in a multi-node environment - event-sourcing

Our Axon-backed service runs on several nodes. Our event processors are tracking processors (1 segment, thus active on one node). If I subscribe to a query on node A and the event that should trigger the update is handled on node B, node A will miss it.
Is this by design, or should this work and am I misconfiguring the application?
In case of the former, what could we do to implement similar functionality in the most Axon-idiomatic manner?
(Currently we poll the data source / projection directly for x seconds.)

The QueryBus you are using is a SimpleQueryBus, which always stays within a single JVM.
If you need a distributed version of the QueryBus, you should turn towards using Axon Server as the centralized means to route queries between your nodes.
Note that although you could create this yourself, people have tried to do so (as shown in this Pull Request on the framework) and decided against it in favor of the optimizations made in Axon Server.
So, in short, I am assuming you are currently excluding the Axon Server connector.
Thus the framework gives you the SimpleQueryBus, which is indeed designed not to span several nodes.
And lastly, the quickest way to achieve distributed routing of queries is to use Axon Server.
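
For reference, once the axon-server-connector is on the classpath, the same subscription-query code is routed across nodes by Axon Server. Below is a minimal sketch of the subscribing side using Axon 4's QueryGateway; the CardSummaryQuery and CardSummary types are hypothetical placeholders for your own query and projection classes.

import org.axonframework.messaging.responsetypes.ResponseTypes;
import org.axonframework.queryhandling.QueryGateway;
import org.axonframework.queryhandling.SubscriptionQueryResult;

public class CardSummaryClient {

    private final QueryGateway queryGateway;

    public CardSummaryClient(QueryGateway queryGateway) {
        this.queryGateway = queryGateway;
    }

    public void subscribe(String cardId) {
        SubscriptionQueryResult<CardSummary, CardSummary> result =
                queryGateway.subscriptionQuery(
                        new CardSummaryQuery(cardId),
                        ResponseTypes.instanceOf(CardSummary.class),   // initial result from a @QueryHandler
                        ResponseTypes.instanceOf(CardSummary.class));  // incremental updates

        // Updates arrive whenever a projection (on any node, once Axon Server routes
        // the queries) calls QueryUpdateEmitter#emit for a matching query.
        result.initialResult().subscribe(initial -> System.out.println("initial: " + initial));
        result.updates().subscribe(update -> System.out.println("update: " + update));
        // Remember to cancel the result when the subscriber disconnects.
    }
}

// Hypothetical query and projection types, for illustration only.
class CardSummaryQuery {
    final String cardId;
    CardSummaryQuery(String cardId) { this.cardId = cardId; }
}

class CardSummary { }

The point is that no node-to-node plumbing appears in this code; the distribution concern lives entirely in the QueryBus implementation.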

Related

How to rehydrate state stored aggregates in Axon Framework

I have state-stored aggregates in a PostgreSQL database. I'm testing replay by deleting the state-stored tables and the token_entry table and restarting the application. All events are replayed and the aggregates are restored in memory. However, my state-stored tables stay empty. I was expecting that they would also be restored.
I'm using Spring Boot and the latest Axon. The code, at this moment, is as simple as it can get.
Whenever you're using Axon's State-Stored Aggregates, they'll only be stored as-is. Hence, throwing away the stored instances and starting the application will not trigger a replay.
When removing TrackingTokens, or initiating a replay by invoking StreamingEventProcessor#resetTokens (this is the recommended approach, by the way), you're effectively telling the Event Processors of your application to start event streaming from scratch.
Replays and tracking tokens belong to the part of Axon Framework that supports the so-called Query Side of CQRS. The Aggregate support in Axon Framework is specifically for the Command Side of an application.
Long story short, State-Stored Aggregates don't have replay support. If you want your Aggregates (Command Models) to be replayed, you will have to use Event Sourcing Aggregates instead.
I hope the above clears up the confusion between tokens and aggregates from your question. And by the way, if you feel Axon's Reference Guide should be adjusted to clarify your situation, you're always free to file an issue.
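
To make the recommended approach concrete, here is a minimal sketch (assuming Axon 4.5+ with Spring) of triggering a replay via StreamingEventProcessor#resetTokens. The processor name "projection" is a placeholder for your own processing group.

import org.axonframework.config.EventProcessingConfiguration;
import org.axonframework.eventhandling.StreamingEventProcessor;

public class ReplayService {

    private final EventProcessingConfiguration processingConfiguration;

    public ReplayService(EventProcessingConfiguration processingConfiguration) {
        this.processingConfiguration = processingConfiguration;
    }

    public void replayProjection() {
        processingConfiguration
                .eventProcessor("projection", StreamingEventProcessor.class)
                .ifPresent(processor -> {
                    processor.shutDown();    // the processor must not be running while tokens are reset
                    processor.resetTokens(); // rewinds the token so events are streamed from the start
                    processor.start();
                });
    }
}

This replays your projections (the query side); as explained above, it will not repopulate state-stored aggregate tables.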

Microservice failure Scenario

I am working on a microservice architecture. One of my services is exposed to a source system which is used to post data. This microservice publishes the data to Redis using Redis pub/sub, and the data is then consumed by a couple of other microservices.
Now, if one of those other microservices is down and unable to process the data from Redis pub/sub, I have to retry with the published data when that microservice comes back up. The source cannot push the data again and manual intervention is not possible, so I thought of 3 approaches:
1. Additionally use Redis itself for storing and retrieving the data.
2. Use a database to store the data before publishing. I have many source and target microservices which use Redis pub/sub. With this approach I would have to insert every request into the DB first and then its response status, and I would have to use a shared database. This adds a couple more exception-handling cases and doesn't look very efficient to me.
3. Use Kafka in place of Redis pub/sub. Traffic is low, which is why I used Redis pub/sub, and it is not feasible to change.
In the first two cases I have to use a scheduler, and there is a time window within which I have to retry, otherwise the subsequent request will fail.
Is there any other way to handle the above cases?
For point 2:
- Store the data in the DB.
- Create a daemon process which will process the data from that table.
- This daemon process can be configured to fit your needs.
- The daemon process will poll the DB and publish any pending data, deleting it once it has been published.
I have not used this in a microservice architecture, but I have seen this approach work efficiently when communicating with 3rd-party services.
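
A rough sketch of such a daemon, assuming Spring with a JDBC "outbox" table and Redis pub/sub; the table name, channel name and polling interval are assumptions, and @EnableScheduling must be active in the application.

import java.util.List;
import java.util.Map;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class OutboxPublisher {

    private final JdbcTemplate jdbc;
    private final StringRedisTemplate redis;

    public OutboxPublisher(JdbcTemplate jdbc, StringRedisTemplate redis) {
        this.jdbc = jdbc;
        this.redis = redis;
    }

    @Scheduled(fixedDelay = 5000) // poll every 5 seconds
    public void publishPending() {
        List<Map<String, Object>> rows =
                jdbc.queryForList("SELECT id, payload FROM outbox ORDER BY id LIMIT 100");
        for (Map<String, Object> row : rows) {
            // Publish first; delete only after the publish call returns without error.
            redis.convertAndSend("data-channel", (String) row.get("payload"));
            jdbc.update("DELETE FROM outbox WHERE id = ?", row.get("id"));
        }
    }
}

Note that Redis pub/sub itself is fire-and-forget, so this only guarantees the data is re-published, not that every subscriber was up to receive it.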
At the very outset, as you mentioned, we do indeed seem to have only three possibilities.
This is one of those situations where you want a handshake from the consuming service after pushing and after processing. To accomplish that, a middleware queuing system is the right tool.
Although a bit more complex to set up, you can use Kafka for streaming this; configuring the producers and consumer groups properly will help you do the job smoothly (see the consumer sketch below).
Using a DB for storage would be overkill, unless the data genuinely needs to be persisted rather than just processed.
BUT, alternatively, storing the data in Redis and reading it in a cron/scheduled job would make your job much simpler. Once the job has run successfully, you can remove the data from the cache and thus save Redis memory.
If you can comment further on the architecture and the implementation, I can update my answer accordingly. :)
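
To make the Kafka suggestion concrete, here is a minimal sketch of a consumer that commits offsets only after processing, which gives you the "handshake after processing" behaviour: an instance that was down simply resumes from the last committed offset when it comes back up. Topic name, group id and bootstrap servers are placeholders.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReliableConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "target-service");
        props.put("enable.auto.commit", "false"); // commit manually, only after processing
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("source-data"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value());
                }
                consumer.commitSync(); // the "handshake": acknowledge only after successful processing
            }
        }
    }

    private static void process(String payload) {
        System.out.println("processing " + payload);
    }
}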

Scaling a microservice with frontend and backend instances

I am developing a series of microservices using Spring Boot and plan to deploy them on Kubernetes.
Some of the microservices are composed of an API which writes messages to a Kafka queue and a listener which listens to the queue and performs the relevant actions (e.g. write to the DB, construct messages for onward processing).
These services work fine locally but I am planning to run multiple instances of the microservice on Kubernetes. I'm thinking of the following options:
Run multiple instances as is (i.e. each microservice serves as an API and a listener).
Introduce FRONTEND and BACKEND environment variables. If the FRONTEND variable is true, do not configure the listener process; if the BACKEND variable is true, configure the listener process.
This way I can scale how many frontend / backend services I need, and I also have the benefit of being able to shut down the backend services without losing requests.
Any pointers, best practice or any other options would be much appreciated.
You can do as you describe, with environment variables, or you may also be interested in building your app with different profiles/bean configurations and making two different images.
In both cases, you should use two different Kubernetes Deployments so you can scale and configure them independently.
You may also be interested in a Leader Election pattern, where only one replica is active, if it only makes sense for a single replica to process the events from the queue. Depending on your availability requirements, this can also be solved by simply running a single replica.
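
As an illustration of the environment-variable option, here is a sketch of a Spring Boot configuration that only creates the Kafka listener when a backend.enabled property is true (which, via relaxed binding, can be set with a BACKEND_ENABLED environment variable). The property, topic and group names are assumptions.

import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.KafkaListener;

@Configuration
@ConditionalOnProperty(name = "backend.enabled", havingValue = "true")
public class ListenerConfiguration {

    @KafkaListener(topics = "incoming-requests", groupId = "my-service")
    public void onMessage(String message) {
        // Perform the relevant actions: write to the DB, construct onward messages, etc.
        System.out.println("handling " + message);
    }
}

With two Kubernetes Deployments built from the same image, the "frontend" Deployment leaves the property unset and the "backend" Deployment sets it to true, and each can be scaled independently.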

Best approach to send updates to other microservices which are running (multiple instances) in different data centers

I have 3 different microservices (e.g. A, B, C; these are REST- and Spring Boot-based). These 3 services generally run in 3 different data center locations, i.e. there are separate instances of each service.
The problem I am trying to solve:
Service A detects updates (it is a kind of polling, checking whether there are any updated records) and then needs to send the updated information to services B and C through REST calls. Based on these updates, services B and C do their own processing. Once deployed (mostly into the cloud), how does A know which B and C instances are up and running, so that it can send updates to the running instances?
Do we need to keep track of running instances in some DB table and look up the active instances before sending updates from A? Or just create some indicator or sequence-number-based approach to detect that there are updates at A that need to be sent out; but in that case, how does A know which instances are active? Or do we just send the updates from A and let some router, load balancer, or other component take care of delivering them to the available active instances, without storing and looking up active instances ourselves?
I am not very familiar with networking, production system behaviour, or inter-service communication in cloud systems.
Trying to implement cross-service updates through REST-based synchronization is a bad idea because it does not scale: every time you add another microservice that needs to be aware of updates made in service A, you have to modify the existing microservice that emits the change. This introduces risk and additional maintenance cost.
Instead, you can use message queues to emit events that indicate changes made in a service. Thanks to the pub/sub pattern, this approach eliminates the need to modify any existing microservice: you just plug new consumers into the update-emitting services already in your ecosystem.
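
As a sketch of what the emitting side could look like with Kafka (topic name and payload are made up): service A publishes an update event and never needs to know which B or C instances are running; consumers in different consumer groups each receive every event.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class UpdateEventPublisher {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by record id keeps all updates for one record on the same partition, in order.
            producer.send(new ProducerRecord<>("service-a-updates", "record-42",
                    "{\"id\":42,\"status\":\"UPDATED\"}"));
            producer.flush();
        }
    }
}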

Recommendation on Mule JMS queue config

I'm working on updating an existing Mule configuration. The task is to enhance it to route messages to different endpoints depending on some properties of the messages, so it would be nice to have some pros and cons of the two options I have at hand:
Add properties to the message, using the "message-properties-transformer" transformer, which is later used by a "filtering-router" to single out the message and put it on the correct endpoint. This option allows me to use a single queue for all destinations.
Create one queue for each destination, so that instead of adding a property for later routing, I just put the message on the right queue right away. I.e. this option would mean one queue per destination.
Any feedback would be welcome. Is there any "best practices" with regards to this?
I've had a great deal of success using your first approach with a filtering router. It reduces coupling between your message producers and consumers and forms a valuable abstraction, so any service can blindly drop messages into the generic "outbox".
We've come to depend on Mule for filtering and routing messages, so much so that we have a dedicated cluster of hardware that does only this. Using Mule I was able to get far greater performance and avoid having to maintain connections to all the queues.
The downside is having to very carefully maintain your message object versions globally, and having to keep a set of transformers on hand to accept and convert between different versions if you plan to upgrade only a portion of your infrastructure.
thanks, matt
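
Not Mule-specific, but the single-queue idea from the first option can also be expressed in plain JMS, where a message property plus a consumer-side message selector plays the role of the filtering router; queue and property names here are made up.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

public class PropertyRouting {

    public static void send(ConnectionFactory factory) throws Exception {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue outbox = session.createQueue("generic.outbox");
            MessageProducer producer = session.createProducer(outbox);

            TextMessage message = session.createTextMessage("payload");
            message.setStringProperty("destination", "BILLING"); // routing hint for the consumers
            producer.send(message);
        } finally {
            connection.close();
        }
    }

    public static void receiveBillingOnly(ConnectionFactory factory) throws Exception {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue outbox = session.createQueue("generic.outbox");
            // The JMS message selector filters on the property set by the producer.
            MessageConsumer consumer = session.createConsumer(outbox, "destination = 'BILLING'");
            connection.start();
            TextMessage received = (TextMessage) consumer.receive(5000);
            System.out.println(received != null ? received.getText() : "no message");
        } finally {
            connection.close();
        }
    }
}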
