I have the need to access the saga repository from within a consumer to read the current status of the saga correlated to the message being consumed.
Scenario:
I have an external service, when this service consumes an event coming from the saga I want to see if the saga is still in the correct state because if meanwhile the saga changed its state the consumer must skip the event.
How: I surely could query the saga repository implementation chosen by using its the native framework, but I would like to use an abstraction, an interface, to load the saga state from within the consumer, in order to be able to switch to a different repository implementation in the future.
Any help is appreciated.
If the saga initiated the command, sending it to the consumer, why would the consumer need to check the saga's state? Is there a long delay between the time the command is sent and the consumer is able to process it?
The type of check you are asking about sort of goes against what a system would generally do when processing commands. If you do need to do this type of check, I'd actually suggest a request/response interaction using the request client to which the saga would respond if the command is still valid. That way, the logic (and locking) of the saga repository remains under the saga's control.
If needed, a separate endpoint could be used for that request to ensure it isn't backed up behind other messages targeting the saga. If that is desired, post a comment and I'll update the answer.
Related
I have a simple integration flow that poll data based on a cron job from database, publish on a DirectChannel, then do split and transformations, and publish on another executor service channel, do some operations and finally publish to an output channel, its written using dsl style.
Also, I have an endpoint where I might receive an http request to trigger this flow, at this point I send the messages one of the mentioned channels to trigger the flow.
I want to make sure that the manual trigger doesn’t happen if the flow is already running due to either the cron job or another request.
I have used the isRunning method of the StandardIntegrationFlow, but it seems that it’s not thread safe.
I also tried using .wireTap(myService) and .handle(myService) where this service has an atomicBoolean flag but it got set per every message, which is not a solution.
I want to know if the flow is running without much intervention from my side, and if this is not supported how can I apply the atomic boolean logic on the overall flow and not on every message.
How can I simulate the racing condition in a test in order to make sure my implementation prevent this?
The IntegrationFlow is just a logical container for configuration phase. It does have those lifecycle methods, but only for an internal framework logic. Even if they are there, they don't help because endpoints are always running if you want to do them something by some event or input message.
It is hard to control all of that since it is in an async state as you explain. Even if we can stop a SourcePollingChannelAdapter in the beginning of that flow to let your manual call do do something, it doesn't mean that messages in other threads are not in process any more. The AtomicBoolean cannot help here for the same reason: even if you set it to true in the MessageSourceMutator.beforeReceive() and reset back to false in its afterReceive() when message is null, it still doesn't mean that messages you pushed down in other thread are already processed.
You might consider to use an aggregator for AtomicBoolean resetting in the end of batch since you mention that you pull data from DB, so perhaps there is a number of records per poll you can track downstream. This way your manual call could be skipped until aggregator collects results for that batch.
You also need to think about stopping a SourcePollingChannelAdapter at the moment when manual action is permitted, so there won't be any further race conditions with the cron.
I have REST API gateway which calls one of the microservices with MassTransit request client. This request is not durable and is meant to live for a short time - essentially it's just replacement of "traditional" synchronous (via HTTP/GRPC/etc) gateway-microservice communication.
On microservice side I have consumer which under the hood uses DbContext and Transaction (EFC) to perform some work in database. After the work is done it should publish "WorkDoneEvent" (to be consumed later by other microservices) and return result of the work to api gateway. Event must be published atomically along with transaction used to perform the work. It does not matter if ApiGateway will receive response / will retry request - as soon as transaction is commited both work result and sending "WorkDoneEvent" must be guaranteed.
Normally this is done with transactional outbox which first saves published event to database within same transaction as the work is done. (And then some process constantly "polls" outbox and tries send message to the broker, when done it removes message from outbox). As far as I know.
MassTransit seems to have transactional outbox built in: https://masstransit-project.com/advanced/middleware/transactions.html#transactional-bus.
However in docs it clearly states:
Never use the TransactionalBus or TransactionalEnlistmentBus when writing consumers. These tools are very specific and should be used only in the scenarios described.
And this is exactly what I want to do...
Why I should not do it?
I'd suggest using the InMemoryOutbox, which is part of MassTransit. It's significantly lighter weight, is designed to work in a consumer, and will not publish your events until after the consumer has completed (but prior to acknowledging the message at the broker). The only consideration is that your consumer should be idempotent (which needs to be the case in your approach as well) and if the operation was already performed on a retry, it should republish the events.
There are videos, articles, and a sample to go along with it.
I'm currently studying Saga pattern. Most examples seem to focus on Orchestration Sagas where we have one central saga execution coordinator service that dispatches and receives messages/events. Unfortunately information on how to implement Choreography Sagas seem to be lacking a bit.
In domain driven design, we have multiple bounded contexts, ideally, where each bounded context is a self contained microservice. If microservice A wants to communicate with another microservice B we use Integration Events. Integration Events are published and subscribed to using some asynchronous communication - RabbitMQ, Azure Service Bus.
Assuming we want to start some Saga, for example, where we have to run transactions on Order Service and Customer Service - how exactly do services communicate with each other? Is it just regular Integration Events or something entirely different?
The way I see it and given picture below (source), Saga would be executed this way:
A new order is created. Status is set to "Pending" and OrderSubmittedDomainEvent domain event is emitted.
Domain event handler receives OrderSubmittedDomainEvent domain event, it then creates and dispatches ReserveCreditIntegrationEvent integration event.
Customer Service receives ReserveCreditIntegrationEvent integration event.
It attempts to reserve customer credit.
If credit is successfully reserved, CustomerCreditReservedDomainEvent domain event is emitted.
Domain event handler received CustomerCreditReservedDomainEvent domain event, it creates and dispatches CreditReservedIntegrationEvent integration event.
Order Service receives CreditReservedIntegrationEvent integration event and sets Order Status to "Confirmed".
saga is completed.
Is this the right approach?
I think using Choreography rather then Orchestration for distributed transactions makes sense if you chose it for the right reasons. For instance, if you need to spare the usually higher effort of implementing a central choreography as you don't need to know what state a transaction is in until it has finished. Or because you know that the order of the transaction workflow is stable and is unlikely to change which would also be on the plus side of choreography. But would be a drawback for Choreography if the order changes frequently because you would need to adapt all microservices in that case...
So you need to know the advantages and drawbacks of the two approaches.
If you chose Choreography for the right reasons I would say that I am missing the compensation logic in your considerations. What if the credit was reserved but then the order fails in the order service? Compensation events need to be considered as well in such cases...
Other than that there is the usual suspects:
such as making sure that each service will reliably send the next event after it processed the received event. For this you could look into the Transactional Outbox pattern.
or making sure that you have deduplication of events implemented in each of the services as for reliable sending of events accross distributed transactions you cannot be a hundred percent sure that an event will only be sent once.
And if you are even interested in an alternative to the Saga pattern you can look into the Routing Slip pattern. It is well suited for distributed transaction workflows that will differ depending on the current use case by avoiding that each service needs to know each route. The sequence of the workflow is attached to the initial message of the transaction and all subsequent messages. Then each service receiving a message with the routing slip performs its tasks and passes the next message including the routing slip to the next station (service) on the list.
Note: I am not sure what exactly you mean by ...IntegrationEvent. I would not differentiate between domain and integration events, all events are relevant from the business perspective in your example otherwise they would not be relevant to other Microservices.
We have several services that publishes and subscribes to Domain Events. What we usually do is log events whenever we publish and log events whenever we process events. We basically use this to apply choreography pattern.
We are not doing Event Sourcing in these systems, and there's no programmatic use for them after publishing/processing. That's the main driver we opted not to store these in a durable container, like a database or event store.
Question is, are we missing some fundamental thing by doing this?
Is storing Events a must?
I consider queued messages as system messages, even if they represent some domain event in an event-driven architecture (pub/sub messaging).
There is absolutely no hard-and-fast rule about their storage. If you would like to keep them around you could have your messaging mechanism forward them to some auditing endpoint for storage and then remove them after some time (if necessary).
You are not missing anything fundamental by not storing them.
You're definitely not missing out on anything (but there is a catch) especially if that's not a need by the business. An Event-Sourced System would definitely store all the events generated by the system into a database (or any other event-store)
The main use of an event store is to be able to restore the state of the system to the current state in case of a failure by replaying messages. To make this process of recovery faster we have snapshots.
In your case since these events are just are only relevant until the process is completed, it would not make sense to store them until you have a failure. (this is the catch) especially in a Distributed Transaction case scenario.
What I would suggest?
Don't store the event themselves but log the relevant details about these events and maybe use an ELK stack or Grafana to store these logs.
Use either the Saga Pattern or the Routing Slip pattern in case of a Distributed Transaction and log them as well.
In case a failure occurs while processing an event, put that event into an exception queue and handle it. If it's a part of a distributed transaction make sure either they all have the same TransactionId or they have a CorrelationId so you can lookup for logs and save your system.
For reliably performing your business transactions in a distributed archicture you somehow need to make sure that your events are published at least once.
So a service that publishes events needs to persist such an event within the same transaction that causes it to get created.
Considering you are publishing an event via infrastructure services (e.g. a messaging service) you can not rely on it being available all the time.
Also, your own service instance could go down after persisting your newly created or changed aggregate but before it had the chance to publish the event via, for instance, a messaging service.
Question is, are we missing some fundamental thing by doing this? Is storing Events a must?
It doesn't matter that you are not doing event sourcing. Unless it is okay from the business perspective to sometimes lose an event forever you need to temporarily persist your event with your local transaction until it got published.
You can look into the Transactional Outbox Pattern to achieve reliable event publishing.
Note: Logging/tracking your events somehow for monitoring or later analyzing/reporting purpose is a different thing and has another motivation.
I'm working on a "microservice-like" architecture. Each microservice can fire some events to RabbitMQ. The events are identified by an event code. At the moment, the code of the event triggered is an hard coded const string declared inside the microservice that fire the event.
My problem is that each microservice that want to subscribe to this event must duplicate this event code string. This is error prone especially when an event code is renamed because all microservices that subscribed to this event code need to be changed accordingly... which is very bad.
I see the possible alternatives:
Declare the event code only in the microservice that fire the event. Let the consumers microservices directly access to the code declared in the microservice that fire the event. In this case, the event is declared once but it creates a source code dependency between microservices... which is bad.
Create a source file (outside all microservices) that contains all the events code of all the application. This source file is shared by all microservices. In this case, each event is declared once but it creates a global dependency for all microservices which is against the single responsability principle... which is bad.
How do you tackle this problem ?
At the moment, the code of the event triggered is an hard coded const string declared inside the microservice that fire the event. My problem is that each microservice that want to subscribe to this event must duplicate this event code string. This is error prone especially when an event code is renamed because all microservices that subscribed to this event code need to be changed accordingly... which is very bad.
Events are messages. All of the constraints that we use to manage the evolution of messages applies to events as well.
In a microservices architecture, we expect to be able to deploy instances of the services independently of one another. Requiring that all of the services shut down together to coordinate a change in message schema kind of misses the point. That in turn implies that we need to design reasonable behaviors for the cases where the producer and consumer don't have matching understandings of the message.
In practice, this means something like
We never introduce a new required field, only optional fields (with documented default values).
Unrecognized fields are ignored (but forwarded)
Consumers of optional fields know to use default value to use when an expected field is missing.
When these constraints cannot be satisfied, then you are introducing a new message.
If you have the message contracts in place, then you aren't restricting yourself to microservice implementations that share the same runtime platform (because two different implementations of the same contract are equivalent).
Recommend reading:
ZeroMQ RFC 42/C4, specifically section 2.6 which describes the evolution of public contracts
Versioning in an Event Sourced System, speficically "Basic Type Based Versioning"