Can MassTransit IBus safely be used in Consumers?

I would like to use the same service classes in both the publisher (which will be a REST API) and consumer. Since sending messages can be a part of these service classes, they have an instance of IBus injected into them so they can publish/send messages. This is fine on the REST API side, but the MassTransit documentation states the following:
Once you have consumers you will ALWAYS use ConsumeContext to interact with the bus, and never the IBus.
What's the reason behind this? Is it just performance related or does using IBus have any other consequences? And what are the alternatives to doing this? Would injecting IPublishEndpoint and ISendEndpointProvider be the accepted solution here, or does that not really change anything?
The reason I want to do this is that some actions can either be performed synchronously through the API or happen automatically in the background via a message, and having to duplicate the business logic would be very inconvenient and hard to maintain.
Bonus question: The documentation states the same thing for TransactionalBus:
Never use the TransactionalBus or TransactionalEnlistmentBus when writing consumers. These tools are very specific and should be used only in the scenarios described.
However, if I want to support transactions in the above-mentioned services, I will probably have to use TransactionalBus, but is it safe to do so in consumers? I do know about the in-memory outbox, but I have two problems with it:
It can only be used on the consumer side, so the publisher would not support transactions.
It does not support "partial transactions": the codebase I'm working on has places where a transaction doesn't wrap the entire API call, only parts of it, so some entities can be successfully written to the database before the transaction even starts, and in those cases the corresponding messages still need to be sent/published. This could easily be done by calling Release on the TransactionalBus at the right time, but not with the outbox, since the outbox is all or nothing (if an exception happens, nothing is sent).
This bonus question isn't that important since I could probably work around it, but is still something I'm curious about, as it could be resolved by using TransactionalBus (if that won't cause any issues in consumers).

You should be using IPublishEndpoint or ISendEndpointProvider to publish or send messages from your components and/or services. There is almost never a reason to use IBus.
IPublishEndpoint and ISendEndpointProvider are registered as scoped, so a valid scope is required. In a service that normally doesn't have a scope, one can easily be created using provider.CreateScope(). Scopes should also be disposed of when they are no longer used.
Note that current versions should use provider.CreateAsyncScope() instead, and to make it easy just assign it using:
await using var scope = provider.CreateAsyncScope(); // disposed asynchronously when it leaves scope
var publishEndpoint = scope.ServiceProvider.GetRequiredService<IPublishEndpoint>(); // throws if not registered, unlike GetService
For any components, consumers, etc. simply use constructor injection for either of those two types, and they will resolve the proper services depending upon the context.
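For illustration, a minimal sketch of such a shared service (OrderService, OrderSubmitted, and the method names are hypothetical; .NET 6+ implicit usings assumed):
using MassTransit;
public record OrderSubmitted(Guid OrderId); // hypothetical message contract
public class OrderService
{
    private readonly IPublishEndpoint _publishEndpoint;
    // resolves against the bus in an API scope, and against the ConsumeContext inside a consumer scope
    public OrderService(IPublishEndpoint publishEndpoint) => _publishEndpoint = publishEndpoint;
    public async Task SubmitOrderAsync(Guid orderId, CancellationToken cancellationToken)
    {
        // ... business logic shared by the REST API and the consumer ...
        await _publishEndpoint.Publish(new OrderSubmitted(orderId), cancellationToken);
    }
}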
Also, don't use ITransactionalBus. The new outbox is a better solution, as it's actually in the transaction. I will eventually remove ITransactionalBus from MassTransit.
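For reference, a rough sketch of wiring up that outbox (assuming MassTransit v8 with the MassTransit.EntityFrameworkCore package, a hypothetical AppDbContext, and SQL Server; check the current docs for the exact options):
services.AddMassTransit(x =>
{
    // outgoing messages are saved in the same EF Core transaction as the business data
    x.AddEntityFrameworkOutbox<AppDbContext>(o =>
    {
        o.UseSqlServer();
        o.UseBusOutbox(); // publishes made via IPublishEndpoint outside consumers also go through the outbox
    });
    x.UsingRabbitMq((context, cfg) => cfg.ConfigureEndpoints(context));
});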

Related

RabbitMQ: routing configuration best practices

I'm using RabbitMQ in my pet project (Spring Boot based). In @Configuration I declare beans like Queue, Binding, and DirectExchange, so when I run the application all these exchanges, bindings, and queues are created automatically. I'm concerned about whether this is the correct way to configure these RabbitMQ-related "entities". Should I split this into separate steps performed before application startup? For example, calling a series of curl commands against the management HTTP API to create all the needed queues (with exchanges and bindings) before the application starts. What are the best practices for creating/configuring routing-related stuff?
The thing is, there is no one way of using RabbitMQ. However, there are a few questions I always ask myself before working with any broker. I'll apply them to your question here.
Let's question the approach:
In regards to creating the exchanges, bindings, queues etc:
Try to understand whether the elements you are using are durable. If so, you can create them within your code on startup & apply a simple health check. You also want to check whether your RabbitMQ server persists data. If not, you'll HAVE to create your queues, exchanges & bindings every time.
In regards to routing & binding queues with exchanges:
There are two major questions you need to consider
How much does latency matter?
If latency does matter, try to use direct exchanges as much as you can. The reason for this is simple. You're simply going from exchange to queue instead of having to route your message. Routing adds latency, never forget! If those few extra ms won't make the difference for you, then the following question needs to be kept in mind to understand how to define your exchanges, bindings & queues.
How will I use my broker?
Some people use their broker simply to pub/sub messages. This is a perfectly reasonable use case, and here a fanout exchange would be the most viable option. If you're trying to minimize the number of queues you're creating, a topic exchange may be interesting as well.
More importantly: are you using your broker exclusively for between-service communication, or is your service going to be both the producer and the consumer? Hell, is it a mix of the previous two cases? This boundary needs to be defined clearly. Otherwise you'll end up in a mess where you suddenly notice you're consuming messages that could actually be handled internally by another library, or simply by passing arguments to functions.
Example of how to apply it:
Case: we have a logging service & user service:
1. How will I use my broker:
between-service communication.
2. Which messages do I need to get across?
CRUD operations
login/logout
3. What would my routing table look like (draft)?
Exchange Name | Exchange Type | Binding          | Queue
user          | topic         | user.cmd.*       | user:crud
user          | topic         | user.event.login | user:login
Above you can clearly see we can handle all CRUD operations using one simple queue. Is this the most efficient approach? It depends on your service. It may be better for user.cmd.create to go to user:created. This is another boundary that you'll need to define.
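The question is Spring Boot, but these declarations are plain AMQP operations, so as a sketch, here is the same topology declared with the RabbitMQ .NET client (v6-style API; host name and durability settings are assumptions):
using RabbitMQ.Client;
var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();
// durable topic exchange for the user service
channel.ExchangeDeclare("user", ExchangeType.Topic, durable: true);
// one queue for all CRUD commands, one for login events
channel.QueueDeclare("user:crud", durable: true, exclusive: false, autoDelete: false);
channel.QueueBind("user:crud", "user", "user.cmd.*");
channel.QueueDeclare("user:login", durable: true, exclusive: false, autoDelete: false);
channel.QueueBind("user:login", "user", "user.event.login");
Declarations like these are idempotent, so it is safe to run them on every startup as long as the properties never change.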
Something that also needs to be mentioned: treat your queue names & routing keys as pieces of information. Debugging a microservice can be hellish, so applying a general naming convention would be most appropriate. There is no one naming convention, so this depends on your use case once again.
Conclusion:
In general, the best practice with any broker is clearly defining the scope of your broker and its underlying elements. If that's not done properly, it does not matter whether you run a health check on startup or not. It's easy to get lost in the complexity and interesting features RabbitMQ offers. Try to keep it as simple as possible at first and ask yourself: "is it going to cost me a lot of time to refactor/debug/fix later?" If the answer is yes, go back to the first sentence in this paragraph.
Documentation:
AMQP concepts: https://www.rabbitmq.com/tutorials/amqp-concepts.html
Some general best practices: https://www.cloudamqp.com/blog/part2-rabbitmq-best-practice-for-high-performance.html
Routing: https://www.rabbitmq.com/tutorials/tutorial-four-python.html
Latency & throughput: https://blog.rabbitmq.com/posts/2012/05/some-queuing-theory-throughput-latency-and-bandwidth

Microservices: how to track fallen down services?

Problem:
Suppose there are two services A and B. Service A makes an API call to service B.
After a while, service A goes down or becomes unreachable due to network errors.
How will other services learn that an outbound call from service A was lost / never happened? I need some other concurrent app that will automatically react (run emergency code) if a service A outbound call is lost.
What cutting-edge solutions exist?
My thoughts, for example:
service A registers a call event in some middleware (event info, "running" status, timestamp, etc).
If this call is not completed after N seconds, some "call timeout" event in the middleware automatically starts the emergency code.
If the call is completed at the proper time service A marks the call status as "completed" in the same middleware and the emergency code will not be run.
P.S. I'm on Java stack.
Thanks!
I recommend looking into patterns such as Retry, Timeout, Circuit Breaker, Fallback, and Healthcheck. You can also look into the Bulkhead pattern if concurrent calls and fault isolation are your concern.
There are many resources where these well-known patterns are explained, for instance:
https://www.infoworld.com/article/3310946/how-to-build-resilient-microservices.html
https://blog.codecentric.de/en/2019/06/resilience-design-patterns-retry-fallback-timeout-circuit-breaker/
I don't know which technology stack you are on, but usually there is already some functionality provided for these concerns that you can incorporate into your solution. There are libraries that take care of this resilience functionality, and you can, for instance, set them up so that your custom code is executed when events such as failed retries, timeouts, or activated circuit breakers occur.
E.g. for the Java stack Hystrix is widely used; for .NET you can look into Polly to make use of retry, timeout, circuit breaker, bulkhead, or fallback functionality.
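As a small illustration, this is roughly what retry plus circuit breaker looks like with Polly (v7-style API; the thresholds, exception type, and URL are placeholder choices):
using System;
using System.Net.Http;
using Polly;
var retry = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt))); // exponential backoff
var breaker = Policy
    .Handle<HttpRequestException>()
    .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30)); // open the circuit after 5 failures, for 30 seconds
var policy = Policy.WrapAsync(retry, breaker);
using var http = new HttpClient();
var response = await policy.ExecuteAsync(() => http.GetAsync("https://service-b.example/health"));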
Concerning health checks, you can look into Actuator for Java, and .NET Core already provides health check middleware that more or less offers that functionality out of the box.
But before using any libraries, I suggest first getting familiar with the purpose and concepts of the listed patterns so you can choose and integrate those that best fit your use cases and major concerns.
Update
We have to differentiate between two well-known problems here:
1.) How can service A robustly handle temporary outages of service B (or the network connection between service A and B which comes down to the same problem)?
To address the related problems, the patterns mentioned above will help.
2.) How to make sure that the request that should be sent to service B will not get lost if service A itself goes down?
To address this kind of problem there are different options at hand.
2a.) The component that performed the request to service A (which then triggers service B) also applies the resilience patterns mentioned above and retries its request until service A successfully answers that it has performed its tasks (which includes the successful request to service B).
There can also be several instances of each service, with some kind of load balancer in front of them that distributes and directs requests to an available instance of the specific service (based on regularly performed health checks). Or you can use a service registry (see https://microservices.io/patterns/service-registry.html).
You can of course chain several API calls one after another, but this can lead to cascading failures. So I would rather go with an asynchronous communication approach, as described in the next option.
2b.) Let's consider that it is of utmost importance that some instance of service A will reliably perform the request to service B.
You can use message queues in this case as follows:
Let's say you have a queue where jobs to be performed by service A are collected.
Then you have several instances of service A running (see horizontal scaling) where each instance will consume the same queue.
You will use the message-locking features of the queue service, which make sure that as soon as one instance of service A reads a message from the queue, the other instances won't see it. If service A was able to complete its job (i.e. call service B, save some state in service A's persistence, and whatever other tasks need to be part of successful processing), it deletes the message from the queue afterwards so that no other instance of service A processes the same message.
If service A goes down during processing, the queue service will automatically unlock the message for you, and another instance of service A (or the same instance after it has restarted) will try to read the message (i.e. the job) from the queue and perform all the tasks (call service B, etc.).
You can combine several queues, e.g. to also send a message to service B asynchronously instead of directly performing some kind of API call to it.
The catch is that the queue service itself must be highly available and redundant, so that it can make sure no message gets lost once published to a queue.
Of course you could also track jobs to be performed in service A's own database, but consider that when service A receives a request, there is always a chance that it goes down before it can save the status of the job to its persistent storage for later processing. Queue services already address that problem for you if chosen thoughtfully and used correctly.
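As a concrete sketch of this receive-lock-delete flow, here it is with Amazon SQS as one possible queue service (the queue URL and CallServiceB are placeholders; SQS's visibility timeout plays the role of the message lock described above):
using System;
using System.Threading.Tasks;
using Amazon.SQS;
using Amazon.SQS.Model;
var sqs = new AmazonSQSClient();
const string queueUrl = "https://sqs.eu-west-1.amazonaws.com/123456789012/service-a-jobs"; // placeholder
while (true)
{
    var response = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
    {
        QueueUrl = queueUrl,
        MaxNumberOfMessages = 1,
        WaitTimeSeconds = 20,   // long polling
        VisibilityTimeout = 60  // the "lock": other instances won't see the message while we work on it
    });
    foreach (var message in response.Messages)
    {
        try
        {
            await CallServiceB(message.Body); // the actual job
            await sqs.DeleteMessageAsync(queueUrl, message.ReceiptHandle); // delete only after success
        }
        catch (Exception)
        {
            // deliberately do nothing: once the visibility timeout expires, the message
            // reappears in the queue and another instance (or this one) retries the job
        }
    }
}
static Task CallServiceB(string body) => Task.CompletedTask; // placeholder for the real call to service B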
For instance, if you look into Kafka as the messaging service, you can look at this Stack Overflow answer, which relates to solving this problem with that specific technology: https://stackoverflow.com/a/44589842/7730554
There are many ways to solve your problem.
I guess you are talking about two topics: design patterns in microservices and the Circuit Breaker.
https://dzone.com/articles/design-patterns-for-microservices
To solve your problem, I normally put a message queue between services and use service discovery to detect which services are live; if a service dies or gets overloaded, I use Circuit Breaker methods.

Notifying golongpoll.SubscriptionManager of an event from kafka-go

I was writing a POC on long polling in Go.
I see the general package to be used is https://github.com/jcuga/golongpoll.
But suppose I want to publish an event to the golongpoll.SubscriptionManager from an arbitrary context, especially when the long-poll API request may be served by one machine while the Kafka event for that particular consumer group is consumed by another instance in the cluster.
The examples in the documentation don't cover such a scenario at all, even though it seems like a common one. One way I can think of is to have a distributed cache like Redis in between and have all the services poll it for changes, but that sounds a bit dumb to me.

How to share events code in a microservice architecture

I'm working on a "microservice-like" architecture. Each microservice can fire some events to RabbitMQ. The events are identified by an event code. At the moment, the code of the event triggered is an hard coded const string declared inside the microservice that fire the event.
My problem is that each microservice that want to subscribe to this event must duplicate this event code string. This is error prone especially when an event code is renamed because all microservices that subscribed to this event code need to be changed accordingly... which is very bad.
I see the possible alternatives:
Declare the event code only in the microservice that fires the event, and let the consumer microservices directly access the code declared there. In this case the event is declared once, but it creates a source-code dependency between microservices... which is bad.
Create a source file (outside all microservices) that contains all the event codes of the whole application, shared by all microservices. In this case each event is declared once, but it creates a global dependency for all microservices, which goes against the single responsibility principle... which is bad.
How do you tackle this problem?
At the moment, the code of the triggered event is a hard-coded const string declared inside the microservice that fires the event. My problem is that each microservice that wants to subscribe to this event must duplicate this event code string. This is error prone, especially when an event code is renamed, because all microservices that subscribed to it need to be changed accordingly... which is very bad.
Events are messages. All of the constraints that we use to manage the evolution of messages applies to events as well.
In a microservices architecture, we expect to be able to deploy instances of the services independently of one another. Requiring that all of the services shut down together to coordinate a change in message schema kind of misses the point. That in turn implies that we need to design reasonable behaviors for the cases where the producer and consumer don't have matching understandings of the message.
In practice, this means something like
We never introduce a new required field, only optional fields (with documented default values).
Unrecognized fields are ignored (but forwarded)
Consumers of optional fields know what default value to use when an expected field is missing.
When these constraints cannot be satisfied, then you are introducing a new message.
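To make those constraints concrete, here is a tolerant-reader sketch using System.Text.Json (the event type and field names are invented for illustration):
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;
var json = "{\"UserId\":\"u1\",\"NewField\":42}"; // "NewField" is unknown to this consumer
var evt = JsonSerializer.Deserialize<UserRegistered>(json)!;
var referral = evt.ReferralCode ?? "none"; // documented default when the optional field is missing
// evt.Extra now carries NewField, so re-serializing the event forwards it untouched
public class UserRegistered
{
    public string UserId { get; set; } = "";
    // added later as an optional field; consumers fall back to the documented default
    public string? ReferralCode { get; set; }
    // unrecognized fields are captured here instead of being dropped
    [JsonExtensionData]
    public Dictionary<string, JsonElement>? Extra { get; set; }
}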
If you have the message contracts in place, then you aren't restricting yourself to microservice implementations that share the same runtime platform (because two different implementations of the same contract are equivalent).
Recommend reading:
ZeroMQ RFC 42/C4, specifically section 2.6 which describes the evolution of public contracts
Versioning in an Event Sourced System, specifically "Basic Type Based Versioning"

Are singleton network event processors okay?

Suppose you have a system on the other side of a network that sends events and data that needs to be cached to some intermediate broker.
Instead of giving every component of your application that needs to be informed of such events its own subscription to the broker, I decided, for performance and simplicity (the third-party library that handles broker subscriptions isn't pretty), to have only one Event Processor that subscribes to the broker and programmatically fires events to subscribed listeners provided by the components as it receives them. The cached data can also be shared from this singleton. This greatly reduces network connections.
However, according to most discussions about singletons, they are always evil PERIOD, unless for concurrency or hardware reasons you need only one access point. That is not my situation, since every component could have its own subscription and its own personal cache of the data, as all the data can be requested over the broker. However, this could easily add 200 more network connections.
Because singletons are evil, does that mean 200 more connections to the broker, with 200 copies of the data, are better than using a singleton I don't need? After all, this slows things down quite a bit, but it's not game-breaking; the application is still usable.
There's nothing inherently wrong with your broker client object servicing multiple clients within your process.
All the talk about singletons being evil is really about global variables being evil. A singleton becomes evil because it provides a static access point to mutable state, not because there is only one instance of it.
In that light, you might want to use dependency injection to hook it up rather than calling Broker.getInstance(). This avoids client code making the assumption that it is in fact a singleton.
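As a tiny sketch of that point (all names hypothetical): let the container own the single instance, and have clients declare the dependency instead of reaching for a static accessor:
using System;
using Microsoft.Extensions.DependencyInjection;
var services = new ServiceCollection();
services.AddSingleton<IBrokerEvents, BrokerEventProcessor>(); // one shared instance, injected rather than Broker.getInstance()
services.AddTransient<OrderView>();
using var provider = services.BuildServiceProvider();
provider.GetRequiredService<OrderView>(); // receives the one shared processor
public interface IBrokerEvents
{
    void Subscribe(string topic, Action<string> handler);
}
public sealed class BrokerEventProcessor : IBrokerEvents
{
    // in reality this would hold the single broker connection and fan events out in-process
    public void Subscribe(string topic, Action<string> handler) { }
}
public sealed class OrderView
{
    public OrderView(IBrokerEvents events) => events.Subscribe("orders", _ => { });
}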
