Apache Kafka: How to check that an event has been fully handled?

I am facing an issue when decoupling two systems with an event/message broker like Apache Kafka. The issue relates to a frontend triggering actions in a backend:
How does the producer (a frontend service) know that the published event has been properly handled by all the backend services (the consumers), if the publisher knows neither the "identities" nor the number of consuming backends?
To be precise: users can, for example, change their email address in a frontend UI. An associated service publishes that "change request" event to an appropriate topic in Kafka. The UI form is then "locked" to prevent subsequent change requests until the change event has been fully processed by every consumer. But it's unclear how to detect this state.

You can use another topic to publish handled jobs. So your front-end publishes to one topic and your back-end publishes to another once it is done.
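A minimal sketch of that second-topic idea, using the plain Java kafka-clients API with hypothetical topic and field names: once a backend service has finished handling the change request, it publishes a completion event keyed by the original request id, and the frontend service consumes that topic to decide when to unlock the form.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;
    import java.util.Properties;

    public class CompletionPublisher {
        // Called by a backend service after it has fully processed the change request.
        public static void publishCompletion(String requestId, String serviceName) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Keyed by request id so all completions for one request land in the same partition.
                producer.send(new ProducerRecord<>("email-change-completed", requestId, serviceName));
            }
        }
    }

If the frontend must wait for every backend, it can count distinct service names per request id; if a single confirmation is enough, the first completion event suffices.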

In Kafka terms, neither the producer nor the consumer is considered "backend" - they're both clients connecting to a broker, which is generally considered to be the backend.
A producer will know that it has produced a message successfully, by virtue of the acks setting. A consumer will read a message, and then at a later point, its offset will be updated to a point corresponding to the last message it read. However, there is generally no interaction between a producer and a consumer, and they are generally completely unaware of one another.
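For illustration, this is roughly what the acks guarantee gives the producer (plain kafka-clients, hypothetical topic name): with acks=all the send() future completes successfully only after all in-sync replicas have persisted the record, so the producer knows the broker has stored the message, but it still learns nothing about whether any consumer has read it.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;
    import org.apache.kafka.common.serialization.StringSerializer;
    import java.util.Properties;

    public class AckedProducer {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                RecordMetadata meta = producer
                        .send(new ProducerRecord<>("email-change-requests", "user-1", "new@example.com"))
                        .get();
                // Broker-level confirmation only: the record is durably stored at this offset.
                System.out.println("stored at partition " + meta.partition() + ", offset " + meta.offset());
            }
        }
    }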

Related

Using transactional bus inside consumer

I have a REST API gateway that calls one of the microservices with the MassTransit request client. This request is not durable and is meant to live for a short time; essentially it is just a replacement for "traditional" synchronous (HTTP/gRPC/etc.) gateway-to-microservice communication.
On the microservice side I have a consumer which, under the hood, uses a DbContext and a transaction (EF Core) to perform some work in the database. After the work is done it should publish a "WorkDoneEvent" (to be consumed later by other microservices) and return the result of the work to the API gateway. The event must be published atomically along with the transaction used to perform the work. It does not matter whether the API gateway receives the response or retries the request; as soon as the transaction is committed, both the work result and the publication of "WorkDoneEvent" must be guaranteed.
Normally this is done with a transactional outbox, which first saves the published event to the database within the same transaction as the work. (A separate process then continuously polls the outbox, tries to send each message to the broker, and removes the message from the outbox once it has been delivered.) As far as I know.
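As a rough illustration of that "same transaction" step (plain JDBC, not MassTransit-specific, with a hypothetical users table and outbox table):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class OutboxExample {
        public static void changeEmail(Connection db, long userId, String newEmail) throws SQLException {
            db.setAutoCommit(false);
            try (PreparedStatement work =
                         db.prepareStatement("UPDATE users SET email = ? WHERE id = ?");
                 PreparedStatement outbox =
                         db.prepareStatement("INSERT INTO outbox (event_type, payload) VALUES (?, ?)")) {
                // 1. The actual work.
                work.setString(1, newEmail);
                work.setLong(2, userId);
                work.executeUpdate();

                // 2. The event, recorded in the same database transaction as the work.
                outbox.setString(1, "WorkDoneEvent");
                outbox.setString(2, "{\"userId\":" + userId + "}");
                outbox.executeUpdate();

                db.commit(); // either both the work and the event are stored, or neither is
            } catch (SQLException e) {
                db.rollback();
                throw e;
            }
        }
    }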
MassTransit seems to have transactional outbox built in: https://masstransit-project.com/advanced/middleware/transactions.html#transactional-bus.
However, the docs clearly state:
Never use the TransactionalBus or TransactionalEnlistmentBus when writing consumers. These tools are very specific and should be used only in the scenarios described.
And this is exactly what I want to do...
Why should I not do it?
I'd suggest using the InMemoryOutbox, which is part of MassTransit. It's significantly lighter weight, is designed to work in a consumer, and will not publish your events until after the consumer has completed (but prior to acknowledging the message at the broker). The only consideration is that your consumer should be idempotent (which needs to be the case in your approach as well) and if the operation was already performed on a retry, it should republish the events.
There are videos, articles, and a sample to go along with it.

MassTransit - publish to all consumer instances

I am looking for a way for each consumer instance to receive a message that is published to RabbitMQ via MassTransit. The scenario: we have multiple microservices that need to invalidate a cache on notification. Plain pub-sub won't work here, because there will be five consumers of the same type (it is the same code in every service instance), so only one of them would receive the message in a traditional pub-sub setup.
Message observation could be an option, but that would mean the messages are never consumed and hang around forever on the bus.
Can anyone suggest a pattern to use in the context of MassTransit?
Thanks in advance.
You should create a management endpoint in each service, which could even be a temporary queue (just request a receive endpoint without a queue name and one will be dynamically generated). Then put your cache invalidation consumers on that endpoint. Each service instance will receive a unique instance of the message (when Publish is called), and those queues and bindings will automatically be removed once the service exits.
This is exactly how the bus endpoint works, but in your case, you're creating a receive endpoint which can have consumer message type bindings, so that published messages are received, one copy per service.
cfg.ReceiveEndpoint(e => { ... });
Note that the queue name is not specified, and will be automatically generated uniquely.

ActiveMQ with slow consumer skips 200 messages

I'm using ActiveMQ along with Mule (a kind of ESB based on Spring).
We have a fast producer and a slow consumer.
It's a synchronous configuration with only one consumer.
Here is the consumer configuration in Spring style: http://pastebin.com/vweVd1pi
The biggest requirement is to keep the order of the messages.
However, after hours of running this code, ActiveMQ suddenly skips 200 messages and delivers the following ones. The 200 messages are still there in ActiveMQ; they are not lost.
But our client (Mule) has custom code that checks the order of the messages using a unique identifier.
I already had this issue a few months ago. We changed the consumer to use the parameter "jms.prefetchPolicy.queuePrefetch=1". It seemed to work well and to be the fix we needed, until now, when the issue reappeared on another consumer.
Is this a bug or a configuration issue?
I can't talk about the requirement from a Mule perspective, but there are a couple of broker features that you should take a look at. There are two ways to guarantee message ordering in ActiveMQ:
Message groups are a way of ensuring that a set of related messages will be consumed by the same consumer in the order that they are placed on a queue. To use it you need to specify a JMSXGroupID header on related messages, and assign them an incrementing JMSXGroupSeq number. If a consumer dies, remaining messages from that group will be sent to another single consumer, while still preserving order.
Total message ordering applies to all messages on a topic. It is configured on the broker on a per-destination basis and requires no particular changes to client code. It comes with a synchronisation overhead.
Both features allow you to scale out to more than one consumer.
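A minimal producer-side sketch of message groups (plain JMS against ActiveMQ, with a hypothetical broker URL and queue name): every message carrying the same JMSXGroupID is dispatched, in order, to a single consumer.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.JMSException;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class GroupedProducer {
        public static void main(String[] args) throws JMSException {
            ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("orders"));

            for (int seq = 1; seq <= 3; seq++) {
                TextMessage message = session.createTextMessage("update #" + seq);
                message.setStringProperty("JMSXGroupID", "customer-42"); // same group, same consumer
                message.setIntProperty("JMSXGroupSeq", seq);             // incrementing sequence within the group
                producer.send(message);
            }
            connection.close();
        }
    }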

Enterprise Integration Scatter-Gather across multiple app servers

I am looking for a way to aggregate JMS messages sent from multiple application servers, load-balanced via JMS. The problem is basically this:
At the end of our registration form there is a container in the HTTP session, and the container holds two objects of the same type. Each object needs to be processed, then the container needs to be delivered. Processing an object is resource-intensive, so the processing is requested (InOnly, asynchronous) and queued up in OpenMQ. The JMS message is consumed by one of two competing consumers, which are basically duplicate application servers that also serve the web requests.
Currently I just have a hard-coded delay before the container delivery, but with increased traffic there are plenty of delivery failures, since the objects have not finished processing yet. I am using Apache Camel 2.6 and Spring Remoting, and the Camel Aggregator would be ideal, except that each app server must have a duplicate Camel context, so they would be competing for the aggregation components.
Perhaps a temporary queue and endpoint for each aggregation, but I'm not sure how to go about doing that, especially the tear-down. What would be the best way to process both objects, then deliver the container?
You could send a message to a topic when each object is finished. The message should contain the context ID and the object ID. You would then have a from() route on that topic. When a message is received, the route persists the state in a simple DB table and checks whether the confirmation for the other object has already been persisted. If it has, it delivers the container.
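A rough Camel sketch of that route, assuming a JMS component registered as "jms", hypothetical topic/queue names, and a registry bean "confirmationStore" whose recordAndCheckComplete method persists the confirmation and returns true once both objects of a container have been confirmed:

    import org.apache.camel.builder.RouteBuilder;

    public class ContainerCompletionRoute extends RouteBuilder {
        @Override
        public void configure() {
            // Each app server publishes a confirmation (containerId + objectId headers)
            // to this topic when it finishes processing one object.
            from("jms:topic:object.finished")
                // Persist the confirmation and report whether both objects are now done.
                .to("bean:confirmationStore?method=recordAndCheckComplete")
                // Only the message that completes the pair passes the filter.
                .filter(body().isEqualTo(Boolean.TRUE))
                    // Both objects processed: hand the container off for delivery.
                    .to("jms:queue:container.delivery");
        }
    }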

About JMS system structure

I'm writing a client/server game. A typical scenario looks like this: one client (clientA) sends a message to the server, where a MessageDrivenBean handles such messages. After the MDB has finished its job, it sends the result message back to another client (clientB).
In my opinion I only need two queues for such communication: one for input, the other for output. Creating a new queue for each connection is not a good idea, right?
The input queue is relatively clear: if several clients send messages at the same time, the messages simply wait in the queue, and since there are multiple MDB instances on the server, that should not be a big performance issue.
But I am not so clear about the output side: should I use a topic instead of a queue? With a queue, every client listens on the output queue; one of them gets the new message and checks a property to determine whether the message is addressed to it; if not, it rolls back the transaction, the message goes back to the queue and becomes available to the other clients… It should work, but it must be very slow. If I use a topic instead, every client gets a copy of the message and simply ignores it if it is not addressed to it. That should be better, right?
I'm new to messaging systems. Are there any suggestions about my implementation? Thanks!
To begin with, choosing JMS as a gaming platform is, well, unusual; businesses use JMS brokers for delivery reliability and transaction support. Do you really need this heavy lifting in a game? Shouldn't you resort to your own HTTP-based protocol, for example?
That said, two queues are a standard pattern for point-to-point communication. Creating a queue for each new connection is definitely not OK: message-driven beans are attached to queues at deployment time, so you won't be able to respond to queue creation events. Besides, queues are not meant to be created and destroyed in short cycles; they are designed to be long-lived entities. If you need to deliver a message to one precise client, have the client listen on the server response queue with a message selector set to filter only the messages intended for this client (see the javax.jms.Message API).
With topics it's exactly as you noted — each connected client will get a copy of the message — so again, it's not a good pattern to send to n clients a message that has to be discarded by n-1 clients.
MaDa;
You could stick to one output queue (or topic) and simply tag each message with a header that identifies the intended client. Clients can then listen on the queue/topic using a selector. Hopefully your JMS implementation has efficient server-side selector evaluation.
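A small sketch of the selector approach in plain JMS (hypothetical property and queue names): the server sets a "recipient" property on each reply, and each client creates its consumer with a selector so the broker only delivers the messages addressed to it.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.JMSException;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;

    public class ReplyListener {
        // connectionFactory and replyQueue would normally be looked up via JNDI.
        public static void listen(ConnectionFactory connectionFactory, Queue replyQueue,
                                  String clientId) throws JMSException {
            Connection connection = connectionFactory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

            // Only messages whose "recipient" property equals this client's id are delivered.
            MessageConsumer consumer =
                    session.createConsumer(replyQueue, "recipient = '" + clientId + "'");
            consumer.setMessageListener(message -> System.out.println("got reply: " + message));

            connection.start();
        }
    }

On the server side, the MDB would set the matching property before sending, e.g. message.setStringProperty("recipient", "clientB").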

Resources