How to force a queue to run synchronously for tests?


We have an app that sends messages, and request specs that ensure that the messages are sent.
We are adding a queueing system for the messages. Each message is stored in the db, and then later processed and deleted. The records are de-queued asynchronously in another process. So the specs now fail.
What is a good way to automatically process the queue for the specs?
One approach would be to add an observer to the queue that automatically processes each message as it is queued. But I'm not sure if it makes sense to do it that way, especially since it is only for tests.
What is a good way to handle this?

If I'm understanding this correctly, you have a spec that creates a message and sends it, then somehow verifies that it was sent. Now you are changing the app to queue messages and send them later. Where you previously had one feature (send a message), you now have two features (1. queue a message; 2. send a message).
I'd say the specs should test those features separately, i.e., one spec that verifies that newly created messages are queued, and another spec that verifies that any queued message is sent. That will make the specs much easier to implement, and the specs will better reflect the behavior of the application.
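The split described above can be sketched in a language-neutral way. The following is a minimal illustration in plain Java, not the asker's Ruby app: `MessageQueue`, `drain`, and `QueueSpecSketch` are hypothetical names, and an in-memory queue stands in for the db-backed one. The point is that the second check calls the processing step directly in the test thread, so no background process is needed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical stand-in for the db-backed message queue from the question.
final class MessageQueue {
    private final Queue<String> pending = new ArrayDeque<>();

    void enqueue(String message) {
        pending.add(message);
    }

    int size() {
        return pending.size();
    }

    // Processes and removes every queued message in the calling thread,
    // which is the work the background worker would otherwise do later.
    int drain(Consumer<String> sender) {
        int processed = 0;
        String message;
        while ((message = pending.poll()) != null) {
            sender.accept(message);
            processed++;
        }
        return processed;
    }
}

final class QueueSpecSketch {
    // Feature 1: creating a message queues it.
    static boolean queuesNewMessage() {
        MessageQueue queue = new MessageQueue();
        queue.enqueue("hello");
        return queue.size() == 1;
    }

    // Feature 2: processing the queue sends every queued message, in order.
    static List<String> sendsQueuedMessages() {
        MessageQueue queue = new MessageQueue();
        queue.enqueue("a");
        queue.enqueue("b");
        List<String> sent = new ArrayList<>();
        queue.drain(sent::add);
        return sent;
    }
}
```

A request spec that still needs the end-to-end behavior could call the same drain step explicitly after the request, instead of relying on an observer that exists only for tests.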

Related

How to know the running status of a Spring Integration flow

I have a simple integration flow, written in the DSL style, that polls data from a database on a cron schedule and publishes it on a DirectChannel. It then splits and transforms the data, publishes it on another executor service channel, performs some operations, and finally publishes to an output channel.
I also have an endpoint where I might receive an HTTP request to trigger this flow; at that point I send the messages to one of the mentioned channels to trigger the flow.
I want to make sure that the manual trigger doesn’t happen if the flow is already running due to either the cron job or another request.
I have used the isRunning method of the StandardIntegrationFlow, but it seems that it’s not thread safe.
I also tried using .wireTap(myService) and .handle(myService), where this service has an AtomicBoolean flag, but the flag got set for every message, which is not a solution.
I want to know if the flow is running without much intervention from my side, and if this is not supported how can I apply the atomic boolean logic on the overall flow and not on every message.
How can I simulate the race condition in a test to make sure my implementation prevents it?

The IntegrationFlow is just a logical container for the configuration phase. It does have those lifecycle methods, but they serve only the framework's internal logic. Even so, they would not help, because the endpoints are always running so that they can react to an event or an input message.
It is hard to control all of that because, as you explain, the processing is asynchronous. Even if we stop the SourcePollingChannelAdapter at the beginning of that flow to let your manual call do something, it doesn't mean that messages in other threads are no longer being processed. The AtomicBoolean cannot help here for the same reason: even if you set it to true in MessageSourceMutator.beforeReceive() and reset it to false in afterReceive() when the message is null, it still doesn't mean that the messages you pushed downstream in another thread have already been processed.
You might consider using an aggregator to reset the AtomicBoolean at the end of a batch: since you mention that you pull data from a DB, there is perhaps a number of records per poll that you can track downstream. This way your manual call could be skipped until the aggregator collects the results for that batch.
You also need to think about stopping a SourcePollingChannelAdapter at the moment when manual action is permitted, so there won't be any further race conditions with the cron.
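The batch-level flag and the race test asked about in the question can be sketched with plain java.util.concurrent, independent of Spring Integration. This is a minimal sketch under stated assumptions, not framework API: `BatchGuard`, `tryStart`, `recordDone`, and `BatchGuardRace` are hypothetical names. It assumes the trigger (cron poll or HTTP request) knows the batch size up front and that the end of the flow, for example the point where the aggregator releases the batch, reports each completed record.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard: one flag per batch rather than per message.
final class BatchGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger remaining = new AtomicInteger();

    // Claims the flow for a batch; compareAndSet lets exactly one caller win.
    boolean tryStart(int batchSize) {
        if (batchSize <= 0 || !running.compareAndSet(false, true)) {
            return false;
        }
        remaining.set(batchSize);
        return true;
    }

    // Called once per record at the end of the flow; the last record frees the flag.
    void recordDone() {
        if (remaining.decrementAndGet() == 0) {
            running.set(false);
        }
    }

    boolean isRunning() {
        return running.get();
    }
}

final class BatchGuardRace {
    // Releases all callers at the same instant and counts how many claim the flow.
    static int contend(BatchGuard guard, int callers) {
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        CountDownLatch ready = new CountDownLatch(callers);
        CountDownLatch go = new CountDownLatch(1);
        AtomicInteger winners = new AtomicInteger();
        for (int i = 0; i < callers; i++) {
            pool.execute(() -> {
                ready.countDown();
                try {
                    go.await(); // park until every caller is lined up
                    if (guard.tryStart(3)) {
                        winners.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown(); // already-submitted tasks still run
        try {
            ready.await();
            go.countDown();
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }
}
```

In a test, `BatchGuardRace.contend(new BatchGuard(), 8)` should report a single winner, and the guard should stay busy until `recordDone()` has been called for all three records. Wiring the guard into the real flow is left open here, since that depends on how the batch size is known downstream.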

Send, Publish and Request/Response in MassTransit

I have recently started using MassTransit in our microservice ecosystem.
According to the MassTransit vocabulary and documentation, my understanding is:
Publish: sends a message to one or many subscribers (pub/sub pattern) to propagate the message.
Send: sends messages in a fire-and-forget fashion like Publish, but to just one receiver. The main difference from Publish is that with Send, if the destination does not receive the message, an exception is returned.
Requests: use the request/reply pattern to send a message and get a response on a different channel, so the sender can get a response value from the receiver.
Now, my question: according to the microservice concept, to follow an event-driven design we use Publish to propagate messages (events) to the entire ecosystem. But what exactly is the use case for Send here? Just to get an exception if the receiver doesn't exist?
My next question: is it a good approach to use Publish, Send and Requests in a microservices ecosystem at the same time? For example, Publish for propagating events, Send for commands (fire and forget), and Requests for getting responses from the destination.
----- Update
I also found this discussion, in which Chris Patterson clears up a lot of things. It helped me a lot.

Your question is not specific to MassTransit. MassTransit implements well-known messaging patterns that are thoroughly described in popular resources such as Enterprise Integration Patterns.
As Eben wrote in his answer, the decision of what pattern to use is driven by intent. There are also technical differences in the message delivery mechanics for each pattern.
Send is for commands, you tell some other service to do something. You do not wait for a reply (fire and forget), although you might get a confirmation of the action success or failure by other means (an event, for example).
It is an implementation of the point-to-point channel, where you also can implement competing consumers to scale the processing, but those will be instances of the same service.
With MassTransit using RabbitMQ it's done by publishing messages to the endpoint exchange rather than to the message type exchange, so no other endpoints will get the message even though they can consume it.
Publish is for events. It's a broadcast type of delivery or fan-out. You might be publishing events to which no one is listening, so you don't really know who will be consuming them. You also don't expect any response.
It is an implementation of the publish-subscribe channel.
MassTransit with RabbitMQ creates exchanges for each message type published and publishes messages to those exchanges. Consumers create bindings between their endpoint exchanges and message exchanges, so each consumer service (different apps) will get those in their independent queues.
Request-response can be used either for commands that need to be confirmed or for queries.
It is an implementation of the request-reply message pattern.
MassTransit has nice diagrams in the docs explaining the mechanics for RabbitMQ.
Those messaging patterns are frequently used in a complex distributed system in different combinations and variations.

The difference between Send and Publish has to do with intent.
As you stated, Send is for commands and Publish is for events. I worked on a large enterprise system once running on webMethods as the integration engine/service bus and only events were used. I can tell you that it was less than ideal. If the distinction had been there between commands and events it would've made a lot more sense to more people. Anyway, technically one needs a message enqueued and on that level it doesn't matter, which is why a queueing mechanism typically would not care about such semantics.
To illustrate this with a silly example: Facebook places an Event on my timeline that one of my friends is having a birthday on a particular day. I can respond directly (send a message) or I could publish a message on my timeline and hope my friend sees it. Another silly example: You send an e-mail to PersonA and CC 4 others asking "Please produce report ABC". PersonA would be expected to produce the report or arrange for it to be done. If that same e-mail went to all five people as the recipient (no CC) then who gets to do it? I know, even for Publish one could have a 1-1 recipient/topic but what if another endpoint subscribed? What would that mean?
So the sender is responsible, still configurable as subscriptions are, to determine where to Send the message to. For my own service bus I use an implementation of an IMessageRouteProvider interface. A practical example in a system I once developed was where e-mails received had to have their body converted to an image for a content store (IBM FileNet P8 if memory serves). For reasons I will not go into the systems were stopped each night at 20h00 and restarted at 6h00 in the morning. This led to a backlog of usually around 8000 e-mails that had to be converted. The conversion endpoint would process a conversion in about 2 seconds but that still takes a while to work through. In the meantime the web front-end folks could request PDF files for conversion to paged TIFF files. Now, these ended up at the end of the queue and they would have to wait hours for that to come back. The solution was to implement another conversion endpoint, with its own queue, and have the web front-end configured to send the same message type, e.g. ConvertDocumentCommand to that "priority" queue for processing. Pretty easy to do. Now, if that had been a publish how would I do that split? The same event going to 2 different endpoints under different circumstances? Well, you could have another subscription store for your system but now you'd need to maintain both. There could be another answer such as coding this logic into the send bit but that is a design choice and would require coding changes.
In my own Shuttle.Esb service bus I only have Send and Publish. For request/response both the sender and receiver have an inbox and a request would be sent (Send) to the receiver and it in turn could reply (also a Send but uses the sender's URI).

Using transactional bus inside consumer

I have a REST API gateway which calls one of the microservices with the MassTransit request client. This request is not durable and is meant to live for a short time; essentially it is just a replacement for "traditional" synchronous (via HTTP/gRPC/etc.) gateway-microservice communication.
On the microservice side I have a consumer which under the hood uses a DbContext and a transaction (EF Core) to perform some work in the database. After the work is done, it should publish a "WorkDoneEvent" (to be consumed later by other microservices) and return the result of the work to the API gateway. The event must be published atomically along with the transaction used to perform the work. It does not matter whether the API gateway receives the response or retries the request: as soon as the transaction is committed, both the work result and the sending of "WorkDoneEvent" must be guaranteed.
As far as I know, this is normally done with a transactional outbox, which first saves the published event to the database within the same transaction in which the work is done. Then some process constantly polls the outbox and tries to send the message to the broker; when that succeeds, it removes the message from the outbox.
MassTransit seems to have transactional outbox built in: https://masstransit-project.com/advanced/middleware/transactions.html#transactional-bus.
However in docs it clearly states:
Never use the TransactionalBus or TransactionalEnlistmentBus when writing consumers. These tools are very specific and should be used only in the scenarios described.
And this is exactly what I want to do...
Why should I not do it?

I'd suggest using the InMemoryOutbox, which is part of MassTransit. It's significantly lighter weight, is designed to work in a consumer, and will not publish your events until after the consumer has completed (but prior to acknowledging the message at the broker). The only consideration is that your consumer should be idempotent (which needs to be the case in your approach as well) and if the operation was already performed on a retry, it should republish the events.
There are videos, articles, and a sample to go along with it.

Can client receive multiple messages of the queue before acknowledging them?

My program will be receiving messages rather slowly, and I want them to persist in the queue until I have received all of them and acknowledged all of them. I don't know whether I have enough messages until I have received a bunch of them.
My question: will the queue block, waiting for the acknowledgement from the first message before delivering the second?

Well, I ran a test on this using the sample producer/consumer code. The consumer actually has some code for this (if you switch over to ClientAcknowledge). It receives a bunch of messages (10 of them) and only acks the last one.

When you set the acknowledge mode to Session.CLIENT_ACKNOWLEDGE, you can receive as many messages as you need. The messages will be locked on the server, so no other consumer can retrieve them in the meantime. So the answer is no, the queue won't block (even though there might be provider-specific settings that can do that, which I don't know about).
However, you can only acknowledge them all at once. So when you have received 10 messages and you acknowledge one of them (it doesn't matter which), all the messages the session has received so far will be acknowledged.
For reference, see Controlling Message Acknowledgment.

Spring's JMS Design Question: Decouple processing of messages

I'm using a message listener based on Spring's DefaultMessageListenerContainer to process some messages from MQ. After I receive a message, I have to make a Web Service (WS) call. However, I don't want to do this in the onMessage method, because it would block onMessage until the WS invocation succeeds, and this introduces latency in dequeuing messages from the queue. How can I decouple the Web Service invocation, either by calling it outside of the onMessage method or otherwise without impacting the dequeuing of messages?
Thanks,

I think you might actually want to invoke the web service from your onMessage. Why do you want to dequeue messages quickly, then delay further processing? If you do what you're saying, you'd probably have to introduce another level of queueing, or some sort of temporary "holding" collection, which is redundant. The point of the queue is to hold messages, and your message listener will pull them off and process them as quickly as possible.
If you are looking for a way to maximize throughput on the queue, you might think about making it multi-threaded, so that you have multiple threads pulling messages off the queue to invoke the web service. You can easily do this by setting the "concurrentConsumers" configuration on the DefaultMessageListenerContainer. If you set concurrentConsumers to 5, you'll have 5 threads pulling messages off the queue to process. It does get tricky if you have to maintain ordering on the messages, but there may be solutions to that problem if that's the case.

I agree with the answer provided before mine; however, I see use cases similar to this very commonly in practice, so I'm adding my two cents. In some cases it can be valid not to do time-consuming work in your onMessage thread (which is pulling messages from the queue).
We have something similar in one workflow: if the user selects some XYZ option in the GUI, the server needs to connect to another external web service to get ABCD. In this case we do not call the web service in the onMessage thread; we use a thread pool to dispatch and handle that call.
If something goes wrong during the web service call, we broadcast that to the GUI as a separate message. There is a concept of a request ID which is preserved across messages so that the GUI can relate error messages to requests. You can use an ExecutorService implementation to submit the task.
Hope it helps.
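The thread-pool approach described in this answer could look roughly like the following. It is a minimal sketch using only java.util.concurrent: `WebServiceDispatcher` is a hypothetical name, the `Function` stands in for the real web service client, and the `BiConsumer` stands in for whatever publishes the reply (keyed by request ID) back to the GUI. The JMS listener itself is omitted; its onMessage would extract the request ID and payload and delegate to this class.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical dispatcher: the listener hands the slow call to a pool and returns at once.
final class WebServiceDispatcher {
    private final ExecutorService pool;
    private final Function<String, String> webService;       // stand-in for the WS client
    private final BiConsumer<String, String> replyPublisher; // (requestId, text) sent to the GUI

    WebServiceDispatcher(int threads,
                         Function<String, String> webService,
                         BiConsumer<String, String> replyPublisher) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.webService = webService;
        this.replyPublisher = replyPublisher;
    }

    // Intended to be called from onMessage; it does not block on the web service.
    Future<?> onMessage(String requestId, String payload) {
        return pool.submit(() -> {
            try {
                replyPublisher.accept(requestId, "OK: " + webService.apply(payload));
            } catch (RuntimeException e) {
                // The request ID lets the GUI relate the error to the original request.
                replyPublisher.accept(requestId, "ERROR: " + e.getMessage());
            }
        });
    }

    // Stops accepting work and waits for in-flight calls; false on timeout or interrupt.
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

One trade-off to keep in mind: once onMessage returns, the container treats the message as handled, so work still sitting in the pool is lost if the process dies. That is part of why the first answer suggests calling the web service inside onMessage and scaling with concurrentConsumers instead.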
