Redis Queue Architecture - parallel-processing

How can we set up a Redis queue to process jobs in parallel? Can we place several queues inside a single Redis queue?
I am building a chat application and I want to reduce the delay as much as possible. If many people send messages at the same time, there will be many messages waiting in the Redis queue. Is there a way to handle that?
I am using Redis for in-memory message passing.

Redis executes commands on a single thread, so no two items are handled in parallel. That's not as bad as it sounds at first, because Redis handles these small operations very fast (see http://redis.io/topics/benchmarks for further detail on how fast it is).
A sorted set keeps only one entry per member (it is the members, not the scores, that must be unique), so using one might not be a good idea here. But you can use a normal list like this:
1. Store the new message key
LPUSH chatquene message1
2. Store the message information
HMSET message1 time 1234 user Adam message hi receiver Eve
3. Retrieve the next message key (RPOP returns the oldest entry pushed with LPUSH)
RPOP chatquene
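The three steps can be sketched in Python. This is a minimal sketch, not a definitive implementation: InMemoryRedis is a hypothetical stand-in so the code runs without a server, and send_message and receive_message are illustrative names. The stand-in models only lpush, rpop, hset and hgetall, which are also method names on the redis-py client, where hset with a mapping covers what HMSET does.

```python
from collections import defaultdict, deque

QUEUE_KEY = "chatquene"  # key name exactly as spelled in the answer above


class InMemoryRedis:
    """Hypothetical stand-in for a Redis client; models only the calls used here."""

    def __init__(self):
        self.lists = defaultdict(deque)
        self.hashes = defaultdict(dict)

    def lpush(self, key, value):
        self.lists[key].appendleft(value)  # push onto the head of the list
        return len(self.lists[key])

    def rpop(self, key):
        items = self.lists[key]
        return items.pop() if items else None  # pop from the tail of the list

    def hset(self, key, mapping):
        self.hashes[key].update(mapping)

    def hgetall(self, key):
        return dict(self.hashes[key])


def send_message(client, message_id, fields):
    # Store the details first so a consumer never pops a key without a hash.
    client.hset(message_id, mapping=fields)
    client.lpush(QUEUE_KEY, message_id)


def receive_message(client):
    # Head push plus tail pop gives first-in, first-out delivery.
    message_id = client.rpop(QUEUE_KEY)
    if message_id is None:
        return None
    return client.hgetall(message_id)
```

Several consumers can call receive_message against the same list: each pop hands a given key to exactly one of them, so the work is spread out even though Redis itself runs the commands one at a time.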

Related

Spring Batch Remote Chunking Chunk Response

I have implemented Spring Batch Remote Chunking with Kafka. I have implemented both Manager and worker configuration. I want to send some DTO or object in chunkresponse from worker side to Manager and do some processing once I receive the response. Is there any way to achieve this. I want to know the count of records processed after each chunk is processed from worker side and I have to update the database frequently with count.
I want to send some DTO or object in chunkresponse from worker side to Manager and do some processing once I receive the response. Is there any way to achieve this.
I'm not sure the remote chunking feature was designed to send items from the manager to workers and back again. The ChunkResponse is what the manager expects from workers, and I see no way to send processed items in it (except perhaps by serializing the item in the ChunkResponse#message field or storing it in the execution context, neither of which is a good idea).
I want to know the count of records processed after each chunk is processed from worker side and I have to update the database frequently with count.
The StepContribution is what you are looking for here. It holds all the counts (read count, write count, etc). You can get the step contribution from the ChunkResponse on the manager side and do what is required with the result.

ActiveMQ - Competing Consumers with Selector - messages starve in the queue

ActiveMQ 5.15.13
Context: I have a single queue with multiple Consumers. I want to stop some consumers from processing certain messages. This has to be dynamic, I don't want to create separate queues for this. This works without any problems. e.g. Consumer1 ignores Stocks -> Consumer1 can process all invoices and Consumer2 can process all Stocks
But if there is a large number of messages already in the Queue (of one type, e.g. stocks) and I send a message of another type (e.g. invoices), Consumer1 won't process the message of type invoices. It will instead be idle until Consumer2 has processed all Stocks messages. It does not happen every time, but quite often.
Is there any option to change the order of the new messages coming into the queue, such that an idle consumer with matching selector picks up the new message?
Things I've already tried:
using a PendingMessageLimitStrategy -> it seems like it does not work for queues
increasing the maxPageSize and maxBrowsePageSize in the hope that once all Messages are in RAM, the Consumers will search for their messages.
Exclusive Consumers aren't an option since I want to be able to use more than one Consumer per message type.
I'm pretty sure that there is some configuration which allows this type of usage. I'm aware that there are better solutions for this issue, but sadly I can't use them easily due to other constraints.
Thanks a lot in advance!
EDIT: I noticed that when I'm refreshing on the localhost queue browser, the stuck messages get executed immediately. It seems like this action performs some sort of queue refresh where the messages get filtered based on their selector again. So I just need this action whenever a new message enters the queue...
This is a 'window' problem where the next set of 'stocks' data needs to be processed before the 'invoicing' data can be processed.
The gotcha with window problems like this is that you need to account for the fact that some messages may never come through, or that a consumer may never come back online. Also, eventually you will be asked 'how many invoices or stocks are left to be processed', aka observability.
ActiveMQ has you covered: check out wildcard destinations and consumers.
Produce 'stocks' to:
queue://data.stocks.input
Produce 'invoices' to:
queue://data.invoices.input
You then set up consumers to connect to:
queue://data.*.input
Note the wildcard '*'.
ActiveMQ will match queues based on the wildcard pattern, and then process data accordingly. As a bonus, you can still use a selector.
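To make the matching rule concrete, here is a small Python sketch of how a pattern such as data.*.input selects queue names. It is only an illustration under stated assumptions: the broker does this matching itself, the function name matches is hypothetical, and only the single-element '*' wildcard is modelled. The third queue name is made up for the example.

```python
def matches(pattern: str, destination: str) -> bool:
    """Compare dot-separated elements; '*' stands for exactly one element."""
    pattern_parts = pattern.split(".")
    name_parts = destination.split(".")
    if len(pattern_parts) != len(name_parts):
        return False
    return all(p == "*" or p == n for p, n in zip(pattern_parts, name_parts))


# The first two names come from the answer; the third is a hypothetical extra.
queues = ["data.stocks.input", "data.invoices.input", "data.stocks.output"]
matched = [q for q in queues if matches("data.*.input", q)]
print(matched)  # ['data.stocks.input', 'data.invoices.input']
```

Because stocks and invoices now live in separate queues, a backlog of one type no longer sits in front of messages of the other type.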

Is it possible to define a single saga which will process many messages

My team is considering whether we can use MassTransit as our primary solution for sagas on RabbitMQ (vs NServiceBus). I admit that our experience with solutions like MassTransit and NServiceBus is minimal, and we have only just started to introduce messaging into our system, so I am sorry if my question is simple or even stupid.
However, when I reviewed the MassTransit documentation, I was not sure whether it can solve one of our cases.
The case looks like:
One of our components will produce up to 100 messages, which will be sent to a queue. These messages are the result of a single operation in the system. All of them will carry the same correlation id and the same internal publication id.
1) Is it possible to define a single saga instance (by correlation id) that waits until it has received all messages from the queue and then processes them as a single batch?
2) Otherwise, is there any way to ensure that all of the sent messages were processed (a consistent batch)? I assume that the correlation id will serve as a way to find an existing saga instance (singleton). Ideally, I would like to complete a saga instance once the system has processed every message that belongs to a single group (one publication).
I looked at CompositeEvent too, but I am not sure whether I could use it to ensure that every message was processed before completing the saga for a specific correlation id.
Can you explain how this could be achieved? Which mechanism should I look at in order to correlate many messages with the same id to a single saga and then complete it once all of the messages have been consumed?
Thank you in advance for any response
What you describe is how correlation by id works. It is like that out of the box.
So, in short - when you configure correlation for your messages correctly, all messages with the same correlation id will be handled by the same saga instance.
Concerning the second question: unless you publish a separate event that tells the saga how many messages it should expect, how would it know? You can schedule a long timeout and assume that the saga will receive all the messages within it, but that is not reliable.
Composite events won't help here. They are for messages of different types that should be handled as one when all of them have arrived, and they do not count the number of messages of each type. A composite event just waits for one message of each type.
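The "separate event with the expected count" idea can be modelled in a few lines of Python. This is a plain in-memory sketch of the logic only, not MassTransit's API: BatchSaga, saga_for, on_batch_announced and on_item are hypothetical names, and the dictionary stands in for a saga repository.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BatchSaga:
    """State kept for one correlation id (hypothetical model)."""
    correlation_id: str
    expected: Optional[int] = None  # unknown until the announcement arrives
    items: list = field(default_factory=list)
    completed: bool = False

    def check_complete(self):
        # Complete only once the expected count is known and has been reached.
        if self.expected is not None and len(self.items) >= self.expected:
            self.completed = True


sagas = {}  # correlation id -> BatchSaga, standing in for a saga repository


def saga_for(correlation_id):
    """Find the existing instance for this id, or create it on first contact."""
    return sagas.setdefault(correlation_id, BatchSaga(correlation_id))


def on_batch_announced(correlation_id, expected):
    """Handle the separate event that says how many messages belong to the batch."""
    saga = saga_for(correlation_id)
    saga.expected = expected
    saga.check_complete()
    return saga


def on_item(correlation_id, payload):
    """Handle one of the batch messages; order relative to the announcement is free."""
    saga = saga_for(correlation_id)
    saga.items.append(payload)
    saga.check_complete()
    return saga
```

Both handlers run the completion check because the announcement may arrive before, between or after the item messages.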
The ability to receive a series of messages and then operate on them in a batch is a common case, so much so that there is a sample showing how to do just that:
Batch Sample
Each saga instance has a unique correlation identifier, and as long as those messages can be correlated to that single instance, MassTransit will manage the concurrency (either optimistic or pessimistic, and depending upon the saga storage engine).
I'd suggest reviewing the state machine in the sample, and seeing how that compares to your scenario.

How to create unique messages to rabbitmq queue - spring-amqp

I am publishing a message containing string data to a RabbitMQ queue.
Message publishing is invoked as part of a service, and the service can be called with the same data multiple times (the data goes to the queue each time), so duplicated data in the queue is very likely.
This causes problems because the consumer code inserts this data into a table where the data is the primary key. The consumer runs on 4 different nodes simultaneously, so different consumers can end up processing the same data (from different messages).
I want to know whether RabbitMQ publishing offers any way to avoid message duplication.
I read that "define a property "x-unique-message-code" to compare them is an easy and simple way", but I don't know how to do it.
I am using spring-amqp
Any help is highly appreciated.
Thank you
There is a good article from RabbitMQ about reliability: https://www.rabbitmq.com/reliability.html
There is a note like:
In the event of network failure (or a node crashing), messages can be duplicated, and consumers must be prepared to handle them. If possible, the simplest way to handle this is to ensure that your consumers handle messages in an idempotent way rather than explicitly deal with deduplication.
For this purpose the message to produce can be supplied with a messageId property.
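The idempotent-consumer advice in the quoted note can be sketched as follows. This is a minimal Python sketch under stated assumptions: an in-memory SQLite database stands in for the real table, and the table name items and the functions create_store and handle_message are hypothetical. The point is that the primary key itself rejects the second copy, so a duplicate message becomes a no-op instead of an error.

```python
import sqlite3


def create_store():
    """Create the table whose primary key is the message data (as in the question)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (data TEXT PRIMARY KEY)")
    return conn


def handle_message(conn, data: str) -> bool:
    """Insert the data once; return True if this call inserted it, False for a duplicate."""
    cursor = conn.execute("INSERT OR IGNORE INTO items (data) VALUES (?)", (data,))
    conn.commit()
    return cursor.rowcount == 1
```

With this shape it does not matter which of the consumer nodes receives a duplicate: the database decides, and the message can be acknowledged either way.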

What is the right approach for an async work queue with results?

I have a REST server on heroku. It will have N-dynos for the REST service and N-dynos for workers.
Essentially, I have some long running rest requests. When these come in I want to delegate them to one of the workers and give the client a redirect to poll the operation and eventually return the result of the operation.
I am going to use JEDIS/REDIS from RedisToGo for this. As far as I can tell there are two ways I can do this.
1. I can use the PUB/SUB functionality. Have the publisher create unique identities for the work results and return these in a redirect URI to the REST client.
2. Essentially the same thing, but use RPUSH/BLPOP instead of PUB/SUB.
I'm not sure what the advantage is to #1. For example, if I have a task called LongMathOperation it seems like I can simply have a list for this. The list elements are JSON objects that have the math operation arguments as well as a UUID generated by the REST server for where the results should be placed. Then all the worker dynos will just have blocking BLPOP calls and the first one there will get the job, process it, and put the results in REDIS using the key of the UUID.
Make sense? So my question is "why would using PUB/SUB be better than this?" What does PUB/SUB bring to the table here that I am missing?
Thanks!
I would also use lists, because pub/sub messages are not persistent: if there are no subscribers when a message is published, it is lost. In other words, if for whatever reason no workers are listening, the client won't get served properly. List entries, on the other hand, stay in Redis until a worker pops them. The flip side is that pub/sub takes less memory than lists, for the same reason: there is nothing to store.
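The list-based flow from the question (queue a JSON job with a UUID, let a worker pop it, store the result under that UUID) might look like this in Python. Treat it as a sketch under assumptions: InMemoryRedis is a hypothetical stand-in so the code runs without a server, the key names and the summing "math operation" are made up, and the stand-in's blpop returns immediately where a real BLPOP would block. The method names rpush, blpop, set and get also exist on the redis-py client.

```python
import json
import uuid
from collections import defaultdict, deque

JOB_LIST = "jobs:LongMathOperation"  # hypothetical key name


class InMemoryRedis:
    """Hypothetical stand-in for a Redis client; models only the calls used here."""

    def __init__(self):
        self.lists = defaultdict(deque)
        self.strings = {}

    def rpush(self, key, value):
        self.lists[key].append(value)

    def blpop(self, key, timeout=0):
        # A real BLPOP waits for an element; this stand-in answers at once.
        items = self.lists[key]
        return (key, items.popleft()) if items else None

    def set(self, key, value):
        self.strings[key] = value

    def get(self, key):
        return self.strings.get(key)


def submit_job(client, operands):
    """REST side: queue the job and return the id the client will poll."""
    job_id = str(uuid.uuid4())
    client.rpush(JOB_LIST, json.dumps({"id": job_id, "operands": operands}))
    return job_id


def work_once(client):
    """Worker side: take one job, compute, store the result under the job id."""
    popped = client.blpop(JOB_LIST, timeout=5)
    if popped is None:
        return False
    _key, raw = popped
    job = json.loads(raw)
    client.set("result:" + job["id"], json.dumps(sum(job["operands"])))
    return True


def poll_result(client, job_id):
    """REST side: None while the job is pending, otherwise the stored result."""
    raw = client.get("result:" + job_id)
    return None if raw is None else json.loads(raw)
```

A job submitted while no worker is running simply waits in the list, which is the persistence advantage over pub/sub described above.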

Resources