Autoscaling Backend and RabbitMQ Queues - spring-boot

I have an IoT system of around 100k devices, each publishing its state every second to a backend written in Java/Spring Boot. Until now I was using gRPC, but I see excessive CPU usage, so I'm planning to let the devices publish to RabbitMQ and have backend workers process the messages.
Processing: Updating the db table.
Since data from the same device must be processed sequentially, I was planning to use RabbitMQ's consistent-hashing exchange and bind n queues for n workers. But I'm not sure how that would work with autoscaling.
I thought of creating an auto-delete queue for each backend instance and binding them to the exchange, but I couldn't figure out:
How to rebalance messages already sitting in the queues?
If a connectivity issue occurs, a queue might get deleted, so I would need to re-forward its messages to the remaining queues.
Are there any algorithms for handling the autoscaling of workers? For instance, if messages pile up I need to spawn new workers even though CPU/memory usage is low.
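For reference, here is a minimal Spring AMQP sketch of the consistent-hashing setup described above. The exchange and queue names and the weights are illustrative, and the broker needs the rabbitmq_consistent_hash_exchange plugin enabled.

    import org.springframework.amqp.core.Binding;
    import org.springframework.amqp.core.BindingBuilder;
    import org.springframework.amqp.core.CustomExchange;
    import org.springframework.amqp.core.Queue;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class ConsistentHashConfig {

        // Consistent-hash exchange; requires the rabbitmq_consistent_hash_exchange plugin.
        @Bean
        public CustomExchange deviceStateExchange() {
            return new CustomExchange("device.state", "x-consistent-hash", true, false);
        }

        // One queue per worker (only one shown here; names/counts are illustrative).
        @Bean
        public Queue workerQueue1() {
            return new Queue("device.state.worker-1", true);
        }

        // For a consistent-hash exchange the binding key is a weight, not a pattern;
        // "1" gives each bound queue an equal share of the hash space.
        @Bean
        public Binding workerBinding1(CustomExchange deviceStateExchange, Queue workerQueue1) {
            return BindingBuilder.bind(workerQueue1).to(deviceStateExchange).with("1").noargs();
        }
    }

Publishers then use the device ID as the routing key, e.g. rabbitTemplate.convertAndSend("device.state", deviceId, payload), so every message from one device hashes to the same queue.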

I think I'll go with MQTT's shared subscriptions for this case.
https://emqx.medium.com/introduction-to-mqtt-5-0-protocol-shared-subscription-4c23e7e0e3c1
Sharing strategy
Although shared subscriptions allow subscribers to consume messages in
a load-balanced manner, the MQTT protocol does not specify what
load-balancing strategy the server should use. For reference, EMQ X
provides four strategies for users to choose: random, round_robin,
sticky, and hash.
random: randomly select one of all the shared subscription sessions to publish the message to
round_robin: select sessions in turn, in subscription order
sticky: randomly select a subscription session and keep using it until that subscription is cancelled or the client disconnects, then repeat the process
hash: hash the ClientID of the sender and select a subscription session based on the hash result
Hash seems like what I'm looking for.
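A minimal sketch of what the worker side could look like with the Eclipse Paho MQTT client, assuming the broker (EMQ X here) accepts the $share/<group>/<topic> filter and is configured with the hash strategy; the broker URL, group and topic names are illustrative:

    import org.eclipse.paho.client.mqttv3.MqttClient;
    import org.eclipse.paho.client.mqttv3.MqttConnectOptions;

    public class SharedSubscriptionWorker {
        public static void main(String[] args) throws Exception {
            MqttClient client = new MqttClient("tcp://broker.local:1883", "worker-" + System.nanoTime());
            MqttConnectOptions options = new MqttConnectOptions();
            options.setCleanSession(true);
            client.connect(options);

            // Every worker subscribes with the same $share/<group>/<topic> filter.
            // With the broker's "hash" strategy, messages from one publisher
            // (hashed by its ClientID) keep going to the same worker.
            client.subscribe("$share/workers/devices/+/state", 1,
                    (topic, message) -> updateDeviceRow(topic, message.getPayload()));
        }

        private static void updateDeviceRow(String topic, byte[] payload) {
            // update the DB row for this device (omitted here)
        }
    }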

Related

Does EventStoreDB provide message ordering by an event-key on the consumer side?

I have been exploring EventStoreDB and trying to understand more about the ordering of messages on the consumer side. I have read about persistent subscriptions and also about the Pinned consumer strategy here.
I have a scenario wherein inventory updates get pushed to EventStore and different streams get created for the different unique inventoryIds in the inventory events.
We have multiple consumers with the same consumerGroup name to read these inventory events. We are using Pinned Persistent Subscription with ResolveLinkTos enabled.
My question:
Will every message from a particular stream always go to the same consumer instance of the consumerGroup?
If the answer to the above question is yes, will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
The documentation has a warning that ordered message processing using persistent subscriptions is not guaranteed. Any strategy delivers messages with a best-effort level of ordering guarantees, if applicable.
There are a few reasons for this, some of those are:
Spreading out messages across consumer groups leads to a non-linearised checkpoint commit. It means that some messages can be processed before other messages.
Persistent subscriptions attempt to buffer messages, but when a timeout happens on the client side, the whole buffer is redelivered, which can eventually break the processing order.
Built-in retry policies can essentially break the message order at any time.
Most event log-based brokers, if not all, don't even attempt to guarantee ordered message delivery across multiple consumers. I often hear "but Kafka does it", ignoring the fact that Kafka delivers messages from one partition to at most one consumer in a group. There's no load balancing of one partition between multiple consumers due to exactly the same issue. That being said, EventStoreDB is still not a broker, but a database for events.
So, here are the answers:
Will every message from a particular stream always go to the same consumer instance of the consumer group?
No. It might work most of the time, but it will eventually break.
will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
Most of the time, yes, but again, if a message is being retried, you might get the next message before the previous one is Acked.
Overall, load-balanced, ordered processing of messages that aren't pre-partitioned on the server is not an easy task. At most, you get messages re-delivered if the checkpoint fails to persist at some point and the consumers restart.
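Not specific to EventStoreDB, but if occasional redelivery or out-of-order delivery only needs to be detected rather than prevented, a common consumer-side guard is to track the last processed event number per stream and skip duplicates or park events that arrive ahead of their predecessor. A minimal sketch:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class PerStreamOrderingGuard {

        // last successfully processed event number per stream
        private final Map<String, Long> lastProcessed = new ConcurrentHashMap<>();

        // Returns true if the event was handled, false if it was a duplicate
        // redelivery or arrived before its predecessor (park/retry it later).
        public boolean process(String streamId, long eventNumber, Runnable handler) {
            long previous = lastProcessed.getOrDefault(streamId, -1L);
            if (eventNumber <= previous) {
                return false;          // already handled, duplicate redelivery
            }
            if (eventNumber > previous + 1) {
                return false;          // gap: predecessor not processed yet
            }
            handler.run();             // idempotent update of the read model / DB
            lastProcessed.put(streamId, eventNumber);
            return true;
        }
    }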

ActiveMQ - Cost of creating temporary queues

I would like to use queues dynamically generated in ActiveMQ to serialize the handling of events generated by multiple sources.
I need this to be sure that updates on the same record are never in conflict.
The problem is that I need a different queue for each set of updates that relate to the same record.
There could be in theory millions of records and, of course, I do not want to create millions of queues.
Ideally, a queue should be created when necessary and destroyed when all the updates are completed.
The events that fire the updates are asynchronous but are still correlated. I know that when something happens, several events will be fired at the same time.
It is practically a small burst of asynchronous but correlated updates.
After some time, the queue generated could be deleted.
I understand that there is a cost in creating and deleting queues, but am I right in thinking that the cost of creating and deleting these queues, at a rate that won't exceed a few queues per second even during a peak, won't create performance issues?
There is a cost to temporary queues, but it is generally not that high unless you have high network latency between the app server and the broker, so you should be fine.
Temporary queues, though, have some limits, such as being deleted once the connection that created them goes down. So, if you want your job to resume after a system restart, don't depend on temp queues. I also advise against dynamically creating regular queues at a rate of several per second; the system is not designed for that.
Generally, what you want to do when processing a group of related messages is to utilize message groups. That way you can use a single queue that does not depend on the connection of the producer/temp-queue creator.
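A minimal JMS sketch of the message-group idea: the producer stamps every update for a given record with the same JMSXGroupID, and ActiveMQ then delivers all messages of that group to a single consumer of the queue, in order. The queue name and group values are illustrative:

    import javax.jms.Connection;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class GroupedUpdateProducer {
        public static void main(String[] args) throws Exception {
            ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("record.updates");
            MessageProducer producer = session.createProducer(queue);

            TextMessage message = session.createTextMessage("{\"recordId\":42,\"field\":\"value\"}");
            // All updates for the same record share one group, so they are
            // delivered to one consumer, in order, over a single static queue.
            message.setStringProperty("JMSXGroupID", "record-42");
            producer.send(message);

            connection.close();
        }
    }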

TCP replication of topics

According to the documentation here: https://github.com/OpenHFT/Chronicle-Engine one is able to do pub/sub using maps. This allows one to create a construct similar to the topics available in middleware such as Tibco, 29W, Kafka, and use that as a way of sending events across processes. Is this a recommended usage of Chronicle Map? What kind of latency can I expect if both publisher and subscriber stay on the same machine?
My second question is, how can this be extended to send messages across machines? How does this work with enterprise TCP replication?
My requirement is to create thousands of topics and use them to communicate across processes running in different machines (in a LAN). Each of these topics would be written by a single source and read by multiple readers running in same or different machines. If the source of a particular topic dies, that source's replica would start writing to the topic and listeners will continue to receive messages. These messages need not be stored for replay.
Is this a recommended usage of chronicle map?
Yes, you can use Engine to support event notification across a machine. However, if you want the lowest latencies you might need to send a notification via a Queue and keep the latest value in a Map.
What kind of latency can I expect if both publisher and subscriber stay in the same machine?
It depends on your use case, especially the size of the data (in the Map case, the number of entries as well). The latency for a Map in Engine is around 30-100 µs, whereas the latency for a Queue is around 2-5 µs.
My second question is, how can this be extended to send messages across machines?
For this you need our licensed product but the code is the same.
Each of these topics would be written by a single source and read by multiple readers running in same or different machines. If the source of a particular topic dies, that source's replica would start writing to the topic and listeners will continue to receive messages.
Most likely, the simplest solution is to have a Map where each topic is a different key. This will send the latest value for that topic to the consumers.
If you need to record every event, a Queue is likely to be a better choice. If you don't need to retain the data for long, you can use a very short file rotation.
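A minimal Chronicle Map sketch of the "one key per topic, latest value wins" idea from the answer above; the entry count, sample key/value and file name are illustrative, and the replicated (cross-machine) variant needs the licensed product mentioned earlier:

    import java.io.File;
    import net.openhft.chronicle.map.ChronicleMap;

    public class LatestValuePerTopic {
        public static void main(String[] args) throws Exception {
            // Off-heap map persisted to a file, so other processes on the same machine
            // can open it; each topic is a key holding only the latest value.
            ChronicleMap<String, String> topics = ChronicleMap
                    .of(String.class, String.class)
                    .entries(10_000)                              // expected number of topics
                    .averageKey("orders.eu.frankfurt")
                    .averageValue("{\"price\":101.25,\"qty\":3}")
                    .createPersistedTo(new File("topics.dat"));

            topics.put("orders.eu.frankfurt", "{\"price\":101.25,\"qty\":3}");  // publish
            String latest = topics.get("orders.eu.frankfurt");                  // read latest value
        }
    }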

Performance and limitations of temporary queues

I want several hundred client apps to create and use temporary queues on one instance of the middleware.
Are there any performance-related drawbacks to using temp queues? Are there limitations, for example on how many temporary queues can be created per HornetQ instance?
On a recent project we have switched from using temporary queues to using static queues on SonicMQ. We had implemented synchronous service calls over JMS where the response of each call would be delivered on a dedicated temporary queue, created by the consumer. During stress testing we noticed that the overhead of temporary queue creation and allocated resources started to play a bigger and bigger part when pushing the maximum throughput of the solution.
We changed the solution so it would use static queues between consumer and provider and use a selector to correlate on the JMSCorrelationID. This resulted in better throughput in our case. If you are planning to (re)create the temporary queues that your client applications use on each call, it could start to impact performance when higher throughput rates are needed.
Note that selector performance can also start to play a role when the number of messages in a queue increases. In our case the solution was designed to hand off the messages as soon as possible and not play the role of a (storage) buffer between consumer and provider. As such, the number of messages inside a queue would always be low.
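A minimal JMS sketch of the static-queue variant described above: the requester generates a correlation ID, sends it with the request, and consumes from the shared reply queue using a selector on JMSCorrelationID, so no temporary queue is created per call. Queue names and the timeout are illustrative:

    import java.util.UUID;
    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;

    public class StaticReplyQueueClient {

        public String call(ConnectionFactory factory, String payload) throws Exception {
            Connection connection = factory.createConnection();
            try {
                connection.start();
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Queue requestQueue = session.createQueue("service.request");
                Queue replyQueue = session.createQueue("service.reply");   // static, shared by all clients

                String correlationId = UUID.randomUUID().toString();
                TextMessage request = session.createTextMessage(payload);
                request.setJMSCorrelationID(correlationId);
                request.setJMSReplyTo(replyQueue);
                session.createProducer(requestQueue).send(request);

                // Only the reply carrying our correlation ID is delivered to this consumer.
                MessageConsumer replies = session.createConsumer(
                        replyQueue, "JMSCorrelationID = '" + correlationId + "'");
                Message reply = replies.receive(5000);
                return reply == null ? null : ((TextMessage) reply).getText();
            } finally {
                connection.close();
            }
        }
    }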

How long can a message effectively stay in a message broker queue

I plan to have persistent message queues based on some implementation of the AMQP and JMS APIs. I would like to know whether it is OK (from an architectural point of view) to have messages staying in the queues for hours. A day is the maximum.
I plan to use the message broker as another persistence layer basically. Is this viable?
The technologies that I am evaluating are ActiveMQ, RabbitMQ, and Qpid.
I plan to use the message broker as another persistence layer basically. Is this viable?
The broker's persistence mechanism for message retention is usually file-based or JDBC; either one will work. Is it viable? Sure, it's a feature of the broker; nothing wrong with using it for its intended purpose, assuming temporary message retention is your goal. One day is not a big deal.
But if you're planning to retain messages for a day or more, I recommend doing some calculations based on the average message size and the total number of messages per day that may end up sitting in a queue. Queue depth, by default, is usually a low number, like 10 MB, and if it is exceeded, the broker will probably drop subsequent messages; you want to prevent this from happening. Vendors handle this differently, so check with RabbitMQ and ActiveMQ for specifics and for which configuration parameters control depth. I know SonicMQ has what's known as the "DeadMessage" queue, a destination for expired or undeliverable messages; other products might have something similar.
It's OK to have persistent queues, and it's OK if messages are hanging around in the queues: clients might be disconnected because of updates, network problems, etc. That's one benefit of queues: decoupling sender from receiver, with the queue as the buffer. However, these use cases are not the normal mode of operation; they are rather exceptional situations.
Using a messaging broker as "another persistence layer" is technically speaking possible, but in this case a database is probably more suitable, because quick message delivery/messaging and long term storage/database are different tools/scenarios. So ask yourself the question: Is it still messaging or is it already a database?
If in your use case the normal message delay (= period between sending and reception) is always beyond an hour, a database might be better, because JMS selectors are normally slower and less comfortable than database queries using where clauses.
There is another aspect: Consider the need for an online backup of your messages in a JMS provider, especially in a HA cluster mode. It might be easier to do this using a database.
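If messages should not outlive a day, the standard JMS approach (for brokers such as ActiveMQ or Qpid) is to set a time-to-live on the producer so the broker expires, and typically dead-letters, anything older; RabbitMQ achieves something similar with the x-message-ttl queue argument. A minimal sketch:

    import javax.jms.DeliveryMode;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;

    public class ExpiringProducer {

        public void send(Session session, MessageProducer producer, String body) throws Exception {
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);   // survive broker restarts
            producer.setTimeToLive(24L * 60 * 60 * 1000);        // expire messages after one day
            TextMessage message = session.createTextMessage(body);
            producer.send(message);
        }
    }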
