Load balancing a Kafka consumer across multiple instances - spring-boot

I have a consumer that reads messages from Kafka and writes them to a time-series database. We have multiple instances of the time-series database running as a cluster on multiple physical machines.
Our plan is to deploy the consumer on Kubernetes so I can scale out to more load-balanced instances if I need to; they all point to the single time-series service that is running.
Now I'm hitting an issue: it occurred to me that if I have 5 instances consuming the same topic, they work individually, meaning they all get the message payload and each one saves it, just like any single instance would.
What we want is:
If one consumer is busy, the message should go to the next free instance, but not be consumed by every running instance. By scaling or load-balancing I mean what a normal load-balanced application does, the way a Spring Boot app behaves when you scale it on Kubernetes.
So is there any way to make it a load-balancing consumer where each message is processed only once, whether the 1st, 2nd or 3rd instance consumes it, like a normal app behind a load balancer?
If anyone has ideas about this: how is it going to behave, and what kind of output are we going to get, if we do this with a Kafka Spring Boot application?
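For what it's worth, this is the behavior Kafka consumer groups provide out of the box: if every instance starts with the same group.id, each partition of the topic is assigned to exactly one instance, so each message is processed by a single consumer. A minimal Spring Kafka sketch, assuming spring-kafka is configured; the topic and group names here are placeholders:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class TimeSeriesConsumer {

    // All replicas declare the same groupId. Kafka then assigns each
    // partition of the topic to exactly one replica, so a given message
    // is consumed by one instance only -- adding pods rebalances the
    // partitions across them automatically.
    @KafkaListener(topics = "timeseries-events", groupId = "timeseries-writer")
    public void onMessage(String payload) {
        // write the payload to the time-series database service
        System.out.println("Processing: " + payload);
    }
}
```

Note that the unit of parallelism is the partition: with 5 instances you need at least 5 partitions on the topic, otherwise the extra instances sit idle.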

Related

Springboot AWS SQS application running on 2 servers

We are going to work on a Spring Boot application that will be deployed on two ECS containers to support a clustered environment. The application accepts requests and drops messages onto SQS. Another flow in the application picks messages from the queue and processes them. Since the same application will be running on two different servers in the cluster, I am not sure which server will pick a message from the queue. How can I make sure that only one server picks up each message? It could be either server.
Ordinary SQS queues do not even guarantee that a message only appears once on the queue - see the AWS Standard SQS Queue docs.
Using a reasonable value for the visibility timeout (the period during which a message can't be seen by other consumers) relative to the time it takes to consume a message should solve it (sketched below).
Alternatively, you can use an SQS FIFO queue, but it's much slower and can, in my experience, get stuck on a corrupt message.
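As a rough illustration of the visibility-timeout approach with the AWS SDK for Java v2 (the queue URL and timeout values are placeholders):

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.DeleteMessageRequest;
import software.amazon.awssdk.services.sqs.model.Message;
import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest;

public class QueueWorker {
    public static void main(String[] args) {
        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"; // placeholder
        SqsClient sqs = SqsClient.create();

        ReceiveMessageRequest receive = ReceiveMessageRequest.builder()
                .queueUrl(queueUrl)
                .maxNumberOfMessages(1)
                // While this worker holds the message, other servers polling
                // the same queue will not see it. Pick a value comfortably
                // larger than the worst-case processing time.
                .visibilityTimeout(120)
                .build();

        for (Message msg : sqs.receiveMessage(receive).messages()) {
            process(msg.body());
            // Deleting the message after successful processing is what
            // prevents redelivery once the visibility timeout expires.
            sqs.deleteMessage(DeleteMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .receiptHandle(msg.receiptHandle())
                    .build());
        }
    }

    private static void process(String body) { /* handle the message */ }
}
```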

Scaling a microservice with frontend and backend instances

I am developing a series of microservices using Spring Boot and plan to deploy them on Kubernetes.
Some of the microservices are composed of an API which writes messages to a Kafka queue and a listener which listens to the queue and performs the relevant actions (e.g. write to DB etc., construct messages for onward processing).
These services work fine locally but I am planning to run multiple instances of the microservice on Kubernetes. I'm thinking of the following options:
Run multiple instances as is (i.e. each microservice serves as an API and a listener).
Introduce FRONTEND and BACKEND environment variables. If the FRONTEND variable is true, do not configure the listener process. If the BACKEND variable is true, configure the listener process.
This way I can scale however many frontend/backend services I need and also have the benefit of shutting down the backend services without losing requests.
Any pointers, best practice or any other options would be much appreciated.
You can do as you describe, with environment variables, or you may also be interested in building your app with different profiles/bean configurations and making two different images (sketched below).
In both cases, you should use two different Kubernetes Deployments so you can scale and configure them independently.
You may also be interested in a Leader Election pattern, where you keep only one active replica; that only makes sense if a single replica must process the events from a queue. Depending on your availability requirements, this can also be solved by simply running a single replica.
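If you go the property/environment-variable route, a minimal sketch of gating the listener bean; the property name, topic and group are made up for illustration:

```java
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

// Created only when the pod sets app.role=backend (e.g. via an environment
// variable in that Deployment). Frontend pods never register the listener
// container, so they only serve the API.
@Component
@ConditionalOnProperty(name = "app.role", havingValue = "backend")
public class BackendListener {

    @KafkaListener(topics = "work-items", groupId = "backend-workers")
    public void handle(String message) {
        // write to DB, construct messages for onward processing, etc.
    }
}
```

Each Kubernetes Deployment then sets app.role differently, which keeps one image but gives you two independently scalable Deployments.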

Understanding the MajorDomo Pattern from NetMQ ZeroMQ

I am trying to understand how best to implement the MDP example in C# to be used in a Windows service in a multiple-client, single-server environment.
I have read the docs but I am still unclear on the following:
Should all Worker instances be created on startup and left to run?
Should the Workers all be different types of services or just different instances of the same service?
Can I have one Windows service which contains the Broker and Workers, or is it best to split them out into their own services?
The example code I am using is the MajorDomo Pattern taken from here https://github.com/NetMQ/Samples
Yes, all workers in an MDP environment should be created independently of the requests, since the broker should not know how to create them.
Each worker handles a given "service" (contract). Obviously each contract should have at least one worker.
If you need parallelized handling of requests, and a given worker can only do one at a time, having extra workers for that service could make sense. Generally, though, you would do this when multiple machines are involved (horizontal scaling).
You can have the broker and workers in the same process. However, if you want to update only a worker, taking down the broker at the same time can be annoying for the clients. I would recommend letting the broker be its own process, with the workers in one or more other processes.

Azure Cloud Service - autoscale by queue count includes invisible messages (not ready for processing)

We just developed a system that integrates an Azure queue with an Azure Cloud Service to process batch items. One requirement we had was to have items scheduled to process in the future. So for example, we batch it now, but tell it not to start for 5 hours.
This is built right into Azure queues' AddMessage using initialVisibilityDelay, so we did not see this as being an issue. However, we just noticed that when we add auto scale on our Cloud Service, it goes off the total number of items in the queue. In our situation we added 100,000 queue items to be sent 5 days from now; however, it is scaling as if those 100,000 were ready to go right now.
So in our situation, we would basically have dozens of instances of our app running until these messages even become visible, 5 days from now.
I feel like there is something simple we are missing here.
Any feedback would be very helpful.
Anthony
Have you considered using one queue for the waiting messages and another queue for the actual messages to be processed, and scaling on that latter queue?
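One way to wire that up (a sketch with the azure-storage-queue Java SDK; the queue names and poll cadence are assumptions, and method details may vary by SDK version): keep the delayed messages invisible in a waiting queue, and run a small relay that drains whatever has become visible into the work queue your autoscale rule watches.

```java
import com.azure.storage.queue.QueueClient;
import com.azure.storage.queue.QueueClientBuilder;
import com.azure.storage.queue.models.QueueMessageItem;

public class QueueRelay {
    public static void main(String[] args) throws InterruptedException {
        String conn = System.getenv("STORAGE_CONNECTION_STRING");
        QueueClient waiting = new QueueClientBuilder()
                .connectionString(conn).queueName("batch-waiting").buildClient();
        QueueClient work = new QueueClientBuilder()
                .connectionString(conn).queueName("batch-work").buildClient();

        while (true) {
            // Messages still under their initialVisibilityDelay are not
            // returned here, so only items that are actually due move over.
            for (QueueMessageItem item : waiting.receiveMessages(32)) {
                work.sendMessage(item.getBody().toString());
                waiting.deleteMessage(item.getMessageId(), item.getPopReceipt());
            }
            Thread.sleep(30_000); // poll cadence is arbitrary
        }
    }
}
```

The autoscale rule then counts only batch-work, so instances come up when messages are actually due.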

Multiple WebSphere Application Servers attached to a single WebSphere MQ queue failing

Issue:
Having multiple consumer applications' activation specifications attached to a single queue on distributed VM servers is causing a null payload in an MQ message.
Note: see the notes at the bottom. There is no issue with MQ itself.
Details:
I have 3 WebSphere applications deployed across 2 VM servers. 1 application is a publisher and the other 2 applications are consumers attached to a single queue manager and queue.
The 2 consumer applications are pulling off the messages and processing them. The consumer application on the separate server receives a null payload. I have confirmed that it seems to be an issue with having multiple application server instances attached to MQ: when I deploy the publisher on server 2 with consumer 2, then consumer 1 fails instead.
Question:
Has anyone tried attaching multiple MDB applications, deployed on separate server instances, bound to one queue manager and one queue?
Specifications:
WebSphere 7, EJB 3.0 MDBs, transactions turned off, queue on a queue manager installed on another machine.
Goal:
Distributed computing, scaling up against a large number of messages.
I'm thinking this is a configuration issue but I'm not 100% sure where to look. I had read that you could use MQ Link, but I don't see why I would need to use service bus integration.
Supporting documentation:
MQ Link
UPDATE: I fixed the problem. It was related to a combination of a class loader issue and duplicate classes. See the solution notes I added below.
EDIT HISTORY:
- Clarified specifications, clarified the question and added the overall goal.
- Added reference notes to the solution.
Has anyone tried attaching multiple MDB applications deployed on separate server instances bound to one local queue?
Multiple MDB applications deployed on separate servers, connecting to one queue manager (but different queues), is a normal scenario; we have it everywhere and all applications work fine.
But I suspect what you are doing is: multiple MDB applications deployed on separate servers, connecting to one queue manager and listening to the same queue.
In this case one message will be received by one consumer only.
You need to create a separate queue for each application and create a subscription for each to the topic being published by your publisher.
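To make the one-message-one-consumer point concrete, here is a bare EJB 3.0 MDB sketch (the destination name is a placeholder, and in WebSphere the activation specification is normally bound in the admin console rather than hard-coded). Deploying this same MDB on both servers against one queue splits the messages between them; giving each application its own queue, fed by a topic subscription, delivers every message to both.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Point-to-point semantics: however many copies of this MDB are deployed
// across the cluster, each message on APP1.QUEUE is delivered to exactly
// one of them. For publish/subscribe fan-out, give each application its
// own queue and subscribe those queues to the published topic.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination",
                              propertyValue = "APP1.QUEUE")
})
public class App1Consumer implements MessageListener {
    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                process(((TextMessage) message).getText());
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }

    private void process(String body) { /* handle the payload */ }
}
```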
Addition:
I suspect the load-balancing problem you may be facing is this: when your first application gets the message, it doesn't issue a commit. So there is an uncommitted message in the queue, which may be stopping your other application from getting a message from the queue. When your first application finishes its processing it issues a commit, but then it is immediately ready to pick up the next message and issues another get.
In my architecture, we have implemented load balancing using multiple queue managers, like below:
1. Create 3 queue managers, say GatewayQM, App1QM and App2QM.
2. Keep the three queue managers in the same cluster.
3. Create an alias queue (shared in the cluster) in GatewayQM and have your putting application put messages on the gateway queue.
4. Create one local cluster queue in each of App1QM and App2QM. Read from these queues via your applications App1 and App2 respectively.
This implementation gives you better security and serves as a proper load balancer.
This specific problem was caused by a code issue combined with class loading being set to "Parent First" in the WebSphere console. It would work on one node while the other nodes in the cluster failed; I think that was caused by the "Parent First" setting.
More importantly, in terms of my configuration: binding multiple activation specifications in a cluster to a single queue to provide distributed computing is a correct solution.
However, "points" do go to nitgeek's solution referenced above if you are looking for an extremely high-volume solution. It's important to understand that a single queue can have a very high depth and it takes a lot to fully utilize one. My current configuration is a good starting point for quick configuration and distributed processing using multiple MDBs.
