Scaling a microservice with frontend and backend instances - spring-boot

I am developing a series of microservices using Spring Boot and plan to deploy them on Kubernetes.
Some of the microservices are composed of an API which writes messages to a Kafka queue and a listener which listens to the queue and performs the relevant actions (e.g. writing to the DB, constructing messages for onward processing).
These services work fine locally but I am planning to run multiple instances of the microservice on Kubernetes. I'm thinking of the following options:
Run multiple instances as is (i.e. each microservice serves as an API and a listener).
Introduce a FRONTEND, BACKEND environment variable. If the FRONTEND variable is true, do not configure the listener process. If the BACKEND variable is true, configure the listener process.
This way I can scale how many frontend / backend services I need and also have the benefit of shutting down the backend services without losing requests.
Any pointers, best practice or any other options would be much appreciated.

You can do as you describe, with environment variables, or you may also be interested in building your app with different profiles/bean configurations and making two different images.
In both cases, you should use two different Kubernetes Deployments so you can scale and configure them independently.
You may also be interested in the Leader Election pattern, which keeps only one replica active at a time; it applies when it only makes sense for a single replica to process the events from a queue. Depending on your availability requirements, this can also be solved by simply running a single replica.
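A minimal sketch of the environment-variable option, written in plain Java so it runs without Spring on the classpath. The `ServiceRole` class and the rule that an instance with neither flag set runs both roles are assumptions made for illustration; they are not stated in the question. In a Spring Boot app the same decision could instead be expressed with `@Profile` or `@ConditionalOnProperty` on the listener configuration.

```java
import java.util.Map;

// Hypothetical helper: decides which parts of the service to start, based on
// the FRONTEND / BACKEND variables described in the question.
final class ServiceRole {
    public final boolean apiEnabled;
    public final boolean listenerEnabled;

    private ServiceRole(boolean apiEnabled, boolean listenerEnabled) {
        this.apiEnabled = apiEnabled;
        this.listenerEnabled = listenerEnabled;
    }

    public static ServiceRole fromEnv(Map<String, String> env) {
        boolean frontend = Boolean.parseBoolean(env.getOrDefault("FRONTEND", "false"));
        boolean backend = Boolean.parseBoolean(env.getOrDefault("BACKEND", "false"));
        // Assumption: with neither flag set, run as both API and listener
        // (the "run as is" option from the question).
        if (!frontend && !backend) {
            return new ServiceRole(true, true);
        }
        // FRONTEND=true enables the API only; BACKEND=true enables the listener.
        return new ServiceRole(frontend, backend);
    }

    public static void main(String[] args) {
        ServiceRole role = fromEnv(System.getenv());
        System.out.println("api=" + role.apiEnabled + " listener=" + role.listenerEnabled);
    }
}
```

Each of the two Kubernetes Deployments would then set a different variable in its pod spec, so the same image can serve either role.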

Related

Load balance Kafka consumer multiple instances

I have a consumer that reads and writes messages to a time-series database. We have multiple instances of the time series database running as a cluster on multiple physical machines.
Our plan is to deploy the consumer on Kubernetes so that I can scale out with load balancing if I need more instances; all instances point to a single time-series service that is running.
The issue that has come to mind is that if I have 5 instances consuming the same topic, they work independently: they all receive each message payload and save it, just as a single instance would.
What we want is:
If one consumer is busy, the message should go to the next free instance rather than being delivered to every running instance. By scaling or load balancing I mean the way a normal load-balanced application works, such as a Spring Boot app scaled on Kubernetes.
So is there any way to make the consumers load-balanced so that each message is processed only once, whether it is consumed by the 1st, 2nd or 3rd instance, as with a normal app behind a load balancer?
If anyone has ideas about this: how will it behave, and what kind of output will we get, if we do this with a Kafka Spring Boot application?

Understanding the MajorDomo Pattern from NetMQ ZeroMQ

I am trying to understand how best to implement the MDP example in C# for use in a Windows service in a multiple-client, single-server environment.
I have read the docs but I am still unclear on the following:
Should all Worker instances be created on startup and left to run?
Should the Workers all be different types of services or just different instances of the same service?
Can I have one Windows service which contains the Broker and Workers, or is it best to split them out into their own services?
The example code I am using is the MajorDomo Pattern taken from here https://github.com/NetMQ/Samples
Yes, all workers in an MDP environment should be created independently of the requests, since the broker should not know how to create them.
Each worker handles a given "service" (contract). Obviously each contract should have at least one worker.
If you need parallelized handling of requests, and a given worker can only do one at a time, having extra workers for that service could make sense. Generally, however, you would do this when multiple machines are involved (horizontal scaling).
You can have the broker and workers in the same process. HOWEVER, if you want to update only a worker, taking down the broker at the same time can be annoying for the clients. I would recommend letting the broker be its own process, with the workers in one or more other processes.
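The points above (workers registered independently of requests, one service per worker, extra workers for parallelism) can be illustrated with an in-process stand-in. This sketch is in Java rather than C#, does not use NetMQ or the MDP wire protocol, and `MiniBroker` with its `register` and `dispatch` methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory broker: routes each request to one worker that has
// registered for the requested service name.
final class MiniBroker {
    private final Map<String, Deque<Function<String, String>>> workersByService = new HashMap<>();

    // Workers announce themselves; the broker never creates them.
    public void register(String service, Function<String, String> worker) {
        workersByService.computeIfAbsent(service, k -> new ArrayDeque<>()).addLast(worker);
    }

    // Takes the worker at the head of the queue and puts it back at the tail,
    // so several workers for one service share the requests round-robin.
    public String dispatch(String service, String request) {
        Deque<Function<String, String>> workers = workersByService.get(service);
        if (workers == null || workers.isEmpty()) {
            throw new IllegalStateException("no worker registered for service: " + service);
        }
        Function<String, String> worker = workers.pollFirst();
        try {
            return worker.apply(request);
        } finally {
            workers.addLast(worker);
        }
    }
}
```

In a real deployment the broker and the workers would be separate processes connected by sockets, which is what allows a worker to be updated without taking the broker down.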

Vertx clustering alternative

Does anyone with real-world experience of Vertx cluster managers other than Hazelcast have advice on our requirement below?
For our (real time sensor data) system we have hundreds of verticles in multiple JVMs, but we do not need, or want, the eventbus to span multiple physical servers.
We're running Vertx on multiple servers but our platform is less complex if we don't pool a single eventbus between all of them (we prefer to be explicit about passing messages between servers).
Hazelcast is the wrong cluster manager for us. We don't need its peer discovery between servers, but crucially any release change of Hazelcast means that new clients cannot join a cluster whose existing clients are running the previous version. Bringing up one new verticle compiled with vertx 3.6.3 into an existing cluster is therefore not possible unless we stop the entire cluster and restart it with all the verticles recompiled to 3.6.3. This seriously impacts our development. It's helpful for the verticles to be more plug-and-play; vertx can do that, but Hazelcast can't (due to constant version incompatibilities).
Can anyone recommend a vertx cluster manager that fits our use case?
I've now had time to review each of the alternatives Vertx directly supports as a 'cluster manager' (Hazelcast, Zookeeper, Ignite, Infinispan) and we're proceeding with a Zookeeper architecture for our system, replacing Hazelcast:
Here's the background to our decision:
We started as a fairly typical (if there is such a thing) Vertx development with multiple verticles in a JVM responding to external events (urban sensor data entering our java/vertx feed handlers) published on the eventbus and the data being processed asynchronously in many other vertx verticles, often involving them publishing new derived data as new asynchronous messages.
Quite quickly we wanted to use multiple JVMs, mainly to isolate the feedhandlers from the rest of the code so that if things broke the feedhandlers would keep running (as a failsafe they're persisting the data as well as publishing it). So we added (easily) Vertx clustering so the JVMs on the same machine could communicate and all verticles could publish/subscribe messages in the same system. We used the default cluster manager, Hazelcast, and modified the config so the vertx clustering is limited to the single server (we run multiple versions of the entire platform on different servers and don't want them confusing each other). We have hundreds of verticles in half-a-dozen JVMs.
Our environment (search SmartCambridge vertx) is fairly dynamic with rapid development cycles (e.g. to create a new feedhandler and have it publishing its data on the eventbus) and that means we commonly wish to start up a JVM containing these new verticles and have it join an existing vertx cluster, maybe permanently, maybe just for a while. Vertx/Hazelcast treats joining a (vertx) cluster as a fairly serious operation, i.e. Hazelcast has (I believe) a concept of Hazelcast cluster members and Hazelcast clients, where clients can come and go easily but joining a Hazelcast cluster as a member requires considerable code compatibility between the existing cluster and the new member. Each time we upgraded our Vertx library the Hazelcast library version would change and this made it impossible for a newly compiled vertx verticle to join an existing vertx cluster.
Note we have experimented with having the Vertx eventbus flow between multiple servers, and also extend the eventbus into the browser/javascript, but in both cases have found it simpler/more robust to be explicit about routing messages from server to server and have written verticles specifically for that purpose.
So the new plan (after several years of Vertx development), given our environment of 5 production/development servers but with the vertx eventbus always limited to single servers, is to implement a single Zookeeper cluster across all 5 servers so we get the Zookeeper native resilience goodness, and configure each production server to use a different znode root (the default is 'io.vertx' but this is a simple config option).
This design has an attractively simple minimum build on a single server (i.e. Zookeeper + Vertx), so ad-hoc development on a random machine (e.g. a laptop) is still possible, but we can trivially extend our platform to have multiple servers in a single vertx cluster by setting a common znode root.
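The znode-root idea can be sketched as a small settings builder. This assumes the Vertx Zookeeper cluster manager reads `zookeeperHosts` and `rootPath` keys from its configuration; those key names should be checked against the version in use. The `ZkClusterConfig` class and the `io.vertx.<server>` naming scheme are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for per-server cluster manager settings.
final class ZkClusterConfig {
    // One Zookeeper ensemble for all servers, but a distinct root per server,
    // so each server's eventbus stays isolated from the others.
    public static Map<String, String> forServer(String zookeeperHosts, String serverName) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("zookeeperHosts", zookeeperHosts);     // assumed key name
        config.put("rootPath", "io.vertx." + serverName); // assumed key name, made-up naming scheme
        return config;
    }

    // Giving several servers the same root joins them into a single cluster.
    public static Map<String, String> shared(String zookeeperHosts) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("zookeeperHosts", zookeeperHosts);
        config.put("rootPath", "io.vertx"); // the default root mentioned above
        return config;
    }
}
```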

Best approach to send updates to other microservices which are running (multiple instances) in different data centers

I have 3 different microservices (e.g. A, B and C; these are REST-based Spring Boot services). These 3 services generally run in 3 different data center locations, i.e. there are different instances of each service.
The problem I am trying to solve:
Service A needs to detect updates (by a kind of polling that checks whether there are any updated records) and then send the updated information to services B and C through REST calls. Based on these updates, services B and C do their own processing. After deployment (mostly to the cloud), how does A know which B and C instances are up and running, so that it can send updates to the running instances?
Do we need to keep track of running instances in some DB table and look up the active instances before sending updates from A? Or should we just use some indicator or sequence-number-based approach to detect that there are updates at A which need to be sent out? But in that case, does A know which instances are active? Or do we simply send updates from A and let some router, load balancer or similar component take care of delivering them to the available active instances, without storing and looking up active instances ourselves?
I am not very familiar with networking, or with how production systems behave and communicate in the cloud.
Trying to implement cross-service updates through REST-based synchronization is a bad idea because it does not scale: if you add more microservices that need to be aware of updates made on service A, you would have to modify the existing microservice that emits the change. This introduces risk and additional maintenance cost.
However, you can use message queues to emit events that indicate changes made on a service. This approach eliminates the need to modify any existing microservice (thanks to the pub/sub pattern): you just plug new consumers into the existing update-emitting services in your ecosystem.
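A minimal in-memory sketch of the pub/sub idea: the publisher only knows a topic name, so new consumers can be added without touching it. `InMemoryTopicBus` and the topic name `a.updated` are hypothetical; a real system would put a message broker such as Kafka between the services instead of a shared map.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical stand-in for a message broker, kept in memory for illustration.
final class InMemoryTopicBus {
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    // A consumer (e.g. service B or C) registers interest in a topic.
    public void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // The publisher (e.g. service A) emits an event without knowing who listens.
    // Returns the number of handlers the event was delivered to.
    public int publish(String topic, String event) {
        List<Consumer<String>> handlers = subscribers.getOrDefault(topic, List.of());
        handlers.forEach(handler -> handler.accept(event));
        return handlers.size();
    }
}
```

Adding a fourth service later means one more `subscribe` call on the consumer side; the code that calls `publish` stays unchanged.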

Vert.x cluster Eventbus cross processes

Does anybody have any info, links or pointers on how cross-process EventBus communication occurs? From the documentation I conclude that multiple Vert.x instances (thus separate JVM processes) can be clustered and communicate via the EventBus. However, there is little to no documentation on how to achieve it.
Looking at the docs, I can see that the publish/registerHandler methods take an address as a String, which works within a process, but I cannot wrap my head around how it works across processes and how to register and publish to an address. Does it work over HTTP or TCP? From an API perspective, do I need to pass a port and process signature?
Cross-process communication happens via the EventBus. Multiple vertx instances can be started up and clustered to allow separate instances on the same or other machines to communicate. The low-level clustering is handled by Hazelcast. The configuration is handled by the cluster.xml file in the conf folder of your vertx install. You can learn more about the format of the file by looking at the Hazelcast docs. It is transparent to your handlers and works over TCP.
You can test it by running two or more instances on your local machine, provided they are started with the -cluster flag. Look at the example being run, and the config changes required, in How to use eventbus messaging in vertx?

Resources