Spring RMI load balancing / Scalability

I am looking to implement a web application in which end users are likely to trigger business logic methods that are both CPU-heavy and require a fair amount of memory to run.
My initial thought is to provide these methods as part of a standalone, stateless business service that can run on a separate machine from the web application. This can then be horizontally scaled as much as I need.
As these service methods are synchronous, I am opting to use RMI as opposed to JMS.
My first question is whether the above approach seems viable, or whether my thought process has gone wrong somewhere (this will be the first time I work on something other than a standalone application).
If it is viable, I have been looking at Spring RMI, which seems to do an excellent job of exposing remote services non-intrusively. However, I am unsure how I could use this API to load balance between multiple servers. Are there any ways of doing this using Spring, or do I need a separate API?

JBoss has the ability to provide RMI proxies that are automatically load-balanced: http://docs.jboss.org/jbossas/jboss4guide/r4/html/cluster.chapt.html
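If a full clustering product is more than you need, the same idea can be approximated on the client side. The following is only a minimal sketch using the JDK's dynamic proxies, not something Spring or JBoss provides: PricingService, RoundRobinHandler and balance are hypothetical names, and in a real deployment each target in the list would be a remote stub obtained from one of the servers rather than a local object. It does plain round-robin with no failover or health checking.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical business interface; stands in for the stateless remote service.
interface PricingService {
    String price(String item);
}

// Rotates calls across a fixed list of targets (assumed: one remote stub per server).
final class RoundRobinHandler implements InvocationHandler {
    private final List<?> targets;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinHandler(List<?> targets) {
        if (targets.isEmpty()) {
            throw new IllegalArgumentException("at least one target is required");
        }
        this.targets = List.copyOf(targets);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), targets.size());
        try {
            return method.invoke(targets.get(index), args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the exception raised by the target itself
        }
    }

    // Wraps the targets in a proxy that implements the same interface.
    @SuppressWarnings("unchecked")
    static <T> T balance(Class<T> type, List<? extends T> targets) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, new RoundRobinHandler(targets));
    }
}
```

The web application would then depend only on PricingService and receive the proxy returned by balance(...), so adding a server means adding one more stub to the list. This only works cleanly because the service is stateless, as described in the question.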

Related

Running multiple Quarkus instances on one machine

I have an application separated into various OSGi bundles which run on a single Apache Karaf instance. However, I want to migrate to a microservice framework because
Apache Karaf is pretty tough to set up due to its dependency mechanism and
I want to be able to bring the application to the cloud later (AWS, GCloud, whatever)
I did some research, had a look at various frameworks and concluded that Quarkus might be the right choice due to its container-based approach, the performance and possible cloud integration opportunities.
Now I am struggling with one point and haven't found a solution so far, though I may also be misunderstanding something: my plan is to migrate almost every OSGi bundle of my application into a separate microservice. That way, I would be able to horizontally scale only the services that need it, and I could also update and deploy them separately without having to restart the whole application. Thus, I assume that every service needs to run in a separate Quarkus instance. However, Quarkus does not seem to support this out of the box. Instead, I would need to create a separate configuration for each Quarkus instance.
Is this really the way to go? How can the services discover each other? And is there a way for service A to communicate with service B not only via REST calls, but also by using the classes and methods of service B directly, with service A declaring a dependency on service B?
Thanks a lot for any ideas on this!
I think you are mixing some points between microservices and OSGi-based applications. With microservices you usually have an independent process running each microservice, which can be deployed on the same or other machines. Because of that you can scale as you said and gain the benefits. But the communication model is not in-process. It has to use a different approach, and it's highly recommended that you use a standard integration mechanism: you can use REST, JSON-RPC, SOAP, or queues or topics for event-driven communication. With these mechanisms you invoke the 'other' service's operations as you do in OSGi, but through a different interface: instead of a local invocation you do a remote invocation.
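To make the "same operation, different interface" point concrete, here is a small sketch using only the JDK (java.net.http.HttpClient and the built-in com.sun.net.httpserver.HttpServer) rather than Quarkus. All names (GreetingService, LocalGreetingService, HttpGreetingClient, GreetingEndpoint, the /greet path) are made up for illustration. Service A shares only the interface with service B; the implementation it uses turns each call into an HTTP request.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical contract shared between the services (the interface only, not service B's code).
interface GreetingService {
    String greet(String name);
}

// Inside service B: the plain local implementation.
final class LocalGreetingService implements GreetingService {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Inside service A: same interface, but each call becomes an HTTP request to service B.
final class HttpGreetingClient implements GreetingService {
    private final HttpClient http = HttpClient.newHttpClient();
    private final URI baseUri;

    HttpGreetingClient(URI baseUri) {
        this.baseUri = baseUri;
    }

    @Override
    public String greet(String name) {
        URI target = baseUri.resolve("/greet?name=" + URLEncoder.encode(name, StandardCharsets.UTF_8));
        HttpRequest request = HttpRequest.newBuilder(target).GET().build();
        try {
            return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

// Inside service B: exposes the local implementation over HTTP on an ephemeral loopback port.
final class GreetingEndpoint {
    static HttpServer start(GreetingService service) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), 0);
            server.createContext("/greet", exchange -> {
                // The sketch assumes the query is exactly "name=<url-encoded value>".
                String rawQuery = exchange.getRequestURI().getRawQuery();
                String name = URLDecoder.decode(rawQuery.substring(rawQuery.indexOf('=') + 1), StandardCharsets.UTF_8);
                byte[] body = service.greet(name).getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note what is not shared: service A never loads LocalGreetingService. That is the main difference from an OSGi service reference, and it is also why contract versioning becomes a concern of its own.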
Service discovery is something that you can do with just virtual IPs, accessing other services through a common DNS name and a load balancer, or with Kubernetes DNS if you go for Kubernetes as your platform. You could also use a central configuration service or let each service register itself in a central registry. There are already plenty of different flavours of solutions to tackle this complexity.
More importantly, you will have to be aware of the new complexities you take on, some of which you already have:
Contract versioning and design
Synchronous or asynchronous communication between services
How to deal with security at the boundary of the services (do I even need security in most of my services, or do I just need information about the user's identity?)
Increased maintenance cost and redundant side code for common features (here Quarkus helps you a lot with its extensions, and you also get MicroProfile compatibility)
...
Deciding to go with microservices is not an easy decision and not one that should be taken in a single step. My recommendation is that you analyse your application domain, check whether your design suits microservices (in terms of separation of concerns and model cohesion), and extract small parts of your OSGi platform into microservices one at a time. Otherwise you will mostly be forced to make changes to your service interfaces, which is harder with service-to-service contract dependencies than changing a method and a few invocations.

Thread model for Async API implementation using Spring

I am working on a microservice developed using Spring Boot. I have implemented the following layers:
Controller layer: Invoked when user sends API request
Service layer: Processes the request. Either sends a request to a third-party service or sends a request to the database
Repository layer: Used to interact with the database.
Methods in all of the above layers return a CompletableFuture. I have the following questions related to this setup:
Is it good practice to return a CompletableFuture from all methods across all layers?
Is it always recommended to use the @Async annotation when using CompletableFuture? What happens when I use the default fork-join pool to process the requests?
How can I configure the threads for the above methods? Would it be a good idea to configure a thread pool per layer? What other configurations can I consider here?
Which metrics should I focus on while optimizing performance for this microservice?
If the work your application is doing can be done on the request thread without too much latency, I would recommend doing exactly that. You can always move to an async model if you find that your web server is running out of worker threads.
The @Async annotation basically helps with scheduling. If you can, use it: it keeps the code free of references to the thread pool on which the work will be scheduled. As for which thread actually does your async work, that's really up to you. If you can, use your own pool. That will make sure you can add instrumentation and expose configuration options that you may need once your service is running.
Technically you will have two pools in play. One that Spring will use to consume the result of your future, and another that you will use to do the async work. If I recall correctly, Spring Boot will configure its pool if you don't already have one, and will log a warning if you didn't explicitly configure one. As for your worker threads, start simple. Consider using Spring's ThreadPoolTaskExecutor.
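To illustrate the "use your own pool" advice without pulling in Spring, here is a rough sketch with the plain JDK ThreadPoolExecutor (which Spring's ThreadPoolTaskExecutor wraps). WorkerPoolDemo, newWorkerPool, loadOnPool and the "service-worker-" thread name prefix are made-up names, and the sizes are placeholders to tune, not recommendations.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class WorkerPoolDemo {
    // Fixed-size pool with a bounded queue and named threads, so the async work is
    // easy to spot in thread dumps and to instrument. With the default rejection
    // policy, submissions fail with RejectedExecutionException once the queue is full.
    static ThreadPoolExecutor newWorkerPool(int threads, int queueCapacity) {
        AtomicInteger counter = new AtomicInteger();
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueCapacity),
                runnable -> {
                    Thread t = new Thread(runnable, "service-worker-" + counter.incrementAndGet());
                    t.setDaemon(true);
                    return t;
                });
    }

    // Hypothetical service method: the supplier runs on the given pool instead of
    // ForkJoinPool.commonPool(), which supplyAsync uses when no executor is passed.
    static CompletableFuture<String> loadOnPool(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), pool);
    }
}
```

The future returned by loadOnPool completes with the name of the thread that did the work, which is a quick way to check that calls are landing on the pool you configured and not on the common fork-join pool.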
Regarding which metrics to monitor, start by choosing how you will monitor. Using something like Spring Cloud Sleuth coupled with Spring Boot Actuator will give you a lot of information out of the box. There are a lot of services that can collect all the metrics Actuator generates into time-series databases, which you can then use to analyze performance and get some ideas on what to tweak.
One final recommendation: Spring WebFlux is designed from the start to be async. It has a learning curve for sure, since reactive code is very different from the usual MVC stuff. However, that framework already addresses all the questions you are asking, so it might be better suited for your application, especially if you want to make everything async by default.

How can you scale a Spring Boot application?

I understand that Spring Boot has a built-in Tomcat server (or Jetty) which facilitates rapid development. But what do you do when you need to scale out your application because traffic has increased?
As pointed out in the comments, there is no silver bullet here. It depends on your infrastructure, and there are several tools out there to help you; you only need to choose what works best for you.
For load balancing you can either choose something like Nginx or leave it to Spring Cloud, which also has a lot of other handy features for scaling/clustering.
Scaling shouldn't be very hard because Spring Boot runs on its own embedded server.
Some tools that help with scaling/clustering:
Spring Boot app:
If you are going to scale, your app has to be near-stateless (e.g. you cannot have a scheduled task or something like that, because when you scale to x instances it is executed x times).
You can use the Spring Cloud project for extra features like service discovery and other goodies that make scaling easier (e.g. when you spin up a new instance, it can easily get its config from a config server, 'register' itself to ease load balancing between services, have cluster-like behaviour, etc.).
Infrastructure and containers:
Docker is a no-brainer here to handle easy launching of your applications and their replicas, if needed. If you have the resources, you can go further and use Kubernetes, but it all depends on the use case.
Various servers (nodes), in case one of them fails and to easily distribute loads.
Nginx for load balancing is pretty straightforward if you don't already have something done with Spring Cloud.
Database:
You really do NOT want to go with MySQL here because it does not scale as well as your Spring apps. You can choose something like Cassandra or Redis, but that would mean restructuring your data model. Maybe the least painful transition from MySQL to a NoSQL store that can scale is MongoDB (imho: Cassandra performs better).
Logging:
This can be a nightmare, but Spring also has a solution for this. Check out Zipkin and Spring Cloud Sleuth.
Also, there are a lot of resources here that talk about architecture in general and how it is necessary to change your mindset when trying to run distributed services.
Hope this helps.
Update 2021-02-23
Today, Kubernetes is pretty much a de facto standard for scaling. It is preferred because of its rich feature set, which lets you focus your app purely on business domain logic and drop things like Spring Cloud for service discovery. If you can use a managed offering from a public cloud, such as EKS or GKE, you are better off not having to manage the clusters yourself.
It provides autoscaling and built-in health checks. Starting from Spring Boot 2.3, you get many added benefits for running Spring Boot on K8s, such as dedicated health check endpoints for liveness and readiness probes, graceful shutdown, etc.
On the database side, aim for something that is managed and scales easily such as AWS Aurora or similar.
An important thing to mention when managing Spring Boot services at scale is configuration management. A very useful solution that you can use out of the box is Consul. It enables you to hot reload the configuration, which matters when you have 50 services that you would otherwise need to restart just to change one boolean variable. Depending on how big your application is, startup can be costly in terms of time as well as CPU/memory resources.

How to do load balancing in distributed OSGi?

We deploy two service instances on different machines using CXF Distributed OSGi, and we want to give the system a load balancing feature. As far as we know, OSGi doesn't provide any load balancing feature. Does anyone know how to do it?
Load balancing is a concern that is meant to be implemented by the Topology Manager (TM). It would be useful to read the Remote Services Admin specification, which addresses exactly this kind of question.
As far as I know, the CXF implementation of Remote Services only implements a single TM, which is "promiscuous", i.e. it publishes every available service in every listening framework. It is possible however to write your own TM to perform load balancing and failover etc.
The Remote Services spec is written in such a way that a TM implementation can be developed completely independently of any specific Remote Services implementation.
You should be able to get the complete list of services using a ServiceTracker. So a nice way to create a load balancer would be to write a proxy service yourself that does the load balancing, and publish it locally as a service with a special property. That way your business application can use the service without knowing anything about the details of load balancing.

Use EIP and integration solutions to distribute layers on cloud?

I want to adopt an EIP (Enterprise Integration Patterns) solution for the cloud deployment of a web application:
The application will be developed in such an approach that each layer (e.g. data, service, web) will come out as a separate module and artifact.
Each layer can be deployed on a different virtual resource in the cloud. In this regard, web nodes will somehow find the related service nodes, and likewise service nodes are connected to data nodes.
Objects in the service layer provide REST access to the services in the application. Web layer is supposed to use REST services from service layer to complete requests for users of the application.
To meet the above requirement of delivering a "highly scalable" application in the cloud, it seems that solutions such as Apache Camel, Spring Integration, and Mule ESB are significant options.
There seem to be other discussions on this topic, such as a question and a blog post, but I was wondering if anybody has had specific experience with such a deployment scheme on "the cloud"? I'd be thankful for any ideas and shared experiences. TIA.
To me this looks a bit like overengineering. Is there a real reason that you need to separate all those layers? What you describe looks a lot like the J2EE applications from some years ago.
How about deploying all layers of the application onto each node and just using simple Java calls or OSGi services to communicate?
This approach has several advantages:
Less complexity
No serialization or DTOs
Transactions are easy / no distributed transactions necessary
Load balancing and failover are much easier as you can do them on the web layer only
Performance is probably a lot higher
You can implement such an application using Spring or Blueprint (on OSGi).
Another option is to use a modern Java EE server. If this is interesting to you, take a look at some of the courses by Adam Bien. He shows how to use Java EE in a really lean way.
For communicating between nodes I have had good experiences with Camel and CXF, but you should try to avoid remoting as much as possible.

Resources