Spring + Load balancing/Clustering - spring

I am working on a webapp project and we are considering deploying it on multiple servers.
What solution do you advise for clustering/load-balancing with Spring?
What are the issues to take into account?
For example: How do singletons behave in a cluster of machines? What about session replication? Are there any other issues to take into account?

Here is the list of possible issues (not necessarily Spring-related):
stateful beans - if your beans hold state, such as collections that accumulate entries or counters, decide whether that state should be replicated. E.g. should a counter be local to one JVM or global across the whole cluster? In the latter case, consider Terracotta or Hazelcast.
filesystem - as long as all instances use the same database, everything is fine. But if one node writes a file to its local disk, the other instances can't read it. Solutions? Either use the database for all storage or use a distributed file system.
HTTP sessions - either use sticky sessions or replicate sessions. If you go for replication, keep sessions as small as possible.
asynchronous jobs - if you have a job running every hour, should it run on every machine, on just one dedicated machine, or on one picked at random?
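For the asynchronous jobs point, one way to get "runs on exactly one machine" is a time-limited lease kept in a store that every node can reach. Below is a minimal sketch in plain Java. The JobLease class, the job name and the node ids are hypothetical, and the in-memory map is only a stand-in for a shared database table or similar store.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Sketch: only the node that holds an unexpired lease may run a job.
 * The map below is an in-memory stand-in (assumption) for a shared store.
 */
class JobLease {
    // Hypothetical record of which node holds a lease and until when.
    private static final class Holder {
        final String nodeId;
        final long expiresAtMillis;

        Holder(String nodeId, long expiresAtMillis) {
            this.nodeId = nodeId;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentMap<String, Holder> leases = new ConcurrentHashMap<>();

    /** Returns true if nodeId obtained the lease for jobName at time nowMillis. */
    boolean tryAcquire(String jobName, String nodeId, long nowMillis, long ttlMillis) {
        Holder candidate = new Holder(nodeId, nowMillis + ttlMillis);
        // compute() runs atomically per key: take the lease only if it is free or expired.
        Holder winner = leases.compute(jobName, (key, current) ->
                (current == null || current.expiresAtMillis <= nowMillis) ? candidate : current);
        return winner == candidate;
    }

    public static void main(String[] args) {
        JobLease lease = new JobLease();
        long now = System.currentTimeMillis();
        // node-a asks first and gets the lease.
        System.out.println(lease.tryAcquire("hourly-report", "node-a", now, 60_000));
        // node-b asks while the lease is still valid and is refused.
        System.out.println(lease.tryAcquire("hourly-report", "node-b", now + 1_000, 60_000));
        // node-b asks again after the lease expired and takes over.
        System.out.println(lease.tryAcquire("hourly-report", "node-b", now + 61_000, 60_000));
    }
}
```

The expiry is what lets another node take over if the lease holder dies. With a real shared store, the same check-and-set would have to be atomic there too, for example as a conditional update.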

Related

Session Persistence Hazelcast client initialization when server is offline

We are trying to replicate the WebSphere Traditional (5/6/7/8/9) behaviour for session persistence of servlets and HTTP, but with Hazelcast and Tomcat. Let me explain...
WebSphere, even when configured as a client to a replication domain, keeps a local register of session data. This local register works fine even if the server processes that should keep the replicated data are shut down from the very first moment. That is, you start the client, and session persistence works within the servlet container. Obviously, you cannot expect to recover your session in another servlet container if the first one crashes, but your applications work anyway.
On the other hand, the Hazelcast client on Tomcat containers expects the Hazelcast server (at least one member of the cluster) to be up and running in order to initialize. If no cluster member is available, initialization fails, and the web applications in the Tomcat servlet container do not start properly. They won't answer any request.
Furthermore, once initialization fails, the only way to recover is to shut down and restart the Tomcat web containers (once a Hazelcast cluster member is online).
This behaviour is a bit harsh on system administrators: no one can guarantee that a backup service such as distributed session persistence is online all the time. That means that launching a Tomcat client becomes a risky task, with a single point of failure by design, which is undesirable.
Now, maybe I overlooked something, maybe I got something wrong. So, has anyone ever managed to start a Hazelcast client without servers, and how? For us, the difference is decisive: if we cannot make the web container start with the Hazelcast server offline, then we must keep going with WebSphere.
We have been trying it on CentOS 7.5 on VirtualBox 5.2.22, and our Tomcat version is 8.5. The Hazelcast client and server are 3.11.1/2.
<group>
<name>Integracion</name>
<password></password>
</group>
<network>
<cluster-members>
<address>hazelcastsrv1</address>
<address>hazelcastsrv2</address>
</cluster-members>
</network>
Sadly, we expect exactly what we get: our reading of the Hazelcast manual suggests that offline servers won't allow Tomcat to serve applications. But we cannot believe what we read, because it makes the library unsafe in a distributed context. We expect to be wrong, and hope that there is good news around the corner.
Hazelcast is not "a single point of failure by design". The design is to avoid a single point of failure. Data is mirrored across the nodes by default.
It's a data grid, you run as many nodes as capacity and resilience requires, and they cluster together.
If you need 3 nodes to be up for successful operations, and also anticipate that 1 might go down, then you need to run 4 in total. Should that 1 failure happen, you have a cluster surviving that is big enough.
Power-on/power-off order is not relevant in Hazelcast, as long as you give the remaining nodes enough time during power-off to let repartitioning complete. For example, in a 4-node cluster, if you take out 1 node and give the other 3 room to complete repartitioning, then you don't lose the data. If you take out 2 nodes together, the cluster loses the data that was owned by one of those nodes and backed up on the other.
For starting up, the sequence is not relevant, as each node owns a certain set of partitions determined by consistent hashing. This ownership keeps changing as nodes leave or join a running cluster.
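To make the 4-node example concrete, here is a toy model in plain Java with one backup copy per partition. The assignment rule (owner = partition modulo node count, backup on the next node) is an assumption made purely for illustration and is not how Hazelcast actually lays out its partition table. BackupLossModel and lostPartitions are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Toy model: each partition has one owner and one backup on a different node.
 * The placement rule is an illustrative assumption, not Hazelcast's algorithm.
 */
class BackupLossModel {
    /** Counts partitions whose owner and backup were both removed at the same time. */
    static int lostPartitions(int partitionCount, int nodeCount, Set<Integer> removedNodes) {
        int lost = 0;
        for (int p = 0; p < partitionCount; p++) {
            int owner = p % nodeCount;
            int backup = (owner + 1) % nodeCount;
            if (removedNodes.contains(owner) && removedNodes.contains(backup)) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        int partitions = 271; // Hazelcast's default partition count
        int nodes = 4;
        Set<Integer> oneNode = new HashSet<>(Arrays.asList(0));
        Set<Integer> twoNodes = new HashSet<>(Arrays.asList(0, 1));
        // Removing a single node never removes both copies of a partition.
        System.out.println("lost after removing 1 node:  " + lostPartitions(partitions, nodes, oneNode));
        // Removing two nodes at once loses partitions that had both copies on them.
        System.out.println("lost after removing 2 nodes: " + lostPartitions(partitions, nodes, twoNodes));
    }
}
```

The model ignores repartitioning: it only counts what is gone if the nodes disappear at the same instant, which is the case the answer warns about.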

Shared redis instance by multiple instances of same application

We have been asked to implement caching in an application using redis. The application should have a logic to clear cache on startup and initialize it.
However, the redis instance can be shared by multiple instances of the application.
e.g. application X has two instances X0 and X1 sharing the same redis instance.
Problem:
With multiple instances, it is possible that one instance is trying to initialize the cache while another instance is clearing it.
Two questions:
1) How do we make sure that while the cache is being initialized, another instance does not clear it?
One way to solve this problem is to maintain a flag in Redis that records whether the cache is being cleared or initialized. If the cache is being initialized, do not clear or re-initialize it.
2) Is it good practice to have shared redis instance by multiple application instances?
In general it's not a good idea to share Redis. If you only have a limited number of application instances, you are better off running a separate Redis process for each. Redis is lightweight, so multiple processes listening on different ports on the same server work well in practice.
If you cannot run multiple processes, you can use one database per instance. Redis allows 16 databases by default, and you can flush each one independently (FLUSHDB). Just remember that multiple databases are discouraged in Redis, and Redis Cluster does not support them.
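The flag idea from question 1 can be sketched as an atomic "claim if absent" step, which in Redis would correspond to a command such as SET with the NX option. The sketch below is plain Java with an in-memory map standing in for the shared Redis keyspace. CacheInitGuard, the key name cache:init-owner and the seed data are all hypothetical, and no Redis client library is used.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Sketch: only the first instance to claim the flag clears and loads the cache.
 * The map is an in-memory stand-in (assumption) for the shared Redis instance.
 */
class CacheInitGuard {
    static final String FLAG_KEY = "cache:init-owner"; // hypothetical key name

    private final ConcurrentMap<String, String> shared = new ConcurrentHashMap<>();

    /** Clears and reloads the cache only if no other instance has claimed the flag. */
    boolean initializeIfFirst(String instanceId, Map<String, String> seedData) {
        // putIfAbsent is atomic: exactly one caller sees null and becomes the initializer.
        if (shared.putIfAbsent(FLAG_KEY, instanceId) != null) {
            return false; // another instance is initializing or has already done so
        }
        // Drop stale entries but keep the flag, then load the fresh data.
        shared.keySet().removeIf(key -> !key.equals(FLAG_KEY));
        shared.putAll(seedData);
        return true;
    }

    String get(String key) {
        return shared.get(key);
    }

    public static void main(String[] args) {
        CacheInitGuard guard = new CacheInitGuard();
        // X0 starts first and initializes the cache.
        System.out.println(guard.initializeIfFirst("X0", Collections.singletonMap("user:1", "alice")));
        // X1 starts later, sees the flag and leaves the cache alone.
        System.out.println(guard.initializeIfFirst("X1", Collections.singletonMap("user:1", "bob")));
        System.out.println(guard.get("user:1"));
    }
}
```

As written, the flag never goes away, so a later full restart would not re-initialize anything. A real flag would probably need an expiry or an explicit release, and that policy is left open here.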

Alternative to session replication / Tomcat clustering

We have 3 tomcats with the same web app, using the same DB.
We want to use non-sticky sessions.
This means we will have to share (replicate) the sessions between the Tomcats (a cluster?).
We don't like the idea of the DeltaManager, since it is all-to-all replication with a performance cost.
However, we don't really like the BackupManager either (still multiple copies).
My question is:
Is it possible to define a single Tomcat that will be a "session manager", so that all the other Tomcats will not keep sessions themselves?
This way no broadcasting of sessions is needed...
My reading of the Tomcat docs finds:
... when using the delta manager it will replicate to all nodes, even
nodes that don't have the application deployed.
exactly as you say, but then says:
To get around this problem, you'll want to use the BackupManager. This
manager only replicates the session data to one backup node
You seem to object to "multiple copies", but this doesn't seem very different from your proposal: so far as I can see, the BackupManager is acting as a session manager.
When you don't have sticky sessions you are pretty much guaranteeing that 2 of every 3 requests will need to get a copy of the session data from somewhere else, with only 3 tomcats how much performance cost would all-to-all replication impose?
I suspect that tuning your session sizes is more important. Large sessions tend to be a problem for any sort of replication.

Pros & Cons of Session Replication

Do I really need Session Replication?
I am working on a number of web projects for a firm. Most of the projects consist of one or two pages of input followed by a save to a MySQL database. Very basic projects. My SAs are pushing to get session replication working in JBoss, but I don't really see any need for it and all of its overhead.
We need load balancing and clustering, so that if the server does go down we can move new requests to the backup service, but I am not too keen on session replication.
These are very low-volume projects. In my eyes, what are the odds of a user being in the middle of those one or two pages just as the server goes down?
I need to convince the SAs that session replication is an unnecessary complication in this instance. I am looking for pros and cons of session replication so that I can better structure my argument.
Well, the "pro" is that you have session failover, either in deliberate cluster member restarting or in inadvertent cluster-member failure. That's it.
Some of the "cons" are:
Session objects and their included objects have to be Serializable
You have to choose Session persistence or replication and manage their configurations and/or datastore
You have to think about Session persistence/replication policies (e.g. every write, request end, time scheduled) and still risk losing the session or losing the most current state of it if a failure occurs before recent changes have been stored/replicated
Non-zero performance impact of replicating or persisting, which grows with how robust the replication policy is. (That is, the more likely it is that every session change gets replicated promptly, the worse the performance.)
We do session replication because we considered failover to be an absolute requirement years ago when we started this, but I think if I had it to do over again I'd suggest we don't bother for the majority of our applications.

What are the drawbacks of session replication on Tomcat?

I was trying to decide which is better for a Tomcat + Apache reverse proxy setup: session replication or sticky sessions. Which is more common in deployments? Are there any drawbacks to session replication?
Thanks
I can point out the following considerations if you go for session replication.
Performance
The main drawback is performance. Replicating sessions involves copying session data to the other servers in the cluster. The more servers you have in the cluster, the more overhead is involved.
Tomcat helps with this overhead by defining two modes for session replication:
DeltaManager (default) and BackupManager
From this URL http://tomcat.apache.org/tomcat-6.0-doc/cluster-howto.html
Using the above configuration will enable all-to-all session replication using the DeltaManager to replicate session deltas. By all-to-all we mean that the session gets replicated to all the other nodes in the cluster. This works great for smaller cluster but we don't recommend it for larger clusters(a lot of tomcat nodes). Also when using the delta manager it will replicate to all nodes, even nodes that don't have the application deployed.
To get around this problem, you'll want to use the BackupManager. This manager only replicates the session data to one backup node, and only to nodes that have the application deployed. Downside of the BackupManager: not quite as battle tested as the delta manager
Read this URL for good design tips for the cluster if enabling session replication.
Memory
How many concurrent users will be hitting the application? The more users, the more data is stored in sessions, and hence the more overhead for session replication.
Code considerations
Additionally you need to ensure the data being put into the session by the application is serializable. Serializing session data has some overhead for replicating the session state. It's a good idea to keep the session size reasonably small, so the developers need to check the amount of data being put into the session.
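A quick way for developers to check both points (is it serializable, and how big is it) is to push a candidate session attribute through standard Java serialization and count the bytes. This is a rough sketch using only the JDK. SessionSizeCheck and serializedSize are hypothetical names, and a container may use a different serialization path, so treat the numbers as an estimate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Sketch: estimate how many bytes an object would add to a replicated session. */
class SessionSizeCheck {
    /** Returns the Java-serialized size in bytes, or -1 if the value cannot be serialized. */
    static int serializedSize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            // NotSerializableException is an IOException, so it ends up here.
            return -1;
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        List<String> cart = new ArrayList<>();
        cart.add("sku-1");
        cart.add("sku-2");
        // A list of strings is serializable, so a positive byte count is printed.
        System.out.println("cart bytes: " + serializedSize(cart));
        // A plain Object does not implement Serializable, so the check reports -1.
        System.out.println("plain Object: " + serializedSize(new Object()));
    }
}
```

Running a check like this in a unit test over the objects the application puts into the session would catch a non-serializable attribute before it reaches a clustered environment.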
Sticky Sessions
Given these considerations, it actually depends on the criticality of the use cases. If you go for sticky sessions alone, then there is a chance of loss of user data during a critical journey.
Do you have a means to recover from that, e.g. by persisting critical data to the database at each step of an order or payment journey? If not, the user has to log in and start again. This is fine for websites that are not transactional, such as brochureware sites or forms that capture data other than payments.

Resources