Shared Redis instance used by multiple instances of the same application - caching

We have been asked to implement caching in an application using Redis. The application should have logic to clear the cache on startup and then initialize it.
However, the Redis instance can be shared by multiple instances of the application.
e.g. application X has two instances X0 and X1 sharing the same Redis instance.
Problem:
With multiple instances, it is possible that one instance is trying to initialize the cache while another instance is clearing it.
Two questions:
1) How do we make sure that while the cache is being initialized, another instance does not clear it?
One way to solve this is to maintain a flag in Redis indicating whether the cache is being cleared or initialized. If the cache is being initialized, do not clear or re-initialize it (rough sketch below).
2) Is it good practice to have a Redis instance shared by multiple application instances?
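For question 1, this is roughly what I have in mind for the flag (using the Java Jedis client purely for illustration; the key name and expiry are made up):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.params.SetParams;

    public class CacheInitGuard {

        // Returns true if this instance won the right to clear and initialize the cache.
        // SET NX EX doubles as the "initialization in progress" flag: only one instance
        // can set it, and the expiry guards against a crashed instance leaving it behind.
        public static boolean tryStartInit(Jedis jedis) {
            String result = jedis.set("app:cache:init-in-progress", "1",
                    SetParams.setParams().nx().ex(120));
            return "OK".equals(result);
        }

        // Called by the winning instance once the cache is fully populated;
        // instances that lost simply skip clearing/initializing.
        public static void finishInit(Jedis jedis) {
            jedis.del("app:cache:init-in-progress");
        }
    }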

In general it's not a good idea to share Redis. If you only have a limited number of application instances, you are better off creating a separate Redis process for each. Redis is lightweight, so running multiple processes on different ports on the same server works well in practice.
If you cannot run multiple processes, you can dedicate one database to each instance. Redis allows 16 databases by default, and you can flush each database independently. Just remember that numbered databases in Redis are discouraged, and they are not supported in Redis Cluster.
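For example, with the Java Jedis client (the database index per application instance is something you would assign yourself, e.g. via configuration; the names here are illustrative):

    import redis.clients.jedis.Jedis;

    public class PerInstanceCache {
        public static void main(String[] args) {
            // each application instance is configured with its own database index (0-15)
            int databaseIndex = Integer.parseInt(System.getenv().getOrDefault("CACHE_DB", "0"));

            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.select(databaseIndex); // switch to this instance's database
                jedis.flushDB();             // clears only this database, not the other instances'
                jedis.set("initialized", "true");
            }
        }
    }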

Related

Can Digital Ocean stateless servers (I have 4 servers running on Digital Ocean) work with the caching policy I implemented in Spring Boot?

I have implemented caching in my Spring Boot REST application. My policy includes a time-based cache eviction strategy and an update-based cache eviction strategy. I am worried that, since the servers are stateless, if a call to update certain data is handled by server instance A, the corresponding caches in server instances B, C and D are not updated as well.
Is this an issue I would face / is there a way to overcome this issue?
This is one of the oldest problems in software development: cache invalidation when you have multiple servers.
One way to handle it is to move the cache out of the individual servers into something shared - a separate instance that holds the cache entries and that every app server refers to, such as Redis [centralized cache] (sketch below).
A second way is to broadcast a message so that each server knows to invalidate the entry once the data has been modified or deleted - here you run the risk of the message not being processed, leaving a stale entry on some server(s).
Another option is to have some sort of write-ahead log [like Kafka or Redis Streams] that each server consumes; they will all process the events deterministically and end up with the same cache state.
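To give a feel for the centralized-cache option, here is a minimal Spring Boot sketch, assuming spring-boot-starter-cache and spring-boot-starter-data-redis are on the classpath and spring.cache.type=redis plus spring.redis.host point at the shared Redis (the service and cache names are made up):

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cache.annotation.CacheEvict;
    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.cache.annotation.EnableCaching;
    import org.springframework.stereotype.Service;

    @SpringBootApplication
    @EnableCaching // Spring's cache abstraction now stores entries in the shared Redis
    public class CachingApplication {
        public static void main(String[] args) {
            SpringApplication.run(CachingApplication.class, args);
        }
    }

    @Service
    class ArticleService {
        // Every server instance reads the same Redis-backed entry
        @Cacheable(cacheNames = "articles", key = "#id")
        public String loadArticle(long id) {
            return "...load from the database...";
        }

        // Evicting here removes the entry for A, B, C and D at once
        @CacheEvict(cacheNames = "articles", key = "#id")
        public void updateArticle(long id, String content) {
            // ...write to the database...
        }
    }

Because there is only one copy of each entry, the per-instance invalidation problem disappears; the trade-off is a network hop on every cache read.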
Let me know if you need more help - we can set up some time outside of SO.

Avoid persisting memory state between requests in FastAPI

Is there a way to deploy a FastAPI application so that memory state cannot be persisted between requests? The goal is to avoid leaking any data between requests in a multi-tenant application.
Starting up the application from scratch for every request does not seem feasible since it takes too long. Is there a way to launch the application once per service instance, but have individual requests handled by workers or threads that are purged after the request is handled, so that any static property, singleton instance and the like is destroyed and the next request is handled with clean memory?
FastAPI is basically stateless by default. It actually takes extra work to persist data across requests through methods such as connection pooling, reading a value from Redis, and so on. If you consider things such as starting up the server, loading a configuration, setting up path redirects, and so on to be "state", then FastAPI will not work for your purposes.
When you say "memory state", it sounds like you are trying to partition instances of the FastAPI server off from each other so that they do not even use the same memory. This is not going to be a viable solution, because most web servers, FastAPI included, are not designed for this type of segregation. By default, the requests from one tenant will not have anything to do with the requests from another tenant unless you write additional code that allows them to become related; separating the concerns of the different tenants is therefore a matter for the programmer, not the server's memory.
Instead, if you absolutely cannot let requests from multiple tenants inhabit the same memory, you would be better off giving different tenants their own subdomain at the DNS level. Spin up a VPS and an instance of your FastAPI program for each of them. That will truly prevent the requests from one tenant from sharing any memory or state with the others.

Microservices: Simultaneous cache updates

I am developing a microservice. It will be deployed in Docker containers and orchestrated by Kubernetes. I have to implement a caching solution using the Hazelcast distributed cache. My requirements are:
Preload the cache on startup of this microservice. For around 3000 stores I have to fetch two specific attributes and cache them.
Every 24 hours refresh the cache.
I implemented a Spring @EventListener that, on startup, makes a database call for the two attributes and does a @CachePut to store them in the cache.
I also have a Spring scheduler with a cron expression to refresh the cache at 6 AM every morning.
So far so good.
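Simplified, the setup looks roughly like this (class and method names changed; assumes @EnableScheduling is configured and the @CachePut logic is omitted):

    import org.springframework.boot.context.event.ApplicationReadyEvent;
    import org.springframework.context.event.EventListener;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    public class StoreCacheWarmer {

        @EventListener(ApplicationReadyEvent.class)
        public void preloadOnStartup() {
            // fetch the two attributes for ~3000 stores and put them into the cache
            loadAllStoresIntoCache();
        }

        @Scheduled(cron = "0 0 6 * * *") // every day at 6 AM
        public void refreshCache() {
            loadAllStoresIntoCache();
        }

        private void loadAllStoresIntoCache() {
            // database call + cache population omitted
        }
    }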
But what I did not realize is that in a clustered environment - with 10-15 instances of my microservice in action - they will all try to do the above two steps almost simultaneously, creating a stampede effect on my database and cache. Does anyone know what to do in this scenario? Is there a good design, or even an average one, that I can follow?
Thanks.
You should look at Hazelcast's Loading and Storing Persistent Data mechanism, which offers two options for writing (write-through and write-behind) and read-through for loading data into the cache.
Look at MapLoader and its methods; it will let you warm up/preload your cluster, and you are free to do that with your own implementation.
Check for more details: https://docs.hazelcast.org/docs/3.11/manual/html-single/index.html#loading-and-storing-persistent-data
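A rough sketch of what such a loader could look like (Hazelcast 3.x API as in the linked docs; the map contents and database calls are placeholders):

    import com.hazelcast.core.MapLoader;

    import java.util.Collection;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    // Read-through loader for the two store attributes, keyed by store id.
    public class StoreAttributeLoader implements MapLoader<String, String> {

        @Override
        public String load(String storeId) {
            // called by Hazelcast on a cache miss for a single key
            return fetchAttributesFromDatabase(storeId);
        }

        @Override
        public Map<String, String> loadAll(Collection<String> storeIds) {
            Map<String, String> result = new HashMap<>();
            for (String id : storeIds) {
                result.put(id, fetchAttributesFromDatabase(id));
            }
            return result;
        }

        @Override
        public Iterable<String> loadAllKeys() {
            // returning the ~3000 store ids here lets the cluster preload the map once,
            // instead of every microservice instance hitting the database on startup
            return fetchAllStoreIdsFromDatabase();
        }

        private String fetchAttributesFromDatabase(String storeId) {
            return "attr1,attr2"; // placeholder for the real database call
        }

        private Collection<String> fetchAllStoreIdsFromDatabase() {
            return Collections.emptyList(); // placeholder for the real database call
        }
    }

You register the loader on the map via MapStoreConfig (programmatically or in hazelcast.xml), and the cluster, rather than each individual service instance, takes care of populating the map.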

What is the recommended way of creating a distributed Lock with Redis on Azure?

I'm looking to create a distributed Lock within Redis on Azure for our multi-instance Worker Role. I need a way of creating "critical sections" for which only a single thread can have access at a time across multiple-instances of the Worker Role.
I am using the StackExchange.Redis client to do this and, helpfully, it already has an implementation of transactional LockTake/LockRelease, and this answer on SO gives me a good idea of the pattern to use and details about how to create a lock.
Reading further around the subject, I also read the Redis article on Distlock, which describes the weaknesses of failover-based Redis setups when trying to implement a distributed lock mechanism.
Azure Redis Cache implements master/slave failover (apart from the Basic tier), so does this mean I will need to implement the Redlock pattern in order to guarantee that only one process will ever hold the lock?
Additionally, I am wondering:
Why do Azure Redis example connection strings not seem to list the master and slave? Has Azure implemented the master/slave failover in a different way?
Why has one .NET implementation of Redlock chosen not to support master/slaves in its usage (see the Usage section, first paragraph)? Is this just a choice, or is it because master/slave is not a valid usage of Redlock (which would not seem to be the case from the Redis article)?
I'm the author of the RedLock.net library that you linked in your question. The reason the documentation specifies connecting to independent redis instances is based on the reasoning in the Redis Distlock documentation. By forcing writes only to master nodes, we hopefully avoid the situation where a user might misconfigure Redlock to connect to multiple replicated hosts.
According to Azure Redis Cache 103 - Failover and Monitoring there is a load balancer in front of an Azure Redis Cache (at the standard tier and above) that ensures that you are always connected to the master.
Connecting to multiple independent Redis instances (whether replicated or not) should give a fairly good guarantee that no two processes end up holding the lock at the same time (more so than a single replicated instance).
In order for another process to 'steal' the lock before the first has finished, more than half of the independent Redis instances would need to lose their lock keys (e.g. by restarting without persistence), and process two would then have to gain the lock before process one's extend timer reacquired it.
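For reference, the single-node primitive that both LockTake/LockRelease and Redlock build on is just SET NX PX plus a token-checked release. A sketch with the Java Jedis client, purely for illustration (this is not the StackExchange.Redis or RedLock.net implementation; key and token handling are simplified):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.params.SetParams;

    import java.util.Collections;
    import java.util.UUID;

    public class SimpleRedisLock {

        // Atomically release the lock only if we still own it (the token matches).
        private static final String RELEASE_SCRIPT =
                "if redis.call('get', KEYS[1]) == ARGV[1] then " +
                "  return redis.call('del', KEYS[1]) " +
                "else return 0 end";

        // Returns the lock token on success, or null if someone else holds the lock.
        public static String tryAcquire(Jedis jedis, String lockKey, long ttlMillis) {
            String token = UUID.randomUUID().toString();
            String result = jedis.set(lockKey, token, SetParams.setParams().nx().px(ttlMillis));
            return "OK".equals(result) ? token : null;
        }

        public static boolean release(Jedis jedis, String lockKey, String token) {
            Object result = jedis.eval(RELEASE_SCRIPT,
                    Collections.singletonList(lockKey),
                    Collections.singletonList(token));
            return Long.valueOf(1L).equals(result);
        }
    }

Redlock runs this same acquire against N independent masters and considers the lock held only if a majority of them succeed within the lock's validity time.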

Spring + Load balancing/Clustering

I am working on a webapp project and we are considering deploying it on multiple servers.
What solution do you advise for clustering/load-balancing with Spring?
What are the issues to take into account?
For example: How do singletons behave in a cluster of machines? What about session replication? Are there any other issues to take into account?
Here is a list of possible issues (not necessarily Spring-related):
stateful beans - if your beans have state, like collections accumulating data or counters, you need to decide whether this state should be replicated or not. E.g. should a counter be local to one JVM or global across the whole cluster? In the latter case, consider Terracotta or Hazelcast (see the sketch after this list)
filesystem - as long as all instances use the same database, everything is fine. But if one node writes to its local disk, other instances can't read the file. Solutions? Either use the database for all storage or a distributed file system
HTTP sessions - either use sticky sessions or replicate sessions. If you go for replication, keep sessions as small as possible.
asynchronous jobs - if you have a job running every hour, should it run on every machine, or just on a dedicated one (or maybe on a random one)?
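To illustrate the cluster-wide counter case, with Hazelcast (3.x API; the counter name is arbitrary):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IAtomicLong;

    public class ClusterCounter {
        public static void main(String[] args) {
            // each node that starts a HazelcastInstance joins the same cluster
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // one logical counter shared by every JVM in the cluster,
            // instead of a plain AtomicLong that is local to a single node
            IAtomicLong visits = hz.getAtomicLong("site-visits");
            long current = visits.incrementAndGet();

            System.out.println("Cluster-wide visit count: " + current);
        }
    }

In Hazelcast 4.x and later the same structure lives under hz.getCPSubsystem().getAtomicLong(...).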
