I have many Spring Boot services that depend on a Redis instance to generate continuous IDs such as 1, 2, 3...
What can I do when Redis is down?
Extra: it's a single Redis instance, not a master-replica setup.
Does Redis persistence keep data from being lost?
You can configure Redis to persist data on disk, in either AOF or RDB format. However, since persistence is asynchronous, you still might lose data. (With AOF you can fsync every write, but then you'll have performance problems.)
In your case, it seems that you use the INCR command to generate IDs. If Redis goes down without dumping all of its data, you'll get duplicate IDs when it restarts.
This problem cannot be solved completely, even with a master-replica setup, since the synchronization between master and replica is also asynchronous.
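For illustration, a minimal sketch of that INCR pattern with Jedis; the key name and the re-seeding helper are assumptions for this example, not something from the question:

```java
import redis.clients.jedis.Jedis;

public class IdGenerator {
    private final Jedis jedis = new Jedis("localhost", 6379);

    // Atomically increments the counter. Redis guarantees uniqueness
    // only as long as the counter value survives restarts.
    public long nextId() {
        return jedis.incr("order:id");
    }

    // One possible mitigation after a crash: bump the counter past any
    // id that might already have been handed out, e.g. re-seed from the
    // max id recorded in your database plus a safety margin (hypothetical).
    public void reseed(long maxIdSeenInDb, long safetyMargin) {
        jedis.set("order:id", String.valueOf(maxIdSeenInDb + safetyMargin));
    }
}
```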
I have a service that runs on many Cloud Run containers.
When a single container (A) receives a web request to do some work, I need all the other live containers to fetch some updated data from Elasticsearch.
I would have expected ES to offer a "listening" type of connection, like Firebase, but that is not possible.
Right now I am having to poll the database from each service.
Is there a better way to achieve this sort of cross-container sync when using Cloud Run? Would Pub/Sub be the best solution here?
It's unusual but not impossible to achieve.
First of all, you have to understand the instance life cycle: CPU is allocated only while a request is being processed; otherwise the CPU is throttled (below 5%). That is also why you pay only while your instance is processing requests, not while the instance is kept warm (and offloaded after a while).
That being said, it is useless and inefficient to update instances in the background when no request is being processed.
Therefore, the idea is to perform the sync when the instance receives a request. The downside is that this increases request latency (the instance first syncs its cache and then processes the request).
Finally, the solution is to store, somewhere central, the date of the latest cache update, and to keep that same piece of information in each instance. When an instance receives a request, it first compares its own cache date with the central date.
If they are the same, there is no problem; continue processing.
If the central date is newer than the instance's date, update the instance's data and then process the request.
You can store the data, and the date of that data, in Firestore for instance, or in Memorystore, or in any other database.
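A minimal sketch of that comparison, assuming the central date lives in a Firestore document meta/cacheVersion with a numeric updatedAt field (all names are illustrative):

```java
import com.google.cloud.firestore.DocumentSnapshot;
import com.google.cloud.firestore.Firestore;
import com.google.cloud.firestore.FirestoreOptions;

public class CacheGuard {
    private final Firestore db = FirestoreOptions.getDefaultInstance().getService();
    private volatile long localVersion = 0L; // date of the data held in this instance

    // Call at the start of each request, before doing any work.
    public void refreshIfStale() throws Exception {
        DocumentSnapshot snap =
                db.collection("meta").document("cacheVersion").get().get();
        Long central = snap.getLong("updatedAt");
        if (central != null && central > localVersion) {
            reloadDataFromElasticsearch(); // hypothetical cache reload
            localVersion = central;
        }
    }

    private void reloadDataFromElasticsearch() { /* fetch fresh data here */ }
}
```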
Pub/Sub can also be a solution, but it is more complex to implement. Each instance, when it starts, has to create a pull subscription on a topic; when the instance is killed, you have to delete that subscription.
Then, when a request comes in, the instance pulls the subscription, gets the messages, if any, and updates its local cache.
This could be faster than the previous solution, but it is harder to implement.
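A rough sketch of the per-request pull with the synchronous Pub/Sub client (project and subscription names are placeholders; a real implementation would reuse the stub and set short RPC timeouts so the request path never blocks for long):

```java
import com.google.cloud.pubsub.v1.stub.GrpcSubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStub;
import com.google.cloud.pubsub.v1.stub.SubscriberStubSettings;
import com.google.pubsub.v1.AcknowledgeRequest;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PullRequest;
import com.google.pubsub.v1.PullResponse;
import com.google.pubsub.v1.ReceivedMessage;

public class PubSubCacheSync {
    // One subscription per instance, e.g. created at startup with a
    // unique suffix; the names here are placeholders.
    private static final String SUBSCRIPTION =
            ProjectSubscriptionName.format("my-project", "cache-updates-instance-42");

    public void pullUpdates() throws Exception {
        SubscriberStubSettings settings = SubscriberStubSettings.newBuilder().build();
        try (SubscriberStub stub = GrpcSubscriberStub.create(settings)) {
            PullResponse response = stub.pullCallable().call(
                    PullRequest.newBuilder()
                            .setSubscription(SUBSCRIPTION)
                            .setMaxMessages(100)
                            .build());
            for (ReceivedMessage msg : response.getReceivedMessagesList()) {
                applyUpdate(msg.getMessage().getData().toStringUtf8());
                stub.acknowledgeCallable().call(AcknowledgeRequest.newBuilder()
                        .setSubscription(SUBSCRIPTION)
                        .addAckIds(msg.getAckId())
                        .build());
            }
        }
    }

    private void applyUpdate(String payload) { /* refresh local cache entry */ }
}
```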
Is it okay to hold large state in RocksDB when using Kafka Streams? We are planning to use RocksDB as an event store holding billions of events for an indefinite amount of time.
Yes, you can store a lot of state there but there are some considerations:
The entire state will also be replicated to the changelog topics, which means your broker will need enough disk space for it. Note that this will NOT be mitigated by KIP-405 (Tiered Storage), as tiered storage does not apply to compacted topics.
As @OneCricketeer mentioned, rebuilding the state can take a long time after a crash. However, you can mitigate this in multiple ways:
Use a persistent store and restart the application on a node with access to the same disk (a StatefulSet + PersistentVolume in K8s works).
With exactly-once semantics, the state will still be rebuilt from scratch after an unclean shutdown until KIP-844 is implemented; once that PR is merged, only a small amount of data will have to be replayed.
Have standby replicas. They enable failover as soon as the consumer session timeout expires after a Kafka Streams instance crashes.
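For reference, a minimal sketch of the relevant Kafka Streams settings (the broker address, path, and values are illustrative, not recommendations):

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsProps {
    static Properties build() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-store-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
        // Keep RocksDB state on a disk that survives restarts
        // (e.g. a PersistentVolume mounted at this path).
        props.put(StreamsConfig.STATE_DIR_CONFIG, "/data/kafka-streams");
        // One warm standby per store, so failover does not wait for a full restore.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        return props;
    }
}
```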
The main limitation is disk space, so yes, it can be done; but if the app crashes for any reason, you might be waiting a while for it to rebuild its state.
I am working on a microservice architecture. One of my services is exposed to a source system, which uses it to post data. This microservice publishes the data to Redis using Redis pub/sub, and the data is then consumed by a couple of other microservices.
If one of those consuming microservices is down and unable to process the data from Redis pub/sub, I have to retry the published data once the microservice comes back up. The source cannot push the data again, and manual intervention is not possible, so I thought of three approaches:
1. Additionally use Redis itself to store the data and retrieve it for retries.
2. Store the data in a database before publishing. I have many source and target microservices using Redis pub/sub, so with this approach I would have to insert every request into the DB first, followed by its response status. It would also require a shared database; this adds several more exception-handling cases and does not look very efficient to me.
3. Use Kafka in place of Redis pub/sub. Traffic is low, which is why I chose Redis pub/sub in the first place, and switching is not feasible now.
In the first two cases I have to use a scheduler, and there is a time window within which I have to retry, otherwise subsequent requests will fail.
Is there any other way to handle these cases?
For point 2:
- Store the data in a DB.
- Create a daemon process that processes the data from the table.
- This daemon process can be configured to suit your needs.
- The daemon polls the DB and publishes any pending data, then deletes each row once it has been published. A sketch of this polling step follows below.
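A minimal sketch of such a daemon's polling step, assuming an outbox table with id and payload columns and a Redis channel named events (all names are illustrative):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import redis.clients.jedis.Jedis;

public class OutboxPoller {
    // Poll the pending-messages table, publish each row, then delete it.
    public void pollOnce(Connection conn, Jedis jedis) throws SQLException {
        try (PreparedStatement select = conn.prepareStatement(
                     "SELECT id, payload FROM outbox ORDER BY id");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                long id = rs.getLong("id");
                jedis.publish("events", rs.getString("payload"));
                // Row is removed only after the publish call succeeded.
                try (PreparedStatement delete = conn.prepareStatement(
                        "DELETE FROM outbox WHERE id = ?")) {
                    delete.setLong(1, id);
                    delete.executeUpdate();
                }
            }
        }
    }
}
```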
I haven't used this in a microservice architecture, but I have seen this approach work efficiently when communicating with third-party services.
At the very outset, as you mentioned, we do indeed seem to have only three possibilities.
This is one of those situations where you want a handshake from the service after pushing and after processing. To accomplish that, a middleware queuing system would be the right choice.
Although a bit more complex to set up, you could use Kafka for streaming this. Configuring the producers and consumer groups properly will help you do the job smoothly.
Using a DB for storage would be overkill, considering that the data needs to be processed rather than persisted long-term.
BUT, alternatively, storing the data in Redis and reading it in a cron/scheduled job would make your job much simpler. Once the job has run successfully, you can remove the data from the cache and free up Redis memory.
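For instance, a minimal sketch of that Redis-backed retry buffer using Jedis; the key and channel names are assumptions, and this naive version assumes the consumer is back up when the scheduled job runs:

```java
import redis.clients.jedis.Jedis;

public class RedisRetryBuffer {
    private final Jedis jedis = new Jedis("localhost", 6379);

    // On publish, also keep a copy in a list so it can be replayed later.
    public void publishWithBackup(String channel, String payload) {
        jedis.rpush("pending:" + channel, payload);
        jedis.publish(channel, payload);
    }

    // Scheduled job: re-publish everything still pending, then drop it
    // from the list, freeing Redis memory once the retry has been sent.
    public void retryPending(String channel) {
        String payload;
        while ((payload = jedis.lpop("pending:" + channel)) != null) {
            jedis.publish(channel, payload);
        }
    }
}
```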
If you can comment further on the architecture and the implementation, I can update my answer accordingly. :)
We are working on an application that has a WebSocket connection to every client. For high availability and load balancing purposes, we would like to scale the receiving microservice. As the WebSocket connection is used to propagate each client's state to every other client, it is important to synchronize a client's current state across all instances of the receiving microservice. It is also important that the state is reset when a client disconnects.
To give you some specs:
We are using docker swarm
It's a NodeJS backend and an Angular 9 frontend
We have looked into multiple ideas, for example:
Redis Cache (The state would not be deleted if the instance fails.)
Queues/Topics (This would mean every instance has to keep track of the current state of all clients.)
WebSockets between instances (This looks promising but is not really scalable.)
What is the best practice to sync the state of a micro service between multiple instances while making sure that there are no inconsistencies? How are you solving this issue? Are we missing something obvious? Any tips and tricks?
We appreciate any suggestions.
This might not be 100% what you want to hear, but generally people advise that all microservices should be stateless.
An overall application, of course, has state, and databases, persistent event streams, or key-value caches (e.g. Redis) are excellent ways of persisting it. Ideally this state is bounded per service, though; otherwise you risk ending up writing a distributed monolith.
It's hard to say in your particular case, but perhaps rethink how state is stored conceptually and make it more explicit: determine what is cache (for performance) and what is genuine state that should be persisted externally (e.g. to Redis and a database) so that many service instances can use it instantly. That keeps the instances truly disposable processes.
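As an illustration of externalizing that state, a minimal sketch of a per-client state store in Redis using Jedis (your backend is NodeJS, so treat this as pseudocode for the pattern; the key scheme and TTL are assumptions). The TTL also addresses the concern that state would linger if an instance dies before cleaning up:

```java
import redis.clients.jedis.Jedis;

public class ClientStateStore {
    private final Jedis jedis = new Jedis("redis-host", 6379);

    // Any instance can read or write a client's state by id.
    public void saveState(String clientId, String stateJson) {
        // TTL guards against stale state if an instance crashes
        // before the disconnect handler runs.
        jedis.setex("client:" + clientId, 300, stateJson);
    }

    public String loadState(String clientId) {
        return jedis.get("client:" + clientId);
    }

    // Call from the WebSocket close handler so state is reset on disconnect.
    public void clearState(String clientId) {
        jedis.del("client:" + clientId);
    }
}
```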
I am currently working with a legacy system that consists of several services which (among others) communicate through some kind of Enterprise Service Bus (ESB) to synchronize data.
I would like to gradually move this system toward a microservices architecture. I am planning to reduce the dependency on the ESB and rely more on a message broker like RabbitMQ or Kafka. Due to some resource and existing-technology limitations, I don't think I will be able to completely avoid data replication between services, even though I should be able to clearly define a single service as the owner of each piece of data.
What I am wondering now, how can I safely do a database backup restore for a single service when necessary? Doing so will cause the service to be out of sync with other services that hold the replicated data. Any experience/suggestion regarding this?
Have your primary database publish events every time a database mutation occurs, and let the replicated services subscribe to this event and apply the same mutation on their replicated data.
You already use a message broker, so you can leverage your existing stack to broadcast the events. With replication done through events, a restore applied to the primary database will automatically be propagated to all other services.
Depending on the size of the backup, there will be a short period during which the data on the other services is stale. This may or may not be acceptable for your use case; think of the staleness as a form of eventual consistency.
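For example, a minimal sketch of publishing mutation events with Kafka (the topic name and keying scheme are assumptions; the same pattern works with RabbitMQ):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class MutationPublisher {
    private final KafkaProducer<String, String> producer;

    public MutationPublisher() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producer = new KafkaProducer<>(props);
    }

    // Call after every successful write to the primary database.
    // Keying by entity id keeps the mutations for one row in order,
    // so subscribers can apply them to their replicated copies.
    public void publishMutation(String entityId, String mutationJson) {
        producer.send(new ProducerRecord<>("customer-mutations", entityId, mutationJson));
    }
}
```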