Default failure/recovery behavior for GemFire Server/Client Architecture - caching

For the GemFire cache, we are using the client/server architecture in 3 different geographic regions with 3 different locators.
Cache Server
Each geographic region has two separate cache servers, potentially one primary and one secondary
The cache servers are connected peer-to-peer
The data-policy on the cache servers is replicate
No region persistence is enabled
Cache Client
No persistence is enabled
No durable queues/subscriptions are set up
What would the default behavior be in the following scenarios?
All cache servers in one geo-region crash: what happens to the data in the cache clients when the cache servers restart? Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
All cache clients in one geo-region crash. Although we don't have durable queues/subscriptions set up, for this scenario let's assume we do. What happens to the data in the cache clients when they restart? Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
All cache servers and cache clients in one geo-region crash: what happens to the data in the cache servers and cache clients when they start up? Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
Thanks in advance!

Ok, so based on how I am interpreting your configuration/setup and your questions, this is how I would answer them currently.
Also note, I am assuming you have NOT configured WAN between your separate clusters residing in different "geographic regions". However, some of the questions would not matter if WAN was configured or not.
Regarding your first bullet...
what happens to the data in the cache clients when the cache servers restart?
Nothing.
If the cache client is also storing data "locally" (e.g. CACHING_PROXY), then the data will remain intact.
A cache client can also have local-only Regions only available to the cache client, i.e. there is no matching (by "name") Region in the server cluster. This is determined by 1 of the "local" ClientRegionShortcuts (e.g. ClientRegionShortcut.LOCAL, which corresponds to DataPolicy.NORMAL). Definitely, nothing happens to the data in these type of client Regions if the servers in the cluster go down.
If your client Regions are PROXIES, then your client is NOT storing any data locally, at least for those Regions that are configured as PROXIES (i.e. ClientRegionShortcut.PROXY, which corresponds to DataPolicy.EMPTY).
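For reference, here is a minimal sketch of how those three client Region types might be declared with the Java client API (package names assume a recent GemFire/Apache Geode release; the locator host/port and Region names are placeholders):

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class ClientRegionSetup {

    public static void main(String[] args) {
        // Connect to the cluster through a locator (host/port are placeholders).
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator-host", 10334)
                .create();

        // PROXY: no local storage; every operation goes to the servers.
        Region<String, String> proxyRegion = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("Orders");

        // CACHING_PROXY: operations go to the servers, but results are also kept locally.
        Region<String, String> cachingProxyRegion = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                .create("Customers");

        // LOCAL: client-only Region; no matching Region is required on the servers.
        Region<String, String> localRegion = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.LOCAL)
                .create("SessionScratchpad");
    }
}
```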
So...
Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
See above, but essentially, your "PROXY" based client Regions will no longer be able to "communicate" with the server.
For PROXY, all Region operations (gets, puts, etc) will fail, with an Exception of some kind.
For CACHING_PROXY, a Region.get should succeed if the data is available locally. However, if the data is not available locally, the client Region will send the request to the server Region, which of course will fail. If you are performing a Region.put, then that will fail since the data cannot be sent to the server.
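A rough sketch of what that looks like from the client's side while the servers are down; the exact exception can vary by version, but it will be a ServerConnectivityException or a subclass such as NoAvailableServersException:

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ServerConnectivityException;

public class ReadWithServersDown {

    // Attempt a read while the servers are down; behavior depends on the Region type.
    static String readOrders(Region<String, String> ordersRegion, String key) {
        try {
            // CACHING_PROXY: returns the locally cached value if present.
            // PROXY (or a local miss): the request goes to the servers and fails.
            return ordersRegion.get(key);
        } catch (ServerConnectivityException e) {
            // Thrown (or a subclass of it) when no servers can be reached.
            return null;
        }
    }
}
```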
Regarding your second bullet...
What happens to the data in the cache clients when they restart?
Depends on your "Interests Registration (Result) Policy" (i.e. InterestResultPolicy) when the client registers interests for the events (keys/values) in the server Region, particularly when the client comes back online. The interests "expression" (either particular keys, or "ALL_KEYS" or a regex) determines what the client Region will receive on initialization. It is possible not to receive anything.
Durability (the durable flag in `Region.registerInterest(..)`) of client "subscription queues" only determines whether the server will store events for the client when the client is not connected, so that the client can receive what it missed when it was offline.
Note, an alternative to "register interests" is CQs.
See the GemFire documentation on registering interest and on continuous queries (CQs) for more details.
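As a hedged illustration, a durable client that registers durable interest might look roughly like this (the durable-client-id, locator host/port and Region name are placeholders; package names assume a recent GemFire/Geode release):

```java
import org.apache.geode.cache.InterestResultPolicy;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class DurableInterestClient {

    public static void main(String[] args) {
        // A durable client id plus a subscription-enabled pool is what lets the
        // servers queue events for this client while it is offline.
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator-host", 10334)
                .setPoolSubscriptionEnabled(true)
                .set("durable-client-id", "orders-client-1")
                .create();

        Region<String, String> orders = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                .create("Orders");

        // InterestResultPolicy.KEYS_VALUES repopulates the local Region on (re)connect;
        // the final 'true' marks the registration as durable.
        orders.registerInterest("ALL_KEYS", InterestResultPolicy.KEYS_VALUES, true);

        // Tell the servers this client is ready to receive any queued (missed) events.
        cache.readyForEvents();
    }
}
```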
As for...
Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
Not that I know of. It all depends on your interests registration and/or CQs.
Finally, regarding your last bullet...
All cache servers and cache clients in one geo-region crash: what happens to the data in the cache servers and cache clients when they start up?
There will be no data if you do not enable persistence. GemFire is an "In-Memory" Data Grid, and as such, it keeps your data in memory only, unless you arrange to store your data externally, either by enabling persistence or by writing a CacheWriter to push the data to an external data store (e.g. an RDBMS).
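If you go the CacheWriter route, a minimal sketch could look like the following; ExternalStore here is a hypothetical stand-in for whatever JDBC/DAO layer you would actually use:

```java
import org.apache.geode.cache.EntryEvent;
import org.apache.geode.cache.util.CacheWriterAdapter;

// Write-through to an external store so data survives a full cluster restart.
public class ExternalStoreCacheWriter extends CacheWriterAdapter<String, String> {

    private final ExternalStore store = new ExternalStore();

    @Override
    public void beforeCreate(EntryEvent<String, String> event) {
        store.insert(event.getKey(), event.getNewValue());
    }

    @Override
    public void beforeUpdate(EntryEvent<String, String> event) {
        store.update(event.getKey(), event.getNewValue());
    }

    @Override
    public void beforeDestroy(EntryEvent<String, String> event) {
        store.delete(event.getKey());
    }

    // Hypothetical external data store facade, only to keep the sketch self-contained.
    static class ExternalStore {
        void insert(String key, String value) { /* e.g. JDBC INSERT */ }
        void update(String key, String value) { /* e.g. JDBC UPDATE */ }
        void delete(String key)               { /* e.g. JDBC DELETE */ }
    }
}
```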
Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
Not in this case.
Hope this helps!
-John

Related

Microservice State Synchronization

We are working on an application that has a WebSocket connection to every client. For high availability and load balancing purposes, we would like to scale the receiving micro service. As the WebSocket connection is used to propagate the state of a client to every other client it is important to synchronize the current state of a client with all other instances of the receiving micro service. It is also important that the state has to be reset when a client disconnects.
To give you some specs:
We are using docker swarm
It's a NodeJS backend and an Angular 9 frontend
We have looked into multiple ideas, for example:
Redis Cache (The state would not be deleted if the instance fails.)
Queues/Topics (This would mean every instance has to keep track of the current state of all clients.)
WebSockets between instances (This looks promising but is not really scalable.)
What is the best practice to sync the state of a micro service between multiple instances while making sure that there are no inconsistencies? How are you solving this issue? Are we missing something obvious? Any tips and tricks?
We appreciate any suggestions.
This might not be 100% what you want to hear, but generally people advise that all microservices should be stateless.
An overall application, of course, has state, and databases, persistent event streams or key-value caches (e.g. Redis) are excellent ways of persisting this. Ideally this is bounded per service though, otherwise you risk ending up with a distributed monolith.
Hard to say in your particular case, but perhaps rethink how state is stored conceptually, and make that more explicit: determine what is cache (for performance) and what is genuine state that should be persisted externally (e.g. to Redis and a database), so that many service instances can use it instantly, ensuring they are truly disposable processes.
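As a hedged illustration of the "genuine state persisted externally" idea (the question's backend is NodeJS, but the same shape applies there), here is a minimal sketch using Redis via the Java Jedis client; the key prefix, host and port are placeholders:

```java
import redis.clients.jedis.Jedis;

// Keep per-client connection state outside the service instances, so any instance
// can read/write it and a crashed instance loses nothing.
public class ClientStateStore {

    private final Jedis redis = new Jedis("redis-host", 6379); // placeholder host/port

    // Called when a client's state changes on any instance.
    public void saveState(String clientId, String stateJson) {
        redis.set("client-state:" + clientId, stateJson);
    }

    // Any instance can load the latest state before pushing it over its WebSockets.
    public String loadState(String clientId) {
        return redis.get("client-state:" + clientId);
    }

    // Called when the client disconnects, so the state is reset as required.
    public void clearState(String clientId) {
        redis.del("client-state:" + clientId);
    }
}
```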

How to exchange data between instances of the same service in Consul?

I'm trying a combination of Spring Cloud and Consul, and I wonder if there is a way to exchange data and state between instances of the same microservice.
For example, I have AuthenticationService1 (AS1) and AuthenticationService2 (AS2). When a user comes to AS1, he logs in and receives a token, and the next time he comes the token is only verified. But at this moment AS2 is not aware of the state of AS1.
I saw ideas about using a database table where information about user sessions is stored, but maybe there is an easier way for AS1 to share its state with AS2 or to send a message about log in?
Consul is a service management tool (discovery, configuration, ...), not a cache or pub-sub system.
You may want to use a shared cache behind the scenes for your use case. Your AS1 service authenticates a user, then puts the token in the cache. AS2 can retrieve the token from the cache. For that, you can use applications like
Redis
Hazelcast
Infinispan
... or other options, like storing the data in a DB ...
You can also use a pub-sub system and a local cache for each ASx, but you can run into issues when an AS restarts (the local cache is lost). So from my point of view, a shared cache is better.
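For example, a minimal sketch of the shared-cache approach using Hazelcast (the map name is a placeholder, and both AS1 and AS2 would use the same class against the same cluster):

```java
import java.util.Map;
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.HazelcastInstance;

// Both AS1 and AS2 connect to the same Hazelcast cluster and share one token map.
public class TokenCache {

    private final Map<String, String> tokens;

    public TokenCache() {
        HazelcastInstance client = HazelcastClient.newHazelcastClient();
        this.tokens = client.getMap("auth-tokens"); // map name is a placeholder
    }

    // AS1: called after a successful login.
    public void storeToken(String userId, String token) {
        tokens.put(userId, token);
    }

    // AS2: called to verify a token on subsequent requests.
    public boolean isValid(String userId, String token) {
        return token != null && token.equals(tokens.get(userId));
    }
}
```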

Scaling services in Distributed system SOA

What are the various alternatives for data processing in SOA? What I have done so far in the PoC is:
Scaling the Services on multiple machines.
One universal service will handle the service registry & discovery.
Multiple requests for one service can be forwarded to any instance of the service running on multiple machines on the cluster.
Next, we are planning the introduction of a distributed caching layer. Any service can get the data from the distributed caching layer. The entire flow of the system will be:
1. The client requests the data from a service.
2. The service checks the cache for valid requested data. If the data is in a valid state it is returned to the client right away. Otherwise the permanent data store is called for the requested data, which flows back to the client while updating the cache (see the cache-aside sketch below).
3. Now, if the client requests processing of the data, it can be processed by a service. The data can be processed by a single instance of the service or by multiple instances of the service: 3a or 3b?
3a. We just pass the important data filters from the client to the service and distribute the processing command among the multiple instances of the service. Each instance performs the operation on a small set of data and updates the data in the cache and the permanent store. Here, instead of passing the data, we are passing the processing command among the cluster nodes.
3b. We process the whole data set in one instance of the service and update it in the cache and the permanent data store.
4. Finally we return the processed data to the client.
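A minimal sketch of the cache-aside read path described in step 2, assuming hypothetical DistributedCache and PermanentStore interfaces that stand in for your real cache client and database layer:

```java
import java.util.Optional;

// Cache-aside read path: check the cache first, fall back to the permanent
// store on a miss, then update the cache before returning.
public class DataService {

    private final DistributedCache cache;
    private final PermanentStore store;

    public DataService(DistributedCache cache, PermanentStore store) {
        this.cache = cache;
        this.store = store;
    }

    public String getData(String key) {
        Optional<String> cached = cache.get(key);
        if (cached.isPresent()) {
            return cached.get();            // valid data served straight from the cache
        }
        String fresh = store.load(key);     // cache miss: read the permanent store
        cache.put(key, fresh);              // update the cache on the way out
        return fresh;
    }

    // Hypothetical abstractions over your actual cache and data store.
    interface DistributedCache {
        Optional<String> get(String key);
        void put(String key, String value);
    }

    interface PermanentStore {
        String load(String key);
    }
}
```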
For the transaction system, should we depend on the distributed cache? It might result in consistency problems while data is being processed by multiple instances of the service. One instance can read stale data and process that stale copy in the distributed system. How robust will it be to depend on the distributed cache?
How should a large set of transaction data be processed in a distributed system (SOA)? I have been reading this line on MuleSoft's site:
"Share workload between applications while maintaining transient state information with in-memory data-grid to provide bulletproof reliability together with scalability"
Any pointers to achieve such a distributed system where we can have scalability and reliability?

Data replication in Micro Services: restoring database backup

I am currently working with a legacy system that consists of several services which (among others) communicate through some kind of Enterprise Service Bus (ESB) to synchronize data.
I would like to gradually work this system towards a microservices architecture. I am planning to reduce the dependency on the ESB and use more of a message broker like RabbitMQ or Kafka. Due to some resource/existing technology limitations, I don't think I will be able to completely avoid data replication between services, even though I should be able to clearly define a single service as the data owner.
What I am wondering now, how can I safely do a database backup restore for a single service when necessary? Doing so will cause the service to be out of sync with other services that hold the replicated data. Any experience/suggestion regarding this?
Have your primary database publish an event every time a database mutation occurs, and let the replicated services subscribe to these events and apply the same mutation to their replicated data.
You already use a message broker, so you can leverage your existing stack for broadcasting the events. By having replication done through events, a restore being applied to the primary database will be propagated to all other services.
Depending on the scale of the backup, there will be a short period where the data on the other services will be stale. This might or might not be acceptable for your use case. Think of the staleness as some sort of eventual consistency model.
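As a sketch of what that could look like with Kafka (one of the brokers mentioned above); the topic name and the applyToLocalReplica helper are assumptions, not part of your existing stack:

```java
import java.time.Duration;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DataChangeReplication {

    // Owner service: publish every mutation (including those replayed by a restore).
    static void publishChange(KafkaProducer<String, String> producer,
                              String entityId, String changeJson) {
        producer.send(new ProducerRecord<>("data-changes", entityId, changeJson));
    }

    // Replicating service: subscribe and apply each mutation to its local copy.
    static void applyChanges(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(List.of("data-changes"));
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                applyToLocalReplica(record.key(), record.value());
            }
        }
    }

    static void applyToLocalReplica(String entityId, String changeJson) {
        // e.g. upsert the replicated row/document for entityId
    }
}
```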

Push or Pull for a near real time automation server?

We are currently developing a server whereby a client registers interest in changes to specific data elements, and when that data changes the server pushes the data back to the client. There has been vigorous debate at work about whether or not it would be better for the client to poll for this data.
What is considered to be the ideal method, in terms of performance, scalability and network load, of data transfer in a near real time environment?
Update:
Here's a link that gives some food for thought with regards to UI updates.
There's probably no ideal method for every situation, but push is usually better and used more often. It allows you to optimize server caching and data transfers, which helps performance and scalability, and cuts network traffic a bit by avoiding client requests and empty responses. It can be an important advantage for a server to operate at its own pace and supply clients with data when it is ready.
Industry standards - such as OPC and GID - support both. The server pushes updates to subscribed clients, but a client can pull some rarely used data without bothering with a subscription.
As long as the client initiates the connection (to get past firewall and NAT problems), either way is fine.
If there are several different types of data you need to send, you might want to have the client specify which type it wants, but this is only needed once per connection. Then you can have the server continue to send updates as it has them.
It would be less network traffic to have the server send updates without the client continually asking for updates.
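One way that "subscribe once, then the server keeps pushing" shape could look, sketched with the standard Java WebSocket API; the endpoint path, message format and push helper are assumptions for illustration:

```java
import java.io.IOException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

// The client connects once (outbound through its firewall), names the data type it
// wants, and from then on the server pushes updates without further requests.
@ServerEndpoint("/updates")
public class UpdatePushEndpoint {

    private static final Map<String, Set<Session>> SUBSCRIBERS = new ConcurrentHashMap<>();

    @OnOpen
    public void onOpen(Session session) {
        // Connection established; wait for the client to say what it wants.
    }

    @OnMessage
    public void onSubscribe(String dataType, Session session) {
        // The single "pull-like" message per connection: which data type to stream.
        SUBSCRIBERS.computeIfAbsent(dataType, k -> ConcurrentHashMap.newKeySet()).add(session);
    }

    // Called by the server whenever the watched data changes.
    public static void push(String dataType, String payload) throws IOException {
        for (Session session : SUBSCRIBERS.getOrDefault(dataType, Set.of())) {
            session.getBasicRemote().sendText(payload);
        }
    }
}
```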
What do you have on the client's side? Many firewalls allow outgoing requests but block incoming requests. In other words, pull may be your only option if you are crossing the Internet unless you are sending out e-mails.
