What are the various alternatives for data processing in an SOA? What I have done so far in a PoC is:
- Scaling the services on multiple machines.
- One universal service will handle service registry & discovery.
- Multiple requests for one service can be forwarded to any instance of the service running on multiple machines in the cluster.
Next, we are planning to introduce a distributed caching layer. Any service can get data from the distributed caching layer. The entire flow of the system will be:
1. Client will request the data from the service.
2. The service will check the cache for valid requested data. If the data is in a valid state, it will be returned to the client right away. Otherwise, the permanent data store will be called for the requested data, and the data will flow to the client while updating the cache.
3. Now, if the client requests processing of the data, it can be processed by a service. Should the data be processed by a single instance of the service or by multiple instances, i.e. 3a or 3b?
3a. We pass only the important data filters from the client to the service and distribute the processing command among the multiple instances of the service. Each instance performs the operation on a small set of data and updates the data in the cache and the permanent store. Here, instead of passing the data, we are passing the processing command among the cluster nodes.
3b. We process the whole data in one instance of the service and update it in the cache and the permanent data store.
4. Finally, we return the processed data to the client.
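The cache-then-store read path in step 2 can be sketched roughly as follows. This is a minimal sketch, not your implementation: plain in-memory maps stand in for the distributed cache (e.g. Redis) and the permanent store, and the `Service` type and string keys/values are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Service sketches the read path: check the cache first, fall back to the
// permanent store, and warm the cache on a miss. Both maps are stand-ins
// for a distributed cache and a durable database.
type Service struct {
	mu    sync.Mutex
	cache map[string]string
	store map[string]string
}

func (s *Service) Get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if v, ok := s.cache[key]; ok {
		return v, true // valid cached data is returned right away
	}
	v, ok := s.store[key]
	if ok {
		s.cache[key] = v // data flows to the client while updating the cache
	}
	return v, ok
}

func main() {
	svc := &Service{
		cache: map[string]string{},
		store: map[string]string{"order:7": "pending"},
	}
	v, _ := svc.Get("order:7") // miss: loaded from the store, cache warmed
	fmt.Println(v)
	v, _ = svc.Get("order:7") // hit: served from the cache
	fmt.Println(v)
}
```

In a real deployment the mutex would be replaced by whatever consistency guarantees the distributed cache itself offers, which is exactly where the staleness concern below comes in.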
For a transaction system, should we depend on the distributed cache? It might result in consistency problems while data is being processed by multiple instances of the service: one instance can read stale data and process that stale copy in the distributed system. How robust is it to depend on a distributed cache?
How should a large set of transaction data be processed in a distributed system (SOA)? I have been reading this line on MuleSoft's site:
"Share workload between applications while maintaining transient state information with in-memory data-grid to provide bulletproof reliability together with scalability"
Any pointers to achieve such a distributed system where we can have scalability and reliability?
My web application ingests data from a third-party source and aims to display this data to web clients over a web socket in a real-time manner. The third-party clients push data into the backend over an HTTP endpoint. The data store inside the Golang backend is of temporary nature, just a global variable: a slice keeping third-party client message content.
Since the global slice can be written to (by the third-party client ingestion endpoint) and read from (by threads that send the ingested data to web-app websockets) at any point in time, the message store reads and writes are protected with a mutex. The slice could grow and get rearranged in memory at any point of time.
A "long-term mutex" lock issue arises here. There's a thread that needs to:
read data from the memory state
write the data to a particular web client websocket (possibly a time-lengthy operation)
Are there any general patterns that elegantly deal with this type of problem?
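One common pattern for the long-held-mutex problem above is to hold the lock only long enough to copy the slice, then perform the slow websocket write on the private copy. A minimal sketch, assuming string messages and an illustrative `MessageStore` type (names are not from the question):

```go
package main

import (
	"fmt"
	"sync"
)

// MessageStore guards the shared slice with a read-write mutex.
type MessageStore struct {
	mu       sync.RWMutex
	messages []string
}

// Append is called by the ingestion endpoint.
func (s *MessageStore) Append(msg string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.messages = append(s.messages, msg)
}

// Snapshot copies the slice under the lock so the caller can do slow I/O
// (e.g. a websocket write) without holding the mutex at all.
func (s *MessageStore) Snapshot() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]string, len(s.messages))
	copy(out, s.messages)
	return out
}

func main() {
	store := &MessageStore{}
	store.Append("hello")
	store.Append("world")
	snap := store.Snapshot()
	// The time-lengthy websocket write would happen here, on snap, lock-free.
	fmt.Println(len(snap), snap[0])
}
```

The copy costs memory proportional to the slice, but it bounds the critical section to a memcpy, so a slow client can never stall the ingestion path. Per-client channels fed by a broadcaster goroutine are another common variant.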
I am working on a microservice architecture. One of my services is exposed to a source system which is used to post data. This microservice publishes the data to Redis using Redis pub/sub, and it is further consumed by a couple of other microservices.
Now, if one of those microservices is down and not able to process the data from Redis pub/sub, then I have to retry with the published data when the microservice comes back up. The source cannot push the data again, and manual intervention is not possible, so I thought of 3 approaches:
1. Additionally using Redis data structures for storing and retrieving the payload.
2. Using a database for storing before publishing. I have many source and target microservices which use Redis pub/sub, so with this approach I would have to insert the request in the DB first and then its response status every time. I would also have to use a shared database; this approach adds a couple more exception-handling cases and doesn't look very efficient to me.
3. Using Kafka in place of Redis pub/sub. As traffic is low I chose Redis pub/sub, and it is not feasible to change.
In the first two cases I have to use a scheduler, and there is a duration within which I have to retry, else subsequent requests will fail.
Is there any other way to handle the above cases?
For point 2:
- Store the data in DB.
- Create a daemon process which will process the data from the table.
- This Daemon process can be configured well as per our needs.
- Daemon process will poll the DB and publish the data, if any. Also, it will delete the data once published.
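The daemon loop described above can be sketched roughly like this. It is a sketch under stated assumptions: the `Outbox` slice stands in for the real DB table, `Row` is an illustrative record shape, and `publish` is whatever Redis publish call you actually use. A real daemon would run `pollOnce` on a `time.Ticker`.

```go
package main

import (
	"fmt"
	"sync"
)

// Row models one record in the outbox table; the table itself is mocked
// with a slice here. In practice this would be a real DB table.
type Row struct {
	ID      int
	Payload string
}

// Outbox is an in-memory stand-in for the database table.
type Outbox struct {
	mu   sync.Mutex
	rows []Row
}

func (o *Outbox) Insert(r Row) {
	o.mu.Lock()
	defer o.mu.Unlock()
	o.rows = append(o.rows, r)
}

// TakeAll drains the pending rows, mirroring "poll the DB, then delete".
func (o *Outbox) TakeAll() []Row {
	o.mu.Lock()
	defer o.mu.Unlock()
	out := o.rows
	o.rows = nil
	return out
}

// pollOnce publishes pending rows, re-queueing any that fail to publish.
func pollOnce(o *Outbox, publish func(Row) error) {
	for _, r := range o.TakeAll() {
		if err := publish(r); err != nil {
			o.Insert(r) // publish failed: keep the row for the next poll
		}
	}
}

func main() {
	box := &Outbox{}
	box.Insert(Row{ID: 1, Payload: "hello"})
	pollOnce(box, func(r Row) error {
		fmt.Println("published", r.ID)
		return nil
	})
}
```

Note that with a real table you would delete a row only after the publish succeeds (or use a status column), so a crash between publish and delete causes a redelivery rather than a loss; consumers should therefore be idempotent.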
This is not from a microservice architecture, but I have seen this approach working efficiently when communicating with 3rd-party services.
At the very outset, as you mentioned, we do indeed seem to have only three possibilities.
This is one of those situations where you want to get a handshake from the service after pushing and after processing. To accomplish that, a middleware queuing system would be the right choice.
Although a bit more complex to set up, what you can do is use Kafka for streaming this. Configuring producer and consumer groups properly can help you do the job smoothly.
Using a DB for storage would be overkill, considering that in your situation the data is simply to be processed and then persisted.
BUT, alternatively, storing the data in Redis and reading it in a cron job/scheduled job would make your job much simpler. Once the job has run successfully, you may remove the data from the cache and thus save Redis memory.
If you can comment further on the architecture and the implementation, I can go ahead and update my answer accordingly. :)
I have 3 different microservices (e.g. A, B, C; they are REST and Spring Boot based). These 3 services generally run in 3 different data-center locations, i.e. there are different instances of each service.
The problem I am trying to solve:
I need to send updates (it is a kind of polling, checking whether there are any updated records) from service A to services B and C through a REST call. Based on these updates, services B and C do their own processing. After deployment (mostly to the cloud), how does A know which B and C instances are up and running, so that it can send updates to the running instances?
Do we need to keep track of running instances in some DB table and look up the active instances before sending updates from A? Or should we just create some indicator or sequence-number-based approach to detect that there are updates at A that need to be sent out? But in that case, how does A know which instances are active? Or do we just send the updates from A, and some router or load balancer or other component takes care of delivering them to the available active instances, with no storing and looking up of active instances at all?
I am not very familiar with networking, the behavior of production systems, and their communication in cloud environments.
Trying to implement cross-service updates through REST-based synchronization is a bad idea because it is not scalable: if you add more microservices that need to be aware of updates made on service A, you have to modify the existing microservice that emits the change. This in fact introduces risk and additional maintenance cost.
Instead, you can use messaging queues to emit events that indicate changes made on a service. This approach eliminates the need to modify any existing microservice (thanks to the pub/sub pattern); you just plug new consumers into the update-emitting services already in your ecosystem.
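The decoupling argument can be illustrated with a toy in-process broker. This is only a sketch: the `Broker` type and the topic name are illustrative stand-ins for a real broker such as RabbitMQ or Kafka. The point is that B and C subscribe, and A's publishing code never changes when a new consumer appears.

```go
package main

import "fmt"

// Broker is a minimal in-process stand-in for a message broker.
type Broker struct {
	subscribers map[string][]func(event string)
}

func NewBroker() *Broker {
	return &Broker{subscribers: make(map[string][]func(string))}
}

// Subscribe plugs in a new consumer without touching any publisher.
func (b *Broker) Subscribe(topic string, fn func(string)) {
	b.subscribers[topic] = append(b.subscribers[topic], fn)
}

// Publish fans the event out to every current subscriber of the topic.
func (b *Broker) Publish(topic, event string) {
	for _, fn := range b.subscribers[topic] {
		fn(event)
	}
}

func main() {
	broker := NewBroker()
	// Services B and C subscribe; service A never needs to know they exist.
	broker.Subscribe("serviceA.updates", func(e string) { fmt.Println("B got:", e) })
	broker.Subscribe("serviceA.updates", func(e string) { fmt.Println("C got:", e) })
	broker.Publish("serviceA.updates", "record 42 changed")
}
```

With a real broker, instance discovery also stops being A's problem: each B or C instance (or consumer group) connects to the broker itself, so the broker, not A, tracks who is currently available.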
For the GemFire cache, we are using the client/server architecture in 3 different geographic regions with 3 different locators.
Cache Server
- Each region would have 2 separate cache servers, potentially one primary and one secondary
- The cache servers are in a peer-to-peer connection
- The data-policy on the cache servers is replicate
- No region persistence is enabled
Cache Client
- No persistence is enabled
- No durable queues/subscriptions are set up
What would the default behavior be in the following scenarios?
- All cache servers in one geo-region crash. What happens to the data in the cache clients when the cache servers restart? Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
- All cache clients in one geo-region crash. Although we don't have durable queues/subscriptions set up, for this scenario let's assume we do. What happens to the data in the cache clients when they restart? Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
- All cache servers and cache clients in one geo-region crash. What happens to the data in the cache servers and cache clients when they start up? Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
Thanks in advance!
Ok, so based on how I am interpreting your configuration/setup and your questions, this is how I would answer them currently.
Also note, I am assuming you have NOT configured WAN between your separate clusters residing in different "geographic regions". However, some of the questions would not matter if WAN was configured or not.
Regarding your first bullet...
what happens to the data in the cache clients when the cache servers restart?
Nothing.
If the cache client were also storing data "locally" (e.g. CACHING_PROXY), then the data will remain intact.
A cache client can also have local-only Regions available only to the cache client, i.e. there is no matching (by "name") Region in the server cluster. This is determined by one of the "local" ClientRegionShortcuts (e.g. ClientRegionShortcut.LOCAL, which corresponds to DataPolicy.NORMAL). Definitely, nothing happens to the data in these types of client Regions if the servers in the cluster go down.
If your client Regions are PROXIES, then your client is NOT storing any data locally, at least for those Regions that are configured as PROXIES (i.e. ClientRegionShortcut.PROXY, which corresponds to DataPolicy.EMPTY).
So...
Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
See above, but essentially, your "PROXY" based client Regions will no longer be able to "communicate" with the server.
For PROXY, all Region operations (gets, puts, etc) will fail, with an Exception of some kind.
For CACHING_PROXY, a Region.get should succeed if the data is available locally. However, if the data is not available, the client Region will send the request to the server Region, which of course will fail. If you are performing a Region.put, then that will fail as well, since the data cannot be sent to the server.
Regarding your second bullet...
What happens to the data in the cache clients when they restart?
Depends on your "Interests Registration (Result) Policy" (i.e. InterestResultPolicy) when the client registers interests for the events (keys/values) in the server Region, particularly when the client comes back online. The interests "expression" (either particular keys, or "ALL_KEYS" or a regex) determines what the client Region will receive on initialization. It is possible not to receive anything.
Durability (the durable flag in `Region.registerInterest(..)`) of client "subscription queues" only determines whether the server will store events for the client while the client is not connected, so that the client can receive what it missed when it was offline.
Note, an alternative to "register interests" is CQs.
See here and here for more details.
As for...
Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
Not that I know of. It all depends on your interests registration and/or CQs.
Finally, regarding your last bullet...
All cache servers and cache clients in one geo-region crash, what happens to the data in the cache servers and cache clients when they start up?
There will be no data if you do not enable persistence. GemFire is an "In-Memory" Data Grid, and as such, it keeps your data in memory only, unless you arrange for storing your data externally, either by persistence or writing a CacheWriter to store the data in an external data store (e.g. RDBMS).
Does the behavior differ for cache clients with proxy or caching-proxy client cache regions?
Not in this case.
Hope this helps!
-John
I am currently working with a legacy system that consists of several services which (among others) communicate through some kind of Enterprise Service Bus (ESB) to synchronize data.
I would like to gradually move this system toward a microservices architecture. I am planning to reduce the dependency on the ESB and make more use of a message broker like RabbitMQ or Kafka. Due to some resource/existing-technology limitations, I don't think I will be able to completely avoid data replication between services, even though I should be able to clearly define a single service as the data owner.
What I am wondering now is: how can I safely do a database backup restore for a single service when necessary? Doing so will cause the service to be out of sync with the other services that hold the replicated data. Any experience/suggestions regarding this?
Have your primary database publish an event every time a database mutation occurs, and let the services holding replicas subscribe to these events and apply the same mutation to their replicated data.
You already use a message broker, so you can leverage your existing stack for broadcasting the events. By having replication done through events, a restore being applied to the primary database will be propagated to all other services.
Depending on the scale of the backup, there will be a short period where the data on the other services will be stale. This might or might not be acceptable for your use case. Think of the staleness as some sort of eventual consistency model.
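The mutation-event replication described above can be sketched as follows. This is a minimal sketch under assumptions: the `Mutation` shape and a buffered channel standing in for the message broker are illustrative, and a real system would also version and order the events.

```go
package main

import "fmt"

// Mutation describes one change on the owning service's data.
type Mutation struct {
	Key   string
	Value string
}

// Replica holds a copy of the data and applies mutations as they arrive.
type Replica struct {
	data map[string]string
}

func (r *Replica) Apply(m Mutation) { r.data[m.Key] = m.Value }

func main() {
	primary := map[string]string{}
	replica := &Replica{data: map[string]string{}}
	events := make(chan Mutation, 16) // stands in for the message broker

	// Every primary write also publishes a mutation event.
	write := func(k, v string) {
		primary[k] = v
		events <- Mutation{Key: k, Value: v}
	}

	write("user:1", "alice")
	// A backup restore is just more mutations replayed through the same
	// channel, so the replica converges once the restore finishes.
	write("user:1", "alice-restored")

	close(events)
	for m := range events {
		replica.Apply(m)
	}
	fmt.Println(replica.data["user:1"]) // prints alice-restored
}
```

Because replicas only ever apply the event stream, the window of staleness during a restore is exactly the lag of the broker, which is the eventual-consistency trade-off mentioned above.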