How to handle common variables in microservices architecture?

Let's consider a situation where multiple services rely on data that can change at any time and should be updated in each microservice at roughly the same time - for example, a list of supported languages or some common policies that could change one day and affect many services at once.
One solution I could think of is to have another microservice that holds that data, so any service that needs the current state can just ask for it. The drawback is that asking over HTTP is not that cheap, so this global registry service, let's say, would receive a lot of traffic for data that does not change very frequently. Because the data rarely changes, many services could cache it in order not to ask every time, but then they would not be able to respond quickly enough when the configuration does change.
The other solution could be to externalize such configuration - in AWS, for example, there could be a configuration file on S3 that is available to the other services. The drawback here is that there is no way (as far as I know) to track changes to such a file, and no way to add logic that verifies whether a changed value is correct (that there are no typos, and so on).
So my question is: how do you handle global configuration/registry in the microservice world so that there is little HTTP overhead, changes can be audited, and a change can be introduced in many services at the same time?

I would prefer option 1. Note that, apart from the HTTP overhead, it can also leave your system in an inconsistent state: service 1 might be working with the new values while service 2 is still on the old ones.
Since this is a distributed system that we are talking about, I am willing to take a risk with availability.
Have a configuration service that allows you to plan your config changes. Instead of saying "change the value of A from x to y", you say "change A from x to y at time t". This t allows you to propagate changes consistently across your whole system. You need to put in effort to work out what the minimum value of t should be for your set of services, how you will make all services acknowledge the change and apply it at the right time, and how you will handle new services that come up in between.
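The "change from x to y at time t" idea can be sketched as follows. This is only a minimal illustration: the class name `ScheduledConfig` and its methods are hypothetical, time is passed in explicitly as a number, and a real deployment would also have to deal with clock skew between services.

```python
from dataclasses import dataclass, field


@dataclass
class ScheduledConfig:
    """Holds a current value plus changes that take effect at a future time."""
    value: str
    # Pending changes as (effective_at, new_value), kept sorted by time.
    pending: list = field(default_factory=list)

    def schedule(self, new_value: str, effective_at: float) -> None:
        # Register "change to new_value at time effective_at".
        self.pending.append((effective_at, new_value))
        self.pending.sort(key=lambda change: change[0])

    def get(self, now: float) -> str:
        # Apply every change whose effective time has already passed.
        while self.pending and self.pending[0][0] <= now:
            _, self.value = self.pending.pop(0)
        return self.value


config = ScheduledConfig(value="x")
config.schedule("y", effective_at=100.0)
before = config.get(now=99.0)   # still the old value
after = config.get(now=100.0)   # the scheduled change has taken effect
```

Because every service receives the pending change ahead of time and switches when its own clock reaches t, the switch does not depend on how quickly the change notification reaches each service.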
Another approach is to use Spring Cloud Config (or something similar). Services register with the centralised config service, and a refresh call is made to all the services to update their config. The limitations are that not all configs can be refreshed and, if you are behind a load balancer, you still need a way to make sure all instances get updated.

Use a config server (Spring Cloud Config Server) to maintain centralized configuration. You make configuration changes on the config server, and each microservice fetches its configuration from the config server on startup. After startup, microservices can also check the config server at a regular interval for configuration changes and update accordingly.

There are a couple of ways to do it; a better way, especially in production, is to use the External Configuration Store pattern.
You can save the configuration in external stores like Azure Key Vault or Azure App Configuration.
Find more details about Azure Key Vault here:
Azure Key Vault
5-minute quickstarts of Azure Key Vault integration

If you absolutely must have a shared config, the best decoupled architecture I've encountered is as follows:
You have a standalone Config Service that is completely closed off from the outside world and can only be accessed by your microservices through an internal network.
ON STARTUP: Each microservice pulls what it needs from the Config Service and stores it in memory. If it is unable to pull from the Config Service, do not allow it to start. Have a retry mechanism on this front.
ON CHANGE of the Config Service: Publish an event to your messaging layer that will force services to update their respective configurations.
Caveats:
Do not put time-sensitive configurations here, since the communication is asynchronous (if you have time-critical configs, why are they shared in the first place? You might need to revisit that).
You need to handle your own plumbing: retry mechanism, memory management, etc.
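The startup pull with retry and the change event could look roughly like this. It is a sketch under assumptions: `load_config_with_retry`, `ServiceConfig` and `fake_fetch` are hypothetical names, the fetch function stands in for the call to the Config Service, and `on_config_changed` stands in for whatever handler your messaging layer would invoke.

```python
import time
from typing import Callable, Dict


def load_config_with_retry(fetch: Callable[[], Dict[str, str]],
                           attempts: int = 3,
                           delay_seconds: float = 0.0) -> Dict[str, str]:
    """Pull config; raise if every attempt fails so the service does not start."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except ConnectionError as error:
            last_error = error
            time.sleep(delay_seconds)
    raise RuntimeError("config service unreachable, refusing to start") from last_error


class ServiceConfig:
    def __init__(self, fetch: Callable[[], Dict[str, str]]):
        self._fetch = fetch
        # ON STARTUP: pull what this service needs and keep it in memory.
        self.values = load_config_with_retry(fetch)

    def on_config_changed(self, event: dict) -> None:
        # ON CHANGE: invoked by the (assumed) message subscriber; re-pull the config.
        self.values = load_config_with_retry(self._fetch)


# Demo with a fake Config Service that fails on the first call only.
calls = {"count": 0}
store = {"languages": "en,de"}


def fake_fetch() -> Dict[str, str]:
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("config service not ready")
    return dict(store)


service = ServiceConfig(fake_fetch)
store["languages"] = "en,de,fr"
service.on_config_changed({"type": "config.changed"})
```

Re-pulling on the event, rather than trusting a payload carried inside the event, keeps the Config Service as the single source of truth; that is a design choice of this sketch, not something the answer prescribes.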

Related

Why rate limiting logic should be placed with application code rather than web server

I am exploring adding rate limiting functionality to REST APIs developed using Spring Boot.
After going through many articles, I came to understand that the best place for rate limiting functionality is the application code, rather than the web server.
My question is: how do you decide which functionality should go where? Since it is monitoring your incoming calls and has nothing to do with business logic, the ideal place should be a web server.
Technically the web server could do the job, but in practice a web server does not necessarily have all the needed information, it is not specialized for API consumption, and it may also make this feature much harder to test.
Some practical reasons why the web server side could be a bad choice:
Developers do not necessarily have the HTTP web server's configuration available locally.
You want to write unit and integration tests to check that the rate limits are applied as specified. Creating a configuration for automated testing is much simpler within the scope of your Java application than with a configuration file defined on a web server.
Web servers reason in terms of HTTP requests and responses, not in terms of services.
Rate limits may be applied according to the IP address, but not only that: the username, the user roles and the type of service may all influence the limits. I am not sure you could get all of these easily from an HTTP server.
For example, roles are stored on the server side or in a database.
A better option is to implement these mechanisms in specific, specialized classes or configuration files, which makes them easier to read, maintain and test.
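To illustrate why application-level information matters, here is a minimal sketch of a limiter keyed by username with a per-role limit. It is written in plain Python rather than Spring, the class `RoleRateLimiter` and the role names are hypothetical, and it uses a simple fixed window, which is only one of several possible algorithms.

```python
from collections import defaultdict


class RoleRateLimiter:
    """Fixed-window limiter keyed by username, with a per-role request limit."""

    def __init__(self, limits_per_window: dict, window_seconds: float = 60.0):
        self.limits = limits_per_window
        self.window = window_seconds
        # username -> (window_start, request_count)
        self.counters = defaultdict(lambda: (0.0, 0))

    def allow(self, username: str, role: str, now: float) -> bool:
        start, count = self.counters[username]
        if now - start >= self.window:
            start, count = now, 0  # a new window begins
        if count >= self.limits[role]:
            self.counters[username] = (start, count)
            return False
        self.counters[username] = (start, count + 1)
        return True


limiter = RoleRateLimiter({"free": 2, "premium": 5})
# Three requests inside the same window: the third exceeds the "free" limit.
results = [limiter.allow("alice", "free", now=t) for t in (1.0, 2.0, 3.0)]
# After the window has elapsed, requests are accepted again.
after_window = limiter.allow("alice", "free", now=61.5)
```

Because the clock is passed in as `now`, the limits can be checked in a unit test without waiting for real time to pass, which is the testability point made above.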
As you mention Spring Boot in your tags, that and that should interest you.
I recommend spring-cloud-gateway's rate limiter.
You could separate this functionality from your business logic by using Filters.
https://www.baeldung.com/spring-boot-add-filter

Ways to update config properties for a Spring boot rest service

I have a Spring Boot REST service whose configuration values are stored in git and fetched using a config server. Deployment is done in a Docker Swarm cluster, where this service runs across multiple containers. So one thing I have to keep in mind is that when actuator's refresh endpoint is called, it should refresh all the containers for this service seamlessly and not just one random container. This is quite an obvious ask, I believe.
I could use a message broker to update the config values for a service as and when its config changes in git. However, that would take time, and time is not on my side at the moment.
I have come up with two quick solutions and would like your help, based on your experience, as to which one is better. Keep in mind that both work and I have tested them both.
Solution 1
Create a scheduler using @Scheduled in Application.java and keep pinging actuator's refresh endpoint every 5 seconds. I think this is really expensive and resource intensive in production.
Solution 2
Call actuator's refresh endpoint in the controller method itself. This way, I call the refresh endpoint on demand instead of wastefully polling it as in solution 1. It also ensures that whichever container is picked to service a request refreshes itself, since the refresh endpoint call refreshes only the properties held by that container.
Do you have a preference for one over the other? Do you see any pros and cons with these solutions? Which one would you pick and why?
Please let me know what your thoughts are.
This sounds like an interesting problem. As you pointed out, Solution 1 is resource intensive and should not be used in production. If you are running out of time, I would suggest you go ahead with Solution 2; it is smarter than the former.
However, I think the optimal way to solve this problem is to use webhooks in GitHub. This way GitHub will make an API call to your predefined endpoint when a specific event is generated. Events are the core of GitHub webhooks. Here is the list of all GitHub events; choose the one that best suits your requirement. https://developer.github.com/webhooks/#events

What technology to use to avoid too many VMs

I have a small web and mobile application partly running on a webserver written in PHP (Symfony). I have a few clients using the application, and slowly expanding to more clients.
My back-end architecture looks like this at the moment:
Database is Cloud SQL running on GCP (every client has its own database instance)
Files are stored on Cloud Storage (GCP) or S3 (AWS), depending on the client (every client has its own bucket)
PHP application is running in a Compute Engine VM (GCP) (every client has its own VM)
Now the thing is, in the PHP code, the only client-specific thing is a settings file with the database credentials and the Storage/S3 keys in it. All the other code is exactly the same for every client. And mostly the different VMs sit idle all day, waiting for a few hours of usage per client.
I'm trying to find a way to avoid having to create and maintain a VM for every customer. How could I rearchitect my back-end so I can keep separate databases and storage buckets per client, but only scale up my VMs when capacity is needed?
I'm hearing a lot about Docker, was thinking about keeping DB credentials and keys in a Redis DB or Cloud Datastore, and was looking at Heroku, App Engine, Elastic Beanstalk, ...
This is my ideal scenario as I see it now
An incoming request is done, hits a load balancer
From the request, determine which client the request is for
Find the correct settings file, or credentials from a DB
Inject the settings file in an unused "container"
Handle the request
Make the container idle again
And somewhere in there, determine, based on the amount of incoming requests or traffic, whether I need to spin containers up or down to handle the extra or reduced (temporary) load.
All this information overload has me stuck. I have no idea which direction to choose, and I fail to see how implementing any of the above technologies will actually fix my problem.
There are several ways to do it with minimum effort:
Rewrite the loading of the config file so that it depends on the customer
Host several back-end web sites on one VM (best choice, I think)
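The first option, loading settings per customer, might be sketched like this. It is shown in Python rather than PHP purely for illustration; the host names, setting keys and the function `settings_for_request` are all hypothetical, and the lookup assumes each client is reached through its own host name.

```python
from typing import Dict

# Hypothetical per-client settings; in practice these might live in a
# datastore or secret manager rather than in code.
CLIENT_SETTINGS: Dict[str, Dict[str, str]] = {
    "acme.example.com": {"db_name": "acme_db", "bucket": "acme-files"},
    "globex.example.com": {"db_name": "globex_db", "bucket": "globex-files"},
}


def settings_for_request(host_header: str) -> Dict[str, str]:
    """Pick the client's settings from the request's Host header."""
    host = host_header.split(":")[0].lower()  # drop any port, normalise case
    try:
        return CLIENT_SETTINGS[host]
    except KeyError:
        raise LookupError(f"unknown client host: {host}") from None


acme = settings_for_request("ACME.example.com:443")
```

With a lookup like this, one shared application instance can serve every client while the databases and buckets stay separate.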

how to update local memory cache in all server instances

I have a web server cluster that contains many running web server instances. Each instance caches some configurations in its local memory; the original configurations are stored in a database.
These configurations are used for every request, so the cache may be necessary for performance reasons.
I want to provide an admin page in which the administrator can change the configurations. How do I update the cache in every server instance?
Right now I have two solutions for this:
Set an expiry time for the cache.
When the administrator updates the configuration, notify each instance via some pub/sub mechanism (e.g. Redis).
For solution 1, the drawback is that changes cannot take effect immediately.
For solution 2, I'm wondering whether the pub/sub will have an impact on the performance of the web server.
Which one is better? Or is there a common solution for this problem?
Another drawback of option 1 is that you'll periodically hit your database unnecessarily.
If you're already using Redis then option 2 is a good solution. I've used it successfully and can't imagine how there could be a performance impact just because you're using pubsub.
Another option is to create a cache invalidation URL on each website, e.g. /admin/cache-reset/, and have your administration tool call the cache-reset URL on each individual server. The drawback of this solution is that you need to maintain a list of servers. If you're not already using Redis it could just be the simple/practical/low-tech solution that you're looking for.
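The pub/sub approach (option 2) can be sketched as follows. This is not Redis code: `InMemoryBus` is a hypothetical stand-in for a pub/sub channel that delivers messages synchronously, and `ServerInstance` is a hypothetical model of one web server with its local cache.

```python
class InMemoryBus:
    """Stand-in for a pub/sub channel such as Redis; delivers synchronously."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)


class ServerInstance:
    def __init__(self, database: dict, bus: InMemoryBus):
        self.database = database
        self.cache = {}
        bus.subscribe(self.on_invalidate)

    def get(self, key):
        # Read through the local cache; hit the database only on a miss.
        if key not in self.cache:
            self.cache[key] = self.database[key]
        return self.cache[key]

    def on_invalidate(self, key):
        # Drop the stale entry; the next read reloads it from the database.
        self.cache.pop(key, None)


database = {"theme": "light"}
bus = InMemoryBus()
servers = [ServerInstance(database, bus) for _ in range(3)]
warm = [server.get("theme") for server in servers]

# The admin page updates the database, then notifies every instance.
database["theme"] = "dark"
bus.publish("theme")
fresh = [server.get("theme") for server in servers]
```

Each instance only does work when a change is actually published, so the cost on the request path is unchanged; with a real broker, delivery would be asynchronous and a short expiry time could still serve as a safety net for missed messages.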

Clustering Spring Boot applications

I have an adapter (written in Spring Boot and Spring Integration) retrieving currency rates from two different sources (via REST and a proprietary library). I filter out unnecessary things, create instances of a class known in my system and send the rates to a JMS cluster. I want this adapter to be replicated, but only one instance should be running at any one time. When one crashes (I know it from the health endpoint), another one should start publishing rates. How can I achieve such an effect? I know that available services can be registered using Eureka, but how do I turn one of them on automatically?
The solution to the problem is to use spring-cloud-cluster. One can use either ZooKeeper or Hazelcast to negotiate leadership. Out of several instances, only one is given the leader role. If it crashes, another one takes over its role (it is informed via event propagation). You can also use the yieldLeadership method to manually relinquish leadership (if a health indicator says something is wrong with the application).
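The general idea of leader election with failover can be illustrated with a lease that expires unless the leader keeps renewing it. This is only a conceptual sketch and not how spring-cloud-cluster, ZooKeeper or Hazelcast implement it; `LeaseLock` and its methods are hypothetical, and the lock lives in one process here, whereas a real one would live in the coordination service.

```python
class LeaseLock:
    """Stand-in for a coordination service: one holder at a time, with expiry."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, candidate: str, now: float) -> bool:
        # Succeeds if the lease is free, already ours (renewal), or expired.
        if self.holder in (None, candidate) or now >= self.expires_at:
            self.holder = candidate
            self.expires_at = now + self.ttl
            return True
        return False

    def release(self, candidate: str) -> None:
        # Voluntarily give up leadership, e.g. when a health check fails.
        if self.holder == candidate:
            self.holder = None


lock = LeaseLock(ttl=10.0)
a_leads = lock.try_acquire("adapter-a", now=0.0)         # a becomes leader
b_blocked = lock.try_acquire("adapter-b", now=1.0)       # lease still held by a
# adapter-a crashes and stops renewing; its lease runs out at t=10.
b_takes_over = lock.try_acquire("adapter-b", now=10.0)
a_blocked = lock.try_acquire("adapter-a", now=10.5)      # b now holds the lease
lock.release("adapter-b")                                # b yields leadership
a_leads_again = lock.try_acquire("adapter-a", now=12.0)
```

Only the instance that currently holds the lease would publish rates; the others keep calling `try_acquire` periodically and take over once the lease expires or is released.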
Without knowing more details it is hard to give you a recommendation.
I'd personally say Eureka is not built for what you are trying to achieve. It sounds more like you want to have a look at ZooKeeper. Also see the Eureka FAQ for reference. ZooKeeper was built for exactly what you are trying to achieve: leader election.
On the other hand, if you can also survive having the service down for a few seconds, I'd suggest you either use the script that already monitors the /health endpoint to restart the service, or use systems that already have this built in, like systemd or Docker, where you can define restart policies.
