How can I use Kuma to run a multi-cloud service mesh that spans across a VM-based environment as well as a Kubernetes-based environment?
Specifically, how will service discovery work in such a way that VM-based workloads can discover K8s-based ones and vice-versa?
Kuma defines the so-called zone as a domain of control isolation, i.e. all workload connections are managed by a single control plane. Such a control plane is called remote. The overall view and policy management is done in a global control plane, which unifies all zones.
When planning a distributed deployment, list the following items:
Where the global control plane will be deployed, and its type. The type can be either Universal (VM/bare metal/container) or Kubernetes (on-premise/cloud).
Number and type of zones to add. These can be changed over time.
Install the global control plane by following the instructions specific to the chosen type of deployment. Gather the relevant IP addresses and ports as described.
Installing a remote control plane is straightforward. This process can be repeated as needed during the lifetime of the whole multi-zone deployment.
Cross-zone service consumption is described in brief here. In short, we do recommend using the following syntax to access a service echo-server, deployed in a Kubernetes namespace echo-example and exposed on port 1010:
<kuma-enabled-pod>$ curl http://echo-server_echo-example_svc_1010.mesh
Using this syntax, the service can be found and consumed even from a neighbouring Universal zone where the workload runs in a VM. Kuma provides its own DNS service, which makes this service discovery possible.
It is recommended that services declared in VMs follow the same naming format, so that if a service replica is later needed in a Kubernetes cluster, the two can be interchanged easily without reconfiguring the whole infrastructure.
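As a small illustration of the naming format used above, the following Python sketch builds the .mesh hostname from a service name, a namespace and a port. The helper name mesh_hostname is made up for this example; only the <service>_<namespace>_svc_<port>.mesh pattern comes from the answer.

```python
def mesh_hostname(service: str, namespace: str, port: int) -> str:
    """Build a hostname in the <service>_<namespace>_svc_<port>.mesh format."""
    return f"{service}_{namespace}_svc_{port}.mesh"


if __name__ == "__main__":
    # The echo-server example from the answer above.
    print(mesh_hostname("echo-server", "echo-example", 1010))
```

A VM-based service that registers under a name built the same way could later be swapped for a Kubernetes replica without its clients changing the URL they call.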
I am exploring Vertex AI pipelines and understand that it is a managed alternative to, say, AI Platform pipelines (where you have to deploy a GKE cluster to be able to run Kubeflow pipelines). What I am not clear on is whether Vertex AI will autoscale the cluster depending on the load. In the answer to a similar question, it is mentioned that for pipeline steps that use GCP resources such as Dataflow, autoscaling will be done automatically. In the Google docs, it is mentioned that for components one can set resources such as CPU_LIMIT, GPU_LIMIT, etc. My question is: can these limits be set for any type of component, i.e., Google Cloud pipeline components or custom components, whether Python function-based or packaged as a container image? Secondly, do these limits mean that a component's resources will autoscale until they hit those limits? And what happens if these options are not specified at all: how are the resources allocated then, and will they autoscale as Vertex AI sees fit?
Links to relevant docs and resources would be really helpful.
To answer your questions,
1. Can these limits be set for any type of components?
Yes, because these limits apply to all Kubeflow components and are not specific to any particular type of component.
These components could be implemented to perform tasks with a set amount of resources.
2. Do these limits mean that the component resources will autoscale till they hit the limits?
No, there is no autoscaling performed by Vertex AI. Based on the limits set, Vertex AI chooses one suitable VM to perform the task.
Having a pool of workers is supported in Google Cloud Pipeline Components such as “CustomContainerTrainingJobRunOp” and “CustomPythonPackageTrainingJobRunOp” as part of Distributed Training in Vertex AI. Otherwise, only 1 machine is used per step.
3. What happens if these limits are not specified? Does Vertex AI scale the resources as it sees fit?
If the limits are not specified, an “e2-standard-4” VM is used for task execution as the default option.
EDIT: I have updated the links with the latest version of the documentation.
Let's consider a situation where multiple services rely on data that can change at any time and should be updated in each microservice at roughly the same time. For example, there is a list of supported languages or some common policies that could change one day and affect many services at once.
One solution I could think of is to have another microservice that holds that data, so any service that needs the current state can just ask for it. The drawback is that the data does not change very frequently, asking over HTTP is not that cheap, and there would be a lot of traffic to this, let's say, global registry service. Because the data does not change very often, many services could just cache it in order to not ask for it every time, but then they would not be able to respond quickly enough when the configuration does change.
The other solution could be to externalize such configuration. In AWS, for example, there could be a configuration file on S3 that is available to the other services. The drawback here is that there is no way (as far as I know) to track changes in such a file, and there is no way to add logic to verify that a changed value is correct (that there are no typos, and so on).
So my question is: how do you handle global configuration/registry in the microservice world so that there is little HTTP overhead, changes can be audited, and a change can be introduced in many services at the same time?
I would prefer option 1. Apart from the HTTP overhead, though, it can also leave your system in an inconsistent state: service 1 might be working with the new values while service 2 is still on the old ones.
Since this is a distributed system that we are talking about, I am willing to take a risk with availability.
Have a configuration service that allows you to plan your config changes. Instead of saying "change the value of A from x to y", you say "change A from x to y at time t". This t allows you to propagate changes consistently across your whole system. You need to put in effort to work out what the minimum value of t should be for your set of services, how you will make all services acknowledge the change and apply it at the right time, and how you will handle new services that come up in between.
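A minimal sketch of the "change from x to y at time t" idea, assuming every service has a reasonably synchronized clock and receives the planned change before t. The class and method names (ScheduledConfig, schedule, value_at) are hypothetical and not taken from any particular library.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ScheduledConfig:
    """A config value plus planned changes that become active at a given time."""
    initial: str
    # (effective_at, new_value) pairs, kept sorted by effective_at.
    changes: List[Tuple[float, str]] = field(default_factory=list)

    def schedule(self, effective_at: float, new_value: str) -> None:
        """Plan a change: the value becomes new_value at time effective_at."""
        self.changes.append((effective_at, new_value))
        self.changes.sort(key=lambda change: change[0])

    def value_at(self, now: float) -> str:
        """Return the value that is active at time `now`."""
        value = self.initial
        for effective_at, new_value in self.changes:
            if effective_at <= now:
                value = new_value
        return value


if __name__ == "__main__":
    a = ScheduledConfig(initial="x")
    a.schedule(effective_at=100.0, new_value="y")
    print(a.value_at(99.0), a.value_at(100.0))
```

Because each service evaluates the same schedule against its own clock, all of them switch from x to y at (roughly) the same moment, no matter when each one fetched the plan.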
Another approach is to use Spring Cloud Config (or something similar). It asks each service to register with the centralised config service, and a refresh call is made to all the services to update their config. The limitations are that not all configs can be refreshed, and if you are behind a load balancer you still need a way to make sure all instances get updated.
Use a config server (Spring Cloud Config Server) that maintains the centralized configuration. Configuration changes are made on the config server. Each microservice fetches its configuration from the config server on startup, and after startup it can poll the config server at a regular interval to check for changes and update accordingly.
There are a couple of ways to do it; a better way, especially in production, is to use the External Configuration Store pattern.
You can save the configuration in external stores like Azure Key Vault or Azure App Configuration.
Find more details about Azure Key Vault here:
Azure Key Vault
5-minute quickstarts for Azure Key Vault integration
If you absolutely must have a shared config, the best decoupled architecture I've encountered is as follows:
You have a standalone Config Service, completely hidden from the outside world, that can only be accessed by your microservices through an internal network.
ON STARTUP: Each microservice pulls what it needs from the Config Service and stores it in memory. If it is unable to pull from the Config Service, do not allow it to start. Have a retry mechanism on this front.
ON CHANGE of the Config Service: Publish an event to your messaging layer that will force services to update their respective configurations.
Caveats:
Do not put time-sensitive configurations here, since the communication is asynchronous (if you have time-critical configs, why are they shared in the first place? You might need to revisit that).
You need to handle your own plumbing: retry mechanism, memory management, etc.
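The startup and on-change steps above can be sketched roughly as follows. This is only an illustration under assumptions: fetch stands for whatever call reaches the Config Service, on_config_changed is the handler your messaging-layer subscriber would invoke, and all names here are hypothetical.

```python
import time
from typing import Callable, Dict, Optional


def load_config_or_fail(fetch: Callable[[], Dict[str, str]],
                        attempts: int = 3,
                        delay_seconds: float = 0.0) -> Dict[str, str]:
    """Pull the config, retrying a few times; raise if every attempt fails."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return dict(fetch())
        except ConnectionError as error:
            last_error = error
            time.sleep(delay_seconds)
    # Refusing to start is the "do not allow it to start" rule from above.
    raise RuntimeError("Config Service unreachable, refusing to start") from last_error


class ConfigHolder:
    """Keeps the config in memory and replaces it when a change event arrives."""

    def __init__(self, fetch: Callable[[], Dict[str, str]]) -> None:
        self._fetch = fetch
        self.values = load_config_or_fail(fetch)  # ON STARTUP

    def on_config_changed(self, _event: object = None) -> None:
        # ON CHANGE: called by the messaging-layer subscriber (not shown).
        self.values = load_config_or_fail(self._fetch)
```

Re-pulling on every event, rather than shipping the new values inside the event itself, keeps the event payload small and makes the Config Service the single source of truth.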
I have studied the concept of microservices for a good while now, and understand what they are and why they are necessary.
Quick refresher
In a nutshell, a monolithic application is decomposed into independently deployable units, each of which typically exposes its own web API and has its own database. Each service fulfills a single responsibility and does it well. These services communicate over synchronous web services such as REST or SOAP, or use asynchronous messaging such as JMS, to fulfill a request together. Our monolithic application has become a distributed system. Typically all these fine-grained APIs are made available through an API gateway or proxy, which acts as a single-point-of-entry facade, performing security and monitoring related tasks.
The main reasons to adopt microservices are high availability, zero-downtime updates, high performance achieved via horizontal scaling of a particular service, and looser coupling in the system, meaning easier maintenance. Also, IDE functionality and the build and deployment process will be significantly faster, and it's easier to change the framework or even the language.
Microservices go hand in hand with clustering and containerization technologies, such as Docker. Each microservice can be packaged as a Docker container to run on any platform. The principal concepts of clustering are service discovery, replication, load balancing and fault tolerance. Docker Swarm is a clustering tool that orchestrates these containerized services, glues them together, and handles all those tasks under the hood in a declarative manner, maintaining the desired state of the cluster.
It sounds easy and simple in theory, but I still don't understand how to implement this in practice, even though I know Docker Swarm pretty well. Let's look at a concrete example.
Here is the question
I'm building a simple Java application with Spring Boot, backed by a MySQL database. I want to build a system where the user gets a webpage from Service A and submits a form. Service A does some manipulation of the data and sends it to Service B, which further manipulates the data, writes to the database and returns something, and in the end a response is sent back to the user.
Now the problem is that Service A doesn't know where to find Service B, nor does Service B know where to find the database (because they could be deployed on any node in the cluster), so I don't know how I should configure the Spring Boot application. The first thing that comes to mind is to use DNS, but I can't find tutorials on how to set up such a system in Docker Swarm. What is the correct way to configure connection parameters in Spring for a distributed cloud deployment? I have looked into the Spring Cloud project, but don't understand whether it's the key to this dilemma.
I'm also confused about how databases should be deployed. Should they live in the cluster, deployed alongside the service (possibly with the aid of Docker Compose), or is it better to manage them in a more traditional way with fixed IPs?
The last question is about load balancing. I'm confused about whether there should be multiple load balancers, one for each service, or just a single master load balancer. Should the load balancer have a static IP mapped to a domain name, with all user requests targeting this load balancer? What if the load balancer fails; doesn't that make all the effort to scale the services pointless? Is it even necessary to set up a load balancer with Docker Swarm, as it has its own routing mesh? Which node should the end user target then?
If you're looking at using Docker Swarm, you don't need to worry about DNS configuration, as it's already handled by the overlay network. Let's say you have three services:
A
B
C
A is your DB, B might be the first service to collect data, and C receives that data and updates the database (A).
docker network create \
--driver overlay \
--subnet 10.9.9.0/24 \
youroverlaynetwork
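# Each service needs an image as its last argument; the your...image names are placeholders.
docker service create --network youroverlaynetwork --name A yourdbimage
docker service create --network youroverlaynetwork --name B yourimageforb
docker service create --network youroverlaynetwork --name C yourimageforc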
Once all the services are created, they can refer to each other directly by name.
These requests are load balanced across all replicas of the container on that overlay network. So A can always get an IP for B by referencing "http://b" or just by using the hostname B.
When you're dealing with load balancing in Docker, a swarm service is already load balanced internally. Once you've defined a service to listen on port 8018, all swarm hosts will listen on port 8018 and mesh route that to a container in round-robin fashion.
It is still, however, best practice to have an application load balancer sit in front of the hosts in the event of host failure.
I have an adapter (written in Spring Boot and Spring Integration) retrieving currency rates from two different sources (via REST and a proprietary library). I filter out unnecessary things, create instances of a class known in my system and send the rates to a JMS cluster. I want this adapter to be replicated, but only one instance should be running at any one time. When one crashes (I know this from the health endpoint), another one should start publishing rates. How can I achieve this? I know that available services can be registered using Eureka, but how do I turn one of them on automatically?
The solution to the problem is to use spring-cloud-cluster. One can use either ZooKeeper or Hazelcast to negotiate leadership. Of the several instances, only one is given the leader role. If it crashes, another one takes over its role (it is informed via event propagation). You can also use the yieldLeadership method to relinquish leadership manually (if a health indicator says something is wrong with the application).
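To make the behaviour concrete, here is a small in-memory Python sketch of "only the leader publishes, and another instance takes over when the leader leaves". It is not the spring-cloud-cluster API and does not talk to ZooKeeper or Hazelcast; the LeaderElection class and should_publish helper are hypothetical stand-ins for what such a coordinator provides.

```python
import threading
from typing import List, Optional


class LeaderElection:
    """In-memory stand-in for a coordinator: the longest-standing candidate leads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._candidates: List[str] = []  # instance ids, in join order

    def join(self, instance_id: str) -> None:
        with self._lock:
            if instance_id not in self._candidates:
                self._candidates.append(instance_id)

    def leave(self, instance_id: str) -> None:
        """Remove an instance, e.g. after a crash or a voluntary yield."""
        with self._lock:
            if instance_id in self._candidates:
                self._candidates.remove(instance_id)

    def leader(self) -> Optional[str]:
        with self._lock:
            return self._candidates[0] if self._candidates else None


def should_publish(election: LeaderElection, instance_id: str) -> bool:
    """An adapter instance publishes rates only while it holds leadership."""
    return election.leader() == instance_id
```

In a real deployment the candidate list would live in the external coordinator, and a crashed instance would be removed when its session expires rather than through an explicit leave call.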
Without knowing more details it is hard to give you a recommendation.
I'd personally say Eureka is not built for what you are trying to achieve. It sounds more like you want to have a look at ZooKeeper. Also see the Eureka FAQ for reference. ZooKeeper was built for exactly what you are trying to achieve: leader election.
On the other hand, if you can live with the service being down for a few seconds, I'd suggest you either use the script that already monitors the /health endpoint to restart the service, or use a system that has this built in, like systemd or Docker, where you can define restart policies.
I have the same application running on two WAS clusters. Each cluster has 3 application servers based in different datacenters. In front of each cluster are 3 IHS servers.
Can I specify a primary cluster, and a failover cluster within the plugin-cfg.xml? Currently I have both clusters defined within the plugin, but I'm only hitting 1 cluster for every request. The second cluster is completely ignored.
Thanks!
As noted already, the WAS HTTP server plugin doesn't provide the function you're seeking, as documented in the WAS KC http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rwsv_plugincfg.html?lang=en
assuming that by "failover cluster" what is actually meant is "BackupServers" in the plugin-cfg.xml
The ODR alternative mentioned previously likely isn't an option either, because the ODR isn't supported for use in the DMZ (it has not been security hardened for DMZ deployment) http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/twve_odoecreateodr.html?lang=en
From an effective HA/DR perspective, what you're seeking to accomplish should be handled at the network layer, using the global load balancer (global site selector, global traffic manager, etc.) that routes traffic into the data centers. This is usually accomplished by setting a "site cookie" using the load balancer.
This is by design. IHS, at least at the 8.5.5 level, does not allow for what you are trying to do. You will have to implement that level of high availability at a higher level in your topology.
There are a few options.
If the environment is relatively static, you could post-process the plugin-cfg.xml files and combine them into a single ServerCluster, with the "dc2" servers listed under <BackupServers> in the cluster. The "dc1" servers are probably already listed under <PrimaryServers>.
BackupServers are only used when no PrimaryServers are reachable.
Another option is to use the Java On-Demand Router, which has first-class awareness of applications running in two cells. Rules can be written that dictate the behavior of applications residing in two clusters (load balancer, failover, etc.). I believe these are "ODR Routing Rules".
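The rule from the first option, that BackupServers are only used when no PrimaryServers are reachable, can be sketched in a few lines of Python. This only illustrates the rule as stated in the answer, not the plugin's actual implementation; the function name, the is_reachable callback and the server names are hypothetical.

```python
from typing import Callable, List


def pick_servers(primary: List[str],
                 backup: List[str],
                 is_reachable: Callable[[str], bool]) -> List[str]:
    """Return the servers eligible for routing under the primary/backup rule."""
    reachable_primary = [server for server in primary if is_reachable(server)]
    if reachable_primary:
        # At least one primary is up, so backups are ignored entirely.
        return reachable_primary
    # No primary is reachable: fall back to the reachable backups.
    return [server for server in backup if is_reachable(server)]


if __name__ == "__main__":
    dc1 = ["dc1-a", "dc1-b", "dc1-c"]
    dc2 = ["dc2-a", "dc2-b", "dc2-c"]
    down = {"dc1-a", "dc1-b", "dc1-c"}  # simulate the whole dc1 cluster being down
    print(pick_servers(dc1, dc2, lambda server: server not in down))
```

Note that under this rule the dc2 servers receive no traffic as long as a single dc1 server stays reachable, which matches a primary/failover setup rather than load sharing between the two clusters.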