I'm interested in knowing if I can use Consul to solve the following issues:
1) Can Consul be used to load balance microservices? For instance, if I put Consul on the server that hosts my API gateway, can it monitor all the microservices it has discovered and load balance between them if I have two instances of the same microservice?
2) Can Consul be used at the microservice level to spin up instances as needed? Essentially, I'd like to not use IIS and find an alternative.
3) If Consul sees a microservice as offline for whatever reason, can it attempt to start it up again? Or force a microservice to shut down for whatever reason?
If Consul can't solve these issues, are there other alternatives?
Thank you.
Consul DNS can provide a simple way for you to load balance services. It's especially powerful if you combine it with Consul Prepared Queries and health checks.
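For illustration, here is a minimal sketch (the service name "api" and the local agent address are placeholders): a prepared query that only returns healthy instances, resolved through Consul DNS so each lookup spreads traffic across the passing instances.

# Create a prepared query named "api" that only returns passing instances
curl -X POST http://127.0.0.1:8500/v1/query -d '{
  "Name": "api",
  "Service": {
    "Service": "api",
    "OnlyPassing": true
  }
}'

# Resolve the prepared query via Consul DNS (default DNS port 8600)
dig @127.0.0.1 -p 8600 api.query.consul SRV

# Plain service lookups work too, without a prepared query
dig @127.0.0.1 -p 8600 api.service.consul SRV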
Consul is best suited for monitoring services (via health checks), but you can use consul watch to trigger events if a service suddenly becomes unavailable.
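As a rough sketch (the handler script is hypothetical), a watch that fires whenever any check goes critical could look like this:

# Invoke a handler whenever a check enters the critical state;
# the handler receives the check data as JSON on stdin
consul watch -type=checks -state=critical /usr/local/bin/alert-on-critical.sh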
Hashicorp (the company behind Consul) offers another tool called Nomad.
Unlike Consul, Nomad is designed to run services (called jobs) and restart them if necessary.
Nomad works best if you tell it where to find Consul. This enables automatic service registration for any task Nomad launches, including deregistering it if you instruct Nomad to stop running that task. Health checks are supported as well.
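As a minimal sketch (the image, port label, and health-check path are placeholders, and the exact stanza layout varies between Nomad versions), a job that Nomad runs and registers in Consul might look like this:

cat > api.nomad <<'EOF'
job "api" {
  datacenters = ["dc1"]

  group "api" {
    count = 2

    task "api" {
      driver = "docker"

      config {
        image = "myorg/api:latest"
      }

      resources {
        network {
          port "http" {}
        }
      }

      # Nomad registers this service and its check in Consul,
      # and deregisters it when the task is stopped
      service {
        name = "api"
        port = "http"

        check {
          type     = "http"
          path     = "/health"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
EOF

nomad job run api.nomad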
I don't understand the difference between Consul's agent API and catalog API
Although the Consul documentation has always emphasized that the agent and the catalog should not be confused, there are indeed many endpoints that look similar, such as:
/catalog/services
/agent/services
When should I use the catalog and when the agent (as with the HTTP URLs above)?
Which one is suitable for high-frequency calls?
Consul is designed for services to be registered against a Consul client agent which is running on the same host where a service is deployed. The /v1/agent/service/ endpoints provide a way for you to interact with services which are registered with the specific Consul agent to which you are communicating, and register new services against that agent.
Each Consul agent in the data center submits its registered service information to the Consul servers. The servers aggregate this information to form the service catalog (https://www.consul.io/docs/architecture/anti-entropy#catalog). The /v1/catalog/ endpoints return that aggregated information.
I want to call out this sentence from the anti-entropy doc.
Consul treats the state of the agent as authoritative; if there are any differences between the agent and catalog view, the agent-local view will always be used.
The catalog APIs can be used to register or remove services/nodes from the catalog, but normally these operations should be performed against the client agents (using the /v1/agent/ APIs) since they are authoritative for data in Consul.
The /v1/agent/ APIs should be used for high frequency calls, and should be issued against the local Consul client agent running on the same node as the app, as opposed to communicating directly with the servers.
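As a rough illustration (the service name and port are made up): writes go through the local agent, while cluster-wide reads can use the catalog.

# Register a service (with a check) against the local client agent;
# this is the authoritative path for writes
curl -X PUT http://127.0.0.1:8500/v1/agent/service/register -d '{
  "Name": "web",
  "Port": 8080,
  "Check": {
    "HTTP": "http://127.0.0.1:8080/health",
    "Interval": "10s"
  }
}'

# List only the services registered with this local agent
curl http://127.0.0.1:8500/v1/agent/services

# List all services known to the data center (aggregated by the servers)
curl http://127.0.0.1:8500/v1/catalog/services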
We're planning to develop some microservices based on the Play framework. They will provide REST APIs, and many of them will be using Akka Cluster/Cluster Sharding under the hood.
We would like to have an API gateway that exposes the APIs of our internal services, but we're facing one big issue:
- Multiple instances of each service will be running under some IP and port.
- How will the API gateway know where the service instances are running?
- Is there maybe something load-balancer-like for Play that keeps track of all running services?
Which solution(s) could possibly fill the spot for the "API Gateway"/"Load Balancer"?
The question you're asking is not really related to the Play framework, and there is no single answer that would solve what you need.
You could start by reading about Akka Service Discovery and then make your choice based on what fits you best.
We're building services with akka-http and akka-cluster, but we use unrelated technologies to expose and run the services.
Check out:
Kong for the API gateway
Consul for DNS-based service discovery
Docker Swarm for running containers, with its mesh network for load balancing
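For the Docker Swarm part, a minimal sketch (the image name and port are placeholders); the routing mesh spreads incoming requests across the replicas on any node:

# Create a replicated service; Swarm's routing mesh load balances
# requests on port 8080 across the three replicas
docker service create --name my-api --replicas 3 --publish 8080:8080 myorg/my-api:latest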
You are looking for the following components:
Service Registry : The whole point of this component is to keep track of "what services are running at what addresses". This can be as simple as a database that keeps entries for all the running services and their instances. Generally the orchestration service is responsible for registering new service instances with the Service Registry. Another choice is to have the instances themselves notify the service registry of their existence.
Service Health Checker : This component is mostly responsible for doing periodic runtime checks on the registered service instances and telling the service registry if any of them are not working. The service registry implementation can then mark these instances as "inactive" until the Service Health Checker finds them working again (if ever).
Service Resolution : This is the conceptual component responsible for enabling a client to somehow get to the running service instances.
Together, the above components are called Service Discovery.
In your case, you have load balancers which can act as a form of service discovery.
I don't think load balancers are going to change much over time unless you require a very advanced architecture, so your API gateway can simply "know" the URLs of the load balancers for all your services. So you don't really need a service registry layer.
Now, your load balancers inherently provide a health-check and quarantine mechanism for instances. So you don't need an extra health-check layer.
So the only piece missing is to register your instances with the load balancer. This part you will have to figure out based on what your load balancers are and what ecosystem they live in.
If you live in the AWS ecosystem and your load balancers are ELBs, then you should have things sorted out in that respect.
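For example, with an Application Load Balancer the "register your instances" step is just adding them to the target group (a sketch; the target group ARN and instance ID are placeholders):

# Register an EC2 instance with the load balancer's target group;
# the target group's health checks then cover the health-checker role
aws elbv2 register-targets \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-api/0123456789abcdef \
  --targets Id=i-0123456789abcdef0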
Based on Ivan's and Sarvesh's answers we did some research and discovered the Netflix OSS projects.
Eureka can be used as a service locator that integrates well with the Zuul API gateway. Sadly there's not much documentation on the configuration, so we looked further...
We've now finally chosen Kubernetes as our orchestrator.
Kubernetes knows about all running containers, so there's no need for an external service locator like Eureka (a minimal example is sketched after this list).
Traefik is an API gateway that uses the Kubernetes API to discover all running microservice instances and does the load balancing.
Akka Management finds all nodes via the Kubernetes API and bootstraps the cluster for us.
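A minimal sketch of the Kubernetes side (the image and ports are placeholders): the Deployment runs the replicas, and the Service gives them a stable endpoint that Traefik and Akka Management can discover through the Kubernetes API.

# Run the microservice as a Deployment with three replicas
kubectl create deployment my-service --image=myorg/my-service:latest
kubectl scale deployment my-service --replicas=3

# Expose it as a Kubernetes Service; this is what gets discovered
# via the Kubernetes API and load balanced across the pods
kubectl expose deployment my-service --port=80 --target-port=9000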
This is a microservices deployment question. How would you deploy the Envoy SDS (service discovery service) so that other Envoy proxies can find the SDS server hosts and, in turn, discover other services to build the service mesh? Should I put it behind a load balancer with a DNS name (a single point of failure), or just run SDS locally on every machine so other microservices can access it? Or is there a better deployment approach where the SDS cluster can be added to the Envoy config dynamically, without a single point of failure?
Putting it behind a DNS name with a load balancer across multiple SDS servers is a good setup for reasonable availability. If SDS is down, Envoy will simply not get updated, so it's generally not the most critical failure: new hosts and services simply won't get added to the cluster/endpoint model in Envoy.
If you want higher availability, you can set up multiple clusters: if you add multiple entries to your bootstrap config, Envoy will fail over between them. You can specify either multiple DNS names or multiple IPs.
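As a rough sketch for the legacy v1 config format (host names are placeholders, and the field names differ in newer xDS-based versions), the SDS cluster in the bootstrap can simply list several hosts:

# Fragment of an Envoy v1-style bootstrap: the SDS cluster lists two
# hosts, and Envoy fails over between them if one is unreachable
cat > envoy-sds-fragment.json <<'EOF'
{
  "cluster_manager": {
    "sds": {
      "cluster": {
        "name": "sds",
        "connect_timeout_ms": 250,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "hosts": [
          { "url": "tcp://sds-1.internal.example.com:8080" },
          { "url": "tcp://sds-2.internal.example.com:8080" }
        ]
      },
      "refresh_delay_ms": 30000
    }
  }
}
EOF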
(My original answer below, written after misunderstanding the question, is kept for posterity.)
You can start with a static config or DNS, but you'll probably want a full integration with your service discovery. Check out Service Discovery Integration on LearnEnvoy.io.
I have Consul v0.5.2 running, and services running in Mesos. Services keep moving from one server to another.
Is there a way to deregister services in Consul that are in a 'failing' state? I am able to get the list of failing services using this curl:
curl http://localhost:8500/v1/health/state/critical
The issue we are seeing is that, over time, the Consul UI fills with stale data, making the whole UI unusable.
By default, Consul does not deregister unhealthy services; instead it marks them as critical.
Since Consul 0.7 there is a special option (deregister_critical_service_after) that lets you define how long a check can stay critical before the service is deregistered (see the example after the changelog excerpt below).
From the Consul 0.7 changelog:
Automatic Service Deregistration: Added a new deregister_critical_service_after timeout field for health checks which will cause the service associated with that check to get deregistered if the check is critical for longer than the timeout. This is useful for cleanup of health checks registered natively by applications, or in other situations where services may not always be cleanly shutdown. GH-679
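As a rough sketch (the service name, port, check URL, and timeout are placeholders), the option goes on the check inside the service definition:

# Service definition whose service is deregistered automatically if the
# check stays critical for longer than 30 minutes
cat > /etc/consul.d/web.json <<'EOF'
{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "http": "http://127.0.0.1:8080/health",
      "interval": "10s",
      "deregister_critical_service_after": "30m"
    }
  }
}
EOF

consul reload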
If you are using Marathon then you can consider using allegro/marathon-consul; it will deregister a task when it's dead.
Along with what janisz said, you can also run your services in Nomad and Nomad will automatically register and deregister your services for you. See the Nomad Service Discovery docs for additional details.
In my microservices system I plan to use docker swarm and Consul.
In order to ensure high availability of Consul I'm going to build a cluster of 3 server agents (along with a client agent per node), but this doesn't save me from a local Consul agent failure.
Am I missing something?
If not, how can I configure Swarm to be aware of more than one Consul agent?
Consul is the only service discovery backend that doesn't support multiple endpoints when used with Swarm.
Both ZooKeeper and etcd support the etcd://10.0.0.4,10.0.0.5 format for providing multiple IPs for the "cluster" of discovery backends when using Swarm.
To answer your question of how you can configure Swarm to support more than one Consul server: I don't have a definitive answer, but I can point you in a direction with something you can test (no guarantees).
One suggestion worth testing (which is not recommended for production) is to use a load balancer that can pass requests from the Swarm manager to one of the three Consul servers.
So when starting the Swarm managers you can point them to consul://ip_of_loadbalancer:port.
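A rough sketch of that setup with the legacy standalone Swarm (the load balancer hostname, ports, and node IP are placeholders; untested, as noted below):

# Swarm manager discovers nodes through Consul, reached via the load balancer
docker run -d swarm manage -H tcp://0.0.0.0:4000 consul://consul-lb.internal.example.com:8500/swarm

# Each node registers itself through the same load-balanced address
docker run -d swarm join --advertise=10.0.0.10:2375 consul://consul-lb.internal.example.com:8500/swarm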
This will, however, make the LB a single point of failure (if it goes down).
I have not tested the above and can't say whether it will work or not; it is merely a suggestion.