Consul deregister 'failing' services - microservices

I have Consul v0.5.2 running and services running in Mesos. Services keep moving from one server to another.
Is there a way to deregister services in Consul that are in a 'failing' state? I am able to get the list of failing services with this curl command:
curl http://localhost:8500/v1/health/state/critical
The issue we are seeing is that, over time, stale data accumulates in the Consul UI, making the whole UI unusable.

By default, Consul does not deregister unhealthy services; instead it marks them as critical.
Since Consul 0.7 there is a dedicated option (deregister_critical_service_after) that lets you define how long a check may stay critical before the associated service is deregistered.
From the Consul 0.7 changelog:
Automatic Service Deregistration: Added a new deregister_critical_service_after timeout field for health checks which will cause the service associated with that check to get deregistered if the check is critical for longer than the timeout. This is useful for cleanup of health checks registered natively by applications, or in other situations where services may not always be cleanly shutdown. GH-679
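For illustration, a registration through the local agent's HTTP API could look roughly like this (the "web" service, its port, and the /health endpoint are made-up placeholders):
# register a service whose check auto-deregisters after being critical for 90m (Consul >= 0.7)
curl -X PUT http://localhost:8500/v1/agent/service/register \
  -d '{"Name": "web", "Port": 8080,
       "Check": {"HTTP": "http://localhost:8080/health",
                 "Interval": "10s",
                 "DeregisterCriticalServiceAfter": "90m"}}'
With this check definition, an instance that stays critical for longer than 90 minutes is removed from the catalog automatically.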
If you are using Marathon, then you can consider using allegro/marathon-consul; it will deregister a task when it dies.

Along with what janisz said, you can also run your services in Nomad and Nomad will automatically register and deregister your services for you. See the Nomad Service Discovery docs for additional details.

Related

Consul Agent Service Registrations on other nodes are not fetchable from Rest API but is showing on UI

We have a Consul cluster of 3 servers and register agent services on any of them via the REST API.
In the UI, registrations made on one server are visible on the other servers as well. For example, a registration on server A is visible in server B's UI (accessible at http://serverb:8500/).
However, when hitting server B via the REST API, it only shows its own registrations and does not show server A's registrations.
The servers are started as follows:
Server A
consul agent -server -ui -bootstrap-expect=1 -node=ServerA -data-dir=D:\data -bind=11.223.15.78 -client=0.0.0.0 -retry-join=11.223.15.79 -retry-join=11.223.15.80
Server B
consul agent -server -ui -bootstrap-expect=1 -node=ServerB -data-dir=D:\data -bind=11.223.15.79 -client=0.0.0.0 -retry-join=11.223.15.78 -retry-join=11.223.15.80
Server C
consul agent -server -ui -bootstrap-expect=1 -node=ServerC -data-dir=D:\data -bind=11.223.15.80 -client=0.0.0.0 -retry-join=11.223.15.78 -retry-join=11.223.15.79
Is this an issue or am I doing something wrong?
The visibility of services will depend on which API endpoint you're using, and where you're registering your services. Consul intends for services to be registered against a Consul client agent which is running on the same host as the deployed service. The services registered with each agent in the data center are aggregated to form the service catalog (https://www.consul.io/docs/architecture/anti-entropy#catalog).
The /catalog/services endpoint returns an aggregated list of services registered with each agent across the data center. The /agent/services endpoint will only return services registered against the specific local agent with which you are communicating.
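As a quick illustration of the difference (using server B from the question; the returned service lists will of course vary):
# aggregated view: every service registered anywhere in the datacenter
curl http://serverb:8500/v1/catalog/services
# local view: only the services registered with this particular agent
curl http://serverb:8500/v1/agent/services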
If you want clients to be able to register services against any server, you'll want to register them using the /catalog/register endpoint. You can optionally use a tool like Consul External Service Monitor (consul-esm) to provide health checking for services independently of the Consul servers. See https://www.hashicorp.com/blog/consul-and-external-services for more information.
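For reference, a catalog registration could look roughly like this (the node name, address, and service details below are placeholders):
# sketch: register an external service directly in the catalog (placeholder values)
curl -X PUT http://serverb:8500/v1/catalog/register \
  -d '{"Node": "ext-node-1", "Address": "10.0.0.12",
       "Service": {"ID": "web-1", "Service": "web", "Port": 8080}}'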
If a service has been registered via the agent API on only one Consul node of a cluster, you can still query the service by its service name through the catalog API from all server nodes:
/v1/catalog/service/:service_name
See https://www.consul.io/api-docs/catalog#list-nodes-for-service
Note that you need to deregister a service via the agent API on the same Consul node where you registered it via the agent API in the first place. If you just deregister it from the catalog, it will be back after a few minutes (that is at least my experience, presumably because of the anti-entropy sync mentioned above).
The Consul documentation recommends using the agent API for registration, so I would still stick to registering via the agent API, although it makes deregistering a bit tricky.
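To make the workflow concrete, here is a sketch (the server names come from the question above; the service name "web" and ID "web-1" are placeholders): the catalog query can go to any server, but the deregister call must go to the agent that originally registered the instance:
# query the aggregated catalog from any server
curl http://serverb:8500/v1/catalog/service/web
# deregister on the agent where the instance was originally registered
curl -X PUT http://servera:8500/v1/agent/service/deregister/web-1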

Microservices in Consul

I'm interested in knowing if I can use Consul to solve the following issues:
1) Can Consul be used to load balance microservices? For instance, if I put Consul on the server that hosts my API gateway, can it be used to monitor all microservices it has discovered and load balance if I have two instances of the same microservice?
2) Can Consul be used at the microservice level to spin up instances as needed? Essentially, I'd like to not use IIS and find an alternative.
3) If for whatever reason Consul monitors a microservice as offline, can it attempt to start it up again? Or force a shut down of a microservice for whatever reason?
If Consul can't solve these issues, are there other alternatives?
Thank you.
Consul DNS can provide a simple way for you to load balance services. It's especially powerful if you combine it with Consul Prepared Queries and health checks.
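As a rough illustration (the service name "web" is a placeholder), a lookup against the agent's DNS interface, which listens on port 8600 by default, returns only healthy instances, and the rotating answers give you simple round-robin load balancing:
# ask Consul DNS (default port 8600) for healthy instances of the "web" service
dig @127.0.0.1 -p 8600 web.service.consul SRV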
Consul is best suited for monitoring services (via health checks) but you can use consul watch to trigger events if a service suddenly becomes unavailable.
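For example, something along these lines (the service name and handler script are hypothetical) runs a script whenever the health information for a watched service changes:
# invoke a handler whenever the watched service's health entries change
consul watch -type=service -service=web /usr/local/bin/notify.sh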
Hashicorp (the company behind Consul) offers another tool called Nomad.
Unlike Consul, Nomad is designed to run services (called jobs) and restart them if necessary.
Nomad works best if you tell it where to find Consul. This enables automatic service registration for any task Nomad launches, including deregistering it if you instruct Nomad to stop running that task. Health checks are supported as well.

Remove dead services from Consul

We have a number of Spring Boot applications that register themselves with Consul (via Spring Cloud Consul). If I stop those applications via docker-compose stop myservice then they de-register themselves correctly and disappear from Consul.
If I use docker-compose kill myservice then the deregistration doesn't happen. I understand that on a UNIX system it's impossible to catch the SIGKILL event, so there's no way to force the de-registration.
What we're therefore seeing is services in Consul that are never removed (marked as critical but still visible in the UI). Is there a way to force Consul to refresh what's registered, so that the dead services are removed?
Thanks
Nick
It seems that you have to use the Consul HTTP API and manually deregister unavailable services. The API gives you two different ways to deregister a service: the first one via the agent endpoint, like so
curl -v -X PUT http://%CONSUL_IP%:8500/v1/agent/service/deregister/<ServiceID>
and the second via the catalog. Unfortunately, in both cases you have to make the HTTP request manually.
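The catalog variant is a bit more involved, since you also have to name the node (and optionally the datacenter) that owns the registration; roughly like this (all values below are placeholders):
# deregister a service instance directly from the catalog (placeholder values)
curl -X PUT http://localhost:8500/v1/catalog/deregister \
  -d '{"Datacenter": "dc1", "Node": "node-1", "ServiceID": "web-1"}'
Keep in mind that if the owning agent is still alive, its anti-entropy sync can re-add a catalog-only deregistration, so the agent endpoint is usually the safer choice.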

HA for the local Consul agent with Docker-Swarm

In my microservices system I plan to use docker swarm and Consul.
In order to ensure the high availability of Consul, I'm going to build a cluster of 3 server agents (along with a client agent per node), but this doesn't protect me against the failure of a local Consul agent.
Am I missing something?
If not, how can I configure Swarm to be aware of more than one Consul agent?
Consul is the only service discovery backend that doesn't support multiple endpoints when used with Swarm.
Both ZooKeeper and etcd support the etcd://10.0.0.4,10.0.0.5 format for providing multiple IPs for the "cluster" of discovery backends when using Swarm.
To answer your question of how you can configure Swarm to support more than one Consul server: I don't have a definitive answer, but I can point you in a direction with something you can test (no guarantees):
One suggestion worth testing (though not recommended for production) is to use a load balancer that passes requests from the Swarm manager to one of the three Consul servers.
So when starting the Swarm managers you can point to consul://ip_of_loadbalancer:port
This will, however, turn the LB into a single point of failure (if it goes down, Swarm loses its discovery backend).
I have not tested the above and can't say whether it will work or not; it is merely a suggestion.
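If you do want to experiment with that idea, the legacy standalone Swarm manager accepts a single discovery URL, so an untested sketch would simply point it at a load balancer sitting in front of the three Consul servers (the LB address and KV prefix below are placeholders):
# point the standalone Swarm manager at a load-balanced Consul endpoint (untested sketch)
docker run -d -p 4000:4000 swarm manage -H :4000 consul://consul-lb.example.com:8500/swarm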

consul: how many agents for services

I am playing a little with Docker and Consul and I have a couple of questions regarding agent-service mapping, especially in a Docker environment. Assume I have a service named "myGreatService", a simple Node.js hello-world web application packaged in a Docker image named "myGreatServiceImage". From the Consul docs I understood that when you register a service (through HTTP or a service definition file), the service is "wired" to an agent/Consul node (the wired node can be retrieved via /v1/catalog/service/). So if a Consul node is down (or a node health check decides it is down), then all services "wired" to that Consul node will automatically be marked as down. Am I right?
If I run the myGreatServiceImage image multiple times on a single host via Docker (resulting in multiple instances of the "myGreatService" service),
how many agents should I run?
A single one per host, managing all containers (all service instances) on that host? Or maybe a separate agent for each container (service instance)?
If a health check for a service fails, then the service will be marked as down and won't show up if you do a DNS query for that service:
dig @localhost -p 8600 apache.service.consul
If you do a call to the API, you will see that the service is still listed. This is because the service is not removed; it is just marked as down. If you do an API call to check the health of that service, it will be shown as down.
curl localhost:8500/v1/catalog/service/apache
curl localhost:8500/v1/health/service/apache
You can add the ?passing flag to that last call to receive only the healthy instances (just like the DNS query):
curl localhost:8500/v1/health/service/apache?passing
If the Consul agent on the host fails, then all services running on that host won't show up when you query Consul for services (either via a DNS query or via the API).
As for the number of agents you should be running: run one Consul agent per host. Let your services register themselves via the API of the local Consul agent (or preconfigure all your services in the config files, but I recommend making this a dynamic process of self-registration).
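As a sketch of that self-registration step (the port and health endpoint are assumptions; the service name comes from the question), each container would register its own instance with the agent on its host, using a unique ID per instance:
# each instance registers itself with the local agent under a unique ID
curl -X PUT http://localhost:8500/v1/agent/service/register \
  -d '{"ID": "myGreatService-1", "Name": "myGreatService", "Port": 3000,
       "Check": {"HTTP": "http://localhost:3000/", "Interval": "10s"}}'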

Resources