consul deregister_critical_service_after is not working - consul

Hello everyone. I have a health check on my Consul service. My goal is that whenever the service becomes unhealthy, Consul removes it from the service catalog.
Below is my config:
{
"service": {
"name": "api",
"tags": [ "api-tag" ],
"port": 80
},
"check": {
"id": "api_up",
"name": "Fetch health check from local nginx",
"http": "http://localhost/HealthCheck",
"interval": "5s",
"timeout": "1s",
"deregister_critical_service_after": "15s"
},
"data_dir": "/consul/data",
"retry_join": [
"192.168.0.1",
"192.168.0.2",
]
}
Thanks for all the help.

The reason the service is not being de-registered is that the check is being specified outside of the service {} block in your JSON. This makes the check a node-level check, not a service-level check.
Here's a pretty-printed version of the config you provided.
{
"service": {
"name": "api",
"tags": [
"api-tag"
],
"port": 80
},
"check": {
"id": "api_up",
"name": "Fetch health check from local nginx",
"http": "http://localhost/HealthCheck",
"interval": "5s",
"timeout": "1s",
"deregister_critical_service_after": "15s"
},
"data_dir": "/consul/data",
"retry_join": [
"192.168.0.1",
"192.168.0.2",
]
}
Below is the configuration you should use to correctly associate the check with the configured service, so that the service is de-registered once the check has been critical for longer than the configured timeout (15 seconds here, although the documented minimum quoted below is 1 minute).
{
"service": {
"name": "api",
"tags": [
"api-tag"
],
"port": 80,
"check": {
"id": "api_up",
"name": "Fetch health check from local nginx",
"http": "http://localhost/HealthCheck",
"interval": "5s",
"timeout": "1s",
"deregister_critical_service_after": "15s"
}
},
"data_dir": "/consul/data",
"retry_join": [
"192.168.0.1",
"192.168.0.2"
]
}
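If you want to catch this kind of mistake mechanically, here is a minimal Python sketch. The function name node_level_checks is made up for illustration, and it assumes that a top-level check without a service_id field is a node-level check, which matches the explanation above.

```python
def node_level_checks(config: dict) -> list:
    """Return the IDs of top-level checks that set
    deregister_critical_service_after but are not tied to a service.

    Assumption: a top-level check with no "service_id" is node-level.
    """
    top_level = []
    if "check" in config:
        top_level.append(config["check"])
    top_level.extend(config.get("checks", []))

    flagged = []
    for check in top_level:
        if "deregister_critical_service_after" in check and "service_id" not in check:
            flagged.append(check.get("id", check.get("name", "<unnamed>")))
    return flagged


# The check definition used in both layouts from this thread.
check = {
    "id": "api_up",
    "http": "http://localhost/HealthCheck",
    "interval": "5s",
    "timeout": "1s",
    "deregister_critical_service_after": "15s",
}

# Original layout: check sits next to the service block.
broken = {"service": {"name": "api", "port": 80}, "check": check}
# Corrected layout: check sits inside the service block.
fixed = {"service": {"name": "api", "port": 80, "check": check}}

print(node_level_checks(broken))
print(node_level_checks(fixed))
```

The original layout is flagged, while the corrected layout produces an empty list because the check lives inside the service block.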
Note this statement from the docs for DeregisterCriticalServiceAfter:
If a check is in the critical state for more than this configured value, then its associated service (and all of its associated checks) will automatically be deregistered. The minimum timeout is 1 minute, and the process that reaps critical services runs every 30 seconds, so it may take slightly longer than the configured timeout to trigger the deregistration. This should generally be configured with a timeout that's much, much longer than any expected recoverable outage for the given service.
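To put rough numbers on that statement, here is a small Python sketch of the timing. It assumes that a configured value below the 1 minute minimum is simply raised to the minimum (my reading of the quoted text, which does not say so outright) and that the reaper can add up to one 30 second cycle. The names are invented for illustration.

```python
MIN_TIMEOUT_S = 60    # "The minimum timeout is 1 minute"
REAP_INTERVAL_S = 30  # "the process that reaps critical services runs every 30 seconds"


def parse_seconds(value: str) -> int:
    """Parse simple durations such as "15s", "2m" or "1h" into seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]


def deregistration_window(configured: str) -> tuple:
    """Return (earliest, latest) seconds in the critical state before
    the service is expected to be deregistered.

    Assumption: values below the minimum are raised to the minimum.
    """
    effective = max(parse_seconds(configured), MIN_TIMEOUT_S)
    return effective, effective + REAP_INTERVAL_S


print(deregistration_window("15s"))
print(deregistration_window("10m"))
```

Under these assumptions, a configured value of 15s behaves like roughly 60 to 90 seconds in practice.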

Related

passing more information to consul watch handler

I am wondering whether a consul watch handler can be passed some dynamic information when it is called.
That is, can the watch mechanism pass the script additional arguments beyond the ones I supply, as in the example below?
{
"watches": [
{
"type": "service",
"args": ["/tmp/dosomething.sh", "how can i get responses from /v1/health/service here"]
}
]
}
By the way, when I 'watch' a service, the most important information to me is the service's state (passing or critical), but I don't understand:
when the watch type is 'service', why can't I specify the service?
when the watch type is 'checks', why can't I specify the state and the service at the same time?
consul watch passes the entire API response payload to the watch handler script as JSON on standard input, not as a command-line argument. Your script needs to read and parse that JSON, and then act on the data provided.
When you watch a service, the data returned is from the /v1/health/service/:service endpoint. (See consul/api/watch/funcs.go.)
when the watch type is 'service', why can't I specify the service?
I assume you mean that you would like to watch a specific service. If so, this is supported. You can specify a specific service to watch using the -service flag. For example, consul watch -type=service -service=assets.
when the watch type is 'checks', why can't I specify the state and the service at the same time?
If you're interested in monitoring checks for a particular service, you should just use the aforementioned watch command for a specific service. The service check information is included in the API response.
$ consul watch -type=service -service=assets
[
{
"Node": {
"ID": "f013522f-aaa2-8fc6-c8ac-c84cb8a56405",
"Node": "hashicorp-consul-server-2",
"Address": "10.0.0.82",
"Datacenter": "dc2",
"TaggedAddresses": null,
"Meta": null,
"CreateIndex": 22898191,
"ModifyIndex": 22898191
},
"Service": {
"ID": "assets-v1",
"Service": "assets",
"Tags": [],
"Meta": null,
"Port": 9090,
"Address": "",
"Weights": {
"Passing": 1,
"Warning": 1
},
"EnableTagOverride": false,
"CreateIndex": 22898195,
"ModifyIndex": 22898195,
"Proxy": {
"MeshGateway": {},
"Expose": {}
},
"Connect": {}
},
"Checks": [
{
"Node": "hashicorp-consul-server-2",
"CheckID": "serfHealth",
"Name": "Serf Health Status",
"Status": "passing",
"Notes": "",
"Output": "Agent alive and reachable",
"ServiceID": "",
"ServiceName": "",
"ServiceTags": [],
"Type": "",
"Definition": {
"Interval": "0s",
"Timeout": "0s",
"DeregisterCriticalServiceAfter": "0s",
"HTTP": "",
"Header": null,
"Method": "",
"Body": "",
"TLSServerName": "",
"TLSSkipVerify": false,
"TCP": ""
},
"CreateIndex": 22898191,
"ModifyIndex": 22898191
}
]
}
]
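A handler could reduce a payload like the one above to the state of each instance. The following is a minimal Python sketch, not a definitive implementation: the function name summarize is made up, and treating an unknown status as critical is my own choice. The field names (Node, Service, Checks, Status) are taken from the output shown above.

```python
def summarize(entries: list) -> list:
    """Return (service_id, node, worst_status) for each watched instance."""
    rank = {"passing": 0, "warning": 1, "critical": 2}
    summary = []
    for entry in entries:
        statuses = [check["Status"] for check in entry.get("Checks", [])]
        # Unknown statuses are treated as critical (an assumption).
        worst = max(statuses, key=lambda s: rank.get(s, 2), default="passing")
        summary.append((entry["Service"]["ID"], entry["Node"]["Node"], worst))
    return summary


# Trimmed-down version of the payload shown above.
sample = [
    {
        "Node": {"Node": "hashicorp-consul-server-2"},
        "Service": {"ID": "assets-v1", "Service": "assets"},
        "Checks": [{"CheckID": "serfHealth", "Status": "passing"}],
    }
]

for service_id, node, status in summarize(sample):
    print(f"{service_id} on {node}: {status}")

# In a real handler the payload would come from the watch instead:
#   import json, sys
#   entries = json.load(sys.stdin) or []
```

Saved as a script and passed via the watch's args, a handler like this would parse the data the watch supplies on each invocation.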

Do Consul sidecars support WebSocket upstreams?

Is it possible to configure a consul agent sidecar with a websocket upstream? I have tried the following configuration but it doesn't work:
{
"bind_addr": "172.17.0.2",
"data_dir": "/consul/data",
"datacenter": "dc1",
"node_id" : "98dc3bf4-a364-46d9-8b72-624963064ab2",
"node_name": "socket-client-agent",
"leave_on_terminate": true,
"ports": [
{
"grpc": 8502
}
],
"server": false,
"service": [
{
"address": "172.17.0.3",
"connect": [
{
"sidecar_service": [
{
"checks": [
{
"interval": "10s",
"name": "socket-client-sidecar-proxy",
"tcp": "172.17.0.3:21000"
}
],
"port": 21000,
"proxy": [
{
"config": [
{
"bind_address": "0.0.0.0",
"bind_port": 21000,
"protocol": "tcp"
}
],
"upstreams": [
{
"destination_name": "sockets-server",
"local_bind_port": 5001,
"config": {
"protocol": "tcp"
}
}
]
}
]
}
]
}
],
"id": "socket-client-0",
"name": "socket-client",
"port": 5000
}
],
"ui_config": [
{
"enabled": false
}
]
}
With this configuration I'm trying to connect to the sockets-server service, which uses the WebSocket protocol. I'm using Envoy as the sidecar proxy.
Currently, Consul does not configure Envoy correctly to support WebSocket upgrades. This GitHub issue has more detail on the problem and a potential fix: https://github.com/hashicorp/consul/issues/9473.

Restart server on node failure with Consul

Newbie to Microservices here.
I have been looking into developing a microservice with Spring Actuator, using Consul for service discovery and failure recovery.
I have configured a cluster as explained in Consul documentation.
Now I'm trying to configure a Consul watch that triggers when any of my services goes down and executes a shell script to restart it. Following is my configuration file.
{
"bind_addr": "127.0.0.1",
"datacenter": "dc1",
"encrypt": "EXz7LsrhpQ4idwqffiFoQ==",
"data_dir": "/data",
"log_level": "INFO",
"enable_syslog": true,
"enable_debug": true,
"enable_script_checks": true,
"ui":true,
"node_name": "SpringConsulClient",
"server": false,
"service": { "name": "Apache", "tags": ["HTTP"], "port": 8080,
"check": {"script": "curl localhost >/dev/null 2>&1", "interval": "10s"}},
"rejoin_after_leave": true,
"watches": [
{
"type": "service",
"handler": "/Consul-Script.sh"
}
]
}
Any help/tip would be greatly appreciated.
Regards,
Chrishan
Take a closer look at the description of the service watch type in the official documentation. It includes an example of how to specify it:
{
"type": "service",
"service": "redis",
"args": ["/usr/bin/my-service-handler.sh", "-redis"]
}
Note that it has no handler property; instead, it takes the path to the script in args. One more thing:
It requires the "service" parameter.
It seems that in your case you need to specify it as follows:
"watches": [
{
"type": "service",
"service": "Apache",
"args": ["/fully/qualified/path/to/Consul-Script.sh"]
}
]
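As for what the handler itself might do, here is a rough Python sketch of the decision logic. It assumes the handler receives the watch payload as a JSON string and that a restart callable is supplied by you. Both handle and restart are hypothetical names, and the actual restart command is deliberately left out.

```python
import json
from typing import Callable, List


def critical_instances(entries: list) -> List[str]:
    """Return the IDs of instances that have at least one critical check."""
    return [
        entry["Service"]["ID"]
        for entry in entries
        if any(c.get("Status") == "critical" for c in entry.get("Checks", []))
    ]


def handle(raw: str, restart: Callable[[str], None]) -> List[str]:
    """Parse the watch payload and call restart() for each critical instance."""
    entries = json.loads(raw) if raw.strip() else []
    down = critical_instances(entries or [])  # payload may be JSON null
    for service_id in down:
        restart(service_id)
    return down


# Example with a fake restart function that only records what it was asked to do.
restarted = []
payload = json.dumps(
    [{"Service": {"ID": "Apache"}, "Checks": [{"Status": "critical"}]}]
)
print(handle(payload, restarted.append))
```

Keeping the restart action injectable means the decision logic can be tested without touching the real service.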

Enable HTTP check on External Services Consul

I want to run HTTP checks on services registered as external services with Consul. So far the check gets registered but is never called.
What am I missing?
{
"Datacenter": "dc1",
"Node": "new",
"Address": .google.com",
"Service": {
"ID":"re",
"Service": "search2",
"Port": 80
},
"Check":{
"Node":"new",
"CheckID":"Test",
"HTTP":"http://www.google",
"ServiceID":"re"
}
}
You have to specify an interval for the service health check. You can do it as follows:
For a TCP port check:
{
"check": {
"id": "http",
"name": "http TCP on port 80",
"tcp": "localhost:80",
"interval": "10s",
"timeout": "1s"
}
}
For an HTTP API check:
{
"check": {
"id": "api",
"name": "HTTP API on port 5000",
"http": "https://localhost:5000/health",
"tls_skip_verify": false,
"method": "POST",
"header": {"x-foo":["bar", "baz"]},
"interval": "10s",
"timeout": "1s"
}
}
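The two examples above use the agent's check definition format. For a catalog-style registration like the one in the question, a sketch of the equivalent payload might look like the following. The builder function, the CheckID and Name values, and the example.com host are all hypothetical. The Definition field names (HTTP, Interval, Timeout) are the ones that appear in Consul's health API output. Whether anything actually executes a check registered this way depends on your setup, so treat that part as an assumption.

```python
import json


def external_registration(node: str, address: str, service_id: str,
                          service: str, port: int, url: str,
                          interval: str = "10s", timeout: str = "1s") -> dict:
    """Build a catalog-style registration with an HTTP check that has an interval."""
    return {
        "Datacenter": "dc1",
        "Node": node,
        "Address": address,
        "Service": {"ID": service_id, "Service": service, "Port": port},
        "Check": {
            "Node": node,
            "CheckID": f"{service_id}-http",     # hypothetical naming scheme
            "Name": f"HTTP check for {service}",  # hypothetical
            "ServiceID": service_id,
            "Definition": {"HTTP": url, "Interval": interval, "Timeout": timeout},
        },
    }


# example.com stands in for the real external host.
payload = external_registration(
    "new", "example.com", "re", "search2", 80, "http://example.com/"
)
print(json.dumps(payload, indent=2))
```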

DC/OS marathon Virtual network not working

I installed DC/OS with 3 masters and 3 agents and am facing a problem with virtual networking. Here is my Marathon app spec:
{
"id": "/nginx",
"cmd": null,
"cpus": 1,
"mem": 128,
"disk": 0,
"instances": 1,
"container": {
"type": "DOCKER",
"volumes": [],
"docker": {
"image": "nginx",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 80,
"hostPort": 0,
"servicePort": 10002,
"protocol": "tcp",
"name": "main1",
"labels": {
"VIP_0": "9.0.0.0:34562"
}
}
],
"privileged": false,
"parameters": [],
"forcePullImage": false
}
},
"portDefinitions": [
{
"port": 10002,
"protocol": "tcp",
"labels": {}
}
]
}
I see the following in the DC/OS virtual network section:
VIRTUAL NETWORK NAME | SUBNET | AGENT PREFIX LENGTH
dcos | 9.0.0.0/8 | 24
The container stays in the waiting state for a long time. If I remove the port mapping section, it runs successfully.
Basically, I need to know how to work with this new virtual network and get service discovery and load balancing working without any extra components.
Took me some time to figure it out as well...
You need to:
Remove all port assignments from the task definition
Specify the name of the network to attach to (the default network created is named "dcos")
{
"id": "yourtask",
"container": {
"type": "DOCKER",
"docker": {
"image": "your/image",
"network": "USER"
}
},
"acceptedResourceRoles" : [
"slave_public"
],
"ipAddress": {
"networkName": "dcos"
},
"instances": 2,
"cpus": 0.2,
"mem": 128
}
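If you have several app definitions to convert, the two steps above can be sketched as a small Python transformation. This is only an illustration under the assumptions of this answer: the function name to_virtual_network is made up, and it does not add acceptedResourceRoles, which you may or may not need.

```python
import copy


def to_virtual_network(app: dict, network_name: str = "dcos") -> dict:
    """Return a copy of a Marathon app definition attached to a virtual network."""
    result = copy.deepcopy(app)
    docker = result["container"]["docker"]
    # Step 1: remove all port assignments.
    docker.pop("portMappings", None)
    result.pop("portDefinitions", None)
    # Step 2: attach to the named virtual network.
    docker["network"] = "USER"
    result["ipAddress"] = {"networkName": network_name}
    return result


# Cut-down version of the app spec from the question.
original = {
    "id": "/nginx",
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "nginx",
            "network": "BRIDGE",
            "portMappings": [{"containerPort": 80, "hostPort": 0}],
        },
    },
    "portDefinitions": [{"port": 10002, "protocol": "tcp"}],
}

converted = to_virtual_network(original)
print(converted)
```

The input definition is left untouched because the function works on a deep copy.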

Resources