Ansible: how to start consul-cluster

I have an Ansible playbook to configure Consul with 3 servers (1 bootstrap) and 3 clients.
First, I want to execute the bootstrap; this is the console command:
vagrant#172.16.8.191$ consul agent -config-dir /etc/consul.d/bootstrap
Then, while the bootstrap agent is running, I want to start Consul on the other servers of the cluster. I have the following in Ansible:
- name: start consul
  service: name=consul state=restarted enabled=yes
My problem is: how can I stop the following process using Ansible:
consul agent -config-dir /etc/consul.d/bootstrap
If there is another way to start a Consul cluster with Ansible, I'd be thrilled to hear about it.
Thanks,

Solution answer:
I have changed my Consul config on the clients and servers to auto-create the cluster, so when the node machines boot, the cluster forms and Consul starts automatically.
To do this, I use the following configuration:
Client:
{
  "bind_addr": "172.16.8.194",
  "client_addr": "0.0.0.0",
  "server": false,
  "datacenter": "ikerlan-Consul",
  "data_dir": "/var/consul",
  "ui_dir": "/home/ikerlan/dist",
  "log_level": "WARN",
  "encrypt": "XXXXXX",
  "enable_syslog": true,
  "retry_join": ["172.16.8.191", "172.16.8.192", "172.16.8.193"]
}
Server:
{
  "bind_addr": "0.0.0.0",
  "client_addr": "0.0.0.0",
  "bootstrap": false,
  "server": true,
  "datacenter": "ikerlan-Consul",
  "data_dir": "/var/consul",
  "ui_dir": "/home/ikerlan/dist",
  "log_level": "WARN",
  "encrypt": "XXXXXX",
  "enable_syslog": true,
  "retry_join": ["172.16.8.191", "172.16.8.192", "172.16.8.193"],
  "bootstrap_expect": 3
}
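With this configuration in place, the Ansible side can simply deploy the config files and start the service on every node; retry_join and bootstrap_expect then take care of forming the cluster without a separate bootstrap step. A minimal sketch of such a play (the group names, file paths and handler name are assumptions, not taken from the original playbook):

- name: deploy and start consul
  hosts: consul_servers:consul_clients
  become: yes
  tasks:
    - name: copy node-specific consul config
      copy:
        src: "configs/{{ inventory_hostname }}.json"
        dest: /etc/consul.d/config.json
      notify: restart consul

    - name: ensure consul is started and enabled on boot
      service:
        name: consul
        state: started
        enabled: yes

  handlers:
    - name: restart consul
      service:
        name: consul
        state: restarted

Because every server carries bootstrap_expect: 3, leader election happens on its own once three servers have joined.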

Related

How to setup a single node Consul server/client?

What configuration is required to achieve this?
It's possible using the "development mode" mentioned here - https://learn.hashicorp.com/consul/getting-started/agent (but not recommended for production).
I've tried setting this up but I'm not sure how to set the client config. What I've tried is a config of:
{
  "data_dir": "/tmp2/consul-client",
  "log_level": "INFO",
  "server": false,
  "node_name": "master",
  "addresses": {
    "https": "127.0.0.1"
  },
  "bind_addr": "127.0.0.1"
}
Which results in a failure of:
consul agent -config-file=client.json
==> Starting Consul agent...
==> Error starting agent: Failed to start Consul client: Failed to start lan serf: Failed to create memberlist: Could not set up network transport: failed to obtain an address: Failed to start TCP listener on "127.0.0.1" port 8301: listen tcp 127.0.0.1:8301: bind: address already in use
No "client" agent is required to run for an operational Consul cluster.
I had to set this server / master with the bootstrap_expect set to 1(number of nodes for boostrap process):
{
  "retry_join": ["127.0.0.1"],
  "data_dir": "/tmp2/consul",
  "log_level": "INFO",
  "server": true,
  "node_name": "master",
  "addresses": {
    "https": "127.0.0.1"
  },
  "bind_addr": "127.0.0.1",
  "ui": true,
  "bootstrap_expect": 1
}
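Starting the agent with that file and then checking membership is enough to confirm the single-node cluster is operational (the file name server.json is just an assumption for wherever the config above is saved):

consul agent -config-file=server.json

# in a second terminal, confirm the node joined and elected itself leader
consul members
consul info | grep leader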

Consul not showing services registered on a different node

I have a cluster of 3 Consul server nodes. I have registered one service (FooService) with one of the servers (Server1). When I check the registered services using the HTTP API (/v1/agent/services) on that server (Server1), it shows up correctly. But when I try the same on any of the other servers (i.e., Server2/Server3), the registered service is not listed. This issue does not happen for the KV store. Can someone suggest a fix?
Consul version: 1.2.1
I have pasted my configuration below
{
  "bootstrap_expect": 3,
  "client_addr": "0.0.0.0",
  "datacenter": "DC1",
  "data_dir": "/var/consul",
  "domain": "consul",
  "enable_script_checks": true,
  "dns_config": {
    "enable_truncate": true,
    "only_passing": true
  },
  "enable_syslog": true,
  "encrypt": "3scwcXQpgNVo1CZuqlSouA==",
  "leave_on_terminate": true,
  "log_level": "INFO",
  "rejoin_after_leave": true,
  "server": true,
  "start_join": [
    "10.0.0.242",
    "10.0.0.243",
    "10.0.0.244"
  ],
  "ui": true
}
What I understood is that the /v1/agent/services endpoint only lists services registered with that particular agent, while the cluster-wide view is available through the catalog endpoints; the Spring Boot app should therefore always connect to its local Consul agent, and then this issue will not occur.
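The difference can be seen directly with curl (a sketch, assuming the HTTP API listens on the default port 8500):

# services registered with this specific agent only
curl http://127.0.0.1:8500/v1/agent/services

# services known cluster-wide, regardless of which agent registered them
curl http://127.0.0.1:8500/v1/catalog/services

# nodes providing a particular service, e.g. FooService
curl http://127.0.0.1:8500/v1/catalog/service/FooService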

Restart server on node failure with Consul

Newbie to microservices here.
I have been looking into developing a microservice with Spring Actuator while using Consul for service discovery and failure recovery.
I have configured a cluster as explained in the Consul documentation.
Now what I'm trying to do is configure a Consul watch to trigger when any of my services goes down and execute a shell script to restart that service. The following is my configuration file.
{
  "bind_addr": "127.0.0.1",
  "datacenter": "dc1",
  "encrypt": "EXz7LsrhpQ4idwqffiFoQ==",
  "data_dir": "/data",
  "log_level": "INFO",
  "enable_syslog": true,
  "enable_debug": true,
  "enable_script_checks": true,
  "ui": true,
  "node_name": "SpringConsulClient",
  "server": false,
  "service": {
    "name": "Apache",
    "tags": ["HTTP"],
    "port": 8080,
    "check": { "script": "curl localhost >/dev/null 2>&1", "interval": "10s" }
  },
  "rejoin_after_leave": true,
  "watches": [
    {
      "type": "service",
      "handler": "/Consul-Script.sh"
    }
  ]
}
Any help/tip would be greatly appreciated.
Regards,
Chrishan
Take a closer look at the description of the service watch type in the official documentation. It has an example of how you can specify it:
{
  "type": "service",
  "service": "redis",
  "args": ["/usr/bin/my-service-handler.sh", "-redis"]
}
Note that it has no handler property, but instead takes the path to the script as an argument. And one more thing:
It requires the "service" parameter
It seems that in your case you need to specify it as follows:
"watches": [
{
"type": "service",
"service": "Apache",
"args": ["/fully/qualified/path/to/Consul-Script.sh"]
}
]
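For reference, the handler script receives the current state of the watched service as JSON on stdin, so the script itself decides whether a restart is needed. A rough sketch of what Consul-Script.sh could do (the apache2 unit name is only an assumption):

#!/bin/sh
# Consul pipes the watched service's health state to stdin as JSON.
# If any check is reported critical, restart the underlying service.
if grep -q '"Status": *"critical"'; then
  systemctl restart apache2
fi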

Marathon: How to specify environment variables in args

I am trying to run a Consul container on each of my Mesos slave nodes.
With Marathon I have the following JSON script:
{
  "id": "consul-agent",
  "instances": 10,
  "constraints": [["hostname", "UNIQUE"]],
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "consul",
      "privileged": true,
      "network": "HOST"
    }
  },
  "args": ["agent", "-bind", "$MESOS_SLAVE_IP", "-retry-join", "$MESOS_MASTER_IP"]
}
However, it seems that Marathon treats the args as plain text.
That's why I always got these errors:
==> Starting Consul agent...
==> Error starting agent: Failed to start Consul client: Failed to start lan serf: Failed to create memberlist: Failed to parse advertise address!
So I just wonder if there is any workaround that would let me start a Consul container on each of my Mesos slave nodes.
Update:
Thanks @janisz for the link.
After taking a look at the following discussions:
#3416: args in marathon file does not resolve env variables
#2679: Ability to specify the value of the hostname an app task is running on
#1328: Specify environment variables in the config to be used on each host through REST API
#1828: Support for more variables and variable expansion in app definition
as well as the Marathon documentation on Task Environment Variables.
My understanding is that:
Currently it is not possible to pass environment variables in args
Some posts indicate that one could pass environment variables in "cmd". But those environment variables are Task Environment Variables provided by Marathon, not the environment variables of your host machine.
Please correct me if I'm wrong.
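If the values you need are (or can be made) available as task environment variables, one workaround is to use cmd instead of args, since cmd is wrapped in a shell and therefore expands variables. A rough sketch under those assumptions ($HOST is a variable Marathon injects with the agent hostname; MESOS_MASTER_IP is not built in and is supplied here through the app's env block; depending on the image's entrypoint the command may need adjusting):

{
  "id": "consul-agent",
  "instances": 10,
  "constraints": [["hostname", "UNIQUE"]],
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "consul",
      "privileged": true,
      "network": "HOST"
    }
  },
  "env": {
    "MESOS_MASTER_IP": "x.x.x.x"
  },
  "cmd": "consul agent -bind $HOST -retry-join $MESOS_MASTER_IP"
}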
You can try this.
{
  "id": "consul-agent",
  "instances": 10,
  "constraints": [["hostname", "UNIQUE"]],
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "consul",
      "privileged": true,
      "network": "HOST",
      "parameters": [
        {
          "key": "env",
          "value": "YOUR_ENV_VAR=VALUE"
        }
      ]
    }
  }
}
Or
{
  "id": "consul-agent",
  "instances": 10,
  "constraints": [["hostname", "UNIQUE"]],
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "consul",
      "privileged": true,
      "network": "HOST"
    }
  },
  "env": {
    "ENV_NAME": "VALUE"
  }
}

Gossip encryption not working as expected

I have created a gossip encryption key using the command below:
$ consul keygen
G74SM8N9NUc4meaHfA7CFg==
Then, I bootstrapped the server with the following config.json:
{
  "server": true,
  "datacenter": "consul",
  "data_dir": "/var/consul",
  "log_level": "INFO",
  "enable_syslog": true,
  "disable_update_check": true,
  "client_addr": "0.0.0.0",
  "bootstrap": true,
  "leave_on_terminate": true,
  "encrypt": "G74SM8N9NUc4meaHfA7CFg=="
}
The output of the bootstrap server is as follows:
Node name: 'abcd'
Datacenter: 'consul'
Server: true (bootstrap: true)
Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: x.x.x.x (LAN: 8301, WAN: 8302)
Gossip encrypt: true, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
Then, I added a new server as a regular Consul server with the following config.json:
{
  "server": true,
  "datacenter": "consul",
  "data_dir": "/var/consul",
  "log_level": "INFO",
  "enable_syslog": true,
  "disable_update_check": true,
  "client_addr": "0.0.0.0",
  "bootstrap": false,
  "leave_on_terminate": true,
  "ui_dir": "/usr/local/bin/consul_ui",
  "check_update_interval": "0s",
  "ports": {
    "dns": 8600,
    "http": 8500,
    "https": 8700,
    "rpc": 8400,
    "serf_lan": 8301,
    "serf_wan": 8302,
    "server": 8300
  },
  "dns_config": {
    "allow_stale": true,
    "enable_truncate": true,
    "only_passing": true,
    "max_stale": "02s",
    "node_ttl": "30s",
    "service_ttl": {
      "*": "10s"
    }
  },
  "advertise_addr": "y.y.y.y",
  "encrypt": "G74SM8N9NUc4meaHfA7CFg==",
  "retry_join": [
    "x.x.x.x",
    "y.y.y.y"
  ]
}
Note: here, x.x.x.x is the IP address of the bootstrap server and y.y.y.y is the IP address of the regular server.
For testing purposes, I changed the encrypt key on one of the servers. When I run consul members, I can still see all the IPs, which means the servers are still able to communicate even with different encrypt keys. It seems that gossip encryption is not working correctly.
A Consul instance will cache the initial key and re-use it. It is stored in the serf folder in the file local.keyring.
This is counter-intuitive, but it is documented at least in one place together with the encrypt option.
You'll need to delete this file and restart Consul in order to get the expected behaviour.
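On an affected node, that roughly comes down to the following (assuming the data_dir of /var/consul used above and a service-managed agent; the consul keyring command can then be used to confirm which keys are installed):

sudo systemctl stop consul
# remove the cached gossip keyring so the encrypt value from the config file is used again
sudo rm /var/consul/serf/local.keyring
sudo systemctl start consul

# list the gossip keys currently known to the cluster
consul keyring -list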
