HA for the local Consul agent with Docker-Swarm - consul

In my microservices system I plan to use docker swarm and Consul.
In order to ensure the high availability of Consul I’m going to build a cluster of 3 server agents (along with a client agent per node), but this doesn’t save me from local consul agent failure.
Am I missing something?
If not, how can I configure swarm to be aware of more than 1 consul agents?

Consul is the only service discovery backend that don't support multiple endpoints while using swarm.
Both zookeeper and etcd support the etcd://10.0.0.4,10.0.0.5 format of providing multiple Ip's for the "cluster" of discovery back-ends while using Swarm.
To answer your question how you can configure Swarm to support more than 1 consul (server) - I don't have a definitive answer to it but can point you in a direction and something you can test ( no guarantees ) :
One suggestion worth testing (which is not recommended for production) is to use a Load Balancer that can pass your requests from the Swarm manager to one of the three consul servers.
So when starting the swarm managers you can point to consul://ip_of_loadbalancer:port
This will however cause the LB to be a bottleneck (if it goes down).
I have not tested the above and can't answer if it will work or not - it is merely a suggestion.

Related

What if local consul client down in node

I am very new to consul , and has been reading about consul clustering recently. My understanding is , for each node (equivalent to a physical machine or VM), we will run a local consul agent (in client mode), hence any microservices running in that node will register itself thru this agent. but what happen if this one and only one agent is down, won't the microservices in that node unable to register anymore? Or should we expect more than one consul agent (in client mode) per node to handle such situation?
You are correct. If the Consul agent is down, the services on that host will not be able to register with the agent, and Consul will consider all services which were previously registered against the agent to be unavailable.
A very simple solution is to run Consul under a process manager like systemd, and configure systemd to restart the agent if the process unexpectedly fails. You can find an example systemd unit for this at https://learn.hashicorp.com/tutorials/consul/deployment-guide#configure-systemd. If Consul is installed from the HashiCorp Linux package repo (https://learn.hashicorp.com/tutorials/consul/get-started-install), this systemd unit will be included as part of the installation package.

consul servers without LAN every server connected through wan

im just getting started learning nomad and consul
i have several servers without a Local Area Network and they are connected through wam (which i think you mean by datacenters) every server is a datacenter
i found in the docs https://www.consul.io/docs/architecture that each datacenter should have 3 to 5 consul servers so is my case applicable with consul and nomad
should i make all of the consul servers or 3 servers as consul and the rest are consul clients
You could use:
3*consul servers or
2*consul clients and one server
I have every tested all these cases they work correctly

Nomad and consul setup

Should I run consul slaves alongside nomad slaves or inside them?
The later might not make sense at all but I'm asking it just in case.
I brought my own nomad cluster up with consul slaves running alongside nomad slaves (inside worker nodes), my deployable artifacts are docker containers (java spring applications).
The issue with my current setup is that my applications can't access consul slaves (to read configurations) (none of 0.0.0.0, localhost, worker node ip worked)
Lets say my service exposes 8080, I configured docker part (in hcl file) to use bridge as network mode. Nomad maps 8080 to 43210.
Everything is fine until my service tries to reach the consul slave to read configuration. Ideally giving nomad worker node IP as consul host to Spring should suffice. But for some reason it's not.
I'm using latest version of nomad.
I configured my nomad slaves like https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/nomad/client1.hcl
And the link below shows how I configured/ran my consul slave:
https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/server2.yml
Note: if I use static port mapping and host as the network mode for docker (in nomad) I'll be fine but then I can't deploy more than one instance of each application in each worker node (due to port conflic)
Nomad jobs listen on a specific host/port pair.
You might want to ssh into the server and run docker ps to see what host/port pair the job is listening on.
a93c5cb46a3e image-name bash 2 hours ago Up 2 hours 10.0.47.2:21435->8000/tcp, 10.0.47.2:21435->8000/udp foo-bar
Additionally, you will need to ensure that the consul nomad job is listening on port 0.0.0.0, or the specific ip of the machine. I believe that is this config value: https://www.consul.io/docs/agent/options.html#_bind
All those will need to match up in order to consul to be reachable.
More generally, I might recommend: if you're going to run consul with nomad, you might want to switch to host networking, so that you don't have to deal with the specifics of the networking within a container. Additionally, you could schedule consul as a system job so that it is automatically present on every host.
So I managed to solve the issue like this:
nomad.job.group.network.mode = host
nomad.job.group.network.port: port "http" {}
nomad.job.group.task.driver = docker
nomad.job.group.task.config.network_mode = host
nomad.job.group.task.config.ports = ["http"]
nomad.job.group.task.service.connect: connect { native = true }
nomad.job.group.task.env: SERVER_PORT= "${NOMAD_PORT_http}"
nomad.job.group.task.env: SPRING_CLOUD_CONSUL_HOST = "localhost"
nomad.job.group.task.env: SPRING_CLOUD_SERVICE_REGISTRY_AUTO_REGISTRATION_ENABLED = "false"
Running consul agent (slaves) using docker-compose alongside nomad agent (slave) with host as network mode + exposing all required ports.
Example of nomad job: https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/nomad/location-update-publisher.hcl
Example of consul agent config (docker-compose file): https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/server2.yml
Disclaimer: The LAB is part of Cluster Visualization Framework called: LiteArch Trafik which I have created as an interesting exercise to understand Nomad and Consul.
It took me long time to shift my mind from K8S to Nomad and Consul,
Integration them was one of my effort I spent in the last year.
When service resolution doesn't work, I found out it's more or less the DNS configuration on servers.
There is a section for it on Hashicorp documentation called DNS Forwarding
Hashicorp DNS Forwarding
I have created a LAB which explains how to set up Nomad and Consul.
But you can use the LAB seperately.
I created the LAB after learning the hard way how to install the cluster and how to integrate Nomad and Consul.
With the LAB you need Ubuntu Multipass installed.
You execute one script and you will get full functional Cluster locally with three servers and three nodes.
It shows you as well how to install docker and integrate the services with Consul and DNS services on Ubuntu.
After running the LAB you will get the links to Nomad, Fabio, Consul.
Hopefully it will guide you through the learning process of Nomad and Consul
LAB: LAB
Trafik:Trafik Visualizer

Microservices in Consul

I'm interested in knowing if I can use Consul to solve the following issues:
1) Can Consul be used to load balance microservices? For instance, if I put console on the server that hosts my API gateway, can it be used to monitor all microservices it has discovered and load balance if I have two of the same microservice?
2) Can Consul be used at the microservice level to spin up instances as needed? Essentially, I'd like to not use IIS and find an alternative.
3) If for whatever reason Consul monitors a microservice as offline, can it attempt to start it up again? Or force a shut down of a microservice for whatever reason?
If Consul software can't solve these issues, is there other alternatives?
Thank you.
Consul DNS can provide a simple way for you to load balance services. It's especially powerful if you combine it with Consul Prepared Queries and health checks.
Consul is best suited for monitoring services (via health checks) but you can use consul watch to trigger events if a service suddenly becomes unavailable.
Hashicorp (the company behind Consul) offers another tool called Nomad.
Unlike Consul, Nomad is designed to run services (called jobs) and restart them if necessary.
Nomad works best if you tell it where to find Consul. This enables automatic service registration for any task Nomad launches, including deregistering it if you instruct Nomad to stop running that task. Health checks are supported as well.

Registering service on existing node in Consul

There is a Consul cluster in my local environment, and some developers' local machines as well. Each developer has a Tomcat server which runs some web artifacts in Docker container, so I want to register these artifacts as services on Tomcat deploy.
Assuming that we have already registered empty node for each developer's local machine, how can i register/deregister a new service on existing node? Do i need consul agent running on any node?
I know it's possible to add service when registering node, but haven't found any info about how to add services to node dynamically. I'd prefer HTTP API if possible (it's much easier to run on local machines).
Do i need consul agent running on any node?
Yes, even though you can add external services to a remote machine using curl post too, the service discovery is going to benifit you with the agent running on nodes too.
I know it's possible to add service when registering node, but haven't found any info about how to add services to node dynamically.
Registering a service is fairly easy on consul and you can find more details at the following link:
https://www.consul.io/intro/getting-started/services.html
However, if you wish to give better isolation to your developers, I would recommend running the consul agent server/client in docker and let registrator take care of everything.
Registrator from gliderlabs is service registry bridge for Docker. It automatically registers and deregisters services for any Docker container by inspecting containers as they come online.
You can find more details here: https://github.com/gliderlabs/registrator

Resources