I have created a Consul server cluster on localhost using different ports.
I used the commands below for that.
server 1:
consul agent -server -bootstrap-expect=3 -data-dir=consul-data -ui -bind=127.0.0.1 -dns-port=8601 -http-port=8501 -serf-lan-port=8303 -serf-wan-port=8304 -server-port=8305 -node=node1
server 2:
consul agent -server -bootstrap-expect=3 -data-dir=consul-data2 -ui -bind=127.0.0.1 -dns-port=8602 -http-port=8502 -serf-lan-port=8306 -serf-wan-port=8307 -server-port=8308 -node=node2 -join=127.0.0.1:8303
server 3:
consul agent -server -bootstrap-expect=3 -data-dir=consul-data1 -ui -bind=127.0.0.1 -node=node3 -join=127.0.0.1:8303
Then I created 2 microservices using Spring Boot, called service_A and service_B.
Service_B calls service_A to get some data.
Both services register with one of the above servers.
In application.properties:
spring.cloud.consul.port=8501 #For service_A
spring.cloud.consul.port=8502 #For service_B
This works fine as Service_B discovers Service_A without any problem.
Now, when I kill the Consul server that service_A registered with, the system fails to give results, since service_B can no longer find service_A.
How can I make this system fault tolerant, so that even if a Consul server fails, the services registered with it automatically get registered with another server that is available in the cluster?
I would also like to know how Consul achieves high availability and fault tolerance in service registration and discovery.
You can deploy a Consul cluster on your local machine, but you cannot expect any resilience mechanism or fault tolerance within that same local machine. This is because your Spring services (service_A and service_B) have been configured to find the Consul server on the port given in bootstrap.yml (default 8500).
spring:
  cloud:
    consul:
      config:
        watch:
          enabled: true
      port: 8500
      discovery:
        instanceId: ${spring.application.name}:${random.value}
So each service will discover the Consul servers running on port 8500 (you can change it as you wish). If you run your whole Consul cluster on the same local machine, you cannot assign the same port number (8500) to each cluster node that needs to be identified; the ports must differ because the nodes share one IP address. To achieve this, you will need to deploy each Consul node under a different IP address with the same port number, 8500.
8301 is the Serf LAN port, used to handle gossip in the LAN. This port can be the same on each node to maintain the cluster interconnection.
The easiest way to achieve this is to use a private subnet in an AWS VPC.
You can then assign a separate configuration to each subnet node, with the same port number on every server node, so that the servers can be discovered by service_A and service_B via the @EnableDiscoveryClient annotation.
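A common way to get the fault tolerance asked about here is to run a Consul client agent on every host and point each service at that local agent instead of at one specific server; the client agent forwards requests to whichever server is currently healthy. A minimal bootstrap.yml sketch, assuming a local client agent on the default port 8500 (host, port, and instanceId values are illustrative, not from the question):

```yaml
spring:
  cloud:
    consul:
      host: localhost   # local Consul client agent, not a specific server
      port: 8500
      discovery:
        instanceId: ${spring.application.name}:${random.value}
```

With this shape, a single server failure is handled by the Raft cluster and the local agent, so the service configuration never has to change.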
Related
I'm just getting started learning Nomad and Consul.
I have several servers without a local area network; they are connected through a WAN (which I think is what you mean by datacenters), so every server is a datacenter.
I found in the docs https://www.consul.io/docs/architecture that each datacenter should have 3 to 5 Consul servers, so is my case applicable with Consul and Nomad?
Should I make all of the servers Consul servers, or make 3 of them Consul servers and the rest Consul clients?
You could use:
3 Consul servers, or
2 Consul clients and one server.
I have tested all these cases and they work correctly.
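As a sketch of the two options, with placeholder IPs and data directories (adjust these to your own WAN addresses; they are assumptions, not values from the question):

```shell
# Option 1: every host runs a server agent (3 servers total).
consul agent -server -bootstrap-expect=3 -data-dir=/opt/consul \
  -bind=10.0.0.1 -retry-join=10.0.0.2 -retry-join=10.0.0.3

# Option 2: one server, plus client agents that join it.
consul agent -server -bootstrap-expect=1 -data-dir=/opt/consul -bind=10.0.0.1
consul agent -data-dir=/opt/consul -bind=10.0.0.2 -retry-join=10.0.0.1
```

Note that with a single server (option 2) you have no quorum to survive a server failure; three servers is the usual minimum for fault tolerance.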
Should I run Consul slaves alongside Nomad slaves or inside them?
The latter might not make sense at all, but I'm asking just in case.
I brought up my own Nomad cluster with Consul slaves running alongside the Nomad slaves (inside worker nodes); my deployable artifacts are Docker containers (Java Spring applications).
The issue with my current setup is that my applications can't access the Consul slaves (to read configuration): none of 0.0.0.0, localhost, or the worker node IP worked.
Let's say my service exposes 8080. I configured the Docker part (in the HCL file) to use bridge as the network mode, and Nomad maps 8080 to 43210.
Everything is fine until my service tries to reach the Consul slave to read configuration. Ideally, giving Spring the Nomad worker node's IP as the Consul host should suffice, but for some reason it doesn't.
I'm using latest version of nomad.
I configured my nomad slaves like https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/nomad/client1.hcl
And the link below shows how I configured/ran my consul slave:
https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/server2.yml
Note: if I use static port mapping and host as the network mode for Docker (in Nomad) I'll be fine, but then I can't deploy more than one instance of each application on each worker node (due to port conflicts).
Nomad jobs listen on a specific host/port pair.
You might want to ssh into the server and run docker ps to see what host/port pair the job is listening on.
a93c5cb46a3e image-name bash 2 hours ago Up 2 hours 10.0.47.2:21435->8000/tcp, 10.0.47.2:21435->8000/udp foo-bar
Additionally, you will need to ensure that the Consul Nomad job is listening on 0.0.0.0, or on the specific IP of the machine. I believe that is this config value: https://www.consul.io/docs/agent/options.html#_bind
All of those need to match up in order for Consul to be reachable.
More generally, I might recommend: if you're going to run consul with nomad, you might want to switch to host networking, so that you don't have to deal with the specifics of the networking within a container. Additionally, you could schedule consul as a system job so that it is automatically present on every host.
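The system-job suggestion above could look roughly like this (a sketch, not a tested job file; the image tag, datacenter name, and retry-join address are assumptions):

```hcl
job "consul" {
  datacenters = ["dc1"]
  type        = "system"   # schedules one instance on every client node

  group "consul" {
    task "agent" {
      driver = "docker"
      config {
        image        = "consul:1.9"
        network_mode = "host"   # sidesteps container networking entirely
        args = ["agent", "-bind=0.0.0.0", "-client=0.0.0.0",
                "-retry-join=10.0.47.1"]
      }
    }
  }
}
```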
So I managed to solve the issue like this:
nomad.job.group.network.mode = host
nomad.job.group.network.port: port "http" {}
nomad.job.group.task.driver = docker
nomad.job.group.task.config.network_mode = host
nomad.job.group.task.config.ports = ["http"]
nomad.job.group.task.service.connect: connect { native = true }
nomad.job.group.task.env: SERVER_PORT= "${NOMAD_PORT_http}"
nomad.job.group.task.env: SPRING_CLOUD_CONSUL_HOST = "localhost"
nomad.job.group.task.env: SPRING_CLOUD_SERVICE_REGISTRY_AUTO_REGISTRATION_ENABLED = "false"
I run the Consul agents (slaves) using docker-compose alongside the Nomad agents (slaves) with host as the network mode, exposing all required ports.
Example of nomad job: https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/nomad/location-update-publisher.hcl
Example of consul agent config (docker-compose file): https://github.com/bmd007/statefull-geofencing-faas/blob/master/infrastructure/server2.yml
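For readability, the dotted settings above correspond roughly to this job-file shape (a sketch with placeholder job/group/task names, assuming a recent Nomad with group-level networking):

```hcl
job "my-service" {
  group "app" {
    network {
      mode = "host"
      port "http" {}          # dynamic port chosen by Nomad
    }
    task "app" {
      driver = "docker"
      config {
        network_mode = "host"
        ports        = ["http"]
      }
      service {
        connect { native = true }
      }
      env {
        SERVER_PORT              = "${NOMAD_PORT_http}"
        SPRING_CLOUD_CONSUL_HOST = "localhost"
        SPRING_CLOUD_SERVICE_REGISTRY_AUTO_REGISTRATION_ENABLED = "false"
      }
    }
  }
}
```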
Disclaimer: the LAB is part of a cluster visualization framework called LiteArch Trafik, which I created as an interesting exercise to understand Nomad and Consul.
It took me a long time to shift my mind from K8s to Nomad and Consul; integrating them was one of the efforts I spent the past year on.
When service resolution doesn't work, I've found it's more often than not the DNS configuration on the servers.
There is a section for it on Hashicorp documentation called DNS Forwarding
Hashicorp DNS Forwarding
I have created a LAB which explains how to set up Nomad and Consul.
But you can also use the LAB separately.
I created the LAB after learning the hard way how to install the cluster and how to integrate Nomad and Consul.
With the LAB you need Ubuntu Multipass installed.
You execute one script and you will get full functional Cluster locally with three servers and three nodes.
It shows you as well how to install docker and integrate the services with Consul and DNS services on Ubuntu.
After running the LAB you will get the links to Nomad, Fabio, Consul.
Hopefully it will guide you through the learning process of Nomad and Consul.
LAB: LAB
Trafik: Trafik Visualizer
I have so far configured servers inside Kubernetes containers that used HTTP or terminated HTTPS at the ingress controller. Is it possible to terminate HTTPS (or more generally TLS) traffic from outside the cluster directly on the container, and how would the configuration look in that case?
This is for an on-premises Kubernetes cluster that was set up with kubeadm (with Flannel as the CNI plugin). If the Kubernetes Service were configured with externalIPs 1.2.3.4 (where my-service.my-domain resolves to 1.2.3.4) for service access from outside the cluster at https://my-service.my-domain, say, how could the web service running inside the container bind to address 1.2.3.4, and how could the client verify a server certificate for 1.2.3.4 when the container's IP address is (as far as I know) some local IP address instead? I currently don't see how this could be accomplished.
UPDATE My current understanding is that when using an Ingress HTTPS traffic would be terminated at the ingress controller (i.e. at the "edge" of the cluster) and further communication inside the cluster towards the backing container would be unencrypted. What I want is encrypted communication all the way to the container (both outside and inside the cluster).
I guess Istio's Envoy proxies are what you need; their main purpose is to authenticate, authorize, and encrypt service-to-service communication.
So you need a mesh with mTLS authentication, also known as service-to-service authentication.
Visually, Service A is your ingress service and Service B is the service for the HTTP container.
So you terminate external TLS traffic on the ingress controller, and it travels further inside the cluster with Istio mTLS encryption.
It's not exactly what you asked for -
terminate HTTPS traffic directly on Kubernetes container
Though it fulfills the requirement:
What I want is encrypted communication all the way to the container
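To require mTLS inside the mesh, a PeerAuthentication policy along these lines could be applied (a sketch; the root-namespace scope shown here is an assumption that depends on your Istio installation):

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace makes this mesh-wide
spec:
  mtls:
    mode: STRICT            # reject plaintext service-to-service traffic
```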
I am doing a POC on Consul to support service discovery and multiple microservice versions. Consul clients and a server cluster (3 servers) are set up on Linux VMs. I followed the documentation at Consul and the setup was successful.
Here is my doubt. My setup is completely on VMs. I've added a service definition using the HTTP API. The same service is running on two nodes.
The services are correctly registered:
curl http://localhost:8500/v1/catalog/service/my-service
gives me the two node details.
When I do a DNS query:
dig @127.0.0.1 -p 8600 my-service.service.consul
I am able to see the expected results with the node which hosts the service. But I cannot ping the service since the service name is not resolved.
ping -c4 my-service or ping -c4 my-service.service.consul
ping: unknown host.
If I enter a mapping for my-service in the /etc/hosts file, I can ping it, but only from that same VM; I won't be able to ping it from another VM on the same LAN or WAN.
The default DNS port is 53, while the Consul DNS interface listens on 8600. I cannot use Docker for DNS forwarding. Is it possible I missed something here? Can Consul DNS queries work without Docker, dnsmasq, or iptables updates?
To be clear, here is what I would like to have as the end result:
ping my-service
This needs to ping the nodes I have configured, in a round robin fashion.
Please bear with me if this question is basic; I've gone through each of the Consul-related questions on SO.
I've also gone through this and this, and those too say I need to do extra setup.
Wait! Please don't do this!
DO. NOT. RUN. CONSUL. AS. ROOT.
Please. You can, but don't. Instead do the following:
Run a caching or forwarding DNS server on your VMs. I'm biased toward dnsmasq because of its simplicity and stability in the common case.
Configure dnsmasq to forward the TLD .consul to the consul agent listening on 127.0.0.1:8600 (the default).
Update your /etc/resolv.conf file to point to 127.0.0.1 as your nameserver.
There are a few ways of doing this, and the official docs have a write up that is worth looking into:
https://www.consul.io/docs/guides/forwarding.html
That should get you started.
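The dnsmasq side of step 2 can be a one-liner; a sketch, assuming the agent's DNS interface is on the default 127.0.0.1:8600 (the file name is arbitrary):

```ini
# /etc/dnsmasq.d/10-consul
# Forward queries for the .consul TLD to the local Consul agent.
server=/consul/127.0.0.1#8600
```

Everything else is answered from dnsmasq's normal upstream resolvers, so ordinary DNS keeps working.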
This can be a pretty complicated topic, but the simplest way is to change Consul to bind to port 53 using the ports directive, and to add some recursors to the Consul config so it can pass real DNS requests on to a host that has full DNS functionality. Something like these bits:
{
  "recursors": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "ports": {
    "dns": 53
  }
}
Then modify your system to use the Consul server for DNS, with a nameserver entry in /etc/resolv.conf. Depending on your OS, you might be able to specify a port in resolv.conf and avoid having to deal with Consul needing root to bind to port 53.
In a more complicated scenario, many people I know use Unbound or BIND to do split DNS and essentially do the opposite: routing the .consul domain to the Consul cluster on a non-privileged port at the org level of their DNS infrastructure.
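For the Unbound variant of that split-DNS setup, a forward-zone stanza along these lines is the usual shape (a sketch; 127.0.0.1 and 8600 are the Consul agent defaults, and you may also need a domain-insecure entry if DNSSEC validation is enabled):

```conf
server:
  do-not-query-localhost: no   # allow forwarding to the local agent
forward-zone:
  name: "consul"
  forward-addr: 127.0.0.1@8600
```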
In my microservices system I plan to use docker swarm and Consul.
In order to ensure Consul's high availability, I'm going to build a cluster of 3 server agents (along with a client agent per node), but this doesn't save me from a local Consul agent failure.
Am I missing something?
If not, how can I configure swarm to be aware of more than 1 consul agents?
Consul is the only service discovery backend that doesn't support multiple endpoints when used with Swarm.
Both ZooKeeper and etcd support the etcd://10.0.0.4,10.0.0.5 format for providing multiple IPs of the discovery back-end cluster when using Swarm.
To answer your question of how you can configure Swarm to support more than one Consul server: I don't have a definitive answer, but I can point you in a direction and suggest something you can test (no guarantees).
One suggestion worth testing (though not recommended for production) is to use a load balancer that passes requests from the Swarm manager to one of the three Consul servers.
When starting the Swarm managers, you can then point to consul://ip_of_loadbalancer:port.
This will, however, make the LB a single point of failure (if it goes down, discovery breaks).
I have not tested the above and can't say whether it will work; it is merely a suggestion.
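The load-balancer suggestion could be sketched with HAProxy as follows (hypothetical server IPs, and untested, as stated above):

```conf
# haproxy.cfg fragment: round-robin across the three Consul servers
frontend consul_front
    bind *:8500
    default_backend consul_servers

backend consul_servers
    balance roundrobin
    server consul1 10.0.0.4:8500 check
    server consul2 10.0.0.5:8500 check
    server consul3 10.0.0.6:8500 check
```

The health checks at least keep a dead Consul server out of rotation, even though the balancer itself remains a single point of failure.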