Prometheus: How to get Consul nodes from multiple datacenters

I'd like to use Prometheus's Consul integration to auto-discover all my Consul nodes. At the moment, my Prometheus server only receives nodes from a single datacenter in Consul, even though I never actually specified which datacenter to use in the configuration (I guess it just picked the one that the Consul client installed on my Prometheus server is part of).
How do I get all the nodes from all the datacenters that consul is aware of?

The solution we found was to use the 'datacenter' keyword in the configuration and to explicitly list the datacenters we have. It's not optimal (because we might add or remove datacenters in the future), but it does work. Below is an example:
scrape_configs:
  - job_name: 'consul'
    consul_sd_configs:
      - server: '0.0.0.0:8500'
        datacenter: 'datacenter-name-1'
      - server: '0.0.0.0:8500'
        datacenter: 'datacenter-name-2'
      - server: '0.0.0.0:8500'
        datacenter: 'datacenter-name-3'

Related

Do we need to have metricbeat installed in the remote prometheus cluster to pull prometheus data to ELK Cluster using metricbeat prometheus module?

Reference:
Configuring Metricbeat
Metricbeat Prometheus Module
From the second link, the Metricbeat Prometheus module configuration is as follows:
- module: prometheus
  period: 10s
  hosts: ["localhost:9090"]
  metricsets: ["query"]
  queries:
    - name: 'up'
      path: '/api/v1/query'
      params:
        query: "up"
For my use case, I want to pull data from a remote Prometheus host that is outside my network into my ELK cluster using Metricbeat Prometheus queries.
To that end, I added my remote Prometheus host name in the hosts section of the above Metricbeat Prometheus module configuration.
Now my question: do we need to install Metricbeat on the remote Prometheus cluster as well to pull the data (Ref: Configuring Metricbeat), or is adding the remote Prometheus host name to the hosts section of the Metricbeat configuration enough to do the trick?
You are not required to configure Metricbeat again on the remote Prometheus host. You can use the same configuration you have given in the question, but you cannot use localhost:9090, since you are not running Metricbeat on the same host where Prometheus is running. Instead, update the configuration to something like prometheus_ip:9090, as sketched below.
You also need to make sure that connectivity is allowed between the host where you have installed Metricbeat and the host where Prometheus is running.
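For illustration only, a minimal sketch of that change; remote-prometheus.example.com is a placeholder for your remote Prometheus host:

- module: prometheus
  period: 10s
  hosts: ["remote-prometheus.example.com:9090"]   # remote host instead of localhost
  metricsets: ["query"]
  queries:
    - name: 'up'
      path: '/api/v1/query'
      params:
        query: "up"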
You can also use Elastic Agent and Fleet instead of Metricbeat, because they provide centralized configuration management and are easy to configure. You can read more about Elastic Agent and Fleet here; they provide a Prometheus integration.

How to remove a wan connected datacenter in consul

I have a completely failed datacenter cluster (DC1) which I will not use in the future. But it is WAN-joined with another live datacenter cluster (DC2) which I'd like to continue using.
I've already marked all the members of DC1 as 'left' using consul force-leave.
But I still see DC1 when I list the datacenters using consul catalog datacenters. How can I remove it?
In simpler words: how can we break the WAN connection between two Consul clusters?
The failed datacenter will be removed by Consul automatically after 72 hours.

Achieve Fault Tolerance with Consul Cluster

I have created a Consul server cluster using different ports on localhost.
I used the commands below for that.
server 1:
consul agent -server -bootstrap-expect=3 -data-dir=consul-data -ui -bind=127.0.0.1 -dns-port=8601 -http-port=8501 -serf-lan-port=8303 -serf-wan-port=8304 -server-port=8305 -node=node1
server 2:
consul agent -server -bootstrap-expect=3 -data-dir=consul-data2 -ui -bind=127.0.0.1 -dns-port=8602 -http-port=8502 -serf-lan-port=8306 -serf-wan-port=8307 -server-port=8308 -node=node2 -join=127.0.0.1:8303
server 3:
consul agent -server -bootstrap-expect=3 -data-dir=consul-data1 -ui -bind=127.0.0.1 -node=node3 -join=127.0.0.1:8303
Then I created 2 microservices using spring boot, called service_A and service_B.
Service_B calls service_A to get some data.
Both services get registered with one of the above servers.
In application.properties:
spring.cloud.consul.port=8501 #For service_A
spring.cloud.consul.port=8502 #For service_B
This works fine as Service_B discovers Service_A without any problem.
Now, when I kill the Consul server with which service_A is registered, the system fails to give results since service_B cannot find service_A.
How should I make this system fault tolerant, meaning that even if a Consul server fails, services registered with that server automatically get registered with another server that is available in the cluster?
Further, I need to know how Consul achieves high availability and fault tolerance in service registration and discovery. I hope you get the question.
Apparently, you can deploy a Consul cluster on your local machine, but you cannot expect any resilience mechanism or fault tolerance on that same local machine. That is because your Spring services (service_A and service_B) have been configured to identify the Consul server that runs on the Consul server port given under bootstrap.yml (default 8500).
spring:
  cloud:
    consul:
      config:
        watch:
          enabled: true
      port: 8500
      discovery:
        instanceId: ${spring.application.name}:${random.value}
So each service will discover the Consul server that runs on the port given there, 8500 by default (you can change it as you wish). If you are running your Consul cluster on the same local machine, you cannot assign the same port number (8500) to every cluster node; the ports have to differ because the nodes share one IP address. To get fault tolerance you need to deploy each Consul node under a different IP address with the same port number, 8500.
8301 is the Serf LAN port used to handle gossip in the LAN. This port can then be the same on each node to maintain the cluster inter-connection.
The easiest way to achieve this is to use a private subnet in an AWS VPC.
Then you can assign separate configurations to each subnet node, with the same port number for every server node, so that the servers can be discovered by service_A and service_B with the @EnableDiscoveryClient annotation; a rough sketch follows.
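For illustration only, a minimal bootstrap.yml sketch under those assumptions; the application name and IP address are hypothetical placeholders for a service and a Consul node running on its own host:

spring:
  application:
    name: service-a
  cloud:
    consul:
      host: 10.0.1.10          # placeholder: a Consul agent reachable on its own IP
      port: 8500
      discovery:
        instanceId: ${spring.application.name}:${random.value}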

HA for the local Consul agent with Docker-Swarm

In my microservices system I plan to use docker swarm and Consul.
In order to ensure the high availability of Consul I’m going to build a cluster of 3 server agents (along with a client agent per node), but this doesn’t save me from local consul agent failure.
Am I missing something?
If not, how can I configure Swarm to be aware of more than one Consul agent?
Consul is the only service discovery backend that doesn't support multiple endpoints while using Swarm.
Both ZooKeeper and etcd support the etcd://10.0.0.4,10.0.0.5 format for providing multiple IPs for the "cluster" of discovery backends while using Swarm.
To answer your question of how you can configure Swarm to support more than one Consul server: I don't have a definitive answer, but I can point you in a direction with something you can test (no guarantees):
One suggestion worth testing (not recommended for production) is to use a load balancer that can pass your requests from the Swarm manager to one of the three Consul servers.
So when starting the Swarm managers you can point to consul://ip_of_loadbalancer:port.
This will, however, make the LB a single point of failure (if it goes down, discovery goes with it).
I have not tested the above and can't say whether it will work or not; it is merely a suggestion, sketched roughly below.
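Purely as an illustration of that suggestion (untested; the load balancer hostname, port, and published port are placeholders), a docker-compose style sketch of a standalone Swarm manager pointed at such a load balancer might look like this:

version: "2"
services:
  swarm-manager:
    image: swarm                     # the classic standalone Swarm image
    # "manage" starts a Swarm manager; the consul:// URL points at the LB
    # fronting the three Consul servers (placeholder hostname).
    command: manage -H tcp://0.0.0.0:3376 consul://consul-lb.internal:8500/swarm
    ports:
      - "3376:3376"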

Kubernetes on AWS cloud provider

I installed CentOS Atomic Host as the operating system for Kubernetes on AWS.
Everything works fine, but it seems I missed something.
I did not configure a cloud provider and cannot find any documentation on that.
In this question I want to know:
1. What features does a cloud provider give to Kubernetes?
2. How do I configure the AWS cloud provider?
UPD 1: external load balancer does not work; I have not tested awsElasticBlockStore yet, but I also suspect it does not work.
UPD 2:
Service details:
$ kubectl get svc nginx-service-aws-lb -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2016-01-02T09:51:40Z
  name: nginx-service-aws-lb
  namespace: default
  resourceVersion: "74153"
  selfLink: /api/v1/namespaces/default/services/nginx-service-aws-lb
  uid: 6c28b718-b136-11e5-9bda-06c2feb29b0d
spec:
  clusterIP: 10.254.172.185
  ports:
  - name: http-proxy-protocol
    nodePort: 31385
    port: 8080
    protocol: TCP
    targetPort: 8080
  - name: https-proxy-protocol
    nodePort: 31370
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector:
    app: nginx
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer: {}
I can't speak to the ProjectAtomic bits, nor to the KUBERNETES_PROVIDER env-var, since my experience has been with the CoreOS provisioner. I will talk about my experiences and see if that helps you dig a little more into your setup.
Foremost, it is absolutely essential that the controller EC2 and the worker EC2 machines have the correct IAM role that will enable the machines to make AWS calls on behalf of your account. This includes things like provisioning ELBs and working with EBS Volumes (or attaching an EBS Volume to themselves, in the case of the worker). Without that, your cloud-config experience will go nowhere. I'm pretty sure the IAM payloads are defined somewhere other than those .go files, which are hard to read, but that's the quickest link I had handy to show what's needed.
Fortunately, the answer to that question, and the one I'm about to talk about, both center on the apiserver and the controller-manager: their configuration and the logs they output.
Both the apiserver and the controller-manager have an argument that points to an on-disk cloud configuration file that regrettably isn't documented anywhere except for the source. That Zone field is, in my experience, optional (just like they say in the comments). However, it was seeing the KubernetesClusterTag that led me to follow that field around in the code to see what it does.
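For orientation only, a hedged sketch of how that argument is typically wired up on a controller-manager static pod; the image tag, master address, and config path are placeholders rather than details from the question, and volume mounts for the config file are omitted:

apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    image: gcr.io/google_containers/hyperkube:v1.1.2   # placeholder image/tag
    command:
    - /hyperkube
    - controller-manager
    - --master=http://127.0.0.1:8080
    - --cloud-provider=aws
    - --cloud-config=/etc/kubernetes/aws.cfg           # the on-disk cloud configuration file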
If your experience is anything like mine, you'll see in the docker logs of the controller-manager a bunch of error messages about how it created the ELB but could not find any subnets to attach to it (that "docker logs" bit presumes, of course, that ProjectAtomic also uses docker to run the Kubernetes daemons).
Once I attached a Tag named KubernetesCluster and set every instance of the Tag to the same string (it can be anything, AFAIK), the aws_loadbalancer was able to find the subnet in the VPC, attached the Nodes to the ELB, and everything was cool -- except for the part where it can only create Internet-facing ELBs right now. :-(
Just for clarity: the aws.cfg contains a field named KubernetesClusterTag that allows you to redefine the Tag that Kubernetes will look for; without any value in that file, Kubernetes will use the Tag name KubernetesCluster.
I hope this helps you and I hope it helps others, because once Kubernetes is up, it's absolutely amazing.
What features does a cloud provider give to Kubernetes?
Some features that I know of: the external load balancer and persistent volumes.
How do I configure the AWS cloud provider?
There is an environment variable called KUBERNETES_PROVIDER, but it seems that env var only matters when people start a k8s cluster. Since you said "everything works fine", I guess you don't need any further configuration to use the features I mentioned above.
