I've used HashiCorp's example docker-compose to start a 3-server / 3-client Consul cluster.
I just wanted to see how consul exec works:
consul members
Node          Address          Status  Type    Build  Protocol  DC   Segment
b9e4dbaa35ac  172.18.0.7:8301  alive   server  1.3.0  2         dc1  <all>
dac0d326a3c2  172.18.0.4:8301  alive   server  1.3.0  2         dc1  <all>
efd58b702d4c  172.18.0.5:8301  alive   server  1.3.0  2         dc1  <all>
30303321aefc  172.18.0.3:8301  alive   client  1.3.0  2         dc1  <default>
a91e25b36145  172.18.0.6:8301  alive   client  1.3.0  2         dc1  <default>
b0f7559d3bea  172.18.0.2:8301  alive   client  1.3.0  2         dc1  <default>
Then I tried calling it with all kinds of combinations:
consul exec -node {hash} ip a
consul exec 'ip a'
consul exec -shell 'ip a'
etc.
No errors, but no output either; the result was always:
0 / 0 node(s) completed / acknowledged
In other words, not a single agent even acknowledged the job.
I couldn't find any examples on the internet, and the documentation is less than helpful.
Found this: https://groups.google.com/forum/#!topic/consul-tool/zE4G9ixWq60
Which basically says that consul exec is an un-feature...
OK, in case you were wondering the same, here's how I solved it:
consul agent -hcl 'disable_remote_exec=false' ...
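It turns out remote exec has been disabled by default since Consul 0.8, so every agent in the cluster (servers and clients) needs it switched back on. The -hcl flag inlines the setting; the config-file equivalent would look something like this (the file name is my choice, not something from the compose file):

# exec.hcl - a minimal sketch; drop it into each agent's config dir
# (/consul/config in the official Docker image) and restart the agent
disable_remote_exec = false

With that in place on all agents, consul exec jobs actually get picked up.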
EDIT: I'm going to leave this up, but I moved away from Canonical Kubernetes to a microk8s install and everything "just worked." I would not recommend Canonical Kubernetes at this time (early 2019).
Goal:
I want to connect from my Windows machine (192.168.2.40) to the Canonical Kubernetes cluster running on an Ubuntu 18.04 box (192.168.2.148). I installed the cluster via conjure-up.
Problem:
Running kubectl cluster-info on the Windows machine gives me:
Unable to connect to the server: dial tcp 10.91.211.64:443: connectex: A connection attempt
failed because the connected party did not properly respond after a period of time,
or established connection failed because connected host has failed to respond.
I have ssh'd to the Ubuntu box and copied the ~/.kube/config file to Windows.
~/.kube/config:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <BIG LONG STRING O STUFF>
    server: https://10.91.211.64:443
  name: conjure-canonical-kubern-931
contexts:
- context:
    cluster: conjure-canonical-kubern-931
    user: conjure-canonical-kubern-931
  name: conjure-canonical-kubern-931
current-context: conjure-canonical-kubern-931
kind: Config
preferences: {}
users:
- name: conjure-canonical-kubern-931
  user:
    password: <Smaller String>
    username: admin
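For anyone hitting the same wall before giving up like I did: 10.91.211.64 is an address on the LXD bridge inside the Ubuntu box (conjure-up deploys everything into LXD containers), so the Windows machine has no route to it. One workaround, a sketch I have not verified on this setup, is to tunnel the API port over SSH and point the copied kubeconfig at the tunnel; the ubuntu user name and local port 8443 are assumptions, use whatever fits your box:

# on Windows (OpenSSH): forward a local port to the internal load balancer
ssh -L 8443:10.91.211.64:443 ubuntu@192.168.2.148
# then, in the copied ~/.kube/config, change the server line to:
#   server: https://localhost:8443
# the cert won't list localhost as a valid name, so also remove the
# certificate-authority-data line and run:
kubectl --insecure-skip-tls-verify=true cluster-info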
Background:
I have a spare Ubuntu 18.04 LTS server (192.168.2.148) on my home LAN, on which I've used conjure-up to install Canonical Kubernetes.
The install succeeded and the cluster seems to be working. I can ssh in, run kubectl cluster-info, and see the Master, Heapster, KubeDNS, Metrics-server, Grafana and InfluxDB all running:
Kubernetes master is running at https://10.91.211.64:443
Heapster is running at https://10.91.211.64:443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://10.91.211.64:443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://10.91.211.64:443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
Grafana is running at https://10.91.211.64:443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
InfluxDB is running at https://10.91.211.64:443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy
along with juju status looking like everything is up and running:
Model                         Controller                Cloud/Region         Version  SLA          Timestamp
conjure-canonical-kubern-931  conjure-up-localhost-673  localhost/localhost  2.4.3    unsupported  02:01:00Z
App                    Version  Status  Scale  Charm                  Store       Rev  OS      Notes
easyrsa                3.0.1    active      1  easyrsa                jujucharms  195  ubuntu
etcd                   3.2.10   active      3  etcd                   jujucharms  378  ubuntu
flannel                0.10.0   active      5  flannel                jujucharms  351  ubuntu
kubeapi-load-balancer  1.14.0   active      1  kubeapi-load-balancer  jujucharms  525  ubuntu  exposed
kubernetes-master      1.13.2   active      2  kubernetes-master      jujucharms  542  ubuntu
kubernetes-worker      1.13.2   active      3  kubernetes-worker      jujucharms  398  ubuntu  exposed
Unit                      Workload  Agent  Machine  Public address  Ports           Message
easyrsa/0*                active    idle   0        10.91.211.138                   Certificate Authority connected.
etcd/0                    active    idle   1        10.91.211.120   2379/tcp        Healthy with 3 known peers
etcd/1*                   active    idle   2        10.91.211.205   2379/tcp        Healthy with 3 known peers
etcd/2                    active    idle   3        10.91.211.41    2379/tcp        Healthy with 3 known peers
kubeapi-load-balancer/0*  active    idle   4        10.91.211.64    443/tcp         Loadbalancer ready.
kubernetes-master/0       active    idle   5        10.91.211.181   6443/tcp        Kubernetes master running.
  flannel/0*              active    idle            10.91.211.181                   Flannel subnet 10.1.50.1/24
kubernetes-master/1*      active    idle   6        10.91.211.218   6443/tcp        Kubernetes master running.
  flannel/1               active    idle            10.91.211.218                   Flannel subnet 10.1.85.1/24
kubernetes-worker/0*      active    idle   7        10.91.211.29    80/tcp,443/tcp  Kubernetes worker running.
  flannel/4               active    idle            10.91.211.29                    Flannel subnet 10.1.94.1/24
kubernetes-worker/1       active    idle   8        10.91.211.70    80/tcp,443/tcp  Kubernetes worker running.
  flannel/3               active    idle            10.91.211.70                    Flannel subnet 10.1.46.1/24
kubernetes-worker/2       active    idle   9        10.91.211.167   80/tcp,443/tcp  Kubernetes worker running.
  flannel/2               active    idle            10.91.211.167                   Flannel subnet 10.1.30.1/24
Entity  Meter status  Message
model   amber         user verification pending

Machine  State    DNS            Inst id        Series  AZ  Message
0        started  10.91.211.138  juju-86bdea-0  bionic      Running
1        started  10.91.211.120  juju-86bdea-1  bionic      Running
2        started  10.91.211.205  juju-86bdea-2  bionic      Running
3        started  10.91.211.41   juju-86bdea-3  bionic      Running
4        started  10.91.211.64   juju-86bdea-4  bionic      Running
5        started  10.91.211.181  juju-86bdea-5  bionic      Running
6        started  10.91.211.218  juju-86bdea-6  bionic      Running
7        started  10.91.211.29   juju-86bdea-7  bionic      Running
8        started  10.91.211.70   juju-86bdea-8  bionic      Running
9        started  10.91.211.167  juju-86bdea-9  bionic      Running
I've been experimenting with an ICP instance (ICP 2.1.0.2): 1 master node and 2 worker nodes.
I noticed that the pods in my ICP Kubernetes cluster don't have outbound Internet connectivity (or are having DNS lookup issues).
For example, if I start up a busybox pod in my cluster and try to do "nslookup github.com" or "ping google.com", it fails:
kubectl run curl --image=radial/busyboxplus:curl -i --tty
[ root@curl-545bbf5f9c-gssbg:/ ]$ nslookup github.com
Server: 10.0.0.10
Address 1: 10.0.0.10
nslookup: can't resolve 'github.com'
I checked and saw that "kube-dns" (service, pod, daemonset.extensions, daemonset.apps) does appear to be running.
When I'm logged in (e.g. via SSH) to the ICP master and worker node machines, I am able to ping these external sites successfully.
Any suggestions for how to troubleshoot this problem? Thanks!
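(For anyone triaging the same symptom: the generic first checks from the Kubernetes DNS debugging guide are below; the pod name is the one from my session above, everything else is stock kubectl.)

# is kube-dns itself running?
kubectl -n kube-system get pods -l k8s-app=kube-dns
# does in-cluster DNS work at all? (this name never leaves the cluster)
kubectl exec -ti curl-545bbf5f9c-gssbg -- nslookup kubernetes.default
# which resolver is the pod actually using?
kubectl exec -ti curl-545bbf5f9c-gssbg -- cat /etc/resolv.conf

If kubernetes.default resolves but github.com doesn't, the problem is in upstream forwarding rather than in cluster DNS itself.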
We had kind of the reverse problem: we could look up anything on the internet or in other domains, but not the domain in which the cluster was deployed.
That turned out to be caused by the vague documentation around what cluster_domain and cluster_CA_domain mean in config.yaml. But as a plus we got to learn a bit more about those and about configuring kube-dns.
Basically, cluster_domain should be a private virtual domain for the cluster, for which kube-dns will be authoritative. For anything else it should use the host's resolv.conf nameservers as upstream servers. If you suspect that your DNS servers are not being used for public DNS, you can update the kube-dns ConfigMap to specify the upstream servers it should use, as sketched after the link below:
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
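A minimal sketch of that ConfigMap, assuming the stock kube-dns deployment in kube-system; the Google resolvers are placeholders for your real upstream servers:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]

Apply it with kubectl apply -f; kube-dns watches the ConfigMap and should pick the change up without a restart.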
This is assuming you have configured cluster_domain and cluster_CA_domain correctly, of course.
They should look something like:
cluster_domain: mycluster.icp        # could be "Mickey-mouse" for all it matters
cluster_CA_domain: icp.mycompany.com # the endpoint that portal/registry/api etc. are accessible on for users
I am trying to configure a two-node (node1 and node2) HA cluster using Pacemaker on CentOS 7. I executed the steps below on both nodes:
yum install pcs
systemctl enable pcsd.service pacemaker.service corosync.service
systemctl start pcsd.service
passwd hacluster
After that, I executed the command below on node1:
pcs cluster auth node1 node2
I am getting the error below:
Error: Unable to communicate with node2
Error: Unable to communicate with node1
I have also verified that both nodes are listening on port 2224, and used telnet to confirm that each node can connect to the other on 2224.
Need help.
The issue got resolved after using FQDNs instead of hostnames (node1.demo.in, node2.demo.in). The command below worked fine:
pcs cluster auth node1.demo.in node2.demo.in
I don't know the exact cause of this. Any idea?
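My guess, and it is only an assumption not verified on this cluster, is that the bare short names didn't resolve to the right addresses on one or both nodes, so pcsd's node-to-node HTTPS calls went nowhere even though the manual telnet test worked. If so, /etc/hosts entries like these (addresses are placeholders) should make the short names work too:

# /etc/hosts on both nodes (example addresses)
192.168.122.101  node1.demo.in  node1
192.168.122.102  node2.demo.in  node2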
I have a problem similar to How to access externally to consul UI but I can't get the combinations of network options to work right.
I'm on OSX using Docker for Mac, not the old docker-machine stuff, and the official consul Docker image, not the progrium/consul image.
I can start up a 3-node server cluster fine using
docker run -d --name node1 -h node1 consul agent -server -bootstrap-expect 3
JOIN_IP="$(docker inspect -f '{{.NetworkSettings.IPAddress}}' node1)"
docker run -d --name node2 -h node2 consul agent -server -join $JOIN_IP
docker run -d --name node3 -h node3 consul agent -server -join $JOIN_IP
So far so good, they're connected to each other and working fine. Now I want to start an agent, and view the UI via it.
I tried a bunch of combinations of -client and -bind, which seem to be the key to all of this. Using
docker run -d -p 8500:8500 --name node4 -h node4 consul agent -join $JOIN_IP -ui -client=0.0.0.0 -bind=127.0.0.1
I can get the UI via http://localhost:8500/ui/, and consul members shows all the nodes:
docker exec -t node4 consul members
Node   Address          Status  Type    Build  Protocol  DC
node1  172.17.0.2:8301  alive   server  0.7.1  2         dc1
node2  172.17.0.3:8301  alive   server  0.7.1  2         dc1
node3  172.17.0.4:8301  alive   server  0.7.1  2         dc1
node4  127.0.0.1:8301   alive   client  0.7.1  2         dc1
But all is not well: the UI tells me node4 is "Agent not live or unreachable", and its logs contain a whole bunch of
2016/12/19 18:18:13 [ERR] memberlist: Failed to send ping: write udp 127.0.0.1:8301->172.17.0.4:8301: sendto: invalid argument
I've tried a bunch of other combinations; --net=host just borks things up on OSX.
If I try -bind=<my box's external IP>, it won't start:
Error starting agent: Failed to start Consul client: Failed to start lan serf: Failed to create memberlist: Failed to start TCP listener. Err: listen tcp 192.168.1.5:8301: bind: cannot assign requested address
I also tried mapping all the other ports, including the UDP ports (-p 8500:8500 -p 8600:8600 -p 8400:8400 -p 8300-8302:8300-8302 -p 8600:8600/udp -p 8301-8302:8301-8302/udp), but that didn't change anything.
How can I join a node up to this cluster and view the UI?
Try using the 0.7.2 release of Consul and start the agent using the following (beta as of 0.7.2, final by 0.8.0) syntax:
$ docker run -d -p 8500:8500 --name node4 -h node4 consul agent -join $JOIN_IP -ui -client=0.0.0.0 -bind='{{ GetPrivateIP }}'
The change is the argument to -bind: with that template, Consul renders out the node's private IP address at startup. The other template parameters are documented in hashicorp/go-sockaddr.
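-bind is the address the agent advertises for cluster (gossip) communication, which is why binding node4 to 127.0.0.1 made the other nodes' pings fail. If GetPrivateIP picks the wrong interface, go-sockaddr also lets you name one explicitly; the interface name here is an example:

# bind to a specific interface inside the container instead
docker run -d -p 8500:8500 --name node4 -h node4 consul agent -join $JOIN_IP -ui -client=0.0.0.0 -bind='{{ GetInterfaceIP "eth0" }}'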
I have a 6-machine Mesos cluster (3 masters and 3 slaves). I can access the Mesos UI at 172.16.8.211:5050 and it works correctly, redirecting to the leader if that node isn't it. I can also access the Marathon UI at 172.16.8.211:8080 and it works correctly. In short, before configuring and running the Consul cluster, Marathon worked well.
My problem starts when I configure and run a Consul cluster whose 3 servers are the Mesos masters and whose 3 clients are the Mesos slaves. If I execute consul members, everything looks fine: all the members are alive and working together.
But now I can't access the Marathon UI any more, and if I go to 'Frameworks' in the Mesos UI, the Marathon framework doesn't appear.
ikerlan@client3:~$ consul members
Node     Address            Status  Type    Build  Protocol  DC
client3  172.16.8.216:8301  alive   client  0.5.2  2         nyc2
client2  172.16.8.215:8301  alive   client  0.5.2  2         nyc2
server2  172.16.8.212:8301  alive   server  0.5.2  2         nyc2
server3  172.16.8.213:8301  alive   server  0.5.2  2         nyc2
client1  172.16.8.214:8301  alive   client  0.5.2  2         nyc2
server1  172.16.8.211:8301  alive   server  0.5.2  2         nyc2
In the Slaves tab of the Mesos UI I can see the following:
- Mesos version: 0.27.0
- Marathon version: 0.15.1
Which log files would show something related to this issue?
What could be the problem?
Solution:
I saw in the Marathon logs (/var/log/syslog) that the problem was DNS. So I added the IPs of the other cluster hosts to /etc/hosts on each machine, and that resolved the problem; now it works perfectly.
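For reference, a sketch of the /etc/hosts entries implied by the consul members output above (the same block on every machine):

172.16.8.211  server1
172.16.8.212  server2
172.16.8.213  server3
172.16.8.214  client1
172.16.8.215  client2
172.16.8.216  client3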
You can also add all the cluster hosts to the ZooKeeper config file; that would work as well.