ibm-cloud-private: DNS or Internet issues from inside the pods

I've been experimenting with an ICP instance (ICP 2.1.0.2): 1 master node and 2 worker nodes.
I noticed that the pods in my ICP Kubernetes cluster don't have outbound Internet connectivity (or are having DNS lookup issues).
For example, if I start up a busybox pod in my cluster and try "nslookup github.com" or "ping google.com", it fails:
kubectl run curl --image=radial/busyboxplus:curl -i --tty
[ root@curl-545bbf5f9c-gssbg:/ ]$ nslookup github.com
Server: 10.0.0.10
Address 1: 10.0.0.10
nslookup: can't resolve 'github.com'
I checked and saw that kube-dns (service, pod, daemonset.extensions, daemonset.apps) does appear to be running.
When I'm logged in (e.g. via SSH) to the ICP master and worker node machines, I am able to ping these external sites successfully.
Any suggestions for how to troubleshoot this problem? Thanks!

We had kind of the reverse problem: we could look up anything on the internet or in other domains, but not the domain in which the cluster was deployed.
That turned out to be caused by the vague documentation around what cluster_domain and cluster_CA_domain mean in config.yaml. But as a plus we got to learn a bit more about those and about configuring kube-dns.
Basically, cluster_domain should be a private virtual domain for the cluster, for which kube-dns will be authoritative. For anything else, it should use the host's resolv.conf nameservers as upstream servers. If you suspect that your DNS servers are not being used for public DNS, you can update the kube-dns ConfigMap to specify the upstream servers it should use.
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
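For example, following that page, pointing kube-dns at specific upstream resolvers comes down to a ConfigMap like this (a sketch; 10.0.0.2 and 10.0.0.3 stand in for your own DNS servers):
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["10.0.0.2", "10.0.0.3"]
EOF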
This is assuming you have configured cluster_domain and cluster_CA_domain correctly, of course.
They should look something like this:
cluster_domain = mycluster.icp <----- could be "Mickey-mouse" for all it matters
cluster_CA_domain = icp.mycompany.com <----- the endpoint that portal/registry/api etc are accessible to users on

Related

Make dnsmasq NOT serve the local server?

Is there a way to configure dnsmasq so that it only serves DNS requests from remote systems? What I want is the following ('internet' means outside my network):
- Any DNS request from a program on the dnsmasq machine just uses the internet DNS servers and ignores dnsmasq.
- A DNS request from a remote system directed at the dnsmasq machine receives a response from dnsmasq, which could be a locally configured response or one that dnsmasq has relayed on to the internet DNS servers.
- Optionally, restrict the IP addresses of remote systems permitted to query the dnsmasq machine.
FYI, my use case is needing to patch/respond locally to requests from an embedded system, to add resilience when a remote server is down. I can't change the queried hostname, so I want to be able to locally spoof the IP address, but only for specific queries from this embedded system, which I've manually pointed at my local dnsmasq server.
After much digging, I stumbled across the answer. The magic incantation required to achieve this is:
$ echo DNSMASQ_EXCEPT=lo | sudo tee --append /etc/default/dnsmasq
$ sudo systemctl restart dnsmasq
Look up DNSMASQ_EXCEPT for details, but basically this stops dnsmasq from providing DNS services on the lo interface.
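If your distribution doesn't ship an /etc/default/dnsmasq, the same effect should be achievable with dnsmasq's own config, and the optional source-IP restriction from the question can be handled at the firewall. A sketch (192.168.1.50 stands in for the embedded system's address; matching rules for TCP port 53 may be needed as well):
echo 'except-interface=lo' | sudo tee --append /etc/dnsmasq.conf    # don't serve DNS on loopback
sudo systemctl restart dnsmasq
sudo iptables -A INPUT -p udp --dport 53 -s 192.168.1.50 -j ACCEPT  # allow only the embedded system
sudo iptables -A INPUT -p udp --dport 53 -j DROP                    # drop DNS queries from anyone else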

How to configure kube-proxy bind IP address?

For testing purposes, I want to set up the kubernetes master to be only accessible from the local machine and not the outside. Ultimately I am going to run a proxy server docker container on the machine that is opened up to the outside. This is all inside a minikube VM.
I figure configuring kube-proxy is the way to go. I did the following:
kubeadm config view > ~/cluster.yaml
# edit proxy bind address
vi ~/cluster.yaml
kubeadm reset
rm -rf /data/minikube
kubeadm init --config cluster.yaml
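For context, the kube-proxy bind address inside cluster.yaml lives in the kube-proxy component configuration; depending on the kubeadm version it is either a kubeProxy.config section of the main document or a standalone KubeProxyConfiguration document. A sketch of the latter (field names per the kubeproxy.config.k8s.io/v1alpha1 API; not necessarily the exact file here):
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
bindAddress: 127.0.0.1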
Upon doing netstat -ln | grep 8443 I see "tcp 0 0 :::8443 :::* LISTEN", which means it didn't take the IP.
I have also tried kubeadm init --apiserver-advertise-address 127.0.0.1, but that only changes the advertised address to 10.x.x.x in kubeadm config view. I feel that is probably the wrong thing anyway; I don't want the API server to be inaccessible to the other docker containers that need to access it.
I have also tried kubeadm config upload from-file --config ~/cluster.yaml and then attempting to manually restart the docker container running kube-proxy.
I also tried restarting the machine/cluster after the kubeadm config change, but couldn't figure that out. When you reboot a minikube VM by hand, the kubeadm command disappears and not even docker is running. Various online methods of restarting things don't seem to work either (I could just be doing this wrong).
I also tried editing the kube-proxy container's config file (bound to a local dir), but that gets overwritten when I restart the container. I don't get it.
There's nothing in the Kubernetes dashboard that allows me to edit the config file of kube-proxy either (since it's a DaemonSet).
Ultimately, I wish to use an authenticated proxy server sitting in front of the k8s master (the apiserver specifically). Direct access to the k8s master from outside the VM must not be possible.
Thanks
You could limit it via the local network configuration (firewall, routes).
As far as I know, the API needs to be accessible at least via the local network where the other nodes reside, unless you want a single-node "cluster".
So, if you do not have a separate network card that you could advertise or bind the address to, you need to limit access with the above-mentioned firewall or route rules.
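As a rough sketch of the firewall approach (assuming 8443 is the apiserver port in this minikube setup and iptables is available in the VM):
sudo iptables -A INPUT -p tcp --dport 8443 -i lo -j ACCEPT  # allow connections arriving on loopback
sudo iptables -A INPUT -p tcp --dport 8443 -j DROP          # drop apiserver connections from anywhere else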
Regarding your original question, did you look into this issue? https://github.com/kubernetes/kubernetes/issues/39586

How can I troubleshoot/fix an issue interacting with a running Kubernetes pod (timeout error)?

I have two EC2 instances, one running a Kubernetes master node and the other running a worker node. I can successfully create a pod from a deployment file that pulls a docker image, and it starts with a status of "Running". However, when I try to interact with it, I get a timeout error.
Ex: kubectl logs <pod-name> -v6
Output:
Config loaded from file /home/ec2-user/.kube/config
GET https://<master-node-ip>:6443/api/v1/namespaces/default/pods/<pod-name> 200 OK in 11 milliseconds
GET https://<master-node-ip>:6443/api/v1/namespaces/default/pods/<pod-name>/log 500 Internal Server Error in 30002 milliseconds
Server response object: [{"status": "Failure", "message": "Get https://<worker-node-ip>:10250/containerLogs/default/<pod-name>/<container-name>: dial tcp <worker-node-ip>:10250: i/o timeout", "code": 500 }]
I can get information about the pod by running kubectl describe pod <pod-name> and confirm the status as Running. Any ideas on how to identify exactly what is causing this error and/or how to fix it?
Probably you didn't install a network add-on in your Kubernetes cluster. It's not included in a kubeadm installation, but it's required for communication between pods scheduled on different nodes. The most popular are Calico and Flannel. As you already have a cluster, you may want to choose the network add-on that uses the same subnet you specified with kubeadm init --pod-network-cidr=xx.xx.xx.xx/xx during cluster initialization:
192.168.0.0/16 is the default for the Calico network add-on
10.244.0.0/16 is the default for the Flannel network add-on
You can change it by downloading the corresponding YAML file and replacing the default subnet with the subnet you want. Then just apply it with kubectl apply -f filename.yaml.
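For Flannel, for example, the flow might look like this (a sketch; this manifest URL is the one commonly referenced at the time of writing and may have moved since):
curl -LO https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# edit "Network": "10.244.0.0/16" in the net-conf.json section to match your --pod-network-cidr
kubectl apply -f kube-flannel.yml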

UnknownHostException within Docker Container on Alpine openjdk:8-jdk-alpine

This is very similar to the following question; however, the solution/answer to that previous question doesn't solve the problem.
In my case I'm not connecting to MySQL specifically, but trying to resolve www.google.com results in the same UnknownHostException, and only within the container. When I run from just the JVM, not within a container, on my Mac, there are no issues resolving.
Same scenario where:
InetAddress ip = InetAddress.getByName("www.google.com");
I've tried the following suggested fix:
RUN echo 'hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4' >> /etc/nsswitch.conf
as well as..
RUN echo "hosts: files dns" >> /etc/nsswitch.conf
Neither seems to do the trick.
Are there any other suggestions out there, or anything I'm missing beyond what I've tried above?
Thanks in advance.
It turns out the solution was fairly simple, and there are a couple of options.
For you experts this is probably funny, but at least it's one more idea for the next guy.
So what I've found is I can specify the DNS on each of the nodes in my swarm via:
/etc/docker/daemon.json
{
  "dns": ["10.0.0.2", "8.8.8.8"]
}
After setting this on each node, specifically 8.8.8.8 for Google's DNS, "google.com" resolved with no problem. Note that while it's Google's own DNS server, it is a public resolver; Yahoo, Amazon, etc. all resolved too. The 10.0.0.2 address would be any other DNS server you want to specify, and you can specify multiple.
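Note that editing daemon.json alone isn't enough; the Docker daemon on each node has to be restarted to pick up the change (assuming a systemd-based host):
sudo systemctl restart docker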
This came from the following post: Fix Docker's networking DNS config
However, it's even easier if you want to specify the DNS via your compose/stack file.
Rather than going to each node in your swarm and updating the daemon.json DNS entries, you can specify the DNS directly in your compose file:
version: '3.3'
services:
  my-sample-service:
    image: my-repo/my-sample:1.0.0
    ports:
      - "8081:8080"
    networks:
      - my-network
    dns:
      - 10.0.0.1  # whatever your internal DNS is, priority 1
      - 10.0.0.2  # whatever other DNS you'd provide, priority 2
      - 8.8.8.8   # Google's public address; even though Docker documents this
                  # as a default, it wasn't working until I set it explicitly.
                  # This is the last one checked if no resolution happened on
                  # the first two.
networks:
  my-network:
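With that in place, redeploying the stack applies the DNS settings to the service's containers (the stack and file names here are just placeholders):
docker stack deploy --compose-file docker-compose.yml my-sample-stack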

Web UI redirection issue

I am running IBM Cloud Private using 5 VMs on my laptop. My home network subnet is 192.168.100.0/24, whereas the subnet used by all 5 VMs is 192.168.142.0/24. I am forwarding port 8443 in VMware Workstation from the host to the master node, which is 192.168.142.103. My laptop IP is 192.168.100.201.
I was hoping I would be able to access the Web UI from any other machine in my home network, so I tried this URL from another machine:
https://192.168.100.201:8443
And it redirects properly to the guest VM, as I see the URL change to:
https://192.168.100.201:8443/console/
But after a few seconds, I get the message that the site cannot be reached. I noticed that the URL has changed from the original host laptop address of 192.168.100.201 to the guest VM address 192.168.142.103, as shown:
https://192.168.142.103:8443/idauth/oidc/endpoint/OP/authorize?client_id=617a0480d5e506a5e797f852bea1df38&response_type=code&scope=openid%20email%20profile&redirect_uri=https://192.168.100.201:8443/auth/liberty/callback
This suggests that the redirection in the Web UI is not handled properly.
However, I installed kubectl for Windows on another machine, forwarded port 8001 from 192.168.100.201 to the master guest VM 192.168.142.103, and ran the kubectl config commands (from the Web UI's Configure Client option) on my other laptop (192.168.100.202):
kubectl config set-cluster pot_icp_cluster.icp --server=https://192.168.100.201:8001 --insecure-skip-tls-verify=true
kubectl config set-context pot_icp_cluster.icp-context --cluster=pot_icp_cluster.icp
kubectl config set-credentials admin --token=<token>
kubectl config set-context pot_icp_cluster.icp-context --user=admin --namespace=default
kubectl config use-context pot_icp_cluster.icp-context
And this works perfectly: I am able to run kubectl commands from the other laptop (192.168.100.202) against the VMs running on the first laptop (192.168.100.201), using port forwarding the same way I did for the Web UI.
My question is: is there something I can do to get this redirection problem fixed in the Web UI?
I received a reply from an expert: the Liberty server that authenticates and verifies a login has only the master node's IP address registered with it as a callback URL during installation. In IBM Cloud Private 2.1.0.1 there is no direct way to register new clients. However, this limitation is being fixed, and starting with the next upgrade we should be able to register new clients dynamically post-install as well.
