Kubernetes breaks (no response from kubectl) when I have too many Pods running in the cluster (1000 pods).
There are more than enough resources (CPU and memory), so it seems to me that some kind of controller is breaking and unable to handle a large number of Pods.
The workload I need to run is massively parallel, hence the high number of Pods.
In fact, I would like to be able to run many times more than 1,000 Pods, maybe even 100,000 Pods.
My Kubernetes master node is an AWS EC2 m4.xlarge instance.
My intuition tells me it is the master node's network performance that is holding the cluster back, but I am not sure.
Any ideas?
Details:
I am running 1000 Pods in a Deployment.
When I do kubectl get deploy, it shows:
DESIRED CURRENT UP-TO-DATE AVAILABLE
1000 1000 1000 458
and through my application-side DB, I can see that there are only 458 Pods working.
When I do kops validate cluster, I receive the warning:
VALIDATION ERRORS
KIND              NAME                                                     MESSAGE
ComponentStatus   controller-manager                                       component is unhealthy
ComponentStatus   scheduler                                                component is unhealthy
Pod               kube-system/kube-controller-manager-<ip>.ec2.internal    kube-system pod "kube-controller-manager-<ip>.ec2.internal" is not healthy
Pod               kube-system/kube-scheduler-<ip>.ec2.internal             kube-system pod "kube-scheduler-<ip>.ec2.internal" is not healthy
The fact that it takes a long time to list your Pods is not really about your nodes; they will handle as many Pods as their resources (CPU and memory) allow.
The issue you are seeing is more about whether the kube-apiserver can query and return a large number of Pods or other resources.
The two contention points here are the kube-apiserver and etcd, where the state of everything in a Kubernetes cluster is stored. Focus on optimizing those two components and you will get faster responses from, say, kubectl get pods. (Networking is another contention point, but only if you are issuing kubectl commands over a slow connection.) On the client side, paginating large list requests can also help; see the sketch after the list below.
You can try:
Setting up an HA external etcd cluster with pretty beefy machines and fast disks.
Upgrading the machines where your kube-apiserver(s) live.
Following the additional guidelines described here.
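If part of the pain is simply listing thousands of Pods from a client, reasonably recent apiservers also support chunked list requests (the limit/continue parameters), so a client can pull the list in pages instead of one huge response. A minimal client-go sketch, assuming a kubeconfig in the default location and client-go v0.18+ call signatures (the page size of 500 is an arbitrary choice):

// pagedpods.go: list a large number of Pods in pages instead of one huge response.
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: kubeconfig at the default location (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	opts := metav1.ListOptions{Limit: 500} // ask the apiserver for 500 Pods per page
	total := 0
	for {
		pods, err := client.CoreV1().Pods("").List(context.TODO(), opts)
		if err != nil {
			log.Fatal(err)
		}
		total += len(pods.Items)
		if pods.Continue == "" { // no more pages to fetch
			break
		}
		opts.Continue = pods.Continue
	}
	fmt.Println("total pods:", total)
}

This only eases the client side, though; the scheduling and status-update load from 1,000+ Pods still lands on the apiserver, etcd, controller-manager and scheduler, which is why the points above matter more.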
Related
Does letting the kubelet write the logs that pods emit on their stdout/stderr consume more resources (CPU and memory) than if the pods wrote their logs directly to a mounted volume? I ask because the logs are copied from the pod to the kubelet across processes. Or does the disk I/O from N pods, instead of one kubelet, offset any advantage?
I recognize that not letting the kubelet handle the logs means kubectl logs ... would not show recent logs. But otherwise, could there be a performance advantage to not writing through an intermediary like the kubelet?
I need some help monitoring microservices that are running on an AKS cluster using an existing Prometheus that runs on a different node.
How can we detect and set up alerts for a number of scenarios around pods, such as
out-of-memory issues and pods getting restarted? I have checked a few articles on the internet, but most of them address the case in which both the microservices and Prometheus run on the same AKS cluster. In my case, Prometheus is already configured on a different node and we are using node_exporter and blackbox_exporter to monitor other servers. I do not want to disturb the existing Prometheus setup.
Any suggestion would be greatly appreciated!
Thanks,
Sharmila
I guess I am just asking for confirmation, really, as we had some major issues in the past with our Elasticsearch cluster on Kubernetes.
Is it fine to add a pod affinity rule to an already running Deployment? This is a live production Elasticsearch cluster, and I want to pin the Elasticsearch pods to specific nodes with large storage.
I kind of understand Kubernetes but not really Elasticsearch, so I don't want to cause any production issues/outages, as there is no one around who could really help fix it.
Currently running 6 replicas, but I want to reduce to 3 that run on 3 worker nodes with plenty of storage.
I have labelled my 3 worker nodes with the label 'priority-elastic-node=true'.
This is the podAffinity I will add to my YAML file and apply:
podAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: priority-elastic-node
        operator: In
        values:
        - "true"
    topologyKey: "kubernetes.io/hostname"
What I assume will happen is nothing immediately after I apply it, but then when I start scaling down the Elasticsearch replicas, the remaining pods stay on the preferred worker nodes.
Any change to the pod template will cause the Deployment to roll all pods, and that includes a change to those fields. So it's fine to make the change, but your Elasticsearch cluster will be restarted, one pod at a time. This should be fine as long as your replication settings are cromulent.
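For reference, here is a minimal client-go sketch of making that pod-template change programmatically; the deployment name "elasticsearch", the "default" namespace, the weight of 100 and the kubeconfig location are assumptions, not details from the question. The Go types also show the full shape the API expects for a preferred pod-affinity term (a weight plus a podAffinityTerm). One thing worth double-checking before applying either version: a podAffinity labelSelector matches Pod labels, while priority-elastic-node=true was applied to the nodes, so confirm whether nodeAffinity is what you actually intend.

// addaffinity.go: add a preferred pod-affinity term to an existing Deployment's
// pod template. Updating the template triggers the rolling restart described above.
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: kubeconfig at the default location (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	deps := client.AppsV1().Deployments("default")                               // assumed namespace
	dep, err := deps.Get(context.TODO(), "elasticsearch", metav1.GetOptions{}) // assumed name
	if err != nil {
		log.Fatal(err)
	}

	dep.Spec.Template.Spec.Affinity = &corev1.Affinity{
		PodAffinity: &corev1.PodAffinity{
			PreferredDuringSchedulingIgnoredDuringExecution: []corev1.WeightedPodAffinityTerm{{
				Weight: 100, // required by the API for preferred terms
				PodAffinityTerm: corev1.PodAffinityTerm{
					LabelSelector: &metav1.LabelSelector{
						MatchExpressions: []metav1.LabelSelectorRequirement{{
							Key:      "priority-elastic-node",
							Operator: metav1.LabelSelectorOpIn,
							Values:   []string{"true"},
						}},
					},
					TopologyKey: "kubernetes.io/hostname",
				},
			}},
		},
	}

	if _, err := deps.Update(context.TODO(), dep, metav1.UpdateOptions{}); err != nil {
		log.Fatal(err)
	}
}

You can then follow the rolling restart with kubectl rollout status deployment/elasticsearch.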
I've done quite a bit of research and have yet to find an answer to this. Here's what I'm trying to accomplish:
I have an ELK stack container running in a pod on a k8s cluster in GCE - the cluster also contains a PersistentVolume (format: ext4) and a PersistentVolumeClaim.
In order to scale the ELK stack to multiple pods/nodes and keep persistent data in Elasticsearch, I either need to have all pods write to the same PV (using the node/index structure of the ES file system), or have some volume logic to scale up/create these PVs/PVCs.
Currently what happens is if I spin up a second pod on the replication controller, it can't mount the PV.
So I'm wondering if I'm going about this the wrong way, and what is the best way to architect this solution to allow for persistent data in ES when my cluster/nodes autoscale.
Persistent Volumes have access semantics. On GCE, I'm assuming you are using a Persistent Disk, which can either be mounted as writable by a single pod or by multiple pods as read-only. If you want multi-writer semantics, you need to set up NFS or some other storage backend that lets you write from multiple pods.
In case you are interested in running NFS - https://github.com/kubernetes/kubernetes/blob/release-1.2/examples/nfs/README.md
FYI: We are still working on supporting auto-provisioning of PVs as you scale your deployment. As of now it is a manual process.
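To illustrate the access-mode distinction in code, here is a minimal client-go sketch that registers an NFS-backed PersistentVolume with ReadWriteMany access; the NFS server address, export path, size and object name are placeholders, not details from your cluster, and the call signatures assume client-go v0.18+. A GCE Persistent Disk volume, by contrast, only offers ReadWriteOnce or ReadOnlyMany.

// nfspv.go: create an NFS-backed PV that multiple pods can mount read-write.
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: kubeconfig at the default location (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	pv := &corev1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{Name: "es-data-pv"}, // placeholder name
		Spec: corev1.PersistentVolumeSpec{
			Capacity:    corev1.ResourceList{corev1.ResourceStorage: resource.MustParse("100Gi")},
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteMany}, // multi-writer
			PersistentVolumeSource: corev1.PersistentVolumeSource{
				// Placeholder NFS server address and export path.
				NFS: &corev1.NFSVolumeSource{Server: "10.0.0.10", Path: "/exports/es"},
			},
		},
	}
	if _, err := client.CoreV1().PersistentVolumes().Create(context.TODO(), pv, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
}

A PersistentVolumeClaim requesting ReadWriteMany can then bind to this volume and be mounted by all of the ELK pods at once.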
I am a newbie in Go. I want to get the storage statistics of the nodes and the cluster in Kubernetes using Go code. How can I get the free and used storage of Kubernetes nodes and the cluster using Go?
This is actually two problems:
1. How do I perform HTTP requests to the Kubernetes master?
See [1] for more details. Tl;dr you can access the apiserver in at least 3 ways:
a. kubectl get nodes (not Go)
b. kubectl proxy, followed by a Go HTTP client pointed at the proxy URL
c. Running inside a pod in the Kubernetes cluster
2. What are the requests I need to make to get node stats?
a. Run kubectl describe node; it should show you resource information.
b. Now run kubectl describe node --v=7; it should show you the underlying REST calls.
I also think you should reformat the title of your question per https://stackoverflow.com/help/how-to-ask, so it reflects what you're really asking.
[1] https://github.com/kubernetes/kubernetes/blob/release-1.0/docs/user-guide/accessing-the-cluster.md
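To make option (c) concrete, here is a minimal client-go sketch that runs inside a pod and prints the storage figures the apiserver exposes in each node's status (capacity and allocatable for the ephemeral-storage resource). The in-cluster service account permissions and a cluster recent enough to report ephemeral-storage are assumptions; for live free/used numbers you would need the kubelet's stats summary or a metrics add-on rather than the Node object alone.

// nodestorage.go: print each node's ephemeral-storage capacity and allocatable
// as advertised in the node status.
package main

import (
	"context"
	"fmt"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Assumption: running inside the cluster with a service account allowed to list nodes.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}

	for _, n := range nodes.Items {
		capacity := n.Status.Capacity[corev1.ResourceEphemeralStorage]
		allocatable := n.Status.Allocatable[corev1.ResourceEphemeralStorage]
		// These are scheduling-level figures, not a live df of the node's disks.
		fmt.Printf("%s: storage capacity=%s allocatable=%s\n", n.Name, capacity.String(), allocatable.String())
	}
}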