Colocate pods in Kubernetes - performance

I have a Kubernetes cluster which is spread across 2 zones: zone1 and zone2.
I have 2 applications: a web application and a database. The web application's configurations are stored in the database. Both the application and the database are deployed as stateful applications.
The idea is to deploy 2 replicas of the web application (application-0 and application-1) and 2 replicas of the database (database-0 and database-1); application-0 points to database-0, and application-1 points to database-1.
Pod anti-affinity has been enabled, so preferably application-0 and application-1 will not be in the same zone, and database-0 and database-1 will not be in the same zone.
I want to ensure that application-0 and database-0 are in the same zone, and that application-1 and database-1 are in the other zone, so that the performance of the web application is not compromised. Is that possible?

If you want strict separation of the workloads across the two zones, I'd suggest using a nodeSelector on the node's zone label.
A similar result is possible with pod affinity, but it's more complex: to get the clean split you describe, you'd need the requiredDuringSchedulingIgnoredDuringExecution rules, which are usually best avoided unless you really need them.
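A minimal sketch of the nodeSelector approach, assuming the standard topology.kubernetes.io/zone node label and the zone names from the question (since all replicas of a StatefulSet share one pod template, this implies one single-replica StatefulSet per zone):
# pod template for application-0 and database-0; use zone2 for the -1 pair
spec:
  nodeSelector:
    topology.kubernetes.io/zone: zone1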

Just as you used anti-affinity rules to avoid provisioning both apps in the same zone, you can use affinity to provision app-0 only in the zone where db-0 exists. Technically that means you could drop anti-affinity from the app completely: if you tell the app to schedule only in the zone of its db, and the db is defined to spread across zones with anti-affinity, the app inherits the distribution from the database part.

PodAntiAffinity to spread across zones
For your database
metadata:
  labels:
    app: db
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - db
        topologyKey: "topology.kubernetes.io/zone"   # at most one db pod per zone
For your web-app
metadata:
  labels:
    app: web-app
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web-app
        topologyKey: "topology.kubernetes.io/zone"   # at most one web-app pod per zone
PodAffinity to co-locate pods on nodes
In addition, you add podAffinity to your web-app (not database) to co-locate it on nodes with your database.
With podAffinity added for your web-app:
metadata:
  labels:
    app: web-app
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web-app
        topologyKey: "topology.kubernetes.io/zone"   # spread web-app pods across zones
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - db
        topologyKey: "kubernetes.io/hostname"        # schedule only on nodes that run a db pod
Result
With both podAntiAffinity and podAffinity you will get web-app-X co-located with db-X, one pair per zone:
zone1: web-app-0, db-0
zone2: web-app-1, db-1
The Kubernetes documentation on podAffinity and podAntiAffinity has an example of co-locating a cache with an app on the same nodes.
Stable network identity for StatefulSets
To address one specific instance of a StatefulSet, create a headless Service (clusterIP: None) for the database StatefulSet. Each replica then gets a stable DNS name, which allows your web-apps to connect to a specific instance.
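A sketch of such a headless Service, assuming the database pods carry the app: db label from the manifests above and a hypothetical port:
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None          # headless: DNS returns the pod IPs directly
  selector:
    app: db
  ports:
  - port: 5432             # hypothetical database port
If the StatefulSet's serviceName is db, each replica gets a DNS name such as db-0.db.<namespace>.svc.cluster.local.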
Access the closest db instance
Now your web-apps can connect to db-0 and db-1 via the headless service. Let your web-apps connect to both initially, and use the one with the shortest response time - that one is most likely the one on the same node.

Related

Best practices for data storage with Elasticsearch and Kubernetes

After reading some documentation regarding Persistent Volumes in Kubernetes, I am wondering which would be the best setup (storage-wise) for running a highly available Elasticsearch cluster. I am not running the typical EFK (or ELK) setup; I am using Elasticsearch as a proper full-text search engine.
I've read the official Elastic documentation, but I find it quite lacking in clarity. According to "Kubernetes in Action", Chapter 6:
When an application running in a pod needs to persist data to disk and
have that same data available even when the pod is rescheduled to
another node, you can’t use any of the volume types we’ve mentioned so
far. Because this data needs to be accessible from any cluster node,
it must be stored on some type of network-attached storage (NAS).
So if I am not mistaken, I need a volume and to access it through PersistentVolumes and PersistentVolumeClaims with a Retain policy.
When looking at the official Volumes documentation, I get the feeling that one should define the volume type oneself. Though, when looking at a DigitalOcean guide, it does not seem there was any volume setup there.
I picked that tutorial, but there are dozens on Medium that all do the same thing.
So: which is the best setup for an Elasticsearch cluster? Of course, keeping in mind that no data within an index should be lost, and that it should remain possible to add pods (Kubernetes) or nodes (Elasticsearch) that can access the index.
A good pattern for deploying an Elasticsearch cluster in Kubernetes is to define a StatefulSet.
Because the StatefulSet replicates more than one Pod, you cannot simply reference a persistent volume claim. Instead, you need to add a persistent volume claim template to the StatefulSet definition.
In order for these replicated persistent volumes to work, you need to set up Dynamic Volume Provisioning with a StorageClass, which allows storage volumes to be created on demand.
In the DigitalOcean guide tutorial, the persistent volume claim template is as follows:
volumeClaimTemplates:
- metadata:
    name: data
    labels:
      app: elasticsearch
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: do-block-storage
    resources:
      requests:
        storage: 100Gi
Here, the StorageClass is do-block-storage. You can replace it with your own storage class.
Very interesting question.
You need to think of an Elasticsearch node as being equivalent to an Elasticsearch Pod in Kubernetes.
Kubernetes needs to hold the identity of each Pod so it can attach it to the correct PersistentVolumeClaim in case of an outage; this is where the StatefulSet comes in.
A StatefulSet will ensure the same PersistentVolumeClaim stays bound to the same Pod throughout its lifetime.
A PersistentVolume (PV) is a Kubernetes abstraction for storage on the provided hardware. This can be AWS EBS, DigitalOcean Volumes, etc.
I'd recommend having a look at the official Elasticsearch Helm chart: https://github.com/elastic/helm-charts/tree/master/elasticsearch
Also Elasticsearch Operator: https://operatorhub.io/operator/elastic-cloud-eck

Is it possible to find zone and region of the node my container is running on

I want to find the region and zone of my node; I need this to log monitoring data.
The Kubernetes spec and metadata don't provide this information. I checked out
https://github.com/kubernetes/client-go, which looks promising, but I can't find
the info I am looking for.
Any suggestions? Thanks
If you are using GKE, then the node's zone and region should be in the node's labels:
failure-domain.beta.kubernetes.io/region
failure-domain.beta.kubernetes.io/zone
topology.kubernetes.io/region
topology.kubernetes.io/zone
(The failure-domain.beta.* labels are the older, now-deprecated names for the topology.kubernetes.io/* labels.)
You can see node labels using kubectl get nodes --show-labels
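To show just those labels as columns:
kubectl get nodes -L topology.kubernetes.io/zone -L topology.kubernetes.io/region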
This is what I ended up doing.
Enabled Workload Identity
gcloud container clusters update <CLUSTER_NAME> \
--workload-pool=<PROJECT_ID>.svc.id.goog
Updated Node pool to use workload identity
gcloud container node-pools update <NODEPOOL_NAME> \
--cluster=<CLUSTER_NAME> \
--workload-metadata=GKE_METADATA
Created a Google service account (GSA) for my app
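For example (assuming <GSA_NAME> as the GSA name):
gcloud iam service-accounts create <GSA_NAME> --project=<PROJECT_ID>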
Bound my KSA to the GSA
gcloud iam service-accounts add-iam-policy-binding \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:<PROJECT_ID>.svc.id.goog[<K8S_NAMESPACE>/<KSA_NAME]" \
<GSA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com
Annotated the Kubernetes service account with the email address of the GSA
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: <GSA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com
  name: <KSA_NAME>
  namespace: <K8S_NAMESPACE>
This authenticated my container as the GSA, and I was able to get the metadata info using https://cloud.google.com/compute/docs/storing-retrieving-metadata
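For example, the node's zone can be read from the metadata server inside the container (it returns projects/<PROJECT_NUMBER>/zones/<ZONE>):
curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/zone"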
Not in any direct way. You could use the downward API to expose the node name to the pod, and then fetch the annotations/labels for that node. But that would require fairly broad read permissions (all nodes) so might be a security risk.
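A minimal sketch of the downward API part (the pod name and service account are hypothetical; the service account needs RBAC permission to get nodes):
apiVersion: v1
kind: Pod
metadata:
  name: zone-logger                  # hypothetical name
spec:
  serviceAccountName: node-reader    # hypothetical SA with "get nodes" permission
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "echo running on node $NODE_NAME && sleep 3600"]
    env:
    - name: NODE_NAME                # injected via the downward API
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
The app can then query the API server (e.g. via client-go) for that node and read its topology.kubernetes.io/zone label.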

Kubernetes - Apply pod affinity rule to live deployment

I guess I am just asking for confirmation really, as we have had some major issues in the past with our Elasticsearch cluster on Kubernetes.
Is it fine to add a pod affinity rule to an already running deployment? This is a live production Elasticsearch cluster and I want to pin the Elasticsearch pods to specific nodes with large storage.
I kind of understand Kubernetes but not really Elasticsearch, so I don't want to cause any production issues/outages, as there is no one around who could really help to fix it.
Currently running 6 replicas but want to reduce to 3 that run on 3 worker nodes with plenty of storage.
I have labelled my 3 worker nodes with the label 'priority-elastic-node=true'
This is the podAffinity I will add to my YAML file and apply:
podAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: priority-elastic-node
        operator: In
        values:
        - "true"
    topologyKey: "kubernetes.io/hostname"
What I assume will happen is that nothing changes right after I apply it, but then when I start scaling down the Elasticsearch replicas, the remaining pods stay on the preferred worker nodes.
Any change to the pod template will cause the deployment to roll all pods. That includes a change to those fields. So it’s fine to change, but your cluster will be restarted. This should be fine as long as your replication settings are cromulent.
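In practice, assuming (hypothetically) a Deployment named elasticsearch, applying the change and watching it roll looks like:
kubectl apply -f elasticsearch-deployment.yaml
kubectl rollout status deployment/elasticsearch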

How to attach storage volume with elasticsearch nodes in kubernetes?

I am setting up Elasticsearch on Kubernetes. I have created an Elasticsearch cluster of 2 nodes. I want to attach storage to both of these nodes, e.g. 80Gi to the first node and 100Gi to the second node.
My Kubernetes cluster is on EC2 and I am using EBS as storage.
In order to attach persistence, you need:
A StorageClass object (defines the storage)
A PersistentVolume object (provisions the storage)
A PersistentVolumeClaim object (attaches the storage)
You then attach the claim to each Elasticsearch node's pod in the deployment/pod object definition.
An easier way is to deploy the ES cluster using the official Helm chart.
As per helm chart documentation:
Automated testing of this chart is currently only run against GKE (Google Kubernetes Engine). If you are using a different Kubernetes provider you will likely need to adjust the storageClassName in the volumeClaimTemplate
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: elast
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
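In the chart's values.yaml, the storage class above can then be referenced from the volumeClaimTemplate, roughly like this (a sketch; 80Gi is just the size from the question):
volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: elast
  resources:
    requests:
      storage: 80Gi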
Hope this helps.

How can I scale the PVC of a statefulset?

When I try to edit the PVC, Kubernetes gives an error saying:
The StatefulSet "es-data" is invalid: spec: Forbidden: updates to
statefulset spec for fields other than 'replicas', 'template', and
'updateStrategy' are forbidden.
I am trying to increase the disk size of elasticsearch which is deployed as a statefulset on AKS.
The error is self-explanatory: you can only update the replicas, template, and updateStrategy fields of a StatefulSet spec, so you can't resize the PVC by editing the StatefulSet. However, from Kubernetes 1.11 you can resize a PVC itself, but it is still an alpha feature.
Ref: Resizing an in-use PersistentVolumeClaim
Note: Alpha features are not enabled by default and you have to enable them manually while creating the cluster.
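If the StorageClass has allowVolumeExpansion: true, the PVC itself can be patched directly (a sketch; the PVC name is hypothetical):
kubectl patch pvc data-es-data-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'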
It is possible to expand the PVC of a statefulset on AKS, following these four steps:
https://stackoverflow.com/a/71175193/4083568
