I have a Flask API and I am trying to improve it by identifying which function calls in the API definition take the longest whenever it is called. For that I am using a profiler, as highlighted in this repo. Whenever I make the API call, the profiler generates a .prof file which I can use with snakeviz to visualize.
Now I am trying to run this on an AWS cluster in the same region where my database is stored, to minimize network latency. I can get the API server running and make the API calls; my question is how I can transfer the .prof file out of the Kubernetes pod without disturbing the API server. Is there a way to start a separate shell that transfers the file to, say, an S3 bucket whenever it is created, without killing off the API server?
If you want to automate this process or it's simply hard to figure out connectivity for running kubectl exec ..., one idea would be to use a sidecar container. So your pod contains two containers with a single emptyDir volume mounted into both. emptyDir is perhaps the easiest way to create a folder shared between all containers in a pod.
The first container is your regular Flask API.
The second container watches for new files in the shared folder. Whenever it finds one, it uploads that file to S3.
You will need to configure the profiler so it dumps its output into the shared folder.
One benefit of this approach is that you don't have to make any major modifications to the existing container running Flask.
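A minimal sketch of that layout, assuming the amazon/aws-cli image for the uploader; the bucket name, volume name and paths are placeholders, and AWS credentials are assumed to come from the node role or an IAM role for the service account:
containers:
- name: flask-api
  image: my-flask-api:latest   # your existing Flask image; profiler configured to write into /profiles
  volumeMounts:
  - name: profiles
    mountPath: /profiles
- name: prof-uploader
  image: amazon/aws-cli
  command: ["/bin/sh", "-c"]
  args:
  - |
    # poll the shared folder and upload any finished .prof files to S3
    while true; do
      for f in /profiles/*.prof; do
        [ -e "$f" ] || continue
        aws s3 cp "$f" s3://YOUR_BUCKET/profiles/ && rm "$f"
      done
      sleep 10
    done
  volumeMounts:
  - name: profiles
    mountPath: /profiles
volumes:
- name: profiles
  emptyDir: {}
Deleting each file after a successful upload keeps the emptyDir volume from growing indefinitely.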
The best option is the sidecar container.
Pods that run multiple containers that need to work together. A Pod can encapsulate an application composed of multiple co-located containers that are tightly coupled and need to share resources. These co-located containers form a single cohesive unit of service—for example, one container serving data stored in a shared volume to the public, while a separate sidecar container refreshes or updates those files. The Pod wraps these containers, storage resources, and an ephemeral network identity together as a single unit.
Here's a link!
Creating the sidecar is easy; look at this:
apiVersion: v1
kind: Pod
metadata:
name: demo
spec:
containers:
- image: busybox
command: ["/bin/sh"]
args: ["-c", "while true; do echo $(date -u) 'Hi I am from Sidecar container'; sleep 5;done"]
name: sidecar-container
resources: {}
volumeMounts:
- name: var-logs
mountPath: /var/log
- image: nginx
name: main-container
resources: {}
ports:
- containerPort: 80
volumeMounts:
- name: var-logs
mountPath: /usr/share/nginx/html
dnsPolicy: Default
volumes:
- name: var-logs
emptyDir: {}
All you need to do is change the sidecar container's command to suit your needs.
What I'd like to ask is whether it is possible to create a Kubernetes Job that runs a bash command within another Pod.
apiVersion: batch/v1
kind: Job
metadata:
namespace: dev
name: run-cmd
spec:
ttlSecondsAfterFinished: 180
template:
spec:
containers:
- name: run-cmd
image: <IMG>
command: ["/bin/bash", "-c"]
args:
- <CMD> $POD_NAME
restartPolicy: Never
backoffLimit: 4
I considered using:
an environment variable to define the pod name
the Kubernetes SDK to automate it
But if you have better ideas, I am open to them, please!
The Job manifest you shared seems like a valid idea.
Yet, you need to take the points below into consideration:
As running a command inside another pod (one of that pod's containers) requires interacting with the Kubernetes API server, you'd need to interact with it using a Kubernetes client (e.g. kubectl). This, in turn, requires the client to be installed inside the Job container's image.
The Job's pod's service account has to have permissions on the pods/exec subresource. See the docs and this answer.
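For example, the RBAC for the Job's service account could look something like this (a sketch; the namespace matches your manifest, while the Role, RoleBinding and ServiceAccount names are placeholders):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-exec
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: run-cmd-pod-exec
subjects:
- kind: ServiceAccount
  name: run-cmd-sa
  namespace: dev
roleRef:
  kind: Role
  name: pod-exec
  apiGroup: rbac.authorization.k8s.io
The Job's pod template would then set serviceAccountName: run-cmd-sa, and the target pod's name can still be passed in via an environment variable as you planned.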
I'm slightly confused about the correct way to use Filebeat's modules whilst running Filebeat in a Docker container. It appears that the developers prefer the modules.d method, however it's not clear to me what their exact intentions are.
Here is the relevant part of my filebeat.yml:
filebeat.config:
modules:
path: ${path.config}/modules.d/*.yml
reload.enabled: true
reload.period: 60s
And the filebeat container definition from my docker-compose.yml:
filebeat:
build:
context: filebeat/
args:
ELK_VERSION: $ELK_VERSION
container_name: filebeat
mem_limit: 2048m
labels:
co.elastic.logs/json.keys_under_root: true
co.elastic.logs/json.add_error_key: true
co.elastic.logs/json.overwrite_keys: true
volumes:
- type: bind
source: ./filebeat/config/filebeat.docker.yml
target: /usr/share/filebeat/filebeat.yml
read_only: true
- type: bind
source: /volume1/#docker/containers
target: /var/lib/docker/containers
read_only: true
- type: bind
source: /var/run/docker.sock
target: /var/run/docker.sock
read_only: true
- type: bind
source: /var/log
target: /host-log
read_only: true
- type: volume
source: filebeat-data
target: /usr/share/filebeat/data
user: root
depends_on:
- elasticsearch
- kibana
command: filebeat -e -strict.perms=false
networks:
- elk
With this configuration, I can docker exec into my container and activate modules (pulling in their default configuration), and set up pipelines and dashboards like so:
filebeat modules enable elasticsearch kibana system nginx
filebeat setup -e --pipelines
This all works fine, until I come to recreate my container, at which point the enabled modules are (unsurprisingly) disabled and I have to run this stuff again.
I tried to mount the modules.d directory on my local filesystem, expecting this to build the default modules.d config files (with their .disabled suffix) on my local filesystem, such that recreation of the container would persist the installed modules. To do so, the following was added as a mount to docker-compose.yml:
- type: bind
source: ./filebeat/config/modules.d
target: /usr/share/filebeat/modules.d
read_only: false
When I do this, and recreate the container, it spins up with an empty modules.d directory and filebeat modules list returns no modules at all, meaning none can be enabled.
My current workaround is to copy each individual module's config file and mount it specifically, like so:
- type: bind
source: ./filebeat/config/modules.d/system.yml
target: /usr/share/filebeat/modules.d/system.yml
read_only: true
- type: bind
source: ./filebeat/config/modules.d/nginx.yml
target: /usr/share/filebeat/modules.d/nginx.yml
read_only: true
This is suboptimal for a number of reasons; chiefly, if I want to enable a new module, and have it persist across container recreation, I need to:
docker exec into the container
get the default config file for the module I want to use
create a file on the local filesystem for the module
edit the docker-compose.yml file with the new bind mounted module config
recreate the container with docker-compose up --detach
The way I feel this should work is:
I mount modules.d to my local filesystem
I recreate the container
modules.d gets populated with all the default module config files
I enable modules by filebeat modules enable blah or by renaming the module config file from my local filesystem (removing the .disabled suffix)
Enabled modules and their config survive container recreation
One way around this could be to copy (urgh) the whole modules.d directory from a running container to my local filesystem and mounting that wholesale. That feels wrong too.
What am I misunderstanding or misconfiguring here? How are other people doing this?
I build a custom image for each type of beat and embed the .yml configuration in my image. Then I use the filebeat.modules property of the configuration file to set up my modules inside that file.
I think the intention of the modules.d folder approach is that it makes it easier to understand your module configuration for a Filebeat instance that is working with multiple files.
That is certainly my intention with the 1-image-to-1-module/file-type approach that I use. All of my logic/configuration is stored along with the service that I am monitoring, and not in one central, monolithic location.
Another benefit of this approach is that each individual Filebeat only has access to the log files it needs. In the case of collecting the Docker logs like you are, you need to run it in privileged mode to bind mount /var/run/docker.sock. If I want to run your compose file but do not want to run privileged (or if I am using Windows and cannot), then I lose all the other monitoring that you have built out.
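To illustrate, a stripped-down sketch of that approach (the Filebeat version, module choices and paths below are examples, not a drop-in config). The filebeat.yml baked into the image carries the module configuration directly, so nothing in modules.d has to survive a container recreation:
# filebeat.yml embedded in the custom image
filebeat.modules:
- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]
- module: system
  syslog:
    enabled: true
# Dockerfile for the custom image
FROM docker.elastic.co/beats/filebeat:7.17.0
COPY filebeat.yml /usr/share/filebeat/filebeat.yml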
I have a 3-node Kubernetes cluster managed with kubeadm. Previously I used kind and minikube. When I wanted to make a deployment based on a Docker image, I just needed to run:
kind load docker-image in kind, or
minikube cache add in minikube.
Now, when I want to make a deployment in the kubeadm cluster, I obviously get ImagePullBackOff.
Question: is there an equivalent command to add an image to a kubeadm cluster that I just can't find, or is there an entirely different way to solve this problem?
EDIT
Maybe the question is not clear enough, so instead of deleting it I will try to add more details.
I have three nodes (one control plane and two workers) with docker, kubeadm, kubelet and kubectl installed on each. One deployment of my future cluster is a machine learning module, so I need tensorflow:
docker pull tensorflow/tensorflow
Using this image I build my own:
docker build -t mlimage:cluster -f ml.Dockerfile .
Next I prepare deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: mldeployment
spec:
selector:
matchLabels:
app: mldeployment
name: mldeployment
replicas: 1
template:
metadata:
labels:
app: mldeployment
name: mldeployment
spec:
containers:
- name: mlcontainer
image: mlimage:cluster
imagePullPolicy: Never
ports:
- name: http
containerPort: 6060
and create it:
kubectl create -f mldeployment.yaml
Now, when I type
kubectl describe pod
I see these events for mldeployment:
In the case of minikube or kind it was enough to simply add the image to the cluster by typing
minikube cache add ...
and
kind load docker-image ...
respectively.
The question is how to add an image from my machine to the cluster when it is managed with kubeadm. I assume that there is a similar way to do that as for minikube or kind (without any connection to Docker Hub, because everything is local).
You are getting ImagePullBackOff because the kubeadm cluster goes and checks a registry for the image,
while, if you look at both commands, minikube cache and kind load are for loading your local images into the cluster.
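If you only need a quick manual equivalent and your nodes run Docker as the container runtime (as in your setup), one option is to copy the image onto each worker yourself; a sketch with placeholder host names:
docker save mlimage:cluster | ssh user@worker-1 'docker load'
docker save mlimage:cluster | ssh user@worker-2 'docker load'
With imagePullPolicy: Never, as in your deployment, the kubelet will then use the locally loaded image instead of trying to pull it.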
As I now understand it, images for a cluster managed via kubeadm should be stored in a trusted registry like Docker Hub or a cloud registry. But if you want a quick solution on a separated network, there is a possibility: Docker registry.
There are also some tools ready to use, e.g. Trow, or a simpler solution.
I used the second approach, and it works (the code is a bit old, so it may need some changes; these links may be helpful: change apiVersion, add label).
After those changes, first create the deployment and DaemonSet:
kubectl create -f docker-private-registry.json
kubectl create -f docker-private-registry-proxy.json
Add the localhost registry address to the image tag:
docker tag image:tag 127.0.0.1:5000/image:tag
Check the full name of the docker private registry pod and forward its port (replace the x's with the exact pod name):
kubectl get pod
kubectl port-forward docker-private-registry-deployment-xxxxxxxxx-xxxxx 5000:5000 -n default
Open another terminal window and push the image to the private registry:
docker push 127.0.0.1:5000/image:tag
Finally, change the container image in your deployment.yaml (prefix it with 127.0.0.1:5000/) and create the deployment.
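For the deployment from the question, the container section would then look something like this (a sketch reusing the image tag from above):
containers:
- name: mlcontainer
  # pulled through the node-local registry proxy listening on port 5000
  image: 127.0.0.1:5000/mlimage:cluster
  ports:
  - name: http
    containerPort: 6060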
This solution is very unsafe and vulnerable, so use it wisely, only on separated networks, and for test and dev purposes.
For the life of Bryan, how do I do this?
Terraform is used to create an SQL Server instance in GCP.
Root password and user passwords are randomly generated, then put into the Google Secret Manager.
The DB's IP is exposed via private DNS zone.
How can I now get the username and password to access the DB into my K8s cluster? Running a Spring Boot app here.
This was one option I thought of:
In my deployment I add an initContainer:
- name: secrets
image: gcr.io/google.com/cloudsdktool/cloud-sdk
args:
- echo "DB_PASSWORD=$(gcloud secrets versions access latest --secret=\"$NAME_OF_SECRET\")" >> super_secret.env
Okay, what now? How do I get it into my application container from here?
There are also options like bitnami/sealed-secrets, which I don't like since the setup is using Terraform already and saving the secrets in GCP. When using sealed-secrets I could skip using the secrets manager. Same with Vault IMO.
On top of the other answers and the suggestions in the comments, I would like to suggest two tools that you might find interesting.
The first one is secrets-init:
secrets-init is a minimalistic init system designed to run as PID 1
inside container environments and it's integrated with
multiple secrets manager services, e.g. Google Secret Manager.
The second one is kube-secrets-init:
kube-secrets-init is a Kubernetes mutating admission webhook
that mutates any K8s Pod that is using specially prefixed environment
variables, directly or from a Kubernetes Secret or ConfigMap.
It also supports integration with Google Secret Manager:
A user can put a Google secret name (prefixed with gcp:secretmanager:) as an environment variable value. secrets-init will resolve any such environment value, using the specified name, to the referenced secret value.
Here's a good article about how it works.
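For illustration, with kube-secrets-init installed in the cluster, a Pod can reference the secret directly in an environment variable; a sketch (the project, secret and service account names are placeholders, and the exact reference format is described in the project's README):
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  # bound (e.g. via Workload Identity) to a Google service account allowed to access the secret
  serviceAccountName: my-app
  containers:
  - name: my-app
    image: my-app:latest
    env:
    - name: DB_PASSWORD
      value: gcp:secretmanager:projects/my-project/secrets/db-password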
How do I get it into my application container from here?
You could use a volume to store the secret and mount the same volume in both init container and main container to share the secret with the main container from the init container.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: my-app:latest
    volumeMounts:
    - name: config-data
      mountPath: /data
  initContainers:
  - name: secrets
    image: gcr.io/google.com/cloudsdktool/cloud-sdk
    command: ["/bin/sh", "-c"]
    args:
    # write the env file into the shared volume so the main container can read it from /data
    - echo "DB_PASSWORD=$(gcloud secrets versions access latest --secret=\"$NAME_OF_SECRET\")" >> /data/super_secret.env
    volumeMounts:
    - name: config-data
      mountPath: /data
  volumes:
  - name: config-data
    emptyDir: {}
You can use spring-cloud-gcp-starter-secretmanager to load secrets from the Spring application itself.
Documentation - https://cloud.spring.io/spring-cloud-gcp/reference/html/#secret-manager
Use an emptyDir volume with medium: Memory to guarantee that the secret will not be persisted to disk.
...
volumes:
- name: scratch
emptyDir:
medium: Memory
sizeLimit: "1Gi"
...
If one has control over the image, it's possible to change the entry point and use berglas.
Dockerfile:
FROM adoptopenjdk/openjdk8:jdk8u242-b08-ubuntu # or whatever you need
# Install berglas, see https://github.com/GoogleCloudPlatform/berglas
RUN mkdir -p /usr/local/bin/
ADD https://storage.googleapis.com/berglas/main/linux_amd64/berglas /usr/local/bin/berglas
RUN chmod +x /usr/local/bin/berglas
ENTRYPOINT ["/usr/local/bin/berglas", "exec", "--"]
Now we build the container and test it:
docker build -t image-with-berglas-and-your-app .
docker run \
-v /host/path/to/credentials_dir:/root/credentials \
--env GOOGLE_APPLICATION_CREDENTIALS=/root/credentials/your-service-account-that-can-access-the-secret.json \
--env SECRET_TO_RESOLVE=sm://your-google-project/your-secret \
-ti image-with-berglas-and-your-app env
This should print the environment variables with the sm:// substituted by the actual secret value.
In K8s we run it with Workload Identity, so the K8s service account on behalf of which the pod is scheduled needs to be bound to a Google service account that has the right to access the secret.
In the end your pod description would be something like this:
apiVersion: v1
kind: Pod
metadata:
  name: your-app
spec:
  containers:
  - name: your-app
    image: image-with-berglas-and-your-app
    # use args rather than command so the image's berglas ENTRYPOINT still runs and resolves sm:// values
    args: [start-sql-server]
    env:
    - name: AXIOMA_PASSWORD
      value: sm://your-google-project/your-secret
I was looking for a step-by-step tutorial on how to run my Spring Boot, MySQL-backed app on AWS EKS (Elastic Container Service for Kubernetes) using an existing SSL wildcard certificate, and wasn't able to find a complete solution.
The app is a standard, self-contained Spring Boot application backed by a MySQL database, running on port 8080. I need to run it with high availability and high redundancy, including the MySQL db, which needs to handle a large number of writes as well as reads.
I decided to go with an EKS-hosted cluster, pushing a custom Docker image to AWS's own ECR private Docker repo, going against an EKS-hosted MySQL cluster, and using an AWS-issued SSL certificate to communicate over HTTPS. Below is my solution, but I'll be very curious to see how it can be done differently.
This is a step-by-step tutorial. Please don't proceed to the next step until the previous one is complete.
CREATE EKS CLUSTER
Follow the standard tutorial to create the EKS cluster. Don't do step 4. When you are done you should have a working EKS cluster, and you must be able to use the kubectl utility to communicate with the cluster. When executed from the command line you should see the working nodes and other cluster elements using the
kubectl get all --all-namespaces command
INSTALL MYSQL CLUSTER
I used helm to install the MySQL cluster, following the steps from this tutorial. Here are the steps:
Install helm
Since I'm using a MacBook Pro with Homebrew, I used the brew install kubernetes-helm command.
Deploy MySQL cluster
Note that in "MySQL cluster" and "Kubernetes (EKS) cluster" the word "cluster" refers to 2 different things. Basically you are installing a cluster into a cluster, just like a Russian Matryoshka doll, so your MySQL cluster ends up running on EKS cluster nodes.
I used the 2nd part of this tutorial (ignore the kops part) to prepare the helm chart and install the MySQL cluster. Quoting the helm configuration:
$ kubectl create serviceaccount -n kube-system tiller
serviceaccount "tiller" created
$ kubectl create clusterrolebinding tiller-crule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
clusterrolebinding.rbac.authorization.k8s.io "tiller-crule" created
$ helm init --service-account tiller --wait
$HELM_HOME has been configured at /home/presslabs/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!
$ helm repo add presslabs https://presslabs.github.io/charts
"presslabs" has been added to your repositories
$ helm install presslabs/mysql-operator --name mysql-operator
NAME: mysql-operator
LAST DEPLOYED: Tue Aug 14 15:50:42 2018
NAMESPACE: default
STATUS: DEPLOYED
I ran all the commands exactly as quoted above.
Before creating a cluster, you need a secret that contains the ROOT_PASSWORD key.
Create a file named example-cluster-secret.yaml and copy into it the following YAML code
apiVersion: v1
kind: Secret
metadata:
name: my-secret
type: Opaque
data:
# root password is required to be specified
ROOT_PASSWORD: Zm9vYmFy
But what is that ROOT_PASSWORD? It turns out this is the base64-encoded password that you are planning to use with your MySQL root user. Say you want root/foobar (please don't actually use foobar). The easiest way to encode the password is to use one of the websites such as https://www.base64encode.org/, which encodes foobar into Zm9vYmFy.
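If you'd rather not paste a password into a website, the same encoding can be produced locally:
$ echo -n 'foobar' | base64
Zm9vYmFy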
When ready execute kubectl apply -f example-cluster-secret.yaml which will create a new secret
Then you need to create a file named example-cluster.yaml and copy into it the following YAML code:
apiVersion: mysql.presslabs.org/v1alpha1
kind: MysqlCluster
metadata:
name: my-cluster
spec:
replicas: 2
secretName: my-secret
Note how the secretName matches the secret name you just created. You can change it to something more meaningful as long as it matches in both files. Now run kubectl apply -f example-cluster.yaml to finally create a MySQL cluster. Test it with
$ kubectl get mysql
NAME AGE
my-cluster 1m
Note that I did not configure a backup as described in the rest of the article. You don't need to do that for the database to operate. But how do you access your db? At this point the mysql service is there, but it doesn't have an external IP. In my case I don't even want that, as long as my app, which will run on the same EKS cluster, can access it.
However, you can use kubectl port forwarding to access the db from the dev box that runs kubectl. Type in this command: kubectl port-forward services/my-cluster-mysql 8806:3306. Now you can access your db at 127.0.0.1:8806 using the user root and the non-encoded password (foobar). Type this into a separate command prompt: mysql -u root -h 127.0.0.1 -P 8806 -p. With this you can also use MySQL Workbench to manage your database; just don't forget to run port-forward. And of course you can change 8806 to another port of your choosing.
PACKAGE YOUR APP AS A DOCKER IMAGE AND DEPLOY
To deploy your Spring Boot app into the EKS cluster you need to package it into a Docker image and push it to a Docker repo. Let's start with the Docker image. There are plenty of tutorials on this, like this one, but the steps are simple:
Put your generated, self-contained Spring Boot jar file into a directory, create a text file with this exact name: Dockerfile in the same directory, and add the following content to it:
FROM openjdk:8-jdk-alpine
MAINTAINER me@mydomain.com
LABEL name="My Awesome Docker Image"
# Add spring boot jar
VOLUME /tmp
ADD myapp-0.1.8.jar app.jar
EXPOSE 8080
# Database settings (maybe different in your app)
ENV RDS_USERNAME="my_user"
ENV RDS_PASSWORD="foobar"
# Other options
ENV JAVA_OPTS="-Dverknow.pypath=/"
ENTRYPOINT [ "sh", "-c", "java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar /app.jar" ]
Now simply run a Docker command from the same folder to create an image. Of course that requires the Docker client to be installed on your dev box.
$ docker build -t myapp:0.1.8 --force-rm=true --no-cache=true .
If all goes well you should see your image listed in the output of the docker images command.
Deploy to the private ECR repo
Deploying your new image to the ECR repo is easy, and ECR works with EKS right out of the box. Log into the AWS console and navigate to the ECR section. I found it confusing that you apparently need one repository per image, but when you click the "Create repository" button, put your image name (e.g. myapp) into the text field. Now you need to copy the ugly URL for your image and go back to the command prompt.
Tag and push your image. I'm using a fake URL as an example: 901237695701.dkr.ecr.us-west-2.amazonaws.com; you need to copy your own from the previous step.
$ docker tag myapp:0.1.8 901237695701.dkr.ecr.us-west-2.amazonaws.com/myapp:latest
$ docker push 901237695701.dkr.ecr.us-west-2.amazonaws.com/myapp:latest
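If the push is rejected with a "no basic auth credentials" error, you most likely need to log Docker into ECR first and then retry the push (this assumes AWS CLI v2; substitute your own region and registry URL):
$ aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 901237695701.dkr.ecr.us-west-2.amazonaws.com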
At this point the image should show up at ECR repository you created
Deploy your app to EKS cluster
Now you need to create a Kubernetes deployment for your app's Docker image. Create a myapp-deployment.yaml file with the following content
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-deployment
spec:
selector:
matchLabels:
app: myapp
replicas: 2
template:
metadata:
labels:
app: myapp
spec:
containers:
- image: 901237695701.dkr.ecr.us-west-2.amazonaws.com/myapp:latest
name: myapp
ports:
- containerPort: 8080
name: server
env:
# optional
- name: RDS_HOSTNAME
value: "10.100.98.196"
- name: RDS_PORT
value: "3306"
- name: RDS_DB_NAME
value: "mydb"
restartPolicy: Always
status: {}
Note how I'm using a full URL for the image parameter. I'm also using a private CLUSTER-IP of mysql cluster that you can get with kubectl get svc my-cluster-mysql command. This will differ for your app including any env names but you do have to provide this info to your app somehow. Then in your app you can set something like this in the application.properties file:
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://${RDS_HOSTNAME}:${RDS_PORT}/${RDS_DB_NAME}?autoReconnect=true&zeroDateTimeBehavior=convertToNull
spring.datasource.username=${RDS_USERNAME}
spring.datasource.password=${RDS_PASSWORD}
Once you save the myapp-deployment.yaml you need to run this command
kubectl apply -f myapp-deployment.yaml
This will deploy your app into the EKS cluster and create 2 pods that you can see with the kubectl get pods command.
And rather than trying to access one of the pods directly, we can create a service to front the app pods. Create a myapp-service.yaml with this content:
apiVersion: v1
kind: Service
metadata:
name: myapp-service
spec:
ports:
- port: 443
targetPort: 8080
protocol: TCP
name: http
selector:
app: myapp
type: LoadBalancer
That's where the magic happens! Just by setting the port to 443 and type to LoadBalancer the system will create a Classic Load Balancer to front your app.
BTW if you don't need to run your app over HTTPS you can set port to 80 and you will be pretty much done!
After you run kubectl apply -f myapp-service.yaml the service will be created in the cluster, and if you go to the Load Balancers section in the EC2 section of the AWS console you will see that a new balancer has been created for you. You can also run the kubectl get svc myapp-service command, which will give you the EXTERNAL-IP value, something like bl3a3e072346011e98cac0a1468f945b-8158249.us-west-2.elb.amazonaws.com. Copy that because we need to use it next.
It is worth mentioning that if you are using port 80 then simply pasting that URL into the browser should display your app.
Access your app over HTTPS
The following section assumes that you have an AWS-issued SSL certificate. If you don't, go to the AWS console's "Certificate Manager" and create a wildcard certificate for your domain.
Before your load balancer can work, you need to go to the AWS console -> EC2 -> Load Balancers -> My new balancer -> Listeners and click the "Change" link in the SSL Certificate column. Then in the pop-up select the AWS-issued SSL certificate and save.
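Alternatively, instead of changing the listener by hand, you can ask Kubernetes to attach the certificate when it creates the load balancer, using annotations on the Service (a sketch; the certificate ARN is a placeholder for your own):
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
  annotations:
    # ACM certificate used to terminate HTTPS on the ELB
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-west-2:901237695701:certificate/12345678-1234-1234-1234-123456789012
    # the app itself still speaks plain HTTP on 8080
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
spec:
  ports:
  - port: 443
    targetPort: 8080
    protocol: TCP
    name: https
  selector:
    app: myapp
  type: LoadBalancer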
Go to the Route 53 section in the AWS console and select the hosted zone for your domain, say myapp.com. Then click "Create Record Set" and create a CNAME (Canonical name) record with Name set to whatever alias you want, say cluster.myapp.com, and Value set to the EXTERNAL-IP from above. After you "Save Record Set", go to your browser and type in https://cluster.myapp.com. You should see your app running.