Spring Cloud Dataflow Volume Mounts for Composed Tasks

Can you mount volumes on composed tasks?
I see you can pass arguments to composed tasks as follows:
--arguments "--composed-task-arguments=--app.datasource.jdbc-url=jdbc:mysql:XXXXX"
and pass properties to the deployer to mount volumes like so:
--properties "deployer.*.kubernetes.volumeMounts=[{name: 'myName', mountPath:
'/test'}], deployer.*.kubernetes.volumes=[{name: 'myName', persistentVolumeClaim: {
claimName: 'myName'}}]"
However, in doing so I only see the volume mounted on the composed task runner and not on the child tasks spawned from it. Is there a way to do this?

Please refer to the answer to the question: passing properties to child task of composed-task-runner app of spring cloud dataflow
To be precise, you need to do something like this:
deployer.<name-of-your-composed-task>.<child task name/label>.kubernetes.volumeMounts=***
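For example, if your composed task were named my-composed-task and the child task label were childtask (both hypothetical names used only for illustration), the launch properties would look something like:
--properties "deployer.my-composed-task.childtask.kubernetes.volumeMounts=[{name: 'myName', mountPath: '/test'}], deployer.my-composed-task.childtask.kubernetes.volumes=[{name: 'myName', persistentVolumeClaim: {claimName: 'myName'}}]"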
The documentation here can help you understand this better as well.

Related

Creating a Simple Hello World app in Kubernetes

Most software tech has a "Hello World" type example to get started on. With Kubernetes this seems to be lacking.
My scenario could not be simpler. I have a simple hello world app made with Spring Boot, with one REST controller that just returns: "Hello Hello!"
After I create my Dockerfile, I build an image like this:
docker build -t helloworld:1.0 .
Then I run it in a container like this:
docker run -p 8080:8080 helloworld:1.0
If I open up a browser now, I can access my application here:
http://localhost:8080/hello/
and it returns:
"Hello Hello!"
Great! So far so good.
Next I tag it (my Docker Hub account is called ollyw123, and the ID of my image is 776...)
docker tag 7769f3792278 ollyw123/helloworld:firsttry
and push:
docker push ollyw123/helloworld
If I log into Docker Hub, I can see the pushed image in my repository.
Now I want to connect this to Kubernetes. This is where I have plunged deep into a state of confusion.
My thinking is, I need to create a cluster. Somehow I need to connect this cluster to my image, and as I understand it, I just need to use the URL of the image to connect to (i.e.
https://hub.docker.com/repository/docker/ollyw123/helloworld)
Next I would have to create a service. This service would then be able to expose my "Hello World!" REST call through some port. This is my logical thinking, and for me this would seem like a very simple thing to do, but the tutorials and documentation on Kubernetes are a minefield of confusion and dead ends.
Following on from the Spring Boot Kubernetes tutorial (https://spring.io/guides/gs/spring-boot-kubernetes/), I have to create a deployment object, and then a service object, and then I have to "apply" it:
kubectl create deployment hello-world-dep --image=ollyw123/helloworld --dry-run -o=yaml > deployment.yaml
kubectl create service clusterip hello-world-dep --tcp=8080:8080 --dry-run -o=yaml >> deployment.yaml
kubectl apply -f deployment.yaml
OK. Now I see a service listed.
But now what???
How do I push this to the cloud? (eg. gcloud) Do I need to create a cluster first, or is this already a cluster?
What should my next step be?
There are a couple of concepts that we need to go through regarding your question.
The first is about the "Hello World" app in Kubernetes. Even though such examples exist (as mentioned by Limido in the comments [link]), the app itself is not a Kubernetes app, but an app created in the language of your choice, which was containerized and is deployed in Kubernetes.
So I would call it (in your case) a Dockerized Spring Boot HelloWorld app.
Okay, now that we have a container, we could simply deploy it by running Docker. But what if your container dies, or you need to scale it up and down, manage volumes, network traffic and a bunch of other things? This starts to become complicated (imagine a real-life scenario with hundreds or even thousands of containers running at the same time). That's exactly where container orchestration comes into place.
Kubernetes helps you manage this complexity in a single place.
The third concept that I'd like to talk about is the create and apply commands. You can find a more detailed explanation here, but both of them can be used to create resources in Kubernetes.
In your case, the create command is not actually creating the resources, because you are using --dry-run and adding the output to your deployment file, which you apply later on. The following commands, however, would also create your resources:
kubectl create deployment hello-world-dep --image=ollyw123/helloworld
kubectl create service clusterip hello-world-dep --tcp=8080:8080
Note that even though this works, if you need to share this deployment or commit it to a repository, you would need to get it:
kubectl get deployment hello-world-dep -o yaml > your-file.yaml
So having the definition file is really helpful and recommended.
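As a rough sketch of what that definition file can contain (the exact apiVersion, labels and port names that kubectl generates may differ slightly between versions), it would look something along these lines:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world-dep
  labels:
    app: hello-world-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world-dep
  template:
    metadata:
      labels:
        app: hello-world-dep
    spec:
      containers:
      - name: helloworld
        image: ollyw123/helloworld   # the image pushed to Docker Hub earlier
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-dep
spec:
  type: ClusterIP                    # only reachable from inside the cluster
  selector:
    app: hello-world-dep
  ports:
  - port: 8080
    targetPort: 8080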
Great... Going further...
When you have a deployment you will also have a number of replicas that are expected to be running (even when you don't define it, the default value is 1). In your case, your deployment is managing one pod.
If you run:
kubectl get pods -o wide
You will get your pod hello-world-dep-<hash> and an IP address. This IP is the IP of your pod, and you can access your application using it. But pods are ephemeral: if your pod dies, Kubernetes will automatically create a new one for you with a new IP address. So if, for instance, you have a backend whose IP is constantly changing, you would need to update the frontend every time a new backend pod comes up.
To solve that, Kubernetes has the Service, which exposes the deployment in a persistent way. If your pod dies and a new one comes up, the address of your Service stays the same, and all the traffic is automatically routed to the new pod.
When you have more than one replica of your deployment, the Service also load balances the traffic across all the available pods.
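To see this behaviour, you could for example scale the deployment up and then check which pod IPs the Service is balancing across (deployment and service names taken from your commands above):
kubectl scale deployment hello-world-dep --replicas=3
kubectl get endpoints hello-world-dep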
Last but not least, your question!
You have asked, now what?
So basically, once you have your application containerized, you can deploy it almost anywhere. In your case you are running it locally, but you could take your deployment.yaml file and deploy your application to GKE, AKS or EKS, just to quote the biggest ones; all major cloud providers have some type of managed Kubernetes service where you can spin up a cluster and start playing around.
Actually, to play around I'd recommend Katacoda, as they have free scenarios and you can use their clusters to experiment.
Wow... That was a long answer...
Just to finish, I'd recommend the Network Introduction on Katacoda, as there are different types of Services depending on your scenario or what you need, and the tutorial goes through the different types in a hands-on approach.
In the context of Kubernetes, a Cluster is the environment where your Pods and Services run. Think of it like a VM environment where you set up your web server and so on (although I don't like my own analogy).
If you want to run the same thing in GCloud, you create a Kubernetes cluster there, and all you need to do is apply your YAML files containing the Service and Deployment via the CLI that Google Cloud provides for interacting with your cluster.
In order to interact with a GKE cluster from your local command prompt, you need to get the credentials for that cluster. This official GCloud document explains how to retrieve your cluster credentials. Once that is done, you can start interacting with the Kubernetes instance running in GCloud via the kubectl command from your command prompt.
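As an illustration, the whole round trip would look something like this (the cluster name and zone below are placeholders you would choose yourself):
gcloud container clusters create hello-cluster --zone europe-west1-b
gcloud container clusters get-credentials hello-cluster --zone europe-west1-b
kubectl apply -f deployment.yaml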
The service that you have is of type ClusterIP, which is only accessible from within the Kubernetes cluster. You need to use a NodePort or LoadBalancer type service, or an Ingress, to expose the application outside the remote Kubernetes cluster (a set of VMs or bare-metal servers in a public or private cloud environment with Kubernetes deployed on them) or outside your local minikube/Docker Desktop. Once you do that, you should be able to access it using a browser or curl.
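For a quick test on a cloud cluster, one way is to replace the ClusterIP service with a LoadBalancer one (service and deployment names taken from your example; the delete frees up the name, and on minikube you would rather use NodePort or the minikube service command):
kubectl delete service hello-world-dep
kubectl expose deployment hello-world-dep --type=LoadBalancer --port=8080
kubectl get service hello-world-dep
Once the cloud provider has provisioned the load balancer, the EXTERNAL-IP column of the last command shows the address you can reach with a browser or curl.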

Easier way to find pods and look at logs for a job in spring cloud dataflow kubernetes

Currently, when I launch a task in Spring Cloud Data Flow, it starts a pod inside which the task and its inherent jobs run. That pod has a naming convention of the task name followed by a random ID. I wanted to know if I can map it to the task execution ID or job execution ID, so that it is easier for me to locate the pod and look at its logs in case a job fails.
In Spring Cloud Data Flow, this is the expected behavior.
The task execution ID and job execution ID are generated only after the task launch request is sent to the corresponding target deployment environment (local, CF, k8s).
Feel free to create a feature request in the SCDF GitHub repository (it would be great if you have any other ideas as well), and we can track it from there.
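In the meantime, since the pod name starts with the task name (as you describe), a simple way to locate the pod and its logs is something like this (angle-bracketed values are placeholders):
kubectl get pods | grep <your-task-name>
kubectl logs <pod-name-from-the-previous-command>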

How to add a new scheduler priority to the default kubernetes scheduler?

The Kubernetes scheduler includes two parts: predicates and priorities. The source code is in kubernetes/plugin/pkg/scheduler. I want to add a new priority algorithm to the default priorities. Can anyone guide me through the detailed steps? Thanks a lot!
Maybe I should do the following steps:
Add my own priority algorithm to the path: kubernetes/plugin/pkg/scheduler/algorithm/priorities
Register that priority algorithm
Build/Recompile the whole k8s project and install\deploy a new k8s cluster
Test whether that priority takes effect, maybe by giving it a high weight.
If there are more detailed articles and documents, it will help me a lot!
The more detailed the better! Thanks a lot!
k8s version: 1.2.0, 1.4.0 or later.
You can run your scheduler as a kubernetes deployment.
Kelsey Hightower has an example scheduler coded up on GitHub
The meat and bones of this is here: https://github.com/kelseyhightower/scheduler/blob/master/bestprice.go
And the deployment yaml is here
Essentially, you can package it up as a docker container and deploy it.
Take note of the way you interact with the k8s API using this package; to do it this way you'll need a similar wrapper, but it's much easier than building/recompiling the whole k8s package.
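For completeness, the piece that makes a workload use your custom scheduler instead of the default one is the schedulerName field in the pod spec (in the very old releases you mention, an alpha annotation was used instead). A minimal sketch, assuming your scheduler registers itself under the name my-custom-scheduler:

apiVersion: v1
kind: Pod
metadata:
  name: scheduler-test
spec:
  schedulerName: my-custom-scheduler  # pods without this field keep using the default scheduler
  containers:
  - name: app
    image: nginx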

How to set env vars on all nodes in a Mesos cluster?

I'm trying to set some env vars on our DCOS/Mesos cluster - what's the simplest way to do that?
I would suggest taking a look at the Consul and envconsul combo.
Use Consul as a K/V store for storing and managing the variables across the cluster, and envconsul to feed them to the apps inside the container. For secrets, add Vault.
You mentioned you were looking for a simple solution. I would say this is a relatively simple and elegant way to achieve that.
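A minimal sketch of that combo, assuming a Consul agent is reachable from the node and using a made-up key prefix myapp/config:
consul kv put myapp/config/DB_HOST db.internal.example
envconsul -prefix myapp/config env
The second command runs env as a child process with DB_HOST injected into its environment; in practice you would replace env with your application's start command.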

Event-hook upon up/down-scaling or deletion of an App

I didn't find info on whether it is possible to define something like an event hook upon up-/down-scaling or deletion of an app in the Marathon REST API docs at https://mesosphere.github.io/marathon/docs/rest-api.html
What I'd like to achieve is to back up some data from a running Docker container before it is destroyed. For example, I run a cluster of Elasticsearch nodes on Marathon, and I would like to delay the deletion of the app until the then-triggered "create snapshot to external disk resource" process has finished.
Is there currently something I could use?
Marathon provides an Event Bus covering some phases of the lifecycle. Beyond that, currently the only other option I see is to go for Mesos Modules/Hooks.
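For the Event Bus route, Marathon exposes events as a Server-Sent Events stream, so you can watch scale and delete events with something like this (host and port are whatever your Marathon instance uses):
curl -H "Accept: text/event-stream" http://<marathon-host>:8080/v2/events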
