How to run stateful applications in Apache Mesos?

How can stateful containers be run inside Mesos?
According to the Mesos documentation, the sandbox can be used to store state:
With the introduction of persistent volumes, executors and tasks
should never create files outside of the sandbox.
At the same time, sandbox files are scheduled for garbage collection when:
An executor is removed or terminated.
A framework is removed.
An executor is recovered unsuccessfully during agent recovery.
Is this the only way? Or can Docker containers be used to maintain state (in a similar manner to a VM)?
So for example, can a container be created and run across 2 nodes? Can such a container contain state and not be disposed of after the task is completed?

The key statement in that quote from the Mesos documentation is:
"With the introduction of persistent volumes..."
You're correct that sandboxes can be garbage collected. However, Mesos provides a primitive called persistent volumes, which lets you create volumes that persist across task failures and agent restarts and are not garbage collected.
Additionally, Mesos also now provides support for network storage via the Docker volume isolator. This allows you to mount network volumes using Docker volume drivers, which enables the use of a wide variety of storage back-ends.
Docker containers can store persistent state, but they must do so in either a Mesos persistent volume or a network-attached volume via the Docker volume isolator. These volumes live outside the Docker container and are mounted into the container, so they persist after the container has died.
Mesos tasks cannot be run across multiple nodes. Note that it would be possible for multiple tasks on different nodes to access the same network-attached volume via the Docker volume isolator, provided the back-end storage provider supports concurrent access.
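As a rough illustration of what a persistent volume looks like to a framework, the sketch below builds the disk resource a scheduler would describe when creating one. The field names follow the JSON shape shown in the Mesos persistent volume documentation as I understand it, and they may differ between Mesos versions. The volume ID, container path, size and role are hypothetical values, and the reservation details are left out even though persistent volumes can only be created from reserved disk resources.

```python
import json


def persistent_volume_resource(volume_id, container_path, size_mb, role):
    """Build a dict describing a disk resource backed by a persistent volume.

    Illustrative only: reservation fields are omitted, and the exact
    schema should be checked against the Mesos version in use.
    """
    return {
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": size_mb},
        "role": role,
        "disk": {
            # The persistence ID is what makes the volume outlive the task.
            "persistence": {"id": volume_id},
            # The volume is mounted at this path relative to the sandbox.
            "volume": {"container_path": container_path, "mode": "RW"},
        },
    }


# Hypothetical values chosen for the example.
resource = persistent_volume_resource("db-volume-1", "data", 2048, "stateful-role")
print(json.dumps(resource, indent=2))
```

The task still writes inside its sandbox (under `data` here), but that directory is backed by the persistent volume rather than by the garbage-collected sandbox storage.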

Related

What is the default disk space of an ECS task running on an EC2 cluster?

I'm using an ECS cluster running on EC2 instances, with several instances of my app running as tasks in the cluster. I want to add a cache layer to my app; this layer will write its data to disk. I also want to know how much memory an ECS task gives my container, and what happens to the files in my container after a deployment.
I have already searched Google for answers, but I only found information about tasks running in a Fargate cluster.

How to configure Elasticsearch snapshots using persistent volumes as the "shared file system repository" in Kubernetes (on GCP)?

I have registered the snapshot repository and have been able to create snapshots of the cluster for a pod. I have used a mounted persistent volume as the "shared file system repository" as the backup storage.
However in a production cluster with multiple nodes, it is required that the shared file system is mounted for all the data and master nodes.
Hence I would have to mount the persistent volume for the data nodes and the master nodes.
But Kubernetes persistent volumes don't have a "read write many" option. So I can't mount the volume on all the nodes and hence am unable to register the snapshot repository. Is there a way to use persistent volumes as the backup snapshot storage for a production Elasticsearch cluster in Google Kubernetes Engine?
Reading this, I guess that you are using a cluster created on your own and not GKE, since you cannot install agents on master nodes and workers will get recreated whenever there is a node pool update. Please make this clear since it can be misleading.
There are multiple volumes that allow multiple readers, such as cephfs, glusterfs and nfs. You can take a look at the different volume types on this
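To make the access-mode point concrete, here is a minimal sketch of a PersistentVolumeClaim that requests the `ReadWriteMany` access mode, written as a Python dict that mirrors the manifest. The claim name, storage class name and size are hypothetical, and the claim only binds if the cluster has a volume type (for example an NFS-backed one) that supports that mode.

```python
import json


def shared_snapshot_claim(name, storage_class, size):
    """Return a PersistentVolumeClaim manifest requesting ReadWriteMany."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets pods on several nodes mount the volume read-write.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names; "nfs" assumes a storage class backed by an NFS provisioner.
claim = shared_snapshot_claim("es-snapshots", "nfs", "100Gi")
print(json.dumps(claim, indent=2))
```

Each Elasticsearch data and master pod would then mount this one claim at the path registered as the snapshot repository.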

Does Kubernetes evenly distribute across an ec2 cluster?

So, I'm trying to understand CPU and VM allocation with Kubernetes, Docker and AWS ECS.
Does this seem right?
Locally, running "docker compose" with a few services:
each container gets added to the single Docker Machine VM. You can allocate CPU shares from this single VM.
AWS, running ECS, generated from a docker compose:
each container (all of them) gets added to a single EC2 VM. You can allocate CPU shares from that single VM. The fact that you deploy to a cluster of 5 EC2 instances makes no difference unless you manually "add instances" to your app. Your 5 containers will be sharing 1 EC2 instance.
AWS, running Kubernetes, using replication controllers and service YAMLs:
each container gets distributed amongst ALL of the EC2 instances in your Kubernetes cluster?
If I spin up a cluster of 5 EC2 instances and then deploy 5 replication controllers / services, will they actually be distributed across the EC2 instances? This seems like a major difference from ECS and local development. Just trying to get the facts right.
Here are the answers to your different questions:
1> Yes, you are right: you have a single VM, and any container you run gets CPU shares from that VM. You also have the option of spawning a Swarm cluster and trying that out. Docker Compose supports Swarm for containers connected via an overlay network spread over multiple VMs.
2> Yes, the containers defined in a single task end up on the same EC2 instance. When you spin up more than one instance of the task, the tasks get spread over the instances that are part of the cluster. No task should have a resource requirement greater than the maximum available on one of your EC2 instances.
3> Kubernetes is more evolved than ECS in many respects, but for container distribution it works similarly to ECS. A Kubernetes pod is equivalent to an ECS task: one container, or a group of containers, colocated on a single VM. In Kubernetes, too, a pod cannot need more resources than the maximum available on one of your underlying compute resources.
In all three scenarios, you are bound by the maximum capacity available on the underlying resource when deploying a large container or pod.
You should not equate a Docker platform with a VM creation and management platform.
All these Docker platforms expect you to define tasks that fit into the VMs and require you to scale horizontally with a higher task count when needed. Kubernetes comes with service discovery, which allows seamless routing of requests to the deployed containers using DNS lookups. You will have to build your own service discovery with Swarm and ECS; Consul, Eureka, etc. are tools you can use for that.
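The capacity constraint described above can be sketched with a toy first-fit placer. This is not how ECS or Kubernetes actually schedule (both use their own placement strategies); the instance names, task names and resource numbers are all made up for illustration.

```python
def place_tasks(tasks, instances):
    """Assign each task to the first instance with enough free CPU and memory.

    A task that does not fit on any single instance maps to None: spare
    capacity on several instances cannot be combined for one task.
    """
    free = {name: dict(capacity) for name, capacity in instances.items()}
    placement = {}
    for task_name, need in tasks.items():
        for instance_name, avail in free.items():
            if need["cpu"] <= avail["cpu"] and need["mem"] <= avail["mem"]:
                avail["cpu"] -= need["cpu"]
                avail["mem"] -= need["mem"]
                placement[task_name] = instance_name
                break
        else:
            placement[task_name] = None
    return placement


# Hypothetical cluster: two instances, CPU in shares and memory in MiB.
instances = {
    "ec2-a": {"cpu": 2048, "mem": 4096},
    "ec2-b": {"cpu": 2048, "mem": 4096},
}
tasks = {
    "web": {"cpu": 1024, "mem": 2048},
    "cache": {"cpu": 1536, "mem": 1024},
    "huge": {"cpu": 4096, "mem": 1024},  # larger than any single instance
}
placement = place_tasks(tasks, instances)
print(placement)
```

Here `web` and `cache` land on different instances, while `huge` cannot be placed at all even though the cluster as a whole has 4096 CPU shares.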

Implications of exposing /var/lib/docker over NFS to serve hosts with limited memory

What are the implications of exporting /var/lib/docker over NFS? The idea is to store the Docker images on a server and export them to hosts that have limited memory to store and run containers. This would avoid having each host download and store its own library of Docker images. The hosts may make use of FS-Cache to limit the data transfer over the network.
The /var/lib/docker directory is designed to be exclusively accessed by a single daemon, and should never be shared with multiple daemons.
Having multiple daemons use the same /var/lib/docker can lead to many issues, and possible data corruption.
For example, the daemon keeps an in-memory state of which images are in use (by containers) and which are not; multiple daemons using those images won't keep track of each other's usage (an image may be in use by another daemon), so one daemon may remove an image while it's in use.
Docker also stores various other files in /var/lib/docker, such as a key/value store for user-defined networks, which is not designed to be accessed concurrently by multiple daemons.
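The image-removal hazard described above can be shown with a toy model. This is not Docker's implementation: `ToyDaemon`, its methods and the image names are hypothetical, and a plain Python set stands in for the shared /var/lib/docker directory.

```python
class ToyDaemon:
    """Toy daemon that tracks image usage only in its own memory."""

    def __init__(self, name, shared_images):
        self.name = name
        self.shared_images = shared_images  # stands in for the shared directory
        self.in_use = set()                 # private state, invisible to other daemons

    def run_container(self, image):
        if image not in self.shared_images:
            raise RuntimeError(f"{self.name}: image {image} is missing")
        self.in_use.add(image)

    def prune(self):
        # Removes everything this daemon believes is unused.
        removed = {img for img in self.shared_images if img not in self.in_use}
        self.shared_images -= removed  # in-place: mutates the shared set
        return removed


shared = {"app:1.0", "base:3.2"}
host_a = ToyDaemon("host-a", shared)
host_b = ToyDaemon("host-b", shared)

host_a.run_container("app:1.0")  # host-a now depends on app:1.0
removed = host_b.prune()         # host-b has no record of that usage

print(sorted(removed))
print("app:1.0" in shared)
```

After `host_b.prune()`, the image that `host_a` is still running from has been deleted from the shared store, which is the kind of inconsistency that can occur when several daemons share one directory.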

Does Elasticsearch need persistent storage when deployed on Kubernetes?

In the Kubernetes example of an Elasticsearch production deployment, there is a warning about using emptyDir, along with advice that it should "be adapted according to your storage needs", which links to the documentation of persistent storage on Kubernetes.
Is it better to use persistent storage, which is external to the node and so needs (high) I/O over the network, or can we deploy a reliable Elasticsearch using multiple data nodes with local emptyDir storage?
Context: We're deploying our Kubernetes on commodity hardware, and we prefer not to use SAN for the storage layer (because it doesn't seem like commodity).
The warning is there so that folks don't assume that emptyDir provides a persistent storage layer. An emptyDir volume will persist as long as the pod is running on the same host. But if the host is replaced or its disk becomes corrupted, then all data will be lost. Using network-mounted storage is one way to work around both of these failure modes. If you want to use replicated storage instead, that works as well.
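The two options differ by only a few lines in the pod spec. The sketch below builds the volume entry either way as a Python dict mirroring the manifest; the volume name `data` and the claim name `es-data` are hypothetical.

```python
def data_volume(persistent, claim_name="es-data"):
    """Return the pod-spec volume entry for Elasticsearch data."""
    if persistent:
        # Backed by a PersistentVolumeClaim: survives pod rescheduling.
        return {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
    # emptyDir: lives on the node and is deleted when the pod leaves it.
    return {"name": "data", "emptyDir": {}}


ephemeral = data_volume(persistent=False)
durable = data_volume(persistent=True)
print(ephemeral)
print(durable)
```

With the emptyDir variant, durability has to come from Elasticsearch's own replication across data nodes rather than from the volume.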
