DC/OS stateful app with persistent external storage - amazon-ec2

I'm trying to setup a stateful app in DC/OS by assigning an external (EBS) volume to the docker container. I've ran the demo app provided in the docs and it created a 100GB EBS volume in AWS. Is there a way to specify the size of the volume in the marathon.json file? Can I use the same EBS volume for multiple apps? Here's the demo app I've tested.
{
"id": "/test-docker",
"instances": 1,
"cpus": 0.1,
"mem": 32,
"cmd": "date >> /data/test-rexray-volume/test.txt; cat /data/test-rexray-volume/test.txt",
"container": {
"type": "DOCKER",
"docker": {
"image": "alpine:3.1",
"network": "HOST",
"forcePullImage": true
},
"volumes": [
{
"containerPath": "/data/test-rexray-volume",
"external": {
"name": "my-test-vol",
"provider": "dvdi",
"options": { "dvdi/driver": "rexray" }
},
"mode": "RW"
}
]
},
"upgradeStrategy": {
"minimumHealthCapacity": 0,
"maximumOverCapacity": 0
}
}

You cannot attach one EBS volume to multiple EC2 instances. My bad! I ditched the rexray persistent storage option in favor of EFS.
I had to create an EFS share and attach it to the cluster's VPC. Then I had to ssh into every slave node, mount it like an NFS share under the same folder on all nodes and finally mount it in the container from marathon.json.

Related

Is it possible to register ec2 node in elbv2 target group via CF template?

I use AWS::ELBv2 with set of rules which redirect requests to different services running on several tcp ports on my EC2 instance. My application doesn't support scaling, so I can't use autoscaling group, I need just ELBv2 attached to my ec2-instance. We are using CloudFormation for deployment automation purposes.
With Autoscaling I can use TargetGroupARNs property of AutoscalingCluster:
"AutoScalingCluster": {
"Type": "AWS::AutoScaling::AutoScalingGroup",
"Properties": {
"TargetGroupARNs": [
{
"Ref": "FirstappTargetGroup"
},
{
"Ref": "SecondappTargetGroup"
}
],
...
}
However for AWS::EC2::Instance there is no such property (according to https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-instance.html) and also there are no properties related to Target Groups.
It is possible to register nodes in Target Group after stack is created, however that's additional step which might be not convenient.
Found solution here: https://github.com/getcft/aws-elb-to-ec2-target-group-cf-template/blob/master/elb-to-ec2-target-group-cf-template.yml#L338
You can specify ec2 nodes as Targets: in AWS::ElasticLoadBalancingV2::TargetGroup
"MyTargetGroup": {
"Type": "AWS::ElasticLoadBalancingV2::TargetGroup",
"Properties": {
"Port": 80,
"Protocol": "HTTP",
"VpcId": {
"Ref": "VpcId"
},
"Targets": [
{
"Id": {
"Ref": "MyEC2Node"
}
}
]
}
}
So it's not EC2 node property, it's a TargetGroup property in this case.
Also aws api doesn't show that property via aws elbv2 describe-target-groups

Mesos/Marathon - reserved resources for role are not offered for Marathon app

I have assigned slave resources to the particular role ("app-role") by set --default_role="app-role" parameter to ExecStart for slave service ( /etc/systemd/system/dcos-mesos-slave.service). Next I have restarted slave agent:
sudo systemctl daemon-reload
sudo systemctl stop dcos-mesos-slave.service
sudo rm -f /var/lib/mesos/slave/meta/slaves/latest
sudo systemctl start dcos-mesos-slave.service
and verified by: curl master.mesos/mesos/slaves.
After that I expect marathon app with acceptedResourceRoles attribute will receive only these particular resource offers, but it does not happen (the app is still in waiting state).
Why does marathon didn't receive it? How should this be done to make it work?
{
"id": "/basic-4",
"cmd": "python3 -m http.server 8080",
"cpus": 0.5,
"mem": 32,
"disk": 0,
"instances": 1,
"acceptedResourceRoles": [
"app-role"
],
"container": {
"type": "DOCKER",
"volumes": [],
"docker": {
"image": "python:3",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 8080,
"hostPort": 0,
"servicePort": 10000,
"protocol": "tcp",
"name": "my-vip",
"labels": {
"VIP_0": "/my-service:5555"
}
}
],
"privileged": false,
"parameters": [],
"forcePullImage": false
}
},
"portDefinitions": [
{
"port": 10000,
"protocol": "tcp",
"name": "default",
"labels": {}
}
]
}
This works only if marathon is started with --mesos_role set.
In the context of the question this should be: --mesos_role 'app-role'.
If you set --mesos_role other, Marathon will register with Mesos for this role – it will receive offers for resources that are reserved
for this role, in addition to unreserved resources.
If you set default_accepted_resource_roles *, Marathon will apply this default to all AppDefinitions that do not explicitly
define acceptedResourceRoles. Since your AppDefinition defines that
option, the default will not be applied (both are equal anyways).
If you set "acceptedResourceRoles": [""] in an AppDefinition (or the AppDefinition inherits a default of ""), Marathon will only
consider unreserved resources for launching of this app.
More: https://mesosphere.github.io/marathon/docs/recipes.html

Marathon: How to specify environment variables in args

I am trying to run a Consul container on each of my Mesos slave node.
With Marathon I have the following JSON script:
{
"id": "consul-agent",
"instances": 10,
"constraints": [["hostname", "UNIQUE"]],
"container": {
"type": "DOCKER",
"docker": {
"image": "consul",
"privileged": true,
"network": "HOST"
}
},
"args": ["agent","-bind","$MESOS_SLAVE_IP","-retry-join","$MESOS_MASTER_IP"]
}
However, it seems that marathon treats the args as plain text.
That's why I always got errors:
==> Starting Consul agent...
==> Error starting agent: Failed to start Consul client: Failed to start lan serf: Failed to create memberlist: Failed to parse advertise address!
So I just wonder if there are any workaround so that I can start a Consul container on each of my Mesos slave node.
Update:
Thanks #janisz for the link.
After taking a look at the following discussions:
#3416: args in marathon file does not resolve env variables
#2679: Ability to specify the value of the hostname an app task is running on
#1328: Specify environment variables in the config to be used on each host through REST API
#1828: Support for more variables and variable expansion in app definition
as well as the Marathon documentation on Task Environment Variables.
My understanding is that:
Currently it is not possible to pass environment variables in args
Some post indicates that one could pass environment variables in "cmd". But those environment variables are Task Environment Variables provided by Marathon, not the environment variables on your host machine.
Please correct if I was wrong.
You can try this.
{
"id": "consul-agent",
"instances": 10,
"constraints": [["hostname", "UNIQUE"]],
"container": {
"type": "DOCKER",
"docker": {
"image": "consul",
"privileged": true,
"network": "HOST",
"parameters": [
"key": "env",
"value": "YOUR_ENV_VAR=VALUE"
]
}
}
}
Or
{
"id": "consul-agent",
"instances": 10,
"constraints": [["hostname", "UNIQUE"]],
"container": {
"type": "DOCKER",
"docker": {
"image": "consul",
"privileged": true,
"network": "HOST"
}
},
"env": {
"ENV_NAME" : "VALUE"
}
}

How to mount HDFS in a Docker container

I made an application Dockerized in a Docker container. I intended to make the application able to access files from our HDFS. The Docker image is to be deployed on the same cluster where we have HDFS installed via Marathon-Mesos.
Below is the json to be POST to Marathon. It seems that my app is able to read and write files in the HDFS. Can someone comment on the safety of this? Would files changed by my app correctly changed in the HDFS as well? I Googled around and didn't find any similar approaches...
{
"id": "/ipython-test",
"cmd": null,
"cpus": 1,
"mem": 1024,
"disk": 0,
"instances": 1,
"container": {
"type": "DOCKER",
"volumes": [
{
"containerPath": "/home",
"hostPath": "/hadoop/hdfs-mount",
"mode": "RW"
}
],
"docker": {
"image": "my/image",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 8888,
"hostPort": 0,
"servicePort": 10061,
"protocol": "tcp",
}
],
"privileged": false,
"parameters": [],
"forcePullImage": true
}
},
"portDefinitions": [
{
"port": 10061,
"protocol": "tcp",
"labels": {}
}
]
}
You might have a look at the Docker volume docs.
Basically, the volumes definition in the app.json would trigger the start of the Docker image with the flag -v /hadoop/hdfs-mount:/home:RW, meaning that the host path gets mapped to the Docker container as /home in read-write mode.
You should be able to verify this if you SSH into the node which is running the app and do a docker inspect <containerId>.
See also
https://mesosphere.github.io/marathon/docs/native-docker.html

Configuring prometheus mesos-exporter running on mesosphere DCOS

I am trying to set up mesos exporter on my mesosphere DCOS cluster. The link I am referring to is https://github.com/prometheus/mesos_exporter. The JSON file I have used is :
{
"id": "/mesosexporter",
"instances": 6,
"cpus": 0.1,
"mem": 25,
"constraints": [["hostname", "UNIQUE"]],
"acceptedResourceRoles": ["slave_public","*"],
"container": {
"type": "DOCKER",
"docker": {
"image": "prom/mesos-exporter",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 9105,
"hostPort": 9105,
"protocol": "tcp"
}
]
}
},
"healthChecks": [{
"protocol": "TCP",
"gracePeriodSeconds": 600,
"intervalSeconds": 30,
"portIndex": 0,
"timeoutSeconds": 10,
"maxConsecutiveFailures": 2
}]
}
But only meter exposed to Prometheus is 'mesos_exporter_slave_scrape_errors_total'. What are the other meters which mesos exporter exposes to Promethues. The readme from the github of mesos-exporter says that we need to provide command line flags, but if I want to run mesos exporter as a docker container how should I specify the configuration?
EDIT - The meter 'mesos_exporter_slave_scrape_errors_total' gives non-zero value, indicating that errors occurred during the scrape.
EDIT - After adding the 'parameter' primitive my JSON file looks like:
{
"id": "/mesosexporter",
"instances": 1,
"cpus": 0.1,
"mem": 25,
"constraints": [["hostname", "UNIQUE"]],
"acceptedResourceRoles": ["slave_public"],
"container": {
"type": "DOCKER",
"docker": {
"image": "prom/mesos-exporter",
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 9105,
"hostPort": 9105,
"protocol": "tcp"
}
],
"privileged": true,
"parameters": [
{ "key": "-exporter.discovery", "value": "true" },
{ "key": "-exporter.discovery.master-url",
"value": "http://mymasterDNS.amazonaws.com:5050" }
]
}
},
"healthChecks": [{
"protocol": "TCP",
"gracePeriodSeconds": 600,
"intervalSeconds": 30,
"portIndex": 0,
"timeoutSeconds": 10,
"maxConsecutiveFailures": 2
}]
}
Mesos version: 0.22.1
Marathon version: 0.8.2-SNAPSHOT
The app remains in 'deploying' state after using this JSON
But only meter exposed to Prometheus is 'mesos_exporter_slave_scrape_errors_total'. What are the other meters which mesos exporter exposes to Promethues.
If the mesos-exporter is listening on port 9100, you can check http://<hostname>:9100/metrics to know what metrics are being exposed. I am referring the prom/node-exporter that I have setup on one of the systems.
but if I want to run mesos exporter as a docker container how should I specify the configuration?
I am assuming you're POST'ing this JSON file to the Marathon REST API to start Docker containers. If that is indeed the case, you can specify additional options using parameters JSON directive. More info can be found on Marathon docs under the section Privileged Mode and Arbitrary Docker Options.
Hope that helps!
Using the args primitive solved the problem. The equivalent docker command is docker run -p 9105:9105 prom/mesos-exporter -exporter.discovery=true -exporter.discovery.master-url="mymasternodeDNS:5050" -log.level=debug
As the parameters 'exporter.discovery', 'exporter.discovery.master-url' and 'log.level' are for the image entry point and not for 'docker run', 'args' has to be used.
The format for 'args' as added in the working JSON is:
"args": [
"-exporter.discovery=true",
"-exporter.discovery.master-url=http://mymasternodeDNS:5050",
"-log.level=debug"]

Resources