Stopping or re-registering orphaned tasks on a Mesos cluster - mesos

I am using Marathon to deploy my application on a Mesos cluster. Recently I experienced a failover of my Mesos master and Marathon. On restart, the master was able to identify the old tasks still running on the slaves, but did not show them in the active tasks pane, because Marathon registered with a new framework ID. Is it somehow possible to stop these orphaned tasks when we restart the Mesos master so that they can be redeployed using Marathon?

You should be able to shut down the old Marathon framework and kill all of its tasks using the /teardown endpoint on the Mesos master.
You use the endpoint by sending a POST request with the frameworkId in the body. For example:
curl -d 'frameworkId=#' -X POST localhost:5050/master/teardown
You can find the frameworkId of your old Marathon instance by using one of the master endpoints, such as /frameworks. Be careful to use the frameworkId of the old Marathon instance, not the new one.
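As a sketch, assuming the master runs on localhost:5050 and jq is installed, you could list the registered and completed frameworks with their IDs like this:
curl -s localhost:5050/master/frameworks | jq '.frameworks[], .completed_frameworks[] | {id, name, active}'
Then substitute the old Marathon framework's ID for the # placeholder in the teardown request above.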

Related

Mesos task history after restart

I am using Mesos for container orchestration and fetch task history from Mesos using the /tasks endpoint.
Mesos is running in a 7-node cluster and ZooKeeper is running in a 3-node cluster. I assume Mesos uses ZooKeeper to store the task history. We sometimes lose history when we restart Mesos. Does it store the history in memory? I am trying to understand what is happening here.
My questions are:
Where does Mesos store task history?
How can we configure the task-history cleanup policy?
Why do we lose the complete task history when restarting Mesos?
To answer your questions:
Task history/state for Mesos is stored in memory and in the replicated_log (see the Mesos replicated-log documentation for details). The default is to use the replicated_log; to store state completely in memory, without the replicated_log, you would have to set the master flag --registry=in_memory, described on the Mesos configuration page.
Most users configure task-history cleanup with these three flags (there are more, but these are the most common): --max_completed_frameworks=VALUE, --max_completed_tasks_per_framework=VALUE, and --max_unreachable_tasks_per_framework=VALUE, all described in the same configuration document; see the sketch after this answer.
Yes, task history for the /tasks endpoint is lost every time a Mesos Master is restarted. However, the /state endpoint will still contain all task status changes over time.
Edited to reflect information about the /tasks endpoint, not the /state endpoint.
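A minimal sketch of a master invocation using those cleanup flags (all values are illustrative, not recommendations; --quorum and --work_dir are required when using the replicated_log registry):
mesos-master \
  --registry=replicated_log \
  --quorum=2 \
  --work_dir=/var/lib/mesos \
  --max_completed_frameworks=50 \
  --max_completed_tasks_per_framework=1000 \
  --max_unreachable_tasks_per_framework=1000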

Provision to start group of applications on same Mesos slave

I have a cluster of 3 Mesos slaves and two applications: "redis" and "memcached". redis depends on memcached, and the requirement is that both applications/services start on the same node rather than on different slave nodes.
So I created an application group and added the dependency in the JSON file. After launching the JSON via the "v2/groups" REST API, I observe that sometimes both applications in the group start on the same node, but sometimes they start on different slaves, which breaks our requirement.
The intent/requirement is: if either application fails to start on a slave, both applications should fail over together to another slave node. Also, can I configure the JSON file to tell Marathon to start the application group on slave-1 (a specific slave) first if it is available, and otherwise on another slave in the cluster? And if the application group does start on another slave for some reason, can Marathon relaunch it on slave-1 once that node is available to serve requests again?
Thanks in advance for your help.
Edit/Update (2):
Support for pods is now available in Mesos, Marathon, and DC/OS:
DC/OS: https://dcos.io/docs/1.9/usage/pods/using-pods/
Mesos: https://github.com/apache/mesos/blob/master/docs/nested-container-and-task-group.md
Marathon: https://github.com/mesosphere/marathon/blob/master/docs/docs/pods.md
I assume you are talking about Marathon apps.
Marathon application groups don't have any semantics concerning co-location on the same node, and the same is the case for dependencies.
You seem to be looking for a Kubernetes-like pod abstraction in Marathon, which is on the roadmap but not yet available (see the update above :-)).
Hope this helps!
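Now that pods are available (per the update above), here is a minimal sketch of a Marathon pod that co-locates the two containers; the images, resources, and scaling values are illustrative, not taken from the question:
{
  "id": "/cache-pod",
  "scaling": { "kind": "fixed", "instances": 1 },
  "containers": [
    {
      "name": "memcached",
      "resources": { "cpus": 0.5, "mem": 256 },
      "image": { "kind": "DOCKER", "id": "memcached:1.4" }
    },
    {
      "name": "redis",
      "resources": { "cpus": 0.5, "mem": 256 },
      "image": { "kind": "DOCKER", "id": "redis:3.2" }
    }
  ],
  "networks": [ { "mode": "host" } ]
}
All containers of a pod instance are scheduled onto the same agent, which is exactly the co-location guarantee the question asks for.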
I think this should be possible (as a workaround) if you specify the correct app constraints within the group's JSON.
Have a look at the example request at
https://mesosphere.github.io/marathon/docs/generated/api.html#v2_groups_post
and the constraints syntax at
https://mesosphere.github.io/marathon/docs/constraints.html
e.g.
"constraints": [["hostname", "CLUSTER", "slave-1"]]
should do. The downside is that there will be no automatic failover to another slave that way. Still, I'd be curious why both apps specifically need to run on the same slave node...
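A sketch of what such a group definition could look like (the app IDs, commands, and resource values are illustrative, not from the question):
{
  "id": "/cache-stack",
  "apps": [
    {
      "id": "memcached",
      "cmd": "memcached -p $PORT0",
      "cpus": 0.5,
      "mem": 256,
      "instances": 1,
      "constraints": [["hostname", "CLUSTER", "slave-1"]]
    },
    {
      "id": "redis",
      "cmd": "redis-server --port $PORT0",
      "cpus": 0.5,
      "mem": 256,
      "instances": 1,
      "constraints": [["hostname", "CLUSTER", "slave-1"]],
      "dependencies": ["/cache-stack/memcached"]
    }
  ]
}
POST it to Marathon, e.g.:
curl -X POST -H 'Content-Type: application/json' -d @group.json http://<marathon-host>:8080/v2/groups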

I am not sure whether the application is running on just the master or the whole cluster for Spark on EC2

I am using Spark 1.1.1. I followed the instructions given at https://spark.apache.org/docs/1.1.1/ec2-scripts.html and have a cluster of 1 master node and 1 worker running on EC2.
I have made a jar of the application and rsynced it to the slaves. When I run the application using spark-submit with a deploy mode of client, the application works. However, when I do so with deploy mode cluster, it gives me an error saying it cannot find the jar on the worker. The permissions on the jar are 755 on both the master and the worker.
I am not sure whether, when I run the application with deploy-mode=client, the application is actually using the workers. I don't think it is, since the worker UI does not show any completed jobs. It does, however, show failed jobs during deploy-mode=cluster.
Am I doing something wrong? Thank you for your help.
You can check if executors are assigned to the application on the /executors page on port 4040 (e.g. http://localhost:4040/executors/). If you only see <driver> then you are not using the worker. If you see one line for <driver> and one other line (with ID 0, unless it has restarted), then the worker is also providing an executor to your application. Here you can also see how many tasks it has completed for your application, and other stats.
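To illustrate the difference between the two modes, here is a sketch of the spark-submit invocations (the master URL, class name, and jar path are placeholders):
# Client mode: the driver runs on the machine where you invoke spark-submit, so a local jar path works.
./bin/spark-submit --master spark://<master-host>:7077 --deploy-mode client --class com.example.MyApp /path/to/app.jar
# Cluster mode: the driver runs on a worker, so the jar must be present at the same path on every worker (or be fetched from a globally visible location such as HDFS or S3).
./bin/spark-submit --master spark://<master-host>:7077 --deploy-mode cluster --class com.example.MyApp /path/to/app.jar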

Why does Marathon not terminate jobs after the quorum is lost?

I'm working with Apache Mesos and Marathon. I have 3 master nodes and 3 slave nodes, and I configured Mesos with a quorum of 2. I then POSTed a JSON job definition to Marathon and everything looked fine.
Then I shut down two master nodes to break the quorum. After this, Mesos unregistered all the slaves and everything still looked ok, but when I inspected the slaves I found that the job was still running... is that normal? I assumed Marathon would stop all jobs after the quorum is lost.
Part of the Mesos philosophy, especially for long-running services, is that a failure in one or more Mesos components should not need to stop the user application.
If a slave shuts down and the framework has checkpointing enabled, the executor driver will wait for the slave's --recovery_timeout (default 15min) before shutting down the executor/tasks. To prevent this, disable checkpointing on your framework (in Marathon, just set --checkpoint=false when starting Marathon). See also Marathon's --failover_timeout on https://mesosphere.github.io/marathon/docs/command-line-flags.html
On the other hand, if it's just the Masters/ZKs that shut down, and the Slaves are still up and running, the slaves can still monitor the tasks and queue up status updates, so the tasks can stay alive. If ZK loses quorum, then there is no leading master, and each slave will continue to operate independently until a new leader is detected, at which point it will reregister with the master and send any queued status updates.
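As a sketch of the Marathon side (flag spelling as given in the answer above; the start script and ZooKeeper URLs are illustrative and depend on your installation):
# Start Marathon with checkpointing disabled, so that executors/tasks shut down when their slave dies instead of waiting out --recovery_timeout:
./bin/start --master zk://zk1:2181,zk2:2181,zk3:2181/mesos --zk zk://zk1:2181,zk2:2181,zk3:2181/marathon --checkpoint=false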

Running Hadoop/Storm tasks on Apache Marathon

I recently came across Apache Mesos and successfully deployed my Storm topology over Mesos.
I want to try running a Storm topology/Hadoop jobs over Apache Marathon (I had issues running Storm directly on Apache Mesos using the mesos-storm framework).
I couldn't find any tutorial/article that lists the steps to launch Hadoop/Spark tasks from Apache Marathon.
It would be great if anyone could provide any help or information on this topic (possibly a JSON job definition for Marathon for launching a Storm/Hadoop job).
Thanks a lot
Thanks for your reply. I went ahead and deployed a Storm-Docker cluster on Apache Mesos with Marathon. For service discovery I used HAProxy. This setup allows services (Nimbus, ZooKeeper, etc.) to talk to each other via ports, so, for example, adding multiple instances of a service is not a problem, since the cluster will find them using the ports and load-balance requests between all the instances of a service. The following GitHub project has the Marathon recipes and Docker images: https://github.com/obaidsalikeen/storm-marathon
Marathon is intended for long-running services, so you could use it to start your JobTracker or Spark scheduler, but you're better off launching the actual batch jobs like Hadoop/Spark tasks on a batch framework such as Chronos (https://github.com/airbnb/chronos). Marathon will restart tasks when they complete/fail, whereas Chronos (a distributed cron with dependencies) lets you set up scheduled jobs and complex workflows.
While a little outdated, the following tutorial gives a good example.
http://mesosphere.com/docs/tutorials/etl-pipelines-with-chronos-and-hadoop/
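For a concrete starting point, here is a sketch of a Chronos job definition for a batch Hadoop job (the name, command, schedule, and owner are illustrative; POST it to Chronos's /scheduler/iso8601 endpoint):
{
  "name": "hadoop-etl",
  "command": "hadoop jar /opt/jobs/etl.jar com.example.Etl",
  "schedule": "R/2015-06-01T00:00:00Z/PT24H",
  "epsilon": "PT30M",
  "owner": "ops@example.com"
}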
