Marathon not running tasks - Mesos

I am provisioning a Mesos cluster with Terraform.
Using Marathon, I ran into the following problem:
tasks are only scheduled if the Marathon leader runs on the same machine as the leading Mesos master.
Otherwise they wait forever.
Any idea how to solve this?

Related

If an AWS spot instance is stopped by AWS and then restarts, will it just start where it left off?

I am running Luigi, a pipeline manager, which is processing 1000 tasks. Currently I poll for the AWS termination notice. If it is present, I requeue the job, wait 30 minutes, then launch a new server that starts all the tasks from scratch. However, it sometimes restarts the same job multiple times, which is inefficient.
Instead I am considering using create_fleet with InstanceInterruptionBehavior=stop. If I do this, then when the instance restarts, will it still be running the Luigi daemon and retain the state of all the tasks?
All InstanceInterruptionBehavior=stop does is effectively shut down your EC2 instance rather than terminate it. Since the "persistent" request type is required, in addition to EBS-backed storage, you will keep all the data on the attached EBS volumes at the time of the instance stop.
It is completely dependent on the application itself (Luigi in this case) being able to store the state of its execution and pick back up where it left off. For one, you'll want to ensure the service daemon is enabled to start automatically on system start (example):
sudo systemctl enable yourservice
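For completeness, here is a rough sketch of the corresponding create-fleet call (the launch template ID and capacity values are placeholders, and I believe stop-on-interruption is only supported for fleets of type maintain):

# Sketch only: request Spot capacity that stops (not terminates) on interruption
aws ec2 create-fleet \
    --type maintain \
    --spot-options '{"InstanceInterruptionBehavior":"stop"}' \
    --launch-template-configs '[{"LaunchTemplateSpecification":{"LaunchTemplateId":"lt-0123456789abcdef0","Version":"1"}}]' \
    --target-capacity-specification '{"TotalTargetCapacity":1,"DefaultTargetCapacityType":"spot"}'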

Mesos task history after restart

I am using Mesos for container orchestration and get task history from Mesos using the /tasks endpoint.
Mesos is running in a 7-node cluster and ZooKeeper in a 3-node cluster. I assume Mesos uses ZooKeeper to store the task history. We sometimes lose history when we restart Mesos. Does it store it in memory? I am trying to understand what is happening here.
My questions are:
Where does it store task histories?
How can we configure the task history cleanup policy?
Why do we lose complete task history on restarting Mesos?
To answer your questions:
Task history/state for Mesos is stored in memory and in the replicated_log (see the Mesos replicated log documentation). The default is to use the replicated_log; to store state completely in memory, without the replicated_log, you would have to set --registry=in_memory in your Mesos master flags, as described on the configuration page.
Most users configure task history cleanup using these three flags (there are more, but these are the most common): --max_completed_frameworks=VALUE, --max_completed_tasks_per_framework=VALUE, and --max_unreachable_tasks_per_framework=VALUE, as described in the same document.
Yes, task history for the /tasks endpoint is lost every time a Mesos master is restarted. However, the /state endpoint will still contain all task status changes over time.
Edited to reflect information about the /tasks endpoint, not the /state endpoint.
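
To make that concrete, here is a hypothetical mesos-master invocation combining the registry and cleanup flags above (hostnames and values are examples, not recommendations):

mesos-master \
    --zk=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    --quorum=2 \
    --work_dir=/var/lib/mesos \
    --registry=replicated_log \
    --max_completed_frameworks=50 \
    --max_completed_tasks_per_framework=1000 \
    --max_unreachable_tasks_per_framework=1000

The retained history can then be fetched from the master, e.g. curl http://<master>:5050/tasks.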

Do EC2 system reboot events cause EMR to wait for the node or to replace it?

I have a long-running EMR cluster. I received EC2 event notifications of upcoming system reboots. The help document advises that rebooting these instances manually will not reschedule the event, though stopping and starting them might.
The EMR cluster claims that if a core node goes unresponsive it will provision a new one. I suspect this provisioning takes longer than a reboot, so what I cannot find in the documentation is whether the EC2 event is known to EMR and the cluster will wait for its missing core nodes (or task nodes) to reboot and rejoin, or whether EMR will respond as though these instances disappeared unexpectedly and start provisioning replacements even as the nodes come back and rejoin the cluster.
Does anyone know which it will be?
It turns out the AWS service person performing the hardware replacement and rebooting the instance was tasked with making the corresponding adjustments in EMR. They started by adding a node, then draining the old node of tasks. Then they rebooted the node and it was reattached to EMR. Finally they drained the added node and shut it down.
I'm not sure this happens every time there's a reboot event, though. It seems the scripted service steps are adapted for different types of case.

Mesos/Marathon checkpointing and HA

Mesos and Marathon mention checkpointing from time to time, but I couldn't find a good explanation of how it works anywhere. Also, what does it mean in practice?
1) Is the task's current state continuously being stored, or is only the task ID stored? Where is it stored and what does it contain?
2) Say there are two Marathon instances, and Marathon has been running Nginx for a week, then goes down. Does that mean the actual Nginx application keeps running under the second Marathon instance, or is the task just restarted from the beginning? If the task's actual state is copied, isn't there a lot of data to be continuously persisted and passed around between slaves?
Slave recovery is a feature of Mesos that allows:
executors/tasks to keep running when the slave process is down, and
a restarted slave process to reconnect with the running executors/tasks on that slave
(see Mesos Slave Recovery).
So regarding your questions, this means:
Enough information (a little more than the task ID) is stored so that a new slave process can reconnect to the still-running executor/task.
As the task state itself is not checkpointed, Marathon would restart the task from the beginning.
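For reference, a minimal sketch of the agent-side recovery flags (an assumed invocation; mesos-agent was called mesos-slave in older releases, and the master address and work_dir are placeholders):

mesos-agent \
    --master=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    --work_dir=/var/lib/mesos \
    --recover=reconnect \
    --recovery_timeout=15mins \
    --strict

The checkpointed metadata lives under the agent's --work_dir, which is how a restarted agent process finds and reattaches to its still-running executors.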
Hope this helps,
Joerg

Why does Marathon not terminate jobs after the quorum is lost?

I'm working with Apache Mesos and Marathon. I have 3 master nodes and 3 slave nodes, and I configured Mesos with a quorum of 2. I then posted a JSON app definition to run a job with Marathon, and everything looked fine.
Then I shut down two master nodes to break the quorum. After this, Mesos unregistered all slaves and everything still looked OK, but when I inspected the slaves I found that the job was still running... Is this normal? I assumed Marathon would stop all jobs after the quorum was lost.
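For reference, the app was posted roughly like this (a stand-in definition, not my original JSON):

cat > app.json <<'EOF'
{
  "id": "/test-job",
  "cmd": "sleep 3600",
  "cpus": 0.1,
  "mem": 32,
  "instances": 1
}
EOF
curl -X POST -H 'Content-Type: application/json' \
    -d @app.json http://<marathon-host>:8080/v2/apps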
Part of the Mesos philosophy, especially for long-running services, is that a failure in one or more Mesos components should not need to stop the user application.
If a slave shuts down and the framework has checkpointing enabled, the executor driver will wait for the slave's --recovery_timeout (default 15min) before shutting down the executor/tasks. To prevent this, disable checkpointing on your framework (in Marathon, just set --checkpoint=false when starting Marathon). See also Marathon's --failover_timeout on https://mesosphere.github.io/marathon/docs/command-line-flags.html
On the other hand, if it's just the masters/ZooKeeper nodes that shut down and the slaves are still up and running, the slaves can still monitor the tasks and queue up status updates, so the tasks can stay alive. If ZooKeeper loses quorum, then there is no leading master, and each slave will continue to operate independently until a new leader is detected, at which point it will re-register with the master and send any queued status updates.
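One way to verify this is to ask a slave directly what it is running, independent of the masters; the agent serves its own state endpoint (default port 5051):

curl http://<slave-host>:5051/state

Any tasks listed there keep running even while no leading master exists.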
