Is there a way to force re-election in an Apache Mesos master quorum?

We have an Apache Mesos master cluster running in HA mode with 3 nodes (each with 4 CPUs and 15 GB of memory). The cluster stops offering resources when its memory gets completely exhausted (this happens every week).
We have more than 200 agents connected to this master, and the number keeps growing, so the long-term solution is to increase CPU and memory. But until we get bigger VMs, we have to babysit it every day, monitoring CPU load and memory and restarting the mesos-master service (which forces a re-election) as a precaution.
To avoid this manual effort, we are planning to force a re-election of the cluster at a specific interval, say every 2 days.
So my question is: does the Mesos master support forcing a re-election like this? If so, how? Is it recommended, and does it have any caveats?
Appreciate your time to answer and help me here!
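For what it's worth, Mesos has no dedicated "force re-election" API that I know of; restarting the mesos-master service on the current leader is the usual way to trigger a new ZooKeeper election. A minimal sketch of the memory watchdog described above (the threshold, service name, and cron cadence are all assumptions):

```shell
#!/bin/sh
# Hypothetical watchdog: restart the local mesos-master (forcing a ZooKeeper
# re-election if this node is the leader) once used memory crosses a threshold.
# Could be run from cron, e.g. */5 * * * * /usr/local/bin/mesos-master-watchdog.sh
THRESHOLD_PCT=90

# Used-memory percentage derived from /proc/meminfo (MemTotal vs MemAvailable).
used_pct=$(awk '/MemTotal/ {t=$2} /MemAvailable/ {a=$2} END {printf "%d", (t-a)*100/t}' /proc/meminfo)

if [ "$used_pct" -ge "$THRESHOLD_PCT" ]; then
    echo "memory at ${used_pct}%, restarting mesos-master to force re-election"
    systemctl restart mesos-master || echo "restart failed (service missing or not root)"
else
    echo "memory at ${used_pct}%, no action"
fi
```

Restarting a follower is harmless; restarting the leader leaves a brief window with no leader during which frameworks receive no offers, so a cadence of days rather than minutes seems sensible.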

Related

k8s tasks slowdown with no excess CPU or RAM usage

I have a small virtualised k8s cluster running on top of KVM on 2 physical machines. After deploying Ceph (a storage framework), all the k8s tasks like creating or starting containers became insufferably slow, taking over a minute to get from creating to starting a container.
I checked the nodes for excess CPU or RAM usage; both the worker nodes and the master node are well below consuming half of their assigned resources. I have about 10-20 pods running on each node at the moment.
I am not sure what to google, and given my level of k8s knowledge I am completely out of ideas. Anyone with similar experience who could point me in the right direction would be much appreciated!

GKE: How to handle deployments with CPU intensive initialization?

I have a GKE cluster (n1-standard-1, master version 1.13.6-gke.13) with 3 nodes on which I have 7 deployments, each running a Spring Boot application. A default Horizontal Pod Autoscaler was created for each deployment, with target CPU 80% and min 1 / max 5 replicas.
During normal operation, there is typically 1 pod per deployment and CPU usage at 1-5%. But when the application starts, e.g after performing a rolling update, the CPU usage spikes and the HPA scales up to max number of replicas reporting CPU usage at 500% or more.
When multiple deployments start at the same time, e.g. after a cluster upgrade, this often causes various pods to be unschedulable because the cluster is out of CPU, and some pods get stuck in the "Preempting" state.
I have changed the HPAs to max 2 replicas since currently that's enough. But I will be adding more deployments in the future and it would be nice to know how to handle this correctly. I'm quite new to Kubernetes and GCP so I'm not sure how to approach this.
Here is the CPU chart for one of the containers after a cluster upgrade earlier today (chart not reproduced here).
Everything runs in the default namespace and I haven't touched the default LimitRange with 100m default CPU request. Should I modify this and set limits? Given that the initialization is resource demanding, what would the proper limits be? Or do I need to upgrade the machine type with more CPU?
The HPA only takes ready pods into account. Since your pods only experience a spike in CPU usage during the early stages, your best bet is to configure a readiness probe that only reports ready once the CPU usage comes down, or one with an initialDelaySeconds longer than the startup period, so that the startup spike is not taken into account by the HPA.
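A sketch of such a probe in the container spec (the endpoint, port, and timings are assumptions; tune initialDelaySeconds to your app's measured startup time):

```yaml
# Container-level fragment of a Deployment. While the pod is unready, the HPA
# leaves it out of its CPU average, so the startup burst does not trigger
# scale-up. Endpoint and timings below are hypothetical.
readinessProbe:
  httpGet:
    path: /actuator/health    # hypothetical Spring Boot health endpoint
    port: 8080
  initialDelaySeconds: 60     # longer than the CPU-heavy startup phase
  periodSeconds: 10
```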

How do I find a marathon runaway process

I have a Mesos/Marathon system, and it is working well for the most part. There are upwards of 20 processes running, most of them using only part of a CPU. However, sometimes (especially during development) a process will spin up and start using as much CPU as is available. I can see on my system monitor that there is a pegged CPU, but I can't tell which Marathon process is causing it.
Is there a monitoring app showing CPU usage for Marathon jobs, something that shows it over time? This would also help with understanding scaling and CPU requirements. Tracking memory usage would be good, but it's secondary to CPU.
It seems that you haven't configured any isolation mechanism on your agent (slave) nodes. mesos-slave comes with an --isolation flag that defaults to posix/cpu,posix/mem, which means isolation at the process level (pretty much no isolation at all). Using cgroups/cpu,cgroups/mem isolation ensures that a task will be killed by the kernel if it exceeds its memory limit. Memory is a hard constraint that can be easily enforced.
Restricting CPU is more complicated. If you have a machine that offers 8 CPU cores to Mesos and each of your tasks is set to require cpus=2.0, you'll be able to run at most 4 tasks there. That's easy, but at any given moment any of your 4 tasks might utilize all the idle cores. If one of your jobs misbehaves, it can affect the other jobs running on the same machine. For restricting CPU utilization, see the Completely Fair Scheduler (or the related question How to understand CPU allocation in Mesos? for more details).
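For illustration, an agent invocation with cgroup isolation might look like this (the ZooKeeper addresses are placeholders; --cgroups_enable_cfs turns on hard CPU quotas via CFS, so a task with cpus=2.0 cannot burst past 2 cores even on an idle machine):

```shell
mesos-slave --master=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
            --isolation=cgroups/cpu,cgroups/mem \
            --cgroups_enable_cfs
```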
Regarding monitoring, there are many possibilities available; choose an option that suits your requirements. You can also combine several of them. Some are open source, others are enterprise-level solutions (in random order):
collectd for gathering stats, Graphite for storing, Grafana for visualization
Telegraf for gathering stats, InfluxDB for storing, Grafana for visualization
Prometheus for storing and gathering data, Grafana for visualization
Datadog for a cloud based monitoring solution
Sysdig platform for monitoring and deep insights

How to select the CPU parameter for Marathon apps run on Mesos?

I've been playing with a Mesos cluster for a little bit, and I'm thinking of using Mesos in our production environment. One problem I can't seem to find an answer to: how do I properly schedule long-running apps that will have varying load?
Marathon has a "cpus" property, where you can set a weight for CPU allocation for a particular app (I'm planning on running Docker containers). But from what I've read, it is only a weight, not a reservation, allocation, or limit that I am setting for the app. The app can still use 100% of the CPU on the server if it's the only thing running. The problem is that for long-running apps, resource demands change over time; a web server's load, for example, is directly proportional to its traffic. Coupled with Mesos treating this setting as a "reservation", I am choosing between two evils: set it too low, and Mesos may start too many processes on the same host and all of them will suffer, with the host's CPU going past 100%; set it too high, and the CPU will sit idle, as the reservation is made (or so Mesos thinks) but nothing is using those resources.
How do you approach this problem? Am I missing something in how Mesos and Marathon handle resources?
I was thinking of an ideal way of doing this:
Specify a weight for CPU for different apps (on the order of, say, 0.1 through 1), so that when the going gets tough, higher-priority apps get more (as it is right now)
Have each Mesos slave report "Available LA" with its status (e.g. if the 10-minute LA is 2 with 8 CPUs available, report 6 "Available LA")
Configure Marathon to require the "Available LA" resource on a slave to schedule a task (e.g. don't start on a particular host if its Available LA is < 2)
When available LA goes to 0 (e.g. traffic surges on a server right after a job was started there), have Marathon move jobs to another slave, one with more "Available LA"
Is there a way to achieve any of this?
So far, I gather that I can possibly write a custom isolator module that will run on the slaves and report this custom metric to the master, which I could then use in resource negotiation. Is this true?
I wasn't able to find anything on Marathon rescheduling tasks on different nodes if one becomes overloaded. Any suggestions?
As of Mesos 0.23.0, oversubscription is supported. Unfortunately, it is not yet implemented in Marathon: https://github.com/mesosphere/marathon/issues/2424
To do allocation dynamically, you can use the Mesos slave metrics along with the Marathon HTTP API to scale, as I've done here in a different context. My colleague Niklas did related work with nibbler, which might also be of help.
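As a rough illustration of driving the Marathon HTTP API from slave-side load metrics: the script below computes the "Available LA" idea from the question on one agent and, when headroom runs out, resizes the app. The Marathon URL, app id, and scaling policy are all hypothetical; PUT /v2/apps/{app_id} with an "instances" field is Marathon's documented way to resize an app.

```shell
#!/bin/sh
# Sketch only: placeholder Marathon URL and app id.
MARATHON_URL="http://marathon.example.com:8080"
APP_ID="/web-frontend"

cores=$(nproc)
la1=$(cut -d' ' -f1 /proc/loadavg)

# "Available LA" = cores minus the 1-minute load average.
available_la=$(awk -v c="$cores" -v l="$la1" 'BEGIN {printf "%.1f", c - l}')
echo "available load-average headroom: $available_la"

if awk -v a="$available_la" 'BEGIN {exit !(a < 1)}'; then
    # Resize the app; Marathon's scheduler places the new instance on an
    # agent that still has resource offers.
    curl -s -m 2 -X PUT "$MARATHON_URL/v2/apps$APP_ID" \
         -H 'Content-Type: application/json' \
         -d '{"instances": 3}' || echo "Marathon not reachable"
fi
```

Note this scales out rather than moving a task: Marathon does not reschedule a running task off an overloaded node, so adding instances (and letting the old one drain or be killed) is the closest equivalent.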

How Much RAM For a Jenkins Master Instance?

I want to set up a Jenkins master that will let slaves do all the builds.
The master is just a traffic cop, getting SVN hook triggers and kicking off slave builds.
There will be about 10 Java Maven build jobs in this setup.
I wish to run the Jenkins master on a hosted server that has limited resources (RAM).
I will run the slaves on some nicely loaded machines on my own network.
So my question is how little RAM could I get away with allocating to the Master Jenkins instance? 256M? 384M? 512M? Other?
I cannot seem to find this specific info in the Jenkins docs.
A coworker asked me the same question, and my first answer was that 1-2 GB should be enough. Later I discovered this entry in the Jenkins documentation:
Have a beefy machine for Jenkins master & do not run slaves on the master machine. Every slave has certain memory allocated in the master JVM, so the bigger the RAM for the master, the better it is. We typically hear customers allocate 16G or so.
Source: https://docs.cloudbees.com/docs/cloudbees-core/latest/traditional-install-guide/system-requirements
As of mid-2016, the official documentation says
Memory Requirements for the Master
The amount of memory Jenkins needs is largely dependent on many factors, which is why the RAM allotted for it can range from 200 MB for a small installation to 70+ GB for a single and massive Jenkins master. However, you should be able to estimate the RAM required based on your project build needs.
Each build node connection will take 2-3 threads, which equals about 2 MB or more of memory. You will also need to factor in CPU overhead for Jenkins if there are a lot of users who will be accessing the Jenkins user interface.
It is generally a bad practice to allocate executors on a master, as builds can quickly overload a master’s CPU/memory/etc and crash the instance, causing unnecessary downtime. Instead, it is advisable to set up slaves that the Jenkins master can delegate build jobs to, keeping the bulk of the work off of the master itself.
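The 2 MB-per-connection figure quoted above lends itself to a back-of-envelope sizing calculation (the 512 MB base allowance is my own assumption for a small install, not a number from the docs):

```shell
#!/bin/sh
# Rough heap estimate: ~2 MB per connected agent (per the guideline above)
# plus an assumed base allowance for Jenkins core itself.
AGENTS=10
BASE_MB=512
PER_AGENT_MB=2
estimated_mb=$((BASE_MB + AGENTS * PER_AGENT_MB))
echo "rough master heap estimate: ${estimated_mb} MB"   # 532 MB for 10 agents
```

For a master that is "just a traffic cop" with around 10 agents, this lands well under 1 GB, which is consistent with the smaller answers below.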
I don't think there is a rule of thumb for this. Our master uses 2G and we have 6 slaves. We have close to 60 jobs, most of them Maven. We have never had memory issues in the past, and our slaves are always busy (I always see some job or other being kicked off).
You could start with 512M and see how it works; if you see memory issues, increase the memory. That is the only way I can think of. To monitor the memory of your master, use the Jenkins Monitoring Plugin. This plugin integrates JavaMelody and lets you monitor the JVM of your master and even your slaves. Good luck!
I have a Jenkins master with a few dozen jobs and slaves, but I don't run builds or tests on the master. Based on my observation, the memory consumption is not that big; it rarely goes above 2 or 3 GB. It also depends on the heap size option you specify for the Jenkins Java process. I would recommend at least 2 GB of RAM in your case. You can always load-balance the builds across slaves if you need to.
It depends on what you are building, how often you run builds, and how fast you need them to build. In my experience you need 1 GB per concurrent build, but this can vary depending on how resource-intensive your builds are. Monitor your memory usage under the heaviest load: if memory usage gets to 70-80% or more, add more memory; if it's around 30-40% or less, allocate less.
Also keep an eye on disk usage. About 4 years ago I had a TeamCity build server that kept falling to its knees; the problem turned out to be that the master and the slaves were all virtual servers sharing the same attached storage, and the disks couldn't keep up. That was more of a problem with the VM environment, but it's still something to keep in mind.
