I would like to set up a SLURM cluster. How many machines do I need at minimum? Can I start with 2 machines (one being only a client, and one being both client and server)?
As Carles wrote, you can use just one computer if you want, running both the controller (slurmctld) and the worker (slurmd) daemons.
If you want to test some configurations and observe Slurm's behavior, you can even run multiple worker daemons on a single machine to simulate a larger cluster, using slurmd's -N <hostname> option.
If you want to actually get some computation done, you can run the controller and the worker daemon on the same node. If you want the system to stay responsive, just configure Slurm to believe the system has one core and 2 GB of RAM less than it actually has, so there is some room left for the OS and the Slurm daemons.
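For example, here is a rough sketch of the relevant slurm.conf lines for a single 8-core / 16 GB machine called node01 that runs both daemons (the hostname and the numbers are placeholders, adjust them to your hardware):

ClusterName=test
SlurmctldHost=node01
NodeName=node01 CPUs=7 RealMemory=14336 State=UNKNOWN
PartitionName=main Nodes=node01 Default=YES MaxTime=INFINITE State=UP

Advertising CPUs=7 and RealMemory=14336 on an 8-core / 16 GB box is what keeps one core and 2 GB free for the OS and the daemons.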
As a side note, the pages you link in your question correspond to a very old version of Slurm; the current documentation is hosted on SchedMD's website.
You can start with only one machine, but two machines is the more standard configuration: one machine is the controller and the other the "worker" node. With this model you can add as many "worker" nodes to the cluster as you like. This way the controller does not execute jobs and does not suffer interference from them.
Related
I've been playing with a Mesos cluster for a little bit, and I'm thinking of using a Mesos cluster in our production environment. One problem I can't seem to find an answer to: how do you properly schedule long-running apps that will have varying load?
Marathon has a "CPUs" property, where you can set the weight for CPU allocation to a particular app. (I'm planning on running Docker containers.) But from what I've read, it is only a weight, not a reservation, allocation, or limitation that I am setting for the app. It can still use 100% of the CPU on the server if it's the only thing that's running. The problem is that for long-running apps, resource demands change over time. A web server's demand, for example, is directly proportional to its traffic. Coupled with Mesos treating this setting as a "reservation", I am choosing between two evils: set it too low, and it may start too many processes on the same host and all of them will suffer, with host CPU going past 100%; set it too high, and CPU will go idle, as the reservation is made (or so Mesos thinks) but nothing is using those resources.
How do you approach this problem? Am I missing something in how Mesos and Marathon handle resources?
I was thinking of an ideal way of doing this:
Specify a CPU weight for different apps (on the order of, say, 0.1 through 1), so that when the going gets tough, higher-priority apps get more (as it is right now)
Have the Mesos slave report "Available LA" (load average) with its status (e.g. if the 10-minute LA is 2 and 8 CPUs are available, report an "Available LA" of 6)
Configure Marathon to require the "Available LA" resource on the slave to schedule a task (e.g. don't start on a particular host if its Available LA is < 2)
When Available LA goes to 0 (due to an influx of traffic at the same time as a job that was started on the same server before the influx), have Marathon move jobs to another slave, one that has more "Available LA"
Is there a way to achieve any of this?
So far, I gather that I could possibly write a custom isolator module that will run on the slaves and report this custom metric to the master. Then I could use it in resource negotiation. Is this true?
I wasn't able to find anything on Marathon rescheduling tasks on different nodes if one becomes overloaded. Any suggestions?
As of Mesos 0.23.0, oversubscription is supported. Unfortunately it is not yet implemented in Marathon: https://github.com/mesosphere/marathon/issues/2424
To do allocation dynamically, you can use the Mesos slave metrics along with the Marathon HTTP API to scale, for example, as I've done here in a different context. My colleague Niklas did related work with nibbler, which might also be of help.
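To give an idea of the glue involved, here is a minimal sketch using curl; the hostnames, the ports (5051 and 8080 are the Mesos agent and Marathon defaults), the app id /web and the target instance count are all assumptions for illustration:

# read the load / CPU metrics reported by a Mesos slave
curl -s http://slave1:5051/metrics/snapshot
# scale the hypothetical app /web to 6 instances through Marathon's REST API
curl -s -X PUT -H "Content-Type: application/json" -d '{"instances": 6}' http://marathon:8080/v2/apps/web

A small script that polls the first endpoint and calls the second is essentially what the linked autoscaling examples do.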
I have a limited number of machines (3 machines). I want to simulate 5000 concurrent users for my website. I want to know if I can run multiple instances of jmeter-server on one host, something like this:
host1:
192.168.1.1:3000
192.168.1.1:3001
192.168.1.1:3002
host2:
192.168.1.2:3000
192.168.1.2:3001
192.168.1.2:3002
I don't want to run independent JMeter instances.
I haven't found multiple remotes on one machine to be any better than a single jmeter on the machine.
I have even found the opposite, since there are a lot more overheads.
I have found on some tests that one jmeter master can generate more samples than two or more slaves running in distributed mode.
To generate more samples, you need to use fewer local resources for other things. VMs, jmeter-server, etc. all add overhead, unless you are running on such a high-powered server that a single JVM can't make the most of it. Even then, the lowest-overhead method is to run another JMeter JVM.
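For reference, if you still want to try multiple jmeter-server processes on one box, the usual approach is to start each one on its own RMI port and list them all in the client's remote_hosts property, something along these lines (ports taken from the question, adjust as needed):

jmeter-server -Dserver_port=3000
jmeter-server -Dserver_port=3001
jmeter-server -Dserver_port=3002
# on the client machine, in jmeter.properties:
remote_hosts=192.168.1.1:3000,192.168.1.1:3001,192.168.1.1:3002,192.168.1.2:3000,192.168.1.2:3001,192.168.1.2:3002

But as said above, expect the extra JVMs to eat into the samples per second you can generate from each host.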
Is one ZooKeeper installation good enough to be used by Hadoop, Kafka, and Storm clusters?
I want to deploy everything on one test environment and try playing with those technologies.
Can I use one ZooKeeper installation for that? Could the same znode be dedicated to a number of services?
Yes, you can use a single zookeeper installation to support more than one cluster and indeed different types of clusters. This has been the case for a long time - here's a link to a good discussion on it from 2009: http://zookeeper-user.578899.n2.nabble.com/Multiple-ZK-clusters-or-a-single-shared-cluster-td3277547.html
For testing this is fine (even running it on one ZK server). For production use, though, you'll want at least a 3-node ensemble. And you should think carefully about running everything off a single cluster.
The reason is that if you run multiple Hadoop, Storm, and Kafka clusters off a single ZK cluster, that one set of servers becomes a single point of failure for all of your distributed systems. You can beef up the ZK setup with more than 3 servers (let's say 7) so that it can handle multiple failures, but if someone were to accidentally bring ZK down, all your distributed environments would come down too.
Some argue that you would be better off with more isolation between systems. I think it varies by use case but I'd be careful about putting all of your eggs in one ZK basket.
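If you do share one ensemble, a common way to keep the services from stepping on each other is to give each one its own chroot / parent znode; a sketch, assuming a 3-node ensemble zk1, zk2, zk3 on the default port 2181 (names are placeholders):

# Kafka (server.properties)
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
# Storm (storm.yaml)
storm.zookeeper.servers: ["zk1", "zk2", "zk3"]
storm.zookeeper.root: "/storm"
# HBase (hbase-site.xml): point zookeeper.znode.parent at /hbase

That keeps each system under its own subtree, but it does nothing about the single-point-of-failure concern above.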
I created a test with JMeter to test the performance of the Ghost blogging platform. Ghost is written in Node.js and was installed on a cloud server with 1 GB RAM and 1 CPU.
I noticed that after 400 concurrent users JMeter starts getting errors. Up to 400 concurrent users the load is normal. I decided to increase CPU and added 1 CPU.
But the errors reproduced, so I added 2 more CPUs, for a total of 4 CPUs. The problem still occurs after 400 concurrent users.
I don't understand why 1 CPU can handle 400 users and I get the same results with 4 CPUs.
During monitoring I noticed that only one CPU is busy and the 3 other CPUs are idle. When I checked the JMeter summary in the console there were errors on about 5% of requests. See screenshot.
I would like to know whether it is possible to balance the load between CPUs.
Are you using the cluster module to load-balance, on Node 0.10.x?
If so, please update your Node.js to 0.11.x.
Node 0.10.x used a balancing algorithm provided by the operating system. In 0.11.x the algorithm was changed to round-robin, so the load will be distributed more evenly from now on.
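If you want to be explicit about it rather than rely on the default, the policy can also be forced in code; a small sketch (the NODE_CLUSTER_SCHED_POLICY=rr environment variable does the same thing):

var cluster = require('cluster');
// request round-robin distribution; must be set before the first fork()
cluster.schedulingPolicy = cluster.SCHED_RR;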
Node.js is famously single-threaded (see this answer): a single node process will only use one core (see this answer for a more in-depth look), which is why you see that your program fully uses one core, and that all other cores are idle.
The usual solution is to use the cluster core module of Node, which helps you launch a cluster of Node processes to handle the load, by allowing you to create child processes that all share the same server ports.
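To make that concrete, here is a minimal sketch of a clustered HTTP server (a generic server on an arbitrary port 8080, not Ghost's actual startup code):

var cluster = require('cluster');
var http = require('http');
var os = require('os');

if (cluster.isMaster) {
  // fork one worker per core; they all share the same listening port
  os.cpus().forEach(function () { cluster.fork(); });
  // restart a worker if it dies
  cluster.on('exit', function (worker) {
    console.log('worker ' + worker.process.pid + ' died, forking a new one');
    cluster.fork();
  });
} else {
  http.createServer(function (req, res) {
    res.end('handled by pid ' + process.pid + '\n');
  }).listen(8080);
}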
However, you can't really use the cluster module directly without modifying Ghost's code. An option is to use pm2, which can wrap a Node program and use the cluster module for you. For instance, with four cores:
$ pm2 start app.js -i 4
In theory this should work, except if Ghost relies on some global variables (that can't be shared by every process).
Use the cluster core module, and nginx for load balancing. That's the bad part about Node.js: fantastic framework, but the developer has to get into the load-balancing mess, while Java and other runtimes make it seamless. Anyway, nothing is perfect.
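The nginx part is just an upstream block in front of several Node processes; a sketch, assuming two Ghost/Node instances listening locally on ports 2368 and 2369 (adjust to your setup):

upstream ghost_backend {
    server 127.0.0.1:2368;
    server 127.0.0.1:2369;
}
server {
    listen 80;
    location / {
        proxy_set_header Host $host;
        proxy_pass http://ghost_backend;
    }
}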
Server Scenario:
Ubuntu 12.04 LTS
Torque w/ Maui Scheduler
Hadoop
I am building a small cluster (10 nodes). The users will have the ability to ssh into any child node (LDAP auth), but this is really unnecessary since all the computation jobs they want to run can be submitted on the head node using Torque, Hadoop, or other resource managers tied to a scheduler to ensure priority and proper resource allocation throughout the nodes. Some users will have priority over others.
Problem:
You can't force a user to use a batch system like Torque. If they want to hog all the resources on one node or on the head node, they can just run their script / code directly from their terminal / ssh session.
Solution:
My main users, or "superusers", want me to set up a remote login timeout, which is what their current cluster uses to eliminate this problem. (I do not have access to that cluster, so I cannot grab its configuration.) I want to set up a 30-minute timeout on all remote sessions that are inactive (no keystrokes); if they are running processes, I also want the session to be killed along with all of its processes. This will keep people from bypassing the available batch system / scheduler.
Question:
How can I implement something like this?
Thanks for all the help!
I've mostly seen sysadmins solve this by not allowing ssh access to the nodes (often done using the pam module that ships with TORQUE), but there are other techniques. One is to use pbstools. The reaver script can be set up to kill user processes that aren't part of jobs (or that shouldn't be on those nodes). I believe it can also be configured to simply notify you. Some admins forcibly kill things, others educate users; that part is up to you.
Once you get people using jobs instead of ssh'ing directly, you may want to look into the cpuset feature in TORQUE as well. It can help you as you try to get users to use the amount of resources they request. Best of luck.
EDIT: noted that the pam module is one of the most common ways to restrict ssh access to the compute nodes.
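For reference, the TORQUE pam module (pam_pbssimpleauth) only lets a user log in to a compute node if they have a job running there; the wiring is roughly this in the compute node's /etc/pam.d/sshd, assuming the module was built and installed with your TORQUE (exact path and rule order depend on your distro):

# allow the account only if the user owns a running job on this node
account required pam_pbssimpleauth.so
# typically combined with pam_access.so or similar so that admins can still get in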