Mesos slave doesn't offer resources - mesos

I have a mesos cluster with 1 master and 1 slave.
I followed this tutorial to set up the configuration of both the master and the slave. And I can see the Mesos Web UI in http://master:5050 and the marathon UI in http://master:8080.
Also Activated=1, which means that the mesos slave is successfully connected to the mesos master I guess.
However In the Resources section, the Offered property is always equal to 0.
The total amount of available CPU is 4. 13GB of RAM and 38GB of disk.
Can anyone help with this issue ?

Related

Can I change GCP Dataproc cluster from Standard (1 master, N workers) to High Availability?

I have created a GCP Dataproc cluster with Standard (1 master, N workers). Now I want to upgrade it to High Availability (3 masters, N workers) - Is it possible?
I tried GCP, GCP alpha and GCP beta commands. For example GCP beta documented here: https://cloud.google.com/sdk/gcloud/reference/beta/dataproc/clusters/update.
It has option to scale worker nodes, however does not have option to switch from standard to high availability mode. Am I correct?
You can upgrade the master node by going into VM Instances section under your cluster , Stop your Master VM and Edit the configuration to
You may always upgrade your master node machine type and also add more worker node.
While that would improve your cluster job performance but noting to do with HA.
The answer is - no. Once HA cluster is created, it can't be downgraded and vice versa. You can add worker nodes, however master node can't be altered.
Yes, you can always do that, for changing the machine type of master node
you first need to stop the master VM instance, then you can change the machine type
Even the machine type of worker node can be changed, only we need to do is stop the machine and edit the machine configuration.

Calculating yarn.nodemanager.resource.cpu-vcores for a yarn cluster with multiple spark clients

If I have 3 spark applications all using the same yarn cluster, how should I set
yarn.nodemanager.resource.cpu-vcores
in each of the 3 yarn-site.xml?
(each spark application is required to have it's own yarn-site.xml on the classpath)
Does this value even matter in the client yarn-site.xml's ?
If it does:
Let's say the cluster has 16 cores.
Should the value in each yarn-site.xml be 5 (for a total of 15 to leave 1 core for system processes) ? Or should I set each one to 15 ?
(Note: Cloudera indicates one core should be left for system processes here: http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/ however, they do not go into details of using multiple clients against the same cluster)
Assume Spark is running with yarn as the master, and running in cluster mode.
Are you talking about the server-side configuration for each YARN Node Manager? If so, it would typically be configured to be a little less than the number of CPU cores (or virtual cores if you have hyperthreading) on each node in the cluster. So if you have 4 nodes with 4 cores each, you could dedicate for example 3 per node to the YARN node manager and your cluster would have a total of 12 virtual CPUs.
Then you request the desired resources when submitting the Spark job (see http://spark.apache.org/docs/latest/submitting-applications.html for example) to the cluster and YARN will attempt to fulfill that request. If it can't be fulfilled, your Spark job (or application) will be queued up or there will eventually be a timeout.
You can configure different resource pools in YARN to guarantee a specific amount of memory/CPU resources to such a pool, but that's a little bit more advanced.
If you submit your Spark application in cluster mode, you have to consider that the Spark driver will run on a cluster node and not your local machine (that one that submitted it). Therefore it will require at least 1 virtual CPU more.
Hope that clarifies things a little for you.

How to allocate physical resources for a big data cluster?

I have three servers and I want to deploy Spark Standalone Cluster or Spark on Yarn Cluster on that servers.
Now I have some questions about how to allocate physical resources for a big data cluster. For example, i want to know whether i can deploy Spark Master Process and Spark Worker Process on the same node. Why?
Server Details:
CPU Cores: 24
Memory: 128GB
I need your help. Thanks.
Of course you can, just put host with Master in slaves. On my test server I have such configuration, master machine is also worker node and there is one worker-only node. Everything is ok
However be aware, that is worker will fail and cause major problem (i.e. system restart), then you will have problem, because also master will be afected.
Edit:
Some more info after question edit :) If you are using YARN (as suggested), you can use Dynamic Resource Allocation. Here are some slides about it and here article from MapR. It a very long topic how to configure memory properly for given case, I think that these resources will give you much knowledge about it
BTW. If you have already intalled Hadoop Cluster, maybe try YARN mode ;) But it's out of topic of question

Could not determine the current leader

I'm in this situation in which I got two masters and four slaves in mesos. All of them are running fine. But when I'm trying to access marathon I'm getting the 'Could not determine the current leader' error. I got marathon in both masters (117 and 115).
This is basically what I'm running to get marathon up:
java -jar ./bin/../target/marathon-assembly-0.11.0-SNAPSHOT.jar --master 172.16.50.117:5050 --zk zk://172.16.50.115:2181,172.16.50.117:2181/marathon
Could anyone shed some light over this?
First, I would double-check that you're able to talk to Zookeeper from the Marathon hosts.
Next, there are a few related points to be aware of:
Per the Zookeeper administrator's guide (http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkMulitServerSetup) you should have an odd number of Zookeeper instances for HA. A cluster size of two is almost certainly going to turn out badly.
For a highly available Mesos cluster, you should run an odd number of masters and also make sure to set the --quorum flag appropriately based on that number. See the details of how to set the --quorum flag (and why it's important) in the operational guide on the Apache Mesos website here: http://mesos.apache.org/documentation/latest/operational-guide
In a highly-available Mesos cluster (#masters > 1) you should let both the Mesos agents and the frameworks discover the leading master using Zookeeper. This lets them rediscover the leading master in case a failover occurs. In your case assuming canonical ZK ports you would set the --zk flag on the Mesos masters to --zk=zk://172.16.50.117:2181,172.16.50.115:2181/mesos (add a third ZK instance, see the first point above). The same value should be used for the --master flags in both the Mesos agents and Marathon, instead of specifying a single master.
It's best to run an odd number of masters in your cluster. To do so, either add another master so you have three or remove one so you have only one.

how do I set servers for hadoop? (CDH)

I am running 3 instances using AWS EC2 (m1.small -- 20GB HDD & 1.7 GB RAM).
On the cluster, there will be hadoop, mapReduce, and several monitoring processes.
This is how I split :
1 Master server
NameNode
SecondaryNameNode
JobTracker
Activity Monitor
Alert Publisher
Event Server
Host Monitor
Service Monitor
2 Slave servers
TaskTracker
DataNode
Because of the server's spec, I think it is kind of burden for the master server to run those 8 jobs. How do I divide them? Should I make another server to allocate monitoring processes?
Having NameNode & SeondaryNameNode on same server does not serve any purpose.
With 1.7 GB ram /machine i don't think you can do much. You need more nodes or higher configuration. 8GB/ Node i think should be minimum.
You can assign some services to slave nodes also.

Resources