I'm trying to squeeze every single bit from my cluster when configuring the Spark application, but it seems I'm not understanding everything completely right. I'm running the application on an AWS EMR cluster with 1 master and 2 core nodes of type m3.xlarge (15 GB RAM and 4 vCPUs per node). This means that by default 11.25 GB is reserved on every node for applications scheduled by YARN. The master node is used only by the resource manager (YARN), which means the remaining 2 core nodes are used to schedule applications (so we have 22.5 GB for that purpose). So far so good. But here comes the part I don't get. I'm starting the Spark application with the following parameters:
--driver-memory 4G --num-executors 4 --executor-cores 7 --executor-memory 4G
My understanding (from what I have read) is that 4 GB will be allocated for the driver and 4 executors will be launched with 4 GB each. A rough estimate makes it 5*4 = 20 GB (let's call it 21 GB with the expected memory overhead), which should be fine, as we have 22.5 GB for applications. Here's a screenshot from the Hadoop YARN UI after the launch:
What we can see is that 17.63 GB is used by the application, which is a little less than the expected ~21 GB, and this triggers the first question: what happened here?
Then I go to the Spark UI's executors page. Here comes the bigger question:
There are 3 executors (not 4), and the memory allocated for each of them and for the driver is 2.1 GB (not the specified 4 GB). So Hadoop YARN says 17.63 GB is used, but Spark says 8.4 GB is allocated. What is happening here? Is this related to the Capacity Scheduler (from the documentation I couldn't come to that conclusion)?
Can you check whether spark.dynamicAllocation.enabled is turned on? If it is, your application may give resources back to the cluster when they are no longer used. The number of executors launched at startup is decided by spark.executor.instances.
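If you are not sure whether it is on, you can read it from the running application's configuration; a minimal PySpark sketch (both keys simply fall back to the given defaults if they were never set explicitly):

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
conf = sc.getConf()

# Defaults shown are only fallbacks for the lookup, not authoritative cluster defaults
print(conf.get("spark.dynamicAllocation.enabled", "false"))
print(conf.get("spark.executor.instances", "not set"))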
If that is not the case: what is the source for your Spark application, and what partition size is set for it? Spark will literally map the partitions to the Spark cores; if your source has only 10 partitions and you try to allocate 15 cores, it will only use 10 cores, because that is all it needs. I guess this might be why Spark launched 3 executors instead of 4. Regarding memory, I would recommend revisiting your settings: you are asking for 4 executors and 1 driver with 4 GB each, which is roughly 5*4 GB + 5*384 MB ≈ 22 GB. You are trying to use up everything, and not much is left for your OS and the NodeManager to run, which is not ideal.
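To make that memory arithmetic concrete, here is a rough sketch of how the container sizes add up, assuming the usual overhead default of max(384 MB, ~10% of executor memory) and that YARN rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb (both of these vary by Spark/YARN version and EMR defaults, so treat the numbers as illustrative):

import math

executor_mem_mb = 4096                               # --executor-memory 4G
overhead_mb = max(384, int(0.10 * executor_mem_mb))  # spark.yarn.executor.memoryOverhead (assumed default)
min_alloc_mb = 256                                   # yarn.scheduler.minimum-allocation-mb (assumed)

def container_mb(requested_mb):
    # YARN normalizes each container request up to a multiple of the minimum allocation
    return int(math.ceil((requested_mb + overhead_mb) / float(min_alloc_mb))) * min_alloc_mb

per_container = container_mb(executor_mem_mb)
print(per_container)                 # 4608 MB with these assumptions
print(5 * per_container / 1024.0)    # ~22.5 GB for 1 driver + 4 executors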
Related
I have two linux machines, both with different configuration
Machine 1: 16 GB RAM, 4 Virtual Cores and 40 GB HDD (Master and Slave Machine)
Machine 2: 8 GB RAM, 2 Virtual Cores and 40 GB HDD (Slave machine)
I have set up a hadoop cluster between these two machines.
I am using Machine 1 as both master and slave.
And Machine 2 as slave.
I want to run my Spark application and utilise as many virtual cores and as much memory as possible, but I am unable to figure out the right settings.
My spark code looks something like:
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext, HiveContext, SparkSession

conf = SparkConf().setAppName("Simple Application")
sc = SparkContext('spark://master:7077')
hc = HiveContext(sc)
sqlContext = SQLContext(sc)
spark = SparkSession.builder.appName("SimpleApplication").master("yarn-cluster").getOrCreate()
So far, I have tried the following:
When I process my 2 GB file only on Machine 1 (in local mode, as a single-node cluster), it uses all 4 CPUs of the machine and completes in about 8 mins.
When I process my 2 GB file with the cluster configuration above, it takes slightly longer than 8 mins, though I expected it would take less time.
What number of executors, cores, and memory do I need to set to maximize the usage of the cluster?
I have referred to the article below, but because my machine configuration is different, I am not sure which parameters would fit best.
Apache Spark: The number of cores vs. the number of executors
Any help will be greatly appreciated.
When I process my 2 GB file with the cluster configuration above, it takes slightly longer than 8 mins, though I expected it would take less time.
It's not clear where your file is stored.
I see you're using Spark standalone mode, so I'll assume it's not split on HDFS into about 16 blocks (given a block size of 128 MB).
In that scenario, your entire file will be processed at least once in whole, plus the overhead of shuffling that data across the network.
If you used YARN as the Spark master with HDFS as the filesystem and a splittable file format, then the computation would go "to the data", and you could expect quicker run times.
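One quick way to check how much parallelism the input actually offers is to look at the partition count after loading it; a minimal sketch (the path is hypothetical):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partition-check").getOrCreate()

# Hypothetical path; on HDFS, a ~2 GB file with 128 MB blocks yields roughly 16 partitions
rdd = spark.sparkContext.textFile("hdfs:///data/input_2gb.txt")
print(rdd.getNumPartitions())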
As far as optimal settings go, there are tradeoffs between cores, memory, and the number of executors, but there's no magic number for a particular workload, and you'll always be limited by the smallest node in the cluster. Keep in mind that the memory of the Spark driver and of other processes on the OS should be accounted for when calculating sizes.
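As a starting point rather than a recipe, you could size executors so that one fits comfortably on the smaller 8 GB machine, leaving headroom for the OS, the NodeManager, and the driver; a hedged sketch with example values to tune:

from pyspark.sql import SparkSession

# Example values only: an executor this size fits on the 8 GB node with room to spare
spark = (SparkSession.builder
         .appName("SimpleApplication")
         .master("yarn")
         .config("spark.executor.memory", "3g")
         .config("spark.executor.cores", "2")
         .config("spark.executor.instances", "2")
         .config("spark.driver.memory", "2g")
         .getOrCreate())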
I am running Sparkling Water over 36 Spark executors.
Due to YARN's scheduling, some executors get preempted and come back later.
Overall, there are 36 executors for the majority of the time, just not always.
So far, my experience is that as soon as 1 executor fails, the entire H2O instance halts, even if the missing executor comes back to life later.
I wonder: is this how Sparkling Water behaves? Or does some preemption-related setting need to be turned on?
Anyone have a clue about this?
[Summary]
What you are seeing is how Sparkling Water behaves.
[Details...]
Sparkling Water on YARN can run in two different ways:
the default way, where H2O nodes are embedded inside Spark executors and there is a single (Spark) YARN job,
the external H2O cluster way, where the Spark cluster and H2O cluster are separate YARN jobs (running in this mode requires more setup; if you were running in this way, you would know it)
H2O nodes do not support elastic cloud formation behavior. Which is to say, once an H2O cluster is formed, new nodes may not join the cluster (they are rejected) and existing nodes may not leave the cluster (the cluster becomes unusable).
As a result, YARN preemption must be disabled for the queue where H2O nodes are running. In the default way, it means the entire Spark job must run with YARN preemption disabled (and Spark dynamicAllocation disabled). For the external H2O cluster way, it means the H2O cluster must be run in a YARN queue with preemption disabled.
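On the Spark side, the relevant settings can be pinned when the session is created; a minimal sketch with example values (disabling YARN preemption itself is done in the scheduler/queue configuration, not from Spark):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("sparkling-water-job")
         .config("spark.dynamicAllocation.enabled", "false")  # keep the executor set fixed
         .config("spark.executor.instances", "4")             # example value
         .getOrCreate())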
Other pieces of information that might help:
If you are just starting on a new problem with Sparkling Water (or H2O in general), prefer a small number of large memory nodes to a large number of small memory nodes; fewer things can go wrong that way,
To be more specific, if you are trying to run with 36 executors that each have 1 GB of executor memory, that's a really awful configuration; start with 4 executors x 10 GB instead,
In general you don't want to start Sparkling Water with less than 5 GB per executor at all, and more memory is better,
If running in the default way, don't set the number of executor cores too low; machine learning is hungry for CPU (a configuration sketch follows this list).
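Putting the sizing advice above together, a hedged PySparkling sketch with example values (the exact H2OContext API varies slightly across Sparkling Water versions):

from pyspark.sql import SparkSession
from pysparkling import H2OContext

spark = (SparkSession.builder
         .appName("sparkling-water-sizing")
         .config("spark.executor.instances", "4")    # a few large executors rather than many small ones
         .config("spark.executor.memory", "10g")     # well above the ~5 GB floor suggested above
         .config("spark.executor.cores", "4")        # example value; ML is CPU-hungry
         .getOrCreate())

hc = H2OContext.getOrCreate(spark)   # forms the H2O cloud inside the Spark executors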
I start the sparkling-shell with the following command.
./bin/sparkling-shell --num-executors 4 --executor-memory 4g --master yarn-client
I only ever get two executors. Is this an H2O problem, a YARN problem, or a Spark problem?
Mike
There can be multiple reasons for this behaviour.
YARN can only give you executors based on the available resources (memory, vcores). If you ask for more than you have resources for, it will give you the maximum it can.
It can also be the case that you have dynamic allocation enabled. This means that Spark will create new executors when they are needed.
In order to solve some technicalities in Sparkling Water, we need to discover all available executors at the start of the application by creating an artificial computation and trying to utilise the whole cluster. This might give you a lower number of executors as well.
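As a debugging aid, you can check how many executors actually registered before H2O tries to form its cloud; the snippet below uses a common but unofficial trick that goes through Spark's internal JVM bridge, so treat it as best-effort:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# getExecutorMemoryStatus() also counts the driver, hence the -1
num_executors = sc._jsc.sc().getExecutorMemoryStatus().size() - 1
print(num_executors)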
I would suggest looking at https://github.com/h2oai/sparkling-water/blob/master/doc/tutorials/backends.rst, where you can read more about the paragraph above and how the issue can be solved using the so-called external Sparkling Water backend.
You can also have a look at https://github.com/h2oai/sparkling-water/blob/master/doc/configuration/internal_backend_tuning.rst, the Sparkling Water guide for tuning the configuration.
Kuba
I got over the problem by changing the following values in Cloudera Manager:
Setting                                    Value
yarn.scheduler.maximum-allocation-vcores   8
yarn.nodemanager.resource.cpu-vcores       4
yarn.scheduler.maximum-allocation-mb       16384 (16 GB)
I have a cluster running Spark with 4 servers each having 8 cores. Somehow the master is not detecting all available cores. It is using 18 out of 32 cores:
I have not set anything relating to the no. of cores in any spark conf file (at least not that I am aware of)
I am positive each cluster member has the same no. of cores (8):
Is there a way to make Spark detect/use the other cores as well?
I found the cause, but it is still somewhat unclear:
One node that was only contributing 1 out of 8 cores was having this setting turned on in $SPARK_HOME/conf/spark-env.sh:
SPARK_WORKER_CORES=1
Commenting it out did the trick for that node. Spark will grab all cores by default (the same goes for memory).
But... on the other node with only 1 core this setting was not activated, yet Spark still did not grab 8 cores until I specifically told it to:
SPARK_WORKER_CORES=8
But at least it is grabbing all resources now.
I have a very small new EMR cluster to play around with and I'm trying to limit the number of concurrent mappers per node to 2. I tried this by tweaking the default cpu-vcores down to 2.
Formula used:
min((yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb),
(yarn.nodemanager.resource.cpu-vcores / mapreduce.map.cpu.vcores))
Cluster configuration:
AMI version: 3.3.1
Hadoop distribution: Amazon 2.4.0
Core: 4 m1.large
Job Configuration:
yarn.nodemanager.resource.memory-mb: 5120
mapreduce.map.memory.mb: 768
yarn.nodemanager.resource.cpu-vcores: 2
mapreduce.map.cpu.vcores: 1
As a result, I am currently seeing 22 mappers running at the same time. Besides being wrong according to the formula, this does not make sense at all given that I have 4 cores. Any thoughts?
I have never seen the second part of the formula (the one with vcores) take effect on the small dedicated clusters I have worked on (although it should have, according to the formula). I have also read that YARN does not take CPU cores into account when allocating resources, i.e. it only allocates based on memory requirements (this is the behaviour of the CapacityScheduler's default DefaultResourceCalculator; vcores are only enforced with the DominantResourceCalculator).
As for the memory calculation: yarn.nodemanager.resource.memory-mb is a per-node setting, but dashboards often give you cluster-wide numbers, so before you divide yarn.nodemanager.resource.memory-mb by mapreduce.map.memory.mb, multiply it by the number of nodes in your cluster, i.e.
(yarn.nodemanager.resource.memory-mb*number_of_nodes_in_cluster) / mapreduce.map.memory.mb
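Plugging the numbers from the question into both forms (a rough check, assuming 4 core nodes and that only the memory term is actually enforced):

# Per-node intent: min(5120 / 768, 2 / 1) = min(6, 2) = 2 mappers per node
nodes = 4
node_mem_mb = 5120           # yarn.nodemanager.resource.memory-mb (per node)
map_mem_mb = 768             # mapreduce.map.memory.mb

cluster_wide_mappers = (node_mem_mb * nodes) // map_mem_mb
print(cluster_wide_mappers)  # 26, before subtracting the ApplicationMaster's container

That is much closer to the observed 22 than the per-node intent of 2, which is consistent with memory-only scheduling.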