I can run several jobs (MapReduce, Hive) in one queue. But if I run a Spark/Spark Streaming job, every job submitted after it stays in the ACCEPTED state and never reaches RUNNING. Only after I kill the Spark job do the other jobs start RUNNING.
I tried creating separate queues for Spark and non-Spark jobs (see the sketch below for what I mean); that works as expected, but it is not what I want.
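For reference, a minimal capacity-scheduler.xml sketch of what I mean by separate queues; the queue names and the 50/50 split are placeholders:
<!-- capacity-scheduler.xml: two child queues under root (names and capacities are examples) -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>spark,batch</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.spark.capacity</name>
  <value>50</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.batch.capacity</name>
  <value>50</value>
</property>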
My questions:
1. Is this a YARN or a Spark config issue?
2. What is the right config to solve that issue?
Any help will be appreciated, thanks.
Related
I'm running a large Spark job (about 20 TB read in and stored to HDFS) alongside Hadoop. The Spark console shows the job as complete, but Hadoop still thinks the job is running, both in the console and in the logs, which are still spitting out 'running'.
How long should I be waiting until I should be worried?
You can try to stop the Spark context cleanly. If you haven't closed it, add a SparkContext stop call at the end of the job. For example
sc.stop()
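A slightly fuller PySpark sketch of the same idea (the app name and job body below are placeholders); wrapping the job in try/finally ensures the context is stopped, and the YARN application marked finished, even if the job throws:
from pyspark import SparkContext

sc = SparkContext(appName="my-20tb-job")  # placeholder app name
try:
    # ... read the ~20 TB input, process it, write results to HDFS ...
    pass
finally:
    sc.stop()  # tells YARN the application is done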
I am trying to run PySpark on YARN with Oozie. After submitting the workflow, there are 2 jobs in the Hadoop job queue: one is the Oozie launcher job, with application type "MAPREDUCE", and the other, triggered by the first, has application type "SPARK". While the first job is running, the second remains in ACCEPTED status. Here is the problem: the first job is waiting for the second to finish before it can proceed, while the second is waiting for the first to finish before it can run, so I may be stuck in a deadlock. How can I get out of this trouble? Is there any way the Hadoop job with application type "MAPREDUCE" can run in parallel with jobs of other application types?
Any advice is appreciated, thanks!
Please check the value of the property below in the YARN scheduler configuration. I guess you need to increase it to something like 0.9 or so.
Property: yarn.scheduler.capacity.maximum-am-resource-percent
You would need to restart YARN, MapReduce and Oozie after updating the property.
More info: Setting Application Limits.
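A minimal sketch of the change in capacity-scheduler.xml, assuming the CapacityScheduler and the 0.9 value suggested above:
<!-- capacity-scheduler.xml: let ApplicationMasters use up to 90% of cluster resources -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.9</value>
</property>
The deadlock happens because the default (0.1) caps the share of cluster resources that ApplicationMasters may hold, so the Oozie launcher AM plus the Spark AM together can exceed the limit and the second one never starts.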
My Hadoop job has been running for over 10 hours, but since I put it in the wrong queue, its containers keep getting killed by the scheduler.
How do I change the queue of currently running hadoop job without restarting it?
Thank you
If running YARN, you can move the current job to another queue with
yarn application -movetoqueue <app_id> -queue <queue_name>
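For example (the application ID and queue name below are placeholders), look up the ID first and then move the job:
# List running applications to find the application ID
yarn application -list -appStates RUNNING
# Move it to the intended queue
yarn application -movetoqueue application_1234567890123_0001 -queue long_running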
I have some long-running but not very important jobs, some cleaning/indexing jobs. While one of them is running it blocks the rest: all new jobs stay in the queue in a pending state.
What I would like is to reduce the number of running mappers for a specific job. Can I do that?
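One knob that may help, a sketch assuming Hadoop 2.7+ and a job driver that uses ToolRunner/GenericOptionsParser (the jar, class, paths and limit below are placeholders): cap the number of simultaneously running map tasks for that one job.
# Limit this job to at most 5 concurrent map tasks (value is an example)
hadoop jar my-cleaning-job.jar MyCleaningJob \
  -D mapreduce.job.running.map.limit=5 \
  /input /output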
Running Spark 1.3.1 on YARN on EMR. When I run the spark-shell, everything looks normal until I start seeing messages like INFO yarn.Client: Application report for application_1439330624449_1561 (state: ACCEPTED). These messages are generated endlessly, once per second. Meanwhile, I am unable to use the Spark shell.
I don't understand why this is happening.
Seeing (near) endless Accepted messages from YARN has always been a sure sign that there were not enough cluster resources to allocate for my Spark jobs / shell. YARN will continue trying to schedule your Spark application, but will eventually time-out if not enough resources become available in a certain amount of time.
Are you providing any command-line options to spark-shell that override the defaults? When I ask for too many executors/cores/memory, YARN will accept my request but never transition to a RUNNING ApplicationMaster.
Try running a spark-shell with no options (other than perhaps --master yarn) and see if it gets past Accepted.
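A quick way to test this (the resource values below are placeholder examples): start with the bare defaults, and only then add modest explicit resources and grow from there.
# Bare minimum: let YARN/Spark defaults pick the resources
spark-shell --master yarn
# If that starts, retry with small explicit resources
spark-shell --master yarn --num-executors 2 --executor-cores 1 --executor-memory 1g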
Realized there were a couple of streaming jobs I had killed in the terminal, but I guess they were somehow still running. I was able to find these in the UI showing all running applications on YARN (I wasn't able to execute Hive queries either). Once I killed the jobs using the command below, the spark-shell started as usual.
yarn application -kill application_1428487296152_25597
I guess that YARN does not have enough resources to run the jobs.
Please check
https://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html
for calculating how many resources you can provide to YARN.
Please check the number of cores and the amount of RAM, which are controlled by the following properties:
yarn.nodemanager.resource.cpu-vcores
yarn.nodemanager.resource.memory-mb
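A minimal yarn-site.xml sketch with example values (the numbers below are placeholders; size them per the tuning guide above):
<!-- yarn-site.xml: resources each NodeManager offers to containers -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>  <!-- example: 8 vcores per node -->
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>16384</value>  <!-- example: 16 GB per node -->
</property>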