Spark 2.0 Thrift server not starting in YARN mode - hadoop

I started the Spark 2.0 Thrift server in a local environment and it works fine, but when I tried it in the cluster environment the following exception was thrown:
16/06/02 10:21:06 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended!
It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:148)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:502)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2246)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:749)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:57)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:81)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:724)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/06/02 10:21:06 INFO util.ShutdownHookManager: Shutdown hook called
Checking the application master logs shows:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
Spark-defaults configuration:
spark.executor.memory 2g
spark.driver.memory 4g
spark.executor.cores 1
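On Spark 2.x, "Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher" in the AM log usually means the YARN containers cannot see the Spark jars. A common remedy (a sketch, not taken from the question; the HDFS path is an assumption) is to publish the jars once and point spark.yarn.jars at them in spark-defaults.conf:

# Assumption: jars uploaded beforehand, e.g. with
#   hdfs dfs -mkdir -p /spark/jars && hdfs dfs -put $SPARK_HOME/jars/* /spark/jars/
spark.yarn.jars hdfs:///spark/jars/*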

Related

Install Puppet Server 7 on AWS ARM instance

We've encountered a Ruby error during Puppet Server 7 installation on AWS Graviton (ARM) instances with Ubuntu 18.04 (ami-0c925af1500feb25d) and 20.04 (ami-08b6fc871ad49ff41).
Puppet was installed according to the official manual, without any additional configuration.
On amd64 instances, by contrast, it works properly.
During puppetserver.service startup we get the following errors:
Syslog:
Starting puppetserver Service...
Puppet::Error: Cannot determine basic system flavour
<main> at /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<module:Puppet> at /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:98
<main> at /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:42
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<main> at uri:classloader:/puppetserver-lib/puppet/server.rb:1
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<main> at uri:classloader:/puppetserver-lib/puppet/server/master.rb:1
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<main> at <script>:1
Execution error (RuntimeError) at RUBY/<main> (/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19).
(Error) Cannot determine basic system flavour
Full report at:
/tmp/clojure-2104485397975789536.edn
Background process 1437 exited before start had completed
puppetserver.service: Control process exited, code=exited status=1
puppetserver.service: Failed with result 'exit-code'.
Failed to start puppetserver Service.
puppetserver.service: Service hold-off time over, scheduling restart.
puppetserver.service: Scheduled restart job, restart counter is at 5689.
Stopped puppetserver Service.
puppetserver.log:
Logging initialized #5753ms to org.eclipse.jetty.util.log.Slf4jLog
Initializing Scheduler Service
Using default implementation for ThreadExecutor
Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
Quartz Scheduler v.2.3.2 created.
RAMJobStore initialized.
Scheduler meta-data: Quartz Scheduler (v2.3.2) '536c68b8-9037-4bef-bca8-d8c28bd9ba6e' with instanceId 'NON_CLUSTERED'
Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
Quartz scheduler '536c68b8-9037-4bef-bca8-d8c28bd9ba6e' initialized from an externally provided properties instance.
Quartz scheduler version: 2.3.2
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED started.
Initializing web server(s).
Registering status callback function for service 'puppet-profiler', version 7.0.3
Initializing the JRuby service
Initializing the JRuby service
Creating JRubyInstance with id 1.
Registering status callback function for service 'jruby-metrics', version 7.0.3
No code-id-command set for versioned-code-service. Code-id will be nil.
No code-content-command set for versioned-code-service. Attempting to fetch code content will fail.
Error during service init!!!
java.lang.IllegalStateException: Unable to borrow JRubyInstance from pool
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34136$borrow_from_pool_BANG__STAR___34141$fn__34142.invoke(jruby_internal.clj:313)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34136$borrow_from_pool_BANG__STAR___34141.invoke(jruby_internal.clj:300)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34183$borrow_from_pool_with_timeout__34188$fn__34189.invoke(jruby_internal.clj:348)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34183$borrow_from_pool_with_timeout__34188.invoke(jruby_internal.clj:337)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34947.invokeStatic(instance_pool.clj:48)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34947.invoke(instance_pool.clj:10)
at puppetlabs.services.protocols.jruby_pool$fn__34772$G__34685__34779.invoke(jruby_pool.clj:3)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35861$borrow_from_pool_with_timeout__35866$fn__35867.invoke(jruby_core.clj:222)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35861$borrow_from_pool_with_timeout__35866.invoke(jruby_core.clj:209)
at puppetlabs.services.config.puppet_server_config_core$fn__43588$get_puppet_config__43593$fn__43594$fn__43595.invoke(puppet_server_config_core.clj:107)
at puppetlabs.services.config.puppet_server_config_core$fn__43588$get_puppet_config__43593$fn__43594.invoke(puppet_server_config_core.clj:107)
at puppetlabs.services.config.puppet_server_config_core$fn__43588$get_puppet_config__43593.invoke(puppet_server_config_core.clj:102)
at puppetlabs.services.config.puppet_server_config_service$reify__43623$service_fnk__5000__auto___positional$reify__43634.init(puppet_server_config_service.clj:25)
at puppetlabs.trapperkeeper.services$fn__4824$G__4816__4827.invoke(services.clj:9)
at puppetlabs.trapperkeeper.services$fn__4824$G__4815__4831.invoke(services.clj:9)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378$fn__14379.invoke(internal.clj:196)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378.invoke(internal.clj:179)
at puppetlabs.trapperkeeper.internal$fn__14400$run_lifecycle_fns__14405$fn__14406.invoke(internal.clj:229)
at puppetlabs.trapperkeeper.internal$fn__14400$run_lifecycle_fns__14405.invoke(internal.clj:206)
at puppetlabs.trapperkeeper.internal$fn__15015$build_app_STAR___15024$fn$reify__15036.init(internal.clj:602)
at puppetlabs.trapperkeeper.internal$fn__15063$boot_services_for_app_STAR__STAR___15070$fn__15071$fn__15073.invoke(internal.clj:630)
at puppetlabs.trapperkeeper.internal$fn__15063$boot_services_for_app_STAR__STAR___15070$fn__15071.invoke(internal.clj:629)
at puppetlabs.trapperkeeper.internal$fn__15063$boot_services_for_app_STAR__STAR___15070.invoke(internal.clj:623)
at clojure.core$partial$fn__5841.invoke(core.clj:2630)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632$fn__14635.invoke(internal.clj:249)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632.invoke(internal.clj:249)
at clojure.core.async.impl.ioc_macros$run_state_machine.invokeStatic(ioc_macros.clj:973)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:972)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invokeStatic(ioc_macros.clj:977)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:975)
at clojure.core.async$ioc_alts_BANG_$fn__11818.invoke(async.clj:384)
at clojure.core.async$do_alts$fn__11758$fn__11761.invoke(async.clj:253)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__6422$fn__6423.invoke(channels.clj:95)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jruby.embed.EvalFailedException: (Error) Cannot determine basic system flavour
at org.jruby.embed.internal.EmbedEvalUnitImpl.run(EmbedEvalUnitImpl.java:131)
at org.jruby.embed.ScriptingContainer.runUnit(ScriptingContainer.java:1295)
at org.jruby.embed.ScriptingContainer.runScriptlet(ScriptingContainer.java:1288)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:167)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:102)
at puppetlabs.services.jruby.jruby_puppet_core$fn__36162$get_initialize_pool_instance_fn__36167$fn__36168$fn__36169.invoke(jruby_puppet_core.clj:118)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945$fn__33948.invoke(jruby_internal.clj:256)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945.invoke(jruby_internal.clj:225)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359$fn__34363.invoke(jruby_agents.clj:52)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359.invoke(jruby_agents.clj:47)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386$fn__34390.invoke(jruby_agents.clj:76)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386.invoke(jruby_agents.clj:61)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34943$fn__34944.invoke(instance_pool.clj:16)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:403)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:388)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$fn__14866$shutdown_service__14871$fn$reify__14873$service_fnk__5000__auto___positional$reify__14878.shutdown_on_error(internal.clj:448)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14792__14804.invoke(internal.clj:411)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14791__14813.invoke(internal.clj:411)
at clojure.core$partial$fn__5839.invoke(core.clj:2625)
at clojure.core$partial$fn__5839.invoke(core.clj:2624)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34328$send_agent__34333$fn__34334$agent_fn__34335.invoke(jruby_agents.clj:41)
at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2033)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.lang.Agent$Action.doRun(Agent.java:114)
at clojure.lang.Agent$Action.run(Agent.java:163)
... 3 common frames omitted
Caused by: org.jruby.exceptions.RuntimeError: (Error) Cannot determine basic system flavour
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.rubygems.core_ext.kernel_require.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<module:Puppet>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:98)
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:42)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server/master.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(<script>:1)
shutdown-on-error triggered because of exception!
java.lang.IllegalStateException: There was a problem adding a JRubyInstance to the pool.
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359$fn__34363.invoke(jruby_agents.clj:58)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359.invoke(jruby_agents.clj:47)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386$fn__34390.invoke(jruby_agents.clj:76)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386.invoke(jruby_agents.clj:61)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34943$fn__34944.invoke(instance_pool.clj:16)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:403)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:388)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$fn__14866$shutdown_service__14871$fn$reify__14873$service_fnk__5000__auto___positional$reify__14878.shutdown_on_error(internal.clj:448)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14792__14804.invoke(internal.clj:411)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14791__14813.invoke(internal.clj:411)
at clojure.core$partial$fn__5839.invoke(core.clj:2625)
at clojure.core$partial$fn__5839.invoke(core.clj:2624)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34328$send_agent__34333$fn__34334$agent_fn__34335.invoke(jruby_agents.clj:41)
at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2033)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.lang.Agent$Action.doRun(Agent.java:114)
at clojure.lang.Agent$Action.run(Agent.java:163)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jruby.embed.EvalFailedException: (Error) Cannot determine basic system flavour
at org.jruby.embed.internal.EmbedEvalUnitImpl.run(EmbedEvalUnitImpl.java:131)
at org.jruby.embed.ScriptingContainer.runUnit(ScriptingContainer.java:1295)
at org.jruby.embed.ScriptingContainer.runScriptlet(ScriptingContainer.java:1288)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:167)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:102)
at puppetlabs.services.jruby.jruby_puppet_core$fn__36162$get_initialize_pool_instance_fn__36167$fn__36168$fn__36169.invoke(jruby_puppet_core.clj:118)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945$fn__33948.invoke(jruby_internal.clj:256)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945.invoke(jruby_internal.clj:225)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359$fn__34363.invoke(jruby_agents.clj:52)
... 22 common frames omitted
Caused by: org.jruby.exceptions.RuntimeError: (Error) Cannot determine basic system flavour
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.rubygems.core_ext.kernel_require.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<module:Puppet>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:98)
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:42)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server/master.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(<script>:1)
Beginning shutdown sequence
JRuby Metrics Service: stopping metrics sampler job
JRuby Metrics Service: stopped metrics sampler job
Draining JRuby pool.
Encountered error during shutdown sequence
java.lang.InterruptedException: Lock can't be granted because a pill has been inserted
at com.puppetlabs.jruby_utils.pool.JRubyPool.lockWithTimeout(JRubyPool.java:368)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:167)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:102)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34462$borrow_all_jrubies__34467$fn__34468$fn__34469.invoke(jruby_agents.clj:128)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34462$borrow_all_jrubies__34467$fn__34468.invoke(jruby_agents.clj:127)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34462$borrow_all_jrubies__34467.invoke(jruby_agents.clj:119)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34542$drain_and_refill_pool_BANG___34551$fn__34554.invoke(jruby_agents.clj:190)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34542$drain_and_refill_pool_BANG___34551.invoke(jruby_agents.clj:172)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34583$flush_pool_for_shutdown_BANG___34588$fn__34589.invoke(jruby_agents.clj:211)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34583$flush_pool_for_shutdown_BANG___34588.invoke(jruby_agents.clj:199)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34959.invokeStatic(instance_pool.clj:20)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34959.invoke(instance_pool.clj:10)
at puppetlabs.services.protocols.jruby_pool$fn__34737$G__34697__34742.invoke(jruby_pool.clj:3)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35936$flush_pool_for_shutdown_BANG___35941$fn__35942.invoke(jruby_core.clj:250)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35936$flush_pool_for_shutdown_BANG___35941.invoke(jruby_core.clj:245)
at puppetlabs.services.jruby.jruby_puppet_service$reify__36714$service_fnk__5000__auto___positional$reify__36728.stop(jruby_puppet_service.clj:50)
at puppetlabs.trapperkeeper.services$fn__4850$G__4820__4853.invoke(services.clj:9)
at puppetlabs.trapperkeeper.services$fn__4850$G__4819__4857.invoke(services.clj:9)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378$fn__14379.invoke(internal.clj:196)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378.invoke(internal.clj:179)
at puppetlabs.trapperkeeper.internal$fn__14923$shutdown_BANG___14928$fn__14929$shutdown_fn__14931$fn__14946.invoke(internal.clj:459)
at puppetlabs.trapperkeeper.internal$fn__14923$shutdown_BANG___14928$fn__14929$shutdown_fn__14931.invoke(internal.clj:458)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632$fn__14635$fn__14649.invoke(internal.clj:274)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632$fn__14635.invoke(internal.clj:258)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632.invoke(internal.clj:249)
at clojure.core.async.impl.ioc_macros$run_state_machine.invokeStatic(ioc_macros.clj:973)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:972)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invokeStatic(ioc_macros.clj:977)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:975)
at clojure.core.async$ioc_alts_BANG_$fn__11818.invoke(async.clj:384)
at clojure.core.async$do_alts$fn__11758$fn__11761.invoke(async.clj:253)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__6438.invoke(channels.clj:135)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shutting down web server(s).
Shutting down Scheduler Service
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED shutting down.
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED paused.
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED shutdown complete.
Scheduler Service shutdown complete.
Finished shutdown sequence
How can I fix this issue?
Thanks
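For what it's worth, base.rb line 19 raises "Cannot determine basic system flavour" when neither the posix nor the microsoft_windows Puppet feature can be detected, and on JRuby the posix check relies on FFI-backed standard-library pieces, so missing aarch64 native FFI support is a plausible culprit. A diagnostic sketch using Puppet Server's bundled JRuby (the puppetserver ruby subcommand exists; the interpretation of a failure here is an assumption):

# Run the requires that the posix feature depends on inside the same JRuby
# that puppetserver embeds; an exception here points at missing native
# (aarch64) support rather than at the Puppet code itself.
puppetserver ruby -e "require 'etc'; require 'syslog'; puts 'posix prerequisites OK'"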

Spark Thrift Service not starting in Hadoop with Azure Storage Blob configuration

We have created a high-availability Hadoop cluster with Azure Blob Storage as the default file system instead of HDFS, following https://hadoop.apache.org/docs/stable/hadoop-azure/index.html
The Hive Thrift service started successfully, but the Spark Thrift service did not.
I can connect spark-shell to the blob store by referencing the hadoop-azure.jar file, but I cannot start the Thrift service.
Command used to start the Spark Thrift server:
spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --master yarn
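For reference, Spark also ships a wrapper script that starts the Thrift server and accepts the same spark-submit options; under a standard install this is equivalent:

$SPARK_HOME/sbin/start-thriftserver.sh --master yarn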
The error details are below:
17/04/26 10:19:32 INFO metastore: Connected to metastore.
Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:47)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:81)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
... 22 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
... 27 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
... 35 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270)
at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:65)
... 40 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.fs.azure.AzureException: java.util.NoSuchElementException: An error occurred while enumerating the result, check the original exception for details.
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:192)
... 48 more
Caused by: org.apache.hadoop.fs.azure.AzureException: java.util.NoSuchElementException: An error occurred while enumerating the result, check the original exception for details.
at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.retrieveMetadata(AzureNativeFileSystemStore.java:1930)
at org.apache.hadoop.fs.azure.NativeAzureFileSystem.getFileStatus(NativeAzureFileSystem.java:1592)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:596)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
... 49 more
Caused by: java.util.NoSuchElementException: An error occurred while enumerating the result, check the original exception for details.
at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:113)
at org.apache.hadoop.fs.azure.StorageInterfaceImpl$WrappingIterator.hasNext(StorageInterfaceImpl.java:128)
at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.retrieveMetadata(AzureNativeFileSystemStore.java:1909)
... 54 more
Caused by: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure: OK
at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:178)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:273)
at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:109)
... 56 more
Caused by: java.lang.ClassCastException: org.apache.xerces.parsers.XIncludeAwareParserConfiguration cannot be cast to org.apache.xerces.xni.parser.XMLParserConfiguration
at org.apache.xerces.parsers.SAXParser.<init>(Unknown Source)
at org.apache.xerces.parsers.SAXParser.<init>(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.<init>(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl.<init>(Unknown Source)
at org.apache.xerces.jaxp.SAXParserFactoryImpl.newSAXParser(Unknown Source)
at com.microsoft.azure.storage.core.Utility.getSAXParser(Utility.java:546)
at com.microsoft.azure.storage.blob.BlobListHandler.getBlobList(BlobListHandler.java:72)
at com.microsoft.azure.storage.blob.CloudBlobContainer$6.postProcessResponse(CloudBlobContainer.java:1253)
at com.microsoft.azure.storage.blob.CloudBlobContainer$6.postProcessResponse(CloudBlobContainer.java:1217)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:148)
... 57 more
17/04/26 10:19:33 INFO SparkContext: Invoking stop() from shutdown hook
17/04/26 10:19:33 INFO SparkUI: Stopped Spark web UI at http://10.0.0.4:4040
17/04/26 10:19:33 INFO YarnClientSchedulerBackend: Interrupting monitor thread
17/04/26 10:19:33 INFO YarnClientSchedulerBackend: Shutting down all executors
17/04/26 10:19:33 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
17/04/26 10:19:33 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
17/04/26 10:19:33 INFO YarnClientSchedulerBackend: Stopped
17/04/26 10:19:33 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/04/26 10:19:33 INFO MemoryStore: MemoryStore cleared
17/04/26 10:19:33 INFO BlockManager: BlockManager stopped
17/04/26 10:19:33 INFO BlockManagerMaster: BlockManagerMaster stopped
17/04/26 10:19:33 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/04/26 10:19:33 INFO SparkContext: Successfully stopped SparkContext
17/04/26 10:19:33 INFO ShutdownHookManager: Shutdown hook called
17/04/26 10:19:33 INFO ShutdownHookManager: Deleting directory C:\Users\labuser\AppData\Local\Temp\2\spark-11c406ec-2c53-4042-b336-9d1164c3c6f9
17/04/26 10:19:33 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...
17/04/26 10:19:33 INFO MetricsSystemImpl: azure-file-system metrics system stopped.
17/04/26 10:19:33 INFO MetricsSystemImpl: azure-file-system metrics system shutdown complete.
Please help me resolve this issue. Any help would be greatly appreciated.
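Incidentally, the innermost ClassCastException (XIncludeAwareParserConfiguration cannot be cast to XMLParserConfiguration) is the classic sign of two conflicting Xerces builds on the classpath, which is what the Azure SDK's SAX parser setup trips over. A quick check one might run (paths are assumptions for a typical Spark-on-Hadoop install):

# List every Xerces jar visible to Spark and Hadoop; more than one
# version here usually explains the cast failure.
find $SPARK_HOME/jars $HADOOP_HOME/share/hadoop -name 'xerces*.jar'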

Apache Spark running spark-shell on YARN error

I downloaded spark-2.1.0-bin-hadoop2.7.tgz from http://spark.apache.org/downloads.html. I started Hadoop HDFS and YARN with $ start-dfs.sh and $ start-yarn.sh. But running $ spark-shell --master yarn --deploy-mode client gives me the error below:
$ spark-shell --master yarn --deploy-mode client
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/04/08 23:04:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/08 23:04:54 WARN util.Utils: Your hostname, Pandora resolves to a loopback address: 127.0.1.1; using 192.168.1.11 instead (on interface wlp3s0)
17/04/08 23:04:54 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/04/08 23:04:56 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
17/04/08 23:05:15 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
17/04/08 23:05:15 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:614)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:169)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:567)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
at $line3.$read$$iw$$iw.<init>(<console>:15)
at $line3.$read$$iw.<init>(<console>:42)
at $line3.$read.<init>(<console>:44)
at $line3.$read$.<init>(<console>:48)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.$print$lzycompute(<console>:7)
at $line3.$eval$.$print(<console>:6)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:68)
at org.apache.spark.repl.Main$.main(Main.scala:51)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/04/08 23:05:15 ERROR client.TransportClient: Failed to send RPC 7918328175210939600 to /192.168.1.11:56186: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
17/04/08 23:05:15 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 7918328175210939600 to /192.168.1.11:56186: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:488)
at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:438)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:408)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:455)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
17/04/08 23:05:15 ERROR util.Utils: Uncaught exception in thread Yarn application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:512)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:93)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:151)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:467)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1588)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1826)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1283)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1825)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
Caused by: java.io.IOException: Failed to send RPC 7918328175210939600 to /192.168.1.11:56186: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:488)
at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:438)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:408)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:455)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:614)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:169)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:567)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
... 47 elided
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_121)
Type in expressions to have them evaluated.
Type :help for more information.
YARN detects that Spark is running on it, but the error causes Spark to exit with an undefined status.
I found the solution in another Stack Overflow question. It was not about configuring Apache Spark but about configuring Hadoop YARN:
Running yarn with spark not working with Java 8
Make sure your yarn-site.xml, from your Hadoop configuration folder, has these properties:
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
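After editing yarn-site.xml, restart YARN so the NodeManagers pick up the change; a sketch assuming a default Hadoop layout:

# Restart YARN so the NodeManagers reload yarn-site.xml
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh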
I ran into the same problem. When I checked the NodeManager log, I found this warning:
2017-10-26 19:43:21,787 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=3820,containerID=container_1509016963775_0001_02_000001] is running beyond virtual memory limits. Current usage: 339.0 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
So I allowed more virtual memory by raising yarn.nodemanager.vmem-pmem-ratio in yarn-site.xml (its default value is 2.1), as sketched below. Then it worked.
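In yarn-site.xml the change looks like the entry below (the value 4 is illustrative; the answer only says the 2.1 default was raised):

<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
</property>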

How can I find the reason for ClosedChannelExceptions with spark-shell in YARN client mode?

I have been trying to run spark-shell in YARN client mode, but I am getting a lot of ClosedChannelException errors. I am using the Spark 2.0.0 build for Hadoop 2.6.
Here are the exceptions:
$ spark-2.0.0-bin-hadoop2.6/bin/spark-shell --master yarn --deploy-mode client
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/09/13 14:12:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/13 14:12:38 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/09/13 14:12:55 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/09/13 14:12:55 ERROR client.TransportClient: Failed to send RPC 7920194824462016141 to /172.27.1.63:41034: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/09/13 14:12:55 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:581)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:549)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2256)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
at $line3.$read$$iw$$iw.<init>(<console>:15)
at $line3.$read$$iw.<init>(<console>:31)
at $line3.$read.<init>(<console>:33)
at $line3.$read$.<init>(<console>:37)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.$print$lzycompute(<console>:7)
at $line3.$eval$.$print(<console>:6)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:94)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:68)
at org.apache.spark.repl.Main$.main(Main.scala:51)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/09/13 14:12:55 WARN netty.NettyRpcEndpointRef: Error sending message [message = RequestExecutors(0,0,Map())] in 1 attempts
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply$mcV$sp(YarnSchedulerBackend.scala:271)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply(YarnSchedulerBackend.scala:271)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply(YarnSchedulerBackend.scala:271)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to send RPC 7920194824462016141 to /172.27.1.63:41034: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
Caused by: java.nio.channels.ClosedChannelException
java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:581)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:549)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2256)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
... 47 elided
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
      /_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
scala> 16/09/13 14:12:59 ERROR client.TransportClient: Failed to send RPC 5797372389565173518 to /172.27.1.63:41034: java.nio.channels.ClosedChannelException
16/09/13 14:12:59 WARN netty.NettyRpcEndpointRef: Error sending message [message = RequestExecutors(0,0,Map())] in 2 attempts
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply$mcV$sp(YarnSchedulerBackend.scala:271)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply(YarnSchedulerBackend.scala:271)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$1.apply(YarnSchedulerBackend.scala:271)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to send RPC 5797372389565173518 to /172.27.1.63:41034: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
Caused by: java.nio.channels.ClosedChannelException
The reason is that the association with the YARN cluster may be lost due to Java 8's excessive virtual memory allocation issue: https://issues.apache.org/jira/browse/YARN-4714
You can force YARN to ignore this by setting the following properties in yarn-site.xml:
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
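Note that disabling both checks stops YARN from killing containers that exceed their memory request, so treat this as a workaround rather than a permanent fix. The change has to be made on every NodeManager host, and the NodeManagers restarted for it to take effect; on a plain Apache Hadoop 2.x install that is roughly the following (paths are assumptions, adjust for your distribution):
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager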
Thanks to simplejack; reference: Spark Pi Example in Cluster mode with Yarn: Association lost
Personally, I resolved this by increasing yarn.nodemanager.vmem-pmem-ratio, as suggested in the Jira ticket by Akira Ajisaka:
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>5</value>
</property>
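For context: this ratio caps a container's virtual memory at a multiple of its physical memory request, and the YARN default is 2.1. With a ratio of 5, a container requesting 2 GB of physical memory may use up to 2 GB × 5 = 10 GB of virtual address space, which accommodates Java 8's larger virtual memory reservations while keeping the physical memory check enabled.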
I have written another answer, which depends on whether you are using Spark in client or cluster mode.
In cluster mode it failed when I set the driver memory (--driver-memory) to 512m. The default setting requests 2 GB of AM resources (driver memory plus the overhead requested for the Application Master), which was enough.
In client mode the setting that mattered was spark.yarn.am.memory: by default it requests only 1024m for the AM, which is too little because Java 8 uses a lot of virtual memory. Anything above 1024m seemed to work.
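To make the difference concrete, here is a minimal sketch of how each setting is passed on the command line; the 2g values are illustrative, and com.example.App / app.jar are placeholders for your own application:
# cluster mode: the driver runs inside the AM container, sized by --driver-memory
spark-submit --master yarn --deploy-mode cluster --driver-memory 2g \
  --class com.example.App app.jar
# client mode: the driver runs locally, so size the bare AM instead
spark-submit --master yarn --deploy-mode client \
  --conf spark.yarn.am.memory=2g --class com.example.App app.jar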
The full answer is described here.
I got the ClosedChannelException with a different message:
20/01/07 06:31:54 ERROR server.TransportChannelHandler: Connection to ip-10-0-202-150.ec2.internal/10.0.202.150:37801 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
20/01/07 06:31:54 ERROR executor.Executor: Exception in task 556.0 in stage 1.0 (TID 556)
java.nio.channels.ClosedChannelException
...
Inside mapPartitions, I am batching the records and making an HTTP call to process them, which can take a few minutes. It may be that Spark assumes the partition is dead because it is not fetching more records for a long time, and hence we get this exception.
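For illustration, here is a minimal spark-shell sketch of that pattern (not the actual code; callHttpService is a hypothetical stand-in for the real request, and the batch size of 1000 is arbitrary):
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("batched-http-sketch"))
val rdd = sc.parallelize(Seq.fill(100000)("record"))

// Hypothetical stand-in for the slow external HTTP call.
def callHttpService(batch: Seq[String]): Seq[String] = batch

val processed = rdd.mapPartitions { records =>
  // One blocking HTTP call per 1000-record batch; while the call runs,
  // the task moves no data over the network, which is what can exceed
  // spark.network.timeout on long calls.
  records.grouped(1000).flatMap(batch => callHttpService(batch))
}
processed.count()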
Setting the network timeout to a longer value worked:
spark.network.timeout=500s
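The timeout can go in spark-defaults.conf or be passed per job; both forms below are equivalent (500s is simply the value that worked here, not a general recommendation):
# spark-defaults.conf
spark.network.timeout    500s
# or per submission
spark-submit --conf spark.network.timeout=500s ...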

Why do the Spark examples fail to spark-submit on EC2 with spark-ec2 scripts?

I downloaded spark-1.5.2 and set up a cluster on EC2 using the spark-ec2 doc here.
After that I went to examples/, ran mvn package, and packaged the examples into a jar.
Finally, I ran the submit with:
bin/spark-submit --class org.apache.spark.examples.JavaTC --master spark://url_here.eu-west-1.compute.amazonaws.com:7077 --deploy-mode cluster /home/aki/Projects/spark-1.5.2/examples/target/spark-examples_2.10-1.5.2.jar
Instead of the job running, I get this error:
WARN RestSubmissionClient: Unable to connect to server spark://url_here.eu-west-1.compute.amazonaws.com:7077.
Warning: Master endpoint spark://url_here.eu-west-1.compute.amazonaws.com:7077 was not a REST server. Falling back to legacy submission gateway instead.
15/12/22 17:36:07 WARN Utils: Your hostname, aki-linux resolves to a loopback address: 127.0.1.1; using 192.168.10.63 instead (on interface wlp4s0)
15/12/22 17:36:07 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/12/22 17:36:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:116)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.deploy.Client$.main(Client.scala:233)
at org.apache.spark.deploy.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
... 21 more
Are you sure the master URL really contains "url_here"?
spark://url_here.eu-west-1.compute.amazonaws.com:7077
Or maybe you are just obfuscating it for this post.
If you can connect to the Spark UI at
http://url_here.eu-west-1.compute.amazonaws.com:4040 (or, depending on your Spark version, http://url_here.eu-west-1.compute.amazonaws.com:8080), make sure you are using the URL shown on the Spark UI as the spark://...:7077 argument on the command line.
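As an additional sanity check from the machine you submit from, you can probe the two submission ports directly; in Spark 1.5 the REST gateway defaults to port 6066 and the legacy gateway to 7077, and both must be reachable through the EC2 security group (the hostname below is the placeholder from the question):
nc -zv url_here.eu-west-1.compute.amazonaws.com 7077
nc -zv url_here.eu-west-1.compute.amazonaws.com 6066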
