We’ve encountered a Ruby error after installing Puppet Server 7 on AWS Graviton (ARM) instances running Ubuntu 18.04 (ami-0c925af1500feb25d) and Ubuntu 20.04 (ami-08b6fc871ad49ff41).
Puppet was installed according to the official manual, without any additional configuration.
The same setup works properly on amd64 instances.
When puppetserver.service starts, we get the following errors:
Syslog:
Starting puppetserver Service...
Puppet::Error: Cannot determine basic system flavour
<main> at /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<module:Puppet> at /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:98
<main> at /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:42
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<main> at uri:classloader:/puppetserver-lib/puppet/server.rb:1
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<main> at uri:classloader:/puppetserver-lib/puppet/server/master.rb:1
require at org/jruby/RubyKernel.java:974
require at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54
<main> at <script>:1
Execution error (RuntimeError) at RUBY/<main> (/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19).
(Error) Cannot determine basic system flavour
Full report at:
/tmp/clojure-2104485397975789536.edn
Background process 1437 exited before start had completed
puppetserver.service: Control process exited, code=exited status=1
puppetserver.service: Failed with result 'exit-code'.
Failed to start puppetserver Service.
puppetserver.service: Service hold-off time over, scheduling restart.
puppetserver.service: Scheduled restart job, restart counter is at 5689.
Stopped puppetserver Service.
puppetserver.log:
Logging initialized #5753ms to org.eclipse.jetty.util.log.Slf4jLog
Initializing Scheduler Service
Using default implementation for ThreadExecutor
Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
Quartz Scheduler v.2.3.2 created.
RAMJobStore initialized.
Scheduler meta-data: Quartz Scheduler (v2.3.2) '536c68b8-9037-4bef-bca8-d8c28bd9ba6e' with instanceId 'NON_CLUSTERED'
Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
Quartz scheduler '536c68b8-9037-4bef-bca8-d8c28bd9ba6e' initialized from an externally provided properties instance.
Quartz scheduler version: 2.3.2
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED started.
Initializing web server(s).
Registering status callback function for service 'puppet-profiler', version 7.0.3
Initializing the JRuby service
Initializing the JRuby service
Creating JRubyInstance with id 1.
Registering status callback function for service 'jruby-metrics', version 7.0.3
No code-id-command set for versioned-code-service. Code-id will be nil.
No code-content-command set for versioned-code-service. Attempting to fetch code content will fail.
Error during service init!!!
java.lang.IllegalStateException: Unable to borrow JRubyInstance from pool
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34136$borrow_from_pool_BANG__STAR___34141$fn__34142.invoke(jruby_internal.clj:313)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34136$borrow_from_pool_BANG__STAR___34141.invoke(jruby_internal.clj:300)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34183$borrow_from_pool_with_timeout__34188$fn__34189.invoke(jruby_internal.clj:348)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__34183$borrow_from_pool_with_timeout__34188.invoke(jruby_internal.clj:337)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34947.invokeStatic(instance_pool.clj:48)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34947.invoke(instance_pool.clj:10)
at puppetlabs.services.protocols.jruby_pool$fn__34772$G__34685__34779.invoke(jruby_pool.clj:3)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35861$borrow_from_pool_with_timeout__35866$fn__35867.invoke(jruby_core.clj:222)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35861$borrow_from_pool_with_timeout__35866.invoke(jruby_core.clj:209)
at puppetlabs.services.config.puppet_server_config_core$fn__43588$get_puppet_config__43593$fn__43594$fn__43595.invoke(puppet_server_config_core.clj:107)
at puppetlabs.services.config.puppet_server_config_core$fn__43588$get_puppet_config__43593$fn__43594.invoke(puppet_server_config_core.clj:107)
at puppetlabs.services.config.puppet_server_config_core$fn__43588$get_puppet_config__43593.invoke(puppet_server_config_core.clj:102)
at puppetlabs.services.config.puppet_server_config_service$reify__43623$service_fnk__5000__auto___positional$reify__43634.init(puppet_server_config_service.clj:25)
at puppetlabs.trapperkeeper.services$fn__4824$G__4816__4827.invoke(services.clj:9)
at puppetlabs.trapperkeeper.services$fn__4824$G__4815__4831.invoke(services.clj:9)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378$fn__14379.invoke(internal.clj:196)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378.invoke(internal.clj:179)
at puppetlabs.trapperkeeper.internal$fn__14400$run_lifecycle_fns__14405$fn__14406.invoke(internal.clj:229)
at puppetlabs.trapperkeeper.internal$fn__14400$run_lifecycle_fns__14405.invoke(internal.clj:206)
at puppetlabs.trapperkeeper.internal$fn__15015$build_app_STAR___15024$fn$reify__15036.init(internal.clj:602)
at puppetlabs.trapperkeeper.internal$fn__15063$boot_services_for_app_STAR__STAR___15070$fn__15071$fn__15073.invoke(internal.clj:630)
at puppetlabs.trapperkeeper.internal$fn__15063$boot_services_for_app_STAR__STAR___15070$fn__15071.invoke(internal.clj:629)
at puppetlabs.trapperkeeper.internal$fn__15063$boot_services_for_app_STAR__STAR___15070.invoke(internal.clj:623)
at clojure.core$partial$fn__5841.invoke(core.clj:2630)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632$fn__14635.invoke(internal.clj:249)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632.invoke(internal.clj:249)
at clojure.core.async.impl.ioc_macros$run_state_machine.invokeStatic(ioc_macros.clj:973)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:972)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invokeStatic(ioc_macros.clj:977)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:975)
at clojure.core.async$ioc_alts_BANG_$fn__11818.invoke(async.clj:384)
at clojure.core.async$do_alts$fn__11758$fn__11761.invoke(async.clj:253)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__6422$fn__6423.invoke(channels.clj:95)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jruby.embed.EvalFailedException: (Error) Cannot determine basic system flavour
at org.jruby.embed.internal.EmbedEvalUnitImpl.run(EmbedEvalUnitImpl.java:131)
at org.jruby.embed.ScriptingContainer.runUnit(ScriptingContainer.java:1295)
at org.jruby.embed.ScriptingContainer.runScriptlet(ScriptingContainer.java:1288)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:167)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:102)
at puppetlabs.services.jruby.jruby_puppet_core$fn__36162$get_initialize_pool_instance_fn__36167$fn__36168$fn__36169.invoke(jruby_puppet_core.clj:118)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945$fn__33948.invoke(jruby_internal.clj:256)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945.invoke(jruby_internal.clj:225)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359$fn__34363.invoke(jruby_agents.clj:52)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359.invoke(jruby_agents.clj:47)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386$fn__34390.invoke(jruby_agents.clj:76)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386.invoke(jruby_agents.clj:61)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34943$fn__34944.invoke(instance_pool.clj:16)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:403)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:388)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$fn__14866$shutdown_service__14871$fn$reify__14873$service_fnk__5000__auto___positional$reify__14878.shutdown_on_error(internal.clj:448)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14792__14804.invoke(internal.clj:411)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14791__14813.invoke(internal.clj:411)
at clojure.core$partial$fn__5839.invoke(core.clj:2625)
at clojure.core$partial$fn__5839.invoke(core.clj:2624)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34328$send_agent__34333$fn__34334$agent_fn__34335.invoke(jruby_agents.clj:41)
at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2033)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.lang.Agent$Action.doRun(Agent.java:114)
at clojure.lang.Agent$Action.run(Agent.java:163)
... 3 common frames omitted
Caused by: org.jruby.exceptions.RuntimeError: (Error) Cannot determine basic system flavour
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.rubygems.core_ext.kernel_require.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<module:Puppet>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:98)
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:42)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server/master.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(<script>:1)
shutdown-on-error triggered because of exception!
java.lang.IllegalStateException: There was a problem adding a JRubyInstance to the pool.
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359$fn__34363.invoke(jruby_agents.clj:58)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359.invoke(jruby_agents.clj:47)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386$fn__34390.invoke(jruby_agents.clj:76)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34381$prime_pool_BANG___34386.invoke(jruby_agents.clj:61)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34943$fn__34944.invoke(instance_pool.clj:16)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:403)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invokeStatic(internal.clj:388)
at puppetlabs.trapperkeeper.internal$shutdown_on_error_STAR_.invoke(internal.clj:378)
at puppetlabs.trapperkeeper.internal$fn__14866$shutdown_service__14871$fn$reify__14873$service_fnk__5000__auto___positional$reify__14878.shutdown_on_error(internal.clj:448)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14792__14804.invoke(internal.clj:411)
at puppetlabs.trapperkeeper.internal$fn__14796$G__14791__14813.invoke(internal.clj:411)
at clojure.core$partial$fn__5839.invoke(core.clj:2625)
at clojure.core$partial$fn__5839.invoke(core.clj:2624)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34328$send_agent__34333$fn__34334$agent_fn__34335.invoke(jruby_agents.clj:41)
at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2033)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.lang.Agent$Action.doRun(Agent.java:114)
at clojure.lang.Agent$Action.run(Agent.java:163)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jruby.embed.EvalFailedException: (Error) Cannot determine basic system flavour
at org.jruby.embed.internal.EmbedEvalUnitImpl.run(EmbedEvalUnitImpl.java:131)
at org.jruby.embed.ScriptingContainer.runUnit(ScriptingContainer.java:1295)
at org.jruby.embed.ScriptingContainer.runScriptlet(ScriptingContainer.java:1288)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:167)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:102)
at puppetlabs.services.jruby.jruby_puppet_core$fn__36162$get_initialize_pool_instance_fn__36167$fn__36168$fn__36169.invoke(jruby_puppet_core.clj:118)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945$fn__33948.invoke(jruby_internal.clj:256)
at puppetlabs.services.jruby_pool_manager.impl.jruby_internal$fn__33936$create_pool_instance_BANG___33945.invoke(jruby_internal.clj:225)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34354$add_instance__34359$fn__34363.invoke(jruby_agents.clj:52)
... 22 common frames omitted
Caused by: org.jruby.exceptions.RuntimeError: (Error) Cannot determine basic system flavour
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/feature/base.rb:19)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.rubygems.core_ext.kernel_require.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<module:Puppet>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:98)
at RUBY.<main>(/opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:42)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(uri:classloader:/puppetserver-lib/puppet/server/master.rb:1)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:974)
at RUBY.require(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:54)
at RUBY.<main>(<script>:1)
Beginning shutdown sequence
JRuby Metrics Service: stopping metrics sampler job
JRuby Metrics Service: stopped metrics sampler job
Draining JRuby pool.
Encountered error during shutdown sequence
java.lang.InterruptedException: Lock can't be granted because a pill has been inserted
at com.puppetlabs.jruby_utils.pool.JRubyPool.lockWithTimeout(JRubyPool.java:368)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:167)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:102)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34462$borrow_all_jrubies__34467$fn__34468$fn__34469.invoke(jruby_agents.clj:128)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34462$borrow_all_jrubies__34467$fn__34468.invoke(jruby_agents.clj:127)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34462$borrow_all_jrubies__34467.invoke(jruby_agents.clj:119)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34542$drain_and_refill_pool_BANG___34551$fn__34554.invoke(jruby_agents.clj:190)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34542$drain_and_refill_pool_BANG___34551.invoke(jruby_agents.clj:172)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34583$flush_pool_for_shutdown_BANG___34588$fn__34589.invoke(jruby_agents.clj:211)
at puppetlabs.services.jruby_pool_manager.impl.jruby_agents$fn__34583$flush_pool_for_shutdown_BANG___34588.invoke(jruby_agents.clj:199)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34959.invokeStatic(instance_pool.clj:20)
at puppetlabs.services.jruby_pool_manager.impl.instance_pool$fn__34959.invoke(instance_pool.clj:10)
at puppetlabs.services.protocols.jruby_pool$fn__34737$G__34697__34742.invoke(jruby_pool.clj:3)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35936$flush_pool_for_shutdown_BANG___35941$fn__35942.invoke(jruby_core.clj:250)
at puppetlabs.services.jruby_pool_manager.jruby_core$fn__35936$flush_pool_for_shutdown_BANG___35941.invoke(jruby_core.clj:245)
at puppetlabs.services.jruby.jruby_puppet_service$reify__36714$service_fnk__5000__auto___positional$reify__36728.stop(jruby_puppet_service.clj:50)
at puppetlabs.trapperkeeper.services$fn__4850$G__4820__4853.invoke(services.clj:9)
at puppetlabs.trapperkeeper.services$fn__4850$G__4819__4857.invoke(services.clj:9)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378$fn__14379.invoke(internal.clj:196)
at puppetlabs.trapperkeeper.internal$fn__14371$run_lifecycle_fn_BANG___14378.invoke(internal.clj:179)
at puppetlabs.trapperkeeper.internal$fn__14923$shutdown_BANG___14928$fn__14929$shutdown_fn__14931$fn__14946.invoke(internal.clj:459)
at puppetlabs.trapperkeeper.internal$fn__14923$shutdown_BANG___14928$fn__14929$shutdown_fn__14931.invoke(internal.clj:458)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632$fn__14635$fn__14649.invoke(internal.clj:274)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632$fn__14635.invoke(internal.clj:258)
at puppetlabs.trapperkeeper.internal$fn__14445$initialize_lifecycle_worker__14456$fn__14457$fn__14607$state_machine__11603__auto____14632.invoke(internal.clj:249)
at clojure.core.async.impl.ioc_macros$run_state_machine.invokeStatic(ioc_macros.clj:973)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:972)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invokeStatic(ioc_macros.clj:977)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:975)
at clojure.core.async$ioc_alts_BANG_$fn__11818.invoke(async.clj:384)
at clojure.core.async$do_alts$fn__11758$fn__11761.invoke(async.clj:253)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__6438.invoke(channels.clj:135)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shutting down web server(s).
Shutting down Scheduler Service
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED shutting down.
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED paused.
Scheduler 536c68b8-9037-4bef-bca8-d8c28bd9ba6e_$_NON_CLUSTERED shutdown complete.
Scheduler Service shutdown complete.
Finished shutdown sequence
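The log above reports the same failure at several levels of nesting. As a rough aid for reading traces like this one, the following Python sketch pulls out the innermost "Caused by:" line, which here points at puppet/feature/base.rb. The function name innermost_cause and the shortened sample text are illustrative assumptions, not part of Puppet Server.

```python
from typing import Optional


def innermost_cause(log_text: str) -> Optional[str]:
    """Return the message of the last 'Caused by:' line in a Java stack trace.

    Java prints nested exceptions outermost first, so within a single trace
    the last 'Caused by:' line is the deepest cause.
    """
    cause = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            cause = stripped[len("Caused by:"):].strip()
    return cause


# Lines copied from the puppetserver.log excerpt above (stack frames omitted).
sample = """\
java.lang.IllegalStateException: Unable to borrow JRubyInstance from pool
Caused by: org.jruby.embed.EvalFailedException: (Error) Cannot determine basic system flavour
Caused by: org.jruby.exceptions.RuntimeError: (Error) Cannot determine basic system flavour
"""

print(innermost_cause(sample))
```

In this log the innermost cause is the Ruby RuntimeError raised while loading Puppet inside JRuby, so the Java and Clojure frames around it are wrappers rather than separate problems.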
How can we fix this issue?
Thanks
Related
We are implementing a POC that runs Oozie on AWS EMR. For security reasons I cannot post the workflow, but it is a simple example with a single rename action that renames a file. The rest of the actions are the standard ones, such as start, end, fatal error, and error handler.
The same workflow worked fine on an EC2 instance, but when we run it on EMR we get the following error:
2019-09-12 19:34:41,300 WARN ActionStartXCommand:523 - SERVER[<hostname>] USER[hadoop] GROUP[-] TOKEN[] APP[<WorkflowName>] JOB[0000006-190911195656052-oozie-oozi-W] ACTION[0000006-190911195656052-oozie-oozi-W#ErrorHandler] Error starting action [ErrorHandler]. ErrorType [ERROR], ErrorCode [EM007], Message [EM007: Encountered an error while sending the email message over SMTP.]
org.apache.oozie.action.ActionExecutorException: EM007: Encountered an error while sending the email message over SMTP.
at org.apache.oozie.action.email.EmailActionExecutor.email(EmailActionExecutor.java:304)
at org.apache.oozie.action.email.EmailActionExecutor.validateAndMail(EmailActionExecutor.java:173)
at org.apache.oozie.action.email.EmailActionExecutor.start(EmailActionExecutor.java:112)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:459)
at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:82)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:283)
at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:62)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:244)
at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:56)
at org.apache.oozie.command.XCommand.call(XCommand.java:291)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.mail.MessagingException: Could not connect to SMTP host: <hostname>, port: 25;
nested exception is:
java.net.ConnectException: Connection refused (Connection refused)
at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1961)
at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:654)
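The EM007 warning is raised by the ErrorHandler email action, so it looks like a secondary failure: the workflow had already failed and the notification could not be sent because nothing accepted connections on port 25. A minimal Python sketch for checking that from the Oozie server host is shown below. The function name can_connect is hypothetical, and "localhost" is a placeholder for whatever SMTP host Oozie is configured with (for example through oozie.email.smtp.host in oozie-site.xml).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Placeholder host; substitute the SMTP host configured for Oozie.
    print(can_connect("localhost", 25))
```

A False result only confirms the "Connection refused" seen in the trace. It says nothing about why the workflow reached the error handler in the first place.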
When we check the application logs, we see the following error:
Launcher AM execution failed
java.lang.UnsupportedOperationException: Not implemented by the S3FileSystem FileSystem implementation
at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.oozie.action.hadoop.FSLauncherURIHandler.create(FSLauncherURIHandler.java:36)
at org.apache.oozie.action.hadoop.PrepareActionsHandler.execute(PrepareActionsHandler.java:86)
at org.apache.oozie.action.hadoop.PrepareActionsHandler.prepareAction(PrepareActionsHandler.java:73)
at org.apache.oozie.action.hadoop.LauncherAM.executePrepare(LauncherAM.java:371)
at org.apache.oozie.action.hadoop.LauncherAM.access$000(LauncherAM.java:55)
at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:220)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Exception in thread "main" java.lang.UnsupportedOperationException: Not implemented by the S3FileSystem FileSystem implementation
at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1060)
at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1371)
Hadoop distribution: Amazon 2.8.5
Oozie version: 5.1.0
EMR version: emr-5.26.0
Appreciate any guidance here.
The issue was resolved after we switched to an older version of Oozie, 4.3, with no other changes. I had read in one of the AWS links that some people were unable to run Oozie 5.x. I will update this answer once we get a concrete reply from AWS.
I started the Spark 2.0 Thrift server in a local environment and it works fine, but when I tried it in the cluster environment the following exception was thrown:
16/06/02 10:21:06 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended!
It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:148)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:502)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2246)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:749)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:57)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:81)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:724)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/06/02 10:21:06 INFO util.ShutdownHookManager: Shutdown hook called
The application master logs show:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
spark-defaults configuration:
spark.executor.memory 2g
spark.driver.memory 4g
spark.executor.cores 1
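One possible reading of the ExecutorLauncher error, offered as an assumption rather than a diagnosis, is that the YARN containers cannot see the Spark jars. Spark 2.0 on YARN can be pointed at them with spark.yarn.jars or spark.yarn.archive, and when neither is set it uploads the jars from the local Spark installation. The Python sketch below parses spark-defaults style text and reports whether either property is present. The helper names and the hdfs:// path in the usage note are hypothetical.

```python
def parse_spark_defaults(text: str) -> dict:
    """Parse spark-defaults.conf style text into a dict of property -> value.

    This sketch only handles whitespace-separated pairs, as in the listing
    above; it does not handle '=' or ':' separators.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        props[key] = value
    return props


def has_yarn_jars_setting(props: dict) -> bool:
    """Return True if spark.yarn.jars or spark.yarn.archive has a value."""
    return any(props.get(k) for k in ("spark.yarn.jars", "spark.yarn.archive"))


# The three properties listed above.
current = """\
spark.executor.memory 2g
spark.driver.memory 4g
spark.executor.cores 1
"""

props = parse_spark_defaults(current)
print(props)
print(has_yarn_jars_setting(props))
```

With the configuration shown, neither property is set, so the cluster depends on the jars found under the local Spark installation at submit time. A line such as "spark.yarn.jars hdfs:///spark/jars/*" (hypothetical path) would make the check return True.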
I have an Elastic MapReduce job which uses elasticsearch-hadoop via scalding-taps to transfer data from Amazon S3 to Amazon Elasticsearch Service. For a long time this job ran successfully. However, it has recently started failing with the following stack trace:
2016-03-02 07:28:34,003 FATAL [IPC Server handler 0 on 41019] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1456902751849_0012_m_000000_0 - exited : cascading.tuple.TupleException: unable to sink into output identifier: myindex/mytable
at cascading.tuple.TupleEntrySchemeCollector.collect(TupleEntrySchemeCollector.java:160)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:95)
at cascading.tuple.TupleEntrySchemeCollector.add(TupleEntrySchemeCollector.java:134)
at cascading.flow.stream.SinkStage.receive(SinkStage.java:90)
at cascading.flow.stream.SinkStage.receive(SinkStage.java:37)
at cascading.flow.stream.FunctionEachStage$1.collect(FunctionEachStage.java:80)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at com.twitter.scalding.MapFunction.operate(Operations.scala:59)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: null
We have enabled the "es.nodes.wan.only" setting.
What could be causing this failure?
I downloaded spark-1.5.2 and set up a cluster on EC2 following the spark-ec2 documentation.
After that I went to examples/ and ran mvn package to package the examples in a jar.
Finally, I ran the submit with:
bin/spark-submit --class org.apache.spark.examples.JavaTC --master spark://url_here.eu-west-1.compute.amazonaws.com:7077 --deploy-mode cluster /home/aki/Projects/spark-1.5.2/examples/target/spark-examples_2.10-1.5.2.jar
Instead of running, it fails with this error:
WARN RestSubmissionClient: Unable to connect to server spark://url_here.eu-west-1.compute.amazonaws.com:7077.
Warning: Master endpoint spark://url_here.eu-west-1.compute.amazonaws.com:7077 was not a REST server. Falling back to legacy submission gateway instead.
15/12/22 17:36:07 WARN Utils: Your hostname, aki-linux resolves to a loopback address: 127.0.1.1; using 192.168.10.63 instead (on interface wlp4s0)
15/12/22 17:36:07 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/12/22 17:36:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:116)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.deploy.Client$.main(Client.scala:233)
at org.apache.spark.deploy.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
... 21 more
Are you sure the master URL really contains "url_here"?
spark://url_here.eu-west-1.compute.amazonaws.com:7077
Or maybe you are just obfuscating it for this post.
If you can connect to the Spark UI at
http://url_here.eu-west-1.compute.amazonaws.com:4040 or, depending on your Spark version, http://url_here.eu-west-1.compute.amazonaws.com:8080, make sure the spark://...:7077 command-line argument matches the master URL shown in the Spark UI exactly.
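A quick way to rule out a plain connectivity problem (for example a closed EC2 security group port) before digging into Spark itself is to test the master port from the submitting machine. This is only a minimal sketch: the helper names parse_master and master_reachable are made up for illustration, and it assumes that nc (netcat) is installed and that the master listens on the port given in the spark:// URL.

```shell
#!/bin/sh
# Split a spark://host:port master URL into "host port".
parse_master() {
  rest="${1#spark://}"            # drop the scheme
  echo "${rest%:*} ${rest##*:}"   # host, then port
}

# Return 0 if a TCP connection to the master port succeeds within 5 seconds.
# Assumption: `nc` (netcat) is available on the submitting machine.
master_reachable() {
  set -- $(parse_master "$1")
  nc -z -w 5 "$1" "$2"
}

# Example (replace the placeholder with the master URL shown in the Spark UI):
# master_reachable "spark://<master-host>:7077" && echo reachable || echo unreachable
```

If the port is unreachable, the 120-second lookup timeout in the question is the expected symptom, and the fix is most likely on the network or security-group side rather than in the spark-submit arguments.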
I am reading millions of XML files via
val xmls = sc.binaryFiles(xmlDir)
The operation runs fine locally, but on YARN it fails with:
client token: N/A
diagnostics: Application application_1433491939773_0012 failed 2 times due to ApplicationMaster for attempt appattempt_1433491939773_0012_000002 timed out. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1433750951883
final status: FAILED
tracking URL: http://controller01:8088/cluster/app/application_1433491939773_0012
user: ariskk
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
In Hadoop's userlogs, I frequently see these messages:
15/06/08 09:15:38 WARN util.AkkaUtils: Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@2b4f336b,BlockManagerId(1, controller01.stratified, 58510))] in 2 attempts
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:195)
at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:427)
I run my Spark job via spark-submit, and it works for another HDFS directory that contains only 37k files. Any ideas on how to resolve this?
OK, after getting some help on the Spark mailing list, I found out there were two issues:
1. The source directory: passing it as /my_dir/ makes Spark fail and causes the heartbeat timeouts. It should be given as hdfs:///my_dir/* instead.
2. After fixing #1, an out-of-memory error appears in the logs. This is the Spark driver running on YARN running out of memory because of the number of files (apparently it keeps all file info in memory). I submitted the job with --conf spark.driver.memory=8g, which fixed the issue.
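Put together, the two fixes might look like the following on the command line. This is a sketch under assumptions, not the exact command used: the class name and jar path are hypothetical placeholders, and it assumes the job takes the input directory as its first argument. The function only prints the command, so it can be reviewed before running it.

```shell
#!/bin/sh
INPUT_GLOB='hdfs:///my_dir/*'     # fully qualified HDFS URI with a glob (fix 1)
DRIVER_MEMORY='8g'                # larger driver heap (fix 2)
APP_CLASS='com.example.XmlJob'    # hypothetical class name
APP_JAR='/path/to/xml-job.jar'    # hypothetical jar path

submit_cmd() {
  # The glob is wrapped in single quotes so the local shell passes it to
  # Spark unexpanded instead of trying to match local files.
  echo "spark-submit --conf spark.driver.memory=${DRIVER_MEMORY} --class ${APP_CLASS} ${APP_JAR} '${INPUT_GLOB}'"
}

submit_cmd
```

Any other options from the original spark-submit invocation (master, deploy mode, executor settings) would be added alongside these.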