HMaster: Failed to become active master - hadoop

I had a perfectly working HBase cluster with 2 nodes.
I shut down the servers, and now when I restart the cluster I get this error.
No configuration has been changed, the servers' IPs are the same, and the NameNode and DataNodes are also exactly the same.
What I have noticed is that the HBase master starts and runs: I can log on to the HBase shell and list all the tables, but I cannot read any data or create new tables.
I have checked with jps that all DataNodes and NameNodes are started, including on the other nodes.
From my previous installation notes I noticed that the ResourceManager is not running. I am not sure if this is relevant.
ZooKeeper is also running without any errors.
I am not sure what is going on, but it is really critical for me to solve this.
Detailed info on the steps followed and the errors encountered
The steps I followed to start the HBase cluster are as follows:
Start HDFS
start-dfs.s
JPS Output
2164 NameNode
2519 Jps
2399 SecondaryNameNode
Start Zookeeper
JPS output
2164 NameNode
2554 QuorumPeerMain
2588 Jps
2399 SecondaryNameNode
Start hbase
This gives the following on the console
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /opt/hbase/bin/../logs/hbase-hadoop-master-rd-demo-hbase.out
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
rd-demo-hbase-c1: running regionserver, logging to /opt/hbase/bin/../logs/hbase-hadoop-regionserver-rd-demo-hbase-c1.out
rd-demo-hbase-c2: running regionserver, logging to /opt/hbase/bin/../logs/hbase-hadoop-regionserver-rd-demo-hbase-c2.out
rd-demo-hbase-c1: /opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
rd-demo-hbase-c1: /opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
rd-demo-hbase-c2: /opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
rd-demo-hbase-c2: /opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
Up to this point there are no errors in hbase-hadoop-master-hbase.log.
JPS output
2832 HMaster
2164 NameNode
2554 QuorumPeerMain
2399 SecondaryNameNode
3183 Jps
This implies that HMaster is indeed running.
Log on to the HBase shell
This gives some warnings:
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.4, r20e7ba45b0c3affdc0c06b1a0e5cbddd1b2d8d18, Mon Jun 7 15:31:55 PDT 2021
Took 0.0052 seconds
The logon to the HBase shell succeeds.
The list command shows the tables present.
As soon as I try to scan a table, things start to go wrong.
The HBase shell shows the following:
scan 'md_Domains'
ERROR: Unknown table md_Domains!
The HBase logs show the following:
2022-01-27 18:49:18,055 ERROR [master/rd-demo-hbase:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1233)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1028)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2091)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:507)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1231)
... 4 more
2022-01-27 18:49:18,056 ERROR [master/rd-demo-hbase:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
2022-01-27 18:49:18,057 ERROR [master/rd-demo-hbase:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master rd-demo-hbase.c.rd-demo-320517.internal,16000,1643309039738: Unhandled exception. Starting shutdown. *****
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1233)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1028)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2091)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:507)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1231)
... 4 more
2022-01-27 18:49:18,057 INFO [master/rd-demo-hbase:16000:becomeActiveMaster] regionserver.HRegionServer: ***** STOPPING region server 'rd-demo-hbase.c.rd-demo-320517.internal,16000,1643309039738' *****
2022-01-27 18:49:18,057 INFO [master/rd-demo-hbase:16000:becomeActiveMaster] regionserver.HRegionServer: STOPPED: Stopped by master/rd-demo-hbase:16000:becomeActiveMaster
2022-01-27 18:49:18,058 INFO [master/rd-demo-hbase:16000] regionserver.HRegionServer: Stopping infoServer
2022-01-27 18:49:18,070 INFO [master/rd-demo-hbase:16000] handler.ContextHandler: Stopped o.a.h.t.o.e.j.w.WebAppContext#35c12c7a{master,/,null,STOPPED}{file:/opt/hbase/hbase-webapps/master}
2022-01-27 18:49:18,075 INFO [master/rd-demo-hbase:16000] server.AbstractConnector: Stopped ServerConnector#3db972d2{HTTP/1.1, (http/1.1)}{0.0.0.0:16010}
2022-01-27 18:49:18,076 INFO [master/rd-demo-hbase:16000] server.session: node0 Stopped scavenging
2022-01-27 18:49:18,083 INFO [master/rd-demo-hbase:16000] handler.ContextHandler: Stopped o.a.h.t.o.e.j.s.ServletContextHandler#3d5790ea{static,/static,file:///opt/hbase/hbase-webapps/static/,STOPPED}
2022-01-27 18:49:18,090 INFO [master/rd-demo-hbase:16000] handler.ContextHandler: Stopped o.a.h.t.o.e.j.s.ServletContextHandler#bfc14b9{logs,/logs,file:///opt/hbase/logs/,STOPPED}
2022-01-27 18:49:18,094 INFO [master/rd-demo-hbase:16000] regionserver.HRegionServer: aborting server rd-demo-hbase.c.rd-demo-320517.internal,16000,1643309039738
2022-01-27 18:49:18,095 INFO [master/rd-demo-hbase:16000] regionserver.HRegionServer: stopping server rd-demo-hbase.c.rd-demo-320517.internal,16000,1643309039738; all regions closed.
2022-01-27 18:49:18,095 INFO [master/rd-demo-hbase:16000] master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x1000019d2180006, quorum=rd-demo-hbase:2181, baseZNode=/hbase
2022-01-27 18:49:18,096 WARN [OldWALsCleaner-1] cleaner.LogCleaner: Interrupted while cleaning old WALs, will try to clean it next round. Exiting.
2022-01-27 18:49:18,098 WARN [OldWALsCleaner-0] cleaner.LogCleaner: Interrupted while cleaning old WALs, will try to clean it next round. Exiting.
2022-01-27 18:49:18,201 INFO [master/rd-demo-hbase:16000] zookeeper.ZooKeeper: Session: 0x1000019d2180006 closed
2022-01-27 18:49:18,202 INFO [master/rd-demo-hbase:16000:becomeActiveMaster-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x1000019d2180006
2022-01-27 18:49:18,202 INFO [master/rd-demo-hbase:16000] hbase.ChoreService: Chore service for: master/rd-demo-hbase:16000 had [] on shutdown
2022-01-27 18:49:18,203 INFO [master/rd-demo-hbase:16000] procedure2.RemoteProcedureDispatcher: Stopping procedure remote dispatcher
2022-01-27 18:49:18,203 INFO [master/rd-demo-hbase:16000] procedure2.ProcedureExecutor: Stopping
2022-01-27 18:49:18,206 INFO [master/rd-demo-hbase:16000] region.RegionProcedureStore: Stopping the Region Procedure Store, isAbort=true
2022-01-27 18:49:18,208 WARN [master/rd-demo-hbase:16000] master.ActiveMasterManager: Failed get of master address: java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2022-01-27 18:49:18,208 INFO [master/rd-demo-hbase:16000] assignment.AssignmentManager: Stopping assignment manager
2022-01-27 18:49:18,208 INFO [master/rd-demo-hbase:16000] region.MasterRegion: Closing local region {ENCODED => 1595e783b53d99cd5eef43b6debb2682, NAME => 'master:store,,1.1595e783b53d99cd5eef43b6debb2682.', STARTKEY => '', ENDKEY => ''}, isAbort=true
2022-01-27 18:49:18,242 INFO [master/rd-demo-hbase:16000] regionserver.HRegion: Closing region master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-01-27 18:49:18,248 ERROR [master/rd-demo-hbase:16000] regionserver.HRegion: Memstore data size is 54229 in region master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-01-27 18:49:18,248 INFO [master/rd-demo-hbase:16000] regionserver.HRegion: Closed master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-01-27 18:49:18,248 INFO [master/rd-demo-hbase:16000] flush.MasterFlushTableProcedureManager: stop: server shutting down.
2022-01-27 18:49:18,249 INFO [master/rd-demo-hbase:16000] ipc.NettyRpcServer: Stopping server on /192.168.0.111:16000
2022-01-27 18:49:18,252 INFO [master:store-WAL-Roller] wal.AbstractWALRoller: LogRoller exiting.
2022-01-27 18:49:18,373 INFO [master/rd-demo-hbase:16000] zookeeper.ZooKeeper: Session: 0x1000019d2180000 closed
2022-01-27 18:49:18,373 INFO [master/rd-demo-hbase:16000] regionserver.HRegionServer: Exiting; stopping=rd-demo-hbase.c.rd-demo-320517.internal,16000,1643309039738; zookeeper connection closed.
2022-01-27 18:49:18,374 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:261)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:149)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2872)
2022-01-27 18:49:18,374 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x1000019d2180000
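In a trace like the one above, the last "Caused by:" line is usually the most informative part. Here it says the master waited 300000 ms for the hbase:namespace table to be assigned and then gave up. A minimal Python sketch for pulling that line out of a log excerpt follows. The helper name root_cause is illustrative, and the excerpt is shortened from the log above.

```python
def root_cause(log_text):
    """Return the text of the last 'Caused by:' line in a Java stack trace, or None."""
    prefix = "Caused by: "
    causes = [
        line.strip()[len(prefix):]
        for line in log_text.splitlines()
        if line.strip().startswith(prefix)
    ]
    return causes[-1] if causes else None


# Shortened excerpt of the master log from the post.
excerpt = """\
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1233)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
"""

print(root_cause(excerpt))
```

The post does not show whether the region servers on rd-demo-hbase-c1 and rd-demo-hbase-c2 stayed up after launch, so their logs are probably the next place to look for why the namespace table was never assigned.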
Running jps now shows that HMaster is no longer running:
2164 NameNode
3416 Jps
2554 QuorumPeerMain
2399 SecondaryNameNode
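The jps listings above can be compared against the daemons each host is expected to run. This is a minimal Python sketch, not part of the original post. The helper names and the expected set for the master host are assumptions to adjust per node role.

```python
def running_daemons(jps_output):
    """Return the set of process names listed in jps output."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line looks like "<pid> <MainClass>"; skip anything else.
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in the jps output."""
    return set(expected) - running_daemons(jps_output)


# Final jps listing from the post, taken after the master aborted.
after_abort = """\
2164 NameNode
3416 Jps
2554 QuorumPeerMain
2399 SecondaryNameNode
"""

# Assumed expectation for the master host.
expected_on_master = {"NameNode", "SecondaryNameNode", "QuorumPeerMain", "HMaster"}

print(sorted(missing_daemons(after_abort, expected_on_master)))  # ['HMaster']
```

Running the same check on the worker hosts with an expected set such as {"DataNode", "HRegionServer"} would show whether the region servers are still alive there. None of the listings in the post cover those hosts.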

Yarn (Node and Resource Manager) not Running, Hadoop 3.2.1 Installation Windows 10

Summary
I am installing Hadoop following this guide. Everything goes fine until Step 7 (starting NameNode and DataNode), but when I try Step 8 (starting NodeManager and ResourceManager), the two command windows open and then each fails with the following exceptions.
nodemanager cmd:
2022-11-18 18:29:44,278 ERROR nodemanager.NodeManager: Error starting NodeManager
java.lang.ExceptionInInitializerError
at com.google.inject.internal.cglib.reflect.$FastClassEmitter.<init>(FastClassEmitter.java:67)
at com.google.inject.internal.cglib.reflect.$FastClass$Generator.generateClass(FastClass.java:72)
at com.google.inject.internal.cglib.core.$DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:25)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:216)
at com.google.inject.internal.cglib.reflect.$FastClass$Generator.create(FastClass.java:64)
at com.google.inject.internal.BytecodeGen.newFastClass(BytecodeGen.java:204)
at com.google.inject.internal.ProviderMethod$FastClassProviderMethod.<init>(ProviderMethod.java:256)
at com.google.inject.internal.ProviderMethod.create(ProviderMethod.java:71)
at com.google.inject.internal.ProviderMethodsModule.createProviderMethod(ProviderMethodsModule.java:275)
at com.google.inject.internal.ProviderMethodsModule.getProviderMethods(ProviderMethodsModule.java:144)
at com.google.inject.internal.ProviderMethodsModule.configure(ProviderMethodsModule.java:123)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:349)
at com.google.inject.AbstractModule.install(AbstractModule.java:122)
at com.google.inject.servlet.ServletModule.configure(ServletModule.java:52)
at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
at com.google.inject.spi.Elements.getElements(Elements.java:110)
at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
at com.google.inject.Guice.createInjector(Guice.java:96)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:387)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:432)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:428)
at org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer.serviceStart(WebServer.java:112)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:975)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module #7c0c77c7
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:200)
at java.base/java.lang.reflect.Method.setAccessible(Method.java:194)
at com.google.inject.internal.cglib.core.$ReflectUtils$2.run(ReflectUtils.java:56)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
at com.google.inject.internal.cglib.core.$ReflectUtils.<clinit>(ReflectUtils.java:46)
... 32 more
2022-11-18 18:29:44,286 INFO ipc.Server: Stopping server on 57727
2022-11-18 18:29:44,287 INFO ipc.Server: Stopping IPC Server listener on 0
2022-11-18 18:29:44,287 INFO ipc.Server: Stopping IPC Server Responder
2022-11-18 18:29:44,288 WARN monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2022-11-18 18:29:44,297 INFO ipc.Server: Stopping server on 8040
2022-11-18 18:29:44,298 INFO ipc.Server: Stopping IPC Server listener on 8040
2022-11-18 18:29:44,298 INFO ipc.Server: Stopping IPC Server Responder
2022-11-18 18:29:44,299 WARN nodemanager.NodeResourceMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl is interrupted. Exiting.
2022-11-18 18:29:44,299 INFO localizer.ResourceLocalizationService: Public cache exiting
2022-11-18 18:29:44,299 INFO impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2022-11-18 18:29:44,300 INFO impl.MetricsSystemImpl: NodeManager metrics system stopped.
2022-11-18 18:29:44,301 INFO impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2022-11-18 18:29:44,301 INFO nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at my-computer-name/xxx.xxx.xxx.xxx
************************************************************/
resourcemanager cmd:
2022-11-18 18:29:43,321 FATAL resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.ExceptionInInitializerError
at com.google.inject.internal.cglib.reflect.$FastClassEmitter.<init>(FastClassEmitter.java:67)
at com.google.inject.internal.cglib.reflect.$FastClass$Generator.generateClass(FastClass.java:72)
at com.google.inject.internal.cglib.core.$DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:25)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:216)
at com.google.inject.internal.cglib.reflect.$FastClass$Generator.create(FastClass.java:64)
at com.google.inject.internal.BytecodeGen.newFastClass(BytecodeGen.java:204)
at com.google.inject.internal.ProviderMethod$FastClassProviderMethod.<init>(ProviderMethod.java:256)
at com.google.inject.internal.ProviderMethod.create(ProviderMethod.java:71)
at com.google.inject.internal.ProviderMethodsModule.createProviderMethod(ProviderMethodsModule.java:275)
at com.google.inject.internal.ProviderMethodsModule.getProviderMethods(ProviderMethodsModule.java:144)
at com.google.inject.internal.ProviderMethodsModule.configure(ProviderMethodsModule.java:123)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:349)
at com.google.inject.AbstractModule.install(AbstractModule.java:122)
at com.google.inject.servlet.ServletModule.configure(ServletModule.java:52)
at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
at com.google.inject.spi.Elements.getElements(Elements.java:110)
at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
at com.google.inject.Guice.createInjector(Guice.java:96)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:387)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:432)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1231)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1340)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1535)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module #222545dc
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:200)
at java.base/java.lang.reflect.Method.setAccessible(Method.java:194)
at com.google.inject.internal.cglib.core.$ReflectUtils$2.run(ReflectUtils.java:56)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
at com.google.inject.internal.cglib.core.$ReflectUtils.<clinit>(ReflectUtils.java:46)
... 29 more
2022-11-18 18:29:43,329 INFO resourcemanager.ResourceManager: Transitioning to standby state
2022-11-18 18:29:43,329 INFO resourcemanager.ResourceManager: Transitioned to standby state
2022-11-18 18:29:43,330 INFO resourcemanager.ResourceManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at my-computer-name/xxx.xxx.xxx.xxx
************************************************************/
Details of my Attempt
Using JDK 18.0.2
Environment variable JAVA_HOME is C:\PROGRA~1\Java\jdk-18.0.2 (because "Program Files" had some issues earlier)
I do not have yarn package manager installed
Reader's Note
In case there are important details missing let me know to add them.
The issue was actually the JDK version: I switched to the one used in the guide and it worked just fine.
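Both traces end in an InaccessibleObjectException saying that java.base does not open java.lang to the unnamed module. Recent JDKs refuse this kind of reflective access by default, which fits the finding that JDK 18 was the problem. The Python sketch below compares a JDK version string against a maximum supported major version. The value 8 for this Hadoop 3.2.1 setup is an assumption here and should be checked against the guide or the Hadoop documentation.

```python
import re


def java_major_version(version):
    """Extract the major Java version from strings like '18.0.2' or '1.8.0_292'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version string: {version!r}")
    first = int(match.group(1))
    # Legacy numbering scheme: '1.8.0_292' means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_supported(version, max_major):
    """Return True if the JDK major version does not exceed the assumed maximum."""
    return java_major_version(version) <= max_major


# Assumption: the guide's Hadoop 3.2.1 setup targets Java 8.
ASSUMED_MAX_MAJOR = 8

for v in ("18.0.2", "1.8.0_292"):
    print(v, java_major_version(v), is_supported(v, ASSUMED_MAX_MAJOR))
```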

ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode

I am trying to restart one of the NameNodes (nn2), but I get the following error in the logs:
2021-12-17 10:23:53,676 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-17 10:23:53,678 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
2021-12-17 10:23:53,681 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX
************************************************************/
I tried the following steps to solve the issue:
I copied the following edit log files from nn01 to the NameNode directories of nn02:
edits_0000000000274487928-0000000000274488048
edits_0000000000274488049-0000000000274488109
So far nn02 is still not starting and I get the same error.
Can you please help?
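If this is an HA setup and your NN1 is working properly, format NN2 (hdfs namenode -format) and then bootstrap it from NN1 (hdfs namenode -bootstrapStandby). Back up the metadata directories first, because in a cluster with shared edits the format step may also ask to re-format the shared journal.
Then try restarting NN2.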
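The PrematureEOFException in the question reports a segment ending at txid 274488048 while the file was expected to run up to 274488109. These are the boundaries of the two copied files. The Python sketch below checks whether a set of finalized segment names (edits_<first>-<last>) forms a contiguous txid range. It inspects file names only, so it cannot detect a file whose contents are truncated. The helper names are illustrative.

```python
import re

SEGMENT_RE = re.compile(r"^edits_(\d+)-(\d+)$")


def parse_segment(name):
    """Return (first_txid, last_txid) for a finalized edit log segment name."""
    match = SEGMENT_RE.match(name)
    if match is None:
        raise ValueError(f"not a finalized edits segment: {name!r}")
    return int(match.group(1)), int(match.group(2))


def find_gaps(names):
    """Return (expected_txid, found_txid) pairs where consecutive segments do not line up."""
    ranges = sorted(parse_segment(n) for n in names)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        if next_start != prev_end + 1:
            gaps.append((prev_end + 1, next_start))
    return gaps


# The two file names copied from nn01 in the question.
segments = [
    "edits_0000000000274487928-0000000000274488048",
    "edits_0000000000274488049-0000000000274488109",
]
print(find_gaps(segments))  # an empty list means the names are contiguous
```

The names in the question are contiguous, so name-level gaps do not explain the error. Re-syncing the standby from the working NameNode, as suggested above, avoids patching individual segments by hand.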

Failed to start namenode.java.lang.IllegalStateException

I am using an Apache Hadoop 2.7.1 high-availability cluster that consists of
two NameNodes (mn1, mn2) and 3 JournalNodes.
While working on the cluster I ran into the following problem.
When I issue start-dfs.sh, mn1 is standby and mn2 is active.
After that, if one of these two NameNodes goes down, it is not possible
to start it again.
Here are the last lines of the log of one of these two NameNodes:
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false)
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 3 entries 72 lookups
2017-08-05 09:37:21,088 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 7052 msecs
2017-08-05 09:37:21,300 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to mn2:8020
2017-08-05 09:37:21,304 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017-08-05 09:37:21,316 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2017-08-05 09:37:21,353 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2017-08-05 09:37:21,354 WARN org.apache.hadoop.hdfs.server.common.Util: Path /opt/hadoop/metadata_dir should be specified as a URI in configuration files. Please update hdfs configuration.
2017-08-05 09:37:21,361 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:119)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:5741)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1063)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:678)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:664)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2017-08-05 09:37:21,364 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-08-05 09:37:21,365 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at mn2/192.168.25.22
************************************************************/
This may be because:
1. The NameNode port may be different on each node.
This is a particularly vexing problem.
Swallow IllegalStateExceptions thrown by removeShutdownHook in FileSystem. The javadoc states:
public boolean removeShutdownHook(Thread hook)
Throws:
IllegalStateException - If the virtual machine is already in the process of shutting down
So if we are getting this exception, it MEANS we are already in the process of shutdown, so we CANNOT, try what we may, removeShutdownHook. If Runtime had a method Runtime.isShutdownInProgress(), we could have checked for it before the removeShutdownHook call. As it stands, there is no such method. In my opinion, this would be a good patch regardless of the needs for this JIRA.
Do not send SIGTERMs from the NM to the MR-AM in the first place. Rather, we should expose a mechanism for the NM to politely tell the AM that it is no longer needed and should shut down asap. Even after this, if an admin were to kill the MRAppMaster with a SIGTERM, the JobHistory would be lost, defeating the purpose of 3614.
I discovered that my problem was in the JournalNode and not in the NameNode,
even though the NameNode log shows the error mentioned in the question.
jps listed the JournalNode, but that was misleading: the JournalNode service was actually down
even though it appeared in the jps output.
As a solution I ran hadoop-daemon.sh stop journalnode,
then hadoop-daemon.sh start journalnode,
and then the NameNode started to work again.
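Since jps can list a JournalNode process that is not actually serving requests, a connection test says more than a process listing. The Python sketch below tries a TCP connection to each JournalNode. The address list is a placeholder, and port 8485 (the usual default for dfs.journalnode.rpc-address) is assumed rather than taken from the original post.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Placeholder addresses: replace with the real JournalNode hosts.
# Port 8485 is assumed to be the JournalNode RPC port in use.
JOURNAL_NODES = [("127.0.0.1", 8485)]

for host, port in JOURNAL_NODES:
    state = "reachable" if is_port_open(host, port) else "NOT reachable"
    print(f"{host}:{port} {state}")
```

A successful connection only shows that something is listening on the port. It does not prove that the JournalNode is healthy, so the JournalNode log is still worth reading before and after the restart.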

Unable to Start ResourceManager (capacity-scheduler.xml not found), Hadoop 2.6.0

I installed hadoop-2.6.0 and followed the Single Cluster instructions from the Apache site: http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html
I tried to start the ResourceManager using the following command:
$ sbin/start-yarn.sh
I get no error in the console; however, the ResourceManager log shows errors. Here is the log:
2015-02-05 19:59:08,360 INFO [main] resourcemanager.RMNMInfo (RMNMInfo.java:<init>(63)) - Registered RMNMInfo MBean
2015-02-05 19:59:08,360 INFO [main] metrics.SystemMetricsPublisher (SystemMetricsPublisher.java:serviceInit(92)) - YARN system metrics publishing service is not enabled
2015-02-05 19:59:08,361 INFO [main] util.HostsFileReader (HostsFileReader.java:refresh(129)) - Refreshing hosts (include/exclude) list
2015-02-05 19:59:08,364 INFO [main] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2231)) - capacity-scheduler.xml not found
2015-02-05 19:59:08,388 INFO [main] service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED; cause: java.lang.IllegalStateException: Queue configuration missing child queue names for root
java.lang.IllegalStateException: Queue configuration missing child queue names for root
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:558)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:463)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:558)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:989)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1210)
2015-02-05 19:59:08,390 INFO [main] service.AbstractService (AbstractService.java:noteFailure(272)) - Service RMActiveServices failed in state INITED; cause: java.lang.IllegalStateException: Queue configuration missing child queue names for root
java.lang.IllegalStateException: Queue configuration missing child queue names for root
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:558)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:463)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:558)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:989)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1210)
2015-02-05 19:59:08,390 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping ResourceManager metrics system...
2015-02-05 19:59:08,391 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - ResourceManager metrics system stopped.
2015-02-05 19:59:08,391 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(605)) - ResourceManager metrics system shutdown complete.
2015-02-05 19:59:08,391 INFO [main] event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(138)) - AsyncDispatcher is draining to stop, igonring any new events.
2015-02-05 19:59:08,391 INFO [main] service.AbstractService (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in state INITED; cause: java.lang.IllegalStateException: Queue configuration missing child queue names for root
java.lang.IllegalStateException: Queue configuration missing child queue names for root
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:558)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:463)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:558)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:989)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1210)
2015-02-05 19:59:08,392 INFO [main] resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1058)) - Transitioning to standby state
2015-02-05 19:59:08,392 INFO [main] resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1068)) - Transitioned to standby state
2015-02-05 19:59:08,392 FATAL [main] resourcemanager.ResourceManager (ResourceManager.java:main(1214)) - Error starting ResourceManager
java.lang.IllegalStateException: Queue configuration missing child queue names for root
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:558)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:463)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:295)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:558)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:989)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1210)
2015-02-05 19:59:08,403 INFO [Thread-1] resourcemanager.ResourceManager (StringUtils.java:run(659)) - SHUTDOWN_MSG:
I have the file capacity-scheduler.xml in the hadoop-2.6.0/etc/hadoop/ folder.
I am not sure what went wrong.
I am able to start the NameNode using start-dfs without any issues.
jps shows the following:
9379 SecondaryNameNode
9057 NameNode
9199 DataNode
12861 Jps
Thanks
You are missing the capacity-scheduler.xml in your config directory or in your classpath. You can pull a default example copy of it from here
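The exception in the question, "Queue configuration missing child queue names for root", is what the CapacityScheduler reports when it finds no value for the property yarn.scheduler.capacity.root.queues, which is the case when capacity-scheduler.xml is not loaded at all. As a rough sketch (the function name root_queues is hypothetical, and the file path in the usage comment is taken from the question), you could verify that the file you expect to be loaded actually defines that property:

```python
import xml.etree.ElementTree as ET

# Property the CapacityScheduler reads to find the child queues of root.
ROOT_QUEUES = "yarn.scheduler.capacity.root.queues"


def root_queues(xml_text):
    """Return the child queue names configured under root, or [] if none.

    Hypothetical helper: parses Hadoop-style <configuration> XML and
    looks up the yarn.scheduler.capacity.root.queues property.
    """
    config = ET.fromstring(xml_text)
    for prop in config.iter("property"):
        if prop.findtext("name", "").strip() == ROOT_QUEUES:
            value = prop.findtext("value", "") or ""
            return [q.strip() for q in value.split(",") if q.strip()]
    return []


# Example usage (path as given in the question):
# with open("hadoop-2.6.0/etc/hadoop/capacity-scheduler.xml") as f:
#     print(root_queues(f.read()))
```

An empty list would be consistent with the exception above. A non-empty list suggests the file itself is fine and that the ResourceManager is reading a different configuration directory or classpath.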

While running a topology in storm we are getting error like this

While running a topology in Storm, we get the following output:
8983 [Thread-6] INFO com.netflix.curator.framework.imps.CuratorFrameworkImpl - Starting
9144 [main] INFO backtype.storm.daemon.nimbus - Shutting down master
9199 [Thread-6-EventThread] INFO backtype.storm.zookeeper - Zookeeper state update: :connected:none
9241 [main] INFO backtype.storm.daemon.nimbus - Shut down master
9273 [Thread-6] INFO com.netflix.curator.framework.imps.CuratorFrameworkImpl - Starting
9306 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.server.NIOServerCnxn - EndOfStreamException: Unable to read additional data from client sessionid 0x143af55728d0003, likely client has closed socket
9354 [main] INFO backtype.storm.daemon.supervisor - Shutting down c094c3b1-a378-4c4f-af35-9278647c217a:4beddc09-4675-4fb9-8bdc-9cf5013ce9ca
9358 [main] INFO backtype.storm.daemon.supervisor - Shut down c094c3b1-a378-4c4f-af35-9278647c217a:4beddc09-4675-4fb9-8bdc-9cf5013ce9ca
9361 [main] INFO backtype.storm.daemon.supervisor - Shutting down supervisor c094c3b1-a378-4c4f-af35-9278647c217a
9364 [Thread-5] INFO backtype.storm.event - Event manager interrupted
9369 [Thread-6] INFO backtype.storm.event - Event manager interrupted
9425 [main] INFO backtype.storm.daemon.supervisor - Shutting down supervisor 386d8d71-c9b5-4b51-bd6e-f9f605034ea0
9428 [Thread-8] INFO backtype.storm.event - Event manager interrupted
9429 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.server.NIOServerCnxn - EndOfStreamException: Unable to read additional data from client sessionid 0x143af55728d0007, likely client has closed socket
9429 [Thread-9] INFO backtype.storm.event - Event manager interrupted
9473 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.server.NIOServerCnxn - EndOfStreamException: Unable to read additional data from client sessionid 0x143af55728d0009, likely client has closed socket
9476 [main] INFO backtype.storm.testing - Shutting down in process zookeeper
9503 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN org.apache.zookeeper.server.NIOServerCnxn - Ignoring exception
java.nio.channels.ClosedChannelException: null
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:211) ~[na:1.7.0_03]
at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:242) ~[zookeeper-3.3.3.jar:3.3.3-1073969]
9510 [main] INFO backtype.storm.testing - Done shutting down in process zookeeper
9513 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowmiya\AppData\Local\Temp\c9b1bc1a-a950-4098-af77-f81a4d2b112f
9520 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowmiya\AppData\Local\Temp\7e75c468-18ea-4787-a4ac-496fb108db71
9527 [main] INFO backtype.storm.testing - Unable to delete file: C:\Users\sowmiya\AppData\Local\Temp\7e75c468-18ea-4787-a4ac-496fb108db71\version-2\log.1
9529 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowmiya\AppData\Local\Temp\fa7b3c9b-ac93-4090-b9e2-63f10019e61f
9543 [main] INFO backtype.storm.testing - Deleting temporary path C:\Users\sowmiya\AppData\Local\Temp\55f1fd11-508e-43bb-b340-0d9b79f3af33
9579 [Thread-6-EventThread] INFO com.netflix.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
9580 [ConnectionStateManager-0] WARN com.netflix.curator.framework.state.ConnectionStateManager - There are no ConnectionStateListeners registered.
9583 [Thread-6-EventThread] WARN backtype.storm.cluster - Received event :disconnected::none: with disconnected Zookeeper.
11232 [Thread-6-SendThread(localhost:2000)] WARN org.apache.zookeeper.ClientCnxn - Session 0x143af55728d000b for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_03]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701) ~[na:1.7.0_03]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) ~[zookeeper-3.3.3.jar:3.3.3-1073969]
13992 [Thread-6-SendThread(localhost:2000)] WARN org.apache.zookeeper.ClientCnxn - Session 0x143af55728d000b for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_03]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701) ~[na:1.7.0_03]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
When we try to run the topology JAR file, the nimbus, ZooKeeper, and supervisor processes all shut down. Please help us understand why this happens.
Please help us rectify this error so that we can proceed further.
Thank you,
Sowmiya
Priya
This looks like a ZooKeeper issue. It looks like your processes are not able to connect to ZooKeeper. I can't say more without more information.
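One way to gather more information is to ask the ZooKeeper server directly whether it is healthy. ZooKeeper answers the four-letter command ruok with imok when it is running without errors. The sketch below is an illustration only: the function name zookeeper_is_ok is hypothetical, and the default host and port are taken from the localhost:2000 address in the log above. Newer ZooKeeper releases may restrict four-letter commands through a whitelist setting, so a False result there does not necessarily mean the server is down.

```python
import socket


def zookeeper_is_ok(host="localhost", port=2000, timeout=2.0):
    """Send ZooKeeper's 'ruok' command; return True if the reply is 'imok'.

    Hypothetical helper. Returns False if the connection is refused,
    times out, or the server replies with anything else.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        return False
    return reply == b"imok"


# Example usage (port 2000 as seen in the log):
# print(zookeeper_is_ok("localhost", 2000))
```

A refused connection here would match the "Connection refused" entries in the log and would indicate that nothing is listening on that port at the time of the check.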

Resources