HBase master keeps dying, claims hbase:namespace already exists - hadoop

In today's episode of "HBase is bringing me to my wits' end", we have an issue where the HBase master starts and then very quickly dies. My master log looks like this:
2014-06-20 12:52:40,469 FATAL [master:hdev01:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2014-06-20 12:52:40,470 FATAL [master:hdev01:60000] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:120)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
        at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1062)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:926)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
        at java.lang.Thread.run(Thread.java:662)
2014-06-20 12:52:40,473 INFO [master:hdev01:60000] master.HMaster: Aborting
2014-06-20 12:52:40,473 DEBUG [master:hdev01:60000] master.HMaster: Stopping service threads
2014-06-20 12:52:40,473 INFO [master:hdev01:60000] ipc.RpcServer: Stopping server on 60000
2014-06-20 12:52:40,473 INFO [CatalogJanitor-hdev01:60000] master.CatalogJanitor: CatalogJanitor-hdev01:60000 exiting
2014-06-20 12:52:40,473 INFO [hdev01,60000,1403283149823-BalancerChore] balancer.BalancerChore: hdev01,60000,1403283149823-BalancerChore exiting
2014-06-20 12:52:40,474 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2014-06-20 12:52:40,474 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2014-06-20 12:52:40,474 INFO [master:hdev01:60000] master.HMaster: Stopping infoServer
2014-06-20 12:52:40,474 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2014-06-20 12:52:40,474 INFO [master:hdev01:60000.oldLogCleaner] cleaner.LogCleaner: master:hdev01:60000.oldLogCleaner exiting
2014-06-20 12:52:40,475 INFO [hdev01,60000,1403283149823-ClusterStatusChore] balancer.ClusterStatusChore: hdev01,60000,1403283149823-ClusterStatusChore exiting
2014-06-20 12:52:40,476 INFO [master:hdev01:60000.oldLogCleaner] master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x246ba2ab1e4001c, quorum=hdev02:5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-20 12:52:40,479 INFO [master:hdev01:60000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010
2014-06-20 12:52:40,478 INFO [master:hdev01:60000.archivedHFileCleaner] cleaner.HFileCleaner: master:hdev01:60000.archivedHFileCleaner exiting
2014-06-20 12:52:40,483 INFO [master:hdev01:60000.oldLogCleaner] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001c closed
2014-06-20 12:52:40,484 INFO [master:hdev01:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,589 DEBUG [master:hdev01:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@f3f348b
2014-06-20 12:52:40,591 INFO [master:hdev01:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x246ba2ab1e4001b
2014-06-20 12:52:40,592 INFO [master:hdev01:60000] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001b closed
2014-06-20 12:52:40,592 INFO [master:hdev01:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,695 INFO [hdev01,60000,1403283149823.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: hdev01,60000,1403283149823.splitLogManagerTimeoutMonitor exiting
2014-06-20 12:52:40,696 INFO [master:hdev01:60000] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001a closed
2014-06-20 12:52:40,696 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,696 INFO [master:hdev01:60000] master.HMaster: HMaster main thread exiting
2014-06-20 12:52:40,697 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
I thought this might be some remnant of an old run, so I deleted the files in HBase's data directory, the ZooKeepers' data directories, and my HDFS. I still got the same error. Strangely, my HMaster popped back up again temporarily when I ran stop-hbase.sh, although there wasn't much I could do with it.
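For the record, the cleanup amounted to roughly the following (a sketch; the ZooKeeper dataDir path is a placeholder for whatever zoo.cfg points at, and the wipe has to be repeated on each quorum node):
hdfs dfs -rm -r /hbase               # wipe hbase.rootdir contents in HDFS
rm -rf /path/to/zookeeper/dataDir/*  # on hdev01, hdev02 and hdev03
hdfs namenode -format                # reformat HDFS (destroys all HDFS data)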
My HBase version is 0.98.3 and my Hadoop is 2.2.0. My hbase-site.xml is
<configuration>
<property>
<name>hbase.master</name>
<value>hdev01:60000</value>
<description>The host and port that the HBase master runs at.
A value of 'local' runs the master and a regionserver
in a single process.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hdev01:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed
Zookeeper true: fully-distributed with unmanaged Zookeeper
Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>5181</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>10000</value>
<description></description>
</property>
<property>
<name>hbase.client.retries.number</name>
<value>10</value>
<description></description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hdev01,hdev02,hdev03</value>
<description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If
HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop
ZooKeeper on.
</description>
</property>
</configuration>
EDIT
Attempted hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair; my error is now "HBase file layout needs to be upgraded. You have version null and I want version 8. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'."
Which is unhelpful, since without a master hbck will not actually run.
EDIT 2
I nuked and restarted my DFS and then tried repairing and starting things again; I am now back where I started.

hbase:namespace is the internal namespace HBase uses for its own management tables. Try to run the offline repair tool from the $HBASE_HOME directory:
./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

su - hdfs
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
(Restart the HBase master. If you are still facing the issue, do the following:)
zookeeper-client (enter)
rmr /hbase
quit
Then restart the HBase master service.
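Putting the whole sequence together, a rough sketch (the ZooKeeper client port is 5181 per the hbase-site.xml above; zookeeper-client is assumed here to accept the same -server flag as zkCli.sh; clearing /hbase in ZooKeeper discards in-flight region state):
su - hdfs
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair   # rebuild hbase:meta offline from what is on disk
stop-hbase.sh && start-hbase.sh                             # restart and see if the master stays up
# if the master still aborts with TableExistsException:
zookeeper-client -server hdev01:5181
rmr /hbase
quit
stop-hbase.sh && start-hbase.sh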

@shash:
When HBase manages ZooKeeper (i.e. HBASE_MANAGES_ZK=true), the command to access and clean HBase data is hbase zkcli. Afterwards you clean HBase using the command rmr /hbase, then you quit.
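A minimal sketch of that session (assumes HBASE_MANAGES_ZK=true so that hbase zkcli talks to the ZooKeeper instance HBase manages; rmr /hbase discards all of HBase's znodes, including region assignments):
hbase zkcli        # opens a ZooKeeper shell against HBase's quorum
rmr /hbase         # removes all of HBase's znodes
quit
Then restart HBase so the master can recreate /hbase from scratch.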

Related

jps does not show HMaster but <no information available>

I configured HBase today and it worked correctly at first. However, when I started HBase again using 'start-all.sh', I could not see 'HMaster' anywhere. jps just shows:
[root@master bin]# jps
25164 QuorumPeerMain
83447 HRegionServer
44542 NameNode
44789 DataNode
45098 SecondaryNameNode
45378 ResourceManager
45536 NodeManager
56678 <no information available>
56949 Jps
The log shows:
[root@master bin]# cd /home/hadoop/hbase-2.2.3/logs
[root@master logs]# ls
hbase-root-master-master.log hbase-root-regionserver-master.out.1
hbase-root-master-master.out hbase-root-regionserver-master.out.2
hbase-root-master-master.out.1 hbase-root-regionserver-master.out.3
hbase-root-regionserver-master.log hbase-root-regionserver-master.out.4
hbase-root-regionserver-master.out SecurityAuth.audit
[root@master logs]# tail hbase-root-master-master.log
2022-04-28 17:29:56,674 INFO [master/master:16000] zookeeper.ZooKeeper: Session: 0x100000e4a0d0020 closed
2022-04-28 17:29:56,674 INFO [master/master:16000] regionserver.HRegionServer: Exiting; stopping=master,16000,1651138191876; zookeeper connection closed.
2022-04-28 17:29:56,674 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x100000e4a0d0020
2022-04-28 17:29:56,674 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:244)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2940)
[root@master logs]#
I solved the problem by adding the following configuration to hbase-site.xml:
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
I do not know why, but it works. (As far as I understand, this property controls whether HBase enforces that the underlying filesystem supports hflush/hsync for the WAL; setting it to false skips that check.)

HBase master not starting

I am running HBase on Hadoop in standalone mode. I have successfully installed Hadoop, ZooKeeper and HBase, but the HBase master is not starting. Below is my hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/kumar/hdata/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
</configuration>
I have started the Hadoop and ZooKeeper services:
start-all.sh
zkServer.sh start
start-hbase.sh
and these are the processes the jps command shows:
2133 DataNode
1974 NameNode
2679 NodeManager
2365 SecondaryNameNode
3917 QuorumPeerMain
2527 ResourceManager
3935 Jps
The HBase shell starts successfully, but when I run any command in the shell, such as 'list', I get the error below:
ERROR: KeeperErrorCode = NoNode for /hbase/master
After that I tried to run the master with the command below:
hbase master start
and I got the error below:
2018-10-15 18:51:51,380 ERROR [main] server.ZooKeeperServer: ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes
2018-10-15 18:51:51,437 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182] server.NIOServerCnxnFactory: Accepted socket connection from /127.0.0.1:34034
2018-10-15 18:51:51,479 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182] server.ServerCnxn: The list of known four letter word commands is : [{1936881266=srvr, 1937006964=stat, 2003003491=wchc, 1685417328=dump, 1668445044=crst, 1936880500=srst, 1701738089=envi, 1668247142=conf, 2003003507=wchs, 2003003504=wchp, 1668247155=cons, 1835955314=mntr, 1769173615=isro, 1920298859=ruok, 1735683435=gtmk, 1937010027=stmk}]
2018-10-15 18:51:51,479 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182] server.ServerCnxn: The list of enabled four letter word commands is : [[wchs, stat, stmk, conf, ruok, mntr, srvr, envi, srst, isro, dump, gtmk, crst, cons]]
2018-10-15 18:51:51,479 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2182] server.NIOServerCnxn: Processing stat command from /127.0.0.1:34034
2018-10-15 18:51:51,485 INFO [Thread-2] server.NIOServerCnxn: Stat command output
2018-10-15 18:51:51,491 INFO [main] zookeeper.MiniZooKeeperCluster: Started MiniZooKeeperCluster and ran successful 'stat' on client port=2182
Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
2018-10-15 18:51:51,497 ERROR [main] master.HMasterCommandLine: Master exiting
java.io.IOException: Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:217)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2983)
2018-10-15 18:51:51,500 INFO [Thread-2] server.NIOServerCnxn: Closed socket connection for client /127.0.0.1:34034 (no session established for client)
Also, I am not getting any response from the local HBase web UI at
localhost:60010
HBase has a built-in ZooKeeper instance intended for development environments, and by default, when you start HBase with start-hbase.sh, it starts that ZooKeeper daemon too. The error occurs because you already started a standalone ZooKeeper that uses port 2181; when HBase then tries to start its built-in ZooKeeper on port 2181 as well, the port is taken, so it falls back to 2182 and aborts.
If you want to use the standalone ZooKeeper component, first edit the file hbase-env.sh and add the line export HBASE_MANAGES_ZK=false (you can also search for the HBASE_MANAGES_ZK variable in the file and set it to false). Now when you start HBase, it starts only the HBase daemons and no longer ZooKeeper. Remember that you should start the ZooKeeper daemon before HBase.
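A minimal sketch of the change and the resulting startup order (file locations assume a typical tarball install with the scripts on the PATH):
# conf/hbase-env.sh -- tell HBase not to manage ZooKeeper
export HBASE_MANAGES_ZK=false
# then start ZooKeeper first, HBase second
zkServer.sh start
start-hbase.sh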
I solved this by adding this in hbase-site.xml:
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
In my case I was working with HBase 2.2.0 in standalone mode.

ERROR in datanode execution while running Hadoop for the first time on Windows 10

I am trying to run Hadoop 3.1.1 on my Windows 10 machine. I modified all the files:
hdfs-site.xml
mapred-site.xml
core-site.xml
yarn-site.xml
Then, I executed the following command:
C:\hadoop-3.1.1\bin> hdfs namenode -format
The format ran correctly, so I went to C:\hadoop-3.1.1\sbin to execute the following command:
C:\hadoop-3.1.1\sbin> start-dfs.cmd
The command prompt opens two new windows: one for the datanode and another for the namenode.
The namenode window keeps running:
2018-09-02 21:37:06,232 INFO ipc.Server: IPC Server Responder: starting
2018-09-02 21:37:06,232 INFO ipc.Server: IPC Server listener on 9000: starting
2018-09-02 21:37:06,247 INFO namenode.NameNode: NameNode RPC up at: localhost/127.0.0.1:9000
2018-09-02 21:37:06,247 INFO namenode.FSNamesystem: Starting services required for active state
2018-09-02 21:37:06,247 INFO namenode.FSDirectory: Initializing quota with 4 thread(s)
2018-09-02 21:37:06,247 INFO namenode.FSDirectory: Quota initialization completed in 3 milliseconds
name space=1
storage space=0
storage types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
2018-09-02 21:37:06,279 INFO blockmanagement.CacheReplicationMonitor: Starting CacheReplicationMonitor with interval 30000 milliseconds
while the datanode gives the following error:
ERROR: datanode.DataNode: Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:220)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2762)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2677)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2719)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2863)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2887)
2018-09-02 21:37:04,250 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
2018-09-02 21:37:04,250 INFO datanode.DataNode: SHUTDOWN_MSG:
And then the datanode shuts down! I have tried several ways to overcome this error, but this is the first time I am installing Hadoop on Windows and I can't understand what to do next!
I got things working after I removed the filesystem reference for the datanode in hdfs-site.xml. I found that this enabled the software to create and initialise its own datanode directory, which then popped up in sbin. After that I could use HDFS without a hitch. Here is what worked for me for Hadoop 3.1.3 on Windows:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///C:/Users/myusername/hadoop/hadoop-3.1.3/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>datanode</value>
</property>
</configuration>
I had the same problem and what worked for me was editing hdfs-site.xml as follows:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///C:/Hadoop/hadoop-3.1.2/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/C:/Hadoop/hadoop-3.1.2/data/datanode</value>
</property>
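Note that if you change these directories after an earlier format, you will typically need to re-format the namenode before the datanode will start, since the new directories are uninitialised (this wipes HDFS metadata; the paths below follow this answer's layout):
C:\Hadoop\hadoop-3.1.2\bin> hdfs namenode -format
C:\Hadoop\hadoop-3.1.2\sbin> start-dfs.cmd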

Run HDFS in pseudo-distributed mode in a Docker container

I'm trying to run HDFS in pseudo-distributed mode in a Docker container, configured per this page: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation. I didn't use the start-all.sh script, as the container isn't supposed to be able to do SSH, so I manually ran bin/hdfs --daemon start namenode and bin/hdfs --daemon start datanode to start them one by one. The problem is that I can see the namenode started successfully, but the datanode quit without any error message. The last piece of the datanode log is:
...
2018-04-09 21:04:03,830 INFO org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker: Scheduling a check for [DISK]file:/apps/hadoop/hdfs/data
2018-04-09 21:04:04,188 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-04-09 21:04:04,296 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2018-04-09 21:04:04,296 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2018-04-09 21:04:04,665 INFO org.apache.hadoop.hdfs.server.common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2018-04-09 21:04:04,667 INFO org.apache.hadoop.hdfs.server.datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
2018-04-09 21:04:04,671 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is hdfs
2018-04-09 21:04:04,671 INFO org.apache.hadoop.hdfs.server.common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2018-04-09 21:04:04,677 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting DataNode with maxLockedMemory = 0
2018-04-09 21:04:04,733 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /0.0.0.0:9866
2018-04-09 21:04:04,735 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwidth is 10485760 bytes/s
2018-04-09 21:04:04,735 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Number threads for balancing is 50
core-site.xml file:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost</value>
</property>
</configuration>
And hdfs-site.xml is
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/apps/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/apps/hadoop/hdfs/data</value>
</property>
</configuration>
Did I miss anything there?
I think it is a base image issue. I was using Alpine; once I changed to CentOS, the datanode works! Something must be missing from Alpine (possibly that Alpine ships musl libc and BusyBox sh, while Hadoop's scripts and native libraries expect glibc and bash). I would appreciate it if anyone knows what it is, as a CentOS-based image will eventually be much bigger than an Alpine one.
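When a daemon dies silently like this, the .out file often holds the launcher or JVM error that never reaches the .log; a quick sketch (file names assumed to follow the default hadoop-<user>-datanode-<hostname> pattern):
bin/hdfs --daemon start datanode
echo $?                                   # a non-zero exit status hints at an early failure
tail -n 50 logs/hadoop-*-datanode-*.out   # launcher/JVM errors land in .out, not .log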

HBase master not starting correctly

I'm using Hadoop 2.4.0 / HBase 0.98.0 / Hive 0.14.0
Hadoop and HBase were running fine until I restarted my HMaster. The following error appears in the hbase-hduser-master-master.log file:
2015-02-17 05:46:15,157 INFO [master:master:60000] master.TableNamespaceManager: Namespace table not found. Creating...
2015-02-17 05:46:15,193 DEBUG [master:master:60000] lock.ZKInterProcessLockBase: Acquired a lock for /hbase/table-lock/hbase:namespace/write-master:600000000000004
2015-02-17 05:46:15,212 DEBUG [master:master:60000] lock.ZKInterProcessLockBase: Released /hbase/table-lock/hbase:namespace/write-master:600000000000004
2015-02-17 05:46:15,212 FATAL [master:master:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2015-02-17 05:46:15,213 FATAL [master:master:60000] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:120)
at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1049)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:913)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:606)
at java.lang.Thread.run(Unknown Source)
2015-02-17 05:46:15,214 INFO [master:master:60000] master.HMaster: Aborting
2015-02-17 05:46:15,214 INFO [master,60000,1424180766819-BalancerChore] balancer.BalancerChore: master,60000,1424180766819-BalancerChore exiting
2015-02-17 05:46:15,215 INFO [master,60000,1424180766819-ClusterStatusChore] balancer.ClusterStatusChore: master,60000,1424180766819-ClusterStatusChore exiting
2015-02-17 05:46:15,215 INFO [CatalogJanitor-master:60000] master.CatalogJanitor: CatalogJanitor-master:60000 exiting
2015-02-17 05:46:15,216 DEBUG [master:master:60000] master.HMaster: Stopping service threads
2015-02-17 05:46:15,216 INFO [master:master:60000] ipc.RpcServer: Stopping server on 60000
2015-02-17 05:46:15,216 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2015-02-17 05:46:15,218 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2015-02-17 05:46:15,218 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2015-02-17 05:46:15,218 INFO [master:master:60000.oldLogCleaner] cleaner.LogCleaner: master:master:60000.oldLogCleaner exiting
2015-02-17 05:46:15,218 INFO [master:master:60000.oldLogCleaner] master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x14b97c83f580008, quorum=slave:2181,master:2181, baseZNode=/hbase
2015-02-17 05:46:15,219 INFO [master:master:60000.archivedHFileCleaner] cleaner.HFileCleaner: master:master:60000.archivedHFileCleaner exiting
2015-02-17 05:46:15,219 INFO [master:master:60000] master.HMaster: Stopping infoServer
2015-02-17 05:46:15,223 INFO [master:master:60000.oldLogCleaner] zookeeper.ZooKeeper: Session: 0x14b97c83f580008 closed
2015-02-17 05:46:15,223 INFO [master:master:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-02-17 05:46:15,229 INFO [master:master:60000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60010
2015-02-17 05:46:15,236 DEBUG [master:master:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@19f9598
2015-02-17 05:46:15,236 INFO [master:master:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x14b97c83f580007
2015-02-17 05:46:15,237 INFO [master:master:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-02-17 05:46:15,238 INFO [master:master:60000] zookeeper.ZooKeeper: Session: 0x14b97c83f580007 closed
2015-02-17 05:46:15,238 INFO [master,60000,1424180766819.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: master,60000,1424180766819.splitLogManagerTimeoutMonitor exiting
2015-02-17 05:46:15,243 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-02-17 05:46:15,243 INFO [master:master:60000] zookeeper.ZooKeeper: Session: 0x14b97c83f580006 closed
2015-02-17 05:46:15,243 INFO [master:master:60000] master.HMaster: HMaster main thread exiting
2015-02-17 05:46:15,243 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:192)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2785)
What's wrong here, and what does "HMaster Aborted" mean?
For more information, this is what my hbase-site.xml looks like:
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/hbase/zookeeper</value>
</property>
I ran into this problem today! My solution is as follows:
Step 1: Stop HBase.
Step 2: Run the following command:
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
This command repairs HBase's metadata.
Step 3: Delete the data in ZooKeeper (WARNING: this will lose your old data):
./opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/zookeeper/bin/zkCli.sh
You can use ls / to inspect the data in ZooKeeper.
Use rmr /hbase to delete HBase's data in ZooKeeper.
Step 4: Start HBase.
This is based on the other answer, but to clarify for an upgrade on Cloudera 5.4:
Step 1:
service hbase-regionserver stop
service hbase-master stop
Step 2:
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
Step 3: Delete the data in ZooKeeper (WARNING: this will lose your old data):
cd /usr/lib/zookeeper/bin/
./zkCli.sh
This opens up the ZooKeeper shell.
Then run:
ls /
rmr /hbase
Step 4: Start HBase:
service hbase-master restart
service hbase-regionserver restart
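A quick way to verify the master came back after Step 4 (a sketch; run jps as the user that owns the HBase processes):
jps | grep HMaster             # the master process should be listed
echo "status" | hbase shell    # should report live region servers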
