Why doesn't HBase use all nodes in the cluster? - hadoop

I'm setting up an HBase cluster and ran into a problem: when I write data to the cluster, some nodes remain empty.
HBase status screen:
DFS health screen:
HBase 1.4.10, Hadoop 3.1.2
node-master hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://node-master:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>hdfs://node-master:9000/zookeeper</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>node-master</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.client.write.buffer</name>
<value>8388608</value>
</property>
<property>
<name>hbase.client.scanner.caching</name>
<value>10000</value>
</property>
</configuration>
node-master regionservers file (the Hadoop workers file is the same):
node1
node2
node3
node4
node5
node6

HBase writes data into regions, where each region holds a contiguous range of rows sorted by row key. When a region reaches its size limit, HBase splits it into two regions, and each of those is split again when it reaches the limit. Each region is assigned to one region server (DataNode).
Your table does not yet have enough regions to use all the nodes. To balance the data across nodes, you can pre-split the table when you create it.
HBase Table Pre-split
Reading about the HBase hotspotting problem will also give you a better understanding of this behavior.
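As a rough illustration of pre-splitting, the sketch below computes evenly spaced split keys and prints an hbase shell `create` statement that uses them. It assumes row keys are fixed-width lowercase hex strings that are roughly uniformly distributed (for example, hashed keys); the function name `hex_split_points` and the names `my_table` and `cf` are placeholders, not anything from the question.

```python
def hex_split_points(num_regions, key_width=8):
    """Return num_regions - 1 evenly spaced split keys over a hex keyspace.

    Assumes row keys are lowercase hex strings of length key_width.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    keyspace = 16 ** key_width
    return [
        format(i * keyspace // num_regions, f"0{key_width}x")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Six region servers (node1..node6 in the question), one region each.
    splits = hex_split_points(6)
    quoted = ", ".join(f"'{s}'" for s in splits)
    # 'my_table' and 'cf' are placeholder names for this sketch.
    print(f"create 'my_table', 'cf', SPLITS => [{quoted}]")
```

If the real row keys are not uniformly distributed (for example, sequential IDs or timestamps), evenly spaced split points will not spread the load, which is where the hotspotting discussion becomes relevant.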

Related

Name node fails after restarting the hadoop HA cluster nodes after power off

I have set up an HA Hadoop cluster with two NameNodes and JournalNodes, with automatic failover control. It starts fine right after formatting the NameNode, but it fails when I restart the cluster. I also tried bringing up the cluster in this order:
1. start all journal nodes
2. start the active name node
3. bootstrap the standby node, then start its name node
4. start zkserver on all nodes
5. start all data nodes
6. format zkfc on the active node, then start it
7. format zkfc on the standby node, then start it
It works fine through step 5, and all nodes are up (both name nodes are up and in standby). When I start zkfc, the name node fails with an error saying the journal node is not formatted.
(Before this, I had started the setup successfully by formatting the active name node. The second time, I left out the name node format in step 2.)
How do I start the setup after a shutdown and restart?
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop/data/nameNode</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop/data/dataNode</value>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>ha_cluster</value>
</property>
<property>
<name>dfs.ha.namenodes.ha_cluster</name>
<value>sajan,sajan2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ha_cluster.sajan</name>
<value>192.168.5.249:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ha_cluster.sajan2</name>
<value>192.168.5.248:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ha_cluster.sajan</name>
<value>192.168.5.249:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.ha_cluster.sajan2</name>
<value>192.168.5.248:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://192.168.5.249:8485;192.168.5.248:8485;192.168.5.250:8485/ha_cluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ha_cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>192.168.5.249:2181,192.168.5.248:2181,192.168.5.250:2181,192.168.5.251:2181,192.168.5.252:2181,192.168.5.253:2181</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
</configuration>
If you want to stop the services, use the order below. I lost two working days figuring this out.
stop all name nodes.
stop all journal nodes.
stop all data nodes.
stop the failover service.
stop zkserver.

slaves not being shown on web browser in hbase multi node

I have installed Hadoop (1.2.1) in multi-node mode on 1 master and 2 slaves. Now I am trying to install HBase on top of it. The problem is that when I start HBase on the master, the web UI shows only one regionserver (the master itself); the slaves are not shown. On the terminal, each slave has its own regionserver process, but that is not reflected in the browser. Can anyone tell me what the problem is?
I had the same problem and solved it by adding the port number to hbase.rootdir.
Your hbase-site.xml should look like this:
<property>
<name>hbase.zookeeper.quorum</name>
<value>master-IP-Address,slave1-IP,slave2-IP</value>
<description>Comma-separated list of servers in the ZooKeeper quorum.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/ravi/zookeeper-data</value>
<description>Property from ZooKeeper config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master-IP:50000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>

HBase: cluster starts fine in distributed mode, but I can't create a table from the hbase shell

I'm very new to HBase clusters. I set up HBase in distributed mode and it starts fine, but when I run the hbase shell I can't create a table; this error is shown:
My hbase-site.xml configuration is:
<property>
<name>hbase.master</name>
<value>master:60000</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-namnode:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>usr/local/hbase/temp</value>
</property>
Could you please help me? Thanks in advance.
The HBase version must be compatible with the Hadoop version. Downgrade HBase and it will work fine.

HBase UI doesn't show any region servers

I run HBase in distributed mode. HBase starts the region server Java processes on all nodes, but the web UI doesn't show them.
http://s1.ipicture.ru/uploads/20120517/16DXTnsU.png
here is hbase-site.xml
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>10.3.6.44</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/hdfs/zookeeper</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://10.3.6.44:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
</configuration>
By the way, the Hadoop cluster is running normally and sees all the DataNodes.
Thanks very much for your help.
The problem was with DNS and the hosts file.
Add this property to your hbase-site.xml file and see if it works for you:
name - hbase.zookeeper.property.clientPort
value - 2181

Hbase regionservers

We have installed a Hadoop cluster and want to use HBase on top of it. My hbase-site.xml is below:
<property>
<name>hbase.rootdir</name>
<value>hdfs://ali:54310/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ali,reg_server1</value>
<description>Comma-separated list of servers in the ZooKeeper quorum.
</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>
</description>
</property>
I have 2 region servers, ali and reg_server1. When I open the page at http://ali:60010, I see that reg_server1 has 0 regions, while ali has n > 0 regions. I put some data into HBase, but reg_server1 still has 0 regions. Does this mean that this node is not participating in the cluster? How can I resolve it?
Thanks
No, you are OK as long as you see both regionservers in the master's web UI. When you write to an HBase table, the data goes to one region (a region is always hosted on exactly one regionserver; in your case, ali). Once you write enough data for the region to exceed the configured max file size, the region will be split and the resulting regions distributed across the two regionservers.
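The behavior described above can be sketched with a toy model. This is not how HBase is implemented; it only illustrates why a fresh table has a single region and how splits add regions over time. The class name `ToyTable` is hypothetical, row keys are assumed to be distinct, and a row count per region stands in for the configured max file size (`hbase.hregion.max.filesize`).

```python
import bisect


class ToyTable:
    """Toy model of one table: sorted region start keys plus rows per region."""

    def __init__(self, max_rows_per_region):
        # Row count stands in for the region size limit (an assumption of this sketch).
        self.max_rows = max_rows_per_region
        self.start_keys = [""]   # a new table has a single region covering every key
        self.regions = [[]]      # sorted row keys held by each region

    def region_index(self, row_key):
        # The owning region is the last one whose start key is <= row_key.
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def put(self, row_key):
        idx = self.region_index(row_key)
        rows = self.regions[idx]
        bisect.insort(rows, row_key)
        if len(rows) > self.max_rows:
            # Split at the middle key: the upper half becomes a new region.
            mid = len(rows) // 2
            self.start_keys.insert(idx + 1, rows[mid])
            self.regions[idx:idx + 1] = [rows[:mid], rows[mid:]]


if __name__ == "__main__":
    table = ToyTable(max_rows_per_region=4)
    for i in range(12):
        table.put(f"row{i:03d}")
        print(f"after {i + 1:2d} puts: {len(table.start_keys)} region(s)")
```

Until the first split happens there is only one region, so only one regionserver can hold data for the table, which matches reg_server1 showing 0 regions.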
