HBase: Difference between Regionserver and QuorumPeer - hadoop

I am new to HBase and have a simple question: what is the difference between a regionserver and a quorumpeer? The list of regionservers goes in the regionservers file, and the quorumpeers are configured in hbase-site.xml. I assume the regions of an HBase table can only be stored on region servers, but I have no idea what a quorumpeer does. Should every node of an HBase cluster be a regionserver and a quorumpeer at the same time? If you know, please explain. Thanks!

HBase needs ZooKeeper to work: the RegionServers and the HMaster use it to coordinate with each other. A quorumpeer is simply one of the ZooKeeper server processes, whereas a regionserver serves the regions of your tables. See http://hbase.apache.org/book/zookeeper.html
You need a quorum of ZooKeeper servers running (generally 3 or 5).
List the nodes where the ZooKeeper servers run in the hbase.zookeeper.quorum property in hbase-site.xml.
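To make that setting concrete, here is a minimal sketch in Python that parses an hbase-site.xml fragment and reads back the hosts listed in hbase.zookeeper.quorum. The hostnames and the helper name zookeeper_quorum are made up for illustration; only the property name comes from the answer above.

```python
import xml.etree.ElementTree as ET

# Illustrative hbase-site.xml content; the hostnames are placeholders.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
</configuration>
"""


def zookeeper_quorum(hbase_site_xml):
    """Return the hosts listed in hbase.zookeeper.quorum, or [] if unset."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hbase.zookeeper.quorum":
            value = prop.findtext("value") or ""
            # The value is a comma-separated list of hostnames.
            return [host.strip() for host in value.split(",") if host.strip()]
    return []


if __name__ == "__main__":
    print(zookeeper_quorum(SAMPLE_HBASE_SITE))
```

The hosts returned here are the quorumpeers. They do not have to be the same machines as the regionservers.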

Related

understanding how hbase uses hdfs

I'm trying to understand how HBase uses HDFS.
Here is what I understand (please correct me if I'm wrong):
I know that HBase uses HDFS to store data, that the data is split into regions, and that each region server may serve many regions. So I guess that one region server (exclusively) may communicate with many data nodes to get and put data. If that is correct, then if that region server fails, the data stored on those data nodes will no longer be accessible.
Thank you in advance :)
In general, a Regionserver runs on a datanode.
Due to how HDFS works, the Regionserver will perform its reads and writes to the local datanode when possible, and then HDFS will ensure that the data is replicated onto two other random datanodes. So at all times, the data written by that regionserver is stored on 3 nodes in HDFS.
While a regionserver is serving a region, only it will read and write the data for that region. If the regionserver process crashes, the HBase master will select another regionserver to serve that region. The data will be unavailable for a few minutes, but HBase will recover quickly.
If the entire host fails, the scenario is the same because HDFS ensured the data was written onto two other nodes: the master will select a new regionserver to open the failed region, and the data will not be lost.
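The failover described above can be sketched as a toy model. This is only an illustration: the function names are invented, they are not HBase or HDFS APIs, and real replica placement and region recovery are considerably more involved.

```python
import random


def place_replicas(local_node, datanodes, replication=3, rng=random):
    """Pick the local datanode first, then other distinct datanodes."""
    others = [node for node in datanodes if node != local_node]
    return [local_node] + rng.sample(others, replication - 1)


def reassign_region(failed_server, live_servers, rng=random):
    """Toy stand-in for the master choosing a new server for a region."""
    candidates = [s for s in live_servers if s != failed_server]
    if not candidates:
        raise RuntimeError("no live region server available")
    return rng.choice(candidates)


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4", "node5"]
    replicas = place_replicas("node1", nodes)
    # node1 dies: its region moves, and two copies of the data remain.
    new_server = reassign_region("node1", nodes)
    surviving = [n for n in replicas if n != "node1"]
    print("replicas:", replicas)
    print("region now served by:", new_server)
    print("copies still readable:", surviving)
```

Losing node1 removes one copy and one server, but two copies remain on other nodes and another regionserver takes over the region.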

How to get node names in my ZooKeeper?

I am using HBase, for which I installed ZooKeeper. Is there a config file that stores the znode names, or a command that prints the znode names?
Looking at your configuration, there seems to be only one node in the quorum (not technically a quorum), and it's the local machine. A quorum that can survive a node failure needs at least three nodes.
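The arithmetic behind that advice can be sketched as follows. ZooKeeper needs a strict majority of its servers to be up, so the helpers below (hypothetical names, not part of any ZooKeeper API) compute the majority size and how many server failures an ensemble of a given size can survive.

```python
def majority(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still available."""
    return ensemble_size - majority(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, majority(n), tolerated_failures(n))
```

A single server tolerates no failures, and three is the smallest ensemble that survives the loss of one server. Going from three to four servers does not raise the number of tolerated failures, which is why odd sizes are usually recommended.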
As Cricket_007 commented, you should see the ZooKeeper quorum info in hbase-site.xml. It looks like you also want to know what HBase registers and writes in ZooKeeper. For the full list and details, run zk_dump in the HBase shell.
More info on the command:
https://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/
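If you would rather list the znodes from a script, here is a minimal sketch. It assumes the third-party kazoo client library, a ZooKeeper server on localhost:2181, and the default /hbase parent znode; adjust these for your setup. The walk function only needs an object with a get_children(path) method, which is what kazoo's KazooClient provides.

```python
def walk_znodes(client, path="/"):
    """Yield every znode path under `path`, depth first.

    `client` is any object with a get_children(path) method that returns
    child names, such as kazoo's KazooClient.
    """
    yield path
    for child in sorted(client.get_children(path)):
        child_path = path.rstrip("/") + "/" + child
        yield from walk_znodes(client, child_path)


def print_hbase_znodes(hosts="localhost:2181", root="/hbase"):
    """Connect with kazoo and print every znode under `root`."""
    # kazoo is a third-party ZooKeeper client (pip install kazoo).
    from kazoo.client import KazooClient

    client = KazooClient(hosts=hosts)
    client.start()
    try:
        for znode in walk_znodes(client, root):
            print(znode)
    finally:
        client.stop()
```

Calling print_hbase_znodes() against a running cluster prints one znode path per line.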

Hadoop Nodes and Roles

I have a Hadoop cluster at work with over 50 nodes. We occasionally face disk failures and need to decommission the datanode roles.
My question is: if I were to decommission only the datanode and leave the tasktracker running, would this result in failed tasks/jobs on this node because the HDFS service is unavailable on that node?
1. Does the TaskTracker on Node1 sit idle since there is no DataNode service on that node?
Correct. If the data node is disabled, the task tracker will not be able to process the data because the data will not be available; it will sit idle.
2. Or does the TaskTracker work on data from DataNodes on other nodes?
No. Due to the data locality principle, the task tracker will not process data from other nodes.
3. Do we get errors from the TaskTracker service on Node1 because the DN on its node is down?
The task tracker will not be able to process any data, so there will be no errors.
4. If I have services like Hive, Impala, etc. running on HDFS, would those services throw errors upon contact with the TaskTracker on Node1?
They will not be able to contact the task tracker on Node1. When a client requests processing of the data, the name node tells the client where the data is located, and all other applications communicate with the data nodes based on those locations.
I would expect any task that tries to read from HDFS on the "dead" node to fail. This should result in the node being blacklisted by M/R after N failures (default is 3 I think). Also, I believe this happens each time a job runs.
However, jobs should still finish since the tasks that got routed to the bad node will simply be retried on other nodes.
First, to run a job you need an input file. When you load the input file into HDFS, it is split into blocks of 64 MB by default, and each block is replicated 3 times with default settings. Since one of the data nodes in your cluster has failed, the name node will not store data on that node. The name node receives frequent status updates from each data node, so it will not choose that specific data node to store the data.
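As a quick worked example of the block arithmetic, the sketch below uses the 64 MB block size and replication factor of 3 quoted above as defaults. The function names are hypothetical, and your cluster may be configured with different values.

```python
import math

BLOCK_SIZE_MB = 64  # default block size quoted in the answer above
REPLICATION = 3     # default replication factor quoted in the answer above


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of HDFS blocks needed for a file of the given size."""
    return math.ceil(file_size_mb / block_size_mb)


def stored_block_copies(file_size_mb, block_size_mb=BLOCK_SIZE_MB,
                        replication=REPLICATION):
    """Total block copies the cluster stores for the file."""
    return block_count(file_size_mb, block_size_mb) * replication


if __name__ == "__main__":
    # A 200 MB file needs 4 blocks, stored as 12 block copies in total.
    print(block_count(200), stored_block_copies(200))
```

Because every block has copies on several nodes, losing one data node still leaves other copies to read from.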
It should throw an exception when you run out of disk space and the dead data node is the only one left in the cluster. Then it's time to replace the data node and scale up the cluster.
Hope this helps.

Why does NameNode HA use JournalNode instead of Zookeeper itself?

Looking at the following HDFS documentation makes me wonder why Hadoop uses a JournalNode quorum instead of ZooKeeper itself to keep the fsImage and EditLogs synchronized:
http://hortonworks.com/blog/namenode-high-availability-in-hdp-2-0/
In other words, what's the problem with using the ZooKeeper service instead of a bunch of JournalNodes for active-standby NameNode communication?
One reason could be that ZooKeeper is not meant for storage. If I remember correctly, the default maximum size of data that can be stored in a znode is 1 MB.

Running pig on a multi node Cassandra cluster

I am working on a BI process that will read data from Cassandra, create summaries using MapReduce, and write back to a different keyspace.
Starting with a single node, everything worked as I expected, but after moving to multiple nodes I am not sure I fully understand the topology and configuration.
I have a setup with 3 nodes. Each has a Cassandra node (version 1.1.9), a data node and a task tracker (version 0.20.2+923.421- CDH3U5). The NameNode and job tracker are on a different server. At this point I am trying to run a Pig script from the DataNode server.
The thing I am not sure of is the Pig argument PIG_INITIAL_ADDRESS. I assumed the query would run on all Cassandra nodes, each task tracker would only query the local Cassandra node, and the reducer would handle any duplicates. Based on that assumption I thought PIG_INITIAL_ADDRESS should be localhost. But when running the Pig script it fails:
java.io.IOException: Unable to connect to server localhost:9160
My questions are: should the initial address be any one of the Cassandra nodes, and is the map split across the cluster based on Cassandra key partitions (will I get the distribution I need)?
If I were to use Java MapReduce, would I still need to supply the initial address?
Does the current implementation assume Pig is running from a Cassandra node?
The PIG_INITIAL_ADDRESS is the address of one of the Cassandra nodes in your ring. In order to have the Hadoop job read data from or write data to Cassandra, it just needs to have some properties set. Those properties are also available to set in the job properties or in the default Hadoop configuration on the server that you're running the job from. Other than that, it's just like submitting a job to a job tracker.
For more information, I would look at the readme that's in the cassandra source download under examples/pig. There is a lot of explanation in there as well.
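The "Unable to connect to server localhost:9160" error in the question suggests that nothing was accepting connections at that address from where the script ran. As a rough pre-flight check before setting PIG_INITIAL_ADDRESS, a sketch like the one below tests whether a candidate address accepts TCP connections on that port. The helper name is hypothetical, and a successful connection only shows the port is open, not that the job is configured correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For example, can_connect("localhost", 9160) reproduces the check for the address used in the question. Replace the host with the address of one of your Cassandra nodes to test a candidate for PIG_INITIAL_ADDRESS.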
