Issues while connecting to HBase remotely - hadoop

This is the output of running the Java program remotely.
> I'm facing the error "Opening socket connection to server ... Will not attempt to authenticate using SASL (unknown error)", and only when connecting remotely.
15/10/01 17:00:21 INFO zookeeper.ClientCnxn: Opening socket connection to server quickstart.cloudera/192.168.0.106:2181. Will not attempt to authenticate using SASL (unknown error)
15/10/01 17:00:21 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.0.105:63654, server: quickstart.cloudera/192.168.0.106:2181
15/10/01 17:00:21 INFO zookeeper.ClientCnxn: Session establishment complete on server quickstart.cloudera/192.168.0.106:2181, sessionid = 0x150220e67060034, negotiated timeout = 60000
15/10/01 17:00:21 WARN util.DynamicClassLoader: Failed to identify the fs of dir hdfs://quickstart.cloudera:8020/hbase/lib, ignored
java.io.IOException: No FileSystem for scheme: hdfs
This is the output of running the Java program locally.
15/10/01 13:22:36 INFO zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
15/10/01 13:22:36 INFO client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
15/10/01 13:22:36 INFO zookeeper.ClientCnxn: Socket connection established to quickstart.cloudera/127.0.0.1:2181, initiating session
15/10/01 13:22:36 INFO zookeeper.ClientCnxn: Session establishment complete on server quickstart.cloudera/127.0.0.1:2181, sessionid = 0x150220e67060028, negotiated timeout = 60000
15/10/01 13:22:36 WARN util.DynamicClassLoader: Failed to identify the fs of dir hdfs://quickstart.cloudera:8020/hbase/lib, ignored
java.io.IOException: No FileSystem for scheme: hdfs
My questions are:
Am I unable to create a new table in HBase remotely?
Does the output above point to what is causing the problem?

Yes, you can create new tables and perform any other operation on
HBase remotely.
It looks like your error means that the Hadoop or HBase libraries are not added to your project properly. As a result, your program cannot find the classes of the HDFS file system.
Do you add jars to your project directly, or do you use Maven/Gradle? What dependencies does your project have in addition to HBase?
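A common variant of this problem appears even when hadoop-hdfs is on the classpath: building a fat/shaded jar can drop the META-INF/services entry that maps the hdfs scheme to its implementation class. A minimal sketch of the usual workaround, assuming the hadoop-hdfs artifact is available (the property name is standard Hadoop; the surrounding code is illustrative):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
// Explicitly register the HDFS implementation so that
// "java.io.IOException: No FileSystem for scheme: hdfs" goes away
// when the service-loader metadata was lost during jar merging.
conf.set("fs.hdfs.impl",
    org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());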

Related

ambari on HDP cluster + ambari-metrics-collector service does not start

We have an issue with the ambari-metrics-collector service (we have an HDP cluster, version 2.6.4, with 8 nodes).
The ambari-metrics-collector service can't start, or starts for a few seconds and then fails.
Details about the metrics collector version:
rpm -qa | grep metrics
ambari-metrics-grafana-2.6.1.0-143.x86_64
ambari-metrics-monitor-2.6.1.0-143.x86_64
ambari-metrics-collector-2.5.0.3-7.x86_64
ambari-metrics-hadoop-sink-2.6.1.0-143.x86_64
All machines are RHEL 7.2.
We performed the following steps in order to resolve the problem:
1. Restart the metrics-collector service:
su - ams -c '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ stop'
su - ams -c '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf/ start'
or
ambari-metrics-collector stop
ambari-metrics-collector start
2. Restart ambari-metrics-monitor on all nodes:
ambari-metrics-monitor stop
ambari-metrics-monitor start
3. Clean the folder /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/:
mv /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/zookeeper_0 /tmp/bck/zookeeper/
Then restart the metrics-collector service.
4. Tune the metrics-collector parameters according to https://docs.cloudera.com/HDPDocuments/Ambari-2.2.1.0/bk_ambari_reference_guide/content/_ams_general_guidelines.html
We updated the following parameters in Ambari:
metrics_collector_heap_size=1024
hbase_regionserver_heapsize=1024
hbase_master_heapsize=512
hbase_master_xmn_size=128
Status for now: steps 1-4 didn't help.
From the logs we can see the following:
Log file: ambari-metrics-collector.log
2020-06-25 09:06:14,474 WARN org.apache.zookeeper.ClientCnxn: Session 0x172eab71f310002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
2020-06-25 09:06:14,575 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=master02.sys671.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure/meta-region-server
Log file: hbase-ams-master-master02.sys671.com.log
2020-06-25 09:38:18,799 WARN [RS:0;master02:51842-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Session 0x172ead5d73a0004 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2020-06-25 09:38:20,437 INFO [main-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server master02.sys671.com/23.2.35.171:61181. Will not attempt to authenticate using SASL (unknown error)
2020-06-25 09:38:20,438 WARN [main-SendThread(master02.sys671.com:61181)] zookeeper.ClientCnxn: Session 0x172ead5d73a0002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
We also see that the port (timeline.metrics.service.webapp.address) is not listening:
netstat -tulpn | grep 6188
Any advice on how to continue from this point? We would appreciate any help with this problem.

Squirrel Client Connecting to Phoenix - Timeout Exception

I am trying to connect to Phoenix via the SQuirreL client. I am receiving the following logs in the SQuirreL logs. The logs suggest that the client connection to ZooKeeper is established; however, it fails with a timeout exception when the SQL client connection is being established.
I have copied the Phoenix client JAR into the lib directory of SQuirreL, and the driver is registered successfully. Also, when I run the sqlline.py utility on the localhost, it loads the Phoenix SQL command line successfully and I can run commands. I have added the Phoenix core JARs to the $HBASE_HOME/lib folders as well.
2015-06-15 12:48:53,766 [pool-7-thread-1] INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper - Process identifier=hconnection-0x776a1002 connecting to ZooKeeper ensemble=10.58.126.245:2181
2015-06-15 12:48:53,766 [pool-7-thread-1] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=10.58.126.245:2181 sessionTimeout=90000 watcher=hconnection-0x776a10020x0, quorum=10.58.126.245:2181, baseZNode=/hbase
2015-06-15 12:48:58,287 [pool-7-thread-1-SendThread(10.58.126.245:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.58.126.245/10.58.126.245:2181. Will not attempt to authenticate using SASL (unknown error)
2015-06-15 12:48:58,301 [pool-7-thread-1-SendThread(10.58.126.245:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.58.126.245/10.58.126.245:2181, initiating session
2015-06-15 12:48:58,314 [pool-7-thread-1-SendThread(10.58.126.245:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 10.58.126.245/10.58.126.245:2181, sessionid = 0x14df5b87b120040, negotiated timeout = 90000
2015-06-15 12:49:58,100 [pool-7-thread-1] INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=10, retries=35, started=59774 ms ago, cancelled=false, msg=
2015-06-15 12:50:20,456 [pool-7-thread-1] INFO org.apache.hadoop.hbase.client.RpcRetryingCaller - Call exception, tries=11, retries=35, started=82130 ms ago, cancelled=false, msg=
2015-06-15 12:50:36,114 [AWT-EventQueue-1] ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack - Unexpected Error occurred attempting to open an SQL connection.
java.util.concurrent.TimeoutException
at java.util.concurrent.FutureTask.get(FutureTask.java:201)
at net.sourceforge.squirrel_sql.client.mainframe.action.OpenConnectionCommand.awaitConnection(OpenConnectionCommand.java:132)
at net.sourceforge.squirrel_sql.client.mainframe.action.OpenConnectionCommand.access$100(OpenConnectionCommand.java:45)
at net.sourceforge.squirrel_sql.client.mainframe.action.OpenConnectionCommand$2.run(OpenConnectionCommand.java:115)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
I have the same problem and haven't found a solution yet, but I managed to use the "thin" client instead:
Start the query server (https://phoenix.apache.org/server.html); it should listen on port 8765.
Copy the phoenix-4.6.0-HBase-1.1-thin-client JAR to the SQuirreL lib folder.
Create a new driver; the class name is "org.apache.phoenix.queryserver.client.Driver".
Connect with this driver (my URI: jdbc:phoenix:thin:url=http://docker:8765)
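For reference, the same thin-client connection can be exercised outside SQuirreL with a few lines of JDBC. A minimal sketch, assuming the thin-client JAR is on the classpath (the host docker comes from the URI above; the SYSTEM.CATALOG query is just a connectivity probe):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixThinClientCheck {
    public static void main(String[] args) throws Exception {
        // Same URI as the SQuirreL alias, pointing at the Phoenix Query Server.
        String url = "jdbc:phoenix:thin:url=http://docker:8765";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 1")) {
            while (rs.next()) {
                System.out.println(rs.getString(1)); // any row proves the path works
            }
        }
    }
}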

HBase managed ZooKeeper suddenly trying to connect to localhost instead of ZooKeeper quorum

I was running some tests with table mappers and reducers on large-scale problems. After a certain point, my reducers started failing when the job was 80% done. From what I can tell from the syslogs, the problem is that one of my ZooKeeper connections is attempting to connect to localhost as opposed to the other ZooKeepers in the quorum.
Oddly, it seems to do just fine connecting to the other nodes while mapping is going on; it's the reduce phase that it has a problem with. Here are selected portions of the syslog which might be relevant to figuring out what's going on:
2014-06-27 09:44:01,599 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hdev02:5181,hdev01:5181,hdev03:5181 sessionTimeout=10000 watcher=hconnection-0x4aee260b, quorum=hdev02:5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-27 09:44:01,612 INFO [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4aee260b connecting to ZooKeeper ensemble=hdev02:5181,hdev01:5181,hdev03:5181
2014-06-27 09:44:01,614 INFO [main-SendThread(hdev02:5181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server hdev02/172.17.43.36:5181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-06-27 09:44:01,615 INFO [main-SendThread(hdev02:5181)] org.apache.zookeeper.ClientCnxn: Socket connection established to hdev02/172.17.43.36:5181, initiating session
2014-06-27 09:44:01,617 INFO [main-SendThread(hdev02:5181)] org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2014-06-27 09:44:01,723 WARN [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=hdev02:5181,hdev01:5181,hdev03:5181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
2014-06-27 09:44:01,723 INFO [main] org.apache.hadoop.hbase.util.RetryCounter: Sleeping
***
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 1 on-disk map-outputs
2014-06-27 09:55:12,012 INFO [main] org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
2014-06-27 09:55:12,013 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 33206049 bytes
2014-06-27 09:55:12,208 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merged 1 segments, 33206079 bytes to disk to satisfy reduce memory limit
2014-06-27 09:55:12,209 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 2 files, 265119413 bytes from disk
2014-06-27 09:55:12,209 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
2014-06-27 09:55:12,210 INFO [main] org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2014-06-27 09:55:12,212 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 265119345 bytes
2014-06-27 09:55:12,279 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x65afdbbb, quorum=localhost:2181, baseZNode=/hbase
2014-06-27 09:55:12,281 INFO [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x65afdbbb connecting to ZooKeeper ensemble=localhost:2181
2014-06-27 09:55:12,282 INFO [main-SendThread(localhost.localdomain:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-06-27 09:55:12,283 WARN [main-SendThread(localhost.localdomain:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2014-06-27 09:55:12,384 WARN [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
2014-06-27 09:55:12,384 INFO [main] org.apache.hadoop.hbase.util.RetryCounter: Sleeping 1000ms before retry #0...
2014-06-27 09:55:13,385 INFO [main-SendThread(localhost.localdomain:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-06-27 09:55:13,385 WARN [main-SendThread(localhost.localdomain:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing
***
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
2014-06-27 09:55:13,486 ERROR [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 1 attempts
2014-06-27 09:55:13,486 WARN [main] org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x65afdbbb, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
I'm pretty sure it's configured correctly; here is the relevant portion of my hbase-site.xml:
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>5181</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>10000</value>
<description></description>
</property>
<property>
<name>hbase.client.retries.number</name>
<value>10</value>
<description></description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hdev01,hdev02,hdev03</value>
<description></description>
</property>
As far as I can tell, hdev03 is the only server that has any problem with this. Running netstat on all relevant ports doesn't show me anything strange.
I've had the same problem when running HBase through Spark on YARN. Everything was fine until it suddenly started trying to connect to localhost instead of the quorum. Setting the port and quorum programmatically before the HBase call fixed the issue:
conf.set("hbase.zookeeper.quorum","my.server")
conf.set("hbase.zookeeper.property.clientPort","5181")
I'm using MapR, and it has an "unusual" ZooKeeper port (5181).
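In context, a self-contained version of that fix might look like the sketch below ("my.server" and the table name are placeholders; the HTable constructor matches the 0.94-era client API):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

Configuration conf = HBaseConfiguration.create();
// Override whatever (possibly missing) hbase-site.xml the executor
// classpath provides; a default config silently falls back to localhost:2181.
conf.set("hbase.zookeeper.quorum", "my.server");
conf.set("hbase.zookeeper.property.clientPort", "5181");
HTable table = new HTable(conf, "my_table"); // now uses the explicit quorum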
Hard to say what is happening with the information given. I have found the Hadoop stack (HBase especially) to be quite hostile to even the slightest bit of misconfiguration in DNS or the hosts file.
As the quorum in your hbase-site.xml looks good, I'd start by checking the network/hostname-resolution related configuration:
Has the node name slipped into the localhost entry in /etc/hosts on hdev03?
Is there an entry for the host itself in hdev03's /etc/hosts (there should be)?
Has reverse DNS been correctly configured, in case you are using DNS for name resolution instead of the hosts file?
These are just a few pointers in the direction I'd look with this kind of issue; a quick resolution check like the sketch below can help rule the first two out. Hope it helps!
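A minimal sketch of such a check, assuming the host names from the question (run it on each node and compare the output with /etc/hosts):
import java.net.InetAddress;

public class ResolutionCheck {
    public static void main(String[] args) throws Exception {
        for (String host : new String[] {"hdev01", "hdev02", "hdev03"}) {
            InetAddress addr = InetAddress.getByName(host);   // forward lookup
            System.out.println(host + " -> " + addr.getHostAddress()
                + " -> " + addr.getCanonicalHostName());      // reverse lookup
        }
    }
}
If any host resolves to 127.0.0.1, or the reverse lookup returns localhost, that node will mislead HBase in exactly this way.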
Add '--driver-class-path ~/hbase-1.1.2/conf' to the spark-submit command, so that the task can find the configured ZooKeeper servers instead of 127.0.0.1.

FATAL master.HMaster: Unexpected state : .. Cannot transit it to OFFLINE

I've got a serious HBase crash problem. I'm using HBase 0.94.7 with one master and two region servers. The HBase master crashes regularly, and I can't even get it restarted. The master logs are as follows:
DEBUG master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=master,60020,1374506461230, region=46c2333f401964bf877254be19c2cc8c
DEBUG handler.ClosedRegionHandler: Handling CLOSED event for 6423df864603aa6e8c45c726ab3ae62f
DEBUG master.AssignmentManager: Forcing OFFLINE; was=LogDetail,\x00\x00\x01\xE8\x00\x00\x01?\xF8\xB3\x8F\x17\xCE\xE2g\x84,1374498065657.6423df864603aa6e8c45c726ab3ae62f. state=CLOSED, ts=1374508769672, server=slave,60020,1374506460892
DEBUG zookeeper.ZKAssign: master:60000-0x14006f52f3f000e Creating (or updating) unassigned node for 6423df864603aa6e8c45c726ab3ae62f with OFFLINE state
FATAL master.HMaster: Unexpected state : LogDetail,\x00\x00\x01\xE8\x00\x00\x01?\xF6\xC17p&c\x8F\x14,1374498085655.c2f4143750eb1559a1dd92e937ea712d. state=PENDING_OPEN, ts=1374508769697, server=master,60020,1374506461230 .. Cannot transit it to OFFLINE.
java.lang.IllegalStateException: Unexpected state : LogDetail,\x00\x00\x01\xE8\x00\x00\x01?\xF6\xC17p&c\x8F\x14,1374498085655.c2f4143750eb1559a1dd92e937ea712d. state=PENDING_OPEN, ts=1374508769697, server=master,60020,1374506461230 .. Cannot transit it to OFFLINE.
at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
INFO master.HMaster: Aborting
DEBUG handler.ClosedRegionHandler: Handling CLOSED event for 0710b486dcb3d51465695b51db376255
....
DEBUG master.AssignmentManager: The znode of region LogDetail,\x00\x00\x01\xE8\x00\x00\x01?\xF6\xC17p&c\x8F\x14,1374498085655.c2f4143750eb1559a1dd92e937ea712d. has been deleted.
INFO master.AssignmentManager: The master has opened the region LogDetail,\x00\x00\x01\xE8\x00\x00\x01?\xF6\xC17p&c\x8F\x14,1374498085655.c2f4143750eb1559a1dd92e937ea712d. that was online on master,60020,1374506461230
DEBUG master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=master,60000,1374508461536, region=c9cfdd360c09b292412ba5ad88815e6f
DEBUG catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker#5c061cd2
INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x14006f52f3f000f
INFO zookeeper.ZooKeeper: Session: 0x14006f52f3f000f closed
INFO zookeeper.ClientCnxn: EventThread shut down
INFO master.AssignmentManager$TimerUpdater: master,60000,1374508461536.timerUpdater exiting
INFO master.SplitLogManager$TimeoutMonitor: master,60000,1374508461536.splitLogManagerTimeoutMonitor exiting
INFO master.AssignmentManager$TimeoutMonitor: master,60000,1374508461536.timeoutMonitor exiting
INFO zookeeper.ZooKeeper: Session: 0x14006f52f3f000e closed
INFO zookeeper.ClientCnxn: EventThread shut down
INFO master.HMaster: HMaster main thread exiting
ERROR master.HMasterCommandLine: Failed to start master
I also found something unusual in the ZK log:
INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /master:37856
INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /master:37856
INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x140100dda0300e1 with negotiated timeout 180000 for client /master:37856
WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x140100dda0300e1, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:662)
INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /master:37856 which had sessionid 0x140100dda0300e1
Can anybody help me see what the problem is? Is it related to the unassigned region or something like that? I've tried bin/hbase hbck -repair and bin/hbase hbck -fix, but they don't help.
Thanks
After checking the log of my region server very carefully, I got the answer.
Cause
It turns out that the 'SNAPPY' library used for compression of the HBase tables was not properly installed on the region server, and all my tables were created using this compression algorithm. When the master tried to balance the regions onto that region server, it failed; eventually, the master aborted.
Solution
Install and configure SNAPPY on EVERY NODE as follows:
apt-get install libsnappy1
su hbase
mkdir /home/hbase/hbase-0.94.7/lib/native/Linux-amd64-64
ln -s /usr/lib/libsnappy.so.1.1.2 /home/hbase/hbase-0.94.7/lib/native/Linux-amd64-64/libsnappy.so
exit (-> root)
ln -s /usr/lib/libsnappy.so.1.1.2 /usr/lib64/libsnappy.so.1.1.2
ln -s /usr/lib/libsnappy.so.1.1.2 /usr/lib64/libsnappy.so.1
ln -s /usr/lib/libsnappy.so.1.1.2 /usr/lib64/libsnappy.so
ln -s /usr/lib/libsnappy.so.1 /usr/lib/libsnappy.so
Now everything is OK! The regions are well balanced across the region servers.
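To verify the codec on a node before restarting, HBase ships a small compression self-test, org.apache.hadoop.hbase.util.CompressionTest. A minimal sketch that drives it from Java (the scratch path is just an example):
import org.apache.hadoop.hbase.util.CompressionTest;

public class SnappyCheck {
    public static void main(String[] args) throws Exception {
        // Equivalent to: hbase org.apache.hadoop.hbase.util.CompressionTest <path> snappy
        // Writes and re-reads a small Snappy-compressed file; fails if the
        // native library cannot be loaded on this node.
        CompressionTest.main(new String[] {"file:///tmp/snappy-check", "snappy"});
    }
}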
Check the region server log; if the failure is caused by a missing LZO compressor and you are using Cloudera Hadoop, you can install LZO easily according to the following instructions:
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/v1/v1-0-1/Installing-and-Using-Impala/ciiu_lzo.html

Running MapReduce on HBase gives ZooKeeper error

I am doing a test project with Hadoop and HBase. Currently the cluster has 2 Ubuntu VMs hosted on a Windows machine.
I am able to perform PUT, QUERY, and DELETE operations remotely (from my host machine) using the following HBase Java API configuration:
config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "192.168.56.90");
config.set("hbase.zookeeper.property.clientPort", "2222");
When I try to run an HBase MapReduce job on Windows with the same config as above, I get the following error:
13/03/24 06:11:03 ERROR security.UserGroupInformation: PriviledgedActionException as:Joel cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Joel\mapred\staging\Joel290889388\.staging to 0700
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-Joel\mapred\staging\Joel290889388\.staging to 0700
From what I have read on the web, there seems to be a problem with running MapReduce jobs on Windows, so I tried running the MapReduce job on Linux using "java -jar MR.jar".
On Linux, I can't connect to ZooKeeper. For some unknown reason, the ZooKeeper host and port are getting reset on the client side:
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Client environment:os.version=3.5.0-23-generic
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Client environment:user.name=hduser
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hduser
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hduser/testes
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.56.90:2222 sessionTimeout=180000 watcher=hconnection
13/03/24 05:59:33 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 11552#node01
13/03/24 05:59:33 INFO zookeeper.ClientCnxn: Opening socket connection to server node01/192.168.56.90:2222. Will not attempt to authenticate using SASL (unknown error)
13/03/24 05:59:33 INFO zookeeper.ClientCnxn: Socket connection established to node01/192.168.56.90:2222, initiating session
13/03/24 05:59:33 INFO zookeeper.ClientCnxn: Session establishment complete on server node01/192.168.56.90:2222, sessionid = 0x13d9afaa1a30006, negotiated timeout = 180000
13/03/24 05:59:33 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13d9afaa1a30006
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Session: 0x13d9afaa1a30006 closed
13/03/24 05:59:33 INFO zookeeper.ClientCnxn: EventThread shut down
13/03/24 05:59:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/03/24 05:59:33 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/03/24 05:59:33 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
13/03/24 05:59:33 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 11552#node01
13/03/24 05:59:33 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
13/03/24 05:59:33 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
Judging from the log above, it connects correctly to node01:2222 (node01 resolves to 192.168.56.90). But for some reason it changes to localhost:2181, which then gives a connection refused error.
How can I fix this issue to get MR jobs running on Linux, on the same machine where ZooKeeper is running?
Version: HBase 0.94.5 / Hadoop 1.1.2
Thanks.
You may need to set hbase.master as well.
Also check the /etc/hosts file and see if it is correct. Are you able to telnet to ZooKeeper using that connection info?
config.set("hbase.zookeeper.quorum", "192.168.56.90");
config.set("hbase.zookeeper.property.clientPort", "2222");
config.set("hbase.master", "some.host.com:60000")
