HBase ZooKeeper connection error - hadoop

Environment: Ubuntu 14.04, hadoop-2.2.0, hbase-0.98.7
When I start Hadoop and HBase (single-node mode), both start successfully (I also checked the web UIs: port 8088 for Hadoop, 60010 for HBase).
jps
4507 SecondaryNameNode
5350 HRegionServer
4197 NameNode
4795 NodeManager
3948 QuorumPeerMain
5209 HMaster
4678 ResourceManager
5831 Jps
4310 DataNode
But when I check hbase-hadoop-master-localhost.log, I find the following messages:
2014-10-23 14:16:11,392 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2014-10-23 14:16:11,426 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
I have googled a lot for that "unknown error" problem, but I can't solve it...
Following is my Hadoop and HBase configuration.
Hadoop:
slaves content: localhost
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:9001</value>
<description>host is the hostname of the resource manager and
port is the port on which the NodeManagers contact the Resource Manager.
</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:9002</value>
<description>host is the hostname of the resourcemanager and port is the port
on which the Applications in the cluster talk to the Resource Manager.
</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
<description>In case you do not want to use the default scheduler</description>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:9003</value>
<description>the host is the hostname of the ResourceManager and the port is the port on
which the clients can talk to the Resource Manager. </description>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value></value>
<description>the local directories used by the nodemanager</description>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>localhost:9004</value>
<description>the nodemanagers bind to this port</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>10240</value>
<description>the amount of memory on the NodeManager, in MB</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/app-logs</value>
<description>directory on hdfs where the application logs are moved to </description>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value></value>
<description>the directories used by Nodemanagers as log directories</description>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>shuffle service that needs to be set for Map Reduce to run </description>
</property>
</configuration>
HBase:
hbase-env.sh:
..
export JAVA_HOME="/usr/lib/jvm/java-7-oracle"
..
export HBASE_MANAGES_ZK=true
..
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
</configuration>
regionservers content: localhost
my /etc/hosts content:
127.0.0.1 localhost
#127.0.1.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
I have tried lots of methods to solve it, but all fail. Please help me solve it; I really need to know how.
Originally, I ran a MapReduce program, and when it reached map 67% reduce 0% it printed out some INFO messages, some of which follow:
14/10/23 15:50:41 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher#ce1472
14/10/23 15:50:41 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
14/10/23 15:50:41 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
14/10/23 15:50:41 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1493be510380007, negotiated timeout = 40000
14/10/23 15:50:43 INFO mapred.LocalJobRunner: map > sort
14/10/23 15:50:46 INFO mapred.LocalJobRunner: map > sort
Then it crashes. I think the program may be deadlocked, and that is why I want to solve the ZooKeeper problem above.
If you want any other configuration file I set in Hadoop, HBase, or elsewhere, just tell me and I'll post it.
Thanks!

I don't think ZooKeeper is your problem. You should look at your other logs for more information about your MapReduce job status. Check the DataNode and NameNode logs for errors, along with the YARN log messages through the YARN ResourceManager web UI.
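For a single-node setup like the one above, those logs normally sit in each daemon's log directory; a quick way to skim them (a sketch assuming the default tarball layout, with HADOOP_HOME and HBASE_HOME pointing at your installs):
tail -n 200 $HADOOP_HOME/logs/hadoop-*-namenode-*.log
tail -n 200 $HADOOP_HOME/logs/hadoop-*-datanode-*.log
tail -n 200 $HBASE_HOME/logs/hbase-*-regionserver-*.log
The YARN side is easiest through the ResourceManager UI already mentioned (http://localhost:8088), which links to the per-application logs.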
ZooKeeper Messages
Those messages come from the ZooKeeper client trying to authenticate with the ZooKeeper SASL client support. If SASL is not configured, the client will still be able to connect, but the connection won't be authenticated.
The message comes from this file:
ZooKeeperSaslClient.java
// The user did not override the default context. It might be that they just don't intend to use SASL,
// so log at INFO, not WARN, since they don't expect any SASL-related information.
String msg = "Will not attempt to authenticate using SASL ";
if (runtimeException != null) {
    msg += "(" + runtimeException + ")";
} else {
    msg += "(unknown error)";
}
this.configStatus = msg;
this.isSASLConfigured = false;
}
If you want to get rid of the message, you will have to configure ZooKeeper to use SASL. Sorry, I don't have any experience with configuring SASL for ZooKeeper.
ZooKeeper SASL Configuration
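Setting it up roughly means giving both the ZooKeeper server and the client a JAAS login configuration. A minimal sketch for a standalone (non-HBase-managed) quorum using DIGEST-MD5; all paths, user names, and passwords below are made-up placeholders:
# Server side: a JAAS file listing the accounts the ZooKeeper server will accept
cat > /etc/zookeeper/server-jaas.conf <<'EOF'
Server {
  org.apache.zookeeper.server.auth.DigestLoginModule required
  user_super="supersecret"
  user_hbase="hbasesecret";
};
EOF
# Enable the SASL auth provider and point the server JVM at the JAAS file before starting it
echo "authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider" >> /etc/zookeeper/zoo.cfg
export SERVER_JVMFLAGS="-Djava.security.auth.login.config=/etc/zookeeper/server-jaas.conf"
# Client side: the credentials a ZooKeeper client (e.g. the HBase master) will present
cat > /etc/zookeeper/client-jaas.conf <<'EOF'
Client {
  org.apache.zookeeper.server.auth.DigestLoginModule required
  username="hbase"
  password="hbasesecret";
};
EOF
export CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/etc/zookeeper/client-jaas.conf"
With HBASE_MANAGES_ZK=true, as in the question, the equivalent system property would have to be passed through HBase's own env settings instead; and as noted above, SASL is optional and the INFO line is harmless if you skip it.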

Add the following properties to the hbase-site.xml file:
<property>
<name>hbase.zookeeper.quorum</name>
<value>192.168.56.101</value> <!-- this is my server IP -->
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
Then restart HBase: ./start-hbase.sh
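To confirm that the client can actually reach that quorum, ZooKeeper's four-letter commands are a quick check (IP and port taken from the properties above; nc is plain netcat):
echo ruok | nc 192.168.56.101 2181        # a healthy server answers "imok"
echo stat | nc 192.168.56.101 2181        # prints the server mode and connected clients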

This is how I solved it, after I tried including hbase-site.xml on the classpath and passing the ZooKeeper quorum value with -Dhbase.zookeeper.quorum, neither of which worked.
I copied hbase-site.xml to the same folder as my jar and then did
jar uf myjar.jar hbase-site.xml
And then I ran hadoop jar myjar.jar Blah
This fixed the problem.
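An alternative that avoids repackaging the jar, assuming HBase is installed on the machine submitting the job (the conf path below is just a common location, adjust to your install):
export HADOOP_CLASSPATH=$(hbase classpath)        # or: export HADOOP_CLASSPATH=/etc/hbase/conf:$HADOOP_CLASSPATH
hadoop jar myjar.jar Blah
Either way, the point is the same: hbase-site.xml has to be visible on the client classpath so the job picks up the real quorum instead of the localhost default.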

Related

Hadoop: How to access HDFS from an external IP?

Internally (i.e. internal network; private IP address-to-private IP address), I can access my HDFS just fine using:
hdfs dfs -ls hdfs://#.#.#.#/
However, when I try the same from a machine outside the network on which the HDFS namenode resides (obviously using the namenode machine's WAN IP instead of its LAN IP), I get:
ls: DestHost:destPort ec2-▒-▒-▒-▒.compute-1.amazonaws.com:8020 , LocalHost:localPort mymachine/127.0.0.1:0. Failed on local exception: java.io.IOException: Connection reset by peer
The namenode log reads:
INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: readAndProcess from client ▒.▒.▒.▒:▒ threw exception [java.io.IOException: Connection reset by peer]
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377)
at org.apache.hadoop.ipc.Server.channelRead(Server.java:3486)
at org.apache.hadoop.ipc.Server.access$2600(Server.java:138)
at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:2144)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:1389)
at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1245)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1216)
My core-site.xml reads:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:8020</value>
</property>
</configuration>
Note that I have also tried setting the fs.defaultFS value to hdfs://#.#.#.#:8020. I have also tried setting it to hdfs://hadoophost:8020, and adding #.#.#.# hadoophost to the top of /etc/hosts. (#.#.#.# is obviously the LAN IP of the namenode's machine in both cases.) The results have been the same.
My hdfs-site.xml reads:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
Note that I am able to telnet externally to the namenode's machine on port 8020 just fine.
What setting(s) am I missing to enable external access to my Hadoop file system?
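A couple of quick diagnostics before changing settings (a sketch using only the values already shown above):
# on the namenode machine: confirm which address the NameNode RPC server is bound to
sudo netstat -tlnp | grep 8020
# from the external machine: retry with the port spelled out explicitly
hdfs dfs -ls hdfs://#.#.#.#:8020/
If the socket shows 0.0.0.0:8020 and telnet already succeeds, the reset is less likely a firewall problem and more likely something at the RPC layer, such as a hostname mismatch between what the client uses and what the NameNode expects.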

Unable to import data using Sqoop: exitCode=255

I am a noob in Hadoop/Spark. I have set up a Hadoop/Spark cluster (1 namenode, 2 datanodes). Now I am trying to import data from a MySQL database into HDFS using Sqoop, but it always fails:
16/07/27 16:50:04 INFO mapreduce.Job: Running job: job_1469629483256_0004
16/07/27 16:50:11 INFO mapreduce.Job: Job job_1469629483256_0004 running in uber mode : false
16/07/27 16:50:11 INFO mapreduce.Job: map 0% reduce 0%
16/07/27 16:50:13 INFO ipc.Client: Retrying connect to server: datanode1_hostname/172.31.58.123:59676. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/07/27 16:50:14 INFO ipc.Client: Retrying connect to server: datanode1_hostname/172.31.58.123:59676. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/07/27 16:50:15 INFO ipc.Client: Retrying connect to server: datanode1_hostname/172.31.58.123:59676. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/07/27 16:50:18 INFO mapreduce.Job: Job job_1469629483256_0004 failed with state FAILED due to: Application application_1469629483256_0004 failed 2 times due to AM Container for appattempt_1469629483256_0004_000002 exited with exitCode: 255
For more detailed output, check application tracking page:http://ip-172-31-55-182.ec2.internal:8088/cluster/app/application_1469629483256_0004Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1469629483256_0004_02_000001
Exit code: 255
Stack trace: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 255
Failing this attempt. Failing the application.
16/07/27 16:50:18 INFO mapreduce.Job: Counters: 0
16/07/27 16:50:18 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/07/27 16:50:18 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 16.2369 seconds (0 bytes/sec)
16/07/27 16:50:18 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/07/27 16:50:18 INFO mapreduce.ImportJobBase: Retrieved 0 records.
16/07/27 16:50:18 ERROR tool.ImportTool: Error during import: Import job failed!
I am able to manually write to HDFS:
hdfs dfs -put <local file path> <hdfs path>
But it fails when I run the Sqoop import command:
sqoop import --connect jdbc:mysql://<host>/<db_name> --username <USERNAME> --password <PASSWORD> --table <TABLE_NAME> --enclosed-by '\"' --fields-terminated-by , --escaped-by \\ -m 1 --target-dir <hdfs location>
Can anyone please tell me what I am doing wrong?
Here is the list of things that I have already tried:
Shutting down the cluster, formatting HDFS, then restarting the cluster (didn't help)
Made sure that HDFS is not in safe mode
All the nodes have this in their /etc/hosts:
127.0.0.1 localhost
172.31.55.182 namenode_hostname
172.31.58.123 datanode1_hostname
172.31.58.122 datanode2_hostname
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
Configuration Files:
All Nodes: $HADOOP_CONF_DIR/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ip-172-31-55-182.ec2.internal:9000</value>
</property>
</configuration>
All Nodes: $HADOOP_CONF_DIR/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ip-172-31-55-182.ec2.internal</value>
</property>
</configuration>
All Nodes: $HADOOP_CONF_DIR/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.jobtracker.address</name>
<value>ip-172-31-55-182.ec2.internal:54311</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
NameNode Specific Configurations
$HADOOP_CONF_DIR/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///mnt/hadoop_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:50010</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>
</property>
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:50475</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:50020</value>
</property>
</configuration>
$HADOOP_CONF_DIR/masters:
ip-172-31-55-182.ec2.internal
$HADOOP_CONF_DIR/slaves:
ip-172-31-58-123.ec2.internal
ip-172-31-58-122.ec2.internal
DataNode Specific Configurations
$HADOOP_CONF_DIR/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///mnt/hadoop_data/hdfs/datanode</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:50010</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>
</property>
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:50475</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:50020</value>
</property>
</configuration>
From where are you trying to import the data? I mean, from which machine are you trying to connect? Check the masters and slaves files on both the namenode and the datanodes.
Try to ping the IP address from a different server and check whether it shows as up.
Make these changes, restart your cluster, and try again:
Edit the parts marked with comments (#) below, and remove the comments.
/etc/hosts file on client node:
127.0.0.1 localhost yourcomputername # get the computer name with the "hostname -f" command and replace it here
172.31.55.182 namenode_hostname ip-172-31-55-182.ec2.internal
172.31.58.123 datanode1_hostname ip-172-31-58-123.ec2.internal
172.31.58.122 datanode2_hostname ip-172-31-58-122.ec2.internal
/etc/hosts file on cluster nodes:
198.22.23.212 yourcomputername # change to the public IP of the client node; use the same computer name as on the client node
172.31.55.182 namenode_hostname ip-172-31-55-182.ec2.internal
172.31.58.123 datanode1_hostname ip-172-31-58-123.ec2.internal
172.31.58.122 datanode2_hostname ip-172-31-58-122.ec2.internal
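After editing, it can help to confirm on each node that the names resolve consistently (standard tools; host names are the ones from the listing above):
hostname -f
getent hosts namenode_hostname datanode1_hostname datanode2_hostname
ping -c 1 datanode1_hostname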
I am terminating this cluster and starting from scratch.

Resource Manager Has No Nodes

EDIT: I have looked at "YARN Resourcemanager not connecting to nodemanager" and the solution does not work for me. I have attached the section of the node manager log where a connection to the resource manager is made:
[main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8031
2016-06-17 19:01:04,697 INFO [main] nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:getNMContainerStatuses(429)) - Sending out 0 NM container statuses: []
2016-06-17 19:01:04,701 INFO [main] nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:registerWithRM(268)) - Registering with RM using containers :[]
2016-06-17 19:01:05,815 INFO [main] ipc.Client (Client.java:handleConnectionFailure(867)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-06-17 19:01:06,816 INFO [main] ipc.Client (Client.java:handleConnectionFailure(867)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
For some reason it says it is connecting to 0.0.0.0. When I ssh into one of the data nodes and ping resource-manager, I get a response, so it is able to resolve the hostname.
This leads me to believe that an option is incorrect in my yarn-site.xml, as my nodes are trying to connect to 0.0.0.0:8031 instead of resource-manager:8031.
I am running a Cloudera Hadoop cluster on Docker and am having issues with the YARN resource manager being able to see the other nodes. The way it is set up is as follows:
Node 1 - Namenode (hadoop-hdfs-namenode)
Node 2 - Secondary Namenode (hadoop-hdfs-secondarynamenode)
Node 3 - Yarn Resource-Manager (hadoop-yarn-resourcemanager)
Node 4 - datanode and node manager (hadoop-hdfs-datanode, hadoop-yarn-nodemanager)
Node 5 - datanode and node manager (hadoop-hdfs-datanode, hadoop-yarn-nodemanager)
When I go to namenode:50070 I am able to see both nodes. However, when I go to resource-manager:8088 it shows I have zero nodes. My yarn-site.xml file, which is on every node, is as follows:
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>resource-manager:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>resource-manager:8030</value>
</property>
<property>
<description>Classpath for typical applications.</description>
<name>yarn.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:///data/1/yarn/local,file:///data/2/yarn/local,file:///data/3/yarn/local</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>file:///data/1/yarn/logs,file:///data/2/yarn/logs,file:///data/3/yarn/logs</value>
</property>
<property>
<name>yarn.log.aggregation-enable</name>
<value>true</value>
</property>
<property>
<description>Where to aggregate logs</description>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>hdfs://namenode:8020/var/log/hadoop-yarn/apps</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>resource-manager:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>resource-manager:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>resource-manager:8033</value>
</property>
<property>
<description>
Number of seconds after an application finishes before the nodemanager's
DeletionService will delete the application's localized file directory
and log directory.
To diagnose Yarn application problems, set this property's value large
enough (for example, to 600 = 10 minutes) to permit examination of these
directories. After changing the property's value, you must restart the
nodemanager in order for it to have an effect.
The roots of Yarn applications' work directories is configurable with
the yarn.nodemanager.local-dirs property (see below), and the roots
of the Yarn applications' log directories is configurable with the
yarn.nodemanager.log-dirs property (see also below).
</description>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>600</value>
</property>
</configuration>
Does anyone have any ideas as to why this is the case?
Thanks for reading.
Specify:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master-1</value>
</property>
As indicated in the edit, it appeared as if the yarn-site.xml was not being picked up and only defaults were being applied. I solved this by copying the yarn-site.xml file into every config directory on the machine as the root user. I then ran the node manager so that it would error out reading the file, since it does not run as root. The log directed me to where it expected the file, which was in a YARN-specific directory instead of the general Hadoop directory.
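One way to see which configuration a running daemon actually loaded, instead of guessing from files on disk, is the /conf page that each Hadoop daemon web UI exposes (a sketch; 8088 and 8042 are the default ResourceManager and NodeManager web ports, resource-manager is the hostname used in the question, and <nodemanager-host> stands in for one of the data nodes):
curl -s http://resource-manager:8088/conf | grep resource-tracker
curl -s http://<nodemanager-host>:8042/conf | grep resource-tracker
If the NodeManager's copy still reports 0.0.0.0:8031, that node is running on defaults rather than the yarn-site.xml you edited.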

Loading data into HBase using importtsv causes an error

I am trying to load data from a CSV file into HBase using the importtsv tool. I have set up a cluster of 3 machines.
This is my hbase-site.xml file:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ec2-54-190-103-64.us-west-2.compute.amazonaws.com:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ec2-54-203-95-235.us-west-2.compute.amazonaws.com</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/ubuntu/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
</configuration>
When I start the cluster and run jps, under the master node I see HMaster, and under the datanodes I see HQuorumPeer and HRegionServer.
When I try to load data I get the following error:
INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
14/08/11 07:41:28 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
14/08/11 07:41:28 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
14/08/11 07:41:28 INFO util.RetryCounter: Sleeping 8000ms before retry #3...
14/08/11 07:41:29 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
Not sure what the issue with ZooKeeper is. Thanks in advance.
You need to add the following property to your hbase-site.xml file (on both master and slaves):
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>1000</value>
</property>
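It may also be worth confirming that clients are reaching the quorum host and port configured above, and how many connections are open, with ZooKeeper's four-letter commands (hostname and port taken from the hbase-site.xml in the question):
echo stat | nc ec2-54-203-95-235.us-west-2.compute.amazonaws.com 2222     # server mode and connection count
echo cons | nc ec2-54-203-95-235.us-west-2.compute.amazonaws.com 2222     # lists the currently open client connections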

Zookeeper connecting to localhost

While running a Pig script that inserts data into HBase, I got the following error:
2013-10-25 14:57:03,147 [main-SendThread(localhost:2181)] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2013-10-25 14:57:03,147 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to localhost/127.0.0.1:2181, initiating session
2013-10-25 14:57:03,169 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x41ead7bb150092, negotiated timeout = 180000
Also, when I look at the slave logs, they say it is unable to establish a connection to the master, but the master says the connection was established successfully. Below are the logs of both:
Master log:
zookeeper.ZooKeeper: Initiating client connection, connectString=slave:2181,hadoop-master:2181,ubuntu:2181 sessionTimeout=180000 watcher=hconnection
13/10/24 19:51:56 INFO zookeeper.ClientCnxn: Opening socket connection to server slave:2181. Will not attempt to authenticate using SASL (unknown error)
13/10/24 19:51:56 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 104744#hadoop-master
13/10/24 19:51:56 INFO zookeeper.ClientCnxn: Socket connection established to slave:2181, initiating session
13/10/24 19:51:56 INFO zookeeper.ClientCnxn: Session establishment complete on server slave:2181, sessionid = 0x141ead77c250002, negotiated timeout = 180000
Slave log:
2013-10-24 19:51:32,174 INFO org.apache.zookeeper.server.quorum.FastLeaderElection: New election. My id = 1, proposed zxid=0x80000002a
2013-10-24 19:51:32,180 WARN org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open channel to 0 at election address hadoop-master:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
Below are my configurations:
hbase-env.sh
export HBASE_MANAGES_ZK=true
hbase-site.xml
<configuration>
<property>
<name>hbase.master</name>
<value>hadoop-master:60000</value>
<description>
The host and port that the HBase master runs at.
A value of 'local' runs the master and a regionserver in a single process.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master:54310/hbase</value>
<description>
The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>
The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description>
Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop-master,slave,ubuntu</value>
<description>
Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>
Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
Running the jps command, I get the following output:
At Master:
104744 HMaster
91841 JobTracker
80184 TaskTracker
91475 DataNode
91222 NameNode
105062 HRegionServer
91747 SecondaryNameNode
104666 HQuorumPeer
At Slave:
11533 HQuorumPeer
2444 SecondaryNameNode
5970 TaskTracker
11756 HRegionServer
This is probably an issue with your HBase config not being on the classpath when running the Pig script. One thing you can try is to set the ZooKeeper connection settings in the Pig script to see if it connects to the correct instance. If it does, you can add the HBase config to the classpath before launching your script.
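A minimal sketch of the classpath route (assuming HBase is installed on the machine launching Pig and its hbase-site.xml, the one shown earlier in this thread, lives in /etc/hbase/conf; adjust paths to your layout):
export PIG_CLASSPATH=/etc/hbase/conf:$(hbase classpath)   # hbase-site.xml plus the HBase client jars
pig myscript.pig                                          # myscript.pig stands in for the script from the question
For the in-script route, Pig's set command can push properties such as hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort into the job configuration, which may be enough for HBaseStorage to pick up the right quorum.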
