HMaster stops unexpectedly while starting HBase - hadoop

While starting HBase, the HMaster exits abruptly with the following error:
2015-07-17 20:52:36,136 DEBUG [main] master.HMaster: master/master/218.93.250.18:60000 HConnection server-to-server retries=350
2015-07-17 20:52:37,969 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:3015)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:193)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:3029)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2462)
at org.apache.hadoop.hbase.ipc.RpcServer$Listener.<init>(RpcServer.java:588)
at org.apache.hadoop.hbase.ipc.RpcServer.<init>(RpcServer.java:1942)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:512)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:3010)
Please suggest a reason.
This is my hbase-site.xml file:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperData</value>
</property>
</configuration>
This is my hosts file configuration:
127.0.0.1 localhost
127.0.0.1 xxx.xxx.xxx.xxx
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
What is going wrong here?
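The "Caused by" line in the trace is java.net.BindException: Cannot assign requested address, which is what the operating system reports when a process tries to bind a socket to an IP address that none of the machine's network interfaces own. The DEBUG line shows the master resolving its hostname to 218.93.250.18, so one thing worth checking is whether that address is really assigned to this machine. A minimal Python sketch of such a check follows; the helper name can_bind is made up for illustration and is not part of HBase or Hadoop.

```python
import socket


def can_bind(address, port=0):
    """Return True if this machine can bind a TCP socket to `address`.

    Port 0 asks the OS for any free port, so the check normally fails only
    when the address is not assigned to a local interface (or binding is
    otherwise not permitted).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        return True
    except OSError:
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Resolve this machine's hostname, then test whether the resulting
    # address is one the machine can actually listen on.
    hostname = socket.gethostname()
    try:
        resolved = socket.gethostbyname(hostname)
    except OSError:
        resolved = None
    print(hostname, "->", resolved)
    if resolved is not None:
        print("bindable:", can_bind(resolved))
```

If this reports that the resolved address is not bindable, the hostname points at an address the machine does not own, and the /etc/hosts or DNS entry for that hostname would be the first place to look.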

Related

Connection refused on a MapReduce job in a Hadoop cluster environment

I've set up a 4-node Hadoop cluster with a master node and three data nodes. It all seems to run fine until I try to execute a MapReduce job.
Jps (master-node):
[root@master logs]# jps
26967 SecondaryNameNode
25720 JobHistoryServer
26778 NameNode
27115 ResourceManager
27839 Jps
Jps (data-nodes):
[root@localhost ~]# jps
21872 DataNode
22257 Jps
21974 NodeManager
The YARN log file on the master node gives the following exception:
2018-05-22 21:59:10,376 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1527018750538_0001 failed 2 times due to Error launching appattempt_1527018750538_0001_000002. Got exception: java.net.ConnectException: Call From NameNode/193.198.139.50 to localhost:41227 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor47.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
at org.apache.hadoop.ipc.Client.call(Client.java:1413)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy83.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy84.startContainers(Unknown Source)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:250)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:713)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
... 15 more
. Failing the application.
As far as I can see, the problem is with localhost:41227: I have never specified anything like that in any of the configuration files, and the port number is a new one every time I try to run a new job. I'm not sure, though. Any advice or help is appreciated. Thanks
core-site.xml
<configuration>
<!-- core-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://NameNode:9000/</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>NameNode:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>NameNode:19888</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<!-- hdfs-site.xml -->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_work/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_work/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:/usr/local/hadoop_work/hdfs/namesecondary</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>NameNode</value>
</property>
<property>
<name>yarn.resourcemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>yarn.nodemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:/usr/local/hadoop_work/yarn/local</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>file:/usr/local/hadoop_work/yarn/log</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>hdfs://NameNode:9000/var/log/hadoop-yarn/apps</value>
</property>
</configuration>
The problem is the hostname of the DataNode machines.
Give each of them a meaningful hostname other than localhost and restart the processes.
Call From NameNode/193.198.139.50 to localhost:41227
means the NameNode host is trying to reach a randomly assigned port on a data node that is known to it as localhost. On every machine, localhost resolves to that machine's own loopback address (127.0.0.1), so instead of reaching the data node, the master ends up trying to connect to itself, as your config currently dictates.
Can you also post your slaves file?
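To illustrate the diagnosis above, here is a small Python sketch that could be run on each worker to see whether the name it goes by is localhost or resolves to a loopback address. The function name advertises_loopback and the injectable resolve parameter are assumptions made for this example; they are not Hadoop APIs.

```python
import ipaddress
import socket


def advertises_loopback(hostname, resolve=socket.gethostbyname):
    """Return True if `hostname` is localhost or resolves to a loopback address.

    `resolve` is injectable so the check can be exercised without real DNS.
    """
    if hostname.lower() in ("localhost", "localhost.localdomain"):
        return True
    try:
        address = ipaddress.ip_address(resolve(hostname))
    except (OSError, ValueError):
        return False  # an unresolvable name is a different problem
    return address.is_loopback


if __name__ == "__main__":
    name = socket.gethostname()
    print(name, "advertises loopback:", advertises_loopback(name))
```

A worker for which this prints True would be reachable under that name only from itself, which matches the "Call From NameNode/... to localhost:41227" symptom in the log.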

Successive errors on Hadoop: failed on connection exception, then com.google.protobuf.InvalidProtocolBufferException

I am trying to execute the command hadoop dfs -ls, and I get this error:
Call From localhost/127.0.0.1 to yass-SATELLITE-C855-2CF:8021 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Once I resolved that, I got another one:
ls: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "localhost/127.0.0.1"; destination host is: "yass-SATELLITE-C855-2CF":9000;
and I keep looping between these two errors.
My core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://yass-SATELLITE-C855-2CF:9000</value>
</property>
</configuration>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.safemode.threshold.pct</name>
<value>0</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/hadoop/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/hadoop/data/datanode</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/yass/Téléchargements/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
/etc/hosts
127.0.0.1 localhost
127.0.0.1 yass-SATELLITE-C855-2CF
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
I get the first exception; once it disappears I get the second one, and I keep looping between these two exceptions.
Any suggestions, please?
In the end I found a workaround that got me past these errors:
I installed Hadoop version 2.6.2, configured it using the XML files, and it kept working. This is not a good solution for everyone, but I hope it helps others.
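The hosts file in this question maps the machine's hostname to 127.0.0.1. Whether that contributed to the errors here is not established, but it is easy to rule in or out. A minimal Python sketch that scans hosts-file text and lists every name, other than the usual loopback aliases, that points at a loopback address; the function name, the alias set, and the sample hostnames worker1 and worker2 are all assumptions made for illustration.

```python
import ipaddress

# Names that are expected to point at the loopback interface (assumed set).
LOOPBACK_ALIASES = {
    "localhost",
    "localhost.localdomain",
    "ip6-localhost",
    "ip6-loopback",
}


def loopback_mapped_names(hosts_text):
    """Return (address, name) pairs where a non-loopback name maps to loopback."""
    suspicious = []
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        if not address.is_loopback:
            continue
        for name in fields[1:]:
            if name.lower() not in LOOPBACK_ALIASES:
                suspicious.append((fields[0], name))
    return suspicious


# Hypothetical hosts file content, not taken from the question.
sample = """\
127.0.0.1 localhost
127.0.0.1 worker1
10.0.0.12 worker2
::1 ip6-localhost ip6-loopback
"""
print(loopback_mapped_names(sample))
```

On a single machine such a mapping can still work, since every daemon runs locally; it becomes a problem as soon as another machine needs to reach the node under that name.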

NameNode cannot start in a multi-node cluster on EC2

I followed this guide to set up my multi-node cluster: http://disi.unitn.it/~lissandrini/notes/installing-hadoop-on-ubuntu-14.html
After finishing the setup I ran start-dfs.sh. When I then run jps, only SecondaryNameNode has started.
Here is my core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mynode1/</value>
<description>NameNode URI</description>
</property>
</configuration>
and my hdfs.xml
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop/data/datanode</value>
<description>DataNode directory</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop/data/namenode</value>
<description>NameNode directory for namespace and transaction logs storage.</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
</configuration>
and my /etc/hosts
127.0.0.1 localhost
54.225.196.4 mynode1
54.80.40.198 mynode2
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
This is my namenode log
2014-10-26 01:16:57,756 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = ip-10-169-41-62.ec2.internal/10.169.41.62
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.4.1
One thing I noticed: if I change everything in the settings from mynode1 to localhost, the NameNode does seem to start, but the DataNode on node2 does not respond to the master, so I cannot upload a file to HDFS.
I recommend using this tutorial here. I used it and everything worked fine; I only had to change the port numbers from 8020 to 9000 and from 8021 to 9001.
Your core and hdfs files are not correct, and I cannot tell how many nodes you are deploying. Switch to the tutorial I referred to in the link above, and let me know if you run into any issues.
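The answer above talks about port numbers, while the fs.defaultFS value in the question (hdfs://mynode1/) names no port at all. A short Python sketch, using only the standard library, that reads a core-site.xml document and reports the host and port its NameNode URI refers to. The helper names are hypothetical, and the fallback port 8020 is an assumption based on the usual NameNode RPC default in Hadoop 2.x; check the documentation of the version in use.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: usual NameNode RPC default port in Hadoop 2.x.
ASSUMED_DEFAULT_RPC_PORT = 8020


def read_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def namenode_endpoint(props):
    """Return (host, port) named by fs.defaultFS, using the assumed default port if none is given."""
    uri = urlparse(props["fs.defaultFS"])
    port = uri.port if uri.port is not None else ASSUMED_DEFAULT_RPC_PORT
    return uri.hostname, port


core_site = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mynode1/</value>
  </property>
</configuration>
"""
print(namenode_endpoint(read_properties(core_site)))
```

Printing the endpoint on every node is a quick way to confirm that all of them agree on where the NameNode is supposed to listen. Note that urlparse lowercases the hostname it returns.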

HBase starts on one slave but fails on the remaining slaves in Linux

I am using Hadoop 1.2.1 and HBase 0.94.20. I created a cluster of 5 slaves. When I start HBase using ./start-hbase.sh, it starts the services on one slave, but the other slaves fail to start.
My hbase-site.xml file is
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:54310/opt/hbase-0.94.20</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
<description>
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/hbase-0.94.20/tmp</value>
</property>
</configuration>
and my /etc/hosts file is
127.0.0.1 localhost ubuntu
10.10.73.42 master sitmaster-HP-Compaq-6200-Pro-SFF-PC
10.10.73.5 slave1 sit2-HP-Pro-2110
10.10.73.25 slave2 sit-HP-Pro-2110
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

browse file system link - hadoop - localhost link

I am using Hadoop 2.2 on Ubuntu.
I am able to load this link in my browser.
http://[my_ip]:50070/dfshealth.jsp
From there, when I click the "Browse the filesystem" link, I am sent to
http://localhost:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=127.0.0.1:9000
whereas I think I want my_ip there instead of localhost and 127.0.0.1.
Also, if I type manually
http://my_ip:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=my_ip:9000
it still does not work.
Throughout this question, my_ip stands for an external/global IP.
How can I get this working? All I want is to be able to browse my HDFS filesystem from the browser.
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<!-- <value>hdfs://my_ip:9000</value> -->
</property>
<!--
fs.default.name
hdfs://localhost:9000
-->
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/var/lib/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/var/lib/hadoop/hdfs/datanode</value>
</property>
<!--
dfs.replication
1
dfs.namenode.name.dir
file:/var/lib/hadoop/hdfs/namenode
dfs.datanode.data.dir
file:/var/lib/hadoop/hdfs/datanode
-->
<property>
<name>dfs.http.address</name>
<value>my_ip:50070</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>my_ip:50075</value>
</property>
</configuration>
/etc/hosts
127.0.0.1 localhost test02
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
EDIT ERROR:
HTTP ERROR 500
Problem accessing /nn_browsedfscontent.jsp. Reason:
Cannot issue delegation token. Name node is in safe mode.
The reported blocks 21 has reached the threshold 0.9990 of total blocks 21. The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically in 2 seconds.
Caused by:
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot issue delegation token. Name node is in safe mode.
The reported blocks 21 has reached the threshold 0.9990 of total blocks 21. The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically in 2 seconds.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5887)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:447)
at org.apache.hadoop.hdfs.server.namenode.NamenodeJspHelper$1.run(NamenodeJspHelper.java:623)
at org.apache.hadoop.hdfs.server.namenode.NamenodeJspHelper$1.run(NamenodeJspHelper.java:620)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
In your hdfs-site.xml, replace
<property>
<name>dfs.http.address</name>
<value>my_ip:50070</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>my_ip:50075</value>
</property>
with
<property>
<name>dfs.namenode.http-address</name>
<value>localhost:50070</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>localhost:50075</value>
</property>
Usually, though, in pseudo-distributed mode it is not necessary to specify those properties at all.
Restart your cluster after changing the properties.
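The answer swaps dfs.http.address for dfs.namenode.http-address, which is the newer name for the same setting. A small Python sketch that scans a configuration document for a few older property names and suggests the newer ones. The mapping is a short, hand-picked subset of Hadoop's deprecated-properties list as I understand it for the 2.x line, so treat it as an assumption and verify it against the documentation for your version; the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Old name -> newer name (partial list; verify per Hadoop version).
RENAMED = {
    "fs.default.name": "fs.defaultFS",
    "dfs.http.address": "dfs.namenode.http-address",
    "dfs.name.dir": "dfs.namenode.name.dir",
    "dfs.data.dir": "dfs.datanode.data.dir",
    "dfs.block.size": "dfs.blocksize",
    "dfs.permissions": "dfs.permissions.enabled",
}


def find_renamed_properties(xml_text):
    """Return (old_name, new_name) pairs for renamed properties found in the XML."""
    root = ET.fromstring(xml_text)
    hits = []
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name in RENAMED:
            hits.append((name, RENAMED[name]))
    return hits


hdfs_site = """
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>localhost:50070</value>
  </property>
</configuration>
"""
for old, new in find_renamed_properties(hdfs_site):
    print(old, "->", new)
```

The old names are generally still accepted, so a hit here is a cleanup hint rather than an error in itself.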
