Hadoop installation: Namenode cannot be started - hadoop

Currently I am trying to install hadoop-2.6.0 on my ubuntu 14.10 (32 bit utopic). I followed the instruction from here:
http://www.itzgeek.com/how-tos/linux/ubuntu-how-tos/install-apache-hadoop-ubuntu-14-10-centos-7-single-node-cluster.html#axzz3X2DuWaxQ
However, namenode cannot be started when I try to format namenode.
This is what I keep receiving when I try to do hdfs or hadoop namenode -format:
15/04/11 16:32:13 FATAL namenode.NameNode: Fialed to start namenode
java.lang.IllegalArgumentException: URI has an authority component
at java.io.File.<init>(File.java:423)
at org.apache.hadoop.hdfs.server.namenode.NNSStorage.getStorageDirectory(NNStorage.java:329)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java: 270)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:241)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:935)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1379)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
15/04/11 16:32:13 INFO util.ExitUtil: Exiting with status 1
15/04/11 16:32:14 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ThinkPad-Edge-E540/127.0.1.1
************************************************************/
I am new to linux and hadoop. Please help me out on this issue. Also, when I first tried to install hadoop, I was receiving error message like this:
java.net.ConnectException: Call From ThinkPad-Edge-E540/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Later, I uninstalled hadoop 2.6.0 and now I'm trying to follow current instruction as shown in the above link.
Update
I have removed all previous installed java (jdk1.7.0) that I installed in previous version. But the error message is still there.
Update
This is what showing in my etc/hosts:
127.0.0.1 localhost
127.0.1.1 myname-mycomputer (I have commented out this line per suggestion)
#The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00:0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

This problem arise when mistaken I specified wrong path for namenode and datanode in hdfs-site.xml and tmp dir path in core-site.xml,
Path should be well formatted, for example-
<property>
<name>dfs.namenode.edits.dir</name>
<value>file:///home/hadoop/hadoop-content/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoop-content/hdfs/datanode</value>
</property>
and for temp dir in core-site.xml it is like -
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-content/tmp</value>
</property>
sometimes we make mistake in specifying - file:///

In /etc/hosts:
1. Add this line:
your-ip-address your-host-name
example: 192.168.1.8 master
In /etc/hosts:
2. Delete the line with 127.0.1.1 (This will cause loopback)
3. In your core-site, change localhost to your-ip or your-hostname
Now, restart the cluster.

Related

Hadoop java.net.SocketException: Network is unreachable

I´m executing this command in a 4 node hadoop cluster on the namenode node:
hadoop fs -ls /
But it shows an error:
ls: Failed on local exception: java.net.SocketException:
Network is unreachable; Host Details: local host is "namenode/172.16.1.2";
destination host is: "namenode":9000;
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:9000</value>
</property>
</configuration>
cat /etc/hosts:
172.16.1.2 namenode
172.16.1.3 datanode1
172.16.1.4 datanode2
172.16.1.5 datanode3
First try to ping namenode and see what happen. If ping reaches the host, check the firewall via iptables on your current machine and namenode because it is probably blocking related traffic.
For me work setting JVM config
-Djava.net.preferIPv4Stack=true

hadoop Protocol message tag had invalid wire type

I Set up hadoop 2.6 cluster using two nodes of 8 cores each on Ubuntu 12.04. sbin/start-dfs.sh and sbin/start-yarn.sh both succeed. And I can see the following after jps on the master node.
22437 DataNode
22988 ResourceManager
24668 Jps
22748 SecondaryNameNode
23244 NodeManager
The jps outcome on the slave node is
19693 DataNode
19966 NodeManager
I then run the PI example.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 30 100
Which gives me there error-log
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "Master-R5-Node/xxx.ww.y.zz"; destination host is: "Master-R5-Node":54310;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
The problem seems with the HDFS file system since trying out the command bin/hdfs dfs -mkdir /user fails with the similar exception.
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "Master-R5-Node/xxx.ww.y.zz"; destination host is: "Master-R5-Node":54310;
where xxx.ww.y.zz is the ip-address of Master-R5-Node
I have checked and followed all the recommendations of ConnectionRefused on Apache and on this site.
Despite the week long effort, I cannot get it fixed.
Thanks.
There are so many reasons to what may lead to the problem I faced. But I finally ended up fixing it using some of the following things.
Make sure that you have the needed permission to the /hadoop and hdfs temporary files. (you have to figure out where that is for your paticular case)
remove the port number from fs.defaultFS in $HADOOP_CONF_DIR/core-site.xml. It should look like this:
`<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://my.master.ip.address/</value>
<description>NameNode URI</description>
</property>
</configuration>`
Add the following two properties to `$HADOOP_CONF_DIR/hdfs-site.xml
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
Voila! You should now be up and running!

ConnectException: Connection refused when run mapreduce in Hadoop

I set up Hadoop(2.6.0) with multi machines mode : 1 namenode + 3 datanodes. When I used command : start-all.sh, they (namenode, datanode, resource manager, node manager) worked ok. I checked it with jps command and result on each node were bellow:
NameNode :
7300 ResourceManager
6942 NameNode
7154 SecondaryNameNode
DataNodes:
3840 DataNode
3924 NodeManager
And I also uploaded sample text file on HDFS at: /user/hadoop/data/sample.txt. Absolutely no error at that moment.
But when I tried to run a mapreduce with hadoop example's jar :
hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hadoop/data/sample.txt /user/hadoop/output
I have this error:
15/04/08 03:31:26 INFO mapreduce.Job: Job job_1428478232474_0001 running in uber mode : false
15/04/08 03:31:26 INFO mapreduce.Job: map 0% reduce 0%
15/04/08 03:31:26 INFO mapreduce.Job: Job job_1428478232474_0001 failed with state FAILED due to: Application application_1428478232474_0001 failed 2 times due to Error launching appattempt_1428478232474_0001_000002. Got exception: java.net.ConnectException: Call From hadoop/127.0.0.1 to localhost:53245 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy31.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 9 more Failing the application.
15/04/08 03:31:26 INFO mapreduce.Job: Counters: 0
About the configuration, sure that namenode can ssh to datanodes and vice versa without prompt password.I also dissabled IP6 and modified /etc/hosts file :
127.0.0.1 localhost hadoop
192.168.56.102 hadoop-nn
192.168.56.103 hadoop-dn1
192.168.56.104 hadoop-dn2
192.168.56.105 hadoop-dn3
I dont know why mapreduced can't run althought namenode and datanodes worked alright. I'm almost stucked at here, can you help me find the reason??
Thank you
Edit :
Here config in hdfs-site.xml (namenode):
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop/hadoop_stores/hdfs/namenode</value>
<description>NameNode directory for namespace and transaction logs storage.</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>hadoop-nn:50070</value>
<description>Your NameNode hostname for http access.</description>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-nn:50090</value>
<description>Your Secondary NameNode hostname for http access.</description>
</property>
In datanodes :
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop/hadoop_stores/hdfs/data/datanode</value>
<description>DataNode directory</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>hadoop-nn:50070</value>
<description>Your NameNode hostname for http access.</description>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-nn:50090</value>
<description>Your Secondary NameNode hostname for http access.</description>
Here's result with command : hadoop fs -ls /user/hadoop/data
hadoop#hadoop:~/DATA$ hadoop fs -ls /user/hadoop/data 15/04/09 00:23:27
Found 2 items
-rw-r--r-- 3 hadoop supergroup 29 2015-04-09 00:22 >/user/hadoop/data/sample.txt
-rw-r--r-- 3 hadoop supergroup 27 2015-04-09 00:22 >/user/hadoop/data/sample1.txt
hadoop fs -ls /user/hadoop/output
ls: `/user/hadoop/output': No such file or directory
Found solution!! see this post- yarn shows data nodes id/name as localhost
Call From localhost.localdomain/127.0.0.1 to localhost.localdomain:56148 failed on connection exception: java.net.ConnectException: Connection refused;
Both master and slaves were having host names of localhost.localdomain in /etc/hostname.
I changed host names of slaves to slave1 and slave2. That worked.
Thank you everyone for your time.
#kate make sure etc/hostname in namenode and datanodes are not set to localhost. Just type ~# hostname in terminal to see. You can set a new hostname by the same command.
My master and workers or slaves' /etc/hosts looks like this-
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#127.0.1.1 localhost
192.168.111.72 master
192.168.111.65 worker1
192.168.111.66 worker2
hostname of worker1
hduser#worker1:/mnt/hdfs/datanode$ cat /etc/hostname
worker1
and worker2
hduser#worker2:/usr/local/hadoop/logs$ cat /etc/hostname
worker2
Also, probably you don't want to have "hadoop" hostname with loopback interface. i.e.
127.0.0.1 localhost hadoop
Check this point (1) in https://wiki.apache.org/hadoop/ConnectionRefused.
Thank you.
FIREWALL ISSUE:
java.net.ConnectException: Connection refused
This error might be due to firewall issues. Do this in terminal:
sudo apt-get install iptables-persistent
sudo iptables -L
sudo iptables-save > /usr/iptables-backup/iptables.v4.rules
Check whether the file is created before continuing (since this will be used to restore firewall if something goes wrong).
Now, flush iptable rules (i.e. stop firewall):
sudo iptables -F
Now try,
sudo iptables -L
This command should return no rules. Now, try to run your map/reduce job.
Note: If you want to restore iptables to previous condition, type this in terminal:
sudo iptables-restore < /usr/iptables-backup/iptables.v4.rules

regarding hbase running on hadoop in distributed mode

Hadoop version=2.4.1
hbase version=0.98.6
i have hadoop up and running prefectly fine on below conf:
107.108.86.119-hadoop namenode,SecondaryNameNode
107.109.155.100-datanode1
107.109.155.102-datanode2
now i install hbase as below conf:-
107.108.86.114:-hmaster,HQuorumPeer
107.109.155.100-regionserver1
107.109.155.102-regionserver2
when i do jps following process are running:
107.109.155.102:-hregionserver,datanode
107.109.155.100:-hregionserver,datanode
107.108.86.119:-NameNode,secondaryNameNode
107.108.86.114:-hmaster
but on doing status on hbase shell is showing "0 servers, 0 dead, NaN average load"
on entering cmd on hbase shell showing ERROR: java.io.IOException: Table Namespace Manager not ready yet, try again later
logs on regionserver showing:
regionserver.HRegionServer: reportForDuty to master=localhost,60000,1415007213689 with port=60020, startcode=1415007215055
regionserver.HRegionServer: error telling master we are up
my hbase-site.xml-
<property>
<name>hbase.master</name>
<value>107.108.86.114:60000</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://push-mcd2:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>107.108.86.114</value>
</property>
while /etc/hosts of hmaster is:
127.0.0.1 localhost arpita-ubuntu
127.0.1.1 arpita-ubuntu
107.109.155.100 push-ws1
107.109.155.102 push-ws2
107.108.86.114 push-mcd1
107.108.86.119 push-mcd2
WHILE slaves file are also almost similiar to above one.
conf/hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.22 export HBASE_CLASSPATH=/home/hadoop/hadoop-0.20.2/conf export HBASE_MANAGES_ZK=true
so what change i make so hbase will run on above cluster
Why does your regionserver log mentions that it is looking for HBase Master on localhost?
Form information above you have setup Master on a node different for either regionservers, please check your config is correct on each node.
logs on regionserver showing: regionserver.HRegionServer:
reportForDuty to master=localhost,60000,1415007213689 with port=60020,
startcode=1415007215055 regionserver.HRegionServer: error telling
master we are up
Also in /etc/hosts on each node please update first two lines from
127.0.0.1 localhost arpita-ubuntu
127.0.1.1 arpita-ubuntu
to
127.0.0.1 localhost
<Actual_IP_Address_for_Host> arpita-ubuntu
This is necessary if you don't have automatic dns name resolution in place.
Also please use IP instead of localhost in all config settings.
If you still face issues, check if the respective ports are open or not.
Hope this helps you.

access hbase in IDE Eclipse , java.net.UnknownHostException

When I write the java code to access hbase in IDE Eclipse, the messages "java.net.UnknownHostException" are always been shown.But hbase shell works well.
I install the hadoop and hbase on a single linux node in pseudo distribution mode. And my hostname is yzd. Here are the /etc/hosts and hbase-site.xml:
/etc/hosts:
127.0.0.1 localhost yzd
hbase-site.xml:
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Error message:
INFO [main] (HBaseRPC.java:117) - Using org.apache.hadoop.hbase.ipc.WritableRpcEngine for org.apache.hadoop.hbase.ipc.HMasterInterface
INFO [main] (HConnectionManager.java:596) - getMaster attempt 0 of 10 failed; retrying after sleep of 1000
java.net.UnknownHostException: unknown host: � 13846#yzdlocalhost
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.<init>(HBaseClient.java:224)
at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:954)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:816)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:141)
at com.sun.proxy.$Proxy4.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:174)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:295)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:272)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:324)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:579)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
at com.hbasebook.hush.schema.SchemaManager.process(SchemaManager.java:126)
at com.hbasebook.hush.HushMain.main(HushMain.java:57)
Check the version of your local hbase matches the one you are using as a dependency in your pom. This should solve your issue. I was facing the same issue, I was using hbase in standalone mode. I hope this helps you.
First of all yzd is not host name, its domain name (You should prefer FQDN). Now this line
java.net.UnknownHostException: unknown host: � 13846#yzdlocalhost
clearly says that 13846#yzdlocalhost host is not there. Now you can do followings:
Use IP address instead of hostname in both hbase-site.xml and core-site.xml and check
Then use FQDN in etc/hosts file and tab-separate the values, now you can replace the IP with FQDN

Resources