"Connection refused" Error for Namenode-HDFS (Hadoop Issue) - hadoop

All my nodes are up and running according to the jps command, but I am still unable to connect to the HDFS filesystem. Whenever I click "Browse the filesystem" on the Hadoop NameNode page (localhost:8020), I get a Connection Refused error. I have also tried formatting and restarting the NameNode, but the error persists. Can anyone please help me solve this issue?

Check whether all your services (JobTracker, NameNode, DataNode, TaskTracker) are running by using the jps command.
Try starting them one by one:
./bin/stop-all.sh
./bin/hadoop-daemon.sh start namenode
./bin/hadoop-daemon.sh start jobtracker
./bin/hadoop-daemon.sh start tasktracker
./bin/hadoop-daemon.sh start datanode
If you're still getting the error, stop them again and clean your temp storage directory. The directory is set in the config file ./conf/core-site.xml. Then run:
./bin/stop-all.sh
rm -rf /tmp/hadoop*
./bin/hadoop namenode -format
Check the logs in the ./logs folder.
tail -200 hadoop*jobtracker*.log
tail -200 hadoop*namenode*.log
tail -200 hadoop*datanode*.log
Hope it helps.
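The log check above can be narrowed to just the error lines. This is a minimal sketch, assuming the logs use the usual "date time LEVEL class: message" layout; the function name recent_errors is hypothetical and not part of Hadoop.

```shell
# Print the last N (default 20) ERROR or FATAL lines from each Hadoop log
# in a directory. The hadoop*.log pattern mirrors the tail commands above.
recent_errors() {
    dir=$1
    count=${2:-20}
    for f in "$dir"/hadoop*.log; do
        [ -f "$f" ] || continue          # skip when the pattern matched nothing
        echo "== $f"
        grep -E ' (ERROR|FATAL) ' "$f" | tail -n "$count"
    done
}
```

For example, recent_errors ./logs 50 prints up to 50 error lines per log file.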

HDFS may use port 9000 instead of 8020 under certain distributions or builds.
Please double-check your NameNode port.

In core-site.xml, find this property:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopvm:8020</value>
<final>true</final>
</property>
and change the hostname to the IP address:
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.132.129:8020</value>
<final>true</final>
</property>
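To confirm which host and port the client will actually use, the value can be read back from core-site.xml. This is a rough sketch, assuming <name> and <value> sit on separate adjacent lines as in the snippets above; namenode_uri and namenode_hostport are hypothetical helper names, and the fallback port 8020 is an assumption for configurations that omit the port.

```shell
# Print the value of fs.defaultFS (or the older fs.default.name)
# from the core-site.xml given as $1.
namenode_uri() {
    awk '
        /<name>fs\.(defaultFS|default\.name)<\/name>/ { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Print "host port" for the configured NameNode URI.
namenode_hostport() {
    uri=$(namenode_uri "$1")
    hostport=${uri#hdfs://}      # drop the scheme
    hostport=${hostport%%/*}     # drop any trailing path
    host=${hostport%%:*}
    case $hostport in
        *:*) port=${hostport##*:} ;;
        *)   port=8020 ;;        # assumed default when no port is given
    esac
    echo "$host $port"
}
```

If nc is installed, the result can be fed into a reachability test, for example: set -- $(namenode_hostport conf/core-site.xml); nc -z "$1" "$2" && echo reachable.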

Related

Hadoop: Secondary NameNode Permission Denied

I'm attempting to run Hadoop in pseudo-distributed mode to learn how the system works. To install it, I downloaded Hadoop-3.0.0 from the site and untarred it. My configuration is as follows (leaving out the configuration tags for brevity):
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
After doing this, I've formatted my hdfs using
hdfs namenode -format
I've also setup passwordless ssh using the following:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa2
cat ~/.ssh/id_rsa2.pub >> ~/.ssh/authorized_keys
(I've also added id_rsa2.pub as the default for localhost using a config file, since I already was using id_rsa.pub for something else and didn't want to mix-and-match in case I broke something)
I'm able to ssh into localhost. All looks well.
Then I run start-dfs.sh, and I see this error:
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [zm.local]
zm.local: zm@zm.local: Permission denied (publickey,password,keyboard-interactive).
2018-01-16 17:31:35,807 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
If I run jps (after starting yarn and mapreduce history server), I have the following:
37921 NodeManager
38070 Jps
37434 NameNode
38060 JobHistoryServer
37821 ResourceManager
Notably, the SecondaryNameNode is missing; my assumption is that this is due to the error above.
I can then use hadoop's fs command to create a folder and look it up. But if I try to copy any data over, I am told that the NameNode is in safe mode. If I turn off safe mode using:
hdfs dfsadmin -safemode leave
It immediately turns back on. By going to the namenode port on localhost, I see the following message:
Safe mode is ON. Resources are low on NN. Please add or free up more resourcesthen turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
However, I have plenty of resources. The single datanode is using less than 8% of its allotted space, and the namenode has almost 100GB of space. The datanode and namenode are both reporting as healthy. Thus, I think the problem is the lack of a secondary namenode. With that in mind, is anyone aware of what might be causing the SecondaryNameNode to have different permission issues from the PrimaryNameNode? It seems to be trying to put the sNN somewhere on the local machine instead, but when I check in /tmp/hadoop*, all of the file permissions seem to be normal.
Thanks for any help.

Not able to run dump in pig

I am trying to dump a relation but I get the following error.
I have tried start-all.sh and also tried formatting the namenode using hadoop namenode -format.
But I cannot work out what is wrong.
Error:
Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Start the JobHistoryServer
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
When run in mapreduce mode, Pig expects the JobHistoryServer to be available.
To configure the JobHistoryServer, add these properties to mapred-site.xml, replacing hostname with the actual name of the host where the process is started:
<property>
<name>mapreduce.jobhistory.address</name>
<value>hostname:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hostname:19888</value>
</property>
I would first make sure you can connect to the namenode from an HDFS client on an edge node. If you cannot, there is a problem or inconsistency in the namenode settings in core-site.xml, either with the port or the hostname.
Confirm that the command below runs without any issues, and check at http://namenode_host:50070 that the namenode is not in safe mode (which prevents any writes):
hadoop fs -ls /
Then I would proceed with Pig. Based on your error, the HDFS client is unable to reach the namenode, which could be a firewall or a config issue.

How to add a hard disk to hadoop

I installed Hadoop 2.4 on Ubuntu 14.04 and now I am trying to add an internal SATA hard disk to the existing cluster.
I have mounted the new disk at /mnt/hadoop and assigned its ownership to the hadoop user.
Then I tried to add it to the configuration file as follows:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode, file:///mnt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/datanode, file:///mnt/hadoop/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
Afterwards, I started the hdfs:
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-namenode-hadoop-Datastore.out
localhost: starting datanode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-datanode-hadoop-Datastore.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-secondarynamenode-hadoop-Datastore.out
It seems that it does not fire up the second hd
This is my core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
In addition I tried to refresh the namenode and I get a connection problem:
Refreshing namenode [localhost:9000]
refreshNodes: Call From hadoop-Datastore/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Error: refresh of namenodes failed, see error messages above.
In addition, I can't connect to the Hadoop web interface.
It seems that I have two problems:
1) A connection problem
2) I cannot use the newly installed hard disk
Are these problems related?
How can I fix these issues?
Thanks
EDIT
I can ping the localhost and I can access localhost:50090/status.jsp
However, I cannot access 50030 and 50070
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode, file:///mnt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
This is documented as:
Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
Are you sure you need this? Do you want your fsimage to be copied in both locations, for redundancy? And if yes, did you actually copy the fsimage on the new HDD before starting the namenode? See Adding a new namenode data directory to an existing cluster.
The new data directory (dfs.data.dir) is OK, the datanode should pick it up and start using it for placing blocks.
Also, as general troubleshooting advice, look into the namenode and datanode logs for more clues.
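Before digging through the logs, it can help to confirm that every directory listed in dfs.data.dir exists and is writable by the user running the daemon. This is a minimal sketch, assuming a comma-separated value with optional file:// prefixes like the one in the question; check_storage_dirs is a hypothetical helper name.

```shell
# Check every directory in a comma-separated dfs.data.dir style value.
# Prints "ok <dir>", "missing <dir>" or "not-writable <dir>" per entry.
check_storage_dirs() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        # Trim surrounding whitespace and strip the file:// scheme prefix.
        dir=$(printf '%s' "$entry" |
            sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's|^file://||')
        [ -n "$dir" ] || continue
        if [ ! -d "$dir" ]; then
            echo "missing $dir"
        elif [ ! -w "$dir" ]; then
            echo "not-writable $dir"
        else
            echo "ok $dir"
        fi
    done
}
```

Run it as the user that starts the datanode, since the writability test applies to the current user, for example: check_storage_dirs "file:///home/hadoop/hadoopdata/hdfs/datanode, file:///mnt/hadoop/hadoopdata/hdfs/datanode".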
Regarding your comment: "sudo chown -R hadoop.hadoop /usr/local/hadoop_store".
The owner has to be the hdfs user. Try:
sudo chown -R hdfs:hadoop /usr/local/hadoop_store

Unable to start nodemanager of Hadoop YARN at OS X 10.8

After starting all other nodes, when I try to start nodemanager, it seems it has been opened and then automatically terminated. Like the following:
Yitongs-MacBook-Pro:hadoop timyitong$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /Users/timyitong/Dev/hadoop/logs/yarn-timyitong-nodemanager-Yitongs-MacBook-Pro.local.out
Yitongs-MacBook-Pro:hadoop timyitong$ jps
8981 DataNode
9300 Jps
9139 JobHistoryServer
8932 NameNode
9038 ResourceManager
I don't get any error or exception, but the nodemanager is not there. When I try to stop it, I get the following (stopnodes.sh is just a script), which confirms that the nodemanager is not running:
Yitongs-MacBook-Pro:hadoop timyitong$ sh stopnodes.sh
stopping namenode
stopping datanode
stopping resourcemanager
no nodemanager to stop
stopping historyserver
I am not sure whether it is because the nodemanager is not started, but when I try to run the sample wordcount program, my task stays pending forever.
My environment is OS X 10.8, Hadoop YARN 2.2.0.
And I already solved the java version issue with export JAVA_HOME=$(/usr/libexec/java_home -v 1.6).
Actually, I used bin/yarn nodemanager to start the server directly and found the problem. It is in my yarn-site.xml: the value of yarn.nodemanager.aux-services must not contain dots (.), as in mapreduce.shuffle. After changing mapreduce.shuffle to mapreduce_shuffle, the problem was solved.
I really don't understand why dots are not allowed, since I configured everything according to this blog post, where this setting seems to be fine.
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
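The value mapreduce.shuffle should be mapreduce_shuffle; note the underscore instead of the dot. The NodeManager rejects auxiliary service names that contain anything other than letters, digits, and underscores, and the matching class property then becomes yarn.nodemanager.aux-services.mapreduce_shuffle.class. Also have a look at http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce.html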

Namenode not getting started

I was using Hadoop in pseudo-distributed mode and everything was working fine. Then I had to restart my computer for some reason, and now when I try to start the Namenode and Datanode, only the Datanode is running. Could anyone tell me the possible reason for this problem? Or am I doing something wrong?
I tried both bin/start-all.sh and bin/start-dfs.sh.
I was facing the issue of the namenode not starting. I found a solution using the following steps:
first delete all contents of the temporary folder: rm -Rf <tmp dir> (mine was /usr/local/hadoop/tmp)
format the namenode: bin/hadoop namenode -format
start all processes again: bin/start-all.sh
You may also consider rolling back using a checkpoint (if you had it enabled).
hadoop.tmp.dir in core-site.xml defaults to /tmp/hadoop-${user.name}, and on many systems /tmp is cleaned on every reboot. Change this to some other directory which doesn't get cleaned on reboot.
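Whether an installation is affected can be checked by reading hadoop.tmp.dir back from core-site.xml. This is a small sketch, assuming <name> and <value> are on separate adjacent lines; hadoop_tmp_dir and under_tmp are hypothetical helper names, not Hadoop commands.

```shell
# Print the configured hadoop.tmp.dir from the core-site.xml given as $1,
# or the documented default when the property is not set.
hadoop_tmp_dir() {
    value=$(awk '
        /<name>hadoop\.tmp\.dir<\/name>/ { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1")
    if [ -z "$value" ]; then
        value='/tmp/hadoop-${user.name}'   # literal default, left unexpanded
    fi
    echo "$value"
}

# Succeed (exit 0) when the given path is /tmp or lies beneath it.
under_tmp() {
    case $1 in
        /tmp|/tmp/*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

For example: under_tmp "$(hadoop_tmp_dir conf/core-site.xml)" && echo "hadoop.tmp.dir lives under /tmp".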
The following steps worked for me with hadoop 2.2.0:
STEP 1 stop hadoop
hduser@prayagupd$ /usr/local/hadoop-2.2.0/sbin/stop-dfs.sh
STEP 2 remove tmp folder
hduser@prayagupd$ sudo rm -rf /app/hadoop/tmp/
STEP 3 create /app/hadoop/tmp/
hduser@prayagupd$ sudo mkdir -p /app/hadoop/tmp
hduser@prayagupd$ sudo chown hduser:hadoop /app/hadoop/tmp
hduser@prayagupd$ sudo chmod 750 /app/hadoop/tmp
STEP 4 format namenode
hduser@prayagupd$ hdfs namenode -format
STEP 5 start dfs
hduser@prayagupd$ /usr/local/hadoop-2.2.0/sbin/start-dfs.sh
STEP 6 check jps
hduser@prayagupd$ jps
11342 Jps
10804 DataNode
11110 SecondaryNameNode
10558 NameNode
In conf/hdfs-site.xml, you should have a property like
<property>
<name>dfs.name.dir</name>
<value>/home/user/hadoop/name/data</value>
</property>
The property "dfs.name.dir" allows you to control where Hadoop writes NameNode metadata.
Giving it a directory other than /tmp makes sure the NameNode data isn't deleted when you reboot.
Open a new terminal and start the namenode using path-to-your-hadoop-install/bin/hadoop namenode
Then check with jps; the namenode should be running.
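The jps check can be scripted so that a missing daemon stands out. This is a minimal sketch, assuming the usual "pid name" layout of jps output; missing_daemons is a hypothetical helper, and the list of expected daemons depends on your Hadoop version and setup.

```shell
# Read `jps` output on stdin and print each expected daemon that is absent.
# The expected daemon names are passed as arguments.
missing_daemons() {
    running=$(awk '{ print $2 }')        # second column is the process name
    for name in "$@"; do
        printf '%s\n' "$running" | grep -qxF "$name" || printf '%s\n' "$name"
    done
}
```

For example: jps | missing_daemons NameNode DataNode SecondaryNameNode prints nothing when all three are up.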
Why do most answers here assume that all data needs to be deleted and reformatted before restarting Hadoop?
How do we know the namenode is stuck, rather than making progress but taking a long time?
It will take a long time when there is a large amount of data in HDFS.
Check progress in the logs before assuming anything is hung or stuck.
[kadmin@hadoop-node-0 logs]$ tail hadoop-kadmin-namenode-hadoop-node-0.log
...
2016-05-13 18:16:44,405 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 117/141 transactions completed. (83%)
2016-05-13 18:16:56,968 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 121/141 transactions completed. (86%)
2016-05-13 18:17:06,122 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 122/141 transactions completed. (87%)
2016-05-13 18:17:38,321 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 123/141 transactions completed. (87%)
2016-05-13 18:17:56,562 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 124/141 transactions completed. (88%)
2016-05-13 18:17:57,690 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 127/141 transactions completed. (90%)
This was after nearly an hour of waiting on a particular system.
It is still progressing each time I look at it.
Have patience with Hadoop when bringing up the system and check logs before assuming something is hung or not progressing.
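To watch that progress without reading the whole log, the latest "replaying edit log" line can be reduced to its counters. This is a small sketch based on the log format shown above; edit_log_progress is a hypothetical helper name.

```shell
# Print the most recent edit-log replay progress from the NameNode log
# given as $1, in the form "done/total (percent)".
edit_log_progress() {
    grep 'replaying edit log:' "$1" | tail -n 1 |
        sed -E 's#.*replaying edit log: ([0-9]+/[0-9]+) transactions completed\. \(([0-9]+%)\).*#\1 (\2)#'
}
```

Running it repeatedly, for example every minute, shows whether the counters are still moving.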
In core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/yourusername/hadoop/tmp/hadoop-${user.name}</value>
</property>
</configuration>
and then formatting the namenode with:
hdfs namenode -format
This worked for hadoop 2.8.1.
If anyone is using hadoop 1.2.1 and is not able to run the namenode, go to core-site.xml and change dfs.default.name to fs.default.name.
Then format the namenode using $hadoop namenode -format.
Finally, start hdfs using start-dfs.sh and check for the services using jps.
Did you change dfs.name.dir in conf/hdfs-site.xml?
Format the namenode after you change it.
$ bin/hadoop namenode -format
$ bin/start-all.sh
If you face this issue after rebooting the system, the steps below should work.
As a workaround:
1) format the namenode: bin/hadoop namenode -format
2) start all processes again: bin/start-all.sh
As a permanent fix:
1) go to /conf/core-site.xml and change fs.default.name to your custom one.
2) format the namenode: bin/hadoop namenode -format
3) start all processes again: bin/start-all.sh
I faced the same problem.
(1) Always check for typing mistakes in the .xml configuration files, especially in the xml tags.
(2) Go to the bin dir and type ./start-all.sh
(3) Then type jps to check whether the processes are running.
Add hadoop.tmp.dir property in core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/yourname/hadoop/tmp/hadoop-${user.name}</value>
</property>
</configuration>
and format hdfs (hadoop 2.7.1):
$ hdfs namenode -format
The default value in core-default.xml is /tmp/hadoop-${user.name}, which will be deleted after reboot.
Try this,
1) Stop all hadoop processes : stop-all.sh
2) Remove the tmp folder manually
3) Format namenode : hadoop namenode -format
4) Start all processes : start-all.sh
If you kept the default configuration when running hadoop, the namenode web interface uses port 50070. You will need to find any processes running on this port and kill them first.
Stop all running hadoop daemons with: bin/stop-all.sh
Check all processes running on port 50070:
sudo netstat -tulpn | grep :50070 # if any process is using port 50070, its PID/program name appears at the right-hand side of the output
sudo kill -9 <process_id> # kill the process
sudo rm -r /app/hadoop/tmp # delete the temp folder
sudo mkdir /app/hadoop/tmp # recreate it
sudo chmod -R 777 /app/hadoop/tmp # 777 is given for this example only
bin/hadoop namenode -format # format the hadoop namenode
bin/start-all.sh # start all hadoop services
Refer to this blog.
The following worked for me after I changed the namenode and datanode directories in hdfs-site.xml.
Before executing these steps, stop all services with stop-all.sh (in my case I used stop-dfs.sh to stop the dfs).
In the newly configured directory on every node (namenode and datanode), delete every folder and file inside it (in my case a 'current' directory).
Delete the Hadoop temporary directory: $rm -rf /tmp/hadoop-$USER
Format the Namenode: hadoop/bin/hdfs namenode -format
Run start-dfs.sh
After I followed those steps, my namenode and datanodes were alive and using the newly configured directory.
I ran $hadoop namenode to start the namenode manually in the foreground.
From the logs I figured out that port 50070 was occupied, which is the default used by dfs.namenode.http-address. After configuring dfs.namenode.http-address in hdfs-site.xml, everything went well.
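To see which process is holding a port before changing the configuration, the netstat listing can be filtered for that port. This is a rough sketch, assuming the Linux netstat -tulpn column layout (local address in column 4, state in column 6, PID/program in column 7); listener_on_port is a hypothetical helper name.

```shell
# Read `netstat -tulpn` output on stdin and print the PID/program
# of any TCP listener bound to the port given as $1.
listener_on_port() {
    awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }'
}
```

For example: sudo netstat -tulpn | listener_on_port 50070. An empty result suggests nothing is listening on that port.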
I found a solution that may help others who get this error:
1. First check hdfs-site.xml under /home/hadoop/etc/hadoop and verify the paths of the namenode and datanode:
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
</property>
2. Check the permissions, group, and owner of the namenode and datanode directories (for example /home/hadoop/hadoopdata/hdfs/datanode), and correct any mismatch. For example, chown -R hadoop:hadoop in_use.lock changes the user and group, and
chmod -R 755 <file_name> changes the permissions.
After deleting the resource manager's data folder, the problem was gone.
Formatting alone did not solve this problem.
If your namenode is stuck in safemode, you can ssh to the namenode, su to the hdfs user, and run the following command to turn off safemode:
hdfs dfsadmin -fs hdfs://server.com:8020 -safemode leave
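Before and after leaving safemode, the current state can be checked from the output of hdfs dfsadmin -safemode get. This is a minimal sketch, assuming that output contains the text "Safe mode is ON" or "Safe mode is OFF"; safemode_on is a hypothetical helper name.

```shell
# Read `hdfs dfsadmin -safemode get` output on stdin;
# exit 0 if the NameNode reports that safe mode is ON.
safemode_on() {
    grep -q 'Safe mode is ON'
}
```

For example: hdfs dfsadmin -safemode get | safemode_on && echo "still in safe mode". If safe mode switches back on right after leaving it, the namenode logs usually state the reason.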
Instead of formatting the namenode, maybe you can use the commands below to restart it. This worked for me:
sudo service hadoop-master restart
hadoop dfsadmin -safemode leave
I was facing the same issue of the namenode not starting with Hadoop 3.2.1. I took these steps to resolve it:
1. Delete the contents of the name node directory. In my case it was the "current" directory, created by the root user: rm -rf (dir name)
2. Format the namenode: hdfs namenode -format
3. Start the processes again: start-dfs.sh
The directory in point 1 is the one set in the hdfs-site.xml file:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///opt/hadoop/node-data/hdfs/namenode</value>
</property>
I ran into the same thing after a restart.
For hadoop-2.7.3, all I had to do was format the namenode:
<HadoopRootDir>/bin/hdfs namenode -format
Then the jps command shows:
6097 DataNode
755 RemoteMavenServer
5925 NameNode
6293 SecondaryNameNode
6361 Jps
