Start Hadoop 50075 Port is not resolved

I have installed Hadoop on my system. The JobTracker page at localhost:50030/jobtracker.jsp works fine, but localhost:50075/ cannot be resolved. Can anybody help me figure out what the problem is on my Ubuntu system? Below is my code-site.xml configuration:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

I have never seen 50075 before, but 50070 is the local NameNode; I suggest you format the NameNode and try again:
rm -r /tmp/hadoop-*;
bin/hadoop namenode -format;
./bin/start-all.sh

Check your port configurations. Here is a list of the configurable Hadoop daemon parameters:
dfs.http.address: The address and the base port where the dfs namenode web ui will listen on. If the port is 0 then the server will start on a free port.
The default for the NameNode web UI is port 50070, so try localhost:50070.
http://hadoop.apache.org/common/docs/r0.20.2/hdfs-default.html
I think you mean core-site.xml (not code-site.xml). The fs.default.name setting here does not determine the location of the Hadoop web UI/dashboard; it is the port used by DataNodes to communicate with the NameNode.
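For reference, a minimal hdfs-site.xml sketch (assuming the 0.20-era property names from the linked hdfs-default.html; both values shown are already the defaults) that makes the two web UI ports explicit:
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:50070</value>
<description>NameNode web UI; browse it at http://localhost:50070</description>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>
<description>DataNode web UI; this is where port 50075 comes from</description>
</property>
Since these are the defaults you normally do not need to set them at all; the point is that the web UI ports live in hdfs-site.xml, not in fs.default.name.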

Related

Various ports in hadoop cluster?

I am trying to understand the various ports at which the different daemons/processes listen in a Hadoop cluster.
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://master.hadoop.cluster:54310</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.address</name>
<value>master.hadoop.cluster:8032</value>
</property>
I see we have three other ports, which are:
1) 50070 --> to see the HDFS GUI
2) 8088 --> to see the RM GUI
3) 8042 --> not sure which GUI we can see at this port
As there are so many ports, I am not clear which port is meant for which purpose. If I make an HTTP request to a port, say 8032, it says this is a Hadoop IPC port.
Can anyone help me understand which port numbers we should be aware of, and which processes listen on those ports?
The port defined in fs.defaultFS is for file system metadata operations. You cannot use it for accessing the Web UI.
8042 is for NodeManager Web UI and 8032 is for ResourceManager IPC.
Refer to:
hdfs-default.xml - for HDFS related ports
yarn-default.xml - for YARN related ports
mapred-default.xml - for JobHistoryServer (JHS) related ports.
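As a practical check, you can also map listening ports back to daemon PIDs on a node. A sketch, assuming lsof is installed and using the ports mentioned above (adjust the list to your configuration):
# For each port of interest, show which process (if any) is listening on it.
for port in 50070 8088 8042 8032 54310; do
  echo "== port $port =="
  sudo lsof -iTCP:$port -sTCP:LISTEN
done
Matching the PIDs against the output of jps tells you which daemon (NameNode, ResourceManager, NodeManager, ...) owns each port.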

Why can't I access http://hadoop-master:50070 when I define dfs.namenode.http-address?

The Hadoop version is 2.7.1.
I modified hdfs-site.xml and added two properties:
<property>
<name>dfs.namenode.http-address</name>
<value>HADOOP-MASTER:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>HADOOP-SLAVE-1:50090</value>
</property>
Then I restarted the Hadoop cluster, but I can't access http://hadoop-master:50070, even though the NameNode process is alive:
[hadoop#HADOOP-MASTER ~]$ lsof -i:50070
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 26541 hadoop 184u IPv4 1261606 0t0 TCP HADOOP-MASTER:50070 (LISTEN)
But when I remove the dfs.namenode.http-address property, port 50070 works again.
So, what does the dfs.namenode.http-address property mean? I guess it defines which nodes can access it?
That value is defined as "The address and the base port where the dfs namenode web ui will listen on" and defaults to 0.0.0.0:50070, which means it is publicly accessible to all machines that can reach it.
Notice that it says address, not hostname. If you need to change this value from the default, use an IP address, not a physical machine name.
Source: https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
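For example, a minimal hdfs-site.xml sketch that keeps the web UI bound on all interfaces (0.0.0.0:50070 is the documented default, so this is effectively the same as removing the property):
<property>
<name>dfs.namenode.http-address</name>
<value>0.0.0.0:50070</value>
</property>
If you do need to pin the UI to one interface, put that interface's IP address in the value (for example a hypothetical 192.168.1.10:50070) rather than the machine name.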

Datanode not allowed to connect to the Namenode in Hadoop 2.3.0 cluster

I am trying to set up an Apache Hadoop 2.3.0 cluster. I have a master and three slave nodes; the slave nodes are listed in the $HADOOP_HOME/etc/hadoop/slaves file, and I can telnet from the slaves to the master NameNode on port 9000. However, when I start the DataNode on any of the slaves, I get the following exception:
2014-08-03 08:04:27,952 FATAL
org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed
for block pool Block pool BP-1086620743-xx.xy.23.162-1407064313305
(Datanode Uuid null) service to
server1.mydomain.com/xx.xy.23.162:9000
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException):
Datanode denied communication with namenode because hostname cannot be
resolved .
The following are the contents of my core-site.xml.
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://server1.mydomain.com:9000</value>
</property>
</configuration>
Also in my hdfs-site.xml I am not setting any value for dfs.hosts or dfs.hosts.exclude properties.
Thanks.
Each node needs a fully qualified, unique hostname.
Your error says:
hostname cannot be resolved
Check the /etc/hosts file (cat /etc/hosts) on each of your slaves and make sure every node has a distinct hostname that resolves.
After that, try again.
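A minimal sketch of what /etc/hosts might look like on every node in the cluster; the addresses and the slave hostnames below are illustrative, so substitute your real ones and keep the entries identical on all nodes:
# /etc/hosts (example layout; IPs and slave names are placeholders)
127.0.0.1      localhost
192.168.1.10   server1.mydomain.com   server1   # master / NameNode
192.168.1.11   server2.mydomain.com   server2   # slave 1 / DataNode
192.168.1.12   server3.mydomain.com   server3   # slave 2 / DataNode
192.168.1.13   server4.mydomain.com   server4   # slave 3 / DataNode
With hostname resolution consistent across all nodes, the DisallowedDatanodeException caused by an unresolvable hostname should go away.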

How to change the address the 'hadoop jar' command connects to?

I have been trying to start a MapReduce job on my cluster with the following command:
bin/hadoop jar myjar.jar MainClass /user/hduser/input /user/hduser/output
But I get the following error over and over again, until connection is refused:
13/08/08 00:37:16 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
I then checked with netstat to see if the service was listening to the correct port:
~> sudo netstat -plten | grep java
tcp 0 0 10.1.1.4:54310 0.0.0.0:* LISTEN 10022 38365 11366/java
tcp 0 0 10.1.1.4:54311 0.0.0.0:* LISTEN 10022 32164 11829/java
Now I notice that my service is listening on 10.1.1.4:54310, which is the IP of my master, but it seems that the 'hadoop jar' command is connecting to 127.0.0.1 (the localhost, which is the same machine) and therefore doesn't find the service. Is there any way to force 'hadoop jar' to look at 10.1.1.4 instead of 127.0.0.1?
My NameNode, DataNode, JobTracker, TaskTracker, ... are all running. I even checked for DataNode and TaskTracker on the slaves and it all seems to be working. I can check the WebUI on the master and it shows my cluster is online.
I expect the problem to be DNS related, since it seems that the 'hadoop jar' command finds the correct port but always uses the 127.0.0.1 address instead of 10.1.1.4.
UPDATE
Configuration in core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
Configuration in mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
Configuration in hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
Although it seemed to be a DNS issue, it was actually Hadoop trying to resolve a reference to localhost in the code. I was deploying someone else's jar and assumed it was correct. Upon further inspection I found the reference to localhost and changed it to master, which solved my issue.
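If you run into the same thing, a quick illustrative check for a hardcoded filesystem URI in a jar is to grep its unpacked contents for the string; this assumes unzip and grep are available, and myjar.jar is the jar from the command above:
# String constants such as "hdfs://localhost:54310" are stored as plain UTF-8
# inside .class files, so a recursive grep over the unpacked jar will find them.
unzip -o -q myjar.jar -d /tmp/myjar-contents
grep -rl "localhost" /tmp/myjar-contents
Any class file listed is worth inspecting for a hardcoded Configuration value that overrides core-site.xml.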

"Connection refused" Error for Namenode-HDFS (Hadoop Issue)

All my nodes appear up and running when I check with the jps command, but I am still unable to connect to the HDFS filesystem. Whenever I click "Browse the filesystem" on the Hadoop NameNode localhost:8020 page, the error I get is "Connection Refused". I have also tried formatting and restarting the NameNode, but the error persists. Can anyone please help me solve this issue?
Check whether all your services (JobTracker, NameNode, DataNode, TaskTracker) are running by using the jps command.
Try stopping everything and starting the daemons one by one:
./bin/stop-all.sh
./bin/hadoop-daemon.sh start namenode
./bin/hadoop-daemon.sh start jobtracker
./bin/hadoop-daemon.sh start tasktracker
./bin/hadoop-daemon.sh start datanode
If you're still getting the error, stop them again and clean your temp storage directory. The directory details are in the config file ./conf/core-site.xml. Then run:
./bin/stop-all.sh
rm -rf /tmp/hadoop*
./bin/hadoop namenode -format
Check the logs in the ./logs folder.
tail -200 hadoop*jobtracker*.log
tail -200 hadoop*namenode*.log
tail -200 hadoop*datanode*.log
Hope it helps.
HDFS may use port 9000 under certain distributions/builds.
Please double-check your NameNode port.
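For instance, a quick way to see which port you configured (a sketch; the path assumes a tarball install with the configs under ./conf):
# The authority part of fs.default.name is the NameNode's RPC host:port.
grep -A1 "fs.default.name" ./conf/core-site.xml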
Change core-site.xml from:
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopvm:8020</value>
<final>true</final>
</property>
to the IP address:
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.132.129:8020</value>
<final>true</final>
</property>
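Before and after changing the value, it is worth verifying which address the NameNode actually bound and whether it is reachable. A sketch, assuming netstat and telnet are installed; replace the host and port with whatever your fs.default.name says:
jps                                # confirm the NameNode process is running
sudo netstat -plten | grep java    # see which host:port each Hadoop daemon bound to
telnet 192.168.132.129 8020        # raw TCP connect; "Connection refused" means nothing is listening there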
