Cannot connect slave1:8088 in hadoop 2.7.2 - hadoop

I am new in hadoop and I have installed hadoop 2.7.2 into two machines which are master and slave1. I have followed this tutorial. It was not mentioned in the tutorial but I have also edited JAVA_HOME and HADOOP_CONF_DIR variables in hadoop-env.sh. At the end I have two machines hadoop installed. In master NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager are running and in slave1 DataNode and NodeManager are running.
I am able to go to master:8088 in the browser and when I go http://master:8088/cluster/nodes, there is only master node here. I am not able to go isci17:8088 and that is not a live node. Why could it be?

Port 8088 is the resource manager web ui port, so if it is running on master you probably won't have it on the slave.
You should be able to also go to the name node web ui on port 50070 on your name node as well to see status such as http://master:50070/ and the MapReduce JobHistory Server at http://hostname:19888/ for a web ui.
If you have access to a terminal session you run the following command on each server as root/sudo user to see which ports are listening on which server in a Linux terminal session;
sudo lsof -i tcp | grep -i LISTEN
You also also run hadoop cli commands that will give you info;
You can run the following to check hadoops ports in a terminal session.
hdfs portmap
Other Health checks on command line;
hdfs classpath
hdfs getconf -namenodes
hdfs dfsadmin -report -live
hdfs dfsadmin -report -dead
hdfs dfsadmin -printTopology
Depending on if the hadoop cli command works automatically you might have to find the executable to run ./hdfs. Also depending on distro/version you might have to replace the command hdfs with the command hadoop.
If you want to see your cluster configurations check your /etc/hadoop/conf folder along with /etc/hadoop/hive . You will find about 5-10 *-site.xml files. There configuration files contain your clusters configuration with the hostnames and ports.

Related

Preferred way to start a hadoop datanode

Functionally, I was wondering whether there is a difference between starting a hadoop namenode with the command:
$HADOOP_HOME/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
and:
hdfs datanode
The first command gives me an error stating that the datanode cannot ssh into itself(this is running on a docker container), while the second command seems to run without that issue. The official hadoop documentation for this version, (2.9.1) doesn't mention "hdfs datanode" as a way to start a datanode.

Cannot start running on browser the namenode for Hadoop

It is my first time in installing Hadoop on my Linux (Fedora distro) running on VM (using Parallel on my Mac). And I followed every step on this video and including the textual version of it.And then when I run it on localhost (or the equivalent value from hostname) in port 50070, I got the following message.
...can't establish a connection to the server at localhost:50070
When I run the jps by the way command I don't have the datanode and namenode unlike at the end of the textual version tutorial which has the following:
While mine has only the following processes running:
6021 NodeManager
3947 SecondaryNameNode
5788 ResourceManager
8941 Jps
When I run the hadoop namenode command I have some of the following [redacted] error:
Cannot access storage directory /usr/local/hadoop_store/hdfs/namenode
16/10/11 21:52:45 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop_store/hdfs/namenode is in an inconsistent state: storage directory does not exist or is not accessible.
I tried to access by the way the above mentioned directories and it existed.
Any hint for this newbie? ;-)
You would need to give read and write permission to user with which you are running the services on directory /usr/local/hadoop_store/hdfs/namenode.
Once done, you should run format command using hadoop namenode -format
Then try to start your services.
delete files /app/hadoop/tmp/*
and try again formatting the namenode and then start-dfs.sh & start-yarn.sh

Ambari show namenode is stop but actually namenode is still working

We are using HDP 2.7.1.2.3 with Ambari 2.1.2
After finish setup, every node status is correct.
But oneday ambari suddenly show namdenode is stopped.(we don't change any config of ambari or namenode)
However, we still can use HBASE and run MapReduce.
we think name node status should be normal.
We try to restart namenode and check ambari-server log
It shows:
ServiceComponentHostImpl:949 - Host role transitioned to a new state, serviceComponentName=NAMENODE, oldState=STARTING, currentState=STARTED
HeartBeatHandler:657 - State of service component NAMENODE of service HDFS of cluster wae has changed from STARTED to INSTALLED
we don't understand why its status change from "STARTED" to "INSTALLED".
In namenode side, we check ambari-agent.log
It shows one warning:
[Alert][namenode_directory_status] HA nameservice value is present but there are no aliases for {{hdfs-site/dfs.ha.namenodes.{{ha-nameservice}}}}
We think it is irrelevant.
What's the reason that ambari think namenode is stopped?
Is there any way that we can fix this issue?
Run the command ambari-server restart from linux terminal in Ambari server node
Run the command ambari-agent restart from linux terminal in all the nodes in the cluster.
You can run the command hdfs dfsadmin -report from the terminal as hdfs user to confirm all the nodes are up and running.

Name node is not displaying iwhen I hit JPS command

17223 JobTracker
16897 DataNode
17518 Jps
17451 TaskTracker
17129 SecondaryNameNode
8571 FsShell
Name node is not displaying
Seems like you are using the same user for starting all users, so If namenode is coming in the jps output, Probably namenode daemons might be got killed to not started properly. you may use the following command for ensuring namenode process running or not
ps aux | grep -i namenode
If not running you may need to format your namenode before starting hdfs service, stop all hdfs deamons using stop-dfs.sh script then format your namenode using the below command and start HDFS using the start-dfs.sh script.
hadoop namenode -format
Go through the below SO post if you are hitting the below situation.
Hadoop namenode needs to be formatted after every computer start
If you are looking to check all running JVMs on the host via 'jps',
you need to run the command as root. Otherwise, 'jps' will only show
JVMs running as your currently logged-in user.
Please see this link for more:
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/1dlxmB_GVuU
Should always check the logs.. :)

How to check if hdfs is running?

I would like to see if the hdfs file system for Hadoop is working properly. I know that jps lists the daemons that are running, but I don't actually know which daemons to look for.
I ran the following commands:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start namenode
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode
$HADOOP_PREFIX/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_PREFIX/sbin/yarn-daemon.sh start nodemanager
Only namenode, resourcemanager, and nodemanager appeared when I entered jps.
Which daemons are supposed to be running in order for hdfs/Hadoop to function? Also, what could you do to fix hdfs if it is not running?
Use any of the following approaches for to check your deamons status
JPS command would list all active deamons
the below is the most appropriate
hadoop dfsadmin -report
This would list down details of datanodes which is basically in a sense your HDFS
cat any file available in hdfs path.
So, I spent two weeks validating my setup (it was fine) , finally found this command:
sudo -u hdfs jps
Initially my simple JPS command was showing only one process, but Hadoop 2.6 under Ubuntu LTS 14.04 was up. I was using 'Sudo' to run the startup scripts.
Here is the startup that work with JPS listing multiple processes:
sudo su hduser
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/start-yarn.sh

Resources