Hadoop-Installation-Multinode - hadoop

Hi all, I am trying to set up a multi-node Hadoop installation. Everything works fine except that the YARN NodeManager does not stay up. When I looked at the NodeManager log file, I found the following:
"org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl:
Initialized nodemanager for null: physical-memory=-1 virtual-memory=-2
virtual-cores=-1"
I have no idea why it is not showing the actual memory and virtual cores. My VM has 8 GB of memory and 8 vCPUs. Because of the values above, I am getting this error:
"org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved
SHUTDOWN signal from Resourcemanager ,Registration of NodeManager
failed, Message from ResourceManager: NodeManager from SFeUbuntuVM2
doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the
NodeManager"
Can someone help me out with this issue?

Check that you have:
SELinux disabled
the firewall disabled
Then check your configuration files.
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>{your host name}</value>
</property>
After that, format your NameNode and start all services again.
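For context on the error in the question: the phrase "doesn't satisfy minimum allocations" refers to two scheduler settings read by the ResourceManager. As far as I know, a NodeManager is refused when the memory or vcores it reports are below these values, which a node reporting -1 always is. The fragment below spells them out with what I believe are the stock defaults (1024 MB and 1 vcore). It is shown for reference only, since lowering them would not help with negative values.

```xml
<!-- yarn-site.xml on the ResourceManager host; values shown are the assumed defaults -->
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
```

The node-side values they are compared against are yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, set in yarn-site.xml on each NodeManager host.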

Related

Fatal error of "failed to become active master" while running hbase in cluster mode

I have 4 nodes: one master and 3 slaves.
Master: *.*.*.18; slaves: *.*.*.12, 104, 36.
Configurations for Hadoop on Namenode:
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hduser/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hduser/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
hadoop-env.sh:
export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_PID_DIR=${HADOOP_PID_DIR} # defaults to /tmp
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_IDENT_STRING=$USER
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
</configuration>
slaves:
10.0.3.12
10.0.3.36
10.0.3.104
yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8050</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
In the slave nodes the configurations for hadoop are:
yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>10.0.3.18:8050</value>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>localhost:8035</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
The rest of the files are the same on all the slave nodes as on the master node. For the HBase configuration:
hbase-env.sh(in all):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"
export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
export HBASE_MANAGES_ZK=true
hbase-site.xml(in all):
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>10.0.3.18,10.0.3.12,10.0.3.104,10.0.3.36</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/Downloads/hbase/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>1200000</value>
</property>
<property>
<name>hbase.zookeeper.property.tickTime</name>
<value>6000</value>
</property>
</configuration>
The only difference is that on the slaves, localhost is changed to 10.0.3.18 (the address of the NameNode).
regionservers:
10.0.3.12
10.0.3.104
10.0.3.36
I formatted the NameNode, and when I start HDFS and YARN with start-dfs.sh and start-yarn.sh, the output is as follows:
...successfully formatted namenode...
localhost: starting namenode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-namenode-saichanda-OptiPlex-9020.out
10.0.3.12: starting datanode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-datanode-aaron.out
10.0.3.36: starting datanode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-datanode-dmacs-OptiPlex-9020.out
10.0.3.104: starting datanode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-datanode-hadoop-104.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hduser/Downloads/hadoop/logs/hadoop-hduser-secondarynamenode-saichanda-OptiPlex-9020.out
starting yarn daemons
starting resourcemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-resourcemanager-saichanda-OptiPlex-9020.out
10.0.3.12: starting nodemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-nodemanager-aaron.out
10.0.3.36: starting nodemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-nodemanager-dmacs-OptiPlex-9020.out
10.0.3.104: starting nodemanager, logging to /home/hduser/Downloads/hadoop/logs/yarn-hduser-nodemanager-hadoop-104.out
When I run the jps command on the master:
28032 SecondaryNameNode
28481 Jps
28198 ResourceManager
27720 NameNode
When I run the jps command on the slaves:
11303 DataNode
11595 Jps
11436 NodeManager
Then I started HBase with the command ./start-hbase.sh. The output is:
10.0.3.12: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-aaron.out
10.0.3.36: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-dmacs-OptiPlex-9020.out
10.0.3.104: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-hadoop-104.out
10.0.3.18: running zookeeper, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-zookeeper-saichanda-OptiPlex-9020.out
running master, logging to /home/hduser/Downloads/hbase/logs/hbase-hduser-master-saichanda-OptiPlex-9020.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
10.0.3.12: running regionserver, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-regionserver-aaron.out
10.0.3.36: running regionserver, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-regionserver-dmacs-OptiPlex-9020.out
10.0.3.104: running regionserver, logging to /home/hduser/Downloads/hbase/bin/../logs/hbase-hduser-regionserver-hadoop-104.out
10.0.3.12: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
10.0.3.12: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
10.0.3.36: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
10.0.3.36: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
10.0.3.104: OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
10.0.3.104: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
When I run jps on the NameNode host:
28032 SecondaryNameNode
28821 HQuorumPeer
29126 Jps
28198 ResourceManager
27720 NameNode
When I run jps on the slaves:
11776 HRegionServer
11669 HQuorumPeer
11303 DataNode
11899 Jps
11436 NodeManager
What I observed is that HMaster is not running on the NameNode host. Can anyone help me understand why HMaster is crashing? After some time, even the NodeManager crashes on the slaves. I also observed that when I shut down HBase, the HRegionServers on the slaves do not go down; they keep running even after I run stop-hbase.sh on the master node. The key warnings and errors in my logs are as follows.
hadoop-namenode.log: I get this exception multiple times...
java.io.IOException: File /hbase/.tmp/hbase.version could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
hadoop-secondary-namenode.log: I get this error multiple times...
ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
No error found in yarn-resourcemanager.log.
For hbase logs: in hbase-master.log:
FATAL [saichanda-OptiPlex-9020:16000.activeMasterManager] master.HMaster: Failed to become active master
File /hbase/.tmp/hbase.version could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
In hbase-zookeeper.log I see this line; there were no errors in the log as such.
019-01-29 10:09:49,431 INFO [main] server.NIOServerCnxnFactory: binding to port 0.0.0.0/0.0.0.0:2181
On one of the slaves, regionserver.log shows:
client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
On one of the slaves, hadoop-datanode.log repeats the following warning:
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: localhost/127.0.0.1:9000
Among all the warnings and errors above, the one in hbase-master.log seems the most critical to me, where it says "replicated to 0 nodes instead of minReplication (=1)". Please help me solve this issue.
Also, when I finally run the hbase shell, I get the error:
ERROR: Can't get master address from ZooKeeper; znode data == null
Thank you.
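A possible reading of the DataNode warning above: with fs.defaultFS set to hdfs://localhost:9000 on every node, each DataNode tries to reach a NameNode on its own loopback address, which would explain "There are 0 datanode(s) running". A minimal core-site.xml sketch for all nodes, assuming the NameNode runs on 10.0.3.18 as described in the question:

```xml
<configuration>
<property>
<name>fs.defaultFS</name>
<!-- NameNode address reachable from every node, instead of localhost -->
<value>hdfs://10.0.3.18:9000</value>
</property>
</configuration>
```

If the NameNode itself only listens on 127.0.0.1 because its own core-site.xml says localhost, the same change would be needed on the master as well, and hbase.rootdir would then use the same host and port.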

Configure Yarn with Hadoop 2.7.4 resources issue

I have configured Hadoop 2.7.4 by following this tutorial. DataNode, NameNode and SecondaryNameNode are working properly.
But when I run YARN, the NodeManager goes down with the following message:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved
SHUTDOWN signal from Resourcemanager ,Registration of NodeManager
failed, Message from ResourceManager: NodeManager from localhost
doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the
NodeManager.
My system has 8 CPUs and 8 GB of RAM. How do I configure YARN for these resources? I have found many posts, such as this one, but none of them solved my problem.

I had the same problem during a course. We were using Amazon virtual machines with 2 cores.
After various modifications in yarn-site.xml, we got our NodeManager running by setting the following properties:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
In your case, you may need to set 8 virtual cores.
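For an 8-core, 8 GB machine like the one in the question, the same two properties might look like the sketch below. The 6144 MB figure is my assumption (it leaves roughly 2 GB for the OS and the Hadoop daemons) rather than a tested or recommended value.

```xml
<!-- yarn-site.xml on the NodeManager host; illustrative values for 8 cores / 8 GB -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>6144</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>8</value>
</property>
```

Both numbers need to be at least as large as yarn.scheduler.minimum-allocation-mb and yarn.scheduler.minimum-allocation-vcores, otherwise the ResourceManager will still refuse the registration.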

Error starting datanode on hadoop

I'm trying to run a Hadoop cluster via Docker. I have one virtual machine as the namenode and another for the datanode, but the datanode gives me this error when running start-dfs.sh:
namenode: namenode running as process 130. Stop it first.
The command jps on the datanode does not show the namenode running. Then I try to start it by hand, using:
hadoop namenode
And it fails with this error:
java.net.BindException: Problem binding to [namenode:9000] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
So far it seems that the namenode is not accessible or is not listening on port 9000. But the network setup is correct: if I execute on the datanode:
telnet namenode 9000
It correctly connects to the namenode, and the command netstat -apn | grep 9000 from namenode shows the incoming connection. If I shut down dfs on namenode (stop-dfs.sh), the telnet command from datanode fails with "Connection closed by foreign host."
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value> <!-- I have tried with 1 and 2 too -->
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:9000</value>
</property>
</configuration>
Thanks!

HBase is not working in Hadoop 2.2.0

I am trying to install hbase-0.96.0-hadoop2 on Hadoop 2.2.0. When I try to start HBase, it gives the following error:
master: log4j:ERROR Could not find value for key log4j.appender.DRFAS
master: log4j:ERROR Could not instantiate appender named "DRFAS".
log4j:ERROR Could not find value for key log4j.appender.DRFAS
log4j:ERROR Could not instantiate appender named "DRFAS".
When I run jps, it shows the following processes:
17422 JobHistoryServer
11461 NameNode
31375 Jps
12127 ResourceManager
11671 DataNode
30077 HRegionServer
12344 NodeManager
11935 SecondaryNameNode
30948 HQuorumPeer
Here is my hbase-site.xml configuration:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/master</value>
</property>
</configuration>

Try these two methods.
Stop your HBase daemon and clear the HBase log files located in the /tmp/ folder: delete all files that have hbase in their name.
After deleting them, disconnect your machine from the internet and try to start the HBase daemon again.
HBase has this odd issue on some x64 Ubuntu machines, and disconnecting from the internet helps resolve it. After startup you can reconnect.
Now try to access HBase from the CLI:
bin/hbase

Unable to start nodemanager of Hadoop YARN at OS X 10.8

After starting all the other daemons, when I try to start the NodeManager, it seems to start and then terminate automatically, like this:
Yitongs-MacBook-Pro:hadoop timyitong$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /Users/timyitong/Dev/hadoop/logs/yarn-timyitong-nodemanager-Yitongs-MacBook-Pro.local.out
Yitongs-MacBook-Pro:hadoop timyitong$ jps
8981 DataNode
9300 Jps
9139 JobHistoryServer
8932 NameNode
9038 ResourceManager
I don't get any error or exception, but the NodeManager is not there. When I try to stop it, I get the following (stopnodes.sh is just a script), which confirms that the NodeManager is not running:
Yitongs-MacBook-Pro:hadoop timyitong$ sh stopnodes.sh
stopping namenode
stopping datanode
stopping resourcemanager
no nodemanager to stop
stopping historyserver
I am not sure whether it is because the NodeManager is not started, but when I try to run the sample wordcount program, my task stays pending forever.
My environment is OS X 10.8, Hadoop YARN 2.2.0.
And I already solved the java version issue with export JAVA_HOME=$(/usr/libexec/java_home -v 1.6).

Actually, I used bin/yarn nodemanager to start the server directly and found the problem. It is in my yarn-site.xml: the value of yarn.nodemanager.aux-services must not contain dots (.), as in mapreduce.shuffle. After changing mapreduce.shuffle to mapreduce_shuffle, the problem was solved.
I really don't understand why dots are not allowed, since I configured everything according to this blog post, where this setting seems to be fine.
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

The value mapreduce.shuffle should be mapreduce_shuffle. Note the underscore (_) instead of the dot. Also have a look at http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce.html
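For comparison with the fragment in the question, a corrected version might look like the sketch below. As far as I know, Hadoop 2.2.0 only accepts aux-service names made of letters, digits and underscores (and not starting with a digit), and the class property is looked up under the service name, so the underscore appears in the property name as well.

```xml
<!-- yarn-site.xml: aux-service name uses an underscore, not a dot -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<!-- property name is built from the service name above -->
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```

Running bin/yarn nodemanager in the foreground, as the asker did, is a quick way to see whether the name is still rejected at startup.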
