Unable to start nodemanager of Hadoop YARN on OS X 10.8

After starting all the other nodes, when I try to start the nodemanager, it seems to start and then terminate automatically, like the following:
Yitongs-MacBook-Pro:hadoop timyitong$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /Users/timyitong/Dev/hadoop/logs/yarn-timyitong-nodemanager-Yitongs-MacBook-Pro.local.out
Yitongs-MacBook-Pro:hadoop timyitong$ jps
8981 DataNode
9300 Jps
9139 JobHistoryServer
8932 NameNode
9038 ResourceManager
I don't get any error or exception, but the nodemanager is not there. And when I try to stop it, it says the following (stopnodes.sh is just a script of mine), which confirms that the nodemanager is not running:
Yitongs-MacBook-Pro:hadoop timyitong$ sh stopnodes.sh
stopping namenode
stopping datanode
stopping resourcemanager
no nodemanager to stop
stopping historyserver
I am not sure whether it is because the nodemanager is not started, but when I try to run the sample wordcount program, my task always stays pending forever.
My environment is OS X 10.8, Hadoop YARN 2.2.0.
And I already solved the java version issue with export JAVA_HOME=$(/usr/libexec/java_home -v 1.6).
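For reference, that export can also go into etc/hadoop/hadoop-env.sh so every daemon picks it up (the path assumes the standard 2.x tarball layout):
# etc/hadoop/hadoop-env.sh -- assumed 2.x tarball layout
# Point all Hadoop daemons at the Java 6 JDK selected by java_home
export JAVA_HOME=$(/usr/libexec/java_home -v 1.6)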

Actually, I used bin/yarn nodemanager to start the server directly and found out the problem. It is in my yarn-site.xml: the value of yarn.nodemanager.aux-services must not contain dots (.), as mapreduce.shuffle does. After changing mapreduce.shuffle to mapreduce_shuffle, the problem was solved.
I really don't understand why it does not allow dots, since I configured everything according to this blog post, where this setting seemed to be fine:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
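For comparison, the working version of that snippet looks like this (the class property name follows the aux-service name, so it changes as well):
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>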

The mapreduce.shuffle should be mapreduce_shuffle. Note the underscore (_) instead of the dot. Also have a look at http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce.html

Related

Hadoop: Secondary NameNode Permission Denied

I'm attempting to run Hadoop in pseudo-distributed mode to learn how the system works. To install it, I downloaded Hadoop 3.0.0 from the site and untarred it. I've done my configuration as follows (leaving out the configuration tags for brevity):
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
After doing this, I've formatted my hdfs using
hdfs namenode -format
I've also set up passwordless ssh using the following:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa2
cat ~/.ssh/id_rsa2.pub >> ~/.ssh/authorized_keys
(I've also added id_rsa2.pub as the default for localhost using a config file, since I already was using id_rsa.pub for something else and didn't want to mix-and-match in case I broke something)
I'm able to ssh into localhost. All looks well.
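An entry of that kind in ~/.ssh/config looks roughly like this (illustrative only; the actual file was not shown, and the key path is an assumption):
# ~/.ssh/config -- illustrative entry, key path assumed
Host localhost
  HostName localhost
  IdentityFile ~/.ssh/id_rsa2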
Then I run start-dfs.sh, and I see this error:
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [zm.local]
zm.local: zm@zm.local: Permission denied (publickey,password,keyboard-interactive).
2018-01-16 17:31:35,807 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
If I run jps (after starting yarn and mapreduce history server), I have the following:
37921 NodeManager
38070 Jps
37434 NameNode
38060 JobHistoryServer
37821 ResourceManager
Noticeably, the SecondaryNameNode is missing; my assumption is that this is due to the error above.
I can then try to use hadoop's fs command, and I'm able to create a folder and look it up. But if I try to copy any data over, I get notified that the NameNode is in safe mode. If I turn off safe mode using:
hdfs dfsadmin -safemode leave
It immediately turns back on. By going to the namenode port on localhost, I see the following message:
Safe mode is ON. Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
However, I have plenty of resources. The single datanode is using less than 8% of its allotted space, and the namenode has almost 100GB of space. The datanode and namenode are both reporting as healthy. Thus, I think the problem is the lack of a secondary namenode. With that in mind, is anyone aware of what might be causing the SecondaryNameNode to have different permission issues from the primary NameNode? It seems to be trying to put the sNN somewhere on the local machine instead - but when I check in /tmp/hadoop*, all of the file permissions seem to be normal.
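(For reference, those usage figures can be cross-checked with hdfs dfsadmin -report, which prints per-datanode capacity; the "resources are low" check itself looks at free local disk space under the namenode's storage directory, whose default location below is an assumption:)
# Per-datanode capacity and usage as HDFS sees it
hdfs dfsadmin -report
# Free local disk space where the namenode keeps its metadata
# (default dfs.namenode.name.dir under hadoop.tmp.dir -- adjust if overridden)
df -h /tmp/hadoop-$USER/dfs/name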
Thanks for any help.

Configure Yarn with Hadoop 2.7.4 resources issue

I have configured Hadoop 2.7.4 by following this tutorial. DataNode, NameNode and SecondaryNameNode are working properly.
But when I run yarn, the NodeManager goes down with the following message:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved
SHUTDOWN signal from Resourcemanager ,Registration of NodeManager
failed, Message from ResourceManager: NodeManager from localhost
doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the
NodeManager.
My system has 8 CPUs and 8 GB of RAM. How do I configure yarn with these resources? I have found a lot of material, such as this, but could not find any solution that solves my problem.
I had the same problem during a course. We were using Amazon virtual machines with 2 cores.
After various modifications to yarn-site.xml, we got our NodeManager running by setting the following properties:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
In your case, you may need to set 8 virtual cores.
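For an 8-core, 8 GB machine, a starting point along these lines should work (the exact memory value is an assumption that leaves some headroom for the OS):
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>6144</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>8</value>
</property>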

Hadoop-Installation-Multinode

Hi all, I am trying to do a multinode Hadoop installation. Everything works fine, but my YARN nodemanager is not working. When I looked at the log file for the YARN nodemanager, I got the following information:
"org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl:
Initialized nodemanager for null: physical-memory=-1 virtual-memory=-2
virtual-cores=-1"
I have no idea why it's not showing the actual memory and virtual cores. My VM has 8 GB of memory and 8 vCPUs. Because of the above values I am getting this error:
"org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved
SHUTDOWN signal from Resourcemanager ,Registration of NodeManager
failed, Message from ResourceManager: NodeManager from SFeUbuntuVM2
doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the
NodeManager"
Can someone help me out with this issue?
Check that SELinux is disabled and the firewall is disabled. Then check your configuration files.
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>{your host name}</value>
</property>
After all that, format your namenode and start all services again.
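A minimal command sequence for that, assuming the Hadoop 2.x bin/sbin scripts are on the PATH (formatting destroys existing HDFS metadata, so only do it on a fresh cluster):
# Format HDFS metadata (fresh clusters only -- this erases existing HDFS data)
hdfs namenode -format
# Restart the HDFS and YARN daemons
start-dfs.sh
start-yarn.sh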

Namenode stops working after hadoop restart

I have a server with Hadoop installed on it.
I wanted to change some configuration (mapreduce.map.output.compress); therefore, I changed the configuration file and restarted Hadoop with:
stop-all.sh
start-all.sh
After that, I was not able to use it again, because it was in Safe Mode:
The reported blocks is only 0 but the threshold is 0.9990 and the total blocks 11313. Safe mode will be turned off automatically.
Please notice that the number of reported blocks is 0, and it was not increasing at all.
Therefore, I forced it to leave Safe Mode with:
bin/hadoop dfsadmin -safemode leave
Now, I get errors like this:
2014-03-09 18:16:40,586 [Thread-1] ERROR org.apache.hadoop.hdfs.DFSClient - Failed to close file /tmp/temp-39739076/tmp2073328134/GQL.jar
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/temp-39739076/tmp2073328134/GQL.jar could only be replicated to 0 nodes, instead of 1
If it helps, my hdfs-site.xml is:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/hduser/hadoop/name/data</value>
</property>
</configuration>
I've run into this problem many times. Whenever you get an error stating that x could only be replicated to 0 nodes, instead of 1, the following steps should fix the problem (see the shell sketch after the list):
Stop all Hadoop services with: stop-all.sh
Delete the dfs/name and dfs/data directories
Format the NameNode with: hadoop namenode -format
Start Hadoop again with: start-all.sh
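As a shell sketch of those steps (the dfs/name and dfs/data paths assume the defaults under hadoop.tmp.dir; adjust them to your hdfs-site.xml, e.g. the dfs.name.dir above, and note that this erases all HDFS data):
stop-all.sh
# Default metadata/block locations -- adjust if dfs.name.dir / dfs.data.dir are overridden
rm -rf /tmp/hadoop-$USER/dfs/name /tmp/hadoop-$USER/dfs/data
hadoop namenode -format
start-all.sh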

"Connection refused" Error for Namenode-HDFS (Hadoop Issue)

All my nodes are up and running when I check with the jps command, but I am still unable to connect to the HDFS filesystem. Whenever I click on "Browse the filesystem" on the Hadoop NameNode localhost:8020 page, the error I get is Connection Refused. I have also tried formatting and restarting the namenode, but the error still persists. Can anyone please help me solve this issue?
Check whether all your services (JobTracker, NameNode, DataNode, TaskTracker) are running with the jps command.
Try to stop everything and start the daemons one by one:
./bin/stop-all.sh
./bin/hadoop-daemon.sh start namenode
./bin/hadoop-daemon.sh start jobtracker
./bin/hadoop-daemon.sh start tasktracker
./bin/hadoop-daemon.sh start datanode
If you're still getting the error, stop them again and clean your temp storage directory. The directory details are in the config file ./conf/core-site.xml. Then run:
./bin/stop-all.sh
rm -rf /tmp/hadoop*
./bin/hadoop namenode -format
Check the logs in the ./logs folder.
tail -200 hadoop*jobtracker*.log
tail -200 hadoop*namenode*.log
tail -200 hadoop*datanode*.log
Hope it helps.
HDFS may use port 9000 in certain distributions/builds.
Please double-check your namenode port.
Change the core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://hadoopvm:8020</value>
<final>true</final>
</property>
Change it to the IP address:
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.132.129:8020</value>
<final>true</final>
</property>
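To double-check which port the namenode actually listens on, something like this can help (flags shown for Linux; the tool and options differ on other systems):
# List TCP ports that Java processes are listening on (Linux)
sudo netstat -tlnp | grep java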
