ERROR datanode.DataNode: Exception in secureMain - windows

I was trying to install Hadoop on windows.
Namenode is working fine but Data Node is not working fine. Following error is being displayed again and again even after trying for several times.
Following Error is being shown on CMD regarding dataNode:
2021-12-16 20:24:32,624 INFO checker.ThrottledAsyncChecker: Scheduling a check for [DISK]file:/C:/Users/mtalha.umair/datanode 2021-12-16 20:24:32,624 ERROR datanode.DataNode: Exception in secureMain org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value configured for dfs.datanode.failed.volumes.tolerated -
1. Value configured is >= to the number of configured volumes (1).
at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:176)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2799)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2714)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2756)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2900)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2924) 2021-12-16 20:24:32,640 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value configured for dfs.datanode.failed.volumes.tolerated - 1. Value configured is >= to the number of configured volumes (1). 2021-12-16 20:24:32,640 INFO datanode.DataNode: SHUTDOWN_MSG:
I have referred to many different articles but to no avail. I have tried to use another version of Hadoop but the problem remains and as I am just starting out, I can't fully understand the problem therefore I need help
these are my configurations
-For core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
For mapred-site.xml
mapreduce.framework.name
yarn
-For yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
-For hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/D:/big-data/hadoop-3.1.3/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>datanode</value> </property> <property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>1</value> </property> <property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

Well unfortunately the reason this is failing is exactly what the message says. Let me try to say it another way.
dfs.datanode.failed.volumes.tolerated = 1
The number of (dfs.datanode.data.dir) folders you have configured is 1.
You are saying you will tolerate no data drives (1 drive configured and you'll tolerate it breaking). This does not make sense and is why this is being raised as an issue.
You need to alter it so there's a gap of at least 1 (so that you can still have a running datanode.)
Here are your options:
Configure more data volumes (2) with dfs.datanode.failed.volumes.tolerated Set to 1. For example, store data in both your C and D drive.
dfs.datanode.failed.volumes.tolerated to 0; and keep you data volumes
as is (1)

Related

Hadoop localhost:9870 browser interface is not working

I need to do data analysis using Hadoop. Therefore I have installed Hadoop and configured as below. But localhost:9870 is not working. Even I have format namenode every time I worked with that. Some articles and answers of this forum mentioned that 9870 is the updated one from 50070. I have win 10. I also referred answers in this forum but none of them worked. Java-home and hadoop-home paths are set. Paths to bin and sbin of hadoop are also set up. Can anyone please tell me what I am doing wrong in here?
I referred this site to do the installation and configuration.
https://medium.com/#pedro.a.hdez.a/hadoop-3-2-2-installation-guide-for-windows-10-454f5b5c22d3
core-site.xml
I have set up the Java path in this xml as well.
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9870</value>
</property>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>C:\hadoop-3.2.2\data\namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>C:\hadoop-3.2.2\data\datanode</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
If you look at the namenode logs, it very likely has an error saying something about a port already being in use.
The default fs.defaultFS port should be 9000 - https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html ; you shouldn't change this without good reason.
The Namenode web UI isn't the value in fs.defaultFS. It's default port is 9870, and is defined by dfs.namenode.http-address in hdfs-site.xml
need to do data analysis
You can do analysis on Windows without Hadoop using Spark, Hive, MapReduce, etc. directly and it'll have direct access to your machine without being limited by YARN container sizes.

Cannot start Hadoop Namenode- syntax error

After I install Hadoop-2.8.0 start by running command "start-all.cmd", Datanode, Nodemanager and Resourcemanager start well. However, namenode cannot start with the below error:
ERROR common.Util: Syntax error in URI C:\Hadoop-2.8.0\data\namenode. Please check hdfs configuration.
java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\Hadoop-2.8.0\data\namenode
I checked hdfs-site.xml, I configured it as below and try to search a lot but I cannot fix the error.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>C:\Hadoop-2.8.0\data\namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>C:\Hadoop-2.8.0\data\datanode</value>
</property>
</configuration>
Can you help me to figure out where is the fault?

issue with hadoop secondary node

I am new to hadoop. When I run wordcount test project, evrything works fine. But, I can't access the JobTracker at http://localhost:50030. in fact, when I get my secondary node log file, I get exception message :
java.io.IOException: Bad edit log manifest (expected txid = 3: [[21,22], [23,24]
[8683,8684], [8685,8686], [8687,8688], [8689,8690], [8691,8692], [8693,8694], [8695,8696], [8697,8698], [8699,8700]]...
....
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.downloadCheckpointFiles(SecondaryNameNode.java:438)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:540)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:395)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
at java.lang.Thread.run(Thread.java:745)
Btw, when I run jps, I get 53745 JobHistoryServer 77259 Jps
UPDATE : here's my config
in core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
in hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9010</value>
</property>
</configuration>
and nothing is set in my yarn-site.xml
If you are using latest version of Hadoop, then Job Tracker will not be available. Job tracker is replaced by Resource Manager and History Server.
If you want to access past job details, go to http://hostname:19888. This is the web UI address for job history server.
Please refer Hadoop Cluster Setup for further details.

Hadoop 2.5.1 job stuck at map 0% and reduce 0%

I am trying to run a word count example. My current testing setup is:
NameNode and ResourceManager on one machine (10.38.41.134).
DataNode and NodeManager on another (10.38.41.135).
They can ssh between them without passwords.
When reading the logs, I don't get any warnings, except a security warning (I didn't set it up for testing) and a containermanager.AuxServices 'mapreduce_shuffle' warning. Upon submitting the example job, nodes react to it and output logs, which suggests that they can communicate well. NodeManager outputs memory usage, but the job doesn't budge.
Where should I even start looking for problems? Everything else I could find is either old or non-relevant. I followed the official cluster setup tutorial for version 2.5.1 which left way too many questions unanswered.
My conf files are following:
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.38.41.134:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.namenode.servicerpc-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
<value>NEVER</value>
<description>
</description>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs.
Can be one of local, classic or yarn.
</description>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>300</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>mapreduce.jobtracker.address</name>
<value>10.38.41.134:50030</value>
</property>
</configuration>
Everything else is default.
I suggest you first try to get it working with a single server cluster so it's easier to debug.
When that is working, continue with two nodes.
As already suggested, memory might be an issue. Without tweaking the settings, it seems some 2GB is the minimum and I'd recommend at least 4GB per server. Also remember to check also the job's logs (under logs/userlogs, especially syslog).
P.S. I share your frustration about old / non-relevant documentation.

Hadoop webuser: No such user

While running a hadoop multi-node cluster , i got below error message on my master logs , can some advise what to do..? do i need to create a new user or can i gave my existing Machine user name over here
2013-07-25 19:41:11,765 WARN
org.apache.hadoop.security.UserGroupInformation: No groups available
for user webuser 2013-07-25 19:41:11,778 WARN
org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception
trying to get groups for user webuser
org.apache.hadoop.util.Shell$ExitCodeException: id: webuser: No such
user
hdfs-site.xml file
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
i followed http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/ .
Hadoop 1.2.0
jetty-6.1.26
After adding my hdfs-site.xml looks
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.web.ugi</name>
<value>hduser,hadoop</value>
</property>
</configuration>
Edit the dfs.web.ugi property in hdfs-site.xml and add your user there. It is by default webuser,webgroup.

Resources