Unable to start Hadoop: problem with namenode

After installing Hadoop and running hdfs namenode -format (or hadoop namenode -format) in cmd for the first time, I get the error below. Can anyone help me solve this?
First it asks:
Re-format filesystem in Storage Directory root= C:\hadoop-3.2.1\data\namenode; location= null ? (Y or N)
No matter what I answer, Y or N, I get the following error:
ERROR namenode.NameNode: Failed to start namenode.
java.lang.UnsupportedOperationException
INFO util.ExitUtil: Exiting with status 1: java.lang.UnsupportedOperationException
A quick answer would be much appreciated.
Regards
ShaX

This is a bug in the 3.2.1 release and is supposed to be fixed in 3.2.2 or 3.3.0.
The fix is to change the StorageDirectory class so that it falls back to FileUtil for the permission setup on Windows:
if (permission != null) {
  try {
    Set<PosixFilePermission> permissions =
        PosixFilePermissions.fromString(permission.toString());
    Files.setPosixFilePermissions(curDir.toPath(), permissions);
  } catch (UnsupportedOperationException uoe) {
    // Default to FileUtil for non-POSIX file systems (e.g. NTFS on Windows)
    FileUtil.setPermission(curDir, permission);
  }
}
I found this issue when publishing a Hadoop 3.2.1 installation guide on Windows:
Latest Hadoop 3.2.1 Installation on Windows 10 Step by Step Guide
I published a temporary resolution there and it works. Refer to the post above for details; you can follow it to complete the Hadoop 3.2.1 installation on Windows 10. I've uploaded my updated Hadoop HDFS jar file to the following location:
https://github.com/FahaoTang/big-data/blob/master/hadoop-hdfs-3.2.1.jar

Related

Re-format filesystem error starting Hadoop services on Mac

iMac 2020 (Intel), macOS Monterey 12.6, Java 1.8, Hadoop 3.3.4, as of 9-Feb-23
I am getting this error when starting Hadoop with this command:
$HADOOP_HOME/sbin/start-all.sh
Regardless of whether I answer Y or N, the prompt keeps repeating in the terminal and never stops:
localhost: Re-format filesystem in Storage Directory root= /tmp/hadoop-arshadssss/dfs/name; location= null ? (Y or N)
Invalid input:
I followed the steps from https://techblost.com/how-to-install-hadoop-on-mac-with-homebrew/ to install and configure. It feels like everything is done and this is the final step; any help to resolve this would be appreciated.
I tried killing the NameNode process from Activity Monitor and restarting, to no avail.
start-all.sh is a deprecated script.
You should run start-dfs.sh and start-yarn.sh separately.
Or start each daemon on its own, e.g. hdfs --daemon start namenode, and likewise for the datanode, yarn resourcemanager, nodemanager, etc.
Then debug which daemon is actually causing the problem; a sketch of both approaches follows.
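A minimal sketch, assuming Hadoop 3.x and a standard $HADOOP_HOME layout:
# start HDFS and YARN separately instead of start-all.sh
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
# or start each daemon on its own to see which one misbehaves
hdfs --daemon start namenode
hdfs --daemon start datanode
yarn --daemon start resourcemanager
yarn --daemon start nodemanager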

java.io.EOFException: Premature EOF: no length prefix available in Spark on Hadoop

I'm getting this weird exception. I'm using Spark 1.6.0 on Hadoop 2.6.4 and submitting a Spark job to a YARN cluster.
16/07/23 20:05:21 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-532134798-128.110.152.143-1469321545728:blk_1073741865_1041
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:867)
16/07/23 20:49:09 ERROR server.TransportRequestHandler: Error sending result RpcResponse{requestId=4719626006875125240, body=NioManagedBuffer{buf=java.nio.HeapByteBuffer[pos=0 lim=81 cap=81]}} to ms0440.utah.cloudlab.us/128.110.152.175:58944; closing connection
java.nio.channels.ClosedChannelException
I was getting this error on Hadoop 2.6.0 and thought the exception might be a bug like this, but even after moving to Hadoop 2.6.4 I get the same error. There is no memory problem; the cluster is fine in terms of HDFS and memory. I went through this and this but had no luck.
Note: 1. I'm using Apache Hadoop and Spark, not CDH/HDP. 2. I'm able to copy data into HDFS and even run other jobs on this cluster.
Check the file permissions of the dfs directory:
find /path/to/dfs -group root
Normally the owning user and group should be hdfs.
Since I had started the HDFS service as the root user, some dfs block files were created with root permissions.
I solved the problem after changing to the right ownership:
sudo chown -R hdfs:hdfs /path/to/dfs
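Put together, a minimal sketch of that fix, assuming the hdfs user/group and with /path/to/dfs standing in for your dfs.namenode.name.dir and dfs.datanode.data.dir directories:
# stop HDFS before touching the block files
$HADOOP_HOME/sbin/stop-dfs.sh
# restore ownership recursively, then bring HDFS back up
sudo chown -R hdfs:hdfs /path/to/dfs
$HADOOP_HOME/sbin/start-dfs.sh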

Hadoop namenode format not working

I've been trying to install Hadoop 2.7.0 on Ubuntu, but when I enter the hadoop namenode -format command I get the following message:
Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
I've triple-checked all the configuration files but I can't seem to find where the problem is.
I followed this tutorial: http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
Can anyone please tell me why this is not working?
You have to add hadoop-hdfs-2.7.0.jar to your Hadoop classpath. Just add these lines to $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
export HADOOP_HOME=/path/to/hadoop
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.7.0.jar
Now stop all Hadoop processes and try to format the namenode again. Post the error if you get any.
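A quick way to confirm the change took effect, as a sketch (it assumes the jar sits in the standard share/hadoop/hdfs directory):
# reload the env file and check that the hdfs jar is now on the classpath
source $HADOOP_HOME/etc/hadoop/hadoop-env.sh
hadoop classpath | tr ':' '\n' | grep hadoop-hdfs
# then retry the format
hdfs namenode -format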

Hadoop Installation: Format Namenode

I'm struggling to install Hadoop 2.2.0 on Mac OS X 10.9.3. I essentially followed this tutorial:
http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide
When I run $HADOOP_PREFIX/bin/hdfs namenode -format to format the namenode, I get the message:
SHUTDOWN_MSG: Shutting down NameNode at Macintosh.local/192.168.0.103
I believe this is preventing me from successfully running the test:
$HADOOP_PREFIX/bin/hadoop jar \
  $HADOOP_PREFIX/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  --jar $HADOOP_PREFIX/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar \
  --shell_command date --num_containers 2 --master_memory 1024
Does anyone know how to correctly format namenode?
(Regarding the test command above, someone mentioned to me that it could have something to do with the hdfs file system not functioning properly, if this is relevant.)
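For what it's worth, a format run normally ends with a SHUTDOWN_MSG line even when it succeeds, so the actual failure (if any) is usually a few lines earlier in the console output or in the NameNode log. A small sketch for pulling up the most recent log, assuming the default $HADOOP_PREFIX/logs directory and log file naming:
# show the tail of the newest NameNode log (file name pattern is an assumption)
ls -t $HADOOP_PREFIX/logs/hadoop-*-namenode-*.log | head -n 1 | xargs tail -n 100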

Hadoop 1.0.4 - file permission issue in running map reduce jobs

I am new to Hadoop and need to set up a sandbox environment on Windows to showcase to a client. I have followed the steps below:
Install Cygwin on all machines
Set up SSH
Install Hadoop 1.0.4
Configure Hadoop
Apply the patch for the HADOOP-7682 bug
After a lot of trial and error I was able to run all the components (namenode, datanode, tasktracker and jobtracker). But now I am facing a problem while running MapReduce jobs: a permission error on the tmp directory. When I run the word count example using the following command:
bin/hadoop jar hadoop*examples*.jar wordcount wcountjob wcountjob/gutenberg-output
13/03/28 23:43:29 INFO mapred.JobClient: Task Id : attempt_201303282342_0001_m_000003_2, Status : FAILED
Error initializing attempt_201303282342_0001_m_000003_2:
java.io.IOException: Failed to set permissions of path: c:\cygwin\usr\local\tmp\taskTracker\uswu50754 to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.mapred.JobLocalizer.createLocalDirs(JobLocalizer.java:144)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:182)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1228)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
at java.lang.Thread.run(Thread.java:662)
I have tried setting the permissions manually, but that doesn't work either. What I understand is that this is due to the Java libraries being used, which try to reset the permissions and fail. The permission patch that solved the tasktracker problem doesn't seem to solve this one.
Has anybody found a solution for this?
Can anybody point me to a download location for Hadoop 0.20.2, which seems to be immune to this problem?
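One quick check from a Cygwin shell (a sketch, not a fix) is whether the user running the daemons can create and chmod the task tracker's local directory at all; the path below is the Cygwin view of the directory from the error message, assuming the default Cygwin mount of c:\cygwin as /:
# try to create and chmod the task tracker local dir by hand
mkdir -p /usr/local/tmp/taskTracker
chmod 700 /usr/local/tmp/taskTracker
ls -ld /usr/local/tmp/taskTracker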
