How to eliminate Error util.Shell: Failed to locate the winutils binary - hadoop

I am executing a remote job from a Windows machine (the client) under Eclipse. To clarify: I don't have any Hadoop installation on my Windows client, and I don't need one; I am executing the Hadoop job remotely, and Hadoop is installed on a Linux machine.
Everything executes correctly, but I would like to get rid of this ERROR:
14/09/22 11:49:49 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
at sun.misc.Unsafe.ensureClassInitialized(Native Method)
at sun.reflect.UnsafeFieldAccessorFactory.newFieldAccessor(Unknown Source)
at sun.reflect.ReflectionFactory.newFieldAccessor(Unknown Source)
at java.lang.reflect.Field.acquireFieldAccessor(Unknown Source)
at java.lang.reflect.Field.getFieldAccessor(Unknown Source)
at java.lang.reflect.Field.set(Unknown Source)
at MyFirstJob.main(MyFirstJob.java:45)
Do you know how to keep this exception from happening?

Install winutils.exe; there is no other way to fix this error.
Here is a little context: Hadoop writes some files locally (e.g. the job configs) before uploading them to the cluster, so it needs to set permissions, write files, and create directories.
If it doesn't find the binary, it falls back to the Java implementations anyway, so you don't need to worry. However, there is no built-in configuration to turn this message off, so the only way to really fix it is to recompile your hadoop-common jar without this error (installing winutils isn't that bad by comparison).
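If you do install winutils.exe somewhere, you can point the client at it from code instead of setting an environment variable. A minimal sketch (the C:\hadoop path is an assumption; it must contain bin\winutils.exe):
// Set this before the first Hadoop class loads: Shell resolves winutils in a
// static initializer, as the <clinit> frame in the stack trace above shows.
System.setProperty("hadoop.home.dir", "C:\\hadoop"); // assumed install dir
Job job = Job.getInstance(new Configuration());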

Copy org.apache.hadoop.util.Shell.java into your project.
You can comment out the line below to remove the error:
throw new IOException("Could not locate executable " + fullExeName + " in the Hadoop binaries.");
Also, for Windows, check
Error while running MapReduce (YARN) from Windows Eclipse

I saw a suggestion somewhere to just create an empty file with that name to get rid of the error. I think I tried it once and it worked - feel free to try whether it works for you. The file can be created on the fly if needed.
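A sketch of that trick using plain java.io.File, in case you want to try it (the C:\hadoop location is an assumption; for this particular error Hadoop only checks that the file exists):
// Create an empty placeholder before any Hadoop class is loaded (untested trick from above)
File winutils = new File("C:\\hadoop\\bin\\winutils.exe");
winutils.getParentFile().mkdirs();  // create C:\hadoop\bin if missing
winutils.createNewFile();           // an empty file is enough to pass the existence check
System.setProperty("hadoop.home.dir", "C:\\hadoop"); // tell Hadoop where to look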

Related

NameNode: Failed to start namenode in Windows 7

I am trying to install Hadoop on a Windows machine, and in the middle of it I got the error below.
Logs
17/11/28 16:31:48 ERROR namenode.NameNode: Failed to start namenode.
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:609)
at org.apache.hadoop.fs.FileUtil.canWrite(FileUtil.java:996)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:490)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:369)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:225)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:978)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:685)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:819)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:803)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1500)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1566)
Looks like you didn't install the Hadoop winutils or build Hadoop with the native libraries.
Native IO is mandatory on Windows, and without it you will not be able to get your installation working. You must follow all the instructions from BUILDING.txt to ensure that Native IO support is built correctly.
Hadoop2 on Windows
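As a quick sanity check on Hadoop 2.x and later, the following prints whether the native hadoop library and compression codecs were actually found:
hadoop checknative -a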
I also had a similar issue.
I am using Hadoop 2.8.1. These steps solved the error for me:
Download the winutils for your version from GitHub.
Copy winutils.exe to <HADOOP_HOME>/bin/.
Also, double-check that the JAVA_HOME environment variable is set correctly and referenced in the hadoop-env.cmd file.
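For example, in %HADOOP_HOME%\etc\hadoop\hadoop-env.cmd (the JDK path below is an assumption; if your JDK lives under Program Files, use the 8.3 short form, since spaces in JAVA_HOME break the scripts):
set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_144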

Hadoop java.io.IOException: Mkdirs failed to create /some/path, when running mapreduce job on mac osx

When I run my MR job on Mac OS X, I hit the following exception:
Exception in thread "main" java.io.IOException: Mkdirs failed to create /var/folders/9m/w_vzzmtx0rq0tt9whf_r4yhr0000gn/T/hadoop-unjar7688811202881231043/META-INF/license
at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:128)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:104)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
According to other posts, people suggested removing META-INF/LICENSE from the jar file as an alternative. That feels like a temporary solution to me.
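For reference, the workaround they describe is typically a one-liner that strips the entry from the already-built job jar (the jar name here is a placeholder):
zip -d myjob.jar META-INF/LICENSE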
I think it would be resolved if the path where the tmp files are stored, below, could be changed:
/var/folders/9m/.../META-INF/license
I checked the permissions and tried to change the "hadoop.tmp.dir" value in core-site.xml, but it didn't work for me.
PS. I know the issue is caused by the case-insensitive filesystem on OS X. That is why I am working in a directory on a mounted disk image, which is case sensitive.
Thanks in advance!

Error in hadoop examples.jar

I just installed Hadoop from the Yahoo Developer Network, running on a VM. After running start-all.sh and cd-ing into the bin folder, I ran:
hadoop jar hadoop-0.19.0.-examples.jar pi 10 1000000
I'm getting:
java.io.IOException: Error opening job jar: hadoop-0.18.0-examples.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: java.util.ZipException: error in opening zip file
How do I sort this out?
Please make sure you have the below things in place:
Your examples jar file is present in the path where you are running the above command; otherwise you need to give the complete path to the jar file, e.g.:
hadoop jar /usr/lib/hadoop-mapreduce/*example.jar pi 10 100000
It has appropriate read permissions for the user you are using to run the Hadoop job.
If you still face the issue, please add the logs to your question.
You will also face this issue if you are using an older version of Java. Hadoop needs Java 7 or Java 8, so please check your Java version and update it if needed.
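Two quick checks worth running first (adjust the jar name to the file actually present in your bin folder - note the stray extra dot in the command above, which on its own could produce exactly this "error in opening zip file"):
java -version
jar tf hadoop-0.19.0-examples.jar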

Is it possible to run Hadoop jobs (like the WordCount sample) in the local mode on Windows without Cygwin?

I have Windows 7, Java 8, Maven and Eclipse.
I've created a Maven project and used almost exactly the same code as here.
It's just a simple "word count" sample.
When I launch the "driver" program from Eclipse, providing the command line arguments (the input file and the output directory), I get the following error:
Exception in thread "main" java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:435)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:277)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:344)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
at misc.projects.hadoop.exercises.WordCountDriverApp.main(WordCountDriverApp.java:29)
The failing line (WordCountDriverApp.java:29) contains the command to launch the job:
job.waitForCompletion(true)
I want to make it work and therefore I want to understand something:
Do I have to provide any hdfs-site.xml, yarn-site.xml, and so on, if I want just the local mode (without any cluster)?
I don't have these XML config files now. As far as I remember, the defaults are all OK for the local mode, but maybe I am wrong.
Is it possible at all under Windows (to launch any Hadoop jobs whatsoever), or is the whole Hadoop thing Linux-only?
P.S.:
The Hadoop dependency is the following:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
Download Hadoop 2.6.0 or 2.7.1 compiled for Windows.
Create a HADOOP_HOME environment variable pointing to the unzipped dir.
Add %HADOOP_HOME%\bin to the PATH env var.
Source: https://stackoverflow.com/a/27394808/543836
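Regarding the config-file question: for local mode the defaults are indeed fine, and you can also pin them explicitly in the driver instead of shipping XML files. A minimal sketch (these values are the stock local-mode defaults; it does not remove the winutils requirement above):
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "file:///");           // local filesystem instead of HDFS
conf.set("mapreduce.framework.name", "local");  // run the job in-process, no YARN
Job job = Job.getInstance(conf, "word count");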
Hadoop runs on Windows, so it is possible, but you'll grow white hair if you try to pull it off on your own.
To start with, all filesystem operations in Windows Hadoop are routed either through NativeIO, if available, or via winutils if NativeIO is not loaded. In your case it took the winutils path. You could make NativeIO available if you instruct Eclipse where to find it; see How to add native library to “java.library.path” with Eclipse launch (instead of overriding it). You need to add the location of the hadoop-common-project target's bin, where you'll find hadoop.dll, which hosts the NativeIO. But even after that you'll still need winutils for container launch. The winutils.exe will be in that same location (the hadoop-common target/bin), but the code looks it up based on %HADOOP_HOME%, so you'll have to define that too. And it will go uphill from there. I intentionally omitted the details of how to configure all this because I don't think you should do it, or, to be more precise, you should only do it if you understand how.
It would be much, much easier to take an off-the-shelf Hadoop distribution for Windows, of which there is exactly one: the HDP from Hortonworks. Download it, install it, configure it, and then run against the 'cluster'.

Hadoop Basic Examples WordCount

I am getting this error with a mostly out-of-the-box configuration of version 0.20.203.0.
Where should I look for a potential issue? Most of the configuration is out of the box. I was able to visit the local web pages for HDFS and the task manager.
I am guessing the error is related to a permissions issue on Cygwin and Windows. Googling the problem, some suggest there might be some kind of out-of-memory issue, but it is such a simple example that I don't see how that could be.
When I try to run the wordcount example:
$ hadoop jar hadoop-examples-0.20.203.0.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output6
I get this error:
2011-08-12 15:45:38,299 WARN org.apache.hadoop.mapred.TaskRunner: attempt_201108121544_0001_m_000008_2 : Child Error
java.io.IOException: Task process exit with nonzero status of 127.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
2011-08-12 15:45:38,878 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201108121544_0001_m_000008_1
java.io.FileNotFoundException: E:\projects\workspace_mar11\ParseLogCriticalErrors\lib\h\logs\userlogs\job_201108121544_0001\attempt_201108121544_0001_m_000008_1\log.index (The system cannot find the file specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:112)
...
The userlogs/job* directory is empty. Maybe there is some permission issue with those directories. I am running on Windows with Cygwin, so I don't really know what permissions to set.
I couldn't figure out this problem with the current version of Hadoop, so I reverted to a previous release, hadoop-0.20.2. I had to play around with the core-site.xml configuration file and the temp directories, but I eventually got HDFS and MapReduce to work properly.
The issue seems to be Cygwin, Windows, and the drive setup I was using. Hadoop launches a new JVM process when it invokes a 'child' map/reduce task, and the actual JVM exec statement is in some shell script.
In my case, Hadoop couldn't find the path to that shell script. I am assuming the status code 127 error was the result of the Java runtime exec not finding it.
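For what it's worth, 127 is the standard shell exit status for "command not found", which is easy to reproduce from Java (a minimal sketch, assuming a POSIX sh such as Cygwin's is on the PATH; the script name is made up):
import java.io.IOException;

public class ExitCode127 {
    public static void main(String[] args) throws IOException, InterruptedException {
        // sh exits with 127 when the command it is asked to run cannot be found,
        // matching the "nonzero status of 127" in the TaskRunner log above.
        Process p = new ProcessBuilder("sh", "-c", "no-such-script.sh").start();
        System.out.println("exit status: " + p.waitFor()); // prints 127
    }
}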
