Running wordcount Hadoop example on Windows using Hadoop 2.6.0 - hadoop

I am new to Hadoop and learnt that with 2.x version, I can try Hadoop on my local Windows 7 64-bit machine.
I installed hadoop 2.6.0 and installed cygwin.
I could execute bin/hadoop version but I get the below error while executing the jar command:
Note: I have also placed the winutils.jar in the bin, from hadoop-common-2.2.0.jar. Please help. I am not able to get rid of this error. I have also entered the input and output parameters, it still fails.
$ bin/hadoop jar /Hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount
15/02/03 12:40:45 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
at org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows
(GenericOptionsParser.java:438)
at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions
(GenericOptionsParser.java:484)
at org.apache.hadoop.util.GenericOptionsParser.<init>
(GenericOptionsParser.java:170)
at org.apache.hadoop.util.GenericOptionsParser.<init>
(GenericOptionsParser.java:153)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke
(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke
(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke
(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke
(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke
(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Usage: wordcount <in> [<in>...] <out>
I could run the below command as well:
$ bin/hadoop jar /Hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar

It used to be an issue earlier. However if you are able to run the program through jar, there could be something else at fault.
If the same thing works for you using a Java code, you can edit the jar to remove the code where a new exception is being raised.
To be doubly sure, check if the bin directory contains winutils.exe and hadoop.dll.
If they are not present, chances are that someone else must have faced a similar issue and would have kept the files. These files are created when Hadoop is built from source code on the OS.

It seems like that you have installed hadoop 2.6.0 and older version of hadoop winutils. You must install hadoop winutils of your current hadoop version. Try to download winutils from this github repo https://github.com/steveloughran/winutils/tree/master/hadoop-2.6.0/bin
Finally replace your bin directory with the winutils bin directory!

Related

Geomesa Configuration Erorr

I am using Hadoop 2.7 with geoserver 2.8.0, but while I am trying to configure Geomesa 1.2.0, I am getting this error message:
$ geomesa
Using GEOMESA_HOME = /usr/local/geomesa/dist/tools/geomesa-tools-1.2.0
Warning: you have not set ACCUMULO_HOME and/or HADOOP_HOME as environment variables.
GeoMesa tools will not run without the appropriate Accumulo and Hadoop jars in the tools classpath.
Please ensure that those jars are present in the classpath by running 'geomesa classpath' .
To take corrective action, please place the necessary jar files in the lib directory of geomesa-tools.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/accumulo/core/client/TableNotFoundException
at org.locationtech.geomesa.tools.commands.TableConfCommand.<init>(TableConfCommand.scala:32)
at org.locationtech.geomesa.tools.Runner$.createCommand(Runner.scala:50)
at org.locationtech.geomesa.tools.Runner$.main(Runner.scala:21)
at org.locationtech.geomesa.tools.Runner.main(Runner.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.accumulo.core.client.TableNotFoundException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 4 more
How can I fix this?
The GeoMesa tools need Hadoop and Accumulo jars in order to connect to Accumulo.
One quick option is to run the GeoMesa tools from a tablet server or another machine already configured to be part of the Hadoop cluster. If you are using another machine, you can mirror the $HADOOP_HOME and $ACCUMULO_HOME directories from a cluster node locally.
As another alternative, you can download the install-hadoop-accumulo.sh script in the geomesa-tools/bin directory to download a set of Hadoop and Accumulo jars.
verify that corresponding jar file is present in the classpath, you can check this with the help of command:- Geomesa classpath
If jar is not present then copy the jar in the Geomesa directoryin my case it is in following path:
/*/geomesa-1.2.4/dist/tools/geomesa-tools-1.2.4/lib/common/

hbase mapreduce file not found exception

I have installed hadoop 2.4.1 and hbase 0.98.8 in 2 machines. When I run an hbase mapreduce job I get the below error:
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://pc1/opt/hbase-0.98.8-hadoop2/lib/hbase-server-0.98.8-hadoop2.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at thesis.test2.run(test2.java:93)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at thesis.test2.main(test2.java:107)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
I can run hadoop mapreduce jobs and simple hbase jobs without any problems. The code I m trying to run is an example that is supposed to run.
Please provide "jps" output.
Because it seems like your hbase is not working , hopefully the problem will be with zookeeper
I faced the exact problem. You have to add the hbase library path to the .bashrc file. Add the lib folder in hbase to the CLASSPATH.
Also, add the classpath of hbase to HADOOP_CLASSPATH.
Your .bashrc file should contain the following:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`${HBASE_HOME}/bin/hbase classpath`
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`${HBASE_HOME}/bin/hbase mapredcp`
export CLASSPATH=${HBASE_HOME}/lib/*
Note: The CLASSPATH should point to the lib folder of your hbase installation folder. Use the following to compile and run your java code.
javac Example.java
java -classpath $CLASSPATH:. Example

Problems running Manning's Hadoop in Practice 4.1 MapReduce code on Hadoop 1.0.3

I am attempting to run the 4.1 example code from Manning's "Hadoop in Practice" at http://www.manning.com/lam/
I am running Ubuntu 10.4 using hadoop 1.0.3 java 6.
The examples from http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/, I used the wordcount example to verify the installation.
I then tried to running the 4.1 example using:
hduser#ubuntu:/usr/local/hadoop$ bin/hadoop jar MyJob.jar MyJob /user/hduser/4.1/input /user/hduser/4.1output
I get the error:
Exception in thread "main" java.lang.ClassNotFoundException: MyJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
The public run method in the example that runs and manning's code appear to be different.
I appreciate your assistance!
Give the complete path of the jar. For example, if MyJob.jar is present inside your home directory then : hduser#ubuntu:/usr/local/hadoop$ bin/hadoop jar /home/hduser/MyJob.jar MyJob /user/hduser/4.1/input /user/hduser/4.1output
I had the same problem with Hadoop 1.0.3.16 and java 6 but I managed to get the Manning example 4.1 working by adding job.setJar("/path/to/MyJob.jar"); after job.setJobName("MyJob"); I thought of making this change because I was getting a warning: WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). Do you get the same warning Tariq?
I also tried adding job.setJarByClass(MyJob.class); instead but this did not work.
Cheers, Alex

Hadoop in Windows

I'm trying to run the sample word count program for Hadoop in Windows thru Cygwin. I've installed Hadoop and Cygwin.
I run the wordcount program using this statment:
$ bin/hadoop jar hadoop-examples-1.0.1.jar wordcount input output
I'm getting the following error:
12/05/08 23:05:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/05/08 23:05:35 ERROR security.UserGroupInformation: PriviledgedActionException as:suresh cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-suresh\mapred\staging\suresh1005684431\.staging to 0700
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-suresh\mapred\staging\suresh1005684431\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:682)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:655)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I've set the Cygwin bin path in path variable. Any help is appreciated.
This is a known issue with some versions of Hadoop (see https://issues.apache.org/jira/browse/HADOOP-7682 for the full discussion).
I had this problem with version 1.0.2, so I tried various other versions.
In the end I got it to work by going back to version 0.22.0
If you go back to version 0.22.0 you will need to make a couple of changes to the bin/hadoop-config.sh script:
Change the line that sets up HADOOP_MAPRED_HOME to point the mapreduce directory instead of the mapred directory.
Comment out all the code that sets the java.library.path for a native hadoop install.
You should look at hadoop services for windows - a version of hadoop that was ported to windows.
currently it's in beta, but it's supposed to be released soon.
I've managed to get this working to the point where jobs are dispatched, tasks executed, and results compiled.
en.wikisource.org/wiki/User:Fkorning/Code/Hadoop-on-Cygwin

After upgrading hadoop clusters to clodera 4 b1 ,Invalid "mapreduce.jobtracker.address" configuration error comes

I recently upgraded to clodera4b1 .Before upgradation the mapreduce jobs were running fine but now when I execute any mapreduce program,Thefollowing error comes:
Command run:
hadoop jar /usr/lib/hadoop/hadoop-mapreduce-examples-0.23.0-cdh4b1.jar grep *.xml /user/out/ 'dfs'
12/04/10 19:23:15 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "dev-xxx:yyyy"
12/04/10 19:23:15 ERROR security.UserGroupInformation: PriviledgedActionException as:anchauhan (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:85)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:78)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1185)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1181)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1167)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1180)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1209)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1233)
at com.nextag.mapred.MerchantImport.doTask(MerchantImport.java:221)
at com.nextag.mapred.Main.main(Main.java:9)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:200)
And for my custom mapred jobs,I changed the library files to the new ones while compiling jar files,but no success
We had the same exception and the reason was that we had the wrong dependencies in the pom files. We were pointing to hadoop-client version 2.0.0-cdh4.0.0 which is for yarn but we are using MRv1 so we should have been pointing to 2.0.0-mr1-cdh4.0.0. You should not have any yarn jar files in your dependencies.
Reference

Resources