Debugging Hadoop WordCount program in Eclipse on Windows

I am trying to run the WordCount program on Hadoop from Eclipse (Windows 7), passing the arguments in Eclipse only.
I created the input inside the project itself: an input folder containing a word.txt file.
But it throws the exception below:
2015-04-08 15:30:09,947 WARN [main] util.NativeCodeLoader (<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-04-08 15:30:10,238 ERROR [main] util.Shell ( - Failed to locate the winutils binary in the hadoop binary path Could not locate executable E:\hadoop\hadoop-HADOOP_HOME\hadoop-2.6.0\bin\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(
at org.apache.hadoop.util.Shell.getWinUtilsPath(
at org.apache.hadoop.util.Shell.<clinit>(
at org.apache.hadoop.util.StringUtils.<clinit>(
at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(
at org.apache.hadoop.mapreduce.Job.<init>(
at org.apache.hadoop.mapreduce.Job.getInstance(
at org.apache.hadoop.mapreduce.Job.getInstance(
at com.WordCount.main(
2015-04-08 15:30:11,039 INFO [main] Configuration.deprecation ( - is deprecated. Instead, use dfs.metrics.session-id
2015-04-08 15:30:11,041 INFO [main] jvm.JvmMetrics ( - Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/E:/hadoop/eclipse-hadoop-pro/workspace-hadoop/WordCountPro/output already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(
at org.apache.hadoop.mapreduce.Job$
at org.apache.hadoop.mapreduce.Job$
at Method)
at Source)
at org.apache.hadoop.mapreduce.Job.submit(
at org.apache.hadoop.mapreduce.Job.waitForCompletion(
at com.WordCount.main(

I doubt that Hadoop is installed correctly. Check whether all the daemons are running on your machine. If they are not, re-check your installation or reinstall whatever is missing.
ERROR [main] util.Shell ( - Failed to locate the winutils binary in the hadoop binary path Could not locate executable
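The path in the full message ends in bin\bin\winutils.exe. Hadoop's Shell class looks for winutils.exe under the bin folder of the directory named by the hadoop.home.dir system property or, if that is unset, the HADOOP_HOME environment variable, so a doubled bin suggests the setting points at the bin folder itself rather than its parent. A minimal sketch of that check, using only the JDK; the class and method names are hypothetical and not part of Hadoop:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: mirrors the lookup rule
// <hadoop home>/bin/winutils.exe using only the JDK.
final class WinutilsCheck {

    // Where winutils.exe is expected for a given Hadoop home directory.
    static Path expectedWinutils(String hadoopHome) {
        return Paths.get(hadoopHome).resolve("bin").resolve("winutils.exe");
    }

    // True when the configured home itself ends in "bin", which would
    // produce a doubled bin/bin in the lookup path.
    static boolean homeEndsInBin(String hadoopHome) {
        Path last = Paths.get(hadoopHome).getFileName();
        return last != null && last.toString().equalsIgnoreCase("bin");
    }

    public static void main(String[] args) {
        // Same precedence as described above: system property first, then environment.
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }
        if (home == null) {
            System.out.println("Neither hadoop.home.dir nor HADOOP_HOME is set");
            return;
        }
        if (homeEndsInBin(home)) {
            System.out.println("Hadoop home ends in 'bin'; it should probably be the parent folder");
        }
        Path winutils = expectedWinutils(home);
        System.out.println(winutils + (Files.isRegularFile(winutils) ? " found" : " missing"));
    }
}
```

Running this from the same Eclipse launch configuration as the job shows which directory the JVM actually sees, which may differ from the value set in a command prompt.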


Hadoop Streaming Error No such file or directory

I am studying Hadoop and using Hadoop Streaming with Ruby to test whether my MapReduce algorithm runs without errors.
So I ran the following command:
hadoop jar hadoop-streaming-2.7.2.jar -files mapper.rb,reducer.rb -mapper mapper.rb -reducer reducer.rb -input test.json -output test
But the following error occurred:
dir/usercache/Kuma/appcache/application_1469093819516_0005/container_1469093819516_0005_01_000002/./mapper.rb": error=2, No such file or directory
The complete output of the job, including all errors, is:
16/07/21 19:22:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [/var/folders/l_/21nh6rmn6f3fn3vd55kh0b5h0000gn/T/hadoop-unjar8708804571112102348/] [] /var/folders/l_/21nh6rmn6f3fn3vd55kh0b5h0000gn/T/streamjob5933629893966751936.jar tmpDir=null
16/07/21 19:22:05 INFO client.RMProxy: Connecting to ResourceManager at localhost/
16/07/21 19:22:05 INFO client.RMProxy: Connecting to ResourceManager at localhost/
16/07/21 19:22:06 INFO mapred.FileInputFormat: Total input paths to process : 1
16/07/21 19:22:06 INFO mapreduce.JobSubmitter: number of splits:2
16/07/21 19:22:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1469093819516_0005
16/07/21 19:22:06 INFO impl.YarnClientImpl: Submitted application application_1469093819516_0005
16/07/21 19:22:06 INFO mapreduce.Job: The url to track the job: http://hatanokaoruakira-no-MacBook-Air.local:8088/proxy/application_1469093819516_0005/
16/07/21 19:22:06 INFO mapreduce.Job: Running job: job_1469093819516_0005
16/07/21 19:22:13 INFO mapreduce.Job: Job job_1469093819516_0005 running in uber mode : false
16/07/21 19:22:13 INFO mapreduce.Job: map 0% reduce 0%
16/07/21 19:22:18 INFO mapreduce.Job: Task Id : attempt_1469093819516_0005_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(
at org.apache.hadoop.util.ReflectionUtils.setConf(
at org.apache.hadoop.util.ReflectionUtils.newInstance(
at org.apache.hadoop.mapred.MapTask.runOldMapper(
at org.apache.hadoop.mapred.YarnChild$
at Method)
at org.apache.hadoop.mapred.YarnChild.main(
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.ReflectionUtils.setJobConf(
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(
at org.apache.hadoop.util.ReflectionUtils.setConf(
at org.apache.hadoop.util.ReflectionUtils.newInstance(
at org.apache.hadoop.mapred.MapRunner.configure(
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.ReflectionUtils.setJobConf(
... 17 more
Caused by: java.lang.RuntimeException: configuration exception
at org.apache.hadoop.streaming.PipeMapRed.configure(
at org.apache.hadoop.streaming.PipeMapper.configure(
... 22 more
Caused by: Cannot run program "/private/tmp/hadoop-Kuma/nm-local-dir/usercache/Kuma/appcache/application_1469093819516_0005/container_1469093819516_0005_01_000002/./mapper.rb": error=2, No such file or directory
at java.lang.ProcessBuilder.start(
at org.apache.hadoop.streaming.PipeMapRed.configure(
... 23 more
Caused by: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(
at java.lang.ProcessImpl.start(
at java.lang.ProcessBuilder.start(
... 24 more
mapper.rb and reducer.rb exist in the current directory, where I ran the command.
The WordCount MapReduce example runs without error, so I think the Hadoop setup is OK.
Environment:
Mac OS X El Capitan
Hadoop 2.7.2 (pseudo-distributed mode)
Files specified with the -files option must be in HDFS. Without them there, I get:
hadoop jar hadoop-streaming.jar -files f1.txt,f2.txt -input f1.txt -output test1
Input path does not exist: hdfs://quickstart.cloudera:8020/user/cloudera/f1.txt
After copying the files to HDFS (in your case, according to the error in your log, you will need to copy them to the HDFS root directory):
hadoop fs -put f1.txt /user/cloudera
hadoop fs -put f2.txt /user/cloudera
the job ran with no errors:
[cloudera@quickstart hadoop-mapreduce]$ hadoop jar hadoop-streaming.jar -files f1.txt,f2.txt -input f1.txt -output test1
packageJobJar: [] [/usr/jars/hadoop-streaming-2.6.0-cdh5.7.0.jar] /tmp/streamjob5729321067745308196.jar tmpDir=null
16/07/23 11:36:16 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/
16/07/23 11:36:17 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/
16/07/23 11:36:18 INFO mapred.FileInputFormat: Total input paths to process : 1
16/07/23 11:36:18 INFO mapreduce.JobSubmitter: number of splits:1
From the Hadoop Streaming documentation, "Making Files Available to Tasks":
The -files and -archives options allow you to make files and archives available to the tasks. The argument is a URI to the file or archive that you have already uploaded to HDFS. These files and archives are cached across jobs. You can retrieve the host and fs_port values from the config variable.
Note: The -files and -archives options are generic options. Be sure to place the generic options before the command options, otherwise the command will fail.
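Independently of where the files live, the "Cannot run program .../mapper.rb" line in the question's log shows that the task starts the script directly through ProcessBuilder. On Unix that requires the script to be executable and to begin with an interpreter line such as #!/usr/bin/env ruby. A rough pre-flight check to run on the local copy before submitting, offered as a sketch under that assumption; the class and method names are hypothetical and this is not part of Hadoop Streaming:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check for a streaming mapper/reducer script.
final class StreamingScriptCheck {

    // A directly executed script needs a first line starting with "#!".
    static boolean hasInterpreterLine(String firstLine) {
        return firstLine != null && firstLine.startsWith("#!");
    }

    // Returns a description of the first problem found, or null if none.
    static String problem(Path script) {
        if (!Files.isRegularFile(script)) {
            return "not a regular file";
        }
        if (!Files.isExecutable(script)) {
            return "not executable";
        }
        try (BufferedReader reader = Files.newBufferedReader(script, StandardCharsets.UTF_8)) {
            if (!hasInterpreterLine(reader.readLine())) {
                return "missing #! interpreter line";
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        // Pass the script paths as arguments, e.g. mapper.rb reducer.rb
        for (String arg : args) {
            String found = problem(Paths.get(arg));
            System.out.println(arg + ": " + (found == null ? "looks runnable" : found));
        }
    }
}
```

This only checks the local file; it cannot tell whether the interpreter named on the first line exists on the node that runs the task.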

Hadoop Wordcount Program Compile Error

I am new to Hadoop programming and use Eclipse for Hadoop development. I added all the JAR files through the Java build path, but when I run my program it fails with the error below. Can you please help me solve it?
14/05/31 23:33:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/05/31 23:33:10 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/05/31 23:33:10 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-deep/mapred/staging/deep689130586/.staging/job_local689130586_0001
14/05/31 23:33:10 ERROR security.UserGroupInformation: PriviledgedActionException as:deep cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/ already exists
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/ already exists
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(
at org.apache.hadoop.mapred.JobClient$
at org.apache.hadoop.mapred.JobClient$
at Method)
at Source)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(
at org.apache.hadoop.mapred.JobClient.submitJob(
at org.apache.hadoop.mapred.JobClient.runJob(
at hadoop1.MyJob.main(
The native libraries failed to load.
This can happen when you are on a 64-bit machine but the Hadoop distribution was built for 32-bit. You can follow the steps here to recompile Hadoop for 64-bit, and then replace the native libs.
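Separately from the native-library warning, the exception that actually stops the job in the log above is FileAlreadyExistsException: job submission refuses to start when the output directory already exists. The reported directory is file:/, the filesystem root, which suggests the output path argument resolved to "/" rather than to a job-specific folder. A minimal sketch of clearing a local output directory before a run while refusing to touch a root, assuming the job writes to the local filesystem as the file: scheme in the log indicates; the class name is hypothetical and only the JDK is used:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for local (file:) output directories only.
final class OutputDirGuard {

    // Deletes outputDir recursively. Returns true if something was deleted,
    // false if the directory did not exist. Refuses filesystem roots.
    static boolean clear(Path outputDir) {
        Path abs = outputDir.toAbsolutePath().normalize();
        if (abs.getParent() == null) {
            throw new IllegalArgumentException("Refusing to delete a filesystem root: " + abs);
        }
        if (!Files.exists(abs)) {
            return false;
        }
        try (Stream<Path> walk = Files.walk(abs)) {
            // Reverse order puts children before their parent directories.
            List<Path> deepestFirst = walk.sorted(Comparator.reverseOrder())
                                          .collect(Collectors.toList());
            for (Path p : deepestFirst) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: OutputDirGuard <local output directory>");
            return;
        }
        System.out.println(clear(Paths.get(args[0])) ? "deleted" : "nothing to delete");
    }
}
```

With the root check in place, a run configured like the one in the log fails early with a clear message instead of at submission time.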

Error on starting Pig

I configured Pig on my Hadoop system, but when I start it I get an error related to log4j. Am I missing something?
$ pig
log4j:ERROR Could not instantiate class [org.apache.hadoop.log.metrics.EventCounter].
java.lang.ClassNotFoundException: org.apache.hadoop.log.metrics.EventCounter
at Method)
at java.lang.ClassLoader.loadClass(
at sun.misc.Launcher$AppClassLoader.loadClass(
at java.lang.ClassLoader.loadClass(
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(
at org.apache.log4j.helpers.Loader.loadClass(
log4j:ERROR Could not instantiate appender named "EventCounter".
2014-02-14 10:45:46,512 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1 (r1459641) compiled Mar 22 2013, 02:13:53
2014-02-14 10:45:46,513 [main] INFO org.apache.pig.Main - Logging error messages to: /usr/local/hadoop/pig_1392381946511.log
2014-02-14 10:45:46,541 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/hduser/.pigbootup not found
2014-02-14 10:45:46,695 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: maprfs:///
2014-02-14 10:45:46,767 [main] INFO org.apache.hadoop.util.NativeCodeLoader - Loaded the native-hadoop library
2014-02-14 10:45:46,768 [main] INFO - Using JniBasedUnixGroupsMapping for Group resolution
2014-02-14 10:45:46,853 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: maprfs:///
First, try running a Pig script locally with:
pig -x local <filename>.pig
If an error message also shows up when running it locally, try this setup video; it is suitable if you are using Ubuntu 12.04 LTS.

Oozie map-reduce example fails with ClassNotFoundException when using Bigtop 0.5.0

I'm using a relatively clean installation of CentOS 6.3 minimal with the Bigtop 0.5.0 repo and Sun Java 1.6. I add the Bigtop repo as per the instructions here.
I have installed Hadoop common and Oozie using yum. I configured oozie by running sudo service oozie init, then set up the HDFS paths using the commands in the file in Bigtop 0.6.0.
I can run Java and streaming map reduce jobs without any problems. I can also run the Oozie streaming example that comes bundled with Bigtop. Unfortunately, when I try to run the map-reduce example, I get a java.lang.ClassNotFoundException
I can see from the HDFS audit logs that the oozie-examples-3.3.0.jar file gets inspected, but never opened. These are the only four entries for the jar file in the audit log for the time the workflow is running:
2013-03-12 14:42:07,394 INFO FSNamesystem.audit: allowed=true ugi=user (auth:SIMPLE) via oozie (auth:SIMPLE) ip=/ cmd=getfileinfo src=/user/user/examples/apps/map-reduce/lib/oozie-examples-3.3.0.jar dst=null perm=null
2013-03-12 14:42:07,399 INFO FSNamesystem.audit: allowed=true ugi=user (auth:SIMPLE) via oozie (auth:SIMPLE) ip=/ cmd=getfileinfo src=/user/user/examples/apps/map-reduce/lib/oozie-examples-3.3.0.jar dst=null perm=null
2013-03-12 14:42:07,547 INFO FSNamesystem.audit: allowed=true ugi=user (auth:SIMPLE) via oozie (auth:SIMPLE) ip=/ cmd=getfileinfo src=/user/user/examples/apps/map-reduce/lib/oozie-examples-3.3.0.jar dst=null perm=null
2013-03-12 14:42:07,550 INFO FSNamesystem.audit: allowed=true ugi=user (auth:SIMPLE) via oozie (auth:SIMPLE) ip=/ cmd=getfileinfo src=/user/user/examples/apps/map-reduce/lib/oozie-examples-3.3.0.jar dst=null perm=null
The container logs I get from the webconsole on port 8088 show the following exception, but offer no further clues:
2013-03-12 15:10:18,681 FATAL [IPC Server handler 2 on 57310] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1363061307536_0002_m_000000_0 - exited : java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(
at org.apache.hadoop.util.ReflectionUtils.setConf(
at org.apache.hadoop.util.ReflectionUtils.newInstance(
at org.apache.hadoop.mapred.MapTask.runOldMapper(
at org.apache.hadoop.mapred.YarnChild$
at Method)
at org.apache.hadoop.mapred.YarnChild.main(
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.ReflectionUtils.setJobConf(
... 9 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.example.SampleMapper not found
at org.apache.hadoop.conf.Configuration.getClass(
at org.apache.hadoop.mapred.JobConf.getMapperClass(
at org.apache.hadoop.mapred.MapRunner.configure(
... 14 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.example.SampleMapper not found
at org.apache.hadoop.conf.Configuration.getClass(
at org.apache.hadoop.conf.Configuration.getClass(
... 16 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.oozie.example.SampleMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(
at org.apache.hadoop.conf.Configuration.getClass(
... 17 more
I managed to grab the job.xml file out of the temp directory while the failing stage of the workflow was running, and I can see that the jar file gets added to the classpath property:
... but the class is still apparently not found. I've set all debugging up to DEBUG for all components and can find no more clues.
Have I simply misconfigured something, or is this actually a bug? I don't really know what to do next.

wordcount from eclipse

I am using the Eclipse plugin for Hadoop. After setting up a Hadoop server in it I can see all the files in HDFS, but when I try to run the program from Eclipse it throws an exception, whereas from the terminal it runs smoothly. The exception is below.
12/11/14 04:09:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/11/14 04:09:06 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/11/14 04:09:06 WARN snappy.LoadSnappy: Snappy native library not loaded
12/11/14 04:09:06 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-hduser/mapred/staging/hduser1728681403/.staging/job_local_0001
12/11/14 04:09:06 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/user/hduser/gutenberg
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/user/hduser/gutenberg
at org.apache.hadoop.mapred.FileInputFormat.listStatus(
at org.apache.hadoop.mapred.FileInputFormat.getSplits(
at org.apache.hadoop.mapred.JobClient.writeOldSplits(
at org.apache.hadoop.mapred.JobClient.writeSplits(
at org.apache.hadoop.mapred.JobClient.access$600(
at org.apache.hadoop.mapred.JobClient$
at org.apache.hadoop.mapred.JobClient$
at Method)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(
at org.apache.hadoop.mapred.JobClient.submitJob(
at org.apache.hadoop.mapred.JobClient.runJob(
at WordCount.main(
I'd start with investigating this:
ERROR security.UserGroupInformation: PriviledgedActionException as:hduser cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/user/hduser/gutenberg
This seems to be what causes the problem. Are you sure this is the correct path? If so, you may not have the privileges to access it. After that, I would try to eliminate as many of the WARN messages as I can.
Thanks Shujaat, that solved my problem. From Eclipse I was getting the same issue...
Use hdfs://localhost:54310/user/... instead of "/user/...".
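The log above reports the input as file:/user/hduser/gutenberg, meaning the unqualified path was resolved against the local filesystem instead of HDFS, presumably because the run from Eclipse does not pick up the cluster configuration. A small sketch of what the fully qualified form looks like, using only java.net.URI; the host and port are the ones from the reply above, and the class name is hypothetical:

```java
import java.net.URI;

// Hypothetical helper: builds a fully qualified path from a filesystem
// URI and an absolute path, so the scheme no longer depends on defaults.
final class PathQualifier {

    static URI qualify(String fileSystemUri, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("Expected an absolute path: " + absolutePath);
        }
        // An absolute path keeps its own path component and takes the
        // scheme, host and port from the filesystem URI.
        return URI.create(fileSystemUri).resolve(absolutePath);
    }

    public static void main(String[] args) {
        URI input = qualify("hdfs://localhost:54310", "/user/hduser/gutenberg");
        System.out.println(input);
    }
}
```

The resulting string is what would be passed wherever the job sets its input and output paths.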
