Problems running Manning's Hadoop in Practice 4.1 MapReduce code on Hadoop 1.0.3

I am attempting to run the 4.1 example code from Manning's "Hadoop in Practice" at http://www.manning.com/lam/
I am running Ubuntu 10.04 with Hadoop 1.0.3 and Java 6.
Following the examples at http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/, I used the wordcount example to verify the installation.
I then tried to run the 4.1 example using:
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar MyJob.jar MyJob /user/hduser/4.1/input /user/hduser/4.1output
I get the error:
Exception in thread "main" java.lang.ClassNotFoundException: MyJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
The public run() method in the working example and the one in Manning's code appear to differ.
I appreciate your assistance!

Give the complete path of the jar. For example, if MyJob.jar is present inside your home directory, then:
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar /home/hduser/MyJob.jar MyJob /user/hduser/4.1/input /user/hduser/4.1output

I had the same problem with Hadoop 1.0.3.16 and Java 6, but I managed to get the Manning example 4.1 working by adding job.setJar("/path/to/MyJob.jar"); after job.setJobName("MyJob");. I thought of making this change because I was getting the warning:
WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
Do you get the same warning, Tariq?
I also tried adding job.setJarByClass(MyJob.class); instead, but this did not work.
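For reference, a minimal driver sketch showing where the call goes. This is not the book's exact code (the book's driver is structured around a public run() method); the mapper/reducer setup is omitted and the jar path is a placeholder:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MyJob {
    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(MyJob.class);
        job.setJobName("MyJob");
        // Point the framework at the job jar explicitly; without this,
        // Hadoop logs "No job jar file set" and the task JVMs may not
        // find the user classes.
        job.setJar("/path/to/MyJob.jar"); // placeholder path
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        JobClient.runJob(job); // identity mapper/reducer by default
    }
}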
Cheers, Alex

Related

M/R job submission failing with error: Could not find Yarn tags property (mapreduce.job.tags)

I am getting the following exception when running a map/reduce job. We submit map/reduce jobs through Oozie.
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, Could not find Yarn tags property (mapreduce.job.tags)
java.lang.RuntimeException: Could not find Yarn tags property (mapreduce.job.tags)
at org.apache.oozie.action.hadoop.LauncherMainHadoopUtils.getChildYarnJobs(LauncherMainHadoopUtils.java:53)
at org.apache.oozie.action.hadoop.LauncherMainHadoopUtils.killChildYarnJobs(LauncherMainHadoopUtils.java:88)
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:46)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:46)
at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:228)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:378)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:296)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I did a Google search and found the following SO post: Hadoop MapReduce job starts but can not find Map class? However, the resolution mentioned in that post is not working for me; I cannot see any file-permission-related errors in the log files.
We are using the Cloudera distribution.
You need to upgrade the Oozie sharelib. Follow the instructions in Cloudera's documentation. Namely:
sudo oozie-setup sharelib create -fs FS_URI -locallib /usr/lib/oozie/oozie-sharelib-yarn
Don't forget to restart Oozie afterwards. This helped us solve this particular problem after a CDH 5.5 upgrade.
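On a package-based CDH install, restarting Oozie is typically done with the standard service script:
sudo service oozie restart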

ERROR : org.apache.oozie.action.hadoop.PigMain not found

I'm trying to execute a simple Pig script through an Oozie workflow that imports a Python jar as well as some other jars, and I eventually get an error like:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exception invoking main(), java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:224)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
... 9 more
Oozie Launcher failed, finishing Hadoop job gracefully
For this workflow I added all the jars to the lib directory, including pig.jar.
Please check that the Pig jar is present at the physical location on the node where the Oozie workflow is running.
Alternatively, you can place the Pig jar in the HDFS location of the Oozie shared lib and pass the parameter
oozie.use.system.libpath = true
so that the jar is read from the shared-lib location.
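For example, the property goes in the workflow's job.properties; the host names and paths below are illustrative, not taken from the question:
nameNode=hdfs://namenode-host:8020
jobTracker=jobtracker-host:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/pig-workflow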

Exception in thread "main" while formatting namenode in hadoop

satya@ubuntu:~/hadoop/bin$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/hadoop/hdfs/server/namenode/NameNode : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode. Program will exit.
This error (Unsupported major.minor version) generally appears because of using a higher JDK at compile time and a lower JDK at runtime. In this case 51 corresponds to JDK 7, which indicates that the class the JVM 1.6 runtime loaded was compiled for JVM 1.7. Try using JDK 1.7 and set it via the JAVA_HOME environment variable in hadoop-env.sh.
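For example, in conf/hadoop-env.sh (the JDK path below is illustrative; point it at wherever your JDK 7 is installed):
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64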
The default Java version and your Hadoop's Java version should match. Do this:
java -version
Open hadoop-env.sh (it can be found in the Hadoop config folder) and search for JAVA_HOME. This Java version and the default Java version should match.
NOTE: Set your JAVA_HOME to point to the JDK folder, not your Java bin folder.
It is better if you can show your Hadoop version... but for Hadoop 2, I think you can try the new format command
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format [-clusterid cid] [-force] [-nonInteractive]
So in your case, type
satya@ubuntu:~/hadoop/bin$ hdfs namenode -format
(I'm referring to Hadoop 2.7.0 which should apply to your situation.)
I also met this problem. When I type:
$ hadoop classpath
I found that the HDFS classpath was wrong. Then I did:
vi ~/.bashrc
export HADOOP_HDFS_HOME=$HADOOP_HOME
It works; hope it helps.

Hadoop can't find example jar file

I am trying to run this in pseudo-distributed mode following the directions in Hadoop In Action. It ran when I used the local/standalone mode.
Now it can't seem to find the path to the jar file.
cd $HADOOP_HOME
jps
17559 JobTracker
17466 SecondaryNameNode
17791 TaskTracker
16993 NameNode
17942 Jps
bin/hadoop hadoop-examples-1.0.3.jar wordcount
Warning: $HADOOP_HOME is deprecated.
Exception in thread "main" java.lang.NoClassDefFoundError: hadoop-examples-1/0/3/jar
Caused by: java.lang.ClassNotFoundException: hadoop-examples-1.0.3.jar
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: hadoop-examples-1.0.3.jar. Program will exit.
My CLASSPATH is set to $HADOOP_HOME
Any ideas?
A few things don't look right:
You should also have a DataNode process running; check the logs to see what happened to it.
The correct command to use is bin/hadoop jar hadoop-examples-1.0.3.jar wordcount
You should also have HADOOP_CONF_DIR set to point to the directory with hdfs-site.xml and core-site.xml

hadoop ClassNotFoundException when running start-all.sh

I tried to run ./hadoop start-all.sh
Unfortunately this error is thrown
Exception in thread "main" java.lang.NoClassDefFoundError: start/all/sh
Caused by: java.lang.ClassNotFoundException: start.all.sh
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: start.all.sh. Program will exit.
I thought it might have been the Hadoop path, but that does not seem to fix the issue. The path that I set in hadoop-env.sh is /usr/local/hadoop/bin.
I looked at other posts with similar titles, such as Hadoop: strange ClassNotFoundException, asking what is considered the main class. I tried changing the path to /usr/local/hadoop/bin/.
It's a shell script; start-all.sh should do. You do not need the hadoop command. You can find more information here: http://hadoop.apache.org/common/docs/r0.19.2/quickstart.html
Just run as follows
/path/to/Hadoop/home/bin/start-all.sh
In your case
/usr/local/hadoop/bin/start-all.sh
Since you are already in the hadoop/bin folder, you do not need to give ./hadoop start-all.sh again; instead just give ./start-all.sh
It will not throw any error and it will start your Hadoop processes.
