Error in launching Spark REPL - hadoop

I got pre-built Spark 1.4.1 and I'm running HDP 2.6. When I try to run spark-shell, it gives me the following error message.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:111)
at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:111)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.deploy.SparkSubmitArguments.mergeDefaultSparkProperties(SparkSubmitArguments.scala:111)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:97)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:107)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
What is the issue?

A ClassNotFoundException occurs when the class loader cannot find the
required class on the classpath. So, basically, you should check your
classpath and add the missing class (or the jar that contains it) to it.
Check whether hadoop-common-0.21.0.jar is on your classpath.
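In the Spark case above, one way to do that is to point Spark at the Hadoop jars you already have. A minimal sketch, assuming a "Hadoop free" Spark build and that HDP's hadoop command is already on your PATH (paths are illustrative):
# Verify the Hadoop jars (including hadoop-common) are visible to the hadoop launcher
hadoop classpath | tr ':' '\n' | grep hadoop-common
# In conf/spark-env.sh: let spark-shell/spark-submit reuse those jars
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
Alternatively, a Spark download that is pre-built for your Hadoop version already bundles hadoop-common, in which case this step should not be needed.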

Is it possible that your Hadoop home is not set, as described here?
Cannot find hadoop installation: $HADOOP_HOME must be set or hadoop must be in the path
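If so, a minimal sketch of setting it, assuming an HDP-style install under /usr/hdp/current/hadoop-client (adjust the path to your installation):
# Point HADOOP_HOME at the Hadoop install and put its bin directory on the PATH
export HADOOP_HOME=/usr/hdp/current/hadoop-client
export PATH=$PATH:$HADOOP_HOME/bin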

Related

Class not found in a MapReduce job

I have a MapReduce job which takes an Avro file as input. I export it along with all the required libraries (jar library files) into a jar file. I have 2 different clusters: one is the HDInsight simulator and the other one is the HDP sandbox. It works fine on the HDP sandbox, but it gives me an error on the HDInsight simulator and cannot find the AvroInputFormat class. I tried running the job with the -libjars option, but it didn't help. Here is the error message:
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1927)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:686)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1919)
... 9 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.avro.mapred.AvroInputFormat not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
... 10 more
This looks weird because it runs fine on one cluster! Does anyone know what the problem could be?
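One thing worth noting about the -libjars route mentioned above: it only takes effect when the driver passes its arguments through GenericOptionsParser (e.g. via ToolRunner), and the jar also has to be visible to the client JVM that submits the job. A rough sketch, with illustrative jar, path, and class names:
# Make the Avro jar visible to the submitting client
export HADOOP_CLASSPATH=/path/to/avro-mapred-1.7.7.jar
# Ship the same jar to the task containers (comma-separated list, right after the main class)
hadoop jar myjob.jar my.pkg.MyDriver -libjars /path/to/avro-mapred-1.7.7.jar input output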

Unable to find partitioner class - Cassandra

Can someone help me fix the issue below, which I'm facing with Cassandra when I run my application on Hadoop?
When I run the application, I get the following error with respect to the partitioner class mentioned in the application.
Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: Unable to find partitioner class 'org.apache.cassandra.dht.RandomPartitioner'
at org.apache.cassandra.hadoop.ConfigHelper.getInputPartitioner(ConfigHelper.java:426)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.validateConfiguration(AbstractColumnFamilyInputFormat.java:85)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.validateConfiguration(ColumnFamilyInputFormat.java:74)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:122)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
at com.test.cassandratest.WcJob.run(WcJob.java:96)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.test.cassandratest.WcJob.main(WcJob.java:104)
... 10 more
Caused by: org.apache.cassandra.exceptions.ConfigurationException: Unable to find partitioner class 'org.apache.cassandra.dht.RandomPartitioner'
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:458)
at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:470)
at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:416)
at org.apache.cassandra.hadoop.ConfigHelper.getInputPartitioner(ConfigHelper.java:422)
... 26 more
Caused by: java.lang.NoClassDefFoundError: org/github/jamm/MemoryMeter$Guess
at org.apache.cassandra.utils.ObjectSizes.<clinit>(ObjectSizes.java:34)
at org.apache.cassandra.dht.RandomPartitioner.<clinit>(RandomPartitioner.java:45)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:450)
... 29 more
Caused by: java.lang.ClassNotFoundException: org.github.jamm.MemoryMeter$Guess
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 34 more
I met the same problem when we upgraded Cassandra to 2.1 in our system, and the root cause is as follows.
The jamm version Cassandra 2.1 uses is 0.3.0, while older Cassandra versions were using 0.2.5. So please update the jamm version you use, and your problem may be fixed.
http://mvnrepository.com/artifact/com.github.jbellis/jamm/0.3.0
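As a concrete illustration (the paths are assumptions about your setup), the 0.3.0 jar can be pulled from Maven Central and swapped in wherever your job picks up its libraries:
# Fetch jamm 0.3.0 (coordinates com.github.jbellis:jamm:0.3.0)
wget https://repo1.maven.org/maven2/com/github/jbellis/jamm/0.3.0/jamm-0.3.0.jar
# Replace the older jamm jar on the job classpath (e.g. the job's lib/ dir or your -libjars list)
cp jamm-0.3.0.jar /path/to/your/job/lib/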

Creating jar for running MapReduce on Hadoop 1.2.1

I am new to Hadoop and I have just set up Hadoop 1.2.1 on my Mac laptop (Mavericks). I then created a simple WordCount project in IntelliJ IDEA and was able to run the code on a dummy text file. I am having trouble creating a jar file that replicates the execution I get through the IDE. I get the following error:
java -jar ./out/artifacts/WordCount_jar/WordCount.jar test.txt out [19:35:21]
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:146)
at neu.cs.parallelprogramming.WordCount.main(WordCount.java:48)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 2 more
FAIL: 1
Could anyone let me know what I am missing?
I guess you have to specify your class (the one that implements the map/reduce functions).
E.g., $ java -jar ./WordCount.jar classWordCount input.txt output
or $ hadoop jar yourprogram.jar yourclass inputpath outputpath
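If you want to stay with plain java rather than the hadoop launcher, a sketch that addresses the missing commons-logging class is to put the Hadoop jars on the classpath and name the main class explicitly (jar path and class name taken from the trace above, so adjust as needed):
# hadoop classpath prints the Hadoop jars (commons-logging included), so the JVM can resolve LogFactory
java -cp ./out/artifacts/WordCount_jar/WordCount.jar:$(hadoop classpath) neu.cs.parallelprogramming.WordCount test.txt out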

Hadoop - map-reduce - java.lang.NoClassDefFoundError

The question is: if I type
hadoop jar MY.jar name_my_class /user/user/input /user/user/output
and all the classes I need are inside MY.jar, why do I still get the error
java.lang.NoClassDefFoundError?
I know that it is a very general question, so if you think it's better, I'll provide all the details and the code :)
Thanks in advance
13/11/22 17:24:27 WARN mapred.LocalJobRunner: job_local1970405879_0001
java.lang.NoClassDefFoundError: org/jocl/CLException
at jocl.MaxTemperatureReducer.setup(MaxTemperatureReducer.java:33)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
Caused by: java.lang.ClassNotFoundException: org.jocl.CLException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 5 more
It looks like the CLException class is missing from your classpath.
Copy the jar file containing the class org.jocl.CLException into the lib folder of your Hadoop installation.
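Roughly like this (the JOCL jar name and path are assumptions; use the version you built against), followed by a restart of the affected daemons:
# Make org.jocl.CLException available to the tasks via Hadoop's lib directory
cp /path/to/jocl-0.1.9.jar $HADOOP_HOME/lib/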

HCatalog with MapReduce

I get the following error while executing a MapReduce program.
I have placed all the jars in the hadoop/lib directory and have also listed them in -libjars.
This is the command I am executing:
$HADOOP_HOME/bin/hadoop --config $HADOOP_HOME/conf jar /home/shash/distinct.jar HwordCount -libjars $LIB_JARS WordCount HWordCount2
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hcatalog.mapreduce.HCatOutputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:248)
at org.apache.hadoop.mapred.Task.initialize(Task.java:501)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:306)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.ClassNotFoundException: org.apache.hcatalog.mapreduce.HCatOutputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
... 8 more
Make sure LIB_JARS is a comma-separated list (as opposed to colon-separated like CLASSPATH)
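For example (illustrative paths):
# -libjars wants commas
export LIB_JARS=/path/to/hcatalog-core.jar,/path/to/hive-metastore.jar
# ...whereas a CLASSPATH-style variable would use colons
export CLASSPATH=/path/to/hcatalog-core.jar:/path/to/hive-metastore.jar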
Applies to: CDH 5.0.x, CDH 5.1.x, CDH 5.2.x, CDH 5.3.x, Sqoop
Cause: Sqoop cannot pick up the HCatalog libraries because Cloudera Manager does not set the HIVE_HOME environment variable. It needs to be set manually.
This problem is tracked by the following JIRA:
https://issues.apache.org/jira/browse/SQOOP-2145
The fix for this JIRA has been included in CDH since version 5.4.0.
Workaround (applicable to CDH versions lower than 5.4.0):
Execute the commands below in the shell before calling the Sqoop command, or add them to /etc/sqoop/conf/sqoop-env.sh (create the file if it does not already exist):
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive (for a parcel installation)
export HIVE_HOME=/usr/lib/hive (for a package installation)
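For example, to persist the parcel-installation setting (path taken from above), something along these lines should work:
# Make HIVE_HOME available to every Sqoop invocation
echo 'export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive' >> /etc/sqoop/conf/sqoop-env.sh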
