IncompatibleClassChangeError when calling getSplits on Hadoop 2.0.0-cdh4.0.0

I'm using the Cloudera-VM. Hadoop version: Hadoop 2.0.0-cdh4.0.0.
I have written a custom input file format; when the client calls the getSplits method I get an exception:
IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
I'm using the classes from the mapreduce package, not mapred.
However, when I look at the stack trace I see that somewhere along the line the calls switch to the mapred package:
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at com.hadoopApp.DataGeneratorFileInput.getSplits(DataGeneratorFileInput.java:27)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at com.hadoopApp.HBaseApp.generateData(HBaseApp.java:54)
at com.hadoopApp.HBaseApp.run(HBaseApp.java:24)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.hadoopApp.HBaseApp.main(HBaseApp.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Not sure if this helps, but I'm using this in my Maven POM:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.1</version>
</dependency>
Solved it, though I'm not sure why.
I changed my POM to the following and it started working. I'm still not sure why this solved it, so your input is appreciated:
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.0.0-cdh4.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.0.0-cdh4.2.0</version>
</dependency>
How can I get around this?

I've bumped into the same problem when using Hipi on CDH 4.2.0.
The problem is caused by incompatibilities between Hadoop versions: jobs built against Hadoop 1 may not work on Hadoop 2. Initially you were building the job against Hadoop v1 and running it in a Hadoop 2.0.0 environment (Cloudera CDH4 is based on Hadoop 2.0.0).
Fortunately, the Hadoop 1.x API is fully supported in Hadoop 2.x, so rebuilding the job against the newer Hadoop version helps.
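For reference, here is a minimal sketch of what such an input format can look like against the new (mapreduce) API; the key/value types and the delegation to FileInputFormat are assumptions, not the asker's actual code. The source compiles unchanged against Hadoop 1.x and 2.x, but it must be recompiled for the version it runs on, because JobContext changed from a class in 1.x to an interface in 2.x and the compiled bytecode encodes that difference (hence the IncompatibleClassChangeError).
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Sketch of a custom input format written against the mapreduce (new) API only.
public class DataGeneratorFileInput extends FileInputFormat<LongWritable, Text> {

    @Override
    public List<InputSplit> getSplits(JobContext context) throws IOException {
        // Reuse FileInputFormat's default block-based splitting.
        return super.getSplits(context);
    }

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
            TaskAttemptContext context) throws IOException, InterruptedException {
        // The record reader is omitted; it is not relevant to the error.
        throw new UnsupportedOperationException("sketch only");
    }
}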

Related

jdiameter-ha-* ClassNotFoundException

I'm attempting to introduce high-availability mode (via JBoss Cache) in my server implementation (essentially an expanded version of the example server) by configuring my Maven project to use the jdiameter-ha-api and jdiameter-ha-impl dependencies instead of jdiameter-api and jdiameter-impl, and by adding the following extensions to jdiameter-config.xml:
<Extensions>
<SessionDatasource value="org.mobicents.diameter.impl.ha.data.ReplicatedSessionDatasource"/>
<TimerFacility value="org.mobicents.diameter.impl.ha.timer.ReplicatedTimerFacilityImpl"/>
</Extensions>
Now, when I run the server from Eclipse it works fine, i.e. it starts up in clustered mode (with JBoss Cache). However, when I attempt to run the jar produced by mvn install, it throws the following error:
2018-10-11 18:24:13,899 - (-)(-)(-)(-)(-) Starting Mobicents DIAMETER Stack v1.7.0-SNAPSHOT (-)(-)(-)(-)(-)
2018-10-11 18:24:13,959 - Failure creating stack 'Server'
org.jdiameter.api.InternalException: java.lang.reflect.InvocationTargetException
at org.jdiameter.client.impl.StackImpl.init(StackImpl.java:135)
at com.company.charging.diameter.ocf.utilities.StackCreator.<init>(StackCreator.java:37)
at com.company.charging.diameter.ocf.utilities.StackCreator.<init>(StackCreator.java:71)
at com.company.charging.diameter.ocf.server.Ocf.<init>(Ocf.java:187)
at com.company.charging.diameter.ocf.server.Ocf.main(Ocf.java:157)
Caused by: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:488)
at org.jdiameter.client.impl.StackImpl.init(StackImpl.java:129)
... 4 more
Caused by: java.lang.ClassNotFoundException: org.mobicents.diameter.impl.ha.timer.ReplicatedTimerFacilityImpl
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:582)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:190)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:499)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:291)
at org.jdiameter.client.impl.helpers.AssemblerImpl.fill(AssemblerImpl.java:139)
at org.jdiameter.client.impl.helpers.AssemblerImpl.<init>(AssemblerImpl.java:91)
... 9 more
Given that it starts up in Eclipse just fine, I'm assuming my POM file isn't managing dependencies properly, so that the final jar is missing these classes. Here's the relevant portion of my pom.xml:
<dependencies>
<dependency>
<groupId>org.mobicents.diameter</groupId>
<artifactId>jdiameter-ha-api</artifactId>
<version>${restcomm.diameter.jdiameter.version}</version>
</dependency>
<dependency>
<groupId>org.mobicents.diameter</groupId>
<artifactId>jdiameter-ha-impl</artifactId>
<version>${restcomm.diameter.jdiameter.version}</version>
</dependency>
<dependency>
<groupId>org.mobicents.diameter</groupId>
<artifactId>restcomm-diameter-mux-jar</artifactId>
<version>${restcomm.diameter.mux.version}</version>
</dependency>
</dependencies>
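One quick way to narrow this down (a hypothetical diagnostic, not part of the original post) is to load the two extension classes reflectively from a tiny main class run against exactly the same jar you launch outside Eclipse; AssemblerImpl resolves the <Extensions> entries via Class.forName, so if this check fails the jdiameter-ha-impl classes simply are not in the final artifact:
// Hypothetical helper: verifies the HA extension classes are on the runtime classpath.
public class HaClasspathCheck {
    public static void main(String[] args) throws ClassNotFoundException {
        // Same class names referenced by the <Extensions> block in jdiameter-config.xml.
        Class.forName("org.mobicents.diameter.impl.ha.data.ReplicatedSessionDatasource");
        Class.forName("org.mobicents.diameter.impl.ha.timer.ReplicatedTimerFacilityImpl");
        System.out.println("HA extension classes are visible on the classpath");
    }
}
If this passes in Eclipse but fails against the jar produced by mvn install, the dependencies are being resolved at compile time but not packaged, which points at the jar/assembly configuration rather than the <dependencies> section itself.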

java.lang.NoClassDefFoundError: org/springframework/orm/hibernate5/HibernateTransactionManager

I am trying to integrate Spring with Hibernate and I have the spring-orm 4.3.6 jar file in my project, but I am still getting the error below:
java.lang.NoClassDefFoundError: org/springframework/orm/hibernate5/HibernateTransactionManager
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.getDeclaredMethods(Class.java:1975)
at org.springframework.util.ReflectionUtils.getDeclaredMethods(ReflectionUtils.java:613)
at org.springframework.util.ReflectionUtils.doWithMethods(ReflectionUtils.java:524)
at org.springframework.util.ReflectionUtils.doWithMethods(ReflectionUtils.java:510)
at org.springframework.util.ReflectionUtils.getUniqueDeclaredMethods(ReflectionUtils.java:570)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.getTypeForFactoryMethod(AbstractAutowireCapableBeanFactory.java:697)
I tried googling it but did not find an answer.
Can someone help?
This class comes from the spring-orm dependency (notice the hibernate5 package in its name; the same jar also contains hibernate3 and hibernate4 packages so as not to break compatibility).
The Maven coordinates are:
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-orm</artifactId>
<version>4.3.10.RELEASE</version>
</dependency>
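For context, here is a minimal sketch (the bean wiring and the source of the SessionFactory are assumptions) of where this class typically appears in a Spring Java configuration. The NoClassDefFoundError simply means the class below could not be resolved at runtime, so make sure spring-orm is actually on the runtime classpath and not just declared somewhere.
import org.hibernate.SessionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.orm.hibernate5.HibernateTransactionManager;

@Configuration
public class TransactionConfig {

    // Declares the Hibernate 5 transaction manager provided by spring-orm.
    @Bean
    public HibernateTransactionManager transactionManager(SessionFactory sessionFactory) {
        HibernateTransactionManager txManager = new HibernateTransactionManager();
        txManager.setSessionFactory(sessionFactory);
        return txManager;
    }
}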

Unable to run a Spark Java Program

I am running a Spark program written in Java, using the sample word count example.
I have created a jar file, but when I submit the Spark job it throws an error.
$ spark-submit --class WordCount --master local \
home/cloudera/workspace/sparksample/target/sparksample-0.0.1-SNAPSHOT.jar
I am getting the error below:
java.lang.ClassNotFoundException: wordCount
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Edit: I am also adding my pom.xml so that you can help.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.igi.sparksample</groupId>
<artifactId>sparksample</artifactId>
<version>0.0.1-SNAPSHOT</version>
<dependencies>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.0</version>
</dependency>
</dependencies>
</project>
After trying many combinations and doing a bit of R&D, I solved my issue.
The issue was in my spark-submit command; I changed it to this:
spark-submit --class com.xxx.sparksample.WordCount --master local /home/cloudera/workspace/sparksample/target/sparksample-0.0.1-SNAPSHOT.jar
and it worked.
It can't find the WordCount class. You probably need to include the package the class is in, so that you pass the fully qualified class name, i.e.:
--class <PACKAGE>.WordCount
The error you posted doesn't show any problem with Spark.
However, you must have a typo in your program. Java threw a ClassNotFoundException looking for wordCount, where it should most probably be WordCount, with a capital W.
Please check the names of your classes and your imports.
Make sure that the class name (wordcount or WordCount or whatever...) you pass to spark-submit is exactly the same as the one you have defined.
Make sure the packaging is correct.
To verify, open or extract your jar and check the class name and the package hierarchy.
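As an illustration, here is a minimal sketch of a Spark 1.x Java driver (the package name com.example.sparksample is hypothetical); whatever package declaration appears at the top of the file is what must be prefixed to the class name passed via --class:
package com.example.sparksample; // hypothetical package; adjust to your project

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class WordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCount");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Word-count logic omitted; the point here is only the class's location.
        JavaRDD<String> lines = sc.textFile(args[0]);
        System.out.println("lines read: " + lines.count());

        sc.stop();
    }
}
With this layout the job is submitted as spark-submit --class com.example.sparksample.WordCount ..., and listing the jar contents (for example with jar tf) should show com/example/sparksample/WordCount.class.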

Flink's Quickstart doesn't create a proper fat-jar

I'm definitely not an expert in Maven, but after two days of hacking around I'm just giving up.
My workflow:
1.
mvn archetype:generate
-DarchetypeGroupId=org.apache.flink
-DarchetypeArtifactId=flink-quickstart-scala
-DarchetypeVersion=0.10.1
-DgroupId=org.apache.flink.quickstart
-DartifactId=flink-scala-project
-Dversion=0.1
-Dpackage=org.apache.flink.quickstart
-DinteractiveMode=false
2.
cd flink-scala-project
3.
mvn clean package
Here is the build log: https://gist.github.com/zavalit/1e78478ebdda827f3454. When I run
java -jar target/flink-scala-project-0.1.jar
I get:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/api/scala/ExecutionEnvironment$
at org.apache.flink.quickstart.Job$.main(Job.scala:41)
at org.apache.flink.quickstart.Job.main(Job.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.api.scala.ExecutionEnvironment$
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 2 more
The fat jar which you're building is not supposed to be run outside of a cluster environment, so all Flink-related dependencies that the cluster environment provides are excluded from it.
What you usually do with the generated fat jar is submit it to a local or remote cluster via bin/flink run -c org.example.MyJob myFatJar.jar. To quickly start a local cluster, run bin/start-local.sh; this starts a local cluster to which you can submit your job jar.
By default, the Flink libraries are not included in the fat jar, since they are provided by the Flink cluster at runtime. To fix that, change the scope of the Flink dependencies in pom.xml from provided to compile:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
<scope>compile</scope>
</dependency>
Link: Maven documentation on dependency scopes.

Spark: Hive Insert overwrite throws ClassNotFoundException

I have this code that saves a SchemaRDD (person) to a Hive table stored as Parquet (person_parquet):
hiveContext.sql("insert overwrite table person_parquet select * from person")
But it throws an error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:399)
at org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:867)
at org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:589)
at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:174)
at org.apache.hadoop.hive.ql.metadata.Table.<init>(Table.java:116)
at org.apache.hadoop.hive.ql.metadata.Hive.newTable(Hive.java:2566)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:917)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1464)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:243)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:137)
at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.execute(InsertIntoHiveTable.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
at com.example.KafkaConsumer$$anonfun$main$2.apply(KafkaConsumer.scala:114)
at com.example.KafkaConsumer$$anonfun$main$2.apply(KafkaConsumer.scala:83)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:529)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:529)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:171)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:376)
at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:381)
... 29 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:366)
... 30 more
I changed my hive-site.xml to the following, but it still throws the same exception:
<property>hive.security.authenticator.manager</property>
<value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>
<property>hive.security.authorization.enabled</property>
<value>false</value>
<property>hive.security.authorization.manager</property>
<value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
(Same hive-site.xml as #1.) When I added hive-exec 1.0 to my dependencies, it threw a different exception (AbstractMethodError).
(Same hive-site.xml as #1.) I tried adding hive-exec 0.13 to my dependencies. During the first run (insert) it still throws an error, but on the second and succeeding inserts it is successful.
I am using Sandbox HDP 2.2 (Hive 0.14.0.2.2.0.0-2041) and Spark 1.2.0.
Dependencies:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>0.13.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.2.0</version>
</dependency>
"SQLStdConfOnlyAuthorizerFactory" class has been added in hive 0.14.0 version (HIVE-8045) but Spark 1.2 depends on hive 0.13. Your hive-site.xml must be having "hive.security.authorization.manager" set as "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory " and your classpath musn't be having hive-exec 0.14 JAR thats why its throwing ClassNotFoundException. So either include your hive-exec 0.14.0 JAR in classpath (and before Spark's own hive JARs) or change your entry in hive-site.xml to something like this :-
<property>
<name>hive.security.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
</property>
The former is not recommended, since similar issues may arise later due to the Hive version mismatch.
Changing the value of hive.security.authorization.manager to org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider in hive-site.xml worked.
I think this happens because you have duplicate jars on the classpath.
