Getting a "NoClassDefFoundError" when running Apache Spark Word2Vec in Java - spring-boot

I'm new to Apache Spark and am trying to use its Word2Vec capabilities in Spring Boot for synonym generation, but I keep getting an error. See the code snippet and stack trace below.
SparkSession spark = SparkSession.builder().appName("Synonym Recommender")
        .config("spark.master", "local")
        .getOrCreate();
JavaRDD<String> lines = spark.read().textFile(Paths.get("src/main/resources/static/text8.txt").toString()).toJavaRDD();
JavaRDD<Iterable<String>> wordsIterable = lines.map(new Function<String, Iterable<String>>() {
    public Iterable<String> call(String s) throws Exception {
        String[] words = s.split(" ");
        return Arrays.asList(words);
    }
});
Word2Vec vec = new Word2Vec();
Word2VecModel vecModel = vec.fit(wordsIterable);
When I run the above code, I get the following error (full stack trace at bottom):
java.lang.NoClassDefFoundError: org/codehaus/janino/InternalCompilerException
Here are relevant entries in my pom.xml:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.13</artifactId>
    <version>3.3.1</version>
    <exclusions>
        <exclusion>
            <artifactId>janino</artifactId>
            <groupId>org.codehaus.janino</groupId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.1.9</version>
</dependency>
I separately included the janino dependency based on a potential solution I saw, but that doesn't seem to work either.
Caused by: java.lang.NoClassDefFoundError: org/codehaus/janino/InternalCompilerException
at org.apache.spark.sql.catalyst.expressions.objects.GetExternalRowField.<init>(objects.scala:1850) ~[spark-catalyst_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.$anonfun$serializerFor$3(RowEncoder.scala:195) ~[spark-catalyst_2.13-3.3.1.jar:3.3.1]
at scala.collection.ArrayOps$.flatMap$extension(ArrayOps.scala:986) ~[scala-library-2.13.0.jar:na]
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.serializerFor(RowEncoder.scala:192) ~[spark-catalyst_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:73) ~[spark-catalyst_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:81) ~[spark-catalyst_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:92) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:89) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:444) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at scala.Option.getOrElse(Option.scala:202) ~[scala-library-2.13.0.jar:na]
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:645) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:682) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:654) ~[spark-sql_2.13-3.3.1.jar:3.3.1]
at org.apache.spark.sql.DataFrameReader$textFile.call(Unknown Source) ~[na:na]
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47) ~[groovy-2.5.14.jar:2.5.14]
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115) ~[groovy-2.5.14.jar:2.5.14]
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127) ~[groovy-2.5.14.jar:2.5.14]
at com.tcwb.classification.services.USMLService.loadWord2VecModel(testapp.groovy:591) ~[classes/:na]
at com.tcwb.classification.services.USMLService.postConstruct(testapp.groovy:76) ~[classes/:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:389) ~[spring-beans-5.3.6.jar:5.3.6]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:333) ~[spring-beans-5.3.6.jar:5.3.6]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:157) ~[spring-beans-5.3.6.jar:5.3.6]
... 56 common frames omitted
If there are other, more lightweight or pre-trained alternatives I should consider for synonym generation in Java, those would also be appreciated.

So... it turns out I needed to include another dependency, commons-compiler, to fix the compiler issue. Not the end of my woes (more errors followed this one), but at least a solution to the present problem. Note that overriding janino to 3.1.9 could not have worked here anyway: the 3.1.x line moved InternalCompilerException out of the org.codehaus.janino package, while Spark 3.3.1 expects the 3.0.x layout.
<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>commons-compiler</artifactId>
    <version>3.0.8</version>
</dependency>
<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.0.8</version>
</dependency>
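As a quick sanity check that both janino artifacts now resolve at runtime, a minimal sketch like the following can help (JaninoCheck is a hypothetical class, not part of the project above; the two class names are real classes shipped by janino 3.0.x and commons-compiler respectively):

// Probes one class from each janino artifact and reports whether it is on the classpath.
public class JaninoCheck {
    public static void main(String[] args) {
        String[] classes = {
                "org.codehaus.janino.InternalCompilerException",  // from janino 3.0.x
                "org.codehaus.commons.compiler.CompileException"  // from commons-compiler
        };
        for (String name : classes) {
            try {
                Class.forName(name);
                System.out.println("OK: " + name);
            } catch (ClassNotFoundException e) {
                System.out.println("MISSING: " + name);
            }
        }
    }
}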

Related

I'm trying to read a CSV file from a GCS bucket using Spark and write it as a Delta lake (path in GCS), but the write operation fails

I'm trying this in IntelliJ and have added the dependency to the pom.xml file.
<dependency>
    <groupId>io.delta</groupId>
    <artifactId>delta-core_2.12</artifactId>
    <version>1.2.1</version>
</dependency>
Below is the code used:
val df_gcs = spark.read.format("csv").csv(sourcepath)
df_gcs.write.format("delta").save(save_path)
I get the error below:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/execution/command/LeafRunnableCommand
at java.base/java.lang.ClassLoader.defineClass1(Native Method)
at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:174)
at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:800)
at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:698)
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:621)
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:579)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:150)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:132)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:131)
at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:409)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
Changing the version to 1.0.0 fixed the issue. Delta Lake releases are built against specific Spark versions: org.apache.spark.sql.execution.command.LeafRunnableCommand only exists in Spark 3.2+, which delta-core 1.2.1 targets, whereas 1.0.0 targets the Spark 3.1.x line that was evidently on the classpath here.
<dependency>
    <groupId>io.delta</groupId>
    <artifactId>delta-core_2.12</artifactId>
    <version>1.0.0</version>
</dependency>
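More generally, the Spark and Delta artifacts have to be chosen as a matching pair. A minimal sketch of mutually compatible coordinates (the Spark version here is an assumption based on the Delta 1.0.x compatibility notes, not something stated in the question):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.12</artifactId>
    <version>3.1.2</version>
</dependency>
<dependency>
    <groupId>io.delta</groupId>
    <artifactId>delta-core_2.12</artifactId>
    <version>1.0.0</version>
</dependency>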

java.lang.reflect.Method.invoke(Object, Object[]) exception trying to use jaxb on customer mac even though jaxb classes included in build

Issue on Mac OS X 10.13.6 x86_64 using Java 17 Temurin on a user's machine, but I cannot replicate it locally.
This code is failing:
static
{
    try
    {
        jc = JAXBContext.newInstance("com.jthink.musicschema");
    }
    catch (JAXBException e)
    {
        throw new RuntimeException("Unable to create SongKong prepared statement:" + e.getMessage(), e);
    }
}
With
28/05/2022 08.23.18:EDT:UncaughtExceptionHandler:uncaughtException:SEVERE: An unexpected error has occurred on thread main, please report to support #jthink.net
java.lang.ExceptionInInitializerError
at com.jthink.songkong.cmdline.SongKong.checkCache(SongKong.java:1380)
at com.jthink.songkong.cmdline.SongKong.guiStart(SongKong.java:1166)
at com.jthink.songkong.cmdline.SongKong.finish(SongKong.java:1251)
at com.jthink.songkong.cmdline.SongKong.main(SongKong.java:1276)
Caused by: java.lang.NullPointerException: Cannot invoke "java.lang.reflect.Method.invoke(Object, Object[])" because "com.sun.xml.bind.v2.runtime.reflect.opt.Injector.defineClass" is null
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:311)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:97)
at com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector.prepare(AccessorInjector.java:87)
at com.sun.xml.bind.v2.runtime.reflect.opt.OptimizedAccessorFactory.get(OptimizedAccessorFactory.java:179)
at com.sun.xml.bind.v2.runtime.reflect.Accessor$FieldReflection.optimize(Accessor.java:285)
at com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty.<init>(SingleElementLeafProperty.java:92)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
at com.sun.xml.bind.v2.runtime.property.PropertyFactory.create(PropertyFactory.java:128)
at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.<init>(ClassBeanInfoImpl.java:181)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:514)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:331)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:139)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1156)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:165)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:289)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:217)
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:175)
at javax.xml.bind.ContextFinder.find(ContextFinder.java:353)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:508)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:465)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:366)
at com.jthink.songkong.cache.AlbunackCache.<clinit>(AlbunackCache.java:37)
... 4 more
and this similar question explains that it is because Java 17 no longer has the JAXB classes built in. However, I know this and already include them in my Maven build:
<dependency>
    <groupId>javax.xml.bind</groupId>
    <artifactId>jaxb-api</artifactId>
    <version>2.3.1</version>
</dependency>
<dependency>
    <groupId>com.sun.xml.bind</groupId>
    <artifactId>jaxb-core</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>com.sun.xml.bind</groupId>
    <artifactId>jaxb-impl</artifactId>
    <version>2.3.1</version>
</dependency>
and the problem seems to occur for only one user.

How to configure spring-boot micrometer to push into elasticsearch?

I have a Spring Boot 2 application that exposes its actuator endpoints. I want to export those values to an existing Elasticsearch instance, so I used the following:
pom.xml
...
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.5.RELEASE</version>
</parent>
<properties>
    <java.version>11</java.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-elastic</artifactId>
        <version>1.2.0</version>
    </dependency>
    ...
</dependencies>
application.yml
management:
  metrics:
    export:
      elastic:
        enabled: true
        host: http://192.168.23.43:9200/
        auto-create-index: true
        index: metrics
        step: 1m
When starting the application, the POST throws the following exception:
2019-09-04 11:20:42.498 WARN 2902 --- [ Thread-3] i.m.c.instrument.push.PushMeterRegistry : Unexpected exception thrown while publishing metrics for ElasticMeterRegistry
java.lang.RuntimeException: java.lang.IllegalArgumentException: Unexpected response body: {"error":"Incorrect HTTP method for uri [/] and method [POST], allowed: [DELETE, GET, HEAD]","status":405}
at io.micrometer.elastic.ElasticMeterRegistry.determineMajorVersionIfNeeded(ElasticMeterRegistry.java:252) ~[micrometer-registry-elastic-1.2.0.jar:1.2.0]
at io.micrometer.elastic.ElasticMeterRegistry.publish(ElasticMeterRegistry.java:194) ~[micrometer-registry-elastic-1.2.0.jar:1.2.0]
at io.micrometer.core.instrument.push.PushMeterRegistry.publishSafely(PushMeterRegistry.java:48) ~[micrometer-core-1.1.4.jar:1.1.4]
at io.micrometer.core.instrument.push.PushMeterRegistry.close(PushMeterRegistry.java:83) ~[micrometer-core-1.1.4.jar:1.1.4]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:337) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:271) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:571) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:543) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1034) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:504) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1027) ~[spring-beans-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1057) ~[spring-context-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1026) ~[spring-context-5.1.7.RELEASE.jar:5.1.7.RELEASE]
at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:945) ~[spring-context-5.1.7.RELEASE.jar:5.1.7.RELEASE]
Caused by: java.lang.IllegalArgumentException: Unexpected response body: {"error":"Incorrect HTTP method for uri [/] and method [POST], allowed: [DELETE, GET, HEAD]","status":405}
at io.micrometer.elastic.ElasticMeterRegistry.getMajorVersion(ElasticMeterRegistry.java:260) ~[micrometer-registry-elastic-1.2.0.jar:1.2.0]
at io.micrometer.elastic.ElasticMeterRegistry.determineMajorVersionIfNeeded(ElasticMeterRegistry.java:250) ~[micrometer-registry-elastic-1.2.0.jar:1.2.0]
... 17 common frames omitted
I made sure Elasticsearch is reachable, but the fact that it tries to POST against / puzzles me. What did I miss?
For some reason, the version of micrometer-registry-elastic caused the problem. After downgrading to version 1.1.4 everything works as expected. (The stack trace shows micrometer-core 1.1.4, the version managed by the Spring Boot 2.1.5 parent, running alongside micrometer-registry-elastic 1.2.0, so the downgrade effectively aligns the two Micrometer versions.)
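If you would rather keep the newer registry, an alternative sketch is to override Spring Boot's managed Micrometer version as a whole, so that core and registry stay in sync (micrometer.version is the property name used by spring-boot-dependencies; pinning to 1.2.0 here is an assumption, not tested in the question):

<properties>
    <java.version>11</java.version>
    <micrometer.version>1.2.0</micrometer.version>
</properties>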

Use of HiveMetaStoreClient(thereby, HiveConf) to retrieve Hive metadata

Kerberized HDP-2.6.3.0.
I am able to connect to Hive from my Windows machine using the Hive JDBC driver; however, I need to use some methods of the HiveMetaStoreClient. I flipped through the API and wrote some test code, which I am executing from an IDE.
private static void connectHiveMetastore() throws MetaException, MalformedURLException {
    //System.setProperty("javax.security.auth.useSubjectCredsOnly","false");
    //System.setProperty("java.security.krb5.conf","C:\\kerb5.conf");
    Configuration configuration = new Configuration();
    //configuration.addResource("E:\\hdp\\client_config\\HDFS_CLIENT\\core-site.xml");
    //configuration.addResource("E:\\hdp\\client_config\\HDFS_CLIENT\\hdfs-site.xml");
    HiveConf hiveConf = new HiveConf(configuration, Configuration.class);
    //URL url = new File("E:\\hdp\\client_config\\HDFS_CLIENT\\hive-site.xml").toURI().toURL();
    //hiveConf.setHiveSiteLocation(url);
    //hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS,"thrift://l4283t.sss.com:9083,thrift://l4284t.sss.com:9083");
    HiveMetaStoreClient hiveMetaStoreClient = new HiveMetaStoreClient(hiveConf);
}
The dependencies in the pom file:
<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-metastore -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-metastore</artifactId>
        <version>2.3.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-exec -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>2.3.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.9.0</version>
    </dependency>
</dependencies>
Irrespective of whether I comment or uncomment the lines pertaining to the config and Kerberos, I receive the following exception, which is explained on the Hive wiki:
15:35:27.139 [main] ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler - MetaException(message:Version information not found in metastore. )
at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7564)
at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7542)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy8.verifySchema(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6893)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:129)
at com.my.App.connectHiveMetastore(App.java:58)
at com.my.App.main(App.java:37)
15:35:27.141 [main] ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler - HMSHandler Fatal error: MetaException(message:Version information not found in metastore. )
at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7564)
at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7542)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy8.verifySchema(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6893)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:129)
at com.my.App.connectHiveMetastore(App.java:58)
at com.my.App.main(App.java:37)
Exception in thread "main" MetaException(message:Version information not found in metastore. )
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:83)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6893)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:129)
at com.my.App.connectHiveMetastore(App.java:58)
at com.my.App.main(App.java:37)
Caused by: MetaException(message:Version information not found in metastore. )
at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7564)
at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7542)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy8.verifySchema(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
... 6 more
Process finished with exit code 1
I have the following questions/concerns:
Is the way I am thinking about and connecting to the HiveMetaStoreClient correct? If not, how do I retrieve the metadata information provided by the methods of HiveMetaStoreClient?
The code certainly isn't 'reaching' the cluster. Does the above exception pertain to the dependency versions? If not, what can be the root cause?
Following the leads provided by @Samson (in the comments), I gave up on the Maven approach and simply kept copying the required jars from the cluster. Yes, this took a long time to get things sorted, but I did make some progress.
Below is the class I am using. I still get SASL-related exceptions, but at least the request now reaches the server; the earlier "Version information not found in metastore" error meant the client was silently falling back to an embedded local metastore because no metastore URIs were being picked up.
private static void connectHiveMetastore() throws MetaException, MalformedURLException {
    System.setProperty("hadoop.home.dir", "E:\\Software\\Virtualization");
    /*Start : Commented or un-commented, immaterial ...*/
    System.setProperty("javax.security.auth.useSubjectCredsOnly","false");
    System.setProperty("java.security.auth.login.config","E:\\lib\\hdp\\loginconf.ini");
    System.setProperty("java.security.krb5.conf","E:\\lib\\hdp\\krb5.conf");
    /*End : Commented or un-commented, immaterial ...*/
    Configuration configuration = new Configuration();
    /*Start : Commented or un-commented, immaterial ...*/
    configuration.addResource("E:\\lib\\hdp\\client_config\\HDFS_CLIENT\\core-site.xml");
    configuration.addResource("E:\\lib\\hdp\\client_config\\HDFS_CLIENT\\hdfs-site.xml");
    configuration.addResource("E:\\lib\\hdp\\client_config\\HIVE_CLIENT\\hive-site.xml");
    configuration.set("hive.server2.authentication","KERBEROS");
    configuration.set("hadoop.security.authentication", "Kerberos");
    /*End : Commented or un-commented, immaterial ...*/
    HiveConf hiveConf = new HiveConf(configuration, Configuration.class);
    URL url = new File("E:\\lib\\hdp\\client_config\\HIVE_CLIENT\\hive-site.xml").toURI().toURL();
    HiveConf.setHiveSiteLocation(url);
    hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://l4283t.sss.se.com:9083,thrift://l4284t.sss.se.com:9083");
    /*Start : Commented or un-commented, immaterial ...*/
    hiveConf.setVar(HiveConf.ConfVars.HIVE_SERVER2_AUTHENTICATION, "KERBEROS");
    /*End : Commented or un-commented, immaterial ...*/
    HiveMetaStoreClient hiveMetaStoreClient = new HiveMetaStoreClient(hiveConf);
    System.out.println("Metastore client : " + hiveMetaStoreClient);
    System.out.println("Is local metastore ? " + hiveMetaStoreClient.isLocalMetaStore());
    System.out.println(hiveMetaStoreClient.getAllDatabases());
    hiveMetaStoreClient.close();
}
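For the metadata-retrieval part of the question: once the client connects, calls in this vein are typical. A minimal sketch (the method name and output formatting are illustrative; getAllDatabases, getAllTables, and getTable are real HiveMetaStoreClient API):

// Walks every database and table and prints each table's storage location.
// Assumes imports of org.apache.hadoop.hive.metastore.api.Table.
private static void dumpMetadata(HiveMetaStoreClient client) throws Exception {
    for (String db : client.getAllDatabases()) {
        for (String tableName : client.getAllTables(db)) {
            Table table = client.getTable(db, tableName);
            System.out.println(db + "." + tableName + " -> " + table.getSd().getLocation());
        }
    }
}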

Oozie job failed due to "not org.apache.hadoop.mapred.Mapper" while running through hue

I am trying to run a wordcount program through an Oozie job.
When I run the wordcount jar manually, like hadoop jar wordcount.jar /data.txt /out, it runs fine and gives me output.
Here are the details of the mapper code of my wordcount program.
public class MapperWordcount extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one);
        }
    }
}
When I execute it through the Oozie job, the error is as below:
2015-07-31 00:39:23,357 FATAL [IPC Server handler 29 on 40854] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1438294006985_0011_m_000000_3 - exited : java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 9 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: class com.mr.wc.MapperWordcount not org.apache.hadoop.mapred.Mapper
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2108)
at org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:1109)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.RuntimeException: **class com.mr.wc.MapperWordcount not org.apache.hadoop.mapred.Mapper**
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2102)
... 16 more
My pom.xml is like this:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.6.0</version>
</dependency>
I had the same problem. The actual cause is a mismatch between the old and new MapReduce APIs: the error "class com.mr.wc.MapperWordcount not org.apache.hadoop.mapred.Mapper" means the job is being run through the old mapred API while the mapper class implements the other one.
In Gradle
compile("org.apache.hadoop:hadoop-core:2.4.0")
and in your pom.xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>2.4.0</version>
</dependency>
and change all the references in your mapper and reducer from org.apache.hadoop.mapred.Mapper to org.apache.hadoop.mapreduce.Mapper, so that the whole job consistently uses one API. A driver sketch for the new API follows.
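For reference, a minimal new-API driver consistent with the posted mapper might look like this (WordCountDriver and ReducerWordcount are hypothetical names; only MapperWordcount comes from the question):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Drives the job entirely through the new (org.apache.hadoop.mapreduce) API,
// so the runtime never looks for an old-API org.apache.hadoop.mapred.Mapper.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "wordcount");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(MapperWordcount.class);   // mapper from the question
        job.setReducerClass(ReducerWordcount.class); // hypothetical matching reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}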
