Specify a valid path to the correct hive jars using $HIVE_METASTORE_JARS or change spark.sql.hive.metastore.version to 1.2.1 - hadoop

When I try to run spark-submit on a JAR that uses HiveContext, I get the error below.
My spark-defaults.conf has:
spark.sql.hive.metastore.version 0.14.0
spark.sql.hive.metastore.jars ----/external_jars/hive-metastore-0.14.0.jar
#spark.sql.hive.metastore.jars maven
I would like to use Hive metastore version 0.14. Spark and Hadoop are on different clusters.
Can anyone help me resolve this?
16/09/19 16:52:24 INFO HiveContext: default warehouse location is /apps/hive/warehouse
Exception in thread "main" java.lang.IllegalArgumentException: Builtin jars can only be used when hive execution version == hive metastore version. Execution: 1.2.1 != Metastore: 0.14.0.
Specify a valid path to the correct hive jars using $HIVE_METASTORE_JARS or change spark.sql.hive.metastore.version to 1.2.1.
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:254)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:441)
at org.apache.spark.sql.SQLContext$$anonfun$4.apply(SQLContext.scala:272)
at org.apache.spark.sql.SQLContext$$anonfun$4.apply(SQLContext.scala:271)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$cla

Try setting the filesystem implementations explicitly on the Hadoop configuration in your Spark application:
import org.apache.hadoop.conf.Configuration

// sc is your SparkContext
val hadoopConfig: Configuration = sc.hadoopConfiguration
hadoopConfig.set("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)
hadoopConfig.set("fs.file.impl", classOf[org.apache.hadoop.fs.LocalFileSystem].getName)
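Separately, note that per the Spark documentation, spark.sql.hive.metastore.jars expects a classpath in the standard JVM format that includes all of Hive and its dependencies, including the correct version of Hadoop, not a single hive-metastore jar. A minimal sketch of what spark-defaults.conf might look like (the paths are placeholders):
# placeholder paths; must be a full JVM classpath, not one jar
spark.sql.hive.metastore.version 0.14.0
spark.sql.hive.metastore.jars    /path/to/hive-0.14.0/lib/*:/path/to/hadoop-client/*
These jars only need to be present on the driver, unless you run in yarn-cluster mode, in which case they must be packaged with the application.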

Related

Interpreter hive not found in Zeppelin's JDBC interpreter

I have installed Zeppelin on my CentOS system. It is not listing hive under the JDBC interpreter.
I have Hive installed on my system. The Hive metastore and HiveServer2 are running. HIVE_HOME and HADOOP_HOME are set correctly.
Error in the Zeppelin editor:
paragraph_1490339323949_-1789938581's Interpreter hive not found
Error in the Zeppelin log files:
ERROR [2017-03-24 15:56:18,913] ({qtp1566723494-18} NotebookServer.java[afterStatusChange]:2018) - Error
org.apache.zeppelin.interpreter.InterpreterException: paragraph_1490346145929_-1782899327's Interpreter hive not found
at org.apache.zeppelin.notebook.Note.run(Note.java:572)
at org.apache.zeppelin.socket.NotebookServer.persistAndExecuteSingleParagraph(NotebookServer.java:1626)
at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:1600)
at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:263)
at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59)
at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Any help will be appreciated.
Thanks.
You can resolve the above issue by:
1) setting the properties described at zeppelin.apache.org/docs/0.7.0/interpreter/hive.html for the JDBC interpreter, and
2) using %jdbc as the interpreter:
%jdbc select date
Hope this helps!
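For reference, a minimal sketch of the hive-prefixed properties the linked page describes for the JDBC interpreter (host, port, and credentials below are placeholders):
hive.driver     org.apache.hive.jdbc.HiveDriver
hive.url        jdbc:hive2://<host>:10000
hive.user       <user>
hive.password   <password>
With these set, a paragraph can start with %jdbc(hive) to run against Hive.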

Is there an official way to support both Spark 1.6.2 and 2.0.0 on a Hadoop YARN 2.7.2 cluster?

I have a cluster running Hadoop YARN 2.7.2 with dynamic allocation enabled for Spark 1.6.2.
Is there an official way to support both Spark 1.6.2 and 2.0.0? When I try to submit an application from a Spark 2.0.0 client, an exception like the one below occurs in the driver:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.network.util.JavaUtils.byteStringAs(Ljava/lang/String;Lorg/apache/spark/network/util/ByteUnit;)J
at org.apache.spark.internal.config.ConfigHelpers$.byteFromString(ConfigBuilder.scala:63)
at org.apache.spark.internal.config.ConfigBuilder$$anonfun$bytesConf$1.apply(ConfigBuilder.scala:197)
at org.apache.spark.internal.config.ConfigBuilder$$anonfun$bytesConf$1.apply(ConfigBuilder.scala:197)
at org.apache.spark.internal.config.TypedConfigBuilder.createWithDefaultString(ConfigBuilder.scala:131)
at org.apache.spark.internal.config.package$.<init>(package.scala:41)
at org.apache.spark.internal.config.package$.<clinit>(package.scala)
at org.apache.spark.deploy.yarn.ApplicationMaster.<init>(ApplicationMaster.scala:69)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:785)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:784)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
This is supported by Hortonworks' HDP distribution: I have a cluster running HDP 2.5, which ships Hadoop 2.7.3 and both Spark 1.6.2 and 2.0.0, on CentOS 7.
I have not experienced any problems running either Spark or Spark2 jobs.
How did you install and configure both Spark versions? You could give the HDP sandbox a try and use how Spark and Spark2 are configured there as inspiration for your own cluster.
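In general, two Spark versions can coexist on one YARN cluster as side-by-side client installs, as long as each submission uses its own SPARK_HOME and its own jars on HDFS; the NoSuchMethodError above is typical of a Spark 2 application master picking up Spark 1 jars. A sketch, in which all paths and the application class are placeholders:
# side-by-side installs; paths and class name are illustrative
export SPARK_HOME=/opt/spark-2.0.0
$SPARK_HOME/bin/spark-submit \
  --master yarn \
  --conf spark.yarn.jars='hdfs:///apps/spark-2.0.0/jars/*' \
  --class com.example.MyApp \
  myapp.jar
Note that Spark 1.6 uses the singular spark.yarn.jar property (pointing at its assembly jar) instead of spark.yarn.jars.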

Hive Internal Error: java.lang.ClassNotFoundException(org.apache.atlas.hive.hook.HiveHook)

I am running a Hive query through Oozie using Hue.
I am creating a table through a Hue-Oozie workflow.
My job fails, but when I check in Hive, the table has been created.
The log shows the error below:
16157 [main] INFO org.apache.hadoop.hive.ql.hooks.ATSHook - Created ATS Hook
2015-09-24 11:05:35,801 INFO [main] hooks.ATSHook (ATSHook.java:<init>(84)) - Created ATS Hook
16159 [main] ERROR org.apache.hadoop.hive.ql.Driver - hive.exec.post.hooks Class not found:org.apache.atlas.hive.hook.HiveHook
2015-09-24 11:05:35,803 ERROR [main] ql.Driver (SessionState.java:printError(960)) - hive.exec.post.hooks Class not found:org.apache.atlas.hive.hook.HiveHook
16159 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Hive Internal Error: java.lang.ClassNotFoundException(org.apache.atlas.hive.hook.HiveHook)
java.lang.ClassNotFoundException: org.apache.atlas.hive.hook.HiveHook
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
I am not able to identify the issue. I am using HDP 2.3.1.
Basically this error is due to a missing Atlas JAR in the Oozie share lib.
In HDP, the Atlas JARs are available under /usr/hdp/2.3.0.0-2557/atlas/.
Put all the Atlas-related JARs into the Oozie share lib:
hadoop fs -put /usr/hdp/2.3.0.0-2557/atlas/hook/hive/* /user/oozie/share/lib/lib200344/hive
Add 'export HIVE_AUX_JARS_PATH=<atlas package>/hook/hive' to hive-env.sh.
Copy <atlas package>/conf/application.properties to the Hive conf directory.
Restart the Oozie services. This should solve the problem. If anybody still faces the problem, please comment here so that I can help.
[Comment by Immo Huneke: when using the Hortonworks sandbox VM, I found that just putting the jar files in the share/lib folder under HDFS was enough to resolve the problem. I didn't have to update hive-env.sh or copy the application.properties file. But check the exact path of your share/lib folder by executing the command hdfs dfs -ls /user/oozie/share/lib before copying.]
Alternatively, from the Hive shell:
hive> add jar /usr/hdp//atlas/hook/hive/hive-bridge-${VERSION}.jar
and it will be OK. Hope this helps.
It seems you have a class-not-found exception.
Have you installed the Oozie share lib? If yes, update all the Hive-dependent JARs in the share lib location and check the status.
Also check that the Hive client is available on all nodes of the cluster and is running.
I tried each and every possible solution mentioned in this forum and on Stack Overflow, but none of them resolved my issue.
Finally, I resolved it by copying all the JARs in /hook/hive into the lib folder of my Oozie workflow (create a new lib folder at the job.properties level).
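In other words, since Oozie adds everything in a workflow's lib/ directory to the action classpath, the resulting layout would look something like this (names are illustrative):
oozie-app/
  job.properties
  workflow.xml
  lib/
    hive-bridge-<version>.jar
    (the other JARs from <atlas package>/hook/hive)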

Which version of Pig should I use for HBase 0.98.8?

I have Hadoop 2.5.1 installed.
Hive version is 0.13.1.
Pig version is 0.13.0.
HBase version is 0.98.8.
If I want to load files from HDFS into HBase using Pig, will my Pig version work?
For now I am facing the following issue:
2014-12-24 16:11:24,783 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Make sure you build pig with the following options:
-Dhbaseversion=95 -Dhadoopversion=23 -Dprotobuf-java.version=2.5.0
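If you build Pig from source, that would be along these lines (a sketch, assuming the standard Ant-based Pig build, run from the Pig source root):
# run from the Pig source root; flags as recommended above
ant clean jar -Dhadoopversion=23 -Dhbaseversion=95 -Dprotobuf-java.version=2.5.0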

Reading a Hive table using a Pig script

I am trying to read a Hive table using a Pig script, but when I run the code I get the following error:
2014-02-12 15:48:36,143 [main] WARN org.apache.hadoop.hive.conf.HiveConf
-hive-site.xml not found on CLASSPATH 2014-02-12 15:49:10,781 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate
exception from backed error: Error: Found class
org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
(Ignore newlines and whitespace added for readability)
Hadoop version: 1.1.1
Hive version: 0.9.0
Pig version: 0.10.0
Pig code
a = LOAD '/user/hive/warehouse/test' USING
org.apache.pig.piggybank.storage.HiveColumnarLoader('name string');
Is it due to some version mismatch?
Why not use HCatalog to access the Hive metadata from Pig?
Check this for an example.
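A minimal sketch of what that looks like (the table name is a placeholder, and Pig must be started with the -useHCatalog flag so the HCatalog jars are on the classpath):
-- table name is illustrative; run: pig -useHCatalog
a = LOAD 'default.test' USING org.apache.hcatalog.pig.HCatLoader();
DUMP a;
Note that in older HCatalog releases the loader class is org.apache.hcatalog.pig.HCatLoader, while in the HCatalog bundled with newer Hive versions it is org.apache.hive.hcatalog.pig.HCatLoader.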
