hive shell is not starting in cloudera - hadoop

I restarted my system, checked that there is enough disk space, and made sure HiveServer2 is running, but I still get these errors when I run '$ hive' in Cloudera:
Logging initialized using configuration in
file:/etc/hive/conf.dist/hive-log4j.properties
WARN: The method class
org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
Exception in thread "main" java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

The Hive CLI ('hive') is deprecated in favor of Beeline, the client for HiveServer2, so using Beeline is recommended.
Beeline was developed specifically to interact with the new server. Unlike Hive CLI, which is an Apache Thrift-based client, Beeline is a JDBC client based on the SQLLine CLI, although the JDBC driver it uses communicates with HiveServer2 through HiveServer2’s Thrift APIs.
As Hive development has shifted from the original Hive server (HiveServer1) to the new server (HiveServer2), users and developers accordingly need to switch to the new client tool. However, there’s more to this process than simply switching the executable name from “hive” to “beeline”.
Use the command below to enter interactive mode. Beeline accepts the same HiveQL statements as the Hive CLI, so most scripts run in Beeline without modification.
beeline -u jdbc:hive2://
To start the Hive metastore:
sudo service hive-metastore start

Related

Connecting HiveServer2 from pyspark

I am stuck on how to use pyspark to fetch data from HiveServer2 over JDBC.
I am trying to connect from pyspark, using JDBC, to HiveServer2 running on my local machine. All components (HDFS, pyspark, HiveServer2) are on the same machine.
This is the code I am using to connect:
connProps={ "username" : 'hive',"password" : '',"driver" : "org.apache.hive.jdbc.HiveDriver"}
sqlContext.read.jdbc(url='jdbc:hive2://127.0.0.1:10000/default',table='pokes',properties=connProps)
dataframe_mysql = sqlContext.read.format("jdbc").option("url", "jdbc:hive://localhost:10000/default").option("driver", "org.apache.hive.jdbc.HiveDriver").option("dbtable", "pokes").option("user", "hive").option("password", "").load()
Both methods give me the same error:
org.apache.spark.sql.AnalysisException: java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
javax.jdo.JDOFatalDataStoreException: Unable to open a test connection
to the given database. JDBC url =
jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
Terminating connection pool (set lazyInit to true if you expect to
start your database after your app).
ERROR XSDB6: Another instance of Derby may have already booted the database /home///jupyter-notebooks/metastore_db
metastore_db is located in the same directory where my Jupyter notebooks are created, but hive-site.xml specifies a different metastore location.
I have already checked other questions about the same error, which say that another spark-shell or similar process is running, but that is not the case here. Even if I run the following command while HiveServer2 and HDFS are down, I get the same error:
spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING) USING hive")
I am able to connect to Hive from a Java program using JDBC. Am I missing something here? Please help. Thanks in advance.
Spark should not use JDBC to connect to Hive.
It reads table metadata directly from the metastore and bypasses HiveServer2.
However, the error "Another instance of Derby may have already booted the database" means that another Spark session is already running, such as another Jupyter kernel that is still alive. Try setting a different metastore location, or set up a remote Hive metastore backed by a local MySQL or Postgres database and put that information in $SPARK_HOME/conf/hive-site.xml.
From the Spark SQL documentation on Hive tables:
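from os.path import abspath
from pyspark.sql import SparkSession
# Default location for managed databases and tables
warehouse_location = abspath("spark-warehouse")
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()
# spark is an existing SparkSession
spark.sql("CREATE TABLE...")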
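Before changing the metastore location, it may help to check whether the embedded Derby database looks locked. The sketch below is only a rough diagnostic: it assumes Derby leaves lock files named db.lck and dbex.lck inside the database directory, and the function name and the "metastore_db" path are hypothetical. A lock file can also be stale, so finding one suggests, but does not prove, that another process holds the database.

```python
from pathlib import Path

# Lock files that embedded Derby is assumed to leave in a booted database directory.
DERBY_LOCK_FILES = ("db.lck", "dbex.lck")


def find_derby_locks(metastore_dir):
    """Return the Derby lock files present in metastore_dir (empty list if none)."""
    root = Path(metastore_dir)
    return [root / name for name in DERBY_LOCK_FILES if (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location: a metastore_db created in the current directory.
    locks = find_derby_locks("metastore_db")
    if locks:
        print("Derby lock files found:", ", ".join(str(p) for p in locks))
    else:
        print("No Derby lock files found.")
```

If lock files are present, look for a running Spark driver or notebook kernel that was started from the same directory.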

Hive : The application won't work without a running HiveServer2

I am new to this field. I was using the CDH 5.8 QuickStart VM to try some basic Hive/Impala examples.
But I hit an issue: when I open Hue, it shows the error below. I searched for a solution but did not find anything that resolves it.
Configuration files located in /etc/hue/conf.empty
Potential misconfiguration detected. Fix and restart Hue.
Hive The application won't work without a running HiveServer2.
I checked the service and it is up and running. Restarting the service and CDH did not help.
Hive Server2 is running [ OK ]
When I navigated to Hive and tried a command, it gave me the error below.
Could not connect to quickstart.cloudera:10000 (code THRIFTTRANSPORT): TTransportException('Could not connect to quickstart.cloudera:10000',)
For Impala I am getting:
AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
I tried starting the metastore with hive --service metastore but got an error:
[cloudera@quickstart conf.empty]$ hive --service metastore
2017-03-03 05:37:14,502 WARN [main] mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present. Continuing without it.
Starting Hive Metastore Server
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
I am not sure what is wrong or whether I need to change some configuration. Can anyone guide me towards a solution?
HiveServer2 requires the metastore to be up and running. It seems your metastore server cannot start because port 9083 is already in use by another process. Check it with:
netstat -tulpn | grep 9083
If something is using this port, either change the metastore port in the Hive configuration or stop the application that is using it.
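The same check can be scripted. The following is a minimal sketch, not part of the original answer, that reports whether anything accepts TCP connections on a given port; the function name port_in_use is hypothetical. Unlike netstat -tulpn, it does not tell you which process owns the port.

```python
import socket


def port_in_use(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value on failure.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 9083 is the metastore port from the error message above.
    print("port 9083 in use:", port_in_use("127.0.0.1", 9083))
```

If this prints True before the metastore has been started, some other process (possibly an earlier metastore instance) is already bound to the port.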

Hive - issues while starting

I have been using Hive for some time on Ubuntu with Hadoop in pseudo-distributed mode. Today, out of nowhere, I started getting an error when starting the Hive shell. I have not made any configuration changes at all:
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
The Hive metastore service is not running. For package-based installations, start it with:
service hive-metastore start
For tarball installations, start the Hive metastore with:
hive --service metastore &
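The error mentions "any of the URIs provided", which refers to the hive.metastore.uris property in hive-site.xml. As a rough sketch (the function name is hypothetical, and the XML shown in the usage is an illustrative example rather than the asker's actual file), the configured endpoints can be listed like this so you know which host and port the metastore is expected to listen on:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def metastore_endpoints(hive_site_xml):
    """Extract (host, port) pairs from hive.metastore.uris in hive-site.xml text."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hive.metastore.uris":
            uris = prop.findtext("value", "").split(",")
            parsed = [urlparse(u.strip()) for u in uris if u.strip()]
            return [(p.hostname, p.port) for p in parsed]
    return []  # property not set in this file


if __name__ == "__main__":
    # Illustrative configuration, not taken from the question.
    example = """
    <configuration>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://localhost:9083</value>
      </property>
    </configuration>
    """
    print(metastore_endpoints(example))
```

Each returned pair is a place where a metastore process should be accepting connections; "Connection refused" means nothing was listening there.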

Beeline command issue

I am new to Hive and hopefully this is going to be an easy thing to solve
for someone with more experience, but I am having trouble doing it on my
own.
On my EC2 app server I am running the following command with no error:
beeline -u jdbc:hive2://master
This is working on Hive 13 which was installed through a bootstrap action
using the latest AMI version. 'master' is pointing to my EMR cluster
Then I downloaded the source for Hive 14 and built it. I have replaced my
/home/hadoop/hive directory with the package that was built.
However, if I try to execute the same command, I get an error:
scan complete in 6ms
Connecting to jdbc:hive2://master
Error: Could not open client transport with JDBC Uri: jdbc:hive2://master:
Cannot open without port. (state=08S01,code=0)
Beeline version 0.14.0 by Apache Hive
0: jdbc:hive2://master (closed)>
Running it with the port provided works correctly:
beeline -u jdbc:hive2://master:10000
I would like to be able to run the command without providing the
default port number.
Can anyone point me to instructions for this?
Thanks.
Beeline can connect to Hive in two modes:
1. Embedded mode:
If the Hive client and the Hive server are on the same machine, connect from Beeline with this URL:
!connect jdbc:hive2://
2. Remote mode:
If the server is on one machine and the client is on another, connect from Beeline with this URL:
!connect jdbc:hive2://<host>:<port>
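The two URL forms can be generated by a small helper, which is one way to avoid typing the port each time. This is only a sketch: the function name hive2_url is hypothetical, and the default of 10000 is taken from the port used in the question.

```python
def hive2_url(host=None, port=10000, database=""):
    """Build a HiveServer2 JDBC URL; with no host, return the embedded-mode URL."""
    if host is None:
        return "jdbc:hive2://"
    return f"jdbc:hive2://{host}:{port}/{database}"


if __name__ == "__main__":
    print(hive2_url())          # embedded mode
    print(hive2_url("master"))  # remote mode, port filled in
```

The printed value can then be passed to beeline -u, so the port is always explicit even when the caller only supplies a host name.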

Could not establish connection to localhost:10000/default: java.net.ConnectException: Connection refused

I am working with Hadoop/Hive. I have installed Hadoop 1.1.2 and Hive 0.10.0. The Hive command prompt works fine, but when I connect over JDBC from Eclipse I get the error below:
Could not establish connection to localhost:10000/default:
java.net.ConnectException: Connection refused
You can connect to Hive in two modes: through the Thrift server, or in embedded mode.
Judging by your URL, localhost:10000/default, you are trying to connect to the Thrift server. Make sure you have started the Hive Thrift server with the following command:
$ hive --service hiveserver
If you want to connect in embedded mode, use this URL:
jdbc:hive://
To use embedded mode, add hive/conf and the jars in hive/lib to your classpath.
Note that the Thrift server started this way (HiveServer1) is not thread-safe as of now.
