Hive Metastore tries to create a Derby connection instead of MySQL - hadoop

I am using Hive 0.11 with the metastore in local mode. When I try to start the metastore daemon, it exits after printing the following error message:
2013-11-21 08:47:19.541 GMT Thread[main,5,main] java.io.FileNotFoundException: derby.log (Permission denied)
2013-11-21 08:47:19.646 GMT Thread[main,5,main] Cleanup action starting
ERROR XBM0H: Directory /metastore_db cannot be created.
This is my hive-site.xml; I am using MySQL as the metastore storage. What I don't understand is why Hive is trying to create metastore_db locally.
Thanks.

Set the hive.metastore.local property to false. (This property was removed as of Hive 0.10: if hive.metastore.uris is empty, local mode is assumed; remote otherwise.)
Set the hive.metastore.uris property to a valid URI (host and port of the Thrift metastore server).
For example:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hap-db:9083</value>
  <description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
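Since the question uses MySQL as the metastore backing store, the connection properties should point at MySQL as well. A minimal sketch; the host hap-db is carried over from the Thrift URI above, while the port 3306, the database name metastore, and the driver setup are placeholder assumptions:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <!-- host/port/database are placeholders; adjust to your MySQL instance -->
  <value>jdbc:mysql://hap-db:3306/metastore?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for the MySQL-backed metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <!-- the MySQL connector JAR must be on Hive's classpath, e.g. in $HIVE_HOME/lib -->
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class for the metastore database</description>
</property>
With these set and hive.metastore.uris pointing at the Thrift server, Hive should no longer fall back to creating a local Derby metastore_db.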

Hi, I faced a similar issue on Hive 0.14. I had installed Hive as the root user and was trying to run the Hive services as the sudo user I use for all Hadoop jobs.
Once I changed the owner of the installation to that user and restarted, it worked. So this error is mostly related to a file permissions issue.
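A minimal sketch of that fix, assuming the Hive installation lives at /usr/local/hive and the jobs run as hadoopuser (both names are placeholders for your own layout):
# Hand the Hive installation, including its logs and any local metastore_db,
# over to the account that actually runs the Hive services
sudo chown -R hadoopuser:hadoopuser /usr/local/hive
After that, derby.log and metastore_db can be created by the service user without the "Permission denied" error.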

Related

User is not allowed to impersonate anonymous (state=08S01,code=0) org.apache.hadoop.security.authorize.AuthorizationException

I am getting the below error when I try to start Hive using hiveserver2.
Connecting to jdbc:hive2://localhost:10000
18/10/25 09:45:38 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: deco is not allowed to impersonate anonymous (state=08S01,code=0)
The user name I am using is deco.
I have also added the below entries to the core-site.xml file:
<property>
  <name>hadoop.proxyuser.deco.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.deco.groups</name>
  <value>*</value>
</property>
I am still unable to connect using beeline. I used the following commands:
$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000
and
$HIVE_HOME/bin/beeline -n $(whoami) -u jdbc:hive2://localhost:10000
I even took a backup of the metastore_db folder and reinitialized it with the command below:
$HIVE_HOME/bin/schematool -dbType derby -initSchema
I even started hiveserver2 on port 10001 and connected beeline to 10001, and still got the same error.
All of the above proved futile.
Help, I am dying.
I once got this error:
User * is not allowed to impersonate anonymous
That's because, by default, Hive tries to execute operations as the calling user. I added the lines below to the Hive config file conf/hive-site.xml to make Hive execute operations as the HiveServer2 process user instead, which got rid of this error:
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
  <description>
    Setting this property to true will have HiveServer2 execute
    Hive operations as the user making the calls to it.
  </description>
</property>
Here is the document:
Impersonation
By default HiveServer2 performs the query processing as the user who submitted the query. But if the following parameter is set to false, the query will run as the user that the hiveserver2 process runs as.
hive.server2.enable.doAs – Impersonate the connected user, default true.
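If you would rather keep impersonation enabled (hive.server2.enable.doAs left at true), the proxyuser entries from the question are the right approach, but the NameNode has to pick them up. Restarting HDFS works; on a running cluster, the following refresh commands should achieve the same (a sketch, assuming a reasonably recent Hadoop):
# Re-read the hadoop.proxyuser.* settings without a full restart
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration
Restart HiveServer2 afterwards so it also re-reads its configuration.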

Why does Hive return FAILED: SemanticException...Unable To Instantiate

I have installed Hive, added it to PATH and am able to open it using the hive command in Terminal.
However, when I attempt to run a basic command such as
SHOW TABLES;
I am presented with the error:
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
The instructions I am following do not suggest that anything has to be instantiated.
For reference, I am using the book Hadoop: The Definitive Guide (4th Edition) and running it locally on my machine.
When I run jps, the following services are running:
2528 DataNode
7232 RunJar
2441 NameNode
7401 Jps
2634 SecondaryNameNode
2842 NodeManager
2751 ResourceManager
I fixed it by removing the Derby database files (note that this deletes the embedded metastore, and with it any existing table metadata):
rm -rf $HIVE_HOME/bin/metastore_db
and then re-initializing the schema:
$HIVE_HOME/bin/schematool -initSchema -dbType derby
I was able to resolve this problem by initializing the schema. I am surprised it is not mentioned anywhere.
To initialize the schema:
Navigate to your Hive installation folder
[install folder]/bin/schematool -initSchema -dbType derby
Next, you should see messages confirming the initialization:
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.derby.sql
Initialization script completed
schemaTool completed
Start Hive.
Run a basic command such as SHOW TABLES; to confirm Hive is functioning.
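To double-check the schema before starting Hive, schematool can also report what it finds; a quick verification sketch (the -info option exists on recent Hive releases, and the reported version will vary):
[install folder]/bin/schematool -dbType derby -info
It should print the connection URL, the driver, and a schema version matching the initialization output above.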

Hive metastore configuration with Derby

On a RedHat test server I installed Hadoop 2.7, and Hive, Pig, and Spark all ran without issues. But when I tried to access the Hive metastore from Spark I got errors, so I thought of putting a hive-site.xml in place. (After extracting the apache-hive-1.2.1-bin.tar.gz file, I just added $HIVE_HOME to .bashrc as per the tutorial, and everything was working other than this integration with Spark.) On the Apache site I found that I need to provide hive-site.xml as the metastore configuration.
I created the file as below:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
</configuration>
I used localhost as the host since it is a single-node machine. After that I am not able to connect to Hive at all; it is throwing this error:
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
....
Caused by: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby://localhost:1527/metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:derby://localhost:1527/metastore_db;create=true
There are many more error logs pointing to the same thing. If I remove hive-site.xml from the conf folder, Hive works without issues. Can anyone point me to the right path for the default metastore configuration?
Thanks,
Anoop R
Derby is used as an embedded database. Try using
jdbc:derby:metastore_db;create=true
as the JDBC URL. See also:
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-EmbeddedMetastore
To make the metastore fully functional (and thereby accessible from different services), try setting it up with MySQL as described in the document above.
As you are setting up an embedded metastore database, use the property below for the JDBC URL:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
I also faced a similar exception while installing Hive. What worked for me was initializing the Derby database. Go to $HIVE_HOME/bin and run: schematool -initSchema -dbType derby
You can follow the link http://www.edureka.co/blog/apache-hive-installation-on-ubuntu
It will also work if you put derbyclient.jar in the lib folder of Hive.
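For completeness: the client/server URL form jdbc:derby://host:port/db from the question only works if a Derby Network Server is actually listening on that port and the network client driver is on Hive's classpath. A sketch, assuming a standalone Derby installation unpacked at $DERBY_HOME (a placeholder path):
# Start the Derby Network Server on the port used in the JDBC URL
$DERBY_HOME/bin/startNetworkServer -p 1527 &
# Make the network client driver visible to Hive
cp $DERBY_HOME/lib/derbyclient.jar $HIVE_HOME/lib/
For a single-node setup, the embedded URL shown above is simpler and needs neither of these steps.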

After changing CDH5 Kerberos authentication I am not able to access HDFS

I am trying to implement Kerberos authentication. I am using Hadoop 2.3 on CDH 5.0.1. I have made the following changes:
Added the following properties to core-site.xml
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
After restarting the daemons, when I issue the hadoop fs -ls / command, I get the following error:
ls: Failed on local exception: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.; Host Details : local host is: "cldx-xxxx-xxxx/xxx.xx.xx.xx"; destination host is: "cldx-xxxx-xxxx":8020;
Please help me out.
Thanks in advance,
Ankita Singla
There is a lot more to configuring a secure HDFS cluster than just setting hadoop.security.authentication to kerberos. See Configuring Hadoop Security in CDH 5 for the required config settings. You'll need to create appropriate keytab files. Only after you have configured everything, and confirmed that none of the Hadoop services report any errors in their respective logs (namenode and datanode on all hosts, resourcemanager, nodemanager on all nodes, etc.), can you attempt to connect.
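As a sketch of what "appropriate keytab files" involves (the realm EXAMPLE.COM, the host name, and the keytab path are all placeholders; the exact set of principals depends on which services you run):
# On the KDC: create service principals and export them to a keytab
kadmin.local -q "addprinc -randkey hdfs/cldx-host.example.com@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey HTTP/cldx-host.example.com@EXAMPLE.COM"
kadmin.local -q "xst -k hdfs.keytab hdfs/cldx-host.example.com HTTP/cldx-host.example.com"
# On the client: obtain a ticket before talking to the secured cluster
kinit -kt hdfs.keytab hdfs/cldx-host.example.com@EXAMPLE.COM
hadoop fs -ls /
The keytab locations and principal patterns also have to be wired into hdfs-site.xml (dfs.namenode.keytab.file, dfs.namenode.kerberos.principal, and their datanode counterparts) as described in the CDH 5 security guide.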

Facing issues while starting the Hive server and the Hive web interface

((1))
I'm getting the below error while starting the Thrift server:
hive --service hiveserver
Starting Hive Thrift Server
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.
When I ran netstat, port 10000 was already in use:
$ netstat -nl | grep 10000
tcp6 0 0 :::10000 :::* LISTEN
How do I resolve this?
((2))
While starting the Hive web interface, I get the below error:
$ hive --service hwi
13/01/01 22:05:36 INFO hwi.HWIServer: HWI is starting up
13/01/01 22:05:37 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
13/01/01 22:05:37 INFO mortbay.log: jetty-6.1.26
13/01/01 22:05:37 INFO mortbay.log: Extract /opt/hive/lib/hive-hwi-0.9.0.jar to /tmp/Jetty_127_0_0_1_3606_hive.hwi.0.9.0.jar__hwi__.6ogsv5/webapp
13/01/01 22:05:37 WARN mortbay.log: failed SocketConnector#127.0.0.1:3606: java.net.BindException: Address already in use
13/01/01 22:05:37 WARN mortbay.log: failed Jetty20SShims$Server#21e554: java.net.BindException: Address already in use
Exception in thread "main" java.net.BindException: Address already in use
at java.net.PlainSocketImpl.socketBind(Native Method)
Please help.
Thanks in advance!!
Your port seems to be in use by some other program; you may follow the steps below:
((1)) Start the Hive server using another port:
hive --service hiveserver -p 10001 &
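You can confirm the new port is listening before pointing clients at it, using the same netstat pattern as in the question:
$ netstat -nl | grep 10001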
((2))
a] Create the hive-site.xml file, if not present, in the $HIVE_HOME/conf folder
b] Put the following lines in it:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9998</value>
  </property>
  <property>
    <name>hive.hwi.war.file</name>
    <value>lib/hive-hwi-0.10.0.war</value>
    <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
  </property>
</configuration>
c] Start the Hive web interface:
hive --service hwi
d] Browse to localhost:9998/hwi/
I faced the same problem; here is the solution I found.
1) Set the port number:
export HIVE_PORT=10000
2) Check which services are listening:
sudo lsof -i -P | grep -i "listen"
3) If a process is holding port 10000, kill it (pid is the process ID from the previous step):
kill -9 pid
4) Start the Hive server:
$HIVE_HOME/bin/hive --service hiveserver
If it does not work, go back to step 2 and start the server again.
Stop Hive.
Add the following properties in hive-site.xml:
1) hive.hwi.listen.host = host
2) hive.hwi.listen.port = 9999
3) hive.hwi.war.file = /lib/hive-common-0.12.0.2.0.6.1-102.jar (this sets the path to the HWI war file, relative to $HIVE_HOME)
Start Hive again.
Start HWI on the Hive server with the command:
nohup hive --service hwi &
Now you can access HWI at host:9999/hwi
Normally this issue arises when the host name has been changed, so that whatever user you created in the metastore still refers to the old metastore host name.
Case 1: the metastore is not up, which throws the above error. Run bin/metatool -listFSRoot; if it runs without error, you should be able to connect to Hive safely. If the issue is still not resolved, see case 2.
Case 2: tables created in Hive still point to the old Hive user, which referenced the old host name, so you cannot fetch records from the Hive tables.
Solution: revert the host name in all the files to the old host name and then start the Hadoop and Hive stacks one after the other. Apart from this, if anyone has another solution, please share; this is how I resolved it on my production box.
If this kind of issue arises, run
$ bin/metatool -listFSRoot
If it runs without error, then try to run the metastore and check whether Hive can fetch records from a table or not.
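If -listFSRoot shows the old host name, the metatool can also rewrite the stored locations instead of reverting the host name; a sketch, with both URIs as placeholders for your actual new and old NameNode addresses:
# Rewrite metastore FS root entries from the old host to the new one
$HIVE_HOME/bin/metatool -updateLocation hdfs://new-host:8020 hdfs://old-host:8020
Run -listFSRoot again afterwards to confirm the locations were updated.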
