Unable to connect to Spark SQL - Hadoop

I am using a remote MySQL metastore for Hive. When I run the Hive client it works fine, but when I try to use spark-sql, either via spark-shell or spark-submit, I am not able to connect to Hive and get the following error:
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver
I don't understand why Spark tries to connect to a Derby database when I am using MySQL for the metastore.
I am using Apache Spark 1.3 and Cloudera CDH 5.4.8.

It seems Spark is using the default Hive settings. Follow these steps (a command sketch follows the list):
1. Copy hive-site.xml, or create a soft link to it, in your SPARK_HOME/conf folder.
2. Add the Hive lib path to the classpath in SPARK_HOME/conf/spark-env.sh.
3. Restart the Spark cluster for everything to take effect.
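A minimal sketch of those three steps; /etc/hive/conf and /usr/lib/hive/lib are assumptions for a CDH-style layout, so adjust the paths for your install:
# link the Hive config into Spark's conf dir so Spark picks up the MySQL metastore
ln -s /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/hive-site.xml
# put the Hive libraries on Spark's classpath
echo 'export SPARK_CLASSPATH="$SPARK_CLASSPATH:/usr/lib/hive/lib/*"' >> $SPARK_HOME/conf/spark-env.sh
# restart the standalone cluster
$SPARK_HOME/sbin/stop-all.sh
$SPARK_HOME/sbin/start-all.sh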
I believe your hive-site.xml has the location of the MySQL metastore? If not, add these properties and restart spark-shell:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://MYSQL_HOST:3306/hive_{version}</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>XXXXXXXX</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>XXXXXXXX</value>
<description>Password to use against metastore database</description>
</property>
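Note that the MySQL JDBC driver itself must also be on Spark's classpath, otherwise the metastore connection will still fail with a driver ClassNotFoundException. A hedged example, assuming the connector jar lives at /usr/share/java/ (adjust the path for your system):
spark-shell --driver-class-path /usr/share/java/mysql-connector-java.jar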

Related

Creating a table in Hive on macOS fails with error: localhost:9000 failed on connection

hive> CREATE SCHEMA IF NOT EXISTS inconv_seql;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.net.ConnectException Call From User-MacBook-Air.local/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused)
localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused;
The above error occurs because the Hadoop daemons (which listen on port 9000 here) are not running on your local machine.
Start Hadoop, then start Hive, by following the steps below. Note also that the error mentions port 9000 while the configs below use 8020; make sure fs.default.name in core-site.xml and the port Hive actually connects to agree.
1. Check that Hadoop is running:
hduser@ubuntu:~$ jps
If no Hadoop daemons are running locally, start Hadoop with the following command:
hduser@ubuntu:~$ $HADOOP_HOME/sbin/start-all.sh
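Once everything is up, jps should list the HDFS and YARN daemons, something like the following (the PIDs are illustrative):
4321 NameNode
4522 DataNode
4719 SecondaryNameNode
4901 ResourceManager
5102 NodeManager
5423 Jps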
2. Check hive-site.xml and core-site.xml:
hive-site.xml
<property>
<name>hive.metastore.db.type</name>
<value>DERBY</value>
<description> Expects one of [derby, oracle, mysql, mssql, postgres]. Type of database used by the metastore. Information schema & JDBCStorageHandler depend on it. </description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://localhost:8020/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
</configuration>
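As a quick sanity check before launching Hive, verify that the NameNode is reachable on the port configured in core-site.xml (these commands assume the HDFS binaries are on your PATH):
# list the HDFS root via the configured NameNode address
hdfs dfs -ls hdfs://localhost:8020/
# or get an overall cluster report
hdfs dfsadmin -report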
Then try to launch the Hive terminal and proceed.

SemanticException in Hive Shell Mode

I have installed Hadoop 3.0.0 and Hive 2.3.1 on my PC. In parallel I installed MySQL, and SQL commands in the MySQL shell work fine. But while executing queries in Hive shell mode, I receive the following error:
hive> create table saurzcode(id int, name string);
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Please let me know the reason for the failure.
Also, please clarify the following:
1) The difference between Hive shell mode and MySQL shell mode.
2) Why configure a MySQL metastore for Hive?
Please find the hive-site.xml configuration below:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hivelogin</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>apache</value>
</property>
</configuration>
Your original exception is "Unable to load authentication plugin 'caching_sha2_password'", as you can see in the error log below:
org.apache.hadoop.hive.metastore.HiveMetaStore - Retrying creating default database after error: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true
, username = hivelogin. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Unable to load authentication plugin 'caching_sha2_password'.
Solution:
This error happens because newer MySQL versions (8.0+) ship with an added authentication plugin called caching_sha2_password. It either has to be configured properly on the MySQL server, or you can simply use mysql_native_password with CREATE USER in MySQL, as below, to resolve it.
When creating the Hive metastore user, just follow the commands below (using the username, password, and database from your hive-site.xml):
CREATE USER 'hivelogin'@'localhost' IDENTIFIED WITH mysql_native_password BY 'apache';
GRANT ALL PRIVILEGES ON metastore.* TO 'hivelogin'@'localhost';
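Alternatively, the server-side default can be changed so that new users get the legacy plugin automatically. A sketch, assuming MySQL 8.x, to be placed in my.cnf and followed by a restart of mysqld:
[mysqld]
default_authentication_plugin=mysql_native_password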

Hive metastore Configuration with derby

On a RedHat test server I installed Hadoop 2.7 and ran Hive, Pig, and Spark without issues. But when I tried to access the Hive metastore from Spark I got errors, so I decided to add a hive-site.xml. (After extracting apache-hive-1.2.1-bin.tar.gz I had just added $HIVE_HOME to .bashrc as per a tutorial, and everything was working other than this integration with Spark.) On the Apache site I found that I need to provide hive-site.xml as the metastore configuration.
I created the file as below
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
</configuration>
I put the host as localhost since it is a single-node machine. After that I am not able to connect even to Hive. It throws the error:
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
....
Caused by: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby://localhost:1527/metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:derby://localhost:1527/metastore_db;create=true
There are many more error logs pointing to the same thing. If I remove hive-site.xml from the conf folder, Hive works without issues. Can anyone point me to the right default metastore configuration?
Thanks,
Anoop R
Derby is used as an embedded database. Try using
jdbc:derby:metastore_db;create=true
as the JDBC URL. See also
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-EmbeddedMetastore
To make the metastore fully functional (that is, accessible from different services), set it up using MySQL as described in the document above.
As you are setting up an embedded metastore database, use the property below as JDBC URL:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
I was also facing a similar exception while installing Hive. What worked for me was initializing the Derby database. Go to $HIVE_HOME/bin and run:
schematool -initSchema -dbType derby
You can also follow this link: http://www.edureka.co/blog/apache-hive-installation-on-ubuntu
It will work if you put derbyclient.jar in the lib folder of Hive (see the sketch below).
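That jar is only needed for the network-server URL form (jdbc:derby://localhost:1527/...), and the Derby network server must actually be running. A sketch, assuming a standalone Apache Derby install under $DERBY_HOME:
# make the client driver visible to Hive
cp $DERBY_HOME/lib/derbyclient.jar $HIVE_HOME/lib/
# start the Derby network server on the port used in the JDBC URL
$DERBY_HOME/bin/startNetworkServer -h localhost -p 1527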

default.fs.name and hive.metastore.warehouse.dir do not conflict

Hi, when I try to run the command below
LOAD DATA INPATH '/data' INTO TABLE Tablename;
in the Hive shell, it throws the following error:
Move from: hdfs://hadoopcluster/data to: file:/user/hive/warehouse/Tablename is not valid. Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.
My default.fs.name (fs.defaultFS) property is:
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoopcluster</value>
</property>
and my hive.metastore.warehouse.dir is:
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
Can anyone help me with this?
This is because the warehouse location /user/hive/warehouse has no URI scheme, so Hive treats it as "local" storage, which conflicts with the defaultFS.
Do you mean to be using "local" storage, or HDFS?
To use HDFS for the Hive warehouse, you need to specify the full HDFS URI for that storage:
hdfs://hadoopcluster/user/hive/warehouse
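Concretely, the warehouse property from the question would then become (keeping the questioner's own values):
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://hadoopcluster/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>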

HBase is not working in Hadoop 2.2.0

I am trying to install hbase-0.96.0-hadoop2 on Hadoop 2.2.0. When I try to start HBase, it gives the following error:
master: log4j:ERROR Could not find value for key log4j.appender.DRFAS
master: log4j:ERROR Could not instantiate appender named "DRFAS".
log4j:ERROR Could not find value for key log4j.appender.DRFAS
log4j:ERROR Could not instantiate appender named "DRFAS".
When I run jps, Linux shows the following processes:
17422 JobHistoryServer
11461 NameNode
31375 Jps
12127 ResourceManager
11671 DataNode
30077 HRegionServer
12344 NodeManager
11935 SecondaryNameNode
30948 HQuorumPeer
Here is my hbase-site.xml configuration:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/master</value>
</property>
</configuration>
Try these two methods (a command sketch follows the list).
1. Stop your HBase daemon and clear the HBase files located in the /tmp/ folder: delete all files and directories that have "hbase" in their name.
2. After deleting them, disconnect your machine from the internet and try to start the HBase daemon. HBase has a weird issue on some x64 Ubuntu machines where disconnecting from the internet helps it start up; after startup you can reconnect to the internet.
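A sketch of that cleanup, assuming $HBASE_HOME points at your install; by default HBase writes its temporary data under /tmp/hbase-<user>:
# stop HBase, clear its /tmp files, then start it again
$HBASE_HOME/bin/stop-hbase.sh
rm -rf /tmp/hbase-*
$HBASE_HOME/bin/start-hbase.sh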
Now try to access HBase from the CLI:
bin/hbase shell
