Hue Hive -- Beeswax Server Can't Find JDBC Driver for MySQL

We're using Cloudera 3.7.5 and having a tough time configuring the Beeswax server so that Hue can access the Hive databases. I followed all the instructions from the Cloudera documentation for setting up MySQL to serve as Hive's metastore, but when I restart the Hue services and check the Beeswax server's stderr logs, I still see the painful "javax.jdo.JDOFatalInternalException: Error creating transactional connection factory", which is caused by
org.datanucleus.exceptions.NucleusException: Attempt to invoke the "DBCP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
This is bizarre to me, because the logs also indicate that the environment variable HIVE_HOME is set to "/usr/lib/hive", and sure enough I have copied "mysql-connector-java-5.1.15-bin.jar" into the /usr/lib/hive/lib directory, as the documentation dictates.
I have also tried the instructions in the blog post http://hadoopchallenges.blogspot.com/2011/03/hue-120-upgrade-and-beeswax.html, which involved copying the mysql-connector jar into "/usr/share/hue/apps/beeswax/hive/lib/". Unfortunately I did not have a hive/lib subdirectory in the beeswax folder, so I attempted to create one. This also did not work.
Any advice on how I can get the MySQL JDBC library onto Beeswax's classpath?

We finally decided to just bite the bullet and upgrade to CDH4. Placing the JDBC jar in /usr/share/hive/lib allowed the Beeswax server to function without issue.
If anyone else is experiencing this issue, I recommend upgrading from CDH3 to CDH4; the UI is much cleaner and smoother, and we had far fewer installation and maintenance bugs with CDH4.

You have to place your mysql connector jar in HUE_HOME/apps/beeswax/hive/lib.
If this path doesn't exist, create the hive/lib directory and then copy the mysql connector into it. I hope this solves your problem.
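A minimal sketch of that, assuming HUE_HOME points at your Hue install and reusing the connector jar already under /usr/lib/hive/lib:
mkdir -p $HUE_HOME/apps/beeswax/hive/lib    # create the directory if it doesn't exist
cp /usr/lib/hive/lib/mysql-connector-java-5.1.15-bin.jar $HUE_HOME/apps/beeswax/hive/lib/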

Starting with Cloudera 4.5 everything moves into parcels, so this exact problem on my Hive metastore server was fixed by the command below. Essentially you're just re-adding the module. You can probably also modify the extra classpath in the Hive config file to make this resilient to parcel updates.
cp /usr/lib/hive/lib/mysql-connector-java-5.1.17-bin.jar /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive/lib/.
So a real fix might be something like this (looping, because cp only accepts a single destination directory):
for dir in /opt/cloudera/parcels/*/lib/hive/lib/; do cp "$(locate mysql-connector | grep jar | head -n 1)" "$dir"; done
which would copy the jar into every parcel.

Related

Error when trying to execute kylin.sh start in HDP Sandbox 2.6

I installed Apache Kylin, following the official installation guide http://kylin.apache.org/docs/install/index.html, in HDP sandbox 2.6
When I run the script $KYLIN_HOME/bin/kylin.sh start, I get the error below:
What can I do to fix this error?
Thanks in advance
Check whether the Hive service is up in your Ambari; when the Hive service is down, Kylin cannot find it and gives this error. Check your .bash_profile as well. When those two issues are addressed, Kylin should be able to find the location of the Hive dependency.
Kylin uses the find-hive-dependency.sh script to set up the CLASSPATH. This script uses a Hive CLI command (I tested it with beeline) to query Hive env vars and extract the CLASSPATH from them.
beeline connects to Hive using the properties in kylin_hive_conf.xml, but for some reason (probably due to the Hive version included in HDP 2.6) some of the loaded Hive properties cannot be set when the connection is established.
The Hive properties that cause the issue can be discarded when connecting to Hive to query the CLASSPATH, so, to fix this issue:
Edit $KYLIN_HOME/conf/kylin.properties and set kylin.source.hive.client=beeline
Open the find-hive-dependency.sh script, go to approximately line 34, and modify the line
hive_env=${beeline_shell} ${hive_conf_properties} ${beeline_params} --outputformat=dsv -e "set;" 2>&1 | grep 'env:CLASSPATH'
Just remove ${hive_conf_properties}
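After the edit, the quoted line reads the same as above, just without ${hive_conf_properties}:
hive_env=${beeline_shell} ${beeline_params} --outputformat=dsv -e "set;" 2>&1 | grep 'env:CLASSPATH'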
Check that the Hive dependencies have been configured by running the find-hive-dependency.sh command.
Now $KYLIN_HOME/bin/kylin.sh start should work.

How to add jar files for Hue in Cloudera?

I'm running an SQL query on a JSON serde table. It's working in the Hive CLI, but it's failing in Hue with the error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I guess it's due to a missing jar file; any idea how to add the jar hive-hcatalog-core-1.2.1.jar for Hue?
Place your jar in HDFS and reference that path using ADD JAR hdfs:///user/hive/lib/hive-hcatalog-core-1.2.1.jar;
Run ADD JAR hive-hcatalog-core-1.2.1.jar in Hue before your query; the jar will remain available for as long as your current session persists.
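A rough end-to-end sketch of that approach (the local path to the jar and the HDFS target directory are assumptions; adjust them to your environment):
hadoop fs -mkdir -p /user/hive/lib
hadoop fs -put /path/to/hive-hcatalog-core-1.2.1.jar /user/hive/lib/
Then, in the Hue query editor, before your query:
ADD JAR hdfs:///user/hive/lib/hive-hcatalog-core-1.2.1.jar;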
For the benefit of others who might face the same issue, either with this particular jar ("hive-hcatalog-core-1.2.1.jar") or any UDF jar:
In the HUE - Query Editor, run the following command:
add jar hdfs:/hive-hcatalog-core-1.2.1.jar;
Please note that single quotes are not required, as they are with the Hive CLI.
The exact command Cloudera gives is ADD JAR {{lib_dir}}/hive/lib/hive-contrib.jar;
1) I am unable to find the hive/lib directory on CDH 5.
The {{lib_dir}} on CDH-installed environments for Hive would be either /usr/lib/hive/ or /opt/cloudera/parcels/CDH/lib/hive/ (depending on whether packages or parcels are in use).
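For example, with {{lib_dir}} substituted, the statement becomes one of the following, depending on whether you use packages or parcels:
ADD JAR /usr/lib/hive/lib/hive-contrib.jar;
ADD JAR /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar;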
This is the way to add a jar in Cloudera.
For this you have to switch to the superuser, using the command
sudo su
which will change you to the superuser.

Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses - submitting job to remote cluster

I recently upgraded my cluster from Apache Hadoop 1.0 to CDH 4.4.0. I have a WebLogic server on another machine from which I submit jobs to this remote cluster via the mapreduce client. I still want to use MR1 and not YARN. I have compiled my client code against the client jars in the CDH installation (/usr/lib/hadoop/client/*).
I am getting the error below when creating a JobClient instance. There are many posts about the same issue, but all the solutions cover the scenario of submitting the job to a local cluster, not to a remote one, and specifically not from a WLS container as in my case.
JobClient jc = new JobClient(conf);
Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
But running from the command prompt on the cluster works perfectly fine.
Appreciate your timely help!
I had a similar error and added the following jars to the classpath, and it worked for me:
hadoop-mapreduce-client-jobclient-2.2.0.2.0.6.0-76:hadoop-mapreduce-client-shuffle-2.3.0.jar:hadoop-mapreduce-client-common-2.3.0.jar
It's likely that your app is looking at your old Hadoop 1.x configuration files. Maybe your app hard-codes some config? This error tends to indicate that you are using the new client libraries but that they are not seeing new-style configuration.
That configuration must exist, since the command-line tools see it fine. Check your HADOOP_HOME or HADOOP_CONF_DIR env variables too, although that's what the command-line tools tend to pick up, and they work.
Note that you need to install the 'mapreduce' service and not 'yarn' in CDH 4.4 to make it compatible with MR1 clients. See also the '...-mr1-...' artifacts in Maven.
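For reference, a Maven coordinate for one of those '-mr1-' artifacts might look roughly like this (the version string is an assumption; check the Cloudera repository for the exact one matching your CDH release):
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.0.0-mr1-cdh4.4.0</version>
</dependency>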
In my case, this error was due to the version of the jars; make sure that you are using the same versions as on the server.
export HADOOP_MAPRED_HOME=/cloudera/parcels/CDH-4.1.3-1.cdh4.1.3.p0.23/lib/hadoop-0.20-mapreduce
In my case I was running Sqoop 1.4.5 and pointing it to the latest hadoop-2.0.0-cdh4.4.0, which includes the YARN stuff as well; that's why it was complaining.
When I pointed Sqoop to hadoop-0.20/2.0.0-cdh4.4.0 (MR1, I think), it worked.
As with Akshay (comment by Setob_b), all I needed to fix this was to get hadoop-mapreduce-client-shuffle-.jar onto my classpath.
For Maven, that is:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-shuffle</artifactId>
    <version>${hadoop.version}</version>
</dependency>
In my case, strangely, this error was because I had used the IP address rather than the hostname in my core-site.xml file.
The moment I used the hostname in place of the IP address in core-site.xml and mapred.xml and re-installed the mapreduce lib files, the error was resolved.
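As a sketch, the relevant core-site.xml entry with a hostname instead of an IP might look like this (the property name and port assume a typical CDH4/MR1 setup, and namenode-host is a placeholder):
<property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:8020</value>
</property>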
In my case, I resolved this by using hadoop jar instead of java -jar.
This is useful because hadoop then provides the configuration context from hdfs-site.xml, core-site.xml, etc.
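For example (the jar name, driver class, and arguments are placeholders):
hadoop jar my-job.jar com.example.MyDriver /input /output    # hadoop jar puts the cluster config (core-site.xml, hdfs-site.xml, ...) on the classpath
instead of:
java -jar my-job.jar /input /output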

FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

I shut down my HDFS client while the HDFS and Hive instances were running. Now that I have logged back into Hive, I can't execute any of my DDL tasks, e.g. "show tables" or "describe tablename". It gives me the error below:
ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Can anybody suggest what I need to do to get my metastore_db instantiated without recreating the tables? Otherwise, I have to duplicate the effort of creating the entire database/schema once again.
I have resolved the problem. These are the steps I followed:
1. Go to $HIVE_HOME/bin/metastore_db
2. Copy db.lck to db.lck1 and dbex.lck to dbex.lck1
3. Delete the lock entries from db.lck and dbex.lck
4. Log out from the hive shell as well as from all running instances of HDFS
5. Re-login to HDFS and the hive shell. If you run DDL commands, it may again give you the "Could not instantiate HiveMetaStoreClient" error
6. Now copy db.lck1 back to db.lck and dbex.lck1 back to dbex.lck
7. Log out from all hive shell and HDFS instances
8. Re-login, and you should see your old tables
Note: Step 5 may seem a little weird, because even after deleting the lock entry it will still give the HiveMetaStoreClient error, but it worked for me.
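A rough shell sketch of steps 1-3 (this assumes an embedded Derby metastore under $HIVE_HOME/bin/metastore_db; truncating the files is one way to delete the lock entries):
cd $HIVE_HOME/bin/metastore_db
cp db.lck db.lck1        # back up the lock files
cp dbex.lck dbex.lck1
> db.lck                 # clear the lock entries
> dbex.lck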
Advantage: you don't have to duplicate the effort of re-creating the entire database.
Hope this helps somebody facing the same error. Please vote if you find it useful. Thanks.
I was told that we generally get this exception if the Hive console was not terminated properly.
The fix:
Run the jps command, look for the "RunJar" process, and kill it using the
kill -9 command.
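A minimal sketch of that (the PID printed by jps will differ on your machine):
jps | grep RunJar        # find the RunJar process id
kill -9 <pid>            # <pid> is the number printed by the previous command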
See: getting error in hive
Have you copied the jar containing the JDBC driver for your metadata db into Hive's lib dir?
For instance, if you're using MySQL to hold your metadata db, you will need to copy
mysql-connector-java-5.1.22-bin.jar into $HIVE_HOME/lib.
This fixed that same error for me.
I faced the same issue and resolved it by starting the metastore service. Sometimes the service might be stopped if your machine was rebooted or went down. You can start the service by running the following command:
Login as $HIVE_USER
nohup hive --service metastore>$HIVE_LOG_DIR/hive.out 2>$HIVE_LOG_DIR/hive.log &
I had a similar problem with the Hive server and followed the steps below:
1. Go to $HIVE_HOME/bin/metastore_db
2. Copy db.lck to db.lck1 and dbex.lck to dbex.lck1
3. Delete the lock entries from db.lck and dbex.lck
4. Re-login from the hive shell. It is working now.
Thanks
For instance, I use MySQL to hold the metadata db; I copied
mysql-connector-java-5.1.22-bin.jar into the $HIVE_HOME/lib folder
and my error was resolved.
I was also facing the same problem, and figured out that I had both hive-default.xml and hive-site.xml (created manually by me).
I moved my hive-site.xml to hive-site.xml-template (as I did not need this file), then
started Hive, and it worked fine.
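A sketch of that rename, assuming the files live in $HIVE_HOME/conf:
mv $HIVE_HOME/conf/hive-site.xml $HIVE_HOME/conf/hive-site.xml-template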
Cheers,
Ajmal
I have faced this issue as well; in my case it occurred while running the hive command from the command line.
I resolved it by running the kinit command, as I was using kerberized Hive:
kinit -kt <your keytab file location> <kerberos principal>

Oozie + Sqoop: JDBC Driver Jar Location

I have a 6-node Cloudera-based Hadoop cluster and I'm trying to connect to an Oracle database from a Sqoop action in Oozie.
I have copied my ojdbc6.jar into the sqoop lib location (which for me happens to be at /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/sqoop/lib/) on all the nodes and have verified that I can run a simple 'sqoop eval' from all 6 nodes.
Now when I run the same command using Oozie's sqoop action, I get "Could not load db driver class: oracle.jdbc.OracleDriver"
I have read this article about using shared libs, and it makes sense to me when we're talking about my task/action/workflow-specific dependencies. But I see a JDBC driver installation as an extension to Sqoop, and so I think it belongs in the Sqoop installation lib.
Now the question is: while Sqoop sees this ojdbc6 jar I have put into its lib folder, how come my Oozie workflow doesn't see it?
Is this something expected or am I missing something?
As an aside, what do you guy think about where is the appropriate location for a JDBC driver jar?
Thanks in advance!
The JDBC driver jar (and any jars it depends on) should go in your Oozie sharelib folder on HDFS. I'm running Hortonworks Data Platform 1.2 instead of Cloudera 4.2, so the details may vary, but my JDBC driver is located in /user/oozie/share/lib/sqoop. This should allow you to run Sqoop with the JDBC driver via Oozie.
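A sketch of getting the driver into that sharelib location (the HDFS path is the one mentioned above; you may also need to make the jar readable by the oozie user):
hadoop fs -put ojdbc6.jar /user/oozie/share/lib/sqoop/
hadoop fs -ls /user/oozie/share/lib/sqoop/    # confirm the jar landed there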
It is not necessary to put the JDBC driver jar in the sqoop lib on the data nodes. In my setup I can't run a simple sqoop eval from the command line on my data nodes. I understand the logic for why you thought this would work. The reason the JDBC driver jar needs to be on HDFS is so that all the data nodes have access to it. Your solution should accomplish the same goal. I'm not familiar enough with the inner workings of Oozie to say why using the sharelib works but your solution does not.
In CDH 5, you should put the jar into '/user/oozie/share/lib/lib_${timestamp}/sqoop', and after that you must update the sharelib or restart Oozie.
To update the sharelib:
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
If you are using CDH 5, the JDBC driver jar (and any jars it depends on) should go in the '/user/oozie/share/lib/lib_${timestamp}/sqoop' folder on HDFS.
I was facing the same issue; it was not able to find the mysql jar. I am using Cloudera 4.4, and in this version even the oozie admin -oozie http://localhost:11000/oozie -sharelibupdate command will not work.
To resolve the issue I followed the steps below:
Create a user in Hue with hdfs and give it admin privileges.
Using the Hue UI, upload the jar into the /user/oozie/share/lib/sqoop HDFS path,
or you can use the command below:
hadoop fs -put /var/lib/sqoop2/mysql-connector-java.jar /user/oozie/share/lib/sqoop
Once the jar is placed, run the Oozie command.
