Pentaho Data Integration with Hive connection - hadoop

I am using Pentaho Data Integration and am trying to connect to Hive, but I get the error below:
Error connecting to database [Hive] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database
Error connecting to database: (using class org.apache.hadoop.hive.jdbc.HiveDriver)
Error occured while trying to connect to the database
Error connecting to database: (using class org.apache.hadoop.hive.jdbc.HiveDriver)
at org.pentaho.di.core.database.Database.normalConnect(
at org.pentaho.di.core.database.Database.connect(
at org.pentaho.di.core.database.Database.connect(
at org.pentaho.di.core.database.Database.connect(
at org.pentaho.di.core.database.DatabaseFactory.getConnectionTestReport(
at org.pentaho.di.core.database.DatabaseMeta.testConnection(
at org.pentaho.di.ui.core.database.dialog.DatabaseDialog.test(
at org.pentaho.di.ui.core.database.wizard.CreateDatabaseWizardPage2.test(
at org.pentaho.di.ui.core.database.wizard.CreateDatabaseWizardPage2$3.widgetSelected(
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.eclipse.jface.window.Window.runEventLoop(
at org.pentaho.di.ui.core.database.wizard.CreateDatabaseWizard.createAndRunDatabaseWizard(
I am using localhost as the host, 8888 as the port, and default as the database.
Please help.
Jiten Pansara

What Hadoop distribution are you using? If you are not using Apache Hadoop 0.20.x, you will have to configure PDI by setting certain properties. See the following wiki page for details on how to set up Pentaho for a particular Hadoop distribution:

Did you edit the configuration in the plugin folder?
data-integration > plugins > pentaho-big-data-plugin >
Change the property "active.hadoop.configuration" to the Hadoop distribution you are using, e.g.:
This might solve the issue.
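To double-check which value is actually set, a minimal Java sketch like the one below can read the property back. The class name ActiveHadoopConfig and the sample value example-distro are made up for illustration; only the property name active.hadoop.configuration comes from the answer above. In practice you would pass a reader for whichever properties file sits in your plugin folder.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: reads the active Hadoop configuration name
// from Java-properties-formatted text.
final class ActiveHadoopConfig {
    // Property name quoted in the answer above.
    static final String KEY = "active.hadoop.configuration";

    // Returns the trimmed property value, or an empty string if the key is absent.
    static String read(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(KEY);
        return value == null ? "" : value.trim();
    }

    public static void main(String[] args) {
        // "example-distro" is a placeholder, not a real distribution name.
        String sample = "# sample content\nactive.hadoop.configuration=example-distro\n";
        System.out.println(read(new StringReader(sample))); // prints example-distro
    }
}
```

An empty result would suggest the property is missing or the wrong file is being read.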


derby db upgrade file permission issue

I'm currently running a derby DB instance created from version
I connect in network mode (startNetworkServer), with the server running on a Red Hat machine.
I'm now wanting to upgrade to version
However, when trying to connect to the upgraded database, I receive an access denied "" error.
I downloaded both versions onto my Windows desktop.
A backup of the database is created using the following command: SYSCS_UTIL.SYSCS_BACKUP_DATABASE
I copied this backup to both the 10.13 and 10.14 folders.
Starting with my current version (13), I start the network server and then use ij to connect to the database. This works fine and I can see the tables, which validates that my backup is fine.
connect 'jdbc:derby://localhost:1527/c:\Temp\13\database;create=false';
I then start version 14's network server and switch to version 14's ij. When I try to connect to the backup:
connect 'jdbc:derby://localhost:1527/c:\Temp\14\database;create=false';
I get the filePermission error:
access denied ("" "C:\Temp\updating_derby\threatadvisor" "read")
Fair enough, I assume this is because I'm trying to connect to an older-version database without having set the upgrade=true parameter. When I remove the create parameter and add the upgrade parameter, it still fails with the same error.
OK, so perhaps I can't upgrade a DB via the network server and have to connect to the DB directly. From within my app, I use the following connection string:
The app has the version 14 jar on the classpath, so it should use it and upgrade. It does: the app starts normally and I see all the data. How do I know it upgraded? Because I tried to connect to this version 14 database using the version 13 network server and ij, and it fails (as expected, due to the version).
So I'm done, right? No. When I once more try to connect to this now-upgraded database via the network server using ij, I get the same error again.
I checked that the actual OS permissions on the folders and files inside the "database" folder are not read-only. None are. Yet it still fails.
I've even tried running the version 14 network server on the Red Hat box (on a different port) and connecting to this DB via ij, and even there I get the file permission error.
I'm really at a loss as to what to do next. Please help!
FYI, the full issue from the derby.log file:
Tue Jun 11 12:04:15 AEST 2019 : Apache Derby Network Server - - (1828579) started and ready to accept connections on port 1527
Tue Jun 11 12:04:28 AEST 2019 Thread[DRDAConnThread_2,5,main] Cleanup action starting access denied ("" "C:\Temp\14\database" "read")
at java.lang.SecurityManager.checkPermission(
at java.lang.SecurityManager.checkRead(
at Source)
at Source)
at Source)
at Source)
at$400(Unknown Source)
at$ Source)
at$ Source)
at Method)
at Source)
at Source)
at Source)
at Source)
at org.apache.derby.impl.jdbc.EmbedConnection$ Source)
at org.apache.derby.impl.jdbc.EmbedConnection$ Source)
at Method)
at org.apache.derby.impl.jdbc.EmbedConnection.startPersistentService(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.(Unknown Source)
at org.apache.derby.jdbc.InternalDriver$ Source)
at org.apache.derby.jdbc.InternalDriver$ Source)
at Method)
at org.apache.derby.jdbc.InternalDriver.getNewEmbedConnection(Unknown Source)
at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
at org.apache.derby.jdbc.EmbeddedDriver.connect(Unknown Source)
at org.apache.derby.impl.drda.Database.makeConnection(Unknown Source)
at org.apache.derby.impl.drda.DRDAConnThread.getConnFromDatabaseName(Unknown Source)
at org.apache.derby.impl.drda.DRDAConnThread.verifyUserIdPassword(Unknown Source)
at org.apache.derby.impl.drda.DRDAConnThread.parseSECCHK(Unknown Source)
at org.apache.derby.impl.drda.DRDAConnThread.parseDRDAConnection(Unknown Source)
at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown Source)
at Source)
Cleanup action completed
I am now trying to set up the security.policy file as per this guide. However, after creating a new policy file based on the template in the demo directory, we can't even get Derby to pick up our file.
When we try to run:
java -classpath "C:\Temp\14\lib\derby.jar;C:\Temp\14\lib\derbynet.jar;C:\Temp\14\lib\derbyclient.jar;C:\Temp\14\lib\derbytools.jar;C:\Temp\14\lib\derbyoptionaltools.jar"\Temp\14\server.policy org.apache.derby.drda.NetworkServerControl start
We get the following error: access denied "engine", "usederbyinternals" )
at Source)
at Source)
at$ Source)
at$ Source)
at Method)
at Source)
at Source)
at Source)
at org.apache.derby.impl.drda.NetworkServerControlImpl.init(Unknown Source)
at org.apache.derby.impl.drda.NetworkServerControlImpl.(Unknown Source)
at org.apache.derby.drda.NetworkServerControl.main(Unknown Source)
I know this line is in the policy file (and uncommented):
permission "engine", "usederbyinternals";
However, I don't think it is even picking up our policy file, because if we change our reference to a non-existent policy file, we still get the same error.
Thanks to @BryanPendleton for pointing me in the right direction. The initial issue was indeed caused by the missing server.policy file. His link was helpful:
The second issue was resolved by using the server.policy file template located here:
instead of the one provided in the download (the one in the Derby download didn't mention as many jars). More to the point, the way we referenced the jars had to be tweaked. All the examples use Unix paths, whereas we were developing on a Windows test PC. Therefore, instead of something like (Unix):
grant codeBase "file:///home/someone/derby/lib/derby.jar"
We needed to do:
grant codeBase "file:///C:/Temp/14/lib/derby.jar"
Note the additional '/' after 'file' - we had assumed it was merely "file://C:...."
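For anyone generating these entries for several jars, here is a small Java sketch of the path-to-codeBase conversion described above. The class and method names (CodeBaseUrl, fromWindowsPath, grantHeader) are hypothetical, and the sketch assumes a plain drive-letter path without spaces or other characters that would need URL escaping.

```java
// Hypothetical helper: turns a Windows jar path into the codeBase form
// that worked in the policy file above.
final class CodeBaseUrl {
    // Converts C:\dir\file.jar into file:///C:/dir/file.jar.
    // Assumption: no characters that need percent-encoding (e.g. spaces).
    static String fromWindowsPath(String windowsPath) {
        String forwardSlashes = windowsPath.replace('\\', '/');
        // Three slashes: "file://" plus the empty host, then "/" before the drive letter.
        return "file:///" + forwardSlashes;
    }

    // Builds the first line of a grant entry for the given jar.
    static String grantHeader(String windowsPath) {
        return "grant codeBase \"" + fromWindowsPath(windowsPath) + "\"";
    }

    public static void main(String[] args) {
        // prints: grant codeBase "file:///C:/Temp/14/lib/derby.jar"
        System.out.println(grantHeader("C:\\Temp\\14\\lib\\derby.jar"));
    }
}
```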
There is another solution to the problem which is to use this code:
and use this:
Policy.setPolicy(new DerbyPolicy());
To get a policy set programmatically.

Hive failed to create metastore database

After I run the hive command, it fails to create the metastore database.
I am following the official "Getting Started" guide on the Apache website.
java.sql.SQLException: Unable to open a test connection to the given
database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true,
username = APP. Terminating connection pool (set lazyInit to true if you
expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details.
Caused by: java.sql.SQLException: Directory /opt/hive/bin/metastore_db cannot be created.
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
... 80 more
Caused by: ERROR XBM0H: Directory /opt/hive/bin/metastore_db cannot be created.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at$ Source)
at Method)
at Source)
at Source)
at Source)
at Source)
... 80 more
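Check whether the user running Hive has permission to create /opt/hive/bin/metastore_db, which requires write permission on the parent directory /opt/hive/bin.
If not, add the permission (the directory has to exist before chmod can be applied to it):
sudo chmod -R 777 /opt/hive/bin/metastore_db
Hope this helps.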
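The permission check can also be done from Java, run as the same user that runs Hive. This is only a sketch: the class MetastoreDirCheck and its methods are made up for illustration, and it only looks at the nearest existing ancestor directory, so it will not catch every cause of Derby's XBM0H error.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: checks whether the current user could create a directory.
final class MetastoreDirCheck {
    // Walks up from the target until it finds a path that exists.
    // Returns the target itself if it already exists, or null if nothing exists.
    static Path nearestExistingAncestor(Path target) {
        Path current = target.toAbsolutePath().normalize();
        while (current != null && !Files.exists(current)) {
            current = current.getParent();
        }
        return current;
    }

    // True if the nearest existing ancestor is a directory that the
    // current user can write to and traverse.
    static boolean canCreate(Path target) {
        Path ancestor = nearestExistingAncestor(target);
        return ancestor != null
                && Files.isDirectory(ancestor)
                && Files.isWritable(ancestor)
                && Files.isExecutable(ancestor);
    }

    public static void main(String[] args) {
        // Path taken from the error message in the question.
        Path target = Paths.get("/opt/hive/bin/metastore_db");
        System.out.println("Nearest existing ancestor: " + nearestExistingAncestor(target));
        System.out.println("Can create: " + canCreate(target));
    }
}
```

If this reports false, fixing ownership or permissions on the reported ancestor directory is the place to start.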
Use chown -R hduser:hduser xxxxxxxxx/hive/ (replace hduser:hduser with whatever user and group you are using) to give that user ownership, and hence full permissions, including write and create.
(I don't know whether there is a default Hive user/group named hive:hive. I created one, but granting ownership to that hive user did not work.)
Basically, the installation instructions assume that you (and therefore Hadoop and Hive, which inherit your rights) have sufficient permissions.
(Note: this is roughly equivalent to running the command prompt in Admin mode on Windows, even though you are already an admin.) Linux tends to be flexible in many things, which can cause some chaos.
I also just realized something about hive-site.xml and the installation step that creates /user/hive/warehouse:
"user" is easily confused with "usr", and nothing warns you about it. So remember:
under hadoop fs, the directory to create is /user/hive/warehouse, which points to the metadata_db folder (NOT /usr/hive/warehouse).

Fiware Cosmos Hive Authorization Issue

I'm using a shared instance of Fiware Cosmos (meaning I don't have root privileges). Until today I have successfully accessed and managed tables in Hive, both remotely using JDBC and through the Hive CLI.
But now I'm getting this error when starting Hive CLI:
log4j:ERROR Could not instantiate class [org.apache.hadoop.hive.shims.HiveEventCounter].
java.lang.RuntimeException: Could not load shims in class org.apache.hadoop.log.metrics.EventCounter
at org.apache.hadoop.hive.shims.ShimLoader.createShim(
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(
at org.apache.hadoop.hive.shims.ShimLoader.getEventCounter(
at org.apache.hadoop.hive.shims.HiveEventCounter.<init>(
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
at java.lang.reflect.Constructor.newInstance(
at java.lang.Class.newInstance0(
at java.lang.Class.newInstance(
at org.apache.log4j.helpers.OptionConverter.instantiateByClassName(
at org.apache.log4j.helpers.OptionConverter.instantiateByKey(
at org.apache.log4j.PropertyConfigurator.parseAppender(
at org.apache.log4j.PropertyConfigurator.parseCategory(
at org.apache.log4j.PropertyConfigurator.configureRootCategory(
at org.apache.log4j.PropertyConfigurator.doConfigure(
at org.apache.log4j.PropertyConfigurator.doConfigure(
at org.apache.log4j.PropertyConfigurator.configure(
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(
at org.apache.hadoop.hive.cli.CliDriver.main(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.RunJar.main(
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.log.metrics.EventCounter
at Method)
at java.lang.ClassLoader.loadClass(
at sun.misc.Launcher$AppClassLoader.loadClass(
at java.lang.ClassLoader.loadClass(
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(
at org.apache.hadoop.hive.shims.ShimLoader.createShim(
... 27 more
log4j:ERROR Could not instantiate appender named "EventCounter".
Logging initialized using configuration in jar:file:/usr/local/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/
I can however perform select and create in the Hive CLI.
If I then try to access Hive remotely, I get this:
Connecting to jdbc:hive://x.x.x.x:10000/default?user=user&password=XXXXXXXXXX
Could not establish connection: Connection refused
I didn't make any changes to code or commands before the errors appeared, and after googling around I haven't found any working solutions.
If anyone can guide me to where the problem is, or how to find it, or even better how to solve it, I'd be grateful.
Thanks in advance!
HiveServer2 (the Hive JDBC service) is a very unstable piece of software. In our production cluster we have a cron job that restarts each instance every day, and even then it sometimes throws OutOfMemory errors and then just hangs, refusing connections as you show. Open a ticket with your Hadoop admin so that he/she restarts the damn service.
On the other hand, the org.apache.hadoop.log.metrics.EventCounter message smells like someone tried to change a shared config somewhere (or tried to upgrade some JARs), and now Hive believes that it is running on a very, very old version of Hadoop.
=> e.g. comments in HIVE-4133 or that MapR support post
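Before opening that ticket, it can help to confirm that nothing is accepting connections on the Hive port at all. Below is a minimal Java sketch of such a probe. The class PortProbe and the timeout value are assumptions for illustration; port 10000 is taken from the JDBC URL in the question. A successful TCP connect only shows that something is listening, not that Hive itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class PortProbe {
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unknown hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10000;
        System.out.println(host + ":" + port + " listening: " + isListening(host, port, 2000));
    }
}
```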
The cause of these issues was Hive upgrades in Cosmos. A more thorough explanation and solution can be found here:
My Hive client stopped working with Cosmos instance

Using Phoenix to integrate Elasticsearch and HBase: error when trying to create a table

I am following the instructions in "Connecting Hbase to Elasticsearch in 10 min or less". Everything goes fine until the step "Create a table in Hbase using SQLline". When I type $ $PHOENIX_HOME/hadoop1/bin/ localhost , the terminal shows:
znbee@znbee-Aspire-V5-452G:~/phoenix-4.1.0-bin/hadoop1$ bin/ localhost
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:localhost none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost
14/12/19 11:35:03 WARN util.Tracing: Tracing will outputs will not be written to any metrics sink! No TraceMetricsSink found on the classpath
java.lang.RuntimeException: Could not create interface org.apache.phoenix.trace.PhoenixSpanReceiver Is the hadoop compatibility jar on the classpath?
at org.apache.hadoop.hbase.CompatibilityFactory.getInstance(
at org.apache.phoenix.trace.TracingCompat.newTraceMetricSource(
at org.apache.phoenix.trace.util.Tracing.addTraceMetricsSource(
at org.apache.phoenix.jdbc.PhoenixConnection.<clinit>(
at org.apache.phoenix.query.ConnectionQueryServicesImpl$
at org.apache.phoenix.query.ConnectionQueryServicesImpl$
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(
at org.apache.phoenix.jdbc.PhoenixDriver.connect(
at sqlline.SqlLine$DatabaseConnection.connect(
at sqlline.SqlLine$DatabaseConnection.getConnection(
at sqlline.SqlLine$Commands.connect(
at sqlline.SqlLine$Commands.connect(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at sqlline.SqlLine$ReflectiveCommandHandler.execute(
at sqlline.SqlLine.dispatch(
at sqlline.SqlLine.initArgs(
at sqlline.SqlLine.begin(
at sqlline.SqlLine.mainWithInputRedirection(
at sqlline.SqlLine.main(
Caused by: java.util.NoSuchElementException
at java.util.ServiceLoader$
at java.util.ServiceLoader$
at org.apache.hadoop.hbase.CompatibilityFactory.getInstance(
... 24 more

Connect to Oracle using Slick

I am trying to connect to Oracle using Slick.
I got the slick-extensions_2.10-1.0.0.jar.
I added the line below in Scala:
Database.forURL("jdbc:oracle:thin:@myhost:myport:dbalias", "myid", "mypwd", null, driver =
"") withSession {.......}
What is the right URL to use for this driver? I get the following error:
Exception in thread "main" java.sql.SQLException: No suitable driver found for jdbc:oracle:thin:@myhost:myport:dbalias
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at scala.slick.session.Database$$anon$2.createConnection(Database.scala:105)
at scala.slick.session.BaseSession.conn$lzycompute(Session.scala:207)
at scala.slick.session.BaseSession.conn(Session.scala:207)
at scala.slick.session.BaseSession.close(Session.scala:221)
at scala.slick.session.Database.withSession(Database.scala:38)
at scala.slick.session.Database.withSession(Database.scala:46)
It seems the Oracle JDBC driver was not on the classpath when you ran your program.
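A quick way to verify this is to ask the JVM whether it can load the driver class at all. The sketch below is plain Java (it runs on the same classpath as the Scala program). The class DriverCheck is hypothetical, and oracle.jdbc.OracleDriver as the default class name is my assumption about which driver is meant, since the question does not show it.

```java
// Hypothetical helper: reports whether a class can be found on the classpath.
final class DriverCheck {
    // Tries to locate the class without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed driver class name; pass a different one as the first argument.
        String driver = args.length > 0 ? args[0] : "oracle.jdbc.OracleDriver";
        System.out.println(driver + " on classpath: " + isOnClasspath(driver));
    }
}
```

If this prints false, adding the Oracle JDBC jar to the classpath used to launch the program should be the first thing to try.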
