Connect to Kerberised Hive using JDBC from a remote Windows system - hadoop

I have set up a Hive environment with Kerberos security enabled on a Linux server (Red Hat), and I need to connect to Hive from a remote Windows machine using JDBC.
So, I have hiveserver2 running on the Linux machine, and I have done a "kinit".
Now I try to connect from a Java program on the Windows side with a test program like this:
Class.forName("org.apache.hive.jdbc.HiveDriver");
String url = "jdbc:hive2://<host>:10000/default;principal=hive/_HOST@<YOUR-REALM.COM>";
Connection con = DriverManager.getConnection(url);
And I got the following error,
Exception due to: Could not open client transport with JDBC Uri:
jdbc:hive2://<host>:10000/;principal=hive/_HOST@<YOUR-REALM.COM>:
GSS initiate failed
What am I doing wrong here? I checked many forums, but couldn't find a proper solution. Any answer will be appreciated.
Thanks

If you were running your code on Linux, I would simply point to that post -- i.e. you must use System properties to define the Kerberos and JAAS configuration, from conf files with specific formats.
And you have to switch on the debug trace flags to understand subtle configuration issues (i.e. different flavors/versions of JVMs may have different syntax requirements, which are not documented; it's a trial-and-error process).
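For illustration, a minimal sketch of that System-property setup, assuming hypothetical conf file paths (the exact flags vary by JVM flavor/version):
// Hypothetical paths -- point these at your own krb5.conf and JAAS conf.
System.setProperty("java.security.krb5.conf", "C:/krb/krb5.conf");
System.setProperty("java.security.auth.login.config", "C:/krb/jaas.conf");
// Let the GSS layer pick up the JAAS login / ticket cache instead of
// expecting credentials from the application itself.
System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
// Debug trace flag -- verbose, but invaluable for subtle config issues.
System.setProperty("sun.security.krb5.debug", "true");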
But on Windows there are additional problems:
the Apache Hive JDBC driver has some dependencies on Hadoop JARs, especially when Kerberos is involved (see that post for details)
these Hadoop JARs require "native libraries" -- i.e. a Windows port of Hadoop (which you have to compile yourself!! or download from an insecure source on the web!!) -- plus the System properties hadoop.home.dir and java.library.path pointing to the Hadoop home dir and its bin sub-dir respectively (see the sketch after this list)
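For the native-library part, a sketch with a hypothetical Hadoop install dir:
// Hypothetical location of the Windows Hadoop port; its bin sub-dir must
// contain winutils.exe (and hadoop.dll for the native calls).
System.setProperty("hadoop.home.dir", "C:/hadoop");
// java.library.path is best set on the command line before the JVM starts:
//   java -Djava.library.path=C:/hadoop/bin ...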
On top of that, the Apache Hive driver has compatibility issues -- whenever there are changes in the wire protocol, newer clients cannot connect to older servers.
So I strongly advise you to use the Cloudera JDBC driver for Hive for your Windows clients. The Cloudera site just asks for your e-mail.
After that you have an 80+ page PDF manual to read, the JARs to add to your CLASSPATH, and your JDBC URL to adapt according to the manual.
Side note: the Cloudera driver is a proper JDBC-4.x compliant driver; no need for that legacy Class.forName().
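To give an idea of what "adapt according to the manual" means: with the Cloudera driver, Kerberos is selected through driver-specific URL properties rather than the principal= clause. Roughly like this (property names as per the Cloudera manual, placeholder values to be filled in):
// JDBC-4.x driver discovery: no Class.forName() needed.
String url = "jdbc:hive2://<host>:10000;AuthMech=1"
           + ";KrbRealm=<YOUR-REALM.COM>"
           + ";KrbHostFQDN=<hs2-host-fqdn>"
           + ";KrbServiceName=hive";
Connection con = DriverManager.getConnection(url);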

The key for us, when we ran into this problem, was the following:
On your server, certain Kerberos principals are listed that are allowed to operate on the data.
When we tried to run a query via JDBC, we hadn't done the proper kinit on the client side.
In this case the solution is obvious:
On the Windows client: do a kinit with the proper account before connecting.
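Side note: if running kinit on every Windows client is impractical, a programmatic alternative (a sketch only -- it needs the Hadoop client JARs on the CLASSPATH, and the principal and keytab path below are hypothetical) is to log in from a keytab before connecting:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

// Tell the Hadoop security layer to use Kerberos, then log in from a keytab.
Configuration conf = new Configuration();
conf.set("hadoop.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("user@YOUR-REALM.COM", "C:/krb/user.keytab");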

String url = "jdbc:hive2://<host>:10000/default;principal=hive/_HOST@<YOUR-REALM.COM>";
You should replace <YOUR-REALM.COM> with your actual REALM.
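For example, with placeholder host and realm values filled in:
String url = "jdbc:hive2://hs2host.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM";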

Related

DB2 JDBC Windows Authentication

Looking for an example of using JDBC against a DB2 database with Windows authentication, preferably with the db2jcc4.jar driver. It seems like a common enough scenario, but I'm having a hard time finding an example.
Your original question was too vague until you clarified it with a comment asking how to connect to local Db2-databases via JDBC without a userid/password. So your real question appears to be "how do I achieve passwordless authentication to local Db2-databases on MS-Windows?", which may be a FAQ.
Db2-server delegates authentication to the underlying operating-system services on which the Db2-server is running. Keep in mind that Db2-server runs on a few quite-distinct operating systems, only one of which is MS-Windows.
Yes, you can connect via JDBC to a local Db2-database on MS-Windows without specifying a userid/password, using the IBM-supplied JDBC driver.
You can also connect to a local Db2-database on MS-Windows via CLI/ODBC and the command line without specifying a userid/password. When no userid/password is specified, the authentication-ID is that of the currently running session (either the logged-on identity or the runas identity).
If you have a local Db2-server with a local database running on MS-Windows, then all necessary software is already installed (if using defaults) to achieve the above.
It is important to understand that if the Db2-database is remote from the client, then authentication will need some form of credentials. Such credentials may be a certificate (if the Db2-database lives on Z/OS), a userid/password, a Kerberos ticket, a token for cloud-based Db2, etc.
For a passwordless local JDBC connection to a Db2-database, you can use the URL format "jdbc:db2:your_database_name".
The class com.ibm.db2.jcc.DB2Driver (as supplied in currently supported versions of db2jcc4.jar) supports passwordless connections with that URL pattern.
Example with local database-name = sample:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

try
{
    // No userid/password given: the connection authenticates as the
    // currently running Windows session identity.
    Connection con = DriverManager.getConnection("jdbc:db2:sample");
    ...
}
catch (SQLException e)
{
    ...
}
The Db2 Knowledge Centre gives all the details of the available JDBC properties here:
https://www.ibm.com/support/knowledgecenter/SSEPGG_11.5.0/com.ibm.db2.luw.apdv.java.doc/src/tpc/imjcc_r0052038.html
Other pages show additional properties that are specific to Z/OS data sources, cloud databases, i-series data sources, Informix sources, etc.

JDBC Connection between Tableau and Apache Druid

I have been trying to connect to Druid from Tableau using a JDBC Driver.
I have successfully connected using an ODBC Driver as per my answer to this post Connecting Tableau to Apache Druid
However, I want to be able to use a JDBC driver as well.
Though I have followed the steps in this post: https://support.imply.io/hc/en-us/articles/360025589574-Connecting-Tableau-to-Druid-with-JDBC,
I keep getting the error: "No suitable Driver installed or the URL is incorrect".
As per the article, I have ensured that the Avatica driver is downloaded and installed in ~/Library/Tableau/Drivers, as I am on a Mac.
I am also sure I am giving the right URL for my broker, which I am otherwise able to access in a browser on port 8082.
Any pointers on what might be wrong?
The issue may be that you don't have the Avatica driver in your classpath; please see https://calcite.apache.org/docs/adapter.html
I found from this article: https://kb.tableau.com/articles/issue/locating-library-jdbc-directory-in-mac-to-install-the-athena-drivers-for-tableau-prep
that Tableau actually looks for the JDBC driver in the ~/Library/JDBC folder on a Mac (it does not seem to read from the ~/Library/Tableau/Drivers folder mentioned in the article in my question). Once I placed the Avatica driver in this folder, Tableau Desktop was able to find the driver.
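For anyone else following along: once the Avatica JAR is in ~/Library/JDBC, the URL should point at the broker's SQL endpoint. A minimal sketch (the broker host below is a placeholder; the endpoint path is the one documented for Druid SQL over Avatica):
import java.sql.Connection;
import java.sql.DriverManager;

// Avatica registers org.apache.calcite.avatica.remote.Driver automatically.
String url = "jdbc:avatica:remote:url=http://broker.example.com:8082/druid/v2/sql/avatica/";
Connection con = DriverManager.getConnection(url);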

Error getting a JDBC connection to Hive via Knox

I have a Hadoop cluster running Hortonworks Data Platform 2.4.2 which has been running well for more than a year. The cluster is Kerberised and external applications connect via Knox. Earlier today, the cluster stopped accepting JDBC connections via Knox to Hive.
The Knox logs show no errors, but the Hive Server2 log shows the following error:
"Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: knox is not allowed to impersonate org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilege of knox for "
Having looked at how other users solved this, the suggestions mostly seem to be around the correct setting of the proxy-user configuration options (hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups).
However, in my case I don't see how these settings could be the problem. The cluster has been running for over a year and we have a number of applications connecting to Hive via JDBC on a daily basis. The configuration of the server has not been changed and connections were previously succeeding on the current configuration. No changes had been made to the platform or environment and the cluster was not restarted or taken down for maintenance between the last successful JDBC connection and JDBC connections being declined.
I have now stopped and started the cluster, but after restart the cluster still does not accept JDBC connections.
Does anyone have any suggestions on how I should proceed?
Do you have Hive Impersonation turned on?
hive.server2.enable.doAs=true
This could be the issue, assuming hadoop.proxyuser.knox.hosts and hadoop.proxyuser.knox.groups are set properly.
Also, check whether the user 'knox' exists on the HiveServer2 node (and on the other nodes used for impersonation).
The known workaround seems to be to set:
hadoop.proxyuser.knox.groups = *
hadoop.proxyuser.knox.hosts = *
I have yet to find a real fix that lets you keep this added layer of security.
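If you later want to tighten it again, the same properties take explicit lists instead of wildcards; hypothetical values:
hadoop.proxyuser.knox.groups = hive-users
hadoop.proxyuser.knox.hosts = knoxhost1.example.com,knoxhost2.example.com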

Hive and Tableau - Proxies/ODBC?

So, I have a Hive server (Cloudera, Thrift via HTTP) set up and working, and I can connect to it from Tableau using the Cloudera Hive ODBC driver -- all good from the servers in the AWS farm.
However, no luck from the client site/their end-user PCs.
The reason for this is that they require all outbound traffic to the internet (here, my AWS instance) to go through proxies using NTLM, and I can't get the Cloudera ODBC driver to talk via the NTLM proxy. In fact, it appears to ignore the Windows proxy settings entirely.
I'm aware of two (obvious) solutions -- use Fiddler/cntlm locally on the box as a proxy, or set up a reverse proxy on the customer's network and point ODBC at that -- but both of these are somewhat unpalatable to the users.
So: is there a way to get Cloudera's ODBC driver (or Windows itself) to go via an NTLM proxy without requiring additional software/servers? Or is there a Cloudera-Hive-compatible Tableau connector that works well with proxies in the middle?
TL;DR: I need to get from a Tableau client on Windows to Cloudera Hive in AWS across an NTLM proxy. Thoughts?
The Cloudera Hive ODBC driver currently doesn't support proxies or NTLM authentication. If this feature is important to you, I would suggest raising it as a feature request with Cloudera. I am not aware of any other Hive ODBC driver that supports proxies with NTLM.
Holman

How to sniff Oracle's credentials from a connection attempt to the database?

I have a legacy application, which connects to the configured Oracle database.
It seems to have some logic that alters the database credentials, as it is unable to log in to the Oracle database successfully, while sqlplus started on the same machine is able to log in.
The error I am getting is: [DataDirect][ODBC Oracle Wire Protocol driver][Oracle]ORA-01017: invalid username/password; logon denied
How to find out what is the database username and password that are sent to the database?
What I have tried so far:
Enabled auditing of failed sign-on attempts in Oracle (audit create session whenever not successful). This does not solve the issue, because it only logs the username, which seems to be correct, not the password.
Used a sniffer to eavesdrop on the network traffic between the machine running the application and the database, but since Oracle's TNS protocol is encrypted, it did not help much.
Started a server using netcat on port X and provided port X in the application's configuration file. The application did connect to my server; that is how I know the application is connecting to the correct server. But since the TNS protocol is fairly complex (it requires a series of messages to be exchanged between the client and the server), I hope there is a simpler way of achieving what I want without having to reverse-engineer Oracle's protocol and implement my own server.
Enabled tracing in the ODBC driver (Trace=1, TraceFile, TraceDll). The trace file shows the correct username, but obviously the password is not getting logged.
My environment:
Database: Oracle 11g
Application runs on: Solaris
Application uses: DataDirect ODBC Oracle Wire Protocol v70
I'm not sure, but if the connection is established by an ODBC driver (as described in the question tags), then you can try ODBC sniffing tools like ODBC Tracing.
Citation:
Password "Sniffing" Using Trace
ODBC provides a means for tracing the conversation taking place between the driver and the host database. Used by developers for testing purposes, the tracing feature is designed to help programmers find out exactly what is going on and to help fix problems. However, tracing (also called "sniffing") can be used by nefarious bad guys to retrieve user passwords.
When tracing is enabled, communications with the host are written to a file. This includes the user ID and password, which are captured in plain text.
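For a DataDirect ODBC driver on UNIX (like the Solaris setup in the question), those trace keywords typically go in the [ODBC] section of odbc.ini -- a sketch with illustrative values (the trace library path depends on your DataDirect release):
[ODBC]
Trace=1
TraceFile=/tmp/odbctrace.out
TraceDll=<path-to-DataDirect-trace-library>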
Update
SQLPlus connects to Oracle via the OCI interface, but the DataDirect ODBC driver uses its own proprietary implementation of the communication protocol. So the most probable point of failure is driver misconfiguration or incompatibility.
DataDirect provides some tools for ODBC driver diagnostics, but the only option applicable to the case described in the question is the snoop utility, which acts like the netcat approach already tried.
Because the connection fails at the credential-verification stage, the most probable source of the error is the use of localized characters in the user name or password. There are some issues with the Oracle authentication process listed in the DataDirect Knowledge Search (search for ORA-01017).
It seems that DataDirect provides two separate versions of the driver, with and without Unicode support, so one possible point of failure is connecting with the non-Unicode version of the driver to a Unicode version of the database, or vice versa.
P.S. I don't have any experience with the DataDirect ODBC driver, so these are only suggestions about possible sources of failure.
