Error getting a JDBC connection to Hive via Knox - hadoop

I have a Hadoop cluster running Hortonworks Data Platform 2.4.2 which has been running well for more than a year. The cluster is Kerberised and external applications connect via Knox. Earlier today, the cluster stopped accepting JDBC connections via Knox to Hive.
The Knox logs show no errors, but the Hive Server2 log shows the following error:
"Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User: knox is not allowed to impersonate org.apache.hive.service.cli.HiveSQLException: Failed to validate proxy privilege of knox for "
Having looked at similar questions, the suggestions mostly revolve around correctly setting the hadoop.proxyuser configuration options (users and groups).
However, in my case I don't see how these settings could be the problem. The cluster has been running for over a year and a number of applications connect to Hive via JDBC daily. The server configuration has not changed, and connections were succeeding on the current configuration until today. No changes were made to the platform or environment, and the cluster was not restarted or taken down for maintenance between the last successful JDBC connection and connections starting to be refused.
I have since stopped and started the cluster, but after the restart it still does not accept JDBC connections.
Does anyone have any suggestions on how I should proceed?

Do you have Hive Impersonation turned on?
hive.server2.enable.doAs=true
This could be the issue assuming hadoop.proxyusers.users and hadoop.proxyusers.groups are set properly.
Also, check whether the user 'knox' exists on the HiveServer2 node (and on any other nodes used for impersonation).
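For reference, a correct proxyuser setup in core-site.xml usually looks something like the following; the group and host values here are only examples and should match your environment:
hadoop.proxyuser.knox.groups = users
hadoop.proxyuser.knox.hosts = knoxhost.example.com
A quick way to confirm the OS account exists on the HiveServer2 node is to run id knox there.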

The known workaround seems to be to set:
hadoop.proxyuser.knox.groups = *
hadoop.proxyuser.knox.hosts = *
I have yet to find a real fix that lets you keep this layer of added security.

Related

Use gMSA for Hashicorp Vault mssql credential rotation

I want to start using Vault to rotate credentials for mssql databases, and I need to be able to use a gMSA in my mssql connection string. My organization currently only uses Windows servers and will only provide gMSAs for service accounts.
Specifying the gMSA as the user id in the connection string returns a 400 error: error creating database object: error verifying connection: InitialBytes InitializeSecurityContext failed 8009030c.
I also tried transitioning my vault services to use the gMSA as their log on user, but this made nodes unable to become a leader node even though they were able to join the cluster and forward requests.
My setup:
I have a Vault cluster running across a few Windows servers. I use nssm to run them as a Windows service since there is no native Windows service support.
nssm is configured to run vault server -config="C:\vault\config.hcl" and the service runs under the Local System account.
When I change the user, the node is able to start up and join the raft cluster as a follower, but it cannot obtain leader status, which causes my cluster to become unresponsive once the nodes still running as Local System are off.
The servers are running on Windows Server 2022 and Vault is at v1.10.3, using integrated raft storage. I have 5 vault nodes in my cluster.
I tried running the following command to configure my database secret engine:
vault write database/config/testdb \
connection_url='server=myserver\testdb;user id=domain\gmsaUser;database=mydb;app name=vault;' \
allowed_roles="my-role"
which caused the error message I mentioned above.
I then tried to change the log on user for the service. I followed these steps to rotate the user:
Updated the directory permissions everywhere Vault touches (configs, certificates, storage) to include my gMSA user. I gave it read permissions for the config and certificate files and read/write for storage.
Stopped the service
Removed the node as a peer from the cluster using vault operator raft remove-peer instanceName.
Deleted the old storage files
Changed the service user by running sc.exe --% config "vault" obj="domain\gmsaUser" type= own.
Started the service back up and waited for replication
When I completed the last step, I could see the node reappear as a voter in the Vault UI. I was able to directly hit the node using the cli and ui and get a response. This is not an enterprise cluster, so this should have just forwarded the request to the leader, confirming that the clustering portion was working.
Before I got to the last node, I tried running vault operator step-down and was never able to get the leader to rotate. Turning off the last node made the cluster unresponsive.
I did not expect changing the log on user to cause any issue with the node's ability to operate. I reviewed the logs but there was nothing out of the ordinary, even with the log level set to trace. They do show successful unseal, standby mode, and joining the raft cluster.
Most of the documentation I have found for the mssql secret engine includes creating a user/pass at the sql server for Vault to use, which is not an option for me. Is there any way I can use the gMSA in my mssql config?
When you put a user id into the SQL connection string, the driver will attempt SQL authentication and no longer try Windows authentication (and gMSA is Windows-authentication based).
When setting up the gMSA account, did you specify the correct parameter for who is allowed to retrieve the password? The correct one is PrincipalsAllowedToRetrieveManagedPassword; PrincipalsAllowedToDelegateToAccount is incorrect but is the first suggestion offered by tab completion.
Maybe you also need to run Install-ADServiceAccount ... on the machine you're running Vault on.
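A rough sketch of what that setup might look like; the account, group, server and database names are placeholders, and whether the driver actually falls back to Windows (SSPI) authentication when the user id is omitted should be verified in your environment:
# on a domain controller / admin workstation
New-ADServiceAccount -Name gmsaVault -DNSHostName gmsaVault.domain.local -PrincipalsAllowedToRetrieveManagedPassword "VaultServers"
# on each Vault host
Install-ADServiceAccount gmsaVault
Test-ADServiceAccount gmsaVault
# then configure the secret engine as before, but without a user id, so the driver can attempt Windows authentication
vault write database/config/testdb \
connection_url='server=myserver\testdb;database=mydb;app name=vault;' \
allowed_roles="my-role"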

Apache NIFI 1.9.2 Connecting to Oracle Using Kerberos

Please bear with me - it's a bit complicated.
The high-level goal is to connect NIFI to an Oracle DB service, but we can only use Kerberos for authentication.
We are running Apache NIFI 1.9.2 and trying to connect to Oracle (using driver version 12.1) via a DBCPConnectionPool controller service. I have configured a KeytabCredentialService controller service and reference it in my DBCP controller service.
I am setting the Oracle driver class name to be "oracle.jdbc.driver.OracleDriver". Full configuration settings here.
When we enable the associated ExecuteSQL processor - we get an Oracle authentication error message.
ORA-01017 - invalid username/password; logon denied.
Full error here.
After some troubleshooting - it seems that the Oracle driver wrapped within NIFI's DBCP service is not even trying to use Kerberos at all.
Outside of NIFI, we would normally need to add the driver property CONNECTION_PROPERTY_THIN_NET_AUTHENTICATION_SERVICES programmatically to "turn on" Kerberos authentication, but there is no such option available to us when using NIFI's DBCP controller service.
Does anyone have any ideas on how we might be able to properly enable Kerberos authentication on the Oracle driver via NIFI's DBCP controller service?
Any help or direction will be greatly appreciated.
I was able to figure out how to "enable" Kerberos on the Oracle driver.
I set the dynamic properties below.
oracle.net.authentication_services = (KERBEROS5)
oracle.net.kerberos5_mutual_authentication = true
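For comparison, outside NIFI the same switches would be set as driver connection properties in Java; a minimal sketch, assuming the ojdbc driver is on the classpath, a valid krb5.conf, and a ticket cache already obtained with kinit (the host, service name and cache path are placeholders):
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;
import oracle.jdbc.OracleConnection;

Properties props = new Properties();
// turn on Kerberos authentication in the thin driver
props.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_AUTHENTICATION_SERVICES, "(KERBEROS5)");
props.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_AUTHENTICATION_KRB5_MUTUAL, "true");
// reuse the ticket cache created by kinit (path is an example)
props.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_AUTHENTICATION_KRB5_CC_NAME, "/tmp/krb5cc_1000");
Connection con = DriverManager.getConnection("jdbc:oracle:thin:@//dbhost:1521/MYSERVICE", props);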
Thank you to those that responded.

Getting an error while trying to connect Windows Pentaho to a virtual machine's HDFS

I am new to Pentaho and big data. Every time I try to connect my Windows Pentaho installation to HDFS on my Linux-based virtual machine, this error pops up. I have tried a couple of solutions but haven't had any luck with them. I would really appreciate it if anyone could suggest a solution.
Thanks in advance!
Error connecting to database [hadoop] :org.pentaho.di.core.exception.KettleDatabaseException:
Error occurred while trying to connect to the database
Error connecting to database: (using class org.apache.hadoop.hive.jdbc.HiveDriver)
No suitable driver found for jdbc:hive://(virtual machine's ip address):10000/test
You must have the Hive JDBC driver on your classpath. You can include it by extending CLASSPATH to point at the Hive JDBC jar:
set CLASSPATH=%CLASSPATH%;%HIVE_HOME%\lib\hive-jdbc-1.1.0-cdh5.10.1.jar
You should be through if there is no other error!
If you are using a Java application, you can use the following to obtain the connection object:
Connection con = DriverManager.getConnection("jdbc:hive2://172.16.149.158:10000/default", "hive", "");
where 172.16.149.158 is the Hive server address and 10000 is the default Hive port.
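Putting it together, a minimal standalone test might look like this; a sketch, assuming the hive-jdbc jar and its dependencies are on the classpath and reusing the example host and port from above:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcTest {
    public static void main(String[] args) throws Exception {
        // older driver versions may still need the driver class loaded explicitly
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection("jdbc:hive2://172.16.149.158:10000/default", "hive", "");
        Statement stmt = con.createStatement();
        ResultSet rs = stmt.executeQuery("SHOW TABLES");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        con.close();
    }
}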
Do check that the connection works using the telnet command:
$ telnet 'hive-server' 'hive-port'
It should connect successfully.
You can also use the Pentaho wizard to connect with hive db. Link from Pentaho wiki : http://wiki.pentaho.com/display/BAD/Create+Hive+Database+Connection

Connect to kerberised hive using jdbc from remote windows system

I have setup a hive environment with Kerberos security enabled on a Linux server (Red Hat). And I need to connect from a remote windows machine to hive using JDBC.
So, I have hiveserver2 running on the Linux machine, and I have done a "kinit".
Now I try to connect from the Windows side with a Java test program like this:
Class.forName("org.apache.hive.jdbc.HiveDriver");
String url = "jdbc:hive2://<host>:10000/default;principal=hive/_HOST@<YOUR-REALM.COM>";
Connection con = DriverManager.getConnection(url);
And I got the following error:
Exception due to: Could not open client transport with JDBC Uri:
jdbc:hive2://<host>:10000/;principal=hive/_HOST#YOUR-REALM.COM>:
GSS initiate failed
What am I doing wrong here? I checked many forums, but couldn't find a proper solution. Any answer will be appreciated.
Thanks
If you were running your code on Linux, I would simply point to that post -- i.e. you must use System properties to define the Kerberos and JAAS configuration, from conf files with specific formats.
And you have to switch on the debug trace flags to understand subtle configuration issues (different flavors/versions of JVMs may have different syntax requirements, which are not documented; it's a trial-and-error process).
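That client-side setup might look roughly like this; a sketch only, where the file paths, host and realm are placeholders and the conf file formats are described in the post mentioned above:
// hypothetical paths -- point these at your own krb5 and JAAS config files
System.setProperty("java.security.krb5.conf", "C:\\kerberos\\krb5.ini");
System.setProperty("java.security.auth.login.config", "C:\\kerberos\\jaas.conf");
// debug trace flags for diagnosing Kerberos/JAAS problems
System.setProperty("sun.security.krb5.debug", "true");
System.setProperty("sun.security.jgss.debug", "true");
Connection con = DriverManager.getConnection(
    "jdbc:hive2://<host>:10000/default;principal=hive/_HOST@<YOUR-REALM.COM>");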
But on Windows there are additional problems:
the Apache Hive JDBC driver has some dependencies on Hadoop JARs, especially when Kerberos is involved (see that post for details)
these Hadoop JARs require "native libraries" -- i.e. a Windows port of Hadoop (which you have to compile yourself!! or download from an insecure source on the web!!) -- plus System properties hadoop.home.dir and java.library.path pointing to the Hadoop home dir and its bin sub-dir respectively
On top of that, the Apache Hive driver has compatibility issues -- whenever there are changes in the wire protocol, newer clients cannot connect to older servers.
So I strongly advise you to use the Cloudera JDBC driver for Hive for your Windows clients. The Cloudera site just asks for your e-mail.
After that you have an 80+ page PDF manual to read, the JARs to add to your CLASSPATH, and your JDBC URL to adapt according to the manual.
Side note: the Cloudera driver is a proper JDBC-4.x compliant driver, no need for that legacy Class.forName()...
The key for us when we ran into this problem was as follows:
On your server there are certain kerberos principals listed that are allowed to operate on the data.
When we tried to run a query via JDBC, we didn't do the proper kinit on the client side.
In this case the solution is obvious:
On the windows client: do a kinit with the proper account before connecting
String url = "jdbc:hive2://<host>:10000/default;principal=hive/_HOST@<YOUR-REALM.COM>";
You should replace <YOUR-REALM.COM> with your actual Kerberos realm.
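For example, using the kinit that ships with the JDK (or MIT Kerberos for Windows), with a placeholder principal:
kinit myuser@YOUR-REALM.COM
klist
klist should then show a valid ticket-granting ticket before you run the JDBC program.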

WebSphere to Oracle - doesn't accept correct password

In WebSphere 6.1 I have created a datasource to an Oracle 11g instance using the thin JDBC client.
In Oracle I have two users, one existing and another newly created.
My WebSphere datasource is OK if I use the component-managed authentication alias of the existing user, but it fails with an "invalid user/password" message if I use the alias of the new user. The error message is:
The test connection operation failed for data source MyDB (Non-XA) on
server nodeagent at node MY_node with the following exception:
java.sql.SQLException: ORA-01017: invalid username/password;
logon denied DSRA0010E: SQL State = 72000, Error Code = 1,017.
View JVM logs for further details.
There is nothing in the JVM logs. I have grepped all the WebSphere logs and they do not mention my connection at all.
I can confirm that the username and password are correct by logging in via SQLPlus or (to prove the JDBC connection is OK) via SQuirreL.
I have checked in Oracle that the new user has all the system privs that the existing user has.
Any thoughts on what is going on or how I can debug this further?
Just FYI. I am guessing you are running WebSphere in Network Deployment mode.
This behavior you're experiencing is actually by design.
The reason is that the "Test Connection" button in the admin console invokes the JDBC connection test from within the Node Agent process. There is no way for the J2C alias information to propagate to the Node Agent without restarting it: some configuration objects take effect in WebSphere as soon as you save the configuration to the master repository, and some only take effect on a restart. J2C aliases take effect on restarts.
In a Network Deployment topology, you may have any number of server instances controlled by the same Node Agent. You may restart your server instances as you'd like, but unless you restart the Node Agent itself, the "test connection" button will never work.
It's a known WebSphere limitation... Which also exists on version 7.0, so don't be surprised when you test it during your next migration. :-)
If this happens to anyone else, I restarted WebSphere and all my problems went away. It's a true hallmark of quality software.
Oftentimes when people tell me they can't log into Oracle 11g with the correct password, I know they've been caught out by passwords becoming case-sensitive between 10g and 11g.
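If case sensitivity is the suspect, one way to check or work around it from SQL*Plus (as a privileged user; the username and password are placeholders) is roughly:
SHOW PARAMETER sec_case_sensitive_logon
-- re-setting the password regenerates the verifier with the exact case you intend to send
ALTER USER newuser IDENTIFIED BY "ExactCasePassword";
-- or, as a last resort, revert to 10g-style case-insensitive logons
ALTER SYSTEM SET sec_case_sensitive_logon = FALSE;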
Try this:
In the data source definition, under Security, use the J2C alias for both "authentication managed by component" and "authentication managed by container".
IBM WAS 8.5.5 Knowledge Center - Managing Java 2 Connector Architecture authentication data entries for JAAS
If you create or update a data source that points to a newly created J2C authentication data alias, the test connection fails to connect until you restart the deployment manager.
After you restart the deployment manager, the J2C authentication data is reflected in the runtime configuration. Any changes to the J2C authentication data fields require a deployment manager restart for the changes to take effect.
The node agent must also be restarted.
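In practice that means restarting both processes, roughly like this; a sketch that assumes default profile locations (run the first pair on the deployment manager machine and the second on each node):
<DMGR_PROFILE>/bin/stopManager.sh
<DMGR_PROFILE>/bin/startManager.sh
<NODE_PROFILE>/bin/stopNode.sh
<NODE_PROFILE>/bin/startNode.sh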
I pointed my data source to component-managed authentication as well as container-managed authentication. It's working fine now.
