KRB ticket issues - Hadoop

The NameNode and ZooKeeper services went down, and when I try to restart them I get the error below.
This is an open-source Hadoop cluster with Kerberos security enabled.
ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Getting m

Related

Unable to access Hadoop CLI after enabling Kerberos

I followed the tutorial CDH Hadoop Kerberos. The NameNode and DataNodes start properly, and I can see all the DataNodes listed on the WebUI (0.0.0.0:50070), but I'm unable to access the Hadoop CLI. I also followed the tutorial Certain Java versions cannot read credentials cache, and I still can't use the Hadoop CLI.
[root@local9 hduser]# hadoop fs -ls /
20/11/03 12:24:32 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
20/11/03 12:24:32 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
20/11/03 12:24:32 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "local9/192.168.2.9"; destination host is: "local9":8020;
[root@local9 hduser]# klist
Ticket cache: KEYRING:persistent:0:krb_ccache_hVEAjWz
Default principal: hdfs/local9@FBSPL.COM
Valid starting Expires Service principal
11/03/2020 12:22:42 11/04/2020 12:22:42 krbtgt/FBSPL.COM@FBSPL.COM
renew until 11/10/2020 12:22:12
[root@local9 hduser]# kinit -R
[root@local9 hduser]# klist
Ticket cache: KEYRING:persistent:0:krb_ccache_hVEAjWz
Default principal: hdfs/local9@FBSPL.COM
Valid starting Expires Service principal
11/03/2020 12:24:50 11/04/2020 12:24:50 krbtgt/FBSPL.COM@FBSPL.COM
renew until 11/10/2020 12:22:12
[root@local9 hduser]# hadoop fs -ls /
20/11/03 12:25:04 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
20/11/03 12:25:04 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
20/11/03 12:25:04 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "local9/192.168.2.9"; destination host is: "local9":8020;
Any help would be greatly appreciated.
I figured out the issue.
It's a credential cache bug in Red Hat: Red Hat Bugzilla – Bug 1029110
Then I found this Cloudera document on Kerberos: Manage krb5.conf
In the end, the solution was to comment out this line in /etc/krb5.conf:
default_ccache_name = KEYRING:persistent:%{uid}
I was able to access the Hadoop CLI after commenting this line.
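For anyone who wants to script that change, here is a minimal sketch of the edit. The function name and the sample file contents (including the default_realm line) are my own illustration, not taken from the poster's system. It only transforms text, so you can try it on a copy of krb5.conf before touching the real file.

```python
import re

# Matches an active (not already commented) default_ccache_name setting.
_CCACHE_LINE = re.compile(r"^(\s*)(default_ccache_name\s*=.*)$")


def comment_out_ccache_name(krb5_conf_text: str) -> str:
    """Return krb5.conf text with any active default_ccache_name line commented out."""
    out = []
    for line in krb5_conf_text.splitlines():
        m = _CCACHE_LINE.match(line)
        # Keep the original indentation and prefix the setting with "# ".
        out.append(f"{m.group(1)}# {m.group(2)}" if m else line)
    return "\n".join(out) + "\n"


# Hypothetical sample; only the default_ccache_name line comes from the answer above.
sample = """[libdefaults]
 default_realm = FBSPL.COM
 default_ccache_name = KEYRING:persistent:%{uid}
"""

print(comment_out_ccache_name(sample))
```

As far as I understand it, with that line disabled kinit falls back to a file-based cache, which Java can read. You will probably need to run kinit again afterwards so the ticket lands in the new cache.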

Kerberos HBase Zookeeper fails

I'm trying to Kerberize my HBase cluster and I'm having some problems with ZooKeeper. When I start HBase I get this error in the Master log:
ERROR [main-SendThread(X.X.X.X:2181)] client.ZooKeeperSaslClient: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
ERROR [main-SendThread(X.X.X.X:2181)] zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
DEBUG [main-EventThread] zookeeper.ZKWatcher: master:16000-0x16c236187be0000, quorum=Y.Y.Y.Y:2181,X.X.X.X:2181, baseZNode=/hbase Received ZooKeeper Event, type=None, state=AuthFailed, path=null
DEBUG [main] zookeeper.ZooKeeper: Close called on already closed client
In the ZooKeeper log, I get:
WARN [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.Learner: Unexpected exception, tries=0, connecting to /X.X.X.X:2888
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:229)
at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:937)
I checked my firewall; the ports are open.
For the configuration, I followed the HBase Reference Guide:
http://hbase.apache.org/book.html#zk.sasl.auth
At first I thought it was a problem with my keytab, but Hadoop is working fine with it.
I run HBase 2.0.5 and Hadoop 3.1.2, and ZooKeeper is the one provided by HBase.
Following @SamsonScharfrichter's comment, I've tried a few things:
Created and specified in /etc/hosts the FQDN of my servers, and modified my configurations to reflect this change.
Changed the hostname of my servers to the FQDN.
Tried to nslookup my hostnames; it didn't work, since they are specified in /etc/hosts.
None of this changed anything; I'm still getting the error. My guess is that Kerberos tries to look for a DNS server on my public NIC and not my private one. I do not know why it struggles so hard to find my servers, since Hadoop has absolutely no problem with it.
EDIT - I set up a private DNS on my network. DNS is working great, but I'm still getting the error. I'm about to give up.
EDIT 2 - I installed tshark on the node with the error. Apparently I get a frame with the message:
Error: KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN
which is weird: I verified my keytab and the principals listed in kadmin. Maybe there are default principals that I don't use?
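One thing that may help narrow down the "Server not found in Kerberos database" error is to write out the service principals the client is likely to ask the KDC for and compare them with what kadmin lists. The sketch below assumes the ZooKeeper client requests zookeeper/<server host>@<realm>, which is its default naming as far as I know. The host names, realm and principal list are hypothetical placeholders.

```python
def expected_service_principal(host, realm, service="zookeeper"):
    """Build the service principal a client would request for a given server host."""
    return f"{service}/{host}@{realm}"


def missing_principals(hosts, realm, known_principals):
    """Return the expected principals that are absent from the KDC listing."""
    known = set(known_principals)
    expected = (expected_service_principal(h, realm) for h in hosts)
    return [p for p in expected if p not in known]


# Hypothetical values: replace with your quorum hosts and the kadmin listing.
quorum_hosts = ["zk1.example.com", "zk2.example.com"]
kdc_principals = ["zookeeper/zk1.example.com@EXAMPLE.COM", "hbase/zk1.example.com@EXAMPLE.COM"]

print(missing_principals(quorum_hosts, "EXAMPLE.COM", kdc_principals))
```

If the quorum is configured with IP addresses, the host part is whatever name the client resolves for that address, so it is worth running the check with that name as well as the FQDN you expect.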

PriviledgedActionException (failed to find any kerberos tgt)

I am connecting to HDFS using Kerberos as the authentication mechanism. I am running a job that takes 3 days to complete, and I am getting this error:
org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:user_name(auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
I have also set up a crontab entry that runs kinit with the same keytab every hour, but I still get the same error and the job fails to complete.
Please help me.
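To see whether the hourly kinit is actually keeping a valid ticket in the cache when the job fails, it can help to log the ticket window from klist. Below is a rough sketch that parses one ticket line and flags a ticket that is close to expiry. The date format is an assumption (it matches the klist output quoted in the other question on this page, but klist formatting depends on locale), and the function names are my own.

```python
from datetime import datetime, timedelta

# Assumed klist timestamp layout, e.g. "11/03/2020 12:22:42".
KLIST_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_klist_ticket(line):
    """Split one klist ticket line into (start, expiry, service principal)."""
    parts = line.split()
    start = datetime.strptime(" ".join(parts[0:2]), KLIST_TIME_FORMAT)
    expires = datetime.strptime(" ".join(parts[2:4]), KLIST_TIME_FORMAT)
    return start, expires, parts[4]


def needs_refresh(expires, now, margin=timedelta(hours=1)):
    """True if the ticket has expired or will expire within the margin."""
    return expires - now <= margin


ticket_line = "11/03/2020 12:22:42 11/04/2020 12:22:42 krbtgt/FBSPL.COM@FBSPL.COM"
start, expires, service = parse_klist_ticket(ticket_line)
print(service, "valid from", start, "until", expires)
```

This only tells you what is in the ticket cache. If the cache always looks fine and the job still fails, the problem is more likely in how the long-running process picks up the refreshed ticket than in the cron job itself.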

HTTP Status 403 - GSSException: Failure unspecified at GSS-API level (Mechanism level: Specified version of key is not available (44))

I'm getting this error when I open the Oozie link. Can someone help?
HTTP Status 403 - GSSException: Failure unspecified at GSS-API level (Mechanism level: Specified version of key is not available (44))
I'm able to generate a Kerberos ticket on my local Windows machine.

Kerberos Authentication on Hadoop Cluster

I have prepared a 2-node cluster with plain Apache Hadoop. These nodes act as Kerberos clients to another machine, which acts as the Kerberos server.
The KDC database and the hdfs principals for each machine were created, along with their keytab files, using proper encryption types (AES).
The required hdfs-site, core-site, mapred-site, yarn-site and container-executor.cfg files have been modified. For unlimited-strength encryption, the JCE policy files are also in the $JAVA_HOME/lib/security directory.
The NameNode daemon starts fine. But when accessing HDFS with
hadoop fs -ls /
we got the error below:
15/02/06 15:17:12 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "xxxxxxx/10.122.48.12"; destination host is: "xxxxxxx":8020;
If anyone has prior knowledge of, or has worked with, Kerberos on top of Hadoop, kindly suggest a solution to the above issue.
To use Hadoop commands, you first need to get a Kerberos ticket with the kinit command:
kinit [-kt user_keytab username]
Once it's done, you can list the ticket with:
klist
See Cloudera's doc for more details: Verify that Kerberos Security is Working
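If you run this from a script rather than by hand, the kinit call above can be assembled like this. It is only a sketch: the principal and keytab path are hypothetical placeholders, and the helper name is my own.

```python
import shlex


def kinit_command(principal, keytab=None):
    """Build the kinit argument list, with or without a keytab."""
    cmd = ["kinit"]
    if keytab is not None:
        # -k: use a keytab instead of prompting; -t: path to that keytab.
        cmd += ["-kt", keytab]
    cmd.append(principal)
    return cmd


# Hypothetical principal and keytab path.
cmd = kinit_command("alice@EXAMPLE.COM", "/tmp/alice.keytab")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run it, and a following klist should then show the krbtgt entry before you try hadoop fs -ls / again.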

Resources