Oozie invalid user in secure mode - hadoop

Configured oozie to work with hadoop-2.6.0 and enabled kerberos security.
I didn't get a ticket using the kinit command, but when I submit a job using the command below,
oozie job -oozie http://hostname:11000/oozie -config job.properties -run
it throws the following exception,
Error: E0501 : E0501: Could not perform authorization operation, User: oozie/hostname@EXAMPLE.COM is not allowed to impersonate Kumar
I know how to solve the above error, but my question is:
Kumar is my local account username. Since I configured Kerberos, it should check my user's ticket, but it didn't show me any error like "No credential found".
If I get a ticket using kinit for any other user, Oozie still shows the same exception with my local user account name.
Is there anything else to configure? I don't understand the concept. I am following this guide to configure Oozie with Kerberos on a secured cluster.

I just found the answer in the Oozie Authentication documentation:
Once authentication is performed successfully the received authentication token is cached in the user home directory in the .oozie-auth-token file with owner-only permissions. Subsequent requests reuse the cached token while valid.
This is why Oozie kept using the wrong user even after I obtained a ticket for another user with kinit.
I resolved it as below:
The use of the cache file can be disabled by invoking the oozie CLI with the -Doozie.auth.token.cache=false option.
Try this.
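For example, a minimal sketch of the workaround (host and file names are taken from the question above; exactly where the -D option goes on the command line may vary by Oozie version):
# drop the stale cached token, then resubmit with the cache disabled
rm -f ~/.oozie-auth-token
oozie job -Doozie.auth.token.cache=false -oozie http://hostname:11000/oozie -config job.properties -run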

Related

Permission error when setting up a Hadoop cluster in Pentaho

I'm trying to set up a new Hadoop cluster in Pentaho 9.3, but I get a permission error.
It requires a username and password for HDFS, but I don't know how to create a user and password for HDFS.
(Screenshots of step 1, step 2, and the resulting error were attached to the question.)
Hadoop, by default, uses regular OS user accounts. For Linux, you'd use the useradd command.
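As a rough sketch (the username pentaho is only illustrative; the HDFS home-directory step assumes you can run commands as the hdfs superuser):
# create a local OS account on the cluster node
sudo useradd pentaho
sudo passwd pentaho
# optionally give it a home directory in HDFS
sudo -u hdfs hdfs dfs -mkdir -p /user/pentaho
sudo -u hdfs hdfs dfs -chown pentaho /user/pentaho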

OOZIE status check throws java.lang.NullPointerException

I am new to Oozie and trying to write an Oozie workflow in CDH 4.1.1. I started the Oozie service and then checked the status using this command:
sudo service oozie status
I got the message:
running
Then I tried this command for checking the status:
oozie admin --oozie http://localhost:11000/oozie status
And I got the below exception:
java.lang.NullPointerException
at java.io.Writer.write(Writer.java:140)
at org.apache.oozie.client.AuthOozieClient.writeAuthToken(AuthOozieClient.java:182)
at org.apache.oozie.client.AuthOozieClient.createConnection(AuthOozieClient.java:137)
at org.apache.oozie.client.OozieClient.validateWSVersion(OozieClient.java:243)
at org.apache.oozie.client.OozieClient.createURL(OozieClient.java:344)
at org.apache.oozie.client.OozieClient.access$000(OozieClient.java:76)
at org.apache.oozie.client.OozieClient$ClientCallable.call(OozieClient.java:410)
at org.apache.oozie.client.OozieClient.getSystemMode(OozieClient.java:1299)
at org.apache.oozie.cli.OozieCLI.adminCommand(OozieCLI.java:1323)
at org.apache.oozie.cli.OozieCLI.processCommand(OozieCLI.java:499)
at org.apache.oozie.cli.OozieCLI.run(OozieCLI.java:466)
at org.apache.oozie.cli.OozieCLI.main(OozieCLI.java:176)
null
Reading the exception stack, I am unable to figure out the reason for this exception. Please let me know why I got this exception and how to resolve this.
Try disabling the property USE_AUTH_TOKEN_CACHE_SYS_PROP in your cluster, as suggested by your stack trace and the code.
Usually clusters are set up with Kerberos-based authentication, following the steps here. Not sure if you want to do that, but I wanted to mention it as an FYI.
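Since the NPE happens in writeAuthToken, one hedged way to try this is the CLI option the Oozie docs describe (as far as I can tell, USE_AUTH_TOKEN_CACHE_SYS_PROP corresponds to the oozie.auth.token.cache system property; where the -D flag goes on the command line may vary by Oozie version):
oozie admin -Doozie.auth.token.cache=false -oozie http://localhost:11000/oozie -status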

Kerberos defaulting to wrong principal when accessing hdfs from remote server

I've configured Kerberos to access HDFS from a remote server. I am able to authenticate and obtain a ticket, but when I try to access HDFS I get an error:
09/02 15:50:02 WARN ipc.Client: Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/hdp.stack.com@GLOBAL.STACK.COM
In our krb5.conf file, we defined the admin_server and kdc under a different realm:
DEV.STACK.COM = {
admin_server = hdp.stack.com
kdc = hdp.stack.com
}
Why is it defaulting to a different realm (GLOBAL.STACK.COM) that is also defined in our krb5.conf? I have ensured that all our Hadoop XML files use @DEV.STACK.COM.
Any ideas? Any help much appreciated!
In your KRB5 conf, you should define explicitly which machine belongs to which realm, starting with generic rules then adding exceptions.
E.g.
[domain_realm]
stack.com = GLOBAL.STACK.COM
stack.dev = DEV.STACK.COM
vm123456.dc01.stack.com = DEV.STACK.COM
srv99999.dc99.stack.com = DEV.STACK.COM
We have a Spark cluster that needs to connect to HDFS on Cloudera, and we were getting the same error.
8/08/07 15:04:45 WARN Client: Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/host1@REALM.DEV
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/host1@REALM.DEV; Host Details : local host is: "workstation1.local/10.1.0.62"; destination host is: "host1":8020;
Based on the above post by ohshazbot and another post on the Cloudera site (https://community.cloudera.com/t5/Cloudera-Manager-Installation/Kerberos-issue/td-p/29843), we modified the core-site.xml file in the Spark cluster (....spark/conf/core-site.xml), adding the property below, and the connection is now successful:
<property>
<name>dfs.namenode.kerberos.principal.pattern</name>
<value>*</value>
</property>
I recently bumped into this using HDP 2.2 on 2 separate clusters, and after hooking up a debugger to the client I found the issue, which may affect other flavors of Hadoop.
There is an hdfs config, dfs.namenode.kerberos.principal.pattern, which dictates whether a principal is considered valid. If the principal doesn't match the pattern AND doesn't match the current cluster's principal, then you get the exception you saw. In HDP 2.3 and higher, as well as CDH 5.4 and higher, this appears to default to *, which means everything matches. But in HDP 2.2 it is not in the defaults, so this error occurs whenever you try to talk to a remote kerberized HDFS from your existing kerberized HDFS.
Simply adding this property with *, or any other pattern which matches both local and remote principal names, resolves the issue.
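If you want to see what a client is currently using before changing anything, a quick check (assuming the hdfs client and its configuration are on the machine; if the key isn't set anywhere the command will report it as missing):
hdfs getconf -confKey dfs.namenode.kerberos.principal.pattern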
Run:
ls -lart
and check for a keytab file, e.g. .example.keytab.
If the keytab file is there, run:
klist -kt keytabfilename
e.g.: klist -kt .example.keytab
You'll get a principal like example@EXAM-PLE.COM in the output.
Then run:
kinit -kt .example.keytab example@EXAM-PLE.COM
(i.e. kinit -kt <keytab file> <principal>)

How to propagate delegation token in Oozie ssh action

I have an Oozie shell action that executes a bunch of hadoop fs -getmerge commands; it is currently failing because of:
[Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
The Oozie docs explain how to do this for Java actions here:
IMPORTANT: In order for a Java action to succeed on a secure cluster, it must propagate the Hadoop delegation token like in the following code snippet (this is benign on non-secure clusters):
// propagate delegation related props from launcher job to MR job
if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
jobConf.set("mapreduce.job.credentials.binary", System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
}
How do I do this for a shell action? When I try to echo $HADOOP_TOKEN_FILE_LOCATION, it returns nothing.
Can you try using the kinit command in the shell script to authenticate with a keytab?
kinit ${kinit_url} -k -t <keytab>;
The Hadoop delegation tokens are copied to the local/current directory by Oozie. export HADOOP_TOKEN_FILE_LOCATION=./container_tokens should help.
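Putting that together, a sketch of the shell action's script (the HDFS path and output file are placeholders; the container_tokens file name follows the answer above):
#!/bin/bash
# Oozie copies the delegation token file into the action's working directory,
# so point HADOOP_TOKEN_FILE_LOCATION at it before any hadoop fs call.
export HADOOP_TOKEN_FILE_LOCATION=./container_tokens
hadoop fs -getmerge /user/myuser/output ./merged_output.txt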

Running any Hadoop command fails after enabling security.

I was trying to enable Kerberos on my CDH 4.3 test bed (via Cloudera Manager). After changing authentication from Simple to Kerberos in the web UI, I'm unable to do any Hadoop operations, as shown below. Is there any way to specify the keytab explicitly?
[root@host-dn15 ~]# su - hdfs
-bash-4.1$ hdfs dfs -ls /
13/09/10 08:15:35 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
13/09/10 08:15:35 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
13/09/10 08:15:35 ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host-dn15.hadoop.com/192.168.10.227"; destination host is: "host-dn15.hadoop.com":8020;
-bash-4.1$ kdestroy
-bash-4.1$ kinit
Password for hdfs@HADOOP.COM:
-bash-4.1$ klist
Ticket cache: FILE:/tmp/krb5cc_494
Default principal: hdfs@HADOOP.COM
Valid starting Expires Service principal
09/10/13 08:20:31 09/11/13 08:20:31 krbtgt/HADOOP.COM@HADOOP.COM
renew until 09/10/13 08:20:31
-bash-4.1$ klist -e
Ticket cache: FILE:/tmp/krb5cc_494
Default principal: hdfs@HADOOP.COM
Valid starting Expires Service principal
09/10/13 08:20:31 09/11/13 08:20:31 krbtgt/HADOOP.COM@HADOOP.COM
renew until 09/10/13 08:20:31, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
-bash-4.1$
So I took a good look at the namenode log,
2013-09-10 10:02:06,085 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8022: readAndProcess threw exception javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)] from client 10.132.100.228. Count of bytes read: 0
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]
The JCE policy files are already installed on all nodes.
[root@host-dn15 security]# sha256sum ./local_policy.jar
4a5c8f64107c349c662ea688563e5cd07d675255289ab25246a3a46fc4f85767 ./local_policy.jar
[root@host-dn15 security]# sha256sum ./US_export_policy.jar
b800fef6edc0f74560608cecf3775f7a91eb08d6c3417aed81a87c6371726115 ./US_export_policy.jar
[root@host-dn15 security]# sha256sum ./local_policy.jar.bak
7b26d0e16722e5d84062240489dea16acef3ea2053c6ae279933499feae541ab ./local_policy.jar.bak
[root@host-dn15 security]# sha256sum ./US_export_policy.jar.bak
832133c52ed517df991d69770f97c416d2e9afd874cb4f233a751b23087829a3 ./US_export_policy.jar.bak
[root@host-dn15 security]#
And the list of principals in the realm.
kadmin: listprincs
HTTP/host-dn15.hadoop.com@HADOOP.COM
HTTP/host-dn16.hadoop.com@HADOOP.COM
HTTP/host-dn17.hadoop.com@HADOOP.COM
K/M@HADOOP.COM
cloudera-scm/admin@HADOOP.COM
hbase/host-dn15.hadoop.com@HADOOP.COM
hbase/host-dn16.hadoop.com@HADOOP.COM
hbase/host-dn17.hadoop.com@HADOOP.COM
hdfs/host-dn15.hadoop.com@HADOOP.COM
hdfs/host-dn16.hadoop.com@HADOOP.COM
hdfs/host-dn17.hadoop.com@HADOOP.COM
hdfs@HADOOP.COM
hue/host-dn15.hadoop.com@HADOOP.COM
host-dn16/hadoop.com@HADOOP.COM
kadmin/admin@HADOOP.COM
kadmin/changepw@HADOOP.COM
kadmin/host-dn15.hadoop.com@HADOOP.COM
krbtgt/HADOOP.COM@HADOOP.COM
mapred/host-dn15.hadoop.com@HADOOP.COM
mapred/host-dn16.hadoop.com@HADOOP.COM
mapred/host-dn17.hadoop.com@HADOOP.COM
root/admin@HADOOP.COM
root@HADOOP.COM
zookeeper/host-dn15.hadoop.com@HADOOP.COM
kadmin: exit
[root@host-dn15 ~]#
I exported the keytab for hdfs and used it to kinit.
-bash-4.1$ kinit -kt ./hdfs.keytab hdfs
-bash-4.1$ klist
Ticket cache: FILE:/tmp/krb5cc_494
Default principal: hdfs@HADOOP.COM
Valid starting Expires Service principal
09/10/13 09:49:42 09/11/13 09:49:42 krbtgt/HADOOP.COM@HADOOP.COM
renew until 09/10/13 09:49:42
Everything was futile. Any ideas?
Thanks in advance,
I ran into a problem in which I had a Kerberized CDH cluster and even with a valid Kerberos ticket, I couldn't run any hadoop commands from the command line.
NOTE: After writing this answer I wrote it up as a blog post at http://sarastreeter.com/2016/09/26/resolving-hadoop-problems-on-kerberized-cdh-5-x/ . Please share!
So even with a valid ticket, this would fail:
$ hadoop fs -ls /
WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Here is what I learned and how I ended up resolving the problem. I have linked to Cloudera doc for the current version where possible, but some of the doc seems to be present only for older versions.
Please note that the problem comes down to a configuration issue but that Kerberos itself and Cloudera Manager were both installed correctly. Many of the problems I ran across while searching for answers came down to Kerberos or Hadoop being installed incorrectly. The problem I had occurred even though both Hadoop and Kerberos were functional, but they were not configured to work together properly.
TL;DR
MAKE SURE YOU HAVE A TICKET
Do a klist as the user who will be executing the hadoop command.
$ sudo su - myuser
$ klist
If you don't have a ticket, it will print:
klist: Credentials cache file '/tmp/krb5cc_0' not found
If you try to do a hadoop command without a ticket you will get the GSS INITIATE FAILED error by design:
WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
In other words, that is not an install problem. If this is your situation, take a look at:
http://www.roguelynn.com/words/explain-like-im-5-kerberos/
For other troubleshooting of Kerberos in general, check out https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/errors.html
CDH DEFAULT HDFS USER AND GROUP RESTRICTIONS
A default install of Cloudera has user and group restrictions on the execution of hadoop commands, including a specific ban on certain users (more on page 57 of http://www.cloudera.com/documentation/enterprise/5-6-x/PDF/cloudera-security.pdf).
Several properties deal with this: the supergroup for HDFS is set to the string supergroup instead of hdfs, the dfs_permissions enabled property (Hadoop user file permissions) is set to false by default, and users with a UID below 1000 are banned.
Any of these could be a factor; for me it was hdfs being listed in the banned.users property.
Specifically for the hdfs user, make sure you have removed hdfs from the banned.users configuration property in the hdfs-site.xml configuration if you are trying to use it to execute hadoop commands.
1) UNPRIVILEGED USER AND WRITE PERMISSIONS
The Cloudera-recommended way to execute Hadoop commands is to create an unprivileged user and matching principal, instead of using the hdfs user. A gotcha is that this user also needs its own /user directory and can run into write permissions errors with the /user directory. If your unprivileged user does not have a directory in /user, it may result in the WRITE permissions denied error.
Cloudera Knowledge Article
http://community.cloudera.com/t5/CDH-Manual-Installation/How-to-resolve-quot-Permission-denied-quot-errors-in-CDH/ta-p/36141
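For example, if the unprivileged user were myuser (the name is only illustrative), the home directory could be created as the hdfs superuser:
sudo -u hdfs hadoop fs -mkdir /user/myuser
sudo -u hdfs hadoop fs -chown myuser /user/myuser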
2) DATANODE PORTS AND DATA DIR PERMISSIONS
Another related issue is that Cloudera sets dfs.datanode.data.dir to 750 on a non-kerberized cluster, but requires 700 on a kerberized cluster. With the wrong dir permissions set, the Kerberos install will fail. The ports for the datanodes must also be set to values below 1024, which are recommended as 1006 for the HTTP port and 1004 for the Datanode port.
Datanode Directory
http://www.cloudera.com/documentation/enterprise/5-6-x/topics/cdh_ig_hdfs_cluster_deploy.html
Datanode Ports
http://www.cloudera.com/documentation/archive/manager/4-x/4-7-2/Configuring-Hadoop-Security-with-Cloudera-Manager/cmchs_enable_security_s9.html
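As a quick sanity check along those lines (the data directory path is a placeholder; use whatever dfs.datanode.data.dir points to on your nodes):
# directory permissions must be 700 on a kerberized cluster
chmod 700 /dfs/dn
# the datanode should be bound to privileged ports, e.g. 1004 (data) and 1006 (HTTP)
hdfs getconf -confKey dfs.datanode.address
hdfs getconf -confKey dfs.datanode.http.address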
3) SERVICE-SPECIFIC CONFIGURATION TASKS
On page 60 of the security doc, there are steps to kerberize Hadoop services. Make sure you did these!
MapReduce
$ sudo -u hdfs hadoop fs -chown mapred:hadoop ${mapred.system.dir}
HBase
$ sudo -u hdfs hadoop fs -chown -R hbase ${hbase.rootdir}
Hive
$ sudo -u hdfs hadoop fs -chown hive /user/hive
YARN
$ rm -rf ${yarn.nodemanager.local-dirs}/usercache/*
All of these steps EXCEPT the YARN one can happen at any time. The YARN step must happen after the Kerberos installation, because it removes the user cache for non-kerberized YARN data. When you run MapReduce after the Kerberos install, it should populate this with the kerberized user cache data.
YARN User Cache
YARN Application exited with exitCode: -1000 Not able to initialize user directories
KERBEROS PRINCIPAL ISSUES
1) SHORT NAME RULES MAPPING
Kerberos principals are "mapped" to OS-level service users. For example, hdfs/WHATEVER@REALM maps to the service user 'hdfs' in your operating system only because of a name mapping rule set in Hadoop's core-site. Without name mapping, Hadoop wouldn't know which user is authenticated by which principal.
If you are using a principal that should map to hdfs, make sure the principal name resolves correctly to hdfs according to these Hadoop rules.
Good
(has a name mapping rule by default)
hdfs@REALM
hdfs/_HOST@REALM
Bad
(no name mapping rule by default)
hdfs-TAG@REALM
The "bad" example will not work unless you add a rule to accommodate it
Name Rules Mapping
http://www.cloudera.com/documentation/archive/cdh/4-x/4-5-0/CDH4-Security-Guide/cdh4sg_topic_19.html )
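You can test how a principal resolves to a short name with Hadoop's built-in resolver (the principals below are illustrative; a principal with no matching rule will produce an error instead of a short name):
hadoop org.apache.hadoop.security.HadoopKerberosName hdfs/namenode.example.com@REALM
hadoop org.apache.hadoop.security.HadoopKerberosName hdfs-TAG@REALM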
2) KEYTAB AND PRINCIPAL KEY VERSION NUMBERS MUST MATCH
The Key Version Number (KVNO) is the version of the key that is actively being used (as if you had a house key but then changed the lock on the door so it used a new key, the old one is no longer any good). Both the keytab and principal have a KVNO and the version number must match.
By default, when you use ktadd or xst to export the principal to a keytab, it changes the keytab version number, but does not change the KVNO of the principal. So you can end up accidentally creating a mismatch.
Use -norandkey with kadmin or kadmin.local when exporting a principal to a keytab to avoid updating the keytab number and creating a KVNO mismatch.
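For example, reusing one of the principals listed earlier in this thread (the keytab path is a placeholder):
kadmin.local -q "xst -norandkey -k /tmp/hdfs.keytab hdfs/host-dn15.hadoop.com@HADOOP.COM"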
In general, whenever you have principal or authentication issues, make sure to check that the KVNO of the principal and the keytab match:
Principal
$ kadmin.local -q 'getprinc myprincipalname'
Keytab
$ klist -kte mykeytab
Creating Principals
http://www.cloudera.com/documentation/archive/cdh/4-x/4-3-0/CDH4-Security-Guide/cdh4sg_topic_3_4.html
SECURITY JARS AND JAVA HOME
1) JAVA VERSION MISMATCH WITH JCE JARS
Hadoop needs the Java security JCE Unlimited Strength jars installed in order to use AES-256 encryption with Kerberos. Both Hadoop and Kerberos need to have access to these jars. This is an install issue but it is easy to miss because you can think you have the security jars installed when you really don't.
JCE Configurations to Check:
the jars are the right version - the correct security jars are bundled with Java, but if you install them after the fact you have to make sure the version of the jars corresponds to the version of Java, or you will continue to get errors. To troubleshoot, check the md5sum hash of the jars from a brand-new download of the JDK you're using against the md5sum hash of the ones on the Kerberos server.
the jars are in the right location: $JAVA_HOME/jre/lib/security
Hadoop is configured to look for them in the right place. Check if there is an export statement for $JAVA_HOME to the correct Java install location in /etc/hadoop/conf/hadoop-env.sh
If Hadoop has JAVA_HOME set incorrectly it will fail with "GSS INITIATE FAILED". If the jars are not in the right location, Kerberos won't find them and will give an error that it doesn't support the AES-256 encryption type (UNSUPPORTED ENCTYPE).
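A couple of quick checks along those lines (paths assume the JDK layout described above):
# the unlimited-strength policy jars should be present here
ls -l "$JAVA_HOME/jre/lib/security/local_policy.jar" "$JAVA_HOME/jre/lib/security/US_export_policy.jar"
# confirm Hadoop is pointed at the same JDK
grep JAVA_HOME /etc/hadoop/conf/hadoop-env.sh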
Cloudera with JCE Jars
http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_sg_s2_jce_policy.html
Troubleshooting JCE Jars
https://community.cloudera.com/t5/Cloudera-Manager-Installation/Problem-with-Kerberos-amp-user-hdfs/td-p/6809
TICKET RENEWAL WITH JDK 6 AND MIT KERBEROS 1.8.1 AND HIGHER
Cloudera has an issue documented at http://www.cloudera.com/documentation/archive/cdh/3-x/3u6/CDH3-Security-Guide/cdh3sg_topic_14_2.html in which tickets must be renewed before hadoop commands can be issued. This only happens with Oracle JDK 6 Update 26 or earlier and package version 1.8.1 or higher of the MIT Kerberos distribution.
To check the package, do an rpm -qa | grep krb5 on CentOS/RHEL or aptitude search krb5 -F "%c %p %d %V" on Debian/Ubuntu.
So do a regular kinit as you would, then do a kinit -R to force the ticket to be renewed.
$ kinit -kt mykeytab myprincipal
$ kinit -R
And finally, the issue I actually had which I could not find documented anywhere ...
CONFIGURATION FILES AND TICKET CACHING
There are two important configuration files for Kerberos, the krb5.conf and the kdc.conf. These are configurations for the krb5kdc service and the KDC database. My problem was the krb5.conf file had a property:
default_ccache_name = KEYRING:persistent:%{uid}.
This set my cache name to a KEYRING:persistent cache keyed on the user's UID (explained at https://web.mit.edu/kerberos/krb5-1.13/doc/basic/ccache_def.html). When I did a kinit, the ticket went into the keyring cache rather than the file cache in /tmp that Hadoop was looking for. Cloudera services obtain authentication with files generated at runtime in /var/run/cloudera-scm-agent/process, and these all export the cache name environment variable (KRB5CCNAME) before doing their kinit. That's why Cloudera could obtain tickets but my hadoop user couldn't.
The solution was to remove the line from krb5.conf that set default_ccache_name and allow kinit to store credentials in /tmp, which is the MIT Kerberos default value DEFCCNAME (documented at https://web.mit.edu/kerberos/krb5-1.13/doc/mitK5defaults.html#paths).
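In other words, the fix was along these lines (the keytab and principal are placeholders):
# /etc/krb5.conf, [libdefaults] section: remove or comment out this line
#   default_ccache_name = KEYRING:persistent:%{uid}
kinit -kt mykeytab myprincipal
klist   # should now report: Ticket cache: FILE:/tmp/krb5cc_<uid>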
Cloudera and Kerberos installation guides:
Step-by-Step
https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_intro_kerb.html
Advanced troubleshooting
http://www.cloudera.com/documentation/enterprise/5-6-x/PDF/cloudera-security.pdf, starting on page 48.
