Table folder permission issues when using both Hive and Impala - hadoop

We are using the latest versions of Hive and Impala. Impala is authenticated with LDAP, and authorization is done via Sentry. Hive access is not yet authorized via Sentry. We are creating tables from Impala while /user/hive/warehouse has group-level ownership by the "hive" group; hence, the folders are owned impala:hive.
drwxrwx--T - impala hive 0 2015-08-24 21:16 /user/hive/warehouse/test1.db
drwxrwx--T - impala hive 0 2015-08-11 17:12 /user/hive/warehouse/test1.db/events_test_venus
As can be seen, the folders above are owned by user "impala" with group "hive", and are group-writable. The group "hive" includes a user named "hive" as well:
[root@server ~]# groups hive
hive : hive impala data
[root@server ~]# grep hive /etc/group
hive:x:486:impala,hive,flasun,testuser,fastlane
But when I try to query the table created on the folder, it gives access errors:
[root@jupiter fastlane]# sudo -u hive hive
hive> select * from test1.events_test limit 1;
FAILED: SemanticException Unable to determine if hdfs://mycluster/user/hive/warehouse/test1.db/events_test_venus is encrypted: org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=EXECUTE, inode="/user/hive/warehouse/test1.db":impala:hive:drwxrwx--T
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6599)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6581)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6506)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:9141)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1582)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEZForPath(AuthorizationProviderProxyClientProtocol.java:926)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1343)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
Any ideas how to work around this? Basically, we are trying to exploit the fact that, by granting group-level read and write permissions, any user in the group should be able to create and use tables created by the folder owner, but that does not seem to work. Is it because Impala alone has Sentry authorization, which uses user impersonation, while stand-alone Hive doesn't?
Can someone please guide or confirm?
Thanks
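Worth verifying before anything else: HDFS evaluates group checks with the group mapping on the NameNode host, not the local /etc/group on a client box. The hdfs groups command (a standard HDFS CLI) shows membership as the NameNode resolves it:
hdfs groups hive
If the output for user hive does not include the group hive, the group-write bits on the warehouse folders cannot take effect.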

You can set the HDFS umask to 000 and restart the cluster. This ensures that all directories created after the change get permissions 777 (files get 666, since files are not created executable). After that, apply proper ownership and permissions to the directories so that the permissions of other directories are not left open. Setting the umask to 000 will not change the permissions of existing directories; only newly created directories/files are affected. If you are using Cloudera Manager, it is very easy to make this change.
NB: umask 000 makes all new files/directories world-accessible by default, so handle this by applying permissions and ACLs at the parent directory level.
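For reference, a minimal sketch of the same change made by hand rather than through Cloudera Manager (the property name is the stock Hadoop one; verify it against your distribution's docs): the umask is controlled by fs.permissions.umask-mode, typically set in core-site.xml:
<property>
  <name>fs.permissions.umask-mode</name>
  <value>000</value>
</property>
After restarting, re-tighten the parents, for example restoring the sticky, group-writable layout from the listing above:
sudo -u hdfs hadoop fs -chmod 1770 /user/hive/warehouse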

Related

Create database in console hive without permission in Ranger

I have a Hadoop cluster without Kerberos. I manage Hive and HDFS permissions via Ranger.
The resource paths in Ranger for HDFS are:
/user/myLogin
/apps/hive/warehouse/mylogin_*
/apps/hive/warehouse
I can create a database in Hive (via console) and also in Ambari.
But when I remove the permission on /apps/hive/warehouse, I can't create a database in Hive (console), although in Ambari I still can.
The following is the error:
hive> create database database_tesst;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:org.apache.hadoop.security.AccessControlException:
Permission denied: user=AAAAA, access=EXECUTE,
inode="/apps/hive/warehouse/database_tesst.db":hdfs:hdfs:d---------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
How can I create a database or run a query in Hive (console) without the permission on /apps/hive/warehouse? I need to remove this permission from Ranger so that users can only access their own data.
Thank you

Unable to create table in Hive

I am running the simple query below to create a table.
create table test (id int, name varchar(20));
But I am getting the error below; please let me know what needs to be done.
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:file:/user
/hive/warehouse/test is not a directory or unable to create one)
I have given full read/write access to the /user/hive/warehouse folder.
Your hive user doesn't have permission to create directories in HDFS. Whenever you create a table, Hive makes a directory under /user/hive/warehouse/<table>, but here it is not able to create a directory under /user/hive/warehouse/, so grant permission on that directory to allow your user to create a table.
http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/CDH4-Installation-Guide/cdh4ig_topic_18_7.html
Sounds like a permissions issue. The change mode command below may help.
hadoop fs -chmod -R 755 /user/hive/warehouse/
The error message says
file:/user /hive/warehouse/test
Despite the space between /user and the rest of the path, the file:/ scheme means that Hive is trying to create that directory on your local file system instead of on HDFS. There is probably a problem with accessing the configuration. I would check whether the HADOOP_CONF_DIR environment variable is properly initialized.
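A quick sanity check along those lines (a sketch, assuming a standard client layout; on Hadoop 1.x the property is fs.default.name rather than fs.defaultFS):
echo $HADOOP_CONF_DIR
grep -A 1 'fs.defaultFS' "$HADOOP_CONF_DIR/core-site.xml"
If the variable is empty or core-site.xml has no hdfs:// default filesystem, Hive falls back to the local file: scheme, which produces exactly this kind of path.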
For me, the issue was exactly like yours (creating an internal table), but it was not related to permissions: the storage was 100% used. Try checking for the same.
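Two quick checks for that (standard commands):
hadoop fs -df -h /     # HDFS capacity and usage
df -h                  # local disks, relevant when the path resolves to file:/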

AccessControlException in Hadoop for access=EXECUTE

I have a small application which reads a file from my local machine and writes the data into HDFS.
Now I want to list the files present in the HDFS folder, say HadoopTest. When I try to do that, I am getting the exception below:
org.apache.hadoop.security.AccessControlException: Permission denied: user=rpoornima, access=EXECUTE, inode="/hbase/HadoopTest/Hadoop_File_1.txt":rpoornima:hbase:-rw-r--r--
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:161)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:128)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4547)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:4523)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:3312)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:3289)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:652)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:431)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44098)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
I'm not sure how to resolve this issue. Kindly give your inputs.
Your exception is clear enough to show the problem.
As the exception says,
Permission denied: user=rpoornima, access=EXECUTE,
inode="/hbase/HadoopTest/Hadoop_File_1.txt":rpoornima:hbase:-rw-r--r--
This means your account rpoornima only has -rw-r--r-- permission (no execute) on the file /hbase/HadoopTest/Hadoop_File_1.txt. So you have to use another account with full privileges to do the execution.
UPDATE
If you want to give access to a specified user, use the chown command.
chown
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ]
Change the owner of files. The user must be a super-user. Additional information is in the Permissions Guide.
Options
The -R option will make the change recursively through the directory structure.
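For instance, a hypothetical invocation using the names from the listing above (chown must be run as a superuser such as hdfs):
sudo -u hdfs hadoop fs -chown -R rpoornima:hbase /hbase/HadoopTest
Since the inode in this trace is already owned by rpoornima, adding the missing permission bits with hadoop fs -chmod may be the more direct fix here.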

Hadoop Hive: How to allow regular user continuously write data and create tables in warehouse directory?

I am running Hadoop 2.2.0.2.0.6.0-101 on a single node.
I am trying to run a Java MRD program that writes data to an existing Hive table from Eclipse under a regular user. I get the exception:
org.apache.hadoop.security.AccessControlException: Permission denied: user=dev, access=WRITE, inode="/apps/hive/warehouse/testids":hdfs:hdfs:drwxr-xr-x
This happens because the regular user has no write permission to the warehouse directory; only the hdfs user does:
drwxr-xr-x - hdfs hdfs 0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxr-xr-x - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
To circumvent this I change permissions on the warehouse directory, so everybody now has write permission:
[hdfs@localhost wks]$ hadoop fs -chmod -R a+w /apps/hive/warehouse
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxrwxrwx - hdfs hdfs 0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxrwxrwx - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
This helps to some extent, and the MRD program can now write to the warehouse directory as a regular user, but only once. When trying to write data into the same table a second time I get:
ERROR security.UserGroupInformation: PriviledgedActionException as:dev (auth:SIMPLE) cause:org.apache.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.testids
Now, if I drop the output table and create it anew in the hive shell, I again get default permissions that do not allow a regular user to write data into this table:
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxr-xr-x - hdfs hdfs 0 2014-03-11 12:19 /apps/hive/warehouse/testids
drwxrwxrwx - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
Please advise on the correct Hive configuration steps that will allow a program running as a regular user to do the following operations in the Hive warehouse:
Programmatically create / delete / rename Hive tables?
Programmatically read / write data from Hive tables?
Many thanks!
If you maintain the table from outside Hive, then declare the table as external:
An EXTERNAL table points to any HDFS location for its storage, rather than being stored in a folder specified by the configuration property hive.metastore.warehouse.dir.
A Hive administrator can create the table and point it toward your own user-owned HDFS storage location, and you grant Hive permission to read from there.
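A minimal sketch of that setup (table name and paths are illustrative, borrowed from the question):
sudo -u hdfs hadoop fs -mkdir -p /user/dev/testids
sudo -u hdfs hadoop fs -chown dev /user/dev/testids
hive -e "CREATE EXTERNAL TABLE testids (id INT) LOCATION '/user/dev/testids';"
The data files stay under the dev user's own location, so the regular user can rewrite them freely, and dropping the external table does not delete them.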
As a general comment, there is no way for an unprivileged user to perform an unauthorized privileged action. Any such way is technically an exploit, and you should never rely on it: even if it is possible today, it will likely be closed soon. Hive authorization (and HCatalog authorization) is orthogonal to HDFS authorization.
Your application is also incorrect, regardless of the authorization issues. You are trying to write twice into the same non-partitioned table, which means your application does not handle partitions correctly. Start from An Introduction to Hive's Partitioning; see the sketch below.
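A hedged sketch of what handling partitions means here (table and column names are illustrative, and staging_testids is a hypothetical source table): each run writes into its own partition instead of re-loading a non-partitioned table.
hive -e "CREATE TABLE testids (id INT) PARTITIONED BY (batch STRING);"
hive -e "INSERT INTO TABLE testids PARTITION (batch='run2') SELECT id FROM staging_testids;"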
You can add the following to hdfs-site.xml:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
This configuration will disable permission checking on HDFS entirely, so a regular user can perform any operation on HDFS. (On Hadoop 2 and later the property is named dfs.permissions.enabled.)
I hope this solution will help you.

Steps to install Hive

I have Hadoop configured on my Red Hat system. I am getting the following error when $HIVE_HOME/bin/hive is executed:
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1792)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
Hive uses a 'metastore'; it creates this directory when you invoke it for the first time. The metastore directory is usually created in the current working directory (i.e. wherever you are running the hive command from).
Which directory are you invoking the hive command from? Do you have write permissions there?
try this:
cd      # this will take you to your home dir (you will have write permissions there)
hive
