How to check who created a database in Impala - hadoop

I have a Hadoop cluster (Cloudera distribution) that multiple users have access to. Different users are creating databases. How do I verify which user created which database? Can anyone suggest a way?

Use the query below:
DESCRIBE FORMATTED databaseName.tableName;
It shows the table's owner and other details such as table type and size.
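The query above reports the owner of a single table. As a hedged sketch for the database itself: Hive has a DESCRIBE DATABASE statement, and Impala 2.5 and later accepts a similar form. The name databaseName is the placeholder from the answer, and the owner field can be empty if the metastore did not record one, so treat the output as version-dependent.

```sql
-- Hive: prints the database comment, location, owner name and owner type
DESCRIBE DATABASE EXTENDED databaseName;

-- Impala (2.5 and later, an assumption about your version):
-- the FORMATTED/EXTENDED output includes an "Owner:" row
DESCRIBE DATABASE FORMATTED databaseName;
```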

Related

Apache Drill - Not listing tables in Hive DB

I have created the necessary storage plugins, and the relevant Hive databases show up when I issue the show databases command.
However, after selecting one of the Hive databases with the use command, I found that I cannot select from any tables in that database. Looking further, when I issue the show tables command, no tables in that database show up in Apache Drill, whereas they appear fine in Hive.
Am I missing anything in terms of granting permissions via Hive to a user? How exactly does Apache Drill connect to Hive to run the relevant jobs?
Appreciate your responses.
SHOW TABLES; does not list Hive tables as of now. It is better to create views on top of the Hive tables. These views will show up in the output of SHOW TABLES;.
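A minimal sketch of the view workaround, assuming the Hive storage plugin is registered under the name hive and that a writable workspace such as dfs.tmp exists. The names my_db, my_table and my_hive_view are hypothetical.

```sql
-- Switch to a writable workspace (dfs.tmp is an assumption about your setup)
USE dfs.tmp;

-- Create a Drill view over the Hive table
CREATE VIEW my_hive_view AS
SELECT * FROM hive.my_db.my_table;

-- The view should now be listed
SHOW TABLES;
```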

Configure Sentry to show/hide different databases for different users

I have a cluster running CDH 5.7.0 with the following setup:
Hadoop with Kerberos
Hive with LDAP authentication
Hive with Sentry authorization (rules stored in JDBC Derby)
My goal is to restrict which databases each user can see in my system.
E.g.:
User-A should only see database DB-A when executing show databases
User-B should only see database DB-B when executing show databases
I followed the article https://blog.cloudera.com/blog/2013/12/how-to-get-started-with-sentry-in-hive/ to make that happen, but without success.
What I achieved was that
User-A can only select tables from DB-A and not from DB-B.
User-B can only select tables from DB-B and not from DB-A.
But both can still see DB-A and DB-B when executing show databases, and I want to avoid this.
Any hints on how the rules or the setup should look to get that running?
Thanks
Marko
According to your description and from what I've learned from existing setups, with Sentry v1.6+ you need to add the following property to your hive-site.xml:
<property>
<name>hive.metastore.filter.hook</name>
<value>org.apache.sentry.binding.metastore.SentryMetaStoreFilterHook</value>
</property>
Even though you are on CDH 5.7, the MapR 5 documentation provides some context, as does Sentry Service Interactions.
After restarting the Hive service you should see the result you are expecting.
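With the filter hook in place, show databases should list only the databases on which a user holds some privilege, so the grants themselves matter. A hedged sketch of per-database Sentry grants, run as a Sentry admin user; role_a, group_a, db_a and their counterparts are hypothetical names standing in for User-A's group and DB-A.

```sql
-- One role per database, mapped to the group the user belongs to
CREATE ROLE role_a;
GRANT ROLE role_a TO GROUP group_a;
GRANT ALL ON DATABASE db_a TO ROLE role_a;

CREATE ROLE role_b;
GRANT ROLE role_b TO GROUP group_b;
GRANT ALL ON DATABASE db_b TO ROLE role_b;
```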

Authentication and Security in Hadoop

We are building a system that queries Hive tables. Our service layer will construct Hive queries based on user selections in the UI. We have some security-related questions:
• Is it OK to pass a dynamic Hive query constructed at the service layer to a UDF/HQL in Hive?
• Can SQL-injection-style scenarios occur in Hive? We are on Hive 0.14, which supports delete and update statements.
• How can we manage role authorization so that a table can only be read, not written to or deleted from? Is there a way to manage permissions for Hive tables, or is that managed by HCatalog?
Yes, you can pass a dynamic query to Hive using PowerShell APIs. Check out https://hadoopsdk.codeplex.com/
Hive 0.14 does support INSERT, UPDATE and DELETE. Check out https://issues.apache.org/jira/browse/HIVE-5317
Role authorization is currently not supported by HDInsight (as of 6/2015) but is something we are actively investigating and hope to bring to market soon.
Role-based authorization and auditing with record-, field- and cell-level control and dynamic masking is available on Hadoop with the BlueTalon policy engine.
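For the read-only part of the question, here is a rough sketch in plain Hive, assuming SQL standard based authorization is enabled on the cluster (this is an assumption and does not apply to the HDInsight situation described above). The role read_only, the table sales and the user analyst are hypothetical.

```sql
-- Role management requires the admin role under SQL standard based authorization
SET ROLE admin;

CREATE ROLE read_only;

-- Only SELECT is granted, so INSERT, UPDATE and DELETE remain denied
GRANT SELECT ON TABLE sales TO ROLE read_only;

-- Attach the role to a user
GRANT read_only TO USER analyst;
```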

Create a database in Hive with multiple locations with Sentry enabled

I am creating a database in Hive with multiple locations, for example
CREATE DATABASE sample1 location 'hdfs://nameservice1:8020/db/dev/abc','hdfs://nameservice1:8020/db/dev/def','hdfs://nameservice1:8020/db/dev/ghi'
but I am getting an error when doing this. Can anyone tell me whether creating a database with multiple locations is allowed? Is there an alternative solution?
PS: My cluster is Sentry-enabled
Which error? If that is
User xx does not have privileges for CREATETABLE
then look at
http://community.cloudera.com/t5/Batch-SQL-Apache-Hive/quot-User-does-not-have-privileges-for-CREATETABLE-quot-Error/td-p/21044
You may have to omit LOCATION and upload the file directly to the Hive warehouse location of that Hive schema. I can't think of a better workaround.
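Note that CREATE DATABASE accepts only a single LOCATION path, so the comma-separated list in the question is not valid syntax. One possible alternative, sketched under assumptions: give the database one location and point external tables at the other directories. The table names and column definitions are hypothetical, and on a Sentry-enabled cluster each path would likely also need a URI privilege granted to your role.

```sql
-- A database takes exactly one location
CREATE DATABASE sample1
LOCATION 'hdfs://nameservice1:8020/db/dev/abc';

-- Other directories can be attached through external tables
CREATE EXTERNAL TABLE sample1.def_data (id INT, payload STRING)
LOCATION 'hdfs://nameservice1:8020/db/dev/def';

CREATE EXTERNAL TABLE sample1.ghi_data (id INT, payload STRING)
LOCATION 'hdfs://nameservice1:8020/db/dev/ghi';
```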

How to use Hive with multiple users

I have several users who use the same Hive instance.
Now I want each user to have private metadata in Hive.
Example:
user a calls show tables: a1, a2, a3 ...
user b calls show tables: b1, b2, b3 ...
Of course, when users run queries they must not be able to access other users' tables.
Thanks.
In order to make setup easy for new users, Hive's Metastore is
configured to store metadata locally in an embedded Apache Derby
database. Unfortunately, this configuration only allows a single user
to access the Metastore at a time. Cloudera strongly encourages users
to use a MySQL database instead. This section describes how to
configure Hive to use a remote MySQL database, which allows Hive to
support multiple users. See the Hive Metastore documentation for
additional information.
For more details see the part with heading 'Configuring the Hive Metastore' here.
Once the external metastore has been created, Hive authorization can be used to grant or restrict privileges.
This is the disclaimer from Hive:
Hive authorization is not completely secure. In its current form, the authorization scheme is intended primarily to prevent good users from accidentally doing bad things, but makes no promises about preventing malicious users from doing malicious things.
