Hive cannot create roles or show roles in Cloudera? - hadoop

I'm getting the error below when I run the command show roles; in the Hive terminal. Kindly help me. I have added the following properties to hive-site.xml.
I am working in cloudera-quickstart-5.4.2.0-vmware.
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description>enable or disable the hive client authorization</description>
</property>
<property>
<name>hive.security.authorization.createtable.owner.grants</name>
<value>ALL</value>
<description>the privileges automatically granted to the owner whenever a table gets created.
An example like "select,drop" will grant select and drop privilege to the owner of the table</description>
</property>
[cloudera@quickstart ~]$ hive
Logging initialized using configuration in jar:file:/usr/jars/hive-common-1.1.0-cdh5.4.2.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> show roles;
FAILED: SemanticException The current builtin authorization in Hive is incomplete and disabled.
I am waiting for answers.
Thanks in advance.

Commands like these will not work in the Hive shell; you have to move to Beeline,
which is the CLI for HiveServer2.
Use this connection string for Beeline:
!connect jdbc:hive2://localhost:10000/ (replace localhost with the FQDN of the Hive server)
Once you are in the Beeline shell, run:
show roles;
show current roles;
These will give you your desired output.
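For example, a minimal session might look like this (a sketch assuming HiveServer2 runs on quickstart.cloudera, the QuickStart VM's default hostname, on the default port 10000):
$ beeline
beeline> !connect jdbc:hive2://quickstart.cloudera:10000/default
0: jdbc:hive2://quickstart.cloudera:10000/default> show roles;
0: jdbc:hive2://quickstart.cloudera:10000/default> show current roles;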

Related

Pyspark: remote Hive warehouse location

I need to read and write tables stored in a remote Hive server from PySpark. All I know about this remote Hive is that it runs under Docker. From Hadoop Hue I have found two URLs for an iris table that I try to select some data from:
The table metastore URL:
http://xxx.yyy.net:8888/metastore/table/mytest/iris
and the table location URL:
hdfs://quickstart.cloudera:8020/user/hive/warehouse/mytest.db/iris
I have no idea why the last URL contains quickstart.cloudera:8020. Maybe this is because Hive runs under Docker?
Discussing access to Hive tables, the PySpark tutorial says:
https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
When working with Hive, one must instantiate SparkSession with Hive support, including connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions. Users who do not have an existing Hive deployment can still enable Hive support. When not configured by the hive-site.xml, the context automatically creates metastore_db in the current directory and creates a directory configured by spark.sql.warehouse.dir, which defaults to the directory spark-warehouse in the current directory that the Spark application is started. Note that the hive.metastore.warehouse.dir property in hive-site.xml is deprecated since Spark 2.0.0. Instead, use spark.sql.warehouse.dir to specify the default location of database in warehouse. You may need to grant write privilege to the user who starts the Spark application.
In my case, the hive-site.xml that I managed to get has neither the hive.metastore.warehouse.dir nor the spark.sql.warehouse.dir property.
The Spark tutorial suggests using the following code to access remote Hive tables:
from os.path import expanduser, join, abspath
from pyspark.sql import SparkSession
from pyspark.sql import Row
# warehouse_location points to the default location for managed databases and tables
warehouse_location = abspath('spark-warehouse')
spark = SparkSession \
.builder \
.appName("Python Spark SQL Hive integration example") \
.config("spark.sql.warehouse.dir", warehouse_location) \
.enableHiveSupport() \
.getOrCreate()
And in my case, after running code similar to the above, but with the correct value for warehouse_location, I think I can then do:
spark.sql("use mytest")
spark.sql("SELECT * FROM iris").show()
So where can I find the remote Hive warehouse location? How do I make PySpark work with remote Hive tables?
Update
hive-site.xml has the following properties:
...
...
...
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://127.0.0.1/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
...
...
...
<property>
<name>hive.metastore.uris</name>
<value>thrift://127.0.0.1:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
So it looks like 127.0.0.1 is the Docker localhost that runs the Cloudera Docker app. That does not help to get to the Hive warehouse at all.
How do I access the Hive warehouse when Cloudera Hive runs as a Docker app?
Here, https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cdh_ig_hive_metastore_configure.html, under "Remote Mode" you'll find that the Hive metastore runs in its own JVM process; other processes such as HiveServer2, HCatalog, and Cloudera Impala communicate with it through the Thrift API using the hive.metastore.uris property in hive-site.xml:
<property>
<name>hive.metastore.uris</name>
<value>thrift://xxx.yyy.net:8888</value>
</property>
(I am not sure about the exact way you have to specify the address.)
And maybe this property too:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://xxx.yyy.net/hive</value>
</property>
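On the Spark side, a minimal sketch of connecting through the remote metastore could look like the following. The host is the one from the question, and port 9083 comes from hive.metastore.uris in the posted hive-site.xml; both are assumptions to verify against your cluster:
from pyspark.sql import SparkSession

# Point Spark at the remote metastore's Thrift endpoint; the warehouse
# location is then resolved by the metastore itself.
spark = (SparkSession.builder
    .appName("remote-hive-example")
    .config("hive.metastore.uris", "thrift://xxx.yyy.net:9083")  # assumed host/port
    .enableHiveSupport()
    .getOrCreate())

spark.sql("USE mytest")
spark.sql("SELECT * FROM iris").show()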

How to connect Hive with multiple users

I am very new to Hadoop, and somehow we managed to install it with the Apache distribution and a Derby database.
My requirement is to have multiple users access Hive at the same time, but right now we can only allow a single user at a time.
I searched some blogs but haven't found a solution.
Could someone help me with a solution?
Derby only allows a single connection (process) to access the database at a given time, hence only one user can access Hive.
Upgrade your Hive metastore to MySQL or PostgreSQL to support multiple concurrent connections to Hive.
For upgrading your metastore from Derby to MySQL/PostgreSQL there are a lot of resources online; here are some of them:
From Cloudera
From Apache Hive Wiki
There are different ways to let multiple users access the metastore concurrently:
Embedded metastore (the default metastore: Derby)
Local metastore
Remote metastore
Let's look at the usage of each of the above metastore modes.
Embedded metastore:
This metastore is only suitable for unit tests. Its limitation is that it allows only one user to access Hive at a time (multiple sessions are not allowed, and it throws an error).
Local metastore (using a MySQL or Oracle DB):
To overcome the embedded metastore's limitation, the local metastore is used; it allows multiple sessions on the same machine (multiple users in the same JVM). To set up this mode, see below in this answer.
Remote metastore (the mode used in production):
On the same project, multiple Hive users need to work concurrently, and they can use Hive from different machines while the metadata is stored centrally in MySQL, Oracle, etc. Here Hive runs in each user's JVM, and when users run queries they communicate with the centralized metastore through the Thrift network API. To set up this mode, see below in this answer.
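For reference, the embedded (Derby) mode uses a connection URL like the following by default; the relative databaseName is why a metastore_db directory appears wherever Hive is started, and Derby's single-process limit is why only one session works at a time:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>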
METASTORE SETUP FOR MULTIPLE USERS:
Step 1 : Download and install mysql server
sudo apt-get install mysql-server
Step 2 : Download and install JDBC driver.
sudo apt-get install libmysql-java
Step 3 : Copy the downloaded JDBC driver to hive/lib/, or link the JDBC location into hive/lib.
- Go to the $HIVE_HOME/lib folder and create a link to the MySQL JDBC library:
ln -s /usr/share/java/mysql-connector-java.jar
Step 4 : Create users on the metastore so it can be accessed remotely and locally.
mysql -u root -p (give the password you set while installing the DB)
mysql> CREATE USER 'user1'@'%' IDENTIFIED BY 'user1pass';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'user1'@'%' WITH GRANT OPTION;
mysql> flush privileges;
If you want multiple users to have access, repeat step 4 for each of them with a different user name and password, as in the example below.
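For example, a second user might be added like this (hypothetical name and password):
mysql> CREATE USER 'user2'@'%' IDENTIFIED BY 'user2pass';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'user2'@'%' WITH GRANT OPTION;
mysql> flush privileges;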
Step 5 : Go to hive/conf/hive-site.xml (if it's not there, create it):
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
<description>replace localhost with your database hostname</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>user1</value>
<description>user name for connecting to mysql server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>user1pass</value>
<description>password for connecting to mysql server</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://slave2:9083</value>
<description>use your metastore host name here so it can be reached from different machines</description>
</property>
</configuration>
Repeat only step 5 on every user's machine, changing the user name and password accordingly.
Step 6 : From Hive 2.x onwards you must run this command to initialize the schema:
slave@ubuntu:~$ schematool -initSchema -dbType mysql
Step 7 : Start the Hive metastore server:
~$ hive --service metastore &
Now check Hive concurrently with different users from different machines, as in the example below.
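A quick sanity check (a sketch, assuming the metastore host is reachable from both machines) is to run a statement from two machines at once and confirm that both sessions see the same tables:
user1@machine1:~$ hive -e "SHOW TABLES;"
user2@machine2:~$ hive -e "SHOW TABLES;"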

Hive not fully honoring fs.default.name/fs.defaultFS value in core-site.xml

I have the NameNode service installed on a machine called hadoop.
The core-site.xml file has the fs.defaultFS (equivalent to fs.default.name) set to the following:
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop:8020</value>
</property>
I have a very simple table called test_table that currently exists in the Hive server on the HDFS. That is, it is stored under /user/hive/warehouse/test_table. It was created using a very simple command in Hive:
CREATE TABLE test_table (record_id INT);
If I attempt to load data into the table locally (that is, using LOAD DATA LOCAL), everything proceeds as expected. However, if the data is stored on the HDFS and I want to load from there, an issue occurs.
I run a very simple query to attempt this load:
hive> LOAD DATA INPATH '/user/haduser/test_table.csv' INTO TABLE test_table;
Doing so leads to the following error:
FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal ''/user/haduser/test_table.csv'':
Move from: hdfs://hadoop:8020/user/haduser/test_table.csv to: hdfs://localhost:8020/user/hive/warehouse/test_table is not valid.
Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.
As the error states, it is attempting to move from hdfs://hadoop:8020/user/haduser/test_table.csv to hdfs://localhost:8020/user/hive/warehouse/test_table. The first path is correct because it references hadoop:8020; the second path is incorrect, because it references localhost:8020.
The core-site.xml file clearly states to use hdfs://hadoop:8020. The hive.metastore.warehouse.dir value in hive-site.xml correctly points to /user/hive/warehouse. Thus, I doubt this error message has any true value.
How can I get the Hive server to use the correct NameNode address when creating tables?
I found that the Hive metastore tracks the location of each table. You can see that location by running the following in the Hive console:
hive> DESCRIBE EXTENDED test_table;
This issue occurs if the NameNode address in core-site.xml is changed while the metastore service is still running. To resolve it, restart the metastore service on that machine:
$ sudo service hive-metastore restart
The metastore will then use the new fs.defaultFS for newly created tables.
Already Existing Tables
The location for tables that already exist can be corrected by running the following set of commands. These were obtained from the Cloudera documentation on configuring the Hive metastore for high availability.
$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://localhost:8020/user/hive/warehouse
hdfs://localhost:8020/user/hive/warehouse/test.db
Correcting the NameNode location:
$ /usr/lib/hive/bin/metatool -updateLocation hdfs://hadoop:8020 hdfs://localhost:8020
Now the listed NameNode is correct.
$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://hadoop:8020/user/hive/warehouse
hdfs://hadoop:8020/user/hive/warehouse/test.db

Hive doesn't show tables when started from another directory

I installed Hive CDH4 on RHEL. Whenever I start Hive from a directory, it creates a metastore_db dir and a derby.log file in it. Is this normal behaviour? Moreover, when I create a table after starting Hive from a particular directory, I'm unable to see that table when I start Hive from a different directory.
For example,
Let's say I started Hive from my home dir, i.e. $HOME or ~, and created a table in Hive. But when I start Hive from /path/to/my/Hive/directory and do a show tables, the table I just created doesn't show up. However, if I start Hive from my home directory again and look for tables, I'm able to see the table.
Also, if I make changes in hive-site.xml, they are simply ignored by Hive.
Please help me where am I going wrong.
This happens because the default javax.jdo.option.ConnectionURL uses a relative Derby database name, so Derby creates a metastore_db in whatever directory you start Hive from. You can change this and use one metastore_db by updating the "javax.jdo.option.ConnectionURL" property in "$HIVE_HOME/conf/hive-site.xml" as below:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/path/to/my/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
where /path/to/my/metastore_db is the location where you want to keep your metastore DB.
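As a one-off test, you can also override the property on the command line instead of editing the file (the path here is a placeholder):
hive --hiveconf javax.jdo.option.ConnectionURL="jdbc:derby:;databaseName=/path/to/my/metastore_db;create=true"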

Unable to instantiate HiveMetaStoreClient

I have a 3-node cluster running Hive.
When I try to run some tests from outside the cluster, I get the error given below:
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Logging initialized using configuration in file:/net/slc01nwj/scratch/ashsshar/view_storage/ashsshar_bda_latest_2/work/hive_scratch/conf/hive-log4j.properties
When I log in to a cluster node and execute Hive, it works fine:
hive> show databases ;
OK
default
The following error is generated in the test log files:
13/04/04 03:10:49 ERROR security.UserGroupInformation: PriviledgedActionException as:ashsshar {my username }(auth:SIMPLE) cause:java.io.IOException: javax.jdo.JDOFatalDataStoreException: Failed to create database '/var/lib/hive/metastore/metastore_db', see the next exception for details.
NestedThrowables:
java.sql.SQLException: Failed to create database '/var/lib/hive/metastore/metastore_db', see the next exception for details.
My hive-site.xml file contains this connection property:
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
I have changed /var/lib/hive/metastore/metastore_db on my cluster node, but I am still getting the same error.
I have also tried removing all *.lck files from the above directory.
Does {username} have the permissions to create
/var/lib/hive/metastore/metastore_db ?
If it is a test cluster, you could do
sudo chmod -R 777 /var/lib/hive/metastore/metastore_db
or chown it to the user running it, as in the example below.
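For example (assuming the metastore service runs as the hive user; adjust the user and group to your setup):
sudo chown -R hive:hive /var/lib/hive/metastore/metastore_db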
Try removing the $HADOOP_HOME/build folder. I had the same problem with hive-0.10.0 and above. Then I tried hive-0.9.0 and got a different set of errors. Luckily I found the thread "Hive doesn't work on install", tried the same trick, and it magically worked for me. I am using the default Derby DB.
This is a permissions issue on the Hive folder; the following will fix it.
Switch to the Hive user (for me, hduser) and run:
sudo chmod -R 777 hive
This issue occurs due to abrupt termination of the Hive shell, which leaves behind an orphaned db.lck file.
To resolve this issue:
browse to your metastore_db location,
remove the tmp, dbex.lck, and db.lck files,
then open the Hive shell again. It will work.
You will see the tmp, dbex.lck, and db.lck files get created once again.
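For example (using the metastore_db path from this question; adjust it to your own location):
$ cd /var/lib/hive/metastore/metastore_db
$ rm -rf tmp
$ rm -f dbex.lck db.lck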
It worked after I moved the metastore out of /var/lib/hive/. I did that by editing /etc/hive/conf.dist/hive-site.xml
from:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
to:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/home/prashant/hive/metastore/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
Please check whether you already have a metastore_db in your Hadoop directory; if so, remove it and format your HDFS again,
and then try to start Hive.
Yes, it's a privilege problem. Enter your Hive shell with the following command:
sudo -u hdfs hive
