I'm on CDH4. In HUE, I have a database in Metastore Manager named db1. I can run Hive queries that create objects in db1 with no problem. When I put those same queries in scripts and run them through Oozie, they fail with this message:
FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://lad1dithd1002.thehartford.com:8020/appl/hive/warehouse/db1.db. Error encountered near token 'TOK_TMP_FILE'
I created db1 in the Metastore Manager as HUE user db1 and as HUE user admin, and nothing works. The db1 user also has a db1 ID on the underlying Linux cluster, if that helps.
I have chmod'd /appl/hive/warehouse/db1.db to read, write, and execute for owner, group, and other, and none of that makes a difference.
I'm almost certain it's a rights issue, but what? Oddly, I have this working under another ID where I had hacked together some combination of things that seemed to work, but I'm not sure how. It was all in HUE, so if possible I'd like a solution doable in HUE, so I can easily hand it off to folks who prefer to work at the GUI level.
Thanks!
Did you also add hive-site.xml to your Files and Job XML fields? Hue has a great tutorial about how to run a Hive job. Watch it here. Adding hive-site.xml is described around 4:20.
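For reference, a minimal Oozie Hive action that ships hive-site.xml with the job might look like the sketch below (workflow name, script name, and transitions are placeholders; in Hue's Oozie editor the same file goes into the Job XML field and the Files list):

<workflow-app name="hive-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="hive-node"/>
  <action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- job-xml ships the Hive client configuration with the action -->
      <job-xml>hive-site.xml</job-xml>
      <script>create_db1_objects.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>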
Exact same error on MapR Hadoop.
Root cause: the main database folder and the temporary (scratch) folder were created by different users.
Resolution: creating both folders under the same ID should help, as sketched below.
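As a sketch, assuming the warehouse path from the question above and a default scratch location (adjust both to your cluster):

# Give the database folder and the Hive scratch dir the same owner
hadoop fs -chown -R db1:db1 /appl/hive/warehouse/db1.db
# The scratch location is hive.exec.scratchdir; the default path varies by version/distro
hadoop fs -chown -R db1:db1 /tmp/hive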
I have a cluster running CDH 5.7.0 and configured the following setup:
Hadoop with Kerberos
Hive with LDAP authentication
Hive with Sentry authorization (rules stored in JDBC Derby)
My goal is to restrict which databases users can see on my system.
E.g.:
User-A should only see database DB-A when executing show databases
User-B should only see database DB-B when executing show databases
I followed the article https://blog.cloudera.com/blog/2013/12/how-to-get-started-with-sentry-in-hive/ to make that happen, but without success.
What I achieved was that
User-A can only select tables from DB-A and not from DB-B.
User-B can only select tables from DB-B and not from DB-A.
But both can still see DB-A and DB-B when executing show databases, which is what I want to avoid.
Any hints on how the rules or the setup should look to get that working?
Thanks
Marko
According to your description, and from what I've learned from existing setups, with Sentry v1.6+ you need to add the following property to your hive-site.xml:
<property>
  <name>hive.metastore.filter.hook</name>
  <value>org.apache.sentry.binding.metastore.SentryMetaStoreFilterHook</value>
</property>
Even though you are on CDH 5.7, the MapR 5 documentation provides some context, as does Sentry Service Interactions.
After restarting the Hive service, you should see the result you are expecting.
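For the rules themselves, with a Sentry service the grants can be issued as SQL through Beeline; a sketch with placeholder role, group, and database names:

CREATE ROLE role_a;
GRANT ALL ON DATABASE db_a TO ROLE role_a;
GRANT ROLE role_a TO GROUP group_a;
-- repeat for role_b / db_b / group_b; with the filter hook active,
-- SHOW DATABASES only lists databases the user's roles can access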
After restarting the Impala server, we are not able to see the tables (i.e. the tables are not coming up). Can anyone tell me what order we have to follow to avoid this issue?
Thanks,
Srinivas
You should try running "invalidate metadata;" from impala-shell. This usually clears up tables not being visible, since Impala caches metadata.
From:
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_invalidate_metadata.html
The following example shows how you might use the INVALIDATE METADATA statement after creating new tables (such as SequenceFile or HBase tables) through the Hive shell. Before the INVALIDATE METADATA statement was issued, Impala would give a "table not found" error if you tried to refer to those table names.
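A minimal impala-shell session showing both forms (host and table name are placeholders):

$ impala-shell -i impalad-host:21000
> INVALIDATE METADATA;                 -- reload the entire catalog
> INVALIDATE METADATA db1.new_table;   -- or just one table, cheaper on large catalogs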
Team,
I am using HUE-BEEWAX (the Hive UI) to execute Hive queries. So far I have always been able to access the results of queries executed on the same day, but today I see a lot of query results shown as expired despite having run them just an hour ago.
My questions are:
When does a query result set become expired?
What settings control this?
Is it possible to retain this result set somewhere in HDFS? (How?)
Regards
My understanding is that it's controlled by Hive, not Hue (Beeswax). When HiveServer is restarted, it cleans up the scratch directories.
This is controlled by this setting: hive.start.cleanup.scratchdir.
Are you restarting your HiveServers?
Looking through some code, I found that Beeswax sets the scratch directory to "/tmp/hive-beeswax-" + the Hadoop username.
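For reference, the setting lives in hive-site.xml; false (the default) keeps scratch contents across restarts. A sketch:

<property>
  <name>hive.start.cleanup.scratchdir</name>
  <value>false</value>
  <description>When true, the scratch directory is wiped at HiveServer startup, which expires saved results.</description>
</property>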
I am creating a database in Hive with multiple locations, for example:
CREATE DATABASE sample1 location 'hdfs://nameservice1:8020/db/dev/abc','hdfs://nameservice1:8020/db/dev/def','hdfs://nameservice1:8020/db/dev/ghi'
but I am getting an error when I do this. Can anyone help? Is creating a database with multiple locations allowed? Is there an alternative solution for this?
PS: my cluster is Sentry-enabled.
Which error? If it is
User xx does not have privileges for CREATETABLE
then look at
http://community.cloudera.com/t5/Batch-SQL-Apache-Hive/quot-User-does-not-have-privileges-for-CREATETABLE-quot-Error/td-p/21044
You may have to omit LOCATION and upload the files directly into the Hive warehouse location for that Hive schema. I can't think of a better workaround.
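For what it's worth, CREATE DATABASE only accepts a single LOCATION, which is why the statement above fails. One hedged sketch of an alternative, reusing the paths from the question (the table name and column are made up), is to anchor the database at one path and reach the others through external tables:

-- One LOCATION per database is all the syntax allows
CREATE DATABASE sample1 LOCATION 'hdfs://nameservice1:8020/db/dev/abc';
-- Data under the other paths can still be reached via external tables
-- (with Sentry enabled this also needs URI privileges on the path)
CREATE EXTERNAL TABLE sample1.def_data (id INT)
LOCATION 'hdfs://nameservice1:8020/db/dev/def';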
I am facing a weird problem with Hive tables. I have HIVE_HOME set in my environment, and it is also in my search path, so I can invoke hive directly.
Now I invoke hive from a directory, let's say /a/b/c, and create some tables. I can see the tables.
Then I change to a directory, e.g. /a/b, and invoke hive from there. Here is the problem: either I am unable to see the tables, or I get this error:
hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Failed to start
database 'metastore_db', see the next exception for details.
NestedThrowables:
java.sql.SQLException: Failed to start database 'metastore_db', see the next exception
for details.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Why are tables tied to the directory from which the hive CLI was invoked? Any pointers?
I think you are using the embedded Derby database, which Hive uses by default for storing metadata. Embedded Derby creates a metastore_db folder in whatever directory you launch hive from, which is why each working directory appears to have its own set of tables. You can delete everything inside the metastore_db folder and then try restarting Hadoop, but the best advice would be to use MySQL as the metastore.
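If you do move to MySQL, the connection properties go into hive-site.xml; a sketch, where the host, database name, and credentials are placeholders, and the MySQL JDBC driver must be on Hive's classpath:

<!-- Point the metastore at a shared MySQL database instead of embedded Derby -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://dbhost:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepass</value>
</property>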