Does REFRESH table command refresh the metadata in Impala when a partition location is changed in Hive?
I am changing the Hive table partition location using
ALTER TABLE db.table partition(key=value1, key2=value2) set location='path'
After that, I run REFRESH db.table in Impala, but it does not pick up the new location. If I run INVALIDATE METADATA instead, it works.
There is an open Impala JIRA for this, IMPALA-4364. However, it has been in the product backlog since 2017, so currently INVALIDATE METADATA is the only workaround.
UPDATE: This has been fixed in Impala 4.0 (see the same JIRA referenced above).
Please make sure you run MSCK REPAIR TABLE after loading data into the Hive partition.
Afterwards, you can invalidate the metadata for the database in which the table resides from the Impala shell/UI, as sketched below.
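A minimal sketch of that sequence, reusing the db.table name from the question:
-- In the Hive shell, after changing the partition location:
MSCK REPAIR TABLE db.table;
-- Then in impala-shell, so Impala reloads the table's metadata:
INVALIDATE METADATA db.table;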
Related
I'm trying to migrate my Hive transaction table to another cloud platform. As a POC, I copied a partition directory to another location and created a new Hive transaction table with the same schema, pointing to the copied location. Then I executed the following command to add the partition to the metastore.
ALTER TABLE <table name> ADD PARTITION (date_time='2020-03-06',bin='95');
SHOW PARTITIONS <table name>
RESULT => date_time=2020-03-06/bin=95
But when I execute a SELECT query, I get empty results. Are there any steps I'm missing here?
Thank you very much in advance.
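For reference, ALTER TABLE ... ADD PARTITION also accepts an explicit LOCATION when the partition files sit somewhere other than the table's default directory layout; the command above relies on the default path. A hedged sketch (path hypothetical):
ALTER TABLE <table name> ADD PARTITION (date_time='2020-03-06', bin='95')
LOCATION '/path/to/copied/partition/date_time=2020-03-06/bin=95';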
I have created the necessary storage plugins, and the relevant Hive databases show up when issuing the SHOW DATABASES command.
When I switch to one of those Hive databases with the USE command, though, I find that I cannot select any tables within that database. Looking further, when issuing the SHOW TABLES command, no tables within that database show up via Apache Drill, whereas they appear fine in Hive.
Is there anything I might be missing in terms of granting permissions via Hive to any user? How exactly does Apache Drill connect to Hive to run the relevant jobs?
Appreciate your responses.
SHOW TABLES; will not list Hive tables as of now. It's better to create views on top of the Hive tables; these views will show up in the output of the SHOW TABLES; command.
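A minimal sketch of that workaround, run on the Hive side (database and table names hypothetical):
-- Create a view over the Hive table so it becomes visible to Drill:
CREATE VIEW hive_db.orders_v AS
SELECT * FROM hive_db.orders;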
I have read that, using the Hive JDBC storage handler (https://github.com/qubole/Hive-JDBC-Storage-Handler), an external table in Hive can be created over different databases (MySQL, Oracle, DB2), and that users can read from and write to those JDBC databases through Hive with this handler.
My question is about updates.
If we use Hive 0.14, where Hive UPDATE/DELETE is supported, and use the storage handler to point an external table at a JDBC database table, will firing an UPDATE query from the Hive end update the underlying database table as well?
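For context, a table defined through this handler looks roughly like the following, adapted from the project's README (driver class, URL, credentials, and table names are illustrative and may differ by version):
CREATE EXTERNAL TABLE hive_jdbc_demo (id INT, name STRING)
STORED BY 'org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler'
TBLPROPERTIES (
  'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
  'mapred.jdbc.url' = 'jdbc:mysql://localhost:3306/demo_db',
  'mapred.jdbc.username' = 'demo_user',
  'mapred.jdbc.password' = '',
  'mapred.jdbc.input.table.name' = 'jdbc_demo',
  'mapred.jdbc.output.table.name' = 'jdbc_demo'
);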
You cannot update an external table in Hive.
In Hive, only transactional tables support ACID properties. By default, transactions are configured to be off, so to create a transactional table you need to add TBLPROPERTIES ('transactional'='true') to your CREATE statement, as in the sketch below.
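A minimal sketch of such a table (names hypothetical; in Hive 0.14 an ACID table must also be bucketed and stored as ORC):
CREATE TABLE acid_demo (
  id INT,
  name STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');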
There are many limitations to this. One of them is that you cannot make an external table an ACID table, because external tables are beyond the control of the Hive compactor.
After restarting the Impala server, we are not able to see the tables (i.e., the tables are not coming up). Can anyone tell me what steps we should follow to avoid this issue?
Thanks,
Srinivas
You should try running "invalidate metadata;" from impala-shell. This usually fixes tables not being visible, since Impala caches metadata.
From https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_invalidate_metadata.html:
The following example shows how you might use the INVALIDATE METADATA statement after creating new tables (such as SequenceFile or HBase tables) through the Hive shell. Before the INVALIDATE METADATA statement was issued, Impala would give a "table not found" error if you tried to refer to those table names.
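The example itself is not reproduced above; a hedged sketch of the pattern it describes (table name hypothetical):
-- In the Hive shell:
CREATE TABLE new_hive_table (id INT, val STRING) STORED AS SEQUENCEFILE;
-- In impala-shell; before this statement, Impala would report "table not found":
INVALIDATE METADATA new_hive_table;
SELECT COUNT(*) FROM new_hive_table;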
I am an old Informatica PowerCenter 8 guy and am heading up a team using Informatica Big Data Edition 9.5.1. I have a question regarding Hive. Can Informatica build Hive tables or do they have to be built separately? If they can be built when 'Not Exists', what are the steps?
Thanks!
If you enable the option below in PC, it generates/creates the Hive tables:
"Generate And Load Hive Table"
This is available only in PC with PowerExchange for Hadoop.
The way this works is that PC first loads into an HDFS file and then creates a Hive definition on top of it. Make sure you select "Externally Managed Hive Table", so that PC creates an external table along the lines of the sketch below.
Note that at the mapping level you need to define a flat file as your target.
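For illustration, the kind of external-table definition involved would look roughly like this (columns, delimiter, and HDFS path are hypothetical):
CREATE EXTERNAL TABLE staged_orders (
  order_id INT,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/infa/staged_orders';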