Hive partitioned view not showing partitions info - hadoop

I have created a partitioned view in Hive as below:
create view if not exists view_name
PARTITIONED ON(date)
as
select col1,col2,date
from table1
union all
select col1,col2,date
from table2
The underlying tables are partitioned on the 'date' column. When I run DESCRIBE FORMATTED view_name, the partition information is shown as null.
If I run SHOW CREATE TABLE view_name, I get the view definition without the partition clause, as below:
create view if not exists view_name
as
select col1,col2,date
from table1
union all
select col1,col2,date
from table2
Please let me know what I am missing

From the hive documentation
Although there is currently no connection between the view partition
and underlying table partitions, Hive does provide dependency
information as part of the hook invocation for ALTER VIEW ADD
PARTITION. It does this by compiling an internal query of the form
In other words, a view holds no partition information about its underlying tables. A workaround (depending on how complex your view query is) is to add the partitions to the view explicitly, as follows:
ALTER VIEW view_name ADD [IF NOT EXISTS] partition_spec partition_spec
At least from the user perspective, it will provide information about the available partitions in the underlying tables.
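As a rough sketch of that workaround, assuming the view from the question and a hypothetical partition value of '2020-01-01' (not from the original post), each date present in the underlying tables would be registered on the view one by one. The backticks around date are an assumption that only matters on Hive versions where date is a reserved keyword.

```sql
-- Register one partition on the view; the value '2020-01-01' is hypothetical.
ALTER VIEW view_name ADD IF NOT EXISTS PARTITION (`date` = '2020-01-01');
```

Since there is no connection between view partitions and table partitions, the statement has to be repeated (or scripted) whenever new dates appear in table1 or table2.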

Related

Creating view in HIVE

I want to create a view on a Hive table that is partitioned. My view definition is as below:
create view schema.V1 as select t1.* from schema.tab1 as t1 inner join (select record_key, max(last_update) as last_update from schema.tab1 group by record_key) as t2 on t1.record_key = t2.record_key and t1.last_update = t2.last_update
My table tab1 is partitioned on quarter_id.
When I run any query on the view, it gives this error:
FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "V1:t2:tab1" Table "tab1"
Regards
Jayanta Layak
Your Hive configuration is probably set to run jobs in strict mode (the default in Hive 2.x). Strict mode rejects queries on partitioned tables unless the WHERE clause filters on a partition column.
If you need to run a query across all partitions (a full table scan), you can set the mode to 'nonstrict'. Use this property with care, as it can trigger enormous MapReduce jobs.
set hive.mapred.mode=nonstrict;
If you don't need an entire table scan, you can simply specify the partition value in your query's WHERE clause.
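For the view in the question, the alias named in the error is the inner subquery t2, so a partition filter has to reach that scan of tab1 as well. A minimal sketch, assuming a hypothetical quarter_id value of '2016Q1' (the question gives neither the values nor the type of quarter_id) and keeping schema as the placeholder database name from the question:

```sql
-- Both scans of tab1 filter on the partition column, which is what strict mode checks for.
-- '2016Q1' is a hypothetical partition value.
select t1.*
from schema.tab1 as t1
inner join (
  select record_key, max(last_update) as last_update
  from schema.tab1
  where quarter_id = '2016Q1'
  group by record_key
) as t2
  on t1.record_key = t2.record_key
 and t1.last_update = t2.last_update
where t1.quarter_id = '2016Q1';
```

Note that this changes the result: the latest record per key is now computed within that one quarter only, so it is only appropriate when that is the intended scope.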

Hive - How to query a table to get its own name?

I want to write a query such that it returns the table name (of the table I am querying) and some other values. Something like:
select table_name, col1, col2 from table_name;
I need to do this in Hive. Any idea how I can get the table name of the table I am querying?
Basically, I am creating a lookup table that stores the table name and some other information on a daily basis in Hive. Since Hive does not (at least the version we are using) support full-fledged INSERTs, I am trying to use the workaround where we can INSERT into a table with a SELECT query that queries another table. Part of this involves actually storing the table name as well. How can this be achieved?
For the purposes of my use case, this will suffice:
select 'table_name', col1, col2 from table_name;
It returns the table name with the other columns that I will require.
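Combined with the INSERT ... SELECT workaround described in the question, this might look like the sketch below. The target table lookup_table and its column layout are hypothetical, since the question does not name them.

```sql
-- lookup_table is a hypothetical target with columns (source_table, col1, col2).
-- The table name is stored as a string literal alongside the selected columns.
INSERT INTO TABLE lookup_table
SELECT 'table_name' AS source_table, col1, col2
FROM table_name;
```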

create table with select union has no constraints

I created a table using select with a union, as follows:
create table tableC as
select column1, column2 from tableA
union all
select column1, column2 from tableB
The resulting table (tableC) has inherited none of the constraints from tableA or tableB. Why weren't the constraints copied to the new table?
Using create table ... as select ... (CTAS) does not copy constraints in general. If you want the new table to inherit constraints from the original tables, you must create the new constraints manually.
As @Davek points out, not null constraints do get copied when a CTAS selects from a single table. I imagine that's because they are both column attributes and constraints. However, once a column has more than one source, it is reasonable that Oracle would not try to apply that constraint.
In response to the follow-up question "would it be possible to give tableC the same constraints either from tableA or tableB, after a CTAS?":
Of course it's possible, but there's no single command to do it. You could write a procedure that used dynamic SQL to copy the constraints. However, unless you're looking to automate this behavior, it'll generally be easier to extract the DDL using an IDE and change the table name.
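As a sketch of the manual route, assuming tableA has a primary key on column1 (the question does not say which constraints exist, and the constraint name tablec_pk is hypothetical):

```sql
-- Recreate a constraint on the new table by hand.
-- The constraint name and the choice of column are assumptions.
ALTER TABLE tableC ADD CONSTRAINT tablec_pk PRIMARY KEY (column1);

-- Extract the existing constraint DDL from tableA as a starting point,
-- then edit the table name before running it against tableC.
SELECT DBMS_METADATA.GET_DEPENDENT_DDL('CONSTRAINT', 'TABLEA') FROM dual;
```

Foreign keys are a separate object type in DBMS_METADATA ('REF_CONSTRAINT'), so they would need their own call.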

How to find out that an Oracle Table Partition is a System Generated Partition?

I am creating Oracle hash table partitions using the query below:
CREATE TABLE Table1 (
ID NUMBER, NAME VARCHAR2(50))
PARTITION BY HASH (ID)
PARTITIONS 25
STORE IN (Tablespace1);
This creates 25 hash partitions, and the database generates 25 unique partition names such as SYS_P122, SYS_P123, SYS_P124, and so on. Is there a way to find out from the Oracle catalog tables that a partition, say SYS_P123, has a system-generated name?
Using the link below,
http://docs.oracle.com/cd/B28359_01/server.111/b28320/statviews_2096.htm#REFRN20281
I could find the Oracle table partition information, but this catalog view has no column that says whether a given table partition is system generated. Is there any way to find out whether a given table partition name is system generated?
I am using Oracle versions 10 and 11.
Thanks,
Ravi,
Yes. The GENERATED column in ALL_OBJECTS (also available in DBA_OBJECTS) gives this information.
Run the following query:
select owner, object_name, subobject_name, generated from all_objects where object_name = 'TABLE1' and object_type = 'TABLE PARTITION';
View the description for the 'generated' column in the following link - http://docs.oracle.com/cd/B28359_01/server.111/b28320/statviews_1145.htm#REFRN20146
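To check a single partition such as SYS_P123 from the question, the same query can be narrowed by subobject_name. This is only a sketch; it assumes the table is visible to the current user through ALL_OBJECTS.

```sql
-- GENERATED is 'Y' when Oracle generated the name and 'N' otherwise.
select subobject_name, generated
from all_objects
where object_name = 'TABLE1'
  and object_type = 'TABLE PARTITION'
  and subobject_name = 'SYS_P123';
```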

Hive: Create New Table from Existing Partitioned Table

I'm using Amazon's Elastic MapReduce and I have a hive table created based on a series of log files stored in Amazon S3 and split in folders by day like so:
data/day=2011-09-01/log_file.tsv
data/day=2011-09-02/log_file.tsv
I am currently trying to create an additional table which filters out some unwanted activity in these log files but I can't figure out how to do this and keep getting errors such as:
FAILED: Error in semantic analysis: need to specify partition columns because the destination table is partitioned.
If my initial table create statement looks something like this:
CREATE EXTERNAL TABLE IF NOT EXISTS table1 (
... fields ...
)
PARTITIONED BY ( DAY STRING )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bucketname/data/';
That initial table works fine and I've been able to query it with no problems.
How then should I create a new table that shares the structure of the previous one but simply filters out data? This doesn't seem to work.
CREATE EXTERNAL TABLE IF NOT EXISTS table2 LIKE table1;
FROM table1
INSERT OVERWRITE TABLE table2
SELECT * WHERE
col1 = '%somecriteria%' AND
more criteria...
;
As I've stated above, this returns:
FAILED: Error in semantic analysis: need to specify partition columns because the destination table is partitioned.
Thanks!
This always works for me:
CREATE EXTERNAL TABLE IF NOT EXISTS table2 LIKE table1;
INSERT OVERWRITE TABLE table2 PARTITION (day) SELECT col1, col2, ..., day FROM table1;
ALTER TABLE table2 RECOVER PARTITIONS;
Notice that I've added 'day' as a column in the SELECT statement. Also notice that there is an ALTER TABLE line which is necessary for Hive to become aware of the partitions that were newly created in table2.
I have never used the LIKE option, so thanks for showing me that. Will it actually create all of the partitions that the first table has as well? If not, that could be the issue. You could try using dynamic partitions:
create external table if not exists table2 like table1;
insert overwrite table table2 partition(day) select col1, col2, day from table1;
This might not be the best solution, as I think you have to list your columns in the select clause, with the partition column last, and name the partition column in the partition clause.
And you must turn on dynamic partitioning.
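A minimal sketch of what turning it on could look like for this session, assuming col1 and col2 stand in for the real (elided) columns of table1:

```sql
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
-- nonstrict lets every partition column be resolved dynamically.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column (day) comes last in the SELECT list.
INSERT OVERWRITE TABLE table2 PARTITION (day)
SELECT col1, col2, day
FROM table1;
```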
I hope this helps.

Resources