Greenplum : Is there a way to define a table that data will be stored in master only - greenplum

In Greenplum data base, Is there a way to define a table that data will be stored in master only ?

No, such tables are not supported in Greenplum. Also Greenplum does not support "replicated" table that stores the same data on all the segments. If you will clarify your case I would recommend you solution. Storing data on the master is not a good choice because in most cases no data processing is done on master

Got one solution
Follow below steps:
a> Create a table without any column
create table test_monly();
b> Then alter table and add some column
alter table test_monly add column x integer;
c> Now insert some records
insert into test_monly values(1);
insert into test_monly values(3);
insert into test_monly values(5);
insert into test_monly values(7);
d> Check using this query
select gp_segment_id,x from test_monly
e> All the records are available in master only

If you don't specify the distributed by clause, your table will only stored in one segment. But it is not guaranteed that the table will be stored in the master.

Related

Dynamic Partition on external hive table

Is there a way to dynamically partition an external Hive table? I referred to posts which mentioned to follow below steps -
Create an external non-partitioned table
Create an external partitioned table
Use insert into table command to insert the data from non-partitioned to partitioned table
Problem with the above solution is that I need to always use the insert command if my base (non-partitioned) table is modified (addition/deletion of parquet files).
I am seeking for a solution where I need not do any insert command and my partitioned table should be updated as and how my non-partitioned table is changed.

How to create table in Hive with specific column values from another table

I am new to Hive and have some problems. I try to find a answer here and other sites but with no luck... I also tried many different querys that come to my mind, also without success.
I have my source table and i want to create new table like this.
Were:
id would be number of distinct counties as auto increment numbers and primary key
counties as distinct names of counties (from source table)
You could follow this approach.
A CTAS(Create Table As Select)
with your example this CTAS could work
CREATE TABLE t_county
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE AS
WITH t AS(
SELECT DISTINCT county, ROW_NUMBER() OVER() AS id
FROM counties)
SELECT id, county
FROM t;
You cannot have primary key or foreign keys on Hive as you have primary key on RBDMSs like Oracle or MySql because Hive is schema on read instead of schema on write like Oracle so you cannot implement constraints of any kind on Hive.
I can not give you the exact answer because of it suppose to you must try to do it by yourself and then if you have a problem or a doubt come here and tell us. But, what i can tell you is that you can use the insertstatement to create a new table using data from another table, I.E:
create table CARS (name string);
insert table CARS select x, y from TABLE_2;
You can also use the overwrite statement if you desire to delete all the existing data that you have inside that table (CARS).
So, the operation will be
CREATE TABLE ==> INSERT OPERATION (OVERWRITE?) + QUERY OPERATION
Hive is not an RDBMS database, so there is no concept of primary key or foreign key.
But you can add auto increment column in Hive. Please try as:
Create table new_table as
select reflect("java.util.UUID", "randomUUID") id, countries from my_source_table;

Is it possible to make Oracle PL/SQL trigger as function and package?

I have two identical tables and to copy data from master table (Table 1) to backup table (Table 2), I have below trigger:
CREATE OR REPLACE TRIGGER copy_Table1_to_Table2 AFTER INSERT ON table1
FOR EACH ROW
BEGIN
INSERT INTO TABLE2
(col1, col2, ..., coln)
VALUES
(:NEW.col1, :NEW.col2, ..., :NEW.coln);
END;
Question 1:
This works to copy data from one table to other. If I have to copy data from 20 different tables to 20 backup table (Identical), is it possible to have package or functions that takes arguments as table name and column names and reuse the same trigger instead of writing 20 different triggers?
If yes, would like to know how the code will look. Appreciate any help on this.
Question 2:
To make backup table (Table 2) as clone of master table (Table 1), do I have to truncate backup table first? So that no duplicate values are inserted (assuming master table has no duplicates).
Technically am loading bulk data into master table and insert trigger is inserting duplicates into backup table. Not sure how to avoid this scenario.
Am sorry if am asking too many questions, its just am not able to figure out best way to achieve the goal.
Appreciate any help.
Thanks,
Richa

Add partition to hive table with no data

I am trying to create a hive table which has the same columns as another table (partitioned). I use the following query for the same
CREATE TABLE destTable STORED AS PARQUET AS select * from srcTable where 1=2;
Apparently I cannot use 'PARTITIONED BY(col_name)' because destTable must not be partitioned. But I want to mention that destTable should be partitioned by a column (same as srcTable) before I add data to it.
Is there a way to do that?
As you mentioned, destTable can not be a partitioned table so there is no way to do this directly. Also, destTable can not be an external table.
In this situation you will need to create a temporary "staging_table" (un-partitioned and a Hive-managed table) to hold the data.
Step 1: Transfer everything from srcTable to the staging_table
Step 2: Create a partitioned destTable and do:
INSERT OVERWRITE TABLE destTable PARTITION(xxxx)
SELECT * FROM staging_table;
Hope this helps.

HIVE: How create a table with all columns in another table EXCEPT one of them?

When I need to change a column into a partition (convert normal column as partition column in hive), I want to create a new table to copy all columns except one. I currently have >50 columns in the original table. Is there any clean way of doing that?
Something like:
CREATE student_copy LIKE student EXCEPT age and hair_color;
Thanks!
You can use a regex:
CTAS using REGEX column spec. :
set hive.support.quoted.identifiers=none;
CREATE TABLE student_copy AS SELECT `(age|hair_color)?+.+` FROM student;
set hive.support.quoted.identifiers=column;
BUT (as mentioned by Kishore Kumar Suthar :
this will not create a partitioned table, as that is not supported with CTAS (Create Table As Select).
Only way I see for you to get your partitioned table is by getting the complete create statement of the table (as mentioned by Abraham):
SHOW CREATE TABLE student;
Altering it to create a partition on the column you want. And after that you can use the select with regex when inserting into the new table.
If your partition column is already part of this select, then you need to make sure it is the last column you insert. If it is not you can exclude that column in the regex and including it as last. Also if you expect several partitions to be created based on your insert statement you need to enable 'dynamic partitioning':
set hive.support.quoted.identifiers=none;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE student_copy PARTITION(partcol1) SELECT `(age|hair_color|partcol1)?+.+`, partcol1 FROM student;
set hive.support.quoted.identifiers=column;
the 'hive.support.quoted.identifiers=none' is required to use the backticks '`' in the regex part of the query. I set this parameter to it's original value after my statement: 'hive.support.quoted.identifiers=column'
CREATE TABLE student_copy LIKE student;
It just copies the source table definition.
CREATE TABLE student_copy AS select name, age, class from student;
Target cannot be partitioned table.
Target cannot be external table.
It copies the structure as well as the data
I use below command to get the create statement of existing table.
SHOW CREATE TABLE student;
Copy the result and modify that based on your requirement for new table and run the modified command to get the new table.

Resources