Hive 'create table like' without including partition columns - hadoop

Say I create tbl1 like so:
create table tbl1 (
col_a STRING,
col_b STRING,
col_c STRING )
partitioned by (col_d STRING);
Is there a shorthand way to create tbl2 - a table with same columns as tbl1, but without paritioning by anything (and without including the parition column). tbl2 manual ddl would be:
create table tbl2 (
col_a STRING,
col_b STRING,
col_c STRING );
Thanks for any help!

You can use CTAS(Create table as Select) in hive.
create table tbl2 as select * from tbl1
This will not create any partition in tbl2 even though the tbl1 holds the partition.Only limitation is with out select you cannot be able to create the structure.

Partitioned column also one of the columns in the table.
if you want only 3 columns(col_a, col_b, col_c), you need to explicitly mention them in the query answered by Avinash.
create table tbl2 as select col_a, col_b, col_c from tbl1;

Related

If oracle table1 is empty then output table2 else output table1 itself; No columns on both the tables are common

I have two different table:
Table1, Table2.
Two tables have different columns and nothing in common. What I am looking to get is
if Table1 is empty/null then
output Table2
else
output table1
Is it possible to get it done in Oracle? Any leads would be appreciated.
Accordingly to your phrase:
if Table1 is empty/null then output Table2 else output table1
I think the solution is (I briefed Table1, Table2 by A, B respectively ):
--I created this tables to test the solution
create table A( id number, val varchar2(5));
create table B( code varchar2(5), event_dt date);
insert into b(code, event_dt)
values ('test', sysdate);
--query(1)
select b.code, to_char(b.event_dt,'yyyy-mm-dd')
from b
where not exists (select 1 from a)
union
select to_char(id), to_char(val)
from a
;
--now insert data on the other table (to test purposes)
insert into A(id, val)
values(1, 'TestA');
--run the query(1) again
The key is "union", kind of repeat your query when the first portion deals to no data
found.
Please remember to CAST your columns to achieve the same DATA-TYPES required by UNION
Best regards.

Insert data listing columns with partitioning field in Hive

First of all let's setup a test environment:
CREATE TABLE IF NOT EXISTS source_table (
`col1` TIMESTAMP,
`col2` STRING
);
CREATE TABLE IF NOT EXISTS dest_table (
`col1` TIMESTAMP,
`col2` STRING,
`col3` STRING
)
PARTITIONED BY (day STRING)
STORED AS AVRO;
INSERT INTO TABLE source_table VALUES ('2018-03-21 17:08:04.401', 'test1'), ('2018-03-22 12:02:04.222', 'test2'), ('2018-03-22 07:21:04.111', 'test3');
How could I list the column names during insertion and put the partition value dynamically? The following command doesn't work:
INSERT INTO TABLE dest_table(col1, col2) PARTITION(day) SELECT col1, col2, date_format(col1, 'yyyy-MM-dd') FROM source_table;
By the way, without listing the columns of dest_table inside the INSERT INTO command, having two tables with the same columns number, everything works fine. What if my dest_table has more fields than the source_table?
Thank you for helping me.
P.S.
Ok, if I hardcode NULL this works. I leave the question opened because there might be better ways to achieve that.
INSERT INTO TABLE dest_table PARTITION(day) SELECT col1, col2, NULL, date_format(col1, 'yyyy-MM-dd') FROM source_table;
Anyway, this method is strictly bounded with columns order? In a real-life scenario, how could I handle lots of columns specifying a mapping, to avoid mistakes?
The syntax for inserting into a partitioned table when you want to list the specific columns is shown below. You don't need to put null on col3 since Hive will put a default value NULL since it is not in the column list during insert.
INSERT INTO TABLE dest_table PARTITION (day)(col1, col2, day)
SELECT col1, col2, date_format(col1, 'yyyy-MM-dd') FROM source_table;
Result:
col1 col2 col3 day
2018-03-22 12:02:04.222 test2 NULL 2018-03-22
2018-03-22 07:21:04.111 test3 NULL 2018-03-22
2018-03-21 17:08:04.401 test1 NULL 2018-03-21

Convert String List to Number in Oracle DB

I am saving table ids as foreign key into another table using Oracle Apex Shuttle field like(3:4:5). Now I want to use these IDS in sql query using IN Clause. I have replaced : with , using replace function but it shows
no data found
message.
The following query works fine when I use static values.
select * from table where day_id IN(3,4,5)
But when I try to use
select * from table where id IN(Select id from table2)
it shows no data found.
From what i understand you have a list like 1:2:3:4 that you want to use in a IN clause; you can transform the list into separated values like this:
select regexp_substr('1:2:3:4','[^:]+', 1, level) as list from dual
connect by regexp_substr('1:2:3:4', '[^:]+', 1, level) is not null;
This will return:
List
1
2
3
4
Then you can simply add it to your query like this:
SELECT *
FROM TABLE
WHERE day_id IN
(SELECT regexp_substr('1:2:3:4','[^:]+', 1, level) AS list
FROM dual
CONNECT BY regexp_substr('1:2:3:4', '[^:]+', 1, level) IS NOT NULL
);
Can you try below statement. It has working as you expected.
create table table1 (id number, name varchar2(20));
alter table table1 add constraints pri_cons primary key(id);
create table table2 (id number, name varchar2(20));
alter table table2 add constraints ref_cons FOREIGN KEY(id) REFERENCES table1 (id);
begin
insert into table1 values (1,'Bala');
insert into table1 values (2,'Sathish');
insert into table1 values (3,'Subbu');
insert into table2 values (1,'Nalini');
insert into table2 values (2,'Sangeetha');
insert into table2 values (3,'Rubini');
end;
/
select * from table1 where id IN (Select id from table2);

MERGING DATA OF TWO TABLES

I want to write a query which finds the difference between two tables and writes updates or new data into third table. My two tables have identical column names. Third table which captures changes have extra column called comment. I would like to insert the comment whether it is a new row or updated row based on the row modification.
**TABLE1 (BACKUP)**
KEY,FIRST_NAME,LAST_NAME,CITY
1,RAM,KUMAR,INDIA
2,TOM,MOODY,ENGLAND
3,MOHAMMAD,HAFEEZ,PAKISTAN
4,MONIKA,SAM,USA
5,MIKE,PALEDINO,USA
**TABLE2 (CURRENT)**
KEY,FIRST_NAME,LAST_NAME,CITY
1,RAM,KUMAR,USA
2,TOM,MOODY,ENGLAND
3,MOHAMMAD,HAFEEZ,PAKISTAN
4,MONIKA,SAM,INDIA
5,MIKE,PALEDINO,USA
6,MAHELA,JAYA,SL
**TABLE3 (DIFFERENCE FROM TABLE2 TO TABLE1)**
KEY,FIRST_NAME,LAST_NAME,CITY,COMMENT
1,RAM,KUMAR,USA,UPDATE
4,MONIKA,SAM,INDIA,UPDATE
6,MAHELA,JAYA,SL,INSERT
table scripts
DROP TABLE TABLE1;
DROP TABLE TABLE2;
DROP TABLE TABLE3;
CREATE TABLE TABLE1
(
KEY NUMBER,
FIRST_NAME VARCHAR2(100),
LAST_NAME VARCHAR2(100),
CITY VARCHAR2(50)
);
/
CREATE TABLE TABLE2
(
KEY NUMBER,
FIRST_NAME VARCHAR2(100),
LAST_NAME VARCHAR2(100),
CITY VARCHAR2(50)
);
/
CREATE TABLE TABLE3
(
KEY NUMBER,
FIRST_NAME VARCHAR2(100),
LAST_NAME VARCHAR2(100),
CITY VARCHAR2(50),
COMMENTS VARCHAR2(200)
);
/
INSERT ALL
INTO TABLE1
VALUES(1,'RAM','KUMAR','INDIA')
INTO TABLE1 VALUES(2,'TOM','MOODY','ENGLAND')
INTO TABLE1 VALUES(3,'MOHAMMAD','HAFEEZ','PAKISTAN')
INTO TABLE1 VALUES(4,'MONIKA','SAM','USA')
INTO TABLE1 VALUES(5,'MIKE','PALEDINO','USA')
SELECT 1 FROM DUAL;
/
INSERT ALL
INTO TABLE2
VALUES(1,'RAM','KUMAR','USA')
INTO TABLE2 VALUES(2,'TOM','MOODY','ENGLAND')
INTO TABLE2 VALUES(3,'MOHAMMAD','HAFEEZ','PAKISTAN')
INTO TABLE2 VALUES(4,'MONIKA','SAM','INDIA')
INTO TABLE2 VALUES(5,'MIKE','PALEDINO','USA')
INTO TABLE2 VALUES(6,'MAHELA','JAYA','SL')
SELECT 1 FROM DUAL;
I was using the merge statement to accomplish the same. but i have hit a roadblock in merge statement , it's rhrowing an error "SQL Error: ORA-00905: missing keyword
00905. 00000 - "missing keyword"" I dont understand where is the error. please help
INSERT INTO TABLE3
SELECT KEY,FIRST_NAME,LAST_NAME,CITY,NULL AS COMMENTS FROM TABLE2
MINUS
SELECT KEY,FIRST_NAME,LAST_NAME,CITY,NULL AS COMMENTS FROM TABLE1
;
MERGE INTO TABLE3 A
USING TABLE1 B
ON (A.KEY=B.KEY)
WHEN MATCHED THEN
UPDATE SET A.COMMENTS='UPDATED'
WHEN NOT MATCHED THEN
UPDATE SET A.COMMENTS='INSERTED';
There is no such WHEN NOT MATCHED THEN UPDATE clause, you should use WHEN NOT MATCHED THEN INSERT. Refer to MERGE for details.
A few assumptions made about the data:
An INSERT event will be a record identified by its key in table2 (current data) that does not have a matching key in the original back-up table: table1.
An UPDATE event is a field that exists in both table1 and table2 for the same KEY but is not the same.
Records which did not change between tables are not to be recorded in table3.
Example Query: Check for Updates
SELECT UPD_QUERY.NEW_CITY, 'UPDATED' as COMMENTS
FROM (SELECT CASE WHEN REPLACE(CURR.CITY, BKUP.CITY,'') IS NOT NULL THEN CURR.CITY
ELSE NULL END as NEW_CITY
FROM table1 BKUP, table2 CURR
WHERE BKUP.KEY = CURR.KEY) UPD_QUERY
WHERE UPD_QUERY.NEW_CITY is NOT NULL;
You can repeat this comparison method for the other fields:
SELECT UPD_QUERY.*
FROM (SELECT CURR.KEY,
CASE WHEN REPLACE(CURR.FIRST_NAME, BKUP.FIRST_NAME,'') IS NOT NULL
THEN CURR.FIRST_NAME
ELSE NULL END as FIRST_NAME,
CASE WHEN REPLACE(CURR.LAST_NAME, BKUP.LAST_NAME,'') IS NOT NULL
THEN CURR.LAST_NAME
ELSE NULL END as LAST_NAME,
CASE WHEN REPLACE(CURR.CITY, BKUP.CITY,'') IS NOT NULL
THEN CURR.CITY
ELSE NULL END as CITY
FROM table1 BKUP, table2 CURR
WHERE BKUP.KEY = CURR.KEY) UPD_QUERY
WHERE COALESCE(UPD_QUERY.FIRST_NAME, UPD_QUERY.LAST_NAME, UPD_QUERY.CITY)
is NOT NULL;
NOTE: This could get unwieldy very quickly if the number of columns compared are many. Since the target table design (table3) requires not only identification of a change, but the field and its new value are also recorded.
Example Query: Look for Newly Added Records
SELECT CURR.*, 'INSERTED' as COMMENTS
FROM table2 CURR, table1 BKUP
WHERE CURR.KEY = BKUP.KEY(+)
AND BKUP.KEY is NULL;
Basically MERGE forces the operation: MATCHED=UPDATE (or DELETE), NOT MATCHED = INSERT. It's in the docs.
You can do what you want but you need two insert statements with different set operators,
For UPDATED:
Insert into table3
table1 INTERSECT table2
For INSERTED:
Insert into table3
table2 MINUS table1

Oracle Equivalent to MySQL INSERT IGNORE?

I need to update a query so that it checks that a duplicate entry does not exist before insertion. In MySQL I can just use INSERT IGNORE so that if a duplicate record is found it just skips the insert, but I can't seem to find an equivalent option for Oracle. Any suggestions?
If you're on 11g you can use the hint IGNORE_ROW_ON_DUPKEY_INDEX:
SQL> create table my_table(a number, constraint my_table_pk primary key (a));
Table created.
SQL> insert /*+ ignore_row_on_dupkey_index(my_table, my_table_pk) */
2 into my_table
3 select 1 from dual
4 union all
5 select 1 from dual;
1 row created.
Check out the MERGE statement. This should do what you want - it's the WHEN NOT MATCHED clause that will do this.
Do to Oracle's lack of support for a true VALUES() clause the syntax for a single record with fixed values is pretty clumsy though:
MERGE INTO your_table yt
USING (
SELECT 42 as the_pk_value,
'some_value' as some_column
FROM dual
) t on (yt.pk = t.the_pke_value)
WHEN NOT MATCHED THEN
INSERT (pk, the_column)
VALUES (t.the_pk_value, t.some_column);
A different approach (if you are e.g. doing bulk loading from a different table) is to use the "Error logging" facility of Oracle. The statement would look like this:
INSERT INTO your_table (col1, col2, col3)
SELECT c1, c2, c3
FROM staging_table
LOG ERRORS INTO errlog ('some comment') REJECT LIMIT UNLIMITED;
Afterwards all rows that would have thrown an error are available in the table errlog. You need to create that errlog table (or whatever name you choose) manually before running the insert using DBMS_ERRLOG.CREATE_ERROR_LOG.
See the manual for details
I don't think there is but to save time you can attempt the insert and ignore the inevitable error:
begin
insert into table_a( col1, col2, col3 )
values ( 1, 2, 3 );
exception when dup_val_on_index then
null;
end;
/
This will only ignore exceptions raised specifically by duplicate primary key or unique key constraints; everything else will be raised as normal.
If you don't want to do this then you have to select from the table first, which isn't really that efficient.
Another variant
Insert into my_table (student_id, group_id)
select distinct p.studentid, g.groupid
from person p, group g
where NOT EXISTS (select 1
from my_table a
where a.student_id = p.studentid
and a.group_id = g.groupid)
or you could do
Insert into my_table (student_id, group_id)
select distinct p.studentid, g.groupid
from person p, group g
MINUS
select student_id, group_id
from my_table
A simple solution
insert into t1
select from t2
where not exists
(select 1 from t1 where t1.id= t2.id)
This one isn't mine, but came in really handy when using sqlloader:
create a view that points to your table:
CREATE OR REPLACE VIEW test_view
AS SELECT * FROM test_tab
create the trigger:
CREATE OR REPLACE TRIGGER test_trig
INSTEAD OF INSERT ON test_view
FOR EACH ROW
BEGIN
INSERT INTO test_tab VALUES
(:NEW.id, :NEW.name);
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN NULL;
END test_trig;
and in the ctl file, insert into the view instead:
OPTIONS(ERRORS=0)
LOAD DATA
INFILE 'file_with_duplicates.csv'
INTO TABLE test_view
FIELDS TERMINATED BY ','
(id, field1)
How about simply adding an index with whatever fields you need to check for dupes on and say it must be unique? Saves a read check.
yet another "where not exists"-variant using dual...
insert into t1(id, unique_name)
select t1_seq.nextval, 'Franz-Xaver' from dual
where not exists (select 1 from t1 where unique_name = 'Franz-Xaver');

Resources