I have a table TABLE1 and is having more than 10L records.
As part of a requirement, I need to get some records based on the below query to a temp table TEMP_TABLE1.
CREATE TABLE TEMP_TABLE1 AS (SELECT * FROM TABLE1 tb1 WHERE tb1.repo_id IN (condition A));
And,I have to get the rest of the records in to TEMP_TABLE2 based on the opposite query like,
CREATE TABLE TEMP_TABLE2 AS (SELECT * FROM TABLE1 tb1 WHERE tb1.repo_id NOT IN (condition A)); --added NOT IN
My question is, after the first query execution, can we move the rest of the records in the table without a new query which also takes the same time to process. I am using the same condition to filter in both the cases.
INSERT ALL
WHEN repo_id IN (condition A) THEN INTO temp_table1
WHEN repo_id not IN (condition A) THEN INTO temp_table2
SELECT * FROM table1;
At first you need to create empty temp_tables:
create table temp_table1 as (select * from table1 where 1=0);
create table temp_table2 as (select * from table1 where 1=0);
Instead of second condition WHEN repo_id not IN (condition A) THEN you can use simply ELSE if both sets are excluding.
Examples of conditional inserts (look at the bottom of page).
Related
Could you please tell me how to compare differences between table and my select query and insert those results in separate table? My plan is to create one base table (name RESULT) by using select statement and populate it with current result set. Then next day I would like to create procedure which will going to compare same select with RESULT table, and insert differences into another table called DIFFERENCES.
Any ideas?
Thanks!
You can create the RESULT_TABLE using CTAS as follows:
CREATE TABLE RESULT_TABLE
AS SELECT ... -- YOUR QUERY
Then you can use the following procedure which calculates the difference between your query and data from RESULT_TABLE:
CREATE OR REPLACE PROCEDURE FIND_DIFF
AS
BEGIN
INSERT INTO DIFFERENCES
--data present in the query but not in RESULT_TABLE
(SELECT ... -- YOUR QUERY
MINUS
SELECT * FROM RESULT_TABLE)
UNION
--data present in the RESULT_TABLE but not in the query
(SELECT * FROM RESULT_TABLE
MINUS
SELECT ... );-- YOUR QUERY
END;
/
I have used the UNION and the difference between both of them in a different order using MINUS to insert the deleted data also in the DIFFERENCES table. If this is not the requirement then remove the query after/before the UNION according to your requirement.
-- Create a table with results from the query, and ID as primary key
create table result_t as
select id, col_1, col_2, col_3
from <some-query>;
-- Create a table with new rows, deleted rows or updated rows
create table differences_t as
select id
-- Old values
,b.col_1 as old_col_1
,b.col_2 as old_col_2
,b.col_3 as old_col_3
-- New values
,a.col_1 as new_col_1
,a.col_2 as new_col_2
,a.col_3 as new_col_3
-- Execute the query once again
from <some-query> a
-- Outer join to detect also detect new/deleted rows
full join result_t b using(id)
-- Null aware comparison
where decode(a.col_1, b.col_1, 1, 0) = 0
or decode(a.col_2, b.col_2, 1, 0) = 0
or decode(a.col_3, b.col_3, 1, 0) = 0;
I am using Toad for oracle 12c. I need to copy a table and data (40M) from one shcema to another (prod to test). However there is an unique key(not the PK for this table) called record_Id col which has something data like this 3.000*******19E15. About 2M rows has same numbers(I believe its because very large number) which are unique in prod. When I try to copy it violets the unique key of that col. I am using toad "export data to another schema" function to copy the data.
when I execute query in prod
select count(*) from table_name
OR
select count(distinct(record_id) from table_name
Both query gives the exact same numbers of data.
I don't have DBA permission. How do I copy all data without violating unique key of the table.
Thanks in advance!
You can use UPSERT for decisional INSERT or UPDATE or you may write small procedure for this.
you may consider to use NOT EXISTS, but your data is big and it might not be resource efficient.
insert into prod_tab
select * from other_tab t1 where NOT exists (
select 1 from prod_tab t2 where t1.id = t2.id
);
In Oracle you can use a MERGE query for that.
The following query proceeds as follows for each data row :
if the source record_id does not yet exist in the target table, a new record is inserted
else, the existing record is updated with source values
For the sake of the example, I assumed that there are two other columns in the table : column1 and column2.
MERGE INTO target_table t1
USING (SELECT * from source_table t2)
ON (t1.record_id = t2.record_id)
WHEN MATCHED THEN UPDATE SET
t1.column1 = t2.column1,
t1.column2 = t2.column2
WHEN NOT MATCHED THEN INSERT
(record_id, column1, column2) VALUES (t2.record_id, t2.column1, t2.column2)
I have two hive tables, in which one table is updating an hourly basic by Java API team (they are calling and storing it into hive table1). And now I have to aggregate the latest data and store it into another table called table2 (data which are loaded newly,because old data have been aggregated and stored). For that I have used the query below:
set maxtime = select max(lastactivitytimestamp) from table2;
insert into table2 select * from table1 where lastactivitytimestamp > unix_timestamp('${hivevar:maxtime}');
I am not getting any result. But when I give the timestamp value manually I am getting data, like below:
insert into table2 select * from table1 where lastactivitytimestamp > unix_timestamp('2014-08-18 15:23:26.754');
Is it possible to pass dynamic values in unix_timestamp?
Try removing the upper commas from the unix_timestamp() function, like this:
insert into table2 select * from table1 where lastactivitytimestamp > unix_timestamp(${hivevar:maxtime});
I create a table partitioned by date. But can not use the partition in where clause.
Here is the process
step1:
CREATE TABLE new_table (
a string,
b string
)
PARTITIONED BY (dt string);
Step2:
Insert overwrite table new_table partition (dt=$date)
Select a, b from my_table
where dt = '$date
Table is created.
Describe new_table;
a string
b string
dt string
Problem:
select * from new_table where dt='$date'
returns empty set.
whereas
select * from new_table
returns a, b, and dt.
Does anyone know what might be the reason for this?
Thanks
Sahara, partitions in where clause do work. Any chance you are specifying the wrong value in the RHS of your predicate?
You may also want to see the data inside /user/hive/warehouse/new_table (by default) to see if the data there looks the way you expect it to be.
Try inserting with
Insert overwrite table new_table partition (dt=$date)
Select a, b, dt from my_table
where dt = '$date'
I have 2 tables that are the same structure. One is a temp one and the other is a prod one. The entire data set gets loaded each time and sometimes this dataset will have deleted records from the prior datasets. I load the dataset into temp table first and if any records were deleted I want to deleted them from the prod table also.
So how can I find the records that exist in prod but not in temp? I tried outer join but it doesn't seem to be working. It's returning all the records from the table in the left or right depending on doing left or right outer join.
I then also want to delete those records in the prod table.
One way would be to use the MINUS operator
SELECT * FROM table1
MINUS
SELECT * FROM table2
will show all the rows in table1 that do not have an exact match in table2 (you can obviously specify a smaller column list if you are only interested in determining whether a particular key exists in both tables).
Another would be to use a NOT EXISTS
SELECT *
FROM table1 t1
WHERE NOT EXISTS( SELECT 1
FROM table2 t2
WHERE t1.some_key = t2.some_key )
How about something like:
SELECT * FROM ProdTable WHERE ID NOT IN
(select ID from TempTable);
It'd work the same as a DELETE statement as well:
DELETE FROM ProdTable WHERE ID NOT IN
(select ID from TempTable);
MINUS can work here
The following statement combines results with the MINUS operator, which returns only rows returned by the first query but not by the second:
SELECT * FROM prod
MINUS
SELECT * FROM temp;
Minus will only work if the table structure is same