In Snowflake I am trying to insert updated records into a table. I then want to identify the records that were just inserted as the most recent records, and save that as the final table output in a new column called ACTIVE, which will be either true or false. I am having trouble incorporating some sort of table-update step into my current query. Everything needs to be contained in the same query rather than broken up into separate parts.
I have my table as follows
CREATE TABLE IF NOT EXISTS MY_TABLE
(
LINK_ID BINARY NOT NULL,
LOAD TIMESTAMP NOT NULL,
SOURCE STRING NOT NULL,
SOURCE_DATE TIMESTAMP NOT NULL,
"ORDER" BIGINT NOT NULL, -- quoted because ORDER is a reserved word in Snowflake
ID BINARY NOT NULL,
ATTRIBUTE_ID BINARY NOT NULL
);
I have records being inserted in this way:
INSERT ALL
WHEN HAS_DATA AND ID_SEQ_NUM > 1 AND (SELECT COUNT(1) FROM MY_TABLE WHERE ID = KEY) = 0 THEN
INTO MY_TABLE VALUES (
LINK_KEY,
TIME,
DATASET_NAME,
DATASET_DATE,
ORDER_NUMBER,
O_KEY,
OA_KEY
)
SELECT *
FROM TEST_TABLE;
I would like the final output from this to be:
SELECT *, "ORDER" = MAX("ORDER") OVER (PARTITION BY ID) AS ACTIVE
FROM MY_TABLE;
This is so I can identify the most recent record per ID group as ACTIVE/TRUE and the previous records within that ID group as INACTIVE/FALSE
I tried to use an INSERT OVERWRITE approach like this:
INSERT ALL
WHEN HAS_DATA AND ID_SEQ_NUM > 1 AND (SELECT COUNT(1) FROM MY_TABLE WHERE ID = KEY) = 0 THEN
INTO MY_TABLE VALUES (
LINK_KEY,
TIME,
DATASET_NAME,
DATASET_DATE,
ORDER_NUMBER,
O_KEY,
OA_KEY
)
INSERT OVERWRITE INTO MY_TABLE
SELECT *, "ORDER" = MAX("ORDER") OVER (PARTITION BY ID) AS ACTIVE
FROM MY_TABLE
SELECT *
FROM MY_TABLE;
However, it seems INSERT OVERWRITE doesn't work this way (and I am not sure whether I can even add a new column to the table like this). Is there a way to incorporate it into this query, or a different way to update the table with this new ACTIVE column within the query itself?
Also, I am using INSERT ALL here because I am actually inserting into multiple different tables at once, but this is the table I am currently trying to modify.
You can use the overwrite option with conditional multi-table inserts.
Starting with your current statement:
INSERT ALL
WHEN HAS_DATA AND ID_SEQ_NUM > 1 AND (SELECT COUNT(1) FROM MY_TABLE WHERE ID = KEY) = 0 THEN
INTO MY_TABLE VALUES (
LINK_KEY,
TIME,
DATASET_NAME,
DATASET_DATE,
ORDER_NUMBER,
O_KEY,
OA_KEY
)
SELECT *
FROM TEST_TABLE;
Add the overwrite option immediately after the insert command:
INSERT OVERWRITE ALL
WHEN HAS_DATA AND ID_SEQ_NUM > 1 AND (SELECT COUNT(1) FROM MY_TABLE WHERE ID = KEY) = 0 THEN
INTO MY_TABLE VALUES (
LINK_KEY,
TIME,
DATASET_NAME,
DATASET_DATE,
ORDER_NUMBER,
O_KEY,
OA_KEY
)
SELECT *
FROM TEST_TABLE;
Note that this will truncate and insert into ALL tables in the multi-table insert. There is no way to be selective about which tables get truncated and re-inserted and which don't.
https://docs.snowflake.com/en/sql-reference/sql/insert-multi-table.html#optional-parameters
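If truncating every table in the multi-table insert is not acceptable, one workaround (a sketch only, not part of the INSERT itself; the view name is made up) is to keep MY_TABLE append-only and derive the ACTIVE flag in a view instead, so nothing ever needs to be rewritten:
CREATE OR REPLACE VIEW MY_TABLE_ACTIVE AS
SELECT *,
       "ORDER" = MAX("ORDER") OVER (PARTITION BY ID) AS ACTIVE -- TRUE only for the most recent row per ID
FROM MY_TABLE;
The flag is then computed at query time, and the original (non-OVERWRITE) INSERT ALL can stay as it is.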
INSERT INTO CUST_ID_HISTORY(CUST_ID,PREVIOUS_CUST_ID,UPDATE_DATE)
SELECT CUST_ID,PREVIOUS_CUST_ID,UPDATE_DATE
FROM CUST_ID_HISTORY WHERE NOT EXISTS (SELECT 1 FROM CUST_ID_HISTORY WHERE
CUST_ID='SCB301' AND PREVIOUS_CUST_ID='SCB201');
The above query inserts a new record into CUST_ID_HISTORY when the inserted values do not already exist in the table. I encounter the error ORA-00001: unique constraint violated with the supplied values, even though neither value exists in the table.
Your statement selects from CUST_ID_HISTORY and inserts into CUST_ID_HISTORY. In other words, when there is no record with CUST_ID = 'SCB301' and PREVIOUS_CUST_ID = 'SCB201', it tries to create a duplicate of every record already in CUST_ID_HISTORY. Presumably those columns form the table's primary key, hence the ORA-00001 unique key violation.
If what you're trying to achieve is to insert a record into CUST_ID_HISTORY for those values, perhaps you need a MERGE statement. Something like this:
merge into CUST_ID_HISTORY tgt
using ( select 'SCB301' as id, 'SCB201' as prev_id
from dual ) src
on ( src.id = tgt.cust_id
and src.prev_id = tgt.previous_cust_id)
when not matched then
insert (cust_id, previous_cust_id, update_date)
values (src.id, src.prev_id, sysdate)
/
The query:
INSERT INTO CUST_ID_HISTORY(CUST_ID,PREVIOUS_CUST_ID,UPDATE_DATE)
SELECT CUST_ID,PREVIOUS_CUST_ID,UPDATE_DATE
FROM CUST_ID_HISTORY
will duplicate all the rows of the CUST_ID_HISTORY table.
Adding the filter:
WHERE NOT EXISTS (
SELECT 1
FROM CUST_ID_HISTORY
WHERE CUST_ID='SCB301' AND PREVIOUS_CUST_ID='SCB201'
);
will check whether there is an existing row in the table with CUST_ID='SCB301' AND PREVIOUS_CUST_ID='SCB201'. If no such row exists, then all the rows from the CUST_ID_HISTORY table will be duplicated; if such a row does exist, then the statement will insert zero rows.
So your query will either duplicate all the rows or do nothing.
If you want to insert a single new row, checking that it does not exist then:
INSERT INTO CUST_ID_HISTORY(CUST_ID,PREVIOUS_CUST_ID,UPDATE_DATE)
SELECT 'SCB301',
'SCB201',
SYSDATE -- The current date/time
FROM DUAL
WHERE NOT EXISTS (
SELECT 1
FROM CUST_ID_HISTORY
WHERE CUST_ID='SCB301' AND PREVIOUS_CUST_ID='SCB201'
);
or:
MERGE INTO CUST_ID_HISTORY dst
USING (
SELECT 'SCB301' AS CUST_ID,
'SCB201' AS PREVIOUS_CUST_ID
FROM DUAL
) src
ON ( src.CUST_ID = dst.CUST_ID
AND src.PREVIOUS_CUST_ID = dst.PREVIOUS_CUST_ID )
WHEN NOT MATCHED THEN
INSERT ( CUST_ID, PREVIOUS_CUST_ID, UPDATE_DATE )
VALUES ( src.CUST_ID, src.PREVIOUS_CUST_ID, SYSDATE );
This will both check whether an existing row is present and insert a single new row if it is not.
I am saving table IDs as a foreign key into another table using an Oracle APEX shuttle field, like 3:4:5. Now I want to use these IDs in a SQL query with an IN clause. I have replaced : with , using the REPLACE function, but it shows a
no data found
message.
The following query works fine when I use static values.
select * from table where day_id IN(3,4,5)
But when I try to use
select * from table where id IN(Select id from table2)
it shows no data found.
From what I understand, you have a list like 1:2:3:4 that you want to use in an IN clause; you can split the list into separate values like this:
select regexp_substr('1:2:3:4','[^:]+', 1, level) as list from dual
connect by regexp_substr('1:2:3:4', '[^:]+', 1, level) is not null;
This will return:
List
1
2
3
4
Then you can simply add it to your query like this:
SELECT *
FROM TABLE
WHERE day_id IN
(SELECT regexp_substr('1:2:3:4','[^:]+', 1, level) AS list
FROM dual
CONNECT BY regexp_substr('1:2:3:4', '[^:]+', 1, level) IS NOT NULL
);
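In APEX itself the shuttle value usually arrives in a page item rather than as a literal, so the same pattern can be parameterised. A sketch, where :P1_DAYS is a hypothetical page item holding the colon-separated list and day_id is assumed numeric:
SELECT *
FROM your_table
WHERE day_id IN
  (SELECT TO_NUMBER(regexp_substr(:P1_DAYS, '[^:]+', 1, LEVEL))
   FROM dual
   CONNECT BY regexp_substr(:P1_DAYS, '[^:]+', 1, LEVEL) IS NOT NULL
  );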
Try the statement below. It works as you expect.
create table table1 (id number, name varchar2(20));
alter table table1 add constraints pri_cons primary key(id);
create table table2 (id number, name varchar2(20));
alter table table2 add constraints ref_cons FOREIGN KEY(id) REFERENCES table1 (id);
begin
insert into table1 values (1,'Bala');
insert into table1 values (2,'Sathish');
insert into table1 values (3,'Subbu');
insert into table2 values (1,'Nalini');
insert into table2 values (2,'Sangeetha');
insert into table2 values (3,'Rubini');
end;
/
select * from table1 where id IN (Select id from table2);
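For reference, with the data inserted above the final select returns all three rows, since every id in table1 also appears in table2; something like:
        ID NAME
---------- --------
         1 Bala
         2 Sathish
         3 Subbu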
Vertica allows duplicates to be inserted into tables. I can view them using the ANALYZE_CONSTRAINTS function.
How to delete duplicate rows from Vertica tables?
You should try to avoid/limit using DELETE with a large number of records. The following approach should be more effective:
Step 1 Create a new table with the same structure / projections as the one containing duplicates:
create table mytable_new like mytable including projections ;
Step 2 Insert into this new table de-duplicated rows:
insert /* +direct */ into mytable_new select <column list> from (
select * , row_number() over ( partition by <pk column list> ) as rownum from <table-name>
) a where a.rownum = 1 ;
Step 3 rename the original table (the one containing dups):
alter table mytable rename to mytable_orig ;
Step 4 rename the new table:
alter table mytable_new rename to mytable ;
That's all.
Mauro's answer is correct, but there is an error in the SQL of step 2: the inner select must read from the original table, mytable. The complete way of working while avoiding DELETE is therefore as follows:
Step 1 Create a new table with the same structure / projections as the one containing duplicates:
create table mytable_new like mytable including projections ;
Step 2 Insert into this new table de-duplicated rows:
insert /* +direct */ into mytable_new select <column list> from (
select * , row_number() over ( partition by <pk column list> ) as rownum from mytable
) a where a.rownum = 1 ;
Step 3 rename the original table (the one containing dups):
alter table mytable rename to mytable_orig ;
Step 4 rename the new table:
alter table mytable_new rename to mytable ;
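As a concrete sketch of step 2, assuming a hypothetical table mytable(k1, k2, v) whose logical key is (k1, k2):
insert /* +direct */ into mytable_new
select k1, k2, v -- the explicit column list, leaving out the generated rownum
from (
select * , row_number() over ( partition by k1, k2 ) as rownum from mytable
) a where a.rownum = 1 ;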
Off the top of my head (and not a great answer, so don't let this be the final word): you can delete both rows and insert one back in.
You can delete duplicates from Vertica tables by creating a temporary table and generating pseudo row IDs. Here are a few steps, especially useful if you are removing duplicates from very large and wide tables. In the example below, I assume the keys k1 and k2 each have more than one duplicate.
-- Step 1: Find the duplicates
select keys, count(1) from large-table-1
where [where-conditions]
group by 1
having count(1) > 1
order by count(1) desc ;
-- Step 2: Dump the duplicates into temp table
create table test.large-table-1-dups
like large-table-1;
alter table test.large-table-1-dups -- add row_num column (pseudo row_id)
add column row_num int;
insert into test.large-table-1-dups
select *, ROW_NUMBER() OVER(PARTITION BY key)
from large-table-1
where key in ('k1', 'k2'); -- where, say, k1 has n and k2 has m exact dups
-- Step 3: Remove duplicates from the temp table
delete from test.large-table-1-dups
where row_num > 1;
select * from test.large-table-1-dups;
-- Sanity test: should now have one row each for the k1 and k2 keys above
-- Step 4: Delete all duplicates from main table...
delete from large-table-1
where key in ('k1', 'k2');
-- Step 5: Insert data back into main table from temp dedupe data
alter table test.large-table-1-dups
drop column row_num;
insert into large-table-1
select * from test.large-table-1-dups;
Step 1: Create an intermediate table to port/load the data from the original table along with a row number.
In the sample below, data is ported from Table1 to Table2 along with a row_num column:
select * into Table2 from (select *, ROW_NUMBER() OVER(PARTITION BY A,B order by C)as row_num from Table1 ) A;
Step 2: Delete the duplicated rows from Table1 using the Table2 created in the step above. Note that this removes every row of a duplicated (A, B) group, including the copy you want to keep, so one row per group has to be re-inserted from Table2 afterwards (see the sketch after the delete):
DELETE FROM Table1 WHERE EXISTS (SELECT NULL FROM Table2
where Table2.A=Table1.A
and Table2.B=Table1.B
and row_num > 1);
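A sketch of that re-insert, assuming Table1 has exactly the columns A, B and C used above:
INSERT INTO Table1
SELECT A, B, C -- every column except the pseudo row_num
FROM Table2 t
WHERE row_num = 1
AND EXISTS (SELECT NULL FROM Table2 d
            WHERE d.A = t.A AND d.B = t.B AND d.row_num > 1);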
Step 3: Drop the table created in step 1, i.e. Table2:
Drop Table Table2;
You should have a look at this answer from the PostgreSQL wiki which also works for Vertica:
DELETE
FROM
tablename
WHERE
id IN(
SELECT
id
FROM
(
SELECT
id,
ROW_NUMBER() OVER(
PARTITION BY column1,
column2,
column3
ORDER BY
id
) AS rnum
FROM
tablename
) t
WHERE
t.rnum > 1
);
It deletes all duplicate entries except the one with the lowest id.
I need to update a query so that it checks that a duplicate entry does not exist before insertion. In MySQL I can just use INSERT IGNORE so that if a duplicate record is found it just skips the insert, but I can't seem to find an equivalent option for Oracle. Any suggestions?
If you're on 11g you can use the hint IGNORE_ROW_ON_DUPKEY_INDEX:
SQL> create table my_table(a number, constraint my_table_pk primary key (a));
Table created.
SQL> insert /*+ ignore_row_on_dupkey_index(my_table, my_table_pk) */
2 into my_table
3 select 1 from dual
4 union all
5 select 1 from dual;
1 row created.
Check out the MERGE statement; its WHEN NOT MATCHED clause should do what you want.
Due to Oracle's lack of support for a true VALUES() clause, the syntax for a single record with fixed values is pretty clumsy though:
MERGE INTO your_table yt
USING (
SELECT 42 as the_pk_value,
'some_value' as some_column
FROM dual
) t on (yt.pk = t.the_pk_value)
WHEN NOT MATCHED THEN
INSERT (pk, the_column)
VALUES (t.the_pk_value, t.some_column);
A different approach (if you are e.g. doing bulk loading from a different table) is to use the "Error logging" facility of Oracle. The statement would look like this:
INSERT INTO your_table (col1, col2, col3)
SELECT c1, c2, c3
FROM staging_table
LOG ERRORS INTO errlog ('some comment') REJECT LIMIT UNLIMITED;
Afterwards, all rows that would have thrown an error are available in the errlog table. You need to create that table (or whatever name you choose) manually before running the insert, using DBMS_ERRLOG.CREATE_ERROR_LOG.
See the manual for details
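For example, a minimal sketch of creating the log table (YOUR_TABLE and ERRLOG are placeholder names):
BEGIN
  -- builds an ERRLOG table shaped to capture rejected rows for YOUR_TABLE
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'YOUR_TABLE',
                               err_log_table_name => 'ERRLOG');
END;
/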
I don't think there is an equivalent, but to save time you can attempt the insert and ignore the inevitable error:
begin
insert into table_a( col1, col2, col3 )
values ( 1, 2, 3 );
exception when dup_val_on_index then
null;
end;
/
This will only ignore exceptions raised specifically by duplicate primary key or unique key constraints; everything else will be raised as normal.
If you don't want to do this then you have to select from the table first, which isn't really that efficient.
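If you are loading many rows this way, a related bulk pattern (a sketch, not part of the answer above) is FORALL ... SAVE EXCEPTIONS, which finishes the whole batch and collects the duplicate-key failures instead of stopping at the first one. Assuming table_a from above with its key on col1:
DECLARE
  dml_errors EXCEPTION;
  PRAGMA EXCEPTION_INIT(dml_errors, -24381); -- ORA-24381: errors collected during FORALL
  TYPE num_tab IS TABLE OF NUMBER;
  l_vals num_tab := num_tab(1, 2, 2); -- the second 2 would violate the key
BEGIN
  FORALL i IN 1 .. l_vals.COUNT SAVE EXCEPTIONS
    INSERT INTO table_a (col1, col2, col3) VALUES (l_vals(i), l_vals(i), l_vals(i));
EXCEPTION
  WHEN dml_errors THEN
    NULL; -- offending rows were skipped; details are in SQL%BULK_EXCEPTIONS
END;
/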
Another variant
Insert into my_table (student_id, group_id)
select distinct p.studentid, g.groupid
from person p, groups g -- note: the table is called groups here because GROUP is a reserved word in Oracle
where NOT EXISTS (select 1
from my_table a
where a.student_id = p.studentid
and a.group_id = g.groupid)
or you could do
Insert into my_table (student_id, group_id)
select distinct p.studentid, g.groupid
from person p, groups g
MINUS
select student_id, group_id
from my_table
A simple solution
insert into t1
select * from t2
where not exists
(select 1 from t1 where t1.id= t2.id)
This one isn't mine, but it came in really handy when using SQL*Loader:
create a view that points to your table:
CREATE OR REPLACE VIEW test_view
AS SELECT * FROM test_tab
create the trigger:
CREATE OR REPLACE TRIGGER test_trig
INSTEAD OF INSERT ON test_view
FOR EACH ROW
BEGIN
INSERT INTO test_tab VALUES
(:NEW.id, :NEW.name);
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN NULL;
END test_trig;
and in the ctl file, insert into the view instead:
OPTIONS(ERRORS=0)
LOAD DATA
INFILE 'file_with_duplicates.csv'
INTO TABLE test_view
FIELDS TERMINATED BY ','
(id, field1)
How about simply adding a unique index on whatever fields you need to check for dupes? It saves a read-before-insert check.
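A sketch with made-up names:
CREATE UNIQUE INDEX my_table_uq ON my_table (field1, field2);
On its own this makes a duplicate insert fail rather than be skipped, so it is best combined with one of the error-ignoring approaches above (the exception handler, or the ignore_row_on_dupkey_index hint, which in fact requires such a unique index or constraint to target).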
Yet another "where not exists" variant, using dual...
insert into t1(id, unique_name)
select t1_seq.nextval, 'Franz-Xaver' from dual
where not exists (select 1 from t1 where unique_name = 'Franz-Xaver');
How do I update multiple rows in an Oracle database, each with a different set of values?
It's not pretty, but you can do this with MERGE INTO, using a UNION of SELECT ... FROM DUAL as the merge source.
Given:
create table writer (
id integer primary key,
name varchar2(255) not null
)
You can do:
merge into writer dst
using (
select 1 as id, 'Edward Luttwak' as name from dual
union
select 2 as id, 'Iain Sinclair' as name from dual
) src
on (dst.id = src.id)
when matched then update set dst.name = src.name
when not matched then insert (dst.id, dst.name) values (src.id, src.name);
For each row you want to update, add a term to the union in the subquery. If you know that the rows are definitely updates, and never inserts, you can drop the whole when not matched clause.
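For instance, the update-only form against the same writer table is just:
merge into writer dst
using (
  select 1 as id, 'Edward Luttwak' as name from dual
  union
  select 2 as id, 'Iain Sinclair' as name from dual
) src
on (dst.id = src.id)
when matched then update set dst.name = src.name;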