Delete data returned from subquery in Oracle

I have two tables. If the number of rows in table1 exceeds a predefined limit (say 2), I need to copy the rows above that limit from table1 to table2 and delete those same rows from table1.
I used the query below to insert the excess data from table1 into table2:
insert into table2
SELECT * FROM table1 WHERE ROWNUM < ((select count(*) from table1)-2);
Now I need the delete query to remove those same rows from table1.
Thanks in advance.

A straightforward approach would be interim storage in a temporary table. Its content can be used both to determine the data to be deleted from table1 and as the source to feed table2.
Assume (slightly abusing notation) <pk> to be the PK column (or that of any candidate key) of table1 - usually there'll be some key that comprises only one column.
create global temporary table t_interim on commit preserve rows as
( SELECT <pk> pkc FROM table1 WHERE ROWNUM < ((select count(*) from table1)-2) )
;
insert into table2
select * from table1 where <pk> IN (
select pkc from t_interim
);
delete from table1 where <pk> IN (
select pkc from t_interim
);
Alternative
If the key of table1 you use spans more than one column, use an EXISTS clause instead, as follows (<ck_i> denoting the i-th component of a candidate key of table1):
create global temporary table t_interim on commit preserve rows as
( SELECT <ck_1> ck1, <ck_2> ck2, ..., <ck_n> ckn FROM table1 WHERE ROWNUM < ((select count(*) from table1)-2) )
;
insert into table2
select * from table1 t
where exists (
select 1
from t_interim t_i
where t.ck_1 = t_i.ck1
and t.ck_2 = t_i.ck2
...
and t.ck_n = t_i.ckn
)
;
delete from table1 t
where exists (
select 1
from t_interim t_i
where t.ck_1 = t_i.ck1
and t.ck_2 = t_i.ck2
...
and t.ck_n = t_i.ckn
)
;
(Technically you could try to adjust the first scheme by synthesizing a single key from the components of a candidate key, e.g. by concatenating them. You'd run the risk of introducing ambiguities ( ('a bc', 'ab c') -> ('abc', 'abc') ) or of hitting implementation limits (max. varchar length) with that first method.)
Note
In case the table doesn't have a PK, you can apply the technique using any candidate key of table1. There will always be one; in the extreme case it's the set of all columns.
This situation may be the right time to improve the db design and add a (synthetic) PK column to table1 (and to any other tables in the system that lack one).
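If you go that route, here is a minimal sketch of adding such a synthetic key (the names table1_seq and table1_pk are just placeholders of mine); on 12c and later an identity column achieves the same in fewer steps:
-- add a numeric column, backfill it from a sequence, then promote it to PK
alter table table1 add (id number);
create sequence table1_seq;
update table1 set id = table1_seq.nextval;
alter table table1 modify (id not null);
alter table table1 add constraint table1_pk primary key (id);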

Related

insert all and inner join in oracle

I would like to insert data into two tables, with a one-to-many relation between them. For this, I have to use a foreign key, of course.
I think table1's ID column is ideal as the primary key, but I always generate it automatically with a trigger, for every row. So,
how can I put Table1.ID (auto-generated, primary key) into table2's fkey column in the same insert query?
INSERT ALL INTO table1 ( --here (before this) generated the table1.id column automatically with a trigger.
table1.food,
table1.drink,
table1.shoe
) VALUES (
'apple',
'water',
'slippers'
)
INTO table2 (
fkey,
color
) VALUES (
table1.id, -- I would like table2.fkey == table1.id this gave me error
'blue'
) SELECT
*
FROM
table1
INNER JOIN table2 ON table1.id = table2.fkey;
The error message:
"00904. 00000 - "%s: invalid identifier""
As suggested by @OldProgrammer, use a sequence:
INSERT ALL INTO table1 ( -- table1.id is now taken explicitly from the sequence instead of the trigger
table1.id,
table1.food,
table1.drink,
table1.shoe
) VALUES (
<sequencename_table1>.nextval,
'apple',
'water',
'slippers'
)
INTO table2 (
fkey,
color
) VALUES (
<sequencename_table1>.currval, -- returns the current value of the table1 sequence
'blue'
) SELECT
*
FROM
table1
INNER JOIN table2 ON table1.id = table2.fkey;
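The statement above assumes the sequence already exists; a minimal sketch of creating it (the name is a placeholder standing in for <sequencename_table1>, and START WITH should be set above any id values already present in table1):
CREATE SEQUENCE sequencename_table1 START WITH 1 INCREMENT BY 1;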
Since you're using Oracle 12c, you could instead use an identity column. Then return the generated table1 ID into a local variable with the RETURNING clause right after the INSERT into table1, and use that variable in the following INSERT into table2, as shown below:
SQL> create table table1(
2 ID integer generated always as identity primary key,
3 food varchar2(50), drink varchar2(50), shoe varchar2(50)
4 );
SQL> create table table2(
2 fkey integer references table1(ID),
3 color varchar2(50)
4 );
SQL> declare
2 cl_tab table1.id%type;
3 begin
4 insert into table1(food,drink,shoe) values('apple','water','slippers' )
5 returning id into cl_tab;
6 insert into table2 values(cl_tab,'blue');
7 end;
8 /
SQL> select * from table1;
ID FOOD DRINK SHOE
-- ------- ------- -------
1 apple water slippers
SQL> select * from table2;
FKEY COLOR
---- --------------------------------------------------
1 blue
Any time you run the statements between BEGIN and END above, both table1.ID and table2.fkey will be populated with the same integer value. By the way, do not forget to commit the inserts if you need these values to be visible throughout the DB (i.e. from other sessions as well).

Appropriate Join/Connection For This Scenario?

I want to join two tables together, which I have done.
I also want to join them based on a condition, where a particular column has a specific value, and I have also done this successfully. So far I have used an inner join and a WHERE clause.
However, for this result set, I want to filter further by selecting ONLY the rows where a particular string appears more than once for a set of columns, e.g.
employee_ID and CERTIFICATE
I'd like to group where employee_id has a CERTIFICATE count > 2. This is after I have joined the tables together using a WHERE clause.
I am perhaps thinking of using a subquery in my WHERE clause (which is the 3rd line and also the last).
For further clarification, I want to display only employees who have a certificate count greater than 2. By certificate, I am referencing a table with the string 'Certificate' in a column 'Skill'. In other words, select only the rows where the string 'Certificate' appears TWICE for a particular employee ID.
To get just the employee ids:
SELECT t1.employee_id
FROM table1 t1
INNER JOIN
table2 t2
ON ( t1.col1 = t2.col1 )
GROUP BY t1.employee_id
HAVING COUNT( CASE t2.skill WHEN 'CERTIFICATE' THEN 1 END ) > 1
Or, to get all the columns:
SELECT *
FROM (
SELECT t1.*,
t2.*,
COUNT( CASE t2.skill WHEN 'CERTIFICATE' THEN 1 END )
OVER ( PARTITION BY t1.employee_id )
AS num_certificate
FROM table1 t1
INNER JOIN
table2 t2
ON ( t1.col1 = t2.col1 )
)
WHERE num_certificate > 1

Delete Duplicate rows in Vertica database

Vertica allows duplicates to be inserted into the tables. I can view those using the 'analyze_constraints' function.
How to delete duplicate rows from Vertica tables?
You should try to avoid/limit using DELETE with a large number of records. The following approach should be more effective:
Step 1 Create a new table with the same structure / projections as the one containing duplicates:
create table mytable_new like mytable including projections ;
Step 2 Insert into this new table de-duplicated rows:
insert /* +direct */ into mytable_new select <column list> from (
select * , row_number() over ( partition by <pk column list> ) as rownum from <table-name>
) a where a.rownum = 1 ;
Step 3 rename the original table (the one containing dups):
alter table mytable rename to mytable_orig ;
Step 4 rename the new table:
alter table mytable_new rename to mytable ;
That's all.
Mauro's answer is correct, but there is an error in the SQL of step 2 (it selects from <table-name> instead of mytable). So the complete way of working while avoiding DELETE should be as follows:
Step 1 Create a new table with the same structure / projections as the one containing duplicates:
create table mytable_new like mytable including projections ;
Step 2 Insert into this new table de-duplicated rows:
insert /* +direct */ into mytable_new select <column list> from (
select * , row_number() over ( partition by <pk column list> ) as rownum from mytable
) a where a.rownum = 1 ;
Step 3 rename the original table (the one containing dups):
alter table mytable rename to mytable_orig ;
Step 4 rename the new table:
alter table mytable_new rename to mytable ;
Off the top of my head, and not a great answer, so let's not let this be the final word: you can delete both copies and insert one back in.
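A rough sketch of that idea (my own, with <key> and 'k1' as placeholders for the duplicated key column and value, and assuming the duplicates are exact copies of each other):
-- back up one de-duplicated copy of the affected rows, delete all copies, re-insert
create table mytable_dups_backup as select distinct * from mytable where <key> = 'k1';
delete from mytable where <key> = 'k1';
insert into mytable select * from mytable_dups_backup;
drop table mytable_dups_backup;
commit;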
You can delete duplicates from Vertica tables by creating a temporary table and generating pseudo row_ids. Here are a few steps, which help especially if you are removing duplicates from very large and wide tables. In the example below, I assume the rows with keys k1 and k2 have more than one duplicate. For more info see here.
-- Find the duplicates
select keys, count(1) from large-table-1
where [where-conditions]
group by 1
having count(1) > 1
order by count(1) desc ;
-- Step 2: Dump the duplicates into temp table
create table test.large-table-1-dups
like large-table-1;
alter table test.large-table-1-dups -- add row_num column (pseudo row_id)
add column row_num int;
insert into test.large-table-1-dups
select *, ROW_NUMBER() OVER(PARTITION BY key)
from large-table-1
where key in ('k1', 'k2'); -- where, say, k1 has n and k2 has m exact dups
-- Step 3: Remove duplicates from the temp table
delete from test.large-table-1-dups
where row_num > 1;
select * from test.large-table-1-dups;
-- Sanity test: should now have one row each for k1 and k2
-- Step 4: Delete all duplicates from main table...
delete from large-table-1
where key in ('k1', 'k2');
-- Step 5: Insert data back into main table from temp dedupe data
alter table test.large-table-1-dups
drop column row_num;
insert into large-table-1
select * from test.large-table-1-dups;
Step 1: Create an intermediate table to port/load the data from the original table along with a row number.
In the sample below, data is ported from Table1 to Table2 along with a row_num column:
select * into Table2 from (select *, ROW_NUMBER() OVER(PARTITION BY A,B order by C)as row_num from Table1 ) A;
Step 2: Delete data from Table1 using Table2 created in the step above:
DELETE FROM Table1 WHERE EXISTS (SELECT NULL FROM Table2
where Table2.A=Table1.A
and Table2.B=Table1.B
and row_num > 1);
Step 3: Drop the table created in step 1, i.e. Table2:
Drop Table Table2;
You should have a look at this answer from the PostgreSQL wiki which also works for Vertica:
DELETE
FROM
tablename
WHERE
id IN(
SELECT
id
FROM
(
SELECT
id,
ROW_NUMBER() OVER(
partition BY column1,
column2,
column3
ORDER BY
id
) AS rnum
FROM
tablename
) t
WHERE
t.rnum > 1
);
It deletes all duplicate entries but the one with the lowest id.

conditional UPDATE in ORACLE with subquery

I have a table (table1) that represents an item grouping and another table (table2) that represents the items themselves.
table1.id is a foreign key in table2, and in every record of table1 I also keep information like the total number of records in table2 associated with that particular record and the sums of various fields, so that I can show the grouping and a summary of what's in it without having to query table2.
Usually items in table2 are added/removed one at a time, so I update table1 to reflect the changes in table2.
A new requirement arose: chosen items in a group must be moved to a new group. I thought of it as a 3-step operation:
create a new group in table1
update the chosen records in table2 to point to the newly created record in table1
the third step would be to subtract from the old group the number of records / the sums of those other fields I need to show, and add them to the new group; data that I can find simply by querying table2 for items associated with the new group.
I came up with the following statement, which works.
update table1 t1 set
countitems = (
case t1.id
when 1 then t1.countitems - ( select count( t2.id ) from table2 t2 where t2.id = 2 )
when 2 then ( select count( t2.id ) from table2 t2 where t2.id = 2 )
end
),
sumitems = (
case t1.id
when 1 then t1.sumitems - ( select sum( t2.num ) from table2 t2 where t2.id = 2 )
when 2 then ( select sum( t2.num ) from table2 t2 where t2.id = 2 )
end
)
where t1.id in( 1, 2 );
Is there a way to rewrite the statement without having to repeat the subquery every time?
Thanks,
Piero
You can use a cursor with a BULK COLLECT fetch and an update by rowid. That way you can simply write the join query that produces the desired result and update the table with those values. I always use this template and make slight adjustments each time.
declare
cursor cur_cur
IS
select t1.rowid row_id
, count(t2.id) countitems
, sum(t2.num) sumitems
from table1 t1
join table2 t2 on t1.id = t2.t1_id
group by t1.rowid
order by row_id
;
type type_rowid_array is table of rowid index by binary_integer;
type type_countitems_array is table of table1.countitems%type;
type type_sumitems_array is table of table1.sumitems%type;
arr_rowid type_rowid_array;
arr_countitems type_countitems_array;
arr_sumitems type_sumitems_array;
v_commit_size number := 10000;
begin
open cur_cur;
loop
fetch cur_cur bulk collect into arr_rowid, arr_countitems, arr_sumitems limit v_commit_size;
forall i in arr_rowid.first .. arr_rowid.last
update table1 tab
SET tab.countitems = arr_countitems(i)
, tab.sumitems = arr_sumitems(i)
where tab.rowid = arr_rowid(i)
;
commit;
exit when cur_cur%notfound;
end loop;
close cur_cur;
commit;
exception
when others
then rollback;
raise_application_error(-20000, 'ERROR updating table1(countitems,sumitems) - '||sqlerrm);
end;
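As a more compact alternative (a sketch of my own, not part of the answer above, assuming table2 carries the group id in a t1_id column as in the cursor query above), you can compute the per-group aggregates once in a USING subquery and apply them with a single MERGE. Note that this recomputes the totals from table2 rather than adjusting them incrementally, which gives the same result here since the stored values are just aggregates over table2:
merge into table1 t1
using (
    select t2.t1_id as id
         , count(t2.id) as countitems
         , sum(t2.num) as sumitems
    from table2 t2
    where t2.t1_id in (1, 2)
    group by t2.t1_id
) agg
on (t1.id = agg.id)
when matched then update set
    t1.countitems = agg.countitems
  , t1.sumitems = agg.sumitems;
One caveat: a group left with no rows in table2 won't appear in the USING subquery and therefore won't be updated by this MERGE.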

Oracle: how to INSERT if a row doesn't exist

What is the easiest way to INSERT a row if it doesn't exist, in PL/SQL (Oracle)?
I want something like:
IF NOT EXISTS (SELECT * FROM table WHERE name = 'jonny') THEN
INSERT INTO table VALUES ("jonny", null);
END IF;
But it's not working.
Note: this table has 2 fields, say, name and age. But only name is PK.
INSERT INTO table
SELECT 'jonny', NULL
FROM dual -- Not Oracle? No need for dual, drop that line
WHERE NOT EXISTS (SELECT NULL -- canonical way, but you can select
-- anything as EXISTS only checks existence
FROM table
WHERE name = 'jonny'
)
Assuming you are on 10g, you can also use the MERGE statement. This allows you to insert the row if it doesn't exist and ignore the row if it does exist. People tend to think of MERGE when they want to do an "upsert" (INSERT if the row doesn't exist and UPDATE if the row does exist) but the UPDATE part is optional now so it can also be used here.
SQL> create table foo (
2 name varchar2(10) primary key,
3 age number
4 );
Table created.
SQL> ed
Wrote file afiedt.buf
1 merge into foo a
2 using (select 'johnny' name, null age from dual) b
3 on (a.name = b.name)
4 when not matched then
5 insert( name, age)
6* values( b.name, b.age)
SQL> /
1 row merged.
SQL> /
0 rows merged.
SQL> select * from foo;
NAME AGE
---------- ----------
johnny
If name is a PK, then just insert and catch the error. The reason to do this rather than any check is that it will work even with multiple clients inserting at the same time. If you check and then insert, you have to hold a lock during that time, or expect the error anyway.
The code for this would be something like
BEGIN
INSERT INTO table( name, age )
VALUES( 'johnny', null );
EXCEPTION
WHEN dup_val_on_index
THEN
NULL; -- Intentionally ignore duplicates
END;
I found the examples a bit tricky to follow for the situation where you want to ensure a row exists in the destination table (especially when you have two columns as the primary key), but the primary key might not exist there at all so there's nothing to select.
This is what worked for me:
MERGE INTO table1 D
USING (
-- These are the row(s) you want to insert.
SELECT
'val1' AS FIELD_A,
'val2' AS FIELD_B
FROM DUAL
) S ON (
-- This is the criteria to find the above row(s) in the
-- destination table. S refers to the rows in the SELECT
-- statement above, D refers to the destination table.
D.FIELD_A = S.FIELD_A
AND D.FIELD_B = S.FIELD_B
)
-- This is the INSERT statement to run for each row that
-- doesn't exist in the destination table.
WHEN NOT MATCHED THEN INSERT (
FIELD_A,
FIELD_B,
FIELD_C
) VALUES (
S.FIELD_A,
S.FIELD_B,
'val3'
)
The key points are:
The SELECT statement inside the USING block must always return rows. If there are no rows returned from this query, no rows will be inserted or updated. Here I select from DUAL so there will always be exactly one row.
The ON condition is what sets the criteria for matching rows. If ON does not have a match then the INSERT statement is run.
You can also add a WHEN MATCHED THEN UPDATE clause if you want more control over the updates too.
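A minimal sketch of that variant, reusing the placeholder fields and 'val' literals from the statement above:
MERGE INTO table1 D
USING (
    SELECT 'val1' AS FIELD_A, 'val2' AS FIELD_B FROM DUAL
) S ON (
    D.FIELD_A = S.FIELD_A
    AND D.FIELD_B = S.FIELD_B
)
-- Update the non-key column when the row already exists...
WHEN MATCHED THEN UPDATE SET
    D.FIELD_C = 'val3'
-- ...and insert the whole row when it does not.
WHEN NOT MATCHED THEN INSERT (
    FIELD_A, FIELD_B, FIELD_C
) VALUES (
    S.FIELD_A, S.FIELD_B, 'val3'
);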
Using parts of @benoit's answer, I will use this:
DECLARE
varTmp NUMBER := 0;
BEGIN
-- check whether the row already exists
SELECT nvl((SELECT 1 FROM table WHERE name = 'john'), 0) INTO varTmp FROM dual;
-- insert only if it doesn't
IF (varTmp = 0) THEN
INSERT INTO table (name, age) VALUES ('john', null);
END IF;
END;
Sorry that I didn't use any of the given answers in full, but I need the IF check because my code is much more complex than this example table with name and age fields. I need very clear code. Well, thanks, I learned a lot! I'll accept @benoit's answer.
In addition to the perfect and valid answers given so far, there is also the ignore_row_on_dupkey_index hint you might want to use:
create table tq84_a (
name varchar2 (20) primary key,
age number
);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Johnny', 77);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Pete' , 28);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Sue' , 35);
insert /*+ ignore_row_on_dupkey_index(tq84_a(name)) */ into tq84_a values ('Johnny', null);
select * from tq84_a;
The hint is described on Tahiti.
you can use this syntax:
INSERT INTO table_name ( name, age )
select 'jonny', 18 from dual
where not exists(select 1 from table_name where name = 'jonny');
If a prompt pops up asking you to "enter substitution variable", run this before the above queries:
set define off;
INSERT INTO table_name ( name, age )
select 'jonny', 18 from dual
where not exists(select 1 from table_name where name = 'jonny');
You should use MERGE.
For example:
MERGE INTO employees e
USING (SELECT * FROM hr_records WHERE start_date > ADD_MONTHS(SYSDATE, -1)) h
ON (e.id = h.emp_id)
WHEN MATCHED THEN
UPDATE SET e.address = h.address
WHEN NOT MATCHED THEN
INSERT (id, address)
VALUES (h.emp_id, h.address);
or
MERGE INTO employees e
USING hr_records h
ON (e.id = h.emp_id)
WHEN MATCHED THEN
UPDATE SET e.address = h.address
WHEN NOT MATCHED THEN
INSERT (id, address)
VALUES (h.emp_id, h.address);
https://oracle-base.com/articles/9i/merge-statement
CTE and only CTE :-)
Just throw out the extra stuff. Here is an almost complete and verbose form for all cases of life; you can trim it down to any more concise form.
INSERT INTO reports r
(r.id, r.name, r.key, r.param)
--
-- Invoke this script from "WITH" to the end (";")
-- to debug and see prepared values.
WITH
-- Some new data to add.
newData AS(
SELECT 'Name 1' name, 'key_new_1' key FROM DUAL
UNION SELECT 'Name 2' NAME, 'key_new_2' key FROM DUAL
UNION SELECT 'Name 3' NAME, 'key_new_3' key FROM DUAL
),
-- Any single row to be copied alongside each new row from "newData",
-- if you need that, of course.
copyData AS(
SELECT r.*
FROM reports r
WHERE r.key = 'key_existing'
-- ! Prevent more than one row from being returned.
AND 1 = 0 -- replace this with a real condition that does that!
),
-- Last used ID from the "reports" table (it depends on your case).
-- (not going to work with concurrent transactions)
maxId AS (SELECT MAX(id) AS id FROM reports),
--
-- Some construction of all data for insertion.
SELECT maxId.id + ROWNUM, newData.name, newData.key, copyData.param
FROM copyData
-- matrix multiplication :)
-- (or a recursion if you're an imperative coder)
CROSS JOIN newData
CROSS JOIN maxId
--
-- Let's prevent re-insertion.
WHERE NOT EXISTS (
SELECT 1 FROM reports rs
WHERE rs.name IN(
SELECT name FROM newData
));
I call it "IF NOT EXISTS" on steroids. So, this helps me and I mostly do so.
