UPDATE on INSERT duplicate primary key in Oracle? - oracle

I have a simple INSERT query where I need to use UPDATE instead when the primary key is a duplicate. In MySQL this seems easier, in Oracle it seems I need to use MERGE.
All examples I could find of MERGE had some sort of "source" and "target" tables, in my case, the source and target is the same table. I was not able to make sense of the examples to create my own query.
Is MERGE the only way or maybe there's a better solution?
INSERT INTO movie_ratings
VALUES (1, 3, 5)
It's basically this and the primary key is the first 2 values, so an update would be like this:
UPDATE movie_ratings
SET rating = 8
WHERE mid = 1 AND aid = 3
I thought of using a trigger that would automatically execute the UPDATE statement when the INSERT was called but only if the primary key is a duplicate. Is there any problem doing it this way? I need some help with triggers though as I'm having some difficulty trying to understand them and doing my own.

MERGE is the 'do INSERT or UPDATE as appropriate' statement in Standard SQL, and probably therefore in Oracle SQL too.
Yes, you need a 'table' to merge from, but you can almost certainly create that table on the fly:
MERGE INTO Movie_Ratings M
USING (SELECT 1 AS mid, 3 AS aid, 8 AS rating FROM dual) N
ON (M.mid = N.mid AND M.aid = N.aid)
WHEN MATCHED THEN UPDATE SET M.rating = N.rating
WHEN NOT MATCHED THEN INSERT( mid, aid, rating)
VALUES(N.mid, N.aid, N.rating);
(Syntax not verified.)

A typical way of doing this is
performing the INSERT and catch a DUP_VAL_ON_INDEX and then perform an UPDATE instead
performing the UPDATE first and if SQL%Rows = 0 perform an INSERT
You can't write a trigger on a table that does another operation on the same table. That's causing an Oracle error (mutating tables).

I'm a T-SQL guy but a trigger in this case is not a good solution. Most triggers are not good solutions. In T-SQL, I would simply perform an IF EXISTS (SELECT * FROM dbo.Table WHERE ...) but in Oracle, you have to select the count...
DECLARE
cnt NUMBER;
BEGIN
SELECT COUNT(*)
INTO cnt
FROM mytable
WHERE id = 12345;
IF( cnt = 0 )
THEN
...
ELSE
...
END IF;
END;
It would appear that MERGE is what you need in this case:
MERGE INTO movie_ratings mr
USING (
SELECT rating, mid, aid
WHERE mid = 1 AND aid = 3) mri
ON (mr.movie_ratings_id = mri.movie_ratings_id)
WHEN MATCHED THEN
UPDATE SET mr.rating = 8 WHERE mr.mid = 1 AND mr.aid = 3
WHEN NOT MATCHED THEN
INSERT (mr.rating, mr.mid, mr.aid)
VALUES (1, 3, 8)
Like I said, I'm a T-SQL guy but the basic idea here is to "join" the movie_rating table against itself. If there's no performance hit on using the "if exists" example, I'd use it for readability.

Related

Oracle equivalent query for this postgress query - CONFLICT [duplicate]

The UPSERT operation either updates or inserts a row in a table, depending if the table already has a row that matches the data:
if table t has a row exists that has key X:
update t set mystuff... where mykey=X
else
insert into t mystuff...
Since Oracle doesn't have a specific UPSERT statement, what's the best way to do this?
The MERGE statement merges data between two tables. Using DUAL
allows us to use this command. Note that this is not protected against concurrent access.
create or replace
procedure ups(xa number)
as
begin
merge into mergetest m using dual on (a = xa)
when not matched then insert (a,b) values (xa,1)
when matched then update set b = b+1;
end ups;
/
drop table mergetest;
create table mergetest(a number, b number);
call ups(10);
call ups(10);
call ups(20);
select * from mergetest;
A B
---------------------- ----------------------
10 2
20 1
The dual example above which is in PL/SQL was great becuase I wanted to do something similar, but I wanted it client side...so here is the SQL I used to send a similar statement direct from some C#
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"="smith" , "name"="john"
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,"smith", "john" )
However from a C# perspective this provide to be slower than doing the update and seeing if the rows affected was 0 and doing the insert if it was.
An alternative to MERGE (the "old fashioned way"):
begin
insert into t (mykey, mystuff)
values ('X', 123);
exception
when dup_val_on_index then
update t
set mystuff = 123
where mykey = 'X';
end;
Another alternative without the exception check:
UPDATE tablename
SET val1 = in_val1,
val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%rowcount = 0 )
THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
insert if not exists
update:
INSERT INTO mytable (id1, t1)
SELECT 11, 'x1' FROM DUAL
WHERE NOT EXISTS (SELECT id1 FROM mytble WHERE id1 = 11);
UPDATE mytable SET t1 = 'x1' WHERE id1 = 11;
None of the answers given so far is safe in the face of concurrent accesses, as pointed out in Tim Sylvester's comment, and will raise exceptions in case of races. To fix that, the insert/update combo must be wrapped in some kind of loop statement, so that in case of an exception the whole thing is retried.
As an example, here's how Grommit's code can be wrapped in a loop to make it safe when run concurrently:
PROCEDURE MyProc (
...
) IS
BEGIN
LOOP
BEGIN
MERGE INTO Employee USING dual ON ( "id"=2097153 )
WHEN MATCHED THEN UPDATE SET "last"="smith" , "name"="john"
WHEN NOT MATCHED THEN INSERT ("id","last","name")
VALUES ( 2097153,"smith", "john" );
EXIT; -- success? -> exit loop
EXCEPTION
WHEN NO_DATA_FOUND THEN -- the entry was concurrently deleted
NULL; -- exception? -> no op, i.e. continue looping
WHEN DUP_VAL_ON_INDEX THEN -- an entry was concurrently inserted
NULL; -- exception? -> no op, i.e. continue looping
END;
END LOOP;
END;
N.B. In transaction mode SERIALIZABLE, which I don't recommend btw, you might run into
ORA-08177: can't serialize access for this transaction exceptions instead.
I'd like Grommit answer, except it require dupe values. I found solution where it may appear once: http://forums.devshed.com/showpost.php?p=1182653&postcount=2
MERGE INTO KBS.NUFUS_MUHTARLIK B
USING (
SELECT '028-01' CILT, '25' SAYFA, '6' KUTUK, '46603404838' MERNIS_NO
FROM DUAL
) E
ON (B.MERNIS_NO = E.MERNIS_NO)
WHEN MATCHED THEN
UPDATE SET B.CILT = E.CILT, B.SAYFA = E.SAYFA, B.KUTUK = E.KUTUK
WHEN NOT MATCHED THEN
INSERT ( CILT, SAYFA, KUTUK, MERNIS_NO)
VALUES (E.CILT, E.SAYFA, E.KUTUK, E.MERNIS_NO);
I've been using the first code sample for years. Notice notfound rather than count.
UPDATE tablename SET val1 = in_val1, val2 = in_val2
WHERE val3 = in_val3;
IF ( sql%notfound ) THEN
INSERT INTO tablename
VALUES (in_val1, in_val2, in_val3);
END IF;
The code below is the possibly new and improved code
MERGE INTO tablename USING dual ON ( val3 = in_val3 )
WHEN MATCHED THEN UPDATE SET val1 = in_val1, val2 = in_val2
WHEN NOT MATCHED THEN INSERT
VALUES (in_val1, in_val2, in_val3)
In the first example the update does an index lookup. It has to, in order to update the right row. Oracle opens an implicit cursor, and we use it to wrap a corresponding insert so we know that the insert will only happen when the key does not exist. But the insert is an independent command and it has to do a second lookup. I don't know the inner workings of the merge command but since the command is a single unit, Oracle could execute the correct insert or update with a single index lookup.
I think merge is better when you do have some processing to be done that means taking data from some tables and updating a table, possibly inserting or deleting rows. But for the single row case, you may consider the first case since the syntax is more common.
A note regarding the two solutions that suggest:
1) Insert, if exception then update,
or
2) Update, if sql%rowcount = 0 then insert
The question of whether to insert or update first is also application dependent. Are you expecting more inserts or more updates? The one that is most likely to succeed should go first.
If you pick the wrong one you will get a bunch of unnecessary index reads. Not a huge deal but still something to consider.
Try this,
insert into b_building_property (
select
'AREA_IN_COMMON_USE_DOUBLE','Area in Common Use','DOUBLE', null, 9000, 9
from dual
)
minus
(
select * from b_building_property where id = 9
)
;
From http://www.praetoriate.com/oracle_tips_upserts.htm:
"In Oracle9i, an UPSERT can accomplish this task in a single statement:"
INSERT
FIRST WHEN
credit_limit >=100000
THEN INTO
rich_customers
VALUES(cust_id,cust_credit_limit)
INTO customers
ELSE
INTO customers SELECT * FROM new_customers;

Oracle PL/SQL Update statement looping forever - 504 Gateway Time-out

I'm trying to update a table based on another one's information:
Source_Table (Table 1) columns:
TABLE_ROW_ID (Based on trigger-sequence when insert)
REP_ID
SOFT_ASSIGNMENT
Description (Table 2) columns:
REP_ID
NEW_SOFT_ASSIGNMENT
This is my loop statement:
SELECT count(table_row_id) INTO V_ROWS_APPROVED FROM Source_Table;
FOR i IN 1..V_ROWS_APPROVED LOOP
SELECT REQUESTED_SOFT_MAPPING INTO V_SOFT FROM Source_Table WHERE ROW_ID = i;
SELECT REP_ID INTO V_REP_ID FROM Source_Table WHERE ROW_ID = i;
UPDATE Description_Table D
SET D.NEW_SOFT_ASSIGNMENT = V_SOFT
WHERE D.REP_ID = V_REP_ID;
END LOOP;
END;
The ending result of this loop is a beautiful ''504 Gateway Time-out''.
I know the issue is on the Update query but there's no other way (I can think about) of doing it.
Can someone give me a hand please?
Thanks
Unless your row_id values are contiguous - i.e. count(row_id) == max(row_id) - then this will get a no-data-found. Sequences aren't gapless, so this seems fairly likely. We have no way of telling if that is happening and somehow that is leaving your connection hanging until it times out, or if it's just taking a long time because you're doing a lot of individual queries and updates over a large data set. (And you may be squashing any errors that do occur, though you haven't shown that.)
You don't need to query and update in a loop though, or even use PL/SQL; you can apply all the values in the source table to the description table with a single update or merge:
merge into description_table d
using source_table s
on (s.rep_id = d.rep_id)
when matched then
update set d.new_soft_assignment = s.requested_soft_mapping;
db<>fiddle with some dummy data, including a non-contiguous row_id to show that erroring.

optimizing a dup delete statement Oracle

I have 2 delete statements that are taking a long time to complete. There are several indexes on the columns in where clause.
What is a duplicate?
If 2 or more records have same values in columns id,cid,type,trefid,ordrefid,amount and paydt then there are duplicates.
The DELETEs delete about 1 million record.
Can they be re-written in any way to make it quicker.
DELETE FROM TABLE1 A WHERE loaddt < (
SELECT max(loaddt) FROM TABLE1 B
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
COMMIT;
DELETE FROM TABLE1 a where rowid > (
Select min(rowid) from TABLE1 b
WHERE
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
commit;
Explain Plan:
DELETE TABLE1
HASH JOIN 1296491
Access Predicates
AND
A.ID=ITEM_1
A.CID=ITEM_2
ITEM_3=NVL(TYPE,'-99999')
ITEM_4=NVL(TREFID,'-99999')
ITEM_5=NVL(ORDREFID,'-99999')
ITEM_6=NVL(AMOUNT,(-99999))
ITEM_7=NVL(PAYDT,TO_DATE(' 9999-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Filter Predicates
LOADDT<MAX(LOADDT)
TABLE ACCESS TABLE1 FULL 267904
VIEW VW_SQ_1 690385
SORT GROUP BY 690385
TABLE ACCESS TABLE1 FULL 267904
How large is the table? If count of deleted rows is up to 12% then you may think about index.
Could you somehow partition your table - like week by week and then scan only actual week?
Maybe this could be more effecient. When you're using aggregate function, then oracle must walk through all relevant rows (in your case fullscan), but when you use exists it stops when the first occurence is found. (and of course the query would be much faster, when there was one function-based(because of NVL) index on all columns in where clause)
DELETE FROM TABLE1 A
WHERE exists (
SELECT 1
FROM TABLE1 B
WHERE
A.loaddt != b.loaddt
a.id=b.id and
a.cid=b.cid and
NVL(a.type,'-99999') = NVL(b.type,'-99999') and
NVL(a.trefid,'-99999')=NVL(b.trefid,'-99999') and
NVL(a.ordrefid,'-99999')= NVL(b.ordrefid,'-99999') and
NVL(a.amount,'-99999')=NVL(b.amount,'-99999') and
NVL(a.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))=NVL(b.paydt,TO_DATE('9999-12-31','YYYY-MM-DD'))
);
Although some may disagree, I am a proponent of running large, long running deletes procedurally. In my view it is much easier to control and track progress (and your DBA will like you better ;-) Also, not sure why you need to join table1 to itself to identify duplicates (and I'd be curious if you ever run into snapshot too old issues with your current approach). You also shouldn't need multiple delete statements, all duplicates should be handled in one process. Finally, you should check WHY you're constantly re-introducing duplicates each week, and perhaps change the load process (maybe doing a merge/upsert rather than all inserts).
That said, you might try something like:
-- first create mat view to find all duplicates
create materialized view my_dups_mv
tablespace my_tablespace
build immediate
refresh complete on demand
as
select id,cid,type,trefid,ordrefid,amount,paydt, count(1) as cnt
from table1
group by id,cid,type,trefid,ordrefid,amount,paydt
having count(1) > 1;
-- dedup data (or put into procedure and schedule along with mat view refresh above)
declare
-- make sure my_dups_mv is refreshed first
cursor dup_cur is
select * from my_dups_mv;
type duprec_t is record(row_id rowid);
duprec duprec_t;
type duptab_t is table of duprec_t index by pls_integer;
duptab duptab_t;
l_ctr pls_integer := 0;
l_dupcnt pls_integer := 0;
begin
for rec in dup_cur
loop
l_ctr := l_ctr + 1;
-- assuming needed indexes exist
select rowid
bulk collect into duptab
from table1
where id = rec.id
and cid = rec.cid
and type = rec.type
and trefid = rec.trefid
and ordrefid = rec.ordrefid
and amount = rec.amount
and paydt = rec.paydt
-- order by whatever makes sense to make the "keeper" float to top
order by loaddt desc
;
for i in 2 .. duptab.count
loop
l_dupcnt := l_dupcnt + 1;
delete from table1 where rowid = duptab(i).row_id;
end loop;
if (mod(l_ctr, 10000) = 0) then
-- log to log table here (calling autonomous procedure you'll need to implement)
insert_logtable('Table1 deletes', 'Commit reached, deleted ' || l_dupcnt || ' rows');
commit;
end if;
end loop;
commit;
end;
Check your log table for progress status.
1. Parallel
alter session enable parallel dml;
DELETE /*+ PARALLEL */ FROM TABLE1 A WHERE loaddt < (
...
Assuming you have Enterprise Edition, a sane server configuration, and you are on 11g. If you're not on 11g, the parallel syntax is slightly different.
2. Reduce memory requirements
The plan shows a hash join, which is probably a good thing. But without any useful filters, Oracle has to hash the entire table. (Tbone's query, that only use a GROUP BY, looks nicer and may run faster. But it will also probably run into the same problem trying to sort or hash the entire table.)
If the hash can't fit in memory it must be written to disk, which can be very slow. Since you run this query every week, only one of the tables needs to look at all the rows. Depending on exactly when it runs, you can add something like this to the end of the query: ) where b.loaddt >= sysdate - 14. This may significantly reduce the amount of writing to temporary tablespace. And it may also reduce read IO if you use some partitioning strategy like jakub.petr suggested.
3. Active Report
If you want to know exactly what your query is doing, run the Active Report:
select dbms_sqltune.report_sql_monitor(sql_id => 'YOUR_SQL_ID_HERE', type => 'active')
from dual;
(Save the output to an .html file and open it with a browser.)

How to? Correct sql syntax for finding the next available identifier

I think I could use some help here from more experienced users...
I have an integer field name in a table, let's call it SO_ID in a table SO, and to each new row I need to calculate a new SO_ID based on the following rules
1) SO_ID consists of 6 letters where first 3 are an area code, and the last three is the sequenced number within this area.
309001
309002
309003
2) so the next new row will have a SO_ID of value
309004
3) if someone deletes the row with SO_ID value = 309002, then the next new row must recycle this value, so the next new row has got to have the SO_ID of value
309002
can anyone please provide me with either a SQL function or PL/SQL (perhaps a trigger straightaway?) function that would return the next available SO_ID I need to use ?
I reckon I could get use of keyword rownum in my sql, but the follwoing just doens't work properly
select max(so_id),max(rownum) from(
select (so_id),rownum,cast(substr(cast(so_id as varchar(6)),4,3) as int) from SO
where length(so_id)=6
and substr(cast(so_id as varchar(6)),1,3)='309'
and cast(substr(cast(so_id as varchar(6)),4,3) as int)=rownum
order by so_id
);
thank you for all your help!
This kind of logic is fraught with peril. What if two sessions calculate the same "next" value, or both try to reuse the same "deleted" value? Since your column is an integer, you'd probably be better off querying "between 309001 and 309999", but that begs the question of what happens when you hit the thousandth item in area 309?
Is it possible to make SO_ID a foreign key to another table as well as a unique key? You could pre-populate the parent table with all valid IDs (or use a function to generate them as needed), and then it would be a simple matter to select the lowest one where a child record doesn't exist.
well, we came up with this... sort of works.. concurrency is 'solved' via unique constraint
select min(lastnumber)
from
(
select so_id,so_id-LAG(so_id, 1, so_id) OVER (ORDER BY so_id) AS diff,LAG(so_id, 1, so_id) OVER (ORDER BY so_id)as lastnumber
from so_miso
where substr(cast(so_id as varchar(6)),1,3)='309'
and length(so_id)=6
order by so_id
)a
where diff>1;
Do you really need to compute & store this value at the time a row is inserted? You would normally be better off storing the area code and a date in a table and computing the SO_ID in a view, i.e.
SELECT area_code ||
LPAD( DENSE_RANK() OVER( PARTITION BY area_code
ORDER BY date_column ),
3,
'0' ) AS so_id,
<<other columns>>
FROM your_table
or having a process that runs periodically (nightly, for example) to assign the SO_ID using similar logic.
If your application is not pure sql, you could do this in application code (ie: Java code). This would be more straightforward.
If you are recycling numbers when rows are deleted, your base table must be consulted when generating the next number. "Legacy" pre-relational schemes that attempt to encode information in numbers are a pain to make airtight when numbers must be recycled after deletes, as you say yours must.
If you want to avoid having to scan your table looking for gaps, an after-delete routine must write the deleted number to a separate table in a "ReuseMe" column. The insert routine does this:
begins trans
selects next-number table for update
uses a reuseme number if available else uses the next number
clears the reuseme number if applicable or increments the next-number in the next-number table
commits trans
Ignoring the issues about concurrency, the following should give a decent start.
If 'traffic' on the table is low enough, go with locking the table in exclusive mode for the duration of the transaction.
create table blah (soc_id number(6));
insert into blah select 309000 + rownum from user_tables;
delete from blah where soc_id = 309003;
commit;
create or replace function get_next (i_soc in number) return number is
v_min number := i_soc* 1000;
v_max number := v_min + 999;
begin
lock table blah in exclusive mode;
select min(rn) into v_min
from
(select rownum rn from dual connect by level <= 999
minus
select to_number(substr(soc_id,4))
from blah
where soc_id between v_min and v_max);
return v_min;
end;

How to put more than 1000 values into an Oracle IN clause [duplicate]

This question already has answers here:
SQL IN Clause 1000 item limit
(5 answers)
Closed 8 years ago.
Is there any way to get around the Oracle 10g limitation of 1000 items in a static IN clause? I have a comma delimited list of many of IDs that I want to use in an IN clause, Sometimes this list can exceed 1000 items, at which point Oracle throws an error. The query is similar to this...
select * from table1 where ID in (1,2,3,4,...,1001,1002,...)
Put the values in a temporary table and then do a select where id in (select id from temptable)
select column_X, ... from my_table
where ('magic', column_X ) in (
('magic', 1),
('magic', 2),
('magic', 3),
('magic', 4),
...
('magic', 99999)
) ...
I am almost sure you can split values across multiple INs using OR:
select * from table1 where ID in (1,2,3,4,...,1000) or
ID in (1001,1002,...,2000)
You may try to use the following form:
select * from table1 where ID in (1,2,3,4,...,1000)
union all
select * from table1 where ID in (1001,1002,...)
Where do you get the list of ids from in the first place? Since they are IDs in your database, did they come from some previous query?
When I have seen this in the past it has been because:-
a reference table is missing and the correct way would be to add the new table, put an attribute on that table and join to it
a list of ids is extracted from the database, and then used in a subsequent SQL statement (perhaps later or on another server or whatever). In this case, the answer is to never extract it from the database. Either store in a temporary table or just write one query.
I think there may be better ways to rework this code that just getting this SQL statement to work. If you provide more details you might get some ideas.
Use ...from table(... :
create or replace type numbertype
as object
(nr number(20,10) )
/
create or replace type number_table
as table of numbertype
/
create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
open p_ref_result for
select *
from employees , (select /*+ cardinality(tab 10) */ tab.nr from table(p_numbers) tab) tbnrs
where id = tbnrs.nr;
end;
/
This is one of the rare cases where you need a hint, else Oracle will not use the index on column id. One of the advantages of this approach is that Oracle doesn't need to hard parse the query again and again. Using a temporary table is most of the times slower.
edit 1 simplified the procedure (thanks to jimmyorr) + example
create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
open p_ref_result for
select /*+ cardinality(tab 10) */ emp.*
from employees emp
, table(p_numbers) tab
where tab.nr = id;
end;
/
Example:
set serveroutput on
create table employees ( id number(10),name varchar2(100));
insert into employees values (3,'Raymond');
insert into employees values (4,'Hans');
commit;
declare
l_number number_table := number_table();
l_sys_refcursor sys_refcursor;
l_employee employees%rowtype;
begin
l_number.extend;
l_number(1) := numbertype(3);
l_number.extend;
l_number(2) := numbertype(4);
tableselect(l_number, l_sys_refcursor);
loop
fetch l_sys_refcursor into l_employee;
exit when l_sys_refcursor%notfound;
dbms_output.put_line(l_employee.name);
end loop;
close l_sys_refcursor;
end;
/
This will output:
Raymond
Hans
I wound up here looking for a solution as well.
Depending on the high-end number of items you need to query against, and assuming your items are unique, you could split your query into batches queries of 1000 items, and combine the results on your end instead (pseudocode here):
//remove dupes
items = items.RemoveDuplicates();
//how to break the items into 1000 item batches
batches = new batch list;
batch = new batch;
for (int i = 0; i < items.Count; i++)
{
if (batch.Count == 1000)
{
batches.Add(batch);
batch.Clear()
}
batch.Add(items[i]);
if (i == items.Count - 1)
{
//add the final batch (it has < 1000 items).
batches.Add(batch);
}
}
// now go query the db for each batch
results = new results;
foreach(batch in batches)
{
results.Add(query(batch));
}
This may be a good trade-off in the scenario where you don't typically have over 1000 items - as having over 1000 items would be your "high end" edge-case scenario. For example, in the event that you have 1500 items, two queries of (1000, 500) wouldn't be so bad. This also assumes that each query isn't particularly expensive in of its own right.
This wouldn't be appropriate if your typical number of expected items got to be much larger - say, in the 100000 range - requiring 100 queries. If so, then you should probably look more seriously into using the global temporary tables solution provided above as the most "correct" solution. Furthermore, if your items are not unique, you would need to resolve duplicate results in your batches as well.
Yes, very weird situation for oracle.
if you specify 2000 ids inside the IN clause, it will fail.
this fails:
select ...
where id in (1,2,....2000)
but if you simply put the 2000 ids in another table (temp table for example), it will works
below query:
select ...
where id in (select userId
from temptable_with_2000_ids )
what you can do, actually could split the records into a lot of 1000 records and execute them group by group.
Here is some Perl code that tries to work around the limit by creating an inline view and then selecting from it. The statement text is compressed by using rows of twelve items each instead of selecting each item from DUAL individually, then uncompressed by unioning together all columns. UNION or UNION ALL in decompression should make no difference here as it all goes inside an IN which will impose uniqueness before joining against it anyway, but in the compression, UNION ALL is used to prevent a lot of unnecessary comparing. As the data I'm filtering on are all whole numbers, quoting is not an issue.
#
# generate the innards of an IN expression with more than a thousand items
#
use English '-no_match_vars';
sub big_IN_list{
#_ < 13 and return join ', ',#_;
my $padding_required = (12 - (#_ % 12)) % 12;
# get first dozen and make length of #_ an even multiple of 12
my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l) = splice #_,0,12, ( ('NULL') x $padding_required );
my #dozens;
local $LIST_SEPARATOR = ', '; # how to join elements within each dozen
while(#_){
push #dozens, "SELECT #{[ splice #_,0,12 ]} FROM DUAL"
};
$LIST_SEPARATOR = "\n union all\n "; # how to join #dozens
return <<"EXP";
WITH t AS (
select $a A, $b B, $c C, $d D, $e E, $f F, $g G, $h H, $i I, $j J, $k K, $l L FROM DUAL
union all
#dozens
)
select A from t union select B from t union select C from t union
select D from t union select E from t union select F from t union
select G from t union select H from t union select I from t union
select J from t union select K from t union select L from t
EXP
}
One would use that like so:
my $bases_list_expr = big_IN_list(list_your_bases());
$dbh->do(<<"UPDATE");
update bases_table set belong_to = 'us'
where id in ($bases_list_expr)
UPDATE
Instead of using IN clause, can you try using JOIN with the other table, which is fetching the id. that way we don't need to worry about limit. just a thought from my side.
Instead of SELECT * FROM table1 WHERE ID IN (1,2,3,4,...,1000);
Use this :
SELECT * FROM table1 WHERE ID IN (SELECT rownum AS ID FROM dual connect BY level <= 1000);
*Note that you need to be sure the ID does not refer any other foreign IDS if this is a dependency. To ensure only existing ids are available then :
SELECT * FROM table1 WHERE ID IN (SELECT distinct(ID) FROM tablewhereidsareavailable);
Cheers

Resources