Oracle: Merge equivalent of insert all? - oracle

I've tried to find an answer on several forums with no luck, so perhaps you can help me out.
I've got an INSERT ALL request that inserts thousands of rows at once.
INSERT ALL
INTO my_table (field_x, field_y, field_z) VALUES ('value_x1', 'value_y1', 'value_z1')
INTO my_table (field_x, field_y, field_z) VALUES ('value_x2', 'value_y2', 'value_z2')
...
INTO my_table (field_x, field_y, field_z) VALUES ('value_xn', 'value_yn', 'value_zn')
SELECT * FROM DUAL;
Now I'd like to amend it to update rows when some criteria are met. For each row, I could have something like:
MERGE INTO my_table m
USING (SELECT 'value_xi' x, 'value_yi' y, 'value_zi' z FROM DUAL) s
ON (m.field_x = s.x and m.field_y = s.y)
WHEN MATCHED THEN UPDATE SET
field_z = s.z,
WHEN NOT MATCHED THE INSERT (field_x, field_y, field_z)
VALUE(s.x, s.y, s.z);
Is there a way for me to do a kind of "MERGE ALL" that would allow to have all those merge requests in one?
Or maybe I'm missing the point and there's a better way to do this?
Thanks,
Edit: One possible solution is to use "UNION ALL" for a set of selects from dual, as follows:
MERGE INTO my_table m
USING (
select '' as x, '' as y, '' as z from dual
union all select 'value_x1', 'value_y1', 'value_z1' from dual
union all select 'value_x2', 'value_y2', 'value_z2' from dual
[...]
union all select 'value_xn', 'value_yn', 'value_zn' from dual
) s
ON (m.field_x = s.x and m.field_y = s.y)
WHEN MATCHED THEN UPDATE SET
field_z = s.z,
WHEN NOT MATCHED THEN INSERT (field_x, field_y, field_z)
VALUES (s.x, s.y, s.z);
NB: I've used a first empty row to be able generate all rows in the same format when I write the request. I also specify the columns names there.
Another solution would be to create a temporary table, INSERT ALL data into it, then merge with the target table and delete the temporary table.

If you're passing in tens of thousands of rows from your python script, I would do:
Create a global temporary table (GTT - this is a permanent table that holds data at session level)
Get your python script to insert the rows into the GTT
Use the GTT in the Merge statement, e.g.:
merge into your_main_table tgt
using your_gtt src
on (<join conditions>)
when matched then
update ...
when not matched then
insert ...;

Related

Insert issue while trying with not exist operator in oracle

Trying to insert values if particular column value not exist in table
I have tried with sub query in where statement
INSERT
INTO ANIMALDATA VALUES
(
( SELECT MAX(first)+1 FROM ANIMALDATA
)
,
'Animals',
'Lion',
10,
'',
'13-06-2019',
'STOP'
)
where not exists
(select NAMES from ANIMALDATA where NAMES='Lion');
If the lion not exist then do insert statement should run
Give me an idea what i am missing as i am a beginner to oracle queries. help me to proceed further. thanks in advance
Since you have a condition, I think you need to do an INSERT INTO...SELECT:
(UPDATE: the CREATE TABLE statement is there to provide simple test data. It is not part of the solution).
create table animaldata(first, kingdom, names, num, nl, dte, s) as
select 1, 'Animals', 'Tiger', 11, 'a', '13-06-2019', 'STOP' from dual;
INSERT
INTO ANIMALDATA select
( SELECT MAX(first)+1 FROM ANIMALDATA
)
,
'Animals',
'Lion',
10,
'',
'13-06-2019',
'STOP'
from dual
where not exists
(select NAMES from ANIMALDATA where NAMES='Lion');
Best regards,
Stew Ashton
please try below. Thanks,
INSERT
INTO ANIMALDATA a
select
( SELECT MAX(first)+1 FROM ANIMALDATA
)
,
'Animals',
'Lion',
10,
'',
'13-06-2019',
'STOP'
from dual
where not exists
(select 1 from ANIMALDATA b where b.NAMES='Lion' and a.NAMES = b.NAMES );
First off, don't use max(<value>) + 1 to come up with new values for a column - that does not play well with concurrent sessions.
Instead, you should create a sequence and use that in your inserts.
Next, if you are trying to do an upsert (update the row if it exists or insert if it doesn't), you could use a MERGE statement. In this case, you're trying to insert a row if it doesn't already exist, so you don't need the update part.
Therefore you should be doing something like:
CREATE SEQUENCE animaldata_seq
START WITH <find MAX VALUE OF animaldata.first>
INCREMENT BY 1
MAXVALUE 9999999999999999
CACHE 20
NOCYCLE;
MERGE INTO animaldata tgt
USING (SELECT 'Animals' category,
'Lion' animal,
10 num_animals,
NULL unknown_col,
TRUNC(SYSDATE) date_added,
'STOP' action
FROM dual) src
ON (tgt.animal = src.animal)
WHEN NOT MATCHED THEN
INSERT (<list of animaldata columns>)
VALUES (animaldata_seq.nextval,
src.animal,
src.unknown_col,
src.date_added,
src.action);
Note that I have tried to specify the columns being inserted into - that's good practice! Code that has insert statements that don't list the columns being inserted into are prone to errors should someone add a column to the table.
I have also assumed that the column you're adding the date into is of DATE datatypee; I have used sysdate (truncated to remove the time part) as the value to insert, but you may which to use a specific date, in which case you should use to_date(<string date>, '<string date format')

Delete duplicate rows from a BigQuery table

I have a table with >1M rows of data and 20+ columns.
Within my table (tableX) I have identified duplicate records (~80k) in one particular column (troubleColumn).
If possible I would like to retain the original table name and remove the duplicate records from my problematic column otherwise I could create a new table (tableXfinal) with the same schema but without the duplicates.
I am not proficient in SQL or any other programming language so please excuse my ignorance.
delete from Accidents.CleanedFilledCombined
where Fixed_Accident_Index
in(select Fixed_Accident_Index from Accidents.CleanedFilledCombined
group by Fixed_Accident_Index
having count(Fixed_Accident_Index) >1);
You can remove duplicates by running a query that rewrites your table (you can use the same table as the destination, or you can create a new table, verify that it has what you want, and then copy it over the old table).
A query that should work is here:
SELECT *
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
)
WHERE row_number = 1
UPDATE 2019: To de-duplicate rows on a single partition with a MERGE, see:
https://stackoverflow.com/a/57900778/132438
An alternative to Jordan's answer - this one scales better when having too many duplicates:
#standardSQL
SELECT event.* FROM (
SELECT ARRAY_AGG(
t ORDER BY t.created_at DESC LIMIT 1
)[OFFSET(0)] event
FROM `githubarchive.month.201706` t
# GROUP BY the id you are de-duplicating by
GROUP BY actor.id
)
Or a shorter version (takes any row, instead of the newest one):
SELECT k.*
FROM (
SELECT ARRAY_AGG(x LIMIT 1)[OFFSET(0)] k
FROM `fh-bigquery.reddit_comments.2017_01` x
GROUP BY id
)
To de-duplicate rows on an existing table:
CREATE OR REPLACE TABLE `deleting.deduplicating_table`
AS
# SELECT id FROM UNNEST([1,1,1,2,2]) id
SELECT k.*
FROM (
SELECT ARRAY_AGG(row LIMIT 1)[OFFSET(0)] k
FROM `deleting.deduplicating_table` row
GROUP BY id
)
Not sure why nobody mentioned DISTINCT query.
Here is the way to clean duplicate rows:
CREATE OR REPLACE TABLE project.dataset.table
AS
SELECT DISTINCT * FROM project.dataset.table
If your schema doesn’t have any records - below variation of Jordan’s answer will work well enough with writing over same table or new one, etc.
SELECT <list of original fields>
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Fixed_Accident_Index) AS pos,
FROM Accidents.CleanedFilledCombined
)
WHERE pos = 1
In more generic case - with complex schema with records/netsed fields, etc. - above approach can be a challenge.
I would propose to try using Tabledata: insertAll API with rows[].insertId set to respective Fixed_Accident_Index for each row.
In this case duplicate rows will be eliminated by BigQuery
Of course, this will involve some client side coding - so might be not relevant for this particular question.
I havent tried this approach by myself either but feel it might be interesting to try :o)
If you have a large-size partitioned table, and only have duplicates in a certain partition range. You don't want to overscan nor process the whole table. use the MERGE SQL below with predicates on partition range:
-- WARNING: back up the table before this operation
-- FOR large size timestamp partitioned table
-- -------------------------------------------
-- -- To de-duplicate rows of a given range of a partition table, using surrage_key as unique id
-- -------------------------------------------
DECLARE dt_start DEFAULT TIMESTAMP("2019-09-17T00:00:00", "America/Los_Angeles") ;
DECLARE dt_end DEFAULT TIMESTAMP("2019-09-22T00:00:00", "America/Los_Angeles");
MERGE INTO `gcp_project`.`data_set`.`the_table` AS INTERNAL_DEST
USING (
SELECT k.*
FROM (
SELECT ARRAY_AGG(original_data LIMIT 1)[OFFSET(0)] k
FROM `gcp_project`.`data_set`.`the_table` AS original_data
WHERE stamp BETWEEN dt_start AND dt_end
GROUP BY surrogate_key
)
) AS INTERNAL_SOURCE
ON FALSE
WHEN NOT MATCHED BY SOURCE
AND INTERNAL_DEST.stamp BETWEEN dt_start AND dt_end -- remove all data in partiion range
THEN DELETE
WHEN NOT MATCHED THEN INSERT ROW
credit: https://gist.github.com/hui-zheng/f7e972bcbe9cde0c6cb6318f7270b67a
Easier answer, without a subselect
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
WHERE TRUE
QUALIFY row_number = 1
The Where True is neccesary because qualify needs a where, group by or having clause
Felipe's answer is the best approach for most cases. Here is a more elegant way to accomplish the same:
CREATE OR REPLACE TABLE Accidents.CleanedFilledCombined
AS
SELECT
Fixed_Accident_Index,
ARRAY_AGG(x LIMIT 1)[SAFE_OFFSET(0)].* EXCEPT(Fixed_Accident_Index)
FROM Accidents.CleanedFilledCombined AS x
GROUP BY Fixed_Accident_Index;
To be safe, make sure you backup the original table before you run this ^^
I don't recommend to use ROW NUMBER() OVER() approach if possible since you may run into BigQuery memory limits and get unexpected errors.
Update BigQuery schema with new table column as bq_uuid making it NULLABLE and type STRING

Create duplicate rows by running same command 5 times for example
insert into beginner-290513.917834811114.messages (id, type, flow, updated_at) Values(19999,"hello", "inbound", '2021-06-08T12:09:03.693646')
Check if duplicate entries exist
select * from beginner-290513.917834811114.messages where id = 19999
Use generate uuid function to generate uuid corresponding to each message

UPDATE beginner-290513.917834811114.messages
SET bq_uuid = GENERATE_UUID()
where id>0
Clean duplicate entries
DELETE FROM beginner-290513.917834811114.messages
WHERE bq_uuid IN
(SELECT bq_uuid
FROM
(SELECT bq_uuid,
ROW_NUMBER() OVER( PARTITION BY updated_at
ORDER BY bq_uuid ) AS row_num
FROM beginner-290513.917834811114.messages ) t
WHERE t.row_num > 1 );

Fastest way of doing field comparisons in the same table with large amounts of data in oracle

I am recieving information from a csv file from one department to compare with the same inforation in a different department to check for discrepencies (About 3/4 of a million rows of data with 44 columns in each row). After I have the data in a table, I have a program that will take the data and send reports based on a HQ. I feel like the way I am going about this is not the most efficient. I am using oracle for this comparison.
Here is what I have:
I have a vb.net program that parses the data and inserts it into an extract table
I run a procedure to do a full outer join on the two tables into a new table with the fields in one department prefixed with '_c'
I run another procedure to compare the old/new data and update 2 different tables with detail and summary information. Here is code from inside the procedure:
DECLARE
CURSOR Cur_Comp IS SELECT * FROM T.AEC_CIS_COMP;
BEGIN
FOR compRow in Cur_Comp LOOP
--If service pipe exists in CIS but not in FM and the service pipe has status of retired in CIS, ignore the variance
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
--If there is not a summary record for this HQ in the table for this run, create one
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT compRow.HQ, to_date(sysdate, 'DD/MM/YYYY') from dual WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = compRow.HQ AND RUN_DATE = to_date(sysdate, 'DD/MM/YYYY'))
-- Check fields and update the tables accordingly
If (compRow.cis_loop <> compRow.cis_loop_c) Then
--Insert information into the details table
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl,
DateTime, Changed_Field, CIS_Value, FM_Value)
VALUES(compRow.Fac_ID, compRow.Pipe_Num, compRow.Hq, compRow.Street_Num || ' ' || compRow.Street_Name,
'Y', sysdate, 'Cis_Loop', compRow.cis_loop, compRow.cis_loop_c);
-- Update information into the summary table
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq = compRow.Hq
AND Run_Date = to_date(sysdate, 'DD/MM/YYYY')
End If;
END LOOP;
END;
Any suggestions of an easier way of doing this rather than an if statement for all 44 columns of the table? (This is run once a week if it matters)
Update: Just to clarify, there are 88 columns of data (44 of duplicates to compare with one suffixed with _c). One table lists each field in a row that is different so one row can mean 30+ records written in that table. The other table keeps tally of the number of discrepencies for each week.
First of all I believe that your task can be implemented (and should be actually) with staight SQL. No fancy cursors, no loops, just selects, inserts and updates. I would start with unpivotting your source data (it is not clear if you have primary key to join two sets, I guess you do):
Col0_PK Col1 Col2 Col3 Col4
----------------------------------------
Row1_val A B C D
Row2_val E F G H
Above is your source data. Using UNPIVOT clause we convert it to:
Col0_PK Col_Name Col_Value
------------------------------
Row1_val Col1 A
Row1_val Col2 B
Row1_val Col3 C
Row1_val Col4 D
Row2_val Col1 E
Row2_val Col2 F
Row2_val Col3 G
Row2_val Col4 H
I think you get the idea. Say we have table1 with one set of data and the same structured table2 with the second set of data. It is good idea to use index-organized tables.
Next step is comparing rows to each other and storing difference details. Something like:
insert into diff_details(some_service_info_columns_here)
select some_service_info_columns_here_along_with_data_difference
from table1 t1 inner join table2 t2
on t1.Col0_PK = t2.Col0_PK
and t1.Col_name = t2.Col_name
and nvl(t1.Col_value, 'Dummy1') <> nvl(t2.Col_value, 'Dummy2');
And on the last step we update difference summary table:
insert into diff_summary(summary_columns_here)
select diff_row_id, count(*) as diff_count
from diff_details
group by diff_row_id;
It's just rough draft to show my approach, I'm sure there is much more details should be taken into account. To summarize I suggest two things:
UNPIVOT data
Use SQL statements instead of cursors
You have several issues in your code:
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
"cis_status_c" is not declared. Is it a variable or a column in AEC_CIS_COMP?
In case it is a column, just put the condition into the cursor, i.e. SELECT * FROM T.AEC_CIS_COMP WHERE not (compRow.pipe_num = '' AND cis_status_c = 'R')
to_date(sysdate, 'DD/MM/YYYY')
That's nonsense, you convert a date into a date, simply use TRUNC(SYSDATE)
Anyway, I think you can use three single statements instead of a cursor:
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT comp.HQ, trunc(sysdate)
from AEC_CIS_COMP comp
WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = comp.HQ AND RUN_DATE = trunc(sysdate));
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl, DateTime, Changed_Field, CIS_Value, FM_Value)
select comp.Fac_ID, comp.Pipe_Num, comp.Hq, comp.Street_Num || ' ' || comp.Street_Name, 'Y', sysdate, 'Cis_Loop', comp.cis_loop, comp.cis_loop_c
from T.AEC_CIS_COMP comp
where comp.cis_loop <> comp.cis_loop_c;
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq IN (Select Hq from T.AEC_CIS_COMP)
AND trunc(Run_Date) = trunc(sysdate);
They are not tested but they should give you a hint how to do it.

pl-sql include column names in query

A weird request maybe but. My boss wants me to create an admin version of a page we have that displays data from an oracle query in a table.
The admin page, instead of displaying the data (query returns 1 row), needs to return the table name and column name
Ex: Instead of:
Name Initial
==================
Bob A
I want:
Name Initial
============================
Users.FirstName Users.MiddleInitial
I realize I can do this in code but would rather just modify the query to return the data I want so I can leave the report generation code mostly alone.
I don't want to do it in a stored procedure.
So when I spit out the data in the report using something like:
blah blah = MyDataRow("FirstName")
I can leave that as is but instead of it displaying "BOB" it would display "Users.FirstName"
And I want to do the query using select * if possible instead of listing all the columns
So for each of the columns I am querying in the * , I want to get (instead of the column value) the tablename.ColumnName or tablename|columnName
hope you are following- I am confusing myself...
pseudo:
select tablename + '.' + Columnname as WhateverTheColumnNameIs
from Table1
left join Table2 on whatever...
Join Table_Names on blah blah
Whew- after writing all this I think I will just do it on the code side.
But if you are up for it maybe a fun challenge
Oracle does not provide an authentic way(there is no pseudocolumn) to get the column name of a table as a result of a query against that table. But you might consider these two approaches:
Extract column name from an xmltype, formed by passing cursor expression(your query) in the xmltable() function:
-- your table
with t1(first_name, middle_name) as(
select 1,2 from dual
), -- your query
t2 as(
select * -- col1 as "t1.col1"
--, col2 as "t1.col2"
--, col3 as "t1.col3"
from hr.t1
)
select *
from ( select q.object_value.getrootelement() as col_name
, rownum as rn
from xmltable('//*'
passing xmltype(cursor(select * from t2 where rownum = 1))
) q
where q.object_value.getrootelement() not in ('ROWSET', 'ROW')
)
pivot(
max(col_name) for rn in (1 as "name", 2 as "initial")
)
Result:
name initial
--------------- ---------------
FIRST_NAME MIDDLE_NAME
Note: In order for column names to be prefixed with table name, you need to list them
explicitly in the select list of a query and supply an alias, manually.
PL/SQL approach. Starting from Oracle 11g you could use dbms_sql() package and describe_columns() procedure specifically to get the name of columns in the cursor(your select).
This might be what you are looking for, try selecting from system views USER_TAB_COLS or ALL_TAB_COLS.

How to put more than 1000 values into an Oracle IN clause [duplicate]

This question already has answers here:
SQL IN Clause 1000 item limit
(5 answers)
Closed 8 years ago.
Is there any way to get around the Oracle 10g limitation of 1000 items in a static IN clause? I have a comma delimited list of many of IDs that I want to use in an IN clause, Sometimes this list can exceed 1000 items, at which point Oracle throws an error. The query is similar to this...
select * from table1 where ID in (1,2,3,4,...,1001,1002,...)
Put the values in a temporary table and then do a select where id in (select id from temptable)
select column_X, ... from my_table
where ('magic', column_X ) in (
('magic', 1),
('magic', 2),
('magic', 3),
('magic', 4),
...
('magic', 99999)
) ...
I am almost sure you can split values across multiple INs using OR:
select * from table1 where ID in (1,2,3,4,...,1000) or
ID in (1001,1002,...,2000)
You may try to use the following form:
select * from table1 where ID in (1,2,3,4,...,1000)
union all
select * from table1 where ID in (1001,1002,...)
Where do you get the list of ids from in the first place? Since they are IDs in your database, did they come from some previous query?
When I have seen this in the past it has been because:-
a reference table is missing and the correct way would be to add the new table, put an attribute on that table and join to it
a list of ids is extracted from the database, and then used in a subsequent SQL statement (perhaps later or on another server or whatever). In this case, the answer is to never extract it from the database. Either store in a temporary table or just write one query.
I think there may be better ways to rework this code that just getting this SQL statement to work. If you provide more details you might get some ideas.
Use ...from table(... :
create or replace type numbertype
as object
(nr number(20,10) )
/
create or replace type number_table
as table of numbertype
/
create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
open p_ref_result for
select *
from employees , (select /*+ cardinality(tab 10) */ tab.nr from table(p_numbers) tab) tbnrs
where id = tbnrs.nr;
end;
/
This is one of the rare cases where you need a hint, else Oracle will not use the index on column id. One of the advantages of this approach is that Oracle doesn't need to hard parse the query again and again. Using a temporary table is most of the times slower.
edit 1 simplified the procedure (thanks to jimmyorr) + example
create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
open p_ref_result for
select /*+ cardinality(tab 10) */ emp.*
from employees emp
, table(p_numbers) tab
where tab.nr = id;
end;
/
Example:
set serveroutput on
create table employees ( id number(10),name varchar2(100));
insert into employees values (3,'Raymond');
insert into employees values (4,'Hans');
commit;
declare
l_number number_table := number_table();
l_sys_refcursor sys_refcursor;
l_employee employees%rowtype;
begin
l_number.extend;
l_number(1) := numbertype(3);
l_number.extend;
l_number(2) := numbertype(4);
tableselect(l_number, l_sys_refcursor);
loop
fetch l_sys_refcursor into l_employee;
exit when l_sys_refcursor%notfound;
dbms_output.put_line(l_employee.name);
end loop;
close l_sys_refcursor;
end;
/
This will output:
Raymond
Hans
I wound up here looking for a solution as well.
Depending on the high-end number of items you need to query against, and assuming your items are unique, you could split your query into batches queries of 1000 items, and combine the results on your end instead (pseudocode here):
//remove dupes
items = items.RemoveDuplicates();
//how to break the items into 1000 item batches
batches = new batch list;
batch = new batch;
for (int i = 0; i < items.Count; i++)
{
if (batch.Count == 1000)
{
batches.Add(batch);
batch.Clear()
}
batch.Add(items[i]);
if (i == items.Count - 1)
{
//add the final batch (it has < 1000 items).
batches.Add(batch);
}
}
// now go query the db for each batch
results = new results;
foreach(batch in batches)
{
results.Add(query(batch));
}
This may be a good trade-off in the scenario where you don't typically have over 1000 items - as having over 1000 items would be your "high end" edge-case scenario. For example, in the event that you have 1500 items, two queries of (1000, 500) wouldn't be so bad. This also assumes that each query isn't particularly expensive in of its own right.
This wouldn't be appropriate if your typical number of expected items got to be much larger - say, in the 100000 range - requiring 100 queries. If so, then you should probably look more seriously into using the global temporary tables solution provided above as the most "correct" solution. Furthermore, if your items are not unique, you would need to resolve duplicate results in your batches as well.
Yes, very weird situation for oracle.
if you specify 2000 ids inside the IN clause, it will fail.
this fails:
select ...
where id in (1,2,....2000)
but if you simply put the 2000 ids in another table (temp table for example), it will works
below query:
select ...
where id in (select userId
from temptable_with_2000_ids )
what you can do, actually could split the records into a lot of 1000 records and execute them group by group.
Here is some Perl code that tries to work around the limit by creating an inline view and then selecting from it. The statement text is compressed by using rows of twelve items each instead of selecting each item from DUAL individually, then uncompressed by unioning together all columns. UNION or UNION ALL in decompression should make no difference here as it all goes inside an IN which will impose uniqueness before joining against it anyway, but in the compression, UNION ALL is used to prevent a lot of unnecessary comparing. As the data I'm filtering on are all whole numbers, quoting is not an issue.
#
# generate the innards of an IN expression with more than a thousand items
#
use English '-no_match_vars';
sub big_IN_list{
#_ < 13 and return join ', ',#_;
my $padding_required = (12 - (#_ % 12)) % 12;
# get first dozen and make length of #_ an even multiple of 12
my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l) = splice #_,0,12, ( ('NULL') x $padding_required );
my #dozens;
local $LIST_SEPARATOR = ', '; # how to join elements within each dozen
while(#_){
push #dozens, "SELECT #{[ splice #_,0,12 ]} FROM DUAL"
};
$LIST_SEPARATOR = "\n union all\n "; # how to join #dozens
return <<"EXP";
WITH t AS (
select $a A, $b B, $c C, $d D, $e E, $f F, $g G, $h H, $i I, $j J, $k K, $l L FROM DUAL
union all
#dozens
)
select A from t union select B from t union select C from t union
select D from t union select E from t union select F from t union
select G from t union select H from t union select I from t union
select J from t union select K from t union select L from t
EXP
}
One would use that like so:
my $bases_list_expr = big_IN_list(list_your_bases());
$dbh->do(<<"UPDATE");
update bases_table set belong_to = 'us'
where id in ($bases_list_expr)
UPDATE
Instead of using IN clause, can you try using JOIN with the other table, which is fetching the id. that way we don't need to worry about limit. just a thought from my side.
Instead of SELECT * FROM table1 WHERE ID IN (1,2,3,4,...,1000);
Use this :
SELECT * FROM table1 WHERE ID IN (SELECT rownum AS ID FROM dual connect BY level <= 1000);
*Note that you need to be sure the ID does not refer any other foreign IDS if this is a dependency. To ensure only existing ids are available then :
SELECT * FROM table1 WHERE ID IN (SELECT distinct(ID) FROM tablewhereidsareavailable);
Cheers

Resources