Oracle maximum number of expression issue - oracle

I have this query
@Query("SELECT b from BillInfoDetails b where b.masterAcctCode in :masterAccountList and b.msisdn in :msisdnList")
List<BillInfoDetails> findAllByMsisdnAndMasterAcctList(@Param("masterAccountList") List<String> masterAccountList, @Param("msisdnList") List<String> msisdnList);
and I get this error: ORA-01795: maximum number of expressions in a list is 1000.
The usual workaround is to split the list manually into 1 to 999, then 1000 to 1999, and so on. That does not work well for me, because msisdnList could contain 1,500 or 18,000 values, or more. I want a dynamic solution that works properly whatever the number of values is.

One option is to store all those values into a table; then you'd be able to use it as a join (or a subquery). For example:
select b.*
from billinfodetails b join new_table n on n.masteracctcode = b.masteracctcode
or
select b.*
from billinfodetails b
where b.masteracctcode in (select n.masteracctcode
from new_table n)
or
select b.*
from billinfodetails b
where exists (select null
from new_table n
where n.masteracctcode = b.masteracctcode)
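The table-based approach can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle, purely so the example is runnable; in Oracle a global temporary table would be the likely choice. The table name new_table comes from the queries above, while the helper name find_bills, the sample rows and the SQLite-specific INSERT OR IGNORE are assumptions made for illustration.

```python
import sqlite3

def find_bills(conn, master_account_list):
    """Load the filter values into a temporary table, then join against it."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS new_table (masteracctcode TEXT PRIMARY KEY)"
    )
    cur.execute("DELETE FROM new_table")
    # One bind variable per inserted row, so the list size no longer matters.
    # INSERT OR IGNORE is SQLite syntax (assumption); it skips duplicate codes.
    cur.executemany(
        "INSERT OR IGNORE INTO new_table (masteracctcode) VALUES (?)",
        [(code,) for code in master_account_list],
    )
    cur.execute(
        "SELECT b.masteracctcode, b.msisdn "
        "FROM billinfodetails b "
        "JOIN new_table n ON n.masteracctcode = b.masteracctcode "
        "ORDER BY b.masteracctcode"
    )
    return cur.fetchall()

# Hypothetical sample data: three bills and a filter list of 18000 codes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billinfodetails (masteracctcode TEXT, msisdn TEXT)")
conn.executemany(
    "INSERT INTO billinfodetails VALUES (?, ?)",
    [("ACC00005", "8801"), ("ACC17999", "8802"), ("ACC99999", "8803")],
)
codes = [f"ACC{i:05d}" for i in range(18000)]
rows = find_bills(conn, codes)
print(rows)
```

The same function works unchanged for 1,500 or 18,000 values, because the values travel as rows rather than as items of an IN list.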

Related

Clickhouse - Latest Record

We have almost 1B records in a replicated merge tree table.
The primary key is a,b,c
Our App keeps writing into this table with every user action. (we accumulate almost a million records per hour)
We append (store) the latest timestamp (updated_at) for a given unique combination of (a,b)
The key requirement is to provide a roll-up against the latest timestamp for a given combination of a,b,c
Currently, we are processing the queries as
select a,b,c, sum(x), sum(y)...etc
from table_1
where (a,b,updated_at) in (select a,b,max(updated_at) from table_1 group by a,b)
and c in (...)
group by a,b,c
clarification on the sub-query
(select a,b,max(updated_at) from table_1 group by a,b)
^ This part is for illustration only. Our app writes the latest updated_at for every (a,b), so the clause shown above is really more like
(select a,b,updated_at from tab_1_summary)
[where tab_1_summary has latest record for a given a,b]
Note: We have to keep the grouping criteria as-is.
The table is structured with partition (c) order by (a, b, updated_at)
The question is: is there a way to write a better query, one that returns results faster? (We need to shave a few seconds off the overall processing time.)
FYI: We experimented with a Materialized View on ReplicatedReplacingMergeTree. But given the size of this table and the constant inserts, the FINAL clause does not perform well compared to the query above.
Thanks in advance!
As a test, try using a join instead of the tuple in (tuples) filter:
select t.a, t.b, t.c, sum(x), sum(y)...etc
from table_1 AS t inner join tab_1_summary using (a, b, updated_at)
where c in (...)
group by t.a, t.b, t.c
Consider using AggregatingMergeTree to pre-calculate result metrics:
CREATE MATERIALIZED VIEW table_1_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(updated_at)
ORDER BY (updated_at, a, b, c)
AS SELECT
updated_at,
a,b,c,
sum(x) AS x, /* see [SimpleAggregateFunction data type](https://clickhouse.tech/docs/en/sql-reference/data-types/simpleaggregatefunction/) */
sum(y) AS y
/* For non-simple functions, use the [AggregateFunction data type](https://clickhouse.tech/docs/en/sql-reference/data-types/aggregatefunction/). */
/* add further aggregate columns here */
FROM table_1
GROUP BY updated_at, a, b, c;
Then query it like this to get the result:
select a,b,c, sum(x), sum(y)...etc
from table_1_mv
where (updated_at,a,b) in (select updated_at,a,b from tab_1_summary)
and c in (...)
group by a,b,c

11g Oracle aggregate SQL query

Can you please help me write a query for this scenario? In the case below it should return a single row, A=13: both 13 and 14 have the most occurrences in column A, and the value of B (30) is greater for 13. We are interested in the A with the maximum number of occurrences, and in case of a tie B should be used as the tie breaker.
A B
13 30
13 12
14 10
14 25
15 5
In the case below, where every A occurs only once (all tied), it should return 14, which has the maximum B value of 40.
A B
13 30
14 40
15 5
Use case: we get calls from corporate customers. We want to know during which hours of the day most calls come in and, in case of a tie, which of the busiest hours has the longest call.
Further question
I have a further question on this. I want to use either of the two solutions, '11g or lower' from @GurV or 'dense_rank' from @mathguy, in the bigger query below. How can I do it?
SELECT dv.id , u.email , dv.email_subject AS headline , dv.start_date , dv.closing_date, b.name AS business_name, ls.call_cost, dv.currency,
SUM(lsc.duration) AS duration, COUNT(lsc.id) AS call_count, ROUND(AVG(lsc.duration), 2) AS avg_duration
-- max(extract(HOUR from started )) keep (dense_rank last order by count(duration), max(duration)) as most_popular_hour
FROM deal_voucher dv
JOIN lead_source ls ON dv.id = ls.deal_id
JOIN lead_source_call lsc ON ls.PHONE_SID = lsc.phone_number_id
JOIN business b ON dv.business_id = b.id
JOIN users u ON b.id = u.business_id
AND TRUNC(dv.closing_date) = to_date('13-01-2017', 'dd-mm-yyyy')
AND lsc.status = 'completed' and lsc.duration >= 30
GROUP BY dv.id , u.email , dv.email_subject , dv.start_date , dv.closing_date, b.name, ls.call_cost, dv.currency
--, extract(HOUR from started )
Try this on 12c or later:
select a
from t
group by a
order by count(*) desc, max(b) desc
fetch first 1 row only;
If 11g or lower:
select * from (
select a
from t
group by a
order by count(*) desc, max(b) desc
) where rownum = 1;
Note that if there is equal count and equal max value for two or more values of A, then any one of them will be fetched.
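To check the ordering logic against the two sample tables from the question, here is a minimal sketch that runs the same GROUP BY / ORDER BY through Python's built-in sqlite3 module. SQLite is only a stand-in for Oracle here, so LIMIT 1 replaces FETCH FIRST / ROWNUM; the helper name most_frequent_a is hypothetical.

```python
import sqlite3

# Same grouping and ordering as the Oracle queries above; LIMIT 1 is the
# SQLite spelling of "fetch first 1 row only" (stand-in, not Oracle syntax).
QUERY = """
    SELECT a
    FROM t
    GROUP BY a
    ORDER BY COUNT(*) DESC, MAX(b) DESC
    LIMIT 1
"""

def most_frequent_a(rows):
    """Return the A with the most occurrences, using max(B) as tie breaker."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchone()[0]
    conn.close()
    return result

first_case = [(13, 30), (13, 12), (14, 10), (14, 25), (15, 5)]
second_case = [(13, 30), (14, 40), (15, 5)]
print(most_frequent_a(first_case))   # 13: tied with 14 on count, wins on max(b)
print(most_frequent_a(second_case))  # 14: all counts tied, largest b is 40
```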
Here is a query that will work in older versions (no fetch clause) and does not require a subquery. It uses the first/last function. In case of ties by both "count by A" and "value of max(B)" it selects only the row with the largest value of A. You can change that to min(A), or even to sum(A) (although that probably doesn't make sense in your problem) or LISTAGG(A, ',') WITHIN GROUP (ORDER BY A) to get a comma-delimited list of the A's that are tied for first place, but that requires 11.2 (I believe).
select max(a) keep (dense_rank last order by count(b), max(b)) as a
, max(max(b)) keep (dense_rank last order by count(b)) as b
from inputs
group by a
;

select statement from a table ONLY if some of the fields were updated ORACLE

Can anyone explain how I can write a select statement that fetches data from a table, but only if particular fields were updated? Let's say I have:
select a, b, c, d , e, f
from table1 t1
inner join table2 t2
on t1.a = t2.a
I'm interested in whether columns d, e, f were updated since, let's say, yesterday. If they were, I want to include the row in my select statement; if d, e, f were not updated since yesterday, the row should be ignored. In table1 I have a date field for when the data was inserted (date_created) and a date field for when it was updated (date_modified). The tricky bit is that data in table1 might be updated by users during the day, but not necessarily fields d, e, f; say a user simply updated columns a, b, c. The date_modified column will still show that the row has been updated, so I cannot rely purely on date_modified. Is there any other way to filter the data and get the correct rows in return? Triggers and stored procedures are not an option; ideally pure SQL. Any help?
It's unclear which columns belong to which table but one solution is to use a flashback query (provided you have sufficient undo retention to accommodate the 24 hour difference between queries).
An example of finding the differences on a table where columns d, e or f have changed from their value 24 hours ago is:
SELECT t.*
FROM table_name t
INNER JOIN
(
SELECT *
FROM table_name
AS OF TIMESTAMP SYSTIMESTAMP - INTERVAL '1' DAY
) p
ON ( t.a = p.a
AND ( t.d <> p.d OR t.e <> p.e OR t.f <> p.f ) );
Solved! Solution: add an extra column (let's say Total) to the target table holding the sum of columns d, e, f, and populate it once initially. After that, if columns d, e, f are changed during the day, their sum will differ from the Total column, and you can simply filter on that in the where clause.
Maybe it is not the most elegant solution, but it does the job.
Thanks for your ideas!
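One caveat with comparing against a stored sum: changes that offset each other leave the sum unchanged, so such a row would be missed. The short sketch below illustrates the effect with made-up values; comparing the stored values themselves is shown only as one possible alternative and is an assumption, not something proposed in the thread.

```python
def changed_by_sum(old, new):
    """Detect a change by comparing the sum of d, e, f (the Total column idea)."""
    return sum(old) != sum(new)

def changed_by_value(old, new):
    """Detect a change by comparing d, e, f individually (hypothetical alternative)."""
    return tuple(old) != tuple(new)

# Hypothetical (d, e, f) values: d goes up by 5 while e goes down by 5.
old_row = (10, 20, 30)
new_row = (15, 15, 30)

print(changed_by_sum(old_row, new_row))    # False: both sums are 60
print(changed_by_value(old_row, new_row))  # True: d and e differ
```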

How to get records randomly from the oracle database?

I need to select rows randomly from an Oracle DB.
Ex: Assume a table with 100 rows. How can I randomly return 20 of those 100 rows?
SELECT *
FROM (
SELECT *
FROM table
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum < 21;
SAMPLE() is not guaranteed to give you exactly 20 rows, but might be suitable (and may perform significantly better than a full query + sort-by-random for large tables):
SELECT *
FROM table SAMPLE(20);
Note: the 20 here is an approximate percentage, not the number of rows desired. In this case, since you have 100 rows, to get approximately 20 rows you ask for a 20% sample.
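The difference between "exactly 20 rows" and "a 20% sample" can be illustrated outside the database. This is a sketch under the assumption that a percentage-based row sample behaves like an independent coin flip per row (the sample percentage being a per-row probability); the function names are hypothetical and nothing here calls Oracle.

```python
import random

def percentage_sample(rows, percent, rng):
    """Keep each row independently with probability percent/100."""
    return [row for row in rows if rng.random() < percent / 100]

def exact_random_rows(rows, count, rng):
    """Counterpart of ORDER BY a random value and keeping the first `count` rows."""
    return rng.sample(rows, count)

rng = random.Random(42)          # fixed seed so runs are repeatable
table = list(range(1, 101))      # stand-in for a 100-row table

sample_sizes = [len(percentage_sample(table, 20, rng)) for _ in range(5)]
exact_sizes = [len(exact_random_rows(table, 20, rng)) for _ in range(5)]
print(sample_sizes)  # sizes scatter around 20 and are rarely all equal
print(exact_sizes)   # always 20
```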
SELECT * FROM table SAMPLE(10) WHERE ROWNUM <= 20;
This is more efficient as it doesn't need to sort the table. Note that the sample percentage must be large enough to return at least 20 rows.
SELECT column FROM
( SELECT column, dbms_random.value FROM table ORDER BY 2 )
where rownum <= 20;
In summary, two ways were introduced:
1) using an order by DBMS_RANDOM.VALUE clause
2) using the sample([%]) clause
The first way has the advantage of correctness: you will never fail to get a result if one actually exists, whereas with the second way you may get no result even though rows satisfy the query condition, because information is discarded during sampling.
The second way has the advantage of efficiency: you get the result faster and put a lighter load on your database.
I was warned by a DBA that my query using the first way puts load on the database.
You can choose either way according to your needs!
For huge tables, the standard approach of sorting by dbms_random.value is not efficient, because you need to scan the whole table, and dbms_random.value is a fairly slow function that requires context switches. For such cases, there are 3 additional methods:
1: Use sample clause:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
for example:
select *
from s1 sample block(1)
order by dbms_random.value
fetch first 1 rows only
i.e., get 1% of all blocks, then sort the sampled rows randomly and return just 1 row.
2: If you have an index/primary key on a column whose values are evenly distributed, you can get the min and max values, generate a random value in this range, and take the first row with a value greater than or equal to that randomly generated value.
Example:
--big table with 1 mln rows with primary key on ID with evenly distributed values:
Create table s1(id primary key,padding) as
select level, rpad('x',100,'x')
from dual
connect by level<=1e6;
select *
from s1
where id>=(select
dbms_random.value(
(select min(id) from s1),
(select max(id) from s1)
)
from dual)
order by id
fetch first 1 rows only;
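The logic of method 2 can be sketched outside the database to see how the spacing of the key values affects the result. The sketch below mimics "first id >= random value between min(id) and max(id)" with a binary search over a sorted list; the function name and both id lists are hypothetical. With gaps in the key, the row that follows a gap is chosen far more often than its neighbours.

```python
import bisect
import random
from collections import Counter

def pick_by_random_key(sorted_ids, rng):
    """Draw a random value between min(id) and max(id), return the first id >= it."""
    target = rng.uniform(sorted_ids[0], sorted_ids[-1])
    return sorted_ids[bisect.bisect_left(sorted_ids, target)]

rng = random.Random(0)                 # fixed seed so runs are repeatable

dense_ids = list(range(1, 11))         # 1..10, no gaps
gappy_ids = [1, 2, 3, 1000]            # large gap before 1000

dense_counts = Counter(pick_by_random_key(dense_ids, rng) for _ in range(10_000))
gappy_counts = Counter(pick_by_random_key(gappy_ids, rng) for _ in range(10_000))
print(sorted(dense_counts.items()))    # ids 2..10 are picked about equally often
print(sorted(gappy_counts.items()))    # id 1000 dominates because of the gap
```

In this sketch the smallest id is chosen only when the random value equals the minimum exactly, so it is practically never returned.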
3: get random table block, generate rowid and get row from the table by this rowid:
select *
from s1
where rowid = (
select
DBMS_ROWID.ROWID_CREATE (
1,
objd,
file#,
block#,
1)
from
(
select/*+ rule */ file#,block#,objd
from v$bh b
where b.objd in (select o.data_object_id from user_objects o where object_name='S1' /* table_name */)
order by dbms_random.value
fetch first 1 rows only
)
);
To randomly select 20 rows I think you'd be better off selecting the lot of them randomly ordered and selecting the first 20 of that set.
Something like:
Select *
from (select *
from table
order by dbms_random.value) -- you can also use DBMS_RANDOM.RANDOM
where rownum < 21;
Best used for small tables to avoid selecting large chunks of data only to discard most of it.
Here's how to pick a random sample out of each group:
SELECT GROUPING_COLUMN,
MIN (COLUMN_NAME) KEEP (DENSE_RANK FIRST ORDER BY DBMS_RANDOM.VALUE)
AS RANDOM_SAMPLE
FROM TABLE_NAME
GROUP BY GROUPING_COLUMN
ORDER BY GROUPING_COLUMN;
I'm not sure how efficient it is, but if you have a lot of categories and sub-categories, this seems to do the job nicely.
Q. How do you select a random 50% of the records from a table?
When you want a random sample by percentage:
SELECT *
FROM (
SELECT *
FROM table_name
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum <= (select count(*) from table_name) * 50/100;

How to put more than 1000 values into an Oracle IN clause [duplicate]

Is there any way to get around the Oracle 10g limitation of 1000 items in a static IN clause? I have a comma-delimited list of many IDs that I want to use in an IN clause. Sometimes this list can exceed 1000 items, at which point Oracle throws an error. The query is similar to this...
select * from table1 where ID in (1,2,3,4,...,1001,1002,...)
Put the values in a temporary table and then do a select where id in (select id from temptable)
Another workaround relies on the 1000-item limit applying to lists of single expressions, while a list of multi-column tuples is not subject to it:
select column_X, ... from my_table
where ('magic', column_X ) in (
('magic', 1),
('magic', 2),
('magic', 3),
('magic', 4),
...
('magic', 99999)
) ...
I am almost sure you can split values across multiple INs using OR:
select * from table1 where ID in (1,2,3,4,...,1000) or
ID in (1001,1002,...,2000)
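Building the OR-ed groups by hand gets tedious, so here is a minimal sketch of a helper that generates them for any number of values. The function name build_in_clause and the "?" placeholder style are assumptions made for illustration (Oracle drivers commonly use numbered or named binds instead), and the column name is interpolated into the SQL, so it must come from trusted code.

```python
def build_in_clause(column, values, max_items=1000):
    """Return (sql_fragment, params): IN lists of at most max_items, joined by OR."""
    if not values:
        # An empty IN list is not valid SQL; use a condition that is never true.
        return "1 = 0", []
    groups = []
    for start in range(0, len(values), max_items):
        size = len(values[start:start + max_items])
        placeholders = ", ".join("?" for _ in range(size))
        groups.append(f"{column} IN ({placeholders})")
    # Parentheses keep the OR groups together when combined with other conditions.
    return "(" + " OR ".join(groups) + ")", list(values)

clause, params = build_in_clause("ID", list(range(1, 2502)))
print(clause.count(" IN ("), len(params))

small_clause, small_params = build_in_clause("ID", [1, 2, 3], max_items=2)
print(small_clause)
```

The fragment is meant to be appended after WHERE, with params passed as the bind values in the same order.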
You may try to use the following form:
select * from table1 where ID in (1,2,3,4,...,1000)
union all
select * from table1 where ID in (1001,1002,...)
Where do you get the list of ids from in the first place? Since they are IDs in your database, did they come from some previous query?
When I have seen this in the past it has been because:
1) a reference table is missing, and the correct way would be to add the new table, put an attribute on that table and join to it
2) a list of ids is extracted from the database and then used in a subsequent SQL statement (perhaps later, or on another server). In this case, the answer is to never extract it from the database: either store it in a temporary table or just write one query.
I think there may be better ways to rework this code than just getting this SQL statement to work. If you provide more details you might get some ideas.
Use ...from table(... :
create or replace type numbertype
as object
(nr number(20,10) )
/
create or replace type number_table
as table of numbertype
/
create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
open p_ref_result for
select *
from employees , (select /*+ cardinality(tab 10) */ tab.nr from table(p_numbers) tab) tbnrs
where id = tbnrs.nr;
end;
/
This is one of the rare cases where you need a hint; otherwise Oracle will not use the index on column id. One of the advantages of this approach is that Oracle doesn't need to hard parse the query again and again. Using a temporary table is slower most of the time.
edit 1 simplified the procedure (thanks to jimmyorr) + example
create or replace procedure tableselect
( p_numbers in number_table
, p_ref_result out sys_refcursor)
is
begin
open p_ref_result for
select /*+ cardinality(tab 10) */ emp.*
from employees emp
, table(p_numbers) tab
where tab.nr = id;
end;
/
Example:
set serveroutput on
create table employees ( id number(10),name varchar2(100));
insert into employees values (3,'Raymond');
insert into employees values (4,'Hans');
commit;
declare
l_number number_table := number_table();
l_sys_refcursor sys_refcursor;
l_employee employees%rowtype;
begin
l_number.extend;
l_number(1) := numbertype(3);
l_number.extend;
l_number(2) := numbertype(4);
tableselect(l_number, l_sys_refcursor);
loop
fetch l_sys_refcursor into l_employee;
exit when l_sys_refcursor%notfound;
dbms_output.put_line(l_employee.name);
end loop;
close l_sys_refcursor;
end;
/
This will output:
Raymond
Hans
I wound up here looking for a solution as well.
Depending on the high-end number of items you need to query against, and assuming your items are unique, you could split your query into batched queries of 1000 items, and combine the results on your end instead (pseudocode here):
//remove dupes
items = items.RemoveDuplicates();
//how to break the items into 1000 item batches
batches = new batch list;
batch = new batch;
for (int i = 0; i < items.Count; i++)
{
    if (batch.Count == 1000)
    {
        batches.Add(batch);
        //start a fresh batch; clearing the old one would also empty the batch just added
        batch = new batch;
    }
    batch.Add(items[i]);
    if (i == items.Count - 1)
    {
        //add the final batch (it has at most 1000 items).
        batches.Add(batch);
    }
}
// now go query the db for each batch
results = new results;
foreach(batch in batches)
{
    results.Add(query(batch));
}
This may be a good trade-off in the scenario where you don't typically have over 1000 items, as having over 1000 items would be your "high end" edge-case scenario. For example, in the event that you have 1500 items, two queries of (1000, 500) wouldn't be so bad. This also assumes that each query isn't particularly expensive in its own right.
This wouldn't be appropriate if your typical number of expected items got to be much larger - say, in the 100000 range - requiring 100 queries. If so, then you should probably look more seriously into using the global temporary tables solution provided above as the most "correct" solution. Furthermore, if your items are not unique, you would need to resolve duplicate results in your batches as well.
Yes, this is a very odd situation in Oracle.
If you specify 2000 ids inside the IN clause, it will fail.
This fails:
select ...
where id in (1,2,....2000)
But if you simply put the 2000 ids in another table (a temp table, for example), it will work.
The query below works:
select ...
where id in (select userId
from temptable_with_2000_ids )
Alternatively, you can split the ids into groups of at most 1000 and execute the query group by group.
Here is some Perl code that tries to work around the limit by creating an inline view and then selecting from it. The statement text is compressed by using rows of twelve items each instead of selecting each item from DUAL individually, then uncompressed by unioning together all columns. UNION or UNION ALL in decompression should make no difference here as it all goes inside an IN which will impose uniqueness before joining against it anyway, but in the compression, UNION ALL is used to prevent a lot of unnecessary comparing. As the data I'm filtering on are all whole numbers, quoting is not an issue.
#
# generate the innards of an IN expression with more than a thousand items
#
use English '-no_match_vars';
sub big_IN_list{
@_ < 13 and return join ', ',@_;
my $padding_required = (12 - (@_ % 12)) % 12;
# get first dozen and make length of @_ an even multiple of 12
my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l) = splice @_,0,12, ( ('NULL') x $padding_required );
my @dozens;
local $LIST_SEPARATOR = ', '; # how to join elements within each dozen
while(@_){
push @dozens, "SELECT @{[ splice @_,0,12 ]} FROM DUAL"
};
$LIST_SEPARATOR = "\n union all\n "; # how to join @dozens
return <<"EXP";
WITH t AS (
select $a A, $b B, $c C, $d D, $e E, $f F, $g G, $h H, $i I, $j J, $k K, $l L FROM DUAL
union all
@dozens
)
select A from t union select B from t union select C from t union
select D from t union select E from t union select F from t union
select G from t union select H from t union select I from t union
select J from t union select K from t union select L from t
EXP
}
One would use that like so:
my $bases_list_expr = big_IN_list(list_your_bases());
$dbh->do(<<"UPDATE");
update bases_table set belong_to = 'us'
where id in ($bases_list_expr)
UPDATE
Instead of using an IN clause, can you try using a JOIN with the other table, the one the ids are fetched from? That way we don't need to worry about the limit. Just a thought from my side.
Instead of SELECT * FROM table1 WHERE ID IN (1,2,3,4,...,1000);
Use this :
SELECT * FROM table1 WHERE ID IN (SELECT rownum AS ID FROM dual connect BY level <= 1000);
*Note that you need to be sure the ID does not refer to any other foreign IDs if this is a dependency. To ensure only existing ids are used:
SELECT * FROM table1 WHERE ID IN (SELECT distinct(ID) FROM tablewhereidsareavailable);
Cheers

Resources