How can I set a maximum length using LISTAGG - Oracle

I have read the other questions and answers and they do not help with my issue. I am asking if there is a way to set a limit on the number of results returned by LISTAGG.
I am using this query (a CTE fragment; dd holds the date range):
HR AS -- any baby with a HR < 80
(
SELECT fm.y_inpatient_dat, h.pat_id, h.pat_enc_csn_id,
LISTAGG(meas_value, '; ') WITHIN GROUP (ORDER BY fm.recorded_time)
abnormal_HR_values
FROM
ip_flwsht_meas fm
JOIN pat_enc_hsp h ON fm.y_inpatient_dat = h.inpatient_data_id
WHERE fm.flo_meas_id IN ('8') AND to_number(meas_value) < 80
AND fm.recorded_time BETWEEN (SELECT start_date FROM dd) AND (SELECT end_date FROM dd)
GROUP BY fm.y_inpatient_dat, h.pat_id, h.pat_enc_csn_id)
and I get the following error:
ORA-01489: result of string concatenation is too long
I have researched online how to set a size limit, but I can't seem to make it work. Can someone please advise how to set a limit so the result does not exceed 50 characters?

In Oracle 12.2+, you can use ON OVERFLOW TRUNCATE in the LISTAGG (the default, ON OVERFLOW ERROR, is what raises ORA-01489), like:
LISTAGG(meas_value, '; ' ON OVERFLOW TRUNCATE) WITHIN GROUP (ORDER BY fm.recorded_time)
Then you can surround that with a SUBSTR() to get the first 50 characters.
Pre-12.2, you need to restructure the query to limit the number of rows seen by the LISTAGG. Here is an example that uses DBA_OBJECTS (so people without your tables can run it). It will list only the first three values for each object type.
SELECT object_type,
listagg(object_name, ', ') within group ( order by object_name) first_three
FROM (
SELECT object_type,
object_name,
row_number() over ( partition by object_type order by object_name ) ord
FROM dba_objects
WHERE owner = 'SYS'
)
WHERE ord <= 3
GROUP BY object_type
ORDER BY object_type;
The idea is to number the rows that you want to aggregate and then only aggregate the first X of them, where "X" is small enough not to overflow the maximum length of a VARCHAR2. "X" will depend on your data.
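If you want to sanity-check the pattern outside Oracle, here is a minimal sketch in Python using SQLite (3.25+ for window functions, bundled with recent CPython). The table and data are invented for the demo, and SQLite's group_concat stands in for LISTAGG:

```python
import sqlite3

# Invented demo table standing in for DBA_OBJECTS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objs (object_type TEXT, object_name TEXT);
INSERT INTO objs VALUES
  ('TABLE', 't1'), ('TABLE', 't2'), ('TABLE', 't3'), ('TABLE', 't4'),
  ('VIEW', 'v1'), ('VIEW', 'v2');
""")

# Number the rows per object_type, keep only the first three,
# then aggregate -- group_concat plays the role of LISTAGG here.
rows = conn.execute("""
SELECT object_type, group_concat(object_name, ', ') AS first_three
FROM (
  SELECT object_type, object_name,
         ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY object_name) AS ord
  FROM objs
)
WHERE ord <= 3
GROUP BY object_type
ORDER BY object_type
""").fetchall()
print(rows)
```

The TABLE group loses its fourth value; the VIEW group, with only two values, is unaffected.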
Or, if you don't want the truncation at 50 characters to happen mid-value, and/or you don't know how many values are safe to allow, you can replace the ord expression with a running_len expression that keeps a running count of the length and caps it off before it reaches your limit (of 50 chars). That expression would be a SUM(length()) OVER (...). Like this:
SELECT object_type,
listagg(object_name, ', ') within group ( order by object_name) first_50_char
FROM (
SELECT object_type,
object_name,
sum(length(object_name || ', '))
over ( partition by object_type order by object_name ) running_len
FROM dba_objects
WHERE owner = 'SYS'
)
WHERE running_len <= 50+2 -- +2 because the last one won't have a trailing delimiter
GROUP BY object_type
ORDER BY object_type;
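The same running-length trick can be tried in a portable sketch (SQLite via Python; the table and a 10-character cap are invented for the demo). Each value plus its delimiter counts toward running_len, and the filter stops the aggregate before it passes the cap:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (grp TEXT, val TEXT, recorded_at INTEGER);
INSERT INTO readings VALUES
  ('a', '71', 1), ('a', '72', 2), ('a', '73', 3), ('a', '74', 4),
  ('b', '65', 1);
""")

# Keep a running total of value-plus-delimiter lengths per group, drop rows
# once the total passes the cap, then aggregate what is left.
rows = conn.execute("""
SELECT grp, group_concat(val, '; ') AS agg
FROM (
  SELECT grp, val,
         SUM(length(val || '; ')) OVER (
           PARTITION BY grp ORDER BY recorded_at
         ) AS running_len
  FROM readings
)
WHERE running_len <= 10 + 2  -- +2: the last value has no trailing delimiter
GROUP BY grp
""").fetchall()
print(rows)
```

Group 'a' keeps only its first three readings (10 characters aggregated); group 'b' fits entirely.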
With your query, all that put together would look like this:
SELECT y_inpatient_dat,
pat_id,
pat_enc_csn_id,
LISTAGG(meas_value, '; ') WITHIN GROUP ( ORDER BY recorded_time ) abnormal_HR_values
FROM (
SELECT fm.y_inpatient_dat,
h.pat_id,
h.pat_enc_csn_id,
meas_value,
fm.recorded_time,
SUM(length(meas_value || '; ')) OVER ( PARTITION BY fm.y_inpatient_dat, h.pat_id, h.pat_enc_csn_id ORDER BY fm.recorded_time ) running_len
FROM ip_flwsht_meas fm
INNER JOIN pat_enc_hsp h on fm.y_inpatient_dat = h.inpatient_data_id
WHERE fm.flo_meas_id in ('8' ) and (to_number(MEAS_VALUE) <80)
AND fm.recorded_time BETWEEN
(SELECT start_date FROM dd) AND (SELECT end_date FROM dd)
)
WHERE running_len <= 50+2
GROUP BY y_inpatient_dat, pat_id, pat_enc_csn_id;

This query looks clunky. Is there a better way to do this?

I've written a query that works, but looks super clunky. Also, the table only has a few hundred records in it right now, but will in the future have hundreds of thousands of records. It might get into the millions, but I'm not sure. So, performance might become an issue.
I'm wondering if there is a cleaner way to do this.
Thanks.
with objects as
(
select object_type, object_name
from pt_objectshistory
where export_guid = 'PTGAA5V0H2U1XAQYFLQ0QXGWF0OY7Z'
),
distinct_objects as
(select distinct * from objects),
o_count as
(select count(*) ocount from objects),
do_count as
(select count(*) docount from distinct_objects)
select
o_count.ocount,
do_count.docount,
o_count.ocount - do_count.docount delta
from o_count join do_count on 1=1
There is no GROUP BY needed; you can just count the number of (distinct) rows in the table directly:
select count(*), count(distinct object_type || object_name)
from pt_objectshistory
where export_guid = 'PTGAA5V0H2U1XAQYFLQ0QXGWF0OY7Z'
Thanks to @MT0: object_type || object_name can be ambiguous and collapse genuinely distinct rows when the concatenations collide, as in 'abc' || 'def' and 'ab' || 'cdef' (both yield 'abcdef').
So depending on the data you have, adding a separator can help, e.g. object_type || ';' || object_name.
You should also look at performance and test whether this solution (concatenating columns) is really faster than a count over a subselect/CTE.
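MT0's point is easy to reproduce; a quick sketch with SQLite in Python (invented two-row table) shows the collision and the separator fix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (object_type TEXT, object_name TEXT);
-- Two distinct rows whose bare concatenations collide ('abcdef').
INSERT INTO t VALUES ('abc', 'def'), ('ab', 'cdef');
""")

no_sep = conn.execute(
    "SELECT count(DISTINCT object_type || object_name) FROM t").fetchone()[0]
with_sep = conn.execute(
    "SELECT count(DISTINCT object_type || ';' || object_name) FROM t").fetchone()[0]
print(no_sep, with_sep)  # 1 2
```

Without the separator the two distinct rows count as one; with it, both survive. Note the separator only helps if it cannot appear in the data itself.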
You may group by the object_type, object_name pairs and then sum up the result
select sum(gcount) as ocount, count(*) as docount
from (select count(*) as gcount, object_type, object_name
from pt_objectshistory
where export_guid = 'PTGAA5V0H2U1XAQYFLQ0QXGWF0OY7Z'
group by object_type, object_name) temp
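The sum-of-group-counts shape is easy to verify with a small sketch (SQLite via Python; invented data with four rows over three distinct pairs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (object_type TEXT, object_name TEXT);
INSERT INTO t VALUES ('A', 'x'), ('A', 'x'), ('A', 'y'), ('B', 'z');
""")

# The inner query emits one row per distinct pair; summing the per-pair
# counts recovers the total row count, and COUNT(*) gives the distinct count.
ocount, docount = conn.execute("""
SELECT SUM(gcount) AS ocount, COUNT(*) AS docount
FROM (SELECT COUNT(*) AS gcount, object_type, object_name
      FROM t
      GROUP BY object_type, object_name) temp
""").fetchone()
print(ocount, docount)  # 4 3
```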

HiveSQLException: cannot recognize input near 'SELECT' 'MAX' '(' in expression specification

I'm trying to get the maximum value of a count. The code is as follows
SELECT coachID, COUNT(coachID)
FROM coaches_awards GROUP BY coachID
HAVING COUNT(coachID) =
(
SELECT MAX(t2.awards)
FROM (
SELECT coachID, count(coachID) as awards
FROM coaches_awards
GROUP BY coachID
) t2
);
Yet something keeps failing. The inner query works and gives the answer that I want and the outer query will work if the inner query is replaced by the number required. So I'm assuming I've made some syntax error.
Where am I going wrong?
If you are just looking for one row, why not do:
SELECT coachID, COUNT(coachID) as cnt
FROM coaches_awards
GROUP BY coachID
ORDER BY cnt DESC
LIMIT 1;
If you want ties, then use RANK() or DENSE_RANK():
SELECT ca.*
FROM (SELECT coachID, COUNT(*) as cnt,
RANK() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM coaches_awards
GROUP BY coachID
) ca
WHERE seqnum = 1;
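The RANK() pattern is standard SQL, so it can be checked outside Hive; here is a sketch with SQLite in Python (invented awards data with a tie at the top):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coaches_awards (coachID TEXT);
INSERT INTO coaches_awards VALUES
  ('a'), ('a'), ('a'), ('b'), ('b'), ('b'), ('c');
""")

# RANK() over the descending per-coach counts keeps every coach tied
# for first place, unlike LIMIT 1 which would return only one of them.
rows = conn.execute("""
SELECT coachID, cnt FROM (
  SELECT coachID, COUNT(*) AS cnt,
         RANK() OVER (ORDER BY COUNT(*) DESC) AS seqnum
  FROM coaches_awards
  GROUP BY coachID
) ca
WHERE seqnum = 1
""").fetchall()
print(sorted(rows))
```

Both 'a' and 'b' have three awards, so both come back with seqnum = 1.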

UPDATE using a WITHIN GROUP

I've written the statement below, which returns the data in the format I need to update another table with; however, I'm struggling with the update.
SELECT element_id,
LISTAGG(cast(0 as varchar2(20))||', '|| VALUE, ' | ') WITHIN GROUP (ORDER BY display_order)
FROM EDRN.MD$$_ELEMENT_VALUES
WHERE element_id IN
(SELECT element_id FROM EDRN_NEW.DATA_DICTIONARY)
GROUP BY element_id;
I did a basic conversion into an UPDATE statement:
UPDATE EDRN_NEW.DATA_DICTIONARY
SET Choices = (LISTAGG(CAST(0 AS VARCHAR2(20))||', '|| VALUE, ' | ') WITHIN GROUP (ORDER BY display_order)
FROM EDRN.MD$$_ELEMENT_VALUES
WHERE element_id IN
(SELECT element_id FROM EDRN_NEW.DATA_DICTIONARY)
GROUP BY element_id);
This received an "ORA-00934: group function is not allowed here" error. I'm unsure how to remove the group function while retaining the data format I require.
You need a subquery to use listagg(). In this case, a correlated subquery:
update EDRN_NEW.DATA_DICTIONARY dd
set choices = (SELECT LISTAGG(cast(0 as varchar2(20))||', '|| VALUE, ' | ') WITHIN GROUP (ORDER BY display_order)
FROM EDRN.MD$$_ELEMENT_VALUES ev
WHERE ev.element_id = dd.element_id
)
where exists (select 1
from EDRN.MD$$_ELEMENT_VALUES ev
where ev.element_id = dd.element_id
);
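The shape of that correlated UPDATE can be sketched portably (SQLite via Python; table and column names are simplified stand-ins, and group_concat plays the role of LISTAGG):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_dictionary (element_id INTEGER, choices TEXT);
CREATE TABLE element_values (element_id INTEGER, value TEXT, display_order INTEGER);
INSERT INTO data_dictionary VALUES (1, NULL), (2, NULL), (3, NULL);
INSERT INTO element_values VALUES (1, 'a', 1), (1, 'b', 2), (2, 'c', 1);

-- Correlated subquery: each row gets its own aggregate; the EXISTS clause
-- leaves rows with no matching values untouched (choices stays NULL).
UPDATE data_dictionary
SET choices = (SELECT group_concat(value, ' | ')
               FROM element_values ev
               WHERE ev.element_id = data_dictionary.element_id)
WHERE EXISTS (SELECT 1 FROM element_values ev
              WHERE ev.element_id = data_dictionary.element_id);
""")
rows = conn.execute(
    "SELECT element_id, choices FROM data_dictionary ORDER BY element_id").fetchall()
print(rows)
```

Element 3 has no values, so its choices column stays NULL instead of being blanked out; that is what the EXISTS guard buys you.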

Querying all columns of a single table in a WHERE condition for the same input data

If we want to fetch information based on a condition on a single column, we do this:
SELECT * FROM contact WHERE firstName = 'james'
If we want to put conditions on multiple columns, we do this:
SELECT * FROM contact WHERE firstName = 'james' OR lastName = 'james' OR businessName = 'james'
But what if we have more than 50 columns?
Is there a better way than a WHERE condition with a chain of ORs?
The approach should not involve writing out all the column names.
There is a way to do this in MySQL as shown here.
If you're wanting to search all VARCHAR2 columns, then the following script ought to help:
set pages 0;
set lines 200
select case when rn = 1 and rn_desc = 1 then 'select * from '||table_name||' where '||column_name||' = ''james'';'
when rn = 1 then 'select * from '||table_name||' where '||column_name||' = ''james'''
when rn_desc = 1 then ' and '||column_name||' = ''james'';'
else ' and '||column_name||' = ''james'''
end sql_stmt
from (select table_name,
column_name,
column_id,
row_number() over (partition by table_name order by column_id) rn,
row_number() over (partition by table_name order by column_id desc) rn_desc
from user_tab_columns
where data_type in ('VARCHAR2')
-- and table_name in (<list of tables>) -- uncomment and amend as appropriate!
)
order by table_name, column_id;
If you only want to search specific tables, you would have to put a filter in for the table_names you're after.
Running the above as a script will give you a script containing multiple queries that you can then run.
There is no way to avoid writing all the column names,
but you can use an IN condition to make this a bit shorter:
SELECT *
FROM contact
WHERE 'james' in (firstName, lastName, businessName)
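The IN-list form is standard SQL; a quick sketch in Python/SQLite (invented rows) shows it matching when any of the listed columns holds the value:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (firstName TEXT, lastName TEXT, businessName TEXT);
INSERT INTO contact VALUES
  ('james', 'smith', 'acme'),
  ('anna', 'james', 'acme'),
  ('bob', 'brown', 'browns burgers');
""")

# 'james' IN (col1, col2, col3) is shorthand for the equivalent OR chain.
rows = conn.execute("""
SELECT firstName FROM contact
WHERE 'james' IN (firstName, lastName, businessName)
""").fetchall()
print(rows)
```

The first row matches on firstName and the second on lastName; the third matches nowhere and is excluded.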

Exists query with not-equal running extremely slow

I am trying to modify a query that someone else wrote which is taking a really long time to run. The problem has to do with the <> portion of the exists query. Any idea how this can be changed to run quicker?
SELECT m.level4 center, cc.description, m.employeename, m.empno,
TO_DATE (ct.tsdate, 'dd-mon-yyyy') tsdate, ct.starttime, ct.endtime,
ct.DURATION,
NVL (DECODE (ct.paycode, ' ', 'REG', ct.paycode), 'REG') paycode,
ct.titlecode, ct.costcenter, m.tsgroup
FROM clairvia_text ct, MASTER m, costcenteroutbound cc
WHERE ct.recordtype = '1'
AND ct.empno = m.empno
AND m.level4 = cc.center
AND EXISTS (
SELECT ct1.recordtype,ct1.empno,ct1.tsdate,ct1.processdate
FROM clairvia_text ct1
WHERE ct.recordtype = ct1.recordtype
AND ct.empno = ct1.empno
AND ct.tsdate = ct1.tsdate
AND ct.processdate = ct1.processdate
group by ct1.recordtype,ct1.empno,ct1.tsdate,ct1.processdate
having count(*) < 2)
Oracle can be finicky with EXISTS statements and subqueries. A couple of things to try:
Change the exists to an "in"
Change the exists to a group by statement with a "having count(1) > 1". This could even be changed into a join.
I'm assuming indexes are not an issue.
You can use the analytic function COUNT here to eliminate duplicated rows:
select * from (
SELECT m.level4 center, cc.description, m.employeename, m.empno,
TO_DATE (ct.tsdate, 'dd-mon-yyyy') tsdate, ct.starttime, ct.endtime,
ct.DURATION,
NVL (DECODE (ct.paycode, ' ', 'REG', ct.paycode), 'REG') paycode,
ct.titlecode, ct.costcenter, m.tsgroup,
count(1) over (partition by ct.recordtype,ct.empno,ct.tsdate,ct.processdate
order by null) cnt
FROM clairvia_text ct, MASTER m, costcenteroutbound cc
WHERE ct.recordtype = '1'
AND ct.empno = m.empno
AND m.level4 = cc.center
) where cnt=1
I do not have your structures and data, so I ran similar queries against all_tab_cols; the first query took ~500s on my laptop and the second ~2s.
-- slow test
select count(1)
from all_tab_cols atc
where exists (
select 1
from all_tab_cols atc1
where atc1.column_name = atc.column_name
group by column_name
having count(1) = 1)
-- fast test
select count(1)
from (
select column_name,
count(1) over (partition by atc.column_name order by null) cnt
from all_tab_cols atc
)
where cnt = 1
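Speed aside, the correctness of the analytic rewrite is easy to check with a portable sketch (SQLite via Python, invented rows with one duplicated key):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (empno TEXT, tsdate TEXT);
-- ('1', 'd1') is duplicated; the other rows appear exactly once.
INSERT INTO t VALUES ('1', 'd1'), ('1', 'd1'), ('2', 'd1'), ('3', 'd2');
""")

# COUNT(*) over the key columns tags each row with its duplicate count;
# keeping cnt = 1 matches the original HAVING count(*) < 2 semantics.
rows = conn.execute("""
SELECT empno, tsdate FROM (
  SELECT empno, tsdate,
         COUNT(*) OVER (PARTITION BY empno, tsdate) AS cnt
  FROM t
)
WHERE cnt = 1
""").fetchall()
print(sorted(rows))
```

Both copies of the duplicated key are tagged cnt = 2 and dropped, leaving only the rows that occur exactly once.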
