I have written this query to be used by orasql and orafetch in a Tcl script. The query is nested in a proc that is run as many times as the given date range requires (usually around 3 times per instance). The tables with $YYYYMM are full-month tables of about 13 million rows each. The $advocate variable will pull about 39 different account_ids, which will reduce the number of rows returned substantially, but not as much as claim_status_code='4' (~460,000 rows). I have no idea what the current run time of the query is because I have not allowed it to run longer than about 30 minutes. I really need this to return in a couple of seconds; I am just way too new to all of this to know how to improve the speed. I have looked at the Oracle documentation on optimizing queries and have tried to incorporate those suggestions into what I have written. Any help at all would be much appreciated.
select
case
when clg_payor_id
is null
then payor_id1
else clg_payor_id
end
, count(unique era_id||invoice) count
from marge.e835_clp_$YYYYMM#e835v2
where claim_status_code='4' and payor_id1!='client'
group by case when clg_payor_id is null then payor_id1 else clg_payor_id end, era_id
having era_id
in (
select era_id
from marge.e835_checks_$YYYYMM#e835v2
where input_date between '$date1' and '$date2'
group by account_id, era_id
having account_id
in (
select ea.account_id
from cowboy.enrolled_submitters es
, cowboy.enrolled_accounts ea
where ea.submitter_id=es.submitter_id
and es.advocate='$advocate'
)
)
order by count desc
UPDATE: As suggested, I have moved the era_id out of the HAVING clause and into the WHERE. When I run this query, it returns a lot of unwanted results because of all of the account_id's but I can't figure out why. The query did actually run though and it only took ~9 secs. So much improved.
select
case
when clg_payor_id
is null
then payor_id1
else clg_payor_id
end
, account_id, count(unique era_id||invoice) count
from marge.e835_clp_$YYYYMM#e835v2
where claim_status_code='4'
and payor_id1!='client'
and era_id
in (
select era_id
from marge.e835_checks_$YYYYMM#e835v2
where input_date between '$date1' and '$date2'
group by account_id
, era_id
having account_id
in (
select ea.account_id
from cowboy.enrolled_submitters es
, cowboy.enrolled_accounts ea
where ea.submitter_id=es.submitter_id
and es.advocate='$advocate'
)
)
group by case when clg_payor_id is null then payor_id1 else clg_payor_id end, account_id
order by count desc
Related
I am trying to create a sales table; however, after 8 hours the statement still has not finished. I attempted to speed up the query by adding the hints and reducing the time frame to 2022 only, but after 3 hours it is still running. Is there a way to optimise this query?
DROP TABLE BIRTHDAY_SALES;
CREATE TABLE BIRTHDAY_SALES AS
--(
SELECT /*+ parallel(32) */
DISTINCT T.CONTACT_KEY
, S.CAMPAIGN_NAME
, S.CONTROL_GROUP_FLAG
, S.SEGMENT_NAME
, count(distinct t.ORDER_NUM) as TRANS
, count(distinct case when p.store_key = '42381' then t.ORDER_NUM else NULL end) as TRANS_ONLINE
, count(distinct case when p.store_key != '42381' then t.ORDER_NUM else NULL end) as TRANS_OFFLINE
, sum(t.ITEM_AMT) as SALES
, sum(case when p.store_key = '42381' then t.ITEM_AMT else NULL end) as SALES_ONLINE
, sum(case when p.store_key != '42381' then t.ITEM_AMT else NULL end) as SALES_OFFLINE
, sum(case when t.item_quantity_val>0 and t.item_amt<=0 then 0 else t.item_quantity_val end) QTY
, sum(case when (p.store_key = '42381' and t.ITEM_QUANTITY_VAL>0 and t.ITEM_AMT>0) then t.ITEM_QUANTITY_VAL else null end) QTY_ONLINE
, sum(case when (p.store_key != '42381' and t.ITEM_QUANTITY_VAL>0 and t.ITEM_AMT>0) then t.ITEM_QUANTITY_VAL else null end) QTY_OFFLINE
FROM CRM_TARGET.B_TRANSACTION T
JOIN BDAY_PROG S
ON T.CONTACT_KEY = S.CONTACT_KEY
JOIN CRM_TARGET.T_ORDITEM_SD P
ON T.PRODUCT_KEY = P.PRODUCT_KEY
where t.TRANSACTION_TYPE_NAME = 'Item'
and t.BU_KEY = '15'
and t.TRANSACTION_DT_KEY >= '20220101'
and t.TRANSACTION_DT_KEY <= '20221231'
and t.member_sale_flag = 'Y'
and t.bu_key = '15'
and t.CONTACT_KEY != 0
group by
T.CONTACT_KEY
, S.CAMPAIGN_NAME
, S.CONTROL_GROUP_FLAG
, S.SEGMENT_NAME
-- )
;
Performance tuning is not something we can effectively deal with on a forum like this, as there are too many factors to consider. You will have to examine the explain plan and look at ASH data (v$active_session_history) to see what the predominant waits are and on what plan step. Only then can you determine what's wrong and take steps to fix it.
However, here are some obvious things to look for:
Make sure there are no many-to-many joins. I'm guessing B_TRANSACTION probably has many rows with the same CONTACT_KEY and many rows with the same PRODUCT_KEY. That's okay, but then you must ensure that CONTACT_KEY is unique within BDAY_PROG and PRODUCT_KEY is unique within T_ORDITEM_SD. If that's not the case, you will get a partial Cartesian product from the hidden many-to-many join and will spend a huge amount of time reading from and writing to temp.
Make sure no more than one of those joins is one-to-many. Multiple one-to-manies stemming off the same parent table will effectively give you a many-to-many between the children, with the same effect.
You are asking for a date range of a full year. In most systems, you are better off doing a full table scan (with parallel query if you can) than using indexes to get that much transactional data. If it is using an index, that can really mess you up. You can fix this with hints (see below).
It might be using nested loops joins when a reporting query like this is likely better off using hash joins. Again, I'm just guessing based on the names of your tables; only knowledge of your data can determine this for sure.
Ensure that the PGA workareas are of reasonable size. Ask your DBA to query v$pgastat and report the global memory bound. It should be at its max of 1G, but probably anything over 100M is reasonable. If it's less than that, you may need to ask the DBA to increase the pga_aggregate_target, or you can manually set your own sort_area_size/hash_area_size session parameters (not the best thing to do).
You are asking for DOP 32. That's pretty high. Ensure there are that many CPU cores on the database server, that parallel_max_servers > 64 and that you aren't getting downgraded to serial by anything. Ask your DBA what a reasonable DOP would be.
Do you really need COUNT(DISTINCT ...) on ORDER_NUM? If you are just counting the number of transactions, it would be less work to simply say SUM(CASE WHEN ... THEN 1 ELSE 0 END).
Remove the DISTINCT keyword. It's not doing anything - your GROUP BY will already result in the results being distinct.
Consult ASH (v$active_session_history) to see if you are actually blocked by something, showing some kind of concurrency wait. Your CTAS might not be doing anything at all because of some library cache lock or full tablespace if the database is configured to suspend until space is added.
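The checks in the list above can be sketched as a few diagnostic queries. This is only a rough starting point: it assumes you have SELECT access to the v$ views, that your site is licensed for the Diagnostics Pack (required for ASH), and that :sql_id is a bind variable you fill in with the SQL_ID of the running CTAS. The one-hour window is an arbitrary choice.

```sql
-- 1. Join keys that should be unique on the "one" side of each join.
--    Any row returned here means the join fans out (hidden many-to-many).
SELECT contact_key, COUNT(*) AS dup_rows
FROM   bday_prog
GROUP  BY contact_key
HAVING COUNT(*) > 1;

SELECT product_key, COUNT(*) AS dup_rows
FROM   crm_target.t_orditem_sd
GROUP  BY product_key
HAVING COUNT(*) > 1;

-- 2. Current per-workarea PGA limit (the "global memory bound").
SELECT name, value, unit
FROM   v$pgastat
WHERE  name = 'global memory bound';

-- 3. Where the statement has spent its time over the last hour,
--    broken down by plan step and wait event (EVENT is NULL when on CPU).
SELECT sql_plan_line_id,
       sql_plan_operation,
       NVL(event, 'ON CPU') AS wait_event,
       COUNT(*)             AS samples
FROM   v$active_session_history
WHERE  sql_id = :sql_id
AND    sample_time > SYSTIMESTAMP - INTERVAL '1' HOUR
GROUP  BY sql_plan_line_id, sql_plan_operation, NVL(event, 'ON CPU')
ORDER  BY samples DESC;
```

If the first two queries return rows, fix the join logic before touching hints or memory settings; no amount of tuning compensates for a fan-out.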
Here's something to try - again, it's a long shot without knowing your data or table structure. But I've seen enough reports like this to make at least a somewhat educated guess:
SELECT /*+ USE_HASH(t s p) FULL(t) FULL(s) FULL(p) PARALLEL(8) */ t.contact_key . . .
I am trying to improve a query. I have a dataset of opened tickets. Every ticket has several rows, and every row represents an update of the ticket. A field (dt_update) distinguishes the rows of a ticket.
I have these indexes on st_remedy_full_light:
IDX_ASSIGNMENT (ASSIGNMENT)
IDX_REMEDY_INC_ID (REMEDY_INC_ID)
IDX_REMDULL_LIGHT_DTUPD (DT_UPDATE)
Currently the query runs in 8 seconds, which is too slow for me.
WITH last_ticket AS
( SELECT *
FROM st_remedy_full_light a
WHERE a.dt_update IN
( SELECT MAX(dt_update)
FROM st_remedy_full_light
WHERE remedy_inc_id = a.remedy_inc_id
)
)
SELECT remedy_inc_id, ASSIGNMENT FROM last_ticket
This is the plan
How could I improve this query?
P.S. This is just a part of a big query
Additional information:
- The table st_remedy_full_light contains 529,507 rows
You could try:
WITH last_ticket AS
( SELECT remedy_inc_id, ASSIGNMENT,
rank() over (partition by remedy_inc_id order by dt_update desc) rn
FROM st_remedy_full_light a
)
SELECT remedy_inc_id, ASSIGNMENT FROM last_ticket
where rn = 1;
The best alternative query, which is also much easier to execute, is this:
select remedy_inc_id
, max(assignment) keep (dense_rank last order by dt_update)
from st_remedy_full_light
group by remedy_inc_id
This will use only one full table scan and a (hash/sort) group by, no self joins.
Don't worry about indexed access, as you'll probably find that a full table scan is most appropriate here, unless the table is really wide and a composite index on all the columns used (remedy_inc_id, dt_update, assignment) would be significantly quicker to read than the table.
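To see what the KEEP (DENSE_RANK LAST ...) form returns, here is a small self-contained sketch. The three rows are made-up sample data (not from the real table), built inline from dual so it can be run anywhere:

```sql
WITH sample_tickets AS (
  -- Hypothetical data: INC1 has two updates, INC2 has one.
  SELECT 'INC1' AS remedy_inc_id, DATE '2020-01-01' AS dt_update, 'TEAM_A' AS assignment FROM dual UNION ALL
  SELECT 'INC1',                  DATE '2020-01-02',              'TEAM_B'               FROM dual UNION ALL
  SELECT 'INC2',                  DATE '2020-01-05',              'TEAM_C'               FROM dual
)
SELECT remedy_inc_id,
       -- Take the assignment from the row(s) with the latest dt_update;
       -- MAX only breaks ties if several rows share that latest dt_update.
       MAX(assignment) KEEP (DENSE_RANK LAST ORDER BY dt_update) AS assignment
FROM   sample_tickets
GROUP  BY remedy_inc_id
ORDER  BY remedy_inc_id;
-- INC1 -> TEAM_B, INC2 -> TEAM_C
```

One difference worth knowing: if two rows of the same ticket share the latest dt_update, this form returns a single row per ticket, whereas both the original IN (SELECT MAX ...) query and the RANK() version return all of the tied rows.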
MERGE INTO ////////1 GFO
USING
(SELECT *
FROM
(SELECT facto/////rid,
p-Id,
PRE/////EDATE,
RU//MODE,
cre///date,
ROW_NUMBER() OVER (PARTITION BY facto/////id ORDER BY cre///te DESC) col
FROM ///////////2
) x
WHERE x.col = 1) UFD
ON (GFO.FACTO-/////RID=UFD.FACTO////RID)
WHEN MATCHED THEN UPDATE
SET
GFO.PRE////DATE=UFD.PRE//////DATE
WHERE UFD.CRE/////DATE IS NOT NULL
AND UFD.RU//MODE= 'S'
AND GFO.P////ID=:2
Hi everyone, my MERGE statement above is taking too long. It has to run 40 times on table 1 using table 2, each of which has over 4 million records, once for each of 40 different p--id values. Please suggest a more efficient way, as it currently takes 40+ minutes.
It updates only one column, using a column from table 2.
I am unable to execute the query; it returns:
Error: cannot fetch last explain plan from PLAN_TABLE
[Screenshot of the explain plan]
The plan shown seems to be OK; the observed problem stems from the loop over P_ID, which does not scale.
I assume you perform something like this (strongly simplified), assuming the P_IDs to be processed are in the table TAB_PID:
begin
for cur in (select p_id from tab_pid) loop
merge INTO tab1 USING tab2 ON (tab1.r_id = tab2.r_id)
WHEN MATCHED THEN
UPDATE SET tab1.col1=tab2.col1 WHERE p_id = cur.p_id;
end loop;
end;
/
A HASH JOIN on large tables (in NOPARALLEL mode) with an elapsed time of 60 seconds is not a catastrophic result, but looping 40 times adds up to your 40 minutes.
So I'd suggest trying to integrate the loop into the MERGE statement. Without knowing the details, something like this (maybe you'll also need to adjust the MERGE join condition):
merge INTO tab1 USING tab2 ON (tab1.r_id = tab2.r_id)
WHEN MATCHED THEN
UPDATE SET tab1.col1=tab2.col1
WHERE p_id in (select p_id from tab_pid);
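Putting this together with the "latest row per key" source from the question, a single-pass version might look like the sketch below. All names are placeholders, since the real ones are masked in the question: tab1, tab2, r_id, col1 and tab_pid come from the simplified example above, cre_date and ru_mode stand in for the masked creation-date and mode columns, and p_id is assumed to live on the target table.

```sql
MERGE INTO tab1 t1
USING (
        -- Keep only the most recent tab2 row per r_id.
        SELECT r_id, col1, cre_date, ru_mode
        FROM  (SELECT r_id, col1, cre_date, ru_mode,
                      ROW_NUMBER() OVER (PARTITION BY r_id
                                         ORDER BY cre_date DESC) AS rn
               FROM   tab2)
        WHERE  rn = 1
      ) t2
ON (t1.r_id = t2.r_id)
WHEN MATCHED THEN
  UPDATE SET t1.col1 = t2.col1
  WHERE  t2.cre_date IS NOT NULL
  AND    t2.ru_mode = 'S'
  -- One statement covers all 40 values instead of one MERGE per value.
  AND    t1.p_id IN (SELECT p_id FROM tab_pid);
```

The point of the design is that tab2 is scanned and ranked once rather than 40 times; whether that brings the total close to the time of a single iteration depends on your data and should be checked against the actual plan.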
Hi, I have a database with a large number of records, roughly 400K, which is expected to grow further.
I have a query that fetches data from this table to display records to the user. My query is below.
SELECT "PC0".PYID AS "pyID" ,
"PC0".NAME AS "Name" ,
"PC0".OPPORTUNITYSTAGE AS "OpportunityStage" ,
"PC0".PXCREATEOPNAME AS "pxCreateOpName" ,
"PC0".PZINSKEY AS "pzInsKey" ,
"PC0".OPPORTUNITYSHORTNAME AS "OpportunityShortName" ,
"PC0".IDTYPE AS "IDType" ,
"PC0".IDNO AS "IDNo" ,
"Campaign".PROGRAMNAME AS "ProgramName" ,
"Campaign".ENDDATE AS "EndDate" ,
"PC0".PRODUCTNAME AS "ProductName" ,
"PC0".PRODUCTTYPE AS "ProductType" ,
"PC0".OPPORTUNITYSTAGE AS "OpportunityStage" ,
"PC0".PXCREATEOPNAME AS "pxCreateOpName" ,
"PC0".OPPORTUNITYSOURCE AS "OpportunitySource" ,
"PC0".OPPORTUNITYOWNER AS "OpportunityOwner" ,
"PC0".IDTYPE
||"PC0".IDNO AS "pyTextValue(1)" ,
"PC0".REMINDERDATE AS "ReminderDate" ,
"PC0".STAGELASTCHANGED AS "StageLastChanged" ,
ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".STAGELASTCHANGED AS DATE))) AS "pyIntegerValue(1)" ,
(
CASE
WHEN ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".REMINDERDATE AS DATE))) > 0
THEN 1
WHEN ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".STAGELASTCHANGED AS DATE))) > 7
THEN 2
ELSE 3
END) AS "pyIntegerValue(2)" ,
"PC0".PXCREATEDATETIME AS "pxCreateDateTime" ,
"PC0".CAMPAIGNID AS "CampaignID" ,
ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".REMINDERDATE AS DATE))) AS "pyIntegerValue(3)"
FROM MYCO_OPPORTUNITY "PC0"
LEFT OUTER JOIN MYCO_CAMPAIGN "Campaign"
ON ( "PC0".CAMPAIGNID = "Campaign".PYID)
ORDER BY 21 ASC,
22 DESC
This takes close to 13 seconds to fetch the first 50 records in SQL Developer. In production I will be fetching almost 5k records at a time.
The 13 seconds is what I get after defining function-based indexes for the CAST on the REMINDERDATE and STAGELASTCHANGED columns, plus a bitmap join index.
Can you please suggest how I should optimize the query? The ORDER BY on a large set might be the issue, but it is a must for me. :(
Make sure you have an index on: "PC0".CAMPAIGNID and on: "Campaign".PYID
Make sure your SGA is set high enough. Without knowing a lot more about the server and database, it's hard to provide guidance beyond that.
You're using ORDER BY on a computed column, which means Oracle has to compute this value for all 400k rows before being able to sort and return results. To be certain that this is the problem, test without the ORDER BY.
There are a number of possible solutions, but this example does not seem to be your actual use case, so it's pretty much meaningless to suggest optimizations for it.
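As a rough sketch of the index check suggested above: first look at what is already indexed, then create only what is missing. The index names here are made up, and the dictionary query assumes the tables belong to the current schema.

```sql
-- Which columns of the two tables are already indexed?
SELECT table_name, index_name, column_name, column_position
FROM   user_ind_columns
WHERE  table_name IN ('MYCO_OPPORTUNITY', 'MYCO_CAMPAIGN')
ORDER  BY table_name, index_name, column_position;

-- Create the join-column indexes only if the query above shows none.
-- (If PYID is the primary key of MYCO_CAMPAIGN it is indexed already,
--  and the second statement would fail with ORA-01408.)
CREATE INDEX myco_opp_campaignid_ix ON myco_opportunity (campaignid);
CREATE INDEX myco_campaign_pyid_ix  ON myco_campaign (pyid);
```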
Without more knowledge about the data, I'd suggest splitting the query into three parts combined with UNION ALL and creating indexes on reminderdate and stagelastchanged.
select * from ( [part 1] where reminderdate < sysdate order by pxCreateDateTime desc )
union all
select * from ( [part 2] where reminderdate >= sysdate and stagelastchanged + 7 < sysdate order by pxCreateDateTime desc )
union all
select * from ( [part 3] where reminderdate >= sysdate and stagelastchanged + 7 >= sysdate order by pxCreateDateTime desc )
I'd then expect that parts 1 and 2 can be satisfied using an index and part 3 by a full table scan, which might be helped by adding a FIRST_ROWS hint.
I'm pulling two pieces of information over a specific time period, and I would like to fetch the daily average of one tag and the daily count of another tag. I'm not sure how to do daily averages over a specific time period; can anyone provide some advice? Below are my first ideas on how to handle this, but having to change every date would be annoying. Any help is appreciated, thanks.
SELECT COUNT(distinct chargeno), to_char(chargetime, 'mmddyyyy') AS chargeend
FROM batch_index WHERE plant=1 AND chargetime>to_date('2012-06-18:00:00:00','yyyy-mm-dd:hh24:mi:ss')
AND chargetime<to_date('2012-07-19:00:00:00','yyyy-mm-dd:hh24:mi:ss')
group by chargetime;
The working version of the daily sum
SELECT to_char(bi.chargetime, 'mmddyyyy') as chargtime, SUM(cv.val)*0.0005
FROM Charge_Value cv, batch_index bi WHERE cv.ValueID =97
AND bi.chargetime<=to_date('2012-07-19','yyyy-mm-dd')
AND bi.chargeno = cv.chargeno AND bi.typ=1
group by to_char(bi.chargetime, 'mmddyyyy')
It seems that in the first one you want to group by the day, not the time (plus I don't think you need to specify all those zeros for the time part):
SELECT COUNT(distinct chargeno), to_char(chargetime, 'mmddyyyy') AS chargeend
FROM batch_index WHERE plant=1 AND chargetime>to_date('2012-06-18','yyyy-mm-dd')
AND chargetime<to_date('2012-07-19','yyyy-mm-dd')
group by to_char(chargetime, 'mmddyyyy') ;
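The same grouping idea extends to the daily average. The following is only a sketch under assumptions: it takes the tag to be averaged to be the ValueID = 97 from the daily-sum query, joins the two tables on chargeno as that query does, and uses a half-open date range (start inclusive, end exclusive). Add your own plant or typ filter as needed.

```sql
SELECT TRUNC(bi.chargetime)        AS charge_day,   -- midnight of each day
       AVG(cv.val)                 AS daily_avg,    -- average of the tag per day
       COUNT(DISTINCT bi.chargeno) AS daily_charges -- distinct charges per day
FROM   batch_index  bi
JOIN   charge_value cv ON cv.chargeno = bi.chargeno
WHERE  cv.valueid = 97
AND    bi.chargetime >= DATE '2012-06-18'
AND    bi.chargetime <  DATE '2012-07-19'
GROUP  BY TRUNC(bi.chargetime)
ORDER  BY charge_day;
```

Grouping by TRUNC(chargetime) keeps the result a real DATE, so it sorts chronologically; a 'mmddyyyy' string would sort by month text instead. Only the two date literals need to change to cover a different period.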
I'm not 100% sure I'm following your question, but if you just want to do aggregates (sums, averages), then do just that. I threw in the ROLLUP in case that is what you were looking for:
with fakeData as(
select trunc(level *.66667) nr
, trunc(2*level * .33478) lvl --these truncs just make the doubles ints
,trunc(sysdate+trunc(level*.263784123)) dte --note the trunc, this gets rid of the to_char to drop the time
from dual
connect by level < 600
) --the cte is just to create fake data
--below is just some aggregates that may help you
select sum(nr) daily_sum_of_nr
, avg(nr) daily_avg_of_nr
, count(distinct lvl) distinct_lvls_per_day
, count(lvl) count_of_nonNull_lvls_per_day
, dte days
from fakeData
group by rollup(dte)
--if you want the query to supply a total for the range, you may use rollup ( http://psoug.org/reference/rollup.html )