Hi I have a database with large number of records roughly, 400K which is supposed to grow even more.
I have a query to fetch data from this table to display records to user . my query is below.
SELECT "PC0".PYID AS "pyID" ,
"PC0".NAME AS "Name" ,
"PC0".OPPORTUNITYSTAGE AS "OpportunityStage" ,
"PC0".PXCREATEOPNAME AS "pxCreateOpName" ,
"PC0".PZINSKEY AS "pzInsKey" ,
"PC0".OPPORTUNITYSHORTNAME AS "OpportunityShortName" ,
"PC0".IDTYPE AS "IDType" ,
"PC0".IDNO AS "IDNo" ,
"Campaign".PROGRAMNAME AS "ProgramName" ,
"Campaign".ENDDATE AS "EndDate" ,
"PC0".PRODUCTNAME AS "ProductName" ,
"PC0".PRODUCTTYPE AS "ProductType" ,
"PC0".OPPORTUNITYSTAGE AS "OpportunityStage" ,
"PC0".PXCREATEOPNAME AS "pxCreateOpName" ,
"PC0".OPPORTUNITYSOURCE AS "OpportunitySource" ,
"PC0".OPPORTUNITYOWNER AS "OpportunityOwner" ,
"PC0".IDTYPE
||"PC0".IDNO AS "pyTextValue(1)" ,
"PC0".REMINDERDATE AS "ReminderDate" ,
"PC0".STAGELASTCHANGED AS "StageLastChanged" ,
ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".STAGELASTCHANGED AS DATE))) AS "pyIntegerValue(1)" ,
(
CASE
WHEN ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".REMINDERDATE AS DATE))) > 0
THEN 1
WHEN ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".STAGELASTCHANGED AS DATE))) > 7
THEN 2
ELSE 3
END) AS "pyIntegerValue(2)" ,
"PC0".PXCREATEDATETIME AS "pxCreateDateTime" ,
"PC0".CAMPAIGNID AS "CampaignID" ,
ROUND((CAST(SYSDATE AS DATE) - CAST("PC0".REMINDERDATE AS DATE))) AS "pyIntegerValue(3)"
FROM MYCO_OPPORTUNITY "PC0"
LEFT OUTER JOIN MYCO_CAMPAIGN "Campaign"
ON ( "PC0".CAMPAIGNID = "Campaign".PYID)
ORDER BY 21 ASC,
22 DESC
This takes near to 13 seconds to fetch first 50 records in SQl developer. In real time I will be fetching almost 5k records at a time.
The time 13 sec is coming after i have defined functional index for CAST on REMINDERDATE and STAGELASTCHANGED column and a bitmap join index.
Can you please suggest how should i optimize the query. Order by on a large set might be an issue bit it is must for me. :(
Make sure you have an index on: "PC0".CAMPAIGNID and on: "Campaign".PYID
Make sure your SGA is set high enough. Without knowing a lot information about the server and database it's hard to provide guidance other than make sure the SGA is large enough.
You're using "order by" on a computed column, which means Oracle has to compute this value for all 400k rows, before being able to sort and return results. To be certain that this is the problem test without using order by.
There are a number of possible solutions but this example does not seem to be your actual use case so its pretty much meaningless to suggest optimizations for it.
Without more knowledge about the data I'd suggest splitting the query into three parts connected with union and implement indexes on reminderdate and stagelastchanged.
select * from ( [part 1] where reminderdate > sysdate order by pxCreateDateTime )
union all
select * from ( [part 2] where reminderdate <= sysdate and stagelastchanged + 7 < sysdate order by pxCreateDateTime )
union all
select * from ( [part 3] where reminderdate <= sysdate and stagelastchanged + 7 >= sysdate order by pxCreateDateTime )
I'd then expect that 1. and 2. should be satisfied using index and 3. a full table scan, which might be helped by adding a first_rows hint.
Related
I am working on ODI mapping where I am calculating" Min(ID) over parition by(device_num, sys_id) as min_id" in expression component, I used another expression component to filter duplicates using row_number() over partition by (ID) order by(min_id) followed by a filter component "rownum=1" this results in window function error are not allowed here.
I understand that I need to run the analytical function on top the aggregate results. I am not sure how to achieve this in odi mapping (odi 12c). can anyone of you please guide me?
merge into (
select /*+ */ *
from target_base.tgt_table
where (1=1)
) TGT
using (
select /*+ */
RESULT2.ID_1 AS ID,
RESULT2.COL AS MIN_ID
from (
SELECT
RESULT1.ID AS ID ,
RESULT1.DEVICE__NUM AS DEVICE__NUM ,
RESULT1.SYS_ID AS SYS_ID ,
MIN(RESULT1.ID) OVER (PARTITION BY RESULT1.DEVICE__NUM ,RESULT1.SYS_ID) AS COL ,
ROW_NUMBER() OVER (PARTITION BY RESULT1.ID ORDER BY (MIN(RESULT1.ID) OVER (PARTITION BY RESULT1.DEVICE__NUM ,RESULT1.SYS_ID) AS COL) DESC ) AS COL_1
-- WINDOW FUNCTION ERROR,
FROM
(
select * from union_table
) RESULT1
)RESULT2
where (1=1)
and (RESULT2.COL_1 = 1)
) SRC
on (
and TGT.ID=SRC.ID )
when matched then update set
TGT.COMMON_ID = SRC.MIN_ID
, TGT.REC_UPDATE = SYSDATE
WHERE (
DECODE(TGT.COMMON_ID, SRC.COMMON_ID, 0, 1) > 0
)
UNION_TABLE has data as per below table
ID
device_num
sys_id
1
A
5
2
B
15
3
C
25
4
D
35
5
A
10
5
A
5
6
B
15
6
B
20
7
C
25
7
C
30
8
D
35
8
D
40
output expected: the ID where the rown_num=1 will be updated in target
ODI Mapping
This is very complex use case to model in ODI and the parser might not understand what you are trying to achieve.
My advice would be to write the difficult part of the query manually in SQL and use it as a source in ODI. Here is how to do it :
In the physical design of your mapping click on your source table. In the property pane, go to the Extract Options. You can then paste your SQL as a value for option CUSTOMER_TEMPLATE.
Of course it hides a bit the logic of the mapping so it shouldn't be used everywhere but for complex use cases as this one, this is an easy way to get the job done. I personally always add a memo on mapping with custom SQL so other developers can quickly see it.
Let try use IKM :Oracle Incremental Update on target table replace for IKM Oracle Merge.
Physical -> click target table -> Intergration Knowlege Module -> Oracle Incremental Update
I work primarily with SAS and Oracle and am still new to DB2. Im faced with needing a hierarchical query to separate a clob into chunks that can be pulled into sas. SAS has a limit of 32K for character variables so I cant just pull the dataset in normally.
I found an old stackoverflow question about the best way to pull a clob into a sas data set but it is written in Oracle.
Import blob through SAS from ORACLE DB
Since I am new to DB2 and the syntax for this type of join seems very different I was hoping to find someone that could help convert it and explain the syntax. I find the Oracle syntax to be much easier to understand. I'm not sure in DB2 if you would use a CTE recursion like this https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/apsg/src/tpc/db2z_xmprecursivecte.html or if you would use hierarchical queries like this https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_71/sqlp/rbafyrecursivequeries.htm
Here is the Oracle query.
SELECT
id
, level as chunk_id
, regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') as clob_chunk
FROM (
SELECT id, clob_value
FROM schema.table
WHERE id = 1
)
CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
order by id, chunk_id;
The table has two fields the id and the clob_value and would look like this.
ID CLOB_VALUE
1 really large clob
2 medium clob
3 another large clob
The thought is I would want this result. I would only ever be doing this one row at a time where id= which ever row I am processing.
ID CHUNK_ID CLOB
1 1 clob_chunk1of3
1 2 clob_chunk2of3
1 3 clob_chunk3of3
Thanks for any time spent reading and helping.
Here is a solution that should work in DB2 with few changes (but please be advised that I don't know DB2 at all; I am just using Oracle features that are in the SQL Standard, so they should be implemented identically - or almost so - in DB2).
Below I create a table with your sample data; then I show how to chunk it into substrings of length at most 8 characters. Although the strings are short, I defined the column as CLOB and I am using CLOB tools; this should work on much larger CLOBs.
You can make both the chunk size and the id into bind parameters, if needed. In my demo below I hardcoded the chunk size and I show the result for all IDs in the table. In case the CLOB is NULL, I do return one chunk (which is NULL, of course).
Note that touching CLOBs in a query is very expensive; so most of the work is done without touching the CLOBs. I only work on them as little as possible.
PREP WORK
drop table tbl purge; -- If needed
create table tbl (id number, clob_value clob);
insert into tbl (id, clob_value)
select 1, 'really large clob' from dual union all
select 2, 'medium clob' from dual union all
select 3, 'another large clob' from dual union all
select 4, null from dual -- added to check handling
;
commit;
QUERY
with
prep(id, len) as (
select id, dbms_lob.getlength(clob_value)
from tbl
)
, rec(id, len, ord, pos) as (
select id, len, 1, 1
from prep
union all
select id, len, ord + 1, pos + 8
from rec
where len >= pos + 8
)
select id, ord, dbms_lob.substr(clob_value, 8, pos)
from tbl inner join rec using (id)
order by id, ord
;
ID ORD CHUNK
---- ---- --------
1 1 really l
1 2 arge clo
1 3 b
2 1 medium c
2 2 lob
3 1 another
3 2 large cl
3 3 ob
4 1
Another option is to enable the Oracle compatibility in Db2 and just issue the hierarchical query.
This GitHub repository has background information on SQL recursion in DB2, including the Oracle-style syntax and a side by side example (both work against the Db2 sample database):
-- both queries are against the SAMPLE database
-- and should return the same result
SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
START WITH DEPTNO = 'A00'
CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT;
WITH tdep(level, deptname, deptno) as (
SELECT 1, CAST( DEPTNAME AS VARCHAR(40)) AS DEPTNAME, deptno
FROM department
WHERE DEPTNO = 'A00'
UNION ALL
SELECT t.LEVEL+1, CAST(SPACE(t.LEVEL * 4) || '/' || d.DEPTNAME
AS VARCHAR(40)) AS DEPTNAME, d.deptno
FROM DEPARTMENT d, tdep t
WHERE d.admrdept=t.deptno and d.deptno<>'A00')
SELECT level, deptname
FROM tdep;
I have a typical oracle pagination sql called from a web application like this.
SELECT *
FROM
(SELECT * ( Very complex inner queries )
FROM xyz table
ORDER BY unique_colunn DESC ==> killer
)
WHERE rownum >= 50 and rownum<100
The sql works fine(returns data) within 2 or 3 seconds , but once the order by clause is introduced, it kills the query , it takes 200+ secs , but i cannot remove the order by unique column , because that is the one that drives the pagination logic , since it is a inline view not able to add any tuning hints , any pointers?
Have tried rank() , row_num over etc instead of using order by in the where condition as suggested , nothing works.
I'm pulling two pieces of information over a specific time period, but I would like to fetch the daily average of one tag and the daily count of another tag. I'm not sure how to do daily averages over a specific time period, can anyone provide some advice? Below were my first ideas on how to handle this however to change every date would be annoying. Any help is appreciated thanks
SELECT COUNT(distinct chargeno), to_char(chargetime, 'mmddyyyy') AS chargeend
FROM batch_index WHERE plant=1 AND chargetime>to_date('2012-06-18:00:00:00','yyyy-mm-dd:hh24:mi:ss')
AND chargetime<to_date('2012-07-19:00:00:00','yyyy-mm-dd:hh24:mi:ss')
group by chargetime;
The working version of the daily sum
SELECT to_char(bi.chargetime, 'mmddyyyy') as chargtime, SUM(cv.val)*0.0005
FROM Charge_Value cv, batch_index bi WHERE cv.ValueID =97
AND bi.chargetime<=to_date('2012-07-19','yyyy-mm-dd')
AND bi.chargeno = cv.chargeno AND bi.typ=1
group by to_char(bi.chargetime, 'mmddyyyy')
seems like in the first one you want to change the group to the day - not the time... (plus i dont think you need to specify all those 0's for seconds..)
SELECT COUNT(distinct chargeno), to_char(chargetime, 'mmddyyyy') AS chargeend
FROM batch_index WHERE plant=1 AND chargetime>to_date('2012-06-18','yyyy-mm-dd')
AND chargetime<to_date('2012-07-19','yyyy-mm-dd')
group by to_char(chargetime, 'mmddyyyy') ;
not 100% I'm following your question, but if you just want to do aggregates (sums, avg), then do just that. I threw in the rollup just in case that is what you were looking for
with fakeData as(
select trunc(level *.66667) nr
, trunc(2*level * .33478) lvl --these truncs just make the doubles ints
,trunc(sysdate+trunc(level*.263784123)) dte --note the trunc, this gets rid of the to_char to drop the time
from dual
connect by level < 600
) --the cte is just to create fake data
--below is just some aggregates that may help you
select sum(nr) daily_sum_of_nr
, avg(nr) daily_avg_of_nr
, count(distinct lvl) distinct_lvls_per_day
, count(lvl) count_of_nonNull_lvls_per_day
, dte days
from fakeData
group by rollup(dte)
--if you want the query to supply a total for the range, you may use rollup ( http://psoug.org/reference/rollup.html )
I need to select rows randomly from an Oracle DB.
Ex: Assume a table with 100 rows, how I can randomly return 20 of those records from the entire 100 rows.
SELECT *
FROM (
SELECT *
FROM table
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum < 21;
SAMPLE() is not guaranteed to give you exactly 20 rows, but might be suitable (and may perform significantly better than a full query + sort-by-random for large tables):
SELECT *
FROM table SAMPLE(20);
Note: the 20 here is an approximate percentage, not the number of rows desired. In this case, since you have 100 rows, to get approximately 20 rows you ask for a 20% sample.
SELECT * FROM table SAMPLE(10) WHERE ROWNUM <= 20;
This is more efficient as it doesn't need to sort the Table.
SELECT column FROM
( SELECT column, dbms_random.value FROM table ORDER BY 2 )
where rownum <= 20;
In summary, two ways were introduced
1) using order by DBMS_RANDOM.VALUE clause
2) using sample([%]) function
The first way has advantage in 'CORRECTNESS' which means you will never fail get result if it actually exists, while in the second way you may get no result even though it has cases satisfying the query condition since information is reduced during sampling.
The second way has advantage in 'EFFICIENT' which mean you will get result faster and give light load to your database.
I was given an warning from DBA that my query using the first way gives loads to the database
You can choose one of two ways according to your interest!
In case of huge tables standard way with sorting by dbms_random.value is not effective because you need to scan whole table and dbms_random.value is pretty slow function and requires context switches. For such cases, there are 3 additional methods:
1: Use sample clause:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
for example:
select *
from s1 sample block(1)
order by dbms_random.value
fetch first 1 rows only
ie get 1% of all blocks, then sort them randomly and return just 1 row.
2: if you have an index/primary key on the column with normal distribution, you can get min and max values, get random value in this range and get first row with a value greater or equal than that randomly generated value.
Example:
--big table with 1 mln rows with primary key on ID with normal distribution:
Create table s1(id primary key,padding) as
select level, rpad('x',100,'x')
from dual
connect by level<=1e6;
select *
from s1
where id>=(select
dbms_random.value(
(select min(id) from s1),
(select max(id) from s1)
)
from dual)
order by id
fetch first 1 rows only;
3: get random table block, generate rowid and get row from the table by this rowid:
select *
from s1
where rowid = (
select
DBMS_ROWID.ROWID_CREATE (
1,
objd,
file#,
block#,
1)
from
(
select/*+ rule */ file#,block#,objd
from v$bh b
where b.objd in (select o.data_object_id from user_objects o where object_name='S1' /* table_name */)
order by dbms_random.value
fetch first 1 rows only
)
);
To randomly select 20 rows I think you'd be better off selecting the lot of them randomly ordered and selecting the first 20 of that set.
Something like:
Select *
from (select *
from table
order by dbms_random.value) -- you can also use DBMS_RANDOM.RANDOM
where rownum < 21;
Best used for small tables to avoid selecting large chunks of data only to discard most of it.
Here's how to pick a random sample out of each group:
SELECT GROUPING_COLUMN,
MIN (COLUMN_NAME) KEEP (DENSE_RANK FIRST ORDER BY DBMS_RANDOM.VALUE)
AS RANDOM_SAMPLE
FROM TABLE_NAME
GROUP BY GROUPING_COLUMN
ORDER BY GROUPING_COLUMN;
I'm not sure how efficient it is, but if you have a lot of categories and sub-categories, this seems to do the job nicely.
-- Q. How to find Random 50% records from table ?
when we want percent wise randomly data
SELECT *
FROM (
SELECT *
FROM table_name
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum <= (select count(*) from table_name) * 50/100;