Oracle query to get max hour every day, and corresponding row values

I'm having a hard time creating a query to do the following:
I have this table, called LOG:
ID | INSERT_TIME | LOG_VALUE
----------------------------------------
1 | 2013-04-29 18:00:00.000 | 160473
2 | 2013-04-29 21:00:00.000 | 154281
3 | 2013-04-30 09:00:00.000 | 186552
4 | 2013-04-30 14:00:00.000 | 173145
5 | 2013-04-30 14:30:00.000 | 102235
6 | 2013-05-01 11:00:00.000 | 201541
7 | 2013-05-01 23:00:00.000 | 195234
What I want to do is build a query that returns, for each day, the last value inserted (the row with the max INSERT_TIME). I'm only interested in the date part of that column and in the LOG_VALUE column. So this would be my result set after running the query:
2013-04-29 154281
2013-04-30 102235
2013-05-01 195234
I guess that I need to use GROUP BY over the INSERT_TIME column, along with MAX() function, but by doing that, I can't seem to get the LOG_VALUE. Can anyone help me on this, please?
(I'm on Oracle 10g)

SELECT trunc(insert_time),
       log_value
FROM  (SELECT insert_time,
              log_value,
              rank() over (partition by trunc(insert_time)
                           order by insert_time desc) rnk
       FROM   log)
WHERE  rnk = 1
is one option. This uses the analytic function rank to identify the row with the latest insert_time on each day. Note that if two rows share the same latest insert_time on a day, rank will return both; use row_number instead if you need exactly one row per day.
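For completeness, the GROUP BY route the question was reaching for also works on 10g, using Oracle's KEEP (DENSE_RANK LAST) aggregate. A minimal sketch against the LOG table above:
-- Picks the LOG_VALUE belonging to the latest INSERT_TIME within each day's group.
SELECT trunc(insert_time) AS log_date,
       max(log_value) KEEP (DENSE_RANK LAST ORDER BY insert_time) AS log_value
FROM   log
GROUP  BY trunc(insert_time);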


What is the best way to derive each row's end date from the next row's start date?

I am trying to display data as such:
In our database we have events (with unique ID) and then a start date. The events do not overlap, and each one starts on the date the last one ended. However we don't have 'end date' in the database.
I have to feed the data into another system so that it shows event ID, start date, and end date (which is just the next start date).
I want to avoid creating a custom view as that's really frowned upon here for this database. So I'm wondering if there's a good way to do this in a query.
Essentially it would be:
EventA | Date1 | Date2
EventB | Date2 | Date3
EventC | Date3 | Date4
The events are planned years in advance, and I only need to pull the next few months for the query, so there is no worry about running out of 'next event start dates'. In case it matters, this query will be part of a webservice call.
The basic pseudo code for event and date would be:
select Event.ID, Event.StartDate
from Event
where Event.StartDate > sysdate and Event.StartDate < sysdate+90
Essentially I want to take the next row's Event.StartDate and make it the current row's Event.EndDate
Use the LEAD analytic function:
Oracle Setup:
A table with 10 rows:
CREATE TABLE Event ( ID, StartDate ) AS
SELECT LEVEL, TRUNC( SYSDATE ) + LEVEL
FROM DUAL
CONNECT BY LEVEL <= 10;
Query:
select ID,
StartDate,
LEAD( StartDate ) OVER ( ORDER BY StartDate ) AS EndDate
from Event
where StartDate > sysdate and StartDate < sysdate+90
Output:
ID | STARTDATE | ENDDATE
-: | :-------- | :--------
1 | 22-JUN-19 | 23-JUN-19
2 | 23-JUN-19 | 24-JUN-19
3 | 24-JUN-19 | 25-JUN-19
4 | 25-JUN-19 | 26-JUN-19
5 | 26-JUN-19 | 27-JUN-19
6 | 27-JUN-19 | 28-JUN-19
7 | 28-JUN-19 | 29-JUN-19
8 | 29-JUN-19 | 30-JUN-19
9 | 30-JUN-19 | 01-JUL-19
10 | 01-JUL-19 | null
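If a NULL end date for the last row is unwanted, LEAD accepts a default as its third argument. A variation on the query above, using an assumed open-ended sentinel date:
select ID,
       StartDate,
       -- the third argument supplies the value used when there is no next row
       LEAD( StartDate, 1, DATE '9999-12-31' ) OVER ( ORDER BY StartDate ) AS EndDate
from   Event
where  StartDate > sysdate and StartDate < sysdate + 90;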

Insert value based on min value greater than value in another row

It's difficult to explain the question well in the title.
I am inserting 6 values from (or based on values in) one row.
I also need to insert a value from a second row where:
The values in one column (ID) must be equal
The values in column (CODE) in the main source row must be IN (100,200), whereas the other row must have a value of 300 or 400
The value in another column (OBJID) in the secondary row must be the lowest value above that in the primary row.
Source Table looks like:
OBJID | CODE | ENTRY_TIME | INFO | ID | USER
---------------------------------------------
1 | 100 | x timestamp| .... | 10 | X
2 | 100 | y timestamp| .... | 11 | Y
3 | 300 | z timestamp| .... | 10 | F
4 | 100 | h timestamp| .... | 10 | X
5 | 300 | g timestamp| .... | 10 | G
So, to provide an example:
In my second table I want to insert OBJID, OBJID2, CODE, ENTRY_TIME, substr(INFO(...)), ID, USER
i.e. from my example a line inserted in the second table would look like:
OBJID | OBJID2 | CODE | ENTRY_TIME | INFO | ID | USER
-----------------------------------------------------------
1 | 3 | 100 | x timestamp| substring | 10 | X
4 | 5 | 100 | h timestamp| substring2| 10 | X
My insert for everything that just comes from one row works fine.
INSERT INTO TABLE2
  (ID, OBJID, INFO, USER, ENTRY_TIME)
SELECT ID, OBJID,
       DECODE(CODE, 100, SUBSTR(INFO, 12, LENGTH(INFO) - 27),
                    600, 'CREATE') INFO,
       USER, ENTRY_TIME
FROM   TABLE1
WHERE  CODE IN (100, 200);
I'm aware that I'll need to use an alias on TABLE1, but I don't know how to get the rest to work, particularly in an efficient way. There are 2 million rows right now, but there will be closer to 20 million once I start using production data.
You could try this (note that PRIMARY is a reserved word in Oracle, so the aliases below use src and nxt instead):
select src.*,
       (select min(objid)
        from   table1 nxt
        where  src.objid < nxt.objid
        and    nxt.code in (300,400)
        and    src.id = nxt.id
       ) objid2
from   table1 src
where  src.code in (100,200);
Ok, I've come up with:
select objid,
       min(case when code in (300,400) then objid end)
         over (partition by id order by objid
               range between 1 following and unbounded following
              ) objid2,
       code, entry_time, info, id, user1
from   table1;
So you need an INSERT ... SELECT over the above query, with WHERE objid2 IS NOT NULL AND code IN (100,200), as sketched below.
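A minimal sketch of that INSERT, assuming TABLE2 has matching columns (the column list here is illustrative):
INSERT INTO table2 (objid, objid2, code, entry_time, info, id, user1)
SELECT objid, objid2, code, entry_time, info, id, user1
FROM  (select objid,
              min(case when code in (300,400) then objid end)
                over (partition by id order by objid
                      range between 1 following and unbounded following
                     ) objid2,
              code, entry_time, info, id, user1
       from   table1)
WHERE  objid2 IS NOT NULL
AND    code IN (100,200);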

ORDER BY subquery and ROWNUM goes against relational philosophy?

Oracle's ROWNUM is applied before ORDER BY. In order to assign ROWNUM according to a sorted column, the following subquery is proposed in all documentation and texts.
select *
from (
  select *
  from   my_table
  order  by price
)
where rownum <= 7
That bugs me. As I understand, table input into FROM is relational, hence no order is stored, meaning the order in the subquery is not respected when seen by FROM.
I cannot remember the exact scenarios, but I have read more than once that "ORDER BY in a subquery has no effect in the outer query". Examples include in-line subqueries, subqueries for INSERT, the ORDER BY of a PARTITION clause, etc. For example, in
OVER (PARTITION BY name ORDER BY salary)
the salary order will not be respected in the outer query, and if we want salary to be sorted in the outer query's output, another ORDER BY needs to be added to the outer query.
Can anyone share insights on why the relational property is not respected here, and the order is preserved in the subquery?
The ORDER BY in this context is in effect Oracle's proprietary syntax for generating an "ordered" row number on a (logically) unordered set of rows. This is a poorly designed feature in my opinion but the equivalent ISO standard SQL ROW_NUMBER() function (also valid in Oracle) may make it clearer what is happening:
select *
from (
  select ROW_NUMBER() OVER (ORDER BY price) rn, t.*
  from   my_table t
)
where rn <= 7;
In this example the ORDER BY goes where it more logically belongs: as part of the specification of a derived row number attribute. This is more powerful than Oracle's version because you can specify several different orderings defining different row numbers in the same result. The actual ordering of rows returned by this query is undefined. I believe that's also true in your Oracle-specific version of the query because no guarantee of ordering is made when you use ORDER BY in that way.
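To make the "several different orderings" point concrete, here is a small sketch; the entry_date column is assumed for illustration:
select t.*,
       ROW_NUMBER() OVER (ORDER BY price)      AS rn_by_price,
       ROW_NUMBER() OVER (ORDER BY entry_date) AS rn_by_date  -- assumed column
from   my_table t;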
It's worth remembering that Oracle is not a Relational DBMS. In common with other SQL DBMSs Oracle departs from the relational model in some fundamental ways. Features like implicit ordering and DISTINCT exist in the product precisely because of the non-relational nature of the SQL model of data and the consequent need to work around keyless tables with duplicate rows.
Not surprisingly really, Oracle treats this as a bit of a special case. You can see that from the execution plan. With the naive (incorrect/indeterminate) version of the limit that crops up sometimes, you get SORT ORDER BY and COUNT STOPKEY operations:
select *
from my_table
where rownum <= 7
order by price;
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 13 | 3 (34)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | TABLE ACCESS FULL| MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM<=7)
If you just use an ordered subquery, with no limit, you only get the SORT ORDER BY operation:
select *
from (
select *
from my_table
order by price
);
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 13 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------
With the usual subquery/ROWNUM construct you get something different:
select *
from (
select *
from my_table
order by price
)
where rownum <= 7;
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 13 | 3 (34)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 13 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL | MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=7)
3 - filter(ROWNUM<=7)
The COUNT STOPKEY operation is still there for the outer query, but the inner query (inline view, or derived table) now has a SORT ORDER BY STOPKEY instead of the simple SORT ORDER BY. This is all hidden away in the internals so I'm speculating, but it looks like the stop key - i.e. the row number limit - is being pushed into the subquery processing, so in effect the subquery may only end up with seven rows anyway - though the plan's ROWS value doesn't reflect that (but then you get the same plan with a different limit), and it still feels the need to apply the COUNT STOPKEY operation separately.
Tom Kyte covered similar ground in an Oracle Magazine article, when talking about "Top-N Query Processing with ROWNUM" (emphasis added):
There are two ways to approach this:
- Have the client application run that query and fetch just the first N rows.
- Use that query as an inline view, and use ROWNUM to limit the results, as in SELECT * FROM ( your_query_here ) WHERE ROWNUM <= N.
The second approach is by far superior to the first, for two reasons. The lesser of the two reasons is that it requires less work by the client, because the database takes care of limiting the result set. The more important reason is the special processing the database can do to give you just the top N rows. Using the top-N query means that you have given the database extra information. You have told it, "I'm interested only in getting N rows; I'll never consider the rest." Now, that doesn't sound too earth-shattering until you think about sorting—how sorts work and what the server would need to do.
... and then goes on to outline what it's actually doing, rather more authoritatively than I can.
Interestingly I don't think the order of the final result set is actually guaranteed; it always seems to work, but arguably you should still have an ORDER BY on the outer query too to make it complete. It looks like the order isn't really stored in the subquery, it just happens to be produced like that. (I very much doubt that will ever change as it would break too many things; this ends up looking similar to a table collection expression which also always seems to retain its ordering - breaking that would stop dbms_xplan working though. I'm sure there are other examples.)
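In other words, a belt-and-braces version repeats the ORDER BY on the outer query, so the final ordering is guaranteed rather than incidental:
select *
from (
  select *
  from   my_table
  order  by price
)
where rownum <= 7
order by price;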
Just for comparison, this is what the ROW_NUMBER() equivalent does:
select *
from (
select ROW_NUMBER() OVER (ORDER BY price) rn, my_table.*
from my_table
) t
where rn <= 7;
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 4 (25)| 00:00:01 |
|* 1 | VIEW | | 2 | 52 | 4 (25)| 00:00:01 |
|* 2 | WINDOW SORT PUSHED RANK| | 2 | 26 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL | MY_TABLE | 2 | 26 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN"<=7)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "PRICE")<=7)
Adding to sqlvogel's good answer:
"As I understand, table input into FROM is relational"
No, table input into FROM is not relational. It is not relational because "table inputs" are tables, and tables are not relations. The myriad quirks and oddities in SQL eventually all boil down to that simple fact: the core building brick in SQL is the table, and a table is not a relation. To sum up the differences:
Tables can contain duplicate rows, relations cannot. (As a consequence, SQL offers bag algebra, not relational algebra. As another consequence, it is as good as impossible for SQL to even define equality comparison for its most basic building brick! How would you compare tables for equality given that you might have to deal with duplicate rows?)
Tables can contain unnamed columns, relations cannot: SELECT X+Y FROM ... As a consequence, SQL is forced into "column identity by ordinal position", and as a consequence of that, you get all sorts of quirks, e.g. in SELECT A,B FROM ... UNION SELECT B,A FROM ... (see the sketch after this list).
Tables can contain duplicate column names, relations cannot. A.ID and B.ID in a table are not distinct column names. The part before the dot is not part of the name, it is a "scope identifier", and that scope identifier "disappears" once you're "outside the SELECT" it appears/is introduced in. You can verify this with a nested SELECT : SELECT A.ID FROM (SELECT A.ID, B.ID FROM ...). It won't work (unless your particular implementation departs from the standard in order to make it work).
Various SQL constructs leave people with the impression that tables do have an ordering to rows. The ORDER BY clause, obviously, but also the GROUP BY clause (which can be made to work only by introducing rather dodgy concepts of "intermediate tables with rows grouped together"). Relations simply are not like that.
Tables can contain NULLs, relations cannot. This one has been beaten to death.
There should be some more, but I don't remember them off the top of my head.
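A small sketch of the ordinal-position quirk from the UNION point above; columns are matched by position, not by name:
-- The result's columns are named A and B after the first branch;
-- the second branch's B feeds column A, and its A feeds column B.
SELECT 1 AS a, 2 AS b FROM dual
UNION
SELECT b, a FROM (SELECT 3 AS a, 4 AS b FROM dual);
-- Returns the rows (1, 2) and (4, 3).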

Create a table of dates in rows

I have an Oracle database,
and I want to create a table with two columns: one containing an id and the other containing incremented dates in rows.
I want to specify the limit dates (from and to) in my PL/SQL code, and the code will generate the rows between the two limit dates.
This is an example output result :
+-----+--------------------+
| id |dates |
+-----+--------------------+
| 1 |01/02/2011 04:00:00 |
+-----+--------------------+
| 2 |01/02/2011 05:00:00 |
+-----+--------------------+
| 3 |01/02/2011 06:00:00 |
+-----+--------------------+
| 4 |01/02/2011 07:00:00 |
+-----+--------------------+
| 5 |01/02/2011 08:00:00 |
....
...
..
| 334 |05/03/2011 23:00:00 |
+-----+--------------------+
You haven't exactly deluged us with details, but this is the sort of construct you want:
select level as id
     , &&start_date + ((level - 1) * (1/24)) as dates
from   dual
connect by level <= ((&&end_date - &&start_date) * 24)
/
This assumes your input values are whole days; you will need to adjust the maths if your start or end date contains a time component.
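For reference, a runnable version with literal bounds in place of the &&start_date / &&end_date substitution variables (the dates here are just an assumed example):
select level as id,
       DATE '2011-02-01' + (level - 1) / 24 as dates
from   dual
connect by level <= (DATE '2011-02-03' - DATE '2011-02-01') * 24;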
You would need to start with a date baseline:
vBaselineDate := TRUNC(SYSDATE);
OR
vBaselineDate := TO_DATE('28-03-2013 12:00:00', 'DD-MM-YYYY HH:MI:SS');
Then increment the baseline by adding fractions of a day, depending on how large you want the increments to be, e.g. 1 minute, 1 hour, etc.
FOR i IN 1..334 LOOP
  INSERT INTO mytable
    (id, dates)
  VALUES
    (i, vBaselineDate + i/24);
END LOOP;
COMMIT;
1/24 = 1 hour.
1/1440 = 1 minute.
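Putting it together as a complete anonymous block (the table mytable is assumed to exist with columns id and dates):
DECLARE
  vBaselineDate DATE := TRUNC(SYSDATE);
BEGIN
  FOR i IN 1 .. 334 LOOP
    INSERT INTO mytable (id, dates)
    VALUES (i, vBaselineDate + i/24);  -- i/24 adds i hours to the baseline
  END LOOP;
  COMMIT;
END;
/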
Hope this helps.

Query taking too long to execute from a PL/SQL block

This query takes 1 second when executed on its own, but 20 seconds when the same query is executed through a procedure. Please help me with this.
SELECT * FROM
(SELECT TAB1.*,ROWNUM ROWNUMM FROM
(SELECT wh.workitem_id, wh.workitem_priority, wh.workitem_type_id, wt.workitem_type_nm,
wh.workitem_status_id, ws.workitem_status_nm, wh.analyst_group_id,
ag.analyst_group_nm, wh.owner_uuid, earnings_estimate.pr_get_name_from_uuid(owner_uuid) owner_name,
wh.create_user_id, earnings_estimate.pr_get_name_from_uuid( wh.create_user_id) create_name, wh.create_ts,
wh.update_user_id,earnings_estimate.pr_get_name_from_uuid(wh.update_user_id) update_name, wh.update_ts, wh.bb_ticker_id, wh.node_id,
wh.eqcv_analyst_uuid, earnings_estimate.pr_get_name_from_uuid( wh.eqcv_analyst_uuid) eqcv_analyst_name,
WH.WORKITEM_NOTE,Wh.PACKAGE_ID ,Wh.COVERAGE_STATUS_NUM ,CS.COVERAGE_STATUS_CD ,Wh.COVERAGE_REC_NUM,I.INDUSTRY_CD INDUSTRY_CODE,I.INDUSTRY_NM
INDUSTRY_NAME,WOT.WORKITEM_OUTLIER_TYPE_NM as WORKITEM_SUBTYPE_NM
,count(1) over() AS total_count,bro.BB_ID BROKER_BB_ID,bro.BROKER_NM BROKER_NAME, wh.assigned_analyst_uuid,earnings_estimate.pr_get_name_from_uuid(wh.assigned_analyst_uuid)
assigned_analyst_name
FROM earnings_estimate.workitem_type wt,
earnings_estimate.workitem_status ws,
earnings_estimate.workitem_outlier_type wot,
(SELECT * FROM (
SELECT WH.ASSIGNED_ANALYST_UUID,WH.DEFERRED_TO_DT,WH.WORKITEM_NOTE,WH.UPDATE_USER_ID,EARNINGS_ESTIMATE.PR_GET_NAME_FROM_UUID(WH.UPDATE_USER_ID)
UPDATE_NAME, WH.UPDATE_TS,WH.OWNER_UUID, EARNINGS_ESTIMATE.PR_GET_NAME_FROM_UUID(OWNER_UUID)
OWNER_NAME,WH.ANALYST_GROUP_ID,WH.WORKITEM_STATUS_ID,WH.WORKITEM_PRIORITY,EARNINGS_ESTIMATE.PR_GET_NAME_FROM_UUID( WI.CREATE_USER_ID) CREATE_NAME, WI.CREATE_TS,
wi.create_user_id,wi.workitem_type_id,wi.workitem_id,RANK() OVER (PARTITION BY WH.WORKITEM_ID ORDER BY WH.CREATE_TS DESC NULLS LAST, ROWNUM) R,
wo.bb_ticker_id, wo.node_id,wo.eqcv_analyst_uuid,
WO.PACKAGE_ID ,WO.COVERAGE_STATUS_NUM ,WO.COVERAGE_REC_NUM,
wo.workitem_outlier_type_id
FROM earnings_estimate.workitem_history wh
JOIN EARNINGS_ESTIMATE.workitem_outlier wo
ON wh.workitem_id=wo.workitem_id
JOIN earnings_estimate.workitem wi
ON wi.workitem_id=wo.workitem_id
AND WI.WORKITEM_TYPE_ID=3
and wh.workitem_status_id not in (1,7)
WHERE ( wo.bb_ticker_id IN (SELECT
column_value from table(v_tickerlist) )
)
)wh
where r=1
AND DECODE(V_DATE_TYPE,'CreatedDate',WH.CREATE_TS,'LastModifiedDate',WH.UPDATE_TS) >= V_START_DATE
AND decode(v_date_type,'CreatedDate',wh.create_ts,'LastModifiedDate',wh.update_ts) <= v_end_date
and decode(wh.owner_uuid,null,-1,wh.owner_uuid)=decode(v_analyst_id,null,decode(wh.owner_uuid,null,-1,wh.owner_uuid),v_analyst_id)
) wh,
earnings_estimate.analyst_group ag,
earnings_estimate.coverage_status cs,
earnings_estimate.research_document rd,
( SELECT
BB.BB_ID ,
BRK.BROKER_ID,
BRK.BROKER_NM
FROM EARNINGS_ESTIMATE.BROKER BRK,COMMON.BB_ID BB
WHERE BRK.ORG_ID = BB.ORG_ID
AND BRK.ORG_LOC_REC_NUM = BB.ORG_LOC_REC_NUM
AND BRK.primary_broker_ind='Y') bro,
earnings_estimate.industry i
WHERE wh.analyst_group_id = ag.analyst_group_id
AND wh.workitem_status_id = ws.workitem_status_id
AND wh.workitem_type_id = wt.workitem_type_id
AND wh.coverage_status_num=cs.coverage_status_num
AND wh.workitem_outlier_type_id=wot.workitem_outlier_type_id
AND wh.PACKAGE_ID=rd.PACKAGE_ID(+)
AND rd.industry_id=i.industry_id(+)
AND rd.BROKER_BB_ID=bro.BB_ID(+)
ORDER BY wh.create_ts)tab1 )
;
I agree that the problem is most likely related to SELECT column_value from table(v_tickerlist).
By default, Oracle estimates that table functions return 8168 rows. Since you're testing the query with a single value, I assume that the actual number of values is usually much smaller. Cardinality estimates, like any forecast, are always wrong. But they should at least be in the ballpark of the actual cardinality for the optimizer to do its job properly.
You can force Oracle to always check the size with dynamic sampling. This will require more time to generate the plan, but it will probably be worth it in this case.
For example:
SQL> --Sample type
SQL> create or replace type v_tickerlist is table of number;
2 /
Type created.
SQL> --Show explain plans
SQL> set autotrace traceonly explain;
SQL> --Default estimate is poor. 8168 estimated, versus 3 actual.
SQL> SELECT column_value from table(v_tickerlist(1,2,3));
Execution Plan
----------------------------------------------------------
Plan hash value: 1748000095
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 16 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 8168 | 16336 | 16 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
SQL> --Estimate is perfect when dynamic sampling is used.
SQL> SELECT /*+ dynamic_sampling(tickerlist, 2) */ column_value
2 from table(v_tickerlist(1,2,3)) tickerlist;
Execution Plan
----------------------------------------------------------
Plan hash value: 1748000095
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 6 | 6 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 3 | 6 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
SQL>
If that doesn't help, look at your explain plan (and post it here). Find where the cardinality estimate is most wrong, then try to figure out why that is.
Your query is too big and will take time to execute on bulk data. Try creating a few denormalised temp tables, extract the data into them, and then join between the temp tables. That will improve performance.
With this standalone query, do not pass any variable inside the subqueries, as in the line below:
WHERE ( wo.bb_ticker_id IN (SELECT
column_value from table(v_tickerlist)
Also, the outer joins will hurt performance. Better to implement the denormalised temp tables, along the lines sketched below.
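A hedged sketch of that temp-table idea: materialise the collection into a global temporary table so the optimizer sees a real cardinality (the table and column names here are illustrative):
-- One-time DDL, not per call:
CREATE GLOBAL TEMPORARY TABLE gtt_tickerlist (
  bb_ticker_id NUMBER
) ON COMMIT PRESERVE ROWS;

-- Inside the procedure, before the main query:
INSERT INTO gtt_tickerlist (bb_ticker_id)
SELECT column_value FROM TABLE(v_tickerlist);

-- Then in the main query, replace the collection with the temp table:
--   WHERE wo.bb_ticker_id IN (SELECT bb_ticker_id FROM gtt_tickerlist)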
