How do I improve this Stored Procedure? - oracle

I have a question:
Assume an assembly line where a bike goes through some tests, and the devices then send the test information to
our database (in Oracle). I created this stored procedure; it works correctly for what I want, which is:
It gets a list of the first test (per type of test) that a bike has gone through. For instance, if a bike had 2 tests of the same type, it only
shows the first one, AND it shows it only when that first test falls between the dates specified by the user. I also look back 2 months before the start date,
because a bike cannot spend more than 2 months (I'm probably overestimating) on the assembly line. If the user searches a 2-day range, for instance, and I only look within those days, I would miss a test performed on a bike 3 or 4 days earlier (and so mistake a later test for the first one), and it gets worse if they search by hours.
As I said before, the SP works just fine, but I'm wondering if there's a way to optimize it.
Also consider that the table has around 7 million records by the end of the year, so I cannot query the whole year because it could get ugly.
Here's the main part of the stored procedure:
SELECT pid AS "bike_id",
       TYPE AS "type",
       stationnr AS "stationnr",
       testtime AS "testtime",
       rel2.releasenr AS "releasenr",
       placedesc AS description,
       tv.recordtime AS "recordtime",
       To_char(tv.testtime, 'YYYY.MM.DD') AS "dategroup",
       testcounts AS "testcounts",
       tv.result AS "result",
       progressive AS "PROGRESIVO"
FROM (SELECT l_bike_id AS pid,
             l_testcounts AS testcounts,
             To_char(l_testtime, 'yyyy-MM-dd hh24:mi:ss') AS testtimes,
             testtime,
             pl.code AS place,
             t2.recordtime,
             t2.releaseid,
             t2.testresid,
             t2.stationnr,
             t2.result,
             v.TYPE,
             v.progressive,
             v.prs,
             pl.description AS placeDesc
      FROM (SELECT v.bike_id AS l_bike_id,
                   v.TYPE AS l_type,
                   Min(t.testtime) AS l_testtime,
                   Count(t.testtime) AS l_testcounts
            FROM result_test t
                 inner join bikes v
                         ON v.bike_id = t.pid
                 inner join result_release rel
                         ON t.releaseid = rel.releaseid
                 inner join resultconfig.places p
                         ON p.place = t.place
            WHERE t.testtime >= Add_months(Trunc(p_startdate), -2)
            GROUP BY v.bike_id,
                     v.TYPE,
                     p.code)p_bikelist
           inner join result_test t2
                   ON p_bikelist.l_bike_id = t2.pid
                  AND p_bikelist.l_testtime = t2.testtime
           inner join resultconfig.places pl
                   ON pl.place = t2.place
           inner join bikes v
                   ON v.bike_id = t2.pid
           inner join result_release rel2
                   ON t2.releaseid = rel2.releaseid
      ORDER BY t2.pid)tv
     inner join result_release rel2
             ON tv.releaseid = rel2.releaseid
WHERE tv.testtime BETWEEN p_startdate AND p_enddate
ORDER BY testtime;
Thank you for answering!!

I'm struggling a bit to understand the business requirement from the English description you give. The wording suggests that this procedure is intended to work per bike but I don't see any obvious bike_id parameters being supplied, instead, you appear to be returning the earliest result for all bikes tested between given dates. Is that the aim? If it is designed to be run per bike, then ensure bike id gets passed in and used early :)
There is some confusion about your data types. You convert testtime in result_test (presumably a DATE or TIMESTAMP column ) into a string in the p_bikelist subquery but then compare back to the original value in the tv subquery. You further use (presumably typed parameters) p_startdate and p_enddate to filter results. I strongly suspect the conversion in p_bikelist to be unnecessary, and possibly a cause for index avoidance.
Finally, I don't get the add_months logic. By all means, extend the window back in time to get tests that finished within the window but started up to 2 months before the start date, but as written you will exclude the earlier starts anyway because of the condition on tv.testtime. Most likely you'd be better off fudging the startdate earlier in the stored procedure with code like
l_assumedstart := add_months(p_startdate, -2);
and then using l_assumedstart in the query itself.
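To make the role of the widened window concrete, here is a minimal Python sketch of the logic as the question describes it: find the first test per bike and test type over the widened range, then keep only those first tests that fall inside the user's window. The function name, the sample data and the 62-day approximation of two months are assumptions made for illustration; they are not taken from the stored procedure.

```python
from datetime import datetime, timedelta

# Hypothetical lookback: the procedure uses add_months(..., -2);
# 62 days is only an approximation of two calendar months.
LOOKBACK = timedelta(days=62)

def first_tests_in_window(tests, start, end, lookback=LOOKBACK):
    """Return {(bike_id, test_type): first_testtime} for first tests within [start, end].

    tests is an iterable of (bike_id, test_type, testtime) tuples.
    """
    assumed_start = start - lookback  # plays the role of l_assumedstart
    first = {}
    for bike_id, test_type, testtime in tests:
        if testtime < assumed_start:
            continue  # outside the widened scan range
        key = (bike_id, test_type)
        if key not in first or testtime < first[key]:
            first[key] = testtime
    # Apply the user's window only after the first test per group is known.
    return {key: t for key, t in first.items() if start <= t <= end}

tests = [
    (1, "brake", datetime(2023, 1, 8, 9, 0)),    # real first test, before the window
    (1, "brake", datetime(2023, 1, 10, 9, 0)),   # retest inside the window
    (2, "brake", datetime(2023, 1, 10, 11, 0)),  # first test inside the window
]
result = first_tests_in_window(tests, datetime(2023, 1, 9), datetime(2023, 1, 11))
print(result)
```

With the lookback, bike 1 is left out because its real first test happened before the window. With `lookback=timedelta(0)` the retest on 10 January would be reported as a first test, which is the case the question is trying to avoid.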

Related

How to restrict query result from multiple instances of overlapping date ranges in Django ORM

First off, I admit that I am not sure whether what I am trying to achieve is possible (or even logical). Still I am putting forth this query (and if nothing else, at least be told that I need to redesign my table structure / business logic).
In a table (myValueTable) I have the following records:
Item  article  from_date   to_date     myStock
===============================================
1     Paper    01/04/2021  31/12/9999  100
2     Tray     12/04/2021  31/12/9999  12
3     Paper    28/04/2021  31/12/9999  150
4     Paper    06/05/2021  31/12/9999  130
As part of the underlying process, I am to find out the value (of field myStock) as on a particular date, say 30/04/2021 (assuming no inward / outward stock movement in the interim).
To that end, I have the following values:
varRefDate = 30/04/2021
varArticle = "Paper"
And my query goes something like this:
get_value = myValueTable.objects.filter(from_date__lte=varRefDate, to_date__gte=varRefDate).get(article=varArticle).myStock
which should translate to:
get_value = SELECT myStock FROM myValueTable WHERE varRefDate BETWEEN from_date AND to_date
But with this I am coming up with more than one result (actually THREE!).
How do I restrict the query result to get ONLY the 3rd instance i.e. the one with value "150" (for article = "paper")?
NOTE: The upper limit of date range (to_date) is being kept constant at 31/12/9999.
Edit
Solved it, in a roundabout manner. Instead of .get, I generated a values_list with the fields from_date and myStock. Looping over the objects returned, I appended to a list a tuple holding the date difference between the ref date (30/04/2021) and from_date, together with the value of myStock, and then sorted the list in ascending order. The first tuple in the sorted list has the smallest date difference, and its myStock value is the one I am searching for. Tested and works.
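The same idea can be written more directly: among the rows for the article whose range contains the reference date, take the one with the latest from_date. Below is a minimal plain-Python sketch using the sample rows from the question; the helper name `stock_as_of` is hypothetical. In the Django ORM this would presumably correspond to adding `article=varArticle` to the filter, ordering by `-from_date` and calling `.first()`, but that has not been tested against the asker's model.

```python
from datetime import date

FAR_FUTURE = date(9999, 12, 31)

# Sample rows copied from the table in the question.
rows = [
    {"item": 1, "article": "Paper", "from_date": date(2021, 4, 1), "to_date": FAR_FUTURE, "myStock": 100},
    {"item": 2, "article": "Tray", "from_date": date(2021, 4, 12), "to_date": FAR_FUTURE, "myStock": 12},
    {"item": 3, "article": "Paper", "from_date": date(2021, 4, 28), "to_date": FAR_FUTURE, "myStock": 150},
    {"item": 4, "article": "Paper", "from_date": date(2021, 5, 6), "to_date": FAR_FUTURE, "myStock": 130},
]

def stock_as_of(rows, article, ref_date):
    """Return myStock from the most recent row in effect on ref_date, or None."""
    candidates = [
        r for r in rows
        if r["article"] == article and r["from_date"] <= ref_date <= r["to_date"]
    ]
    if not candidates:
        return None
    # The latest from_date is the row with the smallest distance to ref_date.
    return max(candidates, key=lambda r: r["from_date"])["myStock"]

print(stock_as_of(rows, "Paper", date(2021, 4, 30)))
```

Because to_date is always 31/12/9999, the upper bound never narrows the result, so picking the latest from_date is what actually selects a single row.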

Maximo work order total labor costs and total material costs

I'm working with my DBA to try to figure out a way to roll up all costs associated with a work order. Since any work order can have multiple child work orders (through multiple "generations") as well as related work orders (through the RELATEDRECORDS table), I need to be able to get the total of the ACTLABORCOST and ACTMATERIALCOST fields for all child and related work orders (as well as each of their child and related work orders). I've worked through a hierarchical query (using CONNECT BY PRIOR) to get all the children, grandchildren, etc., but I'm stuck on the related work orders. Since every work order can have a related work order with its own children and related work orders, I need an Oracle function that drills down through the children and the related work orders and their children and related work orders. Since I would think that this is something that should be fairly common, I'm hoping that there is someone who has done this and can share what they've done.
Another option would be a recursive query, as suggested by Francisco Sitja. Since my Oracle didn't allow 2 UNION ALLs, I had to join to the WOANCESTOR table in both child queries instead of dedicating a UNION ALL to the WO hierarchy. I was then able to use the one permitted UNION ALL for the RELATEDRECORD hierarchy. And it seems to run pretty quickly.
with mywos (wonum, parent, taskid, worktype, description, origrecordid, woclass, siteid) as (
    -- normal WO hierarchy
    select wo.wonum, wo.parent, wo.taskid, wo.worktype, wo.description, wo.origrecordid, wo.woclass, wo.siteid
    from woancestor a
    join workorder wo
        on a.wonum = wo.wonum
        and a.siteid = wo.siteid
    where a.ancestor = 'MY-STARTING-WONUM'
    union all
    -- WO hierarchy associated via RELATEDRECORD
    select wo.wonum, wo.parent, wo.taskid, wo.worktype, wo.description, wo.origrecordid, wo.woclass, wo.siteid
    from mywos
    join relatedrecord rr
        on mywos.woclass = rr.class
        and mywos.siteid = rr.siteid
        and mywos.wonum = rr.recordkey
        -- prevent cycle / going back up the hierarchy
        and rr.relatetype not in ('ORIGINATOR')
    join woancestor a
        on rr.relatedrecsiteid = a.siteid
        and rr.relatedreckey = a.ancestor
    join workorder wo
        on a.siteid = wo.siteid
        and a.wonum = wo.wonum
)
select * from mywos
;
Have you considered the WOGRANDTOTAL object? Its description in MAXOBJECT is "Non-Persistent table to display WO grandtotals". There is a dialog in the Work Order Tracking application that you can get to from the Select Action / More Actions menu. Since you mentioned it repeatedly, I should note that WOGRANDTOTAL values do not include joins across RELATEDRECORDS to other work order hierarchies.
You can also save yourself the complication of CONNECT BY PRIOR by joining to WOANCESTOR, which is effectively a dump from a CONNECT BY PRIOR query. (There are other %ANCESTOR tables for other hierarchies.)
I think a recursive automation script would be the best way to do what you want, if you need the results in Maximo. If you need the total cost outside of Maximo, maybe a recursive function would work.
We finally figured out how to pull this off.
WITH WO(WONUM,
        PARENT) AS
 ((SELECT X.WONUM,
          X.PARENT
   FROM (SELECT R.RECORDKEY WONUM,
                R.RELATEDRECKEY PARENT
         FROM MAXIMO.RELATEDRECORD R
         WHERE R.RELATEDRECKEY = '382418'
         UNION ALL
         SELECT W.WONUM,
                W.PARENT
         FROM MAXIMO.WORKORDER W
         START WITH W.PARENT = '382418'
         CONNECT BY PRIOR W.WONUM = W.PARENT) X)
  UNION ALL
  SELECT W.WONUM, W.PARENT FROM MAXIMO.WORKORDER W, WO WHERE W.WONUM = WO.PARENT)
SELECT DISTINCT WONUM FROM WO;
This returns a list of all of the child and related work orders for a given work order.
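For the cost roll-up itself, done outside the database, the traversal could look like the sketch below: walk both the parent/child links and the related-record links from a starting work order, visit each work order once, and sum the two cost fields. The dictionaries stand in for WORKORDER and RELATEDRECORD and all values are invented sample data; this is an illustration of the idea, not Maximo's API.

```python
from collections import deque

# Hypothetical sample data: wonum -> (parent, actlaborcost, actmaterialcost)
workorders = {
    "100": (None, 10.0, 5.0),
    "101": ("100", 20.0, 0.0),
    "102": ("101", 5.0, 2.5),
    "200": (None, 7.0, 3.0),   # related to 101 through RELATEDRECORD
    "201": ("200", 1.0, 1.0),
}
# (recordkey, relatedreckey) pairs; the relation is stored in both directions here.
related = [("101", "200"), ("200", "101")]

def rollup(start, workorders, related):
    """Sum labor and material costs over start, its descendants and related work orders."""
    children = {}
    for wonum, (parent, _, _) in workorders.items():
        children.setdefault(parent, []).append(wonum)
    links = {}
    for recordkey, relatedreckey in related:
        links.setdefault(recordkey, []).append(relatedreckey)

    seen = {start}            # guards against cycles between related work orders
    queue = deque([start])
    labor = material = 0.0
    while queue:
        wonum = queue.popleft()
        _, lab, mat = workorders[wonum]
        labor += lab
        material += mat
        for nxt in children.get(wonum, []) + links.get(wonum, []):
            if nxt in workorders and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return labor, material, seen

labor, material, visited = rollup("100", workorders, related)
print(labor, material, sorted(visited))
```

Note that this sketch counts the starting work order's own costs as well; drop it from the sums if only children and related work orders should be included.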

nested for loops in stata

I am having trouble understanding why a for loop construction does not work. I am not really used to for loops, so I apologize if I am missing something basic. Anyhow, I appreciate any piece of advice you might have.
I am using a party level dataset from the parlgov project. I am trying to create a variable which captures how many times a party has been in government before the current observation. Time is important, the counter should be zero if a party has not been in government before, even if after the observation period it entered government multiple times. Parties are nested in countries and in cabinet dates.
The code is as follows:
use "http://eborbath.github.io/stackoverflow/loop.dta", clear //to get the data
if this does not work, I also uploaded in a csv format, try:
import delimited "http://eborbath.github.io/stackoverflow/loop.csv", bindquote(strict) encoding(UTF-8) clear
The loop should go through each country-specific cabinet date, identify the previous observation and check if the party has already been in government. This is how far I have got:
gen date2=cab_date
gen gov_counter=0
levelsof country, local(countries) // to get to the unique values in countries
foreach c of local countries {
    preserve // I think I need this to "re-map" the unique cabinet dates in each country
    keep if country==`c'
    levelsof cab_date, local(dates) // to get to the unique cabinet dates in individual countries
    restore
    foreach i of local dates {
        egen min_date=min(date2) // this is to identify the previous cabinet date
        sort country party_id date2
        bysort country party_id: replace gov_counter=gov_counter+1 if date2==min_date & cabinet_party[_n-1]==1 // this should be the counter
        bysort country: replace date2=. if date2==min_date // this is to drop the observation which was counted
        drop min_date //before I restart the nested loop, so that it again gets to the minimum value in `dates'
    }
}
The code works without an error, but it does not do the job. Evidently there's a mistake somewhere, I am just not sure where.
BTW, it's a specific application of a problem I super often encounter: how do you count frequencies of distinct values in a multilevel data structure? This is slightly more specific, to the extent that "time matters", and it should not just sum all encounters. Let me know if you have an easier solution for this.
Thanks!
The problem with your loop is that it does not keep the replaced gov_counter after the loop. However, there is a much easier solution I'd recommend:
sort country party_id cab_date
by country party_id: gen gov_counter=sum(cabinet_party[_n-1])
This sorts the data into groups and then creates a sum by group, always up to (but not including) the current observation.
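For readers less familiar with Stata's `sum()` and `[_n-1]`, the same computation can be sketched in Python: sort within each country and party by cabinet date, and give every observation the number of earlier observations in which the party was in government. The function name and the sample rows are made up for illustration; only the column names come from the question.

```python
from collections import defaultdict

def prior_gov_counts(rows):
    """Add gov_counter: number of earlier cabinets in which the party was in government.

    rows is a list of dicts with keys country, party_id, cab_date, cabinet_party (0 or 1).
    """
    running = defaultdict(int)  # (country, party_id) -> participations seen so far
    out = []
    for row in sorted(rows, key=lambda r: (r["country"], r["party_id"], r["cab_date"])):
        key = (row["country"], row["party_id"])
        # Record the count before adding the current observation,
        # which mirrors summing cabinet_party[_n-1].
        out.append({**row, "gov_counter": running[key]})
        running[key] += row["cabinet_party"]
    return out

# Hypothetical data: cab_date given as sortable ISO strings.
rows = [
    {"country": 1, "party_id": "A", "cab_date": "2001-01-01", "cabinet_party": 1},
    {"country": 1, "party_id": "A", "cab_date": "2005-01-01", "cabinet_party": 0},
    {"country": 1, "party_id": "A", "cab_date": "2009-01-01", "cabinet_party": 1},
    {"country": 1, "party_id": "A", "cab_date": "2013-01-01", "cabinet_party": 1},
    {"country": 1, "party_id": "B", "cab_date": "2001-01-01", "cabinet_party": 0},
    {"country": 1, "party_id": "B", "cab_date": "2005-01-01", "cabinet_party": 1},
]
counted = prior_gov_counts(rows)
print([(r["party_id"], r["cab_date"], r["gov_counter"]) for r in counted])
```

The counter for the first observation of each party is 0, just as the Stata version treats the missing `cabinet_party[_n-1]` of the first row in a by-group as contributing nothing to the running sum.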
I would start here. I have stripped the comments so that we can look at the code. I have made some tiny cosmetic alterations.
foreach i of local dates {
    egen min_date = min(date2)
    sort country party_id date2
    bysort country party_id: replace gov_counter=gov_counter+1 ///
        if date2 == min_date & cabinet_party[_n-1] == 1
    bysort country: replace date2 = . if date2 == min_date
    drop min_date
}
This loop includes no reference to the loop index i defined in the foreach statement. So, the code is the same and completely unaffected by the loop index. The variable min_date is just a constant for the dataset and the same each time around the loop. What does depend on how many times the loop is executed is how many times the counter is incremented.
The fallacy here appears to be a false analogy with constructs in other software, in which a loop automatically spawns separate calculations for different values of a loop index.
It's not illegal for loop contents never to refer to the loop index, as is easy to see
forval j = 1/3 {
di "Hurray"
}
produces
Hurray
Hurray
Hurray
But if you want different calculations for different values of the loop index, that has to be explicit.

Oracle: getting non unique duplicates with group by ... having count

I'm trying to build a query that shows only non-unique duplicates. I've already built a query that shows all the records coming into consideration:
SELECT tbl_tm.title, lp_index.starttime, musicsound.archnr
FROM tbl_tm
INNER JOIN musicsound on tbl_tm.fk_tbl_tm_musicsound = musicsound.pk_musicsound
INNER JOIN lp_index ON musicsound.pk_musicsound = lp_index.fk_index_musicsound
INNER JOIN plan ON lp_index.fk_index_plan = plan.pk_plan
WHERE tbl_tm.FK_tbl_tm_title_type_music = '22' AND plan.airdate
BETWEEN to_date ('15-01-13') AND to_date('17-01-13')
GROUP BY tbl_tm.title, lp_index.starttime, musicsound.archnr
HAVING COUNT (tbl_tm.title) > 0;
The corresponding result set looks like this:
title starttime archnr
============================================
Pumped up kicks 05:05:37 0616866
People Help The People 05:09:13 0620176
I can't dance 05:12:43 0600109
Locked Out Of Heaven 05:36:08 0620101
China in your hand 05:41:33 0600053
Locked Out Of Heaven 08:52:50 0620101
It gives me music titles played between a certain timespan along with their starting time and archive ID.
What I want to achieve is something like this:
title starttime archnr
============================================
Locked Out Of Heaven 05:36:08 0620101
Locked Out Of Heaven 08:52:50 0620101
Only two rows would be left: both share the same title and archive number but differ in the time part. Increasing the 'HAVING COUNT' value gives me a zero-row result set, since there aren't any entries that are exactly the same.
What I've found out so far is that the solution for this problem will most likely have a nested subquery, but I can't seem to get it done. Any help on this would be greatly appreciated.
Note: I'm on an Oracle 11g server. My user has read-only privileges. I use SQL Developer on my workstation.
You can try something like this:
SELECT title, starttime, archnr
FROM (
SELECT title, starttime, archnr, count(*) over (partition by title) cnt
FROM (your_query))
WHERE cnt > 1
Here is a sqlfiddle demo
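As a self-contained way to try the analytic-count approach without access to the Oracle schema, the sketch below runs the same pattern against an in-memory SQLite database from Python, loaded with the six rows from the question. The table name `plays` is an assumption, and window functions require SQLite 3.25 or newer; the Oracle syntax for `COUNT(*) OVER (PARTITION BY ...)` is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table standing in for the result of the original join.
conn.execute("CREATE TABLE plays (title TEXT, starttime TEXT, archnr TEXT)")
conn.executemany(
    "INSERT INTO plays VALUES (?, ?, ?)",
    [
        ("Pumped up kicks", "05:05:37", "0616866"),
        ("People Help The People", "05:09:13", "0620176"),
        ("I can't dance", "05:12:43", "0600109"),
        ("Locked Out Of Heaven", "05:36:08", "0620101"),
        ("China in your hand", "05:41:33", "0600053"),
        ("Locked Out Of Heaven", "08:52:50", "0620101"),
    ],
)

# Count rows per title without collapsing them, then keep titles seen more than once.
dupes = conn.execute(
    """
    SELECT title, starttime, archnr
    FROM (
        SELECT title, starttime, archnr,
               COUNT(*) OVER (PARTITION BY title) AS cnt
        FROM plays
    )
    WHERE cnt > 1
    ORDER BY starttime
    """
).fetchall()

for row in dupes:
    print(row)
```

Unlike GROUP BY with HAVING, the window count keeps every original row, so both play times of the repeated title survive the filter.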

Using Linq to bring back last 3,4...n orders for every customer

I have a database with customers orders.
I want to use Linq (to EF) to query the db to bring back the last(most recent) 3,4...n orders for every customer.
Note:
Customer 1 may have just made 12 orders in the last hr; but customer 2 may not have made any since last week.
I can't for the life of me work out how to write a query in Linq (lambda expressions) to get the data set back.
Any good ideas?
Edit:
Customers and orders is a simplification. The table I am querying is actually a record of outbound messages to various web services. It just seemed easier to describe as customers and orders. The relationship is the same.
I am building a task that checks the last n messages for each web service to see if there were any failures. We want a semi-real-time health status of the web services.
@CoreySunwold
My table Looks a bit like this:
MessageID, WebserviceID, SentTime, Status, Message, Error,
Or from a customer/order context if it makes it easier:
OrderID, CustomerID, StatusChangedDate, Status, WidgetName, Comments
Edit 2:
I eventually worked out something
(Hat tip to @StephenChung who basically came up with the exact same, but in classic linq)
var q = myTable.Where(d => d.EndTime > DateTime.Now.AddDays(-1))
    .GroupBy(g => g.ConfigID)
    .Select(g => new
    {
        ConfigID = g.Key,
        Data = g.OrderByDescending(d => d.EndTime)
            .Take(3).Select(s => new
            {
                s.Status,
                s.SentTime
            })
    }).ToList();
It does take a while to execute. So I am not sure if this is the most efficient expression.
This should give the last 3 orders of each customer (if having orders at all):
from o in db.Orders
group o by o.CustomerID into g
select new {
CustomerID=g.Key,
LastOrders=g.OrderByDescending(o => o.TimeEntered).Take(3).ToList()
}
However, I suspect this will force the database to return the entire Orders table before picking out the last 3 for each customer. Check the SQL generated.
If you need to optimize, you'll have to manually construct a SQL to only return up to the last 3, then make it into a view.
You can use SelectMany for this purpose:
customers.SelectMany(x=>x.orders.OrderByDescending(y=>y.Date).Take(n)).ToList();
How about this? I know it'll work with regular collections but don't know about EF.
yourCollection.OrderByDescending(item=>item.Date).Take(n);
var ordersByCustomer =
db.Customers.Select(c=>c.Orders.OrderByDescending(o=>o.OrderID).Take(n));
This will return the orders grouped by customer.
var orders = orders.Where(x => x.CustomerID == 1).OrderByDescending(x=>x.Date).Take(4);
This will take last 4 orders. Specific query depends on your table / entity structure.
Btw: You can take x as a order. So you can read it like: Get orders where order.CustomerID is equal to 1, OrderThem by order.Date and take first 4 'rows'.
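Somebody might correct me here, but I think doing this in LINQ with a single query is probably very difficult if not impossible. I would use a stored procedure and something like this
select
    *
from
    (
    select
        c.id as customer_id
        ,o.*
        ,RANK() OVER (PARTITION BY c.id ORDER BY o.order_time DESC) AS rnk
    from
        customers c
    inner join
        orders o -- ORDER is a reserved word, so the table is assumed to be named orders
    on
        o.cust_id = c.id
    ) ranked
where
    rnk <= 10 -- this is "n"; the rank is filtered in an outer query because WHERE cannot see a window function of the same query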
I've not used this syntax for a while so it might not be quite right, but if i understand the question then i think this is the best approach.
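To check the ranking approach end to end, here is a small sketch that runs it against an in-memory SQLite database from Python. It uses ROW_NUMBER() rather than RANK() so that ties in the timestamp cannot return more than n rows per customer. The table layout, column names and sample rows are assumptions for illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

def last_n_orders(conn, n):
    """Return the n most recent orders per customer as (cust_id, order_id, order_time)."""
    return conn.execute(
        """
        SELECT cust_id, order_id, order_time
        FROM (
            SELECT cust_id, order_id, order_time,
                   ROW_NUMBER() OVER (
                       PARTITION BY cust_id ORDER BY order_time DESC
                   ) AS rn
            FROM orders
        )
        WHERE rn <= ?
        ORDER BY cust_id, order_time DESC
        """,
        (n,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per order.
conn.execute("CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, order_time TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 1, "2023-01-01 10:00"),
        (2, 1, "2023-01-01 11:00"),
        (3, 1, "2023-01-01 12:00"),
        (4, 1, "2023-01-01 13:00"),
        (5, 2, "2022-12-25 09:00"),  # customer 2 has only one, older order
    ],
)

top = last_n_orders(conn, 3)
for row in top:
    print(row)
```

Customer 1 gets its three newest orders and customer 2 gets its single order, regardless of how long ago it was placed, which is the behaviour asked for in the question.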
