I'm trying to build a query that shows only non-unique duplicates. I've already built a query that shows all the records coming into consideration:
SELECT tbl_tm.title, lp_index.starttime, musicsound.archnr
FROM tbl_tm
INNER JOIN musicsound on tbl_tm.fk_tbl_tm_musicsound = musicsound.pk_musicsound
INNER JOIN lp_index ON musicsound.pk_musicsound = lp_index.fk_index_musicsound
INNER JOIN plan ON lp_index.fk_index_plan = plan.pk_plan
WHERE tbl_tm.FK_tbl_tm_title_type_music = '22' AND plan.airdate
BETWEEN to_date ('15-01-13') AND to_date('17-01-13')
GROUP BY tbl_tm.title, lp_index.starttime, musicsound.archnr
HAVING COUNT (tbl_tm.title) > 0;
The corresponding result set looks like this:
title starttime archnrr
============================================
Pumped up kicks 05:05:37 0616866
People Help The People 05:09:13 0620176
I can't dance 05:12:43 0600109
Locked Out Of Heaven 05:36:08 0620101
China in your hand 05:41:33 0600053
Locked Out Of Heaven 08:52:50 0620101
It gives me music titles played between a certain timespan along with their starting time and archive ID.
What I want to achieve is something like this:
title starttime archnr
============================================
Locked Out Of Heaven 05:36:08 0620101
Locked Out Of Heaven 08:52:50 0620101
There would only be two columns left: both share the same title and archive number but differ in the time part. Increasing the 'HAVING COUNT' value will give me a zero-row
result set, since there aren't any entries that are exactly the same.
What I've found out so far is that the solution for this problem will most likely have a nested subquery, but I can't seem to get it done. Any help on this would be greatly appreciated.
Note: I'm on a Oracle 11g-server. My user has read-only privileges. I use SQL Developer on my workstation.
You can try something like this:
SELECT title, starttime, archnr
FROM (
SELECT title, starttime, archnr, count(*) over (partition by title) cnt
FROM (your_query))
WHERE cnt > 1
Here is a sqlfiddle demo
Related
I'm brand-spakin' new to SQL and was asked to help write a query for a report. I need to limit the data to the last 10 services done by a clinician, and then subtotal the difference between the two times (time in and out) for each clinician.
I'm guessing I need to do a "LIMIT" clause to limit the data, but I'm not sure how or where to put that information. I am also thinking I need to use "GROUP BY", but not positive on that either. Any help would be appreciated.
I tried simplifying the existing query that my boss started but I'm getting error messages about the GROUP BY clause because I don't have an aggregate.
Select CV.emp_name,
CV.Visittype,
CVt.clientvisit_id,
CV.client_id,
CV.rev_timein,
CV.rev_timeout,
Convert(varchar(25),Cast(CV.rev_timein As Time),8) As Start_Time,
CV.program_id,
CV.cptcode
From ClientVisit CV
Where CV.visittype = 'Mobile Therapy' And CV.program_id = 31
And CV.cptcode <> 'NB' And CV.rev_timein <=
Convert(datetime,IsNull(#param2, GetDate())) And CV.rev_timein >=
Convert(datetime,IsNull(#param1, GetDate())) And
Cast(CV.rev_timein As time) > '15:59'
Group By CV.emp_name,
CV.rev_timein
I'm working with my DBA to try to figure out a way to roll up all costs associated with a Work Order. Since any work Order can have multiple child work orders (through multiple "generations") as well as related work orders (through the RELATEDRECORDS table), I need to be able to get the total of the ACTLABORCOST and ACTMATERIALCOST fields for all child and related work orders (as well as each of their child and related work orders). I've worked though a hierarchical query (using CONNECT BY PRIOR) to get all the children, grandchildren, etc., but I'm stuck on the related work orders. Since every work order can have a related work order with it's own children and related work orders, I need an Oracle function that drills down through the children and the related work orders and their children and related work orders. Since I would think that this is something that should be fairly common, I'm hoping that there is someone who has done this and can share what they've done.
Another option would be a recursive query, as suggested by Francisco Sitja. Since my Oracle didn't allow 2 UNION ALLs, I had to joint to the WOANCESTOR table in both child queries instead of dedicating a UNION ALL for doing the WO hierarchy. I was then able to use the one permitted UNION ALL for doing the RELATEDRECORD hierarchy. And it seems to run pretty quickly.
with mywos (wonum, parent, taskid, worktype, description, origrecordid, woclass, siteid) as (
-- normal WO hierarchy
select wo.wonum, wo.parent, wo.taskid, wo.worktype, wo.description, wo.origrecordid, wo.woclass, wo.siteid
from woancestor a
join workorder wo
on a.wonum = wo.wonum
and a.siteid = wo.siteid
where a.ancestor = 'MY-STARTING-WONUM'
union all
-- WO hierarchy associated via RELATEDRECORD
select wo.wonum, wo.parent, wo.taskid, wo.worktype, wo.description, wo.origrecordid, wo.woclass, wo.siteid
from mywos
join relatedrecord rr
on mywos.woclass = rr.class
and mywos.siteid = rr.siteid
and mywos.wonum = rr.recordkey
-- prevent cycle / going back up the hierarchy
and rr.relatetype not in ('ORIGINATOR')
join woancestor a
on rr.relatedrecsiteid = a.siteid
and rr.relatedreckey = a.ancestor
join workorder wo
on a.siteid = wo.siteid
and a.wonum = wo.wonum
)
select * from mywos
;
Have you considered the WOGRANDTOTAL object? Its description in MAXOBJECT is "Non-Persistent table to display WO grandtotals". There is a dialog in the Work Order Tracking application that you can get to from the Select Action / More Actions menu. Since you mentioned it repeatedly, I should note that WOGRANDTOTAL values do not include joins across RELATEDRECORDS to other work order hierarchies.
You can also save yourself the complication of CONNECT BY PRIOR by joining to WOANCESTOR, which is effectively a dump from a CONNECT BY PRIOR query. (There are other %ANCESTOR tables for other hierarchies.)
I think a recursive automation script would be the best way to do what you want, if you need the results in Maximo. If you need the total cost outside of Maximo, maybe a recursive function would work.
We finally figured out how to pull this off.
WITH WO(WONUM,
PARENT) AS
((SELECT X.WONUM,
X.PARENT
FROM (SELECT R.RECORDKEY WONUM,
R.RELATEDRECKEY PARENT
FROM MAXIMO.RELATEDRECORD R
WHERE R.RELATEDRECKEY = '382418'
UNION ALL
SELECT W.WONUM,
W.PARENT
FROM MAXIMO.WORKORDER W
START WITH W.PARENT = '382418'
CONNECT BY PRIOR W.WONUM = W.PARENT) X)
UNION ALL
SELECT W.WONUM, W.PARENT FROM MAXIMO.WORKORDER W, WO WHERE W.WONUM = WO.PARENT)
SELECT DISTINCT WONUM FROM WO;
This returns a list of all of the child and related work orders for a given work order.
I have a question:
Assuming an assembly line where a bike goes through some tests, and then the devices send the information regarding the test to
our database (in oracle). I created this stored procedure; it works correctly for what I want, which is:
It gets a list of the first test (per type of test) that a bike has gone through. For instance, if a bike had 2 tests of the same type, it only
shows the first one, AND it shows it only when that first test is between the dates specified by the user. Also I look from 2 months back
because a bike cannot spend more than 2 months (I'm probably overestimating) at the assembly line, but if the user searches 2 days for instance, and I only look in between those days, I could let outside of my results a test made over a bike 3 days ago or maybe 4, and it get's worst if they search between hours.
As I said before, the sp works just fine, but I'm wondering if there's a way to optimize it.
Also consider that the table has around 7 millions of records by the end of the year, so I cannot query the whole year because it could get ugly.
Here's the main part of the stored procedure:
SELECT pid AS "bike_id",
TYPE AS "type",
stationnr AS "stationnr",
testtime AS "testtime",
rel2.releasenr AS "releasenr",
placedesc AS description,
tv.recordtime AS "recordtime",
To_char(tv.testtime, 'YYYY.MM.DD') AS "dategroup",
testcounts AS "testcounts",
tv.result AS "result",
progressive AS "PROGRESIVO"
FROM (SELECT l_bike_id AS pid,
l_testcounts AS testcounts,
To_char(l_testtime, 'yyyy-MM-dd hh24:mi:ss') AS testtimes,
testtime,
pl.code AS place,
t2.recordtime,
t2.releaseid,
t2.testresid,
t2.stationnr,
t2.result,
v.TYPE,
v.progressive,
v.prs,
pl.description AS placeDesc
FROM (SELECT v.bike_id AS l_bike_id,
v.TYPE AS l_type,
Min(t.testtime) AS l_testtime,
Count(t.testtime) AS l_testcounts
FROM result_test t
inner join bikes v
ON v.bike_id = t.pid
inner join result_release rel
ON t.releaseid = rel.releaseid
inner join resultconfig.places p
ON p.place = t.place
WHERE t.testtime >= Add_months(Trunc(p_startdate), -2)
GROUP BY v.bike_id,
v.TYPE,
p.code)p_bikelist
inner join result_test t2
ON p_bikelist.l_bike_id = t2.pid
AND p_bikelist.l_testtime = t2.testtime
inner join resultconfig.places pl
ON pl.place = t2.place
inner join bikes v
ON v.bike_id = t2.pid
inner join result_release rel2
ON t2.releaseid = rel2.releaseid
ORDER BY t2.pid)tv
inner join result_release rel2
ON tv.releaseid = rel2.releaseid
WHERE tv.testtime BETWEEN p_startdate AND p_enddate
ORDER BY testtime;
Thank you for answering!!
I'm struggling a bit to understand the business requirement from the English description you give. The wording suggests that this procedure is intended to work per bike but I don't see any obvious bike_id parameters being supplied, instead, you appear to be returning the earliest result for all bikes tested between given dates. Is that the aim? If it is designed to be run per bike, then ensure bike id gets passed in and used early :)
There is some confusion about your data types. You convert testtime in result_test (presumably a DATE or TIMESTAMP column ) into a string in the p_bikelist subquery but then compare back to the original value in the tv subquery. You further use (presumably typed parameters) p_startdate and p_enddate to filter results. I strongly suspect the conversion in p_bikelist to be unnecessary, and possibly a cause for index avoidance.
Finally, I don't get the add_months logic. By all means, extend the window back in time to get tests that finished within the window but started up to 2 months before the start date, but as written you will exclude the earlier starts anyway because of the condition on tv.testtime. Most likely you'd be better off fudging the startdate earlier in the stored procedure with code like
l_assumedstart := add_months(p_startdate, -2);
and then using l_assumedstart in the query itself.
I have a database with customers orders.
I want to use Linq (to EF) to query the db to bring back the last(most recent) 3,4...n orders for every customer.
Note:
Customer 1 may have just made 12 orders in the last hr; but customer 2 may not have made any since last week.
I cant for the life of me work out how to write query in linq (lambda expressions) to get the data set back.
Any good ideas?
Edit:
Customers and orders is a simplification. The table I am querying is actually a record of outbound messages to various web services. It just seemed easer to describe as customers and orders. The relationship is the same.
I am building a task that checks the last n messages for each web service to see if there were any failures. We are wanting a semi real time Health status of the webservices.
#CoreySunwold
My table Looks a bit like this:
MessageID, WebserviceID, SentTime, Status, Message, Error,
Or from a customer/order context if it makes it easer:
OrderID, CustomerID, StatusChangedDate, Status, WidgetName, Comments
Edit 2:
I eventually worked out something
(Hat tip to #StephenChung who basically came up with the exact same, but in classic linq)
var q = myTable.Where(d => d.EndTime > DateTime.Now.AddDays(-1))
.GroupBy(g => g.ConfigID)
.Select(g =>new
{
ConfigID = g.Key,
Data = g.OrderByDescending(d => d.EndTime)
.Take(3).Select(s => new
{
s.Status,
s.SentTime
})
}).ToList();
It does take a while to execute. So I am not sure if this is the most efficient expression.
This should give the last 3 orders of each customer (if having orders at all):
from o in db.Orders
group o by o.CustomerID into g
select new {
CustomerID=g.Key,
LastOrders=g.OrderByDescending(o => o.TimeEntered).Take(3).ToList()
}
However, I suspect this will force the database to return the entire Orders table before picking out the last 3 for each customer. Check the SQL generated.
If you need to optimize, you'll have to manually construct a SQL to only return up to the last 3, then make it into a view.
You can use SelectMany for this purpose:
customers.SelectMany(x=>x.orders.OrderByDescending(y=>y.Date).Take(n)).ToList();
How about this? I know it'll work with regular collections but don't know about EF.
yourCollection.OrderByDescending(item=>item.Date).Take(n);
var ordersByCustomer =
db.Customers.Select(c=>c.Orders.OrderByDescending(o=>o.OrderID).Take(n));
This will return the orders grouped by customer.
var orders = orders.Where(x => x.CustomerID == 1).OrderByDescending(x=>x.Date).Take(4);
This will take last 4 orders. Specific query depends on your table / entity structure.
Btw: You can take x as a order. So you can read it like: Get orders where order.CustomerID is equal to 1, OrderThem by order.Date and take first 4 'rows'.
Somebody might correct me here, but i think doing this is linq with a single query is probably very difficult if not impossible. I would use a store procedure and something like this
select
*
,RANK() OVER (PARTITION BY c.id ORDER BY o.order_time DESC) AS 'RANK'
from
customers c
inner join
order o
on
o.cust_id = c.id
where
RANK < 10 -- this is "n"
I've not used this syntax for a while so it might not be quite right, but if i understand the question then i think this is the best approach.
I have a set set of records that I am loading from a file and the first thing I need to do is get the max and min of a column.
In SQL I would do this with a subquery like this:
select c.state, c.population,
(select max(c.population) from state_info c) as max_pop,
(select min(c.population) from state_info c) as min_pop
from state_info c
I assume there must be an easy way to do this in PIG as well but I'm having trouble finding it. It has a MAX and MIN function but when I tried doing the following it didn't work:
records=LOAD '/Users/Winter/School/st_incm.txt' AS (state:chararray, population:int);
with_max = FOREACH records GENERATE state, population, MAX(population);
This didn't work. I had better luck adding an extra column with the same value to each row and then grouping them on that column. Then getting the max on that new group. This seems like a convoluted way of getting what I want so I thought I'd ask if anyone knows a simpler way.
Thanks in advance for the help.
As you said you need to group all the data together but no extra column is required if you use GROUP ALL.
Pig
records = LOAD 'states.txt' AS (state:chararray, population:int);
records_group = GROUP records ALL;
with_max = FOREACH records_group
GENERATE
FLATTEN(records.(state, population)), MAX(records.population);
Input
CA 10
VA 5
WI 2
Output
(CA,10,10)
(VA,5,10)
(WI,2,10)