I have this query which takes around 11-12 hrs to execute:
SELECT A.CCE_EVENT_DATE SENT_DATE
,A.TRANSACTION_CODE EVENT_CODE
,COUNT(a.transaction_id) COUNT
FROM customer A
WHERE A.REQUEST_DATE >= (select max(RUNPERIODFROM_DATE) from cds_auditeventbatch where auditevent_code = 'CDPE')
AND A.TRANSACTION_STATUS = 'STC'
AND NOT EXISTS
(SELECT /*use_nl(a b)*/1 FROM CDS_EVENTS_V1 B
WHERE B.TRANSACTIONID=a.transaction_id
AND TRUNC(B.CCE_EVENT_DATE) = A.CCE_EVENT_DATE
AND B.TRANSACTION_STATUS='STC'
)
GROUP BY A.CCE_EVENT_DATE
,A.TRANSACTION_CODE
Is there any way I can rewrite this to reduce the execution time? The view CDS_EVENTS_V1 has millions of records in it. I don't have the option to make the view a materialized view.
Well, one of the problems is this line:
AND TRUNC(B.CCE_EVENT_DATE) = A.CCE_EVENT_DATE
Wrapping the column in TRUNC stops the optimizer from using a regular index on CCE_EVENT_DATE (if one exists; if not, add it). I suggest adding another column containing the truncated value and comparing against that column.
Also, if they don't already exist, add the following indexes:
CDS_EVENTS_V1 (transactionid,CCE_EVENT_DATE,TRANSACTION_STATUS)
cds_auditeventbatch (RUNPERIODFROM_DATE,auditevent_code )
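To illustrate the general effect described above, here is a minimal sketch that uses SQLite (via Python's sqlite3 module) as a stand-in; it is not Oracle, and the table, index and data are made up for the example. It compares a predicate that wraps the indexed column in a function with an equivalent half-open range on the bare column. The range form is an alternative to the extra-column suggestion, not something the answer proposes, and Oracle's plans will look different.

```python
import sqlite3


def query_plans():
    """Compare plans for a function-wrapped predicate and a plain range predicate."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and index, used only for this illustration.
    conn.execute(
        "CREATE TABLE events (transaction_id INTEGER, event_date TEXT, status TEXT)"
    )
    conn.execute("CREATE INDEX idx_events_date ON events (event_date)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            (1, "2014-08-01 09:30:00", "STC"),
            (2, "2014-08-01 17:45:00", "STC"),
            (3, "2014-08-02 08:00:00", "STC"),
        ],
    )
    # The function call hides the column from the index.
    wrapped = "SELECT transaction_id FROM events WHERE date(event_date) = '2014-08-01'"
    # Same day expressed as a half-open range on the bare column.
    ranged = (
        "SELECT transaction_id FROM events "
        "WHERE event_date >= '2014-08-01' AND event_date < '2014-08-02'"
    )
    plans, rows = {}, {}
    for name, sql in (("wrapped", wrapped), ("ranged", ranged)):
        plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        # The last column of each plan row is the human-readable detail.
        plans[name] = " ".join(row[-1] for row in plan)
        rows[name] = sorted(r[0] for r in conn.execute(sql))
    conn.close()
    return plans, rows


if __name__ == "__main__":
    plans, rows = query_plans()
    for name in ("wrapped", "ranged"):
        print(name, "|", plans[name], "|", rows[name])
```

Both queries return the same rows; only the range form lets SQLite search the index instead of scanning the table.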
DB = Oracle 11g
Apex = 4.2.6
In the form I have various items which all work great. However, I now have a set of check boxes (:P14_DAYS), one for each day of the week.
What I need to do is get all records between :P14_START_DATE and :P14_END_DATE, but only for the days that are checked.
Below is also a sample of the DATE_SETS table:
http://i.stack.imgur.com/YAckN.png
So, for example, for the dates 01-AUG-14 to 05-AUG-14, with only Sundays and Mondays required, this would bring back 2 refs.
BEGIN
UPDATE MD_TS_DETAIL
SET job_for = :P14_JOBFORTEM,
job_type_id = :P14_JOB_TYPE_VALUE,
account_id = :P14_ACC_VAL,
qty = :P14_HRS,
rio = :P14_RIO,
post_code = :P14_POSTCODE
WHERE id IN (SELECT D.id
FROM MD_TS_MAST M
LEFT JOIN MD_TS_DETAIL D
ON M.mast_id = D.md_id
LEFT JOIN DATE_SETS
ON ms_date = dt
WHERE eng_id = :P14_ENG_VAL
AND ms_date BETWEEN :P14_START_DATE AND :P14_END_DATE
AND DATE_SETS.col_day = ANY instr(':'||:P14_DAYS||':',Return)
);
END;
Any help would be much appreciated.
I found this example: http://docs.oracle.com/cd/B31036_01/doc/appdev.22/b28839/check_box.htm#CHDBGDJH
As I understand it, when you choose some values in your checkbox list, the item :P14_DAYS receives a value that contains the return values of the chosen LOV elements, separated by a delimiter (a colon). Then you need to replace this line in your query
AND DATE_SETS.col_day = ANY instr(':'||:P14_DAYS||':',Return)
with
AND instr(':'||:P14_DAYS||':', ':'||DATE_SETS.col_day||':') > 0
Here the instr function searches for the substring ':'||DATE_SETS.col_day||':' in the string ':'||:P14_DAYS||':'. It returns the position of the substring if it is found, or 0 if it is not. You then compare the result with 0: if the result is > 0, DATE_SETS.col_day is among the selected values.
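The same delimiter-wrapping idea can be sketched outside of SQL. The Python below is only an illustration; the function name is_selected and the sample values are assumptions, not part of APEX. Wrapping both the list and the candidate in the delimiter is what prevents a partial match, such as "1" matching inside "10".

```python
def is_selected(selected: str, value: str, sep: str = ":") -> bool:
    """Return True if value is one of the sep-delimited entries in selected."""
    haystack = f"{sep}{selected}{sep}"
    needle = f"{sep}{value}{sep}"
    # str.find returns -1 when the needle is absent (Oracle's INSTR returns 0).
    return haystack.find(needle) >= 0


if __name__ == "__main__":
    days = "SUN:MON"  # what a checkbox item might hold after two boxes are ticked
    print(is_selected(days, "MON"))
    print(is_selected(days, "TUE"))
    print(is_selected("10:11", "1"))
```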
I am a newbie with the neo4j db and have just started learning it. I am stuck and looking for some help. Is it possible to get this in one Cypher query? How?
My graph structure looks like this:
(s:Store)-[r:RELEASED]->(m:Movie)<-[r1:ASSIGNED]-(cat:MovieCategorie)
How could I get this data?
Movie store (got)
Movie (got)
Most common 5 categories of movies in that store (I don't know how to sort them before using collect(cat.name)[0..5])
Could anyone suggest how to get this data? I tried many times and failed; this is what I have, and it doesn't work.
match (s:Store)
with s
match (s)-[r:RELEASED]->(m:Movie)
with s,m
match (m)<-[r1:ASSIGNED]-(cat:MovieCategorie)
with s, m, count(r1) as stylesCount, cat
order by stylesCount
return distinct s as store, collect(cat.name)[0..5] as topCategories
order by store.name
Thank you!
OK, now that I have my query right and am developing it further, I have run into a problem combining multiple aggregation functions, COUNT and SUM.
My query, which works well for finding the top 5 categories per store:
MATCH (s:Store)
OPTIONAL MATCH (s)-[:RELEASED]->(m:Movie)<-[r:ASSIGNED]-(cat:MovieCategorie)
WITH s, COUNT(r) AS count, cat
ORDER BY count DESC
RETURN s AS Store, COLLECT(distinct cat.name) AS `Top Categories`
ORDER BY Store.name
On top of this query I need to count how many views the store has: sum(m.viewsCount) as the total store views. I tried adding it to the same WITH statement as the COUNT, and I tried putting it in the RETURN; in both cases it doesn't work the way I would like. Any suggestions or examples? I am still confused about how WITH works with aggregation functions... :(
Create example database:
CREATE (s1:Store) SET s1.name = 'Store 1'
CREATE (s2:Store) SET s2.name = 'Store 2'
CREATE (s3:Store) SET s3.name = 'Store 3'
CREATE (m1:Movie) SET m1.title = 'Movie 1', m1.viewsCount = 50
CREATE (m2:Movie) SET m2.title = 'Movie 2', m2.viewsCount = 50
CREATE (m3:Movie) SET m3.title = 'Movie 3', m3.viewsCount = 50
CREATE (m4:Movie) SET m4.title = 'Movie 4', m4.viewsCount = 50
CREATE (m5:Movie) SET m5.title = 'Movie 5', m5.viewsCount = 50
CREATE (c1:MovieCategorie) SET c1.name = 'Cat 1'
CREATE (c2:MovieCategorie) SET c2.name = 'Cat 2'
CREATE (c3:MovieCategorie) SET c3.name = 'Cat 3'
CREATE (m1)<-[:ASSIGNED]-(c1)
CREATE (m1)<-[:ASSIGNED]-(c3)
CREATE (m2)<-[:ASSIGNED]-(c2)
CREATE (m3)<-[:ASSIGNED]-(c1)
CREATE (m3)<-[:ASSIGNED]-(c2)
CREATE (m3)<-[:ASSIGNED]-(c3)
CREATE (m4)<-[:ASSIGNED]-(c1)
CREATE (m4)<-[:ASSIGNED]-(c3)
CREATE (m5)<-[:ASSIGNED]-(c3)
CREATE (s1)-[:RELEASED]->(m1)
CREATE (s1)-[:RELEASED]->(m3)
CREATE (s1)-[:RELEASED]->(m4)
CREATE (s1)-[:RELEASED]->(m5)
CREATE (s2)-[:RELEASED]->(m1)
CREATE (s2)-[:RELEASED]->(m2)
CREATE (s2)-[:RELEASED]->(m3)
CREATE (s2)-[:RELEASED]->(m4)
CREATE (s2)-[:RELEASED]->(m5)
CREATE (s3)-[:RELEASED]->(m1)
SOLVED!! FINALLY I DID IT! The trick was to use one more match after everything. Great, now I can sleep in peace. Thank you.
MATCH (s:Store)-[:RELEASED]->(m:Movie)<-[r:ASSIGNED]-(cat:MovieCategorie)
with s,count(r) as catCount, cat
order by catCount desc
with s, collect( distinct cat.name)[0..5] as TopCategories
match (s)-[:RELEASED]->(m:Movie)
return s as Store, TopCategories, sum(m.viewsCount) as TotalViews
Ok, that was fast :D I finally got it!
match (s:Store)
with s
match (s)-[r:RELEASED]->(m:Movie)
with s, m
match (m)<-[r2:ASSIGNED]-(cat:MovieCategorie)
with s, count(r2) as stylesCount, cat
order by stylesCount desc
return distinct s, collect(distinct cat.name)[0..5] as topCategories
order by s.name
So the trick is to first count() in a with, then order by in that with, and collect DISTINCT in the return. I am not so sure about these multiple with statements; I will try to clean it up. ;)
MATCH (s:Store)-[:RELEASED]->(:Movie)<-[:ASSIGNED]-(cat:MovieCategorie)
WITH s, COUNT(cat) AS count, cat
ORDER BY s.name, count DESC
RETURN s.name AS Store, COLLECT(cat.name)[0..5] AS `Top Categories`
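And if you want the sum of the viewsCount property from the Movie nodes per store, aggregate the categories first and then match the movies again, so that a movie with several categories is only summed once:
MATCH (s:Store)-[:RELEASED]->(:Movie)<-[:ASSIGNED]-(cat:MovieCategorie)
WITH s, cat, COUNT(*) AS count
ORDER BY count DESC
WITH s, COLLECT(cat.name)[0..5] AS topCategories
MATCH (s)-[:RELEASED]->(m:Movie)
RETURN s.name AS Store, topCategories AS `Top Categories`, SUM(m.viewsCount) AS `Total Views`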
The following LINQ expression executes a query which takes 90 milliseconds to run:
.Where(Function(i) (i.RequestedByUserId = MySession.ApplicationUserId)
And (i.RequestKey1 = searchJson)
And (i.RequestMethod = "ProjectPlanService.GetProjectPlanMaintenanceData"))
.Select(Function(i) i.ResultJson).FirstOrDefault
The SQL generated is as below:
SELECT
[Limit1].[ResultJson] AS [ResultJson]
FROM ( SELECT TOP (1)
[Extent1].[ResultJson] AS [ResultJson]
FROM [dbo].[ApplicationCache] AS [Extent1]
WHERE ([Extent1].[RequestedByUserId] = 2) AND ([Extent1].[RequestKey1] = '{"SortProperty":"","SortOrder":0,"PageNumber":1,"RecordsPerPage":15,"CriteriaCount":"1","CriteriaString":"~=~Id"}') AND ('ProjectPlanService.GetProjectPlanMaintenanceData' = [Extent1].[RequestMethod])
) AS [Limit1]
How can I optimize the above LINQ expression to reduce the execution time?
Is there a way to get a single SELECT statement like the one below?
SELECT
[ResultJson]
FROM [dbo].[ApplicationCache] AS [Extent1]
WHERE ([Extent1].[RequestedByUserId] = 2) AND ([Extent1].[RequestKey1] = '{"SortProperty":"","SortOrder":0,"PageNumber":1,"RecordsPerPage":15,"CriteriaCount":"1","CriteriaString":"~=~Id"}') AND ('ProjectPlanService.GetProjectPlanMaintenanceData' = [Extent1].[RequestMethod])
1: There's no need to optimise; it's fine as it is. The extra "wrapping" won't change anything.
2: It's doing a TOP (1) because you asked for FirstOrDefault(); if you want the full list of results, don't do that.
I'm new to Cascading and trying to find a way to get the top N tuples based on a sort order. For example, I'd like to know the top 100 first names people are using.
Here's how I can do something similar in Teradata SQL:
select top 100 first_name, num_records
from
(select first_name, count(1) as num_records
from table_1
group by first_name) a
order by num_records DESC
Here's something similar in Hadoop Pig:
a = load 'table_1' as (first_name:chararray, last_name:chararray);
b = foreach (group a by first_name) generate group as first_name, COUNT(a) as num_records;
c = order b by num_records DESC;
d = limit c 100;
It seems very easy to do in SQL or Pig, but I'm having a hard time finding a way to do it in Cascading. Please advise!
Assuming you just need the Pipe set up on how to do this:
In Cascading 2.1.6,
Pipe firstNamePipe = new GroupBy("topFirstNames", InPipe,
new Fields("first_name"));
firstNamePipe = new Every(firstNamePipe, new Fields("first_name"),
new Count(new Fields("num_records")), Fields.ALL);
firstNamePipe = new GroupBy(firstNamePipe,
new Fields("first_name"),
new Fields("num_records"),
true); // where true is descending order
firstNamePipe = new Every(firstNamePipe, new Fields("first_name", "num_records"),
new First(Fields.ARGS, 100), Fields.ALL);
Here InPipe is the pipe fed by your incoming tap, which holds the tuple data you reference above, namely "first_name". The "num_records" field is created when new Count() is called.
If you have the "num_records" and "first_name" data in separate taps (tables or files) then you can set up two pipes that point to those two Tap sources and join them using CoGroup.
The definitions I used are from Cascading 2.1.6:
GroupBy(String groupName, Pipe pipe, Fields groupFields, Fields sortFields, boolean reverseOrder)
Count(Fields fieldDeclaration)
First(Fields fieldDeclaration, int firstN)
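Independently of Cascading, the shape of the computation (group by name, count, sort descending, keep the first N) can be sketched in a few lines of Python. This is only an illustration of the logic, not Cascading code; the function name and the tie-break by name are assumptions made for the example.

```python
from collections import Counter


def top_first_names(records, n=100):
    """Return the n most frequent first names as (first_name, num_records) pairs."""
    # Group by first_name and count, like GROUP BY first_name with COUNT(1).
    counts = Counter(first for first, _last in records)
    # Sort by count descending (ties by name, an assumption), then keep n rows.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    sample = [("Ann", "A"), ("Bob", "B"), ("Ann", "C"),
              ("Cy", "D"), ("Bob", "E"), ("Ann", "F")]
    print(top_first_names(sample, n=2))
```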
Method 1
Use a GroupBy to group on the required columns; you can make use of the secondary sorting that Cascading provides. By default it sorts in ascending order; if you want descending order, pass true for the reverseOrder argument.
To get the top N tuples (rows):
It's quite simple: use a static counter variable in a Filter, increment it by 1 for each tuple, and check whether it is greater than N.
Return true (remove the tuple) when the counter is greater than N; otherwise return false.
This will give you output with the first N tuples.
Method 2
Cascading provides a built-in assembly, Unique, which makes use of FirstNBuffer.
See the link below:
http://docs.cascading.org/cascading/2.2/javadoc/cascading/pipe/assembly/Unique.html
At my job our main application was written long ago, before n-tier was really a thing; ergo, it has tons and tons of business logic being handled in stored procs and such.
So we have finally decided to bite the bullet and make it not suck so bad. I have been tasked with converting a 900+ line SQL script to a .NET exe, which I am doing in C#/Linq. Problem is, for the last 5-6 years at another job I had been doing Linq exclusively, so my SQL has gotten somewhat rusty, and some of the things I am converting I have never tried to do in Linq before, so I'm hitting some roadblocks.
Anyway, enough whining.
I'm having trouble with the following SQL statement, I think because it joins on a temp table and a derived table. Here's the SQL:
insert into #processedBatchesPurgeList
select d.pricebatchdetailid
from pricebatchheader h (nolock)
join pricebatchstatus pbs (nolock) on h.pricebatchstatusid = pbs.pricebatchstatusid
join pricebatchdetail d (nolock) on h.pricebatchheaderid = d.pricebatchheaderid
join
( -- Grab most recent REG.
select
item_key
,store_no
,pricebatchdetailid = max(pricebatchdetailid)
from pricebatchdetail _pbd (nolock)
join pricechgtype pct (nolock) on _pbd.pricechgtypeid = pct.pricechgtypeid
where
lower(rtrim(ltrim(pct.pricechgtypedesc))) = 'reg'
and expired = 0
group by item_key, store_no
) dreg
on d.item_key = dreg.item_key
and d.store_no = dreg.store_no
where
d.pricebatchdetailid < dreg.pricebatchdetailid -- Make sure PBD is not most recent REG.
and h.processeddate < @processedBatchesPurgeDateLimit
and lower(rtrim(ltrim(pbs.pricebatchstatusdesc))) = 'processed' -- Pushed/processed batches only.
So that raises an overall question first: how do I handle temp tables in Linq? This script uses about 10 of them. I currently have them as List. The problem is that if I try to .Join() on one in a query, I get the "Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator." error.
I was able to get the join to the derived table to work using 2 queries, just so a single one wouldn't get nightmarishly long:
var dreg = (from _pbd in db.PriceBatchDetails.Where(pbd => pbd.Expired == false && pbd.PriceChgType.PriceChgTypeDesc.ToLower().Trim() == "reg")
group _pbd by new { _pbd.Item_Key, _pbd.Store_No } into _pbds
select new
{
Item_Key = _pbds.Key.Item_Key,
Store_No = _pbds.Key.Store_No,
PriceBatchDetailID = _pbds.Max(pbdet => pbdet.PriceBatchDetailID)
});
var query = (from h in db.PriceBatchHeaders.Where(pbh => pbh.ProcessedDate < processedBatchesPurgeDateLimit)
join pbs in db.PriceBatchStatus on h.PriceBatchStatusID equals pbs.PriceBatchStatusID
join d in db.PriceBatchDetails on h.PriceBatchHeaderID equals d.PriceBatchHeaderID
join dr in dreg on new { d.Item_Key, d.Store_No } equals new { dr.Item_Key, dr.Store_No }
where d.PriceBatchDetailID < dr.PriceBatchDetailID
&& pbs.PriceBatchStatusDesc.ToLower().Trim() == "processed"
select d.PriceBatchDetailID);
So that query gives the expected results, which I am holding in a List, but then I need to join the results of that query to another one selected from the database, which is leading me back to the aforementioned "Local sequence cannot be used..." error.
That query is this:
insert into #pbhArchiveFullListSaved
select h.pricebatchheaderid
from pricebatchheader h (nolock)
join pricebatchdetail d (nolock)
on h.pricebatchheaderid = d.pricebatchheaderid
join #processedBatchesPurgeList dlist
on d.pricebatchdetailid = dlist.pricebatchdetailid -- PBH list is restricted to PBD purge list rows that have PBH references.
group by h.pricebatchheaderid
The join there on #processedBatchesPurgeList is the problem I am running into.
So uh...help? I have never written SQL like this, and certainly never tried to convert it to Linq.
As pointed out by the comments above, this is no longer being rewritten as Linq.
Was hoping to get a performance improvement along with achieving better SOX compliance, which was the whole reason for the rewrite in the first place.
I'm happy with just satisfying the SOX compliance issues.
Thanks, everyone.