How can we use PARTITION BY in an Oracle query?

I have a query in which I used PARTITION BY to avoid duplicate values for a particular column, but it still returns duplicate rows. Here is the query:
SELECT iol.M_product_id as faultyProduct , iol.SERIALNO,iol.M_product_id as newproduct, ma.Description,
mp.M_Product_category_id ,mi.issotrx, co.C_BPartner_ID,
ROW_NUMBER() OVER(PARTITION BY ma.Description ORDER BY iol.M_product_id DESC) rn
FROM M_inoutline iol
inner join M_inout mi ON (iol.m_inout_id = mi.m_inout_id)
inner join C_Order co ON (co.c_order_id = mi.c_order_id )
inner Join M_AttributeSetInstance ma ON (ma.m_attributesetinstance_id =iol.m_attributesetinstance_id)
inner join M_Product mp ON (mp.m_product_id = iol.m_product_id)
where mp.m_product_category_id= 1000447 AND mi.issotrx = 'Y';
Please help me out

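ROW_NUMBER() only numbers the rows within each partition; it does not remove any of them. It looks to me like you want to keep only the first row of each partition by filtering on rn in an outer query: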
select * from (/*YOUR QUERY*/) where rn = 1;
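As a rough illustration of the pattern above, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python (assuming SQLite 3.25 or newer, which added window functions). The `shipped` table and its rows are hypothetical stand-ins for the joined result in the question, not the real schema.

```python
import sqlite3

# Hypothetical miniature table standing in for the joined result in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipped (product_id INTEGER, serialno TEXT, description TEXT);
    INSERT INTO shipped VALUES
        (10, 'S1', 'lot-A'),
        (11, 'S2', 'lot-A'),
        (12, 'S3', 'lot-B');
""")

numbered = """
    SELECT product_id, serialno, description,
           ROW_NUMBER() OVER (PARTITION BY description
                              ORDER BY product_id DESC) AS rn
    FROM shipped
"""

# The window function alone only numbers the rows; all three come back.
all_rows = conn.execute(numbered).fetchall()

# Filtering on rn in an outer query keeps one row per description.
first_rows = conn.execute(
    f"SELECT product_id, description FROM ({numbered}) WHERE rn = 1 ORDER BY description"
).fetchall()

print(len(all_rows))  # 3
print(first_rows)     # [(11, 'lot-A'), (12, 'lot-B')]
```

The filter has to sit in an outer query because rn does not exist yet when the inner WHERE clause is evaluated.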

Related

ORA-00979: not a GROUP BY expression in Oracle

I cannot execute this code on Oracle. The error is:
"ORA-00979: not a GROUP BY expression"
However, I was able to run it successfully on MySQL.
Why does this happen?
SELECT CONCAT(i.lname, i.fname) AS inst_name,
CONCAT(s.lname, s.fname) AS stu_name,
t.avg_grade AS stu_avg_grade
FROM(
SELECT instructor_id, student_id, AVG(grade) as avg_grade, RANK() OVER(PARTITION BY instructor_id ORDER BY grade DESC) AS rk
FROM grade
GROUP BY 1,2) t
JOIN instructor i
ON t.instructor_id = i.instructor_id
JOIN student s
ON s.student_id = t.student_id
WHERE t.rk = 1
ORDER BY 3 DESC
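You can't use ordinals like GROUP BY 1,2 in Oracle. Oracle treats the numbers as constants rather than column positions, so instructor_id and student_id are never grouped, which is what raises ORA-00979. In addition, the ORDER BY grade clause inside your RANK() function has a problem. Keep in mind that analytic functions evaluate after the GROUP BY aggregation, so grade is no longer available; rank by AVG(grade) instead. Here is a version which should work without error: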
SELECT CONCAT(i.lname, i.fname) AS inst_name,
CONCAT(s.lname, s.fname) AS stu_name,
t.avg_grade AS stu_avg_grade
FROM
(
SELECT instructor_id, student_id, AVG(grade) AS avg_grade,
RANK() OVER (PARTITION BY instructor_id ORDER BY AVG(grade) DESC) AS rk
FROM grade
GROUP BY instructor_id, student_id
) t
INNER JOIN instructor i
ON t.instructor_id = i.instructor_id
INNER JOIN student s
ON s.student_id = t.student_id
WHERE t.rk = 1
ORDER BY t.avg_grade DESC;
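To see the "rank over the aggregate" idea in isolation, here is a small sketch using SQLite from Python rather than Oracle (assuming SQLite 3.25 or newer). The sample rows are invented for illustration, and SQLite is more permissive than Oracle about GROUP BY syntax, so this only demonstrates ordering the window by AVG(grade), not the ORA-00979 error itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grade (instructor_id INTEGER, student_id INTEGER, grade REAL);
    INSERT INTO grade VALUES
        (1, 100, 90), (1, 100, 70),   -- student 100 averages 80
        (1, 101, 85), (1, 101, 95),   -- student 101 averages 90
        (2, 100, 60);
""")

# The window function runs after GROUP BY, so it can only order by grouped
# columns or aggregates such as AVG(grade), never by the raw grade column.
top_students = conn.execute("""
    SELECT instructor_id, student_id, avg_grade
    FROM (
        SELECT instructor_id, student_id, AVG(grade) AS avg_grade,
               RANK() OVER (PARTITION BY instructor_id
                            ORDER BY AVG(grade) DESC) AS rk
        FROM grade
        GROUP BY instructor_id, student_id
    ) t
    WHERE rk = 1
    ORDER BY instructor_id
""").fetchall()

print(top_students)  # [(1, 101, 90.0), (2, 100, 60.0)]
```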

How to left join with conditions in Toad Data Point Query Builder?

I'm trying to build a query in Toad Data Point. I have a subquery that has a row number to identify the records I'm interested in. This subquery needs to be left joined onto the main table only when the row number is 1. Here's the query I'm trying to visualize:
SELECT distinct E.EMPLID, E.ACAD_CAREER
FROM PS_STDNT_ENRL E
LEFT JOIN (
SELECT ACAD_CAREER, ROW_NUMBER() OVER (PARTITION BY ACAD_CAREER ORDER BY EFFDT DESC) as RN
FROM PS_ACAD_CAR_TBL
) T on T.ACAD_CAREER = E.ACAD_CAREER and RN = 1
When I try to replicate this in the query builder, the row number condition is placed in the global WHERE clause. This is not the intended behavior, because it removes any records that don't have a match in the subquery, effectively turning the left join into an inner join.
Here is the query it's generating:
SELECT DISTINCT E.EMPLID, E.ACAD_CAREER, T.RN
FROM SYSADM.PS_STDNT_ENRL E
LEFT OUTER JOIN
(SELECT PS_ACAD_CAR_TBL.ACAD_CAREER,
ROW_NUMBER ()
OVER (PARTITION BY ACAD_CAREER ORDER BY EFFDT DESC)
AS RN
FROM SYSADM.PS_ACAD_CAR_TBL PS_ACAD_CAR_TBL) T
ON (E.ACAD_CAREER = T.ACAD_CAREER)
WHERE (T.RN = 1)
Is there a way to get the query builder to place that row number condition on the left join instead of the global WHERE clause?
I found a way to get this to work.
Add a calculated field to the main table with a value of 1.
Join the row number to this new calculated field.
Now the query has the filter in the join condition instead of the WHERE clause so that it joins as intended. Here is the query it made:
SELECT DISTINCT E.EMPLID, E.ACAD_CAREER, T.RN
FROM SYSADM.PS_STDNT_ENRL E
LEFT OUTER JOIN
(SELECT PS_ACAD_CAR_TBL.ACAD_CAREER,
ROW_NUMBER ()
OVER (PARTITION BY ACAD_CAREER ORDER BY EFFDT DESC)
AS RN
FROM SYSADM.PS_ACAD_CAR_TBL PS_ACAD_CAR_TBL) T
ON (E.ACAD_CAREER = T.ACAD_CAREER) AND (1 = T.RN)
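The difference between filtering in the ON clause and in the WHERE clause can be checked with a small sketch. This one uses SQLite from Python (assuming SQLite 3.25 or newer); the `enrl` and `car_tbl` tables are simplified, hypothetical versions of the PeopleSoft tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrl (emplid TEXT, acad_career TEXT);
    CREATE TABLE car_tbl (acad_career TEXT, effdt TEXT);
    INSERT INTO enrl VALUES ('E1', 'UGRD'), ('E2', 'GRAD');
    -- Only UGRD has rows in the lookup table; GRAD has no match.
    INSERT INTO car_tbl VALUES ('UGRD', '2020-01-01'), ('UGRD', '2021-01-01');
""")

ranked = """
    SELECT acad_career,
           ROW_NUMBER() OVER (PARTITION BY acad_career ORDER BY effdt DESC) AS rn
    FROM car_tbl
"""

# Condition in ON: unmatched rows from the left table survive with NULLs.
in_on = conn.execute(f"""
    SELECT e.emplid, t.rn
    FROM enrl e
    LEFT JOIN ({ranked}) t ON t.acad_career = e.acad_career AND t.rn = 1
    ORDER BY e.emplid
""").fetchall()

# Condition in WHERE: rows where t.rn is NULL are filtered out afterwards.
in_where = conn.execute(f"""
    SELECT e.emplid, t.rn
    FROM enrl e
    LEFT JOIN ({ranked}) t ON t.acad_career = e.acad_career
    WHERE t.rn = 1
    ORDER BY e.emplid
""").fetchall()

print(in_on)     # [('E1', 1), ('E2', None)]
print(in_where)  # [('E1', 1)]
```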

Using LAG to Find Previous Value in Oracle

I'm trying to use the LAG function in Oracle to find the previous registration value for donors.
To see my data, I started with this query to find all registrations for the particular donor:
select registration_id, registration_date from registration r where r.person_id=52503290 order by r.registration_date desc;
Then I used the LAG function to return the previous value along with the most recent value:
select registration_id as reg_id, registration_date as reg_date,
lag(registration_date,1) over (order by registration_date) as prev_reg_date
from registration
where person_id=52503290
order by registration_date desc;
And the results are as expected:
So I thought I should be good to place the LAG function within the main query to get the previous value but for some reason, the previous value returns NULL or no value at all.
SELECT
P.Person_Id AS Person_ID,
R.Registration_Date AS Drive_Date,
LAG(R.Registration_Date,1) OVER (ORDER BY R.REGISTRATION_DATE) AS Previous_Drive_Date,
P.Abo AS Blood_Type,
DT.Description AS Donation_Type
FROM
Person P
JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
WHERE
TRUNC(R.Registration_Date) = TRUNC(SYSDATE)-1
AND R.Person_Id=52503290
ORDER BY
R.Registration_Date DESC;
Here is the result set:
Any suggestions on what I am missing here? Or why this query isn't returning the values expected?
Based on @Alex Poole's suggestions, I changed the query to look like:
SELECT * FROM (
SELECT
P.Person_Id AS Person_ID,
R.Registration_Date AS Drive_Date,
LAG(R.Registration_Date,1) OVER (partition by p.person_id ORDER BY r.registration_date) AS Previous_Drive_Date,
P.Abo AS Blood_Type,
DT.Description AS Donation_Type
FROM
Person P
JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
--WHERE R.Person_Id=52503290
)
WHERE TRUNC(Drive_Date) = TRUNC(SYSDATE)-1
ORDER BY Drive_Date DESC;
It takes about 85 seconds to pull back the first 30 rows:
My original query (before the LAG function was added) took about 2 seconds to pull back approximately 2100 records. But it was nothing but a SELECT with a couple of JOINS and one item in the WHERE clause.
Looking at the record counts, Person has almost 5.5 million records and Registration has 9.1 million records.
The lag is only applied within the rows that match the where clause filter, so you would only see the previous value if that was also yesterday.
You can apply the lag in a subquery, and then filter in an outer query:
SELECT * FROM (
SELECT
P.Person_Id AS Person_ID,
R.Registration_Date AS Drive_Date,
LAG(R.Registration_Date,1) OVER (ORDER BY R.REGISTRATION_DATE) AS Previous_Drive_Date,
P.Abo AS Blood_Type,
DT.Description AS Donation_Type
FROM
Person P
JOIN Registration R ON P.Person_Id = R.Person_Id AND P.First_Name <> 'Pooled' AND P.First_Name <> 'IMPORT'
LEFT OUTER JOIN Drives DR ON R.Drive_Id = DR.Drive_Id AND DR.Group_Id <> 24999
LEFT OUTER JOIN Branches B ON R.Branch_Id = B.Branch_Id
LEFT OUTER JOIN Donor_Group DG on DR.Group_Id = DG.Group_Id
LEFT OUTER JOIN Donation_Type DT ON R.Donation_Type_Id = DT.DONATION_TYPE_ID
WHERE R.Person_Id=52503290
)
WHERE TRUNC(Drive_Date) = TRUNC(SYSDATE)-1
ORDER BY Drive_Date DESC;
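The effect described in this answer can be reproduced on a toy table. The sketch below uses SQLite from Python (assuming SQLite 3.25 or newer); the `registration` rows and the target date are made up, and a fixed date replaces TRUNC(SYSDATE)-1 so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registration (person_id INTEGER, registration_date TEXT);
    INSERT INTO registration VALUES
        (1, '2024-01-10'),
        (1, '2024-03-05'),
        (1, '2024-06-20');
""")

target = '2024-06-20'  # stands in for "yesterday"

# Filter and LAG at the same level: only one row survives the WHERE,
# so the window holds no earlier row and LAG yields NULL.
same_level = conn.execute("""
    SELECT registration_date,
           LAG(registration_date, 1)
               OVER (PARTITION BY person_id ORDER BY registration_date)
    FROM registration
    WHERE registration_date = ?
""", (target,)).fetchall()

# LAG computed over all rows in a subquery, date filter applied outside.
nested = conn.execute("""
    SELECT reg_date, prev_reg_date
    FROM (
        SELECT registration_date AS reg_date,
               LAG(registration_date, 1)
                   OVER (PARTITION BY person_id ORDER BY registration_date)
                   AS prev_reg_date
        FROM registration
    )
    WHERE reg_date = ?
""", (target,)).fetchall()

print(same_level)  # [('2024-06-20', None)]
print(nested)      # [('2024-06-20', '2024-03-05')]
```

The nested form has to compute LAG over every row the inner query returns before the outer filter applies, which is consistent with the slowdown reported once the person filter was commented out.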

RANK OVER function in Hive

I'm trying to run this query in Hive to return only the 10 URLs that appear most often in the adimpression table.
select
ranked_mytable.url,
ranked_mytable.cnt
from
( select iq.url, iq.cnt, rank() over (partition by iq.url order by iq.cnt desc) rnk
from
( select url, count(*) cnt
from store.adimpression ai
inner join zuppa.adgroupcreativesubscription agcs
on agcs.id = ai.adgroupcreativesubscriptionid
inner join zuppa.adgroup ag
on ag.id = agcs.adgroupid
where ai.datehour >= '2014-05-15 00:00:00'
and ag.siteid = 1240
group by url
) iq
) ranked_mytable
where
ranked_mytable.rnk <= 10
order by
ranked_mytable.url,
ranked_mytable.rnk desc
;
Unfortunately I get an error message stating:
FAILED: SemanticException [Error 10002]: Line 26:23 Invalid column reference 'rnk'
I've tried to debug it, and everything runs smoothly up to and including the ranked_mytable subquery. I've also tried commenting out the where ranked_mytable.rnk <= 10 clause, but the error message keeps appearing.
Hive is unable to order by a column that is not in the "output" of a select statement. To fix it, just include that column in the selected columns:
select
ranked_mytable.url,
ranked_mytable.cnt,
ranked_mytable.rnk
from
( select iq.url, iq.cnt, rank() over (partition by iq.url order by iq.cnt desc) rnk
from
( select url, count(*) cnt
from store.adimpression ai
inner join zuppa.adgroupcreativesubscription agcs
on agcs.id = ai.adgroupcreativesubscriptionid
inner join zuppa.adgroup ag
on ag.id = agcs.adgroupid
where ai.datehour >= '2014-05-15 00:00:00'
and ag.siteid = 1240
group by url
) iq
) ranked_mytable
where
ranked_mytable.rnk <= 10
order by
ranked_mytable.url,
ranked_mytable.rnk desc
;
If you don't want that 'rnk' column in your final output, I expect you could wrap that whole thing in another inner-query and just select out the 'url' and 'cnt' fields.
RANK OVER is not the best function to achieve this goal.
A better solution would be to use a combination of SORT BY and LIMIT. On its own, LIMIT returns an arbitrary set of rows, but combined with SORT BY (and a single reducer, so that the sort covers all rows) it returns the top rows. From the Apache Wiki:
-- Top k queries. The following query returns the top 5 sales records wrt amount.
SET mapred.reduce.tasks = 1;
SELECT * FROM sales SORT BY amount DESC LIMIT 5;
The query can be re-written in this way:
select
iq.url,
iq.cnt
from
( select url, count(*) cnt
from store.adimpression ai
inner join zuppa.adgroupcreativesubscription agcs
on agcs.id = ai.adgroupcreativesubscriptionid
inner join zuppa.adgroup ag
on ag.id = agcs.adgroupid
where ai.datehour >= '2014-05-15 00:00:00'
and ag.siteid = 1240
group by url ) iq
sort by
iq.cnt desc
limit
10
;
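Remove the partition by iq.url clause from rank() over (...) and re-run the query. The inner query already groups by url, so each url partition contains a single row and every rnk comes out as 1; without the partition, the URLs are ranked against each other.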
Thanks & Regards,
Kamleshkumar Gujarathi
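The effect of that PARTITION BY clause can be seen on a toy table. This sketch uses SQLite from Python (assuming SQLite 3.25 or newer) rather than Hive, and the `url_counts` table is a hypothetical stand-in for the grouped inner query, so it only illustrates the ranking semantics, not Hive's column-resolution error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE url_counts (url TEXT, cnt INTEGER);
    INSERT INTO url_counts VALUES
        ('a.example', 30), ('b.example', 20), ('c.example', 10);
""")

# One row per url, so partitioning by url gives every row rank 1.
partitioned = conn.execute("""
    SELECT url, RANK() OVER (PARTITION BY url ORDER BY cnt DESC) AS rnk
    FROM url_counts
    ORDER BY url
""").fetchall()

# Without the partition, all urls are ranked against each other.
global_top2 = conn.execute("""
    SELECT url, cnt
    FROM (
        SELECT url, cnt, RANK() OVER (ORDER BY cnt DESC) AS rnk
        FROM url_counts
    )
    WHERE rnk <= 2
    ORDER BY cnt DESC
""").fetchall()

print(partitioned)  # [('a.example', 1), ('b.example', 1), ('c.example', 1)]
print(global_top2)  # [('a.example', 30), ('b.example', 20)]
```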
Put the keyword AS before the rnk alias, i.e. rank() over (...) as rnk. It should work fine.

Distinct on one column in linq with joins

I know we can get distinct on one column using following query:
SELECT *
FROM (SELECT A, B, C,
ROW_NUMBER() OVER (PARTITION BY B ORDER BY A) AS RowNumber
FROM MyTable
WHERE B LIKE 'FOO%') AS a
WHERE a.RowNumber = 1
I have used a similar SQL query in my case, where I am joining multiple tables, but my project is in MVC 4 and I need the LINQ to Entities equivalent. Here is my code:
select * from
(
select fp.URN_No,
ROW_NUMBER() OVER
(PARTITION BY pdh.ChangedOn ORDER BY fp.CreatedOn)
as num,
fp.CreatedOn, pdh.FarmersName, pdh.ChangedOn, cdh.Address1, cdh.State, ich.TypeOfCertificate, ich.IdentityNumber, bdh.bankType, bdh.bankName,
pidh.DistrictId, pidh.PacsRegistrationNumber, idh.IncomeLevel, idh.GrossAnnualIncome
from MST_FarmerProfile as fp inner join PersonalDetailsHistories as pdh on fp.personDetails_Id = pdh.PersonalDetails_Id
inner join ContactDetailsHistories as cdh on fp.contactDetails_Id = cdh.ContactDetails_Id
inner join IdentityCertificateHistories as ich on fp.IdentityCertificate_Id = ich.IdentityCertificate_Id
inner join BankDetailsHistories as bdh on fp.BankDetails_Id = bdh.BankDetails_Id
left join PacInsuranceDataHistories as pidh on fp.PacsInsuranceData_Id = pidh.PacsInsuranceData_Id
left join IncomeDetailsHistories as idh on fp.IncomeDetails_Id = idh.IncomeDetails_Id
where URN_No in(
select distinct MST_FarmerProfile_URN_No from PersonalDetailsHistories where MST_FarmerProfile_URN_No in(
select URN_No from MST_FarmerProfile where (CreatedOn>=@fromDate and CreatedOn<= @toDate and Status='Active')))
)a where a.num=1
Use this LINQ query on the result list after fetching it from SQL. Replace p.ID with the column you want to be distinct:
List<Person> distinctRecords = YourResultList
.GroupBy(p => new { p.ID})
.Select(g => g.First())
.ToList();
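For readers who want to check the "group by key, take the first of each group" logic outside of C#, here is a rough Python sketch of the same idea. The `Person` class, the `distinct_by` helper and the sample data are all hypothetical and only mirror the shape of the LINQ snippet above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    id: int
    name: str


def distinct_by(items, key):
    """Keep the first item seen for each key, preserving input order."""
    first_seen = {}
    for item in items:
        # setdefault stores the item only if the key is not present yet.
        first_seen.setdefault(key(item), item)
    return list(first_seen.values())


people = [Person(1, "Ann"), Person(2, "Bob"), Person(1, "Ann (duplicate)")]
unique_people = distinct_by(people, key=lambda p: p.id)

print([p.name for p in unique_people])  # ['Ann', 'Bob']
```

As with the LINQ version, which row counts as "first" depends on the order of the input list, so sort it beforehand if a specific row per key is required.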
