I have two tables in Oracle: master_file and payment.
master_file table:
id (primary key), nama, status
41121, john, PL
41122, ryan, UP
41121, john, UP
There is duplicate data in the id column (id 41121). I do not know where the mistake is that lets the id column store duplicate values.
payment table:
id, idr
41121, 1000
41122, 500
My query:
select a.id,a.nama,a.status,b.idr from master_file a, payment b
where a.id=b.id(+)
Result:
id, nama, status, idr
41121, john, PL, 1000
41122, ryan, UP, 500
41121, john, UP, 1000
The result I want:
id, nama, status, idr
41121, john, PL, 1000
41122, ryan, UP, 500
I know I cannot use DISTINCT to remove the row 41121, john, UP, 1000.
Please give me a solution. Thanks in advance. :)
Your master table cannot have a primary key on ID as it has duplicate values; perhaps it has a foreign key or a composite primary key, say on ID and status.
Assuming your data and data model are correct and you don't want to modify them, you need to establish the precedence of the possible status values. From your question it seems PL takes precedence over UP, but there may be other possible values, and you may have more than two rows for a given ID. Once you know the precedence you can use it in a case expression to create a numeric equivalent, and use that to rank the 'duplicate' rows:
select a.id, a.nama, a.status, b.idr,
dense_rank() over (partition by a.id order by
case a.status when 'UP' then 1 when 'PL' then 2 else 0 end desc) as rnk
from master_file a
left join payment b on b.id = a.id
order by a.id;
ID NAMA STATUS IDR RNK
---------- ---------- ------ ---------- ----------
41121 john PL 1000 1
41121 john UP 1000 2
41122 ryan UP 500 1
You can then use that as an inline view and filter out all but the highest-precedence rows:
select id, nama, status, idr
from (
select a.id, a.nama, a.status, b.idr,
dense_rank() over (partition by a.id order by
case a.status when 'UP' then 1 when 'PL' then 2 else 0 end desc) as rnk
from master_file a
left join payment b on b.id = a.id
)
where rnk = 1
order by id;
ID NAMA STATUS IDR
---------- ---------- ------ ----------
41121 john PL 1000
41122 ryan UP 500
It's also possible to use an inline view that filters master_file before joining, but that's largely a matter of preference.
SQL Fiddle demo.
If the possible status values are in another reference table, and their precedence is established in that table, then you could join to that instead of doing it manually through case.
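A minimal sketch of that reference-table variant follows. The table status_ref and its precedence column are hypothetical names, not part of the original schema. The SQL is run through Python's sqlite3 module (assuming a bundled SQLite of 3.25 or later, which has window functions) only so that the example is self-contained; the query uses standard syntax that should carry over to Oracle, but it has not been run there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table master_file (id integer, nama text, status text);
    create table payment (id integer, idr integer);
    -- hypothetical lookup table: the higher precedence wins
    create table status_ref (status text primary key, precedence integer);

    insert into master_file values
        (41121, 'john', 'PL'), (41122, 'ryan', 'UP'), (41121, 'john', 'UP');
    insert into payment values (41121, 1000), (41122, 500);
    insert into status_ref values ('UP', 1), ('PL', 2);
""")

QUERY = """
    select id, nama, status, idr
    from (
        select a.id, a.nama, a.status, b.idr,
               -- statuses missing from status_ref fall back to 0, like the case ... else 0
               dense_rank() over (partition by a.id
                                  order by coalesce(s.precedence, 0) desc) as rnk
        from master_file a
        left join payment b on b.id = a.id
        left join status_ref s on s.status = a.status
    )
    where rnk = 1
    order by id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Keeping the precedence in a table means a new status value only needs a new row in status_ref rather than a change to every query that ranks by status.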
Do you want to remove the row with 'UP' in it?
In that case you need to exclude it with another condition in your WHERE clause:
... where a.id=b.id(+) and a.status <> 'UP'
Note that this also removes the row for id 41122, whose only status is UP.
I want to delete the duplicates from a table, update the unique identifier, and merge each duplicate into the already existing record.
I have a table that can contain the following records:
ID Name Req_qty
1001 ABC-02/01+Time 10
1001 ABC-03/01+Time 20
1001 ABC 30
1002 XYZ 40
1003 DEF-02/01+Time 10
1003 DEF-02/01+Time 20
And I am expecting the records after the operation as follows:
ID Name Req_Qty
1001 ABC 60
1002 XYZ 40
1003 DEF 30
Any assistance would be really helpful. Thanks!
It is possible to do this in a single SQL statement:
merge into (select rowid as rid, x.* from test_table x ) o
using ( select id
, regexp_substr(name, '^[[:alpha:]]+') as name
, sum(req_qty) as req_qty
, min(rowid) as rid
from test_table
group by id
, regexp_substr(name, '^[[:alpha:]]+')
) n
on (o.id = n.id)
when matched then
update
set o.name = n.name
, o.req_qty = n.req_qty
delete where o.rid > n.rid;
Working example
This uses a couple of tricks:
The delete clause of a merge statement only operates on rows that have been updated, so there's no restriction on what gets updated.
You can't select rowid from a "view", so it's faked as rid before updating.
By selecting the minimum rowid per ID we make an arbitrary choice about which row we're going to keep. We can then delete all the rows that have a "greater" rowid. If you have a primary key or any other column you'd prefer to use as a discriminator, just substitute that column for rowid (and ensure it's indexed if your table has any volume!).
Note that the regular expression differs from the other answer; it uses a caret (^) to anchor the search to the beginning of the string before looking for all alpha characters thereafter. This isn't required, as the default start position for REGEXP_SUBSTR() is the first character (1-indexed), but it makes the intention clearer.
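In your case, you will need to update the records first and then delete the records that are no longer required, as follows (but see the update at the end):
-- Put the per-ID total on one arbitrarily chosen row for each ID
UPDATE TABLE1 T
SET T.REQ_QTY = (
SELECT
SUM(TIN.REQ_QTY) AS REQ_QTY
FROM
TABLE1 TIN
WHERE TIN.ID = T.ID
)
WHERE (T.ROWID,1) IN
(SELECT TIN1.ROWID, ROW_NUMBER() OVER (PARTITION BY TIN1.ID ORDER BY TIN1.ROWID)
FROM TABLE1 TIN1); -- ROW_NUMBER() needs an ORDER BY; rowid order is arbitrary
-- Delete every row that is not the one holding the total
-- (assumes quantities are positive, so the total is the largest value per ID)
DELETE FROM TABLE1 T
WHERE EXISTS (SELECT 1 FROM TABLE1 TIN
WHERE TIN.ID = T.ID AND TIN.REQ_QTY > T.REQ_QTY);
-- Strip the suffix from the name
UPDATE TABLE1 SET NAME = regexp_substr(NAME,'[[:alpha:]]+');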
--Update--
The following merge should work for you
MERGE INTO
(select rowid as rid, T.* from MY_TABLE1 T ) MT
USING
(
SELECT * FROM
(SELECT ID,
regexp_substr(NAME,'^[[:alpha:]]+') AS NAME_UPDATED,
SUM(Req_qty) OVER (PARTITION BY ID) AS Req_qty_SUM,
ROWID AS RID,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ROWID) AS RN
FROM MY_TABLE1) MT1
WHERE RN = 1
) mt1
ON (MT.ID = MT1.ID)
WHEN MATCHED THEN
UPDATE SET MT.NAME = MT1.NAME_UPDATED, MT.Req_qty = MT1.Req_qty_SUM
delete where (MT.RID <> MT1.RID);
Cheers!!
This example is invented for the purpose of the question.
SELECT
PR.PROVINCE_NAME
,CO.COUNTRY_NAME
FROM
PROVINCE PR
JOIN COUNTRY CO ON CO.COUNTRY_ID=PR.COUNTRY_ID
WHERE
PR.PROVINCE_ID IN (1,2)
Let's assume that COUNTRY_ID is not the Primary Key in the Country table and the above join on Country table returns potentially multiple rows. We don't know how many rows and we don't care why there are multiple ones. We only want to join on one of them, so we get one row per Province.
I tried a subquery in the join, but Oracle 11.2 does not let me reference PR.COUNTRY_ID inside it. Are there any other ways that this can be achieved?
A typical safe approach to handling tables without a PK is to extend the duplicated column with a sequence number that makes it unique (the row_number of each duplicated row).
In your case this would be:
with COUNTRY_UNIQUE as (
select COUNTRY_ID,
row_number() over (partition by COUNTRY_ID order by COUNTRY_NAME) rn,
COUNTRY_NAME
from country)
select * from COUNTRY_UNIQUE
order by COUNTRY_ID, rn;
leading to
COUNTRY_ID RN COUNTRY_NAME
---------- ---------- ------------
1 1 C1
2 1 C2
2 2 C3
The combination of COUNTRY_ID and RN is unique, so if you restrict the rows to RN = 1, COUNTRY_ID is unique.
You can define the order of the duplicated records and use it to control the selection; in our case we choose the smallest COUNTRY_NAME.
The whole join uses this subquery and restricts the countries to RN = 1:
with COUNTRY_UNIQUE as (
select COUNTRY_ID,
row_number() over (partition by COUNTRY_ID order by COUNTRY_NAME) rn,
COUNTRY_NAME
from country)
SELECT
PR.PROVINCE_NAME
,CO.COUNTRY_NAME
FROM
PROVINCE PR
JOIN COUNTRY_UNIQUE CO ON CO.COUNTRY_ID=PR.COUNTRY_ID
WHERE
PR.PROVINCE_ID IN (1,2)
AND CO.RN = 1; /* consider only unique countries */
If you have Oracle 12c, you can use a LATERAL view in the join. Like this:
SELECT
PR.PROVINCE_NAME
,CO.COUNTRY_NAME
FROM
PROVINCE PR
CROSS JOIN LATERAL (
SELECT * FROM COUNTRY CO
WHERE CO.COUNTRY_ID=PR.COUNTRY_ID
FETCH FIRST 1 ROWS ONLY) CO
WHERE
PR.PROVINCE_ID IN (1,2)
Update for Oracle 11.2
In Oracle 11.2, you can use something along these lines. Depending on the size of COUNTRY and how many duplicates there are per COUNTRY_ID, it could perform as well as or better than the 12c approach (fewer buffer gets, but more memory required).
SELECT pr.province_name,
co.country_name
FROM province pr
INNER JOIN (SELECT *
FROM (SELECT co.*,
ROW_NUMBER () OVER (PARTITION BY co.country_id ORDER BY co.country_name) rn
FROM country co)
WHERE rn = 1) co
ON co.country_id = pr.country_id
WHERE pr.province_id IN (1, 2)
I'm using this query:
SELECT *
FROM HISTORY
LEFT JOIN CUSTOMER ON CUSTOMER.CUST_NUMBER = HISTORY.CUST_NUMBER
LEFT JOIN (
Select LOAN_DATE, CUST_NUMBER, ACCOUNT_NUMBER, STOCK_NUMBER, LOC_SALE
From LOAN
WHERE ACCOUNT_NUMBER != 'DD'
ORDER BY LOAN_DATE DESC
) LOAN ON LOAN.CUST_NUMBER = HISTORY.CUST_NUMBER
order by DATE desc
But I want only the top result from the loan table to be joined (Most recent by Loan_date). For some reason, it's getting three records (one for each loan on the customer I'm looking at). I'm sure I'm missing something simple?
If you're after joining the latest loan row per cust_number, then this ought to do the trick:
select *
from history
left join customer on customer.cust_number = history.cust_number
left join (select loan_date,
cust_number,
account_number,
stock_number,
loc_sale
from (select loan_date,
cust_number,
account_number,
stock_number,
loc_sale,
row_number() over (partition by cust_number
order by loan_date desc) rn
from loan
where account_number != 'DD')
where rn = 1) loan on loan.cust_number = history.cust_number
order by date desc;
If there are two rows with the same loan_date per cust_number and you want to retrieve both, then change the row_number() analytic function to rank().
If you only want to retrieve one row, then you'd have to add additional columns to the order by so that the tied rows always sort in the same order; otherwise you could find that you sometimes get different rows returned on subsequent runs of the query.
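To make the tie behaviour concrete, here is a small sketch with made-up loan rows, again run through Python's sqlite3 module (SQLite 3.25 or later assumed) rather than the original database. The helper name latest and the use of account_number as the tie-breaker are assumptions for illustration only; any column that makes the ordering unique would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_date text, cust_number integer, account_number text);
    -- made-up data: customer 1 has two loans on the same most recent date
    insert into loan values
        ('2020-01-05', 1, 'A1'),
        ('2020-03-01', 1, 'A2'),
        ('2020-03-01', 1, 'A3'),
        ('2020-02-10', 2, 'B1');
""")

def latest(func, order_by):
    """Return the rows numbered 1 per customer by the given analytic function."""
    sql = f"""
        select cust_number, account_number
        from (select cust_number, account_number,
                     {func}() over (partition by cust_number
                                    order by {order_by}) rn
              from loan)
        where rn = 1
        order by cust_number, account_number
    """
    return conn.execute(sql).fetchall()

# rank() keeps both tied rows for customer 1
with_rank = latest("rank", "loan_date desc")
# row_number() with a tie-breaker keeps exactly one row, and always the same one
with_tiebreak = latest("row_number", "loan_date desc, account_number desc")

print(with_rank)
print(with_tiebreak)
```

With only loan_date in the order by, row_number() would still return a single row for customer 1, but which of the two tied rows it picks is not guaranteed.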
So I have a table like this. This is a standard Order header - Order Detail table:
order_id order_line
----------- -----------
100 1
100 2
100 3
101 1
102 1
103 1
103 2
104 1
105 1
Now, how can I make a SELECT that will only pick the orders that only have one line?
In this case I don't want orders 100 and 103.
Thanks!
Tiago
Counting lines with "group by order_id" is a good solution. However, if every order's lines are numbered starting from 1, counting is not needed and the simpler max function works just as well:
select order_id from orders
group by order_id
having max(order_line)=1;
If order_line values are consecutive, a further "optimization" is possible, because rows beyond the second line are not needed to make the decision:
select order_id from orders
where order_line <= 2
group by order_id
having max(order_line)=1;
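The max-based query depends on line numbers starting at 1. The sketch below compares it with a count-based query on the question's data plus one made-up order (106) whose only line is numbered 2; that extra row, the helper name order_ids, and the use of Python's sqlite3 module to run the SQL are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (order_id integer, order_line integer);
    insert into orders values
        (100, 1), (100, 2), (100, 3), (101, 1), (102, 1),
        (103, 1), (103, 2), (104, 1), (105, 1),
        (106, 2);  -- made-up order whose single line is not numbered 1
""")

def order_ids(having):
    """Return the order ids whose group satisfies the given HAVING condition."""
    sql = (
        "select order_id from orders "
        f"group by order_id having {having} order by order_id"
    )
    return [row[0] for row in conn.execute(sql)]

by_max = order_ids("max(order_line) = 1")
by_count = order_ids("count(*) = 1")

print(by_max)    # order 106 is missed
print(by_count)  # order 106 is included
```

If the numbering is guaranteed to start at 1 for every order, the two conditions select the same orders.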
Group by the order_id and take only those having 1 record per group
select order_id
from orders
group by order_id
having count(*) = 1
If you need the complete record then do
select t1.*
from orders t1
join
(
select order_id
from orders
group by order_id
having count(*) = 1
) t2 on t1.order_id = t2.order_id
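You can try the following query too. Group only by order_id, since grouping by order_line as well would put every row in its own group:
select order_id, min(order_line) as order_line
from Order_Detail
group by order_id
having count(order_id)<2;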
My table has several records with the same MemberID. I want only one record per MemberID in the result.
select DISTINCT(MemberID) from AnnualFees;
This gives the result I expect. But I also want to show the other columns, and when I do this
select DISTINCT(MemberID),StartingDate,ExpiryDate,Amount from AnnualFees;
all the rows are displayed, including those with the same MemberID.
Can someone help me?
Assuming you just want any row at random for each memberid, then you can do this:
select memberid, this, that, theother
from
(
select memberid, this, that, theother,
row_number() over (partition by memberid order by this) rn
from annualfees
)
where rn = 1;
If you wanted a specific row per memberid, e.g. the one with the most recent StartingDate, then you could modify it to:
select memberid, this, that, theother
from
(
select memberid, this, that, theother,
row_number() over (partition by memberid order by StartingDate desc) rn
from annualfees
)
where rn = 1;
I don't know if this is quite what you need, but you may need to look at GROUP BY instead of DISTINCT.
If you have several records with the same member id, you may need to specify exactly how to identify the one you want from the others,
e.g. to get each member's last starting date:
SELECT memberid, max(startingdate)
FROM annualfees
GROUP BY memberid
But if you need to identify one record in this way and also display the other columns, I think you may need to do some trickery,
e.g. use the above SELECT as a sub-query and join it back to the table to pick up the other columns you want:
SELECT subq.memid, subq.startdate, a.expirydate, a.amount
FROM (
SELECT memberid AS memid, max(startingdate) AS startdate
FROM annualfees
GROUP BY memberid ) subq
INNER JOIN annualfees a ON a.memberid = subq.memid
AND a.startingdate = subq.startdate
From start to finish, also showing the table data (the output was captured using "SET VERIFY ON"):
-- show all rows
select *
from annualfees
order by memberid, startingdate
MEMBERID STARTINGDATE EXPIRYDATE AMOUNT
---------------------- ------------------------- -------------------- --------------------
1 02-DEC-09 05-FEB-10 111
1 25-JUN-10 25-JUN-11 222
2 25-APR-10 25-JUN-13 333
3 rows selected
/
-- show each member's latest row using max(startingdate) as selector.
SELECT memberid, max(startingdate)
FROM annualfees
GROUP BY memberid
MEMBERID MAX(STARTINGDATE)
---------------------- -------------------------
1 25-JUN-10
2 25-APR-10
2 rows selected
/
-- show above data joined with the other columns.
SELECT subq.memid, subq.startdate, a.expirydate, a.amount
FROM (
SELECT memberid AS memid, max(startingdate) AS startdate
FROM annualfees
GROUP BY memberid ) subq
INNER JOIN annualfees a ON a.memberid = subq.memid AND a.startingdate = subq.startdate
MEMID STARTDATE EXPIRYDATE AMOUNT
---------------------- ------------------------- -------------------- --------------------
1 25-JUN-10 25-JUN-11 222
2 25-APR-10 25-JUN-13 333
2 rows selected
/
You need to select which of the rows with duplicate MemberIDs to return in some way. This will get the row with the greatest startingDate.
SELECT MemberID,StartingDate,ExpiryDate,Amount
FROM AnnualFees af
WHERE NOT EXISTS (
SELECT * from AnnualFees af2
WHERE af2.MemberID = af.MemberID
AND af2.StartingDate > af.StartingDate)
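select DISTINCT MemberID,StartingDate,ExpiryDate,Amount from AnnualFees;
Remove the parentheses. Note that DISTINCT applies to the whole select list either way, so rows that differ in any column are still returned.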
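A quick check, using the sample rows shown earlier in this thread with the dates rewritten in ISO format, suggests that the parentheses make no difference: DISTINCT is not a function applied to MemberID, it de-duplicates whole rows. The sketch runs the SQL through Python's sqlite3 module as a stand-in for Oracle, which is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table AnnualFees (MemberID integer, StartingDate text,
                             ExpiryDate text, Amount integer);
    insert into AnnualFees values
        (1, '2009-12-02', '2010-02-05', 111),
        (1, '2010-06-25', '2011-06-25', 222),
        (2, '2010-04-25', '2013-06-25', 333);
""")

ORDER = " order by MemberID, StartingDate"

with_parens = conn.execute(
    "select DISTINCT(MemberID), StartingDate, ExpiryDate, Amount from AnnualFees" + ORDER
).fetchall()
without_parens = conn.execute(
    "select DISTINCT MemberID, StartingDate, ExpiryDate, Amount from AnnualFees" + ORDER
).fetchall()

# Both forms return all three rows, because the two rows for member 1
# differ in their other columns.
print(with_parens == without_parens, len(with_parens))
```

To get one row per MemberID, one of the row_number(), GROUP BY, or NOT EXISTS approaches from the other answers is needed.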