How to ignore duplicate records in sybase? - distinct

I'm trying to retrieve only unique records from a table, but I guess something is wrong with my query.
select distinct RIID, duplicateInfo from duplicateRecords where RIID > 3920011
When I execute the above query I get this result:
RIID | duplicateInfo
___________________________________
3920011 Repeated:12009:CLEAR
3920011 Repeated:12012:CLEAR
4233901 Repeated:18129:HIT
4820129 Repeated:22901:PENDING
4820129 Repeated:22983:PENDING
And I want the result below:
RIID | duplicateInfo
___________________________________
3920011 Repeated:12012:CLEAR
4233901 Repeated:18129:HIT
4820129 Repeated:22983:PENDING
Please any help would be highly appreciated.
Thanks

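select distinct RIID,
(select m.duplicateInfo
from duplicateRecords m
where m.RIID = d.RIID
-- the number starts at character 10; this assumes it is always 5 digits long
-- max() keeps the row with the highest number, as in the desired result
having cast(substring(m.duplicateInfo,10,5) as int) = max(cast(substring(m.duplicateInfo,10,5) as int))) as duplicateInfo
from duplicateRecords d
where RIID > 3920011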

I don't have Sybase to test this with.
Here is an example from MySQL to give you some pointers.
DROP TABLE IF EXISTS `duplicaterecords`;
CREATE TABLE `duplicaterecords` (
`RRID` int(11) DEFAULT NULL,
`duplicateInfo` varchar(200) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `duplicaterecords` (`RRID`, `duplicateInfo`)
VALUES
(3920011,'Repeated:12009:CLEAR'),
(3920011,'Repeated:12012:CLEAR'),
(4233901,'Repeated:18129:HIT'),
(4820129,'Repeated:22901:PENDING'),
(4820129,'Repeated:22983:PENDING'),
(4233901,'Duplicate:5555555:CLEAR');
select grouped.*
, base.duplicateInfo
from
(
select grouped.RRID, max(grouped.duplicateInfoId) duplicateInfoId
from (
select RRID
, cast(substring_index(substring_index(duplicateInfo,':',2 ),':',-1) as unsigned) duplicateInfoId
, duplicateInfo
from duplicateRecords
) grouped
group by grouped.RRID
) grouped
inner join (
select RRID
, cast(substring_index(substring_index(duplicateInfo,':',2 ),':',-1) as unsigned) duplicateInfoId
, duplicateInfo
from duplicateRecords
) base
on grouped.RRID = base.RRID
and grouped.duplicateInfoId = base.duplicateInfoId ;
-- example results
RRID duplicateInfoId duplicateInfo
3920011 12012 Repeated:12012:CLEAR
4233901 5555555 Duplicate:5555555:CLEAR
4820129 22983 Repeated:22983:PENDING
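The "group, take the maximum, join back" idea can be tried without a MySQL or Sybase server. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The `parsed` CTE and the `infoId` column name are made up for illustration, and the parsing relies on SQLite's CAST reading the leading digits of a string, which is SQLite behaviour and not something to expect from Sybase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE duplicateRecords (RRID INTEGER, duplicateInfo TEXT);
INSERT INTO duplicateRecords (RRID, duplicateInfo) VALUES
  (3920011, 'Repeated:12009:CLEAR'),
  (3920011, 'Repeated:12012:CLEAR'),
  (4233901, 'Repeated:18129:HIT'),
  (4820129, 'Repeated:22901:PENDING'),
  (4820129, 'Repeated:22983:PENDING'),
  (4233901, 'Duplicate:5555555:CLEAR');
""")

# 'parsed' and 'infoId' are hypothetical names used only in this sketch.
# substr(..., instr(..., ':') + 1) yields e.g. '12009:CLEAR'; SQLite's CAST
# then keeps the leading integer prefix (12009).
QUERY = """
WITH parsed AS (
  SELECT RRID,
         duplicateInfo,
         CAST(substr(duplicateInfo, instr(duplicateInfo, ':') + 1) AS INTEGER) AS infoId
  FROM duplicateRecords
)
SELECT p.RRID, p.duplicateInfo
FROM parsed p
JOIN (SELECT RRID, MAX(infoId) AS infoId FROM parsed GROUP BY RRID) g
  ON g.RRID = p.RRID          -- join on the key as well as on the maximum
 AND g.infoId = p.infoId
ORDER BY p.RRID
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Joining on both the key and the maximum matters: joining on the number alone could pair a row with a different RRID that happens to carry the same number.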

There is a simpler and more efficient way -- but only if you're running on Sybase ASE (doesn't work for Sybase IQ or Sybase SQL Anywhere).
First, this is a 'duplicate key' problem, not a 'duplicate row' problem.
The trick below keeps one row per key and discards the rest. Which row survives depends on insertion order: the first row inserted for a key is kept, and later rows with the same key are discarded. So you should apply an ordering in the SELECT query to implement a different selection criterion:
CREATE TABLE uniquetab (RRID ..., duplicateInfo ...)
go
CREATE UNIQUE INDEX ix on uniquetab(RRID) WITH IGNORE_DUP_KEY
go
INSERT uniquetab
SELECT * FROM duplicateRecords ORDER BY
go
An alternative way is to BCP-out the duplicateRecords table, and then to BCP it into the uniquetab table.
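For readers without an ASE server, the "first row per key wins" behaviour can be imitated with SQLite's `INSERT OR IGNORE` on a table with a unique index. This is only an analogy, not the Sybase feature itself: the sketch below assumes that SQLite inserts rows in the order produced by the ORDER BY (which it does in practice), and the chosen sort key is an assumption about which row should be kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE duplicateRecords (RRID INTEGER, duplicateInfo TEXT);
INSERT INTO duplicateRecords (RRID, duplicateInfo) VALUES
  (3920011, 'Repeated:12009:CLEAR'),
  (3920011, 'Repeated:12012:CLEAR'),
  (4233901, 'Repeated:18129:HIT'),
  (4820129, 'Repeated:22901:PENDING'),
  (4820129, 'Repeated:22983:PENDING');

CREATE TABLE uniquetab (RRID INTEGER, duplicateInfo TEXT);
CREATE UNIQUE INDEX ix ON uniquetab (RRID);
""")

# Rows that would violate the unique index are skipped instead of raising
# an error, so the first row seen for each RRID is the one that stays.
# Sorting duplicateInfo as text works here only because the numbers all
# have the same width; that is an assumption of this sketch.
conn.execute("""
INSERT OR IGNORE INTO uniquetab (RRID, duplicateInfo)
SELECT RRID, duplicateInfo
FROM duplicateRecords
ORDER BY RRID, duplicateInfo DESC
""")

kept = conn.execute(
    "SELECT RRID, duplicateInfo FROM uniquetab ORDER BY RRID"
).fetchall()
for row in kept:
    print(row)
```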

Related

Oracle Join with operation returning null values

I'm trying to right join two tables on a column named "compte".
I need to do an addition afterwards. The problem is that some "compte" values don't exist in one of the tables, and as a result the addition returns null instead of keeping the base value.
Here's the query
SELECT t.compte,t.posdev+x.mnt
FROM (
SELECT compte,SUM(mntdev) as mnt FROM mvtc22
WHERE compte IN ('11510198451','00610198451','40010198451','40010198453','00610198461','00101980081','00101980094',
'00101980111','40010198461','40010198462','40010198466','40010198463')
AND datoper BETWEEN '01/01/22' AND '06/01/22'
GROUP BY compte
)x
RIGHT OUTER JOIN
(
SELECT c.compte,c.posdev
FROM v_sldoper c
WHERE c.compte IN ('11510198451','00610198451','40010198451','40010198453','00610198461','00101980081','00101980094',
'00101980111','40010198461','40010198462','40010198466','40010198463')
AND datpos = '31/12/21'
)t
ON t.compte = x.compte
And the results :
I'm expecting to keep the results from the second subquery if there's no "compte" in the first subquery.
Thanks In advance,
Alex
You are very close. The problem is that in Oracle SQL the result of any value + null is null, so you need to handle potential null values in each column before applying the + operator between them.
To solve the issue, you can use functions like NVL or DECODE, or a CASE WHEN expression.
Below I use the NVL function (I assume the t.posdev column cannot contain null values; otherwise apply NVL to both columns).
SELECT t.compte, t.posdev + NVL(x.mnt, 0)
FROM (
SELECT compte,SUM(mntdev) as mnt FROM mvtc22
WHERE compte IN ('11510198451','00610198451','40010198451','40010198453','00610198461','00101980081','00101980094',
'00101980111','40010198461','40010198462','40010198466','40010198463')
AND datoper BETWEEN '01/01/22' AND '06/01/22'
GROUP BY compte
)x
RIGHT OUTER JOIN
(
SELECT c.compte,c.posdev
FROM v_sldoper c
WHERE c.compte IN ('11510198451','00610198451','40010198451','40010198453','00610198461','00101980081','00101980094',
'00101980111','40010198461','40010198462','40010198466','40010198463')
AND datpos = '31/12/21'
)t
ON t.compte = x.compte
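The null propagation is easy to reproduce on a small scale. Here is a minimal sketch using Python's sqlite3 module, with made-up table names (`balances`, `movements`) and values; it uses a LEFT JOIN and the standard COALESCE function, which plays the same role here as Oracle's NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for v_sldoper and mvtc22.
CREATE TABLE balances (compte TEXT, posdev REAL);
CREATE TABLE movements (compte TEXT, mntdev REAL);
INSERT INTO balances VALUES ('A', 100.0), ('B', 200.0);
INSERT INTO movements VALUES ('A', 10.0), ('A', 5.0);  -- no movements for 'B'
""")

QUERY = """
SELECT t.compte,
       t.posdev + x.mnt              AS raw_sum,   -- NULL when x.mnt is NULL
       t.posdev + COALESCE(x.mnt, 0) AS safe_sum   -- falls back to posdev
FROM balances t
LEFT JOIN (
  SELECT compte, SUM(mntdev) AS mnt
  FROM movements
  GROUP BY compte
) x ON x.compte = t.compte
ORDER BY t.compte
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Account 'B' has no movements, so its `raw_sum` comes back as NULL (None in Python) while `safe_sum` keeps the base value.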

How to determine if an Index is required for my Oracle query

I would like to know whether an index is required, or would help, to run the query below. I don't have any idea how to analyze this question.
If someone can help, thanks.
WITH C(A0_ID, A1_ID, A1_Col0)
AS (
SELECT
Table_1.ID AS A0_ID,
Table_2.ID AS A1_ID,
Table_2.Col0 AS A1_Col0
FROM Table_1 ,Table_2
WHERE Table_2.ID = Table_1.ID
AND Table_1.col1 = ?
AND BITAND(Table_1.col2, ?) <> ?
AND Table_2.col3 IN (?,?,?)
), T(A0_ID, A1_ID, A1_Col0) AS (
SELECT
A0_ID,
A1_ID,
A1_Col0
FROM C
WHERE A1_ID = ?
UNION ALL
SELECT
C.A0_ID,
C.A1_ID,
C.A1_Col0
from C
INNER JOIN T P ON C.A1_Col0 = P.A1_ID
) SELECT A0_ID, A1_ID, A1_Col0 FROM T
The main query selects from T with no post-processing (filtering, aggregation, sorting, etc.), so it doesn't require optimization.
T is a recursive CTE based on the subquery C. Therefore, T doesn't need optimization (unless you materialized it, but that's a different story).
Now, C can be optimized:
I would consider Table_1 as the driving table since it has an equality in the filtering criteria. It also uses ID to join against Table_2. Therefore a good index for it is:
create index ix1 on Table_1 (col1, ID);
Then, Table_2 is accessed through ID, so ID should be the leading index column. You may add col3 to the index to somewhat improve the performance of the query; only a benchmark will tell if this is a wise idea. The index could look like:
create index ix2 on Table_2 (ID, col3); -- col3 is optional here
I would recommend you create these indexes and compare the performance that each option produces.
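One way to compare the options is to look at the execution plan before and after creating an index. In Oracle that would be done with EXPLAIN PLAN and DBMS_XPLAN; the sketch below shows the same before/after comparison with SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module, purely as an illustration. The query is a simplified piece of the filter on Table_1, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the original post does not show the table definition.
conn.execute("CREATE TABLE Table_1 (ID INTEGER, col1 INTEGER, col2 INTEGER)")

# Simplified version of the equality filter on Table_1 from the query above.
QUERY = "SELECT ID FROM Table_1 WHERE col1 = ?"


def plan(sql):
    """Return the plan steps SQLite reports for the statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    # The last column of each plan row is the human-readable description.
    return " | ".join(row[3] for row in rows)


before = plan(QUERY)
conn.execute("CREATE INDEX ix1 ON Table_1 (col1, ID)")
after = plan(QUERY)

print("before:", before)  # a full table scan
print("after: ", after)   # a search that uses ix1
```

A plan only tells you what the optimizer intends to do; timing the real query against realistic data volumes is still needed to decide whether the index pays off.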

Iterate a select query for a set of varchar2 in oracle

I have the select query below:
select * from BUSINESS_T
where store_code = '075'
and item_no in
(
select item_no from BUSINESS_T a
where store_code = '075'
and exists
(
select * from BUSINESS_T
where store_code = a.store_code
and item_no = a.item_no
and
(
VALID_FROM_DTIME between a.VALID_FROM_DTIME and a.VALID_TO_DTIME
or VALID_TO_DTIME between a.VALID_FROM_DTIME and a.VALID_TO_DTIME
or (VALID_FROM_DTIME > a.VALID_FROM_DTIME and a.VALID_TO_DTIME is null)
or (VALID_FROM_DTIME < a.VALID_FROM_DTIME and VALID_TO_DTIME is null)
)
and del_dtime is null
and not
(
a.rowid = rowid
)
)
)
order by item_no, VALID_FROM_DTIME
I need to run it for an array of store numbers {'071','072','073','074','075','076'}.
This array should be defined inside the query itself.
There are 400+ fixed store numbers. The above query has to be run for one store at a time, to find the overlaps within that particular store.
If I run it by passing the whole collection of store numbers, items that are common to many stores may cause a problem.
You can still use IN if you modify the first subquery to return the store/item pairs, which handles the common items:
select * from BUSINESS_T
where (store_code, item_no) in
(
select store_code, item_no from BUSINESS_T a
where store_code in ('071','072','073','074','075','076')
...
Or with a collection:
select * from BUSINESS_T
where (store_code, item_no) in
(
select store_code, item_no from BUSINESS_T a
where store_code member of sys.dbms_debug_vc2coll('071','072','073','074','075','076')
...
db<>fiddle with very simple demo of the idea.
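The difference between matching on the item alone and matching on the store/item pair shows up with very little data. Below is a minimal sketch with Python's sqlite3 module, using a trimmed-down BUSINESS_T with integer validity bounds and a simplified overlap test; those simplifications are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BUSINESS_T (store_code TEXT, item_no TEXT,
                         valid_from INTEGER, valid_to INTEGER);
INSERT INTO BUSINESS_T VALUES
  ('071', 'A', 1, 10), ('071', 'A', 5, 15),   -- overlapping in store 071
  ('072', 'A', 1, 10), ('072', 'A', 20, 30);  -- same item, no overlap in 072
""")

# Rows that overlap another row of the same store and item (simplified test).
OVERLAPPING = """
    SELECT {cols} FROM BUSINESS_T a
    WHERE a.store_code IN ('071', '072')
      AND EXISTS (
        SELECT 1 FROM BUSINESS_T b
        WHERE b.store_code = a.store_code
          AND b.item_no = a.item_no
          AND b.rowid <> a.rowid
          AND b.valid_from <= a.valid_to
          AND a.valid_from <= b.valid_to)
"""

# Matching on the (store, item) pair keeps each store's result separate.
by_pair = conn.execute(
    "SELECT store_code, item_no, valid_from FROM BUSINESS_T "
    "WHERE (store_code, item_no) IN ("
    + OVERLAPPING.format(cols="a.store_code, a.item_no")
    + ") ORDER BY store_code, valid_from"
).fetchall()

# Matching on the item alone also drags in store 072, where nothing overlaps.
by_item = conn.execute(
    "SELECT store_code, item_no, valid_from FROM BUSINESS_T "
    "WHERE store_code IN ('071', '072') AND item_no IN ("
    + OVERLAPPING.format(cols="a.item_no")
    + ") ORDER BY store_code, valid_from"
).fetchall()

print(by_pair)
print(by_item)
```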
Use in:
select *
from business_t
where
class_unit_code in ('071', '072', '073', '074', '075', '076')
and b_type = 'CASH_AND_CARRY'
and delete_date is null
For this specific sequence of string values, we might try to shorten the predicate using a regex (although this is probably less efficient):
regexp_like(class_unit_code, '^07[1-6]$')
Or if the string always contains numeric values, we can convert and use a range comparison (which also is not as efficient as the first option - in that case, the column should have been created with a numeric datatype to start with):
to_number(class_unit_code) between 71 and 76

Copy records from one table to another with pl-sql

I want to copy records from one table to another.
The only records from table 1 that should be copied to table 2 are the ones that don't yet exist in table 2.
If duplicate records exist in table 1, only the record with the longest name should be copied to table 2.
I could already implement a query that almost does what I want.
The problem I have is when there are names with the same maximum size of characters.
In these cases, my query returns more than one record and I just want to insert one new record in table 2.
Does anyone know how I can fix this?
Here is my code:
For x in (Select distinct xdd.id_t, xdd.name_t
From table1 xdd
Where xdd.id_t not in (Select distinct det.id_t2
From table2 det)
And LENGTH(xdd.name_t) in (Select Max(LENGTH(xdd2.name_t))
From table1 xdd2
Where xdd2.id_t = xdd.id_t)
) Loop
Insert into table2 (id_t2, name_t2)
Values (x.id_t, x.name_t);
End loop;
Can you give me an example to solve this?
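Sure. If I understood the requirements correctly, a merge statement can do this. We use the row_number() analytic function to pick, for each id_t, the duplicate record with the longest name_t; because row_number() assigns 1 to exactly one row per id_t, only one record is chosen even when several names share the maximum length. The statement will look similar to this one: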
merge into table_two t2
using(
select id_t
, name_t
from (select id_t
, name_t
, row_number() over(partition by id_t
order by length(name_t) desc) as rn
from table_one) q
where q.rn = 1
) t1
on (t2.id_t = t1.id_t)
when not matched then
insert(id_t, name_t)
values(t1.id_t, t1.name_t)
SQLFiddle demo
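The tie case from the question can be checked with a small runnable sketch. It uses Python's sqlite3 module (window functions need SQLite 3.25 or later) and an INSERT ... SELECT with NOT EXISTS in place of MERGE, which SQLite does not have. The sample rows and the second sort key (`name_t` ascending, to make the choice between equally long names deterministic) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_one (id_t INTEGER, name_t TEXT);
CREATE TABLE table_two (id_t INTEGER, name_t TEXT);
INSERT INTO table_one VALUES
  (1, 'Ann'), (1, 'Bob'), (1, 'Al'),   -- 'Ann' and 'Bob' tie on length
  (2, 'Zed'),
  (3, 'Carla');
INSERT INTO table_two VALUES (2, 'Zed');  -- id 2 already exists
""")

# row_number() gives exactly one row per id_t the value 1, even when several
# names share the maximum length. The extra "name_t" sort key is a
# hypothetical tie-breaker so the pick is repeatable.
conn.execute("""
INSERT INTO table_two (id_t, name_t)
SELECT q.id_t, q.name_t
FROM (
  SELECT id_t,
         name_t,
         ROW_NUMBER() OVER (PARTITION BY id_t
                            ORDER BY LENGTH(name_t) DESC, name_t) AS rn
  FROM table_one
) q
WHERE q.rn = 1
  AND NOT EXISTS (SELECT 1 FROM table_two t2 WHERE t2.id_t = q.id_t)
""")

result = conn.execute(
    "SELECT id_t, name_t FROM table_two ORDER BY id_t"
).fetchall()
print(result)
```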
This is a merge statement that should "upsert" data from table 1 into table 2. Matching keys are updated only when the name in table 1 is longer than the name in table 2, and inserts occur when keys from table 1 have no match in table 2.
MERGE INTO table2 D
USING (SELECT table1.id_t, table1.name_t FROM table1) S
ON (D.id_t2 = S.id_t)
WHEN MATCHED THEN UPDATE SET D.name_t2 = S.name_t
WHERE (LENGTH(S.name_t) > LENGTH(D.name_t2))
WHEN NOT MATCHED THEN INSERT (id_t2, name_t2)
VALUES (S.id_t, S.name_t);

Need to select column from subquery into main query

I have a query like below - table names etc. changed for keeping the actual data private
SELECT inv.*,TRUNC(sysdate)
FROM Invoice inv
WHERE (inv.carrier,inv.pro,inv.ndate) IN
(
SELECT carrier,pro,n_dt FROM Order where TRUNC(Order.cr_dt) = TRUNC(sysdate)
)
I am selecting records from Invoice based on Order, i.e. all records from Invoice that match today's Order records on those 3 columns.
Now I want to select Order_Num from Order in my select query as well, so that I can insert the whole thing into a totally separate table, let's say orderedInvoices.
insert into orderedInvoices(seq_no,..same columns as Inv...,Cr_dt)
(
SELECT **Order.Order_Num**, inv.*,TRUNC(sysdate)
FROM Invoice inv
WHERE (inv.carrier,inv.pro,inv.ndate) IN
(
SELECT carrier,pro,n_dt FROM Order where TRUNC(Order.cr_dt) = TRUNC(sysdate)
)
)
How do I select that Order_Num in the main query for each record of that subquery?
P.S. I understand that trunc(cr_dt) will not use an index on cr_dt (if an index is there), but I couldn't select the records unless I omit the time part of it.
If the table ORDER [1] is unique on CARRIER, PRO and N_DT, you can use a JOIN instead of IN to restrict your records. It also lets you select whatever data you want from either table:
select ord.order_num, inv.*, trunc(sysdate)
from Invoice inv
join order ord
on inv.carrier = ord.carrier
and inv.pro = ord.pro
and inv.ndate = ord.n_dt
where trunc(ord.cr_dt) = trunc(sysdate)
If it's not unique then you have to use DISTINCT to deduplicate your record set.
Although applying TRUNC() to CR_DT prevents a plain index on that column from being used, you can create a function-based index on the expression if you do need one:
create index i_order_trunc_cr_dt on order (trunc(cr_dt));
[1] ORDER is a really bad name for a table as it's a reserved word; consider using ORDERS instead.
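To see the JOIN returning the order number alongside the invoice columns, here is a minimal sketch with Python's sqlite3 module. The table is called `orders` to sidestep the reserved word, the sample rows and the `amount` column are made up, and a fixed date string stands in for trunc(sysdate) so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, trimmed-down versions of the two tables.
CREATE TABLE invoice (carrier TEXT, pro TEXT, ndate TEXT, amount REAL);
CREATE TABLE orders (order_num INTEGER, carrier TEXT, pro TEXT,
                     n_dt TEXT, cr_dt TEXT);
INSERT INTO invoice VALUES
  ('UPS', 'P1', '2024-01-20', 50.0),
  ('DHL', 'P2', '2024-01-21', 75.0);
INSERT INTO orders VALUES
  (101, 'UPS', 'P1', '2024-01-20', '2024-01-15 09:30:00'),
  (102, 'DHL', 'P2', '2024-01-21', '2024-01-14 17:00:00');
""")

TODAY = "2024-01-15"  # stand-in for trunc(sysdate)

# date(cr_dt) drops the time part, much like trunc(cr_dt) in the answer.
rows = conn.execute("""
SELECT ord.order_num, inv.*
FROM invoice inv
JOIN orders ord
  ON inv.carrier = ord.carrier
 AND inv.pro = ord.pro
 AND inv.ndate = ord.n_dt
WHERE date(ord.cr_dt) = ?
""", (TODAY,)).fetchall()

print(rows)
```

If `orders` could hold several rows for the same carrier, pro and n_dt, each invoice would come back once per matching order, which is the duplication the answer warns about.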
