Deletting duplicate data on oracle using sql failed - oracle

I have a table abc as:
acc subgroup
720V A
720V A
720V A
720V A
111 C
222 D
333 E
My expected output is:
acc subgroup
720V A
111 C
222 D
333 E
Since 720V A is duplicate i want to delete all three duplicate data and only want one data in my table.
So,i tried
DELETE FROM (
select t.*,rownum rn from abc t where acc='720V') where rn>1;
So,I get error as:
ORA-01732: data manipulation operation not legal on this view
How i can get my expected output?

Your table seems to be lacking a primary key column, which is a big problem here. Assuming there actually is a primary key column PK, we can try using ROW_NUMBER to identify any "duplictes":
DELETE
FROM abc t1
WHERE pk IN (SELECT pk
FROM (
SELECT t.pk, ROW_NUMBER() OVER (PARTITION BY acc, subgroup ORDER BY pk) rn
FROM abc t) x
WHERE rn > 1
);
Note that if you can live with keeping your original data, then the most expedient thing to do might be to create a distinct view:
CREATE VIEW abc_view AS
SELECT DISTINCT acc, subgroup
FROM abc;

Related

delete unique rows from table having minimum value using joins

select a.rowid,a.ta_transaction_at_k,a.ta_approvalid_k
from OF_TATRANSACTIONAPPROVALS a,OF_ATAVAILMENTTICKETS b
where a.TA_TRANSACTION_AT_K=b.at_transaction_k
and a.TA_APPROVALROLE_RO = 98
and a.TA_APPROVALTYPE = 'TA'
and b.AT_GROUP_ID=402
I have this query which gives result in below format. How can I delete records from it
331789 3
331789 4
331789 5
331787 3
331787 4
331787 5
I want to delete ids with minimum values
got the issue resolved using row ids.
delete from OF_TATRANSACTIONAPPROVALS where rowid in(
select rowsid from(
select a.rowid rowsid,a.ta_transaction_at_k trns,a.ta_approvalid_k,row_number() over (partition by TA_TRANSACTION_AT_K order by TA_TRANSACTION_AT_K,ta_approvalid_k) rn
from OF_TATRANSACTIONAPPROVALS a,OF_ATAVAILMENTTICKETS b where a.TA_TRANSACTION_AT_K=b.at_transaction_k
and a.TA_APPROVALROLE_RO = 98 and a.TA_APPROVALTYPE = 'TA' and b.AT_GROUP_ID=402) where rn=1)

Oracle: Update values in table with aggregated values from same table

I am looking for a possibly better approach to this.
I have created a temp table in Oracle 11.2 that I'm using to pre calculate values that I will need in other selects instead of always generating them again with each select.
create global temporary table temp_foo (
DT timestamp(6), --only the date part will be used in this example but for later things I will need the time
Something varchar2(100),
Customer varchar2(100),
MinDate timestamp(6),
MaxDate timestamp(6),
Filecount int,
Errorcount int,
AvgFilecount int,
constraint PK_foo primary key (DT, Customer)
) on commit preserve rows;
I then first insert some fixed values for everything except AvgFilecount. AvgFilecount should contain the average for the Filecount for the 3 previous records (going by the date in DT). It doesn’t matter that the result will be converted to an int, I don’t need the decimal places
DT | Customer | Filecount | AvgFilecount
2019-04-30 | x | 10 | avg(2+3+9)
2019-04-29 | x | 2 | based on values before this
2019-04-28 | x | 3 | based on values before this
2019-04-27 | x | 9 | based on values before this
I thought about using a normal UPDATE statement as this should be faster than looping through the values. I should mention that there are no gaps in the DT field but obviously there is a first one where I won‘t find any previous records. If I would loop through, I could easily calculate AvgFilecount with (the record before previous record/2 + previous record)/3 which I cannot with UPDATE as I cannot guarantee the order of how they are executed. So I‘m fine with just taking the last 3 records (going by DT) and calcuting it from there.
What I thought would be an easy update is giving me headaches. I‘m mostly doing SQL Server where I would just join the 3 other records but it seems is a bit different in Oracle. I have found https://stackoverflow.com/a/2446834/4040068 and wanted to use the second approach in the answer.
update
(select curr.DT, curr.temp_foo, curr.Filecount, curr.AvgFilecount as OLD, (coalesce(Minus1.Filecount, 0) + coalesce(Minus2.Filecount, 0) + coalesce(Minus3.Filecount, 0)) / 3 as NEW
from temp_foo curr
left join temp_foo Minus1 ON Minus1.Customer = curr.Customer and trunc(Minus1.DT) = trunc(curr.DT-1)
left join temp_foo Minus2 ON Minus2.Customer = curr.Customer and trunc(Minus2.DT) = trunc(curr.DT-2)
left join temp_foo Minus3 ON Minus3.Customer = curr.Customer and trunc(Minus3.DT) = curr.DT-3
order by 1, 2
)
set OLD = NEW;
Which gives me an
ORA-01779: cannot modify a column which maps to a non key-preserved
table
01779. 00000 - "cannot modify a column which maps to a non key-preserved table"
*Cause: An attempt was made to insert or update columns of a join view which
map to a non-key-preserved table.
*Action: Modify the underlying base tables directly.
I thought this should work as both join conditions are in the primary key and thus unique. I am currently implementing the first approach in the above mentioned answer but it is getting quite big and it feels like there should be a better solution to this.
Other things I thought about trying:
using a nested subselect (nested because Oracle doesn’t know top(n) and I need to sort the subselect) to select the previous 3 records ordered by DT and then he outer select with rownum <=3 and then I could just use AVG(). However, I was told subselect can be quite slow and joins are better in Oracle performance wise. Dunno if that is really the case, haven‘t done any testing
Edit: My insert right now looks like this. I am already aggregating the Filecount for a day as there can be multiple records per DT per Customer per Something.
insert into temp_foo (DT, Something, Customer, Filecount)
select dates.DT, tbl1.Something, tbl1.Customer, coalesce(sum(tbl3.Filecount),0)
from table(Function_Returning_Daterange(NULL, NULL)) dates
cross join
(SELECT Something,
Code,
Value
FROM Table2 tbl2
WHERE (Something = 'Value')) tbl1
left outer join Table3 tbl3
on tbl3.Customer = tbl1.Customer
and trunc(tbl3.MinDate) = trunc(dates.DT)
group by dates.DT, tbl1.Something, tbl1.Customer;
You could use an analytic average with a window clause:
select dt, customer, filecount,
avg(filecount) over (partition by customer order by dt
rows between 3 preceding and 1 preceding) as avgfilecount
from tmp_foo
order by dt desc;
DT CUSTOMER FILECOUNT AVGFILECOUNT
---------- -------- ---------- ------------
2019-04-30 x 10 4.66666667
2019-04-29 x 2 6
2019-04-28 x 3 9
2019-04-27 x 9
and then do the update part with a merge statement:
merge into tmp_foo t
using (
select dt, customer,
avg(filecount) over (partition by customer order by dt
rows between 3 preceding and 1 preceding) as avgfilecount
from tmp_foo
) s
on (s.dt = t.dt and s.customer = t.customer)
when matched then update set t.avgfilecount = s.avgfilecount;
4 rows merged.
select dt, customer, filecount, avgfilecount
from tmp_foo
order by dt desc;
DT CUSTOMER FILECOUNT AVGFILECOUNT
---------- -------- ---------- ------------
2019-04-30 x 10 4.66666667
2019-04-29 x 2 6
2019-04-28 x 3 9
2019-04-27 x 9
You haven't shown your original insert statement; it might be possible to add the analytic calculation to that, and avoid the separate update step.
Also, if you want the first two date values to be calculated as if the 'missing' extra days before them had zero counts, you could use sum and division instead of avg:
select dt, customer, filecount,
sum(filecount) over (partition by customer order by dt
rows between 3 preceding and 1 preceding)/3 as avgfilecount
from tmp_foo
order by dt desc;
DT CUSTOMER FILECOUNT AVGFILECOUNT
---------- -------- ---------- ------------
2019-04-30 x 10 4.66666667
2019-04-29 x 2 4
2019-04-28 x 3 3
2019-04-27 x 9
It depends what you expect those last calculated values to be.

Oracle PL/SQL- query a table, selecting data from the same table but utilizing an ID from another

I know the title will not do justice, and I have tried searching around between Joins and Merges, however I am a bit stumped and could use some guidance writing this.
I have two tables:
Table 1
A B C
20348 12306 191
31502 12306
20342 12297 191
31492 12297
20341 12296 191
31504 12296
20344 12299 191
31499 12299
Table 2
A(ident)B (F_Key of T1_A) C D E
25003 20348 1 2 3
35915 20342 1 2 3
41883 20341 1 2 3
31303 20344 1 2 3
I want to take table 2, select the contents B, C, D, E, However I want to select Table 1, column A as B where Table 1 C is null Table 1 B = X.
Table 2 Column B is the foreign Key of Table 1 A
As a result, I would like to see this:
A B C D E
5555 31502 1 2 3
5556 31492 1 2 3
5557 31504 1 2 3
5558 31499 1 2 3
Any help is much appreciated.
So if I understood your question correctly, you need case statement in your select:
select (case when t1.C is null and t1.b=X then t1.A else t2.B end), t2.C, t2.D, t2.E
from Table1 t1 join table2 t2 on t1.A = t2.B
Also not sure what Table 1 B = X means so I just kept the syntax same as in your explanation. Feel free to modify that part. Give it a try to see whether it works for you
I was able to figure this out for what I needed.
I end up going up a further level in my database where I introduced Table 3, where its Primary key was the Foreign Key of Table 1 Column B that referenced a primary key and created:
select t1b.A, t2.C, t2.D, t2.E
from Table1 t1a, table1 t1b, table2 t2, table3 t3
WHERE t3.A = t1a.b
AND t1a.C is null
AND t2.B = t1a.A
AND t1b.B = t3.A
AND t1b.B = 'xxx'

How to find Column with same (some x value) value repeated more than once? Needs to return those rows.

There is a table called contacts with columns id, name, address, ph_no etc.
I need to find out rows with the same name, if the rows count is more than 1, show those rows.
For example:
Table: contacts
id--------name--------address---------ph_no--------
111 apple U.K 99*******
112 banana U.S 99*******
123 grape INDIA 99*******
143 orange S.AFRICA 99*******
152 grape KENYA 99*******
For the above table I need to get rows with same column name data like the below:
id--------name--------address---------ph_no--------
123 grape INDIA 99*******
152 grape KENYA 99*******
I need to get the rows based on the name what I given as argument like below example syntax:
select * from contacts where name='grape' and it's count(*) >1 return those rows.
How can I achieve the solution for above problem.
As #vc74 suggests analytic functions would work work a lot better here; especially if your data has any volume.
select id, name, address, ph_no ...
from ( select c.*, count(name) over ( partition by name ) as name_ct
from contacts c )
where name_ct > 1
;
EDIT
restricting on specific names the table contacts should really have an index on name and the query would look like this:
select id, name, address, ph_no ...
from ( select c.*, count(name) over ( partition by name ) as name_ct
from contacts c
where name = 'grape' )
where name_ct > 1
;
select id, name, address, ph_no
from contacts
where name in
(
select name from contacts
group by name
having count(*) > 1
)
If you have access to Oracle's analytical functions there might be a more straightforward way
select *
from contacts c
where c.name in ( select cc.name
from contacts
group by cc.name
having count(1) > 1 );

how to merge data while loading them into hive?

I'm tring to use hive to analysis our log, and I have a question.
Assume we have some data like this:
A 1
A 1
A 1
B 1
C 1
B 1
How can I make it like this in hive table(order is not important, I just want to merge them) ?
A 1
B 1
C 1
without pre-process it with awk/sed or something like that?
Thanks!
Step 1: Create a Hive table for input data set .
create table if not exists table1 (fld1 string, fld2 string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
(i assumed field seprator is \t, you can replace it with actual separator)
Step 2 : Run below to get the merge data you are looking for
create table table2 as select fld1,fld2 from table1 group by fld1,fld2 ;
I tried this for below input set
hive (default)> select * from table1;
OK
A 1
A 1
A 1
B 1
C 1
B 1
create table table4 as select fld1,fld2 from table1 group by fld1,fld2 ;
hive (default)> select * from table4;
OK
A 1
B 1
C 1
You can use external table as well , but for simplicity I have used managed table here.
One idea.. you could create a table around the first file (called 'oldtable').
Then run something like this....
create table newtable select field1, max(field) from oldtable group by field1;
Not sure I have the syntax right, but the idea is to get unique values of the first field, and only one of the second. Make sense?
For merging the data, we can also use "UNION ALL" , it can also merge two different types of datatypes.
insert overwrite into table test1
(select x.* from t1 x )
UNION ALL
(select y.* from t2 y);
here we are merging two tables data (t1 and t2) into one single table test1.
There's no way to pre-process the data while it's being loaded without using an external program. You could use a view if you'd like to keep the original data intact.
hive> SELECT * FROM table1;
OK
A 1
A 1
A 1
B 1
C 1
B 1
B 2 # Added to show it will group correctly with different values
hive> CREATE VIEW table2 (fld1, fld2) AS SELECT fld1, fld2 FROM table1 GROUP BY fld1, fld2;
hive> SELECT * FROM table2;
OK
A 1
B 1
B 2
C 1

Resources