Oracle query: single update statement

I need to write an Oracle query for the logic below; any help is appreciated.
I have a table with 8 columns, of which 3 matter for this specific business logic.
Table data (with the 3 columns)
A B C
071699 01 I
071699 01W
071699 02W
071699 01W I
071699 02W
more rows.
The amount of data varies by case: there can be one or more rows per A-B combination, and column C is usually populated for at least one combination.
This table has over 100K distinct A values.
Logic I need to implement:
For a specific A value, check how many A-B combinations exist.
For that A value, check whether any A-B combination has column C populated.
Take the value from the populated C column and update the other combinations of the same A in the same table.
Data before (only showing specific rows)
A B C
071699 01 I
071699 01W
071699 02W
Data After Query
A B C
071699 01 I
071699 01W I
071699 02W I
I have a SQL Server query that implements this logic in a single statement, but it does not work in Oracle; I get the error "single-row subquery returns more than one row".
SQL Server Query
update c
set c.colC = u.colC
from data_table u
join data_table c on u.colA = c.colA and u.colB <> c.colB
and u.colB = (select MIN(colB) from data_table
where colA = u.colA and colC is not null)
and u.colC is not null and c.colC is null
Any help writing an equivalent Oracle version is appreciated.

Oracle doesn't allow joins in update queries unless you have a unique column which guarantees a 1-1 mapping which definitely doesn't apply in your case. So about the best you can do here is a nested subquery. It ain't pretty but it will work. I wasn't able to completely match up the logic you said you needed to implement with the logic that was in the SQL Server update statement, so I went with the statement.
UPDATE data_table u
SET u.ColC = (
    SELECT c.ColC
    FROM data_table c
    WHERE c.ColA = u.ColA
      AND c.ColB = (
          SELECT MIN(ColB)
          FROM data_table
          WHERE ColA = c.ColA
            AND ColC IS NOT NULL
      )
)
WHERE u.ColC IS NULL;
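If several rows share the same ColA and the same minimum ColB (as in the sample data, where 01W appears both with and without a C value), the scalar subquery above can still return more than one row. A possible alternative is a MERGE whose source is aggregated to one row per ColA. This is only a sketch: it assumes the column names ColA, ColB and ColC from the statements above, and, like the UPDATE above, it fills every row with a null ColC for that ColA, not only rows with a different ColB.

```sql
-- Sketch only: assumes data_table(ColA, ColB, ColC) as in the statements above.
MERGE INTO data_table t
USING (
    -- One row per ColA: the ColC value found at the lowest ColB that has one.
    -- If that ColB carries several distinct ColC values, MIN picks one of them.
    SELECT ColA,
           MIN(ColC) KEEP (DENSE_RANK FIRST ORDER BY ColB) AS ColC
    FROM data_table
    WHERE ColC IS NOT NULL
    GROUP BY ColA
) s
ON (t.ColA = s.ColA)
WHEN MATCHED THEN
    UPDATE SET t.ColC = s.ColC
    WHERE t.ColC IS NULL;   -- leave rows that already have a value untouched
```

Because the source has exactly one row per ColA, each target row matches at most one source row, which is what MERGE requires.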

I tried various ways to do this in a single query but could not, due to Oracle limitations. The solution that worked for me is below.
I split the query into two parts: an insert statement and an update statement.
First query: insert rows into a temp table, taking the distinct colA-colC rows where colC is populated.
Second query: use the temp table created in step 1 to update the main data table, joining on colA.
If anyone has a better solution, please post it and I'll try it.

I resolved this issue with set-based code after trying various methods such as splitting the work across two temp tables, cursors, etc. Sample code is below.
This query is 3-5X faster than the other solutions I tried.
UPDATE data_table a SET C = (
WITH comp AS (
SELECT DISTINCT A, C FROM data_table a
WHERE B = (SELECT MIN(B) FROM data_table
WHERE A = a.A
AND C IS NOT NULL)
)
SELECT
CASE
when a.C IS NULL THEN c.C
when a.C IS NOT NULL THEN a.C
END
FROM comp c
WHERE a.A = c.A AND c.C IS NOT NULL
);

Related

Why does Oracle change rowid with fetch?

I have a query like this:
select w.rowid, w.waclogin
from tableA w, tableB wa, tableC a
where wa.alucod = a.alucod
and w.waclogin = wa.waclogin
and a.cpf = '31808013875'
and rownum <= 1;
The results are:
ROWID WACLOGIN
AAA0CEAHSAABE07ABA 31808013875
But when I use fetch (for performance) the rowid returned is different:
select w.rowid, w.waclogin
from tableA w, tableB wa, tableC a
where wa.alucod = a.alucod
and w.waclogin = wa.waclogin
and a.cpf = '31808013875'
fetch first row only;
Results in:
ROWID WACLOGIN
AAA0DMAHaAAA+ZcAAX 31808013875
Why does fetch change the rowid?
This makes no sense to me.
Update
When fetch is used, the rowid returned is from table B instead of table A.
There are two rows in tableA with the same wacLogin value (but obviously different rowID values). Neither of your queries specifies an order by so which of those rows is returned is arbitrary. Presumably, there is a slightly different query plan being used for both queries so each one returns a different arbitrary row. Of course, tomorrow, either or both queries could start returning a different arbitrary row if the query plan or physical organization of the table changes. If you want the same row to be returned in both cases, you'd need to make both queries deterministic with an order by clause that uniquely orders the results.
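As a rough illustration of that last point, both forms can be given an explicit ordering. Ordering by w.rowid here is only an assumption made for the sketch; any column or set of columns that uniquely orders the tableA rows would do, and a real key is usually the better choice.

```sql
-- FETCH FIRST form (12c and later): the ORDER BY is applied before the row limit.
select w.rowid, w.waclogin
from tableA w, tableB wa, tableC a
where wa.alucod = a.alucod
and w.waclogin = wa.waclogin
and a.cpf = '31808013875'
order by w.rowid
fetch first row only;

-- ROWNUM form: ROWNUM is assigned before ORDER BY, so the ordering
-- has to happen in an inline view and the limit outside it.
select rid, waclogin
from (
    select w.rowid as rid, w.waclogin   -- rowid needs an alias inside a view
    from tableA w, tableB wa, tableC a
    where wa.alucod = a.alucod
    and w.waclogin = wa.waclogin
    and a.cpf = '31808013875'
    order by w.rowid
)
where rownum <= 1;
```

With the same ordering in both statements, the two queries should pick the same tableA row regardless of which plan the optimizer chooses.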

Hash Join with Partition restriction from third table

My current problem is in 11g, but I am also interested in how this might be solved more smartly in later versions.
I want to join two tables. Table A has 10 million rows. Table B is huge, with a billion records across about a thousand partitions. One partition has around 10 million records. I am not joining on the partition key. For most rows of Table A, one or more rows in Table B will be found.
Example:
select * from table_a a
inner join table_b b on a.ref = b.ref
The above will return about 50 million rows, and the results come from about 30 partitions of table b. I am assuming a hash join is the correct join here, hashing table a and FTSing/index-scanning table b.
So 970 partitions are scanned for no reason. And I have a third query that can tell Oracle which 30 partitions to check for the join.
Example of third query:
select partition_id from table_c
This query gives exactly the 30 partitions for the query above.
To my question:
In PL/SQL one can solve this by
Select the 30 partition_ids into a variable (be it just a select listagg(partition_id,',') ... into v_partitions from table_c).
Execute my query like so:
execute immediate 'select * from table_a a
inner join table_b b on a.ref = b.ref
where b.partition_id in ('||v_partitions||')' into ...
Let's say this completes in 10 minutes.
Now, how can I do this in the same amount of time with pure SQL?
Just simply writing
select * from table_a a
inner join table_b b on a.ref = b.ref
where b.partition_id in (select partition_id from table_c)
does not do the trick it seems, or I might be aiming at the wrong plan.
The plan I think I want is
hash join
  table a
  nested loop
    table c
    partition pruning here
      table b
But, this does not come back in 10 minutes.
So, how to do this in SQL and what execution plan to aim at? One variation I have not tried yet that might be the solution is
nested loop
  table c
  hash join
    table a
    partition pruning here (pushed predicate from the join to c)
      table b
Another feeling I have is that the solution might lie in joining table a to table c (not sure on what though) and then joining this result to table b.
I am not asking you to type everything out for me. Just a general concept of how to do this (getting partition restriction from a query) in SQL - what plan should I aim at?
thank you very much! Peter
I'm not an expert at this, but I think Oracle generally does the joins first, then applies the where conditions. So you might get the plan you want by moving the partition pruning up into a join condition:
select * from table_a a
inner join table_b b on a.ref = b.ref
and b.partition_id in (select partition_id from table_c);
I've also seen people try to do this sort of thing with an inline view:
select * from table_a a
inner join (select * from table_b
where partition_id in (select partition_id from table_c)) b
on a.ref = b.ref;
Thank you all for discussing this with me. In my case this was solved (not by me) by adding a join path between table_c and table_a and by overloading the join conditions as below. This was possible by adding the column partition_id to table_a:
select * from
table_c c
JOIN table_a a ON (a.partition_id = c.partition_id)
JOIN table_b b ON (b.partition_id = c.partition_id and b.partition_id = a.partition_id and b.ref = a.ref)
And this is the plan you want:
leading(c,b,a) use_nl(c,b) swap_join_inputs(a) use_hash(a)
So you get:
hash join
  table a
  nested loop
    table c
    partition list iterator
      table b
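Put together, the query and the hints above might look roughly like the following. This is a sketch, not the exact statement that was used: the hint text is copied from the line above and only rewritten in the usual space-separated /*+ ... */ comment form, and whether the optimizer honors it depends on the version, statistics and indexes.

```sql
-- Sketch: table_a is assumed to carry the added partition_id column.
select /*+ leading(c b a) use_nl(c b) use_hash(a) swap_join_inputs(a) */ *
from table_c c
join table_a a
  on a.partition_id = c.partition_id
join table_b b
  on  b.partition_id = c.partition_id   -- lets each table_c row prune table_b to one partition
  and b.partition_id = a.partition_id
  and b.ref = a.ref;
```

The leading hint asks for table_c to drive a nested loop into table_b, and swap_join_inputs(a) asks for table_a to be the build side of the final hash join, which corresponds to the plan shape shown above.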

Update statement with joins in Oracle

I need to update one column in table A with the result of a multiplication of one field from table A with one field from table B.
It would be pretty simple to do this in T-SQL, but I can't write the correct syntax in Oracle.
What I've tried:
UPDATE TABLE_A
SET TABLE_A.COLUMN_TO_UPDATE =
(select TABLE_A.COLUMN_WITH_SOME_VALUE * TABLE_B.COLUMN_WITH_PERCENTAGE
from TABLE_A
INNER JOIN TABLE_B
ON TABLE_A.PRODUCT_ID = TABLE_B.PRODUCT_ID
AND TABLE_A.SALES_CHANNEL_ID = TABLE_B.SALES_CHANNEL_ID)
WHERE TABLE_A.MONTH_ID IN (201601, 201602, 201603);
But I keep getting errors. Could anybody help me, please?
I generally prefer the format below for such cases, because it performs no update when the join to the second table returns no row, whereas the solution provided by Brian Leach sets the new value to null when a record exists in the first table but not in the second.
UPDATE
(
select TABLE_A.COLUMN_TO_UPDATE
, TABLE_A.PRODUCT_ID
, TABLE_A.COLUMN_WITH_SOME_VALUE * TABLE_B.COLUMN_WITH_PERCENTAGE as value
from TABLE_A
INNER JOIN TABLE_B
ON TABLE_A.PRODUCT_ID = TABLE_B.PRODUCT_ID
AND TABLE_A.SALES_CHANNEL_ID = TABLE_B.SALES_CHANNEL_ID
AND TABLE_A.MONTH_ID IN (201601, 201602, 201603)
) DATA
SET DATA.COLUMN_TO_UPDATE = DATA.value;
This solution can run into key-preserved table issues, which shouldn't be a problem here since I expect a single row in both tables for each product ID (Oracle typically needs a unique or primary key constraint on the TABLE_B join columns to verify this).
More on the key-preserved table concept in joins can be found here:
https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:548422757486
@Jayesh Mulwani raised a valid point: this will set the value to null if there is no matching record. This may or may not be the desired result. If it isn't, and no change is desired, you can change the select statement to:
coalesce((SELECT table_b.column_with_percentage
FROM table_b
WHERE table_a.product_id = table_b.product_id AND table_a.sales_channel_id = table_b.sales_channel_id),1)
If this is the desired outcome, Jayesh's solution will be more efficient as it will only update matching records.
UPDATE table_a
SET table_a.column_to_update = table_a.column_with_some_value
* (SELECT table_b.column_with_percentage
FROM table_b
WHERE table_a.product_id = table_b.product_id
AND table_a.sales_channel_id = table_b.sales_channel_id)
WHERE table_a.month_id IN (201601, 201602, 201603);
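Another way to leave unmatched rows untouched, sketched here on the assumption that table_b has at most one row per (product_id, sales_channel_id) pair, is to keep the correlated subquery and add an EXISTS filter so that only rows with a match are updated:

```sql
-- Sketch: same tables and columns as above; the aliases a and b are added for brevity.
UPDATE table_a a
SET a.column_to_update = (
        SELECT a.column_with_some_value * b.column_with_percentage
        FROM table_b b
        WHERE b.product_id = a.product_id
          AND b.sales_channel_id = a.sales_channel_id
    )
WHERE a.month_id IN (201601, 201602, 201603)
  AND EXISTS (                      -- skip rows that have no match in table_b
        SELECT 1
        FROM table_b b
        WHERE b.product_id = a.product_id
          AND b.sales_channel_id = a.sales_channel_id
    );
```

If table_b can hold more than one row per pair, the scalar subquery fails with the single-row subquery error, so this form relies on the same uniqueness assumption as the other solutions.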

Insert Statement Returns ORA-01427 Error While Trying To Insert From Multiple Tables

I have this table F_Flight which I am trying to insert into from 3 different tables. The first, fourth and fifth columns are from the same table, and the second and third columns are from different tables. When I execute the code, I get a "single-row subquery returns more than one row" error.
insert when 1 = 1 then into F_Flight (planeid, groupid, dateid, flightduration, kmsflown) values
(planeid, (select b.groupid from BridgeTable b where exists (select p.p1id from pilotkeylookup p where b.pilotid = p.p1id)),
(select dd.id from D_Date dd where exists (select p.launchtime from PilotKeyLookup p where dd."Date" = p.launchtime)),
flightduration, kmsflown) select * from PilotKeyLookup p;
Your subqueries get multiple rows back, which is what the error message says. There is no correlation between the various bits of data and subqueries you're trying to insert into a single row.
This can be done as a much simpler insert...select with joins, something like:
insert into f_flight (planeid, groupid, dateid, flightduration, kmsflown)
select pkl.planeid, bt.groupid, dd.id, pkl.flightduration, pkl.kmsflown
from pilotkeylookup pkl
join bridgetable bt on bt.pilotid = pkl.p1id
join d_date dd on dd."Date" = pkl.launchtime;
This joins the main PilotKeyLookup table to the other two on the keys you used in your subqueries.
Storing an ID value instead of an actual date is unusual, and if launchtime has a time component - which seems likely from the name - and your d_date entries are just dates (i.e. all with time at midnight) then you won't find matches; you might need to do:
join d_date dd on dd."Date" = trunc(pkl.launchtime);
It also seems like this could be a view, as you're storing duplicate data - everything in f_flight could, obviously, be found from the other tables.

Worse query plan with a JOIN after ANALYZE

I see that running ANALYZE results in significantly worse performance on a particular JOIN I'm making between two tables.
Suppose the following schema:
CREATE TABLE a ( id INTEGER PRIMARY KEY, name TEXT );
CREATE TABLE b ( a NOT NULL REFERENCES a, value INTEGER, PRIMARY KEY(a, b) );
CREATE VIEW ab AS SELECT a.name, b.text, MAX(b.value)
FROM a
JOIN b ON b.a = a.id
GROUP BY a.id
ORDER BY a.name;
Table a is approximately 10K rows, table b is approximately 48K rows (~5 rows per row in table a).
Before ANALYZE
Now when I run the following query:
SELECT * FROM ab;
The query plan looks as follows:
1|0|0|SCAN TABLE b
1|1|1|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
This is a good plan, b is larger and I want it to be in the outer loop, making use of the index in table a. It finishes well within a second.
After ANALYZE
When I execute the same query again, the query plan results in two table scans:
1|0|1|SCAN TABLE a
1|1|0|SCAN TABLE b
This is far from optimal. For some reason the query planner thinks that an outer loop of 10K rows and an inner loop of 48K rows is a better fit. This takes about 1.5 minutes to complete.
Should I adapt the index in table b to make it work after ANALYZE? Anything else to change to the indexing/schema?
I am just trying to understand the problem here. I worked around it using a CROSS JOIN, but that feels dirty and I don't really understand why the planner would go with a plan that is orders of magnitude slower than the un-analyzed plan. It seems to be related to GROUP BY, since the query planner puts table b in the outer loop without it (but that renders the query useless for what I want).
Accidentally found the answer by adjusting the GROUP BY clause in the view definition. Instead of grouping on a.id, I group on b.a, although they have the same values.
CREATE VIEW ab AS SELECT a.name, b.text, MAX(b.value)
FROM a
JOIN b ON b.a = a.id
GROUP BY b.a -- <== changed this from a.id to b.a
ORDER BY a.name;
I'm still not entirely sure what the difference is, since it groups the same data.
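For comparison, the CROSS JOIN workaround mentioned in the question could look something like this. It is a sketch against the schema above, with b.text left out because that column is not defined in the posted CREATE TABLE. In SQLite, CROSS JOIN keeps the planner from reordering the two tables, so the left-hand table stays in the outer loop.

```sql
-- Sketch: forces b into the outer loop and looks up a by its INTEGER PRIMARY KEY.
SELECT a.name, MAX(b.value)
FROM b
CROSS JOIN a ON b.a = a.id
GROUP BY b.a
ORDER BY a.name;
```

Running EXPLAIN QUERY PLAN on this statement before and after ANALYZE is a simple way to check whether the join order stays fixed.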
