I'm racking my brain over a set of data in order to generate a report from an Oracle DB.
The data are in two tables:
SUPPLY
DEVICE
There is only one column that links the two tables:
SUPPLY.DEVICE_ID
DEVICE.ID
In SUPPLY, there are these data:
| DEVICE_ID | COLOR_TYPE | SERIAL | UNINSTALL_DATE |
|----------- |------------ |-------------- |--------------------- |
| 1232 | 1 | CAP857496 | 08/11/2016,19:10:50 |
| 5263 | 2 | CAP57421 | 07/11/2016,11:20:00 |
| 758 | 3 | CBO753421869 | 07/11/2016,04:25:00 |
| 758 | 4 | CC9876543 | 06/11/2016,11:40:00 |
| 8575 | 4 | CVF75421 | 05/11/2016,23:59:00 |
| 758 | 4 | CAP67543 | 30/09/2016,11:00:00 |
In DEVICE, there are columns that I have to select (more or less all of them), and each row is unique.
What I need to achieve is:
for each SUPPLY.DEVICE_ID and SUPPLY.COLOR_TYPE, I need the most recent ROW -> MAX(UNINSTALL_DATE)
JOINED with
more or less all the columns in DEVICE.
At the end I should have something like this:
| ACCOUNT_CODE | MODEL | DEVICE.SERIAL | DEVICE_ID | COLOR_TYPE | SUPPLY.SERIAL | UNINSTALL_DATE |
|-------------- |------- |--------------- |----------- |------------ |--------------- |--------------------- |
| BUSTO | MS410 | LM753 | 1232 | 1 | CAP857496 | 08/11/2016,19:10:50 |
| MACCHI | MX310 | XC876 | 5263 | 2 | CAP57421 | 07/11/2016,11:20:00 |
| ASL_COMO | MX711 | AB123 | 758 | 3 | CBO753421869 | 07/11/2016,04:25:00 |
| ASL_COMO | MX711 | AB123 | 758 | 4 | CC9876543 | 06/11/2016,11:40:00 |
| ASL_VARESE | X950 | DE8745 | 8575 | 4 | CVF75421 | 05/11/2016,23:59:00 |
So far, using a nested select like:
SELECT DEVICE_ID,COLOR_TYPE,SERIAL,UNINSTALL_DATE FROM
(SELECT DEVICE_ID,COLOR_TYPE,SERIAL,UNINSTALL_DATE
FROM SUPPLY WHERE DEVICE_ID = '123456' ORDER BY UNINSTALL_DATE DESC)
WHERE ROWNUM <= 1
I managed to get the highest value in the UNINSTALL_DATE column after trying MAX(UNINSTALL_DATE) or HIGHEST(UNINSTALL_DATE).
I tried also:
SELECT SUPPLY.DEVICE_ID, SUPPLY.COLOR_TYPE, ....
FROM SUPPLY,DEVICE WHERE SUPPLY.DEVICE_ID = DEVICE.ID
and it works, but it gives me ALL the items; basically it's a merge of the two tables.
When I try to narrow the data selected, I get errors or an empty result.
I'm starting to wonder whether it's even possible to obtain this data, and I've started exporting the data to Excel to work from there, but I hope someone can help me before I give up...
Thank you in advance.
for each SUPPLY.DEVICE_ID and SUPPLY.COLOR_TYPE, I need the most recent ROW -> MAX(UNINSTALL_DATE)
Use the ROW_NUMBER function like this:
SELECT s.*,
row_number() OVER (
PARTITION BY DEVICE_ID, COLOR_TYPE
ORDER BY UNINSTALL_DATE DESC
) As RN
FROM SUPPLY s
This query marks the most recent row in each (DEVICE_ID, COLOR_TYPE) group with RN=1.
JOINED with more or less all the columns in DEVICE.
Just join the above query to the DEVICE table (the question states that the linking column in DEVICE is ID):
SELECT d.*,
       x.COLOR_TYPE,
       x.SERIAL,
       x.UNINSTALL_DATE
FROM (
    SELECT s.*,
           row_number() OVER (
               PARTITION BY DEVICE_ID, COLOR_TYPE
               ORDER BY UNINSTALL_DATE DESC
           ) As RN
    FROM SUPPLY s
) x
JOIN DEVICE d
ON d.ID = x.DEVICE_ID AND x.RN=1
OK - so you could group by device_id, color_type and select max(uninstall_date) as well, and join to the other table. But you would miss the serial value for the most recent row (for each combination of device_id, color_type).
There are a few ways to fix that. Your attempt with rownum was close, but the problem is that you need to order within each "group" (by device_id, color_type) and get the first row from each group. I am sure someone will post a solution along those lines, using either row_number() or rank() or perhaps the analytic version of max(uninstall_date).
When you just need the "top" row from each group, you can use keep (dense_rank first/last) - which may be slightly more efficient - like so:
select device_id, color_type,
max(serial) keep (dense_rank last order by uninstall_date) as serial,
max(uninstall_date) as uninstall_date
from supply
group by device_id, color_type
;
and then join to the other table. NOTE: dense_rank last will pick up the row OR ROWS with the most recent (max) date for each group. If there are ties, that is more than one row; the serial will then be the max (in lexicographical order) among those rows with the most recent date. You can also select min, or add some order so you pick a specific one (you didn't discuss this possibility).
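As a minimal sketch of that final join, assuming the DEVICE columns are named ACCOUNT_CODE, MODEL and SERIAL as in the expected output shown in the question (the aliases DEVICE_SERIAL and SUPPLY_SERIAL are made up here to keep the two serial columns apart):

```sql
-- Aggregate SUPPLY to one row per (device_id, color_type), then join to DEVICE.
-- Column names on DEVICE are assumptions taken from the question's sample output.
SELECT d.account_code,
       d.model,
       d.serial  AS device_serial,
       s.device_id,
       s.color_type,
       s.serial  AS supply_serial,
       s.uninstall_date
FROM (
    SELECT device_id,
           color_type,
           MAX(serial) KEEP (DENSE_RANK LAST ORDER BY uninstall_date) AS serial,
           MAX(uninstall_date) AS uninstall_date
    FROM supply
    GROUP BY device_id, color_type
) s
JOIN device d
  ON d.id = s.device_id
ORDER BY s.uninstall_date DESC;
```

Because the subquery already returns a single row per group, no RN filter is needed in the outer query.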
SELECT
d.ACCOUNT_CODE, d.DNS_HOST_NAME,d.IP_ADDRESS,d.MODEL_NAME,d.OVERRIDE_SERIAL_NUMBER,d.SERIAL_NUMBER,
s.COLOR, s.SERIAL_NUMBER, s.UNINSTALL_TIME
FROM (
SELECT s.DEVICE_ID, s.LAST_LEVEL_READ, s.SERIAL_NUMBER,TRUNC(s.UNINSTALL_TIME), row_number()
OVER (
PARTITION BY DEVICE_ID, COLOR
ORDER BY UNINSTALL_TIME DESC
) As RN
FROM SUPPLY s
WHERE s.UNINSTALL_TIME IS NOT NULL AND s.SERIAL_NUMBER IS NOT NULL
)
JOIN DEVICE d
ON d.ID = s.DEVICE_ID AND s.RN=1;
@krokodilko: thank you very much for your help. The first query works. I modified it to remove the junk, using the real column names I need (yesterday evening I had no access to the DB) and selecting only the data I need.
Unfortunately, when I join the two tables as you suggested, I get this error:
ORA-00904: "S"."RN": invalid identifier
00904. 00000 - "%s: invalid identifier"
If I remove s. before RN, the ORA-00904 moves back to s.DEVICE_ID.
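The error is most likely a matter of alias scope: the alias s exists only inside the inline view, and the inline view itself has no alias, so the outer query cannot resolve s.RN or s.DEVICE_ID. The outer SELECT also asks for COLOR, which the inline view does not return. A possible corrected version is sketched below; the column names are copied from the query above, while the alias x and the output name SUPPLY_SERIAL_NUMBER are introduced here for illustration only:

```sql
-- Sketch: give the inline view its own alias (x) and expose every column
-- that the outer query needs, including COLOR and the untruncated time.
SELECT d.ACCOUNT_CODE, d.DNS_HOST_NAME, d.IP_ADDRESS, d.MODEL_NAME,
       d.OVERRIDE_SERIAL_NUMBER, d.SERIAL_NUMBER,
       x.COLOR,
       x.SERIAL_NUMBER AS SUPPLY_SERIAL_NUMBER,
       x.UNINSTALL_TIME
FROM (
    SELECT s.DEVICE_ID, s.COLOR, s.SERIAL_NUMBER, s.UNINSTALL_TIME,
           ROW_NUMBER() OVER (
               PARTITION BY s.DEVICE_ID, s.COLOR
               ORDER BY s.UNINSTALL_TIME DESC
           ) AS RN
    FROM SUPPLY s
    WHERE s.UNINSTALL_TIME IS NOT NULL
      AND s.SERIAL_NUMBER IS NOT NULL
) x
JOIN DEVICE d
  ON d.ID = x.DEVICE_ID
WHERE x.RN = 1;
```

If only the date part is wanted in the report, TRUNC(x.UNINSTALL_TIME) can be applied in the outer SELECT with an explicit column alias.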
I have been tasked to find out the SELECT statement for an explain plan
------------------------------------------
| Id | Operation | Name |
------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH JOIN RIGHT ANTI | |
| 2 | VIEW | VW_NSO_1 |
| 3 | HASH JOIN RIGHT SEMI| |
| 4 | TABLE ACCESS FULL | PART |
| 5 | TABLE ACCESS FULL | ORDERS |
| 6 | TABLE ACCESS FULL | CUSTOMER |
------------------------------------------
I am able to work out the select statement for Ids 0-5, but what does line 6 mean?
This is what I have managed to figure out so far, but I can't see where the last line of the plan comes into play:
select *
from customer c join orders o
on c.custkey = o.custkey
where o_totalprice
not in
(select p_retailprice
from part p join orders o
on orders.o_custkey >= 0 and 0.1*o_totalprice >= 0)
Your query is:
select *
from customer c join orders o
on c.custkey = o.custkey
where o_totalprice
not in
(select p_retailprice
from part p join orders o
on orders.o_custkey >= 0 and 0.1*o_totalprice >= 0)
And your explain plan is
------------------------------------------
| Id | Operation | Name |
------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH JOIN RIGHT ANTI | |
| 2 | VIEW | VW_NSO_1 |
| 3 | HASH JOIN RIGHT SEMI| |
| 4 | TABLE ACCESS FULL | PART |
| 5 | TABLE ACCESS FULL | ORDERS |
| 6 | TABLE ACCESS FULL | CUSTOMER |
------------------------------------------
In your case, this is what happens:
You are getting all the records from both customer and orders that match the join condition on the custkey field.
Your predicate restricts the output to rows where o_totalprice (for readability, the query should state which table this field comes from, although I guess it is the orders table) is not part of the dataset returned by the subquery.
The subquery gets all values of p_retailprice that result from joining part and orders on orders.o_custkey >= 0 and 0.1*o_totalprice >= 0.
Taking this into consideration, the CBO is:
Accessing the CUSTOMER table (line 6) with a full table scan, which is logical, as you are selecting all fields from the table and probably have no index on custkey.
Making a HASH JOIN RIGHT SEMI (line 3) between PART and ORDERS. In general, a semi join is used for an IN or EXISTS clause, and the join stops as soon as the EXISTS or IN condition is satisfied.
The HASH JOIN RIGHT ANTI on line 1 appears when the optimizer pushes the join predicate into a view, normally when an anti join (NOT IN) is in place. The result is then joined to the CUSTOMER table on line 6.
You are filtering only on the right-hand table of the join (ORDERS), which is why the access paths reflect that.
This is just an overview of your execution plan and of the reasons why the CBO is using those access paths.
I have one table that needs to be split into several other tables.
The main table is just a transient (staging) table.
I dump data from an Excel file into it (from 5k to 200k rows) and, using INSERT INTO ... SELECT, split it into the correct tables (five different tables).
However, the latest dataset that my client sent has records with duplicate values.
The primary key for my table is usually ENI. But even this value is duplicated, because the same company can be both a customer and a service provider, so it has two different records that use the same ENI.
What I have so far:
I found a script that uses MERGE and modified it to find rows with the same ENI and set the same main_id on all of them.
| Main_id | ENI    | company_name | Type |
| 1       | 1864   | JOHN         | C    |
| 2       | 351485 | JOEL         | C    |
| 3       | 16546  | MICHEL       | C    |
| 2       | 351485 | JOEL J.      | S    |
| 1       | 1864   | JOHN E. E.   | C    |
Main_id: primary key that the main DB uses
ENI: unique company number
Type: 'C' - CUSTOMER, 'S' - SERVICE PROVIDER
In some cases the duplicates can have the same type, as with id 1.
There are several other columns...
What I need:
Insert one of the rows for each main_id that my other script already sorted, and set a flag on the others to show that they were not inserted. I can't delete any data, because I'll need to send this info to the customer to validate.
Or is this simply not doable this way, and should I go back to good old Excel?
Edit: in response to a question below, this is an example:
|Main_id| ENI | company_name| Type| RANK|
| 1 | 1864 | JOHN | C | 1 |
| 2 | 351485 | JOEL | C | 1 |
| 3 | 16546 | MICHEL | C | 1 |
| 2 | 351485 | JOEL J. | S | 2 |
| 1 | 1864 | JOHN E. E. | C | 2 |
RANK would work like this: 1864 appears 2 times, so the
first one found gets 1, the second gets 2, and so on. I tried using
RANK() OVER (PARTITION BY MAIN_ID ORDER BY ENI)
RANK() OVER (PARTITION BY company_name ORDER BY ENI)
Thanks to TEJASH I was able to come up with this solution:
MERGE INTO TABLEA S
USING (Select ROWID AS ID,
row_number() Over(partition by eni order by eni, type) as RANK_DUPLICATED
From TABLEA
) T
ON (S.ROWID = T.ID)
WHEN MATCHED THEN UPDATE SET S.RANK_DUPLICATED= T.RANK_DUPLICATED;
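Once RANK_DUPLICATED is populated, the split itself might look like the sketch below. CUSTOMER_TARGET is a hypothetical destination table and the column list is only an assumption based on the sample data; the real tables have more columns:

```sql
-- Load only the first row found for each ENI (rank 1).
-- CUSTOMER_TARGET is a made-up name standing in for one of the five target tables.
INSERT INTO CUSTOMER_TARGET (MAIN_ID, ENI, COMPANY_NAME, TYPE)
SELECT MAIN_ID, ENI, COMPANY_NAME, TYPE
FROM   TABLEA
WHERE  RANK_DUPLICATED = 1;

-- Rows ranked above 1 stay in TABLEA untouched; list them so they can be
-- sent to the customer for validation.
SELECT MAIN_ID, ENI, COMPANY_NAME, TYPE, RANK_DUPLICATED
FROM   TABLEA
WHERE  RANK_DUPLICATED > 1
ORDER  BY ENI, RANK_DUPLICATED;
```

In this arrangement RANK_DUPLICATED doubles as the "not inserted" flag, so nothing has to be deleted from the staging table.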
As far as I understand your problem, you just need to identify the duplicates based on 2 columns. You can achieve this with an analytic function as follows:
Select t.*,
row_number() Over(partition by main_id, eni order by company_name) as rnk
From your_table t
A comment on this answer notes that anti-joins may have been optimized to be more efficient than outer joins in Oracle. I'd be interested to see what explanations/evidence might support or disprove this claim.
When you use "not exists" or "not in" in your SQL query, you let Oracle choose the merge anti-join or hash anti-join access paths.
Quick Explanation
For example, given a join between tables A and B (from A join B on A.x = B.x), Oracle will fetch all relevant data from table A and try to match it with the corresponding rows in table B, so it's strictly dependent on the selectivity of the table A predicate.
When using the anti-join optimization, Oracle can choose the table with the higher selectivity and match it with the other one, which may result in much faster code.
It can't do that with a regular join or subquery, because it can't assume that one match between tables A and B is enough to return that row.
Related hints: HASH_AJ, MERGE_AJ.
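As a rough illustration of the two query shapes that leave the optimizer free to pick an anti-join, assume two hypothetical tables A and B that share a column X (names borrowed from the A.x = B.x example above). Whether an anti-join is actually chosen depends on the optimizer and the data:

```sql
-- NOT EXISTS form: a row from A is returned only when no row in B matches it.
SELECT a.x
FROM   a
WHERE  NOT EXISTS (SELECT 1 FROM b WHERE b.x = a.x);

-- NOT IN form: same intent, but the result differs when B.X can be NULL,
-- because a single NULL in the subquery makes NOT IN return no rows at all.
SELECT a.x
FROM   a
WHERE  a.x NOT IN (SELECT b.x FROM b);
```

The two forms are therefore interchangeable only when the compared columns cannot contain NULLs.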
More:
This looks like a nice and detailed article on the subject.
Here is another, more decent article.
If Oracle can transform left join + where is null into an ANTI join, then it's exactly the same.
create table ttt1 as select mod(rownum,10) id from dual connect by level <= 50000;
insert into ttt1 select 10 from dual;
create table ttt2 as select mod(rownum,10) id from dual connect by level <= 50000;
select ttt1.id
from ttt1
left join ttt2
on ttt1.id = ttt2.id
where ttt2.id is null;
select * from ttt1 where id not in (select id from ttt2);
If you have a look at
Final query after transformations:******* UNPARSED QUERY IS *******
in the trace for event 10053, you'll find two exactly identical queries (you can see "=" in the predicate in the trace file because there is no special sign for an ANTI join)
SELECT "TTT1"."ID" "ID" FROM "TTT2" "TTT2","TTT1" "TTT1" WHERE "TTT1"."ID"="TTT2"."ID"
And they have exactly the same plans
-----------------------------------
| Id | Operation | Name |
-----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH JOIN ANTI | |
| 2 | TABLE ACCESS FULL| TTT1 |
| 3 | TABLE ACCESS FULL| TTT2 |
-----------------------------------
If, however, you add a hint to disable transformations, then the plan will be
select --+ no_query_transformation
ttt1.id
from ttt1, ttt2
where ttt1.id = ttt2.id(+) and ttt2.id is null;
------------------------------------
| Id | Operation | Name |
------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | FILTER | |
| 2 | HASH JOIN OUTER | |
| 3 | TABLE ACCESS FULL| TTT1 |
| 4 | TABLE ACCESS FULL| TTT2 |
------------------------------------
and performance will degrade significantly.
If you use ANSI join syntax with transformations disabled, it will be even worse.
select --+ no_query_transformation
ttt1.id
from ttt1
left join ttt2
on ttt1.id = ttt2.id
where ttt2.id is null;
select * from table(dbms_xplan.display_cursor(format => 'BASIC'));
--------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | FILTER | |
| 3 | MERGE JOIN OUTER | |
| 4 | TABLE ACCESS FULL | TTT1 |
| 5 | BUFFER SORT | |
| 6 | VIEW | VW_LAT_2131DCCF |
| 7 | TABLE ACCESS FULL| TTT2 |
--------------------------------------------------
So, in a nutshell, if Oracle can apply the transformation to an ANTI join, performance is exactly the same; otherwise it can be worse. You can also use the hint "--+ rule" to disable CBO transformations and see what happens.
PS. On a separate note, a SEMI join may in some cases be much better than inner join + distinct, even with CBO transformations enabled.
I need to add a column to a table with a check that the input is a score of at most 999 to 999, like a soccer match score. How do I write this statement?
example:
| Score |
---------
| 1-2 |
| 10-1 |
|999-999|
| 99-99 |
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE SCORES (Score ) AS
SELECT '1-2' FROM DUAL
UNION ALL SELECT '10-1' FROM DUAL
UNION ALL SELECT '999-999' FROM DUAL
UNION ALL SELECT '99-99' FROM DUAL
UNION ALL SELECT '1000-1000' FROM DUAL;
Query 1:
SELECT SCORE,
CASE WHEN REGEXP_LIKE( SCORE, '^\d{1,3}-\d{1,3}$' )
THEN 'Valid'
ELSE 'Invalid'
END AS Validity
FROM SCORES
Results:
| SCORE | VALIDITY |
|-----------|----------|
| 1-2 | Valid |
| 10-1 | Valid |
| 999-999 | Valid |
| 99-99 | Valid |
| 1000-1000 | Invalid |
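To enforce the same rule when the column is added, the regular expression can go into a check constraint. The following is only a sketch: MATCHES, MATCH_ID and SCORE_FORMAT_CHK are hypothetical names chosen for illustration, not part of the original schema:

```sql
-- Hypothetical table that the new column is added to.
CREATE TABLE MATCHES ( MATCH_ID NUMBER PRIMARY KEY );

-- Add the column together with a check constraint that reuses the pattern
-- from Query 1: one to three digits, a hyphen, one to three digits.
ALTER TABLE MATCHES ADD (
  SCORE VARCHAR2(7)
    CONSTRAINT SCORE_FORMAT_CHK
    CHECK ( REGEXP_LIKE( SCORE, '^\d{1,3}-\d{1,3}$' ) )
);

-- Accepted: matches the pattern.
INSERT INTO MATCHES ( MATCH_ID, SCORE ) VALUES ( 1, '999-999' );

-- Rejected with ORA-02290 (check constraint violated): four digits on the left.
INSERT INTO MATCHES ( MATCH_ID, SCORE ) VALUES ( 2, '1000-1' );
```

Note that a NULL score still passes the check; add NOT NULL to the column if a score must always be present.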
The query in listing 1 joins two subqueries, both of which are computed from two named subqueries (ANIMAL and SEA_CREATURE). The output should list animals that don't live in the sea, and list animals that do live in the sea.
When run in a console window (SQL Navigator 5.5), the server returns the error:
15:21:30 ORA-00600: internal error code, arguments: [evapls1], [], [], [], [], [], [], []
Why? And how to get around it?
Interestingly, I can run the same query in a program written in Delphi XE7 (using the TSQLQuery component), and it works OK. But this is not a problem with SQL Navigator. If I create a view containing the expression in Listing 1, selecting from the view does not output an error. The problem is in the Oracle server.
If I make the ANIMAL subquery really simple, as in Listing 2, it works. But anything else, even just selecting from a table, results in this internal error.
Listing 1: (Outputs error)
with ANIMAL as (
select ANIMAL_NAME
from xmltable( 't/e' passing xmltype( '<t><e>Tuna</e><e>Cat</e><e>Dolphin</e><e>Swallow</e></t>')
columns
ANIMAL_NAME varchar2(100) path 'text()')),
SEA_CREATURE as (
select 'Tuna' as CREATURE_NAME from dual
union all select 'Shark' from dual
union all select 'Dolphin' from dual
union all select 'Plankton' from dual)
select NONSEA_ANIMALS, SEA_ANIMALS
from (
select stringagg( ANIMAL_NAME) as NONSEA_ANIMALS
from ( (select * from ANIMAL)
minus (select CREATURE_NAME as ANIMAL_NAME from SEA_CREATURE))),
(select stringagg( ANIMAL_NAME) as SEA_ANIMALS
from ANIMAL
where ANIMAL_NAME in
(select CREATURE_NAME as ANIMAL_NAME from SEA_CREATURE))
Listing 2: (This works)
with ANIMAL as (
select 'Tuna' as ANIMAL_NAME from dual
union all select 'Cat' from dual
union all select 'Dolphin' from dual
union all select 'Swallow' from dual),
SEA_CREATURE as (
select 'Tuna' as CREATURE_NAME from dual
union all select 'Shark' from dual
union all select 'Dolphin' from dual
union all select 'Plankton' from dual)
select NONSEA_ANIMALS, SEA_ANIMALS
from (
select stringagg( ANIMAL_NAME) as NONSEA_ANIMALS
from ( (select * from ANIMAL)
minus (select CREATURE_NAME as ANIMAL_NAME from SEA_CREATURE))),
(select stringagg( ANIMAL_NAME) as SEA_ANIMALS
from ANIMAL
where ANIMAL_NAME in
(select CREATURE_NAME as ANIMAL_NAME from SEA_CREATURE));
Listing 3: Expected output for expressions in both Listings 1 & 2:
NONSEA_ANIMALS SEA_ANIMALS
-------------------------------
'Cat,Swallow' 'Tuna,Dolphin'
The Oracle banner is shown in Listing 4.
Listing 4: select * from v$version
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production
How is this craziness explained?
Update
Here is the explain plan ...
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TEMP TABLE TRANSFORMATION | |
| 2 | LOAD AS SELECT | |
| 3 | VIEW | |
| 4 | COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE |
| 5 | LOAD AS SELECT | |
| 6 | UNION-ALL | |
| 7 | FAST DUAL | |
| 8 | FAST DUAL | |
| 9 | FAST DUAL | |
| 10 | FAST DUAL | |
| 11 | NESTED LOOPS | |
| 12 | VIEW | |
| 13 | SORT AGGREGATE | |
| 14 | VIEW | |
| 15 | MINUS | |
| 16 | SORT UNIQUE | |
| 17 | VIEW | |
| 18 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6666_765BCCBD |
| 19 | SORT UNIQUE | |
| 20 | VIEW | |
| 21 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6667_765BCCBD |
| 22 | VIEW | |
| 23 | SORT AGGREGATE | |
| 24 | HASH JOIN RIGHT SEMI | |
| 25 | VIEW | VW_NSO_1 |
| 26 | VIEW | |
| 27 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6667_765BCCBD |
| 28 | VIEW | |
| 29 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6666_765BCCBD |
----------------------------------------------------------------------------
ORA-03113 and ORA-00600 usually happen with a WITH-clause query when something fatal happened during execution.
Oracle's subquery factoring, or WITH clause, can be overused at times. Oracle may create a global temporary table for every query inside the WITH clause in order to reuse the results. So XMLTABLE() could have created another GTT here, and perhaps this crashed the database.
COLLECTION ITERATOR PICKLER FETCH appears when rows are fetched from a PL/SQL object. It returns pickled (packed and formatted) data.
It might involve the creation of some temp table underneath, as I mentioned previously. So the subquery factoring and the PL/SQL array selection didn't go well together.
I have also seen queries with a nested UNION ALL in WITH crash.
This is most likely a bug in Oracle and should be reported to them.
The only way to get around this for now would be to reformulate the query. In our application, use of WITH is strictly restricted (due to high CPU usage) to report-only purposes executed as batch jobs.