I have inherited a very complex SQL view definition that needs altering to improve performance. It takes a list of records based on a foreign key and displays the returned rows as columns.
Thus:
Data from select using RANK
ID RANK DKEY RECORD1 RECORD2 RECORD3
1 1 1 003 Rob Emmerry
1 2 2 004 Sue Emmerry
Returns
ID REC11 REC12 REC13 REC21 REC22 REC23
1 003 Rob Emmerry 004 Sue Emmerry
There are 37 columns of data repeated for each returned row, up to a maximum of 5 rows.
Using
SELECT ID,
MIN(DECODE(ranking,1,RECORD1, NULL)) AS REC11,
MIN(DECODE(ranking,1,RECORD2, NULL)) AS REC12,
MIN(DECODE(ranking,1,RECORD3, NULL)) AS REC13,
MIN(DECODE(ranking,1,RECORD4, NULL)) AS REC14,
MIN(DECODE(ranking,1,RECORD5, NULL)) AS REC15,
MIN(DECODE(ranking,1,RECORD6, NULL)) AS REC16,
MIN(DECODE(ranking,2,RECORD1, NULL)) AS REC21,
MIN(DECODE(ranking,2,RECORD2, NULL)) AS REC22,
MIN(DECODE(ranking,2,RECORD3, NULL)) AS REC23,
MIN(DECODE(ranking,2,RECORD4, NULL)) AS REC24,
MIN(DECODE(ranking,2,RECORD5, NULL)) AS REC25,
MIN(DECODE(ranking,2,RECORD6, NULL)) AS REC26
FROM
(
SELECT TABLEA.ID, RANK () OVER (PARTITION BY TABLEA.ID ORDER BY TABLEA.DKEY) ranking,
RECORD1,
RECORD2,
RECORD3,
RECORD4,
RECORD5,
RECORD6
FROM TABLEA
JOIN
(SELECT ID, DKEY, RECORD4, RECORD5, RECORD6
FROM TABLEB
) TABLEB ON TABLEB.DKEY = TABLEA.DKEY AND TABLEB.ID = TABLEA.ID
)
GROUP BY ID;
When I look at the explain plan while filtering on the DKEY field, which has an index, the index is ignored, presumably because of the MIN/DECODE expressions.
So I thought about rewriting this using PIVOT but don't know how to start.
Any thoughts as to how I can
a) Get the query to use the index
b) Rewrite using PIVOT
First option is obviously preferable.
Thanks
Craig
UPDATE
Here is some sample data showing how my tables are.
Table 1
DKEY PID RECORD1 RECORD2 RECORD3
1 1 3 Rob Emmerry
2 1 4 Sue Emmerry
3 1 4 Jan Morris
4 1 4 Sue Pye
5 1 4 Jane Taylor
Table 2
CID DKEY RECORD10
1 3 A
2 3 D
3 3 G
4 3 J
5 4 A
6 5 A
7 5 D
8 6 A
9 6 D
10 6 G
11 7 A
12 7 D
13 7 G
14 7 J
15 7 M
Table 3
QID DKEY RECORD3
1 3 C
2 6 C
3 6 F
4 7 C
5 7 F
So tables 2 & 3 link to table 1 with DKEY
If we took the DKEY=3 as an example I would want to see this:-
PID DKEY REC1 REC2 REC3 REC4 REC5 REC6 REC7 REC8 REC9 REC10 REC11 REC12 REC13
1 3 4 Jan Morris A D G J NULL C NULL NULL NULL NULL
There could be up to 5 rows in each of tables 2 & 3. Fields PID, DKEY, REC1-REC3 from table 1, REC4-REC8 come from table 2 and the rest from table 3. The other records from table 1 would simply continue on the row so after REC13, DKEY=4 etc etc.
Hope this makes sense.
SELECT
ID,
MIN(DECODE(ranking,1,RECORD1, NULL)) AS REC11,
MIN(DECODE(ranking,1,RECORD2, NULL)) AS REC12,
MIN(DECODE(ranking,1,RECORD3, NULL)) AS REC13,
MIN(DECODE(ranking,1,RECORD4, NULL)) AS REC14,
MIN(DECODE(ranking,1,RECORD5, NULL)) AS REC15,
MIN(DECODE(ranking,1,RECORD6, NULL)) AS REC16,
MIN(DECODE(ranking,2,RECORD1, NULL)) AS REC21,
MIN(DECODE(ranking,2,RECORD2, NULL)) AS REC22,
MIN(DECODE(ranking,2,RECORD3, NULL)) AS REC23,
MIN(DECODE(ranking,2,RECORD4, NULL)) AS REC24,
MIN(DECODE(ranking,2,RECORD5, NULL)) AS REC25,
MIN(DECODE(ranking,2,RECORD6, NULL)) AS REC26
FROM
(
SELECT /*+ INDEX(tablea tablea_index) */
TABLEA.ID,
RANK () OVER (PARTITION BY TABLEA.ID ORDER BY TABLEA.DKEY) ranking,
RECORD1,
RECORD2,
RECORD3,
RECORD4,
RECORD5,
RECORD6
FROM TABLEA
JOIN TABLEB
-- was: ON TABB.DKEY = TABLEA.DKEY AND TABB ON TABB.ID = TABLEA.ID
ON TABLEB.DKEY = TABLEA.DKEY
AND TABLEB.ID = TABLEA.ID
)
GROUP BY ID;
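The rows-to-columns step in the query above can be tried on the two sample rows from the question. This is only a minimal sketch under stated assumptions: it runs in SQLite through Python's sqlite3 module as a stand-in for Oracle, uses CASE where the original uses DECODE, and keeps only RECORD1 to RECORD3 from a single table, so it says nothing about index usage or the Oracle execution plan.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER, dkey INTEGER,
                         record1 TEXT, record2 TEXT, record3 TEXT);
    INSERT INTO tablea VALUES (1, 1, '003', 'Rob', 'Emmerry');
    INSERT INTO tablea VALUES (1, 2, '004', 'Sue', 'Emmerry');
""")

# Rank the rows per id, then spread each rank into its own set of columns.
# CASE WHEN ... END plays the role of DECODE(ranking, n, col, NULL).
pivot_sql = """
    SELECT id,
           MIN(CASE WHEN ranking = 1 THEN record1 END) AS rec11,
           MIN(CASE WHEN ranking = 1 THEN record2 END) AS rec12,
           MIN(CASE WHEN ranking = 1 THEN record3 END) AS rec13,
           MIN(CASE WHEN ranking = 2 THEN record1 END) AS rec21,
           MIN(CASE WHEN ranking = 2 THEN record2 END) AS rec22,
           MIN(CASE WHEN ranking = 2 THEN record3 END) AS rec23
    FROM (SELECT id,
                 RANK() OVER (PARTITION BY id ORDER BY dkey) AS ranking,
                 record1, record2, record3
          FROM tablea)
    GROUP BY id
"""
rows = conn.execute(pivot_sql).fetchall()
print(rows)
```

Each id collapses to one row, with the rank-1 values in REC11 to REC13 and the rank-2 values in REC21 to REC23, which is the layout shown under "Returns" in the question.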
There are two tables, as below:
Table1
ID Name Age Active PID
-----------------------------
1 A 2 Y 100
2 A 2 Y 100
3 A 2 Y 100
4 B 3 Y 200
5 B 3 Y 200
Table2
T2ID CID
---------
10 1
20 1
30 1
40 2
50 2
60 3
70 3
80 3
90 4
100 5
110 5
I am trying to deactivate the duplicate records in table 1 and reassign the table 2 records to the rows of table 1 that remain active. The result for table1 and table2 should be as below:
ID Name Age Active PID
-----------------------------
1 A 2 Y 100
2 A 2 N 100
3 A 2 N 100
4 B 3 N 200
5 B 3 Y 200
T2ID CID
---------
10 1
20 1
30 1
40 1
50 1
60 1
70 1
80 1
90 5
100 5
110 5
Please help with an Oracle query to perform this update.
You can do this by using two merge statements, like so:
Update table2:
MERGE INTO table2 tgt
USING (WITH t1 AS (SELECT ID,
NAME,
age,
active,
pid,
MIN(ID) OVER (PARTITION BY pid) min_id,
CASE WHEN COUNT(CASE WHEN active = 'Y' THEN 1 END) OVER (PARTITION BY pid) > 1 THEN 'Y' ELSE 'N' END multi_active_rows
FROM table1)
SELECT t2.t2id,
t2.cid old_cid,
t1.min_id new_cid
FROM t1
INNER JOIN table2 t2 ON t1.id = t2.cid
WHERE t1.multi_active_rows = 'Y') src
ON (tgt.t2id = src.t2id)
WHEN MATCHED THEN
UPDATE SET tgt.cid = src.new_cid;
Update table1:
MERGE INTO table1 tgt
USING (WITH t1 AS (SELECT ID,
NAME,
age,
active,
pid,
MIN(ID) OVER (PARTITION BY pid) min_id,
CASE WHEN COUNT(CASE WHEN active = 'Y' THEN 1 END) OVER (PARTITION BY pid) > 1 THEN 'Y' ELSE 'N' END multi_active_rows
FROM table1)
SELECT ID
FROM t1
WHERE multi_active_rows = 'Y'
AND ID != min_id) src
ON (tgt.id = src.id)
WHEN MATCHED THEN
UPDATE SET active = 'N';
Since we want to derive the results to update both table1 and table2 from the original dataset in table1, it's easier to update table2 first before updating table1.
This works by finding the lowest id across each set of pids in table1, plus checking to see if there is more than one active row for each pid (there's no need to do any updates if we have at most one active row available).
Once we have that information, we can use it to decide which rows to update in each table: table2 rows are repointed to the min_id, and any table1 row whose id doesn't match the min_id is set to not active.
N.B. If you could have a mix of Ys and Ns in your data, you may need to skip the and id != min_id check in the second merge statement and amend the update part to update the row to Y if the id is the min_id, otherwise set it to N.
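The logic of the two merges can be checked against the sample data from the question. The following is a rough sketch, not the Oracle statements themselves: it assumes SQLite via Python's sqlite3 module, which has no MERGE, so the derived rows are read once from the original table1 and then applied with plain UPDATE statements.

```python
import sqlite3

# Assumption: in-memory SQLite as a stand-in for Oracle; no MERGE available.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, age INTEGER,
                         active TEXT, pid INTEGER);
    CREATE TABLE table2 (t2id INTEGER PRIMARY KEY, cid INTEGER);
    INSERT INTO table1 VALUES (1,'A',2,'Y',100),(2,'A',2,'Y',100),(3,'A',2,'Y',100),
                              (4,'B',3,'Y',200),(5,'B',3,'Y',200);
    INSERT INTO table2 VALUES (10,1),(20,1),(30,1),(40,2),(50,2),(60,3),
                              (70,3),(80,3),(90,4),(100,5),(110,5);
""")

# Same derived columns as the t1 subquery used in both MERGE statements.
src = conn.execute("""
    SELECT id,
           MIN(id) OVER (PARTITION BY pid) AS min_id,
           COUNT(CASE WHEN active = 'Y' THEN 1 END)
               OVER (PARTITION BY pid) AS active_rows
    FROM table1
""").fetchall()

# (new_cid, old_id) for every row that is not the lowest id of a pid
# that has more than one active row.
dupes = [(min_id, row_id) for row_id, min_id, active_rows in src
         if active_rows > 1 and row_id != min_id]

# Repoint table2 first, then deactivate the duplicates in table1.
conn.executemany("UPDATE table2 SET cid = ? WHERE cid = ?", dupes)
conn.executemany("UPDATE table1 SET active = 'N' WHERE id = ?",
                 [(old_id,) for _, old_id in dupes])

table1_state = conn.execute("SELECT id, active FROM table1 ORDER BY id").fetchall()
table2_state = conn.execute("SELECT t2id, cid FROM table2 ORDER BY t2id").fetchall()
print(table1_state)
print(table2_state)
```

Because the survivor is chosen with MIN(id), the row kept for PID 200 is ID 4. The expected output in the question keeps ID 5 for that PID, so a different rule for picking the surviving row would be needed to reproduce it exactly.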
Given the following oracle database table:
group revision comment
1 1 1
1 2 2
1 null null
2 1 1
2 2 2
2 3 3
2 4 4
2 null null
3 1 1
3 2 2
3 3 3
3 null null
I want to shift the comment column one step down in relation to revision, within its group, so that I get the following table:
group revision comment
1 1 null
1 2 1
1 null 2
2 1 null
2 2 1
2 3 2
2 4 3
2 null 4
3 1 null
3 2 1
3 3 2
3 null 3
I have the following query:
MERGE INTO example_table t1
USING example_table t2
ON (
(t1.revision = t2.revision+1 OR
(t2.revision = (
SELECT MAX(t3.revision)
FROM example_table t3
WHERE t3.group = t1.group
) AND t1.revision IS NULL)
)
AND t1.group = t2.group)
WHEN MATCHED THEN UPDATE SET t1.comment = t2.comment;
That does most of this (still need a separate query to cover revision = 1), but it is very slow.
So my question is, how do I use Max here as efficiently as possible to pull out the highest revision for each group?
I would use LAG rather than MAX:
create table example_table(group_id number, revision number, comments varchar2(40));
insert into example_table values (1,1,1);
insert into example_table values (1,2,2);
insert into example_table values (1,3,null);
insert into example_table values (2,1,1);
insert into example_table values (2,2,2);
insert into example_table values (2,3,3);
insert into example_table values (2,4,null);
select * from example_table;
merge into example_table e
using (select group_id, revision, comments, lag(comments, 1) over (partition by group_id order by revision nulls last) comments1 from example_table) u
on (u.group_id = e.group_id and nvl(u.revision,0) = nvl(e.revision,0))
when matched then update set comments = u.comments1;
select * from example_table;
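The LAG approach can also be exercised on rows with a NULL revision, as in the question's data. A minimal sketch, assuming SQLite through Python's sqlite3 module instead of Oracle: the shifted values are read first and then written back by rowid, and `revision IS NULL` in the ORDER BY is used here in place of NULLS LAST.

```python
import sqlite3

# Assumption: in-memory SQLite as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE example_table (group_id INTEGER, revision INTEGER, comments TEXT);
    INSERT INTO example_table VALUES
        (1, 1, '1'), (1, 2, '2'), (1, NULL, NULL),
        (2, 1, '1'), (2, 2, '2'), (2, 3, '3'), (2, 4, '4'), (2, NULL, NULL);
""")

# For each row, fetch the comment of the previous revision in the same group.
# Sorting on "revision IS NULL" first places the NULL revision last.
shifted = conn.execute("""
    SELECT LAG(comments) OVER (PARTITION BY group_id
                               ORDER BY revision IS NULL, revision) AS prev_comment,
           rowid
    FROM example_table
""").fetchall()

# Write the shifted comments back, one row at a time, keyed by rowid.
conn.executemany("UPDATE example_table SET comments = ? WHERE rowid = ?", shifted)

result = conn.execute("""
    SELECT group_id, revision, comments
    FROM example_table
    ORDER BY group_id, revision IS NULL, revision
""").fetchall()
for row in result:
    print(row)
```

The first revision of each group ends up with no comment, and the NULL-revision row receives the comment of the highest revision, so no separate MAX lookup or extra statement for revision = 1 is involved.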
Can anyone help me create an AUTO_INCREMENT column on a view in Oracle 11g?
Thanks
While it's not possible to return a single unique identity column for a view whose underlying data does not have any single unique identifier, it is possible to return composite values that uniquely identify the data. For example given a table of CSV Data with a unique ID on each row:
create table sample (id number primary key, csv varchar2(4000));
where the CSV column contains a string of comma separated values:
insert into sample
select 1, 'a' from dual union all
select 2, 'b,c' from dual union all
select 3, 'd,"e",f' from dual union all
select 4, ',h,' from dual union all
select 5, 'j,"",l' from dual union all
select 6, 'm,,o' from dual;
The following query will unpivot the CSV data, and the composite values (ID, SEQ) will uniquely identify each VAL. The ID column identifies the record the data came from, and SEQ uniquely identifies the position in the CSV:
WITH pvt(id, seq, csv, val, nxt) as (
SELECT id -- Parse out individual list items
, 1 -- separated by commas and
, csv -- optionally enclosed by quotes
, REGEXP_SUBSTR(csv,'(["]?)([^,]*)\1',1,1,null,2)
, REGEXP_INSTR(csv, ',', 1, 1)
FROM sample
UNION ALL
SELECT id
, seq+1
, csv
, REGEXP_SUBSTR(csv,'(["]?)([^,]*)\1',nxt+1,1,null,2)
, REGEXP_INSTR(csv, ',', nxt+1, 1)
FROM pvt
where nxt > 0
)
select * from pvt order by id, seq;
ID SEQ CSV VAL NXT
---------- ---------- ---------- ---------- ----------
1 1 a a 0
2 1 b,c b 2
2 2 b,c c 0
3 1 d,"e",f d 2
3 2 d,"e",f e 6
3 3 d,"e",f f 0
4 1 ,h, [NULL] 1
4 2 ,h, h 3
4 3 ,h, [NULL] 0
5 1 j,"",l j 2
5 2 j,"",l [NULL] 5
5 3 j,"",l l 0
6 1 m,,o m 2
6 2 m,,o [NULL] 3
6 3 m,,o o 0
15 rows selected.
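As a cross-check of the (ID, SEQ) composite key, the same sample strings can be split outside the database. This is a sketch using a different technique from the recursive query above, namely Python's standard csv module, and it assumes that empty fields should be reported as missing values, the way Oracle shows them as NULL.

```python
import csv

# The same sample rows that were inserted into the SAMPLE table.
samples = {1: 'a', 2: 'b,c', 3: 'd,"e",f', 4: ',h,', 5: 'j,"",l', 6: 'm,,o'}

rows = []
for row_id, text in samples.items():
    # csv.reader handles the optional double quotes around a field.
    fields = next(csv.reader([text]))
    for seq, val in enumerate(fields, start=1):
        # Empty strings become None to mirror Oracle's NULL.
        rows.append((row_id, seq, val or None))

for row in rows:
    print(row)

# Every (id, seq) pair occurs once, so together they identify each value.
keys = {(row_id, seq) for row_id, seq, _ in rows}
```

The pairs in `keys` are all distinct, which is the property that lets (ID, SEQ) serve as an identifier when the view has no single unique column.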
Given a table
$cat data.csv
ID,State,City,Price,Flag
1,CA,A,95,0
2,CA,A,96,1
3,CA,A,195,1
4,NY,B,124,0
5,NY,B,128,1
6,NY,C,24,0
7,NY,C,27,1
8,NY,C,29,0
9,NY,C,39,1
Expected Result:
ID0, ID1
1,2
4,5
6,7
8,7
for each ID with Flag=0 above, we want to find another ID from Flag=1, with the same "State" and "City", and the nearest Price.
I have two rough stupid ideas:
Method 1.
Use a left outer join with the table itself on
(a.State=b.State and a.City=b.city and a.Flag=0 and b.Flag=1),
where a.Flag=0 and b.Flag=1,
and then use RANK() over (partitioned by a.State,a.City order by a.Price - b.Price) as rank
where rank=1
Method 2.
Use a left outer join with the table itself,
on
(a.State=b.State and a.City=b.city and a.Flag=0 and b.Flag=1),
where a.Flag=0 and b.Flag=1,
and then Use Distribute by a.State,a.City Sort by Price_Diff ASC limit 1
What's the best way to find the nearest neighbor in Hive?
Any valuable tips will be greatly appreciated!
select a.id, b.id , min(abs(b.price-a.price)) as delta
from data as a
inner join data as b
on a.state=b.state and
a.flag=0 and b.flag=1 and
a.city=b.city
group by a.id, b.id
order by delta asc;
This returns
1 2 1 <---
8 7 2 <---
6 7 3 <---
4 5 4 <---
8 9 10
6 9 15
1 3 100
The problem is that the last 3 rows reuse IDs that already appear in the first 4.
select a.id as id0, b.id as id1, abs(b.price-a.price) as delta,
rank() over ( partition by a.state, a.city order by abs(b.price-a.price) )
from data as a
inner join data as b
on a.state=b.state and
a.flag=0 and b.flag=1 and
a.city=b.city;
This will return
id0 id1 prc rank
1 2 1 1 <---
1 3 100 2
4 5 4 1 <---
8 7 2 1 <---
6 7 3 2
8 9 10 3
6 9 15 4
The pair (6,7) does not get rank 1, and in a sense this is correct:
6,NY,C,24,0
7,NY,C,27,1
8,NY,C,29,0
9,NY,C,39,1
The lowest price difference for (6,7),(6,9),(8,7),(8,9) is in (8,7). (ambiguous join)
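If the goal is one match per Flag=0 row, as in the question's expected result, one possible variation is to rank within each a.id instead of within each state and city. The sketch below is an assumption-laden illustration rather than the query above: it runs in SQLite via Python's sqlite3 module instead of Hive, uses ROW_NUMBER, and breaks ties on the lower b.id, which the question does not specify.

```python
import sqlite3

# Assumption: in-memory SQLite as a stand-in for Hive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER, state TEXT, city TEXT, price INTEGER, flag INTEGER);
    INSERT INTO data VALUES
        (1,'CA','A',95,0), (2,'CA','A',96,1), (3,'CA','A',195,1),
        (4,'NY','B',124,0), (5,'NY','B',128,1),
        (6,'NY','C',24,0), (7,'NY','C',27,1), (8,'NY','C',29,0), (9,'NY','C',39,1);
""")

# Pair every Flag=0 row with the Flag=1 rows of the same state and city,
# number the candidates per Flag=0 row by price distance, keep the closest.
pairs = conn.execute("""
    SELECT id0, id1
    FROM (SELECT a.id AS id0, b.id AS id1,
                 ROW_NUMBER() OVER (PARTITION BY a.id
                                    ORDER BY ABS(b.price - a.price), b.id) AS rn
          FROM data AS a
          JOIN data AS b
            ON a.state = b.state AND a.city = b.city
           AND a.flag = 0 AND b.flag = 1)
    WHERE rn = 1
    ORDER BY id0
""").fetchall()
print(pairs)
```

Partitioning by a.id lets the same Flag=1 row be chosen more than once, so ID 7 is matched to both 6 and 8, as in the expected result.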
I think you will love this video about this topic: Big Data Analytics Using Window Functions
Please help me write an Oracle stored procedure. I have two tables:
tblLead:
lead_id Name
1 x
2 y
3 z
tblTransaction:
Tran_id lead_id date status
1 1 04/20/2010 call Later
2 1 05/05/2010 confirmed
I want a result like
lead_id Name status
1 x confirmed
2 y not available !
3 z not available !
Use an outer join to the relevant rows of tblTransaction:
SELECT l.lead_id, l.NAME,
       CASE
         WHEN t.status IS NULL THEN
           'N/A'
         ELSE
           t.status
       END status
FROM tbllead l
LEFT JOIN (SELECT lead_id,
                  MAX(status) KEEP(DENSE_RANK FIRST
                                   ORDER BY adate DESC) status
           FROM tbltransaction
           GROUP BY lead_id) t ON l.lead_id = t.lead_id;
LEAD_ID NAME STATUS
---------- ---- ----------
1 x confirmed
2 y N/A
3 z N/A
Alternatively you can use analytics:
SELECT lead_id, NAME, status
FROM (SELECT l.lead_id, l.NAME,
             CASE
               WHEN t.status IS NULL THEN
                 'N/A'
               ELSE
                 t.status
             END status,
             row_number()
             over(PARTITION BY l.lead_id ORDER BY t.adate DESC) rn
      FROM tbllead l
      LEFT JOIN tbltransaction t ON l.lead_id = t.lead_id)
WHERE rn = 1;
LEAD_ID NAME STATUS
---------- ---- ----------
1 x confirmed
2 y N/A
3 z N/A
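The analytic version can be run against the sample rows from the question. A minimal sketch, assuming SQLite through Python's sqlite3 module rather than Oracle, dates stored as ISO text so they sort correctly, a hypothetical tie-break on tran_id, and the question's 'not available !' text in place of 'N/A'.

```python
import sqlite3

# Assumption: in-memory SQLite as a stand-in for Oracle; dates kept as ISO text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbllead (lead_id INTEGER, name TEXT);
    CREATE TABLE tbltransaction (tran_id INTEGER, lead_id INTEGER,
                                 adate TEXT, status TEXT);
    INSERT INTO tbllead VALUES (1,'x'), (2,'y'), (3,'z');
    INSERT INTO tbltransaction VALUES
        (1, 1, '2010-04-20', 'call Later'),
        (2, 1, '2010-05-05', 'confirmed');
""")

# Number each lead's transactions from newest to oldest and keep the newest.
# Leads without transactions still yield one row because of the LEFT JOIN.
latest = conn.execute("""
    SELECT lead_id, name, status
    FROM (SELECT l.lead_id, l.name,
                 COALESCE(t.status, 'not available !') AS status,
                 ROW_NUMBER() OVER (PARTITION BY l.lead_id
                                    ORDER BY t.adate DESC, t.tran_id DESC) AS rn
          FROM tbllead AS l
          LEFT JOIN tbltransaction AS t ON l.lead_id = t.lead_id)
    WHERE rn = 1
    ORDER BY lead_id
""").fetchall()
for row in latest:
    print(row)
```

Partitioning by the lead table's lead_id keeps every lead in its own partition, including those with no transactions at all.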
It can be written in plain SQL as follows,
SELECT lead_id, name, NVL(status,'not available !') AS status
FROM (
SELECT tblLead.lead_id, tblLead.name, tblTransaction.status,
rank ( ) OVER (PARTITION BY tblLead.lead_id ORDER BY tblTransaction.datee DESC, tblTransaction.tran_id DESC) rnk
FROM tblLead
LEFT JOIN tblTransaction ON tblLead.lead_id = tblTransaction.lead_id
)
WHERE rnk = 1
ORDER BY lead_id;
Or you may think of writing a view as follows,
CREATE VIEW trx_view AS
------
------;
Personally, I think a stored procedure is not necessary for scenarios like this.