Simplify a redundant left join - oracle

Using Oracle, I'm looking to do the following query, but I'd like to know if there is a more "intelligent" way to do it.
Select * from Sales Sales1
left join Sales Sales2 on Sales2.val = Sales1.val
left join Sales Sales3 on Sales3.val = Sales2.val
left join Sales Sales4 on Sales4.val = Sales3.val
left join Sales Sales5 on Sales5.val = Sales4.val
...
Here's what my sample data might look like
customer number | acct | start balance | open date | prev account
a | 1 | 100 | 01-01-15 | b-1
b | 1 | 80 | 03-04-14 |
c | 2 | 200 | 04-11-14 | c-1
c | 1 | 150 | 06-12-15 |
d | 1 | 600 | 08-16-15 |
e | 3 | 400 | 12-19-15 | e-2
e | 2 | 150 | 10-21-14 | e-1
e | 1 | 100 | 01-18-13 |
And a result set would look like this:
Customer | start | open | prevStart_01 | prevOpen_01 | prevStart_02 | prevOpen_02
a-1 | 100| 01-01-15| 80 | 03-04-14 | |
c-2 | 200| 04-11-14| 150 | 06-12-15 | |
e-3 | 400| 12-19-15| 150 | 10-21-14 | 100| 01-18-13
As you can see, I need to keep joining another record of sales based upon the result, and I need to keep doing so until I return an empty result set. My current scenario is running the query and seeing whether there are values in sales5, sales6, sales7, and so on.

Whenever you have to self-join an unknown number of times, you should be thinking CONNECT BY. Your particular need here isn't so straightforward, but CONNECT BY is still the key element of the solution.
In the SQL below, the mockup_data subquery factoring clause (the first entry in the WITH clause) is just there to give me some data. You'd use your actual table.
The idea is that you search your data for "roots": records that are not the prev_account of any other record. Then you start with those and CONNECT BY to get all their previous accounts, as many as there are. Then you PIVOT to get them all into columns.
One thing -- an Oracle SQL statement cannot have an arbitrary (data-driven) number of columns. The number must be known when the SQL is parsed. Therefore, in your PIVOT clause, you need to specify the maximum number of "levels" you'll support, so that Oracle knows how many columns the result set could have.
Here's the SQL.
WITH
mockup_data as (
SELECT
'a' customer_Number, 1 acct, 100 start_balance, to_date('01-01-15','MM-DD-YY') open_date, 'b-1' prev_account from dual union all
SELECT 'b' ,1, 80, to_date('03-04-14','MM-DD-YY'), null from dual union all
SELECT 'c' ,2, 200, to_date('04-11-14','MM-DD-YY'), 'c-1' from dual union all
SELECT 'c' ,1, 150, to_date('06-12-15','MM-DD-YY'), null from dual union all
SELECT 'd' ,1, 600, to_date('08-16-15','MM-DD-YY'), null from dual union all
SELECT 'e' ,3, 400, to_date('12-19-15','MM-DD-YY'), 'e-2' from dual union all
SELECT 'e' ,2, 150, to_date('10-21-14','MM-DD-YY'), 'e-1' from dual union all
SELECT 'e' ,1, 100, to_date('01-18-13','MM-DD-YY'), null from dual ),
data_with_roots AS
(SELECT d.*,
CASE
WHEN (SELECT COUNT (*)
FROM mockup_data d2
WHERE d2.prev_account = d.customer_number || '-' || d.acct) = 0 THEN
'Y'
ELSE
'N'
END
is_root
FROM mockup_data d),
hierarchy AS
(SELECT CONNECT_BY_ROOT (customer_number) customer_number,
CONNECT_BY_ROOT (acct) acct,
CONNECT_BY_ROOT (start_balance) start_balance,
CONNECT_BY_ROOT (open_date) open_date,
start_balance prev_start_balance,
open_date prev_open_date,
LEVEL - 1 lvl
FROM data_with_roots d
CONNECT BY customer_number || '-' || acct = PRIOR prev_account
START WITH is_root = 'Y'),
previous_only AS
(SELECT *
FROM hierarchy
WHERE lvl >= 1)
SELECT *
FROM previous_only PIVOT (MAX (prev_start_balance) AS prev_start, MAX (prev_open_date) AS prev_open
FOR lvl
IN (1 AS "01", 2 AS "02", 3 AS "03", 4 AS "04", 5 AS "05" -- etc... as many levels as you need to support
));
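To sanity-check what the query should return, here is a minimal sketch of the same root / walk / flatten logic in Python rather than Oracle SQL. It is only an illustration: `flatten_chains` and `MAX_LEVELS` are hypothetical names, and the cap of 2 levels is an assumption chosen to match the sample result in the question.

```python
# Rows copied from the question:
# (customer, acct, start_balance, open_date, prev_account)
ROWS = [
    ("a", 1, 100, "01-01-15", "b-1"),
    ("b", 1, 80, "03-04-14", None),
    ("c", 2, 200, "04-11-14", "c-1"),
    ("c", 1, 150, "06-12-15", None),
    ("d", 1, 600, "08-16-15", None),
    ("e", 3, 400, "12-19-15", "e-2"),
    ("e", 2, 150, "10-21-14", "e-1"),
    ("e", 1, 100, "01-18-13", None),
]

MAX_LEVELS = 2  # hypothetical cap, fixed up front like the IN (...) list of the PIVOT


def flatten_chains(rows, max_levels=MAX_LEVELS):
    """Return {root_key: (start, open, (prev_start, prev_open), ...)}."""
    by_key = {f"{r[0]}-{r[1]}": r for r in rows}
    referenced = {r[4] for r in rows if r[4] is not None}
    result = {}
    for key, (_, _, start, opened, prev) in by_key.items():
        if key in referenced:
            continue  # some other row points here, so this is not a root
        chain = []
        # Follow prev_account links, like CONNECT BY ... = PRIOR prev_account
        while prev in by_key and len(chain) < max_levels:
            _, _, prev_start, prev_opened, prev = by_key[prev]
            chain.append((prev_start, prev_opened))
        if chain:  # mirrors WHERE lvl >= 1: roots without history are dropped
            chain += [(None, None)] * (max_levels - len(chain))
            result[key] = (start, opened, *chain)
    return result
```

Calling `flatten_chains(ROWS)` yields entries for a-1, c-2 and e-3 only; d-1 is a root with no previous account, so it is dropped, just as the previous_only step drops it in the SQL.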

Related

ORACLE - How to use LAG to display strings from all previous rows into current row

I have data like below:
group | seq | activity
A | 1 | scan
A | 2 | visit
A | 3 | pay
B | 1 | drink
B | 2 | rest
I expect to have 1 new column "hist" like below:
group | seq | activity | hist
A | 1 | scan | NULL
A | 2 | visit | scan
A | 3 | pay | scan, visit
B | 1 | drink | NULL
B | 2 | rest | drink
I tried to solve this with the LAG function, but LAG only returns a value from one previous row rather than from all of them.
Truly appreciate any help!
Use a correlated sub-query:
SELECT t.*,
(SELECT LISTAGG(activity, ',') WITHIN GROUP (ORDER BY seq)
FROM table_name l
WHERE t."GROUP" = l."GROUP"
AND l.seq < t.seq
) AS hist
FROM table_name t
Or a hierarchical query:
SELECT t.*,
SUBSTR(SYS_CONNECT_BY_PATH(PRIOR activity, ','), 3) AS hist
FROM table_name t
START WITH seq = 1
CONNECT BY
PRIOR seq + 1 = seq
AND PRIOR "GROUP" = "GROUP"
Or a recursive sub-query factoring clause:
WITH rsqfc ("GROUP", seq, activity, hist) AS (
SELECT "GROUP", seq, activity, NULL
FROM table_name
WHERE seq = 1
UNION ALL
SELECT t."GROUP", t.seq, t.activity, r.hist || ',' || r.activity
FROM rsqfc r
INNER JOIN table_name t
ON (r."GROUP" = t."GROUP" AND r.seq + 1 = t.seq)
)
SEARCH DEPTH FIRST BY "GROUP" SET order_rn
SELECT "GROUP", seq, activity, SUBSTR(hist, 2) AS hist
FROM rsqfc
Which, for the sample data:
CREATE TABLE table_name ("GROUP", seq, activity) AS
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL;
All output:
GROUP | SEQ | ACTIVITY | HIST
A | 1 | scan | null
A | 2 | visit | scan
A | 3 | pay | scan,visit
B | 1 | drink | null
B | 2 | rest | drink
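For comparison, all three queries compute the same thing: for each row, the activities of earlier rows in the same group, joined in seq order. A minimal sketch of that logic in Python (not Oracle SQL; `with_history` is a hypothetical name used only for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (group, seq, activity)
ROWS = [
    ("A", 1, "scan"),
    ("A", 2, "visit"),
    ("A", 3, "pay"),
    ("B", 1, "drink"),
    ("B", 2, "rest"),
]


def with_history(rows):
    """Append a hist column listing the earlier activities of the same group."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # by group, then seq
    for _, members in groupby(ordered, key=itemgetter(0)):
        seen = []  # activities already emitted for this group
        for grp, seq, activity in members:
            # An empty history becomes None, like the NULL in the SQL output
            out.append((grp, seq, activity, ",".join(seen) or None))
            seen.append(activity)
    return out
```

The first row of each group gets `None`, and every later row gets the comma-separated list of what came before it.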
To aggregate strings in Oracle we use the LISTAGG function.
In general, an analytic function needs a windowing_clause to define a sliding window, which is how you would calculate a running total.
Unfortunately, LISTAGG doesn't support one.
To simulate this behaviour you can use the model_clause of the select statement. Below is an example with explanations.
select
group_
, activity
, seq
, hist
from t
model
/*Where to restart calculation*/
partition by (group_)
/*Add consecutive numbers to reference "previous" row per group.
May use "seq" column if its values are consecutive*/
dimension by (
row_number() over(
partition by group_
order by seq asc
) as rn
)
measures (
/*Other columns to return*/
activity
, cast(null as varchar2(1000)) as hist
, seq
)
rules update (
/*Apply this rule sequentially*/
hist[any] order by rn asc =
/*Previous concatenated result*/
hist[cv()-1]
/*Plus a comma for the third row and the following rows*/
|| presentv(activity[cv()-2], ',', '')
/*Plus the previous row's value*/
|| activity[cv()-1]
)
GROUP_ | ACTIVITY | SEQ | HIST
:----- | :------- | --: | :---------
A | scan | 1 | null
A | visit | 2 | scan
A | pay | 3 | scan,visit
B | drink | 1 | null
B | rest | 2 | drink
A few more variants (without subqueries):
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
DBFiddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=9b477a2089d3beac62579d2b7103377a
Full test case with output:
with table_name ("GROUP", seq, activity) AS (
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL
)
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
GROUP  SEQ  ACTIV  HIST1        HIST2
-----  ---  -----  -----------  ----------
A        1  scan
A        2  visit  scan,        scan
A        3  pay    scan,visit,  scan,visit
B        1  drink
B        2  rest   drink,       drink

Find top users who cumulatively have 75% of all points

I am trying to find out the top users who cumulatively have 75% of all points.
Table is:
From this list of users, the result should be dick, mary, jack and sam.
I tried the following Oracle query:
SELECT o.users, SUM (o.points)
FROM (SELECT users,
SUM (points),
RANK () OVER (ORDER BY SUM (points) DESC) r
FROM points_tbl) o;
--> error is:
ORA-00904: "o"."points": invalid identifier
Oracle 11g R2 Schema Setup:
CREATE TABLE points ( "user", points ) AS
SELECT 'joe', 10 FROM DUAL UNION ALL
SELECT 'bill', 15 FROM DUAL UNION ALL
SELECT 'dick', 25 FROM DUAL UNION ALL
SELECT 'jack', 32 FROM DUAL UNION ALL
SELECT 'mary', 45 FROM DUAL UNION ALL
SELECT 'noe', 12 FROM DUAL UNION ALL
SELECT 'sam', 18 FROM DUAL;
Query 1:
SELECT "user", points
FROM (
SELECT p.*,
COALESCE(
SUM( points ) OVER (
ORDER BY points DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
),
0
) / SUM( points ) OVER () AS pct
FROM points p
ORDER BY points DESC
)
WHERE pct < .75
Results:
| user | POINTS |
|------|--------|
| mary | 45 |
| jack | 32 |
| dick | 25 |
| sam | 18 |
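The query keeps a row while the points accumulated by the rows ranked above it are still under 75% of the total. A minimal sketch of that running-total rule in Python (not Oracle SQL; `top_users` is a hypothetical name, and the data is copied from the schema setup):

```python
# Points per user, copied from the schema setup above
POINTS = {
    "joe": 10, "bill": 15, "dick": 25, "jack": 32,
    "mary": 45, "noe": 12, "sam": 18,
}


def top_users(points, threshold=0.75):
    """Keep users, highest first, while the points before them stay under threshold."""
    total = sum(points.values())
    running = 0  # points of the users already selected (the "1 PRECEDING" sum)
    selected = []
    ranked = sorted(points.items(), key=lambda kv: kv[1], reverse=True)
    for user, pts in ranked:
        if running / total >= threshold:
            break  # the running sum only grows, so no later user can qualify
        selected.append((user, pts))
        running += pts
    return selected
```

With the sample data the total is 157. Before bill the running sum is 120, and 120 / 157 is about 0.764, so bill and everyone below him are excluded.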

Oracle 11g - How to calculate the value of a number in range minimum or max

I need help finding a solution to my problem, please.
I have a table like this:
| ID | Number |
| 6 | 20.90 |
| 7 | 45.00 |
| 8 | 52.00 |
| 9 | 68.00 |
| 10 | 120.00 |
| 11 | 220.00 |
| 12 | 250.00 |
The first range is 0 - 20.90.
When the value is at the halfway point, the ID should be the one for the max range.
If the value is 20.91, I want to get "ID = 6".
If the value is 31.00, I want to get "ID = 6".
If the value is 33.95, I want to get "ID = 7".
If the value is 44.99, I want to get "ID = 7".
How can I do it? Is there a function that will do what I need?
If you want the record with a number that is closest to your input, then you can use this:
select *
from (
select *
from mytable
order by abs(number - my_input_number), id
)
where rownum < 2
The inner query selects all records, but orders them by the distance they have from your input number. This distance can be calculated with number - my_input_number. But that could be negative, so we take the absolute value of that. This result is not output; it is just used to order by. So records with smaller distances will come first.
Now we need just the first of those records, and that is what the outer query does with the Oracle pseudocolumn rownum: it assigns a sequence number (1, 2, 3, ...) to each row as it is returned from the ordered inner query. The where clause filters away all records we do not want to see, leaving only one (the one with the smallest distance).
As mathguy suggested in comments, the order by now also has a second value to order by in case the input value is right at the mid point between the two closest records. In that case the record with the lowest id value will be chosen.
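The ordering rule (smallest distance first, then lowest id on a tie) can be checked outside the database. Below is a minimal Python sketch, not Oracle SQL; `closest_id` is a hypothetical name. It uses `Decimal` on the assumption that the column holds exact decimal values, so that an input exactly at a midpoint really is a tie.

```python
from decimal import Decimal

# (id, number) rows copied from the question
TABLE = [
    (6, Decimal("20.90")),
    (7, Decimal("45.00")),
    (8, Decimal("52.00")),
    (9, Decimal("68.00")),
    (10, Decimal("120.00")),
    (11, Decimal("220.00")),
    (12, Decimal("250.00")),
]


def closest_id(rows, x):
    """Return the id whose number is nearest to x; ties go to the lowest id."""
    # Same sort key as: order by abs(number - my_input_number), id
    best = min(rows, key=lambda row: (abs(row[1] - x), row[0]))
    return best[0]
```

For example, `closest_id(TABLE, Decimal("33.95"))` returns 7, while the exact midpoint `Decimal("32.95")` is equally far from 20.90 and 45.00 and resolves to the lower id, 6.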
This is a good illustration of the power of analytic functions:
with mytable ( id, value ) as (
select 6, 20.90 from dual union all
select 7, 45.00 from dual union all
select 8, 52.00 from dual union all
select 9, 68.00 from dual union all
select 10, 120.00 from dual union all
select 11, 220.00 from dual union all
select 12, 250.00 from dual
),
inputs ( x ) as (
select 0.00 from dual union all
select 20.91 from dual union all
select 31.00 from dual union all
select 33.95 from dual union all
select 44.99 from dual union all
select 68.00 from dual union all
select 32.95 from dual union all
select 400.11 from dual
)
-- End of test data (not part of the solution). SQL query begins BELOW THIS LINE
select val as x, new_id as closest_id
from (
select id, val,
last_value(id ignore nulls) over (order by val desc) as new_id
from (
select id, (value + lead(value) over (order by value))/2 as val
from mytable
union all
select null, x
from inputs
)
)
where id is null
order by x -- if needed
;
Output:
X CLOSEST_ID
------ ----------
0 6
20.91 6
31 6
32.95 6
33.95 7
44.99 7
68 9
400.11 12

How do I conditionally group by two different columns in Oracle?

Suppose I have a table with the following data:
+----------+-----+--------+
| CLASS_ID | Day | Period |
+----------+-----+--------+
| 1 | A | CCR |
+----------+-----+--------+
| 1 | B | CCR |
+----------+-----+--------+
| 2 | A | 1 |
+----------+-----+--------+
| 2 | A | 2 |
+----------+-----+--------+
| 3 | A | 3 |
+----------+-----+--------+
| 3 | B | 4 |
+----------+-----+--------+
| 4 | A | 5 |
+----------+-----+--------+
As you could probably guess from the nature of the data, I'm working on an Oracle SQL query that pulls class schedule data from a Student Information System. I'm trying to pull a class's "period expression", a calculated value that combines the Day and Period fields into a single field. Let's get my expectation out of the way first:
If the Periods match, Period should be the GROUP BY field, and Day should be the aggregated field (via a LISTAGG function), so the calculated field would be CCR (A-B)
If the Days match, Day should be the GROUP BY field, and Period should be the aggregated field, so the calculated field would be 1-2 (A)
I'm only aware of how to do each GROUP BY individually, something like for where Days match:
SELECT
day,
LISTAGG(period, '-') WITHIN GROUP (ORDER BY period)
FROM schedule
GROUP BY day
and vice versa for matching Periods, but I'm not seeing how I could do that dynamically for Period and Day in the same query.
You'll also notice that the last row in the example data set doesn't span multiple days or periods, so I also need to account for classes that don't need a GROUP BY at all.
Edit
The end result should be:
+------------+
| Expression |
+------------+
| CCR(A-B) |
+------------+
| 1-2(A) |
+------------+
| 3-4(A-B) |
+------------+
| 5(A) |
+------------+
It is really not clear to me WHY you want output in that way. It doesn't provide any useful information (I don't think) - you can't tell, for example for class_id = 3, which combinations of day and period are actually used. There are four possible combinations (according to the output), but only two are actually in the class schedule.
Anyway - you may have your reasons. Here is how you can do it. You seem to want to LISTAGG both the day and the period (both grouped by class_id; they are not grouped by each other). The difficulty is that you want only distinct values in the aggregate lists, with no duplicates. So you will need to select distinct, separately for period and for day, then do the list aggregations, and then concatenate the results in an inner join.
Something like this:
with
test_data ( class_id, day, period ) as (
select 1, 'A', 'CCR' from dual union all
select 1, 'B', 'CCR' from dual union all
select 2, 'A', '1' from dual union all
select 2, 'A', '2' from dual union all
select 3, 'A', '3' from dual union all
select 3, 'B', '4' from dual union all
select 4, 'A', '5' from dual
)
-- end of test data; the actual solution (SQL query) begins below this line
select a.class_id, a.list_per || '(' || b.list_day || ')' as expression
from ( select class_id,
listagg(period, '-') within group (order by period) as list_per
from ( select distinct class_id, period from test_data )
group by class_id
) a
inner join
( select class_id,
listagg(day, '-') within group (order by day) as list_day
from ( select distinct class_id, day from test_data )
group by class_id
) b
on a.class_id = b.class_id
;
CLASS_ID EXPRESSION
-------- ----------
1 CCR(A-B)
2 1-2(A)
3 3-4(A-B)
4 5(A)
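The shape of that query (distinct periods and distinct days per class, each sorted and joined, then concatenated) can be sketched in Python to confirm the expected expressions. This is an illustration only, not Oracle SQL, and `period_expressions` is a hypothetical name.

```python
# (class_id, day, period) rows copied from the test data above
SCHEDULE = [
    (1, "A", "CCR"),
    (1, "B", "CCR"),
    (2, "A", "1"),
    (2, "A", "2"),
    (3, "A", "3"),
    (3, "B", "4"),
    (4, "A", "5"),
]


def period_expressions(rows):
    """Build 'periods(days)' per class from distinct, sorted values."""
    periods, days = {}, {}
    for class_id, day, period in rows:
        # Sets play the role of SELECT DISTINCT in the two inline views
        periods.setdefault(class_id, set()).add(period)
        days.setdefault(class_id, set()).add(day)
    result = {}
    for class_id in sorted(periods):
        period_list = "-".join(sorted(periods[class_id]))
        day_list = "-".join(sorted(days[class_id]))
        result[class_id] = f"{period_list}({day_list})"
    return result
```

Periods are sorted as strings here, which matches the SQL because the period column holds character values such as 'CCR'.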
How about union with having count(*) = 1?
select LISTAGG(period, '-') WITHIN GROUP (ORDER BY period) as list
from schedule
group by CLASS_ID, day
having count(*) = 1
union all
select LISTAGG(day, '-') WITHIN GROUP (ORDER BY day) as list
from schedule
group by CLASS_ID, period
having count(*) = 1

duplicating entries in listagg function

I have a table with two fields, id and controlflag. It looks like this:
Id CntrlFlag
121 SSSSSRNNNSSRSSNNR
122 SSSNNRRSSNNRSSSSS
123 RRSSSNNSSSSSSSSSSSSSSS
I have to get output in the following form (the positions of each occurrence of R):
Id Flag
121 6,12,17
122 6,7,12
123 1,2
I tried this Oracle query (which I obtained from this forum):
select mtr_id,listagg(str,',') within group (order by lvl) as flags from
( select mtr_id, instr(mtr_ctrl_flags,'R', 1, level) as str, level as lvl
from mer_trans_reject
connect by level <= regexp_count(mtr_ctrl_flags, 'R'))group by mtr_id;
It gives the result, but the 2nd and 3rd occurrences (not the 1st one) are duplicated a number of times.
It looks like
id Flag
123 6,12,12,12,12,17,17,17,17,17.
Does anybody know what's wrong here?
It could be avoided with the select distinct keyword. Is there any other way?
Yes, there is, but this one is a little bit heavier (distinct will cost you less):
with t1(Id1, CntrlFlag) as(
select 121, 'SSSSSRNNNSSRSSNNR' from dual union all
select 122, 'SSSNNRRSSNNRSSSSS' from dual union all
select 123, 'RRSSSNNSSSSSSSSSSSSSSS' from dual
)
select w.id1
, listagg(w.r_pos, ',') within group(order by w.r_pos) as R_Positions
from (select q.id1
, regexp_instr(q.CntrlFlag,'R', 1, t.rn) as r_pos
from t1 q
cross join (select rownum rn
from(select max (regexp_count(CntrlFlag, 'R')) ml
from t1
)
connect by level <= ml
) t
) w
where w.r_pos > 0
group by w.id1
Result:
ID1 R_POSITIONS
---------- -----------
121 6,12,17
122 6,7,12
123 1,2
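To double-check the positions the question asks for, here is a minimal Python sketch (not Oracle SQL; `r_positions` is a hypothetical name) that lists the 1-based positions of each R in ascending order, as in the expected output given in the question.

```python
# id -> control flag string, copied from the question
FLAGS = {
    121: "SSSSSRNNNSSRSSNNR",
    122: "SSSNNRRSSNNRSSSSS",
    123: "RRSSSNNSSSSSSSSSSSSSSS",
}


def r_positions(flags, target="R"):
    """Map each id to the comma-separated 1-based positions of target."""
    return {
        key: ",".join(
            str(pos)
            for pos, ch in enumerate(text, start=1)  # start=1 matches INSTR
            if ch == target
        )
        for key, text in flags.items()
    }
```

Each string is scanned exactly once, so no position can appear twice in the output.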
