PL/SQL: need help understanding a CASE expression in an ORDER BY - oracle

I have a piece of code that has an ORDER BY with a CASE in it:
ORDER BY
(
CASE
WHEN r.id BETWEEN 900 AND 999 THEN '1AAAAA'
ELSE '2'
|| upper(id.name)
END) ASC,
r.date DESC ;
Could someone explain:
what is the meaning of the '1AAAAA' and '2' ?
what is the meaning of
|| upper(id.name)

In PL/SQL, || is the concatenation operator.
Exactly how the ordering is happening depends on the rest of the query, but it looks like it's putting records with r.id BETWEEN 900 AND 999 before other records, which are sorted by id.name.

The case expression evaluates to a value
CASE
WHEN r.id BETWEEN 900 AND 999 THEN '1AAAAA'
ELSE '2'
|| upper(id.name)
END
The whole block above will evaluate to either '1AAAAA' or '2' followed by the upper-cased id.name (for example '2SMITH'), depending on the value of r.id.
As this is in the ORDER BY clause, this value will be used to sort the results as follows:
first list all records where r.id is between 900 and 999
then list all other records in ascending order of upper(id.name) (the || is the string concatenation operator).

Here is some data. As you can see the name sorts in ASCII order, which is not exactly the same as alphabetical order:
SQL> select id, name, somedate
from t42
order by name, somedate
/
ID NAME SOMEDATE
---------- ---------- ---------
8 Billington 24-MAR-11
13 Cave 19-MAR-11
4 Clarke 28-MAR-11
919 Feuerstein 13-MAR-11
16 Gasparotto 16-MAR-11
1014 KULASH 18-MAR-11
1 Kestelyn 31-MAR-11
917 Kishore 15-MAR-11
2 Lira 30-MAR-11
6 PADFIELD 26-MAR-11
11 Rigby 21-MAR-11
1007 Robertson 25-MAR-11
12 SCHNEIDER 20-MAR-11
9 SPENCER 23-MAR-11
3 TRICHLER 29-MAR-11
918 VERREYNNE 14-MAR-11
10 boehmer 22-MAR-11
15 hall 17-MAR-11
920 poder 12-MAR-11
5 van wijk 27-MAR-11
1021 11-MAR-11
21 rows selected.
SQL>
Sorting by upper(name) makes it case-insensitive:
SQL> select id, name, somedate
from t42
order by upper(name), somedate
/
ID NAME SOMEDATE
---------- ---------- ---------
8 Billington 24-MAR-11
10 boehmer 22-MAR-11
13 Cave 19-MAR-11
4 Clarke 28-MAR-11
919 Feuerstein 13-MAR-11
16 Gasparotto 16-MAR-11
15 hall 17-MAR-11
1 Kestelyn 31-MAR-11
917 Kishore 15-MAR-11
1014 KULASH 18-MAR-11
2 Lira 30-MAR-11
6 PADFIELD 26-MAR-11
920 poder 12-MAR-11
11 Rigby 21-MAR-11
1007 Robertson 25-MAR-11
12 SCHNEIDER 20-MAR-11
9 SPENCER 23-MAR-11
3 TRICHLER 29-MAR-11
5 van wijk 27-MAR-11
918 VERREYNNE 14-MAR-11
1021 11-MAR-11
21 rows selected.
SQL>
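To see the difference between the two orderings outside the database, here is a minimal Python sketch. It assumes the database is using a binary (code point) sort, which is what the first listing suggests; the list of names is just a hypothetical subset of the NAME column above.

```python
# Hypothetical subset of the NAME values from the listings above.
names = ["Kestelyn", "KULASH", "boehmer", "Billington", "van wijk", "VERREYNNE"]

# Code-point (binary) order: every uppercase letter sorts before any
# lowercase letter, so "KULASH" lands ahead of "Kestelyn".
binary_order = sorted(names)

# Case-insensitive order: compare upper-cased copies of the values,
# which is the same idea as ORDER BY upper(name).
case_insensitive_order = sorted(names, key=str.upper)

print(binary_order)
print(case_insensitive_order)
```

The original strings are returned unchanged in both cases; only the comparison key differs, just as upper(name) in the ORDER BY does not change what is displayed.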
The CASE expression changes this further by grouping all the records within the specified ID range first, then all the other records. The records in the selected range are just sorted by the date, whereas the other records are still sorted by name, then date:
SQL> select id, name, somedate
from t42
ORDER BY
(
CASE
WHEN id BETWEEN 900 AND 999 THEN '1AAAAA'
ELSE '2'
|| upper(name)
END) ASC,
somedate DESC
/
ID NAME SOMEDATE
---------- ---------- ---------
917 Kishore 15-MAR-11
918 VERREYNNE 14-MAR-11
919 Feuerstein 13-MAR-11
920 poder 12-MAR-11
1021 11-MAR-11
8 Billington 24-MAR-11
10 boehmer 22-MAR-11
13 Cave 19-MAR-11
4 Clarke 28-MAR-11
16 Gasparotto 16-MAR-11
15 hall 17-MAR-11
1 Kestelyn 31-MAR-11
1014 KULASH 18-MAR-11
2 Lira 30-MAR-11
6 PADFIELD 26-MAR-11
11 Rigby 21-MAR-11
1007 Robertson 25-MAR-11
12 SCHNEIDER 20-MAR-11
9 SPENCER 23-MAR-11
3 TRICHLER 29-MAR-11
5 van wijk 27-MAR-11
21 rows selected.
SQL>

1. what is the meaning of the '1AAAAA' and '2' ?
Those are string literals (constants).
2. what is the meaning of || upper(id.name)
|| is the SQL standard concatenation operator. 'A' || 'B' produces 'AB'.
IMHO, your question is what the entire order by case means, so, go step by step:
ORDER BY
(
CASE
WHEN r.id BETWEEN 900 AND 999 THEN '1AAAAA'
ELSE '2'
|| upper(id.name)
END) ASC,
r.date DESC ;
This will order your result set by the result of the CASE expression (ascending), then by r.date (descending).
The CASE will just return '1AAAAA' for any ID between 900 and 999 (these rows all tie on the first sort key, so they are then ordered by r.date, remember?).
For any other value, it will prepend '2' to the upper-cased id.name.
This ensures that any record with an id between 900 and 999 appears in the first "group", which is ordered just by date, descending. Then a second group contains all the other records, ordered by the upper-cased name, then by the date.
You may want to see this data to understand how this works... just add the case expression to your select statement as a new column.
For example if your query starts like this:
SELECT r.id, id.name
FROM
add the case like this:
SELECT r.id, id.name
,
CASE
WHEN r.id BETWEEN 900 AND 999 THEN '1AAAAA'
ELSE '2'|| upper(id.name)
END ORDER_CRITERIA
FROM
This will help you understand what's going on with that expression, as you will see the produced data as the last column of your query.
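If you don't have the original tables at hand, the same idea can be tried in a throwaway database. The sketch below is only an illustration: it uses Python's built-in sqlite3 module in place of Oracle, and the table t42 with its five rows is made-up sample data, not the asker's schema.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (an assumption made
# purely so the example is self-contained).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t42 (id INTEGER, name TEXT, somedate TEXT)")
conn.executemany(
    "INSERT INTO t42 VALUES (?, ?, ?)",
    [
        (8, "Billington", "2011-03-24"),
        (917, "Kishore", "2011-03-15"),
        (10, "boehmer", "2011-03-22"),
        (919, "Feuerstein", "2011-03-13"),
        (13, "Cave", "2011-03-19"),
    ],
)

# Expose the sort key as an extra column and sort by it, then by date.
rows = conn.execute(
    """
    SELECT id, name,
           CASE WHEN id BETWEEN 900 AND 999 THEN '1AAAAA'
                ELSE '2' || upper(name)
           END AS order_criteria
    FROM t42
    ORDER BY order_criteria ASC, somedate DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Rows 917 and 919 share the key '1AAAAA', so only the date decides their order; every other row gets a key starting with '2', which is why it can never sort ahead of the 900-999 group.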

Related

For each company in a table Company, I want to create a random number of rows between 50 and 250 in table Employee in PL/SQL?

For each company entry in a table Company, I want to create a random number of rows between 50 and 250 in table Employee in PL/SQL.
Here's one option, based on data in Scott's sample schema.
Departments (that's your company):
SQL> SELECT * FROM dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Target table:
SQL> CREATE TABLE employee
2 (
3 deptno NUMBER,
4 empno NUMBER PRIMARY KEY,
5 ename VARCHAR2 (10),
6 salary NUMBER
7 );
Table created.
Sequence (for primary key values):
SQL> CREATE SEQUENCE seq_e;
Sequence created.
Here's the procedure: for each department, it creates L_ROWS number of rows (line #8) (I restricted it to a number between 1 and 5; your boundaries would be 50 and 250). It also creates random names (line #16) and salaries (line #17):
SQL> DECLARE
2 l_rows NUMBER;
3 BEGIN
4 DELETE FROM employee;
5
6 FOR cur_d IN (SELECT deptno FROM dept)
7 LOOP
8 l_rows := ROUND (DBMS_RANDOM.VALUE (1, 5));
9
10 INSERT INTO employee (deptno,
11 empno,
12 ename,
13 salary)
14 SELECT cur_d.deptno,
15 seq_e.NEXTVAL,
16 DBMS_RANDOM.string ('x', 7),
17 ROUND (DBMS_RANDOM.VALUE (100, 900))
18 FROM DUAL
19 CONNECT BY LEVEL <= l_rows;
20 END LOOP;
21 END;
22 /
PL/SQL procedure successfully completed.
Result:
SQL> SELECT *
2 FROM employee
3 ORDER BY deptno, empno;
DEPTNO EMPNO ENAME SALARY
---------- ---------- ---------- ----------
10 1 ZMO4RFN 830
10 2 AEXL34I 589
10 3 SI6X38Z 191
10 4 59EWI42 397
20 5 DBAMQDA 559
20 6 79X78JV 491
30 7 56ITU5V 178
30 8 09KPAIS 297
30 9 VQUVWDP 446
40 10 AHJZNVJ 182
40 11 0XWI3GC 553
40 12 7GNTCG4 629
40 13 23G871Z 480
13 rows selected.
SQL>
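One detail worth checking before you change the boundaries to 50 and 250: DBMS_RANDOM.VALUE(low, high) returns a value that is greater than or equal to low and less than high, so rounding it makes the two end values about half as likely as the ones in between. The Python sketch below only simulates that effect (random.uniform plays the part of DBMS_RANDOM.VALUE, which is an assumption for illustration) and compares it with flooring a range that is one unit wider.

```python
import math
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 100_000

# Simulates ROUND(DBMS_RANDOM.VALUE(1, 5)): round a uniform value from [1, 5).
# floor(x + 0.5) is used to round halves up instead of Python's banker's rounding.
rounded = Counter(math.floor(rng.uniform(1, 5) + 0.5) for _ in range(N))

# Simulates FLOOR(DBMS_RANDOM.VALUE(1, 6)): floor a uniform value from [1, 6).
floored = Counter(math.floor(rng.uniform(1, 6)) for _ in range(N))

print("rounded:", sorted(rounded.items()))
print("floored:", sorted(floored.items()))
```

In the rounded counts, 1 and 5 should come up roughly half as often as 2, 3 and 4, while the floored counts should be close to even. If you want every count from 50 to 250 to be equally likely, something like FLOOR(DBMS_RANDOM.VALUE(50, 251)) is probably the safer choice.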
Adapting my answer to this question:
INSERT INTO employees (id, first_name, last_name, department_id)
SELECT employees__id__seq.NEXTVAL,
CASE FLOOR(DBMS_RANDOM.VALUE(1,6))
WHEN 1 THEN 'Faith'
WHEN 2 THEN 'Tom'
WHEN 3 THEN 'Anna'
WHEN 4 THEN 'Lisa'
WHEN 5 THEN 'Andy'
END,
CASE FLOOR(DBMS_RANDOM.VALUE(1,6))
WHEN 1 THEN 'Andrews'
WHEN 2 THEN 'Thorton'
WHEN 3 THEN 'Smith'
WHEN 4 THEN 'Jones'
WHEN 5 THEN 'Beirs'
END,
d.id
FROM ( SELECT id,
FLOOR(DBMS_RANDOM.VALUE(50,251)) AS num_employees
FROM departments
ORDER BY ROWNUM -- Materialize the sub-query so the random values are individually
-- generated.
) d
CROSS JOIN LATERAL (
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= d.num_employees
);

how to do cumulative sums in oracle

I am new to SQL and I wanted to do a report which shows the daily number of tickets per shift and also the to-date total.
Here's the query I have which shows the first 5 columns below:
SELECT
TO_CHAR(DTTM,'YYYY-MM-DD') as "DATE"
,COUNT(CASE WHEN TO_CHAR(DTTM, 'HH24:MI') BETWEEN '14:00' AND '22:00' THEN TKTNUM ELSE NULL END) AS "DAYS"
,COUNT(CASE WHEN TO_CHAR(DTTM, 'HH24:MI') BETWEEN '06:00' AND '14:00' THEN TKTNUM ELSE NULL END) AS "MIDS"
,COUNT(CASE WHEN TO_CHAR(DTTM, 'HH24:MI') NOT BETWEEN '06:00' AND '22:00' THEN TKTNUM ELSE NULL END) AS "SWINGS"
,COUNT(TKTNUM) AS "TOTAL"
FROM TKTHISTORY
GROUP BY TO_CHAR(DTTM,'YYYY-MM-DD')
ORDER BY TO_CHAR(DTTM,'YYYY-MM-DD')
DATE DAYS MIDS SWINGS TOTAL
2019-08-01 8 13 1 22 22
2019-08-02 19 5 3 27 49
2019-08-03 23 6 6 35 84
2019-08-04 7 9 13 29 113
2019-08-05 4 17 2 23 136
2019-08-06 10 5 16 31 167
2019-08-07 3 12 11 26 193
The 6th column should be the cumulative sum for the dates. I tried browsing the internet and read about "over" and "partition by" but I still can't figure out how to use it :(
Here's an example based on Scott's EMP table, which counts employees per department. The last column is the "running total" value.
Sample data shows that there are 3 employees in DEPTNO = 10, 5 of them in dept. 20 and 6 in dept. 30:
SQL> select deptno, empno, ename from emp order by deptno;
DEPTNO EMPNO ENAME
---------- ---------- ----------
10 7782 CLARK
10 7839 KING
10 7934 MILLER
20 7566 JONES
20 7902 FORD
20 7876 ADAMS
20 7369 SMITH
20 7788 SCOTT
30 7521 WARD
30 7844 TURNER
30 7499 ALLEN
30 7900 JAMES
30 7698 BLAKE
30 7654 MARTIN
14 rows selected.
Query then looks like this:
SQL> select
deptno,
count(empno) emps_per_dept,
sum(count(*)) over (order by deptno) total
from emp
group by deptno;
DEPTNO EMPS_PER_DEPT TOTAL
---------- ------------- ----------
10 3 3
20 5 8
30 6 14
SQL>
Which, in your case, might be like this:
SELECT
...
,sum(COUNT(TKTNUM)) over (order by TO_CHAR(DTTM,'YYYY-MM-DD')) AS "TOTAL"
FROM TKTHISTORY
...
SELECT t.user_id,
t.transactions_,
SUM(t.transactions_) over(ORDER BY t.user_id) cum_sum
FROM FEBRUARY_2023_USER_ACTIVITIES t
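For anyone who wants to experiment with the running total without an Oracle instance, here is a small sketch using Python's sqlite3 module (assuming the bundled SQLite is 3.25 or newer, which is when window functions arrived). The tickets are invented sample rows, and the daily counts are computed in a CTE first and then summed with the window function, which should give the same result as nesting COUNT inside SUM ... OVER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tkthistory (tktnum TEXT, dttm TEXT)")
# Invented sample tickets: 2 on the 1st, 1 on the 2nd, 3 on the 3rd.
conn.executemany(
    "INSERT INTO tkthistory VALUES (?, ?)",
    [
        ("INC0001", "2019-08-01 07:15"),
        ("INC0002", "2019-08-01 15:30"),
        ("INC0003", "2019-08-02 09:00"),
        ("INC0004", "2019-08-03 23:10"),
        ("INC0005", "2019-08-03 02:45"),
        ("INC0006", "2019-08-03 12:00"),
    ],
)

rows = conn.execute(
    """
    WITH daily AS (
        -- Step 1: one row per day with that day's ticket count.
        SELECT substr(dttm, 1, 10) AS day, COUNT(tktnum) AS total
        FROM tkthistory
        GROUP BY substr(dttm, 1, 10)
    )
    -- Step 2: running total of the daily counts, in date order.
    SELECT day, total, SUM(total) OVER (ORDER BY day) AS running_total
    FROM daily
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY inside OVER is what turns the sum into a running total: each row sees the sum of its own count and all counts for earlier days.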

how to do a query based on a shifting schedule (day/mids/swings)

I need to pull the total number of tickets created per shift together with the running total. I have an existing query which I thought was correct but after checking, it seems that it is pulling the wrong numbers.
SELECT
TO_CHAR(TRUNC(DTTM,'Y'),'YYYY') as "DATE"
,COUNT(CASE WHEN TO_CHAR(DTTM, 'HH24:MI') BETWEEN '14:00' AND '22:00' THEN TKTNUM ELSE NULL END) AS "DAYS"
,COUNT(CASE WHEN TO_CHAR(DTTM, 'HH24:MI') BETWEEN '06:00' AND '14:00' THEN TKTNUM ELSE NULL END) AS "MIDS"
,COUNT(CASE WHEN TO_CHAR(DTTM, 'HH24:MI') NOT BETWEEN '06:00' AND '22:00' THEN TKTNUM ELSE NULL END) AS "SWINGS"
,COUNT(TKTNUM) "TOTAL"
,SUM(COUNT(TKTNUM)) OVER (ORDER BY (TRUNC(E.ESCDTTM,'Y'),'YYYY')) -- c/o Littlefoot and Stew Ashton
FROM TKTCHISTORY
GROUP BY TRUNC(E.ESCDTTM,'Y')
ORDER BY TRUNC(E.ESCDTTM,'Y')
SAMPLE DATA:
TKTNUM TKT_CREATED
INC0001 01/10/2019 1:00
INC0002 01/10/2019 23:00
INC0003 03/10/2019 5:00
INC0004 03/10/2019 9:20
INC0005 05/11/2019 15:00
DESIRED OUTPUT:
DATE DAYS MIDS SWINGS TOTAL
2019-08-01 8 13 1 22 22
2019-08-02 19 5 3 27 49
2019-08-03 23 6 6 35 84
2019-08-04 7 9 13 29 113
2019-08-05 4 17 2 23 136
2019-08-06 10 5 16 31 167
2019-08-07 3 12 11 26 193
"SWINGS" would pull tickets between 00:00 and 06:00 or 22:00 and 24:00 on the same date. For example, a ticket was generated on 02-Nov 01:00... when I pull the report it would be counted on 02-Nov for SWINGS when it should be for the 01-Nov duty.
I've come up with something that would probably help with the logic but am not 100% sure.
WITH Shift_Sched (shiftdate,shiftsched) as
(
SELECT
--sysdate
CASE
WHEN TO_CHAR(TRUNC(sysdate,'MI'),'HH24:MI') BETWEEN '06:00' AND '23:59' THEN TRUNC(sysdate,'DD')
WHEN TO_CHAR(TRUNC(sysdate,'MI'),'HH24:MI') BETWEEN '00:00' AND '05:59' THEN TRUNC(sysdate -1,'DD')
END as "SHIFT DATE",
CASE
WHEN TO_CHAR(TRUNC(sysdate,'MI'),'HH24:MI') BETWEEN '06:00' AND '14:00' THEN 'MIDS'
WHEN TO_CHAR(TRUNC(sysdate,'MI'),'HH24:MI') BETWEEN '14:00' AND '22:00' THEN 'DAYS'
ELSE 'SWINGS'
END as "SHIFT SCHED"
FROM DUAL
)
SELECT shiftdate,shiftsched,COUNT(shiftsched)
FROM shift_sched
GROUP by shiftdate,shiftsched
Any help would be greatly appreciated
If I followed you correctly, you want days that run from 6 AM to 6 AM the next day. If so, you can subtract 6 hours from tkt_created, truncate it to the day, and group by that. The rest is just conditional aggregation.
select
trunc(tkt_created - 6/24) "DATE",
sum(case when extract(hour from tkt_created) between 6 and 13 then 1 end) days,
sum(case when extract(hour from tkt_created) between 14 and 21 then 1 end) mids,
sum(case
when extract(hour from tkt_created) between 22 and 23
or extract(hour from tkt_created) between 0 and 5
then 1 end
) swings,
count(*) total
from tktchistory
group by trunc(tkt_created - 6/24)
order by "DATE"
Note: I used standard SQL extract() instead of to_char(), which is Oracle specific; apart from being standard, another advantage is that it returns an integer rather than a string.
This can also be phrased as:
select
trunc(tkt_created - 6/24) "DATE",
sum(case when extract(hour from tkt_created) between 6 and 13 then 1 end) days,
sum(case when extract(hour from tkt_created) between 14 and 21 then 1 end) mids,
sum(case when extract(hour from (tkt_created - 6/24)) between 16 and 23 then 1 end) swings,
count(*) total
from tktchistory
group by trunc(tkt_created - 6/24)
order by "DATE"
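The "shift the timestamp back by 6 hours, then truncate" trick is easy to check by hand. Below is a rough Python sketch of the same logic; the function name shift_of and the four tickets are hypothetical, and the shift names follow the CASE expressions in the question (MIDS 06:00-14:00, DAYS 14:00-22:00, SWINGS otherwise), so swap the labels if your schedule is the other way round.

```python
from collections import Counter
from datetime import datetime, timedelta


def shift_of(ts):
    """Return (shift_date, shift_name) for a ticket timestamp."""
    # A duty day runs from 06:00 to 06:00, so moving the timestamp back
    # six hours puts early-morning tickets on the previous calendar date.
    shift_date = (ts - timedelta(hours=6)).date()
    if 6 <= ts.hour < 14:
        name = "MIDS"
    elif 14 <= ts.hour < 22:
        name = "DAYS"
    else:
        name = "SWINGS"  # 22:00-23:59 and 00:00-05:59
    return shift_date, name


# Invented tickets, including the 02-Nov 01:00 case from the question.
tickets = [
    datetime(2019, 11, 1, 7, 30),
    datetime(2019, 11, 1, 23, 0),
    datetime(2019, 11, 2, 1, 0),
    datetime(2019, 11, 2, 15, 0),
]

counts = Counter(shift_of(ts) for ts in tickets)
for (day, name), n in sorted(counts.items()):
    print(day, name, n)
```

The ticket created on 02-Nov at 01:00 is counted under SWINGS for 01-Nov, together with the 23:00 ticket from the evening before, which is the behaviour the question asks for.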

Hive: Sum over a specified group (HiveQL)

I have a table:
key product_code cost
1 UK 20
1 US 10
1 EU 5
2 UK 3
2 EU 6
I would like to find the sum of all products for each group of "key" and append to each row. For example for key = 1, find the sum of costs of all products (20+10+5=35) and then append result to all rows which correspond to the key = 1. So end result:
key product_code cost total_costs
1 UK 20 35
1 US 10 35
1 EU 5 35
2 UK 3 9
2 EU 6 9
I would prefer to do this without using a sub-join as this would be inefficient. My best idea would be to use the over function in conjunction with the sum function, but I can't get it to work. My best try:
SELECT key, product_code, sum(costs) over(PARTITION BY key)
FROM test
GROUP BY key, product_code;
I've had a look at the docs, but they're so cryptic that I have no idea how to do it. I'm using Hive v0.12.0, HDP v2.0.6, HortonWorks Hadoop distribution.
Similar to @VB_'s answer, use the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING window frame.
The HiveQL query is therefore:
SELECT key, product_code,
SUM(costs) OVER (PARTITION BY key ORDER BY key ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM test;
You could use BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW to achieve that without a self join.
Code as below:
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM T;
The analytics function sum gives cumulative sums. For example, if you did:
select key, product_code, cost, sum(cost) over (partition by key) as total_costs from test
then you would get:
key product_code cost total_costs
1 UK 20 20
1 US 10 30
1 EU 5 35
2 UK 3 3
2 EU 6 9
which, it seems, is not what you want.
Instead, you should use the aggregation function sum, combined with a self join to accomplish this:
select test.key, test.product_code, test.cost, agg.total_cost
from (
select key, sum(cost) as total_cost
from test
group by key
) agg
join test
on agg.key = test.key;
This query gives me exactly the result I wanted:
select key, product_code, cost, sum(cost) over (partition by key) as total_costs from zone;
A similar query, using the Oracle EMP table:
select deptno, ename, sal, sum(sal) over(partition by deptno) from emp;
output will be like below:
deptno ename sal sum_window_0
10 MILLER 1300 8750
10 KING 5000 8750
10 CLARK 2450 8750
20 SCOTT 3000 10875
20 FORD 3000 10875
20 ADAMS 1100 10875
20 JONES 2975 10875
20 SMITH 800 10875
30 BLAKE 2850 9400
30 MARTIN 1250 9400
30 ALLEN 1600 9400
30 WARD 1250 9400
30 TURNER 1500 9400
30 JAMES 950 9400
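Whether the window sum is a per-group total or a running total comes down to whether the OVER clause has an ORDER BY. Here is a small sketch that shows both side by side on the table from the question. It runs on SQLite (3.25 or newer) through Python's sqlite3 module rather than on Hive, so treat it as an illustration of standard window-function behaviour, not as a Hive test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (key INTEGER, product_code TEXT, cost INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "UK", 20), (1, "US", 10), (1, "EU", 5), (2, "UK", 3), (2, "EU", 6)],
)

rows = conn.execute(
    """
    SELECT key, product_code, cost,
           -- No ORDER BY in the window: the frame is the whole partition.
           SUM(cost) OVER (PARTITION BY key) AS total_costs,
           -- ORDER BY in the window: the default frame ends at the current
           -- row, which produces a running total.
           SUM(cost) OVER (PARTITION BY key ORDER BY cost DESC) AS running_costs
    FROM test
    ORDER BY key, cost DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The total_costs column repeats 35 and 9 on every row of its group, which is what the question asks for, and no self join or GROUP BY is involved.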
The table above looked like
key product_code cost
1 UK 20
1 US 10
1 EU 5
2 UK 3
2 EU 6
The user wanted a table with the total costs, like the following:
key product_code cost total_costs
1 UK 20 35
1 US 10 35
1 EU 5 35
2 UK 3 9
2 EU 6 9
Therefore we used the following query:
SELECT key, product_code,
SUM(costs) OVER (PARTITION BY key ORDER BY key ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM test;
So far so good.
I want one more column, counting the occurrences of each country:
key product_code cost total_costs occurences
1 UK 20 35 2
1 US 10 35 1
1 EU 5 35 2
2 UK 3 9 2
2 EU 6 9 2
Therefore I used the following query:
SELECT key, product_code,
SUM(costs) OVER (PARTITION BY key ORDER BY key ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as total_costs
COUNT(product code) OVER (PARTITION BY key ORDER BY key ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as occurences
FROM test;
Sadly this is not working; I get a cryptic error. To rule out a mistake in my query, I want to ask whether I did something wrong.
Thanks

Retrieve data in group based on the Value of other column

I have one table with the following columns.
BoxNumber Status
580 4
581 4
582 4
583 4
584 2
585 2
586 4
587 4
588 4
589 4
590 2
591 2
I need one SELECT query to get the following output.
StartingBoxNumber EndingBoxNumber Status
580 583 4
584 585 2
586 589 4
590 591 2
You can get the result with a single scan of the table, using analytics to define "groups" of contiguous rows:
SQL> SELECT MIN(boxnumber), MAX(boxnumber), status
  FROM (SELECT boxnumber, status,
               SUM(status_change) OVER (ORDER BY boxnumber) group_id
          FROM (SELECT boxnumber, status,
                       CASE
                          WHEN LAG(status) OVER (ORDER BY boxnumber) = status
                           AND LAG(boxnumber) OVER (ORDER BY boxnumber) = boxnumber - 1
                          THEN 0
                          ELSE 1
                       END status_change
                  FROM box))
 GROUP BY status, group_id
 ORDER BY 1;
MIN(BOXNUMBER) MAX(BOXNUMBER) STATUS
-------------- -------------- ----------
580 583 4
584 585 2
586 589 4
590 591 2
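The logic of the query above (start a new group whenever the status changes or the box number is not the previous one plus 1) can be sketched procedurally as well. The Python below is only meant to make the grouping rule explicit; the function name contiguous_ranges is made up for the illustration.

```python
def contiguous_ranges(rows):
    """Collapse (boxnumber, status) rows into (start, end, status) runs."""
    ranges = []
    for box, status in sorted(rows):
        # Extend the current run only if the status is unchanged AND the
        # box number directly follows the run's last box.
        if ranges and ranges[-1][2] == status and ranges[-1][1] == box - 1:
            ranges[-1][1] = box
        else:
            # Otherwise a new run starts here (the "status_change = 1" case).
            ranges.append([box, box, status])
    return [tuple(r) for r in ranges]


# The table from the question.
rows = [
    (580, 4), (581, 4), (582, 4), (583, 4),
    (584, 2), (585, 2),
    (586, 4), (587, 4), (588, 4), (589, 4),
    (590, 2), (591, 2),
]

result = contiguous_ranges(rows)
for start, end, status in result:
    print(start, end, status)
```

In the SQL version, the running SUM over status_change plays the role of the index into the ranges list: it stays constant inside a run and goes up by one each time a new run begins.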
Assuming box numbers are always consecutive:
SELECT COALESCE(
LAG(BoxNumber) OVER (ORDER BY BoxNumber) + 1,
(
SELECT MIN(BoxNumber)
FROM mytable
)) AS StartBoxNumber,
BoxNumber AS EndBoxNumber,
status
FROM mytable qo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable qi
WHERE qi.boxnumber = qo.boxnumber + 1
AND qi.status = qo.status
)

Resources