Find max number of open tasks - oracle

What I am attempting to do is find out the MAX number of tasks I may receive on a day during the next 6 months.
for example
task 1 runs 1-jan-16 and ends 10-jan-16
Task 2 runs 3-Jan-16 and ends 15-jan-16
task 3 runs 6-Jan-16 and ends 10-Jan-16
Task 4 runs 9-Jan-16 and ends 20-Jan-16
In this example, 4 tasks are open at once between 1-Jan and 10-Jan (all four overlap on 9-Jan and 10-Jan), so I want the result to be 4. The reason is that I'm displaying them in a Gantt chart, where overlapping tasks are stacked underneath each other.
All I have to work with so far is:
select schedule_start_date,am.AC,count(wo) as from ac_master am
left outer join wo on wo.ac = am.ac and ac_type = '190'
where wo.status = 'OPEN'
group by am.ac,schedule_start_date
This will show the count per day but some of these may overlap.
Is there any way to do what I am trying to accomplish?

If you just want the count for each scheduled group at a given point in time, then you can just use BETWEEN with the start and end dates:
SELECT schedule_start_date,
am.AC,
COUNT(*) AS theCount
FROM ac_master am
LEFT OUTER JOIN wo
ON wo.ac = am.ac AND
ac_type = '190'
WHERE wo.status = 'OPEN' AND
DATE '2016-01-10' BETWEEN schedule_start_date AND schedule_end_date
GROUP BY schedule_start_date,
am.ac

Regardless of how you develop a set of rows with start_date and end_date, here is a method to show how the task count changes over time. Each date is the first date when the task count changes from the previous value to the new one.
If you only need max(tasks), that's a simple matter of grouping by whatever is needed. (Or, in Oracle 12c and later, you can order by tasks and use the FETCH FIRST clause.) Notice also the partition by clause: if you need separate results for different categories (for example, for different "departments"), you can partition by that category so that the computations are done separately for each group, all in one pass over the data.
with
intervals ( start_date, end_date ) as (
select date '2016-01-01', date '2016-01-10' from dual union all
select date '2016-01-03', date '2016-01-15' from dual union all
select date '2016-01-06', date '2016-01-10' from dual union all
select date '2016-01-09', date '2016-01-20' from dual
),
u ( dt, flag ) as (
select start_date , 1 from intervals
union all
select end_date + 1, -1 from intervals
)
select distinct dt, sum(flag) over (partition by null order by dt) as tasks
from u
order by dt;
DT TASKS
---------- ---------
2016-01-01 1
2016-01-03 2
2016-01-06 3
2016-01-09 4
2016-01-11 2
2016-01-16 1
2016-01-21 0
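If only the peak is needed, the running total above can be wrapped in one more step. The following is a minimal sketch on the same sample intervals; the CTE name task_counts and the alias max_tasks are introduced here for illustration and are not part of the original answer.

```sql
with
  intervals ( start_date, end_date ) as (
    select date '2016-01-01', date '2016-01-10' from dual union all
    select date '2016-01-03', date '2016-01-15' from dual union all
    select date '2016-01-06', date '2016-01-10' from dual union all
    select date '2016-01-09', date '2016-01-20' from dual
  ),
  -- one +1 event when a task opens, one -1 event the day after it ends
  u ( dt, flag ) as (
    select start_date  ,  1 from intervals
    union all
    select end_date + 1, -1 from intervals
  ),
  -- running total of open tasks; events on the same date are summed together
  task_counts ( dt, tasks ) as (
    select dt, sum(flag) over (order by dt)
    from   u
  )
select max(tasks) as max_tasks
from   task_counts;
```

For the sample data this returns 4, the value shown for 2016-01-09 in the output above.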

Related

reset index every 5th row - Oracle SQL

How can I make index column start over after reaching 5th row? I can't do that with a window function as there are no groups, I just need an index with max number of 5 like this:
date index
01.01.21 1
02.01.21 2
03.01.21 3
04.01.21 4
05.01.21 5
06.01.21 1
07.01.21 2
and so on.
Appreciate any ideas.
You can use the solution below for that purpose.
First, number the rows of your table with the row_number analytic function inside an inline view.
Then apply row_number again, with a partition by clause that groups the previously numbered rows by TRUNC((rnb - 1)/5).
SELECT t."DATE"
, row_number()over(PARTITION BY TRUNC((rnb - 1)/5) ORDER BY rnb) as "INDEX"
FROM (
select "DATE", row_number()OVER(ORDER BY "DATE") rnb
from Your_table
) t
ORDER BY 1
;
Your comment about using analytic functions is wrong; you can use analytic functions even when there are no "groups" (or "partitions"). Here you do need an analytic function, to order the rows (even if you don't need to partition them).
Here is a very simple solution, using just row_number(). Note the with clause, which is not part of the solution; I included it just for testing. In your real-life case, remove the with clause, and use your actual table and column names. The use of mod(... , 5) is pretty much obvious; it looks a little odd (subtracting 1, taking the modulus, then adding 1) because in Oracle we seem to count from 1 in all cases, instead of the much more natural counting from 0 common in other languages (like C).
Note that both date and index are reserved keywords, which shouldn't be used as column names. I used one common way to address that - I added an underscore at the end.
alter session set nls_date_format = 'dd.mm.rr';
with
sample_inputs (date_) as (
select date '2021-01-01' from dual union all
select date '2021-01-02' from dual union all
select date '2021-01-03' from dual union all
select date '2021-01-04' from dual union all
select date '2021-01-05' from dual union all
select date '2021-01-06' from dual union all
select date '2021-01-07' from dual
)
select date_, 1 + mod(row_number() over (order by date_) - 1, 5) as index_
from sample_inputs
;
DATE_ INDEX_
-------- ----------
01.01.21 1
02.01.21 2
03.01.21 3
04.01.21 4
05.01.21 5
06.01.21 1
07.01.21 2
You can combine MOD() with ROW_NUMBER() to get the index you want. For example:
select "DATE", 1 + mod(row_number() over(order by "DATE") - 1, 5) as idx from t

Efficiently get array of all previous dates per id per date limited to past 6 months in BigQuery

I have a very big table 'DATES_EVENTS' (20 T) that looks like this:
ID DATE
1 '2022-04-01'
1 '2022-03-02'
1 '2022-03-01'
2 '2022-05-01'
3 '2021-12-01'
3 '2021-11-11'
3 '2020-11-11'
3 '2020-10-01'
For each row, I want all past dates (per user), limited to the previous 6 months.
My desired table:
ID DATE DATE_list
1 '2022-04-01' ['2022-04-01','2022-03-02','2022-03-01']
1 '2022-03-02' ['2022-03-02','2022-03-01']
1 '2022-03-01' ['2022-03-01']
2 '2022-05-01' ['2022-05-01']
3 '2021-12-01' ['2021-12-01','2021-11-11']
3 '2021-11-11' ['2021-11-11']
3 '2020-11-11' ['2020-11-11','2020-10-01']
3 '2020-10-01' ['2020-10-01']
I have a solution for all dates not limited:
SELECT
ID, DATE, ARRAY_AGG(DATE) OVER (PARTITION BY ID ORDER BY DATE) as DATE_list
FROM
DATES_EVENTS
But for the version limited to the past 6 months, I don't have an efficient solution:
SELECT
distinct A.ID, A.DATE, ARRAY_AGG(B.DATE) OVER (PARTITION BY B.ID ORDER BY B.DATE) as DATE_list
FROM
DATES_EVENTS A
INNER JOIN
DATES_EVENTS B
ON
A.ID=B.ID
AND B.DATE BETWEEN DATE_SUB(A.DATE, INTERVAL 180 DAY) AND A.DATE
** roughly a solution
Anyone know of a good and efficient way to do what I need?
Consider below approach
select id, date, array(
select day
from t.date_list day
where day <= date
order by day desc
) as date_list
from (
select *, array_agg(date) over win as date_list
from dates_events
window win as (
partition by id
order by extract(year from date) * 12 + extract(month from date)
range between 5 preceding and current row
)
) t
If (as I noticed in your question) 180 days is an acceptable substitute for 6 months, you can use the simpler version below:
select *, array_agg(date) over win as date_list
from dates_events
window win as (
partition by id
order by unix_date(date)
range between 179 preceding and current row
)

SQL: Is there a way to exclude duplicate results if a condition is met

I am trying to build a query that will search for a certain user and the most recent job they were associated with. I want to only pull the most recent date but some users have two jobs associated with the same date. Is there a way to only pull one of those while not excluding that result if it was a user's only job?
For example
User Job Date
1 Chef 7/13/21
1 Server 7/13/21
2 Server 7/3/21
3 Chef 7/1/21
Desired result:
User Job Date
1 Chef 7/13/21
2 Server 7/3/21
3 Chef 7/1/21
Thanks!
Is it possible? Yes, but data you posted as example doesn't reflect what you're saying because for user = 1 both jobs have same date.
Anyway, here's one option how to do that: use one of analytic functions (I used row_number; rank might also be an option) to "rank" rows partitioned by user & sorted by date, and then fetch the one with the lowest rank.
Sample data till line #6; query begins at line #7.
SQL> with test (cuser, job, datum) as
2 (select 1, 'chef' , date '2021-07-13' from dual union all
3 select 1, 'server', date '2021-07-13' from dual union all
4 select 2, 'server', date '2021-07-03' from dual union all
5 select 3, 'chef' , date '2021-07-01' from dual
6 ),
7 temp as
8 (select cuser, job, datum,
9 row_number() over (partition by cuser order by datum desc) rn
10 from test
11 )
12 select cuser, job, datum
13 from temp
14 where rn = 1;
CUSER JOB DATUM
--------- ------ ----------
1 chef 07/13/2021
2 server 07/03/2021
3 chef 07/01/2021
SQL>
Make use of keep dense_rank
https://oracle-base.com/articles/misc/rank-dense-rank-first-last-analytic-functions
select "USER",
MAX("DATE") AS "DATE",
MIN(job) KEEP (DENSE_RANK LAST ORDER BY "DATE") AS job
from your_table
group by "USER"
A rough example
http://sqlfiddle.com/#!4/14ee4d/2
Note that when a user has two jobs on the same date, MIN(job) simply keeps the alphabetically first one; if you want a different choice, you need to specify how to sort them. If your date column includes a time component, such ties are less likely.

Max number of counts in a particular hour

I have a table called Orders. I want to get the hour with the maximum number of orders for each day, starting from the following query:
SELECT
trunc(created,'HH') as dated,
count(*) as Counts
FROM
orders
WHERE
created > trunc(SYSDATE -2)
group by trunc(created,'HH') ORDER BY counts DESC
This returns the count for every hour. The result looks good, but now I want only the row with the maximum count for each day,
e.g.
for 12/23/2019 max number of counts is 90 for "12/23/2019 4:00:00 PM",
for 12/22/2019 max number of counts is 25 for "12/22/2019 3:00:00 PM"
required dataset
1 12/23/2019 4:00:00 PM 90
2 12/24/2019 12:00:00 PM 76
3 12/22/2019 1:00:00 PM 25
This could be the solution and in my opinion is the most trivial.
Use the WITH clause to make a sub query then search for the greatest value in the data set on a specific date.
WITH ORD AS (
SELECT
trunc(created,'HH') as dated,
count(*) as Counts
FROM
orders
WHERE
created > trunc(SYSDATE-2)
group by trunc(created,'HH')
)
SELECT *
FROM ORD ord
WHERE NOT EXISTS (
SELECT 'X'
FROM ORD ord1
WHERE trunc(ord1.dated) = trunc(ord.dated) AND ord1.Counts > ord.Counts
)
Use ROW_NUMBER analytic function over your original query and filter the rows with number 1.
You need to partition on the day, i.e. TRUNC(dated) to get the correct result
with ord1 as (
SELECT
trunc(created,'HH') as dated,
count(*) as Counts
FROM
orders
WHERE
created > trunc(SYSDATE -2)
group by trunc(created,'HH')
),
ord2 as (
select dated, Counts,
row_number() over (partition by trunc(dated) order by Counts desc) as rn
from ord1)
select dated, Counts
from ord2
where rn = 1
The advantage of using ROW_NUMBER is that it handles ties predictably, i.e. cases where more than one hour in a day has the same maximal count. The query returns only one row per day, and you can control which one with the order by, e.g. to show the first or the last such hour.
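As a minimal sketch of that tie-breaking, the query below adds dated as a second sort key, so the earliest hour wins when two hours of the same day have equal counts. The ord1 rows here are made-up sample values standing in for the aggregated query above; they are not data from the question.

```sql
with ord1 (dated, counts) as (
  -- hypothetical hourly totals; 23-Dec has a tie between 15:00 and 16:00
  select to_date('2019-12-23 15', 'YYYY-MM-DD HH24'), 90 from dual union all
  select to_date('2019-12-23 16', 'YYYY-MM-DD HH24'), 90 from dual union all
  select to_date('2019-12-22 13', 'YYYY-MM-DD HH24'), 25 from dual
),
ord2 as (
  select dated, counts,
         -- highest count first; on equal counts the earlier hour ranks first
         row_number() over (partition by trunc(dated)
                            order by counts desc, dated) as rn
  from ord1
)
select dated, counts
from ord2
where rn = 1
order by dated;
```

With these sample rows, the 15:00 row is kept for 23-Dec. Sorting by dated desc as the second key would keep the 16:00 row instead.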
You can use the analytical function ROW_NUMBER as following to get the desired result:
SELECT DATED, COUNTS
FROM (
SELECT
TRUNC(CREATED, 'HH') AS DATED,
COUNT(*) AS COUNTS,
ROW_NUMBER() OVER(
PARTITION BY TRUNC(CREATED)
ORDER BY COUNT(*) DESC NULLS LAST
) AS RN
FROM ORDERS
WHERE CREATED > TRUNC(SYSDATE - 2)
GROUP BY TRUNC(CREATED, 'HH'), TRUNC(CREATED)
)
WHERE RN = 1
Cheers!!

Minimum Travel Time

Display schedule_id, source, destination and travel_time which has minimum travel time. Sort the result based on schedule id.
I have tried this code, but something is missing in my query, as I'm getting an error.
select sh.schedule_id,sh.source,sh.destination,sh.duration as travel_time
from schedule sh
(select min(sh.duration) from schedule)
order by sh.schedule_id;
Almost correct. You forgot to define the minimum travel time in the where-clause.
SELECT sh.schedule_id,
sh.source,
sh.destination,
sh.duration as travel_time
FROM schedule sh
WHERE sh.duration = (select min(duration) from schedule) -- This is where the problem was.
ORDER BY sh.schedule_id;
The only column which looks like travel time is DURATION; its datatype is NUMBER. What does that number represent? Minutes? Hours? Something else?
Anyway, here's one option you might consider. It "sorts" durations (i.e. "travel time") using RANK analytic function, and fetches a row (or rows) whose duration is minimal.
Advantage of such an approach is that you have to scan the table only once; if you select minimum duration in a subquery, and then use its result to fetch data you're interested in, you're accessing the same table twice which might matter when there are many rows involved. For a small sample data set, you won't notice any difference.
The SCHEDULE CTE represents some test data; you need code that starts at line 6.
SQL> with schedule (schedule_id, source, destination, duration) as
2 (select 1, 'Paris', 'London' , 8 from dual union all
3 select 2, 'Berlin', 'Prague' , 4 from dual union all
4 select 3, 'Zagreb', 'Budapest', 4 from dual
5 )
6 select schedule_id, source, destination, duration
7 from (select schedule_id, source, destination, duration,
8 rank() over (order by duration) rn
9 from schedule
10 )
11 where rn = 1;
SCHEDULE_ID SOURCE DESTINAT DURATION
----------- ------ -------- ----------
2 Berlin Prague 4
3 Zagreb Budapest 4
SQL>
I guess there was no need for alias names as you were accessing the data from the same table.
Code:
select
schedule_id,source,destination,duration from schedule
where duration = (select min(duration) from schedule)
order by schedule_id;

Resources