Power Pivot - Looking up details with start and end dates? - dax

I need help checking the correctness of my formula in this Power Pivot table in Excel.
I have two tables: a daily timesheet and employee details. The idea is to look up the department, sub-department, and manager from the employee_details table based on the employee number and shift date in the daily_timesheet table.
Here's a representation of the tables:
daily_timesheet

shift_date  emp_num  scheduled_hours  actual_worked_hrs  dept   sub_dept  mgr
2022-02-28  01234    7.5              7.34               16100  16181     05432
2022-03-15  01234    7.5              7.50               16200  16231     06543
employee_details

emp_num  dept_code  sub_dept_code  mgr_emp_num  start_date  end_date    is_current
01234    16000      16041          04321        2022-01-01  2022-01-31  FALSE
01234    16100      16181          05432        2022-02-01  2022-02-28  FALSE
01234    16200      16231          06543        2022-03-01  null        TRUE
The end_date is null if the row is the employee's current assignment, but it is never null when the is_current field is FALSE.
Here's what I've managed so far:
First, check whether the start date of the employee's current assignment in employee_details is less than or equal to the shift date.
If true, I'm using LOOKUPVALUE to return the department, sub-department, and manager by searching on the employee number and a TRUE value in the is_current field.
Else, I use the MIN function to get the value of those fields, wrap it in CALCULATE, and apply FILTER for: (1) emp_num matching the timesheet, (2) is_current equal to FALSE, (3) start_date less than or equal to the shift_date, and (4) end_date greater than or equal to the shift_date.
And the bedrock of my question is actually that third step (the Else branch): I know using the MIN function is incorrect, but I can't find any solution that works.
Here's the formula I've been using to get the dept in the daily_timesheet table from the employee_details table:
=IF(
    LOOKUPVALUE(employee_details[start_date],
        employee_details[emp_num], daily_timesheet[emp_num],
        employee_details[is_current], TRUE) <= daily_timesheet[shift_date],
    LOOKUPVALUE(employee_details[dept_code],
        employee_details[emp_num], daily_timesheet[emp_num],
        employee_details[is_current], TRUE),
    CALCULATE(MIN(employee_details[dept_code]),
        FILTER(employee_details, employee_details[emp_num] = daily_timesheet[emp_num]),
        FILTER(employee_details, employee_details[is_current] = FALSE),
        FILTER(employee_details, employee_details[start_date] <= daily_timesheet[shift_date]),
        FILTER(employee_details, employee_details[end_date] >= daily_timesheet[shift_date]))
)
Any advice please?
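For what it's worth, here is a hedged sketch of a single-branch alternative (my own illustration, not a confirmed fix): treat a blank end_date as open-ended, so one CALCULATE covers both the current and the historical assignments, assuming assignment periods never overlap:

=CALCULATE(
    MIN(employee_details[dept_code]),  -- collapses to the single matching row's value (assumes no overlapping periods)
    FILTER(
        employee_details,
        employee_details[emp_num] = daily_timesheet[emp_num]
            && employee_details[start_date] <= daily_timesheet[shift_date]
            && (ISBLANK(employee_details[end_date])
                || employee_details[end_date] >= daily_timesheet[shift_date])
    )
)

The same pattern with sub_dept_code and mgr_emp_num would cover the other two columns.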

Related

Oracle: Update values in table with aggregated values from same table

I am looking for a possibly better approach to this.
I have created a temp table in Oracle 11.2 that I'm using to pre-calculate values I will need in other selects, instead of regenerating them with each select.
create global temporary table temp_foo (
DT timestamp(6), --only the date part will be used in this example but for later things I will need the time
Something varchar2(100),
Customer varchar2(100),
MinDate timestamp(6),
MaxDate timestamp(6),
Filecount int,
Errorcount int,
AvgFilecount int,
constraint PK_foo primary key (DT, Customer)
) on commit preserve rows;
I then insert some fixed values for everything except AvgFilecount. AvgFilecount should contain the average of Filecount over the 3 previous records (going by the date in DT). It doesn't matter that the result will be converted to an int; I don't need the decimal places.
DT         | Customer | Filecount | AvgFilecount
2019-04-30 | x        | 10        | avg(2+3+9)
2019-04-29 | x        | 2         | based on values before this
2019-04-28 | x        | 3         | based on values before this
2019-04-27 | x        | 9         | based on values before this
I thought about using a normal UPDATE statement, as this should be faster than looping through the values. I should mention that there are no gaps in the DT field, but obviously there is a first record for which I won't find any previous records. If I were looping through, I could easily calculate AvgFilecount with (the record before the previous record/2 + the previous record)/3, which I cannot do with UPDATE, as I cannot guarantee the order in which the rows are processed. So I'm fine with just taking the last 3 records (going by DT) and calculating it from there.
What I thought would be an easy update is giving me headaches. I mostly work with SQL Server, where I would just join the 3 other records, but it seems to be a bit different in Oracle. I found https://stackoverflow.com/a/2446834/4040068 and wanted to use the second approach in that answer.
update
  (select curr.DT, curr.Customer, curr.Filecount, curr.AvgFilecount as OLD,
          (coalesce(Minus1.Filecount, 0) + coalesce(Minus2.Filecount, 0) + coalesce(Minus3.Filecount, 0)) / 3 as NEW
     from temp_foo curr
     left join temp_foo Minus1 on Minus1.Customer = curr.Customer and trunc(Minus1.DT) = trunc(curr.DT-1)
     left join temp_foo Minus2 on Minus2.Customer = curr.Customer and trunc(Minus2.DT) = trunc(curr.DT-2)
     left join temp_foo Minus3 on Minus3.Customer = curr.Customer and trunc(Minus3.DT) = trunc(curr.DT-3)
    order by 1, 2)
set OLD = NEW;
Which gives me an
ORA-01779: cannot modify a column which maps to a non key-preserved
table
01779. 00000 - "cannot modify a column which maps to a non key-preserved table"
*Cause: An attempt was made to insert or update columns of a join view which
map to a non-key-preserved table.
*Action: Modify the underlying base tables directly.
I thought this should work as both join conditions are in the primary key and thus unique. I am currently implementing the first approach in the above mentioned answer but it is getting quite big and it feels like there should be a better solution to this.
Other things I thought about trying:
using a nested subselect (nested because Oracle doesn't have top(n) and I need to sort the subselect) to select the previous 3 records ordered by DT, and then an outer select with rownum <= 3, so I could just use AVG(). However, I was told subselects can be quite slow and that joins perform better in Oracle. I don't know if that is really the case; I haven't done any testing.
Edit: My insert right now looks like this. I am already aggregating the Filecount for a day as there can be multiple records per DT per Customer per Something.
insert into temp_foo (DT, Something, Customer, Filecount)
select dates.DT, tbl1.Something, tbl1.Customer, coalesce(sum(tbl3.Filecount), 0)
  from table(Function_Returning_Daterange(NULL, NULL)) dates
 cross join
       (select Something,
               Customer,
               Code,
               Value
          from Table2 tbl2
         where Something = 'Value') tbl1
  left outer join Table3 tbl3
       on tbl3.Customer = tbl1.Customer
      and trunc(tbl3.MinDate) = trunc(dates.DT)
 group by dates.DT, tbl1.Something, tbl1.Customer;
You could use an analytic average with a window clause:
select dt, customer, filecount,
avg(filecount) over (partition by customer order by dt
rows between 3 preceding and 1 preceding) as avgfilecount
from tmp_foo
order by dt desc;
DT         CUSTOMER FILECOUNT AVGFILECOUNT
---------- -------- --------- ------------
2019-04-30 x               10   4.66666667
2019-04-29 x                2            6
2019-04-28 x                3            9
2019-04-27 x                9
and then do the update part with a merge statement:
merge into tmp_foo t
using (
select dt, customer,
avg(filecount) over (partition by customer order by dt
rows between 3 preceding and 1 preceding) as avgfilecount
from tmp_foo
) s
on (s.dt = t.dt and s.customer = t.customer)
when matched then update set t.avgfilecount = s.avgfilecount;
4 rows merged.
select dt, customer, filecount, avgfilecount
from tmp_foo
order by dt desc;
DT         CUSTOMER FILECOUNT AVGFILECOUNT
---------- -------- --------- ------------
2019-04-30 x               10   4.66666667
2019-04-29 x                2            6
2019-04-28 x                3            9
2019-04-27 x                9
You haven't shown your original insert statement; it might be possible to add the analytic calculation to that, and avoid the separate update step.
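For illustration, here is a sketch of what that combined statement could look like, folding the analytic average into the insert shown in the edit via an inline view (my own guess, untested; it partitions by Customer only, as in the merge above, so it assumes one row per Customer per DT):

insert into temp_foo (DT, Something, Customer, Filecount, AvgFilecount)
select DT, Something, Customer, Filecount,
       avg(Filecount) over (partition by Customer order by DT
                            rows between 3 preceding and 1 preceding) as AvgFilecount
  from (
        -- the aggregating select from the edit, unchanged
        select dates.DT, tbl1.Something, tbl1.Customer,
               coalesce(sum(tbl3.Filecount), 0) as Filecount
          from table(Function_Returning_Daterange(NULL, NULL)) dates
         cross join (select Something, Customer, Code, Value
                       from Table2
                      where Something = 'Value') tbl1
          left outer join Table3 tbl3
               on tbl3.Customer = tbl1.Customer
              and trunc(tbl3.MinDate) = trunc(dates.DT)
         group by dates.DT, tbl1.Something, tbl1.Customer
       );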
Also, if you want the first two date values to be calculated as if the 'missing' extra days before them had zero counts, you could use sum and division instead of avg:
select dt, customer, filecount,
sum(filecount) over (partition by customer order by dt
rows between 3 preceding and 1 preceding)/3 as avgfilecount
from tmp_foo
order by dt desc;
DT         CUSTOMER FILECOUNT AVGFILECOUNT
---------- -------- --------- ------------
2019-04-30 x               10   4.66666667
2019-04-29 x                2            4
2019-04-28 x                3            3
2019-04-27 x                9
It depends what you expect those last calculated values to be.

How to avoid a zero value in a column after subtracting 2 dates in SQL?

I have this query:
SELECT CREAD, DATE, NOTIFICATION,
       NVL(ROUND((((:endate - :stdate) * 24) - CASE
                      WHEN MAX(CREAD) = MIN(CREAD) THEN MAX(CREAD)
                      ELSE MAX(CREAD) - MIN(CREAD)
                  END) / COUNT(NOTIFICATION), 2), 0) "MTTR"
FROM DUAL;
When the user selects dates between start_date and end_date, the query produces this table:

CREAD DATE      NOTIFICATION MTTR
123   1/1/2017  6            56
1000  30/1/2017 3            80

When the user selects no dates between start_date and end_date, it produces this table:

CREAD DATE      NOTIFICATION MTTR
123   1/1/2017  6            0
1000  30/1/2017 3            0

In the MTTR column the value is 0, but I want the same values as in the first table.
How do I write this query in Oracle SQL?
When the user selects no dates between start_date and end_date then ... MTTR column value is 0.
When the user selects no dates, the first part of the expression, (:endate - :stdate), evaluates to (null - null), hence the rest of the expression evaluates to null, and therefore your query returns the NVL() default of 0.
but I want the same values as in the first table.
To get this outcome you need to handle null values for start_date and end_date. Maybe this will work for you:
nvl(:endate -:stdate, 1)
Or you may want some other default. But as the value of MTTR depends on the range of start_date and end_date, it's hard to see how you can get "the same value" when the user enters no bounds.
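Substituted into the original query, that would look like the following sketch (whether 1 is an appropriate default for the date range is an assumption you would need to adjust):

SELECT CREAD, DATE, NOTIFICATION,
       NVL(ROUND(((NVL(:endate - :stdate, 1) * 24) - CASE   -- null-safe range; default of 1 day is an assumption
                      WHEN MAX(CREAD) = MIN(CREAD) THEN MAX(CREAD)
                      ELSE MAX(CREAD) - MIN(CREAD)
                  END) / COUNT(NOTIFICATION), 2), 0) "MTTR"
FROM DUAL;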

Logic for change in one column value in PL/SQL

I have an assignments table asg_tab with effective start date and effective end date columns, which track on which dates each change was made.
asg_tab

eff_start_date  eff_end_date  PER_ASG_ATTRIBUTE2  job name  pos name
01-Jan-2015     03-Feb-2015   Ck Bonus            Retail    Mgr
04-Feb-2015     20-Feb-2015   UK Bonus            Sales     Mgr
21-Feb-2015     28-Nov-2015   UK Bonus            Sales     Snr. Mgr
Now I have to calculate the number of days for which PER_ASG_ATTRIBUTE2 is UK Bonus. For example, in the above case it is the days between 04-Feb-2015 and 28-Nov-2015.
I have written the logic below, which fetches values from a cursor.
cursor cur_asg is
    select eff_start_date,
           eff_end_date,
           PER_ASG_ATTRIBUTE2,
           job_name,
           pos_name
      from asg_tab;
Logic I have built (inside the loop over cur_asg):
START_DT := DATE '2015-01-01';
END_DT   := DATE '2015-12-31';

IF i.PER_ASG_ATTRIBUTE2 LIKE '%UK Bonus%' THEN
    l_new_ATTR       := i.PER_ASG_ATTRIBUTE2;
    l_effective_date := i.eff_start_date;
    IF (l_new_ATTR <> l_old_ATTR)
       AND (l_effective_date >= START_DT)
       AND (l_effective_date <= END_DT)
    THEN
        l_days := i.eff_end_date - i.eff_start_date;
    END IF;
    l_old_ATTR := l_new_ATTR;
END IF;
The issue comes from this condition: IF (l_new_ATTR <> l_old_ATTR) AND (l_effective_date >= START_DT) AND (l_effective_date <= END_DT).
This condition picks up the 2nd row, where PER_ASG_ATTRIBUTE2 changed from Ck Bonus to UK Bonus, but the 3rd row was generated because the pos name changed.
Even though PER_ASG_ATTRIBUTE2 is still UK Bonus there, that row does not pass the IF condition.
What more can I add to this condition?
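One possible direction, sketched as a standalone block (my own illustration, not from the thread): drop the old/new comparison entirely and accumulate the span of every row whose attribute is UK Bonus, so rows generated only by a job or pos name change still count:

DECLARE
    l_days NUMBER := 0;
BEGIN
    FOR i IN (SELECT eff_start_date, eff_end_date, per_asg_attribute2
                FROM asg_tab
               ORDER BY eff_start_date)
    LOOP
        IF i.per_asg_attribute2 LIKE '%UK Bonus%'
           AND i.eff_start_date >= DATE '2015-01-01'
           AND i.eff_end_date   <= DATE '2015-12-31'
        THEN
            -- accumulate this row's span; the +1 makes both endpoints count
            l_days := l_days + (i.eff_end_date - i.eff_start_date) + 1;
        END IF;
    END LOOP;
    DBMS_OUTPUT.PUT_LINE('UK Bonus days: ' || l_days);
END;
/

With the sample rows this yields 17 + 281 = 298 days, i.e. 04-Feb-2015 through 28-Nov-2015 inclusive.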

Min(), Max() within date subset

Not sure if the title fits, but here's my problem:
I have the following table:
create table OpenTrades(
AccountNumber number,
SnapshotTime date,
Ticket number,
OpenTime date,
TradeType varchar2(4),
TradeSize number,
TradeItem char(6),
OpenPrice number,
CurrentAsk number,
CurrentBid number,
TradeSL number,
TradeTP number,
TradeSwap number,
TradeProfit number
);
alter table OpenTrades add constraint OpenTrades_PK Primary Key (AccountNumber, SnapshotTime, Ticket) using index tablespace MyNNIdx;
For every (SnapshotTime, account), I want to select min(OpenPrice) and max(OpenPrice) in such a way that the resulting min and max are relative to the past only, with respect to SnapshotTime.
For instance, for any possible (account, tradeitem) pair, I may have 10 records with, say, Snapshottime=10-jun and openprice between 0.9 and 2.0, as well as 10 more records with SnapshotTime=11-jun and openprice between 1.0 and 2.1, as well as 10 more records with SnapshotTime=12-jun and openprice between 0.7 and 1.9.
In such scenario, the sought query should return something like this:
AccountNumber SnapshotTime MyMin MyMax
------------- ------------ ----- -----
1234567       10-jun       0.9   2.0
1234567       11-jun       0.9   2.1
1234567       12-jun       0.7   2.1
I've already tried this, but it only returns min() and max() within the same snapshottime:
select accountnumber, snapshottime, tradeitem, min(openprice), max(openprice)
from opentrades
group by accountnumber, snapshottime, tradeitem
Any help would be appreciated.
You can use the analytic versions of min() and max() for this, along with windowing clauses:
select distinct accountnumber, snapshottime, tradeitem,
min(openprice) over (partition by accountnumber, tradeitem
order by snapshottime, openprice
rows between unbounded preceding and current row) as min_openprice,
max(openprice) over (partition by accountnumber, tradeitem
order by snapshottime, openprice desc
rows between unbounded preceding and current row) as max_openprice
from opentrades
order by accountnumber, snapshottime, tradeitem;
ACCOUNTNUMBER SNAPSHOTTIME TRADEITEM MIN_OPENPRICE MAX_OPENPRICE
------------- ------------ --------- ------------- -------------
      1234567 10-JUN-14    X                    .9             2
      1234567 11-JUN-14    X                    .9           2.1
      1234567 12-JUN-14    X                    .7           2.1
The partition by calculates the value for the current accountnumber and tradeitem, within the subset of rows based on the rows between clause; the order by means that it only looks at rows in any previous snapshot and up to the lowest (for min) or highest (for max, because of the desc) in the current snapshot, when calculating the appropriate min/max for each row.
The analytic result is calculated for every row. If you run it without the distinct then you see all your base data plus the same min/max for each snapshot. As you don't want any of the varying data you can suppress the duplication with distinct, or by making it a query with a row_number() that you then filter on, etc.
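For illustration, here is one way that row_number() variant could look (my own sketch; it uses RANGE rather than ROWS so all rows in the current snapshot count as peers, which removes the need for the openprice ordering trick):

select accountnumber, snapshottime, tradeitem, min_openprice, max_openprice
from (
    select accountnumber, snapshottime, tradeitem,
           min(openprice) over (partition by accountnumber, tradeitem
                                order by snapshottime
                                range between unbounded preceding and current row) as min_openprice,
           max(openprice) over (partition by accountnumber, tradeitem
                                order by snapshottime
                                range between unbounded preceding and current row) as max_openprice,
           -- keep exactly one row per (account, tradeitem, snapshot)
           row_number() over (partition by accountnumber, tradeitem, snapshottime
                              order by openprice) as rn
    from opentrades
)
where rn = 1
order by accountnumber, snapshottime, tradeitem;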
Does this answer your problem?
select ot1.accountnumber, ot1.snapshottime, ot1.tradeitem,
min(ot2.openprice), max(ot2.openprice)
from opentrades ot1, opentrades ot2
where ot2.accountnumber = ot1.accountnumber
and ot2.tradeitem = ot1.tradeitem
and ot2.snapshottime <= ot1.snapshottime
group by ot1.accountnumber, ot1.snapshottime, ot1.tradeitem

Grouping data by date ranges

I wonder how I can select a range of data depending on the date range.
I have this data in my payment table, with dates in dd/mm/yyyy format:
Id Date      Amount
1  4/1/2011  300
2  10/1/2011 200
3  27/1/2011 100
4  4/2/2011  300
5  22/2/2011 400
6  1/3/2011  500
7  1/1/2012  600
The closing date is on the 27th of every month, so I would like to group all the data from the 27th until the 26th of the next month into one group.
That is to say, I would like the output like this:
Group 1
1 4/1/2011  300
2 10/1/2011 200
Group 2
1 27/1/2011 100
2 4/2/2011  300
3 22/2/2011 400
Group 3
1 1/3/2011  500
Group 4
1 1/1/2012  600
It's not clear what the context of your question is. Are you querying a database?
If so, you are asking about datetime values, but it seems you have your column in string format.
First of all, convert your data to a datetime data type (or some equivalent; which DB engine are you using?), and then use a grouping criterion like this:
GROUP BY datepart(month, dateadd(day, -26, [datefield])), DATEPART(year, dateadd(day, -26, [datefield]))
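For example, assuming the column has already been converted to a datetime and using the column names from your sample, the full grouping query might look like this (T-SQL, an untested sketch):

SELECT DATEPART(year,  DATEADD(day, -26, [Date])) AS cycle_year,
       DATEPART(month, DATEADD(day, -26, [Date])) AS cycle_month,
       SUM(Amount) AS total_amount                 -- aggregate per billing cycle
FROM payment                                       -- assumes [Date] is already a datetime column
GROUP BY DATEPART(year,  DATEADD(day, -26, [Date])),
         DATEPART(month, DATEADD(day, -26, [Date]))
ORDER BY cycle_year, cycle_month;

Shifting each date back 26 days makes every 27th-to-26th window collapse onto a single calendar month, so year plus month of the shifted date identifies the cycle.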
EDIT:
So you are using LINQ?
Different language, same logic:
.GroupBy(x => DateTime
    .ParseExact(x.Date, "dd/MM/yyyy", CultureInfo.InvariantCulture) // assuming your date field is a string
    .AddDays(-26)
    .ToString("yyyyMM"));
If you are going to do this frequently, it would be worth investing in a table that assigns a unique identifier to each month and the start and end dates:
CREATE TABLE MonthEndings
(
MonthID INTEGER NOT NULL PRIMARY KEY,
StartDate DATE NOT NULL,
EndDate DATE NOT NULL
);
INSERT INTO MonthEndings VALUES(201101, '27/12/2010', '26/01/2011');
INSERT INTO MonthEndings VALUES(201102, '27/01/2011', '26/02/2011');
INSERT INTO MonthEndings VALUES(201103, '27/02/2011', '26/03/2011');
INSERT INTO MonthEndings VALUES(201201, '27/12/2011', '26/01/2012');
You can then group accurately using:
SELECT M.MonthID, P.Id, P.Date, P.Amount
FROM Payments AS P
JOIN MonthEndings AS M ON P.Date BETWEEN M.StartDate and M.EndDate
ORDER BY M.MonthID, P.Date;
Any group headings etc are best handled out of the DBMS - the SQL gets you the data in the correct sequence, and the software retrieving the data presents it to the user.
If you can't translate SQL to LINQ, that makes two of us. Sorry, I have never used LINQ, so I've no idea what is involved.
SELECT *, CASE WHEN datepart(day, date) < 27 THEN datepart(month, date)
               ELSE datepart(month, date) % 12 + 1 END AS group_name
FROM payment
