Overlapping task duration aggregation in DAX Hierarchy - dax

SCENARIO:
Monitor “Task” duration. The purpose is to track trends in changing durations.
A framework that stores information about tasks is in place.
A hierarchy is set up using parent/child IDs.
Columns: Task_ID, Parent_Task_ID, Start_Time, End_Time, Duration.
The hierarchy has been set up with the calculated columns
Depth, Path, Level1-Level4 (according to DAX Patterns on hierarchies).
Measures: BrowseDepth and RowDepth have been set up to remove blanks in the report for empty levels.
DAX:
Sum Hierarchy =
VAR Val = [Sum Duration]
VAR ShowRow=[BrowseDepth] <= [RowDepth]
RETURN IF(ShowRow,Val)
CHALLENGE:
In the Task table all durations are correct, so there is no need to aggregate the durations of the children up to the parent.
In the DAX Patterns hierarchy examples I have found, values are always aggregated up from children to parent, as the black numbers in the matrix below show.
The goal is to create a measure that avoids aggregating from children to parent and presents the blue numbers in the picture above.
Does anyone have any pointers on the pattern or logic to use? It would be greatly appreciated.
Kind regards,
Atle Røen

The answer to this was to use MAX( 'Task'[Duration] ) and filter on the last task execution. Simple when you finally find the answer :D
DAX:
Max Task duration - Last run :=
CALCULATE (
    MAX ( 'Task'[Duration] ),
    LASTDATE (
        FILTER (
            VALUES ( 'Task'[Start_Time] ),
            MAX ( 'Calendar'[Date] ) >= 'Task'[Start_Time]
        )
    )
)
Task duration - Last run - Hier :=
VAR Val = [Max Task duration - Last run]
VAR ShowRow = [BrowseDepth] <= [MaxNodeDepth]
RETURN
    IF ( ShowRow, Val )
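The logic of the measure can be modelled outside DAX. The following is a minimal Python sketch, not the DAX engine: it assumes that a hierarchy node "sees" its own rows plus those of its descendants, and that a parent runs at least as long as any of its children, which is presumably why MAX returns the parent's own duration. The task names, dates, and durations are hypothetical.

```python
from datetime import date

# Hypothetical rows: (task_id, path, start_date, duration).
# path lists the ancestors of the task followed by the task itself.
TASKS = [
    ("job",    ("job",),          date(2021, 1, 1), 50),
    ("step_a", ("job", "step_a"), date(2021, 1, 1), 30),
    ("step_b", ("job", "step_b"), date(2021, 1, 1), 40),
    ("job",    ("job",),          date(2021, 1, 2), 55),
    ("step_a", ("job", "step_a"), date(2021, 1, 2), 35),
    ("step_b", ("job", "step_b"), date(2021, 1, 2), 45),
]


def last_run_max_duration(node, as_of):
    """Max duration among rows visible under `node` on the latest start date <= as_of."""
    visible = [(start, dur) for _, path, start, dur in TASKS
               if node in path and start <= as_of]
    if not visible:
        return None  # plays the role of BLANK
    last_start = max(start for start, _ in visible)
    return max(dur for start, dur in visible if start == last_start)


def last_run_summed_duration(node, as_of):
    """The unwanted roll-up: children are added on top of the parent."""
    visible = [(start, dur) for _, path, start, dur in TASKS
               if node in path and start <= as_of]
    if not visible:
        return None
    last_start = max(start for start, _ in visible)
    return sum(dur for start, dur in visible if start == last_start)
```

With this data the parent "job" reports 55 for the last run under the MAX approach, whereas summing would give 135.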

Related

DAX: Unwanted cartesian product lines?

Please help me correctly calculate and understand the lastDate and rnkDate measures for the following simplified example:
Desired result:
Reality (incorrect subtypes):
Why is the relationship broken?
How can I avoid these cartesian product lines?
My measure (I commented out the workaround, because it acts as a post-filter rather than a pre-filter):
rnkDate =
VAR t =
    CALCULATETABLE(
        VALUES(tstTable[Date]),
        REMOVEFILTERS(tstTable[Date])
    )
RETURN
    //IF( MAX(tstTable[Amount])<>BLANK(), // WORKAROUND to hide unwanted rows
    RANKX(
        t,
        LASTDATE(tstTable[Date])
    )
    //)
P.S. The mess happens only when I use fields from the dimension table dimType[Type]; within a single table everything is OK:
The problem is that the query generated by Power BI performs the cartesian product and filters the result by checking the result of the measure.
In our case it is something similar to:
SUMMARIZECOLUMNS(
    'dimType'[Type],
    'tstTable'[subType],
    'tstTable'[Date],
    "MinAmount", CALCULATE(MIN('tstTable'[Amount])),
    "lastDate", 'tstTable'[lastDate],
    "rnkDate", 'tstTable'[rnkDate]
)
SUMMARIZECOLUMNS doesn't use relationships when iterating over columns from different tables; it applies them when evaluating the measures. There is an article explaining the equivalent DAX code executed by SUMMARIZECOLUMNS:
Introducing SUMMARIZECOLUMNS
The problem is that RANKX evaluated on an empty table returns 1. This can be seen by executing the following on dax.do:
EVALUATE
VAR t =
    FILTER ( ALL ( 'Date'[Date] ), FALSE )
RETURN
    { RANKX ( t, [Sales Amount] ), CALCULATE ( [Sales Amount], t ) }
So the solution is to first check that the fact table is not empty in the current filter context, which is why the workaround you implemented solved the issue:
lastDate =
IF( NOT ISEMPTY(tstTable), // checks fact table in particular context
    CALCULATE(
        LASTDATE(tstTable[Date]),
        REMOVEFILTERS(tstTable[Date])
    )
)
rnkDate =
VAR t =
    CALCULATETABLE(
        VALUES(tstTable[Date]),
        REMOVEFILTERS(tstTable[Date])
    )
RETURN
    IF( NOT ISEMPTY(tstTable),
        RANKX(
            t,
            LASTDATE(tstTable[Date])
        )
    )
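The effect described above can be imitated with a small Python model. This is only a sketch under stated assumptions, not what the engine actually executes: it assumes the type column is cross joined with the existing (subType, Date) pairs, that a row is kept whenever the measure is not blank, and that ranking against an empty table yields 1. The fact rows and the helper names rank_desc and report are hypothetical.

```python
from itertools import product

# Hypothetical fact rows: (type, sub_type, date_number, amount)
FACTS = [
    ("A", "a1", 1, 10.0),
    ("A", "a1", 2, 12.0),
    ("B", "b1", 1, 7.0),
]


def rank_desc(table, value):
    """Descending rank of value within table; an empty table yields 1."""
    return 1 + sum(1 for v in table if v > value)


def report(guarded):
    """Rows of (type, sub_type, date_number, rank) produced by the modelled query."""
    types = sorted({t for t, _, _, _ in FACTS})
    # Columns from the same table: only combinations that exist in the data.
    sub_dates = sorted({(s, d) for _, s, d, _ in FACTS})
    result = []
    for t, (s, d) in product(types, sub_dates):
        in_context = [fd for ft, fs, fd, _ in FACTS if (ft, fs, fd) == (t, s, d)]
        if guarded and not in_context:
            continue  # the guarded measure is blank, so the row is dropped
        # Dates for this type and sub_type with the date filter removed.
        all_dates = {fd for ft, fs, fd, _ in FACTS if (ft, fs) == (t, s)}
        last = max(in_context) if in_context else None
        result.append((t, s, d, rank_desc(all_dates, last)))
    return result
```

Calling report(False) yields six rows, three of them for combinations that have no fact rows and a rank of 1; report(True) keeps only the three real combinations.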

Dax Power Pivot Cumulative Total Over Development Period Dimension

I'm new to DAX and I'm struggling to get the results I require with a running total.
I hope I've given you enough information below to help.
Thanks in advance for any help and pointers.
I'm trying to create an insurance triangle. I have 3 tables: fact_transaction_claims_payments, dimension_development_Periods and reporting_upto_information.
fact_transaction_claims_payments
dimension_development_Periods
reporting_upto_information
diagram view
I've managed to get my running total to work, but only where there is a development period held in the fact table, and not for all the ones in between that are held in my development period dimension.
Power Pivot Running Total
For example, I'd expect development periods 1 to 10 to total 0.00 and 13 to 18 to total 103,710,
but I can't seem to get these to appear in my pivot.
Running Total DAX Expression
RT_TotalAmount1 :=
VAR CurrentDevelopmentPeriod =
    CALCULATE (
        DATEDIFF (
            IF (
                MAX ( 'fact_transaction_claims_payments'[cUWYear] ) < 2011,
                DATE ( MAX ( 'fact_transaction_claims_payments'[cUWYear] ), 1, 1 ),
                DATE ( MAX ( 'fact_transaction_claims_payments'[cUWYear] ), 4, 1 )
            ),
            MAX ( 'fact_transaction_claims_payments'[EndOfMonthDate] ),
            MONTH
        ),
        'dimension_development_Periods'
    ) + 1
VAR MaxDevelopmentPeriod =
    CALCULATE (
        DATEDIFF (
            IF (
                MAX ( 'fact_transaction_claims_payments'[cUWYear] ) < 2011,
                DATE ( MAX ( 'fact_transaction_claims_payments'[cUWYear] ), 1, 1 ),
                DATE ( MAX ( 'fact_transaction_claims_payments'[cUWYear] ), 4, 1 )
            ),
            MAX ( 'reporting_upto_information'[EndOfMonthDate] ),
            MONTH
        ),
        'dimension_development_Periods'
    ) + 1
VAR RunningTotal =
    CALCULATE (
        SUM ( 'fact_transaction_claims_payments'[TotalAmount] ),
        FILTER (
            ALL ( 'fact_transaction_claims_payments' ),
            'fact_transaction_claims_payments'[DevelopmentMonthNumber] <= CurrentDevelopmentPeriod
        )
    )
RETURN
    RunningTotal
I've tried adding ALL('dimension_development_Periods') to the VAR DevelopmentPeriod,
but this just puts the grand total against every development period.
I think I now need to use RunningTotal to calculate against the development period dimension, filtered to <= the max development period for the cUWYear, but I'm not sure how to implement this. Can anyone advise?
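To make the expected output concrete, here is a minimal Python sketch of a running total that is driven by the list of development periods rather than by the fact rows, so periods without payments still appear and carry the previous total forward. It models the desired result only and is not a DAX solution; the amounts and the function name running_totals are hypothetical.

```python
from itertools import accumulate


def running_totals(amount_by_period, periods):
    """Cumulative total for every period in `periods`, including those with no payments."""
    periods = list(periods)
    per_period = [amount_by_period.get(p, 0.0) for p in periods]
    return dict(zip(periods, accumulate(per_period)))


# Hypothetical payments keyed by development period number.
payments = {11: 100.0, 12: 50.0, 19: 25.0}
totals = running_totals(payments, range(1, 21))
```

Here periods 1 to 10 total 0.0, and periods 13 to 18 repeat the total reached at period 12, which is the shape of result the question asks for.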

Calculated column that ignores row context

I'm trying to calculate the Total Price per Order Number. It specifically needs to be a column, because I'll need it for further calculations.
Can someone help me write code that calculates the total per Order Number, instead of the line amount as it does now?
Since it's a calculated column, simply avoiding any context transition gives a straightforward solution:
Total Price Per Order =
VAR CurrentOrder = SalesDetail[Order Number]
RETURN
    SUMX (
        FILTER (
            SalesDetail,
            SalesDetail[Order Number] = CurrentOrder
        ),
        SalesDetail[Unit Price] * SalesDetail[Quantity]
    )
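For readers who want to check the expected numbers, the same per-order total can be sketched in Python. This is an illustration of the logic only: the field names order, unit_price, and quantity and the sample lines are hypothetical stand-ins for the SalesDetail columns.

```python
from collections import defaultdict


def add_order_totals(lines):
    """Return a copy of each line with the total of its whole order attached."""
    totals = defaultdict(float)
    for line in lines:
        totals[line["order"]] += line["unit_price"] * line["quantity"]
    return [{**line, "order_total": totals[line["order"]]} for line in lines]


# Hypothetical order lines.
sales = [
    {"order": 1, "unit_price": 10.0, "quantity": 2},
    {"order": 1, "unit_price": 5.0, "quantity": 1},
    {"order": 2, "unit_price": 4.0, "quantity": 3},
]
with_totals = add_order_totals(sales)
```

Every line of order 1 carries the same order_total, which mirrors what the calculated column stores on each row.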

How do I create a DAX expression to display a calculation from a derived table?

I'm using DaxStudio to test some measures, but am having trouble getting them to work. I can run the following expression, but I don't know how to return just the average of the field Mean. I'm expecting the output to be a single cell containing that average.
DAX Query:
EVALUATE
FILTER(
    NATURALINNERJOIN(Alldata, NATURALINNERJOIN('Label', NATURALINNERJOIN('LabelBSkill', 'LabelCSkill'))),
    'LabelCSkill'[Name] = "Critical"
        && 'Label'[Type]="Red"
)
Mean is in the table Alldata if that matters
Give this a try:
EVALUATE
ROW (
    "Mean", CALCULATE (
        AVERAGE ( Alldata[Mean] ),
        'LabelCSkill'[Name] = "Critical",
        'Label'[Type] = "Red"
    )
)

PIG- Aggregations based on multiple columns

My Input data set has 3 columns and schema looks like below:
ActivityDate, EventId, EventDate
Now, using Pig, I need to derive multiple variables like those below in one output file:
1) All Event Ids after ActivityDate >= EventDate -30 days
2) All Event Ids after ActivityDate >= EventDate -60 days
3) All Event Ids after ActivityDate >= EventDate -90 days
I have more than 30 variables like this. If there were only one variable, a simple FILTER would be enough.
I am thinking about a UDF implementation that takes a bag as input and returns the count of Event IDs for each parameter, based on the criteria above.
What is the best way to aggregate the data on multiple columns in Pig?
I would suggest creating another file with all of your thresholds and cross joining it with your data.
The file would contain:
30
60
90
etc.
Read it like this:
grouping = load 'grouping.txt' using PigStorage(',') as (groups:double);
Then do:
data_with_grouping = cross data, grouping;
Then add this binary condition:
data_with_binary_condition = foreach data_with_grouping generate ActivityDate, EventId, EventDate, groups, (ActivityDate >= EventDate - groups ? 1 : 0) as binary_condition;
Now you have one column with the threshold and one column with a binary variable that tells you whether the ID meets the condition or not.
You can filter out all of the zeros from binary_condition and then group on the groups column:
data_with_binary_condition_filtered = filter data_with_binary_condition by (binary_condition != 0);
grouped_by_threshold = group data_with_binary_condition_filtered by groups;
count_of_IDS = foreach grouped_by_threshold generate group, COUNT(data_with_binary_condition_filtered.EventId);
I hope this works. Obviously, I didn't debug it, since I don't have your files.
This code will take a bit more time to run, but it will produce the output you need without a UDF.
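The cross join with a threshold table can be checked on a small sample outside Pig. Below is a minimal Python sketch of the same counting logic; it assumes dates are already expressed as day numbers, and the sample rows and the function name counts_by_threshold are hypothetical.

```python
def counts_by_threshold(rows, thresholds):
    """Count rows where activity_day >= event_day - threshold, per threshold.

    rows: iterable of (activity_day, event_id, event_day) with dates as day numbers.
    """
    rows = list(rows)
    return {t: sum(1 for activity, _, event in rows if activity >= event - t)
            for t in thresholds}


# Hypothetical sample: every activity on day 100, events on later days.
sample = [
    (100, "e1", 120),
    (100, "e2", 150),
    (100, "e3", 200),
    (100, "e4", 185),
]
counts = counts_by_threshold(sample, [30, 60, 90])
```

Each threshold is evaluated against every row, which is what the cross join achieves; adding a threshold means adding one value to the list rather than writing another filter.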
If I understand your question correctly, you want to divide the difference between EventDate and ActivityDate into 30-day blocks (e.g. 1 to 30, 31 to 60, 61 to 90 and so on) and then count the frequency of each block.
In this case, I would just rearrange the above equation to create the variable 'range' as below:
-- assuming the relation data contains 3 columns: ActivityDate, EventId, EventDate
-- dividing the difference between ED and AD by 30 and casting it to int, so that 1 block is represented by 1 integer.
input1 = FOREACH data GENERATE (int)((EventDate - ActivityDate) / 30) as range;
output1 = GROUP input1 BY range;
output2 = FOREACH output1 GENERATE group AS range, COUNT(input1) as count;
Hope this helps.
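The block-counting idea can be sketched in Python as well. This is an illustration under the assumption that dates are day numbers; the sample rows and the function name block_frequencies are hypothetical. Note that integer truncation puts differences 0 to 29 in block 0, 30 to 59 in block 1, and so on.

```python
from collections import Counter


def block_frequencies(rows, block_size=30):
    """Count rows per block of (event_day - activity_day), truncated toward zero."""
    return Counter(int((event - activity) / block_size)
                   for activity, _, event in rows)


# Hypothetical sample with differences of 20, 50, 100 and 85 days.
sample_rows = [
    (100, "e1", 120),
    (100, "e2", 150),
    (100, "e3", 200),
    (100, "e4", 185),
]
frequencies = block_frequencies(sample_rows)
```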
