I am fairly new to the DAX universe, but after some searching I managed to implement a cumulative (running) total with a measure structured like this:
Running_Total_QTY:=CALCULATE(SUM(Reporting[QTY]),FILTER(ALL(Reporting[DATE_R]),Reporting[DATE_R]<=MAX(Reporting[DATE_R])))
For a table that looks like this:
ID DATE_R QTY
A1 5/11/2018 9:00 5
A1 5/11/2018 9:01 10
A1 5/11/2018 9:01 -5
A1 5/11/2018 9:02 50
A1 5/11/2018 9:05 -20
B1 5/11/2018 9:00 3
B1 5/11/2018 9:01 -20
B1 5/11/2018 9:01 4
B1 5/11/2018 9:02 20
B1 5/11/2018 9:03 10
The problem is that I need to add a starting quantity, QTY_INIT, to this running total; it comes from another table that looks like this:
ID1 QTY_INIT
A1 100
B1 200
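(To make the goal concrete, here is a rough sketch of the result I am after, written in pandas rather than DAX purely to illustrate the logic; the data and column names mirror the two tables above.)

import pandas as pd

# Hypothetical mini versions of the two tables described above.
reporting = pd.DataFrame({
    "ID":     ["A1", "A1", "A1", "B1", "B1"],
    "DATE_R": pd.to_datetime(["2018-05-11 09:00", "2018-05-11 09:01",
                              "2018-05-11 09:02", "2018-05-11 09:00",
                              "2018-05-11 09:01"]),
    "QTY":    [5, 10, 50, 3, -20],
})
starting_quantity = pd.DataFrame({"ID1": ["A1", "B1"], "QTY_INIT": [100, 200]})

# Running total of QTY per ID, ordered by date...
reporting = reporting.sort_values(["ID", "DATE_R"])
reporting["Running_Total_QTY"] = reporting.groupby("ID")["QTY"].cumsum()

# ...plus the starting quantity looked up from the second table.
merged = reporting.merge(starting_quantity, left_on="ID", right_on="ID1")
merged["Running_plus_total"] = merged["Running_Total_QTY"] + merged["QTY_INIT"]
print(merged[["ID", "DATE_R", "QTY", "Running_plus_total"]])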
By trial and error I got this to work by creating a second measure that calculates the average (of a single item!), defined like this:
Average_starting_quantity:=CALCULATE(AVERAGE(Starting_Quantity[QTY_INIT]),FILTER(ALL(Starting_Quantity[ID1]),Starting_Quantity[ID1]=LASTNONBLANK(Reporting[ID],TRUE())))
And then just adding the two measures together.
Running_plus_total:=[Running_Total_QTY]+[Average_starting_quantity]
This method works, but is very inefficient and very slow (the data set is quite big).
How can I add QTY_INIT from the second table directly, without using a "fake" average (or max, min, etc.)? And how can I optimize the measure for better performance?
Thanks in advance for any help.
Regards
How about this instead of your Average_starting_quantity?
StartingQty = LOOKUPVALUE(Starting_Quantity[QTY_INIT],
Starting_Quantity[ID1], MAX(Reporting[ID]))
If your tables are related on ID and ID1 with cross filter direction going both ways,
then you can just use
StartingQty = MAX(Starting_Quantity[QTY_INIT])
since the filter context on ID will flow through to ID1.
I am trying to run an Exploratory Factor Analysis (EFA) on my questionnaire data.
I have data for 201 participants and 30 questions. The head of my data looks something like this (I am showing only the first 5 questions to give an idea of the dataset structure):
Q1 Q2 Q3 Q3 Q4 Q5
1 14 0 20 0 0 0
2 14 14 20 20 20 1
3 20 18 20 20 20 9
4 14 14 20 20 20 0
5 20 18 20 20 20 5
6 20 18 20 20 8 7
I want to find multivariate outliers, so I am trying to calculate the Mahalanobis distance (cases with a Mahalanobis distance p-value smaller than 0.001 are considered outliers).
I am using this code in RStudio (all_data_EFA is my dataset name):
distance <- as.matrix(mahalanobis(all_data_EFA, colMeans(all_data_EFA), cov = cov(all_data_EFA)))
Mah_significant <- all_data_EFA %>%
transmute(row_number = 1:nrow(all_data_EFA),
Mahalanobis_distance = distance,
Mah_p_value = pchisq(distance, df = ncol(all_data_EFA), lower.tail = F)) %>%
filter(Mah_p_value <= 0.001)
However, when I run "distance" I get the following Error:
Error in solve.default(cov, ...) :
Lapack routine dgesv: system is exactly singular: U[26,26] = 0
As far as I understood, this means that the covariance matrix of my data is singular, hence the matrix is not invertible and I cannot calculate Mahalanobis distance.
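(For intuition, here is a minimal sketch of the underlying issue, written in Python/NumPy with made-up data rather than my questionnaire: a linearly dependent column, e.g. one variable that is an exact copy of another, makes the covariance matrix rank-deficient and therefore non-invertible.)

import numpy as np

# Made-up data: the third column is an exact copy of the second,
# so the columns are linearly dependent.
rng = np.random.default_rng(0)
x = rng.normal(size=(201, 2))
data = np.column_stack([x, x[:, 1]])

cov = np.cov(data, rowvar=False)
print(np.linalg.matrix_rank(cov))  # 2 instead of 3 -> rank deficient
print(np.linalg.det(cov))          # ~0 -> singular
# Inverting cov, which the Mahalanobis distance requires, then fails
# (or is numerically meaningless), which is what the dgesv error reports.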
Is there an alternative way to calculate multivariate outliers or how can I solve this problem?
Many thanks.
Context
I was not able to find the correct terminology, so I'll give a brief explanation of what I mean.
I sold some products and I don't know how my client will pay, so my ERP creates a Receivable record.
However, my client had already placed 2 other orders, so now I have 3 Receivable records. The client then doesn't pay, and the debt is renegotiated.
That can happen N times, as in the following chart.
What I want to do is:
Pass the Receivable 'D' ID, find the original Receivables (A0, A1, A2), and from those get every Receivable movement.
Current Function
First, I use a recursive function that finds the origin.
Then a recursive function that returns all the payments; this works when the origin is a single Receivable.
However, when the origin has more than one Receivable, it duplicates the values, since (A0, A1, A2) are aggregated.
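(To make the traversal I am describing concrete, here is a rough sketch in Python rather than PL/SQL; the edge list is hypothetical and mirrors the expected return shown further down. It walks up from 'D' to the origins, then back down, visiting each node only once so shared descendants are not duplicated.)

# Each movement is (father node, state, new node, value); None = no new node.
movements = [
    ("A0", "RENEGOTIATED", "B", 100), ("A1", "RENEGOTIATED", "B", 100),
    ("A2", "RENEGOTIATED", "B", 100), ("B", "CASH", "CASH", 100),
    ("B", "RENEGOTIATED", "C", 200),  ("C", "RENEGOTIATED", "C0", 50),
    ("C", "RENEGOTIATED", "C1", 50),  ("C", "RENEGOTIATED", "C2", 50),
    ("C", "RENEGOTIATED", "C3", 50),  ("C2", "RENEGOTIATED", "D", 50),
    ("C3", "RENEGOTIATED", "D", 50),  ("C0", "OPEN", None, 50),
    ("C1", "OPEN", None, 50),         ("D", "OPEN", None, 100),
]

parents_of = {}    # child -> set of father nodes
children_of = {}   # father -> list of (state, child, value)
for father, state, child, value in movements:
    if child is not None:
        parents_of.setdefault(child, set()).add(father)
    children_of.setdefault(father, []).append((state, child, value))

def origins(node):
    # Walk up until we reach nodes with no father: the original receivables.
    if node not in parents_of:
        return {node}
    found = set()
    for father in parents_of[node]:
        found |= origins(father)
    return found

def all_movements(roots):
    # Walk down from the origins, visiting each node only once so that
    # shared descendants (like B) are not counted several times.
    seen, result, stack = set(), [], list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for state, child, value in children_of.get(node, []):
            result.append((node, state, child, value))
            if child is not None:
                stack.append(child)
    return result

print(origins("D"))                  # {'A0', 'A1', 'A2'}
print(all_movements(origins("D")))   # every movement, each edge listed once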
Current scenario
QUESTION
While using a pipelined function, is it possible to pass over the data and "update/delete" it before returning?
Or will I have to make another function just to validate the data?
What would be the best way to handle this?
EXPECTED RETURN
POSITION  Father Node  STATE         New Node  VALUE
0         A0           RENEGOTIATED  B         100
0         A1           RENEGOTIATED  B         100
0         A2           RENEGOTIATED  B         100
1         B            CASH          CASH      100
2         B            RENEGOTIATED  C         200
2         C            RENEGOTIATED  C0        50
2         C            RENEGOTIATED  C1        50
2         C            RENEGOTIATED  C2        50
2         C            RENEGOTIATED  C3        50
3         C2           RENEGOTIATED  D         50
3         C3           RENEGOTIATED  D         50
3         C0           OPEN          NULL      50
3         C1           OPEN          NULL      50
4         D            OPEN          NULL      100
Real Scenario
Current output
I have a temp table that I am using for testing, and I need direction with some analytic functions. I am still trying to figure out my real solution, and any help to point me in the right direction will be appreciated.
A1 B1
40 5
50 4
60 3
70 2
90 1
I am trying to find the previous value and subtract and add the columns:
SELECT A1, B1,
(A1-B1) AS C1,
(A1-B1) + LEAD((A1-B1),1,0) OVER (ORDER BY ROWNUM) AS G1
FROM TEST;
The output is not what I expect
A1 B1 C1
40 5 35
50 4 46
60 3 57
70 2 68
90 1 89
From the last row (5th row), first subtract A1 - B1 to get C1; then take (C1 + previous row's A1) - previous row's B1, that is 89 + 70 - 2 = 157 (save the result in the previous row's C1).
4th row: 157 + 60 - 3 = 214
Repeat until the first row.
The expected final output should be:
A1 B1 C1
40 5 295
50 4 260
60 3 214
70 2 157
90 1 89
LAG and LEAD only get a single row's value, not an aggregation of multiple rows, and they are not applied recursively.
You want:
SELECT A1,
B1,
SUM( A1 - B1 ) OVER ( ORDER BY ROWNUM
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
) AS C1
FROM test;
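(As a quick sanity check of the arithmetic, here is the same reverse running total computed in Python on the sample rows; it reproduces the expected C1 column.)

# Sample rows from the question: (A1, B1) pairs in ROWNUM order.
rows = [(40, 5), (50, 4), (60, 3), (70, 2), (90, 1)]

diffs = [a - b for a, b in rows]           # A1 - B1 for each row
# Reverse running total: each row gets its own difference plus
# every difference that follows it.
c1 = [sum(diffs[i:]) for i in range(len(diffs))]
print(c1)                                   # [295, 260, 214, 157, 89]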
Here's the problem I'm facing. I have some number of items. I then have a varying number of buckets with a weight (between 0 and 1) attached to them. I'm trying to calculate the percentage of the items that should go in each bucket.
For example, let's say I have 20 items and 3 buckets:
B1 - weight: 0.5
B2 - weight: 0.5
B3 - weight: 0.25
The percentage would then be:
B1 - 40% of the items = 8 items
B2 - 40% of the items = 8 items
B3 - 20% of the items = 4 items
The percentage should add to 100% so that all items will be distributed into buckets. In the example above, B1 and B2 should both have twice as many items as B3 since their weight is double that of B3; but, when all 3 buckets are put together, the actual percentage of items B1 gets is 40%.
Is there an algorithm already out there for this or do any of you have an idea of how to solve it?
I think you can just divide the weight of each bucket by the total weight of all buckets to find the percentage of items which each bucket should bear.
However, there is a slight issue should the number of items and bucket weights not divide evenly. For the sake of example, let's consider the following scenario:
B1 - weight: 0.15
B2 - weight: 0.15
B3 - weight: 0.70
And let us suppose that there are 23 items.
Then we can compute the number of items which should be allocated to each bucket by just multiplying the fraction of total weight against the total number of items:
B1 - weight: 0.15, 3.45 items
B2 - weight: 0.15, 3.45 items
B3 - weight: 0.70, 16.1 items
One algorithm which could deal with this fractional bucket problem would be to compute the number of items for each bucket, one at a time, and then shift the remainder to the next calculation. So, in this example, we would do this:
B1 - 3.45 items, keep 3, rollover 0.45
B2 - 3.45 items + 0.45 = 3.9 items, keep 3, rollover 0.9
B3 - 16.1 items + 0.9 = 17 items (whole number, and last bucket)
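(A rough sketch of that rollover idea in Python; the names are made up, and you may want to handle the final rounding differently.)

def allocate(total_items, weights):
    # Split total_items across buckets in proportion to weights,
    # carrying the fractional remainder into the next bucket so the
    # counts always sum to total_items.
    total_weight = sum(weights)
    counts, carry = [], 0.0
    for w in weights:
        exact = total_items * w / total_weight + carry
        kept = int(exact)            # keep the whole part
        carry = exact - kept         # roll the fraction over
        counts.append(kept)
    counts[-1] += round(carry)       # any residue ends up in the last bucket
    return counts

print(allocate(23, [0.15, 0.15, 0.70]))   # [3, 3, 17], as in the example above
print(allocate(20, [0.5, 0.5, 0.25]))     # [8, 8, 4], as in the question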
Sum the weights from all buckets, then divide each bucket's weight by that sum to derive the bucket's percentage of the total.
I'm searching for a measure to use within an SSAS Tabular model that will allow me to perform dynamic ranking, where the rank value automatically updates based on the filters and slicer values that are applied.
I am not in this kind of scenario: PowerPivot DAX - Dynamic Ranking Per Group (Min Per Group)
The difference is the following: my data are not in the same table.
I have a fact table like this:
-------------------------------------------------------------------------------------
ClientID | ProductID | Transaction Date | Sales
------------------------------------------------------------------------------------
C1 P3 1/1/2012 $100
C2 P1 8/1/2012 $150
C3 P4 9/1/2012 $200
C1 P2 3/5/2012 $315
C2 P2 9/5/2012 $50
C3 P2 12/9/2012 $50
------------------------------------------------------------------------------------
A Customer table
-------------------------------------------------------------------------------------
ClientID | ClientCountry |
C1 France
C2 France
C3 Germany
------------------------------------------------------------------------------------
...and also a Product table
-------------------------------------------------------------------------------------
ProductID | ProductSubCategory |
P1 SB1
P2 SB1
P3 SB2
P4 SB3
------------------------------------------------------------------------------------
So here is my pivot table visualization:
-------------------------------------------------------------------------------------
ProductSubCategory | Sales
SB1 565 (150 + 315 + 50 + 50)
SB2 100
SB3 200
And the measure I'm looking for should perform like this:
-------------------------------------------------------------------------------------
ProductSubCategory | Sales | Rank
SB1 565 (150 + 315 + 50 + 50) 1
SB2 100 3
SB3 200 2
...simple: I browse my cube in Excel, put ProductSubCategory on rows and the sum of Sales as values, and expect my measure to give me the correct ranking by ProductSubCategory.
Now, the scenario also includes using a slicer on ClientCountry.
So when I select 'France', I expect my measure to give an adapted ranking, only including ProductSubCategories for clients living in France (so C1 and C2).
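(To make the expected behaviour concrete, here is the same logic sketched in pandas rather than DAX, with the three tables above and a hypothetical country filter standing in for the slicer; the transaction dates are omitted for brevity.)

import pandas as pd

fact = pd.DataFrame({
    "ClientID":  ["C1", "C2", "C3", "C1", "C2", "C3"],
    "ProductID": ["P3", "P1", "P4", "P2", "P2", "P2"],
    "Sales":     [100, 150, 200, 315, 50, 50],
})
customer = pd.DataFrame({"ClientID": ["C1", "C2", "C3"],
                         "ClientCountry": ["France", "France", "Germany"]})
product = pd.DataFrame({"ProductID": ["P1", "P2", "P3", "P4"],
                        "ProductSubCategory": ["SB1", "SB1", "SB2", "SB3"]})

def ranked_sales(country=None):
    df = fact.merge(customer, on="ClientID").merge(product, on="ProductID")
    if country is not None:                      # the "slicer"
        df = df[df["ClientCountry"] == country]
    out = df.groupby("ProductSubCategory", as_index=False)["Sales"].sum()
    out["Rank"] = out["Sales"].rank(ascending=False, method="min").astype(int)
    return out

print(ranked_sales())          # SB1=565 rank 1, SB3=200 rank 2, SB2=100 rank 3
print(ranked_sales("France"))  # only C1 and C2 contribute to the ranking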
I have tried a lot of solutions but without any result. Does anyone have an idea for this kind of scenario?
I greatly appreciate your help with this!
Thanks, all.