DAX Power Pivot measure – last previous value with where condition - dax

I am looking for a DAX measure to calculate volume = quantity * price, where the price is the last previous price for a given product.
In other words, I am looking for a DAX measure with last previous value with a "where" condition.
Take this example from the attached workbook:
I have 3 products:
Apples
Bananas
Oranges
Each of these has a USD price and volume is simply quantity * price.
However, oranges can also be exchanged for apples!
To sum up the value of these orange-for-apple exchange transactions with the other USD transactions, I first need to calculate the USD value of the oranges, and for this I need to know the last price paid for an apple, i.e. last previous price, where product = apple.
Take this example from the attached workbook:
The last previous price paid for an apple was USD 5
The total USD price (volume) for 10 apples sold is: 10*50= USD50
Subsequently 3 oranges were exchanged for apples, at a rate of 4 apples per orange
The total USD price (volume) for 3 oranges is: 3x4x5= USD60, i.e. # number of oranges * ratio oranges to apples * last previous price for an apple
Total transaction volume = 50 + 60 = USD 110
There are a few more examples in this sample file:
https://docs.google.com/spreadsheets/d/1PTaKg9a3Yv1um2RTnpeYC4gdLVjQXEzl/edit?usp=sharing&ouid=106440602605717108817&rtpof=true&sd=true
What I am looking for is a DAX formula that gives me the last previous value with a condition or filter or where clause.

The following works as a calculated column, but is expensive, because it uses EARLIER:
=
VAR Conditional_Volume = IF(Transactions[product] = "orange_apple",
CALCULATE(
MAX(Transactions[price]),
ALL(Transactions),
(Transactions[product]="apple"),
Transactions[transaction_id] < EARLIER(Transactions[transaction_id])
)*Transactions[price]*Transactions[quantity],Transactions[price]*Transactions[quantity])
RETURN Conditional_Volume

Related

SSAS Tabular - how to aggregate differently at month grain?

In my cube, I have several measures at the day grain that I'd like to sum at the day grain but average (or take latest) at the month grain or year grain.
Example:
We have a Fact table with Date and number of active subscribers in that day (aka PMC). This is snapshotted per day.
dt
SubscriberCnt
1/1/22
50
1/2/22
55
This works great at the day level. At the month level, we don't want to sum these two values (count = 105) because it doesn't make sense and not accurate.
when someone is looking at month grain, it should look like this - take the latest for the month. (we may change this to do an average instead, management is still deciding)
option 1 - Take latest
Month-Dt
Subscribers
Jan-2022
55
Feb-2022
-
option 2 - Take aveage
Month-Dt
Subscribers
Jan-2022
52
Feb-2022
-
I've not been able to find the right search terms for this but this seems like a common problem.
I added some sample data at the end of a month for testing:
dt
SubscriberCnt
12/30/21
46
12/31/21
48
This formula uses LASTNONBLANKVALUE, which sorts by the first column and provides the latest value that is not blank:
Monthly Subscriber Count = LASTNONBLANKVALUE( 'Table'[dt], SUM('Table'[SubscriberCnt]) )
If you do an AVERAGE, a simple AVERAGE formula will work. If you want an average just for the current month, then try this:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] )
RETURN IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) )
But the total row will be misleading, so I would add this so the total row is the latest number:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] ) //Get the number on the last day of the month
VAR _TOT = NOT HASONEVALUE(DateDim[MonthNo]) // Check if this is a total row (more than one month value)
RETURN IF(_TOT, [Monthly Subscriber Count], // For total rows, use the latest nonblank value
IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) ) // For month rows, use final day if available, else use the average
)

Calculate percentage when an item can be assigned to multiple categories

this might be a stupid question but I'm blanking at the moment and can't find the answer in google.
I have the following example table
customer
bought
A
food
B
food,drink
C
drink
D
drink
now how do I calculate the percentage of customers that bought food/drink over total customers? what would be the best way to calculate this?
solution
% food = #customers who bought food / #total unique customers = 2 / 4 = 50%
% drink = #customers who bought drink / #total unique customers = 3 / 4 = 75%
the problem here is the total % exceeds 100%
solution - count customer B twice, once for drink and once for food
% food = #customers who bought food / #total customers = 2 / 5 = 40%
% drink = #customers who bought drink / #total customers = 3 / 5 = 60%
the total is 100% but is this the correct way to calc % in this case?
The issue is obviously that one customer can buy both food and drink and I'm not sure how to handle this case. Any help would be appreciated. Thank you!
UPDATE:
thanks for the answer. It makes sense to remove altogether the customers who bought both products. But now I'm wondering what happens if there are more than 2 categories?
Example as follows, now we have an additional product (ice cream) in the mix
customer
bought
A
food
B
food,drink
C
drink
D
drink
E
drink,ice cream
F
ice cream
G
food,drink,ice cream
I guess with the same logic we can remove customer G since they bought all the products? And how should we handle customer E and B?
I think that what you are looking for is simply to eliminate the number of customers that purchased both products.
If you are looking for unique combinations of customers who purchased only a specific product then:
3 unique customers bought a single unique item
1/3 food => 33%
2/3 drink => 66%
The ones who bought both products, don't matter in these calculations since they add the same percentage to both cases.
EDIT:
I used excel to help me out. I hope that is fine
I added your data into an excel sheet and created a crosstable pivot.
I've set the Customers as columns and the Products as rows, in order to see what each individual customer has purchased via the values field Count of Product
I've changed the count of Product into the formula for % of Total
in order to show me for each product, the percentage split between the customers, so for example if customer B bought both drink and food he will be shown as 50% in both corresponding rows. If he bought all 3 then 33%
The grand total column, will contains the final percentage for each product row item. Excel calculates it the same way as states in some comments, it counts the products total instead of the customers total, which makes sense since that is your data set being consumed, and not the customers themselves.
If we swap the rows and columns around and do the same calculations, we see individual percentages for each customer (how much % of product they each individually purchased) and we can sum them up for each product. The result is the same

How to calculate the total revenue of a day using vb.net?

Currently I'm developing a vb.net program which is based on 'Movie Ticket Purchasing System'. Right now I'm trying to develop a code where, once the cinema is closed for the day, the system will calculate the total amount of collections in both movie ticket purchases and snack/drink purchases.
Let's say that:
TicketPrice - total price of movie tickets purchased
SnackPrice - total price of snacks/drinks purchased
TotalCinemaPrice - TicketPrice + SnackPrice
So, "TotalDayRevenue", for example, will be the total number of tickets, snacks and drinks sold throughout the day.
Also, I know that 'For Loop' will be used in order to get this kind of value, but I don't know how to efficiently, and properly write the code.
Help is much appreciated, thank you.
First create a variable to store the total day revenue:
double totalAmount = 0;
Then get the amount of tickets and snacks you've sold and multiply that with the price. Then add it to the total day revenue:
totalAmount += amountOfTicketsSold * priceForOneTicket;
totalAmount += amountOfSnacksSold * priceForOneSnack;
Kind regards

Time aligned variable in stata

I have a panel data set for multiple waves (13) for roughly 10,000 individuals each year, with people entering and exiting at various time points. I am interested in what happens as people become diagnosed with a disease over time. Therefore I need to recode the time variable so that it becomes t=0 the first wave when diagnosed, then t=1 is the next year and so on, so that all of my individuals are comparable (and I guess -1 for t-1 etc). However I am unsure about how to go about this in stata. Would anyone be able to advise? Many thanks
The case of one diagnosis per person
clear all
set more off
*----- example data -----
set obs 100
set seed 2357
generate id = _n
generate year = floor(10 * runiform()) + 1990
expand 5
bysort id: replace year = year + _n
bysort id (year): generate diag = cond(_n == 3, 1, 0)
list in 1/20, sepby(id)
*----- what you seek -----
bysort id (diag): gen time = year - year[_N]
sort id year
list in 1/20
I assume the same data structure as #RichardHerron and use his example. diag is an indicator variable that takes on the value of 1 at the time of diagnosis and 0 otherwise (only one diagnosis per person is considered).
The sorting done by bysort is critical. The observation holding the time of diagnosis is pushed to the end of the database (by id groups) and then all that's left to do is compare (subtract) all years with that reference year. See help _variables for details on system variables like _N.
The case of multiple diagnoses per person
If several diagnoses are made per person, but we care only for the first occurence (according to year), we could do:
gsort id diag -year
by id: gen time = year - year[_N]
Simple but not optimal solution
Suppose diagnosis is 1 when diagnosed (at most once per person) and 0 otherwise.
Then the time at diagnosis is at its simplest
egen time_diagnosis = total(diagnosis * year), by(id)
but you need to ignore any zeros. To spell that out,
replace time_diagnosis = . if time_diagnosis == 0
Better alternative
A more complicated but preferable alternative can handle multiple diagnoses if they occur:
egen time_diagnosis = min(year / diagnosis), by(id)
as year / diagnosis is year when diagnosis is 1 and missing otherwise. This yields missing values if there is no diagnosis, which is as it should be.
Then you subtract that to get a new time variable.
gen time2 = time - time_diagnosis
In short, I think you can get this done in two statements, handling panel structure too.
Update
#Richard Herron asks why use egen with by(), and not just
gen time_diagnosis = time * diagnosis
A limitation of that is that the "correct" value is contained only in those observations for which diagnosis is 1; that value still has to be "spread" to other values for the same id. But that is precisely what egen does here. In the simplest situation, with one diagnosis the total of time * diagnosis is just time * 1 or time, as any zeros make no difference to the sum.
It is usually helpful to provide test data, but here they are easy enough to generate. The trick is to find the first year for each individual (my fyear), which I'll do with min() from egen. Then I'll subtract this first year fyear from the actual year to find the year relative to diagnosis ryear.
/* generate panel */
clear
set obs 10000
generate id = _n
generate year = floor(10 * runiform()) + 1990
expand 10
bysort id: replace year = year + _n
sort id year
list in 1/20
/* generate relative year */
bysort id: egen fyear = min(year)
generate ryear = year - fyear
list in 1/20
If the first year in the panel is not diagnosis, then just construct fyear based on diagnosis criteria.
Edit: Thinking more on this, maybe it's the last part that you're having a hard time with (i.e., identifying the diagnosis year to subtract from the calendar year). Here's what I would do.
bysort id (year): generate diagnosis = cond(_n == 5, 1, 0)
preserve
tempfile diagnosis
keep if (diagnosis == 1)
rename year dyear
keep id dyear
save `diagnosis'
restore
merge m:1 id using `diagnosis', nogenerate
generate ryear2 = year - dyear

How do I select the max child value in ActiveRecord?

I'm not even sure how to word this, so an example:
I have two models,
Chicken
id
name
EggCounterReadings
id
chicken_id
value_on_counter
timestamp
I don't always record a count for every chicken when I do counts.
Using ActiveRecord how do I get the latest egg count per chicken?
So if I have 1 chicken and 3 counts, the counts would be 1 today, 15 tomorrow, and 18 the next day. That chicken has laid 18 eggs, not 34
UPDATE: Found exactly what I was trying to do in MySQL. Find "The Rows Holding the Group-wise Maximum of a Certain Column". So I need to .find_by_sql("SELECT * FROM (SELECT * FROM EggCounterReadings WHERE <conditions> ORDER BY timestamp DESC) GROUP BY chicken_id")
Given your updated question, I've changed my answer.
chicken = Chicken.first
count = chicken.egg_counter_readings.last.value_on_counter
If you don't want the latest record, but the largest egg yield, then try this:
chicken = Chicken.first
count = chicken.egg_counter_readings.maximum(value_on_counter)
I believe that should do what you want.

Resources