I'm trying to find the right way to structure a DAX formula to compute a specific average. I think I could construct the average more or less explicitly with a sum/count construction, but I'm wondering whether AVERAGEX with an appropriate set of table filters might get the job done.
Specifically, my problem can be explained like this: I'm trying to compute the average cost of a car in DAX, but my data includes the cost of all the components individually (call it body, wheels and engine for now).
Name Year Part Cost
Alice 2000 Engine $10
Alice 2000 Wheels $5
Alice 2000 Body $25
Alice 2001 Engine $8
Alice 2001 Wheels $6
Alice 2001 Body $2
Bob 2000 Engine $10
Bob 2000 Wheels $5
Bob 2000 Body $25
Bob 2001 Engine $8
Bob 2001 Wheels $6
Bob 2001 Body $2
Is there any way to tell DAX that I want to first sum across all the components of the car, and then compute averages on the data set where the dimensionality of the data has been reduced by one (only the "Part" dimension removed)?
For example, the average cost for Alice then would yield
((10+5+25)+(8+6+2))/2 = 28
While if I had a pivot table constructed per name and per year, it would show
Alice 2000 40
Alice 2001 16
etc...
Thanks.
Try this. It works as long as each Name, Year combination identifies a single car.
[nCombinations]:=COUNTROWS(SUMMARIZE(Table1,Table1[Name],Table1[Year]))
[TotalCost]:=SUM(Table1[Cost])
[AverageCost]:=CALCULATE([TotalCost]/[nCombinations])
Create a PivotTable with [Name] and [Year] on rows, then add [nCombinations], [TotalCost] and [AverageCost] in the body.
Row nCombinations TotalCost AverageCost
Alice 2 56 28
2000 1 40 40
2001 1 16 16
Bob 2 56 28
2000 1 40 40
2001 1 16 16
Grand Total 4 112 28
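The sum-then-average logic behind these measures can also be sketched outside DAX. The Python below only illustrates the arithmetic and is not a DAX construct: the `car_rows` list and the function name `average_car_cost` are made up for this example, and it assumes, as the answer does, that each (Name, Year) pair identifies one car.

```python
from collections import defaultdict

# Sample rows from the question: (name, year, part, cost)
car_rows = [
    ("Alice", 2000, "Engine", 10), ("Alice", 2000, "Wheels", 5), ("Alice", 2000, "Body", 25),
    ("Alice", 2001, "Engine", 8), ("Alice", 2001, "Wheels", 6), ("Alice", 2001, "Body", 2),
    ("Bob", 2000, "Engine", 10), ("Bob", 2000, "Wheels", 5), ("Bob", 2000, "Body", 25),
    ("Bob", 2001, "Engine", 8), ("Bob", 2001, "Wheels", 6), ("Bob", 2001, "Body", 2),
]

def average_car_cost(rows, name=None):
    """Sum the parts of each (name, year) car, then average those car totals."""
    totals = defaultdict(float)
    for n, year, _part, cost in rows:
        if name is None or n == name:
            totals[(n, year)] += cost  # collapse the Part dimension
    if not totals:
        return None  # no matching cars
    return sum(totals.values()) / len(totals)

print(average_car_cost(car_rows, "Alice"))  # 28.0
print(average_car_cost(car_rows))           # 28.0 (grand total row)
```

The dictionary keyed by (name, year) plays the role of the SUMMARIZE table, and its length corresponds to [nCombinations].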
I have a Power BI matrix and I'm trying to add 3 custom rows at the end of each group but can't figure out how to do so. Below is what the matrix looks like.
Salesperson     Total Units Sold
John
  Apples        10
  Oranges       5
  Spoilage      2
Katie
  Mangoes       12
  Apples        9
  Pears         15
  Spoilage      1
And I'm trying to get a Total, Net and Percentage into the matrix as shown below. Total Fruits is the sum of all the rows above it, excluding the Spoilage row. Net is Total Fruits minus Spoilage, and Percentage (Pct) is Spoilage divided by Total Fruits.
Salesperson     Total Units Sold
John
  Apples        10
  Oranges       5
  Total Fruits  15
  Spoilage      2
  Net           13
  Pct           13.3%
Katie
  Mangoes       12
  Apples        9
  Pears         15
  Total Fruits  36
  Spoilage      1
  Net           35
  Pct           2.8%
I have a fact table that records each fruit sold by the product code and the salesperson id and dimension tables for the salesperson and the products.
I'm new to PowerBI and so I would appreciate all the details to make this work.
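To pin down the arithmetic being asked for, here is a small Python sketch of the three custom rows. It is not Power BI or DAX code. The `sales` list, the function name `summarize` and the assumption that spoilage rows are identified by the item label "Spoilage" are all made up for illustration.

```python
# (salesperson, item, units) rows copied from the matrix in the question
sales = [
    ("John", "Apples", 10), ("John", "Oranges", 5), ("John", "Spoilage", 2),
    ("Katie", "Mangoes", 12), ("Katie", "Apples", 9), ("Katie", "Pears", 15),
    ("Katie", "Spoilage", 1),
]

def summarize(sales, spoilage_label="Spoilage"):
    """Return Total Fruits, Spoilage, Net and Pct for each salesperson."""
    summary = {}
    for person, item, units in sales:
        entry = summary.setdefault(person, {"Total Fruits": 0, "Spoilage": 0})
        key = "Spoilage" if item == spoilage_label else "Total Fruits"
        entry[key] += units
    for entry in summary.values():
        total = entry["Total Fruits"]
        entry["Net"] = total - entry["Spoilage"]            # fruits left after spoilage
        entry["Pct"] = entry["Spoilage"] / total if total else None
    return summary

for person, s in summarize(sales).items():
    print(person, s["Total Fruits"], s["Spoilage"], s["Net"], f"{s['Pct']:.1%}")
```

In a real model the same three quantities would be computed per salesperson from the fact table; the sketch only fixes what each row should contain.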
I have 10 players on a team.
My team of players needs to purchase a total of 10 bats and 10 balls.
They can purchase:
A bat and a ball
Two bats
Two balls
They cannot buy 3 bats, or 3 balls, or any other combination. Two items only.
The Seller has 10 different balls, and 10 different bats, all with different prices.
Once 1 bat is sold, then that would be removed from the list.
The Buyer can go into debt.
The Seller does not have any change (even after a purchase).
If the Buyer spends 100 on a ball worth 10, he does NOT get 90 back.
I do have access to how much money each player has, as well as the value of how much the Seller will sell each bat and ball for.
Buyer - Name, Ball Purchase Price, Bat Purchase Price
Alpha 10 15
Bravo 20 20
Charlie 30 30
Delta 40 40
Echo 50 50
Foxtrot 60 60
Golf 70 70
Hotel 80 80
India 90 95
Juliett 99 99
Seller - Ball Name, Ball Price
A 10
B 20
C 30
D 40
E 50
F 60
G 70
H 80
I 90
J 99
Seller - Bat Name, Bat Price
A 99
B 95
C 80
D 70
E 60
F 50
G 40
H 30
I 20
J 15
In this example Alpha should purchase Ball-A and Bat-J, and Juliett should purchase Ball-J and Bat-A.
I am trying to find an optimized way of figuring out which player should buy which items so that the team saves the most money, or ends up the least in debt.
How do you find the smallest difference in mapping elements of a Primary List to two Secondary Lists?
In a more complex scenario, how do I find out which Buyer should purchase which items?
In other examples, the seller might have a very expensive store where the players might have to go into debt.
Searching this type of question was difficult, as I am trying to find the smallest differences between my primary list, and two secondary lists, where the primary list can compare the same element twice.
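As a starting point, here is a Python sketch of a deliberately simplified version of the problem. It assumes every player buys exactly one ball and one bat, and that the cost of a pairing is the absolute difference between the player's amount and the item's price. Under those assumptions, pairing the sorted amounts with the sorted prices minimises the total difference. The two-bats and two-balls options are not handled here and would need a general assignment solver. The function name `match_sorted` and the dictionary names are made up for this example.

```python
def match_sorted(budgets, prices):
    """Pair buyers with items by sorting both sides on amount.

    budgets: dict buyer -> amount, prices: dict item -> price.
    Returns (pairs, total_absolute_difference).
    """
    if len(budgets) != len(prices):
        raise ValueError("need the same number of buyers and items")
    buyers = sorted(budgets, key=budgets.get)
    items = sorted(prices, key=prices.get)
    pairs = dict(zip(buyers, items))
    total = sum(abs(budgets[b] - prices[i]) for b, i in pairs.items())
    return pairs, total

names = ["Alpha", "Bravo", "Charlie", "Delta", "Echo",
         "Foxtrot", "Golf", "Hotel", "India", "Juliett"]
ball_budget = dict(zip(names, [10, 20, 30, 40, 50, 60, 70, 80, 90, 99]))
bat_budget = dict(zip(names, [15, 20, 30, 40, 50, 60, 70, 80, 95, 99]))
ball_price = dict(zip("ABCDEFGHIJ", [10, 20, 30, 40, 50, 60, 70, 80, 90, 99]))
bat_price = dict(zip("ABCDEFGHIJ", [99, 95, 80, 70, 60, 50, 40, 30, 20, 15]))

ball_pairs, ball_diff = match_sorted(ball_budget, ball_price)
bat_pairs, bat_diff = match_sorted(bat_budget, bat_price)
print(ball_pairs["Alpha"], bat_pairs["Alpha"])      # A J
print(ball_pairs["Juliett"], bat_pairs["Juliett"])  # J A
print(ball_diff + bat_diff)                         # 0
```

On the example data this reproduces the expected result: Alpha gets Ball-A and Bat-J, and Juliett gets Ball-J and Bat-A.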
I need help with a particular issue in Stata. I have a panel dataset by id and year, from 1996 to 2018.
The panel data is a combination of world countries and regions, yearly observations, for 7 different crops, area cultivated.
I would like to create a mean around years 2000, 2010 and 2018, so that mean(year2000)= mean of (1999+2000+2001), mean(year2010)=mean from (2009+2010+2011) and mean(year2018)= mean from (2016+2017+2018) for every crop from my 7 crops selection.
The problem gets more complicated when I need to combine some countries to form sub-regions: say I need the sub-region RUS1 = Russia + Ukraine. How can I create another variable that holds, on a yearly basis, the total of crop1 area cultivated in Russia plus crop1 area cultivated in Ukraine? That is, another variable that shows these sums for each year, using the above means.
I've tried with by id year: egen area_rus1=total(area) if area=="Russia" & area=="Ukraine"
but nothing works.
Because the names in area are strings, I used encode (area), gen (area2), and Stata automatically generates a number.
To create a panel dataset I've used gen id=area2+itemcode
The panel data looks like this after sort year
Please be aware that the period is 1996-2018. The example above shows only year 1996.
This didn't get much of a response, for several reasons:
You didn't show very much code.
You didn't show data in a form that is especially useful. An image can't be copied and pasted easily into someone's Stata to allow experiment. In fact your image shows variables that are irrelevant and variables that are different versions of each other and so is much more complicated than we need.
You escalated the question to ask the most complicated version of what you want to know.
There is a problem you should have explained better. area is string and so totals can't be calculated at all and area2 is just arbitrary integers so totals can be calculated but don't make sense. "nothing works" is not informative as a problem report. The only totals that make sense to me are totals of value.
You need to simplify your problem first and then build up.
The essence seems to be as follows:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 country str6 item float year str1 region float value
"A" "barley" 1999 "X" 1
"B" "barley" 1999 "X" 2
"C" "barley" 1999 "Y" 3
"A" "barley" 2000 "X" 4
"B" "barley" 2000 "X" 5
"C" "barley" 2000 "Y" 6
"A" "barley" 2001 "X" 7
"B" "barley" 2001 "X" 8
"C" "barley" 2001 "Y" 9
end
* means by countries: similar variables for other periods
egen mean_9901_c = mean(cond(inrange(year, 1999, 2001), value, .)), by(country item)
* aggregation to regions, but ensure that you don't double count
egen value_region = total(value), by(region item year)
egen tag = tag(region item year)
* means by regions: similar variables for other periods
egen mean_9901_r = mean(cond(tag == 1 & inrange(year, 1999, 2001), value_region, .)), by(region item)
list, sepby(year)
     +---------------------------------------------------------------------------------+
     | country     item   year   region   value   mean_9~c   value_~n   tag   mean_9~r |
     |---------------------------------------------------------------------------------|
  1. |       A   barley   1999        X       1          4          3     1          9 |
  2. |       B   barley   1999        X       2          5          3     0          9 |
  3. |       C   barley   1999        Y       3          6          3     1          6 |
     |---------------------------------------------------------------------------------|
  4. |       A   barley   2000        X       4          4          9     1          9 |
  5. |       B   barley   2000        X       5          5          9     0          9 |
  6. |       C   barley   2000        Y       6          6          6     1          6 |
     |---------------------------------------------------------------------------------|
  7. |       A   barley   2001        X       7          4         15     1          9 |
  8. |       B   barley   2001        X       8          5         15     0          9 |
  9. |       C   barley   2001        Y       9          6          9     1          6 |
     +---------------------------------------------------------------------------------+
The example shows just one item, but the code should work for several.
The example shows fake data for just three years, but means for other periods can be constructed similarly.
Results are repeated for all observations to which they apply. To see or use results just once, use if. For example the means over 1999 to 2001 are shown for each of those years (and others) but if year == 1999 would be a way to see results just once.
See also help collapse, help egen for its tag() function and this paper.
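For readers who want to check the regional figures by hand, the two-step logic (total by region and year first, then average those totals over the window) can be sketched in Python. This is only a cross-check of the arithmetic, not a replacement for the Stata code. The names `example_rows` and `region_window_means` are made up for this illustration.

```python
from collections import defaultdict

# (country, item, year, region, value) rows from the example dataset
example_rows = [
    ("A", "barley", 1999, "X", 1), ("B", "barley", 1999, "X", 2), ("C", "barley", 1999, "Y", 3),
    ("A", "barley", 2000, "X", 4), ("B", "barley", 2000, "X", 5), ("C", "barley", 2000, "Y", 6),
    ("A", "barley", 2001, "X", 7), ("B", "barley", 2001, "X", 8), ("C", "barley", 2001, "Y", 9),
]

def region_window_means(rows, first_year, last_year):
    """Mean of yearly regional totals, by (region, item), within a year window."""
    # Step 1: one total per (region, item, year), so each region-year counts once
    totals = defaultdict(float)
    for _country, item, year, region, value in rows:
        totals[(region, item, year)] += value
    # Step 2: average those totals over the years in the window
    by_region = defaultdict(list)
    for (region, item, year), total in totals.items():
        if first_year <= year <= last_year:
            by_region[(region, item)].append(total)
    return {key: sum(vals) / len(vals) for key, vals in by_region.items()}

print(region_window_means(example_rows, 1999, 2001))
# {('X', 'barley'): 9.0, ('Y', 'barley'): 6.0}
```

Keying the totals by (region, item, year) does the same job as the tag variable: each regional total enters the mean exactly once.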
What was wrong with your code
Your problems start with
if area=="Russia" & area=="Ukraine"
which selects observations for which it is true that area is both "Russia" and "Ukraine" in the same observation, which is impossible. You need the | (or) operator there, not the & operator, or to approach the problem in another way.
The prefix id is wrong too. Using by id: enforces separate calculations for different values of id and is going to make the combinations of identifiers impossible.
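The difference between the two conditions is easy to see in a small Python sketch. The figures below are hypothetical and exist only for illustration, as do the names `area_rows` and `subregion_totals`. The point is that a row holds exactly one area, so an "and" test on two different names selects nothing, while an "or" test (or a membership test) selects both countries.

```python
from collections import defaultdict

# Hypothetical figures, for illustration only: (area, year, value)
area_rows = [
    ("Russia", 1996, 5.0), ("Ukraine", 1996, 3.0), ("France", 1996, 4.0),
    ("Russia", 1997, 6.0), ("Ukraine", 1997, 2.0), ("France", 1997, 4.5),
]

# "and" can never be true here: no single row is both Russia and Ukraine
impossible = [r for r in area_rows if r[0] == "Russia" and r[0] == "Ukraine"]
print(len(impossible))  # 0

def subregion_totals(rows, members):
    """Yearly total of value over the areas listed in members."""
    totals = defaultdict(float)
    for area, year, value in rows:
        if area in members:  # same as: area == "Russia" or area == "Ukraine"
            totals[year] += value
    return dict(totals)

print(subregion_totals(area_rows, {"Russia", "Ukraine"}))
# {1996: 8.0, 1997: 8.0}
```

Note that the totals are grouped by year only, not by a per-country identifier, which is why grouping by id would defeat the purpose.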
Week Sales
1 100
2 250
3 350
4 145
5 987
6 26
7 32
8 156
I wanted to calculate the sales only for the last 3 weeks so the total will be 156+32+26.
If new weeks are added it should automatically calculate only the data from the last 3 rows.
Tried this formula but it is returning an incorrect sum
sum(sales) over (lastperiod(3(week))
https://i.stack.imgur.com/6Y7h7.jpg
If you want only the sum of the last 3 weeks in a calculated column, you can use a simple If calculation.
If([week]>(Max([week]) - 3),Sum([sales]),0)
If you need the 3-week calculation throughout the table, use the one below.
sum([sales]) OVER (LastPeriods(3,[week]))
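To make the expected result concrete, here is the same "last 3 weeks" total as a short Python sketch. It is not expression syntax for the tool used above. The names `sales_by_week` and `last_n_weeks_total` are made up for this illustration, and it assumes one row per week.

```python
def last_n_weeks_total(sales_by_week, n=3):
    """Sum sales over the n highest week numbers present in the data."""
    latest_weeks = sorted(sales_by_week)[-n:]
    return sum(sales_by_week[week] for week in latest_weeks)

sales_by_week = {1: 100, 2: 250, 3: 350, 4: 145, 5: 987, 6: 26, 7: 32, 8: 156}
print(last_n_weeks_total(sales_by_week))  # 214, i.e. 26 + 32 + 156

# When a new week is added, the window moves automatically
sales_by_week[9] = 50
print(last_n_weeks_total(sales_by_week))  # 238, i.e. 32 + 156 + 50
```

Because the window is taken from the highest week numbers rather than from fixed rows, new weeks shift it without any change to the calculation.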
How can I calculate the rank of each candidate when I have the total candidates and the votes secured by each?
I've managed the percentage part, but calculating the rank has me stuck.
I'll be using MySQL in the end for this, but right now I only need the formula or method to calculate ranks.
I'd be glad if you could help with just the formula, just like the formula for interest is PTR/100.
Total Candidates: 5
Total Votes: 75
Votes
Name  Votes  Percentage  Rank (what I'm trying to calculate)
A     25     33.33       1/5   -> Rank 1/5 has the most votes
B     20     26.67       2/5   -> And so on
C     10     13.33       4/5
D     5      6.67        5/5
E     15     20.00       3/5
There is a previous question on SO that addresses this, using MySQL and a ranking variable. There is some lovely stuff in the answers:
MySQL rank function
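As for the formula itself, one common definition is: a candidate's rank is 1 plus the number of candidates with strictly more votes. The Python sketch below applies that definition to the figures in the question. It is an illustration of the method, not MySQL code, and the function name `rank_candidates` and the tie handling (tied candidates share a rank) are assumptions of this example.

```python
def rank_candidates(votes):
    """Return {name: (rank, percentage)} for a dict of name -> votes.

    rank = 1 + number of candidates with strictly more votes,
    so candidates with equal votes share the same rank.
    """
    total = sum(votes.values())
    result = {}
    for name, count in votes.items():
        rank = 1 + sum(1 for other in votes.values() if other > count)
        pct = round(100 * count / total, 2) if total else 0.0
        result[name] = (rank, pct)
    return result

votes = {"A": 25, "B": 20, "C": 10, "D": 5, "E": 15}
for name, (rank, pct) in rank_candidates(votes).items():
    print(f"{name}: {pct:.2f}% rank {rank}/{len(votes)}")
```

For the sample data this gives ranks 1, 2, 4, 5 and 3 for A to E. The same counting rule can be expressed in SQL by counting the rows with a higher vote total.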