I'm combining reports to create a single general ledger report; currently it consists of 84 separate reports. My idea is that the end user will have a drop-down for the department and month. These are my columns:
Account Number, Account Description, Current Period Actual,
YTD Actual, YTD Budget, YTD Budget Variance,
Total YR Budget, Account Status
I have most of it figured out, but I can't work out how to calculate YTD Actual and YTD Budget, since these require a sum of multiple fields depending on which month and department are selected.
My WHERE clause goes something like this; it takes care of the Current Period Actual and the Account Number:
Where
( gl_master.acct_cde = gl_master_comp_v.acct_cde ) and
( gl_master.budget_officer = budget_off_mstr.budget_officer ) and
( ( gl_master_comp_v.acct_comp1 = '01' ) AND
( budget_off_mstr.budget_officer IN (@BudgetOfficer) ) ) AND
((@Month = 1 AND gl_master.post_bal_mon_1) OR
(@Month = 2 AND gl_master.post_bal_mon_2)...
How can I have the query recognize what needs to go into the column when multiple fields are being summed?
Thanks for any insight. If you have made something like this before, a small sample would be very helpful.
I worked it out.
It needs to be done in a calculated field within the dataset. A sum can then be done on that field in the textbox.
A small chunk of the calculated field:
=Cdec(Switch(Parameters!Month.Value = 1, Fields!post_bal_mon_1.Value,
Parameters!Month.Value = 2, Fields!post_bal_mon_2.Value + Fields!post_bal_mon_1.Value,
Parameters!Month.Value = 3, Fields!post_bal_mon_3.Value + Fields!post_bal_mon_1.Value + Fields!post_bal_mon_2.Value,
Parameters!Month.Value = 4, Fields!post_bal_mon_4.Value + Fields!post_bal_mon_1.Value + Fields!post_bal_mon_2.Value + Fields!post_bal_mon_3.Value,...))
Textbox for the sum (I place this in the footer):
=SUM(Fields!post_bal_mon_1.value, "DataSet1")
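To make the logic of the Switch expression easier to see outside Reporting Services, here is a minimal Python sketch. The helper name ytd_actual and the sample row are hypothetical; only the post_bal_mon_N field names come from the question. It adds the monthly balances from month 1 up to the selected month, which is what each Switch branch spells out by hand.

```python
from decimal import Decimal


def ytd_actual(row, month):
    """Sum post_bal_mon_1 .. post_bal_mon_<month> for one ledger row."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return sum(
        (Decimal(str(row[f"post_bal_mon_{m}"])) for m in range(1, month + 1)),
        Decimal("0"),
    )


# Hypothetical row: 100.00 posted in every month of the year.
sample_row = {f"post_bal_mon_{m}": "100.00" for m in range(1, 13)}

print(ytd_actual(sample_row, 3))  # 300.00
```

The same idea would apply to YTD Budget, assuming the budget is stored in a parallel set of monthly fields.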
In my cube, I have several measures at the day grain that I'd like to sum at the day grain but average (or take latest) at the month grain or year grain.
Example:
We have a fact table with the date and the number of active subscribers on that day (aka PMC). This is snapshotted per day.
dt        SubscriberCnt
1/1/22    50
1/2/22    55
This works great at the day level. At the month level, we don't want to sum these two values (count = 105) because that doesn't make sense and isn't accurate.
When someone is looking at the month grain, it should look like this: take the latest value for the month. (We may change this to an average instead; management is still deciding.)
Option 1 - Take latest
Month-Dt    Subscribers
Jan-2022    55
Feb-2022    -
Option 2 - Take average
Month-Dt    Subscribers
Jan-2022    52
Feb-2022    -
I've not been able to find the right search terms for this but this seems like a common problem.
I added some sample data at the end of a month for testing:
dt          SubscriberCnt
12/30/21    46
12/31/21    48
This formula uses LASTNONBLANKVALUE, which sorts by the first column and provides the latest value that is not blank:
Monthly Subscriber Count = LASTNONBLANKVALUE( 'Table'[dt], SUM('Table'[SubscriberCnt]) )
If you want an average, a simple AVERAGE formula will work. If you want the end-of-month value where one exists and an average otherwise (for example, for the current, incomplete month), try this:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] )
RETURN IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) )
But the total row will be misleading, so I would add this so the total row is the latest number:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] ) //Get the number on the last day of the month
VAR _TOT = NOT HASONEVALUE(DateDim[MonthNo]) // Check if this is a total row (more than one month value)
RETURN IF(_TOT, [Monthly Subscriber Count], // For total rows, use the latest nonblank value
IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) ) // For month rows, use final day if available, else use the average
)
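As a rough cross-check of the two options ("take latest" versus "take average"), here is a small Python sketch over the sample rows above. It is only an illustration of the intended month-level results; it does not reproduce DAX filter context or CLOSINGBALANCEMONTH, and the function name monthly is hypothetical.

```python
from collections import defaultdict
from datetime import date

# Daily snapshots taken from the sample data in this thread.
snapshots = [
    (date(2021, 12, 30), 46),
    (date(2021, 12, 31), 48),
    (date(2022, 1, 1), 50),
    (date(2022, 1, 2), 55),
]


def monthly(rows, how="latest"):
    """Roll daily snapshot counts up to (year, month) without summing them."""
    by_month = defaultdict(list)
    for dt, count in sorted(rows):  # sort by date so the last item is the latest
        by_month[(dt.year, dt.month)].append(count)
    if how == "latest":
        return {month: counts[-1] for month, counts in by_month.items()}
    if how == "average":
        return {month: sum(counts) / len(counts) for month, counts in by_month.items()}
    raise ValueError("how must be 'latest' or 'average'")


print(monthly(snapshots, "latest"))   # {(2021, 12): 48, (2022, 1): 55}
print(monthly(snapshots, "average"))  # {(2021, 12): 47.0, (2022, 1): 52.5}
```

Note that the unrounded January average is 52.5; the table in the question shows it as 52.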
I want to calculate the total value between two dates in DAX. In Excel I have the formula SUM(C5:C16), where C5 is the sales amount for a specific date (last year + 1 month) and C16 is the sales amount for the current row's date.
I tried this formula in DAX, but it did not return the sum:
var Rolling = CALCULATE(sum('proces'[HOURS]),DATESINPERIOD('Date'[DateField],ENDOFMONTH('proces'[date_start]),-12,MONTH))
Also, I tried this one, but it is not working:
=SumX (
var prev=DATEADD(DATEADD('proces'[date_start] ,-1,YEAR),+1,MONTH)
return
Filter ( 'proces',
'proces'[date_end] <= Earlier ( 'proces'[date_end] ) &&
'proces'[date_start]>=prev,
'proces'[HOURS])
I also tried this one, but it returns nothing:
=CALCULATE(
SUMX('proces','proces'[HOURS]),
DATESBETWEEN(
'Date'[DateField],
STARTOFMONTH(DATEADD(LASTDATE('Date'[DateField]),-1,MONTH)),
ENDOFMONTH(DATEADD('Date'[DateField],-1,MONTH))
)
)
You seem to be confused about variables in DAX and your formulas are not even valid DAX expressions. Learn about variables in the official documentation:
Use variables to improve your DAX formulas
If you need further help with calculating your total amount, add sample data to your question.
I am trying to calculate month over month difference but it makes data negative.
I created a measure, but it makes source data negative.
CALCULATE (
COUNTA ( SOURCE_DATA[COLUMN] ),
FILTER ( SOURCE_DATA, SOURCE_DATA[YYYYMM] = "201906" )
)
- (
CALCULATE (
COUNTA ( SOURCE_DATA[COLUMN] ),
FILTER ( SOURCE_DATA, SOURCE_DATA[YYYYMM] = "201905" )
)
)
The outcome is correct, but it changes data in previous month to negative.
This is due to the filter context and the way you've written the measure.
Look at the visual table. For the cell in column 201905 and row GA you get -16 813. This is because the context of the visual tells CALCULATE to evaluate COUNTA(SOURCE_DATA[COLUMN]) only where MtM = GA and the column is 201905. However, by adding the FILTER you tell CALCULATE to keep these criteria AND also make sure that SOURCE_DATA[YYYYMM] = "201906" in the first CALCULATE and "201905" in the second one.
This results in the first CALCULATE looking for rows where the month is both 201905 and 201906 at the same time. In other words, you generate a Venn diagram with no overlap. Therefore the first CALCULATE evaluates to 0 and the second to 16 813, so the measure actually evaluates 0 - 16 813 = -16 813.
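The intersection described above can be mimicked in a few lines of Python. This is only a sketch of the reasoning, not of the DAX engine: the row counts (3 and 5) are made up, the function names are hypothetical, and an empty result is treated as 0.

```python
# Hypothetical fact rows: 3 in May 2019 and 5 in June 2019, all for row label "GA".
source_data = (
    [{"YYYYMM": "201905", "MtM": "GA"} for _ in range(3)]
    + [{"YYYYMM": "201906", "MtM": "GA"} for _ in range(5)]
)


def count_where(rows, yyyymm):
    """Count the rows whose YYYYMM equals the given month."""
    return sum(1 for r in rows if r["YYYYMM"] == yyyymm)


def measure_for_cell(rows, cell_month, cell_mtm):
    # Step 1: the visual's filter context narrows the table to one cell.
    visible = [r for r in rows if r["YYYYMM"] == cell_month and r["MtM"] == cell_mtm]
    # Step 2: FILTER only iterates the visible rows, so each month test
    # is intersected with the cell's own month.
    return count_where(visible, "201906") - count_where(visible, "201905")


print(measure_for_cell(source_data, "201905", "GA"))  # -3
print(measure_for_cell(source_data, "201906", "GA"))  # 5
```

In the May cell the June count is empty, so the result is just the negated May count, which is the pattern seen in the question.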
Since you didn't post any description of your data model, I can only guess what it looks like. However, since you're filtering on the SOURCE_DATA table, I guess you don't use a calendar table. You should! Have a calendar table with a 1:* (one-to-many) relationship to SOURCE_DATA and do the filtering on the calendar. In addition, you can add dynamically calculated day/week/month/year offsets so that you can create measures which don't have to be updated when there's a new month.
I think this video can be helpful: sqlbi videolecture
Also, have a look at this article: sqlbi filter in calculate
I am a newbie to Power BI and DAX.
I have a dataset as attached. I need to find the maximum value for each person for each week. I have written the formula in Excel.
=MAX(IF(A$2:A$32=A2,IF(D$2:D$32=D2,IF(B$2:B$32=1,C$2:C$32))))
How can I convert it to DAX or write the same formula in Power BI? I tried the DAX code below, but it did not work (the ALLEXCEPT function expects a table).
Weekly Maximum =
CALCULATE ( MAX ( PT[Value] ), ALLEXCEPT ( PT, PT[person], PT[Week],
PT[category] ==1 ) )
Once I have calculated this, I need to calculate the expected value for each week, which is the previous week's maximum value * 2.85, as shown in the screenshot. How can I use the previous week's maximum value for this week?
Any corrections/solutions, please?
TIA
The Max Value for Category 1 can be written like this:
= CALCULATE(MAX(PT[Value]),
ALLEXCEPT(PT, PT[Person], PT[Week]),
PT[Category] = 1)
(The Category filter doesn't go inside ALLEXCEPT().)
For your Expected Value column, you can do something similar:
= CALCULATE(2.85 * MAX(PT[Value]),
ALLEXCEPT(PT, PT[Person]),
PT[Category] = 1,
PT[Week] = EARLIER(PT[Week]) - 1)
(The EARLIER function gives you the value for the row you are in. The name refers to the earlier row context.)
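For readers who want to check the two calculated columns against plain data, here is a small Python sketch of the same logic. The sample rows and function names are hypothetical; only the column names and the 2.85 factor come from the question.

```python
FACTOR = 2.85  # multiplier taken from the question

# Hypothetical sample rows; only the column names follow the question.
pt = [
    {"person": "Ann", "week": 1, "category": 1, "value": 10.0},
    {"person": "Ann", "week": 1, "category": 1, "value": 14.0},
    {"person": "Ann", "week": 1, "category": 2, "value": 99.0},
    {"person": "Ann", "week": 2, "category": 1, "value": 20.0},
    {"person": "Bob", "week": 2, "category": 1, "value": 7.0},
]


def weekly_maximum(rows):
    """Max value per (person, week), looking at category 1 rows only."""
    best = {}
    for r in rows:
        if r["category"] != 1:
            continue
        key = (r["person"], r["week"])
        best[key] = max(r["value"], best.get(key, r["value"]))
    return best


def expected_value(rows, person, week):
    """Previous week's maximum * 2.85, or None if there is no previous week."""
    previous = weekly_maximum(rows).get((person, week - 1))
    return None if previous is None else FACTOR * previous


print(weekly_maximum(pt))
print(expected_value(pt, "Ann", 2))
print(expected_value(pt, "Bob", 2))
```

Here the category 2 row is ignored, and a person with no data in the previous week gets no expected value, which roughly corresponds to a blank in the DAX column.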
I have a panel data set with multiple waves (13) and roughly 10,000 individuals each year, with people entering and exiting at various time points. I am interested in what happens as people are diagnosed with a disease over time. I therefore need to recode the time variable so that t = 0 is the wave of first diagnosis, t = 1 is the next year, and so on (and, I guess, t = -1 for the year before, etc.), so that all of my individuals are comparable. However, I am unsure how to go about this in Stata. Would anyone be able to advise? Many thanks
The case of one diagnosis per person
clear all
set more off
*----- example data -----
set obs 100
set seed 2357
generate id = _n
generate year = floor(10 * runiform()) + 1990
expand 5
bysort id: replace year = year + _n
bysort id (year): generate diag = cond(_n == 3, 1, 0)
list in 1/20, sepby(id)
*----- what you seek -----
bysort id (diag): gen time = year - year[_N]
sort id year
list in 1/20
I assume the same data structure as @RichardHerron and use his example. diag is an indicator variable that takes on the value of 1 at the time of diagnosis and 0 otherwise (only one diagnosis per person is considered).
The sorting done by bysort is critical. The observation holding the time of diagnosis is pushed to the end of each id group, and then all that's left to do is compare (subtract) all years with that reference year. See help _variables for details on system variables like _N.
The case of multiple diagnoses per person
If several diagnoses are made per person, but we care only about the first occurrence (according to year), we could do:
gsort id diag -year
by id: gen time = year - year[_N]
Simple but not optimal solution
Suppose diagnosis is 1 when diagnosed (at most once per person) and 0 otherwise.
Then the time at diagnosis is at its simplest
egen time_diagnosis = total(diagnosis * year), by(id)
but you need to ignore any zeros. To spell that out,
replace time_diagnosis = . if time_diagnosis == 0
Better alternative
A more complicated but preferable alternative can handle multiple diagnoses if they occur:
egen time_diagnosis = min(year / diagnosis), by(id)
as year / diagnosis is year when diagnosis is 1 and missing otherwise. This yields missing values if there is no diagnosis, which is as it should be.
Then you subtract that to get a new time variable.
gen time2 = year - time_diagnosis
In short, I think you can get this done in two statements, handling panel structure too.
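The same two steps (find each person's first diagnosis year, then subtract it) can be sketched in Python for checking results on a small sample. The function name relative_time and the sample panel are hypothetical; the variable names id, year, diagnosis and time2 follow the Stata code above.

```python
def relative_time(panel):
    """Add time2 = year - first diagnosis year (None if never diagnosed)."""
    first_dx = {}
    for obs in panel:
        if obs["diagnosis"] == 1:
            pid = obs["id"]
            # Keep the earliest diagnosis year, as min(year / diagnosis) does.
            first_dx[pid] = min(obs["year"], first_dx.get(pid, obs["year"]))
    result = []
    for obs in panel:
        dx_year = first_dx.get(obs["id"])
        time2 = None if dx_year is None else obs["year"] - dx_year
        result.append({**obs, "time2": time2})
    return result


# Hypothetical panel: person 1 is diagnosed in 1992 and again in 1993,
# person 2 is never diagnosed.
panel = [
    {"id": 1, "year": 1991, "diagnosis": 0},
    {"id": 1, "year": 1992, "diagnosis": 1},
    {"id": 1, "year": 1993, "diagnosis": 1},
    {"id": 2, "year": 1991, "diagnosis": 0},
]

for row in relative_time(panel):
    print(row["id"], row["year"], row["time2"])
```

As in the egen version, people with no diagnosis end up with a missing relative time rather than a misleading number.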
Update
@Richard Herron asks why use egen with by(), and not just
gen time_diagnosis = time * diagnosis
A limitation of that is that the "correct" value is held only in the observations for which diagnosis is 1; that value still has to be "spread" to the other observations for the same id. But that is precisely what egen does here. In the simplest situation, with one diagnosis, the total of time * diagnosis is just time * 1 or time, as any zeros make no difference to the sum.
It is usually helpful to provide test data, but here they are easy enough to generate. The trick is to find the first year for each individual (my fyear), which I'll do with min() from egen. Then I'll subtract this first year fyear from the actual year to find the year relative to diagnosis ryear.
/* generate panel */
clear
set obs 10000
generate id = _n
generate year = floor(10 * runiform()) + 1990
expand 10
bysort id: replace year = year + _n
sort id year
list in 1/20
/* generate relative year */
bysort id: egen fyear = min(year)
generate ryear = year - fyear
list in 1/20
If the first year in the panel is not the diagnosis year, then just construct fyear based on your diagnosis criteria.
Edit: Thinking more on this, maybe it's the last part that you're having a hard time with (i.e., identifying the diagnosis year to subtract from the calendar year). Here's what I would do.
bysort id (year): generate diagnosis = cond(_n == 5, 1, 0)
preserve
tempfile diagnosis
keep if (diagnosis == 1)
rename year dyear
keep id dyear
save `diagnosis'
restore
merge m:1 id using `diagnosis', nogenerate
generate ryear2 = year - dyear