MDX filter problem - filter

I'm pretty new to the whole MDX thing, but the following is just driving me batty. A FILTER statement I'm using is acting... strangely. Code sample, followed by description:
SELECT
{
FILTER(
MEMBERS([Time].[5-4-4 Week Year]),
[Measures].[Ship Gross Units] > 0
)
}
ON COLUMNS,
{
FILTER(
MEMBERS([Group].[Alternate Hierarchies]),
[Measures].[Ship Gross Units] > 0
)
}
ON ROWS
FROM SBD.SBD
WHERE
(
[FiscalYear].[FY09],
[Scenario].[Actuals Total],
[Measures].[Ship Gross Units],
[Channel].[FOS]
)
I'm trying to pull gross units for a particular sales channel, by week of the fiscal year (some columns obfuscated slightly). All those filters are in place due to the fact that I often need these broken out at the SKU level, and it's simpler to deal with a truncated dataset on my machine (let the DB do the work, I say!).
The problem is, this query returns 0 sales in the FOS channel. That seemed strange, so I removed the row filter:
SELECT
{
FILTER(
MEMBERS([Time].[5-4-4 Week Year]),
[Measures].[Ship Gross Units] > 0
)
}
ON COLUMNS,
MEMBERS([Group].[Alternate Hierarchies])
ON ROWS
FROM SBD.SBD
WHERE
(
[FiscalYear].[FY09],
[Scenario].[Actuals Total],
[Measures].[Ship Gross Units],
[Channel].[FOS]
)
And all of a sudden, sales show up in the FOS channel. This blows my mind; previously, I'd assumed I was filtering to just receive rows showing sales, and I got none. Now I'm showing everything, and there are rows with sales. It's easy enough to work around this problem with Perl or whatever, but I'd rather solve it "right".
I'm reasonably certain I'm just misunderstanding some niggling detail, but I'm tired of bashing my head against the desk.
Thanks!

I usually work with Microsoft Analysis Services, but the MDX is generally very similar to that used in Essbase.
The axes are evaluated independently, so the filter on Group only sees the members in the WHERE clause, not the filtered Time members. Is it possible that your Gross Units include some negatives and net out to less than or equal to 0 for FY09? Or are your unit counts large enough that they could overflow the data type and wrap around to negative numbers?
One possible workaround, if you are just looking for non-empty cells, is to use the NON EMPTY keyword on each axis.
e.g.
SELECT
NON EMPTY MEMBERS([Time].[5-4-4 Week Year])
ON COLUMNS,
NON EMPTY MEMBERS([Group].[Alternate Hierarchies])
ON ROWS
FROM SBD.SBD
WHERE
(
[FiscalYear].[FY09],
[Scenario].[Actuals Total],
[Measures].[Ship Gross Units],
[Channel].[FOS]
)
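To make the point about independent axis evaluation concrete, here is a minimal Python sketch (not MDX) of a toy cube. The group names, week names and numbers are invented, and it assumes that a filter on one axis sees the measure aggregated over all members of the other axis; whether the real cube aggregates that way depends on its outline.

```python
# Toy cube: units per (group, week). All names and numbers are made up.
units = {
    ("G1", "W1"): 5, ("G1", "W2"): -5,   # sales cancelled out by returns
    ("G2", "W1"): 3, ("G2", "W2"): 0,
    ("G3", "W1"): 0, ("G3", "W2"): 0,
}
groups = ["G1", "G2", "G3"]
weeks = ["W1", "W2"]


def total_for_group(group):
    # What the row filter sees (assumption): the measure summed over all weeks.
    return sum(units[(group, w)] for w in weeks)


def total_for_week(week):
    # What the column filter sees (assumption): the measure summed over all groups.
    return sum(units[(g, week)] for g in groups)


# Each axis is filtered on its own; neither filter knows about the other.
rows = [g for g in groups if total_for_group(g) > 0]
cols = [w for w in weeks if total_for_week(w) > 0]

print(rows, cols)
```

In this toy data G1 has positive units in W1 but is still dropped from the rows, because its W2 returns bring its total to 0. That is the kind of situation the answer suggests checking for.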

Related

Need to filter result set in DAX Power BI

I have following simple relationship:
I have created following visuals in Power BI:
I want to show the Store Name, Orders (for the salesman selected in the slicer) and Total Orders in that store (ignoring the salesman selected in the slicer). I have created two very simple measures (shown in the visual above) and used them in a matrix visual. The visual shows all stores, while I want to show only those stores where salesman X (the one selected in the slicer) has orders, i.e. I don't want the Store B row.
While troubleshooting, I suspected that the visual was not cross-filtering. I tried CROSSFILTER, but it made no difference. The data can be seen in the image below:
Please guide. Thanks in advance.
Try replacing [Total Orders] in the visual with the measure below, but keep the original [Total Orders] measure, since the new one refers to it.
IF( ISBLANK([Orders Count]), BLANK(), [Total Orders])
Adding VALUES('Order'[Store ID]) to the measure solved the problem. The complete measure definition is as follows:
Total Orders = CALCULATE(
count('Order'[Order ID]),
REMOVEFILTERS(Salesman[Salesman Name]),
VALUES('Order'[Store ID]))
This solves the problem, but I do not understand how. VALUES brings back only those stores where the salesman has orders. But once the salesman is removed from the filter context by REMOVEFILTERS, how can VALUES still return only the stores where the salesman has orders?
a) You intend to utilize Store.salesmanName from Store in a slicer, meaning whatever is selected from there, you intend that selection to be applied on Order to give you the Order.StoreName. So when X is selected only A and C are returned.
b) Once that selection happens, you intend DAX to return the total count of each Order.StoreName whether it has a corresponding Store.salesmanID in Order.salesmanID or not. In other words, in this layer of the analysis, you want the previous selection to remain applied in the outer loop but to be ignored in the inner loop.
To be able to do that, you can do this,
totalCount =
VAR _store =
MAX ( 'Order'[storeID] ) //what is the max store ID
VAR _count =
CALCULATE (
COUNT ( 'Order'[SalesmanId] ),
FILTER ( ALL ( 'Order' ), 'Order'[storeID] = _store ) //remove any filters and apply the value from above explicitly in the filter
)
RETURN
_count
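One way to read the VALUES-based measure above: CALCULATE evaluates its filter arguments in the original filter context, so VALUES('Order'[Store ID]) is computed while the slicer selection is still active, and only afterwards is the salesman filter removed. The following Python sketch (not DAX) imitates that order of evaluation. The order rows and the function name are invented for illustration.

```python
# Hypothetical order rows: (order_id, store_id, salesman). Not the asker's data.
orders = [
    (1, "A", "X"), (2, "A", "Y"),
    (3, "B", "Y"),
    (4, "C", "X"), (5, "C", "Z"), (6, "C", "Z"),
]


def total_orders(selected_salesman, row_store):
    # Outer filter context: slicer selection plus the store on the matrix row.
    visible = [o for o in orders
               if o[2] == selected_salesman and o[1] == row_store]
    # VALUES('Order'[Store ID]) is evaluated here, while the slicer still applies.
    stores_in_context = {o[1] for o in visible}
    # REMOVEFILTERS drops the salesman filter; the store filters stay.
    counted = [o for o in orders
               if o[1] == row_store and o[1] in stores_in_context]
    # COUNT over no rows is BLANK, modelled here as None.
    return len(counted) or None


result = {store: total_orders("X", store) for store in ("A", "B", "C")}
print(result)
```

Store B comes back blank because X has no orders there, so the set produced in the outer context is empty for that row, and a matrix hides rows where every measure is blank.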

DAX formula to subtract columns

I am trying to calculate month over month difference but it makes data negative.
I created a measure, but it makes source data negative.
CALCULATE (
COUNTA ( SOURCE_DATA[COLUMN] ),
FILTER ( SOURCE_DATA, SOURCE_DATA[YYYYMM] = "201906" )
)
- (
CALCULATE (
COUNTA ( SOURCE_DATA[COLUMN] ),
FILTER ( SOURCE_DATA, SOURCE_DATA[YYYYMM] = "201905" )
)
)
The outcome is correct, but it changes data in previous month to negative.
This is due to the filter context and the way you've written the measure.
Look at the visual table. For the field corresponding to Column = 201905 and row = GA you get -16 813. This is because the context of the visual table tells CALCULATE to COUNTA(SOURCE_DATA[Column]) only when MtM = GA and Columns = 201905. However, adding the FILTER you also tell CALCULATE to keep these criteria AND also make sure that SOURCE_DATA[Column] = 201906 in the first calculate and 201905 in the second one.
This results in CALCULATE looking for rows where Column is both 201905 and 201906 at the same time. Or in other words you generate a venn diagram with no overlapping fields. Therefore the first calculate evaluates to 0 and the second to 16 813, so that the measure is actually evaluating 0-16813 = -16 813.
Since you didn't post any description of your data model, I can only guess what it looks like. However, since you're filtering on the SOURCE_DATA table, I guess you don't use a Calendar table. You should! Have a calendar with a 1:* (1-to-many) relationship to SOURCE_DATA and do the filtering on the calendar. In addition, you can have dynamically calculated day/week/month/year offsets, so that you can create measures which don't have to be updated when there's a new month.
I think this video can be helpful: sqlbi videolecture
Also, have a look at this article: sqlbi filter in calculate
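The intersection described above can be imitated in a few lines of Python (not DAX). This is a sketch with invented row counts: FILTER ( SOURCE_DATA, ... ) only iterates the rows that the visual's column has already left visible, so in each column one of the two counts is empty.

```python
# Hypothetical source rows, reduced to their YYYYMM value: 3 in May, 5 in June.
rows = ["201905"] * 3 + ["201906"] * 5


def count_month(visible_rows, month):
    # FILTER iterates only the rows that survived the visual's own filter.
    return sum(1 for m in visible_rows if m == month)


def measure(column_month=None):
    # column_month is the filter coming from the visual; None means the total.
    visible = [m for m in rows if column_month is None or m == column_month]
    return count_month(visible, "201906") - count_month(visible, "201905")


by_column = {m: measure(m) for m in ("201905", "201906")}
total = measure()
print(by_column, total)
```

The total is the expected month-over-month difference, while the 201905 column shows the negated May count and the 201906 column shows the plain June count, which matches the behaviour in the question.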

Power Pivot and Closing Price

I am trying to use power pivot to analyze a stock portfolio at any point in time.
The data model is:
transactions table with buy and sell transactions
historical_prices table with the closing price of each stock
security_lookup table with the symbol and other information about the stock (whether it’s a mutual fund, industry, large cap, etc.).
One to many relationships link the symbol column in security_lookup to the transactions and historical_prices tables.
I am able to get the cost basis to work correctly by doing sumx(transactions, quantity*price). However, I’m not able to get the current value of my holdings. I have a measure called “Current Price” which finds the most recent closing price by
Current Price :=
CALCULATE (
LASTNONBLANK ( Historical_prices[close], min[close] ),
FILTER (
Historical_Prices,
Historical_prices[date] = LASTDATE ( historical_prices[date] )
)
)
However, when I try to find the current value of a security by using
Current Value = sumx(transactions,transactions[quantity]*[Current Price])
the total is not accurate. I'd appreciate suggestions on a way to find the current value of a position. Preferably using sumx or an iterator function so that the subtotals are accurate.
The problem with your Current Value measure is that you are evaluating [Current Price] within the row context of the transactions table (since SUMX is an iterator), so it's only seeing the date associated with that row instead of the last date. Or more precisely, that row's date is the last date in the measure's filter context.
The simplest solution is probably to calculate the Current Price outside of the iterator using a variable and then pass that constant in so you don't have to worry about row and filter contexts.
Current Value =
VAR CurrentPrice = [Current Price]
RETURN SUMX(transactions, transactions[quantity] * CurrentPrice)
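As a rough illustration of the difference, here is a Python sketch (not DAX) for a single security. It assumes, as the answer states, that inside the iterator the price lookup is restricted to the date of the current transaction row. The prices, dates and quantities are invented.

```python
# Hypothetical closing prices for one security, keyed by ISO date.
closes = {"2019-01-02": 10.0, "2019-06-03": 12.0, "2019-12-30": 15.0}
# Hypothetical transactions: (date, quantity).
transactions = [("2019-01-02", 4), ("2019-06-03", 6)]


def current_price(visible_dates):
    # "Most recent close" among whatever dates the context leaves visible.
    return closes[max(visible_dates)]


# Price evaluated per row: each row only sees its own date.
per_row = sum(qty * current_price([date]) for date, qty in transactions)

# Price evaluated once, outside the iteration, then reused as a constant.
latest = current_price(closes.keys())
fixed = sum(qty * latest for _, qty in transactions)

print(per_row, fixed)
```

The per-row version effectively recomputes the cost at each transaction date, while the version that captures the price first values all 10 units at the latest close.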

nested for loops in stata

I am having trouble understanding why a for loop construction does not work. I am not really used to for loops, so I apologize if I am missing something basic. Anyhow, I appreciate any piece of advice you might have.
I am using a party level dataset from the parlgov project. I am trying to create a variable which captures how many times a party has been in government before the current observation. Time is important, the counter should be zero if a party has not been in government before, even if after the observation period it entered government multiple times. Parties are nested in countries and in cabinet dates.
The code is as follows:
use "http://eborbath.github.io/stackoverflow/loop.dta", clear //to get the data
If this does not work, I have also uploaded the data in CSV format; try:
import delimited "http://eborbath.github.io/stackoverflow/loop.csv", bindquote(strict) encoding(UTF-8) clear
The loop should go through each country-specific cabinet date, identify the previous observation and check if the party has already been in government. This is how far I have got:
gen date2=cab_date
gen gov_counter=0
levelsof country, local(countries) // to get to the unique values in countries
foreach c of local countries{
preserve // I think I need this to "re-map" the unique cabinet dates in each country
keep if country==`c'
levelsof cab_date, local(dates) // to get to the unique cabinet dates in individual countries
restore
foreach i of local dates {
egen min_date=min(date2) // this is to identify the previous cabinet date
sort country party_id date2
bysort country party_id: replace gov_counter=gov_counter+1 if date2==min_date & cabinet_party[_n-1]==1 // this should be the counter
bysort country: replace date2=. if date2==min_date // this is to drop the observation which was counted
drop min_date //before I restart the nested loop, so that it again gets to the minimum value in `dates'
}
}
The code works without an error, but it does not do the job. Evidently there's a mistake somewhere, I am just not sure where.
BTW, it's a specific application of a problem I super often encounter: how do you count frequencies of distinct values in a multilevel data structure? This is slightly more specific, to the extent that "time matters", and it should not just sum all encounters. Let me know if you have an easier solution for this.
Thanks!
The problem with your loop is that it does not keep the replaced gov_counter after the loop. However, there is a much easier solution I'd recommend:
sort country party_id cab_date
by country party_id: gen gov_counter=sum(cabinet_party[_n-1])
This sorts the data into groups and then creates a sum by group, always up to (but not including) the current observation.
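For readers who do not use Stata, the same idea (a running sum within each country-party group that excludes the current row) can be sketched in Python. The rows below are invented and use plain years for cab_date; the real dataset's date format is not assumed.

```python
from itertools import groupby

# Hypothetical rows, deliberately out of order.
rows = [
    {"country": 1, "party_id": 10, "cab_date": 1998, "cabinet_party": 1},
    {"country": 1, "party_id": 20, "cab_date": 1994, "cabinet_party": 1},
    {"country": 1, "party_id": 10, "cab_date": 1990, "cabinet_party": 1},
    {"country": 1, "party_id": 20, "cab_date": 1990, "cabinet_party": 0},
    {"country": 1, "party_id": 10, "cab_date": 1994, "cabinet_party": 0},
]


def party_key(row):
    return (row["country"], row["party_id"])


# Same ordering as: sort country party_id cab_date
rows.sort(key=lambda r: (r["country"], r["party_id"], r["cab_date"]))

for _, group in groupby(rows, key=party_key):
    seen = 0  # government spells before the current observation
    for row in group:
        row["gov_counter"] = seen
        seen += row["cabinet_party"]

counters = [(r["party_id"], r["cab_date"], r["gov_counter"]) for r in rows]
print(counters)
```

Assigning the counter before adding the current row's value is what keeps the current cabinet out of its own count.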
I would start here. I have stripped the comments so that we can look at the code. I have made some tiny cosmetic alterations.
foreach i of local dates {
egen min_date = min(date2)
sort country party_id date2
bysort country party_id: replace gov_counter=gov_counter+1 ///
if date2 == min_date & cabinet_party[_n-1] == 1
bysort country: replace date2 = . if date2 == min_date
drop min_date
}
This loop includes no reference to the loop index i defined in the foreach statement. So, the code is the same and completely unaffected by the loop index. The variable min_date is just a constant for the dataset and the same each time around the loop. What does depend on how many times the loop is executed is how many times the counter is incremented.
The fallacy here appears to be a false analogy with constructs in other software, in which a loop automatically spawns separate calculations for different values of a loop index.
It's not illegal for loop contents never to refer to the loop index, as is easy to see
forval j = 1/3 {
di "Hurray"
}
produces
Hurray
Hurray
Hurray
But if you want different calculations for different values of the loop index, that has to be explicit.

Most efficient way to filter multiple dimensions

Trying to get the [Number] and [Sum of Time Spent] for all Changes that were open during period 201405.
The best definition of open I can think of is:
- Changes that were logged before or during the [MonthPeriod], while closed during or after the [MonthPeriod]
SELECT
[Measures].[Sum of Time Spent] ON COLUMNS
,
[FactChange].[Number].[Number] ON ROWS
FROM
[Change Management]
WHERE
(FILTER(
[DimLoggedDate].[MonthPeriod].[MonthPeriod]
,[DimLoggedDate].[MonthPeriod].MEMBERVALUE <= 201405
)
,
FILTER(
[DimClosedDate].[MonthPeriod].[MonthPeriod]
,[DimClosedDate].[MonthPeriod].MEMBERVALUE >= 201405
))
The above query returns a list with all numbers, with a null value when the filters in the WHERE clause don't apply. I would like to remove the NULL items.
Because the query returns ALL Numbers, I wonder if this is the most efficient query to solve the issue. Applying NonEmpty() would remove the numbers, but since all changes are enumerated, isn't this putting more stress on the system than required?
You can do it simply by adding NON EMPTY to the ON ROWS clause:
...
non empty [FactChange].[Number].[Number] on Rows
...
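The "open during the period" definition and the effect of NON EMPTY can be checked on a toy dataset. The Python sketch below (not MDX) uses invented change records and says nothing about how the OLAP engine evaluates the query internally or how efficient it is.

```python
# Hypothetical change records; periods are YYYYMM integers.
changes = [
    {"number": "C1", "logged": 201403, "closed": 201406, "time_spent": 5.0},
    {"number": "C2", "logged": 201405, "closed": 201405, "time_spent": 2.5},
    {"number": "C3", "logged": 201406, "closed": 201407, "time_spent": 1.0},
    {"number": "C4", "logged": 201401, "closed": 201404, "time_spent": 3.0},
]
PERIOD = 201405


def time_spent_if_open(change, period):
    # Open = logged before or during the period, closed during or after it.
    is_open = change["logged"] <= period <= change["closed"]
    return change["time_spent"] if is_open else None


# Without NON EMPTY: every number is listed, with nulls where the slicer excludes it.
all_rows = {c["number"]: time_spent_if_open(c, PERIOD) for c in changes}

# With NON EMPTY on the row axis: rows whose cells are all empty are dropped.
non_empty_rows = {n: v for n, v in all_rows.items() if v is not None}

print(non_empty_rows)
```

C3 was logged after the period and C4 was closed before it, so both are null in the full listing and disappear once empty rows are removed.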
