Tabular DAX Daily Many To Many Association - dax

I am almost at the end of my tether on a problem in my current model. Basically, I have a 'temporal' many-to-many mapping table which maps commission rates for managers over time. These can change daily (but rarely do), so I've tried to avoid having a huge table with the same values repeated for different specific dates; if I do that anyway, I end up with a 200-million-record table. Crucially, though, more than one manager can get commission for the sale of a certain product type.
Note: Some commissions go to a single manager and some to multiple managers, and this can switch over time.
What I've done instead is hold ValidFrom and ValidTo dates in the mapping table.
Every solution I come up with is deathly slow, and I have no idea if there even IS a solution at this point. Here is a link to a very small sample: http://1drv.ms/1gOr7uw
The area I think is most troublesome is actually getting the correct rate for a given manager on a given day. The only way I seem to be able to do this is with nested SUMX, but there must be something slicker that I'm missing.
One thing I thought about (but failed to actually implement) was to hold just an effective date, filter using that, and leverage LASTNONBLANK() or something similar.
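For clarity, the lookup being described is plain interval containment: a mapping row applies on a given day when ValidFrom <= day <= ValidTo, and because of the M2M design several managers can match at once. A minimal Python sketch of that logic (all data and names here are hypothetical):

```python
from datetime import date

# Hypothetical mapping rows: (manager, product, valid_from, valid_to, rate)
mapping = [
    ("Alice", "Widget", date(2014, 1, 1), date(2014, 6, 30), 0.05),
    ("Alice", "Widget", date(2014, 7, 1), date(9999, 12, 31), 0.07),
    ("Bob",   "Widget", date(2014, 1, 1), date(9999, 12, 31), 0.02),
]

def rates_for(manager, product, on_day):
    """All rates valid for a manager/product on a given day.
    Interval containment: valid_from <= on_day <= valid_to."""
    return [r for m, p, frm, to, r in mapping
            if m == manager and p == product and frm <= on_day <= to]
```

A closed-ended ValidTo (rather than an open "effective date") is what makes each lookup a self-contained range test.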
Perhaps someone with some fresh eyes can help me out? Pulling my hair out here!
Edit: Someone will probably ask why I don't do this in the ETL. What I've not shown here is that I have other measures that don't need to be split by the managers; instead, the full amount should be reported for each, but the total should not be a sum of all the managers (i.e., default M2M behaviour).
Perhaps I need two fact tables? Perhaps someone could model my data in Excel to achieve a 'split' of the facts. As I see it, whichever way the problem is cut, some calculation is going to need to be done at run-time. I think?!
Ty.

Give this a try. I have not checked performance, so I don't expect it to be great; nevertheless, at least it looks elegant :)
Amount :=
SUMX (
    SUMMARIZE (
        Mapping,
        Mapping[Manager],
        Mapping[Rate],
        Mapping[ValidFrom],
        Mapping[ValidTo],
        Products[Products],
        "SalesInPeriod", Mapping[Rate]
            * CALCULATE (
                SUM ( Sales[Amount] ),
                VALUES ( 'Date'[Date] ),
                DATESBETWEEN (
                    'Date'[Date],
                    Mapping[ValidFrom],
                    Mapping[ValidTo]
                )
            )
    ),
    [SalesInPeriod]
)
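Procedurally, what the measure does is: for each distinct (manager, rate, validity window, product) combination in the mapping, sum the sales that fall both in the current date selection and in the validity window, weight that slice by its rate, and add everything up. The same shape in Python, with hypothetical data, just to make the mechanics explicit:

```python
from datetime import date

# Hypothetical mapping rows: (manager, rate, valid_from, valid_to, product)
mapping = [
    ("Alice", 0.10, date(2014, 1, 1), date(2014, 6, 30), "Widget"),
    ("Bob",   0.05, date(2014, 1, 1), date(2014, 12, 31), "Widget"),
]
# Hypothetical sales rows: (sale_date, product, amount)
sales = [
    (date(2014, 3, 1), "Widget", 100.0),
    (date(2014, 9, 1), "Widget", 200.0),
]

def amount(selected_dates):
    """One contribution per mapping row (the SUMX over SUMMARIZE):
    rate * sales inside both the selection and the validity window."""
    total = 0.0
    for manager, rate, frm, to, product in mapping:
        sales_in_period = sum(
            amt for d, p, amt in sales
            if p == product and d in selected_dates and frm <= d <= to
        )
        total += rate * sales_in_period
    return total
```

With both sample dates selected, Alice contributes 0.10 × 100 (her window ends in June) and Bob 0.05 × 300, so the split correctly double-counts across managers, as intended for commissions.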
If you want to count, still keeping the temporal M2M working, this should do the trick:
Count :=
CALCULATE (
    COUNTROWS ( Sales ),
    GENERATE (
        SUMMARIZE (
            Mapping,
            Mapping[ValidFrom],
            Mapping[ValidTo],
            Products[Products]
        ),
        DATESBETWEEN (
            'Date'[Date],
            Mapping[ValidFrom],
            Mapping[ValidTo]
        )
    ),
    VALUES ( 'Date'[Date] )
)
This latter version is simpler because it does not need the Rate or the Manager (or rather, it needs the Manager, but that is already in the filter context, and no SUMX is required since you are aggregating using canonical M2M).
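The GENERATE pattern above works by expanding each validity window into its individual dates and intersecting that set with the current date selection before counting. Roughly, in Python (hypothetical data):

```python
from datetime import date, timedelta

# Hypothetical mapping rows: (valid_from, valid_to, product)
mapping = [
    (date(2014, 1, 1), date(2014, 1, 3), "Widget"),
]
# Hypothetical sales rows: (sale_date, product)
sales = [
    (date(2014, 1, 2), "Widget"),
    (date(2014, 1, 10), "Widget"),
]

def count_sales(selected_dates):
    """Count sales whose date lies in both a validity window
    (the DATESBETWEEN expansion) and the current date selection."""
    valid_dates = set()
    for frm, to, product in mapping:
        d = frm
        while d <= to:  # expand the window day by day
            valid_dates.add((d, product))
            d += timedelta(days=1)
    return sum(1 for d, p in sales
               if d in selected_dates and (d, p) in valid_dates)
```

Materializing the windows per query like this is exactly what avoids the 200-million-row pre-expanded table.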

Related

Add multiple calculating filters in MDX

I am really new to MDX, and I have spent the past two days looking for an answer without success, so I greatly appreciate your help and patience.
I am trying to query a Cube with filters on multiple dimensions, and I realize that there are many similar questions already there, like this or this.
The thing is, instead of specifying the particular content I am looking for, I am trying to set up filters that pick up all records beginning with a specific string. This requires the Left() function in the filters (i.e., calculated filters?), but I cannot blend them nicely into the code.
My failed code is like this (the two filters should be in an AND relation):
Select Non Empty ([Measures].[Sales]) ON 0
FROM [Cube_Name]
WHERE
(
    FILTER
    (
        [Customer].[CustomerID].Members,
        Left([Customer].[CustomerID].CurrentMember.Name, 4) = "ABCD"),
        [Product].[ProductID].Members,
        Left([Product].[ProductID].CurrentMember.Name, 3) = "EFG")
    )
)
(My trial is based on the last answer here.)
I also read that there are some workarounds like CROSSJOIN with AGGREGATE or a sub-SELECT, but I just do not have any clue about 1) how to incorporate the conditions inside, or 2) performance (I have heard that CROSSJOIN can be slow).
I am not sure if I should mention it here, but I am actually running the MDX from Excel VBA using the ADOMD.CellSet object. It only gives me the grand total of the query under Cellset.Items(0) (there are no more items).
Thank you!
You need to split the two sets into two separate filters:
Select
    Non Empty [Measures].[Sales] on 0
From [Cube_Name]
Where
(
    Filter(
        [Customer].[CustomerID].[CustomerID].Members,
        Left(
            [Customer].[CustomerID].CurrentMember.Name,
            4
        ) = "ABCD"
    ),
    Filter(
        [Product].[ProductID].[ProductID].Members,
        Left(
            [Product].[ProductID].CurrentMember.Name,
            3
        ) = "EFG"
    )
)

oracle index for like query

I'm fighting a case of customer stupidity / stubbornness here. We have an application to look up retail shopper by various criteria. The most common variety we see is some combination of (partial) last name and (partial) postal code.
When they enter the full postal code, it works remarkably well. The problem is they sometimes choose to enter, effectively, postal code like '3%'.
Any miracle out there to overcome our customer stupidity?
ETA: There are two tables involved in this particular dog of an operation: customers and addresses. I'm a DBA involved in supporting this application, rather than on the development side. I have no ability to change the code (though I can pass on suggestions in that vein) but I have some leeway on improving indexing.
Customers has 22 million rows; addresses has 23 million.
"Stupidity" may be a harsh word, but I don't understand why you would ever try to look up a customer by postal code like '3%'. I mean, how much effort is it to type in their full zip or postal code?
A difficulty is that
WHERE postal_code LIKE '3%'
AND last_name LIKE 'MC%'
can usually only benefit from either an index on postal_code or an index on last_name. A composite index on both is no help (beyond the leading column).
Consider this as a possible solution (assuming your table name is RETAIL_RECORDS):
alter table retail_records
    add postal_code_first_1 VARCHAR2(1)
    GENERATED ALWAYS AS ( substr(postal_code, 1, 1) );

alter table retail_records
    add last_name_first_1 VARCHAR2(1)
    GENERATED ALWAYS AS ( substr(last_name, 1, 1) );

create index retail_records_n1
    on retail_records ( postal_code_first_1, last_name_first_1, postal_code );

create index retail_records_n2
    on retail_records ( postal_code_first_1, last_name_first_1, last_name );
Then, in situations where postal_code and/or last_name conditions are given to you, also include a condition on the appropriate ...first_1 column. So:
WHERE postal_code LIKE :p1
AND last_name LIKE :p2
AND postal_code_first_1 = SUBSTR(:p1, 1, 1)
AND last_name_first_1 = SUBSTR(:p2, 1, 1)
That's going to allow Oracle, on average, to search through about 1/260th of the data (1/10 for the postal code digit and 1/26 for the first letter of the last name). OK, there are a lot more last names starting with "M" than with "Z", so that's a little generous. But even for a high-frequency combination (say, postal_code like '1%' and last_name like 'M%'), it still shouldn't have to look through more than 1% of the rows.
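The back-of-the-envelope selectivity is just the product of the two prefix probabilities, assuming uniform distributions purely for illustration:

```python
# Uniform-distribution estimate: 10 possible leading digits for postal_code,
# 26 possible leading letters for last_name.
postal_selectivity = 1 / 10
name_selectivity = 1 / 26
combined = postal_selectivity * name_selectivity  # fraction of rows scanned

rows = 22_000_000               # customers table size from the question
rows_examined = rows * combined  # roughly 84,615 rows instead of 22 million
```

Real data is skewed, of course, which is why the answer hedges to "no more than 1%" for hot combinations.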
I expect that you'll have to tweak this once you see what Oracle's Cost-Based Optimizer is actually doing, but I think the basic principle of the idea should be sound.

OBIEE 11g Sort Pivot Prompt

I have created a query that selects user base data from two different weeks, uses MSUM to work out the difference between the two weeks, and then creates a projection of base size across different verticals based on the net change.
This requires a pivot table with prompts to display just the data from the most recent financial week (in format YYYY-MM); however, every time a new week rolls around, the ordering in the pivot prompt resets to show the least recent week, which makes the calculations redundant.
I can't re-order the weeks in the base data, as the MSUM calculation requires a specific order to be used across multiple dimensions.
Whilst this is very easily fixed each time by the end user changing the drop-down, or by the support team editing the pivot table and changing the prompt before saving (which then persists until the next week), it is either going to be a poor customer experience or extra work for the support group.
Is there a method that I'm missing to create a sort on the pivot prompt options from within the pivot table options?
The equation follows this kind of logic...
"Metrics"."Base Size" + (
(
(
"Metrics"."Base Size" - (
MSUM ("Metrics"."Base Size", 2) - "Metrics"."Base Size"
)
) / [days in time period]
) * 365
)
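To unwind that expression: MSUM over a window of 2 is the sum of the current and previous rows, so MSUM minus the current base is the previous week's base, and the current base minus that is the net weekly change; the rest annualizes the change and adds it to the current base. In Python, with illustrative numbers:

```python
def projection(base_current, base_previous, days_in_period):
    # Replicates: base + ((base - (MSUM(base, 2) - base)) / days) * 365
    msum2 = base_current + base_previous                 # MSUM over a window of 2
    net_change = base_current - (msum2 - base_current)   # == base_current - base_previous
    return base_current + (net_change / days_in_period) * 365
```

For example, a base of 1070 after a weekly gain of 70 projects to 1070 + (70 / 7) × 365, which is why the calculation becomes meaningless if the prompt silently jumps back to the oldest week.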
OBI will order the data as defined by the sort order in the RPD, but ascending is probably the best choice at that level.
In your case you could put the Analysis on a dashboard and use a dashboard prompt instead. For that you have the ability, in the options, to change the "Choice List Options" to SQL Results. This should put in a default query, to which you could add an ORDER BY clause. You can also set that to default to the most recent/current period no matter the sort order of the column.
SELECT "Date"."Financial Week"
FROM "My Subject Area"
ORDER BY "Date"."Financial Week" DESC
Instead of using the MSUM() function, you may also be better off using one of the built-in time-series functions, which can get the value of a previous period for you without relying on any ordering. Have a look at the Ago() function to get the previous period.

Creating DAX peer measure

The scenario:
We are an insurance brokerage company. Our fact table is the claim metrics current table. This table has unique rows for multiple claim SIDs, so that COUNTROWS(claim current) gives the correct count of unique claims. This table also has ClientSID and IndustrySID. The relation between client and industry is that one industry can have multiple clients, while one client can belong to only one industry.
Now, consider a fact called ClaimLagDays, which is present in the table at the granularity of ClaimSID.
One requirement is to find the "peer" SUM(ClaimLagDays). For a particular client, this is calculated as:
SUM(ClaimLagDays) for the industry of the client being filtered, minus SUM(ClaimLagDays) for this particular client. Let's call this measure A.
Similarly, we need to calculate the "peer" claim count, which is the claim count for the industry of the client being filtered, minus the claim count for this particular client.
Let's call this measure B.
In the final calculation, we need to divide A by B to get the "peer" average lag days.
So basically, the hard part is this: find the industry of the particular client being filtered for, then apply that filter to the fact table (claim metrics current) to find the total claim count (or other metric) for this industry only. Then, of course, subtract the client figure from the industry figure to get the "peer" measure. This has to be done for each row, keeping intact any other filters that might be applied in the slicers (date, business unit, etc.).
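Stripped of DAX, the per-client peer calculation is two subtractions and a division. A small Python sketch with hypothetical claim rows, just to pin down the arithmetic:

```python
# Hypothetical claim rows: (client, industry, lag_days)
claims = [
    ("ClientA", "Retail", 10),
    ("ClientB", "Retail", 20),
    ("ClientC", "Retail", 30),
    ("ClientD", "Mining", 40),
]

def peer_avg_lag(client):
    """Peer average lag days: industry totals minus the client's own figures."""
    industry = next(ind for c, ind, _ in claims if c == client)
    industry_rows = [(c, lag) for c, ind, lag in claims if ind == industry]
    client_lag = sum(lag for c, lag in industry_rows if c == client)
    client_count = sum(1 for c, _ in industry_rows if c == client)
    a = sum(lag for _, lag in industry_rows) - client_lag  # measure A: peer lag days
    b = len(industry_rows) - client_count                  # measure B: peer claim count
    return a / b
```

For ClientA, the peers are ClientB and ClientC, so the result is (20 + 30) / 2; ClientD in a different industry never enters the calculation.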
There are a couple of other static filters which need to be considered; these are present in other tables, such as Claim Type (= Indemnity/Medical) and Claim Status (= Closed).
My solution:
For measure B
I tried creating a calculated column, as:
Claim Count_WC_MO_Industry =
COUNTROWS (
    FILTER (
        FILTER (
            'Claim Metrics Current',
            RELATED ( 'Claim WC'[WC Claim Type] ) = "Medical"
                && RELATED ( 'Coverage'[Coverage Code] ) = "WC"
                && RELATED ( 'Claim Status'[Status Code] ) = "CL"
        ),
        EARLIER ( 'Claim Metrics Current'[IndustrySID] )
            = 'Claim Metrics Current'[IndustrySID]
    )
)
Then I created the measure:
Claim Count - WC MO Peer :=
CALCULATE ( SUM ( [Claim Count_WC_MO_Industry] ) / [Claim - Count] )
    - [Claim - Count WC MO]
{I did a SUM because the tabular model doesn't directly allow me to use a calculated column as a measure without any aggregation. And that wouldn't make sense anyway, since the model wouldn't know which row to take.}
The second part of the above measure is obviously, the claim count of the particular client, with the above-mentioned filters.
Problem with my solution:
The figures are all wrong. I am not getting a client-wise or year-wise separation of the industry counts or the peer counts; I am only getting a sum of all the industry counts in the measure.
My suspicion is that this is happening because of the SUM that is being done. However, I don't really have a choice, do I, as I can't use a calculated column as a measure without some aggregation...
Please let me know if you think the information provided here is not sufficient and if you'd like me to furnish some data (dummy). I would be glad to help.
So, assuming that you are filtering for the specific client via a front end, it sounds like you just want:
ClientLagDays :=
CALCULATE (
    SUM ( 'Claim Metrics Current'[Lag Days] ),
    Static Filters Here
)
Just your base measure of appropriate client lag days, including your static filters.
IndustryLagDays :=
CALCULATE (
    [ClientLagDays],
    ALL ( 'Claim Metrics Current'[Client] ),
    VALUES ( 'Claim Metrics Current'[IndustrySID] )
)
This removes the filter on client but retains the filter on Industry to get the industry-wide total of lag days.
PeerLagDays := [IndustryLagDays] - [ClientLagDays]
Straightforward enough.
And then repeat for claim counts, and then take [PeerLagDays] / [PeerClaimCount] for your [Average Peer Lag Days].

Subselecting with MDX

Greetings stack overflow community.
I've recently started building an OLAP cube in SSAS 2008 and have gotten stuck. I would be grateful if someone could at least point me in the right direction.
Situation: Two fact tables, same cube. FactCalls holds information about calls made by subscribers, FactTopups holds topup data. Both tables have numerous common dimensions one of them being the Subscriber dimension.
FactCalls        FactTopups
SubscriberKey    SubscriberKey
CallDuration     DateKey
CallCost         Topup Value
...
What I am trying to achieve is to be able to build FactCalls reports based on distinct subscribers that have topped up their accounts within the last 7 days.
What I am basically looking for is an MDX equivalent to SQL's:
select *
from FactCalls
where SubscriberKey in
( select distinct SubscriberKey from FactTopups where ... );
I've tried creating a degenerate dimension for both tables containing SubscriberKey and doing:
Exist(
[Calls Degenerate].[Subscriber Key].Children,
[Topups Degenerate].[Subscriber Key].Children
)
Without success.
Kind regards,
Vince
You would probably find that something like the following performs better. The Filter approach is forced to iterate through each subscriber, while the NonEmpty() function can take advantage of optimizations in the storage engine.
select non empty {
    [Measures].[Count],
    [Measures].[Cost],
    [Measures].[Topup Value]
} on columns,
{
    NonEmpty (
        [Subscriber].[Subscriber Key].Children,
        ( [Measures].[Topups Count],
          [Topup Date].[Calendar].[Month Name].&[2010]&[3] )
    )
} on rows
from [Calls];
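Underneath both the SQL IN and the MDX NonEmpty()/Filter() forms, the set operation is simple membership: keep the call rows whose subscriber appears among the recent topups. In Python terms (all rows hypothetical):

```python
# Hypothetical fact rows
fact_calls = [
    ("S1", 120), ("S2", 45), ("S3", 300),  # (subscriber, call_duration)
]
fact_topups = [
    ("S1", 10.0), ("S3", 5.0),             # (subscriber, topup_value), last 7 days
]

# Equivalent of: WHERE SubscriberKey IN
#   (SELECT DISTINCT SubscriberKey FROM FactTopups WHERE ...)
recent_topup_subs = {s for s, _ in fact_topups}
calls_of_interest = [row for row in fact_calls if row[0] in recent_topup_subs]
```

Building the distinct-subscriber set once and probing it per row is what NonEmpty() lets the storage engine do, instead of evaluating a tuple for every member as Filter() does.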
You know how sometimes it's the simplest and most obvious solutions that somehow elude you? Well, this is apparently one of them. They say "MDX is not SQL" and I now know what they mean. I've been working at this from an entirely SQL point of view, completely overlooking the obvious use of the filter command.
with set [OnlyThoseWithTopupsInMarch2010] as
    filter(
        [Subscriber].[Subscriber Key].Children,
        ( [Measures].[Topups Count],
          [Topup Date].[Calendar].[Month Name].&[2010]&[3] ) > 0
    )
select non empty {
    [Measures].[Count],
    [Measures].[Cost],
    [Measures].[Topup Value]
} on columns,
non empty { [OnlyThoseWithTopupsInMarch2010] } on rows
from [Calls];
Embarrassingly simple.
