Sum of only Distinct values in a Column in DAX - dax

I have table[Table 1] having three columns
OrganizationName, FieldName, Acres having data as follows
organizationname fieldname Acres
ABC |F1 |0.96
ABC |F1 |0.96
ABC |F1 |0.64
I want to calculate the sum of Distinct values of Acres
(eg: 0.96+0.64) in DAX.

One of the problems with doing what you want is that many measures rely on filters and not actual table expressions. So, getting a distinct list of values and then filtering the table by those values, just gives you the whole table back.
The iterator functions are handy and operate on table expressions, so try SUMX
TotalDistinctAcreage = SUMX(DISTINCT(Table1[Acres]),[Acres])
This will generate a table that is one column containing only the distinct values for Acres, and then add them up. Note that this is only looking at the Acres column, so if different fields and organizations had the same acreage -- then that acreage would still only be counted once in this sum.
If instead you want to add up the acreage simply on distinct rows, then just make a small change:
TotalAcreageOnDistinctRows = SUMX(DISTINCT(Table1),[Acres])
Hope it helps.

Ok, you added these requirements:
Thank You. :) However, I want to add Distinct values of Acres for a
Particular Fieldname. Is this possible? – Pooja 3 hours ago
The easiest way really is just to go ahead and slice or filter the original measure that I gave you. But if you have to apply the filter context in DAX, you can do it like this:
Measure =
SUMX(
FILTER(
SUMMARIZE( Table1, [FieldName], [Value] )
, [FieldName] = "<put the name of your specific field here"
)
, [Value]
)

Related

Google Sheets Formula - Get Total from filtered dates per row (undefined number of columns)

I have this data in Google Sheets where in I need to get the total of the filtered data columns per row. The date columns are not fixed (may increase over time, I already know how to handle this undefined number of columns). What my current challenge encountered is how can I efficiently get a summary of totals per user based on filtered date columns.
My data is like this:
My expected result is like this:
My current idea is this:
Here is a sample spreadsheet for reference:
https://docs.google.com/spreadsheets/d/1_dByPabStGQvh94TabKxwFeUyVaRFnkBCRf4ioTY5jM/edit?usp=sharing
This is a method to unpivot the data so you can work with it
=ARRAYFORMULA(
QUERY(
IFERROR(
SPLIT(
FLATTEN(
IF(ISBLANK(A2:A),,A2:A&"|"&B1:G1&"|"&B2:G)),
"|")),
"select Col1, Sum(Col3)
where
Col2 >= "&DATE(2022,1,1)&" and
Col2 <= "&DATE(2022,1,15)&"
group by Col1
label
Col1 'Person',
Sum(Col3) 'Total'"))
Basically, its creating an output of User1|44557|8 -- it then FLATTENs it all and splits by the pipe, which gives you three clean columns.
Run that through a QUERY to SUM by the person between the dates and you get what you're after. If you wanted to use cell references for dates, simply replace the dates with the cell references.
To expand the table, change B1:G1 and B2:G2 to match the width of the range.

How to filter by measure values in MDX while having dimension members in both axis

I'm developing an application that uses a tabular database to show some business data.
I need to provide some basic filtering over measures values (equal to, greater than, lesser than etc.) and I'm currently analyzing the proper way to generate the MDX.
Looking at some documentation (and other threads on this site), I found that the most efficient approach would be using the FILTER or HAVING functions to filter out undesired values.
Unfortunately all examples normally include measures on one axis and dimension member on the other, but I potentially have dimension members in both axis and can't find a proper solution to use such functions to filter by measure value.
What have I done so far?
To make it easier to explain, let's say that we want to get the yearly sales quantities by product class filtering quantity > 1.3 milions
Trying to use HAVING or FILTER Functions, the resulting MDX I came up with is
SELECT
NON EMPTY {[YearList].[Year].[Year].MEMBERS * [Measures].[Qty]}
HAVING [Measures].[Qty] > 1.3e6 ON COLUMNS,
NON EMPTY {[Classes].[cClass].[cClass].MEMBERS}
HAVING [Measures].[Qty] > 1.3e6 ON ROWS
FROM [Model]
or
SELECT
NON EMPTY FILTER({[YearList].[Year].[Year].MEMBERS * [Measures].[Qty]},
[Measures].[Qty] > 1.3e6) ON COLUMNS,
NON EMPTY FILTER({[Classes].[cClass].[cClass].MEMBERS} ,
[Measures].[Qty] > 1.3e6) ON ROWS
FROM [Model]
But this is of course leading to unexpected result for the final user because the filter is happening on the aggregation of the quantities by the dimension on that axis only, which is greater then 1.3M
The only way I found so far to achieve what I need is to define a custom member with an IIF statement
WITH
MEMBER [Measures].[FilteredQty] AS
IIF ( [Measures].[Qty] > 1.3e6, [Measures].[Qty], NULL)
SELECT
NON EMPTY {[YearList].[Year].[Year].MEMBERS * [Measures].[FilteredQty]} ON COLUMNS,
NON EMPTY {[Classes].[cClass].[cClass].MEMBERS} ON ROWS
FROM [Model]
The result is the one expected:
Is this the best approach or I should keep using FILTER and HAVING functions? Is there even a better approach I'm still missing?
Thanks
This is the best approach. You need to consider how MDX resolves result. In the example above it is a coincidence that your valid data in a continous region of first four columns of first row. Lets relax the filtering clause and make it >365000. Now take a look at last row of the result, the first two columns and the last column are eligible cells but the third and fourth column is not eligible. However your query will report it as null and the non empty function will not help. The reason is that non empty needs the entire row to be null
Now the question that why filter is not eliminating the cell? Filter will eliminate a row or column when the criteria is greater then the sum on the other axis. So if filter is on columns the filter value has to be greater than the sum of rows for that column. Take a look at the sample below as soon as you remove the comments the last column will be removed.
select
non empty
filter(
([Measures].[Internet Sales Amount]
,{[Date].[Calendar Year].&[2013],[Date].[Calendar Year].&[2014]}
,[Date].[Calendar Quarter of Year].[Calendar Quarter of Year]
),([Date].[Calendar Year].currentmember,[Date].[Calendar Quarter of Year].currentmember,[Product].[Subcategory].currentmember,[Measures].[Internet Sales Amount])>45694.70--+0.05
)
on columns
,
non empty
[Product].[Subcategory].members
on rows
from
[Adventure Works]
Edit another sample added.
with
member [Measures].[Internet Sales AmountTest]
as
iif(([Date].[Calendar Year].currentmember,[Date].[Calendar Quarter of Year].currentmember,[Product].[Subcategory].currentmember,[Measures].[Internet Sales Amount])>9000,
([Date].[Calendar Year].currentmember,[Date].[Calendar Quarter of Year].currentmember,[Product].[Subcategory].currentmember,[Measures].[Internet Sales Amount]),
null
)
select
non empty
({[Measures].[Internet Sales Amount],[Measures].[Internet Sales AmountTest]}
,{[Date].[Calendar Year].&[2013]}
,[Date].[Calendar Quarter of Year].[Calendar Quarter of Year]
)
on columns
,
non empty
[Product].[Subcategory].[Subcategory]
on rows
from
[Adventure Works]

DAX - Meassure that sums only the first occurance by group

I'm trying to figure out how to build a measure that sums a total, but only taking the first non-empty row for a user.
For example, my data looks like the below:
date user value
-----------------
1/1/17 a 15
2/1/17 a 12
1/1/17 b null
5/1/17 b 3
I'd therefore like a result of 18 (15 + 3).
I'm thinking that using FIRSTNONBLANK would help, but it only takes a single column, I'm not sure how to give it the grouping - perhaps some sort of windowing is required.
I've tried the below, but am struggling to work out what the correct syntax is
groupby(
GROUPBY (
myTable,
myTable[user],
“Total”, SUMX(CURRENTGrOUP(), FIRSTNONBLANK( [value],1 ))
),
sum([total])
)
I didn't have much luck getting FIRSTNONBLANK or GROUPBY to work exactly how I wanted, but I think I found something that works:
SUMX(
ADDCOLUMNS(
ADDCOLUMNS(VALUES(myTable[User]),
"FirstDate",
CALCULATE(MIN(myTable[Date]),
NOT(ISBLANK(myTable[Value])))),
"FirstValue",
CALCULATE(SUM(myTable[Value]),
FILTER(myTable, myTable[Date] = [FirstDate]))),
[FirstValue])
The inner ADDCOLUMNS calculates the first non-blank date values for each user in the filter context.
The next ADDCOLUMNS, takes that table of users and first dates and for each user sums each [value] that occurred on each respective date.
The outer SUMX takes that resulting table and totals all of the values of [FirstValue].

MDX - How to select one column and sort the returned data

For a SSRS report, I'm trying to return a list of sorted data from a dimension to use with a parameter.
My dimension is [Radio].[Radio NO].[Radio NO] where the last Radio NO is a string.
I can find examples of returning one column while sorting on another but I can't figure out how to sort and return just one column.
Thanks whytheq! Based on your answer, here's what I came up with that works:
SELECT {} ON COLUMNS,
ORDER(
[Radio].[Radio NO].[Radio NO].MEMBERS
,[Radio].[Radio NO].CURRENTMEMBER.MEMBER_CAPTION
,BASC
) On ROWS
FROM [OurCube]
Without seeing the exact structure of your cube / query an avenue you could explore, if you'd like to order alphabetical, is the following
ORDER(
[Radio].[Radio NO].[Radio NO].MEMBERS
,[Radio].[Radio NO].CURRENTMEMBER.MEMBER_CAPTION
,BDESC
)
If you want to order by a measure in your cube, then something like the following:
ORDER(
[Radio].[Radio NO].[Radio NO].MEMBERS
,[Measures].[Profit]
,BDESC
)
This is a possible if you really need to change the column name before hitting SSRS but it has the disadvantage of changing it to a measure:
WITH
MEMBER [Measures].[thisIsTheNewName] AS
[Radio].[Radio NO].CURRENTMEMBER.MEMBER_CAPTION
SELECT
{[Measures].[thisIsTheNewName]} ON COLUMNS,
ORDER(
[Radio].[Radio NO].[Radio NO].MEMBERS
,[Radio].[Radio NO].CURRENTMEMBER.MEMBER_CAPTION
,BASC
) On ROWS
FROM [OurCube];

Avoid duplicate values for certain column in DAX query

I am using the following statement to get a result table:
EVALUATE
(
CALCULATETABLE
(
ADDCOLUMNS (
'Case',
"Casenumber", RELATED( 'CaseDetails'[nr]),
),
'Case'[Date] <= value(#dateto) )
)
However, I want to only get one record pr casenumber. In SQL I would solve this with a GROUP BY statement, but how should I do this in DAX? Case also has a dimkey, so several cases with the same casenumber can have different dimkeys.
Try this:
EVALUATE
CALCULATETABLE(
SUMMARIZE(
Case
,<comma-separated list of fields from Case you want>
,"CaseNumber"
,RELATED(CaseDetails[nr])
)
,Case[Date] <= VALUE(#dateto)
)
SUMMARIZE() takes a table as its first argument, then a comma-separated list of fields from that table and any tables that it is related to where it is on the many side (thus in a star schema, SUMMARIZE()ing the fact table will allow you to refer directly to any dimension table field), followed by a comma-separated list of , pairs where is a quoted field name and is a scalar value which is evaluated in the row context of the table in the first argument.
If you don't need to rename CaseDetails[nr], then the query would look like this (just for an illustrative example):
EVALUATE
CALCULATETABLE(
SUMMARIZE(
Case
,<comma-separated list of fields from Case you want>
,CaseDetails[nr]
)
,Case[Date] <= VALUE(#dateto)
)
In such a query, all fields will come through with column headings in the format of 'table'[field], so there is no ambiguity if you have identical field names in multiple related tables.
Edit to address new information in original:
SUMMARIZE(), just like SQL's GROUP BY clause will not eliminate distinct values from the resultset. If there is a field that is a higher cardinality than the field you want to group by, you will always see duplicates.
Is your [DimKey] necessary in the resultset? If yes, then there's no way to decrease the size of your resultset smaller than the number of distinct values of [DimKey].
If [DimKey] is unnecessary, simply omit it from the list of fields in SUMMARIZE().
If you want only a specific [DimKey], e.g. the most recent (assuming it's an IDENTITY field and the max value is the latest), then you can bring it in with another ADDCOLUMNS() wrapped around your current SUMMARIZE():
EVALUATE
ADDCOLUMNS(
SUMMARIZE(
Case
,<comma-separated list of fields from Case except for [DimKey]>
,"CaseNumber"
,RELATED(CaseDetails[nr])
)
,"MaxDimKey"
,CALCULATE(MAX(Case[DimKey]))
)

Resources