Group on sum of distinct values in Tableau - distinct

I'm using Tableau 8.3 and I'm trying to find out how to group on each of the values that I get after making a "count of distinct values".
To illustrate the case I have made a fictitious dataset with 58 rows (purchases), 7 different IDs (customers) and 5 different products. Then I made a count distinct to find out how many of the 5 different products each ID has bought. It looks like this.
ID1 = 4
ID2 = 4
ID3 = 5
ID4 = 4
ID5 = 3
ID6 = 4
ID7 = 2
Now I want to turn the view around and find out how many IDs have bought X different products. It should ultimately look like this.
2 = 1
3 = 1
4 = 4
5 = 1
Hope to find a solution by posting here! Thank you,
Mikael

You need to update to Tableau 9.0 to achieve that (in a fast way).
You can create a calculated field named # of products:
{ FIXED [id_customer] : COUNTD([id_product]) }
Then you can cross the [# of products] with COUNTD(id_customer) to get what you want.
In older versions of Tableau you need to create a new table in a proper format (1 line per customer with the aggregations) and connect to it.
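The two-step aggregation described above (distinct products per customer, then customers per distinct count) can be sketched outside Tableau. This is only an illustration in Python; the purchase rows below are made up so that they reproduce the per-ID counts from the question, and the variable names are assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical purchase rows (customer id, product id), invented so that the
# per-customer distinct counts match the question: 4, 4, 5, 4, 3, 4, 2.
purchases = [
    ("ID1", "A"), ("ID1", "B"), ("ID1", "C"), ("ID1", "D"), ("ID1", "A"),
    ("ID2", "A"), ("ID2", "B"), ("ID2", "C"), ("ID2", "E"),
    ("ID3", "A"), ("ID3", "B"), ("ID3", "C"), ("ID3", "D"), ("ID3", "E"),
    ("ID4", "B"), ("ID4", "C"), ("ID4", "D"), ("ID4", "E"),
    ("ID5", "A"), ("ID5", "C"), ("ID5", "E"), ("ID5", "C"),
    ("ID6", "A"), ("ID6", "B"), ("ID6", "D"), ("ID6", "E"),
    ("ID7", "B"), ("ID7", "D"),
]

# Step 1: distinct products per customer (the role of the FIXED ... COUNTD field).
products_by_customer = defaultdict(set)
for customer, product in purchases:
    products_by_customer[customer].add(product)

# Step 2: number of customers for each distinct-product count.
customers_by_count = Counter(len(p) for p in products_by_customer.values())

for n_products, n_customers in sorted(customers_by_count.items()):
    print(f"{n_products} = {n_customers}")
```

The same shape (aggregate per customer first, then aggregate the aggregates) is what the pre-aggregated table in older Tableau versions would provide.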

Related

Elasticsearch Query to filter out values in column based on another column

I'm working with a dataset that I'm trying to filter, but I'm having trouble trying to get the results I want, and I'm not sure it's possible to do so. Some sample data is below:
Column A | Column B | Column C | Column D
good | awe | 1 | 9834
great | niopre | 1 | 78964
bad | nue | 1 | 12
good | btr | 2 | 6543
great | muy | 2 | 8765
bad | xdrg | 2 | 1432
bad | thr | 3 | 648
good | cfg | 3 | 6
bad | mk | 3 | 1958
What I want to do is use Column A and C in the filter and only show the values in Column C that also have a row that includes "great" in Column A. So for this dataset the filter would return:
Column A | Column B | Column C | Column D
good | awe | 1 | 9834
great | niopre | 1 | 78964
bad | nue | 1 | 12
good | btr | 2 | 6543
great | muy | 2 | 8765
bad | xdrg | 2 | 1432
I've tried the built-in filters and the Elasticsearch Query DSL and haven't had any luck yet. If anyone can help guide me in the right direction on how to do this, it would be greatly appreciated.
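The filter described above amounts to two passes: first collect the Column C values that have at least one "great" row, then keep every row whose Column C is in that set. Below is a minimal sketch of that logic in plain Python, using the sample rows; it is not Elasticsearch Query DSL, and the names rows, groups_with_great and filtered are only illustrative.

```python
# Sample rows from the question: (Column A, Column B, Column C, Column D).
rows = [
    ("good", "awe", 1, 9834),
    ("great", "niopre", 1, 78964),
    ("bad", "nue", 1, 12),
    ("good", "btr", 2, 6543),
    ("great", "muy", 2, 8765),
    ("bad", "xdrg", 2, 1432),
    ("bad", "thr", 3, 648),
    ("good", "cfg", 3, 6),
    ("bad", "mk", 3, 1958),
]

# Pass 1: Column C values that have at least one row with "great" in Column A.
groups_with_great = {col_c for col_a, _, col_c, _ in rows if col_a == "great"}

# Pass 2: keep every row whose Column C value is in that set.
filtered = [row for row in rows if row[2] in groups_with_great]

for row in filtered:
    print(" | ".join(str(value) for value in row))
```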

How do you create a new column based on Max value of 1 column and Category of another?

I am working with a bunch of data for my job creating status reports on the documents that we are working through that we then assign to an area. We decided to use PowerBI as an interactive way to see where everything is at.
Using Power BI Desktop I've created a new table that excludes documents that are not ready for QC but we have several different statuses. Instead of creating a new table for each status type (since some can be grouped together) I would like to create a new column that has the grouped status value's Max for each area. The higher the Status Value the further it is from being complete.
EX:
Record | Area | Status Value | Max Status Value
152385 | A | 1 | 2
354354 | B | 2 | 3
131322 | B | 3 | 3
132136 | A | 2 | 2
213513 | A | 1 | 2
351315 | B | 2 | 3
If anyone knows how to get the Max Status Value column that would greatly help. I did find another post (https://community.powerbi.com/t5/Desktop/LOOKUPVALUE-return-min-max-of-values-found/td-p/657534) that was similar but I'm still new to DAX and could not figure out how to apply it to my situation.
This post actually helped me answer the question.
https://community.powerbi.com/t5/Power-Query/Maxifs-Power-Query/m-p/1693606
The only difference I made was getting rid of the true/false portion to receive my results. Thus my result was:
Max Status Value =
VAR vMaxVal =
    CALCULATE (
        MAX ( 'Table'[Status Value] ),
        ALLEXCEPT (
            'Table',
            'Table'[Area]
        )
    )
RETURN
    vMaxVal
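For readers new to DAX, the effect of this column is a per-area maximum repeated on every row of that area. A minimal Python sketch of the same idea, using the example records from the question (the variable names are illustrative and not part of Power BI):

```python
# Example records from the question: (Record, Area, Status Value).
records = [
    (152385, "A", 1),
    (354354, "B", 2),
    (131322, "B", 3),
    (132136, "A", 2),
    (213513, "A", 1),
    (351315, "B", 2),
]

# Highest status value seen in each area.
max_by_area = {}
for _, area, status in records:
    max_by_area[area] = max(status, max_by_area.get(area, status))

# Attach the area maximum to every record, like the calculated column does.
with_max = [(rec, area, status, max_by_area[area]) for rec, area, status in records]

for row in with_max:
    print(" | ".join(str(value) for value in row))
```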

SUMIF with date range for specific column

I've been trying to find an answer for this but haven't succeeded. I need to sum a column over a specified date range, where the row name in my sheet matches a column name in the reference sheet.
For example:
Reference_Sheet
Date John Matt
07/01/19 1 2
07/02/19 1 2
07/03/19 2 1
07/04/19 1 1
07/05/19 3 3
07/06/19 1 2
07/07/19 1 1
07/08/19 5 9
07/09/19 9 2
Sheet1
A B
1 07/01
2 07/07
3 Week1
4 John 10
5 Matt 12
I have to work in Google Sheets. I tried SUMPRODUCT, which told me I can't multiply text, and SUMIFS, which told me I can't have array arguments of different sizes. My failed attempts were similar to the ones below:
=SUMIFS('Reference_Sheet'!B2:AO1000,'Reference_Sheet'!A1:AO1,"=A4",'Reference_Sheet'!A2:A1000,">=B1",'Reference_Sheet'!A2:A1000,"<=B2")
=SUMPRODUCT(('Reference_Sheet'!$A$2:$AO$1000)*('Reference_Sheet'!$A$2:$A$1000>=B$1)*('Reference_Sheet'!$A$2:$A$1000<=B$2)*('Reference_Sheet'!$A$1:$AO$1=$A4))
This might work:
=sumifs(indirect("Reference_Sheet!"&address(2,match(A4,Reference_Sheet!A$1:AO$1,0))&":"&address(100,match(A4,Reference_Sheet!A$1:AO$1,0))),Reference_Sheet!A$2:A$100,">="&B$1,Reference_Sheet!A$2:A$100,"<="&B$2)
But you'll need to specify how many rows down you need it to go. In my formula, it looks down till 100 rows.
To change the number of rows, you need to change the number in three places:
&address(100
Reference_Sheet!A$2:A$100," ... in two places
To briefly explain what is going on:
Look for the person's name in row 1 using match.
Use address and indirect to build the address of the cells to add.
Then sumifs() based on the dates.
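The three steps above can be sketched in Python to show the logic independent of Sheets. This is only an illustration under some assumptions: the dates in Reference_Sheet are read as mm/dd/yy, and the names header, reference_rows and sum_for_name are hypothetical.

```python
from datetime import date

# Reference_Sheet from the question, with dates read as mm/dd/yy (an assumption).
header = ["Date", "John", "Matt"]
reference_rows = [
    (date(2019, 7, 1), 1, 2),
    (date(2019, 7, 2), 1, 2),
    (date(2019, 7, 3), 2, 1),
    (date(2019, 7, 4), 1, 1),
    (date(2019, 7, 5), 3, 3),
    (date(2019, 7, 6), 1, 2),
    (date(2019, 7, 7), 1, 1),
    (date(2019, 7, 8), 5, 9),
    (date(2019, 7, 9), 9, 2),
]


def sum_for_name(name, start, end):
    """Sum one person's column over an inclusive date range."""
    column = header.index(name)  # step 1: locate the column, like match on row 1
    # steps 2 and 3: take that column and add the rows whose date is in range
    return sum(row[column] for row in reference_rows if start <= row[0] <= end)


week1_start, week1_end = date(2019, 7, 1), date(2019, 7, 7)
print("John", sum_for_name("John", week1_start, week1_end))
print("Matt", sum_for_name("Matt", week1_start, week1_end))
```

With the sample data this gives the Week1 totals shown in Sheet1 (10 for John, 12 for Matt).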
alternative:
=SUMPRODUCT(QUERY(TRANSPOSE(QUERY($A:$D,
"where A >= date '"&TEXT(F$1, "yyyy-mm-dd")&"'
and A <= date '"&TEXT(F$2, "yyyy-mm-dd")&"'", 1)),
"where Col1 = '"&$E4&"'", 0))

Making DAX code more efficient - counting unique Start dates in overlapping date ranges

I have a table of every product purchased by every client over 25 years. The table contains client#, product, start date, and end date.
The products can be owned by the client for any amount of time (1 day to 100 years). While the client owns products with us, the client is active. If a client ends all products they cease to be a client. I want to count new client starts each year. The problem is, some clients end all products then start purchasing products again years later (but clients always retain the same client#). If the client leaves then rejoins years later I want to count the client as a new client.
I have created DAX code to do this which works perfectly on a small file, but the code uses up too many resources and so I cannot use it on my data (about 200,000 records). I know my code is HIGHLY INEFFICIENT and could probably be cleaned up, but I am not sure how. Alternatively, if I could figure out how to make these columns in Power Query, perhaps that would work.
Here is how I do it.
1) Add four calculated columns to my table:
VeryFirstStart = Calculate(
    Min('Products'[StartDate]),
    ALLEXCEPT(Products, Products[ClientNumber])) = Products[StartDate]
This flags records that contain the first ever start date of any client.
MaxEndDateofEarlierDates = Calculate(
    Max('Products'[EndDate]),
    Filter(
        Filter(ALLEXCEPT(Products, Products[ClientNumber]), Products[EndDate]),
        Products[StartDate] < EARLIER(Products[StartDate])))
This step blows up my PowerBI. This shows the date of any NEW product purchases where the new start date occurs AFTER an ending date.
Second+Start = And(
    Products[MaxEndDateofEarlierDates] <> BLANK(),
    Products[MaxEndDateofEarlierDates] < Products[StartDate])
This flags records where we want to count the new start date as a new client.
NewStart = OR(Products[Second+Start], Products[VeryFirstStart])
This flags ANY new client start date regardless of whether it was the first or a subsequent.
Finally I added this measure:
!MemberNewStarts = CALCULATE(
    DISTINCTCOUNT(Products[ClientNumber]),
    FILTER(
        'Products',
        ('Products'[StartDate] <= LASTDATE('DIMDate'[Date]) &&
         'Products'[StartDate] >= FIRSTDATE('DIMDate'[Date]) &&
         Products[NewStart] = TRUE())))
Does anyone have any suggestions about how to achieve this with less resources?
Thanks
Here is some data to try
MemberNumber Product StartDate EndDate Note (not in real data)
1 A 02/02/2003 02/02/2004
1 C 02/02/2009 02/02/2010
2 A 02/02/2001 02/02/2002
2 C 02/02/2001 02/02/2002
2 B 02/02/2005 02/02/2010
3 C 02/02/2002 02/02/2005
3 B 02/02/2002 02/02/2005
3 A 02/02/2003 02/02/2008
4 B 02/02/2002 02/02/2003
4 C 02/02/2003 02/02/2006
5 B 02/02/2003 02/02/2007
5 C 02/02/2005 02/02/2010
5 A 02/02/2005 02/02/2007
6 A 02/02/2001 02/02/2006
6 C 02/02/2003 02/02/2007
7 B 02/02/2001 02/02/2004
7 A 02/02/2001 02/02/2005
7 C 02/02/2005 02/02/2006
8 B 02/02/2002 02/02/2006
8 A 02/02/2004 02/02/2009
Note: member 1 starts as a new client in 2009 since all previous products ended in 2004, and member 2 starts as a new client in 2005 since all previous products ended in 2002.
The desired outcome is:
Start Year 2001 2002 2003 2004 2005 2006 2007 2008
New Clients 3 3 2 0 1 0 0 0
Here's one way of trying to solve it. Let me know if this is any more efficient than yours:
1st New Column:
PreviousHighestFinish:=
Calculate(
    Max(Products[EndDate]),
    ALLEXCEPT(Products,Products[ClientNumber]),
    Products[StartDate] < Earlier(Products[StartDate])
)
This will give you the latest end date where the Client Number matches and the start date is before the current start date. If there is no earlier start date, it returns a blank.
2nd New Column:
NewClientProduct:=
if(Products[StartDate]>=Products[PreviousHighestFinish],1,0)
This will give you a 1 for every row where the client has either not been seen before (and the previous column showed blank) or the client has been seen before, but has no current products.
The problem with this column is that if you have a client starting more than one product on the same date, they will show as multiple new clients.
The fix for this is to count up the instances of each client-date combination
3rd New Column:
ClientDateCount:=
CALCULATE(
COUNTROWS(Products),
ALLEXCEPT(Products,Products[ClientNumber],Products[StartDate])
)
This essentially gives the number of times that the client on this row in the table has started a product on this date.
Now divide the 2nd new column by this one
4th New Column:
NewClients:=
DIVIDE(Products[NewClientProduct],Products[ClientDateCount])
And voilà: summing NewClients by start year gives the number of new clients per year.
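The four-column logic above can be checked against the sample data with a small Python sketch. It is an illustration rather than DAX: the helper d() and the variable names are assumptions, and it uses a strict comparison (start date after the previous highest finish), as in the question's Second+Start column, so a product that starts on the same day an earlier one ends is not counted as a new start.

```python
from collections import defaultdict
from datetime import date


def d(year):
    """Every date in the sample data is 02/02 of some year."""
    return date(year, 2, 2)


# (client, product, start, end) rows from the question's sample data.
rows = [
    (1, "A", d(2003), d(2004)), (1, "C", d(2009), d(2010)),
    (2, "A", d(2001), d(2002)), (2, "C", d(2001), d(2002)),
    (2, "B", d(2005), d(2010)),
    (3, "C", d(2002), d(2005)), (3, "B", d(2002), d(2005)),
    (3, "A", d(2003), d(2008)),
    (4, "B", d(2002), d(2003)), (4, "C", d(2003), d(2006)),
    (5, "B", d(2003), d(2007)), (5, "C", d(2005), d(2010)),
    (5, "A", d(2005), d(2007)),
    (6, "A", d(2001), d(2006)), (6, "C", d(2003), d(2007)),
    (7, "B", d(2001), d(2004)), (7, "A", d(2001), d(2005)),
    (7, "C", d(2005), d(2006)),
    (8, "B", d(2002), d(2006)), (8, "A", d(2004), d(2009)),
]

new_clients_by_year = defaultdict(float)
for client, _, start, _ in rows:
    same_client = [r for r in rows if r[0] == client]
    # PreviousHighestFinish: latest end date among this client's earlier starts.
    earlier_ends = [r[3] for r in same_client if r[2] < start]
    previous_highest_finish = max(earlier_ends) if earlier_ends else None
    # NewClientProduct: never seen before, or everything earlier already ended.
    is_new = previous_highest_finish is None or start > previous_highest_finish
    # ClientDateCount: rows for this client that start on the same date.
    client_date_count = sum(1 for r in same_client if r[2] == start)
    # NewClients: split the flag across same-day rows so they add up to 1.
    new_clients_by_year[start.year] += (1 if is_new else 0) / client_date_count

for year in range(2001, 2009):
    print(year, int(new_clients_by_year[year]))
```

On the sample data this reproduces the desired outcome row for 2001 to 2008. Each row is compared against all rows of the same client, so the sketch is quadratic per client and is meant for checking the logic, not for 200,000 records.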

XPages ratio between 2 column totals in view

I have a view with 2 columns, both of number type. I want to add a button to the page which will calculate the ratio between the total of column 1 and the total of column 2.
For example:
Column 1 | Column 2
5 | 5
10 | 10
4 | 3
3 | 4
It should be (5+10+4+3)/(5+10+3+4) = 1.
Thank you,
Florin
I would probably go into the view properties and turn on the totals for both those columns. Then in SSJS I'd get a NotesViewNavigator for that view and then create a NotesViewEntry object. Then I'd set that ViewEntry to the NotesViewNavigator's getLast()... something like that. That entry should then have a getColumnValues() method. I THINK it would be something like:
var column1 = entry.getColumnValues()[1];
var column2 = entry.getColumnValues()[2];
Note: getColumnValues() is zero based and depending on the view design some columns might not be available to retrieve from getColumnValues().
Once you have your 2 vars then you can do the math and do whatever with the result.
That's how I would approach it at least.
(I don't have an editor handy so sorry if some of the syntax is off)
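The math itself is just the ratio of the two column totals. A minimal sketch in Python with the example values from the question; this is not XPages or SSJS code, the names are illustrative, and the guard against a zero total is an added assumption.

```python
# Example view rows from the question: (Column 1, Column 2).
view_rows = [(5, 5), (10, 10), (4, 3), (3, 4)]

total_1 = sum(row[0] for row in view_rows)
total_2 = sum(row[1] for row in view_rows)

# Avoid dividing by zero when the second column has no total.
ratio = total_1 / total_2 if total_2 else None

print(total_1, total_2, ratio)
```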
