Is there a way to filter and return results for the unique of one column, with conditions of another?

Question
For the data below, is there a way to return results —for each order in col B— either:
If the most recent status [col D] for an order (ex. for order 10021) is closed, then return that row.
If not, return every row since the most recent closed status for that order (ex. for order 10020, rows 4 and 5).
Previous efforts and attempted solutions
Previously, I was only returning one result, the most recent status for each order with the following:
=SORTN(SORT(A2:D,1,FALSE),9^9,2,2,FALSE)
However, I would like orders to be able to have more than one current status.
I've tried a few things, and was able to achieve what I'm looking for, unfortunately only if there is one order, with the following:
(The linked sheet below explains how I got to this)
=IFERROR(FILTER(A2:D5,A2:A5>INDEX(SORT(FILTER(A2:D5,D2:D5="CLOSED"),1,0),1,1)),FILTER(A2:D5,A2:A5>=INDEX(SORT(FILTER(A2:D5,D2:D5="CLOSED"),1,0),1,1)))
The other alternative I can think of is a script with a loop.
Summary
It was difficult to know how to title this question, but I settled on this one since we're essentially trying to filter for the unique values of col B, with conditions against cols A and D.
Here's a link to a sample Google spreadsheet you can edit, showing all the attempts.
All your help and comments are greatly appreciated!

maybe like this:
=ARRAYFORMULA(UNIQUE(SORT({VLOOKUP(UNIQUE(INDIRECT("B2:B"&COUNTA(B2:B)+1)),
SORT({B2:B, TO_TEXT(A2:D)}, 2, 0), {2, 3, 4, 5}, 0);
FILTER(A2:D, D2:D="RETURN")},1,1)))

Solution
All credit to Matt King who found a complete answer.
=ARRAYFORMULA(QUERY({A:D,(COUNTIFS(C:C,C:C,A:A,">="&A:A)=1)*(D:D="CLOSED")
+NOT(REGEXMATCH(TRIM(TRANSPOSE(QUERY(IF((TRANSPOSE(A:A)<=A:A)*(TRANSPOSE(C:C)=C:C),D:D,)
,,9^99))),"CLOSED"))},"select Col1,Col2,Col3,Col4 where Col5=1"))
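Essentially, the formula works in four steps:
Flag the most recent row of each order (the COUNTIFS count equals 1) when its status is CLOSED.
For every row, transpose and combine the statuses of that order's rows dated on or after it.
If there is a CLOSED status in that combined string, don't flag the row; otherwise flag it.
Then filter with the query language, keeping the rows where the flag column (Col5) is 1.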
See solution here.
Mod
I added a clause to completely exclude the status NOTE from current records:
=ARRAYFORMULA(QUERY({A:D,(COUNTIFS(B:B,B:B,A:A,">="&A:A)=1)*(D:D="CLOSED")+
(ARRAYFORMULA(IF((ARRAYFORMULA(NOT(REGEXMATCH(TRIM(TRANSPOSE(QUERY
(IF((TRANSPOSE(A:A)<=A:A)*(TRANSPOSE(B:B)=B:B),D:D,),,9^99))),"CLOSED"))))
=TRUE,IF((ARRAYFORMULA(NOT(TRANSPOSE(ARRAYFORMULA(TRANSPOSE(D:D)))="NOTE")))
=FALSE,FALSE,TRUE),FALSE)))},"select Col1,Col2,Col3,Col4 where Col5=1"))
Implementation
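This was tested against a sample of 2,500 rows of data and took over 80 seconds to execute, most likely because the TRANSPOSE comparisons build a rows-by-rows matrix over entire columns. So although this answers the question, it isn't necessarily a viable solution.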
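The selection rule itself only needs one pass over each order's rows. The following is a minimal Python sketch of that rule, not a Sheets formula. The tuple layout (date, order, detail, status), the sample values, and the name current_rows are assumptions for illustration, and it presumes dates are unique within an order.

```python
from collections import defaultdict

def current_rows(rows):
    """Return the 'current' rows per order.

    rows: iterable of (date, order, detail, status) tuples (assumed layout,
    mirroring columns A to D).
    """
    by_order = defaultdict(list)
    for row in rows:
        by_order[row[1]].append(row)

    result = []
    for order_rows in by_order.values():
        order_rows.sort(key=lambda r: r[0])  # oldest first
        # Index of the most recent CLOSED row for this order, if any.
        last_closed = None
        for i, r in enumerate(order_rows):
            if r[3] == "CLOSED":
                last_closed = i
        if last_closed is None:
            result.extend(order_rows)  # never closed: keep everything
        elif last_closed == len(order_rows) - 1:
            result.append(order_rows[-1])  # latest status is CLOSED
        else:
            result.extend(order_rows[last_closed + 1:])  # rows since the close
    return sorted(result, key=lambda r: r[0])

# Hypothetical sample data.
sample = [
    (1, 10020, "x", "OPEN"),
    (2, 10020, "x", "CLOSED"),
    (3, 10020, "x", "OPEN"),
    (4, 10020, "x", "RETURN"),
    (5, 10021, "y", "OPEN"),
    (6, 10021, "y", "CLOSED"),
]
selected = current_rows(sample)
```

In a spreadsheet setting the same loop could live in a script, which is the alternative the question mentions.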

Related

In Google Sheets, How do I sumif(s) over a comma separated list in string?

In our schools, we have books of the same title by the same author but different ISBN #s. I am working on an inventory list so that we can scan the different ISBNs and then find out what is on hand for a title.
Here is my working spreadsheet demo. In the live version, columns A-D (data that comes in on another sheet, possibly from Google Forms) will be separated from columns F-J (a separate sheet that does all the math). For convenience and testing, they are all on one sheet.
Essentially, in column F, I would like to sum all the quantities in A where the ISBNs in C match any of the values in G.
The formula I am using in F doesn't seem to completely work:
=SUMIF(C:C,arrayformula(split(G2,",")),A:A)
It captures the first match but ignores the rest rather than looping over them. I have looked at SUMIFS and MATCH and I cannot seem to get any closer with the syntax. I would greatly appreciate it if anyone could help me solve this dilemma.
Additionally, I know how to do this with a custom script but I need to avoid that as end users break things for one reason or another and I can't handle the debugging load the way this could possibly be deployed.
Thanks in advance for anyone willing to take a look at this!
~Allan
Try in F2
=sum(query(A:D,"select A where C matches '"& textjoin("|",,split(G2,",")) &"' ",0))
Delete everything in F2:F and J2:J, then use this in F2:
=INDEX(IF(G2:G="",,MMULT(IFERROR(VLOOKUP(SPLIT(G2:G, ","), {C:C, A:A}, 2, ), 0),
SEQUENCE(COLUMNS(SPLIT(G2:G, ",")), 1, 1, ))))
in J2 use:
=ARRAYFORMULA(IF(G2:G="",,F2:F*I2:I))
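The logic behind these formulas (split the comma-separated list, then sum the quantities whose ISBN appears in it) can be sketched in Python as follows. This is only an illustration of the idea, not a replacement for the formulas. The function name sum_for_isbns, the (quantity, isbn) pair layout, and the sample values are assumptions.

```python
def sum_for_isbns(rows, isbn_list):
    """Sum quantities for every row whose ISBN is in a comma-separated list.

    rows: iterable of (quantity, isbn) pairs, mirroring columns A and C.
    isbn_list: a string such as "111,222", mirroring one cell of column G.
    """
    # Split on commas and trim stray spaces so "111, 222" also works.
    wanted = {part.strip() for part in isbn_list.split(",") if part.strip()}
    return sum(qty for qty, isbn in rows if isbn in wanted)

# Hypothetical inventory rows: (quantity, ISBN).
inventory = [(3, "111"), (5, "222"), (2, "333"), (4, "111")]
total = sum_for_isbns(inventory, "111, 222")
```

Using a set for the wanted ISBNs keeps each lookup cheap, and every matching row contributes to the total, not only the first match.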

In Wolfram Mathematica, how do I query the result of a Counts operation efficiently and conveniently?

EDIT: At the suggestion of @HighPerformanceMark, I've moved the question to mathematica.stackexchange.com: my question. I attempted to close the question here, but SO doesn't allow me to do it properly, hence this up-front warning.
Setup
Say, I'm given a dataset, like the one below:
titanic = ExampleData[{"Dataset", "Titanic"}]; titanic
And I want to count the occurrences of any combination between { "1st", "2nd"} and {"female", "male"}, using the Counts operator on the dataset, like:
genderclasscounts = titanic[All, {"class", "sex"}][Counts]
Problem statement
This is not a "flat" dataset, and I don't have a clue how to query it in the usual way, like:
genderclasscounts[Select[ ... ], ...]
The resulting dataset doesn't provide "column" names to be used as parameters in the Select, nor can I refer to the number representing the count by a name.
And I have no clue how to express an Association as a value in a Select.
Furthermore, try genderclasscounts[Print]: this demonstrates that the values presented to an operation over this dataset are just numbers.
An unsatisfactory attempt
Of course, I can "flatten" the Counts result, by doing something horrific and inefficient like:
temp = Dataset[(row \[Function]
     AssociationThread[{"class", "sex", "count"} -> row]) /@ (Nest[
     Normal, genderclasscounts, 3] /.
    Rule[{Rule["class", class_], Rule["sex", sex_]},
      count_] -> {class, sex, count})]
In this form it is easy to query a count result:
First@temp[Select[#class == "1st" \[And] #sex == "female" &], "count"]
Question
So, my questions are
How can I query the (immediate) result of the Counts operation in a convenient and efficient fashion, like using a Select operation on the resulting dataset? Or, if that is not possible:
Is there an efficient and convenient transformation of the Counts result dataset that facilitates such a query? By "convenient" I mean, for example, that you just provide the dataset and the transformation handles the rest. So, not something like my unsatisfactory "solution" above ;-)
Thanks for reading this far. I'm looking forward to answers and inspiration.
/@nanitous

How to sort text columns in a specific manner?

I was wondering if it is possible to sort text rows in a specific order. What I mean is: imagine we have a column of 15 rows, each of which could contain one of:
Foo
Bar
Something
Other_example
We want to be able to sort them in this particular order. The easy option I found was to put numbers in front of them (1 - Foo, 2 - Bar, etc.) and then sort them in the normal (alphanumerical) way, but that is not really visually appealing.
Is there a simpler way to do it, for example by "hiding" the numbers? I could of course write a script with a condition such as a switch-case, but that isn't really what I'm looking for; I suspect this can be done easily within the spreadsheet itself (though I have never explored Sheets' capabilities very far). If there is no other option, I will write the script and add a simple button to run it.
Thanks for your time!
Sanimys
=ARRAYFORMULA(ARRAY_CONSTRAIN(SORT(IFERROR(
VLOOKUP(C2:C, {A2:A, ROW(A2:A)}, {1, 2}, 0)), 2, 1), 999^99, 1))
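The formula above appears to look up each value's row number in a reference list and sort by that number. A minimal Python sketch of the same idea follows. The function name sort_by_custom_order and the sample values are assumptions, and values missing from the reference list are simply dropped here, which only roughly mirrors how IFERROR blanks them in the formula.

```python
def sort_by_custom_order(values, order):
    """Sort values by their position in a reference list.

    values: the items to sort (column C in the formula).
    order: the reference list in the desired order (column A in the formula).
    """
    # Map each reference item to its rank, like pairing A2:A with ROW(A2:A).
    rank = {name: position for position, name in enumerate(order)}
    # Keep only known values and sort them by rank.
    return sorted((v for v in values if v in rank), key=rank.__getitem__)

reference = ["Foo", "Bar", "Something", "Other_example"]
unsorted_values = ["Bar", "Other_example", "Foo", "Bar", "Unknown"]
ordered = sort_by_custom_order(unsorted_values, reference)
```

The displayed text never carries a number, so the order is controlled entirely by the reference list.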

Reporting Multiple Values & Sorting

Having a bit of an issue and unsure if it's actually possible to do.
I'm working on a file in which I enter target progression against the actual target and report the outcome as a percentage.
PAGE 1
¦NAME ¦TAR 1 %¦TAR 2 %¦TAR 3 %¦TAR 4 %¦OVERALL¦SUB 1¦SUB 2¦SUB 3¦
¦NAME1¦ 114%¦ 121%¦ 100%¦ 250%¦ 146%¦ 2¦ 0¦ 0%¦
¦NAME2¦ 88%¦ 100%¦ 90%¦ 50%¦ 82%¦ 0¦ 1¦ 0%¦
¦NAME3¦ 82%¦ 54%¦ 64%¦ 100%¦ 75%¦ 6¦ 6¦ 15%¦
¦NAME4¦ 103%¦ 64%¦ 56%¦ 43%¦ 67%¦ 4¦ 4¦ 24%¦
¦NAME5¦ 87%¦ 63%¦ 89%¦ 0%¦ 60%¦ 3¦ 2¦ 16%¦
All rows are already sorted by the Overall % column so I can see the ranking at a glance, but I am creating a second page that needs to reference individual results.
On the second page I would like to sort by and reference different columns, for example:
PAGE 2
TOP TAR 1¦Name of top %¦Top %¦
TOP TAR 2¦Name of top %¦Top %¦
Is something like this possible to do?
Essentially I'm creating an Employee of the Month form that automatically works out who has topped what.
I'm willing to drop a paypal donation for whoever can figure this out for me as I've been doing it manually every month and would appreciate the time saved
I don't think a complicated array formula is necessary for this - I am suggesting a fairly standard Index/Match approach.
First set up the row titles - you can just copy and transpose them from Page 1, or use a formula in A2 of Page 2 like
=transpose('Page 1'!B1:E1)
Then use them in an index/match to get the data in the corresponding column of the main sheet and find its maximum (in C2):
=max(index('Page 1'!A:E,0,match(A2,'Page 1'!A$1:E$1,0)))
Finally look up the maximum in the main sheet to find the corresponding name:
=index('Page 1'!A:A,match(C2,index('Page 1'!A:E,0,match(A2,'Page 1'!A$1:E$1,0)),0))
If you think there could be a tie for first place with two or more people getting the same score, you could use a filter to get the different names:
So if the max score is in B8 this time (same formula)
=max(index('Page 1'!A:E,0,match(A8,'Page 1'!A$1:E$1,0)))
the different names could be spread across the corresponding row using transpose (in C8)
=ArrayFormula(TRANSPOSE(filter('Page 1'!A:A,index('Page 1'!A:E,0,match(A8,'Page 1'!A$1:E$1,0))=B8)))
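The same lookup (find a column's maximum, then list every name that reaches it) can be sketched in Python. This is an illustration of the logic only. The function name top_scorers and the dictionary layout are assumptions, the scores are taken from the first two target columns of the Page 1 table in the question, and the tie example uses made-up data.

```python
def top_scorers(table, target):
    """Return (best score, names with that score) for one target column."""
    best = max(row[target] for row in table)  # like MAX(INDEX(..., MATCH(...)))
    names = [row["NAME"] for row in table if row[target] == best]  # like FILTER
    return best, names

# Percentages from the question's Page 1 table, stored as whole numbers.
page1 = [
    {"NAME": "NAME1", "TAR 1": 114, "TAR 2": 121},
    {"NAME": "NAME2", "TAR 1": 88, "TAR 2": 100},
    {"NAME": "NAME3", "TAR 1": 82, "TAR 2": 54},
    {"NAME": "NAME4", "TAR 1": 103, "TAR 2": 64},
    {"NAME": "NAME5", "TAR 1": 87, "TAR 2": 63},
]
best_tar1 = top_scorers(page1, "TAR 1")

# Made-up data showing a tie for first place.
tied = [
    {"NAME": "A", "TAR 1": 90},
    {"NAME": "B", "TAR 1": 90},
    {"NAME": "C", "TAR 1": 50},
]
best_tied = top_scorers(tied, "TAR 1")
```

Returning a list of names covers both the single-winner case and the tie case with one function.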
I have changed the test data slightly to show these different scenarios
Results

Report Builder Expressions

I'm new to Report Builder and having issues with some expressions that I'm trying to implement in a report. I got the standard ones to work, but as soon as I try any distinctions I get error messages. Over the last couple of weeks I've tried many combinations, read the expression help, searched Google, and looked at other questions on various sites. To reduce my frustration, I would even switch to other expressions or walk away, hoping to come back with different insight.
It's probably something simple, or something I don't know about writing expressions.
I'm hoping that someone can help with these expressions; they are the versions I get the fewest errors with (usually just "expression expected") and they show what I'm trying to accomplish.
=IIF((Fields!RECORDFLAG.Value)='D',COUNTDISTINCT(Fields!TICKETNUM.Value),0)
=IIF((Fields!TRANSTYPE.Value)='1' and (Fields!RECORDFLAG.VALUE)='A' or
'B',SUM(Fields!DOLLARS.Value),0)
=IIF((Fields!TRANSTYPE.Value)='1' and
(Fields!RECORDFLAG.VALUE)='P',SUM(Fields!DOLLARS.Value),0)
=Sum([DOLLARS] case when [RECORDFLAG]='P' then -1*[DOLLARS])
Thank You.
=IIF((Fields!RECORDFLAG.Value)="D",COUNTDISTINCT(Fields!TICKETNUM.Value))
The error message gives you the answer here: no false part of the iif() has been specified. Use =IIF((Fields!RECORDFLAG.Value)="D",COUNTDISTINCT(Fields!TICKETNUM.Value), 0)
=IIF((Fields!TRANSTYPE.Value)="1" and (Fields!RECORDFLAG.VALUE)="A" or "B",SUM(Fields!DOLLARS.Value),0)
This is not how an OR works in SSRS. Use:
=IIF((Fields!TRANSTYPE.Value)="1" and (Fields!RECORDFLAG.VALUE="A" or Fields!RECORDFLAG.Value = "B"),SUM(Fields!DOLLARS.Value),0)
The 0s are returned due to your report design. countdistinct() is an aggregate function - it's meant to be used on a set of data. However, your iif() is only testing on a per row basis - you're basically saying "if the current row is thing, count all the distinct values" which doesn't make sense. There are a couple of ways forward:
You can count the number of times a certain value occurs in a given condition using a sum(). This is not the same as the countdistinct(), but if you use =sum(iif(Fields!RECORDFLAG.Value = "D", 1, 0)) then you will get the number of times RECORDFLAG is D in that set. Note: this requires the data to be aggregated (so in SSRS, grouped in a tablix).
You can use custom code to count distinct values in a set. See https://itsalocke.com/aggregate-on-a-lookup-in-ssrs/. You can apply this even if you have only one dataset - just reference the same one twice.
You can change the way your report works. You can group on Fields!RECORDFLAG.Value and filter the group to where Fields!RECORDFLAG.Value = "D". Then in your textbox, use =countdistinct(Fields!TICKETNUM.Value) to get the distinct values for TICKETNUM when RECORDFLAG is D.
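The difference between counting matching rows and counting distinct values among matching rows can be seen in a small Python sketch. This only illustrates the aggregate logic described above and is not SSRS code. The sample rows and both function names are assumptions.

```python
def count_rows_with_flag(rows, flag):
    """Analogue of sum(iif(RECORDFLAG = flag, 1, 0)): counts matching rows."""
    return sum(1 if row["RECORDFLAG"] == flag else 0 for row in rows)

def count_distinct_tickets(rows, flag):
    """Analogue of filtering to the flag, then counting distinct TICKETNUM."""
    return len({row["TICKETNUM"] for row in rows if row["RECORDFLAG"] == flag})

# Hypothetical dataset rows.
dataset = [
    {"RECORDFLAG": "D", "TICKETNUM": "T1"},
    {"RECORDFLAG": "D", "TICKETNUM": "T1"},
    {"RECORDFLAG": "D", "TICKETNUM": "T2"},
    {"RECORDFLAG": "A", "TICKETNUM": "T3"},
]
matching_rows = count_rows_with_flag(dataset, "D")
distinct_tickets = count_distinct_tickets(dataset, "D")
```

The two results differ whenever a ticket number repeats within the matching rows, which is why the sum() approach is not a drop-in replacement for a distinct count.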
