I feel like my question should be easy to figure out, but I've looked around and can't seem to find out how to get a basic array spill function that produces the max value. Here's my simplified data set:
Col A     Col B
Apple     864
Carrot    189
Pear      256
Apple     975
Pear      873
Carrot    495
Apple     95
Pear      36
Carrot    804
My objective is to have a unique list of foods (from Col A) that returns the max corresponding value from Col B. The formula for the unique list from Col A is easy: =UNIQUE(FILTER(A:A,A:A<>"")). What I'm struggling with is getting a dynamic MAXIFS to align with it.
To illustrate, if I put the unique function in cell D2 (thus it would spill to d4 as shown below in blue), a correct corresponding non-array function would be =MAXIFS(B:B,A:A,D2) (shown in column e). I could drag this down the remaining rows but I would like this to be dynamic as there may be more food in my data set in the future.
What I would EXPECT to work is =FILTER(MAXIFS(B:B,A:A,D2:D),D2:D<>""), but this returns #VALUE!. By comparison, if I use SUMIF/Average, e.g. =FILTER(SUMIF(A:A,D2:D,B:B),D2:D<>""), I get what I WOULD expect (which really confuses me).
Is there a way to get a dynamic maxifs (or any function that produces an equal value in column E) that would spill based on unique values in column D?
try:
=QUERY({A:B}, "select Col1,max(Col2) where Col2 is not null group by Col1 label max(Col2)''")
bonus:
=QUERY({A:B}, "select Col1,max(Col2),sum(Col2) where Col2 is not null group by Col1 label max(Col2)'',sum(Col2)''")
bonus 2:
=SORTN(SORT(A1:B, 2, ), 9^9, 2, 1, 1)
2 - sort the second column of range A1:B
<empty> - or 0 or FALSE = "in descending order"
9^9 - output all rows
2 - 2nd mode of SORTN = "group by..."
1 - 1st column
1 - in ascending order
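For readers who find code easier to follow, the sort-then-deduplicate idea behind the SORTN formula can be sketched in Python. This is only an illustration: the helper name max_per_key is made up, and the data is the sample from the question.

```python
# Sample data from the question: (food, value) pairs.
rows = [
    ("Apple", 864), ("Carrot", 189), ("Pear", 256),
    ("Apple", 975), ("Pear", 873), ("Carrot", 495),
    ("Apple", 95), ("Pear", 36), ("Carrot", 804),
]


def max_per_key(pairs):
    """Hypothetical helper: largest value per key, keys in ascending order."""
    best = {}
    # Sort by value, largest first (the inner SORT on column 2, descending).
    for name, value in sorted(pairs, key=lambda r: r[1], reverse=True):
        # Keep only the first row seen for each name (SORTN's "group by" mode).
        best.setdefault(name, value)
    # Return the groups ordered by name, ascending.
    return sorted(best.items())


result = max_per_key(rows)
print(result)
```

Because the rows are sorted largest-first before duplicates are dropped, the first row kept for each food is its maximum.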
Responding to provide a clearer, simpler answer for others who find this while looking for the same thing:
The easiest way to accomplish this is by using an array formula such as:
=MAX(IF($A$1:$A$7="Apple",$B$1:$B$7)) followed by CTRL-SHIFT-ENTER
How can I find the most commonly found 'Code' (Col B) associated with each unique 'Name' in (Col A) and find the closest value if the 'Code' in Col B is unique?
The image below shows the shared Google Sheet with starting data in columns A & B and the desired output in columns C and D. Each unique Name has associated codes. Column D displays the most commonly occurring Code for each unique name. For example, Buick La Sabre 1 has 3 associated codes in B3, B4, B5, but D3 shows only 98761 because it appears more frequently than the other 2 codes do in B2:B. I will explain what I mean by the closest value below.
The Codes that have a count = 1 are unique so the output in column D tries to find the closest match.
However, when the count of the code in B2:B > 1, then the output in column D = to the most frequent code associated with the Name.
Approach when there are 2 or more of the same value in column B
Query
I thought I might use a QUERY with a ORDER BY count(B) DESC LIMIT 2 in a fashion similar to this working equation:
QUERY($A$1:$D$25,"SELECT A, B ORDER BY B DESC Limit 2",1)
but I could not get it to work when I substituted in the Count function.
SORT & INDEX OR VLOOKUP
If the query function can't be fixed to work, then I thought another approach might be to combine a Vlookup/Index after sorting column B in a descending order.
UNIQUE(sort($B$3:$B,if(len($B$3:$B),countif($B$3:$B,$B$3:$B),),0,1,1))
Since a VLOOKUP or INDEX just pulls the first value it finds, looking up each name in the list sorted by frequency would return the first matching value, which would be the most frequent one.
Approach when there are fewer than 2 of the same value in column B
This is a little more complicated since the values can be numbers and letters.
A solution like the one in the image below could be used if everything were a number. In our case the code is usually a 3-5 character alphanumeric string starting with 0-1 letters followed by numbers. I'm not sure what the best way to match a code like A1234 would be. I imagine a solution might be to SPLIT off the letters and try to match those first. For example, A1234 would be split into A | 1234, then matched on the closest letter and then the closest number. But I'm really not sure what the best solution would be within the constraints of Google Sheets.
In the event that a number is equidistant between two numbers, the lower number should be chosen. For example, if 8 is the number and the closest match would be 6 or 10, then 6 should be selected.
In the event that a letter is being used, it should work in a similar fashion. For example, thinking of {A, B, C} as {1, 2, 3}, B should preferentially match to A since it comes before C.
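To pin down the tie-breaking rule described above, here is a minimal Python sketch. The helper closest and its to_number parameter are hypothetical names used for illustration only; letters are compared by character code, which is an assumption about how "closest letter" should be measured.

```python
def closest(target, candidates, to_number=lambda x: x):
    """Hypothetical helper: nearest candidate, preferring the lower one on ties."""
    return min(
        candidates,
        # Primary key: distance to the target. Secondary key: the candidate's
        # own value, so the lower candidate wins when distances are equal.
        key=lambda c: (abs(to_number(c) - to_number(target)), to_number(c)),
    )


print(closest(8, [6, 10]))            # 6 and 10 are both 2 away from 8
print(closest("B", ["A", "C"], ord))  # letters compared by character code
```

Sorting on a (distance, value) pair is what makes the lower candidate win whenever two candidates are equally far from the target.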
In summary, I'm looking for a way to find the most frequent code in Col B associated with each unique name in Col A in this sheet and, where none of the codes in B2:B repeat, a formula that will find the closest match for a numeric or alphanumeric code.
You can use this formula:
=QUERY({range of numerators & denominators}, "select Col2, count(Col2) group by Col2 label Col2 'Denominator', count(Col2) 'Count'")
That outputs something like this:
Denominator   Count
Den 1         Count 1
Den 2         Count 2
use:
=ARRAY_CONSTRAIN(SORTN(QUERY({A3:B},
"select Col1,Col2,count(Col2)
where Col1 is not null
group by Col1,Col2
order by count(Col2) desc,Col2 asc
label count(Col2)''"), 9^9, 2, 1, 1), 9^9, 2)
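The logic of the formula above (count each name/code pair, order by count descending and code ascending, then keep the first row per name) can be sketched in Python. The data and the name most_frequent_code are hypothetical; they are not taken from the shared sheet.

```python
from collections import Counter

# Hypothetical (name, code) pairs; not the data from the shared sheet.
pairs = [
    ("Car A", "98761"), ("Car A", "98761"), ("Car A", "12345"),
    ("Car B", "A1234"), ("Car B", "B2000"), ("Car B", "B2000"),
    ("Car C", "55555"),
]


def most_frequent_code(pairs):
    """Hypothetical helper: most frequent code per name, ties -> lowest code."""
    counts = Counter(pairs)  # occurrences of each (name, code) combination
    # Order by count descending, then code ascending (the QUERY "order by").
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0][1]))
    result = {}
    for (name, code), _count in ranked:
        # The first row seen per name wins (SORTN's "group by" mode).
        result.setdefault(name, code)
    return dict(sorted(result.items()))


result = most_frequent_code(pairs)
print(result)
```

Note that this sketch only covers the "most frequent" half of the question; it does not attempt the closest-match rule for codes that occur once.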
I'm trying to calculate a row value based on the previous row's value in the same column within a report expression. I can't precalculate this in the database, since the starting point of the calculation depends on input parameters and the values in the table must be recalculated dynamically within the report itself.
In Excel, the analogous data and formula look like this (the starting point is always 100):
     B       C              D          E
     Price   PreviousPrice  CalcValue  Formula
1    NULL    NULL           100
2    2.6     2.5            104        B2/C2*D1
3    2.55    2.6            102        B3/C3*D2
4    2.6     2.55           104        B4/C4*D3
5    2.625   2.6            105        B5/C5*D4
6    2.65    2.625          106        B6/C6*D5
7    2.675   2.65           107        B7/C7*D6
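To make the recurrence in the table explicit, here is a short Python sketch of the same calculation. It only illustrates the arithmetic (it is not a report expression); the function name calc_values is made up, and the numbers are copied from the table.

```python
# (Price, PreviousPrice) for rows 2-7 of the table; row 1 only holds the start.
rows = [
    (2.6, 2.5), (2.55, 2.6), (2.6, 2.55),
    (2.625, 2.6), (2.65, 2.625), (2.675, 2.65),
]


def calc_values(rows, start=100.0):
    """Hypothetical helper: CalcValue_n = Price_n / PreviousPrice_n * CalcValue_(n-1)."""
    values = [start]
    for price, previous_price in rows:
        # Each value depends on the value computed for the row before it.
        values.append(price / previous_price * values[-1])
    return values


calc = calc_values(rows)
print([round(v, 2) for v in calc])
```

Each result feeds the next row, which is why the calculation needs access to the previously computed value rather than to a stored field.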
I tried to calculate expected values ("CalcValue" is the name of column where expression is set) like this:
=Fields!Price.Value / Fields!PreviousPrice.Value * Previous(ReportItems("CalcValue").Value)
but got an error "Aggregate functions can be used only on report items contained in page headers and footers"
Can you please advise whether the expected result is achievable in my case and suggest a solution?
Thank you in advance!
Sadly, I'm still facing the issue: the calculated column does not take the previously calculated value into account. E.g., I added a CalcVal field with 100 as the default and tried to calculate using the above approach, like: =previous(runningValue(Fields!CalcVal.Value, sum, "DataSet1") ) * Fields!Price.Value/Fields!PreviousPrice.Value.
But in this case it always multiplies Fields!Price.Value/Fields!PreviousPrice.Value by 100.
For example, the on-the-fly CalcVal below always shows 200:
=previous(runningValue(Fields!CalcVal.Value, sum, "DataSet1")) * 2
https://imgur.com/Wtg3Wsg
I tried with your sample data, here is how I achieved the results
Formula to use (you might have to handle null values):
=Fields!Price.Value/(Fields!PreviousPrice.Value*Previous(Fields!CalcValue.Value))
Edit: updated the answer after the OP's comment.
CalcValue is calculated on the fly with the formula below:
=RunningValue(CountDistinct("Tablix6"),Count,"Tablix6")*100
and then the final value as below:
=Fields!Price.Value/(Fields!PreviousPrice.Value*
Previous(RunningValue(CountDistinct("Tablix6"),Count,"Tablix6"))*100)
I have explored different solutions suggested on Stack Overflow. Honestly, I got a lot of #NUM! and #VALUE! errors. It looks like I really need help on this.
Sharing my effort so far.
A      B      C
Doc    Ref    A-Ref
3904   1234   3904
3904   1237   3904-1
3904   1235   3904-2
3907   1110   3907
3907   1111   3907-1
This is the sample data that I'm working on. I want to sort the 3 columns in descending order (2 numeric columns; Col C is not numeric because of the hyphen) using only Excel formulas (no VBA or the Sort ribbon command).
Column D = rank of numeric Col A; the formula used is =RANK(A2,A$2:A$6)
Column E = rank of numeric Col B; the formula used is =RANK(B2,B$2:B$6)
Column C contains a hyphen, so I may not be able to use RANK on it.
So I'm not sure what will work here. [There was a resource on Stack Overflow about sorting text by its first 3 letters in descending order, but my values are text made of numbers and a hyphen, so that solution won't help me.]
Now after RANK, what next?
How can I ensure that Col A is ordered descending, that its respective Col B entries are in turn ordered descending, and therefore Col C as well?
There will be duplicates on Col A, and at the max Col B.
Col C will be unique (so we won't have to create any intermediate column), but Col C can also be blank.
Please please help!
For non-numerical ranking, you may use COUNTIF.
=COUNTIF($B$2:$B$6,"<="&B2)
Hope this helps..
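As a rough illustration of what the COUNTIF approach computes, the Python sketch below ranks each text value by counting how many values are less than or equal to it. It uses the Col C sample values; the name countif_rank is made up, and Python compares strings by character code, which is an assumption here and may not match Excel's text ordering in every case.

```python
# Col C values from the sample data in the question.
values = ["3904", "3904-1", "3904-2", "3907", "3907-1"]


def countif_rank(values):
    """Hypothetical helper mirroring =COUNTIF(range, "<=" & cell) for each cell."""
    # For every value, count how many values in the list are <= it.
    return [sum(1 for other in values if other <= v) for v in values]


ranks = countif_rank(values)
print(ranks)
```

This gives an ascending rank (the smallest value gets 1); counting values that are greater than or equal instead would give the descending rank.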
I am trying to display a dataframe in an RMarkdown document using the Pander package.
I would like to highlight the minimum value in each row of values. Here's what I have tried:
df <- replicate(4, rnorm(5))
df <- as.data.frame(df)
df$min <- apply(df, 1, min)
emphasize.strong.cells(which(df == df$min, arr.ind = T))
pander(df[1:4])
When I do this I get the error:
Error in check.highlight.parameters(emphasize.strong.cells, nrow(t), ncol(t)) :
Too high number passed for column indexes that should be kept below 6
I can print out the whole table (with the min column) without any trouble or I can print out a partial table without emphasis, but neither of these is ideal. I want the highlighting, but I do not wish to include the 'min' column.
I imagine the fact that I am leaving some highlighted cells out of the pander command is causing the error.
Is there a way around this? Or a better way to do this?
Thanks.
Subquestion: What if I wanted to highlight the minimum in the first few rows and the maximum in the next few. Is that possible in a single table?
Instead of the which lookup, with the possibility to match row minimums in the wrong rows, you can easily construct those array indices with a simple sequence (1:N) and calling which.min on each row, eg with apply:
> df <- replicate(4, rnorm(5))
> df <- as.data.frame(df)
> emphasize.strong.cells(cbind(1:nrow(df), apply(df, 1, which.min)))
> pander(df)
----------------------------------------------
V1 V2 V3 V4
----------- ----------- ----------- ----------
0.6802 0.1409 **-0.7992** 0.1997
0.6797 **-0.2212** 1.016 0.6874
2.031 -0.009855 0.3881 **-1.275**
1.376 0.2619 **-2.337** -0.1066
**-0.4541** 1.135 -0.1566 0.2912
----------------------------------------------
About your next question: you could of course do that in a single table, eg rbind two matrices created similarly as described above with which.min and which.max.
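The index construction used above (pairing each row number with the column position of that row's minimum) is language independent. As a hedged illustration only, here is the same idea in Python, using the values from the printed table; row_min_cells is a made-up name and is not part of pander.

```python
# Values copied from the printed table above.
table = [
    [0.6802, 0.1409, -0.7992, 0.1997],
    [0.6797, -0.2212, 1.016, 0.6874],
    [2.031, -0.009855, 0.3881, -1.275],
    [1.376, 0.2619, -2.337, -0.1066],
    [-0.4541, 1.135, -0.1566, 0.2912],
]


def row_min_cells(table):
    """Hypothetical helper: 1-based (row, column) of the minimum in each row."""
    cells = []
    for i, row in enumerate(table, start=1):
        # Position of the row minimum, like which.min applied to one row.
        j = row.index(min(row)) + 1
        cells.append((i, j))
    return cells


cells = row_min_cells(table)
print(cells)
```

Because each pair is built from its own row, a minimum from one row can never be matched against a cell in a different row.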
I have a huge list of items in Column A (1,000 items) and a smaller list of items in Column B (510 items).
I want to put a formula in Column C to show only the Column A items not in Column B.
How to achieve this through a formula, preferably a FILTER formula?
Select the list in column A
Right-Click and select Name a Range...
Enter "ColumnToSearch"
Click cell C1
Enter this formula: =MATCH(B1,ColumnToSearch,0)
Drag the formula down for all items in B
If the formula fails to find a match, it will be marked "#N/A", otherwise it will be a number.
If you'd like it to be TRUE for match and FALSE for no match, use this formula instead:
=ISNUMBER(MATCH(B1,ColumnToSearch,0))
If you'd like to return the unfound value, and an empty string for found values, use:
=IF(ISNUMBER(MATCH(B1,ColumnToSearch,0)),"",B1)
An alternative method is simply:
=FILTER(A1:A,IF(COUNTIF(B1:B,A1:A),0,1))
It's much more efficient.
It uses COUNTIF to get, as an array, how many times each value in A appears in B (0 if it is missing), then flips that to 1 for the missing values and 0 for the present ones, and filters A on the result.
Columns look like this
A B
1 2
2 5
3
4
5
Formulae for the values in A that ARE in B:
=FILTER(A1:A, MATCH(A1:A, B1:B, 0))
=FILTER(A1:A, COUNTIF(B1:B, A1:A))
Formulae for the values in A that ARE NOT in B:
=FILTER(A1:A, ISNA(MATCH(A1:A, B1:B, 0)))
=FILTER(A1:A, NOT(COUNTIF(B1:B, A1:A)))
in your case:
=FILTER(A1:A; ISNA(MATCH(A:A; B:B; )))
if you face a mismatch of ranges see: https://stackoverflow.com/a/54795616/5632629
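For comparison, the COUNTIF-based versions of the two filters can be sketched in Python using the small sample columns shown above (A = 1 to 5, B = 2 and 5). The countif helper is a made-up stand-in for the spreadsheet function, not a real library call.

```python
# Sample columns from the answer above.
col_a = [1, 2, 3, 4, 5]
col_b = [2, 5]


def countif(values, criterion):
    """Hypothetical stand-in for COUNTIF: how many values equal the criterion."""
    return sum(1 for v in values if v == criterion)


# Values of A that ARE in B: keep rows where the count is non-zero.
in_b = [a for a in col_a if countif(col_b, a)]
# Values of A that ARE NOT in B: keep rows where the count is zero.
not_in_b = [a for a in col_a if not countif(col_b, a)]

print(in_b, not_in_b)
```

As in the formulas, the only difference between the two filters is whether the count is negated before filtering.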