SUM a column by IF criteria in second column - Google Sheets - filter

I have a table like this with two columns (A and B) - the first column contains times, the second column contains "checkboxes":
12:00:00 x
10:00:00
13:00:00 x
I want to SUM up all times with an "x" so the outcome should be 25:00:00.
I tried =VLOOKUP(A4;A1:B3;1;FALSE) with the key A4 = x, but sadly this does not work (x was not found).

SUMIF is what you're looking for:
=SUMIF(B:B, "=x", A:A)
Where B:B is the range to match to the condition =x and A:A is the range to sum when the condition is met.
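For readers who want to check the logic outside the spreadsheet, here is a minimal Python sketch of the same conditional sum. The helper names sum_if and format_duration are made up for illustration, and the comparison here is case-sensitive, which may differ from how SUMIF matches text in the sheet.

```python
from datetime import timedelta

def sum_if(criteria, values, wanted):
    """Sum the values whose matching criteria cell equals `wanted`."""
    total = timedelta()
    for flag, value in zip(criteria, values):
        if flag == wanted:
            total += value
    return total

def format_duration(d):
    """Format a timedelta as h:mm:ss, letting hours run past 24."""
    seconds = int(d.total_seconds())
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

# Sample data from the question: column A times, column B marks.
times = [timedelta(hours=12), timedelta(hours=10), timedelta(hours=13)]
marks = ["x", "", "x"]

result = format_duration(sum_if(marks, times, "x"))
print(result)  # 25:00:00
```

The h:mm:ss formatting step matters because a plain time format wraps at 24 hours, so the total would otherwise not display as 25:00:00.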

=TEXT(SUMIF(B:B, "=x", A:A), "[h]:mm:ss")
=TEXT(SUM(FILTER(A:A, B:B="x")), "[h]:mm:ss")
=ARRAYFORMULA(TEXT(SUM(IF(B:B="x", A:A, )), "[h]:mm:ss"))
=TEXT(SUM(QUERY(A:B, "select A where B ='x'")), "[h]:mm:ss")
=TEXT(SUMPRODUCT((B:B="x")*(A:A)), "[h]:mm:ss")

You could use:
=SUMPRODUCT(($B$1:$B$3="X")*($A$1:$A$3))

Something like:
=SUMPRODUCT((A1:A100)*(B1:B100="x"))
with the proper formatting applied.

Related

Most common "denominators" in a two column list in Google Sheets

How can I find the most commonly found 'Code' (Col B) associated with each unique 'Name' in (Col A) and find the closest value if the 'Code' in Col B is unique?
The image below shows the shared Google Sheet with starting data in columns A & B and the desired output in columns C and D. Each unique Name has associated codes. Column D displays the most commonly occurring Code for each unique name. For example, Buick La Sabre 1 has 3 associated codes in B3, B4, B5, but D3 shows only 98761 because it appears more frequently than the other 2 codes do in B2:B. I will explain what I mean by the closest value below.
The Codes that have a count = 1 are unique so the output in column D tries to find the closest match.
However, when the count of the code in B2:B > 1, then the output in column D = to the most frequent code associated with the Name.
Approach when there are 2 or more of the same value in column B
Query
I thought I might use a QUERY with an ORDER BY count(B) DESC LIMIT 2, in a fashion similar to this working equation:
QUERY($A$1:$D$25,"SELECT A, B ORDER BY B DESC Limit 2",1)
but I could not get it to work when I substituted in the Count function.
SORT & INDEX OR VLOOKUP
If the query function can't be fixed to work, then I thought another approach might be to combine a Vlookup/Index after sorting column B in a descending order.
UNIQUE(sort($B$3:$B,if(len($B$3:$B),countif($B$3:$B,$B$3:$B),),0,1,1))
Since a VLOOKUP or INDEX with multiple criteria just pulls the first matching value it finds, sorting by frequency first would mean that the first match is also the most frequent value.
Approach when there are fewer than 2 of the same value in column B
This is a little more complicated since the values can be numbers and letters.
A solution like that seen in the image below could be used if everything were a number. In our case the code will usually be 3 - 5 alphanumeric characters, starting with 0 - 1 letters followed by numbers. I'm not sure what the best way to match a code like A1234 would be. I imagine a solution might be to SPLIT off the letters and try to match those first. For example, A1234 would be split into A | 1234, matching first the closest letter and then the closest number. But I really am not sure what the best solution is within the constraints of Google Sheets.
In the event that a number is equidistant between two numbers, the lower number should be chosen. For example, if 8 is the number and the closest match would be 6 or 10, then 6 should be selected.
In the event that a letter is being used it should work in a similar fashion. For example, thinking of {A, B, C} as {1, 2, 3}, B should preferentially match to A since it comes before C.
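To make the tie-breaking rule concrete, here is a rough Python sketch of "closest value, lower one wins on a tie". The function names are hypothetical, and treating letters by their character codes is only one assumed way of mapping {A, B, C} to {1, 2, 3}.

```python
def closest(target, candidates):
    """Return the candidate nearest to target; on a tie pick the lower one."""
    # Sort key: distance first, then the value itself to break ties downward.
    return min(candidates, key=lambda c: (abs(c - target), c))

def closest_letter(target, candidates):
    """Same rule for single letters, compared by character code."""
    return chr(closest(ord(target), [ord(c) for c in candidates]))

print(closest(8, [6, 10]))             # 6
print(closest_letter("B", ["A", "C"])) # A
```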
In summary, I am looking for a way to find the most frequent code in col B associated with each unique name in col A in this sheet and, where none of the codes in B2:B repeat, a formula that will find the closest match for a number or alphanumeric code.
You can use this formula:
=QUERY({range of numerators & denominators}, "select Col2, count(Col2) group by Col2 label Col2 'Denominator', count(Col2) 'Count'")
That outputs something like this:
Denominator Count
Den 1 Count 1
Den 2 Count 2
use:
=ARRAY_CONSTRAIN(SORTN(QUERY({A3:B},
"select Col1,Col2,count(Col2)
where Col1 is not null
group by Col1,Col2
order by count(Col2) desc,Col2 asc
label count(Col2)''"), 9^9, 2, 1, 1), 9^9, 2)
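As a cross-check of what this formula appears to do (count each Name/Code pair, order by count descending and code ascending, then keep the first row per name), here is a small Python sketch. The sample rows and the function name most_common_code are invented for illustration and are not taken from the shared sheet.

```python
from collections import Counter

def most_common_code(rows):
    """Return {name: code}, picking each name's most frequent code.

    Ties are broken by the smaller code, mirroring
    'order by count(Col2) desc, Col2 asc'.
    """
    counts = Counter((name, code) for name, code in rows if name)
    best = {}
    for (name, code), n in counts.items():
        current = best.get(name)
        # Prefer a higher count; on equal counts prefer the smaller code.
        if current is None or (-n, code) < (-current[1], current[0]):
            best[name] = (code, n)
    return {name: code for name, (code, _) in best.items()}

# Hypothetical sample data: (name, code) pairs.
rows = [
    ("Buick", "98761"), ("Buick", "98761"), ("Buick", "12345"),
    ("Ford", "A100"), ("Ford", "A050"),
]
picked = most_common_code(rows)
print(picked)  # {'Buick': '98761', 'Ford': 'A050'}
```

Note that this only covers the "most frequent code" half of the question; it does not attempt the closest-match case.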

Sorting 3 columns ( 2 numerical data and 1 text) with duplicate values in descending order using excel formula only

I have explored different solutions suggested on stackoverflow. Honestly, I got a lot of #NUM! and #VALUE! errors. Looks like I really need help on this.
Sharing my effort so far.
A B C
Doc Ref A-Ref
3904 1234 3904
3904 1237 3904-1
3904 1235 3904-2
3907 1110 3907
3907 1111 3907-1
This is the sample data that I'm working on. I want to sort 3 columns in descending order (2 numeric columns; Col C is not numeric because of the hyphen) using only Excel formulas (no VBA or the SORT ribbon).
Column D = Rank of Numeric Col A - formula used is =RANK(A2,A$2:A$6)
Column E = Rank of Numeric Col B - formula used is =RANK(B2,B$2:B$6)
For Column C, since it has a hyphen, I may not be able to use RANK.
Therefore I'm not sure what will work here. [There was a resource on stackoverflow about sorting text by its first 3 letters in descending order, but my values contain a special character (a hyphen) along with numbers, so they are text, and that solution won't help me.]
Now after RANK, What next?
How can I ensure that Col A is ordered descending, that its respective Col B entries are in turn ordered descending, and therefore Col C as well?
There will be duplicates in Col A and, at most, in Col B.
Col C will be unique (therefore we'll not have to create any intermediate column) but Col C can be blank as well
Please please help!
For non-numerical ranking, you may use COUNTIF.
=COUNTIF($B$2:$B$6,"<="&B2)
Hope this helps..
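The idea of ranking text by counting how many entries are less than or equal to it can be sketched in Python as below. This is only an illustration: it uses plain string comparison on the Col C values from the question, which may not match how the spreadsheet compares mixed text and numbers, and countif_rank is a made-up name.

```python
def countif_rank(values, x):
    """Count how many entries are <= x, in the spirit of COUNTIF(range, "<="&x)."""
    return sum(1 for v in values if v <= x)

# Col C values from the question, treated as text.
refs = ["3904", "3904-1", "3904-2", "3907", "3907-1"]

ranks = [countif_rank(refs, r) for r in refs]
print(ranks)  # [1, 2, 3, 4, 5]

# A higher count means a later position, so sorting by it in reverse
# gives descending order.
descending = sorted(refs, key=lambda r: countif_rank(refs, r), reverse=True)
print(descending)  # ['3907-1', '3907', '3904-2', '3904-1', '3904']
```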

Stack multiple columns into one

I want to do a simple task but somehow I'm unable to do it. Assume that I have one column like:
a
z
e
r
t
How can I create a new column with the same value twice with the following result:
a
a
z
z
e
e
r
r
t
t
I've already tried to double my column and do something like :
=TRANSPOSE(SPLIT(JOIN(";",A:A,B:B),";"))
but it creates:
a
z
e
r
t
a
z
e
r
t
So far I have been working from this answer.
Try this:
=SORT({A1:A5;A1:A5})
Here we use:
sort
{} to combine data
Taking your comment into account, you may use this formula:
=QUERY(SORT(ArrayFormula({row(A1:A5),A1:A5;row(A1:A5),A1:A5})),"select Col2")
The idea is to add a column holding the row number, sort by that column, then query to keep only the values.
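The same tag-with-row-number, sort, then drop-the-tag idea looks like this in a short Python sketch (the function name repeat_by_row_sort is hypothetical):

```python
def repeat_by_row_sort(items, times=2):
    """Stack `times` copies tagged with row numbers, sort by row, keep values."""
    tagged = [
        (row, value)
        for _ in range(times)
        for row, value in enumerate(items, start=1)
    ]
    # Sorting on the row number alone keeps the original order of the values.
    tagged.sort(key=lambda pair: pair[0])
    return [value for _, value in tagged]

doubled = repeat_by_row_sort(["a", "z", "e", "r", "t"])
print(doubled)  # ['a', 'a', 'z', 'z', 'e', 'e', 'r', 'r', 't', 't']
```

Sorting by row number rather than by value is what preserves the a, z, e, r, t order instead of alphabetizing it.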
And join→split method will do the same:
=TRANSPOSE(SPLIT(JOIN(",",ARRAYFORMULA(CONCAT(A1:A5&",",A1:A5))),","))
Here we use range only two times, so this is easier to use. Also see Concat + ArrayFormula sample.
A few hundred rows is nothing :)
I created index from 1 to n, then pasted it twice and sorted by index. But it's obviously fancier to do it with a formula :)
Assuming your list is in column A and (for now) the number of repeats is in C1 (it can be replaced by a number in the formula), something simple like this will do (starting in B1):
=INDEX(A:A,INT((ROW()-1)/$C$1)+1)
Simply copy it down as far as you need (it will give just 0 after the last item). No sorting. No array. No sheets/excel problems. No heavy calculations.
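The row arithmetic behind this INDEX approach can be checked with a few lines of Python. This is just a sketch: repeated_item is a made-up helper, and because Python lists are 0-based the "+1" from the spreadsheet formula drops out.

```python
def repeated_item(items, row, times):
    """Value for 1-based output `row` when each item is repeated `times` times."""
    # Rows 1..times map to index 0, the next `times` rows to index 1, and so on.
    return items[(row - 1) // times]

items = ["a", "z", "e", "r", "t"]
out = [repeated_item(items, row, 2) for row in range(1, 11)]
print(out)  # ['a', 'a', 'z', 'z', 'e', 'e', 'r', 'r', 't', 't']
```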

Highlighting minimum row value in Pander

I am trying to display a dataframe in an RMarkdown document using the Pander package.
I would like to highlight the minimum value in each row of values. Here's what I have tried:
df <- replicate(4, rnorm(5))
df <- as.data.frame(df)
df$min <- apply(df, 1, min)
emphasize.strong.cells(which(df == df$min, arr.ind = T))
pander(df[1:4])
When I do this I get the error:
Error in check.highlight.parameters(emphasize.strong.cells, nrow(t), ncol(t)) :
Too high number passed for column indexes that should be kept below 6
I can print out the whole table (with the min column) without any trouble or I can print out a partial table without emphasis, but neither of these is ideal. I want the highlighting, but I do not wish to include the 'min' column.
I imagine the fact that I am leaving some highlighted cells out of the pander command is causing the error.
Is there a way around this? Or a better way to do this?
Thanks.
Subquestion: What if I wanted to highlight the minimum in the first few rows and the maximum in the next few. Is that possible in a single table?
Instead of the which lookup, which can match a row minimum in the wrong row, you can easily construct those array indices from a simple sequence (1:N) and a call to which.min on each row, e.g. with apply:
> df <- replicate(4, rnorm(5))
> df <- as.data.frame(df)
> emphasize.strong.cells(cbind(1:nrow(df), apply(df, 1, which.min)))
> pander(df)
----------------------------------------------
V1          V2          V3          V4
----------- ----------- ----------- ----------
0.6802      0.1409      **-0.7992** 0.1997
0.6797      **-0.2212** 1.016       0.6874
2.031       -0.009855   0.3881      **-1.275**
1.376       0.2619      **-2.337**  -0.1066
**-0.4541** 1.135       -0.1566     0.2912
----------------------------------------------
About your next question: you could of course do that in a single table, e.g. by rbind-ing two index matrices built as described above, one with which.min and one with which.max.
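To show the index construction itself, independent of pander, here is a small Python sketch that builds 1-based (row, column) pairs, using the minimum for the first few rows and the maximum for the rest. The function name extreme_cells is an assumption for illustration; the numbers are the first three rows of the table above.

```python
def extreme_cells(table, n_min_rows):
    """Return 1-based (row, column) pairs: the minimum of each of the first
    n_min_rows rows, and the maximum of each remaining row."""
    cells = []
    for r, row in enumerate(table, start=1):
        pick = min(row) if r <= n_min_rows else max(row)
        # index() returns the first position of the value; add 1 for 1-based.
        cells.append((r, row.index(pick) + 1))
    return cells

table = [
    [0.6802, 0.1409, -0.7992, 0.1997],
    [0.6797, -0.2212, 1.016, 0.6874],
    [2.031, -0.009855, 0.3881, -1.275],
]
cells = extreme_cells(table, 2)
print(cells)  # [(1, 3), (2, 2), (3, 1)]
```

Because each pair carries its own row number, a minimum can never be matched in the wrong row, which is the problem with comparing the whole data frame against the min column.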

How to filter one list of items from another list of items?

I have a huge list of items in Column A (1,000 items) and a smaller list of items in Column B (510 items).
I want to put a formula in Column C to show only the Column A items not in Column B.
How to achieve this through a formula, preferably a FILTER formula?
Select the list in column A
Right-Click and select Name a Range...
Enter "ColumnToSearch"
Click cell C1
Enter this formula: =MATCH(B1,ColumnToSearch,0)
Drag the formula down for all items in B
If the formula fails to find a match, it will be marked "#N/A", otherwise it will be a number.
If you'd like it to be TRUE for match and FALSE for no match, use this formula instead:
=ISNUMBER(MATCH(B1,ColumnToSearch,0))
If you'd like to return the value when it is not found and an empty string when it is found, use:
=IF(ISNUMBER(MATCH(B1,ColumnToSearch,0)),"",B1)
An alternative method is simply:
=FILTER(A1:A,IF(COUNTIF(B1:B,A1:A),0,1))
It's much more efficient.
It uses COUNTIF to count, for each value in A, how many times it appears in B, giving an array of zero and non-zero counts. The IF then inverts this into 1 for values missing from B and 0 for values present, and FILTER keeps only the rows marked 1.
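The count-then-invert logic can be sketched in Python like this. It is only an illustration with made-up sample lists and a hypothetical function name, not_in:

```python
from collections import Counter

def not_in(a_items, b_items):
    """Keep the items of A that never occur in B."""
    counts = Counter(b_items)  # occurrences of each value in B
    # A count of 0 means the value is missing from B, so it is kept.
    return [a for a in a_items if counts[a] == 0]

missing = not_in([1, 2, 3, 4, 5], [2, 5])
print(missing)  # [1, 3, 4]
```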
Columns look like this
A B
1 2
2 5
3
4
5
Formulae for items in A that ARE in B:
=FILTER(A1:A, MATCH(A1:A, B1:B, 0))
=FILTER(A1:A, COUNTIF(B1:B, A1:A))
Formulae for items in A that ARE NOT in B:
=FILTER(A1:A, ISNA(MATCH(A1:A, B1:B, 0)))
=FILTER(A1:A, NOT(COUNTIF(B1:B, A1:A)))
in your case:
=FILTER(A1:A; ISNA(MATCH(A:A; B:B; )))
If you face a mismatch of ranges, see: https://stackoverflow.com/a/54795616/5632629
