I'm looking to add a column to my PowerQuery data which will count how many of 5 cells in the row are greater than zero. Example data below with end result:
I could do this with lots of If statements but I need to be able to expand my number of columns in the future.
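I think this should do it. Add a custom column with the formula below. It removes every 0 from the row's values and counts what is left, so it matches "greater than zero" as long as the row holds no negative values, nulls, or non-numeric columns: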
List.Count(List.RemoveMatchingItems(Record.FieldValues(_),{0}))
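To check the logic outside Power Query, here is a minimal Python sketch of the same per-row count. The function name and sample rows are hypothetical. It counts values strictly greater than zero, so it differs from the M expression above only when a row contains negatives, nulls, or non-numeric values.

```python
def count_positive(row):
    """Count the numeric values in a row that are strictly greater than zero."""
    return sum(
        1
        for value in row
        if isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value > 0
    )


# Hypothetical sample rows with five value columns each.
rows = [
    [3, 0, 1, 0, 7],
    [0, 0, 0, 0, 0],
    [2, 5, 0, 4, 1],
]
counts = [count_positive(row) for row in rows]
```

Because the function works on a list of any length, adding more columns later needs no change to the counting logic, which is the same property the M expression has.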
I have a matrix - 1,172 words down column A, then the same 1,172 names across row 1. Then each word is cross-referenced with all the other names to give a similarity score (this is already done).
In another sheet, I want to look up a word, and return all the words with which it has a certain similarity score - in this case, greater than or equal to 0.33. I attach a MWE, in which I give an idea of the answer I am looking for by looking it up manually.
I think it's some sort of reverse lookup: instead of finding the value at a particular row and column, it finds the columns based on a value and a row in the main sheet. I'm just really stuck at this point and would massively appreciate some help. Thanks! MWE here
If your words on the second sheet are in the same order then:
=IFERROR(TEXTJOIN(", ",,FILTER(Scores!B$1:W$1,(Scores!B2:W2>=0.33)*((Scores!B2:W2<1)))),"-")
Drag down.
Explanation:
Filter the values from row 1 according to the similarity score condition, using FILTER.
Concatenate the filtered values using TEXTJOIN.
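The two steps above can be sketched in Python as a filter followed by a join. The function name, the sample words, and the scores are hypothetical. The upper bound of 1 mirrors the `<1` condition in the formula, which presumably excludes a word's match with itself, and the "-" fallback mirrors IFERROR for the case where no column qualifies.

```python
def similar_words(headers, scores, lower=0.33, upper=1.0):
    """Return the headers whose score is >= lower and < upper, joined by ', '."""
    matches = [
        name for name, score in zip(headers, scores) if lower <= score < upper
    ]
    # Mirror IFERROR(..., "-"): fall back to "-" when nothing qualifies.
    return ", ".join(matches) if matches else "-"


# Hypothetical header row and the score row for the word "apple".
headers = ["apple", "apply", "maple", "zebra"]
scores_for_apple = [1.0, 0.8, 0.4, 0.1]
result = similar_words(headers, scores_for_apple)
```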
I'm trying to figure out how to write a parquet file where the columns do not contain the same number of rows per Row Group. For example, my first column might be a value sampled at 10Hz, while my second column may be a value sampled at only 5Hz. I'd rather not repeat values in the slower column since this can lead to computational errors. However, I cannot write columns of two different sizes to the same Row Group, so how can I accomplish this?
I'm attempting to do this with ParquetSharp.
It is not possible for the columns in a parquet file to have different row counts.
It is not stated explicitly in the documentation, but https://parquet.apache.org/documentation/latest/#metadata shows that a RowGroup has a single num_rows and several ColumnChunks, and the ColumnChunks do not carry row counts of their own.
I have a column which contains time, numbers, text, and blank cells.
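Question is: how do I count only the cells containing time values? I don't need the total (sum) of the times, just the number of cells that contain a time value. I'm using Google Sheets.
Try =COUNTIF(F12:F22,"<=1.00") and put your own range in place of F12:F22. A time of day is stored as a fraction of a day, so it is less than 1; text and blank cells are not counted, but any plain number that is 1 or less would be.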
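As a rough illustration of why that criterion picks out times, here is a Python sketch under the assumption that times arrive as day fractions (06:00 as 0.25, 18:00 as 0.75). The function name and the sample column are hypothetical.

```python
def count_time_like(cells):
    """Count numeric cells with a value of at most 1, like COUNTIF(range, "<=1.00")."""
    return sum(
        1
        for cell in cells
        if isinstance(cell, (int, float))
        and not isinstance(cell, bool)
        and cell <= 1
    )


# Hypothetical column: 06:00 and 18:00 as day fractions, a number, text, a blank.
column = [0.25, 42, "hello", None, 0.75]
time_count = count_time_like(column)
```

The sketch also shows the limitation of the approach: a plain number such as 0.5 in the same column would be counted as if it were a time.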
I have an Excel sheet in which I calculate the average completion of my Scrum tasks. Each task also has a story point (SP) value. My calculation is:
Result = SP * percentage of completion --> This is calculated for each row; after that I sum all the results and take the total.
But sometimes I add a new task, and for each new task I have to add its calculation to the average formula.
Is there any way to use a for loop in Excel?
for(int i=0;i<50;i++){ if(SP!=null && task!=null)(B+i)*(L+i)}
My calculation is like below:
AVERAGE((B4*L4+B5*L5+B6*L6+B7*L7+B8*L8+B9*L9+B10*L10)/SUM(B4:B10))
First of all, AVERAGE is not doing anything in your formula, since the argument you pass to it is just one single value. You already do an average calculation by dividing by the sum. That average is in fact a weighted average, and so you could not even achieve that with a plain AVERAGE function.
I see several ways to make this formula more generic, so it keeps working when you add rows:
1. Use SUMPRODUCT
=SUMPRODUCT(B4:B100,L4:L100)/SUM(B4:B100)
The row number 100 is chosen arbitrarily, but should evidently encompass all data rows. If you have no data occurring below your table, then it is safe to add a large margin. You'll want to avoid the situation where you think you add a line to the table, but actually get outside of the range of the formula. Using proper Excel tables can help to avoid this situation.
2. Use an array formula
This would be a second resort for when the formula becomes more complicated and cannot be executed with a "simple" SUMPRODUCT. But the above would translate to this array formula:
=SUM(B4:B100*L4:L100)/SUM(B4:B100)
Once you have typed this in the formula bar, make sure to press Ctrl+Shift+Enter to enter it. Only then will it act as an array formula.
Again, the same remark about row number 100.
3. Use an extra column
Things get easy when you use an extra column for storing the product of B & L values for each row. So you would put in cell N4 the following formula:
=B4*L4
...and then copy that relative formula to the other rows. You can hide that column if you want.
Then the overall formula can be:
=SUM(N4:N100)/SUM(B4:B100)
With this solution you must take care to always copy a row when inserting a new row, as you need the N column to have the intermediate product formula also for any new row.
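All three variants compute the same weighted average: the sum of the row products divided by the sum of the weights. As a cross-check, here is a minimal Python sketch of that calculation. The function name and the sample values are hypothetical, and skipping rows with a missing value is an assumption that stands in for empty cells, which add nothing to either sum.

```python
def weighted_average(story_points, completion):
    """Return sum(sp * pct) / sum(sp), skipping rows where either value is missing."""
    pairs = [
        (sp, pct)
        for sp, pct in zip(story_points, completion)
        if sp is not None and pct is not None
    ]
    total_points = sum(sp for sp, _ in pairs)
    if total_points == 0:
        raise ValueError("total story points must be non-zero")
    return sum(sp * pct for sp, pct in pairs) / total_points


# Hypothetical tasks: story points (column B) and completion fraction (column L).
story_points = [3, 5, 8, None]
completion = [1.0, 0.5, 0.25, None]
average = weighted_average(story_points, completion)
```

With these values the products are 3, 2.5, and 2, so the result is 7.5 / 16.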
I have a data frame with 9 columns and many rows. I want to filter all the rows that have observations greater than 2.0 in at least 3 columns. Which conditional statements should I use to subset my data frame?
Since I am a n00b, I only came up with this:
data_frame[data_frame > 2,]
Obviously, this gives me all the rows for which all values are > 2, regardless of what I actually need.
Thanks!
I figured that you could also combine logical operators:
data[rowSums(data>2)>=3,]
This subsets the rows in which three or more observations are greater than 2, without naming any columns: data>2 gives a logical matrix, and rowSums counts the TRUE values in each row.
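The same count-then-filter idea can be sketched in plain Python to make the two steps visible. The function name, its parameters, and the sample data are hypothetical; the defaults mirror the threshold of 2 and the minimum of 3 columns used in the R line above.

```python
def rows_with_enough_hits(rows, threshold=2, min_hits=3):
    """Keep the rows in which at least min_hits values exceed threshold."""
    return [
        row
        for row in rows
        # Each comparison yields True or False; summing them counts the hits.
        if sum(value > threshold for value in row) >= min_hits
    ]


# Hypothetical data with five columns per row.
data = [
    [1, 3, 4, 5, 0],
    [0, 1, 2, 2, 1],
    [9, 9, 1, 1, 1],
    [3, 3, 3, 3, 3],
]
selected = rows_with_enough_hits(data)
```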
Logical operator, in this case, the brain. I used the sum(rowSum(data))>x # x =sum of the limit value times columns available.