Return median value array in google sheets [duplicate] - google-sheets-formula

I'm trying to match each value in an array against another sheet (B_Sheet!B1:B65404) and return the median of the matching values found in the fourth column of that sheet.
Since VLOOKUP only returns the first match, I tried using INDEX and MATCH, but I'm not sure why it failed.
I tried this, but it didn't work:
ARRAYFORMULA(IFERROR(MEDIAN(FILTER(INDEX(B_Sheet!D1:D65404,MATCH(B5:B36518,B_Sheet!B1:B65404,0)),LEN(INDEX(B_Sheet!D1:D65404,MATCH(B5:B36518,B_Sheet!B1:B65404,0)))>0)),""))
For reference, this Vlookup did work:
ARRAYFORMULA(IFERROR(VLOOKUP(B5:B36518,B_Sheet!B1:E65404, 4, FALSE),""))
Example 1:
With the formula at the top of the second column of Sheet1:
Name | Array Return Median value
James | 5 (formula is here)
Robert | 6
And looking up in the next sheet, B_Sheet:
Name | Test Scores
James | 3
Robert | 4
James | 5
Robert | 6
James | 7
Robert | 8

There are a few ways you could do this. To answer your question directly, this formula calculates the median for the name in each adjacent cell:
=IFERROR(BYROW(A2:A, LAMBDA(xx,MEDIAN(FILTER(B_Sheet!B:B, B_Sheet!A:A=xx)))),"")
However, I would suggest the following formula, which groups all the names in B_Sheet and averages each group:
=QUERY(B_Sheet!A:B,"SELECT A,AVG(B) WHERE A IS NOT NULL GROUP BY A LABEL AVG(B) ''")
The second formula can be pasted in cell A1, and it will populate column A with each name and column B with the average for that name. You can further sort and filter the result if you do not want every single name. Note that this calculates the AVERAGE, not the MEDIAN.
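To cross-check the expected results outside of Sheets, here is a minimal Python sketch of the same "filter by name, then take the median" logic. The rows are copied from the B_Sheet example in the question; the function name `median_by_name` is made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Sample rows copied from the B_Sheet example in the question: (name, score).
rows = [
    ("James", 3), ("Robert", 4),
    ("James", 5), ("Robert", 6),
    ("James", 7), ("Robert", 8),
]

def median_by_name(rows):
    """Group the scores by name, then take the median of each group."""
    scores = defaultdict(list)
    for name, score in rows:
        scores[name].append(score)
    return {name: median(values) for name, values in scores.items()}

print(median_by_name(rows))  # {'James': 5, 'Robert': 6}
```

The output matches the values shown in the Sheet1 example (5 for James, 6 for Robert).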

Related

How to calculate the ratio between the grand total of two metrics on Google Data Studio?

I created a table on Data Studio that shows the columns:
A: Date
B: 1st metric (number)
C: 2nd metric (number)
D: custom formula to calculate the ratio between the 1st and 2nd metric (percentage)
Then I checked the option to show the Summary Row, which sums the values across all dates. But in column D I don't want the sum (or the average) of the values in column D; instead, I want the ratio between the sum of column B and the sum of column C. How can I achieve that?
For the calculated field to be computed correctly in the summary row, you have to aggregate inside the calculated field itself. To do so, use 'sum()' in your calculation.
With fields named "total sales" and "gross sales", for example, that would be this formula:
sum(total sales)/sum(gross sales)
I hope this answers your question!
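The difference between the two totals is easy to see with a little arithmetic. The sketch below uses made-up daily rows (the dates and numbers are hypothetical, not taken from the question) and compares the ratio of the sums with the sum of the per-row ratios.

```python
# Hypothetical daily rows: (date, first_metric, second_metric).
rows = [
    ("2021-01-01", 10, 100),
    ("2021-01-02", 30, 50),
]

def ratio_of_sums(rows):
    """What sum(first) / sum(second) gives in the summary row."""
    total_first = sum(first for _, first, _ in rows)
    total_second = sum(second for _, _, second in rows)
    return total_first / total_second

def sum_of_ratios(rows):
    """What you get if the per-row ratios are simply added up."""
    return sum(first / second for _, first, second in rows)

print(ratio_of_sums(rows))  # 40 / 150
print(sum_of_ratios(rows))  # 0.1 + 0.6
```

With these numbers the ratio of the sums is about 0.27, while adding the row ratios gives 0.7, which is why the aggregation has to happen inside the formula.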

Count largest duplicate in a list of cells in Google Sheets

I have a list of cells which contain values (A,B,C,D,E...). I'd like to count the value that is duplicated the most.
I thought about combining multiple MAX calls with COUNTIF, but that would be very long since my list of values has 60+ items.
Example file: https://docs.google.com/spreadsheets/d/1ZUnSokdPsEPVJw9S8DfGHvJE1L1Ng8litbCXeA9QWzI/edit?usp=sharing
I'm new at this, but I think the following formula does what you've asked.
=INDEX(A2:G2,MODE(MATCH(A2:G2,A2:G2,0)))
Your sheet is view only, so I can't place it there.
Replace my A2:G2 range in the formula with the range with your long list (either column or row) of values.
This will search the range for the value that is duplicated the most, and return the first such value.
Note: it doesn't flag whether other values have the same number of duplicates as the returned value.
Here is a sample sheet, with examples using values in a row, or in a column:
https://docs.google.com/spreadsheets/d/1mIxTKXjED9kpqAGV55pf8Yjq102Sc5Ymn_yY6qh3XmE/edit?usp=sharing
Let me know if this doesn't achieve what you want.
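For comparison, here is a rough Python sketch of the same idea that returns both the most frequent value and how many times it occurs. The function name and sample list are invented for illustration; on ties, `Counter.most_common` returns the value that was seen first.

```python
from collections import Counter

def most_duplicated(values):
    """Return (value, count) for the value that occurs most often."""
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    return value, count

# Hypothetical sample data standing in for the range A2:G2.
sample = ["A", "B", "A", "C", "B", "A", "D"]
print(most_duplicated(sample))  # ('A', 3)
```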

Student Election Results Tallying

This is a coding interview question:
Your school is having an election and you are tasked with coding a program that tallies the results.
You are given a Set of Votes, each vote containing a candidate and a time stamp. Given a time stamp, return the top N candidates with the most votes at that timestamp. (each vote tallied must come before or at the given timestamp)
Use a min-heap and a HashMap to solve this problem.
1. Record each vote cast at or before the given timestamp in a HashMap(Candidate, Votes).
2. Whenever we want to find the top N candidates, add all the HashMap entries (candidate and vote count) to a min-heap whose size is restricted to N.
3. Return all the items from the min-heap; these are the top N candidates with their votes, because the size-N min-heap evicts every candidate with fewer votes.
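The three steps above can be sketched in Python roughly as follows. This assumes each vote is a `(candidate, time)` pair with comparable timestamps; the function name is made up, and ties are broken by candidate name only because tuples compare that way.

```python
import heapq
from collections import Counter

def top_n_candidates(votes, timestamp, n):
    """votes: iterable of (candidate, time) pairs (assumed shape)."""
    # Step 1: tally only the votes cast at or before the timestamp.
    tally = Counter(c for c, t in votes if t <= timestamp)
    # Step 2: keep a min-heap of (count, candidate) with at most n entries.
    heap = []
    for candidate, count in tally.items():
        if len(heap) < n:
            heapq.heappush(heap, (count, candidate))
        elif (count, candidate) > heap[0]:
            # The new entry beats the weakest of the current top n.
            heapq.heapreplace(heap, (count, candidate))
    # Step 3: whatever is left in the heap is the top n, most votes first.
    return [candidate for count, candidate in sorted(heap, reverse=True)]

votes = [("A", 1), ("B", 2), ("A", 3), ("C", 4), ("B", 5), ("B", 6)]
print(top_n_candidates(votes, 6, 2))  # ['B', 'A']
```

Capping the heap at N keeps each query at O(C log N) for C candidates, instead of sorting all candidates.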
This is probably far from the most efficient way, but I would:
Create a list candidateList containing each candidate and their respective number of votes (initially 0)
Go through the set of votes, and if a vote meets the time stamp requirement, add 1 to the candidate's votes in candidateList.
After you've gone through the set of votes, find the nth most popular candidate in candidateList (using a selection algorithm), and then iterate over the list to find the candidates more popular than them.
I would do it with nested arrays.
For every new date you read, you create a new subarray. Let's say you get a vote for John Doe from the 9th of August 2016, a date for which you don't have any votes registered yet: you create the subarray for that date as soon as you register the vote.
Your array should then be constructed like this:
array -> 0 -> date: 09/08/2016
           -> John Doe: 1
Since I assume that at an election all the names are known, we can simply save all the candidates' names in another array, which we can use when we loop through this one.
In case a new vote for John Doe on another date gets registered, your array would look like this:
array -> 0 -> date: 09/08/2016
           -> John Doe: 1
      -> 1 -> date: 11/08/2016
           -> John Doe: 1
If someone votes for another person on an already known date, it should look like this:
array -> 0 -> date: 09/08/2016
           -> John Doe: 1
           -> Jane Doe: 1
Hope this helps. If you want help looping through this array-structure-thingy, don't be afraid to ask :)
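One possible way to build and loop through such a per-date structure is sketched below in Python, using a dictionary keyed by date instead of a numbered array. The function names are hypothetical, and the dates are written as ISO strings (an assumption, so that plain string comparison follows chronological order).

```python
from collections import defaultdict

def tally_by_date(votes):
    """votes: iterable of (candidate, date) pairs with ISO date strings."""
    per_date = defaultdict(lambda: defaultdict(int))
    for candidate, date in votes:
        per_date[date][candidate] += 1  # new dates get a sub-dict on demand
    return per_date

def totals_up_to(per_date, cutoff):
    """Add up the per-date counts for every date at or before the cutoff."""
    totals = defaultdict(int)
    for date, counts in per_date.items():
        if date <= cutoff:  # ISO strings compare chronologically
            for candidate, count in counts.items():
                totals[candidate] += count
    return dict(totals)

votes = [
    ("John Doe", "2016-08-09"),
    ("John Doe", "2016-08-11"),
    ("Jane Doe", "2016-08-09"),
]
per_date = tally_by_date(votes)
print(totals_up_to(per_date, "2016-08-09"))  # {'John Doe': 1, 'Jane Doe': 1}
```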

What is the good approach in solving this programming challenge?

In one programming contest, this problem was given.
A database contains a table with two columns.
First is the id of the member,
Second can be
0(if he doesn't have any sub-ordinates),
id(if only one sub-ordinate),
sum of id's(if he has two sub-ordinates)
//Max Two assistants only.
We need to find the head of the gang
Example Input:
The first line indicates 'n' [the number of records,3<n<100]
the next four are the actual records
4
1 7
2 1
3 0
4 0
Here 3 and 4 have 0 in their second columns, which means they don't have any sub-ordinates.
1 has 7 in the second column, which is not the id of any member, so it must be the sum of two ids [so 3 and 4 are the sub-ordinates of 1].
2 has 1 as its sub-ordinate,
so 2 is the head of the gang.
Output:
2
I am unable to solve the problem.
Can anyone help me?
If this is not the right place to ask this type of question,
can you suggest some websites where I can post these types of questions?
I will give you a hint (which is almost a solution) here:
What is the sum of all the numbers in the second column?
Answer (spoiler alert):
The id of the head of the gang (if one exists) is: (1 + 2 + ... + n) - (the sum of all the numbers in the second column). Assuming each sub-ordinate reports to exactly one member, every sub-ordinate's id is counted exactly once in the second column, so the subtraction actually gives the sum of the ids of all top-level members (i.e. members who are nobody's sub-ordinate). Thus the correctness relies on the assumption that there is exactly one head of the gang.
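The hint can be checked against the example input with a few lines of Python. This is only a sketch: the function name is made up, and it sums the actual ids rather than 1 + 2 + ... + n, which is the same thing when the ids are exactly 1..n.

```python
def find_head(records):
    """records: list of (member_id, second_column) pairs.

    Assumes there is exactly one head and that every other member
    is the sub-ordinate of exactly one member.
    """
    id_sum = sum(member_id for member_id, _ in records)
    subordinate_sum = sum(second for _, second in records)
    return id_sum - subordinate_sum

# The example input from the question.
records = [(1, 7), (2, 1), (3, 0), (4, 0)]
print(find_head(records))  # 2
```

Here the ids add up to 10 and the second column adds up to 8, leaving 2.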

Maximize marks obtained

An n*n matrix is given.
Each row denotes a student, and each column holds the marks that student obtains in the corresponding paper.
for example->
n=3
1 2 3
4 5 6
7 8 9
Then 1st student scores 1 in 1st paper, 2 in 2nd paper and so on.
2nd student scores 4 in 1st paper, 5 in 2nd paper and so on.
given->Every student will get only one exam paper to solve
We need to maximize total marks obtained by n students following above condition.
for above input, output->>(8+6+1)=15.
constraints->
1<=n<=100
My approach->
I thought to solve it using dp+bitmask but n can be as large as 100 so had to drop this idea.
This is a typical weighted bipartite matching problem, and it can be solved using the KM algorithm (Hungarian algorithm).
To construct the bipartite graph, we put all students in one set and all exam papers in the other set. We connect a student to an exam paper with an edge of weight X, where X is the score that student can gain in that exam. After the graph is constructed, just run the KM algorithm and you will get the answer.
Here is a tutorial from TopCoder which explains this kind of problem quite well, and a code template is also given. You can start from here :)
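As a rough illustration (not the template from the tutorial), here is a compact O(n^3) Python sketch of the Hungarian algorithm in its common potentials-based form. It assumes each paper is given to exactly one student, and it negates the marks so that a minimising version of the algorithm maximises the total. The function name `max_total_marks` is made up.

```python
def max_total_marks(marks):
    """Return the best total for a one-to-one student/paper assignment."""
    n = len(marks)
    # Negate so that minimising the cost maximises the marks.
    cost = [[-marks[i][j] for j in range(n)] for i in range(n)]
    INF = float("inf")
    u = [0] * (n + 1)    # potentials for students (1-based)
    v = [0] * (n + 1)    # potentials for papers (1-based)
    p = [0] * (n + 1)    # p[j] = student matched to paper j, 0 = none
    way = [0] * (n + 1)  # previous paper on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift the potentials by the smallest slack found.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached an unmatched paper
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(marks[p[j] - 1][j - 1] for j in range(1, n + 1))

print(max_total_marks([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 15
```

For n up to 100 this does on the order of a million inner-loop steps, which is well within reach where the bitmask DP is not.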
