(Google Sheets) How to remove certain dropdown options after a certain number of cells with said option is met? - validation

I'm currently working on a google sheets file to organize the members of my class. I am currently assigning committees and I want them to choose their committee in Google Sheets. However, I want to apply only a certain limit per committee.
What I want to happen is, if a certain choice has been chosen i.e. 5 times, I would like that choice to disappear from the choices and would make it reappear again if ever a students change their choice, however, I do not know how to do this in terms of a formula or through data validation.
I would really appreciate your help. Thank you!

Here's a toy example you may be able to adapt to your needs:
Create a list of options a,b,c,d,e in A1:E1 of Sheet1
Create a list of the limits for each option in A2:E2 (for instance 2,1,3,5,3)
Create a list of people Person1,Person2,Person3 in G2:G4
Apply data validation to H2:H4:
Use criteria 'drop down (from a range)'
Set the data range to =Sheet1!$A3:$E3 (only lock columns, not rows)
In A3 enter the following formula:
=lambda(people,choices,list,limits,
makearray(counta(people),counta(list),lambda(r,c,
if(index(choices,r)<>index(list,,c),if(countif(choices,index(list,,c))<index(limits,,c),index(list,,c),),index(list,,c)))))(
$G$2:$G$4,$H$2:$H$4,$A$1:$E$1,$A$2:$E$2)
We are using MAKEARRAY to create a 2D array with the list of options on each line, however we are asking it to omit elements of the list from each line if they haven't already been selected AND a preset limit on the number of selections for that option has not been reached. Obviously in a 'real' example you would place the data range for validation in a separate sheet and probably hide and protect that sheet as well. You could also potentially use an array literal of strings rather than a cell range as the list of options in order to make the validation list formula completely self-contained.

Related

Google Sheets data validation different for each row

For my current project, I'm making a sheet that lets me keep track of my D&D characters. I use data validation to remind me what all the options are for various stats, with the information being kept in a separate "RefTables" sheet. Creating a data validation for selecting a character class is very easy, since there are only 14 classes total. What I'm having trouble with is the 'subclass' column. After you choose the character class, you get to choose your specialization, or 'subclass'. This differs depending on the character class you chose.
Right now I can do the proper data validation for each cell individually. In my ref tables sheet, I have a section where it will grab the character class value and put all the 'subclass' options into a row. I can then use data validation in that specific cell to grab the subclass row. This works, but is tedious to do for every single cell.
The formula I would love to put in the range section is
=INDIRECT(CONCATENATE("RefTables!Q",ROW(),":AJ",ROW()))
which appends the row number with the appropriate columns so each row automatically gets its own subclass row (EX: RefTables!Q3:AJ3, RefTables!Q21:AJ21, etc.). I've seen solutions for Excel, but I'm using Google Sheets so I can share this document more easily with friends.
tldr; How to use data validation in Google Sheets that is slightly different for each row
unfortunately, this is possible to achieve only the manual way setting it up for every single cell/row. Google Sheets' Data validation does not support injecting CSVs via formula.

Filter Data for Each Row in a Column

EVE Online Manufacturing Spreadsheet
In Batch!F3:G, I'm attempting to break down the data input from columns B3:C to their components (and eventually materials/minerals in I3:J) by using filter to compare results in Engine!P:R. Multiplied of course by the total number of each finished product I need.
I've been trying to figure out ways to arrayformula this together, and even tried quite a few query functions without success. The best I've been able to come up with is to string the actual formula together, appending them with {}, but this gets bloated quickly. I need this to be open ended because I have a tendency to build a lot of things at once. Any help would be appreciated, even just point me in the right direction!
Well, based on my limited knowledge about google sheet, I can only think of one way to do this automatically.
Here's a sheet I constructed based on your sheet.
https://docs.google.com/spreadsheets/d/1AfX8o05gUGPiN5S90w4o0yxuIYjsJRaXsaYUFTJuEPo/edit?usp=sharing
First, on Engine sheet, add one more column which will give you the number of materials required for that part, which is looked up in the PART LIST of BATCH sheet. For this I use VLOOKUP, as you see in D2.
Then on BATCH sheet, query the materials that VLOOKUP return positive, multiply it by the amount of item and then sum them.
This is done by the QUERY used in F3
This method only if you don't have duplicate item in your PART LIST, due to the way VLOOKUP work.
Of course if you want to break the material list further, you can do the same approach..

How to filter entries that are not duplicates of entries from others columns in Google Sheets?

I have a column called "Masterlist" which contains values from Lists 1, 2 and 3. It also contains values which are present only in Masterlist.
How can I filter them, like shown at the attached image in Google Sheets?
EDIT: The lists will have more than one entries.
Solution 1
In E2, type in
=filter(A2:A,arrayformula(iserror(match(A2:A,B2:D2,0))))
Check the documentation of filter or match for how to use them. With match, be sure to include the third argument. That is an easy one to forget. arrayformula iterates a formula over a range. The output can be a range, in which case it will print over any un-written cells. When arrayformula interacts with match, it only iterates over the first argument, which is why this solution works.
EDIT: If you have a two-dimensional range to match to, you need to collapse them into a one-dimensional range using the concatenation operators such as
=filter(A2:A,arrayformula(iserror(match(A2:A,{B2:B4;C2:C4;D2:C4},0))))
You can experiment with endings without row indices and let Google Sheets select an ending index for you.
Solution 2
Use the native Filter View feature. Good for the scenarios where you don't need to separately print a list of the unique values in "masterlist".
Go to Data -> Create Filter View
Use the relevant help pages to navigate yourself. I can see a few ways to implement what you desire, including
filter by value on the same column (selecting the actual values manually);
filter by value on a "helper column" where you include a formula in the cells to check whether the content in "masterlist" belongs to the list you want to check against. You can use the match and iserror combo here;
custom formula using a similar formula as above.
If your column A, ie. the "masterlist", is something a user would add to, then Data Validation can be used to good effect in conjunction with Filter View.

How to conditionally require 18+ fields based on selection of two dropdowns

I'm new to Sharepoint 2010 with what I would call a highschool freshman level of coding experience, though I can generally stumble and tinker my way through. I don't currently have access to Sharepoint designer, but from the searching I've done so far, it may required. Still I'm hoping to find an OOTB solution to the problem below.
I have been tasked with building a incident resolution tracking sheet on Sharepoint. My boss is very concerned with being audited by legal, and has some very specific requirements about required information. Column A contains a drop down list of 5 choices that indicate the Final Solution. Column B Contains a drop down list with 4 choices that indicate the Initial Problem. Based on The selections in A and B, different Columns in C-X are required to be blank, not blank, or contain specific entries. The only way I can find to do this is to create a list validation containing a nested if for each combination of A and B resulting in 20 nested ifs. However sharepoint is limited to 7 nested ifs, so I'm looking for any possible solutions.
*This List will primarily be accessed in Datasheet view, so "HTML in calculated column" type solutions are not viable.
You can use calculated columns to break up the validation formula into more manageable chunks.
Let's start with a simple example.
Condition 1: If the initial problem was that the user's computer was too slow and the final solution was restarting the computer, you need to fill in the [C] column.
Condition 2: If the initial problem was that the user was on fire and the final solution was dousing them with water, you need to fill in the [D] column.
You could perform that list validation all in one formula, as below:
=IF(
AND([A]="Restarted Computer",[B]="Computer is slow"),
NOT(ISBLANK([C])),
IF(
AND([A]="Doused with water",[B]="User is on fire"),
NOT(ISBLANK([D]),
TRUE
)
)
But that's long and ugly (especially when you condense it to one line).
Instead, you could add two calculated columns, one for each condition you want to check. For the sake of this example, let's say you add a column called C_is_valid and a column called D_is_valid:
C_is_valid calculated column formula:
=IF(AND([A]="Restarted Computer",[B]="Computer is slow"),NOT(ISBLANK([C])),TRUE)
D_is_valid calculated column formula:
IF(AND([A]="Doused with water",[B]="User is on fire"),NOT(ISBLANK([D]),TRUE)
Updated validation formula:
=AND([C_is_valid],[D_is_valid])
It's easy to see how this can simplify even a very complex set of validation conditions...
=AND(C_is_valid,AND(D_is_valid,AND(E_is_valid,AND(F_is_valid,AND(G_is_valid,AND(H_is_valid,I_is_valid)))))
But even that could be simplified by consolidating some of those AND()s into multiple calculated columns, so that your final validation formula could be as simple as:
=AND([First set of conditions is valid],[Second set of conditions is valid])

Random exhaustive (non-repeating) selection from a large pool of entries

Suppose I have a large (300-500k) collection of text documents stored in the relational database. Each document can belong to one or more (up to six) categories. I need users to be able to randomly select documents in a specific category so that a single entity is never repeated, much like how StumbleUpon works.
I don't really see a way I could implement this using slow NOT IN queries with large amount of users and documents, so I figured I might need to implement some custom data structure for this purpose. Perhaps there is already a paper describing some algorithm that might be adapted to my needs?
Currently I'm considering the following approach:
Read all the entries from the database
Create a linked list based index for each category from the IDs of documents belonging to the this category. Shuffle it
Create a Bloom Filter containing all of the entries viewed by a particular user
Traverse the index using the iterator, randomly select items using Bloom Filter to pick not viewed items.
If you track via a table what entries that the user has seen... try this. And I'm going to use mysql because that's the quickest example I can think of but the gist should be clear.
On a link being 'used'...
insert into viewed (userid, url_id) values ("jj", 123)
On looking for a link...
select p.url_id
from pages p left join viewed v on v.url_id = p.url_id
where v.url_id is null
order by rand()
limit 1
This causes the database to go ahead and do a 1 for 1 join, and your limiting your query to return only one entry that the user has not seen yet.
Just a suggestion.
Edit: It is possible to make this one operation but there's no guarantee that the url will be passed successfully to the user.
It depend on how users get it's random entries.
Option 1:
A user is paging some entities and stop after couple of them. for example the user see the current random entity and then moving to the next one, read it and continue it couple of times and that's it.
in the next time this user (or another) get an entity from this category the entities that already viewed is clear and you can return an already viewed entity.
in that option I would recommend save a (hash) set of already viewed entities id and every time user ask for a random entity- randomally choose it from the DB and check if not already in the set.
because the set is so small and your data is so big, the chance that you get an already viewed id is so small, that it will take O(1) most of the time.
Option 2:
A user is paging in the entities and the viewed entities are saving between all users and every time user visit your page.
in that case you probably use all the entities in each category and saving all the viewed entites + check whether a entity is viewed will take some time.
In that option I would get all the ids for this topic- shuffle them and store it in a linked list. when you want to get a random not viewed entity- just get the head of the list and delete it (O(1)).
I assume that for any given <user, category> pair, the number of documents viewed is pretty small relative to the total number of documents available in that category.
So can you just store indexed triples <user, category, document> indicating which documents have been viewed, and then just take an optimistic approach with respect to randomly selected documents? In the vast majority of cases, the randomly selected document will be unread by the user. And you can check quickly because the triples are indexed.
I would opt for a pseudorandom approach:
1.) Determine number of elements in category to be viewed (SELECT COUNT(*) WHERE ...)
2.) Pick a random number in range 1 ... count.
3.) Select a single document (SELECT * FROM ... WHERE [same as when counting] ORDER BY [generate stable order]. Depending on the SQL dialect in use, there are different clauses that can be used to retrieve only the part of the result set you want (MySQL LIMIT clause, SQLServer TOP clause etc.)
If the number of documents is large the chance serving the same user the same document twice is neglibly small. Using the scheme described above you don't have to store any state information at all.
You may want to consider a nosql solution like Apache Cassandra. These seem to be ideally suited to your needs. There are many ways to design the algorithm you need in an environment where you can easily add new columns to a table (column family) on the fly, with excellent support for a very sparsely populated table.
edit: one of many possible solutions below:
create a CF(column family ie table) for each category (creating these on-the-fly is quite easy).
Add a row to each category CF for each document belonging to the category.
Whenever a user hits a document, you add a column with named and set it to true to the row. Obviously this table will be huge with millions of columns and probably quite sparsely populated, but no problem, reading this is still constant time.
Now finding a new document for a user in a category is simply a matter of selecting any result from select * where == null.
You should get constant time writes and reads, amazing scalability, etc if you can accept Cassandra's "eventually consistent" model (ie, it is not mission critical that a user never get a duplicate document)
I've solved similar in the past by indexing the relational database into a document oriented form using Apache Lucene. This was before the recent rise of NoSQL servers and is basically the same thing, but it's still a valid alternative approach.
You would create a Lucene Document for each of your texts with a textId (relational database id) field and multi valued categoryId and userId fields. Populate the categoryId field appropriately. When a user reads a text, add their id to the userId field. A simple query will return the set of documents with a given categoryId and without a given userId - pick one randomly and display it.
Store a users past X selections in a cookie or something.
Return the last selections to the server with the users new criteria
Randomly choose one of the texts satisfying the criteria until it is not a member of the last X selections of the user.
Return this choice of text and update the list of last X selections.
I would experiment to find the best value of X but I have in mind something like an X of say 16?

Resources