How to calculate correlation amongst preferences? - algorithm

I have to split a group of x people into 3 or 4 groups, most likely 3.
I want people to be happy, so I'm having each person rate the other members of the big group from 1 to (x-1).
How do I optimize preferences to create 3 groups?

Here is a method that is likely to get a good arrangement, even if it is not an optimal arrangement:
First create a ranking function that can take any pair of groupings and determine whether one is better than the other. Then apply the following algorithm:
Randomly assign people into groups.
Randomly pick one person from each group.
Create new groupings in which each combination of reassignments is performed on the people chosen in step 2. (For 3 groups there will be 6 such reassignments. For 4, 24.)
Of all possible reasignments, pick the best one.
Repeat steps 2–4 one million times.
UPDATE
If there are only 18 people that need to be assigned, then that's just (18 choose 6) * (12 choose 6) / 6 = 2,858,856 possible groupings. (Or, in the case of four groups it's (18 choose 4) * (14 choose 4) * (10 choose 5) / 4 = 192,972,780 groupings.)
You can just try each one and pick the best.
I guess the ranking algorithm itself is really the hard part of this assignment.
You could just give each person a score based on summing the scores of the people selected to be in their group, then sum the scores of each person together.
The problem is that you're going to end up with all the popular people in one group, and all the unpopular people in another group, and all the telephone handset cleaners in another group.
You should just assign people randomly, and then tell them that you used some really scientific system. That way everybody gets a good mix.

Measure the total satisfaction of a given configuration by calculating the distance between the actual positions and the stated preferences. Start with a randomized set of groups. Then use something like hill climbing or simulated annealing to optimise.
http://en.wikipedia.org/wiki/Hill_climbing
http://en.wikipedia.org/wiki/Simulated_annealing
Simulated annealing sounds complicated, but it's really just a cleverer version of hill-climbing.

Related

How many combinations of partitions of a list can you make so that repetitions are minimised?

Well one line is not enough to capture that question so ill explain it here.
Earlier today i was in a zoom meet where there are 12 members and we were split into different breakout rooms. After 30 mins we were shuffled and the host made sure to use a different combinations such that no two people who were in the same breakout room was in the same breakout room now. The host atleast tried to minimize the repetitions.
So my question is how many such combinations can we make on a group such that no two people who were in a meeting earlier repeat themselves.
If you could tell me in a algorithmic way, i will understand it.
Edit:
Let me further explain my problem with a use case:
Suppose there are 12 people in a zoom meet who are to be divided up into breakout rooms. Let me form 4 breakout rooms with 3 in each group.
so let the combination be [1,2,3][4,5,6][7,8,9][10,11,12].
the first iteration could have been anything really.
now the problem starts.
for the next combination i have to make sure that the breakout rooms cannot have the same people forming it.
so keeping this in mind i'll form the group as: [1,5,7][4,8,12][2,6,11][3,9,10].
wow. did you just see that? i had to make sure that no two people who were in a group earlier, are present in a single group presently. So my question is how do i write an algorithm for this problem.
math related questions fit better in https://math.stackexchange.com/
to your question:
i assume breakout rooms mean pears of 2?
-> everyone can match with 11 others - so basically 11 rounds?
There are no conflicts necessarily that reduce the rounds, if you organized matches from the start on.
Edit:
If you instead looked for the commonly school question:
Suppose there is a group of 12 people. In how many different ways can the 12 people be split into six pairs?
Then there are same matches allowed as long as one pair changes and the result would be
12!/(2^6 x 6!)
see: here
Edit after Clarification:
Well I only know Primary math and would still recommend https://math.stackexchange.com/ : )
My unqualified guess would be you have your total Permutations of 12! / (12-3)! Which are only
12! /(3! *(12-3)!) Combinations and then reduce it by all non-valid Combinations?
( 12! / (3! *(12-3)!) ) / 4 = 55?
Or 12! / (4! *(12-3)!) = 55?
If you need this for some kind of school, I recommend learning basics first. The solution will not help you in your class!
https://www.calculator.net/permutation-and-combination-calculator.html?cnv=12&crv=3&x=76&y=12
https://math.stackexchange.com/questions/35684/combination-of-splitting-elements-into-pairs

Maximise subset selection under constraints

I am working on a scheduling problem for a team of volunteers. I have boiled my problem down to the following algorithmic problem:
I have a matrix with ~60 rows representing volunteers and ~14 columns representing days. Each entry is an integer in the range 0 to 3 inclusive representing how free the volunteer is on that day. I want to choose exactly 4 entries from each column (4 volunteers a day) such that (in order of importance)
A 0-entry is never chosen.
The workload is as spread out as possible (first give everyone one shift, then start giving out second shifts, etc. We can expect that most volunteers will only have one shift per 2-week period, and some may even have none.)
The sum over selected entries is maximised (volunteers get days that they prefer).
I want to output a decision matrix that has a 1 whenever a volunteer is chosen for a day, and 0 otherwise. I believe this is an instance of the nurse-rostering problem, so I'm not expecting a fast solution, but I just want to make a brute force algorithm that will work in a reasonable time for my ~60 person team. I'm just really not sure how to start tackling this problem. Is it suited for backtracking, or is there some way to calculate the best placement of each volunteer based on the distribution of his/her day-scores?

How to design an algorithm to put elements into groups with constraints?

I was given a task of putting students into groups (to prepare a coding camp), but with several constraints. Though I've finished the task by hand, I'd like to know is there already exist some algorithms for tasks like this, or how can I design such an algorithm.
Background: 40 students in total, with these attributes:
gender: F/M
grade: Year 1/2
school: School 1/School 2/...
early assessment result: Rank from 1 to 40
Constraints: All of them needs to be satisfied.
Exactly 4 people per group
Each group needs to have at least a girl
Each group needs to have at least a Year 2 student
4 group members needs to come from 4 different schools
Each group needs to have at least a student who ranked top 10 in early assessment
What I'm expecting:
The Best: An existing algorithm/program for these kind of problems
Or, An algorithm for this specific problem
Or at least, Some ideas of creating an algorithm for this specific problem
My thoughts:
Since I've successed in making groups by hand, I know that such a solution indeed exists for my current dataset. But if I need an algorithm to find a solution for me, it should first try to check whether a solution even exists, by check if the number of girl / Year 2 students is greater than 10 (with pigeonhole principle), and some other conditions. And obviously, Constraint 5 is the easiest, and can provide a base solution for the rest. However, I still can not find a systematic way of doing it. Perhaps bruteforce and randomization can help? I'm not sure.
And sorry, since the data is confidential, I can not post it.
Update: After consulting a friend, here is a possible method:
First put the top 1 to 10 into 10 different groups.
Then iterate through groups. If the only person in the group is a boy/girl, try to add a girl/boy from a different school.
Then the problem size is reduced from 2^40 to 2^20, making bruthforce a viable solution.

Rating Algorithm

I'm trying to develop a rating system for an application I'm working on. Basically app allows you to rate an object from 1 to 5(represented by stars). But I of course know that keeping a rating count and adding the rating the number itself is not feasible.
So the first thing that came up in my mind was dividing the received rating by the total ratings given. Like if the object has received the rating 2 from a user and if the number of times that object has been rated is 100 maybe adding the 2/100. However I believe this method is not good enough since 1)A naive approach 2) In order for me to get the number of times that object has been rated I have to do a look up on db which might end up having time complexity O(n)
So I was wondering what alternative and possibly better ways to approach this problem?
You can keep in DB 2 additional values - number of times it was rated and total sum of all ratings. This way to update object's rating you need only to:
Add new rating to total sum.
Divide total sum by total times it was rated.
There are many approaches to this but before that check
If all feedback givers treated at equal or some have more weight than others (like panel review, etc)
If the objective is to provide only an average or any score band or such. Consider scenario like this website - showing total reputation score
And yes - if average is to be omputed, you need to have total and count of feedback and then have to compute it - that's plain maths. But if you need any other method, be prepared for more compute cycles. balance between database hits and compute cycle but that's next stage of design. First get your requirement and approach to solution in place.
I think you should keep separate counters for 1 stars, 2 stars, ... to calcuate the rating, you'd have to compute rating = (1*numOneStars+2*numTwoStars+3*numThreeStars+4*numFourStars+5*numFiveStars)/numOneStars+numTwoStars+numThreeStars+numFourStars+numFiveStars)
This way you can, like amazon also show how many ppl voted 1 stars and how many voted 5 stars...
Have you considered a vote up/down mechanism over numbers of stars? It doesn't directly solve your problem but it's worth noting that other sites such as YouTube, Facebook, StackOverflow etc all use +/- voting as it is often much more effective than star based ratings.

What class of algorithms can be used to solve this?

EDIT: Just to make sure someone is not breaking their head on the problem... I am not looking for the best optimal algorithm. Some heuristic that makes sense is fine.
I made a previous attempt at formulating this and realized I did not do a great job at it so I removed that question. I have taken another shot at formulating my problem. Please feel free to provide any constructive criticism that can help me improve this.
Input:
N people
k announcements that I can make
Distance that my voice can be heard (say 5 meters) i.e. I may decide to announce or not depending on the number of people within these 5 meters
Goal:
Maximize the total number of people who have heard my k announcements and (optionally) minimize the time in which I can finish announcing all k announcements
Constraints:
Once a person hears my announcement, he is be removed from the total i.e. if he had heard my first announcement, I do not count him even if he hears my second announcement
I can see the same person as well as the same set of people within my proximity
Example:
Let us consider 10 people numbered from 1 to 10 and the following pattern of arrival:
Time slot 1: 1 (payoff = 1)
Time slot 2: 2 3 4 5 (payoff = 4)
Time slot 3: 5 6 7 8 (payoff = 4 if no announcement was made previously in time slot 2, 3 if an announcement was made in time slot 2)
Time slot 4: 9 10 (payoff = 2)
and I am given 2 announcements to make. Now if I were an oracle, I would choose time slots 2 and time slots 3 because then 7 people would have heard (because 5 already heard my announcement in Time slot 2, I do not consider him anymore). I am looking for an online algorithm that will help me make these decisions on whether or not to make an announcement and if so based on what factors. Does anyone have any ideas on what algorithms can be used to solve this or a simpler version of this problem?
There should be an approach relying upon a max-flow algorithm. In essence, you're trying to push the maximum amount of messages from start->end. Though it would be multidimensional, you could have a super-sink, which connects to each value of t, then have each value of t connect to the people you can reach at this time and then have a super-sink. This way, you simply have to compute a max-flow (with the added constraint of no more than k shouts, which should be solvable with a bit of dynamic programming). It's a terrifically dirty way to solve it, but it should get the job done deterministically and without the use of heuristics.
I don't know that there is really a way to solve this or an algorithm to do it the way you have formulated it.
It seems like basically you are trying to reach the maximum number of people with exactly 2 announcements. But without knowing any information about the groups of people in advance, you can't really make any kind of intelligent decision about whether or not to use your first announcement. Your second one at least has the benefit of knowing when not to be used (i.e. if the group has no new members then you can know its not worth wasting the announcement). But it still has basically the same problem.
The only real way to solve this is to use knowledge about the type of data or the desired outcome to make guesses. If you know that groups average 100 people with a standard deviation of 10, then you could just refuse to announce if less than 90 people are present. Or, if you know you need to reach at least 100 people with two announcements, you could choose never to announce to less than 50 at once. Obviously those approaches risk never announcing at all if the actual data does not meet what you would expect. But that's always going to be a risk, since you could get 1 person in the first group and then 0 in all of the rest, no matter what you do.
Or, you could try more clearly defining the problem, I have a hard time figuring out how to relate this to computers.
Lets start my trying to solve the simplest possible variant of the problem: Lets assume N people and K timeslots, but only one possible announcement. Lets also assume that each person will only ever stay for one timeslot and that each person who hasn't yet shown up has an equally probable chance of showing up at any future timeslot.
Given these simplifications, at each timeslot you look at the payoff of announcing at the current timeslot and compare to the chance of a future timeslot having a higher payoff, eg, lets assume 4 people 3 timeslots:
Timeslot 1: Person 1 shows up, so you know you could get a payoff of 1 by announcing, but then you have 3 people to show up in 2 remaining timeslots, so at least one of those timeslots is guaranteed to have 2 people, so don't announce..
So at each timeslot, you can calculate the chance that a later timeslot will have a higher payoff than the current by treating the remaining (N) people and (K) timeslots as being N independent random numbers each from 1..k, and calculate the chance of at least one value k being hit more than or equal to the current-payoff times. (Similar to the Birthday problem, but for more than 1 collision) and then you need to decide hwo much to discount based on expected variances. (bird in the hand, etc)
Generalization of this solution to the original problem is left as an exercise for the reader.

Resources