Finding a majority of unorderable items - algorithm

I have a problem finding a solution to this task.
You have N students and N courses. A student can attend only one course,
and one course can be attended by many students. Two students are
classmates if they attend the same course. How do you find out whether there
are N/2 classmates among the N students under these
conditions: you can take two students and ask if they are classmates,
and the only answer you can get is "yes" or "no". And you need to do this
in O(N*log(N)).
I need just some idea of how to do it; pseudo-code will be fine. I guess it will divide the list of students like merge sort, which gives the logarithmic part of the complexity. Any ideas would be great.

First, pair off the students (1&2, 3&4, 5&6, etc.) and check which pairs are in the same class. The first student of each matching pair gets "promoted". If there is an "oddball" student left unpaired, they count as matching themselves, so they get promoted as well. If a single class contains >=50% of the students, then >=50% of the promoted students are also in this class. If no students are promoted, then if a single class contains >=50% of the students, either the first or the second student must be in that class, so simply promote both of them. This leads to the case where >=50% of the promotions are in the large class. This step always takes ⌊N/2⌋ comparisons.
Now when we examine the promoted students, if a class contains >=50% of the students, then >=50% of the promoted students are in this class. Therefore, we can simply recurse until we reach a stop condition: there are fewer than three promoted students. At each step we promote <=50% of the students (plus one sometimes), so this step occurs at most about ⌈log(N,2)⌉ times.
If there are fewer than three promoted students, then we know that if >=50% of the original students are in one class, at least one of these remaining students is in that class. Therefore, we can simply compare each and every original student against these promoted students, which will reveal either (A) the class with >=50% of the students, or (B) that no class has >=50% of the students. This takes at most (N-1) comparisons per remaining student, but only occurs once. Note that the original students may split evenly between the two remaining students, in which case this detects that both classes have exactly 50% of the students.
So the complexity looks like N/2 *~ log(N,2) plus the final pass. However, the *~ signifies that we don't iterate over all N/2 pairs at each of the log(N,2) iterations, only over decreasing fractions N/2, N/4, N/8..., which sum to about N. So the total is about N comparisons for the pairing rounds plus at most 2(N-1) for the final pass, and when we remove the constants we get O(N). (I feel like I may have made a math error in this paragraph. If someone spots it, let me know)
See it in action here: http://coliru.stacked-crooked.com/a/144075406b7566c2 (The comparison counts may be slightly over the estimate due to simplifications I made in the implementation)
The key here is that if >50% of the students are in a class, then >=50% of the matching pairs are in that class for any arbitrary pairing, assuming the oddball student matches himself. One trick is that if exactly 50% match, it's possible that they alternate perfectly in the original order and thus nobody gets promoted. Luckily, the only such case is the alternating one, so by promoting the first and second students, even in that edge case >=50% of the promotions are in the large class.
It's complicated to prove that >=50% of the promotions are in the large class, and I'm not even certain I can articulate why this is. Confusingly, it also doesn't hold prettily for any other fraction. If the target is >=30% of the students, it's entirely possible that none of the promoted students are in the target class(es). So >=50% is the magic number; it isn't arbitrary at all.
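The promotion scheme described in this answer can be sketched in Python as follows. This is only a minimal sketch, not the code behind the coliru link: the function name `has_half_class` and the `same_class(a, b)` oracle are assumptions made for illustration, and the oracle stands in for the single yes/no question the task allows.

```python
def has_half_class(students, same_class):
    """Return True if some class holds at least half of `students`.

    `same_class(a, b)` is a hypothetical oracle answering the only
    question allowed by the task: are a and b classmates?
    """
    n = len(students)
    if n == 0:
        return False
    current = list(students)
    # Keep promoting until fewer than three candidates remain.
    while len(current) >= 3:
        promoted = []
        for i in range(0, len(current) - 1, 2):
            if same_class(current[i], current[i + 1]):
                promoted.append(current[i])  # first of a matching pair
        if len(current) % 2 == 1:
            promoted.append(current[-1])  # the oddball matches itself
        if not promoted:
            # Nobody matched: promote the first and second students.
            promoted = current[:2]
        current = promoted
    # Final pass: count each remaining candidate against all students.
    for candidate in current:
        count = sum(1 for s in students if same_class(candidate, s))
        if 2 * count >= n:
            return True
    return False
```

For a quick check, students can be represented by their course numbers and compared with `lambda a, b: a == b`; the algorithm itself only ever calls the oracle.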

If one can know the number of students for each course then it should suffice to know if there is a course with a number of students >= N/2. In this case you have a complexity of O(N) in the worst case.
If it is not possible to know the number of students for each course then you could use an altered quicksort. In each cycle you pick a random student and split the other students in classmates and non-classmates. If the number of classmates is >= N/2 you stop because you have the answer, else you analyze the non-classmates partition. If the number of students in that partition is < N/2 you stop because it is not possible to have a quantity of classmates >= N/2, else you pick another student from the non-classmates partition and repeat everything using only the non-classmates elements.
What we take from the quicksort algorithm is just the way we partition the students. The above algorithm has nothing to do with sorting. In pseudo-code it would look something like this (array indexing starts from 1 for the sake of clarity):
Student[] students = all_students;
int startIndex = 1;
int endIndex; // last position still to be analyzed in the current run
int i;
while(N - startIndex + 1 >= N/2){ // enough students are left to form a group of N/2
    endIndex = N; // this index must point to the last position in each cycle
    // move a randomly chosen remaining student to startIndex
    students.swap(startIndex, startIndex + random_int % (endIndex - startIndex + 1));
    for(i = startIndex + 1; i <= endIndex;){
        if(students[startIndex].isClassmatesWith(students[i])){
            i++;
        }else{
            students.swap(i, endIndex);
            endIndex--;
        }
    }
    if(i - startIndex >= N/2){ // the classmates occupy startIndex..i-1
        return true;
    }
    startIndex = i;
}
return false;
The situation of the partition before the algorithm starts would be as simple as:
| all_students_that_must_be_analyzed |
during the first run the set of students would be partitioned this way:
| classmates | to_be_analyzed | not_classmates |
and during each run after it, the set of students would be partitioned as follows:
| to_ignore | classmates | to_be_analyzed | not_classmates |
In the end of each run the set of students would be partitioned in the following way:
| to_ignore | classmates | not_classmates |
At this moment we need to check if the classmates partition has at least N/2 elements. If it has, then we have a positive result; if not, we need to check if the not_classmates partition has >= N/2 elements. If it has, then we need to proceed with another run; otherwise we have a negative result.
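The partitioning runs described above can also be written as a short Python sketch. The names `half_are_classmates` and `same_class` are made up for illustration, and the threshold is assumed to be N/2 rounded up (the same rounding the C# test further down uses); treat it as one possible reading of the algorithm rather than a reference implementation.

```python
import random


def half_are_classmates(students, same_class, rng=random):
    """Partition-based check for a class holding at least half the students.

    `same_class(a, b)` is a hypothetical yes/no oracle.
    Assumption: "half" means ceil(N / 2).
    """
    n = len(students)
    need = (n + 1) // 2
    pool = list(students)
    start = 0  # pool[:start] is the to_ignore partition
    while start < n and n - start >= need:
        # Move a randomly chosen remaining student to the front.
        pick = rng.randrange(start, n)
        pool[start], pool[pick] = pool[pick], pool[start]
        i, end = start + 1, n - 1
        while i <= end:
            if same_class(pool[start], pool[i]):
                i += 1  # grows the classmates partition
            else:
                pool[i], pool[end] = pool[end], pool[i]
                end -= 1  # grows the not_classmates partition
        if i - start >= need:  # classmates occupy pool[start:i]
            return True
        start = i  # everything before i is ignored from now on
    return False
```

The random pick only affects how many runs are needed, not the answer, so the result is the same for any random seed.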
Regarding complexity
Thinking more in depth on the complexity of the above algorithm, there are two main factors that affect it, which are:
The number of students in each course (it's not necessary to know this number for the algorithm to work).
The average number of classmates found in each iteration of the algorithm.
An important part of the algorithm is the random choice of the student to be analyzed.
The worst case scenario would be when each course has 1 student. In this case (for obvious reasons I would say) the complexity would be O(N^2). If the number of students for the courses varies then this case won't happen.
An example of the worst case scenario would be when we have, let's say, 10 students, 10 courses, and 1 student for each course. We would check 10 students the first time, 9 students the second time, 8 students the third time, and so on. This brings a O(N^2) complexity.
The best case scenario would be when the first student you choose is in a course with a number of students >= N/2. In this case the complexity would be O(N) because it would stop in the first run.
An example of the best case scenario would be when we have 10 students, 5 (or more) of which are classmates, and in the first run we pick one of these 5 students. In this case we would check only 1 time for the classmates, find 5 classmates, and return true.
The average case scenario is the most interesting part (and more close to a real-world scenario). In this case there are some probabilistic calculations to make.
First of all, the chances of a student from a particular course to be picked are [number_of_students_in_the_course] / N. This means that, in the first runs it's more probable to pick a student with many classmates.
That being said, let's consider the case where the average number of classmates found in each iteration is smaller than N/2 (as is the length of each partition in the average case for quicksort). Let's say that the average number of classmates found in each iteration is 10% (a number taken for ease of calculation) of the remaining M students (those that are not classmates of the previously picked students). In this case we would have these values of M for each iteration:
M1 = N - 0.1*N = 0.9*N
M2 = M1 - 0.1*M1 = 0.9*M1 = 0.9*0.9*N = 0.81*N
M3 = M2 - 0.1*M2 = 0.9*M2 = 0.9*0.81*N = 0.729*N and I would round it to 0.73*N for ease of calculations
M4 = 0.9*M3 = 0.9*0.73*N = 0.657*N ~= 0.66*N
M5 = 0.9*M4 = 0.9*0.66*N = 0.594*N ~= 0.6*N
M6 = 0.9*M5 = 0.9*0.6*N = 0.54*N
M7 = 0.9*M6 = 0.9*0.54*N = 0.486*N ~= 0.49*N
The algorithm stops because only 49% of the students remain, so we can't have N/2 or more classmates among them.
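The sequence M1..M7 can be reproduced without the intermediate rounding. The sketch below assumes, as the example does, that a fixed fraction of the remaining students is removed in every run; the function name is hypothetical.

```python
def runs_until_below_half(fraction_removed):
    """Count runs until fewer than half of the students remain.

    Assumes every run removes the same fraction (0 < fraction <= 1)
    of the students that are still left.
    """
    if not 0 < fraction_removed <= 1:
        raise ValueError("fraction_removed must be in (0, 1]")
    remaining, runs = 1.0, 0
    while remaining >= 0.5:
        remaining *= 1.0 - fraction_removed
        runs += 1
    return runs
```

With a fraction of 0.1 this gives 7 runs, matching M7 above (without rounding, 0.9^7 is about 0.478 of N).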
Obviously, in the case of a smaller percentage of average classmates the number of iterations will be greater but, combined with the first fact (the students in courses with many students have a higher probability to get picked in the early iterations), the complexity will tend towards O(N), the number of iterations (in the outer cycle in the pseudo-code) will be (more or less) constant and will not depend on N.
To better explain this scenario let's work with greater (but more realistic) numbers and more than 1 distribution. Let's say that we have 100 students (number taken for the sake of simplicity in calculations) and that these students are distributed among the courses in one of the following (hypothetical) ways (the numbers are sorted just for explanation purposes, they are not necessary for the algorithm to work):
50, 30, 10, 5, 1, 1, 1, 1, 1
35, 27, 25, 10, 5, 1, 1, 1
11, 9, 9, 8, 7, 7, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 1
The numbers given are also (in this particular case) the probabilities that a student in a course (not a particular student, just a student of that course) is picked in the first run. The 1st case is when we have a course with half the students. The 2nd case is when we don't have a course with half the students but more than 1 course with many students. The 3rd case is when we have a similar distribution among the courses.
In the 1st case we would have a 50% probability that a student of the 1st course gets picked in the first run, a 30% probability that a student of the 2nd course gets picked, a 10% probability for the 3rd course, a 5% probability for the 4th course, and 1% each for the 5th, 6th, 7th, 8th, and 9th courses. The probabilities are higher for a student of the 1st course to get picked early, and if a student from this course does not get picked in the first run, the probability that one gets picked in the second run only increases. For example, let's suppose that in the 1st run a student from the second course is picked. 30% of the students would be "removed" (as in "not considered anymore") and not be analyzed in the 2nd run. In the 2nd run we would have 70 students remaining. The probability of picking a student from the 1st course in the second run would be 5/7, more than 70%. Let's suppose that - out of bad luck - in the 2nd run a student from the 3rd course gets picked. In the 3rd run we would have 60 students left and the probability that a student from the first course gets picked in the 3rd run would be 5/6 (more than 80%). I would say that we can consider our bad luck to be over in the 3rd run, a student from the 1st course gets picked, and the method returns true :)
For the 2nd and 3rd case I would follow the probabilities for each run, just for the sake of simplicity of calculations.
In the 2nd case we would have a student from the 1st course picked in the 1st run. Since the number of classmates is not >= N/2, the algorithm would go on with the 2nd run. At the end of the 2nd run we would have "removed" 35+27=62 students from the student set. In the 3rd run we would have 38 students left, and since 38 < (N/2) = 50 the computation stops and returns false.
The same happens in the 3rd case (in which we "remove" an average of 10% of the remaining students in each run), but with more steps.
Final considerations
In any case, the complexity of the algorithm in the worst case scenario is O(N^2). The average case scenario is heavily based on probabilities and tends to pick early the students from courses with many attendees. This behaviour tends to bring the complexity down to O(N), complexity that we also have in the best case scenario.
Test of the algorithm
In order to test the theoretical complexity of the algorithm I wrote the following code in C#:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
public class Course
{
    public int ID { get; set; }
    public Course() : this(0) { }
    public Course(int id)
    {
        ID = id;
    }
    public override bool Equals(object obj)
    {
        return (obj is Course) && this.Equals((Course)obj);
    }
    public bool Equals(Course other)
    {
        return ID == other.ID;
    }
}
public class Student
{
    public int ID { get; set; }
    public Course Class { get; set; }
    public Student(int id, Course course)
    {
        ID = id;
        Class = course;
    }
    public Student(int id) : this(id, null) { }
    public Student() : this(0) { }
    // The only question the task allows: are these two students in the same course?
    public bool IsClassmatesWith(Student other)
    {
        return Class == other.Class;
    }
    public override bool Equals(object obj)
    {
        return (obj is Student) && this.Equals((Student)obj);
    }
    public bool Equals(Student other)
    {
        return ID == other.ID && Class == other.Class;
    }
}
class Program
{
    static int[] Sizes { get; set; }
    static List<Student> Students { get; set; }
    static List<Course> Courses { get; set; }
    static void Initialize()
    {
        Sizes = new int[] { 2, 10, 100, 1000, 10000, 100000, 1000000 };
        Students = new List<Student>();
        Courses = new List<Course>();
    }
    static void PopulateCoursesList(int size)
    {
        for (int i = 1; i <= size; i++)
        {
            Courses.Add(new Course(i));
        }
    }
    static void PopulateStudentsList(int size)
    {
        // Each student is assigned to a random course.
        Random ran = new Random();
        for (int i = 1; i <= size; i++)
        {
            Students.Add(new Student(i, Courses[ran.Next(Courses.Count)]));
        }
    }
    static void Swap<T>(List<T> list, int i, int j)
    {
        if (i < list.Count && j < list.Count)
        {
            T temp = list[i];
            list[i] = list[j];
            list[j] = temp;
        }
    }
    static bool AreHalfOfStudentsClassmates()
    {
        int startIndex = 0;
        int endIndex;
        int i;
        int numberOfStudentsToConsider = (Students.Count + 1) / 2;
        Random ran = new Random();
        while (startIndex <= numberOfStudentsToConsider)
        {
            endIndex = Students.Count - 1;
            // Move a randomly chosen remaining student to startIndex.
            Swap(Students, startIndex, startIndex + ran.Next(endIndex + 1 - startIndex));
            for (i = startIndex + 1; i <= endIndex; )
            {
                if (Students[startIndex].IsClassmatesWith(Students[i]))
                {
                    i++;
                }
                else
                {
                    Swap(Students, i, endIndex);
                    endIndex--;
                }
            }
            // i - startIndex already counts the picked student plus the classmates found.
            if (i - startIndex >= numberOfStudentsToConsider)
            {
                return true;
            }
            startIndex = i;
        }
        return false;
    }
    static void Main(string[] args)
    {
        Initialize();
        int studentsSize, coursesSize;
        Stopwatch stopwatch = new Stopwatch();
        TimeSpan duration;
        bool result;
        for (int i = 0; i < Sizes.Length; i++)
        {
            for (int j = 0; j < Sizes.Length; j++)
            {
                Courses.Clear();
                Students.Clear();
                studentsSize = Sizes[j];
                coursesSize = Sizes[i];
                PopulateCoursesList(coursesSize);
                PopulateStudentsList(studentsSize);
                Console.WriteLine("Test for {0} students and {1} courses.", studentsSize, coursesSize);
                // Restart (not Start) so the elapsed time of earlier tests is not accumulated.
                stopwatch.Restart();
                result = AreHalfOfStudentsClassmates();
                stopwatch.Stop();
                duration = stopwatch.Elapsed;
                // Brute-force check of the result by grouping the students per course.
                var studentsGrouping = Students.GroupBy(s => s.Class);
                var classWithMoreThanHalfOfTheStudents = studentsGrouping.FirstOrDefault(g => g.Count() >= (studentsSize + 1) / 2);
                Console.WriteLine(result ? "At least half of the students are classmates." : "Less than half of the students are classmates");
                if ((result && classWithMoreThanHalfOfTheStudents == null)
                    || (!result && classWithMoreThanHalfOfTheStudents != null))
                {
                    Console.WriteLine("There is something wrong with the result");
                }
                Console.WriteLine("Test duration: {0}", duration);
                Console.WriteLine();
            }
        }
        Console.ReadKey();
    }
}
The execution time matched the expectations of the average case scenario. Feel free to play with the code, you just need to copy and paste it and it should work.

I will post some of my ideas.
First of all, I think that we need to do something like mergesort to get the logarithmic part. I thought that at the lowest level, where we have just 2 students to compare, we just ask and get an answer. But that doesn't solve anything. In this case we will just have N/2 pairs of students and the knowledge of whether or not each pair are classmates. And this doesn't help.
The next idea was a little bit better. I didn't divide the set down to the minimum level, but stopped when I had sets of 4 students. So I had N/4 little sets in which I compared everyone to each other. If I found that at least two of them were classmates, that was good. If not, and all of them were from different classes, I completely forgot that group of 4. After applying this to each group, I started joining them into groups of 8 just by comparing those who were already flagged as classmates (thanks to transitivity). And again, if there were at least 4 classmates in a group of 8, I was happy, and if not, I forgot about that group. This ought to be repeated until I have two sets of students and make one comparison between students from both sets to get the final answer. BUT the problem is that there can be N/2-1 classmates in one half and just one student matching them in the other half, and the algorithm doesn't handle this case.

Related

convert to divide and conquer algorithm. Kotlin

Convert the method "Final" to a divide and conquer algorithm.
The task sounded like this: the buyer has n coins in denominations of H1,...,Hn. The seller has m coins in denominations of B1,...,Bm. Can the buyer purchase an item of cost S so that the seller has exact change (if necessary)?
fun Final(H: ArrayList<Int>, B: ArrayList<Int>, S: Int): Boolean {
    var Clon_Price = false;
    var Temp: Int;
    // The buyer can pay the exact price with a single coin.
    for (i in H) {
        if (i == S)
            return true;
    }
    // Otherwise look for a buyer coin whose change (coin - S) matches a seller coin.
    for (i in H.withIndex()) {
        Temp = i.value - S;
        for (j in B) {
            if (j == Temp)
                Clon_Price = true;
        }
    }
    return Clon_Price;
}
fun main(args: Array<String>) {
    val H:ArrayList<Int> = ArrayList();
    val B:ArrayList<Int> = ArrayList();
    println("Enter the number of coins the buyer has:");
    var n: Int = readln().toInt();
    println("Enter their nominal value:")
    while (n > 0){
        H.add(readln().toInt());
        n--
    }
    println("Enter the number of coins the seller has:");
    var m: Int = readln().toInt();
    println("Enter their nominal value:")
    while (m > 0){
        B.add(readln().toInt());
        m--
    }
    println("Enter the product price:");
    val S = readln().toInt();
    if(Final(H,B,S)){
        println("YES");
    }
    else{
        println("No!");
    }
}
Introduction
Since this is an assignment, I will only give you insights to solve this problem and you will need to do the coding yourself.
The algorithm
Receives two ArrayList<Int> and an Int parameter
if the searched (S) element can be found in H, then the result is true
Otherwise it loops H
Computes the difference between the current element and S
Searches for a match in B and if it's found, then true is being returned
If the method has not returned yet, then return false;
Divide et impera (Divide and conquer)
Divide and conquer is the process of breaking down a complicated task into similar, but simpler subtasks, repeating this breaking down until the subtasks become trivial (this was the divide part) and then, using the results of the trivial subtasks we can solve the slightly more complicated subtasks and go upwards in our layers of unsolved complexities until the problem is solved (this is the conquer part).
A very handy data-structure to use is the Stack. You can use the stack of your memory, which are fancy words for recursion, or, you can solve it iteratively, by managing such a stack yourself.
This specific problem
This algorithm does not seem to necessitate divide and conquer, given the fact that you only have two array lists that can be iterated, so, I guess, this is an early assignment.
To make sure this is divide and conquer, you can add two parameters to your method (which are 0 and length - 1 at the start) that reflect the current problem-space. And upon each call, check whether the starting and ending index (the two new parameters) are equal. If they are, you already have a trivial, simplified subtask and you just iterate the second ArrayList.
If they are not equal, then you still need to divide. You can simply
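//... Some code here
// split at the midpoint of the current range (using end / 2 would stop shrinking once start > end / 2)
return Final(H, B, S, start, (start + end) / 2) || Final(H, B, S, (start + end) / 2 + 1, end);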
(there you go, I couldn't resist writing code, after all)
for your nontrivial cases. This automatically breaks down the problem into sub-problems.
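To illustrate the range-splitting idea in a language-neutral way, here is a small Python sketch (not the Kotlin solution to the assignment). The name `can_pay` and the default arguments are assumptions for illustration; the logic mirrors the original method: a single buyer coin either equals S or leaves change that matches a single seller coin.

```python
def can_pay(H, B, S, start=0, end=None):
    """Divide and conquer over the buyer's coins H[start..end].

    Mirrors the original method: True if some buyer coin equals S,
    or if (coin - S) equals one of the seller's coins in B.
    """
    if not H:
        return False
    if end is None:
        end = len(H) - 1
    if start == end:
        # Trivial subtask: one buyer coin, scan the seller's coins.
        coin = H[start]
        return coin == S or (coin - S) in B
    mid = (start + end) // 2  # midpoint of the current range
    return can_pay(H, B, S, start, mid) or can_pay(H, B, S, mid + 1, end)
```

Calling `can_pay([5, 10], [2, 3], 7)` splits the buyer's coins into `[5]` and `[10]`; the second half succeeds because 10 - 7 = 3 is one of the seller's coins.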
Self-criticism
The idea above is a simplistic solution for you to get the gist. But, in reality, programmers dislike recursion, as it can lead to trouble. So, once you complete the implementation of the above, you are well-advised to convert your algorithm to make sure it's iterative, which should be fairly easy once you succeeded implementing the recursive version.

What is the largest number of people that were in the city at the same period?

A problem was discussed in class today:
n pairs (ai,bi) are given; each pair stands for a person, and ai, bi represent his entrance date to and exit date from the city during 2019.
The question: what was the largest number of people in the city at the same time?
I tried to cast the dates to integers in [1,365], insert the entrance dates into one AVL tree and the exit dates into another, and keep pointers into both while traversing one tree and updating the maximum if needed.
I believe this solution is a naive one since it takes O(n^2).
The data structures that we have learned are:
Array, Linked-List, Queue, Stack, Heap, BST, AVL, Hash-Table, SkipList and Graph.
You can use array for this logic.
Since number of days in the year is fixed, create an array to keep count of number of people in the city in that particular day.
count[365] = {0}; //Reset the counter, all entries should be zero.
maxCount = 0;
maxDay = -1;
for(day from 0 to (365-1))
{
    for(i from 0 to (n-1) person)
    {
        if(day >= ai && day <= bi) //Update this check based on whether ai and bi are inclusive or not.
        {
            count[day] = count[day]+1;
            if(count[day] > maxCount) //Keep track of maxCount, if required update it.
            {
                maxCount = count[day];
                maxDay = day;
            }
        }
    }
}
Output maxDay, maxCount;
The time complexity of the above logic is O(365*n) => O(n).
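A variation on the same array idea, not the loop shown above: instead of testing every person for every day, record +1 on the entrance day and -1 on the day after the exit, then take a running sum. The sketch below assumes days are integers in [1, 365] and that both ai and bi are inclusive; the function name is made up.

```python
def busiest_day(stays, days=365):
    """Return (day, count) for the first day with the most people in the city.

    `stays` is a list of (a, b) pairs with 1 <= a <= b <= days,
    both ends inclusive (an assumption, adjust if exits are exclusive).
    """
    delta = [0] * (days + 2)  # index days + 1 absorbs the last exits
    for a, b in stays:
        delta[a] += 1      # person arrives on day a
        delta[b + 1] -= 1  # person is gone from day b + 1 on
    best_day, best_count, present = -1, 0, 0
    for day in range(1, days + 1):
        present += delta[day]
        if present > best_count:
            best_day, best_count = day, present
    return best_day, best_count
```

This runs in O(n + 365) time and still needs only an array, which is on the list of data structures from the question.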

Divvying people into rooms by last name?

I often teach large introductory programming classes (400 - 600 students) and when exam time comes around, we often have to split the class up into different rooms in order to make sure everyone has a seat for the exam.
To keep things logistically simple, I usually break the class apart by last name. For example, I might send students with last names A - H to one room, last name I - L to a second room, M - S to a third room, and T - Z to a fourth room.
The challenge in doing this is that the rooms often have wildly different capacities and it can be hard to find a way to segment the class in a way that causes everyone to fit. For example, suppose that the distribution of last names is (for simplicity) the following:
Last name starts with A: 25
Last name starts with B: 150
Last name starts with C: 200
Last name starts with D: 50
Suppose that I have rooms with capacities 350, 50, and 50. A greedy algorithm for finding a room assignment might be to sort the rooms into descending order of capacity, then try to fill in the rooms in that order. This, unfortunately, doesn't always work. For example, in this case, the right option is to put last name A in one room of size 50, last names B - C into the room of size 350, and last name D into another room of size 50. The greedy algorithm would put last names A and B into the 350-person room, then fail to find seats for everyone else.
It's easy to solve this problem by just trying all possible permutations of the room orderings and then running the greedy algorithm on each ordering. This will either find an assignment that works or report that none exists. However, I'm wondering if there is a more efficient way to do this, given that the number of rooms might be between 10 and 20 and checking all permutations might not be feasible.
To summarize, the formal problem statement is the following:
You are given a frequency histogram of the last names of the students in a class, along with a list of rooms and their capacities. Your goal is to divvy up the students by the first letter of their last name so that each room is assigned a contiguous block of letters and does not exceed its capacity.
Is there an efficient algorithm for this, or at least one that is efficient for reasonable room sizes?
EDIT: Many people have asked about the contiguous condition. The rules are
Each room should be assigned at most a block of contiguous letters, and
No letter should be assigned to two or more rooms.
For example, you could not put A - E, H - N, and P - Z into the same room. You could also not put A - C in one room and B - D in another.
Thanks!
It can be solved using a DP over an [m, 2^n] state space, where m is the number of letters (26 for English) and n is the number of rooms. With m == 26 and n == 20 it will take about 100 MB of space and ~1 sec of time.
Below is the solution I have just implemented in C# (it should also compile as C++ or Java with just a few minor changes):
int[] GetAssignments(int[] studentsPerLetter, int[] rooms)
{
    int numberOfRooms = rooms.Length;
    int numberOfLetters = studentsPerLetter.Length;
    int roomSets = 1 << numberOfRooms; // 2 ^ (number of rooms)
    int[,] map = new int[numberOfLetters + 1, roomSets];
    for (int i = 0; i <= numberOfLetters; i++)
        for (int j = 0; j < roomSets; j++)
            map[i, j] = -2;
    map[0, 0] = -1; // starting condition
    for (int i = 0; i < numberOfLetters; i++)
        for (int j = 0; j < roomSets; j++)
            if (map[i, j] > -2)
            {
                for (int k = 0; k < numberOfRooms; k++)
                    if ((j & (1 << k)) == 0)
                    {
                        // this room is empty yet.
                        int roomCapacity = rooms[k];
                        int t = i;
                        for (; t < numberOfLetters && roomCapacity >= studentsPerLetter[t]; t++)
                            roomCapacity -= studentsPerLetter[t];
                        // marking next state as good, also specifying index of just occupied room
                        // - it will help to construct solution backwards.
                        map[t, j | (1 << k)] = k;
                    }
            }
    // Constructing solution.
    int[] res = new int[numberOfLetters];
    int lastIndex = numberOfLetters - 1;
    for (int j = 0; j < roomSets; j++)
    {
        int roomMask = j;
        while (map[lastIndex + 1, roomMask] > -1)
        {
            int lastRoom = map[lastIndex + 1, roomMask];
            int roomCapacity = rooms[lastRoom];
            for (; lastIndex >= 0 && roomCapacity >= studentsPerLetter[lastIndex]; lastIndex--)
            {
                res[lastIndex] = lastRoom;
                roomCapacity -= studentsPerLetter[lastIndex];
            }
            roomMask ^= 1 << lastRoom; // Remove last room from set.
            j = roomSets; // Over outer loop.
        }
    }
    return lastIndex > -1 ? null : res;
}
Example from OP question:
int[] studentsPerLetter = { 25, 150, 200, 50 };
int[] rooms = { 350, 50, 50 };
int[] ans = GetAssignments(studentsPerLetter, rooms);
Answer will be:
2
0
0
1
Which indicates index of room for each of the student's last name letter. If assignment is not possible my solution will return null.
[Edit]
After thousands of auto-generated tests my friend has found a bug in the code which constructs the solution backwards. It does not influence the main algorithm, so fixing this bug is left as an exercise for the reader.
The test case that reveals the bug is students = [13,75,21,49,3,12,27,7] and rooms = [6,82,89,6,56]. My solution returns no answer, but actually there is an answer. Please note that the first part of the solution works properly, but the answer construction part fails.
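One way to sidestep the backward-construction problem is to remember, for every set of rooms, where the last room started. The Python sketch below is a variation on the DP above, not a patch of the C# code: for each room set it stores the longest prefix of letters that set can cover, filling each room as far as it goes, plus the room and start index that achieved it. The function name and the table layout are assumptions for illustration.

```python
def assign_rooms(students_per_letter, rooms):
    """Return a room index per letter, or None if no assignment exists.

    best[mask] is the longest prefix of letters that the rooms in `mask`
    can cover; choice[mask] remembers (last room, its first letter).
    """
    m, n = len(students_per_letter), len(rooms)
    best = [-1] * (1 << n)
    choice = [None] * (1 << n)
    best[0] = 0
    for mask in range(1, 1 << n):
        for k in range(n):
            if not mask & (1 << k):
                continue
            start = best[mask ^ (1 << k)]  # prefix covered without room k
            end, capacity = start, rooms[k]
            while end < m and students_per_letter[end] <= capacity:
                capacity -= students_per_letter[end]
                end += 1
            if end > best[mask]:
                best[mask] = end
                choice[mask] = (k, start)
    for mask in range(1 << n):
        if best[mask] == m:
            # Walk back through the stored choices to label the letters.
            result = [None] * m
            end = m
            while mask:
                k, start = choice[mask]
                for letter in range(start, end):
                    result[letter] = k
                end = start
                mask ^= 1 << k
            return result
    return None
```

On the example from the question this returns `[2, 0, 0, 1]`, and it also finds an assignment for the test case quoted above. The table has 2^n entries instead of m * 2^n, at the cost of an inner fill loop per room.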
This problem is NP-Complete and thus there is no known polynomial time (aka efficient) solution for this (as long as people cannot prove P = NP). You can reduce an instance of knapsack or bin-packing problem to your problem to prove it is NP-complete.
To solve this you can use 0-1 knapsack problem. Here is how:
First pick the biggest classroom size and try to allocate as many group of students you can (using 0-1 knapsack), i.e equal to the size of the room. You are guaranteed not to split a group of student, as this is 0-1 knapsack. Once done, take the next biggest classroom and continue.
(You use any known heuristic to solve 0-1 knapsack problem.)
Here is the reduction --
You need to reduce a general instance of 0-1 knapsack to a specific instance of your problem.
So lets take a general instance of 0-1 knapsack. Lets take a sack whose weight is W and you have x_1, x_2, ... x_n groups and their corresponding weights are w_1, w_2, ... w_n.
Now the reduction --- this general instance is reduced to your problem as follows:
you have one classroom with seating capacity W. Each x_i (i \in (1,n)) is a group of students whose last alphabet begins with i and their number (aka size of group) is w_i.
Now you can prove if there is a solution of 0-1 knapsack problem, your problem has a solution...and the converse....also if there is no solution for 0-1 knapsack, then your problem have no solution, and vice versa.
Please remember the important thing of reduction -- general instance of a known NP-C problem to a specific instance of your problem.
Hope this helps :)
Here is an approach that should work reasonably well, given common assumptions about the distribution of last names by initial. Fill the rooms from smallest capacity to largest as compactly as possible within the constraints, with no backtracking.
It seems reasonable (to me at least) for the largest room to be listed last, as being for "everyone else" not already listed.
Is there any reason to make life so complicated? Why can't you assign registration numbers to each student and then use the numbers to allocate them whichever way you want :) You do not need to write any code, students are happy, everyone is happy.

Determine if more than half of the array repeats in a distinct array

I was looking at the following question from Glassdoor:
Given N credit cards, determine if more than half of them belong to the same person/owner. All you have is an array of the credit card numbers, and an API call like isSamePerson(num1, num2).
It is clear how to do it in O(n^2) but some commenters said it can be done in O(n) time. Is it even possible? I mean, if we have an array of credit card numbers where some numbers are repeated, then the claim makes sense. However, we need to make an API call for each credit card number to see its owner.
What am I missing here?
The algorithm goes as follows:
If there is a majority of one item (here, a person), then if you pair together items that are not equal (in any order), this item will be left over.
Start with an empty candidate slot
For every item
If the candidate slot is empty (count = 0), place it there.
Else if it is equal to the item in the slot, increment its count.
Else decrement the count for that slot(pop one item).
If there is nothing left on the candidate slot, there is no clear majority. Otherwise,
Count the number of occurrences of the candidate (second pass).
If the number of occurrences is more than 50%, declare it a winner,
Else there is no majority.
Note this cannot be applied if the threshold is below 50% (but it should be possible to adapt to a threshold of 33%, 25%... by holding two, three... candidate slots and popping only a distinct triple, quadruple...).
This also applies to the case of the credit cards: all you need is to compare two elements (persons) for equality (via the API call), and a counter that is able to accommodate the total number of elements.
Time complexity: O(N)
Space complexity: O(1) + input
API calls: up to 2N-1: once in each pass, no api call for the first element in the first pass.
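The steps above translate almost line by line into Python. In this sketch `is_same_person` stands in for the API call from the question and `majority_owner` is a made-up name; the function returns one card of the majority owner, or None if nobody owns more than half of the cards.

```python
def majority_owner(cards, is_same_person):
    """Candidate-slot majority vote using only an equality oracle."""
    candidate, count = None, 0
    # First pass: cancel out pairs of cards with different owners.
    for card in cards:
        if count == 0:
            candidate, count = card, 1  # empty slot: place the card
        elif is_same_person(candidate, card):
            count += 1
        else:
            count -= 1  # pop one item
    if count == 0:
        return None  # nothing left in the slot: no clear majority
    # Second pass: verify that the candidate really has a majority.
    occurrences = sum(1 for card in cards if is_same_person(candidate, card))
    return candidate if 2 * occurrences > len(cards) else None
```

For simplicity the second pass also compares the candidate with itself, so this sketch can make one more call than the 2N-1 bound mentioned above.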
Let x1,x2,...,xn be the credit card numbers.
Note that since more than half of them belong to the same person, if you split the numbers into adjacent pairs, at least one pair must consist of two cards belonging to the same person (pigeonhole principle).
If you consider all pairs (x1,x2), (x3,x4), ..., and consider the subset of pairs where both elements belong to the same person, a majority of these same-person pairs belong to the person who has the majority of cards in the first place. So, for every same-person pair keep one of the card numbers, and for non-same-person pairs discard both. Do this recursively and return the last remaining same-person pair.
You need to perform at most n comparisons.
NOTE: If n is odd keep the unpaired number.
Why this works: consider a case where n is even and person A owns n/2 + 1 cards. In the worst case you have exactly one pair where both cards are owned by A. In that case none of the other pairs are owned by the same person (every other pair contains one card of A and one card of another person).
Now, to create one matching pair for B (a non-A person), you necessarily create another matching pair for A as well. Therefore, at every stage a majority of the matching pairs are owned by A.
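A minimal Python sketch of this pair-and-discard idea, assuming a hypothetical `same_owner(a, b)` in place of the API call. One deliberate difference from the note above: instead of always keeping the unpaired card when the count is odd, the sketch checks whether that card is itself a majority of the current round and drops it otherwise, because keeping it unconditionally can leave a tie (owners A, A, B would reduce to A, B). The surviving candidate is verified against the full input at the end.

```python
def majority_by_pairing(cards, same_owner):
    """Return a card of the majority owner, or None if there is no majority.

    `same_owner(a, b)` is an assumed stand-in for the API call.
    """
    current = list(cards)
    while len(current) > 1:
        if len(current) % 2 == 1:
            # Odd round: test the unpaired card directly against the rest.
            leftover = current.pop()
            matches = 1 + sum(1 for c in current if same_owner(leftover, c))
            if 2 * matches > len(current) + 1:
                current = [leftover]
                break
        # Keep one card from every same-owner pair, discard mixed pairs.
        current = [
            a for a, b in zip(current[0::2], current[1::2]) if same_owner(a, b)
        ]
    if not current:
        return None
    # The survivor is only a candidate; confirm it on the original input.
    candidate = current[0]
    total = sum(1 for c in cards if same_owner(candidate, c))
    return candidate if 2 * total > len(cards) else None
```

Each round at least halves the list, so the comparisons form a geometric series and the total stays linear in n.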
I don't have enough reputation to comment. The method described by Jan Dvorak is known as Moore's Voting Algorithm (also called the Boyer-Moore majority vote algorithm, a stream counting algorithm). Here's the code.
int majorityElement(int* nums, int numsSize) {
    int count = 0, i, majorityElement;
    /* Moore's voting algorithm: Order(n) time, Order(1) auxiliary space */
    /* Step 1: get a candidate for the majority element */
    for (i = 0; i < numsSize; i++)
    {
        if (count == 0)
            majorityElement = nums[i];
        if (nums[i] == majorityElement)
            count++;
        else
            count--;
    }
    /* Step 2: check whether this candidate really occurs more than n/2 times */
    count = 0;
    for (i = 0; i < numsSize; i++)
    {
        if (nums[i] == majorityElement)
            count++;
    }
    if (count > numsSize/2) /* It can only be applied for majority check */
        return majorityElement;
    return -1;
}
The question is to find the majority element in an array. Instead of the
Boyer-Moore Majority Vote algorithm, I am doing this by counting occurrences in a HashMap, which needs O(n) extra space and access to the values themselves.
import java.util.HashMap;
import java.util.Map;

public class majorityElement1 {
    public static void main(String[] args) {
        int a[] = {2,2,2,2,5,5,2,3,3,3,3,3,3,33,3};
        fix(a);
    }

    public static void fix(int[] a) {
        Map<Integer,Integer> map = new HashMap<>();
        for (int i = 0; i < a.length; i++) {
            int r = a[i];
            if (!map.containsKey(r)) {
                map.put(r, 1);
            } else {
                // Reports r once its count reaches a.length/2 (integer division),
                // i.e. at half of the elements; use > for a strict majority.
                if (map.get(r) + 1 >= a.length/2) {
                    System.out.println("majority element => " + r);
                    return;
                } else {
                    map.put(r, map.get(r) + 1);
                }
            } // else1
        } // for
    }
}
The output is 3.
DONE IN ONE PASS:
1. Start from the second index of the array, i.e. i=1 initially.
2. Initially count=1.
3. Call isSamePerson(a[i],a[i-1]) where array a[] contains the credit card numbers.
4. If the returned value is positive, do count++ and i++
   else if the returned value is 0 and count==1, do i++
   else if the returned value is 0 and count>1, do count-- and i++
5. If i<n, go to step 3, where n is the number of cards.
   else if at the end of the array count>1, then more than half of the cards belong to a single person
   else there is no clear majority of over 50%.
Caveat: because only adjacent cards are compared, this misses majorities whose cards are never adjacent. For owners A, B, A, B, A the count stays at 1 although A owns 3 of 5 cards.
I hope that this is understandable and that writing the code will be easy.
TIME COMPLEXITY - O(N)
NUMBER OF API CALLS = N-1
SPACE COMPLEXITY - O(1)

Known algorithm for efficiently distributing items and satisfying minima?

For the following problem I'm wondering if there is a known algorithm already as I don't want to reinvent the wheel.
In this case it's about hotel rooms, but I think that is rather irrelevant:
name | max guests | min guests
1p | 1 | 1
2p | 2 | 2
3p | 3 | 2
4p | 4 | 3
I'm trying to distribute a certain amount of guests over available rooms, but the distribution has to satisfy the 'min guests' criteria of the rooms. Also, the rooms need to be used as efficiently as possible.
Let's take 7 guests for example. I wouldn't want this combination:
3 x 3p ( 1 x 3 guests, 2 x 2 guests )
.. this would satisfy the minimum criteria, but would be inefficient. Rather I'm looking for combinations such as:
1 x 3p and 1 x 4p
3 x 2p and 1 x 1p
etc...
I would think this is a familiar problem. Is there any known algorithm already to solve this problem?
To clarify:
By efficient I mean: distribute guests in such a way that rooms are filled up as much as possible (guests' preferences are of secondary concern here, and are not important for the algorithm I'm looking for).
I do want all combinations that satisfy this efficiency criterion though. So in the above example 7 x 1p would be fine as well.
So in summary:
Is there a known algorithm that is able to distribute items as efficiently as possible over slots with a min and max capacity, always satisfying the min criterion and trying to fill each slot as close to its max as possible?
You need to use dynamic programming: define a cost function, and try to fit people into the possible rooms so that the cost function is as small as possible.
Your cost function can be something like:
Sum of vacancies in rooms + number of rooms
It is a bit similar to the least raggedness problem: Word wrap to X lines instead of maximum width (Least raggedness)
You fit people into rooms as you fit words into lines.
The constraints are the vacancies in the rooms instead of the lengths of the lines (infinite cost if you don't fulfil the constraints),
and the recursion relation is pretty much the same.
Hope it helps
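One way to read this suggestion is the Python sketch below, under assumptions that are not in the answer itself: every room type is available in unlimited supply, the cost of a room is its vacancy plus one, and the names `cheapest_rooms`, `best` and `choice` are purely illustrative. It returns a single cheapest assignment rather than all of them.

```python
def cheapest_rooms(guests, room_types):
    """Return (cost, [(room name, guests in room), ...]) or None.

    room_types: iterable of (name, max_guests, min_guests) tuples.
    Assumed cost: total vacancy + number of rooms, unlimited rooms per type.
    """
    INF = float("inf")
    best = [0] + [INF] * guests      # best[g] = minimal cost to house g guests
    choice = [None] * (guests + 1)   # last room used to reach best[g]
    for g in range(1, guests + 1):
        for name, max_g, min_g in room_types:
            # Occupancies below min_g are never tried (infinite cost).
            for k in range(min_g, min(max_g, g) + 1):
                cost = best[g - k] + (max_g - k) + 1
                if cost < best[g]:
                    best[g] = cost
                    choice[g] = (name, k)
    if best[guests] == INF:
        return None
    # Walk back through the recorded choices to list the rooms.
    rooms, g = [], guests
    while g > 0:
        name, k = choice[g]
        rooms.append((name, k))
        g -= k
    return best[guests], rooms
```

With the room table from the question and 7 guests, this picks one full 3p room and one full 4p room at cost 2.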
$sql = "SELECT *
        FROM rooms
        WHERE min_guests <= $num_of_guests
        ORDER BY max_guests DESC
        LIMIT $num_of_guests";
$query = $this->db->query($sql);
$remaining_guests = $num_of_guests;
$rooms = array();
$filled = false;
foreach ($query->result() as $row)
{
    if (!$filled)
    {
        // take the largest remaining room and fill it completely
        $rooms[] = $row;
        $remaining_guests -= $row->max_guests;
        if ($remaining_guests <= 0)
        {
            $filled = true;
            break;
        }
    }
}
Recursive function:
public function getRoomsForNumberOfGuests($number)
{
    $sql = "SELECT *
            FROM rooms
            WHERE min_guests <= $number
            ORDER BY max_guests DESC
            LIMIT 1";
    $query = $this->db->query($sql);
    $remaining_guests = $number;
    $rooms = array();
    foreach ($query->result() as $row)
    {
        $rooms[] = $row;
        $remaining_guests -= $row->max_guests;
        if ($remaining_guests > 0)
        {
            $rooms = array_merge($this->getRoomsForNumberOfGuests($remaining_guests), $rooms);
        }
    }
    return $rooms;
}
Would something like this work for you? Not sure what language you're in.
For efficient = minimum rooms used, perhaps this would work. To minimise the number of rooms used you want to put the max guests in the large rooms.
So sort the rooms in descending order of max guests, then allocate guests to them in that order, placing max guests in each room in turn. Try to place all remaining guests in any remaining room that will accept that many guests as its minimum; if that is impossible, back-track and try again. When back-tracking, hold back the room with the smallest min guests. Held-back rooms are not allocated guests in the max guests phase.
EDIT
As Ricky Bobby pointed out, this does not work as such, because of the difficulty of the back-tracking. I'm keeping this answer for now, more as a warning than as a suggestion :-)
This can be done as a fairly straightforward modification of the recursive algorithm to enumerate integer partitions. In Python:
import collections

RoomType = collections.namedtuple("RoomType", ("name", "max_guests", "min_guests"))


def room_combinations(count, room_types, partial_combo=()):
    if count == 0:
        yield partial_combo
    elif room_types and count > 0:
        room = room_types[0]
        for guests in range(room.min_guests, room.max_guests + 1):
            yield from room_combinations(
                count - guests, room_types, partial_combo + ((room.name, guests),)
            )
        yield from room_combinations(count, room_types[1:], partial_combo)


for combo in room_combinations(
    7,
    [
        RoomType("1p", 1, 1),
        RoomType("2p", 2, 2),
        RoomType("3p", 3, 2),
        RoomType("4p", 4, 3),
    ],
):
    print(combo)
We should select a list of rooms (possibly using the same room type more than once) such that the sum of the min values of the selected rooms is equal to or smaller than the number of guests, and the sum of the max values is equal to or bigger than the number of guests.
I defined cost as the total free space in the selected rooms (cost = max - min).
We can check for the answer with this code and find all possible combinations with the minimum cost (C# code).
using System;
using System.Collections.Generic;

class FindRooms
{
    // input
    int numberOfGuest = 7; // your goal
    List<Room> rooms = new List<Room>();
    // output
    int cost = int.MaxValue; // total free space in rooms (best found so far)
    List<String> options = new List<String>(); // list of possible combinations for the best cost

    private void solve()
    {
        // fill rooms data
        // fill numberOfGuest
        // run this function to find the answer
        addMoreRoom("", 0, 0);
        // cost and options are ready to use
    }

    // this function adds rooms to the list recursively
    private void addMoreRoom(String selectedRooms, int minPerson, int maxPerson)
    {
        // check whether the current selection is acceptable
        if (minPerson <= numberOfGuest && maxPerson >= numberOfGuest)
        {
            // check whether it is better than or equal to the previous result
            if (maxPerson - minPerson == cost)
            {
                options.Add(selectedRooms);
            }
            else if (maxPerson - minPerson < cost)
            {
                cost = maxPerson - minPerson;
                options.Clear();
                options.Add(selectedRooms);
            }
        }
        // stop if too many rooms are selected
        if (minPerson > numberOfGuest)
            return;
        // add more rooms recursively
        foreach (Room room in rooms)
        {
            // add the room and its min and max space to the current state and check
            addMoreRoom(selectedRooms + "," + room.name, minPerson + room.min, maxPerson + room.max);
        }
    }

    public class Room
    {
        public String name;
        public int min;
        public int max;
    }
}

Resources