Round robin for 3 participants/trios? - algorithm

I'm speaking with reference to Scheduling algorithm for a round-robin tournament?.
I need to pair (or triple) a group of people into trios, in order for them to meet. For example, in a group of 9 people, the first meetings would be: [1, 2, 3], [4, 5, 6], [7, 8, 9]. Next meetings would be something like [1, 4, 7], [2, 5, 8], [3, 6, 9]. Things end when everyone has met everyone else, and we need to minimize the number of "rounds".
I'm at wit's end thinking of the solution to this. Many thanks to someone who can point me in the right direction :)

If "everyone has met everyone else" means that all pairs appear in the schedule, then this is a generalization of Kirkman's schoolgirl problem, solvable in the minimum number of rounds when there is an odd number of groups (existence of Kirkman triple systems, due to Ray-Chaudhuri and Wilson). The social golfer problem is a generalization to other group sizes, and I expect that the situation for even numbers of groups would be studied under that name.
In the (seemingly unlikely) event that "everyone has met everyone else" means that all possible groups have been used, then you want to use the construction in Baranyai's theorem to find hypergraph factors (see my previous answer on the topic for a Python implementation).

Related

Algorithms for Optimization of Integer Subset Linking

Consider having two sets of integer values that are divided in multiple subsets. The two sets exist of the same set of values but the order and the division into subsets differ. The idea is to link the subsets from the first set with these from the second set in such way that every individual value in each subset of the first set is linked to a same individual value of a subset of the second set. No value can be linked with two others. In one linking step multiple values can be linked between only one subset of the first set with only one subset of the second set. The goal is to reduce the amount of linking steps as much as possible.
The question is: are there algorithms around for doing this kind of linking as optimal as possible?
I have done some research in several fields of mathematical optimization, such as Linear Programming, Integer Programming, Combinatorial optimization and Operations Research but none of the algorithms seem to cover this problem. Do you guys have any ideas, fields or algorithms to optimize these kinds of problems and make me head in the right direction?
For example:
Two sets of integers with two subsets:
[[1, 2, 2] [2, 3, 3]]
and
[[1, 2, 3] [2, 2, 3]].
Now the first linking set could be to link the first subset of the first set 1[1] with the first subset of the second set 2[1].
This is one step and leads to a link between: 1 - 1 - 1 and 2 - 1 - 1 and a link between 1 - 1 - 2 and 2 - 1 - 2. Now the sets will look like this:
[[1, 2, 2] [2, 3, 3]]
and
[[1, 2, 3] [2, 2, 3]].
The next step could be linking 1[1] with 2[2], leading to a link between 1 - 1 - 3 and 2 - 2 - 1 and the sets will look like this:
[[1, 2, 2] [2, 3, 3]]
and
[[1, 2, 3] [2, 2, 3]].
The third step could be linking 1[2] with 2[1]. Resulting in:
[[1, 2, 2] [2, 3, 3]]
and
[[1, 2, 3] [2, 2, 3]].
And the fourth step could then be linking 1[2] to 2[2]. Resulting in:
[[1, 2, 2] [2, 3, 3]]
and
[[1, 2, 3] [2, 2, 3]], which means every value is linked. This solution costs four steps.
When having larger sets, all subsets can be linked to all other subsets of the other set, but that will result in many steps. Is there a algorithm around that optimizes the number of steps?
Even this is not an answer, but I think this is a step in defining the problem toward finding a solution.
Note: The following example of input/output was an edit. I disagree with the rejecting votes, and I URGE everyone to read carefully before voting to approve or reject any edit.
This would open a discussion about the votes that are non-carefully casted. Still is a constructive discussion but here is not its place.
Consider the following example: It is less costly (less using of sub-sets) to use the 3nd sub-set of the first list than using the 2nd and 5th sub-sets.
The algorithm:
Define the smaller list: List #2
Create a counting list of all items in all sub-lists of list #2.
You will have this counting list {[item:count]}: {[1:3], [2:2], [3:1], [4:2], [5:1]}.
Now, your problem instead of linking (i.e. index-dependent) the sub-sets. It is to find the min number of sub-sets of list #1. That their items would give the count of the counting list.
A simple try of each possible combination would definitely get the answer.. but I think from point #4 we can think of a better solution containing some conditions to minimize the combination tries.
Hopefully, this suggestion would help in giving a hint towards finding a solution.

flatten and compact an array more efficiently

On many occasions, we need to perform two or more different operations on an array like flatten and compact.
some_array.flatten.compact
My concern here is that it will loop over the array two times. Is there more efficient way of doing this?
I actually think this is a great question. But first off, why is everyone not too concerned about this? Here's the performance of flatten and flatten.compact compared:
Here's the code I used to generate this chart, and one that includes memory.
Hopefully now you see why most folks won't worry: it is just another constant factor you're adding when you compose a flatten with a compact, maybe it's valuable at least theoretically to say: how can we shave off the time and space of this intermediate structure? Again, asymptotically not super valuable, but curious to think about.
As far as I can tell, you can't do this by making use of flatten:
Before looking at the source, I hoped that flatten could take a block like so:
[[3, [3, 3, 3]], [3, [3, 3, 3]], [3, [3, 3, 3]], nil].flatten {|e| e unless e.nil? }
No dice though. We get this as a return:
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, nil]
This is weird in that it basically tosses the block away as a no-op. But it makes sense with the source. The C method flatten used in Ruby's core isn't parameterized to take a block.
The procedure in the Ruby source code reads kinda weird to me (I am not a C programmer) but it's basically doing something like depth first search. It's using a stack that it adds every new nested array in to process that it encounters. (It terminates when none remain.) I've not calculated this formally, but it leads me to guess the complexity is on par with DFS.
So the source code could've been written such this would work by allowing for extra setup if a block is passed in. But without that, you're stuck with the (small) performance hit!
It is not iterating over the same array two times. flatten creates in general an array that has an entirely different structure from the original one. Therefore, the first and the second iteration are not iterating over the same elements. So, it naturally follows that you cannot do that.
If the array is one layer deep, then the arrays can be merged in a set.
require 'set'
s = Set.new
Ar.each{|a| s.merge(a)}

Tortoise and hare algorithm

I was reading Tortoise and hare algorithm from wikipedia. I am wondering whether the python pseudocode is wrong. It seems to fail for the array: [1, 2, 2, 3, 4, 5, 6, 7, 8, 9, ....] as at the very beginning, the two values meet and the algorithm continues to find the start of the cycle which is doomed to failure.
I understand that there is condition of i ≥ μ, should this constraint be added to the code for finding the start of the cycle?
If this constraint is added, should the algorithm terminate and return no cycle found when failed or continue for another iteration? Since what if the input is [1, 2, 2, 3, 4, 5, 3, 4, 5, 3, 4, 5, ....] ?
How does this algorithm guarantee that at the first meeting point, both pointers are inside some cycles?
The tortoise and hare algorithm runs two pointers, one at offset i and the other at offset 2i, both one-based, so initially 1 and 2, and is meant to detect cycles in linked-list-style data structures.
And, just to be clear, it compares the pointers rather than data values they point to, I'm unsure whether you understand that so, on the off-chance you didn't, I just thought I'd mention it.
The initial start point is to have the tortoise on the first element and the hare on the second (assuming they exist of course - if they don't, no loop is possible), so it's incorrect to state that they're equal at the start. The pointer value can only ever become equal if the hare cycles and therefore catches the tortoise from behind.

decoding via number combinations algorithm in python 3

ok so here is the problem.
let's say:
1 means Bob
2 means Jerry
3 means Tom
4 means Henry
any summation combination of two of aforementioned numbers is a status/ mood type which is how the program will be encoded:
7 (4+3) means Angry
5 (3+2) menas Sad
3 (2+1) means Mad
4 (3+1) means Happy
and so on...
how may i create a decode function such that it accepts one of the added (encoded) values, such as 7, 5, 3, 4, etc and figures out the combination and return the names of the people representing the two numbers that constitue the combination. take note that one number cannot be repeated to get mood result, meaning 4 has to be 3+1 and may not be 2+2. so we can assume for this example, that there is only one possible combination for each status/ mood code. now the problem is, how do you implement such code in python 3? what would be the algorithm or logic for such a problem. how do you seek or check for combination of two numbers? i'm thinking i should just run a loop that keeps on adding two numbers at a time until the result matches with the status/ mood code. will that work? BUT THIS METHOD WILL SOON BECOME OBSOLETE IF THE NUMBER OF COMBINATIONS IS INCREASED (as in adding 4 numbers together instead of 2). doing it this way will take up a lot of time and will possibly be inefficient.
i apologize, i know this questions is extremely confusing but please bear with me.
let's try and work something out.
Use Binary
If you want to have sums that are unique, then assign each possible "Person" a number that's a power of 2. The sum of any combination of these numbers will uniquely identify which numbers were used in the sum.
1, 2, 4, 8, 16, ...
Rather than offer a detailed proof of correctness, I offer an intuitive argument about this: any number can be represented in base 2, and it is always a sum of exactly one combination of powers of 2.
This solution may not be optimal. It has realistic limitations (32 or 64 different "person" identifiers, unless you use some sort of BigInt), but depending on your needs, it might work. Having the smallest possible values, binary is better than any other radix though.
Example
(Edited)
Here's a quick snippet that demonstrates how you could decode the sum. The returned values are the exponents of the powers of 2. count_persons could be arbitrarily large, as could the range of n iterated over (just as a quick example).
#!/usr/bin/python3
count_persons = 64
for n in range(20,30):
matches = list(filter(lambda i: (n>>i) & 0x1, range(1,count_persons)))
print('{0}: {1}'.format(n,matches))
Output:
20: [2, 4]
21: [2, 4]
22: [1, 2, 4]
23: [1, 2, 4]
24: [3, 4]
25: [3, 4]
26: [1, 3, 4]
27: [1, 3, 4]
28: [2, 3, 4]
29: [2, 3, 4]
See a more appropriate answer here
In my opinion, the selected answer is so suboptimal that it can be considered plain wrong.
The table you are building can be indexed with N(N-1)/2 values, while the binary approach uses 2N.
With a 64 bits unsigned integer, you could encode about sqrt(265) values, that is 6 billion names, compared with the 64 names the binary approach will allow.
Using a big number library could push the limit somewhat, but the computations involved would be hugely more costly than the simple o(N) reverse indexing algorithm needed by the alternative approach.
My conclusion is: the binary approach is grossly inefficient, unless you want to play with a handful of values, in which case hard-coding or precomputing the indexes would be just as good a solution.
Since the question is very unlikely to match a search on the subject, it is not that important anyway.

Bogosort optimization, probability related

I'm coding a question on an online judge for practice . The question is regarding optimizing Bogosort and involves not shuffling the entire number range every time. If after the last shuffle several first elements end up in the right places we will fix them and don't shuffle those elements furthermore. We will do the same for the last elements if they are in the right places. For example, if the initial sequence is (3, 5, 1, 6, 4, 2) and after one shuffle Johnny gets (1, 2, 5, 4, 3, 6) he will fix 1, 2 and 6 and proceed with sorting (5, 4, 3) using the same algorithm.
For each test case output the expected amount of shuffles needed for the improved algorithm to sort the sequence of first n natural numbers in the form of irreducible fractions.
A sample input/output says that for n=6, the answer is 1826/189.
I don't quite understand how the answer was arrived at.
This looks similar to 2011 Google Code Jam, Preliminary Round, Problem 4, however the answer is n, I don't know how you get 1826/189.

Resources