Bogosort optimization, probability related - sorting

I'm coding a question on an online judge for practice . The question is regarding optimizing Bogosort and involves not shuffling the entire number range every time. If after the last shuffle several first elements end up in the right places we will fix them and don't shuffle those elements furthermore. We will do the same for the last elements if they are in the right places. For example, if the initial sequence is (3, 5, 1, 6, 4, 2) and after one shuffle Johnny gets (1, 2, 5, 4, 3, 6) he will fix 1, 2 and 6 and proceed with sorting (5, 4, 3) using the same algorithm.
For each test case output the expected amount of shuffles needed for the improved algorithm to sort the sequence of first n natural numbers in the form of irreducible fractions.
A sample input/output says that for n=6, the answer is 1826/189.
I don't quite understand how the answer was arrived at.

This looks similar to 2011 Google Code Jam, Preliminary Round, Problem 4, however the answer is n, I don't know how you get 1826/189.

Related

Question regarding mergesort's merge algorithm

Let's suppose we have two sorted arrays, A and B, consisting of n elements. I dont understand why the time needed to merge these 2 is "n+n". In order to merge them we need to compare 2n-1 elements. For example, in the two following arrays
A = [3, 5, 7, 9] and B = [2, 4, 6, 8]
We will start merging them into a single one, by comparing the elements in the known way. However when we finally compare 8 with 9. Now, this will be our 2n-1=8-1=7th comparison and 8 will be inserted into the new array.
After this the 9 will be inserted without another comparison. So I guess my question is, since there are 2n-1 comparisons, why do we say that this merging takes 2n time? Im not saying O(n), im saying T(n)=2n, an exact time function.
Its probably a detail that im missing here so I would be very grateful if someone could provide some insight. Thanks in advance.

Speeding up LCS algorithm for graph construction

Referencing 2nd question from INOI 2011:
N people live in Sequence Land. Instead of a name, each person is identified by a sequence of integers, called his or her id. Each id is a sequence with no duplicate elements. Two people are said to be each other’s relatives if their ids have at least K elements in common. The extended family of a resident of Sequence Land includes herself or himself, all relatives, relatives of relatives, relatives of relatives of relatives, and so on without any limit.
Given the ids of all residents of Sequence Land, including its President, and the number K, find the number of people in the extended family of the President of Sequence Land.
For example, suppose N = 4 and K = 2. Suppose the President has id (4, 6, 7, 8) and the other three residents have ids (8, 3, 0, 4), (0, 10), and (1, 2, 3, 0, 5, 8). Here, the President is directly related to (8, 3, 0, 4), who in turn is directly related to (1, 2, 3, 0, 5, 8). Thus, the President’s extended family consists of everyone other than (0, 10) and so has size 3.
Limits: 1 <= n <= 300 & 1 <= K <= 300. Number of elements per id: 1-300
Currently, my solution is as follows:
For every person, compare his id to all other id's using an algorithm same as LCS, it can be edited to stop searching if k elements aren't there etc. etc. to improve it's average case performance. Time complexity = O(n^2*k^2)
Construct adjacency list using previous step result.
Use BFS. Output results
But the overall time complexity of this algorithm is not good enough for the second subtask. I googled around a little bit, and found most solutions to be similar to that of mine, and not working for larger subtask. The only thing close to a good solution was this one -> Yes, this question has been asked previously. The reason I'm asking essentially the same question again is that that solution is really tough to work with and implement. Recently, a friend of mine told me about a much better solution he read somewhere.
Can someone help me create a better solution ?
Even pointers to better solution would be great.

decoding via number combinations algorithm in python 3

ok so here is the problem.
let's say:
1 means Bob
2 means Jerry
3 means Tom
4 means Henry
any summation combination of two of aforementioned numbers is a status/ mood type which is how the program will be encoded:
7 (4+3) means Angry
5 (3+2) menas Sad
3 (2+1) means Mad
4 (3+1) means Happy
and so on...
how may i create a decode function such that it accepts one of the added (encoded) values, such as 7, 5, 3, 4, etc and figures out the combination and return the names of the people representing the two numbers that constitue the combination. take note that one number cannot be repeated to get mood result, meaning 4 has to be 3+1 and may not be 2+2. so we can assume for this example, that there is only one possible combination for each status/ mood code. now the problem is, how do you implement such code in python 3? what would be the algorithm or logic for such a problem. how do you seek or check for combination of two numbers? i'm thinking i should just run a loop that keeps on adding two numbers at a time until the result matches with the status/ mood code. will that work? BUT THIS METHOD WILL SOON BECOME OBSOLETE IF THE NUMBER OF COMBINATIONS IS INCREASED (as in adding 4 numbers together instead of 2). doing it this way will take up a lot of time and will possibly be inefficient.
i apologize, i know this questions is extremely confusing but please bear with me.
let's try and work something out.
Use Binary
If you want to have sums that are unique, then assign each possible "Person" a number that's a power of 2. The sum of any combination of these numbers will uniquely identify which numbers were used in the sum.
1, 2, 4, 8, 16, ...
Rather than offer a detailed proof of correctness, I offer an intuitive argument about this: any number can be represented in base 2, and it is always a sum of exactly one combination of powers of 2.
This solution may not be optimal. It has realistic limitations (32 or 64 different "person" identifiers, unless you use some sort of BigInt), but depending on your needs, it might work. Having the smallest possible values, binary is better than any other radix though.
Example
(Edited)
Here's a quick snippet that demonstrates how you could decode the sum. The returned values are the exponents of the powers of 2. count_persons could be arbitrarily large, as could the range of n iterated over (just as a quick example).
#!/usr/bin/python3
count_persons = 64
for n in range(20,30):
matches = list(filter(lambda i: (n>>i) & 0x1, range(1,count_persons)))
print('{0}: {1}'.format(n,matches))
Output:
20: [2, 4]
21: [2, 4]
22: [1, 2, 4]
23: [1, 2, 4]
24: [3, 4]
25: [3, 4]
26: [1, 3, 4]
27: [1, 3, 4]
28: [2, 3, 4]
29: [2, 3, 4]
See a more appropriate answer here
In my opinion, the selected answer is so suboptimal that it can be considered plain wrong.
The table you are building can be indexed with N(N-1)/2 values, while the binary approach uses 2N.
With a 64 bits unsigned integer, you could encode about sqrt(265) values, that is 6 billion names, compared with the 64 names the binary approach will allow.
Using a big number library could push the limit somewhat, but the computations involved would be hugely more costly than the simple o(N) reverse indexing algorithm needed by the alternative approach.
My conclusion is: the binary approach is grossly inefficient, unless you want to play with a handful of values, in which case hard-coding or precomputing the indexes would be just as good a solution.
Since the question is very unlikely to match a search on the subject, it is not that important anyway.

Sorting Algorithm : output

I faced this problem on a website and I quite can't understand the output, please help me understand it :-
Bogosort, is a dumb algorithm which shuffles the sequence randomly until it is sorted. But here we have tweaked it a little, so that if after the last shuffle several first elements end up in the right places we will fix them and don't shuffle those elements furthermore. We will do the same for the last elements if they are in the right places. For example, if the initial sequence is (3, 5, 1, 6, 4, 2) and after one shuffle we get (1, 2, 5, 4, 3, 6) we will keep 1, 2 and 6 and proceed with sorting (5, 4, 3) using the same algorithm. Calculate the expected amount of shuffles for the improved algorithm to sort the sequence of the first n natural numbers given that no elements are in the right places initially.
Input:
2
6
10
Output:
2
1826/189
877318/35343
For each test case output the expected amount of shuffles needed for the improved algorithm to sort the sequence of first n natural numbers in the form of irreducible fractions. I just can't understand the output.
I assume you found the problem on CodeChef. There is an explanation of the answer to the Bogosort problem here.
Ok I think I found the answer, there is a similar problem here https://math.stackexchange.com/questions/20658/expected-number-of-shuffles-to-sort-the-cards/21273 , and this problem can be thought of as its extension

Exhaustive random number generator

In one of my project I encountered a need to generate a set of numbers in a given range that will be:
Exhaustive, which means that it will cover the most of the given
range without any repetition.
It will guarantee determinism (every time the sequence will be the
same). This can be probably achieved with a fixed seed.
It will be random (I am not very versed into Random Number Theory, but I guess there is a bunch of rules that describes randomness. From perspective something like 0,1,2..N is not random).
Ranges I am talking about can be ranges of integers, or of real numbers.
For example, if I used standard C# random generator to generate 10 numbers in range [0, 9] I will get this:
0 0 1 2 0 1 5 6 2 6
As you can see, a big part of given range still remains 'unexplored' and there are many repetitions.
Of course, input space can be very large, so remembering previously chosen values is not an option.
What would be the right way to tackle this problem?
Thanks.
After the comments:
Ok i agree that the random is not the right word, but I hope that you understood what I am trying to achieve. I want to explore given range that can be big so in memory list is not an option. If a range is (0, 10) and i want three numbers i want to guarantee that those numbers will be different and that they will 'describe the range' (i.e. They wont all be in a lower half etc).
Determinism part means that i would like to use something like standard rng with a fixed seed, so I can fully control the sequence.
I hope i made things a bit clearer.
Thanks.
Here's three options with different tradeoffs:
Generate a list of numbers ahead of time, and shuffle them using the fisher-yates shuffle. Select from the list as needed. O(n) total memory, and O(1) time per element. Randomness is as good as the PRNG you used to do the shuffle. The simplest of the three alternatives, too.
Use a Linear Feedback Shift Register, which will generate every value in its sequence exactly once before repeating. O(log n) total memory, and O(1) time per element. It's easy to determine future values based on the present value, however, and LFSRs are most easily constructed for power of 2 periods (but you can pick the next biggest power of 2, and skip any out of range values).
Use a secure permutation based on a block cipher. Usable for any power of 2 period, and with a little extra trickery, any arbitrary period. O(log n) total space and O(1) time per element, randomness is as good as the block cipher. The most complex of the three to implement.
If you just need something, what about something like this?
maxint = 16
step = 7
sequence = 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9, 0
If you pick step right, it will generate the entire interval before repeating. You can play around with different values of step to get something that "looks" good. The "seed" here is where you start in the sequence.
Is this random? Of course not. Will it look random according to a statistical test of randomness? It might depend on the step, but likely this will not look very statistically random at all. However, it certainly picks the numbers in the range, not in their original order, and without any memory of the numbers picked so far.
In fact, you could make this look even better by making a list of factors - like [1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16] - and using shuffled versions of those to compute step * factor (mod maxint). Let's say we shuffled the example factors lists like [3, 2, 4, 5, 1], [6, 8, 9, 10, 7], [13, 16, 12, 11, 14, 15]. then we'd get the sequence
5, 14, 12, 3, 7, 10, 8, 15, 6, 1, 11, 0, 4, 13, 2, 9
The size of the factors list is completely tunable, so you can store as much memory as you like. Bigger factor lists, more randomness. No repeats regardless of factor list size. When you exhaust a factor list, generating a new one is as easy as counting and shuffling.
It is my impression that what you are looking for is a randomly-ordered list of numbers, not a random list of numbers. You should be able to get this with the following pseudocode. Better math-ies may be able to tell me if this is in fact not random:
list = [ 1 .. 100 ]
for item,index in list:
location = random_integer_below(list.length - index)
list.switch(index,location+index)
Basically, go through the list and pick a random item from the rest of the list to use in the position you are at. This should randomly arrange the items in your list. If you need to reproduce the same random order each time, consider saving the array, or ensuring somehow that random_integer_below always returns numbers in the same order given some seed.
Generate an array that contains the range, in order. So the array contains [0, 1, 2, 3, 4, 5, ... N]. Then use a Fisher-Yates Shuffle to scramble the array. You can then iterate over the array to get your random numbers.
If you need repeatability, seed your random number generator with the same value at the start of the shuffle.
Do not use a random number generator to select numbers in a range. What will eventually happen is that you have one number left to fill, and your random number generator will cycle repeatedly until it selects that number. Depending on the random number generator, there is no guarantee that will ever happen.
What you should do is generate a list of numbers on the desired range, then use a random number generator to shuffle the list. The shuffle is known as the Fisher-Yates shuffle, or sometimes called the Knuth shuffle. Here's pseudocode to shuffle an array x of n elements with indices from 0 to n-1:
for i from n-1 to 1
j = random integer such that 0 ≤ j ≤ i
swap x[i] and x[j]

Resources