Naive shuffling algorithm probability analysis [duplicate] - algorithm

I implemented the shuffling algorithm as:
import random
a = list(range(1, n + 1))  # a contains the elements 1 to n
for i in range(n):
    j = random.randint(0, n - 1)
    a[i], a[j] = a[j], a[i]
This algorithm is biased. I just wanted to know: for any n (n ≤ 17), is it possible to find which permutation has the highest probability of occurring and which permutation has the lowest probability out of all n! possible permutations? If yes, then what is that permutation?
For example n=3:
a = [1,2,3]
There are 3^3 = 27 possible shuffles.
Number of occurrences of each permutation:
1 2 3 = 4
3 1 2 = 4
3 2 1 = 4
1 3 2 = 5
2 1 3 = 5
2 3 1 = 5
P.S. I am not so good with maths.

This is not a proof by any means, but you can quickly come up with the distribution of placement probabilities by running the biased algorithm a million times. It will look like the placement-probability matrix pictured on Wikipedia (shown there for n = 7): an unbiased distribution would have 14.3% in every field.
To get the most likely distribution, I think it's safe to just pick the highest percentage for each index. This means it's most likely that the entire array is moved down by one and the first element will become the last.
Edit: I ran some simulations and this result is most likely wrong. I'll leave this answer up until I can come up with something better.
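For small n you can also get the exact distribution instead of simulating, by enumerating all n^n sequences of random indices j. This is my own sketch, not part of the answer above; it reproduces the 27-outcome table from the question, but the n^n enumeration is only feasible for small n, so for n up to the 17 in the question the million-run simulation is the practical route:

from collections import Counter
from itertools import product

def exact_counts(n):
    """Enumerate all n**n choices of j and count which permutation each one produces."""
    counts = Counter()
    for js in product(range(n), repeat=n):
        a = list(range(1, n + 1))
        for i, j in enumerate(js):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts

for perm, count in sorted(exact_counts(3).items()):
    print(perm, count)  # (1,2,3): 4, (1,3,2): 5, (2,1,3): 5, (2,3,1): 5, (3,1,2): 4, (3,2,1): 4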

Related

Efficiently grab some subsets that meet criteria [duplicate]

This question already has an answer here:
Count the total number of subsets that don't have consecutive elements
(1 answer)
Closed 4 years ago.
Given a set of consecutive numbers from 1 to n, I'm trying to find the number of subsets that do not contain consecutive numbers.
E.g., for the set [1, 2, 3], some possible subsets are [1, 2] and [1, 3]. The former would not be counted while the latter would be, since 1 and 3 are not consecutive numbers.
Here is what I have:
def f(n)
  consecutives = Array(1..n)
  stop = (n / 2.0).round
  (1..stop).flat_map { |x|
    consecutives.combination(x).select { |combo|
      consecutive = false
      combo.each_cons(2) do |l, r|
        consecutive = l.next == r
        break if consecutive
      end
      combo.length == 1 || !consecutive
    }
  }.size
end
It works, but I need it to work faster, under 12 seconds for n <= 75. How do I optimize this method so I can handle high n values no sweat?
I looked at:
Check if array is an ordered subset
How do I return a group of sequential numbers that might exist in an array?
Check if an array is subset of another array in Ruby
and some others. I can't seem to find an answer.
Suggested duplicate is Count the total number of subsets that don't have consecutive elements, although that question is slightly different, as I was asking for this optimization in Ruby and I do not want the empty subset in my answer. That question would have been very helpful had I found it initially, though! But SergGr's answer is exactly what I was looking for.
Although #user3150716's idea is correct, the details are wrong. In particular, you can see that for n = 3 there are 4 subsets: [1], [2], [3], [1,3], while his formula gives only 3. That is because he missed the subset [3] (i.e. the subset consisting of just i), and that error accumulates for larger n. Also, I think it is easier to reason about if you start from 1 rather than n. So the correct formulas would be
f(1) = 1
f(2) = 2
f(n) = f(n-1) + f(n-2) + 1
These formulas are easy to code using a simple loop, in constant space and O(n) time:
def f(n)
  return 1 if n == 1
  return 2 if n == 2
  # calculate
  #   f(n) = f(n-1) + f(n-2) + 1
  # using a simple loop
  v2 = 1
  v1 = 2
  i = 3
  while i <= n do
    i += 1
    v1, v2 = v1 + v2 + 1, v1
  end
  v1
end
You can see this online together with the original code here
This should be pretty fast for any n <= 75. For much larger n you might need some additional tricks, like noticing that f(n) is actually one less than a Fibonacci number,
f(n) = Fib(n+2) - 1
and there is a closed-form formula for Fibonacci numbers that can, in theory, be computed faster for big n.
Let the number of subsets with no consecutive numbers from {i...n} be f(i). Then f(i) is the sum of:
1) f(i+1), the number of such subsets without i in them.
2) f(i+2) + 1, the number of such subsets with i in them (hence leaving i+1 out of the subset).
So,
f(i)=f(i+1)+f(i+2)+1
f(n)=1
f(n-1)=2
f(1) will be your answer.
You can solve it using matrix exponentiation (http://zobayer.blogspot.in/2010/11/matrix-exponentiation.html) in O(log n) time.
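To make the O(log n) route concrete, here is a small sketch of that idea (my own, in Python rather than the Ruby above, and not code from either answer): it combines the identity f(n) = Fib(n+2) - 1 with repeated squaring of the 2x2 matrix [[1, 1], [1, 0]].

def mat_mul(a, b):
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def mat_pow(m, e):
    # repeated squaring: O(log e) matrix multiplications
    result = [[1, 0], [0, 1]]
    while e:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result

def fib(n):
    # [[1,1],[1,0]]^n = [[Fib(n+1), Fib(n)], [Fib(n), Fib(n-1)]]
    return mat_pow([[1, 1], [1, 0]], n)[0][1]

def f(n):
    return fib(n + 2) - 1

print([f(n) for n in range(1, 6)])  # [1, 2, 4, 7, 12]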

algorithm to find consecutive wins (adjusted binomial distribution)

Imagine I have a roulette wheel and I want to feed my algorithm three parameters
p := probability to win one game
m := number of times the wheel is spun
k := number of consecutive wins I am interested in
and I am interested in the probability P that after spinning the wheel m times I win at least k consecutive games.
Let's go through an example where m = 5 and k = 3 and let's say 1 is a win and 0 a loss.
1 1 1 1 1
0 1 1 1 1
1 1 1 1 0
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
So, as I intend it, these would be all the outcomes that win at least 3 consecutive games. For every k, I have (m - k + 1) possible winning outcomes.
First question: is this true? Or would 1 1 1 0 1 and 1 0 1 1 1 also be possible solutions?
Next, what would a handy computation for this problem look like? First, I thought about the binomial distribution, where I just iterate over all k:
C(n, k) * p^k * (1 - p)^(n - k)
But this doesn't guarantee that the wins are consecutive. Is the binomial distribution somehow adjustable to produce the probability P I am looking for?
The following is an option you might want to consider:
You generate an array indexed 0..m, entry i representing the probability that a run of k consecutive 1s has already appeared within the first i spins.
All slots before k have probability 0: there is no chance of k consecutive wins yet.
Slot k has the probability p^k.
All positions afterwards are computed with a dynamic programming approach: each position i from k+1 onward is the value at position i-1 plus p^k * (1-p) * (1 - the value at position i-1-k).
This way you iterate through the array, and the last position holds the probability of at least k consecutive 1s.
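A minimal Python sketch of that recurrence (my own reading of the answer, so treat the indexing details as an assumption rather than the answerer's code):

def prob_k_consecutive_wins(p, m, k):
    # prob[i] = probability that a run of at least k consecutive wins
    # has appeared within the first i spins
    prob = [0.0] * (m + 1)          # prob[0..k-1] stay 0: not enough spins yet
    if k <= m:
        prob[k] = p ** k            # the only way: win the first k spins
    for i in range(k + 1, m + 1):
        # either the run already happened in the first i-1 spins, or it had not
        # happened by spin i-1-k, spin i-k is a loss, and spins i-k+1 .. i are all wins
        prob[i] = prob[i - 1] + (1 - prob[i - 1 - k]) * (1 - p) * p ** k
    return prob[m]

print(prob_k_consecutive_wins(0.5, 5, 3))  # 0.25 for a fair wheel with m = 5, k = 3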
Or would also 1 1 1 0 1 and 1 0 1 1 1 be possible solutions?
Yes, they would, according to "win at least k consecutive games".
First, I thought about the binomial distribution to solve this problem, where I just iterate over all k: C(n, k) * p^k * (1 - p)^(n - k)
That might work if you combine it with an acceptance-rejection method: generate an outcome, check for at least k consecutive wins, accept if yes, drop it otherwise.
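As a concrete version of that idea, here is a small simulation sketch (mine, not the answerer's); note that it simulates the spin sequence directly rather than drawing a count from the binomial:

import random

def estimate_prob(p, m, k, trials=100_000):
    # estimate P(at least k consecutive wins in m spins) by direct simulation
    hits = 0
    for _ in range(trials):
        run = longest = 0
        for _ in range(m):
            run = run + 1 if random.random() < p else 0
            longest = max(longest, run)
        if longest >= k:
            hits += 1
    return hits / trials

print(estimate_prob(0.5, 5, 3))  # roughly 0.25, matching the exact DP above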
Is the binomial distribution somehow adjustable to produce the output P I am looking for?
Frankly, I would take a look at the geometric distribution, which describes the number of successful wins before a loss (or vice versa).

Analyze the run time of a nested for loops algorithm

Say I have the following code:
def func(A,n):
    for i = 0 to n-1:
        for k = i+1 to n-1:
            for l = k+1 to n-1:
                if A[i]+A[k]+A[l] = 0:
                    return True
A is an array, and n denotes the length of A.
As I read it, the code checks if any 3 consecutive integers in A sum up to 0. I see the time complexity as
T(n) = (n-2)(n-1)(n-2)+O(1) => O(n^3)
Is this correct, or am I missing something? I have a hard time finding reading material about this (and I own CLRS)
You have the functionality wrong: it checks to see whether any three elements add up to 0. To improve execution time, it considers them only in index order: i < k < l.
You are correct about the complexity. Although each loop takes a short-cut, that short-cut is merely a scalar divisor on the number of iterations. Each loop is still O(n).
As for the coding, you already have most of it done -- and Stack Overflow is not a coding service. Give it your best shot; if that doesn't work and you're stuck, post another question.
If you really want to teach yourself a new technique, look up Python's itertools package. You can use this to generate all the combinations in triples. You can then merely check sum(triple) in each case. In fact, you can use the any method to check whether any one triple sums to 0, which could reduce your function body to a single line of Python code.
I'll leave that research to you. You'll learn other neat stuff on the way.
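For reference, a sketch of what that one-liner could look like (my own spelling-out of the suggestion, not code from the answer):

from itertools import combinations

def func(A, n):
    # any() stops at the first triple of elements that sums to 0
    return any(sum(triple) == 0 for triple in combinations(A[:n], 3))

print(func([3, -1, 5, -2], 4))  # True, because 3 + (-1) + (-2) == 0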
Addition for OP's comment.
Let's set N to 4, and look at what happens:
i = 0
for k = 1 to 3
... three k loops
i = 1
for k = 2 to 3
... two k loops
i = 2
for k = 3 to 3
... one k loop
The number of k-loop executions is the "triangle" number of n-1: 3 + 2 + 1. Let m = n-1; the formula is T(m) = m(m+1)/2.
Now, propagate the same logic to the l loops: for each i you run a triangle number of l iterations, and summing those triangle numbers gives the third-order "pyramid" (tetrahedral) count.
In terms of n, this is n(n-1)(n-2)/6 = C(n, 3) executions of the l-loop body (10 for the n = 5 trace below). When you multiply this out, you get a straightforward cubic formula in n.
Here is the sequence for n=5:
0 1 2
0 1 3
0 1 4
change k
0 2 3
0 2 4
change k
0 3 4
change k
change k
change l
1 2 3
1 2 4
change k
1 3 4
change k
change k
change l
2 3 4
BTW, l is a bad variable name, easily confused with 1.
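A quick way to sanity-check the counting argument (my own snippet, not part of the answer) is to count the innermost iterations directly and compare them with n(n-1)(n-2)/6:

def count_inner_iterations(n):
    # count how many times the innermost (l) loop body would run
    count = 0
    for i in range(n):
        for k in range(i + 1, n):
            for l in range(k + 1, n):
                count += 1
    return count

for n in range(3, 8):
    print(n, count_inner_iterations(n), n * (n - 1) * (n - 2) // 6)  # the last two columns agree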

Generate a list of permutations paired with their number of inversions

I'm looking for an algorithm that generates all permutations of a set. To make it easier, the set is always [0, 1..n]. There are many ways to do this and it's not particularly hard.
What I also need is the number of inversions of each permutation.
What is the fastest (in terms of time complexity) algorithm that does this?
I was hoping that there's a way to generate those permutations that produces the number of inversions as a side-effect without adding to the complexity.
The algorithm should generate lists, not arrays, but I'll accept array based ones if it makes a big enough difference in terms of speed.
Plus points (...there are no points...) if it's functional and is implemented in a pure language.
There is the Steinhaus–Johnson–Trotter algorithm, which allows you to keep the inversion count easily during permutation generation. Excerpt from Wiki:
Thus, from the single permutation on one element,
1
one may place the number 2 in each possible position in descending
order to form a list of two permutations on two elements,
1 2
2 1
Then, one may place the number 3 in each of three different positions
for these three permutations, in descending order for the first
permutation 1 2, and then in ascending order for the permutation 2 1:
1 2 3
1 3 2
3 1 2
3 2 1
2 3 1
2 1 3
At every step of the recursion we insert the biggest number into the list of smaller numbers. It is clear that this insertion adds M new inversions, where M is the insertion position (counting from the right). For example, if we have the list 3 1 2 (2 inversions) and insert 4:
3 1 2 4 //position 0, 2 + 0 = 2 inversions
3 1 4 2 //position 1, 2 + 1 = 3 inversions
3 4 1 2 //position 2, 2 + 2 = 4 inversions
4 3 1 2 //position 3, 2 + 3 = 5 inversions
pseudocode:
function Generate(List, Count)
    N = List.Length
    if N = N_Max then
        Output(List, 'InvCount = ': Count)
    else
        for Position = 0 to N do
            // insert the next value N+1 at N - Position, i.e. Position places
            // from the right, which adds exactly Position new inversions
            Generate(List.Insert(N + 1, N - Position), Count + Position)
P.S. A recursive method is not mandatory here, but I suspect it is natural for the functional crowd.
P.P.S. If you are worried about inserting into lists, consider the Even's speedup section, which uses only exchanges of neighbouring elements; every exchange increments or decrements the inversion count by 1.
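To make the pseudocode concrete, here is a small Python sketch of the same insertion idea (mine, not the answerer's; it uses plain ascending insertion order rather than the alternating directions of true Steinhaus–Johnson–Trotter, but the inversion bookkeeping is identical):

def permutations_with_inversions(n_max):
    """Yield (permutation, inversion count) pairs for all permutations of 1..n_max."""
    def rec(perm, inversions):
        n = len(perm)
        if n == n_max:
            yield perm, inversions
            return
        # insert the next number n+1 at every position; inserting it
        # pos places from the right adds exactly pos new inversions
        for pos in range(n + 1):
            idx = n - pos
            yield from rec(perm[:idx] + [n + 1] + perm[idx:], inversions + pos)
    yield from rec([1], 0)

for perm, inv in permutations_with_inversions(3):
    print(perm, inv)  # e.g. [1, 2, 3] 0 ... [3, 2, 1] 3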
Here is an algorithm that does the task, is amortized O(1) per permutation, and generates an array of tuples of linked lists that share as much memory as they reasonably can.
I'll implement all except the linked list bit in untested Python. Though Python would be a bad language for a real implementation.
def permutations(sorted_list):
    # ListNode / EmptyListNode are the linked-list bit the answer leaves unimplemented
    answer = []

    def add_permutations(reversed_sublist, tail_node, inversions):
        # reversed_sublist: the values not yet placed, in descending order
        # tail_node: the (shared) linked list of values already placed to the right
        if 0 == len(reversed_sublist):
            answer.append((tail_node, inversions))
        else:
            for idx, val in enumerate(reversed_sublist):
                # the idx larger values still unplaced will each form
                # one inversion with val once they land to its left
                add_permutations(
                    [x for x in reversed_sublist if x != val],
                    ListNode(val, tail_node),
                    inversions + idx
                )

    add_permutations(list(reversed(sorted_list)), EmptyListNode(), 0)
    return answer
You might wonder at my claim of amortized O(1) work, given all of this copying. That's because when m elements are left we do O(m) work, and that work is shared by the m! permutations below it. So the amortized cost of the higher-level nodes is a converging cost per bottom-level call, of which we need one per permutation.

Algorithm for "pick the number up game"

I'm struggling to find a solution and have no idea where to start.
RobotA and RobotB start with a permutation of the N numbers. RobotA picks first, and they pick alternately. Each turn, a robot can pick any one remaining number from the permutation. When the remaining numbers form an increasing sequence, the game finishes. The robot who made the last pick (after which the sequence became increasing) wins the game.
Assuming both play optimally, who wins?
Example 1:
The original sequence is 1 7 3.
RobotA wins by picking 7, after which the sequence is increasing 1 3.
Example 2:
The original sequence is 8 5 3 1 2.
RobotB wins by selecting the 2, preventing any increasing sequence.
Is there any known algorithm to solve this? Please give me any tips or ideas of where to look; I would be really thankful!
Given a sequence w of distinct numbers, let N(w) be the length of w and let L(w) be the length of the longest increasing subsequence in w. For example, if
w = 3 5 8 1 4
then N(w) = 5 and L(w) = 3.
The game ends when L(w) = N(w), or, equivalently, N(w) - L(w) = 0.
Working the game backwards, if on RobotX's turn N(w) - L(w) = 1, then the optimal play is to remove the unique letter not in a longest increasing subsequence, thereby winning the game.
For example, if w = 1 7 3, then N(w) = 3 and L(w) = 2 with a longest increasing subsequence being 1 3. Removing the 7 results in an increasing sequence, ensuring that the player who removed the 7 wins.
Going back to the previous example, w = 3 5 8 1 4, if either 1 or 4 is removed, then for the resulting permutation u we have N(u) - L(u) = 1, so the player who removed the 1 or 4 will certainly lose to a competent opponent. However, any other play results in a victory since it forces the next player to move to a losing position. Here, the optimal play is to remove any of 3, 5, or 8, after which N(u) - L(u) = 2, but after the next move N(v) - L(v) = 1.
Further analysis along these lines should lead to an optimal strategy for either player.
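For experimenting with this analysis, here is a small helper (mine, not part of the answer) that computes N(w) - L(w) using the standard patience-sorting longest-increasing-subsequence routine:

from bisect import bisect_left

def n_minus_l(w):
    # tails[i] = smallest possible tail of an increasing subsequence of length i+1
    tails = []
    for x in w:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(w) - len(tails)  # N(w) - L(w)

print(n_minus_l([3, 5, 8, 1, 4]))  # 2
print(n_minus_l([1, 7, 3]))        # 1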
The nearest mathematical game that I do know of is the Monotone Sequence Game. In a monotone sequence game, two players alternately choose elements of a sequence from some fixed ordered set (e.g. 1, 2, ..., N). The game ends when the resulting sequence contains either an ascending subsequence of length A or a descending one of length D. This game has its origins in a theorem of Erdős and Szekeres; a nice exposition can be found in MONOTONIC SEQUENCE GAMES, and this slide presentation by Bruce Sagan is also a good reference.
If you want to know more about game theory in general, or these sorts of games in particular, then I strongly recommend Winning Ways for Your Mathematical Plays by Berlekamp, Conway and Guy. Volume 3, I believe, addresses these sorts of games.
Looks like a minimax problem.
I guess there is a faster solution for this task; I will think about it.
But I can give you an idea of a solution with O(N! * N^2) complexity.
First, note that picking a number from an N-permutation is equivalent to the following:
Pick a number from the N-permutation; say it was number X.
Reassign the remaining numbers using the rule:
1 -> 1
2 -> 2
...
X-1 -> X-1
X -> removed
X+1 -> X
...
N -> N-1
And you get a permutation of N-1 numbers.
Example:
1 5 6 4 2 3
Pick 2
1 5 6 4 3
Reassign
1 4 5 3 2
Let's use this as the move, instead of just picking. It's easy to see that the games are equivalent: player A wins in this game for some permutation if and only if he wins in the original.
Let's assign a code to all permutations of N numbers, N-1 numbers, ..., 2 numbers.
Define F(x) -> {0, 1} (where x is a permutation code) as the function which is 1 when the current player wins and 0 when the current player loses. It is easy to see that F(1 2 .. K-1 K) = 0.
F(x) = 1 if there is at least one move which transforms x into some y with F(y) = 0.
F(x) = 0 if every move which transforms x into y gives F(y) = 1.
So you can use recursion with memoization to compute it:
Boolean F(X)
{
    Let K be the length of the permutation with code X.
    if F has already been computed for argument X, return the previously calculated result;
    if X == 1 2 .. K, return 0;
    Boolean result = 0;
    for i = 1 to K do
    {
        Y = code of the permutation obtained from X by picking the number at position i;
        if (F(Y) == 0)
        {
            result = 1;
            break;
        }
    }
    Store result as F(X);
    return result;
}
For each argument we compute this function only once. There are 1! permutations of length 1, 2! permutations of length 2, ..., N! permutations of length N. For a permutation of length K we try up to K moves, each costing O(K), so the total complexity will be O(N! * N^2).
Here is Python code for Wisdom's Wind's algorithm. It prints out the positions that are wins for RobotA (the player to move).
import itertools

def moves(p):
    # yield every position reachable in one pick, relabelled to 0..len-2;
    # a sorted position is terminal and has no moves
    if tuple(sorted(p)) == p:
        return
    for i in p:
        yield tuple(j - (j > i) for j in p if j != i)

winning = set()
for n in range(6):
    for p in itertools.permutations(range(n)):
        # p is a win for the player to move iff some move leads to a non-winning position
        if not winning.issuperset(moves(p)):
            winning.add(p)

for p in sorted(winning, key=lambda q: (len(q), q)):
    print(p)
