Algorithm to partition a number - performance

Given a positive integer X, how can one partition it into N parts, each between A and B where A <= B are also positive integers? That is, write
X = X_1 + X_2 + ... + X_N
where A <= X_i <= B and the order of the X_is doesn't matter?

If you want to know the number of ways to do this, then you can use generating functions.
Essentially, you are interested in integer partitions. An integer partition of X is a way to write X as a sum of positive integers. Let p(n) be the number of integer partitions of n. For example, if n=5 then p(n)=7 corresponding to the partitions:
5
4,1
3,2
3,1,1
2,2,1
2,1,1,1
1,1,1,1,1
The generating function for p(n) is
sum_{n >= 0} p(n) z^n = Prod_{i >= 1} ( 1 / (1 - z^i) )
What does this do for you? By expanding the right hand side and taking the coefficient of z^n you can recover p(n). Don't worry that the product is infinite since you'll only ever be taking finitely many terms to compute p(n). In fact, if that's all you want, then just truncate the product and stop at i=n.
Why does this work? Remember that
1 / (1 - z^i) = 1 + z^i + z^{2i} + z^{3i} + ...
So the coefficient of z^n is the number of ways to write
n = 1*a_1 + 2*a_2 + 3*a_3 +...
where now I'm thinking of a_i as the number of times i appears in the partition of n.
How does this generalize? Easily, as it turns out. From the description above, if you only want the parts of the partition to be in a given set A, then instead of taking the product over all i >= 1, take the product over only i in A. Let p_A(n) be the number of integer partitions of n whose parts come from the set A. Then
sum_{n >= 0} p_A(n) z^n = Prod_{i in A} ( 1 / (1 - z^i) )
Again, taking the coefficient of z^n in this expansion solves your problem. But we can go further and track the number of parts of the partition. To do this, add in another place holder q to keep track of how many parts we're using. Let p_A(n,k) be the number of integer partitions of n into k parts where the parts come from the set A. Then
sum_{n >= 0} sum_{k >= 0} p_A(n,k) q^k z^n = Prod_{i in A} ( 1 / (1 - q*z^i) )
so taking the coefficient of q^k z^n gives the number of integer partitions of n into k parts where the parts come from the set A.
How can you code this? The generating function approach actually gives you an algorithm for generating all of the solutions to the problem as well as a way to uniformly sample from the set of solutions. Once n and k are chosen, the product on the right is finite.
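For instance, here is a minimal Python sketch of that coefficient extraction (the function name and the example are mine). It expands the truncated product with a table dp[j][s] holding the coefficient of q^j z^s, which answers the original question when the parts are drawn from {A, ..., B}:

def count_partitions(n, k, parts):
    # coefficient of q^k z^n in Prod_{i in parts} 1/(1 - q*z^i):
    # the number of partitions of n into exactly k parts drawn from parts
    dp = [[0] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 1
    for i in parts:
        # multiply by 1/(1 - q*z^i) = 1 + q*z^i + q^2*z^(2i) + ...
        for j in range(1, k + 1):
            for s in range(i, n + 1):
                dp[j][s] += dp[j - 1][s - i]
    return dp[k][n]

# partitions of X = 10 into N = 3 parts, each between A = 2 and B = 5:
# 2+3+5, 2+4+4, 3+3+4, so the answer is 3
print(count_partitions(10, 3, range(2, 6)))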

Here is a Python solution to this problem. It is quite unoptimised, but I have tried to keep it as simple as I can to demonstrate an iterative method of solving this problem.
The results of this method will commonly be a list of max values and min values with maybe 1 or 2 other values in between. Because of this, there is a slight optimisation in there (using abs) which prevents the iterator from constantly trying to find min values counting down from max and vice versa.
There are recursive ways of doing this that look far more elegant, but this will get the job done and hopefully give you an insight into a better solution.
SCRIPT:
# iterative approach, in case the number of partitions is particularly large
def splitter(value, partitions, min_range, max_range, part_values):
    while value > 0:
        partitions -= 1
        # bounds on what the remaining partitions can still sum to
        lower_bound = min_range * partitions
        upper_bound = max_range * partitions
        # if the value sits closer to the lower bound, take large parts first;
        # otherwise take small parts first (saves scanning the whole range)
        if abs(lower_bound - value) < abs(upper_bound - value):
            start, stop, step = max_range, min_range - 1, -1
        else:
            start, stop, step = min_range, max_range + 1, 1
        for i in range(start, stop, step):
            # make sure taking i won't make the remainder unreachable
            if lower_bound <= value - i <= upper_bound:
                part_values.append(i)
                value -= i
                break
    return part_values

def partitioner(value, partitions, min_range, max_range):
    if min_range * partitions <= value <= max_range * partitions:
        return splitter(value, partitions, min_range, max_range, [])
    else:
        print("this is impossible to solve")

def main():
    print(partitioner(9800, 1000, 2, 100))

main()
The basic idea behind this script is that the value needs to fall between min*parts and max*parts at each step of the solution. If we always maintain this invariant, we will eventually end up at min <= value <= max for parts == 1; so if we keep taking away from the value while keeping it within this range, we will always find the result if one is possible.
For this code's example, it will basically always take away either max or min depending on which bound the value is closer to, until some value between min and max is left over as the remainder.

A simple realization you can make is that the average of the X_i must be between A and B, so we can simply divide X by N and then do some small adjustments to distribute the remainder evenly to get a valid partition.
Here's one way to do it:
X_i = ceil (X / N) if i <= X mod N,
floor (X / N) otherwise.
This gives a valid solution if A <= floor (X / N) and ceil (X / N) <= B. Otherwise, there is no solution. See proofs below.
sum(X_i) == X
Proof:
Use the division algorithm to write X = q*N + r with 0 <= r < N.
If r == 0, then ceil (X / N) == floor (X / N) == q so the algorithm sets all X_i = q. Their sum is q*N == X.
If r > 0, then floor (X / N) == q and ceil (X / N) == q+1. The algorithm sets X_i = q+1 for 1 <= i <= r (i.e. r copies), and X_i = q for the remaining N - r pieces. The sum is therefore (q+1)*r + (N-r)*q == q*r + r + N*q - r*q == q*N + r == X.
If floor (X / N) < A or ceil (X / N) > B, then there is no solution.
Proof:
If floor (X / N) < A, then since both sides are integers, floor (X / N) <= A - 1, and so X < (floor (X / N) + 1) * N <= A * N. Hence X < A*N, and even using only the smallest pieces possible, the sum A*N would be larger than X.
Similarly, if ceil (X / N) > B, then ceil (X / N) >= B + 1, and so X > (ceil (X / N) - 1) * N >= B * N. Hence X > B*N, and even using only the largest pieces possible, the sum B*N would be smaller than X.
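A short Python sketch of this construction (the function name is mine):

def partition_evenly(X, N, A, B):
    # split X into N parts, each in [A, B], via the floor/ceil rule above;
    # returns None when floor(X/N) < A or ceil(X/N) > B, i.e. no solution
    q, r = divmod(X, N)             # X = q*N + r with 0 <= r < N
    ceil_part = q + 1 if r else q   # ceil(X / N)
    if q < A or ceil_part > B:
        return None
    return [ceil_part] * r + [q] * (N - r)

# example: split 9800 into 1000 parts between 2 and 100
parts = partition_evenly(9800, 1000, 2, 100)
assert sum(parts) == 9800 and all(2 <= p <= 100 for p in parts)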

Related

Find the value of f(T) for big value T

I am trying to solve a problem which is described below.
Given the values of f(0) and k, which are integers, I need to find the value of f(T), where T <= 10^10. The recursive function is
f(n) = 2*f(n-1),      if 4*f(n-1) <= k
f(n) = k - 2*f(n-1),  if 4*f(n-1) > k
My efforts:
#include <iostream>
using namespace std;

int main() {
    long k, f0, i;
    cin >> k >> f0;
    long operation;
    cin >> operation;
    long answer = f0;
    for (i = 1; i <= operation; i++) {
        answer = (4 * answer <= k) ? (2 * answer) : (k - (2 * answer));
    }
    cout << answer;
    return 0;
}
My code gives me the right answer, but the loop will run 10^10 times in the worst case, which gives me Time Limit Exceeded. I need a more efficient solution for this problem. Please help me; I don't know the correct algorithm.
If 2f(0) < k then you can compute this function in O(log n) time (using exponentiation by squaring modulo k).
r = f(0) * 2^n mod k
return 2 * r >= k ? k - r : r
You can prove this by induction. The induction hypothesis is that 0 <= f(n) < k/2, and that the above code fragment computes f(n).
Here's a Python program which checks random test cases, comparing a naive implementation (f) with an optimized one (g).
import random

def f(n, k, z):
    # naive implementation: iterate the recurrence n times
    r = z
    for _ in range(n):
        if 4 * r <= k:
            r = 2 * r
        else:
            r = k - 2 * r
    return r

def g(n, k, z):
    # optimized implementation: one modular exponentiation
    r = (z * pow(2, n, k)) % k
    if 2 * r >= k:
        r = k - r
    return r

errs = 0
while errs < 20:
    k = random.randrange(100, 10000000)
    n = random.randrange(100000)
    z = random.randrange(k // 2)
    a1 = f(n, k, z)
    a2 = g(n, k, z)
    if a1 != a2:
        print(n, k, z, a1, a2)
        errs += 1
    print('.', end='')
Can you use a mathematical solution before programming and computing?
Actually,
f(n) = f0*2^(n-1),     if f(n-1)*4 <= k
f(n) = k - f0*2^(n-1), if f(n-1)*4 > k
thus, your code would look like this:
condition = f0 * pow(2, operation - 2)
answer = condition * 4 <= k ? condition * 2 : k - condition * 2
For a simple loop, your answer looks pretty tight; one could optimise a little bit using answer << 2 instead of 4*answer, and answer << 1 for 2*answer, but quite possibly your compiler is already doing that. If you're exceeding the time limit with this, it might be necessary to reduce the loop itself somehow.
I can't figure out the mathematical pattern that @Shannon was going for, but I'm thinking we could exploit the fact that this function will sooner or later cycle. If the cycle is short enough, then we could shortcut the loop by just reading off the answer at the same point in the cycle.
So let's get some cycle detection equipment in the form of Brent's algorithm, and see if we can cut the loop to reasonable levels.
def brent(f, x0):
    # main phase: search successive powers of two
    power = lam = 1
    tortoise = x0
    hare = f(x0)  # f(x0) is the element/node next to x0.
    while tortoise != hare:
        if power == lam:  # time to start a new power of two?
            tortoise = hare
            power *= 2
            lam = 0
        hare = f(hare)
        lam += 1

    # Find the position of the first repetition of length λ
    mu = 0
    tortoise = hare = x0
    for i in range(lam):
        # range(lam) produces a list with the values 0, 1, ..., lam-1
        hare = f(hare)
    # The distance between the hare and tortoise is now λ.
    # Next, the hare and tortoise move at the same speed until they agree
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1
    return lam, mu

f0 = 2
k = 198779
t = 10000000000

def f(x):
    if 4 * x <= k:
        return 2 * x
    else:
        return k - 2 * x

lam, mu = brent(f, f0)
t2 = t
if t >= mu + lam:              # if T is past the cycle's first loop,
    t2 = (t - mu) % lam + mu   # find the equivalent place in the first loop
x = f0
for i in range(t2):
    x = f(x)

print("Cycle start: %d; length: %d" % (mu, lam))
print("Equivalent result at index: %d" % t2)
print("Loop iterations skipped: %d" % (t - t2))
print("Result: %d" % x)
As opposed to the other proposed answers, this approach could actually use a memo array to speed up the process, since the start of the function is calculated multiple times (in particular, inside brent); or that may be irrelevant, depending on how big the cycle happens to be.
The algorithm you proposed is already O(n).
To come up with a more efficient algorithm, there is not much room to manoeuvre. Some typical options are:
1. Decrease the coefficient of the linear term (but I doubt it would make a difference in this case).
2. Change to O(log n) (typically using some sort of divide-and-conquer technique).
3. Change to O(1).
In this case, we can do the last one.
The recursive function is a piecewise function
f(n) = 2*f(n-1),      if 4*f(n-1) <= k
f(n) = k - 2*f(n-1),  if 4*f(n-1) > k
Let's tackle it case by case:
Case 1: if 4*f(n-1) <= k (1) (assuming the starting index is zero),
this is obviously a geometric series
a_n = 2*a_{n-1}
Therefore, we have the formula
a_n = 2^n * f(0) ----(*)
Case 2: if 4*f(n-1) > k (2), we have
a_n = -2*a_{n-1} + k
Assume a_j is the first element in the sequence which satisfies condition (2). Repeatedly substituting a_{n-1} into the formula, you obtain the equation
a_n = k - 2k + 4k - 8k + ... + (-2)^(n-j) * a_j
where k - 2k + 4k - 8k + ... is another geometric series, with first term k and ratio -2, so its sum is
S = k * (1 - (-2)^(n-j)) / (1 - (-2))
Therefore, we have a formula for a_n in case 2:
a_n = k * (1 - (-2)^(n-j)) / (1 - (-2)) + (-2)^(n-j) * a_j ----(**)
All that is left to do is to find a_j, the first element that violates condition (1) and satisfies condition (2).
This can be obtained in constant time again, using the formula we have for case 1:
find the n for which 4*a_n = 4*2^n*f(0) first exceeds k,
i.e. solve 4*2^n*f(0) = k for n; if n is not an integer, take the ceiling of n.
In my first attempt to solve this question, I made the wrong assumption that the value of the sequence is monotonically increasing, but in fact the sequence might jump between case 1 and case 2. Therefore, there might not be a constant-time algorithm to solve the problem.
However, we can utilize the result above to skip the iterative updates.
The overall algorithm will look something like:
start with T, k, and f(0)
compute the n that makes the condition switch, using either (*) or (**)
update f(0) with f(n), and update T to T - n
repeat
terminate when T - n = 0 (the last iteration might overshoot, causing T - n < 0; if that happens, you need to go back a little)
Create a map that can store your results. Before computing f(n), check in that map whether the solution already exists.
If it exists, use that solution.
Otherwise compute it and store it for future use.
For C++:
Definition:
map<long,long> result;
Insertion:
result[key] = value;
Accessing:
value = result[key];
Checking:
map<long,long>::iterator it = result.find(key);
if (it == result.end())
{
    // key was not found: compute the solution and insert it into result
}
else
{
    return result[key];
}
Use the above technique for a better solution.

How do you determine the average-case complexity of this algorithm?

It's usually easy to calculate the time complexity for the best case and the worst case, but when it comes to the average case, especially when there's a probability p given, I don't know where to start.
Let's look at the following algorithm to compute the product of all the elements in a matrix:
int computeProduct(int[][] A, int m, int n) {
    int product = 1;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (A[i][j] == 0) return 0;
            product = product * A[i][j];
        }
    }
    return product;
}
Suppose p is the probability of A[i][j] being 0 (i.e. the algorithm terminates there, returning 0); how do we derive the average-case time complexity for this algorithm?
Let’s consider a related problem. Imagine you have a coin that flips heads with probability p. How many times, on expectation, do you need to flip the coin before it comes up heads? The answer is 1/p, since
There’s a p chance that you need one flip.
There’s a p(1-p) chance that you need two flips (the first flip has to go tails and the second has to go heads).
There’s a p(1-p)^2 chance that you need three flips (the first two flips need to go tails and the third has to go heads).
...
There’s a p(1-p)^(k-1) chance that you need k flips (the first k-1 flips need to go tails and the kth needs to go heads.)
So this means the expected value of the number of flips is
p + 2p(1 - p) + 3p(1 - p)^2 + 4p(1 - p)^3 + ...
= p(1(1 - p)^0 + 2(1 - p)^1 + 3(1 - p)^2 + ...)
So now we need to work out what this summation is. The general form is
p sum from k = 1 to infinity (k(1 - p)^(k-1)).
Rather than solving this particular summation, let's make this more general. Let x be some variable that, later, we'll set equal to 1 - p, but which for now we'll treat as a free value. Then we can rewrite the above summation as
p sum from k = 1 to infinity (kx^(k-1)).
Now for a cute trick: notice that the inside of this expression is the derivative of x^k with respect to x. Therefore, this sum is
p sum from k = 1 to infinity (d/dx x^k).
The derivative is a linear operator, so we can move it out to the front:
p d/dx sum from k = 1 to infinity (x^k)
That inner sum (x + x^2 + x^3 + ...) is the Taylor series for 1 / (1 - x) - 1, so we can simplify this to get
p d/dx (1 / (1 - x) - 1)
= p / (1 - x)^2
And since we picked x = 1 - p, this simplifies to
p / (1 - (1 - p))^2
= p / p^2
= 1 / p
Whew! That was a long derivation. But it shows that the expected number of coin tosses needed is 1/p.
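If you’d like to sanity-check this numerically, here is a quick Monte Carlo sketch (the choice of p and the trial count are arbitrary):

import random

p = 0.2
trials = 100_000
total_flips = 0
for _ in range(trials):
    flips = 1
    while random.random() >= p:  # keep flipping until the coin lands heads
        flips += 1
    total_flips += flips

print(total_flips / trials)  # should be close to 1/p = 5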
Now, in your case, your algorithm can be thought of as tossing mn coins that come up heads with probability p and stopping if any of them come up heads. Surely, the expected number of coins you’d need to toss won’t be more than the case where you’re allowed to flip infinitely often, so your expected runtime is at most O(1 / p) (assuming p > 0).
If we assume that p is independent of m and n, then we can notice that after some initial growth, each term added to the summation as we increase the number of flips is exponentially smaller than the previous ones. More specifically, after adding in roughly logarithmically many terms to the sum, we’ll be off from the total of the infinite summation by at most a constant factor. Therefore, provided that mn is roughly larger than Θ(log p), the sum ends up being Θ(1 / p). So in a big-O sense, if mn is independent of p, the runtime is Θ(1 / p).

Number of different binary sequences of length n generated using exactly k flip operations

Consider a binary sequence b of length N. Initially, all the bits are set to 0. We define a flip operation with 2 arguments, flip(L,R), such that:
All bits with indices between L and R are "flipped", meaning a bit with value 1 becomes a bit with value 0 and vice-versa. More exactly, for all i in range [L,R]: b[i] = !b[i].
Nothing happens to bits outside the specified range.
You are asked to determine the number of possible different sequences that can be obtained using exactly K flip operations modulo an arbitrary given number, let's call it MOD.
More specifically, each test contains on the first line a number T, the number of queries to be given. Then there are T queries, each one being of the form N, K, MOD with the meaning from above.
1 ≤ N, K ≤ 300 000
T ≤ 250
2 ≤ MOD ≤ 1 000 000 007
Sum of all N-s in a test is ≤ 600 000
time limit: 2 seconds
memory limit: 65536 kbytes
Example :
Input :
1
2 1 1000
Output :
3
Explanation :
There is a single query. The initial sequence is 00. We can do the following operations :
flip(1,1) ⇒ 10
flip(2,2) ⇒ 01
flip(1,2) ⇒ 11
So there are 3 possible sequences that can be generated using exactly 1 flip.
Some quick observations that I've made, although I'm not sure they are totally correct :
If K is big enough, that is, if we have a big enough number of flips at our disposal, we should be able to obtain 2^n sequences.
If K=1, then the result we're looking for is N(N+1)/2. It's also C(n,1)+C(n,2), where C is the binomial coefficient.
Currently trying a brute force approach to see if I can spot a rule of some kind. I think this is a sum of some binomial coefficients, but I'm not sure.
I've also come across a somewhat simpler variant of this problem, where the flip operation only flips a single specified bit. In that case, the result is
C(n,k) + C(n,k-2) + C(n,k-4) + ... + C(n, 1 or 0). Of course, there's the special case where k > n, but it's not a huge difference. Anyway, it's pretty easy to understand why that happens. I guess it's worth noting.
Here are a few ideas:
We may assume that no flip operation occurs twice (otherwise, we can assume that it did not happen). It does affect the number of operations, but I'll talk about it later.
We may assume that no two segments intersect. Indeed, if L1 < L2 < R1 < R2, we can just do the (L1, L2 - 1) and (R1 + 1, R2) flips instead. The case when one segment is inside the other is handled similarly.
We may also assume that no two segments touch each other. Otherwise, we can glue them together and reduce the number of operations.
These observations give the following formula for the number of different sequences one can obtain by flipping exactly k segments without "redundant" flips: C(n + 1, 2 * k) (we choose 2 * k ends of segments. They are always different. The left end is exclusive).
If we were allowed to perform no more than K flips, the answer would be the sum over k = 0...K of C(n + 1, 2 * k).
Intuitively, it seems that it's possible to transform any sequence of no more than K flips into a sequence of exactly K flips (for instance, we can flip the same segment two more times and add 2 operations; we can also split a segment of more than two elements into two segments and add one operation).
Running a brute force search (I know that it's not a real proof, but it looks correct combined with the observations mentioned above) suggests that the answer is this sum minus 1 if n or k is equal to 1, and exactly this sum otherwise.
That is, the result is C(n + 1, 0) + C(n + 1, 2) + ... + C(n + 1, 2 * K) - d, where d = 1 if n = 1 or k = 1 and 0 otherwise.
Here is code I used to look for patterns running a brute force search and to verify that the formula is correct for small n and k:
reachable = set()
was = set()

def other(c):
    """
    Returns '1' if c == '0' and '0' otherwise.
    """
    return '0' if c == '1' else '1'

def flipped(s, l, r):
    """
    Flips the [l, r] segment of the string s and returns the result.
    """
    res = s[:l]
    for i in range(l, r + 1):
        res += other(s[i])
    res += s[r + 1:]
    return res

def go(xs, k):
    """
    Exhaustive search. was is used to speed up the search to avoid checking the
    same string with the same number of remaining operations twice.
    """
    p = (xs, k)
    if p in was:
        return
    was.add(p)
    if k == 0:
        reachable.add(xs)
        return
    for l in range(len(xs)):
        for r in range(l, len(xs)):
            go(flipped(xs, l, r), k - 1)

def calc_naive(n, k):
    """
    Counts the number of reachable sequences by running an exhaustive search.
    """
    global reachable
    global was
    was = set()
    reachable = set()
    go('0' * n, k)
    return len(reachable)

def fact(n):
    return 1 if n == 0 else n * fact(n - 1)

def cnk(n, k):
    if k > n:
        return 0
    return fact(n) // fact(k) // fact(n - k)

def solve(n, k):
    """
    Uses the formula shown above to compute the answer.
    """
    res = 0
    for i in range(k + 1):
        res += cnk(n + 1, 2 * i)
    if k == 1 or n == 1:
        res -= 1
    return res

if __name__ == '__main__':
    # Checks that the formula gives the right answer for small values of n and k
    for n in range(1, 11):
        for k in range(1, 11):
            assert calc_naive(n, k) == solve(n, k)
This solution is much better than the exhaustive search. For instance, it can run in O(N * K) time per test case if we compute the coefficients using Pascal's triangle. Unfortunately, it is not fast enough. I know how to solve it more efficiently for prime MOD (using Lucas' theorem), but I do not have a solution in the general case.
Multiplicative modular inverses can't solve this problem immediately as k! or (n - k)! may not have an inverse modulo MOD.
Note: I assumed that C(n, m) is defined for all non-negative n and m and is equal to 0 if n < m.
I think I know how to solve it for an arbitrary MOD now.
Let's factorize MOD into prime factors p1^a1 * p2^a2 * ... * pn^an. Now we can solve the problem for each prime factor independently and combine the results using the Chinese remainder theorem.
Let's fix a prime p. Let's assume that p^a|MOD (that is, we need to get the result modulo p^a). We can precompute all p-free parts of the factorial and the maximum power of p that divides the factorial for all 0 <= n <= N in linear time using something like this:
powers = [0] * (N + 1)
p_free = list(range(N + 1))
p_free[0] = 1
cur_p = p
while cur_p <= N:  # iterate over the powers of p that are <= N
    for i in range(cur_p, N + 1, cur_p):
        powers[i] += 1
        p_free[i] //= p
    cur_p *= p
Now the p-free part of the factorial is the product of p_free[i] for all i <= n and the power of p that divides n! is the prefix sum of the powers.
Now we can divide two factorials: the p-free part is coprime with p^a so it always has an inverse. The powers of p are just subtracted.
We're almost there. One more observation: we can precompute the inverses of the p-free parts in linear time. Compute the inverse for the p-free part of N! using the extended Euclidean algorithm; now we can iterate over all i from N down to 0. The inverse of the p-free part of i! is the inverse for (i + 1)! times p_free[i + 1] (it's easy to prove this if we rewrite the inverse of the p-free part as a product, using the fact that elements coprime with p^a form an abelian group under multiplication).
This algorithm runs in O(N * number_of_prime_factors + the time to solve the system using the Chinese remainder theorem + sqrt(MOD)) time per test case. Now it looks good enough.
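As an illustration, here is a sketch of the binomial computation modulo a prime power along these lines (my own code, untested against the judge; it uses Python 3.8's pow(x, -1, mod) for the single modular inverse instead of the linear-time batch inversion described above):

def cnk_mod_prime_power(n, k, p, a):
    # C(n, k) modulo p**a using p-free parts of factorials
    mod = p ** a
    powers = [0] * (n + 1)        # powers[i] = exponent of p in i
    p_free = list(range(n + 1))   # p_free[i] = i with all factors of p removed
    p_free[0] = 1
    cur_p = p
    while cur_p <= n:
        for i in range(cur_p, n + 1, cur_p):
            powers[i] += 1
            p_free[i] //= p
        cur_p *= p
    # prefix products of p-free parts (mod p^a) and prefix sums of exponents
    fact_free = [1] * (n + 1)
    fact_pow = [0] * (n + 1)
    for i in range(1, n + 1):
        fact_free[i] = fact_free[i - 1] * (p_free[i] % mod) % mod
        fact_pow[i] = fact_pow[i - 1] + powers[i]
    e = fact_pow[n] - fact_pow[k] - fact_pow[n - k]  # exponent of p in C(n, k)
    if e >= a:
        return 0
    denom = fact_free[k] * fact_free[n - k] % mod    # coprime with p^a
    return fact_free[n] * pow(denom, -1, mod) % mod * pow(p, e, mod) % mod

assert cnk_mod_prime_power(10, 3, 2, 3) == 120 % 8  # C(10, 3) = 120
assert cnk_mod_prime_power(5, 2, 2, 3) == 10 % 8    # C(5, 2) = 10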
You're on a good path with binomial-coefficients already. There are several factors to consider:
Think of your number as a binary-string of length n. Now we can create another array counting the number of times a bit will be flipped:
[0, 1, 0, 0, 1] number
[a, b, c, d, e] number of flips.
But even numbers of flips of a bit all lead to the same result, and so do all odd numbers of flips. So basically the relevant part of the distribution can be represented modulo 2.
Logical next question: how many different combinations of even and odd values are available? We'll take care of the ordering later on; for now, just assume the flipping-array is ordered descending, for simplicity. We start off with k as the only flipping-number in the array. Now we want to add a flip. Since the whole flipping-array is taken modulo 2, we need to remove two from the value of k to achieve this and insert them into the array separately. E.g.:
[5, 0, 0, 0] mod 2 [1, 0, 0, 0]
[3, 1, 1, 0] [1, 1, 1, 0]
[4, 1, 0, 0] [0, 1, 0, 0]
As the last example shows (remember we're operating modulo 2 in the final result), moving a single 1 doesn't change the number of flips in the final outcome. Thus we always have to flip an even number of bits in the flipping-array. If k is even, so will be the number of flipped bits, and the same applies vice versa, no matter what the value of n is.
So now the question is of course how many different ways of filling the array are available? For simplicity we'll start with mod 2 right away.
Obviously we start with 1 flipped bit if k is odd, and otherwise with 0, and we always add 2 flipped bits at a time. We can continue with this until we either have flipped all n bits (or at least as many as we can flip)
v = (k % 2 == n % 2) ? n : n - 1
or we can't spread k further over the array.
v = k
Putting this together:
noOfAvailableFlips:
    if k < n:
        return k
    else:
        return (k % 2 == n % 2) ? n : n - 1
So far so good: there are always about v / 2 flipping-arrays (mod 2) that differ in the number of flipped bits. Now we come to the next part, permuting these arrays. This is just a simple permutation function (permutation with repetition, to be precise):
flipArrayNo(flippedbits):
    return factorial(n) / (factorial(flippedbits) * factorial(n - flippedbits))
Putting it all together:
solutionsByFlipping(n, k):
    res = 0
    for i in [k % 2, noOfAvailableFlips(), step=2]:
        res += flipArrayNo(i)
    return res
This also shows that for sufficiently large numbers we can't obtain 2^n sequences, for the simple reason that we cannot arrange the operations as we please: the number of flips that actually affect the outcome will always be either even or odd, depending on k. There's no way around this. The best result one can get is 2^(n-1) sequences.
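For reference, here is a direct Python rendering of this answer's pseudocode (my translation; note that it counts the per-bit flip distributions described here, not the segment flips of the original problem):

from math import comb  # Python 3.8+

def no_of_available_flips(n, k):
    # largest usable number of flipped bits, matching the parity of k
    if k < n:
        return k
    return n if k % 2 == n % 2 else n - 1

def solutions_by_flipping(n, k):
    # sum the permutation counts over all reachable numbers of flipped bits
    return sum(comb(n, i)
               for i in range(k % 2, no_of_available_flips(n, k) + 1, 2))

print(solutions_by_flipping(5, 3))  # C(5,1) + C(5,3) = 15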
For completeness, here's a dynamic program. It can deal easily with an arbitrary modulus since it is based on sums, but unfortunately I haven't found a way to speed it up beyond O(n * k).
Let a[n][k] be the number of binary strings of length n with k non-adjacent blocks of contiguous 1s that end in 1. Let b[n][k] be the number of binary strings of length n with k non-adjacent blocks of contiguous 1s that end in 0.
Then:
# we can append 1 to any arrangement of k non-adjacent blocks of contiguous 1's
# that ends in 1, or to any arrangement of (k-1) non-adjacent blocks of contiguous
# 1's that ends in 0:
a[n][k] = a[n - 1][k] + b[n - 1][k - 1]
# we can append 0 to any arrangement of k non-adjacent blocks of contiguous 1's
# that ends in either 0 or 1:
b[n][k] = b[n - 1][k] + a[n - 1][k]
# complete answer would be sum (a[n][i] + b[n][i]) for i = 0 to k
I wonder if the following observations might be useful: (1) a[n][k] and b[n][k] are zero when n < 2*k - 1, and (2) on the flip side, for values of k greater than ⌊(n + 1) / 2⌋ the overall answer seems to be identical.
Python code (full matrices are defined for simplicity, but I think only one row of each would actually be needed, space-wise, for a bottom-up method):
a = [[0] * 11 for i in range(0, 11)]
b = [([1] + [0] * 10) for i in range(0, 11)]

def f(n, k):
    return fa(n, k) + fb(n, k)

def fa(n, k):
    global a
    if a[n][k] or n == 0 or k == 0:
        return a[n][k]
    elif n == 2 * k - 1:
        a[n][k] = 1
        return 1
    else:
        a[n][k] = fb(n - 1, k - 1) + fa(n - 1, k)
        return a[n][k]

def fb(n, k):
    global b
    if b[n][k] or n == 0 or n == 2 * k - 1:
        return b[n][k]
    else:
        b[n][k] = fb(n - 1, k) + fa(n - 1, k)
        return b[n][k]

def g(n, k):
    return sum([f(n, i) for i in range(0, k + 1)])

# example
print(g(10, 10))
for i in range(0, 11):
    print(a[i])
print()
for i in range(0, 11):
    print(b[i])

Probability: No of ways to win if you have n dice with m faces each

You are given a number of dice n, each with a number of faces m. You roll all n dice and note the sum of the throws. If you get a sum >= x, you win; otherwise you lose. Find the probability that you win.
I thought of generating all combinations of 1 to m (of size n) and counting only those whose sum is at least x. The total number of outcomes is m^n.
After that it's just the division of the two.
Is there a better way?
[EDIT: As noted by jpalacek, the time complexity was wrong -- I've now fixed this.]
You can solve this more efficiently with dynamic programming, by first changing it into the question:
How many ways can I get at least x from n dice?
Express this as f(x, n). Then it must be that
f(x, n) = sum(f(x - i, n - 1)) for all 1 <= i <= m.
I.e. if the first die has 1, the remaining n - 1 dice must add up to at least x - 1; if the first die has 2, the remaining n - 1 dice must add up to at least x - 2; and so on.
There are m terms in the sum, so if you memoise this function, it will be O(m^2*n^2), since the summing work is required at most (m * n) * n times (i.e. once per unique set of inputs to the function, assuming that the first parameter satisfies x <= m * n).
As a final step to get a probability, just divide the result of f(x, n) by the total number of possible outcomes, i.e. m^n.
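A compact memoised sketch of this recurrence (the function names are mine), with the base case f(x, 0) = 1 for x <= 0 and 0 otherwise:

from functools import lru_cache

def win_probability(n, m, x):
    # probability that n dice with m faces sum to at least x
    @lru_cache(maxsize=None)
    def f(x, n):
        # number of ways for n dice to total at least x
        if n == 0:
            return 1 if x <= 0 else 0
        return sum(f(x - i, n - 1) for i in range(1, m + 1))
    return f(x, n) / m ** n

print(win_probability(2, 6, 7))  # 21/36 = 0.5833...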
Just to add to @j_random_hacker's basically correct answer: you can make it even faster when you note that
f(x, n) = f(x-1, n) - f(x-m-1, n-1) + f(x-1, n-1) if x > m+1
This way, you'll only spend O(1) time calculating each f value.
// Passing the curFace value disallows duplicate combinations.
// For 3 dice and sum 8, "2 4 2" and "2 2 4" are the same combination, so they
// should be counted as one. Note that the memo key must include curFace,
// since the count depends on it.
int sums(int totSum, int noDices, int mFaces, int curFace, HashMap<String, Integer> map)
{
    int count = 0;
    if (noDices <= 0 || totSum <= 0)
        return 0;
    if (noDices == 1)
    {
        // the last die must keep the combination non-decreasing
        if (totSum >= curFace && totSum <= mFaces)
            return 1;
        else
            return 0;
    }
    String key = noDices + "-" + totSum + "-" + curFace;
    if (map.containsKey(key))
        return map.get(key);
    for (int i = curFace; i <= mFaces; i++)
    {
        count += sums(totSum - i, noDices - 1, mFaces, i, map);
    }
    map.put(key, count);
    return count;
}

Why do we check up to the square root of a number to determine if the number is prime?

To test whether a number is prime or not, why do we have to test whether it is divisible only up to the square root of that number?
If a number n is not a prime, it can be factored into two factors a and b:
n = a * b
Now a and b can't be both greater than the square root of n, since then the product a * b would be greater than sqrt(n) * sqrt(n) = n. So in any factorization of n, at least one of the factors must be smaller than the square root of n, and if we can't find any factors less than or equal to the square root, n must be a prime.
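As a minimal Python illustration of this argument:

import math

def is_prime(n):
    # trial division: only candidates up to floor(sqrt(n)) need checking
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:  # found a factor <= sqrt(n), so n is composite
            return False
    return True

print([x for x in range(2, 30) if is_prime(x)])  # 2, 3, 5, 7, 11, ...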
Let's say m = sqrt(n) then m × m = n. Now if n is not a prime then n can be written as n = a × b, so m × m = a × b. Notice that m is a real number whereas n, a and b are natural numbers.
Now there can be 3 cases:
a > m ⇒ b < m
a = m ⇒ b = m
a < m ⇒ b > m
In all 3 cases, min(a, b) ≤ m. Hence if we search till m, we are bound to find at least one factor of n, which is enough to show that n is not prime.
Because if a factor is greater than the square root of n, the other factor that would multiply with it to equal n is necessarily less than the square root of n.
Suppose n is not a prime number (greater than 1). So there are numbers a and b such that
n = ab (1 < a <= b < n)
By multiplying the relation a<=b by a and b we get:
a^2 <= ab
ab <= b^2
Therefore: (note that n=ab)
a^2 <= n <= b^2
Hence: (Note that a and b are positive)
a <= sqrt(n) <= b
So if a number (greater than 1) is not prime and we test divisibility up to square root of the number, we will find one of the factors.
It's all really just basic use of factorization and square roots.
It may appear abstract, but in reality it simply lies with the fact that a non-prime number's smallest factor can be at most its square root, because:
sqrroot(n) * sqrroot(n) = n.
Given that, if any whole number above 1 and up to sqrroot(n) divides evenly into n, then n cannot be a prime number.
Pseudo-code example:
i = 2;
is_prime = true;
while (i <= sqrroot(n))
{
    if (n % i == 0)
    {
        is_prime = false;
        exit while;
    }
    ++i;
}
Let's suppose that the given integer N is not prime.
Then N can be factorized into two factors a and b, 2 <= a, b < N, such that N = a*b.
Clearly, both of them can't be greater than sqrt(N) simultaneously.
Let us assume without loss of generality that a is the smaller one.
Now, if you could not find any divisor of N in the range [2, sqrt(N)], what does that mean?
It means that N does not have any divisor in [2, a], since a <= sqrt(N).
Therefore a = 1 and b = N, and hence, by definition, N is prime.
...
Further reading if you are not satisfied:
Many different combinations of (a, b) may be possible. Let's say they are:
(a1, b1), (a2, b2), (a3, b3), ..., (ak, bk). Without loss of generality, assume ai < bi for 1 <= i <= k.
Now, to be able to show that N is prime, it is sufficient to show that none of the ai divides N. And we also know that ai <= sqrt(N), so checking every candidate up to sqrt(N) covers all the ai. Hence you will be able to conclude whether or not N is prime.
...
So to check whether a number N is prime or not, we need only check whether N is divisible by numbers <= SQROOT(N). This is because, if we factor N into any 2 factors, say X and Y, i.e. N = X*Y:
X and Y cannot both be less than SQROOT(N), because then X*Y < N.
X and Y cannot both be greater than SQROOT(N), because then X*Y > N.
Therefore one factor must be less than or equal to SQROOT(N) (while the other factor is greater than or equal to SQROOT(N)).
So to check if N is prime we need only check those numbers <= SQROOT(N).
Let's say we have a number "a", which is not prime [not prime/composite number means a number which can be divided evenly by numbers other than 1 or itself; for example, 6 can be divided evenly by 2 or by 3, as well as by 1 or 6].
6 = 1 × 6 or 6 = 2 × 3
So now, if "a" is not prime, then it can be divided by two other numbers; let's say those numbers are "b" and "c". Which means
a = b*c.
Now, if "b" or "c" were greater than the square root of "a", then the product of "b" and "c" would be greater than "a".
So "b" or "c" is always <= the square root of "a" for the equation "a = b*c" to hold.
Because of the above reason, when we test if a number is prime or not, we only check up to the square root of that number.
Given any number n, one way to find its factors is to take its square root p:
sqrt(n) = p
Of course, if we multiply p by itself, then we get back n:
p*p = n
This can be re-written as:
a*b = n
where p = a = b. If a increases, then b decreases to maintain a*b = n. Therefore, p is the upper limit.
Update: I am re-reading this answer again today and it became clearer to me. The value p does not necessarily mean an integer, because if it were, then n would not be a prime. So p could be a real number (i.e., with fractions). Instead of going through the whole range up to n, we only need to go through the whole range up to p; the other factor is a mirror copy, so in effect we halve the range. And now I am seeing that we could actually continue re-doing the square root, applying it to p, to further halve the range.
Let n be non-prime. Therefore, it has at least two integer factors greater than 1. Let f be the smallest of n's such factors. Suppose f > sqrt n. Then n/f is an integer ≤ sqrt n, thus smaller than f. Therefore, f cannot be n's smallest factor. Reductio ad absurdum; n's smallest factor must be ≤ sqrt n.
Any composite number is a product of primes.
Let's say n = p1 * p2, where p2 > p1 and both are primes.
If n % p1 === 0, then n is a composite number.
If n % p2 === 0, then guess what: n % p1 === 0 as well!
So there is no way that n % p2 === 0 while n % p1 !== 0 at the same time.
In other words, if a composite number n can be divided evenly by
p2, p3, ..., pi (one of its greater factors), it must be divisible by its lowest factor p1 too.
It turns out that the lowest factor p1 <= Math.sqrt(n) always holds.
Yes, as was properly explained above, it's enough to iterate up to Math.floor of a number's square root to check its primality (because sqrt covers all possible cases of division, and Math.floor because any integer above sqrt will already be beyond its range).
Here is a runnable JavaScript code snippet that represents a simple implementation of this approach, and its "runtime-friendliness" is good enough for handling pretty big numbers (I tried checking both prime and non-prime numbers up to 10**12, i.e. 1 trillion, compared results with the online database of prime numbers, and encountered no errors or lags even on my cheap phone):
function isPrime(num) {
    if (num % 2 === 0 || num < 3 || !Number.isSafeInteger(num)) {
        return num === 2;
    } else {
        const sqrt = Math.floor(Math.sqrt(num));
        for (let i = 3; i <= sqrt; i += 2) {
            if (num % i === 0) return false;
        }
        return true;
    }
}
<label for="inp">Enter a number and click "Check!":</label><br>
<input type="number" id="inp"></input>
<button onclick="alert(isPrime(+document.getElementById('inp').value) ? 'Prime' : 'Not prime')" type="button">Check!</button>
To test the primality of a number, n, one would expect a loop such as following in the first place :
bool isPrime = true;
for (int i = 2; i < n; i++) {
    if (n % i == 0) {
        isPrime = false;
        break;
    }
}
What the above loop does is this : for a given 1 < i < n, it checks if n/i is an integer (leaves remainder 0). If there exists an i for which n/i is an integer, then we can be sure that n is not a prime number, at which point the loop terminates. If for no i, n/i is an integer, then n is prime.
As with every algorithm, we ask : Can we do better ?
Let us see what is going on in the above loop.
The sequence of i goes : i = 2, 3, 4, ... , n-1
And the sequence of integer-checks goes : j = n/i, which is n/2, n/3, n/4, ... , n/(n-1)
If for some i = a, n/a is an integer, then n/a = k (an integer),
or n = ak, and clearly n > k > 1 (if k = 1, then a = n, but i never reaches n; and if k = n, then a = 1, but i starts from 2).
Also, n/k = a, and as stated above, a is a value of i, so n > a > 1.
So, a and k are both integers between 1 and n (exclusive). Since, i reaches every integer in that range, at some iteration i = a, and at some other iteration i = k. If the primality test of n fails for min(a,k), it will also fail for max(a,k). So we need to check only one of these two cases, unless min(a,k) = max(a,k) (where two checks reduce to one) i.e., a = k , at which point a*a = n, which implies a = sqrt(n).
In other words, if the primality test of n were to fail for some i >= sqrt(n) (i.e., max(a,k)), then it would also fail for some i <= sqrt(n) (i.e., min(a,k)). So it suffices to run the test for i = 2 to sqrt(n).
