Note: This may involve a good deal of number theory, but the formula I found online is only an approximation, so I believe an exact solution requires some sort of iterative calculation by a computer.
My goal is to find an efficient algorithm (in terms of time complexity) to solve the following problem for large values of n:
Let R(a,b) be the number of steps that the Euclidean algorithm takes to find the GCD of nonnegative integers a and b. That is, R(a,b) = 1 + R(b,a%b), and R(a,0) = 0. Given a natural number n, find the sum of R(a,b) for all 1 <= a,b <= n.
For example, if n = 2, then the solution is R(1,1) + R(1,2) + R(2,1) + R(2,2) = 1 + 2 + 1 + 1 = 5.
Since there are n^2 pairs corresponding to the numbers to be added together, simply computing R(a,b) for every pair can do no better than O(n^2), regardless of the efficiency of R. Thus, to improve the efficiency of the algorithm, a faster method must somehow calculate the sum of R(a,b) over many values at once. There are a few properties that I suspect might be useful:
If a = b, then R(a,b) = 1
If a < b, then R(a,b) = 1 + R(b,a)
R(a,b) = R(ka,kb) where k is some natural number
If b <= a, then R(a,b) = R(a+b,b)
If b <= a < 2b, then R(a,b) = R(2a-b,a)
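The five properties listed above can be checked numerically for small inputs. The following is only a sanity-check sketch in Python; the helper name check_properties and the chosen limit are assumptions made for illustration, not part of the original question.

```python
def R(a, b):
    # Number of Euclidean algorithm steps, as defined in the question.
    return 1 + R(b, a % b) if b else 0

def check_properties(limit):
    # Verify each listed property for all 1 <= a, b <= limit.
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            if a == b:
                assert R(a, b) == 1
            if a < b:
                assert R(a, b) == 1 + R(b, a)
            for k in (2, 3):
                assert R(a, b) == R(k * a, k * b)
            if b <= a:
                assert R(a, b) == R(a + b, b)
            if b <= a < 2 * b:
                assert R(a, b) == R(2 * a - b, a)
    return True

print(check_properties(50))
```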
Because of the first two properties, it is only necessary to find the sum of R(a,b) over pairs where a > b. I combined this with the third property in a method that computes R(a,b) only for coprime pairs with a > b. The total sum is then n plus the sum of (n / a) * ((2 * R(a,b)) + 1) over all such pairs (using integer division for n / a). I discovered that this algorithm still has time complexity O(n^2), because Euler's totient function grows roughly linearly, so the number of coprime pairs is still quadratic in n.
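For reference, the coprime-pair formula described above can be sketched in Python and compared against the direct double loop. The function names coprime_pair_sum and brute_force_sum are made up for this illustration, and the sketch is still quadratic, as the question notes.

```python
from math import gcd

def R(a, b):
    # Iterative count of Euclidean algorithm steps.
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return steps

def coprime_pair_sum(n):
    # n accounts for the diagonal pairs (a, a), each with R = 1.
    total = n
    for a in range(2, n + 1):
        for b in range(1, a):
            if gcd(a, b) == 1:
                # (a, b) and (b, a) together contribute 2 * R(a, b) + 1,
                # and the same holds for each of the n // a multiples (ka, kb).
                total += (n // a) * (2 * R(a, b) + 1)
    return total

def brute_force_sum(n):
    return sum(R(a, b) for a in range(1, n + 1) for b in range(1, n + 1))

print(coprime_pair_sum(2))  # 5, matching the example in the question
```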
I don't need any specific code solution, I just need to figure out the procedure for a more efficient algorithm. But if the programming language matters, my attempts to solve this problem have used C++.
Side note: I have found that a formula has been discovered that nearly solves this problem, but it is only an approximation. Note that the formula calculates the average rather than the sum, so it would just need to be multiplied by n^2. If the formula could be expanded to reduce the error, it might work, but from what I can tell, I'm not sure if this is possible.
Using the Stern-Brocot tree, due to symmetry, we can look at just one of the four subtrees rooted at 1/3, 2/3, 3/2 or 3/1. The time complexity is still O(n^2), but this obviously performs fewer calculations. The version below uses the subtree rooted at 2/3 (or at least that's the one I looked at to think it through :). Also note that we only care about the denominators there, since the numerators are lower, and that the code relies on rules 2 and 3 as well.
C++ code (takes about a tenth of a second for n = 10,000):
#include <iostream>
using namespace std;

long g(int n, int l, int mid, int r, int fromL, int turns){
    long right = 0;
    long left = 0;
    if (mid + r <= n)
        right = g(n, mid, mid + r, r, 1, turns + (1^fromL));
    if (mid + l <= n)
        left = g(n, l, mid + l, mid, 0, turns + fromL);
    // Multiples
    int k = n / mid;
    // This subtree is rooted at 2/3
    return 4 * k * turns + left + right;
}

long f(int n) {
    // 1/1, 2/2, 3/3 etc.
    long total = n;
    // 1/2, 2/4, 3/6 etc.
    if (n > 1)
        total += 3 * (n >> 1);
    if (n > 2)
        // Technically 3 turns for 2/3 but
        // we can avoid a subtraction
        // per call by starting with 2. (I
        // guess that means it could be
        // another subtree, but I haven't
        // thought it through.)
        total += g(n, 2, 3, 1, 1, 2);
    return total;
}

int main() {
    cout << f(10000);
    return 0;
}
I think this is a hard problem. We can avoid division and reduce the space usage to linear at least via the Stern--Brocot tree.
def f(n, a, b, r):
    return r if a + b > n else r + f(n, a + b, b, r) + f(n, a + b, a, r + 1)

def R_sum(n):
    return sum(f(n, d, d, 1) for d in range(1, n + 1))

def R(a, b):
    return 1 + R(b, a % b) if b else 0

def test(n):
    print(R_sum(n))
    print(sum(R(a, b) for a in range(1, n + 1) for b in range(1, n + 1)))

test(100)
I encountered this question in an interview and could not figure it out. I believe it has a dynamic programming solution but it eludes me.
Given a number of bricks, output the total number of 2d pyramids possible, where a pyramid is defined as any structure where a row of bricks has strictly fewer bricks than the row below it. You do not have to use all the bricks.
A brick is simply a square, the number of bricks in a row is the only important bit of information.
Really stuck with this one, I thought it would be easy to solve each problem 1...n iteratively and sum. But coming up with the number of pyramids possible with exactly i bricks is evading me.
example, n = 6
X

XX

 X
XX   XXX

  X
XXX   XXXX

 XX      X
XXX   XXXX   XXXXX

  X
 XX     XX       X
XXX   XXXX   XXXXX   XXXXXX
So the answer is 13 possible pyramids from 6 bricks.
edit
I am positive this is a dynamic programming problem, because it makes sense to (once you've determined the first row) simply look to the index in your memorized array of your remainder of bricks to see how many pyramids fit atop.
It also makes sense to consider only bottom rows of width at least n/2, because we can't have more bricks on top than on the bottom row. EXCEPT, and this is where I lose it and my mind falls apart, in a few cases you can, e.g. N = 10:
X
XX
XXX
XXXX
Now the bottom row has 4 but there are 6 left to place on top
But with n = 11 we cannot have a bottom row with fewer than n/2 bricks. There is another weird inconsistency like that with n = 4, where we cannot have a bottom row of n/2 = 2 bricks.
Let's choose a suitable definition:
f(n, m) = # pyramids out of n bricks with base of size < m
The answer you are looking for now is (given that N is your input number of bricks):
f(N, N+1) - 1
Let's break that down:
The first N is obvious: that's your number of bricks.
Your bottom row will contain at most N bricks (because that's all you have), so N+1 is a sufficient upper bound.
Finally, the - 1 is there because technically the empty pyramid is also a pyramid (and will thus be counted) but you exclude that from your solutions.
The base cases are simple:
f(n, 0) = 1 for any n >= 0
f(0, m) = 1 for any m >= 0
In both cases, it's the empty pyramid that we are counting here.
Now, all we need still is a recursive formula for the general case.
Let's assume we are given n and m and choose to have i bricks on the bottom layer. What can we place on top of this layer? A smaller pyramid, for which we have n - i bricks left and whose base has size < i. This is exactly f(n - i, i).
What is the range for i? We can choose an empty row so i >= 0. Obviously, i <= n because we have only n bricks. But also, i <= m - 1, by definition of m.
This leads to the recursive expression:
f(n, m) = sum f(n - i, i) for 0 <= i <= min(n, m - 1)
You can compute f recursively, but using dynamic programming it will be faster of course. Storing the results matrix is straightforward though, so I leave that up to you.
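A minimal memoized sketch of the recurrence above, written in Python. The wrapper name count_pyramids is an assumption for illustration; f follows the definition and base cases given in this answer.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n, m):
    # Number of pyramids built from at most n bricks with base size < m,
    # counting the empty pyramid.
    if n == 0 or m == 0:
        return 1
    # i is the size of the bottom row; i == 0 means the empty pyramid.
    return sum(f(n - i, i) for i in range(0, min(n, m - 1) + 1))

def count_pyramids(N):
    # Subtract 1 to exclude the empty pyramid.
    return f(N, N + 1) - 1

print(count_pyramids(6))  # 13, as in the question's example
```

The cache holds one entry per (n, m) pair that is reached, so the memory use is at most quadratic in N.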
Coming back to the original claim that f(N, N+1)-1 is the answer you are looking for, it doesn't really matter which value to choose for m as long as it is > N. Based on the recursive formula it's easy to show that f(N, N + 1) = f(N, N + k) for every k >= 1:
f(N, N + k) = sum f(N - i, i) for 0 <= i <= min(N, N + k - 1)
            = sum f(N - i, i) for 0 <= i <= N
            = sum f(N - i, i) for 0 <= i <= min(N, N + 1 - 1)
            = f(N, N + 1)
In how many ways can you build a pyramid of width n? By putting any pyramid of width n-1 or less anywhere atop the layer of n bricks. So if p(n) is the number of pyramids of width n, then p(n) = sum [m=1 to n-1] (p(m) * c(n, m)), where c(n, m) is the number of ways you can place a layer of width m atop a layer of width n (I trust that you can work that one out yourself).
This, however, doesn't place a limitation on the number of bricks. Generally, in DP, any resource limitation must be modeled as a separate dimension. So your problem is now p(n, b): "How many pyramids can you build of width n with a total of b bricks"? In the recursive formula, for each possible way of building a smaller pyramid atop your current one, you need to refer to the correct amount of remaining bricks. I leave it as a challenge for you to work out the recursive formula; let me know if you need any hints.
You can think of your recursion as: given x bricks left where you used n bricks on last row, how many pyramids can you build. Now you can fill up rows from either top to bottom row or bottom to top row. I will explain the former case.
Here the recursion might look something like this (left is number of bricks left and last is number of bricks used on last row)
f(left,last)=sum (1+f(left-i,i)) for i in range [last+1,left] inclusive.
Since when you use i bricks on current row you will have left-i bricks left and i will be number of bricks used on this row.
Code:
int calc(int left, int last) {
    int total=0;
    if(left<=0) return 0; // terminal case, no pyramid with no brick
    for(int i=last+1; i<=left; i++) {
        total+=1+calc(left-i,i);
    }
    return total;
}
I will leave it to you to implement memoized or bottom-up dp version. Also you may want to start from bottom row and fill up upper rows in pyramid.
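One possible memoized version of the top-to-bottom recursion, sketched in Python rather than C++. The name count_from_top is invented for this example; the logic mirrors calc above, with a cache added.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_from_top(left, last):
    # left: bricks still available; last: bricks used on the previous (upper) row.
    if left <= 0:
        return 0  # no pyramid can be extended without bricks
    # Using i bricks on the current row yields one finished pyramid,
    # plus every way of extending it with wider rows below.
    return sum(1 + count_from_top(left - i, i) for i in range(last + 1, left + 1))

print(count_from_top(6, 0))  # 13
```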
Since we are asked to count pyramids of any cardinality less than or equal to n, we may consider each cardinality in turn (pyramids of 1 element, 2 elements, 3...etc.) and sum them up. But in how many different ways can we compose a pyramid from k elements? The same number as the count of distinct partitions of k (for example, for k = 6, we can have (6), (1,5), (2,4), and (1,2,3)). A generating function/recurrence for the count of distinct partitions is described in Wikipedia and a sequence at OEIS.
Recurrence, based on the Pentagonal number Theorem:
q(k) = a_k + q(k − 1) + q(k − 2) − q(k − 5) − q(k − 7) + q(k − 12) + q(k − 15) − q(k − 22) − ...
where a_k is (−1)^(abs(m)) if k = 3*m^2 − m for some integer m and is 0 otherwise.
(The subtracted coefficients are generalized pentagonal numbers.)
Since the recurrence described in Wikipedia obliges the calculation of all preceding q(n)'s to arrive at a larger q(n), we can simply sum the results along the way to obtain our result.
JavaScript code:
function numPyramids(n){
    var distinctPartitions = [1,1],
        pentagonals = {},
        m = 1,
        _m = 1, // declared explicitly so it does not leak as an implicit global
        pentagonal_m = 2,
        result = 1;

    // Keys are 3*m^2 - m (twice a generalized pentagonal number), values are abs(m).
    while (pentagonal_m / 2 <= n){
        pentagonals[pentagonal_m] = Math.abs(_m);
        m++;
        _m = m % 2 == 0 ? -m / 2 : Math.ceil(m / 2);
        pentagonal_m = _m * (3 * _m - 1);
    }

    for (var k=2; k<=n; k++){
        distinctPartitions[k] = pentagonals[k] ? Math.pow(-1,pentagonals[k]) : 0;

        var cs = [1,1,-1,-1],
            c = 0;

        // Integer-like keys are iterated in ascending order.
        for (var i in pentagonals){
            if (i / 2 > k)
                break;

            distinctPartitions[k] += cs[c]*distinctPartitions[k - i / 2];
            c = c == 3 ? 0 : c + 1;
        }
        result += distinctPartitions[k];
    }
    return result;
}
console.log(numPyramids(6)); // 13
You are given n dice, each with m faces. You roll all n dice and note the sum of the values shown. If the sum is >= x, you win; otherwise you lose. Find the probability that you win.
I thought of generating all combinations of values 1 to m (of size n) and counting only those whose sum is at least x. The total number of outcomes is m^n.
After that it's just the division of the two.
Is there a better way?
[EDIT: As noted by jpalacek, the time complexity was wrong -- I've now fixed this.]
You can solve this more efficiently with dynamic programming, by first changing it into the question:
How many ways can I get at least x from n dice?
Express this as f(x, n). Then it must be that
f(x, n) = sum(f(x - i, n - 1)) for all 1 <= i <= m.
I.e. if the first die has 1, the remaining n - 1 dice must add up to at least x - 1; if the first die has 2, the remaining n - 1 dice must add up to at least x - 2; and so on.
There are m terms in the sum, so if you memoise this function, it will be O(m^2*n^2), since it will be required to do this summing work at most (m * n) * n times (i.e. once per unique set of inputs to the function, assuming that the first parameter x <= m * n).
As a final step to get a probability, just divide the result of f(x, n) by the total number of possible outcomes, i.e. m^n.
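A short Python sketch of this approach follows. The answer does not spell out the base cases, so the ones used here are assumptions: a target of 0 or less is met by every outcome of the remaining dice, and a positive target cannot be met with no dice left. The function name win_probability is also made up for the example.

```python
from fractions import Fraction
from functools import lru_cache

def win_probability(n, m, x):
    @lru_cache(maxsize=None)
    def ways(target, dice):
        # Number of outcomes of `dice` dice whose sum is at least `target`.
        if target <= 0:
            return m ** dice  # every outcome qualifies
        if dice == 0:
            return 0
        return sum(ways(target - face, dice - 1) for face in range(1, m + 1))

    # Divide by the total number of outcomes, m^n.
    return Fraction(ways(x, n), m ** n)

print(win_probability(2, 6, 7))  # 7/12
```

Returning a Fraction keeps the result exact; convert with float() if a decimal value is preferred.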
Just to add to @j_random_hacker's basically correct answer, you can make it even faster when you note that
f(x, n) = f(x-1, n) - f(x-m-1, n-1) + f(x-1, n-1) if x>m+1
This way, you'll only spend O(1) time calculating each f value.
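Here is one way the O(1)-per-entry recurrence could be used bottom-up, sketched in Python. It assumes that f(y, k) = m^k whenever y <= 0 (every outcome reaches a non-positive target), which lets the identity be applied for all x >= 1; the names ways_at_least and at are invented for this illustration.

```python
def ways_at_least(n, m, x):
    # Number of outcomes of n dice with m faces whose sum is at least x.
    x = max(x, 0)

    def at(row, y, total):
        # Value of a row at index y; non-positive targets are met by all outcomes.
        return total if y <= 0 else row[y]

    prev = [1] + [0] * x  # zero dice: only the target 0 is reachable
    total_prev = 1        # m ** 0
    for _ in range(n):
        total = total_prev * m
        cur = [total] + [0] * x
        for t in range(1, x + 1):
            # f(t, n) = f(t-1, n) - f(t-m-1, n-1) + f(t-1, n-1)
            cur[t] = cur[t - 1] - at(prev, t - m - 1, total_prev) + at(prev, t - 1, total_prev)
        prev, total_prev = cur, total
    return prev[x]

print(ways_at_least(2, 6, 7))  # 21 of the 36 outcomes
```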
// Passing curFace disallows duplicate combinations.
// For 3 dice and sum 8, "2 4 2" and "2 2 4" are the same combination, so they should be counted as one.
// Requires: import java.util.HashMap;
int sums(int totSum, int noDices, int mFaces, int curFace, HashMap<String, Integer> map)
{
    if (noDices <= 0 || totSum <= 0)
        return 0;
    if (noDices == 1)
    {
        // The last die must also be at least curFace to keep the order non-decreasing.
        if (totSum >= curFace && totSum <= mFaces)
            return 1;
        else
            return 0;
    }
    // The count depends on curFace, so it has to be part of the memo key.
    String key = noDices + "-" + totSum + "-" + curFace;
    if (map.containsKey(key))
        return map.get(key);
    int count = 0;
    for (int i = curFace; i <= mFaces; i++)
    {
        count += sums(totSum - i, noDices - 1, mFaces, i, map);
    }
    map.put(key, count);
    return count;
}
Given a positive integer X, how can one partition it into N parts, each between A and B where A <= B are also positive integers? That is, write
X = X_1 + X_2 + ... + X_N
where A <= X_i <= B and the order of the X_is doesn't matter?
If you want to know the number of ways to do this, then you can use generating functions.
Essentially, you are interested in integer partitions. An integer partition of X is a way to write X as a sum of positive integers. Let p(n) be the number of integer partitions of n. For example, if n=5 then p(n)=7 corresponding to the partitions:
5
4,1
3,2
3,1,1
2,2,1
2,1,1,1
1,1,1,1,1
The generating function for p(n) is
sum_{n >= 0} p(n) z^n = Prod_{i >= 1} ( 1 / (1 - z^i) )
What does this do for you? By expanding the right hand side and taking the coefficient of z^n you can recover p(n). Don't worry that the product is infinite since you'll only ever be taking finitely many terms to compute p(n). In fact, if that's all you want, then just truncate the product and stop at i=n.
Why does this work? Remember that
1 / (1 - z^i) = 1 + z^i + z^{2i} + z^{3i} + ...
So the coefficient of z^n is the number of ways to write
n = 1*a_1 + 2*a_2 + 3*a_3 +...
where now I'm thinking of a_i as the number of times i appears in the partition of n.
How does this generalize? Easily, as it turns out. From the description above, if you only want the parts of the partition to be in a given set A, then instead of taking the product over all i >= 1, take the product over only i in A. Let p_A(n) be the number of integer partitions of n whose parts come from the set A. Then
sum_{n >= 0} p_A(n) z^n = Prod_{i in A} ( 1 / (1 - z^i) )
Again, taking the coefficient of z^n in this expansion solves your problem. But we can go further and track the number of parts of the partition. To do this, add in another place holder q to keep track of how many parts we're using. Let p_A(n,k) be the number of integer partitions of n into k parts where the parts come from the set A. Then
sum_{n >= 0} sum_{k >= 0} p_A(n,k) q^k z^n = Prod_{i in A} ( 1 / (1 - q*z^i) )
so taking the coefficient of q^k z^n gives the number of integer partitions of n into k parts where the parts come from the set A.
How can you code this? The generating function approach actually gives you an algorithm for generating all of the solutions to the problem as well as a way to uniformly sample from the set of solutions. Once n and k are chosen, the product on the right is finite.
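As a rough illustration of extracting the coefficient of q^k z^n, the product can be multiplied out one factor at a time with a small table. This Python sketch only counts the solutions; the function name count_partitions and the table layout are assumptions for this example, not something given in the answer.

```python
def count_partitions(n, k, parts):
    # table[j][s] = coefficient of q^j z^s in the product built so far,
    # i.e. the number of partitions of s into exactly j parts from the parts seen so far.
    table = [[0] * (n + 1) for _ in range(k + 1)]
    table[0][0] = 1
    for part in set(parts):
        if part <= 0:
            continue  # only positive parts make sense here
        # Multiplying by 1 / (1 - q*z^part) allows any number of copies of `part`;
        # updating in place with j and s ascending achieves exactly that.
        for j in range(1, k + 1):
            for s in range(part, n + 1):
                table[j][s] += table[j - 1][s - part]
    return table[k][n]

# Partitions of 10 into 3 parts, each between 2 and 4: 4+4+2 and 4+3+3.
print(count_partitions(10, 3, range(2, 5)))  # 2
```

For the original question, parts would be range(A, B + 1), n would be X, and k would be N.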
Here is a Python solution to this problem. It is quite unoptimised, but I have tried to keep it as simple as I can to demonstrate an iterative method of solving this problem.
The results of this method will commonly be a list of max values and min values with maybe 1 or 2 values in between. Because of this, there is a slight optimisation in there (using abs), which prevents the iterator from constantly searching for min values by counting down from max, and vice versa.
There are recursive ways of doing this that look far more elegant, but this will get the job done and hopefully give you an insight into a better solution.
SCRIPT:
# iterative approach in case the number of partitions is particularly large
def splitter(value, partitians, min_range, max_range, part_values):
    # lower bound used to determine if the solution is within reach
    lower_bound = 0
    # upper bound used to determine if the solution is within reach
    upper_bound = 0
    # upper_range used as upper limit for the iterator
    upper_range = 0
    # lower range used as lower limit for the iterator
    lower_range = 0
    # interval will be + or -
    interval = 0

    while value > 0:
        partitians -= 1
        lower_bound = min_range*(partitians)
        upper_bound = max_range*(partitians)
        # if the value is more likely at the upper bound start from there
        if abs(lower_bound - value) < abs(upper_bound - value):
            upper_range = max_range
            lower_range = min_range-1
            interval = -1
        # if the value is more likely at the lower bound start from there
        else:
            upper_range = min_range
            lower_range = max_range+1
            interval = 1

        for i in range(upper_range, lower_range, interval):
            # make sure what we are doing won't break solution
            if lower_bound <= value-i and upper_bound >= value-i:
                part_values.append(i)
                value -= i
                break

    return part_values

def partitioner(value, partitians, min_range, max_range):
    if min_range*partitians <= value and max_range*partitians >= value:
        return splitter(value, partitians, min_range, max_range, [])
    else:
        print ("this is impossible to solve")

def main():
    print(partitioner(9800, 1000, 2, 100))

if __name__ == "__main__":
    main()
The basic idea behind this script is that the value needs to fall between min*parts and max*parts at each step of the solution. If we always achieve this goal, we will eventually end up at min <= value <= max for parts == 1, so if we keep taking away from the value while keeping it within this range, we will always find a result if one is possible.
For this code's example, it will basically always take away either max or min depending on which bound the value is closer to, until some non-min, non-max value is left over as the remainder.
A simple realization you can make is that the average of the X_i must be between A and B, so we can simply divide X by N and then do some small adjustments to distribute the remainder evenly to get a valid partition.
Here's one way to do it:
X_i = ceil (X / N) if i <= X mod N,
floor (X / N) otherwise.
This gives a valid solution if A <= floor (X / N) and ceil (X / N) <= B. Otherwise, there is no solution. See proofs below.
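The construction above fits in a few lines of Python. This is only a sketch: the function name balanced_partition and the choice to return None when no solution exists are assumptions made for the example.

```python
def balanced_partition(X, N, A, B):
    # q = floor(X / N); r = X mod N parts receive one extra unit.
    q, r = divmod(X, N)
    ceil_part = q + 1 if r else q
    if q < A or ceil_part > B:
        return None  # no valid partition exists
    # r parts of size ceil(X / N), the remaining N - r parts of size floor(X / N)
    return [q + 1] * r + [q] * (N - r)

print(balanced_partition(10, 3, 2, 5))  # [4, 3, 3]
```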
sum(X_i) == X
Proof:
Use the division algorithm to write X = q*N + r with 0 <= r < N.
If r == 0, then ceil (X / N) == floor (X / N) == q so the algorithm sets all X_i = q. Their sum is q*N == X.
If r > 0, then floor (X / N) == q and ceil (X / N) == q+1. The algorithm sets X_i = q+1 for 1 <= i <= r (i.e. r copies), and X_i = q for the remaining N - r pieces. The sum is therefore (q+1)*r + (N-r)*q == q*r + r + N*q - r*q == q*N + r == X.
If floor (X / N) < A or ceil (X / N) > B, then there is no solution.
Proof:
If floor (X / N) < A, then floor (X / N) <= A - 1, and since X / N < floor (X / N) + 1 <= A, this means that X < A*N, so even using only the smallest pieces possible, the sum would be larger than X.
Similarly, if ceil (X / N) > B, then ceil (X / N) >= B + 1, and since X / N > ceil (X / N) - 1 >= B, this means that X > B*N, so even using only the largest pieces possible, the sum would be smaller than X.
Is there any known algorithm that can generate a shuffled range [0..n) in linear time and constant space (when output produced iteratively), given an arbitrary seed value?
Assume n may be large, e.g. in the many millions, so the ability to produce every possible permutation is not required, not least because it's infeasible (the seed value space would need to be huge). This is also the reason for the requirement of constant space. (So, I'm specifically not looking for an array-shuffling algorithm, as that requires that the range is stored in an array of length n, and so would use linear space.)
I'm aware of question 162606, but it doesn't present an answer to this particular question - the mappings from permutation indexes to permutations given in that question would require a huge seed value space.
Ideally, it would act like a LCG with a period and range of n, but the art of selecting a and c for an LCG is subtle. Simply satisfying the constraints for a and c in a full period LCG may satisfy my requirements, but I am wondering if there are any better ideas out there.
Based on Jason's answer, I've made a simple, straightforward implementation in C#. Find the next power of two greater than N; call it M. This makes it trivial to generate a and c, since c needs to be relatively prime to M (meaning it can't be divisible by 2, i.e. it must be odd), and (a-1) needs to be divisible by every prime factor of M (only 2) and by 4. Statistically, it should take 1-2 congruences to generate the next number (since 2N >= M >= N).
using System;
using System.Collections.Generic;

class Program
{
    IEnumerable<int> GenerateSequence(int N)
    {
        Random r = new Random();
        int M = NextLargestPowerOfTwo(N);
        int c = r.Next(M / 2) * 2 + 1; // make c any odd number between 0 and M
        int a = r.Next(M / 4) * 4 + 1; // M = 2^m, so make (a-1) divisible by all prime factors, and 4
        int start = r.Next(M);
        int x = start;
        do
        {
            // multiply in 64 bits so that a * x cannot overflow int for large M
            x = (int)(((long)a * x + c) % M);
            if (x < N)
                yield return x;
        } while (x != start);
    }

    int NextLargestPowerOfTwo(int n)
    {
        n |= (n >> 1);
        n |= (n >> 2);
        n |= (n >> 4);
        n |= (n >> 8);
        n |= (n >> 16);
        return (n + 1);
    }

    static void Main(string[] args)
    {
        Program p = new Program();
        foreach (int n in p.GenerateSequence(1000))
        {
            Console.WriteLine(n);
        }
        Console.ReadKey();
    }
}
Here is a Python implementation of the Linear Congruential Generator from FryGuy's answer, because I needed to write it anyway and thought it might be useful for others.
import random
import math

def lcg(start, stop):
    N = stop - start

    # M is the next largest power of 2
    M = int(math.pow(2, math.ceil(math.log(N+1, 2))))

    # c is any odd number between 0 and M
    # (integer division, so this also runs on Python 3)
    c = random.randint(0, M//2 - 1) * 2 + 1

    # M=2^m, so make (a-1) divisible by all prime factors and 4
    a = random.randint(0, M//4 - 1) * 4 + 1

    first = random.randint(0, M - 1)
    x = first
    while True:
        x = (a * x + c) % M
        if x < N:
            yield start + x
        if x == first:
            break

if __name__ == "__main__":
    for x in lcg(100, 200):
        print(x, end=" ")
Sounds like you want an algorithm which is guaranteed to produce a cycle from 0 to n-1 without any repeats. There are almost certainly a whole bunch of these depending on your requirements; group theory would be the most helpful branch of mathematics if you want to delve into the theory behind it.
If you want fast and don't care about predictability/security/statistical patterns, an LCG is probably the simplest approach. The wikipedia page you linked to contains this (fairly simple) set of requirements:
The period of a general LCG is at most m, and for some choices of a much less than that. The LCG will have a full period if and only if:
c and m are relatively prime,
a - 1 is divisible by all prime factors of m
a - 1 is a multiple of 4 if m is a multiple of 4
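The three conditions quoted above are easy to test mechanically. The following Python sketch assumes c is nonzero and uses simple trial division; the helper names prime_factors and has_full_period are made up for this illustration.

```python
from math import gcd

def prime_factors(m):
    # Distinct prime factors of m by trial division.
    factors = set()
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors

def has_full_period(a, c, m):
    # Condition 1: c and m are relatively prime.
    if gcd(c, m) != 1:
        return False
    # Condition 2: a - 1 is divisible by all prime factors of m.
    if any((a - 1) % p != 0 for p in prime_factors(m)):
        return False
    # Condition 3: a - 1 is a multiple of 4 if m is a multiple of 4.
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False
    return True

print(has_full_period(5, 3, 16))  # True
print(has_full_period(3, 3, 16))  # False, since a - 1 = 2 is not a multiple of 4
```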
Alternatively, you could choose a period N >= n, where N is the smallest value that has convenient numerical properties, and just discard any values produced between n and N-1. For example, the lowest N = 2^k - 1 >= n would let you use linear feedback shift registers (LFSR). Or find your favorite cryptographic algorithm (RSA, AES, DES, whatever) and given a particular key, figure out the space N of numbers it permutes, and for each step apply encryption once.
If n is small but you want the security to be high, that's probably the trickiest case, as any sequence S is likely to have a period N much higher than n, and it is also nontrivial to derive a nonrepeating sequence of numbers with a shorter period than N (e.g. if you could take the output of S mod n and guarantee a nonrepeating sequence of numbers, that would give information about S that an attacker might use).
See my article on secure permutations with block ciphers for one way to do it.
Look into Linear Feedback Shift Registers, they can be used for exactly this.
The short way of explaining them is that you start with a seed and then iterate using the formula
x = (x << 1) | f(x)
where f(x) can only return 0 or 1.
If you choose a good function f and keep only the low n bits of x after each step, x will cycle through all values between 1 and 2^n-1 (where n is some number), in a good, pseudo-random way.
Example functions can be found here, e.g. for 63 values (n = 6, with taps at bits 6 and 5 counted from 1) you can use
f(x) = ((x >> 5) & 1) ^ ((x >> 4) & 1)
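To make the iteration concrete, here is a small Python sketch of a 6-bit register. It makes two assumptions that go beyond the text: the state is masked to 6 bits after every shift, and the feedback taps are the two highest bits of that 6-bit state (shifts 5 and 4 when bits are counted from 0). The generator name lfsr6 is invented for the example.

```python
def lfsr6(seed=1):
    # 6-bit linear feedback shift register; the all-zero state is excluded
    # because it would map to itself forever.
    mask = 0x3F
    x = seed & mask
    if x == 0:
        raise ValueError("seed must be nonzero in its low 6 bits")
    start = x
    while True:
        yield x
        # Feedback bit: XOR of the two highest bits of the 6-bit state.
        bit = ((x >> 5) & 1) ^ ((x >> 4) & 1)
        x = ((x << 1) | bit) & mask
        if x == start:
            return  # one full cycle has been produced

values = list(lfsr6(1))
print(len(values), len(set(values)))  # 63 63
```

To shuffle a range [0..n) with n <= 63, one would subtract 1 from each value and skip results that are >= n, in the same way the LCG answers discard out-of-range values.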