Fast algorithm for sum of steps taken by the Euclidean algorithm over pairs of numbers under an upper bound - algorithm

Note: This may involve a good deal of number theory, but the formula I found online is only an approximation, so I believe an exact solution requires some sort of iterative calculation by a computer.
My goal is to find an efficient algorithm (in terms of time complexity) to solve the following problem for large values of n:
Let R(a,b) be the number of steps that the Euclidean algorithm takes to find the GCD of nonnegative integers a and b. That is, R(a,b) = 1 + R(b,a%b), and R(a,0) = 0. Given a natural number n, find the sum of R(a,b) for all 1 <= a,b <= n.
For example, if n = 2, then the solution is R(1,1) + R(1,2) + R(2,1) + R(2,2) = 1 + 2 + 1 + 1 = 5.
Since there are n^2 pairs corresponding to the numbers to be added together, simply computing R(a,b) for every pair can do no better than O(n^2), regardless of the efficiency of R. Thus, to improve the efficiency of the algorithm, a faster method must somehow calculate the sum of R(a,b) over many values at once. There are a few properties that I suspect might be useful:
If a = b, then R(a,b) = 1
If a < b, then R(a,b) = 1 + R(b,a)
R(a,b) = R(ka,kb) where k is some natural number
If b <= a, then R(a,b) = R(a+b,b)
If b <= a < 2b, then R(a,b) = R(2a-b,a)
Because of the first two properties, it is only necessary to find the sum of R(a,b) over pairs where a > b. I tried using this in addition to the third property in a method that computes R(a,b) only for pairs where a and b are also coprime in addition to a being greater than b. The total sum is then n plus the sum of (n / a) * ((2 * R(a,b)) + 1) over all such pairs (using integer division for n / a). This algorithm still had time complexity O(n^2), I discovered, due to Euler's totient function being roughly linear.
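To make the coprime-pair reduction described above concrete, here is a minimal Python sketch (not the original C++ attempt). The function names `R`, `brute_force_sum` and `coprime_pair_sum` are chosen for illustration only. It is still quadratic, as the question notes, and only serves to check the formula against the brute-force sum.

```python
from math import gcd

def R(a, b):
    # Number of Euclidean steps: R(a, b) = 1 + R(b, a % b), R(a, 0) = 0.
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return steps

def brute_force_sum(n):
    # Direct O(n^2) evaluation over all pairs 1 <= a, b <= n.
    return sum(R(a, b) for a in range(1, n + 1) for b in range(1, n + 1))

def coprime_pair_sum(n):
    # n accounts for the diagonal pairs (a, a), each contributing 1.
    total = n
    for a in range(2, n + 1):
        for b in range(1, a):
            if gcd(a, b) == 1:
                # (a, b) and (b, a) together contribute 2 * R(a, b) + 1,
                # and the pair has n // a multiples (ka, kb) within range.
                total += (n // a) * (2 * R(a, b) + 1)
    return total

if __name__ == "__main__":
    print(coprime_pair_sum(100), brute_force_sum(100))
```

Both functions should print the same value for any n; the sketch confirms the reduction but does not improve the asymptotic cost.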
I don't need any specific code solution, I just need to figure out the procedure for a more efficient algorithm. But if the programming language matters, my attempts to solve this problem have used C++.
Side note: I have found that a formula has been discovered that nearly solves this problem, but it is only an approximation. Note that the formula calculates the average rather than the sum, so it would just need to be multiplied by n^2. If the formula could be expanded to reduce the error, it might work, but from what I can tell, I'm not sure if this is possible.

Using Stern-Brocot, due to symmetry, we can look at just one of the four subtrees rooted at 1/3, 2/3, 3/2 or 3/1. The time complexity is still O(n^2), but this obviously performs fewer calculations. The version below uses the subtree rooted at 2/3 (or at least that's the one I looked at to think it through). Also note that we only care about the denominators there, since the numerators are smaller. The code also relies on rules 2 and 3.
C++ code (takes about a tenth of a second for n = 10,000):
#include <iostream>
using namespace std;
long g(int n, int l, int mid, int r, int fromL, int turns) {
    long right = 0;
    long left = 0;
    if (mid + r <= n)
        right = g(n, mid, mid + r, r, 1, turns + (1 ^ fromL));
    if (mid + l <= n)
        left = g(n, l, mid + l, mid, 0, turns + fromL);
    // Multiples
    int k = n / mid;
    // This subtree is rooted at 2/3
    return 4 * k * turns + left + right;
}
long f(int n) {
    // 1/1, 2/2, 3/3 etc.
    long total = n;
    // 1/2, 2/4, 3/6 etc.
    if (n > 1)
        total += 3 * (n >> 1);
    if (n > 2)
        // Technically 3 turns for 2/3 but
        // we can avoid a subtraction
        // per call by starting with 2. (I
        // guess that means it could be
        // another subtree, but I haven't
        // thought it through.)
        total += g(n, 2, 3, 1, 1, 2);
    return total;
}
int main() {
    cout << f(10000);
    return 0;
}

I think this is a hard problem. We can avoid division and reduce the space usage to linear at least via the Stern--Brocot tree.
def f(n, a, b, r):
    return r if a + b > n else r + f(n, a + b, b, r) + f(n, a + b, a, r + 1)
def R_sum(n):
    return sum(f(n, d, d, 1) for d in range(1, n + 1))
def R(a, b):
    return 1 + R(b, a % b) if b else 0
def test(n):
    print(R_sum(n))
    print(sum(R(a, b) for a in range(1, n + 1) for b in range(1, n + 1)))
test(100)

Related

Find the value of f(T) for big value T

I am trying to solve the problem described below.
Given the values of f(0) and k, which are integers,
I need to find the value of f(T), where T <= 10^10.
The recursive function is
f(n) = 2*f(n-1) , if 4*f(n-1) <= k
f(n) = k - ( 2*f(n-1) ) , if 4*f(n-1) > k
My efforts,
#include<iostream>
using namespace std;
int main(){
    long k, f0, i;
    cin >> k >> f0;
    long operation;
    cin >> operation;
    long answer = f0;
    for (i = 1; i <= operation; i++) {
        answer = (4 * answer <= k) ? (2 * answer) : (k - (2 * answer));
    }
    cout << answer;
    return 0;
}
My code gives me the right answer. But the loop will run 10^10 times in the worst case, which gives me Time Limit Exceeded. I need a more efficient solution for this problem. Please help me. I don't know the correct algorithm.
If 2f(0) < k then you can compute this function in O(log n) time (using exponentiation by squaring modulo k).
r = f(0) * 2^n mod k
return 2 * r >= k ? k - r : r
You can prove this by induction. The induction hypothesis is that 0 <= f(n) < k/2, and that the above code fragment computes f(n).
Here's a Python program which checks random test cases, comparing a naive implementation (f) with an optimized one (g).
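def f(n, k, z):
    # Naive implementation: apply the recurrence n times.
    r = z
    for _ in range(n):
        if 4 * r <= k:
            r = 2 * r
        else:
            r = k - 2 * r
    return r
def g(n, k, z):
    # Optimized implementation: modular exponentiation, then fold the result.
    r = (z * pow(2, n, k)) % k
    if 2 * r >= k:
        r = k - r
    return r
import random
errs = 0
# Stops only after 20 mismatches, so it keeps running for as long as f and g agree.
while errs < 20:
    k = random.randrange(100, 10000000)
    n = random.randrange(100000)
    z = random.randrange(k // 2)
    a1 = f(n, k, z)
    a2 = g(n, k, z)
    if a1 != a2:
        print(n, k, z, a1, a2)
        errs += 1
    print('.', end=' ')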
Can you work out a mathematical solution before programming and computing?
Actually,
f(n) = f0*2^(n-1) , if f(n-1)*4 <= k
k - f0*2^(n-1) , if f(n-1)*4 > k
thus, your code could be written like this:
condition = f0*pow(2, operation-2)
answer = condition*4 <= k ? condition*2 : k - condition*2
For a simple loop, your answer looks pretty tight; one could optimise a little bit using answer<<2 instead of 4*answer, and answer<<1 for 2*answer, but quite possibly your compiler is already doing that. If you're blowing the time with this, it might be necessary to reduce the loop itself somehow.
I can't figure out the mathematical pattern that @Shannon was going for, but I'm thinking we could exploit the fact that this function will sooner or later cycle. If the cycle is short enough, then we could shorten the loop by just getting the answer at the same point in the cycle.
So let's get some cycle detection equipment in the form of Brent's algorithm, and see if we can cut the loop to reasonable levels.
def brent(f, x0):
    # main phase: search successive powers of two
    power = lam = 1
    tortoise = x0
    hare = f(x0)  # f(x0) is the element/node next to x0.
    while tortoise != hare:
        if power == lam:  # time to start a new power of two?
            tortoise = hare
            power *= 2
            lam = 0
        hare = f(hare)
        lam += 1
    # Find the position of the first repetition of length λ
    mu = 0
    tortoise = hare = x0
    for i in range(lam):
        # range(lam) produces the values 0, 1, ... , lam-1
        hare = f(hare)
    # The distance between the hare and tortoise is now λ.
    # Next, the hare and tortoise move at same speed until they agree
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1
    return lam, mu
f0 = 2
k = 198779
t = 10000000000
def f(x):
    if 4 * x <= k:
        return 2 * x
    else:
        return k - 2 * x
lam, mu = brent(f, f0)
t2 = t
if t >= mu + lam:  # if T is past the cycle's first loop,
    t2 = (t - mu) % lam + mu  # find the equivalent place in the first loop
x = f0
for i in range(t2):
    x = f(x)
print("Cycle start: %d; length: %d" % (mu, lam))
print("Equivalent result at index: %d" % t2)
print("Loop iterations skipped: %d" % (t - t2))
print("Result: %d" % x)
As opposed to the other proposed answers, this approach actually could use a memo array to speed up the process, since the start of the function is actually calculated multiple times (in particular, inside brent), or it may be irrelevant, depending on how big the cycle happens to be.
The algorithm you proposed already runs in O(n).
To come up with a more efficient algorithm, there are not many directions we can take. Some typical options:
1. Decrease the coefficient of the linear term (but I doubt it would make a difference in this case)
2. Change to O(log n) (typically using some sort of divide-and-conquer technique)
3. Change to O(1)
In this case, we can do the last one.
The recursive function is a piecewise function:
f(n) = 2*f(n-1) , if 4*f(n-1) <= k
f(n) = k - ( 2*f(n-1) ) , if 4*f(n-1) > k
Let's tackle it case by case:
Case 1: if 4*f(n-1) <= k (1) (assuming this holds from index zero onward)
This is obviously a geometric series:
a_n = 2*a_(n-1)
Therefore we have the formula
a_n = 2^n * f(0) ----(*)
Case 2: if 4*f(n-1) > k (2), we have
a_n = -2*a_(n-1) + k
Assume a_j is the first element of the sequence that satisfies condition (2).
Substituting a_(n-1) into the formula repeatedly, you obtain the equation
a_n = k - 2k + 4k - 8k + ... + (-2)^(n-j) * a_j
k - 2k + 4k - 8k + ... is another geometric series:
S_n = k * (1 - (-2)^(n-j)) / (1 - (-2)) ---- geometric series sum formula with starting value k and ratio -2
Therefore, we have a formula for a_n in case 2:
a_n = k * (1 - (-2)^(n-j)) / 3 + (-2)^(n-j) * a_j ----(**)
All that is left is to find the a_j that first violates condition (1) and satisfies (2).
This can again be obtained in constant time using the formula we have for case 1:
find n such that 4*a_n = 4 * 2^n * f(0) > k
solve 4 * 2^n * f(0) = k for n; if n is not an integer, take the ceiling of n
In my first attempt to solve this question, I wrongly assumed that the sequence is monotonically increasing, but in fact the sequence may jump between case 1 and case 2. Therefore, there might not be a constant-time algorithm for the problem.
However, we can use the results above to skip many of the iterative updates.
The overall algorithm will look something like:
start with T, k, and f(0)
compute the n at which the condition switches, using either (*) or (**)
replace f(0) with f(n) and T with T - n
repeat
terminate when T - n = 0 (the last iteration might overshoot, giving T - n < 0; if that happens, you need to step back a little)
Create a map that can store your results. Before finding f(n) check in that map, if solution is already existed or not.
If exists, use that solution.
Otherwise find it, store it for future use.
For C++:
Definition:
map<long,long> result;
Insertion:
result[key] = value;
Accessing:
value = result[key];
Checking:
map<long,long>::iterator it = result.find(key);
if (it == result.end())
{
    // key was not found, find the solution and insert into result
}
else
{
    return result[key];
}
Use the above technique for a better solution.

How do you determine the average-case complexity of this algorithm?

It's usually easy to calculate the time complexity for the best case and the worst case, but when it comes to the average case especially when there's a probability p given, I don't know where to start.
Let's look at the following algorithm to compute the product of all the elements in a matrix:
int computeProduct(int[][] A, int m, int n) {
    int product = 1;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (A[i][j] == 0) return 0;
            product = product * A[i][j];
        }
    }
    return product;
}
Suppose p is the probability of A[i][j] being 0 (i.e. the algorithm terminates there, return 0); how do we derive the average case time complexity for this algorithm?
Let’s consider a related problem. Imagine you have a coin that flips heads with probability p. How many times, on expectation, do you need to flip the coin before it comes up heads? The answer is 1/p, since
There’s a p chance that you need one flip.
There’s a p(1-p) chance that you need two flips (the first flip has to go tails and the second has to go heads).
There’s a p(1-p)^2 chance that you need three flips (the first two flips need to go tails and the third has to go heads)
...
There’s a p(1-p)^(k-1) chance that you need k flips (the first k-1 flips need to go tails and the kth needs to go heads.)
So this means the expected value of the number of flips is
p + 2p(1 - p) + 3p(1 - p)^2 + 4p(1 - p)^3 + ...
= p(1(1 - p)^0 + 2(1 - p)^1 + 3(1 - p)^2 + ...)
So now we need to work out what this summation is. The general form is
p sum from k = 1 to infinity (k(1 - p)^(k-1)).
Rather than solving this particular summation, let's make this more general. Let x be some variable that, later, we'll set equal to 1 - p, but which for now we'll treat as a free value. Then we can rewrite the above summation as
p sum from k = 1 to infinity (kx^(k-1)).
Now for a cute trick: notice that the inside of this expression is the derivative of x^k with respect to x. Therefore, this sum is
p sum from k = 1 to infinity (d/dx x^k).
The derivative is a linear operator, so we can move it out to the front:
p d/dx sum from k = 1 to infinity (x^k)
That inner sum (x + x^2 + x^3 + ...) is the Taylor series for 1 / (1 - x) - 1, so we can simplify this to get
p d/dx (1 / (1 - x) - 1)
= p / (1 - x)^2
And since we picked x = 1 - p, this simplifies to
p / (1 - (1 - p))^2
= p / p^2
= 1 / p
Whew! That was a long derivation. But it shows that the expected number of coin tosses needed is 1/p.
Now, in your case, your algorithm can be thought of as tossing mn coins that come up heads with probability p and stopping if any of them come up heads. Surely, the expected number of coins you’d need to toss won’t be more than the case where you’re allowed to flip infinitely often, so your expected runtime is at most O(1 / p) (assuming p > 0).
If we assume that p is independent of m and n, then we can notice that after some initial growth, each term added to the summation as we increase the number of flips is exponentially smaller than the previous ones. More specifically, a sum truncated after mn flips falls short of the infinite summation only by a factor of 1 - (1 - p)^(mn), which is at least 1 - 1/e once mn ≥ 1/p. Therefore, provided that mn is at least on the order of 1/p, the sum ends up being Θ(1 / p). So in a big-O sense, if p is a constant independent of mn, the runtime is Θ(1 / p).
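As a rough numerical check of the argument above, the following Python sketch compares a closed form for the expected number of cells examined with a simulation. It assumes that the entries are independent and that each is zero with probability p > 0. The names `expected_cells_examined` and `simulate` are made up for this illustration. The closed form comes from summing P(at least k cells are examined) = (1 - p)^(k-1) for k = 1..mn.

```python
import random

def expected_cells_examined(p, cells):
    # Sum of (1 - p)^(k - 1) for k = 1..cells, i.e. a finite geometric series.
    return (1 - (1 - p) ** cells) / p

def simulate(p, cells, trials=20000, seed=0):
    # Monte Carlo estimate: scan up to `cells` entries, stop at the first zero.
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        examined = 0
        for _ in range(cells):
            examined += 1
            if rng.random() < p:  # this entry is zero, the scan returns early
                break
        total += examined
    return total / trials

if __name__ == "__main__":
    for p, cells in [(0.5, 3), (0.1, 50), (0.01, 10000)]:
        print(p, cells, expected_cells_examined(p, cells), simulate(p, cells))
```

For large mn the closed form approaches 1/p, which matches the bound derived above; for small mn it is closer to mn, since the scan rarely stops early.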

Sum of remainders over the entire array for several queries

I am looking at this challenge:
You are provided an array A[ ] of N elements.
Also, you have to answer M queries.
Each query is of following type-
Given a value X, find A[1]%X + A[2]%X + ...... + A[N]%X
1<=N<=100000
1<=M<=100000
1<=X<=100000
1<=elements of array<=100000
I am having a problem in computing this value in an optimized way.
How can we compute this value for different X?
Here is a way that you could at least reduce the multiplicative factor in the time complexity.
In the C standard, the modulo (or remainder) is defined to be a % b = a - (a / b) * b (where / is integer division).
A naive, iterative way (possibly useful on embedded systems with no division unit) to compute the modulo is therefore (pseudo-code):
function remainder (A, B):
    rem = A
    while rem >= B:
        rem -= B
    return rem
But how does this help us at all? Suppose we:
Sort the array A[i] in ascending order (indices 0 to N - 1)
Pre-compute the sum of all elements in A[] -> S
Find the first element (with index I) greater than or equal to X
From the pseudocode above it is clear that at least (one multiple of) X must be subtracted from all elements in the array from index I onwards. Therefore we must subtract (N - I) * X from the sum S.
Even better: we can keep a variable (call it K, initialize to zero) which is equal to the total multiple of X we must subtract from S to find the sum of all remainders. Thus at this stage we could simply add N - I to K.
Repeat the above, finding the first element greater than or equal to the next limit L = 2X, 3X, ... and so on, until we have passed the end of the array.
Finally, the result is given by S - K * X.
Pseudocode:
function findSumOfRemainder (A[N], X):
    sort A
    S = sum A
    K = 0
    L = X
    I = 0
    loop:
        I = lowest index >= I such that A[I] >= L, or N if there is none
        if I == N: break
        K += N - I
        L += X
    return S - K * X
What is the best way to find I at each stage, and how does it relate to the time-complexity?
Binary search: Since the entire array is sorted, to find the first index I at which A[I] >= L, we can just do a binary search on the array (or succeeding sub-array at each stage of the iteration, bounded by [I, N - 1]). This has complexity O( log[N - I + 1] ).
Linear search: Self-explanatory - increment I until A[I] >= L, taking O( N - I + 1 )
You may dismiss the linear search method as being "stupid" - but let's look at the two different extreme cases. For simplicity we can assume that the values of A are "uniformly" distributed.
(max(A) / X) << N: We will have to compute very few values of I; binary search is the preferred method here because the complexity would be bounded by O([max(A) / X] * log[N]), which is much better than that of linear search O(N).
(max(A) / X) ~ N: We will have to compute many values of I, each separated by only a few indices. In this case the total binary search complexity would be bounded by O(log N) + O(log[N-1]) + O(log[N-2]) + ... ~ O(N log N), which is significantly worse than that of linear search.
So which one do we choose? Well this is where I must get off, because I don't know what the optimal answer would be (if there even is one). But the best I can say is to set some threshold value for the ratio max(A) / X - if smaller then choose binary search, else linear.
I welcome any comments on the above + possible improvements; the range constraint of the values may allow better methods for finding values of I (e.g. radix sort?).
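The binary-search variant of the procedure above could be sketched in Python roughly as follows. This is an illustration under a few assumptions: indices are 0-based, the sort and the total are computed once and reused across queries, and the names `preprocess` and `sum_of_remainders` are invented here.

```python
from bisect import bisect_left

def preprocess(values):
    # Sort once and precompute the total; both are shared by all queries.
    sorted_values = sorted(values)
    return sorted_values, sum(sorted_values)

def sum_of_remainders(sorted_values, total, x):
    # Returns sum(v % x for v in values) as total - K * x, where K counts
    # how many times x has to be subtracted over all elements.
    n = len(sorted_values)
    k = 0
    limit = x
    i = 0
    while True:
        # First index at or after i whose element is >= limit.
        i = bisect_left(sorted_values, limit, i)
        if i == n:
            break
        k += n - i       # every element from index i onwards loses one more x
        limit += x
    return total - k * x

if __name__ == "__main__":
    data, total = preprocess([5, 7, 10, 3])
    print(sum_of_remainders(data, total, 3))
```

Each query performs about max(A) / X binary searches, so small X remains the expensive case, which is the trade-off discussed above.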
#include<bits/stdc++.h>
using namespace std;
int main(){
    int t;
    cin >> t;
    while (t--) {
        int n;
        cin >> n;
        int arr[n];
        long long int sum = 0;
        for (int i = 0; i < n; i++) {
            cin >> arr[i];
        }
        cout << accumulate(arr, arr + n, sum) - n << '\n';
    }
}
In case you don't know about accumulate refer this.

Finding median in merged array of two sorted arrays

Assume we have 2 sorted arrays of integers with sizes of n and m. What is the best way to find median of all m + n numbers?
It's easy to do this with log(n) * log(m) complexity. But I want to solve this problem in log(n) + log(m) time. Is there any suggestion for solving this problem?
Explanation
The key idea is to discard part of A or B at each step of the recursion by comparing the middle elements aMid and bMid of the remaining ranges:
if (aMid < bMid) Keep [aMid +1 ... n] and [bLeft ... m]
else Keep [bMid + 1 ... m] and [aLeft ... n]
// where n and m are the lengths of arrays A and B
The time complexity is O(log(m + n)). The code is as follows:
public double findMedianSortedArrays(int[] A, int[] B) {
    int m = A.length, n = B.length;
    int l = (m + n + 1) / 2;
    int r = (m + n + 2) / 2;
    return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
}
public double getkth(int[] A, int aStart, int[] B, int bStart, int k) {
    if (aStart > A.length - 1) return B[bStart + k - 1];
    if (bStart > B.length - 1) return A[aStart + k - 1];
    if (k == 1) return Math.min(A[aStart], B[bStart]);
    int aMid = Integer.MAX_VALUE, bMid = Integer.MAX_VALUE;
    if (aStart + k/2 - 1 < A.length) aMid = A[aStart + k/2 - 1];
    if (bStart + k/2 - 1 < B.length) bMid = B[bStart + k/2 - 1];
    if (aMid < bMid)
        return getkth(A, aStart + k / 2, B, bStart, k - k / 2); // Check: aRight + bLeft
    else
        return getkth(A, aStart, B, bStart + k / 2, k - k / 2); // Check: bRight + aLeft
}
Hope it helps! Let me know if you need more explanation on any part.
Here's a very good solution I found in Java on Stack Overflow. It's a method of finding the K and K+1 smallest items in the two arrays where K is the center of the merged array.
If you have a function for finding the Kth item of two arrays then finding the median of the two is easy;
Calculate the weighted average of the Kth and Kth+1 items of X and Y
But then you'll need a way to find the Kth item of two lists; (remember we're one indexing now)
If X contains zero items then the Kth smallest item of X and Y is the Kth smallest item of Y
Otherwise if K == 1 then the smallest item of X and Y is the smaller of the smallest items of X and Y (min(X[0], Y[0]))
Otherwise;
i. Let A be min(length(X), K / 2)
ii. Let B be min(length(Y), K / 2)
iii. If the X[A] > Y[B] then recurse from step 1. with X, Y' with all elements of Y from B to the end of Y and K' = K - B, otherwise recurse with X' with all elements of X from A to the end of X, Y and K' = K - A
If I find the time tomorrow I will verify that this algorithm works in Python as stated and provide the example source code, it may have some off-by-one errors as-is.
Take the median element in list A and call it a. Compare a to the center elements in list B. Let's call them b1 and b2 (if B has odd length then exactly where you split B depends on your definition of the median of an even-length list, but the procedure is almost identical regardless). If b1 ≤ a ≤ b2 then a is the median of the merged array. This can be done in constant time since it requires exactly two comparisons.
If a is greater than b2 then we add the top half of A to the top of B and repeat. B will no longer be sorted, but it doesn't matter. If a is less than b1 then we add the bottom half of A to the bottom of B and repeat. These will iterate log(n) times at most (if the median is found sooner then stop, of course).
It is possible that this will not find the median. If this is the case then the median is in B. If so, perform the same algorithm with A and B reversed. This will require log(m) iterations. In total you will have performed at most 2*(log(n)+log(m)) iterations of a constant time operation, so you have solved the problem in order log(n)+log(m) time.
This is essentially the same answer as was given by iehrlich, but written out more explicitly.
Yes, this can be done. Given two arrays, A and B, in the worst-case scenario you have to first perform a binary search in A, and then, if it fails, binary search in B looking for the median. On each step of a binary search, you check if the current element is actually a median of a merged A+B array. Such check takes constant time.
Let's see why such check is constant. For simplicity, let's assume that |A| + |B| is an odd number, and that all numbers in both arrays are different. You can remove these restrictions later by applying the usual median definition approach (i.e., how to calculate the median of an array containing duplicates, or of an array with even length). Anyway, given that, we know for sure, that in the merged array there will be (|A| + |B| - 1) / 2 elements to the right and to the left of an actual median. In the process of a binary search in A, we know the index of current element x in array A (let it be i). Now, if x satisfies the condition B[j] < x < B[j+1], where i + j == (|A| + |B| - 1) / 2, then x is your median.
The overall complexity is O(log(max(|A|, |B|))) time and O(1) memory.
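The binary search with a constant-time median check described above might look like the following Python sketch. It keeps the answer's simplifying assumptions (the total length is odd and all values are distinct). It uses 0-based indices, so the indexing differs slightly from the prose, and the helper names `_search` and `median_of_two` are hypothetical.

```python
def _search(a, b):
    # Binary search in a for an element with exactly `half` smaller elements
    # in the merged array; returns None if the median is not in a.
    half = (len(a) + len(b) - 1) // 2
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        i = (lo + hi) // 2
        x = a[i]
        j = half - i  # how many elements of b would have to be smaller than x
        if j > len(b) or (j > 0 and b[j - 1] > x):
            lo = i + 1  # too few elements below x: move right
        elif j < 0 or (j < len(b) and b[j] < x):
            hi = i - 1  # too many elements below x: move left
        else:
            return x    # exactly `half` elements are smaller: x is the median
    return None

def median_of_two(a, b):
    # Try a first; if the median is not there, it must be in b.
    result = _search(a, b)
    if result is None:
        result = _search(b, a)
    return result

if __name__ == "__main__":
    print(median_of_two([1, 3, 8], [2, 7]))
    print(median_of_two([1, 2], [3, 4, 5]))
```

Each call to `_search` does O(1) work per iteration, so the two searches together take O(log|A| + log|B|) time. Handling duplicates or an even total length would need the extra median conventions mentioned in the answer.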

Probability: No of ways to win if you have n dice with m faces each

You are given n dice, each with m faces. You roll all n dice and note the sum of the values shown. If the sum is >= x, you win; otherwise you lose. Find the probability that you win.
I thought of generating all combinations of values 1 to m (of size n) and counting only those whose sum is at least x. The total number of outcomes is m^n.
After that it's just the division of the two.
Is there a better way ?
[EDIT: As noted by jpalacek, the time complexity was wrong -- I've now fixed this.]
You can solve this more efficiently with dynamic programming, by first changing it into the question:
How many ways can I get at least x from n dice?
Express this as f(x, n). Then it must be that
f(x, n) = sum(f(x - i, n - 1)) for all 1 <= i <= m.
I.e. if the first die has 1, the remaining n - 1 dice must add up to at least x - 1; if the first die has 2, the remaining n - 1 dice must add up to at least x - 2; and so on.
There are m terms in the sum, so if you memoise this function, it will be O(m^2*n^2), since it will be required to do this summing work at most (m * n) * n times (i.e. once per unique set of inputs to the function, assuming that the first parameter x <= m * n).
As a final step to get a probability, just divide the result of f(x, n) by the total number of possible outcomes, i.e. m^n.
Just to add to @j_random_hacker's basically correct answer, you can make it even faster when you note that
f(x, n) = f(x-1, n) - f(x-m-1, n-1) + f(x-1, n-1) if x>m+1
This way, you'll only spend O(1) time calculating each of the f value.
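For reference, the memoised recurrence f(x, n) = sum of f(x - i, n - 1) can be sketched in Python as below. This uses the plain m-term sum rather than the O(1) update. The base cases (zero dice reach a target only if it is <= 0, and any target <= 0 is met by all m^dice outcomes) are assumptions filled in for this sketch, and `win_probability` is an illustrative name.

```python
from fractions import Fraction
from functools import lru_cache

def win_probability(n, m, x):
    # Probability that n dice with faces 1..m sum to at least x.
    @lru_cache(maxsize=None)
    def ways(target, dice):
        # Number of ordered outcomes of `dice` dice whose sum is >= target.
        if dice == 0:
            return 1 if target <= 0 else 0
        if target <= 0:
            return m ** dice  # every outcome already meets the target
        return sum(ways(target - face, dice - 1) for face in range(1, m + 1))

    return Fraction(ways(x, n), m ** n)

if __name__ == "__main__":
    print(win_probability(2, 6, 7))
```

Returning a `Fraction` keeps the result exact; converting with `float(...)` gives a decimal probability. For very large n the recursion depth grows with n, so an iterative table may be preferable.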
// Passing curFace disallows duplicate combinations.
// For 3 dice and sum 8, 2 4 2 and 2 2 4 are the same combination, so they should be counted once.
int sums(int totSum, int noDices, int mFaces, int curFace, HashMap<String, Integer> map)
{
    int count = 0;
    if (noDices <= 0 || totSum <= 0)
        return 0;
    if (noDices == 1)
    {
        // The last die must also respect the curFace lower bound
        if (totSum >= curFace && totSum <= mFaces)
            return 1;
        else
            return 0;
    }
    // The result depends on curFace, so it has to be part of the memo key
    String key = noDices + "-" + totSum + "-" + curFace;
    if (map.containsKey(key))
        return map.get(key);
    for (int i = curFace; i <= mFaces; i++)
    {
        count += sums(totSum - i, noDices - 1, mFaces, i, map);
    }
    map.put(key, count);
    return count;
}
