Are there any fast algorithms to evaluate a sum of the form (a * n + b) / (c * n + d) for a, b, c, d fixed, and n ranging from 1 to around 10^14 or so?
Obviously, summing each term individually won't work due to the size of the sum.
Edit: An algorithm to sum 1 / (c * n + d) would suffice, since you can split the fraction up and sum each numerator in O(1) time.
You can reduce the summation to that of the inverses of n + α, which yields a shifted harmonic number, corresponding to the digamma function. The value is asymptotic to ln(n) + γ (the Euler-Mascheroni constant), with a correction for the missing/additional initial terms from 1 to floor(α).
For an approximation of the Digamma function, check
https://en.wikipedia.org/wiki/Digamma_function#Computation_and_approximation
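As a rough sketch of how this reduction might look in code: assuming c > 0 and c + d > 0 (so every denominator is positive), the sum of 1/(c*n + d) for n = 1..N equals (psi(N + 1 + d/c) - psi(1 + d/c)) / c, where psi is the digamma function. The helper names `digamma`, `sum_reciprocal_linear` and `sum_linear_fraction` are made up for this illustration, and the digamma here is a plain double-precision approximation (recurrence plus the standard asymptotic series), not a library-grade implementation.

```python
import math

def digamma(x):
    """Approximate psi(x) for x > 0 (illustrative, double precision only)."""
    result = 0.0
    # Recurrence psi(x) = psi(x + 1) - 1/x shifts x up to where the series is accurate.
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic series: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result

def sum_reciprocal_linear(c, d, N):
    """Sum of 1/(c*n + d) for n = 1..N, assuming c > 0 and c + d > 0."""
    alpha = d / c
    return (digamma(N + 1 + alpha) - digamma(1 + alpha)) / c

def sum_linear_fraction(a, b, c, d, N):
    """Sum of (a*n + b)/(c*n + d) for n = 1..N under the same assumptions."""
    # (a*n + b)/(c*n + d) = a/c + (b - a*d/c)/(c*n + d)
    return a * N / c + (b - a * d / c) * sum_reciprocal_linear(c, d, N)

print(sum_linear_fraction(3, 1, 2, 5, 10**14))
```

The cost does not depend on N, but for N near 10^14 the answer is limited by floating-point precision, so an arbitrary-precision library is the safer choice if many digits are needed.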
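Here is an approximation that is especially accurate for larger values of x. Note that both formulas below approximate the trigamma function (the derivative of digamma), not digamma itself; the first is its asymptotic expansion.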
f = lambda x: 7/(6*x**15) - 691/(2730*x**13) + 5/(66*x**11) - 1/(30*x**9) + 1/(42*x**7) - 1/(30*x**5) + 1/(6*x**3) + 1/(2*x**2) + 1/x
# or
f = lambda x: 0.97354060901812177/x**2 + 1/(x + 0.4849142940227510) # relatively low accuracy, optimized for absolute error
eulers_mascheroni = 0.5772156649015328606065120900824024310421
print(f((eulers_mascheroni+(12*5)+7))) # obviously use another language if speed is a bottleneck
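Although you can just use a library, available in most languages, that will do this for you with a well-tested implementation.
In the case of Python you can use something like scipy or mpmath (which also does arbitrary-precision floating-point arithmetic). In mpmath, psi(m, z) is the polygamma function of order m, so psi(0, z) is the digamma function and psi(1, z), used below, is the trigamma function.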
from mpmath import *
mp.dps = 50; mp.pretty = True
print(psi(1,eulers_mascheroni+(12*5)+7))
If you use it, check out their docs; they have great explanations of each function.
from scipy import special
from numpy import arange
x = [*map(lambda c,n,d: c * n + d + eulers_mascheroni,
range(172),arange(24,42,.04),range(96))]
print(special.polygamma(1, x))
I'm working on the following problem:
Let A be an array of length n with each element -10n <= A[i] <= 10n. Create an algorithm running in O(nlog(n)) time that determines whether or not there exist entries A[i], A[j], and A[k] (i, j, and k not necessarily distinct) such that A[i] + A[j] + A[k] = 0.
I'm approaching it in the following way. Define a polynomial p of degree n-1 such that the coefficient on the x^k term is A[k]. Then use the FFT to multiply p with itself, and then multiply the resulting polynomial again by p. If any of the coefficients in the resulting polynomial are 0, then return true. Else, return false. Since the FFT is O(nlog(n)), this algorithm is then O(nlog(n)).
The problem I'm running into is that the FFT combines like terms, so to speak. Thus, the existence of a coefficient 0 does not imply that such entries exist.
Could anyone suggest a modification to this algorithm to improve it?
If I remember it right, the way to solve this problem is:
Define a polynomial of degree 20n, stored with 60n + 1 coefficients so that its cube fits, where the coefficient on the term x^k is the number of occurrences of element k - 10n in the array. For instance, if n=8, the coefficient on x^5 is the number of occurrences of -75 (-75 = 5 - 10*8).
Use FFT to raise that polynomial to the third power (the result has degree at most 60n).
See if the coefficient on x^(30n) is non-zero. If it is, there's an answer.
Here's a sample implementation in Python; it seems to work for all the cases I came up with:
import numpy as np
from numpy.fft import fft, ifft

def hasZeroSum(a):
    n = len(a)
    # b[k] counts occurrences of the value k - 10n; padded so the cube fits
    b = [0 for x in range(n * 60 + 1)]
    for el in a:
        b[el + 10 * n] += 1
    f = fft(b, n * 60 + 1)
    f = np.power(f, 3)
    res = ifft(f, n * 60 + 1)
    # a zero sum corresponds to the exponent 10n + 10n + 10n = 30n
    return np.absolute(res[n * 30]) > 0.5

print(hasZeroSum([-11, -5, 2, 3, 7]))
print(hasZeroSum([-11, -5, 2, 4, 8]))
Prints
True
False
Let f be a function defined on the non-negative integers n ≥ 0. Suppose f is known to be U-shaped (convex and eventually increasing). How to find its minimum? That is, m such that f(m) ≤ f(n) for all n.
Examples of U-shaped functions:
n**2 - 1000*n + 100
(1 + 1/2 + ... + 1/n) + 1000/sqrt(1+n)
Of course, a human mathematician can try to minimise these particular functions using calculus. For my computer though, I want a general search algorithm that can minimise any U-shaped function.
Those functions again, in Python, to help anyone who wants to test an algorithm.
from math import sqrt
f = lambda n: n**2 - 1000*n + 100
g = lambda n: sum(1/i for i in range(1,n+1)) + 1000/sqrt(1+n)
Don't necessarily need code (of any language) in an answer, just a description of an algorithm. Would interest me though to see its answers for these specific functions.
You are probably looking for ternary search.
Ternary search will find the m you want in O(log N) evaluations of f, where N is the number of points in the search range.
It takes two points m1 and m2 in the range (l, r), compares f(m1) with f(m2), discards the outer third of the range that cannot contain the minimum, and repeats on the remaining two thirds.
Code in Python (from Wikipedia):
def ternarySearch(f, left, right, absolutePrecision):
    while True:
        #left and right are the current bounds; the minimum is between them
        if abs(right - left) < absolutePrecision:
            return (left + right)/2
        leftThird = (2*left + right)/3
        rightThird = (left + 2*right)/3
        if f(leftThird) < f(rightThird):
            right = rightThird
        else:
            left = leftThird
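The version above works on a continuous interval and needs explicit bounds. For the integer domain in the question, a sketch along the following lines might be used instead. It first brackets the minimum by doubling an upper bound and then runs ternary search on integers. The name `argmin_convex_int` is invented for this example, and the code assumes f is convex and eventually increasing, as stated in the question.

```python
def argmin_convex_int(f):
    """Return some m >= 0 with f(m) <= f(n) for all n >= 0 (f convex, eventually increasing)."""
    # Bracket: once f(hi + 1) >= f(hi), convexity implies a minimiser lies in [0, hi].
    hi = 1
    while f(hi + 1) < f(hi):
        hi *= 2
    lo = 0
    # Integer ternary search: each step drops roughly a third of [lo, hi].
    while hi - lo > 2:
        step = (hi - lo) // 3
        m1 = lo + step
        m2 = hi - step
        if f(m1) < f(m2):
            hi = m2 - 1   # f is strictly increasing from some point before m2
        else:
            lo = m1 + 1   # f is non-increasing up to at least m1 + 1
    # At most three candidates remain; pick the best one directly.
    return min(range(lo, hi + 1), key=f)

print(argmin_convex_int(lambda n: n**2 - 1000*n + 100))
```

For the first example function this reports 500. The second example, g, is costly to evaluate as written because each call sums n terms, so a cached or incremental harmonic sum would be advisable before searching it.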
If your function is known to be unimodal, use Fibonacci search. http://en.wikipedia.org/wiki/Fibonacci_search_technique
For a discrete domain, the way to decide where new "test points" are probed must be slightly adapted as the formulas for the continuous domain don't yield integers. Anyway the working principle remains.
As regards the number of tests required, we have the following hierarchy:
#Fibonacci < #Golden < #Ternary < #Dichotomic
This also works. Use binary search on the discrete derivative f'(n) = f(n) - f(n-1) to find the largest n with f'(n) <= 0.
def minimise_convex(f):
    """Given a U-shaped (convex and eventually increasing) function f, find its minimum over the non-negative integers. That is m such that f(m) <= f(n) for all n. If there exist multiple solutions, return the largest. Uses binary search on the derivative."""
    f_prime = lambda n: (f(n) - f(n-1)) if n > 0 else 0
    return binary_search(f_prime, 0)
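The answer specifies binary_search only by its docstring, so here is one possible sketch of it, not necessarily what the author had in mind. It doubles an upper bound until the condition fails and then bisects. It assumes that f(n) <= t holds for an initial run of n and fails for every larger n, which is the case for f_prime when f is convex and eventually increasing. minimise_convex is repeated so the block runs on its own.

```python
def binary_search(f, t):
    """Greatest non-negative integer n with f(n) <= t, or None if f(0) > t.

    Assumes f(n) <= t is true up to some n and false afterwards,
    and that it does become false eventually (otherwise this loops forever).
    """
    if f(0) > t:
        return None
    lo, hi = 0, 1
    # Doubling phase: afterwards f(lo) <= t < f(hi).
    while f(hi) <= t:
        lo, hi = hi, 2 * hi
    # Bisection phase keeps the invariant f(lo) <= t < f(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(mid) <= t:
            lo = mid
        else:
            hi = mid
    return lo

def minimise_convex(f):
    """Largest minimiser of a convex, eventually increasing f on n >= 0 (copied from the answer)."""
    f_prime = lambda n: (f(n) - f(n - 1)) if n > 0 else 0
    return binary_search(f_prime, 0)

print(minimise_convex(lambda n: n**2 - 1000*n + 100))
```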
Where binary_search is defined as
def binary_search(f, t):
    """Given an increasing function f, find the greatest non-negative integer n such that f(n) <= t. If f(n) > t for all n, return None."""
Is there a name for this operation? And: is there a closed-form expression?
For a given set of n elements, and value k between 1 and n,
Take all subsets (combinations) of k items
Find the product of each subset
Find the sum of all those products
I can express this in Python, and do the calculation pretty easily:
from operator import mul
from itertools import combinations
from functools import reduce

def sum_of_product_of_subsets(list1, k):
    val = 0
    for subset in combinations(list1, k):
        val += reduce(mul, subset)
    return val
I'm just looking for the closed form expression, so as to avoid the loop in case the set size gets big.
Note this is NOT the same as this question: Sum of the product over all combinations with one element from each group -- that question is about the sum-of-products of a Cartesian product. I'm looking for the sum-of-products of the set of combinations of size k; I don't think they are the same.
To be clear, for set(a, b, c, d), then:
k = 4 --> a*b*c*d
k = 3 --> b*c*d + a*c*d + a*b*d + a*b*c
k = 2 --> a*b + a*c + a*d + b*c + b*d + c*d
k = 1 --> a + b + c + d
Just looking for the expression; no need to supply the Python code specifically. (Any language would be illustrative, if you'd like to supply an example implementation.)
These are elementary symmetric polynomials. You can write them using summation signs as in Wikipedia. You can also use Vieta's formulas to get all of them at once as coefficients of a polynomial (up to signs)
(x-a_1)(x-a_2)...(x-a_k) =
x^k -
(a_1 + ... + a_k) x^(k-1) +
(a_1 a_2 + a_1 a_3 + ... + a_(k-1) a_k) x^(k-2) +
... +
(-1)^k a_1 ... a_k
By expanding (x-a_1)(x-a_2)...(x-a_k) you get a polynomial time algorithm to compute all those numbers (your original implementation runs in exponential time).
Edit: Python implementation:
from itertools import chain

l = [2,3,4]
x = [1]
for i in l:
    # multiply the current polynomial by (1 + i*t); coefficients are stored highest k first
    x = [a + b*i for a,b in zip(chain([0],x), chain(x,[0]))]
print(x)
That gives you [24, 26, 9, 1], as 2*3*4=24, 2*3+2*4+3*4=26, 2+3+4=9. That last 1 is the empty product, which corresponds to k=0 in your implementation.
This should be O(N^2). Using polynomial FFT you could do O(N log^2 N), but I am too lazy to code that.
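To make the link back to the question's function explicit, here is a small sketch that wraps the same recurrence in a function and checks it against the brute-force definition. The name `elementary_symmetric` and the ordering (index k holds the sum for subsets of size k) are choices made for this example only.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_n], where e_k is the sum of products of all k-subsets."""
    coeffs = [1]  # e_0 is the empty product
    for v in values:
        # Multiply by (1 + v*t): new e_k = old e_k + v * old e_(k-1)
        coeffs = [a + v * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def brute_force(values, k):
    """Direct definition, exponential in len(values); only for checking."""
    return sum(prod(s) for s in combinations(values, k))

values = [2, 3, 4]
print(elementary_symmetric(values))  # [1, 9, 26, 24]
print([brute_force(values, k) for k in range(len(values) + 1)])
```

Note the list is reversed relative to the snippet above, so the value for a given k can be read off as `elementary_symmetric(values)[k]`.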
I have just run into the same problem elsewhere and I might have an easier solution.
Basically, if you want the sum over all non-empty subsets (every k from 1 to n at once), the closed form you are looking for is this one:
(1+e_1)*(1+e_2)*(1+e_3)*...*(1+e_n) - 1
where S={e_1, e_2, ..., e_n} is the set in question.
Here is why:
Let 'm' be the product of the elements of S (m=e_1*e_2*...*e_n).
If you look at the original products of elements of subsets, you can see, that all of those products are divisors of 'm'.
Now apply the Divisor function to 'm' (from now on called sigma(m) ) with one modification: consider all e_i elements as 'primes' (because we don't want them to be divided), so this gives sigma(e_i)=e_i+1 .
Then if you apply sigma to m:
sigma(m)=sigma(e_1*e_2*...*e_n)=1+[e_1+e_2+...+e_n]+[e_1*e_2+e_1*e_3+...+e_(n-1)*e_n]+[e_1*e_2*e_3+e_1*e_2*e_4+...+e_(n-2)*e_(n-1)*e_n]+...+[e_1*e_2*...*e_n]
This is what the original problem was. (Except for the 1 in the beginning).
Our divisor function is multiplicative, so the previous equation can be rewritten as following:
sigma(m)=(1+e_1)*(1+e_2)*(1+e_3)*...*(1+e_n)
There is one correction you need here. The empty subset is taken into account by sigma but is not present in the original problem, and it contributes the '1' at the beginning of the first equation.
So the closed form you need is:
(1+e_1)*(1+e_2)*(1+e_3)*...*(1+e_n) - 1
Sorry, I can't really code that, but I think the computation shouldn't take more than 2n-1 loops.
(You can read more about the divisor function here: http://en.wikipedia.org/wiki/Divisor_function)
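Since the answer has no code, here is a minimal sketch of the formula, checked against a brute-force sum. Note what it computes: the total over all non-empty subsets, i.e. every k from 1 to n added together, rather than the value for one fixed k. The function names are invented for this example.

```python
from itertools import combinations
from math import prod

def sum_over_nonempty_subsets(values):
    """(1 + e_1) * (1 + e_2) * ... * (1 + e_n) - 1, a single pass over the input."""
    return prod(1 + v for v in values) - 1

def brute_force_all_k(values):
    """Sum of subset products over every subset size k = 1..n; only for checking."""
    return sum(
        prod(subset)
        for k in range(1, len(values) + 1)
        for subset in combinations(values, k)
    )

values = [2, 3, 4]
print(sum_over_nonempty_subsets(values))  # 59 = 9 + 26 + 24
print(brute_force_all_k(values))
```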
I keep getting these hard interview questions. This one really baffles me.
You're given a function poly that takes and returns an int. It's actually a polynomial with nonnegative integer coefficients, but you don't know what the coefficients are.
You have to write a function that determines the coefficients using as few calls to poly as possible.
My idea is to use recursion, knowing that I can get the last coefficient by poly(0). So I want to replace poly with (poly - poly(0))/x, but I don't know how to do this in code, since I can only call poly. Anyone have an idea how to do this?
Here's a neat trick.
int N = poly(1)
Now we know that every coefficient in the polynomial is at most N.
int B = poly(N+1)
Now expand B in base N+1 and you have the coefficients.
Attempted explanation: Algebraically, the polynomial is
poly = p_0 + p_1 * x + p_2 * x^2 + ... + p_k * x^k
If you have a number b and expand it in base n, then you get
b = b_0 + b_1 * n + b_2 * n^2 + ...
where each b_i is uniquely determined and b_i < n.
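Putting the two pieces together: every coefficient p_i is at most N = poly(1), hence strictly less than N + 1, so the base-(N+1) digits of poly(N+1) are exactly the coefficients. A sketch of the trick in Python follows. The name `recover_coefficients` is made up here, and the code leans on Python's arbitrary-precision integers; with a fixed-width int, poly(N+1) could overflow.

```python
def recover_coefficients(poly):
    """Return [p_0, p_1, ..., p_k] for a polynomial with non-negative integer coefficients.

    Uses exactly two calls to poly.
    """
    n = poly(1)            # sum of all coefficients, so each p_i <= n
    if n == 0:
        return []          # the zero polynomial
    base = n + 1
    value = poly(base)     # the digits of this number in base n + 1 are the coefficients
    coeffs = []
    while value > 0:
        value, digit = divmod(value, base)
        coeffs.append(digit)
    return coeffs

# Example: 3 + 5x^2 + x^3
print(recover_coefficients(lambda x: 3 + 5 * x**2 + x**3))  # [3, 0, 5, 1]
```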
I have a series
S = i^(m) + i^(2m) + ... + i^(km) (mod m)
0 <= i < m, k may be very large (up to 100,000,000), m <= 300000
I want to find the sum. I cannot apply the Geometric Progression (GP) formula because then result will have denominator and then I will have to find modular inverse which may not exist (if the denominator and m are not coprime).
So I made an alternate algorithm making an assumption that these powers will make a cycle of length much smaller than k (because it is a modular equation and so I would obtain something like 2,7,9,1,2,7,9,1....) and that cycle will repeat in the above series. So instead of iterating from 0 to k, I would just find the sum of numbers in a cycle and then calculate the number of cycles in the above series and multiply them. So I first found i^m (mod m) and then multiplied this number again and again taking modulo at each step until I reached the first element again.
But when I actually coded the algorithm, for some values of i, I got cycles which were of very large size. And hence took a large amount of time before terminating and hence my assumption is incorrect.
So is there any other pattern we can find out? (Basically I don't want to iterate over k.)
So please give me an idea of an efficient algorithm to find the sum.
This is the algorithm for a similar problem I encountered
You probably know that one can calculate the power of a number in logarithmic time. You can also do so for calculating the sum of the geometric series. Since it holds that
1 + a + a^2 + ... + a^(2*n+1) = (1 + a) * (1 + (a^2) + (a^2)^2 + ... + (a^2)^n),
you can recursively calculate the geometric series on the right hand to get the result.
This way you do not need division, so you can take the remainder of the sum (and of intermediate results) modulo any number you want.
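A possible recursive sketch of this idea is shown below. The identity above handles an even number of terms; for an odd number of terms the leading 1 is peeled off first. The function names and the wrapper `series_sum` (which adapts the result to the series in the question, where the first term is i^m rather than 1) are assumptions made for illustration.

```python
def geometric_sum_mod(a, n_terms, m):
    """Return (1 + a + a^2 + ... + a^(n_terms - 1)) % m without any division."""
    if n_terms == 0:
        return 0
    if n_terms % 2 == 1:
        # 1 + a * (1 + a + ... + a^(n_terms - 2))
        return (1 + a * geometric_sum_mod(a, n_terms - 1, m)) % m
    # Even count: (1 + a) * (1 + a^2 + (a^2)^2 + ... + (a^2)^(n_terms/2 - 1))
    return ((1 + a) * geometric_sum_mod(a * a % m, n_terms // 2, m)) % m

def series_sum(i, k, m):
    """i^m + i^(2m) + ... + i^(km), all taken mod m."""
    r = pow(i, m, m)  # common ratio, computed by fast modular exponentiation
    return r * geometric_sum_mod(r, k, m) % m

print(series_sum(7, 100_000_000, 300_000))
```

The recursion depth is at most about 2*log2(k), so k = 100,000,000 is well within Python's default recursion limit.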
As you've noted, doing the calculation for an arbitrary modulus m is difficult because many values might not have a multiplicative inverse mod m. However, if you can solve it for a carefully selected set of alternate moduli, you can combine them to obtain a solution mod m.
Factor m into p_1, p_2, p_3 ... p_n such that each p_i is a power of a distinct prime
Since each p is a distinct prime power, they are pairwise coprime. If we can calculate the sum of the series with respect to each modulus p_i, we can use the Chinese Remainder Theorem to reassemble them into a solution mod m.
For each prime power modulus, there are two trivial special cases:
If i^m is congruent to 0 mod p_i, the sum is trivially 0.
If i^m is congruent to 1 mod p_i, then the sum is congruent to k mod p_i.
For other values, one can apply the usual formula for the sum of a geometric sequence:
S = sum(j=0 to k, (i^m)^j) = ((i^m)^(k+1) - 1) / (i^m - 1)
TODO: Prove that (i^m - 1) is coprime to p_i or find an alternate solution for when they have a nontrivial GCD. Hopefully the fact that p_i is a prime power and also a divisor of m will be of some use... If p_i is a divisor of i, the condition holds. If p_i is prime (as opposed to a prime power), then either the special case i^m = 1 applies, or (i^m - 1) has a multiplicative inverse.
If the geometric sum formula isn't usable for some p_i, you could rearrange the calculation so you only need to iterate from 1 to p_i instead of 1 to k, taking advantage of the fact that the terms repeat with a period no longer than p_i.
(Since your series doesn't contain a j=0 term, the value you want is actually S-1.)
This yields a set of congruences mod p_i, which satisfy the requirements of the CRT.
The procedure for combining them into a solution mod m is described in the above link, so I won't repeat it here.
This can be done via the method of repeated squaring, which is O(log(k)) time, or O(log(k)log(m)) time, if you consider m a variable.
In general, a[n]=1+b+b^2+...+b^(n-1) mod m can be computed by noting that:
a[j+k]==b^{j}a[k]+a[j]
a[2n]==(b^n+1)a[n]
The second is just a corollary of the first.
In your case, b=i^m can be computed in O(log m) time.
The following Python code implements this:
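def geometric(n,b,m):
    """Return (1 + b + b^2 + ... + b^(n-1)) mod m."""
    T=1        # a[2^j]: the sum of the first 2^j terms
    e=b%m      # b^(2^j) mod m
    total = 0
    while n>0:
        if n&1==1:
            total = (e*total + T)%m
        T = ((e+1)*T)%m
        e = (e*e)%m
        n = n//2
        # print('{} {} {}'.format(total,T,e))
    return total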
This bit of magic has a mathematical reason - the operation on pairs defined as
(a,r)#(b,s)=(ab,as+r)
is associative, and the first rule above basically means that:
(b,1)#(b,1)#... n times ... #(b,1)=(b^n,1+b+b^2+...+b^(n-1))
Repeated squaring always works when operations are associative. In this case, the # operator is O(log(m)) time, so repeated squaring takes O(log(n)log(m)).
One way to look at this is as matrix exponentiation:
[[b,1],[0,1]]^n == [[b^n,1+b+...+b^(n-1)],[0,1]]
You can use a similar method to compute (a^n-b^n)/(a-b) modulo m because matrix exponentiation gives:
[[b,1],[0,a]]^n == [[b^n,a^(n-1)+a^(n-2)b+...+ab^(n-2)+b^(n-1)],[0,a^n]]
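The matrix view can be sketched directly with 2x2 matrices and repeated squaring. The helper names below are hypothetical and the matrices are plain nested lists. Setting a = 1 gives back the ordinary geometric sum 1 + b + ... + b^(n-1).

```python
def mat_mul(x, y, m):
    """Product of two 2x2 matrices with entries reduced mod m."""
    return [
        [(x[0][0] * y[0][0] + x[0][1] * y[1][0]) % m,
         (x[0][0] * y[0][1] + x[0][1] * y[1][1]) % m],
        [(x[1][0] * y[0][0] + x[1][1] * y[1][0]) % m,
         (x[1][0] * y[0][1] + x[1][1] * y[1][1]) % m],
    ]

def mat_pow(mat, n, m):
    """mat^n mod m by repeated squaring (n >= 0)."""
    result = [[1 % m, 0], [0, 1 % m]]  # identity matrix
    while n > 0:
        if n & 1:
            result = mat_mul(result, mat, m)
        mat = mat_mul(mat, mat, m)
        n >>= 1
    return result

def power_difference_quotient(a, b, n, m):
    """a^(n-1) + a^(n-2)*b + ... + b^(n-1) mod m, i.e. (a^n - b^n)/(a - b) without dividing."""
    return mat_pow([[b % m, 1], [0, a % m]], n, m)[0][1]

print(power_difference_quotient(5, 3, 10, 1000))  # (5^10 - 3^10) / 2 mod 1000 = 288
print(power_difference_quotient(1, 2, 10, 1000))  # 1 + 2 + ... + 2^9 = 1023 -> 23
```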
Based on the approach of @braindoper, a complete algorithm which calculates
1 + a + a^2 + ... +a^n mod m
looks like this in Mathematica:
geometricSeriesMod[a_, n_, m_] :=
Module[ {q = a, exp = n, factor = 1, sum = 0, temp},
While[And[exp > 0, q != 0],
If[EvenQ[exp],
temp = Mod[factor*PowerMod[q, exp, m], m];
sum = Mod[sum + temp, m];
exp--];
factor = Mod[Mod[1 + q, m]*factor, m];
q = Mod[q*q, m];
exp = Floor[ exp /2];
];
Return [Mod[sum + factor, m]]
]
Parameters:
a is the "ratio" of the series. It can be any integer (including zero and negative values).
n is the highest exponent of the series. Allowed are integers >= 0.
m is the integer modulus != 0
Note: The algorithm performs a Mod operation after every arithmetic operation. This is essential, if you transcribe this algorithm to a language with a limited word length for integers.