Is there a set of test functions for measuring the performance (in terms of speed, possibly traded off against accuracy) of a given algorithm whose task is to find a/the global minimum of a real-valued function over a given interval? And further: is this an open problem, or does there exist a theoretically best algorithm for such a task?
EDIT: there are no restrictions on the function, other than that it should be bounded.
With no restrictions on the function except boundedness, it does not seem possible to always find its global minimum, let alone in reasonable time.
Consider the family of real-valued functions defined on [0..1]:
f(x0) = y0
f(x) = 0 for all other x in [0..1]
For any fixed x0 in [0..1] and y0 < 0, the minimum is at x0.
Still, any algorithm with no prior knowledge of x0 will have a hard time finding it.
Take a function that is 0 at every point where you evaluate f(x), and c, for an unknown c < 0, at every point where you don't. If you want it continuous: whenever x lies between neighbouring points a and b where you evaluated f(a) and f(b), let f go linearly from f(a) = 0 down to f((a + b)/2) = c and linearly back up to f(b) = 0.
Clearly, every time you evaluate f(x) you get a zero. Since you never evaluate anywhere else, your algorithm cannot conclude that the global minimum is anything but zero - which is wrong.
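To make this adversary concrete, here is a minimal Python sketch (the class name is mine, purely illustrative): an oracle that answers 0 to every query and records where it was asked, so that afterwards one can exhibit a bounded function agreeing with every answer whose true minimum is nevertheless negative.

class AdversarialOracle:
    # Answers 0 at every queried point and records the queries.
    def __init__(self):
        self.queried = []

    def __call__(self, x):
        self.queried.append(x)
        return 0.0

# Run any optimizer against an AdversarialOracle instance; afterwards
# pick two neighbouring queried points a < b and define f to dip
# linearly to some c < 0 at (a + b)/2.  This f is bounded and agrees
# with every value the optimizer saw, yet its minimum is c, not 0.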
This is part of a bigger question; it's actually a mathematical problem. It would be really great if someone could direct me to an algorithm that solves it, or some pseudocode would also help.
The question: given an equation, check whether it has an integral solution.
For example:
(26a+5)/32=b
Here a is an integer. Is there an algorithm to determine whether b can be an integer? I need a general solution, not one specific to this question; the equation can vary. Thanks.
Your problem is an example of a linear Diophantine equation. About that, Wikipedia says:
This Diophantine equation [i.e., a x + b y = c] has a solution (where x and y are integers) if and only if c is a multiple of the greatest common divisor of a and b. Moreover, if (x, y) is a solution, then the other solutions have the form (x + k v, y - k u), where k is an arbitrary integer, and u and v are the quotients of a and b (respectively) by the greatest common divisor of a and b.
In this case, (26 a + 5)/32 = b is equivalent to 26 a - 32 b = -5. The gcd of the coefficients of the unknowns is gcd(26, -32) = 2. Since -5 is not a multiple of 2, there is no solution.
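In code, the whole test is a gcd check; a small Python sketch (the function name is mine):

from math import gcd

def has_integral_solution(a, b, c):
    # a*x + b*y = c has integer solutions iff gcd(a, b) divides c
    return c % gcd(a, b) == 0

print(has_integral_solution(26, -32, -5))  # False: 2 does not divide -5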
A general Diophantine equation is a polynomial in the unknowns, and can only be solved (if at all) by more complex methods. A web search might turn up specialized software for that problem.
Linear Diophantine equations take the form ax + by = c. When c = gcd(a, b) (so that a = z'·c and b = z''·c for integers z', z''), this is exactly Bézout's identity

ax + by = gcd(a, b)

and the equation has an infinite number of solutions. More generally, solutions exist precisely when gcd(a, b) divides c. So instead of a trial-search method you can check whether c is a multiple of the greatest common divisor (GCD) of a and b.
If it is, then x and y can be computed using the extended Euclidean algorithm, which finds integers x and y (one of which is typically negative) that satisfy Bézout's identity; multiplying them by c / gcd(a, b) solves the original equation.
(As a side note: this also holds in any other Euclidean domain, e.g. a polynomial ring, and every Euclidean domain is a unique factorization domain.) You can use the iterative method to find these solutions:
Integral solution to equation `a + bx = c + dy`
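For completeness, here is the common iterative form of the extended Euclidean algorithm as a Python sketch (assuming non-negative inputs; scale the returned x, y by c / g to solve ax + by = c whenever g divides c):

def extended_gcd(a, b):
    # Returns (g, x, y) with a*x + b*y = g = gcd(a, b).
    x0, x1, y0, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

g, x, y = extended_gcd(26, 32)
print(g, x, y)  # 2 5 -4, and indeed 26*5 + 32*(-4) == 2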
Hello, and thank you for the attention you're paying to my question :)
My question is about finding an (efficient enough) algorithm for finding orthogonal polynomials of a given weight function f.
I've tried simply applying the Gram-Schmidt algorithm, but it is not efficient enough: it requires O(n^2) integrals. My goal is to use this algorithm to find Hankel determinants of a function f, and the "direct" computation, which consists in simply computing the matrix and taking its determinant, requires only 2n - 1 integrals.
But I want to use the theorem stating that the Hankel determinant of order n of f is a product built from the first n leading coefficients of the orthogonal polynomials of f. The reason is that when n gets large (say about 20), the Hankel determinant gets really big, and my goal is to divide it by another big constant (for n = 20, the constant is of order 10^103). My idea is then to "dilute" the computation of the constant in the product of the leading coefficients.
I hope there is an O(n) algorithm to compute the first n orthogonal polynomials :) I've done some digging and found nothing in that direction for a general function f (f can be any smooth function, actually).
EDIT: let me make precise here what the objects I'm talking about are.
1) A Hankel determinant of order n is the determinant of a square matrix which is constant on the skew diagonals. Thus for example
a b c
b c d
c d e
is a Hankel matrix of size 3 by 3.
2) If you have a function f : R -> R, you can associate to f its "kth moment" which is defined as (I'll write it in tex) f_k := \int_{\mathbb{R}} f(x) x^k dx
With this, you can create a Hankel matrix A_n(f) whose entries are (A_n(f))_{ij} = f_{i+j-2}, that is, something like
f_0 f_1 f_2
f_1 f_2 f_3
f_2 f_3 f_4
With this in mind, it is easy to define the Hankel determinant of f which is simply
H_n(f) := det(A_n(f)). (Of course, it is understood that f has sufficient decay at infinity, this means that all the moments are well defined. A typical choice for f could be the gaussian f(x) = exp(-x^2), or any continuous function on a compact set of R...)
3) What I call orthogonal polynomials of f is a set of polynomials (p_n) such that
\int_{\mathbb{R}} f(x) p_j(x) p_k(x) dx is 1 if j = k and 0 otherwise.
(They are called like that since they form an orthonormal basis of the vector space of polynomials with respect to the scalar product
(p|q) = \int_{\mathbb{R}} f(x) p(x) q(x) dx.)
4) Now, it is basic linear algebra that from any basis of a vector space equipped with a scalar product, you can build an orthonormal basis thanks to the Gram-Schmidt algorithm. This is where the n^2 integrations come from: you start from the basis 1, x, x^2, ..., x^n; you need n(n-1) integrals to make the family orthogonal, and n more to normalize it.
5) There is a theorem saying that if f : R -> R is a function having sufficient decay at infinity, then we have that its Hankel determinant H_n(f) is equal to
H_n(f) = \prod_{j = 0}^{n-1} \kappa_j^{-2}
where \kappa_j is the leading coefficient of the (j+1)-th orthogonal polynomial of f.
Thank you for your answer!
(PS: I tagged octave because I work in Octave, so, with a bit of luck (but I doubt it), there is a built-in function or an existing package managing this kind of thing.)
Orthogonal polynomials obey a recurrence relation, which we can write as
P[n+1] = (X-a[n])*P[n] - b[n-1]*P[n-1]
P[0] = 1
P[1] = X-a[0]
and we can compute the a, b coefficients by
a[n] = <X*P[n]|P[n]> / c[n]
b[n-1] = c[n]/c[n-1]
where
c[n] = <P[n]|P[n]>
(Here < | > is your inner product).
However I cannot vouch for the stability of this process at large n.
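As a rough illustration, here is a Python sketch of this recurrence using numerical quadrature (all names are mine, and the truncation interval [lo, hi] is an assumption to be widened until the moments converge; the structure would be the same in Octave). Each step costs a constant number of integrals, so the first n polynomials need O(n) integrals instead of Gram-Schmidt's O(n^2):

import numpy as np
from numpy.polynomial import polynomial as P
from scipy.integrate import quad

def orthogonal_polys(f, n, lo=-10.0, hi=10.0):
    # Monic orthogonal polynomials of the weight f via the three-term
    # recurrence; polynomials are coefficient arrays, lowest degree first.
    def inner(p, q):
        return quad(lambda x: f(x) * P.polyval(x, p) * P.polyval(x, q),
                    lo, hi)[0]

    X = np.array([0.0, 1.0])                # the polynomial "x"
    polys = [np.array([1.0])]               # P[0] = 1
    c = [inner(polys[0], polys[0])]         # c[0] = <P[0]|P[0]>
    a0 = inner(P.polymul(X, polys[0]), polys[0]) / c[0]
    polys.append(np.array([-a0, 1.0]))      # P[1] = x - a[0]
    for k in range(1, n):
        c.append(inner(polys[k], polys[k]))
        xPk = P.polymul(X, polys[k])        # x * P[k]
        a = inner(xPk, polys[k]) / c[k]
        b = c[k] / c[k - 1]                 # <P[k]|P[k]> / <P[k-1]|P[k-1]>
        polys.append(P.polysub(P.polysub(xPk, a * polys[k]),
                               b * polys[k - 1]))
    return polys, c

Since the monic P[j] has squared norm c[j], the orthonormal polynomial P[j]/sqrt(c[j]) has leading coefficient \kappa_j = c_j^{-1/2}; if I am reading the product formula in the question correctly, that gives H_n(f) = \prod_{j=0}^{n-1} c_j, which is exactly the kind of factored form that lets you divide by your big constant piece by piece.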
The Pollard rho factorization method uses a function generator f(x) = x^2 - a (mod n) or f(x) = x^2 + a (mod n). Does the choice of this (quadratic) function have any significance, or may we use any function (cubic, higher-degree polynomial, or even linear), given that all we need is to find numbers falling into the same congruence class modulo a nontrivial divisor of n?
In Knuth Vol. II (The Art of Computer Programming - Seminumerical Algorithms), section 4.5.4, Knuth says:
Furthermore if f(y) mod p behaves as a random mapping from the set {0, 1, ..., p-1} into itself, exercise 3.1-12 shows that the average value of the least such m will be of order sqrt(p)... From the theory in Chapter 3, we know that a linear polynomial f(x) = ax + c will not be sufficiently random for our purpose. The next simplest case is quadratic, say f(x) = x^2 + 1. We don't know that this function is sufficiently random, but our lack of knowledge tends to support the hypothesis of randomness, and empirical tests show that this f does work essentially as predicted.
The probability theory that says f(x) has a cycle of length about sqrt(p) assumes in particular that there can be two values y and z with f(y) = f(z), since f is chosen at random. The rho in Pollard rho contains exactly such a junction, with multiple tails leading onto the cycle. For a linear function f(x) = ax + b with gcd(a, p) = 1 (which holds whenever the prime p does not divide a), f(y) = f(z) mod p implies y = z mod p, so there are no such junctions.
If you look at http://www.agner.org/random/theory/chaosran.pdf you will see that the expected cycle length of a random function is about the square root of the state size, while the expected cycle length of a random bijection is about the state size itself. If you imagine generating the random function lazily, as you evaluate it, you can see why: when the function is entirely random, every value seen so far is available to be chosen again, so the odds of closing the cycle grow with the cycle length; but when the function has to be invertible, the only way to close the cycle is to hit the starting point, which is much less likely.
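For reference, a bare-bones Pollard rho in Python using the classic quadratic map (an untuned sketch; if it returns n, retry with a different c):

from math import gcd

def pollard_rho(n, c=1):
    f = lambda v: (v * v + c) % n   # the quadratic map discussed above
    x = y = 2
    d = 1
    while d == 1:
        x = f(x)                    # tortoise: one step
        y = f(f(y))                 # hare: two steps
        d = gcd(abs(x - y), n)
    return d                        # a nontrivial factor, or n on failure

print(pollard_rho(8051))            # 8051 = 83 * 97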
OK, here's the deal. I have a bunch of linear functions, a*x + b.
My goal is to answer the following question/query: What is the minimal function at x = q?
E.g., if I have the functions f(x) = 3*x + 2, g(x) = 5*x - 6 and h(x) = 2*x + 1, I would answer, for example:
for x = 4, function h
for x = 2, function g
for x = 1, function g
My idea goes like this:
Sort the functions by the coefficient of x, in decreasing order.
Sort the queries in increasing order
Get rid of the parallel functions, keep the ones with the smallest constant term (e.g.: if I have f(x) = 2*x + 4 and g(x) = 2*x + 2, f(x) will never be smaller than g(x), so I don't need f(x)).
Right now I am on the interval from -inf to some real number, call it w1, and I know that on this interval the function with the highest linear coefficient is the smallest.
Find w1 by taking the smallest x1 s.t. f(x1) = g(x1), where f is my current function and g ranges over all functions with a smaller linear coefficient; set w1 = x1.
As long as my query is in the interval (-inf, w1), output the current function, then proceed to the next query.
If I still have queries to answer, let the current function be the one that intersects the present current function at x = w1, replace -inf by w1, and repeat the same steps.
However, my implementation or idea is not fast enough. Is there anything that I didn't notice that may speed up my program?
Thank you in advance.
Could you not just solve for their intersections, and store the minimal function for each interval in the domain?
edit-
to elaborate: if you solve any pair of functions for x, that x is the value where one of the two functions becomes greater than the other. There will be definable intervals where the minimal function is the same for all values in the interval.
Here's a plot of your 3 example functions.
The intervals(with the corresponding minimal function) of this graph would be
(-∞, 7/3] => 5x - 6
(7/3, ∞) => 2x + 1
Now, at runtime, instead of "What is the minimal function at x = q" you simply do "What interval does q belong to".
And, if I'm not mistaken, with N linear functions you would have at most N intervals to store (at most N-1 breakpoints). And there are specialized data structures you can use to store and search intervals if you really have a lot of functions to analyze.
If I understood correctly, your solution is to do some preprocessing of all your functions so that the domain of x is split into ranges, and in every such range you know which function is minimal.
There are actually two phases: the "preparation" and the "querying" (where, given a specific x, you return the result).
What's your bottleneck?
Naturally, for the "querying" phase to be fast you should organize your ranges in a sorted array, so that you can find the range enclosing a given x by binary search in logarithmic time. If this is what you did and it still isn't fast enough, consider code-level optimizations, because from the algorithmic point of view this seems to be the optimal solution.
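In Python, for instance, the query phase reduces to bisect on the range starts (a sketch; the two ranges are the ones from the example above):

import bisect

# Ranges from the preparation phase: (start_x, function) pairs sorted by
# start_x; each range runs up to the start of the next one.
ranges = [(float('-inf'), '5x - 6'), (7/3, '2x + 1')]
starts = [r[0] for r in ranges]

def minimal_function_at(q):
    i = bisect.bisect_right(starts, q) - 1   # last range starting <= q
    return ranges[i][1]

print(minimal_function_at(4))   # 2x + 1, i.e. h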
If your bottleneck is the "preparation" phase, there are opportunities for optimization here. As I understand it, you find the intersections of all pairs of functions (after getting rid of parallel ones), and this is not really necessary.
Consider the following. First, sort all your functions by their coefficient (higher coefficients at the beginning) and get rid of parallel functions. Next, build the array of ranges while iterating through your functions.
Since the current function has the lowest coefficient among those analyzed so far, it will be the smallest one as x goes to infinity, so its range should run from some x0 to infinity. To find x0, take the last range in the array (belonging to the previously processed function) and compute the intersection of that function with the current one. The former range shrinks so that it ends at x0. If that range becomes invalid (its start is greater than x0), the corresponding function is totally obscured: remove its range and repeat the procedure.
To make things more clear I'll write a pseudo-code:
rangeArr is an array of pairs (F, X), where F is the function description and X is the start of that function's range. The end of a range is the start of the next range, and the last range extends to +infinity.
for each F sorted by coefficient (descending)
{
    double x0;
    while (true)
    {
        if (rangeArr is empty)
        {
            x0 = -inf;   // first surviving function covers everything so far
            break;
        }
        FPrev = rangeArr.back().F;
        xPrev = rangeArr.back().X;
        x0 = IntersectionOf(F, FPrev);
        if (x0 > xPrev)
            break;       // previous range survives, truncated at x0
        rangeArr.DeleteLastRange();   // previous function totally obscured
    }
    rangeArr.InsertRange(F, x0);
}
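The same sweep as runnable Python, under my assumption that functions are given as (a, b) pairs for a*x + b:

def build_ranges(functions):
    # Returns (x0, a, b) triples sorted by x0: from each x0 (the first is
    # -inf) up to the next x0, the stored function is the minimal one.
    functions = sorted(functions, key=lambda f: (-f[0], f[1]))
    kept = []                                  # drop parallel duplicates,
    for a, b in functions:                     # keeping the smaller intercept
        if not kept or kept[-1][0] != a:
            kept.append((a, b))
    ranges = []
    for a, b in kept:
        while ranges:
            x_prev, a_prev, b_prev = ranges[-1]
            x0 = (b - b_prev) / (a_prev - a)   # intersection with last range
            if x0 > x_prev:
                break
            ranges.pop()                       # last function totally obscured
        else:
            x0 = float('-inf')
        ranges.append((x0, a, b))
    return ranges

print(build_ranges([(3, 2), (5, -6), (2, 1)]))
# [(-inf, 5, -6), (2.333..., 2, 1)] -- matching the intervals above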
How do I find the first perfect square value of the function f(n) = An² + Bn + C? B and C are given; A, B, C and n are always integers, and A is always 1. The problem is finding n.
Example: A=1, B=2182, C=3248
The answer for the first perfect square is n=16, because sqrt(f(16))=196.
My algorithm increments n and tests whether the square root is an integer.
This algorithm is very slow when B or C is large, because it takes n calculations to find the answer.
Is there a faster way to do this calculation? Is there a simple formula that can produce an answer?
What you are looking for are integer solutions to a special case of the general quadratic Diophantine equation¹
Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0
where you have
ax^2 + bx + c = y^2
so that A = a, B = 0, C = -1, D = b, E = 0, F = c where a, b, c are known integers and you are looking for unknown x and y that satisfy this equation. Once you recognize this, solutions to this general problem are in abundance. Mathematica can do it (use Reduce[eqn && Element[x|y, Integers], x, y]) and you can even find one implementation here including source code and an explanation of the method of solution.
1: You might recognize this as a conic section. It is, and people have been studying conic sections for thousands of years; as such, our understanding of them is very deep, and your problem is actually quite famous. Their study is an immensely rich and still active area of mathematics.
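Since A = 1 here, there is also an elementary route worth sketching (my own suggestion, not the solver above): complete the square. n² + Bn + C = y² rearranges to (2n + B)² - (2y)² = B² - 4C, so every solution comes from a divisor pair d·e = B² - 4C of equal parity with 2n + B = (d + e)/2. A Python sketch, assuming you want the smallest n >= 0:

from math import isqrt

def first_square_n(B, C):
    # Enumerate divisor pairs d * e = D = B^2 - 4C; each same-parity pair
    # yields u = 2n + B = (d + e) / 2, hence a candidate n.
    D = B * B - 4 * C
    if D == 0:
        return 0                        # then C = (B/2)^2, already a square
    best = None
    for d in range(1, isqrt(abs(D)) + 1):
        if D % d:
            continue
        for dd in (d, -d):              # allow negative divisors too
            e = D // dd
            if (dd + e) % 2:
                continue                # u must be an integer
            u = (dd + e) // 2
            if (u - B) % 2 == 0:
                n = (u - B) // 2
                if n >= 0 and (best is None or n < best):
                    best = n
    return best

print(first_square_n(2182, 3248))       # 16, since f(16) = 196^2

This costs about sqrt(|B² - 4C|) divisions regardless of how far out the first solution lies, so it beats incrementing n when the answer is large; for tiny answers the naive loop may still win.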