f(x) = (exp(x)-1)/x;
g(x) = (exp(x)-1)/log(exp(x))
Analytically, log(exp(x)) = x, so f(x) = g(x) for all x ≠ 0.
As x approaches 0, both f(x) and g(x) approach 1.
% Compute f and g for x = 10^-1, ..., 10^-15
for k = 1:15
    x(k) = 10^(-k);
    f(k) = (exp(x(k)) - 1) / x(k);
    De(k) = log(exp(x(k)));          % analytically equal to x(k)
    g(k) = (exp(x(k)) - 1) / De(k);
end
% Plot f and g against k
plot(1:15, f, 'r', 1:15, g, 'b');
However, g(x) works much better than f(x) in floating point: the computed f(x) diverges from the true value 1 as x approaches 0, while g(x) stays accurate. Why is g(x) better than f(x)?
It's hard not to give the answer away here, so I'll only point to a few hints:
Look at De... I mean really look at it. Note how, as x gets smaller, De is no longer equal to x.
Now look at exp(x) - 1. Notice a pattern?
Ask yourself: what is eps(1), and why does it matter?
In MATLAB, exp(10^-16) - 1 == 0. Why?
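To see these hints in action, here is a small Python sketch of the same effect (Python's doubles behave like MATLAB's here, and expm1 is the standard cancellation-free way to compute exp(x) - 1):

import math

x = 1e-10
# exp(x) rounds to the nearest double near 1, losing most digits of the tiny "+ x" part:
f = (math.exp(x) - 1) / x                      # ~1.000000082740371: several wrong digits
# log(exp(x)) suffers the *same* rounding as exp(x) - 1, so the two errors cancel:
g = (math.exp(x) - 1) / math.log(math.exp(x))  # ~1.00000000005: accurate
# The cancellation-free way to compute f directly:
h = math.expm1(x) / x                          # also ~1.00000000005
print(f, g, h)

print(math.exp(1e-16) - 1)  # 0.0, because 1e-16 is below eps(1) ~ 2.22e-16

The point of g is that its numerator and denominator share the rounding error of exp(x), so the error divides out.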
I'm reading about proofs. I'm currently working through Mathematics for Computer Science by Eric Lehman and Tom Leighton, and they illustrate in a sample proposition that "as z ranges over the real numbers, e^z takes on every positive, real value at least once". I'm having trouble fully grasping this proposition.
I am trying to approach this as a programmer and think of what it would look like in pseudocode if I were to check whether it was true.
pr = [ all real positive numbers ]
r = [ all real numbers ]
for y in pr:
    for z in r:
        e = pow(y, z)
        if e != y:
            goto outer
print "this is true";
outer
Is this what they are proposing?
∀ y ∈ R+, ∃ z ∈ R, e^z = y
This says that for every y in the set of positive real numbers, there exists a z in the set of real numbers such that exp(z) = y.
You can't really create a program to verify that this is true, most basically because you will run into one of the following problems:
Floating-point imprecision (essential reading: Is floating point math broken?)
The reals are infinite
You could check this over every floating-point number (which would take a very long time but is still theoretically computable), but you would:
Likely run into a y for which there is no z with exp(z) == y, because no z exists exactly enough in the set of floating-point numbers to give exp(z) == y
Even if there were such a z for every y in the set of floating-point numbers, that would not prove exp(z) = y for all y in R+ and z in R
So on the whole, yes, your pseudocode somewhat represents the idea, but it's not viable or logical to check this on a computer, or really to think of it as a computing problem at all.
Edit:
The best way to think about this programmatically would be like this:
R = [SET OF ALL REALS]
R+ = FILTER (> 0) R
(MAP (exp) R) == R+
N.B. exp means the exponential function e^x, where e^x = SUM [ (x^k)/(k!) | k <- [SET OF ALL NATURALS] ], which is approximately 2.718^x.
If you imagine that it were possible to enumerate all the positive real numbers, and moreover to do so in finite time, then pseudocode for your thought experiment might look more like this:
pr = [ all real positive numbers ]
r = [ all real numbers ]
for y in pr:
    for z in r:
        e = exp(z)
        if e == y:
            goto outer
    print "false"
    stop
    outer:
print "true"
The key differences from your pseudocode are:
(technical) changing pow(y, z) to exp(z). This is the exponential function on z, or equivalently, the number e raised to the zth power
The proposition is found to be true (and the algorithm prints that result) only if the outer loop runs to completion
If, on any iteration of the outer loop, the inner one does run to completion (indicating that the y for that iteration is not the exponential of any real number), then the proposition is proved false; in that case the algorithm prints that result and stops. Only one such real number y is needed for the whole proposition to fail.
Of course, an entirely different, mathematical way to describe the proposition is that it says the natural logarithm is defined and evaluates to a real number for all positive real arguments. That follows because the natural logarithm is the inverse of the exponential, so if you take the logarithm of both sides of e^z = y, you get z = log(y).
The proposition is witnessed by a program that, given any y ∈ R+, finds the corresponding z ∈ R:
z = log(y);
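As a tiny, hedged Python sketch of that witness (math.log is the natural logarithm; of course this only checks one y, up to rounding, rather than proving anything):

import math

y = 42.0                # any positive real you like
z = math.log(y)         # the witness z promised by the proposition
print(math.exp(z))      # ~42.0, up to floating-point rounding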
I'm having trouble with a homework exercise.
I need to describe an efficient algorithm which solves the polynomial interpolation problem:
Let P[i,j] be the polynomial interpolation of the points (x_i, y_i), ..., (x_j, y_j). Find 3 simple polynomials q(x), r(x), s(x) of degree 0 or 1 such that:
P[i,j+1](x) = { q(x)·P[i,j](x) - r(x)·P[i+1,j+1](x) } / s(x)
Given the points (x_1, y_1), ..., (x_n, y_n), describe an efficient dynamic programming algorithm based on the recurrence relation which you found in section 1 for computing the coefficients a_0, ..., a_{n-1} of the polynomial interpolation.
Well, I know how to solve the polynomial interpolation problem using the Newton polynomial, which looks quite similar to the above recurrence relation, but I don't see how it helps me find q(x), r(x), s(x) of degree 0 or 1. And assuming I have the correct q(x), r(x), s(x), how do I solve this problem using dynamic programming?
Any help will be much appreciated.
q(x) = x_{j+1} - x
r(x) = x_i - x
s(x) = x_{j+1} - x_i
Here x_i and x_{j+1} denote the x-coordinates at positions i and j+1 in the ordered list of input points.
Some explanations:
First we need to understand what P[i,j](x) means.
Put all your initial (x,y) pairs in the main diagonal of an n x n matrix.
Now you can extract P[0,0](x) to be the y value of the point in your matrix at (0,0).
P[0,1] is the linear interpolation of the points in your matrix at (0,0) and (1,1). This will be a straight-line function:
P[0,1](x) = ( (x_1 - x)·y_0 - (x_0 - x)·y_1 ) / (x_1 - x_0)
(This follows from the recurrence with q(x) = x_1 - x, r(x) = x_0 - x, s(x) = x_1 - x_0; you can check that it returns y_0 at x = x_0 and y_1 at x = x_1.)
P[0,2] is the linear interpolation of the two previous linear interpolations, which means that your y values are now the linear functions you calculated at the previous step.
This recurrence is also the dynamic programming algorithm which builds the full polynomial.
I highly recommend you have a look at this very good lecture, and the full lecture notes.
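To make the dynamic programming concrete, here is a minimal Python sketch of a Neville-style table that evaluates the interpolating polynomial at a single point x using exactly the recurrence above (the name neville and the list-of-lists table are my own choices, and note that it returns the value P[0,n-1](x), not the coefficients a_0, ..., a_{n-1} that the exercise ultimately asks for):

def neville(xs, ys, x):
    # P[i][j] holds the value at x of the polynomial through points i..j;
    # the table is filled diagonal by diagonal in O(n^2) time.
    n = len(xs)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = ys[i]                     # P[i,i](x) = y_i
    for length in range(1, n):              # j - i = length
        for i in range(n - length):
            j = i + length
            q = xs[j] - x                   # q(x) = x_j - x
            r = xs[i] - x                   # r(x) = x_i - x
            s = xs[j] - xs[i]               # s(x) = x_j - x_i
            P[i][j] = (q * P[i][j-1] - r * P[i+1][j]) / s
    return P[0][n-1]

For example, neville([0, 1], [1, 3], 0.5) returns 2.0, the value at the midpoint of the line through (0, 1) and (1, 3).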
I am reading an algorithms textbook and I am stumped by this question:
Suppose we want to compute the value x^y, where x and y are positive
integers with m and n bits, respectively. One way to solve the problem is to perform y - 1 multiplications by x. Can you give a more efficient algorithm that uses only O(n) multiplication steps?
Would this be a divide-and-conquer algorithm? The y-1 multiplications by x would run in Θ(n), right? I don't know where to start with this question.
I understand this better in an iterative way:
You can compute x^z for all powers of two z = 2^0, 2^1, 2^2, ..., 2^(n-1),
simply by going from i = 0 to n-2 and applying x^(2^(i+1)) = x^(2^i) * x^(2^i).
Now you can use these n values to compute x^y:
result = 1
for i = 0 to n-1:
    if the i'th bit in y is on:
        result *= x^(2^i)
return result
All of this is done in O(n) multiplications.
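A minimal Python sketch of this two-step scheme, assuming y fits in n bits (the names pow_by_table and squares are mine):

def pow_by_table(x, y, n):
    # Precompute squares[i] = x^(2^i) for i = 0..n-1, then combine
    # the ones selected by the bits of y: at most 2n - 1 multiplications.
    squares = [x]
    for i in range(1, n):
        squares.append(squares[i-1] * squares[i-1])
    result = 1
    for i in range(n):
        if (y >> i) & 1:                # the i'th bit of y is on
            result *= squares[i]
    return result

For example, pow_by_table(3, 13, 4) returns 1594323, which is 3^13.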
Apply simple recursion for divide and conquer. Here I am posting something more like pseudocode:

power(x, y):
    if y == 1: return x                  // base case
    if y % 2 == 0:
        return power(x*x, y/2)           // (x^2)^(y/2)
    else:
        return x * power(x*x, (y-1)/2)   // x * (x^2)^((y-1)/2)
The y-1 multiplications solution is based on the identity x^y = x * x^(y-1). By repeated application of the identity, you know that you will decrease y down to 1 in y-1 steps.
A better idea is to decrease y more "energetically". Assuming an even y, we have x^y = x^(2*(y/2)) = (x^2)^(y/2). Assuming an odd y, we have x^y = x^(2*((y-1)/2)+1) = x * (x^2)^((y-1)/2).
You see that you can halve y, provided you continue the power computation with x^2 instead of x.
Recursively:
Power(x, y) =
    1                            if y = 0
    x                            if y = 1
    Power(x * x, y / 2)          if y even
    x * Power(x * x, (y-1) / 2)  if y odd
Another way to view it is to read y as a sum of weighted bits: y = b_0 + 2·b_1 + 4·b_2 + 8·b_3 + ...
The properties of exponentiation imply:
x^y = x^(b_0) · x^(2·b_1) · x^(4·b_2) · x^(8·b_3) · ...
    = x^(b_0) · (x^2)^(b_1) · (x^4)^(b_2) · (x^8)^(b_3) · ...
You can obtain the desired powers of x by repeated squaring, and the binary decomposition of y tells you which of them to multiply together, as the sketch below shows.
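Here is a short Python sketch of this binary view, which interleaves the squaring with the bit scan so that no table is needed (one squaring plus at most one extra multiplication per bit of y):

def fast_pow(x, y):
    # Square-and-multiply: O(log y) multiplications for integer y >= 0.
    result = 1
    base = x                 # base runs through x, x^2, x^4, x^8, ...
    while y > 0:
        if y & 1:            # the current lowest bit b_i of y is set
            result *= base
        base *= base         # square for the next bit
        y >>= 1
    return result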
I read about a method to calculate the square root of a number, and the algorithm is as follows:
double findSquareRoot(int n) {
    double x = n;          // overestimate of sqrt(n)
    double y = 1;          // underestimate of sqrt(n)
    double e = 0.00001;    // desired precision
    while (x - y >= e) {
        x = (x + y) / 2;   // new estimate: the average of the two
        y = n / x;         // maintain x * y == n
    }
    return x;
}
My questions regarding this method are:
1. How does it calculate the square root? I don't understand the mathematics behind it. How do x = (x+y)/2 and y = n/x converge to the square root of n?
2. What is the complexity of this algorithm?
It is easy to see if you do some runs and print the successive values of x and y. For example, for n = 100:
50.5 1.9801980198019802
26.24009900990099 3.8109612300726345
15.025530119986813 6.655339226067038
10.840434673026925 9.224722348894286
10.032578510960604 9.96752728032478
10.000052895642693 9.999947104637101
10.000000000139897 9.999999999860103
See, the trick is that if x is not the square root of n, then it lies above or below the real root, and n/x is always on the other side. So if you calculate the midpoint of x and n/x, it will be somewhat nearer to the real root.
And about the complexity: it is actually unbounded, because the real root will never be reached exactly. That's why you have the e parameter.
This is a typical application of Newton's method for calculating the square root of n. You're calculating the limit of the sequence:
x_0 = n
x_{i+1} = (x_i + n / x_i) / 2
Your variable x is the current term x_i and your variable y is n / x_i.
To understand why you have to calculate this limit, you need to think of the function:
f(x) = x^2 - n
You want to find the root of this function. Its derivative is
f'(x) = 2 * x
and Newton's method gives you the formula:
x_{i+1} = x_i - f(x_i) / f'(x_i) = ... = (x_i + n / x_i) / 2
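A minimal Python sketch of this iteration, assuming a positive, moderately sized n (the name newton_sqrt and the stopping test on |x^2 - n| are my own choices; the original program instead compares x and y = n/x):

def newton_sqrt(n, eps=1e-10):
    # Newton's method on f(x) = x^2 - n: x_{i+1} = (x_i + n/x_i) / 2
    x = float(n)                 # x_0 = n
    while abs(x * x - n) >= eps:
        x = (x + n / x) / 2      # one Newton step
    return x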
For completeness, I'm copying here the rationale from #rodrigo's answer, combined with my comment to it. This is helpful if you want to forget about Newton's method and try to understand this algorithm alone.
The trick is that if x is not the square root of n, then it is
an approximation which lies either above or below the real root, and y = n/x is always on the
other side. So if you calculate the midpoint (x+y)/2, it will be
nearer to the real root than the worst of these two approximations
(x or y). When x and y are close enough, you're done.
This will also help you find the complexity of the algorithm. Say that d is the distance of the worse of the two approximations to the real root r. Then the distance between the midpoint (x+y)/2 and r is at most d/2 (drawing the points on a line helps to visualize this). This means that with each iteration the distance is halved, so the worst-case complexity is logarithmic in the ratio between the initial approximation's distance from the root and the precision that is sought. For the given program, it is
log(|n-sqrt(n)|/epsilon)
I think all the information can be found on Wikipedia.
The basic idea is that if x is an overestimate of the square root of a non-negative real number S, then S/x will be an underestimate, and so the average of these two numbers may reasonably be expected to provide a better approximation.
With each iteration this algorithm doubles the number of correct digits in the answer, so its complexity is linear in the logarithm of the desired accuracy.
Why does it work? As stated here, if you do infinitely many iterations you'll converge to some value; let's name it L. L has to satisfy L = (L + N/L)/2 (as in the algorithm), so L = sqrt(N). If you're worried about convergence, you may calculate the squared relative error for each iteration (E_k is the error, A_k is the computed value):
E_k = (A_k / sqrt(N) - 1)^2
If:
A_k = (A_{k-1} + N/A_{k-1}) / 2 and A_k = sqrt(N) · (sqrt(E_k) + 1)
you may derive the recurrence relation for E_k:
E_k = E_{k-1}^2 / [4 · (sqrt(E_{k-1}) + 1)^2]
Its limit is 0, so the limit of the sequence A_1, A_2, ... is sqrt(N).
The mathematical explanation is that, over a small range, the arithmetic mean is a reasonable approximation to the geometric mean, and the geometric mean of x and n/x is exactly the square root of n. As the iterations get closer to the true square root, the difference between the arithmetic mean and the geometric mean vanishes, and the approximation gets very close. Here is my favorite version of Heron's algorithm, which first normalizes the input n to the range 1 ≤ n < 4, then unrolls the loop for a fixed number of iterations that is guaranteed to converge.
def root(n):
    if n < 1: return root(n*4) / 2    # scale up into [1, 4): sqrt(n) = sqrt(4n)/2
    if 4 <= n: return root(n/4) * 2   # scale down into [1, 4): sqrt(n) = 2*sqrt(n/4)
    x = (n+1) / 2                     # initial guess
    x = (x + n/x) / 2                 # five unrolled Newton steps suffice on [1, 4)
    x = (x + n/x) / 2
    x = (x + n/x) / 2
    x = (x + n/x) / 2
    x = (x + n/x) / 2
    return x
I discuss several programs to calculate the square root at my blog.
The Pollard rho factorization method uses a generator function f(x) = x^2 - a (mod n) or f(x) = x^2 + a (mod n). Does the choice of this (quadratic) function have any significance, or may we use any function (cubic, higher-degree polynomial, or even linear)? After all, we only need to find two numbers in the same congruence class modulo a non-trivial divisor of n.
In Knuth Vol II (The Art Of Computer Programming - Seminumerical Algorithms) section 4.5.4 Knuth says
Furthermore if f(y) mod p behaves as a random mapping from the set {0, 1, ..., p-1} into itself, exercise 3.1-12 shows that the average value of the least such m will be of order sqrt(p)... From the theory in Chapter 3, we know that a linear polynomial f(x) = ax + c will not be sufficiently random for our purpose. The next simplest case is quadratic, say f(x) = x^2 + 1. We don't know that this function is sufficiently random, but our lack of knowledge tends to support the hypothesis of randomness, and empirical tests show that this f does work essentially as predicted.
The probability theory that says f(x) has a cycle of length about sqrt(p) assumes in particular that there can be two distinct values y and z such that f(y) = f(z), since f is modeled as a random mapping. The rho shape in Pollard rho contains such a junction, with a tail of values leading onto the cycle. For a linear function f(x) = ax + b with gcd(a, p) = 1 (which is likely, since p is prime), f(y) = f(z) mod p implies y = z mod p, so there are no such junctions.
If you look at http://www.agner.org/random/theory/chaosran.pdf you will see that the expected cycle length of a random function is about the square root of the state-space size, whereas the expected cycle length of a random bijection is about the state-space size itself. If you think of generating the random function lazily, only as you evaluate it, you can see why: when the function is entirely random, every value seen so far is available to be chosen again, so the odds of closing the cycle grow with the cycle length; but when the function has to be invertible, the only way to close the cycle is to generate the starting point, which is much less likely.
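For concreteness, here is a minimal Python sketch of Pollard rho with the classic f(x) = x^2 + a (mod n) and Floyd cycle detection (the default constant a = 1 and the starting point 2 are arbitrary conventional choices; on failure one retries with a different a):

from math import gcd

def pollard_rho(n, a=1):
    # Find a non-trivial factor of composite n, or None if this a fails.
    if n % 2 == 0:
        return 2
    x = y = 2                       # tortoise and hare share a start
    d = 1
    while d == 1:
        x = (x * x + a) % n         # tortoise: one step of f
        y = (y * y + a) % n         # hare: two steps of f
        y = (y * y + a) % n
        d = gcd(abs(x - y), n)      # a collision mod some factor p shows up here
    return d if d != n else None    # d == n: this choice of a failed

For example, pollard_rho(8051) returns 97, since 8051 = 83 · 97.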