MATLAB - Newton's method algorithm

I have written the following function to find a root of a function in MATLAB using Newton's method (I set r = -7 in my solution):
function newton(r);
syms x;
y = exp(x) - 1.5 - atan(x);
yprime = diff(y,x);
f = matlabFunction(y);
fprime = matlabFunction(yprime);
x = r;
xvals = x
for i=1:8
u = x;
x = u - f(r)/fprime(r);
xvals = x
end
The code runs without any errors, but the numbers keep decreasing at every iteration even though, according to my textbook, the iteration should converge to roughly x = -14. My results are correct for the first two iterations, but then they go beyond -14 and finally end up at roughly -36.4 after all iterations have completed.
If anyone can give me some help as to why the algorithm does not work properly, I would greatly appreciate it!

I think
x = u - f(r)/fprime(r);
should be
x = u - f(u)/fprime(u);
If you always use r, you're always decrementing x by the same value.
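With that change, the loop can be sketched as below. This is only a minimal version under one assumption: the derivative is written out by hand as an anonymous function instead of going through syms, diff and matlabFunction as in the question.

```matlab
% Newton's method for f(x) = exp(x) - 1.5 - atan(x), starting from r = -7.
% Assumption: the derivative is coded by hand rather than derived symbolically.
f      = @(x) exp(x) - 1.5 - atan(x);
fprime = @(x) exp(x) - 1 ./ (1 + x.^2);

r = -7;                         % initial guess, as in the question
x = r;
xvals = zeros(1, 9);            % initial guess plus 8 iterates
xvals(1) = x;
for i = 1:8
    x = x - f(x) / fprime(x);   % evaluate at the current iterate, not at r
    xvals(i + 1) = x;
end
disp(xvals)
```

Because each step now uses the current iterate, the step size shrinks as the iterates approach the root, and the sequence settles close to the textbook value of roughly -14 instead of drifting past it.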

% Fixed-point iteration: exp(x) - 1.5 - atan(x) = 0 rearranged to x = tan(exp(x) - 1.5)
x = -1; % initial guess
n = 10; % number of iterates, including the initial guess
for i = 2:n
x(i) = tan(exp(x(i-1)) - 1.5);
end
v = x.' % column vector holding the iterate for each i

Related

Optimizing algorithm calculating (sin(x)-x)*x^{-3} (in matlab)

My task is to write an optimal program that calculates the matrix Y, given the matrix X, where:
y = (sin(x) - x) * x^(-3)
Here's the code I have written so far:
n = size(X, 1);
m = size(X, 2);
Y = zeros(n, m);
d = n*m;
for i = 1:d
x = X(i);
if abs(x)<0.1
Y(i) = -1/6+x.^2/120-x.^4/5040+x.^6/362880;
else
Y(i) = (sin(x)-x).*(x.^(-3));
end
end
The formula is inaccurate around 0, so I have approximated it there using Taylor's theorem.
Unfortunately this program has an accuracy of 91% and an efficiency of only 24% (so it's 4 times slower than the optimal solution).
The tests use around 13 million samples, out of which around 6 million have a magnitude below 0.1. The range of the samples is (-8π, 8π).
The target accuracy (100%) is 4*epsilon, where epsilon equals 2^(-52) (that means the numbers calculated by the program shouldn't differ from the "perfectly" calculated numbers by more than 4*epsilon).
An error of 100*epsilon corresponds to an accuracy of 86%.
Do you have any ideas on how to make it faster and more accurate? I'm looking both for mathematical tricks to further transform the given formula, and for general MATLAB tips that can speed up programs.
EDIT:
Using Horner's method, I have managed to bring the efficiency up to 81% (accuracy is still 91%) with this program:
function Y = main(X)
Y = (sin(X)-X).*(X.^(-3));
i = abs(X) < 0.1;
Y(i) = horner(X(i));
function y = horner (x)
pow = x.*x;
y = -1/6+pow.*(1/120+pow.*(-1/5040+pow./362880));
Do you have any further ideas on how to improve it?
The program seems to work fine over a wide range of inputs:
x = linspace(-8*pi,8*pi,13e6); % 13 million samples in the desired range
y = (sin(x)-x)./x.^3;
plot(x,y)
Due to round-off errors, you may have problems calculating it for very small values of x, and at x = 0 the expression evaluates 0/0:
x = 0
y = (sin(x)-x)./x.^3
y =
NaN
You already have the Taylor series expansion of the function around 0. As the Taylor expansion does not include a division by x, you can expect a better behaviour of the Taylor function around this region:
x = -1e-6:1e-9:1e-6;
y = (sin(x)-x)./x.^3;
y_taylor = -1/6 + x.^2/120 - x.^4/5040 + x.^6/362880;
plot(x,y,x,y_taylor); legend('y','taylor expansion','location','best')
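To put a number on that difference instead of reading it off the plot, a small check along these lines could be used. The sample range and the variable names are illustrative choices, not taken from the question.

```matlab
% Compare the direct formula with the 4-term Taylor polynomial for small x.
x = linspace(1e-4, 1e-2, 1000);                 % small, nonzero sample points
y_direct = (sin(x) - x) ./ x.^3;                % sin(x) - x cancels most significant digits
y_taylor = -1/6 + x.^2/120 - x.^4/5040 + x.^6/362880;
max_diff = max(abs(y_direct - y_taylor));       % largest gap between the two results
fprintf('largest difference: %g\n', max_diff);
```

On this range the truncation error of the polynomial is far below double precision, so any visible gap comes from cancellation in sin(x) - x in the direct formula.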
You can replace your loop with vectorized code. This is usually more efficient than a loop, because the loop has a conditional in it, which is bad for branch prediction:
Y = (sin(X)-X).*(X.^(-3));
i = abs(X) < 0.1;
Y(i) = -1/6+X(i).^2/120-X(i).^4/5040+X(i).^6/362880;
Rewriting the primary equation to avoid the power operation X.^(-3) yields a 3x speedup for that computation:
Y = (sin(X)./X - 1) ./ (X.*X);
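As a quick sanity check, the two forms can be compared on sample points away from zero. This is only a sketch; the step size and range are arbitrary choices for illustration.

```matlab
% Check that the rewritten expression matches the original one outside (-0.1, 0.1).
X  = [-8*pi:0.01:-0.1, 0.1:0.01:8*pi];   % sample points, excluding the region near zero
Y1 = (sin(X) - X) .* (X.^(-3));          % original form
Y2 = (sin(X) ./ X - 1) ./ (X .* X);      % rewritten form without the power operation
max_diff = max(abs(Y1 - Y2));            % should be at round-off level
fprintf('largest difference: %g\n', max_diff);
```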
Speed comparison:
The following script compares the timing of these methods against the OP's loop code. I use data that has 7 million values uniformly distributed in (-8π, 8π), and another 6 million values uniformly distributed in (-0.1, 0.1).
OP's loop code takes 2.4412 s, and the vectorized solution takes 0.7224 s. Using OP's Horner method and the rewritten sin expression it takes 0.1437 s.
X = [linspace(-8*pi,8*pi,7e6), linspace(-0.1,0.1,6e6)];
timeit(@()method1(X))
timeit(@()method2(X))
timeit(@()method3(X))
function Y = method1(X)
n = size(X, 1);
m = size(X, 2);
Y = zeros(n, m);
d = n*m;
for i = 1:d
x = X(i);
if abs(x)<0.1
Y(i) = -1/6+x.^2/120-x.^4/5040+x.^6/362880;
else
Y(i) = (sin(x)-x).*(x.^(-3));
end
end
end
function Y = method2(X)
Y = (sin(X)-X).*(X.^(-3));
i = abs(X) < 0.1;
Y(i) = -1/6+X(i).^2/120-X(i).^4/5040+X(i).^6/362880;
end
function Y = method3(X)
Y = (sin(X)./X - 1) ./ (X.*X);
i = abs(X) < 0.1;
Y(i) = horner(X(i));
end
function y = horner (x)
pow = x.*x;
y = -1/6+pow.*(1/120+pow.*(-1/5040+pow./362880));
end

Loop invariant proof on multiply algorithm

I'm currently stuck on a loop invariant proof in my home assignment. The algorithm whose correctness I need to prove is:
Multiply(a,b)
x=a
y=0
WHILE x>=b DO
x=x-b
y=y+1
IF x=0 THEN
RETURN(y)
ELSE
RETURN(-1)
I've looked at several examples of loop invariants and I have some idea of how it's supposed to work. However, the algorithm above has two exit conditions, and I'm a bit lost on how to approach this in a loop invariant proof. In particular, it's the termination part I'm struggling with, around the IF and ELSE statements.
So far, what I've constructed comes simply from looking at the termination of the algorithm: if x = 0, it returns the value of y, which contains n (the number of iterations of the while loop), whereas if x is not 0 and x < b, it returns -1. I just have a feeling I need to prove this somehow.
I hope someone can help shed some light on this for me, as the similar cases I've found here have not been sufficient.
Thanks a lot in advance for your time.
Provided that the algorithm terminates (for this let's assume a>0 and b>0, which is sufficient), one invariant is that at every iteration of your while loop, you have x + by = a.
Proof:
at first, x = a and y = 0 so that's ok
If x + by = a, then (x - b) + (y + 1)b = a, which are the values of x and y for your next iteration
Illustration:
Multiply(a,b)
x=a
y=0
// x + by = a, is true
WHILE x>=b DO
// x + by = a, is true
x=x-b // X = x - b
y=y+1 // Y = y + 1
// x + by = a
// x - b + by + b = a
// (x-b) + (y+1)b = a
// X + bY = a, is still true
// x + by = a, will remain true when you exit the loop
// since we exited the loop, x < b
IF x=0 THEN
// 0 + by = a, and 0 < b
// y = a/b
RETURN(y)
ELSE
RETURN(-1)
This algorithm returns a/b when b divides a, and -1 otherwise. Multiply does not quite sound like an appropriate name for it...
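The invariant can also be checked at run time. The sketch below transcribes the pseudocode into MATLAB and asserts x + b*y == a at the points marked in the illustration; the inputs a = 12 and b = 4 are example values, assumed to be positive integers.

```matlab
% Run-time check of the loop invariant x + b*y == a.
a = 12;  b = 4;                 % example inputs, assumed positive integers
x = a;
y = 0;
assert(x + b*y == a);           % invariant holds before the loop
while x >= b
    x = x - b;
    y = y + 1;
    assert(x + b*y == a);       % invariant is restored after every iteration
end
assert(x >= 0 && x < b);        % loop exit condition, given a > 0 and b > 0
if x == 0
    result = y;                 % b divides a, so y == a/b
else
    result = -1;
end
disp(result)
```

Changing a to a value that b does not divide, such as 13, leaves the invariant intact but ends with x different from 0, so the ELSE branch is taken.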
We can't prove correctness without a specification of exactly what the function is supposed to do, which I can't find in your question. Even the name of the function doesn't help: as noted already, your function returns a/b most of the time when b divides a, and -1 otherwise. Multiply is an inappropriate name for it.
Furthermore, if b=0 and a>=b the "algorithm" doesn't terminate so it isn't even an algorithm.
As Alex M noted, a loop invariant for the loop is x + by = a. At the moment the loop exits, we also have x < b. There are no other guarantees on x because (presumably) a could be negative. If we had a guarantee that a and b are positive, then we could guarantee that 0<=x<b at the moment the loop exits, which would mean that it implements the division with remainder algorithm (at the end of the loop, y is quotient and x is remainder, and it terminates by an "infinite descent" type argument: a decreasing sequence of positive integers x must terminate). Then you could conclude that if x=0, b divides a evenly, and the quotient is returned, otherwise -1 is returned.
But that is not a proof, because we are lacking a specification for what the algorithm is supposed to do, and a specification on restrictions on its inputs. (Are a and b any positive integers? Negative and 0 not allowed?)

Matlab parfor, cannot run "due to the way P is used"

I have a quite time-consuming task that I perform in a for loop. Each iteration is completely independent of the others, so I decided to use a parfor loop and benefit from the cores of my machine's i7.
The serial loop is:
for i=1 : size(datacoord,1)
%P matrix: person_number x z or
P(i,1) = datacoord(i,1); %pn
P(i,4) = datacoord(i,5); %or
P(i,3) = predict(Barea2, datacoord(i,4)); %distance (z)
dist = round(P(i,3)); %round the distance to get how many cells
x = ceil(datacoord(i,2) / (im_w / ncell(1,dist)));
P(i,2) = pos(dist, x); %x
end
Reading around about parfor, the only doubt I had is that I use dist and x as indices which are calculated inside the loop; I heard that this could be a problem.
The error I get from MATLAB is about the way the P matrix is used, though. What is wrong with it? If I remember correctly from my parallel computing courses and I interpret the parfor documentation correctly, this should work by just replacing for with parfor.
Any input would be greatly appreciated, thanks!
Unfortunately, in a PARFOR loop, 'sliced' variables such as you'd like P to be cannot be indexed in multiple different ways. The simplest solution is to build up a single row, and then make a single assignment into P, like this:
parfor i=1 : size(datacoord,1)
%P matrix: person_number x z or
P_tmp = NaN(1, 4);
P_tmp(1) = datacoord(i,1); %pn
P_tmp(4) = datacoord(i,5); %or
P_tmp(3) = predict(Barea2, datacoord(i,4)); %distance (z)
dist = round(P_tmp(3)); %round the distance to get how many cells
x = ceil(datacoord(i,2) / (im_w / ncell(1,dist)));
P_tmp(2) = pos(dist, x); %x
P(i, :) = P_tmp;
end
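The same pattern in a self-contained form, for experimenting without the original data: here data is a hypothetical stand-in for datacoord, and the simple arithmetic is a placeholder for the predict and pos lookups.

```matlab
% Build each output row in a temporary, then assign it to P in one sliced write.
n = 5;
data = rand(n, 3);              % hypothetical stand-in for datacoord
P = zeros(n, 2);
parfor i = 1:n
    d = data(i, :);             % read the whole input row once
    row = zeros(1, 2);          % temporary, private to this iteration
    row(1) = d(1);
    row(2) = d(2) + d(3);       % placeholder for the real per-row computation
    P(i, :) = row;              % the only place where P is indexed
end
disp(P)
```

Since P appears only as P(i, :), it is indexed in a single consistent way, which is what allows it to be treated as a sliced output variable.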

Improving performance of interpolation (Barycentric formula)

I have been given an assignment in which I am supposed to write an algorithm which performs polynomial interpolation by the barycentric formula. The formula states that:
p(x) = sum_{j=0}^{n}[w(j)*f(j)/(x - x(j))] / sum_{j=0}^{n}[w(j)/(x - x(j))]
I have written an algorithm which works just fine, and I get the polynomial output I desire. However, it requires some quite long loops, and for a large number of grid points, lots of nasty loop operations have to be done. Thus, I would greatly appreciate any hints as to how I may improve this, so that I can avoid all these loops.
In the algorithm, x and f stand for the given points we are supposed to interpolate. w stands for the barycentric weights, which have been calculated before running the algorithm. And grid is the linspace over which the interpolation should take place:
function p = barycentric_formula(x,f,w,grid)
% Assert that the x- and f-vectors have the same length.
if length(x) ~= length(f)
% error() aborts the function; a plain return would leave p unassigned
error('Not equal amounts of x- and y-values. Function is terminated.');
end
n = length(x);
m = length(grid);
p = zeros(1,m);
% Loops for finding polynomial values at grid points. All values are
% calculated by the barycentric formula.
for i = 1:m
var = 0;
sum1 = 0;
sum2 = 0;
for j = 1:n
if grid(i) == x(j)
p(i) = f(j);
var = 1;
else
sum1 = sum1 + (w(j)*f(j))/(grid(i) - x(j));
sum2 = sum2 + (w(j)/(grid(i) - x(j)));
end
end
if var == 0
p(i) = sum1/sum2;
end
end
This is a classic case for MATLAB 'vectorization'. I would say: just remove the loops. It is almost that simple. First, have a look at this code:
function p = bf2(x, f, w, grid)
m = length(grid);
p = zeros(1,m);
for i = 1:m
var = grid(i)==x;
if any(var)
p(i) = f(var);
else
sum1 = sum((w.*f)./(grid(i) - x));
sum2 = sum(w./(grid(i) - x));
p(i) = sum1/sum2;
end
end
end
I have removed the inner loop over j. All I did here was in fact remove the (j) indexing and change the arithmetic operators from / to ./ and from * to .* (the same operators, but with a dot in front to signify that the operation is performed element by element). These are called array operators, in contrast to the ordinary matrix operators. Also note that the special case where a grid point falls onto a node in x is treated very much like in the original implementation, only using a logical vector var such that x(var)==grid(i).
Now, you can also remove the outermost loop. This is a bit more tricky and there are two major approaches how you can do that in MATLAB. I will do it the simpler way, which can be less efficient, but more clear to read - using repmat:
function p = bf3(x, f, w, grid)
% Find grid points that coincide with x.
% The below compares all grid values with all x values
% and returns a matrix of 0/1. 1 is in the (row,col)
% for which grid(row)==x(col)
var = bsxfun(@eq, grid', x);
% find the logical indexes of those x entries
varx = sum(var, 1)~=0;
% and of those grid entries
varp = sum(var, 2)~=0;
% Outer-most loop removal - use repmat to
% replicate the vectors into matrices.
% Thus, instead of having a loop over j
% you have matrices of values that would be
% referenced in the loop
ww = repmat(w, numel(grid), 1);
ff = repmat(f, numel(grid), 1);
xx = repmat(x, numel(grid), 1);
gg = repmat(grid', 1, numel(x));
% perform the calculations element-wise on the matrices
sum1 = sum((ww.*ff)./(gg - xx),2);
sum2 = sum(ww./(gg - xx),2);
p = sum1./sum2;
% fix the case where grid==x and return
p(varp) = f(varx);
end
The fully vectorized version can be implemented with bsxfun rather than repmat. This can potentially be a bit faster, since the replicated matrices are not explicitly formed. However, the speed difference may not be large for small system sizes.
Also, the first solution with one loop is not too bad performance-wise. I suggest you test both and see which is better. Maybe it is not worth it to fully vectorize? The first code looks a bit more readable.
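A bsxfun-based variant might look roughly like the sketch below. It is not tuned or benchmarked; the nodes, the sampled function f(x) = x^2 and the way the weights are computed here are example choices made only so that the snippet runs on its own.

```matlab
% Barycentric interpolation without loops over the grid, using bsxfun.
x = [0 1 2 3];                           % example nodes
f = x.^2;                                % example data: samples of x^2
n = numel(x);
w = zeros(1, n);                         % barycentric weights for these nodes
for j = 1:n
    w(j) = 1 / prod(x(j) - x([1:j-1, j+1:n]));
end
grid = linspace(0, 3, 7);                % evaluation points, some equal to nodes

D = bsxfun(@minus, grid(:), x(:).');     % D(i,j) = grid(i) - x(j)
C = bsxfun(@rdivide, w(:).', D);         % C(i,j) = w(j) / (grid(i) - x(j))
fc = f(:);
p = (C * fc) ./ sum(C, 2);               % NaN in rows where grid(i) equals a node
[hitRow, hitCol] = find(D == 0);         % grid points that coincide with nodes
p(hitRow) = fc(hitCol);                  % use the known value there
disp(p.')
```

Rows that hit a node contain a division by zero and are overwritten in the last step, which plays the same role as the grid==x fix in bf3.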

Summation without a for loop - MATLAB

I have 2 matrices: V, which is square MxM, and K, which is MxN. Calling the dimension across rows x and the dimension across columns t, I need to evaluate the integral (i.e. the sum) over both dimensions of K times a t-shifted version of V, the answer being a function of the shift (almost like a convolution, see below). The sum is defined by the following expression, where _{} denotes the summation indices, and zero-padding of out-of-limits elements is assumed:
S(t) = sum_{x,tau}[V(x,t+tau) * K(x,tau)]
I manage to do it with a single loop, over the t dimension (vectorizing the x dimension):
% some toy matrices
V = rand(50,50);
K = rand(50,10);
[M N] = size(K);
S = zeros(1, M);
for t = 1 : N
S(1,1:end-t+1) = S(1,1:end-t+1) + sum(bsxfun(@times, V(:,t:end), K(:,t)),1);
end
I have similar expressions which I managed to evaluate without a for loop, using a combination of conv2 and/or mirroring (flipping) of a single dimension. However, I can't see how to avoid a for loop in this case (despite the apparent similarity to convolution).
Steps to vectorization
1] Perform sum(bsxfun(@times, V(:,t:end), K(:,t)),1) for all columns in V against all columns in K with a matrix multiplication -
sum_mults = V.'*K
This gives us a 2D array in which each column represents the sum(bsxfun(@times,...)) operation of one iteration.
2] Step 1 gives us all possible summations, but the values to be summed are not aligned in the same row across iterations, so we need to do a bit more work before summing along rows. The rest of the work is about getting a shifted-up version. For that, you can use boolean indexing with a lower triangular boolean mask and its flipped counterpart. Finally, we sum along each row for the final output. So, this part of the code would look like so -
valid_mask = tril(true(size(sum_mults)));
sum_mults_shifted = zeros(size(sum_mults));
sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask);
out = sum(sum_mults_shifted,2);
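To confirm that the two steps reproduce the loop result, both can be run on small random inputs and compared. The matrix sizes below are arbitrary toy values.

```matlab
% Compare the loop result S with the vectorized result out on toy inputs.
V = rand(6, 6);
K = rand(6, 3);
[M, N] = size(K);

S = zeros(1, M);                                   % reference: loop over t
for t = 1:N
    S(1:end-t+1) = S(1:end-t+1) + sum(bsxfun(@times, V(:, t:end), K(:, t)), 1);
end

sum_mults = V.' * K;                               % all column-against-column sums
valid_mask = tril(true(size(sum_mults)));          % entries that take part in the sum
sum_mults_shifted = zeros(size(sum_mults));
sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask);   % shift each column up
out = sum(sum_mults_shifted, 2);

max_diff = max(abs(out.' - S));                    % should be at round-off level
fprintf('largest difference: %g\n', max_diff);
```

Note that out is a column vector while S is a row vector, hence the transpose in the comparison.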
Runtime tests -
%// Inputs
V = rand(1000,1000);
K = rand(1000,200);
disp('--------------------- With original loopy approach')
tic
[M N] = size(K);
S = zeros(1, M);
for t = 1 : N
S(1,1:end-t+1) = S(1,1:end-t+1) + sum(bsxfun(@times, V(:,t:end), K(:,t)),1);
end
toc
disp('--------------------- With proposed vectorized approach')
tic
sum_mults = V.'*K; %//'
valid_mask = tril(true(size(sum_mults)));
sum_mults_shifted = zeros(size(sum_mults));
sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask);
out = sum(sum_mults_shifted,2);
toc
Output -
--------------------- With original loopy approach
Elapsed time is 2.696773 seconds.
--------------------- With proposed vectorized approach
Elapsed time is 0.044144 seconds.
This might be cheating (using arrayfun instead of a for loop) but I believe this expression gives you what you want:
S = arrayfun(@(t) sum(sum( V(:,(t+1):(t+N)) .* K )), 1:(M-N), 'UniformOutput', true)
