How to use the RK4 algorithm to solve an ODE? - algorithm

I am using an RK4 algorithm:
function R=RK4_h(f,a,b,ya,h)
% Input
% - f right-hand side of the ODE y'=f(t,y), given as a function name string such as 'f'
% - a and b initial and final time
% - ya initial value y0
% - h length of the step
% Output
% - R=[T' Y'] where T independent variable and Y dependent variable
N = fix((b-a) / h);
T = zeros(1,N+1);
Y = zeros(1,N+1);
% Vector of the time values
T = a:h:b;
% Solving ordinary differential equation
Y(1) = ya;
for j = 1:N
k1 = h*feval(f,T(j),Y(j));
k2 = h*feval(f,T(j)+h/2,Y(j)+k1/2);
k3 = h*feval(f,T(j)+h/2,Y(j)+k2/2);
k4 = h*feval(f,T(j)+h,Y(j)+k3);
Y(j+1) = Y(j) + (k1+2*k2+2*k3+k4)/6;
end
R=[T' Y'];
In my main script I call it for every value as:
xlabel('x')
ylabel('y')
h=0.05;
fprintf ('\n First block \n');
xx = [0:h:1];
Nodes = length(xx);
yy = zeros(1,Nodes);
for i=1:Nodes
fp(i)=feval('edo',-1,xx(i));
end
E=RK4_h('edo',0,1,-1,h);
plot(E);
fprintf ('\n%f',E);
The problem is when I try to use RK4 algorithm with edo formula:
function edo = edo(y,t)
edo = 6*((exp(1))^(6*t))*(y-(2*t))^2+2;
The results do not make sense. For example, the true values are y(0)=8 and y(1)=11.53, but the estimates are not close. Neither column of the E matrix gives a plausible approximation to the solution, so I do not know whether this is a correct implementation.
Is there a basic implementation error?

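RK4_h calls f with t as the first argument and y as the second (feval(f,T(j),Y(j))), but edo is declared as edo(y,t), so the two arguments are reversed. The feval('edo',-1,xx(i)) call in the main script uses the reversed order as well, so it needs the same swap.
Your function should be: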
function edo = edo(t,y) % NOT edo(y,t)
edo = 6*((exp(1))^(6*t))*(y-(2*t))^2+2;
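A quick way to confirm that the argument order is what matters is to run the same RK4 update on an equation whose solution is known. The sketch below is not the asker's problem: it assumes a hypothetical test equation y' = y with y(0) = 1 (exact solution exp(t)) and uses a function handle f instead of the 'edo' string.

```matlab
% Hypothetical test problem (an assumption, not from the question):
% y' = y, y(0) = 1, exact solution y(t) = exp(t).
f = @(t, y) y;              % note the (t, y) order that RK4_h expects

a = 0; b = 1; ya = 1; h = 0.05;
N = round((b - a) / h);     % number of steps
T = a + (0:N) * h;          % time grid with exactly N+1 points
Y = zeros(1, N + 1);
Y(1) = ya;

% Same update as in RK4_h, with f called as f(t, y)
for j = 1:N
    k1 = h * f(T(j),       Y(j));
    k2 = h * f(T(j) + h/2, Y(j) + k1/2);
    k3 = h * f(T(j) + h/2, Y(j) + k2/2);
    k4 = h * f(T(j) + h,   Y(j) + k3);
    Y(j + 1) = Y(j) + (k1 + 2*k2 + 2*k3 + k4) / 6;
end

err = abs(Y(end) - exp(b)); % error at t = b against the exact solution
fprintf('RK4 error at t = %g: %.3e\n', b, err);
```

If the right-hand side were written as f(y, t) instead, the loop would still run without any error message but would integrate a different equation, which is why the mistake only shows up as implausible numbers.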

Related

Speeding up program in matlab

I have 2 functions:
ccexpan - calculates the coefficients of the interpolating polynomial of a function f with N nodes, in the basis of Chebyshev polynomials of the first kind.
csum - calculates the values at the arguments t using the coefficients c from ccexpan (using the Clenshaw algorithm).
This is what I have written so far:
function c = ccexpan(f,N)
z = zeros (1,N+1);
s = zeros (1,N+1);
for i = 1:(N+1)
z(i) = pi*(i-1)/N;
end
t = f(cos(z));
for k = 1:(N+1)
s(k) = sum(t.*cos(z.*(k-1)));
s(k) = s(k)-(f(1)+f(-1)*cos(pi*(k-1)))/2;
end
c = s.*2/N;
and:
function y = csum(t,c)
M = length(t);
N = length(c);
y = t;
b = zeros(1,N+2);
for k = 1:M
for i = N:-1:1
b(i) = c(i)+2*t(k)*b(i+1)-b(i+2);
end
y(k)=(b(1)-b(3))/2;
end
Unfortunately these programs are very slow, and also slightly inaccurate. Please give me some tips on how to speed them up, and how to improve accuracy.
Where possible try to get away from looping structures. At first blush, I would trade out your first for loop of
for i = 1:(N+1)
z(i) = pi*(i-1)/N;
end
and replace with
i=1:(N+1)
z = pi*(i-1)/N
I did not check the rest of your code, but the above change will definitely speed it up. A second strategy is to combine loops when possible.
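The same idea carries over to csum. The recurrence over i has to stay sequential, but the loop over the evaluation points can be dropped by letting each b hold a whole vector. This is a minimal sketch that assumes the same convention as csum in the question (result (b(1)-b(3))/2); the sample c and t and the names b0 to b3 are made up for illustration.

```matlab
c = [1 2 3];                 % sample coefficients (illustrative)
t = linspace(-1, 1, 5);      % sample evaluation points (illustrative)

N  = length(c);
b1 = zeros(size(t));         % plays the role of b(i+1), for every t at once
b2 = zeros(size(t));         % plays the role of b(i+2)
b3 = zeros(size(t));         % ends up holding b(3) after the loop
for i = N:-1:1
    b0 = c(i) + 2 * t .* b1 - b2;   % b(i) for all points in t
    b3 = b2;
    b2 = b1;
    b1 = b0;
end
y = (b1 - b3) / 2;           % same final formula as in csum
```

Only three work vectors of the size of t are kept, so the memory use does not grow with the number of coefficients.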
Martin,
Consider the following strategy.
% create hypothetical N and f
N = 3
f = @(x) 1./(1+15*x.*x)
% calculate z and t
i=1:(N+1)
z = pi*(i-1)/N
t = f(cos(z))
% make a column vector of k's
k = (1:(N+1))'
% do this: s(k) = sum(t.*cos(z.*(k-1)))
s1 = t.*cos(z.*(k-1)) % should be a matrix with one row for each row of k
% via implicit expansion
s2 = sum(s1,2) % row sum, i.e., one value for each row of k
% do this: s(k) = s(k)-(f(1)+f(-1)*cos(pi*(k-1)))/2
s3 = s2 - (f(1)+f(-1)*cos(pi*(k-1)))/2
% calculate c
c = s3 .* 2/N
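The row sums above can also be written as a single matrix-vector product, which avoids relying on implicit expansion. The sketch below is one possible variant, checked against the loop version from the question; N = 8 and the names C, c_vec, c_loop are illustrative choices.

```matlab
N = 8;                               % illustrative size
f = @(x) 1 ./ (1 + 15 * x .* x);     % same sample function as in the answer

k = (0:N)';                          % column holding k-1 = 0..N
z = pi * (0:N) / N;                  % row of angles
t = f(cos(z));                       % function values at the nodes

% C(k,i) = cos((k-1) * z(i)); k*z is an ordinary outer product
C = cos(k * z);
s = C * t';                          % s(k) = sum_i t(i) * cos((k-1) * z(i))
s = s - (f(1) + f(-1) * cos(pi * k)) / 2;   % end-point correction
c_vec = (s * 2 / N)';                % row vector, like the output of ccexpan

% Reference: the loop version from the question
c_loop = zeros(1, N + 1);
for kk = 1:(N + 1)
    c_loop(kk) = sum(t .* cos(z .* (kk - 1)));
    c_loop(kk) = c_loop(kk) - (f(1) + f(-1) * cos(pi * (kk - 1))) / 2;
end
c_loop = c_loop .* 2 / N;

max_diff = max(abs(c_vec - c_loop)); % expected to be at round-off level
```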

Optimizing algorithm calculating (sin(x)-x)*x^{-3} (in matlab)

My task is to write an optimal program that calculates the matrix Y, given a matrix X, where:
y = (sin(x)-x)*x^(-3)
Here's the code I have written so far:
n = size(X, 1);
m = size(X, 2);
Y = zeros(n, m);
d = n*m;
for i = 1:d
x = X(i);
if abs(x)<0.1
Y(i) = -1/6+x.^2/120-x.^4/5040+x.^6/362880;
else
Y(i) = (sin(x)-x).*(x.^(-3));
end
end
So, generally the formula was inaccurate around 0, so I have approximated it using Taylor theorem.
Unfortunately this program has accuracy of 91% and efficiency of only 24% (so it's 4 times slower than the optimal solution).
The tests are around 13 million samples, out of which around 6 million have value of less than 0.1. The range of samples is (-8π , 8π).
The target accuracy (100%) is 4*epsilon, where epsilon equals 2^(-52); that is, the values computed by the program must not differ from the exactly calculated values by more than 4*epsilon.
100*epsilon means accuracy of 86%.
Do you have any ideas on how to make it faster and more accurate? I'm looking both for mathematical tricks to further transform the given formula and for general MATLAB tips that can speed up programs.
EDIT:
Using Horner's method, I have managed to bring efficiency up to 81% (accuracy still 91%) with this program:
function Y = main(X)
Y = (sin(X)-X).*(X.^(-3));
i = abs(X) < 0.1;
Y(i) = horner(X(i));
function y = horner (x)
pow = x.*x;
y = -1/6+pow.*(1/120+pow.*(-1/5040+pow./362880));
Do you have any further ideas on how to improve it?
The program seems to work fine over a wide range of inputs:
x = linspace(-8*pi,8*pi,13e6); % 13 million samples in the desired range
y = (sin(x)-x)./x.^3;
plot(x,y)
Due to round-off errors, you may have problems calculating it for very small values of x, and at exactly x = 0 the expression is 0/0:
x = 0
y = (sin(x)-x)./x.^3
y =
NaN
You already have the Taylor series expansion of the function around 0. As the Taylor expansion does not include a division by x, you can expect a better behaviour of the Taylor function around this region:
x = -1e-6:1e-9:1e-6;
y = (sin(x)-x)./x.^3;
y_taylor = -1/6 + x.^2/120 - x.^4/5040 + x.^6/362880;
plot(x,y,x,y_taylor); legend('y','taylor expansion','location','best')
You can replace your loop with vectorized code. This is usually more efficient than a loop, because the loop has a conditional in it, which is bad for branch prediction:
Y = (sin(X)-X).*(X.^(-3));
i = abs(X) < 0.1;
Y(i) = -1/6+X(i).^2/120-X(i).^4/5040+X(i).^6/362880;
Rewriting the primary expression to avoid the power x.^(-3) yields a 3x speedup for that computation:
Y = (sin(X)./X - 1) ./ (X.*X);
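As a small sanity check of the pieces discussed so far, the following sketch compares the posted formula, the rewritten form and the Horner-form Taylor polynomial at a few hand-picked points. The handle names and the test points 2 and 0.05 are illustrative assumptions; it says nothing about the 4*epsilon target in the question.

```matlab
direct    = @(x) (sin(x) - x) .* x.^(-3);        % formula as posted
rewritten = @(x) (sin(x) ./ x - 1) ./ (x .* x);  % form without x.^(-3)
% Horner form of the Taylor polynomial used for abs(x) < 0.1
taylor    = @(x) -1/6 + (x.*x) .* (1/120 + (x.*x) .* (-1/5040 + (x.*x) / 362880));

y0_direct = direct(0);      % 0 * Inf, which is NaN
y0_taylor = taylor(0);      % the constant term, -1/6

x_far = 2;                  % well away from zero
gap_forms = abs(direct(x_far) - rewritten(x_far));

x_near = 0.05;              % inside the cutoff, both forms are still usable
gap_near = abs(direct(x_near) - taylor(x_near));

fprintf('direct vs rewritten at x = %g: %.3e\n', x_far, gap_forms);
fprintf('direct vs Taylor    at x = %g: %.3e\n', x_near, gap_near);
```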
Speed comparison:
The following script compares the timing of these methods with OP's loop code. I use data that has 7 million values uniformly distributed in (-8π, 8π), and another 6 million values uniformly distributed in (-0.1,0.1).
OP's loop code takes 2.4412 s, and the vectorized solution takes 0.7224 s. Using OP's Horner method and the rewritten sin expression it takes 0.1437 s.
X = [linspace(-8*pi,8*pi,7e6), linspace(-0.1,0.1,6e6)];
timeit(@()method1(X))
timeit(@()method2(X))
timeit(@()method3(X))
function Y = method1(X)
n = size(X, 1);
m = size(X, 2);
Y = zeros(n, m);
d = n*m;
for i = 1:d
x = X(i);
if abs(x)<0.1
Y(i) = -1/6+x.^2/120-x.^4/5040+x.^6/362880;
else
Y(i) = (sin(x)-x).*(x.^(-3));
end
end
end
function Y = method2(X)
Y = (sin(X)-X).*(X.^(-3));
i = abs(X) < 0.1;
Y(i) = -1/6+X(i).^2/120-X(i).^4/5040+X(i).^6/362880;
end
function Y = method3(X)
Y = (sin(X)./X - 1) ./ (X.*X);
i = abs(X) < 0.1;
Y(i) = horner(X(i));
end
function y = horner (x)
pow = x.*x;
y = -1/6+pow.*(1/120+pow.*(-1/5040+pow./362880));
end

Writing a vector sum in MATLAB

Suppose I have a function phi(x1,x2)=k1*x1+k2*x2 which I have evaluated over a grid where the grid is a square having boundaries at -100 and 100 in both x1 and x2 axis with some step size say h=0.1. Now I want to calculate this sum over the grid with which I'm struggling:
What I tried:
clear all
close all
clc
D=1; h=0.1;
D1 = -100;
D2 = 100;
X = D1 : h : D2;
Y = D1 : h : D2;
[x1, x2] = meshgrid(X, Y);
k1=2;k2=2;
phi = k1.*x1 + k2.*x2;
figure(1)
surf(X,Y,phi)
m1=-500:500;
m2=-500:500;
[M1,M2,X1,X2]=ndgrid(m1,m2,X,Y)
sys=@(m1,m2,X,Y) (k1*h*m1+k2*h*m2).*exp((-([X Y]-h*[m1 m2]).^2)./(h^2*D))
sum1=sum(sys(M1,M2,X1,X2))
Matlab says error in ndgrid, any idea how I should code this?
MATLAB shows:
Error using repmat
Requested 10001x1001x2001x2001 (298649.5GB) array exceeds maximum array size preference. Creation of arrays greater
than this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference
panel for more information.
Error in ndgrid (line 72)
varargout{i} = repmat(x,s);
Error in new_try1 (line 16)
[M1,M2,X1,X2]=ndgrid(m1,m2,X,Y)
Judging by your comments and your code, it appears as though you don't fully understand what the equation is asking you to compute.
To obtain the value M(x1,x2) at some given (x1,x2), you have to compute that sum over Z^2. Of course, using a numerical tool such as MATLAB, you could only ever hope to compute over some finite range of Z^2. In this case, since (x1,x2) covers the range [-100,100] x [-100,100], and h=0.1, it follows that m covers the range [-1000, 1000] x [-1000, 1000]. Example: m = (-1000, -1000) gives you mh = (-100, -100), which is the bottom-left corner of your domain. So really, phi(mh) is just phi(x1,x2) evaluated on all of your discretised points.
As an aside, since you need to compute |x-hm|^2, you can treat x = x1 + i x2 as a complex number to make use of MATLAB's abs function. If you were strictly working with vectors, you would have to use norm, which is OK too, but a bit more verbose. Thus, for some given x=(x10, x20), you would compute x-hm over the entire discretised plane as (x10 - x1) + i (x20 - x2).
Finally, you can compute one entry of M at a time:
D=1; h=0.1;
D1 = -100;
D2 = 100;
X = (D1 : h : D2); % X is in rows (dim 2)
Y = (D1 : h : D2)'; % Y is in columns (dim 1)
k1=2;k2=2;
phi = k1*X + k2*Y;
M = zeros(length(Y), length(X));
for j = 1:length(X)
for i = 1:length(Y)
% treat (x - hm) as a complex number
x_hm = (X(j)-X) + 1i*(Y(i)-Y); % this computes x-hm for all m
M(i,j) = 1/(pi*D) * sum(sum(phi .* exp(-abs(x_hm).^2/(h^2*D)), 1), 2);
end
end
By the way, this computation takes quite a long time. You can consider either increasing h, reducing D1 and D2, or changing all three of them.
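Because exp(-|x-hm|^2/(h^2*D)) factors into a term in x1 times a term in x2, the double sum in the loop above can also be written as two matrix products. The sketch below assumes the formula is exactly the one implemented in that loop, and it uses a much smaller domain (D1 = -1, D2 = 1) so that the loop version can serve as a check. Gx, Gy, M_fast and M_loop are names introduced here.

```matlab
D = 1; h = 0.1;
D1 = -1; D2 = 1;                 % smaller domain than in the question
X = (D1 : h : D2);               % row (dim 2)
Y = (D1 : h : D2)';              % column (dim 1)
k1 = 2; k2 = 2;
nx = length(X); ny = length(Y);

% phi(i,j) = k1*X(j) + k2*Y(i), built with repmat
phi = repmat(k1 * X, ny, 1) + repmat(k2 * Y, 1, nx);

% One-dimensional Gaussian factors:
% Gx(j,jj) = exp(-(X(j)-X(jj))^2/(h^2*D)), Gy likewise for Y
dX = repmat(X', 1, nx) - repmat(X, nx, 1);
dY = repmat(Y, 1, ny) - repmat(Y', ny, 1);
Gx = exp(-dX.^2 / (h^2 * D));
Gy = exp(-dY.^2 / (h^2 * D));

% M(i,j) = 1/(pi*D) * sum over (ii,jj) of Gy(i,ii) * phi(ii,jj) * Gx(j,jj)
M_fast = (Gy * phi * Gx') / (pi * D);

% Reference: the double loop, one entry of M at a time
M_loop = zeros(ny, nx);
for j = 1:nx
    for i = 1:ny
        r2 = repmat((X(j) - X).^2, ny, 1) + repmat((Y(i) - Y).^2, 1, nx);
        M_loop(i, j) = sum(sum(phi .* exp(-r2 / (h^2 * D)))) / (pi * D);
    end
end
max_diff = max(abs(M_fast(:) - M_loop(:)));   % expected at round-off level
```

On the full grid from the question the three matrices involved would each be 2001 x 2001, which is small compared with the four-dimensional array that ndgrid tried to allocate.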

Octave: function doesn't return expected value?

This code is a programming assignment for Andrew Ng's machine learning course.
The function is expecting a row vector [J grad]. The code computes J (albeit wrongly, but that's not the issue here), and I put in a dummy value for grad (because I haven't written the code to compute it yet). When I run the code, it only outputs ans as a scalar with the value of J. Where did grad go?
function [J grad] = nnCostFunction(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
% [J grad] = NNCOSTFUNCTION(nn_params, hidden_layer_size, num_labels, ...
% X, y, lambda) computes the cost and gradient of the neural network. The
% parameters for the neural network are "unrolled" into the vector
% nn_params and need to be converted back into the weight matrices.
%
% The returned parameter grad should be a "unrolled" vector of the
% partial derivatives of the neural network.
%
% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
% Setup some useful variables
m = size(X, 1);
% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
% following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
% variable J. After implementing Part 1, you can verify that your
% cost function computation is correct by verifying the cost
% computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
% Theta1_grad and Theta2_grad. You should return the partial derivatives of
% the cost function with respect to Theta1 and Theta2 in Theta1_grad and
% Theta2_grad, respectively. After implementing Part 2, you can check
% that your implementation is correct by running checkNNGradients
%
% Note: The vector y passed into the function is a vector of labels
% containing values from 1..K. You need to map this vector into a
% binary vector of 1's and 0's to be used with the neural network
% cost function.
%
% Hint: We recommend implementing backpropagation using a for-loop
% over the training examples if you are implementing it for the
% first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
% Hint: You can implement this around the code for
% backpropagation. That is, you can compute the gradients for
% the regularization separately and then add them to Theta1_grad
% and Theta2_grad from Part 2.
%
% PART 1
a1 = [ones(m,1) X]; % set a1 to equal X and add column of 1's
z2 = a1 * Theta1'; % matrix times matrix [5000*401 * 401*25 = 5000*25]
a2 = [ones(m,1),sigmoid(z2)]; % sigmoid function on matrix [5000*26]
z3 = a2 * Theta2'; % matrix times matrix [5000*26 * 26*10 = 5000 * 10]
hox = sigmoid(z3); % sigmoid function on matrix [5000*10]
for k = 1:num_labels
yk = y == k; % using the correct column vector y each loop
J = J + sum(-yk.*log(hox(:,k)) - (1-yk).*log(1-hox(:,k)));
end
J = 1/m * J;
% -------------------------------------------------------------
% =========================================================================
% Unroll gradients
% grad = [Theta1_grad(:) ; Theta2_grad(:)];
grad = 6.6735;
end
You have specified in your function declaration that the function can simultaneously return more than one output value:
function [J grad] = nnCostFunction(nn_params, ... % etc
You can capture both outputs if you 'request' them by assigning to a bracketed list of variables instead of a single variable:
[a, b] = nnCostFunction(input1, input2, etc)
If you don't do this, you're essentially 'requesting' only the first of the returned variables:
a = nnCostFunction(input1, input2, etc) % output 'b' is discarded.
If you don't specify a variable to assign to at all, Octave by default assigns to the 'default' variable ans. So it's essentially equivalent to doing
ans = nnCostFunction(input1, input2, etc) % output 'b' is discarded.
See the documentation for the find function (i.e. type help find in your Octave terminal) to see an example of such a function.
PS. If you only wanted the second output and did not want to 'waste' a variable name for the first one, you can do this by specifying ~ as the first output, e.g.:
[~, b] = nnCostFunction(input1, input2, etc) % output 'a' is discarded
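The three calling styles can be tried on any built-in function with more than one output. The sketch below uses max, whose second output is the position of the maximum; the vector v is made up for illustration.

```matlab
v = [3 9 4];

m = max(v);              % only the first output is requested
[m2, idx] = max(v);      % both outputs: the maximum and its position
[~, idx_only] = max(v);  % first output discarded with ~

fprintf('m = %d, m2 = %d, idx = %d, idx_only = %d\n', m, m2, idx, idx_only);
```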

Improving performance of interpolation (Barycentric formula)

I have been given an assignment in which I am supposed to write an algorithm which performs polynomial interpolation by the barycentric formula. The formula states that:
p(x) = (SIGMA_(j=0 to n) w(j)*f(j)/(x - x(j)))/(SIGMA_(j=0 to n) w(j)/(x - x(j)))
I have written an algorithm which works just fine, and I get the polynomial output I desire. However, this requires some quite long loops, and for a large number of grid points a lot of nasty loop operations have to be done. I would greatly appreciate any hints on how to improve this so that I can avoid all these loops.
In the algorithm, x and f stand for the given points we are supposed to interpolate. w stands for the barycentric weights, which have been calculated before running the algorithm. And grid is the linspace over which the interpolation should take place:
function p = barycentric_formula(x,f,w,grid)
%Assert x-vectors and f-vectors have same length.
if length(x) ~= length(f)
sprintf('Not equal amounts of x- and y-values. Function is terminated.')
return;
end
n = length(x);
m = length(grid);
p = zeros(1,m);
% Loops for finding polynomial values at grid points. All values are
% calculated by the barycentric formula.
for i = 1:m
var = 0;
sum1 = 0;
sum2 = 0;
for j = 1:n
if grid(i) == x(j)
p(i) = f(j);
var = 1;
else
sum1 = sum1 + (w(j)*f(j))/(grid(i) - x(j));
sum2 = sum2 + (w(j)/(grid(i) - x(j)));
end
end
if var == 0
p(i) = sum1/sum2;
end
end
This is a classic case for MATLAB 'vectorization'. I would say: just remove the loops. It is almost that simple. First, have a look at this code:
function p = bf2(x, f, w, grid)
m = length(grid);
p = zeros(1,m);
for i = 1:m
var = grid(i)==x;
if any(var)
p(i) = f(var);
else
sum1 = sum((w.*f)./(grid(i) - x));
sum2 = sum(w./(grid(i) - x));
p(i) = sum1/sum2;
end
end
end
I have removed the inner loop over j. All I did was remove the (j) indexing and change the arithmetic operators from / to ./ and from * to .* (the same operators with a dot in front, which signifies that the operation is performed element by element). These are called array operators, in contrast to the ordinary matrix operators. Also note that the special case where a grid point coincides with a node in x is handled much as in the original implementation, only using a logical vector var such that x(var)==grid(i).
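The difference between the two operator families is easy to see on two short vectors. The values below are made up for illustration.

```matlab
a = [1 2 3];
b = [4 5 6];

elementwise = a .* b;    % [4 10 18], one product per element
inner       = a * b';    % 32, ordinary matrix product (row times column)
quotient    = a ./ b;    % [0.25 0.4 0.5]

% A sum over element-wise results replaces an accumulating inner loop
acc = 0;
for j = 1:numel(a)
    acc = acc + a(j) / b(j);
end
same = abs(acc - sum(a ./ b)) < 1e-12;   % loop and vectorized sum agree
```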
Now, you can also remove the outermost loop. This is a bit more tricky and there are two major approaches how you can do that in MATLAB. I will do it the simpler way, which can be less efficient, but more clear to read - using repmat:
function p = bf3(x, f, w, grid)
% Find grid points that coincide with x.
% The below compares all grid values with all x values
% and returns a matrix of 0/1. 1 is in the (row,col)
% for which grid(row)==x(col)
var = bsxfun(@eq, grid', x);
% find the logical indexes of those x entries
varx = sum(var, 1)~=0;
% and of those grid entries
varp = sum(var, 2)~=0;
% Outer-most loop removal - use repmat to
% replicate the vectors into matrices.
% Thus, instead of having a loop over i
% you have matrices of values that would be
% referenced in the loop
ww = repmat(w, numel(grid), 1);
ff = repmat(f, numel(grid), 1);
xx = repmat(x, numel(grid), 1);
gg = repmat(grid', 1, numel(x));
% perform the calculations element-wise on the matrices
sum1 = sum((ww.*ff)./(gg - xx),2);
sum2 = sum(ww./(gg - xx),2);
p = sum1./sum2;
% fix the case where grid==x and return
p(varp) = f(varx);
end
The fully vectorized version can be implemented with bsxfun rather than repmat. This can potentially be a bit faster, since the replicated matrices are not explicitly formed. However, the speed difference may not be large for small system sizes.
The first solution with one loop is also not too bad performance-wise. I suggest you test both and see which is better. Maybe it is not worth fully vectorizing: the first version looks a bit more readable.
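The bsxfun variant is not spelled out above, so here is one way it might look. This is a sketch under a few assumptions: the nodes in x are distinct, x, f and w are row vectors, and the evaluation grid is called xg here (to avoid shadowing the built-in grid function). The sample data is a quadratic, so the interpolant should reproduce it.

```matlab
x  = [0 1 2 3];                  % interpolation nodes (illustrative)
f  = x.^2 - 2*x + 1;             % sample values, a quadratic
n  = numel(x);
w  = zeros(1, n);                % barycentric weights
for j = 1:n
    w(j) = 1 / prod(x(j) - x([1:j-1, j+1:n]));
end
xg = linspace(0, 3, 7);          % evaluation grid; some points hit nodes

d   = bsxfun(@minus, xg', x);    % d(i,j) = xg(i) - x(j)
hit = (d == 0);                  % grid point i coincides with node j
d(hit) = 1;                      % dummy value; those rows are overwritten below

num = sum(bsxfun(@rdivide, w .* f, d), 2);   % numerator sum for each grid point
den = sum(bsxfun(@rdivide, w, d), 2);        % denominator sum for each grid point
p   = (num ./ den)';

[rowHit, colHit] = find(hit);    % with distinct nodes, at most one hit per row
p(rowHit) = f(colHit);           % use the given value where xg equals a node
```

Pairing rows and columns through find keeps the special case correct even if the grid or the nodes are not sorted.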
