How do I optimize my Dijkstra algorithm in MATLAB to decrease the running time?

Given below is my Dijkstra implementation. The B matrix is the cost map, the source node is R = [1,1], and there is no target node: I just want to run Dijkstra from the source node to all the nodes in the matrix.
But my implementation takes more than 20 minutes to run, and I am not sure how to optimize the code. Can you please help me out here?
function hMap = djikstra(B,R)
%B = Cost Map
%R = Initial position of the robot
fprintf('In Dijkstra');
hMap = Inf(length(B)); % assign infinity to every node
hMap(R(1),R(2)) = 0;   % source node
open = [];
open = [R(1),R(2),0];
closed = [];
closed = [closed; open(1,1:2)]; % put the current source node in closed
x = R(1);
y = R(2);
while(size(open) ~= 0)
    if(x-1 ~= 0)
        if(sum(ismember(closed(:,1:2),[x-1,y],'rows')) == 0)
            cost = hMap(x,y) + B(x-1,y);
            fprintf('x-1');
            %if(cost < hMap(x-1,y))
            hMap(x-1,y) = min(cost,hMap(x-1,y));
            fprintf(' update x-1');
            open = [open; x-1,y,hMap(x-1,y)];
            %end
        end
    end
    if(x+1 <= length(B))
        if(sum(ismember(closed(:,1:2),[x+1,y],'rows')) == 0)
            cost = hMap(x,y) + B(x+1,y);
            %if(cost < hMap(x+1,y))
            hMap(x+1,y) = min(cost,hMap(x+1,y));
            open = [open; x+1,y,hMap(x+1,y)];
            %end
        end
    end
    if(y-1 ~= 0)
        if(sum(ismember(closed(:,1:2),[x,y-1],'rows')) == 0)
            cost = hMap(x,y) + B(x,y-1);
            %if(cost < hMap(x,y-1))
            hMap(x,y-1) = min(cost, hMap(x,y-1));
            open = [open; x,y-1,hMap(x,y-1)];
            %end
        end
    end
    if(y+1 <= length(B))
        if(sum(ismember(closed(:,1:2),[x,y+1],'rows')) == 0)
            cost = hMap(x,y) + B(x,y+1);
            %if(cost < hMap(x,y+1))
            hMap(x,y+1) = min(cost,hMap(x,y+1));
            open = [open; x,y+1,hMap(x,y+1)];
            %end
        end
    end
    closed = [closed; x,y];
    open = open(2:size(open,1),:); % remove the expanded node from the open list
    open = unique(open,'rows');
    open = sortrows(open,3); % sort w.r.t. g-value
    fprintf('------------------------------OPEN--------------------------')
    open
    if(size(open) ~= 0)
        x = open(1,1)
        y = open(1,2)
    end
end
end
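For reference, the two usual culprits in code like this are the ismember test against an ever-growing closed list and the full sortrows of open on every iteration. Below is a minimal sketch of the fixes, assuming B is a square cost map as above (dijkstra_sketch is a hypothetical name; a binary-heap priority queue would be faster still):
function hMap = dijkstra_sketch(B, R)
n = size(B,1);
hMap = Inf(n);
hMap(R(1),R(2)) = 0;
visited = false(n);                 % O(1) membership test replaces the closed list
open = [R(1), R(2), 0];             % rows of [x, y, g-value]
while ~isempty(open)
    [~, idx] = min(open(:,3));      % extract the cheapest node instead of sorting
    x = open(idx,1); y = open(idx,2);
    open(idx,:) = [];
    if visited(x,y), continue; end  % skip stale duplicates left in open
    visited(x,y) = true;
    for d = [-1 0; 1 0; 0 -1; 0 1]' % 4-connected neighbours
        nx = x + d(1); ny = y + d(2);
        if nx >= 1 && nx <= n && ny >= 1 && ny <= n && ~visited(nx,ny)
            c = hMap(x,y) + B(nx,ny);
            if c < hMap(nx,ny)      % relax only when the path improves
                hMap(nx,ny) = c;
                open(end+1,:) = [nx, ny, c]; %#ok<AGROW>
            end
        end
    end
end
end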


speed up loop in matlab

I'm very new to MATLAB (this is my first script).
I wonder how I may speed up this loop; I don't know of any toolboxes or tricks, as I'm a newbie. I coded it by instinct and it works, but it is really slow.
All variables are either read with fread or manually entered integers, so this is basically simple math. I have no clue why it is so slow (maybe the nested loops?) or how to improve it, as I am more familiar with Python and, for example, multiprocessing.
Thanks a lot
X = 0;
Points = [0,0,0];
for i = 1:nbLines
    for j = 1:nbPositions-1
        if lDate(i) > posDate(j) && lDate(i) <= posDate(j+1)
            weight = (lDate(i) - posDate(j)) / (posDate(j+1) - posDate(j));
            X = posX(j)*(1-weight) + posX(j+1)*weight;
        end
    end
    if X ~= 0
        for j = 1:nbScans
            Y = -distance(i,j) / tan(angle(i,j));
            Points = [Points; X, Y, distance(i,j)];
        end
    end
end
Here is my attempt at a vectorised version:
X = 0;
Points = cell(0,1);
Points{1} = [0,0,0];
count = 1;
for i = 1:nbLines
    % find j with posDate(j) < lDate(i) <= posDate(j+1)
    id = find(lDate(i) > posDate(1:end-1) & lDate(i) <= posDate(2:end));
    if ~isempty(id)
        weight = (lDate(i) - posDate(id(1))) / (posDate(id(1)+1) - posDate(id(1)));
        X = posX(id(1))*(1-weight) + posX(id(1)+1)*weight;
    end
    if X ~= 0
        count = count + 1;
        Y = -distance(i,:) ./ tan(angle(i,:)); % 1-by-nbScans
        Points{count} = [repelem(X, numel(Y), 1), Y', distance(i,:)'];
    end
end
You have one issue with the given code. The below line:
Points = [Points; X, Y, distance(i,j)];
This will definitely slow down your code: growing an array inside a loop forces MATLAB to reallocate it on every iteration. You need to preallocate the array that stores the numbers. If you do, you will see a good difference in speed.
X = 0;
Points = zeros(nbLines*nbScans + 1, 3); % preallocate to an upper bound
Points(1,:) = [0,0,0];
count = 1;
for i = 1:nbLines
    for j = 1:nbPositions-1
        if lDate(i) > posDate(j) && lDate(i) <= posDate(j+1)
            weight = (lDate(i) - posDate(j)) / (posDate(j+1) - posDate(j));
            X = posX(j)*(1-weight) + posX(j+1)*weight;
        end
    end
    if X ~= 0
        for j = 1:nbScans
            count = count + 1;
            Y = -distance(i,j) / tan(angle(i,j));
            Points(count,:) = [X, Y, distance(i,j)];
        end
    end
end
Points = Points(1:count,:); % trim the unused rows
Note that your code only keeps the last value of X computed by the inner loop; is this what you want?
Try parallelization: use "parfor" instead of "for", which spreads the iterations across all available workers.
parfor i = 1:nbLines
    % rest of code here
end
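One caveat worth adding: parfor requires the iterations to be independent, and the shared count variable above would prevent MATLAB from slicing Points. A runnable toy sketch of a parfor-friendly pattern (made-up sizes and data; needs the Parallel Computing Toolbox), where each iteration writes only to its own cell:
% Toy data standing in for the real fread results
nbLines = 100; nbScans = 8;
distance = rand(nbLines, nbScans);
angle = rand(nbLines, nbScans) + 0.1;
Xs = rand(nbLines, 1);               % one precomputed X per line
PointsCell = cell(nbLines, 1);
parfor i = 1:nbLines
    Y = -distance(i,:) ./ tan(angle(i,:));
    PointsCell{i} = [repmat(Xs(i), nbScans, 1), Y', distance(i,:)'];
end
Points = vertcat(PointsCell{:});     % stitch the per-line blocks together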

Can anyone explain how different this hybrid PSO-GA is from a normal GA?

Does this code have mutation, selection, and crossover, just like the original genetic algorithm?
Since this is a hybrid algorithm (i.e. PSO with GA), does it use all the steps of the original GA or does it skip some of them? Please do tell me.
I am new to this and still trying to understand. Thank you.
%%% Hybrid GA and PSO code
function [gbest, gBestScore, all_scores] = QAP_PSO_GA(CreatePopFcn, FitnessFcn, UpdatePosition, ...
        nCity, nPlant, nPopSize, nIters)
% Set algorithm parameters
constant = 0.95;
c1 = 1.5; %1.4944; %2;
c2 = 1.5; %1.4944; %2;
w = 0.792 * constant;
% Allocate memory and initialize
gBestScore = inf;
all_scores = inf * ones(nPopSize, nIters);
x = CreatePopFcn(nPopSize, nCity);
v = zeros(nPopSize, nCity);
pbest = x;
% update lbest
cost_p = inf * ones(1, nPopSize); %feval(FUN, pbest');
for i = 1:nPopSize
    cost_p(i) = FitnessFcn(pbest(i, 1:nPlant));
end
lbest = update_lbest(cost_p, pbest, nPopSize);
for iter = 1:nIters
    if mod(iter,1000) == 0
        parents = randperm(nPopSize);
        for i = 1:nPopSize
            x(i,:) = (pbest(i,:) + pbest(parents(i),:))/2;
            % v(i,:) = pbest(parents(i),:) - x(i,:);
            % v(i,:) = (v(i,:) + v(parents(i),:))/2;
        end
    else
        % Update velocity
        v = w*v + c1*rand(nPopSize,nCity).*(pbest-x) + c2*rand(nPopSize,nCity).*(lbest-x);
        % Update position
        x = x + v;
        x = UpdatePosition(x);
    end
    % Update pbest
    cost_x = inf * ones(1, nPopSize);
    for i = 1:nPopSize
        cost_x(i) = FitnessFcn(x(i, 1:nPlant));
    end
    s = cost_x < cost_p;
    cost_p = (1-s).*cost_p + s.*cost_x;
    s = repmat(s', 1, nCity);
    pbest = (1-s).*pbest + s.*x;
    % update lbest
    lbest = update_lbest(cost_p, pbest, nPopSize);
    % update global best
    all_scores(:, iter) = cost_x;
    [cost, index] = min(cost_p);
    if cost < gBestScore
        gbest = pbest(index, :);
        gBestScore = cost;
    end
    % draw current fitness
    figure(1);
    plot(iter, min(cost_x), 'cp', 'MarkerEdgeColor','k', 'MarkerFaceColor','g', 'MarkerSize',8)
    hold on
    str = strcat('Best fitness: ', num2str(min(cost_x)));
    disp(str);
end
end

% Function to update lbest (ring-topology local best)
function lbest = update_lbest(cost_p, x, nPopSize)
sm(1, 1) = cost_p(1, nPopSize);
sm(1, 2:3) = cost_p(1, 1:2);
[cost, index] = min(sm);
if index == 1
    lbest(1, :) = x(nPopSize, :);
else
    lbest(1, :) = x(index-1, :);
end
for i = 2:nPopSize-1
    sm(1, 1:3) = cost_p(1, i-1:i+1);
    [cost, index] = min(sm);
    lbest(i, :) = x(i+index-2, :);
end
sm(1, 1:2) = cost_p(1, nPopSize-1:nPopSize);
sm(1, 3) = cost_p(1, 1);
[cost, index] = min(sm);
if index == 3
    lbest(nPopSize, :) = x(1, :);
else
    lbest(nPopSize, :) = x(nPopSize-2+index, :);
end
end
If you are new to optimization, I recommend you first study each algorithm separately; then you may study how GA and PSO can be combined. You will need basic mathematical skills to understand the operators of the two algorithms and to test their efficiency (which is what really matters).
This code chunk is responsible for parent selection and crossover:
parents = randperm(nPopSize);
for i = 1:nPopSize
    x(i,:) = (pbest(i,:) + pbest(parents(i),:))/2;
    % v(i,:) = pbest(parents(i),:) - x(i,:);
    % v(i,:) = (v(i,:) + v(parents(i),:))/2;
end
It is not really obvious how the randperm selection works (I have no experience with MATLAB).
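For what it is worth, randperm(nPopSize) simply returns a random permutation of 1..nPopSize, so the loop above pairs each individual i with a uniformly drawn partner parents(i) and blends the two; a toy illustration with a made-up population:
nPopSize = 4;
pbest = magic(4);                    % stand-in population: one individual per row
parents = randperm(nPopSize);        % random pairing, e.g. [3 1 4 2]
offspring = (pbest + pbest(parents,:)) / 2;   % arithmetic (blend) crossover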
And this is the code that is responsible for updating the velocity and position of each particle:
% Update velocity
v = w*v + c1*rand(nPopSize,nCity).*(pbest-x) + c2*rand(nPopSize,nCity).*(lbest-x);
% Update position
x = x + v;
x = UpdatePosition(x);
This version of the velocity-updating strategy uses what is called an inertia weight w, which basically means we preserve part of each particle's velocity history rather than recomputing it from scratch.
It is worth mentioning that velocity updating happens far more often than crossover, which occurs only once every 1000 iterations.

MATLAB program takes more than 1 hour to execute

The program below finds k-clique communities in an input graph.
The graph dataset can be found here.
The first line of the dataset contains the number of nodes and edges, respectively. Each following line has 'node1 node2', representing an edge between node1 and node2.
For example:
2500 6589 // number_of_nodes, number_of_edges
0 5 // edge between node[0] and node[5]
.
.
.
The k_clique( aCliqueSIZE, anAdjacencyMATRIX ) function is contained here.
The following commands are executed in command window of MATLAB:
x = textread( 'amazon.graph.small' ); %% source input file text
s = max(x(1,1), x(1,2)); %% take the largest dimension
adjMatrix = sparse(x(2:end,1)+1, x(2:end,2)+1, 1, s, s); %% now the matrix is square
adjMatrix = adjMatrix | adjMatrix.'; %% "or" with the transpose to make it symmetric
adjMatrix = full(adjMatrix); %% convert to full if needed
k = 4;
[X,Y,Z] = k_clique(k,adjMatrix);
% The output can be viewed by the following commands
celldisp(X);
celldisp(Y);
Z
The above program takes more than 1 hour to execute, whereas I think this shouldn't be the case. While running the program on Windows, I checked Task Manager and found that only 500 MB was allocated to the program. Is this the reason for the slowness? If yes, then how can I allocate more heap memory (close to 4 GB) to this program in MATLAB?
The problem does not seem to be memory-bound.
A sparse, square, symmetric matrix holding roughly 6.5k edges does not amount to much memory.
The provided code has many for loops and is heavily recursive in the tail function transfer_nodes().
Add a "Stone-Age Profiler" to the code
To show the respective times spent in the CPU-bound sections of the processing, wrap the main sections of the code in a construct of:
tic(); for ... end; toc()
which will print the elapsed time spent in the relevant sections of the k_clique.m code, showing the readings on the fly.
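For instance, a runnable toy of the construct (the loop body stands in for one section of k_clique.m):
tic();                % start a stopwatch before the suspect section
s = 0;
for n = 1:1e7
    s = s + n;
end
toc()                 % prints "Elapsed time is ... seconds."
If a finer breakdown is needed, MATLAB's built-in profiler (profile on, run the code, then profile viewer) reports per-line timings.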
Your original code k_clique.m
function [components,cliques,CC] = k_clique(k,M)
% k-clique algorithm for detecting overlapping communities in a network
% as defined in the paper "Uncovering the overlapping
% community structure of complex networks in nature and society"
%
% [X,Y,Z] = k_clique(k,A)
%
% Inputs:
% k - clique size
% A - adjacency matrix
%
% Outputs:
% X - detected communities
% Y - all cliques (i.e. complete subgraphs that are not parts of larger
% complete subgraphs)
% Z - k-clique matrix

nb_nodes = size(M,1); % number of nodes

% Find the largest possible clique size via the degree sequence:
% Let {d1,d2,...,dk} be the degree sequence of a graph. The largest
% possible clique size of the graph is the maximum value k such that
% dk >= k-1
degree_sequence = sort(sum(M,2) - 1,'descend');
%max_s = degree_sequence(1);
max_s = 0;
for i = 1:length(degree_sequence)
    if degree_sequence(i) >= i - 1
        max_s = i;
    else
        break;
    end
end

cliques = cell(0);
% Find all s-size cliques in the graph
for s = max_s:-1:3
    M_aux = M;
    % Looping over nodes
    for n = 1:nb_nodes
        A = n; % Set of nodes all linked to each other
        B = setdiff(find(M_aux(n,:)==1),n); % Set of nodes that are linked to each node in A, but not necessarily to the nodes in B
        C = transfer_nodes(A,B,s,M_aux); % Enlarging A by transferring nodes from B
        if ~isempty(C)
            for i = 1:size(C,1)
                cliques = [cliques; {C(i,:)}];
            end
        end
        M_aux(n,:) = 0; % Remove the processed node
        M_aux(:,n) = 0;
    end
end

% Generating the clique-clique overlap matrix
CC = zeros(length(cliques));
for c1 = 1:length(cliques)
    for c2 = c1:length(cliques)
        if c1 == c2
            CC(c1,c2) = numel(cliques{c1});
        else
            CC(c1,c2) = numel(intersect(cliques{c1},cliques{c2}));
            CC(c2,c1) = CC(c1,c2);
        end
    end
end

% Extracting the k-clique matrix from the clique-clique overlap matrix
% Off-diagonal elements <= k-1 --> 0
% Diagonal elements <= k --> 0
CC(eye(size(CC))==1) = CC(eye(size(CC))==1) - k;
CC(eye(size(CC))~=1) = CC(eye(size(CC))~=1) - k + 1;
CC(CC >= 0) = 1;
CC(CC < 0) = 0;

% Extracting components (or k-clique communities) from the k-clique matrix
components = [];
for i = 1:length(cliques)
    linked_cliques = find(CC(i,:)==1);
    new_component = [];
    for j = 1:length(linked_cliques)
        new_component = union(new_component,cliques{linked_cliques(j)});
    end
    found = false;
    if ~isempty(new_component)
        for j = 1:length(components)
            if all(ismember(new_component,components{j}))
                found = true;
            end
        end
        if ~found
            components = [components; {new_component}];
        end
    end
end

    function R = transfer_nodes(S1,S2,clique_size,C)
        % Recursive function to transfer nodes from set B to set A (as
        % defined above)

        % Check if the union of S1 and S2 or S1 alone is inside an already
        % found larger clique
        found_s12 = false;
        found_s1 = false;
        for c = 1:length(cliques)
            for cc = 1:size(cliques{c},1)
                if all(ismember(S1,cliques{c}(cc,:)))
                    found_s1 = true;
                end
                if all(ismember(union(S1,S2),cliques{c}(cc,:)))
                    found_s12 = true;
                    break;
                end
            end
        end

        if found_s12 || (length(S1) ~= clique_size && isempty(S2))
            % If the union of the sets A and B can be included in an
            % already found (larger) clique, the recursion is stepped back
            % to check other possibilities
            R = [];
        elseif length(S1) == clique_size
            % The size of A reaches s, a new clique is found
            if found_s1
                R = [];
            else
                R = S1;
            end
        else
            % Check the remaining possible combinations of the neighbors'
            % indices
            if isempty(find(S2>=max(S1),1))
                R = [];
            else
                R = [];
                for w = find(S2>=max(S1),1):length(S2)
                    S2_aux = S2;
                    S1_aux = S1;
                    S1_aux = [S1_aux S2_aux(w)];
                    S2_aux = setdiff(S2_aux(C(S2(w),S2_aux)==1),S2_aux(w));
                    R = [R; transfer_nodes(S1_aux,S2_aux,clique_size,C)];
                end
            end
        end
    end
end

Fast computation of warp matrices

For a fixed and given tform, the imwarp command in the Image Processing Toolbox
B = imwarp(A,tform)
is linear with respect to A, meaning there exists some sparse matrix W, depending on tform but independent of A, such that the above can be equivalently implemented
B(:)=W*A(:)
for all A of fixed known dimensions [n,n]. My question is whether there are fast/efficient options for computing W. The matrix form is necessary when I need the transpose operation W.'*B(:), or if I need to do W\B(:) or similar linear algebraic things which I can't do directly through imwarp alone.
I know that it is possible to compute W column-by-column by doing
E = zeros(n);
W = spalloc(n^2, n^2, 4*n^2);
for i = 1:n^2
    E(i) = 1;
    tmp = imwarp(E, tform);
    E(i) = 0;
    W(:,i) = tmp(:);
end
but this is brute force and slow.
The routine FUNC2MAT is somewhat more efficient in that it uses the loop only to compute/gather the sparse entry data I,J,S of each column W(:,i) and then, after the loop, constructs the overall sparse matrix in a single call. It also offers the option of using a PARFOR loop. However, this is still slower than I would like.
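For concreteness, the gather-then-construct idea looks roughly like this (a sketch, not FUNC2MAT itself; it assumes, as above, that imwarp returns an n-by-n image and that n and tform are already defined):
Icell = cell(n^2,1); Jcell = cell(n^2,1); Scell = cell(n^2,1);
E = zeros(n);
for i = 1:n^2
    E(i) = 1;
    tmp = imwarp(E, tform);          % column i of W, as a dense image
    E(i) = 0;
    nz = find(tmp(:));               % keep only the few nonzero entries
    Icell{i} = nz;
    Jcell{i} = repmat(i, numel(nz), 1);
    Scell{i} = tmp(nz);
end
% one sparse() call at the end avoids repeated insertion into W
W = sparse(vertcat(Icell{:}), vertcat(Jcell{:}), vertcat(Scell{:}), n^2, n^2);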
Can anyone suggest more speed-optimal alternatives?
EDIT:
For those uncomfortable with my claim that imwarp(A,tform) is linear w.r.t. A, I include the demo script below, which tests that the superposition property is satisfied for random input images and tform data. It can be run repeatedly to see that linearityError is always small, and easily attributable to floating-point noise.
tform = affine2d([rand(3,2), [0;0;1]]); % affine2d expects a 3-by-3 matrix whose last column is [0;0;1]
%tform = projective2d(rand(3));
fun = @(A) imwarp(A, tform, 'cubic');
I1 = rand(100); I2 = rand(100);
c1 = rand; c2 = rand;
LHS = fun(c1*I1 + c2*I2);       % left-hand side
RHS = c1*fun(I1) + c2*fun(I2);  % right-hand side
linearityError = norm(LHS(:) - RHS(:), 'inf')
That's actually pretty simple:
W = sparse(B(:)/A(:));
Note that W is not unique, but this operation probably produces the most sparse result. Another way to calculate it would be
W = sparse( B(:) * pinv(A(:)) );
but that results in a much less sparse (yet still valid) result.
I constructed the warping matrix using the optical flow field [u,v], and it works well for my application:
% this function computes the warping matrix
% M x N is the size of the image
function [ Fw ] = generateFwi( u,v,M,N )
Fw = zeros(M*N, M*N);
k = 1;
for i = 1:M
    for j = 1:N
        newcoord(1) = i + u(i,j);
        newcoord(2) = j + v(i,j);
        newi = newcoord(1);
        newj = newcoord(2);
        if newi > 0 && newj > 0
            newi1x = floor(newi);
            newi1y = floor(newj);
            newi2x = floor(newi);
            newi2y = ceil(newj);
            newi3x = ceil(newi); % four nearest points to the given point
            newi3y = floor(newj);
            newi4x = ceil(newi);
            newi4y = ceil(newj);
            x1 = [newi,newj; newi1x,newi1y];
            x2 = [newi,newj; newi2x,newi2y];
            x3 = [newi,newj; newi3x,newi3y];
            x4 = [newi,newj; newi4x,newi4y];
            w1 = pdist(x1,'euclidean');
            w2 = pdist(x2,'euclidean');
            w3 = pdist(x3,'euclidean');
            w4 = pdist(x4,'euclidean');
            if ceil(newi) == floor(newi) && ceil(newj) == floor(newj) % both new coordinates are integers
                Fw(k,(newi1x-1)*N+newi1y) = 1;
            elseif ceil(newi) == floor(newi) % newi is an integer
                w = w1 + w2;
                w1new = w1/w;
                w2new = w2/w;
                W = w1new*w2new;
                y1coord = (newi1x-1)*N + newi1y;
                y2coord = (newi2x-1)*N + newi2y;
                if y1coord <= M*N && y2coord <= M*N
                    Fw(k,y1coord) = W/w2new;
                    Fw(k,y2coord) = W/w1new;
                end
            elseif ceil(newj) == floor(newj) % newj is an integer
                w = w1 + w3;
                w1 = w1/w;
                w3 = w3/w;
                W = w1*w3;
                y1coord = (newi1x-1)*N + newi1y;
                y2coord = (newi3x-1)*N + newi3y;
                if y1coord <= M*N && y2coord <= M*N
                    Fw(k,y1coord) = W/w3;
                    Fw(k,y2coord) = W/w1;
                end
            else % neither of the new coordinates is an integer
                w = w1 + w2 + w3 + w4;
                w1 = w1/w;
                w2 = w2/w;
                w3 = w3/w;
                w4 = w4/w;
                W = w1*w2*w3 + w2*w3*w4 + w3*w4*w1 + w4*w1*w2;
                y1coord = (newi1x-1)*N + newi1y;
                y2coord = (newi2x-1)*N + newi2y;
                y3coord = (newi3x-1)*N + newi3y;
                y4coord = (newi4x-1)*N + newi4y;
                if y1coord <= M*N && y2coord <= M*N && y3coord <= M*N && y4coord <= M*N
                    Fw(k,y1coord) = w2*w3*w4/W;
                    Fw(k,y2coord) = w3*w4*w1/W;
                    Fw(k,y3coord) = w4*w1*w2/W;
                    Fw(k,y4coord) = w1*w2*w3/W;
                end
            end
        else
            Fw(k,k) = 1;
        end
        k = k + 1;
    end
end
end
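A hypothetical usage sketch, assuming the row-major flattening (i-1)*N+j used inside the function (pdist needs the Statistics and Machine Learning Toolbox):
M = 32; N = 32;
A = rand(M, N);
u = 0.5*ones(M, N); v = zeros(M, N);   % constant half-pixel shift along i
Fw = generateFwi(u, v, M, N);
Avec = reshape(A', [], 1);             % row-major flattening to match (i-1)*N+j
Bvec = Fw * Avec;                      % apply the warp as a matrix product
B = reshape(Bvec, N, M)';              % back to an M-by-N image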

Permutations with order restrictions

Let L be a list of objects. Moreover, let C be a set of constraints, e.g.:
C(1) = t1 comes before t2, where t1 and t2 belong to L
C(2) = t3 comes after t2, where t3 and t2 belong to L
How can I find (in MATLAB) the set of permutations for which the constraints in C are not violated?
My first solution is naive:
orderings = perms(L);
toBeDeleted = zeros(1, size(orderings,1));
for ii = 1:size(orderings,1)
    for jj = 1:size(constraints,1)
        idxA = find(orderings(ii,:) == constraints(jj,1));
        idxB = find(orderings(ii,:) == constraints(jj,2));
        if idxA > idxB
            toBeDeleted(ii) = 1;
        end
    end
end
where constraints is a set of constraints (each constraint is on a row of two elements, specifying that the first element comes before the second element).
I was wondering whether there exists a simpler (and more efficient) solution.
Thanks in advance.
I'd say that's a pretty good solution you have so far.
There are a few optimizations I see, though. Here's my variation:
% INITIALIZE
NN = 9;
L = rand(1,NN-1);
while numel(L) ~= NN
    L = unique(randi(100,1,NN));
end
% Some bogus constraints
constraints = [...
    L(1) L(2)
    L(3) L(6)
    L(3) L(5)
    L(8) L(4)];

% METHOD 0 (your original method)
tic
orderings = perms(L);
p = size(orderings,1);
c = size(constraints,1);
toKeep = true(p,1);
for perm = 1:p
    for constr = 1:c
        idxA = find(orderings(perm,:) == constraints(constr,1));
        idxB = find(orderings(perm,:) == constraints(constr,2));
        if idxA > idxB
            toKeep(perm) = false;
        end
    end
end
orderings0 = orderings(toKeep,:);
toc

% METHOD 1 (your original, plus a few optimizations)
tic
orderings = perms(L);
p = size(orderings,1);
c = size(constraints,1);
toKeep = true(p,1);
for perm = 1:p
    for constr = 1:c
        % break on first constraint breached
        if toKeep(perm)
            % find only the *first* entry
            toKeep(perm) = ...
                find(orderings(perm,:) == constraints(constr,1), 1) < ...
                find(orderings(perm,:) == constraints(constr,2), 1);
        else
            break
        end
    end
end
orderings1 = orderings(toKeep,:);
toc

% METHOD 2
tic
orderings = perms(L);
p = size(orderings,1);
c = size(constraints,1);
toKeep = true(p,1);
for constr = 1:c
    % break once no permutations remain
    if any(toKeep)
        % Vectorized search for constraint values
        [i1, j1] = find(orderings == constraints(constr,1));
        [i2, j2] = find(orderings == constraints(constr,2));
        % sort by rows
        [i1, j1i] = sort(i1);
        [i2, j2i] = sort(i2);
        % Check if columns meet condition
        toKeep = toKeep & j1(j1i) < j2(j2i);
    else
        break
    end
end
orderings2 = orderings(toKeep,:);
toc

% Check for equality
all(orderings1(:) == orderings0(:))
all(orderings2(:) == orderings1(:))
Results:
Elapsed time is 17.911469 seconds. % your method
Elapsed time is 10.477549 seconds. % your method + optimizations
Elapsed time is 2.184242 seconds. % vectorized outer loop
ans =
1
ans =
1
The whole approach, however, has one fundamental flaw IMHO: the direct use of perms. This inherently poses a limitation due to memory constraints (NN < 10, as stated in help perms).
I have a strong suspicion you can get better performance, both time-wise and memory-wise, if you put together a customized perms that enumerates only valid orderings (see the sketch below). Luckily, perms is not built-in, so you can start by copy-pasting its code into your custom function.
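For instance, a minimal backtracking sketch (constrainedPerms is a hypothetical name) that builds each permutation element by element and abandons a branch as soon as placing an element would violate a "comes before" constraint, so invalid orderings are never enumerated:
function P = constrainedPerms(L, constraints)
% Each row of P is a permutation of L satisfying every [before, after]
% row of constraints.
P = grow([], L(:)', constraints);
end

function P = grow(prefix, remaining, constraints)
if isempty(remaining)
    P = prefix;            % a complete, valid permutation
    return
end
P = [];
for k = 1:numel(remaining)
    cand = remaining(k);
    % cand may be placed next only if nothing that must precede it
    % is still waiting to be placed
    pre = constraints(constraints(:,2) == cand, 1);
    if ~any(ismember(pre, remaining))
        P = [P; grow([prefix cand], remaining([1:k-1, k+1:end]), constraints)]; %#ok<AGROW>
    end
end
end
For heavily constrained inputs this visits far fewer nodes than enumerating all factorial(NN) rows of perms first.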
