worldStates is a Matlab MxNxL 3D array (tensor) containing L states of a MxN grid of binary values.
ps is a length L list of probabilities associated with the different states.
The function [worldStates, ps] = StateMerge(worldStates, ps) should remove duplicate world states and sum the probabilities of the merged states to the single state that remains. Duplicate states are states with the exact same configuration of binary values.
Here is the current implementation of this function:
function [worldStates, ps] = StateMerge(worldStates, ps)
M = containers.Map;
for i = 1:length(ps)
s = worldStates(:,:,i);
s = mat2str(s);
if isKey(M, s)
M(s) = M(s) + ps(i);
else
M(s) = ps(i);
end
end
stringStates = keys(M);
n = length(stringStates);
sz = size(worldStates);
worldStates = zeros([sz(1:2), n]);
ps = zeros(1, 1, n);
for i = 1:n
worldStates(:,:,i) = eval(stringStates{i});
ps(i) = M(stringStates{i});
end
end
It uses a Map to remove duplicates in O(L) time, using the states as keys and the probabilities as values. Since MATLAB maps do not allow general data structures as keys, the states are converted into string representations to be used as keys and later converted back to arrays using the eval function.
It turns out this code is way too slow for my needs, as I will want to process many states (magnitude ~10^6) many times (10^3). The problem lies in converting the matrix to a string, which takes a substantial amount of time and scales poorly with state size. An example for small 25x25 states is given below:
How could I create keys in a more efficient manner? Is there another solution aside from using a map that would yield better results?
EDIT: Runnable code as requested. This example makes merges very unlikely:
worldStates = double(rand(25,25, 1000) > 0.5);
weights = rand(1,1, 1000);
ps = weights./sum(weights);
[worldStates, ps] = StateMerge(worldStates, ps);
In this example there will be lots of merges:
worldStates = double(rand(25,25) > 0.5) .* ones(1,1,1000);
worldStates(1:2,1:2,:) = rand(2,2,1000) > 0.5;
weights = rand(1,1, 1000);
ps = weights./sum(weights);
[worldStates, ps] = StateMerge(worldStates, ps);
Note that the version below, like your solution, doesn't preserve the order of the original states. As suggested by @Wolfie in the comments, you can pass the 'stable' option to unique, i.e. unique(worldStates1, 'rows', 'stable'), to preserve the order of the states. Use unique to extract the unique (merged) states and accumarray to sum the probabilities of the merged states:
function [worldStates, ps] = StateMerge(worldStates, ps)
[M, N, L] = size (worldStates);
worldStates1 = reshape(worldStates, M*N, L).';
[~, uc, ui] = unique(worldStates1, 'rows');
ps = accumarray(ui, ps(:));
worldStates = worldStates (:, :, uc);
end
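To make the grouping step concrete, here is a minimal sketch on a tiny input. The variable names (rowsPerState, firstIdx, groupIdx, mergedPs, mergedStates) are made up for illustration, and the 'stable' option is assumed to be available in your MATLAB release.

```matlab
% Three 2x2 states; states 1 and 3 are identical and should be merged.
worldStates = cat(3, [1 0; 0 1], [0 1; 1 0], [1 0; 0 1]);
ps = reshape([0.2 0.5 0.3], 1, 1, []);

[M, N, L] = size(worldStates);
rowsPerState = reshape(worldStates, M*N, L).';      % one flattened state per row

% firstIdx: first occurrence of each distinct state
% groupIdx: for every original state, the group it belongs to
[~, firstIdx, groupIdx] = unique(rowsPerState, 'rows', 'stable');

mergedPs = accumarray(groupIdx, ps(:));              % sum probabilities per group
mergedStates = worldStates(:, :, firstIdx);          % keep one state per group
```

Note that accumarray returns a column vector, so if the rest of your code expects ps as a 1x1xn array you would have to reshape it back, e.g. reshape(mergedPs, 1, 1, []).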
I am running the following code to obtain the values of the inverse EDF of a data Matrix at the data points:
function [mOUT] = InvEDF (data)
% compute inverse of EDF at data values
% function takes T*K matrix of data and returns T*K matrix of transformed
% data, keeping the order of the original series
T = size(data, 1);
K = size(data, 2);
mOUT=zeros(T,K);
for j = 1:K
for i = 1:T
temp = data(:,j)<=data(i,j);
mOUT(i,j) = 1/(T+1)*sum(temp);
end
end
The data matrix is usually of size 1000*10 or even 1000*30, and I am calling this function a few thousand times. Is there a faster way of doing this? Any answers are appreciated. Thanks!
You can sort the values and use the index in the sorted matrix as the count of values less or equal. We treat each column by itself, so I will illustrate on a Mx1 matrix.
A = rand(M,1);
[B,I] = sort(A);
C(I) = 1:M;
C(i) will now contain the count of values less than or equal to A(i). If you can have duplicate values, you need to take that into account, because tied values get different positions in the sort order.
The advantage of this approach is that we can do it in O(M log M) time, whereas your original inner loop is O(M^2).
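One possible way to handle duplicates, sketched here under the assumption that the data is a single column, is to count per unique value and take a cumulative sum. The names counts, cumCounts and edf are illustrative only.

```matlab
A = [0.3; 0.1; 0.3; 0.7];            % note the tied value 0.3

[~, ~, ic] = unique(A);              % ic maps each entry to its (sorted) unique value
counts = accumarray(ic, 1);          % multiplicity of each unique value
cumCounts = cumsum(counts);          % number of entries <= each unique value
C = cumCounts(ic);                   % per entry: count of values <= A(i)

edf = C / (numel(A) + 1);            % same scaling as in the question
```

This still costs O(M log M) per column because unique sorts internally, and tied entries receive the same count.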
Try this -
mOUT=zeros(T,K);
for j = 1:K
d1 = data(:,j);
mOUT(:,j) = sum(bsxfun(@ge,d1,d1'),2);
end
mOUT = mOUT./(T+1);
My question is twofold:
In the below, A = full(S) where S is a sparse matrix.
What's the "correct" way to access an element in a sparse matrix?
That is, what would the sparse equivalent to var = A(row, col) be?
My view on this topic: You wouldn't do anything different. var = S(row, col) is as efficient as it gets.
What's the "correct" way to add elements to a sparse matrix?
That is, what would the sparse equivalent of A(row, col) = var be? (Assuming A(row, col) == 0 to begin with)
It is known that simply doing A(row, col) = var is slow for large sparse matrices. From the documentation:
If you wanted to change a value in this matrix, you might be tempted
to use the same indexing:
B(3,1) = 42; % This code does work, however, it is slow.
My view on this topic: When working with sparse matrices, you often start with the vectors and use them to create the matrix this way: S = sparse(i,j,s,m,n). Of course, you could also have created it like this: S = sparse(A) or sprand(m,n,density) or something similar.
If you start off the first way, you would simply do:
i = [i; new_i];
j = [j; new_j];
s = [s; new_s];
S = sparse(i,j,s,m,n);
If you started out not having the vectors, you would do the same thing, but use find first:
[i, j, s] = find(S);
i = [i; new_i];
j = [j; new_j];
s = [s; new_s];
S = sparse(i,j,s,m,n);
Now you would of course have the vectors, and can reuse them if you're doing this operation several times. It would however be better to add all new elements at once, and not do the above in a loop, because growing vectors are slow. In this case, new_i, new_j and new_s will be vectors corresponding to the new elements.
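As a small illustration of the triplet approach, the sketch below appends several new entries in one go and rebuilds the matrix. The sizes and values are arbitrary. One detail worth keeping in mind: sparse sums the values of repeated (i, j) pairs, so appending an index pair that already exists adds to the old value instead of overwriting it.

```matlab
m = 4; n = 4;
i = [1; 3]; j = [2; 4]; s = [10; 20];
S = sparse(i, j, s, m, n);

% Append all new elements at once, then rebuild.
new_i = [2; 4]; new_j = [1; 3]; new_s = [5; 7];
i = [i; new_i];
j = [j; new_j];
s = [s; new_s];
S = sparse(i, j, s, m, n);

% Repeated index pairs are accumulated by sparse:
S2 = sparse([1; 1], [2; 2], [10; 1], m, n);   % S2(1,2) is 10 + 1
```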
MATLAB stores sparse matrices in compressed column format. This means that when you perform an operation like A(2,2) (to get the element at row 2, column 2), MATLAB first accesses the second column and then finds the element in row 2 (row indices in each column are stored in ascending order). You can think of it as:
A2 = A(:,2);
A2(2)
If you are only accessing a single element of a sparse matrix, doing var = S(r,c) is fine. But if you are looping over the elements of a sparse matrix, you probably want to access one column at a time, and then loop over the nonzero row indices via [i,~,x]=find(S(:,c)). Or use something like spfun.
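A minimal sketch of the column-at-a-time pattern follows. The matrix and the accumulation into total are placeholders for whatever per-element work you actually need.

```matlab
S = sparse([1 3 2], [1 1 3], [4 5 6], 3, 3);

total = 0;
for c = 1:size(S, 2)
    [rowsNz, ~, vals] = find(S(:, c));    % nonzeros of column c only
    for k = 1:numel(rowsNz)
        % here (rowsNz(k), c) is the position and vals(k) the value
        total = total + vals(k);
    end
end
```

Empty columns simply produce empty outputs from find, so the inner loop is skipped for them.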
You should avoid constructing a dense matrix A and then doing S = sparse(A), as this operation just squeezes out zeros. Instead, as you note, it's much more efficient to build a sparse matrix from scratch using triplet form and a call to sparse(i,j,x,m,n). MATLAB has a nice page which describes how to efficiently construct sparse matrices.
The original paper describing the implementation of sparse matrices in MATLAB is quite a good read. It provides some more info on how the sparse matrix algorithms were originally implemented.
EDIT: Answer modified according to suggestions by Oleg (see comments).
Here is my benchmark for the second part of your question. For testing direct insertion, the matrices are initialized empty with a varying nzmax. For testing rebuilding from index vectors this is irrelevant as the matrix is built from scratch at every call. The two methods were tested for doing a single insertion operation (of a varying number of elements), or for doing incremental insertions, one value at a time (up to the same numbers of elements). Due to the computational strain I lowered the number of repetitions from 1000 to 100 for each test case. I believe this is still statistically viable.
Ssize = 10000;
NumIterations = 100;
NumInsertions = round(logspace(0, 4, 10));
NumInitialNZ = round(logspace(1, 4, 4));
NumTests = numel(NumInsertions) * numel(NumInitialNZ);
TimeDirect = zeros(numel(NumInsertions), numel(NumInitialNZ));
TimeIndices = zeros(numel(NumInsertions), 1);
%% Single insertion operation (non-incremental)
% Method A: Direct insertion
for iInitialNZ = 1:numel(NumInitialNZ)
disp(['Running with initial nzmax = ' num2str(NumInitialNZ(iInitialNZ))]);
for iInsertions = 1:numel(NumInsertions)
tSum = 0;
for jj = 1:NumIterations
S = spalloc(Ssize, Ssize, NumInitialNZ(iInitialNZ));
r = randi(Ssize, NumInsertions(iInsertions), 1);
c = randi(Ssize, NumInsertions(iInsertions), 1);
tic
S(r,c) = 1;
tSum = tSum + toc;
end
disp([num2str(NumInsertions(iInsertions)) ' direct insertions: ' num2str(tSum) ' seconds']);
TimeDirect(iInsertions, iInitialNZ) = tSum;
end
end
% Method B: Rebuilding from index vectors
for iInsertions = 1:numel(NumInsertions)
tSum = 0;
for jj = 1:NumIterations
i = []; j = []; s = [];
r = randi(Ssize, NumInsertions(iInsertions), 1);
c = randi(Ssize, NumInsertions(iInsertions), 1);
s_ones = ones(NumInsertions(iInsertions), 1);
tic
i_new = [i; r];
j_new = [j; c];
s_new = [s; s_ones];
S = sparse(i_new, j_new ,s_new , Ssize, Ssize);
tSum = tSum + toc;
end
disp([num2str(NumInsertions(iInsertions)) ' indexed insertions: ' num2str(tSum) ' seconds']);
TimeIndices(iInsertions) = tSum;
end
SingleOperation.TimeDirect = TimeDirect;
SingleOperation.TimeIndices = TimeIndices;
%% Incremental insertion
for iInitialNZ = 1:numel(NumInitialNZ)
disp(['Running with initial nzmax = ' num2str(NumInitialNZ(iInitialNZ))]);
% Method A: Direct insertion
for iInsertions = 1:numel(NumInsertions)
tSum = 0;
for jj = 1:NumIterations
S = spalloc(Ssize, Ssize, NumInitialNZ(iInitialNZ));
r = randi(Ssize, NumInsertions(iInsertions), 1);
c = randi(Ssize, NumInsertions(iInsertions), 1);
tic
for ii = 1:NumInsertions(iInsertions)
S(r(ii),c(ii)) = 1;
end
tSum = tSum + toc;
end
disp([num2str(NumInsertions(iInsertions)) ' direct insertions: ' num2str(tSum) ' seconds']);
TimeDirect(iInsertions, iInitialNZ) = tSum;
end
end
% Method B: Rebuilding from index vectors
for iInsertions = 1:numel(NumInsertions)
tSum = 0;
for jj = 1:NumIterations
i = []; j = []; s = [];
r = randi(Ssize, NumInsertions(iInsertions), 1);
c = randi(Ssize, NumInsertions(iInsertions), 1);
tic
for ii = 1:NumInsertions(iInsertions)
i = [i; r(ii)];
j = [j; c(ii)];
s = [s; 1];
S = sparse(i, j ,s , Ssize, Ssize);
end
tSum = tSum + toc;
end
disp([num2str(NumInsertions(iInsertions)) ' indexed insertions: ' num2str(tSum) ' seconds']);
TimeIndices(iInsertions) = tSum;
end
IncremenalInsertion.TimeDirect = TimeDirect;
IncremenalInsertion.TimeIndices = TimeIndices;
%% Plot results
% Single insertion
figure;
loglog(NumInsertions, SingleOperation.TimeIndices);
cellLegend = {'Using index vectors'};
hold all;
for iInitialNZ = 1:numel(NumInitialNZ)
loglog(NumInsertions, SingleOperation.TimeDirect(:, iInitialNZ));
cellLegend = [cellLegend; {['Direct insertion, initial nzmax = ' num2str(NumInitialNZ(iInitialNZ))]}];
end
hold off;
title('Benchmark for single insertion operation');
xlabel('Number of insertions'); ylabel('Runtime for 100 operations [sec]');
legend(cellLegend, 'Location', 'NorthWest');
grid on;
% Incremental insertions
figure;
loglog(NumInsertions, IncremenalInsertion.TimeIndices);
cellLegend = {'Using index vectors'};
hold all;
for iInitialNZ = 1:numel(NumInitialNZ)
loglog(NumInsertions, IncremenalInsertion.TimeDirect(:, iInitialNZ));
cellLegend = [cellLegend; {['Direct insertion, initial nzmax = ' num2str(NumInitialNZ(iInitialNZ))]}];
end
hold off;
title('Benchmark for incremental insertions');
xlabel('Number of insertions'); ylabel('Runtime for 100 operations [sec]');
legend(cellLegend, 'Location', 'NorthWest');
grid on;
I ran this in MATLAB R2012a. The results for doing a single insertion operation are summarized in this graph:
This shows that using direct insertion is much slower than using index vectors, if only a single operation is done. The growth in the case of using index vectors can be either because of growing the vectors themselves or from the lengthier sparse matrix construction, I'm not sure which. The initial nzmax used to construct the matrices seems to have no effect on their growth.
The results for doing incremental insertions are summarized in this graph:
Here we see the opposite trend: using index vectors is slower, because of the overhead of incrementally growing them and rebuilding the sparse matrix at every step. A way to understand this is to look at the first point in the previous graph: for insertion of a single element, it is more effective to use direct insertion rather than rebuilding using the index vectors. In the incremental case, this single insertion is done repetitively, and so it becomes viable to use direct insertion rather than index vectors, against MATLAB's suggestion.
This understanding also suggests that were we to incrementally add, say, 100 elements at a time, the efficient choice would then be to use index vectors rather than direct insertion, as the first graph shows this method to be faster for insertions of this size. In between these two regimes is an area where you should probably experiment to see which method is more effective, though the results will probably show that the difference between the methods is negligible there.
Bottom line: which method should I use?
My conclusion is that this depends on the nature of your intended insertion operations.
If you intend to insert elements one at a time, use direct insertion.
If you intend to insert a large (>10) number of elements at a time, rebuild the matrix from index vectors.
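One caveat when adapting the direct-insertion test: with index vectors, S(r,c) = 1 assigns the whole submatrix spanned by r and c, not just the pairs (r(k), c(k)). If you want exactly one element per pair, a possible alternative is linear indexing with sub2ind. The sketch below only illustrates the difference; the names Sblock and Spairs are made up.

```matlab
S = sparse(5, 5);                 % all-zero 5x5 sparse matrix
r = [1; 2; 3];
c = [4; 5; 1];

Sblock = S;
Sblock(r, c) = 1;                 % fills rows 1:3 in columns 4, 5 and 1

Spairs = S;
Spairs(sub2ind(size(S), r, c)) = 1;   % sets only (1,4), (2,5), (3,1)
```

Here Sblock ends up with 9 nonzeros and Spairs with 3, so timings obtained with S(r,c) = 1 may reflect more insertions than the nominal count.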
I have been given an assignment in which I am supposed to write an algorithm which performs polynomial interpolation by the barycentric formula. The formula states that:
p(x) = (SIGMA_(j=0 to n) w(j)*f(j)/(x - x(j)))/(SIGMA_(j=0 to n) w(j)/(x - x(j)))
I have written an algorithm which works just fine, and I get the polynomial output I desire. However, this requires the use of some quite long loops, and for a large number of grid points, lots of nasty loop operations will have to be done. Thus, I would appreciate it greatly if anyone has any hints as to how I may improve this, so that I can avoid all these loops.
In the algorithm, x and f stand for the given points we are supposed to interpolate. w stands for the barycentric weights, which have been calculated before running the algorithm. And grid is the linspace over which the interpolation should take place:
function p = barycentric_formula(x,f,w,grid)
%Assert x-vectors and f-vectors have same length.
if length(x) ~= length(f)
error('Not equal amounts of x- and y-values. Function is terminated.');
end
n = length(x);
m = length(grid);
p = zeros(1,m);
% Loops for finding polynomial values at grid points. All values are
% calculated by the barycentric formula.
for i = 1:m
var = 0;
sum1 = 0;
sum2 = 0;
for j = 1:n
if grid(i) == x(j)
p(i) = f(j);
var = 1;
else
sum1 = sum1 + (w(j)*f(j))/(grid(i) - x(j));
sum2 = sum2 + (w(j)/(grid(i) - x(j)));
end
end
if var == 0
p(i) = sum1/sum2;
end
end
This is a classical case for matlab 'vectorization'. I would say - just remove the loops. It is almost that simple. First, have a look at this code:
function p = bf2(x, f, w, grid)
m = length(grid);
p = zeros(1,m);
for i = 1:m
var = grid(i)==x;
if any(var)
p(i) = f(var);
else
sum1 = sum((w.*f)./(grid(i) - x));
sum2 = sum(w./(grid(i) - x));
p(i) = sum1/sum2;
end
end
end
I have removed the inner loop over j. All I did was remove the (j) indexing and change the arithmetic operators from / to ./ and from * to .* (the same operators, but with a dot in front to signify that the operation is performed element by element). These are called array operators, in contrast to ordinary matrix operators. Also note that treating the special case where the grid points fall onto x is very similar to what you had in the original implementation, only using a logical vector var such that x(var)==grid(i).
Now, you can also remove the outermost loop. This is a bit more tricky, and there are two major approaches to doing that in MATLAB. I will do it the simpler way, which can be less efficient but is clearer to read, using repmat:
function p = bf3(x, f, w, grid)
% Find grid points that coincide with x.
% The below compares all grid values with all x values
% and returns a matrix of 0/1. 1 is in the (row,col)
% for which grid(row)==x(col)
var = bsxfun(@eq, grid', x);
% find the logical indexes of those x entries
varx = sum(var, 1)~=0;
% and of those grid entries
varp = sum(var, 2)~=0;
% Outer-most loop removal - use repmat to
% replicate the vectors into matrices.
% Thus, instead of having a loop over j
% you have matrices of values that would be
% referenced in the loop
ww = repmat(w, numel(grid), 1);
ff = repmat(f, numel(grid), 1);
xx = repmat(x, numel(grid), 1);
gg = repmat(grid', 1, numel(x));
% perform the calculations element-wise on the matrices
sum1 = sum((ww.*ff)./(gg - xx),2);
sum2 = sum(ww./(gg - xx),2);
p = sum1./sum2;
% fix the case where grid==x and return
p(varp) = f(varx);
end
The fully vectorized version can be implemented with bsxfun rather than repmat. This can potentially be a bit faster, since the matrices are not explicitly formed. However, the speed difference may not be large for small system sizes.
Also, the first solution with one loop is not too bad performance-wise. I suggest you test both and see which is better. Maybe it is not worth it to fully vectorize? The first code looks a bit more readable.
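For completeness, a bsxfun-based variant might look like the sketch below. It is not taken from the answer above; the test data (three nodes on x.^2 with hand-computed weights) and the names d, wd, hitRow, hitCol are assumptions chosen for illustration, and the nodes in x are assumed to be distinct.

```matlab
x = [0 1 2];                         % interpolation nodes
f = x.^2;                            % function values at the nodes
w = [0.5 -1 0.5];                    % w(j) = 1/prod(x(j) - x(k)), k ~= j
grid = linspace(0, 2, 5);            % evaluation points

d  = bsxfun(@minus, grid(:), x(:).');    % d(i,j) = grid(i) - x(j)
wd = bsxfun(@rdivide, w(:).', d);        % wd(i,j) = w(j) / (grid(i) - x(j))
p  = (wd * f(:)) ./ sum(wd, 2);          % barycentric quotient, one row per grid point

% Grid points that coincide with a node produce Inf/NaN above; overwrite them.
[hitRow, hitCol] = find(d == 0);
p(hitRow) = f(hitCol);
p = p.';                                 % return a row vector like the loop versions
```

Since three nodes reproduce a quadratic exactly, p should match grid.^2 here up to rounding, which makes this a convenient sanity check.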
Statement of Problem:
I have an array M with m rows and n columns. The array M is filled with non-zero elements.
I also have a vector t with n elements, and a vector omega
with m elements.
The elements of t correspond to the columns of matrix M.
The elements of omega correspond to the rows of matrix M.
Goal of Algorithm:
Define chi as the multiplication of vector t and omega. I need to obtain a 1D vector a, where each element of a is a function of chi.
Each element of chi is unique (i.e. every element is different).
Using mathematics notation, this can be expressed as a(chi)
Each element of vector a corresponds to an element or elements of M.
Matlab code:
Here is a code snippet showing how the vectors t and omega are generated. The matrix M is pre-existing.
[m,n] = size(M);
t = linspace(0,5,n);
omega = linspace(0,628,m);
Conceptual Diagram:
This appears to be a type of integration (if this is the right word for it) along constant chi.
Reference:
Link to reference
The algorithm is not explicitly stated in the reference. I only wish that this algorithm was described in a manner reminiscent of computer science textbooks!
Looking at Figure 11.5, the matrix M is Figure 11.5(a). The goal is to find an algorithm to convert Figure 11.5(a) into 11.5(b).
It appears that the algorithm is a type of integration (averaging, perhaps?) along constant chi.
It appears to me that reshape is the matlab function you need to use. As noted in the link:
B = reshape(A,siz) returns an n-dimensional array with the same elements as A, but reshaped to siz, a vector representing the dimensions of the reshaped array.
That is, create a vector siz with the number m*n in it, and say A = reshape(P,siz), where P is the product of vectors t and ω; or perhaps say something like A = reshape(t*ω,[m*n]). (I don't have matlab here, or would run a test to see if I have the product the right way around.) Note, the link does not show an example with one number (instead of several) after the matrix parameter to reshape, but I would expect from the description that A = reshape(t*ω,m*n) might also work.
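A minimal sketch of this idea is shown below, with small made-up sizes. As far as I know reshape needs at least two dimensions in its size argument, so a column vector is requested explicitly; the orientation of the product (omega along rows, t along columns, matching M) is an assumption.

```matlab
m = 3; n = 4;
t = linspace(0, 5, n);               % one entry per column of M
omega = linspace(0, 628, m);         % one entry per row of M

chi = omega(:) * t(:).';             % m-by-n, chi(j,i) = omega(j) * t(i)
chiVec = reshape(chi, m*n, 1);       % column vector, same as chi(:)
```

Passing m*n alone as the size is rejected by reshape, so use reshape(chi, m*n, 1), reshape(chi, [], 1) or simply chi(:).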
You should add pseudocode or a link to the algorithm you want to implement. From what I could understand, I have developed the following code anyway:
M = [1 2 3 4; 5 6 7 8; 9 10 11 12]' % easy test M matrix
a = reshape(M, prod(size(M)), 1) % convert M to vector 'a' with reshape command
[m,n] = size(M); % Your sample code
t = linspace(0,5,n); % Your sample code
omega = linspace(0,628,m); % Your sample code
for i=1:length(t)
for j=1:length(omega) % Access a(chi) in the desired order
chi = length(omega)*(i-1)+j;
t(i) % related t value
omega(j) % related omega value
a(chi) % related a(chi) value
end
end
As you can see, I also think that the reshape() function is the solution to your problems. I hope that this code helps,
The basic idea is to use two separate loops. The outer loop is over the chi variable values, whereas the inner loop is over the i variable values. Referring to the above diagram in the original question, the i variable corresponds to the x-axis (time), and the j variable corresponds to the y-axis (frequency). Assuming that the chi, i, and j variables can take on any real number, bilinear interpolation is then used to find an amplitude corresponding to an element in matrix M. The integration is just an averaging over elements of M.
The following code snippet provides an overview of the basic algorithm to express elements of a matrix as a vector using the spectral collapsing from 2D to 1D. I can't find any reference for this, but it is a solution that works for me.
% Amp = amplitude vector corresponding to Figure 11.5(b) in book reference
% M = matrix corresponding to the absolute value of the complex Gabor transform
% matrix in Figure 11.5(a) in book reference
% Nchi = number of chi in chi vector
% prod = product of timestep and frequency step
% dt = time step
% domega = frequency step
% omega_max = maximum angular frequency
% i = time array element along x-axis
% j = frequency array element along y-axis
% current_i = current time array element in loop
% current_j = current frequency array element in loop
% Nchi = number of chi
% Nivar = number of i variables
% ivar = i variable vector
% calculate for chi = 0, which only occurs when
% t = 0 and omega = 0, at i = 1
av0 = mean( M(1,:) );
av1 = mean( M(2:end,1) );
av2 = mean( [av0 av1] );
Amp(1) = av2;
% av_val_sum holds the running sum of all values that are being averaged
av_val_sum = 0;
% loop for rest of chi
for ccnt = 2:Nchi % 2:Nchi
av_val_sum = 0; % reset av_val_sum
current_chi = chi( ccnt ); % current value of chi
% loop over i vector
for icnt = 1:Nivar % 1:Nivar
current_i = ivar( icnt );
current_j = (current_chi / (prod * (current_i - 1))) + 1;
current_t = dt * (current_i - 1);
current_omega = domega * (current_j - 1);
% values out of range
if(current_omega > omega_max)
continue;
end
% use bilinear interpolation to find an amplitude
% at current_t and current_omega from matrix M
% f_x_y is the bilinear interpolated amplitude
% Insert bilinear interpolation code here
% add to running sum
av_val_sum = av_val_sum + f_x_y;
end % icnt loop
% compute the average over all i
av = av_val_sum / Nivar;
% assign the average to Amp
Amp(ccnt) = av;
end % ccnt loop
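The bilinear interpolation step is left as a placeholder above. One way to fill it in, assuming current_i indexes columns (time) and current_j indexes rows (frequency) of M, would be MATLAB's interp2 with the default integer grid. The 2x2 matrix here is only a toy example.

```matlab
M = [1 2; 3 4];
current_i = 1.5;                     % fractional column position (time axis)
current_j = 1.5;                     % fractional row position (frequency axis)

% interp2(V, Xq, Yq) treats Xq as the column coordinate and Yq as the row coordinate.
f_x_y = interp2(M, current_i, current_j, 'linear');

% Queries outside the matrix return NaN, which would need to be skipped in the loop.
outside = interp2(M, 3, 1, 'linear');
```

At the centre of the four entries the result is their mean. In the loop you would probably want to continue whenever f_x_y is NaN before adding it to av_val_sum.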
I have a matrix, matrix_logical(50000,100000), that is a sparse logical matrix (a lot of falses, some true). I have to produce a matrix, intersect(50000,50000), that, for each pair, i,j, of rows of matrix_logical(50000,100000), stores the number of columns for which rows i and j have both "true" as the value.
Here is the code I wrote:
% store in advance the nonzeros cols
for i=1:50000
nonzeros{i} = num2cell(find(matrix_logical(i,:)));
end
% named inter rather than intersect, so the built-in intersect function is not shadowed
inter = zeros(50000,50000);
for i=1:49999
a = cell2mat(nonzeros{i});
for j=(i+1):50000
b = cell2mat(nonzeros{j});
inter(i,j) = numel(intersect(a,b));
end
end
Is it possible to further increase the performance? It takes too long to compute the matrix. I would like to avoid the double loop in the second part of the code.
matrix_logical is sparse in content, but it is not stored as a sparse matrix in MATLAB, because otherwise the performance of the code above becomes even worse.
Since the [i,j] entry counts the number of nonzero elements in the element-wise multiplication of rows i and j, you can compute it by multiplying matrix_logical with its transpose (you should convert to a numeric data type first, e.g. matrix_logical = single(matrix_logical)):
inter = matrix_logical * matrix_logical';
This works for both sparse and full representations.
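A quick check on a tiny matrix, using sparse storage, might look like this. The 3x4 input and the name A are made up; also note that, to my knowledge, sparse matrices have to be double (or logical), so double is used for the conversion here rather than single.

```matlab
matrix_logical = logical([1 1 0 1; 0 1 0 1; 1 0 0 0]);

A = sparse(double(matrix_logical));  % numeric sparse copy
inter = full(A * A.');               % inter(i,j) = number of columns true in both rows

% Off-diagonal entries are the pairwise overlaps;
% the diagonal holds the number of true entries in each row.
```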
EDIT
In order to calculate numel(intersect(a,b))/numel(union(a,b)); (as asked in your comment), you can use the fact that for two sets a and b, you have
length(union(a,b)) = length(a) + length(b) - length(intersect(a,b))
so, you can do the following:
unLen = sum(matrix_logical,2);
tmp = repmat(unLen, 1, length(unLen)) + repmat(unLen', length(unLen), 1);
inter = matrix_logical * matrix_logical';
inter = inter ./ (tmp-inter);
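The same ratio can be sketched without the two repmat calls by letting bsxfun expand the row counts. This is only an illustration on a toy matrix; rowCounts, unionSize and jaccard are hypothetical names.

```matlab
A = double(logical([1 1 0 1; 0 1 0 1; 1 0 0 0]));

rowCounts = sum(A, 2);                                    % number of true entries per row
inter = A * A.';                                          % pairwise intersection sizes
unionSize = bsxfun(@plus, rowCounts, rowCounts.') - inter; % |a| + |b| - |a and b|
jaccard = inter ./ unionSize;                             % intersection over union
```

Rows that are entirely false give 0/0 and therefore NaN, so you may want to decide separately how such pairs should be treated.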
If I understood you correctly, you want a logical AND of the rows:
intersct = zeros(50000, 50000);
for ii = 1:49999
for jj = ii:50000
intersct(ii, jj) = sum(matrix_logical(ii, :) & matrix_logical(jj, :));
intersct(jj, ii) = intersct(ii, jj);
end
end
Doesn't avoid the double loop, but at least works without the first loop and the slow find command.
Elaborating on my comment, here is a distance function suitable for pdist()
function out = distfun(xi,xj)
out = zeros(size(xj,1),1);
for i=1:size(xj,1)
out(i) = sum(sum( xi & xj(i,:) )) / sum(sum( xi | xj(i,:) ));
end
In my experience, sum(sum()) is faster for logicals than nnz(), thus its appearance above.
You would also need to use squareform() to reshape the output of pdist() appropriately:
squareform(pdist(matrix_logical,@distfun));
Note that pdist() includes a 'jaccard' distance measure, but it is actually the Jaccard distance and not the Jaccard index or coefficient, which is the value you are apparently after.