Faster way to decrease some items in a vector in Matlab - performance

I'm looking for a faster way to decrease the value of certain numbers in a vector in Matlab. For example, I have this vector:
Vector a=[1 21 35 44 45 67 77 83 93 100]
Then I have to remove the elements 35,45,77, so:
RemoveVector b=[3,5,7]
RemoveElements c=[35,45,77]
After removing the elements, the vector should be:
Vector=[1 21 43 65 80 90 97]
Note that besides removing the elements, every later element is decreased by 1 for each removed element that precedes it. I have this code in Matlab:
a(:,b) = [];
b = fliplr(b);
for i=1:size(a,2)
for j=1:size(c,2)
if(a(1,i)>=c(1,j))
a(1,i) = a(1,i) -1;
end
end
end
But it is too slow (m0 = 2.8*10^-3 seconds). Is there a faster algorithm? I believe matrix operations could be faster and more elegant.

@Geoff has a good overall approach, but the adjustment can be done in O(n) rather than O(n*k):
adjustment = zeros(size(a));
adjustment(b(:)) = 1;
a = a - cumsum(adjustment);
a(b(:)) = [];
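The cumulative-sum idea above can be sketched outside Matlab as well. The following is a minimal Python illustration, not the answerer's code; the function name remove_and_shift is made up for this example, and positions are 1-based to match the Matlab indices in b.

```python
from itertools import accumulate

def remove_and_shift(values, remove_idx):
    """Drop the entries at the 1-based positions in remove_idx and lower
    every remaining entry by the number of removed positions before it."""
    removed = set(remove_idx)
    # 1 at each removed position, 0 elsewhere (position p is list index p - 1)
    flags = [1 if pos in removed else 0 for pos in range(1, len(values) + 1)]
    # Running count of removed positions up to and including each entry
    shift = list(accumulate(flags))
    # Keep only the entries that are not removed, already shifted
    return [v - s for v, s, f in zip(values, shift, flags) if not f]

a = [1, 21, 35, 44, 45, 67, 77, 83, 93, 100]
b = [3, 5, 7]
print(remove_and_shift(a, b))  # [1, 21, 43, 65, 80, 90, 97]
```

Each element is visited a constant number of times, so the work grows with the length of the vector and not with the number of removed indices.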

I think that, prior to removing the elements from a whose indices are given in b, the code could do all the decrementing first:
% copy a
c = a;
% iterate over each index in b
for k=1:length(b)
% for all elements in c that follow the index in b (so b(k)+1…end)
% subtract one
c(b(k)+1:end) = c(b(k)+1:end) - 1;
end
% now remove the elements that correspond to the indices in b
c(b) = [];
Try the above and see what happens!

Thanks so much to Geoff and Ben for your answers. I tested both answers this way:
tic
a=[1 21 35 44 45 67 77 83 93 100];
b=[3 5 7];
%Code by Geoff
c = a;
for k=1:length(b)
% for all elements in c that follow the index in b (so b(k)+1…end)
% subtract one
c(b(k)+1:end) = c(b(k)+1:end) - 1;
end
c(b) = [];
m1 = toc;
and
tic
a=[1 21 35 44 45 67 77 83 93 100];
b=[3 5 7];
%Code by Ben
adjustment = zeros(size(a));
adjustment(b(:)) = 1;
a = a - cumsum(adjustment);
a(b(:)) = [];
m2 = toc;
The results on my machine were m1 = 1.2648*10^-4 seconds and m2 = 7.426*10^-5 seconds, so the second code is faster. My original code gives m0 = 2.8*10^-3 seconds.

Related

Quickly compute `dot(a(n:end), b(1:end-n))`

Suppose we have two, one dimensional arrays of values a and b which both have length N. I want to create a new array c such that c(n)=dot(a(n:N), b(1:N-n+1)) I can of course do this using a simple loop:
for n=1:N
c(n)=dot(a(n:N), b(1:N-n+1));
end
but given that this is such a simple operation which resembles a convolution I was wondering if there isn't a more efficient method to do this (using Matlab).
A solution using 1D convolution conv:
out = conv(a, flip(b));
c = out(ceil(numel(out)/2):end);
In conv the first vector is multiplied by the reversed version of the second vector so we need to compute the convolution of a and the flipped b and trim the unnecessary part.
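To see why the trimmed convolution matches the loop, here is a small Python sketch under stated assumptions: conv below is a hand-written full convolution that follows the same definition as Matlab's conv for vectors, and the names shifted_dots_loop and shifted_dots_conv are invented for this illustration. It is meant to check the identity, not to be fast.

```python
def conv(x, y):
    """Full 1-D convolution (length len(x) + len(y) - 1)."""
    out = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

def shifted_dots_loop(a, b):
    """Direct version of c(n) = dot(a(n:N), b(1:N-n+1)), 0-based here."""
    n = len(a)
    return [sum(p * q for p, q in zip(a[k:], b[:n - k])) for k in range(n)]

def shifted_dots_conv(a, b):
    """Same values from the second half of conv(a, flip(b))."""
    out = conv(a, b[::-1])
    return out[len(a) - 1:]

a = [9, 10, 2, 10, 7]
b = [1, 3, 6, 10, 10]
print(shifted_dots_loop(a, b))  # [221, 146, 74, 31, 7]
print(shifted_dots_conv(a, b))  # [221, 146, 74, 31, 7]
```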
This is an interesting problem!
I am going to assume that a and b are column vectors of the same length. Let us consider a simple example:
a = [9;10;2;10;7];
b = [1;3;6;10;10];
% yields:
c = [221;146;74;31;7];
Now let's see what happens when we compute the convolution of these vectors:
>> conv(a,b)
ans =
9
37
86
166
239
201
162
170
70
>> conv2(a, b.')
ans =
9 27 54 90 90
10 30 60 100 100
2 6 12 20 20
10 30 60 100 100
7 21 42 70 70
We notice that c is the sum of the elements along the lower diagonals of the result of conv2. To show this more clearly, we transpose so that the diagonals appear in the same order as the values in c:
>> triu(conv2(a.', b))
ans =
9 10 2 10 7
0 30 6 30 21
0 0 12 60 42
0 0 0 100 70
0 0 0 0 70
So now it becomes a question of summing the diagonals of a matrix, which is a more common problem with existing solutions, for example this one by Andrei Bobrov:
C = conv2(a.', b);
p = sum( spdiags(C, 0:size(C,2)-1) ).'; % This gives the same result as the loop.

Combining every column-combination of an arbitrary number of matrices

I'm trying to figure out a way to do a certain "reduction"
I have a varying number of matrices of varying size, e.g
1 2 2 2 5 6...70 70
3 7 8 9 7 7...88 89
1 3 4
2 7 7
3 8 8
9 9 9
.
.
44 49 49 49 49 49 49
50 50 50 50 50 50 50
87 87 88 89 90 91 92
What I need to do (and I hope that I'm explaining this clearly enough) is to combine any possible
combination of columns from these matrices, this means that one column might be
1
3
1
2
3
9
.
.
.
44
50
87
Which would reduce down to
1
2
3
9
.
.
.
44
50
87
The reason why I'm doing this is because I need to find the smallest unique combined column
What am I trying to accomplish
For those interested, I'm trying to find the smallest set of gene knockouts
to disable reactions. Here, every matrix represents a reaction, and the columns represent the indices of
the genes that would disable that reaction.
The method may be as brute force as needed, as these matrices rarely become overwhelmingly large,
and the reaction combinations won't be long either
The problem
I can't (as far as I know) create a for loop with an arbitrary number of iterators, and the number of
matrices (reactions to disable) is arbitrary.
Clarification
If I have matrices A,B,C with columns a1,a2...b1,b2...c1...cn what I need
are the columns [a1 b1 c1], [a1, b1, c2], ..., [a1 b1 cn] ... [an bn cn]
Solution
Courtesy of Michael Ohlrogge below.
Extension of his answer, for completeness
His solution ends with
MyProd = product(Array_of_ColGroups...)
Which gets the job done
And picking up where he left off
collection = collect(MyProd); #MyProd is an iterator
merged_cols = Array[] # the rows of 'collection' are arrays of arrays
for (i,v) in enumerate(collection)
# I apologize for this line
push!(merged_cols, sort!(unique(vcat(v...))))
end
# find all lengths so I can find which is the minimum
lengths = map(x -> length(x), merged_cols);
loc_of_shortest = find(broadcast((x,y) -> length(x) == y, merged_cols,minimum(lengths)))
best_gene_combos = merged_cols[loc_of_shortest]
tl;dr - complete solution:
# example matrices
a = rand(1:50, 8,4); b = rand(1:50, 10,5); c = rand(1:50, 12,4);
Matrices = [a,b,c];
toJagged(x) = [x[:,i] for i in 1:size(x,2)];
JaggedMatrices = [toJagged(x) for x in Matrices];
Combined = [unique(i) for i in JaggedMatrices[1]];
for n in 2:length(JaggedMatrices)
Combined = [unique([i;j]) for i in Combined, j in JaggedMatrices[n]];
end
Lengths = [length(s) for s in Combined];
Minima = findin(Lengths, min(Lengths...));
SubscriptsArray = ind2sub(size(Lengths), Minima);
ComboTuples = [((i[j] for i in SubscriptsArray)...) for j in 1:length(Minima)]
Explanation:
Assume you have matrix a and b
a = rand(1:50, 8,4);
b = rand(1:50, 10,5);
Express them as a jagged array, columns first
A = [a[:,i] for i in 1:size(a,2)];
B = [b[:,i] for i in 1:size(b,2)];
Concatenate rows for all column combinations using a list comprehension; remove duplicates on the spot:
Combined = [unique([i;j]) for i in A, j in B];
You now have all column combinations of a and b, as concatenated rows with duplicates removed. Find the lengths easily:
Lengths = [length(s) for s in Combined];
If you have more than two matrices, perform this process iteratively in a for loop, e.g. by using the Combined matrix in place of a. e.g. if you have a matrix c:
c = rand(1:50, 12,4);
C = [c[:,i] for i in 1:size(c,2)];
Combined = [unique([i;j]) for i in Combined, j in C];
Once you have the Lengths array as a multidimensional array (as many dimensions as input matrices, where the size of each dimension is the number of columns in each matrix), you can find the column combinations that correspond to the lowest value (there may well be more than one combination), via a simple ind2sub operation:
Minima = findin(Lengths, min(Lengths...));
SubscriptsArray = ind2sub(size(Lengths), Minima)
(e.g. for a randomized run with 3 input matrices, I happened to get 5 results with the minimal length of 19. The result of ind2sub was ([4,4,3,4,4],[3,3,4,5,3],[1,3,3,3,4]).)
You can convert this further to a list of "Column Combination" tuples with a (somewhat ugly) list comprehension:
ComboTuples = [((i[j] for i in SubscriptsArray)...) for j in 1:length(Minima)]
# results in:
# 5-element Array{Tuple{Int64,Int64,Int64},1}:
# (4,3,1)
# (4,3,3)
# (3,4,3)
# (4,5,3)
# (4,3,4)
Ok, let's see if I understand this. You've got n matrices and want all combinations with one column from each of the n matrices? If so, how about the product() (for Cartesian product) from the Iterators package?
using Iterators
n = 3
Array_of_Arrays = [rand(3,3) for idx = 1:n] ## arbitrary representation of your set of arrays.
Array_of_ColGroups = Array(Array, length(Array_of_Arrays))
for (idx, MyArray) in enumerate(Array_of_Arrays)
Array_of_ColGroups[idx] = [MyArray[:,jdx] for jdx in 1:size(MyArray,2)]
end
MyProd = product(Array_of_ColGroups...)
This will create an iterator object which you can then loop over to consider the specific combinations of columns.
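The Cartesian-product idea can also be sketched in Python with itertools.product. This is only an illustration under assumptions: each matrix is represented as a list of its columns, the function name smallest_column_unions is hypothetical, and the sample data below is made up.

```python
from itertools import product

def smallest_column_unions(matrices):
    """matrices: list of matrices, each given as a list of columns.
    Returns (min_size, [(column-index tuple, sorted union), ...]) over all
    ways of picking one column from every matrix."""
    best_size, best = None, []
    index_ranges = [range(len(cols)) for cols in matrices]
    for combo in product(*index_ranges):
        # Union of the chosen columns, duplicates removed
        merged = set()
        for cols, j in zip(matrices, combo):
            merged.update(cols[j])
        size = len(merged)
        if best_size is None or size < best_size:
            best_size, best = size, []
        if size == best_size:
            best.append((combo, sorted(merged)))
    return best_size, best

# Made-up example: two matrices with 2 and 3 columns
A = [[1, 3], [2, 7]]
B = [[1, 2, 3], [3, 7, 8], [4, 7, 8]]
print(smallest_column_unions([A, B]))  # (3, [((0, 0), [1, 2, 3])])
```

Because product accepts any number of iterables, the number of matrices does not need to be known in advance, which is the point of the approach above.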

Compute the local mean and variance of identified pixel in image

I want to compute the mean and standard deviation of the sub-region defined by a window (dashed line) centered at an identified pixel (red), i.e. the local mean and standard deviation. This figure describes it:
We can do it by convolving the image with a mask. However, that takes a long time because I only care about the mean and standard deviation at several points, while convolution computes them for every point in the image. Is there a faster way that computes the mean and standard deviation only at the identified pixels? I am doing it in Matlab. This is my code using convolution:
I=[18 36 70 33 64 40 62 76 71 37 5
82 49 86 45 96 29 74 7 60 56 45
25 32 55 48 25 30 12 82 95 77 8
24 18 78 74 19 57 67 59 16 46 78
28 9 59 2 29 11 7 31 75 15 25
83 26 96 8 82 26 85 12 11 28 19
81 64 78 70 26 33 17 72 81 16 54
75 39 78 34 59 31 77 31 61 81 89
89 84 29 99 79 25 26 35 65 56 76
93 90 45 7 61 13 34 24 11 34 92
88 82 91 81 100 4 88 70 85 8 19];
identified_position=[30 36 84 90] %indices of pixel 78, 48,72 60
mask=1/9.*ones(3,3);
mean_all=imfilter(I,mask,'same');
%Mean of identified pixels
mean_all(identified_position)
% Compute the variance
std_all=stdfilt(I,ones(3));
%std of identified pixels
std_all(identified_position)
This is the comparison code
function compare_mean(dimx,dimy)
I=randi(100,[dimx,dimy]);
rad=3;
identified_position=randi(max(I(:)),[1,5]);% Get 5 random position
function way1()
mask=ones(rad,rad);
mask=mask./sum(mask(:));
mean_all=conv2(I,mask,'same');
mean_out =mean_all(identified_position);
end
function way2()
box_size = rad; %// Edit your window size here (an odd number is preferred)
bxr = floor(box_size/2); %// box radius
%// Get neighboring indices and those elements for all identified positions
off1 = bsxfun(@plus,[-bxr:bxr]',[-bxr:bxr]*size(I,1)); %//'#neighborhood offsets
idx = bsxfun(@plus,off1(:),identified_position); %// all absolute offsets
I_selected_neigh = I(idx); %// all offsetted elements
mean_out = mean(I_selected_neigh,1); %// mean output
end
way2()
time_way1=@()way1();timeit(time_way1)
time_way2=@()way2();timeit(time_way2)
end
Sometimes way2 fails with this error:
Subscript indices must either be real positive integers or logicals.
Error in compare_mean/way2 (line 18)
I_selected_neigh = I(idx); %// all offsetted elements
Error in compare_mean (line 22)
way2()
Discussion & Solution Codes
Given I as the input image, identified_position as the linear indices of the selected points and bxsz as the window/box size, the approach listed next should be pretty efficient -
%// Get XY coordinates
[X,Y] = ind2sub(size(I),identified_position);
pts = [X(:) Y(:)];
%// Parameters
bxr = (bxsz-1)/2;
Isz = size(I);
%// XY coordinates of neighboring elements
[offx,offy] = ndgrid(-bxr:bxr,-bxr:bxr);
x_idx = bsxfun(@plus,offx(:),pts(:,1)'); %//'
y_idx = bsxfun(@plus,offy(:),pts(:,2)'); %//'
%// Outside image boundary elements
invalids = x_idx>Isz(1) | x_idx<1 | y_idx>Isz(2) | y_idx<1;
%// All neighboring indices
all_idx = (y_idx-1)*size(I,1) + x_idx;
all_idx(invalids) = 1;
%// All neighboring elements
all_vals = I(all_idx);
all_vals(invalids) = 0;
mean_out = mean(all_vals,1); %// final mean output
stdfilts = stdfilt(all_vals,ones(bxsz^2,1))
std_out = stdfilts(ceil(size(stdfilts,1)/2),:) %// final stdfilt output
Basically, it gets all the neighbouring indices for all identified positions in one go with bsxfun, and thus gets all those neighbouring elements. Those selected elements are then used to get the mean and stdfilt outputs. The whole idea is to keep the memory requirement to a minimum while doing everything in a vectorized fashion within those selected elements. Hopefully, this should be faster!
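The core idea, visiting only the windows around the selected pixels, can be sketched in plain Python. This is an illustration under assumptions, not a port of the Matlab code: points are given as 0-based (row, col) pairs instead of linear indices, cells outside the image count as 0 (as in the code above), the standard deviation is the sample (N-1) form, and the name local_stats is made up.

```python
from statistics import mean, stdev

def local_stats(img, points, box=3):
    """img: list of rows; points: 0-based (row, col) pairs.
    Returns [(mean, sample std), ...] over a box x box window centered on
    each point. Cells outside the image are treated as 0."""
    r = box // 2
    rows, cols = len(img), len(img[0])
    out = []
    for pr, pc in points:
        # Gather only the neighbourhood of this point
        window = [
            img[i][j] if 0 <= i < rows and 0 <= j < cols else 0
            for i in range(pr - r, pr + r + 1)
            for j in range(pc - r, pc + r + 1)
        ]
        out.append((mean(window), stdev(window)))
    return out

# Made-up 3x3 image: one interior point and one corner point
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
stats = local_stats(img, [(1, 1), (0, 0)])
print(stats[0][0])  # 5
```

The cost depends on the number of points times the window area, not on the image size, which is where the speedup over filtering the whole image comes from.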
Benchmarking
Benchmarking Code
dx = 10000; %// x-dimension of input image
dy = 10000; %// y-dimension of input image
npts = 1000; %// number of points
I=randi(100,[dx,dy]); %// create input image of random intensities
identified_position=randi(max(I(:)),[1,npts]);
rad=5; %// blocksize (rad x rad)
%// Run the approaches fed with the inputs
func1 = @() way1(I,identified_position,rad); %// original approach
time1 = timeit(func1);
clear func1
func2 = @() way2(I,identified_position,rad); %// proposed approach
time2 = timeit(func2);
clear func2
disp(['Input size: ' num2str(dx) 'x' num2str(dy) ' & Points: ' num2str(npts)])
disp(['With Original Approach: Elapsed Time = ' num2str(time1) '(s)'])
disp(['With Proposed Approach: Elapsed Time = ' num2str(time2) '(s)'])
disp(['**Speedup w/ Proposed Approach : ' num2str(time1/time2) 'x!**'])
Associated function codes
%// OP's stated approach
function mean_out = way1(I,identified_position,rad)
mask=ones(rad,rad);
mask=mask./sum(mask(:));
mean_all=conv2(I,mask,'same');
mean_out =mean_all(identified_position);
return;
function mean_out = way2(I,identified_position,rad)
%//.... code from proposed approach stated earlier until mean_out %//
Runtime results
Input size: 10000x10000 & Points: 1000
With Original Approach: Elapsed Time = 0.46394(s)
With Proposed Approach: Elapsed Time = 0.00049403(s)
**Speedup w/ Proposed Approach : 939.0778x!**

What is the most efficient way to implement zig-zag ordering in MATLAB? [duplicate]

I have an NxM matrix in MATLAB that I would like to reorder in similar fashion to the way JPEG reorders its subblock pixels:
(image from Wikipedia)
I would like the algorithm to be generic such that I can pass in a 2D matrix with any dimensions. I am a C++ programmer by trade and am very tempted to write an old school loop to accomplish this, but I suspect there is a better way to do it in MATLAB.
I'd rather have an algorithm that works on an NxN matrix and go from there.
Example:
1 2 3
4 5 6 --> 1 2 4 7 5 3 6 8 9
7 8 9
Consider the code:
M = randi(100, [3 4]); %# input matrix
ind = reshape(1:numel(M), size(M)); %# indices of elements
ind = fliplr( spdiags( fliplr(ind) ) ); %# get the anti-diagonals
ind(:,1:2:end) = flipud( ind(:,1:2:end) ); %# reverse order of odd columns
ind(ind==0) = []; %# keep non-zero indices
M(ind) %# get elements in zigzag order
An example with a 4x4 matrix:
» M
M =
17 35 26 96
12 59 51 55
50 23 70 14
96 76 90 15
» M(ind)
ans =
17 35 12 50 59 26 96 51 23 96 76 70 55 14 90 15
and an example with a non-square matrix:
M =
69 9 16 100
75 23 83 8
46 92 54 45
ans =
69 9 75 46 23 16 100 83 92 54 8 45
This approach is pretty fast:
X = randn(500,2000); %// example input matrix
[r, c] = size(X);
M = bsxfun(@plus, (1:r).', 0:c-1);
M = M + bsxfun(@times, (1:r).'/(r+c), (-1).^M);
[~, ind] = sort(M(:));
y = X(ind).'; %'// output row vector
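The sort-based idea can be illustrated with a short Python sketch. It is not a translation of the Matlab code above: instead of building one fractional key per element, it sorts on a pair (anti-diagonal number, signed row index), which yields the same ordering. The function name zigzag and the 0-based indexing are choices made for this example.

```python
def zigzag(matrix):
    """Return the elements of a 2-D list in JPEG-style zig-zag order."""
    rows, cols = len(matrix), len(matrix[0])
    cells = [(i, j) for i in range(rows) for j in range(cols)]

    def key(cell):
        i, j = cell
        d = i + j  # anti-diagonal number (0-based)
        # Odd diagonals run top to bottom, even diagonals bottom to top
        return (d, i if d % 2 else -i)

    return [matrix[i][j] for i, j in sorted(cells, key=key)]

print(zigzag([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]))  # [1, 2, 4, 7, 5, 3, 6, 8, 9]
```

The same function handles non-square inputs, since the key only depends on the row and column of each cell.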
Benchmarking
The following code compares running time with that of Amro's excellent answer, using timeit. It tests different combinations of matrix size (number of entries) and matrix shape (number of rows to number of columns ratio).
%// Amro's approach
function y = zigzag_Amro(M)
ind = reshape(1:numel(M), size(M));
ind = fliplr( spdiags( fliplr(ind) ) );
ind(:,1:2:end) = flipud( ind(:,1:2:end) );
ind(ind==0) = [];
y = M(ind);
%// Luis' approach
function y = zigzag_Luis(X)
[r, c] = size(X);
M = bsxfun(@plus, (1:r).', 0:c-1);
M = M + bsxfun(@times, (1:r).'/(r+c), (-1).^M);
[~, ind] = sort(M(:));
y = X(ind).';
%// Benchmarking code:
S = [10 30 100 300 1000 3000]; %// reference to generate matrix size
f = [1 1]; %// number of cols is S*f(1); number of rows is S*f(2)
%// f = [0.5 2]; %// plotted with '--'
%// f = [2 0.5]; %// plotted with ':'
t_Amro = NaN(size(S));
t_Luis = NaN(size(S));
for n = 1:numel(S)
X = rand(f(1)*S(n), f(2)*S(n));
f_Amro = @() zigzag_Amro(X);
f_Luis = @() zigzag_Luis(X);
t_Amro(n) = timeit(f_Amro);
t_Luis(n) = timeit(f_Luis);
end
loglog(S.^2*prod(f), t_Amro, '.b-');
hold on
loglog(S.^2*prod(f), t_Luis, '.r-');
xlabel('number of matrix entries')
ylabel('time')
The figure below has been obtained with Matlab R2014b on Windows 7 64 bits. Results in R2010b are very similar. It is seen that the new approach reduces running time by a factor between 2.5 (for small matrices) and 1.4 (for large matrices). Results are seen to be almost insensitive to matrix shape, given a total number of entries.
Here's a non-loop solution zig_zag.m. It looks ugly but it works!:
function [M,index] = zig_zag(M)
[r,c] = size(M);
checker = rem(hankel(1:r,r-1+(1:c)),2);
[rEven,cEven] = find(checker);
[cOdd,rOdd] = find(~checker.'); %'#
rTotal = [rEven; rOdd];
cTotal = [cEven; cOdd];
[junk,sortIndex] = sort(rTotal+cTotal);
rSort = rTotal(sortIndex);
cSort = cTotal(sortIndex);
index = sub2ind([r c],rSort,cSort);
M = M(index);
end
And a test matrix:
>> M = [magic(4) zeros(4,1)];
M =
16 2 3 13 0
5 11 10 8 0
9 7 6 12 0
4 14 15 1 0
>> newM = zig_zag(M) %# Zig-zag sampled elements
newM =
16
2
5
9
11
3
13
10
7
4
14
6
8
0
0
12
15
1
0
0
Here's one way to do this. Basically, your array is a hankel matrix plus vectors of 1:m, where m is the number of elements in each diagonal. Maybe someone else has a neat idea on how to create the diagonal arrays that have to be added to the flipped hankel array without a loop.
I think this should be generalizable to a non-square array.
% for a 3x3 array
n=3;
numElementsPerDiagonal = [1:n,n-1:-1:1];
hadaRC = cumsum([0,numElementsPerDiagonal(1:end-1)]);
array2add = fliplr(hankel(hadaRC(1:n),hadaRC(end-n+1:end)));
% loop through the hankel array and add numbers counting either up or down
% depending on whether the diagonal number is even or odd
for d = 1:(2*n-1)
if floor(d/2)==d/2
% even, count up
array2add = array2add + diag(1:numElementsPerDiagonal(d),d-n);
else
% odd, count down
array2add = array2add + diag(numElementsPerDiagonal(d):-1:1,d-n);
end
end
% now flip to get the result
indexMatrix = fliplr(array2add)
result =
1 2 6
3 5 7
4 8 9
Afterward, you just call reshape(image(indexMatrix),[],1) to get the vector of reordered elements.
EDIT
Ok, from your comment it looks like you need to use sort like Marc suggested.
indexMatrixT = indexMatrix'; % ' SO formatting
[dummy,sortedIdx] = sort(indexMatrixT(:));
sortedIdx =
1 2 4 7 5 3 6 8 9
Note that you'd need to transpose your input matrix first before you index, because Matlab counts first down, then right.
Assuming X is the input 2D matrix and that it is square or landscape-shaped, this seems to be pretty efficient -
[m,n] = size(X);
nlim = m*n;
n = n+mod(n-m,2);
mask = bsxfun(@le,[1:m]',[n:-1:1]);
start_vec = m:m-1:m*(m-1)+1;
a = bsxfun(@plus,start_vec',[0:n-1]*m);
offset_startcol = 2- mod(m+1,2);
[~,idx] = min(mask,[],1);
idx = idx - 1;
idx(idx==0) = m;
end_ind = a([0:n-1]*m + idx);
offsets = a(1,offset_startcol:2:end) + end_ind(offset_startcol:2:end);
a(:,offset_startcol:2:end) = bsxfun(@minus,offsets,a(:,offset_startcol:2:end));
out = a(mask);
out2 = m*n+1 - out(end:-1:1+m*(n-m+1));
result = X([out2 ; out(out<=nlim)]);
Quick runtime tests against Luis's approach -
Datasize: 500 x 2000
------------------------------------- With Proposed Approach
Elapsed time is 0.037145 seconds.
------------------------------------- With Luis Approach
Elapsed time is 0.045900 seconds.
Datasize: 5000 x 20000
------------------------------------- With Proposed Approach
Elapsed time is 3.947325 seconds.
------------------------------------- With Luis Approach
Elapsed time is 6.370463 seconds.
Let's assume for a moment that you have a 2-D matrix that's the same size as your image specifying the correct index. Call this array idx; then the matlab commands to reorder your image would be
[~,I] = sort (idx(:)); %sort the 1D indices of the image into ascending order according to idx
reorderedim = im(I);
I don't see an obvious solution to generate idx without using for loops or recursion, but I'll think some more.

How many times the function will be invoked?

I have this cycle:
for(i = 0; i < n; i ++) {
if(i % 5 == 1 && i % 3 == 1) {
function();
}
}
How can I count the number of calls to function() without running this code?
I take from the complexity-theory tag that you want some Theta expression. The if causes your function to be executed every fifteenth time, which is a constant factor, so the number of executions is still Theta(n).
The conditional has two expressions. The first holds once every 5 iterations and the second once every 3 iterations. Both hold together once every 15 iterations (whenever i % 15 == 1), and that is when function() gets called.
Look at the values of i where your condition holds:
1
16
31
46
61
76
91
106
121
136
151
166
181
196
...
Now consider what would happen if the condition were i % 5 == 0 && i % 3 == 0: i would have to be a multiple of 15 (lcm(3,5)), so the condition would hold on every 15th iteration. From that you can likely derive the relation yourself.
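The if statement is true when i = 15*k + 1, where k is a non-negative integer. The number of such values with 0 <= i <= n is floor((n-1)/15) + 1 (for n >= 1); for example, n = 31 gives floor((31-1)/15) + 1 = 3, which is (1,16,31). Note that the loop in the question stops at i = n-1, so the number of calls is floor((n-2)/15) + 1 for n >= 2 and 0 otherwise; for n = 31 that is 2 calls (i = 1 and i = 16).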
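For the loop exactly as written in the question (i runs from 0 to n-1), a closed form can be checked against a brute-force count. The sketch below is in Python, and both function names are made up for this illustration. The closed form counts the values 1, 16, 31, ... that do not exceed n-1.

```python
def calls_brute_force(n):
    """Count iterations of `for (i = 0; i < n; i++)` that pass the if."""
    return sum(1 for i in range(n) if i % 5 == 1 and i % 3 == 1)

def calls_closed_form(n):
    """Number of i in [0, n-1] with i % 15 == 1."""
    return (n - 2) // 15 + 1 if n >= 2 else 0

for n in (1, 2, 16, 17, 31, 32):
    print(n, calls_brute_force(n), calls_closed_form(n))
```

Both functions agree for every n, and the count grows as n/15, which is consistent with the Theta(n) answer above.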
