transforming a matrix into a vector along its diagonals - algorithm

im not not a programmer, i just need to solve something numerically in matlab.
i need a function to make the following transformation for any square matrix:
from
row 1: 1 2 3
row 2: 4 5 6
row 3: 7 8 9
to
1 4 2 7 5 3 8 6 9
ie write the matrix in a vector along its diagonals from left to top right.
any ideas please?
i really need a little more help though:
say the matrix that we have transformed into the vector, has entries denoted by M(i,j), where i are rows and j columns. now i need to be able to find out from a position in the vector, the original position in the matrix, i.e say if its 3rd entry in the vector, i need a function that would give me i=1 j=2. any ideas please? im really stuck on this:( thanks

This is quite similar to a previous question on traversing the matrix in a zigzag order. With slight modification we get:
A = rand(3); %# input matrix
ind = reshape(1:numel(A), size(A)); %# indices of elements
ind = spdiags(fliplr(ind)); %# get the anti-diagonals
ind = ind(end:-1:1); %# reverse order
ind = ind(ind~=0); %# keep non-zero indices
B = A(ind); %# get elements in desired order
using the SPDIAGS function. The advantage of this is that it works for any arbitrary matrix size (not just square matrices). Example:
A =
0.75127 0.69908 0.54722 0.25751
0.2551 0.8909 0.13862 0.84072
0.50596 0.95929 0.14929 0.25428
B =
Columns 1 through 6
0.75127 0.2551 0.69908 0.50596 0.8909 0.54722
Columns 7 through 12
0.95929 0.13862 0.25751 0.14929 0.84072 0.25428

Here's one way to do this.
%# n is the number of rows (or cols) of the square array
n = 3;
array = [1 2 3;4 5 6;7 8 9]; %# this is the array we'll reorder
%# create list of indices that allow us
%# to read the array in the proper order
hh = hankel(1:n,n:(2*n-1)); %# creates a matrix with numbered antidiagonals
[dummy,sortIdx] = sort(hh(:)); %# sortIdx contains the new order
%# reorder the array
array(sortIdx)
ans =
1 4 2 7 5 3 8 6 9

You can convert your matrix to a vector using the function HANKEL to generate indices into the matrix. Here's a shortened version of Jonas' answer, using M as your sample matrix given above:
N = size(M,1);
A = hankel(1:N,N:(2*N-1));
[junk,sortIndex] = sort(A(:));
Now, you can use sortIndex to change your matrix M to a vector vec like so:
vec = M(sortIndex);
And if you want to get the row and column indices (rIndex and cIndex) into your original matrix that correspond to the values in vec, you can use the function IND2SUB:
[rIndex,cIndex] = ind2sub(N,sortIndex);

A=[1,2,3;4,5,6;7,8,9];
d = size(A,1);
X=[];
for n = 1:2*size(A,1) - 1
j = min(n,d); i = (n+1)-(j);
X = cat(2,X,diag(flipud(A(i:j,i:j)))');
end
X
X =
1 4 2 7 5 3 8 6 9

You can generate the diagonals in this way:
for i = -2:2
diag(flipud(a), i)
end
I don't know whether this is the optimal way to concatenate the diagonals:
d = []
for i = -2:2
d = vertcat(d, diag(flipud(a), i))
end
(I tested it in octave, not in matlab)

Related

Compute mean of columns for groups of rows in Octave

I have a matrix, for example:
1 2
3 4
4 5
And I also have a rule of grouping the rows, which is defined as a vector of group IDs like this:
1
2
1
Which means that the first and the third rows belong to the same group (ID 1) and the second row belong to another group (ID 2). So, I would like to compute the mean value for each group. Here is the result for my example:
2.5 3.5
3 4
More formally, there is a matrix A of size (m, n), a number of groups k and a vector v of size (m, 1), values of which are integers in range from 1 to k. The result is a matrix R of size (k, n), where each row with index r corresponds to the mean value of the group r.
Here is my solution (which does what I need) using for-loop in Octave:
R = zeros(k, n);
for r = 1:k
R(r, :) = mean(A((v == r), :), 1);
end
I wonder whether it could be vectorized. So, what I need is to replace the for-loop with a vectorized solution, which is going to be much more efficient than the iterative one.
Here is one of my many attempts (which do not work) to solve the problem in a vectorized way:
R = mean(A((v == 1:k), :);
As long as our data is of floating point, you can just do it manually by doing the sum yourself and then divide, by making use of accumdim. Like so:
octave:1> A = [1 2; 3 4; 4 5];
octave:2> subs = [1; 2; 1];
octave:3> accumdim (subs, A) ./ accumdim (subs, ones (rows (subs), 1))
ans =
2.5000 3.5000
3.0000 4.0000
You can consider it as a matrix multiplication problem. For instance, for your example this corresponds to
A = [1 2; 3 4; 4 5];
B = [0.5,0,0.5;0,1,0];
C = B*A
The main issue, is to construct B from your list of indicies in an efficient manner. My suggestion is to use the implicit expansion of ==.
A = [1 2; 3 4; 4 5]; % Input data
idx = [1;2;1]; % Input Grouping
k = 2; % number of groups, ( = max(idx) )
m = 3; % Number of "observations"
Btmp = (idx == 1:k)'; % Mark locations
B = Btmp ./sum(Btmp,2); % Normalise
C = B*A
C =
2.5000 3.5000
3.0000 4.0000

Vectorized search for permutations (with repetitions) that contain given subpermutations (with repetitions)

This question is can be viewed continuation/extension/generalization of a previous question of mine from here.
Some definitions: I have a set of integers S = {1,2,...,s}, say s = 20, and two matrices N and M whose rows are finite sequences of numbers from S (i.e. permutations with possible repetitions), of order n and m respectively, where 1 <= n <= m. Let us think of N as a collection of candidate sub-sequences for the sequences from M.
Example: [2 3 4 3] is a sub-sequence of [1 2 2 3 5 4 1 3] that occurs with multiplicity 2 (=in how many different ways one can find the sub-seq. in the main seq.), whereas [3 2 2 3] is not a sub-sequence of it. In particular, a valid sub-sequence by definition must preserve the order of the indices.
Problem statement:
(P1) For each row of M, obtain the number of sub-sequences of it, with multiplicity and without multiplicity, that occur in N as rows (it can be zero if none are contained in N);
(P2) For each row of N, find out how many times, with multiplicity and without multiplicity, it is contained in M as a sub-sequence (again, this number can be zero);
Example: Let N = [1 2 2; 2 3 4] and M = [1 1 2 2 3; 1 2 2 3 4; 1 2 3 5 6]. Then (P1) returns [2; 3; 0] for 'with multiplicities' and [1; 2; 0] for 'without multiplicities'. (P2) returns [3; 2] for 'with multiplicities' and [2; 1] without multiplicities.
Order of magnitude: M could typically have up to 30-40 columns and a few thousand rows, although I currently have M with only a few hundred rows and ~10 columns. N could be approaching the size of
M or could be also much smaller.
What I have so far: Not much, to be honest. I believe I might be able to slightly modify my not-very-well-vectorized solution from my previous question to tackle permutations with repetitions, but I am still thinking on that and will update as soon as I have something working. But given my (lack of) experience so far, it would be in all likelihood very suboptimal :(
Thanks!
Introduction : Owing to the repetitions in the input data in each row, the combination finding process doesn't have the sort of "uniqueness" among elements which was exploited in your previous problem and hence the loops used here. Also, note that the without multiplicity codes don't use nchoosek and as such, I feel more optimistic about them for performance.
Notations :
p1wim -> P1 with multiplicity
p2wim -> P2 with multiplicity
p1wom -> P1 without multiplicity
p2wom -> P2 without multiplicity
Codes :
I. Code for P1, 2 with multiplicity
permN = permute(N,[3 2 1]);
p1wim(size(M,1),1)=0;
p2wim(size(N,1),1)=0;
for k1 = 1:size(M,1)
d1 = nchoosek(M(k1,:),3);
t1 = all(bsxfun(#eq,d1,permN),2);
p1wim(k1) = sum(t1(:));
p2wim = p2wim + squeeze(sum(t1,1));
end
II. Code for P1, 2 without multiplicity
eqmat = bsxfun(#eq,M,permute(N,[3 4 2 1])); %// equality matrix
[m,n,p,q] = size(eqmat); %// get sizes
inds = zeros(size(M,1),p,q); %// pre-allocate for indices array
vec1 = [1:m]'; %//' setup constants to loop
vec2 = [0:q-1]*m*n*p;
vec3 = permute([0:p-1]*m*n,[1 3 2]);
for iter = 1:p
[~,ind1] = max(eqmat(:,:,iter,:),[],2);
inds(:,iter,:) = reshape(ind1,m,1,q);
ind2 = squeeze(ind1);
ind3 = bsxfun(#plus,vec1,(ind2-1)*m); %//' setup forward moving equalities
ind4 = bsxfun(#plus,ind3,vec2);
ind5 = bsxfun(#plus,ind4,vec3);
eqmat(ind5(:)) = 0;
end
p1wom = sum(all(diff(inds,[],2)>0,2),3);
p2wom = squeeze(sum(all(diff(inds,[],2)>0,2),1));
As usual, I would encourage you to use gpuArrays too with your favorite parfor.
This approach uses only one loop over the rows of M (P1) or N (P2). The code makes use of linear indexing and the very powerful bsxfun function. Note that if the number of columns is large you may experience problems because of nchoosek.
[mr mc] = size(M);
[nr nc] = size(N);
%// P1
combs = nchoosek(1:mc, nc)-1;
P1mu = NaN(mr,1);
P1nm = NaN(mr,1);
for r = 1:mr
aux = M(r+mr*combs);
P1mu(r) = sum(ismember(aux, N, 'rows'));
P1nm(r) = sum(ismember(unique(aux, 'rows'), N, 'rows'));
end
%// P2. Multiplicity defined to span across different rows
rr = reshape(repmat(1:mr, size(combs,1), 1),[],1);
P2mu = NaN(nr,1);
P2nm = NaN(nr,1);
for r = 1:nr
aux = M(bsxfun(#plus, rr, mr*repmat(combs, mr, 1)));
P2mu(r) = sum(all(bsxfun(#eq, N(r,:), aux), 2));
P2nm(r) = sum(all(bsxfun(#eq, N(r,:), unique(aux, 'rows')), 2));
end
%// P2. Multiplicity defined restricted to within one row
rr = reshape(repmat(1:mr, size(combs,1), 1),[],1);
P2mur = NaN(nr,1);
P2nmr = NaN(nr,1);
for r = 1:nr
aux = M(bsxfun(#plus, rr, mr*repmat(combs, mr, 1)));
P2mur(r) = sum(all(bsxfun(#eq, N(r,:), aux), 2));
aux2 = unique([aux rr], 'rows'); %// concat rr to differentiate rows...
aux2 = aux2(:,1:end-1); %// ...and now remove it
P2nmr(r) = sum(all(bsxfun(#eq, N(r,:), aux2), 2));
end
Results for your example data:
P1mu =
2
3
0
P1nm =
1
2
0
P2mu =
3
2
P2nm =
1
1
P2mur =
3
2
P2nmr =
2
1
Some optimizations to the code would be possible. Not sure they are worth the effort:
Replace repmat by another bsxfun (using a 3rd dimension). That may save some memory
Transpose original matrices and work down colunmns, instead of along rows. That may be faster.

bsxfun implementation in matrix multiplication

As always trying to learn more from you, I was hoping I could receive some help with the following code.
I need to accomplish the following:
1) I have a vector:
x = [1 2 3 4 5 6 7 8 9 10 11 12]
2) and a matrix:
A =[11 14 1
5 8 18
10 8 19
13 20 16]
I need to be able to multiply each value from x with every value of A, this means:
new_matrix = [1* A
2* A
3* A
...
12* A]
This will give me this new_matrix of size (12*m x n) assuming A (mxn). And in this case (12*4x3)
How can I do this using bsxfun from matlab? and, would this method be faster than a for-loop?
Regarding my for-loop, I need some help here as well... I am not able to storage each "new_matrix" as the loop runs :(
for i=x
new_matrix = A.*x(i)
end
Thanks in advance!!
EDIT: After the solutions where given
First solution
clear all
clc
x=1:0.1:50;
A = rand(1000,1000);
tic
val = bsxfun(#times,A,permute(x,[3 1 2]));
out = reshape(permute(val,[1 3 2]),size(val,1)*size(val,3),[]);
toc
Output:
Elapsed time is 7.597939 seconds.
Second solution
clear all
clc
x=1:0.1:50;
A = rand(1000,1000);
tic
Ps = kron(x.',A);
toc
Output:
Elapsed time is 48.445417 seconds.
Send x to the third dimension, so that singleton expansion would come into effect when bsxfun is used for multiplication with A, extending the product result to the third dimension. Then, perform the bsxfun multiplication -
val = bsxfun(#times,A,permute(x,[3 1 2]))
Now, val is a 3D matrix and the desired output is expected to be a 2D matrix concatenated along the columns through the third dimension. This is achieved below -
out = reshape(permute(val,[1 3 2]),size(val,1)*size(val,3),[])
Hope that made sense! Spread the bsxfun word around! woo!! :)
The kron function does exactly that:
kron(x.',A)
Here is my benchmark of the methods mentioned so far, along with a few additions of my own:
function [t,v] = testMatMult()
% data
%{
x = [1 2 3 4 5 6 7 8 9 10 11 12];
A = [11 14 1; 5 8 18; 10 8 19; 13 20 16];
%}
x = 1:50;
A = randi(100, [1000,1000]);
% functions to test
fcns = {
#() func1_repmat(A,x)
#() func2_bsxfun_3rd_dim(A,x)
#() func2_forloop_3rd_dim(A,x)
#() func3_kron(A,x)
#() func4_forloop_matrix(A,x)
#() func5_forloop_cell(A,x)
#() func6_arrayfun(A,x)
};
% timeit
t = cellfun(#timeit, fcns, 'UniformOutput',true);
% check results
v = cellfun(#feval, fcns, 'UniformOutput',false);
isequal(v{:})
%for i=2:numel(v), assert(norm(v{1}-v{2}) < 1e-9), end
end
% Amro
function B = func1_repmat(A,x)
B = repmat(x, size(A,1), 1);
B = bsxfun(#times, B(:), repmat(A,numel(x),1));
end
% Divakar
function B = func2_bsxfun_3rd_dim(A,x)
B = bsxfun(#times, A, permute(x, [3 1 2]));
B = reshape(permute(B, [1 3 2]), [], size(A,2));
end
% Vissenbot
function B = func2_forloop_3rd_dim(A,x)
B = zeros([size(A) numel(x)], 'like',A);
for i=1:numel(x)
B(:,:,i) = x(i) .* A;
end
B = reshape(permute(B, [1 3 2]), [], size(A,2));
end
% Luis Mendo
function B = func3_kron(A,x)
B = kron(x(:), A);
end
% SergioHaram & TheMinion
function B = func4_forloop_matrix(A,x)
[m,n] = size(A);
p = numel(x);
B = zeros(m*p,n, 'like',A);
for i=1:numel(x)
B((i-1)*m+1:i*m,:) = x(i) .* A;
end
end
% Amro
function B = func5_forloop_cell(A,x)
B = cell(numel(x),1);
for i=1:numel(x)
B{i} = x(i) .* A;
end
B = cell2mat(B);
%B = vertcat(B{:});
end
% Amro
function B = func6_arrayfun(A,x)
B = cell2mat(arrayfun(#(xx) xx.*A, x(:), 'UniformOutput',false));
end
The results on my machine:
>> t
t =
0.1650 %# repmat (Amro)
0.2915 %# bsxfun in the 3rd dimension (Divakar)
0.4200 %# for-loop in the 3rd dim (Vissenbot)
0.1284 %# kron (Luis Mendo)
0.2997 %# for-loop with indexing (SergioHaram & TheMinion)
0.5160 %# for-loop with cell array (Amro)
0.4854 %# arrayfun (Amro)
(Those timings can slightly change between different runs, but this should give us an idea how the methods compare)
Note that some of these methods are going to cause out-of-memory errors for larger inputs (for example my solution based on repmat can easily run out of memory). Others will get significantly slower for larger sizes but won't error due to exhausted memory (the kron solution for instance).
I think that the bsxfun method func2_bsxfun_3rd_dim or the straightforward for-loop func4_forloop_matrix (thanks to MATLAB JIT) are the best solutions in this case.
Of course you can change the above benchmark parameters (size of x and A) and draw your own conclusions :)
Just to add an alternative, you maybe can use cellfun to achieve what you want. Here's an example (slightly modified from yours):
x = randi(2, 5, 3)-1;
a = randi(3,3);
%// bsxfun 3D (As implemented in the accepted solution)
val = bsxfun(#and, a, permute(x', [3 1 2])); %//'
out = reshape(permute(val,[1 3 2]),size(val,1)*size(val,3),[]);
%// cellfun (My solution)
val2 = cellfun(#(z) bsxfun(#and, a, z), num2cell(x, 2), 'UniformOutput', false);
out2 = cell2mat(val2); % or use cat(3, val2{:}) to get a 3D matrix equivalent to val and then permute/reshape like for out
%// compare
disp(nnz(out ~= out2));
Both give the same exact result.
For more infos and tricks using cellfun, see: http://matlabgeeks.com/tips-tutorials/computation-using-cellfun/
And also this: https://stackoverflow.com/a/1746422/1121352
If your vector x is of lenght = 12 and your matrix of size 3x4, I don't think that using one or the other would change much in term of time. If you are working with higher size matrix and vector, now that might become an issue.
So first of all, we want to multiply a vector with a matrix. In the for-loop method, that would give something like that :
s = size(A);
new_matrix(s(1),s(2),numel(x)) = zeros; %This is for pre-allocating. If you have a big vector or matrix, this will help a lot time efficiently.
for i = 1:numel(x)
new_matrix(:,:,i)= A.*x(i)
end
This will give you 3D matrix, with each 3rd dimension being a result of your multiplication. If this is not what you are looking for, I'll be adding another solution which might be more time efficient with bigger matrixes and vectors.

How do I quickly calculate changes in two n-by-4 matrices?

I have two matrices (tri1 and tri2) which represent a Delaunay triangulation. tri1 is the triangulation before inserting a new point, tri2 is the result after adding a new point. Each row has 4 columns. The rows represent tetrahedra.
I would like to calculate a relation between lines from tri1 to tri2. A result could look like this:
result =
1 1
2 2
3 3
4 4
0 0 % tri1(5, :) was not found in tri2 (a lot more lines could be missing)
6 5
7 6
8 7
9 8
10 9
Currently my source code looks like this:
% sort the arrays
[~, idx1] = sort(tri1(:, 1), 'ascend');
[~, idx2] = sort(tri2(:, 1), 'ascend');
stri1 = tri1(idx1, :);
stri2 = tri2(idx2, :);
result = zeros(size(tri1, 1), 2);
% find old cells in new triangulation
deleted = 0;
for ii = 1:size(tri1, 1)
found = false;
for jj = ii-deleted:size(tri2, 1)
if sum(stri1(ii, :) == stri2(jj, :)) == 4 % hot spot according to the profiler
found = true;
break;
end
if (stri1(ii, 1) < stri2(jj, 1)), break, end;
end
if found == false
deleted = deleted + 1;
else
result(idx1(ii), 1) = idx1(ii);
result(idx1(ii), 2) = idx2(jj);
end
end
The above source code gives me the results that I want, but not fast enough. I am not very experienced with MATLAB, I usually work with C++. My question: How can I speed up the comparison of two rows?
Some additional information (just in case):
the number of rows in tri can grow to about 10000
this function will be called once per inserted vertex (about 1000)
I cannot follow your example code completely, but judging from your explanation you want to see whether a row from matrix A occurs in matrix B.
In this case a very efficient implentation is available:
[Lia, Locb] = ismember(A,B,'rows');
Check the doc for more information about this function and see whether it is what you need.

Using non-continuous integers as identifiers in cells or structs in Matlab

I want to store some results in the following way:
Res.0 = magic(4); % or Res.baseCase = magic(4);
Res.2 = magic(5); % I would prefer to use integers on all other
Res.7 = magic(6); % elements than the first.
Res.2000 = 1:3;
I want to use numbers between 0 and 3000, but I will only use approx 100-300 of them. Is it possible to use 0 as an identifier, or will I have to use a minimum value of 1? (The numbers have meaning, so I would prefer if I don't need to change them). Can I use numbers as identifiers in structs?
I know I can do the following:
Res{(last number + 1)} = magic(4);
Res{2} = magic(5);
Res{7} = magic(6);
Res{2000} = 1:3;
And just remember that the last element is really the "number zero" element.
In this case I will create a bunch of empty cell elements [] in the non-populated positions. Does this cause a problem? I assume it will be best to assign the last element first, to avoid creating a growing cell, or does this not have an effect? Is this an efficient way of doing this?
Which will be most efficient, struct's or cell's? (If it's possible to use struct's, that is).
My main concern is computational efficiency.
Thanks!
Let's review your options:
Indexing into a cell arrays
MATLAB indices start from 1, not from 0. If you want to store your data in cell arrays, in the worst case, you could always use the subscript k + 1 to index into cell corresponding to the k-th identifier (k ≥ 0). In my opinion, using the last element as the "base case" is more confusing. So what you'll have is:
Res{1} = magic(4); %// Base case
Res{2} = magic(5); %// Corresponds to identifier 1
...
Res{k + 1} = ... %// Corresponds to indentifier k
Accessing fields in structures
Field names in structures are not allowed to begin with numbers, but they are allowed to contain them starting from the second character. Hence, you can build your structure like so:
Res.c0 = magic(4); %// Base case
Res.c1 = magic(5); %// Corresponds to identifier 1
Res.c2 = magic(6); %// Corresponds to identifier 2
%// And so on...
You can use dynamic field referencing to access any field, for instance:
k = 3;
kth_field = Res.(sprintf('c%d', k)); %// Access field k = 3 (i.e field 'c3')
I can't say which alternative seems more elegant, but I believe that indexing into a cell should be faster than dynamic field referencing (but you're welcome to check that out and prove me wrong).
As an alternative to EitanT's answer, it sounds like matlab's map containers are exactly what you need. They can deal with any type of key and the value may be a struct or cell.
EDIT:
In your case this will be:
k = {0,2,7,2000};
Res = {magic(4),magic(5),magic(6),1:3};
ResMap = containers.Map(k, Res)
ResMap(0)
ans =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
I agree with the idea in #wakjah 's comment. If you are concerned about the efficiency of your program it's better to change the interpretation of the problem. In my opinion there is definitely a way that you could priorotize your data. This prioritization could be according to the time you acquired them, or with respect to the inputs that they are calculated. If you set any kind of priority among them, you can sort them into an structure or cell (structure might be faster).
So
Priority (Your Current Index Meaning) Data
1 0 magic(4)
2 2 magic(5)
3 7 magic(6)
4 2000 1:3
Then:
% Initialize Result structure which is different than your Res.
Result(300).Data = 0; % 300 the maximum number of data
Result(300).idx = 0; % idx or anything that represent the meaning of your current index.
% Assigning
k = 1; % Priority index
Result(k).idx = 0; Result(k).Data = magic(4); k = k + 1;
Result(k).idx = 2; Result(k).Data = magic(5); k = k + 1;
Result(k).idx = 7; Result(k).Data = magic(6); k = k + 1;
...

Resources