Comparing Two Matrices in MATLAB which shows how much they are matched - performance

Please assume A is a matrix of 4 x 4 which has:
A = 1 0 1 0
1 0 1 0
1 1 1 0
1 1 0 0
And B is a reference matrix (4 x 4) which is:
B = 1 0 1 0
1 0 1 0
1 0 1 0
1 1 1 0
Now suppose A is compared against the reference matrix B. Almost all elements are equal; only A(3,2) and A(4,3) differ. However, since B is the reference and A is being compared to it, only the differences at positions where B is 1 matter. In this particular example only A(4,3) matters, not A(3,2). That is:
>> C = B ~= A
C =
0 0 0 0
0 0 0 0
0 1 0 0
0 0 1 0
A(4,3) ~= B(4,3)
Finally, we are looking for a piece of code that shows what percentage of the ones in A match the corresponding elements of B. In this case the result is:
(8 / 9) * 100 = 88.89 % are matched.
Please bear in mind that speed is also important here, so quicker solutions are more appreciated. Thanks.

To get only the differing entries where B has a 1, AND the comparison with B. To get the percentage, count the positions where both A and B are 1, then divide by the number of ones in B (or by the number of ones in A; see the note below).
A = [1 0 1 0;
1 0 1 0;
1 1 1 0;
1 1 0 0];
B = [1 0 1 0;
1 0 1 0;
1 0 1 0;
1 1 1 0];
C = (B ~= A) & B
p = sum(B(:) & A(:)) / sum(B(:)) * 100
This is the result:
C =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
p =
88.8889
Edit / Note: In the OP's question it's not 100% clear if he wants the percentage in relation to the sum of ones in A or B. I assumed that it is a percentage of the reference-matrix, which is B. Therefore I divide by sum(B(:)). In case you need it in reference to the ones in A, just change the last line to:
p = sum(B(:) & A(:)) / sum(A(:)) * 100
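For readers outside MATLAB, the same two computations can be sketched in plain Python. This is only an illustration: the helper names `mismatch_mask` and `match_percentage` are hypothetical and not part of the answer above, and the matrices are assumed to be lists of lists holding only 0s and 1s.

```python
A = [[1, 0, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0]]
B = [[1, 0, 1, 0],
     [1, 0, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 1, 0]]


def mismatch_mask(a, b):
    """Return 1 where the reference b has a 1 but a differs, else 0."""
    return [[int(bv == 1 and av != bv) for av, bv in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def match_percentage(a, b, relative_to):
    """Percentage of shared ones, relative to the ones in `relative_to`."""
    common = sum(av & bv for ra, rb in zip(a, b) for av, bv in zip(ra, rb))
    total = sum(sum(row) for row in relative_to)
    return 100.0 * common / total


C = mismatch_mask(A, B)                       # counterpart of (B ~= A) & B
p = match_percentage(A, B, relative_to=B)     # counterpart of the p = ... line
print(C)
print(round(p, 4))
```

Passing `relative_to=A` instead gives the percentage with respect to the ones in A, mirroring the alternative last line in the note.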

If I got it right, what you want to know is where B == 1 and A == 0.
Try this:
>> C = B & ~A
C =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
To get the percentage, you could try this:
>> 100 * sum(A(:) & B(:)) / sum(A(:))
ans =
88.8889

You can use matrix multiplication, which should be quite efficient. For 0/1 matrices, the dot product A(:).'*B(:) counts the positions where both A and B are 1.
To get the percentage value with respect to A -
percentage_wrtA = A(:).'*B(:)/sum(A(:)) * 100;
To get the percentage value with respect to B -
percentage_wrtB = A(:).'*B(:)/sum(B(:)) * 100;
Runtime tests
Here are some quick runtime tests comparing matrix multiplication against summing the ANDed elements with (:) -
>> M = 6000; %// Datasize
>> A = randi([0,1],M,M);
>> B = randi([0,1],M,M);
>> tic,sum(B(:) & A(:));toc
Elapsed time is 0.500149 seconds.
>> tic,A(:).'*B(:);toc
Elapsed time is 0.126881 seconds.
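The reason the dot product can stand in for the ANDed sum is that the product of two 0/1 values is 1 only when both are 1. A small Python sketch of that equivalence follows; the function names are hypothetical, and it makes no claim about relative speed, which depends on the environment.

```python
import random


def common_ones_and(a, b):
    """Count positions where both 0/1 sequences hold a 1 (logical AND)."""
    return sum(1 for x, y in zip(a, b) if x and y)


def common_ones_dot(a, b):
    """Same count expressed as a dot product of the two 0/1 sequences."""
    return sum(x * y for x, y in zip(a, b))


random.seed(0)  # fixed seed so the illustration is repeatable
a = [random.randint(0, 1) for _ in range(1000)]
b = [random.randint(0, 1) for _ in range(1000)]
print(common_ones_and(a, b) == common_ones_dot(a, b))
```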

Try:
sum(sum(A & B))./sum(sum(A))
Output:
ans =
0.8889

Related

Quick way of finding complementary vectors in MATLAB

I have a matrix of N rows of binary vectors, i.e.
mymatrix = [ 1 0 0 1 0;
1 1 0 0 1;
0 1 1 0 1;
0 1 0 0 1;
0 0 1 0 0;
0 0 1 1 0;
.... ]
where I'd like to find the combinations of rows that, when added together, give me exactly:
[1 1 1 1 1]
So in the above example, the combinations that would work are 1/3, 1/4/5, and 2/6.
The code I have for this right now is:
i = 1;
for j = 1:5
C = combnk([1:N],j); % Get every possible combination of rows
for c = 1:size(C,1)
if isequal(ones(1,5),sum(mymatrix(C(c,:),:)))
combis{i} = C(c,:);
i = i+1;
end
end
end
But as you would imagine, this takes a while, especially because of that combnk in there.
What might be a useful algorithm/function that can help me speed this up?
M = [
1 0 0 1 0;
1 1 0 0 1;
0 1 1 0 1;
0 1 0 0 1;
0 0 1 0 0;
0 0 1 1 0;
1 1 1 1 1
];
% Find all the unique combinations of rows...
S = (dec2bin(1:2^size(M,1)-1) == '1');
% Find the matching combinations...
matches = cell(0,1);
for i = 1:size(S,1)
S_curr = S(i,:);
rows = M(S_curr,:);
rows_sum = sum(rows,1);
if (all(rows_sum == 1))
matches = [matches; {find(S_curr)}];
end
end
To display your matches in a good stylized way:
for i = 1:numel(matches)
match = matches{i};
if (numel(match) == 1)
disp(['Match found for row: ' mat2str(match) '.']);
else
disp(['Match found for rows: ' mat2str(match) '.']);
end
end
This will produce:
Match found for row: 7.
Match found for rows: [2 6].
Match found for rows: [1 4 5].
Match found for rows: [1 3].
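In terms of efficiency, on my machine this algorithm completes the detection of matches in about 2 milliseconds. Note that it enumerates all 2^N - 1 row subsets, so the cost grows exponentially with the number of rows.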
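The subset enumeration used above (one bit per row, as produced by dec2bin) can be sketched in Python with integer bitmasks. This is a rough equivalent rather than a port: the function name `complementary_row_sets` is hypothetical, and the matches come out in a different order than in the MATLAB output because the bit order differs.

```python
M = [
    [1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1],
]


def complementary_row_sets(rows):
    """Return every set of 1-based row indices whose column sums are all 1."""
    n_rows, n_cols = len(rows), len(rows[0])
    matches = []
    # Each integer from 1 to 2^n_rows - 1 encodes one non-empty subset of rows.
    for mask in range(1, 2 ** n_rows):
        chosen = [i for i in range(n_rows) if (mask >> i) & 1]
        col_sums = [sum(rows[i][c] for i in chosen) for c in range(n_cols)]
        if all(s == 1 for s in col_sums):
            matches.append([i + 1 for i in chosen])
    return matches


matches = complementary_row_sets(M)
for match in matches:
    print("Match found for rows:", match)
```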

Efficiently unpack a vector into binary matrix Octave

On Octave I'm trying to unpack a vector in the format:
y = [ 1
2
4
1
3 ]
I want to return a matrix of dimension ( rows(y) x max value(y) ), where for each row I have a 1 in the column of the original digits value, and a zero everywhere else, i.e. for the example above
y01 = [ 1 0 0 0
0 1 0 0
0 0 0 1
1 0 0 0
0 0 1 0 ]
so far I have
y01 = zeros( m, num_labels );
for i = 1:m
for j = 1:num_labels
y01(i,j) = (y(i) == j);
end
end
which works, but is going to get slow for bigger matrices, and seems inefficient because it cycles through every single value even though the majority aren't changing.
I found this for R on another thread:
f3 <- function(vec) {
U <- sort(unique(vec))
M <- matrix(0, nrow = length(vec),
ncol = length(U),
dimnames = list(NULL, U))
M[cbind(seq_len(length(vec)), match(vec, U))] <- 1L
M
}
but I don't know R and I'm not sure if/how the solution ports to octave.
Thanks for any suggestions!
Use a sparse matrix (which also saves a lot of memory) which can be used in further calculations as usual:
y = [1; 2; 4; 1; 3]
y01 = sparse (1:rows (y), y, 1)
if you really want a full matrix then use "full":
full (y01)
ans =
1 0 0 0
0 1 0 0
0 0 0 1
1 0 0 0
0 0 1 0
Sparse is a more efficient way to do this when the matrix is big.
If your dimension of the result is not very high, you can try this:
y = [1; 2; 4; 1; 3]
I = eye(max(y));
y01 = I(y,:)
The result is the same as full(sparse(...)).
y01 =
1 0 0 0
0 1 0 0
0 0 0 1
1 0 0 0
0 0 1 0
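The identity-matrix trick above (pick row y(i) of eye(max(y)) for each label) can be sketched in Python as follows. The function name `one_hot` is hypothetical, and labels are assumed to be integers from 1 to max(labels), as in the question.

```python
def one_hot(labels):
    """Build a one-hot matrix by selecting rows of an identity matrix."""
    k = max(labels)
    identity = [[int(r == c) for c in range(k)] for r in range(k)]
    # Labels are 1-based, so label v selects identity row v - 1.
    # list(...) copies the row so the output rows are independent.
    return [list(identity[v - 1]) for v in labels]


y = [1, 2, 4, 1, 3]
y01 = one_hot(y)
for row in y01:
    print(row)
```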
% Vector y to Matrix Y
Y = zeros(m, num_labels);
% Loop through each row
for i = 1:m
% Use the value of y as an index; set the value matching index to 1
Y(i,y(i)) = 1;
end
Another possibility is:
y = [1; 2; 4; 1; 3]
classes = unique(y)(:)
num_labels = length(classes)
y01=[1:num_labels] == y
With the following detailed printout:
y =
1
2
4
1
3
classes =
1
2
3
4
num_labels = 4
y01 =
1 0 0 0
0 1 0 0
0 0 0 1
1 0 0 0
0 0 1 0

Count the number of rows between each instance of a value in a matrix

Assume the following matrix:
myMatrix = [
1 0 1
1 0 0
1 1 1
1 1 1
0 1 1
0 0 0
0 0 0
0 1 0
1 0 0
0 0 0
0 0 0
0 0 1
0 0 1
0 0 1
];
Given the above (and treating each column independently), I'm trying to create a matrix that will contain the number of rows since the last value of 1 has "shown up". For example, in the first column, the first four values would become 0 since there are 0 rows between each of those rows and the previous value of 1.
Row 5 would become 1, row 6 = 2, row 7 = 3, row 8 = 4. Since row 9 contains a 1, it would become 0 and the count starts again with row 10. The final matrix should look like this:
FinalMatrix = [
0 1 0
0 2 1
0 0 0
0 0 0
1 0 0
2 1 1
3 2 2
4 0 3
0 1 4
1 2 5
2 3 6
3 4 0
4 5 0
5 6 0
];
What is a good way of accomplishing something like this?
EDIT: I'm currently using the following code:
[numRow,numCol] = size(myMatrix);
oneColumn = 1:numRow;
FinalMatrix = repmat(oneColumn',1,numCol);
toSubtract = zeros(numRow,numCol);
for m=1:numCol
rowsWithOnes = find(myMatrix(:,m));
for mm=1:length(rowsWithOnes);
toSubtract(rowsWithOnes(mm):end,m) = rowsWithOnes(mm);
end
end
FinalMatrix = FinalMatrix - toSubtract;
which runs about 5 times faster than the bsxfun solution posted over many trials and data sets (which are about 1500 x 2500 in size). Can the code above be optimized?
For a single column you could do this:
col = 1; %// desired column
vals = bsxfun(@minus, 1:size(myMatrix,1), find(myMatrix(:,col)));
vals(vals<0) = inf;
result = min(vals, [], 1).';
Result for first column:
result =
0
0
0
0
1
2
3
4
0
1
2
3
4
5
find + diff + cumsum based approach -
offset_array = zeros(size(myMatrix));
for k1 = 1:size(myMatrix,2)
a = myMatrix(:,k1);
widths = diff(find(diff([1 ; a])~=0));
idx = find(diff(a)==1)+1;
offset_array(idx(idx<=numel(a)),k1) = widths(1:2:end);
end
FinalMatrix1 = cumsum(double(myMatrix==0) - offset_array);
Benchmarking
The benchmarking code for comparing the above mentioned approach against the one in the question is listed here -
clear all
myMatrix = round(rand(1500,2500)); %// create random input array
for k = 1:50000
tic(); elapsed = toc(); %// Warm up tic/toc
end
disp('------------- With FIND+DIFF+CUMSUM based approach') %//'#
tic
offset_array = zeros(size(myMatrix));
for k1 = 1:size(myMatrix,2)
a = myMatrix(:,k1);
widths = diff(find(diff([1 ; a])~=0));
idx = find(diff(a)==1)+1;
offset_array(idx(idx<=numel(a)),k1) = widths(1:2:end);
end
FinalMatrix1 = cumsum(double(myMatrix==0) - offset_array);
toc
clear FinalMatrix1 offset_array idx widths a
disp('------------- With original approach') %//'#
tic
[numRow,numCol] = size(myMatrix);
oneColumn = 1:numRow;
FinalMatrix = repmat(oneColumn',1,numCol); %//'#
toSubtract = zeros(numRow,numCol);
for m=1:numCol
rowsWithOnes = find(myMatrix(:,m));
for mm=1:length(rowsWithOnes);
toSubtract(rowsWithOnes(mm):end,m) = rowsWithOnes(mm);
end
end
FinalMatrix = FinalMatrix - toSubtract;
toc
The results I got were -
------------- With FIND+DIFF+CUMSUM based approach
Elapsed time is 0.311115 seconds.
------------- With original approach
Elapsed time is 7.587798 seconds.
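As a reference for checking any of the vectorized versions, the counting rule from the question can be written as a plain per-column running counter. The Python sketch below is not the find + diff + cumsum approach and makes no performance claim; the function name `rows_since_last_one` is hypothetical. Like FinalMatrix in the question, it starts counting from the first row when a column begins with 0.

```python
def rows_since_last_one(matrix):
    """For each column, count the rows since the most recent 1 (0 at a 1)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    result = [[0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        count = 0
        for r in range(n_rows):
            # Reset at a 1, otherwise extend the current run of zeros.
            count = 0 if matrix[r][c] == 1 else count + 1
            result[r][c] = count
    return result


my_matrix = [
    [1, 0, 1], [1, 0, 0], [1, 1, 1], [1, 1, 1], [0, 1, 1],
    [0, 0, 0], [0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 0],
    [0, 0, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1],
]
final_matrix = rows_since_last_one(my_matrix)
for row in final_matrix:
    print(row)
```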

Why does diag exhibit inconsistent behavior in octave

Can someone explain what's going on here?
octave:1> t = eye(3)
t =
Diagonal Matrix
1 0 0
0 1 0
0 0 1
octave:2> diag(t(3,:))
ans =
Diagonal Matrix
0 0 0
0 0 0
0 0 1
octave:3> diag(t(2,:))
ans =
Diagonal Matrix
0 0 0
0 1 0
0 0 0
octave:4> diag(t(1,:))
ans = 1
Why do the first two give back 3x3 matrices but the last one is just a number?
The problem arises because of the way t(1,:) was created, from eye(3).
If you output the rows of t individually the results are:
octave.28> t(1,:)
ans =
**Diagonal Matrix**
1 0 0
octave.29> t(2,:)
ans =
0 1 0
octave.30> t(3,:)
ans =
0 0 1
For some reason (I can't explain) t(1,:) is still recognized as a diagonal matrix, while t(2,:) and t(3,:) are vectors. When you call diag(t(1,:)) it is therefore not receiving a vector argument but a matrix, and diag of a matrix returns its main diagonal, which here is the single element 1. If you convert t(1,:) to a vector before evaluation you get the expected result.
octave.31> diag(vec(t(1,:)))
ans =
**Diagonal Matrix**
1 0 0
0 0 0
0 0 0

Microsoft Interview: transforming a matrix

Given a matrix of size n x m filled with 0's and 1's
e.g.:
1 1 0 1 0
0 0 0 0 0
0 1 0 0 0
1 0 1 1 0
if the matrix has 1 at (i,j), fill the column j and row i with 1's
i.e., we get:
1 1 1 1 1
1 1 1 1 0
1 1 1 1 1
1 1 1 1 1
Required complexity: O(n*m) time and O(1) space
NOTE: you are not allowed to store anything except '0' or '1' in the matrix entries
Above is a Microsoft Interview Question.
I thought for two hours now. I have some clues but can't proceed any more.
OK. The first important point is that even a straightforward brute-force approach doesn't solve this easily.
If I just use two loops to iterate through every cell in the matrix and change the corresponding row and column, it doesn't work, because the resulting matrix must be based on the original matrix.
For example, if I see a[0][0] == 1, I can't change row 0 and column 0 all to 1, because that will affect row 1, which doesn't contain a 1 originally.
The second thing I noticed is that if a row r contains only 0s and a column c contains only 0s, then a[r][c] must be 0; every other position should be 1.
Then another question arises: if I find such a row and column, how can I mark the corresponding cell a[r][c] as special, given that it is already 0?
My intuition is that I should use some kind of bit operations for this. Or, to meet the required complexity, I have to do something like this: after I take care of a[i][j], proceed to a[i+1][j+1] instead of scanning row by row or column by column.
Even for brute force, without considering time complexity, I can't solve it under the other conditions.
Does anyone have a clue?
Solution: Java version
@japreiss has answered this question, and his/her answer is smart and correct. His/her code is in Python, and here I give the Java version. All credit goes to @japreiss.
public class MatrixTransformer {
private int[][] a;
private int m;
private int n;
public MatrixTransformer(int[][] _a, int _m, int _n) {
a = _a;
m = _m;
n = _n;
}
private int scanRow(int i) {
int allZero = 0;
for(int k = 0;k < n;k++)
if (a[i][k] == 1) {
allZero = 1;
break;
}
return allZero;
}
private int scanColumn(int j) {
int allZero = 0;
for(int k = 0;k < m;k++)
if (a[k][j] == 1) {
allZero = 1;
break;
}
return allZero;
}
private void setRowToAllOnes(int i) {
for(int k = 0; k < n;k++)
a[i][k] = 1;
}
private void setColToAllOnes(int j) {
for(int k = 0; k < m;k++)
a[k][j] = 1;
}
// We use the first row and column of the matrix to store the row and
// column scan values, but we need aux storage (firstRow, firstCol)
// to deal with the overlap.
// transform() first scans each column and stores the result in the 1st row - O(mn) work
public void transform() {
int firstRow = scanRow(0);
int firstCol = scanColumn(0);
for(int k = 0;k < n;k++) {
a[0][k] = scanColumn(k);
}
// now row 0 tells us whether each column is all zeroes or not
// it's also the correct output unless row 0 contained a 1 originally
for(int k = 0;k < m;k++) {
a[k][0] = scanRow(k);
}
a[0][0] = firstCol | firstRow;
for (int i = 1;i < m;i++)
for(int j = 1;j < n;j++)
a[i][j] = a[0][j] | a[i][0];
if (firstRow == 1) {
setRowToAllOnes(0);
}
if (firstCol == 1)
setColToAllOnes(0);
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
for (int i = 0; i< m;i++) {
for(int j = 0;j < n;j++) {
sb.append(a[i][j] + ", ");
}
sb.append("\n");
}
return sb.toString();
}
/**
* @param args
*/
public static void main(String[] args) {
int[][] a = {{1, 1, 0, 1, 0}, {0, 0, 0, 0, 0},{0, 1, 0, 0, 0},{1, 0, 1, 1, 0}};
MatrixTransformer mt = new MatrixTransformer(a, 4, 5);
mt.transform();
System.out.println(mt);
}
}
Here is a solution in python pseudocode that uses 2 extra bools of storage. I think it is more clear than I could do in English.
# assumes matrix is an m x n list of lists holding only 0s and 1s
def scanRow(i):
    # return 0 if row i is all zeroes, else 1
    return 1 if any(matrix[i][col] for col in range(n)) else 0

def scanColumn(j):
    # return 0 if col j is all zeroes, else 1
    return 1 if any(matrix[row][j] for row in range(m)) else 0

# we're going to use the first row and column
# of the matrix to store row and column scan values,
# but we need aux storage to deal with the overlap
firstRow = scanRow(0)
firstCol = scanColumn(0)

# scan each column and store result in 1st row - O(mn) work
for col in range(1, n):
    matrix[0][col] = scanColumn(col)

# now row 0 tells us whether each column is all zeroes or not
# it's also the correct output unless row 0 contained a 1 originally
# do the same for rows into column 0 - O(mn) work
for row in range(1, m):
    matrix[row][0] = scanRow(row)

matrix[0][0] = firstRow or firstCol

# now deal with the rest of the values - O(mn) work
for row in range(1, m):
    for col in range(1, n):
        matrix[row][col] = matrix[0][col] or matrix[row][0]

# 3 O(mn) passes!
# go back and fix row 0 and column 0
if firstRow:
    for col in range(n):
        matrix[0][col] = 1  # set row 0 to all ones
if firstCol:
    for row in range(m):
        matrix[row][0] = 1  # set col 0 to all ones
Here's another intuition that gives a clean and simple algorithm for solving the problem.
An initial algorithm using O(n) space.
For now, let's ignore the O(1) memory constraint. Suppose that you can use O(n) memory (if the matrix is m × n). That would make this problem a lot easier and we could use the following strategy:
1. Create a boolean array with one entry per column.
2. For each column, determine whether there are any 1's in the column and store that information in the appropriate array entry.
3. For each row, set that row to be all 1's if there are any 1's in the row.
4. For each column, set that column to be all 1's if the corresponding array entry is set.
As an example, consider this array:
1 1 0 1 0
0 0 0 0 0
0 1 0 0 0
1 0 1 1 0
We'd start off by creating and populating the auxiliary array, which can be done in time O(mn) by visiting each column one at a time. This is shown here:
1 1 0 1 0
0 0 0 0 0
0 1 0 0 0
1 0 1 1 0
1 1 1 1 0 <--- aux array
Next, we iterate across the rows and fill each one in if it contains any 1's. This gives this result:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1
1 1 1 1 1
1 1 1 1 0 <--- aux array
Finally, we fill in each column with 1's if the auxiliary array has a 1 in that position. This is shown here:
1 1 1 1 1
1 1 1 1 0
1 1 1 1 1
1 1 1 1 1
1 1 1 1 0 <--- aux array
So there's one problem: this uses O(n) space, which we don't have! So why even go down this route?
A revised algorithm using O(1) space.
It turns out that we can use a very cute trick to run this algorithm using O(1) space. We need a key observation: if every row contains at least one 1, then the entire matrix becomes 1's. We therefore start off by seeing if this is the case. If it is, great! We're done.
Otherwise, there must be some row in the matrix that is all 0's. Since this row is all 0's, we know that in the "fill each row containing a 1 with 1's" step, the row won't be filled in. Therefore, we can use that row as our auxiliary array!
Let's see this in action. Start off with this:
1 1 0 1 0
0 0 0 0 0
0 1 0 0 0
1 0 1 1 0
Now, we can find a row with all 0's in it and use it as our auxiliary array:
1 1 0 1 0
0 0 0 0 0 <-- Aux array
0 1 0 0 0
1 0 1 1 0
We now fill in the auxiliary array by looking at each column and marking which ones contain at least one 1:
1 1 0 1 0
1 1 1 1 0 <-- Aux array
0 1 0 0 0
1 0 1 1 0
It's perfectly safe to fill in the 1's here because we know that they're going to get filled in anyway. Now, for each row that contains a 1, except for the auxiliary array row, we fill in those rows with 1's:
1 1 1 1 1
1 1 1 1 0 <-- Aux array
1 1 1 1 1
1 1 1 1 1
We skip the auxiliary array because initially it was all 0's, so it wouldn't normally be filled. Finally, we fill in each column with a 1 in the auxiliary array with 1's, giving this final result:
1 1 1 1 1
1 1 1 1 0 <-- Aux array
1 1 1 1 1
1 1 1 1 1
Let's do another example. Consider this setup:
1 0 0 0
0 0 1 0
0 0 0 0
0 0 1 0
We begin by finding a row that's all zeros, as shown here:
1 0 0 0
0 0 1 0
0 0 0 0 <-- Aux array
0 0 1 0
Next, let's populate that row by marking columns containing a 1:
1 0 0 0
0 0 1 0
1 0 1 0 <-- Aux array
0 0 1 0
Now, fill in all rows containing a 1:
1 1 1 1
1 1 1 1
1 0 1 0 <-- Aux array
1 1 1 1
Next, fill in all columns containing a 1 in the aux array with 1's. This is already done here, and we have our result!
As another example, consider this array:
1 0 0
0 0 1
0 1 0
Every row here contains at least one 1, so we just fill the matrix with ones and are done.
Finally, let's try this example:
0 0 0 0 0
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 1 0
We have lots of choices for aux arrays, so let's pick the first row:
0 0 0 0 0 <-- aux array
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 1 0
Now, we fill in the aux array:
0 1 0 1 0 <-- aux array
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 1 0
Now, we fill in the rows:
0 1 0 1 0 <-- aux array
0 0 0 0 0
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1
Now, we fill in the columns based on the aux array:
0 1 0 1 0 <-- aux array
0 1 0 1 0
1 1 1 1 1
0 1 0 1 0
1 1 1 1 1
And we're done! The whole thing runs in O(mn) time because we
Do O(mn) work to find the aux array, and possibly O(mn) work immediately if one doesn't exist.
Do O(mn) work to fill in the aux array.
Do O(mn) work to fill in rows containing 1s.
Do O(mn) work to fill in columns containing 1s.
Plus, it only uses O(1) space, since we just need to store the index of the aux array and enough variables to do loops over the matrix.
EDIT: I have a Java implementation of this algorithm with comments describing it in detail available on my personal site. Enjoy!
Hope this helps!
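The O(1)-space steps described in this answer can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the Java implementation the answer mentions: the function name `fill_rows_and_columns` is hypothetical, and the matrix is assumed to be a non-empty list of equal-length lists holding only 0s and 1s.

```python
def fill_rows_and_columns(matrix):
    """In place: any row or column that contains a 1 becomes all 1s."""
    m, n = len(matrix), len(matrix[0])

    # Step 1: look for an all-zero row to reuse as the auxiliary array.
    aux = -1
    for i in range(m):
        if not any(matrix[i]):
            aux = i
            break
    if aux == -1:
        # Every row contains a 1, so the whole matrix becomes 1s.
        for i in range(m):
            for j in range(n):
                matrix[i][j] = 1
        return

    # Step 2: mark in the aux row which columns contain at least one 1.
    for j in range(n):
        if any(matrix[i][j] for i in range(m)):
            matrix[aux][j] = 1

    # Step 3: fill every row that contains a 1, skipping the aux row.
    for i in range(m):
        if i != aux and any(matrix[i]):
            for j in range(n):
                matrix[i][j] = 1

    # Step 4: fill every column flagged in the aux row.
    for j in range(n):
        if matrix[aux][j] == 1:
            for i in range(m):
                matrix[i][j] = 1


example = [[1, 1, 0, 1, 0],
           [0, 0, 0, 0, 0],
           [0, 1, 0, 0, 0],
           [1, 0, 1, 1, 0]]
fill_rows_and_columns(example)
for row in example:
    print(row)
```

Only the index `aux` and the loop variables are kept outside the matrix, which is what keeps the extra space constant.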
Assuming the matrix is 0-based, i.e. the first element is at mat[0][0]:
1. Use the first row and first column as table headers to contain column and row info respectively.
1.1 Note whether the first row and the first column originally contain a 1. If they do, they will require special handling at the end (described later).
2. Now, start scanning the inner matrix from mat[1][1] up to the last element.
2.1 If the element at mat[row][col] == 1 then update the table header data as follows:
Row: mat[row][0] = 1;
Column: mat[0][col] = 1;
At this point we have the complete info on which columns and rows should be set to 1.
3. Again scan the inner matrix starting from mat[1][1] and set each element to 1 if either the current row or column contains 1 in the table header:
if ( (mat[row][0] == 1) || (mat[0][col] == 1) ) then set mat[row][col] to 1.
At this point we have processed all the cells in the inner matrix and we are yet to process the table header itself.
4. Process the table header:
If the first row originally contained a 1, set all the elements in the first row to 1; if the first column originally contained a 1, set all the elements in the first column to 1.
5. Done
Time complexity O(2*((n-1)(m-1)+(n+m-1))), i.e. O(2*n*m), i.e. O(n*m)
Space O(1)
See my implementation at http://codepad.org/fycIyflw
Another solution would be to scan the matrix as usual, and at the first 1 you split the matrix in 4 quadrants. You then set the line and the column to 1's, and recursively process each quadrant. Just make sure to set the whole columns and rows, even though you are scanning only a quadrant.
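public void setOnes(int[][] matrix) {
    // Note: the two flag arrays use O(m + n) extra space.
    boolean[] row = new boolean[matrix.length];
    boolean[] col = new boolean[matrix[0].length];
    // First pass: record which rows and columns contain a 1.
    for (int i = 0; i < matrix.length; i++) {
        for (int j = 0; j < matrix[0].length; j++) {
            if (matrix[i][j] == 1) {
                row[i] = true;
                col[j] = true;
            }
        }
    }
    // Second pass: set every cell whose row or column was flagged.
    for (int i = 0; i < matrix.length; i++) {
        for (int j = 0; j < matrix[0].length; j++) {
            if (row[i] || col[j]) {
                matrix[i][j] = 1;
            }
        }
    }
}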
