I am trying to find islands of numbers in a matrix.
By an island, I mean a rectangular area in which the ones are connected to each other horizontally, vertically or diagonally, together with its surrounding boundary layer of zeros.
Suppose I have this matrix:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1
0 0 0 1 1 1 0 1 1 0 0 0 1 1 1 1 0
0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1
0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0
0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
By boundary layer, I mean rows 2 and 7 and columns 3 and 10 for island #1.
I want the row and column indices of the islands. So for the above matrix, the desired output is:
isl{1}= {[2 3 4 5 6 7]; % row indices of island#1
[3 4 5 6 7 8 9 10]} % column indices of island#1
isl{2}= {[2 3 4 5 6 7]; % row indices of island#2
[12 13 14 15 16 17]}; % column indices of island#2
isl{3} ={[9 10 11 12]; % row indices of island#3
[2 3 4 5 6 7 8 9 10 11];} % column indices of island#3
It doesn't matter which island is detected first.
I know that [r,c] = find(matrix) gives the row and column indices of the ones, but I have no idea how to detect which ones are connected, since they can be connected horizontally, vertically or diagonally.
Any ideas on how to deal with this problem?
You should look at the BoundingBox and ConvexHull stats returned by regionprops:
a = imread('circlesBrightDark.png');
bw = a < 100;
s = regionprops('table',bw,'BoundingBox','ConvexHull')
https://www.mathworks.com/help/images/ref/regionprops.html
Finding the connected components and their bounding boxes is the easy part. The more difficult part is merging the bounding boxes into islands.
Bounding Boxes
First the easy part.
function bBoxes = getIslandBoxes(lMap)
% find bounding box of each candidate island
% lMap is a logical matrix containing zero or more connected components
bw = bwlabel(lMap); % label connected components in logical matrix
bBoxes = struct2cell(regionprops(bw, 'BoundingBox')); % get bounding boxes
bBoxes = cellfun(@round, bBoxes, 'UniformOutput', false); % round values
end
The values are rounded because the bounding box returned by regionprops lies on the grid lines just outside its component rather than on the cell centers, and we need integer values to use as subscripts into the matrix. For example, a component that looks like this:
0 0 0
0 1 0
0 0 0
will have a bounding box of
[ 1.5000 1.5000 1.0000 1.0000 ]
which we round to
[ 2 2 1 1]
Merging
Now the hard part. First, the merge condition:
We merge bounding box b2 into bounding box b1 if b2 and the island of b1 (including the boundary layer) have a non-null intersection.
This condition ensures that bounding boxes are merged when one component is wholly or partially inside the bounding box of another, but it also catches the edge cases when a bounding box is within the zero boundary of another. Once all of the bounding boxes are merged, they are guaranteed to have a boundary of all zeros (or border the edge of the matrix), otherwise the nonzero value in its boundary would have been merged.
Since merging involves deleting the merged bounding box, the loops are done backwards so that we don't end up indexing non-existent array elements.
Unfortunately, making one pass through the array comparing each element to all the others is insufficient to catch all cases. To signal that all of the possible bounding boxes have been merged into islands, we use a flag called anyMerged and loop until we get through one complete iteration without merging anything.
function mBoxes = mergeBoxes(bBoxes)
% find bounding boxes that intersect, and merge them
mBoxes = bBoxes;
% merge bounding boxes that overlap
anyMerged = true; % flag to show when we've finished
while (anyMerged)
anyMerged = false; % no boxes merged on this iteration so far...
for box1 = numel(mBoxes):-1:2
for box2 = box1-1:-1:1
% if intersection between bounding boxes is > 0, merge
% the size of box1 is increased by 1 on all sides...
% this is so that components that lie within the borders
% of another component, but not inside the bounding box,
% are merged
if (rectint(mBoxes{box1} + [-1 -1 2 2], mBoxes{box2}) > 0)
coords1 = rect2corners(mBoxes{box1});
coords2 = rect2corners(mBoxes{box2});
minX = min(coords1(1), coords2(1));
minY = min(coords1(2), coords2(2));
maxX = max(coords1(3), coords2(3));
maxY = max(coords1(4), coords2(4));
mBoxes{box2} = [minX, minY, maxX-minX+1, maxY-minY+1]; % merge
mBoxes(box1) = []; % delete redundant bounding box
anyMerged = true; % bounding boxes merged: loop again
break;
end
end
end
end
end
The merge function uses a small utility function that converts rectangles with the format [x y width height] to a vector of subscripts for the top-left, bottom-right corners [x1 y1 x2 y2]. (This was actually used in another function to check that an island had a zero border, but as discussed above, this check is unnecessary.)
function corners = rect2corners(rect)
% change from rect = x, y, width, height
% to corners = x1, y1, x2, y2
corners = [rect(1), ...
rect(2), ...
rect(1) + rect(3) - 1, ...
rect(2) + rect(4) - 1];
end
Output Formatting and Driver Function
The return value from mergeBoxes is a cell array of rectangle objects. If you find this format useful, you can stop here, but it's easy to get to the format requested with ranges of rows and columns for each island:
function rRanges = rect2range(bBoxes, mSize)
% convert rect = x, y, width, height to
% range = y:y+height-1; x:x+width-1
% and expand range by 1 in all 4 directions to include zero border,
% making sure to stay within borders of original matrix
rangeFun = @(rect) {max(rect(2)-1,1):min(rect(2)+rect(4),mSize(1));...
max(rect(1)-1,1):min(rect(1)+rect(3),mSize(2))};
rRanges = cellfun(rangeFun, bBoxes, 'UniformOutput', false);
end
All that's left is a main function to tie all of the others together and we're done.
function theIslands = getIslandRects(m)
% get rectangle around each component in map
lMap = logical(m);
% get the bounding boxes of candidate islands
bBoxes = getIslandBoxes(lMap);
% merge bounding boxes that overlap
bBoxes = mergeBoxes(bBoxes);
% convert bounding boxes to row/column ranges
theIslands = rect2range(bBoxes, size(lMap));
end
Here's a run using the sample matrix given in the question:
M =
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1
0 0 0 1 1 1 0 1 1 0 0 0 1 1 1 1 0
0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1
0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0
0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> getIslandRects(M)
ans =
{
[1,1] =
{
[1,1] =
9 10 11 12
[2,1] =
2 3 4 5 6 7 8 9 10 11
}
[1,2] =
{
[1,1] =
2 3 4 5 6 7
[2,1] =
3 4 5 6 7 8 9 10
}
[1,3] =
{
[1,1] =
2 3 4 5 6 7
[2,1] =
12 13 14 15 16 17
}
}
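For readers without the Image Processing Toolbox, the same idea can be sketched in plain Python: label the 8-connected components, take their bounding boxes, merge any box that intersects another box grown by one cell, then pad by one for the zero border. This is a minimal sketch under those assumptions, not a port of the code above. The names component_boxes, merge_boxes and find_islands are hypothetical, and the returned indices are 1-based to match the MATLAB output.

```python
def component_boxes(grid):
    """Bounding box (r0, r1, c0, c1), 0-based inclusive, of each 8-connected component."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                seen[r][c] = True
                stack = [(r, c)]
                r0 = r1 = r
                c0 = c1 = c
                while stack:  # flood fill over the 8 neighbours
                    y, x = stack.pop()
                    r0, r1 = min(r0, y), max(r1, y)
                    c0, c1 = min(c0, x), max(c1, x)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and grid[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                boxes.append((r0, r1, c0, c1))
    return boxes


def merge_boxes(boxes):
    """Merge boxes until no box intersects another box grown by one cell."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                # does a, grown by one cell on every side, overlap b?
                if (a[0] - 1 <= b[1] and b[0] <= a[1] + 1
                        and a[2] - 1 <= b[3] and b[2] <= a[3] + 1):
                    boxes[i] = (min(a[0], b[0]), max(a[1], b[1]),
                                min(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return boxes


def find_islands(grid):
    """Return a list of (row_indices, col_indices), 1-based, zero border included."""
    rows, cols = len(grid), len(grid[0])
    islands = []
    for r0, r1, c0, c1 in merge_boxes(component_boxes(grid)):
        # pad by one cell, clamp to the matrix, convert to 1-based indices
        row_idx = list(range(max(r0 - 1, 0) + 1, min(r1 + 1, rows - 1) + 2))
        col_idx = list(range(max(c0 - 1, 0) + 1, min(c1 + 1, cols - 1) + 2))
        islands.append((row_idx, col_idx))
    return islands
```

Calling find_islands on the sample matrix (as a list of lists) should give the same three row/column ranges as the run above, although possibly in a different order.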
Quite easy!
Just use bwboundaries to get the boundary of each blob. You can then take the min and max of each boundary in the x and y directions to build your box.
Use image dilation and regionprops
mat = [...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1;
0 0 0 1 1 1 0 1 1 0 0 0 1 1 1 1 0;
0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1;
0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0;
0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0];
mat=logical(mat);
dil_mat=imdilate(mat,true(2,2)); %here we make bridges to 1 px away ones
l_mat=bwlabel(dil_mat,8);
bb = regionprops(l_mat,'BoundingBox');
bb = struct2cell(bb); bb = cellfun(@(x) fix(x), bb, 'un',0);
isl = cellfun(@(x) {max(1,x(2)):min(x(2)+x(4),size(mat,1)),...
max(1,x(1)):min(x(1)+x(3),size(mat,2))},bb,'un',0);
Let Y be a vector of length N, containing numbers from 1 to 10. As example code you can use:
Y = vec(1:10);
I am writing code that must create an N x 10 matrix, each row consisting of all zeros except for a single 1 in the position that corresponds to the number in vector Y. Thus, 1 in Y becomes 1000000000, 3 becomes 0010000000, and so on.
This approach works:
cell2mat(arrayfun(@(x)eye(10)(x,:), Y, 'UniformOutput', false))
My next idea was to "optimize", so eye(10) is not generated N times, and I wrote this:
theEye = eye(10);
cell2mat(arrayfun(@(x)theEye(x,:), Y, 'UniformOutput', false))
However, now Octave is giving me error:
error: can't perform indexing operations for diagonal matrix type
error: evaluating argument list element number 1
Why do I get this error? What is wrong?
Bonus questions — do you see a better way to do what I am doing? Is my attempt to optimize making things easier for Octave?
I ran this code in Octave and eye creates a matrix of a class (or whatever this is) known as a Diagonal Matrix:
octave:3> theEye = eye(10);
octave:4> theEye
theEye =
Diagonal Matrix
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 1
In fact, the documentation for Octave says that if the matrix is diagonal, a special object is created to handle the diagonal matrices instead of a standard matrix: https://www.gnu.org/software/octave/doc/interpreter/Creating-Diagonal-Matrices.html
What's interesting is that we can slice into this matrix outside of the arrayfun call, regardless of it being in a separate class.
octave:1> theEye = eye(10);
octave:2> theEye(1,:)
ans =
Diagonal Matrix
1 0 0 0 0 0 0 0 0 0
However, as soon as we put this into an arrayfun call, it decides to crap out:
octave:5> arrayfun(@(x)theEye(x,:), 1:3, 'uni', 0)
error: can't perform indexing operations for diagonal matrix type
This to me doesn't make any sense, especially since we can slice into it outside of arrayfun. One may suspect that it has something to do with arrayfun and since you are specifying UniformOutput to be false, a cell array of elements is returned per element in Y and perhaps something is going wrong when storing these slices into each cell array element.
However, this doesn't seem to be the culprit either. I took the first three rows of theEye, placed them into a cell array and merged them together using cell2mat:
octave:6> cell2mat({theEye(1,:); theEye(2,:); theEye(3,:)})
ans =
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
As such, I suspect that it may be some sort of internal bug (if you could call it that...). Thanks to user carandraug (see comment above), this is indeed a bug and it has been reported: https://savannah.gnu.org/bugs/?47510. What may also provide insight is that this code runs as expected in MATLAB.
In any case, one thing you can take away from this is that I would seriously refrain from using cell2mat. Just use straight up indexing:
Y = vec(1:10);
theEye = eye(10);
out = theEye(Y,:);
This indexes into theEye and extracts the rows listed in Y, producing a matrix in which each row is all zeros except for a 1 in the column given by the corresponding element of Y.
Also, have a look at this post for a similar example: Replace specific columns in a matrix with a constant column vector
However, it is defined over the columns instead of the rows, but it's very similar to what you want to achieve.
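The row-indexing idea is not specific to Octave. As a minimal sketch in plain Python (the function name one_hot_rows and its num_classes parameter are hypothetical, and labels are assumed to be integers from 1 to num_classes), each label simply selects a row of the identity matrix:

```python
def one_hot_rows(labels, num_classes=10):
    """Return one row per label; row k of the identity is selected by label k (1-based)."""
    identity = [[1 if i == j else 0 for j in range(num_classes)]
                for i in range(num_classes)]
    # copy each selected row so the result does not alias the identity matrix
    return [list(identity[k - 1]) for k in labels]


rows = one_hot_rows([1, 3])
print(rows[0])  # the 1 is in the first position
print(rows[1])  # the 1 is in the third position
```

The identity matrix is built once and only indexed afterwards, which is the same saving the question was after with theEye.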
Another approach: we start with the data:
>> len = 10; % max number
>> vec = randi(len, [1 7]) % vector of numbers
vec =
1 10 9 5 7 3 6
Now we build the indicator matrix:
>> I = full(sparse(1:numel(vec), vec, 1, numel(vec), len))
I =
1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
Please assume A is a matrix of 4 x 4 which has:
A = 1 0 1 0
1 0 1 0
1 1 1 0
1 1 0 0
And B is a reference matrix (4 x 4) which is:
B = 1 0 1 0
1 0 1 0
1 0 1 0
1 1 1 0
Now, if A is compared to the reference matrix B, almost all elements are equal; the only differences are at A(4,3) and A(3,2). However, since B is the reference, only the differences at positions where B is 1 matter. In this particular example, only A(4,3) matters, not A(3,2). That is:
>> C = B ~= A
C =
0 0 0 0
0 0 0 0
0 1 0 0
0 0 1 0
A(4,3) ~= B(4,3)
Finally, we are looking for a piece of code that shows what percentage of the ones in A match the corresponding elements of B. In this case the result is:
(8 / 9) * 100 = 88.89 % are matched.
Please bear in mind that speed is also important here, so quicker solutions are more appreciated. Thanks.
To get only the differing entries where there is a 1 in B, just add an & B to the comparison. To get the percentage, take the sum of the positions where both A and B are 1, then divide it by the number of ones in B (or by the number of ones in A; see the note below).
A = [1 0 1 0;
1 0 1 0;
1 1 1 0;
1 1 0 0];
B = [1 0 1 0;
1 0 1 0;
1 0 1 0;
1 1 1 0];
C = (B ~= A) & B
p = sum(B(:) & A(:)) / sum(B(:)) * 100
This is the result:
C =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
p =
88.8889
Edit / Note: In the OP's question it's not 100% clear if he wants the percentage in relation to the sum of ones in A or B. I assumed that it is a percentage of the reference-matrix, which is B. Therefore I divide by sum(B(:)). In case you need it in reference to the ones in A, just change the last line to:
p = sum(B(:) & A(:)) / sum(A(:)) * 100
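To make the arithmetic concrete, here is a minimal plain-Python sketch of the same calculation. The function name match_percentage and its relative_to parameter are hypothetical; the matrices are assumed to be equally sized lists of 0/1 rows.

```python
def match_percentage(a, b, relative_to="b"):
    """Percentage of ones shared by a and b, relative to the ones in b (default) or a."""
    both = sum(1 for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b) if x and y)
    reference = b if relative_to == "b" else a
    ones = sum(1 for row in reference for x in row if x)
    return 100.0 * both / ones


A = [[1, 0, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0]]
B = [[1, 0, 1, 0],
     [1, 0, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 1, 0]]
print(round(match_percentage(A, B), 4))  # 8 shared ones out of 9 ones in B
```

In this example both matrices happen to contain nine ones, so the two choices of denominator give the same percentage; in general they differ.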
If I got it right, what you want to know is where B == 1 and A == 0.
Try this:
>> C = B & ~A
C =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
To get the percentage, you could try this:
>> 100 * sum(A(:) & B(:)) / sum(A(:))
ans =
88.8889
You can use matrix multiplication, which should be pretty efficient, as shown next.
To get the percentage value with respect to A -
percentage_wrtA = A(:).'*B(:)/sum(A(:)) * 100;
To get the percentage value with respect to B -
percentage_wrtB = A(:).'*B(:)/sum(B(:)) * 100;
Runtime tests
Here are some quick runtime tests comparing matrix multiplication against summing the ANDed elements with (:) -
>> M = 6000; %// Datasize
>> A = randi([0,1],M,M);
>> B = randi([0,1],M,M);
>> tic,sum(B(:) & A(:));toc
Elapsed time is 0.500149 seconds.
>> tic,A(:).'*B(:);toc
Elapsed time is 0.126881 seconds.
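The dot product gives the same count as the AND-and-sum version because, for vectors that contain only 0 and 1, each elementwise product is 1 exactly where both entries are 1. A minimal Python sketch of that equivalence (the function names are hypothetical); note that it only holds when the entries are strictly 0 or 1:

```python
def count_by_and(a, b):
    """Number of positions where both entries are nonzero."""
    return sum(1 for x, y in zip(a, b) if x and y)


def count_by_dot(a, b):
    """Dot product; equals count_by_and only for strictly 0/1 data."""
    return sum(x * y for x, y in zip(a, b))


a = [1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0]
print(count_by_and(a, b), count_by_dot(a, b))  # prints "2 2"
```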
Try:
sum(sum(A & B))./sum(sum(A))
Output:
ans =
0.8889
Can someone explain what's going on here?
octave:1> t = eye(3)
t =
Diagonal Matrix
1 0 0
0 1 0
0 0 1
octave:2> diag(t(3,:))
ans =
Diagonal Matrix
0 0 0
0 0 0
0 0 1
octave:3> diag(t(2,:))
ans =
Diagonal Matrix
0 0 0
0 1 0
0 0 0
octave:4> diag(t(1,:))
ans = 1
Why do the first two give back 3x3 matrices but the last one is just a number?
The problem arises because of the way t(1,:) was created, from eye(3).
If you output the rows of t individually the results are:
octave.28> t(1,:)
ans =
Diagonal Matrix
1 0 0
octave.29> t(2,:)
ans =
0 1 0
octave.30> t(3,:)
ans =
0 0 1
For some reason that I can't explain, t(1,:) is still recognized as a diagonal matrix, while t(2,:) and t(3,:) are ordinary vectors. When you call diag(t(1,:)) it is not receiving a vector argument, but rather a matrix. If you convert t(1,:) to a vector before the call, you get the expected result.
octave.31> diag(vec(t(1,:)))
ans =
Diagonal Matrix
1 0 0
0 0 0
0 0 0
The expected result looks like this:
0 0 0 1 0 * * * 0 0 0 0
0 0 0 0 1 * 1 * * 0 0 0
0 0 1 1 1 * 1 0 * 0 0 0
0 * * 0 1 * 1 0 * 0 0 0
0 * 1 1 1 * 1 0 * * 0 0
0 * 0 1 1 * 1 0 0 * 0 0
0 * 0 0 1 * 1 0 0 * 0 0
0 * * * * * 1 0 0 * * 0
0 0 0 0 0 0 1 0 0 0 * 0
0 0 0 0 0 0 1 0 0 0 * 0
0 0 0 0 0 0 1 0 0 0 * 0
0 0 0 0 0 0 1 0 0 0 * *
You can only walk in 4 directions (no 45-degree moves). I'm using A*, and I changed part of the original algorithm to better suit my case.
Here's my Python code:
I ran it 1000 times.
The cost is 1.4s~1.5s.
import time

def astar(m,startp,endp):
    w,h = 12,12
    sx,sy = startp
    ex,ey = endp
    #[parent node, x, y,g,f]
    node = [None,sx,sy,0,abs(ex-sx)+abs(ey-sy)]
    closeList = [node]
    createdList = {}
    createdList[sy*w+sx] = node
    k=0
    while(closeList):
        node = closeList.pop(0)
        x = node[1]
        y = node[2]
        l = node[3]+1
        k+=1
        #find neighbours
        #make the path not too strange
        if k&1:
            neighbours = ((x,y+1),(x,y-1),(x+1,y),(x-1,y))
        else:
            neighbours = ((x+1,y),(x-1,y),(x,y+1),(x,y-1))
        for nx,ny in neighbours:
            if nx==ex and ny==ey:
                path = [(ex,ey)]
                while node:
                    path.append((node[1],node[2]))
                    node = node[0]
                return list(reversed(path))
            if 0<=nx<w and 0<=ny<h and m[ny][nx]==0:
                if ny*w+nx not in createdList:
                    nn = (node,nx,ny,l,l+abs(nx-ex)+abs(ny-ey))
                    createdList[ny*w+nx] = nn
                    #adding to closelist ,using binary heap
                    nni = len(closeList)
                    closeList.append(nn)
                    while nni:
                        i = (nni-1)>>1
                        if closeList[i][4]>nn[4]:
                            closeList[i],closeList[nni] = nn,closeList[i]
                            nni = i
                        else:
                            break
    return 'not found'
m = ((0,0,0,1,0,0,0,0,0,0,0,0),
(0,0,0,0,1,0,1,0,0,0,0,0),
(0,0,1,1,1,0,1,0,0,0,0,0),
(0,0,0,0,1,0,1,0,0,0,0,0),
(0,0,1,1,1,0,1,0,0,0,0,0),
(0,0,0,1,1,0,1,0,0,0,0,0),
(0,0,0,0,1,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0)
)
t1 = time.time()
for i in range(1000):
    result = astar(m,(2,3),(11,11))
print(time.time()-t1)
cm = [list(x[:]) for x in m]
if isinstance(result, list):
    for y in range(len(m)):
        my = m[y]
        for x in range(len(my)):
            for px,py in result:
                if px==x and py ==y:
                    cm[y][x] = '*'
for my in cm:
    print(' '.join([str(x) for x in my]))
exit(0)
Please tell me if you know a faster way, or the fastest one known so far.
The A* algorithm is pretty fast for a known graph (one where all edges are known and you can estimate the distance to the target using an admissible heuristic).
There are some variations of A* that make it faster at the cost of a less optimal result. The most common is A*-Epsilon (AKA bounded A*). The idea is to allow the algorithm to develop nodes whose value is at most (1+epsilon)*MIN (where regular A* develops only MIN). The result (depending on the epsilon value, of course) is usually a faster solution, but the path found may cost up to (1+epsilon) * OPTIMAL.
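A minimal Python sketch of that idea on a 4-directional grid with unit step costs follows. It is an illustration under assumptions, not a tuned implementation: the name astar_epsilon and the eps parameter are hypothetical, the tie-break inside the (1+eps)*MIN set (pick the node closest to the goal) is one possible choice, and the open set is scanned linearly to keep the code short, so this version shows the selection rule rather than the speedup.

```python
def astar_epsilon(grid, start, goal, eps=0.5):
    """Bounded-suboptimal search; grid[y][x] == 0 is walkable, points are (x, y).

    Returns a list of points from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan distance, admissible for 4-directional unit moves
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    g = {start: 0}
    parent = {start: None}
    open_set = {start}
    while open_set:
        f_min = min(g[p] + h(p) for p in open_set)
        bound = (1 + eps) * f_min
        # among open nodes with f <= (1+eps)*MIN, expand the one nearest the goal
        node = min((p for p in open_set if g[p] + h(p) <= bound), key=h)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        open_set.remove(node)
        x, y = node
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < cols and 0 <= ny < rows) or grid[ny][nx] != 0:
                continue
            nxt = (nx, ny)
            new_g = g[node] + 1
            if nxt not in g or new_g < g[nxt]:
                # new or improved node: (re)open it with the better cost
                g[nxt] = new_g
                parent[nxt] = node
                open_set.add(nxt)
    return None
```

With eps=0 the selection rule reduces to ordinary A* with a tie-break, so the path is optimal; larger values trade path quality for a greedier search. A practical version would keep the open nodes in heaps instead of scanning them.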
Another possible optimization is running A* from one end while simultaneously running a BFS from the other end (the "exit"). This technique is called bi-directional search, and it is usually a great way to improve performance in unweighted graphs when the problem has a single final state. I tried to explain the principles of bi-directional search once in this thread
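As a rough illustration of the bi-directional principle, here is a minimal Python sketch for a 4-directional grid. It is a simplification of what is described above: both ends run plain BFS (rather than A* on one side), expanding one full layer at a time and stopping as soon as the two searches touch. The name bidirectional_bfs is hypothetical, and the goal cell is assumed to be walkable.

```python
from collections import deque


def bidirectional_bfs(grid, start, goal):
    """Shortest 4-directional path on a grid (grid[y][x] == 0 is walkable).

    Points are (x, y). Returns a list of points, or None if unreachable.
    """
    if start == goal:
        return [start]
    rows, cols = len(grid), len(grid[0])
    parents = ({start: None}, {goal: None})  # one parent map per direction
    frontiers = (deque([start]), deque([goal]))

    def build_path(meet):
        path = []
        node = meet
        while node is not None:  # walk back from the meeting point to start
            path.append(node)
            node = parents[0][node]
        path.reverse()
        node = parents[1][meet]
        while node is not None:  # walk on from the meeting point to goal
            path.append(node)
            node = parents[1][node]
        return path

    side = 0
    while frontiers[0] and frontiers[1]:
        # expand one complete layer of the current side, then switch sides
        for _ in range(len(frontiers[side])):
            x, y = frontiers[side].popleft()
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if not (0 <= nx < cols and 0 <= ny < rows) or grid[ny][nx] != 0:
                    continue
                nxt = (nx, ny)
                if nxt in parents[side]:
                    continue
                parents[side][nxt] = (x, y)
                if nxt in parents[1 - side]:  # the two searches have met
                    return build_path(nxt)
                frontiers[side].append(nxt)
        side = 1 - side
    return None
```

Each side only has to explore to roughly half the path length, which is where the saving comes from when the searched area grows quickly with depth.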