ismember fails to find a number generated by bwlabel - image

Following this post by steve: http://blogs.mathworks.com/steve/2009/02/27/using-ismember-with-the-output-of-regionprops/
I wanted to apply it on a very simple case. Here is the logical image that I have, it clearly has three objects:
This is the code I used :
[L_t,n_t] = bwlabel(logical_image);
iii = find(L_t == 2);
bbb = ismember(L_t,iii);
imshow(bbb);
But all I am getting in bbb is an empty matrix. i.e. a logical image the same size of the original but consisting entirely of 0.
n_t shows 3 objects found. the max value of L_t is 3. Then how come ismember fails to find 2?

It doesn't work because iii is a list of indices (positions in L_t where L_t == 2), and L_t is a number from 1 to 3. This is not the same as what they are doing in the original example:
idx = find((100 <= area_values) & (area_values <= 1000))
Here, area_values is a list taken from regionprops of the area of the different regions in your labelled image. It has the same length, n, as the number of regions (different values) in L. e.g. if there are 10 areas in the image and areas 1, 3, and 7 have areas in the specified range the output of idx is [1 3 7].
This then selects the parts of L where L is 1, 3, or 7:
bw2 = ismember(L, idx);
In your case, iii is a list of pixel indices, not their values. So none of those values are 1, 2 or 3 (even where you looked for the ones that were equal to 2), as those are the indices of the first three pixels of the image. Therefore none of the values in L_t match any of the values in iii.
If all you wanted was an image showing only the second object, then this is enough:
bbb = L_t==2;
imshow(bbb)

Related

Faster way to find the size of the intersection of any two corresponding multisets from two 3D arrays of multisets

I have two uint16 3D (GPU) arrays A and B in MATLAB, which have the same 2nd and 3rd dimension. For instance, size(A,1) = 300 000, size(B,1) = 2000, size(A,2) = size(B,2) = 20, and size(A,3) = size(B,3) = 100, to give an idea about the orders of magnitude. Actually, size(A,3) = size(B,3) is very big, say ~ 1 000 000, but the arrays are stored externally in small pieces cut along the 3rd dimension. The point is that there is a very long loop along the 3rd dimension (cfg. MWE below), so the code inside of it needs to be optimized further (if possible). Furthermore, the values of A and B can be assumed to be bounded way below 65535, but there are still hundreds of different values.
For each i,j, and d, the rows A(i,:,d) and B(j,:,d) represent multisets of the same size, and I need to find the size of the largest common submultiset (multisubset?) of the two, i.e. the size of their intersection as multisets. Moreover, the rows of B can be assumed sorted.
For example, if [2 3 2 1 4 5 5 5 6 7] and [1 2 2 3 5 5 7 8 9 11] are two such multisets, respectively, then their multiset intersection is [1 2 2 3 5 5 7], which has the size 7 (7 elements as a multiset).
I am currently using the following routine to do this:
s = 300000; % 1st dim. of A
n = 2000; % 1st dim. of B
c = 10; % 2nd dim. of A and B
depth = 10; % 3rd dim. of A and B (corresponds to a batch of size 10 of A and B along the 3rd dim.)
N = 100; % upper bound on the possible values of A and B
A = randi(N,s,c,depth,'uint16','gpuArray');
B = randi(N,n,c,depth,'uint16','gpuArray');
Sizes_of_multiset_intersections = zeros(s,n,depth,'uint8'); % too big to fit in GPU memory together with A and B
for d=1:depth
A_slice = A(:,:,d);
B_slice = B(:,:,d);
unique_B_values = permute(unique(B_slice),[3 2 1]); % B is smaller than A
% compute counts of the unique B-values for each multiset:
A_values_counts = permute(sum(uint8(A_slice==unique_B_values),2,'native'),[1 3 2]);
B_values_counts = permute(sum(uint8(B_slice==unique_B_values),2,'native'),[1 3 2]);
% compute the count of each unique B-value in the intersection:
Sizes_of_multiset_intersections_tmp = gpuArray.zeros(s,n,'uint8');
for i=1:n
Sizes_of_multiset_intersections_tmp(:,i) = sum(min(A_values_counts,B_values_counts(i,:)),2,'native');
end
Sizes_of_multiset_intersections(:,:,d) = gather(Sizes_of_multiset_intersections_tmp);
end
One can also easily adapt above code to compute the result in batches along dimension 3 rather than d=1:depth (=batch of size 1), though at the expense of even bigger unique_B_values vector.
Since the depth dimension is large (even when working in batches along it), I am interested in faster alternatives to the code inside the outer loop. So my question is this: is there a faster (e.g. better vectorized) way to compute sizes of intersections of multisets of equal size?
Disclaimer : This is not a GPU based solution (Don't have a good GPU). I find the results interesting and want to share, but I can delete this answer if you think it should be.
Below is a vectorized version of your code, that makes it possible to get rid of the inner loop, at the cost of having to deal with a bigger array, that might be too big to fit in the memory.
The idea is to have the matrices A_values_counts and B_values_counts be 3D matrices shaped in such a way that calling min(A_values_counts,B_values_counts) will calculate everything in one go due to implicit expansion. In the background it will create a big array of size s x n x length(unique_B_values) (Probably most of the time too big)
In order to go around the constraint on the size, the results are calculated in batches along the n dimension, i.e. the first dimension of B:
tic
nBatches_B = 2000;
sBatches_B = n/nBatches_B;
Sizes_of_multiset_intersections_new = zeros(s,n,depth,'uint8');
for d=1:depth
A_slice = A(:,:,d);
B_slice = B(:,:,d);
% compute counts of the unique B-values for each multiset:
unique_B_values = reshape(unique(B_slice),1,1,[]);
A_values_counts = sum(uint8(A_slice==unique_B_values),2,'native'); % s x 1 x length(uniqueB) array
B_values_counts = reshape(sum(uint8(B_slice==unique_B_values),2,'native'),1,n,[]); % 1 x n x length(uniqueB) array
% Not possible to do it all in one go, must split in batches along B
for ii = 1:nBatches_B
Sizes_of_multiset_intersections_new(:,((ii-1)*sBatches_B+1):ii*sBatches_B,d) = sum(min(A_values_counts,B_values_counts(:,((ii-1)*sBatches_B+1):ii*sBatches_B,:)),3,'native'); % Vectorized
end
end
toc
Here is a little benchmark with different values of the number of batches. You can see that a minimum is found around a number of 400 (batch size 50), with a decrease of around 10% in processing time (each point is an average over 3 runs). (EDIT : x axis is amount of batches, not batches size)
I'd be interested in knowing how it behaves for GPU arrays as well!

MATLAB - Permutations of random indices in specific areas of a grid

I have a problem in which I have 4 objects (1s) on a 100x100 grid of zeros that is split up into 16 even squares of 25x25.
I need to create a (16^4 * 4) table where entries listing all the possible positions of each of these 4 objects across the 16 submatrices. The objects can be anywhere within the submatrices so long as they aren't overlapping one another. This is clearly a permutation problem, but there is added complexity because of the indexing and the fact that the positions ned to be random but not overlapping within a 16th square. Would love any pointers!
What I tried to do was create a function called "top_left_corner(position)" that returns the subscript of the top left corner of the sub-matrix you are in. E.g. top_left_corner(1) = (1,1), top_left_corner(2) = (26,1), etc. Then I have:
pos = randsample(24,2);
I = pos(1)+top_left_corner(position,1);
J = pos(2)+top_left_corner(position,2);
The problem is how to generate and store permutations of this in a table as linear indices.
First using ndgrid cartesian product generated in the form of a [4 , 16^4] matrix perm. Then in the while loop random numbers generated and added to perm. If any column of perm contains duplicated random numbers ,random number generation repeated for those columns until no column has duplicated elements.Normally no more than 2-3 iterations needed. Since the [100 ,100] array divided into 16 blocks, using kron an index pattern like the 16 blocks generated and with the sort function indexes of sorted elements extracted. Then generated random numbers form indexes of the pattern( 16 blocks).
C = cell(1,4);
[C{:}]=ndgrid(0:15,0:15,0:15,0:15);
perm = reshape([C{:}],16^4,4).';
perm_rnd = zeros(size(perm));
c = 1:size(perm,2);
while true
perm_rnd(:,c) = perm(:,c) * 625 +randi(625,4,numel(c));
[~ ,c0] = find(diff(sort(perm_rnd(:,c),1),1,1)==0);
if isempty(c0)
break;
end
%c = c(unique(c0));
c = c([true ; diff(c0)~=0]);
end
pattern = kron(reshape(1:16,4,4),ones(25));
[~,idx] = sort(pattern(:));
result = idx(perm_rnd).';

How to traverse an image across the blocks randomly?

I have divided a 512X512 image into 2X2 pixel blocks. Thus I have 65536 blocks in total. Each block has four pixels.
Now I want to traverse the image in random order. As for example: starting from 6th block, then to 3rd block, then to 8th, then to 1st block...... like this until the whole image is traversed.
Important: I need to store the traversing order for later use.
Please help me writing a MATLAB code for this. Many many many thanks in advance.
Easy, let's make an example with small matrix (6x6)
Im = rand(6,6);
nblocks = 9;
blocksize = 2;
You will have blocks of size 2x2 (in total 3x3=9 blocks).
Reshape the matrix into a 2 x 18 matrix.
Im = reshape(Im, numel(Im)/blocksize, blocksize);
Now generate a random permutation of indexes separated by the size of the block:
idx = randperm(nblocks) * blocksize;
Et voilà. Now you can access the 5th block just doing:
currentblock = Im(idx(5):idx(5)+blocksize, :);
Use a loop to transverse each block.
You can divide the image into blocks and tile them along a third dimension using this great answer. You then loop over a random permutation of the third dimension indices:
A = randn(12,12);
m = 3;
n = 6;
T = permute(reshape(permute(reshape(A, size(A, 1), n, []), [2 1 3]), n, m, []), [2 1 3]);
% each third-dim slice is an mxn block
scan_order = randperm(size(T,3)); % random permutation of block indices
for b = scan_order
block = T(:,:,b);
% Do stuff with current block
end

segment overlapping regions into disjoint regions

Given a set of closed regions [a,b] where a and b are integers I need to find another set of regions that cover the same numbers but are disjoint.
I suppose it is possible to do naively by iterating through the set several times, but I am looking for a recommendation of a good algorithm for this. Please help.
EDIT:
to clarify, the resulting regions cannot be larger than the original ones, I have to come up with disjoint regions that are contained by the original ones. In other words, I need to split the original regions on the boundaries where they overlap.
example:
3,8
1,4
7,9
11,14
result:
1,3
3,4
4,7
7,8
8,9
11,14
Just sort all endpoints left to right (remember their type: start or end). Swype left to right. Keep a counter starting at 0. Whenever you come across a start: increment the counter. When you come across an end: decrement (note that the counter is always at least 0).
Keep track of the last two points. If the counter is greater than zero - and the last two points are different (prevent empty ranges) - add the interval between the last two points.
Pseudocode:
points = all interval endpoints
sort(points)
previous = points[0]
counter = 1
for(int i = 1; i < #points; i++) {
current = points[i]
if (current was start point)
counter++
else
counter--
if (counter > 0 and previous != current)
add (previous, current) to output
previous = current
}
(This is a modification of an answer that I posted earlier today which I deleted after I discovered it had a logic error. I later realized that I could modify Vincent van der Weele's elegant idea of using parenthesis depth to fix the bug)
On Edit: Modified to be able to accept intervals of length 0
Call an interval [a,a] of length 0 essential if a doesn't also appear as an endpoint of any interval of length > 0. For example, in [1,3], [2,2], [3,3], [4,4] the 0-length intervals [2,2] and [4,4] are essential but [3,3] isn't.
Inessential 0-length intervals are redundant thus need not appear in the final output. When the list of intervals is initially scanned (loading the basic data structures) points corresponding to 0-length intervals are recorded, as are endpoint of intervals of length > 0. When the scan is completed, two instances of each point corresponding to essential 0-length intervals are added into the list of endpoints, which is then sorted. The resulting data structure is a multiset where the only repetitions correspond to essential 0-length intervals.
For every endpoint in the interval define the pdelta (parentheses delta) of the endpoint as the number of times that point appears as a left endpoint minus the number of times it appears as a right endpoint. Store these in a dictionary keyed by the endpoints.
[a,b] where a,b are the first two elements of the list of endpoints is the first interval in the list of disjoint intervals. Define the parentheses depth of b to be the sum of pdelta[a] and pdelta[b]. We loop through the rest of the endpoints as follows:
In each pass through the loop look at the parenthesis depth of b. If it is not 0 than b is still needed for one more interval. Let a = b and let the new p be the next value in the list. Adjust the parentheses depth be the pdelta of the new b and add [a,b] to the list of disjoint intervals. Otherwise (if the parenthesis depth of b is 0) let the next [a,b] be the next two points in the list and adjust the parenthesis depth accordingly.
Here is a Python implementation:
def disjointify(intervals):
if len(intervals) == 0: return []
pdelta = {}
ends = set()
disjoints = []
onePoints = set() #onePoint intervals
for (a,b) in intervals:
if a == b:
onePoints.add(a)
if not a in pdelta: pdelta[a] = 0
else:
ends.add(a)
ends.add(b)
pdelta[a] = pdelta.setdefault(a,0) + 1
pdelta[b] = pdelta.setdefault(b,0) - 1
onePoints.difference_update(ends)
ends = list(ends)
for a in onePoints:
ends.extend([a,a])
ends.sort()
a = ends[0]
b = ends[1]
pdepth = pdelta[a] + pdelta[b]
i = 1
disjoints.append((a,b))
while i < len(ends) - 1:
if pdepth != 0:
a = b
b = ends[i+1]
pdepth += pdelta[b]
i += 1
else:
a = ends[i+1]
b = ends[i+2]
pdepth += (pdelta[a] + pdelta[b])
i += 2
disjoints.append((a,b))
return disjoints
Sample output which illustrates various edge cases:
>>> example = [(1,1), (1,4), (2,2), (4,4),(5,5), (6,8), (7,9), (10,10)]
>>> disjointify(example)
[(1, 2), (2, 2), (2, 4), (5, 5), (6, 7), (7, 8), (8, 9), (10, 10)]
>>> disjointify([(1,1), (2,2)])
[(1, 1), (2, 2)]
(I am using Python tuples to represent the closed intervals even though it has the minor drawback of looking like the standard mathematical notation for open intervals).
A final remark: referring to the result as a collection of disjoint interval might not be accurate since some of these intervals have nonempty albeit 1-point intersections

Determine whether a symbol is part of the ith combination nCr

UPDATE:
Combinatorics and unranking was eventually what I needed.
The links below helped alot:
http://msdn.microsoft.com/en-us/library/aa289166(v=vs.71).aspx
http://www.codeproject.com/Articles/21335/Combinations-in-C-Part-2
The Problem
Given a list of N symbols say {0,1,2,3,4...}
And NCr combinations of these
eg. NC3 will generate:
0 1 2
0 1 3
0 1 4
...
...
1 2 3
1 2 4
etc...
For the ith combination (i = [1 .. NCr]) I want to determine Whether a symbol (s) is part of it.
Func(N, r, i, s) = True/False or 0/1
eg. Continuing from above
The 1st combination contains 0 1 2 but not 3
F(N,3,1,"0") = TRUE
F(N,3,1,"1") = TRUE
F(N,3,1,"2") = TRUE
F(N,3,1,"3") = FALSE
Current approaches and tibits that might help or be related.
Relation to matrices
For r = 2 eg. 4C2 the combinations are the upper (or lower) half of a 2D matrix
1,2 1,3 1,4
----2,3 2,4
--------3,4
For r = 3 its the corner of a 3D matrix or cube
for r = 4 Its the "corner" of a 4D matrix and so on.
Another relation
Ideally the solution would be of a form something like the answer to this:
Calculate Combination based on position
The nth combination in the list of combinations of length r (with repitition allowed), the ith symbol can be calculated
Using integer division and remainder:
n/r^i % r = (0 for 0th symbol, 1 for 1st symbol....etc)
eg for the 6th comb of 3 symbols the 0th 1st and 2nd symbols are:
i = 0 => 6 / 3^0 % 3 = 0
i = 1 => 6 / 3^1 % 3 = 2
i = 2 => 6 / 3^2 % 3 = 0
The 6th comb would then be 0 2 0
I need something similar but with repition not allowed.
Thank you for following this question this far :]
Kevin.
I believe your problem is that of unranking combinations or subsets.
I will give you an implementation in Mathematica, from the package Combinatorica, but the Google link above is probably a better place to start, unless you are familiar with the semantics.
UnrankKSubset::usage = "UnrankKSubset[m, k, l] gives the mth k-subset of set l, listed in lexicographic order."
UnrankKSubset[m_Integer, 1, s_List] := {s[[m + 1]]}
UnrankKSubset[0, k_Integer, s_List] := Take[s, k]
UnrankKSubset[m_Integer, k_Integer, s_List] :=
Block[{i = 1, n = Length[s], x1, u, $RecursionLimit = Infinity},
u = Binomial[n, k];
While[Binomial[i, k] < u - m, i++];
x1 = n - (i - 1);
Prepend[UnrankKSubset[m - u + Binomial[i, k], k-1, Drop[s, x1]], s[[x1]]]
]
Usage is like:
UnrankKSubset[5, 3, {0, 1, 2, 3, 4}]
{0, 3, 4}
Yielding the 6th (indexing from 0) length-3 combination of set {0, 1, 2, 3, 4}.
There's a very efficient algorithm for this problem, which is also contained in the recently published:Knuth, The Art of Computer Programming, Volume 4A (section 7.2.1.3).
Since you don't care about the order in which the combinations are generated, let's use the lexicographic order of the combinations where each combination is listed in descending order. Thus for r=3, the first 11 combinations of 3 symbols would be: 210, 310, 320, 321, 410, 420, 421, 430, 431, 432, 510. The advantage of this ordering is that the enumeration is independent of n; indeed it is an enumeration over all combinations of 3 symbols from {0, 1, 2, …}.
There is a standard method to directly generate the ith combination given i, so to test whether a symbol s is part of the ith combination, you can simply generate it and check.
Method
How many combinations of r symbols start with a particular symbol s? Well, the remaining r-1 positions must come from the s symbols 0, 1, 2, …, s-1, so it's (s choose r-1), where (s choose r-1) or C(s,r-1) is the binomial coefficient denoting the number of ways of choosing r-1 objects from s objects. As this is true for all s, the first symbol of the ith combination is the smallest s such that
&Sum;k=0s(k choose r-1) ≥ i.
Once you know the first symbol, the problem reduces to finding the (i - &Sum;k=0s-1(k choose r-1))-th combination of r-1 symbols, where we've subtracted those combinations that start with a symbol less than s.
Code
Python code (you can write C(n,r) more efficiently, but this is fast enough for us):
#!/usr/bin/env python
tC = {}
def C(n,r):
if tC.has_key((n,r)): return tC[(n,r)]
if r>n-r: r=n-r
if r<0: return 0
if r==0: return 1
tC[(n,r)] = C(n-1,r) + C(n-1,r-1)
return tC[(n,r)]
def combination(r, k):
'''Finds the kth combination of r letters.'''
if r==0: return []
sum = 0
s = 0
while True:
if sum + C(s,r-1) < k:
sum += C(s,r-1)
s += 1
else:
return [s] + combination(r-1, k-sum)
def Func(N, r, i, s): return s in combination(r, i)
for i in range(1, 20): print combination(3, i)
print combination(500, 10000000000000000000000000000000000000000000000000000000000000000)
Note how fast this is: it finds the 10000000000000000000000000000000000000000000000000000000000000000th combination of 500 letters (it starts with 542) in less than 0.5 seconds.
I have written a class to handle common functions for working with the binomial coefficient, which is the type of problem that your problem falls under. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
This class can easily be applied to your problem. If you have the rank (or index) to the binomial coefficient table, then simply call the class method that returns the K-indexes in an array. Then, loop through that returned array to see if any of the K-index values match the value you have. Pretty straight forward...

Resources