I have 40 classes, each with 10 replications, and I need to generate a one-hot encoding like this:
class 1, replication 1: [ 1 0 0 0 ... 0 (40 class entries) | 1 0 0 0 0 ... 0 (10 replication entries) ]
class 1, replication 2: [ 1 0 0 0 ... 0 (40 class entries) | 0 1 0 0 0 ... 0 (10 replication entries) ]
I am not sure how to code the input array. For example, in the code below all the classes are in one array:
X = [2,1,2,3,3]'
LinearIndices = sub2ind([length(X),3], [1:length(X)]', X);
tmp = zeros(length(X), 3);
tmp(LinearIndices) = 1
The above code is not working; it generates something like:
[ 1 0 0 0 0 0 ... 0 (400 entries) ]
Here, let me try to answer your question as I see it asked, or at least point you in a direction. As I see it, you are trying to do two separate one-hot encodings and then concatenate them, so let's do it like that. There is a MATLAB function called ind2vec that will do one-hot encoding.
X = zeros(400, 2);
X(:,1) = repelem(1:40, 10);     % class labels: 1..40, each repeated 10 times
X(:,2) = repmat(1:10, 1, 40);   % replication labels: the sequence 1..10, repeated for every class
encoding = [ind2vec(X(:,1)', 40)', ind2vec(X(:,2)', 10)'];   % 400 x 50 sparse result
full(encoding)
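If ind2vec is not available (it ships with the Neural Network / Deep Learning Toolbox), here is a sketch of the same 400 x 50 encoding built with plain sub2ind, in the spirit of the attempt in the question (the names classPart, repPart and encoding2 are just illustrative):
n = size(X, 1);                                     % 400 rows, one per class/replication pair
classPart = zeros(n, 40);
classPart(sub2ind([n, 40], (1:n)', X(:,1))) = 1;    % one-hot of the class label
repPart = zeros(n, 10);
repPart(sub2ind([n, 10], (1:n)', X(:,2))) = 1;      % one-hot of the replication label
encoding2 = [classPart, repPart];                   % 400 x 50, same values as full(encoding)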
This is a function from my MATLAB script.
function [ Im ] = findBorders( I )
    Im = false(size(I));
    I = padarray(I, [1, 1], 1);          % pad with 1s so the image border counts as foreground
    [h, w] = size(Im);
    bkgFound = false;
    for row = 1 : h
        for col = 1 : w
            if I(row + 1, col + 1)       % (row+1, col+1) is the current pixel in the padded image
                bkgFound = false;
                for i = 0:2
                    for j = 0:2
                        if ~I(row + i, col + j)      % any background pixel in the 3x3 neighbourhood?
                            Im(row, col) = 1;
                            bkgFound = true;
                            break;
                        end
                    end
                    if bkgFound
                        break;
                    end
                end
            end
        end
    end
end
So I need to convert it to a parfor loop to run it on the GPU.
I need help. I have read some articles, but I have no idea how to convert this.
In MATLAB, parfor does not run code on the GPU. The best way to interface with the GPU through MATLAB is to convert your data to a gpuArray; all operations performed on that data which have GPU-optimized implementations will then run on the GPU.
As @Daniel stated, the code that you have posted 1) is not ideal for any sort of parallel processing and 2) could likely be sped up through vectorization alone.
I'm not entirely sure what you're trying to do, but it seems like you're trying to find pixels within an image that are surrounded by "not-background". For this I would usually use 2D convolution with a neighborhood kernel to figure out how many neighbors of a given value a pixel has.
For example, the following code locates any pixel which is itself false and completely surrounded by false values (assuming your input image is a logical)
I = [...
1 1 1 1 0;
1 0 0 0 0;
0 0 0 0 0;
0 0 0 0 0;
0 0 0 1 1;
0 0 0 1 0;
];
surrounded_by_zeros = conv2(double(I), ones(3), 'same') == 0
surrounded_by_zeros =
0 0 0 0 0
0 0 0 0 0
0 0 1 1 1
1 1 0 0 0
1 1 0 0 0
1 1 0 0 0
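The same convolution trick also gives the neighbour counts mentioned above: a kernel with a zero centre counts only the 8 surrounding pixels (a sketch using the I defined above):
neighbour_count = conv2(double(I), [1 1 1; 1 0 1; 1 1 1], 'same');   % number of true 8-neighbours per pixel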
I personally like this solution, but if you have the Image Processing Toolbox, you can also use imerode or imdilate to basically do the same thing.
surrounded_by_zeros = ~imdilate(I, ones(3));
surrounded_by_zeros = imerode(~I, ones(3));
If for some reason you really needed to move this calculation to the GPU (you don't), you could cast this as a gpuArray, perform the same operation, and it would use the GPU behind the scenes:
I = gpuArray(I);
surrounded_by_zeros_on_gpu = conv2(double(I), ones(3), 'same') == 0;
Keep in mind that this has the overhead of copying I over to the GPU, which for large enough images can be a significant performance hit.
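If you want to quantify that trade-off, one rough way is to time both versions with timeit/gputimeit (a sketch that assumes the Parallel Computing Toolbox; the image size and the numbers are illustrative only and depend on your hardware):
I = rand(4000) > 0.5;                                     % large logical test image (illustrative size)
cpuTime = timeit(@() conv2(double(I), ones(3), 'same') == 0);
Igpu = gpuArray(I);                                       % one-off transfer to the GPU
gpuTime = gputimeit(@() conv2(double(Igpu), ones(3), 'same') == 0);
fprintf('CPU: %.4f s, GPU (transfer excluded): %.4f s\n', cpuTime, gpuTime);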
Please assume A is a 4 x 4 matrix:
A = 1 0 1 0
1 0 1 0
1 1 1 0
1 1 0 0
And B is a reference matrix (4 x 4) which is:
B = 1 0 1 0
1 0 1 0
1 0 1 0
1 1 1 0
Now, if A is compared against the reference matrix B, almost all of the elements are equal, except A(4,3) and A(3,2). However, since B is the reference and A is compared to it, only differences at positions where B is 1 matter. In this particular example only A(4,3) matters, not A(3,2), which means:
>> C = B ~= A
C =
0 0 0 0
0 0 0 0
0 1 0 0
0 0 1 0
Of these, only A(4,3) ~= B(4,3) matters, because B(4,3) is 1.
Finally, we are looking for a piece of code that shows what percentage of the ones in A are equal to their corresponding elements in B. In this case:
(8 / 9) * 100 = 88.89 % are matched.
Please bear in mind that speed is also important here, so quicker solutions are more appreciated. Thanks.
To get only the differing entries where B is 1, just AND the comparison with B, so you'll get only those entries. To get the percentage, take the count of positions where both A and B are 1, then divide it by the number of ones in B (or by the number of ones in A; see the note below).
A = [1 0 1 0;
1 0 1 0;
1 1 1 0;
1 1 0 0];
B = [1 0 1 0;
1 0 1 0;
1 0 1 0;
1 1 1 0];
C = (B ~= A) & B
p = sum(B(:) & A(:)) / sum(B(:)) * 100
This is the result:
C =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
p =
88.8889
Edit / Note: In the OP's question it's not 100% clear whether he wants the percentage in relation to the number of ones in A or in B. I assumed it is a percentage of the reference matrix, which is B, therefore I divide by sum(B(:)). In case you need it in reference to the ones in A, just change the last line to:
p = sum(B(:) & A(:)) / sum(A(:)) * 100
If I got it right, what you want to know is where B == 1 and A == 0.
Try this:
>> C = B & ~A
C =
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
To get the percentage, you could try this:
>> 100 * sum(A(:) & B(:)) / sum(A(:))
ans =
88.8889
You can use matrix multiplication, which should be pretty efficient, as the runtime tests listed next suggest.
To get the percentage value with respect to A -
percentage_wrtA = A(:).'*B(:)/sum(A(:)) * 100;
To get the percentage value with respect to B -
percentage_wrtB = A(:).'*B(:)/sum(B(:)) * 100;
Runtime tests
Here are some quick runtime tests comparing matrix multiplication against ANDing the elements and summing them with (:) -
>> M = 6000; %// Datasize
>> A = randi([0,1],M,M);
>> B = randi([0,1],M,M);
>> tic,sum(B(:) & A(:));toc
Elapsed time is 0.500149 seconds.
>> tic,A(:).'*B(:);toc
Elapsed time is 0.126881 seconds.
Try this (it returns the fraction of matches; multiply by 100 for a percentage):
sum(sum(A & B))./sum(sum(A))
Output:
ans =
0.8889
Assume the following matrix:
myMatrix = [
1 0 1
1 0 0
1 1 1
1 1 1
0 1 1
0 0 0
0 0 0
0 1 0
1 0 0
0 0 0
0 0 0
0 0 1
0 0 1
0 0 1
];
Given the above (and treating each column independently), I'm trying to create a matrix that will contain the number of rows since the last value of 1 has "shown up". For example, in the first column, the first four values would become 0 since there are 0 rows between each of those rows and the previous value of 1.
Row 5 would become 1, row 6 = 2, row 7 = 3, row 8 = 4. Since row 9 contains a 1, it would become 0 and the count starts again with row 10. The final matrix should look like this:
FinalMatrix = [
0 1 0
0 2 1
0 0 0
0 0 0
1 0 0
2 1 1
3 2 2
4 0 3
0 1 4
1 2 5
2 3 6
3 4 0
4 5 0
5 6 0
];
What is a good way of accomplishing something like this?
EDIT: I'm currently using the following code:
[numRow, numCol] = size(myMatrix);
oneColumn = 1:numRow;
FinalMatrix = repmat(oneColumn', 1, numCol);
toSubtract = zeros(numRow, numCol);
for m = 1:numCol
    rowsWithOnes = find(myMatrix(:,m));
    for mm = 1:length(rowsWithOnes)
        toSubtract(rowsWithOnes(mm):end, m) = rowsWithOnes(mm);
    end
end
FinalMatrix = FinalMatrix - toSubtract;
which runs about 5 times faster than the posted bsxfun solution over many trials and data sets (about 1500 x 2500 in size). Can the code above be optimized?
For a single column you could do this:
col = 1; %// desired column
vals = bsxfun(@minus, 1:size(myMatrix,1), find(myMatrix(:,col)));
vals(vals<0) = inf;
result = min(vals, [], 1).';
Result for first column:
result =
0
0
0
0
1
2
3
4
0
1
2
3
4
5
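To cover all columns, one could apply the same idea in a loop over columns (a sketch; note that with this approach rows before a column's first 1 come out as Inf, whereas the expected FinalMatrix counts up from 1 there, as in column 2):
[numRow, numCol] = size(myMatrix);
result = zeros(numRow, numCol);
for col = 1:numCol
    vals = bsxfun(@minus, 1:numRow, find(myMatrix(:,col)));   % distance from every row to each 1
    vals(vals < 0) = inf;                                     % ignore 1s that come later
    result(:,col) = min(vals, [], 1).';                       % closest preceding 1 (0 if the row is a 1)
end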
find + diff + cumsum based approach -
offset_array = zeros(size(myMatrix));
for k1 = 1:size(myMatrix,2)
    a = myMatrix(:,k1);
    widths = diff(find(diff([1 ; a])~=0));
    idx = find(diff(a)==1)+1;
    offset_array(idx(idx<=numel(a)),k1) = widths(1:2:end);
end
FinalMatrix1 = cumsum(double(myMatrix==0) - offset_array);
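As a quick sanity check (assuming FinalMatrix from the code in the question is still in the workspace), the two results can be compared directly:
isequal(FinalMatrix1, FinalMatrix)   % returns true (logical 1) for the sample matrix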
Benchmarking
The benchmarking code comparing the above-mentioned approach against the one in the question is listed here -
clear all
myMatrix = round(rand(1500,2500)); % create random input array

for k = 1:50000
    tic(); elapsed = toc(); % warm up tic/toc
end

disp('------------- With FIND+DIFF+CUMSUM based approach')
tic
offset_array = zeros(size(myMatrix));
for k1 = 1:size(myMatrix,2)
    a = myMatrix(:,k1);
    widths = diff(find(diff([1 ; a])~=0));
    idx = find(diff(a)==1)+1;
    offset_array(idx(idx<=numel(a)),k1) = widths(1:2:end);
end
FinalMatrix1 = cumsum(double(myMatrix==0) - offset_array);
toc
clear FinalMatrix1 offset_array idx widths a

disp('------------- With original approach')
tic
[numRow,numCol] = size(myMatrix);
oneColumn = 1:numRow;
FinalMatrix = repmat(oneColumn',1,numCol);
toSubtract = zeros(numRow,numCol);
for m = 1:numCol
    rowsWithOnes = find(myMatrix(:,m));
    for mm = 1:length(rowsWithOnes)
        toSubtract(rowsWithOnes(mm):end,m) = rowsWithOnes(mm);
    end
end
FinalMatrix = FinalMatrix - toSubtract;
toc
The results I got were -
------------- With FIND+DIFF+CUMSUM based approach
Elapsed time is 0.311115 seconds.
------------- With original approach
Elapsed time is 7.587798 seconds.
An expected result looks like this one:
0 0 0 1 0 * * * 0 0 0 0
0 0 0 0 1 * 1 * * 0 0 0
0 0 1 1 1 * 1 0 * 0 0 0
0 * * 0 1 * 1 0 * 0 0 0
0 * 1 1 1 * 1 0 * * 0 0
0 * 0 1 1 * 1 0 0 * 0 0
0 * 0 0 1 * 1 0 0 * 0 0
0 * * * * * 1 0 0 * * 0
0 0 0 0 0 0 1 0 0 0 * 0
0 0 0 0 0 0 1 0 0 0 * 0
0 0 0 0 0 0 1 0 0 0 * 0
0 0 0 0 0 0 1 0 0 0 * *
You can only walk in 4 directions (no 45-degree moves). I'm using A*; I changed part of the original algorithm to make it better suited to my case.
Here's my Python code. I run it 1000 times; the cost is 1.4s~1.5s:
import time

def astar(m, startp, endp):
    w, h = 12, 12
    sx, sy = startp
    ex, ey = endp
    # [parent node, x, y, g, f]
    node = [None, sx, sy, 0, abs(ex - sx) + abs(ey - sy)]
    closeList = [node]
    createdList = {}
    createdList[sy * w + sx] = node
    k = 0
    while closeList:
        node = closeList.pop(0)
        x = node[1]
        y = node[2]
        l = node[3] + 1
        k += 1
        # find neighbours
        # make the path not too strange
        if k & 1:
            neighbours = ((x, y + 1), (x, y - 1), (x + 1, y), (x - 1, y))
        else:
            neighbours = ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
        for nx, ny in neighbours:
            if nx == ex and ny == ey:
                path = [(ex, ey)]
                while node:
                    path.append((node[1], node[2]))
                    node = node[0]
                return list(reversed(path))
            if 0 <= nx < w and 0 <= ny < h and m[ny][nx] == 0:
                if ny * w + nx not in createdList:
                    nn = (node, nx, ny, l, l + abs(nx - ex) + abs(ny - ey))
                    createdList[ny * w + nx] = nn
                    # adding to closeList, using a binary heap
                    nni = len(closeList)
                    closeList.append(nn)
                    while nni:
                        i = (nni - 1) >> 1
                        if closeList[i][4] > nn[4]:
                            closeList[i], closeList[nni] = nn, closeList[i]
                            nni = i
                        else:
                            break
    return 'not found'
m = ((0,0,0,1,0,0,0,0,0,0,0,0),
(0,0,0,0,1,0,1,0,0,0,0,0),
(0,0,1,1,1,0,1,0,0,0,0,0),
(0,0,0,0,1,0,1,0,0,0,0,0),
(0,0,1,1,1,0,1,0,0,0,0,0),
(0,0,0,1,1,0,1,0,0,0,0,0),
(0,0,0,0,1,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0),
(0,0,0,0,0,0,1,0,0,0,0,0)
)
t1 = time.time()
for i in range(1000):
    result = astar(m, (2, 3), (11, 11))
print(time.time() - t1)

cm = [list(x[:]) for x in m]
if isinstance(result, list):
    for y in range(len(m)):
        my = m[y]
        for x in range(len(my)):
            for px, py in result:
                if px == x and py == y:
                    cm[y][x] = '*'
    for my in cm:
        print(' '.join([str(x) for x in my]))
exit(0)
Tell me if you know a faster (or the fastest) way.
The A* algorithm is a pretty fast one for a known graph (all edges are known and you can estimate the distance to the target using some admissible heuristic).
There are some improvements to the A* algorithm which make it faster at the cost of being less optimal. The most common is A*-Epsilon (AKA bounded A*). The idea is to allow the algorithm to develop nodes whose cost is up to (1+epsilon)*MIN (where regular A* develops only MIN). The result (depending on the epsilon value, of course) is usually a faster solution, but the path found is at most (1+epsilon) * OPTIMAL.
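For instance, here is a minimal sketch of that idea applied to the grid format from the question, using Python's heapq instead of the hand-rolled heap above (the function name and the epsilon value are illustrative, not part of the original code):
import heapq

def weighted_astar(m, start, goal, epsilon=0.5):
    # Weighted A* (A*-epsilon): f = g + (1 + epsilon) * h.
    # epsilon = 0 gives plain A*; larger epsilon trades optimality for speed.
    w, h = len(m[0]), len(m)
    ex, ey = goal
    sx, sy = start

    def hfun(x, y):
        return abs(x - ex) + abs(y - ey)          # Manhattan heuristic

    open_heap = [((1 + epsilon) * hfun(sx, sy), 0, (sx, sy))]
    parent = {(sx, sy): None}
    best_g = {(sx, sy): 0}
    while open_heap:
        f, g, (x, y) = heapq.heappop(open_heap)
        if (x, y) == (ex, ey):                    # goal reached: rebuild the path
            path = []
            node = (x, y)
            while node is not None:
                path.append(node)
                node = parent[node]
            return list(reversed(path))
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and m[ny][nx] == 0:
                ng = g + 1
                if ng < best_g.get((nx, ny), float('inf')):
                    best_g[(nx, ny)] = ng
                    parent[(nx, ny)] = (x, y)
                    heapq.heappush(open_heap, (ng + (1 + epsilon) * hfun(nx, ny), ng, (nx, ny)))
    return None

# Example with the maze from the question:
# path = weighted_astar(m, (2, 3), (11, 11), epsilon=0.5)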
Another possible optimization is to run A* from one end and, from the other end (the "exit"), run a BFS simultaneously. This technique is called bi-directional search and is usually a great way to improve performance in unweighted graphs when the problem has a single final state. I tried to explain the principles of bi-directional search once in this thread.