Minimizing distance to a weighted grid - algorithm

Let's suppose you have a 1000x1000 grid of positive integer weights W.
We want to find the cell that minimizes the average weighted distance to each cell.
The brute force way to do this would be to loop over each candidate cell and calculate the distance:
float best_dist = inf;
int best_x, best_y;
for x0 = 1:1000,
    for y0 = 1:1000,
        float total_dist = 0;
        for x1 = 1:1000,
            for y1 = 1:1000,
                total_dist += W[x1,y1] * sqrt((x0-x1)^2 + (y0-y1)^2);
        if (total_dist < best_dist)
            best_x = x0;
            best_y = y0;
            best_dist = total_dist;
This takes ~10^12 operations, which is too long.
Is there a way to do this in or near ~10^8 or so operations?

Theory
This is possible using filters in O(nm log nm) time, where n and m are the grid dimensions.
You need to define a filter of size (2n+1) x (2m+1), and you need to embed your original weight grid, centered, in a grid of zeros of size 3n x 3m. The filter needs to be the distance weighting from the origin at (n,m):
F(i,j) = sqrt((n-i)^2 + (m-j)^2)
Let W denote the original weight grid (centered) embedded in a grid of zeros of size 3n x 3m.
Then the filtered (cross-correlation) result
R = F o W
will give you the total_dist grid. Simply take the minimum of R (ignoring the extra embedded zeros you put into W) to find your best x0, y0 position.
Image (i.e. grid) filtering is very standard, and can be done in all sorts of existing software, such as MATLAB with the imfilter command.
I should note that, although I explicitly made use of cross-correlation above, you would get the same result with convolution, but only because your filter F is symmetric. In general, image filtering is cross-correlation, not convolution, though the two operations are very analogous.
The reason for the O(nm log nm) runtime is that image filtering can be done using 2D FFTs.
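As a minimal sketch of the indexing only (not of the FFT speed-up), the filtered result R can be computed by direct cross-correlation in plain Python and then searched for its minimum. The function names are illustrative, and zero-based indices are assumed, so the filter origin sits at F[n][m]. The loops are still O(n^2 m^2); an FFT-based filtering routine would replace the inner sum.

```python
import math

def distance_filter(n, m):
    """F[i][j] = Euclidean distance from (i, j) to the filter origin (n, m)."""
    return [[math.hypot(i - n, j - m) for j in range(2 * m + 1)]
            for i in range(2 * n + 1)]

def total_distance_grid(W):
    """Cross-correlate W (treated as zero outside its bounds) with F."""
    n, m = len(W), len(W[0])
    F = distance_filter(n, m)
    R = [[0.0] * m for _ in range(n)]
    for x0 in range(n):
        for y0 in range(m):
            # The offset (x1 - x0, y1 - y0) is shifted by (n, m) to index F.
            R[x0][y0] = sum(W[x1][y1] * F[x1 - x0 + n][y1 - y0 + m]
                            for x1 in range(n) for y1 in range(m))
    return R

def argmin_cell(R):
    """Return the (x0, y0) index of the smallest entry of R."""
    cells = ((x, y) for x in range(len(R)) for y in range(len(R[0])))
    return min(cells, key=lambda c: R[c[0]][c[1]])
```

Calling `argmin_cell(total_distance_grid(W))` on a small grid is a convenient way to check an FFT-based version against the definition.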
Implementation
Here are both implementations in MATLAB. The final result is the same for both methods, and they are benchmarked in a very simple way:
m=100;
n=100;
W0=abs(randn(m,n))+.001;
tic;
%The following padding is not necessary in the matlab code because
%matlab implements it in the imfilter function, from the imfilter
%documentation:
% - Boundary options
%
% X Input array values outside the bounds of the array
% are implicitly assumed to have the value X. When no
% boundary option is specified, imfilter uses X = 0.
%W=padarray(W0,[m n]);
W=W0;
F=zeros(2*m+1,2*n+1);
for i=1:size(F,1)
    for j=1:size(F,2)
        %MATLAB indices start from 1, hence the extra -1 in each term,
        %which places the filter origin at F(m+1,n+1)
        F(i,j)=sqrt((i-m-1)^2 + (j-n-1)^2);
    end
end
R=imfilter(W,F);
[mr mc] = ind2sub(size(R),find(R == min(R(:))));
[mr, mc]
toc;
tic;
T=zeros([m n]);
best_x=-1;
best_y=-1;
best_val=inf;
for y0=1:m
for x0=1:n
total_dist = 0;
for y1=1:m
for x1=1:n
total_dist = total_dist + W0(y1,x1) * sqrt((x0-x1)^2 + (y0-y1)^2);
end
end
T(y0,x0) = total_dist;
if ( total_dist < best_val )
best_x = x0;
best_y = y0;
best_val = total_dist;
end
end
end
[best_y best_x]
toc;
diff=abs(T-R);
max_diff=max(diff(:));
fprintf('The max difference between the two computations: %g\n', max_diff);
Performance
For an 800x800 grid, on my PC which is certainly not the fastest, the FFT method evaluates in just over 700 seconds. The brute force method doesn't complete after several hours and I have to kill it.
In terms of further performance gains, you can attain them by moving to a hardware platform like GPUs. For example, using CUDA's FFT library you can compute 2D FFTs in a fraction of the time it takes on a CPU. The key point is that the FFT method will scale as you throw more hardware at the computation, while the brute force method will scale much worse.
Observations
While implementing this, I have observed that almost every time, the best_x, best_y values are one of floor(n/2)±1. This most likely means that the distance term dominates the entire computation; therefore, you could get away with computing the value of total_dist for only 4 values, making this algorithm trivial!

Related

Predict Collision Between 2 Uniform Circular Motion Objects

this is my first question on the forum and my algebra is rusty so please be indulgent ^^'
So my problem is that I want to predict a collision between two objects in uniform circular motion, for which I know the velocity (angular speed in radians), the distance from the origin (radius), and the cartesian coordinates of the center of the circle.
I can get the cartesian position of each object at a given time t (timestamp) using:
Oa.x = ra * cos(wa * t)
Oa.y = ra * sin(wa * t)
Oa.x: Object A x coordinates
ra: radius of a Circle A
wa: velocity of object A (angular speed in radian)
t: time (timestamp)
Same goes for object b (Ob)
I want to find t such that ||Oa - Ob|| = (rOa + rOb)
rOa: radius of object a
Squaring both sides and expanding gives me this:
||Oa-Ob||^2 = (rOa+rOb)^2
(ra * cos(wa * t) - rb * cos(wb * t))^2 + (ra * sin(wa * t) - rb * sin(wb * t))^2 = (rOa+rOb)^2
From that I should get a quadratic polynomial that I can solve for t, but how can I find a condition that tells me whether such a t exists? And, if possible, how do I solve it for t?
Your motion equations are missing some stuff. I would expect this instead:
a0(t) = omg0*t + ang0
x0(t) = cx0 + R0 * cos(a0(t))
y0(t) = cy0 + R0 * sin(a0(t))
a1(t) = omg1*t + ang1
x1(t) = cx1 + R1 * cos(a1(t))
y1(t) = cy1 + R1 * sin(a1(t))
where t is time in [sec], cx?,cy? is the center of rotation, ang? is the starting angle (t=0) in [rad], and omg? is the angular speed in [rad/sec]. If the objects have radius r?, then a collision occurs when the distance is <= r0+r1
So you want to find the smallest time where:
(x1-x0)^2 + (y1-y0)^2 <= (r0+r1)^2
This will most likely lead to a transcendental equation, so you need a numeric approach to solve it. For stuff like this I usually use approximation search, so to solve this do:
loop t from 0 to some reasonable time limit
The collision will happen with constant frequency, and the time between collisions will be divisible by the periods of both motions, so I would test up to a time limit of lcm(2*PI/omg0,2*PI/omg1), where lcm is the least common multiple.
Do not loop t through all possible times with brute force, but use a heuristic (like the approximation search mentioned above). Beware that the initial time step must be reasonable; I would try dt = min(0.2*PI/omg0,0.2*PI/omg1) so you have at least 10 points along each circle.
solve t so the distance between objects is minimal
This, however, will find the time when the objects collide fully, so that their centers merge. So you need to subtract some constant time (or search again) to get to the start of the collision. This time you can even use binary search, as the distance will be monotonic.
the next collision will appear after lcm(2*PI/omg0,2*PI/omg1)
so if you found first collision time tc0 then
tc(i) = tc0 + i*lcm(2*PI/omg0,2*PI/omg1)
i = 0,1,2,3,...
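A minimal Python sketch of such a numeric search follows. It is a simpler variant than the approximation search described above: it steps through time with a fixed dt until the distance first drops to r0+r1 or less, then bisects that interval to locate the start of the collision. The tuple layout (cx, cy, R, omg, ang) and the function names are assumptions made for illustration, and a contact shorter than dt can be missed by the coarse stepping.

```python
import math

def position(obj, t):
    """obj = (cx, cy, R, omg, ang): centre, orbit radius, angular speed, start angle."""
    cx, cy, R, omg, ang = obj
    a = omg * t + ang
    return cx + R * math.cos(a), cy + R * math.sin(a)

def gap(obj0, obj1, r_sum, t):
    """Centre distance minus r0 + r1; a value <= 0 means the objects overlap."""
    x0, y0 = position(obj0, t)
    x1, y1 = position(obj1, t)
    return math.hypot(x1 - x0, y1 - y0) - r_sum

def first_collision(obj0, obj1, r_sum, t_max, dt):
    """Return the first t in [0, t_max] with gap <= 0, or None if none is found."""
    if gap(obj0, obj1, r_sum, 0.0) <= 0.0:
        return 0.0
    steps = int(math.ceil(t_max / dt))
    for k in range(1, steps + 1):
        t = min(k * dt, t_max)
        if gap(obj0, obj1, r_sum, t) <= 0.0:
            lo, hi = (k - 1) * dt, t      # gap > 0 at lo, gap <= 0 at hi
            for _ in range(60):           # bisection on the sign of the gap
                mid = 0.5 * (lo + hi)
                if gap(obj0, obj1, r_sum, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return hi
    return None
```

For example, `first_collision((0, 0, 1, 1, math.pi), (3, 0, 1, 0, math.pi), 1.5, 10.0, 0.05)` searches the first 10 seconds with a step of 0.05 for two objects whose radii sum to 1.5.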

Speed-efficient classification in Matlab

I have an RGB image of size uint8(576,720,3) in which I want to classify each pixel into one of a set of colors. I have transformed it from RGB to LAB space using rgb2lab and then removed the L layer, so it is now a double(576,720,2) consisting of AB.
Now, I want to classify this to some colors that I have trained on another image, and calculated their respective AB-representations as:
Cluster 1: -17.7903 -13.1170
Cluster 2: -30.1957 40.3520
Cluster 3: -4.4608 47.2543
Cluster 4: 46.3738 36.5225
Cluster 5: 43.3134 -17.6443
Cluster 6: -0.9003 1.4042
Cluster 7: 7.3884 11.5584
Now, in order to classify/label each pixel to a cluster 1-7, I currently do the following (pseudo-code):
clusters;
for each x
for each y
ab = im(x,y,2:3);
dist = norm(ab - clusters); // norm of dist between ab and each cluster
[~, idx] = min(dist);
end
end
However, this is terribly slow (52 seconds) because of the image resolution and because I manually loop through each x and y.
Are there some built-in functions I can use that perform the same job? There must be.
To summarize: I need a classification method that classifies pixel images to an already defined set of clusters.
Approach #1
For an N x 2 sized points/pixels array, you can avoid the permute used in the other solution by Luis, which could slow things down a bit, and use a kind of "permute-unrolled" version of it. This also lets bsxfun work on a 2D array instead of a 3D array, which should be better for performance.
Thus, assuming clusters to be ordered as an N x 2 sized array, you may try this other bsxfun based approach -
%// Get a's and b's
im_a = im(:,:,2);
im_b = im(:,:,3);
%// Get the minimum indices that correspond to the cluster IDs
[~,idx] = min(bsxfun(@minus,im_a(:),clusters(:,1).').^2 + ...
    bsxfun(@minus,im_b(:),clusters(:,2).').^2,[],2);
idx = reshape(idx,size(im,1),[]);
Approach #2
You can try out another approach that leverages fast matrix multiplication in MATLAB and is based on this smart solution -
d = 2; %// dimension of the problem size
im23 = reshape(im(:,:,2:3),[],2);
numA = size(im23,1);
numB = size(clusters,1);
A_ext = zeros(numA,3*d);
B_ext = zeros(numB,3*d);
for id = 1:d
A_ext(:,3*id-2:3*id) = [ones(numA,1), -2*im23(:,id), im23(:,id).^2 ];
B_ext(:,3*id-2:3*id) = [clusters(:,id).^2 , clusters(:,id), ones(numB,1)];
end
[~, idx] = min(A_ext * B_ext',[],2); %//'
idx = reshape(idx, size(im,1),[]); %// Desired IDs
What’s going on with the matrix multiplication based distance matrix calculation?
Let us consider two matrices A and B between whom we want to calculate the distance matrix. For the sake of an easier explanation that follows next, let us consider A as 3 x 2 and B as 4 x 2 sized arrays, thus indicating that we are working with X-Y points. If we had A as N x 3 and B as M x 3 sized arrays, then those would be X-Y-Z points.
Now, if we have to manually calculate the first element of the squared distance matrix, it would look like this -
first_element = ( A(1,1) - B(1,1) )^2 + ( A(1,2) - B(1,2) )^2
which would be -
first_element = A(1,1)^2 + B(1,1)^2 - 2*A(1,1)*B(1,1) + ...
    A(1,2)^2 + B(1,2)^2 - 2*A(1,2)*B(1,2)    ... Equation (1)
Now, according to our proposed matrix multiplication, if you check the output of A_ext and B_ext after the loop in the earlier code ends, their first rows would look like the following -
A_ext(1,:) = [ 1, -2*A(1,1), A(1,1)^2, 1, -2*A(1,2), A(1,2)^2 ]
B_ext(1,:) = [ B(1,1)^2, B(1,1), 1, B(1,2)^2, B(1,2), 1 ]
So, if you perform matrix multiplication between A_ext and the transpose of B_ext, the first element of the product would be the sum of the elementwise multiplication between the first rows of A_ext and B_ext, i.e. the sum of these -
B(1,1)^2 - 2*A(1,1)*B(1,1) + A(1,1)^2 + B(1,2)^2 - 2*A(1,2)*B(1,2) + A(1,2)^2
The result would be identical to the result obtained from Equation (1) earlier. This would continue for all the elements of A against all the elements of B that are in the same column as in A. Thus, we would end up with the complete squared distance matrix. That’s all there is!!
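To make the expansion concrete, here is a small plain-Python sketch (no matrix library, illustrative function names) that builds the extended rows exactly as in the loop above and checks the dot product against a directly computed squared distance.

```python
def extend_a(p):
    """Row of A_ext for point p: [1, -2*v, v^2] for each coordinate v."""
    row = []
    for v in p:
        row += [1.0, -2.0 * v, v * v]
    return row

def extend_b(q):
    """Row of B_ext for point q: [v^2, v, 1] for each coordinate v."""
    row = []
    for v in q:
        row += [v * v, v, 1.0]
    return row

def sq_dist_matrix(A, B):
    """Squared distances between every point of A and every point of B."""
    A_ext = [extend_a(p) for p in A]
    B_ext = [extend_b(q) for q in B]
    # Each entry is one row of A_ext dotted with one row of B_ext,
    # which is what A_ext * B_ext' computes in a single matrix product.
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in B_ext]
            for ra in A_ext]

def sq_dist_direct(p, q):
    """Reference value: sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(p, q))
```

For instance, `sq_dist_matrix([(0, 0), (1, 2)], [(3, 4)])` can be compared entry by entry with `sq_dist_direct`.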
Vectorized Variations
Vectorized variations of the matrix multiplication based distance matrix calculations are possible, though there weren't any big performance improvements seen with them. Two such variations are listed next.
Variation #1
[nA,dim] = size(A);
nB = size(B,1);
A_ext = ones(nA,dim*3);
A_ext(:,2:3:end) = -2*A;
A_ext(:,3:3:end) = A.^2;
B_ext = ones(nB,dim*3);
B_ext(:,1:3:end) = B.^2;
B_ext(:,2:3:end) = B;
distmat = A_ext * B_ext.';
Variation #2
[nA,dim] = size(A);
nB = size(B,1);
A_ext = [ones(nA*dim,1) -2*A(:) A(:).^2];
B_ext = [B(:).^2 B(:) ones(nB*dim,1)];
A_ext = reshape(permute(reshape(A_ext,nA,dim,[]),[1 3 2]),nA,[]);
B_ext = reshape(permute(reshape(B_ext,nB,dim,[]),[1 3 2]),nB,[]);
distmat = A_ext * B_ext.';
So, these could be considered as experimental versions too.
Use pdist2 (Statistics Toolbox) to compute the distances in a vectorized manner:
ab = im(:,:,2:3); % // get A, B components
ab = reshape(ab, [size(im,1)*size(im,2) 2]); % // reshape into 2-column
dist = pdist2(clusters, ab); % // compute distances
[~, idx] = min(dist); % // find minimizer for each pixel
idx = reshape(idx, size(im,1), size(im,2)); % // reshape result
If you don't have the Statistics Toolbox, you can replace the third line by
dist = squeeze(sum(bsxfun(@minus, clusters, permute(ab, [3 2 1])).^2, 2));
This gives squared distance instead of distance, but for the purposes of minimizing it doesn't matter.
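For readers who want to see the labelling rule itself spelled out, here is a plain-Python sketch using the cluster centres from the question. It is not vectorized and is not meant to be fast; it only makes the "minimum squared distance" step explicit. The function names and the nested-list image layout are assumptions for illustration, and labels are 1-based to match the MATLAB code.

```python
# Cluster centres (a, b) copied from the question, in order 1..7.
CLUSTERS = [
    (-17.7903, -13.1170),
    (-30.1957, 40.3520),
    (-4.4608, 47.2543),
    (46.3738, 36.5225),
    (43.3134, -17.6443),
    (-0.9003, 1.4042),
    (7.3884, 11.5584),
]

def nearest_cluster(ab, clusters=CLUSTERS):
    """Return the 1-based index of the cluster closest to the (a, b) pair."""
    a, b = ab
    # Squared distance is enough: the square root does not change the minimizer.
    d2 = [(a - ca) ** 2 + (b - cb) ** 2 for ca, cb in clusters]
    return d2.index(min(d2)) + 1

def label_image(ab_image, clusters=CLUSTERS):
    """Label every pixel of a row-major list of rows of (a, b) pairs."""
    return [[nearest_cluster(px, clusters) for px in row] for row in ab_image]
```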

Circular Hough Transform Improvements

I'm working on an iris recognition algorithm that processes these kinds of images into unique codes for identification and authentication purposes.
After filtering, intelligently thresholding, then finding edges in the image, the next step is obviously to fit circles to the pupil and iris. I've looked around, and the technique to use is the circular Hough Transform. Here is the code for my implementation. Sorry about the cryptic variable names.
print "Populating Accumulator..."
# Loop over x (image width)
for x in range(w):
    # Loop over y (image height)
    for y in range(h):
        # Only process black pixels
        if inp[x,y] == 0:
            # px == 0 means we are looking for the pupil;
            # otherwise (px, py) is the pupil center
            if px == 0:
                ra = r_min
                rb = r_max
            else:
                rr = sqrt((px-x)*(px-x)+(py-y)*(py-y))
                ra = int(rr-3)
                rb = int(rr+3)
            # a is the width of the image, b is the height
            for _a in range(a):
                for _b in range(b):
                    for _r in range(rb-ra):
                        s1 = x - (_a + a_min)
                        s2 = y - (_b + b_min)
                        r1 = _r + ra
                        if (s1 * s1 + s2 * s2 == r1 * r1):
                            # Cast a vote for this (a, b, r) cell
                            acc[_a][_b][_r] += 1
                            new = acc[_a][_b][_r]
                            if new >= maxVotes:
                                maxVotes = new
print "Done"
# Average all circles with the most votes
for _a in range(a):
    for _b in range(b):
        for _r in range(r):
            if acc[_a][_b][_r] >= maxVotes-1:
                total_a += _a + a_min
                total_b += _b + b_min
                total_r += _r + r_min
                amount += 1
top_a = total_a / amount
top_b = total_b / amount
top_r = total_r / amount
print top_a,top_b,top_r
This is written in Python and uses the Python Imaging Library to do image processing. As you can see, this is a very naive brute force method of finding circles. It works, but takes several minutes. The basic idea is to draw circles from rmin to rmax wherever there is a black pixel (from thresholding and edge-detection), then build an accumulator array of the number of times a location on the image is "voted" on. Whichever x, y, and r has the most votes is the circle of interest. I tried to use the fact that the iris and pupil have about the same center (variables ra and rb) to reduce some of the complexity of the r loop, but the pupil detection takes so long that it doesn't matter.
Now, obviously my implementation is very naive. It uses a three dimensional parameter space (x, y, and r), which unfortunately makes it run slower than is acceptable. What kind of improvements can I make? Is there any way to reduce this to a two-dimensional parameter space? Is there a more efficient way of accessing and setting pixels that I'm not aware of?
On a side note, are there any other techniques for improving the overall runtime of this algorithm that I'm not aware of? Such as methods to approximate the maximum radius of the pupil or iris?
Note: I've tried to use OpenCV for this as well, but I could not tune the parameters enough to be consistently accurate.
Let me know if there's any other information that you need.
NOTE: Once again I misinterpreted my own code. It is technically 5-dimensional, but the 3-dimensional x,y,r loop only operates on black pixels.
Assuming you want the position of the circle rather than a measure of R: if you have a decent estimate of the possible range of R, then a common technique is to run the algorithm for a first guess of fixed R, adjust it, and try again.
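A rough Python sketch of the fixed-R idea: with the radius held constant, the accumulator only needs two dimensions (the candidate center), and each edge pixel votes for the cells on a circle of radius R around itself. The function name, the 360-angle sampling, and the use of a Counter as the accumulator are assumptions made for this illustration, and at least one edge point is required.

```python
import math
from collections import Counter

def hough_fixed_radius(edge_points, radius, n_angles=360):
    """Vote for circle centers at a single, fixed radius (2D accumulator).

    edge_points is an iterable of integer (x, y) edge-pixel coordinates.
    Returns ((a, b), votes) for the most voted center cell.
    """
    acc = Counter()
    for x, y in edge_points:
        cells = set()
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            # A center at distance `radius` from (x, y) in direction theta.
            cells.add((int(round(x - radius * math.cos(theta))),
                       int(round(y - radius * math.sin(theta)))))
        acc.update(cells)  # each edge pixel votes at most once per cell
    return acc.most_common(1)[0]
```

Running this for a few candidate radii and keeping the radius with the highest peak vote count is one way to "adjust and try again"; the work per radius is proportional to the number of edge pixels rather than to the image area.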

random unit vector in multi-dimensional space

I'm working on a data mining algorithm where I want to pick a random direction from a particular point in the feature space.
If I pick a random number for each of the n dimensions from [-1,1] and then normalize the vector to a length of 1, will I get an even distribution across all possible directions?
I'm speaking only theoretically here since computer generated random numbers are not actually random.
One simple trick is to select each dimension from a gaussian distribution, then normalize:
from random import gauss

def make_rand_vector(dims):
    vec = [gauss(0, 1) for i in range(dims)]
    mag = sum(x**2 for x in vec) ** .5
    return [x/mag for x in vec]
For example, if you want a 7-dimensional random vector, select 7 random values (from a Gaussian distribution with mean 0 and standard deviation 1). Then, compute the magnitude of the resulting vector using the Pythagorean formula (square each value, add the squares, and take the square root of the result). Finally, divide each value by the magnitude to obtain a normalized random vector.
If your number of dimensions is large then this has the strong benefit of always working immediately, while generating random vectors until you find one which happens to have magnitude less than one will cause your computer to simply hang at more than a dozen dimensions or so, because the probability of any of them qualifying becomes vanishingly small.
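To put a number on how quickly the rejection approach degrades, the acceptance probability is the volume of the unit n-ball divided by the volume of the cube [-1,1]^n, i.e. pi^(n/2) / (Gamma(n/2 + 1) * 2^n). The helper below is a small illustrative sketch of that standard formula; the function name is made up for this example.

```python
import math

def ball_in_cube_fraction(n):
    """Fraction of points drawn uniformly from [-1, 1]^n that fall in the unit ball."""
    ball_volume = math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)
    cube_volume = 2.0 ** n
    return ball_volume / cube_volume

if __name__ == "__main__":
    # Print the acceptance fraction for a few dimensions.
    for n in (2, 3, 6, 12, 20):
        print(n, ball_in_cube_fraction(n))
```

The expected number of draws per accepted vector is the reciprocal of this fraction, which is why the Gaussian method is preferable in high dimensions.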
You will not get a uniformly distributed ensemble of angles with the algorithm you described. The angles will be biased toward the corners of your n-dimensional hypercube.
This can be fixed by eliminating any points with distance greater than 1 from the origin. Then you're dealing with a spherical rather than a cubical (n-dimensional) volume, and your set of angles should then be uniformly distributed over the sample space.
Pseudocode:
Let n be the number of dimensions, K the desired number of vectors:
vec_count=0
while vec_count < K
    generate n uniformly distributed values a[0..n-1] over [-1, 1]
    r_squared = sum over i=0,n-1 of a[i]^2
    if 0 < r_squared <= 1.0
        b[i] = a[i]/sqrt(r_squared) for i=0,n-1 ; normalize to length of 1
        add vector b[0..n-1] to output list
        vec_count = vec_count + 1
    else
        reject this sample
end while
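A direct Python translation of the pseudocode above might look like the following sketch. The function name and the optional `rng` argument (a `random.Random` instance, useful for reproducible runs) are additions for illustration.

```python
import math
import random

def random_unit_vectors(n, k, rng=None):
    """Return k unit vectors in n dimensions via rejection sampling."""
    rng = rng or random.Random()
    out = []
    while len(out) < k:
        # Candidate point drawn uniformly from the cube [-1, 1]^n.
        a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        r_squared = sum(v * v for v in a)
        # Keep only points inside the unit ball (and away from the origin).
        if 0.0 < r_squared <= 1.0:
            r = math.sqrt(r_squared)
            out.append([v / r for v in a])
    return out
```

This is practical only for small n, since the fraction of accepted samples shrinks rapidly as the number of dimensions grows.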
There is a boost implementation of the algorithm that samples from normal distributions: random::uniform_on_sphere
I had the exact same question when also developing a ML algorithm.
I got to the same conclusion as Jim Lewis after drawing samples for the 2-d case and plotting the resulting distribution of the angle.
Furthermore, if you try to derive the density distribution for the direction in 2d when you draw at random from [-1,1] for the x- and y-axis, you will see that:
f_X(x) = 1/(4*cos²(x)) if 0 < x < 45°
and
f_X(x) = 1/(4*sin²(x)) if x > 45°
where x is the angle, and f_X is the probability density distribution.
I have written about this here:
https://aerodatablog.wordpress.com/2018/01/14/random-hyperplanes/
#define SCL1 (M_SQRT2/2)
#define SCL2 (M_SQRT2*2)
// unitrand in [-1,1].
double u = SCL1 * unitrand();
double v = SCL1 * unitrand();
double w = SCL2 * sqrt(1.0 - u*u - v*v);
double x = w * u;
double y = w * v;
double z = 1.0 - 2.0 * (u*u + v*v);

Wrapping 2D perlin noise

I'm working with Perlin Noise for a height map generation algorithm, and I would like to make it wrap around the edges so that it can be seen as continuous. Is there a simple way or trick to do that? I guess I need something like a spherical noise so that it wraps around both horizontally and vertically. I would also be happy with just 1 wrapping axis, but two would be better.
For now I'm using the classical algorithm in which you can set up how many octaves you want to add and which are the multipliers used for changing amplitude and frequency of the waves between every successive octave.
Thanks in advance!
Perlin noise is obtained as the sum of waveforms. The waveforms are obtained by interpolating random values, and the higher octave waveforms have smaller scaling factors whereas the interpolated random values are nearer to each other. To make this wrap around, you just need to properly interpolate around the y- and x-axes in the usual toroidal fashion, i.e. if your X-axis spans from x_min to x_max, and the leftmost random point (which is being interpolated) is at x0 and the rightmost at x1 (x_min < x0 < x1 < x_max), the value for the interpolated pixels right to x1 and left from x0 are obtained by interpolating from x1 to x0 (wrapping around the edges).
Here is pseudocode for one of the octaves using linear interpolation. This assumes a 256 x 256 matrix where the Perlin noise grid size is a power of two pixels... just to make it readable. Imagine e.g. size==16:
wrappable_perlin_octave(grid, size):
    for (x=0;x<256;x+=size):
        for (y=0;y<256;y+=size):
            grid[x][y] = random()
    for (x=0;x<256;x++):
        for (y=0;y<256;y++):
            if (x % size != 0 || y % size != 0): # interpolate
                ax = x - x % size
                bx = (ax + size) % 256 # wrap-around
                ay = y - y % size
                by = (ay + size) % 256 # wrap-around
                h = (x % size) / size # horizontal balance, floating-point calculation
                v = (y % size) / size # vertical balance, floating-point calculation
                grid[x][y] = grid[ax][ay] * (1-h) * (1-v) +
                             grid[bx][ay] * h * (1-v) +
                             grid[ax][by] * (1-h) * v +
                             grid[bx][by] * h * v
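The same octave can be written as runnable Python. This is a sketch under the same assumptions as the pseudocode (square grid, lattice spacing `size` dividing the grid width); the function name and the optional `rng` argument are illustrative additions.

```python
import random

def wrappable_octave(width, size, rng=None):
    """One octave on a width x width grid that tiles seamlessly.

    Assumes width is a multiple of size (the lattice spacing in pixels).
    """
    rng = rng or random.Random()
    grid = [[0.0] * width for _ in range(width)]
    # Random values on the coarse lattice.
    for x in range(0, width, size):
        for y in range(0, width, size):
            grid[x][y] = rng.random()
    # Bilinear interpolation everywhere else, wrapping at the far edges.
    for x in range(width):
        for y in range(width):
            if x % size == 0 and y % size == 0:
                continue  # lattice point, already set
            ax = x - x % size
            bx = (ax + size) % width  # wrap-around
            ay = y - y % size
            by = (ay + size) % width  # wrap-around
            h = (x % size) / size     # horizontal balance
            v = (y % size) / size     # vertical balance
            grid[x][y] = (grid[ax][ay] * (1 - h) * (1 - v)
                          + grid[bx][ay] * h * (1 - v)
                          + grid[ax][by] * (1 - h) * v
                          + grid[bx][by] * h * v)
    return grid
```

Summing several such octaves with decreasing `size` and decreasing amplitude gives a height map whose right edge continues into its left edge and whose bottom edge continues into its top edge.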

Resources