Quadtree Nearest Neighbour Algorithm


I have implemented a quadtree structure for n points as well as a method for returning an array of points within a given rectangle. I can't seem to find an algorithm to efficiently find the point that is closest to another given point. Am I missing something obvious? I assume a recursive solution is the correct approach?
I am working in Objective-C, but pseudocode would be fine. Additionally, I am storing latitude/longitude data, and the distance between points is measured along a great circle.
EDIT:
This is my tree insert and subdivide code
- (BOOL)insert:(id<PASQuadTreeDataPoint>)dataPoint {
BOOL pointAdded = false;
// If the point lies within the region
if(CGRectContainsPoint(self.region, dataPoint.point)) {
// If there are less than 4 points then add this point
if(self.dataPoints.count < kMaxPointsPerNode) {
[self.dataPoints addObject:dataPoint];
pointAdded = true;
}
else {
// Subdivide into 4 quadrants if not already subdivided
if(northEast == nil) [self subdivide];
// Attempt to add the point to one of the 4 subdivided quadrants
if([northEast insert:dataPoint]) return true;
if([southEast insert:dataPoint]) return true;
if([southWest insert:dataPoint]) return true;
if([northWest insert:dataPoint]) return true;
}
}
return pointAdded;
}
- (void)subdivide {
// Compute the half width and the origin
CGFloat width = self.region.size.width * 0.5f;
CGFloat height = self.region.size.height * 0.5f;
CGFloat x = self.region.origin.x;
CGFloat y = self.region.origin.y;
// Create a new child quadtree with the region divided into quarters
self.northEast = [PASQuadTree quadTreeWithRegion:CGRectMake(x + width, y, width, height)];
self.southEast = [PASQuadTree quadTreeWithRegion:CGRectMake(x + width, y + height, width, height)];
self.southWest = [PASQuadTree quadTreeWithRegion:CGRectMake(x, y + height, width, height)];
self.northWest = [PASQuadTree quadTreeWithRegion:CGRectMake(x, y, width, height)];
}
EDIT:
Have written this code to find the smallest node (leaf) that would contain the point:
-(PASQuadTree *)nodeThatWouldContainPoint:(CGPoint)point {
PASQuadTree *node = nil;
// If the point is within the region
if (CGRectContainsPoint(self.region, point)) {
// Set the node to this node
node = self;
// If the node has children
if (self.northEast != nil) {
// Recursively check each child to see if it would contain the point
PASQuadTree *childNode = [self.northEast nodeThatWouldContainPoint:point];
if (!childNode) childNode = [self.southEast nodeThatWouldContainPoint:point];
if (!childNode) childNode = [self.southWest nodeThatWouldContainPoint:point];
if (!childNode) childNode = [self.northWest nodeThatWouldContainPoint:point];
if (childNode) node = childNode;
}
}
return node;
}
Closer but no cigar!

Find the smallest square centered on your search point that contains exactly one other point (this needs roughly log n searches).
Let x be the distance to that other point.
Then find all the points within a square of side 2x centered on your search point. For each point within this square, calculate its distance from the search point and keep the closest.
UPDATE: How do you find a square centered on the search point that contains exactly one other point?
Pick a random point and let x be the distance to it. Find all points within a square of size x centered on the search point. If the square contains any points, select one of them at random and repeat. If it contains none, increase the search square size to (2-0.5)*x (then (2-0.25)*x in the next step, and so on).
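A minimal Python sketch of the refinement step described above (one square range query around the search point, then a scan of the hits). It assumes planar coordinates rather than the great-circle distance from the question, and `points_in_square` is a hypothetical brute-force stand-in for the quadtree's rectangle query, not part of the original code:

```python
import math

def points_in_square(points, center, half_side):
    """Hypothetical stand-in for the quadtree rectangle query."""
    cx, cy = center
    return [p for p in points
            if abs(p[0] - cx) <= half_side and abs(p[1] - cy) <= half_side]

def refine_nearest(points, query, candidate):
    """Turn any candidate point into the true nearest neighbour of query."""
    qx, qy = query
    # x is the distance to the candidate; nothing farther can be the answer.
    x = math.hypot(candidate[0] - qx, candidate[1] - qy)
    # Every point closer than the candidate lies in the square of side 2x.
    inside = points_in_square(points, query, x)
    return min(inside, key=lambda p: math.hypot(p[0] - qx, p[1] - qy))
```

The candidate itself always falls inside the square, so the result is never empty. The better the initial candidate, the smaller the square and the fewer points the final scan has to visit.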

Related

How to efficiently cover a set of points with circles when you can't access point coordinates? [closed]

Suppose I have a finite set of points distributed in a unit square. I can't access the point coordinates; instead, I can only specify a (point, radius) pair and see how many points fall inside that circle. I want to find a set of circles such that each point is in at least one circle, and no circle contains more than 1000 points. What's an efficient way to do this? E.g. a way that minimizes the expected number of (point, radius) searches?
I tried a recursive approach. E.g. f(point, radius) takes a circle and returns a set of smaller circles that cover it. Then recurse until each circle contains fewer than 1000 points. But there's not a straightforward (to me) way to choose the smaller circles in the recursive step.
Edit: Circles are allowed to overlap with each other / with the outside of the square.

Not having a strict partition ("strict" meaning that the circles in the solution may not overlap and each point must appear in exactly one circle) simplifies the problem.
The straightforward way to subdivide a circle under those circumstances is to form a set of child circles that circumscribe the four quadrants of the parent.
Here's a (cursorily tested) demo using that approach:
class Circle {
constructor(x,y,radius) {
Object.assign(this, { x, y, radius })
this.rSquared = radius*radius
}
contains(point) {
let dx = point.x - this.x
let dy = point.y - this.y
return dx*dx + dy*dy < this.rSquared
}
draw() {
ctx.beginPath();
ctx.arc(this.x, this.y, this.radius, 0, 2 * Math.PI);
ctx.stroke();
}
subdivide() {
const halfR = this.radius / 2.0
const smallR = this.radius * Math.SQRT2 / 2.0
return [
new Circle(this.x-halfR, this.y-halfR, smallR),
new Circle(this.x-halfR, this.y+halfR, smallR),
new Circle(this.x+halfR, this.y-halfR, smallR),
new Circle(this.x+halfR, this.y+halfR, smallR),
]
}
}
// this class keeps a set of random points and answers countInCircle()
// solutions may only call countInCircle()
class Puzzler {
constructor(count) {
this.points = []
for (let i=0; i<count; i++) {
let point = { x: Math.random()*width, y: Math.random()*height}
this.points.push(point)
}
}
// answer how many points fall inside circle
countInCircle(circle) {
return this.points.reduce((total, p) => total += circle.contains(p) ? 1 : 0, 0);
}
drawSolution(circles) {
// draw the random points
this.points.map(p => ctx.fillRect(p.x,p.y,2,2))
// draw the circles in the solution
ctx.strokeStyle = 'lightgray'
circles.forEach(circle => circle.draw())
// log some stats - commented a few of these out for snippet brevity
const counts = circles.map(circle => this.countInCircle(circle));
console.log('circles:', circles.length)
// console.log('counts:', counts.join(', '))
// console.log('counts above 100:', counts.filter(c => c > 100).length)
const averageCount = counts.reduce((a, b) => a + b) / counts.length
console.log('average count:', averageCount.toFixed(2))
const uncovered = this.points.reduce((total, point) => {
return total + (circles.some(circle => circle.contains(point)) ? 0 : 1)
}, 0)
console.log('uncovered points:', uncovered)
}
}
// setup canvas
const canvas = document.getElementById('canvas')
const { width, height } = canvas
const ctx = canvas.getContext('2d')
// setup puzzle
const count = 1000
const maxCountPer = 100
const puzzler = new Puzzler(count)
// begin with an encompasing circle, subdivide and solve recursively
// until all subdivided circles meet the count criterion
let r = width*Math.SQRT2/2
let c = new Circle(width/2.0, height/2.0, r)
let solution = solve(c);
function solve(circle) {
let result = []
let count = puzzler.countInCircle(circle)
if (count > 0 && count <= maxCountPer) {
result.push(circle);
} else if (count > maxCountPer) {
circle.subdivide().forEach(c => {
result.push(...solve(c))
})
}
return result
}
requestAnimationFrame(() => {
ctx.clearRect(0, 0, canvas.width, canvas.height);
puzzler.drawSolution(solution)
});
<h1>Circles Puzzle</h1>
<canvas style="border: 1px solid gray;" id="canvas" height="800" width="800"></canvas>

Assumption: When you pick a point and a radius, you get back a list of points that are in the containing circle. I.e., you know which points are covered by which circles.
If that's correct, then you can map out the approximate relative location of all points, after which answers to this similar question should carry you over the finish line.
To map out the relative location of all points:
Note that you can find the distance between any pair of points by centering your circle on one and using binary search on your radius to find the distance to the other within whatever precision you want to use.
Next choose three arbitrary points that aren't too close together. Pick an arbitrary point. Grow the radius, say to 1/4. Pick an arbitrary point close to that radius (by incrementing radius a bit to get another point, or using binary search on radius). Say the distance between these first two points is d. Pick a third point at distance >= d from the first two points but ideally close to d, again by incrementing the two radii or binary search on the same.
Now you have a roughly equilateral triangle. It isn't important that it's equilateral, but it is important that the points aren't very close, and aren't co-linear.
Next, give these three points coordinates. Say the first point is at (0,0), the second point is at (0, dist to first point). The third point will have two possible locations based on its distance from the first two. Choose the one in the first quadrant (arbitrarily).
All other points can now be positioned relative to this triangle by finding their distances to the points of the triangle.
For purposes of your problem, it doesn't matter that the cloud of points is rotated relative to the input, or that we don't know where the unit square is relative to the points. You have a cloud of points with (approximately) known coordinates, and can proceed accordingly.
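The two building blocks above can be sketched in Python as follows. This is an illustration under the same assumption as the answer (the query reveals which points are inside a circle). The names `distance_by_bisection`, `locate` and the `target_inside` callback are hypothetical, and the third anchor must not lie on the line through the first two:

```python
import math

def distance_by_bisection(target_inside, hi=1.5, tol=1e-9):
    """Smallest radius (within tol) at which target_inside(radius) is True.

    target_inside(radius) is an assumed callback that runs one circle query
    centred on the first point and reports whether the second point is in it.
    hi=1.5 exceeds the unit square's diagonal, so it is a safe upper bound.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if target_inside(mid):
            hi = mid
        else:
            lo = mid
    return hi

def locate(d, p3, r1, r2, r3):
    """Position of a point from its distances r1, r2, r3 to the anchors
    (0, 0), (0, d) and p3 = (x3, y3), with x3 != 0."""
    x3, y3 = p3
    # Subtracting the circle equations of the first two anchors gives y.
    y = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    # Subtracting the equations of the first and third anchors gives x.
    x = (r1 * r1 - r3 * r3 + x3 * x3 + y3 * y3 - 2.0 * y * y3) / (2.0 * x3)
    return x, y
```

Each call to `distance_by_bisection` costs about log2(hi / tol) circle queries, so a looser tolerance cuts the query count considerably.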

Polar Coordinate Map Generation

I am currently working on a map generation algorithm for my game. I have a basic idea of what I want it to do and how it would generate the map.
I want to use the polar coordinate system. I want a circular graph so that each player spawns on the edge of the circle, evenly spread out.
The algorithm should generate "cities" spread out across the circle (but only inside the circle). Each city should be connected in some way.
The size of the circle should depend on the number of players.
Everything should be random, meaning if I run
GenerateMap()
two times, it should not give the same results.
Here is a picture showing what I want: img
The red arrows are pointing to the cities and the lines are the connections between the cities.
How would I go about creating an algorithm based on the above?
Update: Sorry the link was broken. Fixed it.

I see the cities like this:
compute sizes and constants from N
as your cities should have a constant average density, the radius can be computed from it directly, since it scales linearly with the average or minimum city distance.
loop N (cities) times
generate random (x,y) with uniform distribution
throw away iterations where (x,y) is outside the circle
throw away iterations where (x,y) is too near to an already generated city
The paths are similar: just generate all possible paths (non-random) and throw away:
paths much longer than the average or minimum distance between cities (so only neighbors get connected)
paths that intersect an already generated path
In C++ code it could look like this:
//---------------------------------------------------------------------------
// some globals first
const int _max=128; // just max limit for cities and paths
const int R0=10; // city radius
const int RR=R0*R0*9; // min distance^2 between cities
int N=20; // number of cities
int R1=100; // map radius
struct _city { int x,y; }; // all the data you need for city
_city city[_max]; // list of cities
struct _path { int i0,i1; };// index of cities to join with path
_path path[_max]; // list of paths
int M=0; // number of paths in the list
//---------------------------------------------------------------------------
bool LinesIntersect(float X1,float Y1,float X2,float Y2,float X3,float Y3,float X4,float Y4)
{
if (fabs(X2-X3)+fabs(Y2-Y3)<1e-3) return false;
if (fabs(X1-X4)+fabs(Y1-Y4)<1e-3) return false;
float dx1,dy1,dx2,dy2;
dx1 = X2 - X1;
dy1 = Y2 - Y1;
dx2 = X4 - X3;
dy2 = Y4 - Y3;
float s,t,ax,ay,b;
ax=X1-X3;
ay=Y1-Y3;
b=(-(dx2*dy1)+(dx1*dy2)); if (fabs(b)>1e-3) b=1.0/b; else b=0.0;
s = (-(dy1*ax)+(dx1*ay))*b;
t = ( (dx2*ay)-(dy2*ax))*b;
if ((s>=0)&&(s<=1)&&(t>=0)&&(t<=1)) return true;
return false; // No collision
}
//---------------------------------------------------------------------------
// here generate n cities into circle at x0,y0
// compute R1,N from R0,RR,n
void genere(int x0,int y0,int n)
{
_city c;
_path p,*q;
int i,j,cnt,x,y,rr;
Randomize(); // init pseudo random generator
// [generate cities]
R1=(sqrt(RR*n)*8)/10;
rr=R1-R0; rr*=rr;
for (cnt=50*n,i=0;i<n;) // loop until all cities are generated
{
// watchdog
cnt--; if (cnt<=0) break;
// pseudo random position
c.x=Random(R1+R1)-R1; // <-r,+r>
c.y=Random(R1+R1)-R1; // <-r,+r>
// ignore cities outside R1 radius
if ((c.x*c.x)+(c.y*c.y)>rr) continue;
c.x+=x0; // position to center
c.y+=y0;
// ignore city if closer to any other city than sqrt(RR)
for (j=0;j<i;j++)
{
x=c.x-city[j].x;
y=c.y-city[j].y;
if ((x*x)+(y*y)<=RR) { j=-1; break; }
}
if (j<0) continue;
// add new city to list
city[i]=c; i++;
}
N=i; // just in case the watchdog kicks in
// [generate paths]
for (M=0,p.i0=0;p.i0<N;p.i0++)
for (p.i1=p.i0+1;p.i1<N;p.i1++)
{
// ignore too long path
x=city[p.i1].x-city[p.i0].x;
y=city[p.i1].y-city[p.i0].y;
if ((x*x)+(y*y)>5*RR) continue; // this constant determine the path density per city
// ignore intersecting path
for (q=path,i=0;i<M;i++,q++)
if ((q->i0!=p.i0)&&(q->i0!=p.i1)&&(q->i1!=p.i0)&&(q->i1!=p.i1))
if (LinesIntersect(
city[p.i0].x,city[p.i0].y,city[p.i1].x,city[p.i1].y,
city[q->i0].x,city[q->i0].y,city[q->i1].x,city[q->i1].y
)) { i=-1; break; }
if (i<0) continue;
// add path to list
if (M>=_max) break;
path[M]=p; M++;
}
}
//---------------------------------------------------------------------------
Here is an overview of the generated layout:
And the growth with N:
The blue circles are the cities, the gray area is the target circle, and the lines are the paths. The cnt variable is just a watchdog to avoid an infinite loop if the constants are wrong. Set the _max value so it is high enough for your N, or use dynamic allocation instead. There are many more paths than cities, so they could have a separate _max value to save memory (I was too lazy to add it).
You can use RandSeed to get procedurally generated maps ...
You can rescale the output to better match the circle layout after generation simply by finding the bounding box and rescaling to radius R1.
Some constants were obtained empirically, so play with them to achieve the best output for your purpose.
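For readers who do not use the Borland-style `Randomize()`/`Random()` calls, the city-placement step can be sketched in Python with an explicit seed. This is an illustrative re-expression of the same rejection sampling, not the author's code. The function name and parameters are assumptions, and it uses floating-point coordinates instead of integers:

```python
import random

def generate_cities(n, map_radius, min_dist, seed=None, tries_per_city=50):
    """Place up to n cities inside a circle by rejection sampling."""
    rng = random.Random(seed)      # same seed -> same map
    cities = []
    tries = tries_per_city * n     # watchdog against an infinite loop
    while len(cities) < n and tries > 0:
        tries -= 1
        x = rng.uniform(-map_radius, map_radius)
        y = rng.uniform(-map_radius, map_radius)
        # Reject samples outside the circle.
        if x * x + y * y > map_radius * map_radius:
            continue
        # Reject samples too close to an already placed city.
        if any((x - cx) ** 2 + (y - cy) ** 2 <= min_dist * min_dist
               for cx, cy in cities):
            continue
        cities.append((x, y))
    return cities
```

Calling `generate_cities(20, 100.0, 30.0, seed=1)` twice returns the same layout, while `seed=None` gives a different map on each run. As in the C++ version, fewer than n cities come back if the watchdog expires.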

How to smooth the blocks of a 3D voxel world?

In my (Minecraft-like) 3D voxel world, I want to smooth the shapes for more natural visuals. Let's look at this example in 2D first.
Left is how the world looks without any smoothing. The terrain data is binary and each voxel is rendered as a unit size cube.
In the center you can see a naive circular smoothing. It only takes the four directly adjacent blocks into account. It is still not very natural looking. Moreover, I'd like to have flat 45-degree slopes emerge.
On the right you can see a smoothing algorithm I came up with. It takes the eight direct and diagonal neighbors into account in order to come up with the shape of a block. I have the C++ code online. Here is the code that comes up with the control points that the bezier curve is drawn along.
#include <iostream>
#include <list>
#include <glm/glm.hpp>
using namespace std;
using namespace glm;
list<list<dvec2>> Points::find(ivec2 block)
{
// Control points
list<list<ivec2>> lines;
list<ivec2> *line = nullptr;
// Fetch blocks, neighbours start top left and count
// around the center block clock wise
int center = m_blocks->get(block);
int neighs[8];
for (int i = 0; i < 8; i++) {
auto coord = blockFromIndex(i);
neighs[i] = m_blocks->get(block + coord);
}
// Iterate over neighbour blocks
for (int i = 0; i < 8; i++) {
int current = neighs[i];
int next = neighs[(i + 1) % 8];
bool is_side = (((i + 1) % 2) == 0);
bool is_corner = (((i + 1) % 2) == 1);
if (line) {
// Border between air and ground needs a line
if (current != center) {
// Sides are cool, but corners get skipped when they don't
// stop a line
if (is_side || next == center)
line->push_back(blockFromIndex(i));
} else if (center || is_side || next == center) {
// Stop line since we found an end of the border. Always
// stop for ground blocks here, since they connect over
// corners so there must be open docking sites
line = nullptr;
}
} else {
// Start a new line for the border between air and ground that
// just appeared. However, corners get skipped if they don't
// end a line.
if (current != center) {
lines.emplace_back();
line = &lines.back();
line->push_back(blockFromIndex(i));
}
}
}
// Merge last line with first if touching. Only close around a differing corner for air
// blocks.
if (neighs[7] != center && (neighs[0] != center || (!center && neighs[1] != center))) {
// Skip first corner if enclosed
if (neighs[0] != center && neighs[1] != center)
lines.front().pop_front();
if (lines.size() == 1) {
// Close circle
auto first_point = lines.front().front();
lines.front().push_back(first_point);
} else {
// Insert last line into first one
lines.front().insert(lines.front().begin(), line->begin(), line->end());
lines.pop_back();
}
}
// Discard lines with too few points
auto i = lines.begin();
while (i != lines.end()) {
if (i->size() < 2)
lines.erase(i++);
else
++i;
}
// Convert to concrete points for output
list<list<dvec2>> points;
for (auto &line : lines) {
points.emplace_back();
for (auto &neighbour : line)
points.back().push_back(pointTowards(neighbour));
}
return points;
}
glm::ivec2 Points::blockFromIndex(int i)
{
// Returns first positive representant, we need this so that the
// conditions below "wrap around"
auto modulo = [](int i, int n) { return (i % n + n) % n; };
ivec2 block(0, 0);
// For two indices, zero is right so skip
if (modulo(i - 1, 4))
// The others are either 1 or -1
block.x = modulo(i - 1, 8) / 4 ? -1 : 1;
// Other axis is same sequence but shifted
if (modulo(i - 3, 4))
block.y = modulo(i - 3, 8) / 4 ? -1 : 1;
return block;
}
dvec2 Points::pointTowards(ivec2 neighbour)
{
dvec2 point;
point.x = static_cast<double>(neighbour.x);
point.y = static_cast<double>(neighbour.y);
// Convert from neighbour space into
// drawing space of the block
point *= 0.5;
point += dvec2(.5);
return point;
}
However, this is still in 2D. How to translate this algorithm into three dimensions?

You should probably have a look at the marching cubes algorithm and work from there. You can easily control the smoothness of the resulting blob:
Imagine that each voxel defines a field with a high density at its center, slowly fading to nothing as you move away from the center. For example, you could use a function that is 1 inside a voxel and goes to 0 two voxels away. Whatever function you choose, make sure that it is non-zero only inside a limited (preferably small) area.
For each point, sum the densities of all fields.
Use the marching cubes algorithm on the sum of those fields.
Use a high-resolution mesh for the algorithm.
To change the look/smoothness, change the density function and the threshold of the marching cubes algorithm. A possible extension to marching cubes that creates smoother meshes is the following: imagine that you encounter two points on an edge of a cube, where one point lies inside your volume (above the threshold) and the other outside (below the threshold). In this case many marching cubes implementations place the boundary exactly at the middle of the edge. Instead, one can calculate the exact boundary point, which gets rid of aliasing.
I would also recommend running a mesh simplification algorithm afterwards, since marching cubes produces meshes with many unnecessary triangles.
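The density field and the edge placement described above can be sketched in Python. The linear falloff, the 0.5 and 2.0 radii, and the function names are all assumptions chosen for illustration. The crossing point is found by linear interpolation along the edge, which is only exact if the field varies linearly between the two corners:

```python
import math

def kernel(dist, inner=0.5, outer=2.0):
    """Density contributed by one voxel: 1 near its center, 0 beyond outer."""
    if dist <= inner:
        return 1.0
    if dist >= outer:
        return 0.0
    return 1.0 - (dist - inner) / (outer - inner)

def field(point, solid_voxels):
    """Sum of the densities of all solid voxels at the given 3D point."""
    px, py, pz = point
    total = 0.0
    for vx, vy, vz in solid_voxels:
        dist = math.sqrt((px - vx) ** 2 + (py - vy) ** 2 + (pz - vz) ** 2)
        total += kernel(dist)
    return total

def edge_crossing(p0, p1, d0, d1, threshold):
    """Point on the edge p0-p1 where the interpolated density hits threshold.

    Assumes d0 and d1 lie on opposite sides of the threshold, so d0 != d1.
    """
    t = (threshold - d0) / (d1 - d0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

A marching cubes implementation would evaluate `field` at every grid corner and call `edge_crossing` for each cube edge whose corners straddle the threshold, instead of always using the edge midpoint.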

As an alternative to my answer above: you could also use NURBS or any algorithm for subdivision surfaces. Subdivision surface algorithms in particular are specialized for smoothing meshes. Depending on the algorithm and its configuration you will get smoother versions of your original mesh with
the same volume
the same surface
the same silhouette
and so on.

Use the 3D counterparts of Bézier curves, known as Bézier surfaces, or use the B-spline surface algorithms explained:
here
or
here

Circle Separation Distance - Nearest Neighbor Problem

I have a set of circles with given locations and radii on a two dimensional plane. I want to determine for every circle if it is intersecting with any other circle and the distance that is needed to separate the two. Under my current implementation, I just go through all the possible combinations of circles and then do the calculations. Unfortunately, this algorithm is O(n^2), which is slow.
The circles will generally be clustered in groups, and they will have similar (but different) radii. The approximate maximum for the number of circles is around 200. The algorithm does not have to be exact, but it should be close.
Here is a (slow) implementation I currently have in JavaScript:
// Makes a new circle
var circle = function(x,y,radius) {
return {
x:x,
y:y,
radius:radius
};
};
// These points are not representative of the true data set. I just made them up.
var points = [
circle(3,3,2),
circle(7,5,4),
circle(16,6,4),
circle(17,12,3),
circle(26,20,1)
];
var k = 0,
len = points.length;
for (var i = 0; i < len; i++) {
for (var j = k; j < len; j++) {
if (i !== j) {
var c1 = points[i],
c2 = points[j],
radiiSum = c1.radius+c2.radius,
deltaX = Math.abs(c1.x-c2.x);
if (deltaX < radiiSum) {
var deltaY = Math.abs(c1.y-c2.y);
if (deltaY < radiiSum) {
var distance = Math.sqrt(deltaX*deltaX+deltaY*deltaY);
if (distance < radiiSum) {
var separation = radiiSum - distance;
console.log(c1,c2,separation);
}
}
}
}
}
k++;
}
Also, I would appreciate it if you explained a good algorithm (KD Tree?) in plain English :-/

For starters, your algorithm above will be greatly sped up if you just skip the sqrt call. That's the best-known simple optimization for comparing distances. You can also precompute the squared radius so you don't redundantly recompute it.
There also appear to be a few other small bugs in your code. Here's my take on how to fix it below.
Finally, if you want to get rid of the O(N^2) algorithm, you can look at using a kd-tree. There's an upfront cost to building the kd-tree, but in return nearest-neighbor searches are much faster.
function Distance_Squared(c1, c2) {
var deltaX = (c1.x - c2.x);
var deltaY = (c1.y - c2.y);
return (deltaX * deltaX + deltaY * deltaY);
}
// Returns true if the bounding-box test proves the circles cannot intersect; returns false if they might intersect
function TrivialRejectIntersection(c1, c2) {
return ((c1.left >= c2.right) || (c1.right <= c2.left) || (c1.top >= c2.bottom) || (c1.bottom <= c2.top));
}
var circle = function(x,y,radius) {
return {
x:x,
y:y,
radius:radius,
// some helper properties
radius_squared : (radius*radius), // precompute the "squared distance"
left : (x-radius),
right: (x+radius),
top : (y - radius),
bottom : (y+radius)
};
};
// These points are not representative of the true data set. I just made them up.
var points = [
circle(3,3,2),
circle(7,5,4),
circle(16,6,4),
circle(17,12,3),
circle(26,20,1)
];
var len = points.length;
var c1, c2;
var distance_squared;
var min_distance_squared;
var separation;
for (var i = 0; i < len; i++) {
for (var j = (i+1); j < len; j++) {
c1 = points[i];
c2 = points[j];
// try for trivial rejections first. Jury is still out if this will help
if (TrivialRejectIntersection(c1, c2)) {
continue;
}
// distance_squared is the actual distance between the centers of c1 and c2, squared
distance_squared = Distance_Squared(c1, c2);
// min_distance_squared is how much "squared distance" is required for these two circles to not intersect
min_distance_squared = (c1.radius_squared + c2.radius_squared + (c1.radius*c2.radius*2)); // (r1 + r2)**2 == r1*r1 + r2*r2 + 2*r1*r2
// and so it follows
if (distance_squared < min_distance_squared) {
// intersection detected
// now subtract actual distance from "min distance"
separation = c1.radius + c2.radius - Math.sqrt(distance_squared);
console.log(c1, c2, separation);
}
}
}

This article has been dormant for a long time, but I've run into and solved this problem reasonably well, so will post so that others don't have to do the same head scratching.
You can treat the nearest circle neighbor problem as a 3d point nearest neighbor search in a kd-tree or octree. Define the distance between two circles A and B as
D(A,B) = sqrt( (xA - xB)^2 + (yA - yB)^2 ) - rA - rB
This is a negative quantity iff the circles overlap. For this discussion I'll assume an octree, but a kd-tree with k=3 is similar.
Store a triple (x,y,r) in the octree for each circle.
To find the nearest neighbor to a target circle T, use the standard algorithm:
def search(node, T, nst)
if node is a leaf
update nst with node's (x,y,r) nearest to T
else
for each cuboid C subdividing node (there are 8 of them)
if C contains any point nearer to T than nst
search(C, T, nst)
end
end
Here nst is a reference to the nearest circle to T found so far. Initially it's null.
The slightly tricky part is determining if C contains any point nearer to T than nst. For this it is sufficient to consider the unique point (x,y,r) within C that is Euclidean nearest to T in x and y and has the maximum value of the r range contained in the cuboid. In other words, the cuboid represents a set of circles with centers ranging over a rectangular area in x and y and with a range of radii. The point you want to check is the one representing the circle with center closest to T and with largest radius.
Note the radius of T plays no part in the algorithm at all. You're only concerned with how far inside any other circle the center of T lies. (I wish this had been as obvious at the start as it seems now...)
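The distance function and the pruning test can be sketched in Python. The tuple layout `(x, y, r)` follows the answer. The function names and the way a cuboid is passed as three `(low, high)` ranges are assumptions made for this illustration:

```python
import math

def circle_distance(a, b):
    """D(A,B): gap between two circles (x, y, r); negative if they overlap."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) - a[2] - b[2]

def cuboid_lower_bound(t, x_range, y_range, r_range):
    """Smallest D(T, C) over every circle C whose (x, y, r) lies in the cuboid.

    The best case is the center nearest to T in x and y combined with the
    largest radius in the cuboid's r range.
    """
    tx, ty, tr = t
    dx = max(x_range[0] - tx, 0.0, tx - x_range[1])
    dy = max(y_range[0] - ty, 0.0, ty - y_range[1])
    # tr is the same constant for every candidate, so it never changes
    # which cuboid or circle wins; it is kept only to match D(A,B).
    return math.hypot(dx, dy) - r_range[1] - tr
```

During the search a cuboid is visited only if `cuboid_lower_bound(T, ...)` is smaller than `circle_distance(T, nst)` for the best circle found so far.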

Finding closest non-black pixel in an image fast

I have a 2D image randomly and sparsely scattered with pixels.
Given a point on the image, I need to find the distance to the closest pixel that is not in the background color (black).
What is the fastest way to do this?
The only method I could come up with is building a kd-tree for the pixels, but I would really like to avoid such expensive preprocessing. Also, it seems that a kd-tree gives me more than I need: I only need the distance to something, and I don't care what that something is.

Personally, I'd ignore MusiGenesis' suggestion of a lookup table.
Calculating the distance between pixels is not expensive, particularly as for this initial test you don't need the actual distance so there's no need to take the square root. You can work with distance^2, i.e:
r^2 = dx^2 + dy^2
Also, if you're going outwards one pixel at a time remember that:
(n + 1)^2 = n^2 + 2n + 1
or if nx is the current value and ox is the previous value:
nx^2 = ox^2 + 2ox + 1
= ox^2 + 2(nx - 1) + 1
= ox^2 + 2nx - 1
=> nx^2 += 2nx - 1
It's easy to see how this works:
1^2 = 0 + 2*1 - 1 = 1
2^2 = 1 + 2*2 - 1 = 4
3^2 = 4 + 2*3 - 1 = 9
4^2 = 9 + 2*4 - 1 = 16
5^2 = 16 + 2*5 - 1 = 25
etc...
So, in each iteration you therefore need only retain some intermediate variables thus:
// w and h are the horizontal and vertical search extents
int dx2 = 0, dy2, r2;
for (int dx = 1; dx < w; ++dx) { // ignoring bounds checks
dx2 += (dx << 1) - 1;
dy2 = 0;
for (int dy = 1; dy < h; ++dy) {
dy2 += (dy << 1) - 1;
r2 = dx2 + dy2;
// do tests here
}
}
Tada! r^2 calculation with only bit shifts, adds and subtracts :)
Of course, on any decent modern CPU calculating r^2 = dx*dx + dy*dy might be just as fast as this...

As Pyro says, search the perimeter of a square that you keep moving out one pixel at a time from your original point (i.e. increasing the width and height by two pixels at a time). When you hit a non-black pixel, you calculate the distance (this is your first expensive calculation) and then continue searching outwards until the width of your box is twice the distance to the first found point (any points beyond this cannot possibly be closer than your original found pixel). Save any non-black points you find during this part, and then calculate each of their distances to see if any of them are closer than your original point.
In an ideal find, you only have to make one expensive distance calculation.
Update: Because you're calculating pixel-to-pixel distances here (instead of arbitrary precision floating point locations), you can speed up this algorithm substantially by using a pre-calculated lookup table (just a height-by-width array) to give you distance as a function of x and y. A 100x100 array costs you essentially 40K of memory and covers a 200x200 square around the original point, and spares you the cost of doing an expensive distance calculation (whether Pythagorean or matrix algebra) for every colored pixel you find. This array could even be pre-calculated and embedded in your app as a resource, to spare you the initial calculation time (this is probably serious overkill).
Update 2: Also, there are ways to optimize searching the square perimeter. Your search should start at the four points that intersect the axes and move one pixel at a time towards the corners (you have 8 moving search points, which could easily make this more trouble than it's worth, depending on your application's requirements). As soon as you locate a colored pixel, there is no need to continue towards the corners, as the remaining points are all further from the origin.
After the first found pixel, you can further restrict the additional search area required to the minimum by using the lookup table to ensure that each searched point is closer than the found point (again starting at the axes, and stopping when the distance limit is reached). This second optimization would probably be much too expensive to employ if you had to calculate each distance on the fly.
If the nearest pixel is within the 200x200 box (or whatever size works for your data), you will only search within a circle bounded by the pixel, doing only lookups and < / > comparisons.
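A minimal Python sketch of the basic expanding-square search, without the lookup table or the axis-first ordering. It assumes the image is a list of rows with 0 for black, that the query pixel itself is not a candidate, and it compares squared distances so no square root is needed. The function names are hypothetical:

```python
def ring(px, py, k):
    """Yield the pixels on the perimeter of the square of half-width k."""
    for x in range(px - k, px + k + 1):
        yield x, py - k
        yield x, py + k
    for y in range(py - k + 1, py + k):
        yield px - k, y
        yield px + k, y

def nearest_nonblack(image, px, py):
    """Return (x, y, squared_distance) of the nearest non-zero pixel, or None."""
    h, w = len(image), len(image[0])
    best = None
    max_k = max(px, w - 1 - px, py, h - 1 - py)
    for k in range(1, max_k + 1):
        # Every pixel on ring k is at least k away, so once k*k reaches the
        # best squared distance no later ring can contain a closer pixel.
        if best is not None and k * k >= best[2]:
            break
        for x, y in ring(px, py, k):
            if 0 <= x < w and 0 <= y < h and image[y][x]:
                d2 = (x - px) ** 2 + (y - py) ** 2
                if best is None or d2 < best[2]:
                    best = (x, y, d2)
    return best
```

The stopping test is what handles the corner problem: a pixel found at the corner of ring 3 (squared distance 18) does not end the search, because a pixel on the axis of ring 4 (squared distance 16) is still closer.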

You didn't specify how you want to measure distance. I'll assume L1 (rectilinear) because it's easier; possibly these ideas could be modified for L2 (Euclidean).
If you're only doing this for relatively few pixels, then just search outward from the source pixel in a spiral until you hit a nonblack one.
If you're doing this for many/all of them, how about this: Build a 2-D array the size of the image, where each cell stores the distance to the nearest nonblack pixel (and if necessary, the coordinates of that pixel). Do four line sweeps: left to right, right to left, bottom to top, and top to bottom. Consider the left to right sweep; as you sweep, keep a 1-D column containing the last nonblack pixel seen in each row, and mark each cell in the 2-D array with the distance to and/or coordinates of that pixel. O(n^2).
Alternatively, a k-d tree is overkill; you could use a quadtree. Only a little more difficult to code than my line sweep, a little more memory (but less than twice as much), and possibly faster.
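For the all-pixels case, one well-known way to fill such a distance array is the two-pass city-block distance transform. Note that this is a swapped-in standard technique, related to but not the same as the four line sweeps described above. A Python sketch, assuming the image is a list of rows with 0 for black and at least one non-black pixel:

```python
def l1_distance_transform(image):
    """L1 distance from every cell to the nearest non-zero pixel."""
    h, w = len(image), len(image[0])
    inf = h + w  # larger than any L1 distance inside the image
    dist = [[0 if image[y][x] else inf for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist
```

Both passes touch each cell once, so the cost is linear in the number of pixels, and afterwards every query is a single array lookup.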

Search for "nearest neighbor search"; the first two links on Google should help you.
If you are only doing this for one pixel per image, I think your best bet is just a linear search outwards, one pixel-wide box at a time. If your search box is square, you can't simply take the first point you find, so you have to be careful.

Yes, the Nearest neighbor search is good, but does not guarantee you'll find the 'nearest'. Moving one pixel out each time will produce a square search - the diagonals will be farther away than the horizontal / vertical. If this is important, you'll want to verify - continue expanding until the absolute horizontal has a distance greater than the 'found' pixel, and then calculate distances on all non-black pixels that were located.

Ok, this sounds interesting.
I made a C++ version of a solution; I don't know if this helps you. I think it works fast enough, as it's almost instant on an 800*600 matrix. If you have any questions just ask.
Sorry for any mistakes I've made, it's 10-minute code...
This is an iterative version (I was planning on making a recursive one too, but I've changed my mind).
The algorithm could be improved by not adding to the points array any point that is farther from the starting point than min_dist, but this involves calculating the distance from the starting point for each pixel (regardless of its color).
Hope that helps
//(c++ version)
#include<iostream>
#include<cmath>
#include<cstdlib> //rand, srand
#include<ctime>
using namespace std;
//ITERATIVE VERSION
//picture width&height
#define width 800
#define height 600
//indexes
int i,j;
//initial point coordinates
int x,y;
//variables to work with the array
int p,u;
//minimum dist
double min_dist=2000000000;
//array for memorising the points added
struct point{
int x;
int y;
} points[width*height];
double dist;
bool viz[width][height];
// direction vectors, used for adding adjacent points in the "points" array.
int dx[8]={1,1,0,-1,-1,-1,0,1};
int dy[8]={0,1,1,1,0,-1,-1,-1};
int k,nX,nY;
//we will generate an image with black&white pixels (0&1)
//sized width x height so that indexes up to width-1 and height-1 stay in bounds
bool image[width][height];
int main(){
srand(time(0));
//generate the random pic
for(i=1;i<=width-1;i++)
for(j=1;j<=height-1;j++)
if(rand()%10001<=9999) //9999/10000 chances of generating a black pixel
image[i][j]=0;
else image[i][j]=1;
//random coordinates for starting x&y
x=rand()%width;
y=rand()%height;
p=1;u=1;
points[1].x=x;
points[1].y=y;
while(p<=u){
for(k=0;k<=7;k++){
nX=points[p].x+dx[k];
nY=points[p].y+dy[k];
//nX&nY are the coordinates for the next point
//if we haven't added the point yet
//also check if the point is valid
if(nX>0&&nY>0&&nX<width&&nY<height)
if(viz[nX][nY] == 0 ){
//mark it as added
viz[nX][nY]=1;
//add it in the array
u++;
points[u].x=nX;
points[u].y=nY;
//if it's not black
if(image[nX][nY]!=0){
//calculate the distance
dist=(x-nX)*(x-nX) + (y-nY)*(y-nY);
dist=sqrt(dist);
//if the dist is shorter than the minimum, we save it
if(dist<min_dist)
min_dist=dist;
//you could save the coordinates of the point that has
//the minimum distance too, like sX=nX;, sY=nY;
}
}
}
p++;
}
cout<<"Minimum dist:"<<min_dist<<"\n";
return 0;
}

I'm sure this could be done better but here's some code that searches the perimeter of a square around the centre pixel, examining the centre first and moving toward the corners. If a pixel isn't found the perimeter (radius) is expanded until either the radius limit is reached or a pixel is found. The first implementation was a loop doing a simple spiral around the centre point but as noted that doesn't find the absolute closest pixel. SomeBigObjCStruct's creation inside the loop was very slow - removing it from the loop made it good enough and the spiral approach is what got used. But here's this implementation anyway - beware, little to no testing done.
It is all done with integer addition and subtraction.
- (SomeBigObjCStruct *)nearestWalkablePoint:(SomeBigObjCStruct)point {
typedef struct _testPoint { // using the IYMapPoint object here is very slow
int x;
int y;
} testPoint;
// see if the point supplied is walkable
testPoint centre;
centre.x = point.x;
centre.y = point.y;
NSMutableData *map = [self getWalkingMapDataForLevelId:point.levelId];
// check point for walkable (case radius = 0)
if(testThePoint(centre.x, centre.y, map) != 0) // bullseye
return point;
// radius is the distance from the location of point. A square is checked on each iteration, radius units from point.
// The point with y=0 or x=0 distance is checked first, i.e. the centre of the side of the square. A cursor variable
// is used to move along the side of the square looking for a walkable point. This proceeds until a walkable point
// is found or the side is exhausted. Sides are checked until radius is exhausted at which point the search fails.
int radius = 1;
BOOL leftWithinMap = YES, rightWithinMap = YES, upWithinMap = YES, downWithinMap = YES;
testPoint leftCentre, upCentre, rightCentre, downCentre;
testPoint leftUp, leftDown, rightUp, rightDown;
testPoint upLeft, upRight, downLeft, downRight;
leftCentre = rightCentre = upCentre = downCentre = centre;
int foundX = -1;
int foundY = -1;
while(radius < 1000) {
// radius increases. move centres outward
if(leftWithinMap == YES) {
leftCentre.x -= 1; // move left
if(leftCentre.x < 0) {
leftWithinMap = NO;
}
}
if(rightWithinMap == YES) {
rightCentre.x += 1; // move right
if(!(rightCentre.x < kIYMapWidth)) {
rightWithinMap = NO;
}
}
if(upWithinMap == YES) {
upCentre.y -= 1; // move up
if(upCentre.y < 0) {
upWithinMap = NO;
}
}
if(downWithinMap == YES) {
downCentre.y += 1; // move down
if(!(downCentre.y < kIYMapHeight)) {
downWithinMap = NO;
}
}
// set up cursor values for checking along the sides of the square
leftUp = leftDown = leftCentre;
leftUp.y -= 1;
leftDown.y += 1;
rightUp = rightDown = rightCentre;
rightUp.y -= 1;
rightDown.y += 1;
upRight = upLeft = upCentre;
upRight.x += 1;
upLeft.x -= 1;
downRight = downLeft = downCentre;
downRight.x += 1;
downLeft.x -= 1;
// check centres
if(testThePoint(leftCentre.x, leftCentre.y, map) != 0) {
foundX = leftCentre.x;
foundY = leftCentre.y;
break;
}
if(testThePoint(rightCentre.x, rightCentre.y, map) != 0) {
foundX = rightCentre.x;
foundY = rightCentre.y;
break;
}
if(testThePoint(upCentre.x, upCentre.y, map) != 0) {
foundX = upCentre.x;
foundY = upCentre.y;
break;
}
if(testThePoint(downCentre.x, downCentre.y, map) != 0) {
foundX = downCentre.x;
foundY = downCentre.y;
break;
}
int i;
for(i = 0; i < radius; i++) {
if(leftWithinMap == YES) {
// LEFT Side - stop short of top/bottom rows because up/down horizontal cursors check that line
// if cursor position is within map
if(i < radius - 1) {
if(leftUp.y > 0) {
// check it
if(testThePoint(leftUp.x, leftUp.y, map) != 0) {
foundX = leftUp.x;
foundY = leftUp.y;
break;
}
leftUp.y -= 1; // moving up
}
if(leftDown.y < kIYMapHeight) {
// check it
if(testThePoint(leftDown.x, leftDown.y, map) != 0) {
foundX = leftDown.x;
foundY = leftDown.y;
break;
}
leftDown.y += 1; // moving down
}
}
}
if(rightWithinMap == YES) {
// RIGHT Side
if(i < radius - 1) {
if(rightUp.y > 0) {
if(testThePoint(rightUp.x, rightUp.y, map) != 0) {
foundX = rightUp.x;
foundY = rightUp.y;
break;
}
rightUp.y -= 1; // moving up
}
if(rightDown.y < kIYMapHeight) {
if(testThePoint(rightDown.x, rightDown.y, map) != 0) {
foundX = rightDown.x;
foundY = rightDown.y;
break;
}
rightDown.y += 1; // moving down
}
}
}
if(upWithinMap == YES) {
// UP Side
if(upRight.x < kIYMapWidth) {
if(testThePoint(upRight.x, upRight.y, map) != 0) {
foundX = upRight.x;
foundY = upRight.y;
break;
}
upRight.x += 1; // moving right
}
if(upLeft.x > 0) {
if(testThePoint(upLeft.x, upLeft.y, map) != 0) {
foundX = upLeft.x;
foundY = upLeft.y;
break;
}
upLeft.x -= 1; // moving left
}
}
if(downWithinMap == YES) {
// DOWN Side
if(downRight.x < kIYMapWidth) {
if(testThePoint(downRight.x, downRight.y, map) != 0) {
foundX = downRight.x;
foundY = downRight.y;
break;
}
downRight.x += 1; // moving right
}
if(downLeft.x > 0) {
if(testThePoint(downLeft.x, downLeft.y, map) != 0) {
foundX = downLeft.x;
foundY = downLeft.y;
break;
}
downLeft.x -= 1; // moving left
}
}
}
if(foundX != -1 && foundY != -1) {
break;
}
radius++;
}
// build the return object
if(foundX != -1 && foundY != -1) {
SomeBigObjCStruct *foundPoint = [SomeBigObjCStruct mapPointWithX:foundX Y:foundY levelId:point.levelId];
foundPoint.z = [self zWithLevelId:point.levelId];
return foundPoint;
}
return nil;
}

You can combine many ways to speed it up.
A way to accelerate the pixel lookup is to use what I call a spatial lookup map. It is basically a downsampled map of the image in which each cell summarizes a block of pixels (say 8x8, but it's a tradeoff). Values can be "no pixels set", "partial pixels set" or "all pixels set". This way one read can tell if a block/cell is full, partially full or empty.
Scanning a box/rectangle around the center may not be ideal because there are many pixels/cells which are far away. I use a circle drawing algorithm (Bresenham) to reduce the overhead.
Reading the raw pixel values can happen in horizontal batches, for example a byte (for a cell size of 8x8 or multiples of it), dword or long. This should give you a serious speedup again.
You can also use multiple levels of "spatial lookup maps"; it's again a tradeoff.
For the distance calculation the mentioned lookup table can be used, but it's a (cache) bandwidth vs calculation speed tradeoff (I don't know how it performs on a GPU, for example).
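As a rough illustration of the "spatial lookup map" idea, the sketch below builds one three-state entry per block of pixels. The names `Block` and `buildBlockMap`, the block-size parameter `B`, and the row-major image layout are assumptions for the example, not part of the answer.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

enum class Block : std::uint8_t { Empty, Partial, Full };

// Builds a coarse occupancy map with one entry per B x B block of the image.
// `img` is row-major with w*h entries; nonzero means the pixel is set.
// Blocks on the right and bottom edges may cover fewer than B*B pixels.
std::vector<Block> buildBlockMap(const std::vector<int>& img, int w, int h, int B) {
    int bw = (w + B - 1) / B, bh = (h + B - 1) / B;  // blocks per row / column
    std::vector<Block> map(bw * bh, Block::Empty);
    for (int by = 0; by < bh; ++by)
        for (int bx = 0; bx < bw; ++bx) {
            int set = 0, total = 0;
            for (int y = by * B; y < std::min(h, (by + 1) * B); ++y)
                for (int x = bx * B; x < std::min(w, (bx + 1) * B); ++x) {
                    ++total;
                    if (img[y * w + x]) ++set;
                }
            map[by * bw + bx] = set == 0     ? Block::Empty
                              : set == total ? Block::Full
                                             : Block::Partial;
        }
    return map;
}
```

A search can then skip any block marked `Empty` with a single read and only descend to individual pixels in `Partial` blocks.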

Another approach I have investigated and likely will stick to: Utilizing the Bresenham circle algorithm.
It is surprisingly fast as it saves you any sort of distance comparisons!
You effectively just draw bigger and bigger circles around your target point, so that the first time you encounter a non-black pixel you automatically know it is the closest, saving any further checks.
What I have not verified yet is whether the Bresenham circle will catch every single pixel, but that wasn't a concern for my case as my pixels will occur in blobs of some sort.
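A minimal sketch of the growing-circle idea, using the integer midpoint (Bresenham-style) circle walk. The function names, the `maxR` limit and the row-major image layout are assumptions. As noted above, coverage is not guaranteed: concentric integer-radius circles can skip some pixels, so treat the result as approximate unless the targets form blobs. The centre pixel itself is not tested here.

```cpp
#include <vector>

// Tests the eight symmetric points for one step (x, y) of the circle walk.
// Writes the first nonblack pixel found into (fx, fy) and returns true.
static bool hitOctants(const std::vector<int>& img, int w, int h,
                       int cx, int cy, int x, int y, int& fx, int& fy) {
    const int px[8] = { x, -x,  x, -x,  y, -y,  y, -y};
    const int py[8] = { y,  y, -y, -y,  x,  x, -x, -x};
    for (int i = 0; i < 8; ++i) {
        int qx = cx + px[i], qy = cy + py[i];
        if (qx >= 0 && qx < w && qy >= 0 && qy < h && img[qy * w + qx]) {
            fx = qx; fy = qy;
            return true;
        }
    }
    return false;
}

// Walks midpoint circles of radius 1..maxR around (cx, cy) in a row-major
// w*h image (nonzero = nonblack) and stops at the first nonblack pixel touched.
bool firstOnCircles(const std::vector<int>& img, int w, int h,
                    int cx, int cy, int maxR, int& fx, int& fy) {
    for (int r = 1; r <= maxR; ++r) {
        int x = r, y = 0, err = 1 - r;  // integer-only decision variable
        while (x >= y) {
            if (hitOctants(img, w, h, cx, cy, x, y, fx, fy)) return true;
            ++y;
            if (err < 0) err += 2 * y + 1;
            else { --x; err += 2 * (y - x) + 1; }
        }
    }
    return false;
}
```

Only additions, subtractions and comparisons are used inside the loop, which is where the speed of this approach comes from.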

I would do a simple lookup table - for every pixel, precalculate distance to the closest non-black pixel and store the value in the same offset as the corresponding pixel. Of course, this way you will need more memory.
