Nearest neighbor - k-d tree - Wikipedia proof

On the wikipedia entry for k-d trees, an algorithm is presented for doing a nearest neighbor search on a k-d tree. What I don't understand is the explanation of step 3.2. How do you know there isn't a closer point just because the difference between the splitting coordinate of the search point and the current node is greater than the difference between the splitting coordinate of the search point and the current best?
Nearest neighbor search
[Animation of NN searching with a KD Tree in 2D]
The nearest neighbor (NN) algorithm aims to find the point in the tree which is nearest to a given input point. This search can be done efficiently by using the tree properties to quickly eliminate large portions of the search space. Searching for a nearest neighbor in a kd-tree proceeds as follows:
Starting with the root node, the algorithm moves down the tree recursively, in the same way that it would if the search point were being inserted (i.e. it goes right or left depending on whether the point is greater or less than the current node in the split dimension).
Once the algorithm reaches a leaf node, it saves that node point as the "current best".
The algorithm unwinds the recursion of the tree, performing the following steps at each node:
1. If the current node is closer than the current best, then it becomes the current best.
2. The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. In concept, this is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned, this is implemented as a simple comparison to see whether the difference between the splitting coordinate of the search point and the current node is less than the distance (over all coordinates) from the search point to the current best.
   1. If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
   2. If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
When the algorithm finishes this process for the root node, then the search is complete.
Generally the algorithm uses squared distances for comparison to avoid computing square roots. Additionally, it can save computation by holding the squared current best distance in a variable for comparison.

Look carefully at the 6th frame of the animation on that page.
As the algorithm is going back up the recursion, it is possible that there is a closer point on the other side of the hyperplane that it's on. We've checked one half, but there could be an even closer point on the other half.
Well, it turns out we can sometimes make a simplification. If it's impossible for there to be a point on the other half closer than our current best (closest) point, then we can skip that hyperplane half entirely. This simplification is the one shown on the 6th frame.
Figuring out whether this simplification is possible is done by comparing the distance from the hyperplane to our search location. Because the hyperplane is aligned to the axes, the shortest line from it to any other point will be a line along one dimension, so we can compare just the coordinate of the dimension that the hyperplane splits.
If the search point is farther from the hyperplane than it is from the current closest point, then there's no reason to search past that splitting coordinate.
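To make the test concrete, here is a minimal sketch of just that pruning check (my own illustration; the names are not from the article or this answer):
#include <cmath>

// The splitting plane sits at splitValue along dimension `axis`. The closest
// any point beyond that plane can possibly be to the query is the plain
// per-axis difference, so if even that exceeds the current best distance the
// whole far half can be skipped.
bool farSideCanBeSkipped(const double query[], int axis,
                         double splitValue, double bestDist)
{
    return std::fabs(query[axis] - splitValue) >= bestDist;
}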
Even if my explanation doesn't help, the graphic will. Good luck on your project!

Yes, the description of NN (Nearest Neighbour) search in a KD Tree on Wikipedia is a little hard to follow. It doesn't help that a lot of the top Google search results on NN KD Tree searches are just plain wrong!
Here's some C++ code to show you how to get it right:
template <class T, std::size_t N>
void KDTree<T,N>::nearest (
    const std::unique_ptr<const KDNode<T,N>> &node,
    const std::array<T, N> &point,                // looking for closest node to this point
    std::shared_ptr<const KDPoint<T,N>> &closest, // closest node (so far)
    double &minDist,
    const uint depth) const
{
    if (node->isLeaf()) {
        const double dist = distance(point, node->leaf->point);
        if (dist < minDist) {
            minDist = dist;
            closest = node->leaf;
        }
    } else {
        const std::size_t dim = depth % N;
        if (point[dim] < node->splitVal) {
            // search left first
            nearest(node->left, point, closest, minDist, depth + 1);
            if (point[dim] + minDist >= node->splitVal)
                nearest(node->right, point, closest, minDist, depth + 1);
        } else {
            // search right first
            nearest(node->right, point, closest, minDist, depth + 1);
            if (point[dim] - minDist <= node->splitVal)
                nearest(node->left, point, closest, minDist, depth + 1);
        }
    }
}
API for NN searching on a KD Tree:
// Nearest neighbour
template <class T, std::size_t N>
const KDPoint<T,N> KDTree<T,N>::nearest (const std::array<T, N> &point) const {
    std::shared_ptr<const KDPoint<T,N>> closest;
    double minDist = std::numeric_limits<double>::max();
    nearest(root, point, closest, minDist);
    return *closest;
}
Default distance function:
template <class T, std::size_t N>
double distance (const std::array<T, N> &p1, const std::array<T, N> &p2) {
    double d = 0.0;
    for (uint i = 0; i < N; ++i) {
        d += pow(p1[i] - p2[i], 2.0);
    }
    return sqrt(d);
}
Edit: some people are asking for help with the data structures too (not just the NN algorithm), so here is what I have used. Depending on your purpose, you might wish to modify the data structures slightly. (Note: you almost certainly do not want to modify the NN algorithm, though.)
KDPoint class:
template <class T, std::size_t N>
class KDPoint {
    public:
        KDPoint<T,N> (std::array<T,N> &&t) : point(std::move(t)) { };
        virtual ~KDPoint<T,N> () = default;
        std::array<T, N> point;
};
KDNode class:
template <class T, std::size_t N>
class KDNode
{
    public:
        KDNode () = delete;
        KDNode (const KDNode &) = delete;
        KDNode & operator = (const KDNode &) = delete;
        ~KDNode () = default;

        // branch node
        KDNode (const T split,
                std::unique_ptr<const KDNode> &lhs,
                std::unique_ptr<const KDNode> &rhs) : splitVal(split), left(std::move(lhs)), right(std::move(rhs)) { };

        // leaf node
        KDNode (std::shared_ptr<const KDPoint<T,N>> p) : splitVal(0), leaf(p) { };

        bool isLeaf (void) const { return static_cast<bool>(leaf); }

        // data members
        const T splitVal;
        const std::unique_ptr<const KDNode<T,N>> left, right;
        const std::shared_ptr<const KDPoint<T,N>> leaf;
};
KDTree class: (Note: you'll need to add a member function to build/fill your tree.)
template <class T, std::size_t N>
class KDTree {
    public:
        KDTree () = delete;
        KDTree (const KDTree &) = delete;
        KDTree (KDTree &&t) : root(std::move(const_cast<std::unique_ptr<const KDNode<T,N>>&>(t.root))) { };
        KDTree & operator = (const KDTree &) = delete;
        ~KDTree () = default;

        const KDPoint<T,N> nearest (const std::array<T, N> &point) const;

        // Nearest neighbour search - runs in O(log n)
        void nearest (const std::unique_ptr<const KDNode<T,N>> &node,
                      const std::array<T, N> &point,
                      std::shared_ptr<const KDPoint<T,N>> &closest,
                      double &minDist,
                      const uint depth = 0) const;

        // data members
        const std::unique_ptr<const KDNode<T,N>> root;
};

Related

KD-Tree which nodes are visited when finding nearest neighbor

Given these points (7,3), (10,5), (9,0), (5,8), (3,2), (8,1), I need to create a balanced KD Tree such that the first level of the KD Tree is split along the x-axis, and when there are two medians we pick the “larger” one as the root of subtree. After building it I need to list the nodes that get visited when trying to find the nearest neighbor of the point (2,4). Here is the tree I built using given points above:
Here is the KD-Tree I've built
I'm very confused about finding the nearest neighbor, and I have to list the nodes that get visited while the tree is searching for the point (2,4). So far I think it visits (8,1) -> (7,3) -> (5,8). But what comes after that? Which nodes get visited?
Your k-d tree is correct.
Nearest-neighbor algorithm
The k-d tree nearest-neighbor search traverses the tree by alternating between two phases:
Go-down phase:
Note the distance between your input point and the current node in the tree (the actual distance, not the distance on an axis).
Alternating between x-axis and y-axis, compare the axis-coordinate of your input point with the axis-coordinate of the current node in the tree to determine which sub-tree to go down to.
Repeat Go-down phase until you reach the bottom of the tree, then enter Go-back-up phase.
Go-back-up phase:
Go up one level. If you can't go up, you are done.
If you already did a Go-down phase on both sub-trees of the current node, repeat Go-back-up.
If the actual distance to the best neighbor you have found so far is smaller than the axis-distance between your input node and the current node in the tree, repeat Go-back-up.
Otherwise, enter Go-down phase on the sub-tree opposite to the one from where you came.
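(In code these two phases are simply the descent and the unwinding of a recursive search. A rough self-contained sketch of my own, with made-up names and a point stored in every node as in the example tree below:)
#include <cmath>

struct Node {                                  // hypothetical 2-d tree node
    double pt[2];                              // the point stored at this node
    Node *left = nullptr, *right = nullptr;
};

static double dist(const double a[2], const double b[2]) {
    return std::hypot(a[0] - b[0], a[1] - b[1]);
}

// Go-down phase = the descent; Go-back-up phase = the work done while the
// recursion unwinds. Call with best = nullptr and bestDist = a huge value.
void nnSearch(const Node *node, const double query[2], int depth,
              const Node *&best, double &bestDist)
{
    if (!node) return;

    const double d = dist(node->pt, query);    // actual distance, not per-axis
    if (d < bestDist) { bestDist = d; best = node; }

    const int axis = depth % 2;                // alternate x (0) and y (1)
    const Node *nearSide = query[axis] < node->pt[axis] ? node->left  : node->right;
    const Node *farSide  = query[axis] < node->pt[axis] ? node->right : node->left;
    nnSearch(nearSide, query, depth + 1, best, bestDist);

    // Only descend into the far side if the splitting line is closer than the
    // best neighbor found so far; otherwise keep going back up.
    if (std::fabs(query[axis] - node->pt[axis]) < bestDist)
        nnSearch(farSide, query, depth + 1, best, bestDist);
}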
Your example
Here is a sketch of your k-d tree to make it more clear:
Applying these steps on your example tree and input node (2,4):
Start with Go-down phase at the root node (8,1).
Distance between (8,1) and input node (2,4) is 6.708, so (8,1) is our currently known nearest neighbor. The current axis is x, so we compare 8 and 2 and we see we have to go to the left sub-tree.
Current node is (7,3). Distance between (7,3) and input node (2,4) is 5.099, which is smaller than the previous best-known distance, so (7,3) becomes our new nearest neighbor. The current axis is y, so we compare 3 and 4 and we see we have to go to the right sub-tree.
Current node is (5,8). Distance between (5,8) and input node (2,4) is 5.000, which is smaller than the previous best-known distance, so (5,8) becomes our new nearest neighbor. Current axis is x, but we cannot go down any further, so we enter Go-back-up phase.
We go back up to (7,3). Current axis is y. The y-distance between (7,3) and the input node (2,4) is |3-4| = 1, which is smaller than 5, the distance to the currently known nearest neighbor. Therefore, we have to enter Go-down phase on the other sub-tree. You can see it in the picture: the distance between the input point (S) and (5,8) is greater than the distance between (S) and the separation line that goes through (7,3). This means there could be a nearer neighbor on the other side of the separation line.
Current node is (3,2). Distance between (3,2) and input node (2,4) is 2.236 which is better than the previously known best distance, so (3,2) becomes our currently known closest neighbor. The current axis is x, but we cannot go any further, so we enter Go-back-up phase.
We go back up to (7,3). Current axis is y. We visited both sub-trees of this node, so we repeat Go-back-up phase.
We go back up to (8,1). Current axis is x. The x-distance between (8,1) and the input node (2,4) is |8-2| = 6, which is larger than the distance to the currently known closest neighbor, so we repeat Go-back-up phase. You can again see it in the picture: The distance between the input point (S) and the current nearest neighbor (3,2) is smaller than the distance between (S) and the separation line that goes through (8,1). This means that there cannot be a nearer neighbor on the other side of the separation line.
We cannot go up any further, so we are done.
The nodes we visited were: (8,1), (7,3), (5,8), (7,3), (3,2), (7,3), (8,1). The nearest neighbor we found was (3,2) with distance 2.236.
I used this code for a similar issue; of course you should use x and y instead of lat and lon. I hope it helps.
// imports assumed by this snippet; @Nonnull / @Nullable are the JSR-305 annotations
import static java.lang.Math.*;
import java.util.*;
import javax.annotation.*;

class LocationKDTree {
private static final int K = 3; // 3-d tree
private final Node tree;
public LocationKDTree(@Nonnull final List<Location> locations) {
final List<Node> nodes = new ArrayList<>(locations.size());
for (final Location location : locations) {
nodes.add(new Node(location));
}
tree = buildTree(nodes, 0);
}
@Nullable
public Location findNearest(final double latitude, final double longitude) {
final Node node = findNearest(tree, new Node(latitude, longitude), 0);
return node == null ? null : node.location;
}
private static Node findNearest(final Node current, final Node target, final int depth) {
final int axis = depth % K;
final int direction = getComparator(axis).compare(target, current);
final Node next = (direction < 0) ? current.left : current.right;
final Node other = (direction < 0) ? current.right : current.left;
Node best = (next == null) ? current : findNearest(next, target, depth + 1);
if (current.euclideanDistance(target) < best.euclideanDistance(target)) {
best = current;
}
if (other != null) {
if (current.verticalDistance(target, axis) < best.euclideanDistance(target)) {
final Node possibleBest = findNearest(other, target, depth + 1);
if (possibleBest.euclideanDistance(target) < best.euclideanDistance(target)) {
best = possibleBest;
}
}
}
return best;
}
@Nullable
private static Node buildTree(final List<Node> items, final int depth) {
if (items.isEmpty()) {
return null;
}
Collections.sort(items, getComparator(depth % K));
final int index = items.size() / 2;
final Node root = items.get(index);
root.left = buildTree(items.subList(0, index), depth + 1);
root.right = buildTree(items.subList(index + 1, items.size()), depth + 1);
return root;
}
private static class Node {
Node left;
Node right;
Location location;
final double[] point = new double[K];
Node(final double latitude, final double longitude) {
point[0] = (double) (cos(toRadians(latitude)) * cos(toRadians(longitude)));
point[1] = (double) (cos(toRadians(latitude)) * sin(toRadians(longitude)));
point[2] = (double) (sin(toRadians(latitude)));
}
Node(final Location location) {
this(location.latitude, location.longitude);
this.location = location;
}
double euclideanDistance(final Node that) {
final double x = this.point[0] - that.point[0];
final double y = this.point[1] - that.point[1];
final double z = this.point[2] - that.point[2];
return x * x + y * y + z * z;
}
double verticalDistance(final Node that, final int axis) {
final double d = this.point[axis] - that.point[axis];
return d * d;
}
}
private static Comparator<Node> getComparator(final int i) {
return NodeComparator.values()[i];
}
private static enum NodeComparator implements Comparator<Node> {
x {
@Override
public int compare(final Node a, final Node b) {
return Double.compare(a.point[0], b.point[0]);
}
},
y {
@Override
public int compare(final Node a, final Node b) {
return Double.compare(a.point[1], b.point[1]);
}
},
z {
@Override
public int compare(final Node a, final Node b) {
return Double.compare(a.point[2], b.point[2]);
}
}
}
}
and location class :
class Location {
public double latitude;
public double longitude;
public String name;
}

How to correct winding of triangles to counter-clockwise direction of a 3D Mesh model?

First of all, let me be clear: I am not asking about a 2D mesh; determining the winding order of a 2D mesh is very easy using the z direction of the normal.
Second, I am not asking for an optimized algorithm. I do not care about time or speed; I just want to get it working on my mesh.
This problem happens when I triangulate a 3D object using the Greedy Projection Triangulation algorithm.
Check the attached images.
If I apply 2D approaches to this model, such as "calculate the signed area" or "the cross product of the AB and BC vectors of a triangle", they only solve the 2D mesh case, but what about a 3D mesh?
First we need to check which triangles of the 3D mesh have the wrong winding direction, and then only consider those triangles. So the issue is: how can we check which triangles have the wrong winding direction in 3D? We cannot just use the 2D approach; I have tested it without success.
For example, in the case of a sphere, we cannot apply the 2D approach.
So is there any way to solve this issue?
Thanks.
Update # 1:
Below is the algorithm to check which edge has the same winding. It doesn't work well, and I don't know why. Theoretically it should correct all the triangles, but it doesn't. For example, see the sphere case in the attached figure. Something is wrong with it.
void GLReversedEdge(int i, int j, GLFace *temp)
{
//i'th triangle
int V1 = temp[i].v1;
int V2 = temp[i].v2;
int V3 = temp[i].v3;
//i'th triangle edges
int E1[] ={V1, V2};
int E2[] ={V2, V3};
int E3[] ={V3, V1};
//adjacent triangle
int jV1 = temp[j].v1;
int jV2 = temp[j].v2;
int jV3 = temp[j].v3;
//adjacent edges
int jE1[] ={jV1, jV2};
int jE2[] ={jV2, jV3};
int jE3[] ={jV3, jV1};
// 1st edge of adjacent triangle is checking with all edges of ith triangle
if((jE1[0] == E1[0] && jE1[1] == E1[1]) ||
(jE1[0] == E2[0] && jE1[1] == E2[1]) ||
(jE1[0] == E3[0] && jE1[1] == E3[1]))
{
temp[j].set(jV2, jV1, jV3); // 1st edges orientation is same, so reverse/swap it
}
// 2nd edge of adjacent triangle is checking with all edges of ith triangle
else if((jE2[0] == E1[0] && jE2[1] == E1[1]) ||
(jE2[0] == E2[0] && jE2[1] == E2[1]) ||
(jE2[0] == E3[0] && jE2[1] == E3[1]))
{
temp[j].set(jV1, jV3, jV2); // 2nd edges orientation is same, so reverse/swap it
}
// 3rd edge of adjacent triangle is checking with all edges of ith triangle
else if((jE3[0] == E1[0] && jE3[1] == E1[1]) ||
(jE3[0] == E2[0] && jE3[1] == E2[1]) ||
(jE3[0] == E3[0] && jE3[1] == E3[1]))
{
temp[j].set(jV3, jV2, jV1); // 3rd edges orientation is same, so reverse/swap it
}
}
void GetCorrectWindingOfMesh()
{
for(int i=0; i<nbF; i++)
{
int j1 = AdjacentTriangleToEdgeV1V2;
if(j1 >= 0) GLReversedEdge(i, j1, temp);
int j2 = AdjacentTriangleToEdgeV2V3;
if(j2 >= 0) GLReversedEdge(i, j2, temp);
int j3 = AdjacentTriangleToEdgeV3V1;
if(j3 >= 0) GLReversedEdge(i, j3, temp);
}
}
To retrieve neighboring information, let's assume we have a method that returns the neighbor of a triangle on a given edge, neighbor_on_egde(next_tria, edge).
That method can be implemented with, for each vertex, the information of which triangles use it. That is a dictionary structure that maps a vertex index to a list of triangle indices. It is easily created by passing through the list of triangles and, for each vertex of a triangle, adding the triangle's index to the right dictionary element.
Traversal is done by storing which triangles still have to be checked for orientation and which triangles are already checked. While there are triangles to check, check the next one and add its neighbors to be checked if they weren't checked already. Pseudocode looks like:
to_process = set of pairs (triangle, orientation edge);
             initial state is one correctly oriented triangle with any edge on it
processed = set of processed triangles; initially empty
while to_process is not empty:
    next_tria, orientation_edge = to_process.pop()
    add next_tria to processed
    if next_tria is not oriented opposite to orientation_edge:
        change next_tria (ABC) orientation (B <-> C)
    for each edge (AB) in next_tria:
        neighbor_tria = neighbor_on_egde(next_tria, edge)
        if neighbor_tria exists and neighbor_tria not in processed:
            to_process add (neighbor_tria, edge opposite oriented (BA))
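Here is a rough, self-contained C++ sketch of that traversal (my own illustration; the Tri type and the edge map that stands in for neighbor_on_egde are made up for the example, and it only makes the winding consistent - whether the normals end up pointing outwards still has to be decided separately):
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <queue>
#include <utility>
#include <vector>

// A triangle is a triple of vertex indices; its winding is the index order.
using Tri = std::array<int,3>;

// True if triangle t contains the directed edge a->b.
static bool hasDirectedEdge(const Tri &t, int a, int b) {
    for (int k = 0; k < 3; ++k)
        if (t[k] == a && t[(k + 1) % 3] == b)
            return true;
    return false;
}

// Propagate the winding of tris[0] over the whole connected mesh with a
// breadth-first traversal. Assumes a manifold surface (each edge shared by at
// most two triangles).
void makeWindingConsistent(std::vector<Tri> &tris) {
    if (tris.empty()) return;

    // Map each undirected edge to the triangles that use it.
    std::map<std::pair<int,int>, std::vector<std::size_t>> edgeToTris;
    for (std::size_t i = 0; i < tris.size(); ++i)
        for (int k = 0; k < 3; ++k) {
            const int a = tris[i][k], b = tris[i][(k + 1) % 3];
            edgeToTris[{std::min(a, b), std::max(a, b)}].push_back(i);
        }

    std::vector<bool> processed(tris.size(), false);
    std::queue<std::size_t> toProcess;
    toProcess.push(0);                 // tris[0] defines the reference winding
    processed[0] = true;

    while (!toProcess.empty()) {
        const std::size_t i = toProcess.front();
        toProcess.pop();
        for (int k = 0; k < 3; ++k) {
            const int a = tris[i][k], b = tris[i][(k + 1) % 3];   // edge a->b
            for (std::size_t j : edgeToTris[{std::min(a, b), std::max(a, b)}]) {
                if (j == i || processed[j]) continue;
                // A consistently wound neighbour must contain b->a; if it
                // contains a->b instead, flip it by swapping two indices.
                if (hasDirectedEdge(tris[j], a, b))
                    std::swap(tris[j][1], tris[j][2]);
                processed[j] = true;
                toProcess.push(j);
            }
        }
    }
}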
Does your mesh include edge adjacency information? i.e. each triangle T contains three vertices A,B,C and three edges AB, BC and CA, where AB is a link to the triangle T1 which shares common vertices A,B and includes a new vertex D. Something like
struct Vertex
{
double x,y,z;
};
struct Triangle
{
int vertices[3],edges[3];
};
struct TriangleMesh
{
Vertex Vertices[];
Triangle Triangles[];
};
If this is the case, for any triangle T = {{VA,VB,VC},{TAB,TBC,TCA}} with neighbour TE = &TAB at edge AB, A and B must appear in the reverse order for T and TE to have the same winding. e.g. TAB = {{VB,VA,VD},{TBA,TAD,TDB}} where TBA = &T. This can be used to give all the triangles the same winding.
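A minimal sketch of that check (my own helper, assuming the structs above with Triangles indexable, e.g. a std::vector, and that edges[e] is the neighbour across the edge vertices[e] -> vertices[(e+1)%3]):
// Returns true when the neighbour across edge e of triangle t lists the
// edge's two vertices in the same A->B order, i.e. its winding disagrees
// with T and it should be flipped.
bool neighbourNeedsFlip(const TriangleMesh &mesh, int t, int e)
{
    const Triangle &T  = mesh.Triangles[t];
    const Triangle &TE = mesh.Triangles[T.edges[e]];
    const int A = T.vertices[e];
    const int B = T.vertices[(e + 1) % 3];
    for (int k = 0; k < 3; ++k)
        if (TE.vertices[k] == A && TE.vertices[(k + 1) % 3] == B)
            return true;   // neighbour also holds A->B: windings differ
    return false;          // neighbour holds B->A: windings already agree
}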

Ray-triangle intersection

I saw that Fast Minimum Storage Ray/Triangle Intersection by Moller and Trumbore is frequently recommended.
The thing is, I don't mind pre-computing and storing any amount of data, as long as it speeds up the intersection.
So my question is, not caring about memory, what are the fastest methods of doing ray-triangle intersection?
Edit: I won't move the triangles, i.e. it is a static scene.
As others have mentioned, the most effective way to speed things up is to use an acceleration structure to reduce the number of ray-triangle intersections needed. That said, you still want your ray-triangle intersections to be fast. If you're happy to precompute stuff, you can try the following:
Convert your ray lines and your triangle edges to Plücker coordinates. This allows you to determine if your ray line passes through a triangle at 6 multiply/add's per edge. You will still need to compare your ray start and end points with the triangle plane (at 4 multiply/add's per point) to make sure it actually hits the triangle.
Worst case runtime expense is 26 multiply/add's total. Also, note that you only need to compute the ray/edge sign once per ray/edge combination, so if you're evaluating a mesh, you may be able to use each edge evaluation twice.
Also, these numbers assume everything is being done in homogeneous coordinates. You may be able to reduce the number of multiplications some by normalizing things ahead of time.
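For what it's worth, here is a rough sketch of that test (my own illustration, not code from this answer; the Vec3/PluckerLine types and names are made up). A line through points P and Q is stored as direction D = Q - P and moment M = P x Q, and the "side" of two lines is D1.M2 + D2.M1; if the ray line has the same sign of this product against all three edges, taken in a consistent order, it passes through the triangle:
struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x}; }
static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

struct PluckerLine { Vec3 dir, moment; };        // 6 numbers per line

// Precompute once per edge (P -> Q); an edge can be shared by two triangles.
PluckerLine pluckerFromPoints(Vec3 p, Vec3 q) {
    return { sub(q, p), cross(p, q) };
}

// 6 multiplications and 5 additions, in line with the per-edge count above.
double side(const PluckerLine &a, const PluckerLine &b) {
    return dot(a.dir, b.moment) + dot(b.dir, a.moment);
}

// True if the infinite ray line passes through the triangle whose edges were
// precomputed in a consistent order (A->B, B->C, C->A). A separate test
// against the triangle plane is still needed to get t and reject hits behind
// the ray origin, as noted above.
bool rayLineHitsTriangle(const PluckerLine &ray, const PluckerLine &e0,
                         const PluckerLine &e1, const PluckerLine &e2) {
    const double s0 = side(ray, e0);
    const double s1 = side(ray, e1);
    const double s2 = side(ray, e2);
    return (s0 >= 0 && s1 >= 0 && s2 >= 0) ||
           (s0 <= 0 && s1 <= 0 && s2 <= 0);
}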
I have done a lot of benchmarks, and I can confidently say that the fastest (published) method is the one invented by Havel and Herout and presented in their paper Yet Faster Ray-Triangle Intersection (Using SSE4). Even without using SSE it is about twice as fast as Möller and Trumbore's algorithm.
My C implementation of Havel-Herout:
typedef struct {
vec3 n0; float d0;
vec3 n1; float d1;
vec3 n2; float d2;
} isect_hh_data;
void
isect_hh_pre(vec3 v0, vec3 v1, vec3 v2, isect_hh_data *D) {
vec3 e1 = v3_sub(v1, v0);
vec3 e2 = v3_sub(v2, v0);
D->n0 = v3_cross(e1, e2);
D->d0 = v3_dot(D->n0, v0);
float inv_denom = 1 / v3_dot(D->n0, D->n0);
D->n1 = v3_scale(v3_cross(e2, D->n0), inv_denom);
D->d1 = -v3_dot(D->n1, v0);
D->n2 = v3_scale(v3_cross(D->n0, e1), inv_denom);
D->d2 = -v3_dot(D->n2, v0);
}
inline bool
isect_hh(vec3 o, vec3 d, float *t, vec2 *uv, isect_hh_data *D) {
float det = v3_dot(D->n0, d);
float dett = D->d0 - v3_dot(o, D->n0);
vec3 wr = v3_add(v3_scale(o, det), v3_scale(d, dett));
uv->x = v3_dot(wr, D->n1) + det * D->d1;
uv->y = v3_dot(wr, D->n2) + det * D->d2;
float tmpdet0 = det - uv->x - uv->y;
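/* Note: int_or_float is presumably a punning union defined elsewhere in the
   author's code, e.g. typedef union { int i; float f; } int_or_float. The
   sign-bit juggling below accepts the hit only when tmpdet0, uv->x and uv->y
   all share the same sign, i.e. the (unnormalised) barycentric inside-test
   passes for either triangle orientation. ISECT_NEAR and ISECT_FAR further
   down are the author's valid-t range constants. */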
int pdet0 = ((int_or_float)tmpdet0).i;
int pdetu = ((int_or_float)uv->x).i;
int pdetv = ((int_or_float)uv->y).i;
pdet0 = pdet0 ^ pdetu;
pdet0 = pdet0 | (pdetu ^ pdetv);
if (pdet0 & 0x80000000)
return false;
float rdet = 1 / det;
uv->x *= rdet;
uv->y *= rdet;
*t = dett * rdet;
return *t >= ISECT_NEAR && *t <= ISECT_FAR;
}
One suggestion could be to implement the octree (http://en.wikipedia.org/wiki/Octree) algorithm to partition your 3D space into very fine blocks. The finer the partitioning, the more memory is required, but the better the accuracy of the tree.
You still need to check ray/triangle intersections, but the idea is that the tree can tell you when you can skip the ray/triangle intersection, because the ray is guaranteed not to hit the triangle.
However, if you start moving your triangle around, you need to update the Octree, and then I'm not sure it's going to save you anything.
Found this article by Dan Sunday:
Based on a count of the operations done up to the first rejection test, this algorithm is a bit less efficient than the MT (Möller & Trumbore) algorithm, [...]. However, the MT algorithm uses two cross products whereas our algorithm uses only one, and the one we use computes the normal vector of the triangle's plane, which is needed to compute the line parameter rI. But, when the normal vectors have been precomputed and stored for all triangles in a scene (which is often the case), our algorithm would not have to compute this cross product at all. But, in this case, the MT algorithm would still compute two cross products, and be less efficient than our algorithm.
http://geomalgorithms.com/a06-_intersect-2.html

Calculating length of objects in binary image - algorithm

I need to calculate the length of an object in a binary image (the maximum distance between the pixels inside the object). As it is a binary image, we might consider it a 2D array with values 0 (white) and 1 (black). What I need is a clever (and preferably simple) algorithm to perform this operation. Keep in mind there are many objects in the image.
The image to clarify:
Sample input image:
I think the problem is simple if the boundary of an object is convex and no three vertices are on a line (i.e. no vertex can be removed without changing the polygon): Then you can just pick two points at random and use a simple gradient-descent type search to find the longest line:
Start with random vertices A, B
See if the line A' - B is longer than A - B where A' is the point left of A; if so, replace A with A'
See if the line A' - B is longer than A - B where A' is the point right of A; if so, replace A with A'
Do the same for B
repeat until convergence
So I'd suggest finding the convex hull for each seed blob, removing all "superfluous" vertices (to ensure convergence) and running the algorithm above.
Constructing a convex hull is an O(n log n) operation IIRC, where n is the number of boundary pixels. Should be pretty efficient for small objects like these. EDIT: I just remembered that the O(n log n) for the convex hull algorithm was needed to sort the points. If the boundary points are the result of a connected component analysis, they are already sorted. So the whole algorithm should run in O(n) time, where n is the number of boundary points. (It's a lot of work, though, because you might have to write your own convex-hull algorithm or modify one to skip the sort.)
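Taken literally, the vertex-stepping search above might look like this quick sketch of my own (assuming the hull vertices are already ordered and deduplicated, as required):
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

static double dist2(const Pt &a, const Pt &b) {
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Local search for the longest chord of a convex polygon whose vertices are
// given in order, with collinear vertices removed.
double longestChord(const std::vector<Pt> &hull) {
    if (hull.size() < 2) return 0.0;
    const std::size_t n = hull.size();
    std::size_t a = 0, b = n / 2;          // two arbitrary starting vertices
    bool improved = true;
    while (improved) {
        improved = false;
        // candidate moves: step either endpoint to one of its hull neighbours
        const std::size_t moves[4][2] = {
            {(a + 1) % n, b}, {(a + n - 1) % n, b},
            {a, (b + 1) % n}, {a, (b + n - 1) % n}};
        for (const auto &m : moves) {
            if (dist2(hull[m[0]], hull[m[1]]) > dist2(hull[a], hull[b])) {
                a = m[0];
                b = m[1];
                improved = true;
            }
        }
    }
    return std::sqrt(dist2(hull[a], hull[b]));
}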
Add: Response to comment
If you don't need 100% accuracy, you could simply fit an ellipse to each blob and calculate the length of its major axis: this can be computed from central moments (IIRC it's essentially the square root of the largest eigenvalue of the covariance matrix), so it's an O(n) operation and can efficiently be calculated in a single sweep over the image. It has the additional advantage that it takes all pixels of a blob into account, not just two extremal points, i.e. it is far less affected by noise.
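A quick sketch of that moment-based estimate (my own helper, assuming the blob is given as a list of pixel coordinates; the factor of 4 turns the variance along the major axis into the full axis length of the equivalent ellipse):
#include <cmath>
#include <utility>
#include <vector>

double majorAxisLength(const std::vector<std::pair<double,double>> &pixels)
{
    if (pixels.empty()) return 0.0;
    const double n = static_cast<double>(pixels.size());

    // centroid
    double mx = 0.0, my = 0.0;
    for (const auto &p : pixels) { mx += p.first; my += p.second; }
    mx /= n; my /= n;

    // central second moments (covariance of the pixel coordinates)
    double cxx = 0.0, cyy = 0.0, cxy = 0.0;
    for (const auto &p : pixels) {
        const double dx = p.first - mx, dy = p.second - my;
        cxx += dx * dx; cyy += dy * dy; cxy += dx * dy;
    }
    cxx /= n; cyy /= n; cxy /= n;

    // larger eigenvalue of the 2x2 covariance matrix
    const double mean = (cxx + cyy) / 2.0;
    const double diff = std::sqrt((cxx - cyy) * (cxx - cyy) / 4.0 + cxy * cxy);
    const double lambdaMax = mean + diff;

    // a solid ellipse with semi-major axis a has variance a^2/4 along it,
    // so the full major-axis length is 4 * sqrt(lambdaMax)
    return 4.0 * std::sqrt(lambdaMax);
}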
Find the major-axis length of the ellipse that has the same normalized second central moments as the region. In MATLAB you can use regionprops.
A very crude, brute-force approach would be to first identify all the edge pixels (any black pixel in the object adjacent to a non-black pixel) and calculate the distances between all possible pairs of edge pixels. The longest of these distances will give you the length of the object.
If the objects are always shaped like the ones in your sample, you could speed this up by only evaluating the pixels with the highest and lowest x and y values within the object.
I would suggest trying a "reverse" distance transform. In the magical world of mathematical morphology (sorry, couldn't resist the alliteration) the distance transform gives you the closest distance from each pixel to its nearest boundary pixel. In your case, you are interested in the farthest distance to a boundary pixel, hence I have cleverly applied a "reverse" prefix.
You can find information on the distance transform here and here. I believe that MATLAB implements the distance transform as per here. That would lead me to believe that you can find an open source implementation of the distance transform in Octave. Furthermore, it would not surprise me in the least if OpenCV implemented it.
I haven't given it much thought, but it's intuitive to me that you should be able to reverse the distance transform and calculate it in roughly the same amount of time as the original distance transform.
I think you could consider using a breadth first search algorithm.
The basic idea is that you loop over each row and column in the image, and if you haven't visited the node (a node is a row and column with a colored pixel) yet, then you would run the breadth first search. You would visit each node you possibly could, and keep track of the max and min points for the object.
Here's some C++ sample code (untested):
#include <vector>
#include <queue>
#include <cmath>
#include <cstring>   // memset
#include <algorithm> // max
#include <utility>   // pair, make_pair
using namespace std;
// used to transition from given row, col to each of the
// 8 different directions
int dr[] = { -1, 0, 1, -1, 1, -1, 0, 1 };
int dc[] = { -1, -1, -1, 0, 0, 1, 1, 1 };
// WHITE or COLORED cells
const int WHITE = 0;
const int COLORED = 1;
// number of rows and columns
int nrows = 2000;
int ncols = 2000;
// assume G is the image
int G[2000][2000];
// the "visited array"
bool vis[2000][2000];
// get distance between 2 points
inline double getdist(double x1, double y1, double x2, double y2) {
double d1 = x1 - x2;
double d2 = y1 - y2;
return sqrt(d1*d1+d2*d2);
}
// this function performs the breadth first search
double bfs(int startRow, int startCol) {
queue< int > q;
q.push(startRow);
q.push(startCol);
vector< pair< int, int > > points;
while(!q.empty()) {
int r = q.front();
q.pop();
int c = q.front();
q.pop();
// already visited?
if (vis[r][c])
continue;
points.push_back(make_pair(r,c));
vis[r][c] = true;
// try all eight directions
for(int i = 0; i < 8; ++i) {
int nr = r + dr[i];
int nc = c + dc[i];
if (nr < 0 || nr >= nrows || nc < 0 || nc >= ncols)
continue; // out of bounds
if (G[nr][nc] != COLORED || vis[nr][nc])
continue; // only spread through unvisited object pixels
// push next node on queue
q.push(nr);
q.push(nc);
}
}
// the length is the maximum distance between any 2 points encountered in the BFS
double diff = 0;
for(int i = 0; i < (int)points.size(); ++i) {
for(int j = i+1; j < (int)points.size(); ++j) {
diff = max(diff,getdist(points[i].first,points[i].second,points[j].first,points[j].second));
}
}
return diff;
}
int main() {
vector< double > lengths;
memset(vis,false,sizeof vis);
for(int r = 0; r < nrows; ++r) {
for(int c = 0; c < ncols; ++c) {
if (G[r][c] == WHITE)
continue; // we don't care about cells without objects
if (vis[r][c])
continue; // we've already processed this object
// find the length of this object
double len = bfs(r,c);
lengths.push_back(len);
}
}
return 0;
}

Data structure for storing thousands of vectors

I have up to 10,000 randomly positioned points in a space and I need to be able to tell which one the cursor is closest to at any given time. To add some context, the points are in the form of a vector drawing, so they can be constantly and quickly added and removed by the user and may also be unevenly distributed across the canvas space.
I am therefore trying to find the most efficient data structure for storing and querying these points. I would like to keep this question language agnostic if possible.
After the Update to the Question
Use two Red-Black Tree or Skip List maps. Both are compact self-balancing data structures giving you O(log n) time for search, insert and delete operations. One map will use the X-coordinate of every point as a key and the point itself as a value, and the other will use the Y-coordinate as a key and the point itself as a value.
As a trade-off I suggest initially restricting the search area around the cursor to a square. For a perfect match the square side should equal the diameter of your "sensitivity circle" around the cursor, i.e. if you're interested only in a nearest neighbour within a 10 pixel radius of the cursor then the square side needs to be 20px. As an alternative, if you're after the nearest neighbour regardless of proximity, you might try finding the boundary dynamically by evaluating floor and ceiling relative to the cursor.
Then retrieve the two subsets of points from the maps that are within the boundaries, and merge them so that only the points within both subsets remain.
Loop through the result and calculate the proximity to each point (dx^2 + dy^2; avoid the square root since you're not interested in the actual distance, just the proximity) to find the nearest neighbour.
Take the square root of the proximity figure to measure the distance to the nearest neighbour, and see if it's greater than the radius of the "sensitivity circle"; if it is, it means there are no points within the circle.
I suggest doing some benchmarks for every approach; it's too easy to go over the top with optimisations. On my modest hardware (Core 2 Duo) a naïve single-threaded search for a nearest neighbour among 10K points, repeated a thousand times, takes 350 milliseconds in Java. As long as the overall UI reaction time is under 100 milliseconds it will seem instant to a user; keeping that in mind, even a naïve search might give you a sufficiently fast response.
Generic Solution
The most efficient data structure depends on the algorithm you’re planning to use, time-space trade off and the expected relative distribution of points:
If space is not an issue, the most efficient way may be to pre-calculate the nearest neighbour for each point on the screen and then store the nearest neighbour's unique id in a two-dimensional array representing the screen.
If time is not an issue, storing the 10K points in a simple 2D array and doing a naïve search every time, i.e. looping through each point and calculating the distance, may be a good, simple and easy-to-maintain option.
For a number of trade-offs between the two, here is a good presentation on various Nearest Neighbour Search options available: http://dimacs.rutgers.edu/Workshops/MiningTutorial/pindyk-slides.ppt
A bunch of good detailed materials for various Nearest Neighbour Search algorithms: http://simsearch.yury.name/tutorial.html, just pick one that suits your needs best.
So it's really impossible to evaluate the data structure in isolation from the algorithm, which in turn is hard to evaluate without a good idea of the task constraints and priorities.
Sample Java Implementation
import java.util.*;
import java.util.concurrent.ConcurrentSkipListMap;
class Test
{
public static void main (String[] args)
{
Drawing naive = new NaiveDrawing();
Drawing skip = new SkipListDrawing();
long start;
start = System.currentTimeMillis();
testInsert(naive);
System.out.println("Naive insert: "+(System.currentTimeMillis() - start)+"ms");
start = System.currentTimeMillis();
testSearch(naive);
System.out.println("Naive search: "+(System.currentTimeMillis() - start)+"ms");
start = System.currentTimeMillis();
testInsert(skip);
System.out.println("Skip List insert: "+(System.currentTimeMillis() - start)+"ms");
start = System.currentTimeMillis();
testSearch(skip);
System.out.println("Skip List search: "+(System.currentTimeMillis() - start)+"ms");
}
public static void testInsert(Drawing d)
{
Random r = new Random();
for (int i=0;i<100000;i++)
d.addPoint(new Point(r.nextInt(4096),r.nextInt(2048)));
}
public static void testSearch(Drawing d)
{
Point cursor;
Random r = new Random();
for (int i=0;i<1000;i++)
{
cursor = new Point(r.nextInt(4096),r.nextInt(2048));
d.getNearestFrom(cursor,10);
}
}
}
// A simple point class
class Point
{
public Point (int x, int y)
{
this.x = x;
this.y = y;
}
public final int x,y;
public String toString()
{
return "["+x+","+y+"]";
}
}
// Interface will make the benchmarking easier
interface Drawing
{
void addPoint (Point p);
Set<Point> getNearestFrom (Point source,int radius);
}
class SkipListDrawing implements Drawing
{
// Helper class to store an index of point by a single coordinate
// Unlike standard Map it's capable of storing several points against the same coordinate, i.e.
// [10,15] [10,40] [10,49] all can be stored against X-coordinate and retrieved later
// This is achieved by storing a list of points against the key, as opposed to storing just a point.
private class Index
{
final private NavigableMap<Integer,List<Point>> index = new ConcurrentSkipListMap <Integer,List<Point>> ();
void add (Point p,int indexKey)
{
List<Point> list = index.get(indexKey);
if (list==null)
{
list = new ArrayList<Point>();
index.put(indexKey,list);
}
list.add(p);
}
HashSet<Point> get (int fromKey,int toKey)
{
final HashSet<Point> result = new HashSet<Point> ();
// Use NavigableMap.subMap to quickly retrieve all entries matching
// search boundaries, then flatten resulting lists of points into
// a single HashSet of points.
for (List<Point> s: index.subMap(fromKey,true,toKey,true).values())
for (Point p: s)
result.add(p);
return result;
}
}
// Store each point indexed by its X and Y coordinate in two separate indices
final private Index xIndex = new Index();
final private Index yIndex = new Index();
public void addPoint (Point p)
{
xIndex.add(p,p.x);
yIndex.add(p,p.y);
}
public Set<Point> getNearestFrom (Point origin,int radius)
{
final Set<Point> searchSpace;
// search space is going to contain only the points that are within
// "sensitivity square". First get all points where X coordinate
// is within the given range.
searchSpace = xIndex.get(origin.x-radius,origin.x+radius);
// Then get all points where Y is within the range, and store
// within searchSpace the intersection of two sets, i.e. only
// points where both X and Y are within the range.
searchSpace.retainAll(yIndex.get(origin.y-radius,origin.y+radius));
// Loop through search space, calculate proximity to each point
// Don't take square root as it's expensive and really unneccessary
// at this stage.
//
// Keep track of nearest points list if there are several
// at the same distance.
int dist,dx,dy, minDist = Integer.MAX_VALUE;
Set<Point> nearest = new HashSet<Point>();
for (Point p: searchSpace)
{
dx=p.x-origin.x;
dy=p.y-origin.y;
dist=dx*dx+dy*dy;
if (dist<minDist)
{
minDist=dist;
nearest.clear();
nearest.add(p);
}
else if (dist==minDist)
{
nearest.add(p);
}
}
// Ok, now we have the list of nearest points, it might be empty.
// But let's check if they are still beyond the sensitivity radius:
// the search area we have evaluated was a square with a side equal to
// the diameter of the actual circle. If the points we've found are
// in the corners of the square area they might be outside the circle.
// Let's see what the distance is, and if it is greater than the radius
// then we don't have a single point within proximity boundaries.
if (Math.sqrt(minDist) > radius) nearest.clear();
return nearest;
}
}
// Naive approach: just loop through every point and see if it's nearest.
class NaiveDrawing implements Drawing
{
final private List<Point> points = new ArrayList<Point> ();
public void addPoint (Point p)
{
points.add(p);
}
public Set<Point> getNearestFrom (Point origin,int radius)
{
int prevDist = Integer.MAX_VALUE;
int dist;
Set<Point> nearest = Collections.emptySet();
for (Point p: points)
{
int dx = p.x-origin.x;
int dy = p.y-origin.y;
dist = dx * dx + dy * dy;
if (dist < prevDist)
{
prevDist = dist;
nearest = new HashSet<Point>();
nearest.add(p);
}
else if (dist==prevDist) nearest.add(p);
}
if (Math.sqrt(prevDist) > radius) nearest = Collections.emptySet();
return nearest;
}
}
I would like to suggest creating a Voronoi Diagram and a Trapezoidal Map (Basically the same answer as I gave to this question). The Voronoi Diagram will partition the space in polygons. Every point will have a polygon describing all points that are closest to it.
Now when you get a query of a point, you need to find in which polygon it lies. This problem is called Point Location and can be solved by constructing a Trapezoidal Map.
The Voronoi Diagram can be created using Fortune's algorithm which takes O(n log n) computational steps and costs O(n) space.
This website shows you how to make a trapezoidal map and how to query it. You can also find some bounds there:
Expected creation time: O(n log n)
Expected space complexity: O(n)
But most importantly, expected query time: O(log n).
(This is (theoretically) better than O(√n) of the kD-tree.)
Updating will be linear (O(n)) I think.
My source(other than the links above) is: Computational Geometry: algorithms and applications, chapters six and seven.
There you will find detailed information about the two data structures (including detailed proofs). The Google books version only has a part of what you need, but the other links should be sufficient for your purpose. Just buy the book if you are interested in that sort of thing (it's a good book).
The most efficient data structure would be a kd-tree.
Are the points uniformly distributed?
You could build a quad-tree up to a certain depth, say, 8. At the top you have a tree node that divides the screen into four quadrants. Store at each node:
The top left and the bottom right coordinate
Pointers to four child nodes, which divide the node into four quadrants
Build the tree up to a depth of 8, say, and at the leaf nodes, store a list of points associated with that region. That list you can search linearly.
If you need more granularity, build the quad-tree to a greater depth.
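A rough sketch of that fixed-depth quadtree (my own illustration; names and API are made up). Note that, as with any bucket scheme, a cursor near a cell border may also require checking the neighbouring leaves:
#include <memory>
#include <vector>

struct QPoint { double x, y; };

class QuadNode {
public:
    // Builds the full tree of the given depth eagerly, e.g.
    // QuadNode root(0, 0, width, height, 0, 8);
    QuadNode(double x0, double y0, double x1, double y1, int depth, int maxDepth)
        : x0_(x0), y0_(y0), x1_(x1), y1_(y1)
    {
        if (depth < maxDepth) {
            const double mx = (x0 + x1) / 2, my = (y0 + y1) / 2;
            child_[0] = std::make_unique<QuadNode>(x0, y0, mx, my, depth + 1, maxDepth);
            child_[1] = std::make_unique<QuadNode>(mx, y0, x1, my, depth + 1, maxDepth);
            child_[2] = std::make_unique<QuadNode>(x0, my, mx, y1, depth + 1, maxDepth);
            child_[3] = std::make_unique<QuadNode>(mx, my, x1, y1, depth + 1, maxDepth);
        }
    }

    void insert(const QPoint &p) {
        if (isLeaf()) { points_.push_back(p); return; }
        child_[quadrant(p)]->insert(p);
    }

    // The list of points stored in the leaf region containing p;
    // scan this list linearly for the closest one.
    const std::vector<QPoint> &candidates(const QPoint &p) const {
        return isLeaf() ? points_ : child_[quadrant(p)]->candidates(p);
    }

private:
    bool isLeaf() const { return !child_[0]; }
    int quadrant(const QPoint &p) const {
        const double mx = (x0_ + x1_) / 2, my = (y0_ + y1_) / 2;
        return (p.x >= mx ? 1 : 0) + (p.y >= my ? 2 : 0);
    }

    double x0_, y0_, x1_, y1_;            // top-left and bottom-right corners
    std::unique_ptr<QuadNode> child_[4];  // four quadrants (null in leaves)
    std::vector<QPoint> points_;          // only used by leaf nodes
};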
It depends on the frequency of updates and queries. For fast queries and slow updates, a quadtree (which is a form of k-d tree for 2D) would probably be best. Quadtrees are also very good for non-uniform points.
If you have a low resolution you could consider using a raw width x height array of pre-computed values.
If you have very few points or fast updates, a simple array is enough, or maybe a simple partitioning (which goes toward the quadtree).
So the answer depends on the parameters of your dynamics. I would also add that nowadays the algorithm isn't everything; making it use multiple processors or CUDA can give a huge boost.
You haven't specified the dimensions of your points, but if it's a 2D line drawing then a bitmap bucket - a 2D array of lists of points in a region, where you scan the buckets corresponding to and near to the cursor - can perform very well. Most systems will happily handle bitmap buckets on the order of 100x100 to 1000x1000, the small end of which would put a mean of one point per bucket. Although asymptotic performance is O(N), real-world performance is typically very good. Moving individual points between buckets can be fast; moving objects around can also be made fast if you put the objects into the buckets rather than the points (so a polygon of 12 points would be referenced by 12 buckets; moving it becomes 12 times the insertion and removal cost of the bucket list, and looking up the bucket is constant time in the 2D array). The major cost is reorganising everything if the canvas size grows in many small jumps.
If it is in 2D, you can create a virtual grid covering the whole space (with width and height matching your actual point space) and find all the 2D points which belong to each cell. After that, each cell becomes a bucket in a hashtable.
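A quick sketch of that bucket-grid idea (my own names and API): each point is dropped into a cell of a chosen size, and a cursor query only scans the cursor's cell and the eight surrounding ones. This is exact whenever the cell size is at least the sensitivity radius; for an unbounded nearest-neighbour query the search ring would have to be widened:
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct P2 { double x, y; };

class PointGrid {
public:
    explicit PointGrid(double cellSize) : cell(cellSize) {}

    void add(const P2 &p) { buckets[cellOf(p)].push_back(p); }

    // Nearest point among the 3x3 block of cells around the cursor.
    // Returns false if those cells contain no points.
    bool nearest(const P2 &cursor, P2 &out) const {
        const std::pair<long,long> c = cellOf(cursor);
        double best = -1.0;
        for (long dx = -1; dx <= 1; ++dx)
            for (long dy = -1; dy <= 1; ++dy) {
                const auto it = buckets.find({c.first + dx, c.second + dy});
                if (it == buckets.end()) continue;
                for (const P2 &p : it->second) {
                    const double d = (p.x - cursor.x) * (p.x - cursor.x) +
                                     (p.y - cursor.y) * (p.y - cursor.y);
                    if (best < 0.0 || d < best) { best = d; out = p; }
                }
            }
        return best >= 0.0;
    }

private:
    std::pair<long,long> cellOf(const P2 &p) const {
        return { static_cast<long>(std::floor(p.x / cell)),
                 static_cast<long>(std::floor(p.y / cell)) };
    }

    double cell;
    std::map<std::pair<long,long>, std::vector<P2>> buckets;
};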
