I would like to know if I am missing any acceleration structure that is designed for retrieving k-nearest spheres within a range.
The context of my question is molecular visualization. Specifically, I need to retrieve the k-nearest spheres to a point to produce a function that will guide the sphere tracing step length.
To simplify, the search can be limited to a range around the point being tested.
All the articles I have seen handle the k-nearest points to a point, but my case is different, since I want the spheres closest to a point. It seems possible to adapt kd-trees by changing the point test to a sphere test, but I believe that would hurt performance. So I wonder whether there is a better structure or whether I should adapt kd-trees.
Currently, I am using a hybrid Bounding Volume Hierarchy, but I think the search performance could be better with another structure, since the bounding volumes overlap heavily due to the nature of the molecules.
PS: I don't care much about the construction time. I want good search performance and decent memory occupation.
You could use a 3-step approach:
Find the nearest neighbor using the center-points of the spheres.
For this nearest neighbor, take its center distance, subtract its radius, and add the maximum radius of any sphere. Then perform a spherical range query around the query point with this new distance. This returns the center points of all spheres that may be the closest to your query point.
Then manually calculate the actual distance for each returned sphere using its actual radius.
This should be reasonably efficient assuming that the radius of spheres is not massively bigger than their average distance.
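The three steps can be sketched in Python as follows. This is only an illustration under some assumptions: the function name nearest_sphere and the (center, radius) tuple layout are made up for the example, and the two linear scans stand in for the nearest-neighbor and range queries that a kd-tree over the centers would answer. It returns the single nearest sphere, not the k nearest.

```python
import math

def nearest_sphere(query, spheres):
    """Return the index of the sphere whose surface is closest to `query`.

    `spheres` is a list of (center, radius) tuples (hypothetical layout).
    The linear scans below are stand-ins for kd-tree queries on the centers.
    """
    r_max = max(r for _, r in spheres)
    center_dist = [math.dist(query, c) for c, _ in spheres]

    # Step 1: nearest neighbor by center point.
    i0 = min(range(len(spheres)), key=center_dist.__getitem__)

    # Step 2: any sphere with a closer surface must have its center within
    # (center distance - own radius + maximum radius) of the query point.
    bound = center_dist[i0] - spheres[i0][1] + r_max
    candidates = [i for i, d in enumerate(center_dist) if d <= bound]

    # Step 3: exact surface distance for each candidate (negative if inside).
    return min(candidates, key=lambda i: center_dist[i] - spheres[i][1])

spheres = [((1.0, 0.0, 0.0), 0.1), ((3.0, 0.0, 0.0), 2.5)]
print(nearest_sphere((0.0, 0.0, 0.0), spheres))
```

Here the first sphere has the nearest center, but the second, larger sphere has the nearer surface, which is why the range query in step 2 is needed.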
I have a 3D mesh that is made up of a number of vertices.
I know that there are some vertices that are really close to one another. I want to find groups of these, so that I can normalize them.
I could build a kd-tree and do a basic nearest-neighbor search, but that doesn't scale so well if I don't have a reference point.
I want to find these groups in relation to all points.
In my searches I also found k-means, but I cannot seem to wrap my head around its scientific descriptions well enough to find out whether that's really what I need.
I'm not well versed in spatial algorithms in general. I know where one can apply them (this case, for instance), but I lack the actual know-how, even the correct keywords.
So, yeah, what algorithms are meant for such a task?
Simple idea that might work:
Compute a slightly enlarged bounding volume for each vertex in the mesh. For instance, if you use a sphere, give it a small radius, e.g. a radius equal to the length of the smallest edge of the mesh.
Compute the intersections between the bounding volumes of the vertices. Use a collision detection algorithm for that, such as I-COLLIDE. Use a disjoint-set data structure to group the vertices whose volumes collide.
Merge all the points residing in the same set.
You can fine-tune the algorithm by changing the size of the bounding volumes. You can also use this algorithm as a starting point for k-means or another sound clustering technique.
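A rough Python sketch of the idea, with some assumptions: the names DisjointSet and merge_close_vertices are invented for the example, the brute-force pair loop is a stand-in for a real collision detection library such as I-COLLIDE, and each group is merged into its average position (the answer does not say how to merge).

```python
import math
from itertools import combinations

class DisjointSet:
    """Minimal union-find with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def merge_close_vertices(vertices, radius):
    """Group vertices whose bounding spheres intersect, then average each group."""
    ds = DisjointSet(len(vertices))
    # Two spheres of equal radius intersect when their centers are at most
    # 2 * radius apart. This O(n^2) loop stands in for a broad-phase
    # collision detection step.
    for i, j in combinations(range(len(vertices)), 2):
        if math.dist(vertices[i], vertices[j]) <= 2 * radius:
            ds.union(i, j)

    groups = {}
    for i in range(len(vertices)):
        groups.setdefault(ds.find(i), []).append(i)

    # Merge step (assumption: use the average position of each group).
    merged = [
        tuple(sum(vertices[m][axis] for m in members) / len(members) for axis in range(3))
        for members in groups.values()
    ]
    return merged, list(groups.values())

merged, groups = merge_close_vertices([(0, 0, 0), (0.01, 0, 0), (1, 0, 0)], 0.01)
print(groups)
```

Note that union-find groups are transitive: a chain of vertices, each close to the next, ends up in one group even if its two ends are far apart, which is one reason to tune the radius carefully.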
I'm developing a tool for radiotherapy inverse planning based on a pencil-beam approach. An important step in these methods (particularly in dose calculation) is ray tracing from many sources, and one of the most widely used algorithms is Siddon's (there is a nice short description here: http://on-demand.gputechconf.com/gtc/2014/poster/pdf/P4218_CT_reconstruction_iterative_algebraic.pdf). Now, I will try to simplify my question:
The input data is a CT image (a 3D matrix of values) and some source positions around the image. You can imagine a cube with many points around it, all at the same distance but at different orientation angles, where the radiation rays come from. Each ray goes through the volume, and a value is assigned to each voxel according to its distance from the source. The advantage of Siddon's algorithm is that the length is calculated on the fly during the iterative ray-tracing process. However, I know that Bresenham's algorithm is an efficient way to evaluate the path from one point to another in a matrix. Thus, the length from the source to a specific voxel could easily be calculated as the Euclidean distance between two points, even during Bresenham's iterative process.
So, given that both methods are quite old and efficient, is there a definite advantage to using Siddon instead of Bresenham? Maybe I'm missing an important detail here, but it seems odd to me that in these dose calculation procedures Bresenham is not really an option and Siddon always appears as the gold standard.
Thanks for any comment or reply!
Good day.
It seems to me that in most applications involving medical ray tracing, you want not only the distance from a source to a particular voxel, but also the intersection lengths of that path with every single voxel on its way. Now, Bresenham gives you the voxels on that path, but not the intersection lengths, while Siddon does.
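To make the difference concrete, here is a small 2D Python sketch of the parametric idea behind Siddon's method. It is a simplified illustration, not the optimized algorithm: the function name siddon_2d is invented, cells are assumed to be unit squares with the grid corner at the origin, and the crossings are collected and sorted rather than stepped incrementally.

```python
import math

def siddon_2d(p0, p1, nx, ny):
    """Return [((ix, iy), length)] for the segment p0 -> p1 through an
    nx-by-ny grid of unit cells whose corner is at the origin."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    seg_len = math.hypot(dx, dy)

    # Parametric positions (0..1 along the segment) of every grid-line crossing.
    alphas = {0.0, 1.0}
    if dx != 0:
        for i in range(nx + 1):
            a = (i - x0) / dx
            if 0.0 < a < 1.0:
                alphas.add(a)
    if dy != 0:
        for j in range(ny + 1):
            a = (j - y0) / dy
            if 0.0 < a < 1.0:
                alphas.add(a)
    alphas = sorted(alphas)

    result = []
    for a_lo, a_hi in zip(alphas, alphas[1:]):
        if a_hi - a_lo <= 1e-12:
            continue  # skip degenerate pieces at grid corners
        # The midpoint of each piece identifies the cell it lies in.
        a_mid = 0.5 * (a_lo + a_hi)
        ix = math.floor(x0 + a_mid * dx)
        iy = math.floor(y0 + a_mid * dy)
        if 0 <= ix < nx and 0 <= iy < ny:
            result.append(((ix, iy), (a_hi - a_lo) * seg_len))
    return result

for cell, length in siddon_2d((0.0, 0.0), (2.0, 2.0), 2, 2):
    print(cell, length)
```

Each entry pairs a cell with the length of the ray inside it, which is the quantity a dose or attenuation sum needs as a per-voxel weight. A Bresenham walk would list cells along the line but give no such lengths.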
I have a set of 3D points that approximate a surface. Each point, however, is subject to some error. Furthermore, the set contains a lot more points than are actually needed to represent the underlying surface.
What I am looking for is an algorithm to create a new (much smaller) set of points representing a simplified, smoother version of the surface (pardon for not having a better definition than "simplified, smoother"). The underlying surface is not a mathematical one so I'm not hoping to fit the data set to some mathematical function.
Instead of dealing with it as a point cloud, I would recommend triangulating a mesh using Delaunay triangulation: http://en.wikipedia.org/wiki/Delaunay_triangulation
Then decimate the mesh. You can research decimation algorithms, but you can get pretty good quick and dirty results with an algorithm that just merges adjacent tris that have similar normals.
I think you are looking for 'Level of detail' algorithms.
A simple one to implement is to break your volume (surface) into some number of sub-volumes. From the points in each sub-volume, choose a representative point (such as the one closest to the center, the one closest to the average, or the average itself). Use these points to redraw your surface.
You can tweak the number of sub-volumes to increase/decrease detail on the fly.
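As a minimal Python sketch of this approach, assuming cubic sub-volumes of a fixed size and the average as the representative point (the function name grid_downsample and the cell_size parameter are made up for the example):

```python
import math
from collections import defaultdict

def grid_downsample(points, cell_size):
    """Replace all points that fall in the same cubic cell by their average."""
    buckets = defaultdict(list)
    for p in points:
        # Integer cell coordinates of the sub-volume containing p.
        key = tuple(math.floor(c / cell_size) for c in p)
        buckets[key].append(p)
    # One representative (the average) per occupied cell.
    return [
        tuple(sum(coords) / len(pts) for coords in zip(*pts))
        for pts in buckets.values()
    ]

cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.2, 0.2)]
print(grid_downsample(cloud, 1.0))
```

A larger cell_size gives fewer, coarser points; averaging also smooths out some of the per-point error, at the cost of rounding off sharp features.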
I'd approach this by looking for vertices (points) that contribute little to the curvature of the surface. Find all the edges emerging from each vertex and take the dot products of pairs (?) of them. At vertices representing very shallow "hills", opposite edges will subtend large angles (near 180 degrees), so the dot products of the normalized edge vectors will be small, i.e. close to -1.
Those vertices with the smallest numbers would then be candidates for removal. The vertices around them will then form a plane.
Or something like that.
Google for Hugues Hoppe and his "surface reconstruction" work.
Surface reconstruction is used to find a meshed surface that fits the point cloud; however, this method yields lots of triangles. You can then apply a mesh reduction technique to reduce the polygon count in a way that minimizes error. As an example, you can look at OpenMesh's decimation methods.
There exist several different techniques for point-based surface model simplification, including:
clustering;
particle simulation;
iterative simplification.
See the survey:
M. Pauly, M. Gross, and L. P. Kobbelt. Efficient simplification of point-sampled surfaces. In Proceedings of the conference on Visualization '02, pages 163–170, Washington, DC, 2002. IEEE.
Unless you parametrise your surface in some way, I'm not sure how you can decide which points carry similar information (and can thus be thrown away).
I guess you could choose a bunch of points at random to get rid of, but that doesn't sound like what you want to do.
Maybe points near each other (for some definition of 'near') can be considered to contain similar information, and so each such group can be reduced to a single representative.
Could you give some more details?
It's simpler to simplify a point cloud without the constraints of mesh triangles and indices.
Smoothing and simplification are different tasks, though. To simplify the cloud you should first get rid of noise artefacts by profiling the kind of noise you have (its frequency and directional characteristics) and then applying a reduction based on that noise profile. Good normal vectors are helpful for that.
Here is a document describing five or six simplification methods using Delaunay, Voronoi, and k-nearest-neighbour techniques:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.10.9640&rep=rep1&type=pdf
A later version from 2008:
http://www.wseas.us/e-library/transactions/research/2008/30-705.pdf
Here is a recent C++ version:
https://github.com/tudelft3d/masbcpp/blob/master/src/simplify.cpp
What kind of data structure could be used for an efficient nearest neighbor search in a large set of geo coordinates? With "regular" spatial index structures like R-Trees that assume planar coordinates, I see two problems (Are there others I have overlooked?):
Wraparound at the poles and the International Date Line
Distortion of distances near the poles
How can these factors be allowed for? I guess the second one could be compensated for by transforming the coordinates. Can an R-Tree be modified to take wraparound into account? Or are there specialized geo-spatial index structures?
Could you use a locality-sensitive hashing (LSH) algorithm in 3 dimensions? That would quickly give you an approximate neighboring group which you could then sanity-check by calculating great-circle distances.
Here's a paper describing an algorithm for efficient LSH on the surface of a unit d-dimensional hypersphere. Presumably it works for d=3.
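For the sanity-check step, a great-circle distance can be computed with the haversine formula. The Python sketch below assumes a spherical Earth with a mean radius of 6371 km; the function names are made up for the example. It also shows the 3D embedding that any "3 dimensions" approach relies on: mapping latitude/longitude to unit vectors removes both the date-line wraparound and the pole distortion, and the straight-line (chord) distance between unit vectors grows monotonically with the great-circle distance, so nearest-neighbor rankings agree.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def to_unit_vector(lat, lon):
    """Embed a latitude/longitude pair (degrees) on the unit sphere in 3D."""
    phi, lmb = math.radians(lat), math.radians(lon)
    return (math.cos(phi) * math.cos(lmb), math.cos(phi) * math.sin(lmb), math.sin(phi))

def chord_to_great_circle_km(chord):
    """Convert a chord length on the unit sphere to a great-circle distance."""
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, chord / 2))

# Two points one degree apart, on either side of the International Date Line.
print(great_circle_km(0.0, 179.5, 0.0, -179.5))
chord = math.dist(to_unit_vector(0.0, 179.5), to_unit_vector(0.0, -179.5))
print(chord_to_great_circle_km(chord))
```

With the unit-vector embedding, an ordinary 3D spatial index using Euclidean distance can serve as the candidate generator, and the haversine value is only needed when an actual distance in kilometres has to be reported.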
Take a look at Geohash.
Also, to compensate for wraparound, simply use not one but three orthogonal R-trees, chosen so that there is no point on the Earth's surface where all three trees have a wraparound. Then, two points are close if they are close according to at least one of these trees.
I just finished implementing a kd-tree for doing fast nearest neighbor searches. I'm interested in playing around with distance metrics other than the Euclidean distance. My understanding of the kd-tree is that the speedy kd-tree search is not guaranteed to give exact results if the metric is non-Euclidean, which means that I might need to implement a new data structure and search algorithm if I want to try out new metrics for my search.
I have two questions:
Does using a kd-tree permanently tie me to the Euclidean distance?
If so, what other sorts of algorithms should I try that work for arbitrary metrics? I don't have a ton of time to implement lots of different data structures, but other structures I'm thinking about include cover trees and vp-trees.
The nearest-neighbour search procedure described on the Wikipedia page you linked to can certainly be generalised to other distance metrics, provided you replace "hypersphere" with the equivalent geometrical object for the given metric, and test each hyperplane for crossings with this object.
Example: if you are using the Manhattan distance instead (i.e. the sum of the absolute values of all differences in vector components), your hypersphere would become a (multidimensional) diamond. (This is easiest to visualise in 2D: if your current nearest neighbour is at distance x from the query point p, then any closer neighbour behind a different hyperplane must intersect a diamond shape that has width and height 2x and is centred on p.) This might make the hyperplane-crossing test more difficult to code or slower to run; however, the general principle still applies.
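As an illustration, here is a small kd-tree search in Python with a pluggable metric. It is a sketch under assumptions, not a tuned implementation: the names build, nearest and manhattan are invented, and nodes are plain tuples. For axis-aligned splitting planes the crossing test happens to stay simple, at least for Manhattan and Euclidean distance: in both metrics the distance from the query point to the plane is just the absolute coordinate difference along the split axis, so the diamond (or sphere) reaches the plane exactly when that difference is smaller than the current best distance.

```python
import math

def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def build(points, depth=0):
    """Build a kd-tree as nested tuples: (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def nearest(node, q, metric, best=None):
    """Return (distance, point) of the nearest neighbor of q under `metric`."""
    if node is None:
        return best
    point, axis, left, right = node
    d = metric(q, point)
    if best is None or d < best[0]:
        best = (d, point)
    diff = q[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, q, metric, best)
    # Crossing test: every point on the far side is at least |diff| away
    # (true for L1 and L2), so only descend if the current best "ball"
    # reaches across the splitting plane.
    if abs(diff) < best[0]:
        best = nearest(far, q, metric, best)
    return best

tree = build([(3, 0), (2, 2)])
print(nearest(tree, (0, 0), manhattan))  # Manhattan picks (3, 0)
print(nearest(tree, (0, 0), math.dist))  # Euclidean picks (2, 2)
```

The two calls return different neighbors for the same tree and query, which shows that only the metric and the pruning bound change, not the data structure.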
I don't think you're tied to Euclidean distance (as j_random_hacker says, you can probably use Manhattan distance), but I'm pretty sure you're tied to geometries that can be represented in Cartesian coordinates. So you couldn't use a kd-tree to index a general metric space, for example.