Data structure for piecewise circular trajectory in plane

I'm trying to design a data structure to hold/express a piecewise circular trajectory in the Euclidean plane. The trajectory is constrained to be continuous and have finite curvature everywhere, and therefore the circular arcs meet tangentially.
Storing all the circle centers, radii, and touching points would allow for inspecting the geometry anywhere in O(1) but would require explicit enforcement of the continuity and curvature constraints due to data redundancy. In my view, this would make the code messy.
Storing only the circle touching points (which are waypoints along the curve) along with the curve's initial direction would be sufficient in principle, and avoid data redundancy, but then it would be necessary to do an O(n) calculation to inspect the geometry of arc n, since that arc depends on all the arcs preceding it in the trajectory.
I would like to avoid data redundancy, but I also don't want to make the cost of geometric inspection prohibitive.
Does anyone have any high-level idea/advice to share?

For the most efficient traversal of the trajectory, if I am right you need
the ending curvilinear abscissas of every arc (cumulative),
the radii,
the starting angles,
the coordinates of the centers,
so that for a given s you find the index of the arc, then the azimuth and the coordinates of the point. (Either incrementally for a sequence of points, or by dichotomy for a single point.) That takes five parameters per arc.
Only the cumulative abscissas are global, but you can't do without them for single-point accesses. You can drop the radii and starting angles and retrieve them for any arc from the difference of curvilinear abscissas and the limit angles (see below). This reduces to three parameters.
On the other hand, knowing just the coordinates of the centers and those of the starting and ending points is enough to recover the whole geometry, and this takes two parameters per arc.
The meeting point of two arcs is found on the line through the centers, and if you know one radius, the other follows. And the limit angle is given by the direction of the line. So for an incremental traversal, this non-redundant description can do.
For convenient computation, knowing s and the arc index i, consider the vectors from the center to the centers of the adjoining arcs. Rotate them so that the first becomes horizontal. The components of the other will give you the amplitude angle. The fraction (s - S_{i-1}) / (S_i - S_{i-1}) of the amplitude gives you the azimuth of the point, to which you apply the counter-rotation.
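As a rough illustration of the five-parameter layout (cumulative abscissa, radius, starting angle, center x, center y), here is a minimal Python sketch of a single-point lookup by binary search. The class name `ArcPath`, the tuple layout, and the use of a signed radius to encode the turning direction (positive for counter-clockwise) are my own assumptions, not part of the answer above.

```python
import bisect
import math

class ArcPath:
    """Redundant per-arc storage: (cx, cy, signed_radius, start_angle, length).

    Assumption: signed_radius > 0 means the arc turns counter-clockwise,
    signed_radius < 0 means clockwise. start_angle is the angle of the
    vector from the center to the arc's starting point.
    """

    def __init__(self, arcs):
        self.arcs = list(arcs)
        self.s_end = []          # cumulative ending abscissa of every arc
        total = 0.0
        for *_, length in self.arcs:
            total += length
            self.s_end.append(total)

    def point_at(self, s):
        """Return (x, y, heading) at curvilinear abscissa s."""
        # Binary search ("dichotomy") for the arc containing s.
        i = min(bisect.bisect_left(self.s_end, s), len(self.arcs) - 1)
        cx, cy, r, a0, length = self.arcs[i]
        s_start = self.s_end[i] - length
        # Dividing by the signed radius advances the angle in the right sense.
        a = a0 + (s - s_start) / r
        x = cx + abs(r) * math.cos(a)
        y = cy + abs(r) * math.sin(a)
        # The tangent is a quarter turn ahead of the radius vector.
        heading = a + math.copysign(math.pi / 2.0, r)
        return x, y, heading
```

The lookup costs O(log n) for a single point; for a sequence of increasing s values one could instead keep the last index and advance it incrementally.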

I'd store items with the data required to get info for any point of that element. For example, an arc needs x, y, initial direction, radius, length (or end point, or angle difference, or whatever you find easiest).
Because you need continuity (same x, y, same bearing, perhaps same curvature) between two adjoining end points, a node with these properties is needed. Notice these properties are common to arcs and straights (a special arc identified by radius = 0). So you can treat a node the same as an item.
The trajectory should be calculated before any request. So you have all items-data in advance.
The container depends on how you request info.
If the trajectory can be somehow represented in a grid, then you better use a quad-tree.
I guess you must find the item from a x,y or accumulated length input. You will have to iterate through the container to find the element closest to the input data. Sorted data may help.
My choice is a simple vector with the consecutive elements, which happens to be sorted on accumulated trajectory length.
Finding by x,y in an x-sorted container (or a tree) is not so simple, because a given x,y may have perpendiculars to several items, consecutive or not, near or not, and you need to select the nearest one.
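To make the "calculate the trajectory before any request" step concrete, here is a small Python sketch that derives the per-item data (x, y, initial direction, radius, length) from a start point, an initial heading, and a list of waypoints. The function name `build_items` is hypothetical. It relies on the tangent-chord relation: if delta is the signed angle from the current heading to the chord, the radius is chord / (2 sin delta) and the heading turns by 2 delta over the arc. It follows the convention above of marking a straight with radius = 0, uses a positive radius for a left turn (my assumption), and assumes no waypoint lies exactly behind the current heading.

```python
import math

def build_items(start, heading, waypoints):
    """Return a list of (x, y, heading, signed_radius, length), one per arc."""
    items = []
    x, y = start
    for wx, wy in waypoints:
        dx, dy = wx - x, wy - y
        chord = math.hypot(dx, dy)
        # Signed angle from the heading to the chord, wrapped to (-pi, pi].
        delta = math.atan2(dy, dx) - heading
        delta = math.atan2(math.sin(delta), math.cos(delta))
        if abs(math.sin(delta)) < 1e-12:
            # Waypoint straight ahead: a straight, flagged by radius = 0.
            radius, length = 0.0, chord
        else:
            radius = chord / (2.0 * math.sin(delta))   # > 0 means left turn
            length = radius * 2.0 * delta              # always non-negative
        items.append((x, y, heading, radius, length))
        heading += 2.0 * delta                         # tangent at the waypoint
        x, y = wx, wy
    return items
```

Accumulating the lengths while appending gives the sorted vector mentioned above, which can then be searched by accumulated trajectory length.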


How to quickly pack spheres in 3D?

I'm looking for an algorithm for random close packing of spheres in 3D. The trick is that I'd like to pack spheres around a certain number of existing spheres. So for example, given somewhere between 100 and 1000 spheres in 3D (which have fixed positions and sizes; they may overlap, and may be different sizes), I'd like to pack spheres (all same size, positions can be chosen freely) around them (with no overlaps).
The metric for a good quality of packing is the packing density or void fraction. Essentially I'd like the fixed spheres and the packed spheres to occupy a compact volume of space (eg roughly ~spherical, or packed in layers around the fixed spheres) with as few voids in it as possible.
Is there an off the shelf algorithm that does this? How would you approach it in a way that balances speed of calculation with packing quality?
UPDATE Detail on packing density: this depends on what volume is chosen for the calculation. For this, we're looking to pack a certain number of layers of spheres around the fixed ones. Form a surface of points which are exactly a distance d to the surface of the closest fixed sphere; the packing density should be calculated within the volume enclosed by that surface. It's convenient if d = some multiple of the size of the packed spheres. (Assume we can place at least as many free spheres as needed to fill that volume; there may be excess ones, it doesn't matter where they're placed)
The fixed and all the variable spheres are all pretty similar sizes (let's say within 2x range from smallest to largest). In practice the degree of overlap of the fixed spheres is also limited: no fixed sphere is closer than a certain distance (around 0.2-0.3 diameters) of any other fixed sphere (so it is guaranteed that they are spread out, and/or only overlap a few neighbors rather than all overlapping each other)
Use a lattice where each point is equidistant by the diameter of the fill sphere. Any lattice shape, meeting the above definition will suffice.
Orient the translation and rotation of the lattice to minimize the center offsets of the fixed spheres to produce the world transform.
Fixed Pass 1:
Create a list of any lattice points within the fixed spheres radii + the diameter of the fill spheres.
For the latter keep the positional (origin - point) difference vectors in a list.
Flag the lattice points in the list for removal.
Lattice Pass 1:
Combine, i.e. re-base the origin to the overlap point (either a true overlap or one extended to the fill radius), the distance vectors of any overlapping fixed spheres. Store the value on one side and flag it on the other, to permit multiple overlaps.
This is where a decision is needed:
Option 1. Optimize space over time (computationally slow):
Add points from the adjusted origin radius + fill radius. Then iterate over the lattice points, moving one point at a time away from other points until all spacing conditions are met. If the lattice points implement spring logic, an optimal solution is produced, given enough iterations (N^2 + N?). Stop here; done.
Option 2. Pull the remaining points in the lattice to fill the void:
Warp the lattice near each overlap point (or near each origin, if no overlap exists), over a region as large as needed, pulling the points in to fill the gap.
Lattice Pass 2:
Add back missing points that were flagged as removed, i.e. those with no other point within fill radius + 1 and not near a fixed sphere (other radius + fill radius). This should be a small number of points.
Lattice Pass 3:
Adjust all lattice positions to move closer to the proper grid spacing. This will be monotonically decreasing the distances between points, limited to >= radius1 + radius2.
Repeat 3-4(or more) times. Applying a tiny amount of randomness(1 to -1 pixel max per dimension) bias offset to the first pass of the created points to avoid any equal spacing conflicts after the warp. If no suitable gap is created the solution may settle to a poorly optimized solution.
Each fill sphere is centered on a lattice grid point.
I can see many areas of improvement and optimization, but the point was to provide a clear somewhat fast algorithm that is good, but not guaranteed optimal.
Note the difference between 1 and 2:
Number 1 creates a sphere colliding with other spheres and requires all fills to move multiple times to resolve the collisions.
Number 2 only creates new spheres in empty spaces, and moves the rest inward to adapt, resulting in much faster convergence, since there are no collisions to resolve.
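For what it's worth, the first pass (lay down a lattice with spacing equal to the fill diameter, then drop the points that collide with a fixed sphere) could be sketched in Python roughly as follows. This is only the starting configuration, before any warping or relaxation. The function name `lattice_fill`, the `layers` parameter, and the choice of a simple cubic lattice are assumptions made to keep the example short; a cubic lattice packs far less densely than a close-packed one.

```python
import itertools
import math

def lattice_fill(fixed, fill_radius, layers=2):
    """fixed: list of (x, y, z, r) spheres. Returns fill-sphere centers.

    Keeps lattice points that (a) do not overlap any fixed sphere and
    (b) lie within `layers` fill diameters of the nearest fixed surface.
    """
    step = 2.0 * fill_radius              # lattice spacing = fill diameter
    reach = layers * step
    lo = [min(s[k] - s[3] for s in fixed) - reach for k in range(3)]
    hi = [max(s[k] + s[3] for s in fixed) + reach for k in range(3)]
    counts = [int((hi[k] - lo[k]) // step) + 1 for k in range(3)]
    kept = []
    for idx in itertools.product(*(range(c) for c in counts)):
        p = tuple(lo[k] + idx[k] * step for k in range(3))
        # Distance from the lattice point to the nearest fixed surface.
        gap = min(math.dist(p, s[:3]) - s[3] for s in fixed)
        if fill_radius <= gap <= reach:
            kept.append(p)
    return kept
```

Because neighbouring lattice points are exactly one fill diameter apart, the kept spheres touch but never overlap each other; the gaps left next to the fixed spheres are what the later passes are meant to close.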

How to index nearby 3D points on the fly?

In physics simulations (for example n-body systems) it is sometimes necessary to keep track of which particles (points in 3D space) are close enough to interact (within some cutoff distance d) in some kind of index. However, particles can move around, so it is necessary to update the index, ideally on the fly without recomputing it entirely. Also, for efficiency in calculating interactions it is necessary to keep the list of interacting particles in the form of tiles: a tile is a fixed size array (eg 32x32) where the rows and columns are particles, and almost every row-particle is close enough to interact with almost every column particle (and the array keeps track of which ones actually do interact).
What algorithms may be used to do this?
Here is a more detailed description of the problem:
Initial construction: Given a list of points in 3D space (on the order of a few thousand to a few million, stored as array of floats), produce a list of tiles of a fixed size (NxN), where each tile has two lists of points (N row points and N column points), and a boolean array NxN which describes whether the interaction between each row and column particle should be calculated, and for which:
a. every pair of points p1,p2 for which distance(p1,p2) < d is found in at least one tile and marked as being calculated (no missing interactions), and
b. if any pair of points is in more than one tile, it is only marked as being calculated in the boolean array in at most one tile (no duplicates),
and also the number of tiles is relatively small if possible (but this is less important than being able to update the tiles efficiently)
Update step: If the positions of the points change slightly (by much less than d), update the list of tiles in the fastest way possible so that they still meet the same conditions a and b (this step is repeated many times)
It is okay to keep any necessary data structures that help with this, for example the bounding boxes of each tile, or a spatial index like a quadtree. It is probably too slow to calculate all particle pairwise distances for every update step (and in any case we only care about particles which are close, so we can skip most possible pairs of distances just by sorting along a single dimension for example). Also it is probably too slow to keep a full (quadtree or similar) index of all particle positions. On the other hand it is perfectly fine to construct the tiles on a regular grid of some kind. The density of particles per unit volume in 3D space is roughly constant, so the tiles can probably be built from (essentially) fixed size bounding boxes.
To give an example of the typical scale/properties of this kind of problem, suppose there are 1 million particles, which are arranged as a random packing of spheres of diameter 1 unit into a cube of size roughly 100x100x100. Suppose the cutoff distance is 5 units, so typically each particle would be interacting with (2*5)**3 or ~1000 other particles or so. The tile size is 32x32. There are roughly 1e+9 interacting pairs of particles, so the minimum possible number of tiles is ~1e+6. Now assume each time the positions change, the particles move a distance around 0.0001 unit in a random direction, but always in a way such that they are at least 1 unit away from any other particle and the typical density of particles per unit volume stays the same. There would typically be many millions of position update steps like that. The number of newly created pairs of interactions per step due to the movement is (back of the envelope) (10**2 * 6 * 0.0001 / 10**3) * 1e+9 = 60000, so one update step can be handled in principle by marking 60000 particles as non-interacting in their original tiles, and adding at most 60000 new tiles (mostly empty - one per pair of newly interacting particles). This would rapidly get to a point where most tiles are empty, so it is definitely necessary to combine/merge tiles somehow pretty often - but how to do it without a full rebuild of the tile list?
P.S. It is probably useful to describe how this differs from the typical spatial index (e.g. octrees) scenario: a. we only care about grouping close-by points together into tiles, not looking up which points are in an arbitrary bounding box or which points are closest to a query point (a bit closer to clustering than querying), b. the density of points in space is pretty constant, and c. the index has to be updated very often, but most moves are tiny.
Not sure my reasoning is sound, but here's an idea:
Divide your space into a grid of 3D cubes.
The cubes have a side length of d. Then do the following:
Assign each point to all cubes in which it is contained; this is fast since you can derive a point's cube from just its coordinates
Now check the following:
1. Mark all points in the top left quarter of your cube as colliding; they're less than d apart. Further, every "quarter cube" in space is only the top left quarter of exactly one cube, so you won't check the same pair twice.
2. Check for collisions of type (p, q), where p is a point in the top left quartile, and q is a point not in the top left quartile. In this way, you will again check for a collision between any two points at most once, because every pair of quartiles is checked exactly once.
Since every pair of points is either in the same quartile or in neighbouring quartiles, they'll be checked by the first or the second step. Further, since points are approximately evenly distributed, your runtime is much less than n^2 (n = number of points); in aggregate, it's k^2 (k = number of points per quartile, which appears to be approximately constant).
In an update step, you only need to check:
if a point crossed a boundary of a box, which should be fast since you can look at one coordinate at a time, and the box boundaries are simple multiples of d/2
check for collisions of the points as above
To create the tiles, divide the space into a second grid of (non-overlapping) cubes whose width is chosen s.t. the average count of centers between two particles that almost interact with each other that fall into a given cube is less than the width of your tiles (i.e. 32). Since each particle is expected to interact with 300-500 particles, the width will be much smaller than d.
Then, while checking for interactions in steps 1 & 2, assign particle interactions to these new cubes according to the coordinates of the center of their interaction. Assign one tile per cube, and mark interacting particles assigned to that cube in the tile.
Further optimizations might be to consider the distance of a point's closest neighbour within a cube, and derive from that how many update steps are needed at least to change the collision status of that point; then ignore that point for this many steps.
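As a point of comparison, here is a short Python sketch of the plain cell-list variant of this idea. Note that it is not the overlapping quarter-cube scheme described above: it uses non-overlapping cubes of side d and compares each cube with its 26 neighbours, visiting each unordered pair of cubes once so that no pair of points is reported twice. The function name `close_pairs` is made up for the example.

```python
from collections import defaultdict
import itertools
import math

def close_pairs(points, d):
    """Return the set of index pairs (i, j), i < j, with distance < d."""
    # Bin every point into the cube of side d that contains it.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(c / d) for c in p)].append(idx)

    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = set()
    for cell, members in cells.items():
        for off in offsets:
            other = tuple(c + o for c, o in zip(cell, off))
            # Skip empty cubes, and handle each pair of cubes only once.
            if other < cell or other not in cells:
                continue
            for i in members:
                for j in cells[other]:
                    if other == cell and i >= j:
                        continue      # same cube: count each pair once
                    if math.dist(points[i], points[j]) < d:
                        pairs.add((min(i, j), max(i, j)))
    return pairs
```

For an update step with tiny moves, only points whose cube index changed need to be re-binned; the pair search itself is unchanged.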
I suggest the following algorithm. E.g we have cube 1x1x1 and the cutoff distance is 0.001
Let's choose three base anchor points: (0,0,0) (0,1,0) (1,0,0)
Associate an array of size 1000 (1 / 0.001) with each anchor point
Add three numbers to each regular point. These fields will store the distance between the given point and each anchor point
At the same time, this distance will be used as an index into the array of the anchor point. E.g. 0.4324 means index 432.
Store a set of points in each element of the three arrays
Recalculate the distance between the regular point and each anchor point every time the point is updated
Move the point between sets in the arrays during the update
The given structures will give you an easy way to find all closer points: it is the intersection between three sets. And we choose these sets based on the distance between point and anchor points.
In short, it is the intersection between three spheres. Maybe you need to apply additional filtering for the result if you want to erase the corners of this intersection.
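Here is a minimal Python sketch of this anchor-distance idea, under a few assumptions of mine: the class name `AnchorIndex` is invented, the arrays are replaced by dictionaries keyed by bucket index, and the bucket width equals the cutoff. By the triangle inequality, two points closer than the cutoff differ by less than the cutoff in their distance to any anchor, so their bucket indices differ by at most one; the candidate set is therefore the intersection of three adjacent-bucket unions, followed by an exact distance filter.

```python
from collections import defaultdict
import math

ANCHORS = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

class AnchorIndex:
    def __init__(self, cutoff):
        self.cutoff = cutoff
        self.buckets = [defaultdict(set) for _ in ANCHORS]  # one per anchor
        self.keys = {}   # point id -> bucket index for each anchor
        self.pos = {}    # point id -> current position

    def _keys(self, p):
        return tuple(int(math.dist(p, a) // self.cutoff) for a in ANCHORS)

    def update(self, pid, p):
        """Insert a point, or move it between sets if its buckets changed."""
        new, old = self._keys(p), self.keys.get(pid)
        if old != new:
            if old is not None:
                for bucket, k in zip(self.buckets, old):
                    bucket[k].discard(pid)
            for bucket, k in zip(self.buckets, new):
                bucket[k].add(pid)
            self.keys[pid] = new
        self.pos[pid] = p

    def neighbours(self, pid):
        """Ids of all other points closer than the cutoff to point pid."""
        candidates = None
        for bucket, k in zip(self.buckets, self.keys[pid]):
            near = set()
            for kk in (k - 1, k, k + 1):
                near |= bucket.get(kk, set())
            candidates = near if candidates is None else candidates & near
        p = self.pos[pid]
        # Exact filter: the shell intersection also contains far-away points.
        return {q for q in candidates
                if q != pid and math.dist(p, self.pos[q]) < self.cutoff}
```

Since most moves are tiny, `update` usually leaves the bucket indices unchanged and does no set operations at all.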
Consider using the Barnes-Hut algorithm or something similar. A simulation in 2D would use a quadtree data structure to store particles, and a 3D simulation would use an octree.
The benefit of using a tree structure is that it stores the particles in a way that nearby particles can be found quickly by traversing the tree, and far-away particles are in traversal paths that can be ignored.
Wikipedia has a good description of the algorithm:
The Barnes–Hut tree
In a three-dimensional n-body simulation, the Barnes–Hut algorithm recursively divides the n bodies into groups by storing them in an octree (or a quad-tree in a 2D simulation). Each node in this tree represents a region of the three-dimensional space. The topmost node represents the whole space, and its eight children represent the eight octants of the space. The space is recursively subdivided into octants until each subdivision contains 0 or 1 bodies (some regions do not have bodies in all of their octants). There are two types of nodes in the octree: internal and external nodes. An external node has no children and is either empty or represents a single body. Each internal node represents the group of bodies beneath it, and stores the center of mass and the total mass of all its children bodies.
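To illustrate the structure in the quoted description, here is a bare-bones Python sketch of the octree insertion, keeping the total mass and center of mass on every node. The class name `OctreeNode` and its fields are my own naming, and the sketch assumes all bodies have distinct positions and positive mass (two bodies at the same position would subdivide forever). It only builds the tree; the force evaluation part of Barnes-Hut is not shown.

```python
class OctreeNode:
    def __init__(self, center, half):
        self.center, self.half = center, half  # cube center and half side
        self.mass = 0.0
        self.com = (0.0, 0.0, 0.0)             # center of mass of all bodies below
        self.body = None                        # (pos, mass) if occupied external node
        self.children = None                    # 8 sub-nodes if internal node

    def _subdivide(self):
        h = self.half / 2.0
        self.children = [
            OctreeNode(tuple(self.center[k] + (h if (i >> k) & 1 else -h)
                             for k in range(3)), h)
            for i in range(8)
        ]

    def _child_for(self, pos):
        # Bit k of the index is set when pos is on the upper side of axis k.
        idx = sum(1 << k for k in range(3) if pos[k] >= self.center[k])
        return self.children[idx]

    def insert(self, pos, mass):
        if self.children is None and self.body is None:
            self.body = (pos, mass)             # empty external node: store body
        else:
            if self.children is None:           # occupied external node: split
                old_pos, old_mass = self.body
                self.body = None
                self._subdivide()
                self._child_for(old_pos).insert(old_pos, old_mass)
            self._child_for(pos).insert(pos, mass)
        # Update the aggregate mass and center of mass of this node.
        total = self.mass + mass
        self.com = tuple((self.com[k] * self.mass + pos[k] * mass) / total
                         for k in range(3))
        self.mass = total
```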

How to know if two sets of points have same pattern

I have two sets of points in 3D with the same count, and I want to know if they have the same pattern. I thought I might project them onto the XZ, XY and YZ planes and then compare the projections in each plane, but I am not sure how to do this. I thought the convex hull might help, but it won't be accurate.
Is there an easy algorithm to do that? Complexity is not a big issue so far, as the point count will be tiny. I am implementing this in Java.
Can I solve this directly in 3D with the same algorithm?
The attached image shows an example of what I mean.
No guarantee for order.
No scale, there are rotation and translation only.
I would gather some information about each point: information that only depends on "shape", not on the actual translation/rotation. For instance, it could be the sum of all the distances between the point and any other point of the shape. Or it could be the largest angle between any two points, as seen from the point under consideration. Choose whatever metric brings the most diversity.
Then sort the points by that metric.
Do the above for both groups of points.
As a first step you can compare both groups by their sorted list of metrics. Allow for a little error margin, since you will be dealing with floating point precision limitations. If they cannot be mapped to each other, abort the algorithm: they are different shapes.
Now translate the point set so that the first point in the ordered list is mapped to the origin (0, 0, 0), i.e. subtract the first point from all points in the group.
Now rotate the point set around the Y axis, so that the second point in the ordered list lies in the XY plane. Then rotate the point set around the Z axis, so that that point lies on the X axis: it should map to (d, 0, 0), where d is the distance between the first and second point in the sorted list.
Finally, rotate the point set around the X axis, so that the third point in the ordered list coincides with the XY plane. If that point is colinear with the previous points, you need to continue doing this with the next point(s) until you have rotated a non-colinear point.
Do this with both groups of points. Then compare the so-transformed coordinates of both lists.
This is the main algorithm, but I have omitted the cases where the metric value is the same for two points, and thus the sorted list could have permutations without breaking the sort order:
In that case you need to perform the above transformations with the different permutations of those equally valued points at the start of the sorted list, for as long as there is no fit.
Also, while checking the fit, you should take into account that the matching point may not be in the exact same order as in the other group's sorted list, and you should verify the next points that have the same metric as well.
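The first screening step (compare the sorted per-point metrics) is easy to sketch. The snippet below is in Python rather than Java, purely for brevity, and uses the sum of distances to all other points as the metric; the function names are hypothetical. Keep in mind that this is only a necessary condition: a mirror image, for instance, has the same signature, so a match here still has to be confirmed by the alignment steps described above.

```python
import math

def signature(points):
    """Sorted list of 'sum of distances to every other point', one per point."""
    return sorted(sum(math.dist(p, q) for q in points) for p in points)

def may_match(a, b, tol=1e-9):
    """False means the shapes definitely differ; True means 'keep checking'."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= tol for x, y in zip(signature(a), signature(b)))
```

With real measurements the tolerance would need to be scaled to the size of the point set and its noise level rather than left at the default.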
If you have a fixed object with different shapes and movements, pair-wise- or multi-matching can be a helpful solution for you. For example see this paper. This method can be extended for higher-dimensions as well.
If you have two different sets of points that come from different objects and you find the similarity between them, one solution can be computing discrete Frechet distance in both sets of points and then compare their value.
The other related concept is Shape Reconstruction. You can mix the result of a proper shape reconstruction algorithm with two previous methods to compute the similarity:

Intersection of a 3D Grid's Vertices

Imagine an enormous 3D grid (procedurally defined, and potentially infinite; at the very least, 10^6 coordinates per side). At each grid coordinate, there's a primitive (e.g., a sphere, a box, or some other simple, easily mathematically defined function).
I need an algorithm to intersect a ray, with origin outside the grid and direction entering it, against the grid's elements. I.e., the ray might travel halfway through this huge grid, and then hit a primitive. Because of the scope of the grid, an iterative method [EDIT: (such as ray marching)] is unacceptably slow. What I need is some closed-form [EDIT: constant time] solution for finding the primitive hit.
One possible approach I've thought of is to determine the amount the ray would converge each time step toward the primitives on each of the eight coordinates surrounding a grid cell in some modular arithmetic space in each of x, y, and z, then divide by the ray's direction and take the smallest distance. I have no evidence other than intuition to think this might work, and Google is unhelpful; "intersecting a grid" means intersecting the grid's faces.
I really only care about the surface normal of the primitive (I could easily find that given a distance to intersection, but I don't care about the distance per se).
The type of primitive intersected isn't important at this point. Ideally, it would be a box. Second choice, sphere. However, I'm assuming that whatever algorithm is used might be generalizable to other primitives, and if worst comes to worst, it doesn't really matter for this application anyway.
Here's another idea:
The ray can only hit a primitive when all of the x, y and z coordinates are close to integer values.
If we consider the parametric equation for the ray, where a point on the line is given by
p=p0 + t * v
where p0 is the starting point and v is the ray's direction vector, we can plot the distance from the ray to an integer value on each axis as a function of t. e.g.:
dx = abs( ( p0.x + t * v.x + 0.5 ) % 1 - 0.5 )
This will yield three sawtooth plots whose periods depend on the components of the direction vector (e.g. if the direction vector is (1, 0, 0), the x-plot will vary linearly between 0 and 0.5, with a period of 1, while the other plots will remain constant at whatever value p0 gives).
You need to find the first value of t for which all three plots are below some threshold level, determined by the size of your primitives. You can thus vastly reduce the number of t values to be checked by considering the plot with the longest (non-infinite) period first, before checking the higher-frequency plots.
I can't shake the feeling that it may be possible to compute the correct value of t based on the periods of the three plots, but I can't come up with anything that isn't scuppered by the starting position not being the origin, and the threshold value not being zero. :-/
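For reference, the per-axis distance function and the "all three plots below the threshold" test look like this in Python. The function names are mine. One detail worth noting: Python's `%` always returns a non-negative result for a positive modulus, which is what the formula needs; in languages where the remainder takes the sign of the dividend, negative coordinates would need extra care.

```python
def axis_distance(p0, v, t):
    """Distance from p0 + t*v to the nearest integer, along one axis."""
    return abs((p0 + t * v + 0.5) % 1.0 - 0.5)

def near_lattice_point(p0, v, t, threshold):
    """True if the ray point at parameter t is within threshold of an
    integer coordinate on all three axes simultaneously."""
    return all(axis_distance(p0[k], v[k], t) <= threshold for k in range(3))
```

This only evaluates a candidate t; it does not solve the harder problem of finding the first t that passes the test.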
Basically, what you'll need to do is express the line in the form of a function. From there, you just have to calculate mathematically whether the ray intersects each object, and if it does, make sure you pick the intersection closest to the source.
This isn't fast, so you will have to do a lot of optimization here. The most obvious thing is to use bounding boxes instead of the actual shapes. From there, you can do things like use octrees or BSP (Binary Space Partitioning) trees.
Well, anyway, there might be something I am overlooking that becomes possible through the extra limitations you have to your system, but that is how I had to make a ray tracer for a course.
You state in the question that an iterative solution is unacceptably slow - I assume you mean iterative in the sense of testing every object in the grid against the line.
Iterate instead over the grid cubes that the line intersects, and for each cube test the 8 objects that the cube intersects. Look to Bresenham's line drawing algorithm for how to find which cubes the line intersects.
Note that Bresenham's will not return absolutely every cube that the ray intersects, but for finding which primitives to test I'm fairly sure that it'll be good enough.
It also has the nice properties:
Extremely simple - this will be handy if you're running it on the GPU
Returns results iteratively along the ray, so you can stop as soon as you find a hit.
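As a sketch of the cell-by-cell walk, here is a Python generator using the voxel traversal usually attributed to Amanatides and Woo rather than Bresenham's algorithm. I swapped it in because it visits every unit cube the ray passes through, in order, which avoids the missed-cube caveat mentioned above. The function name `traverse_cells` and the `max_cells` cap are assumptions for the example, and unit-sized grid cells are assumed.

```python
import math

def traverse_cells(origin, direction, max_cells):
    """Yield the integer cell coordinates a ray visits, in order."""
    cell = [math.floor(c) for c in origin]
    step, t_max, t_delta = [], [], []
    for o, d, c in zip(origin, direction, cell):
        if d > 0:
            step.append(1)
            t_max.append((c + 1 - o) / d)   # t at which the next boundary is hit
            t_delta.append(1.0 / d)         # t needed to cross one whole cell
        elif d < 0:
            step.append(-1)
            t_max.append((c - o) / d)
            t_delta.append(-1.0 / d)
        else:
            step.append(0)                  # ray never leaves this slab
            t_max.append(math.inf)
            t_delta.append(math.inf)
    for _ in range(max_cells):
        yield tuple(cell)
        axis = t_max.index(min(t_max))      # nearest boundary decides the step
        cell[axis] += step[axis]
        t_max[axis] += t_delta[axis]
```

For each yielded cell the caller would test the primitives at its eight corners and stop at the first hit, for example `for cell in traverse_cells(p0, v, 1000): ...`.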
Try this approach:
1. Determine the function of the ray.
2. Say the grid is divided into planes along the z axis; the ray will intersect each 'z plane' (the plane in which the grid nodes at the same height lie), and you can easily compute the coordinates (x, y, z) of the intersection points from the ray function.
3. Sweep the z planes; you can easily determine which intersection points lie in a cube or a sphere.
4. But the ray may intersect cubes/spheres between the z planes, so you need to repeat steps 1-3 along the x and y axes. This ensures no intersection is missed.
5. Discard the duplicate cubes/spheres found by the x, y and z searches.

How to perform spatial partitioning in n-dimensions?

I'm trying to design an implementation of Vector Quantization as a C++ template class that can handle different types and dimensions of vectors (e.g. 16-dimensional vectors of bytes, or 4D vectors of doubles, etc.).
I've been reading up on the algorithms, and I understand most of it:
here and here
I want to implement the Linde-Buzo-Gray (LBG) Algorithm, but I'm having difficulty figuring out the general algorithm for partitioning the clusters. I think I need to define a plane (hyperplane?) that splits the vectors in a cluster so there is an equal number on each side of the plane.
[edit to add more info]
This is an iterative process, but I think I start by finding the centroid of all the vectors, then use that centroid to define the splitting plane, get the centroid of each of the sides of the plane, continuing until I have the number of clusters needed for the VQ algorithm (iterating to optimize for less distortion along the way). The animation in the first link above shows it nicely.
My questions are:
What is an algorithm to find the plane once I have the centroid?
How can I test a vector to see if it is on either side of that plane?
If you start with one centroid, then you'll have to split it, basically by doubling it and slightly moving the points apart in an arbitrary direction. The plane is just the plane orthogonal to that direction.
But you don't need to compute that plane.
More generally, the region (i) is defined as the set of points which are closer to the centroid c_i than to any other centroid. When you have two centroids, each region is a half space, thus separated by a (hyper)plane.
How to test on a vector x to see on which side of the plane it is? (that's with two centroids)
Just compute the distance ||x-c1|| and ||x-c2||, the index of the minimum value (1 or 2) will give you which region the point x belongs to.
More generally, if you have n centroids, you would compute all the distances ||x-c_i||, and the centroid x is closest to (i.e., the one for which the distance is minimal) gives you the region x belongs to.
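In code, both the split and the assignment are only a few lines. The sketch below is in Python for brevity (the same logic carries over to a C++ template), works for any number of dimensions, and uses made-up function names. The split direction and the size of `eps` are arbitrary choices, as described above.

```python
import math

def split_centroid(c, direction, eps=1e-3):
    """Duplicate a centroid and nudge the two copies apart along `direction`."""
    plus = tuple(ci + eps * di for ci, di in zip(c, direction))
    minus = tuple(ci - eps * di for ci, di in zip(c, direction))
    return plus, minus

def nearest_centroid(x, centroids):
    """Index of the centroid closest to x; this is the region x belongs to."""
    return min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))
```

No plane is ever constructed: with two centroids, comparing the two distances is equivalent to testing which side of the separating hyperplane x lies on.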
I don't quite understand the algorithm, but the second question is easy:
Let's call V a vector which extends from any point on the plane to the point-in-question. Then the point-in-question lies on the same side of the (hyper)plane as the normal N iff V·N > 0
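A tiny Python sketch of that test, valid in any number of dimensions (the function name is hypothetical): it returns the dot product V·N, whose sign tells you the side.

```python
def side_of_plane(x, point_on_plane, normal):
    """Return V.N, where V runs from a point on the plane to x.

    Positive: x is on the side the normal points to.
    Negative: x is on the opposite side. Zero: x lies on the plane.
    """
    return sum((xi - pi) * ni for xi, pi, ni in zip(x, point_on_plane, normal))
```

For the plane separating two centroids c1 and c2, one could take the midpoint of c1 and c2 as the point on the plane and c1 - c2 as the normal.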
