Find closest "true" element in 2D boolean matrix? - algorithm

I have a 2D matrix with boolean values, which is updated highly frequently. I want to choose a 2D index {x, y} within the matrix, and find the nearest element that is "true" in the table, without going through all the elements (the matrix is massive).
For example, if I have the matrix:
0000100
0100000
0000100
0100001
and I choose a coordinate {x1, y1} such as {4, 3}, I want returned the location of the closest "true" value, which in this case is {5, 3}. The distance between the elements is measured using the standard Pythagorean equation:
distance = sqrt(distX * distX + distY * distY) where distX = x1 - x and distY = y1 - y.
I can go through all the elements in the matrix and keep a list of "true" values and select the one with the shortest distance result, but it's extremely inefficient. What algorithm can I use to reduce search time?
Details: The matrix size is 1920x1080, and around 25 queries will be made every frame. The entire matrix is updated every frame. I am trying to maintain a reasonable framerate, more than 7fps is enough.

If the matrix is always being updated, then there is no need to build an auxiliary structure such as a distance transform or a Voronoi diagram.
You can just run a BFS-like (breadth-first) search propagating outward from the query point. The only difference from the usual BFS is the Euclidean metric. So you can generate (u, v) pairs with 0 <= v <= u ordered by (u^2+v^2) and check the symmetric points shifted by the (+-u,+-v),(+-v,+-u) combinations (one point when u = v = 0, four points when u or v is zero or when u = v, eight points otherwise).
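A minimal Python sketch of this idea, under a few assumptions of mine: the matrix is a list of rows indexed as grid[y][x] with 0-based coordinates, and the search is capped at a chosen max_radius. Instead of generating the symmetric (u, v) combinations on the fly, it precomputes every offset inside the disc once and sorts it by squared distance, so the list can be reused for all queries in a frame. The names sorted_offsets and nearest_true are made up for illustration.

```python
def sorted_offsets(max_radius):
    """All integer offsets inside a disc, ordered by squared Euclidean distance."""
    r2 = max_radius * max_radius
    offsets = [(dx, dy)
               for dx in range(-max_radius, max_radius + 1)
               for dy in range(-max_radius, max_radius + 1)
               if dx * dx + dy * dy <= r2]
    offsets.sort(key=lambda o: o[0] * o[0] + o[1] * o[1])
    return offsets


def nearest_true(grid, x, y, offsets):
    """Return the (x, y) of the closest true cell, or None if none is within the disc."""
    height, width = len(grid), len(grid[0])
    for dx, dy in offsets:
        nx, ny = x + dx, y + dy
        # The first hit in distance order is a nearest true element.
        if 0 <= nx < width and 0 <= ny < height and grid[ny][nx]:
            return nx, ny
    return None


rows = ["0000100", "0100000", "0000100", "0100001"]
grid = [[ch == "1" for ch in row] for row in rows]
offsets = sorted_offsets(3)          # build once, reuse for every query
result = nearest_true(grid, 3, 2, offsets)
```

With the matrix from the question, the 0-based query (3, 2) corresponds to the 1-based {4, 3}, and the result (4, 2) corresponds to {5, 3}. Restricting the offsets to a disc rather than a square matters: a square would let a far corner cell win over a closer cell just outside the square.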

You could use a tree data structure like a quad-tree (see https://en.wikipedia.org/wiki/Quadtree) to store all locations with value "true". In this way it should be possible to quickly iterate over all "true" values in the neighborhood of a given location. Furthermore, the tree can be updated in logarithmic time, if the value of a location changes.

Related

How can you iterate linearly through a 3D grid?

Assume we have a 3D grid that spans some 3D space. This grid is made out of cubes, the cubes need not have integer length, they can have any possible floating point length.
Our goal is, given a point and a direction, to check linearly each cube in our path once and exactly once.
So if this was just a regular 3D array and the direction is say in the X direction, starting at position (1,2,0) the algorithm would be:
for(i in number of cubes)
{
grid[1+i][2][0]
}
But of course the origin and the direction are arbitrary and floating point numbers, so it's not as easy as iterating through only one dimension of a 3D array. And the fact the side lengths of the cubes are also arbitrary floats makes it slightly harder as well.
Assume that your cube side lengths are s = (sx, sy, sz), your ray direction is d = (dx, dy, dz), and your starting point is p = (px, py, pz). Then, the ray that you want to traverse is r(t) = p + t * d, where t is an arbitrary positive number.
Let's focus on a single dimension. If you are currently at the lower boundary of a cube, then the step length dt that you need to make on your ray in order to get to the upper boundary of the cube is: dt = s / d. And we can calculate this step length for each of the three dimensions, i.e. dt is also a 3D vector.
Now, the idea is as follows: Find the cell where the ray's starting point lies in and find the parameter values t where the first intersection with the grid occurs per dimension. Then, you can incrementally find the parameter values where you switch from one cube to the next for each dimension. Sort the changes by the respective t value and just iterate.
Some more details:
cell = floor((p - gridLowerBound) / s) <-- the / is component-wise division
I will only cover the case where the direction is positive. There are some minor changes if you go in the negative direction but I am sure that you can do these.
Find the first intersections per dimension (nextIntersection is a 3D vector):
nextIntersection = (gridLowerBound + (cell + (1, 1, 1)) * s - p) / d
And calculate the step length:
dt = s / d
Now, just iterate:
loop
    if(nextIntersection.x < nextIntersection.y && nextIntersection.x < nextIntersection.z)
        cell.x++
        nextIntersection.x += dt.x
    else if(nextIntersection.y < nextIntersection.z)
        cell.y++
        nextIntersection.y += dt.y
    else
        cell.z++
        nextIntersection.z += dt.z
    end if
    if cell is outside of grid
        terminate
end loop
I have omitted the case where two or three cells are changed at the same time. The above code will only change one at a time. If you need this, feel free to adapt the code accordingly.
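The steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function name traverse_grid and the parameters lower (the grid's lower corner) and shape (number of cells per axis) are mine, and the sketch also covers negative and zero direction components, which the description above leaves out. Like the pseudocode, it changes one axis at a time when two boundaries are hit simultaneously.

```python
import math


def traverse_grid(p, d, s, lower, shape):
    """Yield the cells crossed by the ray r(t) = p + t*d, in order, until it leaves the grid."""
    cell = [math.floor((p[i] - lower[i]) / s[i]) for i in range(3)]
    step = [0, 0, 0]
    next_t = [math.inf] * 3   # ray parameter of the next boundary crossing per axis
    dt = [math.inf] * 3       # parameter increment between crossings per axis
    for i in range(3):
        if d[i] > 0:
            step[i] = 1
            next_t[i] = (lower[i] + (cell[i] + 1) * s[i] - p[i]) / d[i]
            dt[i] = s[i] / d[i]
        elif d[i] < 0:
            step[i] = -1
            next_t[i] = (lower[i] + cell[i] * s[i] - p[i]) / d[i]
            dt[i] = -s[i] / d[i]
    while all(0 <= cell[i] < shape[i] for i in range(3)):
        yield tuple(cell)
        axis = next_t.index(min(next_t))   # axis whose boundary is reached first
        if step[axis] == 0:                # zero direction vector: nothing more to visit
            return
        cell[axis] += step[axis]
        next_t[axis] += dt[axis]


along_x = list(traverse_grid((0.5, 0.5, 0.5), (1, 0, 0), (1, 1, 1), (0, 0, 0), (3, 3, 3)))
diagonal = list(traverse_grid((0.5, 0.5, 0.5), (1, 2, 0), (1, 1, 1), (0, 0, 0), (3, 3, 1)))
```

Written as a generator, the caller can stop early (for example at the first occupied cube) without computing the rest of the path.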
If you are working with floats, you can write the equation of the line in the specified direction, parameterized by t. Because there is only a finite number of representable floats between any two floats, you can simply check, for each of these points, which cube it lies in: the components of the point (x, y, z) must each fall in the respective interval that defines the cube.
The issue gets a little harder if you consider intervals that are dense.
The key here is that, even with floats, this is a discrete search problem. Since the line between any two points is a discrete set of points, you merely need to check each of them against the cube intervals. Better still, there is a symmetry (the line) that lets you enumerate the points one after another with an arithmetic expression.
Also, perhaps consider the integer case first, as it is the same but slightly simpler in determining the discrete points, as it is a line in Z_2^8?

Indexing and retrieving data using index for a 3D grid for interpolation in c++

I have 3D Cartesian grid data that needs to be used to create a regular 3D mesh for an interpolation method. x, y and z are 3 vectors with data points that are used to form this grid. My question is: how can I efficiently assign 2 indices to these points? One is a point index, where c000 is indexed as point 1 at (1,1,1) and c100 as point 2 at (2,1,1) for the (x,y,z) coordinate points, and the other is a cube index that identifies the 8 points forming a cube.
Say I have a point C; I must retrieve the nearest 8 points for interpolation, so for the points c000, c100, c110, c010, c001, c101, c111, c011 I need both the point index and the cube index. Since the available data is huge, the focus is on a fast implementation. Please give me some hints on how to proceed.
About the maths:
Identifying the cube that surrounds a point p requires a mapping
U ⊂ ℝ+**3 -> ℕ:
p' (= p - O_) -> hash_r(p');
"O_" being located at (min_x(G),min_y(G),min_z(G)) of the Grid G.
Along each axis, the cube numbering is trivial.
Given a compound cube number
n_ = (s,t,u)
and N_x, N_y, N_z being the size of your X_, Y_, Z_, a suitable hash would be
hash_n(n_) = s
| t * 2**(floor(log_2(N_x))+1)
| u * 2**(floor(log_2(N_x)) + floor(log_2(N_y)) + 2).
To calculate e.g. "s" for a point C, take
s = floor((C[0] - O_[0]) / a)
"a" being the edge length of the cubes.
About taking that to C++
Given you have enough space to allocate
(2**(floor(log_2(max(N_x, N_y, N_z)))+1))**3
buckets, a std::unordered_map<hash_t,Cube> using that (perfect) hash would offer O(1) for finding the cube for a point p.
A less pompous std::map<hash_t,Cube> using a less-than comparison based on that hash would offer O(log(N)) find complexity.
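To make the arithmetic concrete, here is a small Python sketch of the cube numbering and the bit-packed key. It assumes cubes of equal edge length a and per-axis cube indices below N_x, N_y, N_z; the names cube_index and cube_key are hypothetical, and a dict stands in for the std::unordered_map. For n >= 1, Python's n.bit_length() equals floor(log_2(n)) + 1, which is the field width used in the hash above.

```python
import math


def cube_index(point, origin, edge):
    """Per-axis cube numbers (s, t, u) of the cube that contains the point."""
    return tuple(math.floor((point[i] - origin[i]) / edge) for i in range(3))


def cube_key(index, counts):
    """Pack (s, t, u) into one integer, giving each axis its own bit field."""
    s, t, u = index
    bits_x = counts[0].bit_length()   # floor(log2(N_x)) + 1
    bits_y = counts[1].bit_length()   # floor(log2(N_y)) + 1
    return s | (t << bits_x) | (u << (bits_x + bits_y))


counts = (4, 4, 4)                    # N_x, N_y, N_z (assumed example sizes)
origin = (0.0, 0.0, 0.0)              # O_, the minimum corner of the grid
index = cube_index((1.5, 2.25, 3.9), origin, 1.0)
key = cube_key(index, counts)

# A dict keyed by the packed integer plays the role of the hash map of cubes.
cubes = {cube_key((s, t, u), counts): (s, t, u)
         for s in range(4) for t in range(4) for u in range(4)}
```

Because the three bit fields do not overlap, every (s, t, u) maps to a distinct key, which is what makes the hash perfect. The eight corner points of the cube are then at indices (s + i, t + j, u + k) for i, j, k in {0, 1}.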

find all points within a range to any point of an other set

I have two sets of points A and B.
I want to find all points in B that are within a certain range r of A, where a point b in B is said to be within range r of A if there is at least one point a in A whose (Euclidean) distance to b is less than or equal to r.
Each of the two sets is a coherent set of points. They are generated from the voxel locations of two non-overlapping objects.
In 1D this problem is fairly easy: all points of B within [min(A)-r, max(A)+r].
But I am in 3D.
What is the best way to do this?
I currently search, for every point in A, for all points in B that are within range using some knn algorithm (i.e. MATLAB's rangesearch), and then unite all those sets. But I have a feeling that there should be a better way to do this. I'd prefer a high-level/vectorized solution in MATLAB, but pseudocode is fine too :)
I also thought of writing all the points to images and using image dilation on object A with a radius of r. But that sounds like quite an overhead.
You can use a k-d tree to store all points of A.
Iterate over the points b of B, and for each point find the nearest point in A (call it a) using the k-d tree. The point b should be included in the result if and only if the distance d(a,b) is at most r.
Complexity will be O(|B| * log(|A|) + |A|*log(|A|))
I achieved a further speedup by enhancing @amit's solution: first filter out the points of B that are definitely too far away from all points in A because they are too far away even in a single dimension (kind of following the 1D solution mentioned in the question).
Doing so limits the complexity to O(|B|+min(|B|,(2r/res)^3) * log(|A|) + |A|*log(|A|)) where res is the minimum distance between two points and thus reduces run time in the test case to 5s (from 10s, and even more in other cases).
example code in matlab:
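r=5;
A=randn(10,3);
B=randn(200,3)+5;
% axis-aligned bounding box of A, grown by r on every side
roughframe=[min(A,[],1)-r;max(A,[],1)+r];
% mark points of B that lie outside the box in at least one dimension
sortedout=any(bsxfun(@lt,B,roughframe(1,:)),2)|any(bsxfun(@gt,B,roughframe(2,:)),2);
B=B(~sortedout,:);
% distance from each remaining point of B to its nearest neighbour in A
[~,dist]=knnsearch(A,B);
B=B(dist<=r,:);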
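For readers without MATLAB, the same two-stage idea could look like this in plain Python. Note the substitution: a brute-force minimum over A stands in for the k-d tree / knnsearch query, so this sketch shows the filtering logic, not the O(log |A|) lookup. The function name points_within_range is made up for illustration, and points are assumed to be tuples of equal length.

```python
import math


def points_within_range(A, B, r):
    """Return the points of B that lie within distance r of at least one point of A."""
    dims = range(len(A[0]))
    # Bounding box of A, grown by r on every side (the 1D idea applied per axis).
    lo = [min(a[i] for a in A) - r for i in dims]
    hi = [max(a[i] for a in A) + r for i in dims]
    candidates = [b for b in B if all(lo[i] <= b[i] <= hi[i] for i in dims)]
    # Exact test on the survivors; a k-d tree nearest-neighbour query would replace this.
    return [b for b in candidates if min(math.dist(a, b) for a in A) <= r]


A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 0.5), (5.0, 5.0, 5.0), (1.5, 0.0, 0.0), (2.0, 1.0, 1.0)]
close_points = points_within_range(A, B, 1.0)
```

In this example (5, 5, 5) is rejected by the bounding box alone, while (2, 1, 1) passes the box but fails the exact distance test, which is why both stages are needed.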
bsxfun() is your friend here. Say you have 10 points in set A and 3 points in set B. Arrange them so that A has a singleton dimension along the columns and B has one along the rows. I will randomly generate them for demonstration:
A = rand(10, 1, 3); % 10 points in x, y, z, singleton in columns
B = rand(1, 3, 3); % 3 points in x, y, z, singleton in rows
Then the distances between all pairs of points can be calculated in two steps:
dd = bsxfun(@(x,y) (x - y).^2, A, B); % squared differences in x, y, z
d = sqrt(sum(dd, 3)); % this completes sqrt(dx^2 + dy^2 + dz^2)
Now you have an array of the distances between the points in A and B. So, for example, the distance between point 3 in A and point 2 in B is in d(3, 2). Hope this helps.

Optimizing the layout of a graph with given (erroneous) node-distances

I have a loosely connected graph. For every edge in this graph, I know the approximate distance d(v,w) between nodes v and w at positions p(v) and p(w) as a vector in R3, not only as a Euclidean distance. The error is small (let's say < 3%) and the first node is at <0,0,0>.
If there were no errors at all, I could calculate the node positions this way:
set p(first_node) = <0,0,0>
calculate_position(first_node)

calculate_position(v):
    for (v,w) in Edges:
        if p(w) is not set:
            set p(w) = p(v) + d(v,w)
            calculate_position(w)
    for (u,v) in Edges:
        if p(u) is not set:
            set p(u) = p(v) - d(u,v)
            calculate_position(u)
The errors of the distance are not equal. But to keep things simple, assume the relative error (d(v,w)-d'(v,w))/E(v,w) is N(0,1)-normal-distributed. I want to minimize the sum of the squared error
sum( ((p(v)-p(w)) - d(v,w) )^2/E(v,w)^2 ) for all edges
The graph may have a moderate number of nodes ( > 100 ) but with just a few connections between the nodes, and it has been "prefiltered" (split into subgraphs if there is only one connection between these subgraphs).
I have tried a simplistic "physical model" with Hooke's law, but it's slow and unstable. Is there a better algorithm or heuristic for this kind of problem?
This looks like linear regression. Take error terms of the following form, i.e. without squares and split into separate coordinates:
(px(v) - px(w) - dx(v,w))/E(v,w)
(py(v) - py(w) - dy(v,w))/E(v,w)
(pz(v) - pz(w) - dz(v,w))/E(v,w)
If I understood you correctly, you are looking for values px(v), py(v) and pz(v) for all nodes v such that the sum of squares of the above terms is minimized.
You can do this by creating a matrix A and a vector b in the following way: every row corresponds to one equation of the above form, and every column of A corresponds to one variable, i.e. a single coordinate. For n vertices and m edges, the matrix A will have 3m rows (since you separate coordinates) and 3n−3 columns (since you also fix the first node px(0)=py(0)=pz(0)=0).
The row for (px(v) - px(w) - dx(v,w))/E(v,w) would have an entry 1/E(v,w) in the column for px(v) and an entry -1/E(v,w) in the column for px(w). All other columns would be zero. The corresponding entry in the vector b would be dx(v,w)/E(v,w).
Now solve the linear equation (AT·A)x = AT·b where AT denotes the transpose of A. The solution vector x will contain the coordinates for your vertices. You can break this into three independent problems, one for each coordinate direction, to keep the size of the linear equation system down.
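As an illustration, here is a dependency-free Python sketch for one coordinate direction. It accumulates AT·A and AT·b directly from the edges instead of storing A, then solves the system with plain Gaussian elimination; in practice a linear algebra library would do that last step. Assumptions: nodes are numbered 0..n-1 with node 0 fixed at 0, the graph is connected, and each edge is a tuple (v, w, d, err) following the sign convention of the error terms above, p(v) - p(w) ≈ d. If your data follows p(w) = p(v) + d(v,w) as in the question's pseudocode, flip the sign of d. The names fit_coordinate and solve_linear are made up.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x


def fit_coordinate(n, edges):
    """Weighted least-squares positions for one coordinate; node 0 is pinned to 0."""
    m = n - 1                                   # node 0 has no column
    ata = [[0.0] * m for _ in range(m)]
    atb = [0.0] * m
    for v, w, d, err in edges:
        weight = 1.0 / (err * err)              # row entries are +-1/err, so products carry 1/err^2
        for node, sign in ((v, 1.0), (w, -1.0)):
            if node == 0:
                continue
            atb[node - 1] += sign * weight * d
            for other, other_sign in ((v, 1.0), (w, -1.0)):
                if other != 0:
                    ata[node - 1][other - 1] += sign * other_sign * weight
    return [0.0] + solve_linear(ata, atb)


# Three nodes on a line at 0, 1 and 3, with consistent measurements.
edges_x = [(1, 0, 1.0, 1.0), (2, 1, 2.0, 1.0), (2, 0, 3.0, 1.0)]
px = fit_coordinate(3, edges_x)
```

Running the same function on the y and z components of the edge vectors gives the other two coordinates, which is the split into three independent problems mentioned above.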

Traversing a 2D array in an angle

Generally we traverse the array by row or column but here I want to traverse it in an angle.
I will try and explain what I mean,
So let's say the angle is 45 degrees; then rather than going row by column, it would search (0,0), then (0,1), (1,0), then (0,2), (1,1), (2,0) and so on. (Sorry, I could not upload an image as I am a new user and not allowed to do so; drawing an array may help to see what I am trying to say.)
But what happens if the user inputs an angle like 20 degrees? How can we determine how to search the array?
I just wanted to know if there is any algorithm which does something similar to this. The programming language is not an issue; I guess the issue is more of an algorithmic sort.
Any ideas would be welcome.
Please feel free to ask if I am not able to explain clearly what I am looking for.
Thanks guys.
Easy. Take an angle (let's say 45 degrees). This corresponds to a vector v=(1, 1) in your case. (This can be normalized to a unit vector (sqrt(2)/2, sqrt(2)/2), but this is not necessary.)
For every single point in your array, you have their coordinates (x, y). Simply do the scalar product of these coordinates with the vector. Let's call f(x, y) = scalarProduct((x, y), v)
Sort the values of f(x, y) and you've got the "traversing" you're looking for!
A real example.
Your matrix is 3x3
The scalar products are :
(0,0).(1,1) = 0
(0,1).(1,1) = 1
(0,2).(1,1) = 2
(1,0).(1,1) = 1
(1,1).(1,1) = 2
(1,2).(1,1) = 3
(2,0).(1,1) = 2
(2,1).(1,1) = 3
(2,2).(1,1) = 4
If you sort these scalar products in ascending order, you obtain the ordering (0,0), (1,0), (0,1), (2,0), (1,1), (0,2), (2,1)...
And if you want to do it with an angle of 20 degrees, replace all occurrences of v=(1, 1) with v=(cos(20°), sin(20°)).
As a geometrical interpretation: draw a line perpendicular to v through every point of the array. The scalar products correspond to the intersections of the vector v with these lines.
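A short Python sketch of this sort-by-scalar-product traversal, with a couple of choices that are mine rather than part of the answer: the function name traverse_at_angle, the angle given in degrees, and the projection rounded to 9 decimals so that cells on the same "diagonal" compare equal despite floating-point noise and are then ordered by their coordinates.

```python
import math


def traverse_at_angle(rows, cols, angle_degrees):
    """Return all (x, y) cells ordered by their projection onto the direction vector."""
    theta = math.radians(angle_degrees)
    v = (math.cos(theta), math.sin(theta))
    cells = [(x, y) for x in range(rows) for y in range(cols)]
    # Rounded scalar product first, then the coordinates as a deterministic tie-break.
    return sorted(cells, key=lambda c: (round(c[0] * v[0] + c[1] * v[1], 9), c))


order_45 = traverse_at_angle(3, 3, 45)
order_20 = traverse_at_angle(3, 3, 20)
```

For 45 degrees this gives (0,0), (0,1), (1,0), (0,2), (1,1), (2,0), ..., the order described in the question. Cells with equal scalar products can be visited in any order, so this is the same traversal as the listing above with a different tie-break.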
For every starting point (the leftmost point of every row), use trigonometry to determine an ending point for the given angle. tan(angle) is defined as (height difference / width of the array), so your height difference is tan(angle)*(width of the array). You only have to calculate the height difference once. If y + height difference is greater than the height of the array, just subtract the height (or use the modulo operator).
Now that you have a starting point and an ending point you could use Bresenham's Algorithm to determine the points in between: http://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm
You want to look for a space-filling curve, for example a Morton curve (Z-order curve). If you want to subdivide the array into 4 tiles, you may want to look at a Hilbert curve or a Moore curve.
