Search in array with high dimensions having specific properties - algorithm

I have a 3D array in which values are monotonic. How can I find all (x, y, z) such that |f(x, y, z) - v1| < t?

There are Omega(n^2) points whose coordinates sum to n - 1. Nothing is known a priori about how the values of these points compare to each other, so, in the worst case, all of them must be inspected. An upper bound that matches up to constant factors is provided by running the 2D algorithm in each constant-z slice.
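The 2D algorithm is not spelled out in this answer. Assuming it is the usual staircase walk over a matrix that is non-decreasing along both axes, a minimal sketch of the per-slice approach might look like this. The names `slice_matches` and `all_matches`, the callable `f`, and the closed interval `[lo, hi]` (standing in for v1 - t and v1 + t) are illustrative assumptions, not part of the original answer.

```python
def slice_matches(f, n, z, lo, hi):
    """Return all (x, y, z) in one constant-z slice with lo <= f(x, y, z) <= hi.

    Assumes f (a hypothetical callable) is non-decreasing in x and in y
    for 0 <= x, y < n.
    """
    out = []
    first_ge_lo = n  # smallest y in the current row with f >= lo (n if none)
    first_gt_hi = n  # smallest y in the current row with f > hi (n if none)
    for x in range(n):
        # Both boundaries can only move towards smaller y as x grows,
        # so each of them takes at most n steps over the whole slice.
        while first_ge_lo > 0 and f(x, first_ge_lo - 1, z) >= lo:
            first_ge_lo -= 1
        while first_gt_hi > 0 and f(x, first_gt_hi - 1, z) > hi:
            first_gt_hi -= 1
        out.extend((x, y, z) for y in range(first_ge_lo, first_gt_hi))
    return out


def all_matches(f, n, lo, hi):
    """Run the staircase walk in each of the n constant-z slices."""
    return [cell for z in range(n) for cell in slice_matches(f, n, z, lo, hi)]
```

Each slice costs O(n) function evaluations plus the size of its output, so the n slices together stay within the O(n^2) bound mentioned above.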

For each value (e.g. v1), execute the following steps:
1. Execute the 2D algorithm on the 4 cube faces parallel to the X axis (Y=0, Y=n-1, Z=0, Z=n-1). Index the resulting set of matching (X, Y, Z) cells by X coordinate for the next step.
2. Execute the 2D algorithm on all n slices along the X axis (X=0..n-1), using the result of step 1 to initialize the first boundary point for the 2D algorithm. If there are no matching cells for the given X coordinate, move on to the next slice in constant time.
Worst-case complexity will be O(n * T), where T is the cost of one run of the 2D algorithm.
For multiple values (v2, etc.), keep a cache of function evaluations and re-execute the algorithm for each value. For 100^3, a dense array would suffice.
It might be useful to think of this as an isosurface extraction algorithm, though your monotonicity constraint makes it easier.

If the 3D array is monotonically non-decreasing in each dimension then we know that if
f(x0, y0, z0) > v1 + t
or
f(x1, y1, z1) < v1 - t
then the sub-array f(x0...x1, y0...y1, z0...z1) cannot contain any interesting point. To see this consider for example that
f(x0, y0, z0) <= f(x, y0, z0) <= f(x, y, z0) <= f(x, y, z)
holds for each (x, y, z) of the sub-array, and a similar relation holds (with reversed direction) for (x1, y1, z1). Thus f(x0, y0, z0) and f(x1, y1, z1) are the minimum and maximum value of the sub-array, respectively.
A simple search approach can then be implemented by using a recursive subdivision scheme:
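template<typename T, typename CBack>
int values(Mat3<T>& data, T v0, T v1, CBack cback,
           int x0, int y0, int z0, int x1, int y1, int z1) {
    // Searches the half-open block [x0,x1) x [y0,y1) x [z0,z1) for values
    // in [v0, v1]; returns the number of elements that were inspected.
    int count = 0;
    // An empty block can appear when a side of length 1 is split; skip it
    // so that data(x1-1, y1-1, z1-1) below is never read out of range.
    if (x0 >= x1 || y0 >= y1 || z0 >= z1) return 0;
    if (x1 - x0 <= 2 && y1 - y0 <= 2 && z1 - z0 <= 2) {
        // Small block (1-8 cells), just scan it
        for (int x=x0; x<x1; x++) {
            for (int y=y0; y<y1; y++) {
                for (int z=z0; z<z1; z++) {
                    T v = data(x, y, z);
                    if (v >= v0 && v <= v1) cback(x, y, z);
                    count += 1;
                }
            }
        }
    } else {
        // va is the minimum and vb the maximum of the block
        T va = data(x0, y0, z0), vb = data(x1-1, y1-1, z1-1);
        count += 2;
        if (vb >= v0 && va <= v1) {
            // The block may contain matches: split it into 8 sub-blocks
            int x[] = {x0, (x0 + x1) >> 1, x1};
            int y[] = {y0, (y0 + y1) >> 1, y1};
            int z[] = {z0, (z0 + z1) >> 1, z1};
            for (int ix=0; ix<2; ix++) {
                for (int iy=0; iy<2; iy++) {
                    for (int iz=0; iz<2; iz++) {
                        count += values<T, CBack>(data, v0, v1, cback,
                                                  x[ix], y[iy], z[iz],
                                                  x[ix+1], y[iy+1], z[iz+1]);
                    }
                }
            }
        }
    }
    return count;
}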
The code accepts a sub-array and skips the search if the lowest element is too big or the highest element is too small; otherwise it splits the array into 8 sub-blocks. The recursion ends when the sub-array is small (2x2x2 or less), in which case a full scan is performed.
Experimentally I found that this quite simple approach performs well. I generated an array with 100x200x300 elements by setting element f(i,j,k) to max(f(i-1,j,k), f(i,j-1,k), f(i,j,k-1)) + random(100). Searching it for the middle value with t=1 checks only about 3% of the elements (25 elements checked for each element found within range).
Data 100x200x300 = 6000000 elements, range [83, 48946]
Looking for [24594-1=24593, 24594+1=24595]
Result size = 6850 (5.4 ms)
Full scan = 6850 (131.3 ms)
Search count = 171391 (25.021x, 2.857%)
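For anyone who wants to reproduce the experiment, here is a minimal sketch of how such a test array could be generated and checked. It assumes that random(100) means a uniform integer in 0..99 and that missing neighbours at the border count as 0; the function names are hypothetical.

```python
import random


def make_monotone(nx, ny, nz, seed=0):
    """Build f[i][j][k] = max of the three lower neighbours + random step."""
    rng = random.Random(seed)
    f = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                prev = max(
                    f[i - 1][j][k] if i else 0,  # assumption: border counts as 0
                    f[i][j - 1][k] if j else 0,
                    f[i][j][k - 1] if k else 0,
                )
                f[i][j][k] = prev + rng.randrange(100)
    return f


def is_monotone(f):
    """Check that f is non-decreasing along each of the three axes."""
    nx, ny, nz = len(f), len(f[0]), len(f[0][0])
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                v = f[i][j][k]
                if i and f[i - 1][j][k] > v:
                    return False
                if j and f[i][j - 1][k] > v:
                    return False
                if k and f[i][j][k - 1] > v:
                    return False
    return True
```

The exact counts will differ from the figures above because the random sequence is different.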

Since the function is non-decreasing, I think you can do something with binary searches.
Inside a (x, 1, 1) (column) vector you can do a binary search to find the range that matches your requirement which would be O(log(n)).
To find which column vectors to look in you can do a binary search over (x, y, 1) (slices) vectors checking just the first and last points to know if the value can fall in them which will take again O(log(n)).
To know which slices to look in you can binary search the whole cube checking the 4 points ((0, 0), (x, 0), (x, y), (0, y)) which would take O(log(n)).
So in total, the algorithm will take log(z) + a * log(y) + b * log(x) where a is the number of matching slices and b is the number of matching columns.
A naive worst-case bound is O(y * z * log(x)).
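As a rough illustration of the innermost step only, the range of matching entries inside a single non-decreasing column can be found with two binary searches. This sketch uses Python's standard bisect module; `matching_range`, `lo` and `hi` are names chosen here, with `lo` and `hi` standing in for v1 - t and v1 + t (closed interval).

```python
from bisect import bisect_left, bisect_right


def matching_range(column, lo, hi):
    """Half-open index range [start, stop) of entries with lo <= value <= hi.

    Assumes column is a non-decreasing sequence.
    """
    start = bisect_left(column, lo)   # first index with value >= lo
    stop = bisect_right(column, hi)   # first index with value > hi
    return start, stop
```

If start equals stop, the column contains no matching entry.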

Related

Writing a vector sum in MATLAB

Suppose I have a function phi(x1,x2)=k1*x1+k2*x2 which I have evaluated over a grid where the grid is a square having boundaries at -100 and 100 in both x1 and x2 axis with some step size say h=0.1. Now I want to calculate this sum over the grid with which I'm struggling:
What I was trying :
clear all
close all
clc
D=1; h=0.1;
D1 = -100;
D2 = 100;
X = D1 : h : D2;
Y = D1 : h : D2;
[x1, x2] = meshgrid(X, Y);
k1=2;k2=2;
phi = k1.*x1 + k2.*x2;
figure(1)
surf(X,Y,phi)
m1=-500:500;
m2=-500:500;
[M1,M2,X1,X2]=ndgrid(m1,m2,X,Y)
sys=@(m1,m2,X,Y) (k1*h*m1+k2*h*m2).*exp((-([X Y]-h*[m1 m2]).^2)./(h^2*D))
sum1=sum(sys(M1,M2,X1,X2))
Matlab says error in ndgrid, any idea how I should code this?
MATLAB shows:
Error using repmat
Requested 10001x1001x2001x2001 (298649.5GB) array exceeds maximum array size preference. Creation of arrays greater
than this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference
panel for more information.
Error in ndgrid (line 72)
varargout{i} = repmat(x,s);
Error in new_try1 (line 16)
[M1,M2,X1,X2]=ndgrid(m1,m2,X,Y)
Judging by your comments and your code, it appears as though you don't fully understand what the equation is asking you to compute.
To obtain the value M(x1,x2) at some given (x1,x2), you have to compute that sum over Z^2. Of course, using a numerical toolbox such as MATLAB, you could only ever hope to compute over some finite range of Z^2. In this case, since (x1,x2) covers the range [-100,100] x [-100,100], and h=0.1, it follows that m covers the range [-1000, 1000] x [-1000, 1000]. Example: m = (-1000, -1000) gives you mh = (-100, -100), which is the bottom-left corner of your domain. So really, phi(mh) is just phi(x1,x2) evaluated on all of your discretised points.
As an aside, since you need to compute |x-hm|^2, you can treat x = x1 + i x2 as a complex number to make use of MATLAB's abs function. If you were strictly working with vectors, you would have to use norm, which is OK too, but a bit more verbose. Thus, for some given x=(x10, x20), you would compute x-hm over the entire discretised plane as (x10 - x1) + i (x20 - x2).
Finally, you can compute 1 term of M at a time:
D=1; h=0.1;
D1 = -100;
D2 = 100;
X = (D1 : h : D2); % X is in rows (dim 2)
Y = (D1 : h : D2)'; % Y is in columns (dim 1)
k1=2;k2=2;
phi = k1*X + k2*Y;
M = zeros(length(Y), length(X));
for j = 1:length(X)
    for i = 1:length(Y)
        % treat (x - hm) as a complex number
        x_hm = (X(j)-X) + 1i*(Y(i)-Y); % this computes x-hm for all m
        M(i,j) = 1/(pi*D) * sum(sum(phi .* exp(-abs(x_hm).^2/(h^2*D)), 1), 2);
    end
end
By the way, this computation takes quite a long time. You can consider either increasing h, reducing D1 and D2, or changing all three of them.
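As for the ndgrid error in the question: each output of ndgrid is a full array with one element per combination of the inputs, so its size is the product of the input lengths. A quick back-of-the-envelope check, assuming 8-byte doubles and that MATLAB's "GB" means 2^30 bytes (the helper name `ndgrid_gib` is made up for this sketch):

```python
def ndgrid_gib(*lengths, bytes_per_element=8):
    """Size in GiB of ONE dense array whose dimensions are the given lengths."""
    count = 1
    for n in lengths:
        count *= n
    return count * bytes_per_element / 2**30


# Dimensions quoted in the error message
size = ndgrid_gib(10001, 1001, 2001, 2001)
print(f"{size:.1f} GiB per output array")
```

This comes out at roughly 298649.5, which agrees with the figure in the error message. The loop above avoids the problem because it never materialises more than one 2001x2001 array at a time.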

How to loop over matrix in Octave to generate cross-term polynomial of order n

What I am trying to do is the following: I have an n x m sized matrix, with n rows of data and m columns. Each of these columns is a different variable (think X, Y, Z, etc.).
What I want is to output a n x (m+f(m, i)) matrix, where i is the order of the polynomial requested, and f(m, i) is the number of terms, including cross terms of the polynomial.
I'll give an example, say I have a matrix with one row and three columns, and I want to return the polynomial terms up to order 3.
input = [x, y, z]
I want to get to
output = [x, y, z, x^2, y^2, z^2, x*y, x*z, y*z, x^3, y^3, z^3, x^2*y, x^2*z, x*y^2, y^2*z, x*z^2, y*z^2, x*y*z]
From this we see f(3, 3) = 16.
I know I can do this with m nested loops, and I believe I can vectorize any algorithm over the number of rows, but it would be helpful to have a more efficient algorithm than brute force.
This can be done numerically using the following code, should be pretty easy to do symbolically as well.
function MatrixWithPolynomialTerms = GeneratePolynomialTerms(InputDataMatrix, n)
  % start with the degree-1 terms (the input columns themselves)
  resultMatrix = InputDataMatrix;
  [nr, nc] = size(InputDataMatrix);
  % every exponent tuple in {0..n}^nc, one tuple per column of combs
  cart = nthargout ([1:nc], @ndgrid, [0:n]);
  combs = cell2mat (cellfun (@(c) c(:), cart, "UniformOutput", false))';
  for i = 1:size(combs, 2)
    degree = sum(combs(:, i));
    % skip degree 0 (constant) and degree 1 (already in resultMatrix)
    if (degree >= 2 && degree <= n)
      resultColumn = ones(nr, 1);
      for j = 1:nc
        resultColumn .*= (InputDataMatrix(:, j).^combs(j, i));
      end
      resultMatrix = [resultMatrix, resultColumn];
    end
  end
  MatrixWithPolynomialTerms = resultMatrix;
endfunction
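The loop above visits all (n+1)^m exponent tuples and discards most of them. As an alternative sketch (not the method used in the answer), the monomials can be enumerated directly as multisets of column indices, here in Python with the standard itertools module for a single row of data. The function name `polynomial_terms` is hypothetical, and the column order differs slightly from the order listed in the question.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_terms(row, order):
    """Return all monomials of degree 1..order built from the entries of row."""
    terms = []
    for degree in range(1, order + 1):
        # Each multiset of column indices corresponds to exactly one monomial,
        # e.g. (0, 0, 2) stands for x^2 * z.
        for combo in combinations_with_replacement(range(len(row)), degree):
            terms.append(prod(row[i] for i in combo))
    return terms
```

For three variables and order 3 this produces 3 + 6 + 10 = 19 terms, matching the 19 entries of the example output.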

Searching a 3D array for closest point satisfying a certain predicate

I'm looking for an enumeration algorithm to search through a 3D array "sphering" around a given starting point.
Given an array a of size NxNxN, where N is 2^k for some k, and a point p in that array, the algorithm I'm looking for should do the following: if a[p] satisfies a certain predicate, the algorithm stops and p is returned. Otherwise the next point q is checked, where q is another point in the array that is the closest to p and hasn't been visited yet. If that doesn't match either, the next q' is checked, and so on until, in the worst case, the whole array has been searched.
By "closest" here the perfect solution would be the point q that has the smallest Euclidean distance to p. As only discrete points have to be considered, perhaps some clever enumeration algorithm would make that possible. However, if this gets too complicated, the smallest Manhattan distance would be fine too. If there are several nearest points, it doesn't matter which one is considered next.
Is there already an algorithm that can be used for this task?
You can search for increasing squared distances, so you won't miss a point. This python code should make it clear:
import math
import itertools

# Calculates all points at a certain squared distance from the origin.
# Coordinate constraint: z <= y <= x
def get_points_at_squared_euclidean_distance(d):
    result = []
    x = int(math.floor(math.sqrt(d)))
    while 0 <= x:
        y = x
        while 0 <= y:
            # Binary search for z in [0, y] with z*z == target
            target = d - x*x - y*y
            lower = 0
            upper = y + 1
            while lower < upper:
                middle = (lower + upper) // 2
                current = middle * middle
                if current == target:
                    result.append((x, y, middle))
                    break
                if current < target:
                    lower = middle + 1
                else:
                    upper = middle
            y -= 1
        x -= 1
    return result

# Creates all possible reflections (and coordinate permutations) of a point
def get_point_reflections(point):
    result = set()
    for p in itertools.permutations(point):
        for n in range(8):
            result.add((
                p[0] * (1 if n % 8 < 4 else -1),
                p[1] * (1 if n % 4 < 2 else -1),
                p[2] * (1 if n % 2 < 1 else -1),
            ))
    return sorted(result)

# Enumerates all points around a center, in increasing distance.
# The generator never stops on its own and does not know the array bounds.
def get_next_point_near(center):
    d = 0
    points_at_d = [(0, 0, 0)]  # distance 0: the center itself
    while True:
        while not points_at_d:
            d += 1
            points_at_d = get_points_at_squared_euclidean_distance(d)
        point = points_at_d.pop()
        for reflection in get_point_reflections(point):
            yield (
                center[0] + reflection[0],
                center[1] + reflection[1],
                center[2] + reflection[2],
            )

# The function you asked for
def get_nearest_point(center, predicate):
    for point in get_next_point_near(center):
        if predicate(point):
            return point

# Example usage
print(get_nearest_point((1, 2, 3), lambda p: sum(p) == 10))
Basically you consume points from the generator until one of them fulfills your predicate.
This is pseudocode for a simple algorithm that searches increasing-radius spherical husks until it either finds a point or runs out of array. Let us assume that condition returns either true or false and has access to the x, y, z coordinates being tested and the array itself, returning false (instead of exploding) for out-of-bounds coordinates:
def find_from_center(center, max_radius, condition) returns a point
    let radius = 0
    while radius < max_radius,
        let point = find_in_spherical_husk(center, radius, condition)
        if (point != null) return point
        radius ++
    return null
The hard part is inside find_in_spherical_husk. We are interested in checking out points such that
    dist(center, p) >= radius AND dist(center, p) < radius+1
which will be our operating definition of husk. We could iterate over the whole 3D array in O(n^3) looking for those, but that would be really expensive in terms of time. A better pseudocode is the following:
def find_in_spherical_husk(center, radius, condition)
    let z = center.z - radius // current slice height
    while z <= center.z + radius,
        // current circle radius; 0 at the poles, maxes at the equator
        let r = sqrt(radius*radius - (z-center.z)*(z-center.z))
        let z_center = (center.x, center.y, z)
        let point = find_in_z_circle(z_center, r, condition)
        if (point != null) return point
        // move on to the next z-sliced circle
        z ++
    return null
The idea here is to slice each husk into circles along the z-axis (any axis will do), and then look at each slice separately. If you were looking at the earth, and the poles were the z axis, you would be slicing from north to south. Finally, you would implement find_in_z_circle(z_center, r, condition) to look at the circumference of each of those circles. You can avoid some math there by using the Bresenham circle-drawing algorithm; but I assume that the savings are negligible compared with the cost of checking condition.
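To make the husk idea concrete, here is one possible runnable sketch in Python. It is not the circle-tracing approach described above: instead of walking each circle, it solves for the admissible y offsets directly for every (x, z) offset pair, using integer square roots. The names `shell_offsets` and `size` (the array side length N) are assumptions of this sketch, and points inside one husk are not ordered by exact distance.

```python
from math import isqrt


def shell_offsets(radius):
    """Yield integer offsets (dx, dy, dz) with radius <= |offset| < radius + 1."""
    lo2, hi2 = radius * radius, (radius + 1) ** 2
    for dz in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rest = dx * dx + dz * dz
            if rest >= hi2:
                continue  # already outside the husk for every dy
            # dy*dy must lie in the half-open interval [lo2 - rest, hi2 - rest)
            dy_min = 0 if rest >= lo2 else isqrt(lo2 - rest - 1) + 1
            dy_max = isqrt(hi2 - rest - 1)
            for dy in range(dy_min, dy_max + 1):
                yield (dx, dy, dz)
                if dy:
                    yield (dx, -dy, dz)


def find_from_center(center, max_radius, condition, size):
    """Search husks of radius 0, 1, ... and return the first matching point."""
    for radius in range(max_radius):
        for dx, dy, dz in shell_offsets(radius):
            p = (center[0] + dx, center[1] + dy, center[2] + dz)
            # Bounds are checked here, so condition only sees valid points.
            if all(0 <= c < size for c in p) and condition(p):
                return p
    return None
```

Every integer offset belongs to exactly one husk, so no point is tested twice.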

Optimizing computational cost on a task involving a multi-nested loop

I am just a beginner of programming, and sorry in advance for bothering you by a (presumably) basic question.
I would like to perform the following task:
(I apologize for inconvenience; I don't know how to input a TeX-y formula in Stack Overflow ). I am primarily considering an implementation on MATLAB or Scilab, but language does not matter so much.
The most naive approach to perform this, I think, is to form an n-nested for loop, that is (the case n=2 on MATLAB is shown for example),
n=2;
x=[x1,x2];
for u=0:1
    y(1)=u;
    if x(1)>0 then
        y(1)=1;
    end
    for v=0:1
        y(2)=v;
        if x(2)>0 then
            y(2)=1;
        end
        z=Function(y);
    end
end
However, this implementation is too laborious for large n, and more importantly, it causes 2^n-2^k redundant evaluations of the function, where k is the number of negative elements in x. Also, naively forming a k-nested for loop with knowledge of which elements of x are negative, e.g.
n=2;
x=[-1,2];
y=[1,1];
for u=0:1
    y(1)=u;
    z=Function(y);
end
doesn't seem to be a good way; if we want to perform the task for a different x, we have to rewrite the code.
I would be grateful for an idea for an implementation that (a) evaluates the function only 2^k times (the minimum possible number of evaluations) and (b) does not have to be rewritten when x changes.
You can evaluate Function on y in Ax easily using recursion
function eval(Function, x, y, i, n) {
    if(i == n) {
        // break condition, evaluate Function
        Function(y);
    } else {
        // always evaluate y(i) == 1
        y(i) = 1;
        eval(Function, x, y, i + 1, n);
        // eval y(i) == 0 only if x(i) <= 0
        if(x(i) <= 0) {
            y(i) = 0;
            eval(Function, x, y, i + 1, n);
        }
    }
}
Turning that into efficient Matlab code is another problem.
As you've stated, the number of evaluations is 2^k. Let's sort x so that only the last k elements are non-positive. To evaluate Function, build y in that sorted order and index it with the inverse of the sort permutation: Function(y(perm)). Even better, the same method allows us to build Ax directly using dec2bin:
% every column of the resulting matrix is a member of Ax: y_i = Ax(:,i)
function Ax = getAx(x)
    n = length(x);
    % find the k indices of non-positives in x (x is assumed to be a row vector)
    is = find(x <= 0);
    k = length(is);
    % construct Y (last k rows are all possible combinations of [0 1])
    Y = [ones(n - k, 2 ^ k); (dec2bin(0:2^k-1)' - '0')];
    % re-order the rows in Y to get Ax according to the permutation is (inverse is)
    perm([setdiff(1:n, is) is]) = 1:n;
    Ax = Y(perm, :);
end
Now rewrite Function to accept a matrix or iterate over the columns in Ax = getAx(x); to evaluate all Function(y).
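Since the question says the language does not matter much, here is the same idea as a short Python sketch using the standard itertools.product. It assumes, as in the recursion above, that y(i) is fixed to 1 wherever x(i) > 0 and is free to be 0 or 1 otherwise; `candidate_vectors` is a name made up for this example.

```python
from itertools import product


def candidate_vectors(x):
    """Yield every y with y[i] = 1 where x[i] > 0 and y[i] in {0, 1} elsewhere."""
    free = [i for i, xi in enumerate(x) if xi <= 0]  # positions that may vary
    for bits in product((0, 1), repeat=len(free)):
        y = [1] * len(x)
        for i, b in zip(free, bits):
            y[i] = b
        yield tuple(y)
```

Calling Function on each yielded tuple gives exactly 2^k evaluations, and nothing has to be rewritten when x changes.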

How to determine if a point lies OVER a triangle in 3D

I need an example of fast algorithm allowing to calculate if a point lies over a triangle in 3D. I mean if the projection of this point on a plane containing given triangle is inside of this triangle.
I need to calculate the distance between a point and a triangle (between the point and the face of the triangle if its projection lies inside the triangle, or between the point and an edge of the triangle if its projection lies outside the triangle).
I hope I made it clear enough. I found some examples for 2D using barycentric coordinates but can't find any for 3D. Is there a faster way than calculating projection of a point, projecting this projected point and a given triangle to 2D and solving standard "point in triangle" problem?
If the triangle's vertices are A, B, C and the point is P, then begin by finding the triangle's normal N. For this just compute N = (B-A) X (C-A), where X is the vector cross product.
For the moment, assume P lies on the same side of ABC as its normal.
Consider the 3d pyramid with faces ABC, ABP, BCP, CAP. The projection of P onto ABC is inside it if and only if the dihedral angles between ABC and each of the other 3 triangles are all less than 90 degrees. In turn, these angles are equal to the angle between N and the respective outward-facing triangle normal! So our algorithm is this:
Let N = (B-A) X (C-A), N1 = (B-A) X (P-A), N2 = (C-B) X (P-B), N3 = (A-C) X (P-C)
return N1 * N >= 0 and N2 * N >= 0 and N3 * N >= 0;
The stars are dot products.
We still need to consider the case where P lies on the opposite side of ABC as its normal. Interestingly, in this case the vectors N1, N2, N3 now point into the pyramid, where in the above case they point outward. This cancels the opposing normal, and the algorithm above still provides the right answer. (Don't you love it when that happens?)
Cross products in 3d each require 6 multiplies and 3 subtractions. Dot products are 3 multiplies and 2 additions. On average (considering e.g. N2 and N3 need not be calculated if N1 * N < 0), the algorithm needs 2.5 cross products and 1.5 dot products. So this ought to be pretty fast.
If the triangles can be poorly formed, then you might want to use Newell's algorithm in place of the arbitrarily chosen cross products.
Note that edge cases where any triangle turns out to be degenerate (a line or point) are not handled here. You'd have to do this with special case code, which is not so bad because the zero normal says much about the geometry of ABC and P.
Here is C code, which uses a simple identity to reuse operands better than the math above:
#include <stdio.h>

/* r = a - b */
void diff(double *r, double *a, double *b) {
    r[0] = a[0] - b[0];
    r[1] = a[1] - b[1];
    r[2] = a[2] - b[2];
}

/* r = a X b */
void cross(double *r, double *a, double *b) {
    r[0] = a[1] * b[2] - a[2] * b[1];
    r[1] = a[2] * b[0] - a[0] * b[2];
    r[2] = a[0] * b[1] - a[1] * b[0];
}

double dot(double *a, double *b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Returns 1 if the projection of p onto the plane of abc lies inside abc */
int point_over_triangle(double *a, double *b, double *c, double *p) {
    double ba[3], cb[3], ac[3], px[3], n[3], nx[3];

    diff(ba, b, a);
    diff(cb, c, b);
    diff(ac, a, c);
    cross(n, ac, ba); // Same as n = ba X ca

    diff(px, p, a);
    cross(nx, ba, px);
    if (dot(nx, n) < 0) return 0;

    diff(px, p, b);
    cross(nx, cb, px);
    if (dot(nx, n) < 0) return 0;

    diff(px, p, c);
    cross(nx, ac, px);
    if (dot(nx, n) < 0) return 0;

    return 1;
}

int main(void) {
    double a[] = { 1, 1, 0 };
    double b[] = { 0, 1, 1 };
    double c[] = { 1, 0, 1 };
    double p[] = { 0, 0, 0 };
    printf("%s\n", point_over_triangle(a, b, c, p) ? "over" : "not over");
    return 0;
}
I've tested it lightly and it seems to be working fine.
Let's assume that the vertices of the triangle are v, w, and the origin 0. Let's call the point p.
For the benefit of other readers, here's the barycentric approach for 2D point-in-triangle, to which you alluded. We solve the following system in variables beta:
    [v.x w.x] [beta.v]   [p.x]
    [v.y w.y] [beta.w] = [p.y] .
Test whether 0 <= beta.v && 0 <= beta.w && beta.v + beta.w <= 1.
For 3D projected-point-in-triangle, we have a similar but overdetermined system:
    [v.x w.x] [beta.v]   [p.x]
    [v.y w.y] [beta.w] = [p.y] .
    [v.z w.z]            [p.z]
The linear least squares solution gives coefficients beta for the point closest to p on the plane spanned by v and w, i.e., the projection. For your application, a solution via the following normal equations likely will suffice:
    [v.x v.y v.z] [v.x w.x] [beta.v]   [v.x v.y v.z] [p.x]
    [w.x w.y w.z] [v.y w.y] [beta.w] = [w.x w.y w.z] [p.y] ,
                  [v.z w.z]                          [p.z]
from which we can reduce the problem to the 2D case using five dot products. This should be comparable in complexity to the method that Nico suggested but without the singularity.
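Written out, the normal equations are a 2x2 system whose entries are the five dot products v.v, v.w, w.w, v.p and w.p. A minimal Python sketch follows, assuming general vertices a, b, c that are first translated so that a becomes the origin, and solving the system with Cramer's rule. The function names and the choice to raise an error on a degenerate triangle are assumptions of this sketch.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def projects_inside(a, b, c, p):
    """True if the projection of p onto the plane of triangle abc lies in abc."""
    v, w, q = sub(b, a), sub(c, a), sub(p, a)  # move vertex a to the origin
    vv, vw, ww = dot(v, v), dot(v, w), dot(w, w)
    vq, wq = dot(v, q), dot(w, q)
    det = vv * ww - vw * vw  # zero only for a degenerate triangle
    if det == 0:
        raise ValueError("degenerate triangle")
    # Cramer's rule for [[vv, vw], [vw, ww]] [beta_v, beta_w]^T = [vq, wq]^T
    beta_v = (vq * ww - vw * wq) / det
    beta_w = (vv * wq - vw * vq) / det
    return beta_v >= 0 and beta_w >= 0 and beta_v + beta_w <= 1
```

With the triangle and point from the C example above, the coefficients come out as beta_v = beta_w = 1/3, so the point is reported as lying over the triangle.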
