Complexity of searching for intersection between grids with RTrees - performance

I have two grids (coming from finite elements, but this is not relevant), say T_1 and T_2. One grid is inside the other; think of a square inside another square, for instance. For each grid I have constructed a Boost R-tree of bounding boxes, say RTree1 and RTree2. To find all the pairs of intersecting rectangles, I do the following:
for (const auto &[box2, cell2] : RTree2)
{
  for (const auto &[box1, cell1] :
       RTree1 | bgi::adaptors::queried(bgi::intersects(box2)))
  {
    do_something_with_cells(cell1, cell2);
  }
}
Let's say that I have N bounding boxes for the first tree and M bounding boxes for the second tree. I want to determine the complexity of the snippet above.
Since the intersection has a complexity which is O(log(N)) (on average), I think the snippet above has a complexity which is O(N M log(N)). Is this correct?
If the code above was written like this:
for (const auto &[box2, cell2] : RTree2)
{
  for (const auto &[box1, cell1] : RTree1)
  {
    // check if box1 and box2 intersect
    if (bg::intersects(box1, box2))
      do_something_with_cells(cell1, cell2);
  }
}
I should get O(N M log(N)), right?

Since the intersection has a complexity which is O(log(N))
What intersection are you talking about here? If you're talking about the entirety of the query (which seems to make the most sense, given your inclusion of N), then you should not multiply by N again outside the query. I think it is therefore O(M log(N)).
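For reference, here is a minimal compilable sketch of the first loop, assuming Boost.Geometry's bgi::rtree holding (box, id) pairs; the int ids and the printed message are stand-ins for the question's cell handles and do_something_with_cells:

#include <iostream>
#include <iterator>
#include <utility>
#include <vector>

#include <boost/geometry.hpp>
#include <boost/geometry/index/rtree.hpp>

namespace bg  = boost::geometry;
namespace bgi = boost::geometry::index;

using Point = bg::model::point<double, 2, bg::cs::cartesian>;
using Box   = bg::model::box<Point>;
using Value = std::pair<Box, int>;                  // int stands in for a cell handle
using RTree = bgi::rtree<Value, bgi::quadratic<16>>;

int main()
{
  RTree rtree1, rtree2;
  rtree1.insert(std::make_pair(Box(Point(0.0, 0.0), Point(1.0, 1.0)), 0));
  rtree2.insert(std::make_pair(Box(Point(0.5, 0.5), Point(1.5, 1.5)), 0));

  // One query into rtree1 per box of rtree2: M queries, each visiting only
  // the boxes that can intersect, instead of all N of them.
  for (const auto &[box2, cell2] : rtree2)
  {
    std::vector<Value> hits;
    rtree1.query(bgi::intersects(box2), std::back_inserter(hits));
    for (const auto &[box1, cell1] : hits)
      std::cout << "cells " << cell1 << " and " << cell2 << " may intersect\n";
  }
}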


Can someone help me find the time complexity of the following program?

I think the big-O time complexity will be 4^(rows + columns), where rows and columns are the dimensions of the grid.
class Solution
{
    public void someMethod(int[][] grid, boolean[][] used)
    {
        compute(grid, 0, 0, 0, used);
    }

    private void compute(int[][] grid, int i, int j, int count, boolean[][] used)
    {
        if (i < 0 || j < 0 || i >= grid.length || j >= grid[0].length || grid[i][j] == 0 || used[i][j])
            return;
        if (grid[i][j] == 1000) // looking to find 1000 from starting position
        {
            return;
        }
        used[i][j] = true;
        compute(grid, i + 1, j, count + 1, used); // Go down
        compute(grid, i - 1, j, count + 1, used); // Go up
        compute(grid, i, j + 1, count + 1, used); // Go right
        compute(grid, i, j - 1, count + 1, used); // Go left
        used[i][j] = false;
    }
}
Can someone explain what the time complexity would be? Also, it would be great if someone could point to good resources/examples for more involved time complexity analyses like 2^n, n^n, n!, etc.
Let n = columns and m = rows. For simplicity assume n == m.
Short: Your algorithm is O[ (2.6382)^(n^2) ] and Ω[ (1.3196)^(n^2) ].
The first is an asymptotic upper bound and the second an asymptotic lower bound. In any case, a function growing as quickly as c^(n^2) for some c > 1 is called doubly exponential. It grows faster than any (single) exponential and than any factorial.
See derivation below (though some arguments shortened). Better bounds on the particular problem are likely known, I did not research it. This is just giving an idea on how to solve such problems.
You want to count the number of maximal self-avoiding paths starting from (0,0) on a 2D grid (n, m). There are some additional costs, like the actual call depth, but they are polynomial corrections, while the full complexity is certainly super-exponential.
I will try to construct in the following better and better upper and lower bounds on the complexity.
Note that self-avoiding paths on the grid can have at most length n^2 (because after that many steps all of used is true). We can therefore also ignore the fact that the paths shall be maximal, because if we count each of the non-maximal sub-paths of the maximal ones as well, it will be at most a polynomial factor n^2 difference.
At each step of a path we can go 4 directions (4 compute calls), therefore the number of relevant paths can be at most 4^(n^2).
However, we can notice that at least one of the 4 steps goes back to where we already were (or to the start) and is therefore not self-avoiding. Thus 3^(n^2) is also an upper bound.
Being a bit more creative, we can realize that the number of self-avoiding paths of fixed length s starting at a fixed point on a grid infinite in all directions is known to grow (up to polynomial factors) exponentially in s, with a growth constant µ known as the connective constant. For the 2D square lattice it is not known exactly, but it is roughly µ = 2.6381....
Now that is for an infinite grid, so the number of such paths on a finite grid is certainly smaller; the number of paths is also certainly monotonic in s, and so another upper bound for your problem is µ^(n^2 + O(log n)).
Now for the lower bound. Consider the case n==2. On this grid every cell can be reached from every other with at least 2 different self-avoiding paths.
Now consider again larger n and divide the whole grid into 2x2 sub-grids.
There certainly is at least one self-avoiding path of length (n/2)^2 on the outer n/2 x n/2 grid that the sub-grids form. But also, as just said, on each of the n^2/4 sub-grids there are at least two equivalent paths to choose from. Therefore the number of relevant paths in total is at least f_0(n) = 2^(n^2 / 4), which is about (1.189...)^(n^2).
Now this we can also improve. Consider n = 4 and divide the grid into 2x2 sub-grids. Then in each sub-grid there are 2 possible paths, as well as at least 2 possible paths in the coarse grid, making at least 2^5 = 32 paths. If n is now large again and we divide it into sub-grids of side 4, then by the same argument as before there are at least f_1(n) = 32^(n^2 / 16) = (1.255...)^(n^2) such paths.
Repeating this coarse-graining into 2x2 grids, we find for each r the bound 2^((sum of 4^x for x = 0..r) / 4^(r+1) * n^2); since this sum equals (4^(r+1) - 1)/3, the exponent tends to n^2 / 3 as r goes to infinity, giving the lower bound 2^(n^2 / 3) = (1.2599...)^(n^2).
Now one might try to redo this bound by coarse-graining not into 2x2 grids but rather into 3x3 grids. Then one finds that there are at least 9 paths between any pair of border cells, and so with the same arguments as above one obtains the bound 9^(n^2 / (3^2 - 1)) = (1.316...)^(n^2).
One can repeat this for other coarse-grainings; I found the best bound for 4x4 grids, with an assumed minimum of 64 self-avoiding paths between any pair of border cells (it might actually be higher, I didn't enumerate all of them), giving 64^(n^2 / 15) = (1.3195...)^(n^2).
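To get a feeling for these numbers, here is a hedged brute-force counter (a C++ sketch, not part of the derivation above) that counts all self-avoiding paths starting at (0,0) on an n x n grid with the four moves from the question; already for small n the counts explode:

#include <cstdio>
#include <vector>

// Counts every self-avoiding path starting at (0,0) on an open n x n grid;
// up to constant factors this is the work the question's code does when no
// cell blocks the search. Only practical for small n.
long long count_paths(int n, int i, int j, std::vector<std::vector<bool>>& used) {
    if (i < 0 || j < 0 || i >= n || j >= n || used[i][j]) return 0;
    used[i][j] = true;
    long long paths = 1;  // the path that ends on this cell
    paths += count_paths(n, i + 1, j, used);  // down
    paths += count_paths(n, i - 1, j, used);  // up
    paths += count_paths(n, i, j + 1, used);  // right
    paths += count_paths(n, i, j - 1, used);  // left
    used[i][j] = false;
    return paths;
}

int main() {
    for (int n = 1; n <= 5; ++n) {
        std::vector<std::vector<bool>> used(n, std::vector<bool>(n, false));
        std::printf("n = %d: %lld self-avoiding paths from (0,0)\n",
                    n, count_paths(n, 0, 0, used));
    }
    return 0;
}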

Array merging and sorting complexity calculation

I have an exercise from my algorithms textbook and I am not really sure about the solution. I need to explain why this solution:
function array_merge_sorted(array $foo, array $bar)
{
    $baz = array_merge($foo, $bar);
    $baz = array_unique($baz);
    sort($baz);
    return $baz;
}
which merges the two arrays and orders them, is not the most efficient, and I need to provide a more optimized solution and prove that no better solution can be found.
My idea was to use a mergesort-like algorithm, which is O(n log n), to merge and order the two arrays passed as parameters. But how can I prove that it is the best possible solution?
Algorithm
As you have said that both inputs are already sorted, you can use a simple zipper-like approach.
You have one pointer for each input array, pointing to its beginning. Then you compare the two current elements, add the smaller one to the result and advance the pointer of the array containing the smaller element. You repeat this until both pointers have reached the end and all elements have been added to the result.
You find a collection of such algorithms at Wikipedia#Merge algorithm, with the approach presented here listed as "Merging two lists".
Here is some pseudocode:
function Array<Element> mergeSorted(Array<Element> first, Array<Element> second) {
    Array<Element> result = new Array<Element>(first.length + second.length);
    int firstPointer = 0;
    int secondPointer = 0;
    // Zip both arrays together as long as neither is exhausted
    while (firstPointer < first.length && secondPointer < second.length) {
        Element elementOfFirst = first.get(firstPointer);
        Element elementOfSecond = second.get(secondPointer);
        if (elementOfFirst < elementOfSecond) {
            result.add(elementOfFirst);
            firstPointer = firstPointer + 1;
        } else {
            result.add(elementOfSecond);
            secondPointer = secondPointer + 1;
        }
    }
    // Append whatever is left of the array that was not exhausted
    while (firstPointer < first.length) {
        result.add(first.get(firstPointer));
        firstPointer = firstPointer + 1;
    }
    while (secondPointer < second.length) {
        result.add(second.get(secondPointer));
        secondPointer = secondPointer + 1;
    }
    return result;
}
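As an aside (not part of the original answer), the same zipper merge, plus the duplicate removal that array_unique does in the question, can be written with the C++ standard library; a minimal sketch:

#include <algorithm>
#include <cstdio>
#include <iterator>
#include <vector>

int main() {
    // Both inputs are already sorted, as in the question.
    const std::vector<int> foo{1, 3, 5, 7};
    const std::vector<int> bar{2, 3, 6};

    std::vector<int> baz;
    baz.reserve(foo.size() + bar.size());
    // Zipper merge of two sorted ranges, O(n + n').
    std::merge(foo.begin(), foo.end(), bar.begin(), bar.end(),
               std::back_inserter(baz));
    // Optional: drop duplicates, mirroring array_unique() in the question.
    baz.erase(std::unique(baz.begin(), baz.end()), baz.end());

    for (int v : baz) std::printf("%d ", v);  // prints: 1 2 3 5 6 7
    std::printf("\n");
    return 0;
}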
Proof
The algorithm obviously works in O(n), where n is the size of the resulting list. Or, more precisely, it is O(max(n, n')) with n being the size of the first list and n' the size of the second list (or O(n + n'), which is the same set).
This is also obviously optimal, since you need, at some point, to traverse all elements at least once in order to build the result and know the final ordering. This yields a lower bound of Omega(n) for this problem, thus the algorithm is optimal.
A more formal proof assumes an arbitrary better algorithm A which solves the problem without looking at each element at least once (or, more precisely, in less than Omega(n) time).
We call the element which the algorithm does not look at e. We can now construct an input I such that e has a value which fulfills the order in its own array but will be placed wrongly by the algorithm in the resulting array.
We are able to do so for every such algorithm A, and since A always needs to work correctly on all possible inputs, we are able to find a counter-example I on which it fails.
Thus A cannot exist, and Omega(n) is a lower bound for that problem.
Why the given algorithm is worse
Your given algorithm first merges the two arrays; this works in O(n), which is good. But after that it sorts the array.
Sorting (more precisely: comparison-based sorting) has a lower bound of Omega(n log n). This means no such algorithm can do better than that.
Thus the given algorithm has a total time complexity of O(n log n) (because of the sorting part), which is worse than O(n), the complexity of the other algorithm, which is also the optimal solution.
However, to be super-correct, we would also need to argue whether the sort method truly exhibits that complexity, since it does not get arbitrary inputs but always the result of the merge method. Thus it could be possible that a specific sorting method works especially well on such inputs, yielding O(n) in the end.
But I doubt that this is the focus of your task.

Is there any fast way to generate the pairs of cartesian coordinates ordered by their product?

I want to generate the pairs of cartesian coordinates inside a bounded square, ordered by their product in descending order. For example, for a square of size 3, the coordinates are:
(3,3), (3,2), (2,3), (2,2), (3,1), (1,3), (2,1), (1,2), (1,1)
Is there any way to generate this list fast - i.e., a constant-time function that maps an integer n to the nth coordinate?
Your enumeration should proceed from the top-right corner to the bottom-left, naturally.
Maintain the boundary as a priority queue. Start with the top-right corner being the only entry in the boundary.
On each step, pop the max element from the PQ and insert its three descendants (West, South, and South-West) into the queue, without creating duplicates (maybe use an actual array of arrays to back the queue, but that means additional space... well, there are no more than n of these short (say, vertical) arrays, each no larger than a few elements, and they never grow/move upwards, only downwards).
The length of the queue is O(n) – think "diagonals", even if curved – and you produce n^2 results, so the overall complexity depends on the efficiency of the queue implementation. If that's logarithmic, it'll be O(n^2 log n), and if linear (using a hash table, as we know the range of the values involved), O(n^2) overall; but it will be on-line, – O(1)...O(log n) per produced pair.
If the precision allows (for your range it looks like it will), precalculate the logarithms of your coordinates and order the pairs by log(x) + log(y) instead of by x * y, trading O(n^2) multiplications for n logarithms and O(n^2) additions.
Edit: see this for actual Haskell code for another, very similar algorithm; it also contains an additional hint on how to speed it up by another factor of 2 (x*y == y*x), so work on a triangular half of the square only -- this will also halve the space needed. And it looks like there's no need to add the SW child to the priority queue; just S and W should be enough!
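A hedged C++ sketch of this priority-queue walk (the fixed n = 3 is just for illustration), following the hint that pushing only the S and W neighbours, with duplicate detection, is enough:

#include <cstdio>
#include <queue>
#include <set>
#include <tuple>
#include <utility>

int main() {
    const long long n = 3;  // square size; (n,n) is the top-right corner
    using Cell = std::tuple<long long, long long, long long>;  // (product, x, y)
    std::priority_queue<Cell> pq;                               // max-heap on product
    std::set<std::pair<long long, long long>> seen;             // avoid duplicate pushes

    auto push = [&](long long x, long long y) {
        if (x >= 1 && y >= 1 && seen.insert({x, y}).second)
            pq.push(Cell{x * y, x, y});
    };

    push(n, n);  // start from the top-right corner
    while (!pq.empty()) {
        const auto [product, x, y] = pq.top();
        pq.pop();
        std::printf("(%lld,%lld) has product %lld\n", x, y, product);
        // West and South neighbours; the SW cell is reached through these,
        // so it never needs to be pushed explicitly.
        push(x - 1, y);
        push(x, y - 1);
    }
    return 0;
}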
Perhaps you could elaborate more on your specific needs in terms of how fast you would like the generation and how rapidly you might change the bounds of the square.
This problem is akin to generating the distinct numbers in a multiplication table (whose cardinality was studied by Paul Erdős, and the fastest known algorithm to calculate it exactly is O(n^2)).
One way to generate sections of your list (assuming you will not be listing billions of coordinates) is to quickly hash a partial set of the products i*j in descending order and sort them. To make the hash accurate, we extend it below the chosen range [n,k] until n * l drops below k*k for some l. For example, for the range of coordinates from (10,10) down to (7,7), we extend our hash down to (5,5) so that (10,5), whose product is greater than 7*7, will be included.
JavaScript code:
function f(n, k) {
    var l = k, k2 = k * k;
    while (n * l > k2) {
        l--;
    }
    console.log("low bound: " + l);
    var h = {}, h2 = [];
    for (var i = n; i > l; i--) {
        for (var j = i; j > l; j--) {
            var m = i * j;
            if (h[m]) h[m] = h[m].concat([i, j]);
            else {
                h[m] = [i, j];
                h2.push(m);
            }
        }
    }
    h2.sort(function (a, b) { return b - a; });
    var i = 0;
    while (h2[i] >= k2) {
        console.log(h[h2[i++]]);
    }
}
Output:
f(10,6)
low bound: 3
(10,10)
(10,9)
(9,9)
(10,8)
...
(10,4), (8,5)
(9,4), (6,6)
More output:
f(1000000,999995)
low bound: 999990
(1000000,1000000)
(1000000,999999)
(999999,999999)
(1000000,999998)
(999999,999998)
(1000000,999997)
(999998,999998)
(999999,999997)
(1000000,999996)
(999998,999997)
(999999,999996)
(1000000,999995)
(999997,999997)
(999998,999996)
(999999,999995)
(1000000,999994)
(999997,999996)
(999998,999995)
(999999,999994)
(1000000,999993)
(999996,999996)
(999997,999995)
(999998,999994)
(999999,999993)
(1000000,999992)
(999996,999995)
(999997,999994)
(999998,999993)
(999999,999992)
(1000000,999991)
(999995,999995)
I have not tested this idea. You can quickly generate a list of all the coordinates in roughly the correct order by just reading off the diagonals from bottom right to top left, as with the argument for the countability of the rationals. That will give you a nearly sorted list.
There are sorting methods that can take advantage of that to give you a faster sort. See Which sorting algorithm is best suited to re-sort an almost fully sorted list? for a discussion. You could always try different sorting algorithms to see what works best for your data.

Fast algorithm to find out the number of points under hyperplane

Given points in Euclidean space, is there a fast algorithm to count the number of points 'under' one arbitrary hyperplane? Fast means time complexity lower than O(n)
Time for preprocessing or sorting the points is okay
And even if not for high dimensions, I'd like to know whether there exists one that can be used in 2-dimensional space.
If you're willing to preprocess the points, then the preprocessing has to visit each one at least once, which is O(n). If you consider a test of which side each point is on as part of the preprocessing, then you've got an O(0) query algorithm (with O(n) preprocessing). So I don't think this question makes sense as stated.
Nevertheless, I'll attempt to give a useful answer, even if it's not precisely what the OP asked for.
Choose a hyperplane unit normal and root point. If the plane is given in point-normal form
(P - O).N == 0
then you have these already, just make sure the normal is unitized.
If it's given in analytic form: Sum(i = 1 to n: a[i] x[i]) + d = 0, then the vector A = (a[1], ..., a[n]) is a normal of the plane, and N = A/||A|| is the unit plane normal. A point O (for origin) on the plane is O = -(d/||A||) N.
You can test which side each point P is on by projecting it onto N and checking the sign of the parameter:
Let V = P - O. V is the vector from the chosen origin O to P.
Let s N be the projection of V onto N. If s is negative, then P is "under" the hyperplane.
You should go to the link on vector projection if you're rusty on the subject, but I'll summarize here using my notation. Or, you can take my word for it, and just skip to the formula at the end.
If alpha is the angle between V and N, then from the definition of cosine we have cos(alpha) = s||N||/||V|| = s/||V||, since N is a unit normal. But we also know from vector algebra that cos(alpha) = (V.N)/(||V|| ||N||) = (V.N)/||V||, where "." is the scalar product (a.k.a. dot product, or Euclidean inner product).
Equating these two expressions for cos(alpha) we have
s = V.N
(using again the fact that N is a unit vector, so ||N|| == 1).
So your preprocessing work is to compute N and O, and your test is:
bool is_under = (dot(V, N) < 0.);
I don't believe it can be done any faster.
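A hedged C++ sketch of this test (the Point struct, the sample plane z = 1 and the sample points are invented for illustration); since dividing by ||A|| does not change the sign, the unnormalised value A.P + d has the same sign as s = V.N and suffices for the under/over test:

#include <cstdio>
#include <vector>

struct Point { double x, y, z; };

// Signed value with the same sign as the projection parameter s for the
// plane a.x * x + a.y * y + a.z * z + d = 0.
double signed_side(const Point& p, const Point& a, double d) {
    return a.x * p.x + a.y * p.y + a.z * p.z + d;
}

int main() {
    const Point a{0.0, 0.0, 1.0};   // plane normal (here: the plane z = 1)
    const double d = -1.0;
    const std::vector<Point> pts{{0, 0, 0.5}, {0, 0, 2.0}, {1, 1, 0.0}};

    int under = 0;
    for (const Point& p : pts)
        if (signed_side(p, a, d) < 0.0) ++under;  // "under" the hyperplane

    std::printf("%d of %zu points are under the plane\n", under, pts.size());
    return 0;
}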
When setting the point values, check the condition for each point at the moment you set it, and increment (or don't increment) a counter accordingly. O(n).
I found an O(log N) algorithm in 2D using divide-and-conquer and binary search, with O(N log N) preprocessing time and O(N log N) memory.
The basic idea is that the points can be divided into the left N/2 points and the right N/2 points, and the number of points under the line (in 2D) is the sum of the number of left points under the line and the number of right points under the line. I'll call the infinite line that divides the whole point set into 'left' and 'right' the 'dividing line'. The dividing line will look like 'x = k'.
If the 'left points' and the 'right points' are each sorted in y-axis order, then the number of specific points - the points at the lower right corner - can be quickly found by binary searching for the number of points whose y values are lower than the y value of the intersection of the query line and the dividing line.
Therefore time complexity is
T(N) = 2T(N/2) + O(log N)
and finally the time complexity is O(log N)

Area of intersection of axis-aligned rectangles

Each rectangle is comprised of 4 doubles like this: (x0,y0,x1,y1)
The edges are parallel to the x and y axes
They are randomly placed - they may be touching at the edges, overlapping , or not have any contact
I need to find the area that is formed by their overlap - all the area in the canvas that more than one rectangle "covers" (for example with two rectangles, it would be the intersection)
I understand I need to use sweep line algorithm. Do I have to use a tree structure? What is the easiest way of using sweep line algorithm for this problem?
At first blush it seems that an O(n^2) algorithm should be straightforward, since we can just check all pairwise intersections. However, that would create the problem of double counting, as all points that are in 3 rectangles would get counted 3 times! After realizing that, an O(n^2) algorithm doesn't look bad to me now. If you can think of a trivial O(n^2) algorithm, please post it.
Here is an O(n^2 log^2 n) algorithm.
Data structure: Point (p) {x_value, isBegin, isEnd, y_low, y_high, rectid}
[For each point, we have a single x_value, two y_values, and the ID of the rectangle which this point came from]
Given n rectangles, first create 2n points as above using the x_left and x_right values of the rectangle.
Create a list of points, and sort it on x_value. This takes O(n log n) time
Start from the left (index 0); use a map into which you put a rectangle when you see its begin point and from which you remove it when you see its end point.
In other words:
Map m = new HashMap(); // rectangles currently overlapping in the x-axis
for (Point p in the sorted list) {
    if (p.isBegin()) {
        m.put(p.rectid, p); // m is keyed off of rectangle id
        if (m.size() >= 2) {
            checkOverlappingRectangles(m.values())
        }
    } else {
        m.remove(p.rectid); // So, this takes O(log n) time
    }
}
Next, we need a function that takes a list of rectangles, knowing that all the rectangles overlap on the x-axis but may or may not overlap on the y-axis. That is in fact the same as this algorithm; we just use the transposed data structure, since we are interested in the y-axis now.
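To make this concrete, here is a hedged C++ sketch (not exactly the map-based algorithm above: it re-sorts the y-events in every x-slab, which also gives O(n^2 log n) overall) that computes the total area covered by more than one rectangle:

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <utility>
#include <vector>

struct Rect { double x0, y0, x1, y1; };

// Area covered by at least two rectangles: sweep over the x-slabs between
// consecutive rectangle edges and, inside each slab, measure the y-length
// where the coverage depth is >= 2.
double overlap_area(const std::vector<Rect>& rects) {
    std::vector<double> xs;
    for (const Rect& r : rects) { xs.push_back(r.x0); xs.push_back(r.x1); }
    std::sort(xs.begin(), xs.end());
    xs.erase(std::unique(xs.begin(), xs.end()), xs.end());

    double area = 0.0;
    for (std::size_t i = 0; i + 1 < xs.size(); ++i) {
        const double xa = xs[i], xb = xs[i + 1];
        // y-events (+1 at y0, -1 at y1) of the rectangles spanning this slab
        std::vector<std::pair<double, int>> events;
        for (const Rect& r : rects)
            if (r.x0 <= xa && r.x1 >= xb) {
                events.push_back({r.y0, +1});
                events.push_back({r.y1, -1});
            }
        std::sort(events.begin(), events.end());

        double covered = 0.0, prev_y = 0.0;
        int depth = 0;
        for (const auto& [y, delta] : events) {
            if (depth >= 2) covered += y - prev_y;  // overlapped y-length so far
            depth += delta;
            prev_y = y;
        }
        area += covered * (xb - xa);
    }
    return area;
}

int main() {
    // Two unit squares overlapping in a 0.5 x 0.5 region, plus a disjoint one.
    const std::vector<Rect> rects{{0, 0, 1, 1}, {0.5, 0.5, 1.5, 1.5}, {3, 3, 4, 4}};
    std::printf("overlap area = %g\n", overlap_area(rects));  // expected 0.25
    return 0;
}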
