I have an array of objects (Lines) and a binary operation (intersection) that returns true/false.
The brute-force approach is to run a nested for loop over the array.
for (Line line : linesArray) {
    for (int j = 0; j < linesArray.length; j++) {
        if (line == linesArray[j]) continue; // skip comparing a line with itself
        if (line.isIntersecting(linesArray[j])) {
            Point intersection = line.intersectionWith(linesArray[j]);
            d.drawCircle(intersection.getXInt(), intersection.getYInt(), 3);
            d.fillCircle(intersection.getXInt(), intersection.getYInt(), 3);
        }
    }
}
But the order does not matter, so I thought about generating all subsets of size 2 and checking each one; complexity-wise, though, that does not seem to be any better.
Is there a more efficient (run-time) way to do this?
You are correct that enumerating all subsets of size two and checking each will not be an improvement over your existing algorithm. In fact, that's essentially just a restatement of what your algorithm does.
There are faster algorithms for finding all intersections in a collection of line segments. The most famous of these is the Bentley-Ottmann algorithm, which sorts the segment endpoints by their x coordinates and sweeps a vertical line across them from left to right, processing endpoint and intersection events as it goes. It runs in time O((n + k) log n), where n is the number of line segments and k is the number of intersections.
Other, more modern algorithms exist for solving this problem even faster, but this would likely make for a good starting point.
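Bentley-Ottmann itself is fiddly to implement, so for reference, here is what the quadratic baseline looks like when each unordered pair is tested exactly once, together with the standard orientation-based segment intersection test. This is only an illustrative sketch: the Point2 and Segment structs are hypothetical stand-ins for the asker's Line class, and truly robust code would need exact predicates.

#include <algorithm>
#include <utility>
#include <vector>

struct Point2 { double x, y; };
struct Segment { Point2 a, b; };

// Sign of the cross product (b - a) x (c - a): >0 left turn, <0 right turn, 0 collinear.
int orientation(Point2 a, Point2 b, Point2 c) {
    double v = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (v > 0.0) - (v < 0.0);
}

// True if p lies inside the bounding box of segment ab (used for collinear cases).
bool onSegment(Point2 a, Point2 b, Point2 p) {
    return std::min(a.x, b.x) <= p.x && p.x <= std::max(a.x, b.x) &&
           std::min(a.y, b.y) <= p.y && p.y <= std::max(a.y, b.y);
}

// True if segments s and t share at least one point (crossing or touching).
bool intersects(const Segment& s, const Segment& t) {
    int d1 = orientation(s.a, s.b, t.a), d2 = orientation(s.a, s.b, t.b);
    int d3 = orientation(t.a, t.b, s.a), d4 = orientation(t.a, t.b, s.b);
    if (d1 != d2 && d3 != d4) return true;                // segments straddle each other
    if (d1 == 0 && onSegment(s.a, s.b, t.a)) return true; // collinear / touching cases
    if (d2 == 0 && onSegment(s.a, s.b, t.b)) return true;
    if (d3 == 0 && onSegment(t.a, t.b, s.a)) return true;
    if (d4 == 0 && onSegment(t.a, t.b, s.b)) return true;
    return false;
}

// Quadratic baseline: each unordered pair is tested exactly once (j starts at i+1).
std::vector<std::pair<int, int>> allIntersectingPairs(const std::vector<Segment>& segs) {
    std::vector<std::pair<int, int>> pairs;
    for (std::size_t i = 0; i < segs.size(); ++i)
        for (std::size_t j = i + 1; j < segs.size(); ++j)
            if (intersects(segs[i], segs[j])) pairs.emplace_back((int)i, (int)j);
    return pairs;
}

With n segments this is still Theta(n^2) pair tests; the sweep-line approach is what removes that bottleneck when intersections are sparse.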
Hope this helps!
Trying to find the n shortest distances between a target point and a lot of other points.
Example:
A1 = (1.31, 2.22)
B1 = (2.33, 4.88)
C1 = (12.1, 0.11)
...
Z100 = (0.01, 0.05)
target = (1.0, 1.0)
The question is: how could I find the 10 shortest distances between the target and the other points? The easiest solution is to compare them one by one:
std::priority_queue<float> pq; // max-heap: top() is the largest of the kept distances
for (size_t i = 0; i != points.size(); ++i) {
    pq.push(distance(target, points[i]));
    if (pq.size() > max_size) { // max_size is 10 here
        pq.pop(); // evict the largest so only the max_size smallest remain
    }
}
I think the complexity of this naive solution is O(n); is it possible to speed things up? Thanks.
If you are just presented with a list of N points and a single target point, the cost is likely to be O(N): if this is all you know, you will need to examine all N points to be sure of finding the closest 10.
If, for example, you have a list of N points and expect to receive a large number of queries, it may be worth building a data-structure to speed up the search, especially if the number of dimensions in the problem is small (such as 2, in the example you give). Data structures include https://en.wikipedia.org/wiki/Cover_tree and https://en.wikipedia.org/wiki/K-d_tree. If you are satisfied with finding 10 points which are pretty close but not necessarily the 10 closest, then packages such as https://www.cs.umd.edu/~mount/ANN/ may be useful.
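For the single-query case, here is a sketch of the average-O(N) approach using std::nth_element, which partitions the points so the k closest come first without fully sorting. The Point struct and squaredDistance helper are assumptions for illustration, not the asker's types.

#include <algorithm>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Squared Euclidean distance; ordering by it matches ordering by true distance.
double squaredDistance(const Point& a, const Point& b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Returns the k points closest to target, in no particular order, in O(N) average time.
std::vector<Point> closestK(std::vector<Point> points, const Point& target, std::size_t k) {
    if (k > points.size()) k = points.size();
    std::nth_element(points.begin(), points.begin() + k, points.end(),
                     [&](const Point& a, const Point& b) {
                         return squaredDistance(a, target) < squaredDistance(b, target);
                     });
    points.resize(k); // everything before index k is no farther than everything after
    return points;
}

Comparing squared distances avoids a sqrt per point while giving the same ordering.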
This was a question on our Algorithms final exam. It's verbatim because the prof let us take a copy of the exam home.
(20 points) Let I = {r1,r2,...,rn} be a set of n arbitrary positive integers and the values in I are distinct. I is not given in any sorted order. Suppose we want to find a subset I' of I such that the total sum of all elements in I' is exactly 100*ceil(n^.5) (each element of I can appear at most once in I'). Present an O(n) time algorithm for solving this problem.
As far as I can tell, it's basically a special case of the knapsack problem, namely the subset-sum problem ... both of which are NP-complete and in theory impossible to solve in linear time (unless P = NP)?
So ... was this a trick question?
This SO post basically explains that a pseudo-polynomial (linear) time solution can be done if the weights are bounded, but in the exam problem the weights aren't bounded, and either way, given the overall difficulty of the exam, I'd be shocked if the prof expected us to know or come up with an obscure dynamic programming algorithm.
There are two things that make this problem possible:
The input can be truncated to size O(sqrt(n)). There are no negative inputs, so you can discard any numbers greater than 100*sqrt(n), and all inputs are distinct so we know there are at most 100*sqrt(n) inputs that matter.
The playing field has size O(sqrt(n)). Although there are O(2^sqrt(n)) ways to combine the O(sqrt(n)) inputs that matter, you don't have to care about combinations that either leave the 100*sqrt(n) range or redundantly hit a target you can already reach.
Basically, this problem screams dynamic programming with each input being checked against each part of the 'reached number' space somehow.
The solution ends up being a matter of ensuring numbers don't reach off of themselves (by scanning in the right direction), of only looking at each number once, and of giving ourselves enough information to reconstruct the solution afterwards.
Here's some C# code that should solve the problem in the given time:
int[] FindSubsetToImpliedTarget(int[] inputs) {
    var target = 100 * (int)Math.Ceiling(Math.Sqrt(inputs.Length));

    // build up the how-X-was-reached table
    var reached = new int?[target + 1];
    reached[0] = 0; // the empty set reaches 0
    foreach (var e in inputs) {
        // we go backwards to avoid reaching off of ourselves
        for (var i = target; i >= e; i--) {
            // keep only the first way each sum was reached, so that a later
            // overwrite can never make a sum depend on its own element twice
            if (reached[i - e].HasValue && !reached[i].HasValue) {
                reached[i] = e;
            }
        }
    }

    // was the target even reached?
    if (!reached[target].HasValue) return null;

    // build the result by back-tracking via the logged reached values
    var result = new List<int>();
    for (var i = target; reached[i] != 0; i -= reached[i].Value) {
        result.Add(reached[i].Value);
    }
    return result.ToArray();
}
I haven't actually tested the above code, so beware typos and off-by-ones.
The typical DP algorithm for the subset-sum problem yields an O(n) time algorithm here. We use dp[i][k] (boolean) to indicate whether the first i items have a subset with sum k; the transition equation is:
dp[i][k] = dp[i-1][k - v[i]] || dp[i-1][k],
and it is O(N*M), where N is the size of the set and M is the targeted sum. Since the elements are distinct and the sum must equal 100*ceil(n^.5), we only need to consider the items with values at most 100*ceil(n^.5), of which there are at most 100*ceil(n^.5); then we get N <= 100*ceil(n^.5) and M = 100*ceil(n^.5).
The DP algorithm is O(N*M) = O(100*ceil(n^.5) * 100*ceil(n^.5)) = O(n).
OK, the following is a simple solution in O(n) time.
Since the required sum S is of the order of O(n^0.5), if we formulate an algorithm of complexity S^2, then we are good, since our algorithm is effectively O(n). A C++ sketch of the whole procedure follows this list.
1. Iterate once over all the elements and check whether each value is at most S. If it is, push it into a new array. This array contains a maximum of S elements, i.e. O(n^0.5) of them.
2. Sort this array in descending order in O(sqrt(n)*log(n)) time (< O(n)). This works because log(n) <= sqrt(n) for all natural numbers: https://math.stackexchange.com/questions/65793/how-to-prove-log-n-leq-sqrt-n-over-natural-numbers
3. Now this is a 1D knapsack problem with W = S and at most S elements.
4. Maximize the total weight of the items and check whether it equals S.
It can be solved using dynamic programming in linear time (linear w.r.t. W ~ S).
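Here is a minimal C++ sketch of that truncate-then-DP procedure. Note that the table-based DP below does not actually need the descending sort, so that step is omitted; the parent[] array records how each sum was first reached so the subset can be reconstructed.

#include <cmath>
#include <vector>

// Returns a subset of `values` (distinct positive ints) summing exactly to
// 100*ceil(sqrt(n)), or an empty vector if no such subset exists.
std::vector<int> findSubset(const std::vector<int>& values) {
    int n = (int)values.size();
    int target = 100 * (int)std::ceil(std::sqrt((double)n));
    // Step 1: only elements <= target can participate; since the values are
    // distinct positive integers, at most `target` of them survive, so the
    // DP below costs O(target^2) = O(n) on top of this O(n) filtering pass.
    std::vector<int> small;
    for (int v : values)
        if (v <= target) small.push_back(v);
    // Step 2: 0/1 subset-sum DP over sums 0..target.
    // parent[s] = the element that first reached sum s (-1 if unreached).
    std::vector<int> parent(target + 1, -1);
    std::vector<bool> reached(target + 1, false);
    reached[0] = true;
    for (int v : small)
        for (int s = target; s >= v; --s) // backwards so each element is used once
            if (reached[s - v] && !reached[s]) {
                reached[s] = true;
                parent[s] = v;
            }
    // Step 3: walk the parent links back from `target` to recover the subset.
    std::vector<int> result;
    if (!reached[target]) return result;
    for (int s = target; s > 0; s -= parent[s])
        result.push_back(parent[s]);
    return result;
}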
Given a list of intervals of time, I need to find the set of maximum non-overlapping intervals.
For example,
if we have the following intervals:
[0600, 0830], [0800, 0900], [0900, 1100], [0900, 1130],
[1030, 1400], [1230, 1400]
Also, it is given that the times have to be in the range [0000, 2400].
The maximum non-overlapping set of intervals is [0600, 0830], [0900, 1130], [1230, 1400].
I understand that maximum set packing is NP-Complete. I want to confirm if my problem (with intervals containing only start and end time) is also NP-Complete.
And if so, is there a way to find an optimal solution in exponential time, but with smarter preprocessing and pruning of the data? Or is there a relatively easy-to-implement fixed-parameter tractable algorithm? I don't want to go for an approximation algorithm.
This is not an NP-complete problem. I can think of an O(n * log(n)) algorithm using dynamic programming to solve it.
Suppose we have n intervals. Suppose the given range is S (in your case, S = [0000, 2400]). Either suppose all intervals are within S, or eliminate all intervals not within S in linear time.
Pre-process:
Sort all intervals by their begin points. Suppose we get an array A[n] of n intervals.
This step takes O(n * log(n)) time
For the end point of each interval, find the index of the first interval (in sorted order) whose begin point is at or after it. Suppose we get an array Next[n] of n integers.
If such a begin point does not exist for the end point of interval i, we assign n to Next[i].
We can do this in O(n * log(n)) time by enumerating the n end points and using a binary search for each. A linear approach may exist, but it doesn't matter, because the previous step already takes O(n * log(n)) time.
DP:
Let f[i] be the maximum number of non-overlapping intervals we can choose from among A[i], A[i+1], ..., A[n-1]. Then f[0] is the answer we want.
Also define f[n] = 0.
State transition equation:
f[i] = max{f[i+1], 1 + f[Next[i]]}
It is quite obvious that the DP step take linear time.
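Here is a sketch of that DP in C++, assuming a simple Interval struct; the answer's A[] and Next[] become a and next, and the binary search is done with std::lower_bound over the sorted begin points.

#include <algorithm>
#include <vector>

struct Interval { int begin, end; };

// Returns the maximum number of pairwise non-overlapping intervals.
int maxNonOverlapping(std::vector<Interval> a) {
    int n = (int)a.size();
    std::sort(a.begin(), a.end(), [](const Interval& x, const Interval& y) {
        return x.begin < y.begin;
    });
    std::vector<int> begins(n);
    for (int i = 0; i < n; ++i) begins[i] = a[i].begin;
    // next[i] = index of the first interval whose begin point is >= a[i].end (n if none)
    std::vector<int> next(n);
    for (int i = 0; i < n; ++i)
        next[i] = (int)(std::lower_bound(begins.begin(), begins.end(), a[i].end)
                        - begins.begin());
    // f[i] = best count among intervals i..n-1; f[n] = 0; scan right to left
    std::vector<int> f(n + 1, 0);
    for (int i = n - 1; i >= 0; --i)
        f[i] = std::max(f[i + 1], 1 + f[next[i]]);
    return f[0];
}

On the question's example this returns 3, matching the set shown above.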
The above solution is the one I came up with at first glance at the problem. After that, I also thought of a greedy approach which is simpler (but not faster in the sense of big-O notation):
(With the same notation and assumptions as the DP approach above)
Pre-process: Sort all intervals by their end points. Suppose we get an array B[n] of n intervals.
Greedy:
int ans = 0, cursor = S.begin;
for (int i = 0; i < n; i++) {
    // B is sorted by end point, so greedily take every interval that
    // starts at or after the point where the last taken one ended
    if (B[i].begin >= cursor) {
        ans++;
        cursor = B[i].end;
    }
}
The above two solutions came out of my own head, but your problem is also referred to as the activity selection problem, which is described on Wikipedia: http://en.wikipedia.org/wiki/Activity_selection_problem.
Also, Introduction to Algorithms discusses this problem in depth in Section 16.1.
I'm trying to solve the distance transform problem (using Manhattan distance). Basically, given a matrix of 0's and 1's, the program must assign to every position its distance to the nearest 1. For example, for this one
0000
0100
0000
0000
distance transform matrix is
2123
1012
2123
3234
Possible solutions off the top of my head:
Slowest ones (slowest because I tried to implement them and they were lagging on very big matrices):
1. Brute force: for every 1 the program reads, update the distances accordingly from beginning to end.
2. Breadth-first search from 0's: for every 0, the program looks for the nearest 1, searching inside out.
3. Same as 2, but starting from the 1's and marking every distance inside out.
Much faster (learned from other people's code):
Breadth-first search from 1's:
1. Assign all values in the distance matrix to -1 or a very big value.
2. While reading the matrix, put the positions of all 1's into a queue.
3. While the queue is not empty:
   a. Dequeue a position; call it x.
   b. For each position around x (at distance 1 from it):
      if the position is valid (does not exceed the matrix dimensions) then
         if its distance is not initialized or is greater than (distance of x) + 1 then
            I. distance = (distance of x) + 1
            II. enqueue the position
I wanted to ask if there is a faster solution to this problem. I tried to search for distance transform algorithms, but most of them deal with Euclidean distance.
Thanks in advance.
The breadth first search would perform Θ(n*m) operations where n and m are the width and height of your matrix.
You need to output Θ(n*m) numbers, so you can't get any faster than that from a theoretical point of view.
I'm assuming you are not interested in discussions involving caches and similar optimizations.
Note that this solution works in more interesting cases. For example, imagine the same question, but there could be different "sources":
00000
01000
00000
00000
00010
Using BFS, you will get the following distance-to-closest-source in the same time complexity:
21234
10123
21223
32212
32101
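For concreteness, here is a sketch of that multi-source BFS in C++: seed the queue with every 1-cell at distance 0, then expand in waves. Since every step has unit cost, a cell's distance is final the first time it is reached, so the "is greater than" re-check from the question's pseudocode is unnecessary.

#include <queue>
#include <utility>
#include <vector>

std::vector<std::vector<int>> distanceTransform(const std::vector<std::vector<int>>& grid) {
    if (grid.empty()) return {};
    int rows = (int)grid.size(), cols = (int)grid[0].size();
    std::vector<std::vector<int>> dist(rows, std::vector<int>(cols, -1));
    std::queue<std::pair<int, int>> q;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            if (grid[r][c] == 1) { dist[r][c] = 0; q.push({r, c}); }
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};
    while (!q.empty()) {
        auto [r, c] = q.front();
        q.pop();
        for (int k = 0; k < 4; ++k) {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr >= 0 && nr < rows && nc >= 0 && nc < cols && dist[nr][nc] == -1) {
                dist[nr][nc] = dist[r][c] + 1; // first visit = shortest distance
                q.push({nr, nc});
            }
        }
    }
    return dist;
}

Each cell enters the queue at most once, which gives the Theta(n*m) bound discussed above.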
However, with a single source, there is another solution that might have a slightly better performance in practice (even though the complexity is still the same).
First, let's observe the following property.
Property: If the source is at (a, b), then a point (x, y) has the following Manhattan distance:
d(x, y) = abs(x - a) + abs(y - b)
This should be quite easy to prove. So another algorithm would be:
for r in rows
for c in cols
d(r, c) = abs(r - a) + abs(c - b)
which is very short and easy.
Unless you write and test it, there is no easy way of comparing the two algorithms. Assuming an efficient bounded queue implementation (with an array), you have the following major operations per cell:
BFS: queue insertion/deletion, visit of each node 5 times (four times by neighbors, and one time out of the queue)
Direct formula: two subtractions and two ifs (for the absolute values)
It would really depend on the compiler and its optimizations as well as the specific CPU and memory architecture to say which would perform better.
That said, I'd advise for going with whichever seems simpler to you. Note however that with multiple sources, in the second solution you would need multiple passes on the array (or multiple distance calculations in one pass) and that would definitely have a worse performance than BFS for a large enough number of sources.
You don't need a queue or anything like that at all. Notice that if (i,j) is at distance d from (k,l), one way to realise that distance is to go left or right |i-k| times and then up or down |j-l| times.
So, initialise your matrix with big numbers and stick a zero everywhere you have a 1 in your input. Now do something like this:
for (int i = 0; i < sx-1; i++) {
    for (int j = 0; j < sy-1; j++) {
        dist[i+1][j] = min(dist[i+1][j], dist[i][j]+1); // push down
        dist[i][j+1] = min(dist[i][j+1], dist[i][j]+1); // push right
    }
    dist[i+1][sy-1] = min(dist[i+1][sy-1], dist[i][sy-1]+1); // last column, push down
}
for (int j = 0; j < sy-1; j++) {
    dist[sx-1][j+1] = min(dist[sx-1][j+1], dist[sx-1][j]+1); // last row, push right
}
At this point, you've found all of the shortest paths that involve only going down or right. If you do a similar thing for going up and left, dist[i][j] will give you the distance from (i, j) to the nearest 1 in your input matrix.
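For completeness, the mirrored backward pass would look like this (a sketch in the same style, using the same dist, sx, sy, and min conventions as above):

for (int i = sx-1; i > 0; i--) {
    for (int j = sy-1; j > 0; j--) {
        dist[i-1][j] = min(dist[i-1][j], dist[i][j]+1); // push up
        dist[i][j-1] = min(dist[i][j-1], dist[i][j]+1); // push left
    }
    dist[i-1][0] = min(dist[i-1][0], dist[i][0]+1); // first column, push up
}
for (int j = sy-1; j > 0; j--) {
    dist[0][j-1] = min(dist[0][j-1], dist[0][j]+1); // first row, push left
}

After both passes, every cell holds its exact Manhattan distance, and the whole thing is two linear scans with no queue at all.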
I've got a collection of 10000 - 100000 spheres, and I need to find the ones farthest apart.
One simple way to do this is to simply compare all the spheres to each other and store the biggest distance, but this feels like a real resource hog of an algorithm.
The Spheres are stored in the following way:
Sphere (float x, float y, float z, float radius);
The method Sphere::distanceTo(Sphere &s) returns the distance between the two center points of the spheres.
Example:
Sphere *spheres;
float biggestDistance = 0.0f; // must be initialized before the search
for (int i = 0; i < nOfSpheres; i++) {
    for (int j = 0; j < nOfSpheres; j++) {
        float d = spheres[i].distanceTo(spheres[j]); // compute once, not twice
        if (d > biggestDistance) {
            biggestDistance = d;
        }
    }
}
What I'm looking for is an algorithm that somehow loops through all the possible combinations in a smarter way, if there is any.
The project is written in C++ (which it has to be), so any solutions that only work in languages other than C/C++ are of less interest.
The largest distance between any two points in a set S of points is called the diameter. Finding the diameter of a set of points is a well-known problem in computational geometry. In general, there are two steps here:
Find the three-dimensional convex hull composed of the center of each sphere -- say, using the quickhull implementation in CGAL.
Find the points on the hull that are farthest apart. (Two points on the interior of the hull cannot be part of the diameter, or otherwise they would be on the hull, which is a contradiction.)
With quickhull, you can do the first step in O(n log n) in the average case and O(n^2) worst-case running time. (In practice, quickhull significantly outperforms all other known algorithms.) It is possible to guarantee a better worst-case bound if you can guarantee certain properties about the ordering of the spheres, but that is a different topic.
The second step can be done in O(h log h), where h is the number of points on the hull. In the worst case, h = n (every point is on the hull), but that's pretty unlikely if you have thousands of random spheres. In general, h will be much smaller than n. Here's an overview of this method.
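As an untested sketch of those two steps, assuming CGAL is available: CGAL::convex_hull_3 (a quickhull implementation) builds the 3D hull of the centers, and then a simple quadratic scan over just the hull vertices finds the farthest pair. The quadratic finish is O(h^2) rather than O(h log h), but with h typically tiny compared to n it is often fast enough.

#include <CGAL/Exact_predicates_inexact_constructions_kernel.h>
#include <CGAL/Polyhedron_3.h>
#include <CGAL/convex_hull_3.h>
#include <CGAL/squared_distance_3.h>
#include <algorithm>
#include <iterator>
#include <vector>

typedef CGAL::Exact_predicates_inexact_constructions_kernel K;

// Squared diameter of the set of sphere centers: hull first, then scan
// only the hull vertices. Assumes at least four non-coplanar centers.
double squaredDiameter(const std::vector<K::Point_3>& centers) {
    CGAL::Polyhedron_3<K> hull;
    CGAL::convex_hull_3(centers.begin(), centers.end(), hull);
    double best = 0.0;
    for (auto p = hull.points_begin(); p != hull.points_end(); ++p)
        for (auto q = std::next(p); q != hull.points_end(); ++q)
            best = std::max(best, CGAL::to_double(CGAL::squared_distance(*p, *q)));
    return best;
}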
Could you perhaps store these spheres in a BSP Tree? If that's acceptable, then you could start by looking for nodes of the tree containing spheres which are furthest apart. Then you can continue down the tree until you get to individual spheres.
Your problem looks like something that could be solved using graphs. Since the distance from Sphere A to Sphere B is the same as the distance from Sphere B to Sphere A, you can minimize the number of comparisons you have to make.
I think what you're looking at here is known as an adjacency list. You can build one up and then traverse it to find the longest distance.
Another approach will still give you O(n^2), but you can minimize the number of distance calculations: store the result of each calculation in a hash table where the key identifies the edge (so AB would hold the distance from A to B), and before you perform a distance calculation, check whether AB or BA already exists in the hash table.
EDIT
Using the adjacency-list method (which is basically a Breadth-First Search) you get O(b^d) or worst-case O(|E| + |V|) complexity.
Paul got my brain thinking and you can optimize a bit by changing
for (int j=0; j < nOfSpheres; j++)
to
for (int j=i+1; j < nOfSpheres; j++)
You don't need to compare sphere A to B and then B to A. This halves the number of comparisons, although the search is still O(n^2) overall.
--- Addition -------
Another thing that makes this calculation expensive is the distanceTo calculation:
distance = sqrt((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2)
That's a lot of math. You can trim it down by checking
(x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 > maxdist^2
instead. This removes the sqrt until the very end.
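Putting both ideas together (the j = i+1 loop and the deferred sqrt), here is a sketch; the Sphere field layout is an assumption based on the constructor shown in the question.

#include <cmath>

struct Sphere { float x, y, z, radius; }; // assumed layout, from the question's constructor

// Squared center-to-center distance: no sqrt inside the hot loop.
float squaredDistance(const Sphere& a, const Sphere& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

float biggestDistance(const Sphere* spheres, int nOfSpheres) {
    float maxSq = 0.0f;
    for (int i = 0; i < nOfSpheres; i++)
        for (int j = i + 1; j < nOfSpheres; j++) { // each unordered pair once
            float d = squaredDistance(spheres[i], spheres[j]);
            if (d > maxSq) maxSq = d;
        }
    return std::sqrt(maxSq); // single sqrt at the very end
}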