Finding the largest interval using Dynamic programming - algorithm

http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/05-dynprog.pdf
I was doing these questions for practice where I came across one that stumped me.
7.(a) Suppose we are given a set L of n line segments in the plane, where each segment has one endpoint on the line y = 0 and one endpoint
on the line y = 1, and all 2n endpoints are distinct. Describe and
analyze an algorithm to compute the largest subset of L in which no
pair of segments intersects.
(b) Suppose we are given a set L of n line segments in the plane,
where the endpoints of each segment lie on the unit circle x^2 + y^2 =
1, and all 2n endpoints are distinct. Describe and analyze an
algorithm to compute the largest subset of L in which no pair of
segments intersects.
I figured out how to do 7a (the question is a disguised problem to find the largest subset of increasing numbers) in O(n log n) time. I'm close to giving up on 7b, as I can't figure out a way to do it.
However, is there a way to convert 7b's premise to something more like 7a's? I feel like that's the right way of approaching the problem, and any help in figuring this out would be much appreciated.
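For reference, the reduction for 7(a) described above can be sketched as follows. This is only a minimal sketch under assumptions: each segment is given as a hypothetical pair (bottom_x, top_x), and the function name is made up for illustration. Sorting by the endpoint on y = 0 and taking a longest strictly increasing subsequence of the endpoints on y = 1 gives the size of the largest non-crossing subset.

```python
from bisect import bisect_left

def max_noncrossing_segments(segments):
    """segments: list of (bottom_x, top_x) pairs with all endpoints distinct (assumed format)."""
    # Order the segments by their endpoint on y = 0.
    tops = [top for _, top in sorted(segments)]
    # tails[k] = smallest possible last value of an increasing subsequence of length k + 1.
    tails = []
    for t in tops:
        pos = bisect_left(tails, t)
        if pos == len(tails):
            tails.append(t)
        else:
            tails[pos] = t
    return len(tails)
```

Two segments cross exactly when their order on y = 0 disagrees with their order on y = 1, which is why an increasing subsequence corresponds to a non-crossing subset.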

I couldn't come up with an O(n*log(n)) algorithm, but here is an O(n^2) one.
The idea is that we build a directed graph with vertices representing segments from the given set and edges representing the "lies to the right of" relation.
Let L be the list of segments: {(a1, b1), (a2, b2), ..., (an, bn)}, where ak and bk are k-th segment's endpoints.
Let L' be the list of segments: {(a1, b1), (b1, a1), (a2, b2), (b2, a2), ..., (an, bn), (bn, an)}.
Let the vertices of the graph have indices from 1 to 2*n, each index k representing the segment L'[k], i.e. (a_⌈k/2⌉, b_⌈k/2⌉) if k is odd, and (b_(k/2), a_(k/2)) if k is even.
A segment (a1, b1) is said to lie to the right of a segment (a2, b2) when the points a1, a2, b2, b1 are placed in a clockwise order on the unit circle.
Note that 1) If one segment lies to the right of another, they don't intersect; 2) If two segments from L don't intersect, two of the four corresponding segments from L' necessarily lie one to the right of another; 3) Any set of non-intersecting segments from L is defined by a series of segments of L', each lying to the right of the previous one.
Outline of the algorithm:
for every k1 from 1 to 2*n:
    for every k2 from 1 to 2*n:
        if (L'[k1].a, L'[k1].b) lies to the right of (L'[k2].a, L'[k2].b):
            add a directed edge (k1, k2) to the graph
Find the longest path in the graph: (k1, k2, ..., km).
The answer to the problem is: (⌈k1/2⌉, ⌈k2/2⌉, ..., ⌈km/2⌉).

Here is an O(n^2) algorithm.
We have 2n endpoints on the circle. Pick any point and start labeling the points in increasing order starting from 1 in clockwise direction. So, we have points labeled from 1 to 2n. Thus, any line segment l can be represented as (i,j) where i and j are endpoints of l and 1≤i<j≤2n.
Let L_{i,j} be the subset of L such that l=(a,b) ∈ L_{i,j} if i ≤ a < b ≤ j. Also, define D[i,j] for 1 ≤ i ≤ j ≤ 2n as the maximum number of non-intersecting segments in L_{i,j}. We need to find D[1,2n].
We use dynamic programming here. Treat D[i,j] as 0 whenever i ≥ j. Since all endpoints are distinct, each point i is the endpoint of exactly one segment; let k be the other endpoint of that segment. For i < j, fill in D in order of increasing j - i:
D[i,j] = D[i+1,j]    (the best subset does not use point i)
if i < k ≤ j :
    D[i,j] = max(D[i,j] , D[i+1,k-1] + D[k+1,j] + 1)    (keep segment (i,k); every other chosen segment lies entirely on one side of it)
In particular, if (i,j) itself is a segment this gives D[i,j] = D[i+1,j-1] + 1.
Since D has O(n^2) cells and we compute each cell of D in constant time, the algorithm takes O(n^2) time and space.
Reference:
http://www.cs.toronto.edu/~robere/csc373h/files/A2-sol.pdf
Look under the heading Line Intersections (Redux)
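One way to write an interval DP of this kind is sketched below. It is a sketch under assumptions rather than the referenced solution: the endpoints are assumed to be already labeled 0 to 2n-1 in clockwise order, each chord is a pair of labels, and the function name is hypothetical. It uses half-open ranges, so D[i][j] covers the points i, ..., j-1, and for each range it either drops point i or keeps the chord that starts at i.

```python
def max_noncrossing_chords(chords):
    """chords: list of (a, b) label pairs; labels 0..2n-1 in clockwise order, all distinct."""
    m = 2 * len(chords)
    partner = [0] * m
    for a, b in chords:
        partner[a], partner[b] = b, a

    # D[i][j] = best answer using only chords with both endpoints in [i, j).
    D = [[0] * (m + 1) for _ in range(m + 1)]
    for length in range(2, m + 1):
        for i in range(0, m - length + 1):
            j = i + length
            best = D[i + 1][j]            # option 1: point i is not used
            k = partner[i]
            if i < k < j:                 # option 2: keep chord (i, k)
                best = max(best, 1 + D[i + 1][k] + D[k + 1][j])
            D[i][j] = best
    return D[0][m]
```

Each cell looks at only one partner index, so the table is filled in O(n^2) time overall.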

Create two lists. The first list has the ordering created by sweeping counter-clockwise around the circle starting at (1,0). The first endpoint of a segment you hit is marked. The second list is ordered the same, but go clockwise around the circle. In this manner, you now have two lists where the segment endpoints show an ordering for intersection. For example, if the first point in the first list doesn't have its corresponding endpoint first in the second list, then it will intersect with all of the segments which appear before it in the second list. (You need to be careful here in that both points for a line can be before the end of your segment. A simple check eliminates this.) You can then just run the list. The total complexity of this approach seems to be O(n log n) for each list creation and O(n) for running the list.

Related

Given a simple polygon P, consisting of n vertices, and Set S Of k points, determine if each of the polygon vertices are covered by some point from S

Given a simple polygon P, consisting of n vertices, and a set S of k points, determine if each of the polygon vertices is covered by some point from S.
My best solution was to check, for every vertex of P, whether such a point exists in S, for a total complexity of O(n*k). I believe there should be a more efficient solution. Any hints?
Whether P is a polygon or not seems to be irrelevant. So the generalized question becomes: Given 2 sets of points A (with a points) and B (with b points), find out whether A is a subset of B or not?
The simple solution is O(a * b) but you can also get O(a + b) by doing some preprocessing.
1. Put all the points of B in a hash map with the x-coordinate as key and a hash set with the y-coordinates as values (Map<Number,Set<Number>>). This lets you query whether a point (x, y) is in B in O(1): map.containsKey(x) && map.get(x).contains(y).
2. Go through all the points of A and check whether the point is in B using the data structure created above.
Step 1 is O(b) and Step 2 is O(a) which gives O(a + b).
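The two steps can be sketched in a few lines. This is a minimal sketch, assuming points are (x, y) tuples that compare exactly (for example integers); it hashes whole tuples in one set instead of the nested map described above, and the function name is hypothetical.

```python
def is_subset(a_points, b_points):
    """Return True if every point of A also appears in B (expected O(a + b) time)."""
    b_set = set(b_points)                       # step 1: O(b) preprocessing
    return all(p in b_set for p in a_points)    # step 2: O(a) lookups
```

With floating-point coordinates, exact equality may be too strict, so some rounding or tolerance scheme would be needed first.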

Maximum area rectangle

Given a set of points (x[1]; y[1]), (x[2]; y[2]), ..., (x[n]; y[n]), we need to find the maximum area of a rectangle whose vertices all belong to the set. The rectangle does not have to be axis-aligned. For example, the answer for (1; 1), (2; 2), (2; 0), (3; 1) is 2.
n <= 1300; -10^9 <= x[i], y[i] <= 10^9.
Can someone help me with this problem? My solution is brute-force O(N^3), it's giving TLE. I select some three points and find fourth.
Every pair of points determines a line L, which has a slope m and an intercept c. (Ignore vertical lines for now.) Instead of considering the intercept, let's work with a different quantity that gives much the same information: The distance d(L) between the line and the origin, i.e., the length of a line segment R perpendicular to L and connecting L to the origin. Additionally, we can talk about the "displacement" of a point along L: We can say that the point p on L where it meets R has displacement 0, and the point on L that is x "above" p (has distance x from p and higher y coordinate) has displacement x, with negative displacements for points "below" p. In fact, we don't need the intercept or d(L) to define the displacement of a point with respect to a line L -- just the line's slope. Define disp(m, q) to be the displacement of point q on a line with slope m.
Suppose a, b, c, d are the vertices of a rectangle, with sides ab, bc, cd and da. Observe that the line containing ab has the same slope m as the line containing cd, and (disp(m, a), disp(m, b)) = (disp(m, d), disp(m, c)). So the only 4-tuples of vertices that we need to test are those comprised of pairs of vertex pairs like ab and cd -- vertex pairs having the same slope and displacement pairs. Furthermore, one side length (shared by ab and cd) is equal to |disp(m, b) - disp(m, a)|, and the other side length will be |d(Lab) - d(Lcd)|, where Lab and Lcd are the lines containing the line segments ab and cd, respectively.
To find these 4-tuples of vertices efficiently:
1. For all pairs of vertices i, j: let L be the line passing through i and j. Compute its slope m and distance d(L) from the origin. Also compute disp(m, i) and disp(m, j). If disp(m, i) <= disp(m, j), add the tuple (m, disp(m, i), disp(m, j), d(L)) to an array Z.
2. Sort Z lexicographically. This will place all point pairs lying on lines of the same slope and having equal displacements in a contiguous block, ordered by increasing d(L).
3. Scan through the array, looking for block boundaries: positions k at which any of the first three tuple elements changes (treat the end of the array as a final boundary). Let prev be the last such k found (initially, prev = 0). For each such k, compute (Z[k-1][3] - Z[prev][3]) * (Z[k-1][2] - Z[k-1][1]). This is the area of the largest rectangle having a pair of sides with slope Z[k-1][0] and length (Z[k-1][2] - Z[k-1][1]). If this is greater than the maximum rectangle size found so far, update it. Then set prev = k.
This algorithm takes O(n^2 log n) time and O(n^2) space.
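The grouping idea can be sketched with exact integer arithmetic. This is a variant under assumptions, not the answer's exact procedure: instead of a slope and a sort it uses a reduced integer direction vector (which also covers vertical lines) and a dictionary keyed by (direction, displacement pair), and it replaces d(L) with a signed offset given by a cross product. The function name and the input format (a list of integer (x, y) tuples) are hypothetical.

```python
from math import gcd

def max_rectangle_area(points):
    """points: list of integer (x, y) tuples. Returns the largest rectangle area, or 0."""
    groups = {}  # (dx, dy, disp_lo, disp_hi) -> [min offset, max offset]
    n = len(points)
    for a in range(n):
        ax, ay = points[a]
        for b in range(a + 1, n):
            bx, by = points[b]
            dx, dy = bx - ax, by - ay
            g = gcd(dx, dy)
            if g == 0:
                continue                      # duplicate point, no line
            dx, dy = dx // g, dy // g
            if dx < 0 or (dx == 0 and dy < 0):
                dx, dy = -dx, -dy             # canonical direction
            da = ax * dx + ay * dy            # displacement along the direction
            db = bx * dx + by * dy
            if da > db:
                da, db = db, da
            off = ax * dy - ay * dx           # signed offset of the line (same for a and b)
            key = (dx, dy, da, db)
            if key in groups:
                lo_hi = groups[key]
                lo_hi[0] = min(lo_hi[0], off)
                lo_hi[1] = max(lo_hi[1], off)
            else:
                groups[key] = [off, off]

    best = 0
    for (dx, dy, da, db), (lo, hi) in groups.items():
        # Both factors are scaled by the direction's length, hence the division.
        best = max(best, (db - da) * (hi - lo) // (dx * dx + dy * dy))
    return best
```

For the example in the question, `max_rectangle_area([(1, 1), (2, 2), (2, 0), (3, 1)])` gives 2. Hashing makes the expected running time O(n^2), at the cost of the same O(n^2) space.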

Algorithm to find if triangles formed by set of points contains origin or not and give total count as well?

Input: S = {p1, . . . , pn}, n points on 2D plane each point is given by its x and y-coordinate.
For simplicity, we assume:
The origin (0, 0) is NOT in S.
Any line L passing through (0, 0) contains at most one point in S.
No three points in S lie on the same line.
If we pick any three points from S, we can form a triangle. So the total number of triangles that can be formed this way is Θ(n^3).
Some of these triangles contain (0, 0), some do not.
Problem: Calculate the number of triangles that contain (0, 0).
You may assume we have an O(1) time function Test(pi, pj , pk) that, given three points pi, pj , pk in S, returns 1, if the triangle formed by {pi, pj , pk} contains (0, 0), and returns 0 otherwise. It’s trivial to solve the problem in Θ(n^3) time (just enumerate and test all triangles).
Describe an algorithm for solving this problem with O(n log n) run time.
My analysis of the above problem leads to the following conclusion.
There are 4 quadrants, ( + ,+ ) , ( + ,- ) , ( -, - ), ( -, + ), depending on whether the x and y coordinates are > 0 or not.
Let
s1 = coordinate x < 0 and y > 0
s2 = x > 0 , y > 0
s3 = x < 0 , y < 0
s4 = x > 0 , y < 0
Now we need to do the testing of points in between sets of the following combinations only
S1 S2 S3
S1 S1 S4
S2 S2 S3
S3 S3 S2
S1 S4 S4
S1 S3 S4
S1 S2 S4
S2 S3 S4
I now need to test the points in the above combinations of sets only (e.g. one point from s1, one point from s2 and one point from s3 for the first combination) and see whether the triangle contains (0,0) by calling the Test function (which is assumed to be a constant-time function here).
Can someone guide me on this ?
Image added below for clarification on why only some subsets (s1,s2 , s4 ) can contain (0,0) and some ( s1,s1,s3) cannot.
I'm guessing we're in the same class (based on the strange wording of the question), so now that the due date is past, I feel alright giving out my solution. I managed to find the n log n algorithm, which, as the question stated, is more a matter of cleverly transforming the problem, and less of a Dynamic Programming / DaC solution.
Note: This is not an exhaustive proof, I leave that to you.
First, some visual observations. Take some triangle that obviously contains the origin.
Then, convert the points to vectors.
Convince yourself that any selection of three points, one from each vector, describes a triangle that also contains the origin.
It also follows that, if you perform the above steps on a triangle that doesn't enclose the origin, any combination of points along those vectors will also not contain the origin.
The main point to get from this is, the magnitude of the vector does not matter, only the direction. Additionally, a hint to the question says that "any line crossing (0,0) only contains one point in S", from which we can extrapolate that the direction of each vector is unique.
So, if only the angle matters, it would follow that there is some logic that determines what range of points, given two points, could possibly form a triangle that encloses the origin. For simplicity, we'll assume we've taken all the points in S and converted them to vectors, then normalized them, effectively making all points lie on the unit circle.
So, take two points along this circle.
Then, draw a line from each point through the origin and to the opposite side of the circle.
It follows that, given the two points, any point that lies along the red arc can form a triangle.
So our algorithm should do the following:
1. Take each point in S. Make a secondary array A, and for each point, add its angle along the unit circle (atan2(y,x), shifted by 2π when negative so that 0 ≤ Ai < 2π) to A. Let's assume this is O(n).
2. Sort A in increasing order. O(n log n), assuming we use Merge Sort.
3. Count the number of triangles possible for each pair (Ai,Aj). This means that we count the number of Ak with Ai + π ≤ Ak ≤ Aj + π. Since the array is sorted, we can use a Binary Search to find the indices of Ai + π and Aj + π, which is O(2 log n) = O(log n).
However, we run into a problem: there are n^2 pairs, and if we have to do an O(log n) search for each, we have O(n^2 log n). So, we need to make one more observation.
Given some Ai < Aj, we'll say Tij describes the number of triangles possible, as calculated by the above method. Then, given a third Ak > Aj, we know that Tij ≤ Tik, as the number of points between Ai + π and Ak + π must be at least as many as there are between Ai + π and Aj + π. In fact, it is exactly the count between Ai + π and Aj + π, plus the count between Aj + π and Ak + π. Since we already know the count between Ai + π and Aj + π, we don't need to recalculate it - we only need to calculate the number between Aj + π and Ak + π, then add the previous count. It follows that:
A(n) = count(A(n),A(n-1)) + count(A(n-1),A(n-2)) + ... + count(A(1),A(0))
And this means we don't need to check all n^2 pairs, we only need to check consecutive pairs - so, only n-1.
So, all of the above gives us the following pseudocode solution.
int triangleCount(point P[], int n)
    double A[n];
    int C[n], totalCount = 0;
    for(i=0...n-1)
        A[i] = atan2(P[i].y, P[i].x);   // shift negative angles by 2π so that 0 ≤ A[i] < 2π
    mergeSort(A);
    int midPoint = binarySearch(A,π);
    for(i=0...midPoint-1)
        double left = A[i] + π, right = A[i+1] + π;
        C[i] = binarySearch(A,right) - binarySearch(A,left);
        for(j=0...i)
            totalCount += C[j]
    return totalCount;
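For comparison, here is a runnable sketch of a different, commonly used counting technique, not the pseudocode above: count the triangles that do not contain the origin and subtract from the total. A triangle misses the origin exactly when its three points fit in an open half-plane through the origin, so for each point p we count the m points strictly within a half-turn counter-clockwise of p and add C(m, 2). The function name is hypothetical, and the code relies on the stated assumptions (no point at the origin, no two points on one line through the origin).

```python
import math

def count_triangles_containing_origin(points):
    """points: list of (x, y) tuples satisfying the problem's assumptions."""
    n = len(points)
    # Sort by angle around the origin.
    pts = sorted(points, key=lambda p: math.atan2(p[1], p[0]))

    def cross(p, q):
        return p[0] * q[1] - p[1] * q[0]

    not_containing = 0
    j = 0
    for i in range(n):
        j = max(j, i + 1)
        # Advance j while pts[j] lies strictly within a half-turn CCW of pts[i].
        while j < i + n and cross(pts[i], pts[j % n]) > 0:
            j += 1
        m = j - i - 1
        not_containing += m * (m - 1) // 2
    return n * (n - 1) * (n - 2) // 6 - not_containing
```

The pointer j only moves forward, so after the O(n log n) sort the scan itself is linear.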
It seems that in the worst case there are Θ(n^3) triangles containing the origin, and since you need them all, the answer is no, there is no better algorithm.
For a worst case consider a regular polygon of an odd degree n, centered at the origin.
Here is an outline of the calculations. A chord connecting two vertices which are k < n/2 vertices apart is a base for Θ(k) triangles. Fix a vertex; its contribution is a sum over all chords coming from it, yielding Θ(n^2), and a total (a contribution of all n vertices) is Θ(n^3) (each triangle is counted 3 times, which doesn't affect the asymptotic).

Finding a square side length is R in 2D plane ?

I was at an interview at a high-frequency trading firm, and they asked me:
Find a square with side length R, given n points in the 2D plane
conditions:
--sides parallel to the axes
--it contains at least 5 of the n points
--the running time must not depend on R
They told me to give them an O(n) algorithm.
Interesting problem, thanks for posting! Here's my solution. It feels a bit inelegant but I think it meets the problem definition:
Inputs: R, P = {(x_0, y_0), (x_1, y_1), ..., (x_N-1, y_N-1)}
Output: (u,v) such that the square with corners (u,v) and (u+R, v+R) contains at least 5 points from P, or NULL if no such (u,v) exist
Constraint: asymptotic run time should be O(n)
Consider tiling the plane with RxR squares. Construct a sparse matrix, B defined as
B[i][j] = {(x,y) in P | floor(x/R) = i and floor(y/R) = j}
As you are constructing B, if you find an entry that contains at least five elements stop and output (u,v) = (i*R, j*R) for i,j of the matrix entry containing five points.
If the construction of B did not yield a solution then either there is no solution or else the square with side length R does not line up with our tiling. To test for this second case we will consider points from four adjacent tiles.
Iterate the non-empty entries in B. For each non-empty entry B[i][j], consider the collection of points contained in the tile represented by the entry itself and in the tiles above and to the right. These are the points in entries: B[i][j], B[i+1][j], B[i][j+1], B[i+1][j+1]. There can be no more than 16 points in this collection, since each entry must have fewer than 5. Examine this collection and test if there are 5 points among the points in this collection satisfying the problem criteria; if so stop and output the solution. (I could specify this algorithm in more detail, but since (a) such an algorithm clearly exists, and (b) its asymptotic runtime is O(1), I won't go into that detail).
If after iterating the entries in B no solution is found then output NULL.
The construction of B involves just a single pass over P and hence is O(N). B has no more than N elements, so iterating it is O(N). The algorithm for each element in B considers no more than 16 points and hence does not depend on N and is O(1), so the overall solution meets the O(N) target.
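A minimal sketch of the tiling idea follows, under some assumptions of its own: the square is treated as closed, the sparse matrix is a dictionary keyed by tile index, and the function name is hypothetical. It also checks every 2x2 block of tiles that touches a non-empty tile, not only the block whose lower-left tile is non-empty, since the five points could sit in the other three tiles of a block. Within a block (at most 16 points) it tries each point's x as the left edge and each point's y as the bottom edge.

```python
import math
from collections import defaultdict

def find_square(points, R, need=5):
    """Return (u, v) so that [u, u+R] x [v, v+R] holds at least `need` points, else None."""
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / R), math.floor(y / R))
        cells[key].append((x, y))
        if len(cells[key]) >= need:           # one tile already suffices
            return (key[0] * R, key[1] * R)

    seen = set()
    for (i, j) in list(cells):
        for ai in (i - 1, i):                 # anchors of the 2x2 blocks containing tile (i, j)
            for aj in (j - 1, j):
                if (ai, aj) in seen:
                    continue
                seen.add((ai, aj))
                block = [p for di in (0, 1) for dj in (0, 1)
                         for p in cells.get((ai + di, aj + dj), ())]
                for u in {p[0] for p in block}:
                    for v in {p[1] for p in block}:
                        inside = sum(1 for x, y in block
                                     if u <= x <= u + R and v <= y <= v + R)
                        if inside >= need:
                            return (u, v)
    return None
```

Every tile holds fewer than `need` points by the time the second loop runs, so the work per block is bounded by a constant and the total stays linear in the number of points (expected, because of hashing).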
Run through set once, keeping the 5 largest x values in a (sorted) local array. Maintaining the sorted local array is O(N) (constant time performed N times at most).
Define xMax and xMin as the x-coordinates of the two points with the largest and 5th largest x values respectively (i.e. a[0] and a[4]).
Sort a[] again on Y value, and set yMin and yMax as above, again in constant time.
Define deltaX = xMax- xMin, and deltaY as yMax - yMin, and R = largest of deltaX and deltaY.
The square of side length R located with upper-right at (xMax,yMax) meets the criteria.
Observation if R is fixed in advance:
O(N) complexity means no sort is allowed except on a fixed number of points, as only a Radix sort would meet the criteria and it requires a constraint on the values of xMax-xMin and of yMax-yMin, which was not provided.
Perhaps the trick is to start with the point furthest down and left, and move up and right. The lower-left-most point can be determined in a single pass of the input.
Moving up and right in steps and counting points in the square requires sorting the points on X and Y in advance, which, to be done in O(N) time, requires that the Radix sort constraint be met.

How to find the minimal bounding rectangles for a set of lines?

Provided a set of N connected lines on a 2D axis, I am looking for an algorithm which will determine the X minimal bounding rectangles.
For example, suppose I am given 10 lines and I would like to bound them with at most 3 (potentially intersecting) rectangles. So if 8 of the lines are clustered closely together, they may use 1 rectangle, and the other two may use a 2nd or perhaps also a 3rd rectangle depending on their proximity to each other.
Thanks.
If the lines are actually a path, then perhaps you wouldn't be averse to the requirement that each rectangle cover a contiguous portion of the path. In this case, there's a dynamic program that runs in time O(n^2 r), where n is the number of segments and r is the number of rectangles.
Compute a table with entries C(i, j) denoting the cost of covering segments 1, …, i with j rectangles. The recurrence is
C(0, 0) = 0
C(i, 0) = ∞ for i > 0
C(i, j) = min over i' < i of (C(i', j - 1) + [cost of the rectangle covering segments i' + 1, …, i]) for i, j > 0
There are O(n r) entries, each of which is computed in time O(n). Recover the optimal collection of rectangles at the end by, e.g., storing the best i' for each entry.
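The recurrence can be sketched as below. Several things here are assumptions for illustration: the cost of a rectangle is taken to be the area of the axis-aligned bounding box of the segments it covers (the answer leaves the cost unspecified), segments are pairs of (x, y) endpoints in path order, and the function name is hypothetical. The bounding box is grown incrementally as i' decreases, which keeps the total time at O(n^2 r).

```python
def min_cover_cost(segments, r):
    """segments: list of ((x1, y1), (x2, y2)) in path order. Cover with at most r boxes."""
    n = len(segments)
    INF = float("inf")
    # C[i][j] = min cost of covering the first i segments with exactly j non-empty boxes.
    C = [[INF] * (r + 1) for _ in range(n + 1)]
    C[0][0] = 0
    for i in range(1, n + 1):
        min_x = min_y = INF
        max_x = max_y = -INF
        # ip plays the role of i': the last box covers segments ip+1..i (1-based).
        for ip in range(i - 1, -1, -1):
            (x1, y1), (x2, y2) = segments[ip]
            min_x, max_x = min(min_x, x1, x2), max(max_x, x1, x2)
            min_y, max_y = min(min_y, y1, y2), max(max_y, y1, y2)
            area = (max_x - min_x) * (max_y - min_y)
            for j in range(1, r + 1):
                if C[ip][j - 1] + area < C[i][j]:
                    C[i][j] = C[ip][j - 1] + area
    return min(C[n])
```

Storing the best i' alongside each C[i][j] would allow the rectangles themselves to be recovered, as the answer suggests.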
I don't know of a simple, optimal algorithm for the general case. Since there are “only” O(n^4) rectangles whose edges each contain a segment endpoint, I would be tempted to formulate this problem as an instance of generalized set cover.
