How to split a polygon along n given diagonals? - computational-geometry

Say I have a polygon represented as a list of vertices in CCW order (not a DCEL) and I have a given list of diagonals of that polygon. How can I split the polygon along all of those diagonals into a list of n+1 polygons?
I'm having no trouble splitting the list along one diagonal. The problem is quickly determining which of the two remaining polygons my other diagonals belong to. From there, I could split the list of diagonals into two lists, and recursively operate on the two split polygons.
Preferably, I'd like to do this in O(n log(n)) time, as opposed to the obvious algorithm of simply walking around the two split polygons to determine which diagonals lie in which of the subpolygons.

Here is an outline of what I think is a solution.
Let's assume you already made sure the diagonals do not intersect each other and are fully contained inside the polygon.
Then (statement 1) you can treat the task as fully combinatorial and forget about the real geometric layout details of the polygon and diagonals. Let's think about the task as if you are given a regular polygon with vertices [0,1,..,N-1] and you cut it with a set of diagonals [i_k, j_k], 0 <= k < n.
For any vertices i and j let's call
arc_length(i, j) := min(max(i,j) - min(i,j), N - max(i,j) + min(i,j)) -
the least number of steps from i to j or from j to i along the cycle [0,1,..,N-1].
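For concreteness, here is a minimal C++ sketch of that helper (N, the number of polygon vertices, is assumed to be passed in explicitly):
#include <algorithm>
#include <cstdlib>

// Least number of steps between vertices i and j along the cycle [0, 1, ..., N-1].
int arc_length(int i, int j, int N) {
    int d = std::abs(i - j);    // distance going one way around the cycle
    return std::min(d, N - d);  // going the other way takes N - d steps
}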
Let's call sortedDiagonals a container (std::vector<std::pair<int,int>>, std::list<std::pair<int,int>>, or alike) of all your diagonals sorted by arc_length(i, j), that is, sorted with a comparator like
struct LessPredicate {
    int N;  // number of vertices of the polygon
    bool operator()(const std::pair<int,int>& d1, const std::pair<int,int>& d2) const {
        return arc_length(d1.first, d1.second, N) < arc_length(d2.first, d2.second, N);
    }
};
Now let's cut our regular polygon along the diagonals from sortedDiagonals, starting from the 'shortest' and going to the 'longest'. Imagine a picture of a regular N-gon inscribed into a circle with the first few diagonals from sortedDiagonals drawn. Obviously the picture shows our regular polygon split into smaller polygons, one of which contains the center of the circle (the case where a diagonal crosses the center can happen only once and only at the very last step due to the sorting, so it's easy to handle as a special case). (Statement 2) Every time we draw the next diagonal from sortedDiagonals, that diagonal belongs to the polygon containing the center.
Now, every time you take the next diagonal from sortedDiagonals you know the polygon containing the center (say, you remember the index of that polygon or a pointer to its structure, etc.), so you know the polygon the diagonal belongs to. You cut that polygon with the diagonal into two parts and remember which one of them contains the center, so you know it on the next step.
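Here is a hedged sketch of the single splitting step, keeping each sub-polygon as a CCW list of original vertex indices (split_along_diagonal and its bookkeeping are illustrative names, not part of the answer above):
#include <utility>
#include <vector>

// Split a sub-polygon (a CCW list of original vertex indices) along a diagonal whose
// endpoints vi and vj are known to be vertices of that sub-polygon. Both endpoints of
// the diagonal appear in each of the two resulting sub-polygons.
std::pair<std::vector<int>, std::vector<int>>
split_along_diagonal(const std::vector<int>& poly, int vi, int vj) {
    int pi = -1, pj = -1;                        // positions of vi and vj inside poly
    for (int k = 0; k < (int)poly.size(); ++k) {
        if (poly[k] == vi) pi = k;
        if (poly[k] == vj) pj = k;
    }
    if (pi > pj) std::swap(pi, pj);
    std::vector<int> first(poly.begin() + pi, poly.begin() + pj + 1);   // vertices pi..pj
    std::vector<int> second(poly.begin() + pj, poly.end());             // vertices pj..end
    second.insert(second.end(), poly.begin(), poly.begin() + pi + 1);   // ...wrapping around to pi
    return {first, second};
}
In the regular-polygon model every sub-polygon is convex (its vertices lie on the circle), so which of the two halves contains the center can be decided with a single orientation test of the center against the diagonal's line.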
If you need proof of statements 1 and 2, here is a sketch.
Statement 1 is true because, if the diagonals are fully inside the given polygon and can intersect each other only at their end points, then this polygon with all these diagonals, and a regular polygon with the same number of vertices and with diagonals connecting the vertices with the same indices, are the same as graphs, in the sense that their incidence data is identical. So we reduce the task to the case of a regular polygon, which mainly serves to simplify the explanation of which polygon contains the next diagonal from our list.
Statement 2. Suppose we cut a regular polygon with someDiagonal not crossing the center. The result is two polygons, one of which does not contain the center - let's call it the 'small polygon'. What can we say about arc_length(diag) for any diag fully contained in the 'small polygon'? It cannot be greater than arc_length(someDiagonal), and the only case when
arc_length(diag) == arc_length(someDiagonal)
is when diag == someDiagonal.
Now suppose we drew in our regular polygon the first k diagonals from sortedDiagonals. Each of them cuts its own 'small polygon' off the whole regular polygon (and possibly some of these 'small polygons' contain other 'small polygons' from other diagonals; that's ok). Now we draw the (k+1)-th diagonal from sortedDiagonals and notice that, due to our sorting, all previously drawn diagonals d_i were not greater than d_(k+1) in the sense of arc_length. Hence their 'small polygons' have no inner diagonals of arc_length greater than or equal to arc_length(d_(k+1)). Hence d_(k+1) cannot belong to any of the 'small polygons' of the diagonals preceding it in sortedDiagonals, and it can belong only to the polygon containing the center point.
hth

Related

Triangle enclosing the biggest number of points

Given set of 2D points find a triangle built from those points, that encloses the biggest number of points.
A brute-force algorithm for this just builds triangles from every possible triad of points and checks how many points they enclose, but the time complexity of this solution is O(n^4).
For the optimal solution I thought about first finding the convex hull of those points and arranging points inside this hull with some structure, but I can't figure it out.
Do you have any ideas about the optimal solution for this kind of problem?
In a set of n points, there are (n choose 3) triangles, and using brute force to check for each point whether it is contained in each triangle indeed has O(n^4) complexity. To give a practical example of a few set sizes:
points: 100 1,000 10,000
triangles: 161,700 166,167,000 166,616,670,000
checks: 15,684,900 165,668,499,000 1,665,666,849,990,000
Below are a few geometrical ideas; they don't lead straight to a solution, but they can reduce the number of triangles that have to be checked.
Counter-example for convex hull
First of all, using only points on the convex hull is not guaranteed to give the optimal solution. Consider this counter-example:
The convex hull is the red rectangle. However, if we use two of its sides and a diagonal to form a triangle, the diagonal will cut through the central point cluster and leave out some of the points. And even if we only use 1 or 2 corners of the rectangle, combined with a point in the center, it will always cut through the blue triangle and leave out some points. The blue triangle, which has no points on the convex hull, is in fact the optimal solution.
Triangle contained in triangle
If you consider a triangle abc, and three points d, e and f contained within it, then the triangle def cannot be the triangle which contains the most points, because triangle abc contains at least three more points. Triangles made from a combination of points from abc and def, like abd, also contain fewer points than abc.
This means that finding a triangle and some points contained within it, allows you to discard a number of triangles. In the next paragraphs, we will use this idea to discard as many triangles as possible from having to be checked.
Expanding a triangle
If we consider a triangle made from three randomly chosen points a, b and c (named clockwise), and then check whether all other points are on the left or right side of the lines |ab|, |bc| and |ca|, the points are partitioned into 7 zones:
If we replace a corner point of the triangle by a point in the adjacent coloured zone, e.g. zone LRL for point a, we get a larger triangle that contains triangle abc. If we randomly pick three points from zones LRL, LLR and RLL, we can expand the triangle like this:
We can then partition the points again using this new triangle a'b'c' (points already in zone RRR can be added to the new zone RRR without checking) and expand the triangle again as long as there is at least one point in the zones LRL, LLR or RLL.
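The left/right classification itself is just three orientation tests. A minimal sketch, assuming the triangle is named clockwise so that RRR means inside (side uses the usual y-up orientation; flip the comparison for image coordinates):
#include <string>

struct Pt { double x, y; };

// Sign of the cross product (b-a) x (p-a): > 0 means p is to the left of the
// directed line a->b (with the usual mathematical y-up orientation).
double side(const Pt& a, const Pt& b, const Pt& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Classify p against the clockwise triangle abc by its side of |ab|, |bc| and |ca|,
// giving one of the 7 zones ("RRR" is the interior).
std::string classify(const Pt& a, const Pt& b, const Pt& c, const Pt& p) {
    std::string zone;
    zone += side(a, b, p) > 0 ? 'L' : 'R';
    zone += side(b, c, p) > 0 ? 'L' : 'R';
    zone += side(c, a, p) > 0 ? 'L' : 'R';
    return zone;
}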
If we have caught enough points inside the expanded triangle, we can now use the brute force algorithm, but skip any triangle which doesn't have a point outside of the expanded triangle a'b'c'.
If we haven't caught enough points to make that feasible, we can try again with another three random points. Note, however, that you should not use the union of the points contained within several triangles; three points which are each contained in another triangle, but not in the same triangle, can still be the triangle containing the most points.
Excluding triangles in multiple steps
We could repeatedly choose a random triangle, expand it maximally, and then mark the triangles made from three points on or inside the triangle, to then exclude these from the check. This would require storing a boolean for all possible triangles, e.g. in a 3D bit array, so it is only feasible for sets up to a few thousand points.
To simplify things, instead of expanding random triangles, we could do this with a number of randomly chosen triangles, or triangles made from points on the convex hull, or points far apart when sorted in the x or y-direction, or ... But any of these methods will only help us to find triangles which can be excluded, they will not give us optimal (or even good-enough) triangles by themselves.

How do I detect bounded regions of closed-loop polygons?

I am given an ordered set of points in 2D space as an output of a previous process. Coordinate points will be given in the form ((x0,y0),(x1,y1),...,(xn,yn)), where the very last coordinate pair will be a repetition of the first pair (i.e. x0 = xn and y0 = yn). In this way, I know that when I re-encounter the same coordinate, I have made a closed loop. I would like to be able to detect the area enclosed by the polygon. If a single closed loop is given, the output should be the enclosed area of that closed loop.
Now say I have a separate set of points, similar to the first set. If a set of many closed-loop polygons is given, and each polygon is separated in space from the others, the output should be each enclosed area. However, if some of the polygons enclose each other, it should be the area bounded between them. For example, if I have one closed-loop polygon inside another, the output area should be the area between both of them (or in other words, the area enclosed by the larger one minus the area enclosed by the smaller one). If I have more than one closed-loop polygon inside of a single larger one, it should be the area enclosed by the larger one minus all the areas enclosed by the smaller ones.
For a case where I have a region A enclosed by a region B, where B is enclosed by a region C, there are three distinct regions.
1) Region C minus region B (bounded on the outside by polygon 1)
2) Region B minus region A (bounded on the outside by polygon 2)
3) Region A (bounded on the outside by polygon 3)
Of the three regions, I only want region 1) and region 3). The reason I do not take region 2) is because for all the bounded areas on my 2D plane, the outermost polygons always represent the boundaries of a relevant region, and the input that produced my sets of coordinates representing my closed loop polygons would never have given me the points for polygon 2 if in the end region 1) and region 2) were meant to be combined. It instead would have given me only polygon 1 and polygon 3, similar to the case I described above.
In summary,
- I am given enough information to know all the coordinate points for a set of closed loop polygons on a 2D plane and they are distinguishable from each other.
- I need to develop an algorithm that will take in the entire set of closed-loop polygons and return enough information to describe a bounded area. In thinking about the problem, I think the output that I would want is to know, for each and every line segment of a closed-loop polygon, which side of that line segment is inside the polygon and which side is outside.
- It should be able to resolve the case where I have polygons inside of polygons.
- Closed loop polygons will never share any points. Each set of coordinate points will be unique to a polygon.
My initial thought was to calculate the centroid of the polygon and then compare all line segments to the centroid, but I don't think this would work for all cases.
Judging by the description of your input, splitting your input stream into separate polygons is a trivial task.
After that, in order to "return enough information to describe a bounded area" you can build the following data structure out of your polygons:
Separate all polygons into two classes: main polygons and hole polygons.
Main polygon is... well, the exterior border of a bounded area. It separates the interior of the bounded area from the "outside world".
Hole polygon is a polygon that describes a hole in some main polygon.
Each hole polygon is associated with exactly one main polygon
Each main polygon is associated with zero or more hole polygons
Optionally, you can order the vertices in main polygons counterclockwise and vertices in hole polygons clockwise. But this is not strictly necessary to satisfy the formal requirement of "describing a bounded area"
The resulting structure is two-tiered: you end up with a list of main polygons, and each main polygon might contain a list of its holes.
In your example, you have 4 main polygons. One of them contains two hole polygons.
So, all you need to do is to recognize hole polygons and properly associate them with their main polygons.
An industrial-strength approach for solving this task would be an application of sweep-line algorithm to the input polygons. It would easily perform the classification into main and hole polygons as well as build the proper association between them.
An ad-hoc algorithm might look as follows
Sort all polygons in order of increasing area
For each polygon p in the sorted order
Take any vertex v of p
Perform an "inside" test of v against all polygons of greater area than p (for example, by using a simple even-odd intersection test: How can I determine whether a 2D Point is within a Polygon?)
If the number of polygons that contain v is odd, p is a hole polygon. Otherwise, p is a main polygon
If p is a hole polygon, then the smallest-area polygon that contains v is its associated main polygon.
That's it.
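A compact sketch of that ad-hoc procedure, using the shoelace formula for areas and a standard even-odd ray-casting test in place of the linked point-in-polygon question (the helper names and the Info layout are mine, not the answer's):
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };
using Polygon = std::vector<Pt>;

// Shoelace formula; absolute value, so the vertex orientation does not matter.
double polygon_area(const Polygon& poly) {
    double s = 0;
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++)
        s += (poly[j].x + poly[i].x) * (poly[j].y - poly[i].y);
    return std::fabs(s) / 2;
}

// Standard even-odd (ray-casting) point-in-polygon test.
bool point_in_polygon(const Pt& p, const Polygon& poly) {
    bool inside = false;
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++)
        if ((poly[i].y > p.y) != (poly[j].y > p.y) &&
            p.x < poly[j].x + (p.y - poly[j].y) * (poly[i].x - poly[j].x) / (poly[i].y - poly[j].y))
            inside = !inside;
    return inside;
}

struct Info { bool is_hole; int main_index; };   // main_index is -1 for main polygons

// Sorts 'polys' in place by increasing area and classifies each one.
// For a hole, main_index is the index (in the sorted vector) of its main polygon.
std::vector<Info> classify(std::vector<Polygon>& polys) {
    std::sort(polys.begin(), polys.end(), [](const Polygon& a, const Polygon& b) {
        return polygon_area(a) < polygon_area(b);
    });
    std::vector<Info> info(polys.size(), Info{false, -1});
    for (std::size_t i = 0; i < polys.size(); ++i) {
        const Pt& v = polys[i].front();                       // any vertex of polygon i
        int containing = 0, smallest = -1;
        for (std::size_t j = i + 1; j < polys.size(); ++j)    // only larger-area polygons
            if (point_in_polygon(v, polys[j])) {
                if (smallest == -1) smallest = (int)j;        // smallest-area containing polygon
                ++containing;
            }
        if (containing % 2 == 1) info[i] = Info{true, smallest};
    }
    return info;
}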
I think the output that I would want is to know, for each and every line segment of a closed loop polygon, which side of that line segment is inside the polygon and which side is outside.
Compute the normal for the line segment (perpendicular line).
Compute the mid-point of the line segment (any point will do).
Intersect every other line segment with a ray projected from the mid-point in the direction of the normal.
Intuitively, each intersection means either entering or exiting another polygon. Since the ray eventually ends up outside all polygons, we can deduce that if the ray intersects an even number of other line segments, then the side of the line segment indicated by the normal is on the outside of the polygon. If odd, it's on the inside.
There are a couple of tricky cases: one is where the ray exactly intersects the end-point of two connected line segments. Be careful to only count this as one intersection. The other case is where the ray is parallel to and exactly overlaps another line segment. This should count as two intersections.
There are more efficient algorithms (e.g. involving triangulation), but this one is simplest.

Surface subdivision into equal parts

I have a closed contour represented by a list of points and I need to split it into equal parts, knowing the area of the parts.
I think that I can use some subdivision algorithm, like Delaunay subdivision. But with this method I have to give the centroid of the subdivided parts.
Does anyone have some hints?
If I understand correctly: given, say, a rectangle of area 10 and a target area of 1, you would need to partition the rectangle into 10 parts, each having an area of 1. So slicing the rectangle into 10 thin rectangles (like guitar frets, or bread slices) would do.
If that's the case, then I would do the following:
Create a function to compute the area of a convex poly. This is fairly trivial (since poly is convex).
Observe that, since the input poly is convex, any line segment that splits the polygon into two will intersect its boundary in exactly two places. Specifically, you can triangulate the polygon by picking a vertex of the poly and connecting it to every other vertex of the polygon, like a fan.
Triangulating in this fashion would create a partition that would be close to what you need. Assume that the input polygon is given by a vertex list poly = {v1, v2, v3, ..., vn}, where verts are unique and no three are co-linear (convex poly).
Observe that given a triangle of that poly formed by say (v2,v3,v4), we can compute its area, A1. Now if we grow the triangle into a poly by adding one extra vertex, say v5, to form (v2,v3,v4,v5), the area increases to A2 (the sum of the two triangles (v2,v3,v4) and (v2,v4,v5)). Due to linearity, if you want to grow the original triangle to some area A2' where A1 < A2' < A2, you can interpolate on the line segment (v4,v5) to find the point v4' that gives you exactly the area A2' that you need.
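A small sketch of that interpolation step (the names and the exact parameterization are my own; target plays the role of A2'):
#include <cmath>

struct Pt { double x, y; };

// Area of triangle abc, via the cross product.
double triangle_area(const Pt& a, const Pt& b, const Pt& c) {
    return std::fabs((b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x)) / 2;
}

// Given the fan grown from v2, with A1 = area(v2,v3,v4) and A2 = A1 + area(v2,v4,v5),
// find the point v4' on the segment (v4,v5) such that the piece (v2,v3,v4,v4') has the
// desired area 'target' (A1 <= target <= A2). Because area(v2,v4,x) grows linearly as x
// moves from v4 towards v5, this is a plain linear interpolation.
Pt cut_point(const Pt& v2, const Pt& v3, const Pt& v4, const Pt& v5, double target) {
    double A1 = triangle_area(v2, v3, v4);
    double extra = triangle_area(v2, v4, v5);   // area gained by going all the way to v5
    double t = (target - A1) / extra;           // fraction of that extra area we actually need
    return {v4.x + t * (v5.x - v4.x), v4.y + t * (v5.y - v4.y)};
}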
Since you can compute the total area of initial input poly, and you know the target area of each subdivision, you can cut the input poly into pieces of desired area until you subdivide the entire thing. If you want a nicer partition, you can start from the center of the polygon, i.e. first (seed triangle) would be (center, v1,v2). Then shrink/grow it until desired area, move to the next triangle, repeat.
Hope that makes sense :D

Algorithm to take the union of rectangles and to see if the union is still a rectangle

I have a problem in which I have to test whether the union of a given set of rectangles forms a rectangle or not. I don't have much experience solving computational geometry problems.
What my approach to the problem was that since I know the coordinates of all the rectangles, I can easily sort the points and then deduce the corner points of the largest rectangle possible. Then I could sweep a line and see if all the points on the line falls inside the rectangle. But, this approach is flawed and this would fail because the union may be in the form of a 'U'.
It would be a great help if you could push me in the right direction.
Your own version does not take into account that the edges of the rectangles can be non-parallel to each other. Therefore, there might not be a "largest rectangle possible".
I would try this general approach:
1) Find the convex hull. You can find convex hull calculation algorithms here http://en.wikipedia.org/wiki/Convex_hull_algorithms.
2) Check if the convex hull is a rectangle. You can do this by looping through all the points on convex hull and checking if they all form 180 or 90 degree angles. If they do not, union is not a rectangle.
3) Go through all points on the convex hull. For each point check if the middle point between ThisPoint and NextPoint lies on the edge of any initially given rectangle.
If every middle point does, union is a rectangle.
If it does not, union is not a rectangle.
Complexity would be O(n log h) for finding convex hull, O(h) for the second part and O(h*n) for third part, where h is number of points on the convex hull.
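A sketch of the angle check in step 2) (tolerance-based; hull_is_rectangle is my name, and the hull is assumed to be given in order):
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Every turn along the hull must be either straight (180 degrees) or a right angle (90 degrees).
bool hull_is_rectangle(const std::vector<Pt>& hull, double eps = 1e-9) {
    std::size_t n = hull.size();
    for (std::size_t i = 0; i < n; ++i) {
        const Pt& a = hull[i];
        const Pt& b = hull[(i + 1) % n];
        const Pt& c = hull[(i + 2) % n];
        double ux = b.x - a.x, uy = b.y - a.y;   // edge a -> b
        double vx = c.x - b.x, vy = c.y - b.y;   // edge b -> c
        double dot = ux * vx + uy * vy;
        double cross = ux * vy - uy * vx;
        bool straight = std::fabs(cross) < eps && dot > 0;   // collinear, same direction
        bool right_angle = std::fabs(dot) < eps;             // perpendicular
        if (!straight && !right_angle) return false;
    }
    return true;
}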
Edit:
If the goal is to check whether the resulting object is a filled rectangle, not just a rectangle of edges and corners, then add step 4).
4) Find all line segments that are formed by intersecting or touching rectangles. Note - by definition all of these line segments are segments of edges of the given rectangles. If a rectangle does not touch/intersect other rectangles, the line segments are its edges.
For each line segment check if its middle point is
On the edge of the convex hull
Inside one of given rectangles
On the edge of two non-overlapping given rectangles.
If at least one of these is true for every line segment, resulting object is a filled rectangle.
You could deduce the corner points of the largest rectangle possible, and then go over all the rectangles that share a border with the largest possible rectangle, for example the bottom, and make sure that the line is entirely contained in their borders. This will also fail if an empty space in the middle of the rectangle is a problem, however. I think the complexity will be O(n^2).
I think you are headed in the right direction. After you get the coordinates of the largest possible rectangle:
If the largest possible rectangle is a valid rectangle, then each side of it must be a union of sides of the original rectangles. You can scan the original rectangle set and find those rectangles that are part of the largest side we are looking for (this can be done in O(n) by checking if X==largestRectangle.Top.X if you are looking at the top side, etc.); let's call them S.
For each side s in S we can create an interval [from,to]. All we need to check is whether the union of all intervals matches the side of the largest rectangle. This can be done in O(n log(n)) by standard algorithms, or in O(n) on average by some hash trick (see http://www.careercup.com/question?id=12523672 , see my last comment there for the O(n) algorithm).
For example, say we have two 1*1 rectangles in the first quadrant, whose left-bottom coordinates are (0,0) and (1,0). The largest rectangle is 2*1 with left-bottom coordinate (0,0). Since [0,1] Union [1,2] is [0,2], the top side and bottom side match the largest rectangle, and similarly for the left and right sides.
Now suppose we have a U shape: 3*1 at (0,0), 1*1 at (0,1), 1*1 at (2,1). We get the largest rectangle 3*2 at (0,0). Since for the top side we get [0,1] Union [2,3], which does not match [0,3], the algorithm will output that the union of the above rectangles is not a rectangle.
So you can do this in O(n) on average, or O(n log(n)) if you don't want to mess with some complex hash bucket algorithm. Much better than O(n^4)!
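For one side, the interval-union check might look like the following sort-and-merge sketch (O(n log(n)); the names are mine, and exact comparisons are fine as long as the coordinates are integers):
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// True if the union of the [first, second] intervals is exactly [target_from, target_to]:
// it starts there, ends there, and has no gaps in between.
bool intervals_cover_exactly(std::vector<std::pair<double, double>> iv,
                             double target_from, double target_to) {
    if (iv.empty()) return false;
    std::sort(iv.begin(), iv.end());
    if (iv.front().first != target_from) return false;
    double reach = iv.front().second;            // right end of the merged prefix
    for (std::size_t i = 1; i < iv.size(); ++i) {
        if (iv[i].first > reach) return false;   // gap in the coverage
        reach = std::max(reach, iv[i].second);
    }
    return reach == target_to;
}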
Edit: We have a small problem if there exists empty space somewhere in the middle of all rectangles. Let me think about it....
Edit2: An easy way to detect empty space is, for each corner of a rectangle which is not a point on the largest rectangle, to go outward a little bit in all four diagonal directions and check if we are still in any rectangle. This is O(n^2). (Which ruins my beautiful O(n log(n))! Can anyone come up with a better idea?)
I haven't looked at a similar problem in the past, so there maybe far more efficient ways of doing it. The key problem is that you cannot look at containment of one rectangle in another in isolation since they could be adjacent but still form a rectangle, or one rectangle could be contained within multiple.
You can't just look at the projection of each rectangle on to the edges of the bounding rectangle unless the problem allows you to leave holes in the middle of the rectangle, although that is probably a fast initial check that could be performed before the following exhaustive approach:
Running through the list once, calculating the minimum and maximum x and y coordinates and the area of each rectangle
Create an input list containing your input rectangles ordered by descending size.
Create a work list containing the bounding rectangle initially
While there are rectangles in the work list
Take the largest rectangle R from the input list
Create an empty list for fragments
for each rectangle r in the work list, intersect r with R, splitting r into a rectangular portion contained within R (if any) and zero or more rectangles not within R. If r was split, discard the portion contained within R and add the remaining rectangles to the fragment list.
add the contents of the fragment list to the work list (if the work list becomes empty, the union covers the bounding rectangle and is therefore a rectangle; if the input list runs out while the work list is still non-empty, it is not)
Assuming your rectangles are aligned to the coordinate axis:
Given two rectangles A, B, you can make a function that subtracts B from A returning a set of sub-rectangles of A (that may be the empty set): Set = subtract_rectangle(A, B)
Then, given a set of rectangles R for which you want to know if their union is a rectangle:
Calculate a maximum rectangle Big that covers all the rectangles as ((min_x,min_y)-(max_x,max_y))
make the set S contain the rectangle Big: S = (Big)
for every rectangle B in R:
S1 = ()
for every rectangle A in S:
S1 = S1 + subtract_rectangle(A, B)
S = S1
if S is empty then the union of the rectangles is a rectangle.
At the end, S contains the parts of Big not covered by any rectangle from R.
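A sketch of subtract_rectangle for axis-aligned rectangles, returning the uncovered part of A as at most four strips (the Rect layout is an assumption):
#include <algorithm>
#include <vector>

struct Rect { double x1, y1, x2, y2; };   // axis-aligned, with x1 < x2 and y1 < y2

// Parts of A not covered by B: zero to four disjoint rectangles.
std::vector<Rect> subtract_rectangle(const Rect& A, const Rect& B) {
    if (B.x1 >= A.x2 || B.x2 <= A.x1 || B.y1 >= A.y2 || B.y2 <= A.y1)
        return {A};                                             // no overlap: A survives whole
    std::vector<Rect> out;
    if (B.y2 < A.y2) out.push_back({A.x1, B.y2, A.x2, A.y2});   // strip above B
    if (B.y1 > A.y1) out.push_back({A.x1, A.y1, A.x2, B.y1});   // strip below B
    double ylo = std::max(A.y1, B.y1), yhi = std::min(A.y2, B.y2);
    if (B.x1 > A.x1) out.push_back({A.x1, ylo, B.x1, yhi});     // strip to the left of B
    if (B.x2 < A.x2) out.push_back({B.x2, ylo, A.x2, yhi});     // strip to the right of B
    return out;
}
The outer loop from the pseudocode above then just replaces S, for each input rectangle B, by the union of subtract_rectangle(A, B) over all A currently in S.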
If the rectangles are not aligned to the coordinate axis you can use a similar algorithm but that employs triangles instead of rectangles. The only issues are that subtracting triangles is not so simple to implement and that handling numerical errors can be difficult.
A simple approach just came to mind: If two rectangles share an edge[1], then together they form a rectangle which contains both - either the rectangles are adjacent [][ ] or one contains the other [[] ].
So if the list of rectangles forms a larger rectangle, then all you need is to repeatedly iterate over the rectangles and "unify" pairs of them into a single larger one. If in one iteration you can unify none, then it is not possible to create any larger rectangle than you already have with those pieces; otherwise, you will keep "unifying" rectangles until a single one is left.
[1] Share, as in they have the same edge; it is not enough for one of them to have an edge included in one of the other's edges.
efficiency
Since efficiency seems to be a problem, you could probably speed it up by creating two indexes of rectangles, one with the larger edge size and another with the smaller edge size.
Then compare the edges with the same size, and if they are the same unify the two rectangles, remove them from the indexes and add the new rectangle to the indexes.
You can probably speed it up by not moving to the next iteration when you unify something, but to proceed to the end of the indexes before reiterating. (Stopping when one iteration does no unifications, or there is only one rectangle left.)
Additionally, the edges of a rectangle resulting from unification are necessarily equal to or larger than the edges of the original rectangles.
So if the indexes are ordered by ascending edge size, the new rectangle will be inserted in either the same position as you are checking or in positions yet to be checked, so each unification will not require an extra iteration cycle. (As the new rectangle will assuredly not unify with any rectangle previously checked in this iteration, since its edges are larger than all edges checked.)
For this to hold, in each step of a particular iteration you need to attempt unification on the next smaller edge from either of the indexes:
If you're in index1=3 and index2=6, you check index1 and advance that index;
If next edge on that index is 5, next iteration step will be in index1=5 and index2=6, so it will check index1 and advance that index;
If next edge on that index is 7, next iteration step will be in index1=7 and index2=6, so it will check index2 and advance that index;
If next edge on that index is 10, next iteration step will be in index1=7 and index2=10, so it will check index1 and advance that index;
etc.
examples
[A ][B ]
[C ][D ]
A can be unified with B, C with D, and then AB with CD. One left, ABCD, thus possible.
[A ][B ]
[C ][D ]
A can be unified with B, C with D, but AB cannot be unified with CD. 2 left, AB and CD, thus not possible.
[A ][B ]
[C ][D [E]]
A can be unified with B, C with D, CD with E, CDE with AB. 1 left, ABCDE, thus possible.
[A ][B ]
[C ][D ][E]
A can be unified with B, C with D, CD with AB, but not E. 2 left, ABCD and E, thus not possible.
pitfall
If a rectangle is contained in another but does not share a border, this approach will not unify them.
A way to address this is, when one hits an iteration that does not unify anything and before concluding that it is not possible to unify the set of rectangles, to get the rectangle with the widest edge and discard from the indexes all others that are contained within this largest rectangle.
This still does not address two situations.
First, consider the situation where with this map:
A B C D
E F G H
we have rectangles ACGE and BDFH. These rectangles share no edge and are not contained, but form a larger rectangle.
Second, consider the situation where with this map:
A B C D
E F G H
I J K L
we have rectangles ABIJ, CDHG and EHLI. They do not share edges, are not contained within each-other, and no two of them can be unified into a single rectangle; but form a rectangle, in total.
With these pitfalls this method is not complete. But it can be used to greatly reduce the complexity of the problem and reduce the number of rectangles to analyse.
Maybe...
Gather up all the x-coordinates in a list, and sort them. From this list, create a sequence of adjacent intervals. Do the same thing for the y-coordinates. Now you've got two lists of intervals. For each pair of intervals (A=[x1,x2] from the x-list, B=[y1,y2] from the y-list), make their product rectangle A x B = (x1,y1)-(x2,y2)
If every single product rectangle is contained in at least one of your initial rectangles, then the union must be a rectangle.
Making this efficient (I think I've offered about an O(n^4) algorithm) is a different question entirely.
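A direct sketch of that product-rectangle idea, testing each grid cell's midpoint against the input rectangles (roughly O(n^3) as written; Rect is an assumed axis-aligned layout):
#include <algorithm>
#include <cstddef>
#include <vector>

struct Rect { double x1, y1, x2, y2; };   // axis-aligned, x1 < x2, y1 < y2

// The union is a rectangle iff every cell of the coordinate grid inside the
// bounding box is covered by at least one input rectangle.
bool union_is_rectangle(const std::vector<Rect>& rects) {
    std::vector<double> xs, ys;
    for (const Rect& r : rects) {
        xs.push_back(r.x1); xs.push_back(r.x2);
        ys.push_back(r.y1); ys.push_back(r.y2);
    }
    std::sort(xs.begin(), xs.end());
    xs.erase(std::unique(xs.begin(), xs.end()), xs.end());
    std::sort(ys.begin(), ys.end());
    ys.erase(std::unique(ys.begin(), ys.end()), ys.end());
    for (std::size_t i = 0; i + 1 < xs.size(); ++i)
        for (std::size_t j = 0; j + 1 < ys.size(); ++j) {
            double cx = (xs[i] + xs[i + 1]) / 2;   // cell midpoint
            double cy = (ys[j] + ys[j + 1]) / 2;
            bool covered = false;
            for (const Rect& r : rects)
                if (cx > r.x1 && cx < r.x2 && cy > r.y1 && cy < r.y2) { covered = true; break; }
            if (!covered) return false;
        }
    return true;
}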
As jva stated, "Your own version does not take into account that the edges of the rectangles can be non-parallel to each other." This answer also assumes "parallel" rectangles.
If you have a grid as opposed to needing infinite precision, depending on the number and sizes of the rectangles and the granularity of the grid, it might be feasible to brute-force it.
Just take your "largest rectangle possible" and test all its points to see whether each point is in at least one of the smaller rectangles.
I finally was able to find this impressive JavaScript project (thanks to GitHub search :) !)
https://github.com/evanw/csg.js
Also have a look into my answer here with other interesting projects
General case, thinking in images:
| outer_rect - union(inner rectangles) |
Check that result is zero

Find the largest convex black area in an image

I have an image of which this is a small cut-out:
As you can see, it consists of white pixels on a black background. We can draw imaginary lines between these pixels (or better, points). With these lines we can enclose areas.
How can I find the largest convex black area in this image that doesn't contain a white pixel in it?
Here is a small hand-drawn example of what I mean by the largest convex black area:
P.S.: The image is not noise, it represents the primes below 10000000 ordered horizontally.
Trying to find the maximum convex area is a difficult task. Wouldn't you be just as happy with finding rectangles of maximum area? This problem is much easier and can be solved in O(n) - linear time in the number of pixels. The algorithm follows.
Say you want to find the largest rectangle of free (white) pixels (sorry, my images use different colors - white is equivalent to your black, grey is equivalent to your white).
You can do this very efficiently by two pass linear O(n) time algorithm (n being number of pixels):
1) In the first pass, go by columns, from bottom to top, and for each pixel, record the number of consecutive available pixels up to this one.
2) In the second pass, go by rows and read the current number for each pixel. For each number k, keep track of the sums of consecutive numbers that were >= k (i.e. potential rectangles of height k). Close the sums (potential rectangles) for k > current number and check whether the sum (~ rectangle area) is greater than the current maximum - if yes, update the maximum. At the end of each line, close all opened potential rectangles (for all k).
This way you will obtain all maximum rectangles. It is not the same as maximum convex area of course, but probably would give you some hints (some heuristics) on where to look for maximum convex areas.
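For reference, the same result is usually coded as a row-by-row "largest rectangle in a histogram" pass; here is a sketch of that standard monotonic-stack formulation (not necessarily the exact bookkeeping described above):
#include <algorithm>
#include <cstddef>
#include <stack>
#include <vector>

// Area of the largest axis-aligned rectangle consisting only of free pixels.
// is_free[r][c] is true where the pixel may be part of the rectangle. O(rows * cols).
long long largest_free_rectangle(const std::vector<std::vector<bool>>& is_free) {
    if (is_free.empty()) return 0;
    std::size_t rows = is_free.size(), cols = is_free[0].size();
    std::vector<int> height(cols, 0);   // consecutive free pixels ending at the current row
    long long best = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c)
            height[c] = is_free[r][c] ? height[c] + 1 : 0;
        // Largest rectangle in the histogram 'height', via a monotonic stack.
        std::stack<std::size_t> st;
        for (std::size_t c = 0; c <= cols; ++c) {
            int h = (c == cols) ? 0 : height[c];
            while (!st.empty() && height[st.top()] >= h) {
                long long top_h = height[st.top()];
                st.pop();
                std::size_t left = st.empty() ? 0 : st.top() + 1;
                best = std::max(best, top_h * (long long)(c - left));
            }
            st.push(c);
        }
    }
    return best;
}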
I'll sketch a correct, poly-time algorithm. Undoubtedly there are data-structural improvements to be made, but I believe that a better understanding of this problem in particular will be required to search very large datasets (or, perhaps, an ad-hoc upper bound on the dimensions of the box containing the polygon).
The main loop consists of guessing the lowest point p in the largest convex polygon (breaking ties in favor of the leftmost point) and then computing the largest convex polygon that can be formed with p and points q such that (q.y > p.y) || (q.y == p.y && q.x > p.x).
The dynamic program relies on the same geometric facts as Graham's scan. Assume without loss of generality that p = (0, 0) and sort the points q in order of the counterclockwise angle they make with the x-axis (compare two points by considering the sign of their cross product). Let the points in sorted order be q1, …, qn. Let q0 = p. For each 0 ≤ i < j ≤ n, we're going to compute the largest convex polygon on points q0, a subset of q1, …, qi-1, qi, and qj.
The base cases where i = 0 are easy, since the only “polygon” is the zero-area segment q0qj. Inductively, to compute the (i, j) entry, we're going to try, for all 0 ≤ k ≤ i, extending the (k, i) polygon with (i, j). When can we do this? In the first place, the triangle q0qiqj must not contain other points. The other condition is that the turn qkqiqj had better not be a right turn (once again, check the sign of the appropriate cross product).
At the end, return the largest polygon found. Why does this work? It's not hard to prove that convex polygons have the optimal substructure required by the dynamic program and that the program considers exactly those polygons satisfying Graham's characterization of convexity.
You could try treating the pixels as vertices and performing Delaunay triangulation of the pointset. Then you would need to find the largest set of connected triangles that does not create a concave shape and does not have any internal vertices.
If I understand your problem correctly, it's an instance of Connected Component Labeling. You can start for example at: http://en.wikipedia.org/wiki/Connected-component_labeling
I thought of an approach to solve this problem:
Out of the set of all points generate all possible 3-point-subsets. This is a set of all the triangles in your space. From this set remove all triangles that contain another point and you obtain the set of all empty triangles.
For each of the empty triangles you would then grow it to its maximum size. That is, for every point outside the polygon you would insert it between the two closest points of the polygon and check if there are points within this new triangle. If not, you remember that point and the area it adds. At every step you want to add the point that maximizes the added area. When no more points can be added, the maximum convex polygon has been constructed. Record the area for each polygon and remember the one with the largest area.
Crucial to the performance of this algorithm is your ability to determine a) whether a point lies within a triangle and b) whether the polygon remains convex after adding a certain point.
I think you can reduce b) to a), and then you only need to find the most efficient method to determine whether a point is within a triangle. The reduction of the search space can be achieved as follows: take a triangle and extend all its edges to infinite lines in both directions. This separates the area outside the triangle into 6 subregions. What is good for us is that only 3 of those subregions can contain points that would adhere to the convexity constraint. Thus for each point that you test, you need to determine if it's in a convex-expanding subregion, which again is the question of whether it's in a certain triangle.
The whole polygon, as it evolves and approaches the shape of a circle, will have smaller and smaller regions that still allow convex expansion. A point once in a concave region will not become part of the convex-expanding region again, so you can quickly reduce the number of points you'll have to consider for expansion. Additionally, while testing points for expansion you can further cut down the list of possible points: if a point tests false, then it is in the concave subregion of another point, and thus all other points in the concave subregion of the tested point need not be considered, as they're also in the concave subregion of the inner point. You should be able to cut down to a list of possible points very quickly.
Still you need to do this for every empty triangle of course.
Unfortunately I can't guarantee that always adding the maximum new region makes your polygon the maximum polygon possible.

Resources