Consider a rectangular shaped canvas, containing rectangles of random sizes and positions. To navigate between these rectangles, a user can use four arrows: up, down, left, right.
Are you familiar with any algorithm of navigation that would produce a fairly straightforward user experience?
I came across a few solutions but none of them seemed suitable. I am aware that no solution will be "ideal". However, the kind of algorithm I am looking for is the sort used to navigate between icons on a desktop using only the arrow keys.
[EDIT 21/5/2013: As pointed out by Gene in a comment, my weighting scheme actually does not guarantee that every rectangle will be reachable from every other rectangle -- only that every rectangle will be connected to some other rectangle in each direction.]
A nice way to do this is using maximum weighted bipartite matching.
What we want to do is build a table defining a function f(r, d) that returns the rectangle that the user will be moved to if they are currently at rectangle r and hit direction d (up, down, left or right). We would like this function to have some nice properties, such as:
It must be possible to reach every rectangle from every other rectangle
Pressing left then right or vice versa, or up then down or vice versa, should leave the user in the same place
Pressing e.g. left should take the user to a rectangle to the left (this is a bit more difficult to state precisely, but we can use a scoring system to measure the quality)
For each rectangle, create 4 vertices in a graph: one for each possible key that could be pressed while at that rectangle. For a particular rectangle r, call them rU, rD, rL and rR. For every pair of rectangles r and s, create 4 edges:
(rU, sD)
(rD, sU)
(rL, sR)
(rR, sL)
This graph has 2 connected components: one contains all U and D vertices, and the other contains all L and R vertices. Each component is bipartite, because e.g. no U vertex is ever connected to another U vertex. We could in fact run maximum weighted bipartite matching on each component separately, although it's easier just to talk about running it once on the entire graph after grouping, say, U vertices with L vertices and D vertices with R vertices.
Assign each of these edges a nonnegative weight according to how much sense it makes for that pair of rectangles to be connected by that pair of keys. You are free to choose the form for this scoring function, but it should probably be:
inversely proportional to the distances between the rectangles (you could use the distance between their centres), and
inversely proportional to how far the angle between the centres of the rectangles differs from the desired horizontal or vertical line, and
zero whenever the rectangles are oriented the wrong way (e.g. if for the edge (rU, sD) if the centre of r is actually above the centre of s). Alternatively, you can just delete these zero-weight edges.
This function attempts to satisfy requirement 3 at the top.
[EDIT #2 24/5/2013: Added an example function below.]
Here is C-ish pseudocode for an example function satisfying these properties. It takes the centre points of 2 rectangles and the direction from rectangle 1 (the direction from rectangle 2 is always the opposite of this direction):
const double MAXDISTSQUARED = /* The maximum possible squared distance */;
const double Z = /* A +ve number. Z > 1 => distance more important than angle */
// Return a weight in the range [0, 1], with higher indicating a better fit.
double getWeight(enum direction d, int x1, int y1, int x2, int y2) {
if (d == LEFT && x1 < x2 ||
d == RIGHT && x1 > x2 ||
d == UP && y1 < y2 ||
d == DOWN && y1 > y2) return 0;
// Don't need to take sqrt(); in fact it's probably better not to
double distSquared = (x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2);
double angle = abs(atan2(x1 - x2, y1 - y2)); // 0 => horiz; PI/2 => vert
if (d == UP || d == DOWN) angle = PI / 2 - angle;
return 1 - pow(distSquared / MAXDISTSQUARED, Z) * (2 * angle / PI);
}
Now run maximum weighted bipartite matching. This will attempt to find the set of edges having highest total weight such that every vertex (or at least as many as possible) are adjacent to a selected edge, but no vertex is adjacent to more than one edge. (If we allowed a vertex to be adjacent to more than one edge, it would mean that pressing that key while at that rectangle would take you to more than one destination rectangle, which doesn't make sense.) Each edge in this matching corresponds to a bidirectional pair of keypresses, so that pressing e.g. up and then down will take to back to where you were, automatically satisfying requirement 2 at the top.
The only requirement not automatically satisfied by this approach so far is the important one, number 1: it does not necessarily guarantee that every rectangle will be reachable. If we just use the "raw" quality scores as edge weights, then this can actually occur for certain configurations, e.g. when there is one rectangle in each of the 4 corners of the screen, plus one at the centre, the centre one might be unreachable.
[EDIT 21/5/2013: As Gene says, my claim below that property 1 is satisfied by the new weighting scheme I propose is wrong. In many cases every rectangle will be reachable, but in general, you need to solve the NP-hard Hamiltonian Cycle problem to guarantee this. I'll leave the explanation in as it gets us some of the way there. In any case it can be hacked around by adjusting weights between connected components upward whenever subcycles are detected.]
In order to guarantee that the matching algorithm always returns a matching in which every rectangle is reachable, we need to adjust the edge weights so that it is never possible for a matching to score higher than a matching with more edges. This can be achieved by scaling the scoring function to between 0 and 1, and adding the number of rectangles, n, to each edge's weight. This works because a full matching then has score at least 4n^2 (i.e. even if the quality score is 0, the edge itself has a weight of n and there are 4n of them), while any matching with fewer edges has score at most 4(n-1)(n+1) = 4n^2 - 4, which is strictly less.
It's a truism that to a person with a hammer everything looks like a nail. Shortest path algorithms are an obvious tool here because shortest distance seems intuitive.
However we are designing a UI where logical distance is much more important than physical distance.
So let's try thinking differently.
One constraint is that repeatedly hitting the up (right, down or left) arrow ought to eventually cycle through all the rectangles. Otherwise some unreachable "orphans" are likely. Achieving this with an algorithm based on physical (2d) distance is difficult because the closest item in 2d might be in the wrong direction in the 1d projection corresponding to the arrow pair being used. I.e. hitting the up arrow could easily select a box below the current. Ouch.
So let's adopt an extremely simple solution. Just sort all the rectangles on the x-coordinate of their centroids. Hitting the right and left arrow cycles through rectangles in this order: right to the next highest x and left to the next lowest x. Wrap at the screen edges.
Also do the same with y-coordinates. Using up and down cycles in this order.
The key (pun intended) to success is adding dynamic information to the screen while cycling to show the user the logic of what is occurring. Here is a proposal. Others are possible.
At first vertical (up or down) key, a pale translucent overlay appears over the rectangles. They are shaded pale red or blue in a pattern that alternates by y coordinate of centroid. There are also horizontal hash marks of matching color across the entire window. The only reason for two colors is to provide a visual indicator of correspondence between lines and rectangles. The currently selected rectangle is non-translucent and the hash mark is brighter than all the others. When you continue to hit the up or down key, the highlighted box changes in the centroid y-order as described above. The overlay disappears when no arrow key has been struck for a half second or so.
A very similar overlay appears if a horizontal key is hit, only it's vertical hash marks and x-order.
As a user I'd really like this scheme. But YMMV.
The algorithm and data structures needed to implement this are obvious, trivial, and scale very well. The effort will go into making the overlays look good.
NB Now that I have done all the drawings I realize it would be a good idea to place a correctly colored dot at the centroid of each box to show which of the lines is intersecting it. Some illustrative diagrams follow.
Bare Boxes
Selection with up or down arrow in progress
Selection with left or right arrow in progress
What about building a movement graph as follows:
for any direction, try to go to the nearest rectangle, in the given direction, whose center point is the middle of the current rectangle's side.
try to eliminate loops, e.g. moving 'right' from A should try to yield a different rectangle than moving 'up-right' from A. For example in this drawing, the 'right' from green should be orange, even though pink would be the nearest mid-point
(Thanks to biziclop): if any rectangles aren't reachable in the graph, then re-map one of the adjoining rectangles to get to it, likely the one with the least error. Repeat until all rectangles are reachable (I think that algorithm would terminate...)
Then store the graph and only use that to navigate. You don't want to change the directions in the middle of the session.
This problem can be modeled as a graph problem and algorithm of navigation can be used as a shortest path routing.
Here is the modelling.
Each rectangle is a vertex in the graph. From each vertex (aka rectangle) you have four options - up, down, left, right. So, you can reach four different rectangles, i.e this vertex will have four neighbors and you add these edges to graph.
I am not sure if this is part of the problem -- "multiple rectangles can be reached from a rectangle using a particular action (e.g. up)". If not, the above modelling is good enough. If yes, then add all such vertices as the neighbor for this vertex. Therefore, you may not end up with a 4-regular graph. Otherwise, you will model your problem into a 4-regular graph.
Now, the question is how do you define your "navigation" algorithm. If you do not want to distinguish between your actions, i.e. if up, down, left, and right are all equal, then you can add weight of 1 to all edges.
If you decide to give a particular action more precedence than other, say up is better than the rest, then you can give weight for edges resulted from up movement as 1, and the remaining edges as 2. The idea is by assigning different weights you can distinguish between the edges you will travel.
If you decide that all up edges are not equal, i.e. the up distance between A and B, is shorter than the up distance between C and D, then you can accordingly assign weights to the edges during the graph construction process.
Here is the routing
Now how to find the route -- You can use dijkstra's algorithm to find a shortest path between a given pair of vertices. If you are interested in multiple shortest paths, you can use k-shortest path algorithm to find k shortest paths between a pair of node, and then pick your best path.
Note that the graph you end up with does not have to be a directed graph. If you prefer a directed graph, you can assign directions to the edges when you are constructing them. Otherwise you should be good using an undirected graph, as all you care is to use an edge to reach a vertex from another. Moreover, if rectangle A can be reached using up from rectangle B, then rectangle B can be reached using down from rectangle A. Thus, directions really do not matter, if you do not need them for other reasons. If you do not like the assumption I just made, then you need to construct a directed graph.
Hope this helps.
Related
I have a text box detection algorithm that outputs word-level detections. Here's an example:
So the output is a list of boxes in the form (x1_i,y1_i,x2_i,y2_i) indicating the bottom-left and top-right coordinates. I'd like to find a simple decent baseline algorithm to merge these boxes into lines. So the desired output would be:
["Hey how are you?" , "I'm great!"]
I've seen a few questions similar to this, but they are primarily about straight (uni-directional) text, e.g.:
Merge the Bounding boxes near by into one
My thoughts on this are to calculate vectors from the centroid of each box, and then doing merging of boxes, based on closeness and near-same direction. I'm wondering if there are any such algorithms already out there? The corner cases that I'd like to try and address are:
Multiple angles of text.
Non-overlapping boxes (like the [I'm] [great!] ones).
Crossing texts at different angles (like the two lines above).
I'd like to find a such a quick-and-easy baseline algorithm using python.
This is the simplest algorithm meeting your criteria that I can think of.
Define a function score(b1, b2) that reflects how likely it is that the word in b1 precedes the word in b2, with lower scores being better (e.g., 0 means that the right edge of b1 is the same as the left edge of b2). For each box b, define pred(b) to be the box b' minimizing score(b', b), and define succ(b) to be the box b' minimizing score(b, b'). Form a graph on the bounding boxes with arcs b->b' such that succ(b) = b' and pred(b') = b. This graph is the disjoint union of some number of simple paths. These are your lines.
I don't know the best way to define score, but one possibility is to define score(b1, b2) = f(distance(midpoint of b1's right side, midpoint of b2's right side) / max(b1's height, b2's height)) + g(angular distance(b1's angle, b2's angle)), where f and g are increasing functions (I'd start the experiments with f and g being linear).
Edit: it occurs that the word bounding box algorithm may only give you a reliable orientation modulo a half rotation. In this case, define score so that it's lower when the two boxes are likelier to precede/succeed one another, form a directed graph where each box has an arc to each of the two boxes with which it has the lowest scores, throw away arcs with no reciprocal arc, form an undirected graph with an edge for each arc pair. The latter graph consists of a disjoint union of paths and cycles. You'll have to figure out what to do with the cycles and how to orient the paths.
I have a set of 2D points where all values are integers. No points are identical. I want to draw polylines/paths/whatever through all the points with a few restrictions:
1: A line should always move in the positive x-direction. p1.x < p2.x < ...
2: Lines may never cross each other.
3: All polylines needs to begin at x = 0 and end at x-max.
4: It should use as few polylines as possible (or a number defined by me).
I have attached an image of a sample set of points. And a hand made solution I drew with a pencil and ruler.
It's trivial to find a solution by hand, but I have no idea how to describe my process in logical terms. I don't need an optimal solution (whatever that means). It doesn't need to be fast.
Point set (Disregard colors)
Connected Points
My current solution is to step through the set along the x-axis and then try all viable combinations and choosing the one with the lowest total vertical movement. This works in some cases but not all. And it seems to over-complicate the problem.
My next idea is to do a brute force approach with backtracking when collisions occour. But that also seems a bit much.
For anyone wondering the points are actually notes on sheet music. The x-axis is time and the y-axis is pitch. The polylines represent the movement of robotic fingers playing a piano.
We will find a solution which uses the minimum number of robotic fingers (least number of polylines). The trick is to consider your input as a Partially ordered set (or poset). Each point in your input is an element of the poset, and with the relation (p1.x, p1.y) < (p2.x , p2.y) if and only if p1.x < p2.x. This basically means that two points which have the same x-coordinate are incomparable with each other.
For now, let us forget this constraint: "Lines may never cross each other". We'll get back to it at the end.
What you are looking for, is a partition of this poset into chains. This is done using Dilworth's Theorem. It should be clear that if there are 5 points with the same x-coordinate, then we need at least 5 different polylines. What Dilworth's says is that if there is no x-coordinate with more than 5 points on it, then we can get 5 polylines (chains) which cover all the points. And it also gives us a way to find these polylines, which I'm summarizing here:
You just create a bipartite graph G = (U,V,E) where U = V = Set of all input points and where (u,v) is an edge in G if u.x < v.x. Then find a maximum matching, M, in this graph, and consider the set of polylines formed by including u and v in the same polyline whenever there is an edge (u,v) in M.
The only issue now, is that some of these polylines could cross each other. We'll see how to fix that:
First, let us assume that there are only two polylines, L1 and L2. You find the first instance (minimum x-coordinate) of their crossing. Suppose the two line segments which cross each other are AB and CD:
We delete AB and CD and instead add AD and CB:
The polylines still cross each other, but their point of crossing has been delayed. So we can keep repeating this process until there is no crossing left. This takes at most n iterations. Thus we know how to 'untangle' two polylines.
[The edge case of B lying on the segment CD is also handled in the exact same way]
Now, suppose we have k different polylines which the maximum matching has given us: L1, L2, ..., Lk. WLOG, let us assume that at x = 0, L1's y-coordinate is lower than L2's y-coordinate, which is lower than L3's and so on.
Take L1 and find the first time that it crosses with any other polyline. At that crossing, applying the swapping operation as above. Keep repeating this, until L1 does not cross with any other polyline. Now, L1 is at the 'bottom', and doesn't cross with any other line. We now output L1 as one of the final polylines, and delete it from our algo. We then repeat the same process with L2, and after outputting it, delete it, and repeat with L3, and so on.
I am currently working on a node-based house-builder for Unity. The system is pretty simple in its workflow: users can create nodes, which are simply cubes, and connect them with each other to create walls. The mesh processing is already done and it works nice and smooth.
What I am trying to do now is to detect how many closed rooms have been created and what vertices are involved in each one of them. The possible inputs can be seen in the following images:
In the first picture, the loops would be
(1,5,3,4), (1,2,6,8,7,5), (6,9,12,11,10,8), (8,10,14,13) and (10,11,17,16,15,14).
In the second one they'd be
(1,2,5,6,8,7), (2,3,4,14,13,6,5), (6,13,12,11,10) and (8,6,10,9).
Each node can be connected to up to four other nodes, one per cardinal side, and every link is stored on both sides. I do not need the nodes to come in any particular order.
I thought I could use a generic loop-detection algorithm and recursively search for sub-loops until the loop I find has has no internal connections, but this would be extremely resource-consuming. There must be some properties I can use to detect loops with no internal connections without iterating over the graph so many times, but I haven't been able to find it.
Do you have any suggestion?
For the following algorithm to work, you need the following:
A unique direction of the edge (which you probably already have)
Two flags for every edge that specify if the edge has been used in the forward and the backward direction
A list of vertices with unused edges
Then the idea is the following. Take any node with unused edges and go along any of the unused edges to the neighbor (keep the direction in mind). Doing so, immediately mark the edge in the according direction as used. At this neighbor, you know the direction from which you came. Look in counter-clockwise order until you find the first unused edge (again watch out for the edge direction). You can also search in clockwise order, this will define the order of all your output faces. E.g. if you came from the left edge, then check the bottom, right, top edges, respectively. Go across this edge (mark as used) and repeat until you arrive at the start vertex. All visited vertices form your room.
Doing so, you should update the list of vertices with unused edges accordingly.
Eventually, you will also create a face for the border. You can detect this e.g. by calculating its orientation:
v1 x v2 + v2 x v3 + v3 x v4 + ... + vn x v1
, where v are the positions of the vertices and x represents the z-component of the cross product (which represents the face orientation):
(x1, y1) x (x2, y2) = (x1 * y2) - (x2 * y1)
The boundary face will have a different sign for this orientation than all other faces. The actual sign depends on whether you used counter-clockwise or clockwise order during the edge traversal.
This is an answer only to the first question, but it might help you with the second one. The number of closed rooms actually has a closed formula:
1 - V + E where V is the number of vertices and E is the number of edges. In your second example, there are 14 vertices, 17 edges and 4 rooms.
The mathematics are a bit complicated, but the key word is Euler characteristic.
I have a problem in which I have to test whether the union of given set of rectangles forms
a rectangle or not. I don't have much experience solving computational geometry problems.
What my approach to the problem was that since I know the coordinates of all the rectangles, I can easily sort the points and then deduce the corner points of the largest rectangle possible. Then I could sweep a line and see if all the points on the line falls inside the rectangle. But, this approach is flawed and this would fail because the union may be in the form of a 'U'.
I would be a great help if you could push me in the right direction.
Your own version does not take into account that the edges of the rectangles can be non-parallel to each other. Therefore, there might not be "largest rectangle possible".
I would try this general approach:
1) Find the convex hull. You can find convex hull calculation algorithms here http://en.wikipedia.org/wiki/Convex_hull_algorithms.
2) Check if the convex hull is a rectangle. You can do this by looping through all the points on convex hull and checking if they all form 180 or 90 degree angles. If they do not, union is not a rectangle.
3) Go through all points on the convex hull. For each point check if the middle point between ThisPoint and NextPoint lies on the edge of any initially given rectangle.
If every middle point does, union is a rectangle.
If it does not, union is not a rectangle.
Complexity would be O(n log h) for finding convex hull, O(h) for the second part and O(h*n) for third part, where h is number of points on the convex hull.
Edit:
If the goal is to check if the resulting object is a filled rectangle, not only edges and corners rectangle then add step (4).
4) Find all line segments that are formed by intersecting or touching rectangles. Note - by definition all of these line segments are segments of edges of given rectangles. If a rectangle does not touch/intersect other rectangles, the line segments are it's edges.
For each line segment check if it's middle point is
On the edge of the convex hull
Inside one of given rectangles
On the edge of two non-overlapping given rectangles.
If at least one of these is true for every line segment, resulting object is a filled rectangle.
You could deduce the he corner points of the largest rectangle possible, and then go over all the rectangle that share the border with the largest possible rectangle, for example the bottom, and make sure that the line is entirely contained in their borders. This will also fail if an empty space in the middle of the rectangle is a problem, however. I think the complexity will be O(n2).
I think you are on the right direction. After you get the coordinates of largest possible rectangle,
If the largest possible rectangle is a valid rectangle, then each side of it must be union of sides of original rectangles. You can scan the original rectangle set, find those rectangles that is a part of the largest side we are looking for (this can be done in O(n) by checking if X==largestRectangle.Top.X if you are looking at top side, etc.), lets call them S.
For each side s in S we can create an interval [from,to]. All we need to check is whether the union of all intervals matches the side of the largest Rectangle. This can be done in O(nlog(n)) by standard algorithms, or on average O(n) by some hash trick (see http://www.careercup.com/question?id=12523672 , see my last comment (of the last comment) there for the O(n) algorithm ).
For example, say we got two 1*1 rectangles in the first quadrant, there left bottom coordinates are (0,0) and (1,0). Largest rectangle is 2*1 with left bottom coordinate (0,0). Since [0,1] Union [1,2] is [0,2], top side and bottom side match the largest rectangle, similar for left and right side.
Now suppose we got an U shape. 3*1 at (0,0), 1*1 at (0,1), 1*1 at (2,1), we got largest rectangle 3*2 at (0,0). Since for the top side we got [0,1] Union [1,3] does not match [0,3], the algorithm will output the union of above rectangles is not a rectangle.
So you can do this in O(n) on average, or O(nlog(n)) at least if you don't want to mess with some complex hash bucket algorithm. Much better than O(n^4)!
Edit: We have a small problem if there exists empty space somewhere in the middle of all rectangles. Let me think about it....
Edit2: An easy way to detect empty space is for each corner of a rectangle which is not a point on the largest rectangle, we go outward a little bit for all four directions (diagonal) and check if we are still in any rectangle. This is O(n^2). (Which ruins my beautiful O(nlog(n))! Can anyone can come up a better idea?
I haven't looked at a similar problem in the past, so there maybe far more efficient ways of doing it. The key problem is that you cannot look at containment of one rectangle in another in isolation since they could be adjacent but still form a rectangle, or one rectangle could be contained within multiple.
You can't just look at the projection of each rectangle on to the edges of the bounding rectangle unless the problem allows you to leave holes in the middle of the rectangle, although that is probably a fast initial check that could be performed before the following exhaustive approach:
Running through the list once, calculating the minimum and maximum x and y coordinates and the area of each rectangle
Create an input list containing your input rectangles ordered by descending size.
Create a work list containing the bounding rectangle initially
While there are rectangles in the work list
Take the largest rectangle in the input list R
Create an empty list for fragments
for each rectangle r in the work list, intersect r with R, splitting r into a rectangular portion contained within R (if any) and zero or more rectangles not within R. If r was split, discard the portion contained within R and add the remaining rectangles to the fragment list.
add the contents of the fragment list to the work list
Assuming your rectangles are aligned to the coordinate axis:
Given two rectangles A, B, you can make a function that subtracts B from A returning a set of sub-rectangles of A (that may be the empty set): Set = subtract_rectangle(A, B)
Then, given a set of rectangles R for which you want to know if their union is a rectangle:
Calculate a maximum rectangle Big that covers all the rectangles as ((min_x,min_y)-(max_x,max_y))
make the set S contain the rectangle Big: S = (Big)
for every rectangle B in R:
S1 = ()
for evey rectangle A in S:
S1 = S1 + subtract_rectangle(A, B)
S = S1
if S is empty then the union of the rectangles is a rectangle.
End, S contains the parts of Big not covered by any rectangle from R
If the rectangles are not aligned to the coordinate axis you can use a similar algorithm but that employs triangles instead of rectangles. The only issues are that subtracting triangles is not so simple to implement and that handling numerical errors can be difficult.
A simple approach just came to mind: If two rectangles share an edge[1], then together they form a rectangle which contains both - either the rectangles are adjacent [][ ] or one contains the other [[] ].
So if the list of rectangles forms a larger rectangle, then all you need it to repeatedly iterate over the rectangles, and "unify" pairs of them into a single larger one. If in one iteration you can unify none, then it is not possible to create any larger rectangle than you already have, with those pieces; otherwise, you will keep "unifying" rectangles until a single is left.
[1] Share, as in they have the same edge; it is not enough for one of them to have an edge included in one of the other's edges.
efficiency
Since efficiency seems to be a problem, you could probably speed it up by creating two indexes of rectangles, one with the larger edge size and another with the smaller edge size.
Then compare the edges with the same size, and if they are the same unify the two rectangles, remove them from the indexes and add the new rectangle to the indexes.
You can probably speed it up by not moving to the next iteration when you unify something, but to proceed to the end of the indexes before reiterating. (Stopping when one iteration does no unifications, or there is only one rectangle left.)
Additionally, the edges of a rectangle resulting from unification are by analysis always equal or larger than the edges of the original rectangles.
So if the indexes are ordered by ascending edge size, the new rectangle will be inserted in either the same position as you are checking or in positions yet to be checked, so each unification will not require an extra iteration cycle. (As the new rectangle will assuredly not unify with any rectangle previously checked in this iteration, since its edges are larger than all edges checked.)
For this to hold, in each step of a particular iteration you need to attempt unification on the next smaller edge from either of the indexes:
If you're in index1=3 and index2=6, you check index1 and advance that index;
If next edge on that index is 5, next iteration step will be in index1=5 and index2=6, so it will check index1 and advance that index;
If next edge on that index is 7, next iteration step will be in index1=7 and index2=6, so it will check index2 and advance that index;
If next edge on that index is 10, next iteration step will be in index1=7 and index2=10, so it will check index1 and advance that index;
etc.
examples
[A ][B ]
[C ][D ]
A can be unified with B, C with D, and then AB with CD. One left, ABCD, thus possible.
[A ][B ]
[C ][D ]
A can be unified with B, C with D, but AB cannot be unified with CD. 2 left, AB and CD, thus not possible.
[A ][B ]
[C ][D [E]]
A can be unified with B, C with D, CD with E, CDE with AB. 1 left, ABCDE, thus possible.
[A ][B ]
[C ][D ][E]
A can be unified with B, C with D, CD with AB, but not E. 2 left, ABCD and E, thus not possible.
pitfall
If a rectangle is contained in another but does not share a border, this approach will not unify them.
A way to address this is, when one hits an iteration that does not unify anything and before concluding that it is not possible to unify the set of rectangles, to get the rectangle with the widest edge and discard from the indexes all others that are contained within this largest rectangle.
This still does not address two situations.
First, consider the situation where with this map:
A B C D
E F G H
we have rectangles ACGE and BDFH. These rectangles share no edge and are not contained, but form a larger rectangle.
Second, consider the situation where with this map:
A B C D
E F G H
I J K L
we have rectangles ABIJ, CDHG and EHLI. They do not share edges, are not contained within each-other, and no two of them can be unified into a single rectangle; but form a rectangle, in total.
With these pitfalls this method is not complete. But it can be used to greatly reduce the complexity of the problem and reduce the number of rectangles to analyse.
Maybe...
Gather up all the x-coordinates in a list, and sort them. From this list, create a sequence of adjacent intervals. Do the same thing for the y-coordinates. Now you've got two lists of intervals. For each pair of intervals (A=[x1,x2] from the x-list, B=[y1,y2] from the y-list), make their product rectangle A x B = (x1,y1)-(x2,y2)
If every single product rectangle is contained in at least one of your initial rectangles, then the union must be a rectangle.
Making this efficient (I think I've offered about an O(n4) algorithm) is a different question entirely.
As jva stated, "Your own version does not take into account that the edges of the rectangles can be non-parallel to each other." This answer also assumes "parallel" rectangles.
If you have a grid as opposed to needing infinite precision, depending on the number and sizes of the rectangles and the granularity of the grid, it might be feasible to brute-force it.
Just take your "largest rectangle possible" and test all its points to see whether each point is in at least one of the smaller rectangles.
I finally was able to find the impressive javascript project (thanks to github search :) !)
https://github.com/evanw/csg.js
Also have a look into my answer here with other interesting projects
General case, thinking in images:
| outer_rect - union(inner rectangles) |
Check that result is zero
I have an image of which this is a small cut-out:
As you can see it are white pixels on a black background. We can draw imaginary lines between these pixels (or better, points). With these lines we can enclose areas.
How can I find the largest convex black area in this image that doesn't contain a white pixel in it?
Here is a small hand-drawn example of what I mean by the largest convex black area:
P.S.: The image is not noise, it represents the primes below 10000000 ordered horizontally.
Trying to find maximum convex area is a difficult task to do. Wouldn't you just be fine with finding rectangles with maximum area? This problem is much easier and can be solved in O(n) - linear time in number of pixels. The algorithm follows.
Say you want to find largest rectangle of free (white) pixels (Sorry, I have images with different colors - white is equivalent to your black, grey is equivalent to your white).
You can do this very efficiently by two pass linear O(n) time algorithm (n being number of pixels):
1) in a first pass, go by columns, from bottom to top, and for each pixel, denote the number of consecutive pixels available up to this one:
repeat, until:
2) in a second pass, go by rows, read current_number. For each number k keep track of the sums of consecutive numbers that were >= k (i.e. potential rectangles of height k). Close the sums (potential rectangles) for k > current_number and look if the sum (~ rectangle area) is greater than the current maximum - if yes, update the maximum. At the end of each line, close all opened potential rectangles (for all k).
This way you will obtain all maximum rectangles. It is not the same as maximum convex area of course, but probably would give you some hints (some heuristics) on where to look for maximum convex areas.
I'll sketch a correct, poly-time algorithm. Undoubtedly there are data-structural improvements to be made, but I believe that a better understanding of this problem in particular will be required to search very large datasets (or, perhaps, an ad-hoc upper bound on the dimensions of the box containing the polygon).
The main loop consists of guessing the lowest point p in the largest convex polygon (breaking ties in favor of the leftmost point) and then computing the largest convex polygon that can be with p and points q such that (q.y > p.y) || (q.y == p.y && q.x > p.x).
The dynamic program relies on the same geometric facts as Graham's scan. Assume without loss of generality that p = (0, 0) and sort the points q in order of the counterclockwise angle they make with the x-axis (compare two points by considering the sign of their dot product). Let the points in sorted order be q1, …, qn. Let q0 = p. For each 0 ≤ i < j ≤ n, we're going to compute the largest convex polygon on points q0, a subset of q1, …, qi - 1, qi, and qj.
The base cases where i = 0 are easy, since the only “polygon” is the zero-area segment q0qj. Inductively, to compute the (i, j) entry, we're going to try, for all 0 ≤ k ≤ i, extending the (k, i) polygon with (i, j). When can we do this? In the first place, the triangle q0qiqj must not contain other points. The other condition is that the angle qkqiqj had better not be a right turn (once again, check the sign of the appropriate dot product).
At the end, return the largest polygon found. Why does this work? It's not hard to prove that convex polygons have the optimal substructure required by the dynamic program and that the program considers exactly those polygons satisfying Graham's characterization of convexity.
You could try treating the pixels as vertices and performing Delaunay triangulation of the pointset. Then you would need to find the largest set of connected triangles that does not create a concave shape and does not have any internal vertices.
If I understand your problem correctly, it's an instance of Connected Component Labeling. You can start for example at: http://en.wikipedia.org/wiki/Connected-component_labeling
I thought of an approach to solve this problem:
Out of the set of all points generate all possible 3-point-subsets. This is a set of all the triangles in your space. From this set remove all triangles that contain another point and you obtain the set of all empty triangles.
For each of the empty triangles you would then grow it to its maximum size. That is, for every point outside the rectangle you would insert it between the two closest points of the polygon and check if there are points within this new triangle. If not, you will remember that point and the area it adds. For every new point you want to add that one that maximizes the added area. When no more point can be added the maximum convex polygon has been constructed. Record the area for each polygon and remember the one with the largest area.
Crucial to the performance of this algorithm is your ability to determine a) whether a point lies within a triangle and b) whether the polygon remains convex after adding a certain point.
I think you can reduce b) to be a problem of a) and then you only need to find the most efficient method to determine whether a point is within a triangle. The reduction of the search space can be achieved as follows: Take a triangle and increase all edges to infinite length in both directions. This separates the area outside the triangle into 6 subregions. Good for us is that only 3 of those subregions can contain points that would adhere to the convexity constraint. Thus for each point that you test you need to determine if its in a convex-expanding subregion, which again is the question of whether it's in a certain triangle.
The whole polygon as it evolves and approaches the shape of a circle will have smaller and smaller regions that still allow convex expansion. A point once in a concave region will not become part of the convex-expanding region again so you can quickly reduce the number of points you'll have to consider for expansion. Additionally while testing points for expansion you can further cut down the list of possible points. If a point is tested false, then it is in the concave subregion of another point and thus all other points in the concave subregion of the tested points need not be considered as they're also in the concave subregion of the inner point. You should be able to cut down to a list of possible points very quickly.
Still you need to do this for every empty triangle of course.
Unfortunately I can't guarantee that by adding always the maximum new region your polygon becomes the maximum polygon possible.