Group divided polygon into N contiguous shapes

Given the following polygon, which is divided into sub-polygons as depicted below [left], I would like to create n number of contiguous, equally sized groups of sub-polygons [right, where n=6]. There is no regular pattern to the sub-polygons, though they are guaranteed to be contiguous and without holes.
This is not splitting a polygon into equal shapes, it is grouping its sub-polygons into equal, contiguous groups. The initial polygon may not have a number of sub-polygons divisible by n, and in these cases non-equally sized groups are ok. The only data I have is n, the number of groups to create, and the coordinates of the sub-polygons and their outer shell (generated through a clipping library).
My current algorithm is as follows:
list sub_polygons[] # list of polygon objects
for i in range(n - 1):
    # start a new grouping
    pick random sub_polygon from list as a starting point
    remove this sub_polygon from the list
    add this sub_polygon to the current group
    while (number of shapes in group < number needed to be added):
        add a sub_polygon that the group borders to the group
        remove this sub-polygon from the sub-polygons list
add all remaining sub-shapes to the final group
This runs into problems with contiguity, however. The below illustrates the problem - if the red polygon is added to the blue group, it cuts off the green polygon such that it cannot be added to anything else to create a contiguous group.
It's simple to add a check for this when adding a sub-polygon to a group, such as
if removing sub-polygon from list will create non-contiguous union
    pass;
but this runs into edge conditions where every possible shape that can be added creates a non-contiguous union of the available sub-polygons. In the below, my current algorithm is trying to add a sub-polygon to the red group, and with the check for contiguity is unable to add any:
Is there a better algorithm for grouping the sub-polygons?

I think this is too complicated to be solved in a single pass. Whatever criterion is used for selecting the next polygon, the search may get stuck somewhere in the middle. So you need an algorithm that can go back and change previous decisions in such cases. The classic algorithm that does so is backtracking.
But before starting, let's change the representation of the problem. These polygons form a graph like this:
This is the pseudocode of the algorithm:
function [ selected, stop ] = BackTrack(G, G2, selected, lastGroupLen, groupSize)
    // all nodes have been assigned: a solution was found
    if (length(selected) == length(G.Nodes))
        stop = true;
        return;
    end
    stop = false;
    if (lastGroupLen==groupSize)
        // start a new group
        lastGroupLen=0;
    end;
    // check continuity of remaining part of graph
    if (discomp(G2) > length(selected))
        return;
    end
    if (lastGroupLen==0)
        available = G.Nodes-selected;
    else
        available = []
        // find all nodes connected to current group
        for each node in last lastGroupLen selected nodes
            available = union(available, neighbors(G, node));
        end
        available = available-selected;
    end
    if (length(available)==0)
        return;
    end
    lastSelected = selected;
    for each node in available
        [selected, stop] = BackTrack(G, removeEdgesTo(G2, node),
            Union(lastSelected, node), lastGroupLen+1, groupSize);
        if (stop)
            break;
        end
    end
end
where:
selected: an ordered set of nodes that can be divided to n consecutive groups
stop: becomes true when the solution was found
G: the initial graph
G2: what remains of the graph after removing all edges to last selected node
lastGroupLen: number of nodes selected for last group
groupSize: maximum allowable size of each group
discomp(): returns number of discontinuous components of the graph
removeEdgesTo(): removes all edges connected to a node
That should be called like:
[ selected, stop ] = BackTrack( G, G, [], 0, groupSize);
I hope that is clear enough. It goes like this:
Just keep in mind that the performance of this algorithm can be severely affected by the order of the nodes. One way to speed it up is to order the polygons by their centroids:
But there is another option, if, like me, you are not satisfied with this outcome. You can order the available set of nodes by their degrees in G2, so that in each step the nodes that are less likely to disconnect the graph are visited first:
And as a more complicated problem, I tested a map of Iran, which has 262 counties. I set the groupSize to 20:
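The backtracking search described in this answer can be sketched in Python roughly as follows. This is a minimal sketch, not the answerer's implementation: the names `group_nodes` and `is_connected` are hypothetical, the graph is assumed to be a dict mapping each node to a list of its neighbours, and the connectivity of the remaining nodes is checked directly with a traversal instead of counting components of G2.

```python
def is_connected(adj, nodes):
    """Return True if `nodes` induce a connected subgraph of `adj` (hypothetical helper)."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def group_nodes(adj, group_size):
    """Backtracking search for contiguous groups of at most `group_size` nodes."""
    all_nodes = set(adj)
    selected = []  # ordered selection; consecutive slices form the groups

    def backtrack(current_len):
        if len(selected) == len(all_nodes):
            return True
        if current_len == group_size:
            current_len = 0  # start a new group
        remaining = all_nodes - set(selected)
        # Prune: the unassigned part of the graph must stay contiguous.
        if not is_connected(adj, remaining):
            return False
        if current_len == 0:
            candidates = remaining
        else:
            group = selected[-current_len:]
            candidates = {v for u in group for v in adj[u]} & remaining
        for node in sorted(candidates):
            selected.append(node)
            if backtrack(current_len + 1):
                return True
            selected.pop()  # undo the choice and try the next candidate
        return False

    if not backtrack(0):
        return None
    return [selected[i:i + group_size] for i in range(0, len(selected), group_size)]
```

Candidates are tried in sorted order here only to keep the sketch deterministic; the centroid or degree orderings mentioned above would replace the `sorted(candidates)` call.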

I think you can just follow the procedure:
Take some contiguous group of sub-polygons lying on the perimeter of the current polygon (if the number of polygons on the perimeter is less than the target size of the group, just take all of them and take whatever more you need from the next perimeter, and repeat until you reach your target group size).
Remove this group and consider the new polygon that consists of the remaining sub-polygons.
Repeat until remaining polygon is empty.
Implementation is up to you but this method should ensure that all formed groups are contiguous and that the remaining polygon formed at step 2 is contiguous.
EDIT:
Never mind, user58697 raises a good point, a counterexample to the algorithm above would be a polygon in the shape of an 8, where one sub-polygon bridges two other polygons.

Related

Number of ways N circles of different radius can be arranged in a line

Question: Given M points on a line separated by 1 unit. Find the number of ways N circles of different radii can be drawn so that they don't intersect or overlap or lie one inside another. The centers of the circles must be among those M points.
Example 1: N=3,M=6,r1=1,r2=1,r3=1 Answer: 24 ways.
Example 2: N=2,M=5 ,r1=1,r2=2 Answer: 6 ways.
Example 3: N=1,M=10,r=50. Answer =10 ways.
I found this question online and have not been able to solve it so far. All I have been able to work out is that any circle can take spaces from n−r to n−2r. But among other issues, how can I adjust for edge cases in which a circle with radius 3 takes the (n−4)th point? The last point will then be left untouched, but I cannot place any circle with a radius greater than 1 there. I am not able to see any generalized mathematical solution to this.

If the centers of the circles can be placed at non-integer coordinates, then the task is either impossible because the line is too short, or there are infinitely many solutions because there is enough length and infinitely many translations are possible.
So, since you have to compute the results, I will assume that the coordinates of the M points are integers.
If there is a single circle, then the solution is the number of points the circle can be legally placed.
If there are at least two circles, then you need to calculate the sum of the diameters and if that happens to be larger than the total length of the line we are speaking about, then you have no solution. If that is not the case, then you need to subtract the sum of diameters from the total length, getting Complementer. You also have N! permutations to compute the order of the circles. And you will have Complementer - 1 possible locations where you can distribute the gaps between the circles. The lengths of the gaps are G1, ..., Gn-1
We know that G1 + ... + Gn-1 = Complementer
The number of possible distributions of G1, ..., Gn-1 is D. The formula therefore would be
N! * D
The remaining question is: how can we compute D?
Solution:
function distr(depth, maxDepth, amount)
    if (depth = maxDepth) then
        return 1 //we need to put the remaining elements on the last slot
    end if
    sum = 1 //if we put amount here, that is a trivial case
    for i = amount - 1 to 0 do
        sum = sum + distr(depth + 1, maxDepth, amount - i)
    end for
    return sum
end distr
You need to call distr with depth = 1, maxDepth = N-1, amount = Complementer
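As a cross-check, the three examples from the question can be reproduced by brute force. The sketch below is independent of the formula above and only practical for small M and N. It assumes, based on the example answers, that circles may touch (center distance equal to the sum of the radii is allowed), that circles with equal radii still count as distinct, and that circles may extend past the ends of the line. The name `count_arrangements` is hypothetical.

```python
from itertools import permutations


def count_arrangements(m, radii):
    """Count assignments of circles to distinct points 0..m-1 with no overlaps.

    Two circles are compatible when the distance between their centers is at
    least the sum of their radii (tangent circles are assumed to be allowed).
    """
    n = len(radii)
    count = 0
    for centers in permutations(range(m), n):
        if all(
            abs(centers[i] - centers[j]) >= radii[i] + radii[j]
            for i in range(n)
            for j in range(i + 1, n)
        ):
            count += 1
    return count


print(count_arrangements(6, [1, 1, 1]))  # Example 1
print(count_arrangements(5, [1, 2]))     # Example 2
print(count_arrangements(10, [50]))      # Example 3
```

A closed-form candidate can be compared against this function on small inputs before trusting it on larger ones.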

Algorithm for calculating the area of a region in a grid of squares

I'm working on a game which uses a tilemap. Squares on the map can either be walls or they can be empty. The algorithm I'm trying to develop should take a point on the map and return the number of cells that can be reached from that point (which is equal to the area of the sector containing the point).
Let the function which carries out the algorithm take an x-coordinate, a y-coordinate and a map in the form of a 2D array.
function sectorArea(x_coord,y_coord,map) { ... }
Say the map looks like this (where 1's represent walls):
map = [[0,0,1,0,0,0],
       [0,0,1,0,0,0],
       [1,1,1,0,0,0],
       [0,0,0,0,0,0]]
Then sectorArea(0,0,map) == 4 and sectorArea(4,0,map) == 15.
My naive implementation is recursive. The target cell is passed to the go function, which then recurses on any adjacent cells which are empty - eventually spreading across all empty cells in the sector. It runs too slowly and reaches the call stack limit very quickly:
function sectorArea(x_coord,y_coord,map) {
    // First convert the map into an array of objects of the form:
    // { value: 0 or 1,
    //   visited: false }
    objMap = convertMap(map);
    // The recursive function:
    function go(x,y) {
        if ( outOfBounds(x) || outOfBounds(y) ||
             objMap[y][x].value == 1 || objMap[y][x].visited )
            return 0;
        else
            objMap[y][x].visited = true;
        return 1 + go(x+1,y) + go(x-1,y) + go(x,y+1) + go(x,y-1);
    }
    return go(x_coord,y_coord);
}
Could anyone suggest a better algorithm? A non-deterministic solution would actually be fine if it is sufficiently accurate, as speed is the main issue (the algorithm could be called 3 or 4 times on different points during a single tick).

Maybe you can speed up the algorithm itself. Wikipedia suggests that a scanline approach is efficient.
As for the repeated calls: You can cache the results so that you don't have to run the area calculation again every time.
An approach might be to keep a region map of integers alongside your tiles. This denotes several regions, where a special value, -1 for example, means no region. (This region map also serves as your visited attribute.) In addition to that, keep a (short) array of regions with their areas.
In your example above:
When you calculate the area of (0, 0), you will assign 0 to the four tiles in the northwest corner. You will also append the area, 4, to the area array.
When you calculate the area of (0, 1), you notice that the region map for that coordinate has a value of zero, not -1. That means that the area was already calculated.
When you calculate the area of (4, 0), you find -1 in the region map. That means that the region hasn't been calculated yet. Do that, mark the region with 1 and append the new area, 15, to the area array.
I don't know how often the board changes. When you must recalculate the regions, blank out the region map and empty the array list.
The region map is only created once, it isn't recreated for every tick. (I can see this as a potential bottleneck in your code, when the objMap is frequently recreated instead of just being overwritten.)
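The caching idea above, combined with an iterative fill that avoids the call stack limit, might look like the following Python sketch. The class name `RegionCache` and its method names are hypothetical, and a simple queue-based fill is used here in place of the scanline variant mentioned earlier.

```python
from collections import deque


class RegionCache:
    """Caches flood-fill results in a region map (hypothetical helper class)."""

    def __init__(self, grid):
        self.grid = grid  # 2D list, 1 = wall, 0 = empty
        self.region = [[-1] * len(row) for row in grid]  # -1 = not yet assigned
        self.areas = []  # areas[region_id] = number of cells in that region

    def sector_area(self, x, y):
        if self.grid[y][x] == 1:
            return 0  # walls belong to no region
        region_id = self.region[y][x]
        if region_id != -1:
            return self.areas[region_id]  # already computed
        region_id = len(self.areas)
        self.region[y][x] = region_id
        queue = deque([(x, y)])
        area = 0
        while queue:
            cx, cy = queue.popleft()
            area += 1
            for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if (0 <= ny < len(self.grid) and 0 <= nx < len(self.grid[ny])
                        and self.grid[ny][nx] == 0 and self.region[ny][nx] == -1):
                    self.region[ny][nx] = region_id  # mark before enqueueing
                    queue.append((nx, ny))
        self.areas.append(area)
        return area

    def invalidate(self):
        """Call when the board changes: blank the region map and the areas."""
        self.region = [[-1] * len(row) for row in self.grid]
        self.areas = []
```

Each cell is enqueued at most once, so a fill costs time proportional to the size of the region, and repeated queries inside an already filled region are answered from the cache.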

TLE in RECTNG1 spoj problem on finding the number of separate blocks formed by rectangles with integer coordinates

I am trying to solve the SPOJ problem on rectangles.
Problem statement:
There are n rectangles drawn on the plane. Each rectangle has sides parallel to the coordinate axes and integer coordinates of vertices.
We define a block as follows:
each rectangle is a block,
if two distinct blocks have a common segment then they form the new block otherwise we say that these blocks are separate.
Write a program that for each test case:
reads the number of rectangles and coordinates of their vertices;
finds the number of separate blocks formed by the rectangles;
writes the result to the standard output.
Input:
The number of test cases t is in the first line of input, then t test cases follow separated by an empty line.
In the first line of a test case there is an integer n, 1 <= n <= 7000, which is the number of rectangles. In the following n lines there are coordinates of rectangles. Each rectangle is described by four numbers: coordinates x, y of the bottom-left vertex and coordinates x, y of the top-right vertex. All these coordinates are non-negative integers not greater than 10000.
Output:
For each test case you should output one line with the number of separate blocks formed by the given rectangles.
My approach:
Check every pair of rectangles r_i and r_j to see whether they are separate, and set the adjacency matrix entries mat[i][j] and mat[j][i] to true or false accordingly.
Then run DFS on the constructed graph to count the number of connected components. This count is the number of separate blocks.
Since there are at most 7000 rectangles, checking every pair takes about 2.5 × 10^7 tests, which should be manageable. Still, I am getting TLE (time limit exceeded).
How can I solve this problem more efficiently?
void comp() {
    list.clear();
    scanI(n);
    REP(i,1,n) {
        Rec rec;
        scanI(rec.p);
        scanI(rec.q);
        scanI(rec.r);
        scanI(rec.s);
        list.pb(rec);
    }
    REP(i,0,list.size()-2){
        Rec rec = list[i];
        p = rec.p;
        q = rec.q;
        r = rec.r;
        s = rec.s;
        REP(j,i+1,list.size()-1) {
            Rec m = list[j];
            a = m.p;
            b = m.q;
            c = m.r;
            d = m.s;
            if(!isSeparate()) {
                eList[i].pb(j); //adjacency list for rec_i
                eList[j].pb(i);//adjacency list for rec_j
            }
        }
    }
    int cnt=0;
    REP(i,0,n-1) {
        if(!vis[i]){
            cnt++;
            dfs(i);
        }
    }
    printf("%d\n",cnt);
}
bool isSeparate(){
    if(s<b || d<q || r<a || c<p) return true;
    if((r==a && q==d)||(c==p && b==s)||(a==r && b==s)||(p==c && q==d)) return true;
    else return false;
}
void dfs(int s) {
    cout<<"Visited : "<<s<<endl;
    if(vis[s]) return;
    vis[s] = true;
    REP(i,0,eList[s].size()-1){
        if(!vis[eList[s][i]]){
            dfs(eList[s][i]);
        }
    }
}

I've thought of a couple more algorithmic improvements.
Use a fast union/find data structure instead of building an adjacency-list representation of the graph. Then if one rectangle intersects another rectangle you can stop right then -- there's no need to continue testing it against all other rectangles seen so far. With this in place, problem instances in which most rectangles intersect most other rectangles will be solved very quickly.
There's still the need to efficiently handle problem instances in which most rectangles intersect few or no other rectangles. A couple of observations:
A rectangle can only overlap another rectangle if both their vertical and horizontal extents overlap.
If we have n non-overlapping rectangles centered at the grid points of some h*w grid, it must be that min(h, w) <= sqrt(n).
Suppose the problem instance has the form of the second bullet point above -- an h*w grid of non-overlapping rectangles, with h*w = n but h and w otherwise unknown. As you process each rectangle, insert its vertical extent into a data structure that enables fast point-in-interval queries, such as an interval tree or segment tree, and insert its horizontal extent into another such data structure. The obvious way of using this information -- by looking up all rectangles that overlap the current rectangle vertically, and looking up all rectangles that overlap it horizontally, and then intersecting these 2 lists -- doesn't give much speed advantage, because one of these lists could be very long. What you can do instead is to simply pick the shorter of these 2 lists and test every rectangle in it (as before, stopping as soon as an overlap is detected). This is fast, because we know that the shorter list can have at most sqrt(7000) rectangles in it.
I haven't proven that a grid of non-overlapping rectangles is a true worst case for this algorithm, but I'm confident the above approach will work quickly in any case.
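As an illustration of the union/find part only, here is a minimal Python sketch that replaces the adjacency lists and DFS with a disjoint-set structure. It keeps the plain all-pairs loop and does not include the early exit or the interval-tree pruning discussed above. The separation test is a direct translation of isSeparate() from the question; the names `count_blocks` and `is_separate` are hypothetical.

```python
def is_separate(r1, r2):
    """Translation of isSeparate(): rectangles are (x1, y1, x2, y2) tuples."""
    p, q, r, s = r1
    a, b, c, d = r2
    if s < b or d < q or r < a or c < p:
        return True  # no contact at all
    # Touching only at a single corner does not count as a common segment.
    return ((r == a and q == d) or (c == p and b == s)
            or (a == r and b == s) or (p == c and q == d))


def count_blocks(rects):
    """Count separate blocks with union/find instead of adjacency lists + DFS."""
    parent = list(range(len(rects)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    blocks = len(rects)
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if not is_separate(rects[i], rects[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    blocks -= 1  # two blocks merged into one
    return blocks
```

The block count is maintained incrementally, so no graph traversal is needed after the pair loop finishes.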

Optimizing the layout of a graph with given (erroneous) node-distances

I have a loosely connected graph. For every edge in this graph, I know the approximate distance d(v,w) between nodes v and w at positions p(v) and p(w) as a vector in R3, not only as a Euclidean distance. The error is small (let's say < 3%) and the first node is at <0,0,0>.
If there were no errors at all, I can calculate the node-positions this way:
set p(first_node) = <0,0,0>
calculate_position(first_node)
calculate_position(v):
    for (v,w) in Edges:
        if p(w) is not set:
            set p(w) = p(v) + d(v,w)
            calculate_position(w)
    for (u,v) in Edges:
        if p(u) is not set:
            set p(u) = p(v) - d(u,v)
            calculate_position(u)
The errors of the distance are not equal. But to keep things simple, assume the relative error (d(v,w)-d'(v,w))/E(v,w) is N(0,1)-normal-distributed. I want to minimize the sum of the squared error
sum( ((p(v)-p(w)) - d(v,w) )^2/E(v,w)^2 ) for all edges
The graph may have a moderate amount of Nodes ( > 100 ) but with just some connections between the nodes and have been "prefiltered" (split into subgraphs, if there is only one connection between these subgraphs).
I have tried a simplistic "physical model" based on Hooke's law, but it's slow and unstable. Is there a better algorithm or heuristic for this kind of problem?

This looks like linear regression. Take error terms of the following form, i.e. without squares and split into separate coordinates:
(px(v) - px(w) - dx(v,w))/E(v,w)
(py(v) - py(w) - dy(v,w))/E(v,w)
(pz(v) - pz(w) - dz(v,w))/E(v,w)
If I understood you correctly, you are looking for values px(v), py(v) and pz(v) for all nodes v such that the sum of squares of the above terms is minimized.
You can do this by creating a matrix A and a vector b in the following way: every row corresponds to one equation of the above form, and every column of A corresponds to one variable, i.e. a single coordinate. For n vertices and m edges, the matrix A will have 3m rows (since you separate coordinates) and 3n−3 columns (since you also fix the first node px(0)=py(0)=pz(0)=0).
The row for (px(v) - px(w) - dx(v,w))/E(v,w) would have an entry 1/E(v,w) in the column for px(v) and an entry -1/E(v,w) in the column for px(w). All other columns would be zero. The corresponding entry in the vector b would be dx(v,w)/E(v,w).
Now solve the linear equation (A^T·A)x = A^T·b where A^T denotes the transpose of A. The solution vector x will contain the coordinates for your vertices. You can break this into three independent problems, one for each coordinate direction, to keep the size of the linear equation system down.
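For a single coordinate direction, the normal equations can be assembled and solved as in the pure-Python sketch below. It follows the sign convention of the error terms above, i.e. d(v,w) ≈ p(v) − p(w), fixes node 0 at the origin, and assumes the graph is connected (otherwise the system is singular). The function name `solve_positions_1d` and the edge tuple layout `(v, w, d, err)` are assumptions made for this example; in practice a linear algebra library would likely be the better choice for larger graphs.

```python
def solve_positions_1d(n, edges):
    """Weighted least squares for one coordinate; node 0 is fixed at 0.

    edges: list of (v, w, d, err) with d approximately p(v) - p(w).
    Returns a list of n positions.
    """
    k = n - 1  # unknowns are nodes 1..n-1
    ata = [[0.0] * k for _ in range(k)]  # A^T A
    atb = [0.0] * k                      # A^T b
    for v, w, d, err in edges:
        weight = 1.0 / (err * err)
        # The row has +1/err in column v and -1/err in column w;
        # node 0 has no column because its position is fixed.
        coeffs = [(c, s) for c, s in ((v - 1, 1.0), (w - 1, -1.0)) if c >= 0]
        for ci, si in coeffs:
            atb[ci] += weight * si * d
            for cj, sj in coeffs:
                ata[ci][cj] += weight * si * sj
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(ata[row][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        atb[col], atb[pivot] = atb[pivot], atb[col]
        for row in range(col + 1, k):
            factor = ata[row][col] / ata[col][col]
            for c in range(col, k):
                ata[row][c] -= factor * ata[col][c]
            atb[row] -= factor * atb[col]
    # Back substitution.
    x = [0.0] * k
    for row in range(k - 1, -1, -1):
        rest = sum(ata[row][c] * x[c] for c in range(row + 1, k))
        x[row] = (atb[row] - rest) / ata[row][row]
    return [0.0] + x
```

Calling it once each for the x, y and z components of the measured edge vectors gives the full 3D layout.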

Algorithm to convert vertices of a triangular strip to polygon

I have an array with vertices representing a triangular strip.
I need to convert it into polygon.
There are many solutions for doing the reverse, but I failed to find one for the above problem.
Or it could be too easy and I just cannot see it.
Please help.
OpenGL-compatible, see
http://en.wikipedia.org/wiki/Triangle_strip
Example:
for this strip http://en.wikipedia.org/wiki/File:Triangle_Strip_Small.png
I need output A B D F E C or A C E F D B

I believe the following should work:
Walk through the list of vertices. Add the first point to your polygon. Push the second point on the stack. Add the third point to the polygon. Continue alternating between pushing points on the stack and adding them to the polygon until you reach the end of the list. When you get to the end of the list, pop the points off the stack and add them to the polygon.

I'll assume your triangle strip is always connected the same way (which I believe is true for OpenGL).
The "bottom" vertices are always two apart: A, C, E, ...
The "top" vertices are always two apart: B, D, F, ...
Take the "bottom" list and append the reverse of the "top" list. (ACEFDB for the example)
Or, more directly, using a zero-based index instead of letters:
// do "bottom"
for ( i = 0; i < N; i += 2 )
    addVertex( i )
// do "top"
largestOddNumberLessThanN = N % 2 == 0 ? N - 1 : N - 2;
for ( i = largestOddNumberLessThanN; i >= 0; i -= 2 )
    addVertex( i )
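The same idea in Python, as a short sketch using slicing (the function name `strip_to_polygon` is hypothetical): the even-indexed vertices are taken in order, followed by the odd-indexed vertices in reverse.

```python
def strip_to_polygon(strip):
    """Return the outline of a triangle strip as a polygon vertex list."""
    bottom = strip[0::2]      # indices 0, 2, 4, ...
    top = strip[1::2][::-1]   # indices 1, 3, 5, ... in reverse order
    return bottom + top


print(strip_to_polygon(["A", "B", "C", "D", "E", "F"]))
# ['A', 'C', 'E', 'F', 'D', 'B']
```

This matches the A C E F D B ordering requested in the question and works for strips with an odd number of vertices as well.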

There may be a shortcut if your shape has a particularly simple structure, but in general I think you want to do the following
Create a list of vertices and edges from your list of triangles. This involves detecting shared points (hopefully they are exact matches so you can find them by hashing instead of needing some sort of fuzzy search) and shared lines (that's easy--just don't draw two edges between the same pair of shared vertices).
Find the upper-left-most point.
Find the edge that travels as upper-left-most as possible, and is in the counterclockwise direction.
Walk to the next vertex.
Find the next edge that doubles back on the previous one as much as possible.
Keep going until you hit the upper-left-most point again.
