Marching Squares Algorithm's bit shifting step - algorithm

I am currently implementing the Marching Squares for calculating contour curves and I had a question regarding the usage of bit shifting as mentioned here
Compose the 4 bits at the corners of the cell to build a binary index: walk around the cell in a clockwise direction appending the bit to the index, using bitwise OR and left-shift, from most significant bit at the top left, to least significant bit at the bottom left. The resulting 4-bit index can have 16 possible values in the range 0–15.
I have height data that I place on corners of each vertex at a specified (x,y). Then I convert this height data into 0s and 1s by performing a check to see whether this height data is greater or lesser than a specified isovalue(say contour level). Now the vertexes are either 0s or 1s. What is the purpose of the next step i.e. calculating the 4-bit index by traversing in a clockwise direction ?

The purpose of composing a four bit code is simply to identify the configuration, one among sixteen cases.
You could very well use four nested if/then/else statements instead.

Related

Algorithm to fill arbitrary marked/selected tiles on a square grid with the smallest number of rectangles?

What I am asking here is an algorithm question. I'm not asking for specifics of how to do it in the programming language I'm working in or with the framework and libraries I'm currently using. I want to know how to do this in principle.
As a hobby, I am working on an open source virtual reality remake of the 1992 first-person shooter game Wolfenstein 3D. My program will support classic mods and map packs for WOLF3D made in the original format from the 90s. This means that my program will not know in advance what the maps are going to be. They are loaded in at runtime from user provided files.
A Wolfenstein 3D map is a 2D square grid of normally 64x64 tiles. let's assume I have a 2D array of bools which return true if a particular tile can be traversed by the player and false if the tile will never be traversable no matter what happens in the game.
I want to generate rectangular collision objects for a modern game engine which will prevent collisions into non traversable tiles on the map. Right now, I have a small collision object on each surface of each wall tile with a traversible tile next to it and that is very inefficient because it makes way more collision objects than necessary. What I should have instead is a smaller number of large rectangles which fill all of the squares on the grid where that 2D array I mentioned has a false value to indicate non-traversible.
When I search for any algorithms or research that might have been done for problems similar to this, I find lots of information about rectangle packing for the purposes of making texture atlases for games, which packs rectangles into a square, but I haven't found anything that tries to pack the smallest number of rectangles into an arbitrary set of selected / marked square tiles.
The naive approach which occurs to me is to first make 64 rectangles representing 64 rows and then chop out whatever squares are traversible. but I suspect that there's got to be an algorithm which can do better, meaning that it can fill the same spaces with a smaller number of rectangles. Maybe something that starts with my naive approach and then checks each rectangle for adjacent rectangles which it could merge with? But I'm not sure how far to take that approach or if it will even truly reduce the number of rectangles.
The result doesn't have to be perfect. I am just fishing here to see if anyone has any magic tricks that could take me even a little bit beyond the naive approach.
Has anyone done this before? What is it called? Just knowing what some of the vocabulary words I would need to even talk about this are would help. Thanks!
(later edit)
Here is some sample input as comma-separated values. The 1s represent the area that must be filled with the rectangles while the 0s represent the area that should not be filled with the rectangles.
I expect that the result would be a list of sets of 4 integers where each set represents a rectangle like this:
First integer would be the x coordinate of the left/western edge of the rectangle.
Second integer would be the y coordinate of the top/northern edge of the rectangle.
Third integer would be the width of the rectangle.
Fourth integer would be the depth of the rectangle.
My program is in C# but I'm sure I can translate anything in a normal mainstream general purpose programming language or psuedocode.
Mark all tiles as not visited
For each tile:
skip if the tile is not a top-left corner or was visited before
# now, the tile is a top-left corner
expand right until top-right corner is found
expand down
save the rectangle
mark all tiles in the rectangle as visited
However simplistic it looks, it will likely generate minimal number of rectangles - simply because we need at least one rectangle per pair of top corners.
For faster downward expansion, it makes sense to precompute a table holding sum of all element top and left from the tile (aka integral image).
For non-overlapping rectangles, worst case complexity for an n x n "image" should not exceed O(n^3). If rectangles can overlap (would result in smaller number of them), integral image optimization is not applicable and the worst case will be O(n^4).

Minimum amount of rectangles from multi-colored grid

I've been working for some time in an XNA roguelike game and I can't get my head around the following problem: developing an algorithm to divide a matrix of non-binary values into the fewest rectangles grouping these values.
Example: given the following matrix
01234567
0 ---##*##
1 ---##*##
2 --------
The algorithm should return:
3x3 rectangle of '-'s starting at (0,0)
2x2 rectangle of '#'s starting at (3, 0)
1x2 rectangle of '*'s starting at (5, 0)
2x2 rectangle of '#'s starting at (6, 0)
5x1 rectangle of '-'s starting at (3, 2)
Why am I doing this: I've gotten a pretty big dungeon type with a size of approximately 500x500. If I were to individually call the "Draw" method for each tile's Sprite, my FPS would be far too low. It is possible to optimize this process by grouping similar-textured tiles and applying texture repetition to them, which would dramatically decrease the amount of GPU draw calls for that. For example, if my map were the previous matrix, instead of calling draw 16 times, I'd call it only 5 times.
I've looked at some algorithms which can give you the biggest rectangle of a type inside a given binary matrix, but that doesn't fit my problem.
Thanks in advance!
You can use breadth first searches to separate each area of different tile type.
Picking a partitioning within the individual shapes is an NP-hard problem (see https://en.wikipedia.org/wiki/Graph_partition), so you can't find an efficient solution that guarantees the minimum number of rectangles. However if you don't mind an extra rectangle or two for each shape and your shapes are relatively small, you can come up with algorithms that split the shape into a number of rectangles close to the minimum.
An off the top of my head guess for something that could potentially work would be to pick a tile with the maximum connecting tiles and start growing a rectangle from it using a recursive algorithm to maximize the size. Remove the resulting rectangle from the shape, then repeat until there are no more tiles not included in a rectangle. Again, this won't produce perfect results, there are graphs on which this will return with more than the minimum amount of rectangles, but it's an easy to implement ballpark solution. With a little more effort I'm sure you will be able to find better heuristics to use and get better results too.
One possible building block is a routine to check, given two points, whether the rectangle formed by using those points as opposite corners is all of the same type. I think that a fast (but unreliable) means of testing this can be based on mapping each type to a large random number, and then working out the sum of the numbers within a rectangle modulo a large prime. Take one of the numbers within the rectangle. If the sum of the numbers within the rectangle is the size of the rectangle times the one number sampled, assume that the all of the numbers in the rectangle are the same.
In one dimension we can work out all of the cumulative sums a, a+b, a+b+c, a+b+c+d,... in time O(N) and then, for any two points, work out the sum for the interval between them by subtracting cumulative sums: b+c+d = a+b+c+d - a. In two dimensions, we can use cumulative sums to work out, for each point, the sum of all of the numbers from positions which have x and y co-ordinates no greater than the (x, y) coordinate of that position. For any rectangle we can work out the sum of the numbers within that rectangle by working out A-B-C+D where A,B,C,D are two-dimensional cumulative sums.
So with pre-processing O(N) we can work out a table which allows us to compute the sum of the numbers within a rectangle specified by its opposite corners in time O(1). Unless we are very unlucky, checking this sum against the size of the rectangle times a number extracted from within the rectangle will tell us whether the rectangle is all of the same type.
Based on this, repeatedly start with a random point not covered. Take a point just to its left and move that point left as long as the interval between the two points is of the same type. Then move that point up as long as the rectangle formed by the two points is of the same type. Now move the first point to the right and down as long as the rectangle formed by the two points is of the same type. Now you think you have a large rectangle covering the original point. Check it. In the unlikely event that it is not all of the same type, add that rectangle to a list of "fooled me" rectangles you can check against in future and try again. If it is all of the same type, count that as one extracted rectangle and mark all of the points in it as covered. Continue until all points are covered.
This is a greedy algorithm that makes no attempt at producing the optimal solution, but it should be reasonably fast - the most expensive part is checking that the rectangle really is all of the same type, and - assuming you pass that test - each cell checked is also a cell covered so the total cost for the whole process should be O(N) where N is the number of cells of input data.

Algorithm to divide region such that sum of distance is minimized

Suppose we have n points in a bounded region of the plane. The problem is to divide it in 4 regions (with a horizontal and a vertical line) such that the sum of a metric in each region is minimized.
The metric can be for example, the sum of the distances between the points in each region ; or any other measure about the spreadness of the points. See the figure below.
I don't know if any clustering algorithm might help me tackle this problem, or if for instance it can be formulated as a simple optimization problem. Where the decision variables are the "axes".
I believe this can be formulated as a MIP (Mixed Integer Programming) problem.
Lets introduce 4 quadrants A,B,C,D. A is right,upper, B is right,lower, etc. Then define a binary variable
delta(i,k) = 1 if point i is in quadrant k
0 otherwise
and continuous variables
Lx, Ly : coordinates of the lines
Obviously we have:
sum(k, delta(i,k)) = 1
xlo <= Lx <= xup
ylo <= Ly <= yup
where xlo,xup are the minimum and maximum x-coordinate. Next we need to implement implications like:
delta(i,'A') = 1 ==> x(i)>=Lx and y(i)>=Ly
delta(i,'B') = 1 ==> x(i)>=Lx and y(i)<=Ly
delta(i,'C') = 1 ==> x(i)<=Lx and y(i)<=Ly
delta(i,'D') = 1 ==> x(i)<=Lx and y(i)>=Ly
These can be handled by so-called indicator constraints or written as linear inequalities, e.g.
x(i) <= Lx + (delta(i,'A')+delta(i,'B'))*(xup-xlo)
Similar for the others. Finally the objective is
min sum((i,j,k), delta(i,k)*delta(j,k)*d(i,j))
where d(i,j) is the distance between points i and j. This objective can be linearized as well.
After applying a few tricks, I could prove global optimality for 100 random points in about 40 seconds using Cplex. This approach is not really suited for large datasets (the computation time quickly increases when the number of points becomes large).
I suspect this cannot be shoe-horned into a convex problem. Also I am not sure this objective is really what you want. It will try to make all clusters about the same size (adding a point to a large cluster introduces lots of distances to be added to the objective; adding a point to a small cluster is cheap). May be an average distance for each cluster is a better measure (but that makes the linearization more difficult).
Note - probably incorrect. I will try and add another answer
The one dimensional version of minimising sums of squares of differences is convex. If you start with the line at the far left and move it to the right, each point crossed by the line stops accumulating differences with the points to its right and starts accumulating differences to the points to its left. As you follow this the differences to the left increase and the differences to the right decrease, so you get a monotonic decrease, possibly a single point that can be on either side of the line, and then a monotonic increase.
I believe that the one dimensional problem of clustering points on a line is convex, but I no longer believe that the problem of drawing a single vertical line in the best position is convex. I worry about sets of points that vary in y co-ordinate so that the left hand points are mostly high up, the right hand points are mostly low down, and the intermediate points alternate between high up and low down. If this is not convex, the part of the answer that tries to extend to two dimensions fails.
So for the one dimensional version of the problem you can pick any point and work out in time O(n) whether that point should be to the left or right of the best dividing line. So by binary chop you can find the best line in time O(n log n).
I don't know whether the two dimensional version is convex or not but you can try all possible positions for the horizontal line and, for each position, solve for the position of the vertical line using a similar approach as for the one dimensional problem (now you have the sum of two convex functions to worry about, but this is still convex, so that's OK). Therefore you solve at most O(n) one-dimensional problems, giving cost O(n^2 log n).
If the points aren't very strangely distributed, I would expect that you could save a lot of time by using the solution of the one dimensional problem at the previous iteration as a first estimate of the position of solution for the next iteration. Given a starting point x, you find out if this is to the left or right of the solution. If it is to the left of the solution, go 1, 2, 4, 8... steps away to find a point to the right of the solution and then run binary chop. Hopefully this two-stage chop is faster than starting a binary chop of the whole array from scratch.
Here's another attempt. Lay out a grid so that, except in the case of ties, each point is the only point in its column and the only point in its row. Assuming no ties in any direction, this grid has N rows, N columns, and N^2 cells. If there are ties the grid is smaller, which makes life easier.
Separating the cells with a horizontal and vertical line is pretty much picking out a cell of the grid and saying that cell is the cell just above and just to the right of where the lines cross, so there are roughly O(N^2) possible such divisions, and we can calculate the metric for each such division. I claim that when the metric is the sum of the squares of distances between points in a cluster the cost of this is pretty much a constant factor in an O(N^2) problem, so the whole cost of checking every possibility is O(N^2).
The metric within a rectangle formed by the dividing lines is SUM_i,j[ (X_i - X_j)^2 + (Y_i-Y_j)^2]. We can calculate the X contributions and the Y contributions separately. If you do some algebra (which is easier if you first subtract a constant so that everything sums to zero) you will find that the metric contribution from a co-ordinate is linear in the variance of that co-ordinate. So we want to calculate the variances of the X and Y co-ordinates within the rectangles formed by each division. https://en.wikipedia.org/wiki/Algebraic_formula_for_the_variance gives us an identity which tells us that we can work out the variance given SUM_i Xi and SUM_i Xi^2 for each rectangle (and the corresponding information for the y co-ordinate). This calculation can be inaccurate due to floating point rounding error, but I am going to ignore that here.
Given a value associated with each cell of a grid, we want to make it easy to work out the sum of those values within rectangles. We can create partial sums along each row, transforming 0 1 2 3 4 5 into 0 1 3 6 10 15, so that each cell in a row contains the sum of all the cells to its left and itself. If we take these values and do partial sums up each column, we have just worked out, for each cell, the sum of the rectangle whose top right corner lies in that cell and which extends to the bottom and left sides of the grid. These calculated values at the far right column give us the sum for all the cells on the same level as that cell and below it. If we subtract off the rectangles we know how to calculate we can find the value of a rectangle which lies at the right hand side of the grid and the bottom of the grid. Similar subtractions allow us to work out first the value of the rectangles to the left and right of any vertical line we choose, and then to complete our set of four rectangles formed by two lines crossing by any cell in the grid. The expensive part of this is working out the partial sums, but we only have to do that once, and it costs only O(N^2). The subtractions and lookups used to work out any particular metric have only a constant cost. We have to do one for each of O(N^2) cells, but that is still only O(N^2).
(So we can find the best clustering in O(N^2) time by working out the metrics associated with all possible clusterings in O(N^2) time and choosing the best).

Algorithm for the decomposition of polygons

Does anyone know a relatively fast algorithm for decomposing a set of polygons into their distinct overlapping and non-overlapping regions, i.e. Given a set of n polygons, find all the distinct regions among them?
For instance, the input would be 4 polygons representing circles as shown below
and the output will be all polygons representing the distinct regions shown in the different colours.
I can write my own implementation using polygon operations but the algorithm will probably be slow and time consuming. I'm wondering if there's any optimised algorithm out there for this sort of problem.
Your problem in called the map overlay problem. It can be solved in O(n*log(n)+k*log(k)) time, where n is the number of segments and k is the number of segment intersections.
First you need to represent your polygons as a doubly connected edge list, different faces corresponding to the interiors of different polygons.
Then use the Bentley–Ottmann algorithm to find all segment intersections and rebuild the edge list. See: Computing the Overlay of Two Subdivisions or Subdivision representation and map overlay.
Finally, walk around each cycle in the edge list and collect faces of that cycle's half-edges. Every set of the faces will represent a distinct overlapping region.
See also: Shapefile Overlay Using a Doubly-Connected Edge List.
I don't think it is SO difficult.
I have answered the similar question on the friendly site and it was checked by a smaller community:
https://cs.stackexchange.com/questions/20039/detect-closed-shapes-formed-by-points/20247#20247
Let's look for a more common question - let's take curves instead of polygons. And let's allow them to go out of the picture border, but we'll count only for simple polygons that wholly belong to the picture.
find all intersections by checking all pairs of segments, belonging to different curves. Of course, filter them before real check for intersection.
Number all curves 1..n. Set some order of segments in them.
For every point create a sequence of intersections SOI, so: if it starts from the border end, SOI[1] is null. If not, SOI[1]= (number of the first curve it is intersecting with, the sign of the left movement on the intersecting curve). Go on, writing down into SOI every intersection - number of curve if there is some, or 0 if it is the intersection with the border.
Obviously, you are looking only for simple bordered areas, that have no curves inside.
Pieces of curves between two adjacent non-null intersection points we'll call segments.
Having SOI for each curve:
for segment of the curve 1, starting from the first point of the segment, make 2 attempts to draw a polygon of segments. It is 2 because you can go to 2 sides along the first intersecting curve.
For the right attempt, make only left turns, for the left attempt, make only the right turns.
If you arrive at point with no segment in the correct direction, the attempt fails. If you return to the curve 1, it success. You have a closed area.
Remember all successful attempts
Repeat this for all segments of curve 1
Repeat this for all other curves, checking all found areas against the already found ones. Two same adjacent segments is enough to consider areas equal.
How to find the orientation of the intersection.
When segment p(p1,p2) crosses segment q(q1,q2), we can count the vector multiplication of vectors pXq. We are interested in only sign of its Z coordinate - that is out of our plane. If it is +, q crosses p from left to right. If it is -, the q crosses p from right to left.
The Z coordinate of the vector multiplication is counted here as a determinant of matrix:
0 0 1
p2x-p1x p2y-p1y 0
q2x-q1x q2y-q1y 0
(of course, it could be written more simply, but it is a good memorization trick)
Of course, if you'll change all rights for lefts, nothing really changes in the algorithm as a whole.

Minimum number of rectangles in shape made from rectangles?

I'm not sure if there's an algorithm that can solve this.
A given number of rectangles are placed side by side horizontally from left to right to form a shape. You are given the width and height of each.
How would you determine the minimum number of rectangles needed to cover the whole shape?
i.e How would you redraw this shape using as few rectangles as possible?
I've can only think about trying to squeeze as many big rectangles as i can but that seems inefficient.
Any ideas?
Edit:
You are given a number n , and then n sizes:
2
1 3
2 5
The above would have two rectangles of sizes 1x3 and 2x5 next to each other.
I'm wondering how many rectangles would i least need to recreate that shape given rectangles cannot overlap.
Since your rectangles are well aligned, it makes the problem easier. You can simply create rectangles from the bottom up. Each time you do that, it creates new shapes to check. The good thing is, all your new shapes will also be base-aligned, and you can just repeat as necessary.
First, you want to find the minimum height rectangle. Make a rectangle that height, with the width as total width for the shape. Cut that much off the bottom of the shape.
You'll be left with multiple shapes. For each one, do the same thing.
Finding the minimum height rectangle should be O(n). Since you do that for each group, worst case is all different heights. Totals out to O(n2).
For example:
In the image, the minimum for each shape is highlighted green. The resulting rectangle is blue, to the right. The total number of rectangles needed is the total number of blue ones in the image, 7.
Note that I'm explaining this as if these were physical rectangles. In code, you can completely do away with the width, since it doesn't matter in the least unless you want to output the rectangles rather than just counting how many it takes.
You can also reduce the "make a rectangle and cut it from the shape" to simply subtracting the height from each rectangle that makes up that shape/subshape. Each contiguous section of shapes with +ve height after doing so will make up a new subshape.
If you look for an overview on algorithms for the general problem, Rectangular Decomposition of Binary Images (article by Tomas Suk, Cyril Höschl, and Jan Flusser) might be helpful. It compares different approaches: row methods, quadtree, largest inscribed block, transformation- and graph-based methods.
A juicy figure (from page 11) as an appetizer:
Figure 5: (a) The binary convolution kernel used in the experiment. (b) Its 10 blocks of GBD decomposition.

Resources