Squarified treemap algorithm for flat data-sets? - algorithm

At the moment I use the slice & dice approach:
get the list of values and labels
calculate the sum of all values
calculate the ratios by dividing every value by the sum
for every list item:
    draw a box with height=1 and width=ratio
    draw the label on top of the box

Let me restate your question as I understood it.
You know the areas of a bunch of rectangles that you need to draw and fit into a square, and you'd like to do so with the rectangles not having extreme height/width ratios. Each rectangle represents the portion of the total value taken by a given label. The question is to figure out what the shape and positions of those rectangles should be such that they fit perfectly, and each has the required area.
You do not need a perfect answer. Only one that is better than the current, which just slices the square into vertical strips.
Here is my suggestion. Generalize to fitting rectangles in a target rectangle. (This lets us use recursion.) Also I will assume that the labels have been sorted by area, with the largest first. (Sorting is an easy step to add.) Then figure out the placement of rectangles according to the following recursive rules:
If the first element in the list is more than 1/3 of the area, split the long side of your target rectangle into the first element, and everything else, then recursively fit everything else in the remainder.
Otherwise split your list into two, with the median of the area going into the first list. Divide your target rectangle into two, one for the first list, one for the second. Recursively fill each rectangle.
This should provide a fairly good division for your purposes with most data sets. It should be fairly fast to compute, and only the very smallest rectangle can have a ratio more extreme than 3 to 1.
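The two rules above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the names `layout` and `treemap` are made up for this example, all values are assumed to be positive, and "the median going into the first list" is read as splitting the sorted list by count with the middle element in the first half.

```python
def layout(items, x, y, w, h):
    """Place (label, value) pairs, sorted largest first, inside the
    target rectangle (x, y, w, h). Returns a list of (label, rect)."""
    if not items:
        return []
    if len(items) == 1:
        return [(items[0][0], (x, y, w, h))]
    total = sum(value for _, value in items)
    if items[0][1] > total / 3:
        # Rule 1: the first element alone, everything else in the remainder.
        head, tail = items[:1], items[1:]
    else:
        # Rule 2: split the list in two, the median goes into the first list.
        mid = (len(items) + 1) // 2
        head, tail = items[:mid], items[mid:]
    frac = sum(value for _, value in head) / total
    if w >= h:
        # Split the long (horizontal) side.
        w1 = w * frac
        return layout(head, x, y, w1, h) + layout(tail, x + w1, y, w - w1, h)
    # Split the long (vertical) side.
    h1 = h * frac
    return layout(head, x, y, w, h1) + layout(tail, x, y + h1, w, h - h1)


def treemap(values, w=1.0, h=1.0):
    """values: dict of label -> positive number. Sorts, then lays out."""
    items = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    return layout(items, 0.0, 0.0, w, h)
```

For example, `treemap({"a": 6, "b": 3, "c": 2, "d": 1}, 4, 3)` gives "a" the left half of the 4x3 rectangle and stacks the rest in the right half. Each rectangle's area is proportional to its value.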

Related

Algorithm to fill arbitrary marked/selected tiles on a square grid with the smallest number of rectangles?

What I am asking here is an algorithm question. I'm not asking for specifics of how to do it in the programming language I'm working in or with the framework and libraries I'm currently using. I want to know how to do this in principle.
As a hobby, I am working on an open source virtual reality remake of the 1992 first-person shooter game Wolfenstein 3D. My program will support classic mods and map packs for WOLF3D made in the original format from the 90s. This means that my program will not know in advance what the maps are going to be. They are loaded in at runtime from user provided files.
A Wolfenstein 3D map is a 2D square grid of normally 64x64 tiles. Let's assume I have a 2D array of bools which returns true if a particular tile can be traversed by the player and false if the tile will never be traversable no matter what happens in the game.
I want to generate rectangular collision objects for a modern game engine which will prevent collisions into non-traversable tiles on the map. Right now, I have a small collision object on each surface of each wall tile with a traversable tile next to it, and that is very inefficient because it makes way more collision objects than necessary. What I should have instead is a smaller number of large rectangles which fill all of the squares on the grid where that 2D array I mentioned has a false value to indicate non-traversable.
When I search for any algorithms or research that might have been done for problems similar to this, I find lots of information about rectangle packing for the purposes of making texture atlases for games, which packs rectangles into a square, but I haven't found anything that tries to pack the smallest number of rectangles into an arbitrary set of selected / marked square tiles.
The naive approach which occurs to me is to first make 64 rectangles representing 64 rows and then chop out whatever squares are traversable. But I suspect that there's got to be an algorithm which can do better, meaning that it can fill the same spaces with a smaller number of rectangles. Maybe something that starts with my naive approach and then checks each rectangle for adjacent rectangles which it could merge with? But I'm not sure how far to take that approach or if it will even truly reduce the number of rectangles.
The result doesn't have to be perfect. I am just fishing here to see if anyone has any magic tricks that could take me even a little bit beyond the naive approach.
Has anyone done this before? What is it called? Just knowing what some of the vocabulary words I would need to even talk about this are would help. Thanks!
(later edit)
Here is some sample input as comma-separated values. The 1s represent the area that must be filled with the rectangles while the 0s represent the area that should not be filled with the rectangles.
I expect that the result would be a list of sets of 4 integers where each set represents a rectangle like this:
First integer would be the x coordinate of the left/western edge of the rectangle.
Second integer would be the y coordinate of the top/northern edge of the rectangle.
Third integer would be the width of the rectangle.
Fourth integer would be the depth of the rectangle.
My program is in C# but I'm sure I can translate anything in a normal mainstream general purpose programming language or pseudocode.
Mark all tiles as not visited
For each tile:
    skip if the tile is not a top-left corner or was visited before
    # now, the tile is a top-left corner
    expand right until top-right corner is found
    expand down
    save the rectangle
    mark all tiles in the rectangle as visited
However simplistic it looks, it will likely generate a minimal number of rectangles, simply because we need at least one rectangle per pair of top corners.
For faster downward expansion, it makes sense to precompute a table holding, for each tile, the sum of all elements above and to the left of it (aka an integral image).
For non-overlapping rectangles, worst case complexity for an n x n "image" should not exceed O(n^3). If rectangles can overlap (would result in smaller number of them), integral image optimization is not applicable and the worst case will be O(n^4).
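A rough Python sketch of the non-overlapping variant, without the integral-image speed-up. It simplifies the corner test: scanning in row-major order, any filled tile that is not yet visited is treated as the top-left corner of a new rectangle. The function name `decompose` and the `(x, y, width, depth)` output layout (taken from the question) are choices made for this example only.

```python
def decompose(grid):
    """grid[y][x] is truthy where a tile must be filled.
    Returns non-overlapping rectangles as (x, y, width, depth)."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    visited = [[False] * cols for _ in range(rows)]

    def free(x, y):
        # A tile can join a rectangle if it must be filled and is unused.
        return bool(grid[y][x]) and not visited[y][x]

    rects = []
    for y in range(rows):
        for x in range(cols):
            if not free(x, y):
                continue
            # Expand right along the current row.
            width = 1
            while x + width < cols and free(x + width, y):
                width += 1
            # Expand down while the whole next row segment is free.
            depth = 1
            while y + depth < rows and all(
                free(x + i, y + depth) for i in range(width)
            ):
                depth += 1
            # Mark the rectangle as used and save it.
            for yy in range(y, y + depth):
                for xx in range(x, x + width):
                    visited[yy][xx] = True
            rects.append((x, y, width, depth))
    return rects
```

On the grid `[[1, 1, 0], [1, 1, 0], [0, 1, 1]]` this returns `[(0, 0, 2, 2), (1, 2, 2, 1)]`.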

Minimum amount of rectangles from multi-colored grid

I've been working for some time in an XNA roguelike game and I can't get my head around the following problem: developing an algorithm to divide a matrix of non-binary values into the fewest rectangles grouping these values.
Example: given the following matrix
01234567
0 ---##*##
1 ---##*##
2 --------
The algorithm should return:
3x3 rectangle of '-'s starting at (0,0)
2x2 rectangle of '#'s starting at (3, 0)
1x2 rectangle of '*'s starting at (5, 0)
2x2 rectangle of '#'s starting at (6, 0)
5x1 rectangle of '-'s starting at (3, 2)
Why am I doing this: I've gotten a pretty big dungeon type with a size of approximately 500x500. If I were to individually call the "Draw" method for each tile's Sprite, my FPS would be far too low. It is possible to optimize this process by grouping similar-textured tiles and applying texture repetition to them, which would dramatically decrease the amount of GPU draw calls for that. For example, if my map were the previous matrix, instead of calling draw 16 times, I'd call it only 5 times.
I've looked at some algorithms which can give you the biggest rectangle of a type inside a given binary matrix, but that doesn't fit my problem.
Thanks in advance!
You can use breadth first searches to separate each area of different tile type.
Picking a partitioning within the individual shapes is an NP-hard problem (see https://en.wikipedia.org/wiki/Graph_partition), so you can't find an efficient solution that guarantees the minimum number of rectangles. However if you don't mind an extra rectangle or two for each shape and your shapes are relatively small, you can come up with algorithms that split the shape into a number of rectangles close to the minimum.
An off the top of my head guess for something that could potentially work would be to pick a tile with the maximum connecting tiles and start growing a rectangle from it using a recursive algorithm to maximize the size. Remove the resulting rectangle from the shape, then repeat until there are no more tiles not included in a rectangle. Again, this won't produce perfect results, there are graphs on which this will return with more than the minimum amount of rectangles, but it's an easy to implement ballpark solution. With a little more effort I'm sure you will be able to find better heuristics to use and get better results too.
One possible building block is a routine to check, given two points, whether the rectangle formed by using those points as opposite corners is all of the same type. I think that a fast (but unreliable) means of testing this can be based on mapping each type to a large random number, and then working out the sum of the numbers within a rectangle modulo a large prime. Take one of the numbers within the rectangle. If the sum of the numbers within the rectangle is the size of the rectangle times the one number sampled, assume that all of the numbers in the rectangle are the same.
In one dimension we can work out all of the cumulative sums a, a+b, a+b+c, a+b+c+d,... in time O(N) and then, for any two points, work out the sum for the interval between them by subtracting cumulative sums: b+c+d = a+b+c+d - a. In two dimensions, we can use cumulative sums to work out, for each point, the sum of all of the numbers from positions which have x and y co-ordinates no greater than the (x, y) coordinate of that position. For any rectangle we can work out the sum of the numbers within that rectangle by working out A-B-C+D where A,B,C,D are two-dimensional cumulative sums.
So with pre-processing O(N) we can work out a table which allows us to compute the sum of the numbers within a rectangle specified by its opposite corners in time O(1). Unless we are very unlucky, checking this sum against the size of the rectangle times a number extracted from within the rectangle will tell us whether the rectangle is all of the same type.
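Here is a small Python sketch of that building block. The names `build_table` and `probably_uniform`, the choice of prime (2^61 - 1) and the seeded generator are assumptions made for the example. The check can in principle report a mixed rectangle as uniform, which is why the answer treats it as unreliable.

```python
import random

P = (1 << 61) - 1  # a large prime (the Mersenne prime 2^61 - 1)


def build_table(grid, seed=0):
    """Map every tile type to a distinct random code and build the 2D
    cumulative-sum table. table[y][x] holds the sum (mod P) of the codes
    of all cells in rows < y and columns < x."""
    rng = random.Random(seed)
    codes = {}
    rows, cols = len(grid), len(grid[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            tile = grid[y][x]
            if tile not in codes:
                code = rng.randrange(1, P)
                while code in codes.values():  # keep the codes distinct
                    code = rng.randrange(1, P)
                codes[tile] = code
            table[y + 1][x + 1] = (
                codes[tile] + table[y][x + 1] + table[y + 1][x] - table[y][x]
            ) % P
    return table, codes


def probably_uniform(table, codes, grid, x0, y0, x1, y1):
    """O(1) test for the rectangle with inclusive corners (x0, y0) and
    (x1, y1), where x0 <= x1 and y0 <= y1."""
    total = (
        table[y1 + 1][x1 + 1] - table[y0][x1 + 1]
        - table[y1 + 1][x0] + table[y0][x0]
    ) % P
    area = (x1 - x0 + 1) * (y1 - y0 + 1)
    return total == (area * codes[grid[y0][x0]]) % P
```

With the example matrix from the question stored as a list of strings, the 3x3 block of '-' at (0, 0) passes the test, while the block from (3, 0) to (5, 1), which mixes '#' and '*', fails it.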
Based on this, repeatedly start with a random point not covered. Take a point just to its left and move that point left as long as the interval between the two points is of the same type. Then move that point up as long as the rectangle formed by the two points is of the same type. Now move the first point to the right and down as long as the rectangle formed by the two points is of the same type. Now you think you have a large rectangle covering the original point. Check it. In the unlikely event that it is not all of the same type, add that rectangle to a list of "fooled me" rectangles you can check against in future and try again. If it is all of the same type, count that as one extracted rectangle and mark all of the points in it as covered. Continue until all points are covered.
This is a greedy algorithm that makes no attempt at producing the optimal solution, but it should be reasonably fast - the most expensive part is checking that the rectangle really is all of the same type, and - assuming you pass that test - each cell checked is also a cell covered so the total cost for the whole process should be O(N) where N is the number of cells of input data.

Minimum number of rectangles in shape made from rectangles?

I'm not sure if there's an algorithm that can solve this.
A given number of rectangles are placed side by side horizontally from left to right to form a shape. You are given the width and height of each.
How would you determine the minimum number of rectangles needed to cover the whole shape?
i.e How would you redraw this shape using as few rectangles as possible?
I can only think about trying to squeeze in as many big rectangles as I can, but that seems inefficient.
Any ideas?
Edit:
You are given a number n , and then n sizes:
2
1 3
2 5
The above would have two rectangles of sizes 1x3 and 2x5 next to each other.
I'm wondering what the least number of rectangles is that I would need to recreate that shape, given that rectangles cannot overlap.
Since your rectangles are well aligned, it makes the problem easier. You can simply create rectangles from the bottom up. Each time you do that, it creates new shapes to check. The good thing is, all your new shapes will also be base-aligned, and you can just repeat as necessary.
First, you want to find the minimum height rectangle. Make a rectangle that height, with the width as total width for the shape. Cut that much off the bottom of the shape.
You'll be left with multiple shapes. For each one, do the same thing.
Finding the minimum height rectangle should be O(n). Since you do that for each group, worst case is all different heights. Totals out to O(n^2).
For example:
In the image, the minimum for each shape is highlighted green. The resulting rectangle is blue, to the right. The total number of rectangles needed is the total number of blue ones in the image, 7.
Note that I'm explaining this as if these were physical rectangles. In code, you can completely do away with the width, since it doesn't matter in the least unless you want to output the rectangles rather than just counting how many it takes.
You can also reduce the "make a rectangle and cut it from the shape" to simply subtracting the height from each rectangle that makes up that shape/subshape. Each contiguous section of shapes with +ve height after doing so will make up a new subshape.
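A short Python sketch of that counting version. It only looks at the heights, as suggested above, and assumes every height is positive. The name `count_rectangles` is made up for this example, and an explicit stack is used instead of recursion.

```python
def count_rectangles(heights):
    """Count the rectangles produced by repeatedly cutting a strip of
    the minimum height off the bottom of each base-aligned shape."""
    count = 0
    stack = [list(heights)]
    while stack:
        shape = stack.pop()
        if not shape:
            continue
        lowest = min(shape)
        count += 1  # one full-width rectangle of height `lowest`
        run = []
        for h in shape:
            if h > lowest:
                # Still has positive height: part of the current subshape.
                run.append(h - lowest)
            elif run:
                # A column that was used up closes the current subshape.
                stack.append(run)
                run = []
        if run:
            stack.append(run)
    return count
```

For the input in the question (a 1x3 and a 2x5 rectangle, so heights 3 and 5) this gives `count_rectangles([3, 5]) == 2`.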
If you look for an overview on algorithms for the general problem, Rectangular Decomposition of Binary Images (article by Tomas Suk, Cyril Höschl, and Jan Flusser) might be helpful. It compares different approaches: row methods, quadtree, largest inscribed block, transformation- and graph-based methods.
A juicy figure (from page 11) as an appetizer:
Figure 5: (a) The binary convolution kernel used in the experiment. (b) Its 10 blocks of GBD decomposition.

random placement of rectangles with no overlaps

I am looking for a sound algorithm that would randomly place a given number of rectangles of the same size into a bigger rectangle (canvas).
I see two ways to do it:
Create an empty array that will contain the rectangles already placed on the canvas. Start with the empty canvas. In a loop, pick a position at random for a new rectangle to be placed. Check if the array has a rectangle that overlaps with the new rectangle. If it does not, put the new rectangle into the array and repeat the loop. Otherwise, pick a new position and rerun the check, and so on. This might never terminate (theoretically), I think. I do not like it.
Use a grid and place rectangles into the cells randomly. This might still look like a grid placement. I do not like it either.
Any better ways to do it? "Better" meaning more efficient, or more visually "random" than the grid approach. Better in any respect.
Here is a simple heuristic. It will be non-overlapping and random.
Place a rectangle randomly. Then, calculate the intersections of extensions of the two parallel edges of the first rectangle with the edges of the canvas. You will obtain four convex empty regions. Place other rectangles in these empty regions one-by-one independently and calculate the similar divisions for placements. And try to put the remaining rectangles in empty regions.
You can try different strategies. You can try to place the rectangles close to the corners. Or, you can place them around the center of the regions. We cannot discuss optimality because you introduced randomness.
You might find Quadtrees or R-trees useful for your purpose.
I create internal room-like dungeons using the following method.
1) Scatter N points at random, but not within a few pixels of each other.
2) For each point in turn, expand if possible in all four directions. Cease
expanding if you hit another rectangle.
3) Cease the algorithm when no rooms can expand.
The result is N rectangles with just a few small rectangular spaces.
Code is in the binary image library
https://github.com/MalcolmMcLean/binaryimagelibrary/blob/master/dungeongenerator3.c

Randomly and efficiently filling space with shapes

What is the most efficient way to randomly fill a space with as many non-overlapping shapes? In my specific case, I'm filling a circle with circles. I'm randomly placing circles until either a certain percentage of the outer circle is filled OR a certain number of placements have failed (i.e. were placed in a position that overlapped an existing circle). This is pretty slow, and often leaves empty spaces unless I allow a huge number of failures.
So, is there some other type of filling algorithm I can use to quickly fill as much space as possible, but still look random?
Issue you are running into
You are running into the Coupon collector's problem because you are using a technique of Rejection sampling.
You are also making strong assumptions about what a "random filling" is. Your algorithm will leave large gaps between circles; is this what you mean by "random"? Nevertheless it is a perfectly valid definition, and I approve of it.
Solution
To adapt your current "random filling" to avoid the rejection sampling coupon-collector's issue, merely divide the space you are filling into a grid. For example if your circles are of radius 1, divide the larger circle into a grid of 1/sqrt(2)-width blocks. When it becomes "impossible" to fill a gridbox, ignore that gridbox when you pick new points. Problem solved!
Possible dangers
You have to be careful how you code this however! Possible dangers:
If you do something like if (random point in invalid grid){ generateAnotherPoint() } then you ignore the benefit / core idea of this optimization.
If you do something like pickARandomValidGridbox() then you will slightly reduce the probability of making circles near the edge of the larger circle (though this may be fine if you're doing this for a graphics art project and not for a scientific or mathematical project); however if you make the grid size 1/sqrt(2) times the radius of the circle, you will not run into this problem because it will be impossible to draw blocks at the edge of the large circle, and thus you can ignore all gridboxes at the edge.
Implementation
Thus the generalization of your method to avoid the coupon-collector's problem is as follows:
Inputs: large circle coordinates/radius(R), small circle radius(r)
Output: set of coordinates of all the small circles
Algorithm:
divide your LargeCircle into a grid of r/sqrt(2)
ValidBoxes = {set of all gridboxes that lie entirely within LargeCircle}
SmallCircles = {empty set}
until ValidBoxes is empty:
    pick a random gridbox Box from ValidBoxes
    pick a random point inside Box to be center of small circle C
    check neighboring gridboxes for other circles which may overlap*
    if there is no overlap:
        add C to SmallCircles
        remove the box from ValidBoxes # possible because grid is small
    else if there is an overlap:
        increase the Box.failcount
        if Box.failcount > MAX_PERGRIDBOX_FAIL_COUNT:
            remove the box from ValidBoxes
return SmallCircles
(*) This step is also an important optimization, which I can only assume you do not already have. Without it, your doesThisCircleOverlapAnother(...) function is incredibly inefficient at O(N) per query, which will make filling in circles nearly impossible for large ratios R>>r.
This is the exact generalization of your algorithm to avoid the slowness, while still retaining the elegant randomness of it.
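A possible Python rendering of the pseudocode above, as a sketch rather than a reference implementation. Two things are assumptions on my part: a box counts as valid only if all four of its corners lie within R - r of the centre (so any small circle centred in it stays inside the large circle), and the neighbour check looks 3 boxes in every direction, since with a box side of r/sqrt(2) two centres closer than 2r can be at most 3 boxes apart. The name `fill_circle` and the parameters `max_fails` and `seed` are made up for the example.

```python
import math
import random


def fill_circle(R, r, max_fails=20, seed=None):
    """Return centres of non-overlapping circles of radius r placed at
    random inside a circle of radius R centred on the origin."""
    rng = random.Random(seed)
    side = r / math.sqrt(2)  # box diagonal is r: at most one centre per box
    n = math.ceil(R / side)

    def usable(ix, iy):
        # Every corner of the box is at most R - r from the origin.
        return all(
            math.hypot(cx * side, cy * side) <= R - r
            for cx in (ix, ix + 1) for cy in (iy, iy + 1)
        )

    valid = [(ix, iy) for ix in range(-n, n) for iy in range(-n, n)
             if usable(ix, iy)]
    fails = {box: 0 for box in valid}
    occupied = {}  # box -> centre of the circle placed in it
    while valid:
        k = rng.randrange(len(valid))
        ix, iy = valid[k]
        x = (ix + rng.random()) * side
        y = (iy + rng.random()) * side
        # Only nearby boxes can hold a circle that overlaps the new one.
        overlap = False
        for dx in range(-3, 4):
            for dy in range(-3, 4):
                other = occupied.get((ix + dx, iy + dy))
                if other and math.hypot(x - other[0], y - other[1]) < 2 * r:
                    overlap = True
        if overlap:
            fails[(ix, iy)] += 1
            done = fails[(ix, iy)] > max_fails
        else:
            occupied[(ix, iy)] = (x, y)
            done = True
        if done:
            # Remove the box in O(1) by swapping it with the last entry.
            valid[k] = valid[-1]
            valid.pop()
    return list(occupied.values())
```

Every iteration either removes a box or increases its fail count, so the loop always terminates.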
Generalization to larger irregular features
edit: Since you've commented that this is for a game and you are interested in irregular shapes, you can generalize this as follows. For any small irregular shape, enclose it in a circle that represents how far you want it to be from things. Your grid can be the size of the smallest terrain feature. Larger features can encompass 1x2 or 2x2 or 3x2 or 3x3 etc. contiguous blocks. Note that many games with features that span large distances (mountains) and small distances (torches) often require grids which are recursively split (i.e. some blocks are split into further 2x2 or 2x2x2 subblocks), generating a tree structure. This structure with extensive bookkeeping will allow you to randomly place the contiguous blocks, however it requires a lot of coding. What you can do however is use the circle-grid algorithm to place the larger features first (when there's a lot of space to work with on the map and you can just check adjacent gridboxes for a collision without running into the coupon-collector's problem), then place the smaller features. If you can place your features in this order, this requires almost no extra coding besides checking neighboring gridboxes for collisions when you place a 1x2/3x3/etc. group.
One way to do this that produces interesting looking results is
create an empty NxM grid
create an empty has-open-neighbors set
for i = 1 to NumberOfRegions
    pick a random point in the grid
    assign that grid point a (terrain) type
    add the point to the has-open-neighbors set
while has-open-neighbors is not empty
    foreach point in has-open-neighbors
        get neighbor-points as the immediate neighbors of point
            that don't have an assigned terrain type in the grid
        if none
            remove point from has-open-neighbors
        else
            pick a random neighbor-point from neighbor-points
            assign its grid location the same (terrain) type as point
            add neighbor-point to the has-open-neighbors set
When done, has-open-neighbors will be empty and the grid will have been populated with at most NumberOfRegions regions (some regions with the same terrain type may be adjacent and so will combine to form a single region).
Sample output using this algorithm with 30 points, 14 terrain types, and a 200x200 pixel world:
Edit: tried to clarify the algorithm.
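In Python, the region-growing steps above might look something like the sketch below. The function name `grow_regions` and its parameters are invented for the example, "immediate neighbors" is taken to mean the four orthogonal neighbours, and the set is rebuilt on every pass instead of being modified while it is iterated.

```python
import random


def grow_regions(n_rows, n_cols, n_regions, n_types, seed=None):
    """Fill an n_rows x n_cols grid with terrain types (0..n_types-1)
    by growing regions outwards from randomly chosen seed points."""
    rng = random.Random(seed)
    grid = [[None] * n_cols for _ in range(n_rows)]
    frontier = []  # points that may still have unassigned neighbours
    for _ in range(n_regions):
        r, c = rng.randrange(n_rows), rng.randrange(n_cols)
        grid[r][c] = rng.randrange(n_types)
        frontier.append((r, c))
    while frontier:
        next_frontier = []
        for r, c in frontier:
            open_neighbours = [
                (r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
                and grid[r + dr][c + dc] is None
            ]
            if not open_neighbours:
                continue  # point is dropped from the frontier
            nr, nc = rng.choice(open_neighbours)
            grid[nr][nc] = grid[r][c]  # neighbour joins the same region
            next_frontier.append((r, c))
            next_frontier.append((nr, nc))
        frontier = next_frontier
    return grid
```

With at least one seed point, every cell ends up with a type, because a point only leaves the frontier once it has no unassigned neighbours.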
How about using a 2-step process:
Choose a bunch of n points randomly -- these will become the centres of the circles.
Determine the radii of these circles so that they do not overlap.
For step 2, for each circle centre you need to know the distance to its nearest neighbour. (This can be computed for all points in O(n^2) time using brute force, although it may be that faster algorithms exist for points in the plane.) Then simply divide that distance by 2 to get a safe radius. (You can also shrink it further, either by a fixed amount or by an amount proportional to the radius, to ensure that no circles will be touching.)
To see that this works, consider any point p and its nearest neighbour q, which is some distance d from p. If p is also q's nearest neighbour, then both points will get circles with radius d/2, which will therefore be touching; OTOH, if q has a different nearest neighbour, it must be at distance d' < d, so the circle centred at q will be even smaller. So either way, the 2 circles will not overlap.
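Step 2 with the brute-force O(n^2) nearest-neighbour search could be written like this in Python. It is a minimal sketch that assumes at least two distinct points; the name `safe_radii` is made up for the example.

```python
import math


def safe_radii(points):
    """For each (x, y) centre, return half the distance to its nearest
    neighbour, so that no two circles overlap (they may touch)."""
    radii = []
    for i, (x, y) in enumerate(points):
        nearest = min(
            math.hypot(x - qx, y - qy)
            for j, (qx, qy) in enumerate(points)
            if j != i
        )
        radii.append(nearest / 2)
    return radii
```

For the centres `[(0, 0), (2, 0), (10, 0)]` this returns `[1.0, 1.0, 4.0]`: the first two circles touch, and the third is well clear of both.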
My idea would be to start out with a compact grid layout. Then take each circle and perturb it in some random direction. The distance in which you perturb it can also be chosen at random (just make sure that the distance doesn't make it overlap another circle).
This is just an idea and I'm sure there are a number of ways you could modify it and improve upon it.
