Algorithm placing nodes in 2D - Diagram Creation - algorithm

Is there an unlicensed algorithm for placing nodes/vertices in a compact, clear way, with nodes being close to each other without overlapping and having short edges and those with many links not being all in the center etc, i.e. all that matters in a good diagram?
In other words, how do I lay out a graph in the most clearly arranged way?
Oh, and the nodes are rectangles (as I said, it's for a diagram) that can differ in size depending on their content.

Related

Arranging nodes-edges for 'good looking' graph layout

I came across the following graph layout proposed in the NodeTrix paper:
The big blocks that are visible are nodes themselves (A sort of composite node of a sub-graph).
I see that the edges are some sort of curves which seem not to intersect each other too much. Also, the nodes and edges don't intersect each other. The paper doesn't talk about it, btw.
I was hoping to implement this visualization. I have the following doubts:
Q1. Is there some specific algorithm to arrange nodes and edges so that the graph looks good, as shown in this paper? Any other algorithm in general?
Q2. Is there some special algorithm for the curved edges shown above as well?
It would be great if someone could figure out the exact algorithm in the above figure visually, but some general similar algorithm should also do.
One algorithm is Force-directed graph drawing. It will produce an output very different from the posted picture, but it is quite popular and might give you a place to start looking.
To be honest, I suspect that the shown graph is manually laid out.
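As a starting point for the force-directed approach, here is a minimal sketch in the style of Fruchterman-Reingold. It treats nodes as points and ignores box sizes, so it is not the customized, size-aware variant a real diagram would need. The function name `force_layout` and all of its parameters are made up for illustration.

```python
import math
import random

def force_layout(nodes, edges, iterations=200, width=1.0, height=1.0, seed=0):
    """Return {node: (x, y)} for a list of nodes and a list of (a, b) edges."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, width), rng.uniform(0, height)] for n in nodes}
    k = math.sqrt(width * height / max(len(nodes), 1))  # preferred edge length
    temperature = width / 10.0  # maximum move per iteration, cooled over time
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels each other.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Connected nodes attract each other.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node along its net force, limited by the temperature.
        for n in nodes:
            d = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            step = min(d, temperature)
            pos[n][0] += disp[n][0] / d * step
            pos[n][1] += disp[n][1] / d * step
        temperature *= 0.95
    return {n: (p[0], p[1]) for n, p in pos.items()}
```

The pairwise repulsion loop is quadratic in the number of nodes, which is fine for hand-drawn diagrams but would need a spatial index for large graphs.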
EDIT: Answer to comment
In the example all nodes are square boxes, and the edges start/end diagonal to the sides of the boxes. A way to do this could be:
Place boxes using force-direction (or likely some customized version of it, where forces depend on the size of the box).
Imagine a "guide-edge" going directly between the centers of the boxes.
Calculate the places where the guide-edge intersects the boxes, and use those as the start/end points of the real, drawn edge.
Make the real edge start diagonal to the sides, and use Bézier curves to draw the curve.
You probably want to represent this in some vector format that has Bézier curves built in, e.g., SVG.
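The guide-edge steps above can be sketched roughly as follows. The helper names `box_exit_point` and `svg_edge` are hypothetical, boxes are assumed to be given as (center x, center y, width, height), and bending the curve with a perpendicular control-point offset is my own simplification, not something taken from the paper.

```python
def box_exit_point(cx, cy, w, h, tx, ty):
    """Point where the line from the box center towards (tx, ty) crosses the box border."""
    dx, dy = tx - cx, ty - cy
    if dx == 0 and dy == 0:
        return (cx, cy)
    # Scale the direction vector until it reaches the nearer of the two borders.
    sx = (w / 2) / abs(dx) if dx else float("inf")
    sy = (h / 2) / abs(dy) if dy else float("inf")
    s = min(sx, sy)
    return (cx + dx * s, cy + dy * s)

def svg_edge(box_a, box_b, bend=0.25):
    """SVG path data for a cubic Bezier edge between two boxes (cx, cy, w, h)."""
    ax, ay = box_exit_point(*box_a, box_b[0], box_b[1])
    bx, by = box_exit_point(*box_b, box_a[0], box_a[1])
    dx, dy = bx - ax, by - ay
    # Shift both control points sideways (perpendicular to the guide edge).
    nx, ny = -dy * bend, dx * bend
    c1 = (ax + dx / 3 + nx, ay + dy / 3 + ny)
    c2 = (ax + 2 * dx / 3 + nx, ay + 2 * dy / 3 + ny)
    return "M {:.1f} {:.1f} C {:.1f} {:.1f}, {:.1f} {:.1f}, {:.1f} {:.1f}".format(
        ax, ay, *c1, *c2, bx, by)
```

The returned string can be used as the `d` attribute of an SVG `<path>` element.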

How to fill an area with a collection of rectangles and no space left

I have a collection of different sized rectangles that need to be placed on a 2D surface of a known dimension, with the condition that:
there is no overlap between the blocks,
the rectangles may not be rotated,
there is no area left blank (whole surface needs to be filled).
The rectangles are actually generated by breaking down the surface area, so they ought to fill up the area completely once you start putting everything together.
I figured there would be a dozen algorithms available for this problem, but the algorithms I find are more like best effort algorithms such as sprite generators that do not have the precondition that the whole area needs to be (can be...) filled -- which obviously is not necessary when building sprites, however, it is in my case.
I am a bit lost here, either this problem isn't as simple as I thought, or I am searching on wrong keywords.
Some topics I have found but do not fully suit my needs:
What algorithm can be used for packing rectangles of different sizes into the smallest rectangle possible in a fairly optimal way? (in my case, the area is preset)
How to arrange N rectangles to cover minimum area (in my case, minimum area must equal zero)
Is there any algorithm out there that may suit my needs?
IMHO, the most natural solution is recursive, because the shape of the remaining area is not fixed: after removing a rectangle from it, we have the same task, only with a smaller area and one rectangle fewer.
I would start from the edges, because the possible variants there are already limited. So simply go in a spiral, trying to put rectangles along the edge. If no rectangle fits, backtrack. That will be the simplest, and not too slow, brute-force method.
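A minimal backtracking sketch of this recursive idea follows. It assumes integer sizes and coordinates, and it always fills the first free cell in row-major order instead of walking a spiral, which is a swapped-in simplification: that cell must be the top-left corner of some rectangle in any valid tiling, so no solutions are lost. The name `tile` is hypothetical.

```python
def tile(width, height, rects):
    """Place every (w, h) in rects on a width x height grid, no gaps, no rotation.

    Returns a list of (x, y, w, h) placements, or None if no exact tiling exists.
    """
    if sum(w * h for w, h in rects) != width * height:
        return None
    grid = [[False] * width for _ in range(height)]
    used = [False] * len(rects)
    placed = []

    def first_free():
        for y in range(height):
            for x in range(width):
                if not grid[y][x]:
                    return x, y
        return None

    def fits(x, y, w, h):
        if x + w > width or y + h > height:
            return False
        return all(not grid[yy][xx]
                   for yy in range(y, y + h) for xx in range(x, x + w))

    def mark(x, y, w, h, value):
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                grid[yy][xx] = value

    def solve():
        cell = first_free()
        if cell is None:
            return True  # grid is full, and the area check means all rects are used
        x, y = cell
        tried = set()  # skip rectangles of a size already tried at this cell
        for i, (w, h) in enumerate(rects):
            if used[i] or (w, h) in tried or not fits(x, y, w, h):
                continue
            tried.add((w, h))
            used[i] = True
            mark(x, y, w, h, True)
            placed.append((x, y, w, h))
            if solve():
                return True
            placed.pop()  # undo and try the next rectangle
            mark(x, y, w, h, False)
            used[i] = False
        return False

    return placed if solve() else None
```

The worst case is exponential, so this is only practical for a modest number of rectangles.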

Packing arbitrary polygons within an arbitrary boundary

I was wondering if anybody could point me to the best algorithm/heuristic to fit my particular polygon packing problem. I am given a single polygon as a boundary (convex or concave; it may also contain holes) and a single "fill" polygon (which may also be convex or concave, but does not contain holes), and I need to fill the boundary polygon with a specified number of fill polygons. (I'm working in 2D.)
Many of the polygon packing heuristics I've found assume that the boundary and/or filling polygons will be rectangular and also that the filling polygons will be of different sizes. In my case, the filling polygons may be non-rectangular, but all will be exactly the same.
Maybe this is a particular type of packing problem? If somebody has a definition for this type of polygon packing I'll gladly google away, but so far I've not found anything which is similar enough to be of great use.
Thanks.
The question you ask is very hard. To put this in perspective, the (much) simpler case where you're packing the interior of your bounded polygon with non-overlapping disks is already hard, and disks are the simplest possible "packing shape" (with any other shape you have to consider orientation as well as size and center location).
In fact, I think it's an open problem in computational geometry to determine for an arbitrary integer N and arbitrary bounded polygonal region (in the Euclidean plane), what is the "optimal" (in the sense of covering the greatest percentage of the polygon interior) packing of N inscribed non-overlapping disks, where you are free to choose the radius and center location of each disk. I'm sure the "best" answer is known for certain special polygonal shapes (like rectangles, circles, and triangles), but for arbitrary shapes your best "heuristic" is probably:
1. Start your shape counter at N.
2. Add the largest "packing shape" you can fit completely inside the polygonal boundary without overlapping any other packing shapes.
3. Decrement your shape counter.
4. If your shape counter is > 0, go to step 2.
I say "probably" because "largest first" isn't always the best way to pack things into a confined space. You can dig into that particular flavor of craziness by reading about the bin packing problem and knapsack problem.
EDIT: Step 2 by itself is hard. A reasonable strategy would be to pick an arbitrary point on the interior of the polygon as the center and "inflate" the disk until it touches either the boundary or another disk (or both), and then "slide" the disk while continuing to inflate it so that it remains inside the boundary without overlapping any other disks until it is "trapped" - with at least 2 points of contact with the boundary and/or other disks. But it isn't easy to formalize this "sliding process". And even if you get the sliding process right, this strategy doesn't guarantee that you'll find the biggest "inscribable disk" - your "locally maximal" disk could be trapped in a "lobe" of the interior which is connected by a narrow "neck" of free space to a larger "lobe" where a larger disk would fit.
Thanks for the replies, my requirements were such that I was able to further simplify the problem by not having to deal with orientation and I then even further simplified by only really worrying about the bounding box of the fill element. With these two simplifications the problem became much easier and I used a stripe like filling algorithm in conjunction with a spatial hash grid (since there were existing elements I was not allowed to fill over).
With this approach I simply divided the fill area into stripes and created a spatial hash grid to register existing elements within the fill area. I created a second spatial hash grid to register the fill area (since my stripes were not guaranteed to be within the bounding area, this made checking if my fill element was in the fill area a little faster since I could just query the grid and if all grids where my fill element were to be placed, were full, I knew the fill element was inside the fill area). After that, I iterated over each stripe and placed a fill element where the hash grids would allow. This is certainly not an optimal solution, but it ended up being all that was required for my particular situation and pretty fast as well. I found the required information about creating a spatial hash grid from here. I got the idea for filling by stripes from this article.
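A rough sketch of the stripe-plus-hash-grid idea, under simplifying assumptions that are not in the post: the fill area is an axis-aligned rectangle (so the second grid for the fill area is not needed), all rectangles are (x, y, w, h) tuples, and the names `SpatialHash` and `stripe_fill` are hypothetical.

```python
import math
from collections import defaultdict

class SpatialHash:
    """Uniform grid mapping cell coordinates to the rectangles touching that cell."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _keys(self, rect):
        x, y, w, h = rect
        c = self.cell_size
        for gx in range(math.floor(x / c), math.floor((x + w) / c) + 1):
            for gy in range(math.floor(y / c), math.floor((y + h) / c) + 1):
                yield gx, gy

    def insert(self, rect):
        for key in self._keys(rect):
            self.cells[key].append(rect)

    def collides(self, rect):
        x, y, w, h = rect
        for key in self._keys(rect):
            for ox, oy, ow, oh in self.cells.get(key, ()):
                # Strict overlap test: rectangles that only touch do not collide.
                if x < ox + ow and ox < x + w and y < oy + oh and oy < y + h:
                    return True
        return False

def stripe_fill(area, fill_w, fill_h, obstacles):
    """Place fill_w x fill_h boxes in horizontal stripes inside area, avoiding obstacles."""
    ax, ay, aw, ah = area
    grid = SpatialHash(max(fill_w, fill_h))
    for rect in obstacles:
        grid.insert(rect)
    placed = []
    y = ay
    while y + fill_h <= ay + ah:          # one stripe per iteration
        x = ax
        while x + fill_w <= ax + aw:
            candidate = (x, y, fill_w, fill_h)
            if not grid.collides(candidate):
                placed.append(candidate)
            x += fill_w
        y += fill_h
    return placed
```

Because candidates sit on a fixed lattice they never overlap each other, so only the existing elements need to be registered in the grid.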
This type of problem is very complex to solve geometrically.
If you can accept a good solution instead of the 100% optimal
solution, then you can solve it with a raster algorithm.
You draw (rasterize) the boundary polygon into one in-memory
image and the fill polygon into another in-memory image.
You can then more easily search for a place where the fill polygon will
fit in the boundary polygon by overlaying the two images with
various (X, Y) offsets for the fill polygon and checking
the pixel values.
When you find a place that the fill polygon fits,
you clear the pixels in the boundary polygon and repeat
until there are no more places where the fill polygon fits.
The keywords to google search for are: rasterization, overlay, algorithm
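A minimal sketch of this raster approach, using plain nested lists of 0/1 instead of a real image library. It assumes the fill mask is already rasterized and trimmed to its bounding box, and that orientation is fixed; the names `fits` and `raster_pack` are hypothetical.

```python
def fits(free, piece, ox, oy):
    """True if every set pixel of piece lands on a free pixel at offset (ox, oy)."""
    for py, row in enumerate(piece):
        for px, filled in enumerate(row):
            if filled:
                y, x = oy + py, ox + px
                if y >= len(free) or x >= len(free[0]) or not free[y][x]:
                    return False
    return True

def raster_pack(boundary, piece, max_count=None):
    """Greedily place copies of piece inside boundary; returns a list of (x, y) offsets."""
    free = [list(row) for row in boundary]  # work on a copy of the boundary mask
    placements = []
    for oy in range(len(free)):
        for ox in range(len(free[0])):
            if max_count is not None and len(placements) >= max_count:
                return placements
            if fits(free, piece, ox, oy):
                # Clear the covered pixels so later copies cannot overlap this one.
                for py, row in enumerate(piece):
                    for px, filled in enumerate(row):
                        if filled:
                            free[oy + py][ox + px] = 0
                placements.append((ox, oy))
    return placements
```

A single top-to-bottom scan is enough here, because clearing pixels can only remove candidate positions, never create new ones.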
If your fill polygon is the shape of a jigsaw piece, many algorithms will miss the interlocking alignment. (I don't know what to suggest in that case)
One approach to the general problem that works well when the boundary is much larger than
the fill pieces is to tile an infinite plane with the pieces in the best way you can, and then look for the optimum alignment of the boundary on this plane.

Suggestions on speeding up edge selection

I am building a graph editor in C# where the user can place nodes and then connect them with either a directed or undirected edge. When finished, an A* pathfinding algorithm determines the best path between two nodes.
What I have: A Node class with an x, y, list of connected nodes and F, G and H scores.
An Edge class with a Start, Finish and whether or not it is directed.
A Graph class which contains a list of Nodes and Edges as well as the A* algorithm
Right now when a user wants to select a node or an edge, the mouse position gets recorded and I iterate through every node and edge to determine whether it should be selected. This is obviously slow. I was thinking I could implement a QuadTree for my nodes to speed it up; however, what can I do to speed up edge selection?
Since users are "drawing" these graphs I would assume they include a number of nodes and edges that humans would likely be able to generate (say 1-5k max?). Just store both in the same QuadTree (assuming you already have one written).
You can easily extend a classic QuadTree into a PMR QuadTree which adds splitting criteria based on the number of line segments crossing through them. I've written a hybrid PR/PMR QuadTree which supported bucketing both points and lines, and in reality it worked with a high enough performance for 10-50k moving objects (rebalancing buckets!).
So your problem is that the person has already drawn a set of nodes and edges, and you'd like to make the test to figure out which edge was clicked on much faster.
Well an edge is a line segment. For the purpose of filtering down to a small number of possible candidate edges, there is no harm in extending edges into lines. Even if you have a large number of edges, only a small number will pass close to a given point so iterating through those won't be bad.
Now divide edges into two groups. Vertical, and not vertical. You can store the vertical edges in a sorted datastructure and easily test which vertical lines are close to any given point.
The not vertical ones are more tricky. For them you can draw vertical boundaries to the left and right of the region where your nodes can be placed, and then store each line as the pair of heights at which the line intersects those lines. And you can store those pairs in a QuadTree. You can add to this QuadTree logic to be able to take a point, and search through the QuadTree for all lines passing within a certain distance of that point. (The idea is that at any point in the QuadTree you can construct a pair of bounding lines for all of the lines below that point. If your point is not between those lines, or close to them, you can skip that section of the tree.)
I think you have all the ingredients already.
Here's a suggestion:
Index all your edges in a spatial data structure (could be QuadTree, R-Tree etc.). Every edge should be indexed using its bounding box.
Record the mouse position.
Search for the most specific rectangle containing your mouse position.
This rectangle should have one or more edges/nodes; Iterate through them, according to the needed mode.
(The tricky part): If the user has not indicated any edge from the most specific rectangle, you should go up one level and iterate over the edges included in this level. Maybe you can do without this.
This should be faster.
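Here is a small sketch of the bounding-box indexing idea, written in Python rather than C# for brevity. It swaps the QuadTree/R-Tree for a plain uniform grid, which is simpler but still avoids scanning every edge; the class name `EdgeIndex`, the cell size, and the pixel tolerance are all assumptions.

```python
import math
from collections import defaultdict

def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to the segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

class EdgeIndex:
    """Uniform grid that indexes each edge by its bounding box."""

    def __init__(self, cell_size=64, tolerance=5):
        self.cell_size = cell_size
        self.tolerance = tolerance
        self.cells = defaultdict(list)
        self.edges = {}

    def add(self, edge_id, ax, ay, bx, by):
        self.edges[edge_id] = (ax, ay, bx, by)
        c, t = self.cell_size, self.tolerance
        # Pad the box by the tolerance so near misses land in the same cells.
        x0 = math.floor((min(ax, bx) - t) / c)
        x1 = math.floor((max(ax, bx) + t) / c)
        y0 = math.floor((min(ay, by) - t) / c)
        y1 = math.floor((max(ay, by) + t) / c)
        for gx in range(x0, x1 + 1):
            for gy in range(y0, y1 + 1):
                self.cells[(gx, gy)].append(edge_id)

    def pick(self, px, py):
        """Return the id of the closest edge within tolerance, or None."""
        key = (math.floor(px / self.cell_size), math.floor(py / self.cell_size))
        best, best_dist = None, self.tolerance
        for edge_id in self.cells.get(key, ()):
            d = point_segment_distance(px, py, *self.edges[edge_id])
            if d <= best_dist:
                best, best_dist = edge_id, d
        return best
```

Long diagonal edges have large bounding boxes and end up in many cells, which is where a tree that splits on line segments (such as the PMR QuadTree mentioned earlier) would do better.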

Trying to understand Quadtree concept and apply it to storing coloring info of an image

I've read so many articles, but none seem to answer this question. Or maybe I'm just not understanding. I'm attempting to build a quadtree so that it can represent an image. The leaf nodes are to hold pixels, and non-leaf nodes will hold the average value pixel of its children.
My question is:
How does it work that the leaf nodes only hold pixels? Why don't the other nodes hold pixels? And how do we know how many times to subdivide our original root node to represent that given image? Do we just subdivide it n times, where n is the height and width (for a square)?
Edit: So how do I keep track of leaf nodes, so I know when to add pixels at that location? Right now I have a helper function that divides the regions for me, keeping track of width and height.
Quadtrees work best for square images whose size is a power of 2 (for example, most textures). You shouldn't think of each node as representing a "pixel". Instead, think of it as representing a "square block of pixels of size 2^k". In the case of final leaves, k is 0, so each leaf node represents a square block of pixels of size 1, that is, a single pixel. Internal nodes in the tree represent increasingly large sections of image.
Why do only leaf nodes hold pixels? Ask yourself if a non-leaf node held a pixel, then what would its children hold? Since you can't subdivide a pixel, the answer is obviously nothing -- there can be no such nodes.
How do we know how many times to subdivide? Well, there are multiple ways to do it, of course, depending on why you're building the quadtree. In general, the areas of the image with more entropy -- more "detail" -- should be subdivided more, while the lower-entropy, "flatter" areas can be divided less. There are a number of different algorithms for choosing when and where to subdivide. Generally, they compare pixel values within a region, and split when the differences are above some threshold.
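To make the threshold-based splitting concrete, here is a minimal sketch. It assumes a square grayscale image stored as a 2D list whose side is a power of two, and it uses "max minus min above a threshold" as the split rule, which is just one of the possible criteria. The names `build_quadtree` and `count_leaves` are hypothetical.

```python
def build_quadtree(image, x, y, size, threshold):
    """Build a quadtree node for the size x size block whose top-left corner is (x, y)."""
    pixels = [image[yy][xx]
              for yy in range(y, y + size) for xx in range(x, x + size)]
    node = {
        "x": x, "y": y, "size": size,
        "average": sum(pixels) / len(pixels),  # stored on every node, leaf or not
        "children": None,
    }
    # Split only while the block is larger than one pixel and not flat enough.
    if size > 1 and max(pixels) - min(pixels) > threshold:
        half = size // 2
        node["children"] = [
            build_quadtree(image, cx, cy, half, threshold)
            for cy in (y, y + half) for cx in (x, x + half)
        ]
    return node

def count_leaves(node):
    """Number of leaf blocks under node."""
    if node["children"] is None:
        return 1
    return sum(count_leaves(child) for child in node["children"])
```

With this rule a leaf is either a single pixel or a larger block that is uniform enough to be represented by its average, so flat regions cost only one node.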
How does it work that the leaf nodes only hold pixels? Why don't the other nodes hold pixels?
This depends on what you're using the quadtree for. You can attach any kind of information to the other nodes, for example a pointer to the upper-left corner and the width/height of the rectangle the node describes, but you won't need it in most cases (or you may need things like average values, which you can precompute to speed things up).
And how do we know how many times to subdivide our original root node to represent that given image?
With every subdivision, you halve the width and height of a region, so for a square image of size n (a power of two) you'll need to subdivide log2(n) times; for a non-square image of size n*m you'll need at most ceil(max(log2(n), log2(m))) steps.
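As a quick check of that count, here is a tiny sketch that computes the number of halvings needed to get from the full image down to single pixels, rounding up when the larger side is not a power of two. The function name `subdivisions_needed` is hypothetical; it uses integer bit lengths rather than floating-point logarithms to avoid rounding surprises.

```python
def subdivisions_needed(width, height):
    """Number of halvings until a width x height region is down to single pixels."""
    longest = max(width, height)
    # (longest - 1).bit_length() equals ceil(log2(longest)) for longest >= 1.
    return (longest - 1).bit_length()
```

For example, a 256x256 image needs 8 levels, while a 640x480 image needs 10, because 640 lies between 2^9 and 2^10.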
I think the best way to answer your question is to answer two questions that you didn't ask.
What is a quadtree?
How can this be applied to modelling systems of erratic density?
A quadtree is a binary tree in two dimensions. That's why there are (up to) four children for every non-leaf node. This allows you to apply an index to a plane just as a database uses a binary-tree or some variation thereof to index a single dimension, with the same highly advantageous sparse phase space representation properties.
The application of this to image compression and progressive display is pretty obvious: if you do a tree-walk limited to a depth of n, then you get 4^n items of picture info spanning the entire image space. If you go one level deeper, each pixel splits into four. JPEG2000 works like this, if I recall correctly. I said "items of picture info" because they need not be single bits; items could be 32-bit ARGB or any other property (or properties) describing the space at that point.

Resources