Adding cycles to a Minimum Spanning Tree without moving the points? - algorithm

I am generating a dungeon layout for a video game. I have created the rooms, spaced them out using seperation steering, and created a fully connected weighted, undirected graph of the rooms. Then I calculated a MST using Prim's Algorithm, all using GML (GameMaker Language). I miss Python.
My intention is to add additional edges to reintroduce loops, so a player does not have to always return along a path, and to make layouts more interesting. The problem is, these edges cannot cross, and I would prefer not to have to move the points around. I had been given a recommendation to use Delaunay Triangulation, but if I am honest this is completely over my head, and may not be a viable solution in GML. I am asking for any suggestions on algorithms that I could use to identify edges that I could add that do not intersect previously created edges.
I have included an image of the MST (the lines connect to the corners of the red markers, even if the image shows they stop short)

If I'm understanding your question correctly, we're looking at more of a geometry problem than a graph theory problem. You have existing points and line segments with concrete locations in 2d space, and you want to add new line segments that will not intersect existing line segments.
For checking whether you can connect two nodes, node1 and node2, you can iterate through all existing edges and see whether the line segment node1---node2 would intersect the line segment edge.endpoint1 --- edge.endpoint2. The key function that checks whether two line segments intersect can be implemented with any of the solutions found here: How can I check if two segments intersect?.
That would take O(E) time and look something like
def canAddEdge(node1, node2):
canAdd = True
for edge in graph:
canAdd = canAdd and not doesIntersect([node1.location(),
node2.location(), edge.endpoint1.location(), edge.endpoint2.location()])
And you can get a list of valid edges to add in O(EV^2) with something like
def getListOfValidEdges(graph):
validEdges = []
for index,firstEndpointNode in enumerate(graph.nodes()):
for secondEndpointNode in graph.nodes()[index:]:
if (canAddEdge(firstEndpointNode, secondEndpointNode)):
validEdges.append([firstEndpointNode, secondEndpointNode])
return validEdges
Of course, you would need to recalculate the valid edges every time after adding a new edge.

Related

Merge adjacent vertices of a graph until single vertex left in the fewest steps possible

I have a game system that can be represented as an undirected, unweighted graph where each vertex has one (relevant) property: a color. The goal of the game in terms of the graph representation is to reduce it down to one vertex in the fewest "steps" possible. In each step, the player can change the color of any one vertex, and all adjacent vertices of the same color are merged with it. (Note that in the example below I just happened to show the user only changing one specific vertex the whole game, but the user can pick any vertex in each step.)
What I am after is a way to compute the fewest amount of steps necessary to "beat" a given graph per the procedure described above, and also provide the specific moves needed to do so. I'm familiar with the basics of path-finding, BFS, and things of that nature, but I'm having a hard time framing this problem in terms of a "fastest path" solution.
I am unable to find this same problem anywhere on Google, or even a graph-theory term that encapsulates the problem. Does anyone have an idea of at least how to get started approaching this problem? Can anyone point me in the right direction?
EDIT Since this problem seems to be really difficult to solve efficiently, perhaps I could change the aim of my question. Could someone describe how I would even set up a brute force, breadth first search for this? (Brute force could possibly be okay, since in practice these graphs will only be 20 vertices at most.) I know how to write a BFS for a normal linked graph data structure... but in this case it seems quite weird since each vertex would have to contain a whole graph within itself, and the next vertices in the search graph would have to be generated based on possible moves to make in the graph within the vertex. How would one setup the data structure and search algorithm to accomplish this?
EDIT 2 This is an old question, but I figured it might help to just state outright what the game was. The game was essentially to be a rip-off of Kami 2 for iOS, except my custom puzzle editor would automatically figure out the quickest possible way to solve your puzzle, instead of having to find the shortest move number by trial and error yourself. I'm not sure if Kami was a completely original game concept, or if there is a whole class of games like it with the same "flood-fill" mechanic that I'm unaware of. If this is a common type of game, perhaps knowing the name of it could allow finding more literature on the algorithm I'm seeking.
EDIT 3 This Stack Overflow question seems like it may have some relevant insights.
Intuitively, the solution seems global. If you take a larger graph, for example, which dot you select first will have an impact on the direct neighbours which will have an impact on their neighbours and so on.
It sounds as if it were of the same breed of problems as the map colouring problem. Not because of the colours but because of the implications of a local selection to the other end of the graph down the road. In the map colouring, you have to decide what colour to draw a country and its neighbouring countries so two countries that touch don't have the same colour. That first set of selections have an impact on whether there is a solution in the subsequent iterations.
Just to show how complex problem is.
Lets check simpler problem where graph is changed with a tree, and only root vertex can change a colour. In that case path to a leaf can be represented as a sequence of colours of vertices on that path. Sequence A of colour changes collapses a leaf if leaf's sequence is subsequence of A.
Problem can be stated that for given set of sequences problem is to find minimal length sequence (S) so that each initial sequence is contained in S. That is called shortest common supersequence problem, and it is NP-complete.
Your problem is for sure more complex than this one :-/
Edit *
This is a comment on question's edit. Check this page for a terms.
Number of minimal possible moves is >= than graph radius. With that it seems good strategy to:
use central vertices for moves,
use moves that reduce graph radius, or at least reduce distance from central vertices to 'large' set of vertices.
I would go with a strategy that keeps track of central vertices and distances of all graph vertices to these central vertices. Step is to check all meaningful moves and choose one that reduce radius or distance to central vertices the most. I think BFS can be used for distance calculation and how move influences them. There are tricky parts, like when central vertices changes after moves. Maybe it is good idea to use not only central vertices but also vertices close to central.
I think the graph term you are looking for is the "valence" of a graph, which is the number of edges that a node is connected to. It looks like you want to change the color based on what node has the highest valence. Then in the resulting graph change the color for the node that has the highest valence, etc. until you have just one node left.

Finding the starting vertex for Dijkstra's algorithm?

Imagine I am implementing Dijkstra's algorithm at a park. There are points and connections between those points; these specify valid paths the user can walk on (e.g. sidewalks).
Now imagine that the user is on the grass (i.e. not on a path) and wants to navigate to another location. The problem is not in Dijkstra's algorithm (which works fine), the problem is determining at which vertex to begin.
Here is a picture of the problem: (ignore the dotted lines for now)
Black lines show the edges in Dijkstra's algorithm; likewise, purple circles show the vertices. Sidewalks are in gray. The grass is, you guessed it, green. The user is located at the red star, and wants to get to the orange X.
If I naively look for the nearest vertex and use that as my starting point, the user is often directed to a suboptimal path, that involves walking further away from their destination at the start (i.e. the red solid path).
The blue solid path is the optimal path that my algorithm would ideally come up with.
Notes:
Assume no paths cross over other paths.
When navigating to a starting point, the user should never cross over a path (e.g. sidewalk).
In the image above, the first line segment coming out of the star is created dynamically, simply to assist the user. The star is not a vertex in the graph (since the user can be anywhere inside the grass region). The line segment from the star to a vertex is simply being displayed so that the user knows how to get to the first valid vertex in the graph.
How can I implement this efficiently and correctly?
Idea #1: Find the enclosing polygon
If I find the smallest polygon which surrounds my starting point, I can now create new paths for Dijkstra's algorithm from the starting point (which will be added as a new vertex temporarily) to each of the vertices that make up the polygon. In the example above, the polygon has 6 sides, so this would mean creating 6 new paths to each of its vertices (i.e. the blue dotted lines). I would then be able to run Dijkstra's algorithm and it would easily determine that the blue solid line is the optimal path.
The problem with this method is in determining which vertices comprise the smallest polygon that surrounds my point. I cannot create new paths to each vertex in the graph, otherwise I will end up with the red dotted lines as well, which completely defeats the purpose of using Dijkstra's algorithm (I should not be allowed to cross over a sidewalk). Therefore, I must take care to only create paths to the vertices of the enclosing polygon. Is there an algorithm for this?
There is another complication with this solution: imagine the user now starts at the purple lightning bolt. It has no enclosing polygon, yet the algorithm should still work by connecting it to the 3 points at the top right. Again, once it is connected to those, running Dijkstra's is easy.
Update: the reason we want to connect to one of these 3 points and not walk around everything to reach the orange X directly is because we want to minimize the walking done on unpaved paths. (Note: This is only a constraint if you start outside a polygon. We don't care how long you walk on the grass if it is within a polygon).
If this is the correct solution, then please post its algorithm as an answer.
Otherwise, please post a better solution.
You can start off by running Dijkstra from the target to find its distance to all vertices.
Now let's consider the case where you start "inside" the graph on the grass. We want to find all vertices that we can reach via a straight line without crossing any edge. For that we can throw together all the line segments representing the edges and the line segments connecting the start point to every vertex and use a sweep-line algorithm to find whether the start-vertex lines intersect any edge.
Alternatively you can use any offline algorithm for planar point location, those also work with a sweep line. I believe this is in the spirit of the more abstract algorithm proposed in the question in that it reports the polygon that surrounds the point.
Then we just need to find the vertex whose connection line to the start does not intersect any edge and the sum d(vertex, target) + d(vertex, start) is minimum.
The procedure when the vertex is outside the graph is somewhat underspecified, but I guess the exact same idea would work. Just keep in mind that there is the possibility to walk all around the graph to the target if it is on the border, like in your example.
This could probably be implemented in O((n+m) log m) per query. If you run an all-pairs shortest path algorithm as a preprocessing step and use an online point location algorithm, you can get logarithmic query time at the cost of the space necessary to store the information to speed up shortest path queries (quadratic if you just store all distance pairs).
I believe simple planar point location works just like the sweep line approaches, only with persistent BSTs to store all the sweepline states.
I'm not sure why you are a bothering with trying to find a starting vertex when you already have one. The point you (the user) are standing at is another vertex in of itself. So the real question now is to find the distance from your starting point to any other point in the enclosing polygon graph. And once you have that, you can simply run Dijkstra's or another shortest path algorithm method like A*, BFS, etc, to find the shortest path to your goal point.
On that note, I think you are better off implementing A* for this problem because a park involves things like trees, playgrounds, ponds (sometimes), etc. So you will need to use a shortest path algorithm that takes these into consideration, and A* is one algorithm that uses these factors to determine a path of shortest length.
Finding distance from start to graph:
The problem of finding the distance from your new vertex to other vertices can be done by only looking for points with the closest x or y coordinate to your start point. So this algorithm has to find points that form a sort of closure around the start point, i.e. a polygon of minimum area which contains the point. So as #Niklas B suggested, a planar point algorithm (with some modifications) might be able to accomplish this. I was looking at the sweep-line algorithm, but that only works for line segments so that will not work (still worth a shot, with modifications might be able to give the correct answer).
You can also decide to implement this algorithm in stages, so first, find the points with the closest y coordinate to the current point (Both negative and positive y, so have to use absolute value), then among those points, you find the ones with the closest x coordinate to the current point and that should give you the set of points that form the polygon. Then these are the points you use to find the distance from your start to the graph.

Find nearest edge in graph

I want to find the nearest edge in a graph. Consider the following example:
Figure 1: yellow: vertices, black: edges, blue: query-point
General Information:
The graph contains about 10million vertices and about 15million edges. Every vertex has coordinates. Edges are defined by the two adjacent vertices.
Simplest solution:
I could simply calculate the distance from the query-point to every other edge in the graph, but that would be horribly slow.
Idea and difficulties:
My idea was to use some spatial index to accelerate the query. I already implemented a kd-tree to find the nearest vertex. But as Figure 1 shows the edges incident to the nearest vertex are not necessarily the nearest to the query-point. I would get the edge 3-4 instead of the nearer edge 7-8.
Question:
Is there an algorithm to find the nearest edge in a graph?
A very simple solution (but maybe not the one with lowest complexity) would be to use a quad tree for all your edges based on their bounding box. Then you simply extract the set of edges closest to your query point and iterate over them to find the closest edge.
The extracted set of edges returned by the quad tree should be many factors smaller than your original 15 million edges and therefore a lot less expensive to iterate through.
A quad tree is a simpler data structure than the R-tree. It is fairly common and should be readily available in many environments. For example, in Java the JTS Topology Suite has a structure QuadTree that can easily be wrapped to perform this task.
There are spatial query structures which are appropriate for other types of data than points. The most general is the "R-tree" structure (and its many, many variants), which will allow you to store the bounding rectangles of your line segments. You can then search outward from your query points, examining the segments in the bounding rectangles and stopping when the nearest remaining rectangle is further than the closest line encountered so far. This could have poor performance when there are many long line segments overlapping, but for a PSLG such as you seem to have here, that shouldn't happen.
Another option is to use the segments to define a BSP tree, and scan outwards from your point to find all the "visible" lines. This in turn will be problematic if your point can see many edges.
Without proof:
You start with a constrained Delaunay Triangulation, that is a triangulation that takes the existing edges into account. E.g. CGAL or Triangle can do this. For each query point you determine which triangle it belongs to. Then you you only have to check the edges touching a corner of that triangle.
I think this should work in most cases, but there are certainly corner cases where it fails, e.g. when there are many vertices without any edge at all, so at least you have to remove those empty vertices.
You can compute the voronoi diagram and run a query on each voronoi cell. You can subdivide the voronoi diagram to get a better result. You can combine metric and voronoi diagram:http://www.cc.gatech.edu/~phlosoft/voronoi/
you could insert extra vertices in long edges to get some approximation based on closest vertices ..

Suggestions on speeding up edge selection

I am building a graph editor in C# where the user can place nodes and then connect them with either a directed or undirected edge. When finished, an A* pathfinding algorithm determines the best path between two nodes.
What I have: A Node class with an x, y, list of connected nodes and F, G and H scores.
An Edge class with a Start, Finish and whether or not it is directed.
A Graph class which contains a list of Nodes and Edges as well as the A* algorithm
Right now when a user wants to select a node or an edge, the mouse position gets recorded and I iterate through every node and edge to determine whether it should be selected. This is obviously slow. I was thinking I can implement a QuadTree for my nodes to speed it up however what can I do to speed up edge selection?
Since users are "drawing" these graphs I would assume they include a number of nodes and edges that humans would likely be able to generate (say 1-5k max?). Just store both in the same QuadTree (assuming you already have one written).
You can easily extend a classic QuadTree into a PMR QuadTree which adds splitting criteria based on the number of line segments crossing through them. I've written a hybrid PR/PMR QuadTree which supported bucketing both points and lines, and in reality it worked with a high enough performance for 10-50k moving objects (rebalancing buckets!).
So your problem is that the person has already drawn a set of nodes and edges, and you'd like to make the test to figure out which edge was clicked on much faster.
Well an edge is a line segment. For the purpose of filtering down to a small number of possible candidate edges, there is no harm in extending edges into lines. Even if you have a large number of edges, only a small number will pass close to a given point so iterating through those won't be bad.
Now divide edges into two groups. Vertical, and not vertical. You can store the vertical edges in a sorted datastructure and easily test which vertical lines are close to any given point.
The not vertical ones are more tricky. For them you can draw vertical boundaries to the left and right of the region where your nodes can be placed, and then store each line as the pair of heights at which the line intersects those lines. And you can store those pairs in a QuadTree. You can add to this QuadTree logic to be able to take a point, and search through the QuadTree for all lines passing within a certain distance of that point. (The idea is that at any point in the QuadTree you can construct a pair of bounding lines for all of the lines below that point. If your point is not between those lines, or close to them, you can skip that section of the tree.)
I think you have all the ingredients already.
Here's a suggestion:
Index all your edges in a spatial data structure (could be QuadTree, R-Tree etc.). Every edge should be indexed using its bounding box.
Record the mouse position.
Search for the most specific rectangle containing your mouse position.
This rectangle should have one or more edges/nodes; Iterate through them, according to the needed mode.
(The tricky part): If the user has not indicated any edge from the most specific rectangle, you should go up one level and iterate over the edges included in this level. Maybe you can do without this.
This should be faster.

planar graph with fixed maximum length of edges

I want to generate random points in a 2D space, this points will be nodes of a planar graph (built using Gabriel graph algorithm or RNG ).
I wrote java code to do this, but I have two hard problem to solve.
1) I need that all edges of the graph are not longer than a given threshold
2) After I want know faces of graph, a face is a collection of nodes connected by edge. A face does not contain within it other nodes. In image below faces are signed by label (F1, F2...)
How to do these two thing ? some algorithms ? There is some way already known?
Below there is an example of the graph that I must to create
http://imageshack.us/photo/my-images/688/immagineps.png/
If you can tolerate some variance in the number of points, then you could modify your Gabriel graph algorithm to be incremental (most of the effort would be making your Delaunay algorithm incremental) and then whenever an edge is too long, insert a random point in the circle having that edge as a diameter.
The most convenient data structures for plane graphs are edge-centric: for example, the doubly-connected edge list and the quad-edge representations. If you're not already using a data structure of this type for the Delaunay step (and I can't imagine why you wouldn't be), you can sort each vertex's outgoing connections by angle. From there, it's easy to implement a function that takes a half-edge and returns the next half-edge on the same face in counterclockwise order. Now iterate through all of the half-edges, and for each half-edge not already visited, iterate around the face until you return to where you started. Label all of the half-edges in the inner iteration as one face.

Resources