Find nearest edge in graph - algorithm

I want to find the nearest edge in a graph. Consider the following example:
Figure 1: yellow: vertices, black: edges, blue: query-point
General Information:
The graph contains about 10 million vertices and about 15 million edges. Every vertex has coordinates. Edges are defined by the two adjacent vertices.
Simplest solution:
I could simply calculate the distance from the query-point to every other edge in the graph, but that would be horribly slow.
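For reference, the brute-force baseline can be sketched as follows. This is only an illustration in Python, assuming 2D coordinates stored in a dict and edges stored as pairs of vertex ids; the function names are made up for this example. The tiny data set mirrors the situation in Figure 1, where the nearest vertices belong to edge 3-4 but edge 7-8 is closer.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def nearest_edge_brute_force(query, vertices, edges):
    """Scan every edge; O(number of edges) per query."""
    best_edge, best_dist = None, math.inf
    for u, v in edges:
        d = point_segment_distance(query, vertices[u], vertices[v])
        if d < best_dist:
            best_edge, best_dist = (u, v), d
    return best_edge, best_dist

# Hypothetical data: vertices 3 and 4 are the closest vertices to the
# query, yet the long edge 7-8 passes closer to it.
vertices = {3: (0, 0), 4: (0, 10), 7: (3, -5), 8: (3, 15)}
edges = [(3, 4), (7, 8)]
edge, dist = nearest_edge_brute_force((2, 5), vertices, edges)
```

With 15 million edges this loop runs in full for every query, which is what a spatial index is meant to avoid.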
Idea and difficulties:
My idea was to use some spatial index to accelerate the query. I already implemented a kd-tree to find the nearest vertex. But, as Figure 1 shows, the edges incident to the nearest vertex are not necessarily the nearest to the query-point: I would get the edge 3-4 instead of the nearer edge 7-8.
Question:
Is there an algorithm to find the nearest edge in a graph?

A very simple solution (but maybe not the one with lowest complexity) would be to use a quad tree for all your edges based on their bounding box. Then you simply extract the set of edges closest to your query point and iterate over them to find the closest edge.
The extracted set of edges returned by the quad tree should be many factors smaller than your original 15 million edges and therefore a lot less expensive to iterate through.
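To make the idea concrete, here is a rough sketch. Note that it swaps the quad tree for a uniform grid, which is simpler to write down but plays the same role: edges are bucketed by the cells their bounding box covers, and a query only looks at nearby buckets. The class name `EdgeGrid`, the `cell_size` parameter and the ring-by-ring search are assumptions of this example, not part of any library.

```python
import math
from collections import defaultdict

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def _ring_cells(cx, cy, r):
    """Cells at Chebyshev distance exactly r from cell (cx, cy)."""
    if r == 0:
        yield (cx, cy)
        return
    for d in range(-r, r + 1):
        yield (cx + d, cy - r)
        yield (cx + d, cy + r)
    for d in range(-r + 1, r):
        yield (cx - r, cy + d)
        yield (cx + r, cy + d)

class EdgeGrid:
    """Buckets each segment into every grid cell its bounding box covers."""

    def __init__(self, segments, cell_size):
        self.segments = list(segments)
        self.cell_size = cell_size
        self.buckets = defaultdict(list)
        for idx, (a, b) in enumerate(self.segments):
            cx0, cx1 = sorted((self._cell(a[0]), self._cell(b[0])))
            cy0, cy1 = sorted((self._cell(a[1]), self._cell(b[1])))
            for cx in range(cx0, cx1 + 1):
                for cy in range(cy0, cy1 + 1):
                    self.buckets[(cx, cy)].append(idx)
        xs = [c[0] for c in self.buckets]
        ys = [c[1] for c in self.buckets]
        self.bounds = (min(xs), max(xs), min(ys), max(ys)) if xs else None

    def _cell(self, value):
        return math.floor(value / self.cell_size)

    def nearest(self, query):
        """Return (segment index, distance) of the segment closest to query."""
        if self.bounds is None:
            return None, math.inf
        qx, qy = self._cell(query[0]), self._cell(query[1])
        x0, x1, y0, y1 = self.bounds
        max_ring = max(abs(qx - x0), abs(qx - x1), abs(qy - y0), abs(qy - y1))
        best_idx, best_dist = None, math.inf
        for ring in range(max_ring + 1):
            for cell in _ring_cells(qx, qy, ring):
                for idx in self.buckets.get(cell, ()):
                    d = point_segment_distance(query, *self.segments[idx])
                    if d < best_dist:
                        best_idx, best_dist = idx, d
            # Segments not seen yet lie entirely in rings > ring, so they
            # are at least ring * cell_size away from the query.
            if best_dist <= ring * self.cell_size:
                break
        return best_idx, best_dist
```

The cell size is a tuning knob: too small and long edges land in many cells, too large and each bucket holds many edges. A real quad tree adapts to the data instead of using one fixed size.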
A quad tree is a simpler data structure than the R-tree. It is fairly common and should be readily available in many environments. For example, in Java the JTS Topology Suite has a Quadtree class that can easily be wrapped to perform this task.

There are spatial query structures which are appropriate for other types of data than points. The most general is the "R-tree" structure (and its many, many variants), which will allow you to store the bounding rectangles of your line segments. You can then search outward from your query points, examining the segments in the bounding rectangles and stopping when the nearest remaining rectangle is further than the closest line encountered so far. This could have poor performance when there are many long line segments overlapping, but for a PSLG such as you seem to have here, that shouldn't happen.
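The stopping rule described above can be illustrated without a full R-tree. The sketch below is a simplification: it keeps one flat priority queue of bounding rectangles (roughly the leaf level of an R-tree) ordered by their distance to the query, and stops as soon as the nearest remaining rectangle is no closer than the best segment found so far. The function names are made up for this example.

```python
import heapq
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def point_rect_distance(p, rect):
    """Distance from p to an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return math.hypot(dx, dy)

def nearest_segment_best_first(query, segments):
    """Return (index, distance, number of segments actually examined)."""
    heap = []
    for idx, (a, b) in enumerate(segments):
        rect = (min(a[0], b[0]), min(a[1], b[1]),
                max(a[0], b[0]), max(a[1], b[1]))
        heap.append((point_rect_distance(query, rect), idx))
    heapq.heapify(heap)
    best_idx, best_dist, examined = None, math.inf, 0
    # The rectangle distance is a lower bound on the segment distance, so
    # once it reaches best_dist nothing left in the heap can be closer.
    while heap and heap[0][0] < best_dist:
        _, idx = heapq.heappop(heap)
        examined += 1
        d = point_segment_distance(query, *segments[idx])
        if d < best_dist:
            best_idx, best_dist = idx, d
    return best_idx, best_dist, examined
```

An actual R-tree applies the same lower-bound test to internal nodes as well, so whole subtrees are skipped without computing a distance for each rectangle in them.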
Another option is to use the segments to define a BSP tree, and scan outwards from your point to find all the "visible" lines. This in turn will be problematic if your point can see many edges.

Without proof:
You start with a constrained Delaunay triangulation, that is, a triangulation that takes the existing edges into account. E.g. CGAL or Triangle can do this. For each query point you determine which triangle it belongs to. Then you only have to check the edges touching a corner of that triangle.
I think this should work in most cases, but there are certainly corner cases where it fails, e.g. when there are many vertices without any edge at all, so at least you have to remove those empty vertices.

You can compute the Voronoi diagram and run a query on each Voronoi cell. You can subdivide the Voronoi diagram to get a better result. You can also combine a metric with the Voronoi diagram: http://www.cc.gatech.edu/~phlosoft/voronoi/

You could insert extra vertices along long edges to get an approximation based on the closest vertices.
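A minimal sketch of that subdivision step, assuming 2D tuples and a hypothetical `max_spacing` parameter that caps the gap between consecutive sample points:

```python
import math

def densify_edge(a, b, max_spacing):
    """Return points along a-b, endpoints included, at most max_spacing apart."""
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    pieces = max(1, math.ceil(length / max_spacing))
    return [
        (a[0] + (b[0] - a[0]) * i / pieces, a[1] + (b[1] - a[1]) * i / pieces)
        for i in range(pieces + 1)
    ]

points = densify_edge((0, 0), (10, 0), 2.5)
```

Each sample point would be stored in the existing kd-tree together with the id of the edge it came from. The result is only approximate: the distance to the nearest sample can exceed the true distance to that edge by up to half the spacing, so a slightly farther edge may win.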


Best way to merge overlapping convex polygons into a single concave polygon?

I am working with several convex polygons that overlap each other and I need to combine them back together to form one single polygon that may be convex or concave.
The problem is always as follows:
1) The polygons that I need to merge together are always convex.
2) The vertices of each polygon are defined in clockwise order.
3) The polygons are never in any specific order.
4) The final polygon can only be simple convex or concave polygon, i.e. no self-intersection, no duplicate vertices or holes in the shape.
Here is an example of the kind of polygons that I am working with.
[Image removed: overlapping convex polygons]
My current approach is to start from the first polygon and vertex by vertex I loop through all vertices of all of the polygons to find overlap. If there is no overlap, I store the vertex for the final outline and continue.
Upon finding overlapping vertices, I determine which polygon to continue to by measuring the angles of the possible paths and by choosing the one that leads towards the outside of the shape.
This method works until I encounter polygons that do not have vertices overlapping each other, but instead one polygon's vertex is overlapping another polygon's side, as is the case with the rectangle in the image.
I am currently planning on solving these situations by running line intersect checks for all shapes that I have not yet processed, but I am convinced that this cannot be the easiest or the best method in terms of performance.
Does someone know how I should approach this problem in a more efficient manner and/or universal manner?
I solved this issue and I'm posting the answer here in case someone else runs into this issue as well.
My first step was to implement a pre-processing loop based on trincot's suggestions.
I calculated the minimum and maximum x and y bounds for each individual shape.
I used these values to determine all overlapping shapes and I stored a simple array for each shape that I could later use to only look at shapes that can overlap each other.
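A sketch of what such a pre-processing pass could look like, assuming each shape is a list of (x, y) tuples; the function names are invented for this illustration and the pairwise loop is the simplest possible version.

```python
def bounding_box(polygon):
    """Axis-aligned bounds (xmin, ymin, xmax, ymax) of a list of (x, y) points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def boxes_overlap(a, b):
    """True if two boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def overlap_candidates(polygons):
    """For each shape, list the indices of shapes whose boxes overlap it."""
    boxes = [bounding_box(p) for p in polygons]
    neighbours = [[] for _ in polygons]
    for i in range(len(polygons)):
        for j in range(i + 1, len(polygons)):
            if boxes_overlap(boxes[i], boxes[j]):
                neighbours[i].append(j)
                neighbours[j].append(i)
    return neighbours

shapes = [
    [(0, 0), (0, 2), (2, 2), (2, 0)],
    [(1, 1), (1, 3), (3, 3), (3, 1)],
    [(10, 10), (10, 11), (11, 11), (11, 10)],
]
nearby = overlap_candidates(shapes)
```

Overlapping boxes do not guarantee overlapping shapes; the lists only narrow down which shapes need the exact checks later on.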
Then, for the actual loop that determines the outline of the final polygon:
1) I start from the first shape and simply compare its vertices to those of the nearby shapes. If there is at least one vertex that isn't shared with another shape, it must be on the outer edge and the loop starts from there. If there are only overlapping vertices, then I add the first shape to a table of checked shapes and repeat this process with another shape until I find a vertex that is on the outer edge.
2) Once the starting vertex is found, the main loop checks the vertices of the starting shape one by one and measures how far the given vertex is from the edges of every nearby shape. If the distance is zero, then the vertex either overlaps with another shape's vertex or lies on the side of another shape.
3) Upon finding the aforementioned type of vertex, I add the previous shape's number to the table of checked shapes so that it isn't checked again. Then, I check if there are other shapes that share this particular vertex. If there are, then I determine the outermost shape and continue from there, starting back from step 2.
4) Once all shapes have been checked, I check that all non-overlapping vertices from the starting shape were indeed added to the outline. If they weren't, I add them at the end.
There may be computationally faster methods, but I found this one simple to write; it meets all of my requirements and is fast enough for my needs.
Given a vertex, you could speed up the search of an "overlapping" vertex or edge as follows:
Finding vertices
Assuming that the coordinates are exact, in the sense that if two vertices overlap, they have exactly the same x and y coordinates, without any "error" of imprecision, then it would be good to first create a hash by x-coordinate, and then for each x-entry you would have a hash by y-coordinate. The value of that inner hash would be a list of polygons that have that vertex.
That structure can be built in O(n) time, and will allow you to find a matching vertex in constant time.
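As an illustration, the nested hash could look like this in Python, assuming exact coordinates as stated above and polygons given as lists of (x, y) tuples. The function names and the `exclude` parameter are made up for this sketch.

```python
from collections import defaultdict

def build_vertex_index(polygons):
    """Map x -> y -> list of ids of the polygons that have a vertex at (x, y)."""
    index = defaultdict(lambda: defaultdict(list))
    for poly_id, polygon in enumerate(polygons):
        for x, y in polygon:
            index[x][y].append(poly_id)
    return index

def polygons_sharing_vertex(index, vertex, exclude=None):
    """Ids of polygons with a vertex exactly at `vertex`, minus `exclude`."""
    x, y = vertex
    # .get() is used so that lookups do not create empty entries.
    return [p for p in index.get(x, {}).get(y, []) if p != exclude]

polygons = [
    [(0, 0), (0, 2), (2, 2), (2, 0)],
    [(2, 0), (2, 2), (4, 2), (4, 0)],
]
index = build_vertex_index(polygons)
```

In Python a single dict keyed by the (x, y) tuple would behave the same way; the two-level form is kept here only to match the description.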
Only if that gives no match, you would go to the next algorithm:
Finding edges
In a pre-processing step (only once), create a segment tree for these polygons where a "segment" corresponds to a min/max x-coordinate range for a particular polygon.
Given a vertex, use the segment tree to find the polygons that are in the right x-coordinate range, i.e. where the x-coordinate of the vertex is within the min/max range of x-coordinates of the polygon.
Iterate those polygons, and eliminate those that do not have a y-coordinate range that contains the y-coordinate of the vertex.
If no polygons remain, the vertex does not participate in any edge of another polygon.
You cannot get more than one polygon here, since that would mean another polygon shares the vertex, which is a case already covered by the hash-based algorithm.
If you get just one polygon, then continue your search by going through the edges of that polygon to find a match -- which is what you already planned on doing (line intersect check), but now you would only need to do it for one polygon.
You could speed that line intersect check up a little bit by first filtering the edges to those that have the right x-range. For convex polygons you would end up with at most two edges. At most one of those two will have the right y-range. If you get such an edge, check whether the vertex is really on that edge.
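The last step, checking whether the vertex really lies on a candidate edge, might be sketched like this. It assumes exact (for example integer) coordinates, so the collinearity test can compare against zero; the segment tree lookup is left out and the function names are hypothetical.

```python
def point_on_segment(p, a, b):
    """True if p lies on the closed segment a-b (exact coordinates assumed)."""
    # Cheap range filter first: p must fall inside the edge's x and y ranges.
    if not (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])):
        return False
    if not (min(a[1], b[1]) <= p[1] <= max(a[1], b[1])):
        return False
    # Collinearity: the cross product of (b - a) and (p - a) must vanish.
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross == 0

def edge_containing(vertex, polygon):
    """Return the first edge (a, b) of polygon that contains vertex, else None."""
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        if point_on_segment(vertex, a, b):
            return a, b
    return None

rectangle = [(0, 0), (0, 4), (4, 4), (4, 0)]  # clockwise
hit = edge_containing((2, 4), rectangle)
miss = edge_containing((2, 2), rectangle)
```

With floating-point input the `cross == 0` test would need a tolerance, which is outside the exact-coordinate assumption made in this answer.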

Small circle inside simple polygon

I've been working on a computational geometry problem and ran across the following problem (which is needed as a subroutine) but failed to find any good references or algorithms.
Given a simple (possibly concave) polygon P, the goal is to compute the center and radius of the smallest circle which is completely contained in P (empty circle) but touches the polygon in at least two places (point or edge). If the two "places" happen to be points of the polygon then there are no constraints. Also no constraints if we hit a point and an edge. But if we hit two edges then they should not be consecutive (assuming clockwise or counter-clockwise order).
I am aiming for an implementable algorithm running in order of n^3 or better. Any pointers, references, or ideas would be very helpful.
Thanks!
Amer
Since you're just looking for pointers or ideas, I'll be brief. The Medial Axis of a polygon is the set of centers of the circles that touch the boundary in two or more locations (https://en.wikipedia.org/wiki/Topological_skeleton#Centers_of_bi-tangent_circles). Also known as a skeleton, the medial axis consists of a tree-like graph made of lines and parabolas. If you check the circles at the vertices of this graph (ignoring the graph vertices that coincide with polygon vertices), you can find both the largest and smallest circles. You'll have to fine tune to accommodate your "no consecutive edges" requirement.

Plotting Distance Constrained Points on a Plane

I'm crossposting this from the mathematics stack exchange at the suggestion of one user who thought somebody here with experience in embedding algorithms might be able to help, though it should be noted that I'm not trying to do a strict graph embedding (which would not allow for vertices to intersect).
Does anybody know of some algorithmic way to tell if it is possible to plot a set of distance constrained points on a Cartesian plane? Or, better still, a method to determine the minimum number of dimensions required to accurately depict the points?
As an example: If you have three points and a constraint that says they are all one unit away from each other, you can plot this easily on a Cartesian plane as an equilateral triangle.
However, if you have the constraints A->B = 1, A->C = 1, and B->C = 3 then you will not be able to plot these points while maintaining their distances.
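The second example fails because it violates the triangle inequality (3 > 1 + 1). A small sketch of that check over all triples is given below; the function name and the dict-of-frozensets input format are assumptions made for this illustration.

```python
from itertools import combinations

def triangle_violations(points, dist, tol=1e-9):
    """Return the triples whose longest side exceeds the sum of the other two.

    dist maps frozenset({u, v}) to the required distance between u and v.
    """
    bad = []
    for a, b, c in combinations(points, 3):
        sides = (dist[frozenset((a, b))],
                 dist[frozenset((a, c))],
                 dist[frozenset((b, c))])
        longest = max(sides)
        if longest > sum(sides) - longest + tol:
            bad.append((a, b, c))
    return bad

equilateral = {frozenset("AB"): 1, frozenset("AC"): 1, frozenset("BC"): 1}
impossible = {frozenset("AB"): 1, frozenset("AC"): 1, frozenset("BC"): 3}
ok = triangle_violations("ABC", equilateral)
broken = triangle_violations("ABC", impossible)
```

Passing this test is necessary but not sufficient for a planar plot: four points that are all one unit apart satisfy every triangle inequality, yet they only fit in three dimensions (a regular tetrahedron).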
However in my case I have a graph with many more than three vertices. The graph is definitely non-planar: one such case involves 1407 vertices all of which are connected by a weighted bidirectional edge that defines the "distance" between the two vertices.
The question is: is there some way to tell if I can depict this graph with accurate distances on a Cartesian plane? I know I can't depict it without edges crossing, but I don't care about doing that. I just want the points on the plane an appropriate distance from each other.
Additional information about the graph in case it helps:
1) Each node represents a set of points.
2) The edge weights are derived by optimally overlaying the point sets from each pair of nodes and then taking the RMSD of the resulting point sets.
3) The sets of points represented by any two nodes can be paired with each other. That is, we can think of each node as a set of 8 points numbered 1-8. This numbering is static. When I overlay node A and node B, the points are numbered identically to when I overlay A and C and B and C.
My thoughts: Because RMSD is a metric on R^3 (At least I believe so. This paper claims to prove it http://onlinelibrary.wiley.com/doi/10.1107/S0108767397010325/abstract), it should be possible for me to do this in R^3 at the very least.
As my real goal here is to turn this set of points into a nice figure, a three dimensional depiction would actually suffice, as I could depict the 3D figure in 2D. I also recognize that numerical instability in the particular optimal overlay algorithm I'm using will cause issues, but I'm interested in the answer for an ideal case.

How to draw a graph with fewer crossings by using broken lines?

I'm dealing with a graph for which I have the coordinates of n nodes and m undirected edges. How can I get a visually clearer drawing (with fewer crossings) if I allow some edges to be drawn as broken lines instead of straight lines?
I know that minimizing the crossing number is NP-hard, so I am just asking for pointers to resources on the topic.
It is also fine to change some nodes' coordinates, as long as they are not moved too far. All in all, the problem is how to find a drawing that is clearer to the eye.
The Graphviz website is a good place to start learning about graph visualization.
The Boost Graph Library (BGL) has plenty of algorithms and data structures to experiment with, and a dual interface (C++ of course, or Python). Boost isn't the easiest way to start, though; Graphviz (which BGL can interface with) is much simpler.
The BGL docs contain many resources you could find useful. For instance, quoting from them:
Any plane drawing separates the plane into distinct regions bordered by graph edges called faces. As a simple example, any embedding of a triangle into the plane separates it into two faces: the region inside the triangle and the (unbounded) region outside the triangle. The unbounded region outside the graph's embedding is called the outer face. Every embedding yields one outer face and zero or more inner faces. A famous result called Euler's formula states that for any planar graph with n vertices, e edges, f faces, and c connected components,
n + f = e + c + 1
This formula implies that any planar graph with no self-loops or parallel edges has at most 3n - 6 edges and 2n - 4 faces. Because of these bounds, algorithms on planar graphs can run in time O(n) or space O(n) on an n vertex graph even if they have to traverse all edges or faces of the graph.
A convenient way to separate the actual planarity test from algorithms that accept a planar graph as input is through an intermediate structure called a planar embedding. Instead of specifying the absolute positions of the vertices and edges in the plane as a plane drawing would, a planar embedding specifies their positions relative to one another. A planar embedding consists of a sequence, for each vertex in the graph, of all of the edges incident on that vertex in the order in which they are to be drawn around that vertex. The orderings defined by this sequence can either represent a clockwise or counter-clockwise iteration through the neighbors of each vertex, but the orientation must be consistent across the entire embedding.
In the Boost Graph Library, a planar embedding is a model of the PlanarEmbedding concept. A type that models PlanarEmbedding can be passed into the planarity test and populated if the input graph is planar. All other "back end" planar graph algorithms accept this populated PlanarEmbedding as an input.
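As a quick sanity check of Euler's formula and the 3n - 6 bound from the quoted passage, here is a tiny Python sketch. It is not BGL code, and the function names are made up for this example.

```python
def face_count(n, e, c=1):
    """Faces (outer face included) of a plane drawing, from n + f = e + c + 1."""
    return e + c + 1 - n

def may_be_planar(n, e):
    """Necessary condition for a simple graph: e <= 3n - 6 once n >= 3."""
    return n < 3 or e <= 3 * n - 6

triangle_faces = face_count(n=3, e=3)   # inside + outside of the triangle
k5_passes = may_be_planar(n=5, e=10)    # complete graph on 5 vertices
```

The edge bound can only rule planarity out. A graph that passes it may still be non-planar, which is why a real planarity test is needed.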

planar graph with fixed maximum length of edges

I want to generate random points in 2D space; these points will be the nodes of a planar graph (built using the Gabriel graph or the relative neighborhood graph (RNG) algorithm).
I wrote Java code to do this, but I have two hard problems to solve.
1) I need all edges of the graph to be no longer than a given threshold.
2) Afterwards, I want to know the faces of the graph. A face is a cycle of nodes connected by edges that does not contain any other nodes inside it. In the image below, faces are marked with labels (F1, F2, ...).
How can I do these two things? Are there known algorithms for this?
Below is an example of the graph that I need to create:
http://imageshack.us/photo/my-images/688/immagineps.png/
If you can tolerate some variance in the number of points, then you could modify your Gabriel graph algorithm to be incremental (most of the effort would be making your Delaunay algorithm incremental) and then whenever an edge is too long, insert a random point in the circle having that edge as a diameter.
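The point-insertion step could be sketched as below (in Python rather than Java, purely for illustration; the function name is made up). An edge belongs to the Gabriel graph only if the disk having that edge as a diameter contains no other point, so a new point inside that disk removes the over-long edge.

```python
import math
import random

def random_point_in_diametral_circle(a, b, rng=random):
    """Uniform random point strictly inside the circle with diameter a-b."""
    cx, cy = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    radius = math.hypot(b[0] - a[0], b[1] - a[1]) / 2
    # sqrt() makes the density uniform over the disk's area;
    # rng.random() < 1, so the point never lands on the boundary.
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return cx + r * math.cos(theta), cy + r * math.sin(theta)

rng = random.Random(0)
new_point = random_point_in_diametral_circle((0, 0), (10, 0), rng)
```

After each insertion the graph has to be updated incrementally, and the new edges may themselves be too long, so the step is repeated until no edge exceeds the threshold.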
The most convenient data structures for plane graphs are edge-centric: for example, the doubly-connected edge list and the quad-edge representations. If you're not already using a data structure of this type for the Delaunay step (and I can't imagine why you wouldn't be), you can sort each vertex's outgoing connections by angle. From there, it's easy to implement a function that takes a half-edge and returns the next half-edge on the same face in counterclockwise order. Now iterate through all of the half-edges, and for each half-edge not already visited, iterate around the face until you return to where you started. Label all of the half-edges in the inner iteration as one face.
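A compact sketch of the face walk, using plain adjacency lists sorted by angle instead of a full doubly-connected edge list. The function name and the input format (a dict of coordinates plus a list of vertex pairs) are assumptions of this example; it also assumes the drawing is already planar and has no isolated vertices.

```python
import math

def enumerate_faces(coords, edges):
    """Return the faces of a plane graph, each as a list of vertex ids."""
    neighbours = {v: [] for v in coords}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    # Sort each vertex's outgoing connections counterclockwise by angle.
    for v, ring in neighbours.items():
        vx, vy = coords[v]
        ring.sort(key=lambda w: math.atan2(coords[w][1] - vy, coords[w][0] - vx))

    def next_half_edge(u, v):
        # Arriving at v from u, leave along the neighbour that precedes u
        # in v's counterclockwise order; this keeps the face on the left.
        ring = neighbours[v]
        return v, ring[ring.index(u) - 1]

    visited, faces = set(), []
    for u in neighbours:
        for v in neighbours[u]:
            half_edge = (u, v)
            if half_edge in visited:
                continue
            face = []
            while half_edge not in visited:
                visited.add(half_edge)
                face.append(half_edge[0])
                half_edge = next_half_edge(*half_edge)
            faces.append(face)
    return faces

# Unit square with one diagonal: two triangles plus the outer face.
coords = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
faces = enumerate_faces(coords, edges)
```

Inner faces come out counterclockwise and the outer face clockwise. `ring.index` is linear in the vertex degree, which is fine for a sketch; a half-edge structure stores that link directly.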
