Polygons from a list of edges - algorithm

Given N points in a map of edges, Map<Point, List<Edge>>, is it possible to get the polygons formed by these edges in O(N log N)?
What I know is that you have to walk all the vertices and get the edges that have each vertex as a starting point. These are edges of a Voronoi diagram, and each vertex has at most 3 edges containing it. So, in the map, the key is a vertex, and the value is the list of edges for which that vertex is the start node.
For example:
Points: a,b,c,d,e,f,g
Edges: [a,b]; [a,c]; [a,d]; [b,c]; [d,e]; [e,g]; [g,f]
My idea is to walk the map counterclockwise until I get back to the initial vertex. That is a polygon, so I put it in a list of polygons and keep looking for others. The problem is that I do not want to exceed O(N log N) complexity.
Thanks!

You can loop through the edges and compute the distance from the midpoint of each edge to all sites. Then sort the distances in ascending order; for inner Voronoi polygons pick the first and the second, and for outer polygons pick the first. Basically, an edge separates two polygons.
It's something like O(m log n).
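A minimal sketch of the midpoint idea, assuming the sites are available as (x, y) tuples and every edge is a pair of endpoint tuples. The name `cells_from_edges` is hypothetical, and the sketch always takes the two nearest sites, so it assumes each edge separates exactly two cells. The brute-force search costs O(m * n); a spatial index would be needed to approach the bound quoted above.

```python
import heapq
import math

def cells_from_edges(sites, edges):
    """Group edges by the Voronoi cell (site index) they bound."""
    cells = {i: [] for i in range(len(sites))}
    for edge in edges:
        (x1, y1), (x2, y2) = edge
        mid = ((x1 + x2) / 2, (y1 + y2) / 2)
        # The two sites closest to the midpoint are the cells this edge separates.
        nearest = heapq.nsmallest(
            2, range(len(sites)), key=lambda i: math.dist(mid, sites[i])
        )
        for i in nearest:
            cells[i].append(edge)
    return cells
```

Each value of the returned dictionary is the unordered edge list of one cell; the edges still have to be chained end to end to obtain the polygon outline.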

If I did find a polynomial solution to this problem I would not post it here because I am fairly certain this is at least NP-Hard. I think your best bet is to do a DFS. You might find this link useful Finding all cycles in undirected graphs.
You might be able to use the solution below if you can formulate your graph as a directed graph. There are 2^E directed graphs (because each edge can be oriented in 2 directions). You could pick a random directed graph and use the solution below to find all of the cycles in it. You could do this multiple times for different random directed graphs, keeping track of all the cycles, until you've reached a satisfactory error bound.
You can efficiently create a directed graph with a little bit of state (Maybe store a + or - with an edge to note the direction?) And once you do this in O(n) the first time you can randomly flip x << E directions to get a new graph in what will essentially be constant time.
Since you can create subsequent directed graphs in constant time you need to choose the number of times to run the cycle finding algorithm to have it still be polynomial and efficient.
UPDATE - The below only works for directed graphs
Off the top of my head it seems like it's a better idea to think of this as a graph problem. Your map of vertices to edges is a graph representation. Your problem reduces to finding all of the loops in the graph because each cycle will be a polygon. I think "Tarjan's strongly connected components algorithm" will be of use here as it can do this in O(v+e).
You can find more information on the algorithm here https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
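For reference, the counterclockwise walk described in the question can be sketched as planar face tracing. This is only a sketch under assumptions that are not in the question: every vertex has known 2D coordinates (`coords`), the graph is a connected planar straight-line graph, and the names `trace_faces` and `signed_area` are hypothetical. Each undirected edge is treated as two directed half-edges, and each half-edge is visited once, so after sorting the neighbours of each vertex the walk is linear in the number of edges.

```python
import math
from collections import defaultdict

def trace_faces(coords, edges):
    """Return every face of a planar graph as a list of vertices."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Sort the neighbours of each vertex counterclockwise by angle.
    for v, nbrs in adj.items():
        vx, vy = coords[v]
        nbrs.sort(key=lambda w: math.atan2(coords[w][1] - vy, coords[w][0] - vx))
    pos = {(v, w): i for v, nbrs in adj.items() for i, w in enumerate(nbrs)}
    visited = set()
    faces = []
    for u in adj:
        for v in adj[u]:
            a, b = u, v
            face = []
            while (a, b) not in visited:
                visited.add((a, b))
                face.append(a)
                # Arriving at b from a, leave via the neighbour that comes
                # just before a in b's counterclockwise order.
                nbrs = adj[b]
                a, b = b, nbrs[(pos[(b, a)] - 1) % len(nbrs)]
            if face:
                faces.append(face)
    return faces

def signed_area(coords, face):
    """Shoelace formula; positive for counterclockwise faces."""
    pairs = zip(face, face[1:] + face[:1])
    return 0.5 * sum(
        coords[p][0] * coords[q][1] - coords[q][0] * coords[p][1] for p, q in pairs
    )
```

With this turning rule the bounded faces come out counterclockwise and the outer face clockwise, so keeping the faces with positive `signed_area` yields the polygons.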

Related

Greedy colouring algorithm on graph in adjacency list representation

Suppose you have been given a simple undirected graph and the graph has a max degree of d. You are given d + 1 colors, represented by numbers starting from 0 to d and you want to return a valid placement of colors such that no two adjacent vertices share the same color. And as the title suggests, the graph is given in adjacency list representation. The algorithm should run in O(V+E) time.
I think the correct way to approach this is by using a greedy coloring algorithm. However, this may sound stupid but I am stuck on the part where I try to find the first available color that hasn't been used by its neighbors for each vertex. I don't really know how I can do it so that it runs in O(number of neighbors) time for each vertex that helps to fit under time complexity requirements.
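One way to find the first free colour in O(number of neighbours) time is a reusable "stamp" array, sketched below. It assumes vertices are numbered 0..V-1 and `adj[v]` is the neighbour list of `v`; the function name is hypothetical. A vertex with k neighbours sees at most k used colours, so the scan for a free colour stops after at most k + 1 steps, which keeps the total at O(V + E).

```python
def greedy_colouring(adj):
    """Colour a graph given as adjacency lists with at most d + 1 colours."""
    n = len(adj)
    colour = [-1] * n
    max_deg = max((len(nbrs) for nbrs in adj), default=0)
    # stamp[c] == v means colour c is taken by some neighbour of v.
    # Reusing one array avoids clearing it for every vertex.
    stamp = [-1] * (max_deg + 1)
    for v in range(n):
        for w in adj[v]:
            if colour[w] != -1:
                stamp[colour[w]] = v
        c = 0
        while stamp[c] == v:
            c += 1
        colour[v] = c
    return colour
```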

Algorithm for Finding Graph Connectivity

I'm tackling an interesting question in programming. It is this: we keep adding undirected edges to a graph, until the graph (or subgraph) is connected (i.e. we can use some path to get from each vertex to any other vertex in that subgraph). We stop as soon as the graph is connected.
For example if we have vertices 1,2,3 and 4 and we want the subgraph 1,2,3 to be connected.
Let's say we have edges (3,4), then (2,3), then (1,4), then (1,3). We only need to add in the first 3 edges for the subgraph to be connected, then we stop (edge 1,3 isn't needed).
Obviously I can run a BFS every time an edge is added to see if we can reach the required vertices, but if there are say m edges then we would potentially have to run BFS m times which seems too slow. Any better options? Thanks.
You should research the marvelous "Disjoint-set data structure" and the corresponding union-find algorithm. It can seem magical, but the amortized time per operation and the total space are tiny, O(α(n)) and O(n) respectively, where α is the inverse Ackermann function.
You can run BFS just once to find the connected components. Then, each time you add an edge, if it is between vertices of two different components, you can merge them by reference. So the complexity of this algorithm is O(|V| + |E|).
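A minimal union-find sketch for the example in the question. The function name and the `required` parameter are assumptions made for illustration; the sketch counts how many components still contain a required vertex and stops as soon as that count drops to one.

```python
def edges_needed(vertices, required, edges):
    """Return how many edges (in order) are needed to connect `required`."""
    parent = {v: v for v in vertices}
    size = {v: 1 for v in vertices}
    required = set(required)
    has_req = {v: v in required for v in vertices}
    groups = len(required)  # components that contain a required vertex

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    if groups <= 1:
        return 0
    for count, (a, b) in enumerate(edges, start=1):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra  # union by size
        size[ra] += size[rb]
        if has_req[ra] and has_req[rb]:
            groups -= 1
        has_req[ra] = has_req[ra] or has_req[rb]
        if groups == 1:
            return count
    return None  # the required vertices never became connected
```

For vertices 1 to 4, required set {1, 2, 3} and edges (3,4), (2,3), (1,4), (1,3), this stops after the third edge, as in the question.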
Notice that the implementation of this method should be done by some reference techniques, especially to update the component number of the vertices.
I would normally do this using a disjoint set structure, as Doug suggests. It would be like Kruskal's algorithm for finding the minimum spanning tree, except you process edges in the given order.
If you don't need a spanning tree as output, though, then you can do this with an incremental BFS or DFS:
Pick any vertex, and find the vertices connected to it with BFS or DFS. Color these vertices red. If you start with no edges, of course, then there will be only one red vertex at this stage.
As you add edges, don't do anything else until you add an edge that connects a red vertex to a non-red vertex. Then run BFS or DFS, excluding the new edge, to find all the new vertices that will connect to the red set. Color them all red.
Stop when all vertices are red.
This is a little simpler in practice than using disjoint set, and takes O(|V|+|E|) time, since each vertex will be traversed by exactly one BFS/DFS search.
It does the work in chunks, though, so if you need each edge test to be fast individually, then disjoint set is better.
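The red-vertex procedure above might look like the following sketch. It assumes vertices are numbered 0..n-1, that the whole graph (not a subset) must become connected, and that vertex 0 is the starting vertex; the function name is hypothetical.

```python
def edges_until_connected(n, edges):
    """Return how many edges (in order) are needed to connect all n vertices."""
    if n <= 1:
        return 0
    adj = [[] for _ in range(n)]
    red = [False] * n
    red[0] = True
    red_count = 1
    for count, (a, b) in enumerate(edges, start=1):
        adj[a].append(b)
        adj[b].append(a)
        if red[a] == red[b]:
            continue  # both red or both non-red: nothing to do yet
        # Exactly one endpoint is red: flood the red colour from the other one.
        start = b if red[a] else a
        red[start] = True
        red_count += 1
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if not red[w]:
                    red[w] = True
                    red_count += 1
                    stack.append(w)
        if red_count == n:
            return count
    return None  # the graph never became connected
```

Each vertex is pushed on the stack at most once over the whole run, which is where the O(|V| + |E|) total comes from.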

What is the simplest, easiest algorithm for finding EMST of a complete graph of order 10^5

I just want to be clear that EMST stands for Euclidean Minimum Spanning Tree.
Essentially, I have been given a file with 100k 4D vertices (one vertex on each line). The goal is to visit every vertex in the file while minimizing the total distance traveled. The distance from one point to another is simply the Euclidean distance (the distance if you draw a straight line between the two points).
I already know that this is pretty much the Traveling Salesman Problem, which is NP Complete, so I am looking to approximate the solution.
The first approximation algorithm that came to my mind is by finding the MST from a graph constructed from the file... But that would take O(N^2) to even just construct all the edges from the file given the fact that it's a complete graph ( I can go from any point to another ). And given that my input is N = 10^5, my algorithm will have a huge running time, which is too slow...
Any ideas on how I can plan on approximating the solution? Thank you very much..
I know it's quadratic-time, but I think you should consider Prim with an implicit graph. The structure of the algorithm is
for each vertex v
    mindist[v] := infinity
    visited[v] := false
choose a root vertex r
mindist[r] := 0
repeat |V| times
    let w be the minimizer of mindist[w] such that not visited[w]
    visited[w] := true
    for each vertex v
        if not visited[v] and distance(w, v) < mindist[v]:
            mindist[v] := distance(w, v)
            parent[v] := w
Since the storage used is linear, it will likely stay resident in cache, and there are no fancy data structures, so this algorithm should run pretty fast.
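The pseudocode above translates fairly directly to Python. This is a plain sketch that assumes the points are tuples of equal dimension and uses vertex 0 as the root; the function name is hypothetical. For 10^5 points a pure-Python double loop will be slow, so a vectorised or compiled version of the same loop is probably what you would run in practice.

```python
import math

def prim_emst(points):
    """O(n^2) Prim on the implicit complete graph; returns (parent, weight)."""
    n = len(points)
    mindist = [math.inf] * n
    visited = [False] * n
    parent = [-1] * n
    if n:
        mindist[0] = 0.0  # vertex 0 is the root
    for _ in range(n):
        # Closest unvisited vertex to the tree built so far.
        w = min((v for v in range(n) if not visited[v]), key=mindist.__getitem__)
        visited[w] = True
        for v in range(n):
            if not visited[v]:
                d = math.dist(points[w], points[v])
                if d < mindist[v]:
                    mindist[v] = d
                    parent[v] = w
    # mindist[v] ends up as the length of the tree edge (parent[v], v).
    return parent, sum(mindist)
```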
I am going to assume that you actually want a EMST as your title suggests, and the TSP is just a means to that end, and not the actual goal itself. The two have very different restrictions (the TSP being far more restrictive), and thus very different optimal solutions.
Overview
The idea is that we want to run a modified Kruskal's algorithm, which will make use of a k-d tree to find the closest pairs without evaluating every potential edge. We can find the shortest edge to each vertex in a connected component, take the shortest overall, and connect our connected components via that edge. As you'll see, this connects at least half of our connected components each iteration, so it takes at most log n iterations to complete.
Nearest Neighbor Search
For constructing an EMST, you'll want to use a data structure for querying for nearest neighbors in 4D space. You could extend octrees to work in a higher dimension, but I'd personally go with a k-d tree. You can construct a k-d tree in O(nlogn) time using the median of medians algorithm to find the median at each level, and you can insert / remove from a balanced k-d tree in O(logn) time.
Once you've built a k-d tree, you'll want to query for the nearest neighbor to each point. We'll then construct the edge between these two vertices. Many of these edges will be duplicated, as for some vertices A and B, A's nearest neighbor may be B, and B's nearest neighbor may be A. We'll handle this by storing which connected component each vertex belongs to; after two vertices are joined by an edge, the duplicate edge will clearly connect two vertices of the same connected component, and so we'll discard it. To accomplish this, we'll use a disjoint-set (just like in many implementations of Kruskal's algorithm) to assign a connected component to each vertex. This will also prevent us from creating cycles in our graph, which would introduce unnecessary edges in the MST.
Merging
However, as we construct each edge, we'll want to insert it into a min-heap priority queue before checking which edges to keep and which edges connect already-connected vertices. This will not affect the outcome of this first iteration, but later on we will need to handle edges by increasing distance. Then dequeue all the edges, check for unnecessary / redundant edges via the disjoint-set, insert valid edges into the MST, and merge the respective disjoint-sets. All of this of course introduces a nlogn factor for constructing and dequeuing elements from the min-heap (we could also just sort them in a plain array, if we wished).
After this first iteration of adding edges, we'll have connected at least half of the MST, maybe more. This is because for each vertex we added one edge, and we can have at most one duplicate per edge, so we've added as few as vertices / 2 edges, but as many as vertices - 1. Now at least 1/2 of our MST has been built. We'll continue the process as described in the following paragraphs, until we've added vertices - 1 edges in total.
Generalizing NN-Search
To continue, we'll want to construct lists of the vertices in each connected component, so that we can iterate over them by groups. This can be done in nearly linear time, as searching (also merging) a disjoint-set takes O(α(n)) time (α being the inverse Ackermann function) and we repeat exactly n times. Once we have our lists of vertices per connected component, the rest is fairly straightforward. We'll take our existing k-d tree, and remove all the vertices in our current connected component. We'll then query for the nearest neighbor to each vertex in our connected component, and add these edges to our min-heap. We'll then add these vertices back into the k-d tree, and repeat on the next connected component. Since we insert/remove a total of n elements, this amounts to an average case O(nlogn) time complexity.
Now that we have a queue of the shortest potential edges connecting our connected components, we'll dequeue these in order, and just as before insert valid edges and merge the disjoint sets. For the same reasons as before, this is guaranteed to connect at least half of our components, maybe even all of them. We'll repeat this process until we have connected all vertices into a single connected component, which will be our MST. Note that because we halve the number of disconnected components each iteration, it'll take at most O(logn) iterations to connect every vertex in our MST (most likely far less).
Remarks
Overall, this will take O(nlog^2(n)) time. There will likely be far fewer than log(n) iterations however, so expect a speedup there in practice. Also note that an R-tree might be a good alternative to the k-d tree; I don't know how they compare in practice, however.
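The round structure described above can be sketched as follows. Two simplifications are swapped in and are not part of the answer: the nearest neighbour outside a component is found by brute force instead of a k-d tree, and only the cheapest outgoing edge per component is kept (the classic Borůvka rule) rather than one edge per vertex. The function name is hypothetical. The sketch therefore shows the merging logic only, not the O(nlog^2(n)) running time.

```python
import math

def boruvka_emst(points):
    """Borůvka-style rounds; returns the tree as a list of (i, j, length)."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    components = n
    while components > 1:
        best = {}  # component root -> cheapest edge leaving that component
        for i in range(n):
            ri = find(i)
            for j in range(n):
                if find(j) != ri:
                    # The index pair breaks ties consistently.
                    cand = (math.dist(points[i], points[j]), min(i, j), max(i, j))
                    if ri not in best or cand < best[ri]:
                        best[ri] = cand
        # Handle the candidates by increasing length, as with the min-heap.
        for d, i, j in sorted(best.values()):
            ri, rj = find(i), find(j)
            if ri != rj:  # skip duplicates and edges made redundant this round
                parent[ri] = rj
                tree.append((i, j, d))
                components -= 1
    return tree
```

Every component contributes one outgoing edge per round, so the number of components at least halves each time, which is the log n bound on the number of rounds mentioned above.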

Find the Sunflower subgraph induced in a graph in polynomial amount of time.

A subgraph Sn of a graph G is a sunflower graph if it consists of a cycle Cn = {v1,v2,...,vn} of n vertices together with n other independent vertices {u1,u2,...,un} such that for each i, ui is adjacent to vi and vj, where j = i-1 (mod n).
You could think of a sunflower - in the sense of the question - as a cycle of triangles. In time O(N^3) you can check each triple of points to see if it is a triangle and create a new graph whose vertices denote triangles in the original graph and where two vertices are linked if the two triangles share one or more vertices.
Then a depth first search looking for back edges should find cycles in this graph. Not all cycles are good. I think it may be enough to check that no two successive edges in the supposed cycle in the derived graph are produced by the same vertex in the original graph, and that you can check this as part of the depth first search. It may take some detailed analysis of cases to establish this, unless you can find a neat proof.
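The first step (listing the triangles and linking those that share a vertex) could be sketched like this. It assumes the graph is a dictionary mapping each vertex to a set of neighbours with sortable vertex labels; the function name is hypothetical, and the cycle search and the validity check discussed above are not included.

```python
from itertools import combinations

def triangle_graph(adj):
    """Return the triangles of a graph and the derived triangle graph."""
    vertices = sorted(adj)
    # O(N^3): test every triple of vertices.
    triangles = [
        (a, b, c)
        for a, b, c in combinations(vertices, 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    ]
    # Two triangles are linked when they share at least one vertex.
    links = {t: set() for t in triangles}
    for s, t in combinations(triangles, 2):
        if set(s) & set(t):
            links[s].add(t)
            links[t].add(s)
    return triangles, links
```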

Efficient Way to construct triangles from Edges/Lines?

Let's say I have a set of points and I have lines/edges between them. All of these edges create non-overlapping triangles within the convex hull of my points. All points are connected to triangles.
How can I efficiently check which points are part of which triangle? I could check incident points of each edge and gradually construct a triple of points, but that sounds awfully slow (O(n^2)?).
Is there something like linesweep or so to do that?
cheers.
If you have a 2-dimensional set-up like you described, then you have a fully triangulated planar graph (no intersecting edges when you exclude the endpoints) which spans the convex hull of your points. In this case, if you sort the edges around each vertex circularly according to the angle they make with the vertex, then you know for sure that each pair of adjacent edges makes a triangle. Furthermore, every triangle can be found this way if you perform this procedure for each vertex. Each triangle will be found 3 times when you iterate over all vertices. You can either use a hash table to detect duplicates, or sort all your triangles when you are done to identify duplicates. If you use a hash table, then the overall complexity with V vertices is O(V log d), where d is the maximum degree of a vertex (because the total number of edges is linear in the number of vertices in a planar graph). So the absolute worst case is O(V log V), which is the same worst case as sorting all triangles to find duplicates (because the maximum number of triangles is also linear in the number of vertices). The only caveat to make this work is that you need to know the neighbor vertices (i.e. the incident edges) for each vertex.
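A sketch of the angular-sort approach, assuming `coords` maps each vertex to its (x, y) position and `edges` is a list of vertex pairs; the function name is hypothetical. Two guards are added that the description above does not mention: the turn from one neighbour to the next must be less than 180 degrees (this skips the wrap-around pair at convex hull vertices), and the closing edge must actually exist.

```python
import math

def triangles_by_angle(coords, edges):
    """Find the triangles of a planar triangulation by sorting edges by angle."""
    adj = {v: [] for v in coords}
    edge_set = set()
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
        edge_set.add(frozenset((u, v)))
    triangles = set()  # the set removes the three copies of each triangle
    for v, nbrs in adj.items():
        vx, vy = coords[v]
        nbrs.sort(key=lambda w: math.atan2(coords[w][1] - vy, coords[w][0] - vx))
        for idx, j in enumerate(nbrs):
            k = nbrs[(idx + 1) % len(nbrs)]
            # Cross product > 0: k is less than 180 degrees counterclockwise from j.
            cross = (coords[j][0] - vx) * (coords[k][1] - vy) - (
                coords[j][1] - vy
            ) * (coords[k][0] - vx)
            if cross > 0 and frozenset((j, k)) in edge_set:
                triangles.add(frozenset((v, j, k)))
    return triangles
```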
The edges define an undirected graph G and the triangles are the set of cycles in G with length=3.
Geometric triangulations typically have relatively low nodal degree (degree d is the number of edges adjacent to each node, d<=10 is typical for geometric triangulations) and, as such, here is a reasonably efficient O(n*d^3) algorithm that can be used to construct the set of triangles.
1. Set up a graph-like data structure, supporting access to the list of edges adjacent to each node.
2. Iterate over all nodes. Consider all pairs of edges adjacent to a given node i. For a given pair of edges adjacent to i, we have a potential nodal triplet i,j,k. This triplet is a triangle if there is an edge joining nodes j,k, which can be checked by scanning the edge lists of j,k.
3. Duplicate triangles will be generated by a naive implementation of (2). Maintain a hash table of triangles to reject duplicate triplets as they're considered.
I've assumed that the edges define a valid disjoint triangulation, being non-intersecting, etc.
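The steps above might be sketched as follows. The function name is hypothetical, and neighbour sets are used for the "is there an edge j,k" test instead of scanning edge lists, which is an implementation choice of this sketch. Note that it returns every 3-cycle, so it relies on the stated assumption that the edges form a valid triangulation.

```python
from itertools import combinations

def triangles_from_edges(edges):
    """Return all 3-cycles of an undirected graph as frozensets of nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen = set()  # hash table that rejects duplicate triplets
    for i, nbrs in adj.items():
        # Every pair of edges at node i gives a candidate triplet i, j, k.
        for j, k in combinations(nbrs, 2):
            if k in adj[j]:
                seen.add(frozenset((i, j, k)))
    return seen
```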
Hope this helps.
