Graph data structure selection - data-structures

I need to implement an algorithm that works on planar graphs and I want to select an appropriate data structure to represent them.
The vertices are stored in an array, each with an associated pair of coordinates.
An edge has an associated polyline between its end vertices, with an arbitrary number of intermediate points (possibly none), stored in sequence in an auxiliary array.
The edges are undirected (if a=>b exists, then b=>a exists).
The following primitive operations must be supported:
adding an edge between two vertices designated by their indexes,
enumerating all edges originating from a given vertex (and recursively, all paths from a given vertex),
for a given edge, follow the associated polyline until the end vertex.
I am looking for a data structure that is space efficient O(V + E) and avoids data redundancy.
What would you use? Candidates that I see are adjacency lists, DCEL, and winged edges, but I may be missing one. I guess that quad edges would be overkill.
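For the requirements listed (O(V + E) space, indexed vertices, per-edge polylines, undirected incidence), a plain adjacency list of edge ids with one shared record per undirected edge may already suffice; DCEL and winged edges earn their extra pointers only if you also need face traversal or ordered walks around a vertex. A minimal sketch of that idea, with all names my own invention rather than any library's:

```python
# Sketch of an adjacency-list representation with shared polylines.
# Names (Graph, add_edge, walk_polyline) are illustrative, not from any library.

class Graph:
    def __init__(self, coords):
        self.coords = coords              # coords[v] = (x, y) for vertex v
        self.adj = [[] for _ in coords]   # adj[v] = list of edge ids incident to v
        self.edge_ends = []               # edge_ends[e] = (u, v)
        self.polylines = []               # polylines[e] = intermediate points, u -> v order

    def add_edge(self, u, v, intermediate=()):
        e = len(self.edge_ends)           # one record per undirected edge: O(V + E) space
        self.edge_ends.append((u, v))
        self.polylines.append(list(intermediate))
        self.adj[u].append(e)             # each endpoint references the same edge id,
        self.adj[v].append(e)             # so the polyline is stored only once
        return e

    def neighbors(self, v):
        """Enumerate (edge id, other endpoint) pairs for edges at v."""
        for e in self.adj[v]:
            a, b = self.edge_ends[e]
            yield e, (b if a == v else a)

    def walk_polyline(self, e, start):
        """Follow edge e's polyline from `start` to the other endpoint."""
        a, b = self.edge_ends[e]
        pts = self.polylines[e] if start == a else list(reversed(self.polylines[e]))
        end = b if start == a else a
        return [self.coords[start]] + pts + [self.coords[end]]
```

Each primitive listed in the question maps to one method, and there is no duplicated polyline data because both endpoints reference the same edge record.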


Faster graph traversal algorithms compared to DFS

I have an undirected unweighted graph represented using an adjacency matrix, where each node of the graph represents a space partition (e.g. a state) while the edges represent the neighborhood relationship (i.e. neighboring states sharing common boundaries). My baseline algorithm uses DFS to traverse the graph and form subgraphs after each step (i.e. adding the new node visited, which would result in a bunch of contiguous states). With that subgraph I perform a statistical significance test on certain patterns which exist in the nodes of the graph (i.e. within the states).
At this point I am essentially trying to make the traversal step faster.
I was wondering if you all could suggest any algorithm or resources (e.g. research paper) which performs graph traversal computationally faster than DFS.
Thanks for your suggestion and your time!
Most graph algorithms contain "for a given vertex u, list all its neighbors v" as a primitive. Not sure, but it sounds like you might want to speed up this piece. Indeed, each country has only a few neighbors, typically far fewer than the total number of countries. If this is the case, replace the adjacency matrix graph representation with adjacency lists.
Note that the algorithm itself (DFS or other) will likely remain the same, with just a few changes where it uses this primitive.
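A minimal sketch of that change, assuming a 0/1 adjacency matrix as input (the function names are made up for this example):

```python
# Minimal sketch: converting an adjacency matrix to adjacency lists so that
# "list all neighbors of u" costs O(deg(u)) instead of O(V).

def matrix_to_lists(matrix):
    return [[v for v, connected in enumerate(row) if connected] for row in matrix]

def dfs(adj, start):
    """Iterative DFS; the logic is identical whichever representation feeds `adj`."""
    visited, stack, order = set(), [start], []
    while stack:
        u = stack.pop()
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        for v in adj[u]:          # the only place the representation matters
            if v not in visited:
                stack.append(v)
    return order
```

Note that only the neighbor-enumeration line changes; the traversal itself is untouched, which is exactly the point made above.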

Reason for not allowing random access to the vector of edges in adjacency lists

Why is edge_iterator not an integer_iterator like vertex_iterator? I am using undirected adjacency list with vectors to store both vertices and edges.
Adjacency lists store a list of adjacencies.
That is, per vertex, it stores a list of adjacent vertices.
That means that vertices can be stored in a single container, but each vertex contains its own (separate) container of adjacencies ("other vertex references").
This should explain it: there is no such thing as "the edge container", which makes it impossible to address the edges directly by index, as you could with a single contiguous container.
Note there are other graph models (e.g. EdgeList concept, as modeled by edge_list)
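To illustrate the point outside of Boost (plain Python, with a layout I made up for this example): the vertices live in one indexable container, but the edges exist only as entries scattered across per-vertex adjacency containers, so there is no single edge sequence to index into:

```python
# Illustrative only; these names are not Boost's API. Vertices live in one
# vector, but each vertex owns its own container of adjacencies.
vertices = ["a", "b", "c"]
adjacency = [
    [1, 2],   # neighbors of vertex 0 ("a")
    [0],      # neighbors of vertex 1 ("b")
    [0],      # neighbors of vertex 2 ("c")
]

print(vertices[1])  # vertices support direct indexing by integer

# An "edge" is only a position inside one vertex's list; to enumerate all
# edges you must walk every per-vertex container:
edges = [(u, v) for u, nbrs in enumerate(adjacency) for v in nbrs if u < v]
```

This is why vertex_iterator can be a simple integer iterator while edge_iterator has to walk a nested structure.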

Exact (Error-Correcting) Graph Matching Algorithm

I'm looking for an inexact graph matching algorithm on graphs with labeled vertices and labeled, directed edges. My task is to detect changes to two graphs to display them to the developer (think subversion diff). I've already implemented an optimization algorithm based on tabu search (this), but I can't get the algorithm to consider my edge labels. My graphs have at most 120 vertices and 200 edges, so I might get away with a slower but simpler to implement algorithm.
Here is an example for your viewing pleasure:
Since no one has proposed an existing algorithm, I'll try to invent one...
For each vertex, you can calculate its "signature" by concatenating its label with the labels of all adjacent edges. For consistency, sort the labels alphabetically. Since the edges are directed, concatenate incoming and outgoing edges separately.
These signatures can be used to detect changes in the set of vertices. First, find corresponding vertices with the same signature in the first and the second graph. Remaining unpaired vertices are added vertices, removed vertices, vertices with changed labels, vertices with changed edge connections, and vertices whose edges' labels were changed. You can associate them by comparing their signatures and selecting best matches using some string matching algorithm. You will probably have to introduce some critical degree of similarity to distinguish "it's the same vertex with many changed properties" from "it's a new vertex with some accidental signature similarity".
Arrange all vertices of the first graph in an array, in any order. Create another array of the same size. Put the matching vertices of the second graph into the second array at positions corresponding to the first array; do this for all exactly matched vertices and all modified vertices. For first graph vertices which don't have a match in the second graph (deleted vertices), leave the array cells empty. Then, for second graph vertices which don't have a match in the first graph (new vertices), add these vertices to the end of the second array and expand the first array with the corresponding number of empty cells.
Now, when vertices of a graph are listed in an array, the edges can be represented as a 2-dimensional array. If an edge is going from the i-th vertex to the j-th vertex, put its label into the (i,j) cell of the array.
Do this for both graphs. Since you have constructed two vertex arrays of the same size, you get two 2-dimensional arrays of the same size, with a one-to-one correspondence. Comparing these two arrays in a straightforward way allows you to detect added edges, removed edges and edges with changed labels.
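The signature step above can be sketched as follows, assuming each graph is given as a {vertex: label} dict plus a list of (src, dst, edge_label) triples (these input conventions are my assumption, not from the question):

```python
# Sketch of the per-vertex signature: own label plus sorted incoming and
# outgoing edge labels, kept separate because the edges are directed.

def signatures(vertex_labels, edges):
    incoming = {v: [] for v in vertex_labels}
    outgoing = {v: [] for v in vertex_labels}
    for src, dst, label in edges:
        outgoing[src].append(label)
        incoming[dst].append(label)
    # Sort each group alphabetically so the signature does not depend on
    # the order in which edges were listed.
    return {
        v: (vertex_labels[v],
            "|".join(sorted(incoming[v])),
            "|".join(sorted(outgoing[v])))
        for v in vertex_labels
    }
```

Two vertices with equal signature tuples are candidates for exact matches; the remaining vertices can be paired by string similarity on these tuples as described above.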

Finding the list of common children (descendants) for any two nodes in a cyclic graph

I have a cyclic directed graph and I was wondering if there is any algorithm (preferably an optimal one) to make a list of the common descendants of any two nodes? Something like the opposite of what Lowest Common Ancestor (LCA) does.
As user1990169 suggests, you can compute the set of vertices reachable from each of the starting vertices using DFS and then return the intersection.
If you're planning to do this repeatedly on the same graph, then it might be worthwhile first to compute and contract the strong components into supervertices, each representing a set of original vertices. As a side effect, you get a topological order on the supervertices. This allows a data-parallel algorithm to compute reachability from multiple starting vertices at the same time. Initialize all vertex labels to {}. For each start vertex v, set its label to {v}. Now sweep all vertices w in topological order, updating the label of each of w's out-neighbors x by setting it to the union of x's label and w's label. Use bitsets for a compact, efficient representation of the sets. The downside is that we cannot prune as with single reachability computations.
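A sketch of that sweep, assuming the strong components have already been contracted so the input is a DAG with a known topological order; Python integers stand in for bitsets (bit v set means "reachable from start vertex v"), and a vertex counts as reaching itself here:

```python
# Label-propagation reachability sweep over a DAG in topological order.
# `adj` maps each vertex to its out-neighbors; `topo` is a topological order.

def reachability_labels(adj, topo, starts):
    label = {w: 0 for w in topo}
    for v in starts:
        label[v] |= 1 << v                 # seed: v reaches itself
    for w in topo:                         # sweep in topological order
        for x in adj[w]:
            label[x] |= label[w]           # x inherits everything that reaches w
    return label

def common_descendants(adj, topo, a, b):
    label = reachability_labels(adj, topo, [a, b])
    both = (1 << a) | (1 << b)
    return [w for w in topo if (label[w] & both) == both]
```

More start vertices can be processed in the same single sweep simply by seeding more bits, which is where the data-parallel payoff comes from.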
I would recommend using a DFS (depth first search).
For each input node:
    Create a collection to store reachable nodes
    Perform a DFS to find reachable nodes
    When a node is reached:
        If it's already stored, stop searching that path  // prevents cycles
        Else store it and continue
Find the intersection between all collections of nodes
Note: You could easily use BFS (breadth first search) instead with the same logic if you wanted.
When you implement this keep in mind there will be a few special cases you can look for to further optimize your search such as:
If an input node doesn't have any outgoing edges, then there are no common nodes
If one input node (A) reaches another input node (B), then A can reach everything B can. This means the algorithm wouldn't have to be run on B.
etc.
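The DFS-and-intersect approach above can be sketched as follows (adjacency given as a dict of lists; the names are illustrative):

```python
# Minimal sketch of the DFS-and-intersect approach for common descendants
# in a possibly cyclic directed graph.

def reachable(adj, start):
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:      # already-stored nodes stop the path (handles cycles)
                seen.add(v)
                stack.append(v)
    return seen

def common_descendants(adj, a, b):
    return reachable(adj, a) & reachable(adj, b)
```

Swapping the stack for a queue turns this into the BFS variant mentioned above with no other changes.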
Why not just reverse the direction of the edge and use LCA?

Efficient Way to construct triangles from Edges/Lines?

Let's say I have a set of points and lines/edges between them. All of these edges create non-overlapping triangles within the convex hull of my points. All points are connected to triangles.
How can I efficiently check which points are part of which triangle? I could check the incident points of each edge and gradually construct triples of points, but that sounds awfully slow (O(n^2)?).
Is there something like linesweep or so to do that?
cheers.
If you have a 2-dimensional set-up like you described, then you have a fully triangulated planar graph (no intersecting edges when you exclude the endpoints) which spans the convex hull of your points. In this case, if you sort the edges around each vertex circularly according to the angle they make with the vertex, then you know for sure that each pair of adjacent edges makes a triangle. Furthermore, every triangle can be found this way if you perform this procedure for each vertex.

Each triangle will be found 3 times when you iterate over all vertices. You can either use a hash table to detect duplicates, or sort all your triangles when you are done to identify duplicates. If you use a hash table, then the overall complexity for V vertices is O(V log d), where d is the maximum degree of a vertex (the total number of edges is linear in the number of vertices because the graph is planar). So the absolute worst case is O(V log V), which is the same worst case as sorting all triangles to find duplicates (because the maximum number of triangles is also linear in the number of vertices).

The only caveat to make this work is that you need to know the neighbor vertices (i.e. the incident edges) for each vertex.
The edges define an undirected graph G and the triangles are the set of cycles in G with length=3.
Geometric triangulations typically have relatively low nodal degree (degree d is the number of edges adjacent to each node, d<=10 is typical for geometric triangulations) and, as such, here is a reasonably efficient O(n*d^3) algorithm that can be used to construct the set of triangles.
1. Set up a graph-like data structure, supporting access to the list of edges adjacent to each node.
2. Iterate over all nodes. Consider all pairs of edges adjacent to a given node i. For a given pair of edges adjacent to i, we have a potential nodal triplet i,j,k. This triplet is a triangle if there is an edge joining nodes j,k, which can be checked by scanning the edge lists of j,k.
3. Duplicate triangles will be generated by a naive implementation of (2). Maintain a hash table of triangles to reject duplicate triplets as they're considered.
I've assumed that the edges define a valid disjoint triangulation, being non-intersecting, etc.
Hope this helps.
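The steps above can be sketched as follows (using Python sets in place of the hash table; the names are illustrative):

```python
# Sketch of the triangle-enumeration algorithm: adjacency sets per node,
# all pairs of edges incident to each node, and a set of sorted triplets
# to reject the duplicates that a naive pass generates.

def triangles(n, edges):
    adj = [set() for _ in range(n)]
    for u, v in edges:                    # step 1: incident-edge lists per node
        adj[u].add(v)
        adj[v].add(u)
    found = set()
    for i in range(n):                    # step 2: every pair j,k of i's neighbors
        nbrs = sorted(adj[i])
        for a in range(len(nbrs)):
            for b in range(a + 1, len(nbrs)):
                j, k = nbrs[a], nbrs[b]
                if k in adj[j]:           # edge j-k closes the triangle i,j,k
                    found.add(tuple(sorted((i, j, k))))   # step 3: dedupe
    return sorted(found)
```

With set-based adjacency the membership test is O(1) expected, so for n nodes of maximum degree d the cost is O(n*d^2) pair enumerations plus cheap lookups, comfortably within the O(n*d^3) bound quoted above.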
