Find all chains of one-to-one nodes in a graph - algorithm

I have an undirected graph G, and my goal is to find all chains longer than N of nodes in a one-to-one relation.
For example:
In the following graph, the "chains" of length greater than 2 of nodes in a one-to-one relation are:
- d -> e -> f -> g
- c -> k -> l -> m
So what is the best approach or algorithm to solve this problem?

If you want to find all paths in which every vertex has degree <= 2, then a simple approach is as follows.
Remove all vertices with degree > 2 from your graph. You are left with a graph in which every vertex has degree <= 2. It is easy to prove that every connected component of such a graph is either a simple path or a simple cycle, and it is easy to distinguish the two (for example, by running a DFS from one node and seeing whether you ever return to it).
So every component that is a path is one of the paths you are looking for. Every component that is a cycle is also such a path, or can easily be converted into one by removing an edge or a vertex, depending on whether you allow a cycle as a valid chain.
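The approach above can be sketched roughly as follows in Python. The function name `find_chains`, the edge-list input format, and the choice to measure a chain's length by its number of vertices are all assumptions made for illustration; the question does not pin these down.

```python
from collections import defaultdict

def find_chains(edges, n):
    """Return the components of the pruned graph that have more than n vertices.

    edges: iterable of (u, v) pairs of an undirected graph (assumed format).
    Each chain is returned as a list of vertices in walking order.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Drop every vertex whose degree in the original graph exceeds 2.
    keep = {v for v in adj if len(adj[v]) <= 2}
    sub = {v: adj[v] & keep for v in keep}

    seen = set()

    def walk(start):
        # Follow unseen neighbours until none are left.
        path = [start]
        seen.add(start)
        cur = start
        while True:
            nxt = [w for w in sub[cur] if w not in seen]
            if not nxt:
                return path
            cur = nxt[0]
            seen.add(cur)
            path.append(cur)

    chains = []
    # Visit endpoints (0 or 1 remaining neighbours) first so that a path is
    # walked end to end; whatever is still unseen afterwards lies on a cycle.
    for v in sorted(keep, key=lambda x: len(sub[x])):
        if v not in seen:
            chain = walk(v)
            if len(chain) > n:
                chains.append(chain)
    return chains
```

A cycle component is returned as a list of its vertices starting from an arbitrary vertex; whether to keep it depends on whether a cycle counts as a chain for you.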


Path double cover, recursion set up

I'm working on the path double cover problem. I have an undirected connected graph G, and I replace every edge with 2 directed edges pointing in opposite directions. The goal is then to find a set of paths (no loops) in this directed graph such that every vertex is used once as the start of a path and once as the end of another path. Each directed edge is used exactly once.
undirected graph G
directed graph G
For this example, one such set of paths is P={(1,2,4),(4,3,1),(2,1,3),(3,4,2)}.
Two graphs are currently known that cannot be covered in this way: K3 and K5 (the complete graphs on 3 and 5 vertices).
I want to write a script that finds such a covering or tells me that there isn't one. I tried generating all possible paths and then searching among them, but for bigger graphs this approach isn't usable (n! complexity). I don't know how to set up the recursion so that I can keep track of what I've already used. I don't care much about time complexity, but it would be awesome if you had any tips for doing it more quickly. :D
Thanks for any suggestions. :D
Your definition is a bit confusing: you say that you need to find a set of paths (no loops) in the directed graph, with 1 outgoing edge per vertex. There is no way for these edges not to form a loop (at most n - 1 edges can be tree edges).
I'm going to assume that you instead mean "only one cycle; no subcycles".
In that case, your task becomes that of determining whether your graph has a Hamiltonian Cycle or not.
We can use Ore's Theorem as a quick check:
If deg v + deg w ≥ n for every pair of distinct non-adjacent vertices v and w of G then G is Hamiltonian.
Note that this says "if" and not "iff" ("if and only if"), so a graph can be Hamiltonian and still fail this check.
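As a rough illustration, Ore's condition can be checked like this. The function name `satisfies_ore` and the dict-of-sets adjacency format are assumptions for the sketch, and the `n < 3` guard reflects the usual statement of the theorem for graphs with at least 3 vertices.

```python
from itertools import combinations

def satisfies_ore(adj):
    """adj: dict mapping each vertex to the set of its neighbours (assumed format).

    Returns True when deg(v) + deg(w) >= n for every pair of distinct
    non-adjacent vertices. True implies Hamiltonian; False proves nothing.
    """
    n = len(adj)
    if n < 3:
        return False
    return all(
        len(adj[v]) + len(adj[w]) >= n
        for v, w in combinations(adj, 2)
        if w not in adj[v]
    )
```

For example, the 4-cycle passes the check, while the 5-cycle fails it even though it is Hamiltonian, which shows the check is only sufficient.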
To take things one step further, we can use the Bondy–Chvátal theorem:
A graph is Hamiltonian if and only if its closure is Hamiltonian.
We obtain the closure with a method similar to the Ore's Theorem check: we repeatedly add a new edge connecting a non-adjacent pair of vertices u and v with deg(v) + deg(u) ≥ n until no more pairs with this property can be found.
Once this is done, we check whether the closure is Hamiltonian. If the closure is a complete graph (on at least 3 vertices), then it is Hamiltonian. The converse does not hold, though: the 5-cycle is Hamiltonian, yet no non-adjacent pair of its vertices has degree sum ≥ 5, so its closure is the 5-cycle itself, which is not complete. So "the closure is complete" is a stronger sufficient condition than Ore's Theorem, but not a complete test.
In the end, you just need to determine whether the graph has a Hamiltonian cycle. I've listed above two polynomial-time checks that positively identify some such graphs (not all of them, as the 5-cycle shows).
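A minimal sketch of the closure computation described above, assuming the graph is given as a dict mapping each vertex to a set of neighbours. The names `closure` and `closure_is_complete` are made up for this example, and the loop is a straightforward repeated scan rather than an optimized implementation.

```python
from itertools import combinations

def closure(adj):
    """Return the Bondy-Chvatal closure of an undirected graph.

    adj: dict mapping each vertex to the set of its neighbours (assumed format).
    The input is not modified.
    """
    adj = {v: set(ws) for v, ws in adj.items()}
    n = len(adj)
    changed = True
    while changed:
        changed = False
        for v, w in combinations(adj, 2):
            # Join non-adjacent pairs whose degree sum is at least n.
            if w not in adj[v] and len(adj[v]) + len(adj[w]) >= n:
                adj[v].add(w)
                adj[w].add(v)
                changed = True
    return adj

def closure_is_complete(adj):
    """Sufficient (not necessary) test for a Hamiltonian cycle."""
    c = closure(adj)
    n = len(c)
    return n >= 3 and all(len(ws) == n - 1 for ws in c.values())
```

If `closure_is_complete` returns False, nothing can be concluded, and an exact search would still be needed.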

In a DAG, how to find vertices where paths converge?

I have a type of directed acyclic graph, with some constraints.
- There is only one "entry" vertex.
- There can be multiple leaf vertices.
- Once a path splits, anything under that path cannot reach into the other path (this will become clearer with some examples below).
- There can be any number of "split" vertices. They can be nested.
- A "split" vertex can split into any number of paths. The examples below only show 2 paths for each, but it could be more.
My challenge is the following: for each "split" vertex (any vertex that has at least 2 outgoing edges), find the vertices where its paths reconnect - if such a vertex exists. The solution should be as efficient as possible.
Example A:
example a
In this example, vertex A is a "split" vertex, and its "reconnect vertex" is F.
Example B:
example b
Here, there are two split vertices: A and E. For both of them vertex G is the reconnect vertex.
Example C:
example c
Now there are three split vertices: A, D and E. The corresponding reconnect vertices are:
A -> K
D -> K
E -> J
Example D:
example d
Here we have three split vertices again: A, D and E. But this time, vertex E doesn't have a reconnect vertex because one of the paths terminates early.
Sounds like what you want is:
1. Connect each vertex with out-degree 0 to a single terminal vertex.
2. Construct the dominator tree of the edge-reversed graph. The linked Wikipedia article points to a couple of algorithms for doing this.
3. The "reconnect vertex" for a split vertex is its immediate dominator in the edge-reversed graph, i.e., its parent in that dominator tree. This is called its "postdominator" in your original graph. If it's the terminal vertex that you added, then it doesn't have a reconnect vertex in your original graph.
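For small DAGs, the idea can be sketched without a full dominator-tree algorithm by computing post-dominator sets directly. This is a simple set-based sketch, not one of the algorithms from the Wikipedia article; the function name `reconnect_vertices`, the successor-dict input format, and the virtual terminal `END` are assumptions made for illustration.

```python
def reconnect_vertices(succ):
    """succ: dict mapping a vertex to a list of its successors in a DAG.

    Returns {split_vertex: reconnect_vertex or None}.
    """
    END = object()  # hypothetical virtual terminal vertex
    nodes = set(succ) | {w for ws in succ.values() for w in ws}
    # Every leaf gets an edge to the virtual terminal.
    out = {v: (list(succ.get(v, ())) or [END]) for v in nodes}
    out[END] = []

    pdom = {}

    def pdom_of(v):
        # pdom(v) = {v} plus the vertices that post-dominate every successor.
        # Recursive and memoized; fine for small graphs only.
        if v not in pdom:
            sets = [pdom_of(s) for s in out[v]]
            pdom[v] = (set.intersection(*sets) | {v}) if sets else {v}
        return pdom[v]

    result = {}
    for v in nodes:
        if len(out[v]) < 2:
            continue  # not a split vertex
        doms = pdom_of(v)
        # The strict post-dominators of v form a chain; the immediate one
        # is post-dominated by all the others, so its set is one smaller.
        ipdom = next(d for d in doms
                     if d != v and len(pdom_of(d)) == len(doms) - 1)
        result[v] = None if ipdom is END else ipdom
    return result
```

For a hypothetical graph `{"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"], "E": ["F"]}` this returns `{"A": "F"}`. The set-based version can use quadratic memory, so for large graphs a proper dominator-tree algorithm is the better choice.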
This is the problem of identifying post-dominators in compilers and program analysis. This is often used in the context of calculating control dependences in control flow graphs. "Advanced Compiler Design and Implementation" is a good reference on these topics.
If the graph does not have cycles, then the solution (a) suggested by @matt-timmermans will work.
If the graph has cycles, then solution (a) can report spurious post-dominators. In such cases, a network-flow based approach works better. The algorithm to calculate non-termination sensitive control dependence in this paper uses this approach. The basic idea is:
- at every split node, inject a unique token into the graph along each outgoing edge, and
- propagate the tokens through the graph subject to this constraint: if node n is reachable from split node m, then tokens of node m pass through node n only if all tokens of node m have arrived at node n.
At the end, node n post-dominates node m if all tokens of node m have arrived at node n.

Big O in Adjacency List - remove vertex and remove edge (time complexity cost of performing various operations on graphs)

I have to prepare an explanation of the time complexity of removing a vertex (O(|V| + |E|)) and an edge (O(|E|)) in an adjacency list.
When removing a vertex from a graph with V vertices and E edges, we of course need to go through all the edges (O(|E|)) to check which ones need to be removed along with the vertex, but why do we need to check all vertices?
I also don't understand why, in order to remove an edge, we need to go through all the edges.
I think I might have misunderstood something from the beginning, so would you kindly help with the two questions above?
To remove a vertex, you first need to find the vertex in your data structure. The time complexity of this find operation depends on the data structure you use; if you use a HashMap, it will be O(1); if you use a List, it will be O(V).
Once you have identified the vertex that needs to be removed, you now need to remove all the edges of that vertex. Since you are using an adjacency List, you simply need to iterate over the edge-list of the vertex you found in the previous step and update all those nodes. The run-time of this step is O(Deg(V)). Assuming a simple graph, the maximum degree of a node is O(V). For sparse graphs it will be much lower.
Hence the run-time of removeVertex will only be O(V).
Consider a graph like this:
A -> A
A -> B
A -> C
A -> D
B -> C
The adjacency list will look like this.
A: A -> B -> C -> D -> NULL
B: C -> NULL
C: NULL
D: NULL
Let's remove the vertex C. We have to go through all edges to see whether each one needs to be removed, which is O(|E|). Otherwise, how would you find that A->C needs to be removed? After that, we need to remove the list C: NULL from the top-level container. Depending on the top-level container, you may or may not need O(|V|) time for this. For example, if the top-level container is an array and you don't allow holes, then you need to copy the array. Or, if the top level is a list, you will need to scan through the list to find the node representing C in order to delete it.
From the original graph, let's remove the edge A->D. We have to go through the whole linked list A -> B -> C -> D to find the node D and remove it. That is why you need to go through all vertices. In the worst case, a vertex connects to all other vertices, so we need to go through all vertices to delete that element, which is O(|V|). Depending on your top-level container, again, you may or may not be able to find the list quickly, which can cost you another O(|V|), but in no case can I imagine removing an edge costing O(|E|) in an adjacency list representation.
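To make the costs concrete, here is a small sketch of both operations on the example graph. It assumes a Python dict as the top-level container (so finding a vertex's list is O(1) on average) and plain lists as the per-vertex edge lists; the function names are made up for this example.

```python
def remove_edge(adj, u, v):
    """Remove the directed edge u -> v.

    Finding adj[u] is O(1) on average with a dict; scanning the list for v
    is O(deg(u)), which is at most O(|V|) in a simple graph.
    """
    adj[u].remove(v)

def remove_vertex(adj, x):
    """Remove vertex x together with every edge pointing at it.

    Dropping the list of x is O(1) on average; filtering every remaining
    list touches each vertex and each edge once, so O(|V| + |E|) in total.
    """
    del adj[x]
    for u in adj:
        adj[u] = [w for w in adj[u] if w != x]

# The directed example graph from above.
graph = {"A": ["A", "B", "C", "D"], "B": ["C"], "C": [], "D": []}
remove_edge(graph, "A", "D")
remove_vertex(graph, "C")
```

After these two calls, `graph` is `{"A": ["A", "B"], "B": [], "D": []}`.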

Is this a proper algorithm that would accept graph G if it's connected?

I wanted to ask and see if this suffices in creating a polynomial time algorithm given a Graph P? Just want to double check.
On input (P,u,v) where P is an undirected graph with nodes u and v:
1. Create a list D of every edge in graph P with numbers from x_0 to x_k.
2. Select node u as the starting node. Utilizing BFS, traverse through the graph until the current node is the initial u value.
3. Repeat, using x_0 onwards through the list until x_k is used.
4. If BFS completes, accept. Otherwise, reject.
I believe my problem lies with the accept/reject condition, but what would it be in this case?
Your question doesn't make sense in a lot of ways, but from what I can tell, no.
Your step 2 will (slowly) find a cycle, which is not useful.
I have no clue what step 3 is supposed to be, and your variable v is unused (and having u as a parameter doesn't make sense).
For a directed graph, you need to specify whether you want to check for weak connectivity (like an undirected graph) or strong connectivity.
For weak connectivity, you need to:
1. Take as input a graph G.
2. Create a local, empty set S for seen nodes.
3. Do a depth-first search (including backwards links - watch your algorithmic complexity with how you look them up!) from some node, adding each node to S as you visit it, and skipping edges that lead from the current node to a node in S during iteration. (You can use a breadth-first search if you allocate an additional data set for the frontier.)
4. Check whether S has the same number of nodes as G.
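The weak-connectivity steps above might look like this in Python. The function name and the `(nodes, edges)` input format are assumptions for the sketch; backward links are handled by storing each edge in both directions up front, so lookups stay O(1) per edge.

```python
def is_weakly_connected(nodes, edges):
    """nodes: iterable of vertices; edges: iterable of directed (u, v) pairs."""
    nbrs = {v: set() for v in nodes}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)  # backward link
    if not nbrs:
        return True  # convention chosen here: the empty graph counts as connected

    start = next(iter(nbrs))
    seen = {start}          # the set S of seen nodes
    stack = [start]         # explicit stack instead of recursion
    while stack:
        x = stack.pop()
        for y in nbrs[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(nbrs)
```

The whole check runs in O(|V| + |E|) time.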
For strong connectivity, you need to check that the graph is connected regardless of which node you start at. There's probably a way to do this more efficiently than repeating the above for every node.
An example graph (consider this from any starting node):
1 -> 2
2 -> 3
3 -> 1
3 -> 4
4 -> 5
5 -> 6
6 -> 4
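One known way to avoid repeating the search for every node (a different technique from the loop described above) is to run just two searches from a single node: one along the edges and one along the reversed edges. The graph is strongly connected exactly when both searches reach every node. A sketch under that approach, with made-up function names, using the example graph above:

```python
def reachable(start, nbrs):
    """Return the set of nodes reachable from start via the nbrs mapping."""
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in nbrs.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def is_strongly_connected(nodes, edges):
    nodes = list(nodes)
    if not nodes:
        return True
    fwd, rev = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        rev.setdefault(v, []).append(u)
    start = nodes[0]
    # Every node must be reachable from start, and start from every node.
    return (len(reachable(start, fwd)) == len(nodes)
            and len(reachable(start, rev)) == len(nodes))

example = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6), (6, 4)]
strongly = is_strongly_connected(range(1, 7), example)
```

Here `strongly` is False: every node is reachable from node 1, but nodes 4, 5 and 6 cannot get back to it, so the example graph is weakly but not strongly connected.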

Find paths that cover all edges between two nodes

We hope you are able to help us with the following problem:
A directed graph that may contain cycles is given. One has to find a set of paths that fulfills the following criteria:
- All edges that can be traversed on the way from node A to node B must be covered by the paths in the set (one edge can be part of more than one path in the set).
- The solution does not necessarily have to be the one with the fewest paths, and the paths do not necessarily have to be the shortest ones. However, the solution should be efficiently implementable in a programming language such as Java. We need the solution to generate a few test cases, and it is important to cover all edges between node A and node B.
Does anyone know a suitable algorithm? Or does no efficient solution exist?
Thanks a lot in advance for your advice! (We have already searched for a solution, but the ones we found were focused on shortest paths and were extremely inefficient.)
Here is a graphical representation of our problem:
http://i.stack.imgur.com/wIY34.jpg
Consider all edges R(A) reachable from A. They can be found by adding a node on each edge (i.e. turning each edge U->V into U->X->V) and then performing a breadth-first search starting from A.
Edges outside of R(A) clearly cannot be on a path from A to B, since otherwise they'd be reachable from A. So all paths to B must go through edges of R(A).
So the set of edges, U, we want to "cover" are all edges of R(A) that B is reachable from.
Now we are looking for a set of paths S from A to B, which contains all edges of U.
A straightforward method is the following:
Color all edges of R(A) black and set S = { }
While there are black edges remaining:
    Take a black edge UV.
    If B is reachable from V:
        Construct a path P = A -> ... -> U -> V -> ... -> B
        Color all edges of P gray
        Add P to S
    Else:
        Color UV gray.
Then return S
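A rough Python sketch of this method follows. It uses one BFS tree from A along the edges and one BFS tree from B along the reversed edges, then glues the two tree paths together at each black edge. The function names and the edge-list input are assumptions for the example, and like the pseudocode it does not prevent a path from passing through B more than once.

```python
from collections import deque

def bfs_parents(start, nbrs):
    """Return {node: parent in the BFS tree}; the root maps to None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in nbrs.get(x, ()):
            if y not in parent:
                parent[y] = x
                queue.append(y)
    return parent

def covering_paths(edges, a, b):
    """Return a list of a-to-b paths covering every edge on some a-to-b walk."""
    fwd, rev = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        rev.setdefault(v, []).append(u)
    from_a = bfs_parents(a, fwd)   # nodes reachable from a
    to_b = bfs_parents(b, rev)     # nodes from which b is reachable

    # Black edges: start is reachable from a, and b is reachable from the end.
    black = {(u, v) for u, v in edges if u in from_a and v in to_b}
    paths = []
    for u, v in edges:
        if (u, v) not in black:
            continue
        head, x = [], u
        while x is not None:       # climb the tree from u back to a
            head.append(x)
            x = from_a[x]
        head.reverse()
        tail, x = [], v
        while x is not None:       # follow tree edges from v on to b
            tail.append(x)
            x = to_b[x]
        path = head + tail
        black -= set(zip(path, path[1:]))  # color the path's edges gray
        paths.append(path)
    return paths
```

Each path is built by walking two parent chains, so this simple version can take more than O(|E|) time in total; the implicit path representation mentioned in the complexity discussion would be needed to avoid that.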
As @user189 pointed out, if we consider edges reachable from A that go through B, we are allowing paths that go through B twice (e.g. a->b->c->g->f->e in the example image).
His suggested solution (removing the node B before computing R(A)) fixes this.
Regarding complexity:
R(A) can be computed in O(|E|) time and the paths from A to an edge UV in R(A) can be directly read from the BFS tree. To check for reachability to B from V and to find the path, we can use a BFS tree starting from B and following edges backwards, computed in O(|E|) time.
If we reference the paths implicitly through the edge UV that connects the two BFS trees, and use a O(1) read/update structure to maintain the set of black edges and to look up edges in the BFS trees, I think we can do this in O(|E|) time.
