Extract chain components from mixed graphs - algorithm

Given a mixed acyclic graph consisting of directed and undirected edges, I want to decompose it into a directed graph of chain components (the nodes within a chain component are connected to each other only by undirected edges), together with an ordering of those components.
I am unsure whether I should first topologically sort along the directed edges and then collect the undirected edges into chain components, or first go over all undirected edges, give the groups they form ids, and then use the directed edges to connect those components.
Since the graph is acyclic, I think it's possible to order the components from low-numbered to high-numbered ones, but I couldn't come up with a solid answer.

I think both of your methods would work fine.
To my mind the second method seems more natural.
If I was doing this in networkx I would implement your second method by:
Create a new graph H containing all vertices but only the undirected edges.
Call connected_components on H to extract the chain components and assign each component a different group id.
Create a new graph F with 1 node for each group id. Connect groups in F with directed edges based on the directed edges in the original graph.
Call topological_sort on F to compute the ordering of the group ids.
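The four steps above can be sketched as follows. This is a minimal sketch in plain Python rather than networkx, so that it runs without third-party packages; the function name `chain_component_order` and the input format (a node list plus separate lists of undirected and directed edges) are assumptions made for illustration. A small union-find stands in for `connected_components`, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for `topological_sort`.

```python
from graphlib import TopologicalSorter

def chain_component_order(nodes, undirected, directed):
    """Return the chain components as sets, in a topological order.

    Hypothetical helper: nodes is an iterable of hashable labels,
    undirected and directed are iterables of (u, v) pairs.
    """
    # Steps 1-2: union-find over the undirected edges only.
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in undirected:
        parent[find(u)] = find(v)

    components = {}
    for v in nodes:
        components.setdefault(find(v), set()).add(v)

    # Step 3: one node per component; a directed edge u -> v makes
    # the component of u a predecessor of the component of v.
    # Directed edges inside a single component are ignored here.
    preds = {root: set() for root in components}
    for u, v in directed:
        ru, rv = find(u), find(v)
        if ru != rv:
            preds[rv].add(ru)

    # Step 4: order the component ids (raises CycleError if cyclic).
    order = TopologicalSorter(preds).static_order()
    return [components[root] for root in order]

result = chain_component_order(
    nodes=["a", "b", "c", "d", "e"],
    undirected=[("b", "c")],
    directed=[("a", "b"), ("c", "d")],
)
```

In this small example `{"a"}` comes before `{"b", "c"}`, which comes before `{"d"}`; the isolated node `e` forms its own component and may appear anywhere in the order.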

The equivalence relation that defines a chain component is the following by Drton 2009:
Define two vertices v_0 and v_k in a chain graph G to be equivalent if there exists a path (v_0,..., v_k) such that v_i − v_{i+1} in G for all 0 ≤ i ≤ k − 1.
The equivalence classes under this equivalence relation are the chain components of G. Roughly speaking, this means all connected components of the graph made from the undirected edges of the chain graph, plus all the nodes that are incident only with directed edges, plus all the nodes that have no neighbors at all, which amounts to Peter's answer.
Here is the function that correctly decomposes the chain graph CH-Asia given in Cowell 2005, Probabilistic Networks and ..., p. 110, Fig. 6.1.
It is part of a graphical models library I have been developing as a hobby project.
Though it uses custom data structures, it should not be too hard to adapt to other code bases involving graphical models.
def get_chain_components(self) -> Set[Set[Node]]:
    """!
    """
    # filter out undirected edges
    edges = set()
    for e in self.edges():
        if e.type() == EdgeType.UNDIRECTED:
            edges.add(e)
    # make a graph from undirected edges
    undi = UndiGraph.from_graph(Graph.from_edge_node_set(edges, self.nodes()))
    return undi.get_components_as_node_set()

A mixed graph such as you describe is again a directed graph. Simply replace each undirected edge with two directed ones pointing in opposite directions.
Also, you can't have an acyclic graph that has undirected edges. At least a cycle of length 2 will always exist, so I am not sure what you mean by this.
It seems you are looking for the strongly connected components in this graph, so I advise you to use Tarjan's algorithm for finding them.

Related

Cycles between two vertices in a directed graph

I know that in an undirected graph you have to have at least three vertices to form a cycle. My question is, in a directed graph, is it considered a cycle if two vertices have two edges pointing to each other?
Here is an example:
Is this a cyclic graph?
Related questions:
In an undirected graph, the simplest cycle must have 3 nodes?
Existence of cycle between any two vertices of graph
Cycles in an Undirected Graph
A graph has a cycle if there is a non-empty path that originates at some vertex and ends at the same vertex. In your graph above, you have a cycle on path A -> C -> A. Similarly, let's imagine a directed graph with 2 vertices A and B and 2 edges AB and BA (where the first letter is the source vertex). This means that there is a cycle A -> B -> A, thus you can have a cycle in a directed graph of 2 vertices.
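The definition above can be checked mechanically. The following is a minimal sketch of a three-colour depth-first search for directed cycles; the function name `has_cycle` and the adjacency-dict input (every vertex is a key, values are successor lists) are assumptions made for illustration.

```python
def has_cycle(adj):
    """Detect a directed cycle with a three-colour depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in adj}

    def visit(v):
        colour[v] = GREY
        for w in adj[v]:
            if colour[w] == GREY:
                return True  # edge back into the current path closes a cycle
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in adj)

two_cycle = {"A": ["B"], "B": ["A"]}  # edges AB and BA
one_way = {"A": ["B"], "B": []}       # edge AB only
```

With these inputs, `has_cycle(two_cycle)` reports a cycle and `has_cycle(one_way)` does not, matching the A -> B -> A example.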
I would say it (A-C-A) is a cycle. But I come at it from a different perspective: you may know that a directed acyclic graph (DAG) has a topological sorting, whereas a digraph with a cycle does not.
A topological sorting is a linear extension of a partial order <=. Thus, a DAG is the graphical representation of a partial order <=. Be aware that, by the anti-symmetry property of a partial order <= (i.e., if a<=b and b<=a, then a=b), two edges (a,b) and (b,a) cannot both exist between two distinct vertices a and b.
In summary: no cycle implies that a topological sorting exists. Since there is no topological sorting of your digraph, there must be a cycle (A-C-A).
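The link between cycles and topological sorting can be tried out with the standard library. This is a minimal sketch using `graphlib.TopologicalSorter` (Python 3.9+), which raises `CycleError` when no topological order exists; the helper name `topological_order_or_none` is an assumption made for illustration.

```python
from graphlib import TopologicalSorter, CycleError

def topological_order_or_none(preds):
    """Return a topological order, or None if the digraph has a cycle.

    preds maps each vertex to the set of its predecessors.
    """
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None

dag = {"C": {"A"}, "A": set()}     # single edge A -> C
cyclic = {"C": {"A"}, "A": {"C"}}  # edges A -> C and C -> A
```

The first graph has a topological order, while the second, with edges in both directions between A and C, has none.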
No, it is not considered a cycle if two vertices have two edges pointing to each other in a directed graph. They are called parallel edges.
According to this definition 1:
A circuit is a closed trail with at least one edge
A-C is considered a circuit.
A-C also complies with this definition 2:
A cycle is a circuit in which no vertex except the first (which is
also the last) appears more than once.
so it is also a cycle.
1 source: https://proofwiki.org/wiki/Definition:Circuit
2 source: https://proofwiki.org/wiki/Definition:Cycle_(Graph_Theory)

Finding a Minimal set of vertices that satisfy the given constraints

Note: no need for formal proof or anything, just the general idea of the algorithm and I will go deeper myself.
Given a directed graph G(V,E), I want to find, in O(V+E) time, the smallest set of vertices T such that for each vertex t in T the following edges don't exist: {(t,v) | for every v outside T}.
In other words, it's allowed for t to get edges from vertices outside T, but not to send.
(You can picture it as phone calls: I am allowed to be called from outside, and that is free, but I am not allowed to call outside myself.)
This problem seems very close to finding all strongly connected components (SCCs) in a directed graph, whose time complexity is O(V+E). I'm thinking of building a new graph and running that algorithm, but I am not totally sure about it.
The main idea is to contract each strongly connected component (SCC) of G into a single vertex while keeping a score of how many vertices were contracted to create each vertex in the contracted graph (the condensation of G). The resulting graph is a directed acyclic graph. The answer is the vertex with the lowest score among those with out-degree 0.
The answer must be a union of strongly connected components because of the restriction on edges, and you can prove that it consists of exactly one SCC because of the minimality requirement.
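The idea above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it uses Kosaraju's two-pass algorithm to find the SCCs (Tarjan's algorithm would serve equally well) and then picks the smallest SCC with no outgoing edge. The function name `smallest_sink_component` and the adjacency-dict input, in which every vertex is a key, are assumptions made for illustration.

```python
def smallest_sink_component(adj):
    """Return the smallest SCC that has no edge leaving it.

    adj maps every vertex to a list of its successors (non-empty graph).
    """
    # Pass 1: iterative DFS, recording vertices in order of completion.
    order, seen = [], set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(adj[start]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:  # all successors explored
                order.append(v)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing completion order.
    radj = {v: [] for v in adj}
    for v in adj:
        for w in adj[v]:
            radj[w].append(v)
    comp = {}
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = root
        todo = [root]
        while todo:
            v = todo.pop()
            for w in radj[v]:
                if w not in comp:
                    comp[w] = root
                    todo.append(w)

    # Group vertices by component; drop components with an outgoing edge.
    members = {}
    for v, c in comp.items():
        members.setdefault(c, set()).add(v)
    has_exit = {comp[v] for v in adj for w in adj[v] if comp[v] != comp[w]}
    sinks = [m for c, m in members.items() if c not in has_exit]
    return min(sinks, key=len)

example = {1: [2], 2: [1, 3], 3: [4], 4: [5], 5: [3], 6: [7], 7: [], 8: [6]}
answer = smallest_sink_component(example)
```

In `example` the sink components are `{3, 4, 5}` and `{7}`, so the smaller one, `{7}`, is returned. Both passes touch each vertex and edge a constant number of times, which keeps the sketch within O(V+E).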

Algorithm for dividing graph into edge pairs

I've received a task to find an algorithm which divides a graph G(V,E) into pairs of neighboring edges (colors the graph, such that every pair of neighboring edges has the same color).
I've tried to tackle this problem by drawing out some random graphs and came to a few conclusions:
If a vertex is connected to 2(4,6,8...) vertices of degree 1, these make a pair of edges.
If a vertex of degree 1 is directly connected to a cycle, it doesn't matter which edge of the cycle is paired with the lone edge.
However, I couldn't come up with any other conclusions, so I tried a different approach. I thought about using DFS, finding articulation points and dividing graph into subgraphs with an even number of edges, because those should be dividable by this rule as well, and so on until I end up with only subgraphs of |E(G')| = 2.
Another thing I've come up with is to create a graph G', where E(G) = V(G') and V(G) = E(G'). That way I could get a graph, where I could remove pairs of vertices (former edges) either via DFS or always starting with leaf vertices along with their adjacent vertices.
The last technique is most appealing to me, but it seems to be the slowest one. Any feedback or tips on which of these methods would be the best is much appreciated.
EDIT: In other words, imagine the graph as a layout of a town. Vertices being crossroads, edges being the roads. We want to decorate (sweep, color) each road exactly once, but we can only decorate two connected roads at the same time. I hope this helps for clarification.
For example, having graph G with E={ab,bd,cd,ac,ae,be,bf,fd}, one of possible pair combinations is P={{ab,bf},{ac,cd},{ae,eb},{bd,df}}.
One approach is to construct a new graph G where:
A vertex in G corresponds to an edge in the original graph
An edge in G connects vertices a and b in G where a and b represent edges in the original graph that meet at a vertex in the original graph
Then, if I have understood the original problem correctly, the objective for G is to find the maximum matching, which can be done, for example, with the Blossom algorithm.
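To make the construction concrete, here is a minimal sketch that builds the derived graph for the example from the question and checks that the pairing P is a matching covering every edge. It does not implement the Blossom algorithm; the helper names `norm`, `derived_graph` and `is_perfect_matching`, and the two-letter edge encoding, are assumptions made for illustration.

```python
from itertools import combinations

def norm(edge):
    """Normalise an edge label so that "eb" and "be" compare equal."""
    return frozenset(edge)

def derived_graph(edges):
    """One vertex per original edge; adjacent if they share an endpoint."""
    es = [norm(e) for e in edges]
    return {frozenset((e, f)) for e, f in combinations(es, 2) if e & f}

def is_perfect_matching(edges, pairs):
    """Check that pairs covers each edge once using derived-graph edges."""
    derived = derived_graph(edges)
    used = [norm(e) for pair in pairs for e in pair]
    each_once = (len(used) == len(set(used))
                 and set(used) == {norm(e) for e in edges})
    adjacent = all(frozenset(norm(e) for e in pair) in derived
                   for pair in pairs)
    return each_once and adjacent

E = ["ab", "bd", "cd", "ac", "ae", "be", "bf", "fd"]
P = [("ab", "bf"), ("ac", "cd"), ("ae", "eb"), ("bd", "df")]
bad = [("ab", "cd"), ("ac", "bf"), ("ae", "eb"), ("bd", "df")]
```

The pairing P from the question passes the check, while `bad` fails because `ab` and `cd` do not share a vertex.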

Can there be self loops for undirected graphs?

I mean directed graphs can have a self-loop, so I don't see the reason why an undirected graph cannot have it (CLRS says it's forbidden without giving a valid reason).
Example:
G_directed = (V,E) is a directed graph
Say this graph has the vertex set V = {1,2,3,4,5,6}
With edges E = {(1,2),(2,2),(2,4),(2,5),(4,1),(4,5),(5,4),(6,3)}
-----------------------------------------------------------------
Say we now decide to turn G_directed into an undirected graph:
G_undirected = (Vu,Eu) is an undirected graph
Vu = {1,2,3,4,5,6}
With edges Eu = {(1,2),(2,2),(2,4),(2,5),(4,1),(4,5),(6,3)}
In the example (2,2) is the self loop. I seriously don't see any problems this can have with graph traversals.
There are several categories of undirected graphs. Those in which loops (self-references) are not allowed are called simple graphs. But there is indeed no reason not to consider undirected graphs with loops, and even with multiple edges between the same pair of nodes; these are called multigraphs:
A loop is an edge (directed or undirected) that connects a vertex to itself; it may be permitted or not, according to the application.
A multigraph, as opposed to a simple graph, is an undirected graph in which multiple edges (and sometimes loops) are allowed.
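As a small illustration that a loop causes no trouble for traversal, here is a minimal sketch of a breadth-first search over the undirected edge set from the question, including the loop (2, 2). The function name `bfs_order` and the adjacency-list representation are assumptions made for illustration.

```python
from collections import deque

def bfs_order(adj, start):
    """Return vertices reachable from start in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in seen:  # a self-loop only leads to a seen vertex
                seen.add(w)
                queue.append(w)
    return order

# Undirected edges from the question, including the loop (2, 2).
edges = [(1, 2), (2, 2), (2, 4), (2, 5), (4, 1), (4, 5), (6, 3)]
adj = {v: [] for v in range(1, 7)}
for u, v in edges:
    adj[u].append(v)
    if u != v:
        adj[v].append(u)  # store a loop only once
visited = bfs_order(adj, 1)
```

The usual visited-set check already skips the loop, since vertex 2 is marked as seen by the time its own adjacency list is scanned.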

Minimize set of edges in a directed graph keeping connected components

Here is the full question:
Assume we have a directed graph G = (V,E), we want to find a graph G' = (V,E') that has the following properties:
G' has same connected components as G
G' has same component graph as G
E' is minimized. That is, E' is as small as possible.
Here is what I got:
First, run the strongly connected components algorithm. Now we have the strongly connected components. Now go to each strongly connected component and within that SCC make a simple cycle; that is, a cycle where the only nodes that are repeated are the start/finish nodes. This will minimize the edges within each SCC.
Now, we need to minimize the edges between the SCCs. Alas, I can't think of a way of doing this.
My 2 questions are: (1) Does the algorithm prior to the part about minimizing edges between SCCs sound right? (2) How does one go about minimizing the edges between SCCs?
For (2), I know that this is equivalent to minimizing the number of edges in a DAG. (Think of the SCCs as the vertices). But this doesn't seem to help me.
The algorithm seems right, as long as you allow for closed walks (i.e. repeating vertices.) Proper cycles might not exist (e.g. in an "8" shaped component) and finding them is NP-hard.
It seems that it is sufficient to group the inter-component edges by ordered pairs of components they connect and leave only one edge in each group.
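The grouping suggested above can be sketched as follows, assuming the SCC label of every vertex has already been computed. The function name `reduce_inter_component_edges` and the `comp` mapping from vertex to component label are assumptions made for illustration.

```python
def reduce_inter_component_edges(edges, comp):
    """Keep one edge per ordered pair of distinct components.

    edges is an iterable of (u, v) pairs; comp maps vertex -> SCC label.
    Edges inside a component are left to the per-component step.
    """
    kept = {}
    for u, v in edges:
        cu, cv = comp[u], comp[v]
        if cu != cv:
            # The first edge seen for this ordered pair is the one kept.
            kept.setdefault((cu, cv), (u, v))
    return list(kept.values())

comp = {1: "X", 2: "X", 3: "Y", 4: "Y", 5: "Z"}
edges = [(1, 2), (2, 1), (1, 3), (2, 4), (3, 4), (4, 3), (4, 5), (3, 5)]
kept_edges = reduce_inter_component_edges(edges, comp)
```

Here the two X-to-Y edges collapse to `(1, 3)` and the two Y-to-Z edges collapse to `(4, 5)`, so the component graph still has exactly the edges X -> Y and Y -> Z.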
Regarding step 2, minimizing the edges between the SCCs, you could randomly select a vertex and run DFS, keeping only the longest path for each pair of (root, end) while removing the other paths. Store all the vertices searched in a list L.
Choose another vertex, if it exists in L, skip to the next vertex; if not, repeat the procedure above.
