I am trying to write a maximum graph matching algorithm for my thesis. I am stuck on how to store the augmenting path during the augmentation step of the algorithm. First, I will describe the actual problem I am trying to solve. Then I will try to simplify it.
Suppose you have a graph where a node can be matched with one of its neighbours, and you are trying to calculate the maximum matching. Each node is connected to exactly three other nodes. You did your best, but some nodes are left unmatched, so you need to do augmentation. At this point, you already know the matchings and the list of nodes that remain unmatched. The augmentation process works like this: you pick two nodes from the list of unmatched nodes and find an alternating path between them. An alternating path consists of the two nodes mentioned and a list of matched nodes in between. It is called an alternating path because matched and unmatched edges alternate along the path.
You can find an alternating path between any two unmatched nodes, but picking ones that are close together is better for the performance of the algorithm. So instead, you pick an unmatched node and do a BFS until you reach another unmatched node. When you find one such path (which satisfies the alternating path rule), you swap the matchings along it so the two unmatched nodes are now matched.
In short, I need to run some sort of BFS algorithm where there are two types of edges (let's say red and black) and these edges shall alternate along the path until I reach my destination. In the end, I need the path from my source to the closest of the possible destinations which satisfy the red/black rule.
I was able to come up with the pseudocode of the algorithm, but I don't know how to store the alternating path that the algorithm finds. How can I do it? Any alternative strategies are also welcome.
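One common way to store the path is to keep a predecessor map during the BFS and walk it backwards once a free node is reached. The following is only a minimal sketch under stated assumptions: the names `alternating_path`, `augment`, `adj` and `mate` are hypothetical, the graph is a dict of adjacency lists, and the search state is the pair (node, whether the next edge must be a matched edge). It finds a shortest alternating walk. In a bipartite graph that walk is a simple path, but in a general graph (such as the 3-regular one described above) odd cycles can make the walk revisit a node, which is what blossom handling is for, and this sketch does not attempt it.

```python
from collections import deque

def alternating_path(adj, mate, source):
    """Return a shortest alternating walk from the unmatched node `source`
    to another unmatched node, or None if the BFS finds none.

    adj:  dict mapping node -> list of neighbours (assumed representation)
    mate: dict mapping node -> matched partner; unmatched nodes are absent
    """
    start = (source, False)          # the first edge out of a free node is unmatched
    parent = {start: None}           # predecessor of every visited state
    queue = deque([start])
    while queue:
        node, need_matched = queue.popleft()
        if need_matched:
            # Only the matched edge may be followed from here.
            candidates = [mate[node]] if node in mate else []
        else:
            # Any edge except the matched one may be followed.
            candidates = [v for v in adj[node] if mate.get(node) != v]
        for v in candidates:
            state = (v, not need_matched)
            if state in parent:
                continue
            parent[state] = (node, need_matched)
            if not need_matched and v not in mate and v != source:
                # Reached another free node: rebuild the path from the parents.
                path = []
                cur = state
                while cur is not None:
                    path.append(cur[0])
                    cur = parent[cur]
                return path[::-1]
            queue.append(state)
    return None

def augment(mate, path):
    """Flip the matching along `path` (in place): edges 0-1, 2-3, ... become matched."""
    for i in range(0, len(path) - 1, 2):
        u, v = path[i], path[i + 1]
        mate[u] = v
        mate[v] = u
```

For instance, on the path graph a-b-c-d with only b-c matched, `alternating_path` started at `a` returns the nodes a, b, c, d in order, and `augment` then matches a-b and c-d.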
Is there an efficient way to remove all edges that are not part of a longest paths between two nodes in a DAG?
For example, for the graph (DAG): (1->2, 2->3, 2->4, 1->3, 1->4) I want to remove the edges 1->3, 1->4 since the paths 1->2->3, 1->2->4 are longer
Edit: I think the best way is to use a topological sort and traverse the array from right to left while aggregating, for each node, its descendants. Then for each edge a->b we can check whether b is reachable from a through the other direct descendants of a (and if so, delete the edge). But I didn't find an implementation and I'm not sure it's correct. Is anyone aware of an implementation of something like this?
The algorithm you suggest is correct, because if an edge is on any longest path, then it must be the longest and only path from its source to target vertices. You can therefore remove any edge that is redundant.
The name for what you are trying to do is "transitive reduction": https://en.wikipedia.org/wiki/Transitive_reduction
While your algorithm works, I don't see that the topological sort is doing you any good. The simple algorithm is to do a search from each vertex to find other ways of reaching its adjacent vertices.
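A minimal sketch of that simple algorithm, assuming the DAG is given as a dict of adjacency lists (the function name `transitive_reduction` and the representation are my own choices for illustration): for every edge u->v, search from u's other successors and drop the edge if v can still be reached.

```python
def transitive_reduction(adj):
    """Return a copy of the DAG `adj` (dict: node -> list of successors)
    without edges u->v for which another path from u to v exists."""

    def reachable_without_direct_edge(src, dst):
        # Iterative DFS starting from every successor of src except dst itself.
        stack = [v for v in adj[src] if v != dst]
        seen = set(stack)
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            for w in adj.get(u, ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    return {
        u: [v for v in successors if not reachable_without_direct_edge(u, v)]
        for u, successors in adj.items()
    }
```

On the example from the question, `{1: [2, 3, 4], 2: [3, 4], 3: [], 4: []}`, this removes 1->3 and 1->4 and keeps the rest. It does one search per edge, so it is a simple approach rather than the fastest one.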
Given an undirected, unweighted graph with any type of connectivity, i.e. it can consist of one or several components, with or without isolated nodes. Each node can have zero or more connections, and cycles are allowed (but no loops from a node to itself).
I need to find the maximum number of vertex pairs, assuming that each vertex can be used only once. For example, if the graph has nodes 1, 2, 3 and node 3 is connected to nodes 1 and 2, the answer is one (1-3 or 2-3).
I am thinking about the following approach:
1. Remove all isolated nodes.
2. Find an edge connecting a node with the minimal number of edges to a node with the maximal number of edges (if there are several, take any of them), count this pair, and remove both nodes from the graph.
3. Repeat step 2 while the graph has connected nodes.
My questions are:
Does it provide the maximum number of pairs in every case? I am worried about some extremes, like cycles connected by a single path or by several paths, etc.
Is there any faster and correct algorithm?
I can use Java or Python, but pseudocode or just a description of the algorithm is perfectly fine.
Your approach is not guaranteed to provide the maximum number of vertex pairs even in the case of a cycle-free graph. For example, in the following graph your approach is going to select the edge (B,C). After that unfortunate choice, there are no more vertex pairs to choose from, and therefore you'll end up with a solution of size 1. Clearly, the optimal solution contains two vertex pairs, and hence your approach is not optimal.
The problem you're trying to solve is the Maximum Matching Problem (not to be confused with the Maximal Matching Problem which is trivial to solve):
Find the largest subset of edges S such that no vertex is incident to more than one edge in S.
The Blossom Algorithm solves this problem in O(EV^2).
The way the algorithm works is not straightforward and it introduces nontrivial notions (like a contracted matching, forest expansions and blossoms) to establish the optimal matching. If you just want to use the algorithm without fully understanding its intricacies you can find ready-to-use implementations of it online (such as this Python implementation).
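To make the maximal versus maximum distinction above concrete, here is a small sketch. The four-vertex path A-B-C-D is my own example and is not necessarily the graph the answer referred to, and all function names are hypothetical. The greedy routine produces a maximal matching (no edge can be added), while the brute-force routine finds a maximum one. The brute force is exponential, so it is only meant for checking tiny cases, not as a replacement for the Blossom Algorithm.

```python
from itertools import combinations

def is_matching(edges):
    """True if no vertex appears in more than one of the given edges."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True

def greedy_maximal_matching(edges):
    """Take edges in the given order whenever both endpoints are still free."""
    chosen, used = [], set()
    for u, v in edges:
        if u not in used and v not in used:
            chosen.append((u, v))
            used.update((u, v))
    return chosen

def brute_force_maximum_matching(edges):
    """Try subsets from largest to smallest; exponential, for tiny graphs only."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            if is_matching(subset):
                return list(subset)
    return []

# Path A-B-C-D, with the middle edge listed first to mimic an unlucky choice.
edges = [("B", "C"), ("A", "B"), ("C", "D")]
print(greedy_maximal_matching(edges))
print(brute_force_maximum_matching(edges))
```

Here the greedy result has one pair while the brute-force result has two, which is exactly the gap between a maximal and a maximum matching.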
I just want to get the distance from a source node to every other node. This differs from general graph problems because the graph is a tree, so the path between any two nodes is unique, and I expect a more efficient solution to exist.
Is it possible to get the answer in efficient time?
You're absolutely right that in a tree, the difficulty of finding a path between two nodes is a lot lower than in a general graph because once you find any path (at least, one without cycles) you know it's the shortest. So all you have to do is just find all paths starting at the given node and going to each other node. You can do this with either a depth-first or a breadth-first search in time O(n). To find the lengths, just keep track of the lengths of the edges you've seen along the paths you've traveled as you travel them.
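As a rough sketch of the traversal described above, assuming the tree is stored as a dict mapping each node to a list of `(neighbour, edge_length)` pairs (that representation and the name `tree_distances` are my assumptions, not something from the question):

```python
def tree_distances(adj, source):
    """Distances from `source` to every node of a tree in O(n).

    adj: dict mapping node -> list of (neighbour, edge_length) pairs,
         with each undirected edge listed in both directions.
    """
    dist = {source: 0}
    stack = [source]                 # iterative depth-first search
    while stack:
        u = stack.pop()
        for v, length in adj[u]:
            if v not in dist:        # in a tree, the first visit is the only path
                dist[v] = dist[u] + length
                stack.append(v)
    return dist
```

Each node is pushed once and each edge is looked at twice, which gives the O(n) bound. Swapping the stack for a queue turns this into the breadth-first variant with the same result.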
This is not different from "graph problems": a tree is a special case of a graph. Dijkstra's algorithm is the standard approach for shortest paths in a graph. Just modify it a little: keep all of the path lengths as you find them, and don't worry about the compare-update step, since every node is reached by exactly one path. Continue until you run out of nodes to check, and there are your path lengths.
I was thinking about how to find a longest possible path in a complete, directed graph for every single vertex.
Example of such a graph
So for every single vertex I want to find the maximum possible number of vertices that one can travel through (not going through any vertex more than once) and a specific path that has that length.
For example, in the given graph, for starting vertex no. 1, the maximum length is 4 and the path is:
1,4,2,3 or 1,2,3,4 (I just need to get one of them, not all).
What kind of algorithm could handle that?
In case it matters I use C++.
I'm looking for an algorithm to count the number of paths crossing a specific node in a DAG (similar to the concept of 'betweenness'), with the following conditions and constraints:
I need to do the counting for a set of source/destination nodes in the graph, and not all nodes, i.e. for a middle node n, I want to know how many distinct shortest paths from set of nodes S to set of nodes D pass through n (and by distinct, I mean every two paths that have at least one non-common node)
What algorithms would you suggest for this, considering that the DAG may be very large but sparse in edges, so deeply nested loops over nodes should be avoided?
You could use a breadth-first search for each pair of source/destination nodes and see which of the shortest paths have your given node in them. You would have to modify the search slightly so that once you've found your first shortest path, you continue to empty the queue until you reach a path that is longer than it. That way you collect all shortest paths rather than whichever one happens to be found first. This is only an option with unweighted graphs, of course.
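A related variant, which is not the per-pair enumeration described above but a swapped-in counting technique: let each BFS also count the number of shortest paths to every node. Then the number of shortest s-to-d paths through n is the product of the counts for s-to-n and n-to-d, whenever dist(s, n) + dist(n, d) equals dist(s, d). This is only a sketch for unweighted graphs. The names `bfs_counts` and `shortest_paths_through` and the adjacency-dict representation are assumptions of mine, and a source or destination equal to n is counted as passing through n.

```python
from collections import deque

def bfs_counts(adj, s):
    """BFS from s returning (dist, count): shortest distance and the
    number of distinct shortest paths from s to each reachable node."""
    dist = {s: 0}
    count = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u is a shortest-path predecessor of v
                count[v] += count[u]
    return dist, count

def shortest_paths_through(adj, sources, dests, n):
    """Number of shortest paths from any s in sources to any d in dests
    that pass through node n (unweighted edges assumed)."""
    dist_n, count_n = bfs_counts(adj, n)
    total = 0
    for s in sources:
        dist_s, count_s = bfs_counts(adj, s)
        if n not in dist_s:
            continue
        for d in dests:
            if d in dist_s and d in dist_n and dist_s[n] + dist_n[d] == dist_s[d]:
                total += count_s[n] * count_n[d]
    return total
```

This runs one BFS per source plus one from n, so it avoids enumerating paths explicitly, which may matter on a large sparse DAG.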