We know the standard algorithm for finding a maximum matching in a general graph.
https://en.wikipedia.org/wiki/Blossom_algorithm
What I am trying to understand is why blossoms need to be handled separately.
I think finding an augmenting path and complementing it is enough, and that this works with odd cycles as well.
You are correct that you can form a maximum matching by using the following general algorithm:
Look for an augmenting path in the current matching.
If such a path exists, augment along that path to increase the size of the matching by one, then repeat.
If no such path exists, your matching is maximum and you're done.
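For concreteness, here is a minimal sketch of the augmentation step itself, assuming the matching is stored as a match[] array (match[v] is v's partner, or -1 if v is free) and that an augmenting path has already been found; the representation is my own, not from any particular source:

    // Flip matched/unmatched edges along an augmenting path.
    // path[0] and path[path.length - 1] are free vertices, and the path
    // alternates unmatched, matched, unmatched, ... edges.
    static void augment(int[] match, int[] path) {
        // Match up (path[0], path[1]), (path[2], path[3]), ...: each formerly
        // unmatched edge becomes matched, the formerly matched edges between
        // those pairs are overwritten, and the matching grows by one.
        for (int i = 0; i + 1 < path.length; i += 2) {
            match[path[i]] = path[i + 1];
            match[path[i + 1]] = path[i];
        }
    }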
The challenge, though, is determining whether such an augmenting path exists in the graph. With a small graph it might not be all that hard to find a path, but as the graphs get larger and larger and the partial matchings increase in size it can become pretty challenging. Brute-force searching through all the possible paths is not feasible at this scale.
In the case where the graph is bipartite (there are no odd-length cycles), there are nice algorithms for this based on a modification of breadth-first or depth-first search. They work nicely because you can classify nodes as being either at an odd or even distance from a start node. With odd cycles this classification breaks down, and these simple algorithms no longer work.
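To make that concrete, here is a sketch of the classic DFS-based augmenting-path search for the bipartite case (often credited to Kuhn); the names and representation are mine:

    import java.util.*;

    // Maximum matching in a bipartite graph via repeated augmenting-path DFS.
    // adj.get(u) lists the right-side vertices adjacent to left vertex u.
    class BipartiteMatching {
        static int maxMatching(List<List<Integer>> adj, int nLeft, int nRight) {
            int[] matchRight = new int[nRight];   // partner of each right vertex, -1 if free
            Arrays.fill(matchRight, -1);
            int size = 0;
            for (int u = 0; u < nLeft; u++)
                if (tryAugment(u, adj, matchRight, new boolean[nRight]))
                    size++;                       // found an augmenting path from u
            return size;
        }

        // DFS for an augmenting path starting at the free left vertex u.
        static boolean tryAugment(int u, List<List<Integer>> adj,
                                  int[] matchRight, boolean[] visited) {
            for (int v : adj.get(u)) {
                if (visited[v]) continue;
                visited[v] = true;
                // v is free, or v's current partner can be rematched elsewhere.
                if (matchRight[v] == -1
                        || tryAugment(matchRight[v], adj, matchRight, visited)) {
                    matchRight[v] = u;
                    return true;
                }
            }
            return false;
        }
    }

The visited[] marking is where the odd/even-distance classification does its work, and it is exactly this step that odd cycles break in a general graph.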
The reason the blossom algorithm exists, and the reason blossoms are there in the first place, is that they provide a mechanism for efficiently searching the graph for augmenting paths even in the presence of odd cycles. The intuition is that every time you see a blossom, you can contract it down to a single point in a way that doesn't mess up your ability to find augmenting paths. Contracting an odd cycle reduces the size of the graph, and by recursion you'll either end up with a graph where it's easy to find an augmenting path, or you'll find that there aren't any.
So in a sense, the reason we use blossoms is that they enable us to efficiently (in polynomial time) check whether augmenting paths exist and, if so, find them and augment across them.
Most of the time, when implementing a pathfinding algorithm such as A*, we seek to minimize the travel cost along the path. We could also seek to find the optimal path with the fewest number of turns. This could be done by, instead of having a grid of location states, having a grid of location-direction states. For any given location in the old grid, we would have 4 states in that spot representing that location moving left, right, up, or down. That is, if you were expanding to a node above you, you would actually be adding the 'up' state of that node to the priority queue, since we've found the quickest route to this node when going up.

If you were going that direction anyway, we wouldn't add anything to the weight. However, if we had to turn from the current node to get to the expanded node, we would add a small epsilon to the weight, so that two paths of equal length would not be equal in cost if their numbers of turns differed. As long as epsilon is much smaller than the cost of moving between nodes, the result is still a shortest path.
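As a minimal sketch of that weighting (STEP_COST and TURN_EPSILON are names I made up for illustration):

    // Cost of expanding from a (location, direction) state into a neighboring
    // cell reached by moving in newDir. STEP_COST is the base cost of one
    // move; TURN_EPSILON breaks ties in favor of fewer turns.
    static final double STEP_COST = 1.0;
    static final double TURN_EPSILON = 1e-6;

    static double stepCost(int currentDir, int newDir) {
        double cost = STEP_COST;
        if (currentDir != newDir) cost += TURN_EPSILON;  // turning penalty
        return cost;
    }

For this to preserve shortest paths, epsilon times the maximum possible number of turns on any path must stay below one step cost.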
I now pose a similar problem, but with relaxed constraints. I no longer wish to find the shortest path, nor even a path with the fewest turns. My only goal is to find a path of ANY length with numTurns <= n. To clarify, the goal of this algorithm would be to answer the question:
"Does there exist a path P from locations A to B such that there are fewer than or equal to n turns?"
I'm asking whether some sort of greedy algorithm would be helpful here, since I require neither minimum distance nor minimum turns. The problem is that if I'm NOT finding the minimum, the algorithm may search through more squares on the board. That is, normally a shortest-path algorithm searches the fewest squares it has to, which is key for performance.
Are there any techniques that come to mind that would provide an efficient way (better than or the same as A*) to find such a path? Again, A* with fewest turns provides the "optimal" solution for distance and number of turns. But for my problem, "optimal" means the fastest way the function can return whether there is a path of <= n turns between A and B. Note that there can be obstacles on the path, but other than that, moving from one square to another always has the same cost (unless turning, as mentioned above).
I've been brainstorming, but I cannot think of anything other than A* with the turn states. It might not be possible to do better than this, but I thought there might be a clever exploitation of my relaxed conditions. I've even considered using just numTurns as the cost of moving on the board, but that could waste a lot of time searching dead paths. Thanks very much!
Edit: Final clarification - Path does not have to have least number of turns, just <= n. Path does not have to be a shortest path, it can be a huge path if it only has n turns. The goal is for this function to execute quickly, I don't even need to record the path. I just need to know whether there exists one. Thanks :)
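To illustrate what I mean by using numTurns as the only cost, here is a rough, untested sketch (the grid representation is just a placeholder): a 0-1 BFS over (cell, direction) states where moving straight costs 0 and turning costs 1, followed by a check that the minimum turn count at B is <= n.

    import java.util.*;

    // 0-1 BFS over (row, col, direction) states: moving straight costs 0,
    // turning costs 1, so dist holds the minimum number of turns needed
    // to reach each state.
    class TurnBudget {
        static final int[] DR = {-1, 1, 0, 0};   // up, down, left, right
        static final int[] DC = {0, 0, -1, 1};

        // Does some path from (sr,sc) to (tr,tc) use at most maxTurns turns?
        static boolean reachable(boolean[][] blocked, int sr, int sc,
                                 int tr, int tc, int maxTurns) {
            int rows = blocked.length, cols = blocked[0].length;
            int[][][] dist = new int[rows][cols][4];
            for (int[][] plane : dist)
                for (int[] row : plane) Arrays.fill(row, Integer.MAX_VALUE);

            Deque<int[]> deque = new ArrayDeque<>();
            for (int d = 0; d < 4; d++) {        // the first direction is free
                dist[sr][sc][d] = 0;
                deque.addFirst(new int[]{sr, sc, d});
            }
            while (!deque.isEmpty()) {
                int[] s = deque.pollFirst();
                int r = s[0], c = s[1], d = s[2];
                for (int nd = 0; nd < 4; nd++) {
                    int nr = r + DR[nd], nc = c + DC[nd];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols
                            || blocked[nr][nc]) continue;
                    int nturns = dist[r][c][d] + (nd == d ? 0 : 1);
                    if (nturns < dist[nr][nc][nd]) {
                        dist[nr][nc][nd] = nturns;
                        // 0-cost edges go to the front, 1-cost to the back.
                        if (nd == d) deque.addFirst(new int[]{nr, nc, nd});
                        else deque.addLast(new int[]{nr, nc, nd});
                    }
                }
            }
            for (int d = 0; d < 4; d++)
                if (dist[tr][tc][d] <= maxTurns) return true;
            return false;
        }
    }

This still visits each (cell, direction) state only a bounded number of times, so it should be no slower than the A* variant, but I don't know whether it can be beaten.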
I'm trying to make multiple agents move at the same time to a specified point on a 2D map, with an upper limit on the distance any one agent can move.
If possible, all agents should move the maximum distance, else less.
The paths of different agents shouldn't cross if possible, but if not, they can still cross.
My idea was some sort of adjusted A* algorithm.
Would this be a good approach or is there a better algorithm for this kind of problem?
(To be honest, I currently have A* and Dijkstra on my radar as possibilities for solving this, so if there is anything better, a push in the right direction would be great.)
Thanks in advance for your help.
PS: I don't have any kind of underlying graph yet, so I'm still open to any idea, but I can of course create a graph that works for Dijkstra/A*.
Your problem is close to the vertex/edge-disjoint paths problem, which is NP-complete in general. Your restricted version also seems to be NP-complete, because finding shortest disjoint paths in a grid graph is NP-hard, and that is closely related to your restricted version. But there are lots of algorithms for disjoint paths in grids (even ones with multiple layers), so the best option I can suggest is to use one of the exact algorithms to find vertex-disjoint paths, and after that lengthen the paths (if needed) by traversing some adjacent vertices.
Also, on a grid you don't need Dijkstra to find a path between two nodes (even a shortest path, or a path of a specific length); you can do it simply by running a BFS, which is O(n): start the BFS from vertex v, set the label of each of its neighbours to 1, then for each neighbour of a 1 set the new value to 2, and so on (see this answer, in particular the numbering-algorithm part).
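A sketch of that numbering BFS on a grid, with my own representation (the answer referred to above may differ in details):

    import java.util.*;

    // BFS "numbering" on a grid: label each cell with its distance (in steps)
    // from the start cell. O(n) in the number of cells.
    class GridBfs {
        static int[][] number(boolean[][] blocked, int sr, int sc) {
            int rows = blocked.length, cols = blocked[0].length;
            int[][] label = new int[rows][cols];
            for (int[] row : label) Arrays.fill(row, -1);   // -1 = unreached
            Deque<int[]> queue = new ArrayDeque<>();
            label[sr][sc] = 0;
            queue.add(new int[]{sr, sc});
            int[][] dirs = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
            while (!queue.isEmpty()) {
                int[] cell = queue.poll();
                for (int[] d : dirs) {
                    int nr = cell[0] + d[0], nc = cell[1] + d[1];
                    if (nr >= 0 && nr < rows && nc >= 0 && nc < cols
                            && !blocked[nr][nc] && label[nr][nc] == -1) {
                        label[nr][nc] = label[cell[0]][cell[1]] + 1; // neighbours of k get k+1
                        queue.add(new int[]{nr, nc});
                    }
                }
            }
            return label;  // label[r][c] is the BFS distance from the start, or -1
        }
    }

A shortest path can then be recovered by walking from the target to any neighbour whose label is one smaller, repeatedly, until the start is reached.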
Maybe this question also helps, if you are looking for some heuristics in a dynamic situation.
How can I use the A star algorithm to find the first 100 shortest paths?
The problem of finding the k-th shortest path is NP-hard, so any modification of A* that does what you are after will be exponential in the size of the input.
Proof:
(Note: the proof is for simple paths.)
Assume you had an algorithm A(G,k) that runs in polynomial time and returns the length of the k-th shortest simple path.
The maximal number of simple paths is at most n!, so by applying binary search on the range [1, n!] to look for a path of length n, you need O(log(n!)) = O(n log n) invocations of A.
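As a quick sanity check of that bound (my own arithmetic, not part of the original answer):

    \log(n!) = \sum_{i=1}^{n} \log i \le n \log n,
    \qquad
    \log(n!) \ge \sum_{i=\lceil n/2 \rceil}^{n} \log i \ge \frac{n}{2} \log \frac{n}{2},

so log(n!) = Theta(n log n), and O(n log n) invocations indeed suffice.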
If you find there is a path of length n, it is a Hamiltonian path.
By repeating the process for each source and target in the graph (there are O(n^2) such pairs), you can solve the Hamiltonian Path problem polynomially, assuming such an A exists.
QED
From this we can conclude that, unless P = NP (which most CS researchers consider very unlikely), the problem cannot be solved polynomially.
An alternative is to use a variation of Uniform Cost Search that does not maintain a visited/closed set. You might be able to modify A* as well, by disabling the closed list and yielding/generating solutions as they are encountered instead of returning the first one and finishing, but I cannot think of a way to prove that correct for A* at the moment.
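A minimal sketch of that uniform-cost variant (names are mine; note that without a closed set it enumerates walks, so the loops discussed in the next answer really do show up, and the frontier can grow very large):

    import java.util.*;

    // Uniform Cost Search without a closed set: every time the goal is
    // popped from the priority queue, we have found the next-cheapest walk
    // to it. Returns the costs of up to k shortest source-goal walks.
    class KShortest {
        // graph[u] is a list of {v, weight} edges out of u; weights > 0.
        static List<Integer> kShortestCosts(List<int[]>[] graph,
                                            int source, int goal, int k) {
            List<Integer> costs = new ArrayList<>();
            PriorityQueue<int[]> open =                  // entries are {cost, node}
                new PriorityQueue<int[]>((a, b) -> Integer.compare(a[0], b[0]));
            open.add(new int[]{0, source});
            while (!open.isEmpty() && costs.size() < k) {
                int[] entry = open.poll();
                int cost = entry[0], node = entry[1];
                if (node == goal) costs.add(cost);       // i-th pop = i-th cheapest
                for (int[] edge : graph[node])           // keep expanding regardless
                    open.add(new int[]{cost + edge[1], edge[0]});
            }
            return costs;
        }
    }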
Besides this problem being NP-hard, it is impossible to do this with A* or Dijkstra without major modifications. Here are some of the main reasons:
First of all, the algorithm keeps, at every step, only the best path found so far to each node. Consider the following graph:
  A
 / \
S   C-E
 \ /
  B
Assume distances d(S,A)=1, d(S,B)=2, d(A,C)=d(B,C)=d(C,E)=10.
When visiting C you will pick the path via A, but nowhere will you store the path via B. So you'd have to keep that information as well.
But, secondly, you don't even consider every possible path. Assume the following graph:
S------A--E
 \    /
  B--C
Assume distances d(S,A)=1, d(S,B)=2, d(B,C)=1, d(A,E)=3. Your visiting order will be {S,A,B,C,E}. So when visiting A you can't even save the detour via B and C because you don't know of it. You'd have to add something like a "potential path via C" for every unvisited neighbor.
Thirdly, you'd have to incorporate loops and cul-de-sacs, because yes, it is perfectly possible that a path with a loop in it ends up being one of your 100 shortest paths. Of course you might want to constrain this away, but it is a generic possibility. Consider for example graphs like this:
S-A--D--E
  |  |
  B--C
It's clear you can easily start looping here, unless you disallow 'going back' (e.g. forbid D->A if A->D is already in the path). Actually this is even a problem without an obvious graphical loop, because in the generic case you can always ping-pong between two neighbors (path A-B-A-B-A-...).
And now I'm probably even forgetting some issues.
Note that most of these things also make it very hard to develop a generic algorithm, certainly the last one, because with loops it is hard to bound the number of possible paths ('endless loop').
This is not an NP-hard problem: the link below is to Yen's algorithm, which computes the K shortest loopless paths in a graph in polynomial time.
Yen's algorithm link
Use A* search; the k-th time the destination is popped from the queue, you have found the k-th shortest path.
I have a random undirected social graph.
I want to find a Hamiltonian path if possible, or, if that is not possible (or not possible to determine in polynomial time), a series of paths. In this "series of paths" (where all N nodes are used exactly once), I want to minimize the number of paths and maximize the average length of the paths. (So no trivial solution of N paths of a single node each.)
I have generated an adjacency matrix for the nodes and edges already.
Any suggestions? Pointers in the right direction? I realize this will require heuristics because of the NP-complete (?) nature of the problem, and I am OK with a "good enough" answer. Also I would like to do this in Java.
Thanks!
If I'm interpreting your question correctly, what you're asking for is still NP-hard, since the best solution to the "multiple paths" problem would be a Hamiltonian path, and determining whether one exists is known to be NP-hard. Moreover, even if you're guaranteed that a Hamiltonian path doesn't exist, solving this problem could still be NP-hard, since I could give you a graph with a single disconnected node floating in space, for which the best solution is a trivial path containing that node and a Hamiltonian path in the remaining graph. As a result, unless P = NP, there isn't going to be a polynomial-time algorithm for your problem.
Hope this helps, and sorry for the negative result!
Angluin and Valiant gave a near linear-time heuristic that works almost always in a sufficiently dense Erdos-Renyi random graph. It's described by Wilf, on page 121. Probably your random graph is not Erdos-Renyi, but the heuristic might work anyway (when it "fails", it still gives you a (hopefully) long path; greedily take this path and run A-V again).
Use a genetic algorithm (without crossover), where each individual is a permutation of the nodes. This gives you a "series of paths" at each generation, evolving towards a minimal number of paths (1) and a maximal average length (N).
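A sketch of how such an individual could be decoded and scored, using the adjacency matrix mentioned in the question (method names are mine):

    // Decode a permutation of the nodes into a series of paths: walk the
    // permutation and start a new path whenever two consecutive nodes are
    // not adjacent. Fewer paths = fitter individual.
    static int countPaths(boolean[][] adj, int[] permutation) {
        int paths = 1;   // a permutation always yields at least one path
        for (int i = 0; i + 1 < permutation.length; i++)
            if (!adj[permutation[i]][permutation[i + 1]])
                paths++; // no edge here, so the series breaks into a new path
        return paths;
    }

Since all N nodes are always used exactly once, minimizing the number of paths automatically maximizes the average path length N/paths, so a single fitness value suffices.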
As you have realized, there is no known exact solution in polynomial time. You can try some random search methods, though. My recommendation: start with a genetic algorithm, and try out tabu search as well.
I have implemented the Hungarian algorithm, a solution to the assignment problem, as described in this article, but it fails on a few percent of random cost matrices.
I've spent weeks debugging it (I started when I asked this question, though not full time). I took random cost matrices for which the algorithm fails and performed the algorithm with good old pen and paper, and compared that with my implementation to see what went wrong. This led me to a few bugs, which I've corrected now, but I have encountered an example for which I do not get the right solution even when solving it by hand. For anyone who is interested: the cost matrix of that example is {{0,6,4,3},{3,2,1,2},{0,7,6,4},{3,8,5,3}}, for which the correct solution has a sum of 9 = 4+2+0+3 (in that order). In that example there is eventually a matched edge not on the equality subgraph, and I think that is impossible, indicating something is wrong.
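For a 4x4 instance like this one, the claimed optimum is easy to cross-check by brute force over all 4! = 24 assignments; a throwaway checker (my own code, separate from the implementation under test):

    // Brute force a small assignment instance: try every way of giving
    // each row a distinct column and keep the minimum total cost.
    class AssignmentCheck {
        static int best = Integer.MAX_VALUE;

        public static void main(String[] args) {
            int[][] cost = {{0,6,4,3},{3,2,1,2},{0,7,6,4},{3,8,5,3}};
            search(cost, 0, 0, new boolean[cost.length]);
            System.out.println("optimal sum = " + best);  // prints 9 here
        }

        static void search(int[][] cost, int row, int sum, boolean[] usedCol) {
            if (row == cost.length) { best = Math.min(best, sum); return; }
            for (int col = 0; col < cost.length; col++) {
                if (usedCol[col]) continue;
                usedCol[col] = true;
                search(cost, row + 1, sum + cost[row][col], usedCol);
                usedCol[col] = false;
            }
        }
    }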
Either I don't fully understand the solution, which is a viable option, or there is an extremely subtle bug in the presented solution, which I will elaborate on below.
I realize I have to introduce some terminology, but since this is a detailed question I am not going to explain all concepts in full detail, as anyone needing that explanation probably wouldn't be able to answer my question anyway.
The input of the problem is a weighted complete bipartite graph with n nodes on each partition.
The presented method specifies to find n augmenting paths.
An augmenting path is an alternating path starting and ending at unmatched nodes.
An alternating path is a path alternating between matched and unmatched edges in the equality subgraph.
These alternating paths are grown in a breadth-first manner, stopping only when either:
An augmenting path is found or
the alternating paths cannot be grown any further.
And a fact crucial to the possible bug: the algorithm remembers which nodes the alternating paths have encountered, which affects the algorithm in a part irrelevant to this question.
When an augmenting path is found, the presented method says to stop growing the alternating paths. I believe this is incorrect. I think all alternating paths need to be grown up to the cost of the found augmenting path. Notice that the alternating paths are grown in a breadth-first manner, so this only grows paths whose costs tie with the found path. This small change might result in some nodes being marked as 'visited by an alternating path' which otherwise wouldn't have been marked, and that affects the algorithm further on.
The actual question:
Should I consider alternating paths with cost equal to the cost of the augmenting path (and starting at the same node) as explored? This is contrary to the presented method, which says to stop as soon as an augmenting path is found, regardless of any ties in cost with other paths.
Looking at the presentation of the Hungarian algorithm in "The Stanford GraphBase", you can track its progress towards a solution as adding a constant to every cell in a row of the cost matrix, or to every cell in a column of the cost matrix, and see that you have a solution when there is a complete set of independent zeros in the altered matrix.
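For reference, the row/column reduction step in that view can be as small as this (a generic sketch, not the GraphBase code):

    // Subtracting a constant from a whole row or column changes every
    // assignment's total by the same amount, so it does not change which
    // assignment is optimal; reduce until every row and column has a zero.
    static void reduce(int[][] cost) {
        int n = cost.length;
        for (int r = 0; r < n; r++) {            // row reduction
            int min = Integer.MAX_VALUE;
            for (int c = 0; c < n; c++) min = Math.min(min, cost[r][c]);
            for (int c = 0; c < n; c++) cost[r][c] -= min;
        }
        for (int c = 0; c < n; c++) {            // column reduction
            int min = Integer.MAX_VALUE;
            for (int r = 0; r < n; r++) min = Math.min(min, cost[r][c]);
            for (int r = 0; r < n; r++) cost[r][c] -= min;
        }
    }

A complete set of n independent zeros in the reduced matrix then corresponds to an optimal assignment in the original.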
I have read the paper you refer to just once. Is it the case that finding an augmenting path allows you to increase the number of independent zeros in the altered matrix? If so, then finding n augmenting paths, as in their Figure 3 step 2, will find a good solution, because you must then have n independent zeros. If so, then you can check your implementation of the algorithm by verifying that each augmenting path found adds an independent zero, even in the case when there are other paths that it could have found but stopped short of finding.