Does bidirectional search really give the shortest path?

I read in the cracking interview book that bidirectional search gives the shortest path between two points in a graph.
I don't get why the shortest path is guaranteed. Doesn't the collision point change depending on the vertices' queuing order during breadth-first search?
thx

Doesn't the collision point change depending on the vertices' queuing order during breadth-first search?
Yes, it does. However, whichever node ends up being chosen, it will lie on one of the shortest paths between the source and target nodes.
So if there are multiple such choices, it can be through any of them depending on queueing order as you said. But you are guaranteed that the resulting path will be of the same optimal length.
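To see this concretely, here is a minimal Python sketch (not from the book) of a bidirectional BFS on an undirected, unweighted graph. It assumes each side expands one complete level before the other side takes its turn, and it stops at the first vertex reached from both ends. The function name, the `adj` layout, and the diamond-shaped example graph are all made up for illustration.

```python
from collections import deque


def bidirectional_bfs(adj, source, target):
    """Return (length, meeting_node) for an undirected, unweighted graph.

    adj maps each vertex to a list of neighbours; the list order is the
    queuing order. Returns (None, None) if target is unreachable.
    """
    if source == target:
        return 0, source
    # dist[0] holds distances from source, dist[1] distances from target.
    dist = [{source: 0}, {target: 0}]
    frontier = [deque([source]), deque([target])]
    side = 0
    while frontier[0] and frontier[1]:
        # Expand one complete level of the current side before switching.
        for _ in range(len(frontier[side])):
            u = frontier[side].popleft()
            for v in adj[u]:
                if v in dist[side]:
                    continue
                dist[side][v] = dist[side][u] + 1
                if v in dist[1 - side]:
                    # First collision: v has been reached from both ends.
                    return dist[0][v] + dist[1][v], v
                frontier[side].append(v)
        side = 1 - side
    return None, None


# Hypothetical diamond graph: s-a-t and s-b-t are both shortest paths.
order_1 = {"s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t"], "t": ["a", "b"]}
order_2 = {"s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t"], "t": ["b", "a"]}

print(bidirectional_bfs(order_1, "s", "t"))  # (2, 'a')
print(bidirectional_bfs(order_2, "s", "t"))  # (2, 'b')
```

Swapping the neighbour order of `t` moves the meeting point from `a` to `b`, but the length stays 2.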

Related

Dijkstra's bi-directional implementation

I am implementing a bidirectional Dijkstra's algorithm and having issues understanding how the various implementations of the stopping condition work out.
Take this graph for example, where we're starting from node A and targeting node J. Below the graph I have listed the cluster (processed) and relaxed (fringe) nodes at the time the algorithm stops:
The accepted answer to “Bidirectional Dijkstra” by NetworkX explains that the algorithm stops when the same node has been processed in both directions. In my graph that will be node F. If that were the case, the algorithm would stop after finding a path of length 9 going A-B-C...H-I-J. But this wouldn't be the shortest path, because A and J have a direct edge of length 8, which is never taken because the weight 8 is never popped from the priority queue.
Even in this Java implementation of Dijkstra's bidirectional algorithm on GitHub, the stopping condition is:
double mtmp = DISTANCEA.get(OPENA.min()) +
              DISTANCEB.get(OPENB.min());
if (mtmp >= bestPathLength) return PATH;
This algorithm stops when the top node weights of the forward and backward queues add up to at least the best path length so far. But this wouldn't return the correct shortest path either, because in that case the top nodes will be G(6) and E(5), totalling 11, which is greater than the best path so far of length 9.
I don't understand why both of these stopping conditions seemingly yield an incorrect shortest path. I am not sure what I am misunderstanding.
What is a good stopping condition for Dijkstra's bidirectional algorithm? Also, what would be a stopping condition for bidirectional A*?
Conceptually the stopping condition for Dijkstra's algorithm, whether bidirectional or not, is that you stop when the best path you've found is as good as any path you might find if you continue. Only closed paths (the ones in your "cluster" sets above) count as found.
For bidirectional Dijkstra's, a path is "found" whenever the same vertex exists in both the forward and reverse closed sets. That part is easy, but how good is the best path you might find in the future?
To make sure the answer you get is correct, your evaluation of the best path you might find needs to be accurate or an underestimate. Let's consider the possibilities:
(1) A path might be made with a vertex that is open in both directions.
(2) A vertex that is open in one direction might make a path with one that is already closed in the other direction.
The problem is case (2). The sets and priority queues we use for Dijkstra's algorithm do not allow us to make a very useful underestimate of the best path in this case. The smallest distance in a closed set is always 0, and if we add this to the minimum open from the other direction, then we come up with:
double mtmp = min ( DISTANCEA.get(OPENA.min()) , DISTANCEB.get(OPENB.min()) );
This works, and will produce the correct answer, but it will make the algorithm run until the best complete path is found in at least one direction. Unidirectional Dijkstra's would be faster in many cases.
I think this "bidirectional Dijkstra's" idea needs significant rework in order to be really good.
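For comparison, here is a minimal Python sketch of the sum-based stopping test quoted in the question, under one assumption that the question's walkthrough leaves out: the best path length is updated every time an edge is relaxed into a vertex the other search has already labelled, not only when a vertex is closed in both directions. With that bookkeeping, stopping once the two queue minima add up to at least the best length is the criterion usually given in textbook descriptions of bidirectional Dijkstra. The graph is a hypothetical stand-in (a unit-weight chain A..J plus a direct A-J edge of weight 8), not a reconstruction of the figure, and it is assumed to be undirected with non-negative weights. All names are illustrative.

```python
import heapq


def bidirectional_dijkstra(graph, source, target):
    """Shortest-path length in an undirected graph with non-negative weights.

    graph maps each vertex to a list of (neighbour, weight) pairs.
    Returns float("inf") if target is unreachable.
    """
    if source == target:
        return 0
    inf = float("inf")
    # Index 0 is the forward search, index 1 the backward search.
    dist = [{source: 0}, {target: 0}]
    closed = [set(), set()]
    heaps = [[(0, source)], [(0, target)]]
    best = inf  # length of the best complete path seen so far
    while heaps[0] and heaps[1]:
        # Stopping test: no undiscovered path can beat `best` any more.
        if heaps[0][0][0] + heaps[1][0][0] >= best:
            break
        # Advance the side whose queue minimum is smaller.
        side = 0 if heaps[0][0][0] <= heaps[1][0][0] else 1
        d, u = heapq.heappop(heaps[side])
        if u in closed[side]:
            continue  # stale queue entry
        closed[side].add(u)
        for v, w in graph[u]:
            nd = d + w
            if nd < dist[side].get(v, inf):
                dist[side][v] = nd
                heapq.heappush(heaps[side], (nd, v))
            # Update `best` on every relaxed edge that touches the other side.
            if v in dist[1 - side]:
                best = min(best, nd + dist[1 - side][v])
    return best


def undirected(edges):
    """Build an adjacency dict from (a, b, weight) triples."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    return graph


chain = [(x, y, 1) for x, y in zip("ABCDEFGHI", "BCDEFGHIJ")]

print(bidirectional_dijkstra(undirected(chain), "A", "J"))                    # 9
print(bidirectional_dijkstra(undirected(chain + [("A", "J", 8)]), "A", "J"))  # 8
```

In the second call the direct edge is recorded as a candidate of length 8 as soon as A is processed, even though the weight 8 entry is never popped from either queue.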

A* for unweighted graphs

Does it make sense to use A* search algorithm on unweighted directed graphs for finding shortest path?
From reading http://www.cs.cmu.edu/~cga/ai-course/astar.pdf, it seems that A* could be expensive in terms of memory. Also, for unweighted graphs, how would it even determine a heuristic?
This post here seems to conclude A* should not be used for unweighted graphs.
What would be the best/least expensive algorithm to use for finding the shortest path on unweighted directed graphs? Just a simple BFS?
There is no point to the full A* unless you have a useful heuristic to use it with. That said, if your heuristic is that every node is guessed to be the same possible distance from the target, then A* search will give you the same result as BFS because you will look at every node reached by a shorter path before looking at a node reached by a longer one.
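A small sketch to check that claim, assuming a directed graph stored as a dict of successor lists and every edge costing 1. The `a_star` and `bfs` functions and the example graph are illustrative, not taken from any library. With a heuristic that returns 0 for every node, A* expands nodes in order of path length, just like BFS.

```python
import heapq
from collections import deque


def a_star(adj, source, target, h):
    """Edge count of a shortest path; h(v) must never overestimate."""
    g = {source: 0}
    heap = [(h(source), 0, source)]  # entries are (g + h, g, node)
    while heap:
        _, cost, u = heapq.heappop(heap)
        if u == target:
            return cost
        if cost > g[u]:
            continue  # stale entry, a cheaper route to u was found later
        for v in adj.get(u, []):
            if cost + 1 < g.get(v, float("inf")):
                g[v] = cost + 1
                heapq.heappush(heap, (g[v] + h(v), g[v], v))
    return None


def bfs(adj, source, target):
    """Edge count of a shortest path using plain breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None


# Hypothetical directed graph.
example = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}

print(a_star(example, 1, 5, lambda v: 0))  # 3
print(bfs(example, 1, 5))                  # 3
```

The priority queue buys nothing here, so plain BFS is the cheaper way to get the same answer.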
As for the best, the best algorithm that I am aware of is a BFS starting at both ends, using a hash to detect the first intersection. That is, you mark the source and the target. Then extend the source out to a depth of 1, then the target to a depth of 1, then the source out to a depth of 2, then the target to a depth of 2, and so on. When you intersect, you have the shortest path out to the intersection from both directions. So traverse the one from the source out to the intersection point, then the intersection back to the target.
This is, for example, the kind of algorithm that gets used to find who is close to you in a large social network like LinkedIn.
If you have a heuristic, use A*. If you don't, don't.
Often unweighted graphs have additional structure that can be exploited, e.g. if your graph is actually a 2D grid, Jump Point Search is much faster than normal A*. We'll need to know more about your problem domain to recommend anything further.

How to get path from one node to all other nodes in a weighted tree in minimum time?

I just want to get the distance from a source node to every other node. This differs from general graph problems because it is a tree and the path between any two nodes is unique, so I expect a more efficient solution to exist.
Is it possible to get the answer efficiently?
You're absolutely right that in a tree, the difficulty of finding a path between two nodes is a lot lower than in a general graph because once you find any path (at least, one without cycles) you know it's the shortest. So all you have to do is just find all paths starting at the given node and going to each other node. You can do this with either a depth-first or a breadth-first search in time O(n). To find the lengths, just keep track of the lengths of the edges you've seen along the paths you've traveled as you travel them.
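A minimal sketch of that idea in Python, assuming the tree is stored as a dict mapping each node to a list of `(neighbour, weight)` pairs with every edge listed in both directions. The function name and the example tree are made up for illustration.

```python
def tree_distances(tree, source):
    """Distance from source to every node of a weighted tree, in O(n)."""
    dist = {source: 0}
    stack = [source]  # iterative depth-first search
    while stack:
        u = stack.pop()
        for v, w in tree[u]:
            if v not in dist:  # in a tree, the first visit is the only path
                dist[v] = dist[u] + w
                stack.append(v)
    return dist


# Hypothetical tree with edges 1-2 (4), 1-3 (1), 3-4 (2), 3-5 (7).
tree = {
    1: [(2, 4), (3, 1)],
    2: [(1, 4)],
    3: [(1, 1), (4, 2), (5, 7)],
    4: [(3, 2)],
    5: [(3, 7)],
}

print(tree_distances(tree, 1))  # {1: 0, 2: 4, 3: 1, 5: 8, 4: 3}
```

Replacing the stack with a queue turns this into the breadth-first variant, and the distances come out the same.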
This is not different from "graph problems": a tree is a special case of a graph. Dijkstra's algorithm is a standard graph traversal algorithm. Just modify it a little: keep all of the path lengths as you find them, and don't worry about the compare-update step, since you're going to keep all of the results. Continue until you run out of nodes to check, and there are your path lengths.

How can I find the shortest path in a graph, with adding the least number of new nodes?

I need to find the shortest path in a graph with the least number of added nodes. The start and end nodes are not important. If there is no path in a graph just between specified n-nodes, I can add some nodes to complete the shortest tree but I want to add as few new nodes as possible.
What algorithm can I use to solve this problem?
1. Start with the start node. If it is the target node, you are done.
2. Check whether any connected node is the target node. If so, you are done.
3. Check whether any of the connected nodes is connected to the target node. If so, you are done.
4. Otherwise, add a node that is connected to the start and end nodes. Done.
I recommend using a genetic algorithm. More information here and here.
Quickly explaining it, GA is an algorithm to find exact or approximate solutions to optimization and search problems.
You create an initial population of possible solutions. You evaluate them with a fitness function to find out which of them are most suitable. After that, you apply techniques inspired by evolutionary biology, such as inheritance, mutation, selection, and crossover.
After several generations, you'll find the most suitable (read shortest) solution to the problem.
You want to minimize the number of nodes in the path (instead of the sum-of-weight as in general algorithms).
If that is the case, assign equal weight to all the edges and find the shortest path (using the generic algorithms). You will have what you needed.
And if there is no path, just add that edge to the graph.
PS: If you give a value of 1 for each edge, the number of nodes in the path would be the weight-1 (excluding the source and destination nodes)

Find the shortest Path between two nodes (vertices)

I have a list of interconnected edges (E), how can I find the shortest path connecting from one vertex to another?
I am thinking about using lowest common ancestors, but the edges don't have a clearly defined root, so I don't think the solution works.
The shortest path is defined by the minimum number of vertices traversed.
Note: There could be multiple paths connecting two vertices, so obviously breadth-first search won't work.
Dijkstra's algorithm will do this for you.
I'm not sure if you need a path between every pair of nodes or between two particular nodes. Since someone has already given an answer addressing the former, I will address the latter.
If you don't have any prior knowledge about the graph (if you do, you can use a heuristic-based search such as A*) then you should use a breadth-first search.
The Floyd-Warshall algorithm would be a possible solution to your problem, but there are also other solutions to solve the all-pairs shortest path problem.
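For reference, a minimal Python sketch of Floyd-Warshall for this setting, assuming an undirected graph given as a list of vertex pairs with every edge counted as length 1 (so distances are edge counts; add one to count vertices, as the question does). The function name and example edges are illustrative only.

```python
def floyd_warshall(nodes, edges):
    """All-pairs shortest edge counts for an undirected, unweighted graph."""
    inf = float("inf")
    dist = {a: {b: (0 if a == b else inf) for b in nodes} for a in nodes}
    for a, b in edges:
        if a != b:  # ignore self-loops
            dist[a][b] = dist[b][a] = 1
    # Allow each vertex k in turn as an intermediate stop.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical graph: a-b, b-c, c-d, a-c, plus an isolated vertex e.
all_pairs = floyd_warshall("abcde", [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")])

print(all_pairs["a"]["d"])  # 2
print(all_pairs["a"]["e"])  # inf
```

This takes O(n^3) time, so it only pays off when distances between all pairs are actually needed.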
Shortest path is defined by the minimum number of vertices traversed
That is the same as the minimum number of edges plus one.
You can use a standard breadth-first search and it will work fine. If more than one path connects two vertices, just save one of them. It will not affect anything, because the weight of every edge is 1.
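A minimal sketch of that approach in Python, assuming the input is a list of undirected edges given as vertex pairs. The function name and example edges are made up. It stores one parent per vertex, which is enough to rebuild one shortest path even when several exist.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return one path with the fewest vertices, or None if there is none."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            # Walk the parent pointers back to the start.
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, []):
            if v not in parent:  # keep only the first path found to v
                parent[v] = u
                queue.append(v)
    return None


# Hypothetical edge list with several routes from 1 to 5.
edge_list = [(1, 2), (2, 5), (1, 3), (3, 4), (4, 5), (2, 3)]

print(shortest_path(edge_list, 1, 5))  # [1, 2, 5]
```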
Additional 2 cents. Take a look at networkx. There are interesting algos already implemented for what you need, and you can choose the best suited.
