I have a list of interconnected edges (E), how can I find the shortest path connecting from one vertex to another?
I am thinking about using lowest common ancestors, but the edges don't have a clearly defined root, so I don't think the solution works.
Shortest path is defined by the minimum number of vertices traversed.
Note: There could be multiple paths connecting two vertices, so obviously breadth-first search won't work

Dijkstra's algorithm will do this for you.

I'm not sure if you need a path between every pair of nodes or between two particular nodes. Since someone has already given an answer addressing the former, I will address the latter.
If you don't have any prior knowledge about the graph (if you do, you can use a heuristic-based search such as A*) then you should use a breadth-first search.

The Floyd-Warshall algorithm would be a possible solution to your problem, but there are also other solutions to solve the all-pairs shortest path problem.

Shortest path is defined by the minimum number of vertices traversed
That is the same as the minimum number of edges plus one.
You can use a standard breadth-first search and it will work fine. If more than one path connects two vertices, just keep one of them; it will not affect anything, because the weight of every edge is 1.
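A minimal sketch of that breadth-first search in Python, assuming the edges arrive as a list of (u, v) pairs and the graph is undirected. The function name and the input format are illustrative, not taken from the question.

```python
from collections import deque

def bfs_shortest_path(edges, source, target):
    """Return one path with the fewest edges as a list of vertices, or None."""
    # Build an undirected adjacency list from the edge list.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)

    # parent doubles as the "visited" set and lets us rebuild the path.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None  # target is not reachable from source
```

When several shortest paths exist, this returns whichever one the search happens to reach first, which is all the question asks for.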

Additional 2 cents. Take a look at networkx. There are interesting algos already implemented for what you need, and you can choose the best suited.


How to find a shortest path in a directed graph which must pass through specific nodes?

I have a directed graph with less than 600 nodes, and each node's edge number is less than 8.
Now I need to find a path in this graph which must pass through some given nodes(<50). The order of passing given nodes is free.
I know it's an NP-complete problem, but I don't know how to solve it.
An approximate solution is also acceptable.
Compute the shortest ways between all pairs of the specific nodes. Then create a new graph that only contains those nodes, with the length of the shortest paths as distances. Now, the problem is "reduced" to Travelling Salesman.
(TSP has a fast 3/2-approximation that utilizes minimum spanning trees and a matching, if that is good enough. Note that this bound applies to symmetric metric instances, and trying all 50! orderings is out of the question in any case.)
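A rough sketch of the reduction step in Python, assuming the directed graph is stored as a dict mapping each node to a list of (neighbour, weight) pairs and that node labels are comparable (for example strings). The greedy nearest-neighbour ordering at the end is only a simple stand-in heuristic chosen for illustration; it is not the 3/2-approximation mentioned above.

```python
import heapq

def dijkstra(graph, source):
    """graph: {u: [(v, weight), ...]}, directed. Returns {node: distance}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def condensed_distances(graph, required):
    """Shortest-path distance table restricted to the required nodes."""
    table = {}
    for u in required:
        dist = dijkstra(graph, u)
        table[u] = {v: dist.get(v, float("inf")) for v in required}
    return table

def nearest_neighbour_order(table, first):
    """Greedy order: always move to the closest unvisited required node."""
    order, remaining = [first], set(table) - {first}
    while remaining:
        nxt = min(remaining, key=lambda v: (table[order[-1]][v], v))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

With fewer than 50 required nodes, the table needs at most 50 Dijkstra runs on a graph of under 600 nodes, so the precomputation is cheap; the hard part is choosing the order.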

Shortest path with constraints on traversing edges

While working on a project I've stumbled upon a graph algorithms problem I
haven't been able to solve. The problem is as follows:
You have a directed, weighted graph and want to find the shortest path between a
start node and an end node while visiting specified nodes (very much like
Find the shortest path in a graph which visits certain nodes).
However, along with nodes and edges, this graph also has the notion of "items",
which reside at nodes and you "pick up" when you enter that node. Now there is an extra constraint that edges can only be
traversed if you have obtained the necessary items, I, for that particular edge.
Think of it as a key for a door; you need to obtain a key before being able to
pass through the door.
I can only think of brute-force methods that blow up exponentially. Can anyone think of anything better or point me to a place where this problem is solved? Or maybe convince me that this is "hard" (computationally speaking)? Thanks for any help!
This problem is NP-hard to solve optimally. There's a simple reduction from the Hamiltonian path problem:
Put a unique item on each vertex of the original graph. Construct a sink vertex connected only to the destination vertex. Let the edge between these two vertices require all of the items.
You can try a modified version of the savings algorithm, a heuristic for the vehicle routing problem. Maybe you can reverse it and create a pick-up function for the wanted keys. It's used for delivery and shortest-path problems.
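For small instances, the exponential brute force the question alludes to can be written as an ordinary Dijkstra search over (node, items held) states. The sketch below is one way to do that in Python, not a method from either answer. It assumes edges are given as (u, v, weight, required_items) tuples with non-negative weights and that items_at maps a node to the items found there; both formats are hypothetical. It handles only the key constraint and ignores the must-visit nodes.

```python
import heapq
from itertools import count

def shortest_path_with_items(edges, items_at, start, goal):
    """Minimum cost from start to goal, or None if the goal is unreachable.

    edges: list of (u, v, weight, required_items), directed.
    items_at: {node: set of items picked up on entering that node}.
    """
    adjacency = {}
    for u, v, w, required in edges:
        adjacency.setdefault(u, []).append((v, w, frozenset(required)))

    start_state = (start, frozenset(items_at.get(start, ())))
    best = {start_state: 0}
    tie = count()  # tie-breaker so the heap never has to compare states
    heap = [(0, next(tie), start_state)]
    while heap:
        cost, _, state = heapq.heappop(heap)
        if cost > best[state]:
            continue  # stale queue entry
        node, held = state
        if node == goal:
            return cost
        for v, w, required in adjacency.get(node, []):
            if not required <= held:
                continue  # a key for this edge is still missing
            nxt = (v, held | frozenset(items_at.get(v, ())))
            if cost + w < best.get(nxt, float("inf")):
                best[nxt] = cost + w
                heapq.heappush(heap, (cost + w, next(tie), nxt))
    return None
```

The number of states can grow as 2^(number of items) times the number of nodes, which is consistent with the hardness result above.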

I was wondering, for Dijkstra's and Prim's algorithms, what happens when they are choosing the next vertex to go to and more than one vertex has the same weight.
For example
It doesn't matter. Usually the tie will be broken in some arbitrary way like which node was added to the priority queue first.
The goal of Dijkstra is to find a shortest path. If you wanted to find all shortest paths, you would then have to worry about ties.
There could be multiple MSTs, and whichever arbitrary tiebreaking rules you use might give you a different one, but it'll still be a MST.
For example, you can imagine a triangle A-B-C where all the edge weights are one. There are three MST in this case, and they are all minimum.
The same goes for Dijkstra and the shortest path spanning tree -- there could be multiple shortest path spanning trees.
Correct me if I'm wrong, but your graph doesn't have any alternate paths for Dijkstra's algorithm to apply.
Dijkstra's algorithm expands (or "relaxes") all the edges from the touched but not yet expanded node (or "gray" node) with the smallest cost.
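To make the tie-breaking concrete, here is a small Python sketch of Dijkstra's algorithm in which ties on distance are broken by insertion order, using a counter in the heap entries. The graph format ({node: [(neighbour, weight), ...]}) and the function name are assumptions made for the example.

```python
import heapq
from itertools import count

def dijkstra_tree(graph, source):
    """Return (dist, parent); parent describes one shortest-path tree."""
    dist = {source: 0}
    parent = {source: None}
    order = count()  # equal distances are popped in insertion order
    heap = [(0, next(order), source)]
    done = set()
    while heap:
        d, _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            # Strict "<" keeps the first parent found; "<=" would keep the last.
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, next(order), v))
    return dist, parent

# Diamond graph: s reaches t through a or through b, both at cost 2.
diamond = {"s": [("a", 1), ("b", 1)], "a": [("t", 1)], "b": [("t", 1)]}
dist, parent = dijkstra_tree(diamond, "s")
```

Here the distances are the same whichever rule is used; only the recorded parent of t (and so the shortest-path tree) depends on how the tie is broken.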
How can I find the shortest path in a graph, with adding the least number of new nodes?

I need to find the shortest path in a graph with the least number of added nodes. The start and end nodes are not important. If there is no path in a graph just between specified n-nodes, I can add some nodes to complete the shortest tree but I want to add as few new nodes as possible.
What algorithm can I use to solve this problem?
Start with the start node.
if it is the target node, you are done.
Check every connected node, if it is the target node. If true you are done
Check if any of the connected nodes is connected to the target node. If true you are done.
Else add a node that is connected to start and end node. done.
I recommend you to use genetic algorithm. More information here and here.
Quickly explaining it, GA is an algorithm to find exact or approximate solutions to optimization and search problems.
You create initial population of possible solutions. You evaluate them with fitness function in order to find out, which of them are most suitable. After that, you use evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover.
After several generations, you'll find the most suitable (read shortest) solution to the problem.
You want to minimize the number of nodes in the path (instead of the sum-of-weight as in general algorithms).
If that is the case, assign equal weight to all the edges and find the shortest path (using the generic algorithms). You will have what you needed.
And if there is no path, just add that edge to the graph.
Find the shortest path in a graph which visits certain nodes

I have an undirected graph with about 100 nodes and about 200 edges. One node is labelled 'start', one is 'end', and there's about a dozen labelled 'mustpass'.
I need to find the shortest path through this graph that starts at 'start', ends at 'end', and passes through all of the 'mustpass' nodes (in any order).
( / is the graph in question - it represents a corn maze in Lancaster, PA)
Everyone else comparing this to the Travelling Salesman Problem probably hasn't read your question carefully. In TSP, the objective is to find the shortest cycle that visits all the vertices (a Hamiltonian cycle) -- it corresponds to having every node labelled 'mustpass'.
In your case, given that you have only about a dozen labelled 'mustpass', and given that 12! is rather small (479001600), you can simply try all permutations of only the 'mustpass' nodes, and look at the shortest path from 'start' to 'end' that visits the 'mustpass' nodes in that order -- it will simply be the concatenation of the shortest paths between every two consecutive nodes in that list.
In other words, first find the shortest distance between each pair of vertices (you can use Dijkstra's algorithm or others, but with those small numbers (100 nodes), even the simplest-to-code Floyd-Warshall algorithm will run in time). Then, once you have this in a table, try all permutations of your 'mustpass' nodes, and the rest.
Something like this:
//Precomputation: Find all pairs shortest paths, e.g. using Floyd-Warshall
n = number of nodes
for i=1 to n: for j=1 to n: d[i][j]=INF
//Seed the table with the graph itself before relaxing
for i=1 to n: d[i][i]=0
for each edge (i,j) of length w: d[i][j]=d[j][i]=w
for k=1 to n:
    for i=1 to n:
        for j=1 to n:
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])
//That *really* gives the shortest distance between every pair of nodes! :-)
//Now try all permutations
shortest = INF
for each permutation a[1],a[2],...a[k] of the 'mustpass' nodes:
    shortest = min(shortest, d['start'][a[1]]+d[a[1]][a[2]]+...+d[a[k]]['end'])
print shortest
(Of course that's not real code, and if you want the actual path you'll have to keep track of which permutation gives the shortest distance, and also what the all-pairs shortest paths are, but you get the idea.)
It will run in at most a few seconds in any reasonable language :)
[If you have n nodes and k 'mustpass' nodes, its running time is O(n^3) for the Floyd-Warshall part, and O(k!n) for the all-permutations part, and 100^3+(12!)(100) is practically peanuts unless you have some really restrictive constraints.]
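The same idea as a runnable Python sketch, assuming the undirected graph is given as a list of node labels plus (u, v, weight) edges. The function names and input format are illustrative. Unlike the outline above, it also records which permutation wins, so the actual route can be rebuilt afterwards.

```python
from itertools import permutations

INF = float("inf")

def all_pairs(nodes, edges):
    """Floyd-Warshall on an undirected graph; returns d[u][v] as nested dicts."""
    d = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def best_order(d, start, end, mustpass):
    """Try every visiting order of the must-pass nodes; return (cost, order)."""
    best_cost, best_perm = INF, None
    for perm in permutations(mustpass):
        stops = (start, *perm, end)
        cost = sum(d[a][b] for a, b in zip(stops, stops[1:]))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm
```

In pure Python, looping over all 12! orderings is likely to be slow, so for a dozen must-pass nodes it may be worth pruning partial orders that already exceed the best cost found so far.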
Run Dijkstra's algorithm to find the shortest paths between all of the critical nodes (start, end, and must-pass), then a depth-first traversal should tell you the shortest path through the resulting subgraph that touches all of the nodes start ... mustpasses ... end
This is two problems... Steven Lowe pointed this out, but didn't give enough respect to the second half of the problem.
You should first discover the shortest paths between all of your critical nodes (start, end, mustpass). Once these paths are discovered, you can construct a simplified graph, where each edge in the new graph is a path from one critical node to another in the original graph. There are many pathfinding algorithms that you can use to find the shortest path here.
Once you have this new graph, though, you have exactly the Traveling Salesperson problem (well, almost... No need to return to your starting point). Any of the posts concerning this, mentioned above, will apply.
Actually, the problem you posted is similar to the traveling salesman, but I think closer to a simple pathfinding problem. Rather than needing to visit each and every node, you simply need to visit a particular set of nodes in the shortest time (distance) possible.
The reason for this is that, unlike the traveling salesman problem, a corn maze will not allow you to travel directly from any one point to any other point on the map without needing to pass through other nodes to get there.
I would actually recommend A* pathfinding as a technique to consider. You set this up by deciding which nodes have access to which other nodes directly, and what the "cost" of each hop from a particular node is. In this case, it looks like each "hop" could be of equal cost, since your nodes seem relatively closely spaced. A* can use this information to find the lowest cost path between any two points. Since you need to get from point A to point B and visit about 12 inbetween, even a brute force approach using pathfinding wouldn't hurt at all.
Just an alternative to consider. It does look remarkably like the traveling salesman problem, and those are good papers to read up on, but look closer and you'll see that it's only overcomplicating things. ^_^ This coming from the mind of a video game programmer who's dealt with these kinds of things before.
This is not a TSP problem and not NP-hard because the original question does not require that must-pass nodes are visited only once. This makes the answer much, much simpler to just brute-force after compiling a list of shortest paths between all must-pass nodes via Dijkstra's algorithm. There may be a better way to go but a simple one would be to simply work a binary tree backwards. Imagine a list of nodes [start,a,b,c,end]. Sum the simple distances [start->a->b->c->end] this is your new target distance to beat. Now try [start->a->c->b->end] and if that's better set that as the target (and remember that it came from that pattern of nodes). Work backwards over the permutations:
One of those will be shortest.
(where are the 'visited multiple times' nodes, if any? They're just hidden in the shortest-path initialization step. The shortest path between a and b may contain c or even the end point. You don't need to care)
Andrew Top has the right idea:
1) Dijkstra's Algorithm
2) Some TSP heuristic.
I recommend the Lin-Kernighan heuristic: it's one of the best known for any NP Complete problem. The only other thing to remember is that after you expanded out the graph again after step 2, you may have loops in your expanded path, so you should go around short-circuiting those (look at the degree of vertices along your path).
I'm actually not sure how good this solution will be relative to the optimum. There are probably some pathological cases to do with short circuiting. After all, this problem looks a LOT like Steiner Tree: and you definitely can't approximate Steiner Tree by just contracting your graph and running Kruskal's for example.
Considering the number of nodes and edges is relatively small, you can probably calculate every possible path and take the shortest one.
Generally this is known as the travelling salesman problem, which is NP-hard, so no known algorithm solves it in polynomial time.
The question talks about must-pass in ANY order. I have been trying to search for a solution about the defined order of must-pass nodes. I found my answer but since no question on StackOverflow had a similar question I'm posting here to let maximum people benefit from it.
If the order of the must-pass nodes is defined, then you could run Dijkstra's algorithm multiple times. For instance, let's assume you have to start from s, pass through k1, k2 and k3 (in that order) and stop at e. Then what you could do is run Dijkstra's algorithm between each consecutive pair of nodes. The cost and path would be given by:
dijkstras(s, k1) + dijkstras(k1, k2) + dijkstras(k2, k3) + dijkstras(k3, e)
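A short Python sketch of that chaining, assuming the graph is a dict of {node: [(neighbour, weight), ...]} with non-negative weights and comparable node labels. The names dijkstra_path and route_through are made up for this example.

```python
import heapq

def dijkstra_path(graph, source, target):
    """Return (cost, path) for the cheapest route, or (inf, []) if none exists."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    return dist[target], path[::-1]

def route_through(graph, waypoints):
    """Concatenate shortest paths between consecutive waypoints, in order."""
    total, full = 0, [waypoints[0]]
    for a, b in zip(waypoints, waypoints[1:]):
        cost, path = dijkstra_path(graph, a, b)
        if not path:
            return float("inf"), []
        total += cost
        full.extend(path[1:])  # skip the repeated junction node
    return total, full
```

Calling route_through(graph, ["s", "k1", "k2", "k3", "e"]) corresponds to the sum above. The combined route may pass through the same node more than once.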
How about using brute force on the dozen 'must visit' nodes. You can cover all the possible combinations of 12 nodes easily enough, and this leaves you with an optimal circuit you can follow to cover them.
Now your problem is simplified to one of finding optimal routes from the start node to the circuit, which you then follow around until you've covered them, and then find the route from that to the end.
Final path is composed of :
start -> path to circuit* -> circuit of must visit nodes -> path to end* -> end
You find the paths I marked with * like this
Do an A* search from the start node to every point on the circuit
for each of these do an A* search from the next and previous node on the circuit to the end (because you can follow the circuit round in either direction)
What you end up with is a lot of search paths, and you can choose the one with the lowest cost.
There's lots of room for optimization by caching the searches, but I think this will generate good solutions.
It doesn't go anywhere near looking for an optimal solution though, because that could involve leaving the must visit circuit within the search.
One thing that is not mentioned anywhere, is whether it is ok for the same vertex to be visited more than once in the path. Most of the answers here assume that it's ok to visit the same edge multiple times, but my take given the question (a path should not visit the same vertex more than once!) is that it is not ok to visit the same vertex twice.
So a brute force approach would still apply, but you'd have to remove vertices already used when you attempt to calculate each subset of the path.
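One way to sketch that stricter brute force in Python is a depth-first enumeration of simple paths that never reuses a vertex. It assumes a {node: [(neighbour, weight), ...]} adjacency dict with non-negative weights; the function name is hypothetical. The running time is exponential in the worst case, so it is only practical for small graphs.

```python
def shortest_simple_path(graph, start, end, mustpass):
    """Cheapest start-to-end path that covers mustpass and repeats no vertex."""
    required = set(mustpass)
    best = [float("inf"), None]  # [cost, path]

    def extend(node, cost, path, visited):
        if cost >= best[0]:
            return  # cannot improve (valid because weights are non-negative)
        if node == end:
            if required <= visited:
                best[0], best[1] = cost, list(path)
            return  # a simple path cannot leave 'end' and come back
        for v, w in graph.get(node, []):
            if v not in visited:
                visited.add(v)
                path.append(v)
                extend(v, cost + w, path, visited)
                path.pop()
                visited.remove(v)

    extend(start, 0, [start], {start})
    return best[0], best[1]
```

If a must-pass node sits in a dead end, no simple path exists and the function returns (inf, None), whereas the approaches that allow revisiting would still find a route.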
