How does Dijkstra's algorithm find the shortest path?

How can the shortest path be A,C,E,B,D when there is no path between E and B?

Dijkstra's algorithm adds nodes to the queue in the same order as Breadth-First Search (BFS) does: when a node is tested, its immediate neighbors are added to the queue.
The difference is the way nodes are pulled out of the queue. While BFS does it in FIFO (first in, first out) order, Dijkstra's algorithm does it by priority.
The node with the highest priority is pulled out of the queue, and the priority is the cost to get from the origin to that node.
When the origin A is tested, its immediate neighbors are added to the queue, so the queue holds 2 nodes:
B(10), C(3)
For convenience I added the cost to each node's name.
The next node to be pulled out of the queue and tested is the one with the highest priority, i.e. the lowest cost, which is C. After testing C the queue looks like this:
B(7), E(5), D(11)
The cost of B was updated from 10 to 7 because a path with a lower cost (A->C->B) was found.
The next node to be pulled out of the queue is E. Testing E does not add any of its neighbors (C, D) to the queue: C has already been tested, and D is already in the queue.
The queue after pulling E out looks like this:
B(7), D(11)
B which has the highest priority (lowest cost from origin) is pulled out from the queue.
Testing B updates the cost of D to 7+2 = 9. Now we have only D in the queue:
D(9)
D is pulled out, and because it is the target the search stops. The shortest path, with a cost of 9, has been found.
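For reference, a minimal Python sketch of this process, using heapq as the priority queue. The graph below is an assumed adjacency-list encoding of the example: the weights match the ones mentioned in the walkthrough, except the E-D weight, which is not stated and is only a guess to make the code runnable.

import heapq

def dijkstra(graph, source):
    """Return (dist, prev): the lowest cost from source to every reachable
    node, and back pointers for reconstructing the paths."""
    dist = {source: 0}
    prev = {}
    visited = set()
    pq = [(0, source)]                      # (cost from source, node)
    while pq:
        cost, node = heapq.heappop(pq)      # lowest cost = highest priority
        if node in visited:
            continue                        # stale queue entry, skip it
        visited.add(node)
        for neighbour, weight in graph[node]:
            new_cost = cost + weight
            if new_cost < dist.get(neighbour, float('inf')):
                dist[neighbour] = new_cost  # found a cheaper path
                prev[neighbour] = node      # back pointer
                heapq.heappush(pq, (new_cost, neighbour))
    return dist, prev

# Assumed example graph (undirected); the E-D weight of 7 is made up.
graph = {
    'A': [('B', 10), ('C', 3)],
    'B': [('A', 10), ('C', 4), ('D', 2)],
    'C': [('A', 3), ('B', 4), ('D', 8), ('E', 2)],
    'D': [('B', 2), ('C', 8), ('E', 7)],
    'E': [('C', 2), ('D', 7)],
}
dist, prev = dijkstra(graph, 'A')
print(dist['D'])   # 9

Following the prev back pointers from D gives D <- B <- C <- A, i.e. the path A, C, B, D with cost 3 + 4 + 2 = 9.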

In a typical implementation, Dijkstra's algorithm computes the lowest cost from the starting node (in this case A) to all other nodes.
To get the complete path from node A to some other node, we follow the back pointers from that node back to A; this is not shown in the example above.
The nodes in the settled set S are arranged in order of increasing cost from A. I am including a few resources on the topic, which might be helpful:
https://math.mit.edu/~rothvoss/18.304.3PM/Presentations/1-Melissa.pdf
https://www.programiz.com/dsa/dijkstra-algorithm

Related

Space Complexity in Breadth First Search (BFS) Algorithm

According to Artificial Intelligence: A Modern Approach by Stuart J. Russell and Peter Norvig (4th edition), the space complexity of BFS is O(b^d), where 'b' is the branching factor and 'd' is the depth.
This complexity is obtained from the assumption that we store all nodes until we arrive at the target node, in other words: 1 + b + b^2 + b^3 + ... + b^d => O(b^d).
But why should we store all nodes? Don't we use a queue for the implementation?
If we use a queue, we don't need to store all nodes, because we enqueue and dequeue nodes as we go, so when we find the target node(s), only some of the nodes are in the queue (not all of them).
Is my understanding wrong?
The problem in BFS is that you essentially pursue a number of paths in parallel. In depth-first search, you take one branch only, and once that has been explored, all the nodes on it can be dequeued. So you never need more than one branch worth of nodes in your queue.
But in BFS you have to keep every branch up to the current depth; you cannot discard any of them, as they haven't been fully explored yet. So you need to keep track of the 'history' of the current 'head'-node of the path. In DFS there is only ever one path at a time, but in BFS there are more, depending on branching factor and current depth.
At any moment while we apply BFS, the queue holds at most two levels of nodes. For example, if we have just started searching depth d, the queue contains all nodes at depth d; as we proceed, it drains the nodes at depth d and fills up with the nodes at depth d+1. So at any moment we use O(b^d) space.
Also 1+b+b^2+...+b^d = (b^(d+1)-1)/(b-1).
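To make this concrete, here is a small sketch (a complete b-ary tree is just an assumed stand-in for a search space with branching factor b) that tracks the largest number of nodes ever held in the BFS queue; it reaches b^d.

from collections import deque

def max_bfs_queue_size(b, d):
    """BFS over a complete b-ary tree of depth d, returning the largest
    number of nodes held in the queue at any one time."""
    # A node is just (depth, index); children are generated on the fly.
    queue = deque([(0, 0)])
    max_size = 1
    while queue:
        depth, index = queue.popleft()
        if depth == d:
            continue                              # leaves have no children
        for child in range(b):
            queue.append((depth + 1, index * b + child))
        max_size = max(max_size, len(queue))
    return max_size

print(max_bfs_queue_size(3, 4))   # 81, i.e. b^d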

Throw an error if multiple shortest paths are found

Given an undirected weighted graph, a start, and an end point. You need to find the shortest path to that end point, but throw an error if multiple shortest paths are found. How would we do the error part? Is there a way to check if multiple shortest paths to a destination node exist?
My idea is that when we pop a node from the priority queue in Dijkstra's algorithm, if that node is the destination node, we check whether the priority queue still contains another element for the same destination node.
During the relaxation, instead of pushing to the queue only if the new distance is smaller, we could also push if it is equal. But I am not sure about it.
EDIT: It's a weighted undirected graph.
One way to do this is by creating a shortest-path DAG. Whenever you relax an edge from node A to node B with cost C (assuming the current shortest distance from the source to each node is stored in an array dist): if dist[A] + C is greater than dist[B], do nothing; if dist[A] + C is equal to dist[B], then we can reach B by a shortest path using a different route than before, so we add A to the list of nodes that can reach B on a shortest path (call this array pars), i.e. we add A to pars[B]; and finally, if dist[A] + C is less than dist[B], then we update dist[B], clear the previous values from pars[B], and add A to pars[B].
The resulting graph is guaranteed to be a DAG if all edge weights are strictly greater than 0. You can then count the number of shortest paths to the destination node with some easy dynamic programming: process the nodes in topological order, and the number of paths to each node is the sum of the numbers of paths to the nodes that can reach it (the nodes in pars[node]).
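A minimal Python sketch of this idea (dist and pars are the arrays described above; the adjacency-list graph at the bottom is just an illustrative example with two equally short routes):

import heapq

def count_shortest_paths(graph, source, target):
    """Dijkstra plus the shortest-path DAG described above.
    graph: node -> list of (neighbour, positive weight)."""
    dist = {node: float('inf') for node in graph}
    pars = {node: [] for node in graph}        # predecessors on shortest paths
    dist[source] = 0
    pq = [(0, source)]
    while pq:
        d, a = heapq.heappop(pq)
        if d > dist[a]:
            continue                           # stale entry
        for b, c in graph[a]:
            if dist[a] + c < dist[b]:          # strictly better: reset pars[b]
                dist[b] = dist[a] + c
                pars[b] = [a]
                heapq.heappush(pq, (dist[b], b))
            elif dist[a] + c == dist[b]:       # equally good: one more route
                pars[b].append(a)
    # With positive weights the pars edges form a DAG, and increasing dist is
    # a valid topological order, so one pass suffices for the path counts.
    count = {node: 0 for node in graph}
    count[source] = 1
    for node in sorted(graph, key=lambda n: dist[n]):
        for p in pars[node]:
            count[node] += count[p]
    return dist[target], count[target]

# Two shortest A->D paths of length 3: A-B-D and A-C-D.
graph = {'A': [('B', 1), ('C', 2)], 'B': [('D', 2)], 'C': [('D', 1)], 'D': []}
length, n_paths = count_shortest_paths(graph, 'A', 'D')
if n_paths > 1:
    raise ValueError(f"{n_paths} shortest paths of length {length} found")

For an undirected graph, simply list each edge in both adjacency lists.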
Hopefully this was useful and clear.

Single-source shortest path in a graph with positive weights and diameter D

In a problem, I am given a graph G with only positive weights and its diameter (i.e. the greatest of the shortest-path distances over all pairs of vertices in G) = D. The problem asks for a single-source shortest path algorithm that is faster than Dijkstra and runs in O(V+E+D) time.
What I've considered so far:
I have thought about adding dummy nodes so as to transform G into an unweighted graph G' and then running BFS, but this would result in a complexity of O(V+WE)
(as in G', E' = O(WE) and V' = O(WE+V)).
It seems to me that D doesn't really help reduce the complexity of the problem, as the sum of the weights (i.e. the total number of dummy nodes to add) is unrelated to D.
Use Dijkstra's algorithm with an optimised version of the priority queue. Assume the graph has nodes 0..V-1.
The priority queue will consist of an array Arr[0..D] (an array with indices between 0 and D inclusive) of doubly-linked lists of nodes, together with an index i indicating that every node currently in the array has distance at least i from the starting node, and an array location[0..V-1], where location[node] is the doubly-linked-list node in Arr containing node, or null if there is no such node. We store a node in the list Arr[i] when we have found a path of length i from the start node to the node in question.
Adding a node to the priority queue which is not already there is O(1) - if we have a tentative distance s, then we add the node to the linked list Arr[s] and update location[node] accordingly. Note that if the priority is >D, we should actually refrain from adding the node to the priority queue entirely and be confident that we will later add it to the queue with a priority <= D.
Removing a given node from the priority queue is also O(1) - we can find its doubly-linked-list node in O(1) using location[node], delete that node from the doubly-linked-list, and set location[node] to null. We will need this operation when we change the priority of a node.
Finding and removing the minimal node is less trivial. We keep incrementing i until we find some i such that Arr[i] is not empty, then remove a node found in Arr[i] from the priority queue (don't forget to update location[node] as well). The total number of increments over the whole run is D, since i only ever moves from 0 up to D, one step at a time. Ignoring the increments, the other work done in this process is O(1).
Note that this is ONLY valid because we can guarantee that once we remove a node with priority i, we will never add another node with priority <i into the priority queue. It also only works because we know that we could never actually remove anything added to the priority queue with priority > D, since we can only remove something from the priority queue when it has its finalised correct path length, which is <= D - therefore, it's unnecessary to add anything to the priority queue with priority > D. This follows from the general properties of Dijkstra's algorithm when the graph has positive edge weights, and from the fact that the graph's diameter is D.
So the algorithm will be O(V + E + D), as required.
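A rough Python sketch of this bucket-based queue (one simplification compared with the description above: instead of doubly-linked lists and a location array, it leaves stale entries in the buckets and skips them on removal, which keeps the same O(V + E + D) bound because each edge relaxation adds at most one entry):

def dijkstra_buckets(graph, source, D):
    """Single-source shortest paths in O(V + E + D), assuming all weights are
    positive integers and every shortest-path distance is at most D.
    graph: node -> list of (neighbour, weight)."""
    INF = float('inf')
    dist = {node: INF for node in graph}
    dist[source] = 0
    buckets = [[] for _ in range(D + 1)]   # buckets[i]: nodes with tentative distance i
    buckets[0].append(source)
    done = set()
    for i in range(D + 1):                 # i only ever increases: D increments in total
        for node in buckets[i]:
            if node in done or dist[node] != i:
                continue                   # stale entry: this node's distance has since improved
            done.add(node)
            for neighbour, w in graph[node]:
                nd = i + w
                if nd < dist[neighbour] and nd <= D:   # never store priorities > D
                    dist[neighbour] = nd
                    buckets[nd].append(neighbour)
    return dist

# Toy example with diameter 4.
graph = {0: [(1, 2), (2, 1)], 1: [(0, 2), (3, 1)], 2: [(0, 1), (3, 3)], 3: [(1, 1), (2, 3)]}
print(dijkstra_buckets(graph, 0, 4))   # {0: 0, 1: 2, 2: 1, 3: 3}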

calculate the degree of separation between two nodes in a connected graph

I am working on a graph library that needs to determine whether two nodes are connected and, if they are connected, what the degree of separation between them is,
i.e. the number of nodes traversed to reach the target node from the source node.
Since it is an unweighted graph, a BFS gives the shortest path. But how do I keep track of the number of nodes discovered before reaching the target node?
A simple counter which increments on discovering a new node gives a wrong answer, as it may include nodes which are not even on the path.
Another way would be to treat this as a weighted graph with uniform edge weights and use Dijkstra's shortest path algorithm.
But I want to manage it with BFS only.
How can I do it?
During the BFS, have each node store a pointer to its predecessor node (the node in the graph along whose edge the node was first discovered). Then, once you've run BFS, you can repeatedly follow this pointer from the destination node to the source node. If you count up how many steps this takes, you will have the distance from the destination to the source node.
Alternatively, if you need to repeatedly determine the distances between nodes, you might want to use the Floyd-Warshall all-pairs shortest paths algorithm, which if precomputed would let you immediately read off the distances between any pair of nodes.
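A small Python sketch of the predecessor-pointer approach (the adjacency-list graph and the names are just illustrative):

from collections import deque

def degree_of_separation(graph, source, target):
    """BFS from source, storing each node's predecessor; returns the number of
    edges on a shortest path to target, or None if they are not connected."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour in graph[node]:
            if neighbour not in prev:        # first time this node is discovered
                prev[neighbour] = node
                queue.append(neighbour)
    if target not in prev:
        return None                          # not connected
    steps = 0
    while prev[target] is not None:          # walk the back pointers to the source
        target = prev[target]
        steps += 1
    return steps

graph = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B', 'D'], 'D': ['C']}
print(degree_of_separation(graph, 'A', 'D'))   # 3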
Hope this helps!
I don't see why a simple counter wouldn't work. In this case, breadth-first search would definitely give you the shortest path. So what you want to do is attach a property to every node called 'count'. Now when you encounter a node that you have not visited yet, you populate the 'count' property with whatever the current count is and move on.
If later on, you come back to the node, you should know by the populated count property that it has already been visited.
EDIT: To expand a bit on my answer here, you'll have to maintain a variable that'll track the degree of separation from your starting node as you navigate the graph. For every new set of children that you load into the queue, make sure that you increment the value in that variable.
If all you want to know is the distance (possibly to cut off the search if the distance is too large), and all edges have the same weight (i.e. 1):
Pseudocode:
Let Token := a new object which is different from every node in the graph
Let Distance := 0
Let Queue := an empty queue of nodes
Push the Start node and Token onto Queue
(Breadth-first search):
While Queue is not empty:
    Pop the head of Queue
    If the popped item is the Target node:
        return Distance
    If the popped item is Token:
        If Queue is empty:
            Stop            (only the Token was left, so the target was not found)
        Increment Distance
        Push Token onto the back of Queue
    Else if the popped item has not yet been seen:
        Mark it as seen
        Push all of its neighbours onto the back of Queue
(Did not find target)
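A runnable Python version of the same idea, using None as the Token (the adjacency-list graph is just an assumed example):

from collections import deque

def bfs_distance(graph, start, target):
    """Distance in edges from start to target, or None if unreachable.
    A sentinel (None) in the queue marks the end of each depth level."""
    distance = 0
    seen = set()
    queue = deque([start, None])             # None plays the role of Token
    while queue:
        head = queue.popleft()
        if head is None:
            if not queue:                    # only the sentinel was left
                break
            distance += 1
            queue.append(None)               # mark the end of the next level
        elif head == target:
            return distance
        elif head not in seen:
            seen.add(head)
            queue.extend(graph[head])
    return None                              # did not find target

graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A'], 'D': ['B']}
print(bfs_distance(graph, 'A', 'D'))   # 2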

Is there a proper algorithm to solve this edge-removing problem?

There is a directed graph (not necessarily connected) of which one or more nodes are distinguished as sources. Any node accessible from any one of the sources is considered 'lit'.
Now suppose one of the edges is removed. The problem is to determine the nodes that were previously lit and are not lit anymore.
An analogy like a city electricity system may be considered, I presume.
This is a "dynamic graph reachability" problem. The following paper should be useful:
A fully dynamic reachability algorithm for directed graphs with an almost linear update time. Liam Roditty, Uri Zwick. Theory of Computing, 2002.
This gives an algorithm with O(m * sqrt(n))-time updates (amortized) and O(sqrt(n))-time queries on a possibly-cyclic graph (where m is the number of edges and n the number of nodes). If the graph is acyclic, this can be improved to O(m)-time updates (amortized) and O(n/log n)-time queries.
It's always possible you could do better than this given the specific structure of your problem, or by trading space for time.
If, instead of just "lit" or "unlit", you keep for each node the set of nodes from which it is powered or lit, and consider a node with an empty set as "unlit" and a node with a non-empty set as "lit", then removing an edge simply involves removing the edge's source node from the target node's set.
EDIT: Forgot this:
And if you remove the last lit-from node in the set, traverse the outgoing edges and remove the node you just "unlit" from their sets (and possibly traverse onward from there too, and so on).
EDIT2 (rephrase for tafa):
Firstly: I misread the original question and thought that it stated that for each node it was already known to be lit or unlit, which as I re-read it now, was not mentioned.
However, if for each node in your network you store a set containing the nodes it was lit through, you can easily traverse the graph from the removed edge and fix up any lit/unlit references.
So for example, if we have nodes A, B, C, D with edges A->B, A->C, B->D and C->D (ascii art):
A -> B -> D
 \-> C ---^
Then at node A you would store that it is a source (and thus lit by itself), in both B and C you would store that they are lit by A, and in D you would store that it is lit by both B and C.
Then say we remove the edge from B to D: in D we remove B from the lit-source list, but D remains lit because it is still lit through C (and ultimately from the source A). Next say we remove the edge from A to C: A is removed from C's set, and thus C is no longer lit. We then go on to traverse the edges that originate at C, and remove C from D's set, which makes D unlit as well. In this case we are done, but if the set were bigger, we'd just go on from D.
This algorithm will only ever visit the nodes that are directly affected by a removal or addition of an edge, and as such (apart from the extra storage needed at each node) should be close to optimal.
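A rough Python sketch of that bookkeeping (the edges/lit_by dictionaries and their names are made up for illustration; lit_by[node] holds the node's lit direct predecessors, with sources containing themselves):

def remove_edge(edges, lit_by, u, v):
    """Remove the edge u -> v and propagate any 'unlit' changes."""
    edges[u].discard(v)
    lit_by[v].discard(u)
    if lit_by[v]:
        return                         # v is still lit through someone else
    stack = [v]                        # v just went dark: propagate onward
    while stack:
        dark = stack.pop()
        for succ in edges[dark]:
            if dark in lit_by[succ]:
                lit_by[succ].discard(dark)
                if not lit_by[succ]:   # succ has no lit predecessor left
                    stack.append(succ)

# The A, B, C, D example above, with A as the only source.
edges  = {'A': {'B', 'C'}, 'B': {'D'}, 'C': {'D'}, 'D': set()}
lit_by = {'A': {'A'}, 'B': {'A'}, 'C': {'A'}, 'D': {'B', 'C'}}
remove_edge(edges, lit_by, 'B', 'D')   # D stays lit (still lit through C)
remove_edge(edges, lit_by, 'A', 'C')   # C goes dark, and then so does D
print({n for n, s in lit_by.items() if s})   # {'A', 'B'}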
Is this your homework?
The simplest solution is to do a DFS (http://en.wikipedia.org/wiki/Depth-first_search) or a BFS (http://en.wikipedia.org/wiki/Breadth-first_search) on the original graph starting from the source nodes. This will get you all the original lit nodes.
Now remove the edge in question and do the DFS again. This gives you the nodes which still remain lit.
Output the nodes that appear in the first set but not the second.
This is an asymptotically optimal algorithm, since you do two DFSs (or BFSs) which take O(n + m) time and space (where n = number of nodes, m = number of edges), and these dominate the complexity. You need at least Ω(n + m) time and space just to read the input, therefore the algorithm is optimal.
Now if you want to remove several edges, that would be interesting. In this case, we would be talking about dynamic data structures. Is this what you intended?
EDIT: Taking into account the comments:
Not being connected is not a problem, since nodes in unreachable components will simply not be reached during the search.
There is a smart way to do the DFS or BFS from all source nodes at once (I will describe BFS): you just have to put them all on the stack/queue at the beginning.
Pseudo code for a BFS which searches for all nodes reachable from any of the starting nodes:
Queue q = [all starting nodes]
mark all starting nodes as visited
while (q not empty)
{
    x = q.pop()
    forall (y neighbour of x) {
        if (y was not visited) {
            visited[y] = true
            q.push(y)
        }
    }
}
Replace Queue with a Stack and you get a sort of DFS.
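Putting it together, a small Python sketch of this whole answer (multi-source BFS before and after removing the edge, then the set difference; the dictionary-of-lists graph encoding is just an assumption):

from collections import deque

def reachable(edges, sources):
    """All nodes reachable from any of the sources (multi-source BFS, as above)."""
    visited = set(sources)
    queue = deque(sources)
    while queue:
        x = queue.popleft()
        for y in edges.get(x, ()):
            if y not in visited:
                visited.add(y)
                queue.append(y)
    return visited

def newly_unlit(edges, sources, removed_edge):
    """Nodes that were lit before removing removed_edge but not after."""
    before = reachable(edges, sources)
    pruned = {node: [y for y in succs if (node, y) != removed_edge]
              for node, succs in edges.items()}
    after = reachable(pruned, sources)
    return before - after

edges = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(newly_unlit(edges, ['A'], ('A', 'B')))   # {'B'}; D stays lit via C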
How big and how connected are the graphs? You could store all paths from the source nodes to all other nodes and look for nodes where every path to that node contains one of the removed edges.
EDIT: To extend this description a bit:
Do a DFS from each source node. Keep track of all paths generated to each node, stored as sets of edges rather than vertex sequences: we only need to know which edges are involved, not their order, so each path can be a bitmap. Keep a count for each node of the number of paths from a source to that node.
Now iterate over the paths. Remove any path that contains the removed edge(s) and decrement the counter for that node. If a node counter is decremented to zero, it was lit and now isn't.
I would keep the information about connected source nodes on the edges while building the graph (e.g. if an edge has connectivity to the sources S1 and S2, its source list contains S1 and S2), and create the nodes with the information of their input and output edges. When an edge is removed, update the output edges of that edge's target node by considering the node's remaining input edges, then traverse all the target nodes of the updated edges using DFS or BFS (in the case of a cyclic graph, consider marking). While updating the graph it is also possible to find nodes left without any edge that has a source connection (lit -> unlit nodes). However, it might not be a good solution if you'd like to remove multiple edges at the same time, since that may cause the same edges to be traversed again and again.
