As we know, Dijkstra's algorithm finds the shortest path from a single source node to every other node in a given graph. I am trying to modify it to find the shortest path between one source node and one destination node. It seems easy: just add a termination condition that stops the program when Dijkstra finds the destination node.
However, the termination condition I set in my Python code seems to produce a sub-optimal path rather than the shortest path.
The Dijkstra code is as follows:
def dijkstra(adjList, source, sink):
    #define variables
    n = len(adjList) #intentionally 1 more than the number of vertices, keep the 0th entry free for convenience
    visited = [False]*n
    parent = [-1] *n
    #distance = [float('inf')]*n
    distance = [1e7]*n
    heapNodes = [None]*n
    heap = FibonacciHeap()
    for i in range(1, n):
        heapNodes[i] = heap.insert(1e7, i)
    distance[source] = 0
    heap.decrease_key(heapNodes[source], 0)
    while heap.total_nodes:
        current = heap.extract_min().value
        #print("Current node is: ", current)
        visited[current] = True
        #early exit
        if sink and current == sink:
            break
        for (neighbor, cost) in adjList[current]:
            if not visited[neighbor]:
                if distance[current] + cost < distance[neighbor]:
                    distance[neighbor] = distance[current] + cost
                    heap.decrease_key(heapNodes[neighbor], distance[neighbor])
                    if neighbor == sink and current != source: # this is a wrong logic , since the neighbor may not be selected as the next hop.
                        print("find the sink 1")
                        printSolution(source, sink, distance,parent)
                        break
        if neighbor == sink:
            print("find the sink2")
            break
    return distance
adjList = [
[],
[[2, 7], [3, 9], [6, 14]],
[[1, 7], [4, 15], [3, 10]],
[[1, 9], [2, 10], [4, 11], [6, 2]],
[[2, 15], [3, 11], [5, 6]],
[[4, 6], [6, 9]],
[[5, 9], [1, 14]]
]
dijkstra(adjList,1,4)
The graph of the adjacency list is as shown:
I want to find the path from node 1 to node 4. There are six paths:
path 1: 1 --> 2 --> 4 cost: 22
path 2: 1 --> 2 --> 3 --> 4 cost: 28
path 3: 1 --> 3 --> 4 cost: 20
path 4: 1 --> 3 --> 6 --> 5 --> 4 cost: 26
path 5: 1 --> 6 --> 3 --> 4 cost: 27
path 6: 1 --> 6 --> 5 --> 4 cost: 29
The original Dijkstra selects path 3: 1 --> 3 --> 4, since it has the minimum cost.
But I modified the termination condition so that the program ends as soon as a neighbor of the current node is the destination. With that change, the path I get between node 1 and node 4 is path 1: 1 --> 2 --> 4.
My analysis is that the termination condition is wrong: stopping as soon as a neighbor of the current node is the destination is incorrect, but I don't know how to set a proper termination condition for when the destination node is found. Could you please provide some ideas?
The only right place for the termination condition is at the start of the outer loop, right after you take the current node from the heap.
It is wrong to do that test while you iterate over the neighbors, as you have no guarantee that this last edge is part of the shortest path. Just imagine an insanely high cost for that last step to the neighbor: it could never be on the shortest path. So don't test for termination there: there might still be another, cheaper path to the sink.
I also did not see where you actually populate parent in your code.
I would also not put all nodes on the heap from the start, as heap operations are faster when the heap has fewer elements. You can start with a heap containing just one node.
Another little optimisation is to use parent for marking nodes as visited too, so you don't need both parent and visited.
Finally, I don't know the FibonacciHeap library, so I have used heapq, which is a very light heap implementation:
from heapq import heappop, heappush

def dijkstra(adjList, source, sink):
    n = len(adjList)
    parent = [None]*n
    heap = [(0, source, 0)]  # No need to push all nodes on the heap at the start
    # only add the source to the heap
    while heap:
        distance, current, came_from = heappop(heap)
        if parent[current] is not None:  # skip if already visited
            continue
        parent[current] = came_from  # this also marks the node as visited
        if sink and current == sink:  # only correct place to have terminating condition
            # build path
            path = [current]
            while current != source:
                current = parent[current]
                path.append(current)
            path.reverse()
            return distance, path
        for (neighbor, cost) in adjList[current]:
            if parent[neighbor] is None:  # not yet visited
                heappush(heap, (distance + cost, neighbor, current))

adjList = [
[],
[[2, 7], [3, 9], [6, 14]],
[[1, 7], [4, 15], [3, 10]],
[[1, 9], [2, 10], [4, 11], [6, 2]],
[[2, 15], [3, 11], [5, 6]],
[[4, 6], [6, 9]],
[[5, 9], [1, 14]]
]
dist, path = dijkstra(adjList,1,4)
print("found shortest path {}, which has a distance of {}".format(path, dist))
You actually already have the correct exit condition in your code: the check current == sink. You cannot impose any other exit condition. The algorithm has to run until the destination node is visited, because only at that point is the shortest distance to the destination final. Because of this, the worst-case complexity of the single-source, single-destination shortest path problem is the same as that of the single-source, all-nodes problem. So your early exit condition is correct, and you should remove all the checks on neighbor.
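To make the difference concrete, here is a small sketch that runs both stopping rules on the graph from the question. It is not the asker's code: it uses heapq with lazy deletion instead of the FibonacciHeap library, and the function name shortest_distance and the flag stop_on_discovery are made up for illustration.

```python
from heapq import heappop, heappush

def shortest_distance(adj, source, sink, stop_on_discovery=False):
    """Dijkstra with lazy deletion; returns the distance reported for sink.

    stop_on_discovery=True reproduces the flawed rule from the question:
    stop as soon as the sink shows up as a neighbor.
    """
    best = {source: 0}   # best tentative distance found so far
    done = set()         # nodes whose distance is final
    heap = [(0, source)]
    while heap:
        dist, node = heappop(heap)
        if node in done:
            continue     # stale heap entry
        done.add(node)
        if node == sink:
            return dist  # correct exit: the sink was extracted from the heap
        for neighbor, cost in adj[node]:
            new_dist = dist + cost
            if neighbor not in done and new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heappush(heap, (new_dist, neighbor))
                if stop_on_discovery and neighbor == sink:
                    return new_dist  # flawed exit: only a tentative distance
    return None  # sink not reachable

adj = [
    [],
    [[2, 7], [3, 9], [6, 14]],
    [[1, 7], [4, 15], [3, 10]],
    [[1, 9], [2, 10], [4, 11], [6, 2]],
    [[2, 15], [3, 11], [5, 6]],
    [[4, 6], [6, 9]],
    [[5, 9], [1, 14]],
]
print(shortest_distance(adj, 1, 4))                          # 20
print(shortest_distance(adj, 1, 4, stop_on_discovery=True))  # 22
```

On this graph the correct rule returns 20 (path 1 --> 3 --> 4), while the flawed rule returns 22, the cost of 1 --> 2 --> 4.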
Related
I am trying to implement an algorithm that requires a post-order traversal. Here is my graph (taken from here, pg. 8):
When I try to do a postorder traversal of this, the order I get is:
[3, 2, 1, 5, 4, 6]
The problem is that my algorithm won't work with this order. This is the code I am using to get it (pseudocode):
function PostOrder(root, out_list) {
    root.visited = true
    for child in root.Children {
        if not child.visited {
            PostOrder(child, out_list)
        }
    }
    out_list.append(root)
}
Is the postorder correct?
Yes, the post order traversal of your algorithm is correct. The expected output is indeed as you provided it.
Your confusion may come from the fact that the graph is not a binary tree, and not even a tree. It is a directed graph.
In general postorder means that you first perform a postorder traversal on the node behind the first outgoing edge, then on the node behind its next outgoing edge, ...etc, and only after all outgoing edges have been traversed, the node itself is output.
Since at node 1 you are not at the end yet, and still can go to 2, and from there to 3, you need to follow those edges before outputting anything. And only then backtrack.
For reference, here is your algorithm implemented in Python:
def postorder(root, out_list, children, visited):
    visited[root] = True
    for child in children[root]:
        if not visited[child]:
            postorder(child, out_list, children, visited)
    out_list.append(root)

children = [
    [],     # dummy for node 0
    [2],    # 1
    [1,3],  # 2
    [2],    # 3
    [2,3],  # 4
    [1],    # 5
    [5,4]   # 6
]
nodes = []
postorder(6, nodes, children, [False] * len(children))
print(nodes)  # [3, 2, 1, 5, 4, 6]
I think you are confusing this with the postorder traversal of binary trees.
Postorder traversal in a graph is different.
Post Ordering in Graphs – If we list the vertices in the order in which they are last visited by DFS traversal then the ordering is called PostOrder.
Assuming your root node is 6, the order you mention is correct.
Check out the following example of how the postorder list is generated:
Pass 1:
List: []
6 -> 5 -> 1 -> 2 -> 3 (Node 3 now has no unvisited adjacent nodes.)
List: [3]
Pass 2:
6 -> 5 -> 1 -> 2
Node 2 has no unvisited adjacent nodes.
List: [3, 2]
Pass 3:
6 -> 5 -> 1
Node 1 has no unvisited adjacent nodes.
List: [3, 2, 1]
Pass 4:
6 -> 5
Node 5 has no unvisited adjacent nodes.
List: [3, 2, 1, 5]
Pass 5:
6 -> 4
Node 4 has no unvisited adjacent nodes.
List: [3, 2, 1, 5, 4]
Pass 6:
Node 6 has no unvisited adjacent nodes.
List: [3, 2, 1, 5, 4, 6]
Important Notes:
Because we are using DFS, several traversal orders are possible, depending on the order of the nodes in each adjacency list.
The following are all valid postorders:
[3, 2, 1, 5, 4, 6]
[1, 3, 2, 4, 5, 6]
[3, 1, 2, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
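As a quick check of the note above, the following sketch reruns a DFS postorder on the same edges as the children list in the earlier answer, once with node 6's neighbors in the original order and once swapped. The dictionary layout and the helper name postorder_from are only illustrative choices.

```python
def postorder_from(root, children):
    """Return the DFS postorder of all nodes reachable from root."""
    order, visited = [], set()

    def visit(node):
        visited.add(node)
        for child in children[node]:
            if child not in visited:
                visit(child)
        order.append(node)  # emitted only after all outgoing edges are handled

    visit(root)
    return order

children = {1: [2], 2: [1, 3], 3: [2], 4: [2, 3], 5: [1], 6: [5, 4]}
print(postorder_from(6, children))   # [3, 2, 1, 5, 4, 6]

reordered = dict(children)
reordered[6] = [4, 5]                # visit node 4 before node 5
print(postorder_from(6, reordered))  # [1, 3, 2, 4, 5, 6]
```

Both results appear in the list of valid orders; only the adjacency order changed.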
I have a large dataset of segments (ai, bi), where ai < bi, and many queries. Each query asks for the number of segments that intersect a given range (b, e). The number of queries can be very large. A naive algorithm scans all segments for each query, which takes O(N) time per query. Is there a faster way to do this? I imagine sorting the segments in ascending order of ai may help, but I don't know what to do about the other endpoint.
segments: [1, 3], [2, 6], [4, 7], [7, 8]
query 1: [2, 5] => output [1, 3] [2, 6], [4, 7]
...
Make list B of sorted start points, as you wrote.
Make list P of structures containing all points, both start and end points, each with a field SE = +1 for a start and -1 for an end. Sort it by point coordinate.
Set Active = 0. Walk through P, adding SE to Active and building a new list A containing each point position and the Active count at that point.
For every query, binary-search the query start in A for the lower position and read Active there: the number of segments open at that moment.
Then find the indexes in B corresponding to the query start and the query end, and take the index difference: the number of segments starting inside the query interval.
The sum of these two values is the required number of intersected segments (you don't need the segments themselves according to the problem statement).
Time per query is O(log N).
[1, 3], [2, 6], [4, 7], [7, 8] initial list
[1, 2, 4, 7] list B
(1,1),(2,1),(3,-1),(4,1),(6,-1),(7,-1),(7,1),(8,-1) list P
(1,1),(2,2),(3,1),(4,2),(6,1),(7,0),(7,1),(8,0) list A
q start 2 finds entry (2,2) in A, so active = 2 (two active intervals)
searching 2 in B gives index 1, searching 5 gives index 2,
difference is 1
result = 2 + 1 = 3
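A minimal Python sketch of this counting idea, assuming closed segments and closed queries (touching endpoints count as intersecting; that convention is my assumption, since the answer does not spell it out). Rather than building list A explicitly, it keeps a second sorted list of end points: the segments open at the query start plus those starting inside the query equal the segments that start at or before the query end minus those that end before the query start. The names build_index and count_intersections are illustrative.

```python
from bisect import bisect_left, bisect_right

def build_index(segments):
    """Sort start points and end points separately: O(N log N), done once."""
    starts = sorted(a for a, _ in segments)
    ends = sorted(b for _, b in segments)
    return starts, ends

def count_intersections(index, q_start, q_end):
    """Count segments [a, b] with a <= q_end and b >= q_start in O(log N)."""
    starts, ends = index
    started = bisect_right(starts, q_end)   # segments with a <= q_end
    finished = bisect_left(ends, q_start)   # segments with b < q_start
    return started - finished

index = build_index([(1, 3), (2, 6), (4, 7), (7, 8)])
print(count_intersections(index, 2, 5))  # 3
```

For the example query [2, 5] this gives 3, matching the worked result above.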
This question is very similar to Leetcode's Critical Connections in a Network. Given an undirected graph, we want to find all bridges. An edge in an undirected connected graph is a bridge iff removing it disconnects the graph.
Variant
Instead of finding all bridges, I want to maximise the number of edges to remove so that the graph remains connected.
Example 1
Input: n = 5, edges = [[1, 2], [1, 3], [3, 4], [1, 4], [4, 5]]
Output: 1
First, I can remove [3, 4], [1, 3], or [1, 4]. After removing any one of these three edges, the remaining edges are all bridges. Hence, the maximum number of edges that can be removed while keeping the graph connected is 1.
Example 2
Input: n = 6, edges = [[1, 2], [1, 3], [2, 3], [2, 4], [2, 5], [4, 6], [5, 6]]
Output: 2
This is easy: if a connected graph has E edges and N nodes, we can remove E - N + 1 edges and the graph stays connected.
How to do this?
Just run DFS/BFS to find any spanning tree of the graph. Since a spanning tree is connected and has N - 1 edges, we can remove all the other edges.
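A short sketch of that approach, assuming nodes are labelled 1..n as in the examples and that the graph has no duplicate edges. The function name max_removable_edges and the choice to return None for a disconnected input are my own.

```python
from collections import deque

def max_removable_edges(n, edges):
    """Return E - N + 1 for a connected graph on nodes 1..n, else None."""
    adjacency = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # BFS from node 1 to confirm that the graph is connected.
    seen = {1}
    queue = deque([1])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    if len(seen) != n:
        return None  # no spanning tree exists

    # A spanning tree keeps n - 1 edges; every other edge can be removed.
    return len(edges) - (n - 1)

print(max_removable_edges(5, [[1, 2], [1, 3], [3, 4], [1, 4], [4, 5]]))  # 1
print(max_removable_edges(6, [[1, 2], [1, 3], [2, 3], [2, 4], [2, 5], [4, 6], [5, 6]]))  # 2
```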
Suppose I have a tree given in nested list representation. How do I traverse it breadth first? For example, if I'm given
[1, [2, [3, [4, [3, 5]]]], [3, [4, 5, 2]]]
The output would be
[1,2,3,3,4,4,5,2,3,5]
Also, given a flattened representation of the depth-first order like [1,2,3,4,3,5,3,4,5,2], how do I find the indices of the breadth-first order?
Thanks in advance for any help.
Here's the code in Python:
queue = [1, [2, [3, [4, [3, 5]]]], [3, [4, 5, 2]]]
while queue:
    firstItem = queue.pop(0)
    if type(firstItem) is list:
        for item in firstItem:
            queue.append(item)
    else:
        print('Traversed %d' % (firstItem))
The output is:
Traversed 1
Traversed 2
Traversed 3
Traversed 3
Traversed 4
Traversed 5
Traversed 2
Traversed 4
Traversed 3
Traversed 5
After comparing my output with the output you specified in your question, I think mine is more correct. More specifically, the leftmost 3 in the input list and [4, 5, 2] at the end of the input list are on the same "level", and thus should be traversed as 3, 4, 5, 2, as shown in the 4th to 7th lines of my output.
As for your second question, I think you should ask it separately, because it really is a completely different question.
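For larger inputs, the same queue-based traversal can be sketched with collections.deque, since list.pop(0) shifts every remaining element while deque.popleft() does not. This variant collects the values in a list instead of printing them; the function name breadth_first is just for illustration.

```python
from collections import deque

def breadth_first(nested):
    """Flatten a nested list level by level, using a FIFO queue."""
    queue = deque(nested)
    order = []
    while queue:
        item = queue.popleft()
        if isinstance(item, list):
            queue.extend(item)   # children go to the back of the queue
        else:
            order.append(item)
    return order

tree = [1, [2, [3, [4, [3, 5]]]], [3, [4, 5, 2]]]
print(breadth_first(tree))  # [1, 2, 3, 3, 4, 5, 2, 4, 3, 5]
```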
I want an algorithm that gives one instance of a cycle in a directed graph if there is any. Can anyone show me a direction? In pseudo-code, or preferably, in Ruby?
I previously asked a similar question and, following the suggestions there, implemented Kahn's algorithm in Ruby to detect whether a graph has a cycle. But I want to know not only whether it has a cycle, but also one possible instance of such a cycle.
example_graph = [[1, 2], [2, 3], [3, 4], [3, 5], [3, 6], [6, 2]]
Kahn's algorithm
def cyclic? graph
  ## The set of edges that have not been examined
  graph = graph.dup
  n, m = graph.transpose
  ## The set of nodes that are the supremum in the graph
  sup = (n - m).uniq
  while sup_old = sup.pop do
    sup_old = graph.select{|n, _| n == sup_old}
    graph -= sup_old
    sup_old.each {|_, ssup| sup.push(ssup) unless graph.any?{|_, n| n == ssup}}
  end
  !graph.empty?
end
The above algorithm tells whether a graph has a cycle:
cyclic?(example_graph) #=> true
but I want not only that but an example of a cycle like this:
#=> [[2, 3], [3, 6], [6, 2]]
If I were to output the variable graph in the above code at the end of examination, it will give:
#=> [[2, 3], [3, 4], [3, 5], [3, 6], [6, 2]]
which includes the cycle I want, but it also includes extra edges that are irrelevant to the cycle.
I asked the same question in the math stackexchange site, and got an answer. It turned out that Tarjan's algorithm is good for solving this problem. I implemented it in Ruby as follows:
module DirectedGraph; module_function
  ## Tarjan's algorithm
  def strongly_connected_components graph
    @index, @stack, @indice, @lowlink, @scc = 0, [], {}, {}, []
    @graph = graph
    @graph.flatten(1).uniq.each{|v| strong_connect(v) unless @indice[v]}
    @scc
  end
  def strong_connect v
    @indice[v] = @index
    @lowlink[v] = @index
    @index += 1
    @stack.push(v)
    @graph.each do |vv, w|
      next unless vv == v
      if !@indice[w]
        strong_connect(w)
        @lowlink[v] = [@lowlink[v], @lowlink[w]].min
      elsif @stack.include?(w)
        @lowlink[v] = [@lowlink[v], @indice[w]].min
      end
    end
    if @lowlink[v] == @indice[v]
      i = @stack.index(v)
      @scc.push(@stack[i..-1])
      @stack = @stack[0...i]
    end
  end
end
So if I apply it to the example above, I get a list of strongly connected components of the graph:
example_graph = [[1, 2], [2, 3], [3, 4], [3, 5], [3, 6], [6, 2]]
DirectedGraph.strongly_connected_components(example_graph)
#=> [[4], [5], [2, 3, 6], [1]]
By selecting those components that are longer than one, I get the cycles:
DirectedGraph.strongly_connected_components(example_graph)
.select{|a| a.length > 1}
#=> [[2, 3, 6]]
Further, if I select from the graph the edges whose two vertices both belong to a component, I get the edges that make up the cycles:
DirectedGraph.strongly_connected_components(example_graph)
.select{|a| a.length > 1}
.map{|a| example_graph.select{|v, w| a.include?(v) and a.include?(w)}}
#=> [[[2, 3], [3, 6], [6, 2]]]
A depth-first search in which you keep track of the visited vertices and each vertex's parent will give you the cycle. In a directed graph, if you see an edge to a previously visited vertex that is still on the current DFS path, you have detected a cycle involving your parent, yourself, and that vertex. A slight problem you may encounter is that, if the cycle has length > 3, you will only know three of the vertices involved and will have to do some investigation to find the rest of the vertices in the cycle.
For that investigation, you can search 'up' the tree, starting from the parent and following parent links until you reach the visited vertex; you should be able to find the whole cycle that way.
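The DFS idea can be sketched as follows, in Python rather than Ruby and under my own naming (find_cycle, the three state constants). One point worth stating for directed graphs: an edge to a vertex that is merely visited does not prove a cycle; the vertex has to be still on the current DFS path, which the sketch marks as GREY. It is recursive, so very deep graphs would need an iterative rewrite.

```python
def find_cycle(edges):
    """Return one directed cycle as a list of edges, or None if acyclic."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])

    WHITE, GREY, BLACK = 0, 1, 2        # unvisited, on current path, finished
    state = dict.fromkeys(graph, WHITE)
    parent = {}

    def visit(node):
        state[node] = GREY
        for succ in graph[node]:
            if state[succ] == GREY:     # back edge: succ is an ancestor of node
                cycle = [[node, succ]]
                while node != succ:     # walk parent links back up to succ
                    cycle.append([parent[node], node])
                    node = parent[node]
                cycle.reverse()
                return cycle
            if state[succ] == WHITE:
                parent[succ] = node
                found = visit(succ)
                if found:
                    return found
        state[node] = BLACK
        return None

    for start in graph:
        if state[start] == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None

example_graph = [[1, 2], [2, 3], [3, 4], [3, 5], [3, 6], [6, 2]]
print(find_cycle(example_graph))  # [[2, 3], [3, 6], [6, 2]]
```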