Visiting edges in a graph - algorithm

I am using the DFS algorithm and want to mark each edge as visited. One approach would be to look up the node and replace it with some sentinel, but that would be costly. If I keep an adjacency list that stores the visited state for each node, that would increase the lookup time, and a matrix would consume a lot of space. What is the best way to do this?

You just need to maintain a set of vertex pairs, e.g. in Java a HashSet<Pair<Vertex, Vertex>>, or in Python a set of 2-element tuples.
The visiting of an edge occurs as you're enumerating the descendants of a new vertex just discovered and adding them to the DFS stack. If you're using a recursive DFS, then it's as you're making a recursive call on each descendant. Here's the stack version:
dfs(graph)
  visitedVertices = empty set
  visitedEdges = empty set
  // Try all vertices as search roots
  for each vertex r in graph
    push r onto empty stack
    while notEmpty(stack)
      u = pop stack
      if u not in visitedVertices
        add u to visitedVertices
        foreach v such that u->v is in graph
          add (u,v) to visitedEdges // Visit the edge
          push v on stack
Having said that, I'm not sure why you want to do this. Correctly implemented DFS naturally traverses each edge exactly once. You can prove this to yourself by looking at the algorithm above. Visiting (u,v) is only possible if u has never been visited before.
Perhaps you have some other thread that's watching search progress or are actually adding other info to edges as you visit?
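The stack version above can be sketched in Python as follows. This is only a minimal sketch, assuming the graph is a directed graph stored as a dict that maps each vertex to an iterable of its successors; the name dfs_edges and the sample graph are made up for illustration.

```python
def dfs_edges(graph):
    """Iterative DFS over {vertex: iterable of successors}.

    Returns the set of visited edges as (u, v) tuples.
    """
    visited_vertices = set()
    visited_edges = set()
    for root in graph:                      # try every vertex as a search root
        stack = [root]
        while stack:
            u = stack.pop()
            if u in visited_vertices:
                continue
            visited_vertices.add(u)
            for v in graph.get(u, ()):
                visited_edges.add((u, v))   # visit the edge u -> v
                stack.append(v)
    return visited_edges


# Hypothetical sample graph, used only to exercise the function.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
edges = dfs_edges(graph)
```

Membership tests and insertions on a set of tuples take constant time on average, so the bookkeeping does not change the O(V + E) running time. For an undirected graph, storing frozenset((u, v)) instead of the tuple would make (u, v) and (v, u) count as the same edge.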

I had to solve the same problem.
In my code, the graph is represented as an adjacency list: graph[i] is the list of neighbors of vertex i.
I keep a set in which I store the visited edges. An edge is a tuple of size two.
Depth-first search:
def visit_and_print_node(node):
    """Node visitor"""
    print(str(node))

def dsf_edges(graph, node_visitor_function=visit_and_print_node, starting_node=0):
    """
    Performs depth-first search algorithm on the edges of the graph.
    Note, here we visit edges, so it is OK for us to visit the same node more than once.
    """
    visited_edges = set()
    # next node to be visited is the last inserted element of the stack.
    stack = []
    next_nodes = graph[starting_node]
    node_visitor_function(starting_node)
    for node in next_nodes:
        stack.append((starting_node, node))
    while len(stack) > 0:
        visited_edge = stack.pop()
        if visited_edge in visited_edges:
            continue
        visited_node = visited_edge[1]
        # visit node
        node_visitor_function(visited_node)
        visited_edges.add(visited_edge)
        next_nodes = graph[visited_node]
        for node in next_nodes:
            if (visited_node, node) not in visited_edges:
                stack.append((visited_node, node))
If you would like to see other examples, such as BFS on edges (use a queue instead of a stack) or on nodes (visiting each node once), check my git repository:
Other useful links:
https://github.com/williamfiset/Algorithms/blob/master/src/main/java/com/williamfiset/algorithms/graphtheory/EulerianPathDirectedEdgesAdjacencyList.java
https://www.youtube.com/watch?v=8MpoO2zA2l4

Related

Returning all shortest paths in lowest run time and complexity

This post is the result that constantly appears for this problem, but it doesn't provide an optimal solution.
Currently I am trying to return all shortest paths starting at from and ending at target using BFS, but I am running into a bottleneck with either my algorithm or the data structures I use.
Pseudocode:
// The graph is an adjacency list of type unordered_map<string, unordered_set<string>>
// deque with pair of (visited unordered_set, vector with current path)
deque q = [({from}, [from])];
while q:
    pair = q.dequeue()
    visited = pair.first
    path = pair.second
    foreach adjacent_node to path[-1] in the graph:
        if (adjacent_node == target):
            res.append(path + [adjacent_node])
        else if adjacent_node not in visited:
            newPath = path + [adjacent_node]
            visited.add(adjacent_node)
            q.push((visited, newPath))
Currently the bottleneck seems to be with the queue's pair of items. I'm unsure how to solve the problem without storing a visited set with every path, or without copying a new path into the queue.
First, you should know that the number of shortest paths can be huge, so returning them all may not be practical. Consider a graph with 2k+1 layers numbered from 1 to 2k+1, in which each layer is fully connected to the next layer, odd layers have only one node, and even layers have q nodes. Although this graph has only k(q+1)+1 nodes and 2kq edges, there are q^k different shortest paths in total between the first and the last layer, which can be too many for an ordinary computer to handle. However, if you're sure that the number of shortest paths is relatively small, I can suggest the following algorithm.
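To make the blow-up concrete, here is a small sketch that builds the layered graph described above and counts the shortest paths with a BFS, without enumerating them. The function names layered_graph and count_shortest_paths are assumptions made for this illustration only.

```python
from collections import deque


def layered_graph(k, q):
    """Build the (2k+1)-layer graph: odd layers have 1 node, even layers have q nodes."""
    layers, next_id = [], 0
    for layer in range(1, 2 * k + 2):
        width = 1 if layer % 2 == 1 else q
        layers.append(list(range(next_id, next_id + width)))
        next_id += width
    graph = {node: set() for layer in layers for node in layer}
    for lower, upper in zip(layers, layers[1:]):   # connect consecutive layers fully
        for u in lower:
            for v in upper:
                graph[u].add(v)
                graph[v].add(u)
    return graph, layers[0][0], layers[-1][0]


def count_shortest_paths(graph, source, target):
    """Count shortest source-target paths with one BFS pass."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:                      # first time v is reached
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:             # u is a shortest-path predecessor of v
                count[v] += count[u]
    return count.get(target, 0)


graph, first, last = layered_graph(k=3, q=4)
num_nodes = len(graph)
num_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
num_paths = count_shortest_paths(graph, first, last)
```

With k = 3 and q = 4 the graph has 16 nodes and 24 edges, yet there are already 4^3 = 64 shortest paths, and the count grows exponentially in k.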
The basic idea is to store a list back[x] for each node x, such that a neighbor v of x is in back[x] if and only if the shortest distance from from to x equals the shortest distance from from to v plus one. back[x] can be computed during the BFS. Then you can perform a depth-first search from target over the back lists to output all the shortest paths. Pseudocode (BTW, I noticed that your code is not correct):
queue q = [ from ]
dist = map<node, int>   # dist[from] = 0; a node is visited iff it is in dist
back = map<node, list<node>>
while q.not_empty():
    now = q.pop_front()
    if (now == target):
        continue
    foreach adjacent_node to now in the graph:
        if (adjacent_node not in dist):
            dist[adjacent_node] = dist[now] + 1
            back[adjacent_node] = [ now ]
            q.push(adjacent_node)
        else if (dist[adjacent_node] == dist[now] + 1):
            # now is another predecessor on a shortest path
            back[adjacent_node].push(now)
# Now collect all shortest paths
ret = []
current = []
def collect(x):
    current.push(x)
    if (x == from):
        ret.push(current.reversed())
    else:
        foreach v in back[x]:
            collect(v)
    current.pop()
collect(target)
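A runnable Python sketch of this idea follows. It assumes the graph is a dict mapping each node to an iterable of neighbors, and it tracks BFS distances so that only predecessors exactly one level closer to the source end up in back; the function name all_shortest_paths and the sample graph are hypothetical.

```python
from collections import deque


def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    dist = {source: 0}
    back = {source: []}            # back[x]: predecessors of x on shortest paths
    queue = deque([source])
    while queue:
        now = queue.popleft()
        if now == target:
            continue               # no need to expand past the target
        for nxt in graph.get(now, ()):
            if nxt not in dist:
                dist[nxt] = dist[now] + 1
                back[nxt] = [now]
                queue.append(nxt)
            elif dist[nxt] == dist[now] + 1:
                back[nxt].append(now)

    paths, current = [], []

    def collect(x):
        # Walk the back lists from target to source, then reverse the path.
        current.append(x)
        if x == source:
            paths.append(current[::-1])
        else:
            for v in back[x]:
                collect(v)
        current.pop()

    if target in dist:
        collect(target)
    return paths


# Hypothetical sample graph: two shortest paths s-a-t and s-b-t.
sample = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
found = sorted(all_shortest_paths(sample, "s", "t"))
```

The queue holds only nodes, so there is no per-path visited set and no path copying during the BFS; paths are materialized only in the final backtracking step, whose cost is proportional to the total size of the output.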
Sorry for my poor English. Feel free to point out my typos and mistakes.

Write an efficient algorithm to divide the tree into connected components with at most V/2 vertices

I'm referring to the link below to try and write an algorithm to find a vertex in a tree so that removing that vertex gives connected components with the size of each component being at most V/2 vertices.
https://math.stackexchange.com/questions/1742440/you-can-always-delete-a-vertex-from-a-tree-g-such-that-the-remaining-connected
I do understand the proof given in the accepted answer which uses arrows to find that vertex. I can't quite figure out how to write an algorithm for the same.
I will first explain the proof and then give the pseudocode, so that the pseudocode is easier to follow. The vertex that you are looking for is called a centroid, so basically we need to find a centroid of the tree.
First of all, note that such a vertex always exists, and that a tree has at most two of them (if there are two, they are adjacent).
Let the given tree be T with n vertices. Start from any vertex and treat it as the candidate. Check whether it satisfies the property. If it does, nothing more needs to be done. If it does not, move to the adjacent vertex that lies in the subtree containing more than n/2 vertices. Repeat the process until you find the answer.
Now the pseudocode. Here are the meanings of the variables used.
v_centroid stores the centroid
v[i] stores the list of all nodes that are connected to i
size[i] stores the size of the subtree of i
MAX_SIZE is the total number of vertices in the tree
v_centroid = any vertex
dfs(v_centroid, -1) // v_centroid is the assumed centroid; the second argument is the parent of the node being processed. For the initial call you can use -1 or any other suitable undefined value.
v_centroid = findCentroid(v_centroid, -1)
func dfs(int node, int parent)
    size[node] := 1
    for i in v[node]
        if(i not equals parent)
            dfs(i, node)
            size[node] = size[node] + size[i]
        end if
    end for
end func
func findCentroid(int node, int parent)
    for i in v[node]
        if(i not equals parent and size[i]>MAX_SIZE/2)
            return findCentroid(i, node)
        end if
    end for
    return node
end func
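The same two passes can be sketched in Python. This is a minimal sketch, assuming the tree is given as a dict mapping each vertex to a list of its neighbors; the name find_centroid is made up here, and the subtree sizes are computed iteratively to stay clear of Python's recursion limit on deep trees.

```python
def find_centroid(adj):
    """Return one centroid of the tree adj = {vertex: list of neighbours}."""
    n = len(adj)
    root = next(iter(adj))
    parent = {root: None}
    order = []                          # vertices in the order the DFS reaches them
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        for child in adj[node]:
            if child != parent[node]:
                parent[child] = node
                stack.append(child)

    size = {node: 1 for node in adj}
    for node in reversed(order):        # children are handled before their parent
        if parent[node] is not None:
            size[parent[node]] += size[node]

    node = root
    while True:
        # Step into the child subtree holding more than n/2 vertices, if any.
        heavy = next((c for c in adj[node]
                      if c != parent[node] and 2 * size[c] > n), None)
        if heavy is None:
            return node
        node = heavy


# Hypothetical sample trees: a path on five vertices and a star rooted at a leaf.
path_tree = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
star_tree = {1: [0], 0: [1, 2, 3], 2: [0], 3: [0]}
path_centroid = find_centroid(path_tree)
star_centroid = find_centroid(star_tree)
```

When the walk steps into a child whose subtree has more than n/2 vertices, everything on the parent's side has fewer than n/2 vertices, so the walk never needs to go back up and the whole procedure runs in O(V) time.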

Error in the algorithm of ordering nodes in the undirected graph

The idea is to construct a list of the nodes in an undirected graph, ordered by their degrees.
The graph is given in the form {node: (set of its neighbours) for node in the graph}.
The code raises a KeyError exception at the line "graph[neighbor].remove(node)". It seems like the node has already been deleted from the set, but I don't see where.
Can anyone please point out the mistake?
Edit: This list of nodes is used to simulate a targeted attack on the nodes of the graph in order of their degrees. So, after an attack on the node with the biggest degree, that node is removed from the graph, and the degrees of the remaining nodes should be recalculated accordingly.
def fast_targeted_order(graph):
    """returns an ordered list of the nodes in the graph in decreasing
    order of their degrees"""
    number_of_nodes = len(graph)
    # initialise a list of sets of every possible degree
    degree_sets = [set() for dummy_ind in range(number_of_nodes)]
    # group nodes in sets according to their degrees
    for node in graph:
        degree = len(graph[node])
        degree_sets[degree] |= {node}
    ordered_nodes = []
    # starting from the set of nodes with the maximal degree
    for degree in range(number_of_nodes - 1, -1, -1):
        # copy the set to avoid raising the exception "set size changed
        # during the execution"
        copied_degree_set = degree_sets[degree].copy()
        while degree_sets[degree]:
            for node in copied_degree_set:
                degree_sets[degree] -= {node}
                for neighbor in graph[node]:
                    neighbor_degree = len(graph[neighbor])
                    degree_sets[neighbor_degree] -= {neighbor}
                    degree_sets[neighbor_degree - 1] |= {neighbor}
                    graph[neighbor].remove(node)
                ordered_nodes.append(node)
                graph.pop(node)
    return ordered_nodes
My previous answer (now deleted) was incorrect. The issue was not in using a set, but in deleting items from a sequence while iterating over that same sequence.
The Python tutorial for version 3.1 clearly warns:
It is not safe to modify the sequence being iterated over in the loop
(this can only happen for mutable sequence types, such as lists). If
you need to modify the list you are iterating over (for example, to
duplicate selected items) you must iterate over a copy.
However, the tutorial for Python 3.5 (which I use) only advises:
If you need to modify the sequence you are iterating over while inside
the loop (for example to duplicate selected items), it is
recommended that you first make a copy.
It appears that this operation is still very unpredictable in Python 3.5, producing different results with the same input.
From my point of view, the wording in the previous version of the tutorial is preferable to the current one.
@PetarPetrovic and @jdehesa, thanks for the valuable advice.
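The effect the tutorial warns about is easy to reproduce. The sketch below uses made-up helper names (drain_unsafely, drain_safely) and is only meant to show the difference between mutating a container while iterating over it and iterating over a copy.

```python
def drain_unsafely(items):
    """Remove every element while iterating over the same list (buggy on purpose)."""
    for item in items:
        items.remove(item)      # shifts the remaining elements under the iterator
    return items


def drain_safely(items):
    """Iterate over a snapshot and mutate the original list."""
    for item in list(items):
        items.remove(item)
    return items


leftover = drain_unsafely([1, 2, 3, 4])   # every second element is skipped
emptied = drain_safely([1, 2, 3, 4])

# Sets are stricter: changing the size during iteration raises RuntimeError.
try:
    numbers = {1, 2, 3}
    for item in numbers:
        numbers.remove(item)
    set_error = None
except RuntimeError as exc:
    set_error = exc
```

With a list the loop silently skips elements, because removing an item shifts the rest to the left while the iterator's index keeps advancing. With a set (or dict) CPython detects the size change on the next iteration step and raises RuntimeError, so a copy is needed in both cases.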
Working solution:
from collections import OrderedDict

def fast_targeted_order(ugraph):
    """
    input: undirected graph in the form {node: set of node's neighbors}
    returns an ordered list of the nodes in V in decreasing order of their degrees
    """
    # copy_graph is a helper (not shown here) that returns a copy of the graph,
    # so that the caller's graph is not modified
    graph = copy_graph(ugraph)
    number_of_nodes = len(graph)
    degrees_dict = {degree: list() for degree in range(number_of_nodes)}
    for node in graph:
        degree = len(graph[node])
        degrees_dict[degree].append(node)
    ordered_degrees = OrderedDict(sorted(degrees_dict.items(),
                                         key = lambda key_value: key_value[0],
                                         reverse = True))
    ordered_nodes = []
    for degree, nodes in ordered_degrees.items():
        # iterate over a copy, because nodes is modified inside the loop
        nodes_copy = nodes[:]
        for node in nodes_copy:
            if node in nodes:
                for neighbor in graph[node]:
                    neighbor_degree = len(graph[neighbor])
                    ordered_degrees[neighbor_degree].remove(neighbor)
                    if neighbor_degree:
                        ordered_degrees[neighbor_degree - 1].append(neighbor)
                    graph[neighbor].remove(node)
                graph.pop(node)
                ordered_degrees[degree].remove(node)
                ordered_nodes.append(node)
    return ordered_nodes

Graph Traversal using DFS

I am learning graph traversal from The Algorithm Design Manual by Steven S. Skiena. In his book, he has provided the code for traversing the graph using dfs. Below is the code.
dfs(graph *g, int v)
{
    edgenode *p;            /* temporary pointer */
    int y;                  /* successor vertex */

    if (finished) return;   /* allow for search termination */

    discovered[v] = TRUE;
    time = time + 1;
    entry_time[v] = time;

    process_vertex_early(v);

    p = g->edges[v];
    while (p != NULL) {
        y = p->y;
        if (discovered[y] == FALSE) {
            parent[y] = v;
            process_edge(v,y);
            dfs(g,y);
        }
        else if ((!processed[y]) || (g->directed))
            process_edge(v,y);

        if (finished) return;

        p = p->next;
    }

    process_vertex_late(v);
    time = time + 1;
    exit_time[v] = time;
    processed[v] = TRUE;
}
In an undirected graph, it looks like the code above processes each edge twice (calling process_edge(v,y) once while traversing vertex v and once while processing vertex y). So I added the condition parent[v]!=y to else if ((!processed[y]) || (g->directed)), and now each edge is processed only once. However, I am not sure how to modify this code to work with parallel edges and self-loops. The code should process parallel edges and self-loops as well.
Short Answer:
Substitute your (parent[v]!=y) for (!processed[y]) instead of adding it to the condition.
Detailed Answer:
In my opinion there is a mistake in the implementation written in the book, which you discovered and fixed (except for parallel edges; more on that below). The implementation is supposed to be correct for both directed and undirected graphs, with the distinction between them recorded in the g->directed boolean property.
In the book, just before the implementation the author writes:
The other important property of a depth-first search is that it partitions the
edges of an undirected graph into exactly two classes: tree edges and back edges. The
tree edges discover new vertices, and are those encoded in the parent relation. Back
edges are those whose other endpoint is an ancestor of the vertex being expanded,
so they point back into the tree.
So the condition (!processed[y]) is supposed to handle undirected graphs (just as the condition (g->directed) handles directed graphs) by allowing the algorithm to process the edges that are back edges and preventing it from re-processing those that are tree edges (in the opposite direction). As you noticed, though, with this condition the tree edges are treated as back edges when they are read from the child's side, so you should just replace this condition with your suggested (parent[v]!=y).
The condition (!processed[y]) will ALWAYS be true for an undirected graph when the algorithm reads it as long as there are no parallel edges (further details why this is true - *). If there are parallel edges - those parallel edges that are read after the first "copy" of them will yield false and the edge will not be processed, when it should be. Your suggested condition, however, will distinguish between tree-edges and the rest (back-edges, parallel edges and self-loops) and allow the algorithm to process only those that are not tree-edges in the opposite direction.
As for self-loops, they should be fine with both the new and the old condition: they are edges with y==v. When the algorithm gets to them, y is discovered (because v is discovered before going through its edges), not processed (v is marked as processed only on the last line, after going through its edges), and it is not v's parent (v is not its own parent).
*Going through v's edges, the algorithm reads this condition for y that has been discovered (so it doesn't go into the first conditional block). As quoted above (in the book there is a semi-proof for that as well which I will include at the end of this footnote), p is either a tree-edge or a back-edge. As y is discovered, it cannot be a tree-edge from v to y. It can be a back edge to an ancestor which means the call is in a recursion call that started processing this ancestor at some point, and so the ancestor's call has yet to reach the final line, marking it as processed (so it is still marked as not processed) and it can be a tree-edge from y to v, in which case the same situation holds - and y is still marked as not processed.
The semi-proof for every edge being a tree-edge or a back-edge:
Why can’t an edge go to a brother or cousin node instead of an ancestor?
All nodes reachable from a given vertex v are expanded before we finish with the
traversal from v, so such topologies are impossible for undirected graphs.
You are correct.
Quoting the book's (2nd edition) errata:
(*) Page 171, line -2 -- The dfs code has a bug, where each tree edge
is processed twice in undirected graphs. The test needs to be
strengthed to be:
else if (((!processed[y]) && (parent[v]!=y)) || (g->directed))
As for cycles - see here
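To see the effect of the errata condition, here is a hedged Python port of the traversal. It is a sketch, not the book's code: the name dfs_process_edges, the adjacency-dict representation, and returning the processed edges as a list are all assumptions made for this illustration.

```python
def dfs_process_edges(adj, start, directed=False):
    """Recursive DFS using the errata test; returns the edges in processing order.

    adj maps each vertex to a list of neighbours (each undirected edge is
    listed in both directions).
    """
    discovered, processed = set(), set()
    parent = {start: None}
    edges = []

    def dfs(v):
        discovered.add(v)
        for y in adj[v]:
            if y not in discovered:
                parent[y] = v
                edges.append((v, y))            # tree edge
                dfs(y)
            elif (y not in processed and parent[v] != y) or directed:
                edges.append((v, y))            # back edge (or any edge if directed)
        processed.add(v)

    dfs(start)
    return edges


# Hypothetical sample: an undirected triangle on vertices 0, 1, 2.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
triangle_edges = dfs_process_edges(triangle, 0)
```

On the triangle each of the three edges is processed exactly once: two as tree edges and one as a back edge. Note that the parent[v] != y test also skips any parallel edge between a vertex and its parent, so handling multigraphs would need something extra, such as an identifier per edge.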

How to find all shortest paths

I have a graph and I want to find all shortest paths between two nodes. I've found a shortest path between two nodes using BFS. However, it gives me only one of the shortest paths, even if more than one exists.
How could I get all of them using BFS?
I've implemented my code from the well-known BFS pseudocode.
Also, I have an adjacency list vector which holds the adjacent vertices for every node.
You can easily do it by maintaining a list or vector of parents for each node.
If two or more nodes (say X, Y, Z) at the same distance from the starting node lead to another node M, make all of X, Y and Z parents of M.
When adding a parent to a node, you just have to check whether that parent is on the same level as the previous parents.
By level, I mean the distance from the starting point.
This way you can get all the shortest paths by tracing back the parent vectors.
Below is my C++ implementation.
I hope you know how to print the paths by starting from the destination, tracing the parents, and reaching the starting point.
EDIT: Pseudocode
bfs (start , end)
    enqueue(start)
    visited[start] = 1
    while queue is NOT empty
        currentNode = queue.front()
        dequeue()
        if(currentNode == end)
            break
        for each node adjacent to currentNode
            if node is unvisited
                visited[node] = visited[currentNode] + 1
                enqueue(node)
                parent[node].add(currentNode)
            else if(visited[node] == visited[currentNode] + 1) // currentNode is on the same level as node's other parents
                parent[node].add(currentNode)
    return
If the graph is large, finding all paths from start to end and then selecting the shortest ones can be very inefficient. Here is a better algorithm:
Using BFS, label each node with its distance from the start node. Stop when you get to the end node.
def bfs_label(start, end):
    depth = {start: 0}
    nodes = [start]
    while nodes:
        next_nodes = []
        for node in nodes:
            if node == end:
                return depth
            for neighbor in neighbors(node):
                if neighbor not in depth:
                    depth[neighbor] = depth[node] + 1
                    next_nodes.append(neighbor)
        nodes = next_nodes
    return depth  # end is not reachable from start
Using DFS, find all paths from the start node to the end node such that the depth strictly increases for each step of the path.
def shortest_paths(node, end, depth, path=None):
    if path is None:
        path = []
    path.append(node)
    if node == end:
        yield tuple(path)
    else:
        for neighbor in neighbors(node):
            if neighbor in depth and depth[neighbor] == depth[node]+1:
                for sp in shortest_paths(neighbor, end, depth, path):
                    yield sp
    path.pop()
A simpler way is to find all paths from source to destination using DFS, and then pick the shortest ones among these paths. Here is the pseudocode:
dfs(p,len)
    if(visited[p])
        return
    if(p== destination)
        paths.append(len)
        return
    visited[p]=1
    for each w adjacent to p
        dfs(w,len+1)
    visited[p]=0
You can find the path by maintaining an array for paths. I will leave that to you as an assignment
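One possible way to keep the paths themselves, rather than only their lengths, is sketched below. It assumes the graph is a dict mapping each node to an iterable of neighbors; the name shortest_paths_bruteforce and the sample graph are hypothetical. Since it enumerates every simple path, it is only practical for small graphs.

```python
def shortest_paths_bruteforce(graph, source, destination):
    """Enumerate all simple paths with DFS, then keep only the shortest ones."""
    paths = []
    visited = set()
    path = []

    def dfs(p):
        if p in visited:
            return
        path.append(p)
        if p == destination:
            paths.append(list(path))    # store a copy of the current path
        else:
            visited.add(p)
            for w in graph.get(p, ()):
                dfs(w)
            visited.discard(p)          # backtrack so p can appear on other paths
        path.pop()

    dfs(source)
    if not paths:
        return []
    best = min(len(p) for p in paths)
    return [p for p in paths if len(p) == best]


# Hypothetical sample graph: two paths of length 2 and one of length 3.
sample = {"s": ["a", "b", "c"], "a": ["t"], "b": ["t"],
          "c": ["d"], "d": ["t"], "t": []}
shortest = sorted(shortest_paths_bruteforce(sample, "s", "t"))
```

The number of simple paths can grow exponentially with the size of the graph, which is why the BFS-based approaches in the other answers are usually preferable.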
We can use a simple BFS algorithm for finding all the shortest paths. We can maintain the path along with the current node. I have provided the link to the python code for the same below.
https://gist.github.com/mridul111998/c24fbdb46492b57f7f17decd8802eac2

Resources