Need a graph algorithm similar to DFS

I'm curious whether there is a specific graph algorithm that traverses an unweighted directed acyclic graph by choosing a start node and then proceeding via DFS. If a node is encountered that has unsearched predecessors, it should backtrack along the incoming paths until all paths to the start node have been explored.
I found a Wikipedia category for graph algorithms, but there is a small sea of algorithms there and I'm not familiar with most of them.
EDIT: example:
given the graph {AB, EB, BC, BD}, traverse as: {A,B,E,B,C,D} or unique order as {A,B,E,C,D}.
Note that this algorithm, unlike BFS or DFS, does not need to begin again at a new start node if all paths of the first start node are exhausted.

In DFS, you usually choose the vertex to visit after u based on the edges starting at u. You want to choose based first on the edges ending at u. To do this, you could keep the transpose graph as well, and try to get the next vertex from there first.
It would be something like this:
procedure dfs(vertex u)
    mark u as visited
    for each edge (v, u) // found in transpose graph
        if v not visited
            dfs(v)
    for each edge (u, w)
        if w not visited
            dfs(w)
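A minimal Python sketch of the procedure above, assuming the graph is given as a list of (u, v) edge pairs. The function name dfs_predecessors_first and the succ/pred dictionaries are illustrative choices, not part of the original answer.

```python
from collections import defaultdict

def dfs_predecessors_first(edges, start):
    """Visit unvisited predecessors of a vertex before its successors."""
    succ = defaultdict(list)  # u -> [w] for each edge (u, w)
    pred = defaultdict(list)  # u -> [v] for each edge (v, u): the transpose graph
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    order, visited = [], set()

    def dfs(u):
        visited.add(u)
        order.append(u)
        for v in pred[u]:      # edges ending at u come first
            if v not in visited:
                dfs(v)
        for w in succ[u]:      # then edges starting at u
            if w not in visited:
                dfs(w)

    dfs(start)
    return order

# The graph from the question: {AB, EB, BC, BD}
order = dfs_predecessors_first(
    [("A", "B"), ("E", "B"), ("B", "C"), ("B", "D")], "A")
print(order)
```

On the question's example this yields the unique order A, B, E, C, D.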

What you are looking for is the topological sort. As far as I'm aware there's no easy way to traverse a graph in its topologically sorted order without any precomputation.
The standard way to get the topsort is to do a standard DFS and store the nodes in the order in which their DFS calls finish. Finally, reverse that list and voilà, you have the nodes in the order you desire.
Pseudocode:
list topsort
procedure dfs(vertex u)
    mark u as visited
    for each edge (u, v)
        if v not visited
            dfs(v)
    add u to the back of topsort
The list topsort will then contain the vertices in the reverse order that you want. Just reverse the elements of topsort to correct that.
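As a rough Python sketch of this approach, assuming the graph is a dict mapping each vertex to a list of its successors. The outer loop that starts a DFS from every unvisited vertex is an addition here, so that vertices not reachable from the first start vertex are also included.

```python
def topological_order(graph):
    """Return the vertices in reverse DFS finishing order."""
    visited, finished = set(), []

    def dfs(u):
        visited.add(u)
        for v in graph.get(u, ()):
            if v not in visited:
                dfs(v)
        finished.append(u)  # u is added only after all its successors are done

    for u in graph:
        if u not in visited:
            dfs(u)

    finished.reverse()
    return finished

# The graph from the question: {AB, EB, BC, BD}
graph = {"A": ["B"], "E": ["B"], "B": ["C", "D"]}
order = topological_order(graph)
print(order)
```

Every edge (u, v) ends up with u before v in the returned list. The recursion depth grows with the longest path, so very deep graphs would need an explicit stack instead.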

If you're looking for topological sort, you can also do this, given an adjacency list (or a list of edges (u,v) which you can preprocess in O(E) time):
list top_sort( graph in adjacency list )
    parent = new list
    queue = new queue
    visited = new list
    for each u in nodes
        parent(u) = number of parents of u
        if ( parent(u) is 0 ) // nothing points to node u
            queue.enqueue( u )
    while ( queue is not empty )
        u = queue.dequeue
        add u to visited
        for each edge ( u, v )
            decrement parent(v) // children all have one less parent
            if ( parent(v) is 0 )
                queue.enqueue( v )
    return visited
Given an adjacency list (or a list of edges (u,v)), this is O( V + E ), since each edge is touched twice, once to increment and once to decrement, in O(1) time each. Each vertex is also handled by the queue at most twice (one enqueue and one dequeue), which a standard queue does in O(1) each.
Note that this differs from the DFS (at least a straight-up implementation) in that it handles forests.
Another interesting note is that if you substitute queue with a priority_queue imposing some sort of structure/ordering, you can actually return the results sorted in some order.
For example, for a canonical class dependency graph (you can only take class X if you took class Y):
100:
101: 100
200: 100 101
201:
202: 201
you would probably get, as a result:
100, 201, 101, 202, 200
but if you change it so that you always want to take lower numbered classes first, you can easily change it to return:
100, 101, 200, 201, 202
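A small Python sketch of this variant, assuming the input is a dict that maps each class to the list of its prerequisites and that every prerequisite also appears as a key. The smallest_first flag is an illustrative parameter that switches between a FIFO queue and a heap-based priority queue.

```python
import heapq
from collections import deque

def top_sort(prereqs, smallest_first=False):
    """Queue-based topological sort; optionally always take the smallest ready node."""
    children = {c: [] for c in prereqs}
    parents = {c: len(ps) for c, ps in prereqs.items()}
    for c, ps in prereqs.items():
        for p in ps:
            children[p].append(c)

    zero = [c for c in prereqs if parents[c] == 0]
    ready = zero if smallest_first else deque(zero)
    if smallest_first:
        heapq.heapify(ready)

    order = []
    while ready:
        u = heapq.heappop(ready) if smallest_first else ready.popleft()
        order.append(u)
        for v in children[u]:
            parents[v] -= 1  # children all have one less parent
            if parents[v] == 0:
                if smallest_first:
                    heapq.heappush(ready, v)
                else:
                    ready.append(v)
    return order

prereqs = {100: [], 101: [100], 200: [100, 101], 201: [], 202: [201]}
print(top_sort(prereqs))
print(top_sort(prereqs, smallest_first=True))
```

With this input, the FIFO version and the heap version reproduce the two orderings listed above. The heap makes each queue operation O(log V) instead of O(1).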

Related

Shortest paths problem with two conditions

Let's say I have a directed graph G(V,E,w,c), where w is the positive weight of each edge and c is the cost of each edge, which is either 1 or 0. I need to find an algorithm that, for a given source vertex u, finds the shortest paths from u to every vertex in V that have cost ≤ k (where k ≥ 1).
I tried modifying the Bellman-Ford algorithm, but I can't seem to find the solution.
Let me restate my understanding of the problem.
For all vertices that you can reach with a cost of no more than k, you want the path of minimal weight that gets there from a vertex u.
You need a combination of ideas to get there.
Suppose that a RouteToNode object has the following attributes: cost, weight, node, lastRouteToNode and an autoincrementing id. This is a linked list carrying us back to the original node, letting us reconstruct the route. We compare them by cost, then weight, then id.
We have a hash/dictionary/whatever you want to call it that maps nodes to the lowest weight RouteToNode object reaching that node. Call it bestRoute.
We have a todo list that has RouteToNodes that we have not yet processed which is a priority queue that always returns the minimal RouteToNode. Note that it always returns them from lowest cost to highest.
We start with bestRoute having nothing in it, and a todo queue with only a single RouteToNode, namely:
{
    id: 0,
    cost: 0,
    weight: 0,
    node: u,
    lastRouteToNode: null
}
And now we execute the following pseudocode:
while todo is not empty:
    thisRouteToNode = todo.pop()
    if thisRouteToNode.node not in bestRoute or
            thisRouteToNode.weight < bestRoute[thisRouteToNode.node].weight:
        bestRoute[thisRouteToNode.node] = thisRouteToNode
        for edge adjacent to thisRouteToNode.node:
            construct nextRouteToNode by adding edge
            if nextRouteToNode.cost <= k:
                todo.push(nextRouteToNode)
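The pseudocode above could be sketched in Python roughly as follows. This assumes the graph is a dict mapping each node to a list of (neighbour, weight, cost) triples; the function name cheapest_routes and the tuple layout (cost, weight, id, node, previous route) are illustrative stand-ins for the RouteToNode object.

```python
import heapq
from itertools import count

def cheapest_routes(graph, source, k):
    """Minimal weight from source to each node over paths with total cost <= k."""
    ids = count()  # auto-incrementing id, used only to break ties in the heap
    todo = [(0, 0, next(ids), source, None)]
    best_route = {}
    while todo:
        route = heapq.heappop(todo)  # lowest cost first, then lowest weight
        cost, weight, _, node, _ = route
        if node not in best_route or weight < best_route[node][1]:
            best_route[node] = route
            for nxt, w, c in graph.get(node, ()):
                if cost + c <= k:
                    heapq.heappush(
                        todo, (cost + c, weight + w, next(ids), nxt, route))
    return {node: r[1] for node, r in best_route.items()}

# Hypothetical example: the direct edge u->b is heavy but free,
# the detour through a is light but costs 2.
graph = {"u": [("a", 1, 1), ("b", 10, 0)], "a": [("b", 1, 1)]}
print(cheapest_routes(graph, "u", 1))
print(cheapest_routes(graph, "u", 2))
```

The last element of each stored tuple links back to the previous route, so the actual path can be reconstructed by following it to the source. Nodes that cannot be reached within cost k are simply absent from the result.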

Linear-time algorithm for number of distinct paths from each vertex in a directed acyclic graph

I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time
analysis).
(Optimally, one should try to design an algorithm running in
O(n + m) time.)
So, far I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
TopologicalSort(G)
1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
1. Find all nodes with 0 in-degree (i.e. no incoming edges). This can be done in O(n + m), e.g. by looping through all edges and marking those nodes that are the end of any edge. The nodes with 0 in-degree are those which have not been marked.
2. Start a DFS from each node with 0 in-degree. After the DFS call for a node ends, we want to have computed for that node the information of its reachability. In order to achieve this, we need to add the reachability of the successors of this node. Some of these values might have already been computed (if the successor was already visited by DFS), therefore this is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) {
    visited[node] = true;
    reachability[node] = 1;
    for each successor of node {
        if (!visited[successor]) {
            DFS(successor);
        }
        reachability[node] += reachability[successor];
    }
}
After calling this for all nodes with 0 in-degree, the reachability
array will contain the reachability for all nodes in the graph.
The overall complexity is O(n + m).
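A short Python sketch of these steps, assuming the graph is a dict mapping every node with outgoing edges (and any isolated node) to its list of successors. As stated in the EDIT above, the result is only correct when the DAG is a polytree; the second example shows the overcounting on a diamond-shaped DAG.

```python
def reachability(graph):
    """Sum successor reachabilities with a DFS from every zero in-degree node."""
    visited, reach = set(), {}

    def dfs(node):
        visited.add(node)
        reach[node] = 1  # every node reaches itself
        for succ in graph.get(node, ()):
            if succ not in visited:
                dfs(succ)
            reach[node] += reach[succ]

    # Nodes that are the end of some edge have non-zero in-degree.
    has_incoming = {s for succs in graph.values() for s in succs}
    for node in graph:
        if node not in has_incoming:
            dfs(node)
    return reach

polytree = {"A": ["B", "C"], "B": ["D"]}
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
print(reachability(polytree))
print(reachability(diamond))
```

For the diamond, A is reported as reaching 5 nodes although only 4 exist, because D is counted once through B and once through C.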
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes that are connected to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if there is an edge A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].

Minimum Spanning tree different from another

Assume we are given
an undirected graph g where every node i, 1 <= i < n, is connected to all j, i < j <= n
and a source s.
We want to find the total costs (defined as the sum of all edges' weights) of the cheapest minimum spanning tree that differs from the minimum distance tree of s (i.e. from the MST obtained by running prim/dijkstra on s) by at least one edge.
What would be the best way to tackle this? Because currently, I can only think of some kind of fixed-point iteration
1. run dijkstra on (g,s) to obtain reference graph r that we need to differ from
2. costs := sum(edge_weights_of(r))
3. change := 0
4. for each vertex u in r, run a bfs and note for each reached vertex v the longest edge on the path from u to v.
5. iterate through all edges e = (a,b) in g: and find e'=(a',b') that is NOT in r and minimizes newchange := weight(e') - weight(longest_edge(a',b'))
6. if(first_time_here OR newchange < 0) then change += newchange
7. if(newchange < 0) goto 4
8. result := costs + change
That seems to waste a lot of time... It relies on the fact that adding an edge to a spanning tree creates a cycle from which we can remove the longest edge.
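The cycle observation can be sketched in Python as a single pass, without the iteration. This assumes a simple graph given as (a, b, weight) triples with positive weights and a reference spanning tree given as a subset of those triples; the name cheapest_single_swap is illustrative. It returns the cost of the cheapest spanning tree obtained by swapping in exactly one non-tree edge, which is the quantity asked for only under the assumption that one swap is enough.

```python
from collections import defaultdict, deque

def cheapest_single_swap(edges, tree_edges):
    """Cost of the cheapest tree that differs from tree_edges by one edge swap."""
    tree = defaultdict(list)
    for a, b, w in tree_edges:
        tree[a].append((b, w))
        tree[b].append((a, w))

    # heaviest[u][v] = weight of the heaviest edge on the tree path from u to v
    heaviest = {}
    for u in tree:
        seen = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for y, w in tree[x]:
                if y not in seen:
                    seen[y] = max(seen[x], w)
                    queue.append(y)
        heaviest[u] = seen

    in_tree = {frozenset((a, b)) for a, b, _ in tree_edges}
    base = sum(w for _, _, w in tree_edges)
    # Adding a non-tree edge closes a cycle; drop the heaviest tree edge on it.
    changes = [w - heaviest[a][b] for a, b, w in edges
               if frozenset((a, b)) not in in_tree]
    return base + min(changes) if changes else None

edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3)]
tree_edges = [(0, 1, 1), (1, 2, 2)]
print(cheapest_single_swap(edges, tree_edges))
```

The BFS from every vertex takes O(n^2) in total on a tree, and the scan over all edges takes O(m), so the sketch runs in O(n^2 + m).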
I also thought about using Kruskal to get an overall minimum spanning tree and only using the above algorithm to replace a single edge when the trees from both, prim and kruskal, happen to be the same, but that doesn't seem to work as the result would be highly dependent on the edges selected during a run of kruskal.
Any suggestions/hints?
You can do it using Prim's algorithm.
Prim's algorithm:
let T be a single vertex x
while (T has fewer than n vertices)
{
    1. find the smallest edge connecting T to G-T
    2. add it to T
}
Now let's modify it.
Suppose you already have one minimum spanning tree, say Tree(E,V).
Use this algorithm:
Prim's algorithm (modified):
let T be a single vertex
let isOther = false
while (T has fewer than n vertices)
{
    1. find the smallest edge (say e) connecting T to G-T
    2. if more than one such edge is found {
        check which one is in E(Tree)
        choose one different from it
        add it to T
        set isOther = true
    }
    else if only one edge is found {
        add it to T
        if E(Tree) doesn't contain this edge, set isOther = true
        else don't touch isOther (keep its value)
    }
}
If isOther = true, you have found another tree, T, that differs from Tree(E,V);
otherwise the graph has a single minimum spanning tree.

Graph algorithm to calculate node degree

I'm trying to implement the topological-sort algorithm for a DAG. (http://en.wikipedia.org/wiki/Topological_sorting)
The first step of this simple algorithm is finding the nodes with zero in-degree, and I cannot find any way to do this without a quadratic algorithm.
My graph implementation is a simple adjacency list, and the basic process is to loop through every node and, for every node, go through each adjacency list, so the complexity will be O(|V| * |V|).
The complexity of topological sort is O(|V| + |E|), so I think there must be a way to calculate the degree of all nodes in linear time.
You can maintain the indegree of all vertices while removing nodes from the graph and maintain a linked list of zero indegree nodes:
indeg[x] = indegree of node x (compute this by going through the adjacency lists)
zero = [ x in nodes | indeg[x] = 0 ]
result = []
while zero != []:
    x = zero.pop()
    result.push(x)
    for y in adj(x):
        indeg[y]--
        if indeg[y] = 0:
            zero.push(y)
That said, topological sort using DFS is conceptually much simpler, IMHO:
result = []
visited = {}
dfs(x):
    if x in visited: return
    visited.insert(x)
    for y in adj(x):
        dfs(y)
    result.push(x)
for x in V: dfs(x)
reverse(result)
You can achieve it in O(|V| + |E|). Follow the steps below:
Create two lists, inDegree and outDegree, which hold the counts of incoming and outgoing edges for each node, and initialize them to 0.
Now traverse the given adjacency list: for each edge (u,v) in graph g, increment the out-degree count of u and the in-degree count of v.
Traversing the adjacency list takes O(|V| + |E|), so you will have the in-degree and out-degree of every node in O(|V| + |E|).
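A minimal Python sketch of these steps, assuming the adjacency list is a dict in which every vertex appears as a key (with an empty list if it has no outgoing edges). The function name degrees is illustrative.

```python
def degrees(adj):
    """Compute in-degree and out-degree of every vertex in one pass over the edges."""
    indeg = {u: 0 for u in adj}
    outdeg = {u: 0 for u in adj}
    for u, neighbours in adj.items():
        for v in neighbours:  # each edge (u, v) is touched exactly once
            outdeg[u] += 1
            indeg[v] += 1
    return indeg, outdeg

adj = {"a": ["b", "c"], "b": ["c"], "c": []}
indeg, outdeg = degrees(adj)
zero_indegree = [u for u in adj if indeg[u] == 0]
print(indeg, outdeg, zero_indegree)
```

The outer loop runs once per vertex and the inner loop once per edge, which gives the O(|V| + |E|) bound.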
The complexity you mentioned for visiting the adjacent nodes, O(n^2), is not quite correct: if you think about it carefully, you will notice that this is more like a BFS. You visit each node and each edge only once, so the complexity is O(m + n), where n is the number of nodes and m is the number of edges.
You can also use DFS for topological sorting. You won't need additional pass to calculate in-degree after processing each node.
http://www.geeksforgeeks.org/topological-sorting/

Find a maximum tree subgraph with given number of edges that is a subgraph of a tree

The problem is as follows: you are given a graph which is a tree and the number of edges that you can use. Starting at v1, you choose edges that go out of any of the vertices that you have already visited.
An example:
In this example the optimal approach is:
for k==1 AC -> 5
for k==2 AB BH -> 11
for k==3 AC AB BH -> 16
At first I thought this was a problem of finding the maximum path of length k starting from A, which would be trivial, but the point is that you can always choose to go a different way, so that approach did not work.
What I have thought of so far:
Cut the tree at k, and brute force all the possibilities.
Calculate the cost of going to an edge for all edges.
The cost would include the sum of all edges before the edge we are trying to go to, divided by the number of edges you need to add in order to get to that edge.
From there pick the maximum, update the cost for all edges, and do it again until you have reached k.
The second approach seems good, but it reminds me a bit of the knapsack problem.
So my question is: is there a better approach for this? Is this problem NP?
EDIT: A counter example for the trimming answer:
This code illustrates a memoisation approach based on the subproblem of computing the max weight from a tree rooted at a certain node.
I think the complexity will be O(kE) where E is the number of edges in the graph (E=n-1 for a tree).
edges={}
edges['A']=('B',1),('C',5)
edges['B']=('G',3),('H',10)
edges['C']=('D',2),('E',1),('F',3)
cache={}
def max_weight_subgraph(node,k,used=0):
    """Compute the max weight from a subgraph rooted at node.
    Can use up to k edges.
    Not allowed to use the first used connections from the node."""
    if k==0:
        return 0
    key = node,k,used
    if key in cache:
        return cache[key]
    if node not in edges:
        return 0
    E=edges[node]
    best=0
    if used<len(E):
        child,weight = E[used]
        # Choose the amount r of edges to get from the subgraph at child
        for r in range(k):
            # We have k-1-r edges remaining to be used by the rest of the children
            best=max(best,weight+
                     max_weight_subgraph(node,k-1-r,used+1)+
                     max_weight_subgraph(child,r,0))
        # Also consider not using this child at all
        best=max(best,max_weight_subgraph(node,k,used+1))
    cache[key]=best
    return best
for k in range(1,4):
    print(k,max_weight_subgraph('A',k))

Resources