Can we use the BFS algorithm to identify the vertex furthest from a starting vertex v in any graph, in terms of number of edges?
Yes. Let's define the distance from node A to node B as the number of edges on a shortest path from A to B.
BFS finds all the nodes at distance 1, then all nodes at distance 2, and so on. To find the furthest vertex, just retain the last node visited, because it is certain to have the longest distance among the nodes reachable from v.
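A minimal sketch of that idea in Python, assuming the graph is given as an adjacency dictionary (the name `farthest_vertex` and the sample graph are made up for illustration). It only considers vertices reachable from the start:

```python
from collections import deque

def farthest_vertex(adj, start):
    """Return (vertex, distance) for a vertex farthest from start, counting edges."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        v = queue.popleft()
        last = v  # BFS dequeues vertices in non-decreasing order of distance
        for u in adj.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return last, dist[last]

# Hypothetical example graph (undirected, stored in both directions)
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}
print(farthest_vertex(graph, "a"))  # ('e', 3)
```

If several vertices tie for the largest distance, this returns whichever of them BFS happens to dequeue last.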
We are given a directed graph G = (V, E) with positive and negative edge weights, but no cycles. Let s ∈ V be a given source vertex. How can we find an algorithm that computes the distance of every vertex from s and runs faster than Bellman-Ford's O(VE) time?
If the graph has no cycles, then you can just process the vertices in topological order.
For each vertex v, if it is reachable from s at distance d, then for every edge (v,u) with weight w, mark u as reachable at distance d+w. If u is already reachable at a lower distance, then leave it alone.
Because you process the graph in topological order, you know that when you process a vertex v, you will already have processed all its predecessors, so you will know the length of its shortest path from s. The first reachable vertex will, of course, be s.
It's pretty easy to combine this with Kahn's algorithm for topological sorting, on the subgraph of vertices reachable from s.
First do a BFS to find all the vertices reachable from s, and simultaneously count each vertex's incoming edges within this subset.
s will be the only vertex with count 0. It also has a known distance of 0 from s. Put it in a queue.
While there are vertices in the queue:
Remove a vertex v from the queue.
Adjust the distances to its neighbors.
Reduce the incoming edge counts of its neighbors. If any neighbor's count gets to 0, then put it in the queue.
When you're done, all reachable vertices will have been processed and all their distances will be known.
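The steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the graph is a dictionary mapping each vertex to a list of `(neighbor, weight)` pairs, it really is acyclic, and the name `dag_distances` and the sample data are invented here.

```python
from collections import deque

def dag_distances(edges, s):
    """Shortest distances from s in a DAG; edges maps v -> list of (u, w)."""
    # Pass 1: BFS from s, counting incoming edges inside the reachable subgraph.
    indeg = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for u, _w in edges.get(v, ()):
            if u not in indeg:
                indeg[u] = 0
                queue.append(u)
            indeg[u] += 1
    # Pass 2: Kahn's algorithm, relaxing outgoing edges as vertices are removed.
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for u, w in edges.get(v, ()):
            if u not in dist or dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
            indeg[u] -= 1
            if indeg[u] == 0:  # all predecessors of u are done
                queue.append(u)
    return dist

# Hypothetical DAG with a negative edge; "x" is not reachable from "s".
dag = {
    "s": [("a", 4), ("b", 2)],
    "b": [("a", -3), ("c", 5)],
    "a": [("c", 1)],
    "x": [("s", 1)],
}
print(dag_distances(dag, "s"))  # {'s': 0, 'a': -1, 'b': 2, 'c': 0}
```

Both passes touch each reachable vertex and edge once, so the running time is O(V + E).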
I have to give an algorithm as follows:
Given an undirected connected graph G, give an algorithm that finds two nodes x,y such that their distance is at least half the diameter of the Graph. Prove any claim.
I'm assuming I have to run a BFS from an arbitrary node and find its furthest node to find the diameter, then find two of the explored nodes whose distance is bigger than half the diameter.
But I doubt this is the optimal solution, or the one being asked for. Is there any other way, while running the BFS to find the diameter, to simultaneously find these two required nodes, so that the complexity remains polynomial?
Any guidance or hint would be appreciated!
The diameter (let's call it D) of a graph is the largest distance (= minimal number of hops) between any two of its nodes.
Choose any node and perform BFS, while retaining, for each node, the number of hops from your initial node. This takes O(V + E), since BFS visits every node exactly once and examines each edge a constant number of times. Note that this number of hops is also the shortest distance to v from the root - which I will refer to as d(root, v).
Now, take the leaf z that has the largest number of hops from your root. Congratulations, d(root, z) >= D/2, because
Lemma: for any node x in a connected graph of diameter D, there must exist a node y that is at least D/2 far away.
Proof: If this were not so, then there would be some node x such that d(x,y) < D/2 for all y. But then, by passing through x, any two nodes u and v could reach each other in at most d(u,x) + d(x,v) < D/2 + D/2 = D hops. Every pair of nodes would be closer than D, and therefore the graph's diameter could not be D.
That's actually the tricky one, but I think I got it. The interesting thing is that your partially wrong solution put me on the right track.
Let's just copy a few definitions here:
Distance between two vertices in a graph is the number of edges in a shortest path
The eccentricity of a vertex v is the greatest distance between v and any other vertex
The diameter d of a graph is the maximum eccentricity of any vertex in the graph. That is, d is the greatest distance between any pair of vertices
The real issue would be actually finding the diameter, which is not an easy task. To find the diameter you cannot just choose any node and run BFS - in that case you only find the node with the highest distance from that node (its eccentricity), which is not necessarily the diameter. To actually find the diameter you would have to run BFS (= find the eccentricity) from every single node, and the highest distance you get is the diameter (there are some better algorithms, but as I said, it's not a simple task).
However! You don't have to know the diameter at all. If you run BFS from a random node and find the node with the highest distance (eccentricity), that's the solution to your problem. x would be your starting node and y would be the node with the highest distance.
Why? Imagine a super simple graph like this
You can see that the diameter is between node 1 and node 4. So no matter which point you run the BFS from, that point is either in the middle (which means its furthest node is at half the diameter) or not in the middle, in which case the node with the highest distance must be even further away than half the diameter.
Even more complex graphs do not change this fact
If you choose 6 or 7, it's not exactly on the diameter path (because the highest distance is along 1-2-3-4-5), but it means that you get an even higher distance, which is fine for your task.
Result: Run the BFS from a random node. When it ends, take the node with the highest distance from the starting node (= find the eccentricity and remember the furthest node). The starting and "ending" nodes are (x,y).
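As a sanity check, here is a small Python sketch that picks (x, y) with one BFS and compares the result against the true diameter, computed the slow way by running BFS from every node. The graph below is a made-up example (a path 1-2-3-4-5 with a branch 3-6-7), not the one from the pictures, and the function names are illustrative.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop counts from start to every node of a connected graph."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def far_pair(adj, x):
    """Return (x, y, d) where y is a node furthest from x and d = d(x, y)."""
    dist = bfs_distances(adj, x)
    y = max(dist, key=dist.get)
    return x, y, dist[y]

def brute_force_diameter(adj):
    """Largest eccentricity over all nodes: one BFS per node."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)

# Hypothetical graph: path 1-2-3-4-5 plus a branch 3-6-7
adj = {1: [2], 2: [1, 3], 3: [2, 4, 6], 4: [3, 5], 5: [4], 6: [3, 7], 7: [6]}
D = brute_force_diameter(adj)
for start in adj:
    x, y, d = far_pair(adj, start)
    print(x, y, d, 2 * d >= D)
```

On this graph every starting node yields a pair at distance at least D/2, including node 3 in the middle, whose furthest node is exactly D/2 away.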
The diameter as in the largest minimum distance between any two points in the graph.
To solve this, would we just do BFS from any node, and then choose a node among the farthest nodes from the original node? Then do BFS from this new node, and the largest distance found there is the diameter of the graph?
Another post talks about weighted directed graphs. This is strictly for unweighted graphs. Although the same algorithm might work here, I am asking if we can do it more efficiently with the algorithm I proposed here.
networkx's diameter function computes exactly this:
import networkx as nx
G = nx.lollipop_graph(5, 5)
nx.diameter(G)
Output: 6
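For comparison, here is a sketch without networkx of the two-BFS idea from the question next to a one-BFS-per-node diameter. The helper names and the hand-built `lollipop` constructor are illustrative assumptions. One caveat worth stating: the two-BFS result is exact on trees, but on general graphs it is only guaranteed to be a lower bound on the diameter.

```python
from collections import deque

def furthest(adj, start):
    """Return (node, distance) for a node furthest from start."""
    dist = {start: 0}
    queue = deque([start])
    far = start
    while queue:
        v = queue.popleft()
        far = v  # the last node dequeued has the largest distance
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return far, dist[far]

def double_sweep(adj, start):
    """BFS from start, then BFS again from the furthest node found."""
    far, _ = furthest(adj, start)
    return furthest(adj, far)[1]

def exact_diameter(adj):
    """One BFS per node; O(V * (V + E))."""
    return max(furthest(adj, v)[1] for v in adj)

def lollipop(m, n):
    """Complete graph on nodes 0..m-1 joined to a path m..m+n-1 (hand-built)."""
    adj = {v: [u for u in range(m) if u != v] for v in range(m)}
    for v in range(m, m + n):
        adj[v] = [v - 1]
        adj[v - 1].append(v)
    return adj

G = lollipop(5, 5)
print(exact_diameter(G), double_sweep(G, 0))  # 6 6
```

On this particular graph both approaches agree with the value 6 shown above.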
I have a directed graph where each node has a score. Starting from a node, I need to find the highest score that can be achieved by following a path. Not all nodes can be final nodes. Also it is possible to revisit a node, but only the first visit counts for the score. How can I compute the highest achievable score?
First, find the strongly connected components of the graph. Then build the condensation of the graph.
Each vertex of the condensation gets a score equal to the sum of the scores of the initial-graph vertices it contains.
Blue numbers show the score of each vertex in initial graph. Yellow - in graph condensation.
Also mark some of the vertices of the condensation as terminal if they contain a final node. You will also have a mapping of each graph vertex to a vertex in condensation.
The notion of connected component is important because if you find yourself in one vertex of a component you may easily visit all the other vertices of the component to maximise the score. You are free to revisit each vertex any number of times.
The condensation itself is a directed acyclic graph. You can now traverse the condensation graph with depth first search, maintaining the function
F(v) = 0, if v does not have a reachable terminal vertex (bottom-right vertex on the picture below)
F(v) = max over children c of v of F(c), plus score(v), otherwise
Red circles show what vertices in initial graph and condensation considered terminal.
Numbers in green show what F-value each vertex in condensation graph has.
The answer to your problem is the F-value of the condensation vertex that corresponds to the starting vertex in the initial graph. The overall time complexity is O(N + M), where N is the number of vertices and M the number of edges in the initial graph.
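A compact Python sketch of this approach, under a few assumptions: the graph is an adjacency dictionary, scores are non-negative, and Tarjan's algorithm (recursive here, so only suitable for small graphs) is used to find the components. It is a slight variant of the F above: components with no reachable terminal get `None` instead of 0, so they can never be chosen by accident. All names and the sample data are made up.

```python
def best_score(adj, score, finals, start):
    """Highest score of a walk from start that ends in a node of finals, or None."""
    index, low, comp = {}, {}, {}
    stack, on_stack = [], set()
    n_comps = 0

    def visit(v):  # Tarjan's SCC algorithm, restricted to nodes reachable from start
        nonlocal n_comps
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for u in adj.get(v, ()):
            if u not in index:
                visit(u)
                low[v] = min(low[v], low[u])
            elif u in on_stack:
                low[v] = min(low[v], index[u])
        if low[v] == index[v]:  # v is the root of a component
            while True:
                u = stack.pop()
                on_stack.discard(u)
                comp[u] = n_comps
                if u == v:
                    break
            n_comps += 1

    visit(start)

    # Build the condensation: summed scores, terminal flags, child components.
    comp_score = [0] * n_comps
    terminal = [False] * n_comps
    children = [set() for _ in range(n_comps)]
    for v, c in comp.items():
        comp_score[c] += score[v]
        terminal[c] = terminal[c] or v in finals
        children[c].update(comp[u] for u in adj.get(v, ()) if comp[u] != c)

    # Tarjan numbers components in reverse topological order, so every child
    # of component c has a smaller number and is already computed.
    F = [None] * n_comps
    for c in range(n_comps):
        options = [F[d] for d in children[c] if F[d] is not None]
        if terminal[c]:
            options.append(0)  # the walk may stop inside this component
        if options:
            F[c] = comp_score[c] + max(options)
    return F[comp[start]]

# Hypothetical example: a <-> b form a cycle; only c may be a final node.
adj = {"a": ["b"], "b": ["a", "c", "d"], "c": ["e"]}
score = {"a": 1, "b": 2, "c": 5, "d": 10, "e": 3}
print(best_score(adj, score, {"c"}, "a"))  # 8
```

Here the high-scoring node d is ignored because no final node can be reached after visiting it, so the best walk collects a, b and c.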
I am learning about minimum spanning trees. I went through Prim's algorithm for a weighted directed graph.
The algorithm is simple:
You have two sets of vertices, visited and non-visited.
Set the distance of all vertices to infinity.
Start with any vertex in the non-visited set and explore its edges.
For each edge, update the distance of the destination vertex with the weight of the edge if the destination vertex is not visited and the weight of the edge is less than the current distance of the destination vertex.
Pick the non-visited vertex with the smallest distance and repeat until all vertices are visited.
I believe that with the above algorithm I will be able to find the spanning tree having minimum cost among all spanning trees, i.e. the minimum spanning tree.
But I applied it to the following example and I think it failed.
Consider following example
Vertices are {v1,v2,v3,v4,v5} and edges with weight (x,y) : w =>
(v1,v2) : 8
(v1,v3) : 15
(v1,v4) : 7
(v2,v5) : 4
(v4,v5) : 7
First I explore v1. It has edges to v2, v3 and v4, so the graph becomes
Vertex v1 is visited and (vertex, distance) =>
(v2,8)
(v3,15)
(v4,7)
Now v4 has the least distance, i.e. 7, so I explore v4. It has an edge to v5, so the following modification occurs
Vertex v4 is visited and (vertex, distance) => (v5,7)
Now v5 has the least distance among all vertices, i.e. 7, so I explore v5. It does not have any outgoing edge, so I just mark it visited
Vertex v5 is visited
Now, the confusion starts here
The vertex with the least distance is now v2. It has an edge to v5 with weight 4, while v5 currently has distance 7, assigned earlier by the edge (v4,v5) : 7. I believe that, to build the minimum spanning tree, the distance for v5 should be updated from 7 to 4, since 4 < 7. But it will not be, because v5 has already been visited and Prim's algorithm does not update the distance of a vertex that has already been visited. So the distance for v5 stays 7 instead of 4, and this tree will not have minimum cost.
Did I get it right? Or am I making a mistake?
Thanks
First I should mention that Prim's algorithm is only applicable to undirected graphs. So, if we treat the graph as undirected, this is the step-by-step progress of the algorithm on your example:
You should also consider that in a directed graph a minimum spanning tree often does not exist at all. The closest notion to an MST for directed graphs is the minimum cost arborescence.
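To make the undirected case concrete, here is a small heap-based sketch of Prim's algorithm in Python run on the edges from the question, treated as undirected. The function name `prim` and the `(u, v, weight)` edge format are choices made for this illustration.

```python
import heapq

def prim(edges, start):
    """Prim's algorithm on an undirected graph given as (u, v, weight) triples.

    Returns (total_weight, tree_edges). Assumes the graph is connected.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((w, v))
        adj.setdefault(v, []).append((w, u))  # undirected: store both directions
    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)
    total, tree = 0, []
    while heap:
        w, u, v = heapq.heappop(heap)  # cheapest edge leaving the visited set
        if v in visited:
            continue
        visited.add(v)
        total += w
        tree.append((u, v, w))
        for w2, x in adj[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return total, tree

edges = [
    ("v1", "v2", 8),
    ("v1", "v3", 15),
    ("v1", "v4", 7),
    ("v2", "v5", 4),
    ("v4", "v5", 7),
]
total, tree = prim(edges, "v1")
print(total)  # 33
print(tree)   # [('v1', 'v4', 7), ('v4', 'v5', 7), ('v5', 'v2', 4), ('v1', 'v3', 15)]
```

Because the edge between v2 and v5 can be used in either direction, v2 joins the tree through v5 at cost 4 rather than through v1 at cost 8, so the weight-4 edge is not lost even though v5 was visited first.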