Necessary to MST edges in a graph - algorithm

Given a G(V,E) weighted(on edges) graph i need to find the number of edges that belong in every MST as well as the number of the edges that belong in at least one but not all and the ones that belong in none.
The graph is given as input in the following form(example):
3 3
1 2 1
1 3 1
2 3 2
First 3 is the number of nodes,second 3 is the number of edges.The following three lines are the edges where first number is where they start,second is where they end and third is the value.
I've thought about running kruskal once to find an MST and then for each edge belonging in G check in(linear?) time if it can be replace an edge in this MST without altering it's overall weight.If it can't it belongs in none.If it can it belongs in one but not all.I can also mark the edges in the first MST(maybe with a second value 1 or 0)and in the end check how many of them could not be replaced.These are the ones belonging in every possible MST. This algorithm is probably O(V^2) and i am not quite sure how to write it in C++.My questions are,could we somehow reduce it's complexity? If i add an edge to the MST how can i check(and implement in C++) if the cycle formed contains an edge that has less weight?

You can do this by adding some extra work to Kruskal's algorithm.
In Kruskal's algorithm, edges with the same weight can be in any order, and the order in which they are actually checked determines which MST you get out of all possible MSTs. For every MST, there is a sort-consistent order of the edges that will produce that tree.
No matter what sort-consistent order is used, the state of the union-find structure will be the same between weights.
If there is only one edge of a specific weight, then it is in every MST if Kruskal's algorithm selects it, because it would be selected with any sort-consistent edge order, and otherwise it is in no MSTs.
If there are multiple edges with the same weight, and Kruskal's algorithm would select at least one of them, then you can pause Kruskal's at that point and make a new (small) graph with just those edges that connect different sets, using the sets that they connect as vertices.
All of those edges are in at least one MST, because Kruskal's might pick them first. Any of those edges that are bridges in the new graph are in every MST, because Kruskal's would select them no matter what. See: https://en.wikipedia.org/wiki/Bridge_(graph_theory)
So, modify Kruskal's algorithm as follows:
When Kruskal's would select an edge, before doing the union, gather and process ALL the non-redundant edges with the same weight.
Make a small graph with those edges using the find() sets they connect as vertices
Use Tarjan's algorithm or equivalent (see the link) to find all the bridges in the graph. Those edges are in ALL MSTs.
The remaining edges in the small graph are in some, but not all, MSTs.
perform all the unions for edges in the small graph and move on to the next weight.
Since bridge-finding can be done in linear time, and every edge is in at most one small graph, the whole algorithm is still O(|V| + |E| log |E|).

Related

linear time algorithm for mst given |E|=99+|V|

I need to make a linear algorithm for finding minimal spanning tree given an undirected weighted and connected graph(with no isolated verities) the has |V| vertices and |V|+99 edges
I think the solution should be based on Kruskal and divide an concur but no luck till now, any ideas?
First remove all vertexes of degree 1, and add their adjacent edges to the MST until there are no more vertexes of degree 1.
Then consider each chain of degree-2 vertices. If a chain is a cycle, then add all but the most expensive edge to the MST.
Otherwise, the chain must run between two vertexes of degree >= 3. Replace the chain with a single edge. The cost of this edge will be the cost of the most expensive edge in the chain. That is the difference in cost between connecting the two ends via the chain instead of just connecting the vertexes to the ends.
Now you're left with vertexes of degree >=3. There must be less than 200 of those, and less than 300 edges remaining. Run Kruskal's to determine which edges go in the MST.

Algorithm for Finding Graph Connectivity

I'm tackling an interesting question in programming. It is this: we keep adding undirected edges to a graph, until the graph (or subgraph) is connected (i.e. we can use some path to get from each vertex to any other vertex in that subgraph). We stop as soon as the graph is connected.
For example if we have vertices 1,2,3 and 4 and we want the subgraph 1,2,3 to be connected.
Let's say we have edges (3,4), then (2,3), then (1,4), then (1,3). We only need to add in the first 3 edges for the subgraph to be connected, then we stop (edge 1,3 isn't needed).
Obviously I can run a BFS every time an edge is added to see if we can reach the required vertices, but if there are say m edges then we would potentially have to run BFS m times which seems too slow. Any better options? Thanks.
You should research the marvelous "Disjoint-set data structure" and the corresponding union - find algorithm. It can seem magical, but the worst case time and space complexity are tiny, O(α(n)) and O(n) respectively, where α is the inverse Ackerman function.
You can run just one time the BFS to find connected components. Then, each time you add an edge, if it is between vertices of two different components, you can merge them by a reference. So, the complexity of this algorithm is |V| + |E|.
Notice that the implementation of this method should be done by some reference techniques, especially to update the component number of the vertices.
I would normally do this using a disjoint set structure, as Doug suggests. It would be like Kruskal's algorithm for finding the minimum spanning tree, except you process edges in the given order.
If you don't need a spanning tree as output, though, then you can do this with an incremental BFS or DFS:
Pick any vertex, and find the vertices connected to it with BFS or DFS. Color these vertices red. If you start with no edges, of course, then there will be only one red vertex at this stage.
As you add edges, don't do anything else until you add an edge that connects a red vertex to a non-red vertex. Then run BFS or DFS, excluding the new edge, to find all the new vertices that will connect to the red set. Color them all red.
Stop when all vertices are red.
This is a little simpler in practice than using disjoint set, and takes O(|V|+|E|) time, since each vertex will be traversed by exactly one BFS/DFS search.
It does the work in chunks, though, so if you need each edge test to be fast individually, then disjoint set is better.

Linear algorithm to make edge weights unique

I have a weighted graph, and I want to compute a new weighting function for the graph, such that the edge weights are distinct, and every MST in the new graph corresponds to an MST in the old graph.
I can't find a feasible algorithm. I doubled all the weights, but that won't make them distinct. I also tried doubling the weights and adding different constants to edges with the same weights, but that doesn't feel right, either.
The new graph will have only 1 MST, since all edges are distinct.
Very simple: we multiply all weights by a factor K large enough to ensure that our small changes cannot affect the validity of an MST. I'll go overboard on this one:
K = max(<sum of all graph weights>,
<quantity of edges>)
+ 1
Number the N edges in any order, 0 through N-1. To each edge weight, add the edge number. It's trivial to show that
the edge weights are now unique (new difference between different weights is larger than the changes);
any MST in the new graph maps directly to a corresponding MST in the
old one (each new path sum is K times the old one, plus a quantity smaller than K -- the comparison (less or greater than) on any two paths cannot be affected).
Yes, this is overkill: you can restrict the value of K quite a bit. However, making it that large reduces the correctness proofs to lemmas a junior-high algebra student can follow.
We definitely cannot guarantee that all of the MSTs in the old graph are MSTs in the new graph; a counterexample is the complete graph on three vertices where all edges have equal weights. So I assume that you do not require the construction to give all MSTs, as that is not possible in the general case.
Can we always make it so that the new graph's MSTs are a subset of the old graph's? This would be easy if we could construct a graph without a MST. Of course, that doesn't make any sense and is impossible, since all graphs have at least one MST. Is it possible to change edge weights so that only one of the old graph's MSTs is an MST for the new graph? I propose that this is possible in general.
Determine some MST of the old graph.
Construct a new graph with the same edges and vertices, but with weights assigned as follows:
if the edge in the new graph belongs to the MST determined in step 1, give it a unique weight between 1 and n, the number of edges in the graph.
otherwise, give the edge in the new graph a unique weight greater than or equal to n^2, the square of the number of edges in the graph.
Without proof, it seems like this should guarantee that only the nominated MST from the old graph is an MST of the new graph, and therefore every MST in the new graph (there is just the one) is an MST in the old graph.
Now, one could ask whether we can do the deed with additional restrictions:
Can you do it if you only want to change the values of edges that are not unique in the old graph?
Can you do it if you want to keep relative weights the same for edges which were unique in the old graph?
One could even pose optimization problems:
What is the minimum number of edge weights that must be changed to guarantee it?
What is the weighting with minimum distance (according to some metric) from the old weighting that guarantees it?
What is the weighting that minimizes the average change while also guaranteeing it?
I am hesitant to attempt these, what I believe to be much more difficult, problems.

Why can't Prim's or Kruskal's algorithms be used on a directed graph?

Prim's and Kruskal's algorithms are used to find the minimum spanning tree of a graph that is connected and undirected. Why can't they be used on a graph that is directed?
It's a minor miracle that these algorithms work in the first place -- most greedy algorithms just crash and burn on some instances. Assuming that you want to use them to find a minimum spanning arborescence (directed paths from one vertex to all others), then one problematic graph for Kruskal looks like this.
5
--> a
/ / ^
s 1| |2
\ v /
--> b
3
We'll take the a->b arc of cost 1, then get stuck because we really wanted s->b of cost 3 and b->a of cost 2.
For Prim, this graph is problematic.
5
--> a
/ /
s 1|
\ v
--> b
3
We'll take s->b of cost 3, but we really wanted s->a of cost 5 and a->b of cost 1.
Prim's and Kruskal's algorithms output a minimum spanning tree for connected and "undirected" graphs. If the graph is not connected, we can tweak the algorithms to output minimum spanning forests.
In Prim's algorithm, we divide the graph in two sets of vertices. One set of the explored vertices which have already formed a MST (Set 1) and another set of unexplored vertices which will eventually join the first set to complete "spanning" (Set 2). At each instant, we select a minimum weighted edge in the cut joining the two disjoint sets. If there is no directed edge from explored nodes of the MST to the remaining unexplored nodes, the algorithm gets stuck even though there are edges from unexplored nodes to explored nodes in the MST.
In Kruskal's algorithm, the idea is to sort the edges in ascending order by their weight and pick them up in order and include them in the MST of explored nodes/edges if they do not already form a cycle with any explored nodes. This is done using the Union-Find data structure. But detection of cycles for directed graphs fails with this method. For example, a graph containing edges [1->2] [2->3] [1->3] will be reported to contain a cycle with the Union-Find method.
So Prim's fails because it assumes every node is reachable from every node which, though valid for undirected graphs, may not be true for digraphs. Kruskal's fails because of failure to detect cycles and sometimes it is essential to add edges making cycles to satisfy the "minimum" weighted property of MST.
Also, in case of digraphs, MST doesn't make complete sense. Its equivalent for digraphs is "minimum spanning arborescence" which will produce a tree where every vertex can be reached from a single vertex.

Find all critical edges of an MST

I have this question from Robert Sedgewick's book on algorithms.
Critical edges. An MST edge whose deletion from the graph would cause the
MST weight to increase is called a critical edge. Show how to find all critical edges in a
graph in time proportional to E log E. Note: This question assumes that edge weights
are not necessarily distinct (otherwise all edges in the MST are critical).
Please suggest an algorithm that solves this problem.
One approach I can think of does the job in time E.V.
My approach is to run the kruskal's algorithm.
But whenever we encounter an edge whose insertion in the MST creates a cycle and if that
cycle already contains an edge with the same edge weight, then, the edge already inserted will not be a critical edge (otherwise all other MST edges are critical edges).
Is this algorithm correct? How can I extend this algorithm to do the job in time E log E.
The condition you suggest for when an edge is critical is correct I think. But it's not necessary to actually find a cycle and test each of its edges.
The Kruskal algorithm adds edges in increasing weight order, so the sequence of edge additions can be broken into blocks of equal-weight edge additions. Within each equal-weight block, if there is more than one edge that joins the same two components, then all of these edges are non-critical, because any one of the other edges could be chosen instead. (I say they are all non-critical because we are not actually given a specific MST as part of the input -- if we were then this would identify a particular edge to call non-critical. The edge that Kruskal actually chooses is just an artefact of initial edge ordering or how sorting was implemented.)
But this is not quite sufficient: it might be that after adding all edges of weight 4 or less to the MST, we find that there are 3 weight-5 edges, connecting component pairs (1, 2), (2, 3) and (1, 3). Although no component pair is joined by more than 1 of these 3 edges, we only need (any) 2 of them -- using all 3 would create a cycle.
For each equal-weight block, having weight say w, what we actually need to do is (conceptually) create a new graph in which each component of the MST so far (i.e. using edges having weight < w) is a vertex, and there is an edge between 2 vertices whenever there is a weight-w edge between these components. (This may result in multi-edges.) We then run DFS on each component of this graph to find any cycles, and mark every edge belonging to such a cycle as non-critical. DFS takes O(nEdges) time, so the sum of the DFS times for each block (whose sizes sum to E) will be O(E).
Note that Kruskal's algorithm takes time O(Elog E), not O(E) as you seem to imply -- although people like Bernard Chazelle have gotten close to linear-time MST construction, TTBOMK no one has got there yet! :)
Yes, your algorithm is correct. We can prove that by comparing the execution of Kruskal's algorithm to a similar execution where the cost of some MST edge e is changed to infinity. Until the first execution considers e, both executions are identical. After e, the first execution has one fewer connected component than the second. This condition persists until an edge e' is considered that, in the second execution, joins the components that e would have. Since edge e is the only difference between the forests constructed so far, it must belong to the cycle created by e'. After e', the executions make identical decisions, and the difference in the forests is that the first execution has e, and the second, e'.
One way to implement this algorithm is using a dynamic tree, a data structure that represents a labelled forest. One configuration of this ADT supports the following methods in logarithmic time.
MakeVertex() - constructs and returns a fresh vertex.
Link(u, c, v) - vertices u and v must not be connected. Creates an unmarked edge from vertex u to vertex v with cost c.
Mark(u, v) - vertices u and v must be endpoints of an edge e. Marks e.
Connected(u, v) - indicates whether vertices u and v are connected.
FindMax(u, v) - vertices u and v must be connected. Returns the endpoints of an unmarked edge on the unique path from u to v with maximum cost, together with that cost. The endpoints of this edge are given in the order that they appear on the path.
I make no claim that this is a good algorithm in practice. Dynamic trees, like Swiss Army knives, are versatile but complicated and often not the best tool for the job. I encourage you to think about how to take advantage of the fact that we can wait until all of the edges are processed to figure out what the critical edges are.

Resources