Special case for an MST algorithm in linear time

Let G = (V, E) be a weighted undirected connected graph, where all the
edge weights are distinct. Let T denote the minimum spanning tree.
Suppose that G has m ≤ n + 157 edges. For this special case, give an MST
algorithm that runs in O(n) time, beating Kruskal's and Prim's algorithms.
Any hints?

First verify that the graph is connected.
Then repeat until the graph is a tree (# edges = n − 1):
Find a cycle using DFS. There must be one, since # edges ≥ n.
Remove the heaviest edge in the cycle. It cannot be part of the MST (by the cycle property).
When done, you are left with the MST.
Each iteration takes O(n) time, since m ≤ n + 157 means the DFS touches O(n) edges, and there are at most (n + 157) − (n − 1) = 158 iterations, so the total is still O(n).
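A sketch of this approach in Python, assuming edges are given as (u, v, weight) triples over vertices 0..n-1 with distinct weights; the initial connectivity check is omitted for brevity (a single DFS/BFS would verify it in O(n) here):

```python
import sys
from collections import defaultdict

def find_cycle(n, edges, alive):
    """Return the edge indices of one cycle in the surviving graph, or None."""
    sys.setrecursionlimit(1 << 16)
    adj = defaultdict(list)
    for i in alive:
        u, v, _ = edges[i]
        adj[u].append((v, i))
        adj[v].append((u, i))
    state = [0] * n          # 0 = unseen, 1 = on the DFS path, 2 = finished
    parent = {}              # vertex -> (parent vertex, edge index used)

    def dfs(u, pe):
        state[u] = 1
        for v, i in adj[u]:
            if i == pe:                  # don't reuse the tree edge
                continue
            if state[v] == 1:            # back edge: reconstruct the cycle
                cycle, w = [i], u
                while w != v:
                    p, j = parent[w]
                    cycle.append(j)
                    w = p
                return cycle
            if state[v] == 0:
                parent[v] = (u, i)
                found = dfs(v, i)
                if found:
                    return found
        state[u] = 2
        return None

    for s in range(n):
        if state[s] == 0:
            found = dfs(s, -1)
            if found:
                return found
    return None

def mst_special_case(n, edges):
    """MST when m <= n + 157: repeatedly delete the heaviest edge of a cycle.
    Each cycle search is O(n) because m = O(n), and at most 158 edges are
    deleted, so the whole procedure runs in O(n)."""
    alive = set(range(len(edges)))
    while len(alive) > n - 1:
        cycle = find_cycle(n, edges, alive)
        alive.discard(max(cycle, key=lambda i: edges[i][2]))
    return sorted(alive)
```

Here find_cycle is recursive for clarity; an iterative version would avoid recursion limits on very large inputs.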


Time Complexity Analysis of BFS

I know that there are a ton of questions out there about the time complexity of BFS, which is O(V+E).
However, I still struggle to understand why the time complexity is O(V+E) and not O(V*E).
I know that O(V+E) stands for O(max[V,E]), and my only guess is that it has something to do with the density of the graph and not with the algorithm itself, unlike, say, Merge Sort, whose time complexity is always O(n log n).
Examples I've thought of are :
A directed graph with |E| = |V|−1, where the time complexity will be O(V)
A directed graph with |E| = |V|*(|V|−1), where the complexity would in fact be O(|E|) = O(|V|*|V|), as each vertex has an outgoing edge to every other vertex besides itself
Am I in the right direction? Any insight would be really helpful.
Your "examples of thought" illustrate that the complexity is not O(V*E), but O(E). True, E can be a large number in comparison with V, but it doesn't matter when you say the complexity is O(E).
When the graph is connected, you can always say it is O(E). The reason to include V in the time complexity is to cover graphs that have many more vertices than edges (and thus are disconnected): the BFS algorithm will not only have to visit all edges, but also all vertices, including those that have no edges at all, just to detect that they have none. And so we must say O(V+E).
The complexity comes off easily if you walk through the algorithm. Let Q be the FIFO queue where initially it contains the source node. BFS basically does the following
while Q not empty
    pop u from Q
    for each adjacent vertex v of u
        if v is not marked
            mark v
            push v into Q
Since each node is added at most once and removed at most once, the while loop runs O(V) times. Also, each time we pop u we perform |adj[u]| operations, where |adj[u]| is the number of adjacencies of u.
Therefore the total complexity is the sum of (1 + |adj[u]|) over all vertices, which is O(V+E), since the sum of adjacency-list sizes is O(E) (2E for an undirected graph and E for a directed one).
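The accounting above can be made concrete with an instrumented BFS (a sketch; the counter names are illustrative):

```python
from collections import deque

def bfs_with_counts(adj, source):
    """BFS over an adjacency-list graph, counting the work done to illustrate
    that the total is O(V + E). adj maps each vertex to its neighbor list."""
    marked = {source}
    q = deque([source])
    order, pops, edge_scans = [], 0, 0
    while q:
        u = q.popleft()          # every vertex is popped at most once: O(V)
        pops += 1
        order.append(u)
        for v in adj[u]:         # every adjacency entry is scanned once: O(E)
            edge_scans += 1
            if v not in marked:  # marking on push prevents duplicate queueing
                marked.add(v)
                q.append(v)
    return order, pops, edge_scans

# In an undirected graph each edge appears in two adjacency lists, so on a
# connected graph edge_scans == 2|E| and pops == |V|.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
order, pops, scans = bfs_with_counts(adj, 0)   # 4 vertices, 4 edges
```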
Consider a situation where you have a graph (maybe even with cycles): you start the search from the root and your target is the last leaf. In this case you will traverse all the edges before you reach your destination.
E.g.
0 - 1
1 - 2
0 - 2
0 - 3
In this scenario you may check all 4 edges before you actually find node #3, depending on the order in which the adjacency lists are stored.
It depends on how the adjacency list is implemented. A properly implemented adjacency list is a list/array of vertices with a list of related edges attached to each vertex entry.
The key is that the edge entries point directly to their corresponding vertex array/list entry; they never have to search through the vertex array/list for a matching entry, they can just look it up directly. This ensures that the total number of edge accesses is 2E and the total number of vertex accesses is V+2E, making the total time O(E+V).
In improperly implemented adjacency lists, the vertex array/list is not directly indexed, so to go from an edge entry to a vertex entry you have to search through the vertex list which is O(V), which means that the total time is O(E*V).

Describing an algorithm with at most O(nm log n) run time

If I had to give an algorithm in O(|V|^3) that takes as input a directed graph with positive edge lengths and returns the length of the shortest cycle in the graph (if the graph is acyclic, it should say so), I know that it will be:
Let G be a graph, and define a matrix D where D_ij stores the length of the shortest path from vertex i to vertex j (computable, for example, with Floyd–Warshall). For any pair of vertices u and v, the shortest paths u → v and v → u together form a closed walk, so the length of the shortest cycle through u and v is D_uv + D_vu. It is then enough to compute the minimum of D_uv + D_vu over all pairs of vertices u and v.
Could I write this in a way that makes it at most O(nm log n) (where n is the number of vertices and m is the number of edges) instead of O(|V|^3)?
Yes; in fact this problem can be solved in O(nm), according to a conference paper by Orlin and Sedeño-Noda (2017) titled "An O(nm) time algorithm for finding the min length directed cycle in a graph":
In this paper, we introduce an O(nm) time algorithm to determine the minimum length directed cycle (also called the "minimum weight directed cycle") in a directed network with n nodes and m arcs and with no negative length directed cycles.
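For the O(nm log n) bound asked about, one standard approach (not the paper's algorithm) is to run Dijkstra once from every vertex, then close each edge (u, v) with the shortest path from v back to u. A sketch, assuming edges are (u, v, weight) triples with positive weights:

```python
import heapq

def dijkstra(n, adj, s):
    """Binary-heap Dijkstra: O(m log n) with nonnegative edge lengths."""
    dist = [float("inf")] * n
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue            # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def shortest_cycle(n, edges):
    """Length of the shortest directed cycle, or None if the graph is acyclic.
    One Dijkstra per vertex gives O(nm log n) overall; every cycle contains
    some edge (u, v) and is closed by a shortest path from v back to u."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
    dist = [dijkstra(n, adj, s) for s in range(n)]
    best = min((w + dist[v][u] for u, v, w in edges), default=float("inf"))
    return None if best == float("inf") else best
```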

Proving optimality for a new algorithm that finds minimum spanning tree

Below is an algorithm that finds the minimum spanning tree:
MSTNew(G, w)
    Z ← empty set
    for each edge e in E, taken in random order do
        Z ← Z ∪ {e}
        if Z has a cycle c then
            let f be a maximum-weight edge on c
            Z ← Z − {f}
    return Z
Does this algorithm always return the optimal MST solution?
I would say yes. It sort of looks like Kruskal's algorithm in disguise.
Being fairly new to graph theory, I really don't have much of an idea other than that. Would someone have any ideas or advice?
Yes, IMO the algorithm outputs a Minimum Spanning Tree.
Informal Proof:
At every iteration, we remove only the most expensive edge on a cycle. Such an edge can never be included in an MST (by an exchange argument). Thus we only ever exclude edges that can never be part of the MST.
Also, the output of the algorithm is always a spanning tree: we delete an edge only when the newly added edge creates a cycle, and removing an edge of a cycle never disconnects the graph.
However, note that this algorithm will be highly inefficient since at each iteration you are not only checking for cycles (as in Kruskal's) but also searching for the maximum cost edge on the cycle.
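A sketch of this algorithm in Python, assuming a simple graph with edges given as (u, v, weight) triples. Since Z is always a forest before an edge is added, the only cycle the new edge can close is the tree path between its endpoints plus the edge itself:

```python
from collections import defaultdict

def mst_new(n, edges):
    """Sketch of MSTNew: add edges one by one; whenever a cycle appears,
    delete the maximum-weight edge on it. Returns sorted MST edge indices."""
    in_z = []                              # indices of edges currently in Z

    def tree_path(src, dst):
        """Edge indices on the forest path src..dst, or None if disconnected."""
        adj = defaultdict(list)
        for i in in_z:
            a, b, _ = edges[i]
            adj[a].append((b, i))
            adj[b].append((a, i))
        stack, seen = [(src, [])], {src}
        while stack:
            u, path = stack.pop()
            if u == dst:
                return path
            for v, i in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, path + [i]))
        return None

    for i, (u, v, w) in enumerate(edges):  # arbitrary ("random") edge order
        cycle = tree_path(u, v)
        in_z.append(i)
        if cycle is not None:              # adding (u, v) closed a cycle
            in_z.remove(max(cycle + [i], key=lambda j: edges[j][2]))
    return sorted(in_z)
```

Recomputing the path from scratch at every step is exactly the inefficiency the answer mentions; a link-cut tree would speed this up, but the sketch keeps the algorithm literal.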

Tree graph - find how many pairs of vertices for which the sum of edge weights on the path between them is C

I've got a weighted tree graph, where all the weights are positive. I need an algorithm to solve the following problem.
How many pairs of vertices are there in this graph, for which the sum of the weights of edges between them equals C?
I thought of a solution that's O(n^2):
For each vertex, we start a DFS from it and stop when the sum gets bigger than C. Since the number of edges is n−1, that obviously gives us an O(n^2) solution.
But can we do better?
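The O(n^2) approach described above can be sketched as follows (assuming the tree is given as an adjacency list of (neighbor, weight) pairs and C > 0):

```python
def count_pairs_with_sum(adj, C):
    """DFS from every vertex, pruning once the running sum exceeds C (valid
    because all weights are positive). O(n^2) on a tree; each unordered pair
    is found twice, once from each endpoint."""
    count = 0
    for s in range(len(adj)):
        stack = [(s, -1, 0)]               # (vertex, parent, path sum so far)
        while stack:
            u, parent, total = stack.pop()
            if total == C:
                count += 1
            for v, w in adj[u]:
                if v != parent and total + w <= C:
                    stack.append((v, u, total + w))
    return count // 2

# Path 0-1-2-3 with unit weights: the pairs at distance 2 are (0,2) and (1,3).
path = [[(1, 1)], [(0, 1), (2, 1)], [(1, 1), (3, 1)], [(2, 1)]]
```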
For an undirected graph, in terms of asymptotic complexity - no, you cannot do better if the pairs must be enumerated, since the number of such pairs can itself be O(n^2).
As an example, take a 'sun/flower' graph:
G = (V ∪ {x}, E)
E = { (x, v) | v ∈ V }
w(e) = 1 for all edges e
It is easy to see that the graph is indeed a tree.
However, the number of ordered pairs at distance exactly 2 is (n−1)(n−2), which is in Omega(n^2), and thus any algorithm that enumerates all of them will be Omega(n^2) in this case.
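This count can be checked directly for the star example (a small demonstration; vertex 0 plays the role of x):

```python
from itertools import permutations

def star_pairs_at_distance_two(n):
    """Star tree on n vertices: center 0, leaves 1..n-1, all weights 1.
    Count ordered pairs (u, v), u != v, whose path length is exactly 2."""
    def dist(u, v):
        return 1 if (u == 0 or v == 0) else 2   # leaf-leaf paths go via 0
    return sum(1 for u, v in permutations(range(n), 2) if dist(u, v) == 2)
```

star_pairs_at_distance_two(n) equals (n−1)(n−2), so merely listing the qualifying pairs already costs Omega(n^2).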

Breadth-first search algorithm (graph represented by the adjacency list) has a quadratic time complexity?

A friend told me that the breadth-first search algorithm (on a graph represented by an adjacency list) has quadratic time complexity. But all the sources say that the complexity of the BFS algorithm is exactly O(|V| + |E|), or O(n + m) - so where does the quadratic complexity come from?
All the sources are right :-) With BFS you visit each vertex and each edge exactly once, resulting in linear complexity. Now, if it's a complete graph, i.e. each pair of vertices is connected by an edge, then the number of edges grows quadratically with the number of vertices:
|E| = |V| * (|V|-1) / 2
Then one might say the complexity of BFS is quadratic in the number of vertices: O(|V|+|E|) = O(|V|^2)
BFS is O(E+V), so in terms of the size of the input it is a linear-time algorithm. But if complexity is measured in the number of vertices, the number of edges can be O(|V|^2) in dense graphs, and in that sense BFS can be considered quadratic in the number of vertices.
O(n + m) is linear in complexity and not quadratic. O(n*m) is quadratic.
0. Initially all the vertices are labelled as unvisited. We start from a given vertex as the current vertex.
1. BFS visits all the adjacent unvisited vertices of the current vertex, queueing up these children.
2. It then labels the current vertex as 'visited', so that it is not visited (queued) again.
3. BFS then takes the first vertex out of the queue and repeats from step 1 until no more unvisited vertices remain.
The runtime of the above algorithm is linear in the total number of vertices and edges together, because the algorithm visits each vertex once and checks each of its edges once; thus it takes (number of vertices + number of edges) steps to completely search the graph.
