Partitioning a graph into two clusters - algorithm

I have a complete weighted graph G(V, E). I want to partition V into two clusters such that maximum intra-cluster edge length gets minimized. What is the fastest algorithm that solves this problem? I believe this can be solved in O(n^2) time where |V|=n. One approach would be making the graph bipartite. I could not figure out the complete algorithm. Can anyone help me to figure out the complete algorithm?

Two-color (depth-first search, O(n) time) a maximum spanning forest (Prim's algorithm, O(n^2) time). Proof of correctness left as an exercise.
For the record, for sparser graphs with only m edges, I'm pretty sure there's an O(m)-time algorithm.
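A minimal Python sketch of the idea above, assuming the graph is given as a symmetric n x n weight matrix `w` (the function name and input format are illustrative, not from the answer). It runs the array-based O(n^2) version of Prim's algorithm, always picking the heaviest connecting edge, and gives each vertex the opposite color of its tree parent as soon as it joins the tree:

```python
def two_cluster_partition(w):
    """Return a 0/1 label per vertex for a complete graph with weight matrix w."""
    n = len(w)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("-inf")] * n  # heaviest edge linking each vertex to the tree
    parent = [-1] * n
    color = [0] * n
    best[0] = 0  # start growing the tree from vertex 0
    for _ in range(n):
        # Pick the non-tree vertex with the heaviest connecting edge.
        u = max((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] != -1:
            # Two-coloring: a vertex gets the opposite color of its tree parent.
            color[u] = 1 - color[parent[u]]
        for v in range(n):
            if not in_tree[v] and w[u][v] > best[v]:
                best[v] = w[u][v]
                parent[v] = u
    return color
```

For example, with four points on a line at positions 0, 1, 10, 11 and weights equal to their distances, the sketch puts the two left points in one cluster and the two right points in the other, so the largest intra-cluster edge is 1.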

Related

Can we use BFS from each vertex to find the graph's diameter? If so, is this the best solution?

So I found an old topic:
Algorithm for diameter of graph?
in which they said the best solution for a non-sparse graph is O(V^3).
But can't we just run BFS from each vertex and then take the maximum?
That way the time complexity would be O(V*(V+E)) = O(V^2 + VE).
Am I wrong? If the number of edges is only a constant multiple of V, this would work better, right?
So I guess my question is:
What is the best known time complexity for computing a graph's diameter as of 2018?
Is my method wrong? What am I missing here?
The graph in question is non-sparse, so in the worst case it has E ~ (V^2)/2 edges. The solution mentioned thus becomes O(V^2 + V*(V^2)) = O(V^3) for non-sparse graphs.
If the graph were sparse, it would indeed be faster than O(V^3).
Also, given that the graph is non-sparse, it is usually represented using an adjacency matrix, for faster lookup times. Breadth-first search would thus take O(V^2). Done across all nodes, as you mentioned, this again leads to O(V^3) time complexity.
Finding the diameter can be done by finding all-pairs shortest paths first and taking the maximum length found. The Floyd-Warshall algorithm does this in O(V^3) time. Johnson's algorithm can be implemented to achieve O(V^2 log V + VE) time.
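For reference, the BFS-from-every-vertex approach from the question can be sketched as follows. This assumes an unweighted, connected graph stored as an adjacency-list dictionary; the names `bfs_eccentricity` and `diameter` are illustrative. With adjacency lists the cost is O(V*(V+E)), which only beats O(V^3) when the graph is sparse:

```python
from collections import deque


def bfs_eccentricity(adj, source):
    """Largest BFS distance from source to any reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Diameter of an unweighted, connected graph: one BFS per vertex."""
    return max(bfs_eccentricity(adj, s) for s in adj)


# A path 0 - 1 - 2 - 3 has diameter 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(diameter(path))  # prints 3
```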

Building MST from a graph with "very few" edges in linear time

I was at an interview and interviewer asked me a question:
We have a graph G(V,E), and we can find an MST using Prim's or Kruskal's algorithm. But these algorithms do not take into account that there are "very few" edges in G. How can we use this information to improve the time complexity of finding the MST? Can we find the MST in linear time?
The only thing I could remember was that Kruskal's algorithm is faster on sparse graphs while Prim's algorithm is faster on really dense graphs. But I couldn't tell him how to use prior knowledge about the number of edges to build the MST in linear time.
Any insight or solution would be appreciated.
Kruskal's algorithm is pretty much linear after sorting the edges. If you use a union-find structure such as a disjoint-set forest, the complexity of processing a single edge will be on the order of lg*(n), where n is the number of vertices, and this function grows so slowly that for this case it can be considered constant. However, the problem is that sorting the edges still takes O(m * log(m)), where m is the number of edges.
Prim's algorithm will not be able to take advantage of the fact that the edges are very few.
One approach that you can use is something like a 'reversed' MST approach (reverse-delete), where you start off with all edges and repeatedly remove the heaviest edge whose removal does not disconnect the graph. You keep doing that until only n - 1 edges are left. Still, note that this will be better than Kruskal only if the number of edges to remove, k, is small enough that k * n < m * log(m).
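To make the cost split concrete, here is a small Kruskal sketch with a disjoint-set forest (union by rank plus path halving). The edge format `(weight, u, v)` and the function name are assumptions for illustration. The `sorted` call is the O(m log m) part; the loop after it is the near-linear part:

```python
def kruskal(n, edges):
    """Return the MST edges of a connected graph on vertices 0..n-1."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        # Path halving: point every other node on the path at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):  # sorting dominates the running time
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge would close a cycle
        if rank[ru] < rank[rv]:
            ru, rv = rv, ru
        parent[rv] = ru  # union by rank: attach the shallower tree
        if rank[ru] == rank[rv]:
            rank[ru] += 1
        tree.append((w, u, v))
    return tree
```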
Let's say |E| = |V| + c, c being a small constant. You can run DFS on the graph and, every time you detect a cycle, remove the heaviest edge on it. You must do that c + 1 times, for O((c+1) * |E|) = O(E), which is linear time in theory.
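A rough sketch of this cycle-removal idea, under a few assumptions: the graph is connected, edges are `(weight, u, v)` tuples, and all function names are made up for illustration. Instead of detecting a cycle during a DFS, this version builds a BFS tree on each pass and closes a cycle with any non-tree edge, which is a swapped-in but equivalent way to find one cycle in O(V + E):

```python
from collections import deque


def find_cycle(n, edges, alive):
    """Return the edge ids of one cycle among the alive edges."""
    adj = [[] for _ in range(n)]
    for e in alive:
        _, u, v = edges[e]
        adj[u].append((v, e))
        adj[v].append((u, e))
    parent = [None] * n  # (parent vertex, edge id) in the BFS tree
    depth = [-1] * n
    depth[0] = 0
    tree_edges = set()
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, e in adj[u]:
            if depth[v] == -1:
                depth[v] = depth[u] + 1
                parent[v] = (u, e)
                tree_edges.add(e)
                queue.append(v)
    # Any non-tree edge closes a cycle with the two tree paths to the LCA.
    extra = next(e for e in alive if e not in tree_edges)
    _, u, v = edges[extra]
    cycle = [extra]
    while u != v:
        if depth[u] < depth[v]:
            u, v = v, u
        u, e = parent[u]  # climb one step from the deeper endpoint
        cycle.append(e)
    return cycle


def mst_by_cycle_removal(n, edges):
    """Delete the heaviest edge of a cycle until only n - 1 edges remain."""
    alive = set(range(len(edges)))
    while len(alive) > n - 1:
        cycle = find_cycle(n, edges, alive)
        alive.remove(max(cycle, key=lambda e: edges[e][0]))
    return [edges[e] for e in sorted(alive)]
```

Each pass costs O(V + E) and removes exactly one edge, so with |E| = |V| + c the loop runs c + 1 times.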

Dijkstra's algorithm vs relaxing edges in topologically sorted graph for DAG

I was reading Introduction to Algorithms, 3rd Edition. It gives three methods for solving the single-source shortest-paths problem. My question is about two of them.
The one with no name
The algorithm starts by topologically sorting the dag (see Section 22.4) to impose a linear ordering on the vertices. If the dag contains a path from vertex u to vertex v, then u precedes v in the topological sort. We make just one pass over the vertices in the topologically sorted order. As we process each vertex, we relax each edge that leaves the vertex.
Dijkstra's Algorithm
This is quite well known
As far as the book shows, the time complexity of the unnamed one is O(V+E), but Dijkstra's is O(E log V). We cannot use Dijkstra's with negative weights, but we can use the other. What are the advantages of using Dijkstra's algorithm, other than that it can be used on cyclic graphs?
Because the first algorithm you give only works on acyclic graphs, whereas Dijkstra's runs on any graph with non-negative weights.
The limitations are not the same.
In the real world, many applications can be modelled as graphs with non-negative weights, which is why Dijkstra's is so widely used. Plus, it is very simple to implement. The complexity of Dijkstra's is higher because it relies on a priority queue, but this does not necessarily mean it takes more time to execute. (n log(n) time is not that bad, because log(n) is a relatively small number: log2(10^80) ≈ 266.)
However, this holds for sparse graphs (low density of edges). For dense graphs, other algorithms may be more efficient.
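For comparison, the unnamed DAG algorithm from the question can be sketched in a few lines. This version assumes edges are `(u, v, weight)` tuples on vertices 0..n-1 and uses Kahn's algorithm for the topological sort rather than the DFS-based sort the book describes; the function name is illustrative. Negative weights are fine because every vertex is finalized before its outgoing edges are relaxed:

```python
from collections import deque


def dag_shortest_paths(n, edges, source):
    """Single-source shortest paths in a DAG in O(V + E)."""
    adj = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Topological order via Kahn's algorithm.
    order = []
    queue = deque(v for v in range(n) if indegree[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("graph contains a cycle")

    # One pass over the vertices in topological order, relaxing outgoing edges.
    dist = [float("inf")] * n
    dist[source] = 0
    for u in order:
        if dist[u] == float("inf"):
            continue  # unreachable from source
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist


# The negative edge 2 -> 1 is handled correctly.
print(dag_shortest_paths(4, [(0, 1, 5), (0, 2, 2), (2, 1, -4), (1, 3, 1)], 0))
# prints [0, -2, 2, -1]
```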

How can a heap be used to optimize Prim's minimum spanning tree algorithm?

I have to solve a question that is something like this:
I am given a number N which represents the number of points I have. Each point has two coordinates: X and Y.
I can find the distance between two points with the following formula:
abs(x2-x1)+abs(y2-y1),
(x1,y1) being the coordinates of the first point, (x2,y2) the coordinates of the second point and abs() being the absolute value.
I have to find the minimum spanning tree, meaning I must have all my points connected with the sum of the edges being minimal. Prim's algorithm is good, but it is too slow. I read that I can make it faster using a heap, but I didn't find any article that explains how to do that.
Can anyone explain to me how Prim's algorithm works with a heap (some sample code would be good but not necessarily), please?
It is possible to solve this problem efficiently (in O(n log n) time), but it is not that easy. Just using Prim's algorithm with a heap does not help (it actually makes it even slower), because its time complexity is O(E log V), which is O(n^2 * log n) in this case.
However, you can use the Delaunay triangulation to reduce the number of edges in the graph. The Delaunay triangulation graph is planar, so it has a linear number of edges. That's why running Prim's algorithm with a heap on it gives O(n log n) time complexity (there are O(n) edges and n vertices). You can read more about it here (covering this algorithm in detail and proving its correctness would make my answer way too long): http://en.wikipedia.org/wiki/Euclidean_minimum_spanning_tree. Note that even though the article is about the Euclidean MST, the approach for your case is essentially the same (it is possible to build the Delaunay triangulation for Manhattan distance efficiently, too).
A description of the Prim's algorithm with a heap itself is already present in two other answers to your question.
From the Wikipedia article on Prim's algorithm:
[S]toring vertices instead of edges can improve it still further. The heap should order the vertices by the smallest edge-weight that connects them to any vertex in the partially constructed minimum spanning tree (MST) (or infinity if no such edge exists). Every time a vertex v is chosen and added to the MST, a decrease-key operation is performed on all vertices w outside the partial MST such that v is connected to w, setting the key to the minimum of its previous value and the edge cost of (v,w).
While it was pointed out that Prim's with a heap is O(E log V), which is O(n^2 log n) in the worst case, I can provide what makes the heap faster in cases other than that worst case, since that has still not been answered.
What makes Prim's so costly at O(V^2) is the necessary updating each iteration in the algorithm. In general, Prim's works by keeping a table of your vertices with the lowest length to other vertices and picking the cheapest vertex to add to your growing tree until all are added. Every time you add a vertex, you must then go back to your table and update any vertices that can now be accessed with less weight. You then must walk back all the way through your table to decide which vertex is cheapest to add. This setup - having to pick the next vertex (O(V)) V times - gives the O(V^2).
The heap is able to help this running time in all cases besides the worst case because it fixes this bottleneck. By working with a minimum heap, you can access the minimum weight in consideration in O(1). Additionally, it costs O(log V) to fix a heap after adding a number to it to maintain its properties, which is done E times for O(E log V) to maintain the heap for Prim's. This becomes the new bottleneck, which is what gives rise to the final running time of O(E log V).
So, depending on how much you know about your data, Prim's with a heap can certainly be more efficient than without!
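Since sample code was requested, here is a minimal sketch of Prim's algorithm with a binary heap using Python's `heapq`. It assumes a connected graph given as a dictionary mapping each vertex to a list of `(weight, neighbor)` pairs (the name `prim_heap` and this input format are illustrative). `heapq` has no decrease-key operation, so this is the "lazy" variant that pushes candidate edges and skips stale entries on pop, rather than the vertex-keyed version in the Wikipedia quote; it is O(E log E), which is the same as O(E log V):

```python
import heapq


def prim_heap(adj, start=0):
    """Return the total MST weight of a connected graph."""
    visited = set()
    total = 0
    heap = [(0, start)]  # (weight of the edge used to reach the vertex, vertex)
    while heap:
        w, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale entry: u was already reached by a cheaper edge
        visited.add(u)
        total += w
        for weight, v in adj[u]:
            if v not in visited:
                heapq.heappush(heap, (weight, v))
    return total


graph = {
    0: [(1, 1), (3, 2)],
    1: [(1, 0), (2, 2), (5, 3)],
    2: [(3, 0), (2, 1), (4, 3)],
    3: [(5, 1), (4, 2)],
}
print(prim_heap(graph))  # prints 7
```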

Some questions on MST

I am learning about minimum spanning trees right now, and I understand most of it, but there are still some things that I do not understand.
I am dealing with undirected weighted graphs.
First, I know that finding an MST costs O(E*log V). Now, I want to optimize it to linear time, O(V+E), when dealing with planar graphs.
Secondly, I saw an example of n points in the unit square, and I managed to show that an MST of weight O(sqrt n) exists. The problem is that I could not find an algorithm to find this MST.
Thanks all,
Or
Boruvka's algorithm runs in O(V) time on planar graphs. For details see
http://www.cs.princeton.edu/~wayne/kleinberg-tardos/pdf/04GreedyAlgorithmsII.pdf
Also, you can compute the Euclidean MST of n points in the plane in O(n log n) time by computing the MST of the edges in the Delaunay triangulation.
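As a rough illustration of Boruvka's algorithm, here is a plain union-find version. It is the general O(E log V) form, not the contraction-based variant that achieves linear time on planar graphs. The edge format `(weight, u, v)` and the function name are assumptions, and ties are broken by edge index so that no cycle can form:

```python
def boruvka(n, edges):
    """Return the total MST weight of a connected graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    components = n
    while components > 1:
        # For each component, find its cheapest outgoing edge.
        cheapest = {}
        for i, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                j = cheapest.get(r)
                if j is None or (w, i) < (edges[j][0], j):
                    cheapest[r] = i
        if not cheapest:
            break  # graph is disconnected; stop with a spanning forest
        # Add all selected edges, merging the components they join.
        for i in set(cheapest.values()):
            w, u, v = edges[i]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                total += w
                components -= 1
    return total
```

Each round at least halves the number of components, so there are at most about log2(V) rounds of O(E) work each.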

Resources