Dependence of complexity of graph algorithms on weight of edges?

This might be a silly question, but why doesn't the complexity depend on the weights of the edges present in the graph?

There are many different graph algorithms and in some cases the complexities do depend on the edge weights. For example, the Ford-Fulkerson max-flow algorithm has runtime O(mF), where F is the maximum possible flow, which depends on the maximum capacity of the edges. Other algorithms like Dijkstra's algorithm have runtimes that are independent of the edge lengths because it's assumed in the computational model that operations on those weights always take time O(1).
Generally speaking, algorithms whose runtimes depend on the weights/capacities/lengths of the edges gain that dependency by iterating a number of times that is determined by those values. If the algorithm only does arithmetic on the weights, there typically isn't a dependency, because arithmetic operations are usually assumed to take time O(1) unless there's a reason to believe otherwise.
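To make the Ford-Fulkerson point concrete, here is a minimal sketch (my own illustration, not code from the question), assuming integer capacities given as a dict of dicts `cap[u][v]`. Each augmenting path pushes at least one unit of flow, so the main loop runs at most F times, which is where the O(mF) bound comes from:

```python
# Minimal Ford-Fulkerson sketch. With integer capacities, every augmenting path
# adds at least one unit of flow, so the outer loop runs at most F times.
def ford_fulkerson(cap, source, sink):
    # residual capacities, including the reverse edges
    residual = {u: dict(cap[u]) for u in cap}
    for u in cap:
        for v in cap[u]:
            residual.setdefault(v, {}).setdefault(u, 0)

    def find_path(u, pushed, visited):
        """DFS for an augmenting path; returns the bottleneck pushed, or 0."""
        if u == sink:
            return pushed
        visited.add(u)
        for v, c in residual[u].items():
            if c > 0 and v not in visited:
                bottleneck = find_path(v, min(pushed, c), visited)
                if bottleneck > 0:
                    residual[u][v] -= bottleneck
                    residual[v][u] += bottleneck
                    return bottleneck
        return 0

    flow = 0
    while True:                        # at most F iterations with integer capacities
        pushed = find_path(source, float("inf"), set())
        if pushed == 0:
            return flow
        flow += pushed

# Example: two parallel unit-capacity paths from s to t -> max flow 2.
cap = {"s": {"a": 1, "b": 1}, "a": {"t": 1}, "b": {"t": 1}}
print(ford_fulkerson(cap, "s", "t"))   # 2
```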
Hope this helps!

Related

Bellman-Ford vs. Dijkstra graph density

I was testing the two algorithms, and Bellman-Ford performed better on sparse graphs. Looking at the big-O analysis of both, Bellman-Ford is O(VE) and Dijkstra's is O(E + V lg V), which I believe is correct. I did some research and found claims that
Dijkstra's is always faster and Bellman-Ford is only used when negative weight cycles are present.
Is that really the case?
TRUE.
Wikipedia: However, Dijkstra's algorithm greedily selects the minimum-weight node that has not yet been processed, and performs this relaxation process on all of its outgoing edges; in contrast, the Bellman–Ford algorithm simply relaxes all the edges, and does this |V| − 1 times, where |V| is the number of vertices in the graph. In each of these repetitions, the number of vertices with correctly calculated distances grows, from which it follows that eventually all vertices will have their correct distances. This method allows the Bellman–Ford algorithm to be applied to a wider class of inputs than Dijkstra.
Bellman-Ford relaxes the edges out of every vertex; Dijkstra only those out of the vertex with the best distance computed so far. As already noted, this improves the complexity of Dijkstra's approach, but it requires comparing all the vertices to find the one with the minimum distance value. Since this is not necessary in Bellman-Ford, it is easier to implement in a distributed environment. That's why it is used in Distance Vector routing protocols (e.g., RIP and IGRP), where mostly local information is available. To use Dijkstra in routing protocols, it is instead necessary to first distribute the entire topology, which is what happens in Link State protocols such as OSPF and IS-IS.
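For reference, here is a minimal sketch of the two relaxation strategies described in the quote (my own illustration), assuming the graph is a dict mapping every vertex, even sinks, to a list of (neighbour, weight) pairs:

```python
import heapq

def bellman_ford(graph, source):
    """Relax every edge |V| - 1 times; works with negative weights (no negative cycles)."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    for _ in range(len(graph) - 1):
        for u in graph:
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist

def dijkstra(graph, source):
    """Greedily settle the closest unprocessed vertex; requires non-negative weights."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                  # stale priority-queue entry
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```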
Asymptotically, for any graph where E ≥ V, the runtime of Dijkstra’s algorithm (O(E + V log V)) is smaller than that of Bellman-Ford (O(EV)). This means that, at least in theory, Dijkstra’s algorithm will outperform Bellman-Ford for large graphs of any density (assuming there are at least as many edges as nodes). That’s borne out in practice, with Dijkstra’s algorithm usually running much faster.
Bellman-Ford is used, as you’ve mentioned, in cases where there are negative edges, which Dijkstra can’t handle. But there are other cases where Bellman-Ford is useful, too. For example, in network routing, where each router needs to find shortest paths and there isn’t a central computer coordinating everything, the routers can run a distributed version of Bellman-Ford to find shortest paths between computers, because the computation only requires local updates to node distances. Dijkstra’s algorithm doesn’t work in this case.
There are also ways to improve the performance of Bellman-Ford in practice for many types of graphs. The Shortest Path Faster Algorithm (SPFA) is a relatively simple optimization of Bellman-Ford that, while retaining Bellman-Ford’s worst-case runtime, is often faster in practice. Dijkstra’s algorithm, IIRC, is usually still faster than SPFA, but this does close the gap in some circumstances.
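A rough sketch of SPFA under the same graph representation (and assuming no negative cycles; detecting them would need an extra relaxation counter): it only re-relaxes edges out of vertices whose distance actually changed, which is where the practical speed-up comes from.

```python
from collections import deque

def spfa(graph, source):
    """graph: dict of vertex -> list of (neighbour, weight) pairs, every vertex a key."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    queue = deque([source])
    in_queue = {v: False for v in graph}
    in_queue[source] = True
    while queue:
        u = queue.popleft()
        in_queue[u] = False
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if not in_queue[v]:   # only re-queue vertices whose distance improved
                    queue.append(v)
                    in_queue[v] = True
    return dist
```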

Dijkstra's algorithm vs relaxing edges in topologically sorted graph for DAG

I was reading Introduction to Algorithms, 3rd Edition. There are 3 methods given to solve the problem. My question is about two of them.
The one with no name
The algorithm starts by topologically sorting the dag (see Section 22.4) to impose a linear ordering on the vertices. If the dag contains a path from vertex u to vertex v, then u precedes v in the topological sort. We make just one pass over the vertices in the topologically sorted order. As we process each vertex, we relax each edge that leaves the vertex.
Dijkstra's Algorithm
This is quite well known.
As far as the book shows, the time complexity of the unnamed one is O(V + E), but Dijkstra's is O(E log V). We cannot use Dijkstra's with negative weights, but we can use the other one. What are the advantages of using Dijkstra's algorithm, other than that it can also be used on cyclic graphs?
Because the first algorithm you give only works on acyclic graphs, whereas Dijkstra runs on any graph with non-negative weights.
The limitations are not the same.
In the real world, many applications can be modelled as graphs with non-negative weights, which is why Dijkstra is used so much. Plus, it is very simple to implement. The complexity of Dijkstra is higher because it relies on a priority queue, but that does not mean it necessarily takes more time to execute. (n log(n) time is not that bad, because log(n) is a relatively small number: log(10^80) ≈ 266.)
However, this holds for sparse graphs (low density of edges). For dense graphs, other algorithms may be more efficient.
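To make the comparison concrete, here is a small sketch of the unnamed CLRS method from the question (topological sort, then one pass of relaxations), assuming the DAG is given as a dict mapping every vertex to a list of (neighbour, weight) pairs. It runs in O(V + E) and tolerates negative weights, but only on acyclic graphs:

```python
from collections import deque

def dag_shortest_paths(graph, source):
    """graph: dict of vertex -> list of (neighbour, weight) pairs, assumed acyclic."""
    # Kahn's algorithm for a topological order
    indegree = {v: 0 for v in graph}
    for u in graph:
        for v, _ in graph[u]:
            indegree[v] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # one pass of relaxations in topological order
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    for u in order:
        if dist[u] == float("inf"):
            continue                  # unreachable vertex, nothing to relax
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```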

Finding fully connected components?

I'm not sure if I'm using the right term here, but by "fully connected components" I mean there's an (undirected) edge between every pair of vertices in a component, and no additional vertices can be included without breaking this property.
There are a number of algorithms for finding strongly connected components in a graph (for example Tarjan's algorithm), but is there an algorithm for finding such "fully connected components"?
What you are looking for is a list of all the maximal cliques of the graph. It's also called the clique problem. No known polynomial time solution exists for a generic undirected graph.
Most versions of the clique problem are hard. The clique decision problem is NP-complete (one of Karp's 21 NP-complete problems). The problem of finding the maximum clique is both fixed-parameter intractable and hard to approximate. And, listing all maximal cliques may require exponential time as there exist graphs with exponentially many maximal cliques. Therefore, much of the theory about the clique problem is devoted to identifying special types of graph that admit more efficient algorithms, or to establishing the computational difficulty of the general problem in various models of computation.
-https://en.wikipedia.org/wiki/Clique_problem
I was also looking at the same question.
https://en.wikipedia.org/wiki/Bron-Kerbosch_algorithm This turns out to be an algorithm that lists them; however, it's not fast. If your graph is sparse, you may want to use the vertex-ordering version of the algorithm:
For sparse graphs, tighter bounds are possible. In particular the vertex-ordering version of the Bron–Kerbosch algorithm can be made to run in time O(dn·3^(d/3)), where d is the degeneracy of the graph, a measure of its sparseness. There exist d-degenerate graphs for which the total number of maximal cliques is (n − d)·3^(d/3), so this bound is close to tight.
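For illustration, here is a minimal sketch of the basic Bron–Kerbosch algorithm with pivoting (not the vertex-ordering variant quoted above), assuming the graph is a dict mapping each vertex to the set of its neighbours:

```python
def bron_kerbosch(graph):
    """Yield every maximal clique of an undirected graph as a set of vertices."""
    def expand(r, p, x):
        if not p and not x:
            yield set(r)              # r cannot be extended: it is a maximal clique
            return
        # choose a pivot with many neighbours in p to prune branches
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p.remove(v)
            x.add(v)

    yield from expand(set(), set(graph), set())

# Example: two triangles sharing an edge -> maximal cliques {1,2,3} and {2,3,4}.
g = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
print(list(bron_kerbosch(g)))         # e.g. [{1, 2, 3}, {2, 3, 4}]
```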

A deterministic algorithm for minimum cut of undirected graph?

Could someone name a few deterministic algorithms for the minimum cut of an undirected graph, along with their complexities, please?
(By the way, I learnt that there is an undirected version of the Ford-Fulkerson algorithm, obtained by adding an opposing parallel edge for each directed edge. Could someone tell me the time complexity of this one and maybe give me a bit more reference material to read?)
Thanks.
Solving the global minimum cut by computing multiple maximum flows is possible but suboptimal. Using the fastest known algorithms (Orlin's for sparse graphs and King-Rao-Tarjan for dense graphs), max-flow can be solved in O(mn). By fixing a source vertex and computing the max-flow to every other vertex, we get (by duality) the global min-cut in O(mn²).
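As a tiny illustration of that reduction (not of the Orlin or King-Rao-Tarjan algorithms themselves), here is a sketch using networkx's generic max-flow routine. As the question notes, each undirected edge is modelled as two opposing directed edges with the same capacity:

```python
import networkx as nx

def global_min_cut_via_maxflow(edges):
    """edges: iterable of (u, v, capacity) for an undirected graph."""
    D = nx.DiGraph()
    for u, v, c in edges:
        D.add_edge(u, v, capacity=c)  # model each undirected edge as two directed ones
        D.add_edge(v, u, capacity=c)
    nodes = list(D.nodes)
    s = nodes[0]                      # any fixed source works
    # by max-flow/min-cut duality, the smallest s-t max-flow is the global min cut
    return min(nx.maximum_flow_value(D, s, t) for t in nodes[1:])

# Example: a 4-cycle with unit capacities has global minimum cut 2.
print(global_min_cut_via_maxflow([(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]))  # 2
```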
There exist several algorithms specifically for global mincuts. For algorithms independent of graph structure, the most commonly used are
Nagamochi & Ibaraki, 1992, O(nm + n²log(n)). Does not use flows and gradually shrinks the graph.
Stoer & Wagner, 1997, also O(nm + n²log(n)). Easier to implement (a rough sketch is given below). It is implemented in BGL (the Boost Graph Library).
Hao & Orlin's algorithm can also run very fast in practice, especially when some of the known heuristics are applied.
There are many algorithms that exploit structural properties of input graphs. I'd suggest the recent algorithm of Brinkmeier, 2007 which runs in "O(n² max(log(n), min(m/n,δ/ε))), where ε is the minimal edge weight, and δ is the minimal weighted degree". In particular, when we ignore the weights, we get O(n² log(n)) for inputs with m in o(n log(n)) and O(nm) for denser graphs, meaning its time complexity is never worse than that of N-I or S-W regardless of input.
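For completeness, here is a rough sketch of Stoer & Wagner on a dense adjacency-matrix representation (my own illustration). Written this way it runs in O(n³); reaching the O(nm + n²log(n)) bound requires a priority queue in place of the linear scans:

```python
def stoer_wagner(w):
    """w: symmetric n x n weight matrix (0 = no edge); returns the global min cut value."""
    n = len(w)
    w = [row[:] for row in w]              # work on a copy, since rows get merged
    vertices = list(range(n))
    best = float("inf")
    while len(vertices) > 1:
        # minimum cut phase: maximum-adjacency ordering of the remaining vertices
        added = []
        weights = {v: 0 for v in vertices}
        for _ in range(len(vertices)):
            u = max((v for v in vertices if v not in added), key=lambda v: weights[v])
            added.append(u)
            for v in vertices:
                if v not in added:
                    weights[v] += w[u][v]
        s, t = added[-2], added[-1]
        best = min(best, weights[t])        # the cut of the phase separates t from the rest
        for v in vertices:                   # merge t into s and drop t
            w[s][v] += w[t][v]
            w[v][s] = w[s][v]
        vertices.remove(t)
    return best

# Example: a 4-cycle with unit weights has global minimum cut 2.
w = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
print(stoer_wagner(w))                      # 2
```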

Efficient minimal spanning tree in metric space

I have a large set of points (n > 10000 in number) in some metric space (e.g. equipped with Jaccard Distance). I want to connect them with a minimal spanning tree, using the metric as the weight on the edges.
Is there an algorithm that runs in less than O(n²) time?
If not, is there an algorithm that runs in less than O(n²) average time (possibly using randomization)?
If not, is there an algorithm that runs in less than O(n²) time and gives a good approximation of the minimum spanning tree?
If not, is there a reason why such an algorithm can't exist?
Thank you in advance!
Edit for the posters below:
Classical algorithms for finding a minimal spanning tree don't work here. They have an E factor in their running time, but in my case E = n² since I actually consider the complete graph. I also don't have enough memory to store all the >49995000 possible edges.
Apparently, according to this paper: Estimating the weight of metric minimum spanning trees in sublinear time, there is no deterministic o(n²) algorithm (note: little-oh, which is probably what you meant by "less than O(n²)"). That paper also gives a sublinear randomized algorithm for estimating the weight of the metric minimum spanning tree.
Also look at this paper: An optimal minimum spanning tree algorithm which gives an optimal algorithm. The paper also claims that the complexity of the optimal algorithm is not yet known!
The references in the first paper should be helpful and that paper is probably the most relevant to your question.
Hope that helps.
When I was looking at a very similar problem 3-4 years ago, I could not find an ideal solution in the literature.
The trick, I think, is to find a "small" subset of "likely good" edges, which you can then run plain old Kruskal on. In general, many MST edges can be found among the set of edges that join each vertex to its k nearest neighbours, for some small k. These edges might not span the graph, but when they don't, each component can be collapsed to a single vertex (chosen randomly) and the process repeated. (For better accuracy, instead of picking a single representative to become the new "supervertex", pick some small number r of representatives, and in the next round examine all r² distances between each pair of supervertices, choosing the minimum.)
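A rough sketch of that approach (my own illustration, with the k-nearest-neighbour step done by brute force, which is itself O(n²) per round; in practice you would plug in an approximate nearest-neighbour index instead):

```python
import heapq

def approx_metric_mst(points, dist, k=5):
    """Return a list of (i, j) index pairs forming a spanning tree (approximate MST)."""
    n = len(points)
    parent = list(range(n))                 # union-find over the original points

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    reps = list(range(n))                   # one representative per current component
    while len(tree) < n - 1:
        # candidate edges: each representative to its k nearest other representatives
        candidates = []
        for i in reps:
            nearest = heapq.nsmallest(k, (j for j in reps if j != i),
                                      key=lambda j: dist(points[i], points[j]))
            candidates.extend((dist(points[i], points[j]), i, j) for j in nearest)
        # plain old Kruskal on the candidate edges
        for d, i, j in sorted(candidates):
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                tree.append((i, j))
        # collapse: keep one representative per remaining component and repeat
        reps = list({find(i): i for i in reps}.values())
    return tree

# Example with a trivial metric: points on a line, distance = absolute difference.
pts = [0.0, 5.1, 1.2, 9.7, 3.3]
print(approx_metric_mst(pts, lambda a, b: abs(a - b), k=2))
```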
k-nearest-neighbour algorithms are quite well-studied for the case where objects can be represented as vectors in a finite-dimensional Euclidean space, so if you can find a way to map your objects down to that (e.g. with multidimensional scaling) then you may have luck there. In particular, mapping down to 2D allows you to compute a Voronoi diagram, and MST edges will always be between adjacent faces. But from what little I've read, this approach doesn't always produce good-quality results.
Otherwise, you may find clustering approaches useful: Clustering large datasets in arbitrary metric spaces is one of the few papers I found that explicitly deals with objects that are not necessarily finite-dimensional vectors in a Euclidean space, and which gives consideration to the possibility of computationally expensive distance functions.
