Analysis of Algorithms Notation Confirmation - algorithm

I'll go ahead and mention that this is homework, but I'm not seeking typical homework help. I'm just wanting confirmation on wording of the question. The question states that my algorithm should be linear in the number of vertices in the graph. I've never seen that wording, is that just saying my running time should be O(|V|)? If so I think I have my solution.

In analysis of algorithms, algorithms are categorized by efficiency as a function of their input size.
O(|V|) means that your algorithm must examine, or 'touch', every vertex in your graph. So yes, linear in the number of vertices means O(|V|).
For reference, in Big O, Ɵ, or Ω; the two vertical bars mean number of. They are also used to denote length of in some proofs.

Related

I have found on the Internet polynomial time algorithm for graph coloring, possibly proving P=NP

I was searching for graph coloring algorithms, and I have found algorithm, which, how author states, runs in polynomial time.
Author gives also C++ program source code and demonstration program.
The suspicious thing is that decision problem whether graph is k-colorable, is NP-complete, so no polynomial time algorithm should exist until P=NP.
However, author doesn't claims, that algorithm works for all graphs, he only says, that he haven't found any graph, for which algorithm doesn't work.
So, the question: does that algorithm really works for every graph and that means actually P=NP, or there exist certain graphs/graph classes for which it doesn't work? Or maybe there is simply a mistake in complexity calculation?
I think you haven't read the abstract very carefully.
The author presents an algorithm which finds m-colorings of a graph, for some m less than the limit imposed by Brooks' theorem: https://en.wikipedia.org/wiki/Brooks'_theorem
(which is old and states that chi < delta + 1 as the author states in second sentence.)
The author is aware of the P vs NP question. The paper does not claim to resolve the question, he merely states:
For all known examples of graphs, the algorithm finds a proper m-coloring of the vertices of the graph G for m equal to the chromatic number χ(G)
Then he asks,
In view of the importance of the P versus NP question, we ask: does there exist a graph G for which this algorithm cannot find a proper m-coloring of the vertices of G with m equal to the chromatic number χ(G)?
Emphasis in original (!)
So it doesn't claim to resolve P vs NP, its just, as a matter of academic research, they ask "can anyone produce an example on which this algorithm fails to reach the chromatic number", which might be instructive to them for mathematical purposes. It is highly unlikely that the algorithm actually achieves the chromatic number for all graphs. (Although it is, scientifically speaking, unknown whether it does or doesn't.)

Data Structures / undirected graphs and Dijkstra's algorithm

I am stuck on a two part practice problem regarding the subjects mentioned in the title.
The first part of the question asks:
By considering the complete graph with n verticies, show that the maximum number of simple paths between two verticies is O((n-1)!). (I am assuming I am supposed to show this somehow with a formal definition)
The second portion of the questions asks:
Give an example where Dijkstra's algorithm gives the wrong answer in the presence of a negative edge but no negative cost cycles.
Thanks for any help!

Topological sort of cyclic graph with minimum number of violated edges

I am looking for a way to perform a topological sorting on a given directed unweighted graph, that contains cycles. The result should not only contain the ordering of vertices, but also the set of edges, that are violated by the given ordering. This set of edges shall be minimal.
As my input graph is potentially large, I cannot use an exponential time algorithm. If it's impossible to compute an optimal solution in polynomial time, what heuristic would be reasonable for the given problem?
Eades, Lin, and Smyth proposed A fast and effective heuristic for the feedback arc set problem. The original article is behind a paywall, but a free copy is available from here.
There’s an algorithm for topological sorting that builds the vertex order by selecting a vertex with no incoming arcs, recursing on the graph minus the vertex, and prepending that vertex to the order. (I’m describing the algorithm recursively, but you don’t have to implement it that way.) The Eades–Lin–Smyth algorithm looks also for vertices with no outgoing arcs and appends them. Of course, it can happen that all vertices have incoming and outgoing arcs. In this case, select the vertex with the highest differential between incoming and outgoing. There is undoubtedly room for experimentation here.
The algorithms with provable worst-case behavior are based on linear programming and graph cuts. These are neat, but the guarantees are less than ideal (log^2 n or log n log log n times as many arcs as needed), and I suspect that efficient implementations would be quite a project.
Inspired by Arnauds answer and other interesting topological sorting algorithms have I created the cyclic-toposort project and published it on github. cyclic_toposort does exactly what you desire in that it quickly sorts a directed cyclic graph providing the minimal amount of violating edges. It optionally also provides the maximum groupings of nodes that are on the same topological level (and can therefore be activated at the same time) if desired.
If the problem is still relevant to you then I would be happy if you check out my project and let me know what you think!
This project was useful to my own neural network topology research, so I had to create something like this anyway. I am answering your question this late in case anyone else stumbles upon this thread in search for the same question.

max-weight k-clique in a complete k-partite graph

My Problem
Whether there's an efficient algorithm to find a max-weight (or min-weight) k-clique in a complete k-partite graph (a graph in which vertices are adjacent if and only if they belong to different partite sets according to wikipedia)?
More Details about the Terms
Max-weight Clique: Every edge in the graph has a weight. The weight of a clique is the sum of the weights of all edges in the clique. The goal is to find a clique with the maximum weight.
Note that the size of the clique is k which is the largest possible clique size in a complete k-partite graph.
What I have tried
I met this problem during a project. Since I am not a CS person, I am not sure about the complexity etc.
I have googled several related papers but none of them deals with the same problem. I have also programmed a greedy algorithm + simulated annealing to deal with it (the result seems not good). I have also tried something like Dynamic Programming (but it does not seem efficient). So I wonder whether the exact optimal can be computed efficiently. Thanks in advance.
EDIT Since my input can be really large (e.g. the number of vertices in each clique is 2^k), I hope to find a really fast algorithm (e.g. polynomial of k in time) that works out the optimal result. If it's not possible, can we prove some lower bound of the complexity?
Generalized Maximum Clique Problem (GMCP)
I understand that you are looking for the Generalized Maximum/ minimum Clique Problem (GMCP), where finding the clique with maximum score or minimum cost is the optimization problem.
This problem is a NP-Hard problem as shown in Generalized network design problems, so there is currently no polynomial time exact solution to your problem.
Since, there is no known polynomial solution to your problem, you have 2 choices. Reducing the problem size to find the exact solution or to find an estimated solution by relaxing your problem and it leads you to a an estimation to the optimal solution.
Example and solution for the small problem size
In small k-partite graphs (in our case k is 30 and each partite has 92 nodes), we were able to get the optimal solution in a reasonable time by a heavy branch and bounding algorithm. We have converted the problem into another NP-hard problem (Mixed Integer Programming), reduced number of integer variables, and used IBM Cplex optimizer to find the optimal solution to GMCP.
You can find our project page and paper useful. I can also share the code with you.
How to estimate the solution
One straight forward estimation to this NP-Hard problem is relaxing the Mixed Integer Programming problem and solve it as a linear programming problem. Of course it will give you an estimation of the solution, but still you might get a reasonable answer in practice.
More general problem (Generalized Maximum Multi Clique Problem)
In another work, we solve the Generalized Maximum Multi Clique Problem (GMMCP), where maximizing the score or minimizing the cost of selecting multiple k-cliques in a complete k-partite graph is in interest. You can find the project page by searching for GMMCP Tracking.
The maximum clique problem in a weighted graph in general is intractable. In your case, if the graph contains N nodes, you can enumerate through all possible k-cliques in N ** k time. If k is fixed (don't know if it is), your problem is trivially polynomially solvable, as this is a polynomial in N. I don't believe the problem to be tractable if k is a free parameter because I can't see how the assumption of a k-partite graph would make the problem significantly simpler from the general one.
How hard your problem is in practice depends also on how the weights are distributed. If all the weights are very near to each others, i.e. the difference between "best" and "good" is relatively small, the problem is very hard. If you have wildly different weights on the edges, the problem can be easier, because a greedy algorithm can give you a good "initial" solution, and you can use that and subsequent good solutions to limit your combinatorial search using the well-known branch-and-bound method.

Efficient minimal spanning tree in metric space

I have a large set of points (n > 10000 in number) in some metric space (e.g. equipped with Jaccard Distance). I want to connect them with a minimal spanning tree, using the metric as the weight on the edges.
Is there an algorithm that runs in less than O(n2) time?
If not, is there an algorithm that runs in less than O(n2) average time (possibly using randomization)?
If not, is there an algorithm that runs in less than O(n2) time and gives a good approximation of the minimum spanning tree?
If not, is there a reason why such algorithm can't exist?
Thank you in advance!
Edit for the posters below:
Classical algorithms for finding minimal spanning tree don't work here. They have an E factor in their running time, but in my case E = n2 since I actually consider the complete graph. I also don't have enough memory to store all the >49995000 possible edges.
Apparently, according to this: Estimating the weight of metric minimum spanning trees in sublinear time there is no deterministic o(n^2) (note: smallOh, which is probably what you meant by less than O(n^2), I suppose) algorithm. That paper also gives a sub-linear randomized algorithm for the metric minimum weight spanning tree.
Also look at this paper: An optimal minimum spanning tree algorithm which gives an optimal algorithm. The paper also claims that the complexity of the optimal algorithm is not yet known!
The references in the first paper should be helpful and that paper is probably the most relevant to your question.
Hope that helps.
When I was looking at a very similar problem 3-4 years ago, I could not find an ideal solution in the literature I looked at.
The trick I think is to find a "small" subset of "likely good" edges, which you can then run plain old Kruskal on. In general, it's likely that many MST edges can be found among the set of edges that join each vertex to its k nearest neighbours, for some small k. These edges might not span the graph, but when they don't, each component can be collapsed to a single vertex (chosen randomly) and the process repeated. (For better accuracy, instead of picking a single representative to become the new "supervertex", pick some small number r of representatives and in the next round examine all r^2 distances between 2 supervertices, choosing the minimum.)
k-nearest-neighbour algorithms are quite well-studied for the case where objects can be represented as vectors in a finite-dimensional Euclidean space, so if you can find a way to map your objects down to that (e.g. with multidimensional scaling) then you may have luck there. In particular, mapping down to 2D allows you to compute a Voronoi diagram, and MST edges will always be between adjacent faces. But from what little I've read, this approach doesn't always produce good-quality results.
Otherwise, you may find clustering approaches useful: Clustering large datasets in arbitrary metric spaces is one of the few papers I found that explicitly deals with objects that are not necessarily finite-dimensional vectors in a Euclidean space, and which gives consideration to the possibility of computationally expensive distance functions.

Resources