Minimum product spanning tree with negative weights - algorithm

If all the edges have positive weights, the minimum product spanning tree can be obtained by taking the log of every edge weight and then applying Kruskal or Prim. But if some weights are negative, this procedure can't be applied, since we need to include an odd number of negative edges, and the chosen edges must have the largest possible absolute values. What should be done in that case?
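For reference, the positive-weight case described above can be sketched as follows. This is a minimal illustration, assuming vertices are numbered 0..n-1 and edges are given as (u, v, w) tuples with w > 0; the function name is made up for this example.

```python
import math


def min_product_spanning_tree(n, edges):
    """Kruskal on log(w). edges: list of (u, v, w) with w > 0 (assumed)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    # log is monotonic, so minimizing the sum of log(w) minimizes the product.
    for u, v, w in sorted(edges, key=lambda e: math.log(e[2])):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree
```

Because log is monotonic, sorting by log(w) gives the same order as sorting by w itself; the log only makes the product-to-sum reduction explicit.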

I highly doubt you can modify Prim's algorithm to work for this problem, because negative numbers completely change it. If a negative result is possible, then its absolute value has to be maximized, which means the edges with the highest absolute values have to be used. Hence trying to optimize a result found by Prim's algorithm on log(abs()) will not work, unless it is impossible to get a negative result, in which case it actually returns the best solution.
This makes the problem a little simpler, because we only have to look for the best negative solution, and if we don't find any we use Prim's with log(abs()).
If we assign each vertex a value of 1, then two vertices can be merged by creating a new vertex with all the edges of both vertices except the one connecting them; its value is the product of the values of the two removed vertices and the removed edge.
Based on this we can start to simplify by merging all nodes with only one edge. Parallel to each merge step the removed edge has to be marked as used in the original graph, so that the tree can be reconstructed from the marked edges in the end.
Additionally, we can merge all nodes that have only positive or only negative edges, removing the edge with the highest absolute value. After merging, the new node can have several connections to the same node; you can discard all but the negative and the positive edge with the highest absolute value (so at most 2 edges to the same node). By the way, as soon as we have 2 edges to the same node (following the removal conditions above), we know a solution <= 0 has to exist.
If you end up with one node and its value is negative, the problem was solved successfully; if it is positive, there is no negative solution. If we have a vertex with value 0, we can merge the rest of the nodes in any order. More likely we end up with a highly connected graph where each node has at least one negative and one positive edge. If we have an odd number of negative vertices, then we want to merge the nodes with an even number of negative edges, and vice versa.
Always merge by the edge with the highest absolute value. If the resulting vertex is <= 0, then you have found the best solution. Otherwise it gets complicated. You could look at each unused edge, try to add it, see which edges can be removed to make the result a tree again, consider only those with the opposite sign, and compute the ratio abs(added_edge/removed_edge). Then finally make the change with the best ratio (if you found any combination with opposite signs; otherwise there is no negative solution). But I am not 100% sure this would always give the best result.

Here is a simple solution. If there is at least one negative edge, find the spanning tree that maximizes the sum of log(abs(edge)). Then check whether the actual product (without abs) is negative. If it is negative, output the current spanning tree; otherwise replace one of the positive edges with a negative edge, or a negative one with a positive one, to get the solution.
If none of the edges are negative, minimizing the sum of log(edge) should work.
Complexity: O(n^2) with a naive solution.
More explanation on naive algorithm:
Select the edge that has the lowest absolute value for removal. Removing this edge will split the tree into two parts. We can then go through every pair of vertices between those two parts and pick the connecting edge (positive or negative, depending on the case) with the largest absolute value. The complexity of this part is O(n^2).
We might have to try removing multiple edges to reach the best solution. Assuming we go through every edge, complexity is O(n^3).
I am very confident this could be improved though.
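The first step of this answer (maximize the sum of log(abs(edge)), then check the sign of the product) could look roughly like the sketch below. It assumes vertices 0..n-1, edges as (u, v, w) tuples, and no zero weights; the function name is invented here. The swap step for the positive case is deliberately not shown.

```python
import math


def max_abs_product_tree(n, edges):
    """Kruskal on descending log|w|. edges: (u, v, w) with w != 0 (assumed).

    Returns the tree and the sign (+1 or -1) of the product of its weights.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree, sign = [], 1
    by_abs_desc = sorted(edges, key=lambda e: math.log(abs(e[2])), reverse=True)
    for u, v, w in by_abs_desc:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
            if w < 0:
                sign = -sign  # each negative edge flips the product's sign
    return tree, sign
```

If the returned sign is -1, the tree has the largest possible absolute product and a negative product, so it is already the minimum product spanning tree. If the sign is +1, the edge swap described above is still needed.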

Related

dijkstra with at most ten negative edges in a path

A question from homework, maybe need to change the implementation of Dijkstra or just reduction somehow.
Let G=(V, E) and let W be a weight function W: E->Z.
All the negative-weight edges have the same negative value x (for example, all the negative weights on edges have the value -10 and all the others are positive).
Let's define "weight up to 10 negative edges," which returns the weight of the path if there is at most ten negative edges or infinity if there are more than ten negative edges.
I need to find a "weight up to 10 negative edges" path from vertex S to all other vertices.
The time complexity should be O(E log(V)) or O(E + V log(V)).
I thought of duplicating the graph ten times, so that each time we take a negative-weight edge we move from one duplicate to the next. We would add edges with a weight of infinity between the 10th duplicate and the 11th duplicate and run Dijkstra. But I don't think it works.
There should be a solution that uses Dijkstra in some way.
Dijkstra's algorithm doesn't work with negative edges because it iteratively selects the "unconfirmed" node with the lowest path length, marks it as "confirmed", and then never updates the path length for that node again. If a negative edge exists, then a "shorter" path might be found to a node after it has already been "confirmed", but if the node becomes "unconfirmed" again as a result of that then there could potentially be an infinite loop; the same node could keep getting confirmed then unconfirmed over and over, and the algorithm would never terminate. Any change to the algorithm to solve this problem must address that problem.
As a way to guarantee termination, instead of just recording the path length, you can record a pair like (path length, # of negative edges). When a shorter path to a "confirmed" node is found using a negative edge, the path length may get shorter but the number of negative edges in that path is increased. So you can write a condition to stop updating it if the number of negative edges in the resulting path would be greater than 10.
The problem is more subtle than that, though, because it's no longer the case that the "shortest path so far" to a node is the best one to keep. Suppose you are looking for a shortest path from A to C using at most 10 negative edges, and you have found a path of length 10 from A to B using no negative edges, and another one from A to B of length 5 using three negative edges; you don't yet know which one leads to a better solution (or a solution at all), because there may be 8 negative edges in the path from B to C. So at each node you need to record not just one pair of (path length, # of negative edges), but a set of all the best such pairs.
Hopefully that gives you an idea of how Dijkstra's algorithm can be adapted to solve your problem; there are some remaining details you will need to fill in yourself.
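One way to fill in those details is sketched below. It is not taken from the answers above but combines two ideas: keep one best value per (node, number of negative edges used) state, and, because every negative edge has the same value x, charge each negative edge 0 inside Dijkstra and add k * x back at the end. With that reweighting all costs seen by Dijkstra are non-negative. The function name, the (u, v, w) edge format and the `limit` parameter are assumptions made for this example.

```python
import heapq


def shortest_with_negative_limit(n, edges, source, limit=10):
    """edges: directed (u, v, w); every negative w is assumed to equal the same x."""
    adj = [[] for _ in range(n)]
    x = 0
    for u, v, w in edges:
        adj[u].append((v, w))
        if w < 0:
            x = w
    inf = float("inf")
    # reduced[v][k]: best cost of reaching v with exactly k negative edges,
    # where each negative edge is counted as 0.
    reduced = [[inf] * (limit + 1) for _ in range(n)]
    reduced[source][0] = 0
    heap = [(0, source, 0)]
    while heap:
        d, u, k = heapq.heappop(heap)
        if d > reduced[u][k]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nk, nd = (k + 1, d) if w < 0 else (k, d + w)
            if nk <= limit and nd < reduced[v][nk]:
                reduced[v][nk] = nd
                heapq.heappush(heap, (nd, v, nk))
    # Add the k negative edges back in: true cost = reduced cost + k * x.
    return [min(reduced[v][k] + k * x for k in range(limit + 1)) for v in range(n)]
```

The state space has (limit + 1) copies of each vertex and edge, so for a constant limit of 10 the running time stays within O(E log(V)). Unreachable vertices come back as infinity.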
You can't use Dijkstra's algorithm with negative weights and come up with a correct solution. See this other post for the reasoning behind why it fails.

Will a standard Kruskal-like approach for MST work if some edges are fixed?

The problem: you need to find the minimum spanning tree of a graph (i.e. a set S of edges in said graph such that the edges in S together with the respective vertices form a tree; additionally, from all such sets, the sum of the cost of all edges in S has to be minimal). But there's a catch. You are given an initial set of fixed edges K such that K must be included in S.
In other words, find some MST of a graph with a starting set of fixed edges included.
My approach: standard Kruskal's algorithm but before anything else join all vertices as pointed by the set of fixed edges. That is, if K = {1,2}, {4,5} I apply Kruskal's algorithm but instead of having each node in its own individual set initially, instead nodes 1 and 2 are in the same set and nodes 4 and 5 are in the same set.
The question: does this work? Is there a proof that this always yields the correct result? If not, could anyone provide a counter-example?
P.S. the problem only inquires finding ONE MST. Not interested in all of them.
Yes, it will work as long as your initial set of edges doesn't form a cycle.
Keep in mind that the resulting tree might not be minimal in weight since the edges you fixed might not be part of any MST in the graph. But you will get the lightest spanning tree which satisfies the constraint that those fixed edges are part of the tree.
How to implement it:
To implement this, you can simply change the weights of the edges you need to fix. Just pick the lowest edge weight appearing in your graph, say min_w, subtract 1 from it, and assign this new weight, i.e. (min_w-1), to the edges you need to fix. Then run Kruskal on this graph.
Why it works:
Clearly Kruskal will pick all the edges you need (since these are the lightest now) before picking any other edge in the graph. When Kruskal finishes the resulting set of edges is an MST in G' (the graph where you changed some weights). Note that since you only changed the values of your fixed set of edges, the algorithm would never have made a different choice on the other edges (the ones which aren't part of your fixed set). If you think of the edges Kruskal considers, as a sorted list of edges, then changing the values of the edges you need to fix moves these edges to the front of the list, but it doesn't change the order of the other edges in the list with respect to each other.
Note: As you may notice, giving the lightest weight to your edges is basically the same thing as you suggest. But I think it is a bit easier to reason about why it works. Go with whatever you prefer.
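The pre-merging variant from the question can be sketched like this. It is a minimal illustration, assuming vertices 0..n-1 and edges given as (u, v, w) tuples; the function name is made up for the example.

```python
def mst_with_fixed_edges(n, edges, fixed):
    """Kruskal with the fixed edges united first.

    edges, fixed: lists of (u, v, w); fixed must not contain a cycle.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, w in fixed:
        ru, rv = find(u), find(v)
        if ru == rv:
            raise ValueError("fixed edges contain a cycle")
        parent[ru] = rv
        tree.append((u, v, w))
    # Regular Kruskal on the remaining edges, starting from the merged sets.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree
```

Fixed edges that also appear in `edges` are skipped automatically in the second loop, because their endpoints are already in the same set.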
I wouldn't recommend Prim, since this algorithm expands the spanning tree gradually from the current connected component (in the beginning one usually starts with a single node). The case where you join larger components (because your fixed edges might not all be in a single component) would need to be handled separately. It might not be hard, but you would have to take care of it. OTOH with Kruskal you don't have to adapt anything; you simply manipulate your graph a bit before running the regular algorithm.
If I understood the question properly, Prim's algorithm would be more suitable for this, as it is possible to initialize the connected components to be exactly the edges which are required to occur in the resulting spanning tree (plus the remaining isolated nodes). The desired edges are not permitted to contain a cycle, otherwise there is no spanning tree including them.
That being said, apparently Kruskal's algorithm can also be used, as it is explicitly stated that it can be used to find an edge that connects two forests in a cost-minimal way.
Roughly speaking, as the forests of a given graph form a Matroid, the greedy approach yields the desired result (namely a weight-minimal tree) regardless of the independent set you start with.

Negative weight edges

Full question: Argue that if all edge weights of a graph are positive, then any subset of edges that connects all vertices and has minimum total weight must be a tree. Give an example to show that the same conclusion does not follow if we allow some weights to be nonpositive.
My answer: Since the edges connects all vertices, it must be a tree. In a graph, you can remove one of the edges and still connect all the vertices. Also, negative edges can be allowed in a graph (e.g. Prim and Kruskal's algorithms).
Please let me know if there's a definite answer to this and explain to me how you got the conclusion. I'm a little bit lost with this question.
First off, a tree is a type of graph. So " In a graph, you can remove one of the edges and still connect all the vertices" isn't true. A tree is a graph without cycles - i.e., with only one path between any two nodes.
Negative weights in general can exist in either a tree or a graph.
The way to approach this problem is to show that if you have a graph that connects all components, but is NOT a tree, then it is also not of minimum weight (i.e., there is some other graph that does the same thing, with a lower total weight.) This conclusion is only true if the graph contains only positive edges, so you should also provide a counterexample - a graph which is NOT a tree, which IS of minimum weight, and which IS fully connected.
With positive weights, adding an edge to traverse from one node to another always results in the weight increasing, so for minimum weight you always avoid that.
If you allow negative weights, adding an edge may result in reducing the weight. If you have a cycle with negative weight overall, minimum weight demands that you stay in that cycle infinitely (leading to infinitely negative weight for the path overall).
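For the counterexample the question asks for, a small brute-force check can help. The sketch below, with invented helper names and an assumed (u, v, w) edge format, enumerates every edge subset that connects all vertices and returns the lightest one. On a triangle whose edges all weigh -1, the lightest connecting subset is the whole triangle, which contains a cycle and is therefore not a tree.

```python
from itertools import combinations


def connects_all(n, subset):
    """True if the edges in subset connect vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v, _ in subset:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def min_weight_connecting_subset(n, edges):
    """Brute force over all edge subsets; only meant for tiny graphs."""
    best = None
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            if connects_all(n, subset):
                total = sum(w for _, _, w in subset)
                if best is None or total < best[0]:
                    best = (total, subset)
    return best


negative_triangle = [(0, 1, -1), (1, 2, -1), (0, 2, -1)]
total, chosen = min_weight_connecting_subset(3, negative_triangle)
print(total, len(chosen))  # 3 edges on 3 vertices, so not a tree
```

With all weights positive, the same function returns a subset with exactly n - 1 edges, in line with the first part of the question.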

Can I use Dijkstra's shortest path algorithm in my graph?

I have a directed graph that has all non-negative edges except the edge(s) that leave the source (S). There are no edges from any other vertices to the source. To find the shortest distance from source (S) to a vertex (T) in the graph, can I use Dijkstra's shortest path algorithm even though the edges leaving the source is negative?
Assuming only source-adjacent edges can have negative weights and there is no path back to the source from any of the source-adjacent nodes (as mentioned in the comment), you can just add a constant C to all edges leaving the source to make them all non-negative. Then subtract C from the final result.
On a more general note, Dijkstra can be used to solve shortest-path in any graph with negative edge weights (but no negative cycles) after applying Johnson's reweighting algorithm (which is essentially Bellman-Ford, but needs to be performed only once).
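The constant-shift idea can be sketched as follows. The code assumes vertices 0..n-1, directed edges as (u, v, w) tuples, negative weights only on edges leaving the source, and no edges entering the source, so every path to another vertex uses exactly one shifted edge. The function name is made up for this example.

```python
import heapq


def dijkstra_shifted_source(n, edges, source):
    """Shift source edges by C, run plain Dijkstra, then subtract C again."""
    out_of_source = [w for u, _, w in edges if u == source]
    c = max(0, -min(out_of_source, default=0))  # C makes source edges >= 0
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w + c if u == source else w))
    inf = float("inf")
    dist = [inf] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    # Every path to v != source contains exactly one shifted edge.
    return [d if v == source else d - c for v, d in enumerate(dist)]
```

For example, with edges 0->1 (-5), 0->2 (-2), 1->2 (1) and 2->3 (4) and source 0, the shift is C = 5 and the returned distances are 0, -5, -4 and 0.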
Yes, you can use Dijkstra on that type of directed graph.
If you use an existing implementation of Dijkstra that cannot handle negative values, it can be good practice to find the most negative edge and add its absolute value to all edges leaving the start, so that there are no negative numbers at all. You subtract that number again after finishing.
If you code it yourself (which is actually pretty easy and I recommend it), you change almost nothing: just start with the lowest value (as usual for Dijkstra) and allow that lowest value to be negative. It will work in your case.
The reason you generally can't use Dijkstra's algorithm for (directed) graphs with negative links is that Dijkstra's algorithm is greedy. It assumes that once you pick a vertex with minimum distance, there is no way it can later be reached by a shorter path.
In your particular graph, after the very first step, you traverse all possible negative edges and Dijkstra's assumption actually holds from now on. Regardless of the fact that those vertices directly connected to start now have negative values, once you identify which has the minimum distance, it can never be reached again with a smaller distance (since all edges you would traverse from this point on would have a positive distance).
If you think about the conditions that Dijkstra's algorithm puts on the edges for the algorithm to work, it is only that path lengths never decrease after initialisation.
Thus it actually doesn't matter if the first step is negative, since from those points onwards the path length never decreases, and so the correct output will be found (provided there is no way to get back to the start).

Finding maximum number k such that for all combinations of k pairs, we have k different elements in each combination

We are given N pairs. Each pair contains two numbers. We have to find maximum number K such that if we take any combination of J (1<=J<=K) pairs from the given N pairs, we have at least J different numbers in all those selected J pairs. We can have more than one pair same.
For example, consider the pairs
(1,2)
(1,2)
(1,2)
(7,8)
(9,10)
For this case K = 2, because for K > 2, if we select three pairs of (1,2), we have only two different numbers i.e 1 and 2.
Checking for each possible combination starting from one will take a very large amount of time. What would be an efficient algorithm for solving the problem?
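For reference, the brute-force check that the question wants to avoid can be written directly from the definition. It is exponential, so it is only useful as a baseline for testing a faster algorithm on small inputs; the function name is made up for this example.

```python
from itertools import combinations


def max_k_bruteforce(pairs):
    """Largest K such that every choice of J <= K pairs has at least J distinct numbers."""
    k = 0
    for j in range(1, len(pairs) + 1):
        for combo in combinations(pairs, j):
            distinct = {x for pair in combo for x in pair}
            if len(distinct) < j:
                return k  # some J-combination fails, so K = J - 1
        k = j
    return k


example = [(1, 2), (1, 2), (1, 2), (7, 8), (9, 10)]
print(max_k_bruteforce(example))  # 2
```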
Create a graph with one vertex for each number and one edge for each pair.
If this graph is a chain or a tree, the number of "numbers" is equal to the number of "pairs" plus one. After removing any number of edges from this graph, we never get fewer vertices than edges.
Now add a single cycle to this chain/tree. There is an equal number of vertices and edges. After removing any number of edges from this graph, again we never get fewer vertices than edges.
Now add any number of disconnected components, each of which should contain no more than one cycle. Once again, we never get fewer vertices than edges after removing any number of edges.
Now add a second cycle to any of the disconnected components. After removing all the other components, we finally have more edges than vertices (more pairs than numbers).
All this leads to the conclusion that K+1 is exactly the number of edges in the smallest possible subgraph, consisting of two cycles and, possibly, a chain, connecting these cycles.
Algorithm:
For each connected component, find the shortest cycle going through every node with Floyd-Warshall algorithm.
Then for each non-overlapping pair of cycles (in single component), use Dijkstra’s algorithm, starting from any node with at least 3 edges in one cycle, to find shortest path to other cycle; and compute a sum of lengths of both cycles and a shortest path, connecting them. For each overlapping pair of cycles, just compute the number of their edges.
Now find the minimum length of all these subgraphs. And subtract 1.
The above algorithm computes K if there is at least one double-cycle component in the graph. If there are no such components, K = N.
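The easy case in the last sentence (no component with two cycles, so K = N) can be detected with a union-find pass: a connected component contains two cycles exactly when it has more edges than vertices. The sketch below only performs this check and does not implement the shortest double-cycle search; the function name is invented for the example.

```python
from collections import Counter


def has_double_cycle_component(pairs):
    """True if some connected component has more edges (pairs) than vertices (numbers)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    vertex_count = Counter(find(x) for x in list(parent))
    edge_count = Counter(find(a) for a, _ in pairs)
    return any(edge_count[r] > vertex_count[r] for r in edge_count)
```

If this returns False, K = N by the argument above. If it returns True, K < N and the double-cycle search is still required to find the exact value.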
Seems related to MinCut/MaxFlow. Here is a try to reduce it to MinCut/MaxFlow:
- Produce one vertex for each number
- Produce one vertex for each pair
- Produce an edge from number i to a pair if the number is present in the pair, weight 1
- Produce a source node and connect it to all numbers, weight 1 for each connection
- Produce a sink node and connect it to all pairs, weight 1 for each connection
Running MaxFlow on this should give you the number K, since any set of three pairs which only contains two numbers in total will be "blocked" by the constraints on the outgoing edges from the numbers.
I am not sure whether this is the fastest solution. There might also be a matroid hidden in there somewhere, I think. In that case there is a greedy approach. But I cannot find a proof for the matroid properties of the sets you are constructing.
I made some progress on it, but not yet an efficient solution. However it may point the way.
Make a graph whose points are pairs, and connect any pair of points if they share a number. Then for any subgraph, the number of numbers in it is the number of vertices minus the number of edges. Therefore your problem is the same as locating the smallest subgraph (if any) that has more edges than vertices.
A minimal subgraph that has the same number of edges and vertices is a cycle. Therefore the graphs we're looking for are either 2 cycles that share one or more vertices, or else 2 cycles which are connected by a path. There are no other minimal types possible.
You can locate and enumerate cycles fairly easily with a breadth-first search. There may be a lot of them, but this is doable. Armed with that you can look for subgraphs of these subtypes. (Enumerate minimal cycles, look for either pairs that share points, or which are connected.) But that isn't guaranteed to be polynomial. I suspect it will be something where on average it is pretty good, but the worst case is very bad. However that may be more efficient than what you're doing now.
I keep on thinking that some kind of breadth-first search can find these in polynomial time, but I keep failing to see exactly how to do it.
This is equivalent to finding the chord that chords the smallest cycle in the graph. A very naive algorithm would be:
Check if removal of an edge results in a cycle containing the vertices corresponding to the edge. If yes, then note down the length of the smallest cycle.

Resources