How can I find a way to minimize the number of edges? - algorithm

I am looking for an algorithm to solve the problem below:
We are given a graph composed of vertices and edges.
There are N customers, each of whom wants to travel from one vertex to another.
Each customer's requirement needs directed edges that connect the two vertices.
The problem is how to find the minimum number of edges that satisfies all customer requirements.
There is a simple example:
Customer 1 wants to travel from vertex a to vertex b.
Customer 2 wants to travel from vertex b to vertex c.
Customer 3 wants to travel from vertex a to vertex c.
The simplest way is to give an edge to each customer:
edge 1: vertex a -> vertex b
edge 2: vertex b -> vertex c
edge 3: vertex a -> vertex c
But actually only 2 edges (i.e. edge 1 and edge 2) are needed to satisfy all three customer requirements.
If the number of customers is large, how can I find the minimum set of edges that satisfies all customer requirements?
Is there an algorithm to solve this problem?

You can model the problem as a mixed integer program. You can define binary variables for "arc a -> b is used" and "customer c uses arc a -> b" and write down the requirements as linear inequalities. If your graph is not too large, you can solve such models in reasonable time with a mixed integer programming solver (CPLEX or Gurobi, but there are also free alternatives on the web).
I know that this solution requires some work if you are not familiar with linear programming, but it is guaranteed to find an optimal solution in finite time, and you can probably solve it for (say) 1000 customers and 1000 arcs.
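One way to write such a model down, as a sketch only: the answer does not give a formulation, so the symbols here are assumptions. A is the set of candidate arcs, C the set of customers, s_c and t_c are the start and end vertex of customer c, x_a = 1 means arc a is used, and y_{c,a} = 1 means customer c travels over arc a.

```latex
\begin{aligned}
\min \quad & \sum_{a \in A} x_a \\
\text{s.t.} \quad
& \sum_{a \in \delta^{+}(v)} y_{c,a} - \sum_{a \in \delta^{-}(v)} y_{c,a} =
  \begin{cases} 1 & \text{if } v = s_c \\ -1 & \text{if } v = t_c \\ 0 & \text{otherwise} \end{cases}
  && \forall c \in C,\ \forall v \in V \\
& y_{c,a} \le x_a && \forall c \in C,\ \forall a \in A \\
& x_a,\ y_{c,a} \in \{0, 1\} && \forall c \in C,\ \forall a \in A
\end{aligned}
```

Here \delta^{+}(v) and \delta^{-}(v) denote the arcs leaving and entering v. The first constraint routes one unit of flow from s_c to t_c for every customer, and the second allows a customer to use an arc only if that arc is selected.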

If you have N vertices, you can always construct a solution with N (directed) edges. Just create a directed cycle V_1 -> V_2 -> V_3 -> ... -> V_N -> V_1. You can never have a directed path from every vertex V_a to every other vertex V_b with fewer edges (because you'd have a directed tree, which necessarily contains a leaf). The leaf is either unreachable (if its edge points out of the leaf) or a sink that can't connect to anything else (if its edge points into the leaf).

No new algorithm is needed. You can use BFS/DFS.
for each request (source, destination):
    find whether any path exists from source to destination
    if not:
        add a direct edge from source to destination
        count++
return count
The key point is that the search runs over the newly added edges instead of over the original graph.
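A minimal Python sketch of the BFS idea above. The function names `reachable` and `greedy_edge_count` and the request format (a list of `(source, destination)` pairs) are assumptions made for illustration, not part of the answer.

```python
from collections import deque

def reachable(adj, src, dst):
    """Breadth-first search over the edges added so far."""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def greedy_edge_count(requests):
    """Add a direct edge only when a request is not already satisfied."""
    adj = {}      # adjacency lists of the edges added so far
    added = []
    for src, dst in requests:
        if not reachable(adj, src, dst):
            adj.setdefault(src, []).append(dst)
            added.append((src, dst))
    return added

# The example from the question: the third request needs no new edge.
edges = greedy_edge_count([("a", "b"), ("b", "c"), ("a", "c")])
```

Note that this is a greedy procedure, so the result depends on the order of the requests: if the request a -> c is processed first, all three edges are added.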

You can use a disjoint-set data structure.
https://en.wikipedia.org/wiki/Disjoint-set_data_structure
while (num_edges--)
    if root(vertex_a) != root(vertex_b)
        count++
        union(vertex_a, vertex_b)
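The disjoint-set loop above might look like this in Python. This is a sketch under assumptions: `count_edges_union_find` is a hypothetical name and requests are `(a, b)` pairs. A disjoint-set structure ignores edge direction, so this version treats a -> b and b -> a as the same connection.

```python
def count_edges_union_find(requests):
    """Count the requests that join two previously separate groups."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    count = 0
    for a, b in requests:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # union the two groups
            count += 1
    return count

needed = count_edges_union_find([("a", "b"), ("b", "c"), ("a", "c")])
```

Because direction is ignored, a request b -> a issued after a -> b is counted as already satisfied, which is only correct when travel in both directions is allowed.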

If I think of the same problem for undirected edges, what we are looking for is the minimum spanning tree (MST) of the original graph (constructed from all the edges). The brief explanation is that for each edge E (v1 -> v2), if there is a second path from v1 to v2, there exists a cycle, and for each existing cycle there is an edge we can omit.
For finding the MST of a directed graph, you can use the Chu–Liu/Edmonds' algorithm.
Note that you are assigning a weight of 1 to all of your edges.

Related

In a DAG, how to find vertices where paths converge?

I have a type of directed acyclic graph, with some constraints.
There is only one "entry" vertex
There can be multiple leaf vertices
Once a path splits, anything under that path cannot reach into the other path (this will become clearer with some examples below)
There can be any number of "split" vertices. They can be nested.
A "split" vertex can split into any number of paths. The examples below only show 2 paths for each, but it could be more.
My challenge is the following: for each "split" vertex (any vertex that has at least 2 outgoing edges), find the vertices where its paths reconnect - if such a vertex exists. The solution should be as efficient as possible.
Example A:
example a
In this example, vertex A is a "split" vertex, and its "reconnect vertex" is F.
Example B:
example b
Here, there are two split vertices: A and E. For both of them vertex G is the reconnect vertex.
Example C:
example c
Now there are three split vertices: A, D and E. The corresponding reconnect vertices are:
A -> K
D -> K
E -> J
Example D:
example d
Here we have three split vertices again: A, D and E. But this time, vertex E doesn't have a reconnect vertex because one of the paths terminates early.
Sounds like what you want is:
Connect each vertex with out-degree 0 to a single terminal vertex
Construct the dominator tree of the edge-reversed graph. The linked wikipedia article points to a couple algorithms for doing this.
The "reconnect vertex" for a split vertex is its immediate dominator in the edge-reversed graph, i.e., its parent in that dominator tree. This is called its "postdominator" in your original graph. If it's the terminal vertex that you added, then it doesn't have a reconnect vertex in your original graph.
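For small DAGs the three steps above can be sketched without a full dominator-tree algorithm, by computing post-dominator sets directly. Everything here is an assumption made for illustration: the function name `reconnect_vertices`, the input format (a dict mapping each vertex to its successor list), and the example graph, which is not one of the figures from the question. The set-based approach is simpler but slower than the dedicated algorithms the answer refers to.

```python
def reconnect_vertices(succ):
    """Map each split vertex to its immediate post-dominator (None if absent)."""
    EXIT = object()  # virtual terminal vertex shared by all leaves
    nodes = set(succ) | {t for ts in succ.values() for t in ts}
    out = {n: list(succ.get(n, [])) or [EXIT] for n in nodes}
    out[EXIT] = []

    # Post-order DFS: in a DAG every successor finishes before its predecessor.
    order, seen = [], set()
    def visit(n):
        seen.add(n)
        for s in out[n]:
            if s not in seen:
                visit(s)
        order.append(n)
    for n in nodes:
        if n not in seen:
            visit(n)

    # pdom[n] = {n} plus the vertices that lie on every path from n to EXIT.
    pdom = {}
    for n in order:
        if n is EXIT:
            pdom[n] = {EXIT}
        else:
            pdom[n] = {n} | set.intersection(*(pdom[s] for s in out[n]))

    result = {}
    for n in nodes:
        if len(out[n]) >= 2:  # split vertex
            strict = pdom[n] - {n}
            # The closest strict post-dominator has the largest pdom set.
            ipdom = max(strict, key=lambda d: len(pdom[d]))
            result[n] = None if ipdom is EXIT else ipdom
    return result

# Hypothetical graph: A splits into B and C, B splits again and rejoins at F.
graph = {"A": ["B", "C"], "B": ["D", "E"], "D": ["F"], "E": ["F"],
         "F": ["G"], "C": ["G"]}
reconnect = reconnect_vertices(graph)
```

A split vertex mapped to `None` is one whose only common post-dominator is the added terminal vertex, i.e. it has no reconnect vertex in the original graph.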
This is the problem of identifying post-dominators in compilers and program analysis. This is often used in the context of calculating control dependences in control flow graphs. "Advanced Compiler Design and Implementation" is a good reference on these topics.
If the graph does not have cycles, then the solution (a) suggested by @matt-timmermans will work.
If the graph has cycles, then solution (a) can report spurious post-dominators. In such cases, a network-flow based approach works better. The algorithm to calculate non-termination sensitive control dependence in this paper uses this approach. The basic idea is to
at every split node, inject a unique token into the graph along each outgoing edge and
propagate the tokens thru the graph subject to this constraint: if node n is reachable from split node m, then tokens arriving at node m pass thru node n only if all tokens of node m have arrived at node n.
At the end, node n post-dominates node m if all tokens of node m have arrived at node n.

Finding MST such that a specific vertex has a minimum degree

Given an undirected, connected graph G = (V, E), a vertex in V, call it v, and a weight function f: E -> R+ (positive real numbers), I need to find an MST such that v's degree is minimal. I've already noticed that if all the edges have unique weights, the MST is unique, so I believe it has something to do with repeated weights on edges. I thought about running Kruskal's algorithm, but when sorting the edges, I'll always place edges incident to v last among edges of equal weight. For example, if (a,b), (c,d), (v,e) are the only edges of weight k, then the possible permutations of these edges in the sorted edge array are {(a,b),(c,d),(v,e)} or {(c,d),(a,b),(v,e)}. I've run this variation on several graphs and it seems to work, but I couldn't prove it. Does anyone know how to prove the algorithm correct (meaning proving v's degree is minimal), or give a counterexample where the algorithm fails?
First note that Kruskal's algorithm can be applied to any weighted graph, whether or not it is connected. In general it results in a minimum-weight spanning forest (MSF), with one MST for each connected component. To prove that your modification of Kruskal's algorithm succeeds in finding the MST for which v has minimal degree, it helps to prove the slightly stronger result that if you apply your algorithm to a possibly disconnected graph then it succeeds in finding the MSF where the degree of v is minimized.
The proof is by induction on the number, k, of distinct weights.
Basis Case (k = 1). In this case weights can be ignored and we are trying to find a spanning forest in which the degree of v is minimized. In this case, your algorithm can be described as follows: pick edges for as long as possible according to the following two rules:
1) No selected edge forms a cycle with previously selected edges
2) An edge involving v isn't selected unless every edge which doesn't involve v violates rule 1.
Let G' denote the graph from which v and all incident edges have been removed from G. It is easy to see that the algorithm in this special case works as follows. It starts by creating a spanning forest for G'. Then it takes those trees in the forest that are contained in v's connected component in the original graph G and connects each component to v by a single edge. Since the components connected to v in the second stage can be connected to each other in no other way (since if any connecting edge not involving v exists it would have been selected by rule 2) it is easy to see that the degree of v is minimal.
Inductive Case: Suppose that the result is true for k, and that G is a weighted graph with k+1 distinct weights and v is a specified vertex in G. Sort the distinct weights in increasing order (so that weight k+1 is the largest of the distinct weights -- say w_{k+1}). Let G' be the sub-graph of G with the same vertex set but with all edges of weight w_{k+1} removed. Since the edges are sorted in order of increasing weight, note that the modified Kruskal's algorithm in effect starts by applying itself to G'. Thus, by the induction hypothesis, prior to considering edges of weight w_{k+1} the algorithm has succeeded in constructing an MSF F' of G' for which the degree d' of v in F' is minimized.
As a final step, modified Kruskal's applied to the overall graph G will merge certain of the trees in F' together by adding edges of weight w_{k+1}. One way to conceptualize the final step is to think of F' as a graph whose nodes are its trees, where two trees are connected exactly when there is an edge of weight w_{k+1} from some node in the first tree to some node in the second tree. We have (almost) the basis case with F'. Modified Kruskal's will add edges of weight w_{k+1} until it can't do so anymore -- and won't add an edge connecting to v unless there is no other way to connect two trees in F' that need to be connected to get a spanning forest for the original graph G.
The final degree of v in the resulting MSF is d = d'+d" where d" is the number of edges of weight w_{k+1} added at the final step. Neither d' nor d" can be made any smaller, hence it follows that d can't be made any smaller (since the degree of v in any spanning forest can be written as the sum of the number of edges whose weight is less than w_{k+1} coming into v and the number of edges of weight w_{k+1} coming into v).
QED.
There is still an element of hand-waving in this, especially with the final step -- but Stack Overflow isn't a peer-reviewed journal. Anyway, the overall logic should be clear enough.
One final remark -- it seems fairly clear that Prim's algorithm can be similarly modified for this problem. Have you looked into that?
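The modified Kruskal's algorithm discussed in this thread can be sketched as follows. The names are assumptions for illustration: `mst_min_degree` is hypothetical, vertices are labelled 0..n-1, and edges are `(weight, a, b)` triples. The only change from plain Kruskal is the sort key, which places edges touching v last within each weight class.

```python
def mst_min_degree(n, edges, v):
    """Kruskal's algorithm with ties broken so that edges touching v come last."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Sort by weight; within one weight, False (no v) sorts before True (touches v).
    ordered = sorted(edges, key=lambda e: (e[0], v in (e[1], e[2])))
    tree = []
    for w, a, b in ordered:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two different trees
            parent[root_a] = root_b
            tree.append((w, a, b))
    return tree

# Triangle with equal weights: every pair of edges is an MST,
# but only the one containing edge (1, 2) gives vertex 0 degree 1.
tree = mst_min_degree(3, [(1, 0, 1), (1, 0, 2), (1, 1, 2)], v=0)
degree_of_v = sum(1 for _, a, b in tree if 0 in (a, b))
```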

Minimum sum weight of connecting 3 vertices in an undirected, weighted graph, with only positive edge weights

I'm looking for pointers as to where one could start looking for a solution to this problem.
After googling for some time, the only problem I have found which is similar to mine is the minimum spanning tree. The difference is that I am not looking for a tree that spans all the vertices in a graph, but one that spans 3 given vertices.
I am not looking for a complete program, but a pointer in the general direction of the answer.
Another idea I had was to run 3 searches with Dijkstra's algorithm. The idea was to somehow find the best path by combining the different shortest paths. I do not know how this would be done.
Here is a graphical example of the type of graph I am talking about:
So the task is to find a way to compute the minimum sum weight of connecting any 3 vertices in this kind of graph.
EDIT :
I solved the problem by running 3 searches with Dijkstra's algorithm. Then I found the vertex which had the minimum sum weight connecting the 3 vertices by adding together all unique edges. Thanks for all the help :)
I'm pretty sure you can do this with Dijkstra's algorithm. The only trick is that you don't know what order to visit the nodes in, so you'd have to try all 6 orderings.
So if you've got nodes A, B, and C, for the first ordering A, B, C, you'd run Dijkstra's between A and B, between B and C, and between C and A. Then you'd do the next ordering A, C, B. And keep going from there.
Algorithm 1
I think that your idea of using Dijkstra is good.
One way in which you could make this work is by trying every vertex x as a start point and compute the smallest value for the sum w(x,a)+w(x,b)+w(x,c) where a,b,c are the 3 vertices you wish to connect and w(u,v) is the shortest path computed with Dijkstra.
I believe this smallest sum will be the minimum sum weight to connect the 3 vertices.
Unfortunately, this means running Dijkstra 3n times.
Algorithm 2
A better approach is to run Dijkstra from each of the nodes to be connected, and store the distances to each node in your graph. (So wa(x) is the shortest distance from x to a, etc.)
You can then iterate over every vertex x as before and compute the smallest value for the sum wa(x)+wb(x)+wc(x)
This is equivalent to algorithm 1, but n times faster as Dijkstra is only run 3 times.
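Algorithm 2 can be sketched in Python as below. The function names `dijkstra` and `best_meeting_vertex`, the adjacency format (a dict mapping each vertex to a list of `(neighbour, weight)` pairs, listed in both directions because the graph is undirected), and the example graph are assumptions for illustration.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source to every reachable vertex."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def best_meeting_vertex(adj, a, b, c):
    """Vertex x minimising wa(x) + wb(x) + wc(x), and that minimum sum."""
    da, db, dc = (dijkstra(adj, t) for t in (a, b, c))
    common = da.keys() & db.keys() & dc.keys()
    x = min(common, key=lambda n: da[n] + db[n] + dc[n])
    return x, da[x] + db[x] + dc[x]

# Hypothetical star: a, b and c each hang off a middle vertex m with weight 1.
adj = {
    "a": [("m", 1)], "b": [("m", 1)], "c": [("m", 1)],
    "m": [("a", 1), ("b", 1), ("c", 1)],
}
meeting, total = best_meeting_vertex(adj, "a", "b", "c")
```

In this example the best meeting vertex is m with total weight 3, whereas starting from any of a, b or c gives a sum of 4.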
With the restrictions that the weights are all positive and the graph is undirected, you can solve the problem using Dijkstra's algorithm, as suggested. Let us say that the nodes in question are A, B, C, all in some graph G.
Run Dijkstra's on G from:
A -> B
B -> C
C -> A
These form the edges of a triangle connecting the three vertices.
We can do this because of the condition that the graph is undirected which implies that the shortest path from A -> B is the same as the one from B -> A.
Because the weights are all positive, the shortest path connecting A, B, and C will contain precisely two edges. (Assuming you are happy ignoring the possible alternate solution of a cycle arising from three 0 weight paths in the "triangle").
So how do we pick which two edges? Any two edges will connect all three vertices, so we can eliminate any of the three, so we will eliminate the longest one.
So this algorithm will do it in the same time complexity as Dijkstra's algorithm.
This looks like a special case of the minimum Steiner tree problem (with three terminals).

Minimize set of edges in a directed graph keeping connected components

Here is the full question:
Assume we have a directed graph G = (V,E), we want to find a graph G' = (V,E') that has the following properties:
G' has same connected components as G
G' has same component graph as G
E' is minimized. That is, E' is as small as possible.
Here is what I got:
First, run the strongly connected components algorithm. Now we have the strongly connected components. Now go to each strongly connected component and within that SCC make a simple cycle; that is, a cycle where the only nodes that are repeated are the start/finish nodes. This will minimize the edges within each SCC.
Now, we need to minimize the edges between the SCCs. Alas, I can't think of a way of doing this.
My 2 questions are: (1) Does the algorithm prior to the part about minimizing edges between SCCs sound right? (2) How does one go about minimizing the edges between SCCs.
For (2), I know that this is equivalent to minimizing the number of edges in a DAG. (Think of the SCCs as the vertices). But this doesn't seem to help me.
The algorithm seems right, as long as you allow for closed walks (i.e. repeating vertices.) Proper cycles might not exist (e.g. in an "8" shaped component) and finding them is NP-hard.
It seems that it is sufficient to group the inter-component edges by ordered pairs of components they connect and leave only one edge in each group.
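The grouping rule from this answer can be sketched in Python. The names are assumptions for illustration: `condense_inter_component_edges` is hypothetical, and the components are found with Kosaraju's algorithm, which is just one of several ways to compute SCCs. The sketch only handles the edges between components; edges inside each component are not touched.

```python
def condense_inter_component_edges(nodes, edges):
    """Return (component label per node, one kept edge per ordered component pair)."""
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for a, b in edges:
        succ[a].append(b)
        pred[b].append(a)

    def dfs(start, graph, seen, out):
        # Iterative DFS that appends vertices to `out` in post-order.
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                stack.pop()
                out.append(node)

    # Kosaraju: record finish order on G, then sweep the reversed graph.
    order, seen = [], set()
    for n in nodes:
        if n not in seen:
            dfs(n, succ, seen, order)
    comp, seen = {}, set()
    for n in reversed(order):
        if n not in seen:
            members = []
            dfs(n, pred, seen, members)
            for m in members:
                comp[m] = n  # label the component by the vertex that found it

    kept = {}
    for a, b in edges:
        if comp[a] != comp[b]:
            kept.setdefault((comp[a], comp[b]), (a, b))  # first edge per pair wins
    return comp, list(kept.values())

# Two components {a, b} and {c, d} joined by two parallel inter-component edges.
comp, kept = condense_inter_component_edges(
    ["a", "b", "c", "d"],
    [("a", "b"), ("b", "a"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "c")],
)
```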
Regarding step 2 (minimizing the edges between the SCCs), you could randomly select a vertex and run DFS, keeping only the longest path for each pair of (root, end) while removing the other paths. Store all the vertices searched in a list L.
Choose another vertex, if it exists in L, skip to the next vertex; if not, repeat the procedure above.

minimum connected subgraph containing a given set of nodes

I have an unweighted, connected graph. I want to find a connected subgraph that definitely includes a certain set of nodes, and as few extras as possible. How could this be accomplished?
Just in case, I'll restate the question using more precise language. Let G(V,E) be an unweighted, undirected, connected graph. Let N be some subset of V. What's the best way to find the smallest connected subgraph G'(V',E') of G(V,E) such that N is a subset of V'?
Approximations are fine.
This is exactly the well-known NP-hard Steiner Tree problem. Without more details on what your instances look like, it's hard to give advice on an appropriate algorithm.
I can't think of an efficient algorithm to find the optimal solution, but assuming that your input graph is dense, the following might work well enough:
Convert your input graph G(V, E) to a weighted graph G'(N, D), where N is the subset of vertices you want to cover and D is distances (path lengths) between corresponding vertices in the original graph. This will "collapse" all vertices you don't need into edges.
Compute the minimum spanning tree for G'.
"Expand" the minimum spanning tree by the following procedure: for every edge d in the minimum spanning tree, take the corresponding path in graph G and add all vertices (including endpoints) on the path to the result set V' and all edges in the path to the result set E'.
This algorithm is easy to trip up to give suboptimal solutions. Example case: equilateral triangle where there are vertices at the corners, in midpoints of sides and in the middle of the triangle, and edges along the sides and from the corners to the middle of the triangle. To cover the corners it's enough to pick the single middle point of the triangle, but this algorithm might choose the sides. Nonetheless, if the graph is dense, it should work OK.
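A sketch of the collapse / spanning tree / expand procedure above for an unweighted graph, using BFS for the distances and Prim's algorithm on the collapsed graph. The function names and the adjacency-list input are assumptions for illustration. The example is the triangle described above, with `AB`, `BC`, `CA` as the side midpoints and `M` as the middle vertex.

```python
from collections import deque

def bfs_parents(adj, source):
    """Parent pointers of a BFS tree rooted at source (unweighted shortest paths)."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def path_to(parent, target):
    """Vertices on the BFS path, listed from target back to the BFS source."""
    path = []
    while target is not None:
        path.append(target)
        target = parent[target]
    return path

def steiner_via_closure(adj, terminals):
    """Spanning tree over the terminals' distance graph, expanded back into paths."""
    terminals = list(terminals)
    trees = {t: bfs_parents(adj, t) for t in terminals}

    def dist(s, t):
        return len(path_to(trees[s], t)) - 1

    in_tree = {terminals[0]}  # Prim's algorithm on the collapsed graph
    vertices, edges = set(in_tree), set()
    while len(in_tree) < len(terminals):
        s, t = min(((s, t) for s in in_tree for t in terminals if t not in in_tree),
                   key=lambda p: dist(*p))
        in_tree.add(t)
        path = path_to(trees[s], t)  # expand the collapsed edge
        vertices.update(path)
        edges.update(frozenset(e) for e in zip(path, path[1:]))
    return vertices, edges

adj = {
    "A": ["M", "AB", "CA"], "B": ["M", "AB", "BC"], "C": ["M", "BC", "CA"],
    "AB": ["A", "B"], "BC": ["B", "C"], "CA": ["C", "A"],
    "M": ["A", "B", "C"],
}
vertices, edges = steiner_via_closure(adj, ["A", "B", "C"])
```

With `M` listed first in each corner's adjacency list, BFS happens to route every corner-to-corner path through the middle, so the result is the optimal four vertices. With the midpoints listed first, the same code would pick the sides, which is the suboptimal case described above.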
The easiest solutions will be the following:
a) based on mst:
- initially, all nodes of V are in V'
- build a minimum spanning tree of the graph G(V,E) - call it T.
- loop: for every leaf v in T that is not in N, delete v from V' and from T.
- repeat the loop until all leaves in T are in N.
b) another solution is the following - based on shortest paths tree.
- pick any node in N, call it v, let v be a root of a tree T = {v}.
- remove v from N.
loop:
1) select the shortest path between any node in T and any node in N. Call this shortest path p: {v, ... , u}, where v is in T and u is in N.
2) every node in p is added to V'.
3) every node in p and in N is deleted from N.
--- repeat loop until N is empty.
At the beginning of the algorithm: compute all shortest paths in G using any known efficient algorithm.
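Variant (a) above, pruning leaves that are not required from a spanning tree, can be sketched as follows. The name `prune_spanning_tree` and the adjacency-list input are assumptions for illustration. Since the graph is unweighted, any spanning tree is a minimum spanning tree, so the sketch simply builds one by graph traversal.

```python
def prune_spanning_tree(adj, required):
    """Build a spanning tree, then repeatedly strip leaves that are not required."""
    root = next(iter(required))
    tree = {root: set()}  # tree adjacency: vertex -> set of tree neighbours
    stack = [root]
    while stack:  # traversal that records one tree edge per newly found vertex
        u = stack.pop()
        for v in adj[u]:
            if v not in tree:
                tree[v] = {u}
                tree[u].add(v)
                stack.append(v)

    leaves = [n for n in tree if len(tree[n]) <= 1 and n not in required]
    while leaves:
        leaf = leaves.pop()
        for nb in tree.pop(leaf):
            tree[nb].discard(leaf)
            if len(tree[nb]) <= 1 and nb not in required:
                leaves.append(nb)  # the neighbour has just become a prunable leaf
    return set(tree)

# Path a-b-c-d-e with an extra branch c-x; only b and d are required.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "x"],
       "d": ["c", "e"], "e": ["d"], "x": ["c"]}
kept = prune_spanning_tree(adj, {"b", "d"})
```

The result depends on which spanning tree the traversal happens to build, so on graphs with cycles this is a heuristic and may keep more vertices than necessary.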
Personally, I used this algorithm in one of my papers, but it is more suitable for distributed environments.
Let N be the set of nodes that we need to interconnect. We want to build a minimum connected dominating set of the graph G, and we want to give priority to nodes in N.
We give each node u a unique identifier id(u). We let w(u) = 0 if u is in N, otherwise w(u) = 1.
We create a pair (w(u), id(u)) for each node u.
Each node u builds a multipoint relay set. That is, a set M(u) of 1-hop neighbors such that each 2-hop neighbor is a neighbor of at least one node in M(u). [The smaller M(u), the better the solution.]
u is in V' if and only if:
u has the smallest pair (w(u), id(u)) among all its neighbors.
or u is selected in M(v), where v is a 1-hop neighbor of u with the smallest pair (w(v), id(v)).
-- the trick when you execute this algorithm in a centralized manner is to be efficient in computing 2-hop neighbors. The best I could do was to improve it from O(n^3) to O(n^2.37) by matrix multiplication.
-- I really wish to know what the approximation ratio of this last solution is.
I like this reference for Steiner tree heuristics:
The Steiner Tree Problem, by Frank Hwang, Dana Richards, and Pawel Winter
You could try to do the following:
Create a minimal vertex-cover for the desired nodes N.
Collapse these, possibly unconnected, sub-graphs into "large" nodes. That is, for each sub-graph, remove it from the graph, and replace it with a new node. Call this set of nodes N'.
Do a minimal vertex-cover of the nodes in N'.
"Unpack" the nodes in N'.
Not sure whether or not it gives you an approximation within some specific bound or so. You could perhaps even trick the algorithm to make some really stupid decisions.
As already pointed out, this is the Steiner tree problem in graphs. However, an important detail is that all edges should have weight 1. Because |V'| = |E'| + 1 for any Steiner tree (V',E'), this achieves exactly what you want.
For solving it, I would suggest the following Steiner tree solver (to be transparent: I am one of the developers):
https://scipjack.zib.de/
For graphs with a few thousand edges, you will usually get an optimal solution in less than 0.1 seconds.

Resources