Directed graph with max indegree of a vertex - algorithm

I was looking at a few applications of network flow when I came across this problem:
We begin with a directed graph, G = (V,E). We need to add edges to the graph so that \forall distinct u,v \in V, exactly one of (u -> v) and (v -> u) is an edge. That is, every pair of vertices must end up connected by an edge in one direction or the other, but not both, so in total we will have |V|(|V|-1)/2 edges. While we build this graph, we need to ensure that the indegree of a given vertex, say w, is the maximum among all the vertices of the graph (if that is possible, given the original graph). Note that we cannot change the orientation of the edges in the original graph.
I am trying to solve it using network flow by building a network without vertex w (and with 2 new vertices for the source, s, and the sink, t). But I'm not sure how to choose the capacities and flow directions in the new graph so that the problem reduces to network flow and yields the edge orientations. Maybe this approach is wrong, but I mention it in case someone can take a hint from it.

When attacking this kind of problem, I tend to write down a mathematical program and then massage it. Clearly, we should orient all missing edges involving w toward w. Let k be the resulting in-degree of w. For all distinct i, j, let x_{ij} = 1 if arc i->j appears in the solution and let x_{ij} = 0 if arc j->i appears. Every other vertex j must then have in-degree at most k.
forall j. sum_i x_{ij} <= k
forall i <> j. x_{ij} = 1 - x_{ji}
forall i <> j. x_{ij} in {0, 1}
Rewrite to use x_{ij} only if i < j.
(*) forall j. sum_{i<j} x_{ij} + sum_{i>j} (1-x_{ji}) <= k
forall i < j. x_{ij} in {0, 1}
Now (*) begins to resemble conservation constraints, as each variable appears once negatively and once positively. Let's change the inequality to an equality.
(*) forall j. x_{sj} + sum_{i<j} x_{ij} + sum_{i>j} (1-x_{ji}) = k
              ^^^^^^                                           ^
forall i < j. x_{ij} in {0, 1}
forall j. x_{sj} >= 0
^^^^^^^^^^^^^^^^^^^^^
We're almost all the way to a flow LP -- we just need to clean out the constants 1 and k. I'll let you handle the rest (it involves introducing t).
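One standard way to finish the reduction is sketched below in Python. It is a sketch under assumptions, not necessarily the network the answer has in mind: vertices are numbered 0..n-1, `arcs` is a set of (tail, head) pairs, and the function name is made up for illustration. Each still-unoriented pair receives one unit of flow from s and passes it to the endpoint that gets the arc; vertex j forwards at most k - indeg(j) units to t. Because every pair carries exactly one unit, the max flow is computed with simple augmenting paths, as in bipartite matching.

```python
from itertools import combinations

def orient_for_max_indegree(n, arcs, w):
    """Complete the digraph on vertices 0..n-1 so that w has maximum
    in-degree. Returns the full arc set, or None if that is impossible."""
    arcs = set(arcs)
    # Orient every missing pair involving w toward w.
    for u in range(n):
        if u != w and (u, w) not in arcs and (w, u) not in arcs:
            arcs.add((u, w))
    indeg = [0] * n
    for _, v in arcs:
        indeg[v] += 1
    k = indeg[w]
    # cap[j]: capacity of edge j -> t, i.e. how many more arcs j may receive.
    cap = [k - indeg[j] for j in range(n)]
    if any(cap[j] < 0 for j in range(n) if j != w):
        return None
    # One unit of flow s -> pair -> endpoint for each still-unoriented pair.
    free = [p for p in combinations(range(n), 2)
            if (p[0], p[1]) not in arcs and (p[1], p[0]) not in arcs]
    head = {}                          # pair -> endpoint receiving its arc
    assigned = [[] for _ in range(n)]  # vertex -> pairs currently aimed at it

    def augment(p, seen):
        """Find an augmenting path that routes pair p to some endpoint."""
        for v in p:
            if v in seen:
                continue
            seen.add(v)
            if len(assigned[v]) < cap[v]:
                assigned[v].append(p)
                head[p] = v
                return True
            for q in list(assigned[v]):
                if augment(q, seen):   # q was rerouted to its other endpoint
                    assigned[v].remove(q)
                    assigned[v].append(p)
                    head[p] = v
                    return True
        return False

    for p in free:
        if not augment(p, set()):
            return None                # max flow < number of free pairs
    for i, j in free:
        arcs.add((j, i) if head[(i, j)] == i else (i, j))
    return arcs
```

The orientation exists exactly when every free pair can be routed, i.e. when the max flow saturates all source edges.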

Related

Why, when we change the cost of every edge in G to c' = log17(c), is every MST in G still an MST in G′ (and vice versa)?

Remarks: c' is the logarithm of c with base 17.
MST means minimum spanning tree.
It is easy to prove the conclusion when we transform the cost of every edge with a linear function.
But log is not a linear function, so I could not understand why the conclusion holds.
Supplementary notes:
I did not consider specific algorithms, such as the greedy algorithm. I simply considered the relationship between the sums of the weights of the two trees after the transformation.
Numerically, if (a + b) > (c + d), then (log a + log b) may not be > (log c + log d).
Suppose one spanning tree of G has two edges a and b, another spanning tree of G has c and d, a + b < c + d, and the first tree is an MST. Then in the transformed graph G', the sum of the edge weights of the second tree might be smaller.
Because of this, I wanted to construct a counterexample based on "if (a + b) > (c + d), then (log a + log b) may not be > (log c + log d)", but I failed.
One way to characterize when a spanning tree T is a minimum spanning tree is that, for every edge e not in T, the cycle formed by e and edges of T (the fundamental cycle of e with respect to T) has no edge more expensive than e. Using this characterization, I hope you see how to prove that transforming the costs with any increasing function preserves minimum spanning trees.
There's a one line proof that this condition is necessary. If the fundamental cycle contained a more expensive edge, we could replace it with e and get a spanning tree that costs less than T.
It's less obvious that this condition is sufficient, since at first glance it looks like we're trying to prove global optimality from a local optimality condition. To prove this statement, let T be a spanning tree that satisfies the condition, let T' be a minimum spanning tree, and let G' be the graph whose edges are the union of the edges of T and T'. Run Kruskal's algorithm on G', breaking ties by favoring edges in T over edges not in T. Let T'' be the resulting minimum spanning tree in G'. Since T' is a spanning tree in G', the cost of T'' is not greater than T', hence T'' is a minimum spanning tree in G as well as G'.
Suppose to the contrary that T'' ≠ T. Then there exists an edge in T but not in T''. Let e be the first such edge considered by Kruskal's algorithm. At the time that e was considered, it formed a cycle C in the edges that had been selected from T''. Since T is acyclic, C \ T is nonempty. By the tie breaking criterion, we know that every edge in C \ T costs less than e. Observing that some edge e' in C \ T must have one endpoint in each of the two connected components of T \ {e}, we infer that the fundamental cycle of e' with respect to T contains e, which violates the local optimality condition. In conclusion, T = T'', hence is a minimum spanning tree in G.
If you want a deeper dive, this logic gets abstracted out in the theory of matroids.
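A quick way to see the claim in action is to run Kruskal's algorithm twice, once on the original costs and once on their base-17 logarithms. The Python sketch below does that on a small made-up graph; the function name, the edge list and the tie-breaking by edge index are illustrative assumptions, not part of the original answers.

```python
import math

def kruskal(n, edges, key):
    """Return the indices of the edges Kruskal's algorithm selects when
    edges (u, v, cost) are sorted by key(cost), ties broken by index."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = set()
    order = sorted(range(len(edges)), key=lambda i: (key(edges[i][2]), i))
    for i in order:
        u, v, _ = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two components: keep it
            parent[ru] = rv
            chosen.add(i)
    return chosen

# Hypothetical example graph on 4 vertices with positive costs.
edges = [(0, 1, 4.0), (1, 2, 9.0), (0, 2, 10.0), (2, 3, 2.0), (1, 3, 20.0)]
original = kruskal(4, edges, key=lambda c: c)
transformed = kruskal(4, edges, key=lambda c: math.log(c, 17))
```

Both runs sort the edges in the same order, because the logarithm is increasing, so they select the same edges.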
Well, it's pretty easy to understand. Let's see if I can break it down for you:
c` = log_17(c) // here 17 is the base
log may not be a linear function, but we can say that:
log_b(x) > log_b(y) if x > y and b > 1 (and of course x > 0 and y > 0)
In words: for a base b > 1, log_b(x) is greater than log_b(y) whenever x > y.
So, if we apply this rule to the edge costs of G, the edges that were selected for an MST of G still form a minimum-cost spanning tree of G' when c' = log_17(c) // here 17 is the base.
UPDATE: Since the proof seems unclear, I'm elaborating a bit:
I guess you know that MST construction is greedy. We're going to use Kruskal's algorithm to prove why this is correct. (In case you don't know how Kruskal's algorithm works, you can read about it somewhere or just google it; you'll find plenty of resources.) Now, let me write some steps of Kruskal's edge selection for the MST of G:
// the following edges are sorted by cost..i.e. c_0 <= c_1 <= c_2 ....
c_0: A, F // here, edge c_0 connects A, F, we've to take the edge in MST
c_1: A, B // it is also taken to construct MST
c_2: B, R // it is also taken to construct MST
c_3: A, R // we won't take it to construct to MST, cause (A, R) already connected through A -> B -> R
c_4: F, X // it is also taken to construct MST
...
...
so on...
Now, when constructing MST of G', we've to select edges which are in the form c' = log_17(c) // where 17 is base
Now, if we convert the edges using log of base 17, then c_0 becomes c_0', c_1 becomes c_1' and so on...
But we, know that:
log_b(x) > log_b(y) if x > y and b > 1 (and of course x > 0 and y > 0)
So, we may say that,
log_17(c_0) <= log_17(c_1), cause c_0 <= c_1
in general,
log_17(c_i) <= log_17(c_j), where i <= j
And now, we may say:
c_0` <= c_1` <= c_2` <= c_3` <= ....
So, the edge selection process to construct MST of G' would be:
// the following edges are sorted by cost..i.e. c_0` <= c_1` <= c_2` ....
c_0`: A, F // here, edge c_0` connects A, F, we've to take the edge in MST
c_1`: A, B // it is also taken to construct MST
c_2`: B, R // it is also taken to construct MST
c_3`: A, R // we won't take it to construct to MST, cause (A, R) already connected through A -> B -> R
c_4`: F, X // it is also taken to construct MST
...
...
so on...
Which is the same as the MST of G.
That ultimately proves the theorem.
I hope you get it. If not, ask me in the comments what is not clear to you.

Disconnect all vertices in a graph - Algorithm

I am looking for an algorithm that finds a minimal subset of vertices such that removing this subset (and the edges incident to these vertices) from the graph leaves all other vertices unconnected (i.e. the graph won't have any edges).
Is there such an algorithm?
If not, could you recommend some kind of heuristic for choosing the vertices?
I have a basic knowledge of graph theory so please excuse any incorrectness.
IIUC, this is the classic Minimum Vertex Cover problem, which is, unfortunately, NP Complete.
Fortunately, the most intuitive and greedy possible algorithm is as good as it gets in this case.
A simple greedy algorithm (repeatedly pick an uncovered edge and add both of its endpoints) is a 2-approximation for vertex cover, which in theory, under the Unique Games Conjecture, is as good as it gets. In practice, solving a formulation of vertex cover as an integer program will most likely yield much better results. The program is
min sum_{v in V} x(v)
s.t.
forall {u, v} in E, x(u) + x(v) >= 1
forall v in V, x(v) in {0, 1}.
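For reference, the classic 2-approximation for vertex cover takes any edge that has neither endpoint chosen yet and adds both endpoints. A minimal Python sketch, assuming the graph is given as a list of (u, v) pairs without self-loops (the function name is made up for illustration):

```python
def vertex_cover_2approx(edges):
    """Return a vertex cover at most twice the minimum size.

    The edges picked below form a maximal matching, and any cover must
    contain at least one endpoint of each matched edge.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))   # take both endpoints of an uncovered edge
    return cover

# Path 0 - 1 - 2 - 3: the minimum cover is {1, 2}.
path_edges = [(0, 1), (1, 2), (2, 3)]
cover = vertex_cover_2approx(path_edges)
```

On this path the sketch returns all four vertices, which is exactly twice the optimum, so the factor 2 is tight for this method.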
Try this greedy heuristic (it is not guaranteed to find a minimum set):
Define a variable to count the number of vertices, starting at 0;
Create a max-heap of vertices ordered by the length of each vertex's adjacency list;
Remove all edges of the first vertex in the heap (the one with the largest number of edges) and remove it from the heap, adding 1 to the count;
Reorder the heap now that the number of edges of the vertices has changed, and repeat the previous step until the adjacency list of the first vertex has length 0;
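Heap Q
int count = 0
while (1) {
    Q = Create_Heap(G)          /* rebuild: degrees changed in the last round */
    Vertex first = Q.pop()
    if (first.adjacent.size() == 0) {
        break                   /* no edges are left */
    }
    /* iterate over a copy, because RemoveEdge modifies first.adjacent */
    for (Vertex v : copy(first.adjacent)) {
        RemoveEdge(first, v)
        RemoveEdge(v, first)    /* depends on the implementation */
    }
    count = count + 1
}
return count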

Finding a New Minimum Spanning Tree After a New Edge Was Added to The Graph

Let G = (V, E) be a weighted, connected and undirected graph and let T be a minimum spanning tree. Let e be any edge not in E (and has a weight W(e)).
Prove or disprove:
T U {e} is an edge set that contains a minimum spanning tree of G' = (V, E U {e}).
Well, it sounds true to me, so I decided to prove it but I just get stuck every time...
For example, if e is the new edge with minimum weight, who can promise us that the edges in T weren't chosen in a bad way that would prevent us from obtaining a new minimum weight without the 'help' of other edges in E - T ?
I would appreciate any help,
Thanks in advance.
Let [a(1), a(2), ..., a(n-1)] be a sequence of edges selected from E to construct MST of G by Kruskal's algorithm (in the order they were selected - weight(a(i)) <= weight(a(i + 1))).
Let's now consider how Kruskal's Algorithm behaves being given as input E' = E U {e}.
Let i = min{i: weight(e) < weight(a(i))}. First, the algorithm chooses edges [a(1), ..., a(i - 1)] (e hasn't been processed yet, so it behaves the same). Then it needs to decide on e. If e is dropped, the solution for E' will be the same as for E. So let's suppose that the first i edges selected by the algorithm are [a(1), ..., a(i - 1), e]; I will call this new sequence a'. The algorithm continues, and as long as its following selections (for j > i) satisfy a'(j) = a(j - 1), we are fine. There are two scenarios that break this streak (let's say the streak breaks at index k + 1):
1) Algorithm selects some edge e' that is not in T, and weight(e') < weight(a(k+1)). By now a' sequence is:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k), e']
But if it was possible to append e' to this list it would be also possible to append it to [a(1), ..., a(k-1), a(k)]. But Kruskal's algorithm didn't do it when looking for MST for G. That leads to contradiction.
2) Algorithm politely selected:
[a(1), ..., a(i-1), e, a(i), a(i+1), ..., a(k-1), a(k)]
but decided to drop edge a(k+1). But if e were not present in the list, the algorithm would append a(k+1). That means that in the graph (V, {a(1), ..., a(k)}) edge a(k+1) connects the same components as edge e. So after the algorithm has considered a(k+1), the division into connected components (determined by the set of selected edges) is the same for both G and G'. Hence after processing a(k+1) the algorithm proceeds in the same way in both cases.
Whenever an edge is added to a graph without adding a node, that edge creates a cycle with the minimum spanning tree of the graph. The cycle length may vary from 2 to n, where n = number of nodes in the graph.
T = minimum spanning tree of G
Now, to find the MST for (T + added edge), we just have to remove one edge from that cycle, so remove the edge with the maximum weight.
So T' always comes from T U {e}.
And if you think this doesn't prove that the new MST will be a subset of T U {e}, then analyse Kruskal's algorithm on the new graph: if e has minimum weight, Kruskal's algorithm must select it for the MST, and likewise here, if it is the minimum, it cannot be the edge removed from the cycle.
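The cycle argument can be turned into a short update routine: find the tree path between the endpoints of the new edge, and swap out the heaviest edge on that path if the new edge is lighter. The Python sketch below assumes vertices 0..n-1, a spanning tree given as a list of (u, v, weight) tuples, and a new edge with distinct endpoints; the function name is hypothetical.

```python
def update_mst(n, tree, new_edge):
    """Given an MST `tree` of the old graph, return an MST of the graph
    obtained by adding `new_edge` = (u, v, weight)."""
    u, v, w = new_edge
    adj = {x: [] for x in range(n)}
    for e in tree:
        a, b, _ = e
        adj[a].append((b, e))
        adj[b].append((a, e))
    # Depth-first search from u, remembering how each vertex was reached.
    parent = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        for y, e in adj[x]:
            if y not in parent:
                parent[y] = (x, e)
                stack.append(y)
    # Walk back from v to u to collect the edges of the tree path,
    # which together with new_edge form the cycle.
    path = []
    x = v
    while parent[x] is not None:
        x, e = parent[x]
        path.append(e)
    heaviest = max(path, key=lambda e: e[2])
    if heaviest[2] <= w:
        return list(tree)              # the new edge does not help
    return [e for e in tree if e != heaviest] + [new_edge]
```

For example, with the tree 0-1 (weight 1), 1-2 (weight 5), 2-3 (weight 2) and a new edge 0-2 of weight 3, the edge 1-2 is dropped in favour of the new edge.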

Path finding algorithm on graph considering both nodes and edges

I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm for general graphs, your problem is NP-hard. Assume the path length threshold is n-1, every edge has weight 1, and each vertex has value 1. If you can solve your problem, you can decide whether the given graph has a Hamiltonian path: if the maximum-value path has value n, then you have a Hamiltonian path. I think you can use something like the Held-Karp relaxation to find a good solution.
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
    d[i][i][0] = value[i]
for e:(u, v) in edges:
    d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T:
    for k in nodes:
        for t' in 1..t-1:
            for i in nodes:
                for j in nodes:
                    d[i][j][t] = max(d[i][j][t],
                                     d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
Integer program (this may be a good idea or maybe not):
For each vertex v, let x_v be 1 if vertex v is visited and 0 otherwise. For each arc a, let y_a be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize ∑_v value(v) x_v .
The constraints are
∑_a weight(a) y_a ≤ threshold
∀v, ∑_{a has head v} y_a - ∑_{a has tail v} y_a = {-1 if v = s; 1 if v = t; 0 otherwise} (conserve flow)
∀v ≠ s, x_v ≤ ∑_{a has head v} y_a (must enter a vertex to visit)
∀v, x_v ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t}, ∀cuts S that separate vertex v from {s, t}, x_v ≤ ∑_{a such that tail(a) ∉ S ∧ head(a) ∈ S} y_a (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints are exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of a node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest path problem.

Algorithm for finding distinct paths from A to B in weighted, directed, cyclic graph

Suppose we have a DIRECTED, WEIGHTED and CYCLIC graph.
Suppose we are only interested in paths with a total weight of less than MAX_WEIGHT
What is the most appropriate (or any) algorithm to find the number of distinct paths between two nodes A and B that have a total weight of less than MAX_WEIGHT?
P.S: It's not my homework. Just personal, non-commercial project.
If the number of nodes and MAX_WEIGHT aren't too large (and all weights are integers), you can use dynamic programming
unsigned long long int num_of_paths[MAX_WEIGHT+1][num_nodes];
initialize to 0, except num_of_paths[0][start] = 1;.
for(w = 0; w < MAX_WEIGHT; ++w){
    for(n = 0; n < num_nodes; ++n){
        if (num_of_paths[w][n] > 0){
            /* for each child c of node n
             *     if w + weight(n->c) <= MAX_WEIGHT
             *         num_of_paths[w+weight(n->c)][c] += num_of_paths[w][n];
             */
        }
    }
}
solution is sum of num_of_paths[w][target], 0 <= w <= MAX_WEIGHT .
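A runnable version of the same table-filling idea is sketched below in Python, under stated assumptions: edges are given as (u, v, weight) tuples with positive integer weights, nodes are numbered 0..num_nodes-1, and the function name is made up. It counts walks, so a route that revisits nodes through a cycle is counted as its own path, and it uses the bound "<= max_weight" as in the table above.

```python
def count_walks(num_nodes, edges, start, target, max_weight):
    """Count walks from start to target with total weight <= max_weight.

    ways[w][n] is the number of walks from start to n of weight exactly w.
    Positive weights guarantee that row w is final before it is read.
    """
    ways = [[0] * num_nodes for _ in range(max_weight + 1)]
    ways[0][start] = 1
    for w in range(max_weight):
        for u, v, wt in edges:
            if ways[w][u] and w + wt <= max_weight:
                ways[w + wt][v] += ways[w][u]
    return sum(ways[w][target] for w in range(max_weight + 1))

# Hypothetical graph: a 2-cycle 0 <-> 1 plus an exit arc 1 -> 2.
demo_edges = [(0, 1, 1), (1, 0, 1), (1, 2, 1)]
```

For a strict bound ("less than MAX_WEIGHT"), call it with max_weight reduced by one.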
Simple recursion, which runs in exponential time. Obviously, no zero-weight cycles are allowed.
function noe(node N, weight limit W)
start the count at 1 if N is the target B and at 0 otherwise;
if all outgoing edges have weight > W, that is the result;
otherwise add noe(C1, W-W1) + noe(C2, W-W2) + ... + noe(Cn, W-Wn), where C1 ... Cn are the nodes connected to N by an outgoing edge whose weight Wi satisfies W-Wi >= 0. Write it in your favorite language.
A more efficient solution should exist, along the lines of Dijkstra's algorithm, but I think this is enough for homework.
Your problem is a more general case of the k-disjoint paths problem in directed planar graphs, with k not fixed.
The k-disjoint paths problem for directed planar graphs is as follows:
given: a directed planar graph G = (V, E) and k pairs (r1, s1), ..., (rk, sk) of vertices of G;
find: k pairwise vertex-disjoint directed paths P1, ..., Pk in G, where Pi runs from ri to si (i = 1, ..., k).
For k-disjoint paths, you can draw an arc from every si to B, and also an arc from A to every ri; this way you create a graph G' from G.
Now if you could solve your problem on G' in polynomial time, you could solve k-disjoint paths on G, so P = NP.
But if you read the linked paper, it gives some ideas for general graphs (solving k-disjoint paths with fixed k), and you can use it to get a good approximation.
There is also a more complicated algorithm that solves this problem in polynomial time (for fixed k) in general graphs, but on the whole it's not easy (it's by Seymour).
So your best choice currently is to use brute-force algorithms.
Edit: Because MAX_WEIGHT is independent of your input size (your graph size), it doesn't affect this problem. Also, because the problem is NP-hard for undirected unweighted graphs, you can still conclude that it's NP-hard.
