Min s-t cut in a network - algorithm

I am trying to simulate a network of wireless sensor nodes in order to study the robustness of the network. I am faced with the following problem:
I have a network of nodes with some edge capacities. This is equivalent to a network flow problem in algorithms. There is a source node (which detects certain events) and a sink node (my base station). Now, I want to find the minimum s-t cut in the network such that the size of the source set is minimized. The source set here refers to the set of nodes, separated by the min s-t cut, that contains the source.
For example, if the s-t cut is C = {S, T}, then there is a set of edges whose removal separates the network into two sets, S and T, where S contains the source and T contains the sink. The cut is minimum when the sum of the capacities of the edges in the cut is minimum among all possible s-t cuts. There can be several such min-cuts. I need to find a min-cut that has the fewest elements in the set S.
Note that this is not the original problem but I have tried to simplify it in order to express it in terms of algorithms.

I believe that you can solve this problem by finding a minimum cut in a graph with slightly modified capacities. The idea is as follows: since the cost of a cut is equal to the total capacity crossing the cut, we could try modifying the graph by adding an extra edge of capacity one from each node in the graph to t. Intuitively, every node on the same side of the cut as s would then contribute one extra unit to the total cost of the cut, because the edge from that node to t would cross the cut. Of course, on its own this would mess up the actual min-cut because of the extra capacity.
To fix this, we apply the following transformation: first, multiply the capacities of the original edges by n, where n is the number of nodes in the graph. Then add the capacity-one edges from each node to t. The intuition here is that by multiplying the edge capacities by n, we've made it so that the cost of any cut (ignoring the new edges from each node to t) is n times its original cost, so the costs of two cuts that differed originally now differ by at least n (assuming integer capacities). When we then add in the extra capacity-one edges from each node to t, the maximum possible contribution these edges can make to the cost of a cut is n - 1 (if every node in the graph except for t is on the same side as s). Thus if the cost of the old min-cut was C, the cost of the new min-cut (S, V - S) is nC + |S|, where |S| is the number of nodes on the same side of the cut as s.
More formally, the construction is as follows. Given a directed, capacitated graph G and a (source, sink) pair (s, t), construct the graph G' by doing the following:
For each edge (u, v) in the graph, multiply its capacity by n.
For each node v in the graph, add a new edge (v, t) with capacity 1.
Compute a min s-t cut in the resulting graph G'.
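The construction above can be sketched in Python as follows. This is a minimal sketch, not a tuned implementation: it assumes integer capacities, nodes numbered 0 to n-1, and a dense capacity matrix, and the helper names (max_flow_residual, source_side, smallest_source_side_min_cut) are made up for illustration. The max flow is computed with a plain Edmonds-Karp loop, and the source side of a min cut of G' is read off as the set of nodes reachable from s in the residual graph.

```python
from collections import deque

def max_flow_residual(n, cap, s, t):
    """Run Edmonds-Karp on a dense capacity matrix and return the residual matrix."""
    res = [row[:] for row in cap]
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return res  # no augmenting path left: the flow is maximum
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            res[parent[v]][v] -= bottleneck
            res[v][parent[v]] += bottleneck
            v = parent[v]

def source_side(n, res, s):
    """Nodes reachable from s through arcs with positive residual capacity."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if v not in seen and res[u][v] > 0:
                seen.add(v)
                queue.append(v)
    return seen

def smallest_source_side_min_cut(n, cap, s, t):
    """Build G' (capacities times n, plus a unit edge v -> t) and cut it."""
    scaled = [[n * c for c in row] for row in cap]
    for v in range(n):
        if v != t:
            scaled[v][t] += 1
    return source_side(n, max_flow_residual(n, scaled, s, t), s)

# Chain 0 -> 1 -> 2 -> 3 with unit capacities: every prefix is a min cut of G.
chain = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
print(smallest_source_side_min_cut(4, chain, 0, 3))
```

On the chain every prefix {0}, {0, 1}, {0, 1, 2} is a min cut of G, and the construction picks the one with the fewest nodes on the source side.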
I claim that a min s-t cut in the graph G' corresponds to a min s-t cut in graph G with the fewest number of nodes on the same side of the cut as s. The proof is as follows. Let (S, V - S) be a min s-t cut in G'. First, we need to show that (S, V - S) is a min s-t cut in G. This proof is by contradiction; assume for the sake of contradiction that there is an s-t cut (S', V - S') whose cost in G is lower than the cost of (S, V - S). Let the cost of (S', V - S') in G be C' and let the cost of (S, V - S) in G be C. Now, let's consider the cost of these two cuts in G'. By construction, the cost of (S', V - S') would be nC' + |S'| (since each node on the S' side of the cut contributes one capacity across the cut) and the cost of (S, V - S) would be nC + |S|. Since we know that C' < C and the capacities are integers, we must have that C' + 1 ≤ C. Thus
nC + |S| ≥ n(C' + 1) + |S| = nC' + n + |S|
Now, note that 1 ≤ |S| < n and 1 ≤ |S'| < n, because the source side of an s-t cut always contains s and never contains t. This means that
nC + |S| ≥ nC' + n + |S| > nC' + |S'| + |S| > nC' + |S'|
But this means that the cost of (S, V - S) in G' is greater than the cost of (S', V - S') in G', contradicting the fact that (S, V - S) is a min s-t cut in G'. This allows us to conclude that any min s-t cut in G' is also a min s-t cut in G.
Now, we need to show that not only is a min s-t cut in G' also a min s-t cut in G, but it corresponds to a min s-t cut in G with the fewest number of nodes on the same side of the cut as s. Again, this proof is by contradiction; suppose that (S, V - S) is a min s-t cut in G' but that there is some min s-t cut in G with fewer nodes on the s side of the cut. Call this better cut (S', V - S'). Since (S, V - S) is a min s-t cut in G', it's also a min s-t cut in G, so the cost of both (S', V - S') and (S, V - S) in G is some number C. Then the costs of (S, V - S) and (S', V - S') in G' will be nC + |S| and nC + |S'|, respectively. We know that nC + |S'| < nC + |S|, since we've chosen (S', V - S') so that |S'| < |S|. But this means that (S', V - S') has a lower cost in G' than (S, V - S), contradicting the fact that (S, V - S) is a min s-t cut in G'. Thus our assumption was wrong and (S, V - S) is a min s-t cut in G with the fewest number of nodes on the same side as s. This completes the correctness proof of the construction.
Hope this helps!

tl;dr Compute a max s-t flow and let S be the set of nodes reachable from s by arcs of positive residual capacity.
Proof of correctness: clearly S is a min s-t cut (cut = set of nodes in the part containing s). Suppose that S* is an s-t cut smaller than S (i.e., |S*| < |S|). Since |S*| < |S|, there is a node u in S - S*. If we add a positive capacity arc from u to t, then the computed flow has an augmenting path and is no longer maximum, but the capacity of the cut S* is unchanged, since u and t both belong to V - S*. We conclude by weak duality that S* is not a min cut.
In fact, the class of s-t min cuts is a distributive lattice under intersection and union, so every instance of your problem has a unique solution.
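A minimal sketch of the tl;dr, assuming the network is given as a list of (u, v, capacity) triples with hashable node labels. The function name min_cut_source_side is made up for illustration, and the max flow is computed with a simple Edmonds-Karp loop rather than any particular library routine.

```python
from collections import defaultdict, deque

def min_cut_source_side(edges, s, t):
    """Return the set S of nodes reachable from s in the residual graph
    of a maximum s-t flow (computed here with Edmonds-Karp)."""
    res = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        res[u][v] += c
        res[v][u] += 0  # make sure the reverse arc exists with capacity 0

    def reachable():
        # BFS over arcs with positive residual capacity; returns parent links.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable()
        if t not in parent:
            # No augmenting path: the flow is maximum and the visited
            # nodes form the source side of a min cut.
            return set(parent)
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck

edges = [("s", "a", 5), ("a", "t", 1), ("s", "b", 1), ("b", "t", 5)]
print(min_cut_source_side(edges, "s", "t"))
```

In this small network the only min cut has capacity 2 and separates {s, a} from {b, t}, which is what the residual search returns.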

In your question and comment I think you say two different things. The first is finding a minimum s-t cut, i.e. one that separates the source and the sink and whose weight is minimum (the weight is the sum of the capacities of the removed edges). This can be done with the Ford-Fulkerson algorithm; here is a sample implementation in Java (Matlab also has a function graphmaxflow), and it's also available in the igraph library.
But in your comment and the first part of the question you ask for a min cut such that the number of nodes in the s part is minimized. In this case you should remove all edges of s to get groups of size 1 and n-1. Otherwise you should rephrase your question.

Related

Find the N highest cost vertices that have a path to S, where S is a vertex in an undirected Graph G

I would like to know, what would be the most efficient way (w.r.t., Space and Time) to solve the following problem:
Given an undirected Graph G = (V, E), a positive number N and a vertex S in V. Assume that every vertex in V has a cost value. Find the N highest cost vertices that are connected to S.
For example:
G = (V, E)
V = {v1, v2, v3, v4},
E = {(v1, v2),
(v1, v3),
(v2, v4),
(v3, v4)}
v1 cost = 1
v2 cost = 2
v3 cost = 3
v4 cost = 4
N = 2, S = v1
result: {v3, v4}
This problem can be solved easily by a graph traversal algorithm (e.g., BFS or DFS). To find the vertices connected to S, we can run either BFS or DFS starting from S. As the space and time complexity of BFS and DFS are the same (i.e., time complexity: O(V+E), space complexity: O(E) including the stored graph), here I am going to show the pseudocode using DFS:
Parameter Definition:
* G -> Graph
* S -> Starting node
* N -> Number of connected (highest cost) vertices to find
* Cost -> Array of size V, contains the vertex cost value
procedure DFS-traversal(G,S,N,Cost):
    let St be a stack
    let Q be a min-priority-queue containing <cost, vertex-id> pairs
    let discovered be an array (of size V) to mark already visited vertices
    St.push(S)
    // Comment: if you do not want to consider the case "S is connected to S"
    // then, you can consider commenting the following line
    Q.push(make-pair(Cost[S], S))
    label S as discovered
    while St is not empty
        v = St.pop()
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labeled as discovered:
                label w as discovered
                St.push(w)
                Q.push(make-pair(Cost[w], w))
                if Q.size() == N + 1:
                    Q.pop()
    let ret be an N sized array
    while Q is not empty:
        ret.append(Q.top().second)
        Q.pop()
    return ret
Let's describe the process first. Here, I run the iterative version of DFS to traverse the graph starting from S. During the traversal, I use a priority-queue to keep the N highest cost vertices that are reachable from S. Instead of the priority-queue, we can use a simple array (or even reuse the discovered array) to record the reachable vertices with their costs, and sort it at the end.
Analysis of space-complexity:
To store the graph: O(E)
Priority-queue: O(N)
Stack: O(V)
For labeling discovered: O(V)
So, as O(E) is the dominating term here, we can consider O(E) as the overall space complexity.
Analysis of time-complexity:
DFS-traversal: O(V+E)
To track N highest cost vertices:
By maintaining priority-queue: O(V*logN)
Or alternatively using array: O(V*logV)
The overall time-complexity would be: O(V*logN + E) or O(V*logV + E)
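For reference, the same idea can be sketched in Python with the standard heapq module. The function name top_n_reachable and the dict-based adjacency list are assumptions for this sketch. As in the pseudocode, S itself counts as a candidate.

```python
import heapq

def top_n_reachable(adj, cost, start, n):
    """Iterative DFS from start, keeping the n highest-cost reachable
    vertices in a min-heap of (cost, vertex) pairs."""
    heap = []  # never holds more than n entries

    def offer(v):
        heapq.heappush(heap, (cost[v], v))
        if len(heap) > n:
            heapq.heappop(heap)  # drop the cheapest candidate

    discovered = {start}
    stack = [start]
    offer(start)  # remove this line to exclude the start vertex itself
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in discovered:
                discovered.add(w)
                stack.append(w)
                offer(w)
    # Highest cost first.
    return [v for _, v in sorted(heap, reverse=True)]

adj = {"v1": ["v2", "v3"], "v2": ["v1", "v4"],
       "v3": ["v1", "v4"], "v4": ["v2", "v3"]}
cost = {"v1": 1, "v2": 2, "v3": 3, "v4": 4}
print(top_n_reachable(adj, cost, "v1", 2))  # ['v4', 'v3']
```

Each push or pop on the heap costs O(log N), which gives the O(V*logN + E) bound stated above.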

very hard and elegant question on shortest path

Given a weighted, connected and directed graph G=(V,E) with n vertices and m edges, and given a pre-calculated shortest path distance matrix S, where S is n*n and S(i,j) denotes the weight of the shortest path from vertex i to vertex j.
We know that the weight of just one edge (u, v) is changed (increased or decreased).
For two specific vertices s and t we want to update the shortest path length between these two vertices.
This can be done in O(1).
How is this possible? What is the trick of this answer?
You certainly can for decreases. I assume S will always refer to the old distances. Let l be the new weight of the edge (u, v). Check if
S(s, u) + l + S(v, t) < S(s, t)
If yes, then the left hand side is the new optimal distance between s and t; otherwise the distance is unchanged.
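The check for a decreased edge weight is a single comparison, which a short Python sketch makes concrete. The function name and the nested-list matrix are assumptions for illustration. S holds the old distances, and the graph is assumed to have no negative cycles after the update.

```python
INF = float("inf")

def distance_after_decrease(S, s, t, u, v, new_weight):
    """New s-t distance after the weight of edge (u, v) drops to new_weight.
    S is the OLD all-pairs distance matrix; the lookup is O(1)."""
    return min(S[s][t], S[s][u] + new_weight + S[v][t])

# Old graph: 0 -> 1 (4), 1 -> 2 (4), 0 -> 2 (10).
S = [[0, 4, 8],
     [INF, 0, 4],
     [INF, INF, 0]]
print(distance_after_decrease(S, 0, 2, 0, 2, 3))  # 3: the direct edge now wins
print(distance_after_decrease(S, 0, 2, 1, 2, 1))  # 5: 0 -> 1 -> 2 got cheaper
```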
Increases are impossible. Consider the following graph (edges in red have zero weight):
Suppose m is the minimum weight edge here, except for (u, v) which used to be lower. Now we update (u, v) to some weight l > m. This means we must find m to find the new optimum length.
Suppose we could do this in O(1) time. Then it means we could find the minimum of any array in O(1) time by feeding it into this algorithm after adding (u, v) with weight -BIGNUMBER and then 'updating' it to BIGNUMBER (we can lazily construct the distance matrix because all distances are either 0, inf or just the edge weights). That is clearly not possible, thus we can't solve this problem in O(1) either.

How to count all reachable nodes in a directed graph?

There is a directed graph (which might contain cycles), and each node has a value on it. How can we get the sum of the reachable values for each node? For example, in the following graph:
the reachable sum for node 1 is: 2 + 3 + 4 + 5 + 6 + 7 = 27
the reachable sum for node 2 is: 4 + 5 + 6 + 7 = 22
.....
My solution: To get the sum for all nodes, I think the time complexity is O(n + m), where n is the number of nodes and m is the number of edges. DFS should be used: for each node we recursively visit its successors and save the node's sum once it has been calculated, so that we don't need to calculate it again later. A set needs to be created for each node to avoid endless calculation caused by loops.
Does it work? I don't think it is elegant enough, especially since many sets have to be created. Is there any better solution? Thanks.
This can be done by first finding Strongly Connected Components (SCC), which can be done in O(|V|+|E|). Then, build a new graph, G', for the SCCs (each SCC is a node in the graph), where each node has value which is the sum of the nodes in that SCC.
Formally,
G' = (V',E')
Where V' = {U1, U2, ..., Uk | U_i is a SCC of the graph G}
E' = {(U_i,U_j) | there is node u_i in U_i and u_j in U_j such that (u_i,u_j) is in E }
Then, this graph (G') is a DAG, and the question becomes simpler, and seems to be a variant of question linked in comments.
EDIT: the previous answer (struck out) was a mistake from this point on; here is a new answer. Sorry about that.
Now, a DFS can be used from each node to find the sum of values (resetting the visited flags before each start node):
DFS(v):
    if v.visited:
        return 0
    v.visited = true
    return v.value + sum([DFS(u) for u in v.children])
This is O(V^2 + VE) worst case, but since the graph has fewer nodes, V and E are now significantly lower.
Some local optimizations can be made, for example, if a node has a single child, you can reuse the pre-calculated value and not apply DFS on the child again, since there is no fear of counting twice in this case.
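A DP solution for this problem (DAG) can be:
D[i] = value(i) + sum {D[j] | (i,j) is an edge in G' }
This can be calculated in linear time (after topological sort of the DAG). Note that this recurrence counts a node once for every path that reaches it, so it gives the correct sum only when each node of G' is reachable from i along a single path (for example when G' is a tree); in a general DAG, use the per-node DFS above.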
Pseudo code:
Find SCCs
Build G'
Topological sort G'
Find D[i] for each node in G'
Assign D[U_i] to every node u_i in U_i, for each U_i.
Total time is O(|V|+|E|).
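Here is a hedged Python sketch of the SCC-based approach. It assumes the graph is a dict mapping every node to its list of successors, and that a node's own value counts toward its sum (as in D[i] above); the function name reachable_sums is made up. It uses Tarjan's algorithm, which is one standard way to find the SCCs, and it propagates explicit sets of reachable components so that shared descendants are counted only once. That is not linear time in the worst case.

```python
def reachable_sums(graph, value):
    """For every node, sum the values of all nodes reachable from it
    (the node itself included). graph: node -> list of successors."""
    index, low, comp = {}, {}, {}
    stack, on_stack, components = [], set(), []

    def strongconnect(v):  # recursive Tarjan; fine for small graphs
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            members = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp[w] = len(components)
                members.append(w)
                if w == v:
                    break
            components.append(members)

    for v in graph:
        if v not in index:
            strongconnect(v)

    comp_value = [sum(value[v] for v in members) for members in components]
    # Tarjan emits SCCs sinks-first, so every successor component of c
    # has a smaller id and its reach set is already available.
    reach, totals = [], []
    for c, members in enumerate(components):
        r = {c}
        for v in members:
            for w in graph[v]:
                if comp[w] != c:
                    r |= reach[comp[w]]
        reach.append(r)
        totals.append(sum(comp_value[k] for k in r))
    return {v: totals[comp[v]] for v in graph}

# Diamond a -> {b, c} -> d, plus the cycle d <-> e.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": ["d"]}
value = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
print(reachable_sums(graph, value))
```

In the diamond example d and e form one SCC with value 9, and node a gets 15; summing the values of a's children's totals instead would count that SCC twice.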
You can use the DFS or BFS algorithm to solve your problem.
Both have complexity O(V + E).
You don't have to count all values for all nodes, and you don't need recursion.
Just do something like this.
A typical traversal (BFS here, since vertices are taken from the front of the list) looks like this:
unmark all vertices
choose some starting vertex x
mark x
list L = x
while L nonempty
    choose some vertex v from front of list
    visit v
    for each unmarked neighbor w
        mark w
        add it to end of list
In your case you have to add a few lines:
unmark all vertices
choose some starting vertex x
mark x
list L = x
float sum = 0
while L nonempty
    choose some vertex v from front of list
    visit v
    sum += v->value
    for each unmarked neighbor w
        mark w
        add it to end of list
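The traversal above translates almost line by line into Python. This is a small sketch with assumed names (reachable_sum_from, a dict-based adjacency list), and it computes the sum for one starting vertex only, so it has to be repeated per node if every node's sum is needed.

```python
from collections import deque

def reachable_sum_from(graph, value, start):
    """BFS from start, adding up the value of every vertex visited
    (start included). graph: node -> list of successors."""
    marked = {start}
    queue = deque([start])
    total = 0
    while queue:
        v = queue.popleft()  # take from the front of the list
        total += value[v]
        for w in graph.get(v, ()):
            if w not in marked:
                marked.add(w)
                queue.append(w)  # add to the end of the list
    return total

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": ["d"]}
value = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
print(reachable_sum_from(graph, value, "a"))  # 15
```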

Given an undirected graph G = (V, E), determine whether G is a complete graph

I'm pretty sure this problem is P and not NP, but I'm having difficulty coming up with a polynomially bound algorithm to solve it.
You can:
check that the number of edges in the graph is n(n-1)/2.
check that each vertex is connected to exactly n-1 distinct vertices.
This will run in O(V²), which is polynomial.
Hope it helped.
Here's an O(|E|) algorithm that also has a small constant.
It's trivial to enumerate every edge in a complete graph. So all you need to do is scan your edge list and verify that every such edge exists.
For each edge (i, j), let f(i, j) = i*|V| + j, assuming vertices are numbered 0 to |V|-1.
Let bitvec be a bit vector of length |V|^2, initialized to 0.
For each edge (i, j), set bitvec[f(i, j)] = 1 (if the edge list stores each undirected edge only once, also set bitvec[f(j, i)] = 1).
G is a complete graph if and only if every element bitvec[f(i, j)] with i != j is 1.
This algorithm not only touches E once, but it's also completely vectorizable if you have a scatter instruction. That also means it's trivial to parallelize.
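A rough Python sketch of the bit-vector check, assuming vertices are numbered 0 to n-1 and each undirected edge appears once in the edge list. A bytearray stands in for the bit vector, and the function name is hypothetical.

```python
def is_complete_bitvec(n, edges):
    """Mark every listed edge in an n*n table, then verify that all
    off-diagonal entries are set."""
    bitvec = bytearray(n * n)  # initialised to 0
    for i, j in edges:
        bitvec[i * n + j] = 1
        bitvec[j * n + i] = 1  # undirected: mark both directions
    return all(bitvec[i * n + j]
               for i in range(n) for j in range(n) if i != j)

print(is_complete_bitvec(3, [(0, 1), (1, 2), (0, 2)]))  # True
print(is_complete_bitvec(3, [(0, 1), (1, 2)]))          # False
```

Note that allocating and scanning the table takes time proportional to |V|^2 in this sketch, on top of the single pass over the edges.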
Here is an O(E) algorithm:
Use O(E) time, which is the input time anyway, to scan the graph.
Meanwhile, record each vertex p's degree, increasing the degree only if the neighbor is not p itself (a self-loop) and is not a vertex q for which an edge between p and q has already been counted (a multiple edge); these checks can be done in O(1).
Check whether every vertex's degree is |V|-1; this step is O(V). If so, the graph is complete.
Total is O(E)
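One way to sketch the degree-counting check in Python is to keep a set of distinct neighbours per vertex, which gives the expected O(1) duplicate test mentioned above. The function name and the 0-based vertex numbering are assumptions.

```python
def is_complete_by_degree(n, edges):
    """Count distinct neighbours per vertex, ignoring self-loops and
    repeated edges, and require n - 1 of them everywhere."""
    neighbours = [set() for _ in range(n)]
    for p, q in edges:
        if p != q:  # skip self-loops
            neighbours[p].add(q)  # the set silently drops multi-edges
            neighbours[q].add(p)
    return all(len(s) == n - 1 for s in neighbours)

# Triangle with a duplicate edge and a self-loop is still complete.
print(is_complete_by_degree(3, [(0, 1), (1, 2), (0, 2), (0, 1), (2, 2)]))  # True
print(is_complete_by_degree(3, [(0, 1), (1, 2)]))                          # False
```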
For a given graph G = (V,E), go through each pair u, v in V and check whether the edge (u,v) is in E.
The total number of u, v pairs are |V|*(|V|-1)/2. As a result, with a time complexity of O(|V|^2), you can check and see if a graph is complete or not.

Maximum weighted path between two vertices in a directed acyclic Graph

I would love some guidance on this problem:
G is a directed acyclic graph. You want to move from vertex c to vertex z. Some edges reduce your profit and some increase your profit. How do you get from c to z while maximizing your profit? What is the time complexity?
Thanks!
The problem has an optimal substructure. To find the longest path from vertex c to vertex z, we first need to find the longest path from c to all the predecessors of z. Each problem of these is another smaller subproblem (longest path from c to a specific predecessor).
Let's denote the predecessors of z as u1, u2, ..., uk and let dist[z] be the length of the longest path from c to z. Then dist[z] = max over i of (dist[ui] + w(ui, z)).
Here is an illustration with 3 predecessors omitting the edge set weights:
So to find the longest path to z we first need to find the longest path to its predecessors and take the maximum over (their values plus their edges weights to z).
This requires whenever we visit a vertex u, all of u's predecessors must have been analyzed and computed.
So the question is: for any vertex u, how to make sure that once we set dist[u], dist[u] will never be changed later on? Put it in another way: how to make sure that we have considered all paths from c to u before considering any edge originating at u?
Since the graph is acyclic, we can guarantee this condition by finding a topological sort of the graph. A topological sort is like a chain of vertices where all edges point left to right. So if we are at vertex vi, then we have already considered all paths leading to vi and have the final value of dist[vi].
The time complexity: topological sort takes O(V+E). In the worst case where z is a leaf and all other vertices point to it, we will visit all the graph edges which gives O(V+E).
Let f(u) be the maximum profit you can get going from c to u in your DAG. Then you want to compute f(z). This can be easily computed in linear time using dynamic programming/topological sorting.
Initialize f(u) = -infinity for every u other than c, and f(c) = 0. Then, proceed computing the values of f in some topological order of your DAG. Thus, as the order is topological, for every incoming edge of the node being computed, the other endpoints are calculated, so just pick the maximum possible value for this node, i.e. f(u) = max(f(v) + cost(v, u)) for each incoming edge (v, u).
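The recurrence can be sketched in Python as below. The sketch assumes vertices numbered 0 to n-1 and edges given as (u, v, profit) triples; the function name max_profit is made up, and the topological order is obtained with Kahn's algorithm, which is one of several ways to get it.

```python
from collections import deque

def max_profit(n, edges, c, z):
    """Maximum total edge weight of a path from c to z in a DAG,
    or -inf if z is not reachable from c."""
    adj = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Kahn's algorithm: repeatedly remove vertices with no incoming edges.
    order = []
    queue = deque(v for v in range(n) if indegree[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    f = [float("-inf")] * n
    f[c] = 0
    for u in order:  # f[u] is final by the time u is processed
        if f[u] == float("-inf"):
            continue  # u is not reachable from c
        for v, w in adj[u]:
            f[v] = max(f[v], f[u] + w)
    return f[z]

edges = [(0, 1, 5), (0, 2, -2), (1, 3, -4), (2, 3, 10), (1, 2, 1)]
print(max_profit(4, edges, 0, 3))  # 16, via 0 -> 1 -> 2 -> 3
```

Both the sort and the relaxation pass touch each vertex and edge once, so the whole thing runs in O(V+E).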
It's better to use topological sorting instead of Bellman-Ford, since the graph is a DAG.
Source:- http://www.utdallas.edu/~sizheng/CS4349.d/l-notes.d/L17.pdf
EDIT:-
G is a DAG with negative edges.
Some edges reduce your profit and some increase your profit
Edges that increase profit have a positive value.
Edges that decrease profit have a negative value.
After the topological sort (TS), for each vertex u in TS order, relax each outgoing edge.
dist[] = {-INF, -INF, ...}
dist[c] = 0 // source
for every vertex u in topological order
    if (u == z) break; // dest vertex
    for every adjacent vertex v of u
        if (dist[v] < (dist[u] + weight(u, v))) // < for longest path = max profit
            dist[v] = dist[u] + weight(u, v)
ans = dist[z];

Resources