Maximum weight connected subgraph in a directed acyclic graph - algorithm

I am working on a research problem involving logic circuits (which can be represented as DAGs). Each node in the DAG has a given weight, which can be negative. My objective is to find a connected subgraph such that the sum of the node weights is maximal.
The maximum weight connected subgraph problem with edge weights is apparently NP-hard, but I am hoping that the directed-acyclic nature of the graph and the fact that I am dealing with node weights rather than edge weights make the problem somewhat easier. Can someone point me in the right direction on how to start attacking this problem?
Thanks

The problem you mentioned is NP-hard; see:
“Discovering regulatory and signaling circuits in molecular interaction networks”
by Trey Ideker, Owen Ozier, Benno Schwikowski, and Andrew F. Siegel,
Bioinformatics, Vol 18, p.233-240, 2002
and the supplementary information to this paper:
http://prosecco.ucsd.edu/ISMB2002/nph.pdf

First approach: assign to each edge the negated weight of its starting node and apply a shortest-path algorithm such as Bellman-Ford. Dijkstra's algorithm won't work because some edge weights can be negative.
Second approach: starting from each leaf node, add "tags" to each edge that keep track of the IDs of all the nodes involved and the total weight. There is no need to mark the nodes, as each node is guaranteed to be visited only once for each chain starting at a leaf. For example, given the following directed acyclic graph (directed top to bottom) where each node has weight 1:
   A     G
  / \   /
 /   \ /
B     C
|    / \
D   E   F
     \ /
      H
The edge between A and B will be tagged {{D,B,A},3}; the edge between A and C will have two tags, {{H,E,C,A},4} and {{H,F,C,A},4}.
After this pre-processing, find the greatest-weight path for each root node. The information is in the tags of their outbound edges.
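The tagging idea amounts to computing, for every node, the heaviest chain leading down from it. A minimal Python sketch follows, assuming the graph is given as a successor dictionary; the names `succ`, `weight` and `heaviest_path_from` are illustrative and not from the answer. Note that it finds a heaviest path, which is only one kind of connected subgraph, and that it lets a path stop early rather than extend into a negative-weight tail (an assumption of this sketch):

```python
def heaviest_path_from(succ, weight):
    """For each node v, return (total weight, node list) of a heaviest
    directed path that starts at v. Assumes the graph is acyclic."""
    memo = {}

    def best(v):
        if v not in memo:
            tails = [best(c) for c in succ.get(v, [])]
            # Pick the heaviest continuation below v, if there is one.
            tail_w, tail_p = max(tails, key=lambda t: t[0], default=(0, []))
            if tail_w < 0:
                # Stopping at v beats extending into a negative tail.
                tail_w, tail_p = 0, []
            memo[v] = (weight[v] + tail_w, [v] + tail_p)
        return memo[v]

    return {v: best(v) for v in weight}


# The example graph from the answer, every node with weight 1.
succ = {"A": ["B", "C"], "G": ["C"], "B": ["D"],
        "C": ["E", "F"], "E": ["H"], "F": ["H"]}
weight = {v: 1 for v in "ABCDEFGH"}
result = heaviest_path_from(succ, weight)
print(result["A"])  # weight 4, matching the tags on the edge A-C
```

Memoisation plays the role of the tags here: each node's best chain is computed once and reused by every predecessor.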

You mentioned that the connected subgraph should be "maximal". For this, greedily choose a vertex and grow the subgraph until it cannot grow any further; this ensures maximality. However, if you mean "maximum", then the problem might be NP-complete. Also, let me remind you that node-weighted graphs are more general than edge-weighted graphs: every algorithm built for the former is applicable to the latter, but the converse is not always true. This is easy to see; try it out yourself.
As I understand the problem, I feel it is in P. If that is correct, then the hint is to use some special property of DAGs (which you should know, since you are researching this and it seems like a lecture-notes problem). For general graphs, this is reducible to Steiner trees, so it is NP-complete (in fact, also for planar graphs).

I think your problem is NP-hard if the maximum weight connected subgraph problem with edge weights is NP-hard. You can reduce the node weight problem to the edge weight problem.
1) Let's say that your nodes have weights wn1, wn2, wn3, ..., wnN, where N is the number of nodes.
2) Let's also say that the edges can be represented by e1, e2, e3, ..., eE, where E is the number of edges.
The weight of the edge ei: nj->nk can be defined as F(wnj, wnk), the function being
arbitrary. For simplicity we can assume wei = wnj + wnk.
Now if we assume that all node weights are independent and non-identical, then we
can say the same about the edge weights. As the problem on a DAG with non-identical
edge weights is NP-hard, yours is too.
Having said that, I think you should proceed in the following way:
1) Look for similarities among the node weights in your particular problem. If there are any,
try to look up the literature for similar problems.
2) If they are hard to find, I suggest you translate your node weight problem into an edge
weight one, see how the similarity in node weights carries over to the edge weight
problem, and see what simplifications you can apply to it, again from the
literature.
I hope this helps.

Related

Subgraph with minimum edge weight and node weight >= Val

I came across this problem - In an undirected graph every node and edge has a weight. All the weights are non-negative. Given a value S, Find the connected subgraph with minimum sum of edge weights such that its sum of node weights is at least S.
The most obvious solution is a brute-force approach that considers all possible subgraphs, but its time complexity is exponential. Is there a better algorithm for this? My intuition is that we can convert node weights to edge weights and then apply a spanning tree algorithm, but I couldn't work it out. How can this problem be solved?
EDIT : Looks like I was not clear enough about the description of subgraph. The selected subgraph must be a single, connected component. I hope it's clear now.
I think this problem is NP-hard via a reduction from the Steiner tree problem. Given a graph G and a set of nodes S that need to be spanned, set the weight of all of the nodes in S to one and all the other nodes to 0. A subgraph with node weight at least |S| with minimum total edge cost must be a tree (if there are any cycles, deleting an edge from the cycle only decreases the cost) and must connect all of the nodes that need to be spanned. It's therefore a Steiner tree. Overall, this reduction can be computed in polynomial time, so your problem is NP-hard.

Dijkstra vs Bellman-Ford: a directed graph which will give different results

I am trying to learn about graphs, and I found that to find the shortest path from one node to another we can use the Dijkstra and Bellman-Ford algorithms.
Dijkstra will not work for graphs that contain negative-weight edges,
while Bellman-Ford can handle such graphs.
My doubt is this: I tried many kinds of graphs containing negative-weight edges and applied both Dijkstra and Bellman-Ford, but in all cases I got the same result; even with negative-weight edges, Dijkstra worked fine.
Maybe my thought process or the way I am solving it is wrong, and that is the only reason I am getting the correct answer from Dijkstra.
My question is: can anyone show me a graph that has a negative edge and explain the different results for Dijkstra and Bellman-Ford?
Dijkstra's algorithm for finding the shortest path between two nodes can be used only for graphs with non-negative edge weights. To see the difference between the answers that Bellman-Ford and Dijkstra give when there is a negative edge weight, let's take a simple example.
We have 3 nodes in the graph: A, B and C.
A is connected to B with edge weight 4
A is connected to C with edge weight 2
B is connected to C with edge weight -3
When Dijkstra is used to calculate the shortest path from A to C, we get weight 2,
but when Bellman-Ford is used to calculate the shortest path from A to C, the weight is 1 (the path A -> B -> C, 4 + (-3) = 1).
This happens because Dijkstra finalises the node with the minimum tentative distance, ignoring the fact that there could be a path with less weight to that node through a node finalised later (note that this can happen only when negative weights are present; with only non-negative weights it is not possible).
Hope you understood the difference.
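The three-node example can be checked with a short Python sketch. It assumes the edges are directed (A->B, A->C, B->C) and uses the textbook form of Dijkstra that never revisits a finalised node; the function and variable names are illustrative:

```python
import heapq


def dijkstra(graph, source):
    """Textbook Dijkstra: once a node is finalised it is never updated."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    done = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)  # u is finalised with distance d
        for v, w in graph[u]:
            if v not in done and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def bellman_ford(graph, source):
    """Relax every edge |V| - 1 times (no negative-cycle check here)."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    for _ in range(len(graph) - 1):
        for u in graph:
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist


graph = {"A": [("B", 4), ("C", 2)], "B": [("C", -3)], "C": []}
print(dijkstra(graph, "A")["C"])      # 2: C is finalised before B is expanded
print(bellman_ford(graph, "A")["C"])  # 1: via A -> B -> C
```

Dijkstra pops C (distance 2) before B (distance 4), so the cheaper route through B is discovered only after C has been finalised.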

NP-Complete? Optimal graph embedding for a graph with specific constraints

I have a grid based graph, where nodes and edges occupy cells. Edges can cross, but cannot travel on top of each other in the same direction.
Let's say I want to optimize the graph so that the distance covered by edges is minimized.
I am currently using A* search for each connection, but the algorithm is greedy and does not plan ahead. Consider the diagram below, where the order in which connections are made is changed (note also that there can be multiple shortest paths for any given edge; see the green and purple connections).
My intuition says this is NP-Complete and that an exhaustive search is necessary, which will be extremely expensive as the size of the graph grows. However, I have no way of showing this, and it is not quite the same as other graph embedding problems which usually concern minimization of crossing.
You didn't really describe your problem and your image is gone, but your problem sounds like the minimum T-join problem.
The minimum T-join problem is defined on a graph G. You're given a set T of even size, and you're trying to find a subgraph of the graph where the vertices of T have odd degree and the other vertices have even degree. You've got weights on the edges and you're trying to minimise the sum of the weights of edges in the subgraph.
Surprisingly, the minimum T-join problem can be solved in polynomial time thanks to a very close connection with the nonbipartite matching problem. Namely, if you find all-pairs shortest paths between vertices of T, the minimum T-join is attained by the minimum-weight perfect matching of vertices in T, where there's an edge between two vertices whose length is the length of the shortest path in G.
The minimum T-join will be a collection of paths. If two distinct paths, say a->b and c->d, use the same edge uv, then they can be replaced by a->u->c and b->v->d and reduce the cost of the T-join. So it won't use the same edge twice.
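The matching-based recipe can be sketched in Python for small inputs. The sketch assumes a connected undirected graph with non-negative edge weights and an even number of terminals. It uses Floyd-Warshall for the shortest paths and a brute-force search over pairings in place of a real minimum-weight perfect matching algorithm, so it is only practical for a handful of terminals. All names are illustrative:

```python
def all_pairs_shortest(n, edges):
    """Floyd-Warshall on an undirected graph with nodes 0..n-1."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def min_t_join(n, edges, terminals):
    """Return (cost, pairs): the cheapest way to pair up the terminals,
    where pairing a with b costs the shortest-path distance from a to b."""
    d = all_pairs_shortest(n, edges)

    def best_pairing(rest):
        if not rest:
            return 0, []
        first, others = rest[0], rest[1:]
        best = None
        for i, partner in enumerate(others):
            cost, pairs = best_pairing(others[:i] + others[i + 1:])
            cost += d[first][partner]
            if best is None or cost < best[0]:
                best = (cost, [(first, partner)] + pairs)
        return best

    return best_pairing(tuple(sorted(terminals)))


# Path 0 - 1 - 2 - 3 with weights 1, 5, 1.
edges = [(0, 1, 1), (1, 2, 5), (2, 3, 1)]
print(min_t_join(4, edges, {0, 1, 2, 3}))  # pairs 0 with 1 and 2 with 3
```

The T-join itself is the union of the shortest paths between the matched pairs; recovering those paths would need predecessor tracking, which is omitted here.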

Completely disconnecting a bipartite graph

I have a disconnected bipartite undirected graph. I want to completely disconnect the graph. The only operation that I can perform is to remove a node. Removing a node automatically deletes its edges. The task is to minimize the number of nodes to be removed. Each node in the graph has at most 4 edges.
By completely disconnecting a graph, I mean that no two nodes should be connected through a link; basically, an empty edge set.
I think you cannot prove that your algorithm is optimal because, in fact, it is not optimal.
To completely disconnect your graph while minimizing the number of nodes removed, you have to remove all the nodes belonging to a minimum vertex cover of your graph. Finding a minimum vertex cover is NP-hard in general, but for bipartite graphs there is a polynomial-time solution.
Find a maximum matching in the graph (for example with the Hopcroft–Karp algorithm). Then use König's theorem to get a minimum vertex cover:
Consider a bipartite graph where the vertices are partitioned into left (L) and right (R) sets. Suppose there is a maximum matching which partitions the edges into those used in the matching (E_m) and those not (E_0). Let T consist of all unmatched vertices from L, as well as all vertices reachable from those by going left-to-right along edges from E_0 and right-to-left along edges from E_m. This essentially means that for each unmatched vertex in L, we add into T all vertices that occur in a path alternating between edges from E_0 and E_m.
Then (L \ T) ∪ (R ∩ T) is a minimum vertex cover.
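A minimal Python sketch of this construction follows. For brevity it finds the maximum matching with simple augmenting paths rather than Hopcroft–Karp, which is slower but gives the same kind of matching. The graph representation (left node u lists its right neighbours in `adj[u]`) and all names are assumptions of the sketch, and the example graph is made up:

```python
def max_bipartite_matching(adj, n_left, n_right):
    """Augmenting-path matching; adj[u] lists right neighbours of left u."""
    match_l = [-1] * n_left
    match_r = [-1] * n_right

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_r[v] == -1 or try_augment(match_r[v], seen):
                match_l[u], match_r[v] = v, u
                return True
        return False

    for u in range(n_left):
        try_augment(u, set())
    return match_l, match_r


def min_vertex_cover(adj, n_left, n_right):
    """Koenig's construction: returns (left nodes, right nodes) of a cover."""
    match_l, match_r = max_bipartite_matching(adj, n_left, n_right)
    in_t_left = [match_l[u] == -1 for u in range(n_left)]
    in_t_right = [False] * n_right
    stack = [u for u in range(n_left) if in_t_left[u]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            # Left-to-right only along edges that are NOT in the matching.
            if match_l[u] == v or in_t_right[v]:
                continue
            in_t_right[v] = True
            w = match_r[v]  # right-to-left only along the matching edge
            if w != -1 and not in_t_left[w]:
                in_t_left[w] = True
                stack.append(w)
    cover_left = [u for u in range(n_left) if not in_t_left[u]]
    cover_right = [v for v in range(n_right) if in_t_right[v]]
    return cover_left, cover_right


# Left 0 is adjacent to every right node; right 0 to every left node.
adj = [[0, 1, 2, 3], [0], [0], [0]]
print(min_vertex_cover(adj, 4, 4))  # one node from each side: ([0], [0])
```

In this example each side has four nodes, so removing the smaller side would cost four removals, while the cover found here needs only two.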
Here's a counter-example to your suggested algorithm.
The best solution is to remove both nodes A and B, even though they are different colors.
Since all the edges are from one set to another, find these two sets using say BFS and coloring using 2 colours. Then remove the nodes in smaller set.
Since there are no edges among themselves the rest of the nodes are disconnected as well.
[As a pre-processing step you can leave out nodes with 0 edges first.]
I have thought of an algorithm for it but am not able to prove whether it's optimal.
My algorithm: On each connected component, I run a BFS and 2-color it. Then I count the nodes of each color, take the minimum of the two, and store it. I repeat the procedure for each component and add up the results to get the required minimum. Help me prove the algorithm if it's correct.
EDIT: The above algorithm is not optimal. The accepted answer has been verified to be correct.

Minimal path - all edges at least once

I have a directed graph with a lot of cycles, probably strongly connected, and I need to get a minimal cycle from it. I mean I need the shortest cycle in the graph that covers every edge at least once.
I have been searching for an algorithm or some theoretical background, but the only thing I have found is the Chinese postman algorithm, and that solution is not for directed graphs.
Can anybody help me? Thanks
Edit: All edges of the graph have the same cost, for instance 1.
Take a look at this paper - Directed Chinese Postman Problem. That is the correct problem classification though (assuming there are no more restrictions).
If you're just reading up on the theory, have a good read of this page, which is from The Algorithm Design Manual.
Key quote (the second half for the directed version):
The optimal postman tour can be constructed by adding the appropriate edges to the graph G so as to make it Eulerian. Specifically, we find the shortest path between each pair of odd-degree vertices in G. Adding a path between two odd-degree vertices in G turns both of them to even-degree, thus moving us closer to an Eulerian graph. Finding the best set of shortest paths to add to G reduces to identifying a minimum-weight perfect matching in a graph on the odd-degree vertices, where the weight of edge (i,j) is the length of the shortest path from i to j. For directed graphs, this can be solved using bipartite matching, where the vertices are partitioned depending on whether they have more ingoing or outgoing edges. Once the graph is Eulerian, the actual cycle can be extracted in linear time using the procedure described above.
I doubt that it's optimal, but you could do a queue based search assuming the graph is guaranteed to have a cycle. Each queue entry would contain a list of nodes representing paths. When you take an element off the queue, add all possible next steps to the queue, ensuring you are not re-visiting nodes. If the last node is the same as the first node, you've found the minimum cycle.
What you are looking for is called an "Eulerian path". You can google it to find enough info; the basics are here
As for an algorithm, there is one called Fleury's algorithm; google for it or take a look here
I think it might be worthwhile simply writing down which vertices are odd and then finding which combination of them leads to the least amount of extra time (if the weights are times or distances); the total length will then be the sum of all edge weights plus the extra. For example, if the odd-order vertices are A, B, C, D, try AB & CD, then AC & BD, and so on. (I'm not sure if this is a specifically named method; it just worked for me.)
Edit: just realised this mostly only works for undirected graphs.
The special case in which the network consists entirely of directed edges can be solved in polynomial time. I think the original paper is Matching, Euler tours and the Chinese postman (1973) - a clear description of the algorithm for the directed graph problem begins on page 115 (page 28 of the pdf):
When all of the edges of a connected graph are directed and the graph
is symmetric, there is a particularly simple and attractive algorithm for
specifying an Euler tour...
The algorithm to find an Euler tour in a directed, symmetric, connected graph G is to first find a spanning arborescence of G. Then, at
any node n, except the root r of the arborescence, specify any order for
the edges directed away from n so long as the edge of the arborescence
is last in the ordering. For the root r, specify any order at all for the
edges directed away from r.
This algorithm was used by van Aardenne-Ehrenfest and de Bruijn to
enumerate all Euler tours in a certain directed graph [1].
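The quoted procedure can be sketched in Python as follows. This assumes the graph is given as a dict of successor lists with every node present as a key, that every node has equal in-degree and out-degree, and that every node can reach the root. The function name and the use of BFS on the reversed graph to build the arborescence are choices of this sketch, not of the paper:

```python
from collections import deque


def euler_tour_last_exit(adj, root):
    """Euler tour of a directed graph with in-degree == out-degree at
    every node, using the 'arborescence edge last' ordering rule."""
    # Build the reversed graph so that BFS from the root finds, for each
    # node u, an edge u -> parent[u] that leads towards the root.
    rev = {}
    for u, vs in adj.items():
        for v in vs:
            rev.setdefault(v, []).append(u)
    parent = {root: None}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u in rev.get(v, []):
            if u not in parent:
                parent[u] = v  # arborescence edge u -> v
                queue.append(u)

    # Any order of outgoing edges works, as long as the arborescence
    # edge comes last at every node other than the root.
    order = {}
    for u, vs in adj.items():
        vs = list(vs)
        if u != root:
            vs.remove(parent[u])
            vs.append(parent[u])
        order[u] = deque(vs)

    # Walk from the root, always taking the next unused outgoing edge.
    tour = [root]
    u = root
    while order[u]:
        u = order[u].popleft()
        tour.append(u)
    return tour


adj = {0: [1], 1: [2], 2: [0, 3], 3: [2]}
tour = euler_tour_last_exit(adj, 0)
print(tour)  # a closed walk from node 0 that uses each edge exactly once
```

For the Chinese postman setting, this step would apply only after duplicate edges have been added to balance the in-degree and out-degree of every node.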
