Find maximal subgraph containing only nodes of degree 2 and 3 - algorithm

I'm trying to implement an (unweighted) Feedback Vertex Set approximation algorithm from the following paper: FVS-Approximation-Paper. One of the steps of the algorithm (described on page 4) is to compute a maximal 2-3 subgraph of the input graph.
To be precise, a 2-3 graph is one that has only vertices of degree either 2 or 3.
By maximal we mean that no set of edges or vertices of the original graph can be added to the maximal subgraph without violating the 2-3 condition.
The authors of the paper claim that the computation can be carried out by a "simple Depth First Search (DFS)" on the graph. However, this algorithm seems to elude me. How can the maximal subgraph be computed?
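For reference, the 2-3 condition itself is easy to test. The sketch below only checks the degree condition on a candidate subgraph; it does not compute a maximal one. The function name and the edge-list representation are my own assumptions, not something taken from the paper.

```python
from collections import Counter

def is_two_three_graph(edges):
    """Return True if every vertex touched by `edges` has degree 2 or 3.

    `edges` is a list of (u, v) pairs describing the candidate subgraph H;
    the vertices of H are taken to be the endpoints of these edges.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return bool(degree) and all(d in (2, 3) for d in degree.values())

triangle = [(1, 2), (2, 3), (3, 1)]
path = [(1, 2), (2, 3)]
print(is_two_three_graph(triangle))  # True: all degrees are 2
print(is_two_three_graph(path))      # False: the endpoints have degree 1
```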

I think I managed to figure out something like what the authors intended. I wouldn't call it simple, though.
Let G be the graph and H be an initially empty 2-3 subgraph of G. The algorithm bears a family resemblance to a depth-first traversal, yet I wouldn't call it that. Starting from an arbitrary node, we walk around in the graph, pushing the steps onto a stack. Whenever we detect that the stack contains a path/cycle/sigma shape that would constitute a 2-3 super-graph of H, we move it from the stack to H and continue. When it's no longer possible to find such a shape, H is maximal, and we're done.
In more detail, the stack usually consists of a path having no nodes of degree 3 in H. The cursor is positioned at one end of the path. With each step, we examine the next edge incident to the end. If the only incident edge is the one by which we arrived, we delete it from both G and the stack and move the end back one. Otherwise we can possibly extend the path by some edge e. If e's other endpoint has degree 3 in H, we delete e from G and consider the next edge incident to the end. If e's other endpoint has degree 2 in H but is not currently on the stack, then we've anchored this end. If the other end is anchored too, then add the stack path to H and keep going. Otherwise, move the cursor to the other end of the stack, reversing the stack. The final case is if the stack loops back on itself. Then we can extract a path/cycle/sigma and keep going.
Typing this out on mobile, so sorry for the terse description. Maybe I'll find time to implement it.

Related

Algorithm for the Smallest Set of Vertices that will "Infect" the Entire Graph

My question is about infecting an entire graph with the smallest set of vertices that will be deemed as infected. The question goes something like this. For a vertex A in a (not necessarily simple) directed graph, A will become infected if, for all in-edges of the form (B, A) (it's a directed graph, so B will be pointing towards A), B is also infected. If we were to take a specific example:
In this case, if the vertices E, A were infected:
Iteration 1:
vertices F, D are infected because of the fact that the only vertex that points to them is E and E is infected.
Iteration 2:
The vertex B is infected as both vertices A and D are infected.
Iteration 3:
Finally the vertex C is infected as a result of the infection of vertex B from Iteration 2.
In this case, the infected set {E, A} that I chose was able to infect the entire graph. Obviously, this isn't always possible as in the case with the infected set of {B} (the vertex A doesn't end up infected as B doesn't point to it and thus there is no way of reaching it) or the infected set of {A} (the vertex B is not infected as it has a perfectly healthy parent in D).
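The iterations above can be reproduced with a small simulation. This is only a sketch: the edge list is reconstructed from the description (E points to F and D, A and D point to B, B points to C), and I am assuming that a vertex with no parents never becomes infected on its own.

```python
from collections import defaultdict

def spread(edges, seeds):
    """Return the set of vertices infected once the process stabilises.

    A healthy vertex becomes infected when it has at least one parent
    and all of its parents are infected (assumption: parentless vertices
    are never infected spontaneously).
    """
    parents = defaultdict(set)
    nodes = set(seeds)
    for u, v in edges:
        parents[v].add(u)
        nodes.update((u, v))
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in nodes - infected:
            if parents[v] and parents[v] <= infected:
                infected.add(v)
                changed = True
    return infected

EDGES = [("E", "F"), ("E", "D"), ("A", "B"), ("D", "B"), ("B", "C")]
print(sorted(spread(EDGES, {"E", "A"})))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(sorted(spread(EDGES, {"B"})))       # ['B', 'C']
print(sorted(spread(EDGES, {"A"})))       # ['A']
```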
I really want to find an algorithm that finds the smallest set of infected vertices that will end up infecting the entire graph after an arbitrary number of iterations. Does something like this already exist?
Just for clarification, for vertices that are a self-loop, it would necessarily have to be in the infected set as that's the only way that it can get infected.
btilly gave a response about how the problem is NP-hard. Could someone suggest a good approximation algorithm then? It doesn't need to be too efficient. After all I only need to run it once, albeit on a large graph. It has around 750,000 nodes and around 10 edges for each of them.
This problem is NP-hard.
The vertices which are part of this minimal infection set fall into two buckets. The first bucket is all nodes with no incoming edges. The second bucket is a minimal set of nodes that makes the graph acyclic. (If you left a cycle, then things in that cycle don't get infected. Conversely if you leave no cycles, then it is easy to prove by induction that the whole graph winds up infected.)
Very importantly, if you were able to solve this problem, then it would be easy to take the solution, remove from it the nodes which had no incoming edges, and then you're left with the minimal feedback vertex set. This would give an algorithm to solve an NP-hard problem for arbitrary graphs.
(Yes, yes, I know that the Wikipedia article says NP-complete everywhere. It is wrong. The question, "Is there a feedback vertex set of size k" is NP-complete. The question, "What is the smallest feedback vertex set" is NP-hard because, given a claimed minimal set, you don't have a polynomial algorithm to verify that no smaller set exists. That is, the decision problem is NP-complete and the optimization problem is NP-hard.)
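Since an exact answer is out of reach, here is one possible greedy heuristic built on the two buckets described above. It is a sketch of my own, not a known approximation with a guarantee: it seeds every node without incoming edges, then keeps seeding the healthy node with the most out-edges until everything is infected. The function name and the tie-breaking rule are assumptions.

```python
from collections import defaultdict, deque

def greedy_infection_seeds(nodes, edges):
    """Return a (not necessarily minimal) seed list that infects every node."""
    children = defaultdict(list)
    pending = {v: 0 for v in nodes}  # healthy in-edges still blocking each node
    for u, v in edges:
        children[u].append(v)
        pending[v] += 1

    infected, seeds = set(), []

    def infect(start):
        # Propagate from a newly infected node, Kahn-style.
        infected.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in children[u]:
                pending[v] -= 1
                if pending[v] == 0 and v not in infected:
                    infected.add(v)
                    queue.append(v)

    # Bucket 1: nodes with no incoming edges must always be seeded.
    for v in [n for n in nodes if pending[n] == 0]:
        seeds.append(v)
        infect(v)

    # Bucket 2 (heuristic): break the remaining cycles greedily.
    while len(infected) < len(nodes):
        healthy = [n for n in nodes if n not in infected]
        pick = max(healthy, key=lambda n: len(children[n]))
        seeds.append(pick)
        infect(pick)
    return seeds

EDGES = [("E", "F"), ("E", "D"), ("A", "B"), ("D", "B"), ("B", "C")]
print(greedy_infection_seeds(list("ABCDEF"), EDGES))  # ['A', 'E']
```

The linear scan for the next pick is fine for a sketch; on a graph with 750,000 nodes a priority queue keyed on out-degree would probably be needed.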
The difficulty is that all source nodes for each node must be infected to infect a node. So, the reality is that every edge must carry infection. So -
- LOOP over the roots (nodes with only out edges)
    Add root to solution
    Mark all edges reachable from root
    If every edge marked
        STOP
- LOOP over every node that has unmarked out edges
    Add node to solution
    Mark all edges reachable from node
    If every edge marked
        STOP
- LOOP over disconnected nodes
    Add node to solution
STOP
Note that this does not always give the optimal solution. It gives a successful solution that infects every node. Brute force optimization: place algorithm in loop over enumeration of node orders and keep the solution if it is improved.

Graph theory: best algorithm to find combination of edges “directions”, where each node has at most one edge directed to it

I’m dealing with a graph where there are a certain number of nodes, and there are predefined connections between them which don’t have “directions” yet.
Problem is to give all the edges a direction (ex. If there’s a connection between A And B, give this edge the A->B direction, or B->A), in a way that no node is at the receiving end of more than one edge.
Examples:
For this model (A-B-C), A->B->C works, but A->B<-C does not work, as B is at the receiving end of more than one connection. Although A<-B->C works, as B is on the giving end of both of its connections.
I’ve tried loop detection, but because these nodes can be arbitrarily connected to one another, there can be numerous loops which may or may not be directly attached to each other, and I could not find a way to make use of that information.
The number of nodes can be well into the thousands, and connections can be many hundreds in my case. This also rules out brute force.
It is not guaranteed that there will be a definite solution; the aim of the algorithm is to find a combination with the least number of connections causing nodes to have more than one edge pointing to them.
Not a complete algorithm, but given your description of the problem in the comments, I feel like these steps will probably bring the problem back into the brute-forcible range.
First, you should "trim" your graph. Any nodes of degree one should be pruned, with their connected edge being directed at the pruned node. Since no other edge can point to that node, we know that this choice is optimal. Rinse and repeat until all nodes remaining have two or more edges.
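The trimming step could look roughly like this. It is a minimal sketch that assumes a simple graph (no parallel edges or self-loops) given as a list of undirected pairs; the function name and return shape are made up for illustration.

```python
from collections import defaultdict, deque

def prune_degree_one(edges):
    """Repeatedly strip degree-one nodes, directing each stripped edge
    at the pruned node.

    Returns (directed, remaining): `directed` is a list of (tail, head)
    pairs, `remaining` is the list of edges that still need a direction.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    queue = deque(n for n in adj if len(adj[n]) == 1)
    directed = []
    while queue:
        leaf = queue.popleft()
        if len(adj[leaf]) != 1:      # already isolated by an earlier prune
            continue
        (nbr,) = adj[leaf]
        directed.append((nbr, leaf))  # edge points at the pruned node
        adj[leaf].clear()
        adj[nbr].discard(leaf)
        if len(adj[nbr]) == 1:        # neighbour may have become a leaf
            queue.append(nbr)

    remaining = [(u, v) for u, v in edges if v in adj[u]]
    return directed, remaining

# The A-B-C model from the question ends up as A <- B -> C.
print(prune_degree_one([("A", "B"), ("B", "C")]))
# ([('B', 'A'), ('B', 'C')], [])
```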
Next, as you mentioned, you should exclude any isolated nodes. You can actually extend this up to connected components of size <= 3. This is because for up to three nodes, your number of edges cannot exceed the number of nodes, so you can randomly assign one edge, and the rest will fall into place.
Now, what will remain are a bunch of large, highly-connected, connected components. You could actually do one more check and see if any of these form a single cycle (all nodes degree two) and then assign one edge randomly, but this is probably a fairly rare case. You'll probably just want to start brute forcing each of these independently. It'd probably be best to start from the nodes with the smallest number of edges first, updating the degree of nodes as you assign edges (and also pruning any degree one edges as before), backtracking as necessary.
This is a continuation of the answer by Dillon Davis.
After tree-like branches are removed, and simple cycles are resolved, the remaining graph has nodes of degree 2 or more. I propose that (for the purposes of analyzing the graph) all of the nodes of degree 2 can be removed.
Allow me to explain by example. In this example, when a node is represented by a number, that number is the degree of the node. When a node is represented by a letter, that node has degree 2. So the graph
3 - A - B - C - 4
represents a node of degree 3, connected to a chain of nodes of degree 2, connected to a node of degree 4.
The two ideal choices for this section of the graph are
3 -> A -> B -> C -> 4
3 <- A <- B <- C <- 4
These are ideal in the sense that each lettered node has exactly one incoming edge. I propose that these aren't just ideal choices, they are the only choices. Consider the first ideal solution
3 -> A -> B -> C -> 4
If node 4 has too many incoming edges, we can reduce its count by reversing the edge to C, giving
3 -> A -> B -> C <- 4
But that hasn't improved the situation, it trades "too many edges into 4" with "too many edges into C". Subsequently reversing the edge between C and B resolves C, but breaks B. Keep reversing along the chain and eventually the connection between A and 3 is reversed, and we've arrived at the second ideal solution.
Which leads me to conclude that (for the purposes of analysis)
3 - A - B - C - 4
is equivalent to
3 - 4
So how is this useful in simplifying the problem. Consider the following graph:
When nodes A and B are removed, the remaining edge connects the top node 3 to itself, so that edge can be removed. Likewise for C and D. Which leaves a graph with a single edge. Choose either direction for that edge. Then complete the solution by choosing a direction for the simple cycle A-B-3, and independently choose a direction for the simple cycle C-D-3.
Here's another example:
In this case, removing A and B creates redundant edges between the remaining nodes. After removing the redundant edges, choose either direction for the edge. The direction of that edge determines the direction of the cycle 3-A-3, and cycle 3-B-3.
I wasn't sure about adding another answer, but the answer by user3386109 gave me insight into what I believe is the complete solution, and I felt that it differs too drastically from the spirit of my original answer to include as an edit.
To recap, we have a few tools under our belt:
We can prune nodes with a single edge optimally, repeating the process to completion
We can assign a direction to any edge in a simple cycle (connected components with only nodes of degree 2) and the rest will follow (optimally).
Nodes with two edges in more complex cycles can be temporarily ignored, as their edge directions will be assigned by higher degree nodes.
After reading the last point, the problem itself becomes a bit more clear. Once we have pruned the degree one nodes in bullet one, all remaining nodes have at least two edges. We can say for certain in the optimal graph that each of these nodes will have at least one directional edge pointing to them. As proof, since each node has at least two edges, but the connected component is not a simple cycle (else it would be eliminated in bullet 2), we have more edges than nodes. If any node has zero edges directed towards it, one of those edges could be reversed to reduce the number of conflicting edges, or to "free up" another node to have zero inward edges, to then do the same.
Armed with this knowledge, we know that the minimal number of conflicts (extra edges directed at nodes that already have an edge directed at them) equals the number of edges minus the number of vertices in our pruned graph. We can also conclude that as long as we manage to direct at least one edge to each node, we'll have an optimal graph, regardless of how we scatter the conflicting edges.
Originally I tried to draft an algorithm based on bullet three to accomplish this assignment, but it turns out the answer is actually a lot simpler than that even. The only way we can accidentally create a node with no edges directed at it is by actively directing all edges away from that node. The solution is to pick a single edge in the connected component, and assign it a direction at random. Then, do a search (DFS, BFS, anything) outward from the node it's directed at, assigning directions to the edges as you go, in the direction that you traverse them. Any node you reach will have an edge directed at it (the edge you took to reach it), and the root node has the edge you manually assigned to it.
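Here is a rough sketch of that search for a single connected component. One detail is my own assumption rather than part of the description above: the starting edge is taken from outside a BFS spanning tree, which guarantees it lies on a cycle, so the search can still reach the node at its tail. If the component is a tree, no such edge exists and every edge is simply directed away from the root.

```python
from collections import defaultdict, deque

def orient_component(nodes, edges):
    """Direct the edges of one connected component so that every node on
    or attached to a cycle gets at least one incoming edge.

    `edges` is a list of undirected (u, v) pairs; the result is a list of
    (tail, head) pairs in the same order.
    """
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    def bfs_tree(root, banned=None):
        # Map each reached node to (parent, edge index) of its tree edge.
        parent_edge = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v, i in adj[u]:
                if i != banned and v not in parent_edge:
                    parent_edge[v] = (u, i)
                    queue.append(v)
        return parent_edge

    tree = bfs_tree(nodes[0])
    tree_ids = {pe[1] for pe in tree.values() if pe is not None}
    extra = next((i for i in range(len(edges)) if i not in tree_ids), None)

    directed = {}
    if extra is not None:
        x, y = edges[extra]
        directed[extra] = (x, y)            # manually assigned edge
        tree = bfs_tree(y, banned=extra)    # search outward from its head
    for v, pe in tree.items():
        if pe is not None:
            u, i = pe
            directed[i] = (u, v)            # direction of traversal
    for i, (u, v) in enumerate(edges):
        directed.setdefault(i, (u, v))      # leftovers: any direction
    return [directed[i] for i in range(len(edges))]
```

On two triangles joined by a bridge (7 edges, 6 nodes) this leaves exactly one extra incoming edge, matching the edges-minus-vertices count from the previous paragraph.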
In the end, this will produce a graph with the minimal number of extra edges directed at nodes. If you instead wish to minimize the number of nodes containing conflicting edges, solve the problem as stated above, and then form a subgraph of the nodes of degree three or more and their connecting edges. Solve for the minimal vertex cover of this subgraph, and then reverse the direction of the edges connecting nodes not in the minimal vertex cover yet containing conflicting edges, with those of the corresponding node in the minimal vertex cover.

find shortest path in a graph that compulsorily visits certain Edges while others are not compulsory to visit

I have an undirected graph with about 1000 nodes and 2000 edges, a start node and an end node. I have to traverse from the start node to the end node passing through all the compulsory edges(which are about 10) while its not necessary to traverse through all the vertices or nodes. Is there an easy solution to this like some minor changes in the existing graph traversing algorithms? How do I do it?
Thanks for help.:)
This question is different from Find the shortest path in a graph which visits certain nodes as my question is regarding compulsory edges not vertices.
EDIT: The compulsory edges can be traversed in any order.
To start with a related problem, say you have a graph G = (V, E), 10 specific edges you must traverse in a given order E' = <e1, ..., e10> ⊆ E, and start and end nodes s, v ∈ V. You need to find the shortest distance from s to v using E' in the given order.
You could do this by making 11 copies of the graph (one more than the number of compulsory edges). Start with a single copy (i.e., isomorphic to G = (V, E)), except that e1 moves from the first copy to the second copy. In the second copy (again isomorphic to G = (V, E)), have e2 move from the second copy to the third copy. Etc., until e10 moves from the 10th copy to the 11th. In the resulting graph, run any shortest-path algorithm to get from s in the first copy to v in the 11th copy.
Explanation: imagine intuitively that your graph G is drawn on a 2d sheet of paper. Photocopy it so that you have 11 copies, and stack them up to a pile of 11 papers (imagine them with a bit of space between each two, though). Now change the graphs a bit so that the only way to go up to the second sheet, from the first sheet, is through an edge e1 leading from the bottom sheet to the second sheet. The only way to go up to the third sheet, from the second sheet, is through an edge e2 leading from the second sheet to the third sheet, and so on. Your problem is to find the shortest path starting at the node corresponding to s on the bottom sheet, and ending at the node corresponding to v on the top sheet.
To solve the original problem, just repeat this with all possible permutations of E'. Note that there are 10! ~ 3.6e6 possibilities, which isn't all that much.
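A small sketch of the stacked-sheets idea follows. Rather than materialising the copies, it searches over (node, sheet) states, where the sheet number is how many compulsory edges have been used so far. I am assuming unit edge lengths (plain BFS) and an adjacency-dict input; both are my choices, and a weighted graph would need Dijkstra instead.

```python
from collections import deque
from itertools import permutations

def shortest_with_ordered_edges(adj, s, t, required):
    """Fewest steps from s to t that traverse `required` edges in order."""
    k = len(required)
    start = (s, 0)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u, layer = queue.popleft()
        if u == t and layer == k:
            return dist[(u, layer)]
        for v in adj[u]:
            nxt = layer
            # Crossing the next compulsory edge moves up one sheet.
            if layer < k and {u, v} == set(required[layer]):
                nxt = layer + 1
            state = (v, nxt)
            if state not in dist:
                dist[state] = dist[(u, layer)] + 1
                queue.append(state)
    return None

def shortest_with_compulsory_edges(adj, s, t, compulsory):
    """Try every order of the compulsory edges and keep the best result."""
    best = None
    for order in permutations(compulsory):
        d = shortest_with_ordered_edges(adj, s, t, order)
        if d is not None and (best is None or d < best):
            best = d
    return best

# Path 0-1-2-3 with a dead-end branch 1-4.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
print(shortest_with_compulsory_edges(adj, 0, 3, [(1, 4), (2, 3)]))  # 5
```

In pure Python, looping over all 10! orders on a 1000-node graph would be slow, so treat this as an illustration of the construction rather than something tuned for that size.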

What is meant by the set of all possible configuration in a given graph G

I'm trying to understand Solved Exercise 2, Chapter 3 of Algorithm Design by Kleinberg and Tardos.
But I'm not getting the idea of the answer.
In short the question is
We are given two robots located at node a & node b. The robots need to travel to node c and d respectively. The problem is if the robots get close to each other. "Let's assume the distance is r <= 1 so that if they become close to each other by one node or less" they will have an interference problem, so they won't be able to transmit data to the base station.
The answer is quite long and it does not make any sense to me or I'm not getting its idea.
Anyway I was thinking can't we just perform DFS/BFS to find a path from node a to c, & from b to d. then we modify the DFS/BFS Algorithm so that we keep checking at every movement if the robots are getting close to each other?
Since it's required to solve this problem in polynomial time, I don't think this modification to any of the algorithm "BFS/DFS" will consume a lot of time.
The solution is "From the book"
This problem can be tricky to think about if we view things at the level of the underlying graph G: for a given configuration of the robots—that is, the current location of each one—it’s not clear what rule we should be using to decide how to move one of the robots next. So instead we apply an idea that can be very useful for situations in which we’re trying to perform this type of search. We observe that our problem looks a lot like a path-finding problem, not in the original graph G but in the space of all possible configurations.
Let us define the following (larger) graph H. The node set of H is the set of all possible configurations of the robots; that is, H consists of all possible pairs of nodes in G. We join two nodes of H by an edge if they represent configurations that could be consecutive in a schedule; that is, (u,v) and (u′,v′) will be joined by an edge in H if one of the pairs u,u′ or v,v′ are equal, and the other pair corresponds to an edge in G.
Why the need for larger graph H?
What does he mean by: The node set of H is the set of all possible configurations of the robots; that is, H consists of all possible pairs of nodes in G.
And what does he mean by: We join two nodes of H by an edge if they represent configurations that could be consecutive in a schedule; that is, (u,v) and (u′,v′) will be joined by an edge in H if one of the pairs u,u′ or v,v′ are equal, and the other pair corresponds to an edge in G.?
I do not have the book, but it seems from their answer that at each step they move one robot or the other. Assuming that, H consists of all possible pairs of nodes that are more than distance r apart. The nodes in H are adjacent if they can be reached by moving one robot or the other.
There are not enough details in your proposed algorithm to say anything about it.
Anyway I was thinking can't we just perform DFS/BFS to find a path from node a to c, & from b to d. then we modify the DFS/BFS Algorithm so that we keep checking at every movement if the robots are getting close to each other?
I don't think this would be possible. What you're proposing is to calculate the full path, and afterwards check if the given path could work. If not, how would you handle the situation so that when you rerun the algorithm, it won't find that pathological path? You could exclude that from the set of possible options, but I don't think that'd be a good approach.
Suppose a path of length n, and now suppose that the pathology resides in the first step of the given path. Suppose now that this happens every time you recalculate the path. You would have to recalculate the path a lot of times just because the algorithm itself isn't aware of the restrictions needed to get to the right answer.
I think this is the point: the algorithm itself doesn't consider the problem's restrictions, and that is the main problem, because there's no easy way of correcting the given (wrong) solution.
What does he mean by: The node set of H is the set of all possible configurations of the robots; that is, H consists of all possible pairs of nodes in G.
What they mean by that is that each node in H represents each possible position of the two robots, which is the same as "all possible pairs of nodes in G".
E.g.: graph G has nodes A, B, C, D, E. H will have nodes AB, AC, AD, AE, BC, BD, BE, CD, CE, DE (consider AB = BA for further analysis).
Let the two robots be named r1 and r2, they start at nodes A and B (given info in the question), so the path will start in node AB in graph H. Next, the possibilities are:
r1 moves to a neighbor node from A
r2 moves to a neighbor node from B
(...repeat for each step until r1 and r2 each reach its destination).
All these possible positions of the two robots at the same time are the configurations the answer talks about.
And what does he mean by: We join two nodes of H by an edge if they represent configurations that could be consecutive in a schedule; that is, (u,v) and (u′,v′) will be joined by an edge in H if one of the pairs u,u′ or v,v′ are equal, and the other pair corresponds to an edge in G.?
Let's look at the possibilities from what they state here:
(u,v) and (u′,v′) will be joined by an edge in H if one of the pairs u,u′ or v,v′ are equal, and the other pair corresponds to an edge in G.
The possibilities are:
(u,v) and (u,w), where (v,w) is an edge in E. In this case r2 moves to one of the neighbors of its current node.
(u,v) and (w,v), where (u,w) is an edge in E. In this case r1 moves to one of the neighbors of its current node.
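To make the two cases concrete, here is a sketch of a BFS over H that generates the configurations on the fly instead of building H up front. Some choices are my own assumptions: configurations are ordered pairs (r1's node, r2's node), one robot moves per step, and a configuration is allowed only if the robots are more than r apart in G.

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distances from src to every reachable node of G."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def robot_schedule(adj, a, b, c, d, r=1):
    """Shortest list of configurations from (a, b) to (c, d), or None."""
    dist = {u: bfs_dist(adj, u) for u in adj}

    def allowed(u, v):
        return dist[u].get(v, float("inf")) > r

    start, goal = (a, b), (c, d)
    if not allowed(*start) or not allowed(*goal):
        return None
    prev = {start: None}
    queue = deque([start])
    while queue:
        conf = queue.popleft()
        if conf == goal:
            path = []
            while conf is not None:
                path.append(conf)
                conf = prev[conf]
            return path[::-1]
        u, v = conf
        # Case 1: r1 moves along an edge of G; case 2: r2 moves.
        neighbours = [(w, v) for w in adj[u]] + [(u, w) for w in adj[v]]
        for nxt in neighbours:
            if nxt not in prev and allowed(*nxt):
                prev[nxt] = conf
                queue.append(nxt)
    return None

# A ring of 8 nodes: robots start at 0 and 4, and must reach 2 and 6.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(len(robot_schedule(ring, 0, 4, 2, 6)) - 1)  # 4 moves
```

H has at most n^2 nodes for a graph G with n nodes, which is why the search stays polynomial.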
This solution was a bit tricky to me too at first. But after reading it several times and drawing some examples, when I finally bumped into your question, the way you separated each part of the problem then helped me to fully understand each part of the solution. So, a big thanks to you for this question!
Hope it's clearer now for anyone stuck with this problem!

Weighted graph traversal with skips

While I was in the shower today, I had a thought - How difficult would it be to write an algorithm to traverse a weighted di-graph and find the shortest path while allowed to skip a fixed number of edges s. I started thinking about even one skip, and for the brute force method it seems to multiply the problem by the number of edges in your graph, as you have to find the shortest path for each case where an edge is set to 0 cost and then compare across all graphs. I don't know if there are any algorithms that do this, but a cursory search of google didn't show any.
My first question would be for skipping the most costed edge(s), but it's also an interesting problem to examine having to find a path assuming you skip the least costed edge(s).
This is just to satisfy my curiosity, so no rush.
Thanks!
What follows is the logic of how to solve this problem. The way to solve this type of problem is to consider a graph composed of two copies of the original graph you want to traverse, which I'll describe how to create. For your sake, draw a small graph, and then draw it topologically sorted (which helps with the visualization, but is not necessary in the program.) Next, draw a copy of that graph a few inches above the original. You're in the bottom section of this graph when you have not yet used your skip, and you're in the top part when you have used your skip. Let's call the nodes in the bottom graph A1, A2, A3 ... and the nodes in the top graph B1, B2, B3 ... If, in your original graph, node 1 is connected to node 2, then your new graph has edges A1->A2, B1->B2, and a free connection, A1->B2 (with edge cost 0).
Consider the following original graph, where you start at the black node, and desire to end up at the blue node.
Your new graph will look like the following, where you again start at black and wish to go to the blue node.
At each location in the bottom half of the graph, you have not used your skip, and thus can either skip (moving to the top part of the graph) or can move normally, going to another node in the bottom graph.
You can then use any of the standard graph traversal algorithms.
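As a sketch of how that might look in code, the version below runs Dijkstra over (node, skips used) states, which is the same layered graph without building the copies explicitly. It generalises to s skips by using s + 1 layers. The adjacency format, the function name, and the assumption of non-negative weights are mine.

```python
import heapq

def shortest_path_with_skips(graph, source, target, skips=1):
    """Cheapest cost from source to target when up to `skips` edges are free.

    `graph` maps each node to a list of (neighbour, weight) pairs;
    weights are assumed to be non-negative.
    """
    inf = float("inf")
    best = {(source, 0): 0}
    heap = [(0, source, 0)]
    while heap:
        cost, u, used = heapq.heappop(heap)
        if u == target:
            return cost
        if cost > best.get((u, used), inf):
            continue  # stale heap entry
        for v, w in graph.get(u, ()):
            options = [(cost + w, used)]          # stay in the same layer
            if used < skips:
                options.append((cost, used + 1))  # free edge to the next layer
            for c, k in options:
                if c < best.get((v, k), inf):
                    best[(v, k)] = c
                    heapq.heappush(heap, (c, v, k))
    return None

g = {"A": [("B", 5), ("C", 10)], "B": [("C", 7)], "C": [("D", 1)]}
print(shortest_path_with_skips(g, "A", "D", skips=0))  # 11
print(shortest_path_with_skips(g, "A", "D", skips=1))  # 1
```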
