Is DFS classification of edges valid? - algorithm

DFS can be used to classify edges to tree, forward, back and cross.
Given the classification of edges and number of vertices, can we determine in linear complexity, is it a valid result of DFS? And if so - how?
For example, here is invalid classification: (it's impossible to get one like that, regardless of the root vertex we choose and the order of visiting children)

Apologies for a somewhat brief answer: yes, and here's an algorithm. Verify that the tree edges indeed form a tree. For every other edge, compute the leafmost common ancestor of its endpoints. For forward edges, this should be the tail. For back edges, this should be the head. For cross edges, this should be neither.
Assuming that everything looks good so far, the rootward paths from the head and tail of a cross edge arrive at the LCA via different children. In every possible DFS order that generates the given classification, the head's child is visited before the tail's child. Collect all of these constraints and verify that there is no cycle.
I claim that, together, these checks are sound and complete. Completeness is verified by linearizing the constraints with a topological sort and constructing a plausible DFS order.

Related

Algorithm to visit every node in a directed cyclic graph

As the title says, I have a graph that contains cycles and is directed. It's strongly connected so there's no danger of getting "stuck". Given a start node, I want to find the a path (ideally the shortest but that's not the thing I'm optimising for) that visits every node.
It's worth saying that many of the nodes in this graph are frequently connected both ways - i.e. it's almost undirected. I'm wondering if there's a modified DFS that might work well for this particular use case?
If not, should I be looking at the Held-Karp algortihm? The visit once and return to starting point restrictions don't apply for me.
The easiest approach would probably be to choose a root arbitrarily and compute a BFS tree on G (i.e., paths from the root to each other vertex) and a BFS tree on the transpose of G (i.e., paths from each other vertex to the root). Then for each other vertex you can navigate to and from the root by alternating tree paths. There are various quick optimizations to this method.
Another possibility would be to use A* on the search space consisting of states current node × set of visited nodes, with heuristic equal to the number of nodes not visited yet. The worst-case running time is comparable to Held–Karp (which you could also apply after running Floyd–Warshall to form a complete unsymmetric distance matrix).

Will a standard Kruskal-like approach for MST work if some edges are fixed?

The problem: you need to find the minimum spanning tree of a graph (i.e. a set S of edges in said graph such that the edges in S together with the respective vertices form a tree; additionally, from all such sets, the sum of the cost of all edges in S has to be minimal). But there's a catch. You are given an initial set of fixed edges K such that K must be included in S.
In other words, find some MST of a graph with a starting set of fixed edges included.
My approach: standard Kruskal's algorithm but before anything else join all vertices as pointed by the set of fixed edges. That is, if K = {1,2}, {4,5} I apply Kruskal's algorithm but instead of having each node in its own individual set initially, instead nodes 1 and 2 are in the same set and nodes 4 and 5 are in the same set.
The question: does this work? Is there a proof that this always yields the correct result? If not, could anyone provide a counter-example?
P.S. the problem only inquires finding ONE MST. Not interested in all of them.
Yes, it will work as long as your initial set of edges doesn't form a cycle.
Keep in mind that the resulting tree might not be minimal in weight since the edges you fixed might not be part of any MST in the graph. But you will get the lightest spanning tree which satisfies the constraint that those fixed edges are part of the tree.
How to implement it:
To implement this, you can simply change the edge-weights of the edges you need to fix. Just pick the lowest appearing edge-weight in your graph, say min_w, subtract 1 from it and assign this new weight,i.e. (min_w-1) to the edges you need to fix. Then run Kruskal on this graph.
Why it works:
Clearly Kruskal will pick all the edges you need (since these are the lightest now) before picking any other edge in the graph. When Kruskal finishes the resulting set of edges is an MST in G' (the graph where you changed some weights). Note that since you only changed the values of your fixed set of edges, the algorithm would never have made a different choice on the other edges (the ones which aren't part of your fixed set). If you think of the edges Kruskal considers, as a sorted list of edges, then changing the values of the edges you need to fix moves these edges to the front of the list, but it doesn't change the order of the other edges in the list with respect to each other.
Note: As you may notice, giving the lightest weight to your edges is basically the same thing as you suggest. But I think it is a bit easier to reason about why it works. Go with whatever you prefer.
I wouldn't recommend Prim, since this algorithm expands the spanning tree gradually from the current connected component (in the beginning one usually starts with a single node). The case where you join larger components (because your fixed edges might not all be in a single component), would be needed to handled separately - it might not be hard, but you would have to take care of it. OTOH with Kruskal you don't have to adapt anything, but simply manipulate your graph a bit before running the regular algorithm.
If I understood the question properly, Prim's algorithm would be more suitable for this, as it is possible to initialize the connected components to be exactly the edges which are required to occur in the resulting spanning tree (plus the remaining isolated nodes). The desired edges are not permitted to contain a cycle, otherwise there is no spanning tree including them.
That being said, apparently Kruskal's algorithm can also be used, as it is explicitly stated that is can be used to find an edge that connects two forests in a cost-minimal way.
Roughly speaking, as the forests of a given graph form a Matroid, the greedy approach yields the desired result (namely a weight-minimal tree) regardless of the independent set you start with.

Advantage of depth first search over breadth first search or vice versa

I have studied the two graph traversal algorithms,depth first search and breadth first search.Since both algorithms are used to solve the same problem of graph traversal I would like to know how to choose between the two.I mean is one more efficient than the other or any reason why i would choose one over the other in a particular scenario ?
Thank You
Main difference to me is somewhat theoretical. If you had an infinite sized graph then DFS would never find an element if it exists outside of the first path it chooses. It would essentially keep going down the first path and would never find the element. The BFS would eventually find the element.
If the size of the graph is finite, DFS would likely find a outlier (larger distance between root and goal) element faster where BFS would find a closer element faster. Except in the case where DFS chooses the path of the shallow element.
In general, BFS is better for problems related to finding the shortest paths or somewhat related problems. Because here you go from one node to all node that are adjacent to it and hence you effectively move from path length one to path length two and so on.
While DFS on the other end helps more in connectivity problems and also in finding cycles in graph(though I think you might be able to find cycles with a bit of modification of BFS too). Determining connectivity with DFS is trivial, if you call the explore procedure twice from the DFS procedure, then the graph is disconnected (this is for an undirected graph). You can see the strongly connected component algorithm for a directed graph here, which is a modification of DFS. Another application of the DFS is topological sorting.
These are some applications of both the algorithms:
DFS:
Connectivity
Strongly Connected Components
Topological Sorting
BFS:
Shortest Path(Dijkstra is some what of a modification of BFS).
Testing whether the graph is Bipartitie.
When traversing a multiply-connected graph, the order in which nodes are traversed may greatly influence (by many orders of magnitude) the number of nodes to be tracked by the traversing method. Some kinds of algorithms will be massively better when using breadth-first; others will be massively better when using depth-search.
At one extreme, doing a depth-first search on a binary tree with N leaf nodes requires that the traversing method keep track of lgN nodes while a breadth-first search would require keeping track of at least N/2 nodes (since it might scan all other nodes before it scans any leaf nodes; immediately prior to scanning the first leaf node, it would have encountered N/2 of the leafs' parent nodes which have to be tracked separately since none of them reference each other).
On the other extreme, doing a flood-fill on an NxN grid with a method that, if its pixel hasn't been colored yet, colors that pixel and then flood-fills its neighbors will require enqueuing O(N) pixel coordinates if using breadth-first search, but O(N^2) pixel coordinates if using depth-first. When using breadth-first search, paint will seem to "spread out", regardless of the shape to be painted; when using depth-first algorithm to paint a rectangular spiral, each line of which is straight on one side and jagged on the other (which sides should be straight and jagged depends upon the exact algorithm used), all of the straight sections will get painted before any of the jagged ones, meaning that the system must track the location of every jag separately.
For a complete/perfect tree, DFS takes a linear amount of space with respect to the depth of the tree whereas BFS takes an exponential amount of space with respect to the depth of the tree. This is because for BFS the maximum number of nodes in the queue is proportional to the number of nodes in one level of the tree. In DFS the maximum number of nodes in the stack is proportional to the depth of the tree.

How to detect if breaking an edge will make a graph disjoint?

I have a graph that starts off with a single, root node. Nodes are added one by one to the graph. At node creation time, they have to be linked either to the root node, or to another node, by a single edge. Edges can also be created and deleted (one by one, between any two nodes). Nodes can be deleted one at a time. Node and edge creation, deletion operations can happen in any arbitrary order.
OK, so here's my question: When an edge is deleted, is it possible do determine, in constant time (i.e. with an O(1) algorithm), if doing this will divide the graph into two disjoint subgraphs? If it will, then which side of the edge will the root node belong?
I'm willing to maintain, within reasonable limits, any additional data structure that can facilitate the derivation of this information.
Maybe it is not possible to do it in O(1), if so any pointers to literature will be appreciated.
Edit: The graph is a directed graph.
Edit 2: OK, maybe I can restrict the case to deletion of edges from the root node. [Edit 3: not, actually] Also, no edge lands into the root node.
To speed things up a little over the obvious O(|V|+|E|) solution, you could keep a spanning tree which is fairly easy to update as the graph is changed.
If an edge not in the spanning tree is deleted, then the graph isn't disconnected and do nothing. If an edge in the spanning tree is deleted, then you must try to find a new path between those two vertices (if you find one, use it to update the spanning tree, otherwise the graph is disconnected).
So, best case O(1), worst-case O(|V|+|E|), but fairly simple to implement anyway.
Is this a directed graph? The below assumes undirected.
What you are looking for is whether the given edge is a Bridge in the graph. I believe this can be found using a traversal looking for cycles containing that edge and would be O(|V| + |E|).
O(1) is too much to ask.
You might find that looking to maintain 2-edge connected components in dynamic graphs could be useful to you.
Eppstein et al have a paper on this: http://www.ics.uci.edu/~eppstein/pubs/EppGalIta-TR-93-20.pdf
which can maintain 2-edge connected components, in a graph of n nodes where edge insertions and deletions are allowed. It has O(sqrt(n)) time per update and O(log n) time per query.
So any time you delete, you can query in O(logn) to determine if the number of 2-edge connected components has changed. I suppose it can also tell you which component a specific node is in.
This paper is more general and applies to other graph problems, not only 2 edge connected components.
I suggest you look for bridges and dynamic 2-edge connectivity to get you started.
Hope that helps.
as said by Moron just before, you are actually looking for a Bridge in your graph.
Now a Bridge is an edge that has the described attribute and also originates and ends up in Cut Vertexes. Cut vertex is exactly what a Bridge is, but in a vertex (node) edition.
So the only way (though quite bending the initial data structure hypothesis) I can think of, to get a O(1) complexity for this, is if you first check every node in your graph if it is a Cut Vertex and then simply in constant time checking if the edge you want to delete is a attached to one of those two.
Finding if a node in a graph is a Cut Vertex takes O(m+n) where m = # edges and n= # nodes.
Cheers

How do I test whether a tree has a perfect matching in linear time?

Give a linear-time algorithm to test whether a tree has a perfect matching,
that is, a set of edges that touches each vertext of the tree exactly once.
This is from Algorithms by S. Dasgupta, and I just can't seem to nail this problem down. I know I need to use a greedy approach in some manner, but I just can't figure this out. Help?
Pseudocode is fine; once I have the idea, I can implement in any language trivially.
The algorithm has to be linear in anything. O( V + E ) is fine.
I think I have the solution. Since we know the graph is a tree, we know of the existance of leaf nodes, nodes with one edge and no children. In order for this node to be included in the perfect matching, that edge MUST exist in the final solution.
Ergo, we can find all edges connecting to a leaf node, add to the solution, and remove the touched edges from the graph. If, at the end of this process, we are left any remaining nodes untounched, there exists no perfect matching.
In the case of a "graph",
The first step of the problem should be to find the connected components.
Since every edge in the final answer connect two vertices, they belong to at most one of the connected components.
Then, the perfect matching could be found for each connected component.
Linear in what? Linear in the number of edges, keep the edges as an ordered incidence list, ie, every edge (vi, vj) in some total order. Then you can compare the two lists in O(n) of the edges.
The working algorithm would be something as follows:
For each leaf in the tree:
add edge from leaf to its parent to the solution
delete edge from leaf to its parent
delete all edges from the parent to any other vertices
delete leaf and parent from the tree
If the tree is empty then the answer is yes. Otherwise, there's no perfect matching.
I think that it's a simplified problem of finding a Hamiltonian path in a graph:
http://en.wikipedia.org/wiki/Hamiltonian_path
http://en.wikipedia.org/wiki/Hamiltonian_path_problem
I think that there are many solutions on the internet to this problem, but generally finding Hamilton cycle is a NP problem.

Resources