I can't decide whether to use BFS or DFS in these two situations.
Situation 1: the graph is an unbalanced, undirected, edge-weighted tree with height 40 and a minimum depth to any leaf node of 38. What is the best algorithm to find the minimal edge cost from the root to any leaf?
Situation 2: the graph is a max heap. Which algorithm is best for finding the maximum key value within each level of the heap?
For situation 1 I'm thinking DFS, because you don't have to go all the way down every branch to find the smallest one: the moment a branch's cost exceeds the best found so far, you stop.
For situation 2 I'm thinking BFS, because a BFS gets all the nodes from each level at once, which is better for comparison.
Any advice?
I am assuming that you only have a pointer to the root of the tree/heap to start off with in both cases.
The worst case time complexity for both situations regardless of whether you use BFS or DFS is O(n), where n is the number of nodes. Thus any optimizations that you may be able to come up with would be "on average" optimizations.
You are correct that DFS is likely to perform better than BFS for situation 1 for the exact reason that you have given.
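A minimal sketch of the pruning idea for situation 1. It assumes edge weights are non-negative (the question does not say so, and the pruning step is only valid under that assumption). The `Node` class and the function name are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node: children holds (edge_weight, child) pairs.
    # An empty list marks a leaf.
    children: list = field(default_factory=list)


def min_root_to_leaf_cost(root):
    """Return the smallest total edge weight from root to any leaf."""
    best = float("inf")
    stack = [(root, 0)]  # (node, cost accumulated from the root)
    while stack:
        node, cost = stack.pop()
        if cost >= best:
            # Assumes non-negative weights: going deeper cannot lower the cost.
            continue
        if not node.children:
            best = cost  # reached a leaf with a strictly better cost
            continue
        for weight, child in node.children:
            stack.append((child, cost + weight))
    return best
```

In the worst case every node is still visited, so this is an average-case improvement only, in line with the O(n) bound above.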
For situation 2, however, DFS is no slower than BFS (in theory at least), because you can simply record each node under its corresponding level and then compare all nodes in each level later. For space complexity, however, BFS would be better, because once a level is done and you move on to the next, you don't have to store any of the parent nodes. For this reason BFS can be recommended for situation 2.
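As an illustration of the BFS approach for situation 2, here is a rough sketch that processes one level at a time and keeps only the current level in the queue. `HeapNode` and `max_key_per_level` are hypothetical names, and the heap is assumed to be a pointer-based binary heap reachable from its root.

```python
from collections import deque


class HeapNode:
    # Hypothetical pointer-based binary heap node.
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def max_key_per_level(root):
    """Return a list whose i-th entry is the largest key on level i."""
    if root is None:
        return []
    result = []
    queue = deque([root])
    while queue:
        level_max = queue[0].key
        # The queue holds exactly one level at this point.
        for _ in range(len(queue)):
            node = queue.popleft()
            level_max = max(level_max, node.key)
            for child in (node.left, node.right):
                if child is not None:
                    queue.append(child)
        result.append(level_max)
    return result
```

For the heap with root 10, children 7 and 9, and grandchildren 3, 6 and 8, this yields `[10, 9, 8]`.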
People always talk about how, if there are infinitely many nodes downwards, DFS will get stuck traversing this infinitely long branch and never reach the answer in another branch.
Isn't this applicable to BFS as well? For example, if the root node has an infinite number of neighbours, wouldn't the program just spend an infinite amount of time trying to add each one to a queue?
In some cases, yes.
However, in order to have an infinite graph you basically need an implicit graph, https://en.wikipedia.org/wiki/Implicit_graph and many of them have bounded degree which avoids that problem.
Additionally, another advantage of BFS over DFS is that a path with fewer vertices is often "better" in some way. By assigning a cost to the vertices, that idea can be formulated using algorithms like Dijkstra's, which in some cases can be extended even to unbounded degrees.
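To make the bounded-degree case concrete, here is a small sketch of BFS on an infinite implicit graph. The graph itself is made up for illustration: every positive integer `n` has exactly two successors, `n + 1` and `2 * n`. The sketch assumes the target is reachable from the start; otherwise the loop never ends.

```python
from collections import deque


def neighbours(n):
    # Hypothetical implicit graph: infinite, but every node has degree 2.
    return (n + 1, 2 * n)


def bfs_distance(start, target):
    """Return the fewest steps from start to target (assumed reachable)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
```

A DFS that always follows the `2 * n` successor first would keep doubling forever on this graph, while BFS reaches any target at a finite depth after finitely many steps.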
Yes, you are right: in the second case BFS will not make any real progress. For these theoretical infinite scenarios, let's discuss all three possible cases:
If the graph has infinite nodes downwards and finite neighbors, then we should use BFS (you already explained the reason).
If the graph has infinite neighbors and finite nodes downwards, then we should use DFS: while doing a DFS for each neighbor, we can search its complete path in finite time and then move on to the next neighbor. Here, BFS would not make any real progress while searching.
If the graph has both infinite neighbors and infinite nodes downwards, then DFS and BFS cease to differ, as we are dealing with infinity on both ends.
What are the advantages and disadvantages of level-order traversal compared to depth-order traversal (in-order, pre-order, post-order)?
I think your problem is the same as Breadth-First-Search VS Depth-First-Search.
I couldn't say which is better. It depends on your application.
Here: Breadth First Vs Depth First, you can find a good explanation of the two methods (given that by level order you mean a particular kind of breadth-first search).
The pros and cons of the two are various:
If you expect to find the data near the top of the graph, then BFS will be faster, since it moves on to the next depth level only after exploring the whole level above it. If you instead expect to find the node near the bottom of the graph, DFS is better for the opposite reason.
If the graph/tree is huge, and particularly wide (nodes with many children/adjacencies at each level), the queue that BFS uses will require a lot of memory, while the stack memory of the recursive DFS calls should be considerably smaller, so DFS could be preferable. The opposite holds for very narrow graphs.
For paths in the graph, BFS will always return the shortest path (in number of edges) first, while DFS may first return a path that is not necessarily the shortest.
Both have the same worst-case time complexity (if the required node is the last one you encounter).
I am working on a graph library. It has to have a function that finds the two nodes which are most separated, i.e. the pair that maximizes the minimum number of nodes that must be traversed to reach the target node from the source node.
One naive way would be to calculate the degree of separation from each node to every other node.
The complexity of this turns out to be O(n^2).
Is there a better solution to this problem?
Use Floyd-Warshall algorithm to find all pairs shortest path. Then iterate through results and find one with the longest path.
Without any assumptions on the graph, Floyd-Warshall is the way to go.
If your graph is sparse (i.e. it has relatively few edges per node, or |E| << |N|^2), then Johnson's algorithm is likely to be faster.
With unit edge weights (which seems to be your case), a naïve approach that computes the furthest node (with BFS) for each node leads to O(|N|·|E|). This can probably be improved further, but I don't see a way right now.
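The BFS-from-every-node approach could look roughly like this. It is a sketch under assumptions the thread does not state: the graph is connected, unweighted, and stored as an adjacency dict; the function names are hypothetical.

```python
from collections import deque


def bfs_farthest(graph, source):
    """Return (node, distance) for a node farthest from source."""
    dist = {source: 0}
    queue = deque([source])
    farthest = source
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                if dist[nxt] > dist[farthest]:
                    farthest = nxt
                queue.append(nxt)
    return farthest, dist[farthest]


def most_separated_pair(graph):
    """Run one BFS per node and keep the pair with the largest distance."""
    best = (None, None, -1)
    for source in graph:
        target, d = bfs_farthest(graph, source)
        if d > best[2]:
            best = (source, target, d)
    return best
```

Each BFS costs O(|N| + |E|), so for a connected graph the total matches the O(|N|·|E|) estimate above.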
To expand on the title, I need all simple (non-cyclical) paths between all nodes in a very large undirected graph.
The most obvious optimization I can think of is that once I have calculated all the paths between a particular pair of nodes I can just reverse them all instead of recalculating when I need to go the other way.
I was looking into transitive closures and the Floyd–Warshall algorithm, but it looks like the best I could do if I went down that route would be to find only the shortest paths between all nodes.
Any ideas? Right now I'm looking at running a DFS on every node in the graph, which seems to me to be significantly less than optimal.
I don't understand the reasoning behind your idea that DFS is significantly less than optimal. In fact, DFS is clearly optimal here.
If you traverse the graph, limiting the branching only to vertices which haven't been visited in this branch so far, then the total number of nodes in the DFS tree will be equal to the number of simple paths from the starting vertex to all other vertices. As all of these paths are a part of your output, the algorithm cannot be meaningfully improved, as you can't reduce complexity below the size of the output.
There is simply no way to output a factorial amount of data in polynomial time, regardless of what the problem is or what algorithm you are using.
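For reference, the branch-limited DFS described above can be sketched as a generator. The adjacency-dict representation and the name `simple_paths_from` are assumptions for illustration. Note that Python's recursion limit would cap the path length on a very large graph, so an explicit stack may be needed there.

```python
def simple_paths_from(graph, start):
    """Yield every simple path (as a tuple) that begins at start."""
    path = [start]
    on_path = {start}  # vertices already used in the current branch

    def extend():
        for nxt in graph[path[-1]]:
            if nxt in on_path:
                continue  # revisiting would create a cycle
            path.append(nxt)
            on_path.add(nxt)
            yield tuple(path)
            yield from extend()
            path.pop()
            on_path.remove(nxt)

    yield from extend()
```

On a triangle with vertices 1, 2 and 3, starting from 1, this yields `(1, 2)`, `(1, 2, 3)`, `(1, 3)` and `(1, 3, 2)`: one DFS-tree node per simple path.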
I have studied the two graph traversal algorithms, depth-first search and breadth-first search. Since both algorithms are used to solve the same problem of graph traversal, I would like to know how to choose between the two. Is one more efficient than the other, or is there any reason why I would choose one over the other in a particular scenario?
Thank You
Main difference to me is somewhat theoretical. If you had an infinite sized graph then DFS would never find an element if it exists outside of the first path it chooses. It would essentially keep going down the first path and would never find the element. The BFS would eventually find the element.
If the size of the graph is finite, DFS would likely find an outlier element (one at a larger distance from the root) faster, whereas BFS would find a closer element faster, except in the case where DFS happens to choose the path of the shallow element first.
In general, BFS is better for problems related to finding shortest paths and similar problems, because you go from one node to all nodes adjacent to it and hence effectively move from path length one to path length two and so on.
DFS, on the other hand, helps more in connectivity problems and also in finding cycles in a graph (though I think you might be able to find cycles with a bit of modification of BFS too). Determining connectivity with DFS is trivial: if you have to call the explore procedure twice from the DFS procedure, then the graph is disconnected (this is for an undirected graph). You can see the strongly connected component algorithm for a directed graph here, which is a modification of DFS. Another application of DFS is topological sorting.
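The connectivity test can be sketched as follows, counting how many times `explore` has to be started. This is an illustrative version with an explicit stack, assuming an undirected graph stored as an adjacency dict; the names are hypothetical.

```python
def count_components(graph):
    """Count how many times explore must be started to visit every node."""
    visited = set()

    def explore(start):
        # Iterative DFS that marks everything reachable from start.
        stack = [start]
        visited.add(start)
        while stack:
            node = stack.pop()
            for nxt in graph[node]:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append(nxt)

    components = 0
    for node in graph:
        if node not in visited:
            explore(node)
            components += 1
    return components


def is_connected(graph):
    # A second call to explore means some node was unreachable from the first.
    return count_components(graph) <= 1
```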
These are some applications of both the algorithms:
DFS:
Connectivity
Strongly Connected Components
Topological Sorting
BFS:
Shortest Path (Dijkstra is somewhat of a modification of BFS).
Testing whether the graph is bipartite.
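As one example from the BFS list, a bipartite test can be written as a two-colouring by levels. This is a sketch assuming an undirected graph stored as an adjacency dict; `is_bipartite` is a hypothetical name.

```python
from collections import deque


def is_bipartite(graph):
    """Try to 2-colour the graph level by level with BFS."""
    colour = {}
    for start in graph:  # loop handles disconnected graphs
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in colour:
                    colour[nxt] = 1 - colour[node]
                    queue.append(nxt)
                elif colour[nxt] == colour[node]:
                    return False  # an edge joins two same-coloured nodes
    return True
```

A 4-cycle passes the test, while a triangle fails it because it contains an odd cycle.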
When traversing a multiply-connected graph, the order in which nodes are traversed may greatly influence (by many orders of magnitude) the number of nodes to be tracked by the traversing method. Some kinds of algorithms will be massively better when using breadth-first search; others will be massively better when using depth-first search.
At one extreme, doing a depth-first search on a binary tree with N leaf nodes requires that the traversing method keep track of lg N nodes, while a breadth-first search would require keeping track of at least N/2 nodes (since it might scan all other nodes before it scans any leaf nodes; immediately prior to scanning the first leaf node, it would have encountered N/2 of the leaves' parent nodes, which have to be tracked separately since none of them reference each other).
On the other extreme, doing a flood-fill on an NxN grid with a method that, if its pixel hasn't been colored yet, colors that pixel and then flood-fills its neighbors will require enqueuing O(N) pixel coordinates if using breadth-first search, but O(N^2) pixel coordinates if using depth-first. When using breadth-first search, paint will seem to "spread out", regardless of the shape to be painted; when using depth-first algorithm to paint a rectangular spiral, each line of which is straight on one side and jagged on the other (which sides should be straight and jagged depends upon the exact algorithm used), all of the straight sections will get painted before any of the jagged ones, meaning that the system must track the location of every jag separately.
For a complete/perfect tree, DFS takes a linear amount of space with respect to the depth of the tree whereas BFS takes an exponential amount of space with respect to the depth of the tree. This is because for BFS the maximum number of nodes in the queue is proportional to the number of nodes in one level of the tree. In DFS the maximum number of nodes in the stack is proportional to the depth of the tree.
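The difference can be measured directly. The sketch below walks a perfect binary tree that is represented implicitly by array-style indices (node `i` has children `2*i + 1` and `2*i + 2`), which is an assumption made here to avoid building the tree. It records the largest size the queue or stack ever reaches.

```python
from collections import deque


def peak_frontier(depth, breadth_first):
    """Largest number of pending nodes while traversing a perfect binary tree."""
    n = 2 ** (depth + 1) - 1  # total nodes in a perfect tree of this depth
    frontier = deque([0])
    peak = 1
    while frontier:
        # Queue behaviour for BFS, stack behaviour for DFS.
        node = frontier.popleft() if breadth_first else frontier.pop()
        for child in (2 * node + 1, 2 * node + 2):
            if child < n:
                frontier.append(child)
        peak = max(peak, len(frontier))
    return peak
```

For depth 10 the BFS queue peaks at 1024 entries (the whole bottom level), while the DFS stack peaks at 11.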