Breadth first search branching factor - performance

The run time of BFS is O(b^d)
b is the branching factor
d is the depth (number of levels) of the graph from the starting node.
I googled for a while, but I still don't see anyone mention how they figure out this "b".
So I know the branching factor means the "number of children each node has".
E.g., the branching factor for a binary tree is 2.
So for BFS on a graph, is b the average of the branching factors of all nodes in our graph,
or is b = MAX (the maximum branching factor over all nodes)?
Also, no matter which way we pick b, the run time still seems ambiguous.
For example, suppose our graph has 30000 nodes, only 5 nodes have 10000 branches, the remaining 29995 nodes have just 10 branches each, and we set the depth to 100.
O(b^d) doesn't seem to make sense in this case.
Can someone explain a little bit? Thank you!

The runtime that is more often quoted is that BFS is O(m + n) where m is the number of edges and n the number of nodes. This is because each vertex is processed once and each edge at most twice.
I think O(b^d) is used when using BFS on, say, brute-forcing a game of chess, where each position has a relatively constant branching factor and your engine needs to search a certain number of positions deep. For example, b is about 35 for chess and Deep Blue had a search depth of 6-8 (going up to 20).
In such cases, because the graph is relatively acyclic, b^d is roughly the same as m + n (they are equal for trees). O(b^d) is more useful as b is fixed and d is something you control.

In the graph bound O(b^d), b = MAX, since it is the worst case. Check this link from Princeton: http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Breadth-first_search.html - go to the time complexity portion.

To quote from Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig:
Time and space complexity are always considered with respect to some measure of the problem difficulty. In theoretical computer science, the typical measure is the size of the state space graph, |V| + |E|, where V is the set of vertices (nodes) of the graph and E is the set of edges (links). This is appropriate when the graph is an explicit data structure that is input to the search program. (The map of Romania is an example of this.) In AI, the graph is often represented implicitly by the initial state, actions, and transition model and is frequently infinite. For these reasons, complexity is expressed in terms of three quantities: b, the branching factor or maximum number of successors of any node; d, the depth of the shallowest goal node (i.e., the number of steps along the path from the root); and m, the maximum length of any path in the state space. Time is often measured in terms of the number of nodes generated during the search, and space in terms of the maximum number of nodes stored in memory. For the most part, we describe time and space complexity for search on a tree; for a graph, the answer depends on how “redundant” the paths in the state space are.
This should give you clear insight into the difference between O(|V|+|E|) and O(b^d).

Why is the complexity of BFS O(V+E) instead of O(E)? [duplicate]

This is a generic BFS implementation:
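A representative sketch, assuming an adjacency-list graph (the names graph, start and visited are illustrative):

```python
from collections import deque

def bfs(graph, start):
    """Generic BFS over an adjacency-list graph (dict: node -> list of neighbours)."""
    visited = {start}            # nodes that have already been enqueued
    queue = deque([start])
    order = []                   # visit order, just so the function returns something
    while queue:
        u = queue.popleft()      # each vertex is dequeued exactly once -> O(V)
        order.append(u)
        for w in graph[u]:       # inner loop: each edge is examined at most twice -> O(E)
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order
```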
For a connected graph with V nodes and E total number of edges, we know that every edge will be considered twice in the inner loop. So if the total number of iterations in the inner loop of BFS is going to be 2 * number of edges E, isn't the runtime going to be O(E) instead?
This is a case where one needs to look a little deeper at the implementation. In particular, how do I determine if a node is visited or not?
The traditional algorithm does this by coloring the vertices. All vertices are colored white at first, and they get colored black as they are visited. Thus visitation can be determined simply by looking at the color of the vertex. If you use this approach, then you have to do O(V) worth of initialization work setting the color of each vertex to white at the start.
You could manage your colors differently. You could maintain a data structure containing all visited nodes. If you did this, you could avoid the O(V) initialization cost. However, you will pay that cost elsewhere in the data structure. For example, if you stored them all in a balanced tree, each "if w is not visited" check now costs O(log V).
This obviously gives you a choice. You can have O(V+E) using the traditional coloring approach, or you can have O(E log V) by storing this information in your own data structure.
You specify a connected graph in your problem. In this case, O(V+E) == O(E) because the number of vertices can never be more than E+1. However, the time complexity of BFS is typically given with respect to an arbitrary graph, which can include a very sparse graph.
If a graph is sufficiently sparse (say, a million vertices and five edges), the cost of initialization may be great enough that you want to switch to an O(E log V) algorithm. However, such graphs are pretty rare in a practical setting. In practice, the speed of the traditional approach (giving each vertex a color) is just so blindingly fast compared to the fancier data structures that you choose the traditional coloring scheme for everything except the most extraordinarily sparse graphs.
If you maintained a dedicated color property on your vertices, with the invariant that all nodes are black between algorithm invocations, you could drop the cost to O(E) by doing each BFS twice: on the first pass you set the reached vertices to white, and on a second pass you turn them all black again. If you had a very sparse graph, this could be more efficient.
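To make the trade-off concrete, here is a minimal illustrative snippet of the two bookkeeping options (a hash set stands in for the "own data structure" case, so its per-check cost is O(1) expected rather than the O(log V) of a balanced tree):

```python
graph = {0: [1, 2], 1: [0], 2: [0]}      # tiny example graph

# Option 1: traditional coloring -- O(V) initialization up front,
# after which every "is w visited?" check is a plain O(1) lookup.
color = {v: "white" for v in graph}      # the O(V) work done before the search starts
print(color[2] == "black")               # visited check: False so far

# Option 2: keep your own visited structure -- no O(V) initialization,
# but every "if w is not visited" check pays the structure's lookup cost
# (O(1) expected for a hash set, O(log V) for a balanced search tree).
visited = set()
print(2 in visited)                      # visited check: False so far
```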
Well, let's break it up into easy pieces...
You've kept a visited array, and by looking it up you decide whether to push a node into the queue or not. Once visited, you don't push it again. So how many nodes get pushed into the queue? (Of course) V nodes, and that part's complexity is O(V).
Now, each time you take a node out of the queue, you visit all of its neighboring nodes. Following this for all V nodes, how many nodes will you come across? It's the number of edges if the graph is directed, or 2 * the number of edges if it is undirected. So that part's complexity is O(E) for a directed graph and O(2 * E) for an undirected one.
So the ultimate (i.e. total) complexity is O(V + E) or O(V + 2 * E), or generally, we may say O(V + E).
Because a graph might have fewer edges than vertices.
Consider this graph:
1 ---- 2
|
|
3 ---- 4
There are 4 vertices but only 3 edges, and in BFS you have to traverse each and every vertex. That's why the time complexity is O(V+E): it accounts for both V and E.

Shortest Path in a Directed Acyclic Graph with two types of costs

I am given a directed acyclic graph G = (V,E), which can be assumed to be topologically ordered (if needed). The edges in G have two types of costs - a nominal cost w(e) and a spiked cost p(e).
The goal is to find the shortest path from a node s to a node t which minimizes the following cost:
sum_e (w(e)) + max_e (p(e)), where the sum and maximum are taken over all edges in the path.
Standard dynamic programming methods show that this problem is solvable in O(E^2) time. Is there a more efficient way to solve it? Ideally, an O(E*polylog(E,V)) algorithm would be nice.
---- EDIT -----
This is the O(E^2) solution I found using dynamic programming.
First, order all costs p(e) in ascending order. This takes O(E log E) time.
Second, define the state space consisting of states (x,i) where x is a node in the graph and i is in 1,2,...,|E|. It represents "We are in node x, and the highest edge weight p(e) we have seen so far is the i-th largest".
Let V(x,i) be the length of the shortest path (in the classical sense) from s to x, where the highest p(e) encountered was the i-th largest. It's easy to compute V(x,i) given V(y,j) for any predecessor y of x and any j in 1,...,|E| (there are two cases to consider - either the edge y->x has the j-th largest weight, or it does not).
At every state (x,i), this computation finds the minimum of about deg(x) values. Thus the complexity is O(|E| * sum_(x\in V) deg(x)) = O(|E|^2), as each node is associated to |E| different states.
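For reference, here is an illustrative sketch of this DP, in the equivalent formulation where the second index is a threshold ("the largest p(e) seen so far is at most the i-th smallest value") rather than an exact rank. It assumes the node labels 0..n-1 already form a topological order; all names are made up for the example. With R distinct p-values the triple loop does O(E*R) work, which stays within the O(E^2) bound above.

```python
import math

def shortest_combined_cost(n, edges, s, t):
    """Sketch: min over s->t paths of sum(w) + max(p) in a topologically ordered DAG.

    n: number of nodes, labelled 0..n-1 in topological order.
    edges: list of (u, v, w, p) tuples with u appearing before v.
    """
    # Distinct spiked costs, smallest first; rank[p] = index in this order.
    ps = sorted({p for (_, _, _, p) in edges})
    rank = {p: r for r, p in enumerate(ps)}
    R = len(ps)

    INF = math.inf
    # best[x][r] = min sum of w over s->x paths that only use edges with p <= ps[r]
    best = [[INF] * R for _ in range(n)]
    for r in range(R):
        best[s][r] = 0.0

    incoming = [[] for _ in range(n)]
    for (u, v, w, p) in edges:
        incoming[v].append((u, w, rank[p]))

    for x in range(n):                        # nodes are already in topological order
        for r in range(R):
            for (y, w, er) in incoming[x]:
                if er <= r and best[y][r] + w < best[x][r]:
                    best[x][r] = best[y][r] + w

    # Combine: a path realising best[t][r] has max p at most ps[r], so
    # best[t][r] + ps[r] upper-bounds its cost; the threshold equal to the
    # optimum path's max p attains the optimum exactly.
    return min((best[t][r] + ps[r] for r in range(R) if best[t][r] < INF),
               default=INF)
```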
I don't see any way to get the complexity you want. Here's an algorithm that I think would be practical in real life.
First, reduce the graph to only vertices and edges between s and t, and do a topological sort so that you can easily find shortest paths in O(E) time.
Let W(m) be the minimum sum(w(e)) cost over paths with max(p(e)) <= m, and let P(m) be the smallest max(p(e)) among those shortest paths. The problem solution corresponds to W(m)+P(m) for some cost m. Note that we can find W(m) and P(m) simultaneously in O(E) time by finding a shortest W-cost path, using P-cost to break ties.
The relevant values for m are the p(e) costs that actually occur, so make a sorted list of those. Then use a Kruskal's algorithm variant to find the smallest m that connects s to t, and calculate P(infinity) to find the largest relevant m.
Now we have an interval [l,h] of m-values that might be the best. The best possible result in the interval is W(h)+P(l). Make a priority queue of intervals ordered by best possible result, and repeatedly remove the interval with the best possible result, and:
stop if the best possible result = an actual result W(l)+P(l) or W(h)+P(h)
stop if there are no p(e) costs between l and P(h)
stop if the difference between the best possible result and an actual result is within some acceptable tolerance; or
stop if you have exceeded some computation budget
otherwise, pick a p(e) cost c between l and P(h), find a shortest path to get W(c) and P(c), split the interval into [l,c] and [c,h], put both back in the priority queue, and repeat.
The worst case complexity to get an exact result is still O(E^2), but there are many economies and a lot of flexibility in how to stop.
This is only a 2-approximation, not an approximation scheme, but perhaps it inspires someone to come up with a better answer.
Using binary search, find the minimum spiked cost θ* such that, letting C(θ) be the minimum nominal cost of an s-t path using edges with spiked cost ≤ θ, we have C(θ*) = θ*. Every solution has either nominal or spiked cost at least as large as θ*, hence θ* leads to a 2-approximate solution.
Each test in the binary search involves running Dijkstra on the subgraph with spiked cost ≤ θ, hence this algorithm takes time O(|E| log^2 |E|), well, if you want to be technical about it and use Fibonacci heaps, O((|E| + |V| log |V|) log |E|).
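An illustrative sketch of this 2-approximation (names are made up; since only the finitely many p(e) values matter, the fixed-point condition C(θ*) = θ* is relaxed here to finding the smallest p-value θ with C(θ) ≤ θ, falling back to the largest p-value if no threshold qualifies; non-negative costs are assumed):

```python
import heapq
import math

def two_approx_cost(n, edges, s, t):
    """Sketch of the binary-search 2-approximation.

    n: number of nodes 0..n-1; edges: list of (u, v, w, p) with w, p >= 0.
    Returns an s->t path cost sum(w) + max(p) within a factor 2 of optimal.
    """
    adj = [[] for _ in range(n)]
    for u, v, w, p in edges:
        adj[u].append((v, w, p))

    def restricted_dijkstra(theta):
        """Min nominal cost s->t using only edges with p <= theta.

        Returns (nominal_sum, max_p_on_that_path), or (inf, inf) if unreachable.
        """
        dist = [math.inf] * n
        maxp = [0.0] * n
        dist[s] = 0.0
        pq = [(0.0, s)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue                       # stale queue entry
            if u == t:
                return d, maxp[u]
            for v, w, p in adj[u]:
                if p <= theta and d + w < dist[v]:
                    dist[v] = d + w
                    maxp[v] = max(maxp[u], p)
                    heapq.heappush(pq, (dist[v], v))
        return math.inf, math.inf

    if not edges:
        return 0.0 if s == t else math.inf

    thresholds = sorted({p for (_, _, _, p) in edges})
    lo, hi = 0, len(thresholds) - 1
    best = len(thresholds) - 1                 # fall back to the largest p value
    while lo <= hi:                            # O(log E) Dijkstra calls
        mid = (lo + hi) // 2
        c, _ = restricted_dijkstra(thresholds[mid])
        if c <= thresholds[mid]:
            best, hi = mid, mid - 1
        else:
            lo = mid + 1
    nominal, spike = restricted_dijkstra(thresholds[best])
    return math.inf if math.isinf(nominal) else nominal + spike
```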

Time/Space Complexity of Depth First Search

I've looked at various other StackOverflow answers and they are all different from what my lecturer has written in his slides.
Depth First Search has a time complexity of O(b^m), where b is the
maximum branching factor of the search tree and m is the maximum depth
of the state space. Terrible if m is much larger than d, but if search
tree is "bushy", may be much faster than Breadth First Search.
He goes on to say..
The space complexity is O(bm), i.e. space linear in length of action
sequence! Need only store a single path from the root to the leaf
node, along with remaining unexpanded sibling nodes for each node on
path.
Another answer on StackOverflow states that it is O(n + m).
Time Complexity: If you can access each node in O(1) time, then with a branching factor of b and max depth of m, the total number of nodes in this tree in the worst case is 1 + b + b^2 + … + b^(m-1). Using the formula for summing a geometric sequence (or even solving it ourselves) tells us that this sums to (b^m - 1)/(b - 1), resulting in total time to visit each node proportional to b^m. Hence the complexity is O(b^m).
On the other hand, if instead of using the branching factor and max depth you have the number of nodes n, then you can directly say that the complexity will be proportional to n or equal to O(n).
The other answers that you have linked in your question are similarly using different terminologies. The idea is the same everywhere. Some solutions have added the edge count too to make the answer more precise, but in general the node count is sufficient to describe the complexity.
Space Complexity: The length of longest path = m. For each node, you have to store its siblings so that when you have visited all the children, and you come back to a parent node, you can know which sibling to explore next. For m nodes down the path, you will have to store b nodes extra for each of the m nodes. That’s how you get an O(bm) space complexity.
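A small illustrative sketch that makes the O(bm) space bound visible: an iterative depth-limited DFS on an implicit tree keeps, for every level of the current path, the unexpanded siblings on its stack, so the stack never grows beyond roughly b*m entries (the toy successor function and names are made up for the example):

```python
def depth_limited_dfs(root, successors, is_goal, limit):
    """Iterative DFS on an implicit tree; the stack holds at most about b*limit nodes."""
    stack = [(root, 0)]                          # (node, depth) pairs
    max_stack = 0
    while stack:
        max_stack = max(max_stack, len(stack))
        node, depth = stack.pop()
        if is_goal(node):
            return node, max_stack
        if depth < limit:
            # Push all b children; the unexpanded siblings staying on the stack
            # are exactly the O(b*m) bookkeeping described above.
            for child in successors(node):
                stack.append((child, depth + 1))
    return None, max_stack

# Toy example: a complete tree with branching factor b = 3, nodes encoded as tuples of moves.
b, m = 3, 5
goal, peak = depth_limited_dfs(
    root=(),
    successors=lambda node: [node + (i,) for i in range(b)],
    is_goal=lambda node: node == (2,) * m,       # a leaf at depth m
    limit=m,
)
print(goal, peak)                                # peak stack size stays within about b*m
```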
The complexity is O(n + m) where n is the number of nodes in your tree, and m is the number of edges.
The reason why your teacher represents the complexity as O(b ^ m), is probably because he wants to stress the difference between Depth First Search and Breadth First Search.
If your tree has a very large amount of spread compared to its depth, and you're expecting results to be found at the leaves, then clearly DFS would make much more sense here, as it reaches leaves faster than BFS, even though they both reach the last node in the same amount of time (work).
When a tree is very deep, and non-leaves can give information about deeper nodes, BFS can detect ways to prune the search tree in order to reduce the amount of nodes necessary to find your goal. Clearly, the higher up the tree you discover you can prune a sub tree, the more nodes you can skip.
This is harder when you're using DFS, because you prioritize reaching a leaf over exploring nodes that are closer to the root.
I suppose this DFS time/space complexity is taught in an AI class but not in an algorithms class.
The DFS search tree here has a slightly different meaning:
A node is a bookkeeping data structure used to represent the search
tree. A state corresponds to a configuration of the world. ...
Furthermore, two different nodes can contain the same world state if
that state is generated via two different search paths.
Quoted from the book 'Artificial Intelligence: A Modern Approach'.
So the time/space complexity here focuses on how many nodes you visit while checking whether each is the goal state. #displayName already gave a very clear explanation.
The O(m+n) bound, on the other hand, is what you see in an algorithms class, where the focus is the algorithm itself: the graph is stored as an adjacency list and we count how nodes are discovered.

Time complexity of creating a minimal spanning tree if the number of edges is known

Suppose that the number of edges of a connected graph is known and the weight of each edge is distinct. Would it be possible to create a minimal spanning tree in linear time?
To do this we must look at each edge, and during this loop we cannot afford any searching, otherwise it would result in at least n log n time. I'm not sure how to do this without searching in the loop. It would mean that somehow we must look at each edge only once, and decide whether to include it or not based on some "static" previous values that do not involve a growing data structure.
So... let's say we keep the endpoints of the edge in question, then look at the next edge; if the next edge has the same endpoints as the previous one, compare the weights of the previous and current edge and keep the lower one. If the current edge's endpoints are not equal to the previous one's, then it is in a different component... and now I am stuck, because we cannot create a hash or array to keep track of the component vertices already added while looking through each edge in linear time.
Another approach I thought of is to find the edge with the minimal weight; since the edge weights are distinct this edge will be part of any MST. Then... I am stuck, since we cannot do this for n - 1 edges in linear time.
Any hints?
EDIT
What if we know the number of nodes, the number of edges and also that each edge weight is distinct? Say, for example, there are n nodes, n + 6 edges?
Then we would only have to find and remove the correct 7 edges, correct?
To the best of my knowledge there is no way to compute an MST faster by knowing how many edges there are in the graph and that their weights are distinct. In the worst case, you would have to look at every edge in the graph before finding the minimum-cost edge (which must be in the MST), which takes Ω(m) time. Therefore, I'll claim that any MST algorithm must take Ω(m) time in the worst case.
However, since we're already doing Ω(m) work in the worst case, we could do the following preprocessing step on any MST algorithm:
Scan over the edges and count up how many there are.
Add an epsilon value to each edge weight to ensure the edges are unique.
This can be done in O(m) time as well. Consequently, if there were a way to speed up MST computation knowing the number of edges and that the edge costs are distinct, we would just do this preprocessing step on any current MST algorithm to try to get faster performance. Since, to the best of my knowledge, no MST algorithm actually tries to do this for performance reasons, I would suspect that there isn't a (known) way to get a faster MST algorithm based on this extra knowledge.
Hope this helps!
There's a famous randomised linear-time algorithm for minimum spanning trees whose complexity is linear in the number of edges. See "A randomized linear-time algorithm to find minimum spanning trees" by Karger, Klein, and Tarjan.
The key result in the paper is their "sampling lemma": if you independently randomly select a subset of the edges, keeping each with probability p, and find the minimum spanning tree of this subgraph, then in expectation only |V|/p of the edges are lighter than the heaviest edge on the tree path connecting their endpoints.
As templatetypedef noted, you can't beat linear-time. That all edge weights are distinct is a common assumption that simplifies analysis; if anything, it makes MST algorithms run a little slower.
The fact that the number of edges (N) is known does not influence the complexity in any way. N is still a finite but unbounded variable, and each graph will have a different N. If you place an upper bound on N, say 1 million, then the complexity is O(1 million log 1 million) = O(1).
The fact that each edge has a distinct weight does not influence the program either, because it does not say anything about the graph's structure. Therefore knowledge about the current case cannot influence further processing, as we cannot predict what the graph's structure will look like in the next step.
If the number of edges is close to n, like in this case n + 6 (after the edit), we know that we only need to remove 7 edges, as every spanning tree has exactly n - 1 edges.
The cycle property says that the most expensive edge in a cycle does not belong to any minimum spanning tree (assuming all edge weights are distinct) and thus should be removed.
Now you can simply apply BFS or DFS to identify a cycle and remove the most expensive edge. So, overall, we need to run BFS 7 times. This takes 7*n time and gives us a time complexity of O(n). Again, this is only true if the number of edges is close to the number of nodes.
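An illustrative sketch of that procedure (names made up; it assumes a simple connected graph given as (u, v, weight) triples with distinct weights and nodes labelled 0..n-1). Each cycle-finding DFS is O(n + E) = O(n) here, and the loop runs |E| - (n - 1) = 7 times when |E| = n + 6:

```python
def mst_by_cycle_removal(n, edges):
    """Repeatedly find a cycle by DFS and drop its heaviest edge until n-1 edges remain."""
    edges = list(edges)

    def find_cycle(current_edges):
        """Return the edge indices of some cycle, or None if there is none."""
        adj = [[] for _ in range(n)]
        for idx, (u, v, _) in enumerate(current_edges):
            adj[u].append((v, idx))
            adj[v].append((u, idx))
        visited = [False] * n
        parent = [(-1, -1)] * n                    # node -> (parent node, edge index used)

        def dfs(u, via_edge):
            visited[u] = True
            for v, idx in adj[u]:
                if idx == via_edge:
                    continue
                if not visited[v]:
                    parent[v] = (u, idx)
                    cycle = dfs(v, idx)
                    if cycle is not None:
                        return cycle
                else:
                    # Back edge u-v closes a cycle: collect it by walking up the parents.
                    cycle, x = [idx], u
                    while x != v:
                        x, e = parent[x]
                        cycle.append(e)
                    return cycle
            return None

        for start in range(n):
            if not visited[start]:
                found = dfs(start, -1)
                if found is not None:
                    return found
        return None

    while len(edges) > n - 1:                      # 7 iterations when |E| = n + 6
        cycle = find_cycle(edges)
        heaviest = max(cycle, key=lambda idx: edges[idx][2])
        edges.pop(heaviest)                        # cycle property: not in the MST
    return edges
```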

What is the worst-case time and space complexity of a uniform-cost search algorithm?

My book (Artificial Intelligence: A Modern Approach) says that the worst-case time and space complexity of the uniform-cost search algorithm is O(b^(C*/e)), where b is the branching factor, C* is the cost of the optimal solution, and every action costs at least e. But why is this so?
First, the complexity is O(B^(C/e)) [exponential in C/e].
To understand it, think of a simple example case first:
Let G=(V,E) be a graph, with branch factor B. The graph is unweighted (w(e) = 1 for each e).
Consider finding the shortest path from S to T.
In this case, the algorithm is actually a BFS, and it will discover all nodes up to depth SOL, where SOL is the length of the shortest path, which is O(B^SOL) nodes.
For the general case the same idea holds: you need to discover all nodes up to cost C. Since every action costs at least e, those nodes lie at depth at most C/e, giving you O(B^(C/e)) total nodes that need to be explored.
The exponential factor arises because the first level (the root) has B^0 = 1 node and the second level has B nodes; from each of these you discover B more nodes, giving you B^2, and so on.
EDIT:
I missed it so far, but the title also asks for space complexity, not just time complexity. However, the answer remains the same, since uniform-cost search keeps a visited set of already-visited nodes. Since each node you discover is also added to that set, the answer remains O(B^(C/e)).
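For concreteness, a minimal uniform-cost search sketch (illustrative names; the frontier is a priority queue keyed by the path cost g, and the goal test is applied when a node is popped, which is what ties the number of expansions to the C*/e depth argument above):

```python
import heapq
from itertools import count

def uniform_cost_search(start, is_goal, successors):
    """Uniform-cost search on an implicit graph.

    successors(state) yields (next_state, step_cost) pairs with step_cost >= e > 0;
    states must be hashable. Returns (cost, state) of the cheapest goal, or None.
    """
    tie = count()                              # breaks ties so states are never compared
    frontier = [(0.0, next(tie), start)]       # priority queue keyed by path cost g
    best_g = {start: 0.0}                      # visited-set bookkeeping: best g per state
    while frontier:
        g, _, state = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue                           # stale queue entry
        if is_goal(state):                     # goal test on expansion => optimal cost
            return g, state
        for nxt, cost in successors(state):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng, next(tie), nxt))
    return None
```

Both the frontier and the best_g bookkeeping can end up holding on the order of B^(C*/e) entries, which is where the space bound comes from.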
C*/e is roughly the number of levels that have to be expanded during the search, and for each expanded node you have to look at all of its (up to b) branches, so you end up checking on the order of b^(C*/e) nodes in your search. That is your search time complexity, assuming that processing each node takes O(1).
P.S.: It's Ω(b^(C*/e)) in the worst case.

Resources