General confusion about space complexity - algorithm

I'm having trouble understanding space complexity. My general question is: how can the space complexity of an algorithm on a tree be smaller than the number of nodes in the tree? Here's a specific example:
If b is the branching factor,
d is the depth of the shallowest goal node, and
m is the maximum length of any path in the state space,
For DFS, the space complexity is supposed to be O(bm). I thought it would just always be the size of the tree? Where's the rest of the tree and how do we use the entire tree with only O(bm) space complexity?

The space complexity of an algorithm is normally separate from the space taken by the raw data.
For example, when searching a tree you might keep a stack of the nodes you descended through to reach a particular leaf. In that case the tree takes O(N) space, but the search takes (assuming a balanced tree) only O(log N) space over and above what the tree itself occupies.
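
For concreteness, here is a minimal sketch of that idea in Python, assuming a simple binary search tree with illustrative key/left/right fields (not any particular library's API):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def search_with_path(root, target):
    # Descend a BST, keeping only the ancestors of the current node.
    # The tree itself occupies O(N) space; the path stack holds at most
    # one node per level, so the extra space is O(height), i.e. O(log N)
    # for a balanced tree.
    path = []                      # auxiliary space: the descent path only
    node = root
    while node is not None:
        path.append(node)
        if target == node.key:
            return path            # root-to-target path
        node = node.left if target < node.key else node.right
    return None                    # not found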

Because space complexity measures the extra space an algorithm takes beyond the input.
Complexity, in general, is defined in terms of Turing machines: the space an algorithm takes is the number of extra cells needed for it to run. The input cells are not taken into account, and they can even be reused by the algorithm to reduce extra storage.
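
As a small illustration of counting only the extra cells: reversing an array in place reuses the input's own storage, so its auxiliary space is O(1) even though the data itself occupies O(n). A sketch:

def reverse_in_place(a):
    # O(n) time, O(1) auxiliary space: the input cells are reused and
    # only two index variables are allocated on top of the input.
    i, j = 0, len(a) - 1
    while i < j:
        a[i], a[j] = a[j], a[i]    # swap within the input, no extra buffer
        i += 1
        j -= 1
    return a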

Related

Why is the space complexity of Iterative Deepening Search O(bm)?

In the Artificial Intelligence: A Modern Approach textbook, IDS is stated to have a space complexity of O(bm), where b = branching factor and m = maximum depth of tree. What nodes does IDS store during its traversal that cause it to have O(bm) space complexity?
On Wikipedia it says the space complexity is simply the depth d of the goal, as it is essentially a depth-first search; that is what it actually says in my copy of AIMA (p. 88).
I can only imagine that the O(bm) figure assumes that, for each node on the current path, its unexpanded siblings are also kept, which works out to roughly the branching factor times the current depth. There is no need to store the higher-level nodes, as they have already been searched.
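
For what it's worth, here is a rough sketch of iterative deepening, assuming illustrative is_goal and children() members on each node. Each depth-limited pass keeps only the current path, plus the siblings being iterated at each level, which is where a bound proportional to the branching factor times the depth comes from rather than the size of the whole tree:

def depth_limited_search(node, limit):
    # Recursive depth-limited DFS: the call stack holds one frame per
    # level of the current path, and each frame iterates over at most
    # b children, giving O(b*d) space for a depth limit d.
    if node.is_goal:
        return node
    if limit == 0:
        return None
    for child in node.children():
        found = depth_limited_search(child, limit - 1)
        if found is not None:
            return found
    return None

def iterative_deepening_search(root, max_depth):
    # Run depth-limited searches with increasing limits; nothing is kept
    # between passes, so space never exceeds that of the deepest pass.
    for limit in range(max_depth + 1):
        found = depth_limited_search(root, limit)
        if found is not None:
            return found
    return None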

Inorder traversal: should call stack space be considered or not?

This question has been on my mind for many days and I'd like someone to clear it up.
Problem: Find the number of nodes in a binary tree.
Approach 1 (Iterative)
Do an inorder traversal using an explicit stack. Whenever you pop an element from the stack, increment a counter; the final count is the number of nodes in the binary tree.
Time Complexity - O(n)
Space Complexity - O(n)
Approach 2 (Recursive)
Time Complexity - O(n)
Space Complexity - O(1) or O(n)?
We can do the inorder traversal recursively, but in an interview, which approach would be better to present to the interviewer: iterative or recursive? And should I count the recursive call stack space, which brings the space complexity to O(n), or should I stick with O(1) space complexity?
Your question - "which approach would be optimal expressing to the interviewer" - can't really be answered by anyone except the interviewer themself. However, the differences between the possible approaches to this problem are worthy of discussion.
For a start, let's note that both the iterative and recursive approaches use a stack; the iterative approach has an explicit stack, but a recursive function works using a call stack which is not managed by the programmer. Therefore the auxiliary space used by either approach will be asymptotically the same, but with a lower constant for the iterative approach since it only pushes nodes to the stack, while the recursive approach pushes whole call frames, including all local variables.
Note that the auxiliary space is O(h) where h is the height of the tree, not O(n) where n is the number of nodes. This is important because the worst case will depend on whether or not the tree is balanced. For an unbalanced tree, the height h is O(n) in the worst case, whereas for a balanced tree, h is O(log n). The question doesn't specify that the tree is balanced, so there is a risk that the recursive approach will overflow the stack when the height of the tree is too large. In contrast, the iterative approach stores an explicit stack in main memory.
That's all a discussion of efficiency, but there is more to programming than algorithmic efficiency. For instance, if the tree is never going to be very large, you might prefer the recursive approach since it is much simpler to write; it takes only a few lines of very clean code. The imperative approach needs to create a stack, and push and pop from it in a loop, so the code is likely to be longer and harder to understand. Don't underestimate the value of clean, easy-to-understand code.
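
For instance, the recursive version really is just a few lines (a sketch, assuming nodes with left and right attributes):

def count_nodes(node):
    # Recursive count: O(n) time, O(h) auxiliary space on the call stack.
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)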
The other thing is that you have jumped straight to in-order traversal as the solution to the problem, but if the problem is to count the number of nodes in a binary tree, then you can traverse it in any order. Pre-order traversal is a bit simpler to implement iteratively than in-order traversal, and is just as efficient.
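
A sketch of that iterative pre-order count, under the same assumed node shape; the explicit stack never holds more than O(h) nodes at a time:

def count_nodes_iterative(root):
    # Iterative pre-order count: O(n) time, O(h) auxiliary space for the
    # explicit stack, where h is the height of the tree.
    if root is None:
        return 0
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        count += 1
        if node.right is not None:   # push children to be visited later
            stack.append(node.right)
        if node.left is not None:
            stack.append(node.left)
    return count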
Alternatively, if the data structure itself can be modified, then it is straightforward to give each node a property holding the cardinality of its subtree. The insert, delete and rebalance operations will need to be modified to maintain this property, but the extra work is O(1), and it allows the size of the tree to be computed in O(1) by simply reading the root node's cardinality property. Adding this property has other benefits, such as supporting a "find the kth node" operation in O(h) time instead of O(h + k).
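
A rough sketch of that augmentation, with an illustrative size field maintained on insert (delete and rebalancing are omitted, but they only need O(1) extra work per touched node):

class SizedNode:
    # BST node augmented with the cardinality of its subtree.
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1              # this node plus all of its descendants

def insert(node, key):
    # Plain BST insert that keeps size up to date along the search path.
    if node is None:
        return SizedNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    node.size = 1 + (node.left.size if node.left else 0) \
                  + (node.right.size if node.right else 0)
    return node

def tree_size(root):
    # O(1): just read the root's cardinality.
    return root.size if root else 0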

Range search complexity of R tree and R* tree

What is the range search complexity for the R tree and R* tree? I understand the process of range search: similar to a DFS traversal, it visits each node, and if a node's bounding box intersects the target range, the node is included in the result set. More precisely, we also need to consider the branch-and-bound strategy it uses: if a parent node's bounding box doesn't intersect the target range, then we don't visit its children at all. So the complexity should be smaller than O(n), where n is the number of nodes. I really don't know how to calculate the number of nodes given the number of leaves (or data points).
Could anybody give me an explanation here? Thank you.
Obviously, the worst case must be at least O(n) if your range is [-∞;∞] in every dimension. It may be as bad as O(n log n) then because of the tree.
Assuming the answer is a single entry, the average case probably is O(log n) - only few paths through the tree need to be followed (if you have little enough overlap).
That log is to the base of your page size (the fan-out), so it will usually not exceed 5, because you rarely want trees with more than, say, 1000^5 = 10^15 objects.
For all practical purposes, assume the runtime is simply proportional to the answer set size, O(s): selecting 2% of your data takes about twice as long as selecting 1%.
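
To make the pruning described in the question concrete, here is a rough sketch over a simplified node layout; the mbr, is_leaf, children and entries fields are illustrative, not any particular library's API:

def intersects(a, b):
    # Axis-aligned box overlap test; boxes are (xmin, ymin, xmax, ymax).
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def range_search(node, query_box):
    # Branch-and-bound range search: subtrees whose bounding box misses
    # the query are pruned, which is why the number of visited nodes is
    # usually far below n.
    results = []
    if not intersects(node.mbr, query_box):
        return results             # prune this whole subtree
    if node.is_leaf:
        for entry in node.entries:
            if intersects(entry.mbr, query_box):
                results.append(entry)
    else:
        for child in node.children:
            results.extend(range_search(child, query_box))
    return results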

Is there any algorithm that is O(n) time and necessarily uses O(n) auxiliary space?

I have noticed that problems that can be solved in linear time can be tweaked to use no more than O(1) auxiliary space. Take the Weighted Independent Set problem on path graphs: if only the total weight is required, it takes O(1) space; but if the set itself is also asked for in the solution, then it uses O(n) space, although the auxiliary space used is still O(1). Other problems that admit linear-time algorithms are the Maximum Subarray Sum problem, rotating a 1D vector by i positions, converting a BST to a sorted doubly linked list, etc.
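
For instance, the Maximum Subarray Sum problem mentioned above can be solved in O(n) time with O(1) auxiliary space; a rough sketch (Kadane's algorithm):

def max_subarray_sum(a):
    # O(n) time, O(1) auxiliary space: only two running variables are
    # kept besides the input itself.
    best = float('-inf')
    current = 0
    for x in a:
        current = max(x, current + x)  # extend the run or start a new one
        best = max(best, current)
    return best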
The Z algorithm, linear-time suffix array generation, the Burrows-Wheeler Transform, etc. all need O(n) auxiliary space.
Actually, I think even depth-first search, breadth-first search, etc. require O(n) auxiliary space in their worst cases (linked list for DFS, single-layer tree for BFS).

What is the time and space complexity of a breadth first and depth first tree traversal?

Can someone explain with an example how we can calculate the time and space complexity of both these traversal methods?
Also, how does recursive solution to depth first traversal affect the time and space complexity?
BFS:
Time complexity is O(|V|), where |V| is the number of nodes. You need to traverse all nodes.
Space complexity is O(|V|) as well - since at worst case you need to hold all vertices in the queue.
DFS:
Time complexity is again O(|V|), you need to traverse all nodes.
Space complexity - depends on the implementation; a recursive implementation can have an O(h) space complexity in the worst case, where h is the maximal depth of your tree.
Using an iterative solution with a stack is actually the same as BFS, just using a stack instead of a queue - so you get both O(|V|) time and space complexity.
(*) Note that the space complexity and time complexity are a bit different for a tree than for a general graph, because you do not need to maintain a visited set for a tree, and |E| = O(|V|), so the |E| factor is actually redundant.
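
A minimal sketch of both traversals on a tree, assuming each node carries an illustrative children list:

from collections import deque

def bfs_count(root):
    # BFS: O(|V|) time; the queue can hold an entire level, which is
    # O(|V|) nodes in the worst case.
    if root is None:
        return 0
    count = 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        count += 1
        queue.extend(node.children)
    return count

def dfs_count(node):
    # Recursive DFS: O(|V|) time; the call stack holds O(h) frames,
    # where h is the height of the tree.
    if node is None:
        return 0
    return 1 + sum(dfs_count(child) for child in node.children)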
DFS and BFS time complexity: O(n)
Because this is tree traversal, we must touch every node, making this O(n) where n is the number of nodes in the tree.
BFS space complexity: O(n)
BFS will have to store at least an entire level of the tree in the queue (sample queue implementation). With a perfect fully balanced binary tree, this would be (n/2 + 1) nodes (the very last level). In the best case (in this context), the tree is severely unbalanced and contains only one element at each level, so the space complexity is O(1). The worst case would be storing (n - 1) nodes in a fairly useless N-ary tree where all nodes except the root are located at the second level.
DFS space complexity: O(d)
Regardless of the implementation (recursive or iterative), the stack (implicit or explicit) will contain d nodes, where d is the maximum depth of the tree. With a balanced tree, this would be (log n) nodes. Worst Case for DFS will be the best case for BFS, and the Best Case for DFS will be the worst case for BFS.
There are two major factors of complexity
Time Complexity
Space complexity
Time Complexity
It is the amount of time needed to generate the nodes.
In DFS the amount of time needed depends on the depth and the branching factor. For DFS the total amount of time needed is given by:
1 + b + b^2 + b^3 + ... + b^d ≈ b^d
Thus the time complexity is O(b^d).
Space complexity
It is the amount of space or memory required to find a solution.
DFS stores only the current path it is pursuing, so the space complexity is a linear function of the depth.
The space complexity is therefore O(d).
