In the Artificial Intelligence: A Modern Approach textbook, IDS is stated to have a space complexity of O(bm), where b = branching factor and m = maximum depth of the tree. What nodes does IDS store during its traversal that cause it to have O(bm) space complexity?
Wikipedia says the space complexity is simply the depth d of the goal, since it is essentially a depth-first search; that is also what my copy of AIMA actually says (p. 88).
I can only imagine that the O(bm) figure assumes that, for every node on the current path, its generated but not-yet-expanded siblings are kept in memory, which gives roughly the branching factor times the current depth. There is no need to store the higher-level nodes, as they have already been fully searched.
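Here is a minimal sketch of the bookkeeping AIMA seems to be counting (the Node class with .children and .is_goal() is an assumption, not from the question): depth-limited DFS with an explicit stack pushes all children of every expanded node, so at any moment the stack holds up to b unexpanded siblings for each of the up-to-m levels of the current path, i.e. O(bm) nodes.

    # Depth-limited DFS driven by iterative deepening. The explicit stack
    # never holds more than b generated-but-unexpanded children per level of
    # the current path, which is the O(b*m) space figure from the book.
    # (Node, node.children and node.is_goal() are assumed names.)
    def depth_limited_dfs(root, limit):
        frontier = [(root, 0)]               # explicit stack of (node, depth)
        while frontier:
            node, depth = frontier.pop()
            if node.is_goal():
                return node
            if depth < limit:
                for child in node.children:  # up to b children pushed per expansion
                    frontier.append((child, depth + 1))
        return None

    def iterative_deepening(root, max_depth):
        for limit in range(max_depth + 1):   # re-run DFS with a growing depth limit
            result = depth_limited_dfs(root, limit)
            if result is not None:
                return result
        return None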
What is the range search complexity for an R-tree and an R*-tree? I understand the process of range search: similar to a DFS, it visits each node and, if a node's bounding box intersects the target range, includes the node in the result set. More precisely, we also need to consider the branch-and-bound strategy it uses: if a parent node doesn't intersect the target, we don't visit its children. Then the complexity should be smaller than O(n), where n is the number of nodes. I really don't know how to calculate the number of nodes given the number of leaves (or data points).
Could anybody give me an explanation here? Thank you.
Obviously, the worst case must be at least O(n) if your range is [-∞; ∞] in every dimension. It may then be as bad as O(n log n) because of the tree overhead.
Assuming the answer is a single entry, the average case is probably O(log n): only a few paths through the tree need to be followed (if you have little enough overlap).
The logarithm is to the base of your page size (the fan-out), so it will usually not exceed 5, because you never want trees with more than, say, 1000^5 = 10^15 objects.
For all practical purposes, assume the runtime complexity is simply the size of the answer set, O(s): selecting 2% of your data takes about twice as long as selecting 1%.
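For what it's worth, here is a rough sketch of the branch-and-bound range search described in the question, using a hypothetical node layout (is_leaf, internal children with .mbr bounding boxes, leaf entries with .mbr and .data); it is not the API of any particular R-tree library. Subtrees whose bounding box misses the query rectangle are never visited, which is why the cost is usually far below O(n) for small ranges and degrades towards O(n) as the range grows.

    def intersects(a, b):
        # a, b are axis-aligned rectangles ((xmin, ymin), (xmax, ymax))
        return not (a[1][0] < b[0][0] or b[1][0] < a[0][0] or
                    a[1][1] < b[0][1] or b[1][1] < a[0][1])

    def range_search(node, query, results):
        if node.is_leaf:
            for entry in node.entries:
                if intersects(entry.mbr, query):
                    results.append(entry.data)
        else:
            for child in node.children:
                if intersects(child.mbr, query):   # prune non-intersecting subtrees
                    range_search(child, query, results)
        return results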
Check if 2 tree nodes are related (i.e. ancestor-descendant)
solve it in O(1) time, with O(N) space (N = # of nodes)
pre-processing is allowed
That's it. My solution (approach) follows below. Please stop reading here if you want to think about it yourself first.
For pre-processing I decided to do a pre-order traversal (recursively visit the root first, then the children) and give a label to each node.
Let me explain the labels in detail. Each label will consist of comma-separated natural numbers like "1,2,1,4,5" - the length of this sequence equals (the depth of the node + 1). E.g. the label of the root is "1", the root's children will have labels "1,1", "1,2", "1,3", etc. Next-level nodes will have labels like "1,1,1", "1,1,2", ..., "1,2,1", "1,2,2", ...
Assume that "the order number" of a node is the "1-based index of this node" in the children list of its parent.
Common rule: node's label consists of its parent label followed by comma and "the order number" of the node.
Thus, to answer whether two nodes are related (i.e. ancestor-descendant) in O(1), I'll check whether the label of one of them is a prefix of the other's label. Though I'm not sure if such labels can be considered to occupy O(N) space.
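A minimal sketch of this labelling scheme (assuming each node just has a .children list; labels are stored as tuples rather than comma-separated strings so the prefix test is explicit):

    def assign_labels(root):
        labels = {}
        def visit(node, label):
            labels[node] = label
            for i, child in enumerate(node.children, start=1):
                visit(child, label + (i,))        # parent's label + order number
        visit(root, (1,))
        return labels

    def is_ancestor(labels, x, y):
        lx, ly = labels[x], labels[y]
        # prefix check: note this comparison costs O(depth), not O(1)
        return len(lx) < len(ly) and ly[:len(lx)] == lx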
Any criticism with fixes, or an alternative approach, is welcome.
You can do it in O(n) preprocessing time, and O(n) space, with O(1) query time, if you store the preorder number and postorder number for each vertex and use this fact:
For two given nodes x and y of a tree T, x is an ancestor of y if and only if x occurs before y in the preorder traversal of T and after y in the post-order traversal.
(From this page: http://www.cs.arizona.edu/xiss/numbering.htm)
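A short sketch of that numbering (tree given as a children dict here; the names are illustrative, not from the linked page). Preprocessing is O(n) and each ancestor query is O(1):

    def number_tree(children, root):
        pre, post = {}, {}
        counter = [0, 0]
        def visit(v):
            pre[v] = counter[0]; counter[0] += 1
            for c in children.get(v, []):
                visit(c)
            post[v] = counter[1]; counter[1] += 1
        visit(root)
        return pre, post

    def is_ancestor(pre, post, x, y):
        # x is an ancestor of y iff x comes before y in preorder
        # and after y in postorder (x == y is excluded here)
        return pre[x] < pre[y] and post[x] > post[y]

    children = {1: [2, 3]}                  # tiny example tree: 1 -> 2, 3
    pre, post = number_tree(children, 1)
    print(is_ancestor(pre, post, 1, 3))     # True
    print(is_ancestor(pre, post, 2, 3))     # False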
What you did is Theta(d) in the worst case, where d is the depth of the higher (shallower) node, and so is not O(1). The space is also not O(n).
If you consider a tree where a node deep in the tree has n/2 children (say, at the end of a path of n/2 nodes), the running time of setting the labels will be as high as O(n*n). So this labeling scheme won't work.
There are linear-time lowest common ancestor algorithms (at least offline). For instance, have a look here. You can also have a look at Tarjan's offline LCA algorithm. Please note that these articles require that you know in advance the pairs for which you will be performing the LCA. I think there are also algorithms with linear precomputation time that answer queries online, but they are very complex. For instance, there is a linear-precomputation-time algorithm for the range minimum query problem. As far as I remember, this solution passed through the LCA problem twice. The problem with the algorithm is that it has such a large constant that it requires enormous input to actually be faster than the O(n*log(n)) algorithm.
There is a much simpler approach that requires O(n*log(n)) additional memory and again answers in constant time.
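One common realization of that idea (an assumption on my part, since no reference is given) is an Euler tour of the tree plus a sparse table for range-minimum queries: O(n log n) memory after preprocessing, O(1) per LCA query, and x is an ancestor of y exactly when lca(x, y) == x. Rough sketch, with the tree given as a children dict:

    def build_lca(children, root):
        euler, depths, first = [], [], {}

        def dfs(v, d):
            first[v] = len(euler)
            euler.append(v); depths.append(d)
            for c in children.get(v, []):
                dfs(c, d + 1)
                euler.append(v); depths.append(d)   # return to v after each child

        dfs(root, 0)

        # Sparse table over the depth array: table[k][i] holds the index of
        # the minimum depth in euler[i : i + 2**k].
        n = len(euler)
        table = [list(range(n))]
        for k in range(1, n.bit_length()):
            half = 1 << (k - 1)
            prev = table[-1]
            row = []
            for i in range(n - (1 << k) + 1):
                a, b = prev[i], prev[i + half]
                row.append(a if depths[a] <= depths[b] else b)
            table.append(row)

        def lca(x, y):
            i, j = sorted((first[x], first[y]))
            k = (j - i + 1).bit_length() - 1
            a, b = table[k][i], table[k][j - (1 << k) + 1]
            return euler[a if depths[a] <= depths[b] else b]

        return lca

    children = {1: [2, 3], 2: [4]}
    lca = build_lca(children, 1)
    print(lca(4, 3) == 1)   # True: their lowest common ancestor is the root
    print(lca(2, 4) == 2)   # True: 2 is an ancestor of 4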
Hope this helps.
Can someone explain with an example how we can calculate the time and space complexity of both these traversal methods?
Also, how does recursive solution to depth first traversal affect the time and space complexity?
BFS:
Time complexity is O(|V|), where |V| is the number of nodes. You need to traverse all nodes.
Space complexity is O(|V|) as well - since in the worst case you need to hold all vertices in the queue.
DFS:
Time complexity is again O(|V|), you need to traverse all nodes.
Space complexity depends on the implementation; a recursive implementation can have an O(h) space complexity [worst case], where h is the maximal depth of your tree.
Using an iterative solution with a stack is actually the same as BFS, just using a stack instead of a queue - so you get both O(|V|) time and space complexity.
(*) Note that the space complexity and time complexity are a bit different for a tree than for a general graph, because you do not need to maintain a visited set for a tree, and |E| = O(|V|), so the |E| factor is actually redundant.
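To make the bookkeeping concrete, here is a minimal sketch of both traversals on a tree (given as a children dict; the names are just for illustration). The BFS queue can grow to a whole level of the tree, while the recursive DFS only keeps the current path on the call stack:

    from collections import deque

    def bfs(children, root):
        order, queue = [], deque([root])
        while queue:
            v = queue.popleft()
            order.append(v)
            queue.extend(children.get(v, []))   # queue holds the current frontier
        return order

    def dfs(children, v, order=None):
        if order is None:
            order = []
        order.append(v)                         # recursion depth = depth of v
        for c in children.get(v, []):
            dfs(children, c, order)
        return order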
DFS and BFS time complexity: O(n)
Because this is tree traversal, we must touch every node, making this O(n) where n is the number of nodes in the tree.
BFS space complexity: O(n)
BFS will have to store at least an entire level of the tree in the queue (sample queue implementation). With a perfect, fully balanced binary tree, this would be (n/2 + 1) nodes (the very last level). In the best case (in this context), the tree is severely unbalanced and contains only 1 element at each level, so the space complexity is O(1). The worst case would be storing (n - 1) nodes, with a fairly useless n-ary tree where all nodes except the root are located at the second level.
DFS space complexity: O(d)
Regardless of the implementation (recursive or iterative), the stack (implicit or explicit) will contain d nodes, where d is the maximum depth of the tree. With a balanced tree, this would be (log n) nodes. The worst case for DFS is the best case for BFS, and the best case for DFS is the worst case for BFS.
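A quick numeric check of those claims on a perfect binary tree with 15 nodes (4 levels), built as a children dict; the tree shape and the helper names are illustrative only:

    from collections import deque

    children = {i: [2 * i, 2 * i + 1] for i in range(1, 8)}   # nodes 1..15

    def peak_queue(root):
        queue, peak = deque([root]), 1
        while queue:
            v = queue.popleft()
            queue.extend(children.get(v, []))
            peak = max(peak, len(queue))
        return peak

    def max_depth(v, d=1):
        kids = children.get(v, [])
        return d if not kids else max(max_depth(c, d + 1) for c in kids)

    print(peak_queue(1))   # 8  -> roughly n/2 + 1 nodes sit in the BFS queue
    print(max_depth(1))    # 4  -> about log2(n) nodes sit on the DFS stack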
There are two major factors of complexity:
Time complexity
Space complexity
Time complexity
It is the amount of time needed to generate the nodes.
In DFS the amount of time needed is proportional to the depth and the branching factor. For DFS the total amount of time needed is given by
1 + b + b^2 + b^3 + ... + b^d ≈ b^d
Thus the time complexity is O(b^d).
Space complexity
It is the amount of space or memory required for getting a solution.
DFS stores only the current path it is pursuing, hence the space complexity is a linear function of the depth.
So the space complexity is given by O(d).
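A quick numeric check of that geometric-series argument, with illustrative values b = 10 and d = 5 (the last term dominates the sum):

    b, d = 10, 5
    total = sum(b**i for i in range(d + 1))   # 1 + b + b^2 + ... + b^d
    print(total)    # 111111
    print(b**d)     # 100000 -> same order of magnitude, hence O(b^d)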
I'm having trouble understanding space complexity. My general question is: how can the space complexity of an algorithm on a tree be smaller than the number of nodes in the tree? Here's a specific example:
If b is the branching factor
d is Depth of shallowest goal node and,
m is Maximum length of any path in the state space
For DFS, the space complexity is supposed to be O(bm). I thought it would just always be the size of the tree? Where's the rest of the tree and how do we use the entire tree with only O(bm) space complexity?
The space complexity of an algorithm is normally separate from the space taken by the raw data.
Just for example, in searching a tree you might keep a stack of the nodes you descended through to get to some particular leaf. In this case, the tree takes O(N) space, but the search takes (assuming a balanced tree) O(log N) space over and above what the tree itself occupies.
Because space complexity represents the extra space it takes besides the input.
Complexity, in general, is defined in terms of Turing machines. The space an algorithm takes is the extra number of cells needed for it to run. The input cells are not taken into account and can be reused by the algorithm to reduce extra storage.
Isn't it redundant to rescan n-1 levels of nodes for each iteration?
I quote from Artificial Intelligence: A Modern Approach:
Iterative deepening search may seem wasteful because states are generated multiple times. It turns out this is not too costly. The reason is that in a search tree with the same (or nearly the same) branching factor at each level, most of the nodes are in the bottom level, so it does not matter much that the upper levels are generated multiple times. In an iterative deepening search, the nodes on the bottom level (depth d) are generated once, those on the next-to-bottom level are generated twice, and so on, up to the children of the root, which are generated d times. So the total number of nodes generated in the worst case is
N(IDS) = (d)b + (d-1)b^2 + ... + (1)b^d
which gives a time complexity of O(b^d) - asymptotically the same as breadth-first search.
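Plugging illustrative values into the quoted formula (b = 10 and d = 5, a common textbook example, not taken from the quote itself) shows how small the overhead of regenerating the upper levels actually is:

    b, d = 10, 5
    n_ids = sum((d - i + 1) * b**i for i in range(1, d + 1))   # (d)b + (d-1)b^2 + ... + (1)b^d
    n_bfs = sum(b**i for i in range(1, d + 1))                 # b + b^2 + ... + b^d
    print(n_ids)   # 123450 nodes generated by iterative deepening
    print(n_bfs)   # 111110 nodes generated by a single breadth-first pass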