I have a method which checks if an array is a Heap. On every recursive call, I create 2 new recursive calls on the left subtree and the right subtree to traverse the nodes and check that their values are correct.
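For concreteness, here is a rough sketch of the kind of check being described, assuming a 0-indexed array-backed max-heap where the children of index i sit at 2i+1 and 2i+2 (the method name and layout are assumptions, not the asker's exact code):

```java
class HeapCheck {
    // Returns true if the subtree rooted at index i satisfies the max-heap property.
    static boolean isHeap(int[] a, int i) {
        if (i >= a.length) return true;                          // ran past the array: vacuously a heap
        int left = 2 * i + 1, right = 2 * i + 2;
        if (left < a.length && a[left] > a[i]) return false;     // left child violates the heap order
        if (right < a.length && a[right] > a[i]) return false;   // right child violates the heap order
        return isHeap(a, left) && isHeap(a, right);              // two recursive calls, one per subtree
    }
}
```

Each node is visited at most once, which is where the O(n) worst case comes from; the && short-circuits as soon as a violation is found, which is the early exit the best-case reasoning relies on.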
I want to calculate BigO of this. I think that in the worst case it is O(n) because if it IS a Heap, then it never stops early and needs to visit every node. I think the best case is O(3), and that would occur when it checks the very first left subtree and right subtree and both return false (not a Heap).
My question is: does this logic make sense? I think it does, but whenever I see the time complexity of recursive functions they always seem to be in some form of logarithmic time. It is almost as if there is some mysterious quality to recursive functions that nobody is explicitly stating. Why do recursive functions so often run in logarithmic time? And is my logic above valid?
Yes, it makes sense. The reason you see so many algorithms take logarithmic time is that they repeat over something and keep dividing the scope by some factor.
Yes, that makes sense. Only one of the three cases of the Master Theorem (though arguably the most interesting) has a logarithm.
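For reference, the usual statement of the three cases (regularity condition for case 3 omitted), where the recurrence has the form T(n) = a·T(n/b) + f(n):

```latex
\begin{align*}
&T(n) = a\,T(n/b) + f(n), \qquad a \ge 1,\ b > 1,\ \varepsilon > 0 \\
&\text{Case 1: } f(n) = O\!\big(n^{\log_b a - \varepsilon}\big) \;\Rightarrow\; T(n) = \Theta\!\big(n^{\log_b a}\big) \\
&\text{Case 2: } f(n) = \Theta\!\big(n^{\log_b a}\big) \;\Rightarrow\; T(n) = \Theta\!\big(n^{\log_b a}\log n\big) \\
&\text{Case 3: } f(n) = \Omega\!\big(n^{\log_b a + \varepsilon}\big) \;\Rightarrow\; T(n) = \Theta\!\big(f(n)\big)
\end{align*}
```

Only the middle case produces the extra log factor. A heap check that does constant work per node and recurses into both halves lands in case 1 (a = 2, b = 2, f(n) = O(1)), giving Θ(n).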
Related
Why should one choose recursion over iteration, when a solution has the same time complexity for both cases but better space complexity for iterative?
Here's a particular example of a case where there are extra considerations. Tree search algorithms can be defined recursively (because each subtree of a tree is a tree) or iteratively (with a stack). However, while a recursive search can work perfectly for finding the first leaf with a certain property or searching over all leaves, it does not lend itself to producing a well-behaved iterator: an object or function state that returns a leaf, and later when called again returns the next leaf, etc. In an iterative design the search stack can be stored as a static member of the object or function, but in a recursive design the call stack is lost whenever the function returns and is difficult or expensive to recreate.
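Here is a hedged sketch in Java of that iterator design (the Node class, field names, and leaf-only focus are assumptions for illustration): the explicit stack is an ordinary field, so it survives between calls to next(), which is exactly what a recursive search's call stack cannot do.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical node type, not from the original post.
class Node {
    int value;
    Node left, right;
    Node(int value, Node left, Node right) { this.value = value; this.left = left; this.right = right; }
    boolean isLeaf() { return left == null && right == null; }
}

// Iterative depth-first search whose stack persists between calls,
// so it can hand out leaves one at a time.
class LeafIterator implements Iterator<Node> {
    private final Deque<Node> stack = new ArrayDeque<>();

    LeafIterator(Node root) { if (root != null) stack.push(root); }

    @Override
    public boolean hasNext() {
        // Advance until a leaf is on top of the stack (or the stack is empty).
        while (!stack.isEmpty() && !stack.peek().isLeaf()) {
            Node n = stack.pop();
            if (n.right != null) stack.push(n.right);  // push right first so the left child is visited first
            if (n.left != null) stack.push(n.left);
        }
        return !stack.isEmpty();
    }

    @Override
    public Node next() {
        if (!hasNext()) throw new NoSuchElementException();
        return stack.pop();
    }
}
```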
Iteration is more difficult to understand in some algorithms. An algorithm that can naturally be expressed recursively may not be as easy to understand if expressed iteratively. It can also be difficult to convert a recursive algorithm into an iterative algorithm, and verifying that the algorithms are equivalent can also be difficult.
Recursion allows you to allocate additional automatic objects at each function call. The iterative alternative is to repeatedly dynamically allocate or resize memory blocks. On many platforms automatic allocation is much faster, to the point that its speed bonus outweighs the speed penalty and storage cost of recursive calls. (But some platforms don't support allocation of large amounts of automatic data, as mentioned above; it's a trade-off.)
Recursion is very beneficial when the iterative solution requires you to simulate recursion with a stack. Recursion acknowledges that the compiler already manages a stack to accomplish precisely what you need. When you start managing your own, not only are you likely re-introducing the function-call overhead you intended to avoid; you are also re-inventing a wheel (with plenty of room for bugs) that already exists in a pretty bug-free form.
Some Benefits of Recursion
Code is more elegant (compared to loops).
Recursion is very useful for backtracking over data structures like linked lists and binary search trees: the function works by calling itself, the call stack is built up by those recursive calls, and each call is chained to the one before it (see the sketch below).
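As a small illustration of that last point (the BstNode class and method name are invented for illustration), here is a recursive binary-search-tree lookup where the call stack does the bookkeeping for you:

```java
class BstNode {
    int key;
    BstNode left, right;
}

class BstSearch {
    // Returns true if key occurs somewhere in the tree rooted at node.
    static boolean contains(BstNode node, int key) {
        if (node == null) return false;         // fell off the tree: not present
        if (key == node.key) return true;       // found it
        return key < node.key
                ? contains(node.left, key)      // search the left subtree
                : contains(node.right, key);    // search the right subtree
    }
}
```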
I'm not sure about the O() complexity of these functions. My answers are in the boxes. Someone told me they are all O(n) but I don't understand why that is. Thanks.
All four are O(n) (ignoring that the two best case questions should use Ω(n)) since you must examine every node.
Consider height: you have to recursively check each subtree, terminating only once you reach the bottom of a tree. That means you're going to reach every leaf node eventually. You can't terminate early.
The same goes for balanced; you can't verify that a tree is balanced without first verifying that each subtree is balanced, which in this implementation means calling height for each subtree.
Now for the wording of the exam. Big O notation is used for worst cases because a worst case is (by definition) "bigger" than all other cases. An upper bound for the worst case is necessarily an upper bound for all cases. Similarly, a best case is by definition "smaller" than all other cases. An upper bound on the best case is mostly useless, because you can't say anything about the remaining cases.
When talking about best cases, you use Ω (big omega) notation, which provides a lower bound. Saying the best case is Ω(n) tells you that no matter how good the best case (and thus every case) is, it's no smaller than n.
For height() and balanced(), you can actually show that the best case is Ω(n) and the worst case is O(n). In that case, you can combine them and say that each is Θ(n); the upper and lower bounds match.
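For reference, the standard definitions being used here:

```latex
\begin{align*}
f(n) = O(g(n))      &\iff \exists\, c > 0,\ n_0 : f(n) \le c\,g(n) \ \text{for all } n \ge n_0 \\
f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 : f(n) \ge c\,g(n) \ \text{for all } n \ge n_0 \\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \ \text{and} \ f(n) = \Omega(g(n))
\end{align*}
```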
height()
Best case: both left and right trees are null. Therefore O(1) for a single max comparison, though technically, n = 1, so you can say O(n).
Worst case: must completely traverse both left and right trees when neither is null. O(n)
Same for balanced(), as far as I can tell.
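For concreteness, a minimal sketch of the kind of height() being discussed (the TreeNode class and the convention that an empty tree has height 0 are assumptions, not the assignment's code); balanced() is analogous in that it, too, has to reach every node:

```java
class TreeNode {
    TreeNode left, right;
}

class TreeChecks {
    // Height of the tree rooted at node; every node is visited exactly once.
    static int height(TreeNode node) {
        if (node == null) return 0;                                   // empty subtree
        return 1 + Math.max(height(node.left), height(node.right));  // must recurse into BOTH children
    }
}
```

For a tree of n nodes there is no test that lets the recursion stop before it has reached every leaf, so the cost is Θ(n) regardless of the tree's shape.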
If a recursive solution ends up calling itself consecutively, say ~N times, before going back up a level, the space efficiency is at best O(N), because each of the N calls uses up a certain amount of stack space.
Does this also imply that the time efficiency is at best O(N), because the code inside the recursive function is similar to inner-loop code that gets run ~N times?
In addition to @Ben's answer, there is also the case of "tail recursion", where the current stack frame is removed and replaced by the callee's stack frame, but only when the caller's last action is to return the result of the callee. This can result in O(n)-time functions having O(1) space when implemented in an entirely functional language (illustrated below).
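A hedged illustration of what that shape looks like (the method names are mine; note that the JVM does not perform tail-call elimination, so in Java both versions still use O(n) stack, but in a runtime that does eliminate tail calls the second one runs in O(1) space):

```java
class TailSum {
    // Not a tail call: after the recursive call returns, we still have to add n.
    static long sum(int n) {
        if (n == 0) return 0;
        return n + sum(n - 1);
    }

    // Tail call: the caller's last action is to return the callee's result as-is,
    // so the caller's frame could be reused for the callee.
    static long sumTail(int n, long acc) {
        if (n == 0) return acc;
        return sumTail(n - 1, acc + n);
    }
}
```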
No, but your observation has some truth in it. Basically, if you know that any algorithm for a given problem (recursive or otherwise; there isn't really anything that distinguishes the two, it's more a matter of style) has space complexity at least f(n), then it must have time complexity at least f(n) too, since it takes at least one step to touch each unit of memory it uses.
No, since each step of the recursive algorithm can take longer than O(1). If each step takes O(n), then the total time complexity is O(n^2) (see the sketch below).
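A small sketch of that point (the method and scenario are invented for illustration): a recursion that is only ~n calls deep, so O(n) stack space, but does O(n) work at each level, giving O(n^2) time overall.

```java
class QuadraticRecursion {
    // Sums the maximum of a[0..len-1], then of a[0..len-2], and so on.
    static long sumOfPrefixMaxima(int[] a, int len) {
        if (len == 0) return 0;
        int max = a[0];
        for (int i = 1; i < len; i++) {              // O(len) scan at this level
            max = Math.max(max, a[i]);
        }
        return max + sumOfPrefixMaxima(a, len - 1);  // one recursive call on a shorter prefix
    }
}
```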
I'm taking a Java data structures course at the moment. One of my assignments asks me to choose a data structure of my choice and write a spell-checker program. I am in the process of checking the performance of the different data structures.
I went to the API docs for TreeSet, and this is what they say:
"This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains)."
Would that include removeAll()?
How else would I be able to figure this out?
Thank you in advance.
It would not include removeAll(), but I have to disagree with polkageist's answer. It is possible that removeAll() could be executed in constant time depending on the implementation, although it seems most likely that the execution would happen in linear time.
I think N log N would only be the case if it were implemented in pretty much the worst way. If you are removing every element, there is no need to search for them: any element you have needs to be removed, so there's no need to search.
Nope. For an argument collection of size k, the worst-case upper bound of removeAll() is, of course, O(k*log n), because each of the elements contained in the argument collection has to be removed from the tree set (which requires at least searching for it), and each of these searches costs log n (sketched below).
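A hedged sketch of what that cost model amounts to (the concrete values are arbitrary): removing a k-element collection from a TreeSet is, at worst, k individual O(log n) removals.

```java
import java.util.List;
import java.util.TreeSet;

class RemoveAllCost {
    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>(List.of(1, 2, 3, 4, 5, 6, 7, 8)); // n = 8
        List<Integer> toRemove = List.of(2, 4, 6);                             // k = 3

        // Equivalent in effect to set.removeAll(toRemove):
        for (Integer x : toRemove) {   // k iterations...
            set.remove(x);             // ...each an O(log n) search-and-remove in the red-black tree
        }
        System.out.println(set);       // [1, 3, 5, 7, 8]
    }
}
```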
There is a question on an assignment that was due today (solutions have since been released) whose correct answer I don't understand. The question deals with the best-case performance of disjoint sets in the form of disjoint-set forests that use the weighted union algorithm to improve performance (the smaller of the two trees has its root connected as a child to the root of the larger tree), but without the path compression algorithm.
The question is whether the best-case performance of doing (n-1) Union operations on n singleton nodes and m >= n Find operations, in any order, is Omega(m*log n), which the solution confirms is correct like this:
There is a sequence S of n-1 Unions followed by m >= n Finds that takes Omega(m log n) time. The sequence S starts with n-1 Unions that build a tree with depth Omega(log n). Then it has m >= n Finds, each one for the deepest leaf of that tree, so each one takes Omega(log n) time.
My question is: why does that prove the lower bound Omega(m*log n) is correct? Isn't that just an isolated example of when the bound would be Omega(m*log n), which doesn't prove it for all inputs? I am certain one needs only show one counter-example when disproving a claim, but needs to prove a predicate for all possible inputs in order to prove its correctness.
In my answer, I pointed out that you could have a case where you start by joining two singleton nodes together. You then join another singleton to that 2-node tree, so that the two non-root nodes share the same parent, then another, and so on, until you have joined together all n nodes. You then have a tree in which n-1 nodes all point to the same root, which is essentially the result you would obtain with path compression. Then every Find executes in O(1) time. Thus, a sequence of (n-1) Unions and m >= n Finds ends up being Omega(n-1+m) = Omega(n+m) = Omega(m). (A sketch of this weighted-union setup follows below.)
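For concreteness, a sketch of weighted (union-by-size) union without path compression, matching the structure described here; the array representation and names are assumptions:

```java
class WeightedUnionFind {
    private final int[] parent;
    private final int[] size;

    WeightedUnionFind(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
    }

    // No path compression: just walk up to the root.
    int find(int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Union by size: the smaller tree's root becomes a child of the larger tree's root.
    void union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return;
        if (size[ra] < size[rb]) { int t = ra; ra = rb; rb = t; }
        parent[rb] = ra;
        size[ra] += size[rb];
    }
}
```

If each union attaches a fresh singleton to the ever-growing tree, every non-root node ends up as a direct child of the root, so every find is O(1); the Omega(m log n) sequence in the released solution instead unions equal-sized trees to force depth Theta(log n).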
Doesn't this imply that the Omega(m*logn) bound is not tight and the claim is, therefore, incorrect? I'm starting to wonder if I don't fully understand Big-O/Omega/Theta :/
EDIT: fixed up the question to be a little clearer
EDIT2: Here is the original question the way it was presented and the solution (it took me a little while to realize that Gambarino and the other guy are completely made up; hardcore Italian prof)
It seems I had indeed misunderstood the concept of Big-Omega. For some strange reason, I presumed Big-Omega to be equivalent to "whatever input to the function results in the best possible performance". In reality, probably unsurprising to the reader but a revelation to me, Big-Omega simply describes a lower bound on a function. That's it. Therefore, a worst-case input has both a lower and an upper bound (big-Omega and big-O), and so does the best possible input. In the case of big-Omega here, all we had to do was come up with a scenario where we pick the 'best' input given the limitations of the worst case, i.e. show that there is some input of size n that forces the algorithm to take at least m*log n steps. If such an input exists, then the lower bound is tight.