BST - running time - performance

I have this pseudo-code:

function func(BST t):
    x = MIN(t)
    for i = 1..n do:
        print x.key
        x = SUCCESSOR(x)

Now, I need to prove that its running time is Θ(n).
But I know that SUCCESSOR's running time is O(log n), and therefore the running time is O(n log n).
Where is my mistake here?
Thanks in advance...

There are two possibilities:
The claim is not true, and the running time really is O(n log n).
You know only the stated upper bound of SUCCESSOR (O(log n)), but you can deduce that when it is performed repeatedly, one call after another, the cost amortizes to Θ(1) per call. In fact, a good implementation of SUCCESSOR in a BST should have amortized Θ(1) complexity, since each node is visited at most twice during the whole execution of func.

It really depends on the implementation of your BST, but if your BST stores a 'father' (parent) pointer in each node and uses it to find the successor, it will need to traverse each edge at most twice: once when you go "down" to a node, and once when you go "up", back from it.
Since a tree has n-1 edges, at most 2(n-1) edges are read in total, and this is O(n).
Note that the worst case of a single SUCCESSOR() call is indeed O(log n), but its amortized cost is O(1) if it is implemented the way I described.
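For concreteness, here is a minimal Python sketch of this idea, assuming each node stores a parent ("father") pointer; the names Node, bst_min, successor and func are my own, not from the question:

class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.left = None
        self.right = None
        self.parent = parent  # the 'father' pointer used by successor()

def bst_min(node):
    # Leftmost node of the subtree rooted at 'node'.
    while node.left is not None:
        node = node.left
    return node

def successor(x):
    # Next key in sorted order, or None if x holds the maximum key.
    if x.right is not None:
        return bst_min(x.right)  # walk down into the right subtree
    while x.parent is not None and x is x.parent.right:
        x = x.parent  # walk back up edges that were walked down earlier
    return x.parent

def func(root):
    # Prints all n keys. Every tree edge is traversed at most twice in
    # total, so the whole loop is Theta(n), even though a single
    # successor() call can cost O(log n).
    x = bst_min(root)
    while x is not None:
        print(x.key)
        x = successor(x)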

Related

Ternary tree time complexity

I have an assignment to explain the time complexity of a ternary tree, and I find that the info on the subject on the internet is a bit contradictory, so I was hoping I could ask here to get a better understanding.
So, with each search in the tree, we move to the left or right child a logarithmic number of times, log3(n), with n being the number of strings in the tree, correct? And no matter what, we would also have to traverse down the middle child L times, where L is the length of the prefix we are searching for.
Does the running time then come out to O(log3(n) + L)? I see many people simply saying that it runs in logarithmic time, but does linear time not grow faster, and hence dominate?
Hope I'm making sense; thanks for any answers on the subject!
If the tree is balanced, then yes, any search that needs to visit only one child per iteration will run in logarithmic time.
Notice that O(log_3(n)) = O(ln(n) / ln(3)) = O(c * ln(n)) = O(ln(n)),
so the base of the logarithm does not matter. We say logarithmic time, O(log n).
Notice also that a balanced tree has a height of O(log(n)), where n is the number of nodes. So it looks like your L describes the height of the tree and is therefore also O(log n), so not linear w.r.t. n.
Does this answer your questions?
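In case a concrete picture helps, here is a rough Python sketch of a lookup in a ternary search tree as I understand the question; the class and field names are made up for illustration. Left/right steps compare a character without consuming it (the logarithmic part), while middle steps each consume one character (the L part):

class TSTNode:
    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False  # marks the last character of a stored string

def tst_search(node, s):
    # Returns True if the string s is stored in the tree.
    if not s:
        return False  # this sketch assumes a non-empty query string
    i = 0
    while node is not None:
        if s[i] < node.ch:
            node = node.lo   # compare only, no character consumed
        elif s[i] > node.ch:
            node = node.hi   # compare only, no character consumed
        else:
            i += 1           # character matched and consumed
            if i == len(s):
                return node.is_end
            node = node.eq
    return False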

Worst case Big-O runtime for heaps

You intend to run heapify on an array of n integers, in order to turn it into a heap in-place. How long will this operation take, in the worst case? (choose the tightest possible bound)
Options are:
a) O(n)
b) O(nlogn)
c) O(nlog^2n)
d) O(n^2)
I tried this out and got the following:
Since we have at most n nodes, we have O(n), and since we need to move up and compare only at most the height of the tree times, we get O(log n), thereby giving us O(n log n). But this solution is wrong.
Then I thought that maybe we don't compare only the height of the tree times, because a smaller node can be placed on the right side of the tree, forcing us to go all the way down the right side, so I marked O(n^2). And that was wrong too. Any suggestions?
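For reference, here is a minimal Python sketch of the standard bottom-up ("Floyd") construction of a max-heap, which is presumably the heapify the question has in mind. The well-known result is that it runs in O(n) worst-case time: a node at height h costs O(h) to sift down, and the heights summed over all nodes of a binary heap come to O(n).

def sift_down(a, i, n):
    # Move a[i] down until the max-heap property holds within a[:n].
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heapify(a):
    # Bottom-up construction: sift down every internal node, deepest first.
    # A node at height h costs O(h), and the heights sum to O(n), so the
    # whole construction is O(n), not O(n log n).
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)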

Doubts on finding complexity of an algorithm

So far I think I understand the basics of finding algorithm complexity:
Basic operations like reads, writes, assignments and allocations have constant complexity O(k), which can be simplified to O(1).
For loops you have to think of the worst case, i.e. the value of n for which the loop takes the longest:
The complexity is O(n) if the increment is constant, for example a variable i that starts from 0 and is increased or decreased by one at each iteration until it reaches n.
The complexity is O(log n) if the variable is multiplied or divided by a constant factor at each iteration.
The complexity is O(n^2) if there are two nested O(n) loops.
If a function contains multiple loops, the complexity of the function is that of the loop with the worst complexity.
If the value n doesn't change and you always have to iterate exactly n times, you use the Θ notation, because there is no distinct worst-case or best-case scenario.
Please correct me if anything I said so far is wrong.
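To make the loop patterns above concrete, here is a tiny Python sketch (the function names are mine) of the constant-increment, multiplicative-step, and nested-loop cases:

def linear(n):
    # Constant increment: runs n times -> O(n).
    i = 0
    while i < n:
        i += 1

def logarithmic(n):
    # Multiplicative step: i doubles each pass, about log2(n) passes -> O(log n).
    i = 1
    while i < n:
        i *= 2

def quadratic(n):
    # Two nested O(n) loops -> O(n^2).
    for i in range(n):
        for j in range(n):
            pass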
For recursive functions the complexity depends on how many recursive calls there will be in the worst-case scenario; you have to find a recurrence relation and solve it with one of the three standard methods (substitution, recursion tree, or the master theorem).
This is where the problems begin for me:
Example
Let's say I have a binary tree with this structure: each node has pointers to its left and right children and stores its own depth.
There is a function that initially takes the root and wants to perform an operation on the left child of every node at odd depth. To solve this with recursion, I check whether the node has odd depth and, if so, whether it has a left child; if yes, I perform the operation on the left child and then make the recursive call on the next node. In this case I think the complexity should be O(n), where n is the number of odd-depth nodes and the worst case is that all odd-depth nodes have a left child.
But what's the recurrence relation for a function like this?
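For what it's worth, here is a minimal Python sketch of the traversal as I read the question (the Node fields and the operation placeholder are assumptions based on the description). The recurrence it satisfies is T(n) = T(n_left) + T(n_right) + O(1), which solves to Θ(n) over the whole tree:

class Node:
    def __init__(self, depth, left=None, right=None):
        self.depth = depth
        self.left = left
        self.right = right

def operation(node):
    # Placeholder for the question's O(1) operation on a left child.
    pass

def visit(node):
    # Constant work per node plus recursion into both children, hence
    # T(n) = T(n_left) + T(n_right) + O(1), which solves to Theta(n).
    if node is None:
        return
    if node.depth % 2 == 1 and node.left is not None:
        operation(node.left)
    visit(node.left)
    visit(node.right)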

Running time complexity for binary search tree

I already know that if you try to find the item with a particular key, the worst-case running time is O(n), where n is the number of nodes. If you try to print out all the data items in order of their keys, then the worst-case running time is O(n). If you try to search for a particular data item (when you don't know the key), then the worst-case running time is O(n). However, what if the keys and data are both integers, and the input items were randomly scrambled before they were inserted? Will the worst-case running times still be the same?
In the worst case, yes. A randomly built BST with n nodes has a 2^(n-1) / n! chance of being built degenerately, which is extremely rare as n gets to any reasonable size, but still possible. In that case, a lookup might take Θ(n) time, because the search might need to descend all the way down to the deepest leaf.
On expectation, though, the tree height will be Θ(log n), so lookups will take expected O(log n) time.
The time to print a tree is independent of the shape of the tree, by the way. It's always Θ(n).
Hope this helps!
You might not be able to change the worst-case running time of a normal BST; however, if you randomize the input (in less than O(log n) time, if you're targeting O(log n) overall), then the chance of that worst case occurring becomes vanishingly small. See the mathematical analysis here.
In case you are interested in guaranteed O(log n) time, you can use balanced BSTs like red-black trees. However, the time to print will still be O(n), since you still need to visit each and every node before you can print it.
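As an illustration of the shuffling idea, here is a minimal Python sketch using plain (unbalanced) BST insertion; the class and function names are made up:

import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

def insert(root, key):
    # Plain BST insertion: O(height) per key, O(n) in the degenerate case.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def build_randomized_bst(keys):
    # Shuffling first makes a degenerate tree astronomically unlikely,
    # giving expected O(log n) height; the worst case is still O(n).
    keys = list(keys)
    random.shuffle(keys)
    root = None
    for k in keys:
        root = insert(root, k)
    return root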

Worst case running time of constructing a BST?

Could someone explain to me how the worst-case running time of constructing a BST is n^2? I asked my professor and the only feedback I received is:
"Because the tree is linear to the size of the input. The cost is 1+2+3+4+...+(n-1)."
Can someone explain this in a different way? Her explanation makes me think it's O(n)...
I think the worst case happens when the input is already sorted:
A,B,C,D,E,F,G,H.
That's why you might want to randomly permute the input sequence if applicable.
The worst-case running time is proportional to the square of the input size because the BST is unbalanced. An unbalanced BST can exhibit a degenerate structure: in the worst case, a singly linked list. Constructing this list requires each insertion to march down the full length of the growing list to reach the leaf where the new node is added.
For instance, try running the algorithm on data that is in precisely reversed order, so that each new node must become the new leftmost node of the tree.
A BST (even a balanced one!) can be constructed in linear time only if the input data is already sorted. Moreover, this is done using a special algorithm which takes advantage of the order, not by performing n insertions.
I'm guessing the 1+2+3+4+...+(n-1) insertion steps are clear (for a reverse-ordered list).
You should get comfortable with the idea that this number of steps is quadratic. Think of running the algorithm twice and counting the steps:
[1+2+3+...+(n-1)] + [1+2+3+...+(n-1)] = [1+2+3+...+(n-1)] + [(n-1)+...+3+2+1] = n+n+...+n = n(n-1)
Therefore, one run takes n(n-1)/2, i.e. roughly 0.5*n^2 steps.
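As an aside, here is a minimal Python sketch of the kind of linear-time construction from sorted input mentioned above, which recursively places the median as the subtree root; the names are made up for illustration:

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

def from_sorted(keys, lo=0, hi=None):
    # Builds a balanced BST from already-sorted keys in O(n) time:
    # each key is placed exactly once, with no search down the tree.
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(keys[mid])
    node.left = from_sorted(keys, lo, mid)
    node.right = from_sorted(keys, mid + 1, hi)
    return node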
