I'm reading Introduction to Algorithms, and for merge sort Cormen gives this recursion tree, where he says we assume that n is an exact power of 2. But I don't quite get why the time cost of the root level is cn. n is the input size, not a time cost, so this seems strange to me.
If someone wants to generate a complete binary tree, where the tree has h levels and h can be any positive integer given as input to the algorithm, what complexity does the generation lie in, and why?
A complete binary tree is a tree where all levels are full of nodes except possibly the last level. We can define the time complexity in terms of an upper bound.
If we know the height of the tree is h, then the maximum possible number of nodes in the tree is 2^h - 1.
Therefore, time complexity = O(2^h - 1).
To sell your algorithm in the market, you need tight upper bounds to prove that your algorithm is better than the others'.
A slightly tighter upper bound for this problem can be given once we know exactly how many nodes are in the tree. Let's say there are N.
Then, the time complexity = O(N).
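For concreteness, here is one possible Python sketch (the names Node and build_complete_tree are just illustrative) that builds a perfect tree of h levels. It creates exactly 2^h - 1 nodes, so generation takes O(2^h - 1) = O(N) time:

class Node:
    def __init__(self):
        self.left = None
        self.right = None

def build_complete_tree(h):
    # A tree of h levels is a root plus two subtrees of h - 1 levels,
    # so the node count satisfies T(h) = 2*T(h-1) + 1 = 2^h - 1.
    if h == 0:
        return None
    root = Node()
    root.left = build_complete_tree(h - 1)
    root.right = build_complete_tree(h - 1)
    return root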
I have an algorithm which operates on a rooted tree. It first recursively computes results for each of the root's child subtrees. It then does some work to combine them. The amount of work at the root is K^2 where K is the number of distinct values among the sizes of the subtrees.
What's the best bound on its runtime complexity? I haven't been able to construct a case in which it does more than linear work in the size of the tree.
This is governed by the Master Theorem for divide-and-conquer algorithms. For this particular case (reading between the lines of what you have described), it is mainly determined by how much work it takes at a single node to combine the results compiled for K values in the subtrees. Specifically, if it takes less than O(K) work, the cost is dominated by the cost at the lowest level and would be O(K) in total; if the work at a given level is O(K), the total work becomes O(K log(K)); and for work at a level greater than O(K), the total is dominated by the work at the highest level. We therefore have that your algorithm has a runtime complexity of O(K^2).
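Reading the question the same way, a rough Python sketch of the recursion might look like this (names are illustrative; the total cost is the sum of K^2 over all nodes):

class Node:
    def __init__(self, children=None):
        self.children = children or []

def solve(node):
    sizes = []
    total_work = 0
    for child in node.children:
        size, work = solve(child)
        sizes.append(size)
        total_work += work
    k = len(set(sizes))       # K = number of distinct child-subtree sizes
    total_work += k * k       # the combine step at this node costs K^2
    return 1 + sum(sizes), total_work

A sketch like this also makes it easy to experiment with different tree shapes when hunting for a worst case.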
I have always had this question in my head, and have never been able to connect these two concepts, so I am looking for some help in understanding logarithms in computer science with respect to Big-O notation and algorithmic time complexity. I understand logarithms as a math concept that answers the question, "what power do I need to raise this base to in order to get X?". For example, log2(16) tells us that we need to raise 2 to the 4th power to get 16. I also have a memorization-level understanding that O(log n) algorithms are faster than O(n) algorithms and other slower ones, such as exponential algorithms, and that an example of an O(log n) algorithm is searching a balanced binary search tree.
My question is a little hard to state exactly, but I think it boils down to this: why is searching a balanced BST logarithmic, what makes it logarithmic, and how do I relate mathematical logarithms to the CS use of the term? And a follow-up question: what is the difference between O(n log n) and O(log n)?
I know that is not the clearest question in the world, but if someone could help me connect these two concepts it would clear up a lot of confusion for me and take me past the point of just memorization (which I generally hate).
When you are calculating Big O notation, you are calculating the complexity of an algorithm as the problem size grows.
For example, when performing a linear search of a list, the worst possible case is that the element is at the last index or not in the list at all, meaning your search will perform N steps, where N is the number of elements in the list: O(N).
An algorithm that will always take the same amount of steps to complete regardless of problem size is O(1).
Logarithms come into play when you are cutting the problem size down as you move through an algorithm. For a BST, you start in the middle of a list. If the element to search for is smaller, you only focus on the first half of the list. If it is larger, you only focus on the second half. After only one step, you have cut your problem size in half. You continue cutting the list in half until you either find the element or cannot proceed. (Note that a binary search assumes the list is in order.)
Let's say we are looking for 0 in the list below (a BST is represented here as an ordered list):
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
We first start in the middle: 7
0 is less than 7 so we look in the first half of the list: [0,1,2,3,4,5,6]
We look in the middle of this list: 3
0 is less than 3 and our working list is now: [0,1,2]
So we look at 1. 0 is less than 1, so our list is now [0].
Given we have a working list of just one element, we are at the worst case. We either found the element, or it does not exist in the list. We were able to determine this in just four steps, looking at 7, 3, 1, and 0.
The problem size is 16 (number of elements in the list), which we represent as N.
In the worst case, we perform 4 comparisons (2^4 = 16, or log base 2 of 16 is 4).
If we took a look at a problem size of 32, we would perform only 5 comparisons (2^5 = 32, or log base 2 of 32 is 5).
Therefore, the Big O for searching a BST is O(log N) (note that we use base 2 for logarithms in CS).
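In code, that repeated halving is just a loop. Here is a minimal Python sketch of the search walked through above (one possible implementation):

def binary_search(sorted_list, target):
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2            # look at the middle element
        if sorted_list[mid] == target:
            return mid
        elif target < sorted_list[mid]:
            hi = mid - 1                # keep only the first half
        else:
            lo = mid + 1                # keep only the second half
    return -1                           # not in the list

# binary_search(list(range(16)), 0) inspects 7, 3, 1, 0: four steps.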
For O(N log N), the cost is the problem size times the calculation of its logarithm. Merge sort is an example of O(N log N) in the worst case; quicksort is O(N log N) on average (though O(N^2) in the worst case), while insertion sort is actually O(N^2).
In computer science, big O notation indicates how fast the number of operations of an algorithm grows with a given parameter n of the problem statement. In a balanced binary search tree, n can be the number of nodes in the tree. As you search through the tree, the algorithm takes a decision at each depth level. Since the number of nodes doubles at each level, the number of nodes in a full tree is n = 2^d - 1, where d is the depth of the tree. It is thus relatively intuitive that the number of decisions the algorithm takes is d - 1 = log2(n + 1) - 1. This shows that the complexity of the algorithm is of the order O(log(n)), meaning the number of operations grows like log(n). As a function, log grows slower than n; that is, as n becomes large, log(n) is smaller than n. So an algorithm of time complexity O(log(n)) will be faster than one of complexity O(n), which is itself faster than O(n log(n)).
There are 2^n leaves in a (perfect) BST, where 'n' here is the height of the tree. When you search, you follow one branch of the tree at each step, so you get logarithmic time. (The logarithm function is the inverse of the exponential function.)
Assume I have a complete binary tree up to a certain depth d. What would the time complexity be to traverse (pre-order traversal) this tree?
I am confused because I know that the number of nodes in the tree is 2^d, so wouldn't the time complexity therefore be O(2^d), because the tree is growing exponentially?
But upon researching on the internet, everyone states that traversal is O(n), where n is the number of elements (which would be 2^d in this case), not O(2^d). What am I missing?
Thanks.
n is defined as the number of nodes.
2^d is only the number of nodes when every possible node at that depth is full
i.e. the tree

    o
   / \
  o   o
 / \
o   o

only has 5 nodes, while 2^d is 8.
A complete binary tree has every level filled except possibly the last row, and the nodes in the last row are filled in from the left. You can find the definition on Wikipedia:
http://en.wikipedia.org/wiki/Binary_tree#Types_of_binary_trees
Even if you can express the time complexity as O(2^d), that's pretty useless, as it's not something you can use to compare with the time complexity of any other collection.
Expressing the time complexity as O(n), on the other hand, is very useful. It tells you exactly how the collection reacts when you increase the number of items, without your needing to know exactly how the collection is implemented, and you can compare it to the time complexity of other collections.
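To see why traversal is O(n), note that a pre-order traversal does a constant amount of work per node and visits every node exactly once, whatever the depth. A minimal Python sketch:

class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def preorder(node):
    if node is None:
        return
    print(node.value)       # visit the node itself first
    preorder(node.left)     # then the entire left subtree
    preorder(node.right)    # then the entire right subtree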
We always see that operations on a (binary search) tree have O(log n) worst-case running time because the tree height is log n. I wonder: if we are told that an algorithm has a running time that is a function of log n, e.g. m + n log n, can we conclude it must involve an (augmented) tree?
EDIT:
Thanks to your comments, I now realize that divide-and-conquer and binary trees are very similar visually/conceptually. I had never made a connection between the two. But I can think of a case where an O(log n) running time does not come from a divide-and-conquer algorithm, and the tree involved has none of the properties of a BST/AVL/red-black tree.
That's the disjoint-set data structure with Find/Union operations, whose running time is O(N + M log N), with N being the number of elements and M the number of Find operations.
Please let me know if I'm missing something, but I cannot see how divide-and-conquer comes into play here. I just see that this (disjoint-set) case involves a tree without the BST property and has a running time that is a function of log N. So my question is why I can, or cannot, generalize from this case.
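For reference, here is a minimal disjoint-set sketch (union by rank only, no path compression) showing where the log N comes from: union by rank keeps every tree's height at most log2(N), so each Find follows at most log2(N) parent links.

n = 10                       # example: 10 elements, each initially its own set
parent = list(range(n))
rank = [0] * n

def find(x):
    while parent[x] != x:    # climb to the root: at most ~log2(n) steps
        x = parent[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if rank[rx] < rank[ry]:  # attach the shorter tree under the taller one
        rx, ry = ry, rx
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1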
What you have is exactly backwards. O(lg N) generally means some sort of divide-and-conquer algorithm, and one common way of implementing divide and conquer is a binary tree. While binary trees are a substantial subset of all divide-and-conquer algorithms, they are still only a subset.
In some cases, you can transform other divide-and-conquer algorithms fairly directly into binary trees (e.g. comments on another answer have already made an attempt at claiming a binary search is similar). Just for another obvious example, however, a multiway tree (e.g. a B-tree, B+ tree or B* tree), while clearly a tree, is just as clearly not a binary tree.
Again, if you want to badly enough, you can stretch the point that a multiway tree can be represented as sort of a warped version of a binary tree. If you want to, you can probably stretch all the exceptions to the point of saying that all of them are (at least something like) binary trees. At least to me, however, all that does is make "binary tree" synonymous with "divide and conquer". In other words, all you accomplish is warping the vocabulary and essentially obliterating a term that's both distinct and useful.
No; you can also binary search a sorted array (for instance). But don't take my word for it: http://en.wikipedia.org/wiki/Binary_search_algorithm
As a counterexample:
import math

def count_log_steps(a):      # given an array 'a' of length n
    y = 0
    for x in range(int(math.log2(len(a)))):   # loop runs about log2(n) times
        y = y + 1
    return y
The run time is O(log(n)), but no tree here!
Answer is no. Binary search of a sorted array is O(log(n)).
Algorithms taking logarithmic time are commonly found in operations on binary trees.
Examples of O(log n):
Finding an item in a sorted array with binary search, or in a balanced search tree.
Looking up a value in a sorted input array by bisection.
Since O(log(n)) is only an upper bound, all O(1) algorithms, like function(a, b) { return a + b; }, satisfy the condition as well.
But I have to agree all Theta(log(n)) algorithms kinda look like tree algorithms or at least can be abstracted to a tree.
Short Answer:
Just because an algorithm has log(n) as part of its analysis does not mean that a tree is involved. For example, the following is a very simple algorithm that is O(log(n)):
// i doubles on every iteration, so the body executes about log2(n) times
for (int i = 1; i < n; i = i * 2)
    printf("hello");
As you can see, no tree was involved. John also provides a good example of how binary search can be done on a sorted array. Both take O(log(n)) time, and there are other code examples that could be created or referenced. So don't make assumptions based on the asymptotic time complexity; look at the code to know for sure.
More On Trees:
Just because an algorithm involves trees doesn't imply O(log n) either. You need to know the type of tree and how the operation affects it.
Some Examples:
Example 1)
Inserting into or searching the following unbalanced tree would be O(n).
Example 2)
Inserting into or searching the following balanced trees would both be O(log(n)).
Balanced Binary Tree:
Balanced Tree of Degree 3:
Additional Comments
If the trees you are using don't have a way to "balance", then there is a good chance that your operations will be O(n) time, not O(log n). If you use trees that are self-balancing, then inserts normally take more time, as the balancing of the tree normally occurs during the insert phase.
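To make that concrete, here is a plain (non-self-balancing) BST insert in Python; feeding it already-sorted keys produces the degenerate O(n) chain from Example 1, which is exactly the shape that self-balancing trees avoid:

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

# Sorted input degenerates into a right-leaning chain of depth n:
root = None
for key in [1, 2, 3, 4, 5]:
    root = insert(root, key)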