I am trying to understand the runtime of printing the BST keys that lie in a given range.
I tried to understand it from this example, but I could not.
I think I understand where the O(log n) is coming from: that is from walking down the BST recursively, which takes O(log n) on each side. But I am not sure about:
Where the k is coming from. Is it just the constant time it takes to print? If so, why is the runtime not O(log n) + O(k), and then you would ignore the k?
Where is the O(n) from the in-order traversal? It does not appear in this runtime.
How the runtime would change if we had several values in the range on each side of the tree. For example, what if the range started from 4?
An easier way to understand the solution is to consider the following algorithm:
Search for the minimum key greater than or equal to k1 in the BST - O(log n)
Perform an in-order traversal of the BST nodes starting from that key until we reach a node greater than k2, printing the keys along the way. Because an in-order traversal of the complete BST takes O(n) time, if there are k keys between k1 and k2, this partial traversal takes O(k) time.
The given algorithm is doing the same thing: searching for a key between k1 and k2 takes O(log n) time, whereas printing is done only for the k keys within the range [k1, k2], which is O(k). If all BST keys lie within k1 and k2, the runtime is O(log n) + O(n) = O(n), because all n keys need to be printed out.
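As a sketch, the pruned in-order traversal looks like this in Python (the `Node` class and all names are illustrative, not from the original example):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def range_keys(node, k1, k2, out=None):
    """Collect the keys in [k1, k2] with a pruned in-order traversal.
    Only O(log n) nodes outside the range are visited (along the two
    boundary paths), plus the k nodes inside it: O(log n + k) total."""
    if out is None:
        out = []
    if node is None:
        return out
    if node.key > k1:                  # keys >= k1 may lie in the left subtree
        range_keys(node.left, k1, k2, out)
    if k1 <= node.key <= k2:
        out.append(node.key)
    if node.key < k2:                  # keys <= k2 may lie in the right subtree
        range_keys(node.right, k1, k2, out)
    return out

# Balanced BST holding the keys 1..7
root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(range_keys(root, 3, 6))  # [3, 4, 5, 6]
```

When the range covers all n keys, every node is visited, which is the O(log n) + O(n) = O(n) case described above.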
I’m having trouble with this question.
Let X be a set of n keys
Let S be a set of m subsets of X
Find a way to find the maximum key of every subset in S with O(n log n) comparisons.
I know I can find a maximum in O(n) (for example with a linear scan or quickselect) and search a sorted array in O(log n) with binary search, but I'm unsure of how to proceed further. Any help would be appreciated!
If a subset is defined by the enumeration of its elements, the largest element is obtained in time proportional to the number of elements and this is optimal.
For m subsets, the total work is the total number of elements, Σni, which is still optimal.
If a subset is specified by a binary mask of length n, you can't avoid O(nm) operations.
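A sketch of the enumeration case in Python (names illustrative): one linear pass per subset, so the total comparison count is the sum of the subset sizes.

```python
def subset_maxima(subsets):
    # One pass per subset: total work is the sum of the subset sizes,
    # which is optimal for subsets given by enumeration.
    return [max(s) for s in subsets]

print(subset_maxima([[3, 1, 4], [1, 5], [9, 2, 6, 5]]))  # [4, 5, 9]
```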
Let X be a set of n keys
Let S be a set of m subsets of X
Find the maximum of every subset in S with O(n log n) comparisons.
Solution:
Construct a max-heap for each of the m subsets of X.
The maximum of each subset is then the root of its max-heap; a full heap-sort is not required, and its comparison count, O(n log n), is an upper bound on the work per subset.
So, to max-heapify the m sets and read off the maximum of each subset (the root of each max-heap), the total number of comparisons is O(mn log n); if m is a constant, this reduces to O(n log n).
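A sketch of the heap-based approach in Python. `heapq` implements a min-heap, so values are negated to simulate a max-heap; note that heapify alone already places the maximum at the root (all names are illustrative):

```python
import heapq

def subset_maxima_by_heap(subsets):
    maxima = []
    for s in subsets:
        heap = [-x for x in s]   # negate: heapq is a min-heap
        heapq.heapify(heap)      # O(len(s)) comparisons
        maxima.append(-heap[0])  # the root of the simulated max-heap
    return maxima

print(subset_maxima_by_heap([[3, 1, 4], [1, 5], [9, 2, 6, 5]]))  # [4, 5, 9]
```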
I have always had this question in my head and have never been able to connect these two concepts, so I am looking for some help in understanding logarithms in computer science with respect to Big-O notation and algorithmic time complexity. I understand logarithms as a math concept that answers the question, "what power do I need to raise this base to in order to get X?". For example, log2(16) tells us that we need to raise 2 to the 4th power to get 16. I also have a memorization-level understanding that O(log n) algorithms are faster than O(n) and slower classes such as exponential ones, and that searching a balanced binary search tree is an example of an O(log n) algorithm.
My question is a little hard to state exactly, but I think it boils down to: why is searching a balanced BST logarithmic, what makes it logarithmic, and how do I relate mathematical logarithms to the CS use of the term? A follow-up question: what is the difference between O(n log n) and O(log n)?
I know that is not the clearest question in the world, but if someone could help me connect these two concepts it would clear up a lot of confusion for me and take me past the point of just memorization (which I generally hate).
When you are calculating Big O notation, you are calculating the complexity of an algorithm as the problem size grows.
For example, when performing a linear search of a list, the worst possible case is that the element is either in the last index, or not in the list at all, meaning your search will perform N steps, with N being the number of elements in the list. O(N).
An algorithm that will always take the same amount of steps to complete regardless of problem size is O(1).
Logarithms come into play when you are cutting the problem size down as you move through an algorithm. Searching a balanced BST works like a binary search on a sorted list: you start in the middle. If the element you are searching for is smaller, you focus only on the first half of the list; if it is larger, only on the second half. After just one step you have cut your problem size in half, and you keep halving until you either find the element or cannot proceed. (Note that binary search assumes the list is in order.)
Let's say we are looking for 0 in the list below (the BST is represented here as an ordered list):
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
We first start in the middle: 7
0 is less than 7 so we look in the first half of the list: [0,1,2,3,4,5,6]
We look in the middle of this list: 3
0 is less than 3 and our working list is now: [0,1,2]
So we look at 1. 0 is less than 1, so our list is now [0].
Given we have a working list of just 1 element, we are at the worst case. We either found the element, or it does not exist in the list. We were able to determine this in just four steps, looking at 7,3,1, and 0.
The problem size is 16 (number of elements in the list), which we represent as N.
In the worst case, we perform 4 comparisons (2^4 = 16, or log base 2 of 16 is 4).
If we had a problem size of 32, we would perform only 5 comparisons (2^5 = 32, or log base 2 of 32 is 5).
Therefore, the Big O for searching a BST is O(log N) (note that we use base-2 logarithms in CS).
For O(N log N), the worst case is the problem size times the calculation of its logarithm. Heap sort, merge sort, and quick sort (on average) are all examples of O(N log N) algorithms.
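The halving above can be sketched with a comparison counter (illustrative Python, searching the same 16-element list from the example):

```python
def binary_search(sorted_list, target):
    """Return (index or None, number of comparisons made).
    Each step halves the remaining range, so the count grows like log2(n)."""
    lo, hi = 0, len(sorted_list) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_list[mid] == target:
            return mid, steps
        if target < sorted_list[mid]:
            hi = mid - 1   # keep the first half
        else:
            lo = mid + 1   # keep the second half
    return None, steps

print(binary_search(list(range(16)), 0))  # (0, 4) -- the 4 steps look at 7, 3, 1, 0
print(binary_search(list(range(32)), 0))  # (0, 5)
```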
In computer science, Big O notation indicates how fast the number of operations of an algorithm grows with a given parameter n of the problem statement. In a balanced binary search tree, n can be the number of nodes in the tree.
As you search through the tree, the algorithm takes a decision at each depth level. Since the number of nodes doubles at each level, a complete tree of depth d has n = 2^d - 1 nodes, so the number of decisions the algorithm takes is d - 1 = log_2(n+1) - 1. This shows that the complexity of the algorithm is of the order O(log(n)), meaning the number of operations grows like log(n).
As a function, log grows slower than n: as n becomes large, log(n) is much smaller than n. So an algorithm with time complexity O(log(n)) will be faster than one with complexity O(n), which is itself faster than O(n log(n)).
A perfect BST of height h has on the order of 2^h leaves, so the height grows like the logarithm of the number of nodes. When you search, you follow one branch at each level of the tree, which gives you logarithmic time. (The logarithm function is the inverse of the exponential function.)
I'm looking for an efficient algorithm or data structure to find the largest element by second parameter among the first N elements of a multiset in which I'll make many insertions and deletions, so I can't use a segment tree. Any ideas?
Note: I have a multiset of pairs.
You can use any balanced binary search tree implementation you are familiar with. Arguably the best known are the AVL tree and the red-black tree.
A binary search tree description usually mentions a key and value pair stored in each tree node. The keys are ordered from left to right. Insert, delete, and find operations work in O(log(n)) time because the tree is balanced; balance is usually maintained by tree rotations.
In order to find the maximum value over a range of elements, you have to store and maintain additional information in each tree node, namely maxValue over the node's subtree and the size of the subtree. Define a recursive function for a node that finds the maximum value among the first N nodes of its subtree. If N is equal to size, you already have the answer in the node's maxValue. Otherwise, call the function for the left/right child, depending on which subtrees the elements fall in.
F(node, N) =
    if N == size[node]:
        maxValue[node]
    else if N <= size[leftChild[node]]:
        F(leftChild[node], N)
    else if N == size[leftChild[node]] + 1:
        MAX(maxValue[leftChild[node]], value[node])
    else:
        MAX(maxValue[leftChild[node]],
            value[node],
            F(rightChild[node], N - size[leftChild[node]] - 1))
If you are familiar with segment tree you will not encounter any problems with this implementation.
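A runnable Python sketch of F, assuming a static tree whose size and maxValue fields are filled in at construction; a real implementation would also have to update these fields during insertions, deletions, and rotations. All names are illustrative:

```python
def size(node):
    return node.size if node else 0

def max_value(node):
    return node.max_value if node else float("-inf")

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right
        self.size = 1 + size(left) + size(right)                 # subtree size
        self.max_value = max(value, max_value(left), max_value(right))

def prefix_max(node, n):
    """Maximum value among the first n nodes (in in-order) of node's subtree."""
    if n == node.size:
        return node.max_value                 # the whole subtree is covered
    left_size = size(node.left)
    if n <= left_size:
        return prefix_max(node.left, n)       # entirely inside the left subtree
    if n == left_size + 1:
        return max(max_value(node.left), node.value)
    return max(max_value(node.left), node.value,
               prefix_max(node.right, n - left_size - 1))

# In-order values: 2, 9, 5, 7
root = Node(5, Node(9, Node(2)), Node(7))
print([prefix_max(root, n) for n in (1, 2, 3, 4)])  # [2, 9, 9, 9]
```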
I may suggest you use a treap. This is a randomised binary search tree; because of this randomised nature, the tree always remains balanced, providing O(log(n)) time complexity for the basic operations. A treap has two basic operations, split and merge; all other operations are implemented via them. An advantage of a treap is that you don't have to deal with rotations.
EDIT: There is no way to maintain maxKey/minKey in each node explicitly in O(log(n)).
I have a problem: I must add a lot of different values and, in the end, get only the k-th largest of them. How can I implement that efficiently, and which algorithm should I use?
Algorithm:
Create a binary minimum heap, and add each one of the first K values into the heap.
For each one of the remaining N-K values, if it is larger than the root of the heap (the smallest of the K values kept so far):
Put it in place of the root, and sift it down in order to restore the heap.
Extract all the (K) values from the heap into a list; the first value extracted is the K-th largest.
Complexity:
Step 1: O(K)
Step 2: O((N-K)×log(K))
Step 3: O(K×log(K))
If N-K ≥ K, then the overall complexity is O((N-K)×log(K)).
If N-K < K, then the overall complexity is O(K×log(K)).
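A sketch of the steps above with Python's `heapq` (a min-heap): the heap holds the K largest values seen so far, so its root is the K-th largest once all N values have been processed. Names are illustrative:

```python
import heapq

def kth_largest(values, k):
    heap = list(values[:k])
    heapq.heapify(heap)                 # step 1: O(K)
    for v in values[k:]:                # step 2: N-K iterations
        if v > heap[0]:                 # larger than the smallest kept value
            heapq.heapreplace(heap, v)  # O(log K) per replacement
    return heap[0]                      # the K-th largest

print(kth_largest([7, 1, 9, 4, 8, 3, 6], 3))  # 7
```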
(Based on comments that you do not want to store all the numbers you have seen...)
Keep a running sorted list of the k largest values you have seen so far. As each new number arrives, check whether it is larger than the least element in the list; if it is, remove the least element and insert the new element into the list of k largest, keeping it sorted. Your initial list (when you've seen no numbers) would consist of k entries of negative infinity.
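A sketch of this sorted-list variant with the `bisect` module (the pop from the front costs O(k) per update, which is fine when k is small; names are illustrative):

```python
import bisect

def k_largest_running(stream, k):
    top = [float("-inf")] * k          # k sentinel entries of negative infinity
    for x in stream:
        if x > top[0]:                 # larger than the least element kept
            top.pop(0)                 # remove the least element
            bisect.insort(top, x)      # sorted insert of the new element
    return top                         # ascending order

print(k_largest_running([5, 2, 8, 1, 9, 3], 3))  # [5, 8, 9]
```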
First build a max-heap from all the elements, which takes O(n) time.
Then extract k-1 elements in O(k log n) time; the element left at the root is the k-th largest.
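A sketch of this approach with Python's `heapq`, negating values to simulate the max-heap (names are illustrative):

```python
import heapq

def kth_largest_by_extraction(values, k):
    heap = [-v for v in values]   # negate: heapq is a min-heap
    heapq.heapify(heap)           # O(n)
    for _ in range(k - 1):        # k-1 extractions, O(k log n)
        heapq.heappop(heap)
    return -heap[0]               # the k-th largest is now at the root

print(kth_largest_by_extraction([7, 1, 9, 4, 8, 3, 6], 3))  # 7
```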
I am a fresher preparing for interviews. In a recent interview I was asked a question for which I couldn't find a suitable answer.
I was given some 100 files, each containing a large number of comma-separated integers, and I had to find the top 10 integers across all the files. I tried to solve it using a heap, but I got confused about the time complexity of the process. Any help will be appreciated, thanks.
I think you are on the right track with using a heap data structure.
You could process the files in parallel and for each file you could maintain a min-heap of size 10.
As you iterate through a file you insert values into the min-heap until it is full (size 10); then, for the values in positions 11 through n:
if current_value > min_heap.min()
    min_heap.extract_min()
    min_heap.insert(current_value)
You have to iterate through all n values in a file, and the worst-case scenario is a file sorted in ascending order: then you extract the min and insert a new value for every value in positions 11 through n. The heap never holds more than 10 elements, however, so each heap operation costs O(log 10), a constant, giving an overall running time of O(n) per file.
At this point you have m (# of files) min-heaps, each of size at most 10. Here you can use a final min-heap to collect the ten largest numbers contained in the m min-heaps. This computation is O(m), because all the heaps at this point have max size 10, a constant.
Overall the running time is O(m×n + m) = O(m×n), i.e. linear in the total number of integers read. This holds whether or not you do the first step in parallel; parallelism reduces wall-clock time, not the total number of operations.
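A serial sketch of both phases in Python, with in-memory lists standing in for the parsed files (a real version would open each file and split its contents on commas; all names are illustrative):

```python
import heapq

def top_k(values, k):
    """Min-heap of the k largest values in one iterable."""
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            heapq.heapreplace(heap, v)   # O(log k), with k a constant
    return heap

def top_k_across_files(files, k=10):
    per_file = [top_k(f, k) for f in files]              # one small heap per file
    candidates = (v for heap in per_file for v in heap)  # at most m*k values
    return sorted(top_k(candidates, k), reverse=True)    # final heap, then sort

files = [[3, 41, 7], [99, 15, 2], [8, 64, 23]]
print(top_k_across_files(files, k=3))  # [99, 64, 41]
```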