I'm wondering: can there exist a data structure meeting the following criteria and running times (it might be complicated)?
Say we take an unsorted list L and build a data structure out of it like this:
Build(L) - in O(n) time, build the structure S from an unsorted list of n elements
Insert(y, S) - in O(lg n) time, insert y into the structure S
DEL-MIN(S) - in O(lg n) time, delete the minimal element from S
DEL-MAX(S) - in O(lg n) time, delete the maximal element from S
DEL-MID(S) - in O(lg n) time, delete the upper median (ceiling) element from S
The problem is that the list L is unsorted. Can such a data structure exist?
DEL-MIN and DEL-MAX are easy: keep both a min-heap and a max-heap of all the elements. The only trick is that you have to keep cross-indices between the heaps, so that when (for example) you remove the max, you can also find it and remove it in the min-heap.
For DEL-MID, you can keep a max-heap of the elements less than the median and a min-heap of the elements greater than or equal to the median. The full description is in this answer: Data structure to find median. Note that that answer returns the floor-median, but that's easily fixed. Again, you need the cross-indexing trick to refer to the other data structures, as in the first part. You will also need to think about how this handles repeated elements, if those are possible in your problem formulation. (If necessary, you can store repeated elements as (count, value) pairs in your heaps, but this complicates rebalancing on insert/remove a little.)
Can this all be built in O(n)? Yes: you can find the median of n things in O(n) time (using the median-of-medians algorithm), and heaps can be built in O(n) time.
So overall, the data structure is 4 heaps (a min-heap of all the elements, a max-heap of all the elements, a max-heap of the floor(n/2) smallest elements, and a min-heap of the ceil(n/2) largest elements), all with cross-indexes to each other.
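Here is a minimal Python sketch of just the Insert/DEL-MID half of this (the two median heaps, using heapq with negated values for the max-heap). The global min/max heaps and the cross-indexing are omitted, and the initial split is done by sorting for brevity, where median-of-medians plus a partition would keep the build at worst-case O(n):

```python
import heapq

class MedianDelMid:
    """Sketch: lo is a max-heap (stored negated) of the floor(n/2)
    smallest elements, hi is a min-heap of the ceil(n/2) largest, so
    the upper median is always hi's root."""

    def __init__(self, L):
        s = sorted(L)                  # O(n log n) here for brevity only
        k = len(s) // 2
        self.lo = [-x for x in s[:k]]
        self.hi = s[k:]
        heapq.heapify(self.lo)
        heapq.heapify(self.hi)

    def insert(self, y):               # O(log n)
        if self.hi and y >= self.hi[0]:
            heapq.heappush(self.hi, y)
        else:
            heapq.heappush(self.lo, -y)
        self._rebalance()

    def del_mid(self):                 # O(log n): upper median = hi's root
        m = heapq.heappop(self.hi)
        self._rebalance()
        return m

    def _rebalance(self):
        # invariant: hi holds ceil(total/2) elements, i.e. the same
        # number as lo or exactly one more
        if len(self.hi) > len(self.lo) + 1:
            heapq.heappush(self.lo, -heapq.heappop(self.hi))
        elif len(self.hi) < len(self.lo):
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
```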
Related
For the following questions
Question 3
You are given a heap with n elements that supports Insert and Extract-Min. Which of the following tasks can you achieve in O(log n) time?
Find the median of the elements stored in the heap.
Find the fifth-smallest element stored in the heap.
Find the largest element stored in the heap.
Why is "Find the largest element stored in the heap."not correct, my understanding here is that you can use logN time to go to the bottom of the heap, and one of the element there must be the largest element.
"Find the fifth-smallest element stored in the heap." this should take constant time right, because you only need to go down 5 layers at most?
"Find the median of the elements stored in the heap. " should this take O(n) time? because we extract min for the n elements to get a sorted array, and take o(1) to find the median of it?
It depends on what the running times are of the operations insert and extract-min. In traditional heaps, both take ϴ(log n) time. However, in finger-tree-based heaps, only insert takes ϴ(log n) time, while extract-min takes O(1) time. There, you can find the fifth smallest element in O(5) = O(1) time and the median in O(n/2) = O(n) time. You can also find the largest element in O(n) time.
Why is "Find the largest element stored in the heap."not correct, my understanding here is that you can use logN time to go to the bottom of the heap, and one of the element there must be the largest element.
The lowest level of the heap contains half of the elements. More precisely, half of the elements of the heap are leaves, i.e. they have no children, and the largest element in the heap is one of those. Finding the largest element, then, requires examining n/2 items. Except that the heap only supports Insert and Extract-Min, so you end up having to call Extract-Min on every element. Finding the largest element will take O(n log n) time.
"Find the fifth-smallest element stored in the heap." this should take constant time right, because you only need to go down 5 layers at most?
This can be done in O(log n) time, actually 5 * log(n), because you have to call Extract-Min five times, but we ignore constant factors. It's not constant time, however, because the complexity of Extract-Min depends on the size of the heap.
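A quick illustration, with Python's heapq standing in for the heap and assuming it holds at least five elements: five Extract-Mins followed by five Inserts to undo them, ten O(log n) operations in total.

```python
import heapq

def fifth_smallest(heap):
    """Five Extract-Mins, then five Inserts to restore the heap:
    10 O(log n) operations, so O(log n) overall."""
    popped = [heapq.heappop(heap) for _ in range(5)]
    for x in popped:
        heapq.heappush(heap, x)
    return popped[-1]
```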
"Find the median of the elements stored in the heap." should this take O(n) time? because we extract min for the n elements to get a sorted array, and take o(1) to find the median of it?
The median is the middle element, so you only have to remove n/2 elements from the heap. But removing an item from the heap is an O(log n) operation, so the complexity is O(n/2 * log n), and since we ignore constant factors in algorithmic analysis, that's O(n log n).
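Sketched the same way, this time destructively: popping ceil(n/2) times reaches the middle element, and each pop is O(log n).

```python
import heapq

def median_by_extract_min(heap):
    """Reach the ceil(n/2)-th smallest element (the lower median when n
    is even) with repeated Extract-Mins, each O(log n): O(n log n) in
    total. Consumes half of the heap."""
    m = None
    for _ in range((len(heap) + 1) // 2):
        m = heapq.heappop(heap)
    return m
```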
Suppose there are ⌈log n⌉ sorted lists of ⌊n/log n⌋ elements each. The time complexity of producing a sorted list of all these elements is: (Hint: use a heap data structure)
A. O(n log log n)
B. Θ(n log n)
C. Ω(n log n)
D. Ω(n^(3/2))
My Understanding:
There are log n lists, each containing n/log n elements, so we can apply the build-min-heap procedure to each list;
that takes O(n/log n) per list. Now we have log n lists that satisfy the min-heap property. How can I take it further from here? I'm really confused; please help me visualize it.
[I assume we're sorting into increasing order]
Build a heap out of the smallest (i.e., first) element of each list (and for each one, along with the value, keep a record of which list it came from and at which index). Repeatedly remove the smallest element of this heap, then insert the next element from the list it came from (if that list hasn't already been consumed). This gives you the sorted list of all the elements.
This heap has ⌈log n⌉ elements, so the initial cost of building it is O(log n), and each remove and insert takes O(log log n) time. Overall, the cost of this sort is O(log n + n log log n) = O(n log log n).
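In Python this k-way merge looks as follows (heapq.merge does the same job; this version spells out the bookkeeping of which list each value came from):

```python
import heapq

def merge_sorted_lists(lists):
    """k-way merge via a heap holding one (value, list_id, index) entry
    per list. With k = ceil(log n) lists of about n/log n elements each,
    building the heap is O(log n) and each of the n pop/push rounds
    costs O(log k) = O(log log n), for O(n log log n) overall."""
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    merged = []
    while heap:
        val, i, j = heapq.heappop(heap)
        merged.append(val)
        if j + 1 < len(lists[i]):
            # feed in the next element of the list this value came from
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return merged
```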
How to find the nth smallest element in a Binary Search Tree
Constraints are:
Time complexity must be O(1)
No extra space should be used
I have already tried 2 approaches:
Doing an inorder traversal and finding the nth element - time complexity O(n)
Maintaining, at each node, the number of elements smaller than it, and finding the element with m smaller elements - time complexity O(log n)
The only way I could think of is to change the data structure that holds the BST in memory. It should be simple: instead of keeping every node as a structure of its own (value, left_child and right_child), you can store the values in an ordered array, so the nth smallest element is just the nth element of the array. The extra computation moves to insertion and deletion. Even so, something like a C++ std::set would often still be more effective (log(n) for both insertion and deletion).
It mainly depends on your use case.
If you don't use a data structure where position encodes order (such as an array), I don't think you can do better than O(log n).
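For reference, here is a sketch of the second approach from the question (size-augmented nodes), which is what yields O(log n) on a balanced tree; it assumes the size field is kept up to date by insert and delete:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.size = 1      # number of nodes in this subtree; must be
                           # maintained by insert/delete (not shown)

def nth_smallest(root, n):
    """1-indexed order statistic in a size-augmented BST: O(height),
    so O(log n) when the tree is balanced."""
    while root:
        left_size = root.left.size if root.left else 0
        if n == left_size + 1:
            return root.value
        if n <= left_size:
            root = root.left           # answer lies in the left subtree
        else:
            n -= left_size + 1         # skip left subtree and this node
            root = root.right
    raise IndexError("n is out of range")
```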
I am solving a problem but I got stuck on this part.
There are 3 types of query: add an element (an integer), remove an element, and get the sum of the n largest elements (n can be any integer). How can I do this efficiently? My current solution: add an element and remove an element with binary search, O(lg n); getSum naively, O(n).
A segment tree is commonly used to find the sum of a given range. Building that on top of a binary search tree gives the data structure you are looking for, with O(log N) add, remove, and range sum. By querying the sum over the range where the k largest elements sit (roughly positions N-k to N), you get the sum of the k largest elements in O(log N). The result is a mutable, ordered segment tree rather than the standard immutable (static), unordered one.
Basically, you add to each node variables holding the size of its subtree and the sum of the values in it, and use that information to compute any range sum with O(log N) additions and/or subtractions.
If k is fixed, you can use the same approach that allows O(1) find-min/max in heaps to get an O(1) sum-of-the-k-largest query: simply update a variable holding that sum during each O(log N) add/remove.
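As one concrete way to experiment with this idea, here is a sketch that swaps the augmented tree for a Fenwick (binary indexed) tree over coordinate-compressed values, a different structure than the answer describes but with the same O(log N) add/remove/query bounds; it assumes the universe of possible values is known up front:

```python
import bisect

class KLargestSum:
    """Two Fenwick trees over value-compressed slots: one for element
    counts, one for value sums. add/remove/sum_k_largest are O(log N)."""

    def __init__(self, universe):
        self.vals = sorted(set(universe))  # all values that may appear
        self.n = len(self.vals)
        self.cnt = [0] * (self.n + 1)      # Fenwick tree over counts
        self.tot = [0] * (self.n + 1)      # Fenwick tree over value sums
        self.size = 0

    def _update(self, i, dc, ds):
        while i <= self.n:
            self.cnt[i] += dc
            self.tot[i] += ds
            i += i & -i

    def add(self, x):
        self._update(bisect.bisect_left(self.vals, x) + 1, 1, x)
        self.size += 1

    def remove(self, x):                   # assumes x is present
        self._update(bisect.bisect_left(self.vals, x) + 1, -1, -x)
        self.size -= 1

    def sum_k_largest(self, k):
        """Total sum minus the sum of the (size - k) smallest elements,
        found with a single top-down Fenwick walk."""
        drop = self.size - min(k, self.size)  # smallest elements to skip
        c = s = i = 0
        step = 1 << self.n.bit_length()
        while step:                        # largest prefix with count <= drop
            j = i + step
            if j <= self.n and c + self.cnt[j] <= drop:
                c, s, i = c + self.cnt[j], s + self.tot[j], j
            step >>= 1
        if drop > c:                       # duplicates at the boundary value
            s += (drop - c) * self.vals[i]
        total = sum(self.tot[j] for j in self._prefix_path(self.n))
        return total - s

    def _prefix_path(self, i):             # Fenwick indices covering [1..i]
        while i > 0:
            yield i
            i -= i & -i
```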
A lot depends on the relative frequency of the queries, but if we assume a typical situation where the sum query is much more frequent than adds and removes (and adds are more frequent than removes), the solution is to store tuples of the values and their running sums.
So the first element will be (a1, a1), the second element in your list will be (a2, a1+a2), and so on. (Note that when you insert a new element at the k-th position you don't need to recompute the whole sum for its tuple: just add the new number to the preceding element's running sum. Every running sum after the insertion point then needs the new value added to it as well, which is the same O(n) cost as shifting the list.)
Insertions and removals are therefore quite expensive, but that's the trade-off for an O(1) sum query.
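A small sketch of that tuple list, with remove omitted since it is symmetric to add:

```python
import bisect

class RunningSums:
    """Sorted values plus running sums: the n-largest-sum query is O(1)
    arithmetic, while add (and remove) are O(n) because later running
    sums must be adjusted."""

    def __init__(self):
        self.vals = []   # sorted values
        self.pre = []    # pre[i] = vals[0] + ... + vals[i]

    def add(self, x):
        i = bisect.bisect_left(self.vals, x)
        self.vals.insert(i, x)                    # O(n) shift
        self.pre.insert(i, (self.pre[i - 1] if i else 0) + x)
        for j in range(i + 1, len(self.pre)):     # fix the later sums
            self.pre[j] += x

    def sum_n_largest(self, n):
        total = self.pre[-1] if self.pre else 0
        cut = len(self.vals) - n                  # smallest ones to exclude
        return total - (self.pre[cut - 1] if cut > 0 else 0)
```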
How would you find the k smallest elements of an unsorted array using quicksort (other than just sorting and taking the first k)? Would the worst-case running time be the same, O(n^2)?
You can adapt quicksort: just don't run the recursive portion on any part of the array that lies entirely beyond position k; only recurse into the partitions that can still contain one of the first k positions, until a pivot lands at position k. If you don't need your output sorted, you can stop there.
Warning: non-rigorous analysis ahead.
However, I think the worst-case time complexity will still be O(n^2). That occurs when you always pick the biggest or smallest element as your pivot, and the recursion devolves into something like bubble sort (i.e., you never pick a pivot that actually divides and conquers).
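A sketch of that partial quicksort (Hoare-style partition around a middle pivot; average O(n + k log k), but still O(n^2) in the worst case, as noted):

```python
def partial_quicksort(a, k, lo=0, hi=None):
    """Sort only enough of a (in place) that a[:k] holds the k smallest
    elements in sorted order, by skipping recursion into any partition
    that starts at or beyond position k."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = a[(lo + hi) // 2]          # middle pivot; random would also work
    i, j = lo, hi
    while i <= j:                  # Hoare-style partition
        while a[i] < p: i += 1
        while a[j] > p: j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1; j -= 1
    partial_quicksort(a, k, lo, j)      # left part always overlaps [0, k)
    if i < k:                           # right part only if it can hold
        partial_quicksort(a, k, i, hi)  # part of the first k positions
```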
Another solution (if the only purpose of this collection is to pick out the k smallest elements) is to use a heap capped at k nodes, i.e., of tree height ceil(log(k)): a max-heap of the k smallest elements seen so far, evicting the root whenever the heap grows past k. Each insert or removal is then O(log k), so the whole pass over n elements is O(n log k) (versus O(n log n) for a full heapsort), and the surviving k elements can be emitted in sorted order. Mergesort admits the same kind of partial-sort shortcut.
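A sketch of that capped-heap pass; Python's heapq is a min-heap, so values are negated to get the max-heap behaviour needed for eviction:

```python
import heapq

def k_smallest(arr, k):
    """Collect the k smallest elements with a heap capped at k nodes:
    each push/evict is O(log k), so one pass over n elements is
    O(n log k); the final sort of k elements adds O(k log k)."""
    heap = []                      # negated max-heap of the k smallest so far
    for x in arr:
        heapq.heappush(heap, -x)
        if len(heap) > k:
            heapq.heappop(heap)    # evict the largest of the k+1
    return sorted(-v for v in heap)
```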