I have the following question as part of my revision for a final exam:
For each of the following problems give the worst-case running time in
Big-O notation.
(i) Adding n numbers
(ii) Finding the minimum of n numbers in an unordered array
(iii) Finding an item in a binary heap
(iv) Sorting an unordered list of items using merge sort
(v) Finding the median (The value of a numerical set that equally divides
the number of values that are larger and smaller) of an array of sorted
items
Are my current ideas correct?
(i) This would be O(n) because you are adding n numbers.
(ii) This again would be O(n). You have to check every element in this list.
(iii) Not 100% sure here, but I assume it would be worst-case O(n log n), as most things are with binary heaps.
(iv) This would be O(n log n)
(v) Again, I am not sure on this; maybe O(log n), since the array is sorted you would only need to search half the values, essentially a binary chop.
Could anybody point me in the right direction if any of my answers are incorrect?
Thanks,
Chris.
(v) Finding the median (The value of a numerical set that equally divides
the number of values that are larger and smaller) of an array of sorted
items
(v) Again, I am not sure on this; maybe O(log n), since the array is sorted you would only need to search half the values, essentially a binary chop.
This one is O(1). You are interested in the item that is in the middle of the sorted array (for N odd), or the average of the two "closest" to the middle (for N even).
Since the data is ordered, you can simply examine the one or two elements needed for the result.
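A minimal sketch of that constant-time lookup, assuming the sorted data is a plain Python list (the function name is mine):

def median_of_sorted(a):
    # The array is already sorted, so the median is one or two
    # index lookups away: O(1).
    n = len(a)
    mid = n // 2
    if n % 2 == 1:                      # odd length: single middle element
        return a[mid]
    return (a[mid - 1] + a[mid]) / 2    # even length: average the two middles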
(iii) Finding an item in a binary heap
(iii) Not 100% sure here, but I assume it would be worst-case O(n log n), as most things are with binary heaps.
This is actually O(N): heap order tells you nothing about where an arbitrary item sits, so in the worst case it requires a traversal of the entire binary tree the heap is built on.
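For illustration, a hedged sketch of that worst-case-linear search over the array a binary heap is usually stored in (the function name is my own):

def find_in_heap(heap, target):
    # Heap order says nothing about where an arbitrary value sits,
    # so the worst case examines every element: O(n).
    for i, value in enumerate(heap):
        if value == target:
            return i        # position in the heap's backing array
    return -1               # not found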
I'm wondering: can there exist a data structure meeting the following criteria and time bounds (it might be complicated)?
We obtain an unsorted list L and build a data structure out of it like this:
Build(L,X) - in O(n) time, we build the structure S from an unsorted list of n elements
Insert(y,S) - in O(lg n) we insert y into the structure S
DEL-MIN(S) - in O(lg n) we delete the minimal element from S
DEL-MAX(S) - in O(lg n) we delete the maximal element from S
DEL-MED(S) - in O(lg n) we delete the upper median (ceiling function) element from S
The problem is that the list L is unsorted. Can such a data structure exist?
DEL-MIN and DEL-MAX are easy: keep a min-heap and a max-heap of all the elements. The only trick is that you have to keep cross-indices between the heaps, so that when (for example) you remove the max from the max-heap, you can also find it and remove it in the min-heap.
For DEL-MED, you can keep a max-heap of the elements less than the median and a min-heap of the elements greater than or equal to the median. The full description is in this answer: Data structure to find median. Note that that answer returns the floor-median, but that's easily fixed. Again, you need the cross-indexing trick to refer to the other data structures, as in the first part. You will also need to think about how this handles repeated elements, if those are possible in your problem formulation. (If necessary, you can store repeated elements as (count, value) pairs in your heap, but this complicates rebalancing the heaps on insert/remove a little.)
Can this all be built in O(n)? Yes: you can find the median of n things in O(n) time (using the median-of-medians algorithm), and heaps can be built in O(n) time.
So overall, the data structure is 4 heaps (a min-heap of all the elements, a max-heap of all the elements, a max-heap of the floor(n/2) smallest elements, and a min-heap of the ceil(n/2) largest elements), all with cross-indexes to each other.
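For concreteness, here is a minimal sketch of just the median half of that structure (insert and DEL-MED), using Python's heapq. The class name is mine; the cross-indexed min/max heaps and the O(n) build are omitted, and the naive constructor shown is O(n lg n) rather than O(n):

import heapq

class MedianStructure:
    # Minimal sketch: 'lower' is a max-heap (values negated) of the
    # floor(n/2) smallest elements, 'upper' is a min-heap of the
    # ceil(n/2) largest, so upper[0] is always the upper median.
    def __init__(self, items=()):
        self.lower, self.upper = [], []
        for x in items:          # naive O(n lg n) build, for brevity
            self.insert(x)

    def insert(self, x):         # O(lg n)
        if self.upper and x < self.upper[0]:
            heapq.heappush(self.lower, -x)
        else:
            heapq.heappush(self.upper, x)
        self._rebalance()

    def _rebalance(self):        # keep len(upper) == len(lower) or +1
        if len(self.upper) > len(self.lower) + 1:
            heapq.heappush(self.lower, -heapq.heappop(self.upper))
        elif len(self.lower) > len(self.upper):
            heapq.heappush(self.upper, -heapq.heappop(self.lower))

    def del_med(self):           # O(lg n): remove the upper median
        med = heapq.heappop(self.upper)
        self._rebalance()
        return med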
How would you find the k smallest elements from an unsorted array using quicksort (other than just sorting and taking the k smallest elements)? Would the worst case running time be the same O(n^2)?
You could optimize quicksort: simply don't run the recursive calls on any portion of the array that lies entirely beyond the first k positions; only the partitions overlapping the "first" part of the array matter until a pivot lands at position k. If you don't need your output sorted, you can stop there.
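A hedged sketch of that idea (a partial quicksort; the names and the random-pivot choice are mine, not from the post above). After the call, a[:k] holds the k smallest elements in sorted order:

import random

def partition(a, lo, hi):
    # Lomuto partition with a random pivot, which makes the O(n^2)
    # worst case unlikely in practice (but still possible).
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def partial_quicksort(a, lo, hi, k):
    # Recurse only where the first k positions can be affected;
    # segments lying entirely past position k-1 are skipped.
    if lo >= hi or lo >= k:
        return
    p = partition(a, lo, hi)
    partial_quicksort(a, lo, p - 1, k)
    partial_quicksort(a, p + 1, hi, k)

Calling partial_quicksort(a, 0, len(a) - 1, k) leaves the k smallest elements in a[:k], sorted.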
Warning: non-rigorous analysis ahead.
However, I think the worst-case time complexity will still be O(n^2). That occurs when you always pick the biggest or smallest element as your pivot, and the recursion devolves into quadratic, selection-sort-like behavior (i.e. you aren't able to pick a pivot that divides and conquers).
Another solution (if the only purpose of this collection is to pick out the k minimum elements) is to use a max-heap limited to k nodes (tree height ceil(lg k)), holding the k smallest elements seen so far. Each of the n elements then costs at most O(lg k) to insert or reject, for O(n lg k) in total (versus O(n lg n) for a full heapsort), and the k results can be pulled out in sorted order in O(k lg k). A full heapsort or mergesort would likewise give the whole array back in sorted order in linearithmic worst-case time.
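A minimal sketch of that bounded-heap pass, using Python's heapq (a min-heap) with negated values to simulate the max-heap:

import heapq

def k_smallest(a, k):
    # Keep the k smallest elements seen so far in a max-heap
    # (simulated with negated values). Each of the n elements costs
    # at most O(lg k) to insert or reject: O(n lg k) overall.
    heap = []
    for x in a:
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif x < -heap[0]:                 # beats the current k-th smallest
            heapq.heapreplace(heap, -x)
    return sorted(-v for v in heap)        # O(k lg k) to emit in sorted order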
Given numbers in an unsorted way, say X: {4,2,5,1,8,2,7}.
How do you find the rank of a number?
E.g. rank of 4 is 4; rank of 5 is 5.
Complexity has to be O(lg n).
It can be done in O(lg n) with the help of red-black trees and the augmented-data-structure approach (one of the more fascinating techniques around these days). Let's make use of an order-statistic tree.
Algorithm:
RANK(T, x)
    // T: order-statistic tree, x: node (to find the rank of this node)
    r = x.left.size + 1
    y = x
    while y != T.root
        if y == y.p.right
            r = r + y.p.left.size + 1
        y = y.p
    return r
Any help is appreciated.
Is there a better approach than this?
Given numbers in an unsorted way, say X: {4,2,5,1,8,2,7}.
How do you find the rank of a number?
Rank is the position of the element when the input is sorted.
The complexity has to be O(lg n).
That's impossible. You have to look at each element at least once. Thus, you can't get better than O(n), and it's trivial in O(n):
set found to false
set smaller to 0
for each number in the array
    if the number is smaller than the needle
        increment the smaller counter
    if the number is equal to the needle
        set found to true
if found, return smaller + 1, else return an error
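The same scan as runnable Python (names are mine):

def rank(array, needle):
    # One linear pass: count elements strictly smaller than the needle.
    found = False
    smaller = 0
    for number in array:
        if number < needle:
            smaller += 1
        elif number == needle:
            found = True
    if not found:
        raise ValueError("needle not in array")
    return smaller + 1

For the example above, rank([4, 2, 5, 1, 8, 2, 7], 4) returns 4.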
It can be done in O(lg n) with the help of red-black trees and the augmented-data-structure approach (one of the more fascinating techniques around these days). Let's make use of an order-statistic tree.
The problem is you don't have an order-statistic tree, and you don't have the time to build one. Building an order-statistic tree takes more than O(lg n) time*.
But let's say you have the time to build an order-statistic tree. Since extracting the sorted list of nodes in a binary search tree takes linear time, building an order-statistic tree cannot be faster than sorting an array directly.
So, let's sort the array directly. Then, finding the rank of an element is equivalent to finding the element in a sorted array. This is a well-known task that can be solved in O(lg n) via binary search (repeatedly split the array in half until you find the element). It turns out that the order-statistic tree does not quite help. In fact, you can imagine the binary search as a lookup in an order-statistic tree (except the tree doesn't actually exist).
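As a sketch, the binary-search rank lookup on the already-sorted array, via Python's bisect module:

from bisect import bisect_left

def rank_sorted(sorted_array, x):
    # O(lg n): find the leftmost occurrence of x; its 0-based index is
    # the number of strictly smaller elements, so the rank is index + 1.
    i = bisect_left(sorted_array, x)
    if i == len(sorted_array) or sorted_array[i] != x:
        raise ValueError("x not in array")
    return i + 1

Note the O(lg n) bound covers only the lookup; sorting the array first costs O(n lg n), which is the point of the paragraph above.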
If x could change at runtime, then order-statistic trees do help. Then, element removal/addition takes Θ(lg n) (worst-case) time, while it takes Θ(n)* (average-case) time in an ordinary sorted array, because you need to shift the elements around. With x immutable, order-statistic trees don't speed up anything over plain arrays.
* Technically, O(lg n) is the set of functions that grow asymptotically no faster than lg n. When I say "more than O(lg n)", the correct interpretation is "more than every function in O(lg n)". Incidentally, this is equivalent to saying the running time is ω(lg n) (note the omega is lowercase).
Θ(lg n) is the set of functions that are asymptotically equal to lg n, up to a constant. Expressing the same using O(lg n) and English while staying technically correct would be awkward.
Can we do better than O(n lg n) running time for a comparison-based algorithm when all of the values are in the range 1 to k, where k < n?
Counting sort and radix sort are not comparison-based algorithms and are disallowed. By a decision-tree analysis, there are only k^n possible input arrangements rather than n!, and a tree of height h has at most 2^h leaves, so the information-theoretic lower bound is only h >= lg(k^n) = n lg k; it should therefore be possible to solve the problem in O(n lg k) time with a comparison-based sorting algorithm.
Please do not give a non-comparison based sorting algorithm for solving this problem, all sorting must be based on comparisons between two elements. Thanks!
It may easily be done within the bound you specified. Build a binary tree of k leaves and keep a count value in each leaf. Processing each element (adding it or bumping its count) will be O(lg k) if one uses a suitable balancing algorithm, so doing all of them will be O(n lg k). Reconstituting the list will then be O(n).
Ok, if you insist you want comparisons.
You have at most k distinct values, so keep a tree structure that will hold one node per distinct value.
Go over the list of items, each time adding the item to the tree. If the item is already in the tree, just increment the counter in that node (or, if you need the actual items back, keep a list in each node).
The tree will have no more than k nodes.
At the end, traverse the tree in order and add the items back in the right order (emitting each value as many times as its node's counter indicates).
Complexity: O(n lg k).
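A sketch of that tree-with-counters sort. Balancing is omitted to keep it short, so this version only meets the O(n lg k) bound when the tree stays balanced; a red-black or AVL scheme would guarantee it:

class Node:
    def __init__(self, key):
        self.key, self.count = key, 1
        self.left = self.right = None

def insert(root, x):
    # Walk down comparing keys; duplicates just bump a counter,
    # so the tree never has more than k nodes.
    if root is None:
        return Node(x)
    node = root
    while True:
        if x == node.key:
            node.count += 1
            return root
        elif x < node.key:
            if node.left is None:
                node.left = Node(x)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(x)
                return root
            node = node.right

def tree_counting_sort(items):
    root = None
    for x in items:              # n inserts, O(lg k) each when balanced
        root = insert(root, x)
    out = []
    def inorder(node):           # O(n) to reconstitute the sorted list
        if node:
            inorder(node.left)
            out.extend([node.key] * node.count)
            inorder(node.right)
    inorder(root)
    return out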
Yes, you could also use an array of size k (without comparisons).
Each cell i will contain a list.
Go over the original array, putting every item into the list of the right cell.
Then go over the second array and pull the items out, putting them back in the right order.
Complexity: O(n).
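That non-comparison variant as a short sketch (values assumed to be integers in 1..k; the function name is mine):

def bucket_sort(a, k):
    # An array of k cells; each cell collects the items with that value.
    # Two linear passes, no comparisons between elements: O(n + k).
    cells = [[] for _ in range(k + 1)]    # cell 0 stays unused
    for x in a:
        cells[x].append(x)
    return [x for cell in cells for x in cell]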
In comments to this answer, an idea was brought up that inverting a singly linked list could only be done in O(n log n), not O(n), time.
This is definitely wrong: an O(n) inversion is not a problem. Just traverse the list and change pointers as you go; three temporary pointers are required, which is constant extra memory.
I understand completely that O(n log n) is worse (slower) than O(n).
But out of curiosity, what could be an O(n log n) algorithm for inverting a singly linked list? An algorithm with constant extra memory is preferable.
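For reference, a sketch of the standard O(n), constant-extra-memory inversion the question describes, with the three temporary pointers prev, head and nxt:

class Node:
    def __init__(self, value, next=None):
        self.value, self.next = value, next

def reverse(head):
    # Walk the list once, flipping each node's next pointer.
    prev = None
    while head is not None:
        nxt = head.next     # remember the rest of the list
        head.next = prev    # flip this node's pointer
        prev = head         # advance both pointers
        head = nxt
    return prev             # the old tail is the new head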
I think you're confused. You're saying O(n log n), which is in fact worse than O(n). Do you perhaps mean O(log n)? If so, the answer is no: you can't invert a linked list in O(log n). O(n) is trivial (and the obvious solution); O(n log n) doesn't make a lot of sense.
Edit: OK, so you do mean O(n log n). Then the answer is yes. How? Easy: you sort the list.
Count the length of the list. Cost: O(n);
Create an array of that same size;
Copy the elements of the linked list into the array in random order, putting the original order as part of the element. For example: [A,B,C] -> [(B,2),(C,3),(A,1)]. Cost: O(n);
Sort the array with an efficient worst-case sort (e.g. merge sort) into inverted original order, e.g. [(C,3),(B,2),(A,1)]. Cost: O(n log n);
Create a linked list from the reversed array. Cost: O(n).
Total cost: O(n log n).
Despite all the intermediate steps, the sort is the most expensive operation. There is only a constant number of other steps (their count is not a factor of n), each costing O(n), so the total cost is O(n log n).
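A sketch of those steps in Python, reusing the Node class from the earlier sketch (assumes a non-empty list; the function name is mine):

import random

def reverse_by_sorting(head):
    tagged, i, node = [], 1, head
    while node is not None:              # O(n): pair each node with its position
        tagged.append((node, i))
        node, i = node.next, i + 1
    random.shuffle(tagged)               # destroy any pre-existing order
    tagged.sort(key=lambda t: -t[1])     # O(n log n): inverted original order
    for (node, _), (succ, _) in zip(tagged, tagged[1:]):
        node.next = succ                 # O(n): relink in the new order
    tagged[-1][0].next = None
    return tagged[0][0]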
Edit 2: I originally didn't put the list items in random order, but then realized you could argue that an efficient sort on an already-sorted list takes less than O(n log n), even if you are reversing it. I'm not completely convinced that's the case, but the revision above removes that potential criticism.
And yes, this is a pathological question (and answer). Of course you can do it in O(n).
Every O(n) algorithm is also O(n log n), so the answer is yes.
A stupid idea, but O(n log n) and not O(n):
Assign each item of the list a unique ID, such that the ID of each successor is greater than the ID of the item itself (O(n))
Sort the items in descending order of ID using any comparison-based sorting algorithm (O(n log n))
Build up a new list by linking the items in the sorted order (O(n))
Well...
You could use a recursion that accepts a linked list and inverts it by calling itself on the two halves of the list, inverting them directly once the input is down to one or two nodes.
This is highly inefficient, but I believe it's O(n log n)...
Something like the following in pseudocode (assuming you have a len function that returns the length of a list in O(n), and a sub_list(list, start_id, end_id) function that returns, in O(n), the terminated sublist running from start_id to end_id):
function invert(list)
    if len(list) == 1
        return list
    if len(list) == 2
        new_list = list.next
        new_list.next = list
        list.next = null
        return new_list
    middle_of_list = len(list) / 2                         <-- integer division
    left  = sub_list(list, 1, middle_of_list)
    right = sub_list(list, middle_of_list + 1, len(list))
    right_head = right                 <-- will be the tail once right is inverted
    first_half  = invert(left)
    second_half = invert(right)
    right_head.next = first_half       <-- reversed second half, then reversed first half
    return second_half
If you are pedantic, then this algorithm is O(n log n), because the pointers are of size at least lg n bits and must be assigned n times.
But in reality machines have a fixed word size, so this isn't usually taken into account.
If the question is actually asking for an Ω(n lg n) algorithm, perhaps this overly complicated one will do?
build a balanced tree structure from the linked list
each leaf contains both the original value from the linked list and the value's index in the list; use the index as the tree key
traverse the tree to report all leaves in reverse order of the original index
create a new linked list from the matching values