Algorithm to solve variation of k-partition

This brain teaser came up in my algorithm design class homework:
Given a list of n distinct positive integers, partition the list into two
sublists of size n/2 each such that the difference between the sums of the sublists
is maximized.
Assume that n is even, and determine the time complexity.
At first glance, the solution seems to be:
sort the list via mergesort
select the element at position n/2
add every element greater than it to the high array
add every element less than or equal to it to the low array
This has a time complexity of O(n log n) + O(n) = O(n log n).
Are there any better algorithm choices for this problem?

Since you can compute the median in O(n) time, you can also solve this problem in O(n) time: compute the median and, using it as a threshold, split the elements into the high array and the low array.
See http://en.wikipedia.org/wiki/Median_search on computing the median in O(n) time.
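A minimal sketch of that idea in Python, using randomized quickselect as the linear-time median step. The names quickselect and split_by_median are mine, not from the answer, and the random pivot makes this expected rather than worst-case O(n):

    import random

    def quickselect(a, k):
        # Return the k-th smallest element (0-based) of a. Expected O(n)
        # with a random pivot; use median-of-medians for worst-case O(n).
        a = list(a)
        while True:
            pivot = random.choice(a)
            lows = [x for x in a if x < pivot]
            highs = [x for x in a if x > pivot]
            if k < len(lows):
                a = lows
            elif k < len(a) - len(highs):
                return pivot
            else:
                k -= len(a) - len(highs)
                a = highs

    def split_by_median(a):
        # Assumes n even and elements distinct, per the problem statement.
        # The max of the low half is the (n/2)-th smallest element.
        median = quickselect(a, len(a) // 2 - 1)
        low = [x for x in a if x <= median]    # n/2 smallest values
        high = [x for x in a if x > median]    # n/2 largest values
        return low, high

    low, high = split_by_median([7, 1, 9, 4, 8, 2])
    print(sum(high) - sum(low))  # 24 - 7 = 17, the maximum difference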

Try
http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm
What you're effectively doing is finding the median. The trick is that once you've found the median, you don't need to sort the first n/2 or the last n/2 elements at all; a single partitioning pass around it is enough.

Related

K Closest with unsorted array

I am prepping for interview LeetCode-type problems and I came across the k closest problem, given a sorted array. This problem requires finding the k elements closest in value to an input value from the array. The answer was fairly straightforward, and I had no issues determining a linear-time algorithm to solve it.
However, working on this problem got me thinking: is it possible to solve this problem given an unsorted array in linear time? My first thought was to use a heap, which would give an O(n log k) solution, but I am trying to determine whether an O(n) solution is possible. I was thinking about using something like quickselect, but the issue is that it has an expected time of O(n), not a worst-case time of O(n).
Is this even possible?
The median-of-medians algorithm makes Quickselect take O(n) time in the worst case.
It is used to select a pivot:
Divide the array into groups of 5 (O(n))
Find the median of each group (O(n))
Use Quickselect to find the median of the n/5 medians (O(n))
The resulting pivot is guaranteed to be greater than at least 30% of the elements and less than at least 30% of them, which is what guarantees linear-time Quickselect.
After selecting the pivot, of course, you have to continue on with the rest of Quickselect, which includes a recursive call like the one we made to select the pivot.
The worst case total time is T(n) = O(n) + T(0.7n) + T(n/5), which is still linear. Compared to the expected time of normal Quickselect, though, it's pretty slow, which is why we don't often use this in practice.
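For concreteness, here is a hedged Python sketch of the full scheme. The function name is mine, and this follows the textbook recipe above rather than code from any particular source:

    def mom_select(a, k):
        # Return the k-th smallest element (0-based) of a in worst-case O(n).
        if len(a) <= 5:
            return sorted(a)[k]
        # 1. Divide into groups of 5 and take each group's median: O(n).
        medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
                   for i in range(0, len(a), 5)]
        # 2. Recursively select the median of the n/5 medians as the pivot.
        pivot = mom_select(medians, len(medians) // 2)
        # 3. Partition around the pivot and recurse into one side: T(0.7n).
        lows = [x for x in a if x < pivot]
        pivots = [x for x in a if x == pivot]
        if k < len(lows):
            return mom_select(lows, k)
        if k < len(lows) + len(pivots):
            return pivot
        highs = [x for x in a if x > pivot]
        return mom_select(highs, k - len(lows) - len(pivots))

    print(mom_select([9, 1, 8, 2, 7, 3, 6, 4, 5], 4))  # 5, the median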
Your heap solution would be very welcome at an interview, I'm sure.
If you really want to get rid of the log k, which in practical applications should seldom be a problem, then yes, Quickselect would be another option. Something like this (see the sketch after this list):
Partition your array into values smaller and larger than x. <- O(n)
For the lower half, run Quickselect to find the kth largest number; the right-side partition then holds that half's k largest numbers. <- O(n)
Repeat step 2 for the higher half, but for the k smallest numbers. <- O(n)
Merge your k smallest and k largest numbers and extract the k closest numbers. <- O(k)
This gives you a total time complexity of O(n), as you said.
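A hedged Python sketch of this plan. For brevity the four steps are collapsed into a single quickselect keyed on the distance |v - x|, which selects the same k closest values in expected O(n); all names here are mine:

    import random

    def quickselect_key(vals, k, key):
        # k-th smallest key value (0-based) among vals; expected O(n).
        while True:
            pivot = key(random.choice(vals))
            lows = [v for v in vals if key(v) < pivot]
            eqs = [v for v in vals if key(v) == pivot]
            if k < len(lows):
                vals = lows
            elif k < len(lows) + len(eqs):
                return pivot
            else:
                k -= len(lows) + len(eqs)
                vals = [v for v in vals if key(v) > pivot]

    def k_closest(a, x, k):
        # Find the distance of the k-th closest element, then collect,
        # breaking ties at the threshold distance arbitrarily.
        d = quickselect_key(a, k - 1, key=lambda v: abs(v - x))
        out = [v for v in a if abs(v - x) < d]
        out += [v for v in a if abs(v - x) == d][:k - len(out)]
        return out

    print(sorted(k_closest([10, 2, 14, 4, 7, 6], 5, 3)))  # [4, 6, 7]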
However, a few points about your worry about expected time vs. worst-case time. I understand that if an interview question explicitly insists on worst-case O(n), this solution might not be accepted, but otherwise it can well be considered O(n) in practice.
The key point is that for randomized Quickselect on random or well-behaved input, the probability that the running time exceeds O(n) decreases exponentially as the input grows. Already at largish inputs, that probability is as small as picking out one specific atom in the known universe. "Well-behaved" here means somewhat random in nature and not adversarial. See this discussion of a similar (not identical) problem.

Comparison-based algorithm that pairs the largest element with the smallest one in linear time

Given an array of integers, I want to design a comparison-based algorithm
that pairs the largest element with the smallest one, the second largest with the second smallest, and so on. Obviously this is easy if I sort the array, but I want to do it in O(n) time. How can I solve this problem?
Well, I can prove that it does not exist.
Proof by contradiction: suppose there were such an algorithm.
Then we could get an array of (kth min, kth max) pairs in O(n).
From those pairs we could then recover the whole array in sorted order by taking all the mins in order followed by all the maxes in reverse order, which takes only O(n) extra steps.
So we would have a comparison-based sorting algorithm that sorts in O(n).
Yet it can be proven that a comparison-based sorting algorithm must take at least
n log n steps (many proofs online, e.g. https://www.geeksforgeeks.org/lower-bound-on-comparison-based-sorting-algorithms/).
Hence we have a contradiction, so no such algorithm exists.
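To make the reduction concrete, here is a sketch. pair_min_max stands in for the hypothetical O(n) pairing routine (simulated here with a sort, since the argument above shows no O(n) version can exist); the point is that the post-processing is plainly O(n):

    def pair_min_max(a):
        # Hypothetical pairing oracle: (kth min, kth max) pairs, k = 1..n/2.
        # Simulated with a sort; assumes len(a) is even.
        s = sorted(a)
        return [(s[i], s[-1 - i]) for i in range(len(s) // 2)]

    def sort_via_pairs(a):
        # If pairing took O(n), this O(n) step would sort a, contradicting
        # the Omega(n log n) lower bound for comparison sorting.
        pairs = pair_min_max(a)
        mins = [lo for lo, hi in pairs]           # increasing
        maxes = [hi for lo, hi in pairs][::-1]    # reversed -> increasing
        return mins + maxes

    print(sort_via_pairs([5, 3, 8, 1, 6, 2]))  # [1, 2, 3, 5, 6, 8]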

Find the optimal sorting algorithm by inversions number/Pearson's r

Is it possible to find an optimal sorting algorithm given the number of elements in a presorted sequence and its inversion count or Pearson's r?
For example, I have a presorted sequence of 262143 elements.
The maximum number of inversions is given by n(n-1)/2, where n is the number of elements in the sequence (see here, page 2, for this assumption). For this example the maximum is therefore 34359345153.
Now the number of inversions of my presorted sequence is 1299203725, which is 3.78% of the maximum. My Pearson's r is 0.9941. By my understanding this should be a presorted sequence with high "sortedness" (please correct me if I'm wrong).
I found many references to the number of inversions and Pearson's r as ways to define the "sortedness" of a sequence, but I could not find any comparison saying which sorting algorithm is preferred for a given number of elements and inversions/Pearson's r.
Thanks for your help.
It is probably very hard to beat the worst-case O(n log n) time of traditional sorting algorithms like merge sort if you assume that your sorting algorithm is comparison-based. I believe that in order to do as well or better, you would probably have to assume that the number of inversions is O(n log n), much smaller than the worst-case O(n^2). Something like bubble sort can then run in O(n) time if you have O(n) inversions and you keep swapping an element backwards as long as it forms an inversion with its left neighbor.
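A minimal sketch of that idea, written as an insertion sort, which performs exactly one backward swap per inversion and therefore runs in O(n + d) time for d inversions:

    def adaptive_sort(a):
        # In-place insertion sort: O(n + d) for d inversions, hence O(n)
        # on nearly-sorted input where d = O(n).
        for i in range(1, len(a)):
            j = i
            # Swap backwards while a[j] forms an inversion with its neighbor.
            while j > 0 and a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
        return a

    print(adaptive_sort([1, 2, 4, 3, 5, 7, 6, 8]))  # 2 inversions, 2 swaps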

Get the k smallest elements of an array using quick sort

How would you find the k smallest elements of an unsorted array using quicksort (other than just sorting and taking the k smallest elements)? Would the worst-case running time be the same, O(n^2)?
You can optimize quicksort: just don't run the recursive part on any portion of the array other than the one containing position k, until your partition lands at position k. If you don't need your output sorted, you can stop there (see the sketch below).
Warning: non-rigorous analysis ahead.
However, I think the worst-case time complexity will still be O(n^2). That occurs when you always pick the biggest or smallest element as your pivot, so the recursion devolves into something like bubble sort (i.e. you aren't able to pick a pivot that divides and conquers).
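A minimal sketch of that optimization (sometimes called partial quicksort or quickselect; the names here are mine):

    import random

    def smallest_k(a, k):
        # Return the k smallest elements of a, in no particular order.
        # Expected O(n); worst case O(n^2), as discussed above.
        a = list(a)

        def partition(lo, hi):
            # Lomuto partition with a random pivot; returns pivot's index.
            p = random.randrange(lo, hi + 1)
            a[p], a[hi] = a[hi], a[p]
            i = lo
            for j in range(lo, hi):
                if a[j] < a[hi]:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[i], a[hi] = a[hi], a[i]
            return i

        lo, hi = 0, len(a) - 1
        while lo < hi:
            p = partition(lo, hi)
            if p == k - 1:        # a[:k] now holds the k smallest
                break
            elif p < k - 1:
                lo = p + 1        # recurse into the right side only
            else:
                hi = p - 1        # recurse into the left side only
        return a[:k]

    print(sorted(smallest_k([9, 4, 7, 1, 3, 8, 2], 3)))  # [1, 2, 3]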
Another solution (if the only purpose of this collection is to pick out the k min elements) is to use a heap limited to tree height ceil(log k), i.e. exactly k nodes. To retain the k smallest elements it should be a max-heap: compare each incoming element with the root and evict the root whenever the newcomer is smaller. The n inserts then take O(n log k) total, and likewise the removals (versus O(n log n) for both in a full heapsort, which, like mergesort, gives the whole array back in sorted order in linearithmic worst-case time).
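A hedged sketch of the bounded-heap alternative using Python's heapq (a min-heap, so values are negated to simulate the max-heap of the k smallest):

    import heapq

    def k_smallest_heap(a, k):
        # k smallest elements via a size-k max-heap: worst-case O(n log k).
        heap = []  # negated values: the root is the largest of the k smallest
        for v in a:
            if len(heap) < k:
                heapq.heappush(heap, -v)
            elif v < -heap[0]:  # v displaces the current k-th smallest
                heapq.heapreplace(heap, -v)
        return sorted(-v for v in heap)  # O(k log k) to emit sorted

    print(k_smallest_heap([9, 4, 7, 1, 3, 8, 2], 3))  # [1, 2, 3]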

In-place sorting algorithm for the k smallest integers in an array on n distinct integers

Is there an in-place algorithm to gather the k smallest integers at the front of an array of n distinct integers, with 1 <= k <= n?
I believe counting sort can be modified for this, but I can't seem to figure out how.
Any help will be appreciated.
How about selection sort? It runs in place in O(n^2); just stop after you've found the k smallest elements.
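A minimal in-place sketch; note that stopping after k passes makes it O(nk) rather than the full O(n^2):

    def partial_selection_sort(a, k):
        # In place, O(1) extra space: after k passes, a[:k] holds the
        # k smallest elements in sorted order. O(nk) time.
        for i in range(min(k, len(a))):
            m = min(range(i, len(a)), key=a.__getitem__)  # index of minimum
            a[i], a[m] = a[m], a[i]
        return a

    a = [9, 4, 7, 1, 3, 8, 2]
    print(partial_selection_sort(a, 3)[:3])  # [1, 2, 3]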
Do you want to partition the array so that the k smallest elements are the first k elements (not necessarily in sorted order)? If so, what you are looking for is the generalized median-find (selection) algorithm, which runs in O(n). (Just google for "median find algorithm".)
If you can live with a randomized algorithm that finishes in linear time with high probability, then all you have to do is keep picking your pivot randomly, which greatly simplifies the implementation.
You could use randomized selection to select the kth smallest integer in expected O(n) time, then partition on that element, and then run quicksort on the k smallest elements. This uses O(1) additional memory and runs in total time O(n + k log k).
You're looking for a selection algorithm. BFPRT (median of medians) will give you guaranteed worst-case O(n) performance, but it's pretty complex.
