finding middle element of an array [duplicate] - algorithm

Possible duplicate:
How to find the kth largest element in an unsorted array of length n in O(n)?
I came across this question in an interview.
Question:
An array of integers is given as input; you should find the element that would be in the middle if the array were sorted, but without sorting it.
For example:
Input: 1,3,5,4,2
Output: 3
When you sort the given input array you get 1,2,3,4,5, whose middle element is 3.
You should find this in one pass without sorting.
Any solutions for this?

This is a selection algorithm problem, which is O(n).
Edit: but if you are sure the items are consecutive integers, you can compute the smallest value, the biggest value and the count of elements (in one pass) and return smallest + (biggest - smallest + 1) / 2.
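As a sketch, that one-pass trick might look like this in Python (the function name is mine, and it assumes an odd count of consecutive integers):

```python
def middle_of_consecutive(arr):
    """One-pass median of an array holding consecutive integers.

    Assumes the values form a consecutive run (a permutation of
    smallest..biggest) and that the count is odd."""
    smallest = biggest = arr[0]
    for x in arr[1:]:
        if x < smallest:
            smallest = x
        elif x > biggest:
            biggest = x
    # for a consecutive run, the median is smallest plus half the count
    return smallest + (biggest - smallest + 1) // 2
```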

To me, it sounds like you can use std::nth_element straight off - don't know if that is an acceptable answer.

You can use a "modified" quicksort (quickselect) to find it. The worst case is O(n^2), but it is O(n) on average. Every time you choose a pivot, check how many elements are less than the pivot and how many are greater. If equally many elements are less than and greater than the pivot, the pivot is the median. If not, recurse only into the portion that contains the element you are looking for. In the worst case you end up doing the work of a complete sort, though.
Example:
Array with 7 elements, we are looking for the 4-th smallest element.
5 3 8 6 7 1 9
Suppose quicksort chooses 3 as the pivot; then you'll get:
1 3 5 8 6 7 9
Now you want the 2nd smallest element in the subarray [5, 8, 6, 7, 9]. Keep going until the pivot is the k-th smallest element you are searching for in the current iteration.
I think this solution is pretty good for an interview question, although you should mention that there is also an O(n) deterministic solution (median of medians).
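A minimal Python sketch of this quickselect idea (the names and the list-comprehension partition are my own simplifications):

```python
import random

def quickselect(arr, k):
    """Return the k-th smallest element (1-based) of arr.

    Average O(n); worst case O(n^2). A random pivot makes the
    worst case unlikely in practice."""
    pivot = random.choice(arr)
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    if k <= len(less):
        return quickselect(less, k)
    if k <= len(less) + len(equal):
        return pivot
    return quickselect(greater, k - len(less) - len(equal))

def middle_element(arr):
    """Median of an odd-length array without fully sorting it."""
    return quickselect(arr, (len(arr) + 1) // 2)
```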


Why QuickSort bad at sorting almost sorted data [closed]

Why is QuickSort bad at sorting almost sorted data? In comparison, why is insertion sort better? Trying to understand Big O notation!
Your statement is true for certain variants of QS, depending on the choice of pivot. QS's performance depends on the pivoting operation dividing the data into approximately equally sized chunks, which are then sorted separately. If the pivot is the minimum or maximum of the data, or represents a very high or low percentile, the pivoting operation divides the data into two parts with most of the data in one of them, which still needs to be sorted. If the first element is chosen as the pivot and the data is already sorted, this worst-case scenario occurs at every step.
By just choosing a random element as the pivot, the worst case has a negligible chance of occurring. This is irrelevant to worst-case analysis, but on average (over possible pivots, worst case w.r.t. input) and in practice it results in good performance.
Quicksort's algorithm is as follows:
Select a "pivot" value from the elements in the list.
Reorder the list so that all values are in their correct position relative to the pivot (e.g. if we want to sort the list in ascending order then all values less than the pivot would go before the pivot, and all values greater than the pivot would go after the pivot).
Quicksort the sections of the list before and after the pivot.
Whether the assertion that it performs poorly with sorted/nearly-sorted lists is even true depends entirely upon how step 1 is performed. What is the pivot? Say I'm trying to sort the following list into ascending order:
1, 2, 3, 4, 5, 6
Well, let's consider step 1. Which element do I use as a pivot? If we designed our code under the assumption that the list order is random, we'd probably just use the first element, as any pivot is equally likely to be good when the order is completely random. In this case, however, the two sub-lists that need to be sorted are extremely uneven. Specifically, the first is empty, and the second contains all the remaining values:
2, 3, 4, 5, 6
When we sort it, we will use 2 as the pivot and find the exact same thing happens again. This ultimately means that each value is compared to each other value. If we had selected 3 as the pivot instead, however, we would then have our remaining values split into 1, 2 and 4, 5, 6. As a result, 1 would be compared to 2, but neither would ever need to be compared to any of the values in 4, 5, 6. Let's consider how 4, 5, 6 would then be sorted. If 4 were selected as the pivot, 4 would be compared to 5 and 6, and then 5 would need to be compared to 6 in the next iteration. Conversely, were 5 our pivot, 5 would be compared to 4 and 6, but 4 and 6 would never be compared to each other.
Note that this problem is the same for cases where the list is in perfectly reversed order as well.
Of course, a solution could be to use a different technique for choosing a pivot.
In terms of big O notation, insertion sort is O(n^2) in the average and worst case, while Quicksort has a worst case of O(n^2) but an average case of O(n log n). For random data, insertion sort is almost never preferable to Quicksort.
Addendum: Insertion sort works well for a pre-sorted list because it works by iteratively comparing elements to their adjacent element to see if they should be swapped with one another. In a pre-sorted list there is no swapping, and hence no need for more than one comparison per element, so on such input it runs in O(n).
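To illustrate that best case, here is a hedged Python sketch of insertion sort that also counts comparisons (the counting is my addition):

```python
def insertion_sort(a):
    """Insertion sort; returns (sorted copy, number of comparisons).

    On already-sorted input each element needs only one comparison,
    so the total is n-1: the O(n) best case described above."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break  # element already in place, stop scanning left
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a, comparisons
```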

Binary search in 2 sorted integer arrays

There is a big array which consists of 2 smaller integer arrays written one after the other. Both small arrays are sorted in ascending order. We have to find an element in the big array as fast as possible. My idea was to find the end of the left array by binary search in the big array and then run 2 binary searches on the small arrays. The problem is that I don't know how to find that end. If you have an idea of how to find the element without finding the borders of the smaller arrays, you're welcome to share it!
Information about arrays: both small arrays have integer elements, both are sorted by ascending, they both can have length from 0 to any positive integer number, but there can be only one copy of an element.
Here are some examples of big arrays:
1 2 3 4 5 6 7 (all the elements of the second array are bigger, than the maximum of the first array)
100 1 (both arrays have only one element)
1 3 5 2 4 6 or 2 4 6 1 3 5 (most common situations)
This problem is impossible to solve in guaranteed time complexity faster than O(n), and for some arrays the boundary cannot be found at all. Binary search runs in O(log n) for a sorted array, but the big array is not guaranteed to be sorted, and in the worst case you need one or more comparisons per element, which is O(n). The best guaranteed time complexity is O(n), using the trivial algorithm: compare every item with its neighbour until you find the "turning point" where A[i] > A[i+1]. However, if you use a breadth-first search, you may get lucky and find the turning point early.
Proof that the problem is unsolvable for some arrays: let the array M = [A B] be our big array. To find the point where the arrays meet we're looking for an index i where M[i] > M[i+1]. Now let A=[1 2 3] and B=[4 5]. There is no index in the array M for which the condition holds true, thus the problem is unsolvable for some arrays.
Informal proof for the former: let M=[A B] and A=[1..x] and B=[(x+1)..y] be two sorted arrays. Then swap the positions of element x and y in M. We have no way of finding the index of x without (in the worst case) checking every index, thus the problem is O(n).
Binary search relies on being able to eliminate half the solution space with each comparision, but in this case we cannot eliminate anything from the array and so we cannot do better than a linear search.
(From a practical standpoint, you should never do this in a program. The two arrays should be separate. If this isn't possible, append the length of either array to the bigger array.)
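Here is a Python sketch of the asker's plan under the O(n) bound above: find the turning point with a linear scan, then binary-search each sorted half (the function name is mine; it returns -1 when the element is absent):

```python
from bisect import bisect_left

def find_in_rotated_pair(big, target):
    """Search a 'big' array made of two ascending runs.

    Finds the turning point with a linear scan (O(n) worst case,
    as argued above), then binary-searches each sorted half."""
    split = len(big)
    for i in range(len(big) - 1):
        if big[i] > big[i + 1]:
            split = i + 1      # second run starts here
            break
    for lo, hi in ((0, split), (split, len(big))):
        j = bisect_left(big, target, lo, hi)
        if j < hi and big[j] == target:
            return j
    return -1
```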
Edit: changed my answer after question was updated. It's possible to do it faster than linear time for some arrays, but not all possible arrays. Here's my idea for an algorithm using breadth-first search:
Start with the interval [0..n-1] where n is the length of the big array.
Make a list of intervals and put the starting interval in it.
For each interval in the list:
    if the interval is only two elements and the first element is greater than the last:
        we found the turning point, return it
    else if the interval is two elements or less:
        remove it from the list
    else if the first element of the interval is greater than the last (the turning point is in this interval):
        clear the list
        split this interval into two equal parts and add them to the list
    else:
        split this interval into two equal parts and replace this interval in the list with the two parts
I think a breadth-first approach will increase the odds of finding an interval where A[first] > A[last] early. Note that this approach will not work if the turning point is between two intervals, but it's something to get you started. I would test this myself, but unfortunately I don't have the time now.
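A possible Python rendering of this interval idea (my own sketch; I make the two halves overlap at the midpoint so a turning point cannot fall between sibling intervals, which addresses the caveat above):

```python
from collections import deque

def find_turning_point(a):
    """Breadth-first interval search for the index where the second
    ascending run begins (the i+1 with a[i] > a[i+1]).

    Returns None if the whole array is ascending (the 'unsolvable'
    case discussed above). Halves overlap at the midpoint so no
    adjacent pair is skipped."""
    if len(a) < 2:
        return None
    queue = deque([(0, len(a) - 1)])
    while queue:
        lo, hi = queue.popleft()
        if hi - lo == 1:
            if a[lo] > a[hi]:
                return hi
            continue
        if a[lo] > a[hi]:
            # turning point is certainly inside: focus the search here
            queue.clear()
        mid = (lo + hi) // 2
        queue.append((lo, mid))
        queue.append((mid, hi))
    return None
```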

Removing elements to sort array

I'm looking for an algorithm to sort an array, but not by moving the values. Rather, I'd like to delete as few values as possible and end up with a sorted list. Basically I want to find the longest ascending sub-array.
To illustrate:
1 4 5 6 7 2 3 8
Should become (2 removes)
1 4 5 6 7 8
And not (5 removes)
1 2 3
I can see how I can do this in a naive way, i.e. by recursively checking both the 'remove' and 'dont remove' tree for each element. I was just wondering if there was a faster / more efficient way to do this. Is there a common go-to algorithm for this kind of problem?
You're looking for the longest increasing subsequence problem. There is an algorithm that solves it in O(n log n) time.
There is an O(N log N) algorithm described on the site below, which is faster than the naive recursive approach:
http://www.algorithmist.com/index.php/Longest_Increasing_Subsequence
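A Python sketch of that O(n log n) algorithm (my own illustration, not the site's code); it maintains the smallest possible tail for each subsequence length and reconstructs one longest increasing subsequence via predecessor links:

```python
from bisect import bisect_left

def longest_increasing_subsequence(a):
    """O(n log n) longest strictly increasing subsequence."""
    tails_val = []            # tails_val[r] = smallest tail value of an
    tails_idx = []            # increasing subsequence of length r+1
    parent = [None] * len(a)  # predecessor of a[i] in its best subsequence
    for i, x in enumerate(a):
        r = bisect_left(tails_val, x)
        parent[i] = tails_idx[r - 1] if r > 0 else None
        if r == len(tails_val):
            tails_val.append(x)
            tails_idx.append(i)
        else:
            tails_val[r] = x
            tails_idx[r] = i
    # walk the parent links back from the tail of the longest subsequence
    seq, i = [], tails_idx[-1] if tails_idx else None
    while i is not None:
        seq.append(a[i])
        i = parent[i]
    return seq[::-1]
```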

Interview Algorithm: find two largest elements in array of size n

This is an interview question I saw online, and I am not sure I have the right idea for it.
The problem is here:
Design an algorithm to find the two largest elements in a sequence of n numbers.
The number of comparisons needs to be n + O(log n).
I think I might use quicksort and stop when the two largest elements are found?
But I'm not 100% sure about it. If anyone has ideas, please share.
Recursively split the array, find the largest element in each half, then find the largest element that the overall winner was ever compared against. The first part requires n - 1 compares; the second requires only O(log n) more. Here is an example:
1 2 5 4 9 7 8 7 5 4 1 0 1 4 2 3
2 5 9 8 5 1 4 3
5 9 5 4
9 5
9
At each step I'm merging adjacent numbers and taking the larger of the two. It takes n - 1 compares to get down to the largest number, 9. Then, if we look at every number that 9 was compared against (5, 5, 8, 7), we see that the largest of them is 8, which must be the second largest in the array. Since there are O(log n) levels, this second phase takes O(log n) compares.
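The tournament described above might be sketched in Python like this (the data representation is my own choice: each survivor carries the list of values it has beaten):

```python
def two_largest(a):
    """Tournament method: n-1 compares to find the max, then a scan
    of the values the max beat (at most about log2 n of them) for
    the runner-up."""
    # each entry is (value, list of values it has beaten so far)
    round_ = [(x, []) for x in a]
    while len(round_) > 1:
        nxt = []
        for i in range(0, len(round_) - 1, 2):
            (u, ub), (v, vb) = round_[i], round_[i + 1]
            if u >= v:
                nxt.append((u, ub + [v]))
            else:
                nxt.append((v, vb + [u]))
        if len(round_) % 2:      # odd one out advances with a bye
            nxt.append(round_[-1])
        round_ = nxt
    winner, beaten = round_[0]
    return winner, max(beaten)
```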
For just the 2 largest elements, a simple linear scan is good enough; it's basically O(2n).
For the more general "select k elements from an array of size n" question, quicksort is good thinking, but you don't have to sort the whole array.
Try this:
Pick a pivot and split the array into N[m] and N[n-m].
If k < m, discard the N[n-m] part and repeat step 1 on N[m].
If k > m, discard the N[m] part and repeat step 1 on N[n-m]; this time you look for the first k-m elements in N[n-m].
If k = m, you've got it.
It basically works like locating position k in the array N: you need about log(N) iterations, and the part you partition shrinks by roughly half each time (about N/2^i elements at iteration i), so it's basically an N + log(N) algorithm (which meets your requirement) and has very good practical performance: faster than a full quicksort, since it avoids sorting entirely (note that the output is not ordered).
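Those steps could be sketched in Python roughly as follows (names are mine; it returns the k smallest elements, unordered, as the answer notes):

```python
import random

def k_smallest(a, k):
    """Return the k smallest elements of a (not in sorted order),
    following the partition steps above. Average O(n)."""
    if k <= 0:
        return []
    if k >= len(a):
        return list(a)
    pivot = random.choice(a)
    left = [x for x in a if x < pivot]     # the N[m] part
    mid = [x for x in a if x == pivot]
    right = [x for x in a if x > pivot]    # the N[n-m] part
    if k < len(left):
        return k_smallest(left, k)         # k < m: recurse left
    if k <= len(left) + len(mid):
        return left + mid[:k - len(left)]  # k = m: done
    # k > m: keep everything on the left, find k-m more on the right
    return left + mid + k_smallest(right, k - len(left) - len(mid))
```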

minimum number of comparisons needed

What is the minimum number of comparisons needed to find the largest element among 4 distinct elements? I know that for 5 distinct numbers it is 6, floor(5/2) * 3; this is from the CLRS book. But I know there is no single general formula for finding this, or is there?
Edit (clarification):
These 4 elements could be in any order (all permutations of the 4 elements are possible). I'm not interested in a counting technique that keeps track of the largest element as you traverse the elements, but in comparisons like > or <.
For 4 elements the minimum number of comparisons is 3.
In general, to find largest of N elements you need N-1 comparisons. This gives you 4 for 5 numbers, not 6.
Proof:
there is always a solution with N-1 comparisons: compare the first two elements, then compare the larger with the next element, then the larger of those with the next one, and so on;
there cannot be a shorter solution, because a shorter solution would leave some element uncompared, and that element could be the largest.
QED.
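The N-1 scheme from the proof, as a small Python sketch that also counts the comparisons (the counter is my addition):

```python
def largest(a):
    """Find the largest element with exactly len(a) - 1 comparisons."""
    comparisons = 0
    best = a[0]
    for x in a[1:]:
        comparisons += 1
        if x > best:
            best = x
    return best, comparisons
```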
I know it does not answer the original question, but I enjoyed reading this not-so-intuitive post on the minimum number of comparisons needed to find the smallest AND the largest number from an unsorted array (with proof).
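For reference, the pairs trick such posts describe can be sketched like this (my own hedged illustration; it uses about 3n/2 - 2 comparisons, e.g. 6 for n = 5):

```python
def min_and_max(a):
    """Find min and max together in roughly 3n/2 - 2 comparisons
    by processing the elements in pairs."""
    comparisons = 0
    if len(a) % 2:                    # odd length: seed with one element
        lo = hi = a[0]
        start = 1
    else:                             # even length: seed with one pair
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    for i in range(start, len(a) - 1, 2):
        comparisons += 1              # compare the pair with each other
        small, big = (a[i], a[i + 1]) if a[i] < a[i + 1] else (a[i + 1], a[i])
        comparisons += 1              # smaller one challenges the min
        if small < lo:
            lo = small
        comparisons += 1              # bigger one challenges the max
        if big > hi:
            hi = big
    return lo, hi, comparisons
```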
Think of it as a competition: by comparing two elements you get a loser and a winner.
So if you have n elements and need 1 final winner, you need n-1 comparisons to rule out the other ones.
for elements a,b,c,d
if a > b+c+d (and all values are positive), then only one comparison is required to know that a is the biggest.
You do have to get lucky, though.
