Software algorithm practice - Binary Search [duplicate]

I am new to algorithms. I found this question and got stuck on it for half a day. My guess for the number of key comparisons is T(n) = 2n - 2, i.e. O(n). Any advice? Appreciated.
Given a sorted array of n elements, A[0..n − 1], and a constant C, we want to determine whether there
is a pair of elements A[i] and A[j], i != j, such that A[i] + A[j] = C. (We want a Boolean function
which returns a TRUE/FALSE answer.)
(a) Outline how this problem may be solved by using the binary-search algorithm, BS(A, left, right, key).
(Do not give the code for BS. It is a given function, which you call.) Analyze the time complexity of this approach.
(b) Describe a more efficient O(n) algorithm to solve this problem. Give the pseudo-code. Explain how the algorithm works, and provide a numerical illustration.

a) You could iterate through all of the elements in the array (in O(n) time) and, for each A[i], call binary search to find the number C - A[i] in the same array (in O(log n) time). If such a number exists at an index other than i, the two numbers sum to C. The total running time of this approach is O(n log n).
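A minimal sketch of approach (a) in Python, using bisect as a stand-in for the given BS routine (the explicit handling of the C = 2*A[i] corner case is my own assumption about the intent):

from bisect import bisect_left

def has_pair_binary_search(A, C):
    # A is sorted; look for A[i] + A[j] == C with i != j.
    n = len(A)
    for i, x in enumerate(A):
        target = C - x
        j = bisect_left(A, target)        # binary search, O(log n)
        # skip past hits at index i itself (relevant when C == 2 * A[i])
        while j < n and A[j] == target:
            if j != i:
                return True
            j += 1
    return False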
b) You could use a set: as you iterate through the array, for each number A[i] you come across, check whether the set contains C - A[i], and then add A[i] to the set. Checking whether a set contains a number is O(1) (assuming it's a hash set), and iterating through the array is O(n), giving a final runtime of O(n).
Pseudocode:
S = a new empty set
for each A[i] in A:
    if S contains C - A[i]:
        return true
    add A[i] to S
return false
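The same idea as runnable Python, just to make the pseudocode above concrete:

def has_pair_hash_set(A, C):
    # One pass with a hash set: O(n) expected time, O(n) extra space.
    seen = set()
    for x in A:
        if C - x in seen:      # some earlier element pairs with x
            return True
        seen.add(x)
    return False

# e.g. has_pair_hash_set([1, 3, 4, 8], 11) -> True (3 + 8); with C = 6 -> False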
EDIT: I see you are using T(n). In the programming world, running time is almost always discussed in terms of O(n) notation, which is a crucial subject to understand if you want to become proficient in algorithms. I suggest you learn it before you progress to anything else.

Related

Find Pair with Difference less than K with O(n) complexity on average

I have an unsorted array of n positive numbers and a parameter k. I need to find out whether there is a pair of numbers in the array whose difference is less than k, and I need to do so in O(n) time on (probable) average and in O(n) space.
I believe it requires the use of a universal hash table, but I'm not sure how; any ideas?
This answer works even for unbounded integers and floats (making some assumptions about how well-behaved your hashmap is; the Java implementation should work, for instance):
Keep a hashmap<int, float> all_divided_values. For each key y, if all_divided_values[y] exists, it holds a value v from the array such that floor(v/k) = y.
For each value v in the original array A: if floor(v/k) is already a key of all_divided_values, output (v, all_divided_values[floor(v/k)]), since they differ by less than k. Otherwise, store v in all_divided_values[floor(v/k)].
Once all_divided_values is filled, go through A again. For each v, test whether all_divided_values[floor(v/k) - 1] exists, and if so, output the pair (v, all_divided_values[floor(v/k) - 1]) if and only if abs(v - all_divided_values[floor(v/k) - 1]) < k.
Inserting into a hashmap is usually (with Java's HashMap, for instance) O(1) on average, so the total time is O(n). But please note that technically this bound can fail, for instance if your language's implementation does not have a good collision strategy for its hashmap.
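A single-pass Python sketch of this bucketing idea (it checks both neighbouring buckets as it goes, which is equivalent to the two-pass description above; names are mine):

import math

def has_close_pair(A, k):
    # True if two elements of A differ by less than k; expected O(n) time.
    buckets = {}                       # floor(v / k) -> one value seen in that bucket
    for v in A:
        b = math.floor(v / k)
        if b in buckets:               # two values in the same width-k bucket differ by < k
            return True
        for nb in (b - 1, b + 1):      # a neighbouring bucket may hold a closer value
            if nb in buckets and abs(v - buckets[nb]) < k:
                return True
        buckets[b] = v
    return False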
Simple solution:
1- Sort the array.
2- Calculate the difference between consecutive elements.
a) If some difference is smaller than k, return that pair.
b) If no consecutive difference is smaller than k, then your array has no pair of numbers whose difference is smaller than k.
Sorting is O(n log n), but if you only have integers of limited size, you can use counting sort, which is O(n).
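The sorting variant in a couple of lines of Python:

def has_close_pair_sorted(A, k):
    # Sort first; any pair closer than k must then appear as adjacent elements.  O(n log n).
    B = sorted(A)
    return any(b - a < k for a, b in zip(B, B[1:]))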
You can look at it this way.
The problem can be modeled as follows:
take each element (assuming integers) and convert it to the range (A[i]-K, A[i]+K).
Now you want to check whether any two of these intervals overlap.
The interval-intersection problem without any sortedness is not solvable in O(n) (worst case). You need to sort the intervals, and then in O(n) you can check whether they intersect.
The same goes for your problem. Sort it and find it.

Given an unsorted array A, check if A[i] = i exists efficiently

Given array A, check whether there exists an i with A[i] = i.
I'm supposed to solve this faster than linear time, which to me seems impossible. The solution I came up with is to first sort the array in O(n log n) time, after which you can easily check faster than linear time. However, since the array is given unsorted, I can't see an "efficient" solution.
You can't have a correct algorithm with better than O(N) complexity for an arbitrary (unsorted) array.
Suppose you had a solution better than O(N). That would mean the algorithm omits some items of the array, since scanning all the items already takes O(N).
Construct A such that A[i] != i for all i, then run the algorithm.
Let A[k] be an item which was omitted. Assign k to A[k] and
run the algorithm again: it will report that no such item exists, when index k is expected.
You'll get O(log n) with a parallel algorithm (the question didn't rule that out): just start N processors, in log2(N) steps, and let them check the array items in parallel.

Partial selection sort vs Mergesort to find "k largest in array"

I was wondering if my line of thinking is correct.
I'm preparing for interviews (as a college student) and one of the questions I came across was to find the K largest numbers in an array.
My first thought was to just use a partial selection sort: e.g., scan the array from the first element, keep two variables for the lowest element seen so far and its index, swap it with the element at the end of the array, and continue doing so until we've swapped K elements; then return a copy of the first K elements of that array.
However, this takes O(K*n) time. If I simply sorted the array using an efficient sorting method like Mergesort, it would only take O(n*log(n)) time to sort the entire array and return the K largest numbers.
Is it good enough to discuss these two methods during an interview (comparing log(n) and K of the input and going with the smaller of the two to compute the K largest), or would it be safe to assume that I'm expected to give an O(n) solution for this problem?
There exists an O(n) algorithm for finding the k-th smallest element, and once you have that element, you can simply scan through the list and collect the appropriate elements. It's based on Quicksort, but the reasoning behind why it works is rather hairy... There's also a simpler variation that probably runs in O(n). My answer to another question contains a brief discussion of this.
Here's a general discussion of this particular interview question found from googling:
http://www.geeksforgeeks.org/k-largestor-smallest-elements-in-an-array/
As for your question about interviews in general, it probably greatly depends on the interviewer. They usually like to see how you think about things. So, as long as you can come up with some sort of initial solution, your interviewer would likely ask questions from there depending on what they were looking for exactly.
IMHO, the interviewer wouldn't be satisfied with either of the methods if they say the dataset is huge (say a billion elements). In that case, if the K to be returned is also huge (nearing a billion), your partial selection would be nearly O(n^2). I think it entirely depends on the intricacies of the question posed.
EDIT: Aasmund Eldhuset's answer shows you how to achieve the O(n) time complexity.
If you want to find the K largest (so for K = 5 you'll get five results, the five highest numbers), then the best you can get is O(n + K log n): you can build a priority queue in O(n) and then invoke pq.Dequeue() K times. If you are looking for the K-th biggest number, you can get it in O(n) with a quicksort modification; it's called the k-th order statistic. Pseudocode looks like this (it's a randomized algorithm: the average time is approximately O(n), but the worst case is O(n^2)):
QuickSortSelection(numbers, currentLength, k) {
    // k is the 0-based index of the order statistic we want
    if (currentLength == 1)
        return numbers[0];

    int pivot = random element of numbers;
    // partition as in quicksort: smaller elements go left of the pivot,
    // bigger elements go right; returns the pivot's final index
    int newPivotIndex = partitionAroundPivot(numbers, pivot);

    if (k == newPivotIndex)
        return pivot;
    else if (k < newPivotIndex)
        return QuickSortSelection(numbers[0..newPivotIndex-1], newPivotIndex, k);
    else
        return QuickSortSelection(numbers[newPivotIndex+1..end],
                                  currentLength - newPivotIndex - 1,
                                  k - newPivotIndex - 1);
}
As I said, this algorithm is O(n^2) in the worst case because the pivot is chosen at random (however, the probability of an ~n^2 running time is something like 1/2^n). You can convert it into a deterministic algorithm with the same O(n) running time even in the worst case by using, for instance, the median of medians as the pivot, but it is slower in practice (due to the constants).
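For comparison, the priority-queue route mentioned above in Python; heapq.nlargest keeps a bounded min-heap internally, giving roughly O(n log k), while building a full heap first and popping k times gives the O(n + k log n) bound:

import heapq

def k_largest(numbers, k):
    # Returns the k largest values in descending order.
    return heapq.nlargest(k, numbers)

# e.g. k_largest([5, 1, 9, 3, 7], 2) -> [9, 7]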

Finding a k element subset in a set of real numbers (Programming Pearls book)

I am solving problems from Column2 of Programming Pearls. I came across this problem:
"Given a set of n real numbers, a real number t, and an integer k, how quickly can you determine whether there exists a k-element subset of the set that sums to at most t?"
My solution is to sort the set of real numbers and then look at the sum of the first k elements. If this sum is less than or equal to t, then we know there exists at least one set that satisfies the condition.
Is the solution correct?
Is there a better or different solution?
Note: Just to make it clear, do not assume the input to be already sorted.
Because you only need the first k elements in sorted position for this problem, I suggest the following:
Select the k-th smallest element in the array using randomised select: O(N).
Take the sum of the first k elements of the (now partitioned) array and check whether it is at most t.
Time complexity: O(N + k) = O(N), as k is O(N).
Randomized Selection
Note: when k is very small compared to N, a max-heap of size k can be very efficient, as the storage does not cost much and it solves the problem in O(N log k) worst case.
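A sketch of that check in Python; this uses a simple list-based randomised select rather than an in-place one, so it runs in expected O(N) time but uses O(N) extra space:

import random

def k_smallest(A, k):
    # Some list containing the k smallest elements of A (order unspecified).
    if k <= 0:
        return []
    if k >= len(A):
        return list(A)
    pivot = random.choice(A)
    less = [x for x in A if x < pivot]
    equal = [x for x in A if x == pivot]
    if k <= len(less):
        return k_smallest(less, k)
    if k <= len(less) + len(equal):
        return less + equal[:k - len(less)]
    greater = [x for x in A if x > pivot]
    return less + equal + k_smallest(greater, k - len(less) - len(equal))

def has_k_subset_at_most(A, k, t):
    # True iff some k-element subset of A sums to at most t.
    return sum(k_smallest(A, k)) <= t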

Special Sorting

There is an external array of integers on which you can perform the following operations in O(1) time.
get(int i) - returns the value at the index 'i' in the external array.
reverse( int i, int j) - returns the reverse of the array between index positions i and j (including i and j).
example for reverse: consider an array {1,2,3,4,5}. reverse(0,2) will return {3,2,1,4,5} and reverse(1,4) will return {1,5,4,3,2}.
Write a code to sort the external array. Mention the time and space complexity for your code.
Obviously we can sort in O(n log n) using quicksort or merge sort. But given this scenario, can we do better?
To sort an array is to find the permutation, or shuffle, that restores it to a sorted state. In other words, your algorithm determines which of the n! possible permutations must be applied, and applies it. Since your algorithm explores the array by asking yes-no questions (Is cell i smaller or greater than cell j?) it follows an implicit decision tree that has depth log(n!) ~ n*log(n).
This means there will be O(n*log(n)) calls to get() to determine how to sort the array.
An interesting variant is to determine the smallest number of calls to reverse() necessary to sort the array, once you know what permutation you need. We know that this number is smaller than n-1, which can be achieved by using selection sort. Can the worst case number be smaller than n-2 ? I must say that I have no idea...
I'd try to reduce the problem to a classic swap()-based sorting algorithm.
In the following we assume, without loss of generality, that j >= i:
Note that swap(i,j) = reverse(i,j) for every j <= i+2: reversing a sub-array of 3 or fewer elements just swaps its ends.
Now, for any j > i+2, all you need is to reverse() the range, which swaps the edges, and then reverse the "middle" to restore it to its original order. So you get: swap(i,j) = reverse(i,j); reverse(i+1,j-1) (sketched below).
Using the just-built swap(), you can use any comparison-based algorithm that sorts via swaps, such as quicksort, which is O(nlogn). The complexity remains O(nlogn), since each swap() needs at most 2 reverse() ops, which is O(1).
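A tiny Python sketch of this construction; ext is a hypothetical wrapper object exposing the external array's reverse(i, j) primitive:

def swap(ext, i, j):
    # Swap positions i and j (assumes i <= j) using only reverse().
    if j <= i + 2:
        ext.reverse(i, j)          # reversing 3 or fewer elements just swaps the ends
    else:
        ext.reverse(i, j)          # brings A[j] to position i and A[i] to position j ...
        ext.reverse(i + 1, j - 1)  # ... and restores everything in between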
EDIT: Note: this solution fits the original question (before it was edited), which asked for a solution, not for something better than quicksort/mergesort.
Assuming you want to minimize the number of external operations get and reverse:
read all integers into an internal array by calling get n times
do an internal sort (O(n log n) internal ops) and calculate the permutation
sort the external array by calling reverse at most n times
This has O(n) time (counting external operations) and O(n) space complexity; a sketch follows.
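A sketch under those assumptions (ext is a hypothetical object exposing get(i) and reverse(i, j); this version uses a selection-style scan instead of computing the full permutation, so the internal work is O(n^2) here, but the external calls stay at n gets plus at most n - 1 reverses):

def sort_external(ext, n):
    local = [ext.get(i) for i in range(n)]          # n external reads
    for i in range(n - 1):
        # index of the smallest remaining element (internal work only)
        p = min(range(i, n), key=lambda idx: local[idx])
        if p != i:
            local[i:p + 1] = local[i:p + 1][::-1]   # mirror the change locally
            ext.reverse(i, p)                       # one external reversal per position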
Edit in response to anonymous downvotes:
when talking about time complexity, you always have to state which operations are being counted. Here I assumed that only the external operations have a cost.
Based on get(int i) and reverse(int i, int j) alone, we can't optimise the code further; it will have the same complexity.
