Complexity of a sequential algorithm - min suffixes

In a sequential algorithm (not parallel):
Is the best complexity for finding the min of each suffix of an array O(n log n)? Could it be O(n)? If not, why?
INPUT:
array = {x1, x2, ..., xn}
OUTPUT:
X = {min(x1,x2,...,xn), min(x2,...,xn), min(x3,x4,...,xn), ..., min(xn-1,xn), xn}

Using the fact
min(x1,x2,...,xn) = min(x1,min(x2,x3,...,xn))
You can see that you can use a DP (dynamic programming) algorithm to solve for X in O(n).
Pseudocode
Min-Suffixes(input)
    n = input.length
    let output = new array of size n
    output[n] = input[n]
    for i = n-1 downto 1
        output[i] = min(output[i+1], input[i])
    return output
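For reference, here is the same right-to-left scan as runnable C++ (0-indexed; a minimal sketch, and the function name is mine):

#include <algorithm>
#include <vector>

// Returns output where output[i] = min(input[i], input[i+1], ..., input[n-1]).
std::vector<int> minSuffixes(const std::vector<int>& input) {
    int n = (int)input.size();
    if (n == 0) return {};
    std::vector<int> output(n);
    output[n - 1] = input[n - 1];          // the last suffix is just xn
    for (int i = n - 2; i >= 0; --i)       // right-to-left DP scan: O(n)
        output[i] = std::min(output[i + 1], input[i]);
    return output;
}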

Is there an in-place algorithm with time complexity O(n log n) to select the second largest number in an array/vector?

It's quite clear that there is an O(n^2) algorithm to choose the second largest number, and a tournament-tree style algorithm with O(n log n), but that one comes with an extra space cost.
But, eh..., is there an in-place algorithm with time complexity O(n log n) to select the second largest number in an array/vector?
Yes, in fact you can do this with a single pass over the range without modifying it. Here's an example algorithm:
Let m and M be the second largest and largest elements seen so far. Initialize them to the smallest possible values the input range could contain.
For each number n in the range, the new second largest number depends on the relative order of n, m and M. The 3 possible orderings are n < m < M, m < n < M, or m < M < n; the new second largest element must be m, n, or M respectively. Essentially, n must be clamped between m and M.
The new largest number can't be m, so it must be the larger of n and M.
Here's a demonstration in C++:
#include <algorithm> // std::clamp (C++17), std::max

int m = 0, M = 0; // assuming a range with non-negative values
for (int n : v)
{
    m = std::clamp(n, m, M); // new second largest: n clamped between m and M
    M = std::max(n, M);      // update the maximum afterwards; the order matters
}
If you are looking for something very simple in O(n):
#include <climits>
#include <vector>
using namespace std;

int getSecondLargest(vector<int>& vec) {
    int firstLargest = INT_MIN, secondLargest = INT_MIN;
    for (auto i : vec) {
        if (i >= firstLargest) {
            if (firstLargest != INT_MIN) { // avoid copying the sentinel down
                secondLargest = firstLargest;
            }
            firstLargest = i;
        } else if (i > secondLargest) {
            secondLargest = i;
        }
    }
    return secondLargest;
}
nth_element:
Pros:
If tomorrow you want not the second largest but, say, the fifth largest, you won't need many code changes. The algorithm I presented above won't help.
Cons:
If you are just looking for the second largest, nth_element is overkill. The swaps and/or writes are more numerous compared to the algorithm above.
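For completeness, a minimal sketch of the nth_element route (std::nth_element partially reorders the vector in place in O(n) average time; the function name is mine):

#include <algorithm>
#include <functional>
#include <vector>

int secondLargestViaNthElement(std::vector<int>& vec) {
    // Reorder in descending order just enough that index 1 holds the
    // second largest element; requires vec.size() >= 2.
    std::nth_element(vec.begin(), vec.begin() + 1, vec.end(), std::greater<int>());
    return vec[1];
}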
Why are you guys giving me O(n) when I am asking for O(nlogn)?
You can find various in-place O(nlogn) sorting algorithms. One of them is Block Sort.
No. I want to solve it tree style, and I want O(n log n), and I want it in place. Do you have something like that?
No. That is not possible. When you say in-place, you can't use extra space that depends on n; constant extra space is fine. But the tree style would require O(log n) extra space.

Divide and conquer algorithm

I had a job interview a few weeks ago and I was asked to design a divide and conquer algorithm. I could not solve the problem, but they just called me for a second interview! Here is the question:
we are given as input two n-element arrays A[0..n−1] and B[0..n−1] (which are not necessarily sorted) of integers, and an integer value. Give an O(n log n) divide and conquer algorithm that determines whether there exist distinct indices i, j (that is, i != j) such that A[i] + B[j] = value. Your algorithm should return True if such i, j exist, and False otherwise. You may assume that the elements in A are distinct, and the elements in B are distinct.
Can anybody solve the problem? Thanks
My approach is:
Sort either array; here we sort array A. Sort it with Merge Sort, which is a divide and conquer algorithm.
Then, for each element of B, search for (value − element of B) in array A by binary search. Again, this is a divide and conquer algorithm.
If you find the element (value − element of B) in array A, then the two elements make a pair such that element of A + element of B = value.
For the time complexity: A has N elements, so Merge Sort takes O(N log N), and we do a binary search for each of the N elements of B, which takes O(N log N) in total. So the total time complexity is O(N log N).
Since you are required to check i != j when A[i] + B[j] = value, you can use an N × 2 array (or an array of pairs), pairing each element with its original index, and sort by the element. Then, when you find a matching element, compare the original indexes and return the result accordingly.
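A compact C++ sketch of this approach (function and variable names are mine):

#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

// Sketch of the merge-sort + binary-search approach.
bool pairSum(const vector<int>& A, const vector<int>& B, int value) {
    int n = (int)A.size();
    vector<pair<int,int>> sortedA(n);              // (element, original index)
    for (int i = 0; i < n; ++i) sortedA[i] = {A[i], i};
    sort(sortedA.begin(), sortedA.end());          // O(n log n)
    for (int j = 0; j < n; ++j) {                  // n binary searches: O(n log n)
        pair<int,int> target(value - B[j], INT_MIN);
        auto it = lower_bound(sortedA.begin(), sortedA.end(), target);
        if (it != sortedA.end() && it->first == value - B[j] && it->second != j)
            return true;                           // A[i] + B[j] = value with i != j
    }
    return false;
}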
The following algorithm does not use divide and conquer, but it is one of the solutions.
You need to sort both arrays while maintaining the indexes of the elements, e.g. by sorting an array of (element, index) pairs. This takes O(n log n) time.
Then you can apply a merge-like two-pointer pass to check whether there are two elements such that A[i] + B[j] = value. This takes O(n).
The overall time complexity will be O(n log n).
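A C++ sketch of that merge-like pass (assuming (element, index) pairs as described; names are mine):

#include <algorithm>
#include <vector>
using namespace std;

// Two-pointer pass over (element, original index) pairs.
bool pairSumTwoPointer(vector<pair<int,int>> A, vector<pair<int,int>> B, int value) {
    sort(A.begin(), A.end());                      // O(n log n)
    sort(B.begin(), B.end());
    int i = 0, j = (int)B.size() - 1;
    while (i < (int)A.size() && j >= 0) {          // O(n) scan
        int sum = A[i].first + B[j].first;
        if (sum < value) ++i;                      // need a larger sum
        else if (sum > value) --j;                 // need a smaller sum
        else if (A[i].second != B[j].second) return true;
        else { ++i; --j; }                         // self-pair (same index): since the
                                                   // elements are distinct, any other
                                                   // match lies strictly inside
    }
    return false;
}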
I suggest using hashing. Even if it's not the way you are supposed to solve the problem, it's worth mentioning, since hashing has a better time complexity, O(n) vs. O(n log n), and is therefore more efficient.
Turn A into a hash set (or dictionary if we want the index i) - O(n)
Scan B and check if value - B[j] is in the hash set (dictionary) - O(n)
So you have an O(n) + O(n) = O(n) algorithm (which is better than the required O(n log n); however, the solution is NOT divide and conquer):
Sample C# implementation
int[] A = new int[] { 7, 9, 5, 3, 47, 89, 1 };
int[] B = new int[] { 5, 7, 3, 4, 21, 59, 0 };

int value = 106; // 47 + 59 = A[4] + B[5]

// Turn A into a dictionary: key = item's value; value = item's index
var dict = A
    .Select((val, index) => new { v = val, i = index })
    .ToDictionary(item => item.v, item => item.i);

int i = -1;
int j = -1;

// Scan B array
for (int k = 0; k < B.Length; ++k) {
    if (dict.TryGetValue(value - B[k], out i)) {
        // Solution found: {i, j}; add an "i != k" test here if the
        // i != j constraint from the question matters to you
        j = k;
        // if you want any solution then break;
        // scan further (comment out "break") if you want all pairs
        break;
    }
}

Console.Write(j >= 0 ? $"{i} {j}" : "No solution");
This seems hard to achieve without sorting.
If you leave the arrays unsorted, checking for the existence of A[i] + B[j] = value takes time Ω(n) for a fixed i, so checking all i takes Θ(n²), unless you find a trick to put some order in B.
Balanced Divide & Conquer on the unsorted arrays doesn't seem any better: if you divide A and B in two halves, the solution can lie in any of Al/Bl, Al/Br, Ar/Bl, Ar/Br, and this yields the recurrence T(n) = 4 T(n/2), which has a quadratic solution.
If sorting is allowed, the solution by Sanket Makani is a possibility, but you can do better in terms of time complexity for the search phase.
Indeed, assume A and B are now sorted and consider the 2D function A[i]+B[j], which is monotonic in both directions i and j. Then the domain A[i]+B[j] ≤ value is limited by a monotonic curve j = f(i), or equivalently i = g(j). But strict equality A[i]+B[j] = value must be checked exhaustively for all points of the curve, and one cannot avoid evaluating f everywhere in the worst case.
Starting from i = 0, you obtain f(i) by dichotomic (binary) search. Then you can follow the border curve incrementally. You will perform n steps in the i direction and at most n steps in the j direction, so the complexity of this phase remains bounded by O(n), which is optimal.
(The original answer included a figure showing the areas with sums below and above the target value; in that example there are two matches.)
This optimal solution has little to do with Divide & Conquer. It might be possible to design a variant based on evaluating the sum at a central point, which would allow discarding a whole quadrant, but that would be pretty artificial.
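A sketch of the border walk on sorted arrays, starting from the top corner rather than a dichotomic search, which keeps the same O(n) bound (the i != j bookkeeping discussed earlier is omitted for brevity; names are mine):

#include <vector>

// Walk the boundary of { (i, j) : A[i] + B[j] <= value } on sorted arrays.
// j only ever decreases, so the walk takes at most 2n steps: O(n).
bool borderWalk(const std::vector<int>& A, const std::vector<int>& B, int value) {
    int n = (int)A.size();
    int j = n - 1;
    for (int i = 0; i < n; ++i) {                   // n steps in the i direction
        while (j >= 0 && A[i] + B[j] > value) --j;  // at most n steps in j overall
        if (j < 0) break;
        if (A[i] + B[j] == value) return true;      // a point on the curve matches
    }
    return false;
}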

Finding the time complexity of an exponential algorithm

Problem: Find the best way to cut a rod of length n.
Each cut is of integer length.
Assume that each rod of length i has a price p(i).
Given: a rod of length n, and a list of prices p, which provides the price of each possible integer length between 0 and n.
Find the best set of cuts to get the maximum price.
You can use any number of cuts, from 0 to n−1.
There is no cost for a cut.
Below I present a naive algorithm for this problem.
CUT-ROD(p, n)
    if n == 0
        return 0
    q = -infinity
    for i = 1 to n
        q = max(q, p[i] + CUT-ROD(p, n-1))
    return q
How can I prove, step by step, that this algorithm is exponential?
I can see that it is exponential; however, I'm not able to prove it.
Let's translate the code to C++ for clarity:
#include <algorithm>
#include <vector>

std::vector<int> prices; // prices[i] = price of a rod of length i+1

int cut_rod(int n) {          // hyphens are not valid in C++ identifiers
    if (n == 0) {
        return 0;
    }
    int q = -1;
    int res = cut_rod(n - 1); // computed once and reused below
    for (int i = 0; i < n; i++) {
        q = std::max(q, prices[i] + res);
    }
    return q;
}
Note: we are caching the result of cut_rod(n-1) to avoid unnecessarily increasing the complexity of the algorithm. Here, we can see that cut_rod(n) calls cut_rod(n-1), which calls cut_rod(n-2), and so on down to cut_rod(0). For cut_rod(n), the function iterates over the array n times. Therefore the time complexity of the algorithm is O(n + (n-1) + (n-2) + ... + 1) = O(n(n+1)/2) = O(n^2).
EDIT:
If we use the exact same algorithm as the one in the question (recomputing instead of caching), its time complexity is O(n!), since cut-rod(n) calls cut-rod(n-1) n times, cut-rod(n-1) calls cut-rod(n-2) n-1 times, and so on. Therefore the time complexity is O(n·(n-1)·(n-2)·...·1) = O(n!).
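Extending the caching idea to all subproblems gives the classic bottom-up dynamic program for rod cutting, which runs in O(n^2). A minimal C++ sketch, assuming the standard recurrence q = max(q, p[i] + CUT-ROD(p, n−i)) and a price table p[1..n]:

#include <algorithm>
#include <vector>

// best[j] = maximum revenue obtainable from a rod of length j.
int cutRodDP(const std::vector<int>& p, int n) { // p[1..n] holds prices, p[0] unused
    std::vector<int> best(n + 1, 0);
    for (int j = 1; j <= n; ++j)       // solve subproblems smallest first
        for (int i = 1; i <= j; ++i)   // first piece has length i
            best[j] = std::max(best[j], p[i] + best[j - i]);
    return best[n];
}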
I am unsure if this counts as a step-by-step solution, but it can be shown easily by induction/substitution. For the standard version of the algorithm (where the recursive call is CUT-ROD(p, n−i)), the running time satisfies T(n) = 1 + (T(0) + T(1) + ... + T(n-1)). Just assume T(i) = 2^i for all i < n; then we show that it holds for n: T(n) = 1 + (2^0 + 2^1 + ... + 2^(n-1)) = 1 + (2^n − 1) = 2^n.

Finding the median of an unsorted array

To find the median of an unsorted array, we can make a min-heap in O(n log n) time for n elements, and then extract n/2 elements one by one to get the median. But this approach would take O(n log n) time.
Can we do the same by some method in O(n) time? If we can, then please tell or suggest some method.
You can use the Median of Medians algorithm to find the median of an unsorted array in linear time.
I have already upvoted the @dasblinkenlight answer, since the Median of Medians algorithm in fact solves this problem in O(n) time. I only want to add that this problem can also be solved in O(n) time by using heaps. Building a heap can be done in O(n) time with the bottom-up construction. Take a look at the following article for a detailed explanation: Heap sort
Supposing that your array has N elements, you have to build two heaps: a MaxHeap that contains the smaller half of the elements, N/2 of them (or (N/2)+1 if N is odd), and a MinHeap that contains the larger half, so that every element of the MaxHeap is ≤ every element of the MinHeap. If N is odd, then your median is the maximum element of the MaxHeap (O(1) by getting the max). If N is even, then your median is (MaxHeap.max() + MinHeap.min())/2; this takes O(1) also. Thus, the real cost of the whole operation is building the heaps, which is O(n).
BTW this MaxHeap/MinHeap approach also works when you don't know the number of array elements beforehand (e.g., if you have to solve the same problem for a stream of integers). You can see more details about how to solve that problem in the following article: Median Of integer streams
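For illustration, here is a minimal C++ sketch of the streaming variant (each insertion is O(log n), and the class and method names are mine):

#include <functional>
#include <queue>
#include <vector>

class RunningMedian {
    std::priority_queue<int> lo;                                       // max-heap: smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> hi;  // min-heap: larger half
public:
    void add(int x) {
        if (lo.empty() || x <= lo.top()) lo.push(x); else hi.push(x);
        // Rebalance so that lo holds either as many elements as hi, or one more.
        if (lo.size() > hi.size() + 1) { hi.push(lo.top()); lo.pop(); }
        else if (hi.size() > lo.size()) { lo.push(hi.top()); hi.pop(); }
    }
    double median() const { // assumes at least one element has been added
        return lo.size() > hi.size() ? lo.top() : (lo.top() + hi.top()) / 2.0;
    }
};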
Quickselect works in O(n) expected time; it reuses the partition step of Quicksort.
The quickselect algorithm can find the k-th smallest element of an array in linear (O(n)) expected running time. Here is an implementation in Python:
import random

def partition(L, v):
    smaller, equal, bigger = [], [], []
    for val in L:
        if val < v: smaller.append(val)
        elif val > v: bigger.append(val)
        else: equal.append(val)          # keep duplicates of the pivot
    return (smaller, equal, bigger)

def top_k(L, k):
    """Return the k smallest elements of L (in no particular order)."""
    v = L[random.randrange(len(L))]
    (left, middle, right) = partition(L, v)
    if len(left) == k: return left
    if len(left) > k: return top_k(left, k)
    if len(left) + len(middle) >= k: return left + middle[:k - len(left)]
    return left + middle + top_k(right, k - len(left) - len(middle))

def median(L):
    n = len(L)
    l = top_k(L, n // 2 + 1)   # integer division (Python 3)
    return max(l)
No, there is no O(n) algorithm for finding the median of an arbitrary, unsorted dataset.
At least none that I am aware of in 2022. All answers offered here are variations/combinations of heaps, Median of Medians, and Quickselect, all of which are strictly O(n log n).
See https://en.wikipedia.org/wiki/Median_of_medians and http://cs.indstate.edu/~spitla/abstract2.pdf.
The problem appears to be confusion about how algorithms are classified, which is according to their limiting (worst case) behaviour. "On average" or "typically" O(n) with "worst case" O(f(n)) means (in textbook terms) "strictly O(f(n))". Quicksort, for example, is often discussed as being O(n log n) (which is how it typically performs), although it is in fact an O(n^2) algorithm, because there is always some pathological ordering of inputs for which it can do no better than n^2 comparisons.
It can be done using the Quickselect algorithm in O(n); do refer to k-th order statistics (randomized algorithms).
As Wikipedia says, Median-of-Medians is theoretically O(n), but it is not used in practice because the overhead of finding "good" pivots makes it too slow.
http://en.wikipedia.org/wiki/Selection_algorithm
Here is Java source for a Quickselect algorithm to find the k'th smallest element in an array:
/**
 * Returns position of the k'th smallest element (0-based) of a sub-list.
 *
 * @param list list to search, whose sub-list may be shuffled before
 *             returning
 * @param lo   first element of sub-list in list
 * @param hi   just after last element of sub-list in list
 * @param k    rank of the element to select
 * @return position of the k'th smallest element of the (possibly shuffled) sub-list
 */
static int select(double[] list, int lo, int hi, int k) {
    int n = hi - lo;
    if (n < 2)
        return lo;

    double pivot = list[lo + (k * 7919) % n]; // Pick a pseudo-random pivot

    // Triage list to [<pivot][=pivot][>pivot]
    int nLess = 0, nSame = 0, nMore = 0;
    int lo3 = lo;
    int hi3 = hi;
    while (lo3 < hi3) {
        double e = list[lo3];
        int cmp = compare(e, pivot);
        if (cmp < 0) {
            nLess++;
            lo3++;
        } else if (cmp > 0) {
            swap(list, lo3, --hi3);
            if (nSame > 0)
                swap(list, hi3, hi3 + nSame);
            nMore++;
        } else {
            nSame++;
            swap(list, lo3, --hi3);
        }
    }
    assert (nSame > 0);
    assert (nLess + nSame + nMore == n);
    assert (list[lo + nLess] == pivot);
    assert (list[hi - nMore - 1] == pivot);

    if (k >= n - nMore)
        return select(list, hi - nMore, hi, k - nLess - nSame);
    else if (k < nLess)
        return select(list, lo, lo + nLess, k);
    return lo + k;
}
I have not included the source of the compare and swap methods; they are straightforward, and it's easy to change the code to work with Object[] instead of double[].
In practice, you can expect the above code to run in O(N).
Let the problem be: finding the k-th largest element in an unsorted array.
Divide the array into n/5 groups, each consisting of 5 elements.
Now let a1, a2, a3, ..., a(n/5) represent the medians of each group.
Let x = the median of the elements a1, a2, ..., a(n/5).
Now if k < n/2, then we can remove the largest, 2nd largest and 3rd largest elements of the groups whose median is greater than x. We can then call the function again on the remaining 7n/10 elements to find the k-th largest value.
Else if k > n/2, then we can remove the smallest, 2nd smallest and 3rd smallest elements of the groups whose median is smaller than x. We can then call the function again on the remaining 7n/10 elements to find the (k − 3n/10)-th largest value.
Time Complexity Analysis:
Let T(n) be the time complexity of finding the k-th largest in an array of size n. Then
T(n) = T(n/5) + T(7n/10) + O(n)
If you solve this, you will find out that T(n) is actually O(n), since the sub-problem sizes shrink geometrically:
n/5 + 7n/10 = 9n/10 < n
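For concreteness, a compact C++ sketch of deterministic selection via median of medians (it selects the k-th smallest, 0-based; copying into sub-vectors keeps the sketch simple at the cost of extra memory):

#include <algorithm>
#include <vector>
using std::vector;

// Returns the k-th smallest element (0-based) of a, in worst-case O(n).
int momSelect(vector<int> a, size_t k) {
    if (a.size() <= 5) {                       // base case: sort a tiny input
        std::sort(a.begin(), a.end());
        return a[k];
    }
    vector<int> medians;                       // median of each group of 5
    for (size_t i = 0; i < a.size(); i += 5) {
        size_t len = std::min<size_t>(5, a.size() - i);
        std::sort(a.begin() + i, a.begin() + i + len);
        medians.push_back(a[i + len / 2]);
    }
    int pivot = momSelect(medians, medians.size() / 2); // median of the medians
    vector<int> less, equal, greater;
    for (int x : a) {
        if (x < pivot) less.push_back(x);
        else if (x > pivot) greater.push_back(x);
        else equal.push_back(x);
    }
    if (k < less.size()) return momSelect(less, k);
    if (k < less.size() + equal.size()) return pivot;
    return momSelect(greater, k - less.size() - equal.size());
}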
Notice that building a heap actually takes O(n), not O(n log n); you can check this using amortized analysis or simply check on YouTube.
Extract-Min takes O(log n); therefore, extracting n/2 elements takes O((n/2) log n) = O(n log n) time.
About your question, you can simply check out Median of Medians.

What is the efficiency of this algorithm?

What is the big O value for the following algorithm? Why is it that value?
algorithm A (val array <ptr to int>)
1   n = 0
2   loop (n < array size)
    1   min = n
    2   m = n
    3   loop (m < array size)
        1   if (array[m] < array[min])
            1   min = m
        2   m = m + 1
    4   swap(array[min], array[n])
    5   n = n + 1
I answered O(n^2); am I correct? As to how I arrived at this conclusion: the inner loop executes up to n times, where n is the array size, and the outer loop executes n times, so n · n = n^2.
That is the so-called Selection sort, and indeed it has O(n^2) complexity.
Yes! You are correct!
This is the selection sort algorithm.
It's Θ(n^2), to be more precise.
Edit: Why is it that value?
You take the first element and compare it with all the other elements to find the minimum of the array, then place it in the first position. Iterations: n.
You take the second element and compare it with the rest of the array to find the minimum of that part (the second minimum of the whole array), then place it in the second position. Iterations: n−1.
Continuing this way down to the last element, Iterations: 1.
Total = n + (n−1) + ... + 1 = n(n+1)/2. That is O(n^2).
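To make the count concrete, here is the same algorithm in C++ with a comparison counter added (the counter is mine, for illustration):

#include <cstddef>
#include <utility>
#include <vector>

// Selection sort; returns the number of element comparisons performed.
long selectionSort(std::vector<int>& a) {
    long comparisons = 0;
    for (std::size_t n = 0; n < a.size(); ++n) {
        std::size_t min = n;
        for (std::size_t m = n + 1; m < a.size(); ++m) {
            ++comparisons;                 // one comparison per inner step
            if (a[m] < a[min]) min = m;
        }
        std::swap(a[min], a[n]);
    }
    return comparisons;                    // n(n-1)/2 for size n: Θ(n^2)
}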
