How to find a peak in an array? - algorithm

Given an array such as
[69,20,59,35,10]
I would like to discover all peaks in this array. By the definition of the problem, a peak is an element p_i of the array that satisfies the property p_k < p_i > p_j with k < i < j. I'm not interested just in the immediate neighbors of a certain element; I want to consider all elements before and after it. With this definition and this example, we have the following peaks:
[20,59,35]
[20,59,10]
[20,35,10]
What kind of algorithm or approach do I have to use to deal with this?

As mentioned in the comments, the total number of peaks in the worst case is on the order of n^3, so an algorithm that outputs every peak cannot run faster than O(n^3) - and the other answers provide cubic-time implementations. An example of an input with this many peaks is 00...011...100...0, where each of the three segments of identical characters has the same length.
However, assuming that you are interested in counting the number of peaks rather than outputting each of them, there is a much faster O(n log n) solution. You can implement a BST (Binary Search Tree) that supports computing ranks (i.e., each node knows how many nodes are to its left - that is, how many values are below it) in logarithmic time. Create two BSTs - one stores the elements to the left of the current candidate peak, the other the elements to its right. For each candidate middle index i (moving a[i] from the right tree to the left tree as i advances), count how many pairs of indices work with it: every value in the first BST that is lower than the i-th element could be the left index, and every value in the second BST that is lower than the i-th element could be the right index. Hence, the product of these two counts is the number of peaks with the i-th element in the middle.
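For illustration, here is a minimal sketch of that counting idea in Python. It uses two Fenwick trees over rank-compressed values in place of explicit rank-augmented BSTs - that substitution, and the helper names, are my own assumptions, not part of the answer above:

def count_peaks(a):
    # Count triples (k, i, j) with k < i < j and a[k] < a[i] > a[j] in O(n log n).
    ranks = {v: r + 1 for r, v in enumerate(sorted(set(a)))}  # compress values to 1..m
    m = len(ranks)

    def update(tree, pos, delta):          # Fenwick tree point update
        while pos <= m:
            tree[pos] += delta
            pos += pos & -pos

    def query(tree, pos):                  # how many stored values have rank <= pos
        total = 0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    left, right = [0] * (m + 1), [0] * (m + 1)
    for v in a:                            # the right tree initially holds every element
        update(right, ranks[v], +1)

    peaks = 0
    for v in a:
        r = ranks[v]
        update(right, r, -1)               # v is now the candidate middle element
        peaks += query(left, r - 1) * query(right, r - 1)
        update(left, r, +1)                # v becomes available as a left element
    return peaks

print(count_peaks([69, 20, 59, 35, 10]))   # 3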

Assuming your arrays are 0-indexed, you can use the algorithm below:
i = 1
while i < length(array) - 1 do
    c = array[i]
    j = 0
    while j < i do
        k = i + 1
        while k < length(array) do
            l = array[j]
            r = array[k]
            if c > l and c > r then
                write('found peak: ', [l, c, r])
            k = k + 1
        j = j + 1
    i = i + 1
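If it helps, here is the same procedure as runnable Python (the function name find_peaks is my own choice):

def find_peaks(array):
    # Brute force: O(n^3) in the worst case, one pass per candidate middle element.
    peaks = []
    for i in range(1, len(array) - 1):           # candidate middle element
        c = array[i]
        for j in range(i):                       # any index before i
            for k in range(i + 1, len(array)):   # any index after i
                l, r = array[j], array[k]
                if c > l and c > r:
                    peaks.append([l, c, r])
    return peaks

print(find_peaks([69, 20, 59, 35, 10]))  # [[20, 59, 35], [20, 59, 10], [20, 35, 10]]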

I am sharing this figure I drew, in case it helps you clearly understand what is going on and build a more optimal algorithm.
This is the pseudocode:
Cp is a positive counter: increase it when going uphill
Cn is a negative counter: increase it when going downhill
Reset Cp and Cn when moving horizontally, or when we have reached a valley (the opposite of a peak).
If array[i] > array[i-1] and array[i] > array[i+1], then array[i] is a peak. The opposite of this statement can be used to find when we reach a valley
After we reach the peak, keep incrementing Cn (Cn += 1) until an eventual reset of Cn.
Right before resetting Cn to zero, set peak_length = Cp+Cn. If we reached the end of the array and no reset is made, then the peak length is Cp+Cn.
Calculate the max of the different peak_length values.
And here is the Python code:
def peaks_and_valleys(A):
    Cp, Cn = 0, 0              # Cp: uphill counter, Cn: downhill counter
    longest_path = 0
    peak_dict = {}             # Track and save the peaks and their length
    valley = []                # This is just to track and save the valleys
    N = len(A)
    i = 0
    # A while loop is used instead of `for i in range(N-2)` so that the
    # index can actually be advanced inside the downhill scan below.
    while i < N - 2:
        if A[i+1] > A[i]:                          # Uphill
            Cp += 1
            if A[i+1] == A[i+2]:                   # Uphill and flat
                Cp, Cn = 0, 0
            if A[i+1] > A[i] and A[i+1] > A[i+2]:  # Peak
                peak = A[i+1]                      # Record and save the peak
                # Keep incrementing the negative counter while going downhill
                while i < N - 2 and A[i+1] > A[i+2]:
                    Cn += 1
                    i += 1
                # At the end of the peak, calculate the longest path
                longest_path = max(longest_path, Cp + Cn + 1)
                peak_dict[peak] = longest_path     # Track the peaks
        elif A[i+1] < A[i]:                        # Downhill
            Cn += 1
            if A[i+1] < A[i+2]:                    # Valley
                valley.append(A[i+1])              # Save the valleys
                Cp, Cn = 0, 0
        elif A[i+1] == A[i]:                       # Flat
            Cp, Cn = 0, 0
        i += 1
    print("{'Peak': 'Peak Length'}")
    print(peak_dict)
    print("valley", valley)
    return longest_path

Related

Find continuous subarrays that have at least 1 pair adding up to target sum - Optimization

I took this assessment that had this prompt, and I was able to pass 18/20 tests, but not the last 2 due to hitting the execution time limit. Unfortunately, the input values were not displayed for these tests.
Prompt:
// Given an array of integers **a**, find how many of its continuous subarrays of length **m** contain at least 1 pair of integers with a sum equal to **k**
Example:
const a = [1,2,3,4,5,6,7];
const m = 5, k = 5;
solution(a, m, k) will yield 2, because there are 2 subarrays in a that have at least 1 pair that add up to k
a[0]...a[4] - [1,2,3,4,5] - 2 + 3 = k ✓
a[1]...a[5] - [2,3,4,5,6] - 2 + 3 = k ✓
a[2]...a[6] - [3,4,5,6,7] - no two elements add up to k ✕
Here was my solution:
// strategy: check each subarray if it contains a two sum pair
// time complexity: O(n * m), where n is the size of a and m is the subarray length
// space complexity: O(m), where m is the subarray length
function solution(a, m, k) {
    let count = 0;
    for (let i = 0; i <= a.length - m; i++) {
        let set = new Set();
        for (let j = i; j < i + m; j++) {
            if (set.has(k - a[j])) {
                count++;
                break;
            }
            else
                set.add(a[j]);
        }
    }
    return count;
}
I thought of ways to optimize this algo, but failed to come up with any. Is there any way this can be optimized further for time complexity - perhaps for any edge cases?
Any feedback would be much appreciated!
Maintain a map of the highest position of the last m values (add/remove/query is O(1)), plus the highest position seen so far of the first value of a complementary pair.
For each array element, check whether its complementary element is in the map and update that highest position if necessary.
If at least m elements were processed and the highest position is inside the current window, increase the counter.
O(n) overall. Python:
def solution(a, m, k):
    count = 0
    last_pos = {}               # value: last (1-based) position observed
    max_complement_pos = -1
    for head, num in enumerate(a, 1):  # advance head by one
        tail = head - m
        # The deletion below only keeps space complexity at O(m).
        # If this is not a concern (likely), it is safe to omit.
        if tail > 0 and last_pos[a[tail - 1]] <= tail:  # time to pop the element leaving the window
            del last_pos[a[tail - 1]]
        max_complement_pos = max(max_complement_pos, last_pos.get(k - num, -1))
        count += head >= m and max_complement_pos > tail
        last_pos[num] = head    # add element at head
    return count
Create a counting hash: elt -> count.
When the window moves:
add/increment the new element
decrement the departing element
check if (k - new_elt) is in your hash with a count >= 1. If it is, you've found a good subarray.

Find scalar interval containing maximum elements from population A and zero elements from population B

Given two large sets A and B of scalar (floating point) values, what algorithm would you use to find the (scalar) range [x0,x1] containing zero elements from B and the maximum number of elements from A?
Is sorting complexity (O(n log n)) unavoidable?
Create a single list with all values, where each value is marked with two counts: one count that relates to set A, and another that relates to set B. Initially these counts are 1 and 0 when the value comes from set A, and 0 and 1 when the value comes from set B. So entries in this list could be tuples (value, countA, countB). This operation is O(n).
Sort these tuples. O(n log n)
Merge tuples with duplicate values into one tuple, accumulating the counts, so that each tuple tells us how many times its value occurs in set A and how many times in set B. O(n)
Traverse this list in sorted order, maintaining the largest running sum of countA over a run of adjacent tuples whose countB is always 0, together with the minimum and maximum value of that run. O(n)
The sorting is the determining factor of the time complexity: O(n log n).
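A minimal sketch of that approach in Python, assuming A and B are lists of floats; the function name best_interval and the return convention (None when no B-free run contains a value from A) are my own choices:

def best_interval(A, B):
    # Build (value, countA, countB) tuples, merging duplicates as we go: O(n)
    counts = {}
    for v in A:
        ca, cb = counts.get(v, (0, 0))
        counts[v] = (ca + 1, cb)
    for v in B:
        ca, cb = counts.get(v, (0, 0))
        counts[v] = (ca, cb + 1)
    merged = sorted((v, ca, cb) for v, (ca, cb) in counts.items())  # O(n log n)

    # Scan for the B-free run of values that covers the most elements of A: O(n)
    best_count, best_range = 0, None
    run_count, run_start = 0, None
    for v, ca, cb in merged:
        if cb > 0:                       # a value from B ends the current run
            run_count, run_start = 0, None
        else:
            if run_start is None:
                run_start = v
            run_count += ca
            if run_count > best_count:
                best_count, best_range = run_count, (run_start, v)
    return best_range

print(best_interval([1.0, 2.0, 2.5, 6.0, 7.0], [3.0, 9.0]))  # (1.0, 2.5)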
Sort both A and B in O(|A| log |A| + |B| log |B|). Then apply the following algorithm, which has complexity O(|A| + |B|):
i = j = k = 0
best_interval = (0, 1)
while i < len(B) - 1:
    lo = B[i]
    hi = B[i+1]
    j = k  # We can skip ahead from last iteration.
    while j < len(A) and A[j] <= lo:
        j += 1
    k = j  # We can skip ahead from the above loop.
    while k < len(A) and A[k] < hi:
        k += 1
    if k - j > best_interval[1] - best_interval[0]:
        best_interval = (j, k)
    i += 1
x0 = A[best_interval[0]]
x1 = A[best_interval[1]-1]
It may look quadratic at first inspection, but note that we never decrease j and k - it really is just a linear scan with three pointers.

Counting inversions in an array of 2D pairs

Problem Description:
Let there be an array of 2D pairs ((x1, y1), ..., (xn, yn)). With a fixed constant y', a pair (i, j) is called half-inverted if i < j, xi > xj, and yi ≥ y' > yj. Devise an algorithm that counts the number of half-inverted pairs. You will get full marks if your algorithm is correct and of complexity no more than O(n log n).
My idea is to treat this with a method similar to counting inversions in a normal array, but my problem is: how do we maintain the order during the Merge and Count step?
A simple modification of the familiar merge-sort inversion-counting algorithm can be used to solve this problem, so make sure you fully understand that algorithm as a prerequisite.
If we examine the merge step of this algorithm, we have 2 sorted halves and 2 pointers, each pointing to an element of one half. Let our left pointer be i and our right pointer j. Using the traditional definition of an inversion, if our i pointer points to a value larger than the value pointed to by j, then, because the halves are sorted and all the elements on the left come before those on the right in the real array, we know that every element from i to the end of the left half forms an inversion with the value at j, so we increase our count by mid - i, where mid is the end of the left half.
Switching back to your problem, we are dealing with pairs (x, y). If we keep our x values sorted then, using the approach described above, we can count the number of inversions considering only x values. Looking at your definition of half-inversions, we would surely be overcounting if we only required xi > xj. We are missing the additional constraint yi >= y' > yj, which must be filtered out of our count.
So, looking back at the traditional algorithm, when our i pointer points to a value greater than the value at j we also need to make sure that the y value at j is less than y'. If this is not true, then none of the x's from i to mid can form a half-inversion with j, so we cannot count them. Now assume j's y is smaller than y'; if we simply counted all the pairs from i to mid, we would still be overcounting the pairs that have yi < y'.
One way to fix this is to keep track of the number of y values in the left half from i to mid which are >= y' and add that value to our count. We can keep track of how many y >= y' we have seen in the merge step up to any i, and subtract that from the total number of y's which are >= y' in the left half. To keep track of that total number we can return it from our recursive function (total = left + right) and only use the number which came from the left half when merging. We also need to modify our base case, which is straightforward.
def count_half_inversions(l, y):
    return count_rec(l, 0, len(l), l.copy(), y)[0]

def count_rec(l, begin, end, copy, y):
    if end - begin <= 1:
        # we have only 1 pair
        return (0, 1 if l[begin][1] >= y else 0)
    mid = begin + ((end - begin) // 2)
    left = count_rec(copy, begin, mid, l, y)
    right = count_rec(copy, mid, end, l, y)
    between = merge_count(l, begin, mid, end, copy, left[1], y)
    # return (half-inversion count, number of pairs in this range whose y value is >= y')
    return (left[0] + right[0] + between, left[1] + right[1])

def merge_count(l, begin, mid, end, copy, left_y_count, y):
    result = 0
    i, j = begin, mid
    k = begin
    while i < mid and j < end:
        if copy[i][0] > copy[j][0]:
            if y > copy[j][1]:
                result += left_y_count
            smaller = copy[j]
            j += 1
        else:
            if copy[i][1] >= y:
                left_y_count -= 1
            smaller = copy[i]
            i += 1
        l[k] = smaller
        k += 1
    while i < mid:
        l[k] = copy[i]
        i += 1
        k += 1
    while j < end:
        l[k] = copy[j]
        j += 1
        k += 1
    return result

test_case = [(1,1), (6,4), (6,3), (1,2), (1,2), (3,3), (6,2), (0,1)]
fixed_y = 2
print(count_half_inversions(test_case, fixed_y))

What is the minimum cost of arranging the sequence a(n) (a[i] <= 20) such that equal values form a continuous segment?

You are given a sequence: a1, a2, ..., an (a[i] <= 20).
Requirement: find the minimum cost (number of steps) of swapping elements so that in the final sequence all equal values lie next to each other.
In each step you can only swap 2 adjacent values: swap(a[i], a[i+1]) costs 1 step.
Example:
1 1 3 1 3 2 3 2
Swap(a[3], a[4])
Swap(a[6], a[7])
-> 1 1 1 3 3 3 2 2
minimum = 2
I need your help.
Note that since A[i] <= 20 we can go ahead and enumerate every subset of all A[i] and fit comfortably within any time constraints.
Let M be the number of unique A[i]; then there is an O(NM + M * 2^M) dynamic programming solution with bitmasks.
(Note that when I say moving an A[i], I mean moving every element with value A[i].)
To understand how we do this let's first consider the brute force solution. We have some set of unique A[i] moved to the front of the string, and then at each step we pick the next A[i] to move behind what we had originally. This is O(M! * N).
There's one important observation to be made here: if we have some set of A[i] at the start of the string, and then we move the next one, the order of our original set of A[i] doesn't actually matter. Any move will cost the same regardless of the order.
Let cost(subset, A[i]) be the cost of moving all A[i] behind that subset of A[i] at the front of the string. Then we can write the following:
dp = [float('inf')] * (1 << M)  # every subset of A[i]
dp[0] = 0
for mask in range(len(dp)):
    for bit in range(M):
        # if this A[i] hasn't been moved to the front, we move it to the front
        if (mask >> bit) & 1 == 0:
            dp[mask ^ (1 << bit)] = min(dp[mask ^ (1 << bit)], dp[mask] + cost(mask, bit))
If we compute cost naively then we have O(M * 2^M * N). However we can precompute every value of cost with O(1) per value.
Here's how we can do this:
Idea: The number of swaps needed to sort an array is the number of inversions.
Let's define a new array inversions[M][M], where inversions[i][j] is the number of times j comes after i in the array. For clarity, here's how we would compute it naively:
for i in range(len(arr)):
    for j in range(i + 1, len(arr)):
        if arr[i] != arr[j]: inversions[arr[i]][arr[j]] += 1
Assume that we have inversions, then we can compute cost(subset, A[i]) like so:
cost = 0
for bit in range(M):
    # if bit isn't in the mask and thus needs to get swapped with A[i]
    if (subset >> bit) & 1 == 0:
        cost += inversions[bit][A[i]]
What's left is the following:
Compute inversions in O(NM). This can be done by keeping a running count of each of the M values at every index of the array (a sketch of this step is given below).
Currently cost is O(M) rather than O(1). We can run a separate dynamic programming pass over cost to build an array cost[1 << M][M], where cost[i][j] is the cost of moving value j behind subset i.
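A minimal sketch of the O(NM) inversion precomputation described above, assuming arr already holds values remapped to 0..M-1 (the function name count_inversions is my own, not part of the linked solution):

def count_inversions(arr, M):
    inversions = [[0] * M for _ in range(M)]
    seen = [0] * M                            # how many of each value we have passed so far
    for v in arr:
        for u in range(M):
            if u != v:
                inversions[u][v] += seen[u]   # every earlier u now appears before this v
        seen[v] += 1
    return inversions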
For the sake of completeness, here is complete code written in C++; it's my submission for the same problem on Codeforces. Note that in that code, cost is named contribution.

Find largest continuous sum such that the minimum of it and its complement is largest

I'm given a sequence of numbers a_1, a_2, ..., a_n. Its sum is S = a_1 + a_2 + ... + a_n, and I need to find a subsequence a_i, ..., a_j such that min(S - (a_i + ... + a_j), a_i + ... + a_j) is as large as possible (both sums must be non-empty).
Example:
For 1,2,3,4,5 the answer is the subsequence 3,4, because then min(S - (a_i + ... + a_j), a_i + ... + a_j) = min(8, 7) = 7 (and this is the largest possible, which can be checked against the other subsequences).
I tried to do this the hard way.
I load all values into the array tab[n].
I do this n-1 times: tab[i] += tab[i-1], so that tab[j] is the sum from the beginning up to j.
I check all possible sums a_i + ... + a_j = tab[j] - tab[i-1], subtract each from the total sum, take the minimum, and see if it's larger than before.
It takes O(n^2). This makes me very sad and miserable. Is there a better way?
Seems like this can be done in O(n) time.
Compute the sum S. The ideal subsequence sum is the one that gets closest to S/2.
Start with i=j=0 and increase j until sum(a_i..a_j) and sum(a_i..a_{j+1}) are as close as possible to S/2. Note whichever is closer and save the values i_best, j_best, sum_best.
Increment i and then increase j again until sum(a_i..a_j) and sum(a_i..a_{j+1}) are as close as possible to S/2. Note whichever is closer and replace i_best, j_best, sum_best if they are better. Repeat this step until done.
Note that both i and j are never decremented, so they are changed a total of at most O(n) times. Since all other operations take only constant time, this results in an O(n) runtime for the entire algorithm.
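A compact sketch of those steps in Python; the function name best_split is my own, and like the answer it assumes all a_i are positive (as in the example):

def best_split(a):
    # Slide a window [i, j] over a, keeping its sum as close to S/2 as possible,
    # and remember the window that maximizes min(window_sum, S - window_sum).
    S = sum(a)
    i = j = 0
    s = a[0]
    best = None
    while True:
        candidate = (min(s, S - s), i, j)
        if best is None or candidate[0] > best[0]:
            best = candidate
        if s > S / 2 and i < j:        # sum too large: shrink from the left
            s -= a[i]
            i += 1
        elif j + 1 < len(a):           # sum too small: grow to the right
            j += 1
            s += a[j]
        else:
            break
    value, i, j = best
    return i, j, value

print(best_split([1, 2, 3, 4, 5]))  # (2, 3, 7): the subsequence 3,4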
Let's first do some clarifications.
A subsequence of a sequence is actually a subset of the indices of the sequence. Having said that, and specifically in the case where your sequence has distinct elements, your problem reduces to the famous Partition problem, which is known to be NP-complete. In that case you can solve the problem in O(Sn), where "n" is the number of elements and "S" is the total sum. This is not polynomial time, as "S" can be arbitrarily large.
So let's consider the case of a contiguous subsequence. You need to scan the array elements twice. The first run sums them up into some "S". In the second run you carefully adjust the array length. Let's assume you know that a[i] + a[i+1] + ... + a[j] > S / 2. Then you let i = i + 1 to reduce the sum. Conversely, if it was smaller, you would increase j.
This code runs in O(n).
Python code:
from math import fabs

a = [1, 2, 3, 4, 5]
i = 0
j = 0
S = sum(a)
s = 0
while s + a[j] <= S / 2:
    s = s + a[j]
    j = j + 1
s = s + a[j]
best_case = (i, j)
best_difference = fabs(S / 2 - s)
while True:
    if fabs(S / 2 - s) < best_difference:
        best_case = (i, j)
        best_difference = fabs(S / 2 - s)
    if s > S / 2:
        s -= a[i]
        i += 1
    else:
        j += 1
        if j == len(a):
            break
        s += a[j]
print(best_case)
i = best_case[0]
j = best_case[1]
print("Best subarray = ", a[i:j + 1])
print("Best sum = ", sum(a[i:j + 1]))
