Find triplets in better than linear time such that A[n-1] >= A[n] <= A[n+1] - algorithm

A sequence of numbers was given in an interview such that A[0] >= A[1] and A[N-1] >= A[N-2]. I was asked to find at least one triplet such that A[n-1] >= A[n] <= A[n+1].
I tried solving it by iterating over the array, but the interviewer expected a better-than-linear-time solution. How should I approach this question?
Example: 9 8 5 4 3 2 6 7
Answer: 3 2 6

We can solve this in O(log n) time using divide and conquer, a.k.a. binary search, which is better than linear time. We need to find a triplet such that A[n-1] >= A[n] <= A[n+1].
First find the mid of the given array. If the mid element is no larger than both its left and right neighbours, return it; that's your answer. Incidentally, this is a base case of the recursion. Also return if there are fewer than 3 elements left: another base case.
Now come the recursion scenarios. If mid is greater than the element on its left, the array is still rising at mid, so consider start-to-mid as a subproblem and recurse on that subarray. In tangible terms: at this point we would have ...26... with mid at the 6, so we move left to see whether the elements to the left of the 2 complete the triplet.
Otherwise, if mid is greater than the element on its right, the array is falling at mid, so consider mid-to-end as a subproblem and recurse.
More theory: the above should be sufficient to understand the problem, but read on. The problem essentially boils down to finding a local minimum in a given set of elements. A number in the array is called a local minimum if it is no larger than both the numbers to its left and right, which is precisely the condition A[n-1] >= A[n] <= A[n+1].
An array whose first 2 elements are decreasing and whose last 2 elements are increasing HAS to have a local minimum. Why is that? Let's prove it by contradiction. If the first two numbers are decreasing and there is no local minimum, then the 3rd number must be less than the 2nd number; otherwise the 2nd number would be a local minimum. Following the same logic, the 4th number must be less than the 3rd, and so on and so forth, so all the numbers in the array would have to be in decreasing order. That violates the constraint that the last two numbers are increasing. This contradiction proves that there must be a local minimum.
The above theory suggests an O(n) linear approach, but we can definitely do better. Still, it gives a useful perspective on the problem.
Code: here's Python code (fyi: originally typed freehand into the Stack Overflow text editor, so treat it with care).
def local_minima(arr, start, end):
    if end - start < 2:        # fewer than 3 elements left: base case
        return -1
    mid = (start + end) // 2
    if arr[mid - 1] >= arr[mid] <= arr[mid + 1]:  # found it!
        return mid
    if arr[mid] > arr[mid - 1]:
        # rising at mid: a local minimum lies in the left half
        return local_minima(arr, start, mid)
    else:
        # falling at mid: a local minimum lies in the right half
        return local_minima(arr, mid, end)
Note that I just return the index of the n. To print out the triplet, just take the elements at the returned index -1 and +1.
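For example, on the sequence from the question:

arr = [9, 8, 5, 4, 3, 2, 6, 7]
n = local_minima(arr, 0, len(arr) - 1)
print(arr[n - 1], arr[n], arr[n + 1])   # prints: 3 2 6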

It sounds like what you're asking is this:
You have a sequence of numbers. It starts decreasing and continues to decrease until element n, then it starts increasing until the end of the sequence. Find n.
This is a (non-optimal) solution in linear time:
for (i = 1; i < length(A) - 1; i++)
{
    if ((A[i-1] >= A[i]) && (A[i] <= A[i+1]))
        return i;
}
To do better than linear time, you need to use the information that you get from the fact that the series decreases then increases.
Consider the difference between A[i] and A[i+1]. If A[i] > A[i+1], then n > i, since the values are still decreasing. If A[i] <= A[i+1], then n <= i, since the values are now increasing. In this case you need to check the difference between A[i-1] and A[i].
This is a solution in log time:
int boundUpper = length(A) - 1;
int boundLower = 1;
int i = (boundUpper + boundLower) / 2;  // initial estimate

while (true)
{
    if (A[i] > A[i+1])
        boundLower = i + 1;
    else if (A[i-1] >= A[i])
        return i;
    else
        boundUpper = i;

    i = (boundLower + boundUpper) / 2;
}
I'll leave it to you to add in the necessary error check in the case that A does not have an element satisfying the criteria.

A linear solution is to just iterate through the sequence, comparing neighbours as you go.
You could also check the slope of the first two elements, then do a kind of binary chop, comparing pairs until you find one with the opposite slope.
edit: just realised what your ordering meant. Since the sequence starts decreasing and ends increasing (assuming your N-1, N-2 are the last two elements of the list), there is guaranteed to be a point where the slope changes, so the binary chop is guaranteed to beat linear time.
This means you just need to find it, or one of them, and the binary chop will do that in order log(n).

Related

Q: Count array pairs with bitwise AND > k ~ better than O(N^2) possible?

Given an array nums
Count the number of pairs (two elements) whose bitwise AND is greater than k
Brute force
res = 0
for i in range(0, n):
    for j in range(i + 1, n):
        if (a[i] & a[j]) > k:   # parentheses matter: & binds looser than >
            res += 1
Better version:
preprocess to remove all elements ≤k
and then brute force
But I was wondering: what would be the limit on complexity here?
Can we do better with a trie or a hashmap approach, like two-sum?
(I did not find this problem on LeetCode, so I thought of asking here.)
Let size_of_input_array = N, and let the input array consist of B-bit numbers.
Here is an easy to understand and implement solution.
Eliminate all values <= k.
(The original answer included an image here showing 5 ten-bit numbers written out in binary.)
Step 1: Adjacency Graph
For each bit position, store the list of numbers that have that bit set. In our example, the 7th bit is set for the numbers at indices 0, 1, 2 and 3 of the input array.
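To make step 1 concrete, here is a small sketch of how the adjacency graph might be built (Python for brevity, with a hypothetical helper name; the answer's own code below is C++):

from collections import defaultdict

def build_adjacency_graph(input_array, B):
    # For each bit position 0..B-1, record the indices of the
    # numbers in input_array that have that bit set.
    adjacency_graph = defaultdict(list)
    for idx, value in enumerate(input_array):
        for bit in range(B):
            if (value >> bit) & 1:
                adjacency_graph[bit].append(idx)
    return adjacency_graph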
Step 2: The challenge is to avoid counting the same pairs again.
To solve this challenge we take the help of the union-find data structure, as shown in the code below.
// unordered_map<int, vector<int>> adjacency_graph;
// adjacency_graph has been filled up in step 1
vector<int> parent;
for (int i = 0; i < input_array.size(); i++)
    parent.push_back(i);

int result = 0;
for (int i = 0; i < adjacency_graph.size(); i++) { // loop 1
    auto v = adjacency_graph[i];
    if (v.size() > 1) {
        int different_parents = 1;
        for (int j = 1; j < v.size(); j++) { // loop 2
            int x = find(parent, v[j]);
            int y = find(parent, v[j - 1]);
            if (x != y) {
                different_parents++;
                union_sets(parent, x, y); // "union" is a C++ keyword, hence union_sets
            }
        }
        result += (different_parents * (different_parents - 1)) / 2;
    }
}
return result;
In the above code, find and union_sets are the usual union-find operations (union itself is a reserved word in C++).
Time complexity:
Step 1 (build the adjacency graph): O(BN)
Step 2, loop 1: O(B)
Step 2, loop 2: O(N * α(N)), where α is the inverse Ackermann function, an extremely slow-growing function
Overall time complexity: O(BN)
Space complexity: O(BN)
First, prune everything <= k. Also sort the value list.
Going from the most significant bit to the least significant, we keep track of the set of numbers we are working with (initially all of them: start = 0, end = n).
Let p be the first position in the current set that contains a 1 at the current bit position.
If the bit in k is 0, then everything that would yield a 1 would definitely be good, and we need to investigate the ones that get a 0. We have (end - p) * (end - p - 1) / 2 pairs in the current range, plus (end - p) * <total 1s in this position at index end or above> combinations with larger, previously-good numbers, all of which we can add to the solution. To continue, we update end = p. We count the 1s in all the numbers above end because earlier we only counted them in pairs with each other, not with the numbers this low in the set.
If the bit in k is 1, then we can't count any wins yet, but we need to eliminate everything below p, so we update start = p.
You can stop once you have gone through all the bits or once start == end.
Details:
Since at each step we eliminate either everything that has a 0 or everything that has a 1 at the current position, everything between start and end shares the same bit-prefix. Since the values are sorted, we can do a binary search to find p.
For <total 1s in this position at index end or above>: we already have the values sorted, so we can compute partial sums, storing for every position in the sorted list the number of 1s in every bit position among all the numbers above it.
Complexity:
We go bit by bit, so L steps (where L is the bit length of the numbers); at each step we do a binary search (log N) plus O(1) lookups and updates, so this part is O(L log N).
We have to sort: O(N log N).
We have to compute the partial bit-wise sums: O(L*N).
Total: O(L log N + N log N + L*N).
Since N >> L, the L log N term is subsumed by N log N. Since L >> log N (probably; you might have 32-bit numbers, but you don't have 4 billion of them), N log N is in turn subsumed by L*N. So the complexity is O(L * N). Since we also need to keep the partial sums around, the memory complexity is O(L * N) as well.

Write a number as a sum of consecutive primes

How can I check whether n can be written as the sum of a sequence of consecutive prime numbers?
For example, 12 is equal to 5+7, where 5 and 7 are consecutive primes, but 20 is equal to 3+17, and 3 and 17 are not consecutive.
Note that repetition is not allowed.
My idea is to find and list all primes below n, then use 2 loops to sum the primes: the first 2 numbers, the second 2 numbers, the third 2 numbers, etc., and then the first 3 numbers, the second 3 numbers, and so forth. But it takes a lot of time and memory.
Realize that a consecutive list of primes is defined by only two pieces of information: the starting and the ending prime number. You just have to find these two numbers.
I assume that you have all the primes at your disposal, sorted in an array called primes. Keep three variables in memory: sum, which is initially 2 (the smallest prime), and first_index and last_index, which are initially 0 (the index of the smallest prime in primes).
Now you have to "tweak" these two indices and "travel" along the array in a loop:
If sum == n then finish. You have found your sequence of primes.
If sum < n then enlarge the list by adding the next available prime: increment last_index by one, then increment sum by the value of the new prime, primes[last_index], and repeat the loop. But if primes[last_index] is larger than n, then there is no solution and you must finish.
If sum > n then shrink the list by removing the smallest prime from it: decrement sum by that value, primes[first_index], then increment first_index by one, and repeat the loop.
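A minimal Python sketch of that loop, assuming primes is the sorted list of primes up to n:

def consecutive_prime_window(n, primes):
    # Returns the consecutive primes summing to n, or None.
    if not primes:
        return None
    first, last, total = 0, 0, primes[0]
    while True:
        if total == n:
            return primes[first:last + 1]
        if total < n:
            last += 1
            if last == len(primes) or primes[last] > n:
                return None
            total += primes[last]
        else:
            total -= primes[first]
            first += 1

print(consecutive_prime_window(12, [2, 3, 5, 7, 11]))  # [5, 7]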
Dialecticus's algorithm is the classic O(m)-time, O(1)-space way to solve this type of problem (here I'll use m to represent the number of prime numbers less than n). It doesn't depend on any mysterious properties of prime numbers. (Interestingly, for the particular case of prime numbers, AlexAlvarez's algorithm is also linear time!) Dialecticus gives a clear and correct description, but seems at a loss to explain why it is correct, so I'll try to do this here. I really think it's valuable to take the time to understand this particular algorithm's proof of correctness: although I had to read a number of explanations before it finally "sank in", it was a real "Aha!" moment when it did! :) (Also, problems that can be efficiently solved in the same manner crop up quite a lot.)
The candidate solutions this algorithm tries can be represented as number ranges (i, j), where i and j are just the indexes of the first and last prime number in a list of prime numbers. The algorithm gets its efficiency by ruling out (that is, not considering) sets of number ranges in two different ways. To prove that it always gives the right answer, we need to show that it never rules out the only range with the right sum. To that end, it suffices to prove that it never rules out the first (leftmost) range with the right sum, which is what we'll do here.
The first rule it applies is that whenever we find a range (i, j) with sum(i, j) > n, we rule out all ranges (i, k) having k > j. It's easy to see why this is justified: the sum can only get bigger as we add more terms, and we have determined that it's already too big.
The second, trickier rule, crucial to the linear time complexity, is that whenever we advance the starting point of a range (i, j) from i to i+1, instead of "starting again" from (i+1, i+1), we start from (i+1, j) -- that is, we avoid considering (i+1, k) for all i+1 <= k < j. Why is it OK to do this? (To put the question the other way: Couldn't it be that doing this causes us to skip over some range with the right sum?)
[EDIT: The original version of the next paragraph glossed over a subtlety: we might have advanced the range end point to j on any previous step.]
To see that it never skips a valid range, we need to think about the range (i, j-1). For the algorithm to advance the starting point of the current range, so that it changes from (i, j) to (i+1, j), it must have been that sum(i, j) > n; and as we will see, to get to a program state in which the range (i, j) is being considered in the first place, it must have been that sum(i, j-1) < n. That second claim is subtle, because there are two different ways to arrive in such a program state: either we just incremented the end point, meaning that the previous range was (i, j-1) and this range was found to be too small (in which case our desired property sum(i, j-1) < n obviously holds); or we just incremented the start point after considering (i-1, j) and finding it to be too large (in which case it's not obvious that the property still holds).
What we do know, however, is that regardless of whether the end point was increased from j-1 to j on the previous step, it was definitely increased at some time before the current step -- so let's call the range that triggered this end point increase (k, j-1). Clearly sum(k, j-1) < n, since this was (by definition) the range that caused us to increase the end point from j-1 to j; and just as clearly k <= i, since we only process ranges in increasing order of their start points. Since i >= k, sum(i, j-1) is just the same as sum(k, j-1) but with zero or more terms removed from the left end, and all of these terms are positive, so it must be that sum(i, j-1) <= sum(k, j-1) < n.
So we have established that whenever we increase i to i+1, we know that sum(i, j-1) < n. To finish the analysis of this rule, what we (again) need to make use of is that dropping terms from either end of this sum can't make it any bigger. Removing the first term leaves us with sum(i+1, j-1) <= sum(i, j-1) < n. Starting from that sum and successively removing terms from the other end leaves us with sum(i+1, j-2), sum(i+1, j-3), ..., sum(i+1, i+1), all of which we know must be less than n -- that is, none of the ranges corresponding to these sums can be valid solutions. Therefore we can safely avoid considering them in the first place, and that's exactly what the algorithm does.
One final potential stumbling block is that it might seem that, since we are advancing two loop indexes, the time complexity should be O(m^2). But notice that every time through the loop body, we advance one of the indexes (i or j) by one, and we never move either of them backwards, so if we are still running after 2m loop iterations we must have i + j = 2m. Since neither index can ever exceed m, the only way for this to hold is if i = j = m, which means that we have reached the end: i.e. we are guaranteed to terminate after at most 2m iterations.
The fact that the primes have to be consecutive allows us to solve this problem quite efficiently in terms of n. Suppose that we have previously computed all the primes less than or equal to n. Then we can easily compute sum(i), the sum of the first i primes.
Having this function precomputed, we can loop over the primes less than or equal to n and ask whether there exists a length such that, starting with that prime, we can sum up to n. Notice that for a fixed starting prime the sequence of sums is monotone, so we can binary search over the length.
Thus, let k be the number of primes less than or equal to n. Precomputing the sums costs O(k), and the loop costs O(k log k), which dominates. Using the prime number theorem, we know that k = O(n/log n), and then the whole algorithm costs O(n/log n * log(n/log n)) = O(n).
Let me put the code in C++ to make it clearer; hope there are no bugs:
#include <iostream>
#include <vector>
using namespace std;

typedef long long ll;

int main() {
    //Get the limit for the numbers
    int MAX_N;
    cin >> MAX_N;

    //Compute the primes less than or equal to MAX_N
    vector<bool> is_prime(MAX_N + 1, true);
    for (int i = 2; i*i <= MAX_N; ++i) {
        if (is_prime[i]) {
            for (int j = i*i; j <= MAX_N; j += i) is_prime[j] = false;
        }
    }
    vector<int> prime;
    for (int i = 2; i <= MAX_N; ++i) if (is_prime[i]) prime.push_back(i);

    //Compute the prefix sums
    vector<ll> sum(prime.size() + 1, 0);
    for (int i = 0; i < prime.size(); ++i) sum[i + 1] = sum[i] + prime[i];

    //Get the number of queries
    int n_queries;
    cin >> n_queries;
    for (int z = 1; z <= n_queries; ++z) {
        int n;
        cin >> n;
        //Solve the query
        bool found = false;
        for (int i = 0; i < prime.size() and prime[i] <= n and not found; ++i) {
            //Do binary search over the length of the sum:
            //For all x < ini, [i, x] sums <= n
            int ini = i, fin = int(prime.size()) - 1;
            while (ini <= fin) {
                int mid = (ini + fin)/2;
                ll value = sum[mid + 1] - sum[i];
                if (value <= n) ini = mid + 1;
                else fin = mid - 1;
            }
            //Check the candidate of the binary search
            int candidate = ini - 1;
            if (candidate >= i and sum[candidate + 1] - sum[i] == n) {
                found = true;
                cout << n << " =";
                for (int j = i; j <= candidate; ++j) {
                    cout << " ";
                    if (j > i) cout << "+ ";
                    cout << prime[j];
                }
                cout << endl;
            }
        }
        if (not found) cout << "No solution" << endl;
    }
}
Sample input:
1000
5
12
20
28
17
29
Sample output:
12 = 5 + 7
No solution
28 = 2 + 3 + 5 + 7 + 11
17 = 2 + 3 + 5 + 7
29 = 29
I'd start by noting that for a pair of consecutive primes to sum to the number, one of the primes must be less than N/2 and the other prime must be greater than N/2. For them to be consecutive primes, they must be the primes closest to N/2: one smaller and the other larger.
If you're starting with a table of prime numbers, you basically do a binary search for N/2. Look at the primes immediately smaller and larger than that, add them together, and see whether they sum to your target number. If they don't, then it can't be the sum of two consecutive primes.
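With a prime table, that check is tiny. A hedged Python sketch (assumes primes is sorted; the function name is mine):

import bisect

def is_sum_of_two_consecutive_primes(n, primes):
    # The only candidate pair straddles n/2: the prime just below n/2
    # and the next prime at or above it.
    i = bisect.bisect_left(primes, n / 2)
    return 0 < i < len(primes) and primes[i - 1] + primes[i] == n

print(is_sum_of_two_consecutive_primes(12, [2, 3, 5, 7, 11, 13]))  # True (5 + 7)
print(is_sum_of_two_consecutive_primes(20, [2, 3, 5, 7, 11, 13]))  # False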
If you don't start with a table of primes, it works out pretty much the same way: you still start from N/2 and find the next larger prime (call it prime1). Then you compute N - prime1 to get a candidate for prime2. Check whether that's prime, and if it is, search the range prime2...N/2 to see whether there is another prime in between. If there is, your number is a sum of non-consecutive primes; if there isn't, it is a sum of consecutive primes.
The same basic idea applies for sequences of 3 or more primes, except that (of course) your search starts from N/3 (or N divided by however many primes you want the sum to contain).
So, for three consecutive primes to sum to N, two of the three must be the first prime smaller than N/3 and the first prime larger than N/3. We start by finding those, then compute N - (prime1 + prime2). That gives us our third candidate. We know these three numbers sum to N; we still need to verify that this third number is prime and, if it is, that it's consecutive to the other two.
To give a concrete example, for 10 we'd start from 3.333. The next smaller prime is 3 and the next larger is 5. Those add to 8, and 10 - 8 = 2. 2 is prime and consecutive to 3, so we've found the three consecutive primes that add to 10.
There are some other refinements you can make as well. The most obvious would be based on the fact that all primes (other than 2) are odd numbers. Therefore (assuming we can ignore 2), an even number can only be the sum of an even number of primes, and an odd number can only be a sum of an odd number of primes. So, given 123456789, we know immediately that it can't possibly be the sum of 2 (or 4, 6, 8, 10, ...) consecutive primes, so the only candidates to consider are 3, 5, 7, 9, ... primes. Of course, the opposite works as well: given, say, 12345678, the simple fact that it's even lets us immediately rule out the possibility that it could be the sum of 3, 5, 7 or 9 consecutive primes; we only need to consider sequences of 2, 4, 6, 8, ... primes. We violate this basic rule only when we get to a large enough number of primes that we could include 2 as part of the sequence.
I haven't worked through the math to figure out exactly how many that would be for a given number, but I'm pretty sure it should be fairly easy, and it's something we want to know anyway (because it's the upper limit on the number of consecutive primes to look for). If we use M for the number of primes, the limit should be approximately M <= sqrt(N): the M smallest primes sum to more than 1 + 2 + ... + M > M^2/2, so M can't be much larger than sqrt(2N). But that's definitely only an approximation.
I know that this question is a little old, but I cannot refrain from replying to the analysis made in the previous answers. Indeed, it has been emphasized that all three proposed algorithms have a run-time that is essentially linear in n. But in fact, it is not difficult to produce an algorithm that runs at a strictly smaller power of n.
To see how, let us choose a parameter K between 1 and n and suppose that the primes we need are already tabulated (if they must be computed from scratch, see below). Then, here is how we search for a representation of n as a sum of k consecutive primes:
First, we handle k < K using the idea present in Jerry Coffin's answer; that is, we search for k primes located around n/k.
Then, to explore the sums of k >= K primes, we use the algorithm explained in Dialecticus's answer; that is, we begin with a sum whose first element is 2, then advance the first element one step at a time.
The first part, which concerns short sums of big primes, requires O(log n) operations to binary search for one prime close to n/k and then O(k) operations to find the other k primes (there are a few simple possible implementations). In total this makes a running time
R_1 = O(K^2) + O(K log n).
The second part, which is about long sums of small primes, requires us to consider sums of consecutive primes p_1 < \dots < p_k where the first element is at most n/K.
Thus it requires visiting at most n/K + K primes (one can actually save a log factor via a weak version of the prime number theorem). Since in the algorithm every prime is visited at most O(1) times, the running time is
R_2 = O(n/K) + O(K).
Now, if log n < K < \sqrt n, the first part runs in O(K^2) operations and the second part in O(n/K). We optimize with the choice K = n^{1/3}, so that the overall running time is
R_1 + R_2 = O(n^{2/3}).
If the primes are not tabulated
If we also have to find the primes, here is how we do it.
First we use Eratosthenes' sieve, which in C_2 = O(T log log T) operations finds all the primes up to T, where T = O(n/K) is the upper bound for the small primes visited in the second part of the algorithm.
In order to perform the first part of the algorithm we need, for every k < K, to find O(k) primes located around n/k. The Riemann hypothesis implies that there are at least k primes in the interval [x, x+y] if y > c log x (k + \sqrt x) for some constant c > 0. Therefore, a priori, we need to find the primes contained in an interval I_k centered at n/k with width |I_k| = O(k log n) + O(\sqrt{n/k} log n).
Using the sieve of Eratosthenes to sieve the interval I_k requires O(|I_k| log log n) + O(\sqrt n) operations. If k < K < \sqrt n, we get a time complexity C_1 = O(\sqrt n log n log log n) for every k < K.
Summing up, the total time C_1 + C_2 + R_1 + R_2 is optimized when
K = n^{1/4} / (log n \sqrt{log log n}).
With this choice we have the sublinear time complexity
R_1 + R_2 + C_1 + C_2 = O(n^{3/4} \sqrt{log log n}).
If we do not assume the Riemann hypothesis, we have to search larger intervals, but we still end up with a sublinear time complexity. If instead we assume stronger conjectures on prime gaps, we may only need to search intervals I_k with width |I_k| = k (log n)^A for some A > 0. Then, instead of Eratosthenes, we can use other deterministic primality tests. For example, suppose that you can test a single number for primality in O((log n)^B) operations, for some B > 0.
Then you can search the interval I_k in O(k (log n)^{A+B}) operations. In this case the optimal K is still K ≈ n^{1/3}, up to logarithmic factors, and so the total complexity is O(n^{2/3} (log n)^D) for some D > 0.

Reduce a sequence in most optimal way

We are given a sequence a of n numbers. A reduction of sequence a replaces the elements a[i] and a[i+1] with max(a[i], a[i+1]).
Each reduction operation has a cost of max(a[i], a[i+1]). After n-1 reductions a sequence of length 1 is obtained.
Our goal is to print the minimum possible total cost of reducing the given sequence a to length 1.
e.g., for the input 1 2 3 the output is 5.
An O(N^2) solution is trivial. Any ideas?
EDIT1:
People are asking about my idea, so my idea was to traverse the sequence pairwise, check the cost of each pair, and in the end reduce the pair with the least cost.
1 2 3
2 3   <=== cost is 2
So reduce the above sequence to
2 3
Now traverse the sequence again; we get a cost of 3:
2 3
3     <=== cost is 3
So the total cost is 2 + 3 = 5.
The above algorithm is O(N^2). That is why I was asking for a more optimized idea.
O(n) solution:
High-level:
The basic idea is to repeatedly merge any element e that is smaller than both its neighbours ns and nl with its smaller neighbour ns. This produces the minimal cost because both the cost and the result of a merge are max(a[i], a[i+1]), which means no merge can make an element smaller than it currently is; thus the cheapest possible merge for e is with ns, and that merge can't increase the cost of any other possible merge.
This can be done with a one-pass algorithm by keeping a stack of elements from our array in decreasing order. We compare the current element against both its neighbours (one of them being the top of the stack) and perform the appropriate merges until we're done.
Pseudo-code:
stack = empty
for pos = 0 to length
    // stack.top > arr[pos] is implicitly true because of the previous iteration of the loop
    if stack.top > arr[pos] > arr[pos+1]
        stack.push(arr[pos])
    else if stack.top > arr[pos+1] > arr[pos]
        merge(arr[pos], arr[pos+1])
    else while arr[pos+1] > stack.top > arr[pos]
        merge(arr[pos], stack.pop)
Java code:
Stack<Integer> stack = new Stack<Integer>();
int cost = 0;
int arr[] = {10,1,2,3,4,5};

for (int pos = 0; pos < arr.length; pos++)
{
    if (pos < arr.length-1 && (stack.empty() || stack.peek() >= arr[pos+1]))
    {
        if (arr[pos] > arr[pos+1])
            stack.push(arr[pos]);
        else
            cost += arr[pos+1]; // merge pos and pos+1
    }
    else
    {
        int last = Integer.MAX_VALUE; // required otherwise a merge may be missed
        while (!stack.empty() && (pos == arr.length-1 || stack.peek() < arr[pos+1]))
        {
            last = stack.peek();
            cost += stack.pop(); // merge stack.pop() and pos or the last popped item
        }
        if (last != Integer.MAX_VALUE)
        {
            int costTemp = Integer.MAX_VALUE;
            if (!stack.empty())
                costTemp = stack.peek();
            if (pos != arr.length-1)
                costTemp = Math.min(arr[pos+1], costTemp);
            cost += costTemp;
        }
    }
}
System.out.println(cost);
I am confused whether by "cost" of a reduction you mean computational cost, i.e. an operation taking time max(a[i], a[i+1]), or simply a quantity you want to calculate. If it is the latter, then the following algorithm is better than O(n^2):
Sort the list, or more precisely, define b[i] such that a[b[i]] is the sorted list: O(n) if you can use radix sort, O(n log n) otherwise.
Starting from the second-lowest item i in the sorted list: if its left/right neighbour is lower than i, then perform the reduction: O(1) for each item, updating the list as in step 2, O(n) in total.
I have no idea whether that is the optimal solution, but it's O(n) for integers and O(n log n) otherwise.
edit: Realized that removing a precomputing step made it much simpler
If you don't consider it cheating to sort the list, then do so in n log n time and then merge the first two entries recursively. The total cost in this case will be the sum of the entries minus the smallest entry. This is optimal since
the cost will be the sum of n-1 entries (with repeats allowed)
the ith smallest entry can appear at most i-1 times in the cost function
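A one-line check of the sorted case (Python; this just restates the claim above):

def sorted_reduction_cost(a):
    # Sort, then repeatedly merge the first two entries: every entry
    # except the smallest is paid exactly once.
    a = sorted(a)
    return sum(a) - a[0]

print(sorted_reduction_cost([1, 2, 3]))  # 5, matching the example in the question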
The same fundamental idea works even if the list isn't sorted. An optimal solution is to merge the smallest element with its smallest neighbor. To see that this is optimal, note that
the cost will be the sum of n-1 entries (with repeats allowed)
entry a_i can appear at most j-1 times in the cost function, where j is the length of the longest consecutive subsequence containing a_i such that a_i is the maximum element of the subsequence
In the worst case, the sequence is decreasing and the time is O(n^2).
The greedy approach indeed works.
You can always reduce the smallest number with its smaller neighbour.
Proof: we have to reduce the smallest number at some point. Any reduction involving a neighbour leaves the neighbour's value the same or (possibly) bigger, so the operation that reduces the minimal element a[i] will always have cost c >= min(a[i-1], a[i+1]).
Now we need to
quickly find/remove the smallest number
find its neighbours
I'd go with 2 RMQs for that, implementing operation 2 as a binary search, which gives us O(N * log^2(N)).
EDIT: first RMQ: values; when you remove an element, put some big value in its place.
Second RMQ: "presence", 0 or 1 (the value is there / isn't there). To find, for example, the left neighbour of a[i], you need to find the greatest l such that sum[l, i-1] = 1.
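For reference, here is a simpler O(N log N) take on the same greedy, using a heap plus a doubly linked list instead of the two RMQs (a Python sketch of the greedy above, not the RMQ solution):

import heapq

def min_reduction_cost(a):
    # Greedy: always merge the current smallest element with its smaller
    # neighbour. A merge removes the smaller element and costs the value
    # of the chosen neighbour.
    n = len(a)
    left = [i - 1 for i in range(n)]
    right = [i + 1 for i in range(n)]
    alive = [True] * n
    heap = [(v, i) for i, v in enumerate(a)]
    heapq.heapify(heap)
    cost, remaining = 0, n
    while remaining > 1:
        v, i = heapq.heappop(heap)
        if not alive[i]:
            continue                      # stale heap entry
        l, r = left[i], right[i]
        lv = a[l] if l >= 0 else float('inf')
        rv = a[r] if r < n else float('inf')
        cost += min(lv, rv)               # merge with the smaller neighbour
        alive[i] = False                  # unlink i; the neighbour keeps its value
        if l >= 0: right[l] = r
        if r < n: left[r] = l
        remaining -= 1
    return cost

print(min_reduction_cost([1, 2, 3]))  # 5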

Algorithm: Find peak in a circle

Given n integers, arranged in a circle, show an efficient algorithm that can find one peak. A peak is a number that is not less than the two numbers next to it.
One way is to go through all the integers and check each one to see whether it is a peak. That yields O(n) time. It seems like there should be some way to divide and conquer to be more efficient, though.
EDIT
Well, Keith Randall proved me wrong. :)
Here's Keith's solution implemented in Python:
def findPeak(aBase):
    N = len(aBase)
    def a(i): return aBase[i % N]
    i = 0
    j = N // 3
    k = (2 * N) // 3
    if a(j) >= a(i) and a(j) >= a(k):
        lo, candidate, hi = i, j, k
    elif a(k) >= a(j) and a(k) >= a(i):
        lo, candidate, hi = j, k, i + N
    else:
        lo, candidate, hi = k, i + N, j + N
    # Loop invariants:
    # a(lo) <= a(candidate)
    # a(hi) <= a(candidate)
    while lo < candidate - 1 or candidate < hi - 1:
        checkRight = True
        if lo < candidate - 1:
            mid = (lo + candidate) // 2
            if a(mid) >= a(candidate):
                hi = candidate
                candidate = mid
                checkRight = False
            else:
                lo = mid
        if checkRight and candidate < hi - 1:
            mid = (candidate + hi) // 2
            if a(mid) >= a(candidate):
                lo = candidate
                candidate = mid
            else:
                hi = mid
    return candidate % N
Here's a recursive O(log n) algorithm.
Suppose we have an array of numbers, and we know that the middle number of that segment is no smaller than the endpoints:
A[i] <= A[m] >= A[j]
for i,j indexes into an array, and m=(i+j)/2. Examine the elements midway between the endpoints and the midpoint, i.e. those at indexes x=(3*i+j)/4 and y=(i+3*j)/4. If A[x]>=A[m], then recurse on the interval [i,m]. If A[y]>=A[m], then recurse on the interval [m,j]. Otherwise, recurse on the interval [x,y].
In every case, we maintain the invariant on the interval above. Eventually we get to an interval of size 2 which means we've found a peak (which will be A[m]).
To convert the circle to an array, take 3 equidistant samples and orient yourself so that the largest (or one tied for the largest) is in the middle of the interval and the other two points are the endpoints. The running time is O(log n) because each interval is half the size of the previous one.
I've glossed over the problem of how to round when computing the indexes, but I think you could work that out successfully.
When you say "arranged in a circle", you mean like in a circular linked list or something? From the way you describe the data set, it sounds like these integers are completely unordered, and there's no way to look at N integers and come to any kind of conclusion about any of the others. If that's the case, then the brute-force solution is the only possible one.
Edit:
Well, if you're not concerned with worst-case time, there are slightly more efficient ways to do it. The naive approach would be to look at N[i-1], N[i], and N[i+1] to see whether N[i] is a peak, then repeat, but you can do a little better.
While not done
    If N[i] < N[i+1]
        i++
    Else
        If N[i] > N[i-1]
            Done
        Else
            i += 2
(Well, not quite that, because you have to deal with the case where N[i]=N[i+1]. But something very similar.)
That will at least keep you from comparing N[i] to N[i+1], adding 1 to i, and then redundantly comparing N[i] to N[i-1]. It's a distinctly marginal gain, though. You're still marching through the numbers, but there's no way around that; jumping blindly is unhelpful, and there's no way to look ahead without taking just as long as doing the actual work would be.

How to find the second smallest element in n + log n - 2 comparisons? [duplicate]

Given n numbers, how do I find the largest and second largest number using at most n+log(n) comparisons?
Note that it's not O(n+log(n)), but really n+log(n) comparisons.
pajton gave a comment.
Let me elaborate.
As pajton said, this can be done by tournament selection.
Think of this as a knock-out singles tennis tournament, where player abilities have a strict order and the outcome of a match is decided solely by that order.
In the first round half the people are eliminated. In the next round another half, etc. (with some byes possible).
Ultimately the winner is decided in the last and final round.
This can be viewed as a tree.
Each node of the tree will be the winner of the match between the children of that node.
The leaves are the players. The winner of the first round are the parents of the leaves etc.
This is a complete binary tree on n nodes.
Now follow the path of the winner. The winner has played log n matches. Now consider the losers of those log n matches: the second best must be the best among them.
The winner is decided in n-1 matches (each match knocks out exactly one player), and the best among the log n losers is decided in log n - 1 matches.
Thus you can determine the largest and second largest in n + log n - 2 comparisons.
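A Python sketch of the tournament (the helper name is mine; assumes at least two elements):

def largest_and_second(a):
    # Pair players off round by round, remembering whom each winner has
    # beaten; the runner-up is the best of the final winner's victims.
    # Uses n - 1 comparisons for the tournament plus about log2(n) - 1
    # to scan the champion's list of defeated opponents.
    beaten = {i: [] for i in range(len(a))}
    players = list(range(len(a)))
    while len(players) > 1:
        nxt = []
        for j in range(0, len(players) - 1, 2):
            p, q = players[j], players[j + 1]
            w, l = (p, q) if a[p] >= a[q] else (q, p)
            beaten[w].append(l)
            nxt.append(w)
        if len(players) % 2:
            nxt.append(players[-1])   # odd player out gets a bye
        players = nxt
    champ = players[0]
    runner_up = max(beaten[champ], key=lambda i: a[i])
    return a[champ], a[runner_up]

print(largest_and_second([2, 8, 5, 13, 1, 9]))  # (13, 9)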
In fact, it can be proven that this is optimal: in any comparison scheme, in the worst case the winner has to play log n matches.
To prove that, use a point system where after a match the winner takes the points of the loser. Initially everyone starts with 1 point.
At the end the final winner has n points.
Now, given any algorithm, the outcomes could be arranged so that the player with more points always wins. Since a player's points at most double in any single match in that scenario, the winner must play at least log n matches in the worst case.
Is there a problem with this? It's at most 3n comparisons (not counting the i < n comparison). If you count that, it's 4n (or 5n in the second example).
double first = -1e300, second = -1e300;
for (i = 0; i < n; i++){
    if (a[i] > first){
        second = first;
        first = a[i];
    }
    else if (a[i] > second && a[i] < first){
        second = a[i];
    }
}
another way to code it:
for (i = 0; i < n; i++) if (a[i] > first) first = a[i];
for (i = 0; i < n; i++) if (a[i] < first && a[i] > second) second = a[i];
