Select pairs of numbers with the minimum overall difference - algorithm

Given n pairs of numbers, select k pairs so that the difference between the minimum value and the maximum value among the selected numbers is minimal. Note that the two numbers in a pair cannot be separated. Example (n=5, k=3):
INPUT    OUTPUT (return the indices of the pairs)
5 4      1 2 4
1 5
9 8
1 0
2 7
In this case, choosing (5,4) (1,5) (1,0) will give a difference of 5 (max is 5, min is 0). I'm looking for an efficient way (O(n log n)) of doing this, since the input will be pretty large and I don't want to go through every possible case.
Thank you.
NOTE: No code is needed. An explanation of the solution is enough.

Here's a method with O(n log n) time complexity:
First sort the array according to the smaller number in the pair. Now iterate back from the last element in the sorted array (the pair with the highest minimum).
As we go backwards, the pairs already visited necessarily have a minimum equal to or higher than that of the current pair. Store the visited pairs in a max heap keyed on the larger number of each pair. While the heap holds fewer than k-1 pairs, just keep adding to it.
Once the heap holds k-1 pairs, start recording and comparing the best interval so far: the current pair supplies the lower bound, and the upper bound is the larger of the current pair's maximum and the maximum at the top of the heap. Then add the current pair to the heap, and if the heap size exceeds k-1, pop the maximal element off. This way the heap always contains the k-1 visited pairs with the smallest maxima, all of whose minima are greater than or equal to the current minimum.
Total time O(n log n) for sorting + O(n log n) to iterate and maintain the heap = O(n log n) in total.
Example:
5 4
1 5
9 8
1 0
2 7
k = 3
Sort pairs by the smaller number in each pair:
[(1,0),(1,5),(2,7),(5,4),(9,8)]
Iterate from end to start:
i = 4; Insert (9,8) into heap
i = 3; Insert (5,4) into heap
i = 2; Range = 2-9
i = 1; Pop (9,8) from heap; Range = 1-7
i = 0; Pop (2,7) from heap; Range = 0-5
Minimal interval [0,5] (find k matching indices in O(n) time)

Let's keep two sorted arrays: one sorted by the minimal number in each pair and the other by the maximal number. Iterate over the first array, fixing the minimal number of the answer. Keep a pointer to the k-th element of the second array. When moving on to the next pair, remove from the second array all pairs with a smaller minimal value and advance the pointer if needed. To find a pair's position in the second array in O(log n) time, keep an additional map from pair to position.

Related

Maximize number of zigzag sequence in an array

I want to maximize the length of a zigzag sequence in an array (without reordering).
I have a main array containing a random sequence of integers. I want a sub-array of indexes of the main array that has a zigzag pattern.
A sequence of integers is called a zigzag sequence if each of its elements is either strictly less than or strictly greater than its neighbors (and two adjacent of neighbors).
Example: the sequence 4 2 3 1 5 2 forms a zigzag, but 7 3 5 5 2, 3 8 6 4 5, and 4 2 3 1 5 3 don't.
For a given array of integers we need to find a (contiguous) sub-array of indexes that forms a zigzag sequence.
Can this be done in O(N)?
Yes, this would seem to be solvable in O(n) time. I'll describe the algorithm as a dynamic program.
Setup
Let the array containing potential zig-zags be called Z.
Let U be an array such that len(U) == len(Z), and U[i] is an integer representing the length of the longest contiguous left-to-right subsequence starting at i that is a zig-zag and begins with Z[i] < Z[i+1] (it zigs up).
Let D be similar to U, except that D[i] is an integer representing the length of the longest contiguous left-to-right subsequence starting at i that is a zig-zag and begins with Z[i] > Z[i+1] (it zags down).
Subproblem
The subproblem is to find both U[i] and D[i] at each i. This can be done as follows:
U[i] = {
    1 + D[i+1]  if Z[i] < Z[i+1]
    0           otherwise
}
D[i] = {
    1 + U[i+1]  if Z[i] > Z[i+1]
    0           otherwise
}
The top version says that if we're looking for the largest sequence beginning with an up-zig, we see if the next element is larger (goes up), and then add a single zig to the size of the next down-zag sequence. The next one is the reverse.
Base Cases
If i == len(Z) - 1 (it is the last element), U[i] = D[i] = 0. The last element cannot have a left-to-right sequence after it because there is nothing after it.
Solution
To get the solution, first we find max(U[i]) and max(D[i]) over all i. Then take the larger of those two values, store the index i where it occurs, and store the length of this largest zig-zag (in a variable called length). The sequence begins at index i and ends at index i + length.
Runtime
There are n indexes, so there are 2n subproblems between U and D. Each subproblem takes O(1) time to solve, given that solutions to previously solved subproblems are memoized. Finally, iterating through U and D to get the final answer takes O(2n) time.
We thus have O(2n) + O(2n) time, or O(n).
This may be an overly complex solution, but it demonstrates that it can be done in O(n).
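As a rough sketch, the dynamic program described above could look like this in Python. The function name `longest_zigzag` is made up for the example, and the code only compares adjacent elements, exactly as the recurrence does; it does not attempt to model the extra "two adjacent of neighbors" condition from the question.

```python
def longest_zigzag(z):
    """Return (start, end) indexes, inclusive, of a longest contiguous zig-zag."""
    n = len(z)
    if n == 0:
        return None
    up = [0] * n     # up[i]: steps in the longest zig-zag from i that starts by going up
    down = [0] * n   # down[i]: same, but starting by going down
    # Base case: the last element has nothing after it, so both stay 0.
    for i in range(n - 2, -1, -1):
        if z[i] < z[i + 1]:
            up[i] = 1 + down[i + 1]
        elif z[i] > z[i + 1]:
            down[i] = 1 + up[i + 1]
    # Pick the start index with the largest value in either table.
    start = max(range(n), key=lambda i: max(up[i], down[i]))
    length = max(up[start], down[start])
    return start, start + length

print(longest_zigzag([4, 2, 3, 1, 5, 2]))
print(longest_zigzag([7, 3, 5, 5, 2]))
```

The tables are filled from right to left, so each entry is computed once from an already known neighbor, which keeps the whole pass linear.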

find maximum possible min value of array

There is an array containing n integers. In each step we are allowed to increment all the elements present in any subarray of size w by 1. The maximum number of such steps allowed is m. Any element of the array cannot be incremented more than k times. We are required to maximize the minimum possible element in the array after these operations.
For example we are given n=6, m=2, w=3, k=1
And the array is 2 2 2 2 1 1.
Then the answer is 2: since k=1, we can only increment each element once, so applying a window of size 3 at the end of the array gives the required answer. (Also note that since m=2, the first 3 elements will be incremented in the next step.)
How do I approach this problem?
Edit: Constraints are
1 ≤ w ≤ n ≤ 10^5
1 ≤ k ≤ m ≤ 10^5
Elements in the array are in range 1 to 10^9.

Sequence increasing and decreasing by turns

Let's assume we've got a sequence of integers of given length n. We want to delete some elements (maybe none), so that the sequence is increasing and decreasing by turns in result. It means, that every element should have neighbouring elements either both bigger or both smaller than itself.
For example 1 3 2 7 6 and 5 1 4 2 10 are both sequences increasing and decreasing by turns.
We want to delete some elements to transform our sequence that way, but we also want to maximize the sum of elements left. So, for example, from sequence 2 18 6 7 8 2 10 we want to delete 6 and make it 2 18 7 8 2 10.
I am looking for an effective solution to that problem. Example above shows that the most naive greedy algorithm (delete every first element that breaks the sequence) won't work - it would delete 7 instead of 6, which would not maximize the sum of elements left.
Any ideas how to solve that effectively (O(n) or O(n log n) probably) and correctly?
For every element of the sequence with index i we will calculate F(i, high) and F(i, low), where F(i, high) equals the biggest sum of a subsequence with the wanted characteristics that ends with the i-th element, with this element being a "high peak". (I'll explain mainly the "high" part; the "low" part can be done similarly.) We can calculate these functions using the following relations:
F(i, high) = a[i] + max(F(j, low)) over all j < i with a[j] < a[i]
F(i, low) = a[i] + max(F(j, high)) over all j < i with a[j] > a[i]
(the max is taken as 0 when no such j exists)
The answer is maximal among all F(i, high) and F(i, low) values.
That gives us a rather simple dynamic programming solution with O(n^2) time complexity. But we can go further.
We can optimize a calculation of max(F(j,low)) part. What we need to do is to find the biggest value among previously calculated F(j, low) with the condition that a[j] < a[i]. This can be done with segment trees.
First of all, we'll "squeeze" our initial sequence. We need the real value of the element a[i] only when calculating the sum. But we need only the relative order of the elements when checking that a[j] is less than a[i]. So we'll map every element to its index in the sorted elements array without duplicates. For example, sequence a = 2 18 6 7 8 2 10 will be translated to b = 0 5 1 2 3 0 4. This can be done in O(n*log(n)).
The biggest element of b will be less than n, as a result, we can build a segment tree on the segment [0, n] with every node containing the biggest sum within the segment (we need two segment trees for "high" and "low" part accordingly). Now let's describe the step i of the algorithm:
Find the biggest sum max_low on the segment [0, b[i]-1] using the "low" segment tree (initially all nodes of the tree contain zero).
F(i, high) is equal to max_low + a[i].
Find the biggest sum max_high on the segment [b[i]+1, n] using the "high" segment tree.
F(i, low) is equal to max_high + a[i].
Update the [b[i], b[i]] segment of the "high" segment tree with F(i, high) value recalculating maximums of the parent nodes (and [b[i], b[i]] node itself).
Do the same for "low" segment tree and F(i, low).
Complexity analysis: b sequence calculation is O(n*log(n)). Segment tree max/update operations have O(log(n)) complexity and there are O(n) of them. The overall complexity of this algorithm is O(n*log(n)).
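The steps above might be sketched in Python as follows. This is an illustrative version under a few assumptions: the names `MaxSegTree` and `max_alternating_sum` are invented here, the trees are built over the number of distinct values rather than [0, n], empty ranges are treated as contributing 0 (matching the all-zero initial trees), and a leaf keeps the larger of its old and new value so that repeated values do not overwrite a better sum.

```python
class MaxSegTree:
    """Iterative segment tree over [0, size) supporting point update and range max."""

    def __init__(self, size):
        self.n = size
        self.t = [0] * (2 * size)   # initially every node contains zero

    def update(self, pos, value):
        pos += self.n
        self.t[pos] = max(self.t[pos], value)
        pos //= 2
        while pos >= 1:             # recalculate maximums of the parent nodes
            self.t[pos] = max(self.t[2 * pos], self.t[2 * pos + 1])
            pos //= 2

    def query(self, lo, hi):
        """Maximum on the half-open range [lo, hi); 0 if the range is empty."""
        res = 0
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:
                res = max(res, self.t[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                res = max(res, self.t[hi])
            lo //= 2
            hi //= 2
        return res


def max_alternating_sum(a):
    if not a:
        return 0
    # "Squeeze" the values: map each one to its rank among the distinct values.
    rank = {v: r for r, v in enumerate(sorted(set(a)))}
    m = len(rank)
    high, low = MaxSegTree(m), MaxSegTree(m)
    best = 0
    for x in a:
        r = rank[x]
        f_high = low.query(0, r) + x        # best "low" sum among smaller values
        f_low = high.query(r + 1, m) + x    # best "high" sum among larger values
        high.update(r, f_high)
        low.update(r, f_low)
        best = max(best, f_high, f_low)
    return best

print(max_alternating_sum([2, 18, 6, 7, 8, 2, 10]))
```

For the sequence from the question this gives 47, the sum of 2 18 7 8 2 10.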

Sorting as much as possible: values can travel no more than k positions to their left

Given an array of length N and an integer K, sort the array as much as possible such that no element travels more than K positions to its left. An element however can travel as much as it likes to its right.
Let's define sortedness as the number of disordered pairs, i.e.: sortedness(1,2,3) = 0 and sortedness(3,1,2) = 2.
Clarification: If the first k+1 items of the array are moved to the end of the array, the other ones should be considered moved k+1 positions to the left.
This is an interview question. I thought of using a bubble sort. The outer loop would run K times with a run-time of O(nk). The smallest integer would be the only integer shifted to the left K times. The other integers would be shifted to the left less than K times.
Is there a more efficient way to approach this problem?
Use a min heap to sort the list of n elements in O(n log k).
Add the first k+1 unsorted elements to the heap.
Repeat this step until the heap is empty: pop the min element off the heap and add it to the end of the sorted list; then, if any unsorted elements remain, add the next one to the heap.
Because the heap always has at most k+1 elements regardless of n, all heap operations are O(log k), and the total running time is O(n log k).
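A minimal Python sketch of this heap procedure, assuming the standard `heapq` module; the function name `sort_k_left` is hypothetical. Note that `heapq.heapreplace` pops first and pushes afterwards, which matters here: the newly added element must not be eligible for the slot being filled.

```python
import heapq

def sort_k_left(arr, k):
    """Sort as much as possible while no element moves more than k places left."""
    heap = list(arr[:k + 1])          # the first k+1 unsorted elements
    heapq.heapify(heap)
    out = []
    for x in arr[k + 1:]:
        # Pop the current minimum, then push the next unsorted element.
        out.append(heapq.heapreplace(heap, x))
    while heap:                        # drain what is left once the input runs out
        out.append(heapq.heappop(heap))
    return out

print(sort_k_left([5, 3, 4, 7, 8, 2, 1, 0], 2))
```

With the array 5 3 4 7 8 2 1 0 and K = 2 this produces 3 4 5 2 1 0 7 8.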
Why is this correct?
Suppose it isn't. Then for some inputs my algorithm gives non-optimal sorts. Let I be such an input, let A be the output of my algorithm on I, and let B be the optimal sort.
Let i be the first index where A and B disagree. Let x = A[i], y = B[i], and let j be the index of x in B.
I claim that swapping x and y in B improves the sortedness of B, which is a contradiction.
Because A and B are identical for positions before i, the same set of k+1 elements are eligible to go into position i for both. Because my algorithm chose x to be the min of those elements, we know that x is less than y. We also know j is greater than i.
What happens when we swap x and y in B?
First, note that the change in sortedness is unaffected by anything to the left of i or to the right of j, because their positions relative to both x and y are unchanged by the swap.
We know there are no elements between i and j that are less than x, because my sort chose the smallest available element. Therefore all elements between i and j are at least as large as x.
For each element between i and j equal to x, swapping x and y improves sortedness by 1 because we improve y relative to these elements and x is unaffected.
For each element between i and j greater than x, the sortedness of x relative to these is improved by 1, and in the worst case the sortedness of y relative to these is degraded by 1, so the net effect is at worst 0.
Furthermore, swapping x and y improves the sortedness of x relative to y by 1, so this swap strictly improves overall sortedness.
Contradiction.
Naive approach:
Iterate over the array from left to right. For each position i, consider the subarray from i to i+k. Find the minimum element in this subarray and swap it with the first element of the subarray. Then go to position i+1 and do the same.
Optimized Approach:
We can use segment tree to solve this. Using this data structure you can find the minimum value between any range of an array and also edit any data online in O(logn). In your problem, we can get the solution array using following steps,
arr[1] = minimum value between positions 1 and min(k+1,n), then overwrite that position with infinity
arr[2] = minimum value between positions 1 and min(k+2,n), then overwrite that position with infinity
arr[3] = minimum value between positions 1 and min(k+3,n), then overwrite that position with infinity
arr[4] = minimum value between positions 1 and min(k+4,n), then overwrite that position with infinity
...
arr[n] = minimum value between positions 1 and min(k+n,n), then overwrite that position with infinity
Overall complexity O(nlogn)
for example:
given array = 5 3 4 7 8 2 1 0 and K = 2
using this algorithm you will get the solution array as this:
3 4 5 2 1 0 7 8 sortedness value = 12
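For illustration, the segment tree variant could be sketched in Python like this. It is only one possible reading of the steps above: the function name `sort_k_left_segment_tree` is invented, positions are 0-based, and the tree stores (value, position) tuples so that the position of the minimum can be overwritten with infinity afterwards.

```python
import math

def sort_k_left_segment_tree(values, k):
    n = len(values)
    if n == 0:
        return []
    empty = (math.inf, -1)
    # Leaves hold (value, position); inner nodes hold the minimum of their children.
    tree = [empty] * (2 * n)
    for i, v in enumerate(values):
        tree[n + i] = (v, i)
    for node in range(n - 1, 0, -1):
        tree[node] = min(tree[2 * node], tree[2 * node + 1])

    def query(lo, hi):
        """Minimum (value, position) over positions [lo, hi)."""
        best = empty
        lo += n
        hi += n
        while lo < hi:
            if lo & 1:
                best = min(best, tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                best = min(best, tree[hi])
            lo //= 2
            hi //= 2
        return best

    def remove(pos):
        """Overwrite a position with infinity and refresh its ancestors."""
        node = n + pos
        tree[node] = empty
        node //= 2
        while node >= 1:
            tree[node] = min(tree[2 * node], tree[2 * node + 1])
            node //= 2

    out = []
    for i in range(n):
        # Output slot i may take any remaining element from positions 0..i+k.
        v, pos = query(0, min(i + k + 1, n))
        out.append(v)
        remove(pos)
    return out

print(sort_k_left_segment_tree([5, 3, 4, 7, 8, 2, 1, 0], 2))
```

Each of the n output slots costs one range query and one point update, so the sketch runs in O(n log n) as stated.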
Hope it helps!
Best regards,
Agassaa

minimum number of steps required to set all flags of array elements to 1 which were initially 0 by default

Two integers N<=10^5 and K<=N are given, where N is the size of array A[] and K is the length of the contiguous subsequence we can choose in each step. Each element A[i]<=10^9. Suppose that initially all the elements of the array are unmarked. In each step we choose any subsequence of length K, and if this subsequence has unmarked elements, we mark all the unmarked elements that are minimal in the subsequence. How do we calculate the minimum number of steps needed to mark all the elements?
For better understanding of problem see this example--
N=5 K=3
A[]=40 30 40 30 40
Step 1 - Select interval [1,3] and mark A[1] and A[3]
Step 2 - Select interval [0,2] and mark A[0] and A[2]
Step 3 - Select interval [2,4] and mark A[4]
Hence minimum number of steps here is 3.
My approach(which is not fast enough to pass)-
I am starting from first element of array and marking all the unmarked elements equal to it at distance <=K and incrementing steps by 1.
First consider how you'd answer the question for K == N (i.e. without any effective restriction on the length of subsequences). Your answer should be that the minimum number of steps is the number of distinct values in the array.
Then consider how this changes as K decreases; all that matters is how many copies of a K-length interval you need to cover the selection set {i: A[i] == n} for each value n present in A. The naive algorithm of walking a K-length interval along A, halting at each position A[i] not yet covered for that value of n is perfectly adequate.
As we can see, the minimum number of steps is N/K or N/K+1, and the maximum number of steps is N+K-1.
We have to optimize the total number of steps, which depends on the history of choices made so far; this points to a dynamic programming solution.
For a dynamic programming tutorial see http://www.quora.com/Dynamic-Programming/How-do-I-get-better-at-DP-Are-there-some-good-resources-or-tutorials-on-it-like-the-TopCoder-tutorial-on-DP/answer/Michal-Danil%C3%A1k
Can be solved in O(n) as follows:
Traverse the elements a[i] in order. If the value a[i] has not been seen before, map the value to its index and increase the counter. If the value has been seen previously, check whether (curr_index - last_index) >= K; if so, update the stored index and increase the counter. Print the counter.
A map from the STL will be useful.
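The counting idea above could be sketched in Python with a dictionary in place of the STL map. The function name `min_steps` is an assumption for the example, and the sketch simply follows the description: a value starts a new K-length interval whenever it is not covered by the last interval started for that value.

```python
def min_steps(a, k):
    """Count the K-length intervals needed to cover every occurrence of each value."""
    window_start = {}   # value -> index where its latest interval begins
    steps = 0
    for i, v in enumerate(a):
        # A new interval is needed if the value is new or lies outside
        # the interval [start, start + k - 1] last opened for it.
        if v not in window_start or i - window_start[v] >= k:
            window_start[v] = i
            steps += 1
    return steps

print(min_steps([40, 30, 40, 30, 40], 3))
```

For the example from the question (N=5, K=3) this gives 3, and with K == N it reduces to the number of distinct values. With a hash-based dictionary the pass takes expected O(n) time.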
