Count no. of pairs such that absolute difference is less than or equal to K - algorithm

Given an array A of size N, how do I count the number of pairs(A[i], A[j]) such that the absolute difference between them is less than or equal to K where K is any positive natural number? (i, j<=N and i!=j)
My approach:
Sort the array.
Create another array that stores the absolute difference between two consecutive numbers.
Am I heading in the right direction? If yes, then how do I proceed further?

Here is an O(n log n) algorithm:
1. Sort the input.
2. Traverse the sorted array in ascending order.
3. For each A[i], use binary search to find the largest index j such that A[j] <= A[i]+k.
4. count = count + j - i
5. Repeat steps 3 and 4 for every i.
Time complexity:
Sorting: O(n log n)
Binary search: O(log n) per element, O(n log n) in total
Overall: O(n log n)
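The steps above can be sketched in Python roughly as follows. This is a minimal sketch, assuming 0-based indexing and the standard-library `bisect_right` for step 3; the name `count_pairs_bisect` is made up for illustration.

```python
from bisect import bisect_right


def count_pairs_bisect(values, k):
    """Count pairs (i, j), i != j, with |values[i] - values[j]| <= k."""
    a = sorted(values)  # step 1: sort a copy of the input
    count = 0
    for i, x in enumerate(a):  # step 2: ascending traversal
        # step 3: largest index j with a[j] <= x + k
        j = bisect_right(a, x + k) - 1
        # step 4: every index in (i, j] forms a valid pair with i
        count += j - i
    return count
```

For example, `count_pairs_bisect([0, 1, 2, 4, 8, 16], 9)` counts every pair except those involving 16, other than (8, 16).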

This is O(n^2):
Sort the array
For each item_i in array,
    For each item_j in array such that j > i
        If item_j - item_i <= k, print (item_j, item_i)
        Else proceed with the next item_i

Your approach is partially correct. You first sort the array. Then keep two pointers i and j.
1. Initialize i = 0, j = 1.
2. Check if A[j] - A[i] <= K.
   - If yes, then increment j,
   - else
     - **increase the count of pairs by C(j-i,2)**.
     - increment i.
     - if i == j, then increment j.
3. Do this till pointer j goes past the end of the array. Then just add C(j-i,2) to the count and stop.
By i and j, you are basically maintaining a window within which the difference between elements is <= K.
EDIT: This is the basic idea, you will have to check for boundary conditions. Also you will have to keep track of the past interval that was added to the count. You will need to subtract the overlap with the current interval to avoid double counting.
Complexity: O(NlogN), for the sort operation, linear for the array traversal

Once your array is sorted, you can compute the count in O(N) time.
Here's some code. The O(N) algorithm is pair_sums, and pair_sums_slow is the obviously correct, but O(N^2) algorithm. I run through some test cases at the end to make sure that the two algorithms return the same results.
def pair_sums(A, k):
    A.sort()
    counts = 0
    j = 0
    for i in range(len(A)):
        # advance j to the first index with A[j] - A[i] > k (or to len(A))
        while j < len(A) and A[j] - A[i] <= k:
            j += 1
        counts += j - i - 1
    return counts

def pair_sums_slow(A, k):
    counts = 0
    for i in range(len(A)):
        for j in range(i+1, len(A)):
            # abs() keeps this correct even when A is not sorted yet
            if abs(A[j] - A[i]) <= k:
                counts += 1
    return counts

cases = [
    ([0, 1, 2, 3, 4, 5], 10),
    ([0, 0, 0, 0, 0], 1),
    ([0, 1, 2, 4, 8, 16], 9),
    ([0, -1, -2, 1, 2], 2)
]
for A, k in cases:
    want = pair_sums_slow(A, k)
    got = pair_sums(A, k)
    if want != got:
        print(A, k, want, got)
The idea behind pair_sums is that for each i, we find the smallest j such that A[j] - A[i] > K (or j=N). Then j-i-1 is the number of pairs with i as the first value.
Because the array is sorted, j only ever increases as i increases, so the overall complexity is linear: although there are nested loops, the inner operation j+=1 can occur at most N times.

Related

Finding largest sum in an unsorted array using divide and conquer algorithm

I have a sequence of n real numbers stored in a array, A[1], A[2], …, A[n]. I am trying to implement a divide and conquer algorithm to find two numbers A[i] and A[j], where i < j, such that A[i] ≤ A[j] and their sum is the largest.
For example, {2, 5, 9, 3, -2, 7} will give the output of 14 (5+9, not 16=9+7). Can anyone suggest some ideas on how to do it?
Thanks in advance.
This problem is not really suited to a divide and conquer approach. It's easy to observe that if (i, j) is a solution for this problem, then A[j] >= A[k] for every k > j, i.e. A[j] is the maximum in A[j..n].
Proof: if there exists a k > j with A[k] > A[j], then (j, k) is a better solution than (i, j).
So we only need to consider values of j that satisfy this criterion.
Algorithm (pseudo-code)
maxj = n
for (j = n - 1 down to 1):
    if (a[j] > a[maxj]) then:
        maxj = j
    else:
        check if (j, maxj) is a better solution
Complexity: O(n)
C++ implementation: http://ideone.com/ENp5WR (The implementation use an integer array, but it should be the same for floats)
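For readers without access to the linked implementation, the pseudo-code above might look like this in Python. It is only a sketch, assuming 0-based indexing; the name `best_ordered_pair` and the `(sum, i, j)` return value are illustrative choices, and `None` is returned when no pair with A[i] <= A[j] exists.

```python
def best_ordered_pair(a):
    """Return (sum, i, j) with i < j, a[i] <= a[j] and maximal sum, or None."""
    if len(a) < 2:
        return None
    best = None
    max_j = len(a) - 1  # index of the maximum of the suffix seen so far
    for j in range(len(a) - 2, -1, -1):
        if a[j] > a[max_j]:
            # a[j] becomes the new suffix maximum; it cannot be the smaller
            # element of a pair whose partner lies to its right
            max_j = j
        else:
            # a[j] <= a[max_j], so (j, max_j) is a valid candidate pair
            candidate = a[j] + a[max_j]
            if best is None or candidate > best[0]:
                best = (candidate, j, max_j)
    return best
```

On the example from the question, `best_ordered_pair([2, 5, 9, 3, -2, 7])` picks the elements 5 and 9.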
Declare two variables. During your algorithm, check if the current number is bigger than either of the two values currently stored in the variables; if yes, replace the smaller one, if not, continue.
Here's a recursive solution in Python. I wouldn't exactly call it "divide and conquer" but then again, this problem isn't very suited to a divide and conquer approach.
def recurse(lst, pair):  # lst is the remaining list left to process
    if not lst: return  # if lst is empty, return
    for i in lst[1:]:  # for each element in lst starting from index 1
        curr_sum = lst[0] + i
        # the first value must not exceed the second (A[i] <= A[j]) and curr_sum must beat the max sum so far
        if lst[0] <= i and curr_sum > pair[0]+pair[1]:
            pair[0] = lst[0]
            pair[1] = i  # update pair to contain the new pair of values that give the max sum
    recurse(lst[1:], pair)  # recurse on the sub list from index 1 to the end

def find_pair(s):
    if len(s) < 2: return None  # fewer than two elements: no pair exists
    pair = [s[0],s[1]]  # initialises pair array
    recurse(s, pair)  # the list is mutated in place
    return pair
Sample output:
s = [2, 5, 9, 3, -2, 7]
find_pair(s)  # ============> [5, 9]
I think you can just use an O(n) algorithm, described as follows
(The merge part uses constant time)
Here is the outline of the algorithm:
Divide the problem into two halves: LHS & RHS
Each half should return the largest answer meeting the requirement in that half AND the largest element in that half
Merge and return the answer to the upper level: the answer is the maximum of LHS's answer, RHS's answer, and the sum of the largest elements of both halves (consider this only if RHS's largest element >= LHS's largest element)
Here is the demonstration of the algorithm using your example: {2, 5, 9, 3, -2, 7}
Divide into {2,5,9}, {3,-2,7}
Divide into {2,5}, {9}, {3,-2}, {7}
{2,5} return max(2,5, 5+2) = 7, largest element = 5
{9} return 9, largest element = 9
{3,-2} return max(3,-2) = 3, largest element = 3
{7} return 7, largest element = 7
{2,5,9} merged from {2,5} & {9}: return max(7,9,9+5) = 14, largest element = max(9,5) = 9
{3,-2,7} merged from {3,-2} & {7}: return max(3,7,7+3) = 10, largest element = max(7,3) = 7
{2,5,9,3,-2,7} merged from {2,5,9} and {3,-2,7}: return max(14,10) = 14, largest element = max(9,7) = 9
ans = 14
Special cases like {5,4,3,2,1}, which yield no answer, need extra handling, but this does not affect the core part or the complexity of the algorithm.

Number of contiguous subarrays satisfying constraints

Given an array A, find the number of contiguous subarrays that satisfy the condition:
There is no pair (i,j) in the subarray such that i < j and A[i] mod A[j] = M
1<=A[i]<=100000
My approach: do it the naive way in O(n^2) time, which is bad.
Can I reduce it to O(n log n)?
This takes O(N) time, but requires O(N+M) space:
We scan the array and perform A[i] = A[i] mod M
Keep a counter array of size N which keeps track of how many elements before it respect the given condition (basically not being equal to the current element):
C[i] = the number of elements A[j] such that A[j] != A[i], where j < i
Consider the array:
Original A array: 12, 25, 16, 14, 37, 18, 28, 17, 9, 37
New A array: 0, 1, 4, 2, 1, 6, 4, 5, 9, 1 (A[i] mod M)
Counter array: 0, 1, 2, 3, 2, 3, 3, 4, 5, 4
The counter array can be constructed incrementally based on previous values:
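If there is an earlier element A[j] = A[i] with j < i, C[i] is determined by the last such index j (in the example above this gives C[i] = i - j - 1);
Otherwise C[i] = C[i-1]+1;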
When looking for the last occurrence of A[i] before index i, we need to keep a dictionary (A[i], last-index-of-A[i]) for fast lookup.
Since we keep only values for A[i] mod M => a O(M) dictionary will do.
We now just sum up the counter array values:
Number of contiguous subarrays = Sum(C)
In this case we will have 27 contiguous subarrays that respect this condition.
Basically, we need to find all pairs such that i < j and A[i] mod A[j] = M
If a mod b = m, then a mod d = m, where d is a divisor of b and d > m
Hash the numbers in the array. For each number a in the array, check if a - m or any of the divisors of a - m are elements in the array and their index is greater than the index of a - enumerate any existing pairs (a, a - m) or (a, divisor of (a - m) > m) in O(n sqrt n).
Clearly, any contiguous sub-arrays satisfying the condition lie between any such pairs. If we aggregate the pairs in an interval tree, as we traverse the array, we can test if we are within a pair in O(log n) time. Once we detect we are overlapping a pair (an interval in the tree), we reset our window to (i + 1, j) (where (i,j) is the interval) and keep counting; we add to the total the largest segments achievable according to the formula, segment * (segment + 1) / 2, subtracted by the previous overlapping count.

Count number of swaps to sort first k-smallest element using a bubble sort like algorithm

Given an array a and integer k. Someone uses following algorithm to get first k smallest elements:
cnt = 0
for i in [1, k]:
    for j in [i + 1, n]:
        if a[i] > a[j]:
            swap(a[i], a[j])
            cnt = cnt + 1
The problem is: How to calculate value of cnt (when we get final k-sorted array), i.e. the number of swaps, in O(n log n) or better ?
Or simply put: calculate the number of swaps needed to get first k-smallest number sorted using the above algorithm, in less than O(n log n).
I am thinking about a binary search tree, but I get confused (How array will change when increase i ? How to calculate number of swap for a fixed i ?...).
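For reference, the procedure in the question can be simulated directly in O(nk). The sketch below assumes 0-based indexing and works on a copy of the input; the name `count_swaps_naive` is made up here. A brute-force counter like this is handy for checking any faster approach against small inputs.

```python
def count_swaps_naive(values, k):
    """Simulate the bubble-sort-like loop from the question and count swaps."""
    a = list(values)  # work on a copy so the caller's list is untouched
    n = len(a)
    cnt = 0
    for i in range(min(k, n)):      # positions 1..k in the question's 1-based terms
        for j in range(i + 1, n):   # positions i+1..n
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
                cnt += 1
    return cnt
```

For instance, `count_swaps_naive([2, 5, 3, 4, 1, 6], 2)` performs one swap for the first position and two for the second.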
This is a very good question: it involves Inverse Pairs, Stack and some proof techniques.
Note 1: All index used below are 1-based, instead of traditional 0-based.
Note 2: If you want to see the algorithm directly, please start reading from the bottom.
First we define Inverse Pairs as:
For a[i] and a[j], in which i < j holds, if we have a[i] > a[j], then a[i] and a[j] are called an Inverse Pair.
For example, In the following array:
3 2 1 5 4
a[1] and a[2] form an inverse pair, and a[2] and a[3] form another.
Before we start the analysis, let's define a common language: in the rest of the post, "inverse pair starting from i" means the total number of inverse pairs involving a[i].
For example, for a = {3, 1, 2}, inverse pair starting from 1 is 2, and inverse pair starting from 2 is 0.
Now let's look at some facts:
If we have i < j < k, and a[i] > a[k], a[j] > a[k], swap a[i] and a[j] (if they are an inverse pair) won't affect the total number of inverse pair starting from j;
Total inverse pairs starting from i may change after a swap (e.g. suppose we have a = {5, 3, 4}, before a[1] is swapped with a[2], total number of inverse pair starting from 1 is 2, but after swap, array becomes a = {3, 5, 4}, and the number of inverse pair starting from 1 becomes 1);
Given an array A and 2 numbers, a and b, as the head element of A, if we can form more inverse pair with a than b, we have a > b;
Let's denote the total number of inverse pair starting from i as ip[i], then we have: if k is the min number satisfies ip[i] > ip[i + k], then a[i] > a[i + k] while a[i] < a[i + 1 .. i + k - 1] must be true. In words, if ip[i + k] is the first number smaller than ip[i], a[i + k] is also the first number smaller than a[i];
Proof of point 1:
By definition of inverse pair, for all a[k], k > j that forms inverse pair with a[j], a[k] < a[j] must hold. Since a[i] and a[j] are a pair of inverse and provided that i < j, we have a[i] > a[j]. Therefore, we have a[i] > a[j] > a[k], which indicates the inverse-pair-relationships are not broken.
Proof of point 3:
Leave as empty since quite obvious.
Proof of point 4:
First, it's easy to see that when i < j and a[i] > a[j], we have ip[i] >= ip[j] + 1 > ip[j]. Then its contrapositive is also true, i.e. when i < j and ip[i] <= ip[j], we have a[i] <= a[j].
Now back to the point. Since k is the min number to satisfy ip[i] > ip[i + k], then we have ip[i] <= ip[i + 1 .. i + k - 1], which indicates a[i] <= a[i + 1.. i + k - 1] by the lemma we just proved, which also indicates there's no inverse pairs in the region [i + 1, i + k - 1]. Therefore, ip[i] is the same as the number of inverse pairs starting from i + k, but involving a[i]. Given ip[i + k] < ip[i], we know a[i + k] has less inverse pair than a[i] in the region of [i + k + 1, n], which indicates a[i + k] < a[i] (by Point 3).
You can write down some sequences and try out the 4 facts mentioned above and convince yourself or disprove them :P
Now it's about the algorithm.
A naive implementation will take O(nk) to compute the result, and the worst case will be O(n^2) when k = n.
But how about we make use of the facts above:
First we compute ip[i] using Fenwick Tree (see Note 1 below), which takes O(n log n) to construct and O(n log n) to get all ip[i] calculated.
Next, we need to make use of facts. Since swap of 2 numbers only affect current position's inverse pair number but not values after (point 1 and 2), we don't need to worry about the value change. Also, since the nearest smaller number to the right shares the same index in ip and a, we only need to find the first ip[j] that is smaller than ip[i] in [i + 1, n]. If we denote the number of swaps to get first i element sorted as f[i], we have f[i] = f[j] + 1.
But how to find this "first smaller number" fast? Use stack! Here is a post which asks a highly similar problem: Given an array A,compute B s.t B[i] stores the nearest element to the left of A[i] which is smaller than A[i]
In short, we are able to do this in O(n).
But wait, the post says "to the left" but in our case it's "to the right". The solution is simple: we do backward in our case, then everything the same :D
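The backward stack scan mentioned here can be sketched as follows. This is an illustrative helper only (the name `next_smaller_to_right` is not from the post); it uses 0-based indices and returns -1 where no strictly smaller value exists to the right.

```python
def next_smaller_to_right(values):
    """For each index i, return the nearest j > i with values[j] < values[i], or -1."""
    n = len(values)
    result = [-1] * n
    stack = []  # indices whose values are strictly increasing from bottom to top
    for i in range(n - 1, -1, -1):
        # discard candidates that are not strictly smaller than values[i]
        while stack and values[stack[-1]] >= values[i]:
            stack.pop()
        result[i] = stack[-1] if stack else -1
        stack.append(i)
    return result
```

Each index is pushed and popped at most once, which is where the O(n) bound comes from.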
Therefore, in summary, the total time complexity of the algorithm is O(n log n) + O(n) = O(n log n).
Finally, let's talk with an example (a simplified example of #make_lover's example in the comment):
a = {2, 5, 3, 4, 1, 6}, k = 2
First, let's get the inverse pairs:
ip = {1, 3, 1, 1, 0, 0}
To calculate f[i], we do backward (since we need to use the stack technique):
f[6] = 0, since it's the last one
f[5] = 0, since we could not find any number that is smaller than 0
f[4] = f[5] + 1 = 1, since ip[5] is the first smaller number to the right
f[3] = f[5] + 1 = 1, since ip[5] is the first smaller number to the right
f[2] = f[3] + 1 = 2, since ip[3] is the first smaller number to the right
f[1] = f[5] + 1 = 1, since ip[5] is the first smaller number to the right
Therefore, ans = f[1] + f[2] = 3
Note 1: Using Fenwick Tree (Binary Index Tree) to get inverse pair can be done in O(N log N), here is a post on this topic, please have a look :)
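As a rough illustration of Note 1, the per-index inverse pair counts ip[i] (the number of later elements strictly smaller than a[i]) can be computed with a Fenwick tree as below. This is a sketch under the assumption that values are first compressed to ranks 1..m; the function name `inversions_from_each_index` is hypothetical, and only the ip computation is shown, not the rest of the method.

```python
def inversions_from_each_index(a):
    """Return ip where ip[i] = number of j > i with a[j] < a[i], in O(n log n)."""
    # coordinate compression: map each distinct value to a rank in 1..m
    ranks = {v: r for r, v in enumerate(sorted(set(a)), start=1)}
    tree = [0] * (len(ranks) + 1)  # 1-based Fenwick tree of rank counts

    def update(pos):
        # record one occurrence of rank pos
        while pos < len(tree):
            tree[pos] += 1
            pos += pos & -pos

    def query(pos):
        # number of recorded ranks in 1..pos
        total = 0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    ip = [0] * len(a)
    for i in range(len(a) - 1, -1, -1):  # scan right to left
        r = ranks[a[i]]
        ip[i] = query(r - 1)  # elements already seen (to the right) with smaller rank
        update(r)
    return ip
```

With the array from the worked example, `inversions_from_each_index([2, 5, 3, 4, 1, 6])` reproduces the ip values listed there.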
Update
Aug/20/2014: There was a critical error in my previous post (thanks to #make_lover), here is the latest update.

Finding pairs with product greater than sum

Given as input, a sorted array of floats, I need to find the total number of pairs (i,j) such as A[i]*A[j]>=A[i]+A[j] for each i < j.
I already know the naive solution, using a loop inside other loop, which will give me O(n^2) algorithm, but i was wondering if there is a more optimal solution.
Here's an O(n) algorithm.
Let's look at A * B >= A + B.
When A, B <= 0, it's always true.
When A, B >= 2, it's always true.
When A >= 1, B <= 1 (or B >= 1, A <= 1), it's always false.
When 0 < A < 1, B < 0 (or 0 < B < 1, A < 0), it can be either true or false.
When 1 < A < 2, B > 0 (or 1 < B < 2, A > 0), it can be either true or false.
Here's a visualization, courtesy of Wolfram Alpha and Geobits:
Now, onto the algorithm.
* To find the pairs where one number is between 0 and 1 or 1 and 2 I do something similar to what is done for the 3SUM problem.
* "Pick 2" here is referring to combinations.
1. Count all the pairs where both are negative
   - Do a binary search to find the index of the first positive (> 0) number - O(log n).
   - Since we have the index, we know how many numbers are negative / zero, we simply need to pick 2 of them, so that's amountNonPositive * (amountNonPositive-1) / 2 - O(1).
2. Find all the pairs where one is between 0 and 1
   - Do a binary search to find the index of the last number < 1 - O(log n).
   - Start from that index as the right index and the left-most element as the left index.
   - Repeat this until the right index <= 0: (runs in O(n))
     - While the product is smaller than the sum, decrease the left index
     - Count all the elements greater than the left index
     - Decrease the right index
3. Find all the pairs where one is between 1 and 2
   - Do a binary search to find the index of the first number > 1 - O(log n).
   - Start from that index as the left index and the right-most element as the right index.
   - Repeat this until the left index >= 2: (runs in O(n))
     - While the product is greater than the sum, decrease the right index
     - Count all the elements greater than the right index
     - Increase the left index
4. Count all the pairs with both numbers >= 2
   - At the end of the last step, we're at the first index >= 2.
   - Now, from there, we just need to pick 2 of all the remaining numbers, so it's again amountGreaterEqual2 * (amountGreaterEqual2-1) / 2 - O(1).
You can find and print the pairs (in a shorthand form) in O(n log n).
For each A[i] there is a minimum number k that satisfies the condition (*).
All values greater than k will also satisfy the condition.
Finding the lowest j such that A[j] >= k using binary search is O(log n).
So you can find and print the result like this:
(i, j)
(1, no match)
(2, no match)
(3, >=25)
(4, >=20)
(5, >=12)
(6, >6)
(7, >7)
...
(n-1, n)
If you want to print all combinations, then it is O(n^2), because the number of combinations are O(n^2).
(*) To handle negative numbers it actually needs to be a bit more complex, because the numbers that satisfy the equation can form more than one range.
I'm not absolutely sure how it behaves for small negative numbers, but if the number of ranges is not absolutely limited then my solution is no longer better than O(n^2).
Here's a binary search, O(n log n):
There's a breaking point for each number at A*B = A+B. You can reduce this to B = A / (A - 1). All numbers on one side or the other will fit it. It doesn't matter if there are negative numbers, etc.
If A < 1, then all numbers <= B fit.
If A > 1, then all numbers >= B fit.
If A == 1, then there is no match(divide by zero).
(Wolfram Alpha link)
So some pseudocode:
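loop through i
    a = A[i]
    if(a == 1)
        continue
    if(a >= 2)
        count += A.length - i - 1   // every later element matches; do not count i itself (0-based i)
        continue
    // for a < 1: last index with A[j] <= bound; for a > 1: first index with A[j] >= bound
    j = binsearch(a / (a-1))
    if(j <= i)
        continue
    if(a < 1)
        count += j-i
    if(a > 1)
        count += A.length - j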
Here's a O(n) algorithm that solves the problem when the array's elements are positive.
When the elements are positive, we can say that:
If A[i]*A[j] >= A[i]+A[j] when j>i then A[k]*A[j] >= A[k]+A[j] for any k that satisfies k>i (because the array is sorted).
If A[i]*A[j] < A[i]+A[j] when j>i then A[i]*A[k] < A[i]+A[k] for any k that satisfies k<j.
(these facts don't hold when both numbers are fractions, but then the condition won't be satisfied anyway)
Thus we can perform the following algorithm:
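int findNumOfPairs(float[] A)
{
    int start = 0;
    int end = A.length - 1;
    int numOfPairs = 0;
    // start < end (rather than start != end) also terminates for an empty array
    while (start < end)
    {
        if (A[start]*A[end] >= A[start]+A[end])
        {
            // every element in A[start..end-1] also pairs with A[end]
            numOfPairs += end - start;
            end--;
        }
        else
        {
            start++;
        }
    }
    return numOfPairs;
}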
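The same two-pointer sweep can be pushed a little further to cover values below 1. The sketch below is not the code from the answer above: it relies on the rewrite A*B >= A+B <=> (A-1)*(B-1) >= 1, which means a pair can only qualify if both values are above 1 or both are below 1, and it runs the sweep once on each side. The name `count_product_ge_sum` and the list-splitting approach are illustrative, and the input is assumed to be sorted in ascending order.

```python
def count_product_ge_sum(sorted_values):
    """Count pairs i < j with a[i]*a[j] >= a[i]+a[j] in an ascending-sorted list."""

    def sweep(group):
        # group is ordered so that moving right makes a pair "more likely" to qualify
        count = 0
        lo, hi = 0, len(group) - 1
        while lo < hi:
            if group[lo] * group[hi] >= group[lo] + group[hi]:
                # group[lo..hi-1] all qualify together with group[hi]
                count += hi - lo
                hi -= 1
            else:
                lo += 1
        return count

    # values equal to 1 never qualify; mixed pairs (one above 1, one below) never qualify
    above = [v for v in sorted_values if v > 1]        # ascending: further from 1 to the right
    below = [v for v in sorted_values if v < 1][::-1]  # descending: further from 1 to the right
    return sweep(above) + sweep(below)
```

Both sweeps are linear, so the whole count stays O(n) once the array is sorted. Exact ties in floating point (product equal to sum) are decided by the comparison as written.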
How about excluding all floats that are less than 1.0 first, since multiplying any number by a number less than 1 makes it smaller (for example x*0.3 < x)? Then we only need to count the remaining numbers in the array to calculate the number of pairs (i, j) with i < j, and we can use the combination formula for that: n(n-1)/2.

Finding largest from each subarray of length k

Interview Question :- Given an array and an integer k , find the maximum for each and every contiguous sub array of size k.
Sample Input :
1 2 3 1 4 5 2 3 6
3 [ value of k ]
Sample Output :
3
3
4
5
5
5
6
I can't think of anything better than brute force. Worst case is O(nk) when the array is sorted in decreasing order.
Just iterate over the array and keep k last elements in a self-balancing binary tree.
Adding element to such tree, removing element and finding current maximum costs O(logk).
Most languages provide standard implementations for such trees. In the C++ STL, IIRC, it's std::multiset. In Java you'd use TreeMap (a map, because you need to keep count of how many times each element occurs, and Java doesn't provide Multi- collections).
Pseudocode
for (int i = 0; i < n; ++i) {
    tree.add(a[i]);
    if (tree.size() > k) {
        tree.remove(a[i - k]);
    }
    if (tree.size() == k) {
        print(tree.max());
    }
}
You can actually do this in O(n) time with O(n) space.
Split the array into blocks of k elements each.
[a1 a2 ... ak] [a(k+1) ... a2k] ...
For each block, maintain two more blocks, the left block and the right block.
The ith element of the left block will be the max of the i elements from the left.
The ith element of the right block will be the max of the i elements from the right.
You will have two such blocks for each block of k.
Now if you want to find the max in range a[i... i+k], say the elements span two of the above blocks of k.
[j-k+1 ... i i+1 ... j] [j+1 ... i+k ... j+k]
All you need to do is find the max of RightMax of i to j of the first block and the left max of j+1 to i+k of the second block.
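The block idea above can be sketched in Python like this. It is a minimal sketch, assuming 0-based indexing and windows a[i..i+k-1]; the names `sliding_window_max`, `left` and `right` are chosen here for illustration, with `left[i]` holding the maximum from the start of i's block up to i and `right[i]` the maximum from i to the end of its block.

```python
def sliding_window_max(a, k):
    """Return the maximum of every contiguous window of size k, in O(n)."""
    n = len(a)
    if k <= 0 or k > n:
        return []
    left = [0] * n
    right = [0] * n
    for i in range(n):
        # restart the running maximum at every block boundary
        left[i] = a[i] if i % k == 0 else max(left[i - 1], a[i])
    for i in range(n - 1, -1, -1):
        # a block ends at the last index or just before the next multiple of k
        if i == n - 1 or (i + 1) % k == 0:
            right[i] = a[i]
        else:
            right[i] = max(right[i + 1], a[i])
    # a window spans at most two blocks: the tail of one and the head of the next
    return [max(right[i], left[i + k - 1]) for i in range(n - k + 1)]
```

On the sample input from the question, `sliding_window_max([1, 2, 3, 1, 4, 5, 2, 3, 6], 3)` yields the seven values of the sample output.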
Hope this is the solution which you are looking for:
def MaxContigousSum(lst, n):
    # m[i] is the best sum of a contiguous run ending at index i (0 if negative)
    m = [0]
    if lst[0] > 0:
        m[0] = lst[0]
    maxsum = m[0]
    for i in range(1, n):
        if m[i - 1] + lst[i] > 0:
            m.append(m[i - 1] + lst[i])
        else:
            m.append(0)
        if m[i] > maxsum:
            maxsum = m[i]
    return maxsum

lst = [-2, 11, -4, 13, -5, 2, 1, -3, 4, -2, -1, -6, -9]
print(MaxContigousSum(lst, len(lst)))
**Output**
20 for [11, -4, 13]

Resources