Determining the pairs of integers that sum to some value in the array - algorithm

I have a program which counts the number of pairs among N integers that sum to a given value. To simplify the problem, assume also that the integers are distinct.
l.Sort();
for (int i = 0; i < l.Count; ++i)
{
    int j = l.BinarySearch(value - l[i]);
    if (j > i)
    {
        Console.WriteLine("{0} {1}", i + 1, j + 1);
    }
}
To solve the problem, we sort the array (to enable binary search) and then, for every entry a[i] in the array, do a binary search for value - a[i]. If the result is an index j with j > i, we show this pair.
But this algorithm doesn't work on the following input:
1 2 3 4 4 9 56 90, because j is always smaller than i.
How can I fix that?

I would go with a more efficient solution that needs more space.
Assume that the numbers are not distinct.
Create a hash table with your integers as keys and their frequencies as values.
Iterate over this hash table.
For each key k:
    calculate diff = value - k
    look up diff in the hash table
    if there is a match, check whether its frequency is > 0
    if the frequency is > 0, decrement it by 1 and yield the current pair k, diff
Here is the Python code:
def count_pairs(arr, value):
    # Build a frequency table: integer -> number of occurrences.
    hsh = {}
    for k in arr:
        cnt = hsh.get(k, 0)
        hsh[k] = cnt + 1
    for k in arr:
        diff = value - k
        # Default to 0 so that a missing key compares cleanly in Python 3.
        cnt = hsh.get(diff, 0)
        if cnt > 0:
            hsh[k] -= 1
            print("Pair detected: " + str(k) + " and " + str(diff))

count_pairs([4, 2, 3, 4, 9, 1, 5, 4, 56, 90], 8)
#=> Pair detected: 4 and 4
#=> Pair detected: 3 and 5
#=> Pair detected: 4 and 4
#=> Pair detected: 4 and 4
Since "counts the number of pairs" is a rather vague description, note that here you can see 4 pairs that are distinct by the numbers' indices.
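If only the count is needed, the same frequency-table idea can be sketched as follows. This is a minimal sketch, not the answerer's code: the name count_index_pairs is made up here, and it assumes that a pair means two different positions i < j, so an element is never paired with itself.

```python
from collections import Counter

def count_index_pairs(arr, value):
    """Count pairs of positions (i, j), i < j, with arr[i] + arr[j] == value."""
    freq = Counter(arr)
    total = 0
    for k, cnt in freq.items():
        diff = value - k
        if diff == k:
            # Both members are the same number: choose 2 of its cnt copies.
            total += cnt * (cnt - 1) // 2
        elif diff > k and diff in freq:
            # Count each unordered {k, diff} once, from the smaller side.
            total += cnt * freq[diff]
    return total

print(count_index_pairs([4, 2, 3, 4, 9, 1, 5, 4, 56, 90], 8))  # prints 4
```

The three 4s contribute three pairs and 3 + 5 contributes one, which agrees with the four pairs listed above.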

If you want this to work for non-distinct values (which your
question does not say, but your comment implies), binary search only the
portion of the array after i. This also eliminates the need for the
if (j > i) test.
Would show the code, but I don't know how to specify such a slice in
whatever language you're using.
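For illustration, here is a rough Python sketch of that idea (the question's code is C#, so this is a translation of the approach, not of the original program). The function name pairs_with_sum is hypothetical; bisect_left accepts a lower bound, which restricts the search to the part of the list after i without copying a slice.

```python
from bisect import bisect_left

def pairs_with_sum(values, target):
    """For each position i, look for target - a[i] only to the right of i."""
    a = sorted(values)
    found = []
    for i in range(len(a)):
        want = target - a[i]
        # Search a[i+1:] only; lo=i+1 makes an explicit j > i test unnecessary.
        j = bisect_left(a, want, i + 1)
        if j < len(a) and a[j] == want:
            found.append((a[i], a[j]))
    return found

print(pairs_with_sum([1, 2, 3, 4, 4, 9, 56, 90], 8))  # prints [(4, 4)]
```

Note that this reports at most one partner per left element, so with many repeated values it lists fewer pairs than a count by index would.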

Related

Find the count of greater numbers on the left side and smaller numbers on the right side

Consider an array a of n integers, indexed from 1 to n.
For every index i such that 1<i<n, define:
count_left(i) = number of indices j such that 1 <= j < i and a[j] > a[i];
count_right(i) = number of indices j such that i < j <= n and a[j] < a[i];
diff(i) = abs(count_left(i) - count_right(i)).
The problem is: given array a, find the maximum possible value of diff(i) for 1 < i < n.
I have a brute-force solution. Can anyone give a better one?
Constraint: 3 < n <= 10^5
Example
Input Array: [3, 6, 9, 5, 4, 8, 2]
Output: 4
Explanation:
diff(2) = abs(0 - 3) = 3
diff(3) = abs(0 - 4) = 4
diff(4) = abs(2 - 2) = 0
diff(5) = abs(3 - 1) = 2
diff(6) = abs(1 - 1) = 0
maximum is 4.
O(nlogn) approach:
Walk through the array from left to right and add every element to an augmented binary search tree (RB, AVL, etc.) whose nodes contain fields for subtree size, initial index and a temporary rank. So immediately after adding an element we know its rank in the current tree state.
lb = index - temprank
is the number of bigger elements on the left (the rank is remembered in the temprank field).
After filling the tree with all items, traverse the tree again, retrieving the final rank of each element.
rs = finalrank - temprank
is the number of smaller elements on the right. Now just take the absolute difference of lb and rs:
diff = abs(lb - rs) = abs(index - temprank - finalrank + temprank) = abs(index - finalrank)
But ... we can see that we don't need temprank at all.
Moreover, we don't need a binary tree!
Just sort the pairs (element; initial index) by element and take the maximum absolute difference of new_index - old_index (except for old indices 1 and n).
a    3  6  9  5  4  8  2
old     2  3  4  5  6
new     5  7  4  3  6
dif     3  4  0  2  0
Python code for concept checking
a = [3, 6, 9, 5, 4, 8, 2]
b = sorted([[e,i] for i,e in enumerate(a)])
print(b)
print(max([abs(n-o[1]) if 0<o[1]<len(a)-1 else 0 for n,o in enumerate(b)]))
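As a sanity check of the sorting shortcut, it can be compared against the brute-force definition. The sketch below is only illustrative: both function names are made up here, indices are 0-based, and the elements are assumed to be distinct (with repeated values the rank argument above no longer applies directly).

```python
def max_diff_brute(a):
    """O(n^2) reference: apply the definition to every interior index."""
    n = len(a)
    best = 0
    for i in range(1, n - 1):
        left_bigger = sum(1 for j in range(i) if a[j] > a[i])
        right_smaller = sum(1 for j in range(i + 1, n) if a[j] < a[i])
        best = max(best, abs(left_bigger - right_smaller))
    return best

def max_diff_sorted(a):
    """O(n log n): compare each element's index with its rank after sorting."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    return max(abs(rank - i) for rank, i in enumerate(order)
               if 0 < i < len(a) - 1)

a = [3, 6, 9, 5, 4, 8, 2]
print(max_diff_brute(a), max_diff_sorted(a))  # prints 4 4
```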

Counting Sort - Why go in reverse order during the insertion?

I was looking at the code for Counting Sort on GeeksForGeeks and during the final stage of the algorithm where the elements from the original array are inserted into their final locations in the sorted array (the second-to-last for loop), the input array is traversed in reverse order.
I can't seem to understand why you can't just go from the beginning of the input array to the end, like so :
for i in range(len(arr)):
    output_arr[count_arr[arr[i] - min_element] - 1] = arr[i]
    count_arr[arr[i] - min_element] -= 1
Is there some subtle reason for going in reverse order that I'm missing? Apologies if this is a very obvious question. I saw Counting Sort implemented in the same style here as well.
Any comments would be helpful, thank you!
Stability. With your way, the order of equal-valued elements gets reversed instead of preserved. Going over the input backwards cancels out the backwards copying (that -= 1 thing).
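The effect is easiest to see on records that carry a label next to the key. The following is a small sketch written for this illustration (the function counting_sort_by_key and its reverse_pass flag are not from the GeeksForGeeks code); it runs the same decrement-then-place step once backwards and once forwards.

```python
def counting_sort_by_key(records, key_range, reverse_pass):
    """Counting sort of (key, label) records; keys are ints in range(key_range)."""
    count = [0] * key_range
    for key, _ in records:
        count[key] += 1
    for k in range(1, key_range):
        count[k] += count[k - 1]      # count[k] = number of keys <= k
    output = [None] * len(records)
    order = reversed(records) if reverse_pass else records
    for rec in order:
        count[rec[0]] -= 1            # slots for a key are filled from the back
        output[count[rec[0]]] = rec
    return output

records = [(1, "a"), (0, "b"), (1, "c"), (0, "d")]
print(counting_sort_by_key(records, 2, reverse_pass=True))
# [(0, 'b'), (0, 'd'), (1, 'a'), (1, 'c')]  equal keys keep their input order
print(counting_sort_by_key(records, 2, reverse_pass=False))
# [(0, 'd'), (0, 'b'), (1, 'c'), (1, 'a')]  still sorted, but equal keys reversed
```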
To process the array in forward order, either make the count / index array one element larger so that the starting index is 0, or use two local variables. Example for an integer array:
def countSort(arr):
    output = [0 for i in range(len(arr))]
    count = [0 for i in range(257)]        # change
    for i in arr:
        count[i+1] += 1                    # change
    for i in range(256):
        count[i+1] += count[i]             # change
    for i in range(len(arr)):
        output[count[arr[i]]] = arr[i]     # change
        count[arr[i]] += 1                 # change
    return output

arr = [4,3,0,1,3,7,0,2,6,3,5]
ans = countSort(arr)
print(ans)
or using two variables, s to hold the running sum, c to hold the current count:
def countSort(arr):
    output = [0 for i in range(len(arr))]
    count = [0 for i in range(256)]
    for i in arr:
        count[i] += 1
    s = 0
    for i in range(256):
        c = count[i]
        count[i] = s
        s = s + c
    for i in range(len(arr)):
        output[count[arr[i]]] = arr[i]
        count[arr[i]] += 1
    return output

arr = [4,3,0,1,3,7,0,2,6,3,5]
ans = countSort(arr)
print(ans)
Here we are considering a stable sort, i.e. one that keeps equal elements in their original relative order.
For example, if we have an array like
arr:    5, 8, 3, 1, 1, 2, 6
index:  0 1 2 3 4 5 6 7 8
count:  0 2 1 1 0 1 1 0 1
Now we take the cumulative sum of all frequencies:
index:  0 1 2 3 4 5 6 7 8
count:  0 2 3 4 4 5 6 6 7
When placing the elements, we traverse the original array from the last element, because each placement decrements the count for that value: the first element placed lands in the last slot reserved for its value, and the next equal element lands one slot earlier. Starting from the end therefore puts later duplicates in later positions.
If we traverse from the beginning instead, the cumulative sums still send every element into the block of slots reserved for its value, so the output is sorted, but equal elements end up in reverse order and the sort is no longer stable.

Maximum Gcd and Sum

You are given two arrays A and B containing n elements each. Choose a pair of elements (x, y) such that:
• x belongs to Array A
• y belongs to Array B
• GCD(x, y) is the maximum of all pairs (x, y).
If there is more than one such pair having maximum gcd, then choose the one with maximum sum. Print the sum of elements of this maximum-sum pair.
This is a question from HackerRank Week of Code 34.
from math import gcd  # fractions.gcd was removed in Python 3.9
from itertools import product

n = int(input().strip())  # both arrays have length n
A = set(map(int, input().strip().split(' ')))  # array 1
B = set(map(int, input().strip().split(' ')))  # array 2
output_sum = []
output_GCD = []
c = list(product(A, B))
for i in c:
    temp1 = i[0]
    temp2 = i[1]
    sum_two = temp1 + temp2
    temp3 = gcd(temp1, temp2)
    output_GCD.append(temp3)
    output_sum.append(temp1 + temp2)
temp = []
for i in range(len(output_GCD)):
    if output_GCD[i] == max(output_GCD):
        temp.append(output_sum[i])
print(max(temp))
This solution works for small inputs, but it times out on most of the test cases. Please help me improve it.
You can calculate the set a_divisors of all divisors of the elements of A in the following way:
# Sketch only: A is the input array and 10**6 is the assumed upper bound on its values
present = set(A)
a_divisors = set()
for i in range(1, 10**6 + 1):
    # walk over the multiples of i: i, 2i, 3i, ...
    for j in range(i, 10**6 + 1, i):
        if j in present:
            a_divisors.add(i)
            break
Then you can construct the same set b_divisors for B and choose the largest value common to both.
For example:
5
3 1 4 2 8
5 2 12 8 3
produce arrays of divisors:
a: 1, 2, 3, 4, 8
b: 1, 2, 3, 4, 5, 6, 8, 12
Common maximum is: 8
If you know that the maximum gcd is 8, then you just choose the largest value from A that is divisible by 8 and the largest such value from B: 8 + 8 = 16
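The divisor idea can be tried out on the sample input with a short sketch. It is an illustration under assumptions, not the answerer's code: the helper names are invented here, all values are assumed to be positive integers, and the search bound is the largest input value instead of a fixed limit.

```python
def divisors_present(values, limit):
    """Set of d in 1..limit that divide at least one element of values."""
    present = set(values)
    return {d for d in range(1, limit + 1)
            if any(m in present for m in range(d, limit + 1, d))}

def max_gcd_and_sum(A, B):
    limit = max(max(A), max(B))
    # The largest divisor shared by some x in A and some y in B is the maximum gcd.
    g = max(divisors_present(A, limit) & divisors_present(B, limit))
    # Pick the largest element of each array that is a multiple of g.
    return max(x for x in A if x % g == 0) + max(y for y in B if y % g == 0)

A = [3, 1, 4, 2, 8]
B = [5, 2, 12, 8, 3]
print(sorted(divisors_present(A, 12)))  # [1, 2, 3, 4, 8]
print(sorted(divisors_present(B, 12)))  # [1, 2, 3, 4, 5, 6, 8, 12]
print(max_gcd_and_sum(A, B))            # 16
```

Walking over multiples costs roughly limit * (1 + 1/2 + 1/3 + ...) set lookups, which is why this scales better than testing every pair.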
You must convert A and B to sets (to make lookups fast):
def maximumGcdAndSum(A, B):
    A = set(A)
    B = set(B)
    max_nbr = max(max(A), max(B))
    i = max_nbr
    while i > 0:  # try each candidate gcd i, starting from the largest number
        i_pow = i  # runs over the multiples of i: i, 2i, 3i, ...
        maxa = maxb = 0
        while i_pow <= max_nbr:  # '<=' is a must here
            if i_pow in A:
                maxa = i_pow  # largest multiple of i found in A so far
            if i_pow in B:
                maxb = i_pow  # largest multiple of i found in B so far
            i_pow += i
        if maxa and maxb:
            return maxa + maxb  # if both found, stop algorithm
        i -= 1
    return 0

Why does this maximum product subarray algorithm work?

The problem is to find the contiguous subarray within an array (containing at least one number) which has the largest product.
For example, given the array [2,3,-2,4],
the contiguous subarray [2,3] has the largest product 6.
Why does the following work? Can anyone provide any insight on how to prove its correctness?
if (nums == null || nums.Length == 0)
{
    throw new ArgumentException("Invalid input");
}
int max = nums[0];
int min = nums[0];
int result = nums[0];
for (int i = 1; i < nums.Length; i++)
{
    int prev_max = max;
    int prev_min = min;
    max = Math.Max(nums[i], Math.Max(prev_max * nums[i], prev_min * nums[i]));
    min = Math.Min(nums[i], Math.Min(prev_max * nums[i], prev_min * nums[i]));
    result = Math.Max(result, max);
}
return result;
Start from the logic side to understand how to solve the problem. There are two relevant traits of each subarray to consider:
If it contains a 0, the product of the subarray is 0 as well.
If the subarray contains an odd number of negative values, its total value is negative as well, otherwise positive (or 0, considering 0 as a positive value).
Now we can start off with the algorithm itself:
Rule 1: zeros
Since a 0 zeros out the product of the subarray, the subarray of the solution mustn't contain a 0, unless no zero-free subarray has a positive product (for example, when single negative values are separated by zeros). This is achieved quite simply, since max and min are both reset to 0 as soon as a 0 is encountered in the array:
max = Math.Max(0, Math.Max(prev_max * 0, prev_min * 0));
min = Math.Min(0, Math.Min(prev_max * 0, prev_min * 0));
Both expressions evaluate to 0, no matter what the input so far is.
arr:    1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 0
result: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
min:    1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
max:    1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 0
//non-zero values don't matter for Rule 1, so I just used 1
Rule 2: negative numbers
With Rule 1, we've already implicitly split the array into subarrays, such that a subarray consists of either a single 0 or multiple non-zero values. Now the task is to find the largest possible product inside that subarray (I'll refer to that as array from here on).
If the number of negative values in the array is even, the entire problem becomes pretty trivial: just multiply all values in the array and the result is the maximum product of the array. For an odd number of negative values there are two possible cases:
The array contains only a single negative value: In that case either the subarray with all values with a smaller index than the negative value or the subarray with all values with a larger index than the negative value is the subarray with the maximum product.
The array contains at least 3 negative values: In that case we have to eliminate either the first negative number and all of its predecessors, or the last negative number and all of its successors.
Now let's have a look at the code:
max = Math.Max(nums[i] , Math.Max(prev_max * nums[i] , prev_min * nums[i]));
min = Math.Min(nums[i] , Math.Min(prev_max * nums[i] , prev_min * nums[i]));
Case 1: the evaluation of min is actually irrelevant, since the sign of the product of the array flips only once, at the negative value. As soon as the negative number is encountered (= nums[i]), max will be nums[i], since both max and min are at least 1 and thus multiplication with nums[i] results in a number <= nums[i]. And for the first number after the negative number, nums[i + 1], max will be nums[i + 1] again. Since the maximum found so far is persisted in result (result = Math.Max(result, max);) after each step, this automatically yields the correct result for that array.
arr: 2 3 2 -4 4 5
result: 2 6 12 12 12 20
max: 2 6 12 -4 4 20
//Omitted min, since it's irrelevant here.
Case 2: Here min becomes relevant too. Before we encounter the first negative value, min is the smallest product of a subarray ending at the current element (for values >= 1 that is the current element itself). After we encounter the first negative element in the array, the value turns negative. We continue to build both products (min and max), swap them each time a negative value is encountered, and keep updating result. When the last negative value of the array is encountered, result will hold the value of the subarray that eliminates the last negative value and its successors. After the last negative value, max will be the product of the subarray that eliminates the first negative value and its predecessors, and min becomes irrelevant. Now we simply continue to multiply max with the remaining values in the array and update result until the end of the array is reached.
arr:    2  3  -4   3   -2    5     -6    3
result: 2  6   6   6  144  720    720  720
min:    2  3 -24 -72   -6  -30  -4320  ...
max:    2  6  -4   3  144  720    180  540
//min becomes irrelevant after the last negative value
Putting the pieces together
Since min and max are reset every time we encounter a 0, we can easily reuse them for each subarray that doesn't contain a 0. Thus Rule 1 is applied implicitly without interfering with Rule 2. Since result isn't reset each time a new subarray is inspected, the value will be kept persistent over all runs. Thus this algorithm works.
Hope this is understandable (To be honest, I doubt it and will try to improve the answer, if any questions appear). Sry for that monstrous answer.
Let's assume that the contiguous subarray which produces the maximal product is a[i], a[i+1], ..., a[j]. Since it is the subarray with the largest product, it is also the suffix of a[0], a[1], ..., a[j] that produces the largest product.
The idea of your given algorithm is the following: For every prefix array a[0], ..., a[j], find the suffix with the largest product. Out of these suffixes, take the maximal one.
At the beginning, the smallest and biggest suffix products are simply nums[0]. Then it iterates over all other numbers in the array. The largest suffix product is always built in one of three ways: it's just the last number nums[i], it's the largest suffix product of the shortened list multiplied by the last number (if nums[i] > 0), or it's the smallest (< 0) suffix product multiplied by the last number (if nums[i] < 0). (*)
Using the helper variable result, you store the maximal such suffix product found so far.
(*) This fact is quite easy to prove. If you had a different case, for instance a different suffix product that produced a bigger number, then together with the last number nums[i] you would create an even bigger suffix product, which would be a contradiction.
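One way to build confidence in the argument is to port the loop and compare it with an exhaustive search. The Python sketch below is a direct translation of the C# snippet from the question plus a brute-force reference written for this comparison; the names max_product and max_product_brute are chosen here.

```python
def max_product(nums):
    """Largest product of a non-empty contiguous subarray (one pass)."""
    if not nums:
        raise ValueError("Invalid input")
    hi = lo = result = nums[0]
    for x in nums[1:]:
        # Best and worst products of a subarray that ends at x.
        candidates = (x, hi * x, lo * x)
        hi, lo = max(candidates), min(candidates)
        result = max(result, hi)
    return result

def max_product_brute(nums):
    """O(n^2) reference: try every contiguous subarray."""
    best = nums[0]
    for i in range(len(nums)):
        prod = 1
        for j in range(i, len(nums)):
            prod *= nums[j]
            best = max(best, prod)
    return best

for arr in ([2, 3, -2, 4], [2, 3, 2, -4, 4, 5], [-2, 0, -1]):
    print(arr, max_product(arr), max_product_brute(arr))
```

For each array the two printed values should agree; a mismatch would point to a case the suffix-product argument does not cover.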

Linked list algorithm to find pairs adding up to 10

Can you suggest an algorithm that finds all pairs of nodes in a linked list that add up to 10?
I came up with the following.
Algorithm: Compare each node, starting with the second node, with each node from the head node up to the previous node (previous to the current node being compared) and report all such pairs.
I think this algorithm should work; however, it's certainly not the most efficient one, having a complexity of O(n^2).
Can anyone hint at a solution which is more efficient (perhaps one that takes linear time)? Additional or temporary nodes can be used by such a solution.
If their range is limited (say between -100 and 100), it's easy.
Create an array quant[-100..100] then just cycle through your linked list, executing:
quant[value] = quant[value] + 1
Then the following loop will do the trick.
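for i = -90 to 4:    # keeps j = 10 - i inside -100..100 and reports each pair once (i < j)
    j = 10 - i
    for k = 1 to quant[i] * quant[j]:
        output i, " ", j
# the pair (5, 5) needs separate handling, since it uses the same count twice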
Even if their range isn't limited, you can have a more efficient method than what you proposed, by sorting the values first and then just keeping counts rather than individual values (same as the above solution).
This is achieved by running two pointers, one at the start of the list and one at the end. When the numbers at those pointers add up to 10, output them and move the end pointer down and the start pointer up.
When they're greater than 10, move the end pointer down. When they're less, move the start pointer up.
This relies on the sorted nature. Less than 10 means you need to make the sum higher (move start pointer up). Greater than 10 means you need to make the sum less (end pointer down). Since there are no duplicates in the list (because of the counts), being equal to 10 means you move both pointers.
Stop when the pointers pass each other.
There's one more tricky bit and that's when the pointers are equal and the value sums to 10 (this can only happen when the value is 5, obviously).
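You don't output the number of pairs based on the product of the two counts; rather, it's based on the product of the count minus 1 with itself. That's because a value 5 with a count of 1 doesn't actually sum to 10 (since there's only one 5). Note that (count - 1) * (count - 1) equals the number of distinct pairs, count * (count - 1) / 2, only for counts of 1 and 2.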
So, for the list:
2 3 1 3 5 7 10 -1 11
you get:
Index   a  b  c  d  e  f   g   h
Value  -1  1  2  3  5  7  10  11
Count   1  1  1  2  1  1   1   1
You start pointer p1 at a and p2 at h. Since -1 + 11 = 10, you output those two numbers (as above, you do it N times where N is the product of the counts). That's one copy of (-1,11). Then you move p1 to b and p2 to g.
1 + 10 > 10 so leave p1 at b, move p2 down to f.
1 + 7 < 10 so move p1 to c, leave p2 at f.
2 + 7 < 10 so move p1 to d, leave p2 at f.
3 + 7 = 10, output two copies of (3,7) since the count of d is 2, move p1 to e, p2 to e.
5 + 5 = 10 but p1 = p2 so the product is 0 times 0 or 0. Output nothing, move p1 to f, p2 to d.
Loop ends since p1 > p2.
Hence the overall output was:
(-1,11)
( 3, 7)
( 3, 7)
which is correct.
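The walk-through above can be reproduced with a compact sketch. This is an illustration with invented names, not the answerer's test program; in particular, for the midpoint case it counts n * (n - 1) / 2 pairs for a value with count n, which is an assumption of this sketch (it gives 0 pairs for a single 5 and 1 pair for two of them).

```python
from collections import Counter

def pairs_summing_to(values, target):
    """List every unordered pair of elements (by position) summing to target."""
    counts = sorted(Counter(values).items())   # [(value, count), ...] ascending
    lo, hi = 0, len(counts) - 1
    out = []
    while lo <= hi:
        (a, ca), (b, cb) = counts[lo], counts[hi]
        if a + b < target:
            lo += 1                            # sum too small: raise the low end
        elif a + b > target:
            hi -= 1                            # sum too large: lower the high end
        else:
            # Same value on both sides: choose 2 of its ca copies.
            n = ca * (ca - 1) // 2 if lo == hi else ca * cb
            out.extend([(a, b)] * n)
            lo += 1
            hi -= 1
    return out

print(pairs_summing_to([2, 3, 1, 3, 5, 7, 10, -1, 11], 10))
# [(-1, 11), (3, 7), (3, 7)]
```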
Here's some test code. You'll notice that I've forced 7 (the midpoint) to a specific value for testing. Obviously, you wouldn't do this.
#include <stdio.h>
#include <stdlib.h>  // srand, rand
#include <time.h>    // time

#define SZSRC 30
#define SZSORTED 20
#define SUM 14

int main (void) {
    int i, s, e, prod;
    int srcData[SZSRC];
    int sortedVal[SZSORTED];
    int sortedCnt[SZSORTED];

    // Make some random data.
    srand (time (0));
    for (i = 0; i < SZSRC; i++) {
        srcData[i] = rand() % SZSORTED;
        printf ("srcData[%2d] = %5d\n", i, srcData[i]);
    }

    // Convert to value/size array.
    for (i = 0; i < SZSORTED; i++) {
        sortedVal[i] = i;
        sortedCnt[i] = 0;
    }
    for (i = 0; i < SZSRC; i++)
        sortedCnt[srcData[i]]++;

    // Force 7+7 to specific count for testing.
    sortedCnt[7] = 2;
    for (i = 0; i < SZSORTED; i++)
        if (sortedCnt[i] != 0)
            printf ("Sorted [%3d], count = %3d\n", i, sortedCnt[i]);

    // Start and end pointers.
    s = 0;
    e = SZSORTED - 1;

    // Loop until they overlap.
    while (s <= e) {
        // Equal to desired value?
        if (sortedVal[s] + sortedVal[e] == SUM) {
            // Get product (note special case at midpoint).
            prod = (s == e)
                ? (sortedCnt[s] - 1) * (sortedCnt[e] - 1)
                : sortedCnt[s] * sortedCnt[e];

            // Output the right count.
            for (i = 0; i < prod; i++)
                printf ("(%3d,%3d)\n", sortedVal[s], sortedVal[e]);

            // Move both pointers and continue.
            s++;
            e--;
            continue;
        }

        // Less than desired, move start pointer.
        if (sortedVal[s] + sortedVal[e] < SUM) {
            s++;
            continue;
        }

        // Greater than desired, move end pointer.
        e--;
    }
    return 0;
}
You'll see that the code above is all O(n) since I'm not sorting in this version, just intelligently using the values as indexes.
If the minimum is below zero (or very high to the point where it would waste too much memory), you can just use a minVal to adjust the indexes (another O(n) scan to find the minimum value and then just use i-minVal instead of i for array indexes).
And, even if the range from low to high is too expensive on memory, you can use a sparse array. You'll have to sort it, O(n log n), and search it for updating counts, also O(n log n), but that's still better than the original O(n^2). The reason the binary search is O(n log n) is because a single search would be O(log n) but you have to do it for each value.
And here's the output from a test run, which shows you the various stages of calculation.
srcData[ 0] = 13
srcData[ 1] = 16
srcData[ 2] = 9
srcData[ 3] = 14
srcData[ 4] = 0
srcData[ 5] = 8
srcData[ 6] = 9
srcData[ 7] = 8
srcData[ 8] = 5
srcData[ 9] = 9
srcData[10] = 12
srcData[11] = 18
srcData[12] = 3
srcData[13] = 14
srcData[14] = 7
srcData[15] = 16
srcData[16] = 12
srcData[17] = 8
srcData[18] = 17
srcData[19] = 11
srcData[20] = 13
srcData[21] = 3
srcData[22] = 16
srcData[23] = 9
srcData[24] = 10
srcData[25] = 3
srcData[26] = 16
srcData[27] = 9
srcData[28] = 13
srcData[29] = 5
Sorted [ 0], count = 1
Sorted [ 3], count = 3
Sorted [ 5], count = 2
Sorted [ 7], count = 2
Sorted [ 8], count = 3
Sorted [ 9], count = 5
Sorted [ 10], count = 1
Sorted [ 11], count = 1
Sorted [ 12], count = 2
Sorted [ 13], count = 3
Sorted [ 14], count = 2
Sorted [ 16], count = 4
Sorted [ 17], count = 1
Sorted [ 18], count = 1
( 0, 14)
( 0, 14)
( 3, 11)
( 3, 11)
( 3, 11)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 7, 7)
Create a hash set (HashSet in Java) (could use a sparse array if your numbers are well-bounded, i.e. you know they fall into +/- 100)
For each node, first check if 10-n is in the set. If so, you have found a pair. Either way, then add n to the set and continue.
So for example you have
1 - 6 - 3 - 4 - 9
1 - is 9 in the set? Nope
6 - 4? No.
3 - 7? No.
4 - 6? Yup! Print (6,4)
9 - 1? Yup! Print (9,1)
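The trace above translates into a short single-pass sketch. The Node class and the function name are hypothetical, written only for this example; each pair is reported as (earlier value, current value).

```python
class Node:
    """Minimal singly linked list node (hypothetical, not from the question)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def pairs_adding_to(head, target=10):
    seen = set()      # values of the nodes visited so far
    pairs = []
    node = head
    while node is not None:
        if target - node.value in seen:
            pairs.append((target - node.value, node.value))
        seen.add(node.value)
        node = node.next
    return pairs

# Build 1 -> 6 -> 3 -> 4 -> 9
head = None
for v in reversed([1, 6, 3, 4, 9]):
    head = Node(v, head)
print(pairs_adding_to(head))  # [(6, 4), (1, 9)]
```

Because a set stores each value once, repeated values are reported once per later occurrence of the partner, not once per pair of positions.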
This is a restricted case of the subset sum problem. The general problem is NP-complete, but with the subset size fixed at two it can be solved in polynomial time.
If you first sort the values, you reduce the number of pairs that need to be evaluated.

Resources