Find greater numbers than self on the left side and smaller numbers than self on the right side - algorithm

Consider an array a of n integers, indexed from 1 to n.
For every index i such that 1<i<n, define:
count_left(i) = number of indices j such that 1 <= j < i and a[j] > a[i];
count_right(i) = number of indices j such that i < j <= n and a[j] < a[i];
diff(i) = abs(count_left(i) - count_right(i)).
The problem is: given array a, find the maximum possible value of diff(i) for 1 < i < n.
I have a brute-force solution. Can anyone suggest a better one?
Constraint: 3 < n <= 10^5
Example
Input Array: [3, 6, 9, 5, 4, 8, 2]
Output: 4
Explanation:
diff(2) = abs(0 - 3) = 3
diff(3) = abs(0 - 4) = 4
diff(4) = abs(2 - 2) = 0
diff(5) = abs(3 - 1) = 2
diff(6) = abs(1 - 1) = 0
maximum is 4.

O(n log n) approach:
Walk through the array from left to right and add every element to an augmented binary search tree (red-black, AVL, etc.) whose nodes store the subtree size, the element's initial index, and a temporary rank field. Immediately after each insertion we know the rank of the element in the current tree state; store it in the temprank field. Then
lb = index - temprank
is the number of bigger elements to the left.
After filling the tree with all items, traverse it again to retrieve the final rank of each element. Then
rs = finalrank - temprank
is the number of smaller elements to the right. Now just take the absolute value of the difference between lb and rs:
diff = abs(lb - rs) = abs(index - temprank - finalrank + temprank) = abs(index - finalrank)
But this shows that we don't need temprank at all.
Moreover, we don't need a binary tree!
Just sort the pairs (element; initial index) by element key, breaking ties by initial index, and take the maximum absolute difference new_index - old_index (skipping old indices 1 and n).
a     3  6  9  5  4  8  2
old      2  3  4  5  6
new      5  7  4  3  6
dif      3  4  0  2  0
Python code for concept checking
a = [3, 6, 9, 5, 4, 8, 2]
b = sorted([[e,i] for i,e in enumerate(a)])
print(b)
print(max([abs(n-o[1]) if 0<o[1]<len(a)-1 else 0 for n,o in enumerate(b)]))
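The same idea can be wrapped in a function and cross-checked against the definition. This is a minimal sketch, not the answerer's code: the names max_rank_shift and max_diff_brute are made up for illustration, indices are 0-based, and the array is assumed to have at least 3 elements.

```python
def max_rank_shift(a):
    """Max |count_left(i) - count_right(i)| over interior positions, using one sort."""
    # Sort positions by (value, index) so equal values keep their original order.
    order = sorted(range(len(a)), key=lambda i: (a[i], i))
    return max(
        abs(new - old)
        for new, old in enumerate(order)
        if 0 < old < len(a) - 1  # skip the first and last positions
    )


def max_diff_brute(a):
    """Direct O(n^2) evaluation of the definition, used only as a cross-check."""
    n = len(a)
    best = 0
    for i in range(1, n - 1):
        left = sum(1 for j in range(i) if a[j] > a[i])
        right = sum(1 for j in range(i + 1, n) if a[j] < a[i])
        best = max(best, abs(left - right))
    return best


print(max_rank_shift([3, 6, 9, 5, 4, 8, 2]))  # 4
```

Breaking ties by index is what keeps the shortcut valid when the array contains duplicates.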

Related

Kth element in transformed array

I came across this question in a recent interview:
Given an array A of length N, we are supposed to answer Q queries. Each query has the following form:
Given x and k, we need to build another array B of the same length such that B[i] = A[i] ^ x, where ^ is the XOR operator. Sort array B in descending order and return B[k].
Input format:
The first line contains the integer N
The second line contains N integers denoting array A
The third line contains Q, i.e. the number of queries
Each of the next Q lines contains the space-separated integers x and k
Output format:
Print the respective B[k] value for each of the Q queries, each on a new line.
e.g.
for input :
5
1 2 3 4 5
2
2 3
0 1
output will be :
3
5
For first query,
A = [1, 2, 3, 4, 5]
For query x = 2 and k = 3, B = [1^2, 2^2, 3^2, 4^2, 5^2] = [3, 0, 1, 6, 7]. Sorting in descending order B = [7, 6, 3, 1, 0]. So, B[3] = 3.
For second query,
A and B are the same because x = 0. So, B[1] = 5
I have no idea how to solve such problems. Thanks in advance.
This is solvable in O(N + Q). For simplicity I assume you are dealing with positive or unsigned values only, but you can probably adjust this algorithm also for negative numbers.
First you build a binary tree. The left edge stands for a bit that is 0, the right edge for a bit that is 1. In each node you store how many numbers are in this bucket. This can be done in O(N), because the number of bits is constant.
Because this is a little hard to explain, I'm going to show what the tree looks like for the 3-bit numbers [0, 1, 4, 5, 7], i.e. [000, 001, 100, 101, 111]
            *
         /     \
      2           3         2 numbers have first bit 0 and 3 numbers have first bit 1
    /   \       /   \
   2     0     2     1      of the 2 numbers with first bit 0, 2 have second bit 0, ...
  / \         / \   / \
 1   1       1   1 0   1    of the 2 numbers with first and second bit 0, 1 has third bit 0, ...
To answer a single query you go down the tree by using the bits of x. At each node you have 4 possibilities, looking at bit b of x and building answer a, which is initially 0:
b = 0 and k < the value stored in the left child of the current node (the 0-bit branch): current node becomes left child, a = 2 * a (shifting left by 1)
b = 0 and k >= the value stored in the left child: current node becomes right child, k = k - value of left child, a = 2 * a + 1
b = 1 and k < the value stored in the right child (the 1-bit branch, because of the xor operation everything is flipped): current node becomes right child, a = 2 * a
b = 1 and k >= the value stored in the right child: current node becomes left child, k = k - value of right child, a = 2 * a + 1
This is O(1), again because the number of bits is constant. Therefore the overall complexity is O(N + Q).
Example: [0, 1, 4, 5, 7] i.e. [000, 001, 100, 101, 111], k = 3, x = 3 i.e. 011
First bit is 0 and k >= 2, therefore we go right, k = k - 2 = 3 - 2 = 1 and a = 2 * a + 1 = 2 * 0 + 1 = 1.
Second bit is 1 and k >= 1, therefore we go left (inverted because the bit is 1), k = k - 1 = 0, a = 2 * a + 1 = 3
Third bit is 1 and k < 1, so the solution is a = 2 * a + 0 = 6
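Check: [000, 001, 100, 101, 111] xor 011 = [011, 010, 111, 110, 100], i.e. [3, 2, 7, 6, 4], which in ascending order is [2, 3, 4, 6, 7], so the number at index 3 is indeed 6 (always talking about 0-based indexing here). The question asks for descending order with a 1-based k, which corresponds to the 0-based index N - k in ascending order.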
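The descent described above can be sketched in Python as follows. This is an illustration under assumptions, not the answerer's code: the node layout [count, zero_child, one_child], the function names, and the default width of 31 bits are all choices made here, and values are assumed to be non-negative.

```python
BITS = 31  # assumed bit width for non-negative inputs (illustrative choice)


def build_counts(values, bits=BITS):
    """Binary trie; each node is [count, zero_child, one_child]."""
    root = [0, None, None]
    for v in values:
        node = root
        node[0] += 1
        for shift in range(bits - 1, -1, -1):
            bit = (v >> shift) & 1
            if node[1 + bit] is None:
                node[1 + bit] = [0, None, None]
            node = node[1 + bit]
            node[0] += 1
    return root


def kth_smallest_xor(root, x, k, bits=BITS):
    """0-based k-th smallest value of (v ^ x) over the stored values."""
    node, answer = root, 0
    for shift in range(bits - 1, -1, -1):
        b = (x >> shift) & 1
        low = node[1 + b]  # child whose values get a 0 in this bit after XOR
        low_count = low[0] if low else 0
        if k < low_count:
            node, answer = low, answer << 1
        else:
            k -= low_count
            node, answer = node[2 - b], (answer << 1) | 1
    return answer


def kth_largest_xor(root, x, k, bits=BITS):
    """1-based k-th largest, the form used in the question."""
    return kth_smallest_xor(root, x, root[0] - k, bits)


root = build_counts([1, 2, 3, 4, 5])
print(kth_largest_xor(root, 2, 3))  # 3
print(kth_largest_xor(root, 0, 1))  # 5
```

The sketch assumes k is a valid position; with an out-of-range k the descent would step into a missing child.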

Determining the pairs of integers that sum to some value in the array

I have a program that counts the number of pairs among N integers that sum to a given value. To simplify the problem, assume also that the integers are distinct.
l.Sort();
for (int i = 0; i < l.Count; ++i)
{
    int j = l.BinarySearch(value - l[i]);
    if (j > i)
    {
        Console.WriteLine("{0} {1}", i + 1, j + 1);
    }
}
To solve the problem, we sort the array (to enable binary search) and then, for every entry a[i] in the array, do a binary search for value - a[i]. If the result is an index j with j > i, we show this pair.
But this algorithm doesn't work on the following input:
1 2 3 4 4 9 56 90, because j is always smaller than i.
How can I fix that?
I would go with a more efficient solution that needs more space.
Assume that the numbers are not distinct.
Create a hash table with your integers as keys and their frequencies as values.
Iterate over this hash table.
For each key k:
calculate diff = value - k
look up diff in the hash table
if there is a match, check whether its frequency is > 0
if the frequency is > 0, decrement it by 1 and yield the current pair k, diff
Here is the Python code:
def count_pairs(arr, value):
    # Build a frequency table: number -> how many times it occurs.
    hsh = {}
    for k in arr:
        cnt = hsh.get(k, 0)
        hsh[k] = cnt + 1
    for k in arr:
        diff = value - k
        # Default to 0 so a missing key is not compared as None (an error in Python 3).
        cnt = hsh.get(diff, 0)
        if cnt > 0:
            hsh[k] -= 1
            print("Pair detected: " + str(k) + " and " + str(diff))

count_pairs([4, 2, 3, 4, 9, 1, 5, 4, 56, 90], 8)
#=> Pair detected: 4 and 4
#=> Pair detected: 3 and 5
#=> Pair detected: 4 and 4
#=> Pair detected: 4 and 4
Since "counts the number of pairs" is a rather vague description, note that the output above shows 4 pairs that are distinct by element index.
If you want this to work for non-distinct values (which your
question does not say, but your comment implies), binary search only the
portion of the array after i. This also eliminates the need for the
if (j > i) test.
Would show the code, but I don't know how to specify such a slice in
whatever language you're using.
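In Python, the "search only the portion after i" idea could look like the sketch below. It relies on the lo argument of bisect_left from the standard library; the function name pairs_with_sum is made up for illustration, and the sketch reports at most one partner per position i rather than every duplicate combination.

```python
from bisect import bisect_left


def pairs_with_sum(values, target):
    """For each position i in sorted order, binary search for a partner after i."""
    s = sorted(values)
    found = []
    for i, v in enumerate(s):
        # Restrict the search to s[i + 1:], so no "j > i" test is needed.
        j = bisect_left(s, target - v, i + 1)
        if j < len(s) and s[j] == target - v:
            found.append((v, s[j]))
    return found


print(pairs_with_sum([1, 2, 3, 4, 4, 9, 56, 90], 8))  # [(4, 4)]
```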

How many permutations of a given array result in BST's of height 2?

A BST is generated (by successive insertion of nodes) from each permutation of keys from the set {1,2,3,4,5,6,7}. How many permutations determine trees of height two?
I have been stuck on this simple question for quite some time. Any hints, anyone?
By the way the answer is 80.
Consider how the tree would be height 2?
-It needs to have 4 as root, 2 as the left child, 6 right child, etc.
How come 4 is the root?
-It needs to be the first inserted. So one number is now fixed; the other 6 can still move around in the permutation.
And?
-After the first insert there are still 6 places left, 3 for the left and 3 for the right subtrees. That's 6 choose 3 = 20 choices.
Now what?
-For the left and right subtrees, their roots need to be inserted first, then the children's order does not affect the tree - 2, 1, 3 and 2, 3, 1 gives the same tree. That's 2 for each subtree, and 2 * 2 = 4 for the left and right subtrees.
So?
In conclusion: C(6, 3) * 2 * 2 = 20 * 2 * 2 = 80.
Note that there is only one possible shape for this tree - it has to be perfectly balanced. It therefore has to be this tree:
4
/ \
2 6
/ \ / \
1 3 5 7
This requires 4 to be inserted first. After that, the insertions need to build up the subtrees holding 1, 2, 3 and 5, 6, 7 in the proper order. This means that we will need to insert 2 before 1 and 3 and need to insert 6 before 5 and 7. It doesn't matter what relative order we insert 1 and 3 in, as long as they're after the 2, and similarly it doesn't matter what relative order we put 5 and 7 in as long as they're after 6. You can therefore think of what we need to insert as 2 X X and 6 Y Y, where the X's are the children of 2 and the Y's are the children of 6. We can then find all possible ways to get back the above tree by finding all interleaves of the sequences 2 X X and 6 Y Y, then multiplying by four (the number of ways of assigning X and Y the values 1, 3, 5, and 7).
So how many ways are there to interleave? Well, you can think of this as the number of ways to permute the sequence L L L R R R, since each permutation of L L L R R R tells us how to choose from either the Left sequence or the Right sequence. There are 6! / (3! 3!) = 20 ways to do this. Since each of those twenty interleaves gives four possible insertion sequences, there end up being a total of 20 × 4 = 80 possible ways to do this.
Hope this helps!
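The count of 80 can be confirmed by brute force, since 7 keys give only 5040 permutations. The following is a small sketch written for this check; the helper names bst_height and count_height are illustrative, and height is measured in edges, so a single node has height 0.

```python
from itertools import permutations


def bst_height(keys):
    """Height (in edges) of the BST built by inserting keys in the given order."""
    left, right = {}, {}  # parent key -> child key
    root = keys[0]
    height = 0
    for key in keys[1:]:
        node, depth = root, 0
        while True:
            child = left if key < node else right
            depth += 1
            if node not in child:
                child[node] = key  # free slot found, attach the new key here
                break
            node = child[node]
        height = max(height, depth)
    return height


def count_height(n, h):
    """Number of insertion orders of 1..n that give a BST of height exactly h."""
    return sum(1 for p in permutations(range(1, n + 1)) if bst_height(p) == h)


print(count_height(7, 2))  # 80
```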
I've created a table for the number of permutations possible with 1 - 12 elements, with heights up to 12, and included the per-root break down for anybody trying to check that their manual process (described in other answers) is matching with the actual values.
http://www.asmatteringofit.com/blog/2014/6/14/permutations-of-a-binary-search-tree-of-height-x
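Here is C++ code supporting the accepted answer. I haven't shown the obvious ncr(i, j) function; hope someone will find it useful.
// Number of insertion orders of n distinct keys that give a BST of height exactly h.
// An empty tree and a single node both count as height 0 here.
// ncr(n, k) is the binomial coefficient, defined elsewhere.
int solve(int n, int h) {
    if (n <= 1)
        return (h == 0);
    int ans = 0;
    for (int i = 0; i < n; i++) {  // i = number of keys in the left subtree
        int res = 0;
        // One subtree has height h - 1, the other is strictly shorter.
        for (int j = 0; j < h - 1; j++) {
            res = res + solve(i, j) * solve(n - i - 1, h - 1);
            res = res + solve(n - i - 1, j) * solve(i, h - 1);
        }
        // Both subtrees have height h - 1.
        res = res + solve(i, h - 1) * solve(n - i - 1, h - 1);
        // ncr(n - 1, i) ways to interleave the left and right insertion sequences.
        ans = ans + ncr(n - 1, i) * res;
    }
    return ans;
}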
The tree must have 4 as the root, with 2 and 6 as the left and right children, respectively. There is only one choice for the root, so the insertion must start with 4. Once the root is inserted, however, there are many possible insertion orders. There are 2 choices for the second insertion: 2 or 6. If we choose 2 for the second insertion, there are three cases for the position of 6:
6 is the third insertion (4, 2, 6, -, -, -, -): there are 4! = 24 choices for the rest of the insertions.
6 is the fourth insertion (4, 2, -, 6, -, -, -): there are 2 choices for the third insertion (1 or 3) and 3! choices for the rest, so 2 * 3! = 12.
6 is the fifth insertion (4, 2, -, -, 6, -, -): there are 2 choices for the third and fourth insertions ((1, 3) or (3, 1)) and 2 for the last two insertions ((5, 7) or (7, 5)), so 4 choices.
In total, if 2 is the second insertion, there are 24 + 12 + 4 = 40 choices for the remaining insertions. Similarly, there are 40 choices if the second insertion is 6, so the total number of different insertion orders is 80.

Find algorithm to split sequence in 2 to minimize difference in sum [duplicate]

Here's the problem: given a sequence of numbers, split these numbers into 2 sequences so that the difference between the sums of the two sequences is minimal. For example, given the sequence [5, 4, 3, 3, 3], the solution is:
[5, 4] -> sum is 9
[3, 3, 3] -> sum is 9
The difference is 0
In other terms, can you find an algorithm (C language preferred) that, given an input vector (of variable size) of integers, outputs two vectors where the difference between the two sums is minimal?
A brute-force algorithm should be avoided.
To be sure of getting the right solution, it would be nice to compare the results of your algorithm against a brute-force algorithm in a benchmark.
It sounds like a sub-arrays problem (which is my interpretation of "sequences").
Meaning the only possibilities for 5, 4, 3, 3, 3 are:
| 5, 4, 3, 3, 3 => 0 - 18 => 18
5 | 4, 3, 3, 3 => 5 - 13 => 8
5, 4 | 3, 3, 3 => 9 - 9 => 0
5, 4, 3 | 3, 3 => 12 - 6 => 6
5, 4, 3, 3 | 3 => 15 - 3 => 12
5, 4, 3, 3, 3 | => 18 - 0 => 18 (same as first)
It is as simple as just comparing the sums on either side of every index.
Code: (untested)
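// Needs <limits.h> (INT_MAX), <stdlib.h> (abs) and <stdio.h> (printf).
int total = 0;
for (int i = 0; i < n; i++)
    total += arr[i];
int best = INT_MAX, bestPos = -1, current = 0;
for (int i = 0; i < n; i++)
{
    current += arr[i];  // sum of arr[0..i], the left part
    // Difference between the left part and the right part (total - current).
    int diff = abs(current - (total - current));
    if (diff < best)
    {
        best = diff;
        bestPos = i;  // the left part ends at index i
    }
    // else break; - optimisation, may not work (e.g. with negative numbers)
}
printf("The best position is at %d\n", bestPos);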
The above is O(n), logically, you can't do much better than that.
You can slightly optimize the above by doing a binary-search-like process on the prefix sums to get down to n + log n rather than 2n, but both are O(n). This only works when the prefix sums never decrease, i.e. when there are no negative numbers. Basic pseudo-code:
sum[0] = arr[0]
// sum[i] represents sum from indices 0 to i
for (i = 1:n-1)
    sum[i] = sum[i-1] + arr[i]
total = sum[n-1]
start = 0
end = n-1
best = MAX
repeat:
    if (start > end) stop
    mid = (start + end) / 2
    sumFromMidToN = total - sum[mid]
    best = min(best, abs(sumFromMidToN - sum[mid]))
    if (sum[mid] > sumFromMidToN)
        end = mid - 1
    else if (sum[mid] < sumFromMidToN)
        start = mid + 1
    else
        stop
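Both variants can be sketched in Python and compared with each other. This is an illustration only: the function names are made up, the input is assumed to be non-empty, and the binary-search variant additionally assumes non-negative values so that the prefix sums are non-decreasing.

```python
from itertools import accumulate


def best_split_linear(arr):
    """Return (min difference, p), where the split is arr[:p] | arr[p:]."""
    total = sum(arr)
    best, best_pos, left = abs(total), 0, 0  # p = 0: everything on the right
    for p, value in enumerate(arr, start=1):
        left += value
        diff = abs(2 * left - total)  # |left sum - right sum|
        if diff < best:
            best, best_pos = diff, p
    return best, best_pos


def best_split_bisect(arr):
    """Minimum difference only; assumes all values are non-negative."""
    prefix = list(accumulate(arr))
    total = prefix[-1]
    lo, hi, best = 0, len(prefix) - 1, total  # total = difference of the empty split
    while lo <= hi:
        mid = (lo + hi) // 2
        left = prefix[mid]
        best = min(best, abs(2 * left - total))
        if 2 * left > total:
            hi = mid - 1  # left part too heavy, try a shorter prefix
        elif 2 * left < total:
            lo = mid + 1  # left part too light, try a longer prefix
        else:
            break  # perfect balance
    return best


print(best_split_linear([5, 4, 3, 3, 3]))  # (0, 2)
print(best_split_bisect([5, 4, 3, 3, 3]))  # 0
```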
If it's actually subsets, then, as already mentioned, it appears to be the optimization version of the Partition problem, which is a lot more difficult.

Linked list algorithm to find pairs adding up to 10

Can you suggest an algorithm that finds all pairs of nodes in a linked list that add up to 10?
I came up with the following.
Algorithm: Compare each node, starting with the second node, with each node starting from the head node till the previous node (previous to the current node being compared) and report all such pairs.
I think this algorithm should work; however, it's certainly not the most efficient one, having a complexity of O(n^2).
Can anyone hint at a solution which is more efficient (perhaps takes linear time). Additional or temporary nodes can be used by such a solution.
If their range is limited (say between -100 and 100), it's easy.
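Create an array quant[-100..100], initialised to zero, then just cycle through your linked list, executing:
quant[value] = quant[value] + 1
Then the following loop will do the trick. It only visits i <= 5, so that j = 10 - i stays within the array bounds and each pair is reported once:
for i = -90 to 5:
    j = 10 - i
    if i == j:
        pairs = quant[i] * (quant[i] - 1) / 2
    else:
        pairs = quant[i] * quant[j]
    for k = 1 to pairs:
        output i, " ", j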
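When the range is not known in advance, the same counting idea can be sketched with a dictionary of counts instead of a fixed array. This is a hedged illustration, not the answerer's code: count_pairs_with_sum is a made-up name, and a plain iterable stands in for the traversal of the linked list.

```python
from collections import Counter


def count_pairs_with_sum(values, target=10):
    """Number of unordered pairs of positions whose values add up to target."""
    counts = Counter(values)
    pairs = 0
    for v, c in counts.items():
        w = target - v
        if v < w:
            pairs += c * counts.get(w, 0)  # every v pairs with every w
        elif v == w:
            pairs += c * (c - 1) // 2  # a value paired with another copy of itself
    return pairs


print(count_pairs_with_sum([2, 3, 1, 3, 5, 7, 10, -1, 11]))  # 3
```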
Even if their range isn't limited, you can have a more efficient method than what you proposed, by sorting the values first and then just keeping counts rather than individual values (same as the above solution).
This is achieved by running two pointers, one at the start of the list and one at the end. When the numbers at those pointers add up to 10, output them and move the end pointer down and the start pointer up.
When they're greater than 10, move the end pointer down. When they're less, move the start pointer up.
This relies on the sorted nature. Less than 10 means you need to make the sum higher (move start pointer up). Greater than 10 means you need to make the sum less (end pointer down). Since there are no duplicates in the list (because of the counts), being equal to 10 means you move both pointers.
Stop when the pointers pass each other.
There's one more tricky bit and that's when the pointers are equal and the value sums to 10 (this can only happen when the value is 5, obviously).
You don't base the number of pairs on the product of the counts here. For a count of c there are c * (c - 1) / 2 pairs, because a value needs a second copy of itself to pair with. A single 5 (count of 1) therefore doesn't produce a pair summing to 10.
So, for the list:
2 3 1 3 5 7 10 -1 11
you get:
Index  a  b  c  d  e  f  g  h
Value -1  1  2  3  5  7 10 11
Count  1  1  1  2  1  1  1  1
You start pointer p1 at a and p2 at h. Since -1 + 11 = 10, you output those two numbers (as above, you do it N times where N is the product of the counts). That's one copy of (-1,11). Then you move p1 to b and p2 to g.
1 + 10 > 10 so leave p1 at b, move p2 down to f.
1 + 7 < 10 so move p1 to c, leave p2 at f.
2 + 7 < 10 so move p1 to d, leave p2 at f.
3 + 7 = 10, output two copies of (3,7) since the count of d is 2, move p1 to e, p2 to e.
5 + 5 = 10 but p1 = p2 and the count of e is only 1, so there is no second 5 to pair with. Output nothing, move p1 to f, p2 to d.
Loop ends since p1 > p2.
Hence the overall output was:
(-1,11)
( 3, 7)
( 3, 7)
which is correct.
Here's some test code. You'll notice that I've forced 7 (the midpoint) to a specific value for testing. Obviously, you wouldn't do this.
#include <stdio.h>
#include <stdlib.h>  // srand, rand
#include <time.h>    // time
#define SZSRC 30
#define SZSORTED 20
#define SUM 14
int main (void) {
    int i, s, e, prod;
    int srcData[SZSRC];
    int sortedVal[SZSORTED];
    int sortedCnt[SZSORTED];
    // Make some random data.
    srand (time (0));
    for (i = 0; i < SZSRC; i++) {
        srcData[i] = rand() % SZSORTED;
        printf ("srcData[%2d] = %5d\n", i, srcData[i]);
    }
    // Convert to value/size array.
    for (i = 0; i < SZSORTED; i++) {
        sortedVal[i] = i;
        sortedCnt[i] = 0;
    }
    for (i = 0; i < SZSRC; i++)
        sortedCnt[srcData[i]]++;
    // Force 7+7 to specific count for testing.
    sortedCnt[7] = 2;
    for (i = 0; i < SZSORTED; i++)
        if (sortedCnt[i] != 0)
            printf ("Sorted [%3d], count = %3d\n", i, sortedCnt[i]);
    // Start and end pointers.
    s = 0;
    e = SZSORTED - 1;
    // Loop until they pass each other.
    while (s <= e) {
        // Equal to desired value?
        if (sortedVal[s] + sortedVal[e] == SUM) {
            // Get number of pairs (note special case at midpoint:
            // c copies of one value form c * (c - 1) / 2 pairs).
            prod = (s == e)
                ? sortedCnt[s] * (sortedCnt[s] - 1) / 2
                : sortedCnt[s] * sortedCnt[e];
            // Output the right count.
            for (i = 0; i < prod; i++)
                printf ("(%3d,%3d)\n", sortedVal[s], sortedVal[e]);
            // Move both pointers and continue.
            s++;
            e--;
            continue;
        }
        // Less than desired, move start pointer.
        if (sortedVal[s] + sortedVal[e] < SUM) {
            s++;
            continue;
        }
        // Greater than desired, move end pointer.
        e--;
    }
    return 0;
}
You'll see that the code above is all O(n) since I'm not sorting in this version, just intelligently using the values as indexes.
If the minimum is below zero (or very high to the point where it would waste too much memory), you can just use a minVal to adjust the indexes (another O(n) scan to find the minimum value and then just use i-minVal instead of i for array indexes).
And, even if the range from low to high is too expensive on memory, you can use a sparse array. You'll have to sort it, O(n log n), and search it for updating counts, also O(n log n), but that's still better than the original O(n^2). The reason the binary search is O(n log n) is that a single search would be O(log n) but you have to do it for each value.
And here's the output from a test run, which shows you the various stages of calculation.
srcData[ 0] = 13
srcData[ 1] = 16
srcData[ 2] = 9
srcData[ 3] = 14
srcData[ 4] = 0
srcData[ 5] = 8
srcData[ 6] = 9
srcData[ 7] = 8
srcData[ 8] = 5
srcData[ 9] = 9
srcData[10] = 12
srcData[11] = 18
srcData[12] = 3
srcData[13] = 14
srcData[14] = 7
srcData[15] = 16
srcData[16] = 12
srcData[17] = 8
srcData[18] = 17
srcData[19] = 11
srcData[20] = 13
srcData[21] = 3
srcData[22] = 16
srcData[23] = 9
srcData[24] = 10
srcData[25] = 3
srcData[26] = 16
srcData[27] = 9
srcData[28] = 13
srcData[29] = 5
Sorted [ 0], count = 1
Sorted [ 3], count = 3
Sorted [ 5], count = 2
Sorted [ 7], count = 2
Sorted [ 8], count = 3
Sorted [ 9], count = 5
Sorted [ 10], count = 1
Sorted [ 11], count = 1
Sorted [ 12], count = 2
Sorted [ 13], count = 3
Sorted [ 14], count = 2
Sorted [ 16], count = 4
Sorted [ 17], count = 1
Sorted [ 18], count = 1
( 0, 14)
( 0, 14)
( 3, 11)
( 3, 11)
( 3, 11)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 7, 7)
Create a hash set (HashSet in Java) (could use a sparse array if your numbers are well-bounded, i.e. you know they fall into +/- 100)
For each node, first check if 10-n is in the set. If so, you have found a pair. Either way, then add n to the set and continue.
So for example you have
1 - 6 - 3 - 4 - 9
1 - is 9 in the set? Nope
6 - 4? No.
3 - 7? No.
4 - 6? Yup! Print (6,4)
9 - 1? Yup! Print (9,1)
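The single pass above can be sketched in Python as follows. This is a minimal illustration under assumptions: pairs_summing_to is a made-up name, a plain iterable stands in for walking the linked list, and each node is matched at most once against each distinct earlier value.

```python
def pairs_summing_to(values, target=10):
    """Single pass: look up the needed partner among the values already seen."""
    seen = set()
    found = []
    for n in values:
        if target - n in seen:
            found.append((target - n, n))  # (earlier value, current value)
        seen.add(n)  # add after the check so a node never pairs with itself
    return found


print(pairs_summing_to([1, 6, 3, 4, 9]))  # [(6, 4), (1, 9)]
```

Each set lookup and insertion takes expected constant time, so the whole pass is expected O(n).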
This is a restricted case of the subset sum problem. The general problem is NP-complete, but with the subset size fixed at 2 it can be solved in polynomial time.
If you were to first sort the set, it would reduce the number of pairs that need to be evaluated.
