Counting the number of same ordered pairs in three different permutations - algorithm

I have three permutations of the set {1,2,...,n}, and I would like to write some code to count the number of pairs of numbers that appear in the same relative order in all three permutations.
As an example, take the permutations
(1,2,3)
(1,2,3)
(3,2,1)
There are 0 such pairs, because every pair that appears in increasing order in the first two permutations appears in decreasing order in the third.
I want an optimal O(N*logN) solution. A hint was given: count the number of inversions of each permutation, where an inversion is a pair of indices (i,j) such that i > j but a[j] > a[i].
I can do this in O(NlogN).
Clearly, a pair that comes in increasing order in every permutation contributes nothing to any inversion count. But that reasoning breaks down when i > j and a[j] > a[i] in every permutation (the pair comes in decreasing order everywhere): such a pair should also be counted, yet it contributes 1 to each inversion count instead of 0. So even if I can count the inversions in each array, I don't see the link between those counts and the number of same-ordered pairs.

For each permutation, count the number of pairs of indices (i, j) with i < j and a[i] < a[j]. Let's call these non-inversions. Then the result is the minimum of the non-inversion counts you found.
For instance, in your sample case, the non-inversion counts for the permutations (1,2,3), (1,2,3) and (3,2,1) are 3, 3, and 0, respectively, so the result is min(3,3,0) = 0.
For a different sample case, examine (1,2,3,4), (1,2,4,3), (1,3,2,4) and (4,1,3,2). The corresponding counts for these permutations are 6, 5, 5, and 2. The result is min(6,5,5,2) = 2, because the only pairs that remain ordered in every permutation are (1, 2) and (1, 3).
The key idea behind this solution is what the non-inversion count implies. In a sorted array of size N, all N(N-1)/2 pairs contribute to the non-inversion count. As you introduce inversions into that array, the relative order of some elements is lost while some is preserved. By taking the minimum of the non-inversion counts, you find the number of pairs that preserve their relative ordering in the 'worst' case (even though this alone is not enough to identify them individually).
If you insist on counting the inversions (i.e. as opposed to the non-inversions) the procedure is pretty much the same. Count the inversions for each permutation given to you. Then simply subtract the maximum value you find from N(N-1)/2, and obtain the result.
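To make this concrete, here is a sketch (my code, not from the answer) that counts non-inversions for each permutation with a Fenwick tree in O(N log N) and then takes the minimum, as described above:

```python
def non_inversions(perm):
    """Count pairs i < j with perm[i] < perm[j] via a Fenwick (BIT) tree."""
    n = len(perm)
    tree = [0] * (n + 1)
    count = 0
    for v in perm:            # values are 1..n, processed left to right
        i = v - 1             # prefix query: earlier values smaller than v
        while i > 0:
            count += tree[i]
            i -= i & (-i)
        i = v                 # point update: mark v as seen
        while i <= n:
            tree[i] += 1
            i += i & (-i)
    return count

def same_ordered_pairs(perms):
    return min(non_inversions(p) for p in perms)
```

For the second sample above, same_ordered_pairs returns min(6, 5, 5, 2) = 2.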

Related

Join (sum) two adjacent elements of an array into one element until its size is K and GCD of new elements is maximum possible

I'm having trouble solving this one. The task is to write a program that, for a given array of N numbers (N <= 10^5), prints a new array made by repeatedly joining two adjacent elements into their sum (the sum replaces the two adjacent elements, so each join shrinks the array by 1) until the array's size is K. I need to print a solution where the GCD of the new elements is as large as possible (and also print that GCD after the array).
Note: Sum of all elements in the given array is not higher than 10^6.
I've realized that I could use prefix sum somehow because the sum of all elements isn't higher than 10^6, but that didn't help me that much.
What is an optimal solution to this problem?
Your GCD will be a divisor of the sum of all elements in the array. The sum is not greater than 10^6, so it has at most 240 divisors, and you can simply try every one of them as the candidate GCD; that is fast enough. You can check whether a candidate g is achievable in linear time: walk through the array keeping a running sum, and whenever the running sum becomes divisible by g, close a block and reset the sum to 0. If you end up with at least K blocks, g is achievable (joining any two adjacent blocks keeps every block sum a multiple of g, so you can reduce to exactly K blocks).
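A minimal sketch of this check (the function names are mine): enumerate the divisors of the total sum and keep the largest one that still yields at least K blocks.

```python
from math import isqrt

def blocks_for(arr, g):
    """Greedily cut arr into contiguous blocks whose sums are multiples of g."""
    blocks, running = 0, 0
    for x in arr:
        running += x
        if running % g == 0:
            blocks += 1
            running = 0
    return blocks          # running ends at 0 whenever g divides the total

def best_gcd(arr, k):
    total = sum(arr)
    divisors = set()
    for d in range(1, isqrt(total) + 1):
        if total % d == 0:
            divisors.update((d, total // d))
    return max(g for g in divisors if blocks_for(arr, g) >= k)
```

For example, best_gcd([2, 4, 6, 8], 2) is 4, from the split (2+4+6, 8).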

Finding two sublists of fixed sizes (K,L) to maximize total sum among positive sequence

Given a list of positive integers, and two integers K and L, I need to select two non-overlapping contiguous sublists of lengths K and L so as to maximize the combined sum of the two sublists.
For example, if the list is [6,1,4,6,3,2,7,4], K = 3, and L = 2, then I want the sublists [4,6,3] and [7,4], whose combined sum of 24 is the maximum achievable.
The list has at least K + L elements and at most 600 elements; the elements are integers in the range [1, 500].
I do not know where to start. I'm thinking a Dynamic Programming solution, but I'm not very familiar with it so I'm not sure if that's the way to go.
Scan the array left to right, computing sliding-window sums of contiguous subarrays of length K and of length L starting at every index. This can be performed in O(n).
Write the largest sums before each index to auxiliary arrays LeftK, LeftL
Write the largest sums after each index to auxiliary arrays RightK, RightL
Now for every index i get sums of LeftK[i]+RightL[i] and LeftL[i]+RightK[i] and choose the best sum among all entries.
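Those four steps might look like this in code (a sketch under my own naming; the helper functions and the 0 sentinels for "no window" are not from the answer):

```python
def max_two_sublists(a, K, L):
    n = len(a)

    def window_sums(length):
        """sums[i] = sum of a[i : i + length] for every valid start i."""
        s = sum(a[:length])
        sums = [s]
        for i in range(length, n):
            s += a[i] - a[i - length]
            sums.append(s)
        return sums

    def left_right(length):
        """Best window of `length` ending by index i / starting from index i."""
        sums = window_sums(length)
        left, right = [0] * n, [0] * n
        for i in range(n):
            ending_here = sums[i - length + 1] if i >= length - 1 else 0
            left[i] = max(left[i - 1] if i else 0, ending_here)
        for i in range(n - 1, -1, -1):
            starting_here = sums[i] if i <= n - length else 0
            right[i] = max(right[i + 1] if i < n - 1 else 0, starting_here)
        return left, right

    left_k, right_k = left_right(K)
    left_l, right_l = left_right(L)
    # one window fully left of the split point, the other fully right of it
    return max(max(left_k[i] + right_l[i + 1], left_l[i] + right_k[i + 1])
               for i in range(n - 1))
```

With the sample input, max_two_sublists([6, 1, 4, 6, 3, 2, 7, 4], 3, 2) gives 24.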

Splitting values into similarly distributed evenly sized groups

Given a list of scalar values, how can we split the list into K evenly-sized groups such that the groups have similar distributions? Note that simplicity is strongly favored over efficiency.
I am currently doing:
sort values
create K empty groups: group_1, ..., group_K
while values is not empty:
    for group in groups:
        group.add(values.pop())
        if values is empty:
            break
This is a variation on what @m.raynal came up with that will work well even when n is just a fairly small multiple of k.
Sort the elements from smallest to largest.
Create k empty groups.
Put them into a Priority Queue sorted from least elements to most, then largest sum to smallest. (So the next element is always the one with the largest sum among all of those with the fewest elements.)
For each element, take a group off of the priority queue, add that element, put the group back in the priority queue.
In practice this means that the first k elements go to groups randomly, the next k elements go in reverse order. And then it gets clever about keeping things balanced.
Depending on your application, the fact that the bottom two values are spaced predictably far apart could be a problem. If that is the case then you could complicate this by going "middle out". But that scheme is much more complicated.
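A short sketch of this scheme with Python's heapq (names are mine; the tiebreak field just avoids comparing lists when priorities are equal):

```python
import heapq

def balanced_groups(values, k):
    """Feed sorted values to the group with fewest elements, largest sum."""
    # heap entries: (num_elements, -group_sum, tiebreak, group)
    heap = [(0, 0, i, []) for i in range(k)]
    heapq.heapify(heap)
    for v in sorted(values):
        count, neg_sum, tie, group = heapq.heappop(heap)
        group.append(v)
        heapq.heappush(heap, (count + 1, neg_sum - v, tie, group))
    return [entry[3] for entry in heap]
```

For example, balanced_groups([1, 2, 3, 4, 5, 6], 3) produces the groups [1, 6], [2, 5], [3, 4] (in some order), each summing to 7.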
Here is a way to distribute the values (somewhat) evenly.
Let's assume your array of scalars A has size n, with n a multiple of k to keep things simple.
One way could then be :
sort(A)
d = n/k
g = 0
for i from 0 to d-1 do {
    for j from 0 to k-1 do {
        group[(j+g) % k].add(A[k*i + j])
    }
    g++
}
You then add the first k elements to groups 1, ..., k, the next k elements to groups 2, ..., k, 1, the next to groups 3, ..., k, 1, 2, and so on.
It would not work well if k² > n; in that case you should increment g not by 1 but by a larger value close to k/d. If k is close to n, this algorithm becomes simply useless.
This gives absolutely no guarantee of an even distribution of the scalars if some extreme values are present in A. But in the case where A itself is reasonably well distributed and n > k², it spreads the values among the k groups reasonably well.
It has at least the advantage of running in O(n) once A is sorted.
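Assuming n is a multiple of k as above, the pseudocode translates to Python roughly as:

```python
def round_robin_groups(values, k):
    a = sorted(values)
    n = len(a)
    d = n // k
    groups = [[] for _ in range(k)]
    for i in range(d):          # take the i-th chunk of k sorted values...
        for j in range(k):      # ...and deal it out with a rotating offset
            groups[(j + i) % k].append(a[k * i + j])
    return groups
```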

Given a set of n integers, list all possible subsets with sum>=k

Given an unsorted set of integers in the form of an array, find all possible subsets whose sum is greater than or equal to a constant integer k.
E.g., our set is {1,2,3} and k = 2.
Possible subsets:-
{2},
{3},
{1,2},
{1,3},
{2,3},
{1,2,3}
I can only think of a naive algorithm that lists all subsets of the set and checks whether each subset's sum is >= k, but that is exponential: listing all subsets requires O(2^N). Can I use dynamic programming to solve it in polynomial time?
Listing all the subsets is going to be still O(2^N) because in the worst case you may still have to list all subsets apart from the empty one.
Dynamic programming can help you count the number of sets that have sum >= K
You go bottom-up keeping track of how many subsets summed to some value from range [1..K]. An approach like this will be O(N*K) which is going to be only feasible for small K.
The idea behind the dynamic programming solution is best illustrated with an example. Assume that, out of all the sets composed of the first i elements, you know that t1 sum to 2 and t2 sum to 3. Let's say that element i+1 is 4. Given all the existing sets, we can build all the new sets by either appending element i+1 or leaving it out. If we leave it out, we still have t1 subsets that sum to 2 and t2 that sum to 3. If we append it, we obtain t1 subsets that sum to 6 (2 + 4), t2 that sum to 7 (3 + 4), and one new subset containing just element i+1, which sums to 4. That gives us the numbers of subsets summing to 2, 3, 4, 6, and 7 among the first i+1 elements. We continue until N.
In pseudo-code this could look something like this:
int DP[N][K]; // all entries initialized to 0
int set[N];
//go through all elements in the set by index
for i in range[0..N-1]
    //count the one-element subset consisting only of set[i]
    if set[i] < K then DP[i][set[i]] += 1
    if i == 0 then continue
    //case 1. build and count all subsets that don't contain element set[i]
    for k in range[1..K-1]
        DP[i][k] += DP[i-1][k]
    //case 2. build and count subsets that contain element set[i]
    for k in range[1..K-1]
        if k + set[i] >= K then break //inner loop
        DP[i][k+set[i]] += DP[i-1][k]
//result is the number of all subsets - number of subsets with sum < K
//the -1 is for the empty subset
return 2^N - sum(DP[N-1][1..K-1]) - 1
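A runnable version of the same DP, compressed to one dimension (this assumes the elements are positive integers, as in the example):

```python
def count_subsets_with_sum_at_least(nums, K):
    """Count non-empty subsets of nums whose sum is >= K."""
    ways = [0] * K           # ways[s] = subsets seen so far that sum to s < K
    for x in nums:
        # extend existing subsets by x, iterating downward so each
        # subset uses the new element at most once
        for s in range(K - 1 - x, 0, -1):
            ways[s + x] += ways[s]
        if x < K:            # the one-element subset {x}
            ways[x] += 1
    return 2 ** len(nums) - sum(ways) - 1   # also subtract the empty subset
```

For the sample set {1,2,3} with k = 2 this returns 6, matching the listing above.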
Can I use dynamic programming to solve it in polynomial time?
No. The problem is even harder than @amit (in the comments) mentions. Deciding whether there exists a subset that sums to a specific k is the subset-sum problem, which is NP-hard. You are instead asking how many subsets sum to a specific value, which is in the even harder class #P. In addition, your exact problem is harder still, since you want to not only count but enumerate all the qualifying subsets.
If k is 0 and every element of the set is positive, then you have no choice but to output every possible subset, so the lower bound for this problem is O(2^N): the time taken to produce the output.
Unless you know something more about the value of k that you haven't told us, there is no faster general solution than to just check every subset.

Finding sub-array sum in an integer array

Given an array of N positive integers, there are n*(n+1)/2 sub-arrays, including single-element sub-arrays, and each sub-array has a sum S. Finding S for all sub-arrays is obviously O(n^2), since the number of sub-arrays is O(n^2), and many sums may be repeated. Is there any way to find the count of all distinct sums (not the sums themselves, only their count) in O(n log n)?
I tried an approach but got stuck along the way. I iterate the array from index 1 to n.
Say a[i] is the given array. At each index i, a[i] extends every sum that ends at a[i-1], and also forms a single-element sub-array by itself. But a duplicate emerges if, among the sums ending at a[i-1], two sums differ by exactly a[i]: if Sp and Sq both end at a[i-1] and Sq - Sp = a[i], then Sp + a[i] equals Sq, giving a duplicate.
Say C[i] is the count of distinct sums that end at a[i].
Then C[i] = C[i-1] + 1 - (number of pairs of sums ending at a[i-1] whose difference is a[i]).
But the problem is finding that number of pairs in O(log n). Please give me a hint about this, or, if I am on the wrong track and a completely different approach is required, please point that out.
When S is not too large, we can count the distinct sums with one (fast) polynomial multiplication. When S is larger, N is hopefully small enough to use a quadratic algorithm.
Let x_1, x_2, ..., x_n be the array elements. Let y_0 = 0 and y_i = x_1 + x_2 + ... + x_i. Let P(z) = z^{y_0} + z^{y_1} + ... + z^{y_n}. Compute the product of polynomials P(z) * P(z^{-1}); the coefficient of z^k with k > 0 is nonzero if and only if k is a sub-array sum, so we just have to read off the number of nonzero coefficients of positive powers. The powers of z, moreover, range from -S to S, so the multiplication takes time on the order of S log S.
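To illustrate the identity (not the fast algorithm itself): the coefficient of z^k in P(z) * P(z^{-1}) counts the pairs (i, j) with y_j - y_i = k, so the distinct sub-array sums are exactly the distinct positive differences of prefix sums. A brute-force O(n^2) check of that statement:

```python
from collections import Counter
from itertools import accumulate

def distinct_subarray_sums(a):
    y = [0] + list(accumulate(a))      # prefix sums y_0 .. y_n
    # coeff[k] = coefficient of z^k in P(z) * P(1/z), computed pairwise
    coeff = Counter(yj - yi for yi in y for yj in y)
    return sum(1 for k in coeff if k > 0)
```

The O(S log S) version computes the same coefficients with a single FFT-based polynomial multiplication instead of the double loop.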
You can look at the sub-arrays as a kind of tree. In the sense that subarray [0,3] can be divided to [0,1] and [2,3].
So build up a tree, where each node is identified by the length of its subarray and its starting offset in the original array, and whenever you compute a subarray sum, store the result in this tree.
When computing a sub-array, you can check this tree for existing pre-computed values.
Also, when dividing, parts of the array can be computed on different CPU cores, if that matters.
This solution assumes that you don't need all values at once, rather ad-hoc.
For the former, there could be some smarter solution.
Also, I assume that we're talking about element counts in the 10000s and more. Otherwise, such work is a nice exercise but has little practical value.
