Data transformation for permutation - algorithm

I have a square matrix MxN, with elements xij. Each of these values is used in a function of the form some_function(i, j).
That function is applied in column order. What I want to achieve is a kernel function k(i, j) that will be placed inside some_function:
def some_function(i, j):
    i', j' = k(i, j)
It will return another pair (i', j') such that i' != i and j' != j, and (i', j') corresponds to a real element of the initial matrix. Applied to every pair (i, j), this function won't produce any repeated pairs. The pairs (i', j') produced by the kernel function should be well distributed.
My first idea was to precompute the permutation in another list and pass those values to some_function. I would like to know if there is a better way to do it. Thank you.

Consider the MxN matrix as a one-dimensional array of length M*N. You want a transformation that maps every number in the range 0..MN-1 to a unique other number in this range (and returns to the initial index after MN steps).
The simplest way to achieve this is to take steps of size P, where P is coprime with both M and N (and therefore with M*N) and larger than M.
Example:
indx = M * i + j // start cell
for k = 0 .. M*N - 1 do begin
    indx = (indx + P) % (M*N) // integer modulus
    i = indx / M // integer division
    j = indx % M // integer modulus
end // indx returns to the start value
For M=2, N=4, P=5:
indx i j
0    0 0
5    2 1
2    1 0
7    3 1
4    2 0
1    0 1
6    3 0
3    1 1
Note that both i and j change every time.
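A minimal Python sketch of the stride idea, assuming the same flattening as the pseudocode (indx = M * i + j, so j runs over 0..M-1 and i over 0..N-1). The names make_kernel and k are only illustrative, and the sketch checks the coprimality condition but does not by itself guarantee i' != i and j' != j for every choice of P.

```python
from math import gcd

def make_kernel(M, N, P):
    """Build k(i, j) that jumps P cells ahead in the flattened M*N array."""
    size = M * N
    if gcd(P, size) != 1:
        # Without this, the walk would revisit cells before covering all of them.
        raise ValueError("P must be coprime with M * N")

    def k(i, j):
        indx = (M * i + j + P) % size   # step P cells ahead, wrapping around
        return divmod(indx, M)          # back to (i', j')

    return k

k = make_kernel(2, 4, 5)
cells = [(i, j) for i in range(4) for j in range(2)]
images = [k(i, j) for i, j in cells]
print(k(0, 0))                           # (2, 1), the second row of the table
print(sorted(images) == sorted(cells))   # True: every cell is hit exactly once
```

Because k is computed on the fly, nothing has to be precomputed or stored.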

Related

How many ways can we split a number into k unequal summands?

I am taking a challenge online and came across this question, where I need to find the number of ways to split a number 'n' into 'k' unequal summands. For example,
3 - Can be split into 2 and 1.
4 - Can be split into 3 and 1. Note: we cannot use 2 and 2 because they are equal.
5 - (3,2) and (4,1), and so on.
Is there an algorithm for this?
Code in python:
def minArgument(x):
    s = 0
    i = 1
    while s < x:
        s += i
        i += 1
    return i - 1

def maxArgument(x):
    return x - 1

def number_of_sumsDP(M, K):
    lowerLimit = minArgument(M)
    if K < lowerLimit:
        return 0
    else:
        if K - 1 >= M // 2:
            return 1 + number_of_sumsDP(M, K - 1)
        else:
            return 0

def number_of_sums_simple(n):
    if n % 2 == 0:
        return n // 2 - 1
    else:
        return n // 2

for i in range(2, 100):
    if number_of_sumsDP(i, maxArgument(i)) != number_of_sums_simple(i):
        print("mistake")
print("works")
My first thought was dynamic programming (number_of_sumsDP(M, K)): the number of sums equals the number of sums that use the biggest allowed number (at first, the subject minus 1) plus the number of sums without it. The recursion stops when K drops below minArgument, since it doesn't make sense to add up to 10 with numbers less than 4, and when we would start repeating ourselves (K - 1 < M // 2).
After a few prints this leads to an even simpler and much more efficient algorithm:
number_of_sums_simple returns n // 2 when n is odd and n // 2 - 1 when n is even. The reason is that a split into two unequal summands is a pair (a, n - a) with 1 <= a < n - a, so a can be any integer from 1 up to, but not including, n / 2. Note that this formula counts splits into exactly two summands, as in the examples.
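For the general question with k summands, one possible approach (not part of the answer above, which handles two summands) is a small dynamic program over the values 1..n, each used at most once. The name count_distinct_splits is made up for this sketch.

```python
def count_distinct_splits(n, k):
    """Count sets of k distinct positive integers that sum to n."""
    # ways[j][s]: number of ways to pick j distinct values (from those seen so far)
    # with total s.
    ways = [[0] * (n + 1) for _ in range(k + 1)]
    ways[0][0] = 1
    for v in range(1, n + 1):
        # Go through j in descending order so that v is added at most once per set.
        for j in range(k, 0, -1):
            for s in range(v, n + 1):
                ways[j][s] += ways[j - 1][s - v]
    return ways[k][n]

print(count_distinct_splits(5, 2))    # 2: (4,1) and (3,2)
print(count_distinct_splits(10, 3))   # 4: (7,2,1), (6,3,1), (5,4,1), (5,3,2)
```

For k = 2 this agrees with the closed form (n - 1) // 2, which is the same value number_of_sums_simple returns.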

How to find all possible reachable numbers from a position?

Given two numbers n and s and an array A of size m, where s is the initial position with 1 <= s <= n, our task is to perform m operations on s. In each operation we set either s = s + A[i] or s = s - A[i]. We have to print all the values that are possible after the m operations, and all those values must lie between 1 and n (inclusive).
Important note: if during an operation we get a value s < 1 or s > n, we don't go further with that value of s.
I solved the problem using BFS, but the BFS approach is not optimal here. Can someone suggest a more efficient approach or algorithm?
For example:
If n = 3, s = 3, and A = {1, 1, 1}
                           3
                       /       \
operation 1:       2               4   (we don't proceed with 4 as it is > n)
                 /   \           /   \
operation 2:   1       3       3       5
              / \     / \     / \     / \
operation 3: 0   2   2   4   2   4   4   6
So the final values reachable by following the above rules are 2 and 2 (that is, 2 is reached twice). We don't count the third 2, as its path has an intermediate state that is > n (the same applies to an intermediate state < 1).
There is this dynamic programming solution, which runs in O(nm) time and requires O(n) space.
First establish a boolean array called reachable, initialize it to false everywhere except for reachable[s], which is true.
This array now represents whether a number is reachable in 0 steps. Now for every i from 1 to m, we update the array so that reachable[x] represents whether the number x is reachable in i steps. This is easy: x is reachable in i steps if and only if either x - A[i] or x + A[i] is reachable in i - 1 steps.
In the end, the array becomes the final result you want.
EDIT: pseudo-code here.
// initialization:
for x = 1 to n:
    r[x] = false
r[s] = true
// main loop:
for k = 1 to m:
    for x = 1 to n:
        last_r[x] = r[x]
    for x = 1 to n:
        r[x] = (last_r[x + A[k]] or last_r[x - A[k]])
Here last_r[x] is by convention false if x is not in the range [1 .. n].
If you want to keep track of the number of ways each number can be reached, make the following changes:
Change the array r to an integer array;
In the initialization, initialize all r[x] to 0, except r[s] to 1;
In the main loop, change the key line to:
r[x] = last_r[x + A[k]] + last_r[x - A[k]]
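The counting variant can be sketched in Python like this. It is only an illustration, assuming A is a plain list applied in order; count_reachable is a hypothetical name.

```python
def count_reachable(n, s, A):
    """Return {value: number of valid paths} after applying every step in A."""
    ways = [0] * (n + 1)          # ways[x] for x in 1..n; index 0 is unused
    ways[s] = 1
    for step in A:
        prev = ways
        ways = [0] * (n + 1)
        for x in range(1, n + 1):
            # x is reached from x - step (by adding) or from x + step (by subtracting);
            # predecessors outside 1..n count as unreachable.
            for y in (x - step, x + step):
                if 1 <= y <= n:
                    ways[x] += prev[y]
    return {x: c for x, c in enumerate(ways) if c > 0}

print(count_reachable(3, 3, [1, 1, 1]))   # {2: 2}
```

Each step touches every x once, so the run time is O(nm) with O(n) extra space, matching the bound stated above.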

Why does this maximum product subarray algorithm work?

The problem is to find the contiguous subarray within an array (containing at least one number) which has the largest product.
For example, given the array [2,3,-2,4],
the contiguous subarray [2,3] has the largest product 6.
Why does the following work? Can anyone provide any insight on how to prove its correctness?
if (nums == null || nums.Length == 0)
{
    throw new ArgumentException("Invalid input");
}
int max = nums[0];
int min = nums[0];
int result = nums[0];
for (int i = 1; i < nums.Length; i++)
{
    int prev_max = max;
    int prev_min = min;
    max = Math.Max(nums[i], Math.Max(prev_max * nums[i], prev_min * nums[i]));
    min = Math.Min(nums[i], Math.Min(prev_max * nums[i], prev_min * nums[i]));
    result = Math.Max(result, max);
}
return result;
Start from the logic side to understand how to solve the problem. There are two relevant traits of each subarray to consider:
If it contains a 0, the product of the subarray is 0 as well.
If the subarray contains an odd number of negative values, its total value is negative as well, otherwise positive (or 0, counting 0 as a positive value).
Now we can start off with the algorithm itself:
Rule 1: zeros
Since a 0 zeros out the product of the subarray, the subarray of the solution mustn't contain a 0, unless the input contains only negative values and 0. This can be achieved quite simply, since max and min are both reset to 0 as soon as a 0 is encountered in the array:
max = Math.Max(0 , Math.Max(prev_max * 0 , prev_min * 0));
min = Math.Min(0 , Math.Min(prev_max * 0 , prev_min * 0));
Both will evaluate to 0, no matter what the input so far is.
arr:    1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 0
result: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
min:    1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
max:    1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 0
//non-zero values don't matter for Rule 1, so I just used 1
Rule 2: negative numbers
With Rule 1, we've already implicitly split the array into subarrays, such that a subarray consists of either a single 0 or multiple non-zero values. Now the task is to find the largest possible product inside such a subarray (I'll refer to it as the array from here on).
If the number of negative values in the array is even, the entire problem becomes pretty trivial: just multiply all values in the array, and the result is the maximum product of the array. For an odd number of negative values there are two possible cases:
The array contains only a single negative value: in that case either the subarray with all values with a smaller index than the negative value, or the subarray with all values with a larger index than the negative value, is the subarray with the maximum product.
The array contains at least 3 negative values: in that case we have to eliminate either the first negative number and all of its predecessors, or the last negative number and all of its successors.
Now let's have a look at the code:
max = Math.Max(nums[i] , Math.Max(prev_max * nums[i] , prev_min * nums[i]));
min = Math.Min(nums[i] , Math.Min(prev_max * nums[i] , prev_min * nums[i]));
Case 1: the evaluation of min is actually irrelevant, since the sign of the product of the array will only flip once, at the negative value. As soon as the negative number is encountered (= nums[i]), max will be nums[i], since both max and min are at least 1 and thus multiplication with nums[i] results in a number <= nums[i]. For the first number after the negative number, nums[i + 1], max will be nums[i + 1] again. Since the maximum found so far is kept in result (result = Math.Max(result, max);) after each step, this automatically produces the correct result for that array.
arr:    2 3  2 -4  4  5
result: 2 6 12 12 12 20
max:    2 6 12 -4  4 20
//Omitted min, since it's irrelevant here.
Case 2: Here min becomes relevant too. Before we encounter the first negative value, min is simply the current element, since every longer product ending there is at least as large. After we encounter the first negative element in the array, min turns negative. We continue to build both products (min and max), which swap roles each time a negative value is encountered, and keep updating result. When the last negative value of the array is encountered, result will hold the value of the subarray that eliminates the last negative value and its successors. After the last negative value, max will be the product of the subarray that eliminates the first negative value and its predecessors, and min becomes irrelevant. Now we simply continue to multiply max with the remaining values in the array and update result until the end of the array is reached.
arr:    2 3  -4   3  -2   5    -6   3
result: 2 6   6   6 144 720   720 720
min:    2 3 -24 -72  -6 -30 -4320 ...
max:    2 6  -4   3 144 720   180 540
//min becomes irrelevant after the last negative value
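To reproduce tables like the ones above, here is a small Python port of the loop that records (element, result, min, max) after every step. It is only a tracing aid; the function name trace is made up.

```python
def trace(nums):
    """Run the max/min product recurrence and record every intermediate state."""
    hi = lo = best = nums[0]
    rows = [(nums[0], best, lo, hi)]
    for x in nums[1:]:
        # The right-hand side is evaluated first, so both updates use the previous hi and lo.
        hi, lo = max(x, hi * x, lo * x), min(x, hi * x, lo * x)
        best = max(best, hi)
        rows.append((x, best, lo, hi))
    return rows

for row in trace([2, 3, -4, 3, -2, 5, -6, 3]):
    print(row)   # last row: (3, 720, -12960, 540)
```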
Putting the pieces together
Since min and max are reset every time we encounter a 0, we can easily reuse them for each subarray that doesn't contain a 0. Thus Rule 1 is applied implicitly without interfering with Rule 2. Since result isn't reset each time a new subarray is inspected, the value will be kept persistent over all runs. Thus this algorithm works.
Hope this is understandable (to be honest, I doubt it, and I will try to improve the answer if any questions come up). Sorry for the monstrous answer.
Let's assume the contiguous subarray that produces the maximal product is a[i], a[i+1], ..., a[j]. Since it is the subarray with the largest product, it is also the suffix of a[0], a[1], ..., a[j] that produces the largest product.
The idea of your algorithm is the following: for every prefix array a[0], ..., a[j], find the suffix with the largest product. Out of these suffixes, take the maximum.
At the beginning, the smallest and biggest suffix products are simply nums[0]. Then it iterates over all other numbers in the array. The largest suffix product is always built in one of three ways: it is just the last number nums[i], it is the largest suffix product of the shortened list multiplied by the last number (if nums[i] > 0), or it is the smallest (< 0) suffix product multiplied by the last number (if nums[i] < 0). (*)
Using the helper variable result, you store the maximal such suffix product found so far.
(*) This fact is quite easy to prove. Suppose a different case held, for instance some other suffix product of the shortened list gave a bigger number when multiplied by a positive nums[i]. Then that suffix product would itself be bigger than the largest one, which is a contradiction.
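The suffix claim can also be checked by brute force on small inputs. The sketch below is my own cross-check, not part of the answer: it compares the running max of the recurrence with the best suffix product computed directly, for every array of length 1 to 5 over the values -2..2. It needs Python 3.8+ for math.prod.

```python
from itertools import product as cartesian
from math import prod

def best_suffix_products(nums):
    """Largest product of a subarray ending at each index, via the recurrence."""
    hi = lo = nums[0]
    out = [hi]
    for x in nums[1:]:
        hi, lo = max(x, hi * x, lo * x), min(x, hi * x, lo * x)
        out.append(hi)
    return out

def brute_suffix_products(nums):
    """Same quantity, computed by trying every start index for each end index."""
    return [max(prod(nums[i:j + 1]) for i in range(j + 1))
            for j in range(len(nums))]

mismatches = [
    arr
    for n in range(1, 6)
    for arr in cartesian(range(-2, 3), repeat=n)
    if best_suffix_products(arr) != brute_suffix_products(arr)
]
print(len(mismatches))                          # 0
print(max(best_suffix_products([2, 3, -2, 4])))  # 6, the example from the question
```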

Number of ways of distributing n identical balls into groups such that each group has at least k balls?

I am trying to do this using recursion with memoization. I have identified the following base cases:
I) When n == k there is only one group, with all the balls.
II) When k > n no group can have at least k balls, hence zero.
I am unable to move forward from here. How can this be done?
As an illustration, when n = 6 and k = 2:
(2,2,2)
(4,2)
(3,3)
(6)
That is, 4 different groupings can be formed.
This can be represented by the two-dimensional recursive formula described below:
T(0, k) = 1
T(n, k) = 0                          if n < k, n != 0
T(n, k) = T(n-k, k) + T(n, k + 1)
             ^            ^
             |            no box with k balls, advance to next k
             there is a box with k balls, put them
In the above, T(n,k) is the number of distributions of n balls such that each box gets at least k.
The trick is to think of k as the lowest possible number of balls and to separate the problem into two scenarios: either there is a box with exactly k balls (if so, place them and recurse with n-k balls), or there is not (then recurse with a minimal value of k+1 and the same number of balls).
Example, to calculate your example: T(6,2) (6 balls, minimum 2 per box):
T(6,2) = T(4,2) + T(6,3)
T(4,2) = T(2,2) + T(4,3)
       = T(0,2) + T(2,3) + T(1,3) + T(4,4)
       = T(0,2) + T(2,3) + T(1,3) + T(0,4) + T(4,5)
       = 1 + 0 + 0 + 1 + 0
       = 2
T(6,3) = T(3,3) + T(6,4)
       = T(0,3) + T(3,4) + T(2,4) + T(6,5)
       = T(0,3) + T(3,4) + T(2,4) + T(1,5) + T(6,6)
       = T(0,3) + T(3,4) + T(2,4) + T(1,5) + T(0,6) + T(6,7)
       = 1 + 0 + 0 + 0 + 1 + 0
       = 2
T(6,2) = T(4,2) + T(6,3) = 2 + 2 = 4
Using Dynamic Programming, it can be calculated in O(n^2) time.
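Since the question asks for recursion with memoization, here is a minimal Python sketch of the recurrence above. It assumes k >= 1 (with k = 0 the recursion would never terminate) and uses functools.lru_cache as the memo table.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n, k):
    """Number of ways to split n identical balls into groups of at least k balls."""
    if n == 0:
        return 1            # nothing left to place: one valid distribution
    if n < k:
        return 0            # the remaining balls cannot fill a group of size k
    # Either one group has exactly k balls, or every group has at least k + 1.
    return T(n - k, k) + T(n, k + 1)

print(T(6, 2))   # 4: (2,2,2), (4,2), (3,3), (6)
```

There are at most about n * n distinct (n, k) states, which is where the O(n^2) bound comes from. For large n the recursion depth grows with n, so an iterative table may be preferable.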
This case can be solved quite simply:
Number of buckets
The maximum number of buckets b can be determined as follows:
b = roundDown(n / k)
Each valid distribution can use at most b buckets.
Number of distributions with x buckets
For a given number of buckets, the number of distributions can be found quite simply:
Distribute k balls to each bucket. Find the number of ways to distribute the remaining balls (r = n - k * x) to x buckets (stars and bars):
total_distributions(x) = bincoefficient(r + x - 1 , x - 1)
EDIT: this will only work if order matters. Since it doesn't for the question, we can use a few tricks here:
Each distribution can be mapped to a sequence of numbers, e.g. d = {d1 , d2 , ... , dx}. We can generate these sequences starting with the "first" sequence {r , 0 , ... , 0} and subsequently moving 1s from the left to the right. So the next sequence would look like this: {r - 1 , 1 , ... , 0}. If only sequences matching d1 >= d2 >= ... >= dx are generated, no duplicates will be generated. This constraint can be used to optimize the search a bit: we can only move a 1 from da to db (with a = b - 1) if da - 1 >= db + 1 holds, since otherwise the constraint that the array is sorted is violated. The 1s to move are always the rightmost ones that can be moved. Another way to think of this is to view r as a unary number and split that string into groups such that each group is at least as long as its successor.
countSequences(x)
    sequence[] // x entries, all 0
    sequence[0] = r
    sequenceCount = 1
    while true
        int i = findRightmostMoveable(sequence)
        if i == -1
            return sequenceCount
        sequence[i] -= 1
        sequence[i + 1] += 1 // the 1 taken from position i moves one step right
        sequenceCount += 1
findRightmostMoveable(sequence)
    for i in [length(sequence) - 1 , 0)
        if sequence[i - 1] > sequence[i] + 1
            return i - 1
    return -1
Actually findRightmostMoveable could be optimized a bit, if we look at the structure-transitions of the sequence (to be more precise the difference between two elements of the sequence). But to be honest I'm by far too lazy to optimize this further.
Putting the pieces together
range(1 , roundDown(n / k)).map(b -> countSequences(b)).sum()
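As a cross-check on the bucket-by-bucket idea, here is a Python sketch that swaps the sequence walk for a standard partition-counting DP. It relies on the fact that the number of ways to write r as a sum of at most x positive parts equals the number of ways to write it with every part at most x. Both function names are made up. I use the DP because, as far as I can tell, the greedy walk as written can skip sequences: for r = 6 and x = 3 it visits {6,0,0}, {5,1,0}, {4,2,0}, {4,1,1}, {3,2,1} and then stops, never reaching {3,3,0} or {2,2,2}.

```python
def partitions_at_most(r, x):
    """Number of non-increasing sequences of x non-negative integers summing to r."""
    # Coin-change style DP over part sizes 1..x (the conjugate view of "at most x parts").
    ways = [1] + [0] * r
    for part in range(1, x + 1):
        for total in range(part, r + 1):
            ways[total] += ways[total - part]
    return ways[r]

def count_groupings(n, k):
    """Sum over the bucket count x of the ways to spread the r = n - k*x spare balls."""
    return sum(partitions_at_most(n - k * x, x) for x in range(1, n // k + 1))

print(count_groupings(6, 2))      # 4
print(partitions_at_most(6, 3))   # 7
```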

How to find rows of a matrix with the same ordering of unique and duplicated elements, but not necessarily the same values

I wasn't quite sure how to phrase this question. Suppose I have the following matrix:
A=[1 0 0;
0 0 1;
0 1 0;
0 1 1;
0 1 2;
3 4 4]
Given row 1, I want to find all rows where:
the elements that are unique in row 1 are unique in the same column in the other row, but don't necessarily have the same value
and if there are elements with duplicate values in row 1, there are duplicate values in the same columns in the other row, but not necessarily with the same value
For example, in matrix A, if I were given row 1 I would like to find rows 4 and 6.
Can't test this right now, but I think the following will work:
A=[1 0 0;
   0 0 1;
   0 1 0;
   0 1 1;
   0 1 2;
   3 4 4];
B = zeros(size(A));
for ii = 1:size(A,1)
    r = A(ii,:);
    B(ii,1) = 1;
    for jj = 2:size(A,2)
        c = find(r(1:jj-1)==r(jj), 1); % first earlier column holding the same value
        if numel(c) > 0
            B(ii,jj) = B(ii,c);
        else
            B(ii,jj) = B(ii,jj-1)+1;
        end
    end
end
At the end of this we have an array B in which "like indices have like values" and the rows you are looking for are now identical.
Now you can do
[C, ia, ic] = unique(B,'rows','stable');
disp('The answer you want is ');
disp(find(ic == ic(1)));
Here ia holds the index of the first row of each distinct pattern and ic maps every row to its pattern, so find(ic == ic(1)) lists the rows that share the pattern of row 1 (including row 1 itself). See http://www.mathworks.com/help/matlab/ref/unique.html#btb0_8v . I am not 100% sure that you can use the rows and stable parameters in the same call - but I think you can.
Try it and see if it works - and ask questions if you need more info.
Here is a simple method
B = NaN(size(A)); %//Preallocation
for row = 1:size(A,1)
    [~,~,B(row,:)] = unique(A(row,:), 'stable');
end
find(ismember(B(2:end,:), B(1,:), 'rows')) + 1
A simple solution without loops:
row = 1; %// row used as reference
equal = bsxfun(@eq, A, permute(A, [1 3 2]));
equal = reshape(equal,size(A,1),[]); %// linearized signature of each row
result = find(ismember(equal,equal(row,:),'rows')); %// find matching rows
result = setdiff(result,row); %// remove reference row, if needed
The key is to compute a "signature" of each row, meaning the equality relationship between all combinations of its elements. This is done with bsxfun. Then, rows with the same signature can be easily found with ismember.
Thanks, Floris. The unique call didn't work correctly and I think you meant to use matrix B in it, too. Here's what I managed to do, although it's not as clean:
A=[1 0 0 1;
   0 0 1 3;
   0 1 0 1;
   0 1 1 0;
   0 1 2 2;
   3 4 4 3;
   5 9 9 4];
B = zeros(size(A));
for ii = 1:size(A,1)
    r = A(ii,:);
    B(ii,1) = 1;
    for jj = 2:size(A,2)
        c = find(r(1:jj-1)==r(jj), 1); % first earlier column holding the same value
        if numel(c) > 0
            B(ii,jj) = B(ii,c);
        else
            B(ii,jj) = max(B(ii,:))+1; % need max to generalize to more columns
        end
    end
end
match = zeros(size(A,1)-1,size(A,2));
for i=2:size(A,1)
    for j=1:size(A,2)
        if B(i,j) == B(1,j)
            match(i-1,j)=1;
        end
    end
end
index=find(sum(match,2)==size(A,2));
In the nested loops I check whether the elements of each row below the first match up in the correct column. If there is a perfect match, that row of match should sum to the number of columns.
When I generalize this for the specific problem I'm working on, the matrix is filled with a certain set of base size(A,2) numbers. So for base 4 and greater, a max statement is needed in the else branch for no matches. Otherwise, for certain number combinations in a given row, an element may be labeled as a duplicate when it is not.
An overview would be to reduce each row to a "signature" counting element repeats, i.e., your row 1 becomes 1, 2. Then check for equal signatures.
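For readers outside MATLAB, the signature idea can be sketched in a few lines of Python. This is an illustration only: it uses the first-occurrence labelling from the loop-based answers (not the repeat counts mentioned just above), indices are 0-based, and the names pattern and matching_rows are made up.

```python
def pattern(row):
    """Label each value by the order of its first appearance, e.g. [3, 4, 4] -> (1, 2, 2)."""
    labels = {}
    return tuple(labels.setdefault(v, len(labels) + 1) for v in row)

def matching_rows(A, ref=0):
    """0-based indices of the rows whose pattern equals that of row `ref`."""
    target = pattern(A[ref])
    return [r for r, row in enumerate(A) if r != ref and pattern(row) == target]

A = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0],
     [0, 1, 1],
     [0, 1, 2],
     [3, 4, 4]]
print([r + 1 for r in matching_rows(A)])   # [4, 6] in the question's 1-based numbering
```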
