From a loop index k, obtain pairs i,j with i < j? - algorithm

I need to traverse all pairs i,j with 0 <= i < n, 0 <= j < n and i < j for some positive integer n.
Problem is that I can only loop through another variable, say k. I can control the bounds of k. So the problem is to determine two arithmetic methods, f(k) and g(k) such that i=f(k) and j=g(k) traverse all admissible pairs as k traverses its consecutive values.
How can I do this in a simple way?

I think I got it (in Python):
def get_ij(n, k):
    j = k // (n - 1)  # // is integer (floor) division
    i = k - j * (n - 1)
    if i >= j:
        # fold this entry back into the upper triangle
        i = (n - 2) - i
        j = (n - 1) - j
    return i, j
for n in range(2, 6):
    print(n, sorted(get_ij(n, k) for k in range(n * (n - 1) // 2)))
It basically folds the matrix so that it's (almost) rectangular. By "almost" I mean that there could be some unused entries on the far right of the bottom row.
(Figures omitted: illustrations of how the folding works for n=4 and n=5.)
Now, iterating over the rectangle is easy, as is mapping from folded coordinates back to coordinates in the original triangular matrix.
Pros: uses simple integer math.
Cons: returns the tuples in a weird order.

I think I found another way, that gives the pairs in lexicographic order. Note that here i > j instead of i < j.
Basically the algorithm consists of the two expressions:
i = floor((1 + sqrt(1 + 8*k))/2)
j = k - i*(i - 1)/2
that give i,j as functions of k. Here k is a zero-based index.
Pros: Gives the pairs in lexicographic order.
Cons: Relies on floating-point arithmetic.
Rationale:
We want to achieve the mapping in the following table:
k -> (i,j)
0 -> (1,0)
1 -> (2,0)
2 -> (2,1)
3 -> (3,0)
4 -> (3,1)
5 -> (3,2)
....
We start by considering the inverse mapping (i,j) -> k. It isn't hard to realize that:
k = i*(i-1)/2 + j
Since j < i, it follows that the value of k corresponding to all pairs (i,j) with fixed i satisfies:
i*(i-1)/2 <= k < i*(i+1)/2
Therefore, given k, i=f(k) returns the largest integer i such that i*(i-1)/2 <= k. After some algebra:
i = f(k) = floor((1 + sqrt(1 + 8*k))/2)
After we have found the value i, j is trivially given by
j = k - i*(i-1)/2
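The floating-point dependence listed under "Cons" can be avoided with an exact integer square root. The following is a minimal sketch, assuming Python 3.8+ (for math.isqrt); the name unrank_pair is made up for this illustration. It relies on floor((1 + sqrt(x))/2) being equal to (1 + floor(sqrt(x))) // 2 for non-negative integers x.

```python
from math import isqrt

def unrank_pair(k):
    """Map a zero-based index k to the pair (i, j) with i > j >= 0."""
    i = (1 + isqrt(1 + 8 * k)) // 2   # largest i with i*(i-1)/2 <= k
    j = k - i * (i - 1) // 2          # offset of k within row i
    return i, j

pairs = [unrank_pair(k) for k in range(6)]
print(pairs)
```

To restrict the pairs to 0 <= j < i < n, let k run over range(n * (n - 1) // 2).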

I'm not sure I understand the question exactly, but to sum up: if 0 <= i < n and 0 <= j < n, then you want to traverse 0 <= k < n*n:
for (int k = 0; k < n*n; k++) {
    int i = k / n;
    int j = k % n;
    // ...
}
[edit] I just saw that i < j, so this solution is not optimal: fewer than n*n iterations are needed.

If we think of our solution in terms of a number triangle, where k is the sequence
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
...
Then j would be our (non zero-based) row number, that is, the greatest integer such that
j * (j - 1) / 2 < k
Solving for j:
j = ceiling ((sqrt (1 + 8 * k) - 1) / 2)
And i would be k's (zero-based) position in the row
i = k - j * (j - 1) / 2 - 1
The bounds for k are:
1 <= k <= n * (n - 1) / 2

Is it important that you actually have two arithmetic functions f(k) and g(k) doing this? Because you could first create a list such as
L = []
for i in range(n-1):
    for j in range(n):
        if j>i:
            L.append((i,j))
This will give you all the pairs you asked for. Your variable k can now just run along the index of the list. For example, if we take n=5,
for x in L:
    print(x)
gives us
(0,1), (0,2), (0,3), (0,4), (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
Suppose you have 2<=k<5 for example, then
for k in range(2, 5):
    print(L[k])
yields
(0,3), (0,4), (1,2)
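The same list can also be produced by the standard library. This is a small sketch, assuming that lexicographic order of the pairs is what is wanted; itertools.combinations yields the 2-element combinations of range(n) in exactly that order.

```python
from itertools import combinations

n = 5
# All pairs (i, j) with 0 <= i < j < n, in lexicographic order.
L = list(combinations(range(n), 2))
print(L[2:5])
```

Slicing the list replaces the explicit loop over k when only a range of indices is needed.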

Related

Count number of subsequences of A such that every element of the subsequence is divisible by its index (starts from 1)

B is a subsequence of A if and only if we can turn A to B by removing zero or more element(s).
A = [1,2,3,4]
B = [1,4] is a subsequence of A (just remove 2 and 3).
B = [4,1] is not a subsequence of A.
Count all subsequences B of A that satisfy this condition for every position i: B[i] % i = 0
Note that i starts from 1 not 0.
Example :
Input :
5
2 2 1 22 14
Output:
13
All of these 13 subsequences satisfy B[i]%i = 0 condition.
{2},{2,2},{2,22},{2,14},{2},{2,22},{2,14},{1},{1,22},{1,14},{22},{22,14},{14}
My attempt :
The only solution that I could come up with has O(n^2) complexity.
Assuming the maximum element in A is C, the following is an algorithm with time complexity O(n * sqrt(C)):
1. For every element x in A, find all divisors of x.
2. For every i from 1 to n, find every j such that A[j] is a multiple of i, using the result of step 1.
3. For every i from 1 to n and j such that A[j] is a multiple of i (using the result of step 2), find the number of B that has i elements and the last element is A[j] (dynamic programming).
def find_factors(x):
    """Returns all factors of x"""
    for i in range(1, int(x ** 0.5) + 1):
        if x % i == 0:
            yield i
            if i != x // i:
                yield x // i
def solve(a):
    """Returns the answer for a"""
    n = len(a)
    # b[i] contains every j such that a[j] is a multiple of i+1.
    b = [[] for i in range(n)]
    for i, x in enumerate(a):
        for factor in find_factors(x):
            if factor <= n:
                b[factor - 1].append(i)
    # There are dp[i][j] subsequences of A of length (i+1) ending at b[i][j]
    dp = [[] for i in range(n)]
    dp[0] = [1] * n
    for i in range(1, n):
        k = x = 0
        for j in b[i]:
            while k < len(b[i - 1]) and b[i - 1][k] < j:
                x += dp[i - 1][k]
                k += 1
            dp[i].append(x)
    return sum(sum(dpi) for dpi in dp)
For every divisor d of A[i] that is at most i+1, A[i] can be the dth element of each of the subsequences of length d-1 already counted (for d = 1, the empty subsequence is counted once).
JavaScript code:
// Returns the divisors of n that are <= max, in descending order.
function getDivisors(n, max){
  let m = 1;
  const left = [];
  const right = [];
  while (m*m <= n && m <= max){
    if (n % m == 0){
      left.push(m);
      const l = n / m;
      if (l != m && l <= max)
        right.push(l);
    }
    m += 1;
  }
  return right.concat(left.reverse());
}
function f(A){
  // dp[d] = number of valid subsequences of length d seen so far
  const dp = [1, ...new Array(A.length).fill(0)];
  let result = 0;
  for (let i=0; i<A.length; i++){
    // descending order, so dp[d-1] is still the value from before A[i]
    for (const d of getDivisors(A[i], i+1)){
      result += dp[d-1];
      dp[d] += dp[d-1];
    }
  }
  return result;
}
var A = [2, 2, 1, 22, 14];
console.log(JSON.stringify(A));
console.log(f(A));
I believe that, for the general case, we can't find an algorithm with provably lower complexity than O(n^2).
First, an intuitive explanation:
Let's indicate the elements of the array by a1, a2, a3, ..., a_n.
If the element a1 appears in a subarray, it must be element no. 1.
If the element a2 appears in a subarray, it can be element no. 1 or 2.
If the element a3 appears in a subarray, it can be element no. 1, 2 or 3.
...
If the element a_n appears in a subarray, it can be element no. 1, 2, 3, ..., n.
So to take all the possibilities into account, we have to perform the following tests:
Check if a1 is divisible by 1 (trivial, of course)
Check if a2 is divisible by 1 or 2
Check if a3 is divisible by 1, 2 or 3
...
Check if a_n is divisible by 1, 2, 3, ..., n
All in all we have to perform 1 + 2 + 3 + ... + n = n(n + 1) / 2 tests, which gives a complexity of O(n^2).
Note that the above is somewhat inaccurate, because not all the tests are strictly necessary. For example, if a_i is divisible by 2 and 3 then it must be divisible by 6. Nevertheless, I think this gives a good intuition.
Now for a more formal argument:
Define an array like so:
a1 = 1
a2 = 1× 2
a3 = 1× 2 × 3
...
a_n = 1 × 2 × 3 × ... × n
By the definition, every subarray is valid.
Now let (m, p) be such that m <= n and p <= n, and change a_m to a_m / p. We can now choose one of two paths:
If we restrict p to be prime, then each tuple (m, p) represents a mandatory test, because the corresponding change in the value of a_m changes the number of valid subarrays. But that requires prime factorization of each number between 1 and n. By the known methods, I don't think we can get here a complexity less than O(n^2).
If we omit the above restriction, then we clearly perform n(n - 1) / 2 tests, which gives a complexity of O(n^2).

Formulating dp problem [Codeforces 414 B]

Here is the problem statement from an old contest on Codeforces:
A sequence of l integers b_1, b_2, ..., b_l (1 ≤ b_1 ≤ b_2 ≤ ... ≤ b_l ≤ n) is called good if each number divides (without a remainder) the next number in the sequence. More formally, b_i divides b_{i+1} for all i (1 ≤ i ≤ l - 1).
Given n and k, find the number of good sequences of length k. As the answer can be rather large, print it modulo 1000000007 (10^9 + 7).
I have formulated my dp[i][j] as the number of good sequences of length i which ends with the jth number, and the transition table as the following pseudocode
dp[k][n] =
    for each factor of n as i do
        for j from 1 to k - 1
            dp[k][n] += dp[j][i]
        end
    end
But in the editorial it is given as
Lets define dp[i][j] as number of good sequences of length i that ends in j.
Let's denote divisors of j by x1, x2, ..., xl. Then dp[i][j] = sigma dp[i - 1][xr]
But in my understanding, we need two sigmas, one for the divisors and the other for length. Please help me correct my understanding.
My code ->
MOD = 10 ** 9 + 7
N, K = map(int, input().split())
dp = [[0 for _ in range(N + 1)] for _ in range(K + 1)]
for k in range(1, K + 1):
    for n in range(1, N + 1):
        c = 1
        for i in range(1, n):
            if n % i != 0:
                continue
            for j in range(1, k):
                c += dp[j][i]
        dp[k][n] = c
c = 0
for i in range(1, N + 1):
    c = (c + dp[K][i]) % MOD
print(c)
Link to the problem: https://codeforces.com/problemset/problem/414/B
So let's define dp[i][j] as the number of good sequences of length exactly i and which ends with a value j as its last element.
Now, dp[i][j] = Sum(dp[i-1][x]) for all x s.t. x is a divisor of j. Note that x can be equal to j itself.
This is true because if there is some sequence of length i-1 which we have already found that ends with some value x, then we can simply add j to its end and form a new sequence which satisfies all the conditions.
I guess your confusion is with the length. The thing is that since our current length is i, we can add j to the end of a sequence only if its length is i-1, we cannot iterate over other lengths.
Hope this is clear.
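The recurrence can be sketched in Python as follows. This is only an illustration under a few assumptions: the function name count_good_sequences is made up, a single row of the table is kept per length, and instead of enumerating the divisors of each j it pushes dp[x] forward to every multiple of x, which visits the same (divisor, multiple) pairs.

```python
MOD = 10 ** 9 + 7

def count_good_sequences(n, k):
    # dp[j] = number of good sequences of the current length that end in j
    dp = [0] + [1] * n                # length 1: one sequence per value 1..n
    for _ in range(k - 1):            # extend the length by one, k - 1 times
        nxt = [0] * (n + 1)
        for x in range(1, n + 1):
            # x divides every multiple j of x, so dp[x] contributes to nxt[j]
            for j in range(x, n + 1, x):
                nxt[j] = (nxt[j] + dp[x]) % MOD
        dp = nxt
    return sum(dp) % MOD

print(count_good_sequences(3, 2))
```

Only sequences of length i-1 feed the row for length i, so there is a single sum over divisors and no sum over lengths.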

DP solution to find if there is group of numbers which is divisible by M

Let's say we have a number N such that 0 < N <= 10^6, a number M such that 2 <= M <= 10^3, and an array of N elements a[1], a[2], ..., a[N] (0 <= a[i] <= 10^9).
Now we have to check whether we can choose a group of numbers from the array such that their sum is divisible by M, and output "YES" or "NO".
Here are two examples:
N = 3, M =5 a={1,2,3} answer="YES"
N = 4, M = 6 a={3,1,1,3} answer="YES"
thanks in advance.
C++ solution.
// declare dp array of boolean values of size M
bool dp[M] = {0}; // init with false values
for (int i = 0; i < N; i++) {
    bool ndp[M] = {0}; // init temporary boolean array
    ndp[a[i] % M] = 1; // add a subset with one a[i] element
    for (int j = 0; j < M; j++)
        if (dp[j]) { // if we may find a subset of elements with sum = j (modulo M)
            ndp[j] = 1; // copy existing values
            ndp[(j + a[i]) % M] = 1; // extend the subset with a[i], which will give a sum = j + a[i] (modulo M)
        }
    // copy from ndp to dp before proceeding to the next element of a
    for (int j = 0; j < M; j++) dp[j] = ndp[j];
}
// check dp[0] for the answer
The algorithm complexity will be O(N*M), which in your case is O(10^9).
Edit: Added ndp[a[i] % M] = 1; line in order to make dp[j] ever become nonzero.
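For comparison, here is a minimal Python sketch of the same O(N*M) table. The function name has_subset_divisible_by is made up for this illustration, and the early return as soon as remainder 0 becomes reachable is an addition that is not part of the C++ code.

```python
def has_subset_divisible_by(a, m):
    # reachable[j] is True if some non-empty subset seen so far sums to j (mod m)
    reachable = [False] * m
    for x in a:
        r = x % m
        new = reachable[:]            # subsets that do not use x
        new[r] = True                 # the subset consisting of x alone
        for j in range(m):
            if reachable[j]:
                new[(j + r) % m] = True   # extend an existing subset with x
        reachable = new
        if reachable[0]:              # assumption: stop early once the answer is known
            return True
    return reachable[0]

print("YES" if has_subset_divisible_by([1, 2, 3], 5) else "NO")
print("YES" if has_subset_divisible_by([3, 1, 1, 3], 6) else "NO")
```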
There might be another alternative O(M * M * log(M) + N) solution, which in your case is O(10^7) (but with a big constant).
Notice that if we substitute each a[i] with a[i] % M, the problem statement does not change. Let's count the number of a[i] elements that give a specific remainder j after division by M. If for some remainder j we found k elements in a, then we can generate the following sums of subsets (which may produce unique remainders):
j, 2 * j % M, 3 * j % M ... k * j % M
Example: let M = 6 and for remainder 2 we found 5 elements in a. Then we have the following unique sums of subsets:
2 % 6, 2 * 2 % 6, 3 * 2 % 6, 4 * 2 % 6, 5 * 2 % 6
which is 0, 2, 4
store this information in boolean form {1, 0, 1, 0, 1, 0}
At most we have M such groups that produce M-size bool array of possible remainders.
Next we need to find all possible subsets that may appear if we take elements of different groups. Let's say we merge two bool remainder arrays a and b by introducing a new array c that contains all possible remainder sums of elements from subsets of a and b. The naive approach requires two nested loops over a and b, giving O(M^2) merge time complexity.
We may reduce the complexity to O(M * log(M)) using the Fast Fourier Transform. Each bool array corresponds to a polynomial Σ a_i * x^i, where the coefficients a_i are taken from the bool array. If we want to merge two arrays, we may just multiply their polynomials.
The overall complexity is O(M^2 * log(M)), as we need to make M such merges.

Count number of subsequences with given k modulo sum

Given an array a of n integers, count how many subsequences (non-consecutive as well) have sum % k = 0:
1 <= k < 100
1 <= n <= 10^6
1 <= a[i] <= 1000
An O(n^2) solution is easily possible, however a faster way O(n log n) or O(n) is needed.
This is the subset sum problem.
A simple solution is this:
s = 0
dp[x] = how many subsequences we can build with sum x
dp[0] = 1, 0 elsewhere
for i = 1 to n:
    s += a[i]
    for j = s down to a[i]:
        dp[j] = dp[j] + dp[j - a[i]]
Then you can simply return the sum of all dp[x] such that x % k == 0. This has a high complexity though: about O(n*S), where S is the sum of all of your elements. The dp array must also have size S, which you probably can't even afford to declare for your constraints.
A better solution is to not iterate over sums larger than or equal to k in the first place. To do this, we will use 2 dp arrays:
dp1, dp2 = arrays of size k
dp1[0] = dp2[0] = 1, 0 elsewhere
for i = 1 to n:
    mod_elem = a[i] % k
    for j = 0 to k - 1:
        dp2[j] = dp2[j] + dp1[(j - mod_elem + k) % k]
    copy dp2 into dp1
return dp1[0]
This runs in O(n*k) time and uses O(k) memory.
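A runnable version of the two-array idea might look like the sketch below. The function name count_subsequences_mod_k is made up for this illustration, and the two arrays are replaced by building a fresh list on every step, which has the same effect.

```python
def count_subsequences_mod_k(a, k):
    # dp[j] = number of subsequences seen so far whose sum is j (mod k)
    dp = [0] * k
    dp[0] = 1                         # the empty subsequence
    for x in a:
        m = x % k
        # each subsequence either skips x (dp[j]) or takes it (dp[(j - m) % k])
        dp = [dp[j] + dp[(j - m) % k] for j in range(k)]
    return dp[0]

print(count_subsequences_mod_k([1, 4, 2, 3, 5, 6], 5))
```

The result includes the empty subsequence; subtract 1 if it should not be counted. Python integers do not overflow, but in a fixed-width language the counts would need a modulus or a big-integer type.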
There's an O(n + k^2 lg n)-time algorithm. Compute a histogram c(0), c(1), ..., c(k-1) of the input array mod k (i.e., there are c(r) elements that are r mod k). Then compute
\prod_{r=0}^{k-1} (1 + x^r)^{c(r)} \bmod (1 - x^k)
as follows, where the constant term of the reduced polynomial is the answer.
Rather than evaluate each factor with a fast exponentiation method and then multiply, we turn things inside out. If all c(r) are zero, then the answer is 1. Otherwise, recursively evaluate
P = \prod_{r=0}^{k-1} (1 + x^r)^{\lfloor c(r)/2 \rfloor} \bmod (1 - x^k)
and then compute
Q = \prod_{r=0}^{k-1} (1 + x^r)^{c(r) - 2 \lfloor c(r)/2 \rfloor} \bmod (1 - x^k),
in time O(k^2) for the latter computation by exploiting the sparsity of the factors. The result is P^2 Q mod (1 - x^k), computed in time O(k^2) via naive convolution.
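The halving recursion can be sketched in Python as below. The names count_by_polynomial, multiply, times_one_plus_x_r and evaluate are made up for this illustration; polynomials are stored as lists of k coefficients, so every product is automatically reduced mod (1 - x^k) by wrapping exponents around.

```python
def count_by_polynomial(a, k):
    counts = [0] * k                  # histogram c(0..k-1) of the input mod k
    for x in a:
        counts[x % k] += 1

    def multiply(p, q):
        # naive cyclic convolution: the product of p and q mod (1 - x^k)
        out = [0] * k
        for s, ps in enumerate(p):
            if ps:
                for t, qt in enumerate(q):
                    out[(s + t) % k] += ps * qt
        return out

    def times_one_plus_x_r(p, r):
        # multiply by the sparse factor (1 + x^r) in O(k)
        return [p[j] + p[(j - r) % k] for j in range(k)]

    def evaluate(c):
        if not any(c):
            return [1] + [0] * (k - 1)        # empty product
        p = evaluate([v // 2 for v in c])     # P, with every exponent halved
        result = multiply(p, p)               # P^2
        for r, v in enumerate(c):
            if v % 2:                         # Q contains (1 + x^r) iff c(r) is odd
                result = times_one_plus_x_r(result, r)
        return result

    return evaluate(counts)[0]                # constant term

print(count_by_polynomial([1, 4, 2, 3, 5, 6], 5))
```

The constant term counts the empty subsequence as well. The recursion depth is about lg(max c(r)), and each level costs O(k^2), which matches the stated bound if arithmetic on the counts is treated as constant time.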
Traverse a and count a[i] mod k; there ought to be k such counts.
Recurse and memoize over the partitions of k, 2*k, 3*k, etc. with parts less than or equal to k, where a part may repeat at most as many times as there are elements with that remainder, adding the products of the appropriate counts (a part used t times out of c available elements contributes C(c, t) choices).
For example, if k were 10, some of the partitions would be 1+2+7 and 1+2+3+4; but while memoizing, we would only need to calculate once how many pairs mod k in the array produce (1 + 2).
For example, k = 5, a = {1,4,2,3,5,6}:
counts of a[i] mod k: {1,2,1,1,1}
products of partitions of k:
5 => 1
4,1 => 2
3,2 => 1
3,1,1 => 1
products of partitions of 2 * k with parts <= k:
5,4,1 => 2
5,3,2 => 1
4,1,3,2 => 2
5,3,1,1 => 1
products of partitions of 3 * k with parts <= k:
5,4,1,3,2 => 2
answer = 13
{1,4} {4,6} {2,3} {5} {1,6,3}
{1,4,2,3} {1,4,5} {4,6,2,3} {4,6,5} {2,3,5} {1,6,3,5}
{1,4,2,3,5} {4,6,2,3,5}

How to find largest square of palindrome in a matrix

I am trying to solve a problem where I am given an n×n square matrix of characters and I want to find the size of the largest palindrome square in it. A palindrome square is a square in which all rows and all columns are palindromes.
For example:
Input
a g h j k
s d g d j
s e f e n
a d g d h
r y d g s
The output will be:
3
corresponding to the middle square. I am thinking of dynamic programming solution but unable to formulate the recurrence relation. I am thinking the dimensions should be a(i,j,k) where i, j are the bottom-right of rectangle and k be the size of palindrome square.
Can someone help me with the recurrence relation for this problem?
EDIT:
n<500, so I believe that I can't go beyond O(n^3).
Assume that you can solve the following subproblem:
For each cell (i, j) and each length k, is there a palindrome of length k ending at (i, j), horizontally and vertically?
Hint for above problem:
boolean[][][] palindrome; // palindrome[i][j][k]: is there a horizontal palindrome of length k ending at (i, j)?
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        palindrome[i][j][0] = true;
        palindrome[i][j][1] = true;
        // a palindrome ending at column j has length at most j + 1
        for (int k = 2; k <= j + 1; k++)
            if (data[i][j - k + 1] == data[i][j] && palindrome[i][j - 1][k - 2])
                palindrome[i][j][k] = true;
    }
}
So, we can create two three-dimensional arrays, int[n][n][n] col and int[n][n][n] row.
For each cell (i, j) and each length k, row[i][j][k] counts the consecutive cells (i, j), (i - 1, j), ... going up column j at which a horizontal palindrome of length k ends, and col[i][j][k] counts the consecutive cells (i, j), (i, j - 1), ... going left along row i at which a vertical palindrome of length k ends:
for (int k = 1; k <= n; k++) {
    if (there is a palindrome of length k horizontally, ending at cell (i, j))
        row[i][j][k] = 1 + row[i - 1][j][k];
    if (there is a palindrome of length k vertically, ending at cell (i, j))
        col[i][j][k] = 1 + col[i][j - 1][k];
}
Finally, if row[i][j][k] >= k && col[i][j][k] >= k, there is a square palindrome of size k ending at (i, j).
In total, the time complexity will be O(n^3)
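The following Python sketch puts these pieces together. It is an illustration under some assumptions: the function name largest_palindrome_square is made up, the grid is a list of equal-length strings (or lists of characters), and instead of full three-dimensional arrays it loops over k on the outside and keeps only the palindrome tables for lengths k-2 and k-1, which lowers memory to O(n^2) without changing the O(n^3) running time.

```python
def largest_palindrome_square(grid):
    n = len(grid)
    if n == 0:
        return 0
    all_true = [[True] * n for _ in range(n)]
    # h2/h1 (v2/v1): horizontal (vertical) palindrome tables for lengths k-2 and k-1.
    # Lengths 0 and 1 are always palindromes.
    h2, h1 = all_true, all_true
    v2, v1 = all_true, all_true
    best = 1                           # any single cell is a palindrome square
    for k in range(2, n + 1):
        h = [[False] * n for _ in range(n)]
        v = [[False] * n for _ in range(n)]
        row_run = [[0] * n for _ in range(n)]
        col_run = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                # horizontal palindrome of length k ending at (i, j)
                if j >= k - 1 and grid[i][j - k + 1] == grid[i][j] and h2[i][j - 1]:
                    h[i][j] = True
                    row_run[i][j] = 1 + (row_run[i - 1][j] if i > 0 else 0)
                # vertical palindrome of length k ending at (i, j)
                if i >= k - 1 and grid[i - k + 1][j] == grid[i][j] and v2[i - 1][j]:
                    v[i][j] = True
                    col_run[i][j] = 1 + (col_run[i][j - 1] if j > 0 else 0)
                if row_run[i][j] >= k and col_run[i][j] >= k:
                    best = k
        h2, h1 = h1, h
        v2, v1 = v1, v
    return best

example = ["aghjk", "sdgdj", "sefen", "adgdh", "rydgs"]
print(largest_palindrome_square(example))
```

For n close to 500, pure Python will be slow; the sketch is meant to show the recurrence, not to meet a time limit.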
Let's start with the complexity of validating a palindrome:
A palindrome can be identified in O(k), where k is the length of the palindrome.
You then need to do that test 2k times, once for each row and column in your inner square (using the length of the palindrome, k, as the dimension).
So now you have k * 2k -> O(2k^2) -> O(k^2).
Then you want to extend the search space to the whole n×n data set; this is where a second variable gets introduced.
You will need to iterate over columns 1 to (n-k) and rows 1 to (n-k) in a nested loop.
So now you have (n-k)^2 * O(k^2) -> O(n^2 * k^2).
Note: this problem is dependent on more than one variable.
This is the same approach I suggest you take to coding the solution: start small and get bigger.
I'm sure there is probably a better way, and I'm pretty sure my logic is correct, but it is not tested, so take it at face value.
Just to make the example easy, I'm going to say that the top-left corner has coordinates (1,1):
  1 2 3 4 5 6 7 8
1 a b c d e f g f
2 d e f g h j k q
3 a b g d z f g f
4 a a a a a a a a
i.e. (1,1) = a, (1,5) = e and (2,1) = d
Now, instead of checking every column, you could start by checking every kth column,
i.e. when k=3:
1) Create a 2D boolean array the size of the character table, with all results TRUE.
2) Start by checking column 3, cfg, which is not a palindrome, so columns 1 and 2 no longer need to be tested.
3) Because the palindrome test failed, mark the corresponding result in the 2D array, (1,3), as FALSE (we know not to test any range that uses this position, as it is not a palindrome).
4) Next check column 6, fjf, which is a palindrome, so go back and test column 5: ehz is not a palindrome.
5) Set (1,5) = FALSE.
6) Then test column 8, then 7.
NOTE: You have only had to test 5 of the 8 columns.
Since there were k columns in a row that were palindromes, now test the corresponding rows. Start from the bottom row, in this case 3, as it will eliminate the most other checks if it fails.
7) Check the row starting at (3,6): fgf is a palindrome.
8) Check the row starting at (2,6): jkq is not a palindrome.
9) Set (2,6) = FALSE.
10) Check the column starting at (2,3): fga is not a palindrome.
11) Set (2,3) = FALSE.
No need to test any more for row 2, as both (2,3) and (2,6) are FALSE.
Hopefully you can make sense of that.
Note: you would probably start this at k = n and decrement k until you find a result.
