Finding the longest consecutive sequence of integers that appears in both integers - algorithm

I am given two integers X and Y. The goal is to find the longest run of consecutive digits that appears in both X and Y. So, if X = 124534891 and Y = 324534768, then the output would be 24534, since it appears in both 124534891 and 324534768. The integers can be of different lengths.
I am trying to design a dynamic programming solution, but I'm completely lost.

This is a variation of the longest common substring problem.
Treat the numbers as strings and apply the same algorithm as for the longest common substring problem.
Here's pseudocode to get started:
function maxConsecutiveSequence(A, B):
    S[1..N] = toString(A)
    T[1..M] = toString(B)
    L = array(N, M)
    len = 0
    ans = {}
    for i = 1 to N
        for j = 1 to M
            if S[i] = T[j]
                if i = 1 or j = 1
                    L[i][j] = 1
                else
                    L[i][j] = L[i - 1][j - 1] + 1
                if L[i][j] > len
                    len = L[i][j]
                    ans = {S[i - len + 1..i]}
                else if L[i][j] = len
                    ans = ans ∪ {S[i - len + 1..i]}
            else
                L[i][j] = 0
    return ans
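The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the answerer's own code: the function name `longest_common_digit_runs` is made up for this example, and it keeps only two rows of the table instead of the full N x M array.

```python
def longest_common_digit_runs(x, y):
    """Return the set of all longest digit runs shared by x and y."""
    s, t = str(abs(x)), str(abs(y))
    # prev[j] = length of the longest common suffix of s[:i-1] and t[:j]
    prev = [0] * (len(t) + 1)
    best_len = 0
    best = set()
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len = cur[j]
                    best = {s[i - best_len:i]}
                elif cur[j] == best_len:
                    best.add(s[i - best_len:i])
        prev = cur
    return best


print(longest_common_digit_runs(124534891, 324534768))  # {'24534'}
```

Returning a set mirrors the `ans` set in the pseudocode, so ties between equally long runs are all reported.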

Related

Parallelize dynamic programming solution for longest common subsequence (LCS)

Below is a standard dynamic programming solution for finding the length of the longest common subsequence (LCS) among three strings:
procedure THREECOUNT(A, B, C, m, n, o)
    Where A is an array of characters of size m
    Where B is an array of characters of size n
    Where C is an array of characters of size o
    Let V be a zero-initialized three-dimensional array of size (m + 1) * (n + 1) * (o + 1).
    Let maxValue be an integer initialized to 0.
    for i ← 0 to m do
        V[i][0][0] = 0
    for j ← 0 to n do
        V[0][j][0] = 0
    for k ← 0 to o do
        V[0][0][k] = 0
    for i ← 1 to m do
        for j ← 1 to n do
            for k ← 1 to o do
                if A[i - 1] == B[j - 1] and B[j - 1] == C[k - 1]
                    V[i][j][k] = V[i - 1][j - 1][k - 1] + 1
                else
                    V[i][j][k] = max(V[i - 1][j][k], V[i][j - 1][k], V[i][j][k - 1])
                if V[i][j][k] > maxValue
                    maxValue = V[i][j][k]
    return maxValue, which is an integer
Since we only care about the length of the LCS, we can traverse the diagonals of the 3D DP table. But I am stuck on how to parallelize this algorithm. The solution claims that parallelization can achieve a span of O(min(m,n,o) log(max(m,n,o))) while remaining work-optimal, i.e. O(mno) work, and I have no idea how to achieve that. Could somebody point out how to parallelize this DP solution?
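For reference, here is a sequential Python sketch of the same recurrence. It is only a baseline for checking results and does not attempt the parallelization being asked about; the name `lcs3_length` is an assumption of this example.

```python
def lcs3_length(a, b, c):
    """Length of the longest common subsequence of three strings."""
    m, n, o = len(a), len(b), len(c)
    # v[i][j][k] = LCS length of a[:i], b[:j], c[:k]; index 0 is the empty prefix.
    v = [[[0] * (o + 1) for _ in range(n + 1)] for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            for k in range(1, o + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    v[i][j][k] = v[i - 1][j - 1][k - 1] + 1
                else:
                    v[i][j][k] = max(v[i - 1][j][k],
                                     v[i][j - 1][k],
                                     v[i][j][k - 1])
    return v[m][n][o]


print(lcs3_length("abcd1e2", "bc12ea", "bd1ea"))  # 3
```

Because the table values never decrease along any axis, the final cell already holds the maximum, so no separate running maximum is tracked here.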

Count number of subsequences of A such that every element of the subsequence is divisible by its index (starts from 1)

B is a subsequence of A if and only if we can turn A to B by removing zero or more element(s).
A = [1,2,3,4]
B = [1,4] is a subsequence of A (just remove 2 and 3).
B = [4,1] is not a subsequence of A.
Count all subsequences B of A that satisfy this condition: B[i] % i = 0 for every i.
Note that i starts from 1 not 0.
Example :
Input :
5
2 2 1 22 14
Output:
13
All of these 13 subsequences satisfy B[i]%i = 0 condition.
{2},{2,2},{2,22},{2,14},{2},{2,22},{2,14},{1},{1,22},{1,14},{22},{22,14},{14}
My attempt :
The only solution that I could come up with has O(n^2) complexity.
Assuming the maximum element in A is C, the following is an algorithm with time complexity O(n * sqrt(C)):
1. For every element x in A, find all divisors of x.
2. For every i from 1 to n, find every j such that A[j] is a multiple of i, using the result of step 1.
3. For every i from 1 to n and every j such that A[j] is a multiple of i (using the result of step 2), count the subsequences B that have i elements and whose last element is A[j] (dynamic programming).
def find_factors(x):
    """Returns all factors of x"""
    for i in range(1, int(x ** 0.5) + 1):
        if x % i == 0:
            yield i
            if i != x // i:
                yield x // i
def solve(a):
    """Returns the answer for a"""
    n = len(a)
    # b[i] contains every j such that a[j] is a multiple of i+1.
    b = [[] for i in range(n)]
    for i, x in enumerate(a):
        for factor in find_factors(x):
            if factor <= n:
                b[factor - 1].append(i)
    # There are dp[i][j] subsequences of A of length (i+1) ending at index b[i][j]
    dp = [[] for i in range(n)]
    dp[0] = [1] * n
    for i in range(1, n):
        k = x = 0
        for j in b[i]:
            while k < len(b[i - 1]) and b[i - 1][k] < j:
                x += dp[i - 1][k]
                k += 1
            dp[i].append(x)
    return sum(sum(dpi) for dpi in dp)
For every divisor d of A[i], where d is greater than 1 and at most i+1, A[i] can be the dth element of the number of subsequences already counted for d-1.
JavaScript code:
function getDivisors(n, max){
  let m = 1;
  const left = [];
  const right = [];
  while (m*m <= n && m <= max){
    if (n % m == 0){
      left.push(m);
      const l = n / m;
      if (l != m && l <= max)
        right.push(l);
    }
    m += 1;
  }
  // Divisors in descending order, so dp[d-1] is read before this element updates it.
  return right.concat(left.reverse());
}
function f(A){
  const dp = [1, ...new Array(A.length).fill(0)];
  let result = 0;
  for (let i=0; i<A.length; i++){
    for (const d of getDivisors(A[i], i+1)){
      result += dp[d-1];
      dp[d] += dp[d-1];
    }
  }
  return result;
}
var A = [2, 2, 1, 22, 14];
console.log(JSON.stringify(A));
console.log(f(A));
I believe that for the general case we provably can't find an algorithm with complexity less than O(n^2).
First, an intuitive explanation:
Let's denote the elements of the array by a1, a2, a3, ..., a_n.
If the element a1 appears in a subsequence, it must be element no. 1.
If the element a2 appears in a subsequence, it can be element no. 1 or 2.
If the element a3 appears in a subsequence, it can be element no. 1, 2 or 3.
...
If the element a_n appears in a subsequence, it can be element no. 1, 2, 3, ..., n.
So to take all the possibilities into account, we have to perform the following tests:
Check if a1 is divisible by 1 (trivial, of course)
Check if a2 is divisible by 1 or 2
Check if a3 is divisible by 1, 2 or 3
...
Check if a_n is divisible by 1, 2, 3, ..., n
All in all we have to perform 1 + 2 + 3 + ... + n = n(n + 1) / 2 tests, which gives a complexity of O(n^2).
Note that the above is somewhat inaccurate, because not all the tests are strictly necessary. For example, if a_i is divisible by 2 and 3 then it must be divisible by 6. Nevertheless, I think this gives a good intuition.
Now for a more formal argument:
Define an array like so:
a1 = 1
a2 = 1× 2
a3 = 1× 2 × 3
...
a_n = 1 × 2 × 3 × ... × n
By the definition, every subsequence is valid.
Now let (m, p) be such that m <= n and p <= n, and change a_m to a_m / p. We can now choose one of two paths:
If we restrict p to be prime, then each tuple (m, p) represents a mandatory test, because the corresponding change in the value of a_m changes the number of valid subsequences. But that requires prime factorization of each number between 1 and n. By the known methods, I don't think we can get a complexity below O(n^2) here.
If we omit the above restriction, then we clearly perform n(n + 1) / 2 tests, which gives a complexity of O(n^2).

Formulating dp problem [Codeforces 414 B]

Here is the problem statement from an old contest on Codeforces:
A sequence of l integers b_1, b_2, ..., b_l (1 ≤ b_1 ≤ b_2 ≤ ... ≤ b_l ≤ n) is called good if each number divides (without a remainder) the next number in the sequence. More formally, b_i | b_{i+1} for all i (1 ≤ i ≤ l - 1).
Given n and k, find the number of good sequences of length k. As the answer can be rather large, print it modulo 1000000007 (10^9 + 7).
I have formulated my dp[i][j] as the number of good sequences of length i which ends with the jth number, and the transition table as the following pseudocode
dp[k][n] =
    for each factor of n as i do
        for j from 1 to k - 1
            dp[k][n] += dp[j][i]
        end
    end
But in the editorial it is given as
Lets define dp[i][j] as number of good sequences of length i that ends in j.
Let's denote divisors of j by x1, x2, ..., xl. Then dp[i][j] = sigma dp[i - 1][xr]
But in my understanding, we need two sigmas, one for the divisors and the other for length. Please help me correct my understanding.
My code ->
MOD = 10 ** 9 + 7
N, K = map(int, input().split())
dp = [[0 for _ in range(N + 1)] for _ in range(K + 1)]
for k in range(1, K + 1):
    for n in range(1, N + 1):
        c = 1
        for i in range(1, n):
            if n % i != 0:
                continue
            for j in range(1, k):
                c += dp[j][i]
        dp[k][n] = c
c = 0
for i in range(1, N + 1):
    c = (c + dp[K][i]) % MOD
print(c)
Link to the problem: https://codeforces.com/problemset/problem/414/B
So let's define dp[i][j] as the number of good sequences of length exactly i that end with the value j as their last element.
Now, dp[i][j] = Sum(dp[i-1][x]) for all x such that x is a divisor of j. Note that x can be equal to j itself.
This is true because if there is some sequence of length i-1 which we have already found that ends with some value x, then we can simply add j to its end and form a new sequence which satisfies all the conditions.
I guess your confusion is with the length. Since our current length is i, we can add j to the end of a sequence only if its length is exactly i-1, so we must not iterate over other lengths.
Hope this is clear.
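A minimal Python sketch of this recurrence follows. The function name `count_good_sequences` is made up for illustration. Instead of enumerating the divisors of each j, it pushes every dp[x] forward to the multiples of x, which computes the same sum.

```python
MOD = 10**9 + 7


def count_good_sequences(n, k):
    """Number of good sequences of length k with values in 1..n, modulo MOD."""
    # dp[j] = number of good sequences of the current length that end in j.
    dp = [0] + [1] * n  # length 1: exactly one sequence per end value
    for _ in range(k - 1):
        nxt = [0] * (n + 1)
        for x in range(1, n + 1):
            # Appending any multiple j of x extends a sequence ending in x.
            for j in range(x, n + 1, x):
                nxt[j] = (nxt[j] + dp[x]) % MOD
        dp = nxt
    return sum(dp) % MOD


print(count_good_sequences(3, 2))  # 5
```

Only the previous length is kept, which reflects the point above: a sequence of length i is always built from one of length exactly i-1.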

Optimization of profit in Ruby

I'm using Ruby, but for the purposes of this problem it doesn't really matter.
Let's say I have two different kinds of resources, the quantities of which are denoted by a and b. I can allocate d new resources, and since a and b are of equal cost and equal value to production, I can choose to allocate resources in whatever way is most profitable.
This might best be explained like so: (a + j) * (b + k) = c, where j + k = d. I want to maximize c by the best allocation of resources, with the understanding that the cost of the two different types of resources and their value to production are the same. All variables are non-negative integers, with a and b being greater than 0. Here's my naive brute force method:
def max_alloc(a, b, d)
  max_j = 0
  max_k = 0
  max_c = 0
  (0..d).each do |j|
    k = d - j
    c = (a + j) * (b + k)
    if c > max_c
      max_c = c
      max_j = j
      max_k = k
    end
  end
  [max_j, max_k]
end
I'm hoping that there's some sort of mathematical or algorithmic "trick" I'm missing that will keep me from having to resort to brute force.
Do you really need an algorithm to do that?
This is a simple maximum/minimum optimization problem.
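Now consider the equation
c = (a + j) * (b + k), with k = d - j
It is a function of j, so let's call it f(j):
f(j) = (a + j) * (b + d - j) = -j**2 + (b + d - a)*j + a*(b + d)
You want to find j such that c = f(j) is maximum... so you want to study the sign of its derivative
f'(j) = -2*j + (b + d - a)
Now you can draw the table of signs
f'(j) > 0 for j < (b + d - a)/2, f'(j) = 0 for j = (b + d - a)/2, f'(j) < 0 for j > (b + d - a)/2
There you have it! A maximum for
j* = (b + d - a)/2
this means the j, k pair you are looking for is
j* = (b + d - a)/2, k* = d - j* = (a + d - b)/2
and for such values you'll have the maximum value of c:
c* = (a + j*) * (b + k*) = ((a + b + d)/2)**2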
In Ruby
def max_alloc(a, b, d)
  j = (-a + b + d) / 2.0
  j = j.round # round to prevent Float, explained at the end of this answer
  if j < 0
    j = 0 # prevent from negative values
  elsif j > d
    j = d # prevent from values greater than d
  end
  [j, d - j]
end
Or even shorter
def max_alloc(a, b, d)
  j = ((-a + b + d) / 2.0).round
  j = [0, [j, d].min].max # make sure j is in range 0..d
  [j, d - j]
end
A one-liner too if you like
def max_alloc(a, b, d)
  [[0, [((-a + b + d) / 2.0).round, d].min].max, d - [0, [((-a + b + d) / 2.0).round, d].min].max]
end
An in-depth look at the cases j < 0 and j > d
Let's start from the bounds that j must satisfy:
0 <= j <= d
So j* is the value of j within these bounds that maximizes f(j).
Now, since f(j) is always a parabola opened downward, the absolute maximum will be its vertex, so, as discovered before:
j* = (b + d - a)/2
But what if this point is outside the given range for j? You'll have to decide whether to choose j* = 0, k* = d or j* = d, k* = 0.
Since f(j) is strictly increasing for j < j* and strictly decreasing for j > j*, the closer you get to j*, the greater the value of f(j).
Therefore, if j* > d then choose j = d, and if j* < 0 then choose j = 0.
Here I show some plots just to see this in action:
Why j.round?
As you just learned f(j) is a parabola, and parabolas have an axis of symmetry. If j* is an integer, you are done, otherwise for what integer value is f(j) maximized? Well... for the integer value closest to the vertex; i.e., j.round.
Note: If a, b and d are integers, then j* can only be an integer or xxx.5. So f(j) would be the same for j.ceil and j.floor... You choose.
For given constants a and b, let
f(j,k) = (a + j) * (b + k)
We wish to maximize f(j,k) subject to three requirements:
j + k = d, for a given constant d
j >= 0
k >= 0
We can substitute out k (or j) by replacing k with
k = d - j
This changes f to:
f(j) = (a + j) * (b + d - j)
= a*(b + d) + (b + d - a)*j -j**2
The problem is now to maximize f subject to:
0 <= j <= d
The second part of this inequality follows from k = d - j >= 0. If d = 0, j = k = 0 is the only solution that satisfies the requirement that the variables are non-negative. If d < 0 there is no feasible solution. These two cases should be checked for but I will assume d > 0.
We first set the derivative of f to zero and solve for j to determine where f's slope is zero:
f'(j) = b + d - a - 2*j = 0
so
j* = (b + d - a)/2
As the second derivative of f is
f''(j) = -2 < 0
we know f is concave, so j* is a maximum (rather than a minimum, were it convex). Convex and concave functions are shown here (see footnote 1):
Consider the graph of the concave function. Values of j are on the horizontal axis. Since j* must be between 0 and d to be feasible (both variables have non-negative values), let the points a, c and b on the graph equal 0, j* and d, respectively.
There are three possibilities:
0 <= j* <= d, in which case that is a feasible solution (since k = d - j* >= 0).
j* < 0, in which case the largest feasible value of f is where j = 0.
j* > d, in which case the largest feasible value of f is where j = d.
Once the optimum value of j has been determined, k = d - j
Here are some examples.
Ex. 1: a = 2, b = 3, d = 5
j* = (b + d - a)/2 = (3 + 5 - 2)/2 = 3
Since 0 <= 3 <= 5, j = 3, k = 5 - 3 = 2 are the optimum values of j and k and f(3) = 25 is the optimum value.
Ex. 2: a = 6, b = 1, d = 3
j* = (b + d - a)/2 = (1 + 3 - 6)/2 = -1
Since -1 < 0, j = 0, k = 3 - 0 = 3 are the optimum values of j and k and f(0) = 24 is the optimum value.
Ex. 3: a = 2, b = 7, d = 3
j* = (b + d - a)/2 = (7 + 3 - 2)/2 = 4
Since 4 > 3, j = 3, k = 3 - 3 = 0 are the optimum values of j and k and f(3) = 35 is the optimum value.
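The three cases and the examples can be checked with a short Python sketch. It assumes a, b and d are integers with d >= 0, and it uses the floor of the vertex: when (b + d - a) is odd, the floor and the ceiling give the same value of f, so either is optimal. The function name mirrors the Ruby `max_alloc` but this is an illustration, not a port of any answer's code.

```python
def max_alloc(a, b, d):
    """Return (j, k) with j + k = d that maximizes (a + j) * (b + k)."""
    j = (b + d - a) // 2   # floor of the vertex j* = (b + d - a) / 2
    j = max(0, min(d, j))  # clamp into the feasible range 0..d
    return j, d - j


for a, b, d in [(2, 3, 5), (6, 1, 3), (2, 7, 3)]:
    j, k = max_alloc(a, b, d)
    print((a, b, d), "->", (j, k), "c =", (a + j) * (b + k))
```

Integer floor division is used instead of rounding a float so that the result does not depend on how a value ending in .5 happens to be rounded.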
If j and k must be integer-valued at the maximum value of f, we can assume a, b and d are integer-valued. (If a and b are not, a can be rounded up and b rounded down.) Let j* now be the value of j satisfying 0 <= j <= d for which f(j) is a maximum (but j* is not necessarily an integer). Because f is concave, if j* is not an integer, the optimal value of j is j*.floor if f(j*.floor) >= f(j*.ceil) and j*.ceil otherwise.
1. A function f is concave if, for all a and b, a < b, and all x, a <= x <= b, f(x) >= g(x), where g is the linear function having the property that g(a) = f(a) and g(b) = f(b).

Double Squares: counting numbers which are sums of two perfect squares

Source: Facebook Hacker Cup Qualification Round 2011
A double-square number is an integer X which can be expressed as the sum of two perfect squares. For example, 10 is a double-square because 10 = 3^2 + 1^2. Given X, how can we determine the number of ways in which it can be written as the sum of two squares? For example, 10 can only be written as 3^2 + 1^2 (we don't count 1^2 + 3^2 as being different). On the other hand, 25 can be written as 5^2 + 0^2 or as 4^2 + 3^2.
You need to solve this problem for 0 ≤ X ≤ 2,147,483,647.
Examples:
10 => 1
25 => 2
3 => 0
0 => 1
1 => 1
Factor the number n, and check if it has a prime factor p with odd valuation, such that p = 3 (mod 4). It does if and only if n is not a sum of two squares.
The number of solutions has a closed form expression involving the number of divisors of n. See this, Theorem 3 for a precise statement.
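As a rough illustration of the factorization criterion, here is a Python sketch that factors n by trial division and rejects it if some prime p = 3 (mod 4) appears with an odd exponent. The function name is an assumption of this example, and trial division is just the simplest factoring method that works within the stated limit on X.

```python
def is_sum_of_two_squares(n):
    """True if n >= 0 can be written as x^2 + y^2 with integers x, y."""
    if n == 0:
        return True  # 0 = 0^2 + 0^2
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What remains is either 1 or a single prime with exponent 1.
    return n % 4 != 3


print([x for x in range(30) if is_sum_of_two_squares(x)])
```

This only answers whether a representation exists; counting the representations needs the divisor-based formula or one of the search loops.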
Here is my simple answer with O(sqrt(n)) complexity:
x^2 + y^2 = n
x^2 = n - y^2
x = sqrt(n - y^2)
x should be an integer, so (n - y^2) should be a perfect square. Loop over y in [0, sqrt(n)] and check whether (n - y^2) is a perfect square.
Pseudocode:
count = 0;
for y in range(0, floor(sqrt(n)) + 1)
    if (isPerfectSquare(n - y^2))
        count++
// each pair is found twice, as (x, y) and (y, x), except when x = y, so round up
return (count + 1) / 2   // integer division
Here's a much simpler solution:
create list of squares in the given range (that's 46340 values for the example given)
for each square value x
if list contains a value y such that x + y = target value (i.e. does [target - x] exist in list)
output √x, √y as solution (roots can be stored in a std::map lookup created in the first step)
Looping through all pairs (a, b) is infeasible given the constraints on X. There is a faster way though!
For fixed a, we can work out b: b = √(X - a^2). b won't always be an integer though, so we have to check this. Due to precision issues, perform the check with a small tolerance: if b is x.99999, we can be fairly certain it's an integer. So we loop through all possible values of a and count all cases where b is an integer. We need to be careful not to double-count, so we place the constraint that a <= b. For X = a^2 + b^2, a will be at most √(X/2) with this constraint.
Here is an implementation of this algorithm in C++:
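// X is the input value. EPS is not defined in the original snippet;
// 1e-7 is an assumed tolerance.
const double EPS = 1e-7;
int count = 0;
// add EPS to avoid flooring x.99999 to x
for (int a = 0; a <= sqrt(X/2) + EPS; a++) {
    int b2 = X - a*a; // b^2
    int b = (int) (sqrt(b2) + EPS);
    if (fabs(b - sqrt(b2)) < EPS) // check b is an integer (fabs: floating-point absolute value)
        count++;
}
cout << count << endl;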
See it on ideone with sample input
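The same loop can be written without any floating-point tolerance. The following Python sketch is an alternative to the EPS check, not the answerer's method: it uses the standard library's `math.isqrt` (Python 3.8+) for an exact integer square root, and the function name is made up for this example.

```python
import math


def count_double_squares(x):
    """Count pairs (a, b) with 0 <= a <= b and a^2 + b^2 == x."""
    count = 0
    a = 0
    while 2 * a * a <= x:      # a <= b is equivalent to a^2 <= x / 2
        b2 = x - a * a
        b = math.isqrt(b2)     # exact floor of the square root
        if b * b == b2:
            count += 1
        a += 1
    return count


for x in (10, 25, 3, 0, 1):
    print(x, "=>", count_double_squares(x))
```

Run on the sample inputs from the problem statement, this prints 1, 2, 0, 1 and 1.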
Here's a version which is trivially O(sqrt(N)) and avoids all loop-internal branches.
Start by generating all squares up to the limit, easily done without any multiplications, then initialize an l and an r index.
In each iteration you calculate the sum, then update the two indices and the count based on a comparison with the target value. This is sqrt(N) iterations to generate the table and at most sqrt(N) iterations of the search loop. Estimated running time with a reasonable compiler is at most 10 clock cycles per iteration, so for a maximum input value of 2^31 (sqrt(N) ~= 46341) this should correspond to less than 500K clock cycles, or a few tenths of a millisecond at GHz clock rates:
unsigned countPairs(unsigned n)
{
    unsigned sq = 0, i;
    unsigned square[65536];
    for (i = 0; sq <= n; i++) {
        square[i] = sq;
        sq += i+i+1;
    }
    // Signed indices: for n == 0 an unsigned r would wrap around after the decrement.
    int l = 0, r = (int)i - 1;
    unsigned count = 0;
    do {
        unsigned sum = square[l] + square[r];
        l += sum <= n; // Increment l if the sum is <= N
        count += sum == n; // Increment the count if a match
        r -= sum >= n; // Decrement r if the sum is >= N
    } while (l <= r);
    return count;
}
A good compiler can note that the three compares at the end are all using the same operands so it only needs a single CMP opcode followed by three different conditional move operations (CMOVcc).
I was in a hurry, so I solved it using a rather brute-force approach (very similar to marcog's) in Python 2.6.
import math
def is_perfect_square(x):
    rt = int(math.sqrt(x))
    return rt*rt == x
def double_sqaures(n):
    rng = int(math.sqrt(n))
    ways = 0
    for i in xrange(rng+1):
        if is_perfect_square(n - i*i):
            ways +=1
    if ways % 2 == 0:
        ways = ways // 2
    else:
        ways = ways // 2 + 1
    return ways
Note: ways will be odd exactly when n is twice a perfect square (including n = 0), because the solution with two equal squares is then found only once.
The number of solutions (x,y) of
x^2+y^2=n
over the integers (for n >= 1) is exactly 4 times the difference between the number of divisors of n congruent to 1 mod 4 and the number of divisors of n congruent to 3 mod 4.
Similar identities exist also for the problems
x^2 + 2y^2 = n
and
x^2 + y^2 + z^2 + w^2 = n.
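For x^2 + y^2 = n, the divisor identity (Jacobi's two-square theorem) reads r2(n) = 4 * (d1(n) - d3(n)), where d1 and d3 count the divisors of n congruent to 1 and 3 mod 4. The Python sketch below evaluates it and compares it with a direct count. Note that r2 counts signs and order separately, unlike the problem's unordered count. Both function names are assumptions of this example.

```python
import math


def r2(n):
    """Integer solutions (x, y) of x^2 + y^2 = n for n >= 1, via divisor counts."""
    d1 = d3 = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            for q in {d, n // d}:  # the set avoids counting sqrt(n) twice
                if q % 4 == 1:
                    d1 += 1
                elif q % 4 == 3:
                    d3 += 1
        d += 1
    return 4 * (d1 - d3)


def r2_brute(n):
    """Same count by direct enumeration, for checking small n."""
    r = math.isqrt(n)
    return sum(1 for x in range(-r, r + 1) for y in range(-r, r + 1)
               if x * x + y * y == n)


print(all(r2(n) == r2_brute(n) for n in range(1, 200)))  # True
```

For example, 25 has divisors 1, 5 and 25, all congruent to 1 mod 4, so r2(25) = 12: four sign and order variants of (5, 0) and eight of (3, 4).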

Resources