Below is a standard dynamic programming solution for finding the length of the longest common subsequence (LCS) of three strings:
procedure THREECOUNT(A, B, C, m, n, o)
    Where A is an array of characters of size m
    Where B is an array of characters of size n
    Where C is an array of characters of size o
    Let V be a three-dimensional array of size (m + 1) * (n + 1) * (o + 1).
    Let maxValue be an integer initialized to 0.
    // Base case: any entry with a zero index corresponds to an empty prefix.
    Set V[i][j][k] = 0 whenever i = 0 or j = 0 or k = 0
    for i = 1 to m do
        for j = 1 to n do
            for k = 1 to o do
                if A[i - 1] == B[j - 1] and B[j - 1] == C[k - 1]
                    V[i][j][k] = V[i - 1][j - 1][k - 1] + 1
                else
                    V[i][j][k] = max(V[i - 1][j][k], V[i][j - 1][k], V[i][j][k - 1])
                if V[i][j][k] > maxValue
                    maxValue = V[i][j][k]
    return maxValue, which is an integer
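For reference, the sequential recurrence above can be written as a short runnable sketch. The function name `lcs3_length` is made up for illustration, and this is only the serial version; it does not attempt the parallelization asked about below.

```python
def lcs3_length(a: str, b: str, c: str) -> int:
    """Length of the longest common subsequence of three strings (serial DP)."""
    m, n, o = len(a), len(b), len(c)
    # v[i][j][k] = LCS length of a[:i], b[:j], c[:k]; entries with a zero index stay 0.
    v = [[[0] * (o + 1) for _ in range(n + 1)] for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            for k in range(1, o + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    v[i][j][k] = v[i - 1][j - 1][k - 1] + 1
                else:
                    v[i][j][k] = max(v[i - 1][j][k], v[i][j - 1][k], v[i][j][k - 1])
    # The table is monotone in every index, so the maximum sits in the last cell.
    return v[m][n][o]
```

Because the table is non-decreasing along each axis, returning `v[m][n][o]` gives the same value as tracking a running maximum.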
Since we only care about the length of the LCS, we can traverse the diagonal of the 3D DP table. But I am stuck on how to parallelize this algorithm. The solution claims that a parallel version can achieve a span of O(min(m,n,o) log(max(m,n,o))) while remaining work-optimal, i.e. O(mno) work, and I have no idea how to achieve that. Could somebody point out how to parallelize this DP solution?
I am given two integers X and Y. The goal is to find the longest consecutive sequence of digits that appears in both X and Y. So, if X = 124534891 and Y = 324534768, then the output would be 24534, since 24534 appears in both 124534891 and 324534768. The integers can be of different lengths.
I am trying to design a dynamic algorithm solution, but I'm completely lost.
This is a variant of the longest common substring problem.
Treat the numbers as strings and apply the same algorithm as for the longest common substring problem.
Here's pseudocode to get started:
function maxConsecutiveSequence(A, B):
    S[1..N] = toString(A)
    T[1..M] = toString(B)
    L = array(N, M)
    len = 0
    ans = {}
    for i = 1 to N
        for j = 1 to M
            if S[i] = T[j]
                if i = 1 or j = 1
                    L[i][j] = 1
                else
                    L[i][j] = L[i - 1][j - 1] + 1
                if L[i][j] > len
                    len = L[i][j]
                    ans = {S[i − len + 1..i]}
                else if L[i][j] = len
                    ans = ans ∪ {S[i − len + 1..i]}
            else
                L[i][j] = 0
    return ans
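A runnable Python sketch of the pseudocode above might look like this. The function name `longest_common_digit_runs` is hypothetical, and the sketch keeps only two rows of the table instead of the full N x M array, which is an optional space saving rather than part of the original answer.

```python
def longest_common_digit_runs(x: int, y: int) -> set:
    """Return every longest run of consecutive digits shared by x and y."""
    s, t = str(x), str(y)
    prev = [0] * (len(t) + 1)   # previous row of the DP table
    best = 0                    # length of the longest common run found so far
    ans = set()
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # Extend the common run that ended at s[i-2], t[j-2].
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best = cur[j]
                    ans = {s[i - best:i]}
                elif cur[j] == best:
                    ans.add(s[i - best:i])
        prev = cur
    return ans
```

For the example in the question, `longest_common_digit_runs(124534891, 324534768)` returns `{'24534'}`.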
Hi all, here is the problem statement from an old contest on Codeforces:
A sequence of l integers b1, b2, ..., bl (1 ≤ b1 ≤ b2 ≤ ... ≤ bl ≤ n) is called good if each number divides (without a remainder) the next number in the sequence. More formally, b_i divides b_(i+1) for all i (1 ≤ i ≤ l - 1).
Given n and k, find the number of good sequences of length k. As the answer can be rather large, print it modulo 1000000007 (10^9 + 7).
I have formulated my dp[i][j] as the number of good sequences of length i which ends with the jth number, and the transition table as the following pseudocode
dp[k][n] =
    for each factor i of n do
        for j from 1 to k - 1
            dp[k][n] += dp[j][i]
        end
    end
But in the editorial it is given as
Lets define dp[i][j] as number of good sequences of length i that ends in j.
Let's denote divisors of j by x1, x2, ..., xl. Then dp[i][j] = sigma dp[i - 1][xr]
But in my understanding, we need two sigmas, one for the divisors and the other for length. Please help me correct my understanding.
My code ->
MOD = 10 ** 9 + 7
N, K = map(int, input().split())
dp = [[0 for _ in range(N + 1)] for _ in range(K + 1)]
for k in range(1, K + 1):
    for n in range(1, N + 1):
        c = 1
        for i in range(1, n):
            if n % i != 0:
                continue
            for j in range(1, k):
                c += dp[j][i]
        dp[k][n] = c
c = 0
for i in range(1, N + 1):
    c = (c + dp[K][i]) % MOD
print(c)
Link to the problem: https://codeforces.com/problemset/problem/414/B
So let's define dp[i][j] as the number of good sequences of length exactly i and which ends with a value j as its last element.
Now, dp[i][j] = Sum(dp[i-1][x]) for all x s.t. x is a divisor of j, with the base case dp[1][j] = 1 for every j. Note that x can be equal to j itself.
This is true because if there is some sequence of length i-1 which we have already found that ends with some value x, then we can simply add j to its end and form a new sequence which satisfies all the conditions.
I guess your confusion is with the length. The thing is that since our current length is i, we can add j to the end of a sequence only if its length is i-1, we cannot iterate over other lengths.
Hope this is clear.
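As a minimal sketch of that single-sum recurrence (the function name `count_good_sequences` is made up here): instead of enumerating the divisors of each j, it pushes dp[i-1][x] forward to every multiple of x, which visits the same (divisor, multiple) pairs.

```python
MOD = 10**9 + 7

def count_good_sequences(n: int, k: int) -> int:
    """Number of good sequences of length k with values in 1..n, modulo MOD."""
    # dp[j] = number of good sequences of the current length that end in j.
    dp = [0] + [1] * n          # length 1: one sequence per value j
    for _ in range(k - 1):
        nxt = [0] * (n + 1)
        for x in range(1, n + 1):
            # x divides each of x, 2x, 3x, ..., so a sequence ending in x
            # can be extended by any of those multiples.
            for j in range(x, n + 1, x):
                nxt[j] = (nxt[j] + dp[x]) % MOD
        dp = nxt
    return sum(dp) % MOD
```

Only sequences of length exactly i - 1 feed into length i, which is why there is one sum per step and no inner loop over lengths.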
I need to traverse all pairs i,j with 0 <= i < n, 0 <= j < n and i < j for some positive integer n.
Problem is that I can only loop through another variable, say k. I can control the bounds of k. So the problem is to determine two arithmetic methods, f(k) and g(k) such that i=f(k) and j=g(k) traverse all admissible pairs as k traverses its consecutive values.
How can I do this in a simple way?
I think I got it (in Python):
def get_ij(n, k):
    j = k // (n - 1)  # // is integer (truncating) division
    i = k - j * (n - 1)
    if i >= j:
        # fold the entry back into the upper triangle
        i = (n - 2) - i
        j = (n - 1) - j
    return i, j

for n in range(2, 6):
    print(n, sorted(get_ij(n, k) for k in range(n * (n - 1) // 2)))
It basically folds the matrix so that it's (almost) rectangular. By "almost" I mean that there could be some unused entries on the far right of the bottom row.
[Figures: how the folding works for n=4 and for n=5]
Now, iterating over the rectangle is easy, as is mapping from folded coordinates back to coordinates in the original triangular matrix.
Pros: uses simple integer math.
Cons: returns the tuples in a weird order.
I think I found another way, that gives the pairs in lexicographic order. Note that here i > j instead of i < j.
Basically the algorithm consists of the two expressions:
i = floor((1 + sqrt(1 + 8*k))/2)
j = k - i*(i - 1)/2
that give i,j as functions of k. Here k is a zero-based index.
Pros: Gives the pairs in lexicographic order.
Cons: Relies on floating-point arithmetic.
Rationale:
We want to achieve the mapping in the following table:
k -> (i,j)
0 -> (1,0)
1 -> (2,0)
2 -> (2,1)
3 -> (3,0)
4 -> (3,1)
5 -> (3,2)
....
We start by considering the inverse mapping (i,j) -> k. It isn't hard to realize that:
k = i*(i-1)/2 + j
Since j < i, it follows that the value of k corresponding to all pairs (i,j) with fixed i satisfies:
i*(i-1)/2 <= k < i*(i+1)/2
Therefore, given k, i=f(k) returns the largest integer i such that i*(i-1)/2 <= k. After some algebra:
i = f(k) = floor((1 + sqrt(1 + 8*k))/2)
After we have found the value i, j is trivially given by
j = k - i*(i-1)/2
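If the floating-point square root is a concern, the same two formulas can be evaluated with integer arithmetic only. This is a sketch assuming Python 3.8+ (for `math.isqrt`); the name `unrank_pair` is made up for illustration.

```python
from math import isqrt

def unrank_pair(k: int):
    """Map a zero-based index k to the k-th pair (i, j) with j < i, in lexicographic order."""
    # isqrt returns floor(sqrt(.)) exactly, so no rounding error creeps in for large k.
    i = (1 + isqrt(1 + 8 * k)) // 2
    j = k - i * (i - 1) // 2
    return i, j
```

Taking the integer square root first and then halving gives the same result as flooring the real-valued expression, because the comparison against an odd integer threshold is unaffected by dropping the fractional part of the root.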
I'm not sure I understand the question exactly, but to sum up: if 0 <= i < n and 0 <= j < n, then you want to traverse 0 <= k < n*n:
for (int k = 0; k < n*n; k++) {
    int i = k / n;
    int j = k % n;
    // ...
}
[edit] I just saw that i < j, so this solution is not optimal, since fewer than n*n iterations are necessary.
If we think of our solution in terms of a number triangle, where k is the sequence
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
...
Then j would be our (non zero-based) row number, that is, the greatest integer such that
j * (j - 1) / 2 < k
Solving for j:
j = ceiling ((sqrt (1 + 8 * k) - 1) / 2)
And i would be k's (zero-based) position in the row
i = k - j * (j - 1) / 2 - 1
The bounds for k are:
1 <= k <= n * (n - 1) / 2
Is it important that you actually have two arithmetic functions f(k) and g(k) doing this? Because you could first create a list such as
L = []
for i in range(n-1):
    for j in range(n):
        if j>i:
            L.append((i,j))
This will give you all the pairs you asked for. Your variable k can now just run along the index of the list. For example, if we take n=5,
for x in L:
    print(x)
gives us
(0,1), (0,2), (0,3), (0,4), (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
Suppose you have 2<=k<5 for example, then
for k in range(2, 5):
    print(L[k])
yields
(0,3), (0,4), (1,2)
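The same list can also be produced with the standard library. The sketch below uses `itertools.combinations`, which yields the pairs in the same lexicographic order as the nested loops; the helper name `all_pairs` is hypothetical.

```python
from itertools import combinations

def all_pairs(n: int):
    """All pairs (i, j) with 0 <= i < j < n, in lexicographic order."""
    return list(combinations(range(n), 2))

pairs = all_pairs(5)
# Index the list with k to get the k-th pair, e.g. for 2 <= k < 5:
selected = [pairs[k] for k in range(2, 5)]
```

Like the loop version, this stores all n*(n-1)/2 pairs in memory, so it trades space for simplicity.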
I am having trouble understanding a solution to an algorithmic problem
In particular, I don't understand how or why this part of the code
s += a[i];
total += query(s);
update(s);
allows you to compute the total number of points in the lower left quadrant of each point.
Could someone please elaborate?
As an analogue for the plane problem, consider this:
1. For a point (a, b) to lie in the lower left quadrant of (x, y), we need a < x and b < y; thus, points of the form (i, P[i]) lie in the lower left quadrant of (j, P[j]) iff i < j and P[i] < P[j].
2. When iterating in ascending order of index, all points that were considered earlier lie to the left of the current point* (i, P[i]).
3. So one only has to count all the P[j] less than P[i] that have been considered until now.
*The current point is the point under consideration in the current iteration of the for loop that you quoted, i.e. (i, P[i]).
Let's define another array, C[s]:
C[s] = Number of Prefix Sums of array A[1..(i - 1)] that amount to s
So the solution to #3 becomes the sum ... + C[-2] + C[-1] + C[0] + C[1] + C[2] + ... + C[P[i] - 1], i.e. the prefix sum of C up to (but not including) P[i].
Use the BIT to store the prefix sum of C, thus defining query(s) as:
query(s) = Number of Prefix Sums of array A[1..(i - 1)] that amount to a value < s
Using these definitions, s in the given code gives you the prefix sum up to the current index i (P[i]). total builds the answer, and update simply adds P[i] to the BIT.
We have to repeat this method for all i, hence the for loop.
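To make the roles of query and update concrete, here is a small self-contained sketch of the loop. Several details are assumptions not stated in the quoted snippet: the prefix sums are coordinate-compressed to ranks so that negative sums can index the tree, the empty prefix P[0] = 0 is inserted as well, and the names `FenwickTree` and `count_lower_left_pairs` are made up.

```python
class FenwickTree:
    """Binary Indexed Tree over positions 1..size storing counts."""

    def __init__(self, size: int):
        self.tree = [0] * (size + 1)

    def update(self, pos: int, delta: int = 1) -> None:
        # Add delta at position pos (1-based).
        while pos < len(self.tree):
            self.tree[pos] += delta
            pos += pos & -pos

    def query(self, pos: int) -> int:
        # Sum of the counts at positions 1..pos.
        total = 0
        while pos > 0:
            total += self.tree[pos]
            pos -= pos & -pos
        return total


def count_lower_left_pairs(a):
    """Count pairs i < j of prefix sums (including P[0] = 0) with P[i] < P[j]."""
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)
    # Assumption: compress the prefix sums to ranks 1..r so they can index the tree.
    rank = {v: r for r, v in enumerate(sorted(set(prefix)), start=1)}
    bit = FenwickTree(len(rank))
    total = 0
    for s in prefix:
        total += bit.query(rank[s] - 1)  # earlier prefix sums strictly below s
        bit.update(rank[s])              # record s for later points
    return total
```

Each step asks how many earlier points lie strictly below the current one (the query) and then registers the current point (the update), so every lower-left pair is counted exactly once.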
PS: It uses a data structure called a Binary Indexed Tree (http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees) for operations. If you aren't acquainted with it, I'd recommend that you check the link.
EDIT:
You are given an array S and a value X. You can split the elements of S into two disjoint sets: L, which has all elements of S less than X, and H, which has those that are greater than or equal to X.
A: All elements of L are less than all elements of H.
Any subsequence T of S will have some elements of L and some elements of H. Let's say it has p elements of L and q of H. When T is sorted to give T', all p elements of L appear before the q elements of H because of A.
The median, being the central value, is the value at location m = (p + q)/2.
It is intuitive that having q > p implies that the median lies in H. As a proof:
Values in locations [1..p] in T' belong to L. Therefore, for the median to be in H, its position m must be greater than p:
m > p
(p + q)/2 > p
p + q > 2p
q > p
B: q - p > 0
To compute q - p, I replace all elements in T' with -1 if they belong to L ( < X ) and +1 if they belong to H ( >= X).
T' then looks something like {-1, -1, -1, ..., 1, 1, 1}.
It has p times -1 and q times 1. The sum of T' will now give me:
Sum = p * (-1) + q * (1)
C: Sum = q - p
I can use this information to find the value in B.
All subsequences considered here are contiguous, i.e. of the form {A[i + 1], A[i + 2], ..., A[j]}. To compute their sums, I use the prefix sums P[i] = A[1] + A[2] + ... + A[i], with P[0] = 0.
The sum of the subsequence from A[i + 1] to A[j] can then be computed as P[j] - P[i] (j is the greater of j and i).
With C and B in mind, we conclude:
Sum = P[j] - P[i] = q - p (q - p > 0)
P[j] - P[i] > 0
P[j] > P[i]
j > i and P[j] > P[i] for each solution that gives you a median >= X
In summary:
Replace each A[i] with -1 if it is less than X and +1 otherwise.
Compute the prefix sums P[i] of the transformed array.
For each point (i, P[i]), count the points that lie in its lower left quadrant.
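The three summary steps can be sketched as follows. For brevity this sketch keeps the earlier prefix sums in a sorted list and uses `bisect` in place of the BIT, so an insertion costs O(n) in the worst case; swapping in a BIT restores O(n log n). The function name is hypothetical, and the count follows the criterion derived above (q > p), which matches "median >= X" exactly for odd-length subarrays.

```python
from bisect import bisect_left, insort

def count_subarrays_median_at_least(a, x):
    """Count contiguous subarrays with more elements >= x than elements < x."""
    seen = [0]      # sorted prefix sums seen so far, starting with P[0] = 0
    s = 0           # running prefix sum of the +1 / -1 array
    total = 0
    for value in a:
        s += 1 if value >= x else -1
        # Earlier prefix sums strictly less than s are points in the lower left quadrant.
        total += bisect_left(seen, s)
        insort(seen, s)
    return total
```

For example, with a = [1, 2, 3] and x = 2 the qualifying subarrays are [2], [3], [2, 3] and [1, 2, 3], so the function returns 4.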
How can I find the total number of Increasing sub-sequences of certain length with Binary Index Tree(BIT)?
Actually this is a problem from Spoj Online Judge
Example
Suppose I have the array 1, 2, 2, 10.
The increasing sub-sequences of length 3 are the ones at positions 1,2,4 and 1,3,4 (1-based indices), i.e. 1, 2, 10 taken in two different ways.
So, the answer is 2.
Let:
dp[i, j] = number of increasing subsequences of length j that end at i
An easy solution is in O(n^2 * k):
for i = 1 to n do
    dp[i, 1] = 1
for i = 1 to n do
    for j = 1 to i - 1 do
        if array[i] > array[j]
            for p = 2 to k do
                dp[i, p] += dp[j, p - 1]
The answer is dp[1, k] + dp[2, k] + ... + dp[n, k].
Now, this works, but it is inefficient for your given constraints, since n can go up to 10000. k is small enough, so we should try to find a way to get rid of an n.
Let's try another approach. We also have S - the upper bound on the values in our array. Let's try to find an algorithm in relation to this.
dp[i, j] = same as before
num[i] = how many subsequences that end with i (element, not index this time) have a certain length
for i = 1 to n do
    dp[i, 1] = 1
for p = 2 to k do // for each length this time
    num = {0}
    for i = 2 to n do
        // note: dp[1, p > 1] = 0
        // how many that end with the previous element
        // have length p - 1
        num[ array[i - 1] ] += dp[i - 1, p - 1]
        // append the current element to all those smaller than it
        // that end an increasing subsequence of length p - 1,
        // creating an increasing subsequence of length p
        for j = 1 to array[i] - 1 do
            dp[i, p] += num[j]
This has complexity O(n * k * S), but we can reduce it to O(n * k * log S) quite easily. All we need is a data structure that lets us efficiently sum and update elements in a range: segment trees, binary indexed trees etc.
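One way to get the logarithmic factor is to replace the num array with a binary indexed tree, so that "sum of num[1..array[i] - 1]" becomes a prefix query. The sketch below is one such variant under stated assumptions: values are compressed to ranks (so the bound becomes O(n * k * log n) rather than depending on S), no modulus is applied, and the function name is made up.

```python
def count_increasing_subsequences(arr, k):
    """Number of strictly increasing subsequences of length k in arr."""
    values = sorted(set(arr))
    rank = {v: r for r, v in enumerate(values, start=1)}  # assumption: compress values
    size = len(values)
    # dp[i] = number of increasing subsequences of the current length ending at index i.
    dp = [1] * len(arr)
    for _ in range(k - 1):
        tree = [0] * (size + 1)       # fresh BIT per length, playing the role of num
        nxt = [0] * len(arr)
        for i, v in enumerate(arr):
            # Query: total over all strictly smaller values seen so far.
            pos, total = rank[v] - 1, 0
            while pos > 0:
                total += tree[pos]
                pos -= pos & -pos
            nxt[i] = total
            # Update: make the shorter subsequences ending here available to later elements.
            pos = rank[v]
            while pos <= size:
                tree[pos] += dp[i]
                pos += pos & -pos
        dp = nxt
    return sum(dp)
```

For the array 1, 2, 2, 10 and k = 3 this returns 2, matching the example in the question.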