Find number of paths in grid with certain restrictions - algorithm

You are given a rectangular grid with n rows and m columns. The rows are numbered 1 to n, from bottom to top, and the columns are numbered 1 to m, from left to right.
You are also given k special fields in the form (row, column). For each i, where 0 <= i <= k, count the number of different paths from (1, 1) to (n, m) that contain exactly i special fields.
There is one rule you must follow. You are only allowed to make moves that are straight up or to the right. In other words, from each field (row, column), you can only move to field (row+1, column) or field (row, column+1).
Output an array of k + 1 elements. The i-th element (0-indexed) must be the number of different paths that contain exactly i special fields. Since the answer can be very large, output it modulo 1000007.
Input:
First line contains three space separated integers, n, m and k. Next k lines, each contain two space separated integers, the coordinates of a special field.
Output:
k + 1 space separated integers, the answer to the question.
Constraints:
1 <= n, m, k <= 100
For all coordinates (r, c) - 1 <= r <= n, 1 <= c <= m
All coordinates are valid and different.

This is a simple DP:
Initialization:
T[i][0][k] = 0
T[0][j][k] = 0
If grid[1][1] is not special:
T[1][1][k!=0] = 0
T[1][1][0] = 1
Otherwise:
T[1][1][k!=1] = 0
T[1][1][1] = 1
Bulk:
if grid[i][j] is not special:
T[i][j][k] = (T[i-1][j][k] + T[i][j-1][k]) % 1000007
Otherwise:
T[i][j][0] = 0
T[i][j][k>0] = (T[i-1][j][k-1] + T[i][j-1][k-1]) % 1000007
Answer:
T[n][m][k], for every possible k.
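A minimal sketch of this DP in Python, assuming the input has already been parsed. The function name count_paths and its arguments are illustrative and not part of the original statement; row 0 and column 0 act as the zero padding from the initialization above.

```python
MOD = 1000007

def count_paths(n, m, specials):
    """Return a list whose i-th entry is the number of paths with exactly i special fields."""
    k = len(specials)
    special = set(specials)
    # T[r][c][s]: paths from (1, 1) to (r, c) that contain exactly s special fields.
    T = [[[0] * (k + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    for r in range(1, n + 1):
        for c in range(1, m + 1):
            # A special field shifts every count up by one.
            shift = 1 if (r, c) in special else 0
            if r == 1 and c == 1:
                T[1][1][shift] = 1
                continue
            for s in range(shift, k + 1):
                T[r][c][s] = (T[r - 1][c][s - shift] + T[r][c - 1][s - shift]) % MOD
    return T[n][m]

print(count_paths(3, 3, [(2, 2)]))
```

With the constraints above, the table holds about 100 * 100 * 101 entries, so both time and memory stay small.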

Related

How to get the intuition behind the solution?

I was solving the problem below from USACO training. I found this really fast solution, but I am unable to absorb it fully.
Problem: Consider an ordered set S of strings of N (1 <= N <= 31) bits. Bits, of course, are either 0 or 1.
This set of strings is interesting because it is ordered and contains all possible strings of length N that have L (1 <= L <= N) or fewer bits that are `1'.
Your task is to read a number I (1 <= I <= sizeof(S)) from the input and print the Ith element of the ordered set for N bits with no more than L bits that are `1'.
sample input: 5 3 19
output: 10011
The two solutions I could think of:
First, the brute-force solution, which goes through all possible combinations of bits, selects and stores the strings whose count of '1's is less than or equal to L, and returns the Ith string.
Second, we can find all the permutations of '1's across the 5 positions for each count from 0 to L, sort the strings in increasing order, and return the Ith string.
The best solution:
The OP who posted the solution used combinations instead of permutations. According to him, the total number of strings possible is 5C0 + 5C1 + 5C2 + 5C3.
So at every position i of the string, we decide whether to set the ith bit in our output or not, based on the total number of ways we have to build the rest of the string. Below is a dry run of the entire approach for the above input.
N = 5, L = 3, I = 19
00000
at i = 0, for the remaining string, we have 4C0 + 4C1 + 4C2 + 4C3 = 15
This says that there are 15 strings that start with 0. As 15 is less than 19, our first bit has to be set, and we subtract: I = 19 - 15 = 4.
N = 5, L = 2, I = 4
10000
at i = 1, we have 3C0 + 3C1 + 3C2 (as we have used 1 from L) = 7
as 7 is greater than or equal to I (4), we cannot set this bit.
N = 5, L = 2, I = 4
10000
at i = 2, we have 2C0 + 2C1 + 2C2 = 4
as 4 is greater than or equal to I (4), we cannot set this bit either.
N = 5, L = 2, I = 4
10000
at i = 3, we have 1C0 + 1C1 = 2
as 2 is less than I (4), we set this bit and subtract: I = 4 - 2 = 2.
N = 5, L = 1, I = 2
10010
at i = 4, no positions remain, so there is only 0C0 = 1 way
as 1 is less than I (2), we set this bit as well.
All positions are now decided, so we stop, and 10011 is our answer. I was amazed to find this solution. However, I am finding it difficult to get the intuition behind this solution.
How does this solution sort-of zero in directly to the Ith number in the set?
Why does the order of the bits not matter in the combinations of set bits?
Suppose we have precomputed the number of strings of length n with k or fewer bits set. Call that S(n, k).
Now suppose we want the i'th string (in lexicographic order) of length N with L or fewer bits set.
All the strings with the most significant bit zero come before those with the most significant bit 1. There are S(N-1, L) strings with the most significant bit zero, and S(N-1, L-1) strings with the most significant bit 1. So if we want the i'th string, if i<=S(N-1, L), then it must have the top bit zero and the remainder must be the i'th string of length N-1 with at most L bits set, and otherwise it must have the top bit one, and the remainder must be the (i-S(N-1, L))'th string of length N-1 with at most L-1 bits set.
All that remains to code is to precompute S(n, k), and to handle the base cases.
You can figure out a combinatorial solution to S(n, k) as your friend did, but it's more practical to use a recurrence relation: S(n, k) = S(n-1, k) + S(n-1, k-1), and S(0, k) = S(n, 0) = 1.
Here's code that does all that, and as an example prints out all 8-bit numbers with 3 or fewer bits set, in lexicographic order. If i is out of range, then it raises an IndexError exception, although in your question you assume i is always in range, so perhaps that's not necessary.
S = [[1] * 32 for _ in range(32)]
for n in range(1, 32):
    for k in range(1, 32):
        S[n][k] = S[n-1][k] + S[n-1][k-1]
def ith_string(n, k, i):
    if n == 0:
        if i != 1:
            raise IndexError
        return ''
    elif i <= S[n-1][k]:
        return "0" + ith_string(n-1, k, i)
    elif k == 0:
        raise IndexError
    else:
        return "1" + ith_string(n-1, k-1, i - S[n-1][k])
print([ith_string(8, 3, i) for i in range(1, 94)])

Finding the amount of combination of three numbers in a sequence which fulfills a specific requirement

The question is: given a number D and a sequence of N numbers, count the combinations of three numbers from the sequence whose largest difference does not exceed D. For example:
D = 3, N = 4
Sequence of numbers: 1 2 3 4
Possible combinations: 1 2 3 (3-1 = 2 <= D), 1 2 4 (4 - 1 = 3 <= D), 1 3 4, 2 3 4.
Output: 4
What I've done: link
Well, my concept is: iterate through the whole sequence of numbers and, for the current number, find the smallest number whose difference from it exceeds D. Then count the combinations between those two numbers with the current number fixed (that is, the combination of n [the numbers between the two] taken 2). If even the biggest number in the sequence minus the current number does not exceed D, then use the combination of all the elements taken 3.
N can be as big as 10^5 with the smallest being 1 and D can be as big as 10^9 with the smallest being 1 too.
Problem with my algorithm: overflow occurs when I do a combination of the 1st element and 10^5th element. How can I fix this? Is there a way to calculate that large amount of combination without actually doing the factorials?
EDIT:
Overflow occurs in the worst case: the current number is still at index 0 while every other number, minus the current number, is still smaller than D. For example, the number at index 0 is 1, the number at index 10^5 is 10^5 + 1, and D is 10^9. Then my algorithm will attempt to calculate the factorial of 10^5 - 0, which overflows. The factorial is used to calculate the combination of 10^5 taken 3.
When you look up the items within value range D in the sorted list and get an index difference M, you should calculate C(M,3).
But for such combination number you don't need to use huge factorials:
C(M,3) = M! / (6 * (M-3)!) = M * (M-1) * (M-2) / 6
To diminish intermediate results even more:
A = (M - 1) * (M - 2) / 2
A = (A * M) / 3
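As a small illustration, the two-step division can be written as below. Python integers never overflow, so the sketch only mirrors the order of operations one would use with a fixed-width type in another language; the name choose3 is made up for this example.

```python
def choose3(m):
    """C(m, 3) computed without factorials, keeping intermediate values small."""
    a = (m - 1) * (m - 2) // 2  # exact: a product of two consecutive integers is even
    return a * m // 3           # exact: m * (m - 1) * (m - 2) is divisible by 6

print(choose3(4))
print(choose3(100000))
```

For M = 10^5 the largest intermediate value is about 5 * 10^14, which fits comfortably in a signed 64-bit integer.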
You didn't add the C++ tag to your question, so let me write the answer in Python 3 (it should be easy to translate it to C++):
N = int(input("N = "))
D = int(input("D = "))
v = [int(input("v[{}] = ".format(i))) for i in range(0, N)]
count = 0
i, j = 0, 1
while j + 1 < N:
    j += 1
    while v[j] - v[i] > D:
        i += 1
    d = j - i
    if d >= 2:
        count += (d - 1) * d // 2  # // is the integer division
print(count)
The idea is to move up the upper index of the triples j, while dragging the lower index i at the greatest distance j-i=d where v[j]-v[i]<=D. For each i-j pair, there are 1+2+3+...+(d-1) possible triples keeping j fixed, i.e., (d-1)*d/2. This assumes v is sorted in ascending order.

get a specific combination with replacement

I know how to calculate the total number of combinations of n different objects taken k at a time, with replacement:
(n+k-1)!/k!/(n-1)!
What I need is a formula or algorithm to recover the i-th such combination from an ordered list.
Say I have an ordered list of all combinations of a,b,c taken 3 at a time (so n=3 and k=3):
1 aaa
2 aab
3 aac
4 abb
5 abc
6 acc
7 bbb
8 bbc
9 bcc
10 ccc
How would I calculate the i-th (say 7-th) combination in this list, without first enumerating them all ? Enumerating will be very inefficient for any but the simplest cases, if I am only interested in a few specific combinations. For instance, there are 119,877,472 combinations of 64 items taken 6 at a time.
Needless to say, I need a solution for arbitrary n, k and i.
The reverse function (given the combination, how to calculate its index) would also be interesting.
I found one similar question, but it was about permutations, not combinations:
I want to get a specific combination of permutation?
And there are many ways to list all the combinations, such as mentioned here:
How to generate all permutations and combinations with/without replacement for distinct items and non distinct items (multisets)
But they don't give the functions I need
The algorithm you are interested in is very easy to implement. The first thing you should understand is why the formula C(k, n + k - 1) = C(n - 1, n + k - 1) = (n + k - 1)! / k! / (n - 1)! actually works. The first equality says that the number of ways to take k items out of n + k - 1 is the same as the number of ways to take the other n - 1 items out of n + k - 1.
Let's say your objects are balls of some color. There are n different colors numbered from 1 to n. You need to calculate the number of ways to have k balls. Imagine initially k white balls (without any color) so you need to paint them in different ways. Arrange the balls in a row. Choose some k1 ≥ 0 balls from the left to paint in color #1, next k2 ≥ 0 balls we paint in #2, and so on... We have ∑ki = k. A series of k1 balls painted in color #1 is followed by k2 of color #2, next by k3 of color #3 etc...
We can do the same painting in a slightly different way however. In order to separate ki-1- and ki-colored balls we would use delimiters. In total we should have n - 1 such delimiters to be placed among the balls. The delimiters are ordered, one that separates 1-colored and 2-colored balls should appear before another that separates 2-colored and 3-colored. If some ki = 0 then corresponding delimiters appear one by one. We have to arrange delimiters and balls in some way.
Interestingly we can imagine now that both n - 1 delimiters and k balls are just objects initially placed in a row. We have to choose either n - 1 of them to declare selected objects to be delimiters or k objects to be balls. And that's where well-known combination formula can be applied.
Example for your case:
o - ball
. - delimiter
a, b, c - colors
We have:
ooo.. => aaa
oo.o. => aab
oo..o => aac
o.oo. => abb
o.o.o => abc
o..oo => acc
.ooo. => bbb
.oo.o => bbc
.o.oo => bcc
..ooo => ccc
Notice the pattern how delimiters move from right to left.
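The ball-and-delimiter picture can be sketched in Python as follows. The helper names multiset_count and decode are made up for this illustration; math.comb is the standard-library binomial coefficient.

```python
from math import comb

def multiset_count(n, k):
    """Number of ways to take k items from n types with replacement."""
    return comb(n + k - 1, k)

def decode(arrangement, colors):
    """Turn a string of balls ('o') and delimiters ('.') into a combination."""
    out = []
    color = 0
    for ch in arrangement:
        if ch == '.':
            color += 1                 # each delimiter switches to the next color
        else:
            out.append(colors[color])  # each ball takes the current color
    return ''.join(out)

print(multiset_count(3, 3))
print(decode("oo.o.", "abc"))
```

The count for 64 items taken 6 at a time, multiset_count(64, 6), reproduces the 119,877,472 quoted in the question.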
Algorithm
Now to the question of how to get the p-th arrangement. A description of an efficient algorithm follows. Remember that we have k balls and nd = n - 1 delimiters. We will be placing delimiters one by one, first trying their rightmost positions. Consider leaving the current delimiter at its current position and calculate the number of combinations to place the remaining objects to the right; let that number be N. Compare N with p: if p is greater than or equal to N, then reduce p by N (p <- p - N) and move the current delimiter left by 1. Otherwise, if p is lower than N, we do not move the current delimiter but proceed to the next one, trying to move it again from the rightmost position. Note that the p-th arrangement is zero-based.
Having "converted" some i-th object to j-th delimiter we have N = C(nd - j, nd + k - i) number of ways to arrange remaining k - i + j balls and nd - j delimiters.
Since we'll often refer to binomial coefficients we'd better make their precalculation.
The reverse function may be implemented accordingly. You have positions for every delimiter. Accumulate the number of ways to arrange remaining objects while moving ordinary delimiter to its place from the rightmost position.
Example:
3 balls, 2 delimiters, find 7-th arrangement (which is bbc or .oo.o)
Place delimiters to the rightmost position: ooo... Let first delimiter be current.
Calculate N = C(1, 1) = 1, p ≥ N so we reduce p by N getting p = 6. At the same time we move current delimiter 1 pos left getting oo.o..
Calculate N = C(1, 2) = 2, p ≥ N, reduce p by N getting p = 6 - 2 = 4. Move getting o.oo..
Calculate N = C(1, 3) = 3, p ≥ N once again, move and reduce p getting p = 1 and .ooo..
Calculate N = C(1,4) = 4, p < N. Good, we've found final position for the first delimiter so leave it there and take second delimiter as current.
Calculate N = C(0,0) = 1, p ≥ N, p = 1 - 1 = 0, move, .oo.o.
Calculate N = C(0,1) = 1, p < N, found final position for the second delimiter. Resulting arrangement is .oo.o => bbc.
EDIT #1. Changed the algo description and added example.
here is the function (not optimized but working):
findcomb <- function(n, k, p) {
  # n = nr of object types (colors, letters etc)
  # k = number of objects (balls) to select
  # p = 0-based index of target combination
  # return = positions of delimiters at index p
  nd <- n-1 #nr of delimiters: nr of colors - 1
  pos <- seq(n+k-nd, n+k-1) #original positions of delimiters, all at right
  for (j in seq_len(nd-1)) { #seq_len gives an empty loop when nd = 1, unlike 1:(nd-1)
    s <- 0 #cumulative nr of accounted-for combinations with this delimiter
    while (TRUE) {
      N <- choose(nd+k-pos[j], nd-j)
      if (s + N <= p) {
        pos[j] <- pos[j] - 1
        s <- s + N
      } else break
    }
    p <- p - s
  }
  #last delimiter:
  pos[nd] <- pos[nd] - p
  pos
}
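For comparison, here is a hedged Python sketch of both directions (index to combination and back). It is not a port of the delimiter-moving function: it picks the items one at a time and skips whole blocks of combinations, which yields the same lexicographic order. Items are numbered 0..n-1, p is 0-based, and the names unrank and rank are illustrative.

```python
from math import comb

def unrank(n, k, p):
    """Return the p-th (0-based) combination with replacement of k items from 0..n-1."""
    if not 0 <= p < comb(n + k - 1, k):
        raise IndexError("p out of range")
    combo = []
    v = 0  # smallest value still allowed (the sequence is non-decreasing)
    for t in range(k):
        r = k - t - 1  # slots left after this one
        while True:
            # Ways to fill the r remaining slots with values >= v.
            count = comb(n - v + r - 1, r)
            if p < count:
                break
            p -= count
            v += 1
        combo.append(v)
    return tuple(combo)

def rank(n, combo):
    """Inverse of unrank: 0-based index of a non-decreasing tuple."""
    k = len(combo)
    p = 0
    low = 0
    for t, c in enumerate(combo):
        r = k - t - 1
        for v in range(low, c):
            p += comb(n - v + r - 1, r)  # skip every combination that has v in slot t
        low = c
    return p

print(unrank(3, 3, 7))
print(rank(3, (1, 1, 2)))
```

With a=0, b=1, c=2, the tuple (1, 1, 2) is bbc, matching the worked example for p = 7.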

How to find largest square of palindrome in a matrix

I am trying to solve a problem where I am given an n x n square matrix of characters and I want to find the size of the largest palindrome square in it. A palindrome square is a square in which all rows and all columns are palindromes.
For eg.
Input
a g h j k
s d g d j
s e f e n
a d g d h
r y d g s
The output will be:
3
corresponding to the middle square. I am thinking of a dynamic programming solution but am unable to formulate the recurrence relation. I am thinking the state should be a(i,j,k), where (i, j) is the bottom-right corner of the square and k is the size of the palindrome square.
Can someone help me with the recurrence relation for this problem?
EDIT:
n<500, so I believe that I can't go beyond O(n^3).
Assume that you can solve the following subproblem:
For each cell (i, j) and each length k, is there a palindrome of length k ending at (i, j), horizontally and vertically?
Hint for above problem:
// palindrome[i][j][k]: the k cells of row i that end at column j form a palindrome
boolean[][][] palindrome = new boolean[n][n][n + 1];
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        palindrome[i][j][0] = true;
        palindrome[i][j][1] = true;
        // the palindrome starts at column j - k + 1, which must not be negative
        for (int k = 2; k <= j + 1; k++)
            if (data[i][j - k + 1] == data[i][j] && palindrome[i][j - 1][k - 2])
                palindrome[i][j][k] = true;
    }
}
So, we can create two three-dimensional arrays, int[n][n][n + 1] col and int[n][n][n + 1] row.
For each cell (i, j), row[i][j][k] counts how many consecutive rows, ending at row i and going upward, have a horizontal palindrome of length k ending at column j, and col[i][j][k] counts how many consecutive columns, ending at column j and going leftward, have a vertical palindrome of length k ending at row i.
for(int k = 1; k <= n; k++) {
    if(there is palindrome length k horizontally, end at cell (i, j))
        row[i][j][k] = 1 + row[i - 1][j][k];
    if(there is palindrome length k vertically, end at cell (i, j))
        col[i][j][k] = 1 + col[i][j - 1][k];
}
Finally, if row[i][j][k] >= k && col[i][j][k] >= k, there is a palindrome square of size k whose bottom-right corner is (i, j).
In total, the time complexity will be O(n^3)
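The whole approach can be put together as a sketch in Python, assuming the grid is given as a list of equal-length strings. The names hor, ver and largest_palindrome_square are made up for this example, and out-of-grid neighbours are treated as a count of 0. It uses O(n^3) memory, so it is meant to show the recurrences and not to be the most economical version.

```python
def largest_palindrome_square(grid):
    n = len(grid)
    # hor[i][j][k]: the k cells of row i ending at column j form a palindrome.
    # ver[i][j][k]: the k cells of column j ending at row i form a palindrome.
    hor = [[[False] * (n + 1) for _ in range(n)] for _ in range(n)]
    ver = [[[False] * (n + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            hor[i][j][0] = hor[i][j][1] = True
            ver[i][j][0] = ver[i][j][1] = True
            for k in range(2, j + 2):
                hor[i][j][k] = grid[i][j - k + 1] == grid[i][j] and hor[i][j - 1][k - 2]
            for k in range(2, i + 2):
                ver[i][j][k] = grid[i - k + 1][j] == grid[i][j] and ver[i - 1][j][k - 2]

    # row[i][j][k]: consecutive rows (going up from i) with a horizontal palindrome
    # of length k ending at column j; col is the same for columns going left.
    row = [[[0] * (n + 1) for _ in range(n)] for _ in range(n)]
    col = [[[0] * (n + 1) for _ in range(n)] for _ in range(n)]
    best = 0
    for i in range(n):
        for j in range(n):
            for k in range(1, n + 1):
                if hor[i][j][k]:
                    row[i][j][k] = 1 + (row[i - 1][j][k] if i > 0 else 0)
                if ver[i][j][k]:
                    col[i][j][k] = 1 + (col[i][j - 1][k] if j > 0 else 0)
                if row[i][j][k] >= k and col[i][j][k] >= k:
                    best = max(best, k)
    return best

example = ["aghjk", "sdgdj", "sefen", "adgdh", "rydgs"]
print(largest_palindrome_square(example))
```

On the matrix from the question this finds the 3 x 3 square in the middle.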
Let's start with the complexity of validating a palindrome:
A palindrome can be identified in O(k), where k is the length of the palindrome (see here).
You then need to do that test 2k times, once for each row and column in your inner square (using the length of the palindrome, k, as the dimension),
so now you have k * 2k -> O(2k^2) -> O(k^2).
Then you want to increase the possible search space to the whole n x n data set; this is when a second variable gets introduced.
You will need to iterate over columns 1 to (n-k) and all rows 1 to (n-k) in a nested loop,
so now you have (n-k)^2 * O(k^2) -> O(n^2 * k^2).
Note: this problem is dependent on more than one variable.
This is the same approach I suggest you take to coding the solution: start small and get bigger.
I'm sure there is probably a better way, and I'm pretty sure my logic is correct, but take this at face value as it's not tested.
Just to make the example easy, I'm going to say that i,j is the top-left corner, i.e. coordinates 1,1.
1 2 3 4 5 6 7 8
1 a b c d e f g f
2 d e f g h j k q
3 a b g d z f g f
4 a a a a a a a a
ie (1,1) = a, (1,5) = e and (2,1) = d
now instead of checking every column you could start by checking every kth column
ie when k=3
1) create a 2D boolean array the size of the character table all results TRUE
2) I start by checking column 3 cfg which is not a palindrome, thus I no longer need to test columns 1 or 2.
3) because the palindrome test failed, mark the corresponding result in the 2D array (1,3) as FALSE (I know not to test any range that uses this position as it is not a palindrome)
4) Next check column 6, fjf which is a palindrome so I go back and test column 5, ehz != a palindrome
5) set (1,5) = FALSE
6) Then test column 8 then 7,
NOTE: You have only had to test 5 of the 8 columns.
since there were k columns in a row that were palindromes, now test the corresponding rows. Start from the bottom row in this case 3 as it will eliminate the most other checks if it fails
7) check row starting at (3,6) fgf = palindrome
8) check row starting at (2,6) jkq != a palindrome
9) set (2,6) = FALSE
10) check column starting at (2,3) daa != palindrome
11) set (2,3) = FALSE
We don't need to test any more for row 2, as both (2,3) and (2,6) are FALSE.
Hopefully you can make sense of that.
Note: you would probably start this at k = n and decrement k until you find a result

Levenshtein distance from particular group of numbers

My input is three numbers: a number s and the beginning b and end e of a range, with 0 <= s,b,e <= 10^1000. The task is to find the minimal Levenshtein distance between s and all numbers in the range [b, e]. It is not necessary to find the number minimizing the distance; the minimal distance is sufficient.
Obviously I have to read the numbers as strings, because no standard C++ type can handle such large numbers. Calculating the Levenshtein distance for every number in the possibly huge range is not feasible.
Any ideas?
[EDIT 10/8/2013: Some cases considered in the DP algorithm actually don't need to be considered after all, though considering them does not lead to incorrectness :)]
In the following I describe an algorithm that takes O(N^2) time, where N is the largest number of digits in any of b, e, or s. Since all these numbers are limited to 1000 digits, this means at most a few million basic operations, which will take milliseconds on any modern CPU.
Suppose s has n digits. In the following, "between" means "inclusive"; I will say "strictly between" if I mean "excluding its endpoints". Indices are 1-based. x[i] means the ith digit of x, so e.g. x[1] is its first digit.
Splitting up the problem
The first thing to do is to break up the problem into a series of subproblems in which each b and e have the same number of digits. Suppose e has k >= 0 more digits than b: break up the problem into k+1 subproblems. E.g. if b = 5 and e = 14032, create the following subproblems:
b = 5, e = 9
b = 10, e = 99
b = 100, e = 999
b = 1000, e = 9999
b = 10000, e = 14032
We can solve each of these subproblems, and take the minimum solution.
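The splitting step can be sketched in Python on digit strings, since the numbers are too large for fixed-width types. The name split_by_length is illustrative, and b and e are assumed to be decimal strings without leading zeros with b <= e.

```python
def split_by_length(b, e):
    """Split [b, e] into subranges whose endpoints have the same number of digits."""
    parts = []
    for length in range(len(b), len(e) + 1):
        # Smallest and largest number with this many digits, clipped to [b, e].
        lo = b if length == len(b) else "1" + "0" * (length - 1)
        hi = e if length == len(e) else "9" * length
        parts.append((lo, hi))
    return parts

for lo, hi in split_by_length("5", "14032"):
    print("b = {}, e = {}".format(lo, hi))
```

For b = 5 and e = 14032 this prints the five subproblems listed above.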
The easy cases: the middle
The easy cases are the ones in the middle. Whenever e has k >= 1 more digits than b, there will be k-1 subproblems (e.g. 3 above) in which b is a power of 10 and e is the next power of 10, minus 1. Suppose b is 10^m. Notice that choosing any digit between 1 and 9, followed by any m digits between 0 and 9, produces a number x that is in the range b <= x <= e. Furthermore there are no numbers in this range that cannot be produced this way. The minimum Levenshtein distance between s (or in fact any given length-n digit string that doesn't start with a 0) and any number x in the range 10^m <= x <= 10^(m+1)-1 is necessarily abs(m+1-n), since if m+1 >= n it's possible to simply choose the first n digits of x to be the same as those in s, and delete the remainder, and if m+1 < n then choose the first m+1 to be the same as those in s and insert the remainder.
In fact we can deal with all these subproblems in a single constant-time operation: if the smallest "easy" subproblem has b = 10^m and the largest "easy" subproblem has b = 10^u, then these ranges contain exactly the numbers with between m+1 and u+1 digits, so the minimum Levenshtein distance between s and any number in any of these ranges is (m+1)-n if n < m+1, n-(u+1) if n > u+1, and 0 otherwise.
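A small sketch of the constant-time step, phrased in terms of digit counts: the range 10^m .. 10^(m+1)-1 holds exactly the numbers with m+1 digits, and by the argument above the best distance to such a number is abs(m+1-n). The function name and parameters are assumptions made for this illustration.

```python
def easy_min_distance(n, shortest, longest):
    """Minimum distance from an n-digit s to any number whose digit count
    lies between shortest and longest (inclusive, no leading zeros)."""
    # If n lies inside [shortest, longest], a number of the same length as s
    # can be chosen equal to s; otherwise only the length gap has to be paid.
    return max(shortest - n, n - longest, 0)

# Example from the text: b = 5, e = 14032 gives easy ranges with 2 to 4 digits.
print(easy_min_distance(1, 2, 4))
print(easy_min_distance(3, 2, 4))
print(easy_min_distance(7, 2, 4))
```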
The hard cases: the end(s)
The hard cases are when b and e are not restricted to have the form b = 10^m and e = 10^(m+1)-1 respectively. Any master problem can generate at most two subproblems like this: either two "ends" (resulting from a master problem in which b and e have different numbers of digits, such as the example at the top) or a single subproblem (i.e. the master problem itself, which didn't need to be subdivided at all because b and e already have the same number of digits). Note that due to the previous splitting of the problem, we can assume that the subproblem's b and e have the same number of digits, which we will call m.
Super-Levenshtein!
What we will do is design a variation of the Levenshtein DP matrix that calculates the minimum Levenshtein distance between a given digit string (s) and any number x in the range b <= x <= e. Despite this added "power", the algorithm will still run in O(n^2) time :)
First, observe that if b and e have the same number of digits and b != e, then it must be the case that they consist of some number q >= 0 of identical digits at the left, followed by a digit that is larger in e than in b. Now consider the following procedure for generating a random digit string x:
1. Set x to the first q digits of b.
2. Append a randomly-chosen digit d between b[i] and e[i] to x.
3. If d == b[i], we "hug" the lower bound:
   For i from q+1 to m:
      If b[i] == 9 then append b[i]. [EDIT 10/8/2013: Actually this can't happen, because we chose q so that e[i] will be larger than b[i], and there is no digit larger than 9!]
      Otherwise, flip a coin:
         Heads: Append b[i].
         Tails: Append a randomly-chosen digit d > b[i], then goto 6.
   Stop.
4. Else if d == e[i], we "hug" the upper bound:
   For i from q+1 to m:
      If e[i] == 0 then append e[i]. [EDIT 10/8/2013: Actually this can't happen, because we chose q so that b[i] will be smaller than e[i], and there is no digit smaller than 0!]
      Otherwise, flip a coin:
         Heads: Append e[i].
         Tails: Append a randomly-chosen digit d < e[i], then goto 6.
   Stop.
5. Otherwise (if d is strictly between b[i] and e[i]), drop through to step 6.
6. Keep appending randomly-chosen digits to x until it has m digits.
The basic idea is that after including all the digits that you must include, you can either "hug" the lower bound's digits for as long as you want, or "hug" the upper bound's digits for as long as you want, and as soon as you decide to stop "hugging", you can thereafter choose any digits you want. For suitable random choices, this procedure will generate all and only the numbers x such that b <= x <= e.
In the "usual" Levenshtein distance computation between two strings s and x, of lengths n and m respectively, we have a rectangular grid from (0, 0) to (n, m), and at each grid point (i, j) we record the Levenshtein distance between the prefix s[1..i] and the prefix x[1..j]. The score at (i, j) is calculated from the scores at (i-1, j), (i, j-1) and (i-1, j-1) using bottom-up dynamic programming. To adapt this to treat x as one of a set of possible strings (specifically, a digit string corresponding to a number between b and e) instead of a particular given string, what we need to do is record not one but two scores for each grid point: one for the case where we assume that the digit at position j was chosen to hug the lower bound, and one where we assume it was chosen to hug the upper bound. The 3rd possibility (step 5 above) doesn't actually require space in the DP matrix because we can work out the minimal Levenshtein distance for the entire rest of the input string immediately, very similar to the way we work it out for the "easy" subproblems in the first section.
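For reference, the "usual" grid computation described above looks like this in Python. This is only the standard Levenshtein DP between two fixed strings, not the two-score variant; the function name is illustrative.

```python
def levenshtein(s, x):
    """Classic bottom-up Levenshtein distance between strings s and x."""
    n, m = len(s), len(x)
    # d[i][j] = distance between the prefixes s[:i] and x[:j].
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i characters of s
    for j in range(m + 1):
        d[0][j] = j  # insert all j characters of x
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1,                           # deletion
                d[i][j - 1] + 1,                           # insertion
                d[i - 1][j - 1] + (s[i - 1] != x[j - 1]),  # match or substitution
            )
    return d[n][m]

print(levenshtein("12007", "12296"))
```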
Super-Levenshtein DP recursion
Call the overall minimal score at grid point (i, j) v(i, j). Let diff(a, b) = 1 if characters a and b are different, and 0 otherwise. Let inrange(a, b..c) be 1 if the character a is in the range b..c, and 0 otherwise. The calculations are:
# The best Lev distance overall between s[1..i] and x[1..j]
v(i, j) = min(hb(i, j), he(i, j))
# The best Lev distance between s[1..i] and x[1..j] obtainable by
# continuing to hug the lower bound
hb(i, j) = min(hb(i-1, j)+1, hb(i, j-1)+1, hb(i-1, j-1)+diff(s[i], b[j]))
# The best Lev distance between s[1..i] and x[1..j] obtainable by
# continuing to hug the upper bound
he(i, j) = min(he(i-1, j)+1, he(i, j-1)+1, he(i-1, j-1)+diff(s[i], e[j]))
At the point in time when v(i, j) is being calculated, we will also calculate the Levenshtein distance resulting from choosing to "stop hugging", i.e. by choosing a digit that is strictly in between b[j] and e[j] (if j == q) or (if j != q) is either above b[j] or below e[j], and thereafter freely choosing digits to make the suffix of x match the suffix of s as closely as possible:
# The best Lev distance possible between the ENTIRE STRINGS s and x, given that
# we choose to stop hugging at the jth digit of x, and have optimally aligned
# the first i digits of s to these j digits
sh(i, j) = if j >= q then shc(i, j)+abs(n-i-m+j)
           else infinity
# A diagonal move is free when s[i] is one of the digits we may choose,
# so its cost is 1-inrange(...)
shc(i, j) = if j == q then
              min(hb(i, j-1)+1, hb(i-1, j-1)+1-inrange(s[i], (b[j]+1)..(e[j]-1)))
            else
              min(hb(i, j-1)+1, hb(i-1, j-1)+1-inrange(s[i], (b[j]+1)..9),
                  he(i, j-1)+1, he(i-1, j-1)+1-inrange(s[i], 0..(e[j]-1)))
The formula for shc(i, j) doesn't need to consider "downward" moves, since such moves don't involve any digit choice for x.
The overall minimal Levenshtein distance is the minimum of v(n, m) and sh(i, j), for all 0 <= i <= n and 0 <= j <= m.
Complexity
Take N to be the largest number of digits in any of s, b or e. The original problem can be split in linear time into at most 1 set of easy problems that collectively takes O(1) time to solve and 2 hard subproblems that each take O(N^2) time to solve using the super-Levenshtein algorithm, so overall the problem can be solved in O(N^2) time, i.e. time proportional to the square of the number of digits.
A first idea to speed up the computation (works if |e-b| is not too large):
Question: how much can the Levenshtein distance change when we compare s with n and then with n+1?
Answer: not too much!
Let's see the dynamic-programming tables for s = 12007 and two consecutive n
n = 12296
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 1 2 3
4 3 2 2 2 3
5 4 3 3 3 3
and
n = 12297
0 1 2 3 4 5
1 0 1 2 3 4
2 1 0 1 2 3
3 2 1 1 2 3
4 3 2 2 2 3
5 4 3 3 3 2
As you can see, only the last column changes, since n and n+1 have the same digits, except for the last one.
If you have the dynamic-programming table for the edit distance of s = 12007 and n = 12296, you already have the table for n = 12297; you just need to update the last column!
Obviously if n = 12299 then n+1 = 12300 and you need to update the last 3 columns of the previous table, but this happens just once every 100 iterations.
In general, you have to
update the last column on every iteration (so, length(s) cells)
update the second-to-last too, once every 10 iterations
update the third-to-last, too, once every 100 iterations
So let L = length(s) and D = e-b. First you compute the edit distance between s and b. Then you can find the minimum Levenshtein distance over [b,e] by looping over every integer in the interval. There are D of them, so the execution time is about:
L * D * (1 + 1/10 + 1/100 + ...)
Now since
1 + 1/10 + 1/100 + ... = 10/9
we have an algorithm which is
O(L * D)
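The column-reuse idea can be sketched in Python as below. It assumes b and e are ordinary integers small enough to loop over (so it does not cover the 1000-digit case), and the function name is made up for this illustration. Columns of the DP table that belong to the unchanged leading digits are kept from one number to the next.

```python
def min_distance_in_range(s, b, e):
    """Minimum Levenshtein distance between s and str(x) for b <= x <= e."""
    rows = len(s)
    # cols[j][i] = distance between s[:i] and the first j digits of the current x.
    cols = [list(range(rows + 1))]  # column 0 never changes
    prev = ""
    best = None
    for x in range(b, e + 1):
        cur = str(x)
        # Number of leading digits shared with the previous number.
        keep = 0
        while keep < min(len(cur), len(prev)) and cur[keep] == prev[keep]:
            keep += 1
        del cols[keep + 1:]  # drop the columns of the digits that changed
        for j in range(keep + 1, len(cur) + 1):
            left = cols[j - 1]
            col = [j]
            for i in range(1, rows + 1):
                col.append(min(left[i] + 1,
                               col[i - 1] + 1,
                               left[i - 1] + (s[i - 1] != cur[j - 1])))
            cols.append(col)
        dist = cols[-1][rows]
        best = dist if best is None else min(best, dist)
        prev = cur
    return best

print(min_distance_in_range("12007", 12296, 12297))
```

Most steps recompute a single column, and a carry such as 12299 to 12300 recomputes three, which is the behaviour described in the list above.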

Resources