Double Increasing Series relation explanation - algorithm

I was solving this question: https://www.interviewbit.com/problems/double-increasing-series/
Basically the question is:
Given two integers A and B.
Find the number of sequences of length B such that every element of the sequence is a positive integer less than or equal to A, and every element is less than or equal to half of the element that follows it.
This is a dynamic programming problem.
Its solution on the site is given as: DP (i, j) = DP (i-1, j) + DP (floor(i/2), j-1).
Can anyone explain how this recurrence is derived? I searched online but couldn't find any explanation.

The formula given is DP[i][j] = DP[i-1][j] + DP[i/2][j-1] (with integer division), where
i is the largest value allowed in the sequence (it can only occur as the last element, because the sequence is increasing).
j is the length of the sequence.
DP[i][j] is the total number of sequences of length j ending at an integer <= i.
Now try to understand this formula through an example.
Assume you want the number of sequences of length 5 ending at an integer <= 50, that is DP[50][5]. Every sequence of length 5 ending at an integer <= 49 (counted by DP[49][5]) is also a sequence of length 5 ending at an integer <= 50, so all of those are included in DP[50][5].
What remains are the sequences ending exactly at 50. In such a sequence the element before 50 must be <= 25 (50 / 2), so its first 4 (5 - 1) elements form a sequence of length 4 ending at an integer <= 25, and there are DP[25][4] of those. Appending 50 to each of them gives every sequence of length 5 ending at 50. Adding DP[49][5] and DP[25][4] therefore gives DP[50][5].
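A minimal Python sketch of this recurrence may help. The function name count_sequences and the base cases (DP[i][1] = i and DP[0][j] = 0) are assumptions that follow from the definition above; they are not taken from the site's solution, and any modulus the original problem may require is left out.

```python
def count_sequences(a, b):
    """Count sequences of length b with values in 1..a where each
    element is at most half of the next one (assumes b >= 1)."""
    # dp[i][j]: number of sequences of length j whose last element is <= i.
    # Row 0 stays all zeros: no positive integer is <= 0.
    dp = [[0] * (b + 1) for _ in range(a + 1)]
    for i in range(1, a + 1):
        dp[i][1] = i  # assumed base case: i single-element sequences
        for j in range(2, b + 1):
            # Either the last element is <= i - 1, or it is exactly i and
            # the first j - 1 elements end at a value <= i // 2.
            dp[i][j] = dp[i - 1][j] + dp[i // 2][j - 1]
    return dp[a][b]


print(count_sequences(5, 2))  # pairs (x, y) with 2 * x <= y <= 5
```

For A = 5 and B = 2 the valid pairs are (1,2), (1,3), (1,4), (2,4), (1,5) and (2,5), so the call prints 6.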

Related

Calculating limits in dynamic programming

I found this question on topcoder:
Your friend Lucas gave you a sequence S of positive integers.
For a while, you two played a simple game with S: Lucas would pick a number, and you had to select some elements of S such that the sum of all numbers you selected is the number chosen by Lucas. For example, if S={2,1,2,7} and Lucas chose the number 11, you would answer that 2+2+7 = 11.
Lucas now wants to trick you by choosing a number X such that there will be no valid answer. For example, if S={2,1,2,7}, it is not possible to select elements of S that sum up to 6.
You are given the int[] S. Find the smallest positive integer X that cannot be obtained as the sum of some (possibly all) elements of S.
Constraints: - S will contain between 1 and 20 elements, inclusive. - Each element of S will be between 1 and 100,000, inclusive.
But in the editorial solution it has been written:
How about finding the smallest impossible sum? Well, we can try the following naive algorithm: First try with x = 1, if this is not a valid sum (found using the methods in the previous section), then we can return x, else we increment x and try again, and again until we find the smallest number that is not a valid sum.
Let's find an upper bound for the number of iterations, the number of values of x we will need to try before we find a result. First of all, the maximum sum possible in this problem is 100000 * 20 (All numbers are the maximum 100000), this means that 100000 * 20 + 1 will be an impossible value. We can be certain to need at most 2000001 steps.
How good is this upper bound? If we had 100000 in each of the 20 numbers, 1 wouldn't be a possible sum. So we actually need one iteration in that case. If we want 1 to be a possible sum, we should have 1 in the initial elements. Then we need a 2 (Else we would only need 2 iterations), then a 4 (3 can be found by adding 1+2), then 8 (Numbers from 5 to 7 can be found by adding some of the first 3 powers of two), then 16, 32, .... It turns out that with the powers of 2, we can easily make inputs that require many iterations. With the first 17 powers of two, we can cover up to the first 262143 integer numbers. That should be a good estimation for the largest number. (We cannot use 2^18 in the input, since elements are at most 100000.)
Up to 262143 times, we need to query if a number x is in the set of possible sums. We can just use a boolean array here. It appears that even O(log(n)) data structures should be fast enough, however.
I did understand the first paragraph. But after that they explain something starting with "How good is this upper bound?...", and I couldn't understand that paragraph. How did they deduce that we need to query up to 262143 times whether a number x is in the set of possible sums?
I am a newbie at dynamic programming, so it would be great if somebody could explain this to me.
Thank you.
The idea is as follows:
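If the input sequence contains the first k powers of two: 2^0, 2^1, ... 2^(k-1), then the sum can be any integer between 0 and (2^k) - 1. The editorial's figure 262,143 = 2^18 - 1 is what the 18 powers 2^0 ... 2^17 would cover. (Strictly, 2^17 = 131,072 is already above the 100,000 limit, so only the 17 powers 2^0 ... 2^16 fit, and they cover every sum up to 2^17 - 1 = 131,071.) If a power of two were missing, there would be a smaller sum that could not be achieved.
However, the statement overlooks that the sequence may hold up to 20 numbers, so a few more numbers remain after the powers of two. Each extra number that is no larger than the covered range plus one extends that range by its own value, so the largest number to check can be somewhat higher than the editorial's estimate. For example, 2^0 ... 2^16 followed by three elements equal to 100,000 covers every sum up to 431,071. This still stays well below the safe bound of 2,000,001.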
You may wonder why we use powers of two and not any other powers. The reason is the binary selection that we perform on the numbers in the input sequence. Either we add a number to the sum or we don't. So, if we represent this selection for number ni as a selection variable si (either 0 or 1), then the possible sum is:
s = s0 * n0 + s1 * n1 + s2 * n2 + ...
Now, if we choose the ni to be powers of two ni = 2^i, then:
s = s0 * 2^0 + s1 * 2^1 + s2 * 2^2 + ...
= sum si * 2^i
This is equivalent to the binary representations of numbers (see Positional Notation). By definition, different choices for the selection variables will produce different sums. Hence, the number of possible sums is maximal by choosing powers of two in the input sequence.
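To make the role of the powers of two concrete, here is a small Python sketch of the naive search the editorial describes, using a set of reachable sums. The function name smallest_impossible_sum is hypothetical, and this is only one straightforward way to do it, not the editorial's actual code.

```python
def smallest_impossible_sum(values):
    """Return the smallest positive integer that is not a subset sum."""
    reachable = {0}  # sums obtainable from the elements seen so far
    for v in values:
        # Each element is either skipped or added to every earlier sum.
        reachable |= {s + v for s in reachable}
    x = 1
    while x in reachable:
        x += 1
    return x


print(smallest_impossible_sum([2, 1, 2, 7]))                 # example from the problem
print(smallest_impossible_sum([2 ** i for i in range(17)]))  # powers 2^0 .. 2^16
```

The first call prints 6, matching the problem statement. The second prints 131072, since the 17 powers of two cover every sum from 1 to 2^17 - 1.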

Why is the total number of possible substrings of a string n^2?

I read that the total number of substrings that can be formed from a given string is n^2 but I don't understand how to count this.
By substrings, I mean, given a string CAT, the substrings would be:
C
CA
CAT
A
AT
T
The total number of (nonempty) substrings is n + C(n,2). The leading n counts the number of substrings of length 1 and C(n,2) counts the number of substrings of length > 1 and is equal to the number of ways to choose 2 indices from the set of n. The standard formula for binomial coefficients yields C(n,2) = n*(n-1)/2. Combining these two terms and simplifying gives that the total number is (n^2 + n)/2. @rici in the comments notes that this is the same as C(n+1,2), which makes sense if you e.g. think in terms of Python string slicing, where substrings of s can always be written in the form s[i:j] where 0 <= i < j <= n (with j being 1 more than the final index). For n = 3 this works out to (9 + 3)/2 = 6.
In the sense of complexity theory the number of substrings is O(n^2), which might be what you read somewhere.
You have a starting point and an end point. If each could point anywhere along the word, each would have n possible values, giving n^2 overall, so that's an upper limit.
However, we need a constraint saying that the substring cannot end before it starts, so end - start >= 0. This cuts the possible count roughly in half, but in asymptotic terms it's still O(n^2).
Substring calculation is logically
selecting 2 blank spaces at least one letter apart.
a| b c | d = substring bc
| a b c |d = substring abc.
Now, how many ways can you choose these 2 blank spaces? For an n-letter word there are n+1 blank spaces.
First select one = n+1 ways
Select another (not the same) = n ways
So the total is n(n+1). But you have counted every pair twice, so the answer is n*(n+1)/2.
Programmatically, without applying any special algorithms (like the Z algorithm etc.), you can use a map to count the number of distinct substrings (O(n^3)).
You can use a suffix tree to get an O(n^2) substring calculation.
To get a substring of a given string s, you just need to select two different points in the string. Let s contain n characters,
|s[0]|s[1]|...|s[n-1]|
You want to choose two vertical bars to get a substring. How many vertical bars do you have? Exactly n+1. So the number of substrings is C(n+1,2) = n(n+1)/2, which is the number of ways to choose 2 items from n+1. Of course, it can be written as O(n^2).
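The counting argument can be checked directly with slicing. The sketch below is illustrative only; the helper name all_substrings is made up for this example.

```python
def all_substrings(s):
    """List every nonempty substring s[i:j] with 0 <= i < j <= len(s)."""
    n = len(s)
    return [s[i:j] for i in range(n) for j in range(i + 1, n + 1)]


subs = all_substrings("CAT")
print(subs)       # ['C', 'CA', 'CAT', 'A', 'AT', 'T']
print(len(subs))  # 3 * 4 / 2 = 6
```

Each pair (i, j) corresponds to one choice of two distinct bars, which is why the list always has n(n+1)/2 entries (counting repeats when the string has repeated letters).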

Count of numbers between A and B (inclusive) that have sum of digits equal to S

The problem is to find the count of numbers between A and B (inclusive) that have sum of digits equal to S.
Also print the smallest such number between A and B (inclusive).
Input:
Single line consisting of A,B,S.
Output:
Two lines.
In first line the number of integers between A and B having sum of digits equal to S.
In second line the smallest such number between A and B.
Constraints:
1 <= A <= B < 10^15
1 <= S <= 135
Source: Hacker Earth
My solution works for only 30% of their inputs. What could be the best possible solution to this?
The algorithm I am using now computes the digit sum of the smallest number, adds one for each following number, and recomputes the sum from scratch whenever the tens digit changes.
Below is the solution in Python:
def digit_sum(n):
    # Recursively add up the decimal digits of n.
    if n < 10:
        return n
    return n % 10 + digit_sum(n // 10)

a, b, s = map(int, input().split())
count = 0
smallest = None  # first (and therefore smallest) match found
su = digit_sum(a)
while a <= b:
    # A carry changes more than the last digit, so recompute from scratch.
    if a % 10 == 0:
        su = digit_sum(a)
    if s == su:
        count += 1
        if smallest is None:
            smallest = a
    a += 1
    su += 1
print(count)
print(smallest)
There are two separate problems here: finding the smallest number between those numbers that has the right digit sum and finding the number of values in the range with that digit sum. I'll talk about those problems separately.
Counting values between A and B with digit sum S.
The general approach for solving this problem will be the following:
Compute the number of values less than or equal to A - 1 with digit sum S.
Compute the number of values less than or equal to B with digit sum S.
Subtract the first number from the second.
To do this, we can use a dynamic programming approach. We're going to answer queries of the following form:
How many D-digit numbers are there whose first digit is k and whose digits sum up to S? (Leading zeros are allowed, so k may be 0.)
We'll create a table N[D, k, S] to hold these values. We know that D is going to be at most 16 and that S is going to be at most 136, so this table will have only 10 × 16 × 136 = 21,760 entries, which isn't too bad. To fill it in, we can use the following base cases:
N[1, S, S] = 1 for 0 ≤ S ≤ 9, since exactly one one-digit number has digit sum S, namely S itself.
N[1, k, S] = 0 for 0 ≤ S ≤ 9 if k ≠ S, since the one-digit number k has digit sum k, not S.
N[1, k, S] = 0 for 10 ≤ S ≤ 135, since no one-digit number has a digit sum of 10 or more.
N[1, k, S] = 0 for any S < 0.
Then, we can use the following logic to fill in the other table entries:
N[D + 1, k, S] = sum(i from 0 to 9) N[D, i, S - k].
This says that the number of (D+1)-digit numbers whose first digit is k that sum up to S is given by the number of D-digit numbers that sum up to S - k. The number of D-digit numbers that sum up to S - k is given by the number of D-digit numbers that sum up to S - k whose first digit is 0, 1, 2, ..., 9, so we have to sum up over them.
Filling in this DP table takes only a constant amount of work, since its size is fixed by the constraints, and in fact you could conceivably precompute it and hardcode it into the program if you were really concerned about time.
So how can we use this table? Well, suppose we want to know how many numbers that sum up to S are less than or equal to some number X. To do this, we can process the digits of X one at a time. Let's write X one digit at a time as d1 ... dn. We can start off by looking at N[n, d1, S]. This gives us the number of n-digit numbers whose first digit is d1 that sum up to S. This may overestimate the number of values less than or equal to X that sum up to S. For example, if our number is 21,111 and we want the number of values that sum up to exactly 12, then looking up this table value will give us false positives for numbers like 29,100 that start with a 2 and are five digits long, but which are still greater than X. To handle this, we can move to the next digit of the number X. Since the first digit was a 2, the rest of the digits in the number must sum up to 10. Moreover, since the next digit of X (21,111) is a 1, we can now subtract from our total the number of 4-digit numbers starting with 2, 3, 4, 5, ..., 9 that add up to 10. We can then repeat this process one digit at a time.
More generally, our algorithm will be as follows. Let X be our number and S the target sum. Write X = d1d2...dn and compute the following:
# Begin with all n-digit numbers (leading zeros allowed) whose first
# digit is at most d[1].
result = 0
for i from 0 to d[1]:
    result += N[n, i, S]
# Now, exclude everything that shares a prefix with X but is too large.
S -= d[1]
for i = 2 to n:
    # Numbers that agree with X on digits 1..i-1 but have a larger i-th digit.
    for j = d[i] + 1 to 9:
        result -= N[n - i + 1, j, S]
    S -= d[i]
The value of result will then be the number of values less than or equal to X whose digits sum to the original target S (the pseudocode reduces S as it consumes digits). The loop runs for at most 16 iterations, so it should be very quick. Moreover, combining this algorithm with the earlier subtraction trick gives the number of values between A and B that sum up to exactly S.
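The table and the digit-by-digit correction can be sketched in Python as follows. This is one possible reading of the approach, under the assumption that leading zeros are allowed in the table; the names build_table, count_upto and count_in_range are invented for the example.

```python
def build_table(max_digits=16, max_sum=135):
    """table[d][k][s]: d-digit strings (leading zeros allowed) whose
    first digit is k and whose digits sum to s."""
    table = [[[0] * (max_sum + 1) for _ in range(10)]
             for _ in range(max_digits + 1)]
    for k in range(10):
        table[1][k][k] = 1
    for d in range(1, max_digits):
        # totals[s]: d-digit strings with digit sum s, any first digit.
        totals = [sum(table[d][i][s] for i in range(10))
                  for s in range(max_sum + 1)]
        for k in range(10):
            for s in range(k, max_sum + 1):
                table[d + 1][k][s] = totals[s - k]
    return table


def count_upto(x, s, table):
    """Count integers in [0, x] whose digits sum to s."""
    if x < 0:
        return 0
    digits = [int(c) for c in str(x)]
    n = len(digits)

    def lookup(d, k, t):
        # Sums outside the table (negative or too large) have no matches.
        return table[d][k][t] if 0 <= t < len(table[d][k]) else 0

    # Everything with n digits and a first digit of at most digits[0].
    result = sum(lookup(n, k, s) for k in range(digits[0] + 1))
    remaining = s - digits[0]
    for i in range(1, n):
        # Remove numbers that match x before position i but exceed it there.
        for j in range(digits[i] + 1, 10):
            result -= lookup(n - i, j, remaining)
        remaining -= digits[i]
    return result


def count_in_range(a, b, s, table):
    return count_upto(b, s, table) - count_upto(a - 1, s, table)


TABLE = build_table()
print(count_in_range(1, 100, 10, TABLE))  # 19, 28, ..., 91
```

The final call prints 9. Building the table once and reusing it keeps each query down to a handful of lookups per digit.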
Finding the smallest value in [A, B] with digit sum S.
We can use a similar trick with our DP table to find the smallest number greater than or equal to A that sums up to exactly S. I'll leave the details as an exercise, but as a hint, work one digit at a time, trying to find the smallest number for which the DP table returns a nonzero value.
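For readers who want something runnable, here is one possible sketch. Note that it swaps in a different technique from the hint: instead of walking the table digit by digit, it binary-searches on a counting function (written here as a small memoized digit DP so the block is self-contained). The names count_upto and smallest_with_digit_sum are hypothetical.

```python
from functools import lru_cache


def count_upto(x, s):
    """Count integers in [0, x] whose digits sum to s."""
    if x < 0:
        return 0
    digits = [int(c) for c in str(x)]

    @lru_cache(maxsize=None)
    def go(pos, remaining, tight):
        # tight: the prefix chosen so far equals the prefix of x.
        if remaining < 0:
            return 0
        if pos == len(digits):
            return 1 if remaining == 0 else 0
        limit = digits[pos] if tight else 9
        return sum(go(pos + 1, remaining - d, tight and d == limit)
                   for d in range(limit + 1))

    return go(0, s, True)


def smallest_with_digit_sum(a, b, s):
    """Smallest x in [a, b] with digit sum s, or None if there is none."""
    base = count_upto(a - 1, s)
    if count_upto(b, s) == base:
        return None
    lo, hi = a, b
    while lo < hi:
        mid = (lo + hi) // 2
        # The count is non-decreasing in x, so binary search applies.
        if count_upto(mid, s) > base:
            hi = mid
        else:
            lo = mid + 1
    return lo


print(smallest_with_digit_sum(1, 100, 10))   # 19
print(smallest_with_digit_sum(30, 36, 10))   # None
```

The binary search needs about 50 counting calls for the given constraints, which is slower than the digit-by-digit construction but still fast in practice.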
Hope this helps!

Sum of last k digits same as sum of first k digits

I want to find, for the numbers in a given range, whether the sum of the first k digits is equal to the sum of the last k digits. The range is very large and k is less than 20.
One way to do this is by brute force. Can someone suggest a more efficient algorithm?
If it is a range, the first digits will not change often and the last digits will change in a simple way. Let S be the sum of the first 20 digits. As long as the second digit doesn't change, the sum of the last digits increases by one each time you go to the next number. So if all your digits except the last one are fixed, and the sum of the last digits is Si when the last digit equals i, then the only good last digit is n = S - Si + i. You then have to check that n is between 0 and 9 and that the resulting number is in the interval. This divides the number of lookups by ten.
You can then apply the same check to the second digit.
If n is lower than 0, you need to decrease the second digit by -n. Call n2 this second digit. If n2 >= 0, the good numbers will end with (n2, 0), (n2 - 1, 1), ..., (0, n2). This divides the complexity by 100.
If n is greater than 9, you increase the second digit by n - 9. Call n2 this second digit. If n2 <= 9, the good numbers are (n2, 9), (n2 - 1, 8), ..., (0, something).
This also divides the complexity by 100.
You can do the same for the third digit, then the fourth, up to the 20th. This results in computing just one sum, and a complexity in O(number of solutions), so it is minimal. When coding, be careful that the first digits can change too: do one computation per group of 20 first digits.
One theoretical improvement to the brute-force method:
1) Sum up the first k digits and store the result in sumFirst.
2) Sum up the last k digits, but stop as soon as the sum exceeds sumFirst.
Point 2 could save summing some of the last few digits.
But you have to measure whether the additional logic costs more than simply adding all k digits.
Optimization N-k
The algorithm can be improved when the number has N digits with N < 2k.
For instance, if N = 5 and k = 3 (5 < 2 × 3), with digits
abcde
you only have to compare ab against de: there is no need to check all k (3) digits, since the third digit c is shared by the first k and the last k digits. In other words, the number of digits to be summed on each side is only
min(k, N-k), having N >= k
If you are going to run this query multiple times on the same array, you can replace each element with the sum of itself and all previous elements, which takes O(n) for an array of size n, i.e.
for (int i = 1; i < n; i++)
    arr[i] = arr[i] + arr[i-1];
This converts your array from a probability density function into a cumulative distribution function (for discrete numbers), that is, into prefix sums. Each query is then O(1), i.e.
// arr[k-1] is the sum of the first k elements; the last k elements sum to
// arr[n-1] minus the prefix that ends just before them.
if (arr[k-1] == arr[n-1] - (n > k ? arr[n-k-1] : 0))
    return true;
return false;
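The same prefix-sum idea can be sketched in Python. Using a prefix array with a leading 0 is a design choice made here to avoid the special case at the start of the array; the function name first_last_k_equal is hypothetical.

```python
from itertools import accumulate


def first_last_k_equal(digits, k):
    """True if the first k and last k entries of digits have equal sums."""
    n = len(digits)
    # prefix[i] is the sum of digits[0:i], so prefix[0] == 0.
    prefix = [0] + list(accumulate(digits))
    return prefix[k] == prefix[n] - prefix[n - k]


print(first_last_k_equal([1, 2, 3, 3, 2, 1], 2))  # True: 1 + 2 == 2 + 1
print(first_last_k_equal([1, 2, 3, 4], 2))        # False: 3 != 7
```

If many values of k are queried for the same digits, the prefix list would be built once and reused rather than rebuilt per call as in this sketch.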
Another improvement over brute force:
i = 0, T = 0
while i < k and |T| <= 9 * (k - i)
    T = T + last[i] - first[i]
    i = i + 1
return T == 0
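A Python sketch of this early-exit idea follows, assuming first and last are equal-length lists holding the first k and last k digits (the function name is made up). The running difference can change by at most 9 per remaining position, so once its absolute value exceeds 9 times the number of positions left, the sums cannot match.

```python
def sums_match_early_exit(first, last):
    """Compare digit sums of two equal-length digit lists, stopping as
    soon as the running difference can no longer return to zero."""
    k = len(first)
    diff = 0
    for i in range(k):
        # k - i positions remain, each able to shift diff by at most 9.
        if abs(diff) > 9 * (k - i):
            return False
        diff += last[i] - first[i]
    return diff == 0


print(sums_match_early_exit([9, 0], [0, 9]))        # True
print(sums_match_early_exit([9, 9, 9], [0, 0, 1]))  # False, decided early
```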

Counting Binary Strings

This is in reference to this problem. We are required to calculate f(n , k), which is the number of binary strings of length n that have the length of the longest substring of ones as k. I am having trouble coming up with a recursion.
The case when the ith digit is a 0, I think I can handle.
Specifically, I am unable to extend the solution to a sub-problem f(i-1, j) when I consider the ith digit to be a 1. How do I stitch the two together?
Sorry if I am a bit unclear. Any pointers would be a great help. Thanks.
I think you could build up a table using a variation of dynamic programming, if you expand the state space. Suppose that you calculate f(n,k,e) defined as the number of different binary strings of length n with the longest substring of 1s length at most k and ending with e 1s in a row. If you have calculated f(n,k,e) for all possible values of k and e associated with a given n, then, because you have the values split up by e, you can calculate f(n+1,k,e) for all possible values of k and e - what happens to an n-long string when you extend it with 0 or 1 depends on how many 1s it ends with at the moment, and you know that because of e.
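A small Python sketch of the expanded state space may make this concrete. It is a variation on the f(n,k,e) idea rather than a literal transcription: the state tracks the exact longest run so far together with the current trailing run of 1s, and the function name count_by_longest_run is hypothetical.

```python
from collections import Counter


def count_by_longest_run(n):
    """result[k] = number of binary strings of length n whose longest
    run of 1s has length exactly k."""
    # states[(longest, trailing)] = number of strings built so far.
    states = Counter({(0, 0): 1})
    for _ in range(n):
        nxt = Counter()
        for (longest, trailing), ways in states.items():
            # Appending 0 resets the trailing run.
            nxt[(longest, 0)] += ways
            # Appending 1 extends the trailing run and may set a new maximum.
            t = trailing + 1
            nxt[(max(longest, t), t)] += ways
        states = nxt
    result = [0] * (n + 1)
    for (longest, _trailing), ways in states.items():
        result[longest] += ways
    return result


print(count_by_longest_run(3))  # [1, 4, 2, 1]
```

For n = 3 the output matches a hand count: 000 has no 1s, four strings have a longest run of 1, two (011 and 110) have a run of 2, and 111 has a run of 3.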
Let s be the start index of the length k pattern. Then s is in: 1 to n-k.
For each s, we divide the string S into three strings:
PRE(s,k,n) = S[1:s-1]
POST(s,k,n)=S[s+k-1:n]
ONE(s,k,n) which has all 1s from S[s] to S[s+k-1]
The longest sub-string of 1s for PRE and POST should be less than k.
Let
x = s-1
y = n-(s+k)-1
Let NS(p,k) be the total number of strings of length p whose longest sub-string of 1s has size greater than or equal to k.
NS(p,k) = sum{f(p,k), f(p,k+1),... f(p,p)}
Terminating condition:
NS(p,k) = 1 if p==k, 0 if k>p
f(n,k) = 1 if n==k, 0, if k > n.
For a string of length n, the number of permutations such that the longest substring of 1s is of size less than k = 2^n - NS(n,k).
f(n,k) = Sum over all s=1 to n-k
{2^x - NS(x,k)}*{2^y - NS(y,k)}
i.e. product of the number of permutations of each of the pre and post substrings where the longest sub-string is less than size k.
So we have a repeating sub-problem, and a whole bunch of reuse which can be DPed
Added Later:
Based on the comment below, I guess we really do not need to go into NS.
We can define S(p,k) as
S(p,k) = sum{f(p,1), f(p,2),... f(p,k-1)}
and
f(n,k) = Sum over all s=1 to n-k
S(x,k)*S(y,k)
I know this is quite an old question. If anyone wants, I can clarify my small answer.
Here is my code:
#include<bits/stdc++.h>
using namespace std;
// DP[i][j]: number of binary strings of length i whose longest run of 1s is exactly j.
long long DP[64][64];
int main()
{
    ios::sync_with_stdio(0);
    cin.tie(0);
    int i,j,k;
    DP[1][0]=1;
    DP[1][1]=1;
    DP[0][0]=1;
    cout<<"1 1\n";
    for(i=2;i<=63;i++,cout<<"\n")
    {
        DP[i][0]=1;   // all zeros
        DP[i][i]=1;   // all ones
        cout<<"1 ";
        for(j=1;j<i;j++)
        {
            for(k=0;k<=j;k++)
                DP[i][j]+=DP[i-k-1][j]+DP[i-j-1][k];
            DP[i][j]-=DP[i-j-1][j];   // this term was added twice in the loop
            cout<<DP[i][j]<<" ";
        }
        cout<<"1 ";
    }
    return 0;
}
DP[i][j] represents F(i,j).
Transitions/Recurrence (hard to think of):
Considering F(i,j):
1) I can put k 1s on the right and separate them from the rest with a 0, i.e.
String + 0 + k times '1'.
F(i-k-1,j)
Note: k=0 signifies that I am only putting a 0 on the right!
2) This misses the ways in which the rightmost j+1 positions are filled with a 0 followed by j '1's and everything to the left has no run of consecutive 1s of length j!
F(i-j-1,k) (Note: I have used k for both cases only because I did so in my code; you can define other variables too!)
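As a cross-check, the recurrence from the code above can be ported to Python and compared with a brute-force count for small lengths. This is a sketch of the same transitions under the reading that F(i,j) counts strings of length i whose longest run of 1s is exactly j; the function name longest_run_table is invented here.

```python
from itertools import product


def longest_run_table(n_max):
    """F[i][j]: binary strings of length i whose longest run of 1s is j."""
    F = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    for i in range(n_max + 1):
        F[i][0] = 1  # all zeros
        F[i][i] = 1  # all ones
    for i in range(2, n_max + 1):
        for j in range(1, i):
            total = 0
            for k in range(j + 1):
                # Case 1: prefix with longest run j, then 0, then k ones.
                # Case 2: prefix with longest run k <= j, then 0, then j ones.
                total += F[i - k - 1][j] + F[i - j - 1][k]
            # The k == j term appears in both cases; count it once.
            F[i][j] = total - F[i - j - 1][j]
    return F


F = longest_run_table(8)
for n in range(1, 9):
    brute = [0] * (n + 1)
    for bits in product("01", repeat=n):
        brute[max(len(run) for run in "".join(bits).split("0"))] += 1
    assert F[n][: n + 1] == brute
print(F[3][:4])  # [1, 4, 2, 1]
```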
