Number of ways to remove items from box - algorithm

I encountered the following algorithmic question which has strict constraints on runtime (<10s and no large memory footprint) and I am stumped. My approach fails half the test cases.
Question
A box contains a number of items that can only be removed 1 at a time or 3 at a time.
How many ways can the box be emptied? The answer can be very large, so return it modulo 10^9 + 7.
For example, there are n=7 items initially. They can be removed in nine ways, as follows:
1.(1,1,1,1,1,1,1)
2.(1,1,1,1,3)
3.(1,1,1,3,1)
4.(1,1,3,1,1)
5.(1,3,1,1,1)
6.(3,1,1,1,1)
7.(1,3,3)
8.(3,1,3)
9.(3,3,1)
So the function should return 9.
Function Description:
Your function must take in a parameter, n for the number of items, and return an integer which denotes the number of ways to empty the box.
Constraints: 1<=n<=10^8
Sample cases :
Input: 1
Sample Output: 1
Explanation: There is only 1 way to remove 1 item. Answer=(1%1000000007)=1
Input: 7
Sample Output: 9
There are 9 ways to remove 7 items.
My Approach
This leads to a standard recurrence relation where f(n) = f(n-3) + f(n-1) for n > 2, so I did it as follows:
def memoized_number_of_ways(dic, n):
    if n not in dic:
        dic[n] = memoized_number_of_ways(dic, n-3) + memoized_number_of_ways(dic, n-1)
    return dic[n]

def numberOfWays(n):
    # Write your code here
    memoize = {1: 1, 2: 1, 3: 2}
    import math
    ans = memoized_number_of_ways(memoize, n)
    return ans % (math.pow(10, 9) + 7)
However, this fails on any case where n > 10**2. How can you do this problem while accommodating n up to 10^8, in less than 10 s, and without much memory?

Just write your recurrence using matrices (pardon my way of writing matrices, StackOverflow doesn't allow LaTeX).
[f(n)  ]   [1 0 1] [f(n-1)]
[f(n-1)] = [1 0 0] [f(n-2)]
[f(n-2)]   [0 1 0] [f(n-3)]
Now all you have to do is raise a 3x3 matrix (with entries reduced modulo the fixed constant) to the power n (or n-3 or something like that, depending on your "base case column vector"; fill in the details), and then multiply it by that base case column vector. This can be done in time O(log n).
PS: You may want to look up matrix exponentiation.
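For concreteness, here is a minimal pure-Python sketch of that idea (my own code, not part of the answer above): square-and-multiply on the 3x3 matrix, reducing every entry modulo 10^9 + 7, then one multiplication with the base-case column vector [f(3), f(2), f(1)] = [2, 1, 1].

MOD = 10**9 + 7

def mat_mul(X, Y):
    # multiply two 3x3 matrices, reducing entries mod MOD
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) % MOD
             for j in range(3)] for i in range(3)]

def mat_pow(M, e):
    # raise a 3x3 matrix to the power e by repeated squaring
    R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity
    while e:
        if e & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        e >>= 1
    return R

def number_of_ways(n):
    # f(1)=1, f(2)=1, f(3)=2 and f(n) = f(n-1) + f(n-3)
    if n <= 3:
        return [1, 1, 2][n - 1]
    P = mat_pow([[1, 0, 1], [1, 0, 0], [0, 1, 0]], n - 3)
    # [f(n), f(n-1), f(n-2)]^T = P * [f(3), f(2), f(1)]^T
    return (2 * P[0][0] + P[0][1] + P[0][2]) % MOD

print(number_of_ways(7))      # 9
print(number_of_ways(10**8))  # 109786077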

Three solutions; the fastest takes about 31 μs for n=10^8 on tio.run (which has medium-fast computers).
A matrix power solution, as described by advocateofnone, that takes about 1 millisecond (Try it online!):
import numpy as np
from time import time

class ModInt:
    def __init__(self, x):
        self.x = x % (10**9 + 7)
    def __add__(a, b):
        return ModInt(a.x + b.x)
    def __mul__(a, b):
        return ModInt(a.x * b.x)
    def __str__(self):
        return str(self.x)

def solve(n):
    O = ModInt(0)
    I = ModInt(1)
    A = np.matrix([[O,I,O], [O,O,I], [I,O,I]])
    return (A**n)[2,2]

for _ in range(3):
    t0 = time()
    print(solve(10**8), time() - t0)
Output (result and time in seconds for n=10^8, three attempts):
109786077 0.0010712146759033203
109786077 0.0010180473327636719
109786077 0.0009677410125732422
Another, taking about 0.5 milliseconds (Try it online!):
import numpy as np
from time import time

def solve(n):
    A = np.matrix([[0,1,0], [0,0,1], [1,0,1]])
    power = 1
    mod = 10**9 + 7
    while n:
        if n % 2:
            power = power * A % mod
        A = A**2 % mod
        n //= 2
    return power[2,2]

for _ in range(3):
    t0 = time()
    print(solve(10**8), time() - t0)
One based on @rici's solution in the comments, takes about 31 μs (Try it online!):
from timeit import repeat

def solve(n):
    m = 10**9 + 7
    def abc(n):
        if n == 0:
            return 0, 1, 0
        a, b, c = abc(n // 2)
        d = a + c
        e = b + d
        A = 2*a*b + c*c
        C = 2*b*c + d*d
        E = 2*c*d + e*e
        D = A + C
        B = E - D
        if n % 2:
            A, B, C = B, C, D
        return A%m, B%m, C%m
    return sum(abc(n)) % m

n = 10**8
print(solve(n))
for _ in range(3):
    t = min(repeat(lambda: solve(n), 'gc.enable()', number=1000)) / 1000
    print('%.1f μs' % (t * 1e6))
Explanation: Looking at the matrix powers from my previous solutions, I noticed they only actually contain five different values, and they're consecutive result numbers from our desired sequence. For example, A**19 is:
[[277 189 406]
[406 277 595]
[595 406 872]]
I gave them names in increasing order:
| b a c |
| c b d |
| d c e |
Squaring that matrix gives the corresponding matrix for a larger n, with entries A/B/C/D/E in the same layout; if you work out the square of the above matrix symbolically, you'll find the relationships A = 2*a*b + c*c etc.
My helper function abc(n) computes the entries a/b/c of the n-th matrix power. For n=0, that's the identity matrix, so my a/b/c are 0/1/0 there. And in the end, I return the e-value (computed as e = b + d = a + b + c).
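A quick sanity check of that layout (my own snippet, assuming NumPy is available), using n = 19 as in the example above:

import numpy as np

A = np.matrix([[0, 1, 0], [0, 0, 1], [1, 0, 1]])
M = A ** 19
print(M)  # [[277 189 406] [406 277 595] [595 406 872]]
# only five distinct values, arranged as [[b,a,c],[c,b,d],[d,c,e]]
b, a, c, d, e = M[0, 0], M[0, 1], M[0, 2], M[1, 2], M[2, 2]
print((M == np.matrix([[b, a, c], [c, b, d], [d, c, e]])).all())  # True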

Here's a simple iterative O(n) time / O(1) space solution whose optimized version takes 6 seconds on a medium-fast machine (unoptimized takes 15 seconds there).
Unoptimized (Try it online!):
def solve(n):
    mod = 10**9 + 7
    a = b = c = 1
    for _ in range(n):
        a, b, c = b, c, (a+c) % mod
    return a

print(solve(7))
print(solve(10**8))
Optimized (Try it online!):
def solve(n):
    mod = 10**9 + 7
    a = b = c = 1
    for _ in range(n // 300):
        for _ in range(100):
            a += c
            b += a
            c += b
        a %= mod
        b %= mod
        c %= mod
    for _ in range(n % 300):
        a, b, c = b, c, (a+c) % mod
    return a
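The speedup presumably comes from performing far fewer % reductions and tuple assignments, at the cost of letting the integers grow for up to 300 steps between reductions (which Python's arbitrary-precision ints handle fine). A quick check of the optimized version (my addition):

print(solve(7))      # 9
print(solve(10**8))  # 109786077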

Your solution is on the right track and the bug is not related to your algorithm (Yay).
The problem is that you are performing operations on some very big numbers and only reduce at the end, so you lose precision. Notice that you can apply the mod 10**9 + 7 throughout your code, since taking the remainder is compatible with addition ((a + b) mod m = ((a mod m) + (b mod m)) mod m). By doing so you keep all your numbers below a certain size and you will not have any floating point precision errors:
def memoized_number_of_ways(dic, n):
    if n not in dic:
        # reduce as you go; use an integer modulus (math.pow returns a float)
        dic[n] = (memoized_number_of_ways(dic, n-3) + memoized_number_of_ways(dic, n-1)) % (10**9 + 7)
    return dic[n]

def numberOfWays(n):
    memoize = {1: 1, 2: 1, 3: 2}
    ans = memoized_number_of_ways(memoize, n)
    return ans
Note that to be able to answer the question for n > 1000 you also need to deal with Python's recursion limit (the memoized recursion exceeds the default limit of 1000 frames and raises a RecursionError).
Unfortunately, even a very efficient solution (hint: you don't really need more than 3 items in your dict at any moment) will not solve the question for n ~ 10 ** 9 under a second, and you will need to find another way - a great option is the second answer here :)

Related

Number of N-digit numbers that are divisible by given two numbers

One of my friends got this question in a Google coding contest. Here is the question:
Find the number of N-digit numbers that are divisible by both X and Y.
Since the answer can be very large, print the answer modulo 10^9 + 7.
Note: 0 is not considered a single-digit number.
Input: N, X, Y.
Constraints:
1 <= N <= 10000
1 <= X,Y <= 20
Eg-1 :
N = 2, X = 5, Y = 7
output : 2 (35 and 70 are the required numbers)
Eg-2 :
N = 1, X = 2, Y = 3
output : 1 (6 is the required number)
If the constraints on N were smaller, then it would be easy (ans = 10^N / LCM(X,Y) - 10^(N-1) / LCM(X,Y)).
But N can be up to 10000, hence I am unable to solve it.
This question looks like it was intended to be more difficult, but I would do it pretty much the way you said:
ans = floor((10^N - 1)/LCM(X,Y)) - floor((10^(N-1) - 1)/LCM(X,Y))
The trick is to calculate the terms quickly.
Let M = LCM(X,Y), and say we have:
10^a = M*q_a + r_a, and
10^b = M*q_b + r_b
Then we can easily calculate:
10^(a+b) = M*(M*q_a*q_b + r_a*q_b + r_b*q_a + floor(r_a*r_b/M)) + (r_a*r_b mod M)
With that formula, we can calculate the quotient and remainder for 10^N/M in just 2 log N steps using exponentiation by squaring: https://en.wikipedia.org/wiki/Exponentiation_by_squaring
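As a rough illustration of that doubling idea (my own sketch with made-up names, not code from the answer): carry the pair (quotient mod 10^9+7, exact remainder) of 10^k divided by M through exponentiation by squaring; the remainder stays below M (at most 380 here), so it can be kept exactly.

MOD = 10**9 + 7

def quot_rem_pow10(N, M):
    # return (q mod MOD, r) with 10**N = M*q + r and 0 <= r < M
    q, r = 1 // M, 1 % M            # 10^0 = M*q + r
    for bit in bin(N)[2:]:          # exponent bits, most significant first
        # squaring: (M*q + r)^2 = M*(M*q*q + 2*q*r + r*r // M) + r*r % M
        q = (M * q * q + 2 * q * r + (r * r) // M) % MOD
        r = (r * r) % M
        if bit == '1':
            # times 10: 10*(M*q + r) = M*(10*q + (10*r)//M) + (10*r) % M
            q = (10 * q + (10 * r) // M) % MOD
            r = (10 * r) % M
    return q, r

def count_ndigit_multiples(N, M):
    # floor((10^N - 1)/M) - floor((10^(N-1) - 1)/M), modulo MOD
    qn, rn = quot_rem_pow10(N, M)
    qm, rm = quot_rem_pow10(N - 1, M)
    hi = qn if rn > 0 else (qn - 1) % MOD
    lo = qm if rm > 0 else (qm - 1) % MOD
    return (hi - lo) % MOD

from math import lcm                         # Python 3.9+
print(count_ndigit_multiples(2, lcm(5, 7)))  # 2 (35 and 70)
print(count_ndigit_multiples(1, lcm(2, 3)))  # 1 (6)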
The following Python works for this question:
import math

MOD = 1000000007

def sub(x, y):
    return (x - y + MOD) % MOD

def mul(x, y):
    return (x * y) % MOD

def power(x, y):
    res = 1
    x %= MOD
    while y != 0:
        if y & 1:
            res = mul(res, x)
        y >>= 1
        x = mul(x, x)
    return res

def mod_inv(n):
    return power(n, MOD - 2)

x, y = [int(i) for i in input().split()]
m = math.lcm(x, y)
n = int(input())
a = -1
b = -1
total = 1
for i in range(n - 1):
    total = (total * 10) % m
b = total % m
total = (total * 10) % m
a = total % m
l = power(10, n - 1)
r = power(10, n)
ans = sub(sub(r, l), sub(a, b))
ans = mul(ans, mod_inv(m))
print(ans)
The approach for this question is pretty straightforward.
Let m = lcm(x, y), and let (x and y below denote the quotients, not the divisors from the problem):
10^n = m*x + a
10^(n-1) = m*y + b
From these two equations it is clear that our answer is equal to (x - y) % MOD.
So,
(x - y) = ((10^n - 10^(n-1)) - (a - b)) / m
Also, a = (10^n) % m and b = (10^(n-1)) % m.
Using simple modular arithmetic rules we can easily calculate a and b in O(n) time.
For the subtraction and division in the formula we use modular subtraction and modular division (via the modular inverse), respectively.
Note: (a/b) % MOD = (a * mod_inverse(b, MOD)) % MOD

Algorithm to solve a HackerEarth problem

I have been working on a Hackerearth Problem. Here is the problem statement:
We have three variables a, b and c. We need to convert a to b, and the following operations are allowed:
1. Can decrement by 1.
2. Can decrement by 2.
3. Can multiply by c.
Find the minimum number of steps required to convert a to b.
Here is the algorithm I came up with:
Initialize count to 0.
Loop until a === b:
1. Perform (x = a * c), (y = a - 1) and (z = a - 2).
2. Among x, y and z, choose the one whose absolute difference with b is the least.
3. Update the value of a to the value chosen among x, y and z.
4. Increment the count by 1.
I can pass the basic test case but all my advanced cases are failing. I guess my logic is correct, but due to the complexity it seems to fail.
Can someone suggest a more optimized solution.
Edit 1
Sample Code
function findMinStep(arr) {
    let a = parseInt(arr[0]);
    let b = parseInt(arr[1]);
    let c = parseInt(arr[2]);
    let numOfSteps = 0;
    while(a !== b) {
        let multiply = Math.abs(b - (a * c));
        let decrement = Math.abs(b - (a - 1));
        let doubleDecrement = Math.abs(b - (a - 2));
        let abs = Math.min(multiply, decrement, doubleDecrement);
        if(abs === multiply) a = a * c;
        else if(abs === decrement) a -= 1;
        else a -= 2;
        numOfSteps += 1;
    }
    return numOfSteps.toString()
}
Sample Input: a = 3, b = 10, c = 2
Explanation: Multiply 3 with 2 to get 6, subtract 1 from 6 to get 5, multiply 5 with 2 to get 10.
Reason for tagging both Python and JS: Comfortable with both but I am not looking for code, just an optimized algorithm and analytical thinking.
Edit 2:
function findMinStep(arr) {
    let a = parseInt(arr[0]);
    let b = parseInt(arr[1]);
    let c = parseInt(arr[2]);
    let depth = 0;
    let queue = [a, 'flag'];
    if(a === b) return 0
    if(a > b) {
        let output = Math.floor((a - b) / 2);
        if((a - b) % 2) return output + 1;
        return output
    }
    while(true) {
        let current = queue.shift();
        if(current === 'flag') {
            depth += 1;
            queue.push('flag');
            continue;
        }
        let multiple = current * c;
        let decrement = current - 1;
        let doubleDecrement = current - 2;
        if (multiple !== b) queue.push(multiple);
        else return depth + 1
        if (decrement !== b) queue.push(decrement);
        else return depth + 1
        if (doubleDecrement !== b) queue.push(doubleDecrement);
        else return depth + 1
    }
}
Still times out. Any more suggestions?
Link to the question for your reference.
BFS
A greedy approach won't work here. However, your BFS edit is already on the right track. Consider the graph G, where each node represents a value and each edge represents one of the operations, connecting the two values related by that operation (e.g. 4 and 3 are connected by "subtract 1"). Using this graph, we can easily perform a BFS to find the shortest path:
def a_to_b(a, b, c):
    visited = set()
    state = {a}
    depth = 0
    while b not in state:
        visited |= state
        state = {v - 1 for v in state if v - 1 not in visited} | \
                {v - 2 for v in state if v - 2 not in visited} | \
                {v * c for v in state if v * c not in visited}
        depth += 1
    return depth
This code systematically tests all possible combinations of operations, level by level, until it reaches b. I.e. it generates all values that can be reached with a single operation from a, then all values that can be reached with two operations, etc., until b is among the generated values.
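For the sample from the question, this returns the expected number of operations (my own quick check):

print(a_to_b(3, 10, 2))  # 3: 3*2=6, 6-1=5, 5*2=10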
In depth analysis
(Assuming c >= 0, but can be generalized)
So much for the standard approach that works with little analysis. This approach has the advantage that it works for any problem of this kind and is easy to implement. However, it isn't very efficient and will reach its limits fairly fast once the numbers grow. So instead I'll show a way to analyze the problem in depth and gain a (far) more performant solution:
In a first step this answer will analyze the problem:
We need operations -->op such that a -->op b and -->op is a sequence of
subtract 1
subtract 2
multiply by c
First of all, what happens if we first subtract and afterwards multiply?
(a - x) * c = a * c - x * c
Next, what happens if we first multiply and afterwards subtract?
a * c - x'
Positional systems
Well, there's no simplifying transformation for this. But we've got the basic pieces to analyze more complicated chains of operations. Let's see what happens when we chain subtractions and multiplications alternatingly:
(((a - x) * c - x') * c - x'') * c - x'''=
((a * c - x * c - x') * c - x'') * c - x''' =
(a * c^2 - x * c^2 - x' * c - x'') * c - x''' =
a * c^3 - x * c^3 - x' * c^2 - x'' * c - x'''
Looks familiar? We're one step away from defining the difference between a and b in a positional system base c:
a * c^3 - x * c^3 - x' * c^2 - x'' * c - x''' = b
x * c^3 + x' * c^2 + x'' * c + x''' = a * c^3 - b
Unfortunately the above is still not quite what we need. All we can tell is that the LHS of the equation will always be >= 0. In general, we first need to derive the proper exponent n (3 in the above example), such that n is minimal, nonnegative and a * c^n - b >= 0. Solving this for the individual coefficients (x, x', ...), where all coefficients are non-negative, is a fairly trivial task.
We can show two things from the above:
if a < b and a < 0, there is no solution
solving as above and transforming all coefficients into the appropriate operations leads to the optimal solution
Proof of optimality
The second statement above can be proven by induction over n.
n = 0: In this case a - b < c, so there is only one -->op
n + 1: let d = a * c^(n + 1) - b. Let d' = d - m * c^(n + 1), where m is chosen, such that d' is minimal and nonnegative. Per induction-hypothesis d' can be generated optimally via a positional system. Leaving a difference of exactly m * c^n. This difference can not be covered more efficiently via lower-order terms than by m / 2 subtractions.
Algorithm (The TLDR-part)
Consider a * c^n - b as a number base c and try to find its digits. The final number should have n + 1 digits, where each digit represents a certain number of subtractions. Multiple subtractions are represented by a single digit, obtained by adding up the subtracted values. E.g. 5 means -2 -2 -1. Working from the most significant to the least significant digit, the algorithm operates as follows:
1. perform the subtractions as specified by the digit
2. if the current digit was the last, terminate
3. multiply by c and repeat from 1. with the next digit
E.g.:
a = 3, b = 10, c = 2
choose n = 2
a * c^n - b = 3 * 4 - 10 = 2
2 in binary is 010
steps performed: 3 - 0 = 3, 3 * 2 = 6, 6 - 1 = 5, 5 * 2 = 10
or
a = 2, b = 25, c = 6
choose n = 2
a * c^n - b = 47
47 base 6 is 115
steps performed: 2 - 1 = 1, 1 * 6 = 6, 6 - 1 = 5, 5 * 6 = 30, 30 - 2 - 2 - 1 = 25
in python:
def a_to_b(a, b, c):
    # calculate n
    n = 0
    pow_c = 1
    while a * pow_c - b < 0:
        n += 1
        pow_c *= c
    # calculate coefficients
    d = a * pow_c - b
    coeff = []
    for i in range(0, n + 1):
        coeff.append(d // pow_c)  # calculate the ith digit and append it
        d %= pow_c                # remainder after eliminating the ith term
        pow_c //= c
    # sum up subtractions and multiplications as defined by the coefficients
    return n + sum(x // 2 + x % 2 for x in coeff)
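A quick check of this function against the two worked examples above (my addition):

print(a_to_b(3, 10, 2))   # 3: 3*2=6, 6-1=5, 5*2=10
print(a_to_b(2, 25, 6))   # 7: 2-1=1, 1*6=6, 6-1=5, 5*6=30, 30-2-2-1=25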

How to implement exponential backoff/delay calculation with fixed timeout and number of attempts?

Most backoff/delay algorithms I've seen have a fixed number of attempts OR a fixed timeout, but not both.
I want to make exactly M attempts within T seconds, with exponential spacing between them, so that T = delay(0) + delay(1) + ... + delay(M-1), where delay(N) = (e^N - 1) / e (N being the retry number).
How do I calculate the value of "e" in the description above, so that exactly M attempts are made within the overall timeout T (with M and T specified in advance)?
Since "T" is a monotonous function of "e", you can perform a binary search to find the best value that fits.
Here's an example Python program to find such "e" given "T" and "M":
def total_time(e, M):
    current = 1
    total = 0
    for i in range(M):
        total += current - 1
        current *= e
    return total

def find_best_e(T, M):
    a, b = 0, T
    while abs(a - b) > 1e-6:
        m = (a + b) / 2.0
        if total_time(m, M) > T:
            b = m
        else:
            a = m
    return (a + b) / 2

e = find_best_e(10, 3)
print([e**n - 1 for n in range(3)])

Improving the code for Lemoine's conjecture

I am trying to improve the following code:
The code is written to solve the following equation:
2*n + 1 = p + 2*q
This equation states that, given an integer n, the value 2*n + 1 can always be represented as p + 2*q, where p and q are prime numbers.
This statement was posed many years ago and is known as Lemoine's conjecture.
The input to the code is a number (n > 2) and the output is a matrix with 2 columns of valid prime numbers.
n = 23;
S = 2*n+1;
P = primes(S);
V = [];
kk = 1;
for ii=1:length(P)
    for jj=1:length(P)
        if (S == P(ii)+2*P(jj))
            V(kk,:) = [P(ii) P(jj)];
            kk = kk + 1;
        end
    end
end
The result would be:
V =
13 17
37 5
41 3
43 2
and for instance:
2*23+1 = 43 + 2*2
Is there a way to get rid of the for loops in MATLAB?
Update:
Suggested by @Daniel, this also works:
n = 23;
S = 2*n+1;
P = primes(S);
kk = 1;
for ii=1:length(P)
    if ismember(S - P(ii),P)
        V(kk,:) = [P(ii) S-P(ii)];
        kk = kk + 1;
    end
end
You can replace those loops with a vectorized solution using bsxfun -
[R,C] = find(bsxfun(@eq,P-S,-2*P(:)));
V = [P(C) ; P(R)].'
With n = 100000, the runtimes I got were -
------------ With loopy solution
Elapsed time is 33.789586 seconds.
----------- With bsxfun solution
Elapsed time is 1.338330 seconds.
This is an alternative implementation:
p_candidates=primes(2*n+1-4);
q_candidates=p_candidates(p_candidates<n+1);
p_needed=2*n+1-2*q_candidates;
solution=ismember(p_needed,p_candidates);
m=[q_candidates(solution);p_needed(solution)];
Calculate upper bounds for p and q, and start with the primes smaller than these bounds.
Choose q and calculate the corresponding value for p (p_needed).
Check if the needed value is a prime.

Pseudo-random number generation

Following is text from Data Structures and Algorithm Analysis by Mark Allen Weiss.
In the following, x(i+1) should be read as x subscript i+1, and x(i) should be read as x subscript i.
x(i + 1) = (a*x(i)) mod m.
It is also common to return a random real number in the open interval
(0, 1) (0 and 1 are not possible values); this can be done by
dividing by m. From this, a random number in any closed interval [a,
b] can be computed by normalizing.
The problem with this routine is that the multiplication could
overflow; although this is not an error, it affects the result and
thus the pseudo-randomness. Schrage gave a procedure in which all of
the calculations can be done on a 32-bit machine without overflow. We
compute the quotient and remainder of m/a and define these as q and
r, respectively.
In our case, m = 2,147,483,647, a = 48,271, q = 127,773, r = 2,836, and r < q.
We have
x(i + 1) = (a*x(i)) mod m                      ---> Eq 1
         = a*x(i) - m*floor(a*x(i)/m)          ---> Eq 2
The author also mentions:
x(i) = q*floor(x(i)/q) + (x(i) mod q)          ---> Eq 3
My questions:
What does the author mean by "a random number is computed by normalizing"?
How did the author get Eq 2 from Eq 1?
How did the author get Eq 3?
Normalizing means that if you have X ∈ [0,1] and you need to get Y ∈ [a, b], you can compute
Y = a + X * (b - a)
EDIT:
2. Let's suppose a = 3, x = 5, m = 9. Then we have
a*x = 15 = [ax/m]*m + 6 = 1*9 + 6,
where [ax/m] means the integer part of ax/m.
So we have 15 = [ax/m]*m + 6. We need to get 6:
15 - [ax/m]*m = 6  =>  ax - [ax/m]*m = 6  =>  x(i+1) = a*x(i) - [a*x(i)/m]*m
If you have a random number in the range [0,1], you can get a number in the range [2,5] (for example) by multiplying by 3 and adding 2.
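As a tiny illustration of that (my own snippet):

import random

x = random.random()        # X in [0, 1)
y = 2 + x * (5 - 2)        # Y in [2, 5), i.e. a + X*(b - a) with a=2, b=5
print(x, y)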
