Finding the closest integer fraction to a given random real between 0..1, given ranges of numerator and denominator - algorithm

Given two ranges of positive integers x: [1 ... n] and y: [1 ... m] and random real R from 0 to 1, I need to find the pair of elements (i,j) from x and y such that x_i / y_j is closest to R.
What is the most efficient way to find this pair?

Using Farey sequence
This is a simple and mathematically beautiful algorithm to solve this: run a binary search, where on each iteration the next number is given by the mediant formula (below). By the properties of the Farey sequence that number is the one with the smallest denominator within that interval. Consequently this sequence will always converge and never 'miss' a valid solution.
In pseudocode:
input: m, n, R
a_num = 0, a_denom = 1
b_num = 1, b_denom = 1
repeat:
-- interestingly c_num/c_denom is already in reduced form
c_num = a_num + b_num
c_denom = a_denom + b_denom
-- if the numbers are too big, return the closest of a and b
if c_num > n or c_denom > m then
if R - a_num/a_denom < b_num/b_denom - R then
return a_num, a_denom
else
return b_num, b_denom
-- adjust the interval:
if c_num/c_denom < R then
a_num = c_num, a_denom = c_denom
else
b_num = c_num, b_denom = c_denom
goto repeat
Even though it's fast on average (my educated guess that it's O(log max(m,n))), it can still be slow if R is close to a fraction with a small denominator. For example finding an approximation to 1/1000000 with m = n = 1000000 will take a million iterations.

The standard approach to approximating reals with rationals is computing the continued fraction series (see [1]). Put a limit on the nominator and denominator while computing parts of the series, and the last value before you break the limits is a fraction very close to your real number.
This will find a very good approximation very fast, but I'm not sure this will always find a closest approximation. It is known that
any convergent [partial value of the continued fraction expansion] is nearer to the continued fraction than any other fraction whose denominator is less than that of the convergent
but there may be approximations with larger denominator (still below your limit) that are better approximations, but are not convergents.
[1] http://en.wikipedia.org/wiki/Continued_fraction

Given that R is a real number such that 0 <= R <= 1, integers x: [1 ... n] and integers y: [1 ... m]. It is assumed that n <= m, since if n > m then x[n]/y[m] will be greater than 1, which cannot be the closest approximation to R.
Therefore, the best approximation of R with the denominator d will be either floor(R*d) / d or ceil(R*d) / d.
The problem can be solved in O(m) time and O(1) space (in Python):
from __future__ import division
from random import random
from math import floor
def fractionize(R, n, d):
error = abs(n/d - R)
return (n, d, error) # (numerator, denominator, absolute difference to R)
def better(a, b):
return a if a[2] < b[2] else b
def approximate(R, n, m):
best = (0, 1, R)
for d in xrange(1, m+1):
n1 = min(n, int(floor(R * d)))
n2 = min(n, n1 + 1) # ceil(R*d)
best = better(best, fractionize(R, n1, d))
best = better(best, fractionize(R, n2, d))
return best
if __name__ == '__main__':
def main():
R = random()
n = 30
m = 100
print R, approximate(R, n, m)
main()

Prolly get flamed, but a lookup might be best where we compute all of the fractional values for each of the possible values.. So a simply indexing a 2d array indexed via the fractional parts with the array element containing the real equivalent. I guess we have discrete X and Y parts so this is finite, it wouldnt be the other way around.... Ahh yeah, the actual searching part....erm reet....

Rather than a completely brute force search, do a linear search over the shortest of your lists, using round to find the best match for each element. Maybe something like this:
best_x,best_y=(1,1)
for x in 1...n:
y=max(1,min(m,round(x/R)))
#optional optimization (if you have a fast gcd)
if gcd(x,y)>1:
continue
if abs(R-x/y)<abs(R-bestx/besty):
best_x,best_y=(x,y)
return (best_x,best_y)
Not at all sure whether the gcd "optimization" will ever be faster...

The Solution:
You can do this O(1) space and O(m log(n)) time:
there is no need to create any list to search,
The pseudo code may be is buggy but the idea is this:
r: input number to search.
n,m: the ranges.
for (int i=1;i<=m;i++)
{
minVal = min(Search(i,1,n,r), minVal);
}
//x and y are start and end of array:
decimal Search(i,x,y,r)
{
if (i/x > r)
return i/x - r;
decimal middle1 = i/Cill((x+y)/2);
decimal middle2 = i/Roof((x+y)/2);
decimal dist = min(middle1,middle2)
decimal searchResult = 100000;
if( middle > r)
searchResult = Search (i, x, cill((x+y)/2),r)
else
searchResult = Search(i, roof((x+y)/2), y,r)
if (searchResult < dist)
dist = searchResult;
return dist;
}
finding the index as home work to reader.
Description: I think you can understand what's the idea by code, but let trace one of a for loop:
when i=1:
you should search within bellow numbers:
1,1/2,1/3,1/4,....,1/n
you check the number with (1,1/cill(n/2)) and (1/floor(n/2), 1/n) and doing similar binary search on it to find the smallest one.
Should do this for loop for all items, so it will be done m time. and in each time it takes O(log(n)). this function can improve by some mathematical rules, but It will be complicated, I skip it.

If the denominator of R is larger than m then use the Farey method (which the Fraction.limit_denominator method implements) with a limit of m to get a fraction a/b where b is smaller than m else let a/b = R. With b <= m, either a <= n and you are done or else let M = math.ceil(n/R) and re-run the Farey method.
def approx2(a, b, n, m):
from math import ceil
from fractions import Fraction
R = Fraction(a, b)
if R < Fraction(1, m):
return 1, m
r = R.limit_denominator(m)
if r.numerator > n:
M = ceil(n/R)
r = R.limit_denominator(M)
return r.numerator, r.denominator
>>> approx2(113, 205, 50, 200)
(43, 78)
It might be possible to just run the Farey method once using a limiting denominator of min(ceil(n/R), m) but I am not sure about that:
def approx(a, b, n, m):
from math import ceil
from fractions import Fraction
R = Fraction(a, b)
if R < Fraction(1, m):
return 1, m
r = R.limit_denominator(min(ceil(n/R), m))
return r.numerator, r.denominator

Related

Calculate integer powers with a given loop invariant

I need to derive an algorithm in C++ to calculate integer powers m^n that uses the loop invariant r = y^n and the loop condition y != m.
I tried using the instruction y= y+1 to advance, but I don´t know how to obtain (y+1)^n from y^n, and it shouldn't be difficult to find . So, probably, this isn't the correct path to follow
Could you help me to derive the program?
EDIT: this is a problem from the subject Data Structures and Algorithms. The difficulty ( if there is at all) shouldn't be mathematic.
EDIT2: Just to clarify, the difficulty of the problem is using the invariant y^n and loop condition y != m. If I vary the n I'm not achieving that
Given w and P such that 2^w > m, P > 2^(wn), and 2^((P-1)/2) = -1 mod P,
then 2 is a generator mod P, and there will be some x such that 2^x = m mod P, so:
if (m<=1 || n==1)
return m;
if (n==0)
return 1;
let y = 2;
let r = 1<<n;
while(y!=m)
{
y = (y*2)%P;
r = (r*(1<<n))%P;
}
return r;
Unless your function needs to produce bignum results, you can just pick the largest P that fits into an integer in your language.
There is no useful relation between (y+1)^n and y^n (you can write (y+1)^n = (√(y^n)+1)^n or (y+1)^n = (1+1/y)^n y^n, but this leads you nowhere).
If y was factored, you could exploit (a.b)^n = (a^n).(b^n), but you would need a table of the nth powers of the primes.
I can't see an answer that makes sense.
You can also think of the Binomial theorem,
(y+1)^n = y^n + n y^(n-1) + n(n-1)/2 y^(n-2) + ... 1
but this is worse than anything: you need to compute n binomial coefficients, and update all powers of y from 0 to n. The total cost of the computation would be ridiculously high.

Find the value of f(T) for big value T

I am trying to solve a problem which is described below,
Given value of f(0) and k , which are integers.
I need to find value of f( T ). where T<=1010
Recursive function is,
f(n) = 2*f(n-1) , if 4*f(n-1) <=k
k - ( 2*f(n-1) ) , if 4*f(n-1) > k
My efforts,
#include<iostream>
using namespace std;
int main(){
long k,f0,i;
cin>>k>>f0;
long operation ;
cin>>operation;
long answer=f0;
for(i=1;i<=operation;i++){
answer=(4*answer <= k )?(2*answer):(k-(2*answer));
}
cout<<answer;
return 0;
}
My code gives me right answer. But, The code will run 1010 time in worst case that gives me Time Limit Exceed. I need more efficient solution for this problem. Please help me. I don't know the correct algorithm.
If 2f(0) < k then you can compute this function in O(log n) time (using exponentiation by squaring modulo k).
r = f(0) * 2^n mod k
return 2 * r >= k ? k - r : r
You can prove this by induction. The induction hypothesis is that 0 <= f(n) < k/2, and that the above code fragment computes f(n).
Here's a Python program which checks random test cases, comparing a naive implementation (f) with an optimized one (g).
def f(n, k, z):
r = z
for _ in xrange(n):
if 4*r <= k:
r = 2 * r
else:
r = k - 2 * r
return r
def g(n, k, z):
r = (z * pow(2, n, k)) % k
if 2 * r >= k:
r = k - r
return r
import random
errs = 0
while errs < 20:
k = random.randrange(100, 10000000)
n = random.randrange(100000)
z = random.randrange(k//2)
a1 = f(n, k, z)
a2 = g(n, k, z)
if a1 != a2:
print n, k, z, a1, a2
errs += 1
print '.',
Can you use methmetical solution before progamming and compulating?
Actually,
f(n) = f0*2^(n-1) , if f(n-1)*4 <= k
k - f0*2^(n-1) , if f(n-1)*4 > k
thus, your code will write like this:
condition = f0*pow(2, operation-2)
answer = condition*4 =< k? condition*2: k - condition*2
For a simple loop, your answer looks pretty tight; one could optimise a little bit using answer<<2 instead of 4*answer, and answer<<1 for 2*answer, but quite possibly your compiler is already doing that. If you're blowing the time with this, it might be necessary to reduce the loop itself somehow.
I can't figure out a mathematical pattern that #Shannon was going for, but I'm thinking we could exploit the fact that this function will sooner or later cycle. If the cycle is short enough, then we could short the loop by just getting the answer at the same point in the cycle.
So let's get some cycle detection equipment in the form of Brent's algorithm, and see if we can cut the loop to reasonable levels.
def brent(f, x0):
# main phase: search successive powers of two
power = lam = 1
tortoise = x0
hare = f(x0) # f(x0) is the element/node next to x0.
while tortoise != hare:
if power == lam: # time to start a new power of two?
tortoise = hare
power *= 2
lam = 0
hare = f(hare)
lam += 1
# Find the position of the first repetition of length λ
mu = 0
tortoise = hare = x0
for i in range(lam):
# range(lam) produces a list with the values 0, 1, ... , lam-1
hare = f(hare)
# The distance between the hare and tortoise is now λ.
# Next, the hare and tortoise move at same speed until they agree
while tortoise != hare:
tortoise = f(tortoise)
hare = f(hare)
mu += 1
return lam, mu
f0 = 2
k = 198779
t = 10000000000
def f(x):
if 4 * x <= k:
return 2 * x
else:
return k - 2 * x
lam, mu = brent(f, f0)
t2 = t
if t >= mu + lam: # if T is past the cycle's first loop,
t2 = (t - mu) % lam + mu # find the equivalent place in the first loop
x = f0
for i in range(t2):
x = f(x)
print("Cycle start: %d; length: %d" % (mu, lam))
print("Equivalent result at index: %d" % t2)
print("Loop iterations skipped: %d" % (t - t2))
print("Result: %d" % x)
As opposed to the other proposed answers, this approach actually could use a memo array to speed up the process, since the start of the function is actually calculated multiple times (in particular, inside brent), or it may be irrelevant, depending on how big the cycle happens to be.
The algorithm you proposed already has O(n).
To come up with more efficient algorithms, there is not that much direction we can go about. Some typical options we have
1.Decease the coefficients of the linear term( but I doubt it would make a difference in this case
2.Change to O(Logn)(typically use some sort of divide and conquer technique)
3.Change to O(1)
In this case, we can do the last one.
The recursion function is a piece-wise function
f(n) = 2*f(n-1) , if 4*f(n-1) <=k
k - ( 2*f(n-1) ) , if 4*f(n-1) > k
Let's tackle it by case:
case 1: if 4*f(n-1) <= k (1)(assuming the starting index is zero)
this is a obvious a geometry series
a_n = 2*a_n-1
Therefore, have the formula
Sn = 2^(n-1)f(0) ----()
Case 2: if 4*f(n-1) > k (2), we have
a_n = -2a_n-1 + k
Assuming, a_j is the element in the sequence which just satisfy condition (2)
Nestedly sub in an_1 to the formula, you will obtain the equation
an = k -2k +4k -8k... +(-2)^(n-j)* a_j
k -2k 4k -8... is another gemo series
Sn = k*(1-2^(n-j))/(1-2) ---gemo series sum formula with starting value k and ratio = -2
Therefore, we have a formula for an in the case 2
an = k * (1-2^(n-j))/(1-2) + (-2)^(n-j) * a_j ----(**)
All we left to do it to find aj which just dissatisfy condition (1) and satisfy (2)
This can be obtained in constant time again using the formula we have for case 1:
find n such that, 4*an = 4*Sn = 4*2^(n-1)*f(0)
solve for n: 4*2^(n-1)*f(0) = k, if n is not integer, take ceiling of n
In my first attempt to solve this question, I had wrong assumption that the value of the sequence is monotonically increasing but in fact the sequence might jump between case 1 and case 2. Therefore, there might not be constant algorithm to solve the problem.
However, we can use utilize the result above to skip iterative update complexity.
The overall algorithm will look something like:
start with T, K, and f(0)
compute n that make the condition switch using either (*) or (**)
update f(0) with f(n), update T - n
repeat
terminate when T-n = 0(the last iteration might over compute causing T-n<0, therefore, you need to go back a little bit if that happen)
Create a map that can store your results. Before finding f(n) check in that map, if solution is already existed or not.
If exists, use that solution.
Otherwise find it, store it for future use.
For C++:
Definition:
map<long,long>result;
Insertion:
result[key]=value
Accessing:
value=result[key];
Checking:
map<long,long>::iterator it=result.find(key);
if(it==result.end())
{
//key was not found, find the solution and insert into result
}
else
{
return result[key];
}
Use above technique for better solution.

Creating a random number generator from a coin toss

Yesterday i had this interview question, which I couldn't fully answer:
Given a function f() = 0 or 1 with a perfect 1:1 distribution, create a function f(n) = 0, 1, 2, ..., n-1 each with probability 1/n
I could come up with a solution for if n is a natural power of 2, ie use f() to generate the bits of a binary number of k=ln_2 n. But this obviously wouldn't work for, say, n=5 as this would generate f(5) = 5,6,7 which we do not want.
Does anyone know a solution?
You can build a rng for the smallest power of two greater than n as you described. Then whenever this algorithm generates a number larger than n-1, throw that number away and try again. This is called the method of rejection.
Addition
The algorithm is
Let m = 2^k >= n where k is is as small as possible.
do
Let r = random number in 0 .. m-1 generated by k coin flips
while r >= n
return r
The probability that this loop stops with at most i iterations is bounded by 1 - (1/2)^i. This goes to 1 very rapidly: The loop is still running after 30 iterations with probability less than one-billionth.
You can decrease the expected number of iterations with a slightly modified algorithm:
Choose p >= 1
Let m = 2^k >= p n where k is is as small as possible.
do
Let r = random number in 0 .. m-1 generated by k coin flips
while r >= p n
return floor(r / p)
For example if we are trying to generate 0 .. 4 (n = 5) with the simpler algorithm, we would reject 5, 6 and 7, which is 3/8 of the results. With p = 3 (for example), pn = 15, we'd have m = 16 and would reject only 15, or 1/16 of the results. The price is needing four coin flips rather than 3 and a division op. You can continue to increase p and add coin flips to decrease rejections as far as you wish.
Another interesting solution can be derived through a Markov Chain Monte Carlo technique, the Metropolis-Hastings algorithm. This would be significantly more efficient if a large number of samples were required but it would only approach the uniform distribution in the limit.
initialize: x[0] arbitrarily
for i=1,2,...,N
if (f() == 1) x[i] = (x[i-1]++) % n
else x[i] = (x[i-1]-- + n) % n
For large N the vector x will contain uniformly distributed numbers between 0 and n. Additionally, by adding in an accept/reject step we can simulate from an arbitrary distribution, but you would need to simulate uniform random numbers on [0,1] as a sub-procedure.
def gen(a, b):
min_possible = a
max_possible = b
while True:
floor_min_possible = floor(min_possible)
floor_max_possible = floor(max_possible)
if max_possible.is_integer():
floor_max_possible -= 1
if floor_max_possible == floor_min_possible:
return floor_max_possible
mid = (min_possible + max_possible)/2
if coin_flip():
min_possible = mid
else:
max_possible = mid
My #RandomNumberGenerator #RNG
/w any f(x) that gives rand ints from 1 to x, we can get rand ints from 1 to k, for any k:
get ints p & q, so p^q is smallest possible, while p is a factor of x, & p^q >= k;
Lbl A
i=0 & s=1; while i < q {
s+= ((f(x) mod p) - 1) * p^i;
i++;
}
if s > k, goto A, else return s
//** about notation/terms:
rand = random
int = integer
mod is (from) modulo arithmetic
Lbl is a “Label”, from the Basic language, & serves as a coordinates for executing code. After the while loop, if s > k, then “goto A” means return to the point of code where it says “Lbl A”, & resume. If you return to Lbl A & process the code again, it resets the values of i to 0 & s to 1.
i is an iterator for powers of p, & s is a sum.
"s+= foo" means "let s now equal what it used to be + foo".
"i++" means "let i now equal what it used to be + 1".
f(x) returns random integers from 1 to x. **//
I figured out/invented/solved it on my own, around 2008. The method is discussed as common knowledge here. Does anyone know since when the random number generator rejection method has been common knowledge? RSVP.

Integer distance

As a single operation between two positive integers we understand
multiplying one of the numbers by some prime number or dividing it by
such (provided it can be divided by this prime number without
the remainder). The distance between a and b denoted as d(a,b) is a
minimal amount of operations needed to transform number a into number
b. For example, d(69,42)=3.
Keep in mind that our function d indeed has characteristics of the
distance - for any positive ints a, b and c we get:
a) d(a,a)==0
b) d(a,b)==d(b,a)
c) the inequality of a triangle d(a,b)+d(b,c)>=d(a,c) is fulfilled.
You'll be given a sequence of positive ints a_1, a_2,...,a_n. For every a_i of them
output such a_j (j!=i) that d(a_i, a_j) is as low as possible. For example, the sequence of length 6: {1,2,3,4,5,6} should output {2,1,1,2,1,2}.
This seems really hard to me. What I think would be useful is:
a) if a_i is prime, we are unable to make anything less than a_i (unless it's 1) so the only operation allowed is multiplication. Therefore, if we have 1 in our set, for every prime number d(this_number, 1) is the lowest.
b) also, for 1 d(1, any_prime_number) is the lowest.
c) for a non-prime number we check if we have any of its factors in our set or multiplication of its factors
That's all I can deduce, though. The worst part is I know it will take an eternity for such an algorithm to run and check all the possibilities... Could you please try to help me with it? How should this be done?
Indeed, you can represent any number N as 2^n1 * 3^n2 * 5^n3 * 7^n4 * ... (most of the n's are zeroes).
This way you set a correspondence between a number N and infinite sequence (n1, n2, n3, ...).
Note that your operation is just adding or subtracting 1 at exactly one of the appropriate sequence's places.
Let N and M be two numbers, and their sequences be (n1, n2, n3, ...) and (m1, m2, m3, ...).
The distance between the two numbers is indeed nothing but |n1 - m1| + |n2 - m2| + ...
So, in order to find out the closest number, you need to calculate the sequences for all the input numbers (this is just decomposing them into primes). Having this decomposition, the calculation is straightforward.
Edit:
In fact, you don't need the exact position of your prime factor: you just need to know, which is the exponent for each of the prime divisors.
Edit:
this is the simple procedure for converting the number into the chain representation:
#include <map>
typedef std::map<unsigned int, unsigned int> ChainRepresentation;
// maps prime factor -> exponent, default exponent is of course 0
void convertToListRepresentation(int n, ChainRepresentation& r)
{
// find a divisor
int d = 2;
while (n > 1)
{
for (; n % d; d++)
{
if (n/d < d) // n is prime
{
r[n]++;
return;
}
}
r[d]++;
n /= d;
}
}
Edit:
... and the code for distance:
#include <set>
unsigned int chainDistance(ChainRepresentation& c1, ChainRepresentation& c2)
{
if (&c1 == &c2)
return 0; // protect from modification done by [] during self-comparison
int result = 0;
std::set<unsigned int> visited;
for (ChainRepresentation::const_iterator it = c1.begin(); it != c1.end(); ++it)
{
unsigned int factor = it->first;
unsigned int exponent = it->second;
unsigned int exponent2 = c2[factor];
unsigned int expabsdiff = (exponent > exponent2) ?
exponent - exponent2 : exponent2 - exponent;
result += expabsdiff;
visited.insert(factor);
}
for (ChainRepresentation::const_iterator it = c2.begin(); it != c2.end(); ++it)
{
unsigned int factor = it->first;
if (visited.find(factor) != visited.end())
continue;
unsigned int exponent2 = it->second;
// unsigned int exponent = 0;
result += exponent2;
}
return result;
}
For the given limits: 100_000 numbers not greater than a million the most-straightforward algorithm works (1e10 calls to distance()):
For each number in the sequence print its closest neighbor (as defined by minimal distance):
solution = []
for i, ai in enumerate(numbers):
all_except_i = (aj for j, aj in enumerate(numbers) if j != i)
solution.append(min(all_except_i, key=lambda x: distance(x, ai)))
print(', '.join(map(str, solution)))
Where distance() can be calculated as (see #Vlad's explanation):
def distance(a, b):
"""
a = p1**n1 * p2**n2 * p3**n3 ...
b = p1**m1 * p2**m2 * p3**m3 ...
distance = |m1-n1| + |m2-n2| + |m3-n3| ...
"""
diff = Counter(prime_factors(b))
diff.subtract(prime_factors(a))
return sum(abs(d) for d in diff.values())
Where prime_factors() returns prime factors of a number with corresponding multiplicities {p1: n1, p2: n2, ...}:
uniq_primes_factors = dict(islice(prime_factors_gen(), max(numbers)))
def prime_factors(n):
return dict(multiplicities(n, uniq_primes_factors[n]))
Where multiplicities() function given n and its factors returns them with their corresponding multiplicities (how many times a factor divides the number without a remainder):
def multiplicities(n, factors):
assert n > 0
for prime in factors:
alpha = 0 # multiplicity of `prime` in `n`
q, r = divmod(n, prime)
while r == 0: # `prime` is a factor of `n`
n = q
alpha += 1
q, r = divmod(n, prime)
yield prime, alpha
prime_factors_gen() yields prime factors for each natural number. It uses Sieve of Eratosthenes algorithm to find prime numbers. The implementation is based on gen_primes() function by #Eli Bendersky:
def prime_factors_gen():
"""Yield prime factors for each natural number."""
D = defaultdict(list) # nonprime -> prime factors of `nonprime`
D[1] = [] # `1` has no prime factors
for q in count(1): # Sieve of Eratosthenes algorithm
if q not in D: # `q` is a prime number
D[q + q] = [q]
yield q, [q]
else: # q is a composite
for p in D[q]: # `p` is a factor of `q`: `q == m*p`
# therefore `p` is a factor of `p + q == p + m*p` too
D[p + q].append(p)
yield q, D[q]
del D[q]
See full example in Python.
Output
2, 1, 1, 2, 1, 2
Without bounds on how large your numbers can be and how many numbers can be on the input, we can't really deduce it will take "an eternity" to complete. I am tempted to suggest the most "obvious" solution I can think of
Given the factorization of the numbers it is very easy to find their distance
60 = (2^2)*(3^1)*(5^1)*(7^0)
42 = (2^1)*(3^1)*(5^0)*(7^1)
distance = 3
Calculating this factorization using the naive trial division should take at most O(sqrt(N)) time per number, where N is the number being factorized.
Given the factorizations, you only have O(n^2) combinations to worry about, where n is the ammount of numbers. If you store all the factorizations so that you only compute them once, this step shouldn't take that long unless you have a really large amount of numbers.
You do wonder if there is a faster algorithm though. Perhaps it is possible to do some greatest common divisor trick to avoid computing large factorizations and perhaps we can use some graph algorithms to find the distances in a smarter way.
Haven't really thought this through, but it seems to me that to get from prime A to prime B you multiply A * B and then divide by A.
If you thus break the initial non-prime A & B into their prime factors, factor out the common prime factors, and then use the technique in the first paragraph to convert the unique primes, you should be following a minimal path to get from A to B.

Algorithm to partition a number

Given a positive integer X, how can one partition it into N parts, each between A and B where A <= B are also positive integers? That is, write
X = X_1 + X_2 + ... + X_N
where A <= X_i <= B and the order of the X_is doesn't matter?
If you want to know the number of ways to do this, then you can use generating functions.
Essentially, you are interested in integer partitions. An integer partition of X is a way to write X as a sum of positive integers. Let p(n) be the number of integer partitions of n. For example, if n=5 then p(n)=7 corresponding to the partitions:
5
4,1
3,2
3,1,1
2,2,1
2,1,1,1
1,1,1,1,1
The the generating function for p(n) is
sum_{n >= 0} p(n) z^n = Prod_{i >= 1} ( 1 / (1 - z^i) )
What does this do for you? By expanding the right hand side and taking the coefficient of z^n you can recover p(n). Don't worry that the product is infinite since you'll only ever be taking finitely many terms to compute p(n). In fact, if that's all you want, then just truncate the product and stop at i=n.
Why does this work? Remember that
1 / (1 - z^i) = 1 + z^i + z^{2i} + z^{3i} + ...
So the coefficient of z^n is the number of ways to write
n = 1*a_1 + 2*a_2 + 3*a_3 +...
where now I'm thinking of a_i as the number of times i appears in the partition of n.
How does this generalize? Easily, as it turns out. From the description above, if you only want the parts of the partition to be in a given set A, then instead of taking the product over all i >= 1, take the product over only i in A. Let p_A(n) be the number of integer partitions of n whose parts come from the set A. Then
sum_{n >= 0} p_A(n) z^n = Prod_{i in A} ( 1 / (1 - z^i) )
Again, taking the coefficient of z^n in this expansion solves your problem. But we can go further and track the number of parts of the partition. To do this, add in another place holder q to keep track of how many parts we're using. Let p_A(n,k) be the number of integer partitions of n into k parts where the parts come from the set A. Then
sum_{n >= 0} sum_{k >= 0} p_A(n,k) q^k z^n = Prod_{i in A} ( 1 / (1 - q*z^i) )
so taking the coefficient of q^k z^n gives the number of integer partitions of n into k parts where the parts come from the set A.
How can you code this? The generating function approach actually gives you an algorithm for generating all of the solutions to the problem as well as a way to uniformly sample from the set of solutions. Once n and k are chosen, the product on the right is finite.
Here is a python solution to this problem, This is quite un-optimised but I have tried to keep it as simple as I can to demonstrate an iterative method of solving this problem.
The results of this method will commonly be a list of max values and min values with maybe 1 or 2 values inbetween. Because of this, there is a slight optimisation in there, (using abs) which will prevent the iterator constantly trying to find min values counting down from max and vice versa.
There are recursive ways of doing this that look far more elegant, but this will get the job done and hopefully give you an insite into a better solution.
SCRIPT:
# iterative approach in-case the number of partitians is particularly large
def splitter(value, partitians, min_range, max_range, part_values):
# lower bound used to determine if the solution is within reach
lower_bound = 0
# upper bound used to determine if the solution is within reach
upper_bound = 0
# upper_range used as upper limit for the iterator
upper_range = 0
# lower range used as lower limit for the iterator
lower_range = 0
# interval will be + or -
interval = 0
while value > 0:
partitians -= 1
lower_bound = min_range*(partitians)
upper_bound = max_range*(partitians)
# if the value is more likely at the upper bound start from there
if abs(lower_bound - value) < abs(upper_bound - value):
upper_range = max_range
lower_range = min_range-1
interval = -1
# if the value is more likely at the lower bound start from there
else:
upper_range = min_range
lower_range = max_range+1
interval = 1
for i in range(upper_range, lower_range, interval):
# make sure what we are doing won't break solution
if lower_bound <= value-i and upper_bound >= value-i:
part_values.append(i)
value -= i
break
return part_values
def partitioner(value, partitians, min_range, max_range):
if min_range*partitians <= value and max_range*partitians >= value:
return splitter(value, partitians, min_range, max_range, [])
else:
print ("this is impossible to solve")
def main():
print(partitioner(9800, 1000, 2, 100))
The basic idea behind this script is that the value needs to fall between min*parts and max*parts, for each step of the solution, if we always achieve this goal, we will eventually end up at min < value < max for parts == 1, so if we constantly take away from the value, and keep it within this min < value < max range we will always find the result if it is possable.
For this code's example, it will basically always take away either max or min depending on which bound the value is closer to, untill some non min or max value is left over as remainder.
A simple realization you can make is that the average of the X_i must be between A and B, so we can simply divide X by N and then do some small adjustments to distribute the remainder evenly to get a valid partition.
Here's one way to do it:
X_i = ceil (X / N) if i <= X mod N,
floor (X / N) otherwise.
This gives a valid solution if A <= floor (X / N) and ceil (X / N) <= B. Otherwise, there is no solution. See proofs below.
sum(X_i) == X
Proof:
Use the division algorithm to write X = q*N + r with 0 <= r < N.
If r == 0, then ceil (X / N) == floor (X / N) == q so the algorithm sets all X_i = q. Their sum is q*N == X.
If r > 0, then floor (X / N) == q and ceil (X / N) == q+1. The algorithm sets X_i = q+1 for 1 <= i <= r (i.e. r copies), and X_i = q for the remaining N - r pieces. The sum is therefore (q+1)*r + (N-r)*q == q*r + r + N*q - r*q == q*N + r == X.
If floor (X / N) < A or ceil (X / N) > B, then there is no solution.
Proof:
If floor (X / N) < A, then floor (X / N) * N < A * N, and since floor(X / N) * N <= X, this means that X < A*N, so even using only the smallest pieces possible, the sum would be larger than X.
Similarly, if ceil (X / N) > B, then ceil (X / N) * N > B * N, and since ceil(X / N) * N >= X, this means that X > B*N, so even using only the largest pieces possible, the sum would be smaller than X.

Resources