Rounding rationals in (0, 1) to the nearest unit fraction - algorithm

What's a good algorithm for the following problem?
Given a rational a / b strictly between 0 and 1, find a natural number n that minimizes |a / b - 1 / n|.
The simplest algorithm I can think of is to compare a / b with 1 / m for m = b, b - 1, …, stopping when a / b ≤ 1 / m, and then compare |a / b - 1 / m| and |a / b - 1 / (m + 1)|. That's O(b). Can you do any better?

Let k = floor(b/a); then n must equal either k or k+1. Try the two candidates and see which one wins. This is O(1).
That this is true follows from the fact that 1/(k+1) <= a/b <= 1/k, which in turn follows from the inequalities k <= b/a <= k+1.
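In code the whole thing is a couple of lines; a minimal Python sketch using exact arithmetic (the function name is mine):
from fractions import Fraction

def nearest_unit_fraction(a, b):
    # assumes 0 < a/b < 1, so k = floor(b/a) >= 1
    k = b // a                   # then 1/(k+1) <= a/b <= 1/k
    x = Fraction(a, b)
    return min(k, k + 1, key=lambda n: abs(x - Fraction(1, n)))
For example, nearest_unit_fraction(2, 5) returns 3, since 1/3 is closer to 2/5 than 1/2 is.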

I believe that you can do this in O(1) by using continued fractions. Any rational number in the range (0, 1] can be written in the form
1 / (a0 + 1 / (a1 + 1 / (a2 + 1 / (... an))))
Moreover, this representation has some remarkable properties. For starters, if you truncate the representation at any point, you get an extremely good approximation to the rational number. In particular, if you just truncate this representation at
1 / a0
Then the fraction a/b will be between 1/a0 and 1/(a0+1). Consequently, if we can get the value of a0, then you can just check the above two numbers to see which is closer.
The second important property is that there is a great way of obtaining the value of a0: it's the integer quotient of b divided by a. In other words, you can find the closest unit fraction as follows:
Compute x = b / a using integer division.
Check whether 1/x or 1/(x+1) is closer to a/b and output that result.
If a and b fit into machine words, this runs in O(1) time.

As suggested in the comments, your best bet is to use the floor and ceil functions.
If your rational a / b is given as a double x with 0 < x < 1, then you can simply do this:
#include <math.h>
int Rationalize(double x)
{
    int n1 = (int)floor(1 / x);
    int n2 = (int)ceil(1 / x);
    double e1 = fabs(x - 1.0 / n1);
    double e2 = fabs(x - 1.0 / n2);
    if (e1 < e2) return n1;
    else return n2;
}
(floor, ceil, and fabs come from math.h; note that plain abs would be wrong here, since it operates on ints.)

Related

Comparing sqrt(n) with the rational p/q

You are given an integer n and a rational p/q (p and q are integers).
How do you compare sqrt(n) and p/q?
Solution 1: sqrt(n) <= (double) p / q
Should work, but calls sqrt which is slower than just using multiplication/division.
Solution 2: (double) n * q * q <= p * p
Better, but I can't help thinking that because we are using floats, we might get an incorrect answer if p/q is very close to sqrt(n). Moreover, it requires converting integers to floats, which is (marginally) slower than just working with integers.
Solution 3: n*q*q <= p*p
Even better, but one runs into trouble if p and q get big, because of overflow (typically, if p or q >= 2^32 when working with 64-bit integers).
Solution 4: Use solution 3 with a bignum library / in a programming language that has unbound integers.
Solution 5: (q / p) * n <= p / q
Successfully avoids any overflow problems, but I am not sure that this is correct in all cases, because of integer division...
So... I would happily go with solution 2 or 4, but I was wondering if anyone has clever tricks to solve this problem or maybe a proof (or counter example) that solution 5 works (or not).
As I commented, a simple and elegant solution is to use bignum, especially if builtin, or easily available in the chosen language. It will work without restriction on n,p,q.
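For example, in Python, whose integers are unbounded, solution 3 is already safe as written; a minimal sketch (assuming q > 0 and n >= 0):
def sqrt_compare(n, p, q):
    # returns -1 if sqrt(n) < p/q, 0 if equal, +1 if sqrt(n) > p/q
    if p < 0:
        return 1                    # sqrt(n) >= 0 > p/q
    lhs, rhs = n * q * q, p * p     # squaring is safe once both sides are >= 0
    return (lhs > rhs) - (lhs < rhs)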
I will develop here an alternate solution based on IEEE floating point when:
n,p,q are all representable exactly with the given floating point precision (e.g. are within 24 or 53 bits for single or double IEEE 754)
a fused multiply add is available.
I will write f for the float type, and f(x) for the conversion of value x to f, presumably rounded to the nearest floating point value, ties to even.
fsqrt(x) will denote the floating point approximation of the exact square root.
Let x = fsqrt(f(n)) and y = f(p) / f(q), both of type f.
By the IEEE 754 property, both x and y are the nearest floating point values to the exact results, and n = f(n), p = f(p), q = f(q) from our preliminary conditions.
Thus if x < y, the problem is solved: sqrt(n) < p/q.
And if x > y, the problem is solved too: sqrt(n) > p/q.
Else, if x == y, we can't tell immediately...
Let's define the residues r = fma(x, x, -n) and s = fma(y, q, -p), both of type f.
We have r = x*x - n and s = y*q - p exactly. Thus s/q = y - p/q (the exact operations, not the floating point ones).
Now we can compare the residual errors: (p/q)^2 = y^2 - 2*y*s/q + (s/q)^2. How does that compare to n = x^2 - r? Since x == y,
n - (p/q)^2 = 2*y*s/q - r - (s/q)^2.
We thus have a first-order approximation of the difference: d = 2*y*s/f(q) - r. So here is a C-like prototype:
#include <math.h>
#include <stdint.h>
typedef double f;   /* the float type */
typedef int64_t i;  /* the integer type */
int sqrt_compare(i n, i p, i q)
/* answer -1 if sqrt(n)<p/q, 0 if sqrt(n)==p/q, +1 if sqrt(n)>p/q */
/* n,p,q are presumed representable in f exactly */
{
    f x = sqrt((f) n);
    f y = (f) p / (f) q;
    if (x < y) return -1;
    if (x > y) return +1;
    /* x == y: decide with the exact residues instead */
    f r = fma(x, x, -(f) n);      /* r = x*x - n exactly */
    f s = fma(y, (f) q, -(f) p);  /* s = y*q - p exactly */
    f d = 2 * y * s / (f) q - r;  /* 1st-order value of n - (p/q)^2 */
    if (d < 0) return -1;
    if (d > 0) return +1;
    if (r == 0 && s == 0) return 0; /* both exact and equal */
    return -1; /* due to the 2nd-order term -(s/q)^2 */
}
As you can see, it's relatively short, should be efficient, but is hard to decipher, so at least from this POV, I would not qualify this solution as better than trivial bignum.
You might consider solution 3 with integers 2x the size,
n * uint2n_t{q} * q <= uint2n_t{p} * p
This overflows if n * q * q overflows, but in that case you return false anyway, since p * p always fits in the doubled width:
uint2n_t nqq;
bool overflow = __builtin_mul_overflow(uint2n_t{n} * q, q, &nqq);
return !overflow && nqq <= uint2n_t{p} * p;

Computing all infix products for a monoid / semigroup

Introduction: Infix products for a group
Suppose I have a group
G = (G, *)
and a list of elements
A = {0, 1, ..., n} ⊂ ℕ
x : A -> G
If our goal is to implement a function
f : A × A -> G
such that
f(i, j) = x(i) * x(i+1) * ... * x(j)
(and we don't care about what happens if i > j)
then we can do that by pre-computing a table of prefixes
m(-1) = 1
m(i) = m(i-1) * x(i)
(with 1 on the right-hand side denoting the unit of G) and then implementing f as
f(i, j) = m(i-1)⁻¹ * m(j)
This works because
m(i-1) = x(0) * x(1) * ... * x(i-1)
m(j) = x(0) * x(1) * ... * x(i-1) * x(i) * x(i+1) * ... * x(j)
and so
m(i-1)⁻¹ * m(j) = x(i) * x(i+1) * ... * x(j)
after sufficient reassociation.
My question
Can we rescue this idea, or do something not much worse, if G is only a monoid, not a group?
For my particular problem, can we do something similar if G = ([0, 1] ⊂ ℝ, *), i.e. we have real numbers from the unit line, and we can't divide by 0?
Yes, if G is ([0, 1] ⊂ ℝ, *), then the idea can be rescued, making it possible to compute ranged products in O(log n) time (or more accurately, O(log z) where z is the number of a in A with x(a) = 0).
For each i, compute the product m(i) = x(0)*x(1)*...*x(i), ignoring any zeros (so these products will always be non-zero). Also, build a sorted array Z of indices for all the zero elements.
Then the product of elements from i to j is 0 if there's a zero in the range [i, j], and m(j) / m(i-1) otherwise (taking m(-1) = 1).
To find if there's a zero in the range [i, j], one can binary search in Z for the smallest value >= i in Z, and compare it to j. This is where the extra O(log n) time cost appears.
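A minimal Python sketch of this scheme (the names are mine; with floating-point values the division may reintroduce a little rounding error):
from bisect import bisect_left

def preprocess(xs):
    # prefix products that skip zeros, plus the sorted positions of the zeros
    m, zeros, acc = [], [], 1.0
    for i, x in enumerate(xs):
        if x == 0:
            zeros.append(i)
        else:
            acc *= x
        m.append(acc)
    return m, zeros

def range_product(m, zeros, i, j):
    k = bisect_left(zeros, i)
    if k < len(zeros) and zeros[k] <= j:
        return 0.0               # a zero inside [i, j] kills the product
    return m[j] / (m[i - 1] if i > 0 else 1.0)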
General monoid solution
In the case where G is any monoid, it's possible to precompute n products so that an arbitrary range product is computable in O(log(j-i)) time, although it's a bit fiddlier than the more specific case above.
Rather than precomputing prefix products, compute m(i, j) for all i, j where j-i+1 = 2^k for some k >= 1 and 2^k divides i (so the blocks are aligned). For k = 0 we don't need to compute anything, since m(i, i) is simply x(i).
So we need to compute n/2 + n/4 + n/8 + ... products in total, which is at most n-1 things.
One can construct an arbitrary interval [i, j] from O(log_2(j-i+1)) of these building blocks (and elements of the original array): pick the largest building block contained in the interval and append decreasing-sized blocks on either side of it until you get to [i, j]. Then multiply the precomputed products m(x, y) for each of the building blocks.
For example, suppose your array is of size 10. For example's sake, I'll assume the monoid is addition of natural numbers.
i:  0  1  2  3  4  5  6  7  8  9
x:  1  3  2  4  2  3  0  8  2  1
2:  [ 4  ][ 6  ][ 5  ][ 8  ][ 3  ]
4:  [ 10       ][ 13       ]
8:  [ 23                   ]
Here, the 2, 4, and 8 rows show sums of aligned intervals of length 2, 4, 8 (ignoring bits left over if the array isn't a power of 2 in length).
Now, suppose we want to calculate x(1) + x(2) + x(3) + ... + x(8).
That's x(1) + m(2, 3) + m(4, 7) + x(8) = 3 + 6 + 13 + 2 = 24.
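For reference, here is a small Python sketch of this scheme (my own, not from the answer). It uses the standard iterative segment-tree layout, which stores exactly these aligned power-of-two block products; the array is padded to a power of two so the blocks stay aligned, and the left/right accumulators keep the multiplication order correct for non-commutative monoids:
import operator

class RangeProduct:
    # op must be associative with identity `unit`
    def __init__(self, xs, op=operator.add, unit=0):
        self.op, self.unit = op, unit
        size = 1
        while size < len(xs):
            size *= 2                      # pad so the blocks stay aligned
        self.size = size
        self.tree = [unit] * (2 * size)    # tree[size + i] holds x(i)
        self.tree[size:size + len(xs)] = list(xs)
        for v in range(size - 1, 0, -1):   # internal node = aligned 2^k block
            self.tree[v] = op(self.tree[2 * v], self.tree[2 * v + 1])

    def product(self, i, j):
        # product x(i) * ... * x(j), combining O(log(j-i+1)) blocks in order
        left, right = self.unit, self.unit
        lo, hi = i + self.size, j + self.size + 1
        while lo < hi:
            if lo & 1:                     # lo is a right child: take its block
                left = self.op(left, self.tree[lo])
                lo += 1
            if hi & 1:                     # symmetric on the right edge
                hi -= 1
                right = self.op(self.tree[hi], right)
            lo //= 2
            hi //= 2
        return self.op(left, right)

rp = RangeProduct([1, 3, 2, 4, 2, 3, 0, 8, 2, 1])
print(rp.product(1, 8))   # 24, matching the worked example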

A fast algorithm to minimize a pseudo Diophantine equation

We're looking for an algorithm to solve this problem in under O(N).
given two real numbers a and b (without loss of generality you can assume they are both between 0 and 1)
Find an integer n between -N and N that minimizes the expression:
|a n - b - round(a n - b)|
We have thought that the Euclidean Algorithm might work well for this, but can't figure it out. It certainly looks like there should be much faster ways to do this than via an exhaustive search over integers n.
Note: in our situation a and b could be changing often, so while fixing a and b for a lookup table is possible, it gets kind of ugly, as N can vary as well. We haven't looked in detail into the lookup table yet to see how small we can get it as a function of N.
It sounds like you may be looking for something like continued fractions...
How are they related? Suppose you can substitute b with a rational number b1/b2. Now you are looking for integers n and m such that an - b1/b2 is approximately m. Put otherwise, you are looking for n and m such that (m + (b1/b2))/n = (m*b2 + b1)/(n*b2), a rational number, is approximately a. Set a1 = m*b2 + b1 and a2 = n*b2. Find values for a1 and a2 from a continued fractions approximation and solve for n and m.
Another approach could be this:
Find good rational approximations for a and b: a ~ a1/a2 and b ~ b1/b2.
Solve n(a1/a2)-(b1/b2) = m for n and m.
I'm not too sure it would work though. The accuracy needed for a depends on n and b.
You are effectively searching for the integer n that makes the expression a*n - b as close as possible to an integer. Are a and b fixed? If yes, you can pre-compute a lookup table and have O(1) :-)
If not, consider looking for the n that makes a*n close to I + b for all integers I.
You can compute a continued fraction for the ratio a/b. You can stop when the denominator is greater than N, or when your approximation is good enough.
// Initialize with the integer part of ratio = a / b:
double ratio = a / b;
int ak = (int)ratio;          // current partial quotient of the continued fraction
double remainder = ratio - ak;
int n0 = 1;                   // previous convergent numerator
int d0 = 0;                   // previous convergent denominator
int n1 = ak;                  // current convergent numerator
int d1 = 1;                   // current convergent denominator
// stop when the ratio is exact or the denominator would pass N
while (remainder != 0 && d1 < N) {
    ratio = 1 / remainder;
    ak = (int)ratio;
    int n2 = ak * n1 + n0;    // standard convergent recurrence
    int d2 = ak * d1 + d0;
    n0 = n1; d0 = d1;
    n1 = n2; d1 = d2;
    remainder = ratio - ak;
}
The value for n you're looking for is d0 (or d1 if it is still smaller than N).
This doesn't necessarily give you the minimum solution, but it will likely be a very good approximation.
First, let us consider a simpler case where b = 0 and 0 < a < 1. Define F(a, n) = |an - round(an)|.
Let step_size = 1.
Step 1. Let v = a.
Step 2. Let period size p = ceil(1/v).
Step 3. Now, among n = 1..p, there must be a number i such that F(v, i) < v.
Step 4. Set v = F(v, i) and step_size = step_size * i.
Step 5. Go to step 2.
As you can see, you can reduce F(v, ·) to any level you want. The final solution is n = step_size.
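As a rough Python transcription of these steps (my own sketch; the tolerance cutoff and the tie-breaking via min are assumptions not in the answer), relying on the fact that F(a, n*i) = F(F(a, n), i) since a*n*i differs from round(a*n)*i by exactly F(a, n)*i:
import math

def minimize_fractional_part(a, N, tol=1e-12):
    # greedily shrink F(a, n) = |a*n - round(a*n)| subject to n <= N
    def F(v, i):
        t = v * i
        return abs(t - round(t))
    v, n = a, 1
    while v > tol:
        p = math.ceil(1 / v)                             # step 2
        i = min(range(1, p + 1), key=lambda j: F(v, j))  # step 3
        if F(v, i) >= v or n * i > N:
            break                                        # no progress within the bound
        v, n = F(v, i), n * i                            # step 4
    return n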

Algorithm to partition a number

Given a positive integer X, how can one partition it into N parts, each between A and B where A <= B are also positive integers? That is, write
X = X_1 + X_2 + ... + X_N
where A <= X_i <= B and the order of the X_i's doesn't matter?
If you want to know the number of ways to do this, then you can use generating functions.
Essentially, you are interested in integer partitions. An integer partition of X is a way to write X as a sum of positive integers. Let p(n) be the number of integer partitions of n. For example, if n=5 then p(n)=7 corresponding to the partitions:
5
4,1
3,2
3,1,1
2,2,1
2,1,1,1
1,1,1,1,1
The generating function for p(n) is
sum_{n >= 0} p(n) z^n = Prod_{i >= 1} ( 1 / (1 - z^i) )
What does this do for you? By expanding the right hand side and taking the coefficient of z^n you can recover p(n). Don't worry that the product is infinite since you'll only ever be taking finitely many terms to compute p(n). In fact, if that's all you want, then just truncate the product and stop at i=n.
Why does this work? Remember that
1 / (1 - z^i) = 1 + z^i + z^{2i} + z^{3i} + ...
So the coefficient of z^n is the number of ways to write
n = 1*a_1 + 2*a_2 + 3*a_3 +...
where now I'm thinking of a_i as the number of times i appears in the partition of n.
How does this generalize? Easily, as it turns out. From the description above, if you only want the parts of the partition to be in a given set A, then instead of taking the product over all i >= 1, take the product over only i in A. Let p_A(n) be the number of integer partitions of n whose parts come from the set A. Then
sum_{n >= 0} p_A(n) z^n = Prod_{i in A} ( 1 / (1 - z^i) )
Again, taking the coefficient of z^n in this expansion solves your problem. But we can go further and track the number of parts of the partition. To do this, add another placeholder q to keep track of how many parts we're using. Let p_A(n,k) be the number of integer partitions of n into k parts where the parts come from the set A. Then
sum_{n >= 0} sum_{k >= 0} p_A(n,k) q^k z^n = Prod_{i in A} ( 1 / (1 - q*z^i) )
so taking the coefficient of q^k z^n gives the number of integer partitions of n into k parts where the parts come from the set A.
How can you code this? The generating function approach actually gives you an algorithm for generating all of the solutions to the problem as well as a way to uniformly sample from the set of solutions. Once n and k are chosen, the product on the right is finite.
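For instance, the coefficient of q^k z^n can be extracted with a small dynamic program instead of symbolic expansion; a Python sketch (all names are mine):
def count_partitions(n, k, parts):
    # coefficient of q^k z^n in Prod_{i in A} 1/(1 - q*z^i):
    # the number of partitions of n into exactly k parts, all drawn from `parts`
    table = [[0] * (n + 1) for _ in range(k + 1)]
    table[0][0] = 1
    for i in parts:
        for j in range(1, k + 1):
            for m in range(i, n + 1):
                # row j-1 already includes part i, so repeats are allowed,
                # and processing parts one at a time avoids counting orderings
                table[j][m] += table[j - 1][m - i]
    return table[k][n]

print(count_partitions(9, 3, range(2, 5)))   # 2: namely 2+3+4 and 3+3+3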
Here is a Python solution to this problem. It is quite un-optimised, but I have tried to keep it as simple as I can to demonstrate an iterative method of solving it.
The results of this method will commonly be a list of max values and min values, with maybe 1 or 2 other values in between. Because of this, there is a slight optimisation in there (using abs) which prevents the iterator from constantly trying to find min values counting down from max and vice versa.
There are recursive ways of doing this that look far more elegant, but this will get the job done and hopefully give you an insight into a better solution.
SCRIPT:
# iterative approach, in case the number of partitions is particularly large
def splitter(value, partitions, min_range, max_range, part_values):
    while value > 0:
        partitions -= 1
        # bounds on what the remaining parts can still sum to
        lower_bound = min_range * partitions
        upper_bound = max_range * partitions
        # start the iterator from the bound the remainder is closer to,
        # so the loop breaks early
        if abs(lower_bound - value) < abs(upper_bound - value):
            upper_range, lower_range, interval = max_range, min_range - 1, -1
        else:
            upper_range, lower_range, interval = min_range, max_range + 1, 1
        for i in range(upper_range, lower_range, interval):
            # make sure what we are doing won't break the solution
            if lower_bound <= value - i and upper_bound >= value - i:
                part_values.append(i)
                value -= i
                break
    return part_values

def partitioner(value, partitions, min_range, max_range):
    if min_range * partitions <= value and max_range * partitions >= value:
        return splitter(value, partitions, min_range, max_range, [])
    else:
        print("this is impossible to solve")

def main():
    print(partitioner(9800, 1000, 2, 100))

if __name__ == "__main__":
    main()
The basic idea behind this script is that the value needs to fall between min*parts and max*parts at each step of the solution. If we always achieve this goal, we will eventually end up at min <= value <= max for parts == 1; so if we constantly take away from the value and keep it within this range, we will always find the result if it is possible.
For this code's example, it will basically always take away either max or min, depending on which bound the value is closer to, until some value that is neither min nor max is left over as the remainder.
A simple realization you can make is that the average of the X_i must be between A and B, so we can simply divide X by N and then do some small adjustments to distribute the remainder evenly to get a valid partition.
Here's one way to do it:
X_i = ceil (X / N) if i <= X mod N,
floor (X / N) otherwise.
This gives a valid solution if A <= floor (X / N) and ceil (X / N) <= B. Otherwise, there is no solution. See proofs below.
sum(X_i) == X
Proof:
Use the division algorithm to write X = q*N + r with 0 <= r < N.
If r == 0, then ceil (X / N) == floor (X / N) == q so the algorithm sets all X_i = q. Their sum is q*N == X.
If r > 0, then floor (X / N) == q and ceil (X / N) == q+1. The algorithm sets X_i = q+1 for 1 <= i <= r (i.e. r copies), and X_i = q for the remaining N - r pieces. The sum is therefore (q+1)*r + (N-r)*q == q*r + r + N*q - r*q == q*N + r == X.
If floor (X / N) < A or ceil (X / N) > B, then there is no solution.
Proof:
If floor (X / N) < A, then X / N < floor (X / N) + 1 <= A, so X < A*N; even using only the smallest allowed pieces, the sum A*N would be larger than X.
Similarly, if ceil (X / N) > B, then X / N > ceil (X / N) - 1 >= B, so X > B*N; even using only the largest allowed pieces, the sum B*N would be smaller than X.
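In Python, the whole construction is a few lines (a sketch of the rule above; the function name is mine):
def partition(X, N, A, B):
    # returns N parts of X, each in [A, B], or None if impossible
    q, r = divmod(X, N)          # X = q*N + r with 0 <= r < N
    ceil_part = q + 1 if r else q
    if q < A or ceil_part > B:
        return None
    return [q + 1] * r + [q] * (N - r)
For instance, partition(9800, 1000, 2, 100) yields 800 tens followed by 200 nines.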

Building an expression with maximum value

Given n integers, is there an O(n) or O(n log n) algorithm that can compute the maximum value of a mathematical expression that can be obtained by inserting the operators -, +, * and parentheses between the given numbers? Assume only binary variants of the operators, so no unary minus, except before the first element if needed.
For example, given -3 -4 5, we can build the expression (-3) * (-4) * 5, whose value is 60, and maximum possible.
Background:
I stumbled upon this problem some time ago when studying genetic algorithms, and learned that it can be solved pretty simply with a classical genetic algorithm. This runs slowly, however, and it's only simple in theory, as the code gets rather ugly in practice (evaluating the expression, checking for correct placement of brackets, etc.). What's more, we're not guaranteed to find the absolute maximum either.
All these shortcomings of genetic algorithms got me wondering: since we don't have to worry about division, is there a way to do this efficiently with a more classic approach, such as dynamic programming or a greedy strategy?
Update:
Here's an F# program that implements the DP solution proposed by Keith Randall, together with my improvement, which I wrote in a comment to his post. It is very inefficient, but I maintain that it's polynomial, with cubic complexity. It runs in a few seconds for ~50 element arrays. It would probably be faster if written in a fully imperative manner, as a lot of time is probably wasted on building and traversing lists.
open System
open System.IO
open System.Collections.Generic
let Solve (arr : int array) =
    let memo = new Dictionary<int * int * int, int>()
    let rec Inner st dr last =
        if st = dr then
            arr.[st]
        elif memo.ContainsKey(st, dr, last) then
            memo.Item(st, dr, last)
        else
            match last with
            | 0 ->
                memo.Add((st, dr, last),
                         [ for i in st .. dr - 1 do
                             for j in 0 .. 2 do
                               for k in 0 .. 2 do
                                 yield (Inner st i j) * (Inner (i + 1) dr k) ]
                         |> List.max)
                memo.Item(st, dr, last)
            | 1 ->
                memo.Add((st, dr, last),
                         [ for i in st .. dr - 1 do
                             for j in 0 .. 2 do
                               for k in 0 .. 2 do
                                 yield (Inner st i j) + (Inner (i + 1) dr k) ]
                         |> List.max)
                memo.Item(st, dr, last)
            | 2 ->
                memo.Add((st, dr, last),
                         [ for i in st .. dr - 1 do
                             for j in 0 .. 2 do
                               for k in 0 .. 2 do
                                 yield (Inner st i j) - (Inner (i + 1) dr k) ]
                         |> List.max)
                memo.Item(st, dr, last)
            | _ -> failwith "last is always 0, 1, or 2"
    let noFirst = [ for i in 0 .. 2 do yield Inner 0 (arr.Length - 1) i ] |> List.max
    arr.[0] <- -1 * arr.[0]
    memo.Clear()
    let yesFirst = [ for i in 0 .. 2 do yield Inner 0 (arr.Length - 1) i ] |> List.max
    [noFirst; yesFirst] |> List.max

let _ =
    printfn "%d" <| Solve [|-10; 10; -10|]
    printfn "%d" <| Solve [|2; -2; -1|]
    printfn "%d" <| Solve [|-5; -3; -2; 0; 1; -1; -1; 6|]
    printfn "%d" <| Solve [|-5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6|]
Results:
1000
6
540
2147376354
The last one is most likely an error due to overflow, I'm just trying to show that a relatively big test runs too fast for this to be exponential.
Here's a proposed solution:
def max_result(a_):
    memo = {}
    a = list(a_)
    a.insert(0, 0)
    return min_and_max(a, 0, len(a) - 1, memo)[1]

def min_and_max(a, i, j, memo):
    if (i, j) in memo:
        return memo[i, j]
    if i == j:
        return (a[i], a[i])
    min_val = max_val = None
    for k in range(i, j):
        left = min_and_max(a, i, k, memo)
        right = min_and_max(a, k + 1, j, memo)
        for op in "*-+":
            for x in left:
                for y in right:
                    val = apply_op(x, y, op)
                    if min_val is None or val < min_val: min_val = val
                    if max_val is None or val > max_val: max_val = val
    ret = (min_val, max_val)
    memo[i, j] = ret
    return ret

def apply_op(x, y, op):
    if op == '*': return x * y
    if op == '+': return x + y
    return x - y
max_result is the main function, and min_and_max is auxiliary. The latter returns the minimum and maximum results that can be achieved by sub-sequence a[i..j].
It assumes that the maximum and minimum results of a sequence are composed from the maximum and minimum results of its sub-sequences. Under this assumption, the problem has optimal substructure and can be solved with dynamic programming (or memoization). The run time is O(n^3).
I haven't proved correctness, but I have verified its output against a brute force solution with thousands of small randomly generated inputs.
It handles the possibility of a leading unary minus by inserting a zero at the beginning of the sequence.
EDIT
Been thinking a bit more about this problem, and I believe it can be reduced to a simpler problem in which all values are (strictly) positive and only operators * and + are allowed.
Just remove all zeroes from the sequence and replace negative numbers by their absolute value.
Furthermore, if there are no ones in the resulting sequence, the result is simply the product of all numbers.
After this reduction, the simple dynamic programming algorithm would work.
EDIT 2
Based on the previous insights I think I found a linear solution:
def reduce_seq(a):
    # drop zeros and take absolute values
    return [x for x in map(abs, a) if x > 0]

def max_result(a):
    b = reduce_seq(a)
    if len(b) == 0:
        return 0
    return max_result_aux(b)

def max_result_aux(b):
    best = [1] * (len(b) + 1)
    for i in range(len(b)):
        j = i
        total = 0
        while j >= 0 and i - j <= 2:   # factors of at most 3 terms
            total += b[j]
            best[i + 1] = max(best[i + 1], best[j] * total)
            j -= 1
    return best[len(b)]
best[i] is the maximum result that can be achieved by sub-sequence b[0..(i-1)].
EDIT 3
Here's an argument in favor of the O(n) algorithm based on the following assumption:
You can always achieve the maximum result with an expression of the form
+/- (a_1 +/- ... +/- a_i) * ... * (a_j +/- ... +/- a_n)
That is: a product of factors composed of an algebraic sum of terms (including the case of only one factor).
I will also use the following lemmas which are easy to prove:
Lemma 1: x*y >= x+y for all x,y such that x,y >= 2
Lemma 2: abs(x_1) + ... + abs(x_n) >= abs(x_1 +/- ... +/- x_n)
Here it goes.
The sign of each factor doesn't matter, since you can always make the product positive by using the leading unary minus. Hence, to maximize the product we need to maximize the absolute value of each factor.
Setting aside the trivial case in which all numbers are zeroes, in an optimal solution no factor will be composed only of zeroes. Therefore, since zeroes have no effect inside each sum of terms, and each factor will have at least one non-zero number, we can remove all zeroes. From now on, let's assume there are no zeroes.
Let's concentrate in each sum of terms separately:
(x_1 +/- x_2 +/- ... +/- x_n)
By Lemma 2, the maximum absolute value each factor can achieve is the sum of the absolute values of each term. This can be achieved in the following way:
If x_1 is positive, add all positive terms and subtract all negative terms. If x_1 is negative, subtract all positive terms and add all negative terms.
This implies that the sign of each term does not matter, we can consider the absolute value of each number and only use operator + inside factors. From now on, let's consider all numbers are positive.
The crucial step, that leads to an O(n) algorithm, is to prove that the maximum result can always be achieved with factors that have at most 3 terms.
Suppose we have a factor of more than 3 terms, by Lemma 1 we can break it into two smaller factors of 2 or more terms each (hence, each add up to 2 or more), without reducing the total result. We can break it down repeatedly until no factors of more than 3 terms are left.
That completes the argument. I still haven't found a complete justification of the initial assumption. But I tested my code with millions of randomly generated cases and couldn't break it.
A reasonable big value can be found in O(N). Consider this a greedy algorithm.
1. Find all positive numbers ≥ 2. Store the result as A.
2. Count all "-1"s. Store the result as B.
3. Find all negative numbers ≤ -2. Store the result as C.
4. Count all "1"s. Store the result as D.
5. Initialize Product to 1.
6. If A is not empty, multiply Product by the product of A.
7. If C is not empty and has even count, multiply Product by the product of C.
8. If C has odd count, take the smallest number in magnitude of C away (store it as x), and multiply Product by the product of the rest of C.
9. If x is set and B is nonzero, compare Product × -x with Product − x + 1. If the former is strictly larger, decrease B by 1 and multiply Product by -x, then remove x. If the latter is larger, do nothing.
10. Set Result to 0. If Product ≠ 1, add it to Result.
11. Add D to Result, representing the addition of D "1"s.
12. Add B to Result, representing the subtraction of B "-1"s.
13. If x is still set, subtract x from Result.
Steps 1-4 and 6-8 are O(N); steps 5 and 9-13 are O(1). So the whole algorithm runs in O(N) time.
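Here is a direct Python transcription of the numbered steps (a sketch; the function name is mine, and math.prod needs Python 3.8+):
from math import prod

def greedy_max(nums):
    A = [v for v in nums if v >= 2]              # step 1
    B = sum(1 for v in nums if v == -1)          # step 2
    C = [v for v in nums if v <= -2]             # step 3
    D = sum(1 for v in nums if v == 1)           # step 4
    product, x = 1, None                         # step 5
    if A:
        product *= prod(A)                       # step 6
    if C and len(C) % 2 == 0:
        product *= prod(C)                       # step 7
    elif C:
        x = max(C)                               # step 8: smallest magnitude
        rest = list(C)
        rest.remove(x)
        product *= prod(rest)                    # even count left, positive product
    if x is not None and B > 0:                  # step 9
        if product * -x > product - x + 1:
            B -= 1
            product *= -x
            x = None
    result = product if product != 1 else 0      # step 10
    result += D                                  # step 11
    result += B                                  # step 12
    if x is not None:
        result -= x                              # step 13
    return result
greedy_max([-3, -4, 5]) and greedy_max([2, -2, -1]) reproduce the sessions below, returning 60 and 5.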
An example session:
-3 -4 5
A = [5]
B = 0
C = [-3, -4]
D = 0
Product = 1
A is not empty, so Product = 5.
C is even, so Product = 5 × -3 × -4 = 60
-
-
Product ≠ 1, so Result = 60.
-
-
-
5 × -3 × -4 = 60
-5 -3 -2 0 1 -1 -1 6
A = [6]
B = 2
C = [-5, -3, -2]
D = 1
Product = 1
A is not empty, so Product = 6
-
C is odd, so x = -2, and Product = 6 × -5 × -3 = 90.
x is set and B is nonzero. Compare Product × -x = 180 and Product − x + 1 = 93. Since the former is larger, we reset B to 1, Product to 180 and remove x.
Result = 180.
Result = 180 + 1 = 181
Result = 181 + 1 = 182
-
6 × -5 × -3 × -2 × -1 + 1 − (-1) + 0 = 182
2 -2 -1
A = [2]
B = 1
C = [-2]
D = 0
Product = 1
Product = 2
-
x = -2, Product is unchanged.
B is nonzero. Compare Product × -x = 4 and Product − x + 1 = 5. Since the latter is larger, we do nothing.
Result = 2
-
Result = 2 + 1 = 3
Result = 3 − (-2) = 5.
2 − (-1) − (-2) = 5.
You should be able to do this with dynamic programming. Let x_i be your input numbers. Then let M(a,b) be the maximum value you can get with the subsequence x_a through x_b. You can then compute:
M(a,a) = x_a
M(a,b) = max_i(max(M(a,i)*M(i+1,b), M(a,i)+M(i+1,b), M(a,i)-M(i+1,b)))
edit:
I think you need to compute both the max and min computable value using each subsequence. So
Max(a,a) = Min(a,a) = x_a
Max(a,b) = max_i(max(Max(a,i)*Max(i+1,b),
                     Max(a,i)*Min(i+1,b),
                     Min(a,i)*Max(i+1,b),
                     Min(a,i)*Min(i+1,b),
                     Max(a,i)+Max(i+1,b),
                     Max(a,i)-Min(i+1,b)))
...similarly for Min(a,b)...
Work this in reverse Polish notation - that way you don't have to deal with parentheses. Next, put a - in front of every negative number (thereby making it positive). Finally, multiply them all together. Not sure about the complexity, probably about O(N).
EDIT: forgot about 0. If it occurs in your input set, add it to the result.
This feels NP-complete to me, though I haven't yet figured out how to do a reduction. If I'm right, then I could say:
Nobody in the world knows if any polynomial algorithm exists, let alone O(n log n), but most computer scientists suspect there isn't.
There are poly time algorithms to estimate the answer, such as the genetic algorithm you describe.
In fact, I think the question you mean to ask is, "Is there a reasonably useful O(n) or O(n log n) algorithm to estimate the maximum value?"
This is my first post on stackoverflow, so I apologize in advance for missing any preliminary etiquette. Also, in the interest of full disclosure, Dave brought this problem to my attention.
Here's an O(N^2 log N) solution, mostly because of the repeated sorting step in the for loop.
Absolute values: Remove zero elements and sort by absolute value. Since you are allowed to place a negative sign in front of your final result, it does not matter whether your answer is negative or positive. Only the absolute values of all numbers in the set matter.
Multiplication only for numbers > 1: We make the observation that for any set of positive integers greater than 1, (e.g. {2,3,4}), the largest result comes from a multiplication. This can be shown by an enumerative technique or a contradiction argument over permitted operations + and -. e.g. (2+3)*4 = 2*4 + 3*4 < 3*4 + 3*4 = 2*(3*4). In other words, multiplication is the most "powerful" operation (except for the 1s).
Addition of the 1s to the smallest non-1 numbers: For the 1s, since multiplication is a useless operation, we are better off adding. Here again we show a complete ordering on the result of an addition. For rhetoric's sake, consider again the set {2,3,4}. We note that: 2*3*(4+1) <= 2*(3+1)*4 <= (2+1)*3*4. In other words, we get the most "mileage" from a 1 by adding it to the smallest existing non-1 element in the set. Given a sorted set, this can be done in O(N^2 log N).
Here's what the pseudo-code looks like:
S = input set of integers;
S.absolute();
S.sort();
//delete all the 0 elements
S.removeZeros();
//remove all 1 elements from the sorted list, and store them
ones = S.removeOnes();
//now S contains only integers > 1, in ascending order S[0] ... S[end]
for each 1 in ones:
    S[0] = S[0] + 1;
    S.sort();
end
max_result = Product(S);
I know I'm late to the party, but I took this on as a challenge to myself. Here is the solution I came up with.
type Operation =
    | Add
    | Sub
    | Mult

type 'a Expr =
    | Op of 'a Expr * Operation * 'a Expr
    | Value of 'a

let rec eval = function
    | Op (a, Add, b) -> (eval a) + (eval b)
    | Op (a, Sub, b) -> (eval a) - (eval b)
    | Op (a, Mult, b) -> (eval a) * (eval b)
    | Value x -> x

let rec toString : int Expr -> string = function
    | Op (a, Add, b) -> (toString a) + " + " + (toString b)
    | Op (a, Sub, b) -> (toString a) + " - " + (toString b)
    | Op (a, Mult, b) -> (toString a) + " * " + (toString b)
    | Value x -> string x

let appendExpr (a : 'a Expr) (o : Operation) (v : 'a) =
    match o, a with
    | Mult, Op (x, o2, y) -> Op (x, o2, Op (y, o, Value v))
    | _ -> Op (a, o, Value v)

let genExprs (xs : 'a list) : 'a Expr seq =
    let rec permute xs e =
        match xs with
        | x :: xs ->
            [Add; Sub; Mult]
            |> Seq.map (fun o -> appendExpr e o x)
            |> Seq.map (permute xs)
            |> Seq.concat
        | [] -> seq [e]
    match xs with
    | x :: xs -> permute xs (Value x)
    | [] -> Seq.empty

let findBest xs =
    let best, result =
        genExprs xs
        |> Seq.map (fun e -> e, eval e)
        |> Seq.maxBy snd
    toString best + " = " + string result
findBest [-3; -4; 5]
returns "-3 * -4 * 5 = 60"
findBest [0; 10; -4; 0; 52; -2; -40]
returns "0 - 10 * -4 + 0 + 52 * -2 * -40 = 4200"
It should work with any type supporting comparison and the basic mathematical operators, but FSI will constrain it to ints.
