Solving addition chain problems using dynamic programming - algorithm

You are given a positive integer A. The goal is to construct the shortest possible sequence of integers ending with A, using the following rules:
The first element of the sequence is 1
Each of the successive elements is the sum of any two preceding elements (adding a single element to itself is also permissible)
Each element is larger than all the preceding elements; that is, the sequence is increasing.
For example, for A = 42, a possible solution is:
[1, 2, 3, 6, 12, 24, 30, 42]
Another possible solution for A = 42 is:
[1, 2, 4, 5, 8, 16, 21, 42]
After reading the problem statement, the first thing that came to my mind is dynamic programming (DP), hence I expressed it as a search problem and tried to write a recursive solution.
The search space up to A = 8 is:
            1
            |
            2
           / \
          /   \
         3     4
       / | \  /|\
      /  |  \ 5 6 8
     /   |   \
    4    5    6
   /|         |\
  5 6         7 8
We can see that 4 occurs in two places, but in both cases the children of 4 are different. In one case the prior sequence is [1, 2, 4]. In the other case the prior sequence is [1, 2, 3, 4]. Therefore, we cannot say that we have overlapping sub-problems. Is there any way to apply DP to the above problem? Or am I wrong in judging that it can be solved using DP?

This is an addition chain...
http://en.wikipedia.org/wiki/Addition_chain
There is no known algorithm which can calculate a minimal addition chain for a given number with any guarantees of reasonable timing or small memory usage. However, several techniques to calculate relatively short chains exist. One very well known technique to calculate relatively short addition chains is the binary method, similar to exponentiation by squaring. Other well-known methods are the factor method and window method.
See also: New Methods For Generating Short Addition Chains in IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences.
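For illustration, here is a minimal Python sketch of the binary method (my own code, not part of the quoted answer; the function name is made up). It reads the bits of A after the leading 1: every bit costs a doubling step, and every set bit costs one extra +1 step. It produces a valid, reasonably short chain, not necessarily a minimal one.

def binary_addition_chain(a):
    """Addition chain for a via its binary representation (exponentiation-by-squaring style)."""
    chain = [1]
    for bit in bin(a)[3:]:              # skip the '0b' prefix and the leading 1
        chain.append(chain[-1] * 2)     # doubling step: x + x
        if bit == '1':
            chain.append(chain[-1] + 1) # addition step: x + 1
    return chain

print(binary_addition_chain(42))        # [1, 2, 4, 5, 10, 20, 21, 42], length 8

For A = 42 this happens to match the length of the chains shown in the question, but for many inputs (15, for example) the binary method produces a chain longer than the optimum.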

There is a pure dynamic programming solution to this problem. But, as is typical of such DP solutions, both the memory and time complexity are O(N^2), so it can be a space hog. But here is the crux of the DP solution, just for DP lovers:
Generalization:
Instead of finding the min length of an addition chain for N, viz. l(N), we find the min length of an addition chain that contains both i and j, with i ≤ j and j being the maximum element of the chain. Call this length A(i, j). Obviously, we have
(1) l(N) = A(1, N) = A(N, N)
(2) A(1, N) = A(2, N) if N ≥ 2
(3) A(1, N) = 1 + min(A(i, j)) over 1 ≤ i ≤ j < N with i + j = N
Advancing the calculation
To calculate any A(m, n) by using smaller A(i, j), a typical DP step, we apply a number of heuristics.
H1: maintain an S(i+j) for the min part of the (3) along the way. Then
A(1, n) = A(2, n) = A(n, n) = S(n)+1
For other m, we reduce n by introducing at most one more new element to the chain, and with such a new element, we just need one more step to conclude A(m, n). Possibilities are
H2: if n is even, we attempt to introduce n/2 to the chain
H3: attempt to introduce n-m to the chain
H4: attempt to introduce n-1 to the chain
H5: attempt to introduce n-2 to the chain (when n>2)
So A(m, n) is 1 plus the minimum of the A values at m and the reduced n given by H2-H5.
Example, A(52, 100) = 1+min(A(50, 52), A(48, 52), A(52, 99), A(52, 98)) by applying H2-H5 respectively.

Due to rule #2,
Each of the successive elements is the sum of any two preceding elements
the problem does rely on duplicate (overlapping) subproblems, so dynamic programming would be suitable for solving it.

Related

Merge sort O(n log n) number of operations per level

This is a question that I have avoided for a long time, but I have a problem understanding why merge sort is O(n * log n), even after reading other answers. Probably it's something dumb I'm overlooking.
What I do understand is that log n comes from the height of the binary tree.
What I do not understand is why every level in the tree requires n operations.
Or maybe I'm looking at this entirely the wrong way (?).
Let's say that I have a situation where n = 8:
[1, 5, 2, 3, 4, 8, 1, 9]
... Then I build the binary tree, splitting up each level:
Eventually I will end up with (conveniently sorted):
[1, 5], [2, 3], [4, 8], [1, 9]
I don't see how merging these will result in 8 operations on the first level (as I understand it, n * log(n) is the number of levels * the number of operations per level).
Merge of the first two pairs:
I end up with 3 operations, i.e. check 2 against 1 and 5.
Since you know 3 > 2, you don't need to check 1 from the first pair anymore.
I can't find a situation where you need 4 operations per 2 pairs in any worst case.
So how do you end up with 8 operations per level?
I'm not mathematically gifted, I'm still studying currently.
So apologies if I'm looking at this the wrong way.
Look at the last step, where you merge two lists into one. If the lists share the same length n/2, you need at most n-1 comparisons, looking for the smaller element at the front of either list, given that they shrink at the same speed. Otherwise, you may end up with fewer operations. Similarly, on the other layers you get below n operations; the number should even be slightly decreasing. But O(n log n) is an upper bound, and further inspection shows that merge sort is not asymptotically better.
Further inspection:
Let n=2^k. We have k layers. One layer with at most n-1 ops, one with two times n/2 - 1 ops ... until n/2 times 1 op.
We should get (n - 1) + 2(n/2 - 1) + 4(n/4 - 1) + ... + (n/2)(1) = (n - 1) + (n - 2) + (n - 4) + ... + (n - n/2) = kn - (1 + 2 + 4 + ... + 2^(k-1)) = kn - (2^k - 1) = n log n - n + 1.
n log n grows faster than n and 1, so we even get Θ(n log n), as we are only interested in the fastest-growing part.
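To see the counting concretely, here is a small Python sketch (my own, not from the answer) that runs a bottom-up merge sort on the asker's example and prints the number of comparisons per level; each level stays at or below n - 1 = 7.

def merge_count(left, right):
    """Merge two sorted lists; return the merged list and the comparison count."""
    merged, i, j, comparisons = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]          # leftovers need no comparisons
    return merged, comparisons

def bottom_up_mergesort(a):
    """Bottom-up merge sort that prints the comparisons used on each level."""
    runs = [[x] for x in a]
    level = 1
    while len(runs) > 1:
        next_runs, total = [], 0
        for k in range(0, len(runs) - 1, 2):
            merged, c = merge_count(runs[k], runs[k + 1])
            next_runs.append(merged)
            total += c
        if len(runs) % 2:                   # odd run out: carry it up unmerged
            next_runs.append(runs[-1])
        print(f"level {level}: {total} comparisons")
        runs, level = next_runs, level + 1
    return runs[0]

bottom_up_mergesort([1, 5, 2, 3, 4, 8, 1, 9])   # level 1: 4, level 2: 6, level 3: 6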

How to more effectively find the minimal composition from n sets that satisfies the given condition?

We have N sets of three pairs each, like
1. { (4; 0.1), (5; 0.3), (7; 0.6) }
2. { (7; 0.2), (8; 0.4), (1; 0.4) }
...
N. { (6; 0.3), (1; 0.2), (9; 0.5) }
and we need to choose exactly one pair from each set, so that the sum of the first members of the chosen pairs is minimal, subject to the condition that the sum of the second members must be not less than a given number P.
We can solve this by sorting all possible combinations of chosen pairs by the sum of their first members (3^N combinations) and, in that sorted list, choosing the first one which also satisfies the second condition.
Could you please suggest a better, non-trivial solution for this problem?
If there are no constraints on the values inside your triplets, then we are facing a pretty general version of the integer programming problem, more specifically a 0-1 linear programming problem, as it can be represented as a system of equations with every variable restricted to 0 or 1. You can find the possible approaches on the wiki page, but there is no fast-and-easy solution for this problem in general.
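Concretely, the 0-1 formulation looks like this (my own notation: x[i][j] = 1 means "take pair j of set i", and A[i][j], B[i][j] are the first and second numbers of that pair, as in the DP below):

minimize    sum over i, j of A[i][j] * x[i][j]
subject to  sum over j of x[i][j] = 1, for every set i   (exactly one pair per set)
            sum over i, j of B[i][j] * x[i][j] >= P
            x[i][j] in {0, 1}

Any off-the-shelf 0-1 ILP solver can take this directly, but with no further structure on the data there is no polynomial-time guarantee.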
Alternatively, if the second numbers of each pair (the ones that need to sum up to >= P) are from a small enough range, we could view this as a dynamic programming problem similar to the Knapsack problem. "Small enough" is a bit hard to define here because the original data has non-integer numbers. If they were integers, then the algorithmic complexity of the solution I will describe is O(P * N). For non-integer numbers, they first need to be converted to integers by multiplying them all, as well as P, by a large enough number. In your example, the precision of each number is one digit after the decimal point, so multiplying by 10 is enough. Hence, the actual complexity is O(M * P * N), where M is the factor everything was multiplied by to achieve integer numbers.
After this, we are essentially solving a modified Knapsack problem: instead of constraining the weight from above, we are constraining it from below, and on each step we are choosing a pair from a triplet, as opposed to deciding whether to put an item into the knapsack or not.
Let's define a function minimum_sum[i][s] which at values i, s represents the minimum possible sum (of first numbers in each pair we took) we can achieve if the sum of the second numbers in pairs taken so far is equal to s and we already considered the first i triplets. One exception to this definition is that minimum_sum[i][P] has the minimum for all sums exceeding P as well. If we can compute all values of this function, then minimum_sum[N][P] is the answer. The function values can be computed with something like this:
minimum_sum[0][0] = 0, all other values are set to infinity
for i = 0..N-1:
    for s = 0..P:
        for j = 0..2:
            minimum_sum[i+1][min(P, s+B[i][j])] = min(minimum_sum[i+1][min(P, s+B[i][j])], minimum_sum[i][s] + A[i][j])
A[i][j] here denotes the first number in the i-th triplet's j-th pair, and B[i][j] denotes the second number of the same pair.
This solution is viable if N is large, but P is small and precision on Bs isn't too high. For instance, if N=50, there is little hope to compute 3^N possibilities, but with M*P=1000000 this approach would work extremely fast.
Python implementation of the idea above:
def compute(A, B, P):
    n = len(A)
    # note that I use 1,000,000 as "infinity" here, which might need to be increased depending on input data
    best = [[1000000 for i in range(P + 1)] for j in range(n + 1)]
    best[0][0] = 0
    for i in range(n):
        for s in range(P+1):
            for j in range(3):
                best[i+1][min(P, s+B[i][j])] = min(best[i+1][min(P, s+B[i][j])], best[i][s]+A[i][j])
    return best[n][P]
Testing:
A=[[4, 5, 7], [7, 8, 1], [6, 1, 9]]
# second numbers in each pair after scaling them up to be integers
B=[[1, 3, 6], [2, 4, 4], [3, 2, 5]]
In [7]: compute(A, B, 0)
Out[7]: 6
In [14]: compute(A, B, 7)
Out[14]: 6
In [15]: compute(A, B, 8)
Out[15]: 7
In [20]: compute(A, B, 13)
Out[20]: 14

Heap's algorithm for permutations

I'm preparing for interviews and I'm trying to memorize Heap's algorithm:
procedure generate(n : integer, A : array of any):
    if n = 1 then
        output(A)
    else
        for i := 0; i < n; i += 1 do
            generate(n - 1, A)
            if n is even then
                swap(A[i], A[n-1])
            else
                swap(A[0], A[n-1])
            end if
        end for
    end if
This algorithm is a pretty famous one to generate permutations. It is concise and fast and goes hand-in-hand with the code to generate combinations.
The problem is: I don't like to memorize things by heart and I always try to keep the concepts to "deduce" the algorithm later.
This algorithm is really not intuitive and I can't find a way to explain how it works to myself.
Can someone please tell me why and how this algorithm works as expected when generating permutations?
Heap's algorithm is probably not the answer to any reasonable interview question. There is a much more intuitive algorithm which will produce permutations in lexicographical order; although it is amortized O(1) per permutation rather than worst-case O(1), it is not noticeably slower in practice, and it is much easier to derive on the fly.
The lexicographic order algorithm is extremely simple to describe. Given some permutation, find the next one by:
Find the rightmost element which is smaller than the element to its right.
Swap that element with the smallest element to its right which is larger than it.
Reverse the part of the permutation to the right of where that element was.
Both steps (1) and (3) are worst-case O(n), but it is easy to prove that the average time for those steps is O(1).
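A Python sketch of that procedure (my own code, not from the answer; next_permutation is my name for it):

def next_permutation(a):
    """Rearrange list a into its next lexicographic permutation in place.
    Returns False (and leaves a sorted ascending) if a was already the last one."""
    # 1. rightmost position i with a[i] < a[i + 1]
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        a.reverse()
        return False
    # 2. swap a[i] with the smallest element to its right that is larger than it
    j = len(a) - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    # 3. reverse the suffix after position i
    a[i + 1:] = reversed(a[i + 1:])
    return True

p = [1, 2, 3]
print(p)                      # first permutation
while next_permutation(p):
    print(p)                  # remaining permutations, in lexicographic order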
An indication of how tricky Heap's algorithm is (in the details) is that your expression of it is slightly wrong: it does one extra swap. The extra swap is a no-op if n is even, but it significantly changes the order of permutations generated when n is odd. In either case, it does unnecessary work. See https://en.wikipedia.org/wiki/Heap%27s_algorithm for the correct algorithm (at least, it's correct today), or see the discussion at Heap's algorithm permutation generator.
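For reference, here is a Python sketch of the corrected recursive form described on that Wikipedia page (my own transcription; the generator name is mine). Note the loop runs k - 1 times, followed by one final recursive call with no swap; the extra swap in the question's version is what this avoids.

def heap_permutations(a, k=None):
    """Yield every permutation of list a (mutated in place) using Heap's algorithm."""
    if k is None:
        k = len(a)
    if k == 1:
        yield a[:]
        return
    for i in range(k - 1):
        yield from heap_permutations(a, k - 1)
        if k % 2 == 0:
            a[i], a[k - 1] = a[k - 1], a[i]   # k even: swap A[i] and A[k-1]
        else:
            a[0], a[k - 1] = a[k - 1], a[0]   # k odd: swap A[0] and A[k-1]
    yield from heap_permutations(a, k - 1)

print(list(heap_permutations([1, 2, 3])))     # 3! = 6 distinct permutations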
To see how Heap's algorithm works, you need to look at what a full iteration of the loop does to the vector, in both even and odd cases. Given a vector of even length, a full iteration of Heap's algorithm will rearrange the elements according to the rule
[1,...n] → [(n-2),(n-1),2,3,...,(n-3),n,1]
whereas if the vector is of odd length, it will simply swap the first and last elements:
[1,...n] → [n,2,3,4,...,(n-2),(n-1),1]
You can prove that both of these facts are true using induction, although that doesn't provide any intuition as to why it's true. Looking at the diagram on the Wikipedia page might help.
I found an article that tries to explain it here: Why does Heap's algorithm work?
However, I think it is hard to understand it, so came up with an explanation that is hopefully easier to understand:
Please just assume that these statements are true for a moment (I'll show that later):
Each invocation of the "generate" function
(I) where n is odd, leaves the elements in the exact same ordering when it is finished.
(II) where n is even, rotates the elements to the right, for example ABCD becomes DABC.
So in the "for i"-loop
when
n is even
The recursive call "generate(n - 1, A)" does not change the order.
So the for-loop can iteratively swap the element at i=0..(n-1) with the element at (n - 1) and will have called "generate(n - 1, A)" each time with another element missing.
n is odd
The recursive call "generate(n - 1, A)" has rotated the elements right.
So the element at index 0 will always be a different element automatically.
Just swap the elements at 0 and (n-1) in each iteration to produce a unique set of elements.
Finally, let's see why the initial statements are true:
Rotate-right
(III) This series of swaps result in a rotation to the right by one position:
A[0] <-> A[n - 1]
A[1] <-> A[n - 1]
A[2] <-> A[n - 1]
...
A[n - 2] <-> A[n - 1]
For example try it with sequence ABCD:
A[0] <-> A[3]: DBCA
A[1] <-> A[3]: DACB
A[2] <-> A[3]: DABC
No-op
(IV) This series of steps leaves the sequence in the exact same ordering as before:
Repeat n times:
Rotate the sub-sequence a[0...(n-2)] to the right
Swap: a[0] <-> a[n - 1]
Intuitively, this is true:
If you have a sequence of length 5, then rotate it 5 times, it ends up unchanged.
Taking the element at 0 out before the rotation, then after the rotation swapping it with the new element at 0 does not change the outcome (if rotating n times).
Induction
Now we can see why (I) and (II) are true:
If n is 1:
Trivially, the ordering is unchanged after invoking the function.
If n is 2:
The recursive calls "generate(n - 1, A)" leave the ordering unchanged (because it invokes generate with first argument being 1).
So we can just ignore those calls.
The swaps that get executed in this invocation result in a right-rotation, see (III).
If n is 3:
The recursive calls "generate(n - 1, A)" result in a right-rotation.
So the total steps in this invocation equal (IV) => The sequence is unchanged.
Repeat for n = 4, 5, 6, ...
The reason Heap’s algorithm constructs all permutations is that it adjoins each element to each permutation of the rest of the elements. When you execute Heap's algorithm, recursive calls on even length inputs place elements n, (n-1), 2, 3, 4, ..., (n-2), 1 in the last position and recursive calls on odd length inputs place elements n, (n-3), (n-4), (n-5), ..., 2, (n-2), (n-1), 1 in the last position. Thus, in either case, all elements are adjoined with all permutations of n - 1 elements.
If you would like a more detailed and graphical explanation, have a look at this article.
function* permute<T>(array: T[], n = array.length): Generator<T[]> {
  if (n > 1) {
    for (let ix = 1; ix < n; ix += 1) {
      for (let _arr of permute(array, n - 1)) yield _arr
      let j = n % 2 ? 0 : ix - 1
      ;[array[j], array[n - 1]] = [array[n - 1], array[j]]
    }
    for (let _arr of permute(array, n - 1)) yield _arr
  } else yield array
}
Example use:
for (let arr of permute([1, 2, 3])) console.log(arr)
The trickiest part for me to understand, as I am still studying it as well, was the recursive expression:
for i := 0; i < n; i += 1 do
    generate(n - 1, A)
I read it as: evaluate at every i up to n,
with the termination condition at n = 1,
and either the odd or even swap applied on each return.
Since it calls and returns once for every i as n is passed back up recursively, a minimal change is achieved each time the permutation of the n + 1 elements passed back is continued.
Just a side tip: Heap's algorithm will generate n! permutations,
i.e.
if you pass n = [1, 2, 3] as input, the result will be n! = 3! = 6 permutations.

Sum-subset with a fixed subset size

The sum-subset problem states:
Given a set of integers, is there a non-empty subset whose sum is zero?
This problem is NP-complete in general. I'm curious if the complexity of this slight variant is known:
Given a set of integers, is there a subset of size k whose sum is zero?
For example, if k = 1, you can do a binary search to find the answer in O(log n). If k = 2, then you can get it down to O(n log n) (e.g. see Find a pair of elements from an array whose sum equals a given number). If k = 3, then you can do O(n^2) (e.g. see Finding three elements in an array whose sum is closest to a given number).
Is there a known bound that can be placed on this problem as a function of k?
As motivation, I was thinking about this question How do you partition an array into 2 parts such that the two parts have equal average? and trying to determine if it is actually NP-complete. The answer lies in whether or not there is a formula as described above.
Barring a general solution, I'd be very interested in knowing an optimal bound for k=4.
For k = 4: space complexity O(n), time complexity O(n^2 * log(n))
Sort the array. Starting from 2 smallest and 2 largest elements, calculate all lesser sums of 2 elements (a[i] + a[j]) in the non-decreasing order and all greater sums of 2 elements (a[k] + a[l]) in the non-increasing order. Increase lesser sum if total sum is less than zero, decrease greater one if total sum is greater than zero, stop when total sum is zero (success) or a[i] + a[j] > a[k] + a[l] (failure).
The trick is to iterate through all the indexes i and j in such a way, that (a[i] + a[j]) will never decrease. And for k and l, (a[k] + a[l]) should never increase. A priority queue helps to do this:
Put key=(a[i] + a[j]), value=(i = 0, j = 1) to priority queue.
Pop (sum, i, j) from priority queue.
Use sum in the above algorithm.
Put (a[i+1] + a[j]), i+1, j and (a[i] + a[j+1]), i, j+1 to the priority queue only if these elements were not already used. To keep track of used elements, maintain an array of the maximal used 'j' for each 'i'. It is enough to use only values of 'j' that are greater than 'i'.
Continue from step 2.
For k>4
If space complexity is limited to O(n), I cannot find anything better than using brute force for k-4 values and the above algorithm for the remaining 4 values. Time complexity O(n^(k-2) * log(n)).
For very large k integer linear programming may give some improvement.
Update
If n is very large (on the same order as the maximum integer value), it is possible to implement an O(1) priority queue, improving the complexities to O(n^2) and O(n^(k-2)).
If n >= k * INT_MAX, a different algorithm with O(n) space complexity is possible: precalculate a bitset for all possible sums of k/2 values and use it to check the sums of the other k/2 values. Time complexity is O(n^(ceil(k/2))).
The problem of determining whether 0 in W + X + Y + Z = {w + x + y + z | w in W, x in X, y in Y, z in Z} is basically the same except for not having annoying degenerate cases (i.e., the problems are inter-reducible with minimal resources).
This problem (and thus the original for k = 4) has an O(n^2 log n)-time, O(n)-space algorithm. The O(n log n)-time algorithm for k = 2 (to determine whether 0 in A + B) accesses A in sorted order and B in reverse sorted order. Thus all we need is an O(n)-space iterator for A = W + X, which can be reused symmetrically for B = Y + Z. Let W = {w_1, ..., w_n} in sorted order. For all x in X, insert a key-value item (w_1 + x, (1, x)) into a priority queue. Repeatedly remove the min element (w_i + x, (i, x)) and insert (w_{i+1} + x, (i+1, x)).
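A Python sketch of that idea (names and structure are mine): a heap-backed generator streams the sums of two lists in ascending order using O(n) space, and the k = 2 style scan walks A = W + X in ascending order against B = Y + Z in descending order (obtained here by negating Y and Z).

import heapq

def ascending_sums(W, X):
    """Yield all sums w + x in nondecreasing order using O(len(X)) heap space."""
    W, X = sorted(W), sorted(X)
    heap = [(W[0] + x, 0, x) for x in X]
    heapq.heapify(heap)
    while heap:
        s, i, x = heapq.heappop(heap)
        yield s
        if i + 1 < len(W):
            heapq.heappush(heap, (W[i + 1] + x, i + 1, x))

def zero_in_four_sets(W, X, Y, Z):
    """Is there w + x + y + z = 0?  O(n^2 log n) time, O(n) extra space."""
    a_iter = ascending_sums(W, X)                               # w + x, ascending
    b_iter = ascending_sums([-y for y in Y], [-z for z in Z])   # -(y + z), ascending
    a, b = next(a_iter, None), next(b_iter, None)
    while a is not None and b is not None:
        if a < b:           # w + x + y + z < 0: need a larger w + x
            a = next(a_iter, None)
        elif a > b:         # w + x + y + z > 0: need a smaller y + z
            b = next(b_iter, None)
        else:
            return True
    return False

print(zero_in_four_sets([1, 2], [3, 4], [-5, 10], [0, 7]))   # True: 1 + 4 - 5 + 0 = 0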
Question that is very similar:
Is this variant of the subset sum problem easier to solve?
It's still NP-complete.
If it were not, subset-sum would also be in P, as it could be represented as F(1) | F(2) | ... | F(n), where F is your function. This would take O(F(1) + F(2) + ... + F(n)) time, which would still be polynomial; that is incorrect, as we know subset-sum is NP-complete.
Note that if you have certain bounds on the inputs you can achieve polynomial time.
Also note that the brute-force runtime can be calculated with binomial coefficients.
The solution for k = 4 in O(n^2 log(n)) (a Python sketch follows these two steps):
Step 1: Calculate the pairwise sums and sort the list. There are n(n-1)/2 sums, so the complexity is O(n^2 log(n)). Keep the identities of the individuals which make up each sum.
Step 2: For each element in the above list, search for the complement and make sure they don't share "the individuals". There are n^2 searches, each with complexity O(log(n)).
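A Python sketch of these two steps (my own code; it keeps the index identities and, when several pairs share the complementary sum, scans the ties to enforce disjointness, which can degrade the worst case if many pair sums collide):

from bisect import bisect_left
from itertools import combinations

def four_sum_zero(nums, target=0):
    """Do four distinct positions of nums sum to target?  O(n^2) space."""
    # Step 1: all n(n-1)/2 pairwise sums, sorted, tagged with their index pairs
    pairs = sorted((nums[i] + nums[j], i, j)
                   for i, j in combinations(range(len(nums)), 2))
    sums = [s for s, _, _ in pairs]
    # Step 2: binary-search the complement of each pair and check disjointness
    for s, i, j in pairs:
        k = bisect_left(sums, target - s)
        while k < len(pairs) and pairs[k][0] == target - s:
            _, p, q = pairs[k]
            if len({i, j, p, q}) == 4:      # the four positions must all differ
                return True
            k += 1
    return False

print(four_sum_zero([8, -2, -3, 5, -10, 1]))   # True: 8 - 3 + 5 - 10 = 0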
EDIT: The space complexity of the original algorithm is O(n^2). The space complexity can be reduced to O(1) by simulating a virtual 2D matrix (O(n), if you consider space to store sorted version of the array).
First about the 2D matrix: sort the numbers and create a matrix X using pairwise sums. Now the matrix is built in such a way that all the rows and columns are sorted. To search for a value in this matrix, search the numbers on the diagonal. If the number is in between X[i,i] and X[i+1,i+1], you can basically halve the search space into the two matrices X[i:N, 0:i] and X[0:i, i:N]. The resulting search algorithm is O(log^2 n) (I AM NOT VERY SURE. CAN SOMEBODY CHECK IT?).
Now, instead of using a real matrix, use a virtual matrix where X[i,j] are calculated as needed instead of pre-computing them.
Resulting time complexity: O((n log n)^2).
PS: In the following link, it says the complexity of a 2D sorted matrix search is O(n). If that is true (i.e. O(log^2 n) is incorrect), then the final complexity is O(n^3).
To build on awesomo's answer... if we can assume that the numbers are sorted, we can do better than O(n^k) for a given k; simply take all O(n^(k-1)) subsets of size (k-1), then do a binary search in what remains for a number that, when added to the first (k-1), gives the target. This is O(n^(k-1) log n), so the complexity is certainly less than O(n^k).
In fact, if we know that the complexity is O(n^2) for k=3, we can do even better for k > 3: choose all (k-3)-subsets, of which there are O(n^(k-3)), and then solve the problem in O(n^2) on the remaining elements. This is O(n^(k-1)) for k >= 3.
However, maybe you can do even better? I'll think about this one.
EDIT: I was initially going to write a lot more proposing a different take on this problem, but I've decided to post an abridged version. I encourage other posters to see whether they believe this idea has any merit. The analysis is tough, but it might just be crazy enough to work.
We can use the fact that we have a fixed k, and that sums of odd and even numbers behave in certain ways, to define a recursive algorithm to solve this problem.
First, modify the problem so that you have both even and odd numbers in the list (this can be accomplished by dividing by two if all are even, or by subtracting 1 from numbers and k from the target sum if all are odd, and repeating as necessary).
Next, use the fact that even target sums can be reached only by using an even number of odd numbers, and odd target sums can be reached using only an odd number of odd numbers. Generate appropriate subsets of the odd numbers, and call the algorithm recursively using the even numbers, the sum minus the sum of the subset of odd numbers being examined, and k minus the size of the subset of odd numbers. When k = 1, do binary search. If ever k > n (not sure this can happen), return false.
If you have very few odd numbers, this could allow you to very quickly pick up terms that must be part of a winning subset, or discard ones that cannot. You can transform problems with lots of even numbers to equivalent problems with lots of odd numbers by using the subtraction trick. The worst case must therefore be when the numbers of even and odd numbers are very similar... and that's where I am right now. A uselessly loose upper bound on this is many orders of magnitude worse than brute-force, but I feel like this is probably at least as good as brute-force. Thoughts are welcome!
EDIT2: An example of the above, for illustration.
{1, 2, 2, 6, 7, 7, 20}, k = 3, sum = 20.
  Subset {}:
    {2, 2, 6, 20}, k = 3, sum = 20
    = {1, 1, 3, 10}, k = 3, sum = 10
      Subset {}:
        {10}, k = 3, sum = 10
        Failure
      Subset {1, 1}:
        {10}, k = 1, sum = 8
        Failure
      Subset {1, 3}:
        {10}, k = 1, sum = 6
        Failure
  Subset {1, 7}:
    {2, 2, 6, 20}, k = 1, sum = 12
    Failure
  Subset {7, 7}:
    {2, 2, 6, 20}, k = 1, sum = 6
    Success
The time complexity is trivially O(n^k) (number of k-sized subsets from n elements).
Since k is a given constant, a (possibly quite high-order) polynomial upper bounds the complexity as a function of n.

From a given number, determine three close numbers whose product is the original number

I have a number n, and I want to find three numbers whose product is n but are as close to each other as possible. That is, if n = 12 then I'd like to get 3, 2, 2 as a result, as opposed to 6, 1, 2.
Another way to think of it is that if n is the volume of a cuboid then I want to find the lengths of the sides so as to make the cuboid as much like a cube as possible (that is, the lengths as similar as possible). These numbers must be integers.
I know there is unlikely to be a perfect solution to this, and I'm happy to use something which gives a good answer most of the time, but I just can't think where to go with coming up with this algorithm. Any ideas?
Here's my first algorithm sketch, assuming that n is relatively small:
Compute the prime factors of n.
Pick out the three largest and assign them to f1, f2, f3. If there are fewer than three factors, assign 1 to the remaining slots.
Loop over the remaining factors in decreasing order, multiplying each into the currently smallest partition.
Edit
Let's take n=60.
Its prime factors are 5 3 2 2.
Set f1=5, f2=3 and f3=2.
The remaining 2 is multiplied to f3, because it is the smallest.
We end up with 5 * 4 * 3 = 60.
Edit
This algorithm will not always find the optimum; note btilly's comment:
Consider 17550 = 2 * 3 * 3 * 3 * 5 * 5 * 13. Your algorithm would give 15, 30, 39 when the best is 25, 26, 27.
Edit
Ok, here's my second algorithm sketch with a slightly better heuristic (a Python sketch of it follows the worked example below):
Set the list L to the prime factors of n.
Set r to the cube root of n.
Create the set of three factors F, each initially set to 1.
Iterate over the prime factors in descending order:
Try to multiply the current factor L[i] with each of the factors in F in descending order.
If the result is less than r, perform the multiplication and move on to the next prime factor.
If not, try the next F. If you run out of Fs, multiply with the smallest one.
This will work for the case of 17550:
n=17550
L=13,5,5,3,3,3,2
r=25.98
F = { 1, 1, 1 }
Iteration 1:
F[0] * 13 is less than r, set F to {13,1,1}.
Iteration 2:
F[0] * 5 = 65 is greater than r.
F[1] * 5 = 5 is less than r, set F to {13,5,1}.
Iteration 3:
F[0] * 5 = 65 is greater than r.
F[1] * 5 = 25 is less than r, set F to {13,25,1}.
Iteration 4:
F[0] * 3 = 39 is greater than r.
F[1] * 3 = 75 is greater than r.
F[2] * 3 = 3 is less than r, set F to {13,25,3}.
Iteration 5:
F[0] * 3 = 39 is greater than r.
F[1] * 3 = 75 is greater than r.
F[2] * 3 = 9 is less than r, set F to {13,25,9}.
Iteration 6:
F[0] * 3 = 39 is greater than r.
F[1] * 3 = 75 is greater than r.
F[2] * 3 = 27 is greater than r, but it is the smallest F we can get. Set F to {13,25,27}.
Iteration 7:
F[0] * 2 = 26 is greater than r, but it is the smallest F we can get. Set F to {26,25,27}.
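Here is a Python sketch of this second heuristic (my own transcription; the trial-division factorization helper is mine as well):

def prime_factors_desc(n):
    """Prime factors of n by trial division, largest first, with multiplicity."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return sorted(factors, reverse=True)

def close_factors(n):
    """Keep each of the three bins below the cube root of n when possible;
    otherwise grow the bin that yields the smallest overshoot."""
    r = n ** (1 / 3)
    F = [1, 1, 1]
    for p in prime_factors_desc(n):
        for i in sorted(range(3), key=lambda i: F[i], reverse=True):
            if F[i] * p < r:                 # first bin (largest first) staying under r
                F[i] *= p
                break
        else:
            i = min(range(3), key=lambda i: F[i] * p)
            F[i] *= p                        # no bin stays under r: smallest overshoot
    return sorted(F)

print(close_factors(17550))   # [25, 26, 27]
print(close_factors(60))      # [3, 4, 5]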
Here's a purely math-based approach that returns the optimal solution and does not involve any kind of sorting. Hell, it doesn't even need the prime factors.
Background:
1) Recall that for a monic cubic polynomial p(x) = x^3 + a x^2 + b x + c, the sum and product of the roots are given by
x_1 + x_2 + x_3 = -a,   x_1 x_2 + x_1 x_3 + x_2 x_3 = b,   x_1 x_2 x_3 = -c,
where x_i are the roots.
2) Recall another elementary result from optimization theory: if two positive variables have a constant product x*y = k, their sum x + y is minimized when x~ = y~ = sqrt(k). I.e., given two variables such that their product is a constant, the sum is minimum when the two variables are equal to each other; the tilde variables denote the optimal values.
A corollary of this would be that if the sum of two variables whose product is constant is a minimum, then the two variables are equal to each other.
Reformulate the original problem:
Your question above can now be reformulated as a polynomial root-finding exercise. We'll construct a polynomial that satisfies your conditions, and the roots of that polynomial will be your answer. If you need k numbers that are optimal, you'll have a polynomial of degree k. In this case, we can talk in terms of a cubic equation p(x) = x^3 + a x^2 + b x + c = 0.
We know that:
1. c is the negative of the input number (assume it positive).
2. a is an integer and negative (since the factors are all positive).
3. b is an integer (the sum of the products of the roots taken two at a time) and is positive.
4. The roots of p must be real (and positive, but that has already been addressed).
To solve the problem, we simply need to maximize a subject to the above set of conditions. The only part not explicitly known right now is condition 4, which we can easily enforce using the discriminant of the polynomial.
For a cubic polynomial p, the discriminant is
∆ = 18abc - 4a^3 c + a^2 b^2 - 4b^3 - 27c^2,
and p has real and distinct roots if ∆ > 0 and real and coincident (either two or all three) roots if ∆ = 0. So constraint 4 now reads ∆ >= 0. This is now simple and easy to program.
Solution in Mathematica
Here's a solution in Mathematica that implements this.
And here's a test on some of the numbers used in other answers/comments.
The column on the left is the list and the corresponding row in the column on the right gives the optimal solution.
NOTE:
I just noticed that OP never mentioned that the 3 numbers needed to be integers although everyone (including myself until now) assumed that they were (probably because of his first example). Re-reading the question, and going by the cube example, it doesn't seem like OP was fixated on integers.
This is an important point which will decide which class of algorithms to pursue and needs to be defined. If they need not be integers, there are several polynomial based solutions that can be provided, one of which is mine (after relaxing the integer constraint). If they should be integers, then perhaps an approach using branch-n-bound/branch-n-cut/cutting plane might be more appropriate.
The following was written assuming the OP meant the three numbers to be integers.
The way I've implemented it right now, it can give a non-integer solution in certain cases.
The reason this gives non-integer solutions for x is that I only maximized a, when actually b also needs to be minimized (and also because I haven't placed a constraint on the x_i being integers; it is possible to use the integer root theorem, which would involve finding the prime factors, but that makes things more complicated).
Mathematica code in text
Clear[poly, disc, f]
poly = x^3 + a x^2 + b x + c;
disc = Discriminant[poly, x];
f[n_Integer] :=
  Module[{p, \[CapitalDelta] = disc /. c -> -n},
    p = poly /.
        Maximize[{a, \[CapitalDelta] >= 0,
            b > 0 && a < 0 && {a, b} \[Element] Integers}, {a, b}][[2]] /. c -> -n;
    Solve[p == 0]
  ]
There may be a clever way to find the tightest triplet, as Anders Lindahl is pursuing, but I will focus on a more basic approach.
If I generate all triplets, then I can filter them afterward however I want, so I will start there. The best way I know to generate these uses recursion:
f[n_, 1] := {{n}}
f[n_, k_] := Join @@ Table[
    {q, ##} & @@@ Select[f[n/q, k - 1], #[[1]] >= q &],
    {q, #[[2 ;; ⌈ Length@#/k ⌉ ]] & @ Divisors @ n}
  ]
This function f takes two integer arguments, the number to factor n, and the number of factors to produce k.
The section #[[2 ;; ⌈ Length@#/k ⌉ ]] & @ Divisors @ n uses Divisors to produce a list of all divisors of n (including 1), and then takes these from the second (to drop the 1) through the Ceiling of the number of divisors divided by k.
For example, for {n = 240, k = 3} the output is {2, 3, 4, 5, 6, 8}
The Table command iterates over this list while accumulating results, assigning each element to q.
The body of the Table is Select[f[n/q, k - 1], #[[1]] >= q &]. This calls f recursively, and then selects from the result all lists that begin with a number >= q.
{q, ##} & @@@ (also in the body) then "prepends" q to each of these selected lists.
Finally, Join @@ merges the lists of these selected lists that are produced by each loop of Table.
The result is all of the factorizations of n into k integer parts, in lexicographical order. Example:
In[]:= f[240, 3]
Out[]= {{2, 2, 60}, {2, 3, 40}, {2, 4, 30}, {2, 5, 24}, {2, 6, 20},
{2, 8, 15}, {2, 10, 12}, {3, 4, 20}, {3, 5, 16}, {3, 8, 10},
{4, 4, 15}, {4, 5, 12}, {4, 6, 10}, {5, 6, 8}}
With the output of the function/algorithm given above, one can then test triplets for quality however desired.
Notice that because of the ordering the last triplet in the output is the one with the greatest minimum factor. This will usually be the most "cubic" of the results, but occasionally it is not.
If the true optimum must be found, it makes sense to test starting from the right side of the list, abandoning the search if a better result is not found quickly, as the quality of the results decrease as you move left.
Obviously this method relies upon a fast Divisors function, but I presume that this is either a standard library function, or you can find a good implementation here on StackOverflow. With that in place, this should be quite fast. The code above finds all triplets for n from 1 to 10,000 in 1.26 seconds on my machine.
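For readers without Mathematica, here is a rough Python translation of the same recursion (the function name is mine; like the original, it skips the trivial divisor 1, so a prime n yields no triplet):

def factorizations(n, k, smallest=2):
    """All ways to write n as a nondecreasing product of k factors, each >= smallest."""
    if k == 1:
        yield (n,)
        return
    d = smallest
    while d ** k <= n:                  # the next factor cannot exceed the k-th root of n
        if n % d == 0:
            for rest in factorizations(n // d, k - 1, d):
                yield (d,) + rest
        d += 1

print(list(factorizations(240, 3)))     # 14 triplets, (2, 2, 60) ... (5, 6, 8), as above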
Instead of reinventing the wheel, one should recognize this as a variation of a well known NP-complete problem.
Compute the prime factors of n.
Compute the logarithms of these factors
The problem then translates into partitioning these logs into three sums that are as close to each other as possible.
This is a variation of the bin packing problem, known as multiprocessor scheduling.
Given the fact that the Multiprocessor scheduling problem is NP-complete, it's no wonder that it's hard to find an algorithm that does not search the whole problem space and finds the optimum solution.
But I guess there are already several algorithms that deal with either bin packing or multiprocessor scheduling and find near-optimum solutions in an efficient manner.
Another related problem (generalization) is the Job shop scheduling. See the wikipedia description with many links to known algorithms.
What Wikipedia describes as the often-used LPT algorithm (Longest Processing Time) is exactly what Anders Lindahl came up with first.
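To make the connection concrete, here is a small Python sketch (mine) of that LPT-style greedy on the logarithms: each prime factor, largest first, goes to the bin whose log-sum, equivalently whose product, is currently smallest. As btilly's example shows, this is only a heuristic.

def lpt_three_factors(n):
    """Greedy LPT heuristic: partition the prime factors' logs into three bins."""
    # trial-division factorization, with multiplicity, largest factor first
    factors, d, m = [], 2, n
    while d * d <= m:
        while m % d == 0:
            factors.append(d)
            m //= d
        d += 1
    if m > 1:
        factors.append(m)
    factors.sort(reverse=True)

    bins = [1, 1, 1]
    for p in factors:
        # smallest product == smallest sum of logs, so no explicit log() is needed
        smallest = min(range(3), key=lambda i: bins[i])
        bins[smallest] *= p
    return sorted(bins)

print(lpt_three_factors(60))      # [3, 4, 5]
print(lpt_three_factors(17550))   # [15, 30, 39]; the optimum 25, 26, 27 needs more search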
EDIT
Here's a shorter explanation using more efficient code; KSetPartitions simplifies things considerably, as did some suggestions from Mr.W. The overall logic remains the same.
Assuming there are at least 3 prime factors of n:
Find the list of triplet KSetPartitions for the prime factors of n.
Multiply each of the elements (prime factors) within each subset to produce all possible combinations for three divisors of n (when multiplied they yield n). You can think of the divisors as the length, width and height of an orthogonal parallelepiped.
The parallelepiped closest to a cube will have the shortest space diagonal. Sum the squares of the three divisors for each case and pick the smallest.
Here's the code in Mathematica:
Needs["Combinatorica`"]
g[n_] := Module[{factors = Join @@ ConstantArray @@@ FactorInteger[n]},
  Sort[Union[Sort /@ Apply[Times, Union[Sort /@
        KSetPartitions[factors, 3]], {2}]]
     /. {a_Integer, b_Integer, c_Integer} :>
      {Total[Power[{a, b, c}, 2]], {a, b, c}}][[1, 2]]]
It can handle fairly large numbers, but slows down considerably as the number of factors of n grows. The examples below show timings for 240, 2400, ...24000000.
This could be sped up in principle by taking into account cases where a prime factor appears more than once in a divisor. I don't have the know-how to do it yet.
In[28]:= g[240]
Out[28]= {5, 6, 8}
In[27]:= t = Table[Timing[g[24*10^n]][[1]], {n, 6}]
Out[27]= {0.001868, 0.012734, 0.102968, 1.02469, 10.4816, 105.444}
